Spot Control

Closed-loop Mobile Manipulator Control Using a Distance Sensor

William Sun
Michael Gleicher

Abstract

We propose a closed-loop control system for mobile manipulator sweeping that uses a miniature optical time-of-flight sensor. As the arm follows a designated path, arm-mounted proximity sensing keeps the trajectory within the intended target range, with minimal deviation throughout the sweep. We combine direct distance measurements with proprioceptive data in a Kalman Filter to probabilistically estimate the true height of the robot hand. We are motivated to use single-photon avalanche diode (SPAD) sensors because they are low-cost, lightweight, and low-power, which enables a diverse set of robot sensing applications without compromising performance. We demonstrate the effectiveness of our method with a visual servoing application, in which a mobile manipulator corrects its sweeping trajectory based on distance sensor feedback.

Video Demonstration

The video demonstrates a simple visual servoing example in which a mobile manipulator corrects its hand position based on distance sensor feedback. At each update from the distance sensor, the mobile manipulator adjusts its hand position to maintain a constant height above the ground. When an object is placed underneath the hand, a tuned Kalman Filter estimates the true height of the hand by weighting proprioception against exteroception. The closed-loop controller then readjusts the hand position to maintain constant height.

Data Capture

The Boston Dynamics Spot is the mobile manipulator used in this project. We mount an AMS TMF8820 sensor (9 zones in a 3x3 configuration) underneath the hand. A system of 8 OptiTrack motion capture cameras provides ground truth data. For data collection, we use a single flat floor surface and move the Spot arm and base through various configurations around the room.

Methodology

Data synchronization: during data capture, we record three streams of data (distance sensor, proprioception, and ground truth). The distance sensor is by far the slowest stream (~2 Hz, due to limitations in I2C bandwidth). The other two streams are much faster (~100 Hz). Hence, for each distance sensor sample, we select the proprioception and ground truth samples whose timestamps are closest to the distance sensor timestamp.
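The nearest-timestamp matching described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `nearest_index` and `synchronize` are hypothetical, and the sketch assumes that each stream's timestamps are sorted in ascending order and share a common clock.

```python
from bisect import bisect_left

def nearest_index(sorted_times, t):
    """Index of the entry in sorted_times closest to t (ties go to the earlier sample)."""
    i = bisect_left(sorted_times, t)
    if i == 0:
        return 0
    if i == len(sorted_times):
        return len(sorted_times) - 1
    before, after = sorted_times[i - 1], sorted_times[i]
    return i - 1 if (t - before) <= (after - t) else i

def synchronize(sensor_times, proprio_times, truth_times):
    """One (sensor_idx, proprio_idx, truth_idx) triple per distance sensor sample."""
    return [
        (k, nearest_index(proprio_times, t), nearest_index(truth_times, t))
        for k, t in enumerate(sensor_times)
    ]
```

Because the slow stream drives the matching, the output has one row per distance sensor sample, and most samples from the two fast streams are simply unused.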

Proprioceptive data: the Spot provides proprioceptive data in the form of joint angles, joint velocities, and the corresponding transforms between frames of reference. The Cartesian pose estimate of the robot hand can be calculated directly from the joint angles and transforms. The distance that the hand-mounted sensor is expected to measure is then \(d = \text{pose}.z / \cos(\psi)\), where \(\text{pose}.z\) is the z component of the hand pose in the world frame and \(\psi\) is the yaw angle of the hand in the world frame.
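As a small worked example of the formula above, the sketch below converts a hand height and angle into the range the sensor would be expected to report over a flat floor. The function name `expected_range` is hypothetical. Geometrically, the relation holds when \(\psi\) measures how far the sensor ray is tilted away from the world vertical; that reading of \(\psi\) is an assumption here.

```python
import math

def expected_range(pose_z, psi):
    """Expected sensor range over a flat floor: pose_z / cos(psi).

    pose_z: hand height in the world frame (meters).
    psi: hand angle in radians, assumed to be the tilt of the sensor ray from vertical.
    """
    c = math.cos(psi)
    if c <= 1e-6:
        # The ray is horizontal or points upward, so it never reaches the floor.
        raise ValueError("sensor ray does not point toward the floor")
    return pose_z / c
```

For instance, with the hand level the expected range equals the hand height, and at a 60 degree tilt it doubles, since \(\cos(60^\circ) = 0.5\).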

Distance sensor data: as a first approximation, the peak bin of the sensor's center zone gives a reasonable estimate of the distance from the sensor to the ground. A more robust method is plane fitting, in which a plane is fit to the peak-bin distances of all nine zones. The distance estimate is then the distance along the center ray to the fitted plane. Finally, there is a constant position offset from the hand frame to the sensor frame, which can be calculated or measured directly.
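The plane-fitting step can be sketched as a least-squares fit, shown below. Several details are assumptions made for illustration: the zone ray directions come from a hypothetical evenly spaced 3x3 grid (`zone_directions`, with an assumed angular step; real directions would come from the sensor's documentation or a calibration), the sensor is taken to look along +z in its own frame, and the plane is parameterized as z = a*x + b*y + c. The hand-to-sensor offset is not applied here.

```python
import numpy as np

def zone_directions(step_deg=11.0):
    """Hypothetical unit ray directions for a 3x3 zone grid, sensor looking along +z."""
    angles = np.radians([-step_deg, 0.0, step_deg])
    dirs = []
    for ay in angles:            # grid rows
        for ax in angles:        # grid columns
            v = np.array([np.tan(ax), np.tan(ay), 1.0])
            dirs.append(v / np.linalg.norm(v))
    return np.array(dirs)        # shape (9, 3); index 4 is the center zone

def center_distance_from_plane(distances, directions):
    """Fit z = a*x + b*y + c to the nine zone points; return the range along the center ray."""
    # Each zone's peak-bin distance places one 3D point along that zone's ray.
    points = np.asarray(distances, dtype=float)[:, None] * directions
    A = np.column_stack([points[:, 0], points[:, 1], np.ones(len(points))])
    (a, b, c), *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
    # The center ray is (0, 0, 1), so it meets the plane at x = y = 0, i.e. z = c.
    return float(c)
```

Using all nine zones averages out per-zone noise, which is the sense in which this is more robust than reading the center zone alone.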

Kalman Filter: we use a 1D Kalman Filter from the FilterPy library for state estimation. The hyperparameters are initialized from the noise observed in the dataset and then tuned to balance the distance sensor against the proprioceptive data.
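To make the fusion idea concrete, here is a dependency-free sketch of a scalar Kalman Filter. It is not the project's FilterPy configuration, which the text does not spell out. The sketch assumes a random-walk model for the hand height and treats the proprioceptive height and the sensor-derived height as two independent measurements applied one after the other; the class `ScalarKalman`, the function `fuse`, and all noise values are hypothetical.

```python
class ScalarKalman:
    """Minimal 1D Kalman Filter for a single state (the hand height)."""

    def __init__(self, x0, p0, q):
        self.x = x0   # state estimate
        self.p = p0   # variance of the estimate
        self.q = q    # process noise variance (assumed tuning parameter)

    def predict(self):
        # Random-walk model: the estimate carries over and its uncertainty grows.
        self.p += self.q

    def update(self, z, r):
        # r is the variance of measurement z; a smaller r pulls the estimate harder.
        k = self.p / (self.p + r)        # Kalman gain, between 0 and 1
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)
        return self.x

def fuse(proprio_heights, sensor_heights, r_proprio, r_sensor, q=1e-6):
    """Fuse time-aligned proprioceptive and sensor-derived heights into one estimate per step."""
    kf = ScalarKalman(x0=proprio_heights[0], p0=r_proprio, q=q)
    estimates = []
    for zp, zs in zip(proprio_heights, sensor_heights):
        kf.predict()
        kf.update(zp, r_proprio)
        kf.update(zs, r_sensor)
        estimates.append(kf.x)
    return estimates
```

In this arrangement, the ratio of `r_proprio` to `r_sensor` plays the role of the tuning described above: the estimate always lies between the two inputs and sits closer to whichever source is assigned the smaller variance. FilterPy's `KalmanFilter` class exposes the same `predict` and `update` steps.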

Results

Figure: Absolute error boxplot of proprioception, exteroception, and Kalman Filter. These results are derived from data collected in the same way as shown in the video. The data is collected over time with around 90 total samples. We visualize both the error from single-zone peak-finding distances as well as the error from plane fitting. Throughout data collection, we see that even on flat terrain, Spot's proprioception can be much less accurate than the distance sensor.

With the Kalman Filter, we can meaningfully estimate the true height of the robot hand by weighting the proprioceptive and distance sensor measurements against each other. The Kalman Filter provides a more accurate estimate of the true hand height than proprioception alone: it improves Spot hand height localization from a mean absolute error (MAE) of 2.9 mm (proprioception) to 1.5 mm.
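For reference, the MAE reported above follows the standard definition, written here with assumed symbols: \(\hat{h}_i\) is the estimated hand height for sample \(i\), \(h_i\) is the corresponding ground truth height from motion capture, and \(N\) is the number of samples.

\[ \mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{h}_i - h_i \right| \]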

Next Steps

We have shown that localization can be improved using a single miniature optical time-of-flight sensor even on an expensive, precise mobile manipulator. While this demonstration is only in one dimension, it indicates the potential for 3D localization using a single distance sensor. With a wide-range, multi-zone distance sensor, if we can identify and locate multiple planes within the sensor's field of view, it would enable 3D localization. For example, the floor and walls of a room could be identified and localized with a single configuration on a single sensor. This would be a significant improvement for lightweight computing applications (e.g., on microdrones), where bulky LIDARs are not feasible. It also offers an edge over stereo cameras by directly encoding geometric information with reduced data requirements.

The website template was borrowed from Michaël Gharbi.