
Vision-Aided Planning for Robust Autonomous

Navigation of Small-Scale Quadruped Robots


by
Thomas Dudzik
S.B., Computer Science and Engineering, and Mathematics,
Massachusetts Institute of Technology (2019)
Submitted to the Department of Electrical Engineering and Computer
Science
in partial fulfillment of the requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2020

© Massachusetts Institute of Technology 2020. All rights reserved.

Author
Department of Electrical Engineering and Computer Science
August 14, 2020

Certified by
Sangbae Kim
Professor
Thesis Supervisor

Accepted by
Katrina LaCurts
Chair, Master of Engineering Thesis Committee
Vision-Aided Planning for Robust Autonomous Navigation of
Small-Scale Quadruped Robots
by
Thomas Dudzik

Submitted to the Department of Electrical Engineering and Computer Science


on August 14, 2020, in partial fulfillment of the
requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science

Abstract
Robust path planning in non-flat and non-rigid terrain poses a significant challenge for
small-scale legged robots. For a quadruped robot to reliably operate autonomously
in complex environments, it must be able to continuously determine sequences of
feasible body positions that lead it towards a goal while maintaining balance and
avoiding obstacles. Current solutions to the problem of motion planning have several
shortcomings, such as not exploiting the full flexibility of legged robots and not scaling
well with environment size or complexity. In this thesis, we address the problem of
navigation of quadruped robots by proposing and implementing a vision-aided plan-
ning framework on top of existing motion controllers that combines terrain awareness
with graph-based search techniques. In particular, the proposed approach exploits the
distinctive obstacle-negotiation capabilities of legged robots while keeping the com-
putational complexity low enough to enable planning over considerable distances in
real-time. We showcase the effectiveness of our approach both in simulated environ-
ments and on actual hardware using the MIT Mini-Cheetah Vision robotic platform.

Thesis Supervisor: Sangbae Kim


Title: Professor

Acknowledgments
Many people helped make this work possible – I could not hope to acknowledge each
and every individual that contributed in making this thesis a reality, in ways big
and small, directly and indirectly. With that said, I hope to convey the immense
appreciation I have for at least a few of these people.
First and foremost, I would like to thank my advisor, Sangbae Kim, for welcoming
me into the Biomimetics Lab and giving me the opportunity to develop on the robotic
cheetah platform. His guidance along the way proved invaluable, teaching me to
balance the ups and downs of research life while introducing me to many fascinating
aspects of robotics, optimization, planning, and machine learning.
Thank you to Donghyun Kim for mentoring me and spending endless hours brain-
storming ideas and debugging code to help make the project a success. Without his
contributions the robot codebase would still be in its infancy.
I would also like to thank the various other members of the Biomimetics Lab with
whom I’ve had the pleasure of collaborating with over the past year: Matt Chignoli,
Gerardo Bledt, AJ Miller, Bryan Lim, and Albert Wang. Through long discussions
and endless nights in the lab, I came to understand robotics at a deeper level from
both a software and a hardware perspective. It was a huge honor to be a part of this
lab and I appreciate each and every one of you for making it such an amazing place
to learn and work.
Last but certainly not least, I’d like to thank my family for their endless love and
support throughout my academic journey and for instilling in me the drive to tackle
the tough challenges, as well as my close friends who’ve made the past five years at
MIT full of great memories and ultimately all worth it.

Contents

1 Introduction
  1.1 Motivation
  1.2 Recent Advancements
      1.2.1 Quadruped Robotic Platforms
      1.2.2 Machine Learning
      1.2.3 Current Challenges
  1.3 Contributions
  1.4 Thesis Outline

2 System Overview
  2.1 MIT Mini-Cheetah
      2.1.1 Specifications
      2.1.2 Hardware
  2.2 MIT Mini-Cheetah Vision
      2.2.1 Specifications
      2.2.2 Hardware
  2.3 Robot System Model
  2.4 Software System Architecture
  2.5 Simulation Environment

3 Dynamic Terrain Mapping
  3.1 Terrain Modeling
  3.2 Elevation Map Construction
  3.3 Traversability Estimation
  3.4 Implementation Details

4 Path Planning
  4.1 Problem Formulation
  4.2 Algorithm Overview
      4.2.1 Implementation Details
  4.3 Path Tracking
  4.4 Continuous Replanning
  4.5 Limitations

5 Practical Tests and Results
  5.1 Evaluation in Simulation
      5.1.1 Software Demo 1
      5.1.2 Software Demo 2
      5.1.3 Software Demo 3
  5.2 Evaluation on Hardware
      5.2.1 Hardware Demo 1
      5.2.2 Hardware Demo 2
      5.2.3 Hardware Demo 3
      5.2.4 Hardware Demo 4

6 Conclusion
  6.1 Future Work
      6.1.1 Perception Improvements
      6.1.2 Motion Controller and Gait Selection Improvements

A Hardware Specifications
  A.1 UP board
  A.2 NVIDIA Jetson TX2

B Algorithm Pseudocode
  B.1 A*

List of Figures

1-1 Popular Quadruped Robotic Platforms. From left to right: Boston Dynamics’ Spot Mini, ETH Zurich’s ANYmal, and IIT’s HyQ robot.

2-1 MIT Mini-Cheetah Robot. A small-scale, electrically-actuated, high-performance quadruped robot.

2-2 MIT Mini-Cheetah Vision Robot. A variant of the original Mini-Cheetah upgraded with additional compute power and three Intel RealSense perception sensors for exteroceptive sensing.

2-3 Intel RealSense D435 Depth Camera. A depth camera that provides the robot with vision data used for constructing a terrain map of the surrounding area.

2-4 Intel RealSense T265 Tracking Sensor. A low-power tracking camera used for global localization of the robot.

2-5 Mini-Cheetah Vision Coordinate Systems. A body-fixed coordinate frame, ℬ, originates at the robot’s CoM 𝑝. A second coordinate frame originates at the center of the depth camera mounted at the front, denoted as 𝒞. The robot itself moves around in the world inertial frame, ℐ.

2-6 High-Level System Architecture. Block diagram visualizing the software architecture and the interaction between modules. Green represents the perception and planning module, blue represents the state estimation module, and red represents the locomotion control module. The motion library (not shown) is a separate component.

2-7 The MIT Biomimetic Robotics Lab’s Open-Source Simulation Environment. A custom, open-source simulation software environment was designed to allow for fast, risk-free experimentation with realistic dynamics. It includes a control panel GUI that allows an operator to change robot and simulation parameters on the fly.

3-1 Example Pointcloud and Heightmap. A visualization of the fine-resolution pointcloud output by the front-mounted RealSense D435 camera. The corresponding local heightmap is overlaid in simulation.

3-2 Overview of the Traversability Map Generation Process. An example of how a traversability map is derived from pointcloud data. The first figure contains a pointcloud representation of a stairset with a single noisy return. The sensor data is then binned into discrete cells to form a 2.5D heightmap. The heightmap is then subsequently filtered to deal with noise and sparsity. Finally, the gradients of the filtered heightmap are computed in order to segment the terrain based on traversability. In the right-most figure, blue represents traversable regions while yellow represents non-traversable regions.

4-1 Workspace. An illustration of an example workspace. The shaded regions represent obstacles in the environment.

4-2 Configuration Space. An illustration of the configuration space derived from the example workspace in Figure 4-1. The white space is 𝐶𝑓𝑟𝑒𝑒, the inner shaded regions make up 𝐶𝑜𝑏𝑠, and the shaded region between the solid lines and the dotted lines make up 𝐶𝑏𝑢𝑓.

5-1 Software Demo 1: Hallway. (a) The robot autonomously navigates through a simulated hallway to the desired waypoint (red sphere). The planned path is visualized as a blue line. (b) The local traversability map is overlaid in simulation. Blue patches represent non-traversable locations by the CoM while green is traversable.

5-2 Software Demo 2: Maze. The robot continuously replans for the optimal, shortest path as the user moves the waypoint (red sphere) to different locations.

5-3 Software Demo 3: Stairs. The robot successfully recognizes the stairs as a traversable obstacle and autonomously climbs to the top without directly stepping on any of the vertical sections. The overlaid local traversability map can be seen with blue signifying invalid foothold locations.

5-4 Hardware Demo 1: Treadmill Platform. The robot successfully recognizes and avoids two different-sized obstacles in its path as it walks from one side of the platform to the other end.

5-5 Hardware Demo 2: MIT Hallway. A timelapse of the Mini-Cheetah Vision autonomously navigating a cluttered hallway.

5-6 Hardware Demo 3: Outdoor Environments. The Mini-Cheetah Vision during a handful of experiments in various real-world outdoor environments.

5-7 Hardware Demo 4: Perturbation-Robustness Testing. A graph showing the 𝑥 and 𝑦 coordinates of the robot’s center of mass as it follows a planned trajectory. We see that the robot successfully reaches the desired waypoint even in the presence of two human-applied pushes at randomly-selected times.

List of Tables

A.1 Specifications for the UP board

A.2 Specifications for the NVIDIA Jetson TX2

Chapter 1

Introduction

Inspired by the impressive agility of biological systems, robotics research has long been
motivated by the potential to deploy robots in disaster response scenarios where they
may augment or even surpass human capabilities. Although there has been exciting
progress over the past few years resulting in mobile robots being used for construction,
surveillance, agriculture, maintenance, and even entertainment purposes, it is often
the case that the robots are designed to achieve a singular goal and are unable to gen-
eralize to different tasks or unknown environments. As a result, robots are still unable
to be utilized to their full potential in disaster relief situations such as an earthquake
where the highly unstructured, uncertain, and extremely dangerous terrain is difficult
to model beforehand. It is clear that research is still far from achieving the agility
and robustness of biological species, with plenty of areas within the field in need
of improvement including balance control, hardware design and integration, terrain
perception, and autonomous path planning and tracking. In this thesis, we jointly
focus on the areas of environment mapping and path planning, specifically targeting
quadruped robotic platforms with the goal of achieving collision-free navigation in
challenging terrain.

1.1 Motivation

Legged robots in particular have gained a lot of attention due to their unique set of
advantages over other forms of mobile robots. Compared with wheeled or tracked
robots, legged robots are well-suited for irregular and difficult terrain due to the fact
that they do not require a continuous path to move from a starting position to an
end goal. Instead, discrete foothold positions are enough to traverse terrain even
when the terrain contains obstacles, gaps, or uneven surfaces. Additionally, legged
robots move by creating contact with the ground which generates forces that move
the robot forward. These contact forces can be quite large, allowing them to carry a
heavy payload relative to their overall body size which opens doors to several useful
applications. Compared with bipedal humanoid robots, quadrupeds have distinct
advantages that further motivate their study. Not only are quadruped robots often
smaller, more compact, and more agile than their biped counterparts, the issue of
balancing is also much simpler in most situations due to the greater number of legs
that can be used at any given time. These benefits often outweigh the resulting added
complexity, and for this reason, we place our attention fully on quadruped robotic
platforms.

1.2 Recent Advancements

Path planning and obstacle avoidance strategies for mobile robots have a long and
extensive history in robotics literature [54]. Current approaches for legged robots lie
along a spectrum based on how much of the robot’s detail is taken into account.
On one end of the spectrum, every detail is considered and so the problem becomes
incredibly complex, involving all of the robot’s degrees of freedom. This approach
works well for some tasks, such as short-term whole-body manipulation [35], but it
often becomes computationally impractical to solve. Regardless, results in
[23, 48] have achieved success by using this approach to stitch together different
whole-body configurations as part of an overall locomotion plan.

On the other end of the spectrum exist path planners that abstract away all the
details of the robot’s legs and instead treat the robot as a point moving through space.
Navigation strategies utilizing this sort of approach often search for a collision-free
path in a 2D or 3D environment. Because of the low dimensionality of this problem
space, very efficient graph-search algorithms can be employed [50]. In the literature,
these techniques have primarily been applied to biped humanoid robots. For exam-
ple, the works in [43, 37] compute local footstep paths to follow a globally-computed
2D path. In particular, work done in [11] was one of the first to propose planning
footstep paths with graph-based search methods, although it required pre-computing
footstep transitions across obstacle areas in order to estimate the traversal costs,
which failed in densely-cluttered environments [24]. Other results such as [34, 45]
produced conservative global navigation strategies obtained by choosing appropri-
ate bounding volumes for the robot and following trajectories computed by a 2D
planner without the burden of traversability estimation, but this forced the robot to
circumvent obstacles rather than traversing them by stepping over or onto them. In
our work, we build on the latter approach for path planning and apply many of the
mentioned ideas to quadruped robotic platforms.

1.2.1 Quadruped Robotic Platforms

Over the past few decades, quadruped robots have seen a surge in popularity because
of the increased accessibility to good hardware, appropriate actuation methods, and
the improved functionality of locomotion controllers. Though there exists a large
body of literature in robotics devoted to quadruped robots, only a somewhat small
subset focuses on navigation using vision-enabled platforms, likely due to the fact
that there are only a handful of such widely-known robotics research platforms. The
most prominent ones include Spot Mini from Boston Dynamics [4], ANYmal from
ETH Zurich [26], HyQ from the Istituto Italiano di Tecnologia (IIT) [49], and the
cheetah robots from MIT [8, 29]. With hardware designs constantly improving, there
have been various advancements in mapping and navigation strategies for quadruped
robots.

Figure 1-1: Popular Quadruped Robotic Platforms. From left to right: Boston
Dynamics’ Spot Mini, ETH Zurich’s ANYmal, and IIT’s HyQ robot.

In our research, we focus on and emphasize achieving robustness in real-world


scenarios. This means that the environment is unseen and the robot must perceive,
plan, and adapt its movement as it travels, especially if the environment changes.
We also require that all sensing and computation is done onboard in real-time, where
sensors may experience drift, delay, or noise. Under these constraints, only a handful
of integrated pipelines come close to meeting the criteria. The most well-known and
publicized is without a doubt Boston Dynamics’ Spot Mini robot, which has been
shown to traverse challenging terrain including stairs at high speeds. However, very
little is known or published about their proprietary technology. Some research groups
have published results using Boston Dynamics’ older platform, LittleDog, and while
the results were impressive, the approaches require external motion-capture systems
and pre-scanned terrain models [28, 42].
Advances towards a fully-integrated system were demonstrated on a robot known
as Messer II [6], which used a Rapidly-Exploring Random Tree (RRT)-based planner
to find a complete kinematic plan, integrating foothold selection and motion planning
into one sampling-based planner. The drawback of this type of randomized, sampling-
based method is that no bounds on the quality of the found solution can be given,
unlike graph-based methods such as A*. Additionally, the platform is still tethered.
More recently, [15] presented a perception-based, statically-stable motion planner
for the ANYmal platform, where for each footstep the algorithm generates a foothold,
a collision-free foot trajectory, and a body pose. However, they do not deal with
dynamic gaits (such as trotting) and do not account for leg collisions or external

disturbances when generating the trajectories.
Closest to our approach is the work done on IIT’s hydraulic quadruped HyQ
[39, 57]. Their system performs online state estimation and probabilistic robo-centric
terrain mapping [14], allowing it to autonomously climb over previously unseen ob-
stacles with a maximum step height of 15 cm at speeds of 7-13 cm/s.

1.2.2 Machine Learning

It is worth mentioning that outside of classical methods, there has recently been a
surge in interest in using neural networks in legged systems with some functioning
results. Machine learning has already been used to solve many difficult tasks in other
areas of robotics [51] such as computer vision [27], manipulation [33, 17], and multi-
agent cooperation [20, 47]. As a result, it has now gained popularity as an approach
to tackle problems specifically in quadruped robotics. For example, and most rele-
vantly, [55] used a convolutional neural network on the HyQ platform to understand
the environment and dynamically select feasible foothold locations. Reinforcement
learning as well has shown more and more impressive results during the last few
years, in particular those achieved in [53] on the ANYmal platform, but their results
do not reach fast locomotion speeds and are currently limited to simulation without
the capability to deploy on actual hardware in real-world terrains.

1.2.3 Current Challenges

Though the works reviewed above show promising results, there is one particular
shortcoming that has not been fully addressed. Because most of the robots are large
and heavy, none are able to achieve truly high-speed locomotion either as a result
of limitations of the locomotion controller or of computation time and power, with
most software pipelines running at only 1-2 Hz. To the best of our knowledge, the
current state-of-the-art, publicly-available results are IIT’s latest iteration of HyQ,
where they use a convolutional neural network (CNN) to classify footholds during
locomotion [55]. However, they are still only able to achieve speeds of up to 0.5 m/s.

1.3 Contributions

The work presented in this thesis represents advancement in the field of quadruped
robotics, with the major contributions being the implementation and considerable
improvements of the entire perception and planning system on the MIT Mini-Cheetah
Vision robotic platform. In particular, the contributions are twofold:

1. We design and implement a higher-level mapping and planning framework for autonomous navigation that takes advantage of the Mini-Cheetah Vision’s existing motion controllers and allows it to robustly avoid obstacles in real-world environments at high velocities.

2. We experimentally validate the framework and algorithms both in simulation and on the actual robot hardware in several terrains of varying difficulty.

Since autonomous quadruped locomotion does not have a clear, quantitative met-
ric of success in the classical sense, we determine the robot to successfully accomplish
a task if it is able to reach the designated goal without falling, colliding with an ob-
stacle, or taking an unreasonable route or amount of time to get there. Furthermore,
we require the experiment to be repeatable over many trials with a high success rate,
and consider a faster rate of locomotion to be better. Based on this definition, our
vision-aided planning method achieves a high rate of success at locomotion speeds of
over 1 m/s.

1.4 Thesis Outline

The remainder of this thesis is structured as follows. Chapter 2 details the quadruped
robotic platform, the simulation environment used for development, and the structure
of the software stack that operates on the robot. Chapter 3 thoroughly describes the
architecture of the mapping and traversability estimation portion of the perception
pipeline, while Chapter 4 describes the graph-based planning algorithm and how it
uniquely incorporates information gathered from the mapping data. In Chapter 5,

we move on to the experimental validation of the proposed framework, showcasing
our results from various simulated environments as well as tests in real-world environ-
ments using the actual Mini Cheetah Vision hardware. Chapter 6 concludes the thesis
by summarizing the work and briefly discussing possible future research directions.
What follows after are appendices, which contain supplementary information that the
reader may find useful. Appendix A provides detailed technical specifications of the
hardware used in the Mini-Cheetah and Mini-Cheetah Vision robots while Appendix
B contains pseudocode for the graph-search algorithm used.

Chapter 2

System Overview

The robotics platform used in this research is the MIT Mini-Cheetah Vision quadruped
robot developed by the MIT Biomimetic Robotics Lab. In this section, we describe in
detail the design of the platform, focusing on the system specifications most relevant
to executing navigation-planning tasks. We begin with an overview of the original
MIT Mini-Cheetah robot, discuss the modifications made in the MIT Mini-Cheetah
Vision robot variant, and examine the structure and operation of the software stack
and simulation environment.

2.1 MIT Mini-Cheetah

The following is a high-level overview of the MIT Mini-Cheetah robotics platform.


For a complete and detailed description of the system architecture, we refer the reader
to [29]. The MIT Mini-Cheetah is a lightweight, inexpensive, and high performance
quadruped robot that allows for rapid development and experimentation of control
and planning systems. As the successor to the famous MIT Cheetah 3 [8], the Mini-
Cheetah was built using many of the same design principles at a smaller scale while
making improvements based on the experience gained from running experiments on
the larger robot. Its actuators, which are comprehensively described in [30], were
developed to be compact, low-cost modules without suffering any performance penal-
ties. Compared to the Cheetah 3, the Mini-Cheetah exhibits increased robustness

Figure 2-1: MIT Mini-Cheetah Robot. A small-scale, electrically-actuated, high-
performance quadruped robot.

due to its lower weight and center-of-mass (CoM) height which results in smaller
impact forces and magnitude of momentum when falling. The small-scale design is
specifically geared towards conducting rapid and frequent experimentation on the
platform.

2.1.1 Specifications

As the name implies, the Mini-Cheetah is lightweight, weighing only 9 kg and standing
at a height of 0.3 m, making it easily handled by a single operator. The body of the
robot is 0.38 m in length while the upper and lower leg links measure out to 0.21 m
and 0.20 m in length, respectively. It has a ground clearance of 0.25 m, which is
comparable to the height of a single stair or street curb. Despite its small size and
low cost, the Mini-Cheetah is capable of highly-dynamic locomotion at speeds of up
to 2.45 m/s as well as complex behaviors including a 360° backflip from a standing
position.

2.1.2 Hardware

The robot is powered by an off-the-shelf 24 V, 120 Wh Li-Ion battery normally designed for use in cordless power tools. These batteries are cheap, easy to charge,
contain built-in power management systems, and have simple mechanical interfaces
that allow for quick battery swapping.
Locomotion and other high-level control tasks are executed on a small (approxi-
mately the size of a credit card), low-power UP board computer fitted with a quad-core
Intel Atom x5-z8350 CPU and 4 GB of RAM (see Appendix A for specifications).
The UP board runs Ubuntu 18.04 with a custom-built Linux kernel with the proper
UP board drivers as well as the PREEMPT_RT patch. This patch allows nearly
all of the kernel to be preempted by other processes which is critical for the near
real-time operation of the locomotion controller. High-level communication and data
logging is achieved using the Lightweight Communication and Marshalling (LCM)
library [25]. In particular, LCM allows for very easy integration of additional sensor
data. The UP board communicates with the Mini-Cheetah’s twelve actuators via a
custom-made quad CAN bus interface. The control, state estimation, and actuator
communication loops each run at 500 Hz, although the locomotion control typically
does not execute every iteration.

2.2 MIT Mini-Cheetah Vision

The MIT Mini-Cheetah Vision is an upgraded build of the original Mini-Cheetah that is outfitted with additional exteroceptive sensors for capturing and processing
visual input data. This variant of the Mini-Cheetah was used for all of the research
conducted in this thesis.

2.2.1 Specifications

The Mini-Cheetah Vision is remarkably similar to the original, sharing many of the
same components and body structure. However, in order to fit all of the additional

sensors and compute, the length of the body was made longer by 2.3 cm, increasing
the weight to approximately 9.6 kg. The height and leg link lengths remain the same
as before.

Figure 2-2: MIT Mini-Cheetah Vision Robot. A variant of the original Mini-
Cheetah upgraded with additional compute power and three Intel RealSense percep-
tion sensors for exteroceptive sensing.

2.2.2 Hardware

The hardware of the Mini-Cheetah Vision is almost exactly the same as the original
Mini-Cheetah with the exception of additional components for vision-specific pro-
cessing. Alongside the UP board, the Mini-Cheetah Vision also contains an NVIDIA
Jetson TX2 embedded computing module. The TX2’s 256-core GPU with 8 GB of
LPDDR4 memory makes it particularly well-suited for heavy vision-related comput-
ing tasks, something that the UP board is not optimized to handle. The two boards
communicate relevant data between each other using LCM.

Intel RealSense Cameras

Perception of the robot’s environment is achieved by a combination of two different Intel RealSense camera models: the D435 and T265. The RealSense D435 is a depth
camera that provides the robot with depth data that is used in constructing a terrain
map of the surrounding area, while the RealSense T265 is a low-power tracking camera
used for global localization of the robot. These particular cameras were selected
because their compact size, low power consumption, and high frame rate fulfill the
strict latency, throughput, and size requirements necessary for rapid onboard sensing.
LiDAR sensors were originally considered but their large size (roughly 1/3 of the Mini-
Cheetah’s body length at the time) made them difficult to integrate. Additionally,
the power consumption required for localization computation would have been too
demanding for the robot’s battery. The LiDAR sensor’s high cost, low frame rate,
and high minimum range of approximately 1 m were also significant factors in the
final decision.
The RealSense D435 measures 90 mm × 25 mm × 25 mm, has a mass of 71.8 g,
and a wide 86° × 57° × 94° field of view (FOV). The sensor publishes 640 × 480-sized
depth images at a rate of 90 Hz in a pointcloud format with an accuracy of less than
1% error for each meter of distance away from the aperture. For the vision-based
planning tasks where most of the sensor readings occur less than a meter away, we
expect an overall accuracy between 2.5 mm and 5 mm. The Mini-Cheetah Vision has
one D435 camera mounted directly on the front that is used to detect obstacles and
build maps of the local terrain.

Figure 2-3: Intel RealSense D435 Depth Camera. A depth camera that provides
the robot with vision data used for constructing a terrain map of the surrounding area.

The RealSense T265 tracking camera measures 108 mm × 24.5 mm × 12.5 mm,
weighs 55 g, and has two fisheye lenses with a combined 163 ± 5° FOV. Using its on-
board integrated inertial measurement unit (IMU) and visual processing unit (VPU),
the T265 publishes a pose estimate via a proprietary simultaneous localization and
mapping (SLAM) algorithm at a rate of 200 Hz. The Mini Cheetah Vision has two
T265 cameras mounted on either side of its body that are used for localizing the
global position and orientation of the robot as it moves through the environment.

Figure 2-4: Intel RealSense T265 Tracking Sensor. A low-power tracking camera
used for global localization of the robot.

2.3 Robot System Model

It is important to clearly state the definitions and conventions of the physical quan-
tities that model the Mini-Cheetah and Mini-Cheetah Vision robot systems. Unless
otherwise stated, physical quantities, vectors, and matrices will be defined in the in-
ertial frame as denoted by the superscript (·)ℐ , where the inertial frame originates
at a fixed, arbitrarily-defined origin Oℐ . In our case, Oℐ is set to the robot’s initial
position upon start-up. Quantities that are expressed in the robot’s body frame are
denoted by the leading superscript (·)ℬ . Later on, we also introduce the coordinate
frame of the depth camera, and we denote quantities in that frame with the super-
script (·)𝒞 . If a quantity written does not explicitly define a coordinate frame, it can
be assumed to be in the inertial (world) frame. Figure 2-5 illustrates the model of the

Mini-Cheetah Vision with the CoM point and coordinate systems clearly marked.

Figure 2-5: Mini-Cheetah Vision Coordinate Systems. A body-fixed coordinate frame, ℬ, originates at the robot’s CoM 𝑝. A second coordinate frame originates at
the center of the depth camera mounted at the front, denoted as 𝒞. The robot itself
moves around in the world inertial frame, ℐ.

The translational position and velocity of the robot’s CoM are straightforward to
define and follow common conventions. The position of the CoM is denoted by the
vector from the origin,

    𝑝 = [𝑥, 𝑦, 𝑧]⊤ .                                        (2.1)

It is worth noting that because the Mini-Cheetah’s legs are light compared to the
weight of the robot’s body, the center of the body base and the CoM are approxi-
mately equal and can often be used interchangeably. Additionally, when computing
navigation paths in 2D, we may drop the 𝑧 coordinate and refer to 𝑝 as only a 2D
point. To get the velocity of the CoM in the inertial frame, we take the first derivative

of this position vector, which we denote as

    𝑝˙ = [𝑥˙, 𝑦˙, 𝑧˙]⊤ .                                     (2.2)

Formally defining the robot body’s orientation is slightly more involved, in par-
ticular because there exist multiple commonly-used representations. In our work, we
primarily use the Euler angle representation, defined as

    Θ = [𝜑, 𝜃, 𝜓]⊤                                          (2.3)

where 𝜑 (roll), 𝜃 (pitch), and 𝜓 (yaw) are successive rotations about the respective
instantaneous axis. More specifically, we use the ZYX convention, meaning that
rotations are applied in the order of first yaw, then pitch, and finally roll.

It is sometimes more convenient to represent orientation using a full rotation matrix R ∈ ℝ^(3×3) from the special orthogonal group 𝑆𝑂(3). Any rotation matrix
R can be constructed from Euler angles as a product of three elemental rotation
matrices like so:
R = R𝑧 (𝜓)R𝑦 (𝜃)R𝑥 (𝜑) (2.4)

where R𝑛 (𝛼) represents a positive rotation of 𝛼 about the 𝑛-axis, making it relatively
easy to convert back and forth between the two representations.
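
To make Eq. (2.4) concrete, the rotation matrix can be assembled directly from the three Euler angles. The following is a minimal sketch using the Eigen library (which the software stack already depends on); the helper name is our own and is not taken from the robot codebase.

    #include <Eigen/Geometry>

    // Build R = Rz(psi) * Ry(theta) * Rx(phi) from ZYX Euler angles (Eq. 2.4).
    Eigen::Matrix3d rotationFromEulerZYX(double phi, double theta, double psi) {
      return (Eigen::AngleAxisd(psi,   Eigen::Vector3d::UnitZ()) *
              Eigen::AngleAxisd(theta, Eigen::Vector3d::UnitY()) *
              Eigen::AngleAxisd(phi,   Eigen::Vector3d::UnitX()))
          .toRotationMatrix();
    }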

The concept of a homogeneous transformation matrix will be useful when working with pointclouds later on, so we define it now. A homogeneous transformation matrix
T is a 4 × 4 matrix from the Special Euclidean group 𝑆𝐸(3), the group of all valid
rotations and translations of a rigid body. Each matrix T is composed of a 3 × 3

rotation matrix R and a 3D translation vector 𝑡 as follows:

    T = ⎡ R  𝑡 ⎤
        ⎣ 0  1 ⎦                                            (2.5)

where 0 represents the three-dimensional row vector of zeros. This transformation matrix can be used to simultaneously rotate and translate a 3D point from one frame
to another. For example, to project a point 𝑝 from the inertial (world) frame to the
robot body frame, one can simply do

𝑝ℬ = Tℬℐ 𝑝 (2.6)

where Tℬℐ is the matrix representing a transformation from frame ℐ to frame ℬ. Be-
cause the transformation matrix is four-dimensional, a 1 is implicitly appended to
the end of the vector 𝑝 to meet the dimensional requirements for matrix multiplica-
tion. Similarly, the fourth entry in the output vector is ignored when extracting the
resulting coordinates.
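
As a small illustration of Eqs. (2.5) and (2.6), the sketch below (again assuming Eigen, with helper names of our own choosing) assembles T from R and 𝑡 and applies it to a point, appending and then discarding the implicit fourth component exactly as described above.

    #include <Eigen/Dense>

    // Assemble the 4x4 homogeneous transform of Eq. (2.5) from R and t.
    Eigen::Matrix4d makeTransform(const Eigen::Matrix3d& R, const Eigen::Vector3d& t) {
      Eigen::Matrix4d T = Eigen::Matrix4d::Identity();
      T.block<3, 3>(0, 0) = R;
      T.block<3, 1>(0, 3) = t;
      return T;
    }

    // Apply T to a 3D point (Eq. 2.6): append a 1, multiply, drop the last entry.
    Eigen::Vector3d transformPoint(const Eigen::Matrix4d& T, const Eigen::Vector3d& p) {
      Eigen::Vector4d ph;
      ph << p, 1.0;                 // homogeneous coordinates [x y z 1]^T
      return (T * ph).head<3>();    // discard the trailing 1
    }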

2.4 Software System Architecture

The software stack running on the Mini-Cheetah Vision consists of four primary
modules: perception and high-level planning, locomotion control, state estimation,
and a motion library. We provide a brief summary of each component’s functionality
here.

Perception and Planning

The perception stack uses the integrated sensors to compute the robot’s absolute pose
in the global frame and build an elevation map around the robot used for evaluating
terrain traversability. The output is then sent to a high-level CoM planner that solves
for a path to the desired goal location. Once computed, the path is serialized and
sent over the LCM network so that the locomotion controller is able to access it.

Figure 2-6: High-Level System Architecture. Block diagram visualizing the
software architecture and the interaction between modules. Green represents the
perception and planning module, blue represents the state estimation module, and
red represents the locomotion control module. The motion library (not shown) is a
separate component.

Locomotion Control

The locomotion controller parses the serialized path from the vision stack and trans-
lates it into corresponding joint torque and velocity commands that the hardware
can understand. Various controllers are implemented and available to the operator
including convex MPC [12] and WBC [32] controllers that can be cycled through using
the implemented state machine. In our work, we exclusively use an RPC controller
with vision-based foothold adjustments [7, 31]. We also note that all the locomotion
controllers support taking input from a Logitech gamepad or RC controller for cases
where the operator needs to manually control the robot.

State Estimation

The state estimation module of the software stack is implemented as a two-tiered hier-
archical structure similar to the framework described in [13] that integrates kinematic-
inertial estimation with localization sensor data. This hierarchical framework lever-
ages the benefits of the various sensor data streams to create a state estimation al-
gorithm optimized for our dual planning and locomotion control architecture. While
kinematic and inertial measurements come from sensors such as the IMU, localization
measurements are provided by the two RealSense T265 cameras mounted on either
side of the robot, giving accurate, drift-robust localization even in cases where the

view from one side of the robot is obstructed. In the rare case that both sides of the
robot are blocked (or in exceptionally low-light conditions) the state estimator will
resort to using only kinematic-inertial data streams.

Motion Library

The motion library is a series of pre-optimized movements and trajectories like jump-
ing, backflipping, and recovery stand-up protocols that can be utilized when normal
locomotion is insufficient. These motions are generated using CasADi [5], an open-
source non-linear optimization library that outputs a set of joint states and torques
for each timestep of the desired trajectory. The motions can be accessed by switching
into their corresponding states within the locomotion controller state machine.

2.5 Simulation Environment

Experimentation directly on the robotic platform hardware can be tedious and risky.
Accidents or mishaps during testing can cause damage to the robot or operator,
delaying development progress while the hardware is repaired. This is especially
problematic if replacing or remanufacturing the broken parts is difficult and costly to
do. In the robotics community, it is common practice to at least initially perform
research in simulation where everything is virtual and there is no risk to the operator
or hardware. Running experiments in simulation comes with the added benefit of
being able to draw various helpful visualizations like the planned CoM trajectory, as
well as the ability to slow down, pause, or otherwise manipulate the physics engine.
Custom-made environments can also be loaded in, allowing the researcher to test in
terrains containing stairs, hallways, hurdles, etc. without physically needing access
to locations with such obstacles. Furthermore, being able to control the environment
makes it simple to test and debug specific parts of the software stack since simulations
can be run over and over with the same set of initial conditions.

Figure 2-7: The MIT Biomimetic Robotics Lab’s Open-Source Simulation
Environment. A custom, open-source simulation software environment was de-
signed to allow for fast, risk-free experimentation with realistic dynamics. It includes
a control panel GUI that allows an operator to change robot and simulation param-
eters on the fly.

To this end, a realistic, open-source simulation environment was developed by the MIT Biomimetic Robotics Lab to allow for rapid and safe Mini-Cheetah development
[1]. Fast and accurate calculation of the robot dynamics is done via Featherstone’s
Articulated Body Algorithm (ABA) [19] which calculates the forward dynamics of
the rigid body subtrees by making use of 6D spatial vector algebra [18]. The entirety
of the simulator is written in C++ and relies on common open-source libraries such
as Eigen [21] for matrix algebra, Qt [3] for user-interface displays, OpenGL [2] for
graphics rendering, LCM [25] for network marshalling, as well as a few linear and
non-linear optimization solvers.
The simulator’s built-in graphical user interface (GUI) allows the researcher to
make real-time changes to general parameters such as ground stiffness and damping,
as well as the user-defined controller parameters such as gains and control states.

Moreover, the GUI has the option to switch over and deploy the code directly on the
robot with a single click, eliminating the need to develop for each platform type sep-
arately. Having this easy-to-use, realistic simulator proved invaluable in completing
this research.

Chapter 3

Dynamic Terrain Mapping

In this chapter, we describe the pipeline that acquires terrain information from the
robot’s vision sensors and builds a model of the environment used for evaluating
the terrain’s traversability. Estimating traversability in rough, non-planar environ-
ments is not a straightforward task, especially because the traversability of the ter-
rain can vary continuously between fully traversable, partially traversable, and non-
traversable. Furthermore, traversability not only depends on the terrain, but also on
the capabilities of the robotic platform being used.
To tackle this problem, we designed and implemented an onboard algorithm that
efficiently stores and continuously updates the state of the world at both local and
global scales. We first explain how the terrain is modeled and then show how this
information is processed and transformed into a qualitative metric of the terrain
topology.

3.1 Terrain Modeling

In the literature, there exist various different techniques for representing the environ-
ment of a mobile robot. One popular approach is to utilize the full 3D representation,
as is done with raw data points or triangle meshes [44, 52]. Although these models
are highly accurate, they incur a huge memory cost that grows linearly in the number
of sensor readings. There exist alternatives such as 3D-based grid [41] and tree [40]

representations, but they still require large amounts of memory that the compute
onboard the Mini-Cheetah Vision cannot handle. In order to avoid the complexity
of the full 3D models, one attractive alternative is to use a 2.5D elevation map [46]
which is a more compact representation of the world. With an elevation map, the
terrain is represented as a 2D grid in which each cell stores an estimate of the height
of the particular region of the environment. This is the approach that our pipeline
builds off of.
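
Although the pipeline described in the following sections is built on the GridMap library, the underlying idea of a 2.5D elevation map is simple enough to sketch directly. The toy structure below is our own illustration (not the GridMap API): it stores one height per cell and converts world coordinates into grid indices.

    #include <cmath>
    #include <limits>
    #include <vector>

    // Minimal 2.5D elevation map: a 2D grid of heights indexed by world (x, y).
    struct ElevationMap {
      double resolution;          // cell side length [m]
      double origin_x, origin_y;  // world coordinates of cell (0, 0)
      int rows, cols;
      std::vector<float> height;  // row-major; NaN marks an unobserved cell

      ElevationMap(int r, int c, double res, double ox, double oy)
          : resolution(res), origin_x(ox), origin_y(oy), rows(r), cols(c),
            height(static_cast<size_t>(r) * c,
                   std::numeric_limits<float>::quiet_NaN()) {}

      bool worldToIndex(double x, double y, int& i, int& j) const {
        i = static_cast<int>(std::floor((x - origin_x) / resolution));
        j = static_cast<int>(std::floor((y - origin_y) / resolution));
        return i >= 0 && i < rows && j >= 0 && j < cols;
      }

      float& at(int i, int j) { return height[static_cast<size_t>(i) * cols + j]; }
    };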

3.2 Elevation Map Construction

The Intel RealSense D435 camera mounted on the front of the robot publishes point-
clouds to the NVIDIA TX2 at a rate of 90 Hz. These pointclouds are localized within
the reference frame of the camera, denoted as 𝒞, so in order to update the global
map, they must first be transformed into the world inertial reference frame ℐ via a
transformation matrix T. This is done by applying two separate transformations:
camera frame to body frame, and body frame to world frame, which we denote as

T = Tℐℬ Tℬ𝒞 . (3.1)

Because the D435 is mounted at a fixed location on the front of the robot, the first
transformation matrix is constant and hardcoded. In the case of the Mini-Cheetah
Vision, the sensor is attached 28 cm in front of the CoM at a slightly downward
angle. On the other hand, the second transform is computed dynamically from the
T265 localization data as the robot moves around in the global frame.
For each point 𝑝𝒞𝑖 in the pointcloud, we transform it into the inertial frame via
the following calculation:
𝑝𝑖 = T𝑝𝒞𝑖 (3.2)

where we implicitly add and then remove the 1 in the fourth component when per-
forming the multiplication. Once the entire pointcloud has been projected into the
world frame, it is used to update the saved elevation map. Figure 3-1 depicts the

Figure 3-1: Example Pointcloud and Heightmap. A visualization of the fine-
resolution pointcloud output by the front-mounted RealSense D435 camera. The
corresponding local heightmap is overlaid in simulation.

Mini-Cheetah Vision with the simulator open for visualization of the pointcloud and
generated local heightmap.
We use the open-source GridMap library [16] as the underlying data structure
for its efficiency, robustness, and built-in functions for querying and iterating over
grid cell locations. For each point 𝑝𝑖 = [𝑥𝑖 , 𝑦𝑖 , 𝑧𝑖 ]⊤ in the transformed pointcloud, we
compute the corresponding grid cell that it belongs to and set that cell’s value to the
𝑧 coordinate of the point. If multiple points of a scan fall into the same grid cell, the
maximum 𝑧 value seen during that iteration is kept. We note that existing elevation
values are overwritten by new data from future scans. To avoid inserting height values
into the map which are likely to be erroneous due to noise, only scan points within a
certain distance from the camera are considered. We experimentally determined that
a maximum distance cap of 1.5 m achieves reasonably-accurate results.
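
Putting Eqs. (3.1) and (3.2) together with the binning rule above, the update for a single scan looks roughly as follows. This sketch reuses the hypothetical ElevationMap and transformPoint helpers introduced earlier; for brevity it simply keeps the per-cell maximum, leaving out the overwriting of stale values by later scans that the actual pipeline performs.

    #include <Eigen/Dense>
    #include <cmath>
    #include <vector>

    // Fuse one camera-frame pointcloud into the world-frame elevation map.
    void updateElevationMap(ElevationMap& map,
                            const Eigen::Matrix4d& T_world_camera,  // T = T_IB * T_BC
                            const std::vector<Eigen::Vector3d>& cloud_camera,
                            double max_range = 1.5) {
      for (const Eigen::Vector3d& pc : cloud_camera) {
        if (pc.norm() > max_range) continue;  // drop far returns, which are likely noisy
        const Eigen::Vector3d pw = transformPoint(T_world_camera, pc);
        int i, j;
        if (!map.worldToIndex(pw.x(), pw.y(), i, j)) continue;
        float& h = map.at(i, j);
        if (std::isnan(h) || pw.z() > h) h = static_cast<float>(pw.z());
      }
    }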

3.3 Traversability Estimation


Before being evaluated for traversability, the elevation map is first post-processed and
cleaned up in order to deal with sparse regions or noisy data that was received from

the sensor. We treat the map as if it were a single-channel 2D image and filter it using
erosion and dilation convolution operations from the OpenCV package. The dilation
primarily serves to fill in sparsely-sampled regions while the erosion operation serves
to eliminate extraneous sensor returns. The resulting filtered map is stored as a new
layer separate from the elevation map within the GridMap data structure.
Based on the filtered elevation map, we assign a traversability value to each cell
where a 1 indicates the robot’s CoM can enter the cell and 0 means it can’t. The value
is computed as a result of a two-step process. The first step is to generate a temporary
gradient map using the Sobel operator in OpenCV. The Sobel operator convolves two
3 × 3 kernels with the original map to approximate the discrete derivatives in the 𝑥
and 𝑦 directions at each location in the map. Given a filtered elevation map E, the
gradients are computed as

         ⎡ −1  0  1 ⎤                ⎡ −1  −2  −1 ⎤
    G𝑥 = ⎢ −2  0  2 ⎥ * E,      G𝑦 = ⎢  0   0   0 ⎥ * E      (3.3)
         ⎣ −1  0  1 ⎦                ⎣  1   2   1 ⎦

where * represents the 2D convolution operator. We then take G = max(G𝑥, G𝑦) (where the maximum is taken entry-wise) and threshold each value of the resulting map into a 0 or a 1. Based on the locomotion abilities
of the robot, we chose a maximum gradient value of 0.5 to represent non-traversable
terrain.
Lastly, we apply a Gaussian blur filter to the traversability map in order to expand
out the footprint of each of the obstacles. The filter size is chosen based on the
cell resolution and the robot size so that obstacles are expanded out by roughly the
diameter of the smallest circle that encompasses the footprint of the robot body. This
is done so that the planner does not send the CoM along a path where the full robot
body will not fit, for example through a very narrow hallway. Since the Gaussian blur
produces floating-point values between 0 and 1, we iterate through a final time and
set any value that is explicitly non-zero to a 1 to signal that the robot CoM should
not enter that cell. This final blurred traversability map is saved and then passed on

to the path planner. A summary of this process is visualized in Figure 3-2.
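
The whole post-processing chain maps naturally onto OpenCV image operations. The sketch below is our own condensed version of the steps just described (dilate/erode filtering, the Sobel gradients of Eq. (3.3), the 0.5 gradient threshold, and the Gaussian inflation); kernel sizes, border handling, and names are illustrative rather than taken from the onboard implementation.

    #include <opencv2/imgproc.hpp>

    // elevation: single-channel CV_32F height map.
    // robot_kernel_px: Gaussian kernel size (odd), roughly the robot footprint in cells.
    // Returns an 8-bit mask in which 255 marks cells the CoM should not enter.
    cv::Mat computeTraversability(const cv::Mat& elevation, int robot_kernel_px) {
      // 1. Fill sparsely-sampled cells and remove extraneous returns.
      cv::Mat filtered;
      cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));
      cv::dilate(elevation, filtered, kernel);
      cv::erode(filtered, filtered, kernel);

      // 2. Discrete derivatives via 3x3 Sobel kernels (Eq. 3.3).
      cv::Mat gx, gy;
      cv::Sobel(filtered, gx, CV_32F, 1, 0, 3);
      cv::Sobel(filtered, gy, CV_32F, 0, 1, 3);
      cv::Mat g = cv::max(gx, gy);  // entry-wise maximum

      // 3. Threshold: gradients above 0.5 mark non-traversable cells.
      cv::Mat blocked;
      cv::threshold(g, blocked, 0.5, 1.0, cv::THRESH_BINARY);

      // 4. Inflate obstacles by roughly the robot footprint, then re-binarize.
      cv::GaussianBlur(blocked, blocked,
                       cv::Size(robot_kernel_px, robot_kernel_px), 0.0);
      cv::Mat non_traversable = (blocked > 0.0f);  // any non-zero value blocks the cell
      return non_traversable;
    }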

Figure 3-2: Overview of the Traversability Map Generation Process. An example of how a traversability map is derived from pointcloud data. The first figure
contains a pointcloud representation of a stairset with a single noisy return. The sen-
sor data is then binned into discrete cells to form a 2.5D heightmap. The heightmap
is then subsequently filtered to deal with noise and sparsity. Finally, the gradients
of the filtered heightmap are computed in order to segment the terrain based on
traversability. In the right-most figure, blue represents traversable regions while yel-
low represents non-traversable regions.

3.4 Implementation Details

Because the 2.5D elevation map is a discretization of the real world, the granularity
of the discretization must be explicitly chosen by hand. This is controlled by the
resolution parameter of the grid (side length of each square cell in the grid). For the
CoM path planning, we found a resolution of 10 cm to be sufficient to satisfy our
computational and navigational constraints. This was chosen based on the robot’s
length, since it is inefficient and unnecessary to use a resolution where the next CoM
waypoint falls within the footprint of the robot’s body. The global map size was set
to 20 m × 20 m for our experiments but can easily be increased or even dynamically
updated given the circular buffer implementation within the GridMap library and the
large amount of memory onboard the NVIDIA TX2.
Alongside this elevation map, we simultaneously update a second global map at
a finer resolution of only 1 cm. While this second map is not used at all by the path
planner, it is required by the lower-level Vision RPC locomotion controller so that it
can make vision-informed footstep adjustments [31]. This map is published from the
TX2 to the UP board over LCM, and in order to increase the publishing rate, only a

local, robo-centric region of size 1.5 m × 1.5 m is sent. The controller does not need
the full global map since it only needs to consider regions where the next footstep
can potentially land, and we find that the computation time required for sending the
LCM message is still within our desired time constraints.
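
For completeness, the sketch below shows one way such a robo-centric window could be cut out of a global grid before being published. It reuses the hypothetical ElevationMap struct sketched in Section 3.1; the real system uses the GridMap library and LCM message types that are not reproduced here.

    #include <cmath>
    #include <limits>
    #include <vector>

    // Copy a square window of side 'side' metres, centred on the robot at (cx, cy),
    // out of the global map. Cells falling outside the map are returned as NaN.
    std::vector<float> extractLocalWindow(const ElevationMap& map,
                                          double cx, double cy, double side) {
      std::vector<float> window;
      int ci, cj;
      if (!map.worldToIndex(cx, cy, ci, cj)) return window;
      const int half = static_cast<int>(std::round(0.5 * side / map.resolution));
      for (int i = ci - half; i <= ci + half; ++i) {
        for (int j = cj - half; j <= cj + half; ++j) {
          const bool inside = (i >= 0 && i < map.rows && j >= 0 && j < map.cols);
          window.push_back(inside
                               ? map.height[static_cast<size_t>(i) * map.cols + j]
                               : std::numeric_limits<float>::quiet_NaN());
        }
      }
      return window;
    }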

Chapter 4

Path Planning

The purpose of the path planner is to find a feasible and efficient path between
an initial location and a desired goal in an unknown environment while avoiding
obstacles. This can be a difficult problem to solve, especially if the terrain is complex
or the environment is highly dynamic and changing. In this chapter, we formally
define the path planning problem, describe the architecture of our planning system,
and discuss its implementation onboard the MIT Mini-Cheetah Vision.

4.1 Problem Formulation

We formulate the general path planning problem as follows: given the current state of
the robot, some partial knowledge of the environment, and a model of how the robot
interacts with the environment, what is the most efficient path between the robot’s
location and a pre-determined goal location? To answer this question, we first define
the mathematical framework that we use to model the problem.
We begin by introducing the concepts of workspace and configuration space, orig-
inally proposed in [38]. Formally, the workspace 𝑊 is the 2D or 3D space in which
the robot and obstacle geometry are described, which we assume is populated by
two kinds of objects:

1. obstacles: locations where the robot cannot, or should not, move to,

2. free space: locations where the robot is free to move.

In this work, we define the workspace to be ℝ². We choose this space because for the
purposes of higher-level motion planning, it is convenient to only consider the robot’s
CoM, which we represent as a point in the plane that is free to move in any direction.
Therefore, instead of planning over the workspace entirely with a full, physical model
of the robot (e.g. its length, width, leg length, etc.), the robot is condensed to a single
point. Furthermore, because the robot is constrained to walking on the ground, a full
3D representation is unnecessary.

Figure 4-1: Workspace. An illustration of an example workspace. The shaded regions represent obstacles in the environment.

Next, we define the concept of a configuration space. The configuration space 𝐶 is the space in which the actual motion and path of the robot is represented. In
our case, we choose 𝐶 to be similar to the workspace 𝑊 except that the obstacle
footprints have been expanded by a buffer equal to the diameter of the smallest circle
that encompasses the entire robot’s body for collision avoidance. The configuration
space 𝐶 is further partitioned into three regions: 𝐶𝑓𝑟𝑒𝑒, the space that is available for
the robot to move around in, 𝐶𝑜𝑏𝑠, the space that is occupied by obstacles, and 𝐶𝑏𝑢𝑓,
the buffer region around obstacles as a result of the expansion. Since the robot can

translate and rotate, the configuration space 𝐶 is thus the special Euclidean group:

𝐶 = 𝑆𝐸(2) = ℝ² × 𝑆𝑂(2)                                     (4.1)

where 𝑆𝑂(2) is the special orthogonal group of 2D rotations. A typical configuration can be represented with three parameters (𝑥, 𝑦, 𝜃) with 𝜃 being a heading angle from
the 𝑥-axis.

Figure 4-2: Configuration Space. An illustration of the configuration space derived from the example workspace in Figure 4-1. The white space is 𝐶𝑓𝑟𝑒𝑒, the inner shaded regions make up 𝐶𝑜𝑏𝑠, and the shaded region between the solid lines and the dotted lines make up 𝐶𝑏𝑢𝑓.

We note that planning within this configuration space ignores the dynamic con-
straints to which the robot is subject. This is done on purpose, in particular
because it allows for higher-level decision-making without being burdened by
lower-level considerations. We find that the Mini-Cheetah Vision’s low-level controller
is robust enough to take the planner’s output and transform it into motor commands
that are safe for the robot to execute in almost every scenario.
We model the configuration space as a graph so that we can apply search-based
algorithms. More formally, the environment is represented by a weighted graph 𝒢 =
(𝒱, ℰ, 𝑊ℰ ) where

∙ 𝒱 is the set of vertices (typically associated with a physical location in ℝ²),

∙ ℰ ⊆ 𝒱 × 𝒱 is a set of edges (pair of vertices), and

∙ 𝑊ℰ : ℰ → R is a weight, or cost, associated with each edge.

In our case, the centers of cells in the traversability map correspond to vertices, and
edges are assigned between neighboring cells using eight-way connectivity.
We also stress that in this work, we compute the robot’s overall CoM motion
first, unlike works such as [36] and [15] that select the quadruped’s contact locations
first. Although selecting footstep locations first may make sense when the robot’s mo-
tion is non-gaited (such as when carefully climbing over gaps), our goal is dynamic,
high-speed locomotion in real-world terrains where the robot’s contacts with the en-
vironment do not matter as much, so long as it can recover from any perturbations.
When planning beyond the range of what the robot’s sensors can detect (which we
limit to 1.5 m), planning every footstep in advance makes little sense as it places an
unnecessary computational burden on the hardware to generate a sequence that will
likely become invalid once the new sensor data arrives. Furthermore, our lower-level
controller adjusts the final footstep location based on local fine-grained visual input
that it receives during execution [31].

4.2 Algorithm Overview


The idea motivating the algorithm described in this chapter is that walking through
a building, park, or construction site does not require immediately planning every
single footstep, but rather only a rough sketch of the path to take. Toward this end,
we integrate the planner as a component of a tiered planning framework, where the
planner provides the rough sketch as a discrete set of waypoints that then gets filled
out during execution by the lower-level controller.
Given a goal location 𝑝𝑓 and the robot’s current position 𝑝𝑐 , the solution to the
path planning problem returns a sequence of 𝑛 discrete waypoints 𝑃 = {𝑝0 , 𝑝1 , . . . , 𝑝𝑛 }
where 𝑝0 = 𝑝𝑐 and 𝑝𝑛 = 𝑝𝑓 . In our work, we employ a grid-search approach using
the A* algorithm which finds the minimum cost path to a goal vertex in a graph,
assuming one exists, by first searching the vertices which are most likely to lead to the
goal based on some admissible heuristic [22]. The algorithm does this by maintaining

a tree of paths originating at the start node and greedily extending those paths one
edge at a time until some termination condition is satisfied. At each iteration of its
main loop, A* needs to determine which of the paths to extend, which it does based
on the current cost of the path, an estimate of the cost required to extend the path
to the goal, and whether or not the node is traversable in the traversability map.
Specifically, A* chooses the path that minimizes

𝑓 (𝑝) = 𝑔(𝑝) + ℎ(𝑝) (4.2)

where 𝑝 is the next waypoint in the path, 𝑔(𝑝) is the cost of the path from the start
position 𝑝𝑐 to 𝑝, and ℎ(𝑝) is a heuristic function that measures a lower bound of
the cost from 𝑝 to the target position 𝑝𝑓 . The key is that ℎ(𝑝) must be admissible,
or in other words, that it never overestimates the actual cost of arriving at the goal
location. This guarantees that A* will return the least-cost path from start to finish.
The algorithm terminates when the path it chooses to extend is a path from 𝑝𝑐 to 𝑝𝑓 ,
or if there are no paths available to be extended further. Lastly, we note that A* is
a static algorithm, which means that when the configuration space 𝐶 changes (such
as when an obstacle appears or the map updates), the old path is invalidated and
the algorithm must be re-run from scratch. For full pseudocode of the algorithm, we
refer the reader to Appendix B.

4.2.1 Implementation Details

The entire algorithm is implemented in C++ as the language’s standard library offers
efficient data structures that are well suited to search-based algorithms. The traversability
map, stored in memory using the GridMap data structure [16], is received by reference
from the mapping portion of the software as an input argument. We use a priority
queue from the C++ Standard Template Library to perform the repeated selection of
minimum cost nodes to expand. The returned path is also stored as a queue, which
later gets sent to the locomotion controller. A queue is particularly efficient because
it allows the controller to continuously pop the next waypoint off the front of the

data structure, even in situations where the planner has re-planned and overwritten
the contents of the queue.
In our implementation, we use two-dimensional, eight-way graph connectivity and
define the heuristic function as the Euclidean distance from point 𝑝 to the goal 𝑝𝑓 :

ℎ(𝑝) = ||𝑝 − 𝑝𝑓 || (4.3)

where || · || is the 2-norm. Edge weights are governed by the graph resolution. For
example, for a graph of resolution 0.1 m, the cardinal directions have cost 0.1 while
the ordinal directions have cost 0.1√2.
The algorithm above only gives the length of the shortest path, so to find the
actual sequence of CoM locations, we modify the algorithm so that each node keeps
track of its predecessor, or parent node. This way we can efficiently backtrack from
the goal node once it is found to return the full path.
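
To make the implementation concrete, the following is a minimal, self-contained C++ sketch of the grid-based A* search with eight-way connectivity, a Euclidean heuristic, and parent backtracking. The Cell type, the integer key encoding, and the traversability callback are simplifications introduced for this sketch; the actual implementation operates directly on GridMap indices and the stored traversability layer.

#include <algorithm>
#include <cmath>
#include <cstdint>
#include <functional>
#include <queue>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical grid cell index; the real implementation works on GridMap indices.
struct Cell { int x, y; };

// Pack a cell into a single map key.
inline int64_t key(const Cell& c) {
  return (static_cast<int64_t>(static_cast<uint32_t>(c.x)) << 32) |
         static_cast<uint32_t>(c.y);
}

// Minimal A* over an eight-connected grid. The traversable callback is assumed
// to handle map bounds and to reflect the traversability map.
std::vector<Cell> astar(const Cell& start, const Cell& goal,
                        const std::function<bool(const Cell&)>& traversable,
                        double resolution) {
  // Euclidean distance to the goal: an admissible heuristic on this grid.
  auto h = [&](const Cell& c) {
    return resolution * std::hypot(c.x - goal.x, c.y - goal.y);
  };

  using Entry = std::pair<double, Cell>;  // (f-cost, cell)
  auto cmp = [](const Entry& a, const Entry& b) { return a.first > b.first; };
  std::priority_queue<Entry, std::vector<Entry>, decltype(cmp)> open(cmp);

  std::unordered_map<int64_t, double> g;     // best known cost-to-come
  std::unordered_map<int64_t, Cell> parent;  // parent pointers for backtracking

  g[key(start)] = 0.0;
  open.push({h(start), start});

  const int dx[8] = {1, -1, 0, 0, 1, 1, -1, -1};
  const int dy[8] = {0, 0, 1, -1, 1, -1, 1, -1};

  while (!open.empty()) {
    const Cell cur = open.top().second;
    open.pop();

    if (cur.x == goal.x && cur.y == goal.y) {
      // Backtrack from the goal to the start using the stored parents.
      std::vector<Cell> path{cur};
      while (parent.count(key(path.back())))
        path.push_back(parent[key(path.back())]);
      std::reverse(path.begin(), path.end());
      return path;
    }

    for (int i = 0; i < 8; ++i) {
      const Cell nb{cur.x + dx[i], cur.y + dy[i]};
      if (!traversable(nb)) continue;
      // Cardinal moves cost resolution; diagonal moves cost resolution * sqrt(2).
      const double step = resolution * ((i < 4) ? 1.0 : std::sqrt(2.0));
      const double cand = g[key(cur)] + step;
      if (!g.count(key(nb)) || cand < g[key(nb)]) {
        g[key(nb)] = cand;
        parent[key(nb)] = cur;
        open.push({cand + h(nb), nb});
      }
    }
  }
  return {};  // no path exists
}

With a 0.1 m grid, the cardinal and diagonal step costs in this sketch reduce to the 0.1 and 0.1√2 edge weights described above.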

4.3 Path Tracking

Once the planner finishes computing a path, it is passed along to the lower-level
controller to be converted into a meaningful command and executed. To follow the
path, the controller computes desired linear and angular velocities in a PD fashion
based on where the next waypoint is located. We constrain the commands to only
forward linear velocities and angular yaw rates in order to keep the robot in line with
the path and facing the next waypoint.
To calculate the desired yaw angular velocity of the robot, the controller pops
the first two waypoints (𝑥1 , 𝑦1 ) and (𝑥2 , 𝑦2 ) off of the queue and computes the angle
between them as follows:
𝜓𝑑𝑒𝑠 = arctan((𝑦2 − 𝑦1)/(𝑥2 − 𝑥1)). (4.4)
We found that looking ahead to the second waypoint instead of just the first results
in smoother locomotion with our chosen resolution of 0.1 m. If two waypoints are
not available (in the case when the next point is the final goal) then we compute the

angle between the robot’s current CoM and the final point. The yaw error is then

𝑒𝜓 = 𝜓𝑐𝑢𝑟 − 𝜓𝑑𝑒𝑠 (4.5)

and the final commanded velocity is

𝜓˙ 𝑐𝑜𝑚𝑚𝑎𝑛𝑑 = −8.47𝑒𝜓 (4.6)

where the value −8.47 is a hand-tuned gain. Computing the linear velocity is slightly
more involved. This is done by first computing the errors between the current 𝑥, 𝑦
values and the desired values:

𝑒𝑥 = 𝑥𝑐𝑢𝑟 − 𝑥𝑑𝑒𝑠 (4.7)

𝑒𝑦 = 𝑦𝑐𝑢𝑟 − 𝑦𝑑𝑒𝑠 . (4.8)

However, since the controller takes in commands based on the robot’s frame of ref-
erence, the computed errors must first be transformed from the world frame to the
robot body frame using a simple 2D rotation computation:

𝑒ℬ𝑥 = cos 𝜓𝑐𝑢𝑟 · 𝑒𝑥 + sin 𝜓𝑐𝑢𝑟 · 𝑒𝑦 (4.9)

𝑒ℬ𝑦 = − sin 𝜓𝑐𝑢𝑟 · 𝑒𝑥 + cos 𝜓𝑐𝑢𝑟 · 𝑒𝑦 . (4.10)

In order to control the forward velocity, we also incorporate an error term of

𝑒𝑥˙ = 𝑥˙ 𝑐𝑢𝑟 − 𝑥˙ 𝑑𝑒𝑠 (4.11)

where 𝑥˙ 𝑑𝑒𝑠 is the desired forward velocity. Including this term modifies the final
commanded 𝑥˙ so that the robot tries to maintain a constant movement speed.

The final velocity commands are thus

𝑥˙ 𝑐𝑜𝑚𝑚𝑎𝑛𝑑 = −2.6𝑒ℬ𝑥 − 0.35𝑒𝑥˙ (4.12)

𝑦˙ 𝑐𝑜𝑚𝑚𝑎𝑛𝑑 = −0.58𝑒ℬ𝑦 (4.13)

where the various gains are once again hand-tuned via experimentation.
Since exactly reaching an infinitesimally-sized point is essentially impossible, es-
pecially when using floating-point representations, our implementation allows for a
circular tolerance region of radius 5 cm when determining whether or not the robot CoM
has reached the next waypoint. Autonomous path tracking stops when the queue of
waypoints is empty, meaning the goal has been reached.
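
For reference, the tracking law in Equations (4.4)–(4.13) can be summarized in a few lines of C++. This is only a sketch: the struct names are hypothetical, std::atan2 is used in place of the plain arctangent for quadrant handling, the desired position in Equations (4.7)–(4.8) is taken to be the next waypoint, and angle wrap-around handling is omitted for brevity.

#include <cmath>

// Hypothetical state and command structs; the real controller uses the robot's
// internal state estimate and command interfaces.
struct RobotState  { double x, y, yaw, xdot; };
struct Waypoint    { double x, y; };
struct VelocityCmd { double vx, vy, yaw_rate; };

// PD-style path tracking toward the next waypoint, following Eqs. (4.4)-(4.13).
// The gains and desired forward speed are the hand-tuned values quoted in the text.
VelocityCmd trackWaypoint(const RobotState& s, const Waypoint& wp1,
                          const Waypoint& wp2, double xdot_des) {
  // Desired heading from the segment between the next two waypoints, Eq. (4.4).
  const double psi_des = std::atan2(wp2.y - wp1.y, wp2.x - wp1.x);
  const double e_psi = s.yaw - psi_des;                     // Eq. (4.5)

  // Position errors in the world frame, Eqs. (4.7)-(4.8).
  const double ex = s.x - wp1.x;
  const double ey = s.y - wp1.y;

  // Rotate the errors into the robot body frame, Eqs. (4.9)-(4.10).
  const double ebx =  std::cos(s.yaw) * ex + std::sin(s.yaw) * ey;
  const double eby = -std::sin(s.yaw) * ex + std::cos(s.yaw) * ey;

  // Forward-velocity error, Eq. (4.11).
  const double exdot = s.xdot - xdot_des;

  VelocityCmd cmd;
  cmd.yaw_rate = -8.47 * e_psi;                             // Eq. (4.6)
  cmd.vx       = -2.6  * ebx - 0.35 * exdot;                // Eq. (4.12)
  cmd.vy       = -0.58 * eby;                               // Eq. (4.13)
  return cmd;
}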

4.4 Continuous Replanning

In the very beginning, the robot does not have any information about the environment
and so the initial planned path is simply a direct line to the goal. This path may not
be fully executable or traversable. As the robot follows this path, however, it collects
information about the environment and updates the map in real-time. As discussed
in Chapter 3, the perception pipeline updates the map at a frequency of 90 Hz, while
the A* planner runs at a variable rate. The planning rate depends on how far away
the goal is from the robot, with further distances resulting in longer planning times
as the planner has to pop and explore more nodes during the graph search. In order
to ensure the continuous replanning is fast enough to react to dynamic environments,
we time-bound the A* search to match the frequency of the mapping software in the
worst case. If the time limit is exhausted, the planner returns a path to the explored
node with the shortest Euclidean distance from the goal. This is relatively easy to
calculate since our implementation stores a pointer for each node to its least-cost
parent. In most cases, this heuristic works well, although we acknowledge it could
cause the robot to get stuck in complicated environments containing dead-ends if the
goal is set too far away.
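
A sketch of how such a time bound can be enforced is shown below. The Deadline helper is self-contained; the loop shown in the trailing comment refers back to the hypothetical A* sketch from Section 4.2.1, where backtrackFrom and closest_to_goal are placeholder names for the parent-pointer backtracking and for the expanded node with the smallest heuristic value.

#include <chrono>

// Small helper capturing the planner's wall-clock budget, which we match to the
// 90 Hz mapping rate in the worst case.
class Deadline {
 public:
  explicit Deadline(std::chrono::steady_clock::duration budget)
      : end_(std::chrono::steady_clock::now() + budget) {}

  bool expired() const { return std::chrono::steady_clock::now() >= end_; }

 private:
  std::chrono::steady_clock::time_point end_;
};

// Usage inside the A* expansion loop (sketch):
//
//   Deadline deadline(std::chrono::milliseconds(11));  // roughly 1/90 s
//   while (!open.empty()) {
//     if (deadline.expired())
//       return backtrackFrom(closest_to_goal);  // best-effort partial path
//     // ... pop the lowest-f node, expand its neighbors, and keep updating
//     //     closest_to_goal with the expanded node of smallest h(.) ...
//   }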

From an implementation standpoint, the C++ queue allows for easy and efficient
replanning. Since the path tracking pops a single node at a time off the front of the
queue, whenever a path has been replanned, the controller simply changes the pointer
to point to the returned queue containing the newly-calculated path. Then, when the
controller loop executes, it once again pops the first node off, which now corresponds
to a node along the new path.
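
A minimal sketch of this pointer-swap pattern is given below, with hypothetical type names and the waypoint queue represented by std::queue as in our implementation; synchronization between the planning and control loops is left out of the sketch.

#include <memory>
#include <queue>

// The controller only ever pops from whatever queue current_path points to, so
// installing a freshly planned path amounts to a single pointer assignment.
struct Waypoint { double x, y; };
using WaypointQueue = std::queue<Waypoint>;

struct PathFollower {
  std::shared_ptr<WaypointQueue> current_path = std::make_shared<WaypointQueue>();

  // Called whenever the planner returns a new path; the old queue is discarded.
  void onReplan(std::shared_ptr<WaypointQueue> new_path) {
    current_path = std::move(new_path);
  }

  // Called each control iteration; returns false once the queue is empty,
  // i.e. the goal has been reached.
  bool popNextWaypoint(Waypoint* out) {
    if (current_path->empty()) return false;
    *out = current_path->front();
    current_path->pop();
    return true;
  }
};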

4.5 Limitations
The success of this graph search-based approach should not be underestimated. How-
ever, we acknowledge that it is subject to certain limitations. Although searching on a
fixed grid removes the computational burden of having to maintain the graph struc-
ture, the resolution of the grid must be chosen by the operator. This makes our
approach subject to resolution completeness, meaning that the resulting path is only
optimal at the resolution of the grid being used; a better path may exist in the
continuous real world. There is also a trade-off between finer grid resolutions and
the computational resources required to search over them.

Chapter 5

Practical Tests and Results

In this chapter, we present our results from experiments in both simulation and on
hardware in a variety of real-world terrains on the Mini-Cheetah Vision platform. In
all of our experiments, we use the Vision RPC controller introduced in [31] as the
underlying locomotion controller, which is a modified version of the standard RPC
controller [7] that adjusts the final footstep location based on visual feedback.

5.1 Evaluation in Simulation

We tested the full mapping and planning software implementation in numerous sim-
ulated environments with varying difficulties of terrain including flat ground, uneven
ground, narrow hallways, mazes, and stairs. In all cases, the robot was able to reach
the desired goal at velocities well surpassing 1 m/s without colliding into any walls
or obstacles. Furthermore, the robot was able to successfully replan its path when
the user changed the location of the waypoint during locomotion. In the following
subsections, we examine a few of the simulation experiments in more detail.

5.1.1 Software Demo 1

The first experiment that we performed in simulation involved having the robot track
a waypoint in a flat environment through a single hallway. The terrain was rather

simple with few obstacles and only one left turn, but it served as a good starting
point for debugging and ensuring that the system worked as intended before testing
in more complex environments.

Figure 5-1: Software Demo 1: Hallway. (a) The robot autonomously navigates
through a simulated hallway to the desired waypoint (red sphere). The planned
path is visualized as a blue line. (b) The local traversability map is overlaid in
simulation. Blue patches represent non-traversable locations by the CoM while green
is traversable.

In Figure 5-1, the red sphere represents the goal waypoint while the blue line visu-
alizes the robot’s current planned path. We note that this visualization continuously
updates as the robot moves and replans its path. As shown in the pictures, the robot
is able to successfully navigate through the hallway to the end without colliding with

the walls. This verifies that the Gaussian blurring done during traversability estima-
tion is properly expanding the obstacle space 𝐶𝑜𝑏𝑠 into 𝐶𝑏𝑢𝑓 since the A* path is not
hugging the wall. Initial tests without the blurring resulted in the robot walking too
close to the wall and colliding with it. This is further confirmed by the bottom im-
age in Figure 5-1 which depicts the local traversability map overlay, where the green
region represents traversable terrain for the CoM while the blue region represents
non-traversable terrain. This experiment was run numerous times with the robot
reaching the waypoint successfully in every case at speeds of over 1 m/s.

5.1.2 Software Demo 2

For the second experiment, we introduced a more complex, though still flat, maze-like
environment with the purpose of specifically testing the robot’s ability to replan as
the waypoint location in the maze changes. Since there are multiple potential ways to
get from the start to the goal, but generally only one shortest and most optimal path,
we were able to evaluate the effectiveness of our A* algorithm in finding the optimal
path. Specifically, the user moved the waypoint around using the input controller’s
joystick and the path updated in real-time, as seen in Figure 5-2. This successfully
worked both when the robot was stationary before moving, and during locomotion.

Figure 5-2: Software Demo 2: Maze. The robot continuously replans for the opti-
mal, shortest path as the user moves the waypoint (red sphere) to different locations.

5.1.3 Software Demo 3

In the third experiment in simulation, we tested the robot’s ability to autonomously


navigate challenging, non-flat terrain. For this demo, the terrain consisted of flat
ground with a large staircase located at the center with the goal waypoint placed at
the very top. Each individual stair measured 25 cm deep and 15 cm tall. Using our
proposed software implementation, the vision code successfully recognized the stairs
as an overall traversable obstacle, with the map output stating that the flat steps are
valid locations while the steep vertical transitions of the stairs are not valid. This
accurate traversability estimation was successfully relayed to both the CoM planner
and the RPC footstep adjustment, which was verified by the robot never attempting
to step directly on any vertical stair transition area during multiple trials of the
experiment. In our numerous trials, the robot was able to successfully climb to the
top of the stairs a majority of the time. Figure 5-3 shows one such trial run in action.
In this image, we note that unlike earlier, the blue regions now represent invalid
foothold locations in the traversability map while green represents valid footholds, as
opposed to non-traversable and traversable regions by the CoM.

Figure 5-3: Software Demo 3: Stairs. The robot successfully recognizes the stairs
as a traversable obstacle and autonomously climbs to the top without directly stepping
on any of the vertical sections. The overlaid local traversability map can be seen with
blue signifying invalid foothold locations.

We noticed that in some cases the robot’s leg would collide with the stair as it
attempted to step up onto the next one, likely due to noise and jitter in the state
estimation or pointcloud data. However, because of the robustness of the underlying
RPC controller, the robot was able to recover each time and still make it to the top.

5.2 Evaluation on Hardware

After verifying the functionality of the code in simulation, we ran various experiments
on the actual hardware platform in real-world environments. In particular, we chose
both indoor and outdoor environments with different assortments of obstacles for the
robot to avoid including trees, rocks, gravel, shrubbery, slippery surfaces, buckets, and
trash cans. During each of the tests, we set a waypoint some distance away from the
robot and then let it move on its own. We emphasize that the selection of the waypoint
was the only required human interaction and that the rest of the experiments were
completed in an entirely autonomous manner with the robot achieving an average
speed of just above 1 m/s. In the next few subsections, we examine a couple of our
experiments in more detail.

5.2.1 Hardware Demo 1

Figure 5-4: Hardware Demo 1: Treadmill Platform. The robot successfully


recognizes and avoids two different-sized obstacles in its path as it walks from one
side of the platform to the other end.

In the first experiment on the actual Mini-Cheetah Vision hardware, we placed the
robot on one end of a treadmill platform and set the goal waypoint to be on the
other end. Several obstacles of varying size and shape were placed between the robot
and the goal such that it would need to navigate around them as it moved. For this
experiment, the robot was tethered to the computer but only for the purposes of
visualizing what the robot was seeing during the trial as well as where the waypoint
was set. The visualization in the upper-right corner of each subfigure in Figure 5-
4 illustrates the traversability map that the robot has computed at the exact time
the photo was taken. Green grid cells represent traversable terrain, while blue cells

represent non-traversable obstacles. The small red sphere is the waypoint and the
thin blue line is the current planned path. We see that as the robot moves down the
platform and the second stack of foam obstacles enters the camera’s field of view, the
traversability map immediately updates with the second blue patch and a new path
is planned around it. Our software implementation was able to reliably reach the end
goal without any collisions regardless of the placement of the obstacles.

5.2.2 Hardware Demo 2

For the second hardware experiment, we built upon the first one and placed the
robot at the start of a longer hallway in the MIT tunnels with the waypoint set at
the opposite end of it. We then cluttered the entire hallway with objects including
buckets, trashcans, and other random debris. Just as in the first demo, the robot
successfully navigated around all the obstacles in real-time as it moved down the
hallway at an average speed of over 1 m/s. Figure 5-5 shows a sequential timelapse
of images as the robot moved.

Figure 5-5: Hardware Demo 2: MIT Hallway. A timelapse of the Mini-Cheetah


Vision autonomously navigating a cluttered hallway.

5.2.3 Hardware Demo 3

Figure 5-6: Hardware Demo 3: Outdoor Environments. The Mini-Cheetah


Vision during a handful of experiments in various real-world outdoor environments.

Since the previous two experiments were done in indoor environments, we decided to
test the implementation in real-world outdoor terrains as well, setting waypoints all
around various locations on MIT’s campus. Despite the more challenging conditions
due to uneven terrain, varying ground firmness, and density of the obstacles, the
robot was able to successfully reach the desired goal location with a high rate of
success. Figure 5-6 shows the Mini-Cheetah Vision traversing a variety of outdoor
environments ranging from soft grass to gravel as well as concrete stairs. Although it is
hard to get a good quantitative measure of how well our proposed system performed,
we believe that the fact that the robot was able to reach the goal point without
collisions or falling over a majority of the time in such challenging terrain signifies its
overall robustness. Despite that, we do touch upon one common failure case, which

occurred when the robot tried to walk through light shrubbery even though it should
not have. We attribute this failure to the sparse foliage of certain types of vegetation,
which is not picked up well by the depth sensors and therefore leads the traversability
estimation to label the area as traversable even though it is not. We address a few of
these types of shortcomings in the Future Work section of Chapter 6.

5.2.4 Hardware Demo 4

In addition to the general real-world tests, we also specifically evaluated the robot’s
ability to continuously replan trajectories in the presence of disturbances applied at
random times. We found that when a human pushed the robot while moving, it
was consistently able to maintain balance and replan its path to reach the desired
waypoint despite being thrown off of the original trajectory. Figure 5-7 displays the
logged results of the 𝑥 and 𝑦 coordinates of the robot’s center of mass during one
such test, where it is clear that the robot recovered after both of the shoves.

Figure 5-7: Hardware Demo 4: Perturbation-Robustness Testing. A graph


showing the 𝑥 and 𝑦 coordinates of the robot’s center of mass as it follows a planned
trajectory. We see that the robot successfully reaches the desired waypoint even in
the presence of two human-applied pushes at randomly-selected times.

Overall, these experiments verified that our traversability estimation and path
planning frameworks are robust against the noise of the onboard sensing and local-
ization system while guiding the Mini-Cheetah Vision through dynamic environments.

Chapter 6

Conclusion

The work completed in this thesis presents an important step towards building a
complete, generalized framework for robust autonomous navigation of small-scale
quadruped robots in unknown terrain. Through the use of an intelligent mapping and
planning framework built on top of state-of-the-art locomotion controllers, the robot is
able to rapidly move around its environment while avoiding obstacles and remaining
completely untethered, bringing the research community that much closer to full
deployment of legged robots in real-world environments. In our presented approach,
the terrain data is captured via the Intel RealSense sensor suite onboard the MIT
Mini-Cheetah Vision and used to estimate the terrain’s traversability based on certain
key features. This information is then passed on to the A* path planner so that a safe
and efficient route to the desired goal point can be calculated. The robot follows the
planned path, constantly replanning as the robot perceives changes in its environment.
Our results from the wide range of terrains tested in both simulation and on hardware
strongly indicate that the proposed framework works reliably and is robust enough to
handle a variety of challenging real-world environments. Furthermore, the approaches
presented are immediately applicable to other existing quadruped robots.

6.1 Future Work
As with all work in robotics, the overarching goal is to develop intelligent machines
that can autonomously help and assist humans in their day-to-day lives. While this
proposed framework’s results mark a significant step towards achieving this goal,
there is still plenty of follow-up work to be researched, implemented, and tested.

6.1.1 Perception Improvements

For the time being, the perception pipeline assumes that all objects detected by the
RealSense depth camera are rigid obstacles that must be avoided. However, this
assumption is clearly not valid when the robot encounters soft, pliable objects like
tall grass, leaves, or flowers. Using our current framework as-is, it is possible that the
robot would not find a path in an area with dense vegetation even though it should
normally be able to walk through it. Future work can extend the vision pipeline
by utilizing state-of-the-art computer vision algorithms such as [56] to estimate an
obstacle’s rigidity which would improve the terrain traversability calculation so that
the robot can make more informed and realistic decisions when planning paths.

6.1.2 Motion Controller and Gait Selection Improvements

Currently, the NVIDIA TX2 onboard the Mini-Cheetah Vision is able to solve the
RPC optimization at a rate of approximately 50 Hz. While our experimental results
showed that 50 Hz is within the requirements to achieve steady-state locomotion,
work in [7] found that the controller’s overall performance diminishes with lower
solve frequencies. At higher frequencies the controller has greater robustness against
disturbances and is able to recover from tricky footstep positions if it slips. Because
the physical space for compute onboard the Mini-Cheetah is limited due to its compact
size, it is not feasible to add a better processor without increasing the footprint of
the robot. Offloading the compute and employing a tethered solution is not ideal
either. To this end, future work could develop a supervised machine-learning model
similar to [10] to replace the difficult and highly non-convex optimization problem

with a fast neural network that closely approximates solutions. The ground-truth
labels for the supervised learning problem can be collected from the actual RPC
controller in simulation and used to train the policy network. This kind of approach
would increase robustness of the Mini-Cheetah Vision as well as allow other, possibly
smaller, robotic platforms with less computational power to be able to achieve similar
results, improving the framework’s accessibility and overall practicability.
For further improvement of navigation across difficult terrain, the path planner
could be extended to select different gaits for specific path segments, as done in
[9]. For example, depending on the estimated traversability of the environment, the
gait could be switched between planning each footstep individually, applying a static
walking gait, using a more dynamic trotting gait, or even jumping. While we believe
the RPC locomotion controller can be used almost universally, we acknowledge that
there may be cases where specialized gaits are beneficial, such as crossing over large
gaps or navigating through narrow hallways.

Appendix A

Hardware Specifications

A.1 UP board

Table A.1: Specifications for the UP board

Component Description
Processor Intel Atom x5-Z8350 64-bit CPU SoC
Graphics Intel HD 400 Graphics
I/O 1x HDMI 1.4b
1x I2S audio port
Camera MIPI-CSI (4 megapixels)
USB 4x USB 2.0
2x USB 2.0 pin header
1x USB 3.0 OTG
Expansion 40-pin General Purpose bus
RTC Yes
Power 5V DC-in @ 4A 5.5/2.1mm jack
Dimensions 3.37" x 2.22" (85.6mm x 56.5mm)
Memory 1GB/2GB/4GB DDR3L-1600
Storage 16GB/32GB/64GB eMMC
Display Interface DSI/eDP
Ethernet 1x GB Ethernet (full speed) RJ-45
OS Support Windows 10
Linux
Operating Temperature 32°F - 140°F (0°C - 60°C)
Operating Humidity 10% - 80% relative humidity, non-condensing
Certificate CE/FCC Class A, RoHS Compliant

A.2 NVIDIA Jetson TX2

Table A.2: Specifications for the NVIDIA Jetson TX2

Component Description
AI Performance 1.33 TFLOPs
CPU Dual-Core NVIDIA Denver 1.5 64-bit CPU
Quad-Core ARM Cortex-A57 MPCore CPU
GPU 256-core NVIDIA Pascal GPU
Memory 8GB 128-bit LPDDR4 @ 59.7 GB/s
Storage 32GB eMMC 5.1
Power 7.5W/15W
PCIE 1 x1 + 1 x4 or 1 x1 + 1 x1 + 1 x2 (PCIe Gen2)
CSI Camera 12 lanes MIPI CSI-1
D-PHY 1.2 (up to 30 Gbps)
C-PHY 1.1 (up to 41 Gbps)
Video Encode 1x 4Kp60, 3x 4Kp30, 4x 1080p60, 8x 1080p30 (H.265)
1x 4Kp60, 3x 4Kp30, 7x 1080p60, 14x 1080p30 (H.264)
Video Decode 2x 4Kp60, 4x 4Kp30, 7x 1080p60, 14x 1080p30
Display 2x multi-mode DP 1.2/eDP 1.4/HDMI 2.0
2x4 DSI (1.5Gbps/lane)
Networking 10/100/1000 BASE-T Ethernet
WLAN
Mechanical 87mm x 50mm
400-pin connector

Appendix B

Algorithm Pseudocode

B.1 A*
Algorithm 1: A* Graph Search
input: start, goal;
init: OPEN ← min-priority queue ordered by 𝑓, cameFrom ← {}, 𝑔(·) ← ∞, 𝑓 (·) ← ∞;
init: OPEN.add(start), 𝑔(start) = 0, 𝑓 (start) = ℎ(start);
while OPEN is not empty do
    cur ← OPEN.pop();  (the node in OPEN with lowest 𝑓)
    if cur == goal then
        return 𝑟𝑒𝑐𝑜𝑛𝑠𝑡𝑟𝑢𝑐𝑡_𝑝𝑎𝑡ℎ(cur, cameFrom);
    for each neighbor of cur do
        score ← 𝑔(cur) + 𝑑(cur, neighbor);
        if score < 𝑔(neighbor) then
            cameFrom(neighbor) ← cur;
            𝑔(neighbor) ← score;
            𝑓 (neighbor) ← 𝑔(neighbor) + ℎ(neighbor);
            if neighbor not in OPEN then
                OPEN.add(neighbor);
return failure

Bibliography

[1] MIT Biomimetics Lab Cheetah-Software. https://github.com/mit-biomimetics/Cheetah-Software.

[2] OpenGL - the industry standard for high performance graphics. www.opengl.org.

[3] Qt - cross-platform software development for embedded and desktop. www.qt.io.

[4] Spot Mini. https://www.bostondynamics.com/spot.

[5] Joel Andersson, Joris Gillis, Greg Horn, James Rawlings, and Moritz Diehl.
Casadi: a software framework for nonlinear optimization and optimal control.
Mathematical Programming Computation, 11, 07 2018.

[6] Dominik Belter, Przemysław Łabęcki, and Piotr Skrzypczyński. Adaptive motion
planning for autonomous rough terrain traversal with a walking robot. Journal
of Field Robotics, 33, 06 2015.

[7] Gerardo Bledt. Regularized Predictive Control Framework for Robust Dynamic
Legged Locomotion. PhD thesis, Massachusetts Institute of Technology (MIT),
2020.

[8] Gerardo Bledt, Matthew J. Powell, Benjamin Katz, Jared Di Carlo, Patrick M.
Wensing, and Sangbae Kim. MIT cheetah 3: Design and control of a robust,
dynamic quadruped robot. In Proceedings of the IEEE/RSJ International Con-
ference on Intelligent Robots and Systems, Madrid, Spain, Oct. 2018.

[9] Martim Brandão, Omer Burak Aladag, and Ioannis Havoutis. Gaitmesh:
controller-aware navigation meshes for long-range legged locomotion planning
in multi-layered environments. IEEE Robotics and Automation Letters, PP:1–1,
03 2020.

[10] Jan Carius, Farbod Farshidian, and Marco Hutter. Mpc-net: A first principles
guided policy search. IEEE Robotics and Automation Letters, PP:1–1, 02 2020.

[11] Joel Chestnutt, Koichi Nishiwaki, James Kuffner, and Satoshi Kagami. An adap-
tive action model for legged navigation planning. pages 196 – 202, 01 2008.

[12] J. Di Carlo, P. M. Wensing, B. Katz, G. Bledt, and S. Kim. Dynamic locomotion
in the mit cheetah 3 through convex model-predictive control. In 2018 IEEE/RSJ
International Conference on Intelligent Robots and Systems (IROS), pages 1–9,
2018.

[13] M. F. Fallón, M. Antone, N. Roy, and S. Teller. Drift-free humanoid state estima-
tion fusing kinematic, inertial and lidar sensing. In 2014 IEEE-RAS International
Conference on Humanoid Robots, pages 112–119, 2014.

[14] Péter Fankhauser, Michael Bloesch, and Marco Hutter. Probabilistic terrain
mapping for mobile robots with uncertain localization. IEEE Robotics and Au-
tomation Letters (RA-L), 3(4):3019–3026, 2018.

[15] Péter Fankhauser, Marko Bjelonic, Dario Bellicoso, Takahiro Miki, and Marco
Hutter. Robust rough-terrain locomotion with a quadrupedal robot. 05 2018.

[16] Péter Fankhauser and Marco Hutter. A Universal Grid Map Library: Implemen-
tation and Use Case for Rough Terrain Navigation, volume 625. 01 2016.

[17] N. Fazeli, M. Oller, J. Wu, Z. Wu, J. Tenenbaum, and A. Rodriguez. See, feel, act:
Hierarchical learning for complex manipulation skills with multisensory fusion.
Science Robotics, 4:eaav3123, 01 2019.

[18] Roy Featherstone. A beginner’s guide to 6-d vectors (part 1). IEEE Robotics &
Automation Magazine, 17:83–94, 2010.

[19] Roy Featherstone and David E. Orin. Dynamics. In Springer Handbook of


Robotics, pages 35–65. Springer Berlin Heidelberg, 2008.

[20] Mingyang Geng, Xing Zhou, Bo Ding, Huaimin Wang, and Lei Zhang. Learning
to cooperate in decentralized multi-robot exploration of dynamic environments.
In Long Cheng, Andrew Chi Sing Leung, and Seiichi Ozawa, editors, Neural
Information Processing, pages 40–51, Cham, 2018. Springer International Pub-
lishing.

[21] Gaël Guennebaud, Benoît Jacob, et al. Eigen v3. http://eigen.tuxfamily.org, 2010.

[22] P. E. Hart, N. J. Nilsson, and B. Raphael. A formal basis for the heuristic
determination of minimum cost paths. IEEE Transactions on Systems Science
and Cybernetics, 4(2):100–107, 1968.

[23] K. Hauser, T. Bretl, and J.-C Latombe. Non-gaited humanoid locomotion plan-
ning. pages 7 – 12, 01 2006.

[24] Armin Hornung and Maren Bennewitz. Adaptive level-of-detail planning for
efficient humanoid navigation. 05 2012.

[25] A. S. Huang, E. Olson, and D. C. Moore. Lcm: Lightweight communications and
marshalling. In 2010 IEEE/RSJ International Conference on Intelligent Robots
and Systems, pages 4057–4062, 2010.

[26] Marco Hutter, Christian Gehring, A. Lauber, F. Gunther, Dario Bellicoso, Vas-
silios Tsounis, Péter Fankhauser, R. Diethelm, S. Bachmann, Michael Bloesch,
Hendrik Kolvenbach, Marko Bjelonic, L. Isler, and K. Meyer. Anymal - toward
legged robots for harsh environments. Advanced Robotics, 31:1–14, 10 2017.

[27] Joel Janai, Fatma Güney, Aseem Behl, and Andreas Geiger. Computer vi-
sion for autonomous vehicles: Problems, datasets and state-of-the-art. ArXiv,
abs/1704.05519, 2017.

[28] Mrinal Kalakrishnan, Jonas Buchli, Peter Pastor, Michael Mistry, and Stefan
Schaal. Learning, planning, and control for quadruped locomotion over challeng-
ing terrain. I. J. Robotic Res., 30:236–258, 02 2011.

[29] B. Katz, J. D. Carlo, and S. Kim. Mini cheetah: A platform for pushing the limits
of dynamic quadruped control. In 2019 International Conference on Robotics and
Automation (ICRA), pages 6295–6301, May 2019.

[30] Benjamin Katz. A low cost modular actuator for dynamic robots. 2018.

[31] D. Kim, D. Carballo, J. Di Carlo, B. Katz, G. Bledt, B. Wei Tern Lim, and
Sangbae Kim. Vision aided dynamic exploration of unstructured terrain with a
small-scale quadruped robot. In 2020 IEEE International Conference on Robotics
and Automation (ICRA), Paris, France, June 2020.

[32] Donghyun Kim, Jared Di Carlo, Benjamin Katz, Gerardo Bledt, and Sangbae
Kim. Highly dynamic quadruped locomotion via whole-body impulse control
and model predictive control. ArXiv, abs/1909.06586, 2019.

[33] Oliver Kroemer, Scott Niekum, and George Konidaris. A review of robot
learning for manipulation: Challenges, representations, and algorithms. ArXiv,
abs/1907.03146, 2019.

[34] James Kuffner. Goal-directed navigation for animated characters using real-time
path planning and control. Lecture Notes in Computer Science, 1537, 01 1999.

[35] James Kuffner, Koichi Nishiwaki, Satoshi Kagami, Masayuki Inaba, and Hi-
rochika Inoue. Dynamically-stable motion planning for humanoid robots. Au-
tonomous Robots, 12, 09 2000.

[36] Honglak Lee, Yirong Shen, Chih-Han Yu, Gurjeet Singh, and A.Y. Ng.
Quadruped robot obstacle negotiation via reinforcement learning. volume 2006,
pages 3003 – 3010, 06 2006.

[37] Tsai-Yen Li, Pei-Feng Chen, and Pei-Zhi Huang. Motion planning for humanoid
walking in a layered environment. volume 3, pages 3421 – 3427 vol.3, 10 2003.

[38] T. Lozano-Perez. A simple motion-planning algorithm for general robot manip-
ulators. IEEE Journal on Robotics and Automation, 3(3):224–238, 1987.

[39] C. Mastalli, M. Focchi, I. Havoutis, A. Radulescu, S. Calinon, J. Buchli, D. G.


Caldwell, and C. Semini. Trajectory and foothold optimization using low-
dimensional models for rough terrain locomotion. In 2017 IEEE International
Conference on Robotics and Automation (ICRA), pages 1096–1103, 2017.

[40] Donald Meagher. Octree encoding: A new technique for the representation,
manipulation and display of arbitrary 3-d objects by computer. 10 1980.

[41] Hans P. Moravec. Robot spatial perception by stereoscopic vision and 3d evidence
grids. Technical report, Carnegie Mellon University, Robotics Institute, 1996.

[42] Peter Neuhaus, Jerry Pratt, and Matthew Johnson. Comprehensive summary of
the institute for human and machine cognition’s experience with littledog. I. J.
Robotic Res., 30:216–235, 01 2011.

[43] K. Okada, Toshiki Ogura, Atsushi Haneda, and Masayuki Inaba. Autonomous
3d walking system for a humanoid robot based on visual step recognition and 3d
foot step planner. volume 2005, pages 623 – 628, 05 2005.

[44] Kai Pervolz, Andreas Nuchter, Hartmut Surmann, and Joachim Hertzberg. Au-
tomatic reconstruction of colored 3d models. 05 2004.

[45] Julien Pettre, Jean-Paul Laumond, and Thierry Siméon. A 2-stages locomotion
planner for digital actors. pages 258–264, 08 2003.

[46] Patrick Pfaff, Rudolph Triebel, and Wolfram Burgard. An efficient extension to
elevation maps for outdoor terrain mapping and loop closing. The International
Journal of Robotics Research, 26(2):217–230, 2007.

[47] Mehdi Rahimi, Spencer Gibb, Yantao Shen, and Hung La. A Comparison of
Various Approaches to Reinforcement Learning Algorithms for Multi-robot Box
Pushing, pages 16–30. 01 2019.

[48] S. Schaal, A. J. Ijspeert, A. Billard, S. Vijayakumar, and J. Meyer. Planning the


Sequencing of Movement Primitives, pages 193–200. 2004.

[49] Claudio Semini, Nikos Tsagarakis, Emanuele Guglielmino, Michele Focchi, Fer-
dinando Cannella, and D. Caldwell. Design of hyq - a hydraulically and elec-
trically actuated quadruped robot. Proceedings of the Institution of Mechanical
Engineers. Part I: Journal of Systems and Control Engineering, 225:831–849, 08
2011.

[50] Anthony Stentz. Optimal and efficient path planning for partially-known en-
vironments. 1994 International Conference on Robotics and Automation, 4, 02
2000.

[51] Niko Sünderhauf, Oliver Brock, Walter J. Scheirer, Raia Hadsell, Dieter Fox,
Jürgen Leitner, Ben Upcroft, Pieter Abbeel, Wolfram Burgard, Michael Milford,
and Peter I. Corke. The limits and potentials of deep learning for robotics. The
International Journal of Robotics Research, 37:405 – 420, 2018.

[52] S. Thrun, C. Martin, Yufeng Liu, D. Hahnel, R. Emery-Montemerlo,


D. Chakrabarti, and W. Burgard. A real-time expectation-maximization algo-
rithm for acquiring multiplanar maps of indoor environments with mobile robots.
IEEE Transactions on Robotics and Automation, 20(3):433–443, 2004.

[53] Vassilios Tsounis, Mitja Alge, Joonho Lee, Farbod Farshidian, and Marco Hutter.
Deepgait: Planning and control of quadrupedal gaits using deep reinforcement
learning. IEEE Robotics and Automation Letters, 5(2):3699–3706, 2020.

[54] P Victerpaul, Saravanan Devaraj, Janakiraman Subbiah, and Pradeep Jayabala.


Path planning of autonomous mobile robots: A survey and comparison. Journal
of Advanced Research in Dynamical and Control Systems, 9, 01 2017.

[55] Octavio Villarreal Magaña, Victor Barasuol, Marco Camurri, Luca Franceschi,
Michele Focchi, Massimiliano Pontil, Darwin Caldwell, and Claudio Semini. Fast
and continuous foothold adaptation for dynamic locomotion through cnns. IEEE
Robotics and Automation Letters, PP:1–1, 02 2019.

[56] Lorenz Wellhausen, Alexey Dosovitskiy, René Ranftl, Krzysztof Walas, Ce-
sar Dario Cadena Lerma, and Marco Hutter. Where should I walk? Predicting
terrain properties from images via self-supervised learning. IEEE Robotics and
Automation Letters, 4(2):1509–1516, 2019.

[57] Alexander Winkler, Carlos Mastalli, Ioannis Havoutis, Michele Focchi, Darwin
Caldwell, and Claudio Semini. Planning and execution of dynamic whole-body
locomotion for a hydraulic quadruped on challenging terrain. volume 2015, pages
5148–5154, 06 2015.
