Q&A with Aaron Saunders, VP at Boston Dynamics, on teaching robots to dance and how that informs the company’s approach to robotics for commercial applications — A week ago, Boston Dynamics posted a video of Atlas, Spot, and Handle dancing to “Do You Love Me.”
Aaron Saunders, Boston Dynamics’ Vice President of Engineering, explains how the company taught its robots to dance and where Atlas learned its moves.
A week ago, Boston Dynamics posted a video featuring Atlas, Spot, and Handle dancing to “Do You Love Me.” It was intended to celebrate what Boston Dynamics hopes will be a happier year. Since its release, the video has been viewed nearly 24 million times, and its popularity is not surprising given its compelling combination of technical prowess and creativity.
Nothing in the video is groundbreaking, in the sense that none of the robots demonstrates a fundamentally new capability. That should not take away from how impressive it is: you’re seeing the state of the art in humanoid robotics, quadrupedal robotics, and whatever category of robotics Handle belongs to.
What makes this video unique is its artistic component, which comes largely from a collaboration between Boston Dynamics and choreographer Monica Thomas. We know that Atlas can do some practical tasks, and that it can do some gymnastics and some parkour, but dancing is undoubtedly something different. To find out what it takes to make these dancing robots a reality (it’s a lot more complicated than it seems), we spoke with Aaron Saunders, Boston Dynamics’ Vice President of Engineering.
Saunders has been with Boston Dynamics since 2003, which means that he has played a fundamental role in a number of the company’s robots, even ones that you may have forgotten about. Have you ever heard of LittleDog, for example? This adorable little quadruped was designed and built by a team of two, one of whom was Saunders.
While Saunders has been part of the Atlas project since the beginning (and has had a hand in pretty much everything else that Boston Dynamics works on), he has spent the past few years leading the Atlas team specifically, and he was gracious enough to answer some of our questions about the dancing robots.
Our process started with a choreographer and dancers composing and assembling a routine, which gave us an initial concept for the dance. I believe that one of the main challenges, for Atlas in particular, was adapting human dance moves so that they could be performed by a robot. To do that, we used simulation software to rapidly iterate through different movement concepts while soliciting feedback from the choreographer, so that we converged on behaviors that Atlas was strong and fast enough to execute. During the iterative process, the dancers would literally dance out what they wanted us to do. The engineers would look at the screen and think, “That would be easy” or “That would be difficult” or “That scares me.” After a discussion, we would try a few different things in simulation and adjust the moves until we found a compatible set that would work on Atlas.
Our goal during the project was to drive down the time it took to create new dance movements, and as we built tools, that time got shorter and shorter. Eventually, for example, we were able to use that toolchain to create one of Atlas’ ballet moves in just one day, the day before we filmed, and it worked as planned. So it is not hand-scripted or hand-coded at all; it is about having a pipeline that lets you take a diverse set of motions, which you can describe using a variety of different inputs, and have the robot carry them out.
Some of the spinning turns in the ballet parts took a few iterations to get working, because they were so far removed from leaping and running and the other things that are more familiar to us, which made them a challenge for both the software and the machine. Dancers are incredibly flexible and strong, and when you try to do the same thing they do with a robot, it turns out to be very difficult. We definitely learned not to underestimate them. It’s a very humbling experience. Fundamentally, Atlas does not have the same range of motion or the same power that these athletes do, but we are continuing to develop our robots in this direction, because we believe robots must reach a comparable level of performance to be widely deployed commercially, and eventually in homes.
A robot is very good at one thing: doing something over and over again in exactly the same way. Once we had dialed in exactly what we wanted to do, the robots could simply repeat it while we experimented with different camera angles and settings.
Over the course of the project, I thought the people we worked with showed a lot of talent for expressing themselves through motion, and for thinking about motion in general. The robots, for their part, are great at doing motion: they are dynamic, they are exciting, and they balance well. What we found is that we connected on how the robots moved. The dancers connected with the robots’ movement and then shaped that into a story, and it did not matter whether the robot had two legs or four. When you don’t necessarily have a template of animal motion or human behavior, you have to think a little harder about how to go about doing something, and that is true for the more pragmatic commercial behaviors as well.
Our view is that a wide range of robot applications can benefit from the agility, balance, and perception skills inherent in dance and parkour. Boston Dynamics has made a name for itself in robotics by finding the right balance between developing new robot capabilities and having fun at the same time, and that has been a great way for us to make progress.
Running these dynamic motions for several days in a row is one good example of how pushing limits can teach you a lot about the robustness of your hardware. Through its productization, Spot has become extraordinarily robust and requires almost no maintenance; it can just dance all day long once you teach it to. It is that robust today in part because of the lessons we learned from all those previous things that might have seemed absurd and fun at the time. To even get a sense of what you do not know, you have to go into uncharted territory.
I will try to answer the question in the context of this video, but I believe the answer applies to all of the videos that we post. We work hard to make something, and once it works, it works. For Atlas, most of the robot control already existed from previous work, such as what we had done with parkour, which is what sent us down the path of using model predictive controllers that take dynamics and balance into account. On the robot, we were able to execute a set of dance steps that the dancers and choreographer had mapped out offline in consultation with us. It took us a long time, months, to come up with the dance, compose the motions, and iterate in simulation.
We even upgraded some of Atlas’ hardware so that it could handle dancing, which requires a lot of strength and speed. You might think that parkour is way more explosive, but the amount of motion and speed that you have in dance is absolutely incredible. Building that capability into the machine, alongside the capability in the algorithms, also took months.
Once we had the final sequence that you see in the video, we filmed for only two days. We spent quite a lot of that time figuring out how to move the camera through a scene with many robots in it so as to capture one continuous two-minute shot. We ran and filmed the dance routine many times, and we were able to repeat it quite reliably. The opening two-minute shot has no cuts or splices at all.
There were some hardware failures that required maintenance, and there were times when our robots stumbled and fell down. There is no promise that these behaviors will be productized or that they will be 100 percent reliable, but they are definitely repeatable. We try to be honest about showing things that we are actually able to do, not just snippets of things that we have done over the years. There is a certain level of honesty required when you say that you have achieved something, and that is definitely important to us.
There are only a handful of Atlas machines in the world, they’re complicated, and reliability isn’t a main focus. Atlas, as a machine, is still, you know… it’s still a little bit of a niche product. The robot definitely breaks from time to time. Even so, given what we were trying to accomplish, the hardware proved robust, which was really great for us; without that robustness we would not have been able to do the video at all. Atlas is a little more like a helicopter, in that the ratio of time spent on maintenance to time spent operating is higher. With Spot, the expectation is that it will behave more like a car, which you can drive for a long time without having to touch it.
We have explored a lot of different things as a company, but Atlas does not use a learning controller at this time. I expect that a day will come when it does. For Atlas’ dance performance, we currently use a mixture of what we like to call reflexive control, which consists of reacting to forces, optimizing trajectories online and offline, and using model predictive control. We use these techniques because they are reliable ways of unlocking really high-performance behavior, and we understand how to get the most out of them. We haven’t reached the end of the road in terms of what we can do with them.
We plan to use learning to extend and build upon the foundation of software and hardware that we have developed, but I think that we, along with the rest of the community, are still trying to figure out where the best places are to apply these tools. I think you will see that become part of our natural progression.
As a company, we love to follow advances in sensing, computer vision, and terrain perception, because the more that technology advances, the more we will be able to do. Personally, I like to follow manipulation research, in particular research focused on improving our understanding of complex, friction-based interactions such as sliding and pushing, or on moving compliant materials like ropes.
We are seeing a move away from just pinching, lifting, moving, and dropping things toward much more meaningful interaction with the environment. This kind of research is going to unlock the potential of mobile manipulators and open up a whole range of possibilities for robots to interact with the world in a richer way.
For me personally, and this is probably because I spend so much of my time immersed in robotics and have a deep appreciation for what a robot is, what its capabilities are, and what its limitations are, one of my strongest desires is for more and more people to spend more time with robots. We receive a lot of opinions and ideas from people watching our videos on YouTube, and I think that if more people had the opportunity to think about and learn about robots, and to spend time with them, they would be able to imagine new ways in which robots could be useful in their daily lives. There are a lot of possibilities here that I find really exciting, and I just wish that more people could take part in that journey.