Let’s say you wanted to build the world’s best stair-climbing robot. You’d need to optimize for both the brain and the body, perhaps by giving the bot some high-tech legs and feet, coupled with a powerful algorithm to enable the climb.
Although the design of the physical body and its “brain,” the controller, are both key to how a robot moves, existing benchmark environments favor only the latter. Co-optimizing both elements is hard: it takes a lot of time to train various robot simulations to do different things, even without the design element.
Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) aimed to fill the gap by designing “Evolution Gym,” a large-scale testing system for co-optimizing the design and control of soft robots, taking inspiration from nature and evolutionary processes.
The robots in the simulator look a little bit like squishy, movable Tetris pieces made up of soft, rigid, and actuator “cells” on a grid, put to the tasks of walking, climbing, manipulating objects, shape-shifting, and navigating dense terrain. To test the robots’ aptitude, the team developed their own co-design algorithms by combining standard methods for design optimization with deep reinforcement learning (RL) techniques.
The co-design algorithm functions somewhat like a power couple: the design optimization methods evolve the robots’ bodies, while the RL algorithms optimize a controller (a computer system that connects to the robot to control its movements) for each proposed design. The design optimization asks “How well does this design perform?” and the control optimization responds with a score, such as a five on “walking.”
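The division of labor described above can be sketched as a simple evolutionary outer loop around a control-optimization inner loop. The sketch below is illustrative, not the team’s actual method: the cell-type encoding, the mutation scheme, and the `train_controller` stub (which fakes a fitness score instead of running real RL training) are all simplifying assumptions made here so the loop is runnable.

```python
import random

# Illustrative cell types (a simplification of the voxel grid):
# 0 = empty, 1 = rigid, 2 = soft, 3 = actuator.
CELL_TYPES = [0, 1, 2, 3]

def random_design(rows=3, cols=3):
    """Propose a robot body as a grid of cell types."""
    return [[random.choice(CELL_TYPES) for _ in range(cols)]
            for _ in range(rows)]

def mutate(design, rate=0.2):
    """Evolve a body by randomly resampling some of its cells."""
    return [[random.choice(CELL_TYPES) if random.random() < rate else cell
             for cell in row] for row in design]

def train_controller(design):
    """Stand-in for the RL inner loop. In the real system, an RL
    algorithm trains a controller for this body and reports how well
    the pair performs on a task; here we fake the score by counting
    actuator cells, just to keep the example self-contained."""
    return sum(cell == 3 for row in design for cell in row)

def co_design(generations=10, pop_size=8):
    """Outer loop: evolve designs, scoring each via its controller."""
    population = [random_design() for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scored = [(train_controller(d), d) for d in population]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best = scored[0]
        # Keep the top half, then mutate the survivors to refill.
        survivors = [d for _, d in scored[: pop_size // 2]]
        population = survivors + [mutate(d) for d in survivors]
    return best, best_score
```

The key design point the article describes survives even in this toy version: the outer loop only ever sees a scalar score per body, so any controller-training procedure can be swapped in behind `train_controller` without changing the evolutionary search.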
The result looks like a little robot Olympics. In addition to standard tasks like walking and jumping, the researchers also included some unique tasks, like climbing, flipping, balancing, and stair-climbing.
In over 30 different environments, the bots performed well on simple tasks, like walking or carrying an item, but in more difficult environments, like catching and lifting, they fell short, showing the limitations of current co-design algorithms. On many tasks, the optimized robots exhibited what the team calls “frustratingly” obvious nonoptimal behavior. For example, the “catcher” robot would often dive forward to catch a block that was falling behind it.