Several American companies (Boston Dynamics, Tesla, Figure AI) and dozens of Chinese startups are currently working on creating useful humanoid robots. I personally find this pretty exciting because I dream of one day having a robot to do the chores at home: doing the laundry, washing the dishes, cooking, keeping the countertops, floors and bathrooms clean. I enjoy cooking from time to time, but most household chores represent time I'd rather spend with loved ones, exercising, or working on personal projects. Our human lives are finite, and time is precious. It's not just about chores though. Many elderly people need extra help in the last decade or two of their lives. Having robotic assistants with infinite patience could mean billions of elderly people don't have to spend the precious time they have left in the confines of a hospital.
General-purpose robotics is a technology that can radically transform the world in ways that we can't even imagine yet. That's not really what this post is about though. I want to discuss something more basic. Every few weeks, I see someone on Reddit or X make a comment along the lines of:
"Why should they be humanoid robots?"
"The human body is clearly not the most efficient form factor."
"Wheeled robots would be cheaper."
"The robots would be more stable with 4 legs instead of 2."
I think that all of the above opinions are fairly bad takes. The people writing things like this are really missing the point. There are several reasons why it makes perfect sense, if we are trying to build general-purpose robotic assistants, to try to build humanoid robots. Let me explain why.
Let's start with the obvious: the cost. You could argue that a wheeled robot is cheaper than a robot with two legs. Okay, maybe, but by how much? Your robot is going to need batteries. It's also going to need significant onboard compute because Wi-Fi and cellular connections are too unreliable and have dead zones. It's also going to need cameras, and it's going to need arms. How much of the total cost are you saving by building your robot on a wheeled base instead of giving it two legs? Let's assume you can make your robot 20% cheaper. Great, you can pass those savings on to the customer, but how much functionality are you losing compared to a legged robot?
Wheeled robots probably can't get in and out of a car unassisted. This means your robot can't take a robotaxi to move to a different location. Wheeled robots may not be able to handle steep grades outside, such as those found on some San Francisco streets. They also can't go up or down stairs unassisted. Okay, fine, you say, but none of this matters if your robot only operates inside an office building. That may not be true, however. Have you ever had to plug or unplug a cable under a desk? How do you think a 140 cm tall wheeled robot is going to perform at this task? What if you want your robot to clean the floor or clean behind a toilet? There are many tasks where a wheeled robot is going to struggle compared to a legged robot, even in an office building, and if the cost difference is only something like 20%, the loss of functionality makes no sense. The purpose of robot helpers is for them to help you. Ideally, you wouldn't need to go help your robot(s) because their wheels are tangled in wires, or because they can't navigate stairs or take a robotaxi to get the maintenance they need without your assistance.
Another silly notion is that the human form is not the optimal form for a general-purpose robot. Okay, I guess? Maybe? In what way? One argument I've seen is that four legs are more stable than two. My first question for you then is: how often do you trip and fall down? I think it may happen to me less than twice a year on average. Now, if you want to build a robot with four legs, it's going to increase the cost of your robot. How much is that increase in stability or speed worth? You're also probably going to increase the frequency at which your robot needs maintenance, simply because you've greatly increased the number of moving parts by doubling the number of legs.
The simplest reason to build humanoid robots is that our immediate environment has been designed for human beings. We've created tools, furniture and buildings that are suited for human beings. You may think that you can create a more efficient robot that can do everything that a human can and more, but this may not be as easy as you think, and it's going to come with tradeoffs such as increased cost and increased maintenance. Not to mention, the real problem that we need to solve, the hardest problem, is not how to build the physical robot platform. It's how to create the general-purpose AI that will make this robot actually useful.
Creating the AI that can power a general-purpose robot is extremely hard. There are many unknown unknowns in this area, and so it makes sense, for the physical design of the robot, to start with a form factor we know will be able to operate effectively in an environment designed for people, and that is a humanoid robot. There's one more reason why humanoid robots make a ton of sense though, and I'm surprised I haven't seen more people talk about it: our current AI training algorithms are very data-hungry, and there's a lot of data available when it comes to humanoid behavior.
One of the most efficient ways we currently have to train robots to do things is imitation learning, meaning we train a robot to imitate the movements of a human doing something. Now, it should be pretty easy to understand why it's easier to have a robot imitate you doing a task if the robot's body and limbs are shaped like yours and the topology of its body is similar to yours. We can collect training data of humans doing tasks via motion capture, or possibly just by filming humans doing various tasks and having a deep learning model reconstruct their body pose. Do I need to explain why it might be easier to teach a humanoid robot to use a leaf blower than a six-legged robot with thin spider-like limbs that each have two fingers at the end? About 82 years of video gets uploaded to YouTube every day, so we have a ton of data on how humans move and interact with the world. It's much easier to leverage this data or collect new data using human operators if we build robots that can interact with the world roughly in the same way that humans do.
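If you're curious where a number like 82 years per day comes from, it matches the commonly quoted figure of roughly 500 hours of video uploaded to YouTube every minute. That upload rate is my assumption here, not an official statistic I'm vouching for, so treat this as back-of-the-envelope arithmetic:

```python
# Assumption: roughly 500 hours of video are uploaded to YouTube per minute.
# This is a commonly quoted figure, used here only for a rough estimate.
UPLOAD_HOURS_PER_MINUTE = 500

MINUTES_PER_DAY = 60 * 24
HOURS_PER_YEAR = 24 * 365


def years_uploaded_per_day(hours_per_minute=UPLOAD_HOURS_PER_MINUTE):
    """Convert an upload rate in hours/minute into years of video per day."""
    hours_per_day = hours_per_minute * MINUTES_PER_DAY
    return hours_per_day / HOURS_PER_YEAR


print(round(years_uploaded_per_day(), 1))  # 82.2
```

Only a fraction of that footage shows people doing physical tasks, of course, but even a small fraction of it is a lot of data.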
There's one last aspect to this discussion, which is the social aspect. We're going to need to live in a world that has an increasing number of robots. General-purpose robots have so much economic value that we might see a boom in the next 10-20 years. They're going to start rapidly appearing everywhere and it's going to be a shock for many people. It makes sense to design robots so that they appear friendly, and not alien or scary, so that we can minimize this shock to some extent. If you try to design a superhuman robot with 4, 6 or 7 legs, or a robot that can turn its head 180 degrees, you start to go into a zone where you might end up with something that looks uncanny, scary or creepy, like an insect. Star Wars has shown us that non-humanoid robots like R2-D2 can appear friendly and charming, but as previously stated, a real-world R2-D2 would probably struggle with stairs. So, if we want to design robots to live and help in a human world, why not start with something we know has a very good chance of working well?