Where Are All the Robotaxis We Were Promised? Well…
It’s 2022, and the world is still missing the self-driving taxis we were told we’d get by now. Here’s what’s taking so long.
Robotaxi this, autonomous that—the promise of revolutionizing the ride-hailing industry with self-driving cars has been shouted from the rooftops for years. And yes, before you say it, one of the loudest voices has been Tesla CEO Elon Musk, but I want to be clear that this story isn't really about him. What we're talking about today is a much bigger question. Entering 2022, there are just a few autonomous ride service pilot programs roaming the streets of America—undeniable progress, but not exactly a paradigm shift just yet—and plenty of unfulfilled promises to go around (Okay, that was Musk, too). So where are all the robotaxis, dammit?
Like the task of designing self-driving hardware and software suites that can stand up to the myriad demands of taxi service, the answer is complicated. In one sense, they're already here. You can fly to Phoenix tomorrow and hail a self-driving Chrysler Pacifica through Waymo with no safety driver on board, and GM's Cruise subsidiary soft-launched its service in San Francisco last month. These are real advances that would've seemed outlandish a decade ago. But scaling an autonomous ride-hailing program is more than a programming challenge, which is hard enough on its own; it's a logistical one, a mechanical one, and an economic one as well. Robotaxis for the masses, it seems, will have to wait until a company can nail all that at once, which, in all likelihood, is still years away.
Still, it's a good time to take a snapshot of where we're at right now to assess where we need to go tomorrow. Let's take a deep dive into the state of the self-driving taxi industry today.
The Cars Have Eyes
The first task of any self-driving startup is to give a vehicle the gift of sight. It's important to start by recapping what that means right now in 2022.
Autonomous vehicle companies are quite good at this, actually, and achieve this once-impossible feat by outfitting vehicles with vast sensor suites to map the area surrounding a car into actionable three-dimensional planes. The sensors can use lasers, radar, or camera vision not just to identify objects around them, but also to plot where the objects reside in relation to the vehicle and the velocity at which the objects are traveling.
If you’ve ever seen a car sporting the name of an outfit like Waymo, Cruise, or Motional, you’ve undoubtedly noticed the large pack of sensors atop the vehicle’s roof. It looks kind of like a party hat, except one meant for a Dalek.
These sensor packs (along with others mounted elsewhere on the vehicle) are made up of different individual sensors of varying types that I'll detail below, all of which combine to give the vehicle the ability to see the world using different modalities. By stacking multiple sensor types on top of one another, companies help to build a vehicle capable of driving itself in different scenarios with limited visibility, like inclement weather or at night. This is almost like you or me using our different senses (touch, sight, and hearing) to get a better picture of our surroundings.
A modern sensor stack in an autonomous test vehicle might use lidar, radar, and multiple cameras, each of which excels in one area and could use help in another:
- Camera: Sensors capture two-dimensional representations of the world, which software can parse to recognize objects, similar to a human reading a road sign or seeing the color of a stoplight. Multiple feeds can be used in conjunction to improve vision and potentially "measure" objects.
- Lidar: Lasers are used to measure distance and speed. One lidar unit might use dozens of individual beams to gather this data, which is plotted in space and creates a 3D picture of its surroundings.
- Radar: Radio waves are emitted from a sensor to measure distance and velocity. Multiple radar units can be used together to gather information about objects closer to the sensor or farther away.
- Ultrasonic: High frequency, inaudible sound waves are used to detect objects in close proximity to the sensor, similar to how passenger cars use parking sensors.
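To make the distance measurement concrete: lidar, radar, and ultrasonic sensors commonly estimate range by timing how long a pulse takes to bounce back. The Python sketch below is illustrative only. The function name and the sample echo timings are hypothetical and are not drawn from any company's system.

```python
# Illustrative time-of-flight ranging; not any manufacturer's implementation.
SPEED_OF_LIGHT_M_S = 299_792_458.0  # lidar and radar pulses travel at light speed
SPEED_OF_SOUND_M_S = 343.0          # approximate speed of sound in air near 20 C


def range_from_echo(round_trip_s: float, wave_speed_m_s: float) -> float:
    """Return the one-way distance in meters for a pulse that went out and back."""
    if round_trip_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse covers the distance twice, so halve the total path length.
    return wave_speed_m_s * round_trip_s / 2.0


# Hypothetical echoes: a lidar return after 400 ns, an ultrasonic return after 12 ms.
lidar_range_m = range_from_echo(400e-9, SPEED_OF_LIGHT_M_S)
ultrasonic_range_m = range_from_echo(12e-3, SPEED_OF_SOUND_M_S)
print(f"lidar: {lidar_range_m:.1f} m, ultrasonic: {ultrasonic_range_m:.2f} m")
```

The same arithmetic hints at why ultrasonic sensors are reserved for close-range work: sound is slow enough that waiting on distant echoes would take far longer than waiting on light.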
Most companies pursuing the dream of driverless taxis simply slap these sensors on top of an existing vehicle. Each Waymo Pacifica, for example, is equipped with five lidar sensors: a 360-degree sensor on the top, as well as four complementary sensors at the front, rear, and each side of the minivan. Radar is mounted at the top, rear, and on both front fenders. Lastly, a vision system sits atop the Pacifica to complete the sensor stack. The upcoming fifth-generation sensor stack found on Waymo's fleet of Jaguar I-Pace vehicles improves upon this by adding perimeter vision systems and high-definition radar.
A commercial operator like Waymo doesn't need to hide sensors cleanly within a vehicle's body lines the way the maker of a car bound for a consumer's driveway does, so a large 360-degree lidar sensor atop the roof isn't an eyesore these companies have to worry about. Passenger cars are a different story.
Tesla is a prime example of this, as its vehicles are designed with more streamlined styling and greater efficiency in the name of total driving range. Despite the styling and sensor differences between commercial solutions and a Tesla, the Texas-based electric automaker plans to be the first to offer a passenger car that can pull double duty as a driverless taxi. Musk touts that this will help owners earn passive income, potentially enabling an average person to operate a fleet of Model 3s that just generate income. In fact, that's one of the reasons that Tesla's Full Self-Driving software suite costs $12,000 on top of the price of the vehicle, even though Tesla missed its own timeline for unleashing one million robotaxis by the end of 2020.
Left Brain, Right Brain
If a robotaxi's sensors are its eyes and ears, the software mapping everything together is the brain. That is where the magic happens: where ones and zeros, Cartesian coordinates, and images are fused into usable information that the vehicle's underlying software can act upon. Data from those sensors is used to determine how to navigate roadways like a human would, by interpreting lane markings, speed limits, obstacles, and whatever else the road has to throw at it. Unlike a human, however, software doesn't suffer from the curse of selective attention, meaning the decision-making process takes in the vehicle's entire surroundings, so long as it is programmed to do so.
Now, I mentioned that an AV's sensor stack is made up of multiple modalities. That's because each sensor has individual strengths and weaknesses which can be bolstered by the strength of another sensor. Think of it like rock-paper-scissors, except the only winner is the car or pedestrian that doesn't get clobbered thanks to some redundancy.
Camera:
- Pros: Software-based image recognition (with color), multi-purpose (optical character recognition of street signs and environment details), inexpensive
- Cons: Two-dimensional capture, operation in poor conditions is limited or not possible, cannot natively measure distance, compute-heavy
Lidar:
- Pros: Accuracy and precision, speed, natively three-dimensional, resolution
- Cons: Operation in poor weather conditions is limited, costly, difficult to package
Radar:
- Pros: Inexpensive, measurement accuracy, good operation in poor weather conditions, penetration of certain materials, easy to package, furthest usable distance
- Cons: Limited resolution, cannot detect very small objects
Ultrasonic:
- Pros: Ability to detect small objects, good for close-proximity objects, inexpensive, easy to package
- Cons: Very limited range, can be affected by temperature, ineffective at detecting soft materials which can absorb sound waves
Many autonomous tech companies will agree that no single sensor technology on its own is enough to solve the problem of self-driving as a service, and we still have a ways to go to make that happen outside heavily mapped and strictly geofenced urban areas.
One of the big thorny questions is what happens when sensors disagree on what they see; at least, that's how Elon Musk framed it when justifying Tesla's decision to remove radar sensors from its Model 3 and Model Y, and most recently from any other vehicles running its FSD Beta software. Musk is famously bullish on cameras as the only sensors you need for true self-driving, which stands in contrast to pretty much the rest of the entire industry. His logic is essentially that it's best to give the computer one solid stream of data that it can interpret rather than having it piece together information from disparate sources like lidar and radar.
Regardless of the format, sensors simply provide data to the vehicle's onboard software; it is then up to the programming to decide how to treat that data. More so than anything else right now, this is what companies are spending untold billions trying to figure out. Again, there have been real advancements to celebrate, but there are still a ton of open questions about replicating that success safely and at scale. Just one example among millions: One company may choose to ignore conflicting data, while others may treat the data from each sensor as independent and actionable in the name of safety. To Musk’s point, acting on all data points without proper validation could cause a problem that Tesla has been struggling with for ages: phantom braking, in which a car slows suddenly for an obstacle that isn't there.
Should a system be programmed to ignore all conflicting data points, safety could be compromised. Consider the 2018 death of Elaine Herzberg, a woman who was struck and killed in Tempe, Arizona, by an Uber test vehicle when its autonomous systems and safety driver failed to act. The vehicle's lidar and radar units detected Herzberg six seconds before impact; however, the system could not predict her path, and the result was a fatal collision.
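The trade-off between acting on every sensor and discounting a lone dissenter can be shown with a toy example. The Python sketch below is a deliberate oversimplification under stated assumptions: the `Detection` class, the confidence scores, and both voting policies are hypothetical. Production systems fuse and track sensor data in far more sophisticated ways, and nothing here represents how Tesla, Uber, or Waymo actually decide.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Detection:
    sensor: str        # e.g. "camera", "lidar", "radar"
    obstacle: bool     # did this sensor report something in the vehicle's path?
    confidence: float  # 0.0 to 1.0, a made-up score for illustration


def brake_if_any(detections: list[Detection], threshold: float = 0.5) -> bool:
    """Treat each sensor as independently actionable: one confident report brakes."""
    return any(d.obstacle and d.confidence >= threshold for d in detections)


def brake_if_majority(detections: list[Detection], threshold: float = 0.5) -> bool:
    """Discount a lone dissenter: brake only when most sensors confidently agree."""
    votes = sum(d.obstacle and d.confidence >= threshold for d in detections)
    return votes > len(detections) / 2


# Hypothetical frame 1: radar mistakes an overpass for an obstacle on a clear road.
phantom_frame = [
    Detection("camera", False, 0.9),
    Detection("lidar", False, 0.8),
    Detection("radar", True, 0.7),
]
# Hypothetical frame 2: only lidar picks up a real pedestrian on a dark street.
pedestrian_frame = [
    Detection("camera", False, 0.6),
    Detection("lidar", True, 0.9),
    Detection("radar", False, 0.6),
]

for name, frame in [("phantom", phantom_frame), ("pedestrian", pedestrian_frame)]:
    print(name, "any:", brake_if_any(frame), "majority:", brake_if_majority(frame))
```

Both frames contain exactly one dissenting sensor, so each policy treats them identically. Acting on any report brakes for the phantom, and requiring a majority ignores the pedestrian, which is why simple voting cannot resolve the dilemma on its own.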
Perfecting Self-Driving Is Hard
Companies like Waymo and Cruise are confident in their ability to make a car drive itself in a particular Operational Design Domain (ODD), or a set of defined conditions that outline how and where a self-driving vehicle can safely operate. Expanding an ODD while still achieving reliability and safety is proving to be the challenging part for the entire industry. One of the hardest problems to solve today is how vehicles can remain driverless in poor weather conditions, even with complex sensor stacks.
Waymo, for example, is currently testing driverless vehicles in Phoenix, Arizona, one of America's driest cities, with over 330 dry days a year. It gets 76 percent less precipitation than the rest of the United States and sees snow once every few decades, meaning that from a weather standpoint, it doesn't get much more predictable than Phoenix. Not exactly representative of conditions elsewhere. Waymo knows this, of course, and it's also testing its vehicles in San Francisco, a notoriously wetter and foggier city. Since cameras and lidar sensors experience a significant drop-off in usability under these conditions, the vehicle relies heavily on high-resolution radar sensors (whose imagery still looks very much like blobs). It's not perfect, but it does give a bit of redundancy when software has low confidence in data parsed from imaging sensors.
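One way to picture an ODD is as a checklist the vehicle must pass before it is allowed to drive itself. The Python sketch below is a minimal illustration under that assumption. The class names, the zone labels, the weather categories, and the speed cap are all invented for the example and do not describe any company's real operating limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingDomain:
    """A hypothetical ODD: where, and under what conditions, driving is permitted."""
    service_zones: frozenset
    allowed_weather: frozenset
    max_speed_limit_mph: int


@dataclass(frozen=True)
class Conditions:
    """A snapshot of the situation the vehicle currently finds itself in."""
    zone: str
    weather: str
    speed_limit_mph: int


def within_odd(odd: OperatingDomain, now: Conditions) -> bool:
    """Return True only if every current condition falls inside the domain."""
    return (
        now.zone in odd.service_zones
        and now.weather in odd.allowed_weather
        and now.speed_limit_mph <= odd.max_speed_limit_mph
    )


# Invented values: a geofenced domain limited to dry weather and surface streets.
desert_odd = OperatingDomain(
    service_zones=frozenset({"zone_a", "zone_b"}),
    allowed_weather=frozenset({"clear", "cloudy"}),
    max_speed_limit_mph=45,
)

print(within_odd(desert_odd, Conditions("zone_a", "clear", 35)))
print(within_odd(desert_odd, Conditions("zone_a", "snow", 35)))
```

Expanding the domain means adding entries to those sets, and every addition has to be backed by evidence that the vehicle stays safe under the new condition.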
San Francisco was Waymo's attempt at learning to walk before it could run. Last November, it sent a fleet of its lidar-equipped Pacificas to New York City as part of a larger effort to map a more complex and demanding territory with challenging weather conditions. It'll be interesting to see what kind of data the test run returns, given that self-driving tech in general struggles with snow-covered roadways and iced-up sensors.
Outside of weather, there are a number of scenarios that vehicles simply aren't programmed to deal with. Self-driving companies have accounted for this, however, and use machine learning to help a vehicle intelligently determine how to behave in certain contexts. But the first time a computer has to parse data unlike anything it has encountered, it may be uncertain how to handle that particular event. And millions of test miles still haven't exhausted all that America's roads can throw at a driver.
Meet Joel Johnson, or as he's better known on YouTube, JJRicks. He likes to ride in Waymo self-driving taxis and film his trips, and his videos helped to showcase the early progress of Waymo Driver in Phoenix. He documented the everyday traffic situations, mundane and otherwise, that the vehicles encountered, and gave the world outside of the Phoenix testbed a look at how the vehicles performed.
One particular situation that arose in May of 2021 involved a Waymo vehicle and a few traffic cones. As the vehicle pulled into traffic, it was unsure how to proceed and got stuck, blocking traffic. Despite having logged millions of synthetic miles in Simulation City, Waymo's software had not accounted for this particular event or even one close enough to confidently react. Cones are just one example; there are also traffic pattern changes, construction, weird map errors, lane markings (or lack thereof), and other edge cases to consider.
The Cars Need Hands
At the end of the day, humans are still the biggest element on both sides of the mobility industry, and that likely won't change anytime soon.
On the operator side, when an AV isn't sure how to act in a specific scenario, it phones home to a human for help. A team of Remote Guidance (RG) operators oversees the fleet, typically in a one-to-many arrangement (meaning that one RG operator will look after multiple operational vehicles at any given time). Depending on the company, the vehicle may ask permission to proceed (a go/no-go response), or it may seek context (road closure, construction, obstacle) so that its programming can make a more informed decision about how to navigate a certain path. If a vehicle still can't determine how to proceed, a human may need to take control of the vehicle. For most companies, this isn't done remotely due to the latency of cellular connectivity. Instead, a local field team is dispatched to pilot the vehicle around the obstacle or to its final destination, as happened in the JJRicks incident described above.
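The escalation path described above can be sketched as a small decision function. This Python example is a simplified illustration, not any operator's actual workflow. The `Outcome` names, the function signature, and the ordering of the checks are assumptions made for the sake of the example.

```python
from enum import Enum, auto
from typing import Optional


class Outcome(Enum):
    PROCEED = auto()              # vehicle continues on its own
    REPLAN_WITH_CONTEXT = auto()  # vehicle re-plans using a human-supplied hint
    DISPATCH_FIELD_TEAM = auto()  # a local human drives the car in person


def resolve(
    vehicle_confident: bool,
    operator_go: Optional[bool] = None,
    operator_context: Optional[str] = None,
    can_replan: bool = False,
) -> Outcome:
    """Walk the hypothetical escalation ladder from least to most human effort."""
    if vehicle_confident:
        return Outcome.PROCEED
    # Step 1: the vehicle asks a remote operator for a go/no-go answer.
    if operator_go:
        return Outcome.PROCEED
    # Step 2: the operator supplies context, such as "road closure",
    # and the vehicle's own planner finds a route around the problem.
    if operator_context is not None and can_replan:
        return Outcome.REPLAN_WITH_CONTEXT
    # Step 3: remote driving is ruled out by latency, so a field team is sent.
    return Outcome.DISPATCH_FIELD_TEAM


print(resolve(vehicle_confident=False, operator_go=True))
print(resolve(False, operator_go=False, operator_context="construction", can_replan=True))
print(resolve(False, operator_go=False, operator_context="cones", can_replan=False))
```

Each rung costs more human time than the one before it, which is part of why one remote operator can cover several vehicles while a field dispatch ties up a person and a trip.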
But RG operators, safety drivers, and field teams are just the tip of the meatbag iceberg. You also need regional teams of caretakers who have the sole responsibility of maintaining the vehicles between riders. For example, a prep team might be in charge of refueling (or charging) a vehicle and its routine mechanical maintenance. Another maintenance team might be the ones who ensure that a vehicle is vacuumed and detailed at regular intervals. Picture this: you and your friends leave the bar one night. Someone in your party has had a bit too much to drink and gets a wee bit of motion sickness on the way home. Someone with hands—a real person—still has to clean that up. Does the car depart from its route to head back to dispatch to be sanitized? Does a cleaner head out to take care of it in the field?
Today's infrastructure is also extremely human-centric. Already congested city streets are meant to serve people, not empty vehicles. If a robotaxi finds itself without a passenger, it simply takes up space. So just how does it most efficiently spend its time waiting for the next passenger?
Roaming through the streets increases traffic and decreases usable range. Returning home moves the vehicle farther away from potential riders and decreases usable range. Parking along the street could take up limited curbside space and leave the vehicle without a way to pay for metered parking. There doesn't seem to be a clear win in this scenario, and Waymo, the only industry player that responded to my requests for comment on the state of the mobility industry, could not definitively tell me how it planned to solve this problem.
The human factor is also likely to be what separates a well-oiled fleet of robotaxis owned by an AV company from the Tesla owner who expects to send their electric sedan out for a night of passive income. While Elon Musk’s promise of one million robotaxis by the end of 2020 seems like a grand idea, it leaves out the human, who is still very much a necessity at today's stage of self-driving.
Building cars is hard, and building self-driving cars is extremely hard; that much is clear. Automated mobility as a service is just one of many avenues that self-driving cars are poised to serve in the future, but it's going to take a lot more time and work to build a commercially viable product. It seems like only yesterday that the world was in awe over the robotic cars competing in DARPA's Grand Challenge in 2004. It was there that Waymo co-founders Sebastian Thrun and Anthony Levandowski would get their first taste of self-driving. Years later, the industry is living out Tom Cargill's 90-90 rule: "The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time."
Back to our initial question: The answer may very well be that robotaxis simply don't work everywhere, that the ODD is constrained to certain urban areas, landscapes, or even weather conditions while technology is in its current state. You might see them in bigger cities over the next decade like you do those shared electric scooters now—a curiosity, but ultimately not essential. Or we might be on the precipice of a huge technological leap that'll allow companies like Waymo to cross off the programming part and start thinking about the next challenges in building a real self-driving rideshare company. Regardless, it's good to remember that the industry players spearheading the effort are indeed making strides, and a handful of regular schmoes like you and me who live in the right place (like Phoenix or San Francisco) can even hail a robotaxi today.
Ultimately, the name of the game is safety. A city full of two-ton computers on wheels will be a dangerous place for pedestrians and motorists if the race to the finish line is treated like a sprint instead of a marathon. Slow and steady wins the race, and likely a lot of lives down the line.