Here's Why That Autonomous Race Car Crashed Straight Into a Wall
A little bit of bad code caused a hilariously unfortunate crash.
You might remember the Roborace car that drove itself into a concrete wall earlier this week during a Twitch livestream. It brought on a flood of negative comments toward autonomous vehicles and confusion from people who didn't understand why the crash actually happened. Fortunately, an engineer from the Schaffhausen Institute of Technology—the team that entered the car—was able to chime in and give the world a better understanding of where everything went wrong.
One of the four SIT engineers said in a Reddit comment, "The actual failure happened way before the moment of the crash, on the initialization (sic) lap. The initialization lap is there to take the car from boxes to the start/finish line and the car is driven by a human driver during the lap. The initialization lap is a standard procedure by Roborace."
They continued, "So during this initialization lap something happened which apparently caused the steering control signal to go to NaN and subsequently the steering locked to the maximum value to the right. When our car was given a permission to drive, the acceleration command went as normal but the steering was locked to the right."
The biggest takeaway is the acronym used by the engineer: NaN. It stands for Not a Number, a special floating-point value that a program produces when a calculation has no meaningful numeric result. Typically, it comes from an undefined operation, such as dividing zero by zero or subtracting infinity from infinity, and once it appears it carries through every later calculation that uses it.
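To make that concrete, here is a minimal Python sketch of how a NaN can appear and spread. It is only an illustration: the variable names are hypothetical, and the engineer did not say which calculation actually produced the NaN. (Python raises an error on a plain `0.0 / 0.0`, whereas MATLAB returns NaN for it, so the sketch uses infinities instead.)

```python
import math

inf = float("inf")

# Two operations with no defined numeric result; both yield NaN.
undefined_difference = inf - inf
undefined_product = 0.0 * inf

# Hypothetical downstream math: anything computed from a NaN is also NaN.
steering_command = 0.5 * undefined_difference + 0.1

print(math.isnan(undefined_difference))      # True
print(math.isnan(undefined_product))         # True
print(math.isnan(steering_command))          # True
print(steering_command == steering_command)  # False: NaN is not equal to itself
```

The last line is the quirk that matters most here: a NaN does not compare as equal, greater, or smaller than anything, including itself.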
The failure occurred while the vehicle was being driven on its human-operated initialization lap, and it persisted when the car transitioned into autonomous mode. The engineer confirmed that the vehicle telemetry reported the invalid data; however, it was not flagged as invalid and was missed by the vehicle operators. Combine that with a desired trajectory that checked out and you have a recipe for failure.
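For a sense of what flagging that data could look like, here is a rough Python sketch that scans a telemetry snapshot for values that are not finite numbers. The channel names, the channel count, and the `non_finite_channels` helper are all made up for illustration; this is not the team's telemetry system.

```python
import math

def non_finite_channels(telemetry):
    """Return the sorted names of channels whose value is NaN or infinite."""
    return sorted(
        name for name, value in telemetry.items() if not math.isfinite(value)
    )

# Hypothetical snapshot: many healthy channels plus one bad steering value.
telemetry = {f"channel_{i}": 0.0 for i in range(1500)}
telemetry["steering_control"] = float("nan")

bad = non_finite_channels(telemetry)
all_green = not bad  # a single indicator flag operators could watch

print(bad)        # ['steering_control']
print(all_green)  # False
```

A summary flag like this saves operators from having to spot one bad number in a wall of readings.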
"Ironically, [the NaN value] did show up on telemetry monitors, but it showed up along with 1.5k other telemetry values. Usually the operators would look only at the indicator flags that there were no failures, and in our case all indicator flags were green," wrote the engineer. In a separate comment, they continued the explanation by saying, "We did implement checks for what seemed to us as more common failure scenarios, but the devil here was that this one first appeared during the run and we did not cover it at the analytical analysis stage. In other words, we did not expect a NaN value to appear there and put too much confidence in our decision."
The engineer went on to mention that the controller is implemented in MATLAB, meaning that while there was a NaN output in the data stream, the system wouldn't come to a halt for that reason alone. And while the team did code in many fail-safes in other areas of the application, its data validation unfortunately only covered values that were valid numbers. As you recall, a NaN value is not a valid number, so that validation was never performed on it.
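One common way a check can miss a NaN is sketched below in Python. This is an assumption about the general failure pattern, not the team's actual code, which has not been published. The limit and both function names are hypothetical. Because every ordered comparison against NaN evaluates to false, a range check that only looks for "too big" or "too small" lets a NaN through.

```python
import math

MAX_STEERING = 1.0  # hypothetical normalized steering limit

def naive_check(command):
    """Reject out-of-range commands. A NaN passes, since both comparisons are False."""
    if command > MAX_STEERING or command < -MAX_STEERING:
        return False
    return True

def strict_check(command):
    """Require a finite number first, then apply the range check."""
    return math.isfinite(command) and -MAX_STEERING <= command <= MAX_STEERING

nan = float("nan")
print(naive_check(0.3), strict_check(0.3))  # True True
print(naive_check(2.0), strict_check(2.0))  # False False
print(naive_check(nan), strict_check(nan))  # True False
```

The stricter version costs one extra call, and it treats "not a number" as a failure rather than as something that is merely not out of range.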
The end result? The car assumed that the data was correct, and sent the invalid output to the actuators that controlled the car's steering. The wheels locked to the right and the car then made very good friends with a concrete wall.
And this, ladies and gentlemen, is why self-driving is hard.