Driving Data as a Defensible Moat for Self-driving Cars by Peter Reinhardt

Peter Reinhardt

Co-founder, CEO @ Segment

In a future where self-driving cars are available for $0.25 per mile with wait times under 5 minutes, the cars are clean, and you control the music… Uber and Lyft both lose their driver-rider network effects and the market becomes a commodity rental business again.

Or does it? What would be the key competitive differentiator for a self-driving car service?


The world is incredibly fixated on the safety of self-driving cars, far more than it is on the safety of human-driven cars. The implacable fear with self-driving cars is that they’ll do something crazy, crash, kill you, maim you or otherwise damage or scare you when you don’t have control. It’s terrifying.

The self-driving car service with the highest safety ratings and the best safety record will simply command a higher price from riders.

For example, would you pay $0.25/mile for 0.01% odds of a crash, or $0.35/mile for 0.001% odds of a crash? I’d choose the more expensive option all day long, because both prices are so low and the cost of an accident is just life-shatteringly awful. But safety doesn’t just command higher prices, it also lowers the fleet’s insurance costs, giving the safest fleet a wider margin on both ends.
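To make that trade-off concrete, here is a rough sketch of the arithmetic. It assumes the quoted odds apply per mile (the example above does not say whether they are per mile or per trip), and the function names and the dollar value placed on a crash are hypothetical, chosen only for illustration.

```python
def expected_cost_per_mile(price, crash_odds, crash_cost):
    """Fare plus the odds-weighted cost of a crash, per mile."""
    return price + crash_odds * crash_cost


def break_even_crash_cost(price_a, odds_a, price_b, odds_b):
    """Crash cost at which the two services have equal expected cost per mile."""
    return (price_b - price_a) / (odds_a - odds_b)


# Prices and odds from the example above (0.01% = 1e-4, 0.001% = 1e-5).
cheap = {"price": 0.25, "crash_odds": 1e-4}
safe = {"price": 0.35, "crash_odds": 1e-5}

# Assumption: odds are per mile, so this threshold is in dollars per crash.
threshold = break_even_crash_cost(
    cheap["price"], cheap["crash_odds"], safe["price"], safe["crash_odds"]
)
print(f"Safer service wins once a crash costs you more than ${threshold:,.2f}")
```

Under that per-mile assumption, the safer service has the lower expected cost as soon as you value avoiding a crash at more than roughly $1,100, far below any realistic estimate of what a serious accident costs.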

Given that safety will be the metric that matters most for self-driving car fleets, which company in the mix will win? Who is going to build the safest self-driving car?

The major breakthroughs in AI and hardware sensors will likely be commodities. Google, for example, already open-sourced TensorFlow, and key sensors like LIDAR are widely available from electronics manufacturers, with rapidly falling prices.

But there’s another piece to having a great safety track record: you need a TON of data to train your AI. The state of the art in AI is basically to just throw as much data as possible against well-understood algorithms (like neural nets), and incrementally improve the results.

And the availability of training data is not evenly distributed: the bigger the fleet of self-driving cars, the more data you have, which allows you to increase safety through better driver-AI, which lets you reduce insurance costs, which lets you keep lower ride prices and higher safety ratings, which attracts more customers. This is a classic economy of scale that creates a significant defensive moat around a self-driving car business.
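The loop described here can be sketched as a toy simulation. Everything in it is an assumption made for illustration, not a measured fact: the power-law learning curve, the parameter values, the fleet sizes, and the rule that new demand is split in proportion to relative safety.

```python
def simulate_flywheel(fleets, years=5, base_crash_rate=1e-4,
                      learning_exponent=0.5, market_growth=0.5):
    """Toy model of the data flywheel; every parameter is a made-up assumption."""
    fleets = dict(fleets)  # copy so the caller's dict is left untouched
    data = {name: 0.0 for name in fleets}
    crash_rate = {}
    for _ in range(years):
        # Each vehicle contributes one unit of driving data per year.
        for name, size in fleets.items():
            data[name] += size
        # Assumed power-law learning curve: more data, lower crash rate.
        crash_rate = {name: base_crash_rate / (1.0 + d) ** learning_exponent
                      for name, d in data.items()}
        # Riders favor safer fleets: growth is split by relative safety.
        safety = {name: 1.0 / rate for name, rate in crash_rate.items()}
        total_safety = sum(safety.values())
        for name in fleets:
            fleets[name] *= 1.0 + market_growth * safety[name] / total_safety
    return fleets, crash_rate


# Hypothetical starting point: one fleet ten times the size of the other.
final_fleets, final_rates = simulate_flywheel({"large": 100_000, "small": 10_000})
print("fleet size ratio:", final_fleets["large"] / final_fleets["small"])
print("crash rates:", final_rates)
```

In this sketch the larger fleet always holds more data, so it is always safer and always captures the larger share of growth. Its lead therefore widens every year instead of eroding, which is the moat in miniature.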

Now let’s look at how Google, Uber and Tesla stack up when it comes to access to training data.

  • Google has a few vehicles on the road, and the ability to simulate driving on a massive scale.
  • Tesla has just over 100,000 vehicles on the road (that have sensors).
  • Uber has millions of vehicles on the road (but none with detailed sensors sending data to Uber).

Google’s software appears to be in the lead, but Tesla is arguably building up a pretty formidable dataset, and Uber has a bunch of untapped potential.

Looking forward, Google should be running simulations like crazy and rushing to get more than just a few test vehicles onto the road. Tesla should be rushing to get more vehicles on the road as fast as possible. Musk just announced they’re accelerating the race to 500,000 vehicles by two years. And Uber should be giving drivers a bonus for installing a sensor pack on their cars, taking advantage of the millions of diverse vehicles and drivers they already have on the road.

The race to accumulate the biggest dataset is just beginning.