Thoughts on Driverless Cars (part 3 of 6)

This is a continuation of my thoughts on where we are headed with driverless cars. I would recommend reading this series from part 1.

Previously we discussed “How to manage surges in ridership demand?” In this post, I want to share some of my thoughts on how ride-hailing and driverless technologies will work together. Both are disruptive and symbiotic technologies, so there is a lot to discuss here.

How does ride-hailing fit into all of this?

As shared CapEx and OpEx models proliferate, ride-hailing will become the primary means of letting the transit infrastructure know where you are and where you plan to go. While Uber and Lyft may argue this point, sending your location and your desired destination to a fleet operator is not a technologically hard application to develop. The primary problems come down to the following:

  • Building the supply to support the transit infrastructure
  • Ensuring the ride is safe and trustworthy (this includes the ML algorithm supporting the driverless behavior, and the route taken)
  • The cost of the ride
  • The time it takes for the ride to commence and complete
  • The experience of the ride (this also includes brand and brand loyalty)

I assume that anyone operating a ride-hailing service has the capital or network to provide the appropriate supply (public transit, OEMs, or fleet operators who organize privately owned vehicles).

I also assume that with driverless vehicles the cost of transit will drop by over 80%, and that in a highly competitive market there will be a fast race to 0, just as we’re seeing with computer storage.

Lastly, I assume that routing methods and algorithms have become commoditized. While there is some interesting work going on with efficient pooled-routing models, basic routing is good enough everywhere that fleets will not differ much in the time it takes a vehicle to get from point A to point B.

The biggest factors in deciding which company users hail from will come down to:

  • safety and trustworthiness of the ride
  • availability of the fleet to commence a ride
  • and the experience of the ride

Ride safety and trustworthiness

Currently, this is handled with background checks on drivers, IIHS ratings of the vehicle, and other safety measurement methods. Lapses in safety or trustworthiness lead to disloyalty and complete abandonment. It is absolutely imperative that driverless companies provide ride safety and trustworthiness. The first to figure out safety and trustworthiness will have a substantial first-mover advantage in this driverless world: it unlocks the ability to bring OpEx driving costs down substantially and, as a result, a massive competitive advantage.

Driverless technology providers (OEMs, ride-hailing companies or tech companies) will need to ensure that their driverless technologies meet the strictest of safety standards, as humans will need to trust technology to take over what they currently do. Typically, new technology like this is met with skepticism, but over the long term its benefits outweigh the skepticism and even the laggards switch over.

To ensure driving safety, machine learning techniques will need to be adopted. To develop high-performing machine learning models, you need 3 primary things: lots and lots of data, statistical and physical features generated from the data, and algorithms to apply to those features in order to produce a model. In a driverless world, we will need many ML-based models, and the ability to execute those models quickly on the vehicle itself.

Collecting the Data

Data collection is essential to training driverless car models. The more data collected, the more the algorithms can train on, and as a result the more “experienced” the model will be. It is therefore imperative for driverless car technologies to collect as much data as they can.

While Google is best known for bringing driverless technology to broad public attention first, they’ve fallen way behind in collecting data. On the other hand, Tesla has taken a nearly insurmountable lead. Tesla’s autopilot technology is a data collection powerhouse, whether drivers are enabling autopilot or not. The collection of driver behavior and real-world scenarios as real-time data is powerful training data to teach driverless models how to drive.

Tesla’s stack of sensors may not be as sophisticated as Google’s or Uber’s (e.g. laser range finder), but their ability to collect data en masse from every one of their vehicles on the road is unparalleled. Even if there is an optimal sensor that can dramatically improve the accuracy of driverless behavior, it will still need to collect >100B miles of data to provide the fraction of training data that will affect driving behavior.

However, in addition to having the right stack of sensors with a sufficient amount of data, you have to ensure that the quality of the data is pristine. Unlike software-generated data, sensor-generated data has inherent issues: data recording gaps, extra noise, and drifting calibration. Being able to identify and treat these sensor-generated data issues is absolutely imperative to any data analysis or ML model.
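To make those three issues concrete, here is a minimal sketch of how each might be detected or treated. The function names, the 1.5x gap tolerance, and the idea of comparing against a trusted reference signal to estimate drift are all my own assumptions for illustration, not a description of what any driverless vendor actually does.

```python
from statistics import median

def find_gaps(timestamps, expected_dt, tolerance=1.5):
    """Return (start, end) pairs where consecutive samples are too far apart."""
    return [
        (a, b)
        for a, b in zip(timestamps, timestamps[1:])
        if b - a > expected_dt * tolerance
    ]

def median_filter(values, window=3):
    """Suppress spike noise by replacing each sample with its local median."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]

def drift_per_sample(readings, reference):
    """Least-squares slope of (reading - reference) against sample index.

    Needs at least two samples; a slope far from zero suggests the
    sensor's calibration is drifting relative to the reference.
    """
    errors = [r - ref for r, ref in zip(readings, reference)]
    n = len(errors)
    mean_x = (n - 1) / 2
    mean_e = sum(errors) / n
    num = sum((i - mean_x) * (e - mean_e) for i, e in enumerate(errors))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

In practice the hard part is usually the reference: you need a second sensor or a known ground truth to measure drift against, which is one more reason fleet-scale data helps.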

Generating the features

Features are calculations generated off the data to represent something statistically significant or a relationship between values, so the model does not need to learn it. The art of creating the right features (when working on the same data) is primarily what determines the performance of a predictive model. This is where having the right data scientists, analytical tools and subject matter experts plays a critical role.
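As a small illustration of what a feature is, the sketch below turns raw samples into a few physical features a model would otherwise have to discover by itself. The sample fields (`t`, `speed`, `gap`) and the choice of features (acceleration, headway, time to collision) are assumptions I picked for the example.

```python
def derive_features(samples):
    """Derive physical features from consecutive raw samples.

    samples: list of dicts with t (seconds), speed (m/s) and
    gap (metres to the vehicle ahead). Field names are hypothetical.
    """
    features = []
    for prev, cur in zip(samples, samples[1:]):
        dt = cur["t"] - prev["t"]
        if dt <= 0:
            continue  # skip duplicate or out-of-order samples
        accel = (cur["speed"] - prev["speed"]) / dt
        closing_speed = (prev["gap"] - cur["gap"]) / dt
        # Headway: seconds until we reach the lead vehicle's current position.
        headway = cur["gap"] / cur["speed"] if cur["speed"] > 0 else float("inf")
        # Time to collision: only finite while the gap is shrinking.
        ttc = cur["gap"] / closing_speed if closing_speed > 0 else float("inf")
        features.append(
            {"t": cur["t"], "accel": accel, "headway_s": headway, "ttc_s": ttc}
        )
    return features
```

A model fed `ttc_s` directly has a much easier job than one that has to infer it from raw gap and speed readings, which is the sense in which features matter more than the algorithm.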

Google has built an army of data science experts, significantly improving the performance of search recommendations, friend recommendations, ad recommendations (using past behavior, social relationships, and location), routes for mapping, and even regular home-based activities (Google Home and Google Assistant). The data scientists who have worked on these models have access to some of the best tooling for building terrific features, which has created a moat around Google’s ability to execute in this space.

However, Google has to know how the vehicles will function. What made all of their past forays in machine learning successful is the fact that they owned the environment where the model execution was happening. Google has no control of the vehicle (right now they only install on a small set of vehicle types, but the plan is to apply this technology to any vehicle) or of the environment it is in. Google’s driverless technology has a large amount to learn about each car as well as about environmental circumstances. Tesla’s approach of standardizing on a small set of vehicles (Model S, Model X, which is nearly the same as the Model S, and Model 3), however, allows them to spend less time learning vehicle performance and more on the environment. Additionally, Tesla, as the vehicle manufacturer, has an unfair advantage in knowing how the vehicle behaves physically, which Google or Uber may not.


Machine learning has been studied quite extensively lately. Nearly all machine learning models are backed by some combination of a standard set of ~200 algorithms. What defines a model’s performance is less the algorithm (algorithms in the same category will lead to nearly the same performance) and more the features. Since this is a highly contested market, I would not be surprised if proprietary algorithms have been developed, but I would be surprised if these new algorithms vastly outperform the existing set of publicly available algorithms.

Model training and execution

As more data is collected, driverless models will be retrained on the new data and driverless performance will improve (learning as they add more miles). These models can learn not just from 1 vehicle, but from the whole fleet.

What is going to matter more is the ability for these models to execute in real-time. All of the parallel sensing information flowing to the vehicles’ computers will require on-board computing to digest this information and allow for model execution to take place in fractions of a second so the vehicles can adapt to the environment instantaneously.

This will require highly parallel computing. Those who are building parallel computing platforms (GPU computers, FPGA computers, etc) will need to find ways to bring them to vehicles. Computing hardware in vehicles has to pass extensive environmental testing (temperature, tumbling, sandblasting, etc). Google, Amazon, Facebook, and Microsoft are leading on this kind of computing platform, but Tesla’s nearby access to the same pool of engineers has allowed them to focus more on the environmental aspects of model execution. The ability to execute models like these in harsh environments is a big need, well beyond transit. If there’s another big bet to play here, I would focus on building these model execution platforms.


In addition to being able to execute the models, it is important to provide cyber security for this platform. This includes both the data at rest and the data in motion. The data that is sent up to the cloud for model training cannot be corrupted, as this data is what model training is based upon. Once the model is trained, it will be pushed to vehicles (think of this as similar to the App Store pushing updates to the apps on your phone), and this process needs to be secure and uncorrupted. Much of this technology has been proven in other contexts. We just need to ensure that the model execution, data collection, and model training platforms contain the same if not better technology, so that a “software bug” or other issues we tolerate with our computers or phones don’t trickle down to our vehicles and more vital infrastructure.
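As a rough sketch of the “secure and uncorrupted” requirement for pushed models, a vehicle could refuse to load any model whose authentication tag does not match. The example below uses an HMAC from Python’s standard library with a shared secret key. That is a simplification on my part: a real update channel would more likely use asymmetric signatures so the vehicle only holds a public key, and the function names here are hypothetical.

```python
import hashlib
import hmac

def sign_model(model_bytes: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the serialized model."""
    return hmac.new(key, model_bytes, hashlib.sha256).hexdigest()

def verify_model(model_bytes: bytes, key: bytes, signature: str) -> bool:
    """Return True only if the model matches the tag it was shipped with."""
    expected = sign_model(model_bytes, key)
    # compare_digest runs in constant time to avoid leaking tag contents.
    return hmac.compare_digest(expected, signature)
```

The cloud side would call `sign_model` after training, ship the tag alongside the model, and the vehicle would call `verify_model` before swapping the new model in.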

Ride availability

The next most important advantage is ride availability. How many times have you hailed a ride, been quoted 2 minutes, and then had the driver take >15 minutes? This builds rider frustration and potential fleet switching. It is absolutely imperative that fleets have an oversupply of vehicles for riders to draw from. If a vehicle is readily available for a rider, they are more likely to grow loyal to a fleet. As a result, to manage their fleets efficiently, operators must be able to predict where and when demand is going to happen. This means they need to know event and personal calendars, as well as effective route requests.
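To show the shape of the prediction problem, here is a deliberately naive sketch: forecast demand for a zone and hour as the historical average, then stage a bit more than that. The zone names, the 25% oversupply margin, and the function names are assumptions for illustration. A real forecaster would fold in calendars, events and weather rather than a plain average.

```python
from collections import defaultdict
from math import ceil

def forecast_demand(history):
    """Average past ride counts per (zone, hour).

    history: iterable of (zone, hour, ride_count) observations.
    """
    totals = defaultdict(lambda: [0, 0])  # (zone, hour) -> [sum, count]
    for zone, hour, rides in history:
        totals[(zone, hour)][0] += rides
        totals[(zone, hour)][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

def vehicles_to_stage(forecast, zone, hour, oversupply=1.25):
    """Vehicles to position in a zone, padded by an oversupply margin."""
    expected = forecast.get((zone, hour), 0.0)
    return ceil(expected * oversupply)
```

For example, two past mornings with 40 and 60 rides downtown at 8am give a forecast of 50, so this sketch would stage 63 vehicles there.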

Google, Apple, and Microsoft have the strongest knowledge of personal, business and event calendars, but given that these calendars are easily shared by users, it shouldn’t be a major barrier for ride-hailing apps to integrate with them.

Where Google (via Google Maps and Waze) and Uber have a major advantage is in their lengthy experience of providing routing directions when users need them (as discussed during the surge discussion). The major difference is Uber’s experience with ride-sharing routing vs Google’s proprietary owner routing information. These are very different use-cases, and each is limited in its ability to scope full driverless transit demand. Both of these companies will need to continue to grow their routing request information to gain a better understanding of what ride-sharing requests, long-haul drive requests, errand drives, etc. look like.

Public transit and accessibility

Accessibility becomes a major issue when ride-hailing becomes the primary method of transit. How do those without cellphones gain access to ride-hailing? How about those who are blind, hard of hearing, or have other physical impairments? Currently, these needs are handled by public agencies that support people with economic and physical impairments. Since the driverless fleet is not as easily managed, these edge cases become harder to handle, and given the socio-economic stratification they especially need to be taken into consideration. I have thoughts on how this could be solved, but I have not done enough user research around physical accessibility concerns to provide a well-thought-through answer.

Ride experience and purpose

The last major factor in deciding which fleet riders will choose to hail from is going to be the experience of the ride. There is obviously the experience of who you ride with and the environment of the vehicle (cleanliness, advertising in the vehicle, etc), and then there is the purpose of the ride (hauling, commuting, road trips, errands, etc).

In my eyes, this will be the primary decider in which fleet riders will hail from. The fleets will break down in one of two ways:

  • fleets specializing in a specific type of service
    • services include: commuting, errands, long haul drives, moving, afterschool activities, etc
    • We’re already beginning to see this breakdown with commuting = Uber/Lyft, moving = Lugg, long haul drives = Getaround, afterschool activities = shuttle.
  • fleets with brands that customers identify with
    • companies can create brands that cater specifically to the worker, the yuppie, the soccer mom, high income vs low income, etc

In general, this will be the most experimental area. Different companies will try different services, in-car experiences, and business models. It will be interesting to find out what innovations attract the most riders. However, at the end of the day, the vehicle you are picked up in will mean less than it does today and the experience you get during the ride will mean even more.


Additionally, fleet operators are going to start building loyalty programs around their brand, offering rewarding experiences, upgrades and discounts for earning miles with their fleet.

We’ll also see riders who provide their own vehicle to the fleet (sharing the CapEx) earn points for future rides, and build loyalty for a different experience that their own vehicle does not provide.

Additional hailing models

In addition to ride-hailing, a driverless world will give rise to some additional services that were not possible before.

Deliveries, couriers, and errands

Instead of requiring a driver to deliver your food or take out, you can place your order and the courier transit happens automatically for you. You can easily send a vehicle to pick up your dry cleaning, groceries, and other errand-related items that need to be picked up or moved. Each of the service providers would need to be ready to deal with an unmanned vehicle and put the appropriate item in that vehicle, but a lot more productivity can happen as a result of a driverless world.

All of this errand/courier related transit could potentially be disrupted by drone technologies, but with our current road infrastructure, a driverless model may work just as well and more reliably.


Another difficult activity for transit is taking care of kids with after school activities. I know many families where the kid cannot participate in an activity because a parent isn’t able to get off work, or they have to choose between two kids’ activities because of the conflict. In a driverless world, kids can pool together and be sent from place to place.

Of course, this will require a lot of trust to be built into this infrastructure, but once established it may help with a lot of transit coordination.

Kayak for riders

Lastly, one area of opportunity is aggregation services, ones that query all the possible transit models and fleet operators to provide you with the appropriate ride for when you need it. This Kayak model would work well for riders, but I suspect the main players will not want to participate. Over time, as transit becomes all the more commoditized, this will become an acceptable means of ride-hailing.
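At its core, an aggregator like this is a ranking problem over quotes from several fleets. The sketch below assumes each fleet returns a price and a pickup ETA, drops quotes that would arrive too late, and ranks the rest by price plus a dollar value per minute of waiting. The `Quote` fields, the 10 minute cutoff and the $0.50 per minute weighting are all made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Quote:
    fleet: str
    price: float    # fare in dollars
    eta_min: float  # minutes until pickup

def best_quotes(quotes, max_eta_min=10.0, minute_value=0.5):
    """Rank quotes by price plus the cost of waiting, cheapest first.

    Quotes with a pickup ETA above max_eta_min are dropped entirely.
    """
    eligible = [q for q in quotes if q.eta_min <= max_eta_min]
    return sorted(eligible, key=lambda q: q.price + minute_value * q.eta_min)
```

With quotes of $12 in 3 minutes, $10 in 9 minutes and $8 in 20 minutes, the third is dropped for being too slow and the first wins, because the 6 extra minutes of waiting outweigh the $2 saving under this weighting.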

What do you think will happen with ride-hailing? I’d love to hear your thoughts.

In the next post, I’ll discuss my thoughts on how driverless technologies will change urban/rural planning.
