Innovation at Uber – Express POOL
1) Why didn’t Uber launch Express POOL and “simply” randomize users into A and B?
While straightforward user-level A/B randomization might seem efficient (just show half the riders the
new Express POOL option and see what happens), Uber opted against it for Express POOL for several
practical reasons grounded in both user experience and system design:
Marketplace Spillovers - Express POOL was not a superficial UI change—it affected city-level market
equilibrium, not just individual user behavior. Launching Express POOL had implications for rider
demand, driver supply, and system efficiency. A simple user-level A/B test would have mixed both
product versions in the same market, making it difficult to isolate the systemic impact of Express on
macro variables like driver behavior, ride-matching efficiency, or seat utilization.
Rider Perception and User Frustration - Express POOL introduced significant changes: walking and
waiting. Randomly assigning users without consent to a treatment group could result in negative
reactions, especially from POOL users expecting a door-to-door experience. Some people might’ve
hated it, canceled their ride, or even stopped using the app. That’s not a risk you want to take when
trust and convenience are your brand's pillars. Creating a distinct Express POOL product allowed Uber to
avoid alienating existing customers and preserve rider choice and satisfaction.
Network Effects & Matching Complexity - Matching riders effectively depends on aggregating requests
with similar time and location constraints. If only a subset of users (randomized to “Express”) is eligible
for the new model, the matching pool is fragmented, undermining batching efficiency—one of the core
advantages of Express POOL.
Brand Positioning & Communication - By launching Express as a new product with a distinct name and
in-app communication (e.g., “Walk a little, save a lot”), Uber could set clear expectations for users, and
control perception of value and trade-offs.
So instead of an A/B test, Uber launched Express POOL as a separate product in select treatment cities
where everyone got the new experience. That way, the company could observe the broader ripple
effects, and riders opted in rather than being surprised.
2) What are the pros and cons of each randomization (A/B, switchbacks, synthetic control)? What
would you do if you were a PM at Uber?
Let’s evaluate the three major experiment types Uber used:
Experiment Type: User-level A/B
Pros:
• High internal validity
• Fine-grained data
• Quick to implement
• Great for UI or pricing tweaks
Cons:
• Not ideal for big systemic changes like Express POOL
• Can’t capture network effects or large-scale dynamics
• Fragments the rider pool, which kills the matching algorithm’s efficiency

Experiment Type: Switchbacks
Pros:
• Controls for time trends in the same location
• Useful for algorithm changes
Cons:
• High operational complexity
• Only one active experiment per city
• Short-term effects only
• Only works well when testing one variable (like 2-minute vs. 5-minute wait times)

Experiment Type: Synthetic Control
Pros:
• Captures macro-level changes
• Avoids contamination across groups, which allows you to study things like overall market equilibrium, revenue, and adoption
Cons:
• Requires significant effect sizes (>5%) to detect
• Small sample of cities limits statistical power
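To make the switchback mechanics concrete, here is a minimal sketch of randomizing a city’s hours into alternating wait-time conditions. The block length, condition names, and coin-flip assignment are illustrative assumptions, not Uber’s actual design:

```python
import random

def switchback_schedule(hours, block_len=2, seed=0):
    """Assign one wait-time condition per time block for a whole city.

    Every hour inside a block shares the same condition, so riders and
    drivers in that window all see the same wait time. Block length and
    condition names here are made-up assumptions for illustration.
    """
    rng = random.Random(seed)
    schedule = {}
    for start in range(0, hours, block_len):
        # Randomize the condition per block, e.g. 2-min vs 5-min wait.
        condition = rng.choice(["wait_2min", "wait_5min"])
        for h in range(start, min(start + block_len, hours)):
            schedule[h] = condition
    return schedule

day = switchback_schedule(24)  # hour -> condition for one city-day
```

Because whole time blocks switch together, the unit of analysis is the city-period rather than the individual rider, which is what lets switchbacks avoid within-market contamination.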
If I were a PM at Uber:
I would combine synthetic control and switchback experiments. Here’s how:
• Synthetic control for product-level impact evaluation, e.g., launching Express POOL in selected cities
to track holistic metrics: adoption, profit per ride, driver behavior, and substitution patterns.
• Switchbacks for algorithm tuning, such as evaluating 2-minute vs. 5-minute wait windows within the
same city (as done in Boston), to assess cost vs. experience trade-offs.
• Avoid user-level A/B for Express POOL-type changes, since network and perception spillovers distort
measurement and violate core assumptions of independence.
Additionally, I would prioritize structured governance like the Marketplace Change Protocol (MCP) for
cross-functional alignment, as done in the case, to prevent experiment interference.
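The synthetic-control piece of this plan can be sketched numerically: pick convex weights over untreated “donor” cities so the weighted combination tracks the treated city’s pre-launch metric, then use that combination as the post-launch counterfactual. The grid search over two donors below is a deliberately tiny illustration with invented city series; real implementations solve a constrained least-squares problem over many donors:

```python
def fit_synthetic_control(treated_pre, donor_a, donor_b, step=0.05):
    """Find the convex weight w on donor_a (and 1 - w on donor_b) that
    best reproduces the treated city's pre-launch series, by grid
    search over w. Toy sketch: real synthetic control fits many donors
    at once under the same convexity constraint.
    """
    best_w, best_err = 0.0, float("inf")
    steps = int(round(1.0 / step))
    for i in range(steps + 1):
        w = i * step
        err = sum(
            (t - (w * a + (1 - w) * b)) ** 2
            for t, a, b in zip(treated_pre, donor_a, donor_b)
        )
        if err < best_err:
            best_w, best_err = w, err
    return best_w

# Hypothetical weekly Express-eligible trips (thousands), pre-launch.
treated = [10, 11, 12, 13]
donor_a = [20, 22, 24, 26]   # larger city with a similar upward trend
donor_b = [4, 4, 4, 4]       # smaller, flat city
w = fit_synthetic_control(treated, donor_a, donor_b)
counterfactual = [w * a + (1 - w) * b for a, b in zip(donor_a, donor_b)]
```

After launch, the gap between the treated city’s observed metric and this counterfactual series is the estimated treatment effect.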
3) What are your hypotheses for the test for the 5-minute wait time?
If I were designing or interpreting the Boston switchback experiment (which tested 2-minute vs. 5-
minute rider wait times), here are the hypotheses I’d start with:
Hypothesis 1: Longer wait times (5 minutes) will reduce Uber’s cost per ride by improving batching
efficiency
I expect that giving the algorithm more time—expanding from 2 to 5 minutes—would allow it to identify
better matches between riders going in the same direction. This should lead to more full cars (higher
seat utilization), fewer detours, and less wasted driver time. From an operational perspective, I’d predict
a drop in cost per ride, especially in markets with high ride density.
Hypothesis 2: Longer wait times will increase rider cancellations, especially in colder cities like Boston
I hypothesize that pushing the wait time beyond the standard 2-minute threshold will feel like a
downgrade in experience for many riders—especially when it’s cold, rainy, or during high-stress hours
(like weekday mornings). People open Uber when they want immediate movement, not to stand
around. So, I’d expect cancellation rates to go up once the wait time crosses that psychological line.
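If this hypothesis were tested, the cancellation comparison would reduce to a standard two-proportion z-test on counts from the two conditions. The counts below are invented purely to show the arithmetic:

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Standard two-proportion z-statistic for cancellation rates
    under the 2-minute (group 1) vs. 5-minute (group 2) conditions."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)              # pooled rate under H0: p1 == p2
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

# Hypothetical: 80/1000 cancellations at 2 min vs. 110/1000 at 5 min.
z = two_proportion_z(80, 1000, 110, 1000)
# |z| > 1.96 would reject equal cancellation rates at the 5% level.
```

In practice the unit of analysis would be the switchback city-period rather than the raw ride, but the framing of the test is the same.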
Hypothesis 3: The negative effects of waiting will vary based on time of day
This one’s a bit more nuanced. I believe that the extra wait time will feel more acceptable during off-
peak hours, when riders are in less of a rush and have more flexibility. In contrast, during peak hours,
any additional delay could push riders to either cancel or switch to a faster alternative. I’d hypothesize
that 5-minute waits will perform better mid-day or late evening than during the morning commute.
Hypothesis 4: Price discounts will only justify longer wait times if the savings are significant (not $0.50
or $1)
I suspect that riders won’t tolerate walking and waiting unless they feel they’re getting a deal worth
taking. If Express POOL is only slightly cheaper than POOL—say by a few cents—it won’t justify the
friction. My hypothesis is that there’s a psychological price threshold (perhaps a $2+ saving) below
which Express doesn’t feel “worth it” to most users.
Hypothesis 5: Some riders will adapt to the 5-minute wait over time, especially if it’s positioned well
in the app
Behavior change takes time, and I believe some budget-conscious riders might initially resist but then
accept the longer wait once they experience the cost savings. My hypothesis is that, over time, repeat
Express POOL users will show higher tolerance for the 5-minute wait—if the value is clear and the
communication (e.g., “Walk a little, save a lot”) is compelling.