Growth.Talent | Deep Dive | experimentation, growth-teams, product-velocity

The Experiment Tax: Why Elite Growth Teams Ship Less Than You Think

What 8 growth leaders from Meta, HubSpot, Amplitude, and YouTube actually say about experiment velocity, ship rates, and the operating rhythms that separate great teams from average ones

Apr 11, 2026 | 9 min read | By Growth.Talent

There's a dangerous myth circulating in growth circles: elite teams run hundreds of experiments per quarter. The more tests in your backlog, the thinking goes, the faster you'll find winners. Velocity equals value. Ship rate is everything.

Except the top growth leaders who've actually scaled products to billions of users say the opposite.

The real bottleneck isn't experiment volume. It's the quality of what you choose to test and the operating system that determines whether learnings compound or disappear into a spreadsheet graveyard. Teams that optimize for raw throughput often develop what Elena Verna calls a "paralyzing disease," where experimentation becomes theater rather than learning.

The difference between great growth teams and average ones isn't cadence. It's conviction.

The Anti-Pattern: When Every Initiative Becomes an Experiment

Elena Verna, who led growth at Miro, Amplitude, and Dropbox, draws a hard line on this.

If every single one of your initiatives that you're doing on growth is an experiment, that's a problem. It's almost like a disease, like a paralyzing disease.

— Elena Verna, Growth Advisor at Amplitude / Miro / Dropbox

The failure mode is subtle but devastating. Teams begin treating the A/B test as a decision-avoidance mechanism. Can't get alignment on a pricing change? Run a test. Unsure if the new onboarding flow works? Test it. The backlog swells with experiments that exist because someone wanted to defer a hard call, not because there's a genuine hypothesis worth validating.

Laura Schaffer, VP of Growth at Amplitude, saw this pattern play out at Twilio. Her framework pushes back: hypothesis and counter-hypothesis. "Where we disagree, instead of sitting in meetings over and over trying to get everyone to align, just recognize we're not gonna agree. What's the hypothesis we're gonna go with? What's the strongest counter-hypothesis? Let's pay attention to that."

This isn't about running fewer tests for the sake of it. It's about recognizing that experiments have costs—engineering time, user experience fragmentation, opportunity cost of not shipping the thing you're 80% sure will work. The question isn't "should we test this?" It's "what would it take for us to ship this without testing?"

Understand Work vs. Execute Work: The 80/20 Most Teams Invert

Bangaly Kaba spent years at Facebook, Instagram, and YouTube before crystallizing what he calls "the anti-pattern." Someone says they want to build a feature. They pull data to justify it. Then they execute. He calls this "identify, justify, execute."

The superior model: "Understand, identify, execute."

Someone says, hey, you know what, this would be great to build. Then you go pull data to go justify why that would be great to build. Call that identify, justify, execute. First, you have to really understand from first principles what is actually going on. So understand, identify, execute.

— Bangaly Kaba, Director of Product at YouTube (fmr Head of Growth at Instagram)

Kaba's "understand work" is the unsexy foundation that elite teams over-index on. It's the week spent analyzing user cohorts before writing a single experiment doc. It's the qualitative research that reveals why your activation metric is lying to you. It's the adjacent user analysis that shows you're optimizing for the wrong segment entirely.

Most teams invert the ratio. They spend 20% of their time understanding and 80% executing tests. Great teams flip it. They front-load understanding so ruthlessly that by the time they're ready to build, the experiment queue is half the size and twice as effective.

Chris Miller, who built HubSpot's growth function from scratch, describes the mentality this way: "We really had an aggressive mentality, an aggressive approach. Every problem is our problem and radical accountability and ownership mentality helped us find opportunities that maybe the business wasn't explicitly asking us to solve."

That hunger led Miller's team to take over the self-service revenue stream no one else wanted to own; as he puts it, they "immediately blew it up." But the aggression wasn't about running more tests. It was about choosing better problems.

The Canonical Doc: Naomi Gleit's Framework for Extreme Clarity

Naomi Gleit has been at Meta longer than anyone except Mark Zuckerberg. Employee number 29. Two decades watching the company scale from 30 people to a $1.5 trillion business. Her superpower? Taking complex, gnarly problems and simplifying them until they're shippable.

Her operating system starts with what she calls the canonical doc.

I work on a lot of different projects. A lot of times I'm ramping up mid-project. I'm like, where can I learn what I need to learn about this project? I ask 5 different people, get 5 different answers. That is unacceptable. There needs to be one canonical doc. Everyone should know exactly where the canonical doc is.

— Naomi Gleit, Head of Product at Meta (fmr Growth)

This seems basic until you realize most growth teams don't have one. Experiment briefs live in Notion. Results get Slacked. Strategic context lives in someone's head. Three months later, no one remembers why the test ran or what it taught them.

Gleit's canonical doc is the anti-chaos artifact. One place. One source of truth. One link that answers: What are we doing? Why? What did we learn? What's next?

The meta-lesson: experimentation cadence doesn't matter if institutional knowledge evaporates every quarter. Elite teams build systems that make learnings compound. Average teams rebuild the wheel every six months because no one documented what the last wheel taught them.

The Self-Serve Pricing Test: When Not Testing Is the Right Call

Laura Schaffer tells a story from her time at Twilio that inverts conventional wisdom. The team wanted to add qualification questions to the signup flow—the kind of change that makes every growth PM nervous because it adds friction.

"I'm fully expecting, okay, this is gonna hurt our numbers, but maybe it won't be so bad," she recalls. They shipped it to a small group, bracing for conversion rate damage.

We start to get the data for this thing. I'm getting an improved conversion. There's no personalization, nothing past it, just the questions. It improved conversion by like 5%. Like just improved signups. And it was one of those like, what? Like, okay, this, like, what is going on here?

— Laura Schaffer, VP of Growth at Amplitude

The insight: sometimes the thing you're afraid to ship is exactly what users want. The discipline is knowing when to test and when to trust your understanding of the user well enough to just ship.

When Amplitude launched their Plus plan—a self-serve tier wedged between free and enterprise—Schaffer's team didn't test every permutation. They identified the top 10 most divisive hypotheses, shipped the plan, and monitored those specific questions in soft launch. Speed to market beat testing paralysis.

"Pricing and packaging is like a symphony," she says. "It's meant to play well together. We can shift and adjust." The operating assumption: ship, learn, iterate. Not: test until you're certain.

Where the Experts Disagree: The Billion-User vs. Growth-Stage Split

Not everyone agrees on what "elite" experimentation looks like. The sharpest disagreement centers on context: what works at a billion-user platform doesn't translate to a growth-stage startup.

Cem Kansu, CPO at Duolingo, describes a pixel-perfect culture where "consumer products live and die in the pixels." Duolingo's 90 million monthly active learners mean tiny changes—button color, shade of green—can move metrics at scale. The team uses the app heavily. Most ideas come from their own usage. They prototype, test, refine.

Consumer products live and die in the pixels. If a button that's like the dark shade of green versus the light shade of green will make a difference in user behavior, a product manager or a product designer should understand what that means.

— Cem Kansu, CPO at Duolingo

Compare that to Santiago Savinon, Chief Growth Officer at 99minutos in Mexico. His company moved from 5 million to 30 million packages per year—a 6x growth arc—by obsessing over customer use cases, not A/B tests. "We mapped each sales process to the smallest detail we could think of to close a client and start," he says. The team built AI agents to remove friction from seller workflows, giving salespeople "the powers of a CEO."

Savinon's growth lever wasn't experiment velocity. It was removing excuses from the sales motion and enabling autonomy. The metric that mattered: seller independence, not conversion rate lifts.

The disagreement isn't about right or wrong. It's about recognizing that experimentation culture scales with distribution model. Consumer apps with millions of users can profitably test micro-optimizations. B2B growth motions often need macro bets—new segments, new use cases, new sales plays—that can't be A/B tested in any meaningful way.

The Real Operating System: Radical Ownership Over Ruthless Throughput

If experiment volume isn't the answer, what is?

Every leader interviewed points to the same underlying system: radical ownership of outcomes. Chris Miller's HubSpot team didn't ask permission to take over self-serve revenue. They saw an underutilized asset, asked if anyone was working on it, and when the answer was no, they claimed it. "That attitude of saying every problem is our problem helped us find opportunities the business wasn't explicitly asking us to solve."

Bangaly Kaba's framework for change management has five components: vision, skills, incentives, resources, action plan. "You need all of those to have change," he says. Most teams optimize only the action plan, the experiment backlog, and wonder why nothing sticks when the real bottleneck is vision, incentives, or skills.

Naomi Gleit's canonical doc is an ownership artifact. One doc. One owner. One place the whole company can point to and say "this is what we're doing and why." The discipline isn't in how many tests you run. It's in how clearly you can articulate what you're learning and what you're doing about it.

Elena Verna's counter to the experimentation disease: recognize when you're running tests to avoid making decisions. The cure isn't to stop testing. It's to stop pretending tests are a substitute for conviction.

Laura Schaffer's hypothesis-counter-hypothesis framework forces teams to name what they believe and what they're afraid of. Then ship. Then learn. The cadence isn't defined by cycle time. It's defined by how fast you can turn learning into leverage.

The Takeaway: Stop Counting Tests, Start Counting Convictions

The question "how many experiments should we run per quarter?" is the wrong question.

The right questions: How much time are we spending understanding versus executing? Do we have a canonical source of truth for what we've learned? Are we experimenting to learn or to defer decisions? Can we ship this without testing because we've done enough understand work to have conviction?

Elite growth teams don't win by running more experiments. They win by running the right experiments, documenting what they learn, and building operating systems that make knowledge compound. They over-index on understand work. They build ownership cultures where taking a bet on an underutilized asset is rewarded more than hitting a test quota.

The uncomfortable truth: if your growth team is proud of how many experiments they ran last quarter, they're probably running too many. The best teams can't tell you their experiment count. They can tell you what they learned, what they shipped, and what changed because of it.

That's the real cadence that matters. Not how fast you fill the backlog. How fast conviction turns into leverage.
