Did the Experiment Work? Stop Asking Too Early

The first week after launch is usually a terrible narrator.

Mostly Stable, February 27, 2026

Teams love the first readout after an experiment. The feature shipped. The campaign launched. The onboarding email changed. Everyone wants to know whether it worked.

That urgency makes sense. It also creates bad calls. A few early points can look like momentum, disappointment, or chaos depending on which slice you stare at longest.

The launch did not erase variation

Even if an experiment has a real effect, the metric will still bounce. A process behavior chart helps you compare post-launch data against the pre-launch baseline. You are looking for evidence that the system shifted, not evidence that one point looks exciting.

A single point outside the expected range can be a signal. A run of points on one side of the old average can be a signal. A wobble that stays inside the pre-launch range and keeps crossing the old average is not.
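To make "expected range" concrete, here is a minimal sketch that assumes the process behavior chart is an XmR (individuals and moving range) chart, the most common form. The function names and the idea of passing the metric as a plain list of daily values are illustrative assumptions, not part of any particular tool.

```python
from statistics import mean


def natural_process_limits(baseline):
    """Return (lower, center, upper) limits computed from pre-launch values.

    Assumes an XmR chart: limits sit 2.66 average moving ranges from the mean.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline points")
    center = mean(baseline)
    # Moving range: absolute difference between each pair of neighboring points.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    avg_moving_range = mean(moving_ranges)
    # 2.66 is the conventional XmR scaling constant (3 / 1.128).
    spread = 2.66 * avg_moving_range
    return center - spread, center, center + spread


def points_outside_limits(values, lower, upper):
    """Return the positions of post-launch points that fall outside the limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]
```

The limits come only from pre-launch data. Post-launch points are then judged against them, so a good first week cannot widen or shift its own yardstick.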

Good experiment notes include the baseline

Before launch, document how the metric currently behaves. What is the average? What range is normal? How often does it cross the target already? That last question is humbling. Many teams discover their "post-launch improvement" already happened several times before launch.
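The humbling question can be answered with a few lines. This sketch assumes a higher-is-better metric stored as a list of pre-launch values; `baseline_target_hits` and its field names are hypothetical, chosen only for illustration.

```python
from statistics import mean


def baseline_target_hits(baseline, target):
    """Summarize how often the pre-launch metric already met the target.

    Assumes higher is better; flip the comparison for lower-is-better metrics.
    """
    if not baseline:
        raise ValueError("baseline must contain at least one point")
    hit_positions = [i for i, v in enumerate(baseline) if v >= target]
    return {
        "average": mean(baseline),
        "lowest": min(baseline),
        "highest": max(baseline),
        "hits": len(hit_positions),
        "hit_share": len(hit_positions) / len(baseline),
        "hit_positions": hit_positions,
    }
```

If `hit_share` is already well above zero before launch, one post-launch day at or above the target is weak evidence of anything.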

After launch, keep plotting. If the chart shows a signal, then ask what part of the change most likely drove it. If the chart does not show a signal, resist the urge to dress up noise as a lesson.

A more honest readout

Instead of "The experiment increased activation by 7 percent," a better readout might say, "Activation has been above the old average for eight consecutive days, which suggests the system shifted. Next we will verify the change holds across traffic sources."

That sentence is less flashy. It is also much harder to fool yourself with.
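The "eight consecutive days" check is easy to automate. The sketch below assumes the old average has already been computed from pre-launch data and treats a point sitting exactly on the average as breaking the run. The default of eight matches the example readout above; the function names are illustrative.

```python
def longest_run_one_side(values, center):
    """Return (length, side) of the longest run strictly above or below center."""
    best_len, best_side = 0, None
    run_len, run_side = 0, None
    for v in values:
        if v > center:
            side = "above"
        elif v < center:
            side = "below"
        else:
            side = None  # a point exactly on the average breaks the run
        if side is not None and side == run_side:
            run_len += 1
        else:
            run_side = side
            run_len = 1 if side is not None else 0
        if run_len > best_len:
            best_len, best_side = run_len, run_side
    return best_len, best_side


def has_run_signal(values, center, run_length=8):
    """True if at least run_length consecutive points sit on one side of center."""
    length, _ = longest_run_one_side(values, center)
    return length >= run_length
```

Because the function also reports the side, the same check catches a sustained drop as readily as a sustained lift.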

Why teams ask too early

Teams ask early whether an experiment worked because waiting feels irresponsible. The company wants momentum. The team wants validation. The launch channel is still warm. Nobody wants to say, "We do not know yet."

But "we do not know yet" is often the most honest answer. A process behavior chart helps make that answer acceptable because it shows what kind of evidence is missing.

Before-and-after is a weak habit

A before-and-after chart can be useful when the change is dramatic. Most product and growth changes are not dramatic. They are modest interventions in a noisy system. Comparing the week before launch to the week after launch gives too much power to calendar luck.

The better comparison is post-change behavior against the established pre-change range. You are asking whether the system now behaves in a way it did not behave before.

What to include in an experiment readout

The metric and why it was chosen.
The pre-change average and expected range.
The date the change entered the system.
Whether post-change points show a signal.
Known confounders: traffic mix, pricing changes, outages, sales pushes, instrumentation edits.
The decision: keep, expand, revise, or keep observing.

This format makes the readout less theatrical. It also makes the uncertainty visible enough for a decision.
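One way to keep readouts consistent is to capture the checklist above as a small record type. This is only a sketch: the class, field names, and summary wording are assumptions made for illustration, not a standard format.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Decision(Enum):
    KEEP = "keep"
    EXPAND = "expand"
    REVISE = "revise"
    KEEP_OBSERVING = "keep observing"


@dataclass
class ExperimentReadout:
    metric: str
    why_chosen: str
    baseline_average: float
    expected_range: tuple[float, float]  # (lower, upper) from pre-change data
    change_date: date                    # when the change entered the system
    signal_detected: bool
    decision: Decision
    confounders: list[str] = field(default_factory=list)

    def summary(self) -> str:
        """Render the readout as plain text, one checklist item per line."""
        low, high = self.expected_range
        lines = [
            f"Metric: {self.metric} ({self.why_chosen})",
            f"Pre-change average: {self.baseline_average:g}, "
            f"expected range: {low:g} to {high:g}",
            f"Change entered the system: {self.change_date.isoformat()}",
            f"Signal after change: {'yes' if self.signal_detected else 'no'}",
            f"Known confounders: {', '.join(self.confounders) or 'none recorded'}",
            f"Decision: {self.decision.value}",
        ]
        return "\n".join(lines)
```

Making `decision` a required field with a fixed set of values means "keep observing" has to be written down as an explicit choice.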

What if the signal is negative?

A negative signal is still useful. It can show that a launch harmed activation, slowed support, increased incidents, or reduced usage. The key is to respond to the system change rather than argue with the chart. If the signal is real, the next step is learning what part of the intervention created it.