How long should an A/B test run?

Long enough to reach the required sample per variant, and at least one full week to capture weekday and weekend behaviour. For a 4,656-per-variant sample at 500 daily eligible users split evenly, that is about 19 days.

How is A/B test duration calculated?

Divide the required sample size per variant by the daily eligible traffic entering that variant. Daily eligible traffic times the traffic split gives the users each variant receives per day, and the required sample divided by that is the number of days.

How can I make a test finish faster?

Send more eligible traffic into the test, increase the share allocated to each variant, or target a larger detectable effect so the required sample drops. All three shorten the timeline.

Should I ever stop a test early?

Avoid stopping the moment a result looks significant — that inflates false positives. Run to the planned sample and at least a full week, then read significance once.

Calculator · 018

A/B Test Duration Calculator

Estimate how long a test must run to reach its required sample — and decide whether the timeline is acceptable before launching.

Required sample / variant

count

Daily eligible traffic

/ day

Traffic split / variant

Test duration

—

Average

Scenario lens Current · Benchmark · Optimized

Leverage

Formula

Test duration = Required sample size / (Daily eligible traffic × Traffic split)

Understanding A/B test duration

Reference material — the calculator above stays the primary tool.

What test duration tells you

Test duration converts a required sample size into a calendar timeline at your actual traffic. It answers the practical planning question a sample size alone cannot: given how many users you can send each day, how long until the test can be trusted?

It turns experimentation from open-ended into schedulable, which is what lets a team commit to a test rather than abandon it halfway.

How to read your result

Here, shorter is better for the same reliability, against typical test lengths:

Low — very long; traffic is the constraint and the test risks staleness. Average — a normal one-to-three-week window. Strong — short; the test reaches its sample quickly.

Whatever the number, run at least a full week so weekly cycles are captured.

The one-week floor

Even when the math allows a shorter run, keep any test live for at least one complete week. Conversion behaviour differs sharply between weekdays and weekends, and a test that ends mid-cycle measures a biased slice of your audience. Speed never justifies sampling only Tuesdays.

Levers that shorten a test

Three inputs move the timeline: daily eligible traffic, the split allocated to each variant, and the required sample. Raising traffic or the variant split shortens it directly; accepting a larger detectable effect shrinks the sample and shortens it further. The sample-size tool sets that requirement.

Duration and decision quality

A timeline that runs too long invites pressure to peek and stop early, which corrupts the result. If the duration is unacceptable, fix it before launch by changing traffic, split, or effect size — not by cutting the test short once it is running.