CROIndex

Calculator · 017

A/B Test Significance Calculator

Measure the statistical confidence behind an A/B result — and decide whether the difference is real enough to ship or still too noisy.

[Interactive calculator: four count inputs; output: statistical confidence, with a Current · Benchmark · Optimized scenario lens]

Formula

Two-proportion z-test on A vs B conversion rates

Understanding statistical significance

Reference material — the calculator above stays the primary tool.

What significance measures

Statistical significance measures how unlikely the observed difference between two variants would be if there were no real effect and only random variation were at work. The less plausible chance is as an explanation, the higher the confidence. It answers the only question that should gate a rollout: is this difference trustworthy enough to act on?

This calculator runs a two-proportion z-test — the standard method for comparing two conversion rates — and reports the resulting confidence directly.
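As a rough illustration of the test described above, here is a minimal Python sketch. It assumes the pooled-variance form of the two-proportion z-test and assumes that "confidence" means one minus the two-tailed p-value; the calculator's exact internals are not documented here, and the function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ab_confidence(visitors_a, conversions_a, visitors_b, conversions_b):
    """Two-tailed confidence (1 - p-value) from a pooled two-proportion z-test.

    Assumption: pooled standard error, confidence reported as 1 - p.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Pooled conversion rate under the hypothesis that A and B do not differ.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        return 0.0  # no variation at all, so no evidence of a difference

    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return 1 - p_value


# Example: 10.0% vs 13.0% conversion on 1,000 visitors per variant.
confidence = ab_confidence(1000, 100, 1000, 130)
print(f"{confidence:.1%}")
```

With these illustrative inputs the confidence lands between 96% and 97%, which clears the 95% bar.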

How to read your result

Higher confidence is better. Read it against the conventional 95% decision bar:

Low: under 90%. The difference is well within what chance could produce.
Average: 90% up to, but not including, 95%. Suggestive but not yet decision-grade.
Strong: 95% or above. Reliable enough to ship.
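The bands above amount to a simple threshold check. A minimal sketch, with a hypothetical function name and the input assumed to be a percentage between 0 and 100:

```python
def confidence_band(confidence_pct):
    """Map a confidence percentage to the Low / Average / Strong bands."""
    if confidence_pct < 90:
        return "Low"
    if confidence_pct < 95:
        return "Average"
    return "Strong"  # 95% or above


for value in (72.5, 93.0, 95.0):
    print(value, confidence_band(value))
```

Note that exactly 95% counts as Strong and exactly 90% counts as Average, matching the "95% or above" wording.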

Significance is not lift

The two answer different questions. A variant can post a large observed lift yet sit below 95% confidence because the sample is thin, or post a modest lift that is highly significant on a large sample. Significance tells you whether to believe the result; the conversion lift tool tells you how big it is. Ship on both.
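To make the contrast concrete, the sketch below compares two made-up scenarios: a large lift on a thin sample and a modest lift on a large one. The numbers are invented for illustration, the helper names are hypothetical, and confidence is assumed to be one minus the two-tailed p-value of a pooled two-proportion z-test.

```python
from math import sqrt
from statistics import NormalDist


def confidence(n_a, x_a, n_b, x_b):
    """1 - two-tailed p-value from a pooled two-proportion z-test (assumed form)."""
    pooled = (x_a + x_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (x_b / n_b - x_a / n_a) / std_err
    return 1 - 2 * (1 - NormalDist().cdf(abs(z)))


def relative_lift(n_a, x_a, n_b, x_b):
    """Relative change in conversion rate of B over A."""
    rate_a = x_a / n_a
    return (x_b / n_b - rate_a) / rate_a


# (visitors A, conversions A, visitors B, conversions B), invented data
thin_sample = (200, 10, 200, 15)                # 5.0% vs 7.5%
large_sample = (100_000, 5_000, 100_000, 5_300)  # 5.0% vs 5.3%

for name, data in (("thin", thin_sample), ("large", large_sample)):
    print(name, f"lift={relative_lift(*data):.0%}", f"confidence={confidence(*data):.1%}")
```

The thin sample shows a 50% lift but stays far below 95% confidence, while the large sample shows only a 6% lift yet clears 95% comfortably.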

Why not stop the moment it crosses 95%

Peeking and stopping at the first significant moment inflates false positives, because confidence wobbles as data arrives. Decide the sample size in advance, run to it, then read significance once — the sample-size and duration tools set that plan so this verdict is honest.
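The effect of peeking can be checked with a small simulation. The sketch below runs A/A tests, where both variants share the same true conversion rate, so every "significant" result is a false positive. It compares stopping at the first significant peek with reading once at the planned sample size. All parameters (run count, sample size, peek interval, base rate, seed) are illustrative assumptions, and the names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def is_significant(n_a, x_a, n_b, x_b, alpha=0.05):
    """Two-tailed pooled two-proportion z-test at the given alpha."""
    pooled = (x_a + x_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return False
    z = (x_b / n_b - x_a / n_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z))) < alpha


def false_positive_rates(runs=500, visitors_per_arm=1000, peek_every=100,
                         rate=0.10, seed=1):
    """Return (rate when stopping at first significant peek, rate when reading once)."""
    rng = random.Random(seed)
    peeking_hits = 0
    single_read_hits = 0
    for _ in range(runs):
        x_a = x_b = 0
        flagged_at_a_peek = False
        for n in range(1, visitors_per_arm + 1):
            # Both arms convert at the same true rate: there is no real effect.
            x_a += rng.random() < rate
            x_b += rng.random() < rate
            if n % peek_every == 0 and is_significant(n, x_a, n, x_b):
                flagged_at_a_peek = True
        final = is_significant(visitors_per_arm, x_a, visitors_per_arm, x_b)
        peeking_hits += flagged_at_a_peek or final
        single_read_hits += final
    return peeking_hits / runs, single_read_hits / runs


peeking_rate, single_read_rate = false_positive_rates()
print(f"peeking: {peeking_rate:.1%}, single read: {single_read_rate:.1%}")
```

Because the final read is also one of the peeks, the peeking rate can never be lower than the single-read rate; in practice it comes out noticeably higher, which is the inflation described above.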

One-tailed vs two-tailed

This calculator uses a two-tailed test, which asks whether the variants differ in either direction — the conservative, default choice for most experiments. A one-tailed test assumes you only care about improvement and reaches significance sooner, at the cost of missing a variant that quietly performs worse.
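The difference between the two tests shows up in how the p-value is read off the same z-score. A minimal sketch, assuming a pooled two-proportion z-test and a one-tailed hypothesis of "B beats A"; the function name and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def tailed_confidences(n_a, x_a, n_b, x_b):
    """Return (one_tailed, two_tailed) confidence as 1 - p-value.

    The one-tailed test here only looks for B > A (an assumption).
    """
    pooled = (x_a + x_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (x_b / n_b - x_a / n_a) / std_err
    cdf = NormalDist().cdf
    one_tailed = cdf(z)                      # 1 - P(Z > z)
    two_tailed = 1 - 2 * (1 - cdf(abs(z)))   # 1 - P(|Z| > |z|)
    return one_tailed, two_tailed


# B ahead: 10.0% vs 12.5% on 1,000 visitors each (invented data).
better_one, better_two = tailed_confidences(1000, 100, 1000, 125)
# Same gap, but B behind.
worse_one, worse_two = tailed_confidences(1000, 125, 1000, 100)

print(f"B ahead:  one-tailed {better_one:.1%}, two-tailed {better_two:.1%}")
print(f"B behind: one-tailed {worse_one:.1%}, two-tailed {worse_two:.1%}")
```

With B ahead, the one-tailed reading crosses 95% while the two-tailed reading does not. With B behind by the same margin, the one-tailed reading collapses toward zero and gives no warning, while the two-tailed reading is unchanged.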