A/B Test Duration Calculator

Calculate the required sample size and test duration before making a decision.

Inputs

- Baseline conversion rate: 3.0%
- Minimum detectable effect: 10% (the smallest improvement to detect; 10% means 3.0% → 3.30%)
- Traffic allocation: 100% of traffic goes to the test

Results

👀 Per variant: 53,224
👥 Total sample: 106,448
📅 Days needed: 22
🏁 End date: Apr 20, 2026

Test Timeline

Start: Mar 29, 2026
25%: Apr 4, 2026
50%: Apr 9, 2026
75%: Apr 15, 2026
Complete: Apr 20, 2026

Sensitivity Analysis

| MDE | Sample / Variant | Total Sample | Days |
|-----|------------------|--------------|------|
| 5%  | 207,997 | 415,994 | 84 |
| 10% | 53,224  | 106,448 | 22 |
| 15% | 24,198  | 48,396  | 10 |
| 20% | 13,915  | 27,830  | 6  |
| 25% | 9,100   | 18,200  | 4  |
| 30% | 6,454   | 12,908  | 3  |
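The Days column follows directly from the Total Sample column once you fix a daily traffic figure. A minimal sketch, assuming roughly 5,000 visitors per day (a figure inferred from the 22-day example, not stated on this page):

```python
from math import ceil

# Total-sample column from the table above, keyed by MDE
totals = {0.05: 415_994, 0.10: 106_448, 0.15: 48_396,
          0.20: 27_830, 0.25: 18_200, 0.30: 12_908}
DAILY_TRAFFIC = 5_000  # assumed daily visitors; inferred, not stated above

for mde, total in totals.items():
    # Round up: you can't run a fraction of a day
    print(f"MDE {mde:.0%}: {ceil(total / DAILY_TRAFFIC)} days")
```

At 5,000 visitors/day this reproduces every Days value in the table.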

⚠️ Should I Stop Early?

Stopping an A/B test before reaching statistical significance leads to false positives. If you stop at 50% of the required sample, there's a 30–40% chance your “winner” is actually a fluke. Always run the test for the full calculated duration of 22 days before making decisions.
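The inflation from peeking can be seen in a quick Monte Carlo sketch. It models the running z-statistic as a Brownian motion checked at ten evenly spaced looks, a standard approximation for sequential tests; the exact rate depends on how often you peek, so treat the output as illustrative:

```python
import math
import random

random.seed(42)
N_SIMS, PEEKS, Z_CRIT = 20_000, 10, 1.96  # 1.96 ~ two-sided 95% threshold

def peeking_false_positive_rate():
    """Under the null (no real difference), peek at PEEKS evenly spaced
    checkpoints and declare a winner at the first |z| > Z_CRIT.
    The running z-statistic is modeled as W(t) / sqrt(t) for Brownian W."""
    hits = 0
    for _ in range(N_SIMS):
        w = 0.0
        for k in range(1, PEEKS + 1):
            w += random.gauss(0.0, math.sqrt(1 / PEEKS))  # BM increment
            if abs(w / math.sqrt(k / PEEKS)) > Z_CRIT:
                hits += 1
                break
    return hits / N_SIMS

fpr = peeking_false_positive_rate()
print(f"False positive rate with {PEEKS} peeks: {fpr:.1%}")
```

With ten peeks the simulated false positive rate lands far above the nominal 5%, which is exactly the trap the warning above describes.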

How It Works

This calculator uses the two-proportion z-test formula to determine the minimum sample size needed per variant:

n = (Zα/2 + Zβ)² × (p1(1-p1) + p2(1-p2)) / (p2 - p1)²

Where p1 is your baseline conversion rate, p2 is the expected rate after improvement (p1 × (1 + MDE)), Zα/2 is the z-score for your significance level, and Zβ is the z-score for your chosen power.
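The formula translates directly into code; Python's `statistics.NormalDist` supplies the z-scores. This sketch reproduces the example above to within a few dozen visitors (the small gap comes from z-score rounding conventions):

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(p1, mde, alpha=0.05, power=0.80):
    """Minimum visitors per variant for a two-proportion z-test."""
    p2 = p1 * (1 + mde)                            # expected improved rate
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 at 95% significance
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 at 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

n = sample_size_per_variant(0.03, 0.10)
print(f"{n:,} per variant")  # within a few dozen of the 53,224 above
```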

Last updated: March 2026

What Is the A/B Test Duration Calculator?

Find out exactly how long to run your A/B test before making a decision. Enter your conversion rate, traffic, and the minimum improvement you want to detect — get the required sample size and test duration with a visual timeline. The calculator uses the two-proportion z-test formula, the same statistical method used by professional experimentation platforms.

One of the most common mistakes in A/B testing is stopping the test too early. When you end a test before reaching statistical significance, you dramatically increase false positive rates. This calculator tells you exactly how many visitors you need and how many days it will take, so you can plan your testing roadmap with confidence.

How to Use This Calculator

1. Enter your current conversion rate and daily traffic. These are the baseline numbers from your analytics.

2. Set the minimum detectable effect — the smallest improvement worth detecting. A 10% MDE on a 3% conversion rate means you want to detect a change from 3% to 3.3%.

3. Choose your statistical significance and power. 95% significance and 80% power are industry standards. Higher values require larger samples.

4. Review the sensitivity table to see how different MDEs affect your test duration, and use the visual timeline to plan your testing schedule.
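The steps above reduce to a few lines of arithmetic. A sketch using the example figures, where the 5,000 visitors/day and the Mar 29 start date are assumptions consistent with the 22-day timeline shown on this page:

```python
from datetime import date, timedelta
from math import ceil

total_sample = 106_448   # both variants, from the Results above
daily_traffic = 5_000    # assumed; consistent with the 22-day result
allocation = 1.0         # 100% of traffic goes to the test

days = ceil(total_sample / (daily_traffic * allocation))
end = date(2026, 3, 29) + timedelta(days=days)  # assumed start date
print(days, end)  # 22 2026-04-20
```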

Understanding Statistical Significance and Power

Statistical significance (confidence level) is the probability that your result isn't due to random chance. At 95% significance, there's only a 5% chance of a false positive — declaring a winner when there's actually no real difference.

Statistical power is the probability of detecting a real difference when one exists. At 80% power, you have a 20% chance of missing a true effect (false negative). Higher power means you're less likely to miss real improvements, but you'll need a larger sample.
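You can check that a planned sample actually delivers its promised power by inverting the relationship. A sketch using this page's example (3.0% → 3.3%, 53,224 visitors per variant):

```python
from math import sqrt
from statistics import NormalDist

def achieved_power(p1, p2, n, alpha=0.05):
    """Chance of detecting a true shift from p1 to p2 with n per variant."""
    nd = NormalDist()
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)  # std. error of the difference
    return nd.cdf(abs(p2 - p1) / se - nd.inv_cdf(1 - alpha / 2))

print(f"{achieved_power(0.03, 0.033, 53_224):.1%}")  # ~80%, as designed
```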

Frequently Asked Questions

What's a good minimum detectable effect (MDE)?

For most tests, 10-20% relative change is practical. Smaller MDEs require much larger samples. If you have low traffic, aim for 15-20% MDE.

Why shouldn't I stop the test early if one variant is winning?

Early results are unreliable. Statistical significance requires a minimum sample size. Stopping early dramatically increases false positive rates — you might declare a "winner" that's actually just random noise.

What does "statistical power" mean?

Power is the probability of detecting a real difference when one exists. 80% power means there's an 80% chance you'll detect a true improvement. Higher power requires larger samples.

How does traffic allocation affect test duration?

If you allocate only 50% of traffic to the test, it takes twice as long to reach the required sample size. Use 100% allocation when possible to get results faster.
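Since days are rounded up to whole numbers, halving the allocation roughly doubles the duration. A sketch with the example numbers (the 5,000 daily visitors is an assumption, not stated on this page):

```python
from math import ceil

total_sample = 106_448  # from the example above
daily_traffic = 5_000   # assumed daily site traffic

for allocation in (1.0, 0.5, 0.25):
    days = ceil(total_sample / (daily_traffic * allocation))
    print(f"{allocation:.0%} allocation: {days} days")
```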

Should I use 90%, 95%, or 99% significance?

95% is the industry standard for most A/B tests. Use 90% for exploratory tests where false positives are less costly. Use 99% for critical changes like pricing or checkout flow modifications.
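The cost of stricter significance shows up directly in sample size. A sketch for this page's example inputs (3.0% baseline, 10% MDE, 80% power); figures will differ slightly from the sensitivity table because of z-score rounding:

```python
from math import ceil
from statistics import NormalDist

def n_per_variant(p1, mde, alpha, power=0.80):
    """Per-variant sample size at a given significance level."""
    p2 = p1 * (1 + mde)
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2)

for conf in (0.90, 0.95, 0.99):
    print(f"{conf:.0%} significance: {n_per_variant(0.03, 0.10, 1 - conf):,} per variant")
```

Moving from 95% to 99% significance costs roughly half again as many visitors per variant.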
