A/B Testing Math
P-Values, Sample Size, and the Statistics Behind Online Experiments — A TLDR Primer
You've heard A/B testing mentioned in every data science course and product job description — but when you actually sit down with the math, things get murky fast. What does a p-value really mean? How many users do you need before the result is trustworthy? Why do smart teams still get it wrong?
**A/B Testing Math** is a concise, no-filler primer that walks you through the statistics behind online experiments from the ground up. It starts where it should: what a randomized controlled experiment actually is, and how conversion rates and lift give you something measurable. From there it builds the two-proportion z-test step by step, defines p-values and confidence intervals in plain language (and corrects the misconceptions that trip up even experienced analysts), and derives the sample size formula so you understand why your experiment needs the users it needs — not just how to plug numbers into a calculator.
The final sections cover where real A/B tests break down — peeking at results early, running too many simultaneous tests, Simpson's paradox hiding in your data — and briefly survey where the math goes next: t-tests for means, Bayesian A/B testing, and sequential testing methods designed to avoid the peeking trap.
This guide is written for high school and early-college students encountering hypothesis testing in data science or statistics coursework, as well as anyone who needs A/B testing statistics explained without wading through a door-stopper textbook. Short by design, built around worked examples, and honest about what the numbers actually tell you.
If you want to understand the math, not just memorize it — start here.
- Frame an A/B test as a two-sample hypothesis test for proportions or means
- Compute and interpret p-values, confidence intervals, and effect sizes
- Estimate the sample size needed to detect a given lift with adequate power
- Recognize and avoid common pitfalls: peeking, multiple comparisons, and Simpson's paradox
- 1. What an A/B Test Actually Is: Frames A/B testing as a randomized controlled experiment and introduces the core quantities: conversion rate, lift, control, and treatment.
- 2. The Hypothesis Test Behind the Test: Sets up the null and alternative hypotheses for comparing two proportions and walks through the two-proportion z-test step by step.
- 3. P-Values, Confidence Intervals, and What They Mean: Defines p-values and confidence intervals operationally, corrects the standard misinterpretations, and shows how to read a test result honestly.
- 4. Sample Size and Statistical Power: Derives how many users you need to detect a given lift, introducing power, minimum detectable effect, and the sample size formula.
- 5. Where A/B Tests Go Wrong: Catalogs the math pitfalls that break real experiments: peeking, multiple comparisons, novelty effects, sample ratio mismatch, and Simpson's paradox.
- 6. Beyond the Basic Test: Briefly surveys where A/B testing math goes next: tests of means with t-statistics, Bayesian A/B testing, and sequential testing.