SOLID STATE PRESS
Coming soon to Amazon
This title is in our publishing queue.
Mathematics

A/B Testing Math

P-Values, Sample Size, and the Statistics Behind Online Experiments — A TLDR Primer

You've heard A/B testing mentioned in every data science course and product job description — but when you actually sit down with the math, things get murky fast. What does a p-value really mean? How many users do you need before the result is trustworthy? Why do smart teams still get it wrong?

**A/B Testing Math** is a concise, no-filler primer that walks you through the statistics behind online experiments from the ground up. It starts where it should: what a randomized controlled experiment actually is, and how conversion rates and lift give you something measurable. From there it builds the two-proportion z-test step by step, defines p-values and confidence intervals in plain language (and corrects the misconceptions that trip up even experienced analysts), and derives the sample size formula so you understand why your experiment needs the users it needs — not just how to plug numbers into a calculator.

The final sections cover where real A/B tests break down — peeking at results early, running too many simultaneous tests, Simpson's paradox hiding in your data — and briefly survey where the math goes next: t-tests for means, Bayesian A/B testing, and sequential testing methods designed to avoid the peeking trap.

This guide is written for high school and early-college students encountering hypothesis testing for data science or statistics coursework, as well as anyone who needs the statistics of A/B testing explained without wading through a door-stopper textbook. Short by design, built around worked examples, and honest about what the numbers actually tell you.

If you want to understand the math, not just memorize it — start here.

What you'll learn
  • Frame an A/B test as a two-sample hypothesis test for proportions or means
  • Compute and interpret p-values, confidence intervals, and effect sizes
  • Estimate the sample size needed to detect a given lift with adequate power
  • Recognize and avoid common pitfalls: peeking, multiple comparisons, and Simpson's paradox
What's inside
  1. What an A/B Test Actually Is
    Frames A/B testing as a randomized controlled experiment and introduces the core quantities: conversion rate, lift, control, and treatment.
  2. The Hypothesis Test Behind the Test
    Sets up the null and alternative hypotheses for comparing two proportions and walks through the two-proportion z-test step by step.
  3. P-Values, Confidence Intervals, and What They Mean
    Defines p-values and confidence intervals operationally, corrects the standard misinterpretations, and shows how to read a test result honestly.
  4. Sample Size and Statistical Power
    Derives how many users you need to detect a given lift, introducing power, minimum detectable effect, and the sample size formula.
  5. Where A/B Tests Go Wrong
    Catalogs the math pitfalls that break real experiments: peeking, multiple comparisons, novelty effects, sample ratio mismatch, and Simpson's paradox.
  6. Beyond the Basic Test
    Briefly surveys where A/B testing math goes next: tests of means with t-statistics, Bayesian A/B testing, and sequential testing.
Published by Solid State Press
TLDR STUDY GUIDES


Contents

  1. What an A/B Test Actually Is
  2. The Hypothesis Test Behind the Test
  3. P-Values, Confidence Intervals, and What They Mean
  4. Sample Size and Statistical Power
  5. Where A/B Tests Go Wrong
  6. Beyond the Basic Test
Chapter 1

What an A/B Test Actually Is

Suppose a company changes the color of its "Buy Now" button from gray to green and wants to know if that change drives more purchases. The naive approach is to make the change and watch the sales numbers. The problem: sales fluctuate for dozens of reasons — day of week, seasonality, a competitor's sale — so you can never be sure whether any movement was caused by the button or by something else entirely. An A/B test solves this by running both versions at the same time, on randomly split groups of real users, so that everything except the change itself is held roughly equal.

This is the same logic as a randomized controlled experiment in science. You split your subjects into two groups: a control group, which experiences the current version (version A), and a treatment group, which experiences the new version (version B). Because the split is random, the two groups should be similar, on average, in every measurable and unmeasurable way: age, device type, time of visit, mood. Any difference in outcome you observe afterward is therefore more plausibly attributed to the change you made than to a pre-existing difference between the groups.

Randomization is the load-bearing idea here. Without it, you might accidentally send all your morning visitors (who buy more) to one variant and all your late-night visitors (who browse but rarely buy) to the other. The experiment would be broken before it started. Random assignment distributes that kind of luck evenly across both groups, which is why it is the core design requirement of any trustworthy A/B test.
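A minimal sketch of that point in Python, using made-up numbers: the visitor pool, the even morning/night mix, and the function names `assign_randomly` and `morning_share` are illustrative assumptions, not data from a real experiment. It assigns each simulated visitor to a variant by coin flip, then checks that the share of morning visitors comes out nearly the same in both groups.

```python
import random

def assign_randomly(visitors, seed=0):
    """Split visitors into control (A) and treatment (B) by independent coin flips."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    groups = {"A": [], "B": []}
    for visitor in visitors:
        groups[rng.choice("AB")].append(visitor)
    return groups

def morning_share(group):
    """Fraction of a group's visitors who arrived in the morning."""
    return sum(1 for v in group if v == "morning") / len(group)

# Hypothetical traffic: 5,000 morning visitors and 5,000 late-night visitors.
visitors = ["morning"] * 5000 + ["night"] * 5000
groups = assign_randomly(visitors)

for name in ("A", "B"):
    print(name, len(groups[name]), round(morning_share(groups[name]), 3))
```

Both groups should show a morning share close to 0.5. Neither variant ends up with a disproportionate number of high-buying morning visitors, even though the code never looks at arrival time when assigning.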

The quantity you measure: conversion rate

Most A/B tests define success through a metric — a specific, numeric thing you agree to measure before the experiment runs. The most common metric in web experiments is the conversion rate: the fraction of users who complete a target action (a purchase, a sign-up, a click).

$\text{conversion rate} = \frac{\text{number of conversions}}{\text{number of users}}$
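As a worked illustration, the sketch below computes the conversion rate for each variant and the relative lift of B over A. The counts are invented for the example. The lift definition used here, $(p_B - p_A)/p_A$, is the common convention and an assumption on our part, since this excerpt does not spell one out.

```python
def conversion_rate(conversions, users):
    """Fraction of users who completed the target action."""
    if users <= 0:
        raise ValueError("users must be positive")
    return conversions / users

def relative_lift(rate_control, rate_treatment):
    """Relative change of the treatment rate versus the control rate (assumed definition)."""
    return (rate_treatment - rate_control) / rate_control

# Hypothetical counts, chosen only for illustration.
rate_a = conversion_rate(conversions=200, users=5000)  # control, version A
rate_b = conversion_rate(conversions=230, users=5000)  # treatment, version B

print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  lift: {relative_lift(rate_a, rate_b):.1%}")
```

With these counts the rates are 4.0% and 4.6%, a relative lift of 15%. Whether a gap of that size could be due to chance alone is a separate question, one for a hypothesis test.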

About This Book

If you're taking a statistics or data science course that covers hypothesis testing, you're preparing for an exam that includes the two-proportion z-test, or you've heard A/B testing come up in a class and want a clear explanation from the ground up, this book was written for you. It works equally well for a high school student tackling online experiments in a statistics unit or a college freshman who needs a focused review before a midterm.

This guide covers the mathematics that makes A/B testing work: hypothesis testing, p-value and confidence interval mechanics, sample size calculation, statistical power, and minimum detectable effect. Think of it as a study guide to p-values and confidence intervals that doubles as a guide to statistical power and minimum detectable effect. Short by design, no filler.

Read straight through once to build the framework, then work every example as you encounter it. Finish with the problem set at the end to confirm you can apply the ideas cold.

Keep reading

You've read the first half of Chapter 1. The complete book covers 6 chapters in roughly fifteen pages — readable in one sitting.
