Correlation vs. Causation
Confounders, Spurious Patterns, and Why Randomized Trials Win — A TLDR Primer
You've seen the headlines: "Coffee drinkers live longer" or "Ice cream sales linked to drowning deaths." But does that mean coffee causes longevity — or ice cream causes drowning? Probably not. Knowing the difference between a real cause and a coincidental pattern in data is one of the most useful skills in modern life, and most students never get a clear explanation of it.
This TLDR primer fills exactly that gap. You'll learn what correlation actually measures and how to read a scatterplot, then walk through the four main reasons a correlation can appear without the causal link it seems to suggest: confounding variables, reverse causation, selection bias, and plain old chance. From there, the guide explains how randomized controlled trials solve the causation problem and why randomization is such a powerful tool. A final section gives you a practical checklist for evaluating causal claims you encounter in textbooks, news articles, and exam questions.
Designed for high school and early college students taking statistics, AP courses, or any class that involves reading research, this guide is short by design and stripped to essentials. No filler, no detours: just the core logic you need to stop nodding along to bad reasoning and start asking the right questions.
If you can tell a confounder from a cause, you're already ahead of most adults. Grab your copy and start thinking more clearly about data.
- Define correlation precisely and interpret the correlation coefficient r
- Distinguish correlation from causation and identify common ways the two get confused
- Recognize confounding variables, reverse causation, selection bias, and chance as alternative explanations
- Explain why randomized controlled experiments can establish causation while observational studies usually cannot
- Apply Bradford Hill–style reasoning to evaluate causal claims in news, science, and everyday arguments
- 1. What Correlation Actually Means: Defines correlation, introduces the correlation coefficient r, and shows how to read scatterplots.
- 2. Why Correlation Doesn't Imply Causation: Lays out the core logical gap and walks through famous spurious correlations to build intuition.
- 3. The Four Suspects: Confounding, Reverse Causation, Selection Bias, and Chance. Names and unpacks the four main reasons a correlation can appear without a direct causal link.
- 4. How Scientists Establish Causation: Explains randomized controlled trials, control groups, and why randomization neutralizes confounders.
- 5. A Toolkit for Evaluating Causal Claims: Gives the reader practical questions and Bradford Hill–style criteria to apply to headlines and studies.