SOLID STATE PRESS
US list price $2.99
Mathematics

R-Squared and Adjusted R-Squared

Goodness of Fit, the Bias of Adding Variables, and What R² Doesn't Tell You — A TLDR Primer

Your stats textbook spends dozens of pages on regression before it ever explains what R² actually means — and even then the explanation is buried in notation. If you have an exam coming up, a problem set due, or you just need to understand why your professor keeps talking about "goodness of fit," this guide cuts straight to it.

**R-Squared and Adjusted R-Squared** is a focused, no-filler primer that covers exactly what the title promises. You'll learn what R² is actually measuring (proportion of variance explained, not accuracy and not causation), how to build it from the three sums of squares (SST, SSR, and SSE) with a fully worked numerical example, and why R² equals the square of Pearson's correlation coefficient in simple linear regression. Then comes the problem every student eventually hits: add any variable to a regression model and R² never goes down, and it almost always goes up, even if that variable is useless. The guide explains why this happens and how adjusted R² applies a complexity penalty to fix it, complete with the formula, a worked computation, and the edge cases (yes, adjusted R² can go negative).

The final section catalogs the most common misuses of R²: mistaking a high value for evidence of causation, assuming it means good predictions, or ignoring whether the model form is even appropriate for the data.

Written for high school and early college students in statistics, AP Statistics, introductory econometrics, or any course that touches regression. Concise and stripped to essentials, with definitions, worked examples, and misconception callouts throughout.

If R² has felt slippery, pick this up and get clear on it today.

What you'll learn
  • Define R² as the proportion of variance explained and compute it from sums of squares
  • Derive R² from SST, SSR, and SSE and connect it to the correlation coefficient in simple regression
  • Explain why R² never decreases when predictors are added, and how adjusted R² corrects for this
  • Compute adjusted R² from R², sample size, and number of predictors
  • Identify common misuses of R² (causation, model adequacy, prediction accuracy) and know when to use adjusted R² instead
What's inside
  1. What R² Actually Measures
    Introduces R² as the proportion of variance in the response variable explained by the regression model, with intuition before formulas.
  2. The Sums of Squares: SST, SSR, and SSE
    Builds R² from its components, the total, regression (explained), and error (residual) sums of squares, with a fully worked numerical example.
  3. R² and the Correlation Coefficient
    Shows that in simple linear regression R² equals the square of Pearson's r, and clarifies what changes once you move to multiple regression.
  4. Why R² Keeps Going Up: The Need for Adjusted R²
    Explains why adding any predictor, even a useless one, never decreases R², motivating a penalty for model complexity.
  5. Computing and Interpreting Adjusted R²
    Presents the adjusted R² formula, walks through computation, and shows when it rises, falls, or even goes negative.
  6. What R² Doesn't Tell You
    Catalogs the common misuses (equating high R² with causation, good prediction, or correct model form) and gives rules of thumb for using R² responsibly.
Published by Solid State Press · June 2026
TLDR STUDY GUIDES

R-Squared and Adjusted R-Squared


Contents

  1. What R² Actually Measures
  2. The Sums of Squares: SST, SSR, and SSE
  3. R² and the Correlation Coefficient
  4. Why R² Keeps Going Up: The Need for Adjusted R²
  5. Computing and Interpreting Adjusted R²
  6. What R² Doesn't Tell You
Chapter 1

What R² Actually Measures

Imagine you have a scatter plot of students' study hours versus their exam scores. You draw the best-fit line through the data. The question that R² (the coefficient of determination) answers is: how much of the up-and-down variation in exam scores does that line actually account for? R² gives you a single number between 0 and 1 that summarizes the answer.

Before any formula, here is the core idea. Exam scores vary from student to student. Some of that variation is connected to study hours: students who study more tend to score higher. The rest of the variation is noise that your line cannot explain: maybe some students are better test-takers, maybe some were sick that day. R² measures the fraction of the total variation that the regression line captures. If R² = 0.80, the line explains 80% of the variation in scores; the other 20% is left unexplained.

The Baseline You Are Beating

To understand what "explained variation" means, you need a baseline model to compare against. The simplest possible model for a response variable (the outcome you are trying to predict, also called the dependent variable) is to ignore every predictor and just guess the mean every time. If someone asks you to predict any student's score and you know nothing else, guessing the class average $\bar{y}$ is the best you can do.
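One way to make "the best you can do" precise, assuming "best" means the smallest total squared error (the criterion least-squares regression uses; the constant guess $c$ is notation introduced here for illustration): for any constant guess $c$ over $n$ observations,

$$
\sum_{i=1}^{n} (y_i - c)^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 + n(\bar{y} - c)^2 .
$$

The second term on the right is never negative and vanishes only when $c = \bar{y}$, so under this criterion no constant guess does better than the mean.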

A regression line is an improvement over that flat-line baseline. R² measures how much of an improvement. Specifically, it asks: starting from the variation around the mean, how much variation remains around the regression line? The more the line shrinks that leftover scatter, the higher R².
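The comparison can be sketched in a few lines of Python, assuming NumPy is available. The five data points are made up for illustration, and the names `sst` (scatter around the mean) and `sse` (scatter around the line) borrow the abbreviations used for the total and error sums of squares. R² is computed here as one minus the ratio of leftover scatter to baseline scatter.

```python
import numpy as np

# Hypothetical data: study hours and exam scores for five students.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = np.array([52.0, 60.0, 61.0, 70.0, 77.0])

# Least-squares line: score ~ slope * hours + intercept.
slope, intercept = np.polyfit(hours, scores, deg=1)
predicted = slope * hours + intercept

# Baseline scatter: squared distances of the scores from their mean.
sst = np.sum((scores - scores.mean()) ** 2)

# Leftover scatter: squared distances of the scores from the line.
sse = np.sum((scores - predicted) ** 2)

# Fraction of the baseline scatter that the line removes.
r2 = 1.0 - sse / sst
print(f"SST = {sst:.1f}, SSE = {sse:.1f}, R^2 = {r2:.3f}")
# prints: SST = 374.0, SSE = 14.0, R^2 = 0.963
```

With these made-up numbers, the line leaves 14 of the original 374 units of squared scatter, so it accounts for roughly 96% of the variation around the mean.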

This framing matters because it keeps R² grounded. An R² of 0 means the regression line is no better than guessing the mean — the predictors are doing nothing. An R² of 1 means the line passes through every data point exactly — perfect fit, zero residual scatter. Real-world values live between those extremes.
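Both extremes can be checked directly. The sketch below assumes NumPy; `r_squared` is a hypothetical helper written for this illustration (not a library function), and the scores are made-up values.

```python
import numpy as np

def r_squared(y, y_hat):
    """Return 1 - SSE/SST for observed values y and fitted values y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    sse = np.sum((y - y_hat) ** 2)      # scatter left around the fit
    sst = np.sum((y - y.mean()) ** 2)   # scatter around the mean
    return 1.0 - sse / sst

y = np.array([52.0, 60.0, 61.0, 70.0, 77.0])  # hypothetical exam scores

# Extreme 1: the "model" predicts the mean for every student.
flat = np.full_like(y, y.mean())
print(r_squared(y, flat))  # 0.0

# Extreme 2: the fitted values reproduce every observation exactly.
print(r_squared(y, y))     # 1.0
```

The helper divides by the scatter around the mean, so it is only meaningful when the observed values are not all identical.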

Variance and What "Explained" Means

About This Book

If you are taking AP Statistics, introductory college statistics, or any course that covers linear regression, this guide was built for you. It is also useful for high school students studying statistics, anyone preparing for an exam on linear regression, or a tutor who needs a clean refresher before a session.

This book covers regression goodness of fit from the ground up: what R² actually measures, how the sums of squares (SST, SSR, and SSE) relate to it, the connection between R² and the correlation coefficient, and the difference between R² and adjusted R², which trips up students constantly. You will also learn when R² misleads you entirely. Think of it as a tutorial on R² and adjusted R² with no filler, concise by design.

Read straight through in order, since each section builds on the last. Work through the worked examples as you go, then attempt the problem set at the end to verify you can apply what you have learned.

Keep reading

You've read the first half of Chapter 1. The complete book covers 6 chapters in roughly fifteen pages — readable in one sitting.
