SOLID STATE PRESS
Coming soon to Amazon
This title is in our publishing queue.
Computer Science

Data Representation and Compression

Huffman Coding, UTF-8, and the Logic Behind JPEG and MP3 — A TLDR Primer

Your computer science class just hit the unit on data representation, and suddenly you're staring at binary, hexadecimal, Huffman trees, and perceptual coding — all at once. Or maybe you're a parent trying to help your student make sense of why a JPEG looks different from a PNG, or why an MP3 is smaller than a WAV file. This guide was written for exactly that moment.

**TLDR: Data Representation and Compression** is a focused, no-filler guide that walks you through how computers encode text, images, and sound into bits — and how compression makes those bits smaller without destroying what matters. You'll start with binary and hexadecimal (the actual foundation everything else rests on), move through ASCII and Unicode character encoding, and then into how RGB pixels and audio sampling work. From there, the guide covers lossless compression schemes like run-length encoding and Huffman coding, then explains how JPEG and MP3 use perceptual coding to discard what human eyes and ears won't miss anyway.

This book is for high school students in AP Computer Science or a digital literacy course, college freshmen in an intro CS or IT program, and anyone who wants to understand how computers encode text, images, and sound without wading through a 500-page textbook. Every section leads with the key idea, uses concrete worked numbers, and calls out the misconceptions students most often bring into an exam.

If you need to get oriented fast, this is the guide to read first.

What you'll learn
  • Convert numbers between binary, decimal, and hexadecimal and explain why computers use binary
  • Describe how characters, images, and audio are encoded as bits using ASCII, Unicode, RGB pixels, and PCM samples
  • Distinguish lossless from lossy compression and identify when each is appropriate
  • Trace through a simple Huffman or run-length encoding example and compute a compression ratio
  • Explain at a high level why JPEG, MP3, and ZIP work and what tradeoffs they make
What's inside
  1. Bits, Bytes, and Why Binary
    Introduces binary as the foundation of all digital data, with conversions between binary, decimal, and hexadecimal.
  2. Encoding Text: ASCII and Unicode
    How characters become numbers, from 7-bit ASCII to UTF-8 and the handling of emoji and non-English scripts.
  3. Encoding Images and Sound
    How pixels with RGB values represent images and how audio is sampled into PCM, including resolution and bit depth tradeoffs.
  4. Lossless Compression: Huffman and Run-Length Encoding
    Walks through two classic lossless schemes that exploit redundancy without throwing information away.
  5. Lossy Compression: JPEG, MP3, and Perceptual Coding
    Explains how JPEG and MP3 discard information humans cannot easily perceive, and why this lets files shrink dramatically.
  6. Why It Matters: Storage, Networks, and Tradeoffs
    Connects representation and compression to real-world concerns like streaming, archival, and choosing the right file format.
Published by Solid State Press
TLDR STUDY GUIDES

Data Representation and Compression

Huffman Coding, UTF-8, and the Logic Behind JPEG and MP3 — A TLDR Primer
Solid State Press

Contents

  1. Bits, Bytes, and Why Binary
  2. Encoding Text: ASCII and Unicode
  3. Encoding Images and Sound
  4. Lossless Compression: Huffman and Run-Length Encoding
  5. Lossy Compression: JPEG, MP3, and Perceptual Coding
  6. Why It Matters: Storage, Networks, and Tradeoffs
Chapter 1

Bits, Bytes, and Why Binary

Every piece of data your computer handles — a text message, a photograph, a song — is ultimately stored as a sequence of ones and zeros. Understanding why, and how those ones and zeros represent larger numbers, is the foundation for everything else in this book.

Why Binary?

A bit (short for binary digit) is the smallest unit of information a computer can store. It has exactly two possible values: 0 or 1. Computers use binary because the physical components that store and transmit data — transistors, voltage levels on a wire, pits on a disc — are most reliably built to distinguish two states, not ten. A transistor is either conducting or not. A voltage is either high or low. Building hardware that reliably tells apart ten distinct voltage levels would be far more error-prone and expensive. Two states are robust; ten are not.

A common misconception is that binary is somehow less powerful than decimal. It isn't. With enough bits, you can represent any number, any letter, any image — exactly as you can with decimal digits. Binary just uses more digits to write the same value. That tradeoff is worth it for the hardware reliability it buys.
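To make the "more digits" point concrete, here is a minimal Python sketch that writes one value in both bases and compares the digit counts. The choice of Python and of the value 214 are assumptions made for illustration. `format(value, "b")` is Python's built-in way to produce base-2 digits.

```python
# Illustrative sketch: one value, written in decimal and in binary.
value = 214  # arbitrary example value (an assumption, chosen for illustration)

decimal_text = str(value)         # base-10 digits
binary_text = format(value, "b")  # base-2 digits, without the "0b" prefix

print(decimal_text, len(decimal_text))  # 214 3
print(binary_text, len(binary_text))    # 11010110 8
```

Both strings name the same quantity. The binary form simply needs eight digits where decimal needs three.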

Place Value in Binary

You already know how place value works in decimal. The number 349 means $3 \times 100 + 4 \times 10 + 9 \times 1$, where each position is a power of ten: $10^2$, $10^1$, $10^0$.

Binary works identically, but each position is a power of two: $2^0 = 1$, $2^1 = 2$, $2^2 = 4$, $2^3 = 8$, and so on, doubling each time you move left.

So the binary number $1011$ means:

$1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 8 + 0 + 2 + 1 = 11$

Example. Convert the binary number $11010110$ to decimal.

Solution. Write out the place values for the eight bits, leftmost bit first: $2^7, 2^6, 2^5, 2^4, 2^3, 2^2, 2^1, 2^0$, which are $128, 64, 32, 16, 8, 4, 2, 1$.

Multiply each bit by its place value and add: $1{\times}128 + 1{\times}64 + 0{\times}32 + 1{\times}16 + 0{\times}8 + 1{\times}4 + 1{\times}2 + 0{\times}1 = 128 + 64 + 0 + 16 + 0 + 4 + 2 + 0 = 214$
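The place-value method can be sketched in a few lines of Python. This is an illustrative sketch, not code from the book. The function name `binary_to_decimal` is hypothetical, and the input is assumed to be a string of `0` and `1` characters.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up each bit multiplied by its power-of-two place value."""
    total = 0
    # Walk from the rightmost bit (place value 2**0) toward the leftmost.
    for position, bit in enumerate(reversed(bits)):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        total += int(bit) * 2 ** position
    return total


print(binary_to_decimal("1011"))      # 11
print(binary_to_decimal("11010110"))  # 214
```

Python's built-in `int("11010110", 2)` gives the same answer. The explicit loop is written out only to mirror the multiply-and-add steps shown above.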

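To go the other direction, from decimal to binary, repeatedly divide by 2 and record each remainder until the quotient reaches 0. The remainders, read from the last one produced back to the first (bottom to top if you list them in a column), give the binary representation.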
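As a minimal sketch of the repeated-division method, assuming a non-negative whole number as input (the function name `decimal_to_binary` is hypothetical and chosen for illustration):

```python
def decimal_to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)  # quotient carries on; remainder is one bit
        remainders.append(str(remainder))
    # The first remainder is the rightmost bit, so read the list in reverse.
    return "".join(reversed(remainders))


print(decimal_to_binary(11))   # 1011
print(decimal_to_binary(214))  # 11010110
```

For 214, the successive remainders are 0, 1, 1, 0, 1, 0, 1, 1. Reading them last to first gives $11010110$, which matches the worked example.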

About This Book

If you are staring down an AP Computer Science Principles exam, working through an intro CS course, or just trying to make sense of a confusing lecture on how computers store information, this book is for you. It also works for parents and tutors who need a fast, reliable refresher before a study session.

This is a focused study guide to data representation, covering how computers encode text, images, and sound from the ground up. You will learn binary and hexadecimal from scratch, see ASCII and Unicode explained in plain terms, and follow how lossless and lossy compression work, including how JPEG and MP3 shrink files without destroying what matters. These are the fundamentals that college freshmen encounter in nearly every intro CS course. It is a concise overview with no filler.

Read it front to back — each section builds on the last. Stop at every worked example and follow the steps yourself, then use the problem set at the end to confirm you have it.

Keep reading

You've read the first half of Chapter 1. The complete book covers 6 chapters in roughly fifteen pages — readable in one sitting.
