Data Representation and Compression

Huffman Coding, UTF-8, and the Logic Behind JPEG and MP3 — A TLDR Primer

Your computer science class just hit the unit on data representation, and suddenly you're staring at binary, hexadecimal, Huffman trees, and perceptual coding — all at once. Or maybe you're a parent trying to help your student make sense of why a JPEG looks different from a PNG, or why an MP3 is smaller than a WAV file. This guide was written for exactly that moment.

**TLDR: Data Representation and Compression** is a focused, no-filler guide that walks you through how computers encode text, images, and sound into bits — and how compression makes those bits smaller without destroying what matters. You'll start with binary and hexadecimal (the actual foundation everything else rests on), move through ASCII and Unicode character encoding, and then into how RGB pixels and audio sampling work. From there, the guide covers lossless compression schemes like run-length encoding and Huffman coding, then explains how JPEG and MP3 use perceptual coding to discard what human eyes and ears won't miss anyway.

This book is for high school students in AP Computer Science or a digital literacy course, college freshmen in an intro CS or IT program, and anyone who wants to understand how computers encode text, images, and sound without wading through a 500-page textbook. Every section leads with the key idea, uses concrete worked numbers, and calls out the misconceptions students most often bring into an exam.

If you need to get oriented fast, this is the guide to read first.

What you'll learn

Convert numbers between binary, decimal, and hexadecimal and explain why computers use binary
Describe how characters, images, and audio are encoded as bits using ASCII, Unicode, RGB pixels, and PCM samples
Distinguish lossless from lossy compression and identify when each is appropriate
Trace through a simple Huffman or run-length encoding example and compute a compression ratio
Explain at a high level why JPEG, MP3, and ZIP work and what tradeoffs they make
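
For a taste of what tracing a run-length encoding example and computing a compression ratio involves, here is a minimal Python sketch. It is an illustration only, not an excerpt from the guide. The function names `rle_encode`, `rle_decode`, and `compression_ratio` are hypothetical, and the storage model (one byte per original character, two bytes per stored run) is an assumption made for this example.

```python
def rle_encode(text):
    """Run-length encode a string into a list of (character, run_length) pairs."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            # Same character as the current run: extend that run by one.
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            # Different character: start a new run of length 1.
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    """Rebuild the original string from (character, run_length) pairs."""
    return "".join(ch * count for ch, count in runs)


def compression_ratio(text, runs):
    """Original size divided by compressed size.

    Assumption for this sketch: each original character takes 1 byte and
    each run takes 2 bytes (1 for the character, 1 for the count), so runs
    longer than 255 would need a different scheme.
    """
    original_bytes = len(text)
    compressed_bytes = 2 * len(runs)
    return original_bytes / compressed_bytes


sample = "A" * 6 + "B" * 3 + "C" * 8 + "D"   # 18 characters
runs = rle_encode(sample)

print(runs)                             # [('A', 6), ('B', 3), ('C', 8), ('D', 1)]
print(rle_decode(runs) == sample)       # True: lossless, nothing was thrown away
print(compression_ratio(sample, runs))  # 2.25 (18 bytes down to 8)
```

Under the same assumption, a string with no repeated neighbors, such as "ABCD", would double in size. Run-length encoding only pays off when the data contains long runs.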

What's inside

1. Bits, Bytes, and Why Binary

Introduces binary as the foundation of all digital data, with conversions between binary, decimal, and hexadecimal.

2. Encoding Text: ASCII and Unicode

How characters become numbers, from 7-bit ASCII to UTF-8 and the handling of emoji and non-English scripts.

3. Encoding Images and Sound

How pixels with RGB values represent images and how audio is sampled into PCM, including resolution and bit depth tradeoffs.

4. Lossless Compression: Huffman and Run-Length Encoding

Walks through two classic lossless schemes that exploit redundancy without throwing information away.

5. Lossy Compression: JPEG, MP3, and Perceptual Coding

Explains how JPEG and MP3 discard information humans cannot easily perceive, and why this lets files shrink dramatically.

6. Why It Matters: Storage, Networks, and Tradeoffs

Connects representation and compression to real-world concerns like streaming, archival, and choosing the right file format.

Published by Solid State Press

Contents

Bits, Bytes, and Why Binary

Why Binary?

Place Value in Binary

About This Book