Deep Dives·November 26, 2025·10 min read

How the Deflate Algorithm Works Inside PNG Files

DEFLATE is the compression engine inside every PNG. This technical deep-dive explains LZ77 back-references, Huffman coding, and how they combine to shrink your images.

DEFLATE is everywhere

DEFLATE is one of the most widely deployed compression algorithms in existence. It is the compression layer inside PNG images, gzip-compressed HTTP responses, zip archives, PDF streams, and countless other formats. Understanding how it works demystifies not just image compression but a large slice of how data is compressed across the internet.

DEFLATE is defined in RFC 1951. It combines two algorithms in series: LZ77 (a dictionary-based compressor) followed by Huffman coding (a statistical compressor). Each algorithm removes a different type of redundancy from the data.

Stage 1: LZ77

LZ77, invented by Abraham Lempel and Jacob Ziv in 1977, exploits repetition in the data. The core insight is simple: if a sequence of bytes appeared recently in the data stream, it can be replaced with a back-reference — a (distance, length) pair that says "go back distance bytes in the output and copy length bytes from there".

LZ77 maintains a sliding window of recently-seen data (in DEFLATE, the window is 32,768 bytes — 32 KB). The encoder scans forward from the current position and searches the window for the longest match. If it finds a match of length ≥ 3, it emits a back-reference. Otherwise, it emits the literal byte value.

For PNG image data, this is powerful because of how prediction filtering works upstream. After a PNG filter pass, pixel data is transformed into small difference values. Flat regions of the image produce long runs of zero differences. Repeating patterns produce exact byte-for-byte repetitions. LZ77 catches both of these with back-references.

Example: consider a simple gradient where every row is identical (as would be the case after the "Up" filter is applied to a flat horizontal band). The first row is stored as literals. Every subsequent identical row is replaced by a back-reference to the first row — just a few bytes instead of potentially hundreds.

Stage 2: Huffman coding

After LZ77, the data stream consists of a mix of literal byte values (0–255) and back-references (distance/length pairs). These symbols do not all occur with equal frequency — in typical PNG data, certain literal values (especially zero, the result of filtered flat regions) occur very frequently, while others (high-value differences) are rare.

Huffman coding, invented by David Huffman in 1952, assigns variable-length binary codes to symbols based on their frequency. The most common symbols get the shortest codes; rare symbols get longer codes. The assignment is done by building a binary tree from the frequency table — the "Huffman tree".

In the optimal assignment, a symbol that occurs half the time might get a 1-bit code. A symbol that occurs once in a million tokens might get a 20-bit code. On average, this is far more efficient than the fixed 8 bits per symbol used in the raw byte stream.

DEFLATE uses two Huffman trees per compressed block: one for literal/length symbols (0–285, where 0–255 are literals and 257–285 encode match lengths) and one for distance symbols (0–29, encoding back-reference distances using a distance code plus extra bits).

Fixed vs dynamic Huffman codes

DEFLATE supports three block types:

Type 0 (stored): No compression — raw bytes. Used when the data is incompressible.
Type 1 (fixed Huffman): Uses a predefined, standard Huffman table. The table does not need to be transmitted because the decoder already knows it. Efficient for short data where transmitting a custom table would cost more than it saves.
Type 2 (dynamic Huffman): The encoder transmits a custom Huffman table at the start of the block, built from the actual frequency distribution of the data. More overhead to transmit, but produces better compression for longer blocks.

Most PNG compressors use dynamic Huffman codes for the image data because the blocks are large enough for the custom table to pay for itself. The "optimal" DEFLATE encoders (like Zopfli) spend significant computation finding the best block boundaries and the most efficient Huffman trees for each block.

The DEFLATE compression level trade-off

The "compression level" parameter (1–9 in zlib) controls how hard the encoder searches for back-references in LZ77. At level 1, a fast greedy algorithm finds the first match of sufficient length and moves on. At level 9, an exhaustive search considers all possible matches in the window and chooses the optimal one. The optimal choice often involves counter-intuitive decisions: sometimes using a shorter match now allows a longer match later, producing a smaller output overall.

Zopfli takes this further by using a dynamic programming approach that models the entire stream to find the globally optimal encoding. It is many times slower than level 9 zlib but typically achieves 3–8% better compression. Since PNG decompressors do not care which encoder was used — only that the output is valid DEFLATE — Zopfli-compressed PNGs are perfectly compatible with all PNG viewers.

Why PNG filter selection affects compression

The DEFLATE algorithm is the same regardless of which PNG filter type was applied upstream, but filter selection profoundly affects how well DEFLATE compresses the data. DEFLATE is good at compressing data with lots of repetition and near-zero values. A good filter produces exactly this: values close to zero (small differences from the prediction) with many exact repetitions (uniform regions that the LZ77 dictionary can reference).

A poor filter choice — for example, applying "None" to an image with large pixel value gradients — gives DEFLATE a stream of large, varied values with little repetition. LZ77 finds few matches; Huffman coding cannot produce short codes because no values dominate the frequency distribution. The result is a much larger file.

This is why advanced PNG optimisers test multiple filter strategies. The filter and compressor interact, and the optimal filter choice depends on the specific image content and how well each filter's output is exploited by the downstream compressor.

Summary

Every PNG you view has been decompressed using this pipeline in reverse: DEFLATE decodes Huffman codes back to LZ77 tokens, expands back-references to reconstruct the filtered data, then the inverse filter recovers the original pixel differences, which are added to the row predictions to recover the final pixel values. It happens in milliseconds in your browser's image decoder, transparently, billions of times a day.

To see lossless DEFLATE optimisation in action on your own files — re-encoding with optimal filters and maximum compression — try compressanimage.com.