## A blast from the past: Huffman coding

Britain used to lead the world in the deployment of a cutting edge technology: the fax machine.

Back when I graduated from university in the last 1980s fax machines were the technology every business had to have and Britain had more of them per head of population than anywhere else.

Today I work in an office where we have decided having one is no longer necessary.

Why all this? Well, one thing I remember from the era of the fax machine is the frequent reference to “Huffman coding“. Indeed, back in the days when free software graphics libraries were in short supply, I investigated whether I could convert faxes into over graphics formats and kept coming across the term.

Now, thanks to Information and Coding Theory , I can fully appreciate the beauty of David A. Huffman‘s coding scheme.

Here’s my attempt to explain it…

We have a symbol sequence $S = s_1, s_2, ... ,s_{n - 1}, s_n$, these have a probability of occurring of $p_1, p_2, ... , p_n$, where $\sum\limits_{i = 1}^n p_i = 1$. We rearrange the symbols so that $p_1 > p_2 > ... p_n$

Now take the tow least probable symbols (or if there are more than two, any two of the least probable symbols) and connote them with one of the symbols picked to produce a new symbol sequence $S^\prime$ with $(n - 1)$ members e.g. : $S^\prime = s_1, s_2, ... s_{n - 1}$ with probabilities $p_1, p_2, ... (p_{n - 1} + p_n)$.

This process can then be repeated until we have simply one symbol with probability 1.

To encode this we can now ‘go backwards’ to produce an optimal code.

We start by assigning the empty symbol $\varepsilon$, then passing back up the tree, expanding the combined symbols and adding a ‘1’ to the ‘left’ hand symbol and a ‘0’ to the right (or vice versa): $\varepsilon$ 1  = 1, $\varepsilon$ 0 = 0.

Here’s a worked example: we have the (binary) symbols $S$: 0, 1, 10, 11, which have the probabilities of 0.5, 0.25, 0.125 and 0.125.

So $S^\prime$ = 0, 1, 10 with probabilities 0.5, 0.25, 0.25, $S^{\prime\prime}$ = 0, 1 with probabilities 0.5, 0.5 and $S^{\prime\prime\prime}$ = 0 with probability 1.

Now, to encode: $C^{\prime\prime\prime}: 0 = \varepsilon$ $C^{\prime\prime}$, 0 = 1, 1 = 0 $C^\prime$ 0 = 1, 1 = 01, 10 = 00 $C$ 0 = 1, 1 = 01, 10 = 001, 11 = 000

This code has an average length of $\frac{1}{2} + \frac{2}{4} + \frac{3}{8} + \frac{3}{8} = \frac{7}{4}$

(In fact pure Huffman coding is not generally used – as other forms e.g. adaptive Huffman, offer better performance, though the underlying principles are the same.)

1. 