## Proof of the existence of an optimal code

This is another piece of extended “working out” based on Information and Coding Theory: essentially a note to demonstrate to myself I have followed one of the proofs in the book (and one that gave me some difficulty until the obviousness of it struck on the Tube this morning).

We know from McMillian’s inequality that a code, $C$ exists with an alphabet $T$ with average length $L(C)$.

To show there is an optimal code, $D$, we need to show that there are finite number of codes such that $L(D) \leq L(C)$.

If the source $S$ has words $s_1, s_2, ..., s_i$ each of probability $p_1, p_2, ..., p_i$ (each non-zero), then we let $p = min(p_1, p_2, ..., p_i)$.

Then, for an optimal code $D$, with words $w_1, w_2, ..., w_i$ of lengths $l_1, l_2, ..., l_i$, each $l_q$ of $D$ must satisfy $l_q \leq \frac{L(C)}{p}$ for $q = 1 ... i$ (this is the bit I did not grasp).

Why? Well here’s a proof by contradiction…

If the above was not true then $l_q > \frac{L(C)}{p}$, multiplying both sides by $p$ we have $pl_q > p \frac{L(C)}{p} = L(C)$.

ButÂ  as ($p$ being the probability of the longest word in $D$), $L(D) = p_1l_1 + p_2l_2 + ... + pl_i$ then $L(D) > L(C)$ as $pl_i > L(C)$.

Now there can only be a finite number of words $w \in T^+$ such that $\mid w \mid \leq \frac {L(C)}{p}$ so there may only be a finite number of codes where $L(D) \leq L(C)$.