This is another piece of extended “working out” based on Information and Coding Theory: essentially a note to demonstrate to myself I have followed one of the proofs in the book (and one that gave me some difficulty until the obviousness of it struck on the Tube this morning).

We know from McMillian’s inequality that a code, exists with an alphabet with average length .

To show there is an optimal code, , we need to show that there are finite number of codes such that .

If the source has words each of probability (each non-zero), then we let .

Then, for an optimal code , with words of lengths , each of must satisfy for (this is the bit I did not grasp).

Why? Well here’s a proof by contradiction…

If the above was not true then , multiplying both sides by we have .

But as ( being the probability of the longest word in ), then as .

Now there can only be a finite number of words such that so there may only be a finite number of codes where .

###### Related articles

- McMillian’s inequality (cartesianproduct.wordpress.com)
- Kraft’s inequality (cartesianproduct.wordpress.com)
- How optimal is Lempel-Ziv at reaching the Shannon limit? (cs.stackexchange.com)

Pingback: Mixing Times 2 – Metropolis Chains | Eventually Almost Everywhere

Pingback: Mixing Times 4 – Avoiding Periodicity | Eventually Almost Everywhere

Pingback: Two different ways to calculate r-ary Huffman Code | cartesian product