Calculating a binary Huffman code is easy: repeatedly amalgamate the two least probable symbols in the source, and then reverse the process to read off the code.
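As a concrete illustration, here is that binary procedure as a minimal Python sketch (my own code, not from any particular textbook; the function name and the toy source are illustrative):

```python
import heapq

def binary_huffman(probs):
    """Repeatedly amalgamate the two least probable groups of symbols,
    prefixing their codewords with 0 and 1 as we go -- the 'reversal'
    step done incrementally. `probs` maps symbol -> probability."""
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)  # keeps tuple comparison away from the dicts
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        tiebreak += 1
        heapq.heappush(heap, (p0 + p1, tiebreak, merged))
    return heap[0][2]

print(binary_huffman({"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.1}))
# -> {'a': '0', 'b': '10', 'c': '111', 'd': '110'}
```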
But what of an $r$-ary Huffman code, for arbitrary $r$?
Textbooks give two "different" methods, which puzzled me at first, until a little thinking (not too much, I admit) convinced me they were the same.
In a normal Huffman 'compression' we might have a source $S$ of $q$ elements; for a code of base $r$ we would reduce $S$ to a source $S_1$ with $q - (r - 1)$ elements by merging the $r$ least probable symbols in $S$.
To do this properly we need to have exactly $r$ symbols left for the final merge. For that to work out, if we start with $q$ symbols, we need $q \equiv 1 \pmod{r - 1}$.
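To see where that congruence comes from, here is the standard counting argument written out:

```latex
% Each merger replaces r symbols by one, so the count falls by r - 1.
% After k mergers we have q - k(r - 1) symbols, and the final merger
% needs exactly r of them:
\[
  q - k(r - 1) = r
  \iff q = 1 + (k + 1)(r - 1)
  \iff q \equiv 1 \pmod{r - 1}.
\]
```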
The difference between the two methods suggested in textbooks is either positing additional symbols of probability zero, such that the padded total $q'$ satisfies $q' \equiv 1 \pmod{r - 1}$, or ensuring the first merger is of only $r'$ symbols, where $2 \le r' \le r$ and $r' \equiv q \pmod{r - 1}$.
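Either remedy is a one-line computation. A quick sketch in Python (the function names are my own):

```python
def dummies_needed(q, r):
    """Zero-probability symbols to add so the padded total q'
    satisfies q' ≡ 1 (mod r - 1)."""
    return (1 - q) % (r - 1)

def first_merge_size(q, r):
    """Size r' of the (possibly smaller) first merger, chosen so
    that 2 <= r' <= r and r' ≡ q (mod r - 1)."""
    return 2 + (q - 2) % (r - 1)

assert dummies_needed(5, 4) == 2 and first_merge_size(5, 4) == 2
```

When $q \equiv 1 \pmod{r - 1}$ already holds, the first returns zero and the second returns $r$, so both recipes collapse to the plain algorithm.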
Both are, of course, functionally equivalent. For instance, with $r = 4$ and $q = 5$, we have $5 \bmod 3 = 2 \neq 1$, so we can either begin by adding two symbols of probability zero (giving $q' = 7 \equiv 1 \pmod{3}$), or make the first reduction of size $r' = 2$, such that $r' \equiv 5 \pmod{3}$ – and we can see that, in this case, the four-way first merger simply absorbs the two zero-probability symbols along with the two least probable genuine symbols, which is the $r' = 2$ merger by another name.
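To check the claim end to end, here is a sketch of the zero-padding variant in Python (again my own code; the reduced-first-merger variant would differ only in the size of its first pass):

```python
import heapq

def rary_huffman(probs, r):
    """r-ary Huffman by the zero-padding method: add dummy groups of
    probability zero until the count is ≡ 1 (mod r - 1), then always
    merge the r least probable groups."""
    items = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    for _ in range((1 - len(items)) % (r - 1)):
        items.append((0.0, len(items), {}))  # dummy: carries no symbols
    heapq.heapify(items)
    tiebreak = len(items)
    while len(items) > 1:
        merged, total = {}, 0.0
        for digit in range(r):  # one r-way merger
            p, _, codes = heapq.heappop(items)
            total += p
            merged.update({s: str(digit) + w for s, w in codes.items()})
        tiebreak += 1
        heapq.heappush(items, (total, tiebreak, merged))
    return items[0][2]

print(rary_huffman({"a": 0.4, "b": 0.3, "c": 0.15, "d": 0.1, "e": 0.05}, r=4))
# -> something like {'c': '0', 'e': '12', 'd': '13', 'b': '2', 'a': '3'}
```

On this $q = 5$, $r = 4$ source the first merger swallows both dummies together with the two least probable genuine symbols, i.e. it is the two-way merger of the other method in disguise, and the codeword lengths (three of length 1, two of length 2) come out the same either way.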