Calculating a binary Huffman code is easy: amalgamate the symbols in the source, two-by-two, and then reverse it to get the code.
But what of an r-ary Huffman code, of arbitary .
Text books give two “different” methods which puzzled me at first until (not too much, I admit) thinking convinced me they were the same.
In a normal Huffman ‘compression’ we might have a source of elements, for a code of base we would reduce to with elements by merging symbols in .
To do this properly we need to have symbols left for the final merge. To do this then, if we start with symbols, then .
The difference between the two methods suggested in text books is either positing additional symbols of probability zero such that or ensuring the first merger is of only symbols where .
Both, are of course, functionally similar. For instance, with and , then , so we can begin by adding two symbols of probability zero or the first reduction should be of size such that – and we can see that, in this case, .