Kraft’s inequality states that: There is an instantaneous -ary code with word lengths if and only if :
But what of codes that are merely uniquely decodable, and not necessarily instantaneous?
There number is given to us by McMillian’s inequality and the surprising thing is that, despite the looser bound this is:
There is a uniquely decodable -ary code with word lengths if and only if :
And here’s a proof (again drawn from Information and Coding Theory):
We have a code with word lengths and we define:
Additionally we say that and
Let us now consider or where .
This is and is the sum of a series of products of the form:
(I found this a bit difficult to follow at first, but imagine we had , then we would could expand this as:
And we can also write:
Now since for all .
Now, here’s the fancy bit (at least for me!)…
This means we can rewrite as
The upper and lower bounds for follow from the inequalities above (we could sum over a wider range but then the coefficient would be zero beyond these bounds. And what is this coefficient?
is the number of ways of writing any given (or, alternatively, the number of different code word sequences whose total lengths sum to ).
Let’s take a simple example again – we have a binary code with code words (plainly not instantaneous as is a prefix of ).
But what if we look at ?
Then we have:
Writing this out in the form above we then have – and you can see the coefficients:
The coefficient can never be greater than (as it can never be greater than the number of possible ways to arrange the code), so .
We also know that is a sum of terms, so:
As increases, if then the left hand side increases exponentially while the right (where and are independent of ) increases only linearly, hence .
I am sure this will be one of the less read postings on the blog (congratulations if you got this far), but if it helps just one person get a better undestanding of McMillan’s inequality then it has been worth it!
- Kraft’s inequality (cartesianproduct.wordpress.com)
But now, thanks to another book – Networks on Chips – here’s a slightly more fundamental examination.
CRCs are a form of Cyclic codes – codes where a rotation – eg., make the 0 bit the highest order bit, the previously highest order bit the second highest order bit and so on – produces a new codeword.
As before we can think of code words as polynomials of the power of 2 with coefficients of 0 or 1, eg.,
(And of course any natural number can be represented in this fashion.)
Then a circular shift is similar to multiplying by 2 – though with the proviso that
We can construct a codeword in the form where is the message, is the (monic) generator and a remainder (of the same or lesser order as ).
But if we stipulate that is of the smallest order of a codeword then that means must be zero.
So, what does a generator matrix look like?
If we have a monic generator polynomial of degree and a code word length of :
- Deriving an Incremental Form of the Polynomial Regression Equations (erikerlandson.github.com)
- A smattering of MyForth: CRC16 checksums (toddbot.blogspot.com)
- Selecting a Checksum algorithm (fastcompression.blogspot.com)
- Parity checks as matrix multiplication (cartesianproduct.wordpress.com)