## Cyclic Redundancy Checks

If you have ever downloaded any software from the internet you are quite likely to have seen a reference to a “CRC” – a cyclic redundancy check.

I have been reading about them as I plough on through Andrew Tanenbaum‘s (pictured) Computer Networks : so here is an explanation based closely on his text (though it’s not a copy except where indicated)…

CRCs rely on some properties of polynomial arithmetic, namely polynomials with the coefficients 0 or 1 only.

We say that a $k-bit\ frame$ is a list of coefficients for a polynomial with $k$ terms ie for a polynomial ranging from $x^{k-1}$ to $x^0$  (which is said to have $degree\ k-1$).

So $11001101$ represents $x^7 + x^6 +x^3 + x^2 + x^0$.

Polynomial arithmetic is a form of modulo-2 arithmetic “according to the rules of algebraic field theory” (that’s what it says in the book!). In practise what this means is that there are no “carry-overs” and addition and subtraction are identical and the same as exclusive or (XOR).

Hence $11011 + 10111 = 01100$ while $10011 - 01011 = 11000$ and so on…

To use a CRC a sender and receiver must agree a generator polynomial: in fact these are now generally standardised and IEEE 802 (network protocols) defines: $x^{32}+x^{26}+x^{22}+x^{16}+x^{12}+x^{11}+x^{10}+x^8+x^7+x^5+x^4+x^2+x+1$

(the last two terms are of course equivalent to $x^1+x^0$).

1. Let us call this generator polynomial $G(x)$ and let $M(x)$ represent the polynomial (ie the sequence of 0s and 1s) we wish to transmit (with $m$ bits) – $M(x)$ must have more bits than $G(x)$.

2. Let r be the degree of $G(x)$ – so we append r zero bits to the lower end of the frame, which now has $m + r$ bits and is $M(x)x^r$ (or $M(x)2^r$).

3. Divide $M(x)x^r$ by $G(x)$ using modulo 2 division.

4. Subtract the remainder (which will always be r or fewer bits (as $G(x)$ is of degree r) from the $M(x)x^r$ string (again using modulo 2 subtraction). This gives the checksummed frame to be transmitted – we shall call this $T(x)$.

If the frame is transmitted correctly then when it is received it should be cleanly divisible by $G(x)$ (as taking away the remainder in step 4 should leave us with a cleanly divisible frame). The additional checksum is merely the additional $r$ bits at the end.

The power of the checksum relies on some of its mathematical properties – it will always detect single bit errors if $G(x)$ contains more than one term – as the bad frame is now $T(x) + E(x)$ where $E(x)$ is the error but in this case $E(x) = x^i$ where $i$ is the “bad bit” but with a two-termed $G(x)$, $x^i/[x^m + x^n] \neq 0$ for any $i, m, n$.

Similarly, if there are two errors then $E(x) = x^i + x^j = x^j(x^{i-j} + 1)$ so if $G(x)$ does not cleanly divide $x^k + 1$ where $k$ can be any value from 1 to the maximum of $i -j$ (ie  the frame length) then any two bit error would be detected.

Similarly, if there are an odd number of errors then $E(x)$ cannot be divisible by a $G(x)$ that contains the terms $x + 1$. We can show by a reductio ad absurdum proof:

1. We know that an $E(x)$ must be capable of evaluating to one: eg $x^i + x^j + x^k$ must evaluate to 1 if $x = 1$. But if $E(x)$ is divisible by $(x + 1)$ then $E(x) = (x + 1)F(x)$, but if $x = 1$ then, modulo 2, $x + 1 = 1 + 1 = 0$ so $E(x) = 0$.

To really fool the system the error pattern would have to match $G(x)$ which is, assuming all transmission errors are equally likely (in fact they are not, but we will ignore that for now), has a probability (again, the book says) of $\frac{1}{2}^{r -1}$  which is pretty small for the IEEE picked $G(x)$. (The GNU calculator says it is zero – so it’s too small to register there!)

1. 