If you have ever downloaded any software from the internet you are quite likely to have seen a reference to a “CRC” – a cyclic redundancy check.

I have been reading about them as I plough on through Andrew Tanenbaum‘s (pictured) Computer Networks: so here is an explanation based closely on his text (though it’s not a copy except where indicated)…

CRCs rely on some properties of polynomial arithmetic, namely polynomials with the coefficients 0 or 1 only.

We say that a is a list of coefficients for a polynomial with terms ie for a polynomial ranging from to (which is said to have ).

So represents .

Polynomial arithmetic is a form of *modulo-2* arithmetic “according to the rules of algebraic field theory” (that’s what it says in the book!). In practise what this means is that there are no “carry-overs” and addition and subtraction are identical and the same as exclusive or (XOR).

Hence while and so on…

To use a CRC a sender and receiver must agree a **generator polynomial**: in fact these are now generally standardised and IEEE 802 (network protocols) defines:

(the last two terms are of course equivalent to ).

1. Let us call this generator polynomial and let represent the polynomial (ie the sequence of 0s and 1s) we wish to transmit (with bits) – must have more bits than .

2. Let *r* be the degree of – so we append *r* zero bits to the lower end of the frame, which now has bits and is (or ).

3. Divide by using modulo 2 division.

4. Subtract the remainder (which will always be *r* or fewer bits (as is of degree *r*) from the string (again using modulo 2 subtraction). This gives the checksummed frame to be transmitted – we shall call this .

If the frame is transmitted correctly then when it is received it should be cleanly divisible by (as taking away the remainder in step 4 should leave us with a cleanly divisible frame). The additional checksum is merely the additional bits at the end.

The power of the checksum relies on some of its mathematical properties – it will always detect single bit errors if contains more than one term – as the bad frame is now where is the error but in this case where is the “bad bit” but with a two-termed , for any .

Similarly, if there are two errors then so if does not cleanly divide where can be any value from 1 to the maximum of (ie the frame length) then any two bit error would be detected.

Similarly, if there are an odd number of errors then cannot be divisible by a that contains the terms . We can show by a *reductio ad absurdum* proof:

1. We know that an must be capable of evaluating to one: eg must evaluate to 1 if . But if is divisible by then , but if then, modulo 2, so .

To really fool the system the error pattern would have to match which is, assuming all transmission errors are equally likely (in fact they are not, but we will ignore that for now), has a probability (again, the book says) of which is pretty small for the IEEE picked . (The GNU calculator says it is zero – so it’s too small to register there!)

###### Related Articles

- Recovering Outlook PST File Corruption (brighthub.com)
- How error in binary data can be detected (wiki.answers.com)
- Fourier Series: before I begin… (cartesianproduct.wordpress.com)
- Pretty pictures of sets of roots (gasstationwithoutpumps.wordpress.com)
- Applications of Complex Number Analysis to Divisibility Problems : Two Undergrad Problems (wpgaurav.wordpress.com)
- Cyclic Subgroups (rkneufeld.wordpress.com)

Pingback: Calculating CRCs in hardware | cartesian product