A further thought on MD5

The main use of MD5 – at least if my computer is any guide – is to check that a file you have downloaded from the internet or elsewhere is what it says it is.

In fact, in this general use MD5 is not being used to encrypt anything – instead it produces a “message digest” – a 128-bit number computed from the supplied file by a hash function. The problem with collisions in this case is that two different files could give the same hashed value (i.e. the same MD5 digest) and you could be left thinking you had the genuine file when you did not.
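As a rough illustration of that download check, here is a minimal Python sketch using the standard library's hashlib module. The function names `md5_digest` and `matches_published` are made up for this example, and it assumes the site publishes the digest as a hexadecimal string alongside the file.

```python
import hashlib


def md5_digest(path, chunk_size=65536):
    """Return the MD5 digest of the file at `path` as 32 hex characters."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so a large download need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex):
    """Compare the file's digest with the one the site published."""
    return md5_digest(path) == published_hex.strip().lower()
```

The digest is always 32 hex characters (128 bits) however large the file is, which is why it can only ever confirm a file and never reproduce it.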

But that 128-bit hashed value plainly is not going to give you back the file – unlike in CSI: Miami and everywhere else you see a “let’s enhance that” computer graphics gimmick, in the real world you cannot get more information out than you put in: so a 128-bit number will not magically transform into a 5 MB file even if you can reverse the hashing.

But that was not the issue with the Sun – they appeared to be using MD5 to hash a short password and in that case, at least in theory, being able to crack MD5 could give the original information back.
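To make the short-password case concrete, here is a hedged sketch of a dictionary attack against an unsalted MD5 hash. Nothing here is known about how the Sun actually stored its passwords: the word list, the "stolen" hash and the function names are all invented for illustration. Note that the code does not reverse MD5 at all; it simply hashes guesses until one matches, which works against any fast, unsalted hash when the password is short or common.

```python
import hashlib


def md5_hex(text):
    """MD5 digest of a string, as hex (assumes UTF-8 and no salt)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def dictionary_attack(target_hex, candidates):
    """Return the first candidate whose MD5 matches target_hex, else None."""
    for word in candidates:
        if md5_hex(word) == target_hex:
            return word
    return None


# Hypothetical data: a tiny word list and a hash taken from an exposed file.
wordlist = ["letmein", "password", "sun123"]
stolen_hash = md5_hex("password")
print(dictionary_attack(stolen_hash, wordlist))  # prints "password"
```

A 5 MB file cannot be recovered this way because there are far too many possible files to guess, but the set of plausible passwords is small enough to try one by one.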


So, is the MD5 weakness a real world problem or not?

My last posting – made in a hurry while I was waiting for a large SCP transfer to complete – has generated more traffic than anything else in the last month: possibly because it was mildly topical and largely because it was retweeted by John Rentoul, one of the UK’s leading political commentators and all-round good egg.

Maybe I was being a bit naive with it. I took what the New Scientist said the US Department of Homeland Security had said about the MD5 hashing algorithm (in short, that it is completely broken and should not be used), put it together with LulzSec’s claim to have cracked the Sun’s MD5-based password system, and drew what I thought was the obvious conclusion: that an MD5 crack was in some way related to LulzSec’s attack on the Sun’s website last Monday night.

But at least one person who ought to know more about this than me – forensic investigator Jonathan Krause – has taken issue with it and indeed with the whole idea that MD5 is a major security risk:




I have to admit I find this all a bit puzzling, as the web is full of stories like “brute force algorithm can crack 1.5 million MD5 hashes per second” and so on, and there are even some sites that allow you to look up previously brute-forced hashes. (Of course 1.5 million per second is not a lot in a key space of 2^128.)
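The arithmetic behind that parenthesis can be sketched in a few lines of Python. The 1.5 million hashes per second figure is the one quoted above; the eight-letter lowercase password is an assumption added purely for comparison.

```python
RATE = 1.5e6                         # hashes per second, the figure quoted above
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Trying every possible 128-bit value at that rate.
full_space = 2 ** 128
years_full = full_space / RATE / SECONDS_PER_YEAR

# Assumed comparison: every 8-letter password drawn from a-z only.
small_space = 26 ** 8
hours_small = small_space / RATE / 3600

print(f"2^128 values: about {years_full:.1e} years")
print(f"26^8 passwords: about {hours_small:.0f} hours")
```

On these assumptions the full space takes on the order of 10^24 years, while the short-password space falls in under two days, which may go some way to explaining how both kinds of story can be true at once.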

Yet on the other hand I can find no concrete example (the disputed LulzSec crack at the Sun excepted) where someone claims to have made practical use of an MD5 crack.

Brokenness of MD5 leads to attack on “The Sun”

News coverage in Britain has been dominated by “hackgate” for several weeks now, the interest only subsiding as the horrific nature of what happened in Norway on Friday became clear.

In the middle of all this the website of News International‘s leading daily, the Sun, was taken over by the “LulzSec” crackers – who spent several hours boasting over Twitter about how they were battling the Sun’s admins.

Human security is the weakest form of security – we have all worked in places where management expect you to share passwords, after all. But it seems that one of the issues here was technical, according to the latest issue of New Scientist.

Various passwords at NI were hashed with the MD5 algorithm, which is thoroughly broken: something which is pretty worrying when a locate md5 command throws up 2928 references.

What makes it worse is that the breakage has apparently been known since 1996. (From what I can gather, the issue is that the hashed code can have duplicates, i.e. two different inputs can give the same output – meaning it is possible to create an input whose MD5 hash matches the expected code even though whoever supplied it is not genuine.)

Update: (And with thanks to John Rentoul for spotting the spelling mistake). It has been said to me that “this explanation makes no sense whatsoever”. Well, I am merely commenting on what others have reported – click on some of the links below – to make the point that a clearly very broken hash algorithm is in very widespread use. But there are many ways to pick up a password file that admins may have exposed and not worried about because they think it’s encrypted and so unbreakable. Perhaps that happened here? Back in the ‘olden days’, before the web crushed all internet competitors, FTP sites were very common and littered with password files. Perhaps the Sun has an FTP site (this venerable protocol still has some uses after all)?