Let us now praise insufficiently famous men: Claude Shannon


Claude Elwood Shannon (1916-2001)
Claude Elwood Shannon (1916-2001) (Photo credit: Wikipedia)lies absolutely all the communication on the internet.

Claude Shannon, who was born exactly a century ago, is the reason you are reading this – his work on signal processing lies behind all the communication on the internet.

Shannon’s master work was his MSc thesis – the unreachable target which is dangled in front of every computer science masters student.

Advertisements

Give yourself a Christmas present: learn sed


English: A Shebang, also Hashbang or Sharp ban...
A Shebang, also Hashbang or Sharp bang. (Photo credit: Wikipedia)

Text is at the core of The Unix Way – and all True Unix Hackers work from the command line. This much you know.

(If you don’t get a copy of The Art of Unix Programming – there is an awful lot of rubbish in that book but it does do one thing well: explain the deep connection between text and Unix.)

In a practical sense this means to get the best from your Unix system (and this includes you if you are a Mac OSX user) you need to boost your command line skills. The first thing to do is, of course, become familiar with a text editor – either vi or emacs (I am a vi user, but refuse to engage in a religious war on this matter.)

Then, perhaps not the next thing, but one of the next things you should do is learn sed – the streaming editor – one of the many gifts to the world (including Unix, of course) from Bell Labs (I recently read The Idea Factory: Bell Labs and the Great Age of American Innovation and I suppose I really ought to get around to writing a review of that).

Sed comes from the 1970s, but as so often in computing, it feels to me that its time has come again – in the era of big data a program that allows you to edit a file one line at a time – as opposed to trying to read as much of a file as possible into your computer’s memory – has come round again.

If you are sufficiently long in the tooth to have messed about with Microsoft’s edlin or any other line editor you might be forgiven for giving a hollow laugh at this point – but sed is a tool that genuinely repays the effort you have to make to learn it.

In the last few weeks I have been messing about with 220GB XML files and even the University of York’s big iron compute server cannot handle a buffered edit of a file that size – sed is the only realistic alternative (actually I thought about using my own hex editor – hexxed – which is also essentially a line editor, but a hex editor is really for messing about with binary files and I wouldn’t recommend it.

Sed has allowed me to fix errors deep inside very large files with just a few commands – eg:

LANG=C sed ‘51815253s@^.*$@<instruction address=\’004cf024\’ size=’03’ />@’ infile.xml >outfile.xml

Fixes line 51,815,253 in my file (the line identified by an XML fatal error). Earlier I had executed another line of sed to see what was wrong with that line:

LANG=C sed -n ‘51815253p’ infile.xml

(The LANG=C prefix is because the breakage involved an alien locale seemingly being injected into my file.)

Sed allows you to do much more – for instance anything you can identify through a pattern can be altered. Let’s say you have (text) documents with your old email address – me@oldaddress.com – and you want to change that to your new address – me@newaddress.com …

sed ‘s/me@oldaddress\.com/me@newaddress\.com/g’ mytext.txt > newtext.txt

Then check newtext.txt for correctness before using mv to replace the original.

But there is much, much more you can do with it.

Plus you get real cred as a Unix hacker if you know it.

Now, too many programs these days – especially anything from Redmond – go out of their way to suppress text formats. Text, after all, is resistant to the “embrace and extend” methodology – text wants to be free. But there is plenty of it out there still.

Books that teach you about sed are not so plentiful – I have been reading an old edition of sed & awk – which seems to be out of print – though you can buy a second hand copy for less than a quid excluding postage costs. Well worth the investment, I’d say.

Plan9 on the Raspberry Pi


Glenda, the Plan 9 Bunny
Glenda, the Plan 9 Bunny (Photo credit: Wikipedia)

Plan 9 from Bell Labs” was meant to be the successor system to Unix and like the original was designed and built by AT&Ts engineers at Bell Labs(the title is, of course, a skit on what is supposedly the best worst-ever film – “Plan 9 from Outer Space”).

Plan 9 never really made it. Linux came along and gave us Unix for the masses on cheap hardware for free and the world moved on. (Though some of the ideas in Plan 9 were retro-fitted into Linux and other Unix-like systems.)

The increased speed of commodity computers – latterly sustained via SMP – meant that computing power that once seemed only available to the elite could be found on the High Street and easy to use and install clustering software meant scientists and others could build super-computers using cheap hardware and free software. The multi-computer idea at the heart of Plan 9 seemed to have been passed-by as we screamed along the Moore’s Law Highway.

But now Moore’s Law is breaking down – or rather we are discovering that while the Law continues to apply – in other words we can still double the number of transistors on silicon every 18 – 24 months – other factors (heat dissipation essentially) mean we cannot translate a doubling of transistors into a computer that runs twice as fast. And so the multi-computer idea is of interest once more.

Plan 9 is not likely to be the operating system of the future. But as an actually existing multi-computer operating system it could still have a lot to teach us.

Now it has been ported to run on the Raspberry Pi single board computer I have decided to order another three of these things (I already have one running as a proxy server) and use them as Plan 9 nodes. The boards should be here in about three weeks (I hope), meaning I will have them as a Christmas present to myself.

Trying to work out why some people think Kevin Mitnick is a hero


Free Kevin bumper sticker, advocating release ...
Free Kevin bumper sticker, advocating release of Kevin Mitnick (Photo credit: Wikipedia)

For the last six months I have been spending a fair bit of time in the gym – I am getting older and I need to lose weight and increase fitness.

In truth, I quite enjoy it in general, but there are moments when I want to stop just because running on the same spot on a treadmill is essentially not that exciting. And recently I have been upping my endurance and pace (from walking to slow running, that is), and the biggest challenge to keeping that up and extending it can feel like beating the boredom, not passing through any physical barrier.

So, I thought I’d try an audio book as a way of overcoming the running-in-one-spot-blues. The one I wanted –The Idea Factory: Bell Labs and the Great Age of American Innovation – exists but I am banned from buying in the UK (so much for the free market), so I decided to try Ghost In The WiresKevin Mitnick‘s (ghost written) autobiography.

Now, I have listened to about an hour of this and I am really struggling to understand why so many people see Mitnick as a hero. So far he’s only 17 but has already described his engagement in sexual harassment, behaviour which got his mother’s phone cut off and general all-round anti-social unpleasantness.

I am not into lcoking people up and throwing away the key, especially for crimes against property which have minimal impact (after all if you steal a piece of software source code you don’t automatically diminish the utility of the code to the original user). But I think I would find it hard to be angry on Mitnick’s behalf. Perhaps greater injustice will be revealed as the book goes on, but so far Mitnick just sounds like a poorly socialised boor.

The Great Alan Turing


Allan Turing Statue, on display at Bletchley Park
Image via Wikipedia

Slashdot have a story to the effect that Leonardo DiCaprio is to play Alan Turing in a film that will mark the mathematician’s centenary next year.

Great news – the man’s memory deserves nothing more than the actor who has proved himself to be both great and edgy in recent work (he’s certainly not the milque toast figure the start of his career briefly suggested.)

As a geek, of course, I hope that the film will try to explain, just a little his achievements.

But how can you explain the ideas of computability and the Church-Turing thesis in a popular film? A tough one, but I suppose you could do something.

The Bletchley Park “bombe” and the idea that the weakness of the German Enigma machine – that it would never map a letter to itself (eg., in any message “e” would never be encrypted as “e”) – could be used to break the code (if a combination of a guessed plain text, usually a weather report, at the start of the message , and the initial key settings produced code that mapped letters to themselves then the initial settings were wrong) – is probably easier to explain.

And don’t forget about SIGSALY, the voice encryption system Turing worked on with Bell Labs. As a piece of engineering this is probably impossible to over-estimate in importance: as the first practical pulse code modulation system it could even be said to be the mother the mobile phone, or at least its grand aunt.

And, of course, let me again plug my book of the year: The Annotated Turing: A Guided Tour Through Alan Turing’s Historic Paper on Computability and the Turing Machine