Things learned this week: 7 February 2014

1. BT Business Broadband is rubbish but you can fix your download problem

For a long time I have wanted to download OSX Mavericks on to a machine at work for testing, but I have never been able to do so: the download always freezes at 995MB.

BT acknowledge that their equipment is crippled in this way – they claim it is a security feature and even said to me that “two years ago downloads weren’t this big”. As a result they may (as in my case) simply refuse to replace the equipment and instead try to get you to reset the hardware on a per download basis.

But you can get round this: download “Hotspot Shield”, a free VPN product, and use it to get you over the 995MB hump.

2. C++ templates are powerful

Example red-black tree; analogy to a B-tree. (Photo credit: Wikipedia)

Four years ago I wrote some C++ code to generate a red-black tree using templates. The code has been mouldering on GitHub ever since, but now I have (hopefully) found a completely new use for it, because the template design allows me to repurpose it for something else. Code reuse in action!


Give yourself a Christmas present: learn sed

A Shebang, also Hashbang or Sharp bang. (Photo credit: Wikipedia)

Text is at the core of The Unix Way – and all True Unix Hackers work from the command line. This much you know.

(If you don’t, get a copy of The Art of Unix Programming: there is an awful lot of rubbish in that book, but it does do one thing well, which is to explain the deep connection between text and Unix.)

In a practical sense this means to get the best from your Unix system (and this includes you if you are a Mac OSX user) you need to boost your command line skills. The first thing to do is, of course, become familiar with a text editor – either vi or emacs (I am a vi user, but refuse to engage in a religious war on this matter.)

Then, perhaps not the next thing, but one of the next things you should do is learn sed, the stream editor, one of the many gifts to the world (including Unix, of course) from Bell Labs. (I recently read The Idea Factory: Bell Labs and the Great Age of American Innovation and I suppose I really ought to get around to writing a review of that.)

Sed comes from the 1970s but, as so often in computing, it feels to me that its time has come round again in the era of big data: it is a program that allows you to edit a file one line at a time, as opposed to trying to read as much of a file as possible into your computer’s memory.
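As a rough illustration of that line-at-a-time behaviour (a sketch, not a benchmark), the pipeline below feeds sed a million generated lines through a pipe and asks for just one of them. The line count and the line number are arbitrary choices for the demonstration, and awk is only there to generate the input. There is no file for sed to load here, only a stream to read a line at a time.

```shell
# Generate a million numbered lines on the fly; nothing is written to disk.
# sed -n suppresses the default printing of every line, and the address
# 500000 with the p command prints only that one line as it streams past.
picked=$(awk 'BEGIN { for (i = 1; i <= 1000000; i++) print i }' | sed -n '500000p')
echo "$picked"
```

On a real file you would normally add a q after the p (as in `sed -n '500000{p;q;}' file`) so that sed stops reading once it has printed the line you asked for.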

If you are sufficiently long in the tooth to have messed about with Microsoft’s edlin or any other line editor you might be forgiven for giving a hollow laugh at this point – but sed is a tool that genuinely repays the effort you have to make to learn it.

In the last few weeks I have been messing about with 220GB XML files, and even the University of York’s big iron compute server cannot handle a buffered edit of a file that size, so sed is the only realistic alternative. (Actually I thought about using my own hex editor, hexxed, which is also essentially a line editor, but a hex editor is really for messing about with binary files and I wouldn’t recommend it.)

Sed has allowed me to fix errors deep inside very large files with just a few commands – eg:

LANG=C sed "51815253s@^.*\$@<instruction address='004cf024' size='03' />@" infile.xml > outfile.xml

This fixes line 51,815,253 in my file (the line identified by an XML fatal error). Earlier I had executed another line of sed to see what was wrong with that line:

LANG=C sed -n '51815253p' infile.xml

(The LANG=C prefix is because the breakage involved an alien locale seemingly being injected into my file.)
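The same inspect-then-fix routine can be tried safely on a tiny file first. What follows is a minimal sketch: the three-line sample file, the deliberately broken second line and the `workdir` directory are all invented for the demonstration, and only the replacement text comes from the real fix above.

```shell
# Build a small stand-in for the huge XML file (contents are made up for the demo).
workdir=$(mktemp -d)
printf '%s\n' \
  "<instruction address='004cf020' size='04' />" \
  "<instruction address=BROKEN" \
  "<instruction address='004cf027' size='02' />" > "$workdir/infile.xml"

# Step 1: look at the suspect line only. -n stops sed printing every line,
# p prints line 2, and q quits straight away instead of reading to the end.
bad_line=$(LANG=C sed -n '2{p;q;}' "$workdir/infile.xml")
echo "before: $bad_line"

# Step 2: replace the whole of line 2 and copy every other line through unchanged.
# @ is the delimiter, so the / in the XML needs no escaping. The script is in
# double quotes because it contains single quotes, which means the $ has to be
# written as \$ to stop the shell expanding $@ itself.
LANG=C sed "2s@^.*\$@<instruction address='004cf024' size='03' />@" \
  "$workdir/infile.xml" > "$workdir/outfile.xml"

fixed_line=$(sed -n '2p' "$workdir/outfile.xml")
echo "after:  $fixed_line"
```

Writing to a separate output file, as here, leaves the original untouched until you are happy with the result.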

Sed allows you to do much more: for instance, anything you can identify through a pattern can be altered. Let’s say you have (text) documents with your old email address, me@oldaddress.com, and you want to change that to your new address, me@newaddress.com:

sed 's/me@oldaddress\.com/me@newaddress.com/g' mytext.txt > newtext.txt

Then check newtext.txt for correctness before using mv to replace the original.
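One way to do that check, sketched here on a throwaway file: the sample text, the `workdir` directory and the choice of diff and grep as the checking tools are my assumptions for the demonstration, not the only way to go about it.

```shell
# Throwaway sample file; the third line includes a near-miss (X instead of a dot).
workdir=$(mktemp -d)
printf '%s\n' 'Contact: me@oldaddress.com' 'Unrelated line' \
  'Backup contact: me@oldaddress.com, me@oldaddressXcom' > "$workdir/mytext.txt"

# The dot is escaped in the pattern so that it matches a literal "." only;
# the trailing g replaces every match on a line, not just the first.
sed 's/me@oldaddress\.com/me@newaddress.com/g' "$workdir/mytext.txt" > "$workdir/newtext.txt"

# Review exactly what changed. diff exits non-zero when files differ, hence || true.
diff "$workdir/mytext.txt" "$workdir/newtext.txt" || true

# Count lines still containing the old address before replacing the original.
# grep -c prints 0 but exits non-zero when nothing matches, hence || true.
remaining=$(grep -c 'me@oldaddress\.com' "$workdir/newtext.txt" || true)
if [ "$remaining" -eq 0 ]; then
  mv "$workdir/newtext.txt" "$workdir/mytext.txt"
fi
```

The near-miss on the last line is left alone, which is the point of escaping the dot in the pattern.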

But there is much, much more you can do with it.

Plus you get real cred as a Unix hacker if you know it.

Now, too many programs these days – especially anything from Redmond – go out of their way to suppress text formats. Text, after all, is resistant to the “embrace and extend” methodology – text wants to be free. But there is plenty of it out there still.

Books that teach you about sed are not so plentiful – I have been reading an old edition of sed & awk – which seems to be out of print – though you can buy a second hand copy for less than a quid excluding postage costs. Well worth the investment, I’d say.

@AmazonKindle have broken their Cloud Reader too

Having checked out the alternatives to the Kindle that are provided by Amazon, the current state of play is:

  • Linux: no native alternative offered
  • Windows: Kindle app crashes on 64-bit Wine, so I cannot tell you what that’s like
  • Mac OSX: seems to render pages perfectly as broken as the Kindle
  • Android: app is as broken as the Kindle itself
  • Cloud Reader: supposedly what Linux users should be using, but this too is broken, failing to render characters properly in the same way as the Kindle.