Give yourself a Christmas present: learn sed


A Shebang, also Hashbang or Sharp bang. (Photo credit: Wikipedia)

Text is at the core of The Unix Way – and all True Unix Hackers work from the command line. This much you know.

(If you don’t, get a copy of The Art of Unix Programming. There is an awful lot of rubbish in that book, but it does do one thing well: explain the deep connection between text and Unix.)

In a practical sense this means that to get the best from your Unix system (and this includes you if you are a Mac OS X user) you need to boost your command line skills. The first thing to do is, of course, become familiar with a text editor – either vi or emacs. (I am a vi user, but refuse to engage in a religious war on this matter.)

Then, perhaps not the next thing, but one of the next things you should do is learn sed – the stream editor – one of the many gifts to the world (including Unix, of course) from Bell Labs. (I recently read The Idea Factory: Bell Labs and the Great Age of American Innovation and I suppose I really ought to get around to writing a review of that.)

Sed comes from the 1970s but, as so often in computing, it feels to me that its time has come again: in the era of big data, a program that lets you edit a file one line at a time, as opposed to trying to read as much of the file as possible into your computer’s memory, is just what is needed.

If you are sufficiently long in the tooth to have messed about with Microsoft’s edlin or any other line editor you might be forgiven for giving a hollow laugh at this point – but sed is a tool that genuinely repays the effort you have to make to learn it.

In the last few weeks I have been messing about with 220GB XML files, and even the University of York’s big iron compute server cannot handle a buffered edit of a file that size – sed is the only realistic alternative. (Actually I thought about using my own hex editor – hexxed – which is also essentially a line editor, but a hex editor is really for messing about with binary files and I wouldn’t recommend it.)

Sed has allowed me to fix errors deep inside very large files with just a few commands – eg:

LANG=C sed "51815253s@^.*\$@<instruction address='004cf024' size='03' />@" infile.xml > outfile.xml

This fixes line 51,815,253 in my file (the line identified by an XML fatal error). Earlier I had run another sed command to see what was wrong with that line:

LANG=C sed -n '51815253p' infile.xml

(The LANG=C prefix is there because the breakage seemingly involved bytes from an alien locale being injected into my file; in the C locale sed treats its input as plain bytes.)
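
If you want to try the print-then-fix routine without a 220GB file to hand, here is a minimal sketch on a throwaway three-line file. The file names and the XML content are invented for illustration. The q command, which makes sed stop reading once it has printed the wanted line, is my own addition and is worth having on a very large file. The substitution script is put in double quotes so that the XML can carry single quotes, which in turn means the $ has to be escaped from the shell.

```shell
# Scratch directory so nothing real gets touched.
workdir=$(mktemp -d)

# A made-up three-line file; line 2 stands in for the damaged line.
printf '%s\n' "<instruction address='004cf020' size='03' />" \
              "<instruction addr GARBAGE" \
              "<instruction address='004cf028' size='03' />" > "$workdir/sample.xml"

# Print only line 2, then quit so sed does not read the rest of the file.
broken=$(LANG=C sed -n '2{p;q;}' "$workdir/sample.xml")
echo "$broken"

# Replace the whole of line 2. Inside double quotes the shell would expand $@,
# so the end-of-line anchor is written as \$.
LANG=C sed "2s@^.*\$@<instruction address='004cf024' size='03' />@" \
    "$workdir/sample.xml" > "$workdir/fixed.xml"

# Look at the repaired line in the output file.
fixed=$(sed -n '2p' "$workdir/fixed.xml")
echo "$fixed"
```

Using @ as the delimiter of the s command, rather than the usual /, saves having to escape the slash in the closing /> of the tag.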

Sed allows you to do much more – for instance anything you can identify through a pattern can be altered. Let’s say you have (text) documents with your old email address – me@oldaddress.com – and you want to change that to your new address – me@newaddress.com …

sed 's/me@oldaddress\.com/me@newaddress.com/g' mytext.txt > newtext.txt

Then check newtext.txt for correctness before using mv to replace the original.
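
As a rough sketch of that check-then-replace routine: the file contents below are invented, and diff and grep are just one way of doing the checking, not the only one.

```shell
# Scratch directory with a made-up document holding the old address.
workdir=$(mktemp -d)
printf '%s\n' "Contact: me@oldaddress.com" \
              "Nothing to change on this line" \
              "Cc me@oldaddress.com and me@oldaddress.com" > "$workdir/mytext.txt"

# The dot is escaped in the pattern so it matches only a literal dot;
# the g flag changes every occurrence on a line, not just the first.
sed 's/me@oldaddress\.com/me@newaddress.com/g' "$workdir/mytext.txt" > "$workdir/newtext.txt"

# Show what changed. diff exits non-zero when the files differ, which is expected here.
diff "$workdir/mytext.txt" "$workdir/newtext.txt" || true

# Count the lines that still hold the old address (grep -c prints 0 if there are none).
remaining=$(grep -c 'me@oldaddress\.com' "$workdir/newtext.txt" || true)
echo "lines still holding the old address: $remaining"

# Only replace the original once nothing is left to fix.
if [ "$remaining" -eq 0 ]; then
    mv "$workdir/newtext.txt" "$workdir/mytext.txt"
fi
```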

But there is much, much more you can do with it.

Plus you get real cred as a Unix hacker if you know it.

Now, too many programs these days – especially anything from Redmond – go out of their way to suppress text formats. Text, after all, is resistant to the “embrace and extend” methodology – text wants to be free. But there is plenty of it out there still.

Books that teach you about sed are not so plentiful – I have been reading an old edition of sed & awk – which seems to be out of print – though you can buy a second hand copy for less than a quid excluding postage costs. Well worth the investment, I’d say.

A GUI for Metapost?


English: Three examples of metapost output (Photo credit: Wikipedia)

I have sort-of abandoned my MacBook Air for serious work this last week – going back to a 2008/9 Toshiba laptop (another Morgan Computers purchase) running Linux.

The Apple is a lovely device to travel with and is a beautiful, if extremely expensive, device with which to browse the web, but a decade of conditioning to Linux and its command-line power and orthogonal tool set means I am much happier, even with a slower machine, when it comes to doing things like drawing figures with Metapost.

But having extolled the power of the command line, I am wondering whether I should build a GUI for Metapost – essentially an editor panel coupled with an EPS display panel.

Metapost users seem thin on the ground – though maybe that is because a GUI tool doesn’t exist – but would anyone who does use it care to comment?

It’s Friday afternoon…


…and I could really do with an answer to this question I have posted over on “Superuser”:

I am seeking to back up an encrypted volume used by VirtualBox on one OS X machine to another using rsync (I will eventually stick this into cron).

This is the command line (I am sharing public keys so no password is required) – with some details obscured:

rsync --bwlimit=100 -av -e "scp -P [port numb] user@address:~/VirtualBox\ VMs/ubuntu1/*" ./ubuntu1/.

But it won’t copy anything. It just repeatedly gives me this, even though the date stamp and size of ubuntu1.vdi have now changed:

building file list ... done
drwxr-xr-x         170 2012/10/04 19:06:15 .
-rw-------        7265 2012/10/05 10:00:21 ubuntu1.vbox
-rw-------        7265 2012/10/05 10:00:21 ubuntu1.vbox-prev
-rw-------  7881625600 2012/10/05 10:53:23 ubuntu1.vdi

sent 132 bytes  received 20 bytes  304.00 bytes/sec
total size is 7881640130  speedup is 51852895.59

How do I get this to work properly?