It’s fair to say I made quite a lot of mistakes in this article – so don’t read what follows (though the comments are of interest) but try this fixed version instead.
(In other words, this post had quite a lot of errors in it. I hope I’ve fixed them now, and apologies for the mistakes.)
This paper – The 5 Minute Rule for Trading Memory for Disc Accesses and the 10 Byte Rule for Trading Memory for CPU Time – looked at the economics of memory and disk space on big-iron database systems in the mid-1980s. It concluded that the financial trade-off between disk space (then costing about $20,000 per 540MB, or roughly 3.7 cents per KB) and (volatile) memory (about $5 per kilobyte) favoured having enough memory to hold any data you need to access around once every five minutes.
It then compared the cost of computing power (estimated at about $50,000 per MIPS) with the cost of memory – e.g. if you compress data to save memory space you have to spend extra instructions to access it. Here the trade-off works out at about 10 bytes of memory per instruction per second: an instruction per second cost about 5 cents, while a byte of memory cost about 0.5 cents.
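A quick back-of-the-envelope check of those mid-1980s figures – the inputs are just the numbers quoted above, so the output is only as rough as they are:

```python
# Rough check of the mid-1980s figures quoted above.
disk_cost_dollars = 20_000          # ~$20,000 for a 540MB disk
disk_capacity_kb = 540 * 1000       # 540MB expressed in KB
memory_cost_per_kb = 5.0            # ~$5 per KB of DRAM
cpu_cost_per_mips = 50_000          # ~$50,000 per million instructions/second

disk_cents_per_kb = disk_cost_dollars / disk_capacity_kb * 100
print(f"disk: {disk_cents_per_kb:.1f} cents per KB")          # ~3.7 cents

cost_per_ips = cpu_cost_per_mips / 1_000_000                  # $ per instruction/sec
cost_per_byte = memory_cost_per_kb / 1024                     # $ per byte of memory
print(f"bytes of memory per instruction/sec: {cost_per_ips / cost_per_byte:.0f}")  # ~10
```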
The trade-off between memory and disk
Now the paper explicitly ruled out applying this sort of comparison to PCs – citing limited flexibility in system design options and different economics. But we won’t be so cautious.
So what are the trade-offs today? Let us consider a case with 2TB SSDs. These cost about £500 each (and probably about $500 – we are approximating) and let’s say we are going with a RAID 5 arrangement and, to keep the sums simple, budget four disks for every disk’s worth of usable space – so our 2TB of usable storage costs about £2,000, or 0.0001 pence per kilobyte (about four orders of magnitude less than 35 years ago).
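Checking that per-kilobyte figure under the assumptions just stated (four £500 drives paying for 2TB of usable space):

```python
# SSD cost per KB under the assumptions above.
drive_cost_pounds = 500
drives_paid_for = 4
usable_capacity_kb = 2 * 10**9      # 2TB of usable space, expressed in KB

pence_per_kb = drive_cost_pounds * drives_paid_for * 100 / usable_capacity_kb
print(f"{pence_per_kb:.4f} pence per KB")   # 0.0001 pence per KB
```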
And memory – for simplicity we are going to say 128GB of DRAM costs us £1000 (an over-estimate but fine for this sort of calculation). That means memory costs about 0.001p per KB – so the cost of both media has fallen hugely.
To make the calculation we need to look at access times. We’ll assume we can read a 4KB page from our SSD in 100 microseconds, so it can handle 10,000 such reads a second. Four KB of memory costs us 0.004p, while our £2,000 disk array, delivering 10,000 accesses a second, costs us 20p per access-per-second. Following the logic of the original paper: a page accessed once every T seconds consumes 1/T accesses per second of disk capacity, worth 20/T pence, while holding it in memory costs 0.004p – so the break-even point comes when the page is accessed once every 20 ÷ 0.004 = 5,000 seconds.
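The same sum in code – every input is one of the approximations above, so treat the output accordingly:

```python
# Break-even interval for keeping a 4KB page in memory rather than
# re-reading it from the SSD array, using the approximations above.
memory_pence_per_kb = 0.001                 # 128GB of DRAM for ~£1000
page_kb = 4
page_memory_cost = memory_pence_per_kb * page_kb          # 0.004p to hold the page

array_cost_pence = 2000 * 100               # four £500 drives
reads_per_second = 10_000                   # one 4KB read per 100 microseconds
disk_cost_per_access_per_second = array_cost_pence / reads_per_second   # 20p

# A page accessed once every T seconds consumes 1/T accesses per second,
# i.e. (20/T) pence of disk capacity. Break-even when that equals 0.004p.
break_even_seconds = disk_cost_per_access_per_second / page_memory_cost
print(f"break-even interval: {break_even_seconds:.0f} seconds")   # 5000 seconds
```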
Alternatively, we can think of this as meaning our caching or virtual memory schemes should target those pages that are accessed at least once every 5,000 seconds – about an hour and 20 minutes! The problem with that is that it implies a memory system of roughly 16TB to be truly efficient.
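To make that “targeting” concrete, here is a toy decision rule – the function name and example intervals are mine, purely for illustration; it is just the break-even figure above restated as code:

```python
# Illustrative decision rule: keep a page in memory if it is accessed
# more often than the break-even interval computed above.
BREAK_EVEN_SECONDS = 5000

def worth_caching(seconds_between_accesses: float) -> bool:
    """True if holding the page in memory is cheaper than paying for
    the disk accesses it would otherwise generate."""
    return seconds_between_accesses < BREAK_EVEN_SECONDS

print(worth_caching(300))    # True  – accessed every five minutes
print(worth_caching(7200))   # False – accessed every two hours
```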
The trade-off between memory and computing power
As mentioned above, back in the mid-1980s computing power was estimated to cost about $50,000 per Million Instructions Per Second (MIPS). These days single-core designs are essentially obsolete, so it’s harder to put a price on a MIPS – good parallel software will get much better performance from a set of 3GHz cores than hoping a single core’s 5GHz burst speed will get you there. But, as we are only making estimates here, we will say 3,000 MIPS costs you £500, so a single MIPS costs about £0.17 and a single instruction per second costs (again approximating) about 0.00002 pence.
Roughly speaking a byte of memory now costs about 0.000001p – perhaps 20 times cheaper than an instruction per second. In other words the break-even point is now somewhere around 20 bytes of memory per instruction per second (compared with about 10 in the mid-1980s), so unless compression saves more than about 20 bytes of memory for every extra instruction per second spent on it, it doesn’t pay – which suggests there isn’t much to be gained from data compression at all.
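Again, the arithmetic, using only the approximations above:

```python
# Memory vs CPU today, using the approximations above.
memory_pence_per_kb = 0.001
memory_pence_per_byte = memory_pence_per_kb / 1024        # ~0.000001p per byte

cpu_pence_per_mips = 500 * 100 / 3000                     # £500 buys ~3000 MIPS
pence_per_instruction_per_second = cpu_pence_per_mips / 1_000_000   # ~0.00002p

ratio = pence_per_instruction_per_second / memory_pence_per_byte
print(f"bytes of memory per instruction/sec: {ratio:.0f}")   # ~17, call it 20
```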
But there are some big assumptions here – again that we have all the memory we need and that there is no instruction cost in having lots of memory.