cartesian product … stuff about computing mostly

Updating the five minute and the five byte rules

It’s fair to say I made quite a lot of mistakes in this article – so don’t read what follows (though the comments are of interest) – but try this fixed version.

(So I had quite a lot of errors in this blog. I hope I’ve fixed them now and apologies for the mistakes.)

This paper, The 5 Minute Rule for Trading Memory for Disc Accesses and the 5 Byte Rule for Trading Memory for CPU Time, looked at the economics of memory and disk space on big iron database systems in the early/mid 1980s. Disk space then cost about $20,000 per 540MB (3.7 cents or thereabouts per KB) and (volatile) memory about $5 per kilobyte. The paper concluded that the financial trade-off between the two favoured having enough memory to hold any data you need to access around once every five minutes or more often.

It then compared the cost of computing power (estimated at about $50,000 per MIPS) with the cost of memory: e.g. if you compress data to save on memory space you have to use additional computing power to access the data. Here the trade-off was calculated to be about 5 bytes of memory per instruction per second.

The trade-off between memory and disk

Now the paper explicitly ruled out applying this sort of comparison to PCs, citing limited flexibility in system design options and different economics. But we won’t be so cautious.

So what are the trade-offs today? Well, let us consider a case with 2TB SSD disks. These cost about £500 (and probably about $500; we are approximating). Let’s say we are going with a RAID 5 arrangement, so we actually need 4 ‘disks per disk’ and pay £2000 for 2TB. The cost is then 0.0001 penny per kilobyte (about 4 orders of magnitude less than 35 years ago).

And memory – for simplicity we are going to say 128GB of DRAM costs us £1000 (an over-estimate but fine for this sort of calculation). That means memory costs about 0.001p per KB – so the cost of both media has fallen hugely.
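The two per-kilobyte figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the prices are the rough estimates used in this post, the helper name `pence_per_kb` is made up for illustration, and I am assuming decimal units (1 KB = 1000 bytes).

```python
# Unit prices in pence per kilobyte, using decimal units (1 KB = 1000 bytes).
# All prices are this post's rough estimates, not market data.

TB = 10**12
GB = 10**9


def pence_per_kb(price_pounds: float, capacity_bytes: float) -> float:
    """Return the cost in pence of one kilobyte of capacity."""
    price_pence = price_pounds * 100
    capacity_kb = capacity_bytes / 1000
    return price_pence / capacity_kb


# Four 2TB SSDs at £500 each, counted as 2TB of storage
# (the post's '4 disks per disk' RAID assumption).
ssd_pence_per_kb = pence_per_kb(4 * 500, 2 * TB)

# 128GB of DRAM at £1000.
dram_pence_per_kb = pence_per_kb(1000, 128 * GB)

print(f"SSD:  {ssd_pence_per_kb:.6f} p/KB")
print(f"DRAM: {dram_pence_per_kb:.6f} p/KB")
```

The DRAM figure comes out nearer 0.0008p per KB, which is rounded up to 0.001p in the text.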

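To make the calculation we need to look at access times. We’ll assume that we can read a 4KB page from our SSD in 100 microseconds, so the disk can handle 10,000 such reads a second. Our £2000 of disk therefore costs 20p per access per second (200,000p divided by 10,000 accesses a second), while 4KB of memory costs us 0.004p. Following the logic of the original paper: keeping a page in memory costs 0.004p, and if that page is read once a second it saves one read per second of disk capacity, worth 20p. If it is read only once every N seconds the saving falls to 20p/N, which equals the 0.004p cost of the memory when N = 20/0.004 = 5000 seconds. That is the break-even point.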
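Here is the same break-even calculation as a small Python sketch. The function name `break_even_seconds` is my own invention. The 1980s check uses about $2000 per access per second and $5 per KB, the figures Greg Shubert quotes from the paper in the comments below, together with a 1KB record size, which is my recollection of the paper and not something stated in this post.

```python
def break_even_seconds(cost_per_access_per_sec: float,
                       memory_cost_per_kb: float,
                       page_kb: float) -> float:
    """Interval between accesses at which caching a page in memory
    costs the same as the disk capacity needed to re-read it.

    Both costs must be in the same currency unit.
    """
    page_memory_cost = memory_cost_per_kb * page_kb
    return cost_per_access_per_sec / page_memory_cost


# Today (pence): £2000 of SSD handling 10,000 reads a second.
disk_cost_pence = 4 * 500 * 100
reads_per_second = 10_000
cost_per_access_per_sec = disk_cost_pence / reads_per_second  # 20p

modern = break_even_seconds(cost_per_access_per_sec,
                            memory_cost_per_kb=0.001, page_kb=4)

# 1980s check (dollars): $2000 per access/second, $5 per KB, 1KB record.
original = break_even_seconds(2000, memory_cost_per_kb=5, page_kb=1)

print(f"today: {modern:.0f} seconds ({modern / 60:.0f} minutes)")
print(f"1980s: {original:.0f} seconds ({original / 60:.1f} minutes)")
```

With the paper's figures this gives 400 seconds, which is where the "five minutes" comes from. Changing `page_kb` or the assumed read rate moves the modern answer a lot, so treat the 5000 seconds as an order-of-magnitude estimate.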

Alternatively we can think of this as meaning our caching or virtual memory schemes should target those pages that are accessed every 5000 seconds – about an hour and a half! The problem with that is that it implies a memory system of roughly 16TB to be truly efficient.

The trade-off between memory and computing power

As mentioned above, back in 1985, computing power was estimated to cost about $50,000 per Million Instructions Per Second (MIPS). These days single core designs are essentially obsolete and so it’s harder to put a price on MIPS: good parallel software will drive much better performance from a set of 3GHz cores than hoping a single core’s 5GHz burst speed will get you there. But as we are making estimates here, we will opt for 3000 MIPS costing £500, so a single MIPS costs about £0.17 and a single instruction per second costs (again approximating) 0.00002 pence.

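Roughly speaking we have a byte of memory costing 0.000001p, maybe 20 times cheaper than an instruction per second. So memory is now cheaper relative to computing power than it was in 1985 (about 20 bytes per instruction per second against 5), which suggests that there isn’t much to be gained in data compression at all.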
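As a rough check on that ratio, here is the arithmetic in Python. Again this is a sketch using this post's estimates (3000 MIPS for £500, memory at 0.001p per KB, 1 KB = 1000 bytes), and the variable names are just for illustration.

```python
# How many bytes of memory cost the same as one instruction per second?
# Prices are this post's estimates; 1 KB is taken as 1000 bytes.

cpu_price_pence = 500 * 100               # £500 of processor
instructions_per_second = 3000 * 10**6    # 3000 MIPS

pence_per_ips = cpu_price_pence / instructions_per_second
pence_per_byte = 0.001 / 1000             # memory at 0.001p per KB

bytes_per_ips = pence_per_ips / pence_per_byte

print(f"one instruction/second: {pence_per_ips:.7f} p")
print(f"one byte of memory:     {pence_per_byte:.7f} p")
print(f"bytes per instruction/second: {bytes_per_ips:.1f}")
```

Without the intermediate rounding the ratio is a little under 17 bytes per instruction per second. Rounding the instruction cost up to 0.00002p gives the figure of 20 used above, and either way it is a few times larger than the original 5.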

But there are some big assumptions here – again that we have all the memory we need and that there is no instruction cost in having lots of memory.

Adrian McMenamin

November 15, 2020

Five byte rule, Five minute rule, Memory management, virtual memory

5 responses to “Updating the five minute and the five byte rules”

Albert

November 22, 2020 at 11:15 pm

$500 for 2TB, that is 5×10^4 p / 2×10^9 kB = 2.5×10^-5 p/kB. That’s a reduction of 5 orders of magnitude from 35 years ago, which seems another major error of scale. There was no RAID back then, so we should not consider RAID now.
Zaphus

November 22, 2020 at 11:41 pm

“we will opt for 3000 MIPS costing you £500 and so a single MIPS costing £6”

I think you have that wrong, a single MIPS would cost £500/3000 = £0.16
1. Adrian McMenamin
  
  November 24, 2020 at 7:40 pm
  
  Yes, I dropped the ball on this quite badly! I have updated it (new blog) with new figures which I hope are correct
Greg Shubert

November 23, 2020 at 2:42 am

Reading this article on Nov. 22, I think I see some errors.

Price for new SSD is 2*10^5 cents / 2*10^9 KB = 10^-4 cents/KB, not 0.1 cents/KB as you write. Recall that 1 TB = 10^12 B = 10^9 KB.

Likewise price for RAM is 10^5 cents / 10^8 KB = 10^-3 cents/KB, not 1 cent/KB. 100 GB = 10^2 * 10^6 KB = 10^8 KB. Both new calculations seem to be off by 3 orders of magnitude.

Disk accesses cost about 20 cents per (acc/sec). Double the disk cost to account for extra CPU support, which is 10^5 cents/10^4 a/s = 10 cents per a/s. So 10 cents times 2 = 20 cents (or 20p at dollar-pound parity).

But the seconds to break-even is no longer $2000 (to save 1 a/s) divided by $5 (cost of RAM) in the 1986 paper. It will be 20 cents (to save 1 a/s) divided by 4*10^-4 cents (cost of 4K RAM). This is 5*10^4 seconds, roughly 10 hours, not 5 minutes.

The decrease in cost of MIPS from $50,000 to roughly $5 is 4 orders of magnitude. The decrease in cost of disk accesses was from about $2,000 to about $0.2, or 4 orders of magnitude. 1 KB RAM decreased from $5 to 10^-3 cent or almost 6 orders of magnitude. This is why the break-even time increased, because RAM got cheaper faster than disks got faster.

The break-even point between memory and MIPS is based on what you can buy of each for the same money. 1 cent per MB is 500 MB for 500 cents. One MIPS is about 500 cents. Dropping the millions, we get 100 bytes per instruction/second.

The decrease in cost of MIPS from $50,000 to roughly $5 is 4 orders of magnitude. Again, RAM got cheaper faster than MIPS got cheaper.

Instead of 5 minutes and 5 bytes, the break-even points seem now to be 10 hours and 100 bytes.
1. Adrian McMenamin
  
  November 24, 2020 at 5:53 pm
  
  Thanks. Yes – it was a mess and sloppy on my part. Going to redo it.