Delays in a binary memory tree queue

This is a bit niche, but I spent a fair bit of this weekend working out an algorithm to calculate how long a memory request packet takes to traverse a binary tree (from a leaf to the root). I have now written a Groovy script (testable in your browser at https://groovyconsole.appspot.com/script/5109475228778496 – click on ‘edit’ and then ‘run’) to calculate the results.

(I am sure that this has been done before, many times, but I couldn’t find the algorithm on a quick search and then the desire to solve this myself took over.)

The problem is this: memory request packets enter the tree at a leaf and then have to cross a series of multiplexors until they reach the root (which is where the tree interfaces with the memory supply). Each multiplexor (mux) has two inputs and one output, so taken together the muxes form a binary tree.
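To make the structure concrete, here is a minimal Python sketch (not the Groovy script itself). It assumes leaves are numbered from 0 on the far left, so the bits of a leaf's index say which input of each mux its packet arrives on. The name `input_sides` and the numbering scheme are illustrative assumptions, not taken from the script.

```python
def input_sides(leaf, depth):
    """Return the input ('L' or 'R') a packet from `leaf` uses at each mux.

    Assumed numbering: leaf 0 is the leftmost leaf. The list starts at the
    mux next to the leaves and ends at the root mux.
    """
    return ["R" if (leaf >> level) & 1 else "L" for level in range(depth)]


depth = 8                 # 2**8 = 256 leaves
leaves = 2 ** depth
muxes = leaves - 1        # a full binary tree with n leaves has n - 1 internal nodes

print(leaves, "leaves,", muxes, "muxes")
print("leftmost leaf: ", input_sides(0, depth))
print("rightmost leaf:", input_sides(leaves - 1, depth))
```

Under this numbering the leftmost leaf arrives on the left input of every mux it crosses, and the rightmost leaf always arrives on the right.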

As request packets could arrive at both inputs of a mux at the same time, there has to be a decision procedure about which packet progresses and which is delayed. The simple choice is to always favour packets on either the left or the right input (in my case I went with the right). The question is then what the average and maximum delays for a request packet are.
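The arbitration rule can be explored with a small simulation. The Python sketch below is a simplified model, not a port of the Groovy script. It assumes that each mux forwards one packet per cycle, that each mux input has an unbounded queue, that the right input always wins, and that every packet is already waiting at its leaf at cycle 0. The function name `simulate` and its `load` and `seed` parameters are hypothetical. Because the per-hop timing is an assumption, do not expect it to reproduce the script's figures.

```python
import random
from collections import deque


def simulate(depth, load=1.0, seed=0):
    """Return {leaf: delay} for packets present at cycle 0.

    Delay counts the cycles a packet spends waiting, on top of the `depth`
    cycles an unobstructed packet needs to cross the tree.
    """
    rng = random.Random(seed)
    leaves = 2 ** depth
    # queues[level][mux] = [left_queue, right_queue]; level 0 is next to the leaves.
    queues = [[[deque(), deque()] for _ in range(leaves >> (level + 1))]
              for level in range(depth)]

    pending = 0
    for leaf in range(leaves):
        if rng.random() < load:          # each leaf holds a packet with probability `load`
            queues[0][leaf // 2][leaf % 2].append(leaf)
            pending += 1

    delays = {}
    cycle = 0
    while pending:
        cycle += 1
        # Visit the root first so a packet advances at most one mux per cycle.
        for level in range(depth - 1, -1, -1):
            for mux, (left, right) in enumerate(queues[level]):
                source = right if right else left   # the right input has priority
                if not source:
                    continue
                packet = source.popleft()
                if level == depth - 1:              # crossed the root mux
                    delays[packet] = cycle - depth
                    pending -= 1
                else:
                    queues[level + 1][mux // 2][mux % 2].append(packet)
    return delays


full = simulate(8)                       # 256 leaves, every leaf holding a packet
print("max delay:", max(full.values()))
print("mean delay:", sum(full.values()) / len(full))

light = simulate(8, load=0.01)           # roughly 1% of leaves holding a packet
print("packets at 1% load:", len(light))
```

In this model the root is the bottleneck: it releases one packet per cycle, so with every leaf occupied the last packet out (the leftmost leaf's) waits for all the others.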

So, if you muck about with the script you’ll see that the absolute maximum delay on a 256-leaf tree (e.g., on a Bluetree memory tree connected to a 256-tile NoC) is 495 cycles, while the average maximum delay (i.e., for a leaf in the middle of the tree) is 248 cycles.

In contrast, if the load is just 1% then these figures fall to about 3 and 1.5 cycles.

There is a flaw here though: this all assumes that no new packets enter the system while the original request packet is traversing it. In a dynamic system, even with a load as low as 1%, that is highly unlikely. Packets that enter the tree “to the right” of the original request could add to the delay if they chain with packets already present.