A week further on and not much C++ has been written – and now I think I need to make a new start.
Up to this point I have been trying to write a software model of the hardware and my thought was I could think put a software-modelling layer on top of that. But that simply is not going to work – it is just too complex.
Instead I am going to have to make some policy decisions in the software – essentially over how I model the local memory on the chip: each tile will process memory reads and writes and needs to know where that memory is – it could be in the global off-chip memory store or it could be on-chip.
The difference matters because, at least in theory, the on-chip memory is speedily accessible, while the off-chip memory is 50 to 100 to 500 times “further away”. Because memory accesses exhibit locality it makes sense to ship blocks of addressed memory from the global to the local store – but doing so takes time and if there are a lot of memory movements then we get thrashing.
What I now have to do is think of what policy I will use to decide what memory gets stored locally (or, more likely, what policy I use to map the addresses). I’ll start by once again reviewing papers that propose some schemes for existing Networks-on-Chip.
In other news: I have had a paper (of which I am co-author and first named author) accepted by OSPERTS 15 – so I will be off to Sweden to get mauled by the audience there in early July. It will be an experience, and I am looking forward to it, but I also think it might be not so much a baptism, but a destruction by fire.
Lying in bed this morning and puzzling over what to do …
At first I thought what I should do is copy one of the existing operating system models for NoCs, but that simply would not be flexible enough.
What I have to do is model the hardware (including the modifications to the MMU I want to see) as, essentially, some form of black box, and build other layers – including the memory tree – above that. That means I need to separate the CPU/tiles from global memory: sounds simple in theory but implementing this is going to be very far from easy.
I hope you can see how quickly this becomes complex – and all we are talking about here is a simple software framework to allow me to do the research (ie., delivering the software, complex as it is, is only the very start.)
I am struggling to know where to begin – C++ seems like the logical choice for this, but it’s not proving to be much fun. Particularly because my CPU class has to be able to “execute” some code – I thought about using a DSL but may stick to the XML output I got from my hacked Valgrind Lackey – as at least I can then use existing XML engines.
Should I build from the XML up – eg., get a CPU class that can hack the XML and pass the requests up the chain (eg via the cache up to the Mesh and up to the DRAM etc), or what?