The actual optimised code

When I tried to write my optimised code from last night, I discovered a problem with it: the various register add instructions use a 5-bit field to number each register being accessed, which allows access to registers 0 to 31 and nothing else.

But the program counter is not one of those registers (it gets numbered 32, which won't fit in 5 bits), so it just can't be used in these instructions. After a bit of head scratching I came up with a solution:
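To make the 5-bit limit concrete, here is a minimal Python sketch of how a register-to-register add is packed into 32 bits. The function name encode_add is mine, purely for illustration; the field layout is the standard RISC-V R-type format. Each register number has to fit in 5 bits, so anything numbered 32 is rejected before it can be encoded.

```python
# Sketch: pack a RISC-V "add rd, rs1, rs2" (R-type) into a 32-bit word.
# encode_add is a hypothetical helper name, not part of any toolchain.

OPCODE_OP = 0x33   # opcode for register-register ALU operations
FUNCT3_ADD = 0x0
FUNCT7_ADD = 0x00


def encode_add(rd, rs1, rs2):
    """Return the 32-bit encoding of add rd, rs1, rs2."""
    for reg in (rd, rs1, rs2):
        # A 5-bit field can only hold 0..31.
        if not 0 <= reg <= 31:
            raise ValueError(f"register {reg} does not fit in 5 bits")
    return (
        (FUNCT7_ADD << 25)
        | (rs2 << 20)
        | (rs1 << 15)
        | (FUNCT3_ADD << 12)
        | (rd << 7)
        | OPCODE_OP
    )


if __name__ == "__main__":
    # t0, t1, t2 are x5, x6, x7 in the standard ABI naming.
    print(hex(encode_add(5, 6, 7)))
```

Calling encode_add(32, 5, 6) raises ValueError, which is the same wall I hit: there is simply no bit pattern for a 33rd register.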

auipc t0, 0
ld t1, 24(t0)
addi sp, sp, -8
sd t1, 0(sp)
ld t1, 32(t0)
jr t1

That auipc t0, 0 adds 0 to the value in the pc and stores the result in t0, so it is the equivalent of mv t0, pc. By then taking the offsets straight from t0, I also saved another instruction.
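As a sanity check on the arithmetic, here is a small Python model of what auipc (add upper immediate to pc) computes and where the offsets 24 and 32 could come from. The function names are mine. The layout is an assumption, not something the listing above states: six uncompressed 4-byte instructions, followed immediately by two 8-byte values.

```python
# Sketch: model auipc on RV64 and the pc-relative offsets used above.
# Assumptions (mine): every instruction is 4 bytes (no compressed
# encodings) and the 8-byte data words follow the code directly.

MASK64 = (1 << 64) - 1
INSN_BYTES = 4


def auipc(pc, imm20):
    """rd = pc + (sign-extended 20-bit immediate << 12), wrapped to 64 bits."""
    if not 0 <= imm20 < (1 << 20):
        raise ValueError("immediate must fit in 20 bits")
    if imm20 & (1 << 19):          # sign-extend the 20-bit field
        imm20 -= 1 << 20
    return (pc + (imm20 << 12)) & MASK64


def literal_offsets(n_insns, n_words, word_bytes=8):
    """Offsets from the first instruction to each data word after the code."""
    start = n_insns * INSN_BYTES
    return [start + i * word_bytes for i in range(n_words)]


if __name__ == "__main__":
    pc = 0x10000                       # arbitrary example address
    t0 = auipc(pc, 0)                  # with an immediate of 0, t0 == pc
    print(hex(t0), literal_offsets(6, 2))
```

With an immediate of 0 the result is just the address of the auipc instruction itself, and under the layout assumed here the two values land at offsets 24 and 32 from it, matching the two ld instructions.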

Does it work? Yes, it compiles and runs.

But does it save time? Yes: it knocks about half a second off the benchmark execution time, roughly a 6% saving, which isn’t bad. But my code is still rather adrift of GForth, so I need to look deeper.