Friday, 6 January 2023

I'm not seeing red, but that's not a good thing!

Some more optimisations today: look-up tables for the foreground and background video sprite tile (SCB1) addresses, with one entry per memory address.
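To illustrate the idea, here is a minimal Python model of trading per-access shifts for a precomputed table. It is only a sketch: the 64x32 tilemap layout, the `offset` encoding and the first-sprite numbers are assumptions made for the example, not the actual Xevious or port layout. The one hardware detail relied on is that each Neo Geo sprite owns 64 words of SCB1 (32 tiles, 2 words per tile).

```python
# Model of a per-address SCB1 look-up table (illustrative, not the 68k code).
SCB1_WORDS_PER_SPRITE = 64      # 32 tiles x 2 words per tile
TILEMAP_COLS = 64               # ASSUMED tilemap width
TILEMAP_ROWS = 32               # ASSUMED tilemap height (one sprite per column)

BG_FIRST_SPRITE = 1             # HYPOTHETICAL sprite allocation
FG_FIRST_SPRITE = 65            # HYPOTHETICAL sprite allocation


def scb1_address(offset, first_sprite):
    """Compute the SCB1 word address the slow way, with masks and shifts."""
    col = offset & (TILEMAP_COLS - 1)   # column selects the sprite
    row = offset >> 6                   # row selects the tile within the sprite
    return ((first_sprite + col) << 6) + (row << 1)


def build_table(first_sprite):
    """Precompute one SCB1 address per tilemap memory offset."""
    return [scb1_address(offset, first_sprite)
            for offset in range(TILEMAP_COLS * TILEMAP_ROWS)]


BG_TABLE = build_table(BG_FIRST_SPRITE)
FG_TABLE = build_table(FG_FIRST_SPRITE)

# At run time a write to video memory becomes a single indexed read:
address = BG_TABLE[64]          # offset 64 = column 0, row 1
```

In 68k terms the table would presumably be an array of words indexed by the doubled offset, so the cost is memory (one word per tilemap cell per layer) in exchange for dropping the shifts.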

Looking at the online profiler again, I've decided I can't make much sense of the numbers that it is reporting. It may have something to do with the RED numbers on half the lines, which are devoid of any cycle, read or write statistics. I'm guessing it doesn't like the GNU AS syntax...

So I can only guess as to whether or not the code is actually more efficient. Given the relative number of lines and the elimination of multiple shifts, I think it's a safe assumption that it is.

I also did a few optimisations in some of the SUB CPU ROM code, mainly reducing the number of registers pushed onto the stack, and changing some MOVEA.L instructions to LEA. These were more things that I noticed in passing than a concerted effort to optimise that code.

Then I had a look at what I did in Scramble for measuring how much time was spent in various routines, including the VBLANK ISR: I changed the backdrop colour. In order to see it on Xevious, I had to temporarily disable the sprites that mask the display at the top and bottom of the screen. That in itself reminded me of some other optimisations that I need to make, namely the (Xevious) sprite management.

The MAIN and SUB programs each execute a table of routines once per VBLANK, and then spin waiting for the next VBLANK. So I hooked into 3 areas of the code to see how much (if any) idle time there was. 1st up was the VBLANK ISR in RED, 2nd the SUB program in BLUE, and 3rd the MAIN program in GREEN.
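The hook itself is simple, and can be modelled in a few lines of Python. This is a sketch of the technique only: `Backdrop`, `run_table` and the routine names are hypothetical stand-ins, and the order in which SUB and MAIN run here is an assumption. On real hardware the "set" would be a write to the backdrop colour, and the height of each coloured band on screen shows how long that section ran.

```python
# Model of timing code sections by changing the backdrop colour.
BLACK, BLUE, GREEN = "black", "blue", "green"


class Backdrop:
    """Stand-in for the hardware backdrop colour; records every change."""

    def __init__(self):
        self.colour = BLACK
        self.log = []

    def set(self, colour):
        self.colour = colour
        self.log.append(colour)


def run_table(routines, backdrop, colour):
    """Run a table of routines once, bracketed by backdrop colour changes."""
    backdrop.set(colour)        # band starts at the current raster position
    for routine in routines:
        routine()
    backdrop.set(BLACK)         # band ends; black from here on is idle time


calls = []
sub_table = [lambda: calls.append("sub_0"), lambda: calls.append("sub_1")]
main_table = [lambda: calls.append("main_0")]

backdrop = Backdrop()
run_table(sub_table, backdrop, BLUE)     # SUB program for this frame
run_table(main_table, backdrop, GREEN)   # MAIN program for this frame
# ...the real code would now spin until the next VBLANK.
```

The VBLANK ISR gets the same treatment with RED: set the colour on entry and restore it on exit.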

Execution times for SUB (blue) and MAIN (green)

As can be seen above, the results were both confounding and encouraging... confounding because there is no visible ISR (RED). Ordinarily that would be expected, because the ISR runs within VBLANK, but I know that isn't what happens here, at least on real hardware, and I would have expected to see quite a bit of red given that it runs at 50% on my AES.

The BLUE and GREEN bars are very encouraging though, because they show that both programs execute well within one frame, and there's plenty of headroom even if the ISR takes longer than the VBLANK period. However, I do need to split the SUB CPU code into code that should run during the VBLANK, and code that can run during a frame... not sure yet exactly how I will achieve that...
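The headroom argument can be put as scanline arithmetic. The sketch below assumes a 264-line Neo Geo frame of which 224 lines are visible, leaving 40 lines of VBLANK; the per-routine line counts in the usage lines are made-up placeholders, not measurements from the screenshot.

```python
# Frame budget in scanlines (illustrative numbers only).
LINES_PER_FRAME = 264                           # ASSUMED total lines per frame
VISIBLE_LINES = 224                             # ASSUMED visible lines
VBLANK_LINES = LINES_PER_FRAME - VISIBLE_LINES  # 40 lines of vertical blanking


def idle_lines(isr_lines, sub_lines, main_lines):
    """Lines left over in a frame after the ISR, SUB and MAIN have run."""
    return LINES_PER_FRAME - (isr_lines + sub_lines + main_lines)


def isr_overrun(isr_lines):
    """How many lines the ISR spills past VBLANK into the visible frame."""
    return max(0, isr_lines - VBLANK_LINES)


# PLACEHOLDER inputs: an ISR taking half a frame, plus modest SUB and MAIN.
spare = idle_lines(isr_lines=132, sub_lines=30, main_lines=40)
spill = isr_overrun(132)
```

As long as `idle_lines` stays positive everything still fits in one frame, but any non-zero `isr_overrun` means VBLANK-only work is happening while the display is active, which is why the SUB code needs splitting.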

Next step is to run this on my AES, and see what happens to the RED line... I suspect that MAME is not emulating the VRAM access timings for the Neo Geo sprite hardware. I guess I'll find out soon enough!
