Tonight, at the suggestion of the author of the Atari 800 Pentagram port, I thought I'd look at the masking logic. It certainly seemed like a good candidate, given the amount of time spent in the rendering routines and the fact that the masking requires two table look-ups per byte rendered.
Rather than try to optimise it, I thought I'd disable it altogether and see what effect it had on the performance. I was surprised, and a little disappointed, that it seemed to have very little in fact. Well, perhaps not completely insignificant at 4% in the 'busy' room, taking the frame rate from 8-9fps (see screenshot below) to a solid 9fps.
The infamous 'busy' screen, with fps counter |
And because I now had a metric, I disabled the Z-ordering again. I clearly underestimated the effect last time, because the performance jumped to 13fps, and a reduction in the corresponding routine (calc_display_order_and_render) from 25% all the way down to 2%.
I was now a little bummed that disabling masking and Z-order completely still resulted in 13fps, a few frames short of my target 15fps. Curious as to how this compared to the original, I fired up the ZX Spectrum and Coco3 emulators side-by-side and entered the 'busy' room on each.
To my surprise, the Coco3 (with no masking or Z-order) was significantly faster. So I re-enabled Z-order. It was still faster. So I re-enabled masking - back to the complete code - and it was still faster at 8-9fps!!! For this screen at least, I had been trying to optimise something that was already too fast!
So where does that leave the project? What I need to do now is visit a significant number of screens and compare them, side-by-side, with the ZX Spectrum original. If the Coco3 is running faster in every case, then my work here is done! In fact, I'll need to (re-enable and) tweak the throttling function so that the screens are all roughly the same speed.
EDIT: I think I'm going to back-port my fps ISR from the Coco3 to the ZX Spectrum!
Then there's the matter of beefing up the Z80 R register emulation, fixing a graphical glitch that is simply a result of not being able to emulate the Spectrum's attribute bytes, and we're done!
Livin' high and wide, one might even say!
Good to see more frequent updates from you again - and especially with news of progress in them. :-)
ReplyDeleteSoon you'll be comparin', side-by-side...
ReplyDeleteAhem. :)
How does it run in the lower CPU-speed mode?