Saturday, 13 February 2016

The fat lady hasn't started singing just yet...

I thought I'd give a brief run-down of the process that Knight Lore uses to render each frame. I won't go into any of the algorithms, but just give an overview of the steps involved.

As I've mentioned previously, Knight Lore builds and maintains a table of objects in the current room. The objects comprise absolutely everything that describes a room, from the background objects (eg walls and arches), both animated (eg guards, ghosts) and inanimate (eg. blocks, spikes) objects, special objects and of course the player himself. The table contains all there is to know about an object, including 2D and 3D positions & dimensions, sprite number, adjustments and current state. A nice architecture and trivial to add new objects with new behaviour.

The first order of business in rendering a new frame is to iterate through the object table and call the update routine - from a lookup table - for each object. The update routines vary in complexity from simply specifying pixel offset for display, to essentially being part of a larger state machine that controls movement, object life-cycle, appearance and interaction with other objects. If an object has changed position or appearance, it flags itself for wiping and redrawing.

Once all the objects have been updated with new state information, the code iterates through the object table again and builds a list (array) of objects that have been flagged for redrawing.

Next the code iterates through the above-mentioned list, looking for those objects which have been flagged for wiping. For these objects, the rectangular (dirty) area to wipe is calculated from its previous and next (2D) sprite position and dimensions, and the off-screen buffer is subsequently wiped. A list of these wiped rectangles is maintained for later blitting to the video memory (screen).

Penultimately, the code examines the list of objects to draw again, and calculates the display order (Z-order), rendering a sprite when it finds nothing else in the list obscured by it. It's then marked as drawn, and processing starts over on the list again, until all objects have been rendered. This is the clever bit, and the algorithm whose minute details still elude me.

Lastly, the aforementioned blitting can be done, resulting in the absolute minimum area (more-or-less) of the screen being copied each frame. And thusly the display is updated.

My C port faithfully reproduces the behaviour of the Z80 code, and works flawlessly on those systems I've ported it to that have bitmapped displays. And in typical fashion, I assumed I could mimic the correct behaviour on a system with a requisiste number of sprites - namely the Neo Geo - without actually devising an algorithm up-front. And if you've read this blog the last few days, you'll know that I've failed to do so thus far.

Last post I lamented on the very real possibility that it couldn't be done with only the information imparted during the rendering process I described above, by the code base as-is. However my 1-year-old had his shots yesterday so I was up for an hour last night trying to get him to sleep, giving me plenty of time to agonise over Knight Lore, and the issue at hand.

One thing I did realise, that has given me some new hope, is this; up until now I've been processing the list of objects that need to be redrawn each frame. That list comprises objects that have been affected by foreground objects but have themselves remained unchanged. And on a sprite-based system, those objects I can completely ignore!

Rather, at least with respect to objects (sprites) that may need moving and/or shuffling of display order (sprite priority), I need concern myself only with the list of objects that have been wiped! These objects have either moved, changed appearance, or disappeared altogether. And it only took about 2 minutes to modify my algorithm to prove it. So far so good.

Exactly if/how this fact contributes to a full solution to the problem I am yet to determine, and I'm not about to predict anything given my recent track record.

That aside, I've been thinking about the worst-case scenario (rendering all objects every frame) and how I'd go about optimising the rendering should I need to defer to this case. I've got a few ideas, but at this stage I don't have a feel for how the 68K would handle the extra processing, and/or whether the bottleneck exists in the sheer number of sprites to re-build each frame.

Any readers that have any theories/ideas - I'd be more than happy to hear them!

EDIT: Another little optimisation I thought about last night; not all sprites require 3x4 Neo Geo sprite tiles. In fact, a lot are considerably less. Given that writing to VRAM (sprite registers) is quite inefficient, reducing that to the minimum should increase the number of sprites that can be written per frame. I've just implemented that.

No comments:

Post a Comment