Wednesday 16 November 2022

Another dormant bug that only awakens on the Neo Geo

Generally during reverse-engineering (RE), and sometimes during transcoding, a few bugs rear their ugly heads. Some, like the one in Knight Lore, are benign on their native platform (which is why they were never found) and sit dormant until someone comes along and transcodes the game to another platform.

I've found a few relatively harmless ones thus far in Xevious, mostly initialisation bugs that ultimately don't seem to affect the game. But tonight I discovered one that was crashing the Neo Geo until I found it and coded a work-around. Interestingly, it has some similarities with the Knight Lore bug.

In Xevious, as the map is built line-by-line while it scrolls onto the screen, the SUB CPU maintains a pointer to the object table for that area. On each invocation, it compares the scroll counter with the value pointed to in the object table, waiting for a match. When they match, it's time to add the next object to the map.

The trouble is, the SUB CPU routine is running whether or not the game is playing, and in fact, it's running before the MAIN CPU even initialises either the scroll counter or the area object table pointer with actual data. Due to the RAM test/initialisation functions, both the scroll counter and the area object table pointer are set to 0.

On the arcade hardware, the code looks at the value stored at the pointer ($0000), which happens to be the first location in ROM, in this case $3E. Since the scroll counter is $00, it never matches, and in fact will never match until after both it and the pointer have been initialised properly. So the bug lies dormant.

On the Neo Geo, however, location $00 is part of a vector and, in this case, holds $00. That matches the scroll counter before it is initialised, and so the code attempts to read the object type from the area object table. Again, this is reading from low memory, and it returns an invalid object type outside the range of the function table, which subsequently causes it to jump to an arbitrary address and - crash!
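To make the difference concrete, here's a minimal C sketch of the comparison as described above. It is not the actual code (the original is Z80 assembly), and the names `SubCpuState`, `scroll_counter`, `obj_table_ptr` and `object_is_due` are all made up for illustration. The only values taken from the real systems are the byte at address 0 ($3E on the arcade board, $00 on the Neo Geo) and the zeroed counter and pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two variables left at zero by the RAM test. */
typedef struct {
    uint8_t  scroll_counter;  /* not yet initialised by the MAIN CPU: 0 */
    uint16_t obj_table_ptr;   /* not yet initialised by the MAIN CPU: 0 */
} SubCpuState;

/*
 * Returns true when the SUB CPU routine would decide that the next
 * object is due, i.e. the byte at the table pointer equals the scroll
 * counter. `mem` stands in for the platform's address space from 0.
 */
bool object_is_due(const uint8_t *mem, const SubCpuState *s)
{
    assert(mem != NULL && s != NULL);
    return mem[s->obj_table_ptr] == s->scroll_counter;
}
```

With both fields at zero, a memory image whose first byte is $3E gives no match and the routine simply keeps waiting, whereas an image whose first byte is $00 gives a match straight away and the routine goes on to dispatch on whatever junk follows.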

The bug in Knight Lore also involved a null pointer: the code was trying to write to ROM at $0000, which of course is also benign on the original hardware.

Anyway, interesting to see how many bugs still lurk in these old arcade games.

The fix wasn't particularly elegant; just return if the area object table address is null.
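In C terms, the work-around amounts to something like the sketch below. Again, this is only an illustration of the idea and not the code in the transcode itself; the function name `object_is_due_guarded` and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical guarded version of the check: bail out early while the
 * area object table pointer is still null, so nothing is ever read
 * from low memory before the MAIN CPU has set things up.
 */
bool object_is_due_guarded(const uint8_t *obj_table_ptr,
                           uint8_t scroll_counter)
{
    if (obj_table_ptr == NULL)
        return false;  /* table not initialised yet: nothing to add */

    return *obj_table_ptr == scroll_counter;
}
```

The guard changes nothing once the pointer has been initialised; it only suppresses the accidental match during start-up.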

FTR, I'm still adding the infrastructure for handling objects added from the SUB CPU as the map is constructed. There was a little more code than I had remembered. It involves nested jump tables, so it's a little fiddly, but nothing I haven't done elsewhere in the code. Interestingly, where I left off it should have crashed - but didn't!?!

