
Saturday, 24 November 2018

When a bug is not a bug

Tonight I thought I'd see if I could find the AI bug in the C code.

[To explain, the benchmark for the AI has been to run the attract screens on the target port side-by-side with the Apple II original. If the outcome of every attract screen is exactly the same as on the Apple II version, it's a pretty safe bet that the AI is spot-on, since all execution paths are likely to be thoroughly exercised across the multiple levels.]

I started comparing the 6809 and C code side-by-side for the guard AI routines. I did find one minor discrepancy (a pair of while loops that should have been do-while loops), but that didn't make any visible difference to the outcome. I also chased a few red herrings that turned out not to be differences at all; I'd obviously done some optimisations in the C code that obscured the transcoding somewhat.
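As an aside, here is a minimal sketch of why swapping a do-while for a while loop can go unnoticed. The function names and the counting logic are hypothetical, made up for illustration, and are not taken from the guard AI code. The two forms only diverge when the loop condition is already false on entry.

```c
/* Hypothetical illustration only; not the actual guard AI code. */

/* Pre-test loop: the body may run zero times. */
int steps_while(int x, int limit)
{
    int steps = 0;
    while (x < limit) {
        x++;
        steps++;
    }
    return steps;
}

/* Post-test loop: the body always runs at least once,
 * which is what a test-at-the-bottom assembly loop does. */
int steps_do_while(int x, int limit)
{
    int steps = 0;
    do {
        x++;
        steps++;
    } while (x < limit);
    return steps;
}
```

With `x = 0` and `limit = 3` both functions return 3. With `x = 5` and `limit = 5` the while version returns 0 and the do-while version returns 1. If the callers never hit that second case, the outcome looks identical either way.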

After spending a few hours not getting anywhere, and about to wrap it all up for the night, I was watching the Apple II and CoCo 3 attract modes side-by-side (again) just for a sanity check and - to my horror - noticed that they actually differed!?! This can't be right - I thoroughly tested the CoCo 3 version against the Apple II version all those years ago!!!

So I fired up the Neo Geo (C) version again so that all three were running in parallel, and to my surprise it corresponded with the Apple II version - for the first time. How could it be that I fixed a bug in the C code by comparing it against inaccurate 6809 code? This warranted further experimentation, so I ran the first attract level on the Apple II over and over again until...

... I discovered that there are two possible outcomes on the first attract screen! I had never noticed this before, and right now I can't imagine how that happens, as I thought it was completely deterministic!

[UPDATE] I've noticed that on the first run you get one particular outcome, and on all subsequent runs (e.g. warm boot) you get the other outcome. So it's likely an uninitialised variable or variable corruption.

The outcome on the first run from a cold boot

The outcome on subsequent runs (e.g. warm boot)
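To illustrate how an uninitialised variable could produce exactly this cold-boot versus warm-boot split, here is a minimal sketch. Everything in it is an assumption for the sake of the example: the `Ram` struct, the `guard_counter` field, the all-zeros power-on pattern and the outcome numbering are hypothetical, and I haven't found the real cause yet.

```c
#include <string.h>

/* Hypothetical model of RAM that survives a warm boot. */
typedef struct {
    unsigned char guard_counter; /* assumed: never reset by game_init() */
    unsigned char level;         /* reset by game_init() */
} Ram;

/* Cold boot: RAM holds its power-on pattern (assumed all zeros here). */
void cold_boot(Ram *ram)
{
    memset(ram, 0, sizeof *ram);
}

/* Game init resets the level but forgets guard_counter. */
void game_init(Ram *ram)
{
    ram->level = 1;
}

/* Runs one attract demo and returns an outcome id (1 or 2).
 * The result depends on whatever guard_counter held on entry. */
int run_attract(Ram *ram)
{
    int outcome = (ram->guard_counter == 0) ? 1 : 2;
    ram->guard_counter = 3; /* the demo leaves a non-zero value behind */
    return outcome;
}
```

Calling `cold_boot()`, `game_init()` and `run_attract()` gives outcome 1. Calling `game_init()` and `run_attract()` again without a cold boot gives outcome 2, and every later warm run stays at 2, which matches the pattern of one outcome on the first run and the other on all subsequent runs.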

But the good news is that both the CoCo 3 and Neo Geo versions behave the same way as the Apple II version. Or rather, at this point, neither has been seen to behave unlike the Apple II version; by that I mean that I haven't yet confirmed whether the C version results in two different outcomes. (It's late; that's for another time.)

To satisfy my curiosity, I'll probably have a go at trying to find the source of the non-deterministic outcomes, and then confirm that the behaviour is faithfully reproduced in the C version.

Then I think I'll FINALLY finish off the Neo Geo port, hopefully adding the circular wipe and maybe a few bells and whistles (e.g. high score save to memory card) to give it some polish, and release it for MAME and the NeoSD.

From there, who knows? Maybe I should design a cart for the CoCo 3 version...
