Monday, 14 April 2014

Down to Business

The primary reason I had put off this project is my distinct lack of Apple II knowledge. That does sound like a pretty stupid reason when one of the goals of my retro projects is to learn more about the platforms I'm dealing with, but nevertheless it remains a fact.

Apple II Lode Runner was also a relatively heavily protected (multi-stage loader) disk-based game, which suggested to me that the ramp-up time was going to be even longer than I had feared. Fortunately I discovered some time back (and then forgot about until recently) that French hackers had done a lot of the hard work in this respect, producing an 'unprotected' version of the disk that was freely available for download - and even documenting the boot code in detail! Even more fortunately, I had archived all of that information along with the disk image, as it appears to be no longer available!

So the first task was to extract the program code and data from the disk image into binary files that I could load into the IDA Pro disassembler. That was a relatively simple process. The files comprise a zero page memory dump from MESS, the data loaded during the boot phase (also dumped via MESS), and the main code/data block loaded into memory immediately before execution begins (again via MESS, but also verified against the actual disk image sectors).

The data block occupies 4KB from $0F00-$1EFF, most of which is the title screen and associated data.

The main code/data block occupies 24KB from $6000-$BFFF, of which I estimate approximately 11KB is actual 6502 code. The remainder is probably mostly graphics data, including the font used to render in-game text on the hi-res screen.
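As a quick sanity check on those sizes, and to illustrate how the separate dumps could be overlaid into a single 64KB image for a disassembler, here is a minimal Python sketch. The helper names (block_size, build_image) are hypothetical and the dump contents are placeholders; only the address ranges come from the figures above.

```python
def block_size(start, end):
    """Size in bytes of an inclusive address range such as $0F00-$1EFF."""
    return end - start + 1


def build_image(blocks):
    """Overlay (start_address, data) pairs onto a zero-filled 64KB image.

    This is only an illustration of the idea, not the tooling actually used.
    """
    image = bytearray(0x10000)  # full 6502 address space
    for start, data in blocks:
        end = start + len(data)
        if end > len(image):
            raise ValueError(f"block at ${start:04X} runs past $FFFF")
        image[start:end] = data
    return image


# Placeholder contents; real use would read the dumped binary files instead.
boot_data = bytes(block_size(0x0F00, 0x1EFF))   # 4KB data block
main_block = bytes(block_size(0x6000, 0xBFFF))  # 24KB code/data block
image = build_image([(0x0F00, boot_data), (0x6000, main_block)])
```

Both ranges are inclusive, which is why the size is end - start + 1.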

I should note that every character of every ASCII string in the code has bit 7 set, so there is no text plainly visible in the binary dump at all. Once I deduced this fact I was able to produce an alternate dump for reference, but it is definitely a mild annoyance. I have no intention of retaining this 'feature' in the ports; there's simply no reason to do so, as I suspect it was done only to hamper hacking attempts.
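Recovering the text is straightforward once the pattern is known. The following Python sketch scans a binary dump for runs of bytes that have bit 7 set and are printable ASCII once that bit is cleared. The function names and the minimum run length of 4 are assumptions chosen for illustration, and a scan like this will also report any graphics data that happens to look like text.

```python
def strip_high_bit(data):
    """Clear bit 7 of every byte."""
    return bytes(b & 0x7F for b in data)


def find_high_bit_strings(data, min_len=4):
    """Return (offset, text) for runs of high-bit-set printable characters."""
    results = []
    start = None
    for i, b in enumerate(data):
        # A candidate character has bit 7 set and is printable once cleared.
        is_text = b >= 0x80 and 0x20 <= (b & 0x7F) <= 0x7E
        if is_text:
            if start is None:
                start = i
            continue
        if start is not None and i - start >= min_len:
            results.append((start, strip_high_bit(data[start:i]).decode("ascii")))
        start = None
    # Handle a run that extends to the very end of the data.
    if start is not None and len(data) - start >= min_len:
        results.append((start, strip_high_bit(data[start:]).decode("ascii")))
    return results


# Made-up sample: two non-text bytes, a high-bit string, then a zero byte.
sample = bytes([0x00, 0x4C]) + bytes(c | 0x80 for c in b"LODE RUNNER") + b"\x00"
found = find_high_bit_strings(sample)
```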

The next task was to dive into the disassembly, to at least get a feel for the overall code/data structure and to start to earn my Apple II stripes. I have to admit that the first few hours weren't particularly fruitful, and I began to wonder if this was going to be a very short-lived project. But once I realised that the data block at $0F00 was loaded during the boot process, things started to make more sense and I managed to reverse-engineer what turned out to be the title screen display routine.

One early plan-of-attack with this type of exercise is to identify and comment the message display routine(s). Generally that's relatively straightforward and also helps to identify different sections of the code. In this case, I was left scratching my head as I couldn't find a single reference to the address of any string in the program! I did eventually locate the display routine, and discovered that the messages immediately followed each call to the routine! So the display routine uses the return address on the stack to locate the message, and then adjusts it so that execution returns to the byte immediately following the message itself. I suspect this sort of thing is quite common in 6502 code, where registers are at a premium...
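To make the mechanism concrete, here is a small Python model of the idea rather than the game's actual routine. It relies on standard 6502 behaviour: JSR pushes the address of its own last byte (high byte first), and RTS pulls that address and adds one. The zero terminator byte, the routine address $7000, and all function names are assumptions for illustration only.

```python
def jsr(stack, pc):
    """Model JSR at address pc: push the address of the instruction's last byte."""
    ret = pc + 2
    stack.append(ret >> 8)    # high byte is pushed first
    stack.append(ret & 0xFF)  # then the low byte


def rts(stack):
    """Model RTS: pull the address (low byte first) and add one."""
    lo = stack.pop()
    hi = stack.pop()
    return ((hi << 8) | lo) + 1


def print_inline(memory, stack, terminator=0x00):
    """Read the message that follows the calling JSR, then fix up the return.

    The terminator value is an assumption; the real routine may differ.
    """
    lo = stack.pop()
    hi = stack.pop()
    addr = ((hi << 8) | lo) + 1           # first byte after the JSR
    chars = []
    while memory[addr] != terminator:
        chars.append(chr(memory[addr] & 0x7F))  # strings have bit 7 set
        addr += 1
    # addr now points at the terminator. RTS adds one, so pushing this
    # address resumes execution at the byte just after the message.
    stack.append(addr >> 8)
    stack.append(addr & 0xFF)
    return "".join(chars)


# Usage: JSR $7000 at $6000, followed by "HI" (bit 7 set) and a terminator.
memory = bytearray(0x10000)
memory[0x6000:0x6003] = bytes([0x20, 0x00, 0x70])
memory[0x6003:0x6006] = bytes([ord("H") | 0x80, ord("I") | 0x80, 0x00])
stack = []
jsr(stack, 0x6000)
text = print_inline(memory, stack)
resume = rts(stack)
```

In this model the caller never passes a pointer at all; the return address doubles as the string pointer, which is why no string addresses show up as operands in the disassembly.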

One last note on what I've observed so far... there appears to be a reasonable amount of self-modifying code - mainly addresses poked into the operands of subsequent instructions. Again, I suspect this is quite common in 6502 code.

That's about all I've reverse-engineered so far, and at this point I decided to stop and set up my development environments for both the TRS-80 and Coco3. More about that next post.


  1. That "data after call" trick is used in 8080 + Z-80 Microsoft BASIC. They have a subroutine which compares the next character against the parameter and throws a syntax error if they don't match. Their motivation is code size. "call synchk; db '#' " saves a byte over "ld a,'#'; call synchk". They squeeze even more bytes out of it by using one of the 8 "rst" calls for it.

  2. Cool, thanks for the heads-up! Donkey Kong (also) makes judicious use of RST instructions for often-called routines.