Saturday, 23 September 2017

Star Wars. Software Developer Kit.

Firstly, although I have not been working on Asteroids lately, I would like to reiterate that the Apple II and Coco3 ports of Asteroids are still very much alive. In fact, I'm about to return to the Coco3 port - which is about 50% complete - very soon.

So, what's up with the Star Wars screenshot in the last blog post?

It's probably obvious that it's not a genuine emulated Star Wars screen shot; in fact it's not Star Wars running at all. It is actually a crude AVG emulator - based on my Asteroids DVG emulator - 'executing' a handful of select AVG ROM routines. So... why? ... you may ask.

I was approached recently by someone who was looking for a '6809 guy' that might be interested in doing some development work on the Star Wars hardware in order to encourage/facilitate development of new games for the platform. That development work may or may not comprise code examples, demos, tutorials and even an 'SDK' if you like for the platform.

That someone also happens to be working on a related project for which any such software development may be useful to assist in testing/debugging.

Well that piqued my interest and although I didn't intend on interrupting my Asteroids projects, curiosity got the better of me and I started to not only look at the disassembly, but also learn how the AVG and various other hardware components operate. That turned out to be quite a bit of fun and only served to draw me further into the investigation.

The 6809 disassembly, at a whopping 48KB (and banked just to complicate matters) was definitely daunting but I was making decent progress none-the-less and by the time I had written my crude AVG emulator, I felt I was ready to start writing code from scratch to run on the arcade hardware.

And thus the Star Wars SDK was started. After a few examples writing vectors, calling ROM routines and reading buttons, I started looking at the math box, whose implementation has been well documented (and of course emulated) at the low level, but I could find no descriptions of the higher-level functions. I chose a few well-used functions to RE and I finally had a working example using the 3x3 matrix multiply function which I used for a simple 2D rotation of a square.
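As a rough illustration of the idea (not the math box interface itself, which isn't shown here), a 2D rotation of a square can be expressed as a 3x3 matrix multiply by rotating about the Z axis. The sketch below uses floating point for clarity; the function names are made up for the example, and the real hardware works in fixed point.

```python
import math

def mat_vec_3x3(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-element vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def rotation_z(angle_rad):
    """3x3 matrix for a rotation about Z, i.e. a 2D rotation in the X/Y plane."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

# Corners of a square centred on the origin, with Z fixed at 0.
SQUARE = [(-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0), (-1.0, 1.0, 0.0)]

def rotate_square(angle_rad):
    """Rotate every corner of the square by the same matrix."""
    m = rotation_z(angle_rad)
    return [mat_vec_3x3(m, list(p)) for p in SQUARE]
```

Calling rotate_square() with a slightly larger angle each frame, and drawing vectors between consecutive corners, gives the spinning square.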

Example code running on Star Wars hardware under MAME

More disassembly and I turned my attention to the sound ROM. I soon had example code that played through all the sounds in the game. I decided to complete the sound ROM disassembly - as far as practical at least - and reached this point tonight. The TMS5220 routines are completely commented, and I've made educated guesses for the higher-level POKEY routines; they get very complicated at the register level. Everything else in the ROM has been RE'd, and that's as far as I'm taking it at this point. See the Project List & Downloads page for the disassembly.

As for the SDK sound functions, it should be sufficient to provide a 'source' version of the sound ROM code that can be edited with new sound data and re-assembled. Anything more would be a mammoth task in re-writing the TMS5220 and Quad POKEY routines for no good reason (other than copyright of course).

Some sound-related trivia; the game has ~60 pre-canned sound effects, the data for all of which are stored in the 6809 sound ROMs. 22 of those are samples from the movie played via the TMS5220. Another 20 are sound effects played via POKEY 1 & 2, and the remaining 11 are tunes played on POKEY 3 & 4 (actually, just 3 in reality I believe). Playing a sound from the main CPU is as simple as writing a single byte - the sound number - to the mailbox register. Some TMS5220 sounds (random sound bites during gameplay) will only be played if the TMS5220 queue is empty - otherwise they are silently discarded.

There's still work to do on the SDK, mainly handling yoke inputs and RE'ing the remaining math box functions and coming up with (more impressive) examples that use them - like spinning 3D wireframe models. Aside from that though, there's probably enough info now to write a full-blown game from scratch on the platform.

I don't intend on doing much more in the way of commenting the main ROM disassembly, aside from sections that may assist in my understanding of the remaining SDK functions. I would like to perhaps know a little more about the states (routines) in the main state machine though. And FTR most of the banked code appears to be routines that use the math box, and they are pretty obfuscated, so I won't be commenting much of them.

For now, I aim to get back to Asteroids on the Coco3 before I forget it all. And no prizes for guessing what the next target platform will be...

Friday, 8 September 2017

More vectors!

Been busy with work, investigating a new project, and being ill.

Just finish Asteroids already!!!

More to come...

Tuesday, 29 August 2017

Thrust

I'm starting to get into territory that I haven't fully reverse-engineered yet, which makes debugging the 6809 port just that little bit more difficult. On the plus side, it forces me to understand the original code and therefore I can subsequently go back and comment the arcade disassembly.

Most recently I've been adding the code for thrust and as a result once you start a game, your ship appears and you can move around.

video


One nuance of porting is the distinction between zero-page accesses and memory accesses. Looking at the original 6502 source listing, there's no indication which is which. On the 6809, all labels for direct page variables are .EQU statements, and the operand prefix is '*'. If you forget the asterisk, the code assembles but doesn't work as planned. Somewhat fortuitously though, the way I have the memory map configured, you'll get stray pixels on the top few lines of the video, and that's easily trapped in the MAME debugger since - atm - Asteroids never renders there (see below).

The other issue I touched on last post is the vertical resolution. After some experimentation in the C port, which is rendering 'vectors' in 1024x1024 coordinate space, I've confirmed that Asteroids uses approximately 788 "lines" of the display space, effectively leaving the top and bottom 118 lines (or 10% each) blank. When you reduce the resolution to 192 pixels, that 20% blank space is quite significant.

Unfortunately 192 doesn't quite divide into 788 nicely, and 192*4 (768) crops the score and copyright messages. The Coco3, however, conveniently has a 200-line mode, which would greatly simplify the scaling (right-shift by 2) and allow use of most of the display. The only issue is that the graphics were, IIUC, originally designed by Norbert for a 192-line display. I'll have to experiment to see if and how they could be adapted for a higher resolution.
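The arithmetic behind that is easy to check. The following is only a sketch of the numbers quoted above (788 visible lines, a right-shift by 2); the helper names are invented for the example.

```python
def scaled_lines(arcade_lines, shift):
    """Display lines needed after scaling by a right-shift (rounded up)."""
    return (arcade_lines + (1 << shift) - 1) >> shift

def fits(arcade_lines, shift, display_lines):
    """True if the scaled picture fits on the target display without cropping."""
    return scaled_lines(arcade_lines, shift) <= display_lines

ARCADE_LINES = 788  # approximate visible vertical extent quoted in the post

needed = scaled_lines(ARCADE_LINES, 2)       # lines needed with a shift of 2
crops_192 = not fits(ARCADE_LINES, 2, 192)   # does a 192-line display crop?
fits_200 = fits(ARCADE_LINES, 2, 200)        # does a 200-line display fit?
```

With a shift of 2, coordinates 0..787 map to 0..196, so 197 lines are needed: too many for 192, but comfortably inside 200.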

But back to porting code for now...

UPDATE: Tonight I ported the code that handles all the collisions. I need to update my disassembly comments in a few areas here, as the routines actually handle all collisions between all objects, whilst my comments suggest it's only the collisions between shots and other objects.

There's a decent chunk of code involved, so not surprisingly it doesn't yet work. Worse yet, I've actually broken what was working before, in the process of 'fixing' a few bugs that I discovered porting the new code. It's too easy to forget that the 6502 X,Y registers are 8-bit vs 16-bit on the 6809...
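A tiny model of why the register width matters, assuming only that a decrement wraps at the register's width. The function names are hypothetical and flags are ignored entirely.

```python
def dex_8bit(x):
    """Decrement an 8-bit index register (6502 X or Y): wraps 0x00 -> 0xFF."""
    return (x - 1) & 0xFF

def dex_16bit(x):
    """Decrement a 16-bit index register (6809 X or Y): wraps 0x0000 -> 0xFFFF."""
    return (x - 1) & 0xFFFF

# The same 'decrement from zero' lands on very different values, so any
# ported code that relies on 8-bit wrap-around needs explicit masking.
wrapped_6502 = dex_8bit(0)
wrapped_6809 = dex_16bit(0)
```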

Looking purely at object (binary) code size, the porting is now roughly 50% complete.

Saturday, 26 August 2017

The 6502 can be a BIT weird.

Quick update.

One thing I like about Asteroids is that it's completely deterministic. Each time you turn it on, it will behave in exactly the same way, until you start messing with the controls. Makes it easier to test my ports...

I was looking into why the movement of the first saucer in attract mode was seemingly mirrored on the Coco3 port. With the above-mentioned in mind, it came down to a BIT instruction, followed by a BVS (branch on overflow set) which was branching on the 6502 but not on the 6809. In my haste I (obviously) didn't pay careful enough attention to the descriptions in the 6502 and 6809 Zaks instruction references and decided they operated in the same way, at least as far as the V (overflow) flag was concerned.

Turns out the 6502 transfers bit 6 of the memory operand into the V flag, whilst the 6809 simply clears the V flag. I'm struggling to relate the 6502 behaviour to a real-world operation. Followed by a BVS, the 6502 code is effectively testing bit 6 of the memory operand (which incidentally is one byte of a 16-bit pseudo-random number) without having to load the value first into the accumulator. It of course simply required an extra instruction on the 6809.

Moving on, you can now start a game and (hacked) code to render the player ship is in place. However it turns out that the game simulates the player coming out of hyperspace when the wave starts (to enable the logic that waits for a clear break in the asteroids) and I've yet to code up the associated routines on the Coco3. That'll have to wait for another night.

Starting a game after coining-up

I have noticed, doing the IIGS, C and Coco3 ports, that the scores and copyright messages don't extend to the top and bottom of the screen respectively as the MAME emulation of the arcade game suggests they should. It has puzzled me somewhat, but now I believe I know the answer. The DVG operates on 10-bit coordinates, so I've been basing everything on a 1024x1024 display, and scaling accordingly. And to note, it's relatively easy to scale down to 256x192.

However, I was reading today about an FPGA emulation of Asteroids Deluxe, and it mentioned the vertical display resolution was in the vicinity of 800 (I need to look it up again) which would explain the discrepancy. The issue of course, is that scaling becomes more of a chore, i.e. inefficient. But I do seem to recall that Norbert's scaling was different to mine... so I need to go back and study his again, as well as the MAME emulation of the DVG.

UPDATE: Both the FPGAArcade page and the Asteroids MAME driver suggest the vertical resolution is 788. Note that 192*4 = 768. Damn... can we get away with it though? OTOH the Coco3 does have a 256x200 video mode...

I also did some work on the C port yesterday lunchtime, adding the saucer to bring it in line with the 6809 port. Interestingly it has the same bug - or more accurately the same side-effect - as the Coco3 version, but for a completely different reason!

Thursday, 24 August 2017

Coco, Xevious and... Atari Jaguar now!?!

A few different topics today.

Firstly, the Apple II/IIGS version is temporarily on ice, but rest assured I will get back to it in the not-too-distant future. The current status is that it's all-but complete - if quite flickery - but I haven't figured out banking for direct page and stack registers yet, which is preventing the use of PEI slamming for the IIGS. I (also) plan on returning to legacy hires mode, and doing an accelerated II/IIC+ version, but this will probably come later.

In the mean time I started on the Coco3 port whilst waiting for some assistance on the aforementioned IIGS issue, and as a result I'm now on a roll and loath to put it aside. Looking purely at code volume, I'd estimate it's about 30% complete now, with all text, asteroids and saucer rendering complete (saucer movement is a bit off). You can coin-up but not a lot else happens.
video


Rendering is far from optimal; the aim of the exercise is to complete the porting of the arcade 6502 core with the simplest and smallest amount of graphics data required.

Oh and together with a possible Vectrex port (proof-of-concept at least) I'm also considering a port to the arcade Star Wars hardware - another 6809-based vector platform!

The C port will probably progress in step with the Coco3 port, and they will probably both be used to debug the other at various points. Again, that's probably for Neo Geo and Amiga.

And now for something completely different...

I don't recall exactly how, but I recently stumbled across reports of a port of the arcade game Xevious to the Atari Jaguar. From what I can gather, it's close to completion and will be released on cartridge for sale. It is particularly interesting to me because Xevious is my all-time favourite arcade game! I do have some knowledge of its internals, as I have in the dim dark past done some preliminary reverse-engineering for the purposes of both software (MAME) and hardware (FPGA) emulation. In both cases I was beaten to the punch by someone else, but it's not for nought as it is knowledge that I plan to use again one day.

There's scant information on the technical details of the port; I have no idea to what extent - if any - the original arcade code has been reverse-engineered, nor whether the port involves any sort of emulation (doubtful) or how faithful the Jaguar code is to the original. In any case, I'd be very interested in learning about the process, and getting my hands on any RE work already done. Not sure how likely any of that will be. For now, I plan on buying the cart.

Of course I started to look into the specifications of the Jaguar, and the resources and toolchains available for homebrew development. To my surprise the homebrew scene is quite active and, compared to similar platforms, the output is quite prolific - and that is due in no small part to the comparatively large number of Atari ST games ported to the platform!

As it turns out, a decent number of Atari ST games lend themselves to being quite easily patched to at least run on the Jaguar, and the architecture of the console - with no fewer than 3 decent CPUs - gives it the capability to screen-scrape the (largely incompatible format of) ST video memory!

[This is of course exactly what I have recently done with Asteroids on the Apple IIGS; patching as little as two instructions allows the core 6502 code to run on the IIGS, and the processing of the display list is perfectly analogous to screen-scraping!]

Developing for the Jaguar appears to be a tad more complicated than other platforms, though it is suggested that most homebrew development primarily commandeers the 68K 'management' CPU and the pair of purpose-built GPU/DSP devices are under-utilised. Also the toolchain involves the installation of a complete Ubuntu distribution - a far cry from simply having a makefile and AS6809.EXE and ASLINK.EXE in your path for the Coco3, for example!

Of course now I want to get my hands on some Jaguar hardware (not to mention Xevious!) Unfortunately - not unlike other platforms - the so-called SkunkBoard (basically a flash cartridge designed for homebrew developers on the Jaguar) doesn't appear to be currently in production. It's going to be an expensive foray...

The retro gaming scene is alive and never ceases to amaze me!

Wednesday, 16 August 2017

IIGS is no slam-dunk. Coco3 is looking good though...

Slow going on the Apple IIGS front, but I've finally figured out my half-screen issue when shadowing is enabled. After banging my head against a brick wall for a few days, I decided to look at the MAME driver source for the IIGS and immediately found the issue. Because the legacy Apple II hires screen memories overlap the SHR screen memory, you also need to disable the respective bits for those in the SHADOW register. And voila - a blank screen!

Now I've hit another snag. I was intending to use so-called PEI-slamming to update the modified areas of the SHR screen, which requires that the direct page and stack are located in BANK1. However, I can't make any sense of the soft switches that control the bank for these and the current state doesn't make sense either. After putting out a call for help I've had a couple of responses but I'm still none-the-wiser.

Somewhat discouragingly, I did a quick experiment double-rendering the frame (one with shadowing disabled, the other enabled) and it's running a bit slower than the arcade game. Granted PEI-slamming will help, but it's going to be tight!

Whilst I'm struggling with IIGS technical issues I got impatient and started on the Coco3 (6809) port. Thus far I have a skeleton main loop that initialises the object table and adds the extra lives to the display list. I've coded up a skeleton rendering routine and fleshed out the CUR command and the tokenised command to display an extra life. It's far from optimal but it works!

First rendering on the Coco3

It's been mostly cut-and-paste from 6502 until this point. Again, the one thorn is the indirect indexed addressing mode of the 6502 but, unlike Lode Runner, it's not as extensively used in Asteroids. Other than that, there's endianness to be mindful of and I've actually swapped the byte order of words in the display list for more efficient rendering. I think this port is going to fall out relatively easily!

Saturday, 12 August 2017

Half the screen in shadow!?!

Not a lot of progress, or time to work on it.

I did, however, take preliminary steps towards eliminating flicker. The plan is to utilise video shadowing, as described in this article here, by one of the authors of the IIGS port of Wolfenstein 3D.

Currently, I've got shadowing permanently enabled and simply write to bank $01, which is shadowed on-the-fly to the SHR memory in bank $E1. Slow and flickering, but no trickery to implement and of course easier to debug. But now it's time to up my game.

So in preparation, I simply changed my initialisation routine to disable shadowing, rather than enable it. As expected, I now get a blank screen as writes to bank $01 are not copied to the SHR screen at all. So far so good.

Next step is to disable shadowing at the start of every frame render. Since I've never actually enabled it anywhere, I should still get a blank screen - right? However, when I run it, I now see the top half of the screen! And it happens in both MAME and GSPort, so it's not likely to be an emulation bug.

My first thought was Alternate Display Mode, which can be activated from the Control Panel. I can only access it in GSPort, but it's already turned off. Toggling the value regardless doesn't make any difference. And for good measure, I also disabled interrupts.

It has me completely stumped. I've fielded some queries out there but for now, no responses.

With a little more spare time tonight, but no path forward on the IIGS port, I returned to the C port. I fixed the object update and now the asteroids move as expected in attract mode. Next I added some inputs, so you can coin-up and start a game. The next bit will be tricky, rendering the player ship, as that's quite involved.

I don't really have to go through the exercise of emulating the DVG anymore; with my tokenised display list I can simply back-port that to the C version and it'll be sufficient for all platforms that it'll be running on. However I figured there wasn't too much more work to do, and it may help in understanding some nuance at some point (eg. exploding ship), so I'll persist for now.

Hopefully soon I'll learn of my stupid mistake and I can get back to finishing off the IIGS port...

Tuesday, 8 August 2017

Offsets!

I've been digging into Norbert's Atari 800XL Asteroids Emulator and trying to ascertain if/where offsets are used for various objects. To re-iterate; the position of an object is relative to the first point in the draw list of vectors that comprise the object. OTOH the position of a bitmap is (always) relative to one corner of the bounding rectangle. So in theory, displaying bitmaps in place of vector objects should require an offset for each object.
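In principle, such an offset can be derived by walking the object's draw list and finding where the corner of the bounding rectangle sits relative to the starting point. The sketch below assumes a simplified draw list of relative (dx, dy) moves with Y increasing upwards; the function name and data format are invented for the example and are not how the arcade or Norbert's code stores things.

```python
def bounding_box_offset(vectors):
    """Return the top-left corner of the bounding rectangle, relative to the
    object's position (the point where the beam starts drawing).

    vectors: list of (dx, dy) relative moves, Y increasing upwards.
    """
    x = y = 0
    xs, ys = [0], [0]          # the start point itself is part of the path
    for dx, dy in vectors:
        x += dx
        y += dy
        xs.append(x)
        ys.append(y)
    return min(xs), max(ys)
```

A shape drawn symmetrically around its start point gets a non-zero offset, whereas one that starts at its own top-left corner gets (0, 0).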

I had luck with the player ship and (all) shots. The ship appears to be a lot closer to its true location now, as evidenced by the much reduced incidence of asteroids merely passing close-by and destroying your ship. Shots are definitely much more accurate - if not perfect - as hitting the smaller asteroids requires you to, well, actually hit them!

In order to maintain the current performance, the offsets are coded directly into the compiled sprites, rather than simply offsetting the object's coordinates and thus requiring additional calculations. It did unfortunately necessitate one extra conditional branch in the ship rendering dispatcher because of an odd-valued X offset, but there was no avoiding it.

Characters don't appear to have any offsets applied, as far as I can see. As for asteroids & saucers - eek! There's a few tables with small values that could feasibly contain offsets (one, for example, is indexed by asteroid shape and size) but I'm not sure I'm going to be able to reverse-engineer exactly how it all works as it looks to my untrained eye that it starts to get into display lists. And those routines appear to be working directly on the vector hardware 1024x1024 coordinates as well.

To further add a spanner in the works, played against the Atari version there do still appear to be some inaccuracies between ship and shot and ship and asteroid, albeit subtle. They'll be fun to track down...

Anyway, I'm happy with the progress thus far, and to be honest, it's probably not going to be noticeable to any but the hard-core Asteroids experts out there... and they're not likely to be playing it on the IIGS I wouldn't think. I will look further into it, and perhaps it's a good excuse to crack open Norbert's C64 version which might be a little easier (for me) to follow.

However, I might actually move on now to the ship explosion and then flicker and sound before returning to this issue. I did notice tonight that with only a few small asteroids on the screen, the game is definitely running way too fast, so I'll look at throttling as well.

Discussions on the IIGS FB page tonight have me interested already in another project, though I would only consider it after I've done a proper feasibility study and I have collaborators. I have to admit that I even loaded up the ROM in IDAPro and had a quick look at the start-up code in my lunch break.

But at this stage it's still only a slight possibility and I still have other plans for Asteroids before I put it to bed.

Monday, 7 August 2017

5 lines of code to fix 3 issues!

With the wife wanting some company in the lounge room tonight while she did some crochet, I turned on the TV for some background noise (Jupiter Ascending - what an absolute FX overload) and fired up the laptop to get some of the easier of the remaining tasks out of the way.

To this end I defined the bitmaps for two extra characters, period and underscore, and updated the arcade code to print those rather than render them with discrete vector commands. The period is used in the high score list display, and the underscore during high score entry. Straightforward.

Next was the 'rubbish' that is left on the screen after coining-up and starting a game. I thought I'd found the culprit and fixed it. Although the subsequent game appeared to be remedied, in later games it reappeared. Back to the drawing board on that one.

Finally, despite the dipswitches being hard-coded for 3 ships/game, the 2nd and subsequent games all start with 4 ships. Found that one rather quickly; the code does an LSR on the (read-only) hardware location, branching on a Carry condition. Unfortunately not so benign when that hardware location is emulated in RAM... loading into A and then shifting was a simple fix.
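The effect is easy to reproduce in a toy model. This is a sketch only: memory is a plain bytearray, the address is a placeholder for illustration, and the two functions stand in for "LSR on memory" versus "load into A, then shift".

```python
SWITCH_ADDR = 0x2800  # placeholder address, for illustration only

def lsr_in_place(mem, addr):
    """Model of LSR on a memory operand: returns carry and writes back.
    On real read-only hardware the write is ignored; in RAM it sticks."""
    value = mem[addr]
    carry = value & 1
    mem[addr] = value >> 1
    return carry

def load_then_shift(mem, addr):
    """Model of LDA followed by LSR A: the shift happens in the accumulator,
    so the emulated hardware location is left untouched."""
    return mem[addr] & 1
```

With the switch location emulated in RAM, the first lsr_in_place() call returns the right carry but corrupts the value, so the second game sees a different setting.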

Somewhat interestingly, the game appeared to run faster on my laptop. Not sure if that was my imagination, a different version of MAME, different OS, or something else. Makes me even more interested in seeing it run on a real IIGS...

Aside from the above-mentioned rubbish, that only leaves the exploding ship and proper (accurate) alignment of the sprites before I tackle proper game speed throttling, flicker and sound. I might tackle alignment next, because that has the most detrimental effect on game play atm.

Oh and another thing I've forgotten about; 2-player mode. Trivial, but another task on the list.

Friday, 4 August 2017

VCF West Preview

I've had a kind offer to demonstrate Asteroids for the Apple IIGS at VCF West so today I added a quick text-mode splash screen.

Last-minute splash screen and loading bar (mid-load)

Datajerk's c2d utility allows you to display a text splash screen whilst the game is loading. It requires a dump of $400 bytes from text memory which of course on the Apple is a dog's breakfast. So I whipped up a quick C program that would allow me to easily layout my screen and then write out a binary dump compatible with Apple II text screen memory.
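For the curious, the interleaving looks something like this. It's a Python sketch of the same idea rather than the C program mentioned above; the function names are made up, the 64 unused "screen hole" bytes are simply left as zero, and text is forced to upper case with the high bit set for normal video.

```python
TEXT_BASE = 0x400
ROWS, COLS = 24, 40

def row_address(row):
    """Start address of a text row in Apple II text page 1."""
    return TEXT_BASE + 0x80 * (row % 8) + 0x28 * (row // 8)

def build_text_dump(lines):
    """Lay out up to 24 lines of text as a $400-byte text page image."""
    dump = bytearray(0x400)                      # screen holes stay zero
    for row in range(ROWS):
        text = lines[row] if row < len(lines) else ""
        text = text[:COLS].upper().ljust(COLS)   # pad each row with spaces
        offset = row_address(row) - TEXT_BASE
        for col, ch in enumerate(text):
            dump[offset + col] = (ord(ch) & 0x7F) | 0x80  # normal video
    return bytes(dump)
```

Writing the result of build_text_dump() to a file gives a binary image laid out the way text page 1 is in memory.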

Some eye-candy from the latest build.

End of game - attract mode

High score list


Will be interesting to gauge the reaction of attendees. Unfortunately it's still pre-alpha so there's glitches and flickering, but it was a last-minute offer.

Thursday, 3 August 2017

Closer to an Alpha Release!

Quick update; all of the IIGS code is pure 16-bit now except for two routines - the DVG CUR handler and the support routine that calculates SHR addresses and is only called from there. They need a complete overhaul and merging into one routine. On reflection there might just be enough savings to be had to be noticeable...

All of the rendering is done and optimised (including the small saucer and thrust) except for the exploding ship, plus I need to add data for 'dot' and 'underscore' characters for the high score entry & display. And that's it in terms of optimisation, unless I can coax more out of the hardware by using shadowing and/or blanking to my advantage. It's still around 18% faster from my crude calculations.

For an alpha release I'll rework the CUR routine, add the last pieces of the missing rendering, fix the object alignment, and add a crude text-mode splash screen. I'll tackle the flicker & sound after the alpha is out there.

Closer...

UPDATE: The DVG CUR routine - and hence all IIGS-specific code in the main execution loop - is now 16-bit and optimised. I replaced the x160 calculation with a table look-up, whose instantiation was facilitated in no small part by CA65's .REPEAT command (nice!)
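The table idea, sketched in Python rather than CA65: one precomputed entry per scanline replaces a multiply by 160. This assumes the standard SHR layout (200 lines of 160 bytes starting at $2000) and 320 mode with two pixels per byte; whether the real table folds in the base address is a guess, and the names are invented.

```python
SHR_BASE = 0x2000        # start of SHR pixel data within the bank
BYTES_PER_LINE = 160
LINES = 200

# One entry per scanline, built once (the job a repeat directive does at
# assembly time), so no multiply is needed at run time.
LINE_ADDR = [SHR_BASE + y * BYTES_PER_LINE for y in range(LINES)]

def shr_address(x, y):
    """Byte address of pixel (x, y), assuming 320 mode (2 pixels per byte)."""
    return LINE_ADDR[y] + (x >> 1)
```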

My latest crude calculation suggests it is ~28% faster with 4 large asteroids on the screen. After the alpha release I'll schedule rendering strictly to the arcade frame rate and see if it can keep up. Having said that, there's no guarantees that the arcade game maintains that frame rate either - there is leeway in the code to skip a frame or two before collapsing in a heap!

Tuesday, 1 August 2017

A shadow of a doubt!?!

I've added the last of the compiled sprites - the only outstanding graphics now are the ship's thrust and the exploding ship. The former I will probably tackle next.

Aside from fixing the saucer rendering, I've been cleaning up - and in the process optimising - the rendering and erase dispatch routines. Originally they were switching back and forth between 8- and 16-bit mode; all the rendering and erase routines themselves are now pure 16-bit code and the dispatchers are almost there. I also shaved some cycles off the asteroid render/erase dispatchers by streamlining my tokenised 'asteroid' instruction.

The DVG CUR handler does some Apple IIGS video calculations and stores values for use by subsequent render/erase routines. It needs a good overhaul - converting to pure 16-bit, doing away with a few values that aren't needed, and adding another that will enable further optimisation of the render/erase routines. However there's not a huge amount of cycles to be saved per frame.

Incidentally, I was studying some IIGS code online and found the switch to change the border colour, so it's now black and does improve the aesthetics somewhat.

Finally, I activated SHR shadowing and expected to see performance gain. Nada. I'll have to go back and study (again) what that actually does and how it is of benefit (if any) in my case.

As for remaining optimisations, there's not a lot else I can come up with atm. The so-called stack-blasting technique isn't really suited to this situation, nor is moving DP to the video memory. They're generally more suited to larger objects and opaque layers. I'd say it's not going to get significantly faster than what it is now, unless shadowing changes something.

One last task I forgot about; there appears to be a requirement for an adjustment factor for the objects rendered as bitmaps. This is due no doubt to the difference between the vector objects having arbitrary 'origins' (defined as the starting point of the beam) versus the origin of a bitmap always being one corner of the bounding rectangle. With luck I can deduce the required values from Norbert's code.

Edging closer to something suitable for an alpha release just to give people a taste of the game...

Monday, 31 July 2017

Location, location, location!

Tonight I completely re-arranged the memory map, compressing all the areas together at the bottom of memory, and also finally did away with the DVG ROM at the same time. The 6502 arcade ROM now resides at $2000 (as opposed to $6800) and together with the IIGS code extensions - including the compiled sprites - now extends to "just" $7055. Plenty of space to finish off the rendering routines now.

I forgot that the English (only) messages are actually stored in the DVG ROM, so when I eliminated the last stray write to sound hardware and got the game running, all was well except for the messages, which were rubbish. It finally twigged and I copied the English message tables into my core asteroids code module (together with the sine table) and it's all working as before.

High Score entry

And all this is now irrefutable proof that my Asteroids "source" code is fully relocatable, including all the ROM, RAM and hardware I/O location references! And now simply changing one assembler directive, for example, I could move it above the Apple II hires screen memory pages if I ever attempt a IIC+ version.

On another note altogether, someone asked me the other day whether I've had to decrease the frame rate. Thus far, the answer is 'no', and I'm hoping that won't change. It did remind me, however, that Norbert's emulator only renders every third frame! That could probably be improved somewhat now with my core, which eliminates all unnecessary display list calculations and operations.

And so onwards with the rendering and optimisations...

Sunday, 30 July 2017

Compiling a to-do list

Real Life has been intervening but I have been chipping away regularly at Asteroids.

I've now got compiled sprites for the bulk of the rendering, and all of the erase routines. And it finally runs faster than the arcade game, albeit only about 18% faster atm. I do still have some optimisations to do - but I also have to remove the flicker and add sound.

To generate code for the compiled sprites (and erases) I simply processed my bitmap .asm file in a quick 'n dirty C program. Not the most efficient tool but... old dog, new tricks...

Explaining exactly how my compiled sprites work is actually quite simple. Instead of looking up sprite data in a table and rendering it to the screen in a loop, I have one routine for each and every sprite that simply loads the sprite data into the A register as immediate operands and then writes that to the display using absolute indexed mode, with the X register containing what is effectively the video address of the sprite.

In the case of the IIGS, the bottleneck isn't actually the execution overhead of the loop or the amount of data per se, but rather the number of video accesses required since these are rather slow. Where the compiled sprites make the most improvement in this case is that they are only writing pixels that are set, unlike a non-discerning loop which writes the entire bounding rectangle, set or not. And with the asteroid sprites in particular, where the pixels are either relatively sparse, or the asteroid itself is often a lot smaller than the bounding rectangle, the improvements can be significant.

Aside from skipping zero-pixels (or rather words of pixels), there's also no need to OR the value $FFFF to the display, so there's another video (read) access saved. And a mate had a good suggestion that I sort the values within each sprite and use Y as a temporary store (where appropriate) to save a few more cycles!
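For the curious, here's a minimal sketch of the sort of generator described above. It is not my actual quick 'n dirty tool; the function name, label format, mnemonic layout and the 160-byte stride constant are all assumptions for illustration. It emits one load/store pair per non-zero word, skips zero words entirely, and avoids reloading A when consecutive non-zero words share a value (a simpler cousin of the sorting suggestion). It also assumes every word can be stored outright, ignoring the read-modify-write that partially-set words would need.

```c
#include <stdio.h>
#include <stdint.h>

#define SHR_STRIDE 160  /* bytes per SHR scan line (assumed layout) */

/* Emit a compiled-sprite routine for a sprite of w_words x h_lines
 * 16-bit words. Zero words produce no code at all. Returns the number
 * of stores emitted. Label and operand format are illustrative only. */
int emit_compiled_sprite(FILE *out, const char *label,
                         const uint16_t *data, int w_words, int h_lines)
{
    int stores = 0;
    long loaded = -1;  /* value currently held in A; -1 = nothing yet */

    fprintf(out, "%s:\n", label);
    for (int y = 0; y < h_lines; y++) {
        for (int x = 0; x < w_words; x++) {
            uint16_t v = data[y * w_words + x];
            if (v == 0)
                continue;              /* nothing set: no video access */
            if (loaded != (long)v) {   /* reload A only when value changes */
                fprintf(out, "    lda #$%04X\n", (unsigned)v);
                loaded = v;
            }
            /* offset from the sprite's video address held in X */
            fprintf(out, "    sta $%04X,x\n",
                    (unsigned)(y * SHR_STRIDE + x * 2));
            stores++;
        }
    }
    fprintf(out, "    rts\n");
    return stores;
}
```

Calling it with a 2x2-word sprite containing two non-zero words produces two stores instead of the four a bounding-rectangle loop would perform.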

For certain sprites I didn't bother with some optimisations... for characters there's still a look-up table of data, since during the game there are only a few digits on the screen and little saving to be had in the character data itself. And for small sprites I simply erase the entire bounding rectangle since there are only 14 words to write.

So now, instead of look-up tables of sprite data, I have look-up tables of sprite rendering routines. The erase routines are similarly optimised; only those pixels that were set are erased, and of course it's sufficient to simply write $0000 to those words. The down side of course is increased memory usage - quite a bit in fact - and I've just hit the requirement to re-arrange my memory map because of it. Not surprising since up until this point I've left all the arcade RAM, ROM & hardware addresses in their original locations. That should be trivial to change with the 'source code' of course and there's still plenty of space left on the Apple so no danger of running out anytime soon.

Tonight though, before re-arranging the memory map, I thought I'd tackle the 'crash' at the high score entry routine. Turns out it wasn't a crash at all, but rather a combination of a bug on my part on the one hand, and yet-to-be-modified display code on the other; the core code was in fact running perfectly fine.

My rendering routine wasn't setting the start of the display list buffer correctly; when the high score message was being printed the display list exceeded 256 bytes and the MSB of the address incremented. I subsequently set that as the start of the buffer and so it only rendered what was in the 2nd page of the list - the last few words of the message.

Next issue was that I hadn't updated the routine that printed the initials as you entered them, so nothing showed up when changing letters. And finally, to select a letter it was debouncing the hyperspace key; having mapped that only to the Apple II keyboard you could never trigger the selection of a letter. So I mapped it to the 2nd joystick button and it fixed that issue.

So now you can coin up, play a game at (slightly more than) full speed, enter your initials on the high score screen, and start again.

I still have a few niggly bugs. There's an initialisation issue that sometimes results in garbage on the screen and non-zero velocity for the player. And starting a second game you have four lives instead of three. I'm sure they won't be difficult to track down, and possibly even the same bug.

And aside from bugs, there's re-arranging the memory map, adding compiled sprites for the player ship (already generated, I just ran out of space temporarily), adding the 'thrust' pixel, displaying the correct saucer size, and rendering of the exploding player ship. I also need to add special characters for 'underscore' (high score entry) and 'dot' (high score list). Lastly then, remove the flicker, add sound and a main menu screen.

Sunday, 23 July 2017

IIC, or not IIC, that is the question:

Whether 'tis nobler in the mind to suffer
The Apple II hires video memory map,
...

Due to an SOS call from my wife whilst I was en-route to WOzFest (car trouble) I ended up missing the brief link-up with KFest. By the time I arrived, people had well and truly broken off into small groups to work on their own projects, which meant I didn't get the chance to see it running on real hardware.

I did get a chance to do a little more work on it though (despite heckling from the Peanut Gallery - you know who you are) and have now got pixel-shifted graphics rendering. Although somewhat hampered by the flickering graphics, on close inspection it is definitely animating more smoothly now.

I also decided to go down the path of so-called compiled sprites, reasoning that it wouldn't be very difficult to write a C program to parse my ASM bitmap data file to produce the requisite code. I've got one or two minor optimisations to effect first, and then I'll give it a spin. If that doesn't make a marked improvement, I'll be at a bit of a loss in terms of how to proceed further. As a first-pass I'll opt not to use stack-blasting and see where that gets me.

After chatting to a few learned fellow attendees at WOzFest it became apparent that the 4MHz IIC+ would be another good candidate for a port - even more capable than the IIGS in fact - with a faster CPU (same video memory bandwidth) but with a monochrome graphics mode meaning only 1/4 of the graphics data to push around. I'm fast running out of excuses to keep avoiding the legacy Apple II hires video display...

Thursday, 20 July 2017

Game On!

Excellent progress today - in fact it's in good enough shape to demo at WOzFest now although it would be nice to take it along even further if I get the chance!

The bulk of the rendering (sans exploding ship and thrust) and the erasing has been done. I've also managed to pilfer the joystick read routine from Lode Runner and as a result the game is actually playable, albeit slightly slower than the arcade original at this point.

video


I've still got some optimisations up my sleeve, from simple changes to the display list entry format through to stack-blasting hand-compiled sprites, so I'm still holding out hope that I can get it running fast enough to require being throttled by a IIGS interrupt. And I still haven't worked out the whole video shadowing mechanism; a bug in my code meant that I was never shadowing the SHR screen in the first place - and now when I turn it on, it can't read the keyboard... so there's that to play with as well.

I guess if all else fails I can revert to the legacy hires screen which is a lot less data to move.

From the video it's obvious I've got some graphics tweaks to do, including bit-shifting plus offsets from CUR for each object. There's flickering of course, and the odd glitch and then the minor matter of a complete crash at the end of the game.

As an aside, I got stuck on the joystick not working in the IIGS emulation under MAME. I could move left/up, but centering the joystick read back as $FF, so I couldn't move right/down. That forced my hand in trying to get the disk booting in GSPort, and I eventually realised the floppy disk image should be in slot 5, not slot 7. However GSPort had the same issue...

Then it finally twigged; the routine was written for a 1MHz machine and was running on a 2.8MHz machine. The counter was overflowing before even the centre position was detected! After slowing down the CPU it all started working!

So - finish off the graphics, tweak the display positions, optimise the erase/rendering, fix the Game Over bug. Then add sound, title screen, and release! I've got - realistically - only one more night to work on it before its debut.

Wednesday, 19 July 2017

IIGS Take 2

The experimental work I'd done on the IIGS port prior to starting on the port proper has certainly paid off; as of tonight it's rendering the characters, lives, copyright, asteroids and player ship (and the latter only when it should be). I should note that I'm yet to generate or code for the bit-shifted graphics.

It may look no different to the previous version, but the display list is greatly simplified - effectively tokenised - and all the dead code has been removed from the 6502 core, giving me more headroom on the IIGS. And I've still got a little more optimisation to do in my rendering routines.

IIGS Asteroids Take 2 - optimal display list

When I next get the chance I'll continue with the saucer, the shots and the shrapnel which should be as straightforward as the other objects have been until now. That'll just leave the exploding ship, which I am yet to work on at all.

I think that'll be a good point to research reading the IIGS keyboard; I've been encouraged by reading vague suggestions that it's possible to read the IIGS keyboard directly from the ADB. If that pans out I'll be able to make the game playable, if slow.

Then it'll be time to work on the bit-shifted graphics and erase logic (right now it's clearing the entire screen every frame). That should bring the game back up to speed and I'm hoping it'll actually then be too fast!

I'll be happy if I get to this point by the weekend for WOzFest Slot 7!

Beyond that, there'll be the addition of the exploding ship, sound (samples), support for variable beam brightness, and spit & polish and bells & whistles, such as a title screen, joystick/paddle support etc etc.

Tuesday, 18 July 2017

A token effort

I have to admit, I haven't been able to tear myself away from the C port to make any further progress on the IIGS port. However, it hasn't all been for nought as it has definitely reinforced my understanding of the arcade code, and cemented my decision regarding tokenising (optimising) the display list for the 8-bit ports.

Before I get to that; most of the work on the C code has been 'infrastructure work' and low-level DVG interface routines, which necessarily support both the new abstract display list and the original in parallel - to facilitate debugging and development. What that leaves, then, is the game logic and housekeeping code which generally tends to be easier to translate to C; the upshot of all this is that I don't think the C port is going to take very long to complete at all!

[Just for the record, I have the C port rendering all the text, including scores, and rendering and animating the asteroids themselves. The pseudo-random number generator is also in lock-step with the arcade machine and produces the same output at the appropriate times].

Keep in mind that the arcade code is only 6KB of 6502 - a lot of that munging 16-bit numbers - and it's not surprising that the C port isn't huge. From memory, Knight Lore was ~12KB of Z80 code and translated to ~5K lines of C. I'm around 1,300 lines for Asteroids already, and you could estimate it'll be in the vicinity of 2,500 lines.

Getting back to the IIGS (and 8-bit) ports; aside from the existing CUR (which sets the current beam coordinates) and HALT display list commands, there'll be a distinct command for the rendering of each object in the game, comprising character, extra ship, copyright, asteroid, ship, saucer, shot, shrapnel and exploding ship. I may add one last command to set the brightness - something the arcade code does but Norbert doesn't bother with - simply because the IIGS has the palette to support some variance in brightness.

Some of those commands will have one or two parameters, but all will render at the current beam position. The parameters will be succinct and optimised for the bitmap display routines. What this means is that I can actually remove a lot of code that generates the display list content that is irrelevant for the port, such as DVG subroutine calls or component vector commands. This is one area where I'll be able to improve performance over Norbert's emulators, only because I effectively have the arcade 6502 source that I can modify and re-assemble at will.
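To make the idea a little more concrete, here's a rough sketch of what a tokenised list along these lines could look like. Everything in it is hypothetical: the token values, the parameter counts and the walker function are my own placeholders for illustration, not the format the port ends up using.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical token numbering: one command per object type. */
enum token {
    TOK_HALT, TOK_CUR, TOK_CHAR, TOK_EXTRA_SHIP, TOK_COPYRIGHT,
    TOK_ASTEROID, TOK_SHIP, TOK_SAUCER, TOK_SHOT, TOK_SHRAPNEL,
    TOK_EXPLODING_SHIP, TOK_BRIGHTNESS, TOK_COUNT
};

/* Total bytes per command, token included (assumed sizes). */
static const uint8_t tok_len[TOK_COUNT] = {
    [TOK_HALT] = 1,
    [TOK_CUR] = 3,            /* token + pre-computed 16-bit video offset */
    [TOK_CHAR] = 2,           /* token + character code */
    [TOK_EXTRA_SHIP] = 1,
    [TOK_COPYRIGHT] = 1,
    [TOK_ASTEROID] = 2,       /* token + shape/size index */
    [TOK_SHIP] = 2,           /* token + direction */
    [TOK_SAUCER] = 2,         /* token + size */
    [TOK_SHOT] = 1,
    [TOK_SHRAPNEL] = 2,       /* token + pattern */
    [TOK_EXPLODING_SHIP] = 2, /* token + frame */
    [TOK_BRIGHTNESS] = 2      /* token + level */
};

/* Walk the list up to HALT and count the drawable objects, i.e.
 * everything except CUR and BRIGHTNESS. A real renderer would
 * dispatch to a per-token routine at the same point. */
int count_objects(const uint8_t *list, size_t len)
{
    int objects = 0;
    size_t i = 0;

    while (i < len) {
        uint8_t t = list[i];
        if (t >= TOK_COUNT || t == TOK_HALT)
            break;
        if (t != TOK_CUR && t != TOK_BRIGHTNESS)
            objects++;
        i += tok_len[t];
    }
    return objects;
}
```

The point of the fixed per-token length table is that the renderer never has to decode vectors or follow DVG subroutine calls; it just steps from one object to the next.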

I've also identified which of the bitmaps will and won't require bit-shifting, and which will require an extra byte's width to do so. Because, for example, all of the game's text message coordinates are fixed, specified on a 0-255 grid (before being scaled-up in the display list), and also happen to have even X coordinates, I don't have to bit-shift any of the character set for the IIGS 4BPP (two pixels per byte) SHR graphics!

Most of the remaining bitmaps will require bit-shifting, and a few - not all - of those will require an additional byte's width to facilitate it. But that simply boils down to an extra compare and load for each object rendering, unless I need to really wring the performance out of the rendering routines.

My next task now is to generate shifted bitmap data, which is trivial, and essentially start over from scratch with the IIGS port. I'll probably have to stub out all the routines that write to the display list, and then begin work on the so-called tokenising version. None of that should be too difficult...

[UPDATE: I've regenerated the 6502 ASM file from my disassembly, starting the IIGS port from scratch. All of the DVG write routines have been stubbed-out so that only the CUR command is now written to display list. Next is tokenising the character command and then rendering it on the IIGS.]

As for the erasure; I'm planning on (eventually) making use of the ping-pong display list buffer. Immediately before rendering the new list, I'll simply re-parse the old buffer and use it essentially as dirty rectangles. I do have more sophisticated optimisation possibilities up my sleeve; it's useful to know, for example, that all objects are written to the display list in a fixed order. I'll leave all that, however, until I need it - if ever.

Wednesday, 12 July 2017

Asteroids with a 'C'

On Friday nights my wife & I traditionally watch a show together with which I've become rather bored in recent times. Rather than waste that hour last week I decided to set up the laptop in front of the TV and work on some aspect of Asteroids that required a minimum level of concentration. Ultimately I decided to start work on the C port of Asteroids, mainly because it required a lot of crank-the-handle type coding up front before any real work was required. Like defining data structures for zero page variables and player RAM.

Aside from the aforementioned, I managed to also code the main routine and stubs for all the subroutines called from there. Then over the next few nights I was keen to take it a little further; implementing a rather more 'abstract' display list to aid not only in development and debugging, but also to facilitate the so-called tokenising I'd be doing in the 8-bit ports. That entailed a DVG 'disassembler' of sorts which subsequently morphed itself into a DVG interpreter/emulator which was soon rendering a few vectors on the display.

Of course time is ticking for WOzFest and I do need to bite the bullet on the tokenised display list and optimisations for the IIGS. However it has been a very useful exercise and I've discovered a few subtleties of the DVG which had escaped me until now. Regardless, I really need to put it aside for now and continue on with the IIGS port. In the mean-time, here's a sample rendering of what I have thus far.

Asteroids C port (Win7, GCC, Allegro)

Like my other C ports (Lode Runner and Knight Lore), the C code is as faithful to the original assembler source as practical, whilst optimising aspects of the original code such as using 16- and 32-bit variables rather than multiple bytes for things like addresses, scores, coordinates, etc. I retain all the same subroutines with the same names, albeit adding parameters for values passed in registers, etc. The logic within each routine is representative of the assembly code, differing only to accommodate the aforementioned optimisations and/or clarify the intent, without changing the underlying algorithm or compromising accuracy.

The end result is the same as the 8-bit assembler ports; a game that plays exactly - and looks as far as practical on the target hardware - the same as the original. And as I've discovered in the past, I've even been able to debug aspects of the assembler ports on the C version! In the case of Asteroids, I think the ability to inspect the display list so easily will come in handy down the track.

The C version should be portable to the Amiga and the Neo Geo at the very least. For Lode Runner the C port was an after-thought of the Coco3 (6809) port, but for Knight Lore, I developed it in parallel with the Coco3 port and it was, as I mentioned, very helpful. This time 'round, I'm undecided how I'll proceed once the IIGS port is finished...

Friday, 7 July 2017

To SNES or not to SNES?

No opportunity for any development today but time to ponder random aspects of the project. I was also prompted by gp2000 to look a little further into specific aspects of the code, and discovered something that should have been obvious from the start, but escaped me until today - so thanks George for that inadvertent trigger!

I did tweak some of the coordinate transformation and video address calculations today, converting my 6502 code into 65816 and improving the resolution of some of the calculations. Always good to see a half-page of 8-bit code reduce to a few lines of 16-bit code!

And in the comments of a previous post I pondered the feasibility of porting this to the TRS-80 Model 4. Aside from the effort of porting to yet another CPU (Z80) there's also the fact that the hires board is all-but-crippled by not only port-mapping the hires video memory, but also restricting access to (vertical?) blanking periods. George suggested a hybrid mode mixing the text and hires graphics screens... very interesting but a lot of work none-the-less. I'll put this in the 'maybe' basket.

And on the subject of alternate ports, the SNES sprang to mind! I know little about the technical specifications except for the fact that it is powered by a 65816 (clone). A quick Google reveals it supports 256x224 resolution, allows 128 sprites (up to 32/line) and has the usual tilemap(s).

I'm thinking this would be a no-brainer; text would appear on the tilemap layer, with 27 asteroids, player ship, saucer and 6 shots making up a maximum of 35 sprites on-screen. Extremely unlikely that they'd all appear on the same scan line, but if I was really pedantic about it I could implement a software priority scheme. But with all the arcade 6502 code running, plus the bulk of the IIGS 65816 code available, it wouldn't be a lot of work at all. I'm going to put this in the 'almost certainly' basket, and I might be tempted to tackle it immediately after the IIGS port is done.

EDIT: Doh! It's already been ported to the SNES by Digital Eclipse!

[Makes me wonder if I should be porting that version to IIGS!?!]

That's about it for random musings. A parting fact: whilst the vector display coordinates range from 0-1023, the game's virtual playfield coordinates actually range from 0-8191. Somehow that escaped me... now consider it's all scaled down to 256x192... or in the case of the TRS-80 text mode graphics, 128x48 (128x72 if I get really tricky).

Thursday, 6 July 2017

Use the source, Luke!

In my third blog update for the day, I can report that I've all-but-finished the reverse-engineering of the arcade Asteroids 6502 code.

Aside from temporary storage, all zero page and player RAM variables have been documented. There are no variable addresses remaining for which I do not know the purpose.

About 95% of the code has been commented. There's some particularly nasty code in a few places throughout the ROM that remains uncommented at this stage; aside from some physics there's, for example, the exploding ship routine - which seems unnecessarily complex in my opinion.

Importantly, I know what all the code is meant to achieve, even if I don't understand the nitty-gritty of every line in some isolated cases. It's something I'll probably have to rectify once I start transcoding to 6809 and/or C, but for now I'm satisfied that I have a well-enough commented source file on which to base my official Apple IIGS port.

From here I need to re-generate the core .ASM file and re-apply my patches for running on the IIGS. Since I annotated those patches in IDAPro, it should only take about 10 minutes before it's running again with the new source. And thereafter, I can start modifying the code 'for real' this time, including optimising for performance and incorporating pixel-shifted bitmaps.

It'll probably be a few days before I have anything rendering again, and the still screen shots will probably look no different to those I have posted already. The video should look a lot better though...

Wednesday, 5 July 2017

Great minds think alike (or fools never differ).

Interesting to dissect Norbert's Atari800XL Asteroids emulator.

The aforementioned patches to the rendering routines actually implement an alternate display list, of sorts. For all the (alpha-numeric) characters and the extra ships, Norbert adds an entry to his own display list, using the character code directly (no reverse-lookup on JSR address required). He also assigns character code $FF for the DVG CUR command, and inserts the pre-scaled Atari display coordinates. This is essentially what I had in mind for 'tokenising' the display list to optimise for the Apple IIGS.

As described in my last post, the emulator hooks the main loop and calls out to three (3) subroutines.

The first routine is (as I subsequently discovered) the rendering routine and it only renders the display every 3rd call. It does something with self-modifying code that I'm yet to reverse-engineer, before rendering the asteroids directly from the player status RAM area. Next is rendering the player ship or explosion, depending again on the player status - something I'm actually doing now as a 'quick hack'... not so much of a hack as it turns out! And as I suspected, the relative coordinates (offset) of the thrust pixel are stored in a lookup table and plotted discretely. After that, the saucer is rendered, and then the shots (saucer and player), before the alternate display list (characters) is finally rendered. At the end of the routine it appears to handle the high score entry, and then mess with ANTIC registers - and I'm way out of my depth here!

I've missed the copyright message in there somewhere, but perhaps it's done at startup and never deleted from the ANTIC display list? Not worth pursuing further since it's not relevant to the IIGS or likely any other hardware I'll be porting to.

The second hook routine emulates the inputs, and the third the sound.

And as I suspected, when the player status RAM bank switch is hit (changed), the emulator simply swaps 256 bytes between $200 and $300.

So what will I be taking away from this?

I like the idea of the alternate display list, though perhaps with the arcade 'source' it'll be easier for me to simply re-purpose the DVG shared RAM. Certainly it would appear that 'tokenising' the display list is the way to go. I would also eliminate all the dead code that makes up the current display list. And not having to iterate over player status RAM - essentially for the 2nd time each frame - should speed things up a little too.

I'll use Norbert's lookup table for the thrust, but instead use it to pre-render a 2nd set of bitmaps for the player ship. Again, not having the extra look-up and calculations to render a single pixel will increase performance further.

I should also be able to find the ship explosion bitmap(s) in there somewhere, if I can navigate the eccentricities of the Atari 800XL display hardware!

Standing on the shoulders of giants...

Undecided on how best to proceed with the remaining rendering tasks, today at lunchtime I downloaded Norbert's (Atari 800XL) emulator and fired it up in MAME, intending to plot the 'thrust' pixel in each of the 24 player ship bitmaps based purely on observation.

I've documented 12/24 but not surprisingly, it got old pretty quickly and, curiosity getting the better of me, I dumped the first 16KB of the Atari's memory into a binary file and loaded it up into IDAPro.

Before transferring control to the arcade code, Norbert's emulator patches a bunch of addresses in the ROM. Aside from those critical for running on non-Asteroids hardware (ie. the same patches I made) it also patches routines such as the 'CUR' (current) DVG command, character display, display of extra ships, etc.

It also installs a hook in the main game loop. The hook itself calls three subroutines; one to read the Atari joystick inputs and seed the memory-mapped Asteroids inputs, one to play the sounds based on the memory-mapped outputs, and the third I'm yet to ascertain.

Most importantly though, I'm still yet to determine how the emulator goes about rendering the display. From what little I've seen, the display list is somewhat 'corrupted' by the patched routines. However there are other unpatched routines that must still provide data to the emulator via the display list - so I'm not sure how it all works just yet. The waters could also be further muddied by the Atari 800XL's unique display hardware... I'm sure I'll work it all out next session.

Tuesday, 4 July 2017

Ship-shape!

I converted Norbert's ship data to my IIGS format and checked out exactly what he has rendered. There are 24 bitmaps in total, covering 360 degrees of rotation.

That is contrasted with 64 different renderings of the player ship in the arcade game. However again, at this resolution, it'd be pointless to attempt to render that many different bitmaps.

One interesting thing to note is that the player ship direction is stored as a single byte, the value varying the full range of 0-255 to represent 360 degrees. Each tap of the rotation changes the direction by +/-3, which means that coming full circle, you don't actually end up at 0 again, but rather at 255 or 2 first time 'round, and 254 or 1 next time. Not terribly important, because your direction is effectively right-shifted by 2 bits to determine which ship to render - IOW each tap of the rotation button does not necessarily change the ship rendering.

In the IIGS case, the direction needs to be scaled from 0-255 down to 0-23, i.e. multiplied by 24/256 (= 3/32). That is equivalent to right-shifting by 4 and by 5 and adding the results, although the resolution of the operation needs to be increased (easily done in 16-bit mode using the XBA instruction) to get all 24 outcomes.
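A quick C model of that shift-and-add shows why the extra resolution matters. The function names are made up for this sketch, and moving the byte into the high half of a 16-bit value is only my C stand-in for what XBA achieves on the 65816; the real routine may well be arranged differently.

```c
#include <stdint.h>

/* Naive 8-bit version: (dir >> 4) + (dir >> 5).
 * Each shift truncates separately, so some frames are skipped and
 * the result never reaches 23. */
unsigned ship_frame_8bit(uint8_t dir)
{
    return (unsigned)(dir >> 4) + (unsigned)(dir >> 5);
}

/* Higher-resolution version: place dir in the high byte first, do the
 * same two shifts and add on the 16-bit value (giving dir * 24), then
 * take the high byte, i.e. floor(dir * 24 / 256). */
unsigned ship_frame_16bit(uint8_t dir)
{
    uint16_t wide = (uint16_t)(dir << 8);
    uint16_t sum = (uint16_t)((wide >> 4) + (wide >> 5)); /* dir*16 + dir*8 */
    return sum >> 8;
}
```

With the 8-bit version, directions 16-31 give frame 1 and 32-47 give frame 3, so frame 2 is never selected; the 16-bit version steps through every frame from 0 to 23.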

Unfortunately identifying the ship in the display list is practically impossible. Rather than work out how/where to patch the arcade code, as a quick hack I simply added a routine at the end of the DVG emulation code to (always) render the player ship. Fortunately the arcade code always ends the display list with a CUR command corresponding to the center of the screen, so at least it's always rendered there and the game can be 'played' (more-or-less) as long as you don't use the thrust button!

Player ship is now rendered... sort of...

I should add that I'm yet to implement the 'thrust' indicator on the player ship; Norbert hasn't supplied me with any rendering details for this but from a quick look at his emulator video, it looks like a single pixel is illuminated for each bitmap - I just need to work out exactly which pixel!

That really just leaves the player ship explosion, another list of component vectors patched by the arcade code as it is copied to the display list. Again, there won't be any way to identify it in the display list, so I'll likely extend my quick hack to detect when the ship is exploding and render it there; enough to ensure my rendering algorithm is correct.

And that should be everything that needs to be rendered! From that point on, it's a matter of producing pixel-shifted bitmaps where required, updating the rendering routines to use them, and then finally optimising it all to eliminate the flicker and get it running at full frame rate. There'll be some use of the IIGS interrupts to throttle the frame rate, and of course hooking up proper keyboard/joystick/paddle controls and adding a fancy menu. Unlikely it'll all happen before WOzFest, but I should have a decent demo by then at least!?!

(C)1979 ATARI INC

Picked some low-hanging fruit in my lunch break today; rendering the copyright message at a more appropriate size.

Although Norbert didn't supply the source data for the copyright message, it was a trivial matter to load a screen shot from his emulator page into a graphics editor, crop the message, reduce the colour depth and save it as a Portable Bitmap File - a text-based format perfectly suited to turning into assembler source data statements.

And while I was at it, I centered the screen on the IIGS display. Again, trivial, since screen accesses are all performed via an index register relative to the start of SHR memory - a constant defined in my IIGS .inc file. Simply adjusting the constant by 4 lines and 32 pixels ($290) was sufficient to center the display for each and every rendering routine.
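The $290 figure falls straight out of the screen geometry. A small worked sketch, with constant and function names that are mine rather than those in my .inc file, and assuming the 160-byte line stride and two pixels per byte of the 320-wide SHR mode:

```c
enum {
    SHR_WIDTH  = 320,  /* SHR pixels per line */
    SHR_HEIGHT = 200,  /* SHR lines */
    SHR_STRIDE = 160,  /* bytes per line, two pixels per byte */
    GAME_WIDTH  = 256, /* rendered playfield */
    GAME_HEIGHT = 192
};

/* Byte offset to add to the start of SHR memory so that a 256x192
 * playfield sits in the middle of the 320x200 display. */
unsigned centre_offset(void)
{
    unsigned lines  = (SHR_HEIGHT - GAME_HEIGHT) / 2; /* 4 lines   */
    unsigned pixels = (SHR_WIDTH - GAME_WIDTH) / 2;   /* 32 pixels */
    return lines * SHR_STRIDE + pixels / 2;           /* 640 + 16  */
}
```

That is 4 x 160 = $280 for the lines plus 32 / 2 = $10 for the pixels, giving $290.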

Centered and a less obtrusive copyright message

Now for the player ship...

Monday, 3 July 2017

A shot in the dark

Shots, as it turns out, are rendered in the display list as zero-length vectors with scale=7 and maximum brightness and thus can be uniquely identified.

So I simply added a check for such in my DVG emulation code and now have shots being displayed for both player and saucer. I added some crude keyboard mappings for fire and left/right rotate, and I can coin up, start a game and take aim at asteroids and destroy them.
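The check itself amounts to very little. Here's a sketch in C operating on an already-decoded vector; the struct and its field names are hypothetical (my emulation code works on the raw display list words), and the maximum brightness of 15 assumes a 4-bit brightness field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of a DVG vector command. */
typedef struct {
    int16_t dx, dy;      /* signed vector deltas */
    uint8_t scale;       /* per-vector scale */
    uint8_t brightness;  /* beam intensity */
} dvg_vec_t;

#define SHOT_SCALE     7
#define MAX_BRIGHTNESS 15  /* assumption: 4-bit brightness field */

/* A shot shows up as a zero-length vector at scale 7 and full
 * brightness, which nothing else in the list matches. */
bool is_shot(const dvg_vec_t *v)
{
    return v->dx == 0 && v->dy == 0 &&
           v->scale == SHOT_SCALE &&
           v->brightness == MAX_BRIGHTNESS;
}
```

Anything that passes the test gets plotted as a single pixel at the current beam position.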

Player's ship is yet to be rendered, but asteroids can still be destroyed

That leaves player ship and player explosion. The latter consists of component vector commands copied and patched from the DVG ROM routine. In theory, they are the only two remaining objects in the display list, and it may yet be possible to distinguish between the two... something I need to experiment with in order to confirm. It would be really, really nice if I didn't have to patch the original game - even if just for this exercise - and be able to render all the game graphics!

But next task is to get Norbert's player ship bitmaps converted and displayed in the correct orientation.

Sunday, 2 July 2017

Where's the kaboom? There was supposed to be an earth-shattering kaboom!

It turns out that, as suggested on the Computer Archeology page on the DVG ROM, Asteroids does indeed use the global scale in the animation of the explosion. In fact all-up there are 21 different frames of animation of the explosion, all based on the 4 shrapnel pattern routines in the ROM.

Understandably though, Norbert appears to make do without scaling at all, using 4 patterns as-is. In truth, at 256x192 resolution 21 frames of animation of exploding particles is overkill, and half the frames would probably look the same anyway. The shrapnel bitmaps, like the other objects, are confined to 16x16 pixels and likely don't render quite as large as the shrapnel on the original, but it's not noticeable at all except perhaps for the number of frames each pattern persists for. Either way it doesn't affect game play in any way.

Throughout the original animation, the global scale is changed in 6 steps from 11 thru 15 and finally to 0. What I do is simply ignore the shrapnel pattern number and instead use the global scale to display pattern 0, 1, 2, 2, 3 & 3 since the latter scales are displayed for more frames (somewhat realistically the explosion slows down).
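Expressed in C, that mapping is just a small lookup on the global scale. The function name and the -1 return for scales outside the explosion sequence are my own choices for this sketch.

```c
/* Map the DVG global scale in effect during the explosion to one of
 * the four shrapnel bitmaps. The scale steps 11, 12, 13, 14, 15, 0
 * select patterns 0, 1, 2, 2, 3, 3 respectively. */
int shrapnel_pattern(unsigned global_scale)
{
    switch (global_scale) {
    case 11: return 0;
    case 12: return 1;
    case 13:
    case 14: return 2;
    case 15:
    case 0:  return 3;
    default: return -1;  /* not part of the explosion sequence */
    }
}
```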

Kaboom! A saucer hits an asteroid.

Next I'll look at the (saucer) shots, and see if they can be unambiguously identified in the display list... perhaps via their brightness and/or vector length??? If not, it's time for some 'less benign' changes to the original code!

Friday, 30 June 2017

Asteroids come in all shapes and sizes

I added the code for the various asteroid sizes; that was a simple matter of checking the global scale value in effect and adjusting the asteroid bitmap table index accordingly. That entailed 12/19 'rocks' in Norbert's source file.

As for the remainders, turns out they actually comprise the small/large saucer, 4 shrapnel (explosion) patterns, and one bitmap of the player ship at '0 degrees' rotation. Since there only appears to be a single function in the DVG ROM to display the saucer, I'm assuming that the global scale is used for the 2nd saucer - but I need to verify that.

But for now, I've added the large saucer and now I'm at the exact point where I left off the text version. But it's a good indication of how the final game will look on the IIGS. I'll likely retain the 256x192 display area, but center it on the IIGS 320x200 SHR display.

All 3 sizes of asteroids and large saucer

It should be straightforward to render the shrapnel patterns next. Although the DVG notes online suggest the global scaling factor may also be used for explosions, I can't see where that's the case when I run the arcade emulation, and certainly Norbert is not scaling them at all.

I think that just leaves the ship, shots and ship explosion. As I've mentioned in a previous entry, these (mostly) manifest themselves in the display list as component vectors, and it's not possible to differentiate the actual objects from them alone. Again, at this point I will need to decide how to optimise the process - either tokenising the display list or bypassing the list altogether.

I've also had a few more thoughts on the erasure. I'm thinking dirty rectangles is going to be easiest to implement and I'm hoping fast enough. Each time I render an object, I'll add the coordinates and dimensions of the bounding rectangle to a list. When it's time to wipe the frame, I'll iterate through the list and wipe the rectangles. After all, this is what Knight Lore did...
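The dirty rectangle idea can be sketched as follows. This is a Python model under assumed names, with one byte per pixel and no clipping, not the eventual 65816 implementation:

```python
WIDTH, HEIGHT = 256, 192   # display area used in this port

def make_frame():
    """One byte per pixel keeps the sketch simple (the real screen is 4BPP)."""
    return [bytearray(WIDTH) for _ in range(HEIGHT)]

dirty = []  # (x, y, width, height) of everything drawn this frame

def draw_object(frame, x, y, bitmap):
    """Draw a bitmap (list of rows) and remember its bounding rectangle.
    Assumes the bitmap lies fully inside the frame."""
    h, w = len(bitmap), len(bitmap[0])
    for row in range(h):
        for col in range(w):
            if bitmap[row][col]:
                frame[y + row][x + col] = bitmap[row][col]
    dirty.append((x, y, w, h))

def wipe_dirty(frame):
    """Erase only the rectangles touched since the last wipe."""
    for x, y, w, h in dirty:
        for row in range(y, y + h):
            frame[row][x:x + w] = bytes(w)
    dirty.clear()
```

The cost of the wipe then scales with the number and size of objects on screen rather than with the 32KB screen.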

Wednesday, 28 June 2017

Shifty operations and bitmaps!

Graphics! Norbert Kehrer was kind enough to send me some of the graphics data from his emulator - notably the character set, the player ship, and the asteroids. Given that his emulators run in 256x192 mode, I thought I'd start with the same to enable me to use his bitmaps as-is. Well, I did have to convert from 1BPP to 4BPP mode, writing a small C program to process Norbert's ASM source file.
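The conversion itself is simple enough to sketch. The version below is in Python rather than C, and it assumes the leftmost pixel is the most significant bit of each source byte, that the left pixel of each 4BPP pair sits in the high nibble, and that set pixels use colour index 15 (all three are assumptions for illustration):

```python
def expand_1bpp_to_4bpp(row_bytes, colour=0xF):
    """Expand a row of 1BPP data (8 pixels/byte) to 4BPP (2 pixels/byte)."""
    out = bytearray()
    for byte in row_bytes:
        # Take the source bits two at a time, leftmost pair first.
        for shift in (6, 4, 2, 0):
            left = colour if (byte >> (shift + 1)) & 1 else 0
            right = colour if (byte >> shift) & 1 else 0
            out.append((left << 4) | right)
    return bytes(out)
```

Each source byte becomes 4 output bytes, so a 16-pixel-wide row grows from 2 bytes to 8.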

At this stage I haven't bothered with bit-shifted data - being 4BPP there's only 2 copies anyway - it's enough to see what it's going to look like. Since the arcade Asteroids works with a 1024x1024 coordinate system, I first had to scale down to 256x192. And it's worth noting here that 192=128+64, which means scaling down Y can be done with shift & add operations only.
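To spell out the arithmetic: Y*192/1024 = Y*(128+64)/1024 = Y/8 + Y/16, and X*256/1024 = X/4. A minimal Python sketch (function names are made up, and any flipping of the Y axis is left out):

```python
def scale_x(x):
    """1024 -> 256: divide by 4 with a single shift."""
    return x >> 2

def scale_y(y):
    """1024 -> 192: y/8 + y/16, using shifts and one add only."""
    return (y >> 3) + (y >> 4)
```

Because each shift truncates separately, the result can be one less than the exact value of Y*192/1024, which is harmless at this resolution.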

And somewhat inconveniently, the IIGS SHR screen is 160 bytes wide, so to find the video address of a coordinate, you need (Y*160 + X/2). Fortunately, 160=128+32, so again shift & add operations are sufficient. These calculations are generally only required when the display list contains a command to set the current coordinate (CUR). And at the risk of bragging, my scale and address calculation routine actually worked first go! Of course that's more than offset by all the stupid bugs I had doing trivial stuff.
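The address calculation reduces to two shifts and two adds. In Python, as a sketch only (the function name is hypothetical, and the result is an offset from the start of SHR pixel memory rather than a full address):

```python
def shr_offset(x, y):
    """Byte offset of pixel (x, y): y*160 + x/2, with 160 = 128 + 32."""
    return (y << 7) + (y << 5) + (x >> 1)
```

On real hardware this offset would be added to the start of the SHR pixel data, which begins at $2000 within the video bank.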

First task was getting the characters displayed. Rather than use more calculations to find the character data address, I simply use a table of addresses. The routine simply renders 7 lines of 2 words at the current address, then adds 4 bytes to that. It's not perfect because there's no shifted data, but it's close.
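A Python sketch of that character routine, assuming a 160-byte scanline stride and using a dummy one-entry glyph table (the table contents and all names are placeholders, not Norbert's data):

```python
STRIDE = 160      # bytes per SHR scanline
CHAR_ROWS = 7     # lines per character, as described above
CHAR_BYTES = 4    # 2 words per line

# Hypothetical glyph store: one dummy glyph, 7 rows of 4 bytes.
GLYPH_DATA = bytes([0x0F, 0xF0, 0x0F, 0xF0] * CHAR_ROWS)
# Table of glyph start offsets, indexed by character number.
CHAR_ADDR = [0]

def draw_char(screen, addr, char_index):
    """Copy one glyph to screen memory; return the address of the next cell."""
    src = CHAR_ADDR[char_index]          # table lookup, no multiplication
    for row in range(CHAR_ROWS):
        dst = addr + row * STRIDE
        screen[dst:dst + CHAR_BYTES] = GLYPH_DATA[src:src + CHAR_BYTES]
        src += CHAR_BYTES
    return addr + CHAR_BYTES             # advance 4 bytes (8 pixels)
```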

It's worth noting that Asteroids uses several different font sizes by changing the global scale factor in the DVG. However Norbert hasn't emulated this behaviour, evidenced by the relative sizes of the score and high score text. Presumably his copyright text is a single purpose-rendered bitmap. I'm undecided at this point whether I'll follow suit.

Next task was getting some sort of representation of the asteroids themselves on the screen. Norbert's file had 19 bitmaps labelled as 'rocks'; I was expecting 4 asteroids in 3 sizes each, or a total of 12 asteroids. But for the moment I'm only rendering each of the 4 asteroids as the largest size and I'll have to investigate what the last 7 bitmaps actually represent at a later date (perhaps shrapnel?)

Lastly, there needs to be some sort of mechanism to wipe data from the previous frame. At 4BPP the SHR screen is 32KB and too big to wipe completely every frame. However, for now, that will have to suffice, so the video is very flickery, and quite slow, atm. Exactly how I optimise this, I'm undecided. It's worth noting here that in 1BPP mode, Norbert would have had to contend with 'only' 6KB of video memory...

Here's a still of the attract mode, showing 4 asteroids.

Yes, the asteroids are the correct shape too!

Next task is handling the different-sized asteroids, which should actually be quite trivial. That's about as far as I took the text version because after that, it all starts to get tricky!

And a 65816 trap-for-young-players - the MVP & MVN instructions change the data bank register! That wasn't documented in the first reference I was using, and I couldn't work out why my code was going into la-la land after using them.

Monday, 26 June 2017

65C816... Meh!

Another first today - my first 65C816 program. I purposefully omitted the exclamation mark from that last sentence because it really is nothing to get excited about. In fact, if you've never written 65C816 code before, don't rush to change that.

I've replaced the apple.asm file in the Asteroids project with another named iigs.asm. Currently the startup code enables the Super High Resolution (SHR) display, sets linear mapping mode, enables shadowing, and then switches to full 16-bit mode to initialise the palette (all 2 colours in one of 16 palettes) and the SCB. The frame rendering code simply switches to 16-bit mode, then immediately back to 8-bit mode before returning to the Asteroids code.

Booting the disk under IIGS emulation eventually - after the machine boots itself and the floppy disk image - results in a black screen. I've also verified that the game is running and the frame renderer is repeatedly being called. And writing values to the SHR memory from the MAME debugger results in pixels appearing on the display. So no crashes so far...

As for the 65C816; no 8-bit memory accesses in 'full' 16-bit mode? OK, perhaps not so much of an issue if the machine is designed from the ground up around the CPU, but when you're running on an architecture with byte-wide softswitches... and interfacing to 8-bit code and data structures... you're in for a bad time.

Then there's the issue, for example, with the assembler not unambiguously knowing whether to load the accumulator with an 8-bit or 16-bit immediate value - because the mnemonics (and, incidentally, the opcode values) are identical. You have to give it hints, and hope it gets it right. A recipe for frustrating bugs if ever I've seen one.
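To illustrate the ambiguity, here is a small Python sketch of decoding LDA immediate (opcode $A9) under both accumulator widths. The function name is made up; the point is that the same bytes mean different things depending on a flag the assembler cannot see:

```python
def decode_lda_immediate(code, acc_16bit):
    """Decode LDA #imm; return (operand, instruction length in bytes)."""
    assert code[0] == 0xA9, "not LDA immediate"
    if acc_16bit:
        # 16-bit accumulator: two operand bytes, little-endian.
        return code[1] | (code[2] << 8), 3
    # 8-bit accumulator: one operand byte.
    return code[1], 2

stream = bytes([0xA9, 0x00, 0xEA])   # LDA #$00 followed by NOP ($EA)...
print(decode_lda_immediate(stream, acc_16bit=False))
print(decode_lda_immediate(stream, acc_16bit=True))   # ...or LDA #$EA00
```

If the assembler's idea of the width disagrees with the CPU's at run time, every instruction after the LDA is decoded out of step. In ca65 the hints are the `.a8`/`.a16` (and `.i8`/`.i16`) directives.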

Anyway, as a first pass, I'll be replicating the logic in the text version, and parsing the DVG display list in the same way. I suspect all the parsing code will remain (8-bit) 6502, and I'll only switch into 16-bit mode to render the bitmaps to the SHR. But first I need to prepare said bitmaps for the IIGS display.

Sunday, 25 June 2017

2600 for a day and IIGS video

A little diversion; someone posted on an Atari-related FB group about tinkering with Ms Pacman and not having much luck getting it 'loaded into a disassembler' for more in-depth hacking. I couldn't help myself and started asking questions, and of course ended up doing it myself to satisfy my own curiosity, having never done anything with 2600 before.

The complication in this case is that the 2600 only maps 4KB of cartridge space, and Ms Pacman is 8KB. There are a handful of different banking schemes implemented in various cartridges, Ms Pacman being one of the simplest. Despite that, DiStella, for example, doesn't support banked cartridges, though that is forgivable and not really surprising. It's also worth noting that Dan Boris is one of the authors, and there's bound to be a good reason if he elected not to support banking.

After another wildly unsuccessful attempt to understand if/how banking is supported in IDAPro, I forged ahead with Ms Pacman only to discover that the first code bank actually executes at $D000, and not $1000-$1FFF as is documented as the reserved cartridge address space. Of course with the higher address lines missing from the 2600's 6507 CPU, the machine's 8KB of addressable device/memory space is mirrored every $2000.

I then turned my attention to the second bank, loading it into a second IDAPro session - until it was revealed that this bank actually executed at $F000! No doubt making development much, much less painful, it also allowed both banks to be loaded into the same IDAPro disassembly and the banking issue all-but-ignored. I added a few other segments, notably the TIA registers, the zero page area and the PIA registers, and as a result have a ready-to-go base for reverse-engineering.
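The mirroring is easy to model: with only 13 address lines on the 6507, the top three bits of any 16-bit address are simply dropped. A Python sketch (names are illustrative):

```python
ADDRESS_LINES = 13                        # the 6507 brings out A0-A12 only
ADDRESS_MASK = (1 << ADDRESS_LINES) - 1   # 0x1FFF

def bus_address(cpu_address):
    """Address actually seen on the bus for a 16-bit CPU address."""
    return cpu_address & ADDRESS_MASK

print(hex(bus_address(0xD000)), hex(bus_address(0xF000)))
```

Both $D000 and $F000 land on the same $1000 cartridge window, so assembling each bank at a different mirror costs nothing in hardware while keeping the two banks at distinct addresses in a disassembly.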

However, I should note that there's very little chance I'll be tempted to work on this at the expense of Asteroids, or even anytime soon after I'm done! From what very little perusing I did of the source, it really didn't look enticing at all, especially in light of my limited knowledge (having read Racing The Beam) that suggests programming the 2600 is just as much about coding a video hardware controller as it is about coding game logic. And although I briefly mused about porting a 2600 title to another platform, I also quickly realised that much of the code wouldn't resemble the original in the slightest.

So about Asteroids; I've done further reading on the subject of IIGS architecture, and the video memory in particular. At best it looks like you can only write to the video memory at 1MHz, though you can read back at 2.8MHz. I've also read a few interesting articles on optimisation techniques - some specifically for the IIGS - and suspect I'll be employing at least some of them down the track. But for now, I think I'm across the technical aspects enough to choose a tack and begin work on it next session.

So it's time to fire up the 65816 assembler - or rather, 65816 switch on CA65 - and see if I can manage not to crash the IIGS in 4 lines or less...

Tuesday, 20 June 2017

Insert Coin. Press Start. Player 1.

Ordinarily I wouldn't have another update yet, but my 2 yr old came into our bed this morning at 4am and shortly before 5:30am, not having had a minute of sleep since then, I gave up and went out to do some more Asteroids.

The plan was to use the MAME debugger to ascertain which DVG ROM subroutines were yet to be implemented. As I expected, the first to reveal itself was the copyright message at the bottom of the screen. The routine itself draws some discrete vectors (presumably for the © symbol) before calling the character routines for the remainder of the message.

I had two choices here; simply do the same and explicitly call my own character routines in sequence for the entire message, or implement some mechanism to allow me to simply point to the DVG ROM routine and recursively execute DVG instructions. I figured the latter wasn't worth the effort - and would be slower - so I implemented the former.

Someone had also 'complained' about the flickering graphics after I posted my last video. Of course this being purely a development aid I wasn't concerned, but knowing the Apple II had two text pages, curiosity got the better of me. And I'm not claiming to be breaking any new ground here, but I did manage to implement double buffering without any conditional logic involved in the process at all.
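The exact method isn't described here, but one common way to flip pages without a branch is to keep a 0/1 page index, toggle it with XOR, and use it to index small tables. A Python sketch under that assumption, using the standard Apple II text page addresses ($0400/$0800) and page-select softswitches ($C054/$C055):

```python
TEXT_PAGE_BASE = (0x0400, 0x0800)   # text page 1 and page 2
PAGE_SWITCH = (0xC054, 0xC055)      # softswitches: display page 1 / page 2

class PageFlipper:
    def __init__(self):
        self.draw_page = 1          # draw on page 2 while page 1 is shown

    def flip(self):
        """Show the page just drawn and return (softswitch to touch,
        base address of the page to draw on next). No branches needed."""
        show = self.draw_page
        self.draw_page ^= 1         # toggle 0 <-> 1
        return PAGE_SWITCH[show], TEXT_PAGE_BASE[self.draw_page]
```

In 6502 terms the index would live in a zero page location, with an EOR #1 to toggle it and indexed addressing to pick the softswitch and page base.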

There's not a lot more to see in attract mode alone, so I decided it was time to properly initialise some dipswitches and hook up some crude control panel inputs. I settled on two hook routines, apple_reset and apple_start, that get called at the end of the original reset and start routines respectively.

In apple_reset, the hardware I/O locations - such as dipswitches - can be initialised. Since they map to normal Apple II RAM locations, all that is required is to write the appropriate value to the respective address. Thus far I set the coinage and the number of starting lives.

In apple_start, the Apple-specific initialisation code is run. Here I'm currently setting up the page flipping logic, and clearing both text pages.

As I've mentioned in the past, the NMI routine in the arcade code handles the coin switch inputs. Other inputs are read in the main game loop, once per frame. For the moment though, I simply added a few lines to read the Apple II keyboard at the end of my frame rendering routine. Pressing <5> will insert a coin by simply incrementing a zero page shadow value, and pressing <1> will start a game by setting a bit in the hardware I/O location (mapped to Apple II RAM) - for 1 frame. That's enough to get a game started and running.

I then added the display of the remaining ships, mainly because it was trivial. Unfortunately with only 16 lines on the screen, they overwrite the score, but the point is that it's more evidence that things are running as expected. The next obvious object to implement was the player ship...

video


...and here is where things start to get more complicated. The DVG ROM indeed has a table of 17 subroutines for drawing the ship (and optionally the thrust), not unlike other objects. However, these 17 ships only cover 90 degrees of rotation. As a result, the 6502 can't simply add a JSR to the player ship routines into the display list.

Instead, the 6502 copies the component instructions (vectors) from the above-mentioned DVG ROM routines into the display list, adjusting each on-the-fly for the current direction. So when the Apple II rendering code comes to the player ship, it's simply a list of CUR and VEC instructions - nothing decidedly identifiable as the player ship object!

So how do we solve this? In a rare coincidence, the solution is actually an optimisation as far as an Apple port goes - and there are also a few options. The most straightforward is to replace the 6502 routine for the player ship entirely, bypassing the display list and directly rendering the appropriate bitmap on the Apple display. One step removed from that is to 'tokenise' the display list entry; rather than add the component vectors, simply add a 'token' command to display the player ship that the Apple rendering engine can parse. Both have pros and cons.

At this point though, I think I've taken the text-based proof-of-concept engine as far as I need to. It's time to make the switch to the 2.8MHz IIGS, consider writing the rendering engine in native 65816, start working in graphics mode, and decide how best to solve the latest issue.