Since it has been over 2 weeks since the last update, I thought I'd better post something.
Whilst I have been focusing on life off the computer lately (namely an 82km fun/charity bicycle ride) I have managed to make some progress. What I haven't made any progress on, however, is the elusive 'pass right through some objects' bug.
As a first step towards compiled sprites, I computed the pre-shifted bitmaps for the asteroids. Still largely unoptimised, they at least eliminate the need for a pair of table look-ups for every byte rendered on the screen, and the associated set-up calculations required for that. The extra data also required that I start to use the lower 16KB of the cartridge ROM area.
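For anyone unfamiliar with the technique, pre-shifting can be sketched roughly as below. This is only an illustration in Python, assuming a 1 bit-per-pixel display with 8 pixels per byte and the leftmost pixel in the most significant bit. The function name `preshift_row` and the sample data are made up for the example and are not taken from the actual asteroid tables.

```python
def preshift_row(row, pixels_per_byte=8):
    """Return one copy of a 1bpp sprite row for each sub-byte pixel offset.

    Every copy is one byte wider than the source row so that bits
    shifted out of the last source byte are kept.
    """
    value = int.from_bytes(row, "big")
    shifted_rows = []
    for shift in range(pixels_per_byte):
        # Append one empty byte on the right, then move the row right by `shift` pixels.
        padded = (value << 8) >> shift
        shifted_rows.append(padded.to_bytes(len(row) + 1, "big"))
    return shifted_rows


# Hypothetical 16-pixel-wide sprite row.
rows = preshift_row(bytes([0b11110000, 0b00001111]))
```

At render time a routine built on such a table would pick the copy for `x & 7` and store it starting at byte `x >> 3`, so nothing needs to be shifted or looked up per byte. The cost is the extra ROM space for the additional copies.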
Unfortunately I can't find the speed/profile numbers that I thought I'd recorded for the game before changing the asteroids. I could quite easily restore a copy from version control, but I'm too lazy to do that at the moment. FTR though, the start of the attract mode is now running about 33% too slow, keeping in mind that I've only changed the asteroids, and they're still not compiled sprites.
One thing I did notice when counting cycles in my new code though was that there may be room for improvement in my Knight Lore sprite rendering routine. I had previously assumed the post-incrementing index instructions were the most efficient but depending on the situation, it may be faster to use a series of 5-bit constant offset instructions and then adjust the index register (LEA) afterwards.
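A back-of-the-envelope comparison of the two approaches, as a Python sketch. The cycle counts are the standard 6809 figures as best I can tell (4 cycles base for an indexed LDA or LEA, plus 0 for `,X`, plus 1 for a 5-bit or 8-bit constant offset, plus 2 for `,X+`) and should be checked against the datasheet. Only loads are counted, and the function names are invented for the example.

```python
# Assumed 6809 timings; verify against the datasheet before relying on them.
LDA_INDEXED_BASE = 4   # LDA <indexed>, before addressing-mode overhead
LEA_BASE = 4           # LEAX/LEAY <indexed>, before addressing-mode overhead
POST_INC = 2           # extra cycles for ,X+
ZERO_OFFSET = 0        # extra cycles for ,X
CONST_OFFSET = 1       # extra cycles for n,X (5-bit or 8-bit offset)


def cycles_post_increment(n_bytes):
    """n_bytes loads of the form LDA ,X+"""
    return n_bytes * (LDA_INDEXED_BASE + POST_INC)


def cycles_constant_offset(n_bytes):
    """LDA ,X then LDA 1,X ... LDA n-1,X followed by a single LEAX n,X"""
    if not 1 <= n_bytes <= 16:
        raise ValueError("5-bit offsets only reach 0..15 from the base address")
    first = LDA_INDEXED_BASE + ZERO_OFFSET
    rest = (n_bytes - 1) * (LDA_INDEXED_BASE + CONST_OFFSET)
    adjust = LEA_BASE + CONST_OFFSET
    return first + rest + adjust


comparison = {n: (cycles_post_increment(n), cycles_constant_offset(n)) for n in (2, 4, 8)}
```

Under these assumptions the post-increment form wins for very short runs, the two break even at 4 bytes, and the constant-offset form pulls ahead beyond that, which fits the "depending on the situation" caveat.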
Next step is to produce compiled sprites for the asteroids and see if that makes much difference. If my theories are correct though, it probably won't make as much difference as pre-shifting the data.
This blog chronicles my progress porting various retro games to other retro platforms. The goal in each project - at least when targeting a new CPU - is to effectively replicate the original graphics and the original code line-by-line, to produce a 100% accurate port of the original game.
Monday, 13 November 2017
Friday, 27 October 2017
Flicker-be-gone! (mostly)
No progress on the collision-detection bug so I decided to implement the double-buffering (page flipping) since it is almost trivial and won't be affected by any other programming issue.
The arcade machine uses a pair of so-called "ping-pong" buffers to allow the DVG to render the frame whilst the CPU is building the next frame. This comes in very handy indeed on the raster ports (Apple IIGS, Coco3) when erasing the previous frame before rendering the current frame.
Of course double-buffering requires erasing the frame prior to the previous frame. The easiest way to implement this with the current architecture is to extend the 2x buffers to 4x and modify the "ping-pong" logic slightly. No more than a handful of instructions in a few strategic locations...
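The reason more display-list buffers are needed can be modelled in a few lines of Python. This is a sketch under assumptions, not the game's actual logic: display lists and video pages are represented as sets of "pixels", and the slot/page selection by frame number is a guess at one way the modified ping-pong logic could work.

```python
NUM_LISTS = 4   # display-list slots (the "4x" buffers)
NUM_PAGES = 2   # video pages being flipped


def frame_plan(frame):
    """Pick the list slot to build, the page to draw on, and the slot to erase with."""
    build_slot = frame % NUM_LISTS
    draw_page = frame % NUM_PAGES
    # The page about to be drawn on still holds the image from two frames ago.
    erase_slot = (frame - NUM_PAGES) % NUM_LISTS if frame >= NUM_PAGES else None
    return build_slot, draw_page, erase_slot


def run(frames):
    """frames: one set of lit pixels per game frame. Returns what each drawn page holds."""
    lists = [set() for _ in range(NUM_LISTS)]
    pages = [set() for _ in range(NUM_PAGES)]
    shown = []
    for n, pixels in enumerate(frames):
        build_slot, draw_page, erase_slot = frame_plan(n)
        lists[build_slot] = set(pixels)
        if erase_slot is not None:
            pages[draw_page] -= lists[erase_slot]   # erase the frame before the previous one
        pages[draw_page] |= lists[build_slot]       # render the current frame
        shown.append(set(pages[draw_page]))
    return shown


frames = [{1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 6}]
result = run(frames)
```

With only two slots, the list needed for erasing would already have been overwritten by the frame being built. Three slots would be enough in this model; four has the advantage that the index can wrap with a simple mask.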
At this point the game is running quite slowly due to the sub-optimal (to put it mildly) erase/render code, so there's little point synchronising the page flipping to the VBLANK and therefore the video still exhibits some flicker. However it is much improved and gives a taste for things to come...
UPDATE: Tonight I thought I'd add some profiling code before starting on any more of the optimisations. When you first start a game (with 4 asteroids on-screen) it's hovering around 55fps. When things get a lot busier, it's down around 20fps, and the lowest I've encountered is 13fps. And when there's all-but-nothing to render, it hits 89fps.
Will be interesting to see where it goes from here...
UPDATE 2: I've just optimised the copyright rendering. The copyright is unique in that it is rendered every frame, in a fixed location, and therefore never needs to be erased.
After some experimentation, and without resorting to stack blasting (which I can't see being optimal in this case due to the OR'ing operation), I came up with the following for each line of 4 words (Y is the video address):
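LDD #0x1234   ; first word of this line's bitmap (placeholder value)
ORA ,Y        ; OR the high byte with what is already on screen
ORB 1,Y       ; OR the low byte likewise
STD ,Y        ; write the combined word back to video memory
LDD #0x5678   ; next word of bitmap data (placeholder value)
ORA 2,Y
ORB 3,Y
STD 2,Y
...
LEAY 32,Y     ; step Y on by 32 bytes to the next line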
That's the best I can come up with late on a Friday night (37 -> 22/24 cycles/line). Improvements welcome!
Tuesday, 24 October 2017
A progress report on lack of progress.
Today I had the chance to review the collision-detection code. The bad news is that I couldn't see anything amiss. I did find a few minor issues to do with ADC/ADD but they don't appear to be the cause of the bug. I also managed to effect a few minor optimisations.
I'm still hanging my hat on an issue with the core code, rather than the display mapping. I say that because a lot of the time the shots and objects are spot-on - even the small asteroids and small saucer - but then a shot will pass right through the middle of the large saucer. That's not just a few pixels off... more like a logic bug.
Aside from revisiting the collision-detection code again, I don't have any further theories on the matter. This might turn out to be a tough one.
I've been holding off on the optimisations up until now for a few reasons. One, it's simply nice to have the rest of the porting 100% complete. Two, it's easier to tweak things like display mapping with brain-dead code. And finally, I didn't want to find myself in the situation where I had to re-optimise certain routines because something fundamental wasn't quite right.
Having said all that, I'm wondering whether it is actually safe to press on with the optimisations now and revisit this issue down the track - assuming it doesn't have anything to do with display mapping. I don't want to get bogged down debugging this and lose momentum (again).
At least part of the optimisation - double buffering - won't be affected either way and should be relatively straight-forward. From there it gets more involved with compiled sprites, but I could get a start on objects such as text and the copyright message.
Have to think about it...
Monday, 23 October 2017
Even more display tweaks, looks even better. Still not perfect...
More tweaks to the display mapping, and it's improved even more. There was an offset added in the core (arcade) code when adding the CUR command to the DVG display list that must be peculiar to the vector hardware; I needed to remove that offset to enable objects to use the entire 192 lines of the display. Norbert didn't have this issue because he all-but ignores the display list, except for rendering text and his seemingly arbitrary offset (after scaling) accounts for it - and now makes sense of course.
I've also added clipping to the screen for all objects except the exploding ship. My explosion rendering code differs quite a bit from Norbert's; he uses a generic pixel-plot routine that handles the clipping (it'll render pixels outside the visible display on line 191, which is odd). I will probably not bother with clipping the explosion until I optimise the graphics for the Coco3.
After a few more hours of coding and comparing the Atari and Coco3 versions, I'm convinced there's still an issue with collision-detection (apparent in the video below). I'm reasonably sure it's not the display mapping now as sometimes the shots go straight through the middle of the large saucer, and occasionally you can't seem to hit the smallest asteroids. I simply cannot reproduce either bug on Norbert's emulator.
So now I need to go back and review the collision code which is not altogether surprising since it pretty much "worked" straight away. In fact I'm hoping that is the issue because otherwise everything else seems spot-on now, and I can definitely move on to Coco3 optimisations once this issue is sorted - something I've been itching to do for some time now.
A few more fixes and it's looking good... but not perfect.
Another brief update; I seem to only get to work on it in snippets atm...
I've fixed the ship explosion offset (same as ship offset). I also fixed a long-standing bug in the erase routine for the extra life indicators.
It's looking pretty spot-on now as far as object placement goes, although occasionally a shot appears to pass right through an object. I've played a bit of Norbert's Atari emulator, and it doesn't appear to have this issue. So either there's a subtle bug in the 6809 core code, or there are some more offsets that I haven't noticed yet. Tonight I went through the rendering routines in Norbert's code again, and I don't see any more offsets applied.
From here I'll likely render the 1st frame in attract mode and compare all the plot positions for each object on Atari/Coco3. If they match (they're all large asteroids of course) I'll let it run for a fixed number of frames until a few get split and try again. Fortunately the attract mode is completely deterministic from a cold start.
After closer scrutiny of Norbert's emulator, a few warts have become more apparent.
Norbert has the same bug as I had; the first game from a cold start has 3 lives, and subsequent games start with 4. That's because the original code right-shifts a hardware I/O location mapped to a dipswitch, and checks for carry. Under emulation, that's simply a RAM location so although it's seeded with the correct dipswitch value (bit0=1) at initialisation, it's shifted out after the first game starts. I fixed this in the code (also on the Apple version), since unlike Norbert, I have the luxury of assembling the core.
The Atari version is also only rendered every 3rd frame, because the CPU is simply too slow to render every frame and have the game run at full speed (which I must admit, I still don't have a value for). At least, Norbert's code isn't anywhere near as optimised as it could be (an observation, not a criticism). That's fine for a busy screen, but when you're down to only a few small asteroids, the game is obviously too fast. There's no periodic interrupt throttling the game speed.
With any luck, I'm not too far off being in a position to start the Coco3 optimisations...
Saturday, 21 October 2017
#bug
Super-brief update... I've fixed the ship/shot offset issue. Nothing sinister at all - simply forgot the '#' character when adding the fixed pixel offsets to the accumulator!
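To show why one missing character matters so much, here is a tiny Python model of the two addressing modes. With the '#' the operand is the constant itself; without it the operand is treated as an address and whatever byte happens to live there gets added instead. The accumulator value and memory contents are invented for the example.

```python
def adda_immediate(a, operand):
    """ADDA #n : add the constant n to accumulator A (8-bit wrap-around)."""
    return (a + operand) & 0xFF


def adda_memory(a, operand, memory):
    """ADDA n : add the byte stored at address n to accumulator A."""
    return (a + memory[operand]) & 0xFF


memory = {4: 0x37}                      # hypothetical contents of address 4
intended = adda_immediate(100, 4)       # what "ADDA #4" does
actual = adda_memory(100, 4, memory)    # what "ADDA 4" does
```

The code assembles cleanly either way, so the only symptom is an offset that looks arbitrary.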
Need to re-check the other offsets now, and fix the display issue at the bottom of the screen, and possibly add Y clipping - and I can then move on to optimisation for the Coco3!
Friday, 20 October 2017
Code that isn't executed has no bugs!
Very brief update.
After five days away from the keyboard I decided there was only one possible reason for the erase ship routine not working... and I was right! Because I cut-and-pasted the render routine and used it for the erase by replacing the video writes with CLR, it simply wasn't possible for it to fail. And it wasn't actually failing at all - I simply wasn't calling it!
I have a jump table for the (tokenised) DVG commands, and for the previous iteration without any constant offsets, the ship was erased (for now) by a call to the generic erase_chr routine, since the ship is no larger than a character. So when I added new code to the render_ship routine that moved it...
Right now I'm where I was at with the IIGS version, with the exploding ship graphics as well. In short, all the rendering is done - it's just not done at exactly the right position on the screen. The shots are coming out near the nose of the ship, but it's still offset by a pixel or two in some orientations. Why is a complete mystery to me, as both the ship and shot appear to have a constant offset applied to them.
Most of the asteroid hits seem spot-on too, but occasionally I've seen a shot pass right through the middle of a saucer.
I was hoping things would "just work" with the scaling sorted but it appears there's something still amiss. I just hope it's not buried in the Atari display list code, because that's all Greek to me...
I'm tempted to forge ahead with the Coco3 optimisations at this point, but something is nagging at me to get it exactly right this time before moving on. At least once all the offsets are deduced and coded I can back-port to any Apple version(s)!
Thursday, 12 October 2017
Size does matter
Finally, I've implemented the exploding ship routine! I had even avoided it on the Apple IIGS version which, of course, remains incomplete at this point. It's complicated, it's messy, it's slow - but I get the same results as Norbert, sans graphical bugs.
When it comes time to implement it 'properly' (optimised) on the Coco, I'll likely generate bitmaps of the pieces rather than use a table of pixel coordinates so I can use compiled sprites.
One thing that I did notice, however, on both the Apple IIGS version and the Coco version, is that the screen appears 'compressed' vertically compared to the Atari 800XL version, despite the screen resolution being similar.
[Image: Atari 800XL version by Norbert Kehrer]
Note particularly the space between rows of text in the high score table.
[Image: Coco3 version (WIP) by yours truly]
Although, as I said, I had noticed it before, I didn't give it much thought until now, attributing it to larger border areas on both platforms. Aside from reducing the playing area, it does cause the high score table text to overlap. In fact, on the high score entry screen, I (also) patched the message text coordinates to space them out a bit more.
I have confirmed that Norbert hasn't patched the high score display routine at all, so something is amiss. Far from being bad news though, this is actually encouraging, because on the IIGS version I did notice some subtle pixel offsets that didn't look quite right when compared to Norbert's version. I'm hoping then that fixing this issue will not only decompress the screen - and playing area - but also fix those subtle issues!
UPDATE: I've identified the issue with the screen compression, and it's both simpler and a little more complicated than I'd guessed. Asteroids world coordinates are 13 bits (0..8191) which are shifted down to 10 bits (0..1023) - with an X offset - when inserted into the display list for rendering. My rendering routine then takes the Y coordinate, for example, and scales it to (0..191). Norbert, however, doesn't scale the Y coordinate accordingly, but simply shifts it down to 0..255 and adds an offset for rendering, even though the Atari vertical resolution is also 192. That works because the visible vertical resolution on Asteroids isn't the full range 0..1023, but rather closer to 800.
Norbert also handles the objects and text slightly differently, adding a (different) Y offset for the text. It will take some further analysis before I can implement all the subtleties, and I also (now) have to add Y-axis clipping since it's now possible some objects may have a few lines off-screen. At least I've solved the mystery.
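The difference between the two mappings can be worked through numerically. The Python sketch below follows the description above; the function names are invented, the `offset` parameter is a placeholder because the actual offset values aren't given here, and screen orientation (which way Y increases) is ignored.

```python
SCREEN_LINES = 192


def world_to_dvg(y_world):
    """13-bit world coordinate (0..8191) down to a 10-bit DVG coordinate (0..1023)."""
    return y_world >> 3


def scale_full_range(y_dvg):
    """Map the whole 0..1023 range onto 0..191."""
    return (y_dvg * SCREEN_LINES) >> 10


def shift_and_offset(y_dvg, offset=0):
    """Shift down to 0..255, then add an offset (placeholder value)."""
    return (y_dvg >> 2) + offset


# Two rows of text placed 32 DVG units apart (an arbitrary example spacing).
gap_scaled = scale_full_range(512) - scale_full_range(480)
gap_shifted = shift_and_offset(512) - shift_and_offset(480)

# Roughly 800 visible DVG units, shifted down by 2 bits.
visible_lines = 800 >> 2
```

Scaling the full range squeezes the 32-unit gap into 6 lines, while the shift keeps it at 8, which would account for the 'compressed' look and the overlapping text. The shift only fits on a 192-line screen because the visible range is nearer 800 units than 1024, giving about 200 lines before the offset and clipping.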
UPDATE #2: I've sorted the scaling issue, and the game now uses the full extent of the visible area, and removed my hack for the multi-line high score entry message. I've tweaked a few things to place the score at the very top, and the copyright at the very bottom, like the arcade game. It has, however, introduced a new bug with objects rendered near the bottom of the screen...
[Image: Re-scaled Coco3 version, using all 192 lines]
I've decided to add all the tweaks before I start optimisations, since it will likely be more difficult to debug/modify once I add things like double-buffering and compiled sprites. I also started to add the ship & shot offsets, but can't for the life of me get the ship erasing properly. I don't really understand why not, since - for now - I've cut-and-pasted the render routine, replacing the instructions that write bitmap data with a CLR instruction... it simply shouldn't be possible for it not to work. :(
I'm off interstate for a few days - sans computer - so no new updates for a while...
Wednesday, 11 October 2017
Another last word on explosions
After adding the thrust (pixel) rendering, I decided to bite the bullet and implement the ship explosion to round off the port before I move on to optimisations for the Coco3.
I've ported all the original code, despite the fact that one of the routines isn't required for the Coco3 (or any other raster/bitmap platform) port. That latter routine is, of course, completely untested.
I've taken a slightly different approach to Norbert, who renders his graphics by referencing the player data directly. Norbert hooks into the arcade explosion rendering routine to build a table of piece offsets, which he uses in his end-of-frame rendering, together with the player ship position data.
Rather, my (tokenised) display list command for a ship explosion (piece) comprises the piece number and piece X,Y offsets, which together with the current beam position (the last CUR command) is sufficient to position the piece bitmap.
Right now it's WIP; I am plotting a single pixel per piece rather than rendering the entire piece as I debug the ported code. It's sort-of working but there's a bug or two to iron out before I can complete the piece rendering.
Interestingly the explosion is somewhat non-deterministic. The code (re)initialises a table of piece X,Y offsets in the first few frames of the explosion - but only the high byte. So each explosion may, in theory, be subtly different. Not sure if that was intended or not.
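A small Python model of why resetting only the high byte leaves room for variation. It assumes each offset is a 16-bit value whose low byte simply keeps whatever was left over from earlier use. The table contents and initial values are invented for the example.

```python
def reinit_high_bytes(table, initial_high):
    """Reset only the high byte of each 16-bit entry, keeping the old low byte."""
    return [((hi & 0xFF) << 8) | (old & 0x00FF)
            for old, hi in zip(table, initial_high)]


initial_high = [0x10, 0xF0]            # hypothetical starting values (high bytes)

fresh = reinit_high_bytes([0x0000, 0x0000], initial_high)   # first explosion
later = reinit_high_bytes([0x12A7, 0xFE3C], initial_high)   # table left over from a previous one
```

The two results differ only in the low byte, so in this model any variation would be limited to the least significant part of each offset.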
Hopefully I'll have another update tonight.
UPDATE: I forged ahead with the explosion rendering, and managed to get something that resembles the arcade, or at least Norbert's version of it. It's not quite there, so a little debugging to do yet. But it's certainly close.
Note that I have a .define for the number of starting lives, to make debugging easier - and for the video above, it's set to 1.
To summarise, once the explosion is debugged - aside from sound - that's the porting process complete! I just need to optimise the rendering and eliminate the flicker on the Coco3. Then perhaps I can look at a Star Wars port...
UPDATE: Ship explosion is now fully debugged!
A last word on explosions.
I've finished the RE of the exploding ship routines, both in the original and Norbert's emulator.
Norbert patches the explosion routine to call his routine to render a piece - using intermediate results from the caller - and then jumps back to the original routine. There's a decent amount of code that doesn't have to be executed on the emulator, but the game at that point doesn't need to be fast.
By necessity, Norbert grabs the player ship location and derives the explosion offsets from it again (since the vector commands are all relative to the ship position). He also maintains his own table of piece locations rather than use the original table, presumably because he requires less resolution.
In short, there's a non-trivial amount of code involved, even in Norbert's emulator. Since all the pieces follow a fixed trajectory it would be possible to replace all the calculations with simple look-up tables for either piece positions, or even simply a bitmap for each frame. At this point I'm undecided on how I'll approach it, but likely simply port Norbert's code.
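To illustrate the look-up table idea, here's a minimal Python sketch (neither the arcade code nor Norbert's). It assumes each piece simply moves by a constant velocity every frame; the velocity values and the names `PIECE_VELOCITIES` and `build_trajectory_table` are made up for the example, and the scaling of the pieces is ignored altogether.

```python
# Hypothetical per-frame velocities (dx, dy) for the six pieces.
# These are NOT the values from the ROM velocity table.
PIECE_VELOCITIES = [(-2, 1), (-1, 2), (1, 2), (2, 1), (2, -1), (-1, -2)]


def build_trajectory_table(velocities, frames):
    """Precompute every piece's offset from the ship for each frame."""
    table = []
    positions = [(0, 0)] * len(velocities)
    for _ in range(frames):
        # Advance every piece by its fixed velocity.
        positions = [(x + vx, y + vy)
                     for (x, y), (vx, vy) in zip(positions, velocities)]
        table.append(positions)
    return table


# Built once, up front (or at assembly time on the real target).
TABLE = build_trajectory_table(PIECE_VELOCITIES, frames=8)
```

At run time, frame N of the explosion would then be a plain table read (`TABLE[N]`) with no per-frame arithmetic, at the cost of the memory for the table.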
Thus far I've ported the arcade explosion code up until the point where Norbert hooks it to render a piece. And in theory, that's all the original code that is required; the remainder is concerned only with moving the beam back to the ship position ready to render the next piece.
To re-iterate, the only outstanding bits of code on the Coco are: rendering the thrust pixel, rendering the ship explosion pieces, and the sound. Aside from those, the graphics display needs tweaking (pixel offsets adjusted) and the flicker is to be eliminated.
I'm undecided in which order I'll tackle the above-mentioned tasks. The ship explosion is tedious, the thrust isn't too bad since I've done it on the Apple IIGS already, and the sound will definitely be the last thing I do - so fixing the display on the Coco3 is looking tempting.
UPDATE: thrust rendering is done
This also means that the RE of the original is more-or-less complete, and fully-documented. I'll probably hold off releasing that until the 6809 version is fully ported and debugged, just in case...
Monday, 9 October 2017
High scores and exploding ships
Some progress in a couple of areas.
Firstly, the high score display/entry is now fully debugged. IIRC there were at least 5 bugs in the code; not completely unexpected since - by necessity - I ported pretty much all of the code in one go before being able to test any of it. It did take a little longer than I'd hoped, but nothing too nasty in there.
The game is, arguably, fully playable now, missing only the exploding ship graphics and subtle positioning tweaks. IOW the 6809 core is complete, aside from the aforementioned exploding ship. I still need to compare a running attract mode against the original, but it appears on the surface to be 100% faithful. Now the fun bits can begin; getting the display right on various platforms!
Secondly, I took a closer look at the exploding ship routine. There's a curious amount of code for such a seemingly minor aesthetic, but after a second look I understand most of what's happening now. I was actually tipped-off when I revisited the DVG ROM listing on Computer Archeology; I'm sure it has been updated recently with more information! The extra snippet of information that put the pieces together for me was the 'Ship explosion pieces velocity' table.
There are six (6) pieces of ship that expand, dissipate and disappear one-by-one - hence the relative complexity of the code. The first (few) times the code is visited in an explosion, it creates an array of piece data in zeropage memory. This is the data that is updated each iteration (frame) of the explosion as the code moves and scales the pieces.
When rendering a given piece, a VECT command is added to the display list (with Z=0) to place the beam for the start of the piece, relative to the ship position. The SVEC command is then copied from the table in the DVG ROM to render the actual piece. And here's the interesting bit. The same SVEC is then copied again, but negating the dX,dY signs to position the beam back at the start of the piece. And finally, the first VECT command is similarly copied right out of the display list itself, again negating dX,dY to put the beam back at the ship position. And then on to the next piece...
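The four-command sequence is easier to see in a sketch. The Python below is only a model of the idea, not the arcade code: commands are simplified to `(opcode, dx, dy, z)` tuples rather than packed DVG words, and the names `negate`, `emit_piece` and `beam_after` are hypothetical. The brightness of the return stroke isn't covered above, so the sketch simply leaves Z unchanged when negating.

```python
def negate(cmd):
    """Copy a command with the signs of dX,dY flipped (Z left as-is)."""
    op, dx, dy, z = cmd
    return (op, -dx, -dy, z)


def emit_piece(display_list, offset, svec):
    """Append the four commands for one explosion piece."""
    dx, dy = offset
    move = ("VECT", dx, dy, 0)         # Z=0: move beam from ship to piece start
    display_list.append(move)
    display_list.append(svec)          # draw the piece (copied from the ROM table)
    display_list.append(negate(svec))  # same SVEC, negated: back to piece start
    display_list.append(negate(move))  # first VECT, negated: back to ship position


def beam_after(display_list, start=(0, 0)):
    """Follow the relative moves and return the final beam position."""
    x, y = start
    for _op, dx, dy, _z in display_list:
        x += dx
        y += dy
    return (x, y)


dl = []
emit_piece(dl, offset=(12, -5), svec=("SVEC", 3, 2, 7))
```

Because every move is undone by its negated copy, `beam_after(dl, (100, 100))` lands back on `(100, 100)`, which is what lets the next piece be positioned relative to the ship again.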
I'd like to see how Norbert rendered the explosion; it seems awfully complicated for what could be done in a few pre-rendered bitmaps. If he took some short-cuts, it may not be necessary to port this code at all. However, I will need to port it for the Star Wars port anyway, so perhaps I will tackle it regardless. It'll be a little messy as there are copious instances of shuffling data to/from index registers and zeropage memory, which of course doesn't translate well to the 6809's 16-bit index registers. At least it's not absolutely critical that this code run quickly...
UPDATE: I've taken a look at Norbert's code for the exploding ship, and have identified the graphics data - stored as individual pixel offsets - for the ship explosion pieces. He uses the same mechanism and rendering routine for the thrust (which reminds me, I haven't implemented the thrust graphic on the Coco version).
I'm yet to RE his code for moving the exploding pieces... that'll have to be another time.
Interestingly, I've found a bug in his emulator. He overwrites part of the ship explosion pixel data table sometime during initialisation; I haven't RE'd that routine but I suspect it's preparing memory for the control input shadow values. Why? Because he's also using 2 of those bytes from the pixel data table for the hyperspace and fire shadow values, and they're in a different memory area than the others. So he likely moved variables around and didn't finish the job...
And once you know it's there, the bug is apparent (extraneous pixels) in the YouTube video of his Atari emulator at 0:53.
Saturday, 7 October 2017
The Universe conspires!
I've ported the remaining high score routines, which leaves just two 'core' routines remaining - those concerned with the explosion of the player ship. The high score routines are almost functional; they just need a bit of debugging but don't adversely affect anything else.
So back to the mysterious end of game issue. After the addition of the high score entry routines, the nature of the issue has changed. In retrospect, the original problem was likely the very lack of the high score routines - particularly the routine that checks for a new high score - since it does change some of the state information.
The issue then manifested itself as the high score entry screen appearing briefly, then a new game starting immediately. After an hour or so of static code review and then moving on to debugging in MAME, I'd stumbled upon something that in itself didn't appear to cause the issue, but still wasn't right - a seemingly arbitrary value in the current number of credits.
However, setting a watchpoint in MAME and running not only refused to reproduce the problem, but the game started behaving correctly on the high score entry screen! Running again without the watchpoint, and it would fail. Run it again with - all OK. Inexplicable.
And then, I finally caught it. Not part of the 'core' code, but in the Coco-specific portion, I poll the keyboard every frame and increment, amongst other things, coin counters when appropriate. FTR the arcade code does this in an NMI, but for now I'm doing it once per frame. Anyway, it turns out I wasn't detecting the key depress properly, so holding down the <5> key added a credit on every frame it was held. And whilst running under the debugger with a watchpoint, it's slow enough to register just one coin if you press <5> quickly.
But the universe wasn't done with me yet. In the same portion of code, I also handle the start button. In this case I was polling the keyboard correctly, but never clearing the state of the button, so once pressed, it remained in that state permanently.
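For the record, both fixes boil down to the same pattern. This is a Python sketch rather than the actual 6809 code, and the class and field names are hypothetical: count a coin only on the frame the key goes from up to down, and refresh the start state every frame so it can't stick.

```python
class Inputs:
    """Per-frame keyboard polling with edge detection for the coin key."""

    def __init__(self):
        self.prev_coin = False  # coin key state on the previous frame
        self.credits = 0
        self.start = False

    def poll(self, coin_down, start_down):
        # Rising edge only: a held key counts as a single coin.
        if coin_down and not self.prev_coin:
            self.credits += 1
        self.prev_coin = coin_down
        # Overwrite (rather than only ever set) the start state each frame,
        # so releasing the key clears it.
        self.start = start_down


inputs = Inputs()
for _frame in range(5):      # <5> held down for five frames
    inputs.poll(coin_down=True, start_down=False)
held_credits = inputs.credits
```

With the buggy version, the loop above would have added five credits; with edge detection, `held_credits` is 1.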
And finally, to muddy the waters even further, it turns out that pressing the start button with credits in the machine will actually abort the high score entry screen altogether and start a new game!
Here I was thinking that somewhere in the core code it was exiting the high score entry state erroneously, when in fact a combination of multiple credits and a 'stuck' start button - all in my code - were to blame! At least I finally found it...
Next, debugging high score entry (which shouldn't take long) and then the exploding player ship.
Thursday, 5 October 2017
Getting close
Today I did a quick audit of what was outstanding in the 'core':
- 7x routines for high score display/entry
- 2x routines to display exploding player ship
There's also a few routines not strictly necessary for game play, but will likely be implemented at some stage to round off the port:
- 4x sound routines
- 4x sundry routines, such as NMI
I could probably knock over the high score routines in a single session. The exploding ship was the one aspect of the Apple IIGS port I hadn't addressed, for two reasons; I don't (yet) understand the code and I don't have Norbert's graphics for it (nor do I understand - yet - how he implemented it).
UPDATE: I've done 4 of the remaining 7 high score routines; namely those that display the high score table. The remaining 3 comprise the high score initials entry.
I might finish off the high score and then have another look at the exploding ship code for the arcade. If it's still incomprehensible to me, I might move on to optimising the rendering for the Coco3 before returning to it in earnest.
Today I did manage to completely break the asteroids though when adding a few tweaks... so fixing that is first on the agenda.
UPDATE: I also fixed this; however, although it detects the end of the game (and displays GAME OVER), it still doesn't actually end the game - you can still play. I had a look for this earlier, and it still eludes me.
Wednesday, 4 October 2017
Mostly Playable.
Quick update on the Coco3 port of Asteroids tonight.
Have managed a few sessions over the last few days...
- Fixed score display
- Fixed end-of-wave detection
- Added shots (player & saucer) and fixed shot direction
- Added shrapnel (explosion) rendering
- Added end-of-game detection
- Added hyperspace
The game is all-but-playable now. Although it detects and displays Game Over, it still continues the game and subsequently dying results in a crash as it tries to display (presumably) 255 remaining lives.
Going purely on generated code size, it's roughly 67% complete now. I'm yet to do a formal audit, but I can't actually think of a lot that is yet to be ported, aside from the high score handling and player ship explosions, plus snippets of sound code. There's certainly some AVG display list management code that will not need to be ported for the Coco3 version, plus the non-English message tables that I'm not going to support, so it could well be closer to 80% or even 85% complete in reality.
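As a rough sanity check on those numbers, here's the arithmetic as a tiny Python sketch. The assumption (mine, not measured) is that the 67% is the ported code as a fraction of the whole original, so discounting code that never needs porting just shrinks the denominator. The function name and the skipped fractions are hypothetical.

```python
def adjusted_completion(ported_fraction, skipped_fraction):
    """Completion once code that will never be ported is discounted."""
    return ported_fraction / (1.0 - skipped_fraction)


# If about 16% of the original never needs porting, 67% becomes roughly 80%;
# if about 21% can be skipped, it becomes roughly 85%.
low_estimate = adjusted_completion(0.67, 0.16)
high_estimate = adjusted_completion(0.67, 0.21)
```

So the 80-85% guess corresponds to somewhere between a sixth and a fifth of the original code being display list management, message tables and the like.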
Once the 'core' code has been ported, the next task is optimising the rendering for the Coco3. That means adding compiled sprites and page-flipping for starters. Nothing too complex in there, and I've already been through the exercise for the (unfinished) Apple IIGS port.
Here is where I hope the slightly higher clock speed and 1BPP graphics mode will give it the edge over the IIGS SHR mode graphics, which still crawled despite my best efforts.
Time will tell...
Friday, 29 September 2017
Return To Asteroids!
Today I finally returned to Asteroids, as promised for so long. Specifically, the 6809 port to the Coco3. After spending 20 minutes familiarising myself with where I was up to all those weeks ago, I managed to fix the bug that I introduced with the last addition of code, which broke most of everything.
I also deduced the actual purpose of the function I had last added and had formerly named handle_shots(). It is now called handle_collisions(), and that's probably sufficient explanation. It's not completely debugged yet - it's a decent chunk of code all-up - although running into an asteroid splits it in two and the player loses a life.
I feel like I'm back in the groove with it now, so I'll look to knock as much of it over as possible (it's a touch over 50% complete) before returning to other side projects. I have a new incentive to finish it now; I plan to port it to the arcade Star Wars hardware once the Coco3 is done.
What I have been doing in the last few days was RE'ing yet another game, this time Berzerk for the Vectrex. I wanted to get a feel for how the Vectrex was programmed and how difficult, or otherwise, it would be to port games to another platform.
As it turns out, the BIOS is used quite extensively - in Berzerk at least - so porting the game would mean also porting a significant portion of the BIOS code. On the up-side, once that was done it would be much easier to port other Vectrex games that made use of it as well.
Regardless, Berzerk isn't the easiest of games to RE. I'm slowly getting there, but I've had to work for it. I had expected 4KB of code to be a bit easier to work out, especially given the use of well-documented BIOS calls, but it hasn't been the case. I've taken it as far as I need to at this point.
Purely for fun I took the Vectrex character data from the BIOS and converted it to Star Wars AVG commands (or near enough) and wrote a routine to render a string on Star Wars hardware in the Vectrex font. That's the picture in the last blog post! It uses an insane amount of resources on that platform - somewhat paradoxically the Vectrex rasterizes its characters - but still looks cool!
But for the moment, I've done what I need to do with Star Wars (and the Vectrex) and will set them aside whilst I work towards finishing the Coco3 port of Asteroids.
Thursday, 28 September 2017
Saturday, 23 September 2017
Star Wars. Software Developer Kit.
Firstly, although I have not been working on Asteroids lately, I would like to reiterate that the Apple II and Coco3 ports of Asteroids are still very much alive. In fact, I'm about to return to the Coco3 port - which is about 50% complete - very soon.
So, what's up with the Star Wars screenshot in the last blog post?
It's probably obvious that it's not a genuine emulated Star Wars screen shot; in fact it's not Star Wars running at all. It is actually a crude AVG emulator - based on my Asteroids DVG emulator - 'executing' a handful of select AVG ROM routines. So... why? ... you may ask.
I was approached recently by someone who was looking for a '6809 guy' that might be interested in doing some development work on the Star Wars hardware in order to encourage/facilitate development of new games for the platform. That development work may or may not comprise code examples, demos, tutorials and even an 'SDK' if you like for the platform.
That someone also happens to be working on a related project for which any such software development may be useful to assist in testing/debugging.
Well that piqued my interest and although I didn't intend on interrupting my Asteroids projects, curiosity got the better of me and I started to not only look at the disassembly, but also learn how the AVG and various other hardware components operate. That turned out to be quite a bit of fun and only served to draw me further into the investigation.
The 6809 disassembly, at a whopping 48KB (and banked just to complicate matters) was definitely daunting but I was making decent progress none-the-less and by the time I had written my crude AVG emulator, I felt I was ready to start writing code from scratch to run on the arcade hardware.
And thus the Star Wars SDK was started. After a few examples writing vectors, calling ROM routines and reading buttons, I started looking at the math box, whose implementation has been well documented (and of course emulated) at the low level, but I could find no descriptions of the higher-level functions. I chose a few well-used functions to RE and I finally had a working example using the 3x3 matrix multiply function which I used for a simple 2D rotation of a square.
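For anyone unfamiliar with the maths, the rotation example amounts to the following, shown here in plain Python rather than as math box calls. The math box's actual operand format and calling convention aren't described in this post, so `rotation_z` and `mat_vec3` are purely illustrative names.

```python
import math


def rotation_z(theta):
    """3x3 matrix for a rotation of theta radians in the X,Y plane."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, 0.0),
            (s, c, 0.0),
            (0.0, 0.0, 1.0))


def mat_vec3(m, v):
    """Multiply a 3x3 matrix by a 3-element vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


# A square centred on the origin, with Z unused (always 0).
SQUARE = [(1, 1, 0), (-1, 1, 0), (-1, -1, 0), (1, -1, 0)]

# Rotate every corner by a quarter turn.
rotated = [mat_vec3(rotation_z(math.pi / 2), v) for v in SQUARE]
```

Stepping `theta` a little each frame and redrawing the four edges between the rotated corners gives the spinning square.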
More disassembly and I turned my attention to the sound ROM. I soon had example code that played through all the sounds in the game. I decided to complete the sound ROM disassembly - as far as practical at least - and reached this point tonight. The TMS5220 routines are completely commented, and I've made educated guesses for the higher-level POKEY routines; they get very complicated at the register level. Everything else in the ROM has been RE'd, and that's as far as I'm taking it at this point. See the Project List & Downloads page for the disassembly.
As for the SDK sound functions, it should be sufficient to provide a 'source' version of the sound ROM code that can be edited with new sound data and re-assembled. Anything more would be a mammoth task in re-writing the TMS5220 and Quad POKEY routines for no good reason (other than copyright of course).
Some sound-related trivia; the game has ~60 pre-canned sound effects, the data for all of which are stored in the 6809 sound ROMs. 22 of those are samples from the movie played via the TMS5220. Another 20 are sound effects played via POKEY 1 & 2, and the remaining 11 are tunes played on POKEY 3 & 4 (actually, just 3 in reality I believe). Playing a sound from the main CPU is as simple as writing a single byte - the sound number - to the mailbox register. Some TMS5220 sounds (random sound bites during gameplay) will only be played if the TMS5220 queue is empty - otherwise they are silently discarded.
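The 'only if the queue is empty' behaviour can be modelled in a few lines of Python. This is a sketch of the behaviour as described, not the sound ROM code; the class name, the `only_if_idle` flag and the sound numbers are all invented for the example, and I'm assuming other speech requests are simply queued.

```python
from collections import deque


class SpeechQueue:
    """Toy model of the speech request queue on the sound CPU side."""

    def __init__(self):
        self.queue = deque()

    def request(self, sound_number, only_if_idle=False):
        """Queue a sound; low-priority ones are dropped if anything is pending."""
        if only_if_idle and self.queue:
            return False  # silently discarded
        self.queue.append(sound_number)
        return True


speech = SpeechQueue()
speech.request(0x10)                     # hypothetical sound number, queued
speech.request(0x20, only_if_idle=True)  # queue not empty, so dropped
speech.request(0x11)                     # queued behind the first
pending = list(speech.queue)
```

After those three requests, `pending` holds only the two ordinary sounds; the random sound bite never makes it into the queue.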
There's still work to do on the SDK, mainly handling yoke inputs and RE'ing the remaining math box functions and coming up with (more impressive) examples that use them - like spinning 3D wireframe models. Aside from that though, there's probably enough info now to write a full-blown game from scratch on the platform.
I don't intend on doing much more in the way of commenting the main ROM disassembly, aside from sections that may assist in my understanding of the remaining SDK functions. I would like to perhaps know a little more about the states (routines) in the main state machine though. And FTR most of the banked code appears to be routines that use the math box, and they are pretty obfuscated, so I won't be commenting much of them.
For now, I aim to get back to Asteroids on the Coco3 before I forget it all. And no prizes for guessing what the next target platform will be...
So, what's up with the Star Wars screenshot in the last blog post?
It's probably obvious that it's not a genuine emulated Star Wars screen shot; in fact it's not Star Wars running at all. It is actually a crude AVG emulator - based on my Asteroids DVG emulator - 'executing' a handful of select AVG ROM routines. So... why? ... you may ask.
I was approached recently by someone who was looking for a '6809 guy' that might be interested in doing some development work on the Star Wars hardware in order to encourage/facilitate development of new games for the platform. That development work may or may not comprise code examples, demos, tutorials and even an 'SDK' if you like for the platform.
That someone also happens to be working on a related project for which any such software development may be useful to assist in testing/debugging.
Well that piqued my interest and although I didn't intend on interrupting my Asteroids projects, curiosity got the better of me and I started to not only look at the disassembly, but also learn how the AVG and various other hardware components operate. That turned out to be quite a bit of fun and only served to draw me further into the investigation.
The 6809 disassembly, at a whopping 48KB (and banked just to complicate matters) was definitely daunting but I was making decent progress none-the-less and by the time I had written my crude AVG emulator, I felt I was ready to start writing code from scratch to run on the arcade hardware.
And thus the Star Wars SDK was started. After a few examples writing vectors, calling ROM routines and reading buttons, I started looking at the math box, whose implementation has been well documented (and of course emulated) at the low level, but I could find no descriptions of the higher-level functions. I chose a few well-used functions to RE and I finally had a working example using the 3x3 matrix multiply function which I used for a simple 2D rotation of a square.
Example code running on Star Wars hardware under MAME
More disassembly and I turned my attention to the sound ROM. I soon had example code that played through all the sounds in the game. I decided to complete the sound ROM disassembly - as far as practical at least - and reached this point tonight. The TMS5220 routines are completely commented, and I've made educated guesses for the higher-level POKEY routines; they get very complicated at the register level. Everything else in the ROM has been RE'd, and that's as far as I'm taking it at this point. See the Project List & Downloads page for the disassembly.
As for the SDK sound functions, it should be sufficient to provide a 'source' version of the sound ROM code that can be edited with new sound data and re-assembled. Anything more would be a mammoth task in re-writing the TMS5220 and Quad POKEY routines for no good reason (other than copyright of course).
Some sound-related trivia: the game has ~60 pre-canned sound effects, the data for all of which are stored in the 6809 sound ROMs. 22 of those are samples from the movie played via the TMS5220. Another 20 are sound effects played via POKEY 1 & 2, and the remaining 11 are tunes played on POKEY 3 & 4 (actually, just 3 in reality I believe). Playing a sound from the main CPU is as simple as writing a single byte - the sound number - to the mailbox register. Some TMS5220 sounds (random sound bites during gameplay) will only be played if the TMS5220 queue is empty - otherwise they are silently discarded.
There's still work to do on the SDK, mainly handling yoke inputs and RE'ing the remaining math box functions and coming up with (more impressive) examples that use them - like spinning 3D wireframe models. Aside from that though, there's probably enough info now to write a full-blown game from scratch on the platform.
I don't intend on doing much more in the way of commenting the main ROM disassembly, aside from sections that may assist in my understanding of the remaining SDK functions. I would like to perhaps know a little more about the states (routines) in the main state machine though. And FTR most of the banked code appears to be routines that use the math box, and they are pretty obfuscated, so I won't be commenting much of them.
For now, I aim to get back to Asteroids on the Coco3 before I forget it all. And no prizes for guessing what the next target platform will be...
Friday, 8 September 2017
More vectors!
Tuesday, 29 August 2017
Thrust
I'm starting to get into territory that I haven't fully reverse-engineered yet, which makes debugging the 6809 port just that little bit more difficult. On the plus side, it forces me to understand the original code and therefore I can subsequently go back and comment the arcade disassembly.
Most recently I've been adding the code for thrust and as a result once you start a game, your ship appears and you can move around.
One nuance of porting is the distinction between zero-page accesses and memory accesses. Looking at the original 6502 source listing, there's no indication which is which. On the 6809, all labels for direct page variables are .EQU statements, and the operand prefix is '*'. If you forget the asterisk, the code assembles but doesn't work as planned. Somewhat fortuitously though, the way I have the memory map configured, you'll get stray pixels on the top few lines of the video, and that's easily trapped in the MAME debugger since - atm - Asteroids never renders there (see below).
The other issue I touched on last post is the vertical resolution. After some experimentation in the C port, which is rendering 'vectors' in 1024x1024 coordinate space, I've confirmed that Asteroids uses approximately 788 "lines" of the display space, effectively leaving the top and bottom 118 lines (roughly 11.5% each) blank. When you reduce the resolution to 192 pixels, that 23% of blank space is quite significant.
Unfortunately 192 doesn't quite divide into 788 nicely, and 192*4 (768) crops the score and copyright messages. The Coco3, however, conveniently has a 200-line mode, which would greatly simplify the scaling (right-shift by 2) and allow use of most of the display. The only issue is that the graphics were, IIUC, originally designed by Norbert for a 192-line display. I'll have to experiment to see if and how they could be adapted for a higher resolution.
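The arithmetic is easy enough to sketch in C. This assumes the 788 visible lines are centred in the 1024-line coordinate space (118 blank lines either side, as above) and ignores which way up the Y axis runs; the names are made up for the example.

```c
#include <stdint.h>

#define DVG_VISIBLE_LINES 788
#define DVG_BLANK_LINES   ((1024 - DVG_VISIBLE_LINES) / 2)  /* 118 */

/* Map a 10-bit vector Y coordinate to a scanline by dropping the blank
 * band and right-shifting by 2. Returns -1 outside the visible band. */
static int dvg_y_to_scanline(uint16_t y)
{
    if (y < DVG_BLANK_LINES || y >= DVG_BLANK_LINES + DVG_VISIBLE_LINES)
        return -1;
    return (y - DVG_BLANK_LINES) >> 2;
}
```

With a shift of 2, the 788 visible lines collapse to scanlines 0 to 196, i.e. 197 lines: too many for 192, but a comfortable fit in the 200-line mode.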
But back to porting code for now...
UPDATE: Tonight I ported the code that handles all the collisions. I need to update my disassembly comments in a few areas here, as the routines actually handle all collisions between all objects, whilst my comments suggest it's only the collisions between shots and other objects.
There's a decent chunk of code involved, so not surprisingly it doesn't yet work. Worse yet, I've actually broken what was working before, in the process of 'fixing' a few bugs that I discovered porting the new code. It's too easy to forget that the 6502 X,Y registers are 8-bit vs 16-bit on the 6809...
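To illustrate the trap, here's a tiny C model of incrementing an index register on each CPU (function names are mine, and it says nothing about which instructions the actual code uses):

```c
#include <stdint.h>

/* 6502 INX: X is 8 bits wide, so it wraps from $FF back to $00. */
static uint8_t inx_6502(uint8_t x)
{
    return (uint8_t)(x + 1);
}

/* 6809 LEAX 1,X: X is 16 bits wide, so $00FF simply becomes $0100. */
static uint16_t inx_6809(uint16_t x)
{
    return (uint16_t)(x + 1);
}
```

Any 6502 code that relies (deliberately or otherwise) on the wrap at 256 will quietly walk off the end of its table when ported naively to a 16-bit index.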
Looking purely at object (binary) code size, the porting is now roughly 50% complete.
Saturday, 26 August 2017
The 6502 can be a BIT weird.
Quick update.
One thing I like about Asteroids is that it's completely deterministic. Each time you turn it on, it will behave in exactly the same way, until you start messing with the controls. Makes it easier to test my ports...
I was looking into why the movement of the first saucer in attract mode was seemingly mirrored on the Coco3 port. With the above-mentioned in mind, it came down to a BIT instruction, followed by a BVS (branch on overflow set) which was branching on the 6502 but not on the 6809. In my haste I (obviously) didn't pay careful enough attention to the descriptions in the 6502 and 6809 Zaks instruction references and decided they operated in the same way, at least as far as the V (overflow) flag was concerned.
Turns out the 6502 transfers bit 6 of the memory operand into the V flag, whilst the 6809 simply clears the V flag. I'm struggling to relate the 6502 behaviour to a real-world operation. Followed by a BVS, the 6502 code is effectively testing bit 6 of the memory operand (which incidentally is one byte of a 16-bit pseudo-random number) without having to load the value first into the accumulator. It of course simply required an extra instruction on the 6809.
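A small C model makes the difference plain. It only models the N, V and Z flags as documented for the 6502 BIT (memory operand) and the 6809 BITA; the struct and function names are hypothetical, and bit6_set() is just one way of expressing the explicit test, not necessarily the instruction I ended up using.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool n, v, z; } flags_t;

/* 6502 BIT: Z from A AND M; N and V are copied from bits 7 and 6 of M. */
static flags_t bit_6502(uint8_t a, uint8_t m)
{
    flags_t f;
    f.z = (a & m) == 0;
    f.n = (m & 0x80) != 0;
    f.v = (m & 0x40) != 0;
    return f;
}

/* 6809 BITA: N and Z reflect the result of A AND M; V is always cleared. */
static flags_t bit_6809(uint8_t a, uint8_t m)
{
    uint8_t r = a & m;
    flags_t f;
    f.z = r == 0;
    f.n = (r & 0x80) != 0;
    f.v = false;
    return f;
}

/* What the 6502's BIT/BVS pair is really asking: is bit 6 of M set? */
static bool bit6_set(uint8_t m)
{
    return (m & 0x40) != 0;
}
```

So a BVS after BIT is taken on the 6502 whenever bit 6 of the operand is set, regardless of the accumulator, whereas on the 6809 it is never taken.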
Moving on, you can now start a game and (hacked) code to render the player ship is in place. However it turns out that the game simulates the player coming out of hyperspace when the wave starts (to enable the logic that waits for a clear break in the asteroids) and I've yet to code up the associated routines on the Coco3. That'll have to wait for another night.
Starting a game after coining-up
I have noticed, doing the IIGS, C and Coco3 ports, that the scores and copyright messages don't extend to the top and bottom of the screen respectively as the MAME emulation of the arcade game suggests they should. It has puzzled me somewhat, but now I believe I know the answer. The DVG operates on 10-bit coordinates, so I've been basing everything on a 1024x1024 display, and scaling accordingly. And to note, it's relatively easy to scale down to 256x192.
However, I was reading today about an FPGA emulation of Asteroids Deluxe, and it mentioned the vertical display resolution was in the vicinity of 800 (I need to look it up again) which would explain the discrepancy. The issue of course, is that scaling becomes more of a chore, i.e. inefficient. But I do seem to recall that Norbert's scaling was different to mine... so I need to go back and study his again, as well as the MAME emulation of the DVG.
UPDATE: Both the FPGAArcade page and the Asteroids MAME driver suggest the vertical resolution is 788. Note that 192*4 = 768. Damn... can we get away with it though? OTOH the Coco3 does have a 256x200 video mode...
I also did some work on the C port yesterday lunchtime, adding the saucer to bring it in line with the 6809 port. Interestingly it has the same bug - or more accurately the same side-effect - as the Coco3 version, but for a completely different reason!
Thursday, 24 August 2017
Coco, Xevious and... Atari Jaguar now!?!
A few different topics today.
Firstly, the Apple II/IIGS version is temporarily on ice, but rest assured I will get back to it in the not-too-distant future. The current status is that it's all-but complete - if quite flickery - but I haven't figured out banking for direct page and stack registers yet, which is preventing the use of PEI slamming for the IIGS. I (also) plan on returning to legacy hires mode, and doing an accelerated II/IIC+ version, but this will probably come later.
In the meantime I started on the Coco3 port whilst waiting for some assistance on the aforementioned IIGS issue, and as a result I'm now on a roll and loath to put it aside. Looking purely at code volume, I'd estimate it's about 30% complete now, with all text, asteroids and saucer rendering complete (saucer movement is a bit off). You can coin-up but not a lot else happens.
Rendering is far from optimal; the aim of the exercise is to complete the porting of the arcade 6502 core with the simplest and smallest amount of graphics data required.
Oh and together with a possible Vectrex port (proof-of-concept at least) I'm also considering a port to the arcade Star Wars hardware - another 6809-based vector platform!
The C port will probably progress in step with the Coco3 port, and they will probably both be used to debug the other at various points. Again, that's probably for Neo Geo and Amiga.
And now for something completely different...
I don't recall exactly how, but I recently stumbled across reports of a port of the arcade game Xevious to the Atari Jaguar. From what I can gather, it's close to completion and will be released on cartridge for sale. It is particularly interesting to me because Xevious is my all-time favourite arcade game! I do have some knowledge of its internals, as I have in the dim dark past done some preliminary reverse-engineering for the purposes of both software (MAME) and hardware (FPGA) emulation. In both cases I was beaten to the punch by someone else, but it's not for nought as it is knowledge that I plan to use again one day.
There's scant information on the technical details of the port; I have no idea to what extent - if any - the original arcade code has been reverse-engineered, nor whether the port involves any sort of emulation (doubtful) or how faithful the Jaguar code is to the original. In any case, I'd be very interested in learning about the process, and getting my hands on any RE work already done. Not sure how likely any of that will be. For now, I plan on buying the cart.
Of course I started to look into the specifications of the Jaguar, and the resources and toolchains available for homebrew development. To my surprise the homebrew scene is quite active and, compared to similar platforms, the output is quite prolific - and that is due in no small part to the comparatively large number of Atari ST games ported to the platform!
As it turns out, a decent number of Atari ST games lend themselves to being quite easily patched to at least run on the Jaguar, and the architecture of the console - with no fewer than 3 decent CPUs - gives it the capability to screen-scrape the (largely incompatible format of) ST video memory!
[This is of course exactly what I have recently done with Asteroids on the Apple IIGS; patching as little as two instructions allows the core 6502 code to run on the IIGS, and the processing of the display list is perfectly analogous to screen-scraping!]
Developing for the Jaguar appears to be a tad more complicated than other platforms, though it is suggested that most homebrew development primarily commandeers the 68K 'management' CPU and the pair of purpose-built GPU/DSP devices are under-utilised. Also the toolchain involves the installation of a complete Ubuntu distribution - a far cry from simply having a makefile and AS6809.EXE and ASLINK.EXE in your path for the Coco3, for example!
Of course now I want to get my hands on some Jaguar hardware (not to mention Xevious!) Unfortunately - not unlike other platforms - the so-called SkunkBoard (basically a flash cartridge designed for homebrew developers on the Jaguar) doesn't appear to be currently in production. It's going to be an expensive foray...
The retro gaming scene is alive and never ceases to amaze me!
Wednesday, 16 August 2017
IIGS is no slam-dunk. Coco3 is looking good though...
Slow going on the Apple IIGS front, but I've finally figured out my half-screen issue when shadowing is enabled. After banging my head against a brick wall for a few days, I decided to look at the MAME driver source for the IIGS and immediately found the issue. Because the legacy Apple II hires screen memories overlap the SHR screen memory, you also need to disable the respective bits for those in the SHADOW register. And voila - a blank screen!
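Here's a deliberately simplified C model of why only half the screen was affected. The bit assignments are my understanding of the SHADOW register's inhibit bits and should be checked against the IIGS hardware reference; the model ignores the text-page and other bits entirely, and the names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed inhibit bits (1 = shadowing disabled for that region). */
#define SHADOW_INHIBIT_HIRES1 0x02  /* hires page 1, $2000-$3FFF */
#define SHADOW_INHIBIT_HIRES2 0x04  /* hires page 2, $4000-$5FFF */
#define SHADOW_INHIBIT_SHR    0x08  /* SHR buffer,   $2000-$9FFF */

/* Is a write to bank $01 at addr still mirrored into bank $E1? */
static bool bank01_write_is_shadowed(uint8_t shadow, uint16_t addr)
{
    if (addr >= 0x2000 && addr <= 0x3FFF && !(shadow & SHADOW_INHIBIT_HIRES1))
        return true;
    if (addr >= 0x4000 && addr <= 0x5FFF && !(shadow & SHADOW_INHIBIT_HIRES2))
        return true;
    if (addr >= 0x2000 && addr <= 0x9FFF && !(shadow & SHADOW_INHIBIT_SHR))
        return true;
    return false;
}
```

Under those assumptions, inhibiting only the SHR bit leaves $2000-$5FFF (16KB of the 32KB SHR area, i.e. the top half of the screen) still being shadowed via the two hires pages.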
Now I've hit another snag. I was intending to use so-called PEI-slamming to update the modified areas of the SHR screen, which requires that the direct page and stack are located in BANK1. However, I can't make any sense of the soft switches that control the bank for these and the current state doesn't make sense either. After putting out a call for help I've had a couple of responses but I'm still none the wiser.
Somewhat discouragingly, I did a quick experiment double-rendering the frame (one with shadowing disabled, the other enabled) and it's running a bit slower than the arcade game. Granted PEI-slamming will help, but it's going to be tight!
Whilst I'm struggling with IIGS technical issues I got impatient and started on the Coco3 (6809) port. Thus far I have a skeleton main loop that initialises the object table and adds the extra lives to the display list. I've coded up a skeleton rendering routine and fleshed out the CUR command and the tokenised command to display an extra life. It's far from optimal but it works!
First rendering on the Coco3
It's been mostly cut-and-paste from 6502 until this point. Again, the one thorn is the indirect indexed addressing mode of the 6502 but, unlike Lode Runner, it's not as extensively used in Asteroids. Other than that, there's endianness to be mindful of and I've actually swapped the byte order of words in the display list for more efficient rendering. I think this port is going to fall out relatively easily!
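The byte-order point, sketched in C. This is not the port's code, just an illustration with made-up helper names; the 6502 stores 16-bit words low byte first, the 6809 high byte first.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a word as the 6502 stores it: low byte first. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a word as the 6809 stores it: high byte first. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Swap the two bytes of every 16-bit word in a buffer, in place. */
static void swap_words(uint8_t *buf, size_t nbytes)
{
    for (size_t i = 0; i + 1 < nbytes; i += 2) {
        uint8_t t = buf[i];
        buf[i] = buf[i + 1];
        buf[i + 1] = t;
    }
}
```

Swapping once up front means the 6809 can pick up each display list word with a single 16-bit load (LDD, for instance) rather than two byte loads.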
Saturday, 12 August 2017
Half the screen in shadow!?!
Not a lot of progress, or time to work on it.
I did, however, take preliminary steps towards eliminating flicker. The plan is to utilise video shadowing, as described in this article by one of the authors of the IIGS port of Wolfenstein 3D.
Currently, I've got shadowing permanently enabled and simply write to bank $01, which is shadowed on-the-fly to the SHR memory in bank $E1. Slow and flickering, but no trickery to implement and of course easier to debug. But now it's time to up my game.
So in preparation, I simply changed my initialisation routine to disable shadowing, rather than enable it. As expected, I now get a blank screen as writes to bank $01 are not copied to the SHR screen at all. So far so good.
Next step is to disable shadowing at the start of every frame render. Since I've never actually enabled it anywhere, I should still get a blank screen - right? However, when I run it, I now see the top half of the screen! And it happens in both MAME and GSPort, so it's not likely to be an emulation bug.
My first thought was Alternate Display Mode, which can be activated from the Control Panel. I can only access it in GSPort, but it's already turned off. Toggling the value regardless doesn't make any difference. And for good measure, I also disabled interrupts.
It has me completely stumped. I've fielded some queries out there but for now, no responses.
With a little more spare time tonight, but no path forward on the IIGS port, I returned to the C port. I fixed the object update and now the asteroids move as expected in attract mode. Next I added some inputs, so you can coin-up and start a game. The next bit will be tricky, rendering the player ship, as that's quite involved.
I don't really have to go through the exercise of emulating the DVG anymore; with my tokenised display list I can simply back-port that to the C version and it'll be sufficient for all platforms that it'll be running on. However I figured there wasn't too much more work to do, and it may help in understanding some nuance at some point (eg. exploding ship), so I'll persist for now.
Hopefully soon I'll learn of my stupid mistake and I can get back to finishing off the IIGS port...
Tuesday, 8 August 2017
Offsets!
I've been digging into Norbert's Atari 800XL Asteroids Emulator and trying to ascertain if/where offsets are used for various objects. To reiterate: the position of an object is relative to the first point in the draw list of vectors that comprise the object. OTOH the position of a bitmap is (always) relative to one corner of the bounding rectangle. So in theory, displaying bitmaps in place of vector objects should require an offset for each object.
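In other words, something like the following C sketch, where ref_in_bitmap is the position of the object's first vector point measured from the bitmap's corner. The names and the idea of storing the offset as a per-object point are assumptions for illustration; it says nothing about how Norbert's code actually does it.

```c
#include <stdint.h>

typedef struct { int16_t x, y; } point_t;

/* Given the object's position (its first vector point) and where that
 * point sits inside the bitmap, return where the bitmap's corner goes. */
static point_t bitmap_corner(point_t obj_pos, point_t ref_in_bitmap)
{
    point_t c = { (int16_t)(obj_pos.x - ref_in_bitmap.x),
                  (int16_t)(obj_pos.y - ref_in_bitmap.y) };
    return c;
}
```

For example, an object at (100, 50) whose first vector point sits at (3, 4) within its bitmap would be drawn with the bitmap corner at (97, 46).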
I had luck with the player ship and (all) shots. The ship appears to be a lot closer to its true location now, as evidenced by the much reduced incidence of asteroids merely passing close-by and destroying your ship. Shots are definitely much more accurate - if not perfect - as hitting the smaller asteroids requires you to, well, actually hit them!
In order to maintain the current performance, the offsets are coded directly into the compiled sprites, rather than simply offsetting the object's coordinates and thus requiring additional calculations. It did unfortunately necessitate one extra conditional branch in the ship rendering dispatcher because of an odd-valued X offset, but there was no avoiding it.
Characters don't appear to have any offsets applied, as far as I can see. As for asteroids & saucers - eek! There are a few tables with small values that could feasibly contain offsets (one, for example, is indexed by asteroid shape and size) but I'm not sure I'm going to be able to reverse-engineer exactly how it all works, as it looks to my untrained eye that it starts to get into display lists. And those routines appear to be working directly on the vector hardware 1024x1024 coordinates as well.
To further add a spanner in the works, played against the Atari version there do still appear to be some inaccuracies between ship and shot and ship and asteroid, albeit subtle. They'll be fun to track down...
Anyway, I'm happy with the progress thus far, and to be honest, it's probably not going to be noticeable to any but the hard-core Asteroids experts out there... and they're not likely to be playing it on the IIGS I wouldn't think. I will look further into it, and perhaps it's a good excuse to crack open Norbert's C64 version which might be a little easier (for me) to follow.
However, I might actually move on now to the ship explosion and then flicker and sound before returning to this issue. I did notice tonight that with only a few small asteroids on the screen, the game is definitely running way too fast, so I'll look at throttling as well.
Discussions on the IIGS FB page tonight have me interested already in another project, though I would only consider it after I've done a proper feasibility study and I have collaborators. I have to admit that I even loaded up the ROM in IDAPro and had a quick look at the start-up code in my lunch break.
But at this stage it's still only a slight possibility and I still have other plans for Asteroids before I put it to bed.
Monday, 7 August 2017
5 lines of code to fix 3 issues!
With the wife wanting some company in the lounge room tonight while she did some crochet, I turned on the TV for some background noise (Jupiter Ascending - what an absolute FX overload) and fired up the laptop to get some of the easier of the remaining tasks out of the way.
To this end I defined the bitmaps for two extra characters, period and underscore, and updated the arcade code to print those rather than render them with discrete vector commands. The period is used in the high score list display, and the underscore during high score entry. Straightforward.
Next was the 'rubbish' that is left on the screen after coining-up and starting a game. I thought I'd found the culprit and fixed it. Although the subsequent game appeared to be remedied, in later games it reappeared. Back to the drawing board on that one.
Finally, despite the dipswitches being hard-coded for 3 ships/game, the 2nd and subsequent games all start with 4 ships. Found that one rather quickly; the code does an LSR on the (read-only) hardware location, branching on a Carry condition. Unfortunately not so benign when that hardware location is emulated in RAM... loading into A and then shifting was a simple fix.
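The difference between the two forms is easy to model. This is only a simulation sketch in Python rather than the 6502 code itself; the one-byte `ram` array and the switch value are made up for illustration.

```python
def lsr(value):
    """Model of a 6502 LSR on an 8-bit value: returns (result, carry)."""
    return (value >> 1) & 0xFF, value & 1


def read_switch_in_place(ram, addr):
    """Arcade style: LSR applied directly to the memory location.

    On real read-only hardware the write-back is ignored; when the
    location is emulated in RAM the stored value is shifted every call.
    """
    result, carry = lsr(ram[addr])
    ram[addr] = result
    return carry


def read_switch_via_a(ram, addr):
    """The fix: load into A, then shift A, leaving memory untouched."""
    a = ram[addr]
    _, carry = lsr(a)
    return carry


ram = bytearray([0x01])               # hypothetical emulated switch byte
print(read_switch_in_place(ram, 0))   # first game sees carry = 1
print(read_switch_in_place(ram, 0))   # second game sees carry = 0
ram = bytearray([0x01])
print(read_switch_via_a(ram, 0), read_switch_via_a(ram, 0))  # stable
```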
Somewhat interestingly, the game appeared to run faster on my laptop. Not sure if that was my imagination, a different version of MAME, different OS, or something else. Makes me even more interested in seeing it run on a real IIGS...
Aside from the above-mentioned rubbish, that only leaves the exploding ship and proper (accurate) alignment of the sprites before I tackle proper game speed throttling, flicker and sound. I might tackle alignment next, because that has the most detrimental effect on game play atm.
Oh and another thing I've forgotten about; 2-player mode. Trivial, but another task on the list.
Friday, 4 August 2017
VCF West Preview
I've had a kind offer to demonstrate Asteroids for the Apple IIGS at VCF West so today I added a quick text-mode splash screen.
Last-minute splash screen and loading bar (mid-load)
Datajerk's c2d utility allows you to display a text splash screen whilst the game is loading. It requires a dump of $400 bytes from text memory which of course on the Apple is a dog's breakfast. So I whipped up a quick C program that would allow me to easily lay out my screen and then write out a binary dump compatible with Apple II text screen memory.
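For anyone wondering why it is a dog's breakfast: the 24 text rows are interleaved in memory rather than stored consecutively. The sketch below is a Python stand-in for the idea, not the actual C program. It assumes the standard 40-column text page layout, upper-cases the text, sets the high bit for normal (non-inverse) characters and leaves the unused 'screen hole' bytes as zero; how c2d treats those holes is not something this sketch knows about.

```python
def text_row_offset(row):
    """Offset of text row 0..23 from the start of the $400-byte page."""
    return 0x80 * (row % 8) + 0x28 * (row // 8)


def build_text_dump(lines):
    """Return a $400-byte image of the 40x24 text page.

    lines is a list of up to 24 strings; each is upper-cased, clipped or
    padded to 40 columns, and stored with the high bit set.
    """
    dump = bytearray(0x400)              # screen holes stay zero
    for row in range(24):
        text = lines[row] if row < len(lines) else ""
        text = text.upper()[:40].ljust(40)
        base = text_row_offset(row)
        for col, ch in enumerate(text):
            dump[base + col] = ord(ch) | 0x80
    return bytes(dump)


if __name__ == "__main__":
    image = build_text_dump(["ASTEROIDS", "", "LOADING..."])
    with open("splash.bin", "wb") as f:  # hypothetical output filename
        f.write(image)
```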
Some eye-candy from the latest build.
End of game - attract mode
High score list
It will be interesting to gauge the reaction of attendees. Unfortunately it's still pre-alpha so there are glitches and flickering, but it was a last-minute offer.
Thursday, 3 August 2017
Closer to an Alpha Release!
Quick update; all of the IIGS code is pure 16-bit now except for two routines - the DVG CUR handler and the support routine that calculates SHR addresses and is only called from there. They need a complete overhaul and merging into one routine. On reflection there might just be enough savings to be had to be noticeable...
All of the rendering is done and optimised (including the small saucer and thrust) except for the exploding ship, plus I need to add data for 'dot' and 'underscore' characters for the high score entry & display. And that's it in terms of optimisation, unless I can coax more out of the hardware by using shadowing and/or blanking to my advantage. It's still around 18% faster from my crude calculations.
For an alpha release I'll rework the CUR routine, add the last pieces of the missing rendering, fix the object alignment, and add a crude text-mode splash screen. I'll tackle the flicker & sound after the alpha is out there.
Closer...
UPDATE: The DVG CUR routine - and hence all IIGS-specific code in the main execution loop - is now 16-bit and optimised. I replaced the x160 calculation with a table look-up, whose instantiation was facilitated in no small part by CA65's .REPEAT command (nice!)
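The idea behind the look-up is simply to precompute the start address of every scan line, since each SHR line is 160 bytes wide. The Python sketch below generates such a table and prints it as `.word` lines; whether the real table includes the $2000 screen base or just the raw multiples of 160 is an assumption on my part, as are the names.

```python
SHR_BASE = 0x2000        # start of SHR pixel data within its bank
BYTES_PER_LINE = 160
LINES = 200


def build_line_table():
    """One 16-bit start address per scan line, replacing a x160 multiply."""
    return [SHR_BASE + BYTES_PER_LINE * y for y in range(LINES)]


LINE_TABLE = build_line_table()


def shr_address(x_byte, y):
    """Address of byte column x_byte on line y, using the table."""
    return LINE_TABLE[y] + x_byte


def emit_words(table):
    """Format the table as assembler .word directives, one per line."""
    return [f"    .word ${addr:04X}" for addr in table]


if __name__ == "__main__":
    print("\n".join(emit_words(LINE_TABLE)))
```

At two bytes per entry the full 200-line table costs 400 bytes, which is cheap next to the compiled sprites.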
My latest crude calculation suggests it is ~28% faster with 4 large asteroids on the screen. After the alpha release I'll schedule rendering strictly to the arcade frame rate and see if it can keep up. Having said that, there's no guarantee that the arcade game maintains that frame rate either - there is leeway in the code to skip a frame or two before collapsing in a heap!
Tuesday, 1 August 2017
A shadow of a doubt!?!
I've added the last of the compiled sprites - the only outstanding graphics now are the ship's thrust and the exploding ship. The former I will probably tackle next.
Aside from fixing the saucer rendering, I've been cleaning up - and in the process optimising - the rendering and erase dispatch routines. Originally they were switching back and forth between 8- and 16-bit mode; all the rendering and erase routines themselves are now pure 16-bit code and the dispatchers are almost there. I also shaved some cycles off the asteroid render/erase dispatchers by streamlining my tokenised 'asteroid' instruction.
The DVG CUR handler does some Apple IIGS video calculations and stores values for use by subsequent render/erase routines. It needs a good overhaul - converting to pure 16-bit, doing away with a few values that aren't needed, and adding another that will enable further optimisation of the render/erase routines. However, there isn't a huge number of cycles to be saved per frame.
Incidentally, I was studying some IIGS code online and found the switch to change the border colour, so it's now black and does improve the aesthetics somewhat.
Finally, I activated SHR shadowing and expected to see a performance gain. Nada. I'll have to go back and study (again) what that actually does and how it is of benefit (if any) in my case.
As for remaining optimisations, there's not a lot else I can come up with atm. The so-called stack-blasting technique isn't really suited to this situation, nor is moving DP to the video memory. They're generally more suited to larger objects and opaque layers. I'd say it's not going to get significantly faster than what it is now, unless shadowing changes something.
One last task I forgot about; there appears to be a requirement for an adjustment factor for the objects rendered as bitmaps. This is due no doubt to the difference between the vector objects having arbitrary 'origins' (defined as the starting point of the beam) versus the origin of a bitmap always being one corner of the bounding rectangle. With luck I can deduce the required values from Norbert's code.
Edging closer to something suitable for an alpha release just to give people a taste of the game...
Monday, 31 July 2017
Location, location, location!
Tonight I completely re-arranged the memory map, compressing all the areas together at the bottom of memory, and also finally did away with the DVG ROM at the same time. The 6502 arcade ROM now resides at $2000 (as opposed to $6800) and together with the IIGS code extensions - including the compiled sprites - now extends to "just" $7055. Plenty of space to finish off the rendering routines now.
I forgot that the English (only) messages are actually stored in the DVG ROM, so when I eliminated the last stray write to sound hardware and got the game running, all was well except for the messages, which were rubbish. It finally twigged and I copied the English message tables into my core asteroids code module (together with the sine table) and it's all working as before.
High Score entry
And all this is now irrefutable proof that my Asteroids "source" code is fully relocatable, including all the ROM, RAM and hardware I/O location references! And now simply changing one assembler directive, for example, I could move it above the Apple II hires screen memory pages if I ever attempt a IIC+ version.
On another note altogether, someone asked me the other day whether I've had to decrease the frame rate. Thus far, the answer is 'no', and I'm hoping that won't change. It did remind me, however, that Norbert's emulator only renders every third frame! That could probably be improved somewhat now with my core, which eliminates all unnecessary display list calculations and operations.
And so onwards with the rendering and optimisations...
Sunday, 30 July 2017
Compiling a to-do list
Real Life has been intervening but I have been chipping away regularly at Asteroids.
I've now got compiled sprites for the bulk of the rendering, and all of the erase routines. And it finally runs faster than the arcade game, albeit only about 18% faster atm. I do still have some optimisations to do - but I also have to remove the flicker and add sound.
To generate code for the compiled sprites (and erases) I simply processed my bitmap .asm file in a quick 'n dirty C program. Not the most efficient tool but... old dog, new tricks...
Explaining exactly how my compiled sprites work is actually quite simple. Instead of looking up sprite data in a table and rendering it to the screen in a loop, I have one routine for each and every sprite that simply loads the sprite data into the A register as immediate operands and then writes that to the display using absolute indexed mode, with the X register containing what is effectively the video address of the sprite.
In the case of the IIGS, the bottleneck isn't actually the execution overhead of the loop or the amount of data per se, but rather the number of video accesses required, since these are rather slow. Where the compiled sprites make the most improvement in this case is that they only write pixels that are set, unlike a non-discerning loop which writes the entire bounding rectangle, set or not. And with the asteroid sprites in particular, where the pixels are either relatively sparse, or the asteroid itself is often a lot smaller than the bounding rectangle, the improvements can be significant.
Aside from skipping zero-pixels (or rather words of pixels), there's also no need to OR the value $FFFF to the display, so there's another video (read) access saved. And a mate had a good suggestion that I sort the values within each sprite and use Y as a temporary store (where appropriate) to save a few more cycles!
For certain sprites I didn't bother with some optimisations... for characters there's still a look-up table of data, since during the game there are only a few digits on the screen and little saving to be had in the character data itself. And for small sprites I simply erase the entire bounding rectangle since there are only 14 words to write.
So now, instead of look-up tables of sprite data, I have look-up tables of sprite rendering routines. The erase routines are similarly optimised; only those pixels that were set are erased, and of course it's sufficient to simply write $0000 to those words. The down side of course is increased memory usage - quite a bit in fact - and I've just hit the requirement to re-arrange my memory map because of it. Not surprising since up until this point I've left all the arcade RAM, ROM & hardware addresses in their original locations. That should be trivial to change with the 'source code' of course and there's still plenty of space left on the Apple so no danger of running out anytime soon.
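To give a flavour of what the generator does, here is a much-simplified sketch in Python (the real tool is a quick 'n dirty C program and differs in detail). It assumes each sprite is a flat list of 16-bit words, that every non-zero word is written with a plain store, and that rows are 160 bytes apart; the labels and function names are hypothetical. Sorting by value means words with the same value share a single load.

```python
def compile_sprite(name, words, words_per_row, stride=160):
    """Return assembly lines that draw the sprite, skipping zero words."""
    by_value = {}
    for i, w in enumerate(words):
        if w == 0:
            continue                      # nothing set: no video access
        row, col = divmod(i, words_per_row)
        by_value.setdefault(w, []).append(row * stride + col * 2)
    lines = [f"{name}:"]
    for value in sorted(by_value):        # one load per distinct value
        lines.append(f"    lda #${value:04X}")
        for off in by_value[value]:
            lines.append(f"    sta ${off:04X},x")
    lines.append("    rts")
    return lines


def compile_erase(name, words, words_per_row, stride=160):
    """Return assembly lines that clear only the words the sprite set."""
    lines = [f"{name}:", "    lda #$0000"]
    for i, w in enumerate(words):
        if w != 0:
            row, col = divmod(i, words_per_row)
            lines.append(f"    sta ${row * stride + col * 2:04X},x")
    lines.append("    rts")
    return lines


# A made-up 2x2-word sprite with one empty word.
demo = [0x0000, 0xFFFF, 0x0F0F, 0xFFFF]
print("\n".join(compile_sprite("draw_demo", demo, 2)))
print("\n".join(compile_erase("erase_demo", demo, 2)))
```

For the demo sprite this emits three stores and two loads to draw, against four read-modify-write accesses for a loop over the whole rectangle.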
Tonight though, before re-arranging the memory map, I thought I'd tackle the 'crash' at the high score entry routine. Turns out it wasn't a crash at all, but rather a combination of a bug on my part, and yet-to-be modified display code on the other; the core code was in fact running perfectly fine.
My rendering routine wasn't setting the start of the display list buffer correctly; when the high score message was being printed the display list exceeded 256 bytes and the MSB of the address incremented. I subsequently set that as the start of the buffer and so it only rendered what was in the 2nd page of the list - the last few words of the message.
Next issue was that I hadn't updated the routine that printed the initials as you entered them, so nothing showed up when changing letters. And finally, to select a letter it was debouncing the hyperspace key; having mapped that only to the Apple II keyboard you could never trigger the selection of a letter. So I mapped it to the 2nd joystick button and it fixed that issue.
So now you can coin up, play a game at (slightly more than) full speed, enter your initials on the high score screen, and start again.
I still have a few niggly bugs. There's an initialisation issue that sometimes results in garbage on the screen and non-zero velocity for the player. And starting a second game you have four lives instead of three. I'm sure they won't be difficult to track down, and possibly even the same bug.
And aside from bugs, there's re-arranging the memory map, adding compiled sprites for the player ship (already generated, I just ran out of space temporarily), adding the 'thrust' pixel, displaying the correct saucer size, and rendering of the exploding player ship. I also need to add special characters for 'underscore' (high score entry) and 'dot' (high score list). Lastly then, remove the flicker, add sound and a main menu screen.
Sunday, 23 July 2017
IIC, or not IIC, that is the question:
Whether 'tis nobler in the mind to suffer
The Apple II hires video memory map,
...
Due to an SOS call from my wife whilst I was en route to WOzFest (car trouble), I ended up missing the brief link-up with KFest. By the time I arrived, people had well and truly broken off into small groups to work on their own projects, which meant I didn't get the chance to see it running on real hardware.
I did get a chance to do a little more work on it though (despite heckling from the Peanut Gallery - you know who you are) and have now got pixel-shifted graphics rendering. Although somewhat hampered by the flickering graphics, on close inspection it is definitely animating more smoothly now.
I also decided to go down the path of so-called compiled sprites, reasoning that it wouldn't be very difficult to write a C program to parse my ASM bitmap data file to produce the requisite code. I've got one or two minor optimisations to effect first, and then I'll give it a spin. If that doesn't make a marked improvement, I'll be at a bit of a loss in terms of how to proceed further. As a first-pass I'll opt not to use stack-blasting and see where that gets me.
After chatting to a few learned fellow attendees at WOzFest it became apparent that the 4MHz IIC+ would be another good candidate for a port - even more capable than the IIGS in fact - with a faster CPU (same video memory bandwidth) but with a monochrome graphics mode meaning only 1/4 of the graphics data to push around. I'm fast running out of excuses to keep avoiding the legacy Apple II hires video display...
Thursday, 20 July 2017
Game On!
Excellent progress today - in fact it's in good enough shape to demo at WOzFest now although it would be nice to take it along even further if I get the chance!
The bulk of the rendering (sans exploding ship and thrust) and the erasing has been done. I've also managed to pilfer the joystick read routine from Lode Runner and as a result the game is actually playable, albeit slightly slower than the arcade original at this point.
I've still got some optimisations up my sleeve, from simple changes to the display list entry format through to stack-blasting hand-compiled sprites, so I'm still holding out hope that I can get it running fast enough to require being throttled by a IIGS interrupt. And I still haven't worked out the whole video shadowing mechanism; a bug in my code meant that I was never shadowing the SHR screen in the first place - and now when I turn it on, it can't read the keyboard... so there's that to play with as well.
I guess if all else fails I can revert to the legacy hires screen which is a lot less data to move.
From the video it's obvious I've got some graphics tweaks to do, including bit-shifting plus offsets from CUR for each object. There's flickering of course, and the odd glitch and then the minor matter of a complete crash at the end of the game.
As an aside, I got stuck on the joystick not working in the IIGS emulation under MAME. I could move left/up, but centering the joystick read back as $FF, so I couldn't move right/down. That forced my hand in trying to get the disk booting in GSPort, and I eventually realised the floppy disk image should be in slot 5, not slot 7. However GSPort had the same issue...
Then I finally twigged; the routine was written for a 1MHz machine and was running on a 2.8MHz machine. The counter was overflowing before even the centre position was detected! After slowing down the CPU it all started working!
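To put rough numbers on that overflow, here's a minimal model of a counting paddle read. It is not the Lode Runner routine; the 11-cycle loop cost, the 1400us centre time and the function name paddle_count are all assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cost of one pass through the polling loop, in CPU cycles. */
#define CYCLES_PER_LOOP 11u

/*
 * Model of a counting paddle read: an 8-bit counter is incremented until
 * the paddle timer expires, and the loop gives up when it reaches $FF.
 *   timer_us: how long the paddle timer runs for this stick position
 *   cpu_khz:  effective CPU clock in kHz
 */
uint8_t paddle_count(uint32_t timer_us, uint32_t cpu_khz)
{
    assert(cpu_khz > 0);
    /* loops completed = cycles elapsed / cycles per loop */
    uint32_t loops = (timer_us * cpu_khz) / (CYCLES_PER_LOOP * 1000u);
    return (loops > 0xFFu) ? 0xFFu : (uint8_t)loops;
}
```

With those assumed figures, a centred stick reads about 127 at 1MHz but saturates at $FF at 2.8MHz, which would explain why only left/up registered.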
So - finish off the graphics, tweak the display positions, optimise the erase/rendering, fix the Game Over bug. Then add sound, title screen, and release! I've got - realistically - only one more night to work on it before its debut.
Wednesday, 19 July 2017
IIGS Take 2
The experimental work I'd done on the IIGS port prior to starting on the port proper has certainly paid off; as of tonight it's rendering the characters, lives, copyright, asteroids and player ship (and the latter only when it should be). I should note that I'm yet to generate or code for the bit-shifted graphics.
It may look no different to the previous version, but the display list is greatly simplified - effectively tokenised - and all the dead code has been removed from the 6502 core, giving me more headroom on the IIGS. And I've still got a little more optimisation to do in my rendering routines.
IIGS Asteroids Take 2 - optimal display list
When I next get the chance I'll continue with the saucer, the shots and the shrapnel which should be as straightforward as the other objects have been until now. That'll just leave the exploding ship, which I am yet to work on at all.
I think that'll be a good point to research reading the IIGS keyboard; I've been encouraged by reading vague suggestions that it's possible to read the IIGS keyboard directly from the ADB. If that pans out I'll be able to make the game playable, if slow.
Then it'll be time to work on the bit-shifted graphics and erase logic (right now it's clearing the entire screen every frame). That should bring the game back up to speed and I'm hoping it'll actually then be too fast!
I'll be happy if I get to this point by the weekend for WOzFest Slot 7!
Beyond that, there'll be the addition of the exploding ship, sound (samples), support for variable beam brightness, and spit & polish and bells & whistles, such as a title screen, joystick/paddle support etc etc.
Tuesday, 18 July 2017
A token effort
I have to admit, I haven't been able to tear myself away from the C port to make any further progress on the IIGS port. However, it hasn't all been for nought as it has definitely reinforced my understanding of the arcade code, and cemented my decision regarding tokenising (optimising) the display list for the 8-bit ports.
Before I get to that; most of the work on the C code has been 'infrastructure work' and low-level DVG interface routines, which necessarily support both the new abstract display list and the original in parallel - to facilitate debugging and development. What that leaves, then, is the game logic and housekeeping code which generally tends to be easier to translate to C; the upshot of all this is that I don't think the C port is going to take very long to complete at all!
[Just for the record, I have the C port rendering all the text, including scores, and rendering and animating the asteroids themselves. The pseudo-random number generator is also in lock-step with the arcade machine and produces the same output at the appropriate times].
Keep in mind that the arcade code is only 6KB of 6502 - a lot of that munging 16-bit numbers - and it's not surprising that the C port isn't huge. From memory, Knight Lore was ~12KB of Z80 code and translated to ~5K lines of C. I'm around 1,300 lines for Asteroids already, and you could estimate it'll be in the vicinity of 2,500 lines.
Getting back to the IIGS (and 8-bit) ports; aside from the existing CUR (which sets the current beam coordinates) and HALT display list commands, there'll be a distinct command for the rendering of each object in the game, comprising character, extra ship, copyright, asteroid, ship, saucer, shot, shrapnel and exploding ship. I may add one last command to set the brightness - something the arcade code does but Norbert doesn't bother with - simply because the IIGS has the palette to support some variance in brightness.
Some of those commands will have one or two parameters, but all will render at the current beam position. The parameters will be succinct and optimised for the bitmap display routines. What this means is that I can actually remove a lot of code that generates the display list content that is irrelevant for the port, such as DVG subroutine calls or component vector commands. This is one area where I'll be able to improve performance over Norbert's emulators, only because I effectively have the arcade 6502 source that I can modify and re-assemble at will.
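As a rough picture of what a tokenised list might look like, here's a sketch of a list walker in C. The token values, the parameter counts and the one-byte-per-axis CUR coordinates are all hypothetical; the posts don't give the real encoding, and the optional brightness command is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical token values, one per object type named above. */
enum {
    TOK_HALT, TOK_CUR, TOK_CHAR, TOK_EXTRA_SHIP, TOK_COPYRIGHT,
    TOK_ASTEROID, TOK_SHIP, TOK_SAUCER, TOK_SHOT, TOK_SHRAPNEL,
    TOK_EXPLODING_SHIP, TOK_COUNT
};

/* Assumed number of parameter bytes following each token. */
static const uint8_t tok_params[TOK_COUNT] = {
    [TOK_HALT] = 0,     [TOK_CUR] = 2,        [TOK_CHAR] = 1,
    [TOK_EXTRA_SHIP] = 0, [TOK_COPYRIGHT] = 0, [TOK_ASTEROID] = 1,
    [TOK_SHIP] = 1,     [TOK_SAUCER] = 1,     [TOK_SHOT] = 0,
    [TOK_SHRAPNEL] = 1, [TOK_EXPLODING_SHIP] = 1
};

typedef void (*draw_fn)(uint8_t token, uint8_t x, uint8_t y,
                        const uint8_t *params);

/* Walk the list until HALT; every object is drawn at the current beam
 * position set by the most recent CUR. Returns the number of objects. */
size_t run_display_list(const uint8_t *dl, size_t len, draw_fn draw)
{
    assert(dl != NULL);
    uint8_t x = 0, y = 0;
    size_t drawn = 0, i = 0;

    while (i < len) {
        uint8_t tok = dl[i++];
        if (tok == TOK_HALT || tok >= TOK_COUNT)
            break;
        if (i + tok_params[tok] > len)
            break;                      /* truncated list */
        if (tok == TOK_CUR) {
            x = dl[i];
            y = dl[i + 1];
        } else {
            if (draw)
                draw(tok, x, y, &dl[i]);
            drawn++;
        }
        i += tok_params[tok];
    }
    return drawn;
}
```

The point of the shape is that each token maps straight onto one bitmap routine, so nothing in the list needs interpreting as vectors or DVG subroutine calls.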
I've also identified which of the bitmaps will and won't require bit-shifting, and which will require an extra byte's width to do so. Because, for example, all of the game's text message coordinates are fixed, specified on a 0-255 grid (before being scaled-up in the display list), and also happen to have even X coordinates, I don't have to bit-shift any of the character set for the IIGS 2BPP SHR graphics!
Most of the remaining bitmaps will require bit-shifting, and a few - not all - of those will require an additional byte's width to facilitate it. But that simply boils down to an extra compare and load for each object rendering, unless I need to really wring the performance out of the rendering routines.
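To show where the extra byte comes from, here's a small sketch that shifts one bitmap row right by a number of pixels. It assumes 2 bits per pixel, four pixels per byte, with the leftmost pixel in the top bits; shift_row is a made-up name and this isn't the generator I'm actually using.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Shift one bitmap row right by shift_px pixels (0..3) at 2 bits per pixel.
 * dst must have room for width + 1 bytes: the bits pushed out of the last
 * source byte land in the extra byte.
 */
void shift_row(const uint8_t *src, size_t width, unsigned shift_px,
               uint8_t *dst)
{
    assert(shift_px < 4);
    unsigned bits = shift_px * 2;
    uint8_t carry = 0;

    for (size_t i = 0; i < width; i++) {
        dst[i] = (uint8_t)((src[i] >> bits) | carry);
        /* bits shifted out of this byte become the top of the next one */
        carry = (uint8_t)(src[i] << (8 - bits));
    }
    dst[width] = carry;
}
```

When a bitmap already has enough blank columns on its right-hand edge, the extra byte comes out as zero for every shift and can be dropped, which is presumably why only some of the bitmaps need the additional width.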
My next task now is to generate shifted bitmap data, which is trivial, and essentially start over from scratch with the IIGS port. I'll probably have to stub out all the routines that write to the display list, and then begin work on the so-called tokenising version. None of that should be too difficult...
[UPDATE: I've regenerated the 6502 ASM file from my disassembly, starting the IIGS port from scratch. All of the DVG write routines have been stubbed-out so that only the CUR command is now written to display list. Next is tokenising the character command and then rendering it on the IIGS.]
As for the erasure; I'm planning on (eventually) making use of the ping-pong display list buffer. Immediately before rendering the new list, I'll simply re-parse the old buffer and use it essentially as dirty rectangles. I do have more sophisticated optimisation possibilities up my sleeve; it's useful to know, for example, that all objects are written to the display list in a fixed order. I'll leave all that, however, until I need it - if ever.
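The ping-pong erase idea, sketched in C on a toy screen. The real version would re-parse the old display list rather than keep rectangles around; the Rect and DisplayList types, the screen size and present_frame are all invented for the sake of the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCREEN_W 32
#define SCREEN_H 16
#define MAX_OBJS 8

typedef struct { uint8_t x, y, w, h; } Rect;
typedef struct { Rect objs[MAX_OBJS]; int count; } DisplayList;

uint8_t screen[SCREEN_H][SCREEN_W];  /* stand-in for video memory */
DisplayList lists[2];                /* the ping-pong pair */
int current;                         /* list being filled this frame */

/* Fill a rectangle, clipped to the screen. */
void fill(const Rect *r, uint8_t value)
{
    for (int row = r->y; row < r->y + r->h && row < SCREEN_H; row++)
        for (int col = r->x; col < r->x + r->w && col < SCREEN_W; col++)
            screen[row][col] = value;
}

/* Record this frame's objects, erase last frame's, draw, then swap. */
void present_frame(const Rect *objs, int count)
{
    assert(count >= 0 && count <= MAX_OBJS);
    DisplayList *old = &lists[current ^ 1];
    DisplayList *now = &lists[current];

    if (count > 0)
        memcpy(now->objs, objs, (size_t)count * sizeof *objs);
    now->count = count;

    for (int i = 0; i < old->count; i++)
        fill(&old->objs[i], 0);      /* old list acts as dirty rectangles */
    for (int i = 0; i < now->count; i++)
        fill(&now->objs[i], 1);

    current ^= 1;
}
```

Only the areas touched last frame get cleared, instead of the whole screen every frame.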
Wednesday, 12 July 2017
Asteroids with a 'C'
On Friday nights my wife & I traditionally watch a show together with which I've become rather bored in recent times. Rather than waste that hour last week I decided to set up the laptop in front of the TV and work on some aspect of Asteroids that required a minimum level of concentration. Ultimately I decided to start work on the C port of Asteroids, mainly because it required a lot of crank-the-handle type coding up front before any real work was required. Like defining data structures for zero page variables and player RAM.
Aside from the aforementioned, I also managed to code the main routine and stubs for all the subroutines called from there. Then over the next few nights I was keen to take it a little further; implementing a rather more 'abstract' display list to aid not only in development and debugging, but also to facilitate the so-called tokenising I'd be doing in the 8-bit ports. That entailed a DVG 'disassembler' of sorts which subsequently morphed itself into a DVG interpreter/emulator which was soon rendering a few vectors on the display.
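For anyone unfamiliar with the DVG, the first step of any disassembler is picking the opcode out of the top four bits of each 16-bit word. The sketch below uses the standard DVG mnemonics (LABS being the command I've been calling CUR); the function and enum names are made up for this example and aren't taken from my C port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum DvgOp { DVG_VCTR, DVG_LABS, DVG_HALT, DVG_JSRL,
             DVG_RTSL, DVG_JMPL, DVG_SVEC };

/* DVG words are stored low byte first. */
uint16_t dvg_fetch(const uint8_t *mem, size_t word_index)
{
    assert(mem != NULL);
    return (uint16_t)(mem[word_index * 2] | (mem[word_index * 2 + 1] << 8));
}

/* Classify a DVG word by its top nibble. */
enum DvgOp dvg_decode(uint16_t word)
{
    switch (word >> 12) {
    case 0xA: return DVG_LABS;  /* set beam position and scale */
    case 0xB: return DVG_HALT;
    case 0xC: return DVG_JSRL;  /* call vector subroutine */
    case 0xD: return DVG_RTSL;  /* return from subroutine */
    case 0xE: return DVG_JMPL;
    case 0xF: return DVG_SVEC;  /* short vector */
    default:  return DVG_VCTR;  /* 0x0-0x9: long vector */
    }
}
```

From there it's a short hop to an interpreter: follow JSRL/RTSL with a small stack and draw on VCTR/SVEC.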
Of course time is ticking for WOzFest and I do need to bite the bullet on the tokenised display list and optimisations for the IIGS. However it has been a very useful exercise and I've discovered a few subtleties of the DVG which had escaped me until now. Regardless, I really need to put it aside for now and continue on with the IIGS port. In the meantime, here's a sample rendering of what I have thus far.
Asteroids C port (Win7, GCC, Allegro)
Like my other C ports (Lode Runner and Knight Lore), the C code is as faithful to the original assembler source as practical, whilst optimising aspects of the original code such as using 16- and 32-bit variables rather than multiple bytes for things like addresses, scores, coordinates, etc. I retain all the same subroutines with the same names, albeit adding parameters for values passed in registers, etc. The logic within each routine is representative of the assembly code, differing only to accommodate the aforementioned optimisations and/or clarify the intent, without changing the underlying algorithm or compromising accuracy.
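A tiny illustration of the kind of optimisation I mean, using a 16-bit addition. The first function mirrors how the 6502 has to do it, a byte at a time with the carry passed along; the second is what it collapses to in C. Both names are invented for the example rather than lifted from the port.

```c
#include <assert.h>
#include <stdint.h>

/* 6502 style: a 16-bit value held as two bytes, added with explicit carry. */
void add16_bytes(uint8_t *lo, uint8_t *hi, uint8_t delta_lo, uint8_t delta_hi)
{
    assert(lo && hi);
    uint16_t sum = (uint16_t)(*lo + delta_lo);       /* add the low bytes */
    *lo = (uint8_t)sum;
    *hi = (uint8_t)(*hi + delta_hi + (sum >> 8));    /* high bytes + carry */
}

/* C port style: one variable, one addition, same wrap-around at 16 bits. */
uint16_t add16(uint16_t value, uint16_t delta)
{
    return (uint16_t)(value + delta);
}
```

The algorithm is unchanged; only the bookkeeping disappears.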
The end result is the same as the 8-bit assembler ports; a game that plays exactly - and looks as far as practical on the target hardware - the same as the original. And as I've discovered in the past, I've even been able to debug aspects of the assembler ports on the C version! In the case of Asteroids, I think the ability to inspect the display list so easily will come in handy down the track.
The C version should be portable to the Amiga and the Neo Geo at the very least. For Lode Runner the C port was an after-thought of the Coco3 (6809) port, but for Knight Lore, I developed it in parallel with the Coco3 port and it was, as I mentioned, very helpful. This time 'round, I'm undecided how I'll proceed once the IIGS port is finished...
Friday, 7 July 2017
To SNES or not to SNES?
No opportunity for any development today but time to ponder random aspects of the project. I was also prompted by gp2000 to look a little further into specific aspects of the code, and discovered something that should have been obvious from the start, but escaped me until today - so thanks George for that inadvertent trigger!
I did tweak some of the coordinate transformation and video address calculations today, converting my 6502 code into 65816 and improving the resolution of some of the calculations. Always good to see a half-page of 8-bit code reduce to a few lines of 16-bit code!
And in the comments of a previous post I pondered the feasibility of porting this to the TRS-80 Model 4. Aside from the effort of porting to yet another CPU (Z80) there's also the fact that the hires board is all-but-crippled by not only port-mapping the hires video memory, but also restricting access to (vertical?) blanking periods. George suggested a hybrid mode mixing the text and hires graphics screens... very interesting but a lot of work none-the-less. I'll put this in the 'maybe' basket.
And on the subject of alternate ports, the SNES sprang to mind! I know little about the technical specifications except for the fact that it is powered by a 65816 (clone). A quick Google reveals it supports 256x224 resolution, allows 128 sprites (up to 32/line) and has the usual tilemap(s).
I'm thinking this would be a no-brainer; text would appear on the tilemap layer, with 27 asteroids, player ship, saucer and 6 shots making up a maximum of 35 sprites on-screen. Extremely unlikely that they'd all appear on the same scan line, but if I was really pedantic about it I could implement a software priority scheme. But with all the arcade 6502 code running, plus the bulk of the IIGS 65816 code available, it wouldn't be a lot of work at all. I'm going to put this in the 'almost certainly' basket, and I might be tempted to tackle it immediately after the IIGS port is done.
EDIT: Doh! It's already been ported to the SNES by Digital Eclipse!
[Makes me wonder if I should be porting that version to IIGS!?!]
That's about it for random musings. A parting fact: whilst the vector display coordinates range from 0-1023, the game's virtual playfield coordinates actually range from 0-8191. Somehow that escaped me... now consider it's all scaled down to 256x192... or in the case of the TRS-80 text mode graphics, 128x48 (128x72 if I get really tricky).
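Since all the horizontal ranges above are powers of two, the scaling is just a right shift. A quick sketch, horizontal axis only (the vertical figures of 192, 48 and 72 aren't powers of two and aren't covered here); scale_coord is a made-up helper, not code from the port.

```c
#include <assert.h>
#include <stdint.h>

#define PLAYFIELD_BITS 13u   /* playfield coordinates are 0..8191 */

/* Scale a playfield coordinate down to a range of 2^target_bits values. */
uint16_t scale_coord(uint16_t playfield, unsigned target_bits)
{
    assert(playfield < (1u << PLAYFIELD_BITS));
    assert(target_bits <= PLAYFIELD_BITS);
    return (uint16_t)(playfield >> (PLAYFIELD_BITS - target_bits));
}
```

So 8191 becomes 1023 on the vector display (10 bits), 255 on a 256-wide bitmap (8 bits) and 127 on the 128-wide TRS-80 text graphics (7 bits); every step down throws away more of the position's low bits.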