So it’d be great to say we’re done - but there are still a couple of things to iron out.
Test ROM
The test ROM by corax89
is really handy - when run it will validate instructions and assuming you have the disp
instruction working (0xdxyn
) it will output one of ok
or no
for each instruction type.
I took it for a spin and was greeted by 3 things. The first was that it completed before I could make any sense of what happened - so I added a SDL_Delay(200);
after each iteration to slow things down (told you that timing thing would come back to haunt us…). The second was that every status code is displayed in the middle of the screen instead of in a grid format (woops). The third was a bunch of no
statuses for my “set register to register” and subtraction operations:
A quick glance at the code showed that I had swapped registers x
and y
around:
diff --git a/src/mylib.c b/src/mylib.c
index 8901755..0740405 100644
--- a/src/mylib.c
+++ b/src/mylib.c
@@ -102,7 +102,7 @@ void jumpIfRegNotEqualToReg(State *state, uint8_t reg1, uint8_t reg2)
}
void setRegisterToRegister(State *state, uint8_t reg1, uint8_t reg2)
{
- state->registers[reg2] = state->registers[reg1];
+ state->registers[reg1] = state->registers[reg2];
state->pc += 2;
}
void setRegisterToBitwiseOr(State *state, uint8_t reg1, uint8_t reg2)
For subtraction I messed up the mod 255
operation:
@@ -143,15 +143,14 @@ void addRegisters(State *state, uint8_t reg1, uint8_t reg2)
void subtractRegisters(State *state, uint8_t reg1, uint8_t reg2)
{
bool needBorrow = state->registers[reg2] > state->registers[reg1];
- state->registers[reg1] = (state->registers[reg1] - state->registers[reg2]) + (needBorrow ? 0xff : 0);
- // if we need a borrow, this is set 0
+ state->registers[reg1] = (state->registers[reg1] - state->registers[reg2]) % 0xff;
state->registers[0xf] = needBorrow ? 0 : 1;
state->pc += 2;
}
void subtractRightFromLeft(State *state, uint8_t reg1, uint8_t reg2)
{
bool needBorrow = state->registers[reg1] > state->registers[reg2];
- state->registers[reg1] = (state->registers[reg2] - state->registers[reg1]) + (needBorrow ? 0xff : 0);
+ state->registers[reg1] = (state->registers[reg2] - state->registers[reg1]) % 0xff;
// if we need a borrow, this is set 0
state->registers[0xf] = needBorrow ? 0 : 1;
state->pc += 2;
Fixing those and running the test ROM again gave me the all clear. But I couldn’t readily tell what was wrong with the display.
Fixing the display
Controlling execution
One thing that I wanted was the ability to suspend execution - to be able to log some of the state (e.g. dump the registers, show the memory, …), and resume/step through instructions. So I changed the main loop to this such that pressing space would suspend/resume execution, and pressing return would step through the next instruction (before suspending again).
We can see that working through the logs:
INFO: Pause toggled
INFO: Stepping through
INFO: Decoding a202 (A:a, B:2, C:0, D:2)
INFO: Stepping through
INFO: Decoding dab4 (A:d, B:a, C:b, D:4)
INFO: Stepping through
INFO: Decoding 6b10 (A:6, B:b, C:1, D:0)
Animating a sprite
The aim here is to write a small ROM (in assembly) that animates a sprite. This will hopefully show us what we might be doing wrong.
Picking the right sprite was the hardest bit. Thankfully there are free tools like pixilart that make this easy. I’m not much of an artist but this looked alien-like enough for me. Note the use of an asymmetric pattern to catch bit-flipping:
# # 01000010
# # # 00100101
###### 01111110
## ## ## 11011011
######## 11111111
###### 01111110
# # 00100100
#### 00111100
As I started to look through this I realised I made 2 mistakes. The first is that I took x,y
to be absolute coordinates whereas the specs specify those are registers - that’s the easy fix. The second mistake was to assume you always wrote pixels at 8-bit boundaries. Consider the top row of the sprite:
Step 1 - (0,0): 0100000010 (0,8): 00000000
Step 2 - (0,0): 0010000001 (0,8): 00000000
Step 3 - (0,0): 0001000000 (0,8): 10000000
Shifting the bit pattern doesn’t happen in multiples of 8. If we update a pattern at (0,3)
it will cause 2 memory addresses to change:
Current pixels: (0,0): 01100111 (0,8): 01100001
New pixels: (0,0): 10100 (0,8): 110
End result: (0,0): 01110100 (0,8): 11000001
Now because this implementation stores the state of the pixels in memory, we store them in groups of 8 bits - so updating pixels at places other than boundaries is a little trickier. I tried my hand at doing this but I couldn’t justify the complexity of the code (also, whilst it appeared to work, I couldn’t convince myself it was correct).
Pragmatism took over - instead of relying on writing the display to the “main” memory I added a bool
array representing the state of each pixel to the state:
Meaning that my setPixel
implementation was condensed to about 15 lines instead of the monstruosity it was before:
Whilst doing this I also re-read the original Byte magazine CHIP-8 instructions to sort out how the 0xf
register gets set (hint - it’s when you try to turn on a pixel that is already on, not when you change the state of a pixel, meaning it’s actually different from the draw
flag!).
The following test ROM will slide the alien’s face down the screen every second (based on the timer):
0xa2, 0x04, // set i to x204
0x12, 0x0c, // jump to the start of the program
0x42, 0x24, // alien sprite start
0x7e, 0xdb,
0xff, 0x7e,
0x24, 0x3c, // alien sprite end
0x61, 0x0, // r1=0
0x62, 0x0, // r2=0
0x0, 0xe0, // DISP: clear display
0xd1, 0x28, // display sprite
0x71, 0x01, // r1 += 1
0x72, 0x01, // r2 += 1
0x65, 0x3c, // r5 = 60
0xf5, 0x15, // set the timer to r5
0xf5, 0x07, // CHECK_TIMER: r5 = delay timer value
0x35, 0x00, // if r5 (the timer value) is 0, skip the next instructions
0x12, 0x1c, // jump back to CHECK_TIMER
0x12, 0x10, // jump back to DISP
Once satisfied it worked, re-running the test ROM gives the expected output:
Success!
Input capture
It turns out there’s a much simpler way of capture input. The below creates an array that stores the underlying state of each key:
The state of this is updated when various SDL
functions are called but the one that does this with minimal side effects is SDL_PumpEvents
- we just need to call it more or less every time we want to process input. Speaking of which, the latter is no longer a massive case statement based on an SDL_Event
:
Tidy heh?
Timers & timings
I left this till the end because… I wasn’t sure how I wanted to deal with it. After some (light) reading, I think the approach below should do the trick.
To start, let’s note that CHIP-8 timers (both the delay and the sound) run at 60Hz - this roughly translates to a tick every 0.0167 seconds. In other words, the timers should be decremented every 0.0167 seconds.
Leaving that aside, how many opcodes should we process per time step? The answer is a bit more complicated in that it really is “as many as we can but no more than”. The CHIP-8 virtual CPU runs at around 500Hz, or 500 cycles per second. We can make the assumption that each opcode takes one cycle to process (it doesn’t really - some might take more where e.g. memory access is required).
Combining both together, the pseudo-code for the loop goes like this:
clockSpeed = 500
timerDelta = 1/60.0
accumulator = 0
currTime = now()
while (running)
{
newNow = now()
elapsedTime = newNow - currTime
numCycles = int(elapsedTime*clockSpeed)
if (numCycles > 1)
{
currTime = newNow
timePerCycle = elapsedTime/numCycles
while (numCycles > 1)
{
procesInput
processOp
updateDisplay
accumulator += timePerCycle
while (accumulator > timerDelta)
{
if (delayTimer > 0) delayTimer -= 1
if (soundTimer > 0) soundTimer -= 1
accumulator -= timerDelta
}
maybeDelayIfTooFast
}
}
}
This places a lower bound on the execution - if we process operations faster than 500Hz (500 cycles per second), we’ll just wait until enough time has elapsed that we need to process something. Spoiler alert, this is exactly what happened when I tried this on my moderately modern laptop. It would execute a handful of cycles super fast and then wait till it accumulated enough “ticks” to process some instructions. This gave games a very noticeable stop/start delay. The solution was to add a fixed delay after each cycle of timePerCycle
. If anything I was wondering whether I was going too slow - but this clearly wasn’t the case.
The implementation is practically the same as the pseudo-code:
Any pause/stepping code was removed for clarity.
Conclusion
So now that we’re here, what next? Some might have noticed I didn’t wire up the sound (yet). I should really do this to call this project “complete”. Also the tests need a bit more TLC - there was a lot of experimentation towards the end and whilst they provided a solid foundation, they didn’t necessarily grow alongside the codebase.
Finally perhaps, debugging capabilities are minimal at best. A cool feature to add would be the ability to “save state” - which should be as simple as serialising the State
object - it contains all the information pertaining to the execution and loading should just be a case of restoring this.
All in all though it’s been a bag of fun - learnt tons on the way, and still am. I’d be interested to revisit this in another language down the line and see whether my approach would change.