Hello C64
The Commodore 64 was one of the first computers I played around with. I never had one myself until recent years, but a friend of mine did and we had some good times playing games on his little breadbox. So I have been looking forward to learning more about the hardware of the C64.
Before we jump into the code, let me just say that I started with the code a few years ago, but I was stumped by some erratic behaviour of the system. I thought it was strange the way the 64 behaved, and thought it was just a weird machine. But last week when I picked up the code again, I realised there was a silly bug in my code that wrote zeros to more or less random locations in RAM. That is why it behaved strangely, and once the bug was taken care of, I learned to enjoy my coding sessions.
The code
This time, I display Hello World in two different graphics modes, switching between them every few seconds. The two modes I used were both bitmap modes, so no character-based modes. However, the layout of the screen RAM is still based on blocks of 8 by 8 pixels, or 4 by 8 in the Multicolour bitmap mode. Please note that the following code has had its comments removed.
Let’s start by checking out the assembly code. For this example, I decided to run the program as a tokenised BASIC program. This kind of executable is always loaded at address $801 and contains BASIC tokens that are executed by the BASIC interpreter when you type run on the C64. Since I write all my code in assembly, the first few bytes are the tokenised representation of the one-line BASIC program 10 sys 2064 (2064 is $810 in hexadecimal).
        .processor 6502
        .org $801
        .byte $0c, $08, $0a, $00, $9e, $20
        .byte $32, $30, $36, $34, $00, $00, $00
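To make the role of each byte easier to see, the same stub can be written so that the assembler works out the pointer for us. This is only a sketch: it assumes dasm's .word directive and string operands for .byte, and the labels basic_line and basic_end are names made up for illustration. It should assemble to the same 13 bytes as the listing above.

```asm
        .processor 6502
        .org $801
basic_line:
        .word basic_end         ; pointer to the next BASIC line ($080c)
        .word 10                ; BASIC line number 10
        .byte $9e               ; token for SYS
        .byte $20               ; space
        .byte "2064"            ; decimal address of the machine code ($810)
        .byte $00               ; end of this BASIC line
basic_end:
        .word $0000             ; a null pointer marks the end of the program
```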
Since we told the interpreter to jump to address $810, we tell the assembler to put the next segment at address $810. Then we disable interrupts and start executing our main loop. This part is a bit different from the previous implementations I’ve made in this project. Since I wanted to try out more than one graphics mode, I separated the two by putting them in different subroutines and simply calling them one after the other in a loop.
        .org $810
        sei
main_loop:
        jsr hello_standard_bitmap_mode
        jsr hello_multi_color_bitmap_mode
        jmp main_loop
Standard bitmap mode
The first routine we execute uses the Standard bitmap mode, which can display 320 by 200 pixels, using 2 colours in every 8 by 8 block. In this mode, the pixel data is split into two areas: one where the 8x8 block data is stored, and another where we store which two colours should be displayed in each of the blocks. The Hello World graphics data is pretty boring, so the colour data will be set to the same two colours for all blocks of the screen: the classic white text on black background.
I decided to store the block data at address $2000 and the colours at $400, which you will see later in the code.
Let’s look at some code.
hello_standard_bitmap_mode: subroutine
        lda #$0
        sta $d011
        sta $d020
        jsr clear_screen
        lda #$10
        jsr fill_color_ram
        ldy #gfx_len-1
.copy_loop
        lda gfx,y
        sta $2000,y
        dey
        bpl .copy_loop
This bit of code is setting up the data for the graphics, but why are we writing 0 to $d011 and $d020?
On the C64, the memory area $d000 to $d3ff can be mapped to the VIC-II chip, which is the chip that handles the graphics on the 64. At $d011 we find the Control register 1. Together with Control register 2, which we will see more of later, this register controls the display mode and hardware scrolling. Setting bit 4 to 0 will disable displaying graphics, setting the whole screen to the border colour. Speaking of the border, address $d020 is where the border colour is stored, and setting it to 0 means the border is set to black.
So clearing the two memory addresses means the display will be all black.
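Writing 0 clears every bit of $d011 at once, which is fine here because the register is set up again later. If only the display enable bit should change, something like the following could be used. This is a sketch and not part of the original program: blank_display is a made-up name, and bit 7 is masked off as well because reading $d011 returns the top bit of the raster counter in that position.

```asm
blank_display: subroutine       ; hypothetical helper, not in the original source
        lda $d011
        and #%01101111          ; clear bit 4 (display enable) and bit 7 (raster bit 8)
        sta $d011
        lda #0
        sta $d020               ; border colour 0 = black
        rts
```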
When the screen is black, we jump to two different subroutines, one that clears the block data memory at address $2000, and the other will set the colours for all blocks to whatever register a contains. In this case, setting a to $10 means a bit with value 0 in the block data will be black (the low nibble of $10) and a 1 bit will be white (colour 1 in the C64 palette, from the high nibble of $10).
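The value $10 can also be built from the two colours it contains, which makes the nibble layout explicit. A minimal sketch, assuming the made-up constant names COLOR_BLACK and COLOR_WHITE:

```asm
COLOR_BLACK equ 0               ; C64 palette index 0 (hypothetical name)
COLOR_WHITE equ 1               ; C64 palette index 1 (hypothetical name)

        lda #COLOR_WHITE*16+COLOR_BLACK ; high nibble colours the 1 bits, low nibble the 0 bits
        jsr fill_color_ram              ; a = $10, as in the routine above
```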
When the colour data has been set up, we copy a bunch of data (defined later on) into the bitmap memory at $2000.
Now it’s time to switch to the correct display mode.
        lda #3
        sta $dd00
        lda #$3b
        ldx #$08
        ldy #$18
        sta $d011
        stx $d016
        sty $d018
The register at address $dd00 selects which part of the RAM is visible to the VIC-II. The VIC-II can only access 16K of RAM at a time, and setting this register to 3 will expose addresses from 0 to $3fff. The next few lines configure the VIC-II. As I stated before, register $d011 is Control register 1, and its companion register, Control register 2, has the address $d016.
Writing $3b to $d011 will set the vertical scroll to 3, screen height to 25 rows (200 pixels), enable the screen and switch to Bitmap mode. Writing $08 to $d016 sets the horizontal scroll to 0 and the screen width to 40 columns (320 pixels).
Register $d018 tells the VIC-II from where the graphics data should be read. 8 in the low nibble sets the block pixel data to be read from $2000, and 1 in the high nibble, as you might have figured out will set the colour data to be read from $400.
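The value for $d018 can be derived from the two addresses instead of being written as a magic number. The sketch below assumes dasm's square brackets for grouping expressions, and the names SCREEN_MEM, BITMAP_MEM and D018_VALUE are made up for illustration. Both addresses are relative to the 16K bank selected through $dd00.

```asm
SCREEN_MEM  equ $0400           ; colour data for the blocks (hypothetical name)
BITMAP_MEM  equ $2000           ; block pixel data (hypothetical name)

; high nibble: SCREEN_MEM in units of $400, bit 3: BITMAP_MEM in units of $2000
D018_VALUE  equ [SCREEN_MEM/$400]*16+[BITMAP_MEM/$2000]*8

        lda #D018_VALUE         ; evaluates to $18 for the addresses above
        sta $d018
```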
        ldx #$80
.wait   jsr wait_vbl
        dex
        bne .wait
        rts
This last bit just calls wait_vbl 128 times before returning from the subroutine.
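Since both display routines end with the same delay, the loop could be moved into a small helper. This is a sketch of one way to do it, not what the original source does. The name wait_frames is made up, and it relies on wait_vbl leaving register x untouched.

```asm
; wait_frames: wait for the number of frames given in x (0 means 256)
wait_frames: subroutine         ; hypothetical helper
.loop
        jsr wait_vbl
        dex
        bne .loop
        rts
```

A caller would then use ldx #$80 followed by jsr wait_frames. On a PAL machine running at about 50 frames per second, 128 frames is roughly two and a half seconds.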
Multicolour bitmap mode
In the multicolour mode pixels have double width, so each block of data consists of 4 by 8 pixels, but in this mode each block can contain 4 different colours. The first colour is defined in the background colour register $d021, so this is shared between all blocks. The next two colours are defined in the same way as the colours in the Standard bitmap mode, and the final colour is read from the hardwired Colour RAM at $d800 to $dbe7. In this example we only use two colours.
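To illustrate how the four colour sources are selected, here is how a byte of multicolour bitmap data is read, two bits per pixel. The label example_row is made up and the bytes are only there to show the encoding.

```asm
; One byte holds four double-width pixels, two bits each:
;   %00 = background colour from $d021
;   %01 = high nibble of the block's byte in screen RAM ($400 here)
;   %10 = low nibble of the same byte
;   %11 = low nibble of the block's byte in Colour RAM ($d800)
example_row:                    ; hypothetical label, not used by the program
        dc.b %01000000          ; $40: leftmost pixel uses %01, the rest is background
        dc.b %01010101          ; $55: all four pixels use %01
        dc.b %00011011          ; $1b: one pixel from each colour source, left to right
```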
hello_multi_color_bitmap_mode: subroutine
        lda #$0
        sta $d011
        jsr clear_screen
        lda #$0
        jsr fill_color_ram
        ldy #88-1
.copy_row
        lda multi_gfx,y
        sta $2200,y
        lda multi_gfx+88,y
        sta $2200+320,y
        dey
        bpl .copy_row
Much of the code here is the same as for the standard bitmap mode, however I changed the graphics a bit, so it now consists of two rows of data, and also we offset the graphics a bit so it’s drawn on the right side of the screen.
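The target address $2200 follows from the block layout: every block takes 8 bytes and a row of 40 blocks takes 320 bytes. A sketch of the calculation, with the made-up names TEXT_COL, TEXT_ROW and TEXT_ADDR:

```asm
BITMAP_MEM  equ $2000           ; start of the block pixel data
TEXT_COL    equ 24              ; block column, 0 to 39 (hypothetical name)
TEXT_ROW    equ 1               ; block row, 0 to 24 (hypothetical name)

TEXT_ADDR   equ BITMAP_MEM+TEXT_ROW*320+TEXT_COL*8   ; = $2200

        sta TEXT_ADDR,y         ; first row of blocks, same as sta $2200,y
        sta TEXT_ADDR+320,y     ; the row of blocks directly below
```

With 88 bytes per row, the graphics cover 11 blocks, columns 24 to 34.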
        lda #$b
        sta $d020
        sta $d021
        lda #3
        sta $dd00
        lda #$3b
        ldx #$18
        ldy #$18
        sta $d011
        stx $d016
        sty $d018
        ldx #$80
.wait   jsr wait_vbl
        dex
        bne .wait
        rts
First, we set the border and background colours to dark grey by writing $b to $d020 and $d021. This is followed by setting up the VIC-II registers. The only difference from the standard mode is that we set the bit that enables multicolour mode in $d016 (bit 4, which turns $08 into $18). Finally we wait for 128 frames before returning, just like before.
Help routines
Here come the helper routines we called from the code above, for clearing the screen, filling the colour data and waiting for VBL.
fill_color_ram: subroutine
        ldx #0
.cram_clear
        sta $400,x
        sta $500,x
        sta $600,x
        sta $700,x
        inx
        bne .cram_clear
        rts
Filling the area we chose for the block colours is pretty straightforward. We actually fill 24 bytes more than we need (1024 instead of the 1000 that are displayed), just to get simpler code.
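For comparison, here is a sketch of a variant that fills exactly 1000 bytes by splitting the area into four chunks of 250. It is not part of the original program and the name fill_color_ram_exact is made up. It leaves $7e8 to $7ff alone, which includes the sprite pointers at $7f8 to $7ff. That makes no difference in this example since no sprites are used.

```asm
fill_color_ram_exact: subroutine ; hypothetical variant, expects the fill value in a
        ldx #250
.loop
        dex                     ; x runs from 249 down to 0
        sta $400,x
        sta $400+250,x
        sta $400+500,x
        sta $400+750,x          ; last byte written is $400+999 = $7e7
        bne .loop               ; sta leaves the flags alone, so this tests the dex
        rts
```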
clear_screen: subroutine
        ldx #$00
        lda #$20
        stx $fb
        sta $fc
        ldy #0
        ldx #32
        lda #$0
.clear_loop
        sta ($fb),y
        iny
        bne .clear_loop
        inc $fc
        dex
        bne .clear_loop
        rts
The clear_screen subroutine clears the blocks of pixel data. The screen data is almost 8KB, and this method of clearing data is not very fast, but the amount of code is relatively small. We’re using memory at $fb and $fc as a base pointer for the screen. The first four instructions just store $2000 at those locations. Following that we set up register x for counting and y as an index register used as an offset from the base address.
The clearing itself uses the (<address>),y addressing mode. The effective address is calculated by taking the 16 bit address stored at <address> and adding the y register. First we loop 256 times and clear addresses $2000 to $20ff. Then we increase the byte at address $fc, which is the high byte of the screen memory pointer, decrease register x and start clearing the memory at $2100, unless x has reached 0.
Now time for a small anecdote. I started writing this example a few years ago, but I was so confused about the result I got. The C64 seemed to be so random and I could not understand why the code was behaving so erratically. When I picked up the code again a few weeks ago, I realised that I was using the (<address>,x) addressing mode. Using register X instead of Y might seem like a small thing, but the effective address is calculated in a totally different way. Instead of reading the address from $fb and $fc and then adding the y register, it adds the x register to the zero page address $fb and uses the address stored at that location. This meant I was clearing more or less random bytes all over the RAM instead of the contiguous bytes in screen RAM. When I fixed the bug I found that the C64 made much more sense.
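The difference between the two modes is easiest to see side by side. The routine below is a made-up illustration and not part of the program. It sets up two pointers in zero page so that both stores end up at known addresses.

```asm
addressing_demo: subroutine     ; hypothetical routine, for illustration only
        lda #$00
        sta $fb
        sta $fd
        lda #$20
        sta $fc                 ; pointer at $fb/$fc = $2000
        lda #$30
        sta $fe                 ; pointer at $fd/$fe = $3000

        lda #$ff
        ldy #2
        sta ($fb),y             ; pointer from $fb/$fc, plus y: writes to $2002
        ldx #2
        sta ($fb,x)             ; pointer from $fb+2 = $fd/$fe: writes to $3000
        rts
```

In the buggy version, x was counting while nothing sensible was stored at the zero page locations it selected, so each store used whatever two bytes happened to be there as its target address.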
One last subroutine before the graphics data.
wait_vbl: subroutine
        lda #$80
.w1:    bit $d011
        bpl .w1
.w2:    bit $d011
        bmi .w2
        rts
There are probably other ways to wait for a vertical blank on the C64, but this method was pretty simple. The highest bit in the register at $d011 contains the high bit of a 9 bit counter that is increased every scan line. Since there are more than 255 scanlines, and the value is reset to 0 between frames, we know that when this bit goes from 1 to 0, a new frame is about to be drawn.
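First we load register a with $80, which has only the high bit set. Strictly speaking this is not needed for the branches, since the bit instruction copies bit 7 of the byte it reads straight into the negative flag that bpl and bmi test. Then we loop until the high bit is 1, followed by a similar loop until the high bit is 0, and that’s it.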
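One of those other ways is to poll the raster register at $d012, which holds the low 8 bits of the same counter. The sketch below waits for raster line 251, which as far as I know is the first line below the 25-row display window. The name wait_line is made up, and this is an alternative for illustration, not what the program uses.

```asm
wait_line: subroutine           ; hypothetical alternative to wait_vbl
        lda #251
.w1     cmp $d012               ; low 8 bits of the raster counter
        bne .w1                 ; loop until the counter reads 251
.w2     cmp $d012
        beq .w2                 ; wait for the line to pass so the next call waits a frame
        rts
```

The high bit in $d011 does not need to be checked here, because a frame has far fewer than 507 lines, so the low byte only reads 251 once per frame.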
Graphics data
Not sure how interesting this is, but I’m going to include it just for the sake of it.
gfx:
        dc.b %00000000
        dc.b %01000100
        dc.b %01000100
        dc.b %01000101
        dc.b %01111101
        dc.b %01000101
        dc.b %01000100
        dc.b %00000000
        dc.b %00000000
        dc.b %00000101
        dc.b %11100101
        dc.b %00010101
        dc.b %11110101
        dc.b %00000101
        dc.b %11110101
        dc.b %00000000
        dc.b %00000000
        dc.b %00000001
        dc.b %00110001
        dc.b %01001001
        dc.b %01001001
        dc.b %01001001
        dc.b %00110000
        dc.b %00000000
        dc.b %00000000
        dc.b %00010000
        dc.b %00010011
        dc.b %00010100
        dc.b %01010100
        dc.b %01010100
        dc.b %10100011
        dc.b %00000000
        dc.b %00000000
        dc.b %00000001
        dc.b %00011001
        dc.b %10100101
        dc.b %10100001
        dc.b %10100001
        dc.b %00100001
        dc.b %00000000
        dc.b %00000000
        dc.b %00001010
        dc.b %00111010
        dc.b %01001010
        dc.b %01001010
        dc.b %01001000
        dc.b %00111010
        dc.b %00000000
gfx_len equ *-gfx
multi_gfx:
        dc.b 0,$00,$40,$40,$40,$40,$40,$40
        dc.b 0,$00,$40,$40,$40,$40,$40,$40
        dc.b 0,$00,$01,$01,$01,$01,$01,$01
        dc.b 0,$00,$10,$10,$10,$10,$10,$10
        dc.b 0,0,0,0,0,0,0,0
        dc.b 0,$00,$10,$10,$10,$10,$10,$10
        dc.b 0,$00,$10,$10,$10,$10,$10,$10
        dc.b 0,0,0,0,0,0,0,0
        dc.b 0,0,0,0,0,0,0,0
        dc.b 0,$00,$10,$10,$10,$10,$10,$10
        dc.b 0,$00,$04,$04,$04,$04,$04,$04
        ; row 2
        dc.b $40,$55,$40,$40,$40,$40,$40,0
        dc.b $41,$44,$44,$45,$44,$44,$41,0
        dc.b $41,$11,$11,$51,$01,$01,$51,0
        dc.b $10,$11,$11,$11,$11,$11,$10,0
        dc.b $50,$04,$04,$04,$04,$04,$50,0
        dc.b $10,$11,$11,$11,$11,$11,$04,0
        dc.b $10,$11,$11,$11,$11,$11,$40,0
        dc.b $50,$04,$04,$04,$04,$04,$50,0
        dc.b $44,$51,$40,$40,$40,$40,$40,0
        dc.b $10,$11,$11,$11,$11,$11,$10,0
        dc.b $54,$04,$04,$04,$04,$04,$54,0
multi_gfx_len equ *-multi_gfx
And that is all. I enjoyed learning about the C64 hardware and I might return to it soon to make something more advanced than just a Hello World hack.
Coming up next is probably another Atari console.
Links
- Hello World, C64 source file on bitbucket
- dasm - Assembler tool, I used the version provided by brew.sh
- C64-Wiki - A wiki for the C64.