Hello C64

The Commodore 64 was one of the first computers I played around with. I never had one myself until recent years, but a friend of mine did, and we had some good times playing games on his little breadbox. So I have been looking forward to learning more about the hardware of the C64.

Before we jump into the code, let me just say that I started on it a few years ago, but I was stumped by some erratic behaviour of the system. I thought the way the 64 behaved was strange, and that it was just a weird machine. But last week when I picked up the code again, I realised there was a silly bug in my code that wrote zeros to more or less random locations in RAM. That is why it behaved strangely, and when the bug was taken care of, I learned to enjoy my coding sessions.

The code

This time, I display Hello World in two different graphics modes, switching between them every few seconds. The two modes I used were both bitmap modes, so no character-based modes. However, the layout of the graphics memory is still based on blocks of 8 by 8 pixels, or 4 by 8 in the multicolour bitmap mode. Please note that the following code has had its comments removed.

Let’s start by checking out the assembly code. For this example, I decided to run the program as a tokenised BASIC program. This kind of executable is always loaded at address $801, and contains BASIC tokens that will be executed by the BASIC interpreter when you type run on the C64. Since I write all my code in assembly, the first few bytes are a representation of the BASIC program 10 sys 2064.

	.processor 6502
	.org $801

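	.byte $0c, $08, $0a, $00, $9e, $20	; link to next line ($080c), line number 10, SYS token, space
	.byte $32, $30, $36, $34, $00, $00, $00	; "2064", end of line, end of program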
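
To make the meaning of those 13 bytes easier to check, here is a small Python sketch (not part of the original program) that decodes them roughly the way the BASIC interpreter sees them. The names `STUB`, `LOAD_ADDRESS` and `decode_stub` are made up for this illustration.

```python
# Hypothetical helper for illustration only: decode the BASIC stub bytes.
LOAD_ADDRESS = 0x0801
STUB = bytes([0x0C, 0x08, 0x0A, 0x00, 0x9E, 0x20,
              0x32, 0x30, 0x36, 0x34, 0x00, 0x00, 0x00])
SYS_TOKEN = 0x9E  # token byte used for the SYS keyword in the stub


def decode_stub(stub):
    """Return (link to next line, line number, SYS target address)."""
    next_line = stub[0] | (stub[1] << 8)    # little-endian link to the next line
    line_number = stub[2] | (stub[3] << 8)  # little-endian line number
    assert stub[4] == SYS_TOKEN
    end = stub.index(0x00, 5)               # a zero byte terminates the line
    argument = stub[5:end].decode("ascii").strip()
    return next_line, line_number, int(argument)


next_line, line_number, target = decode_stub(STUB)
print(f"{line_number} sys {target}")
print(f"link to next line: ${next_line:04x}")
print(f"sys target: ${target:04x}")
```

The link points at the two zero bytes at the end of the stub, which mark the end of the BASIC program, and the decimal argument 2064 is the same address as $810.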

Since we told the interpreter to jump to address $810 (2064 in decimal), we tell the assembler to put the next segment at address $810. Then we disable interrupts and start executing our main loop. This part is a bit different from the previous implementations I’ve made in this project. Since I wanted to try out more than one graphics mode, I separated the two by putting them in different subroutines and just call them one after the other in a loop.

	.org $810
	sei
main_loop:
	jsr	hello_standard_bitmap_mode
	jsr	hello_multi_color_bitmap_mode
	jmp	main_loop

Standard bitmap mode

The first routine we execute uses the Standard bitmap mode, which can display 320 by 200 pixels, using 2 colours in every 8 by 8 block. In this mode, the pixel data is split into two areas: one where the 8x8 block data is stored, and another where we store which two colours should be displayed in each of the blocks. The Hello World graphics data is pretty boring, so the colour data will be set to the same two colours for all blocks of the screen. The classic white text on black background.

I decided to store the block data at address $2000 and the colours at $400, which you will see later in the code.

Let’s look at some code.

hello_standard_bitmap_mode: subroutine
	lda	#$0
	sta	$d011
	sta	$d020

	jsr	clear_screen

	lda	#$10
	jsr	fill_color_ram

	ldy	#gfx_len-1
.copy_loop
	lda	gfx,y
	sta	$2000,y
	dey
	bpl	.copy_loop	; bpl rather than bne, so that byte 0 is copied too

This bit of code is setting up the data for the graphics, but why are we writing 0 to $d011 and $d020?

On the C64, the memory area $d000 to $d3ff can be mapped to the VIC-II chip, which is the chip that handles the graphics on the 64. At $d011 we find Control register 1. Together with Control register 2, which we will see more of later, this register controls the display mode and hardware scrolling. Setting bit 4 to 0 will disable displaying graphics, setting the whole screen to the border colour. Speaking of the border, address $d020 is where the border colour is stored, and setting it to 0 means the border is set to black.

So clearing the two memory addresses means the display will be all black.

When the screen is black, we jump to two different subroutines, one that clears the block data memory at address $2000, and the other will set the colours for all blocks to whatever register a contains. In this case, setting a to $10 means a bit with value 0 in the block data will be black (the low nibble of $10) and a 1 bit will be white (colour 1 in the C64 palette, from the high nibble of $10).
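
As a quick illustration of how one colour byte is split into two colours, the following Python sketch (a hypothetical helper, not part of the original program) separates the nibbles of $10. The tiny `PALETTE` table only lists the colour numbers that appear in this post.

```python
def bitmap_cell_colours(screen_byte):
    """Return (colour for 0 bits, colour for 1 bits) for one 8x8 block."""
    zero_bits = screen_byte & 0x0F        # low nibble
    one_bits = (screen_byte >> 4) & 0x0F  # high nibble
    return zero_bits, one_bits


# Only the C64 colour numbers used in this post.
PALETTE = {0: "black", 1: "white", 11: "dark grey"}

zero_bits, one_bits = bitmap_cell_colours(0x10)
print("0 bits:", PALETTE[zero_bits])
print("1 bits:", PALETTE[one_bits])
```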

When the colour data has been set up, we copy a bunch of data (defined later on) into the bitmap memory at $2000.

Now it’s time to switch to the correct display mode.

	lda	#3
	sta	$dd00
	lda	#$3b
	ldx	#$08
	ldy	#$18
	sta	$d011
	stx	$d016
	sty	$d018

The register at address $dd00 is for setting what part of the RAM is visible to the VIC-II. The VIC-II can only access 16K of RAM at a time, and setting this register to 3 will expose addresses from 0 to $3fff. The next few lines configure the VIC-II. As I stated before, register $d011 is Control register 1, and its companion register, Control register 2, has the address $d016.
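
The bank selection can be sketched in a few lines of Python. This is only a model of the two low bits of $dd00 (the function name `vic_bank_range` is made up here), and it relies on the fact that those two bits are inverted: 3 selects the lowest bank and 0 the highest.

```python
def vic_bank_range(dd00_value):
    """Return the (first, last) address the VIC-II can see for a $dd00 value."""
    bank = 3 - (dd00_value & 0b11)  # the two low bits select the bank, inverted
    start = bank * 0x4000           # each bank is 16K
    return start, start + 0x3FFF


for value in range(4):
    start, end = vic_bank_range(value)
    print(f"$dd00 low bits = {value}: ${start:04x}-${end:04x}")
```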

Writing $3b to $d011 will set the vertical scroll to 3, the screen height to 25 rows (200 pixels), enable the screen and switch to Bitmap mode. Writing $08 to $d016 sets the horizontal scroll to 0 and the screen width to 40 columns (320 pixels).

Register $d018 tells the VIC-II where the graphics data should be read from. 8 in the low nibble sets the block pixel data to be read from $2000, and 1 in the high nibble, as you might have figured out, will set the colour data to be read from $400.
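
The three register values can be taken apart bit by bit. The Python sketch below is a hypothetical decoder written for this post, not a complete description of the registers: it only covers the bits discussed here, and `bank_start` is assumed to be 0 because of the value written to $dd00.

```python
def decode_d011(value):
    """Control register 1: only the bits discussed in the text."""
    return {
        "vertical_scroll": value & 0b111,
        "rows": 25 if value & 0x08 else 24,
        "display_enabled": bool(value & 0x10),
        "bitmap_mode": bool(value & 0x20),
    }


def decode_d016(value):
    """Control register 2: scroll, width and the multicolour bit."""
    return {
        "horizontal_scroll": value & 0b111,
        "columns": 40 if value & 0x08 else 38,
        "multicolour": bool(value & 0x10),
    }


def decode_d018(value, bank_start=0x0000):
    """Memory setup in bitmap mode: bitmap base and colour (screen) base."""
    return {
        "bitmap_base": bank_start + (0x2000 if value & 0x08 else 0x0000),
        "screen_base": bank_start + ((value >> 4) & 0x0F) * 0x0400,
    }


print(decode_d011(0x3B))
print(decode_d016(0x08))
print({name: hex(addr) for name, addr in decode_d018(0x18).items()})
```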

	ldx	#$80
.wait	jsr	wait_vbl
	dex
	bne	.wait
	rts

This last bit just calls wait_vbl 128 times before returning from the subroutine.
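
How long is 128 frames? A rough back-of-the-envelope calculation in Python, assuming nominal refresh rates of 50 Hz for PAL and 60 Hz for NTSC (real machines deviate slightly from these figures):

```python
FRAMES = 0x80  # the value loaded into x before the wait loop


def wait_seconds(frames, refresh_hz):
    """Approximate duration of waiting for a number of frames."""
    return frames / refresh_hz


for name, hz in (("PAL", 50), ("NTSC", 60)):
    print(f"{name}: about {wait_seconds(FRAMES, hz):.2f} s")
```

So each mode stays on screen for a bit more than two seconds before the other one takes over.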

Multicolour bitmap mode

In the multicolour mode, pixels have double width, so each block of data consists of 4 by 8 pixels, but in this mode each block can contain 4 different colours. The first colour is defined in the background colour register $d021, so this is shared between all blocks. The next two colours are defined in the same way as the colours in the Standard bitmap mode, and the final colour is read from the hardwired Colour RAM at $d800 to $dbe7. In this example we only use two colours.
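
To see how one byte turns into four double-width pixels, here is a small Python sketch (the names `SOURCES` and `multicolour_pixels` are made up for this illustration). It splits a byte into its four bit pairs and looks up where the VIC-II fetches the colour for each pair. The sample bytes are taken from the multi_gfx table at the end of this post.

```python
# Where the colour of each bit pair comes from in multicolour bitmap mode.
SOURCES = {
    0b00: "background colour ($d021)",
    0b01: "colour byte, high nibble",
    0b10: "colour byte, low nibble",
    0b11: "Colour RAM ($d800 onwards)",
}


def multicolour_pixels(byte):
    """Split one bitmap byte into four 2-bit pixel values, left to right."""
    return [(byte >> shift) & 0b11 for shift in (6, 4, 2, 0)]


for sample in (0x40, 0x55, 0x11):
    pixels = multicolour_pixels(sample)
    print(f"${sample:02x} -> {pixels}")
    for pair in sorted(set(pixels)):
        print(f"    {pair:02b}: {SOURCES[pair]}")
```

All bytes in the multi_gfx table only use the pairs 00 and 01, which is why the example gets away with two colours.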

hello_multi_color_bitmap_mode: subroutine
	lda	#$0
	sta	$d011

	jsr	clear_screen

	lda	#$0
	jsr	fill_color_ram

	ldy	#88-1
.copy_row
	lda	multi_gfx,y
	sta	$2200,y
	lda	multi_gfx+88,y
	sta	$2200+320,y
	dey
	bpl	.copy_row

Much of the code here is the same as for the standard bitmap mode. However, I changed the graphics a bit, so it now consists of two rows of data, and we also offset the graphics a bit so it’s drawn on the right side of the screen.
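
Where on the screen do $2200 and $2200+320 end up? The following Python sketch (a hypothetical helper, assuming the bitmap starts at $2000 as in the code above) converts a bitmap address into a block row, block column and line within the block, using the fact that each block is 8 bytes and each row holds 40 blocks.

```python
BITMAP_BASE = 0x2000
BYTES_PER_BLOCK = 8
BLOCKS_PER_ROW = 40


def locate(address):
    """Return (block row, block column, line within block) for an address."""
    offset = address - BITMAP_BASE
    row, rest = divmod(offset, BLOCKS_PER_ROW * BYTES_PER_BLOCK)
    column, line = divmod(rest, BYTES_PER_BLOCK)
    return row, column, line


print(locate(0x2200))             # first byte of the first copied row
print(locate(0x2200 + 87))        # last byte of the first copied row
print(locate(0x2200 + 320))       # first byte of the second copied row
```

The 88 bytes of each row cover 11 blocks, so the text occupies block columns 24 to 34 on block rows 1 and 2.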

	lda	#$b
	sta	$d020
	sta	$d021

	lda	#3
	sta	$dd00
	lda	#$3b
	ldx	#$18
	ldy	#$18
	sta	$d011
	stx	$d016
	sty	$d018

	ldx	#$80
.wait	jsr	wait_vbl
	dex
	bne	.wait
	rts

First, we set the border and background colours to dark grey by writing $b to $d020 and $d021. This is followed by setting up the VIC-II registers. The only difference from the standard mode is that we set the bit that enables multicolour mode in $d016. Finally, we wait for 128 frames before returning, just like before.

Helper routines

Here come the helper routines we called from the code above, for clearing the screen, filling in the colour data and waiting for VBL.

fill_color_ram: subroutine
	ldx	#0
.cram_clear
	sta	$400,x
	sta	$500,x
	sta	$600,x
	sta	$700,x
	inx
	bne	.cram_clear
	rts

Filling the area we chose for block colours is pretty straightforward. We actually fill 24 bytes more than we need (1024 instead of 1000) just to get simpler code.

clear_screen: subroutine
	ldx	#$00
	lda	#$20
	stx	$fb
	sta	$fc

	ldy	#0
	ldx	#32
	lda	#$0
.clear_loop
	sta	($fb),y
	iny
	bne	.clear_loop
	inc	$fc
	dex
	bne	.clear_loop
	rts

The clear_screen subroutine clears the blocks of pixel data. The screen data is almost 8KB, and this method of clearing data is not very fast, but the amount of code is relatively small. We’re using memory at $fb and $fc as a base pointer for the screen. The first four instructions just store $2000 at those locations. Following that, we set up register x for counting pages and y as an index register used as an offset from the base address.

The clearing itself uses the (<address>),y addressing mode. The effective address is calculated by taking the 16 bit address at <address> and adding the y register. First we loop 256 times and clear addresses $2000 to $20ff. Then we increase the byte at address $fc, which is the high byte of the screen memory pointer, decrease register x and start clearing the memory at $2100, unless x has reached 0.

Now time for a small anecdote. I started writing this example a few years ago, but I was so confused about the result I got. The C64 seemed to be so random and I could not understand why the code was behaving so erratically. When I picked up the code again a few weeks ago, I realised that I was using the (<address>,x) addressing mode. Using register x instead of y might seem like a small thing, but the effective address is calculated in a totally different way. Instead of reading the address from $fb and $fc and then adding the y register, it adds the x register to $fb and uses the address stored at that location. This meant I was clearing more or less random bytes all over the RAM instead of the contiguous bytes in screen RAM. When I fixed the bug I found that the C64 made much more sense.
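
The difference between the two addressing modes is easy to model. The Python sketch below is only a simplified illustration of the address calculation, not an emulator. The pointer at $fb/$fc is the one from clear_screen, while the bytes placed at $fd/$fe are made-up values standing in for whatever happens to live in the neighbouring zero page locations.

```python
def indirect_indexed(memory, zp, y):
    """(zp),y: read the 16 bit pointer at zp, then add y to it."""
    pointer = memory[zp] | (memory[(zp + 1) & 0xFF] << 8)
    return (pointer + y) & 0xFFFF


def indexed_indirect(memory, zp, x):
    """(zp,x): add x to zp first, then read the 16 bit pointer found there."""
    location = (zp + x) & 0xFF
    return memory[location] | (memory[(location + 1) & 0xFF] << 8)


memory = bytearray(0x10000)
memory[0xFB], memory[0xFC] = 0x00, 0x20  # the pointer to $2000
memory[0xFD], memory[0xFE] = 0x34, 0x12  # made-up neighbouring bytes

for index in range(3):
    good = indirect_indexed(memory, 0xFB, index)
    bad = indexed_indirect(memory, 0xFB, index)
    print(f"index {index}: ($fb),y -> ${good:04x}   ($fb,x) -> ${bad:04x}")
```

With y as the index the addresses walk through $2000, $2001, $2002 and so on. With x as the index only the first address is right, and after that the target depends entirely on what is stored after $fb in the zero page.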

One last subroutine before the graphics data.

wait_vbl:
	lda	#$80
.w1:	bit	$d011
	bpl	.w1
.w2:	bit	$d011
	bmi	.w2
	rts

There are probably other ways to wait for a vertical blank on the C64, but this method was pretty simple. The highest bit in the register at $d011 contains the high bit of a 9 bit counter that is incremented every scanline. Since there are more than 255 scanlines, and the value is reset to 0 between frames, we know that when this bit goes from 1 to 0, a new frame is about to be drawn.

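First we load register a with $80, which has only the high bit set. Then we loop until the value in the high bit is 1, followed by a similar loop until the high bit is 0, and that’s it. Strictly speaking, the bit instruction copies bit 7 of the memory operand straight into the negative flag, so the two branches do not actually depend on the value in a.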
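
The two loops can be simulated with a simple model of the scanline counter. The Python sketch below assumes a PAL machine with 312 scanlines per frame and, as a simplification, advances the counter by one line per poll (the real loop polls many times per line). The names are hypothetical and only exist for this illustration.

```python
LINES_PER_FRAME = 312  # assumption: PAL timing


def d011_bit7(raster_line):
    """Bit 7 of $d011 mirrors bit 8 of the 9 bit scanline counter."""
    return (raster_line >> 8) & 1


def simulate_wait_vbl(raster_line):
    """Return (line on which the routine returns, number of lines waited)."""
    waited = 0
    while d011_bit7(raster_line) == 0:  # .w1: bpl loops while bit 7 is clear
        raster_line = (raster_line + 1) % LINES_PER_FRAME
        waited += 1
    while d011_bit7(raster_line) == 1:  # .w2: bmi loops while bit 7 is set
        raster_line = (raster_line + 1) % LINES_PER_FRAME
        waited += 1
    return raster_line, waited


for start in (0, 100, 300):
    line, waited = simulate_wait_vbl(start)
    print(f"start at line {start}: returns at line {line} after {waited} lines")
```

Whatever scanline the routine is entered on, the model returns when the counter wraps around to 0, which is why 128 calls add up to roughly 128 frames.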

Graphics data

Not sure how interesting this is, but I’m going to include it just for the sake of it.

gfx:
  dc.b    %00000000
  dc.b    %01000100
  dc.b    %01000100
  dc.b    %01000101
  dc.b    %01111101
  dc.b    %01000101
  dc.b    %01000100
  dc.b    %00000000

  dc.b    %00000000
  dc.b    %00000101
  dc.b    %11100101
  dc.b    %00010101
  dc.b    %11110101
  dc.b    %00000101
  dc.b    %11110101
  dc.b    %00000000

  dc.b    %00000000
  dc.b    %00000001
  dc.b    %00110001
  dc.b    %01001001
  dc.b    %01001001
  dc.b    %01001001
  dc.b    %00110000
  dc.b    %00000000

  dc.b    %00000000
  dc.b    %00010000
  dc.b    %00010011
  dc.b    %00010100
  dc.b    %01010100
  dc.b    %01010100
  dc.b    %10100011
  dc.b    %00000000

  dc.b    %00000000
  dc.b    %00000001
  dc.b    %00011001
  dc.b    %10100101
  dc.b    %10100001
  dc.b    %10100001
  dc.b    %00100001
  dc.b    %00000000

  dc.b    %00000000
  dc.b    %00001010
  dc.b    %00111010
  dc.b    %01001010
  dc.b    %01001010
  dc.b    %01001000
  dc.b    %00111010
  dc.b    %00000000
gfx_len equ	*-gfx


multi_gfx:
	dc.b	0,$00,$40,$40,$40,$40,$40,$40
	dc.b	0,$00,$40,$40,$40,$40,$40,$40
	dc.b	0,$00,$01,$01,$01,$01,$01,$01
	dc.b	0,$00,$10,$10,$10,$10,$10,$10
	dc.b	0,0,0,0,0,0,0,0
	dc.b	0,$00,$10,$10,$10,$10,$10,$10
	dc.b	0,$00,$10,$10,$10,$10,$10,$10
	dc.b	0,0,0,0,0,0,0,0
	dc.b	0,0,0,0,0,0,0,0
	dc.b	0,$00,$10,$10,$10,$10,$10,$10
	dc.b	0,$00,$04,$04,$04,$04,$04,$04
	; row 2
	dc.b	$40,$55,$40,$40,$40,$40,$40,0
	dc.b	$41,$44,$44,$45,$44,$44,$41,0
	dc.b	$41,$11,$11,$51,$01,$01,$51,0
	dc.b	$10,$11,$11,$11,$11,$11,$10,0
	dc.b	$50,$04,$04,$04,$04,$04,$50,0
	dc.b	$10,$11,$11,$11,$11,$11,$04,0
	dc.b	$10,$11,$11,$11,$11,$11,$40,0
	dc.b	$50,$04,$04,$04,$04,$04,$50,0
	dc.b	$44,$51,$40,$40,$40,$40,$40,0
	dc.b	$10,$11,$11,$11,$11,$11,$10,0
	dc.b	$54,$04,$04,$04,$04,$04,$54,0

multi_gfx_len equ *-multi_gfx
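
For the curious, the standard bitmap data in gfx can be previewed without a C64. The Python sketch below copies the 48 bytes from the gfx table above and prints them as text, placing the six 8x8 blocks side by side in the same order the VIC-II does. The names `GFX` and `render` are made up for this illustration.

```python
# The six 8x8 blocks from the gfx table, one inner list per block.
GFX = [
    [0b00000000, 0b01000100, 0b01000100, 0b01000101,
     0b01111101, 0b01000101, 0b01000100, 0b00000000],
    [0b00000000, 0b00000101, 0b11100101, 0b00010101,
     0b11110101, 0b00000101, 0b11110101, 0b00000000],
    [0b00000000, 0b00000001, 0b00110001, 0b01001001,
     0b01001001, 0b01001001, 0b00110000, 0b00000000],
    [0b00000000, 0b00010000, 0b00010011, 0b00010100,
     0b01010100, 0b01010100, 0b10100011, 0b00000000],
    [0b00000000, 0b00000001, 0b00011001, 0b10100101,
     0b10100001, 0b10100001, 0b00100001, 0b00000000],
    [0b00000000, 0b00001010, 0b00111010, 0b01001010,
     0b01001010, 0b01001000, 0b00111010, 0b00000000],
]


def render(blocks):
    """Return 8 text rows with '#' for 1 bits and '.' for 0 bits."""
    rows = []
    for line in range(8):
        bits = "".join(f"{block[line]:08b}" for block in blocks)
        rows.append(bits.replace("0", ".").replace("1", "#"))
    return rows


for row in render(GFX):
    print(row)
```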

And that is all. I enjoyed learning about the C64 hardware and I might return to it soon to make something more advanced than just a Hello World hack.

Coming up next is probably another Atari console.