Author Archives: Mikael Degerfält

Hello C64

The Commodore 64 was one of the first computers I played around with. I never had one myself until recent years, but a friend of mine had one and we had good times playing games on his little breadbox. So I have been looking forward to learning more about the hardware of the C64.

Before we jump into the code, let me just say that I started on it a few years ago, but I was stumped by some erratic behaviour of the system. I thought the way the 64 behaved was strange, and assumed it was just a weird machine. But when I picked up the code again last week, I realised there was a silly bug in my code that wrote zeros to more or less random locations in RAM. That is why it behaved strangely, and once the bug was taken care of, I learned to enjoy my coding sessions.

The code

This time, I display Hello World in two different graphics modes, switching between them every few seconds. The two modes I used were both bitmap modes, so no character based modes. However, the layout of the screen RAM is based on blocks of 8 by 8 pixels, or 4 by 8 in the Multicolour bitmap mode. Please note that the following code has had comments removed so I

Let’s start by checking out the assembly code. For this example, I decided to run the program as a tokenised BASIC program. This kind of executable is always loaded at address $801 and contains BASIC tokens that will be executed by the BASIC interpreter when you type run on the C64. Since I write all my code in assembly, the first few bytes are a representation of the BASIC program 10 sys 2064.
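To make those first few bytes a bit more concrete, here is a small Python sketch (not the original assembly source) that builds the tokenised form of 10 sys 2064. The function name basic_sys_stub is made up for this example. It assumes the usual layout of a BASIC line: a pointer to the next line, the line number, the token bytes, a zero terminator, and finally a zero pointer that marks the end of the program. The two byte load address that precedes the program in a PRG file is not included.

```python
SYS_TOKEN = 0x9E  # BASIC V2 token for the SYS keyword


def basic_sys_stub(line_number, target, load_address=0x0801):
    """Build a one-line tokenised BASIC program: <line_number> SYS <target>."""
    # Token, the target address as ASCII digits, and the end-of-line marker.
    body = bytes([SYS_TOKEN]) + str(target).encode("ascii") + b"\x00"
    # The next-line pointer skips itself (2 bytes), the line number (2 bytes) and the body.
    next_line = load_address + 2 + 2 + len(body)
    line = next_line.to_bytes(2, "little") + line_number.to_bytes(2, "little") + body
    # A zero pointer tells the interpreter that there are no more lines.
    return line + b"\x00\x00"


stub = basic_sys_stub(10, 2064)
print(stub.hex(" "))
print(hex(2064))
```

The stub is 12 bytes long, so it ends well before $810 (2064 in decimal), which leaves a few unused bytes between the BASIC line and the start of the machine code.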

Since we told the interpreter to jump to address $810 (2064 in decimal), we tell the assembler to put this next segment at address $810. Then we disable interrupts and start executing our main loop. This part is a bit different from the previous implementations I’ve made in this project. Since I wanted to try out more than one graphics mode, I separated the two by putting them in different subroutines and just jump to them one after the other in a loop.

Standard bitmap mode

The first routine we execute uses the Standard bitmap mode, which can display 320 by 200 pixels, using 2 colours in every 8 by 8 block. In this mode, the pixel data is split into two areas: one where the 8×8 block data is stored, and another where we store which two colours should be displayed in each of the blocks. The Hello World graphics data is pretty boring, so the colour data will be set to the same two colours for all blocks of the screen. The classic white text on black background.

I decided to store the block data at address $2000 and the colours at $400, which you will see later in the code.

Let’s look at some code.

This bit of code is setting up the data for the graphics, but why are we writing 0 to $d011 and $d020?

On the C64, the memory area $d000 to $d3ff can be mapped to the VIC-II chip, which is the chip that handles the graphics on the 64. At $d011 we find Control register 1. Together with Control register 2, which we will see more of later, this register controls the display mode and hardware scrolling. Setting bit 4 to 0 will disable displaying graphics, setting the whole screen to the border colour. Speaking of the border, address $d020 is where the border colour is stored, and setting it to 0 means the border is set to black.

So clearing the two memory addresses means the display will be all black.

When the screen is black, we jump to two different subroutines, one that clears the block data memory at address $2000, and the other will set the colours for all blocks to whatever register a contains. In this case, setting a to $10 means a bit with value 0 in the block data will be black (the low nibble of $10) and a 1 bit will be white (colour 1 in the C64 palette, from the high nibble of $10).
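The nibble layout described above can be sketched in a few lines of Python. This is only an illustration; the helper names split_colour_byte and make_colour_byte are invented here and do not appear in the assembly code.

```python
def split_colour_byte(value):
    """Return (colour used for 1 bits, colour used for 0 bits) in one 8x8 block."""
    return (value >> 4) & 0x0F, value & 0x0F


def make_colour_byte(one_bits, zero_bits):
    """Combine two palette indices (0-15) into a single colour byte."""
    return ((one_bits & 0x0F) << 4) | (zero_bits & 0x0F)


WHITE, BLACK = 1, 0
print(split_colour_byte(0x10))
print(hex(make_colour_byte(WHITE, BLACK)))
```

Swapping the two arguments to make_colour_byte would give black text on a white background instead.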

When the colour data has been set up, we copy a bunch of data (defined later on) into the block data memory at $2000.

Now it’s time to switch to the correct display mode.

The register at address $dd00 sets which part of the RAM is visible to the VIC-II. The VIC-II can only access 16K of RAM at a time, and setting this register to 3 will expose addresses from 0 to $3fff. The next few lines configure the VIC-II. As I stated before, register $d011 is Control register 1, and its companion register, Control register 2, has the address $d016.

Writing $3b to $d011 will set the vertical scroll to 3, screen height to 25 rows (200 pixels), enable the screen and switch to Bitmap mode. Writing $08 to $d016 sets the horizontal scroll to 0 and the screen width to 40 columns (320 pixels).
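To double check values like these, it can help to decode them bit by bit. The following Python sketch does that for the two control registers. The function names and dictionary keys are my own, and only the bits discussed in this post are decoded.

```python
def decode_d011(value):
    """Decode the bits of Control register 1 that are used in this post."""
    return {
        "yscroll": value & 0x07,               # bits 0-2: vertical scroll
        "rows": 25 if value & 0x08 else 24,    # bit 3: screen height
        "display_enabled": bool(value & 0x10), # bit 4: screen on/off
        "bitmap_mode": bool(value & 0x20),     # bit 5: bitmap mode
    }


def decode_d016(value):
    """Decode the bits of Control register 2 that are used in this post."""
    return {
        "xscroll": value & 0x07,                # bits 0-2: horizontal scroll
        "columns": 40 if value & 0x08 else 38,  # bit 3: screen width
        "multicolour": bool(value & 0x10),      # bit 4: multicolour mode
    }


print(decode_d011(0x3B))
print(decode_d016(0x08))
```

Decoding 0 for $d011 shows display_enabled as False, which matches the blanked screen we used while setting up the data.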

Register $d018 tells the VIC-II where the graphics data should be read from. 8 in the low nibble sets the block pixel data to be read from $2000, and 1 in the high nibble, as you might have figured out, sets the colour data to be read from $400.
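As a sanity check of the address arithmetic, here is a hedged Python sketch of how $d018 and the low two bits of $dd00 combine in bitmap mode. The function name vic_addresses is hypothetical, and the sketch ignores everything in those registers that is not relevant to this example.

```python
def vic_addresses(d018, dd00=0x03):
    """Return (colour data address, block data address) as seen by the CPU."""
    # The low two bits of $dd00 select the 16K bank, in reverse order: 3 means bank 0.
    bank_base = (3 - (dd00 & 0x03)) * 0x4000
    # High nibble of $d018: colour (screen) memory in steps of $400.
    colour_data = bank_base + ((d018 >> 4) & 0x0F) * 0x0400
    # Bit 3 of $d018: block pixel data in the lower or upper 8K of the bank.
    block_data = bank_base + ((d018 >> 3) & 0x01) * 0x2000
    return colour_data, block_data


colour_data, block_data = vic_addresses(0x18)
print(hex(colour_data), hex(block_data))
```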

This last bit just calls wait_vbl 128 times before returning from the subroutine.

Multicolour bitmap mode

In the multicolour mode pixels have double width, so each block of data consists of 4 by 8 pixels, but in this mode each block can contain 4 different colours. The first colour is defined in the background colour register $d021, so this is shared between all blocks. The next two colours are defined in the same way as the colours in the Standard bitmap mode, and the final colour is read from the hardwired Colour RAM at $d800 to $dbe7. In this example we only use two colours.

Much of the code here is the same as for the standard bitmap mode, however I changed the graphics a bit, so it now consists of two rows of data, and also we offset the graphics a bit so it’s drawn on the right side of the screen.

First, we set the border and background colours to grey by writing $b to $d020 and $d021. This is followed by setting up the VIC-II registers. The only difference from the standard mode is that we set the bit to enable multicolour mode in $d016. Finally we wait for 128 frames before returning, just like before.

Help routines

Here come the help routines we called from the code above, for clearing the screen, filling colour data and waiting for VBL.

Filling the area we chose for block colour is pretty straightforward. We actually fill 24 bytes more than we need (1024 instead of 1000) just to get simpler code.

The clear_screen subroutine clears the blocks of pixel data. The screen data is almost 8KB, and this method of clearing data is not very fast, but the amount of code is relatively small. We’re using memory at $fb and $fc as a base pointer for the screen. The first four instructions just store $2000 at those locations. Following that we set up register x for counting and y as an index register used as an offset from the base address.

The clearing itself uses the (<address>),y addressing mode. The effective address is calculated by taking the 16 bit address at <address> and adding the y register. First we loop 256 times and clear addresses $2000 to $20ff. Then we increase the byte at address $fc, which is the high byte of the screen memory pointer, decrease register x and start clearing the memory at $2100, unless x reached 0.

Now time for a small anecdote. I started writing this example a few years ago, but I was so confused about the result I got. The C64 seemed to be so random and I could not understand why the code was behaving so erratically. When I picked up the code again a few weeks ago, I realised that I was using the (<address>,x) addressing mode. Using register X instead of Y might seem like a small thing, but the effective address is calculated in a totally different way. Instead of reading the address from $fb and $fc and then adding the y register, it adds the x register to the address $fb and reads the pointer from that location instead. This meant I was clearing more or less random bytes all over the RAM instead of the contiguous bytes in screen RAM. When I fixed the bug I found that the C64 made much more sense.
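The difference between the two addressing modes is easy to see in a tiny simulation. This is a Python sketch of the address calculation only, not of a full 6502. The function names are mine, and the bytes placed at $fd and $fe are arbitrary values chosen to show what happens when the wrong pointer is read.

```python
def indirect_indexed_y(memory, zp, y):
    """(zp),y: read a 16 bit pointer from zp and zp+1, then add Y to it."""
    pointer = memory[zp] | (memory[(zp + 1) & 0xFF] << 8)
    return (pointer + y) & 0xFFFF


def indexed_indirect_x(memory, zp, x):
    """(zp,x): add X to the zero page address first, then read the pointer there."""
    location = (zp + x) & 0xFF
    return memory[location] | (memory[(location + 1) & 0xFF] << 8)


memory = bytearray(0x10000)
memory[0xFB], memory[0xFC] = 0x00, 0x20  # our base pointer, $2000
memory[0xFD], memory[0xFE] = 0x34, 0x12  # unrelated bytes that happen to follow

for index in range(3):
    good = indirect_indexed_y(memory, 0xFB, index)
    bad = indexed_indirect_x(memory, 0xFB, index)
    print(index, hex(good), hex(bad))
```

With the y version the addresses step nicely through $2000, $2001, $2002. With the x version, only index 0 hits $2000. Every other index builds an address out of whatever bytes happen to live in the zero page, which is exactly the kind of scattered writes I was seeing.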

One last subroutine before the graphics data.

There are probably other ways to wait for a vertical blank on the C64, but this method was pretty simple. The highest bit in the register at $d011 contains the high bit of a 9 bit counter that is increased every scan line. Since there are more than 255 scanlines, and the value is reset to 0 between frames, we know that when this bit goes from 1 to 0, a new frame is about to be drawn.

First we load register a with $80, which has only the high bit set. Then we loop until the value in the high bit is 1, followed by a similar loop until the high bit is 0 and that’s it.
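The two wait loops can be modelled in Python to see why they always end at the top of a frame. This sketch assumes a PAL machine with 312 scan lines per frame and simply steps a pretend raster counter. The function name lines_until_frame_start is invented for the example.

```python
def lines_until_frame_start(start_line, total_lines=312):
    """Count how many scan lines pass before both wait loops have fallen through."""
    line = start_line
    waited = 0
    # First loop: wait until the high bit (bit 8) of the raster counter is 1.
    while ((line >> 8) & 1) == 0:
        line = (line + 1) % total_lines
        waited += 1
    # Second loop: wait until the bit is 0 again, meaning the counter has wrapped.
    while ((line >> 8) & 1) == 1:
        line = (line + 1) % total_lines
        waited += 1
    return waited, line


for start in (0, 100, 300):
    print(start, lines_until_frame_start(start))
```

Whatever line we start on, the function returns with the counter at line 0. Note that starting on line 0 means waiting for a whole frame, which is what makes calling the routine 128 times a usable delay.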

Graphics data

Not sure how interesting this is, but I’m going to include it just for the sake of it.

And that is all. I enjoyed learning about the C64 hardware and I might return to it soon to make something more advanced than just a Hello World hack.

Coming up next is probably another Atari console.


Hello Sega Mega Drive

After a long absence, I’m back with a long overdue post about the Sega Mega Drive. The code itself was completed over 4 years ago, so my memory of it is a bit hazy. I have however been writing new Mega Drive code recently, so this is the perfect time to finally write this post.

The Sega Mega Drive (MD) is based on the Motorola 68000 CPU and its graphics chip (VDP) is a more advanced version of the VDP in the Sega Master System. The main CPU has 64K of memory, and the VDP has an extra 64K of internal RAM for storing graphics, sprite positions, color palettes and more.

The Code

The code in this section is not the complete code, and it is moved around a bit compared to the original to make it more coherent when explaining. With that said, let’s dive into it.

This first section starts with defining some constants required later in the code. The VDP is the Video Display Processor, and it is accessed through two different addresses. Address $c00004 is used for writing control commands to the VDP. When you want to write data to the VDP memory, you first write a command to the control port to select what address you want to write to. The actual data is then written to the data port at $c00000. More on this later.

The entry point of a normal MD cartridge is located at address $200 (512 in decimal terms). The first 512 bytes of the cartridge contain vectors the CPU uses for knowing what code to execute at specific times. For example, at address 4 in the ROM the initial address to execute is stored. In this example that would be $200, since that is where we tell the assembler to locate our code with the org $200 command.

So what does the code at $200 do? When the original Mega Drive was released, some game companies released unlicensed games, which made Sega a bit mad. Therefore they made a slight modification in the following hardware revision, requiring the game to write the string ‘SEGA’ to a special address when the console is booted, or the VDP will not display anything on screen. This rendered the old unlicensed games unplayable. All games released after that required the developers to add this little snippet of code (or something similar) at the start of the game.

What we do is get the console version number from address $a10001. If the version is more than 0, we write ‘SEGA’ to the undocumented hardware register $a14000. This will unlock the VDP to work as expected. Finally we jump to the setup_vdp subroutine.

When the VDP is initialised we return here to upload the graphics and write the tile map to the VDP. The first instruction sets CPU register a0 to $C00000, the address for the VDP data port, which means writing to 4(a0) will write to $C00004, the VDP control port. Apart from choosing what address to write to in the VDP, telling the VDP to change the write address after each write is the most common thing you need to do. When writing contiguous data to the VDP RAM, you need to write a 2 to VDP register 15. Why a 2 you might ask? It’s because we write two bytes at a time to the VDP, so the address should be updated by 2. Writing to a register in the VDP is done by setting the high byte of a word to the register number, the low byte to the value you want to set the register to, and then setting the highest bit of the word before you write it to the VDP control port. Easy! That’s where the #$8f02 is coming from.
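The recipe for building a register write can be expressed as a one-liner. Here is a Python sketch of it; the function name vdp_register_word is made up for illustration and is not something found in the assembly source.

```python
def vdp_register_word(register, value):
    """Build the 16 bit word that sets a VDP register when written to the control port."""
    # Highest bit set, register number in the high byte, value in the low byte.
    return 0x8000 | ((register & 0x1F) << 8) | (value & 0xFF)


AUTO_INCREMENT = 15  # register 15 holds the address increment
print(hex(vdp_register_word(AUTO_INCREMENT, 2)))
```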

The rest of the code that follows is a loop to clear the VDP RAM. Since the VDP RAM is $10000 bytes long and we write 4 bytes per move, we loop $4000 times. Writing $40000000 to the VDP control port tells the VDP to write to address 0 of its main RAM.

Let’s write some color data to the internal color RAM of the VDP. I know I said the VDP has 64K internal RAM, but that is not entirely true. It also has some special memory dedicated to color and scrolling. To tell the VDP we want to write to address 0 of the color RAM, write $C0000000 to the control port. Now we can write the actual color data.
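The two magic numbers $40000000 and $C0000000 are special cases of a more general command format, where the target address is split up and spread over the 32 bit value. The following Python sketch shows my understanding of that layout. The names VRAM_WRITE, CRAM_WRITE and vdp_write_command are chosen for this example only.

```python
VRAM_WRITE = 0x40000000  # write to the main VDP RAM
CRAM_WRITE = 0xC0000000  # write to the color RAM


def vdp_write_command(base, address):
    """Combine a command base with a 16 bit target address."""
    # The low 14 address bits go into the upper word, the top 2 bits into the lowest bits.
    return base | ((address & 0x3FFF) << 16) | ((address >> 14) & 0x3)


print(hex(vdp_write_command(VRAM_WRITE, 0x0000)))
print(hex(vdp_write_command(CRAM_WRITE, 0x0000)))
print(hex(vdp_write_command(VRAM_WRITE, 0x2000)))
print(hex(0x10000 // 4))  # number of 4 byte writes needed to clear the VDP RAM
```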

The MD has 4 different palettes of 16 colors each. Every tile can use one of the palettes to select what colors to use. For every color there are two bytes to store the RGB values. Three bits are used for each of the RGB elements. The binary representation of the bytes (big endian) would be %0000bbb0ggg0rrr0.

Writing #$00000eee to color RAM address 0 means the first color entry is black, and the second color entry is white.
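To show how the %0000bbb0ggg0rrr0 layout produces those values, here is a short Python sketch. The function name md_color is my own invention.

```python
def md_color(r, g, b):
    """Pack 3 bit red, green and blue values (0-7) into %0000bbb0ggg0rrr0."""
    return ((b & 7) << 9) | ((g & 7) << 5) | ((r & 7) << 1)


black = md_color(0, 0, 0)
white = md_color(7, 7, 7)
# Two 16 bit color entries combined into one 32 bit write, first entry in the high word.
print(hex((black << 16) | white))
```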

Now that the VDP RAM is empty and the palette is set up, we can copy all the graphics data to the VDP. tile_set is the location in the ROM where the tile graphics are located, and we want to write them to address 0 of the VDP. The copy loop just writes data to the VDP RAM until it reaches the end of the tile data.

When we initialised the VDP (read more about it in the next section) we told the VDP to read the tile map for Scroll A (the MD has two separate planes, called Scroll A and Scroll B, to allow for parallax scrolling) from address $2000 in the VDP RAM. More info on what value to write to the control port to select address can be found here.

The tile map consists of a number of words describing what tile data to use, what palette to use, and other display properties. The lowest 11 bits contain the tile number, which is the only data we are interested in in this code. Our ‘Hello world!’ graphics consist of 6 tiles, which should be displayed in order to make any sense, so we write 1 through 6, combined into 32 bit writes, to the address where the Scroll A plane is located in the VDP RAM.
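Here is a hedged Python sketch of how those six tile map words could be built and paired up into 32 bit values. The helper names tile_entry and pack_longs are hypothetical, and the palette field is included only to show where it sits. The example itself leaves it at 0.

```python
def tile_entry(tile, palette=0):
    """Build one tile map word: tile number in bits 0-10, palette in bits 13-14."""
    return ((palette & 0x3) << 13) | (tile & 0x7FF)


def pack_longs(words):
    """Combine pairs of 16 bit words into 32 bit values, first word in the high half."""
    if len(words) % 2:
        raise ValueError("need an even number of words")
    return [(words[i] << 16) | words[i + 1] for i in range(0, len(words), 2)]


entries = [tile_entry(n) for n in range(1, 7)]
print([hex(value) for value in pack_longs(entries)])
```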

When the tile map is written, we are all done, so we can stop doing anything useful by halting the CPU until there is an interrupt, and then just halt it again until the end of time. Or until the power is cut.

VDP init and data

The VDP has a set of registers that control how it displays the graphics. This loop reads the configuration from the data defined at vdp_regs and writes it to the VDP control port. Simple enough, but the important thing is the values we write to the registers.

The VDP register data contains all the necessary configuration of the VDP. To have anything at all displayed, register 1 (mode register 2) is the most important, since it contains a bit to enable the display altogether. Registers 2, 4, and 6 select at what address in VDP RAM it should fetch graphics data for rendering, so those are also a bit more important than some others.

For a full description of each register, you should read more at the wiki at

Finally comes the tile data. The tiles consist of 8 by 8 pixels of 16 color indices. That means each byte holds color data for 2 pixels, and in total 32 bytes are required for every tile.

The data below might look a bit strange since it consists of only ones and zeros. It almost seems like binary data, but it is hexadecimal. As we only use palette entry 0 and 1, it wouldn’t make sense to have any other values in the data.
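If you would rather generate tile data than type it by hand, the packing is simple enough to sketch in Python. The function name pack_tile and the sample pixel pattern are made up for this example. The pattern is not taken from the actual Hello World graphics.

```python
def pack_tile(rows):
    """Pack 8 rows of 8 palette indices (0-15) into 32 bytes, two pixels per byte."""
    packed = bytearray()
    for row in rows:
        # The left pixel of each pair goes into the high nibble.
        for left, right in zip(row[0::2], row[1::2]):
            packed.append(((left & 0xF) << 4) | (right & 0xF))
    return bytes(packed)


# An arbitrary pattern using only palette entries 0 and 1.
sample_row = [0, 1, 1, 0, 0, 1, 0, 1]
tile = pack_tile([sample_row] * 8)
print(len(tile))
print(tile[:4].hex())
```

Since every pixel is either 0 or 1, each hexadecimal digit in the output is also 0 or 1, which is why the data looks like binary at first glance.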

That is it. We are done. Finally. Hopefully it will be less than 3 years until the next entry.


Atari hardware project idea

A few days ago, I started thinking about how difficult it’s becoming to find a good monitor for my Atari, especially one that I can easily bring with me to demo parties and such.

Then it occurred to me that the CosmosEx that is mounted inside my ST has a Raspberry Pi inside it, with both an HDMI and a composite video out. What it also has is a high speed connector for camera hardware, capable of streaming HD video in realtime. What I realised is that, with some custom hardware with an FPGA, one could get the video signal directly from the shifter and feed it to the Pi in realtime. All it would take is a pretty simple program on the Pi to display the ST’s video output on any screen.

After studying the schematics for the STe and ST, the respective shifter seems to output 4 bit (for STe, 3 bit on ST) digital RGB values that are then turned into an analog signal. Hooking into these pins should be fairly simple, and together with the pixel clock and the sync signals, an FPGA should be able to convert the color data and send it as an image through the CSI-2 port to the Pi.

My problem is that I know very little of FPGA and hardware development. I would like to be able to build this by myself, but if some pro could help out, that would be awesome.

Hello Amiga OCS


Weeks later than I had originally planned, the Amiga OCS is now greeted with a small Hello World sample. Since this is my first time coding the Amiga, most of the time was spent on reading various hardware documentation on what registers to set and why. As I have somewhat more experience with the Atari ST, I will comment on the difference between the two platforms as well as trying to explain what the Amiga code does.

After some trial and error coding, I decided to write a more system friendly version for the Amiga, to keep the code-build-test cycle as quick as possible. It took way too much time to restart the emulator after every test.

The Amiga and Atari are both based on the Motorola 68000 CPU family, so I am well-versed in the assembly language. The OS and other hardware, on the other hand, were not as easy to get to grips with.

The first thing to do in our little program is to save the current state of all the registers our program will change, so we can restore them to the same state when the program exits. For the address of the default copper lists (more info on the copper later) I load the graphics.library via the OldOpenLibrary system call, and get the addresses for the two lists.

The address to the exec library, which is the core OS module of the Amiga that handles loading of libraries and other basic needs, can be found at position $4 in the memory.

Load “graphics.library” by placing the address to library name string in CPU register a1, and jump to the “OldOpenLibrary” function at offset -408 from the exec library.

The return value will be in d0.

Now the handle to the graphics.library is in register a1, where we can use it as a pointer. Now it’s time to get the address of where we want to save the register data and save the copper list addresses in that memory area. When we are done with that, we can close the graphics library.

Since the a2 pointer de-reference used the post-increment syntax, the content of a2 is counted up by the size of the data that was written. We continue to use this syntax as we save some other registers that we need to change later on.

Now that we have saved the registers, we can move on to the core business: setting up the screen and drawing some data to it.

First off, let’s only enable DMA for bitplanes and copper. When writing to the DMA and IRQ registers with the highest bit (bit 15) cleared, the functionality corresponding to the set bits will be disabled. To enable functionality, you must write the mask for the functionality you want to set plus the highest bit set. So writing $7fff will disable all, and $ffff will enable all.
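The set/clear behaviour can be modelled in a few lines of Python. This is a sketch of the logic only, with a made up function name (apply_control_write). The bit masks for the master enable, bitplane and copper DMA are my reading of the hardware documentation and are not taken from the code in this post.

```python
SET_CLR = 0x8000     # bit 15: set the given bits when 1, clear them when 0
DMA_MASTER = 0x0200  # assumed bit 9: master DMA enable
DMA_BITPLANE = 0x0100  # assumed bit 8: bitplane DMA
DMA_COPPER = 0x0080  # assumed bit 7: copper DMA


def apply_control_write(current, written):
    """Return the new register state after a write with set/clear semantics."""
    mask = written & 0x7FFF
    if written & SET_CLR:
        return current | mask
    return current & ~mask & 0x7FFF


state = apply_control_write(0x7FFF, 0x7FFF)  # disable everything
state = apply_control_write(state, SET_CLR | DMA_MASTER | DMA_BITPLANE | DMA_COPPER)
print(hex(state))
```

The nice thing about this design is that you never need to read the register, modify it and write it back. Bits that are 0 in the written mask are simply left alone.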

Reusing the graphics from the ST version, let’s copy our graphics to the screen. The Amiga screen is a bit different from that of the ST. The bitplanes of the Atari are interleaved, whereas on the Amiga, each bitplane is stored in one continuous block of memory.

Set up the copper data. Most of the data in the copper list has been prepared beforehand, but the address of the screen memory must be updated once the code is executing.

Write the address of our copper list data to the co-processor hardware register. Writing to the strobe register reloads the register and starts executing the list.

Loop until the left mouse button is pressed.

Restore Registers. Like I mentioned before, the highest bit must be set to enable functionality.

Code is done! Now we have to add the graphics and data used by the program. The section statement below tells the assembler to put the data in the “chip memory”, which on the Amiga is memory that can be accessed by the DMA sub-system.

Define the lib_name variable as a string containing the name of the graphics library that we load at the beginning of the code. The even statement informs the assembler that whatever follows should be placed on an even address, since reading 16- or 32-bit values from odd addresses results in an error.

Reuse graphics from the Atari ST version. Since the Amiga and Atari ST both use bitplane graphics, the data can be reused without changes.

Finally we reach the copper list. The copper, which is kind of a good nickname for co-processor, is a processing unit that can execute a few simple types of instruction. The most common is probably writing data to a hardware register in the memory space of the hardware registers.

Just for fun I added a so-called copper bar, or raster bar. To achieve a similar effect on the Atari, a lot more work is involved. Some day I might show you how it’s done on the ST.

The last entry in the copper list tells the copper to wait until it reaches a line that is below the end of the screen, and will therefore not do anything more.
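To give an idea of what a copper list looks like as data, here is a Python sketch that builds a tiny raster bar. The helper names, the starting line and the colours are all invented for this example, and the register offset for background colour 0 ($180) and the wait word layout are assumptions based on my reading of the hardware documentation. It is not the copper list used in the program above.

```python
COLOR00 = 0x180  # assumed register offset of background colour 0


def copper_move(register, value):
    """MOVE: first word is the register offset, second word is the data."""
    return [register & 0x01FE, value & 0xFFFF]


def copper_wait(vpos, hpos=0):
    """WAIT: vertical position in the high byte, bit 0 set, followed by a mask word."""
    return [((vpos & 0xFF) << 8) | (hpos & 0xFE) | 0x0001, 0xFFFE]


COPPER_END = [0xFFFF, 0xFFFE]  # a wait for a position the beam never reaches


def raster_bar(first_line, colours):
    """Change the background colour on consecutive lines, then restore black."""
    words = []
    for offset, colour in enumerate(colours):
        words += copper_wait(first_line + offset) + copper_move(COLOR00, colour)
    words += copper_wait(first_line + len(colours)) + copper_move(COLOR00, 0x0000)
    return words + COPPER_END


bar = raster_bar(0x64, [0x0400, 0x0800, 0x0F00, 0x0800, 0x0400])
print(" ".join(f"{word:04x}" for word in bar))
```

Each entry in the list is a pair of words, so the whole effect is just data that the copper walks through once per frame while the CPU is free to do other things.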

This is the BSS section, where we reserve some memory for the registers we save. We also reserve enough memory for one bitplane at a resolution of 320*256.

That’s it. Assemble and run.

Disclaimer: I don’t know enough of the Amiga to say that this code does everything it’s supposed to do. It might have some unknown side effects that I am not aware of. There might also be factual errors in the blog post. Feel free to point them out if you find any.


Here are some useful links if you are interested in learning more and trying some Amiga programming yourself.