Monthly Archives: July 2013

Hello Gameboy Advance

Last weekend, while digging around on one of my USB sticks, I found some old test code I wrote for the GBA over 10 years ago. I thought the files were lost forever, so I was quite happy to find them. The code itself was written for an obscure assembler for Windows, so my first goal was to port it to a more modern tool that preferably could cross-assemble on both my PC and my Mac. I decided to give vasm a try. After some initial problems with strange syntax and me not understanding what the error messages meant, I got it to assemble into a binary file. To my surprise, the resulting binary could be loaded into the emulator and it worked the way it should! From there I cleaned up the code, removed stuff I didn’t need for the Hello World project and added the result to the repository.

The code

This is by far the shortest example to date. The video mode I chose makes it very easy to draw graphics on screen, and the setup required is minimal. A short note before we continue: the data type names do not mean the same thing as on the 68k processor. On ARM, a word is 32 bits, while on the 68k it’s 16 bits. This can occasionally cause some confusion; at least it has for me when I wasn’t paying enough attention.

At the beginning we tell the assembler to use the ARM instruction set (which has 32-bit instructions, as opposed to the Thumb instruction set, which has 16-bit instructions). Then we tell it that the following code will be executed at address 0x08000000, which is where the GBA ROM entry point is.

The first instruction we run makes a jump (branch) over the ROM header. After the branch, we define the GBA header. I have no idea why the values in the header are what they are or if they are correct, but this is what my old code did, and the emulators I’ve tried haven’t complained yet.

Let’s begin by setting the display mode. By setting Mode 4, we get a 256-colour chunky graphics mode, i.e. each byte in the graphics memory is the palette index of the colour at that position. In Mode 4, only background 2 is used, so we need to enable that background layer.
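To make the bit twiddling a little more concrete, here is a small sketch in Python (not the ARM assembly used in the project; all names are made up for illustration). It assumes the commonly documented layout of the display control register at 0x04000000: the mode number in the lowest three bits, and the enable flag for the bitmap background (background 2 in the usual register documentation) in bit 10.

```python
# Sketch of the GBA display control value for Mode 4 with the bitmap background on.
# The constant names are illustrative, not taken from the original source.
REG_DISPCNT = 0x04000000   # address of the display control register
MODE_4 = 4                 # bits 0-2: video mode
BG2_ENABLE = 1 << 10       # bit 10: show background 2

def dispcnt_value(mode, *flags):
    """Combine a video mode number with any number of enable flags."""
    value = mode & 0x7
    for flag in flags:
        value |= flag
    return value

print(hex(dispcnt_value(MODE_4, BG2_ENABLE)))  # 0x404
```

On the real machine this value would be stored as a 16-bit write to the register address.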

Time to set up the palette. Since the graphics only consist of two colours, we read both colours from the palette data in one go and write them to the palette memory. Each entry in the palette consists of two bytes of packed RGB data, 5 bits per channel. The 5 least significant bits are the red channel, the next 5 bits are green, and the 5 bits after that are the blue channel. The most significant bit is ignored.
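The packing described above can be sketched in a few lines of Python. This is only an illustration of the bit layout, with a made-up function name, and not code from the project.

```python
def pack_gba_colour(red, green, blue):
    """Pack three 5-bit channels (0-31) into one 16-bit palette entry."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 31:
            raise ValueError("each channel must be in the range 0-31")
    # Red in bits 0-4, green in bits 5-9, blue in bits 10-14, bit 15 unused.
    return red | (green << 5) | (blue << 10)

black = pack_gba_colour(0, 0, 0)
white = pack_gba_colour(31, 31, 31)

# Two 16-bit entries fit in one 32-bit word, first entry in the low half
# (little-endian), which is why both colours can be copied in one go.
both = black | (white << 16)
print(hex(black), hex(white), hex(both))  # 0x0 0x7fff 0x7fff0000
```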

Time to update the screen with some pixels. Here we set up two nested loops, one for rows and one for columns. For each pixel on each row, a colour entry is read from the pixel data and written to the screen. To optimize a bit, we handle four pixels at a time. This also sidesteps a hardware quirk: it is not possible to write just one byte to VRAM; doing so results in the same value also being written to the other byte of the same 16-bit location.
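Both points can be modelled in Python. The sketch below assumes one byte per pixel in the source data, and the helper names are hypothetical; it shows how four palette indices end up in one 32-bit word, and what a single-byte write to VRAM would do instead.

```python
import struct

def pack_pixels(indices):
    """Pack four 8-bit palette indices into one little-endian 32-bit word."""
    if len(indices) != 4:
        raise ValueError("expected exactly four pixels")
    return struct.unpack("<I", bytes(indices))[0]

def vram_byte_write(vram, address, value):
    """Model an 8-bit write to VRAM: the byte lands in both halves of the halfword."""
    aligned = address & ~1
    vram[aligned] = value
    vram[aligned + 1] = value

print(hex(pack_pixels([0, 1, 1, 0])))  # 0x10100

vram = bytearray(4)
vram_byte_write(vram, 1, 1)
print(list(vram))  # [1, 1, 0, 0]
```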

There is one thing in the loop above that might be hard to understand without further explanation: the [r4,#4]! syntax. What it does is access the data at address r4 + 4 bytes; the exclamation mark at the end indicates that the r4 register is updated with the same offset that was used in the access. So once the instruction has completed, r4 is 4 higher than before. Since r3 and r4 are both advanced this way on each read/write, the source and destination addresses are updated for every iteration of the loop.
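If the addressing mode is still fuzzy, this little Python model of a load with that syntax may help. It is a sketch only: the register file is a plain dictionary and the function name is made up.

```python
def ldr_pre_indexed_writeback(memory, registers, base, offset):
    """Model `ldr rd, [base, #offset]!`: load from base + offset and keep that address in base."""
    address = registers[base] + offset
    value = int.from_bytes(memory[address:address + 4], "little")
    registers[base] = address  # write-back: base now holds the address that was used
    return value

memory = bytes(range(16))  # bytes 0x00, 0x01, ... 0x0f
registers = {"r4": 0}

first = ldr_pre_indexed_writeback(memory, registers, "r4", 4)
second = ldr_pre_indexed_writeback(memory, registers, "r4", 4)
print(hex(first), hex(second), registers["r4"])  # 0x7060504 0xb0a0908 8
```

Note that with this form the very first access lands 4 bytes past the initial value of the register, so the pointer has to start one word before the data.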

We have reached the end of the executable code and here we just enter an infinite loop by jumping to the same location forever.

Now all the code is done and we reach the data. First we have two entries of colour data: black and white. These are the values that are written to the palette memory.

And finally the graphics data. Same graphics as the other examples, except here we have indexed colour mode, so 0 means colour 0 in palette, and 1 means colour 1 in palette. Simple as that.

That is it. I’m quite fond of the ARM assembly language; it’s quite readable compared to, oh, say, PowerPC assembler code.


There are many sites on the internet dedicated to programming the Gameboy and Gameboy Advance. Here are a few links that might be useful. Since the original version of my code was written over 10 years ago, I can’t give links to the resources I used initially, but I hope the links below fill your needs.

Hello Sega Master System

Long overdue, it’s time for another Hello World hack, and this time it’s for the 8-bit console Sega Master System (SMS). Based on the Z80, it will be the first system in this series that is not based on the Motorola M68000 CPU. I learned to code the SMS in 2005, and have released two tiny demo hacks under the alias blind io (you can find them here if you really want to see them).

The major difference between the SMS and the Atari ST and Amiga is how the graphics hardware works. Most 8- and 16-bit consoles are based on tiled graphics and sprites. What this means is that the screen is split into blocks of 8 by 8 pixels, and which tile is displayed in each of those blocks is read from a tile map. Sprites are also blocks of pixels, but can be moved freely around the screen. The number of sprites is limited, and only a few of them can be displayed on the same scanline due to restrictions in the hardware.

The tool used to assemble the code to a ROM image is called WLA DX. It is a great tool and can assemble code and produce ROM images for several different platforms. In this post, I’m going to leave out most of the directives that tell the assembler about the output format and such, and focus on the Z80 code. For the full source, you can go to the project on bitbucket.

The code

To start off, we tell the assembler to place the following code at address 0. This is where the Master System starts executing once the logo has been displayed.

In interrupt mode 1, the CPU jumps to address $38 when an interrupt occurs. Only the VDP can trigger normal interrupts on the SMS, either every VBL and/or every scanline. Address $66 is where the CPU jumps when a non-maskable interrupt occurs. This happens to be connected to the Pause button on the Master System. For us, let’s just ignore the interrupts and return immediately.

Now we can do the thing we came here to do: show some graphics. First we must set up the VDP, which is the video controller. A bit further down in the listing, there is a section of register values for the video chip, and here is the code that writes these values to the VDP control port.

otir is an interesting command. It actually does several things. First off, it takes the byte pointed to by hl, outputs it to the port contained in register c, and increases hl by one. Then it decreases the value in register b, and if the result is not 0, the program counter is changed to run the same command again.
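For those who prefer reading a loop, here is what otir boils down to, modelled in Python in the order described above. The register dictionary, the callback used for port output and the sample bytes are all made up for the illustration.

```python
def otir(memory, regs, out):
    """Model the Z80 `otir` instruction; `out(port, value)` stands in for the I/O bus."""
    while True:
        out(regs["c"], memory[regs["hl"]])       # output the byte at (hl) to port c
        regs["hl"] = (regs["hl"] + 1) & 0xFFFF   # advance the source pointer
        regs["b"] = (regs["b"] - 1) & 0xFF       # count down
        if regs["b"] == 0:                       # repeat until b reaches zero
            break

written = []
memory = bytes([0x11, 0x22, 0x33])               # arbitrary sample data
regs = {"hl": 0, "b": len(memory), "c": 0xBF}
otir(memory, regs, lambda port, value: written.append((port, value)))
print(written)  # [(191, 17), (191, 34), (191, 51)]
```

One consequence of counting with an 8-bit register is that a single otir can move at most 256 bytes, which happens when b starts out as 0.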

Once the VDP configuration is done, we upload the graphics data tiles to the tile RAM inside the VDP. The address in the VDP where we want to place the tiles is $4020, which we output to the VDP control port ($bf). The highest two bits are in reality not part of the address, but tell the VDP that we want to write to the video RAM (VRAM). By using offset $20 in VRAM, we leave the first tile empty so all unused blocks on the screen are left black. Note: Running this on real hardware will probably leave some junk on screen since the content of the VRAM might not be initialized to 0.
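Here is how that $4020 value is put together, as a Python sketch with made-up names. It assumes the usual convention that the 16-bit value is sent to the control port as two bytes, low byte first, and that a tile takes 32 bytes.

```python
VRAM_WRITE = 0b01  # command code that goes into the top two bits
TILE_SIZE = 32     # bytes per tile

def vdp_command(code, address):
    """Return the two bytes to send to the control port, low byte first."""
    word = ((code & 0x3) << 14) | (address & 0x3FFF)
    return word & 0xFF, word >> 8

# Start writing at tile 1, leaving tile 0 untouched.
print([hex(b) for b in vdp_command(VRAM_WRITE, 1 * TILE_SIZE)])  # ['0x20', '0x40']
```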

Then we set up the registers for another otir instruction. The port used to write data to the VDP is $be, and since register c already contains $bf, it’s quicker to decrease it by one than to load the new value into the register.

Now we have uploaded the tile data, but we must still set the colour palette. Same procedure as when we uploaded the graphics data, but the colour palette address is $c000 (actually the two highest bits indicate that we should write to colour memory (CRAM), and the following zeros indicate the offset in CRAM), and the number of bytes in the palette is 16.

One thing remains: we must tell the VDP what tiles to display at what position. Since we uploaded the tile data to tile numbers 1 to 6, we should write the values 1 to 6 in the first 6 entries in the tile map. Tile map entries are 16 bits, so we need to write a 0 every second byte. In the code below, we write 0 to register e and output that after we have output the value in register a, which we use to count from 1 to 6. The last thing we do is just loop forever.
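The byte stream that ends up in the tile map can be sketched like this in Python (the function name is made up, and this only models the data, not the port writes):

```python
def tile_map_entries(first_tile, count):
    """Build little-endian 16-bit tile map entries for consecutive tile numbers."""
    data = bytearray()
    for tile in range(first_tile, first_tile + count):
        data.append(tile & 0xFF)  # low byte: the tile number (register a in the text)
        data.append(0)            # high byte: zero (register e in the text)
    return bytes(data)

print(list(tile_map_entries(1, 6)))
# [1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0]
```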

Here comes the VDP setup data. If you want to know what all these bits and values mean, I suggest you go to the development section of the SMS Power homepage in the link at the bottom of this post.

The palette is quite straightforward: RGB data with 2 bits per channel. This gives us 64 possible colours to choose from, and the palette has 16 entries. There are actually two different palettes you can use on the Master System, one for background tiles and one for the sprites. Since no sprites are used in this short sample code, we only set the tile palette.
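As a quick illustration of the palette format, here is a Python sketch with a made-up helper name. It assumes the usual Master System byte layout with red in the lowest two bits, then green, then blue, and the top two bits unused.

```python
def pack_sms_colour(red, green, blue):
    """Pack three 2-bit channels (0-3) into one palette byte, laid out as --BBGGRR."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 3:
            raise ValueError("each channel must be in the range 0-3")
    return red | (green << 2) | (blue << 4)

print(hex(pack_sms_colour(0, 0, 0)), hex(pack_sms_colour(3, 3, 3)))  # 0x0 0x3f
print(4 ** 3)  # 64 possible colours
```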

Here comes the tile data. It’s the same graphics as in the Atari and Amiga versions, but it has been converted to tiles. The tiles are 8*8 pixels with four bitplanes, where the first four bytes of data are the four bitplanes for the first row etc. This means that every tile is 32 bytes of data.
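Converting pixels to this format by hand gets old quickly, so here is a rough Python sketch of the conversion. The function names are made up, and it assumes that bitplane 0 comes first and that the leftmost pixel is stored in the highest bit of each byte.

```python
def encode_tile_row(pixels):
    """Turn eight 4-bit colour indices into four bitplane bytes (plane 0 first)."""
    if len(pixels) != 8:
        raise ValueError("a tile row is exactly eight pixels")
    planes = []
    for plane in range(4):
        byte = 0
        for pixel in pixels:
            # Shift in one bit per pixel; the leftmost pixel ends up in bit 7.
            byte = (byte << 1) | ((pixel >> plane) & 1)
        planes.append(byte)
    return bytes(planes)

def encode_tile(rows):
    """Encode eight rows of eight pixels into the 32 bytes of one tile."""
    if len(rows) != 8:
        raise ValueError("a tile is exactly eight rows")
    return b"".join(encode_tile_row(row) for row in rows)

row = [0, 1, 1, 0, 0, 1, 1, 0]
print(list(encode_tile_row(row)))   # [102, 0, 0, 0]
print(len(encode_tile([row] * 8)))  # 32
```

With only colours 0 and 1 in use, three of the four bitplane bytes in every row are always zero.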

That is it, really. For more info see the links below. I apologize for any errors and weird stuff in this blog post; it’s 5 in the morning and I have written most of this post during the night.

The next installment in this blog series might come sooner than you think. 🙂


  • Z80 CPU User Manual – All you need to know about the Z80
  • SMS POWER! – A goldmine for Sega Master System lovers.
  • MEKA – Emulator with debugging possibilities.
  • WLA DX – Multi platform cross assembler