SEGA Genesis: Framebuffer Rendering
As discussed in earlier articles, the SEGA Genesis renders graphics using a combination of tiles and sprites, and does not directly support framebuffer rendering. However, the Genesis Video Display Processor (VDP) can be tricked into doing this by setting up a fixed playfield character mapping and rendering directly to character data. In this article, different types of graphics will be discussed, and sample framebuffer implementations on the Genesis will be shown.
Terms
The following terms are used throughout this article, so it is useful to define them here:
- Bitmap: Spatially mapped pixel array, a sequence of pixels ordered in memory in a similar fashion to how they should be displayed.
- Framebuffer: Memory containing a bitmap that drives a video display.
Types of Rendering
To appreciate the difference between framebuffer and tile-based rendering, it is useful to look at a few foundational types of computer graphics rendering used by video games throughout the years.
Vector Graphics
The very first games, such as Steve Russell's Spacewar! (1962), controlled the electron beam of their cathode ray tube (CRT) displays directly, drawing lines of varying light intensity between points. This form of graphics is known as vector graphics.
Atari created particularly famous examples of vector graphics rendering, including 'Lunar Lander' and 'Asteroids' from 1979, the first-person shooter 'Battlezone' from 1980, and 'Star Wars' from 1983. In 1982, the Vectrex, a home console with a built-in vector display designed by Los Angeles-based Smith Engineering, was released.
Vector displays have become rare, but it is highly recommended to try out original arcade games like Asteroids to see this type of display. The contrast and smoothness can look very different from a normal CRT display.
Raster Bar Graphics
On home TV sets, the electron beam automatically scans the screen left-to-right, top-to-bottom, 50 or 60 times per second in a horizontal line pattern. This fixed pattern is called 'raster scanning'. Atari's 'Pong' (1972) was designed to be compatible with this type of TV display: instead of controlling the electron beam's position freely, the game controlled the beam's output intensity as a function of the beam's x- and y-position, creating colored areas on the screen. This approach normally results in blocky graphics with right angles. We might call this 'Raster Bar Graphics'.
The approach of changing the beam color on the fly was used extensively on the Atari 2600 home console, where incredibly simple graphics hardware forced developers to make very creative choices. Read Nick Montfort and Ian Bogost's 'Racing the Beam' for great stories about the challenges of Atari 2600 development.
Tiles and Sprites
Graphics in many video games from the 1980s and early 1990s were implemented using a combination of tiles, tilemaps, and sprites. Tiles and sprites would be defined using bitmaps, spatially mapped arrays of pixel color data, and tilemaps would map tiles to display positions in a fixed grid on the screen. Video controllers would take this representation of tiles and sprites and display it directly on the screen. We will refer to this type of graphics as tile-based graphics.
The benefit of tile-based graphics is lower memory usage, something absolutely critical during this era, as memory chips were prohibitively expensive. With tiles and tilemaps, full screens of graphics can be represented using a small set of tiles, repeated in the tilemap to form larger graphics. Essentially, it is a form of manual image compression.
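To illustrate how a small tile set plus a tilemap stands in for a full bitmap, here is a minimal Python sketch. The 2x2-pixel tiles, the tile set, and the `render` helper are all hypothetical and chosen only for illustration; real hardware performs this expansion on the fly while scanning out the picture.

```python
# Two hypothetical 2x2-pixel tiles; each value is a palette index.
TILES = {
    0: [[0, 0],
        [0, 0]],   # empty tile
    1: [[1, 1],
        [1, 1]],   # solid tile
}

# The tilemap stores one tile index per grid cell.
TILEMAP = [
    [1, 0, 1],
    [0, 1, 0],
]

def render(tilemap, tiles, tile_size=2):
    """Expand a tilemap into a full bitmap (a list of pixel rows)."""
    rows = []
    for map_row in tilemap:
        for line in range(tile_size):
            row = []
            for index in map_row:
                row.extend(tiles[index][line])
            rows.append(row)
    return rows

bitmap = render(TILEMAP, TILES)
```

Here six map entries and two tiles describe a 6x4 pixel image; the saving grows with the amount of repetition in the picture.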
A classic example of tile-based graphics is Namco's Pac-Man (1980), where the maze and dots are displayed using tiles. This rendering type would be ubiquitous in arcade games in the 1980s and 1990s, and several home consoles would be designed this way, including Nintendo's NES (1983) and of course, the SEGA Genesis (1988).
Framebuffer Rendering
Computers from this era would also have tiles and sprites, but would often require more flexibility to render application GUIs, graphs, word processing, and the like.
This was accomplished by rendering graphics in software to a bitmap in memory, called a framebuffer, which the video hardware then displayed. This approach required more memory, but offered the greatest rendering flexibility. First off, the previously mentioned rendering techniques could be supported: vector graphics could be emulated in software and rendered to a framebuffer, tiles and sprites could be supported in hardware and rendered on top of a framebuffer, and raster bar graphics could be emulated by modifying the color palette during scanline rendering.
Besides emulating existing techniques, framebuffer rendering enabled new techniques such as filled polygon rendering or raycasting 3D graphics.
SEGA Genesis Rendering
The SEGA Genesis does not have a hardware framebuffer, and rendering to a bitmap in RAM is not immediately supported. Most games use the VDP's built-in tile-based rendering with an overlay of sprites. However, if we want to do framebuffer rendering, an approach could be setting up a fixed playfield map where every cell on the screen is mapped to a unique character, and rendering individual pixels to characters in VRAM, which is then automatically displayed by the VDP.
This would enable types of graphics that are normally impossible on the Genesis, such as 3D polygons.
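As a rough illustration of the fixed playfield map, the following Python sketch generates map entries so that each visible cell points at its own character. It assumes a playfield that is 32 cells wide, characters stored consecutively starting at `first_char`, and the commonly documented 16-bit map entry with the character index in the low 11 bits; `build_fixed_map` is a hypothetical helper, not part of any SDK.

```python
CELLS_WIDE = 32      # visible cells per row in 32 cell mode
CELLS_HIGH = 28      # visible cell rows
PLANE_WIDTH = 32     # assumed playfield width in cells (configurable on the VDP)

def build_fixed_map(first_char=0):
    """Return 16-bit map entries so every visible cell shows a unique character."""
    entries = []
    for row in range(CELLS_HIGH):
        for col in range(PLANE_WIDTH):
            if col < CELLS_WIDE:
                char_index = first_char + row * CELLS_WIDE + col
            else:
                char_index = first_char   # off-screen cell, never displayed
            # Low 11 bits: character index. Priority, palette, and flip bits stay 0.
            entries.append(char_index & 0x07FF)
    return entries

fixed_map = build_fixed_map()
```

The map is written to VRAM once at startup; after that, only the character data changes from frame to frame. The map itself has to be placed in VRAM outside the area used for the character data.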
This approach should work, but the layout of characters in VRAM is not the most intuitive for raster graphics. We can simplify rendering by introducing a RAM framebuffer with a simple layout. This separates the rendering itself from mapping pixels to the layout of characters in VRAM. We can render pixels to the framebuffer, and then copy the framebuffer to character data in VRAM with the appropriate translation of memory layout.
First, we should check whether this approach is even possible on the Genesis hardware.
Is Framebuffer Rendering Possible on Genesis?
So, how much does a fullscreen framebuffer take up in memory? Can we fit a framebuffer into 64 KB RAM?
The Genesis has two basic display modes, 32 cell and 40 cell mode (256 and 320 pixels wide, respectively). In this example, we'll use 32 cell mode (256 pixels). In this mode, we have:
32 x 28 = 896 chars
How much memory does a single character take up?
- Each char is 8x8 pixels
- Each pixel is 4 bits of color (0-15)
- Each char:
8 * 8 pixels * 4 bits/pixel = 256 bits/char
256 bits/char / 8 bits/B = 32 B/char
In total, the 32 cell framebuffer needs
896 chars * 32 B/char = 28672 B = 28 KB ($7000 B)
So, to answer our question: Yes. Not just one, but two full 28 KB framebuffers fit into 64 KB of RAM, leaving 8 KB for everything else.
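The arithmetic above can be double-checked with a few lines of Python. This is only a sketch; `framebuffer_bytes` is a hypothetical helper, and the 40 cell figure assumes the same 28 rows of cells.

```python
CHAR_SIZE_PX = 8          # characters are 8x8 pixels
BITS_PER_PIXEL = 4        # 16 colors per pixel
RAM_BYTES = 64 * 1024     # Genesis work RAM

def framebuffer_bytes(cells_wide, cells_high=28):
    """Size in bytes of a full-screen framebuffer for the given cell grid."""
    bytes_per_char = CHAR_SIZE_PX * CHAR_SIZE_PX * BITS_PER_PIXEL // 8
    return cells_wide * cells_high * bytes_per_char

size_32 = framebuffer_bytes(32)   # 32 cell mode
size_40 = framebuffer_bytes(40)   # 40 cell mode

fits_two_32 = 2 * size_32 <= RAM_BYTES
fits_two_40 = 2 * size_40 <= RAM_BYTES
```

Under these assumptions, a 40 cell framebuffer still fits into RAM once, but not twice, which is one reason to prefer 32 cell mode when two buffers are wanted.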
The RAM framebuffer could have a multitude of layouts. As long as the procedure for copying to VRAM is simple enough to be reasonably fast, many options exist.
Let's consider two basic layout choices for the framebuffer: linear bitmap and character bitmap. In the linear bitmap, pixels are ordered sequentially in memory. In the character bitmap, memory is organized into 8x8 characters.
Linear Bitmap Framebuffer
The linear bitmap framebuffer has the same layout as the screen. Rendering code will be as simple as it gets.
- Simple rendering: Framebuffer is laid out as a matrix of display pixels, each pixel is 4 bits of color.
- Non-trivial VRAM copy: Copying to VRAM requires translating the memory layout, which makes it slow.
Example linear 256x224 pixel bitmap (x and y values are pixel positions, the numbers in the table denote memory byte offsets in hexadecimal, and every 8-bit byte represents two 4-bit color palette indices):
y \ x |   0   2   4   6 |   8  10  12  14 | ... | 248 250 252 254
------+-----------------+-----------------+-----+----------------
    0 |   0   1   2   3 |   4   5   6   7 | ... |  7C  7D  7E  7F
    1 |  80  81  82  83 |  84  85  86  87 | ... |  FC  FD  FE  FF
    2 | 100 101 102 103 | 104 105 106 107 | ... | 17C 17D 17E 17F
    3 | 180 181 182 183 | 184 185 186 187 | ... | 1FC 1FD 1FE 1FF
    4 | 200 201 202 203 | 204 205 206 207 | ... | 27C 27D 27E 27F
    5 | 280 281 282 283 | 284 285 286 287 | ... | 2FC 2FD 2FE 2FF
    6 | 300 301 302 303 | 304 305 306 307 | ... | 37C 37D 37E 37F
    7 | 380 381 382 383 | 384 385 386 387 | ... | 3FC 3FD 3FE 3FF
------+-----------------+-----------------+-----+----------------
    8 | 400 401 402 403 | 404 405 406 407 | ... | 47C 47D 47E 47F
    9 | 480 481 482 483 | 484 485 486 487 | ... | 4FC 4FD 4FE 4FF
:
etc.
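With this layout, plotting a pixel takes one multiplication and one shift. The Python sketch below models it; the function names are hypothetical, and it assumes the left pixel of each pair is stored in the high nibble of the byte, as in Genesis character data.

```python
WIDTH, HEIGHT = 256, 224
BYTES_PER_ROW = WIDTH // 2          # two 4-bit pixels per byte -> $80 bytes per row

def linear_offset(x, y):
    """Byte offset of pixel (x, y) in the linear framebuffer."""
    return y * BYTES_PER_ROW + x // 2

def putpixel_linear(framebuf, x, y, color):
    """Write a 4-bit color without disturbing the neighbouring pixel."""
    offset = linear_offset(x, y)
    if x % 2 == 0:
        # Even x: left pixel of the pair, stored in the high nibble.
        framebuf[offset] = (framebuf[offset] & 0x0F) | ((color & 0x0F) << 4)
    else:
        # Odd x: right pixel of the pair, stored in the low nibble.
        framebuf[offset] = (framebuf[offset] & 0xF0) | (color & 0x0F)

framebuf = bytearray(BYTES_PER_ROW * HEIGHT)
putpixel_linear(framebuf, 8, 1, 0xA)   # lands in the high nibble at offset $84
```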
Below is example code for copying a linear framebuffer from RAM to VRAM. It reads 8x8 squares of pixels from the framebuffer and copies them to character data in VRAM, one character at a time. Full source code on GitHub:
vdp_control = $C00004
vdp_data = $C00000
; ----------------------------------------------------------------------------
; copy_framebuf( framebuf_addr (A0.L), VRAM_char_memory (D0.w) )
; ----------------------------------------------------------------------------
; Copies linear framebuffer from framebuf_addr in RAM (e.g. $E00000) to
; VRAM_char_memory in VRAM (e.g. $0000)
;
; A char is 8 pixels wide, 4 bits/pixel = 32 bits = 4 bytes
;
; RAM base + $000       $004       $008           $07C
;            Char0Line0 Char1Line0 Char2Line0 ... Char31Line0
; RAM base + $080       $084       $088           $0FC
;            Char0Line1 Char1Line1 Char2Line1 ... Char31Line1
; RAM base + $100       $104       $108           $17C
;            Char0Line2 Char1Line2 Char2Line2 ... Char31Line2
copy_framebuf
movem.l d0-d7/a0-a6,-(sp) ; push all registers to stack
move.w sr,-(sp) ; save status register on stack
move #$2700,sr ; disable interrupts
move.w #$8F02,vdp_control ; Set VDP autoincrement to 2 bytes (1 word) per write
; setup_vram_write( addr (D0.w) )
jsr setup_vram_write
move.w #28-1,d5 ; d5 : char row idx
write_char_row
move.w #32-1,d6 ; d6 : char idx
write_char
move.w #8-1,d7
move.l a0,a1
.write_line
move.l (a1),d0 ; d0 = line pixels
move.l d0,vdp_data ; write line to VRAM
add.l #$80,a1 ; next line
dbra d7,.write_line
.next_char
add.l #4,a0
dbra d6,write_char
add.l #$380,a0
dbra d5,write_char_row
move.w (sp)+,sr ; restore status register
movem.l (sp)+,d0-d7/a0-a6 ; pop all registers from stack
rts
; ----------------------------------------------------------------------------
; setup_vram_write( addr (D0.w) )
; ----------------------------------------------------------------------------
setup_vram_write
movem.l d0/d5-d6,-(sp)
; Bits [BBAA AAAA AAAA AAAA 0000 0000 BBBB 00AA]
; B order [10.. .... .... .... .... .... 5432 ....] oper. type
; A order [..DC BA98 7654 3210 .... .... .... ..FE] address
; -------------------------------------------------------------
; VRAM Write (00 0001) to addr $0000 (0000 0000 0000 0000):
; [01.. .... .... .... .... .... 0000 ....] oper. type
; [0100 0000 0000 0000 .... .... 0000 ..00] VRAM address
; [0100 0000 0000 0000 0000 0000 0000 0000] add zeroes
; Hex: 4 0 0 0 0 0 0 0
; > A is destination address, in this order:
; [..DC BA98 7654 3210 .... .... .... ..FE]
; Note: d0 is a *word*, may have garbage in upper 16 bit
move.l d0,d5 ; d5 = (???? ???? ???? ???? FEDC BA98 7654 3210)
and.l #%1100000000000000,d5 ; d5 = (0000 0000 0000 0000 FE00 0000 0000 0000)
lsr.l #7,d5 ; d5 = (0000 0000 0000 0000 0000 000F E000 0000)
lsr.l #7,d5 ; d5 = (0000 0000 0000 0000 0000 0000 0000 00FE)
move.l d0,d6 ; d6 = (???? ???? ???? ???? FEDC BA98 7654 3210)
and.l #%0011111111111111,d6 ; d6 = (0000 0000 0000 0000 00DC BA98 7654 3210)
lsl.l #4,d6 ; d6 = (0000 0000 0000 00DC BA98 7654 3210 0000)
lsl.l #4,d6 ; d6 = (0000 0000 00DC BA98 7654 3210 0000 0000)
lsl.l #4,d6 ; d6 = (0000 00DC BA98 7654 3210 0000 0000 0000)
lsl.l #4,d6 ; d6 = (00DC BA98 7654 3210 0000 0000 0000 0000)
move.l #$40000000,d0 ; VRAM write command
or.l d6,d0 ; d0 = (01DC BA98 7654 3210 0000 0000 0000 0000)
or.l d5,d0 ; d0 = (01DC BA98 7654 3210 0000 0000 0000 00FE)
move.l d0,vdp_control ; VDP write
movem.l (sp)+,d0/d5-d6
rts
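The bit shuffling in setup_vram_write can be cross-checked against a small Python model. This sketch follows the bit layout given in the comments above; `vram_write_command` is a hypothetical name used only here.

```python
def vram_write_command(addr):
    """Build the 32-bit VDP control word for a VRAM write to addr."""
    addr &= 0xFFFF                       # d0 is a word; ignore anything above 16 bits
    command = 0x40000000                 # operation type bits: VRAM write
    command |= (addr & 0x3FFF) << 16     # address bits D..0 go to bits 29..16
    command |= (addr & 0xC000) >> 14     # address bits F..E go to bits 1..0
    return command

cmd_zero = vram_write_command(0x0000)
cmd_high = vram_write_command(0xC000)
```

Writing the returned value to the VDP control port as one long word corresponds to the final `move.l d0,vdp_control` in the routine.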
The copy_framebuf routine is not fast enough for real-time games, but works well enough as a proof of concept.
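To make the layout translation performed by copy_framebuf concrete, here is a Python model of the same three nested loops. It is only a sketch for checking offsets on a PC, not a replacement for the assembly routine; `linear_to_char_data` is a hypothetical name.

```python
WIDTH, HEIGHT = 256, 224
BYTES_PER_ROW = WIDTH // 2      # $80 bytes per pixel row
CHAR_LINE_BYTES = 4             # 8 pixels * 4 bits
CHARS_PER_ROW = WIDTH // 8      # 32
CHAR_ROWS = HEIGHT // 8         # 28

def linear_to_char_data(framebuf):
    """Reorder a linear framebuffer into character data, one 8x8 char at a time."""
    out = bytearray()
    for char_row in range(CHAR_ROWS):
        for char in range(CHARS_PER_ROW):
            # Top line of this character in the linear framebuffer.
            src = char_row * 8 * BYTES_PER_ROW + char * CHAR_LINE_BYTES
            for _line in range(8):
                out += framebuf[src:src + CHAR_LINE_BYTES]
                src += BYTES_PER_ROW        # next pixel row, same character
    return bytes(out)

linear = bytearray(BYTES_PER_ROW * HEIGHT)
linear[0x84] = 0xAB             # pixels (8,1) and (9,1)
char_data = linear_to_char_data(linear)
```

The byte written at linear offset $84 ends up at offset $24 in the output: character 1 starts at $20, and line 1 within it adds 4 bytes.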
An alternative approach uses the same memory layout as the characters in VRAM, which should make for significantly improved performance when copying the framebuffer to VRAM.
Character Bitmap Framebuffer
The character bitmap framebuffer has a less intuitive memory layout, but should simplify copying the framebuffer to VRAM.
- Non-trivial rendering: Putting a pixel on screen requires a few calculations.
- Framebuffer is laid out as an exact copy of character data in VRAM.
- Simple VRAM copy: Optimal performance when copying to VRAM, and enables using DMA to copy.
Example character 256x224 pixel bitmap (x and y values are pixel positions, the numbers in the table denote memory byte offsets in hexadecimal, and every 8-bit byte represents two 4-bit color palette indices):
y \ x |   0   2   4   6 |   8  10  12  14 | ... | 248 250 252 254
------+-----------------+-----------------+-----+----------------
    0 |   0   1   2   3 |  20  21  22  23 | ... | 3E0 3E1 3E2 3E3
    1 |   4   5   6   7 |  24  25  26  27 | ... | 3E4 3E5 3E6 3E7
    2 |   8   9   A   B |  28  29  2A  2B | ... | 3E8 3E9 3EA 3EB
    3 |   C   D   E   F |  2C  2D  2E  2F | ... | 3EC 3ED 3EE 3EF
    4 |  10  11  12  13 |  30  31  32  33 | ... | 3F0 3F1 3F2 3F3
    5 |  14  15  16  17 |  34  35  36  37 | ... | 3F4 3F5 3F6 3F7
    6 |  18  19  1A  1B |  38  39  3A  3B | ... | 3F8 3F9 3FA 3FB
    7 |  1C  1D  1E  1F |  3C  3D  3E  3F | ... | 3FC 3FD 3FE 3FF
------+-----------------+-----------------+-----+----------------
    8 | 400 401 402 403 | 420 421 422 423 | ... | 7E0 7E1 7E2 7E3
    9 | 404 405 406 407 | 424 425 426 427 | ... | 7E4 7E5 7E6 7E7
:
etc.
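Rendering translates screen x- and y-coordinates into a byte offset in the character layout. Using integer division, the translation should be performed like this:
putpixel(x,y):
xoff_bytes = (x/2)%4 ; byte within the 4-byte character line
xoff_tile = ((x/8)%32)*$20 ; character column, $20 bytes per character
yoff = (y%8)*4 ; line within the character, 4 bytes per line
yoff_tile = (y/8)*$400 ; character row, $400 bytes per row of 32 characters
addr = xoff_bytes + xoff_tile + yoff + yoff_tile
The pixel occupies the high nibble of the byte at addr when x is even, and the low nibble when x is odd.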
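The translation can be tried out with the following Python sketch. It takes the line within the character as y % 8, so that each character row starts on a $400 boundary, and it assumes even x-coordinates map to the high nibble of a byte; `char_offset` and `putpixel_char` are hypothetical names.

```python
def char_offset(x, y):
    """Byte offset of pixel (x, y) in a 256x224 character-layout framebuffer."""
    xoff_bytes = (x // 2) % 4             # byte within the 4-byte character line
    xoff_tile = ((x // 8) % 32) * 0x20    # character column, $20 bytes each
    yoff = (y % 8) * 4                    # line within the character
    yoff_tile = (y // 8) * 0x400          # character row, 32 chars * $20 bytes
    return xoff_bytes + xoff_tile + yoff + yoff_tile

def putpixel_char(framebuf, x, y, color):
    """Write a 4-bit color, preserving the other pixel in the same byte."""
    addr = char_offset(x, y)
    if x % 2 == 0:
        framebuf[addr] = (framebuf[addr] & 0x0F) | ((color & 0x0F) << 4)
    else:
        framebuf[addr] = (framebuf[addr] & 0xF0) | (color & 0x0F)

framebuf = bytearray(0x7000)
putpixel_char(framebuf, 9, 9, 5)          # character 1 of row 1, line 1, low nibble
```

On the 68000, the divisions and modulo operations by powers of two would normally be replaced by shifts and masks.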
The algorithm for copying the character-based framebuffer to VRAM is trivial: simply copy 28 KB of RAM to VRAM. Because there is no change in memory layout, the copy can be performed using a DMA transfer from RAM to VRAM.
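As a hedged sketch of what setting up such a DMA transfer involves, the Python model below computes the values that would be written to the VDP control port. The register numbers (19 to 23 for DMA length and source) and the DMA flag in the command word reflect the commonly documented VDP interface and should be verified against SEGA2.DOC; `dma_setup_words` and `vram_dma_command` are hypothetical names, and enabling DMA in the VDP mode register is not shown.

```python
def dma_setup_words(src_addr, length_bytes):
    """Register-set words for DMA length and source (both counted in words)."""
    length_words = length_bytes // 2
    src_words = (src_addr & 0xFFFFFF) >> 1
    return [
        0x9300 | (length_words & 0xFF),          # reg 19: length, low byte
        0x9400 | ((length_words >> 8) & 0xFF),   # reg 20: length, high byte
        0x9500 | (src_words & 0xFF),             # reg 21: source, low byte
        0x9600 | ((src_words >> 8) & 0xFF),      # reg 22: source, middle byte
        0x9700 | ((src_words >> 16) & 0x7F),     # reg 23: source, high bits
    ]

def vram_dma_command(dest_addr):
    """VRAM write command with the DMA flag set, for destination dest_addr."""
    dest_addr &= 0xFFFF
    return 0x40000080 | ((dest_addr & 0x3FFF) << 16) | (dest_addr >> 14)

# Assumed example: framebuffer at the start of RAM ($FF0000), characters at VRAM $0000.
setup = dma_setup_words(0xFF0000, 0x7000)
command = vram_dma_command(0x0000)
```

Whether a full 28 KB transfer fits into the time available per frame depends on the display mode and on when the transfer is started, so the framebuffer may need to be transferred in parts.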
Alternative Framebuffer Layouts
Other framebuffer layouts could be used to implement completely different types of raster graphics. For instance, the framebuffer bitmap could use a different color depth, such as 1-bit color, or more than 16 colors, with e.g. dithering used to emulate the extra colors when transferring to VRAM.
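As one possible example of such a translation, the Python sketch below expands a byte of 1-bit pixels into the 4-bit format used by character data. The bit order (most significant bit is the leftmost pixel) and the foreground and background palette indices are assumptions; `expand_1bpp_byte` is a hypothetical name.

```python
def expand_1bpp_byte(bits, fg=0xF, bg=0x0):
    """Expand 8 one-bit pixels (MSB = leftmost) into 4 bytes of 4-bit pixels."""
    out = bytearray()
    for pair in range(4):
        left = fg if bits & (0x80 >> (pair * 2)) else bg
        right = fg if bits & (0x40 >> (pair * 2)) else bg
        out.append((left << 4) | right)      # left pixel in the high nibble
    return bytes(out)

line = expand_1bpp_byte(0b10100101)
```

A 1-bit 256x224 framebuffer needs only 7 KB of RAM, at the cost of this expansion step during the copy. On real hardware, a 256-entry lookup table would be a natural way to speed it up.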
Another approach could use a low-resolution framebuffer that gets scaled when transferring to VRAM.
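A simple form of scaling is pixel doubling. The Python sketch below doubles each 4-bit pixel horizontally, so a 128-pixel-wide line fills 256 pixels; vertical doubling could be done by writing each line twice. `double_pixels` is a hypothetical name and this is just one of many possible scaling schemes.

```python
def double_pixels(packed):
    """Double each 4-bit pixel horizontally: 1 source byte -> 2 output bytes."""
    out = bytearray()
    for byte in packed:
        left, right = byte >> 4, byte & 0x0F
        out.append((left << 4) | left)       # left pixel, twice
        out.append((right << 4) | right)     # right pixel, twice
    return bytes(out)

scaled = double_pixels(bytes([0x1A, 0x05]))
```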
Graphical fidelity can be sacrificed for much better performance by leaving the character data in VRAM untouched: a fixed set of characters is loaded once, and the framebuffer is displayed by writing to the playfield map instead. If each pixel in the framebuffer is mapped to a single tile, the result is something equivalent to ASCII art, but it is also possible to achieve higher resolutions by using semigraphical characters.
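A minimal sketch of the semigraphics idea in Python, assuming a hypothetical set of 16 pre-built characters where each one shows a different on/off pattern of four 4x4-pixel quadrants, stored consecutively from `first_char`. Both function names are made up for this example.

```python
def semigraphic_index(top_left, top_right, bottom_left, bottom_right):
    """Pick one of 16 characters from the on/off state of its four quadrants."""
    return (top_left << 3) | (top_right << 2) | (bottom_left << 1) | bottom_right

def framebuffer_to_map(pixels, first_char=0):
    """pixels: rows of 0/1 values with even width and height -> rows of map entries."""
    rows = []
    for y in range(0, len(pixels), 2):
        row = []
        for x in range(0, len(pixels[y]), 2):
            index = semigraphic_index(pixels[y][x], pixels[y][x + 1],
                                      pixels[y + 1][x], pixels[y + 1][x + 1])
            row.append(first_char + index)
        rows.append(row)
    return rows

map_rows = framebuffer_to_map([[1, 0, 0, 0],
                               [0, 1, 1, 1]])
```

With this scheme, a 32x28 cell map displays a 64x56 monochrome framebuffer, and a full update only has to write 896 map entries.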
References
- 'Racing the Beam' (2009) by Nick Montfort and Ian Bogost
- SEGA2.DOC - VDP information and terminology