Episode 12: Double-Buffering

We thought we had fixed the flickering in Episode 10 when we made our draw_rect subroutine more efficient, but as you can see in the Episode 11 DOSBox video (and even more prominently in the video of the code running on my actual PCjr), it still appears occasionally. Our icon is 14x16, and the processor is just not fast enough to copy sixteen 7-byte chunks of data (remember each byte encodes 2 pixels) in the time it takes the CRT to draw the first couple of scanlines. Clearly we need a better solution.
I suppose we could make draw_icon even more efficient, and more carefully synchronize the interplay between the CPU and the CRT (for example, scheduling the redrawing of the icon immediately after the last scanline of the previous icon had been drawn to give us the maximum time in which to draw it), but an easier solution is to use a little extra RAM to double-buffer.
Double-buffering uses two screen-sized areas of memory (let's call them page 1 and page 2) instead of one. While the CPU is busy with the time-consuming process of composing a complex scene on page 1, the CRT can be rendering page 2 to the screen. When the CPU is finished drawing on page 1, it either copies it (usually only the parts that have changed) to page 2 or swaps pages with the CRT, so that the CRT is now rendering page 1 to the screen and the CPU can begin drawing the next scene in page 2.
The page-swap technique is great if you're redrawing the whole frame every time, but we'll just be moving our player icon a bit and leaving the rest of the frame as it is. Our approach will be to do all our drawing, erasing and redrawing in a new buffer (which we'll call the compositor), and copy only the changed area to the video display area (which we'll call the framebuffer). So:
- Store player's previous position, then move it to a new position (if a movement key is pressed)
- Erase (by drawing a background-colored rectangle over) the player's previous position in the compositor
- Redraw the player in its new position on the compositor
- Calculate a rectangle big enough to cover the player's previous and new positions
- Copy that rectangle of pixels from the compositor to the framebuffer
This may sound like a lot of work, but it's actually not. We'll add one new subroutine, but we actually get to simplify two existing ones. This is because our compositor won't have to use the complex 4-banks-of-scanlines structure that the framebuffer uses -- it can just represent the screen in the simplest way possible -- one continuous block of pixels from left to right and top to bottom.
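The fourth step in the list (the covering rectangle) is just a min and a difference on each axis. Here's a minimal sketch of how it could look. The variable names (player_x, player_prev_x and so on), the ICON_W/ICON_H constants and the register-based return convention are all assumptions made for illustration, not necessarily what the repo uses. It also assumes the player's x position is always even, so the rectangle starts on a byte boundary.

```nasm
ICON_W equ 14                   ; Icon size from the text (14x16)
ICON_H equ 16

section .data
player_prev_x: dw 0             ; Hypothetical names for the player's
player_prev_y: dw 0             ; previous and new positions
player_x:      dw 0
player_y:      dw 0

section .text
; dirty_rect (hypothetical helper)
; Computes the rectangle covering the player's previous and new positions.
; Returns: AX = x, BX = y, CX = w, DX = h
dirty_rect:
    mov ax, [player_prev_x]
    mov cx, [player_x]
    cmp ax, cx
    jbe .xOrdered
    xchg ax, cx                 ; Make AX the smaller x, CX the larger
.xOrdered:
    sub cx, ax                  ; CX = distance moved horizontally
    add cx, ICON_W              ; ... + icon width = rect width
    mov bx, [player_prev_y]
    mov dx, [player_y]
    cmp bx, dx
    jbe .yOrdered
    xchg bx, dx                 ; Make BX the smaller y, DX the larger
.yOrdered:
    sub dx, bx                  ; DX = distance moved vertically
    add dx, ICON_H              ; ... + icon height = rect height
    ret
```

If the player moves 2 pixels right from (100, 50), this gives x=100, y=50, w=16, h=16. Pushing DX, CX, BX, AX in that order would match the stack layout draw_rect uses (x at bp+4).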
Where do we create this compositor buffer? As you remember, in 320x200x16 mode, the CRT reads from the upper 32KB of the IBM PCjr's 128KB of memory, or 0x18000-0x1ffff. We'll use another 32KB area immediately below this (0x10000-0x17fff) for our compositor. Let's create variables for both in 320x200x16.asm:
section .data
COMPOSITOR_SEG: dw 0x1000 ; Page 4-5
FRAMEBUFFER_SEG: dw 0x1800 ; Page 6-7
room_width_px: dw 320
room_height_px: dw 168
section .text
draw_rect and draw_icon need to be modified to reference [COMPOSITOR_SEG] instead of 0x1800, and we can take out the complex code that calculates bank and row numbers. The memory location for pixel (x,y) in the compositor is simply:
[COMPOSITOR_SEG] + (y * 320 + x) / 2
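To make that concrete, here's the arithmetic for one pixel, written as assemble-time constants so NASM does the math. The pixel (100, 50) and the constant names are just examples. The framebuffer line is my reading of the 4-banks-of-scanlines layout (bank = y mod 4, 8KB per bank, 160 bytes per scanline), so treat it as an assumption and check it against the earlier episodes.

```nasm
ROOM_WIDTH_PX equ 320
PIXEL_X       equ 100           ; Example pixel, chosen arbitrarily
PIXEL_Y       equ 50

; Compositor: one continuous block, 2 pixels per byte
;   (50 * 320 + 100) / 2 = 16100 / 2 = 8050 = 0x1F72
COMP_OFFSET   equ (PIXEL_Y * ROOM_WIDTH_PX + PIXEL_X) / 2

; Framebuffer (assumed layout): bank 2, row 12 within the bank, byte 50
;   2 * 0x2000 + 12 * 160 + 50 = 0x4000 + 0x780 + 0x32 = 0x47B2
FB_OFFSET     equ (PIXEL_Y % 4) * 0x2000 + (PIXEL_Y / 4) * 160 + PIXEL_X / 2
```

Same pixel, two very different offsets, which is why the copy step later on needs both formulas.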
Here's a rewrite of draw_rect:
; draw_rect( x, y, w, h, color )
; Draws a colored rectangle of pixels of the given size at the given location
; to the compositor.
; Args:
;   bp+4 = x, bp+6 = y,
;   bp+8 = w, bp+10 = h,
;   bp+12 = color
draw_rect:
    push bp
    mov bp, sp
    ; Duplicate the 4-bit color into both nibbles of AL (2 pixels per byte)
    mov ax, [bp+12]
    mov cx, 4
    shl al, cl
    xor al, [bp+12]
    mov si, ax                  ; Color in SI
    mov di, [COMPOSITOR_SEG]
    mov es, di                  ; Compositor segment in ES
    mov cx, [bp+10]             ; This CX will count down the rows
.copyLine:
    ; Compute which row number we're writing to
    mov ax, [bp+10]             ; Start with rect height
    sub ax, cx                  ; Subtract countdown to give us rect row
    add ax, [bp+6]              ; Add Y location to rect row number
    ; Compute starting byte offset for this location
    ; DI = (AX * 320 + x) / 2
    mov bx, [cs:room_width_px]
    mul bx                      ; AX *= 320
    add ax, [bp+4]              ; ... + x
    shr ax, 1                   ; ... / 2
    mov di, ax
    push cx
    mov cx, [bp+8]
    shr cx, 1                   ; Because each byte encodes 2 pixels
.copyByte:
    mov ax, si
    mov [es:di], al
    inc di
    loop .copyByte
    pop cx
    loop .copyLine
    pop bp
    ret 10
You'll see a similar rewrite for draw_icon in the repo.
Now we have to write a subroutine for copying a rectangle of pixels from the compositor to the framebuffer. This should be pretty straightforward since we know how to calculate the location of a row in the compositor (the simplified formula above) and the framebuffer (the bank/scanline formula). For each row in the rectangle, we just find the starting location in each buffer and do the memory copy.
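Here's a minimal sketch of what such a subroutine could look like. This is not the version from the repo: the name blit_rect and its argument order are made up for illustration, and the framebuffer offset assumes the bank layout is bank = y mod 4, 0x2000 bytes per bank and 160 bytes per scanline. It also assumes x and w are even, so the rectangle covers whole bytes.

```nasm
; blit_rect( x, y, w, h )  (hypothetical name)
; Copies a rectangle of pixels from the compositor to the framebuffer.
; Args:
;   bp+4 = x, bp+6 = y,
;   bp+8 = w, bp+10 = h
blit_rect:
    push bp
    mov bp, sp
    push ds                     ; We repoint DS below, so save it
    mov ax, [FRAMEBUFFER_SEG]
    mov es, ax                  ; ES = destination (framebuffer)
    mov ax, [COMPOSITOR_SEG]
    mov ds, ax                  ; DS = source (compositor); set this last
    cld                         ; MOVSB should count upward
    mov bx, [bp+6]              ; BX = current row, starting at y
    mov dx, [bp+10]             ; DX = rows left to copy
.copyRow:
    ; Source offset: SI = (row * 320 + x) / 2
    mov ax, 320
    push dx                     ; 16-bit MUL clobbers DX
    mul bx                      ; AX = row * 320
    pop dx
    add ax, [bp+4]              ; ... + x ([bp+n] reads via SS, not DS)
    shr ax, 1                   ; ... / 2
    mov si, ax
    ; Destination offset: DI = (row mod 4) * 0x2000 + (row / 4) * 160 + x / 2
    mov ax, bx
    shr ax, 1
    shr ax, 1                   ; AX = row / 4 (at most 49)
    mov cl, 160
    mul cl                      ; AX = AL * 160 (8-bit MUL leaves DX alone)
    mov di, ax
    mov ax, bx
    and ax, 3                   ; AX = bank number
    mov cl, 13
    shl ax, cl                  ; AX = bank * 0x2000
    add di, ax
    mov ax, [bp+4]
    shr ax, 1                   ; AX = x / 2
    add di, ax
    ; Copy one row of bytes
    mov cx, [bp+8]
    shr cx, 1                   ; Because each byte encodes 2 pixels
    rep movsb                   ; DS:SI -> ES:DI, CX bytes
    inc bx                      ; Next row
    dec dx
    jnz .copyRow
    pop ds
    pop bp
    ret 8
```

Arguments are pushed in reverse order (h, w, y, x), the same way as for draw_rect. Using rep movsb for each row keeps the inner loop down to a single instruction, at the cost of borrowing DS for the duration of the call.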
How did we do?
Pretty good!!
There's one more thing we have to take care of, and it's a big one.
When I ran this for the first time on the PCjr, I got all sorts of strange behavior. Nothing would draw, then only parts of the scene would draw, and the movement keys didn't work at all. In DOSBox, however, the code worked fine – until I tried a different version of DOSBox and saw the same behavior. So my code was clearly in error, and the particular version of DOSBox I had been using for development was shielding me from that error. After enough digging I found out what it was.
Intro to DEBUG.COM
I came across the MS-DOS DEBUG program during my research, and decided to give it a shot. DEBUG.COM lets us load a COM program, set breakpoints, trace (run program steps one at a time), and view register and memory contents. I figured I'd be able to find out where my program was hanging by setting breakpoints farther and farther along in the code until I found a place where execution stopped. But I didn't need to go that far yet. I loaded my program in DEBUG.COM in both versions of DOSBox, hit 'r' to view the initial register contents, and something stuck out to me:
EmuCR DOSBox SVN
AX=0000 BX=0000 CX=03BE DX=0000 SP=3BDE BP=0000 SI=0000 DI=0000
DS=019C ES=019C SS=019C CS=019C IP=0100 NV UP EI PL NZ NA PO NC
019C:0100 BC0020 MOV AX,0F00
DOSBox 0.74-3
AX=FFFF BX=0000 CX=03BE DX=0000 SP=FFFE BP=0000 SI=0000 DI=0000
DS=019C ES=019C SS=019C CS=019C IP=0100 NV UP EI PL NZ NA PO NC
019C:0100 BC0020 MOV AX,0F00
Aside from the difference in AX (which doesn't matter because we immediately set AX), look at the stack pointer locations. The official DOSBox release is doing what it's supposed to do, putting the stack pointer (SP) at 0xFFFE. The other DOSBox, built from SVN commit 4085, is putting the stack pointer at 0x3BDE. DOSBox SVN is what I've been using for development. From what I have always read, COMMAND.COM sets the stack pointer at 0xFFFE for a COM program, so the official DOSBox release is correct. But look where that puts the stack pointer: right in the middle of memory we're now writing over!
In contrast, DOSBox SVN is setting the stack pointer safely out of the range of our buffers. Since we're writing over the memory where the stack would normally be, we need to make sure we move that pointer as a first step. For now, we'll just set the stack pointer to 0x2000, or 8K above the start of the program's segment in memory. Since our program code plus data is nowhere near 8K at the moment, it should be fine:
; Move stack pointer out of the way so we have free rein of the
; upper half of memory (64K-128K).
mov sp, 0x2000
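If you want to check the overlap yourself, a real-mode physical address is segment * 16 + offset. Here's that arithmetic as assemble-time constants, using the SS value from the DEBUG dumps above. The constant names are made up, and 0x019C is just what this particular session showed, so the exact numbers will differ on another setup.

```nasm
LOAD_SEG        equ 0x019C      ; SS as shown in the DEBUG register dumps

; Default COM stack (SP=0xFFFE): lands inside the compositor
;   0x19C0 + 0xFFFE = 0x119BE, within 0x10000-0x17FFF
STACK_DEFAULT   equ LOAD_SEG * 16 + 0xFFFE

; DOSBox SVN stack (SP=0x3BDE): well below both buffers
;   0x19C0 + 0x3BDE = 0x0559E
STACK_SVN       equ LOAD_SEG * 16 + 0x3BDE

; Our relocated stack (SP=0x2000): also below both buffers
;   0x19C0 + 0x2000 = 0x039C0
STACK_RELOCATED equ LOAD_SEG * 16 + 0x2000
```

The stack grows downward from 0x039C0 toward the program, so this only stays safe while code plus data plus stack depth fit in that 8K.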
See the full code for this episode in the project's GitHub.