Episode 12: Double-Buffering

Joshua Foster

08 Jul 2019 — 6 min read

We thought we had fixed the flickering in Episode 10 when we made our draw_rect subroutine more efficient, but if you watch the Episode 11 DOSBox video (and even more so the video of the code running on my actual PCjr), you'll notice it still appears occasionally. Our icon is 14x16, and the processor is just not fast enough to copy 16 7-byte chunks of data (remember, each byte encodes 2 pixels) in the time it takes the CRT to draw the first couple of scanlines. Clearly we need a better solution.

I suppose we could make draw_icon even more efficient, and more carefully synchronize the interplay between the CPU and the CRT (for example, scheduling the redrawing of the icon immediately after the last scanline of the previous icon had been drawn to give us the maximum time in which to draw it), but an easier solution is to use a little extra RAM to double-buffer.

Double-buffering uses two screen-sized areas of memory (let's call them page 1 and page 2) instead of one. While the CPU is busy with the time-consuming process of composing a complex scene on page 1, the CRT can be rendering page 2 to the screen. When the CPU is finished drawing on page 1, it either copies it (usually only the parts that have changed) to page 2 or swaps pages with the CRT, so that the CRT is now rendering page 1 to the screen and the CPU can begin drawing the next scene in page 2.

The page-swap technique is great if you're redrawing the whole frame every time, but we'll just be moving our player icon a bit and leaving the rest of the frame as it is. Our approach will be to do all our drawing, erasing and redrawing in a new buffer (which we'll call the compositor), and copy the changed area to the video display areas (which we'll call the framebuffer). So:

1. Store player's previous position, then move it to a new position (if a movement key is pressed)
2. Erase (by drawing a background-colored rectangle over) the player's previous position in the compositor
3. Redraw the player in its new position on the compositor
4. Calculate a rectangle big enough to cover the player's previous and new positions
5. Copy that rectangle of pixels from the compositor to the framebuffer
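
The rectangle calculation is the only new bit of arithmetic in that list. Here is a rough sketch of how it could look. The variable names (player_x, player_y, prev_x, prev_y), the ICON_W/ICON_H constants and the dirty_rect label are placeholders for illustration, not necessarily what the repo uses, and it assumes coordinates are unsigned and already on screen:

```nasm
section .data

  ; Hypothetical variables, for illustration only
  player_x: dw 0    ; Position after this frame's move
  player_y: dw 0
  prev_x: dw 0      ; Position before this frame's move
  prev_y: dw 0

  ICON_W equ 14
  ICON_H equ 16

section .text

; dirty_rect()
; Computes the smallest rectangle covering the icon at both its
; previous and new positions.
; Returns: AX = x, BX = y, CX = w, DX = h
dirty_rect:
  mov ax, [prev_x]
  mov cx, [player_x]
  cmp ax, cx
  jbe .xSorted      ; AX is already the smaller x
  xchg ax, cx
.xSorted:           ; AX = left edge, CX = the larger x
  sub cx, ax
  add cx, ICON_W    ; w = horizontal distance moved + icon width

  mov bx, [prev_y]
  mov dx, [player_y]
  cmp bx, dx
  jbe .ySorted      ; BX is already the smaller y
  xchg bx, dx
.ySorted:           ; BX = top edge, DX = the larger y
  sub dx, bx
  add dx, ICON_H    ; h = vertical distance moved + icon height
  ret
```

For a move from (100, 50) to (104, 50), this returns x=100, y=50, w=18, h=16. Since each byte holds 2 pixels, a sketch like this only lines up with whole bytes if x stays even.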

This may sound like a lot of work, but it's actually not. We'll add one new subroutine, but we actually get to simplify two existing ones. This is because our compositor won't have to use the complex 4-banks-of-scanlines structure that the framebuffer uses -- it can just represent the screen in the simplest way possible -- one continuous block of pixels from left to right and top to bottom.

Where do we create this compositor buffer? As you may remember, in 320x200x16 mode, the CRT reads from the upper 32KB of the IBM PCjr's 128KB of memory, or 0x18000-0x1ffff. We'll use another 32KB area immediately below this (0x10000-0x17fff) for our compositor. Let's create variables for both in 320x200x16.asm:

section .data

  COMPOSITOR_SEG: dw 0x1000   ; Page 4-5
  FRAMEBUFFER_SEG: dw 0x1800  ; Page 6-7

  room_width_px: dw 320
  room_height_px: dw 168


section .text

draw_rect and draw_icon need to be modified to reference [COMPOSITOR_SEG] instead of 0x1800, and we can take out the complex code that calculates bank and row numbers. The memory location for pixel (x,y) in the compositor is simply

[COMPOSITOR_SEG] + (y * 320 + x) / 2

Here's a rewrite of draw_rect:

; draw_rect( x, y, w, h, color )
; Draws a colored rectangle of pixels of the given size at the given location
; to the compositor.
; Args:
;   bp+4 = x, bp+6 = y,
;   bp+8 = w, bp+10 = h,
;   bp+12 = color
draw_rect:
  push bp
  mov bp, sp

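  mov ax, [bp+12]  ; AL = color in the low nibble
  mov cx, 4
  shl al, cl       ; Move the color into the high nibble
  or al, [bp+12]   ; Copy it into the low nibble too: 1 byte = 2 pixels
  mov si, ax       ; Color in SI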

  mov di, [COMPOSITOR_SEG]
  mov es, di       ; Compositor segment in ES

  mov cx, [bp+10]  ; This CX will count down the rows
  
.copyLine:
  ; Compute which row number we're writing to
  mov ax, [bp+10] ; Start with rect height
  sub ax, cx      ; Subtract countdown to give us rect row
  add ax, [bp+6]  ; Add Y location to rect row number
  
  ; Compute starting byte offset for this location
  ; DI = (AX * 320 + x) / 2
  mov bx, [cs:room_width_px]
  mul bx           ; AX *= 320
  add ax, [bp+4]   ; ... + x
  shr ax, 1        ; ... / 2
  mov di, ax

  push cx
  mov cx, [bp+8]
  shr cx, 1        ; Because each byte encodes 2 pixels

.copyByte:
  mov ax, si
  mov [es:di], al
  inc di
  loop .copyByte

  pop cx
  loop .copyLine

  pop bp
  ret 10

You'll see a similar rewrite for draw_icon in the repo.

Now we have to write a subroutine for copying a rectangle of pixels from the compositor to the framebuffer. This should be pretty straightforward since we know how to calculate the location of a row in the compositor (the simplified formula above) and the framebuffer (the bank/scanline formula). For each row in the rectangle, we just find the starting location in each buffer and do the memory copy.
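
As a minimal sketch of what such a subroutine might look like (the repo has the real one): the name blit_rect is a placeholder, and the framebuffer formula assumes the four scanline banks sit 8KB (0x2000 bytes) apart with screen row y living in bank y mod 4, at 160 bytes per row. It also assumes x and w are even and h is at least 1.

```nasm
; blit_rect( x, y, w, h )   (hypothetical name)
; Copies a rectangle of pixels from the compositor to the framebuffer.
; Args:
;   bp+4 = x, bp+6 = y,
;   bp+8 = w, bp+10 = h
blit_rect:
  push bp
  mov bp, sp
  push ds

  mov ax, [FRAMEBUFFER_SEG]
  mov es, ax       ; ES = destination (framebuffer)
  mov ax, [COMPOSITOR_SEG]
  mov ds, ax       ; DS = source (compositor). Our own variables are
                   ; not reachable through DS until we pop it
  cld              ; Make movsb count upward

  mov bx, [bp+6]   ; BX = current screen row, starting at y

.copyRow:
  ; Source offset: SI = row * 160 + x / 2
  mov ax, 160
  mul bx           ; AX = row * 160 (DX is clobbered)
  mov dx, [bp+4]
  shr dx, 1        ; DX = x / 2
  add ax, dx
  mov si, ax

  ; Destination offset (assumed layout):
  ; DI = (row mod 4) * 0x2000 + (row / 4) * 160 + x / 2
  mov ax, bx
  shr ax, 1
  shr ax, 1        ; AX = row / 4
  mov cx, 160
  mul cx           ; AX = (row / 4) * 160 (DX is clobbered)
  mov di, ax
  mov ax, bx
  and ax, 3        ; AX = row mod 4, the bank number
  mov cl, 13
  shl ax, cl       ; AX = bank * 0x2000
  add di, ax
  mov ax, [bp+4]
  shr ax, 1        ; AX = x / 2
  add di, ax

  mov cx, [bp+8]
  shr cx, 1        ; Because each byte encodes 2 pixels
  rep movsb        ; Copy one row from DS:SI to ES:DI

  inc bx           ; Next screen row
  dec word [bp+10] ; One fewer row to go
  jnz .copyRow

  pop ds
  pop bp
  ret 8
```

The arguments are pushed in reverse order (h, w, y, x), the same convention draw_rect uses. Because bp-relative addressing goes through SS, the arguments stay readable even while DS points at the compositor.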

How did we do?

Pretty good!!

There's one more thing we have to take care of, and it's a big one.

When I ran this for the first time on the PCjr, I got all sorts of strange behavior. Nothing would draw, then only parts of the scene would draw, and the movement keys didn't work at all. In DOSBox, however, the code worked fine – until I tried a different version of DOSBox and saw the same behavior. So my code was clearly in error, and the particular version of DOSBox I had been using for development was shielding me from that error. After enough digging I found out what it was.

Intro to DEBUG.COM

I came across the MS-DOS DEBUG program during my research, and decided to give it a shot. DEBUG.COM lets us load a COM program, set breakpoints, trace (run program steps one at a time), and view register and memory contents. I figured I'd be able to find out where my program was hanging by setting breakpoints farther and farther along in the code until I found a place where execution stopped. But I didn't need to go that far yet. I loaded my program in DEBUG.COM in both versions of DOSBox, hit 'r' to view the initial register contents, and something stuck out to me:

EmuCR DOSBox SVN

AX=0000 BX=0000 CX=03BE DX=0000 SP=3BDE BP=0000 SI=0000 DI=0000
DS=019C ES=019C SS=019C CS=019C IP=0100 NV UP EI PL NZ NA PO NC
019C:0100 BC0020            MOV     AX,0F00

DOSBox 0.74-3

AX=FFFF BX=0000 CX=03BE DX=0000 SP=FFFE BP=0000 SI=0000 DI=0000
DS=019C ES=019C SS=019C CS=019C IP=0100 NV UP EI PL NZ NA PO NC
019C:0100 BC0020            MOV     AX,0F00

Aside from the difference in AX (which doesn't matter because we immediately set AX), look at the stack pointer values. The official DOSBox release is doing what it's supposed to do, putting the stack pointer (SP) at 0xFFFE. The other DOSBox, built from SVN commit 4085, is putting the stack pointer at 0x3BDE. DOSBox SVN is what I've been using for development. From what I have always read, COMMAND.COM sets the stack pointer at 0xFFFE for a COM program, so the official DOSBox release is correct. But look where that puts the stack pointer:

Right in the middle of memory we're now writing over! With SS=0x019C and SP=0xFFFE, the top of the stack sits at physical address 0x019C0 + 0xFFFE = 0x119BE, which is inside our compositor buffer (0x10000-0x17fff). In contrast, DOSBox SVN is setting the stack pointer safely out of the range of our buffers. Since we're writing over the memory where the stack would normally be, we need to make sure we move that pointer as a first step. For now, we'll just set the stack pointer to 0x2000, or 8K above the start of the program's segment in memory. Since our program code plus data is nowhere near 8K at the moment, it should be fine:

; Move stack pointer out of the way so we have free rein of the
; upper half of memory (64K-128K).
mov sp, 0x2000

See the full code for this episode in the project's GitHub.
