Sunday, July 29, 2018

ChibiTerm - An alternative to ping pong buffering in rendering

Projects / ChibiTerm  Original post date: 04/25/2016

Just before I took a cat nap, I was reading over NTSC generation. As I was still hazy getting up, I had an idea...

Instead of using a ping pong buffer to render the scan line, why can't I use a single buffer? The rendering code runs at about 2x the SPI data rate, so all it needs is a head start: rendering a small part of the line at the horizontal blanking IRQ. The code is a bit messier, and I have to do some trial and error to see how much I can render before putting the core to sleep for jitter-free DMA from the IRQ.

This will free up 80 bytes of RAM.

So it looks like it is a bit messier than I thought. The display looks kind of weird, with the first few characters shifted down.

[Image: vertical misalignment]
Basically, there were only enough cycles to render 1 character before the sleep. When the core wakes up for DMA, the DMA prefetches and fills its 4-byte FIFO with whatever was in the buffer from the last scan line, as there wasn't enough of a lead in the buffer. After a few characters, the rendering caught up and the correct data was fetched.
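
To make the failure concrete, here is a rough host-side timing model in C. It is only a sketch under stated assumptions, not the actual ChibiTerm code: time is counted in "render ticks", the renderer is assumed to write exactly one byte per tick, the DMA drains one byte every 2 ticks (the "about 2x" figure), and the 4-byte FIFO is assumed to be prefetched instantly when the DMA starts. The names `count_stale_bytes`, `dma_fetch_tick` and `render_done_tick` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model constants (assumptions, not measured values). */
enum { FIFO_DEPTH = 4, RENDER_TICKS_PER_BYTE = 1, DMA_TICKS_PER_BYTE = 2 };

/* Tick at which the DMA reads byte i from the line buffer. The first
 * FIFO_DEPTH bytes are prefetched at t = 0; each later byte is read
 * when a FIFO slot frees up. */
static long dma_fetch_tick(size_t i)
{
    if (i < FIFO_DEPTH)
        return 0;
    return (long)(i - FIFO_DEPTH + 1) * DMA_TICKS_PER_BYTE;
}

/* Tick at which the renderer finishes writing byte i, given that `lead`
 * bytes were already rendered before the DMA started at t = 0. */
static long render_done_tick(size_t i, size_t lead)
{
    if (i < lead)
        return 0;
    return (long)(i - lead + 1) * RENDER_TICKS_PER_BYTE;
}

/* Count the bytes the DMA reads before the renderer has written them,
 * i.e. bytes that still hold the previous scan line. */
size_t count_stale_bytes(size_t line_bytes, size_t lead)
{
    assert(lead <= line_bytes);
    size_t stale = 0;
    for (size_t i = 0; i < line_bytes; i++) {
        if (dma_fetch_tick(i) < render_done_tick(i, lead))
            stale++;
    }
    return stale;
}
```

In this model, `count_stale_bytes(80, 1)` counts a handful of stale bytes at the start of the line and none after that, which matches the picture: the first few characters come from the previous scan line, then the renderer pulls ahead. The line length of 80 is a guess; the result is the same for any line longer than a few bytes.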

I think I have to do some more ugly coding.

  • I render the first 8 characters of the first line before the first active display line. This buys time for the rendering loop to get going. Once it has started, the loop is fast enough to stay ahead of the DMA.
  • For each line, start the DMA, then:
  1. Render the remaining characters in the line. (The DMA will have reached about halfway by then.)
  2. Render the first 8 characters of the next line.
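
The schedule above can be checked with a small host-side simulation in C. Again, this is a sketch under assumptions rather than the real IRQ code: `COLS`, `ROWS`, the tick rates and all function names (`run_frame`, `render_chars`, `start_dma`, ...) are hypothetical, one character is treated as one byte, and the DMA is modelled as reading one byte every 2 render ticks after an instant 4-byte prefetch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sizes and rates for the model; not ChibiTerm's values. */
enum { COLS = 40, ROWS = 4, FIFO_DEPTH = 4, DMA_TICKS_PER_BYTE = 2 };

static unsigned char line_buf[COLS];    /* the single scan-line buffer */
static unsigned char sent[ROWS][COLS];  /* bytes the modelled DMA shifted out */
static int dma_line, dma_pos, dma_active, ticks;

/* Value the renderer produces for (line, col); never 0, differs per line. */
static unsigned char pattern(int line, int col)
{
    return (unsigned char)(line * COLS + col + 1);
}

/* The DMA reads the next byte from the buffer as it stands right now. */
static void dma_read_one(void)
{
    if (dma_pos < COLS) {
        sent[dma_line][dma_pos] = line_buf[dma_pos];
        dma_pos++;
    }
}

/* Advance time by one render tick; the DMA reads a byte every 2 ticks. */
static void tick(void)
{
    ticks++;
    if (dma_active && ticks % DMA_TICKS_PER_BYTE == 0)
        dma_read_one();
}

/* Render columns [first, first + count) of `line`, one byte per tick. */
static void render_chars(int line, int first, int count)
{
    for (int c = first; c < first + count; c++) {
        line_buf[c] = pattern(line, c);
        tick();
    }
}

/* Start the DMA for `line`; it prefetches FIFO_DEPTH bytes immediately. */
static void start_dma(int line)
{
    dma_line = line;
    dma_pos = 0;
    ticks = 0;
    dma_active = 1;
    for (int i = 0; i < FIFO_DEPTH; i++)
        dma_read_one();
}

/* Run one frame with the given head start (in characters) and return the
 * number of bytes that went out with stale content. */
int run_frame(int head_start)
{
    assert(head_start >= 0 && head_start <= COLS);
    memset(line_buf, 0, sizeof line_buf);
    memset(sent, 0, sizeof sent);
    dma_active = 0;

    render_chars(0, 0, head_start);            /* before the first active line */
    for (int line = 0; line < ROWS; line++) {
        start_dma(line);
        render_chars(line, head_start, COLS - head_start);  /* step 1 */
        if (line + 1 < ROWS)
            render_chars(line + 1, 0, head_start);          /* step 2 */
        while (dma_pos < COLS)                 /* core sleeps until DMA is done */
            tick();
        dma_active = 0;
    }

    int bad = 0;
    for (int line = 0; line < ROWS; line++)
        for (int c = 0; c < COLS; c++)
            bad += sent[line][c] != pattern(line, c);
    return bad;
}
```

With a head start of 8 characters, `run_frame(8)` reports no stale bytes in this model, while `run_frame(1)` does. Step 2 is safe here because the DMA has long since read the first 8 columns by the time they get overwritten with the next line.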
[Images: build output for the old code and the new code. Note the ZI-data usage.]
It is working now. Free RAM: 4096 - 3920 - 48 = 128 bytes. I freed up 88 bytes. Feels like I am working on an 8-bit uC. :(


I had a lot of problems trying to keep the values in a struct, as the different sections of my IRQ service routine actually get called in separate IRQ instances. It turns out that using local variables saves more RAM. The slightly bigger code size isn't that bad, as I don't think it'll fill the 16k for this project. (The chip actually has 32k, as can be seen from the debugger.)

