Search This Blog

Monday, July 30, 2018

STM8 Software SPI

 Projects / Misc 8-bit uC projects  Original post date:  02/28/2017

Software SPI is useful when the shared peripheral pins are occupied.  Here is a typical C code implementation (in Cosmos C):


Here is the waveform:


I have rewritten the code by interleaving in the bit toggling C statements with assembly for efficiency. The GPIO used for SPI can be customized in a header file.


The assembly code takes advantage of the persistence of some of the CPU flags from operation for conditional branches. This allows me to pace out the bit toggling statements to achieve close to 50% duty cycle on the SCK signal. I have also expanded the code for the if statement to cover both branches to reduce a jump. Jumps cause a pipeline flush which might take a few cycles to recover
This is the waveform:


FYI: The C code takes about 17 cycles per bit while the assembly version takes 10 cycles.
The GPIO are declared in header file:

enum _PB { PB4=0x10, PB5=0x20 };
enum _PC { PC3=0x08, PC4=0x10, PC5=0x20, PC6=0x40, PC7=0x80 };

#define MOSI_PORT       GPIOB
#define SPI_MOSI PB5
#define SCK_PORT        GPIOC
#define SPI_SCK  PC5
#define DC_PORT         GPIOB
#define LCD_DC          PB4

The bit level toggling C code is translated into a single Bit Set/Reset instruction that takes a cycle to execute.


I took the suggestions from Thomas on hackaday.io and updated the SPI code. It is a bit shorter and faster. I have to rebalance the duty cycle.


I added the following to my header file to specify the GPIO (_P: port, _B: bit number)

#define MOSI_P      B
#define SPI_MOSI_B  5

The following code before the C function are for passing the port definition into the assembly code


#define STR_EXPAND(X)  #X
#define STR(X)   STR_EXPAND(X)
#define BCCM(GPIO,BIT_NUM) "bccm GPIO" STR(GPIO) "_ODR" ",#" STR(BIT_NUM)

#asm
GPIOA_ODR: equ 0x5000
GPIOB_ODR: equ 0x5005
GPIOC_ODR: equ 0x500a
GPIOD_ODR: equ 0x500f
#endasm

This line: _asm(BCCM(MOSI_P,SPI_MOSI_B));

is translated by C macros to: _asm("bccm GPIOB_ODR,#5");

The assembly code looks like this:


Here is the waveform:

The new code takes 8 cycles per bit vs 10 for previous version vs 17 for C code.

If I use the left shift instruction to operate on the memory instead of the accumulator, it ended up slower. sll (0x02,sp) vs sll a


My speculation is that the "sll" instruction is after a branch which causes a pipeline flush. The different code sequence changes the byte alignment hence different prefetch cycles to recover.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.