PIC Delay routines done wrong

During my recent reverse engineering of a PIC12F675 learning board on EBAY I came across the following piece of code…

L7: MOVWF 0x20
    CLRF 0x22
L6: CLRF 0x21
L5: MOVF 0x20,W
    SUBWF 0x21,W
    BTFSC STATUS,C
    GOTO L4
    INCF 0x21,F
    GOTO L5
L4: MOVLW 0xC8
    INCF 0x22,F
    SUBWF 0x22,W
    BTFSC STATUS,C
    RETURN
    GOTO L6

The routine is a classic example of a delay loop realised by having multiple nested loops, burning CPU time by executing the inner instructions many many times.  In the example program where I found this subroutine the intention was to provide a delay which would be used to slow down the flashing of an LED in a simple demo.


loop:	LED ON
	MOVLW 0xF4
        CALL L7
        LED OFF
	MOVLW 0xF4
        CALL L7
        GOTO loop 

So the usage is between LED ON and LED OFF load the literal value 0xF4 which is 244 decimal and call the delay routine to slow down the flashing. 244 is an odd number so this piqued interest and warranted some further investigation.

Measurement of this code running with a 4MHz oscillator (which gives a 1MHz instruction counter since PIC internals for the 12F operate at FOSC/4) gave a time between flashes of near 350ms so this grows more interesting.

Each PIC instruction takes either a single instruction cycle (i.e. 4 clocks) or if is a branch or other instruction which would alter the program counter to a location other than the next one (e.g. a carry test where carry is set or an explict GOTO or RETURN etc) it will take two instruction cycles (i.e. 8 clocks).

If we refer to the argument passed to the subroutine as D (for delay) then we see there is an outer loop L6 that executed 201 times (0xC8 is 200 in decimal) and within the outer loop there is an inner loop that is executed D+1 times.

It is possible to run a few tests with the subroutine and look at some timings… D is the delay argument and the Instruction and Clks count show the work expended in the routine.

D Instructions Clks
0 2001 10396
1 3001 15996
16 18001 99996
17 19001 105596
32 34001 189596
33 35001 195196
64 66001 368796

We can see that for each increase in D the subroutine executes an additional 1000 instructions. I think this was the designers expectation and that with a 4MHz oscillator this would yield a subroutine that delayed around (D+1) x 1ms however this does not take into account the cost of branching, especially in the inner loop.

By careful inspection of the loops and taking into account the branching cost we get…

Clks = 200*28*(D+1)+24*199+20

i.e. When D is 244 we get a delay of 1376796 clocks which takes 344ms.

So this is a rubbish 1ms delay routine! Fudging the value at 244 didn’t help either. (In fact a better bodge would have been a value of 176 which yields just under 249ms; pretty close)

Coding good delay routines is probably a bit of a niche area but there is an online site http://www.piclist.com/techref/piclist/codegen/delay.htm which might help if a library routine isn’t available.

Even better is to use a hardware timer and either poll for timer completion or ideally use a timer completion interrupt to indicate that the next bit of work can continue.  In the presence of interrupts this advice is even more valid since an work done in an interrupt will cause suspension of the delay loop and extend its ultimate completion time.

I never knew I could have such fun with £5 worth of EBAY cheap technology from China… must get around to fixing my Technics SL-SL33QD record player now my replacement H-bridge chip has arrived… hardware next time perhaps…

Have fun!

About nivagswerdna

Professional Geek
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a comment