During my recent reverse engineering of a PIC12F675 learning board on EBAY I came across the following piece of code…
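    L7: MOVWF  0x20        ; save the delay argument D (passed in W)
        CLRF   0x22        ; outer counter = 0
    L6: CLRF   0x21        ; inner counter = 0
    L5: MOVF   0x20,W      ; W = D
        SUBWF  0x21,W      ; W = inner - D; C is set when inner >= D
        BTFSC  STATUS,C    ; skip the next instruction while inner < D
        GOTO   L4          ; inner >= D: inner loop finished
        INCF   0x21,F      ; inner++
        GOTO   L5
    L4: MOVLW  0xC8        ; W = 200
        INCF   0x22,F      ; outer++
        SUBWF  0x22,W      ; W = outer - 200; C is set when outer >= 200
        BTFSC  STATUS,C    ; skip the RETURN while outer < 200
        RETURN
        GOTO   L6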
The routine is a classic example of a delay loop realised by having multiple nested loops, burning CPU time by executing the inner instructions many many times. In the example program where I found this subroutine the intention was to provide a delay which would be used to slow down the flashing of an LED in a simple demo.
    loop: LED ON
          MOVLW  0xF4
          CALL   L7
          LED OFF
          MOVLW  0xF4
          CALL   L7
          GOTO   loop
So the usage is: between LED ON and LED OFF, load the literal value 0xF4 (244 decimal) and call the delay routine to slow down the flashing. 244 is a strange value to choose, so this piqued my interest and warranted some further investigation.
Measurement of this code running with a 4MHz oscillator (which gives a 1MHz instruction rate, since the PIC12F core executes at FOSC/4) gave a time between flashes of nearly 350ms, so this grows more interesting.
Each PIC instruction takes either a single instruction cycle (i.e. 4 clocks) or, if it is a branch or other instruction that changes the program counter to a location other than the next one (e.g. a bit test such as BTFSC that skips the following instruction, or an explicit GOTO or RETURN), two instruction cycles (i.e. 8 clocks).
If we refer to the argument passed to the subroutine as D (for delay) then we see there is an outer loop at L6 that is executed 200 times (0xC8 is 200 in decimal, and the counter is incremented before it is compared, so the routine returns on the 200th pass) and within the outer loop there is an inner loop at L5 that is executed D+1 times.
It is possible to run a few tests with the subroutine and look at some timings… D is the delay argument and the Instruction and Clks count show the work expended in the routine.
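For anyone without the hardware to hand, the same kind of test can be approximated in software. What follows is a minimal Python sketch, not the original test setup: it steps through the routine instruction by instruction, assuming the usual mid-range PIC timing (one cycle per instruction, two for GOTO and RETURN and for a BTFSC that skips). The name simulate_delay and its parameters are made up for illustration. It counts from entry at L7 through to the RETURN, so the MOVLW and CALL in the caller are not included.

```python
def simulate_delay(d, outer_limit=0xC8):
    """Step through the L7 routine; return (instructions, instruction_cycles).

    Assumed timing: 1 cycle per instruction, 2 cycles for GOTO, RETURN
    and for a BTFSC that skips the following instruction.
    """
    instructions = 0
    cycles = 0

    def tick(cost=1):
        nonlocal instructions, cycles
        instructions += 1
        cycles += cost

    tick()                      # L7: MOVWF 0x20
    tick()                      #     CLRF  0x22
    outer = 0
    while True:
        tick()                  # L6: CLRF  0x21
        inner = 0
        while True:
            tick()              # L5: MOVF  0x20,W
            tick()              #     SUBWF 0x21,W
            if inner >= d:      # carry set: BTFSC does not skip
                tick()          #     BTFSC STATUS,C
                tick(2)         #     GOTO  L4
                break
            tick(2)             #     BTFSC STATUS,C (skips GOTO L4)
            tick()              #     INCF  0x21,F
            inner += 1
            tick(2)             #     GOTO  L5
        tick()                  # L4: MOVLW 0xC8
        tick()                  #     INCF  0x22,F
        outer += 1
        tick()                  #     SUBWF 0x22,W
        if outer >= outer_limit:
            tick()              #     BTFSC STATUS,C (no skip)
            tick(2)             #     RETURN
            return instructions, cycles
        tick(2)                 #     BTFSC STATUS,C (skips RETURN)
        tick(2)                 #     GOTO  L6


if __name__ == "__main__":
    print("D    Instructions  Clks")
    for d in (0, 1, 2, 10, 244):
        instr, cyc = simulate_delay(d)
        print(f"{d:<4} {instr:<13} {cyc * 4}")
```

Because the sketch also counts the two setup instructions at L7 (MOVWF and CLRF 0x22), its clock totals come out 8 higher than a count that starts at L6.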
We can see that for each increase in D the subroutine executes an additional 1000 instructions. I think this was the designer's expectation: with a 4MHz oscillator this would yield a subroutine that delays for around (D+1) x 1ms. However, this does not take into account the cost of branching, especially in the inner loop.
By careful inspection of the loops and taking into account the branching cost we get…
Clks = 200*28*(D+1)+24*199+20
i.e. when D is 244 we get a delay of 1376796 clocks, which at 4MHz takes about 344ms.
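For the curious, here is my reading of where the 28, 24 and 20 in that formula come from, written out as a worked derivation. It assumes the cycle costs described earlier and counts from L6 onwards, so the two setup instructions at L7 (another 8 clocks) and the CALL itself are left out. The symbol names are my own.

```latex
\begin{align*}
\text{inner pass, looping again:}\quad
  & \underbrace{4}_{\text{MOVF}} + \underbrace{4}_{\text{SUBWF}}
  + \underbrace{8}_{\text{BTFSC, skip}} + \underbrace{4}_{\text{INCF}}
  + \underbrace{8}_{\text{GOTO L5}} = 28 \\
\text{inner pass, exiting:}\quad
  & \underbrace{4}_{\text{MOVF}} + \underbrace{4}_{\text{SUBWF}}
  + \underbrace{4}_{\text{BTFSC}} + \underbrace{8}_{\text{GOTO L4}} = 20 \\
T_{\text{inner}} &= 28D + 20 \\
T_{\text{outer, looping again}} &= \underbrace{4}_{\text{CLRF}} + T_{\text{inner}}
  + \underbrace{12}_{\text{MOVLW, INCF, SUBWF}}
  + \underbrace{8}_{\text{BTFSC, skip}} + \underbrace{8}_{\text{GOTO L6}}
  = 28(D+1) + 24 \\
T_{\text{outer, final}} &= \underbrace{4}_{\text{CLRF}} + T_{\text{inner}}
  + \underbrace{12}_{\text{MOVLW, INCF, SUBWF}}
  + \underbrace{4}_{\text{BTFSC}} + \underbrace{8}_{\text{RETURN}}
  = 28(D+1) + 20 \\
\text{Clks} &= 199\,\bigl(28(D+1) + 24\bigr) + 28(D+1) + 20
  = 200 \cdot 28\,(D+1) + 24 \cdot 199 + 20 \\
\text{Clks}\big|_{D=244} &= 1372000 + 4776 + 20 = 1376796
\end{align*}
```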
So this is a rubbish 1ms delay routine! Fudging the value at 244 didn’t help either. (In fact a better bodge would have been a value of 176 which yields just under 249ms; pretty close)
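The arithmetic is easy to replay in a few lines of Python. This is only a sketch built on the clock formula above and a 4MHz oscillator; the function names are invented for illustration, and the 250ms target used in the example is my guess at what the designer was aiming for, since the demo code does not say.

```python
FOSC_HZ = 4_000_000  # oscillator frequency used in the measurement


def delay_clocks(d):
    """Clocks spent in the routine for argument d, per the formula in the post."""
    return 200 * 28 * (d + 1) + 24 * 199 + 20


def delay_ms(d, fosc_hz=FOSC_HZ):
    """Delay in milliseconds for argument d."""
    return delay_clocks(d) * 1000 / fosc_hz


def largest_d_within(target_ms, fosc_hz=FOSC_HZ):
    """Largest 8-bit argument whose delay does not exceed target_ms, or None."""
    candidates = [d for d in range(256) if delay_ms(d, fosc_hz) <= target_ms]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    for d in (244, 176):
        print(f"D={d}: {delay_clocks(d)} clocks, {delay_ms(d):.3f} ms")
    # 250 ms is an assumed target, not something stated in the demo code
    print("Largest D within 250 ms:", largest_d_within(250.0))
```

Each step of D adds 5600 clocks, i.e. 1.4ms at 4MHz, so that is the best resolution this routine can offer.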
Coding good delay routines is probably a bit of a niche area but there is an online site http://www.piclist.com/techref/piclist/codegen/delay.htm which might help if a library routine isn’t available.
Even better is to use a hardware timer and either poll for timer completion or, ideally, use a timer completion interrupt to indicate that the next bit of work can continue. In the presence of interrupts this advice is even more valid, since any work done in an interrupt will suspend the delay loop and extend its ultimate completion time.
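To give a feel for the timer approach, here is a rough Python sketch of the sum you would do before writing any PIC code. It assumes a 16-bit timer clocked from FOSC/4 with selectable prescalers of 1, 2, 4 and 8, which is how I understand Timer1 on the PIC12F675 to work (check the data sheet before relying on it). The function name timer1_settings is made up, and the sketch ignores the few cycles lost while reloading the timer.

```python
def timer1_settings(delay_ms, fosc_hz=4_000_000):
    """Return (prescaler, preload) so the timer overflows after delay_ms.

    Assumes a 16-bit up-counter clocked at FOSC/4 with prescalers 1, 2, 4, 8.
    Returns None if the delay does not fit in a single overflow.
    """
    instr_hz = fosc_hz / 4                     # instruction rate, 1 MHz at 4 MHz FOSC
    cycles = delay_ms * instr_hz / 1000        # instruction cycles in the delay
    for prescaler in (1, 2, 4, 8):             # prefer the finest resolution that fits
        counts = round(cycles / prescaler)
        if 1 <= counts <= 65536:
            return prescaler, 65536 - counts   # timer counts up from preload to overflow
    return None


if __name__ == "__main__":
    for ms in (1, 250, 1000):
        print(ms, "ms ->", timer1_settings(ms))
```

A delay too long for one overflow (the sketch returns None) would need counting several overflows in software, which is still far more predictable than a hand-tuned loop.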
I never knew I could have such fun with £5 worth of EBAY cheap technology from China… must get around to fixing my Technics SL-SL33QD record player now my replacement H-bridge chip has arrived… hardware next time perhaps…