Embedded Systems Programming - PowerPoint PPT Presentation

Provided by: craig60
Title: Embedded Systems Programming


1
Embedded Systems Programming
  • Writing Optimised C code for ARM

2
Why write optimised C code?
  • For embedded systems, size and/or speed are of key
    importance
  • The compiler optimisation phase can only do so
    much
  • In order to write optimal C code you need to know
    details of the underlying hardware and the
    compiler

3
What compilers can't do
    void memclr(char *data, int N)
    {
        for (; N > 0; N--)
        {
            *data = 0;
            data++;
        }
    }
  • Is N == 0 on the first loop?
  • 0 - 1 is dangerous!
  • Is the data array 4-byte aligned?
  • If so, it can be cleared using int stores
  • Is N a multiple of 4?
  • If so, it could be cleared in blocks of 4 words at a time
  • Compilers have to be conservative!
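
When the programmer can guarantee what the compiler cannot prove, the code can say so directly. The following is a minimal sketch, not part of the original slides: the name memclr_words and both preconditions (4-byte alignment, a word count of at least 1) are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical variant of memclr. The caller guarantees that data is
   4-byte aligned and that n_words (the length in 32-bit words) is at
   least 1. With those guarantees the loop can store whole words and
   skip the zero-length test on entry. */
void memclr_words(uint32_t *data, unsigned int n_words)
{
    do {
        *data++ = 0;            /* one 32-bit store clears four bytes */
    } while (--n_words != 0);   /* count down to zero, test at the bottom */
}
```

Calling memclr_words(buf, 4) on an array of uint32_t clears its first four words and leaves the rest untouched. Passing n_words == 0 breaks the stated precondition and must be avoided by the caller.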

4
An example program
  • The program might seem fine, even resource-friendly
  • Using a char saves space
  • for loops make good assembler
  • Let's look at the assembler code
    /* program showing inefficient
       variable and loop
       usage craig Nov 04
    */
    int checksum_1(int *data)
    {
        char i;
        int sum = 0;

        for (i = 0; i < 64; i++)
            sum += data[i];
        return sum;
    }

5
        .text
        .align  2
        .global checksum_1
        .type   checksum_1,function
checksum_1:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, current_function_anonymous_args = 0
        mov     ip, sp
        stmfd   sp!, {fp, ip, lr, pc}
        sub     fp, ip, #4
        mov     r1, r0
        mov     r0, #0                  @ sum = 0
        mov     r2, r0                  @ i = 0
.L6:
        ldr     r3, [r1, r2, asl #2]    @ data[i]
        add     r0, r0, r3              @ sum += data[i]
        add     r3, r2, #1              @ i++
        and     r2, r3, #255
        cmp     r2, #63                 @ i < 64
        bls     .L6
        ldmea   fp, {fp, sp, pc}
.Lfe1:
        .size   checksum_1,.Lfe1-checksum_1

6
What is wrong?
  • The use of char means that the compiler has to
    mask the counter down to 8 bits on every iteration using
        and r2, r3, #255
  • The loop variable requires a register and
    initialisation
  • If the loop is called often then the test and
    branch are quite an overhead

7
Variable sizes
  • In general the compiler will use 32-bit registers
    for local variables but will have to cast them
    when they are used as 8- or 16-bit values
  • If you can, use unsigned ints; if you can't,
    cast explicitly
  • Using signed shorts can be quite a problem for
    compilers
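
As an illustration of the advice above, here is a sketch of the earlier checksum with the loop counter widened to unsigned int. The name checksum_1u is hypothetical. The expectation that the 8-bit mask disappears follows from the slides' reasoning and has not been checked against any particular compiler version.

```c
/* Sketch: checksum_1 with an unsigned int counter. The counter now
   fills a 32-bit register, so there is no 8-bit value for the compiler
   to maintain. Sums 64 words, as in the original example. */
int checksum_1u(const int *data)
{
    unsigned int i;
    int sum = 0;

    for (i = 0; i < 64; i++)
        sum += data[i];
    return sum;
}
```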

8
Watch your shorts!
    short add(short a, short b)
    {
        return a + (b >> 1);
    }
  • The above C code turns into rather nasty
    assembler
  • The GNU C compiler is very cautious when
    confronted with short variables

Becomes:
        mov     ip, sp
        stmfd   sp!, {fp, ip, lr, pc}
        sub     fp, ip, #4
        mov     r1, r1, asl #16         @ move b into the top 16 bits
        mov     r0, r0, asl #16
        mov     r0, r0, asr #16         @ sign-extend a
        add     r0, r0, r1, asr #17     @ a + (b >> 1)
        mov     r0, r0, asl #16
        mov     r0, r0, asr #16         @ sign-extend the result
        ldmea   fp, {fp, sp, pc}

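
One way to avoid those shift pairs is to do the arithmetic in int and narrow only where a 16-bit value is really required. A minimal sketch, assuming the callers can accept an int result; the name add_int is hypothetical and not from the slides.

```c
/* Sketch: the same arithmetic on int operands. Arguments and result
   are full 32-bit values, so the compiler has no 16-bit range to
   re-establish after the addition. */
int add_int(int a, int b)
{
    return a + (b >> 1);
}
```

A caller that must store the result in a short can cast at that single point, for example `short s = (short)add_int(a, b);`.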
9
Loops 1
  • Apart from being a char, the loop counter may
    itself be redundant
  • Terminate loops by counting down to 0: this reduces
    register usage, and the decrement sets the flags, so
    no separate compare is needed
  • Use do..while instead of for loops

10
Efficient loop C
    /* Program to show efficient use of
       variables and loops */
    int checksum_2(int *data)
    {
        int sum = 0, i = 64;

        do {
            sum += *(data++);
        } while (--i != 0);
        return sum;
    }

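
The do..while form relies on the count being known to be non-zero (here the constant 64), because the body always runs at least once. A minimal sketch of the same loop for a caller-supplied count follows; the name checksum_n and the guard are assumptions added for illustration, not part of the slides.

```c
/* Sketch: checksum over N words, where N may be zero.
   checksum_n is a hypothetical name. */
int checksum_n(const int *data, unsigned int N)
{
    int sum = 0;

    if (N != 0) {               /* a do..while body always runs once */
        do {
            sum += *(data++);
        } while (--N != 0);     /* count down to zero */
    }
    return sum;
}
```

If the caller can guarantee N is never zero, the guard can be dropped and the loop matches the fixed-count version.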
11
Efficient loop assembler
checksum_2:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, current_function_anonymous_args = 0
        mov     ip, sp
        stmfd   sp!, {fp, ip, lr, pc}
        sub     fp, ip, #4
        mov     r1, r0
        mov     r0, #0                  @ sum = 0
        mov     r2, #64                 @ i = 64
.L6:
        ldr     r3, [r1], #4            @ *(data++)
        add     r0, r0, r3              @ sum += *(data++)
        subs    r2, r2, #1              @ --i
        bne     .L6
        ldmea   fp, {fp, sp, pc}

12
Loop unrolling
  • If a loop is going to be repeated often then the
    test and branch can be quite an overhead
  • If the iteration count is a multiple of 4 and the
    loop runs often then the loop can be unrolled
  • This increases code size but runs faster
  • Counts that are not multiples of 4 can be handled but
    are less efficient

13
An unrolled loop
    /* Program to show efficient use of variables
       and loops: loop unrolling */
    int checksum_2(int *data)
    {
        int sum = 0, i = 64;

        do {
            sum += *(data++);
            sum += *(data++);
            sum += *(data++);
            sum += *(data++);
            i -= 4;
        } while (i != 0);
        return sum;
    }

14
checksum_2:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, current_function_anonymous_args = 0
        mov     ip, sp
        stmfd   sp!, {fp, ip, lr, pc}
        sub     fp, ip, #4
        mov     r2, r0
        mov     r0, #0
        mov     r1, #64
.L6:
        ldr     r3, [r2], #4
        add     r0, r0, r3
        ldr     r3, [r2], #4
        add     r0, r0, r3
        ldr     r3, [r2], #4
        add     r0, r0, r3
        ldr     r3, [r2], #4
        add     r0, r0, r3
        subs    r1, r1, #4
        bne     .L6
        ldmea   fp, {fp, sp, pc}

15
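Loop unrolling: N not a multiple of 4
    /* Program to show use of loop unrolling */
    int checksum_2(int *data, unsigned int N)
    {
        int sum = 0;
        unsigned int i;

        /* unrolled part: four words per iteration */
        for (i = N / 4; i != 0; i--) {
            sum += *(data++);
            sum += *(data++);
            sum += *(data++);
            sum += *(data++);
        }
        /* leftover part: the remaining 0 to 3 words */
        for (i = N & 3; i != 0; i--)
            sum += *(data++);
        return sum;
    }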
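
As a worked check of how the count is split between the two loops: with N = 10, the unrolled loop runs 10 / 4 = 2 times and covers 8 words, and the second loop handles the remaining 10 & 3 = 2 words. The helper names below are hypothetical and exist only to illustrate that arithmetic.

```c
/* Hypothetical helpers showing how N is divided between the two loops.
   For every unsigned N, unrolled_words(N) + leftover_words(N) == N. */
unsigned int unrolled_words(unsigned int N)
{
    return (N / 4) * 4;     /* words handled four at a time */
}

unsigned int leftover_words(unsigned int N)
{
    return N & 3;           /* same as N % 4 for unsigned N */
}
```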