Title: Code Optimization
1Code Optimization
- Witawas Srisa-an
- CSCE 496 Embedded Systems Design and
Implementation
2Agenda
- Talk about possible exam ideas
- Code optimization techniques
- Not everyone has reconfigurable processors!
- Credits
- Most of the slides in this lecture are based on
slides created by Profs. Raj Rajkumar and
Priya Narasimhan of the ECE Department at
Carnegie Mellon
3Exam Ideas
4Code Optimization
- Programmers can improve program performance by
writing better code
  - Improve data structures and/or algorithms
    - Merge vs. bubble sorts
  - Reorganize code or provide flags to help
compilers
  - Last option is to write in assembly
5Better Algorithms
- Merge vs. bubble sorts
- Which one runs faster?
- Which one causes more cache misses?
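To make the two questions concrete, here is a minimal C sketch of both sorts. The function names and the caller-supplied scratch buffer are assumptions for illustration, not part of the original slides. Bubble sort needs on the order of n^2 comparisons and merge sort on the order of n log n, so merge sort should usually win on large inputs. The cache answer is less clear-cut: bubble sort works in place but sweeps the array many times, while merge sort makes fewer passes and touches a second buffer.

```c
#include <stddef.h>
#include <string.h>

/* Bubble sort: O(n^2) comparisons, in place, compares adjacent elements. */
void bubble_sort(int *a, size_t n)
{
    for (size_t pass = 0; pass + 1 < n; pass++) {
        for (size_t k = 0; k + 1 < n - pass; k++) {
            if (a[k] > a[k + 1]) {
                int tmp = a[k];
                a[k] = a[k + 1];
                a[k + 1] = tmp;
            }
        }
    }
}

/* Merge sort: O(n log n) comparisons, needs a scratch buffer of n ints. */
void merge_sort(int *a, int *scratch, size_t n)
{
    if (n < 2)
        return;

    size_t mid = n / 2;
    merge_sort(a, scratch, mid);
    merge_sort(a + mid, scratch, n - mid);

    /* Merge the two sorted halves into scratch, then copy back. */
    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < n)
        scratch[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid)
        scratch[k++] = a[i++];
    while (j < n)
        scratch[k++] = a[j++];
    memcpy(a, scratch, n * sizeof a[0]);
}
```

Timing both on the target board with realistic input sizes is the only reliable way to answer either question for a given cache.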
6Common Optimization Techniques
- Sub-expression elimination
- Dead code elimination
- Induction variables
- Strength reduction
- Loop unrolling
- In-lining
7Common Techniques (cont.)
- Sub-expression elimination
myfunction:
    index1 = 8 * i
    x = a[index1]
    temp = 8 * i
    index2 = 4 * j
    t = a[index2]
    a[temp] = t
    temp2 = 4 * j
    a[temp2] = x
    goto myfunction
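The fragment on this slide appears to compute 8 * i and 4 * j twice each. The C sketch below assumes the operators lost from the slide text are = and *, and that a is an int array. The names swap_before and swap_after are hypothetical. The second function is roughly what a compiler produces after eliminating the repeated sub-expressions.

```c
/* Before: 8 * i and 4 * j are each computed twice. */
void swap_before(int *a, int i, int j)
{
    int index1 = 8 * i;
    int x = a[index1];
    int temp = 8 * i;      /* same value as index1 */
    int index2 = 4 * j;
    int t = a[index2];
    a[temp] = t;
    int temp2 = 4 * j;     /* same value as index2 */
    a[temp2] = x;
}

/* After: each product is computed once and reused. */
void swap_after(int *a, int i, int j)
{
    int index1 = 8 * i;
    int index2 = 4 * j;
    int x = a[index1];
    a[index1] = a[index2];
    a[index2] = x;
}
```

Both versions swap the same two elements; the second simply does two fewer multiplications per call.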
8Common Techniques (cont.)
int i = 0;
i = i + 1;
if (i == 0)
    j = j + 8;
else
    j = j + 10;
Use ASSERT and #ifdef to advise the compiler
about dead code
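A minimal C sketch of the idea, under two assumptions: the operators stripped from the slide fragment are read as =, + and ==, and the standard assert macro stands in for the slide's ASSERT. The names update_before, update_after, scale and DEBUG_TRACE are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Before: i is always 1 at the test, so the first branch is dead. */
int update_before(int j)
{
    int i = 0;
    i = i + 1;
    if (i == 0)
        j = j + 8;      /* never executed */
    else
        j = j + 10;
    return j;
}

/* After: what a compiler may reduce the function to. */
int update_after(int j)
{
    return j + 10;
}

unsigned scale(unsigned value, unsigned shift)
{
    /* With assertions enabled, a failed assert aborts, so the code
       below may be compiled assuming shift < 8. With NDEBUG defined
       the assert expands to nothing. */
    assert(shift < 8);
#ifdef DEBUG_TRACE      /* hypothetical macro: tracing is compiled out unless defined */
    fprintf(stderr, "scale(%u, %u)\n", value, shift);
#endif
    return value << shift;
}
```

Code inside the #ifdef never reaches the compiler proper in a build that leaves DEBUG_TRACE undefined, so it costs nothing in code size or run time.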
9Common Techniques (cont.)
- Induction variables and strength reduction
i = 0
j = 0
label:
    j = j + 1
    i = 4 * j
    a[i] = 2 * b[i]
    if (i < 1000) goto label
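The loop on this slide can be sketched in C as follows. This assumes the body reads a[i] = 2 * b[i], which is a guess because the slide text lost its operators and brackets. The function names and the limit parameter (1000 on the slide) are hypothetical. Because i is always 4 * j and j grows by 1, i is an induction variable that can be updated by adding 4, after which j is no longer needed. The multiply by 2 is replaced by a cheaper add.

```c
/* Before: one multiply for the index and one for the value.
   Both arrays must hold at least limit + 4 elements. */
void fill_before(int *a, const int *b, int limit)
{
    int i = 0, j = 0;
    do {
        j = j + 1;
        i = 4 * j;              /* i is a linear function of j */
        a[i] = 2 * b[i];
    } while (i < limit);
}

/* After: induction variable i stepped directly, multiplies removed. */
void fill_after(int *a, const int *b, int limit)
{
    int i = 0;
    do {
        i = i + 4;              /* replaces j = j + 1; i = 4 * j */
        a[i] = b[i] + b[i];     /* 2 * x rewritten as x + x */
    } while (i < limit);
}
```

On a processor without a fast multiplier the second form saves two multiplies per iteration; on others the gain may be small.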
10Optimization Techniques (cont.)
main:
    addi $s0, $t1, 0
    addi $s1, $t2, 0
    jal  mult
    add  $t3, $v0, $0

mult:
    addi $sp, $sp, -12
    sw   $s1, 4($sp)
    sw   $s0, 8($sp)
    sw   $ra, 12($sp)
    sll  $v0, $s0, $s1
    lw   $s1, 4($sp)
    lw   $s0, 8($sp)
    lw   $ra, 12($sp)
    addi $sp, $sp, 12
    jr   $ra
What's wrong with this picture?
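One way to read this slide, given that in-lining is on the list of techniques: mult does a single shift, yet each call pays for a jump, a return, and six loads and stores. The C sketch below shows the same function next to an inlined version. The names are hypothetical, and static inline is only a hint that the compiler is free to ignore.

```c
/* Out-of-line version: every call pays for the jump, the return,
   and any register save/restore. shift must be less than the width
   of unsigned. */
unsigned mult_pow2(unsigned x, unsigned shift)
{
    return x << shift;
}

/* static inline invites the compiler to substitute the body at the
   call site, so no call sequence is needed. */
static inline unsigned mult_pow2_inline(unsigned x, unsigned shift)
{
    return x << shift;
}

unsigned caller(unsigned t1, unsigned t2)
{
    return mult_pow2_inline(t1, t2);   /* may compile to a single shift */
}
```

In-lining trades code size for speed, so it pays off mainly for small functions that are called often.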
11Optimization Techniques (cont.)
- Loop unrolling
- Eliminate branches (why?)
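A possible answer to the "why?": every loop iteration ends with a compare and a branch, and a taken branch can stall the pipeline. Unrolling does more work per branch. The C sketch below unrolls a summation loop by four; the function names and the unroll factor are assumptions for illustration.

```c
#include <stddef.h>

/* Rolled: one compare-and-branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long total = 0;
    for (size_t k = 0; k < n; k++)
        total += a[k];
    return total;
}

/* Unrolled by four: one compare-and-branch per four elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long total = 0;
    size_t k = 0;
    for (; k + 4 <= n; k += 4) {
        total += a[k];
        total += a[k + 1];
        total += a[k + 2];
        total += a[k + 3];
    }
    for (; k < n; k++)          /* leftover elements when n is not a multiple of 4 */
        total += a[k];
    return total;
}
```

The cost is larger code, which matters on embedded targets with small instruction memories or caches.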
12Architecture Dependent Optimizations
X = Y = 64
Convert 8-bit RGB to 8-bit YCC:
Y  =  0.299R + 0.587G + 0.114B
Cb = -0.169R - 0.331G + 0.500B + 128
Cr =  0.500R - 0.419G - 0.082B + 128
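On a processor without floating-point hardware, one common way to evaluate these equations is fixed-point arithmetic. The C sketch below is an assumption about the kind of optimization this slide has in mind, not the original lecture code. It scales each coefficient by 2^16, takes the sign of the 0.082B term as negative (as in the usual JFIF conversion), and uses hypothetical names throughout.

```c
#include <stdint.h>

#define FIX_SHIFT  16
#define FIX_HALF   (INT32_C(1) << (FIX_SHIFT - 1))   /* for rounding */
#define FIX_OFFSET (INT32_C(128) << FIX_SHIFT)       /* the +128 term, pre-scaled */

static uint8_t clamp_u8(int32_t v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/* Coefficients are the slide's values multiplied by 65536 and rounded. */
void rgb_to_ycc(uint8_t r, uint8_t g, uint8_t b,
                uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    int32_t R = r, G = g, B = b;

    int32_t yy  =  19595 * R + 38470 * G +  7471 * B;
    int32_t cbb = -11076 * R - 21692 * G + 32768 * B + FIX_OFFSET;
    int32_t crr =  32768 * R - 27460 * G -  5374 * B + FIX_OFFSET;

    /* Adding the offset before shifting keeps every sum non-negative,
       so the right shift is well defined. */
    *y  = clamp_u8((yy  + FIX_HALF) >> FIX_SHIFT);
    *cb = clamp_u8((cbb + FIX_HALF) >> FIX_SHIFT);
    *cr = clamp_u8((crr + FIX_HALF) >> FIX_SHIFT);
}
```

Each output then costs three integer multiplies and one shift. Whether that beats a table lookup or a sequence of shifts and adds depends on the target's multiplier and barrel shifter.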
13Architecture Dependent Optimizations (cont.)
[Figure: processor datapath block diagram. Labeled blocks: Address Register, Address Incrementer, Incrementer Bus, ALU Bus, Register Bank, Write Buffer (holds address and data), A Bus, Barrel Shifter, B Bus, 32-bit ALU, Memory Address Register, Write Data Register, Read Data/Instruction Register, Dout[31:0], Data[31:0], RAM]
14Summary
- No magic bullet
- Optimizations sometimes don't work
- Programmers need to help
- Various techniques may require prior
knowledge of the hardware