57 Code optimization - PowerPoint PPT Presentation

Slides: 23
Provided by: ict65
Transcript and Presenter's Notes

1
5-7 Code optimization
Two functions f and g
    #define MAX 10
    int a[MAX], b[MAX], c[MAX], x[MAX], y[MAX];
    int i, j, r, s;
    . . .
    int f(int a, int b) { int z; z = 2 * a + b; return z; }
    int g(int a, int b, int c) { int z; z = a * c + c * b; return z; }
What code optimization can the compiler do? -O, -O0, -O1, -O2, -O3, -Os ?
With -O0 you have to do all optimizations yourself.
2
Two for loops
    . . .
    for (i = 0; i < MAX - 1; i++)
        x[i] = f(a[i], b[i]);
    s = 2 * r;
    for (j = 0; j < MAX - 1; j++)
        y[j] = s + g(a[j], b[j], c[j]);
What can be done?
We want shorter execution time without increasing the code size!
3
Loop integration
The two loops have the same range (0, MAX - 1) and no data dependency (x is written only in loop 1, y only in loop 2). The loops can be integrated, which saves loop overhead (only the counter i is needed)!
    s = 2 * r;
    for (i = 0; i < MAX - 1; i++) {
        x[i] = f(a[i], b[i]);
        y[i] = s + g(a[i], b[i], c[i]);
    }
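The integration step can be checked with a small sketch that runs both forms on the same input and compares the results. It assumes f computes 2 * a + b and g computes a * c + c * b (the operators are not preserved in the transcript); run_separate, run_fused and fused_matches_separate are hypothetical names used only for illustration.

```c
#include <assert.h>

#define MAX 10

/* f and g with assumed operators (not preserved in the transcript). */
static int f(int a, int b) { return 2 * a + b; }
static int g(int a, int b, int c) { return a * c + c * b; }

/* Original form: two separate loops over the same range. */
void run_separate(const int a[MAX], const int b[MAX], const int c[MAX],
                  int r, int x[MAX], int y[MAX])
{
    int i, j, s;
    for (i = 0; i < MAX - 1; i++)
        x[i] = f(a[i], b[i]);
    s = 2 * r;
    for (j = 0; j < MAX - 1; j++)
        y[j] = s + g(a[j], b[j], c[j]);
}

/* Integrated form: one loop and one counter, s computed before the loop. */
void run_fused(const int a[MAX], const int b[MAX], const int c[MAX],
               int r, int x[MAX], int y[MAX])
{
    int i, s;
    s = 2 * r;
    for (i = 0; i < MAX - 1; i++) {
        x[i] = f(a[i], b[i]);
        y[i] = s + g(a[i], b[i], c[i]);
    }
}

/* Returns 1 after asserting that both forms fill x and y identically. */
int fused_matches_separate(void)
{
    int a[MAX], b[MAX], c[MAX];
    int x1[MAX] = {0}, y1[MAX] = {0}, x2[MAX] = {0}, y2[MAX] = {0};
    int i;
    for (i = 0; i < MAX; i++) {
        a[i] = i;
        b[i] = 2 * i + 1;
        c[i] = 3 - i;
    }
    run_separate(a, b, c, 5, x1, y1);
    run_fused(a, b, c, 5, x2, y2);
    for (i = 0; i < MAX; i++) {
        assert(x1[i] == x2[i]);
        assert(y1[i] == y2[i]);
    }
    return 1;
}
```

The fused loop is valid here only because neither statement reads what the other writes; with a dependency between x and y the order of the statements would have to be examined first.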
4
Precalculation at compile time
The defined constant MAX is used as MAX - 1 in the loop. MAX - 1 could be precalculated as 10 - 1 = 9 at compile time!
    s = 2 * r;
    for (i = 0; i < 9; i++) {
        x[i] = f(a[i], b[i]);
        y[i] = s + g(a[i], b[i], c[i]);
    }
5
Algebraic simplification
Rewriting function g can save one multiplication operation:
    int g(int a, int b, int c) { int z; z = c * (a + b); return z; }
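A minimal sketch of the rewrite, assuming the original g is a * c + c * b (the transcript drops the operators). g_original, g_simplified and g_forms_agree are illustrative names, and the comparison is limited to small values so that no int overflow occurs.

```c
#include <assert.h>

/* Original form: two multiplications and one addition (operators assumed). */
int g_original(int a, int b, int c) { return a * c + c * b; }

/* Simplified form: one multiplication and one addition. */
int g_simplified(int a, int b, int c) { return c * (a + b); }

/* Returns 1 if both forms agree for all a, b, c in [-limit, limit]. */
int g_forms_agree(int limit)
{
    int a, b, c;
    assert(limit >= 0 && limit <= 100);   /* keep products far from overflow */
    for (a = -limit; a <= limit; a++)
        for (b = -limit; b <= limit; b++)
            for (c = -limit; c <= limit; c++)
                if (g_original(a, b, c) != g_simplified(a, b, c))
                    return 0;
    return 1;
}
```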
6
Inlining of functions
Both functions f and g are short and their code could be inserted directly in the loop.
    int a[10], b[10], c[10], x[10], y[10];
    int i, r, s;
    s = 2 * r;
    for (i = 0; i < 9; i++) {
        x[i] = 2 * a[i] + b[i];
        y[i] = s + ((a[i] + b[i]) * c[i]);
    }
Loop unrolling would give shorter execution time, but it would increase the code size, so it cannot be used in this case.
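To show the trade-off that rules out unrolling here, the following sketch unrolls the inlined loop by a factor of 3. The factor, the function names and the assumed operators in the loop body are illustrative choices, not taken from the slides.

```c
#include <assert.h>

#define N 9   /* MAX - 1 with MAX = 10, precalculated */

/* Inlined loop: 9 iterations, short body. */
void loop_inlined(const int a[N], const int b[N], const int c[N],
                  int r, int x[N], int y[N])
{
    int i, s = 2 * r;
    for (i = 0; i < N; i++) {
        x[i] = 2 * a[i] + b[i];
        y[i] = s + ((a[i] + b[i]) * c[i]);
    }
}

/* Unrolled by 3: only 3 iterations, but the body is three times as long. */
void loop_unrolled3(const int a[N], const int b[N], const int c[N],
                    int r, int x[N], int y[N])
{
    int i, s = 2 * r;
    assert(N % 3 == 0);   /* the unroll factor must divide the trip count */
    for (i = 0; i < N; i += 3) {
        x[i]     = 2 * a[i]     + b[i];
        y[i]     = s + ((a[i]     + b[i])     * c[i]);
        x[i + 1] = 2 * a[i + 1] + b[i + 1];
        y[i + 1] = s + ((a[i + 1] + b[i + 1]) * c[i + 1]);
        x[i + 2] = 2 * a[i + 2] + b[i + 2];
        y[i + 2] = s + ((a[i + 2] + b[i + 2]) * c[i + 2]);
    }
}

/* Returns 1 if both loops produce the same x and y for a sample input. */
int unrolled_matches_inlined(void)
{
    int a[N], b[N], c[N], x1[N], y1[N], x2[N], y2[N];
    int i;
    for (i = 0; i < N; i++) {
        a[i] = i + 1;
        b[i] = 7 - i;
        c[i] = 2 * i;
    }
    loop_inlined(a, b, c, 4, x1, y1);
    loop_unrolled3(a, b, c, 4, x2, y2);
    for (i = 0; i < N; i++)
        if (x1[i] != x2[i] || y1[i] != y2[i])
            return 0;
    return 1;
}
```

The unrolled version performs fewer loop tests and counter updates, at the cost of six statements in the body instead of two.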
7
5-2 Register lifetime
A processor has this instruction type: op R1, R2, R3, where all three registers must be different. Code to run:
    u = c + d   (1)
    v = a + b   (2)
    w = a + u   (3)
    x = v + e   (4)
How many registers are needed?
8
Register Life Time Graph
    u = c + d   (1)
    v = a + b   (2)
    w = a + u   (3)
    x = v + e   (4)
Four registers are needed!
9
Data Flow Graph
A Data Flow Graph can detect data dependencies.
    u = c + d   (1)
    v = a + b   (2)
    w = a + u   (3)
    x = v + e   (4)
  • (1) must be before (3), because (3) reads u
  • (2) must be before (4), because (4) reads v

(2) and (3) can change execution order!
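The dependency check behind the graph can be sketched in a few lines of C. Only the variable names of each statement are stored, since the operators do not affect the dependencies; struct stmt and depends are hypothetical names for this illustration.

```c
#include <assert.h>

/* One three-address statement: dst = src1 op src2. */
struct stmt { char dst, src1, src2; };

/* The four statements from the slide, reduced to variable names. */
static const struct stmt code[4] = {
    {'u', 'c', 'd'},   /* (1) */
    {'v', 'a', 'b'},   /* (2) */
    {'w', 'a', 'u'},   /* (3) */
    {'x', 'v', 'e'},   /* (4) */
};

/* Returns 1 if statement `later` must stay after statement `earlier`.
   Both arguments are 0-based indices into code[]. */
int depends(int earlier, int later)
{
    const struct stmt *e, *l;
    assert(0 <= earlier && earlier < later && later < 4);
    e = &code[earlier];
    l = &code[later];
    if (l->src1 == e->dst || l->src2 == e->dst)
        return 1;                      /* later reads what earlier writes */
    if (l->dst == e->src1 || l->dst == e->src2)
        return 1;                      /* later overwrites an input of earlier */
    if (l->dst == e->dst)
        return 1;                      /* both write the same variable */
    return 0;
}
```

With this check, depends(0, 2) and depends(1, 3) report a dependency, while depends(1, 2) does not, which is why statements (2) and (3) may be swapped.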
10
New Register Life Time Graph
New instruction order:
    u = c + d   (1)
    w = a + u   (2)
    v = a + b   (3)
    x = v + e   (4)
Now only 3 registers are needed. Saving 25 %.
11
5-8 CDFG
  • Control and Data Flow Graph (CDFG)
  • Multiplication takes 3 cycles, all other
    instructions take 1 cycle. Best/Worst execution
    time?

    y = 0;
    if (mode == 1)
        for (i = 0; i < 5; i++)
            y += a[i] * b[i];
mode = 0: TBest = 1 + 1 = 2
mode = 1: TWorst = 1 + 1 + 1 + (5 + 1) + 5*4 + 5 = 34
12
Multiply Accumulate operation
c) MAC-unit! R1 = R1 + R2 * R3 in one cycle!
    y += a[i] * b[i];   /* one cycle */
TWorst = 1 + 1 + 1 + (5 + 1) + 5*1 + 5 = 19
19/34 = 0.56. With a MAC unit the execution time is 56 % of that on the ordinary processor.
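The cycle counts above can be reproduced with a small helper. This is a sketch under the slide's cost model (every operation one cycle, except a 3-cycle multiplication); cdfg_cycles and its parameters are hypothetical names.

```c
#include <assert.h>

/* Cycles for:
       y = 0;
       if (mode == 1)
           for (i = 0; i < n; i++)
               y += a[i] * b[i];
   body_cycles is the cost of one loop body: 3 + 1 = 4 with a separate
   multiply and add, 1 with a MAC unit. */
int cdfg_cycles(int mode, int n, int body_cycles)
{
    int t;
    assert(n >= 0 && body_cycles > 0);
    t = 1 + 1;                  /* y = 0 and the test of mode */
    if (mode == 1) {
        t += 1;                 /* i = 0 */
        t += n + 1;             /* loop test i < n, evaluated n + 1 times */
        t += n * body_cycles;   /* loop body */
        t += n;                 /* i++ */
    }
    return t;
}
```

cdfg_cycles(0, 5, 4) gives the best case of 2 cycles, cdfg_cycles(1, 5, 4) the worst case of 34, and cdfg_cycles(1, 5, 1) the MAC case of 19.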
13
Processes on a CPU
14
Scheduling states of process
15
Priority Driven Scheduling
  • Each process has fixed priority
  • The ready process with the highest priority
    executes
  • A process executes until completion or preemption
    by a higher priority process
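The selection rule in the list above can be sketched as follows. The process record and the convention that a larger number means higher priority are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical process record for illustration. */
struct process {
    int priority;   /* larger value = higher priority (assumed convention) */
    int ready;      /* 1 if the process is ready to run, 0 otherwise */
};

/* Returns the index of the ready process with the highest priority,
   or -1 if no process is ready. */
int pick_next(const struct process p[], int n)
{
    int i, best = -1;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        if (p[i].ready && (best < 0 || p[i].priority > p[best].priority))
            best = i;
    return best;
}
```

A scheduler would call pick_next whenever a process becomes ready or finishes; if the result differs from the running process, that process is preempted.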

16
6-2 Processor utilisation and feasible scheduling
P(execution time, period, deadline): P1(3, 9, 9), P2(1, 2, 2), P3(1, 6, 6)
Timeline: least common multiple of the process periods 9, 2, 6: 9 = 3·3, 2 = 2, 6 = 2·3, so LCM = 3·3·2 = 18
CPU utilisation: U = 3/9 + 1/2 + 1/6 = 1 = 100 %
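The timeline length and the utilisation can be computed with integer arithmetic only. In this sketch, struct task, hyperperiod and busy_time are illustrative names; utilisation is busy_time divided by hyperperiod.

```c
#include <assert.h>

/* Periodic process P(execution time, period, deadline). */
struct task { int c, t, d; };

static int gcd(int a, int b)
{
    while (b != 0) {
        int r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Least common multiple of all periods: the length of the timeline. */
int hyperperiod(const struct task ts[], int n)
{
    int i, h = 1;
    assert(n > 0);
    for (i = 0; i < n; i++) {
        assert(ts[i].t > 0);
        h = h / gcd(h, ts[i].t) * ts[i].t;
    }
    return h;
}

/* Total CPU time demanded by all processes during one hyperperiod. */
int busy_time(const struct task ts[], int n)
{
    int i, h = hyperperiod(ts, n), busy = 0;
    for (i = 0; i < n; i++)
        busy += (h / ts[i].t) * ts[i].c;
    return busy;
}
```

For P1(3, 9, 9), P2(1, 2, 2), P3(1, 6, 6) the timeline is 18 time units and the processes demand 2·3 + 9·1 + 3·1 = 18 of them, which is 100 % utilisation.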
17
Rate Monotonic Scheduling
RMS: the process with the shortest period is assigned the highest priority, and so on.
RMS guarantee: a feasible schedule exists if U ≤ n(2^(1/n) - 1).
For n = 3: U < 0.78
In this case U = 1, so there is no guarantee!
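The guarantee test can be sketched without a math library by rewriting U ≤ n(2^(1/n) - 1) as (1 + U/n)^n ≤ 2, which is equivalent for U ≥ 0. The function name rms_guaranteed is hypothetical.

```c
#include <assert.h>

/* Returns 1 if utilisation u passes the RMS bound for n processes.
   Uses the equivalent form (1 + u/n)^n <= 2 to avoid calling pow(). */
int rms_guaranteed(double u, int n)
{
    double base, p = 1.0;
    int i;
    assert(n > 0 && u >= 0.0);
    base = 1.0 + u / n;
    for (i = 0; i < n; i++)
        p *= base;
    return p <= 2.0;
}
```

For n = 3 and U = 1 the product is (4/3)^3, about 2.37, so the test fails. A failed test only means that RMS gives no guarantee; it does not by itself prove that deadlines will be missed.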
18
RMS figure
Priorities: P2 > P3 > P1 (2 < 6 < 9)
P1 misses the deadline! No feasible schedule with RMS!
19
Earliest Deadline First Scheduling
EDF guarantee: a feasible schedule exists if U ≤ 1.
In this case U = 1, so EDF will produce a feasible schedule.
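Both the RMS result and the EDF claim can be checked with a small simulation. This is a sketch under stated assumptions: time advances in unit steps, all processes are released together at t = 0, preemption costs nothing, and each deadline is no later than the period. The names first_miss, enum policy and MAXTASKS are illustrative.

```c
#include <assert.h>

#define MAXTASKS 8

/* Periodic process P(execution time, period, deadline). */
struct task { int c, t, d; };

enum policy { RMS, EDF };

/* Simulates preemptive scheduling in unit steps over [0, horizon).
   Returns the index of the first process that misses a deadline,
   or -1 if every job finishes in time. */
int first_miss(const struct task ts[], int n, int horizon, enum policy pol)
{
    int left[MAXTASKS];   /* remaining execution time of the current job */
    int due[MAXTASKS];    /* absolute deadline of the current job */
    int i, t, run;

    assert(n > 0 && n <= MAXTASKS);
    for (i = 0; i < n; i++) {
        assert(ts[i].c > 0 && ts[i].t > 0 && ts[i].d <= ts[i].t);
        left[i] = 0;
        due[i] = 0;
    }
    for (t = 0; t < horizon; t++) {
        for (i = 0; i < n; i++) {
            if (left[i] > 0 && t >= due[i])
                return i;                 /* job still unfinished at its deadline */
            if (t % ts[i].t == 0) {       /* a new job is released */
                left[i] = ts[i].c;
                due[i] = t + ts[i].d;
            }
        }
        run = -1;
        for (i = 0; i < n; i++) {
            if (left[i] == 0)
                continue;
            if (run < 0)
                run = i;
            else if (pol == RMS && ts[i].t < ts[run].t)
                run = i;                  /* shorter period wins */
            else if (pol == EDF && due[i] < due[run])
                run = i;                  /* earlier deadline wins */
        }
        if (run >= 0)
            left[run]--;                  /* run the chosen process for one step */
    }
    for (i = 0; i < n; i++)
        if (left[i] > 0 && due[i] <= horizon)
            return i;
    return -1;
}
```

For P1(3, 9, 9), P2(1, 2, 2), P3(1, 6, 6) over the 18-unit timeline, the RMS run reports process index 0 (P1) as the first miss, while the EDF run reports no miss.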
20
6.3 Scheduling and semaphores
P(execution time, period, deadline): P1(1, 3, 3), P2(1, 4, 4), P3(2, 6, 6)
Timeline: 3 = 3, 4 = 2·2, 6 = 2·3, so LCM = 3·2·2 = 12
RMS: P1 > P2 > P3 (3 < 4 < 6)
Sem1 is a binary semaphore. accessSem1() and releaseSem1() take 0 time.
21
RMS with no critical sections
22
RMS with critical sections