EECS 583 Lecture 21: Review for Midterm
1
EECS 583 Lecture 21: Review for Midterm
  • University of Michigan
  • March 29, 2004

2
Midterm Information
  • When/Where
  • Wednesday, March 31, in class
  • 4:10pm - 6:40pm (2.5 hrs)
  • Format
  • Open book, open notes
  • But don't try to learn how to modulo schedule
    during the test!
  • Bring a pencil or 2
  • No laptops
  • Material
  • Everything from lectures/homeworks is fair game
    up to and including clustering, but focus on the
    major topics
  • No Trimaran specifics will be asked

3
Studying
  • Lecture notes are most important thing
  • Go back and familiarize yourself with everything
  • Work through examples/problems done in lecture
  • Today - Work through problems from Winter 03 exam
    + some other problems
  • Notes
  • Emphasis is on understanding concepts/algorithms
    and solving problems
  • A few questions require some thinking;
    you cannot study for these
  • Go through the class problems again w/o looking
    at the answers!

4
1. Control Flow Analysis
Draw the CFG and label the edges with the branch
conditions.
while (i < 100) x if (a > 0)
if ((b > 0) (c > 0)) y
else z
5
2. If-Conversion
If-convert the following code, using HPL-PD style
CMPP operations.
while (i < 100) x if (a > 0)
if ((b > 0) (c > 0)) y
else z
6
3. Dataflow Analysis
Draw the def-use chains for the code segment
7
4. Classical Optimization
In the following superblock loop segment, which
instructions can be hoisted to the loop preheader
using loop invariant code motion?
8
5. Register Allocation
What is the minimum number of registers to color
the following interference graph such that no
spills are necessary?
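As a study aid, the minimum number of registers for a small interference graph can be checked by brute force. The sketch below is not the graph from the slide (the figure is not part of this transcript); the node names and edges are hypothetical, and the exhaustive search is only practical for a handful of live ranges.

```python
from itertools import product

def min_colors(nodes, edges):
    """Return (k, coloring) for the smallest k that colors the graph.

    Brute force over all assignments, so only suitable for tiny graphs.
    """
    for k in range(1, len(nodes) + 1):
        for assignment in product(range(k), repeat=len(nodes)):
            color = dict(zip(nodes, assignment))
            # A coloring is legal if no interfering pair shares a register.
            if all(color[u] != color[v] for u, v in edges):
                return k, color
    return 0, {}

# Hypothetical interference graph: a 5-cycle with one chord (a-c).
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("e", "a"), ("a", "c")]
k, coloring = min_colors(nodes, edges)
print(k, coloring)
```

The triangle a-b-c forces at least three registers here, and three turn out to be enough.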
9
6. Optimization of Predicated Code
List all the legal opportunities for copy
propagation in the following hyperblock.
1: p1,p2 = cmpp.UN.UC (r6 < 0) if T
2: r1 = r2 if T
3: r1 = r3 if p2
4: r4 = r1 if T
5: r1 = r1 1 if p2
6: r5 = r1 if p1
7: r6 = r1 if p2
8: r7 = r1 if T
10
7. Control Flow Analysis
Compute the CD (control dependence) set for basic
blocks 2, 4, 5, 7, 8. Out of each basic block,
the leftmost edge is considered BBid and the
rightmost edge is BBid as the edges leaving BB 1
are labeled
11
8. Predicate Analysis
For the following predicated code, draw the
predicate partition graph. In addition, which
predicate(s) is/are disjoint from s?
p, q = cmpp.UN.UC (a < b) if T
r, s = cmpp.UN.OC (c < d) if q
t, s = cmpp.UN.OC (e < f) if r
u, s = cmpp.UN.OC (g < h) if p
12
9. Superblock Scheduling
In a superblock, given a pair of dependent
operations (i.e., opB depends on opA), is it ever
possible for opB to have a higher priority than
opA? Briefly explain why or why not.
13
10. Static/Dynamic Single Assignment Form
Is the following loop segment in SSA form, DSA
form, neither, or both? Briefly explain your
answer.
14
11. Instruction Scheduling
Calculate the Estart, Lstart, and Slack for each
node in the following data dependence graph.
Edge latencies are indicated on the graph
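The Estart/Lstart/Slack calculation can be sketched as a forward and a backward pass over the dependence graph. The graph below is hypothetical (the slide's figure is not in this transcript), and the sketch assumes the convention that a node with no successors gets Lstart equal to the largest Estart.

```python
def schedule_bounds(nodes, edges):
    """nodes: list in topological order; edges: {(src, dst): latency}."""
    preds = {n: [] for n in nodes}
    succs = {n: [] for n in nodes}
    for (u, v), lat in edges.items():
        preds[v].append((u, lat))
        succs[u].append((v, lat))

    # Forward pass: earliest start is the longest path from any source.
    estart = {}
    for n in nodes:
        estart[n] = max((estart[u] + lat for u, lat in preds[n]), default=0)

    # Backward pass: latest start that does not delay any successor.
    length = max(estart.values())
    lstart = {}
    for n in reversed(nodes):
        lstart[n] = min((lstart[v] - lat for v, lat in succs[n]),
                        default=length)

    slack = {n: lstart[n] - estart[n] for n in nodes}
    return estart, lstart, slack

# Hypothetical data dependence graph with edge latencies.
nodes = [1, 2, 3, 4]
edges = {(1, 2): 1, (1, 3): 3, (2, 4): 1, (3, 4): 2}
estart, lstart, slack = schedule_bounds(nodes, edges)
print(estart, lstart, slack)
```

Nodes with zero slack (1, 3, and 4 in this example) lie on the critical path.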
15
12. Superblock Scheduling (Speculation)
Which operations have control dependences
to prior branches assuming the restricted
speculation architecture model? Which operations
assuming the general speculation model? To which
operations could register renaming be applied to
remove control dependences to prior branches
assuming the general speculation model? For
restricted speculation, assume the model as in
class where all memory, int div, and FP are
potentially excepting unless the compiler can
prove otherwise.
16
13. Control Flow Analysis
Compute the DOM (dominator) set for each
basic block in the following CFG. Also, name 1
use of dominator information other than loop
detection.
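For practice, dominator sets can be computed with the usual iterative intersection over predecessors. This is a minimal sketch on a hypothetical CFG (the slide's graph is not in this transcript); block ids and edges are made up for illustration.

```python
def dominators(cfg, entry):
    """cfg: {block: [successors]}. Returns {block: set of dominators}."""
    blocks = list(cfg)
    preds = {b: [] for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].append(b)

    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            new = set(blocks)
            for p in preds[b]:
                new &= dom[p]       # must dominate every predecessor
            new |= {b}              # a block always dominates itself
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom

# Hypothetical CFG: a diamond (1 -> 2,3 -> 4) with a back edge 4 -> 2.
cfg = {1: [2, 3], 2: [4], 3: [4], 4: [2, 5], 5: []}
dom = dominators(cfg, entry=1)
print(dom)
```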
17
14. Static Single Assignment Form
Convert the following into SSA form. You should
perform the necessary renames and show the Phi
nodes.
18
15a. Dataflow Analysis
Given the following definition of anticipation:
An expression E is anticipated at a point p if
and only if every path from p to Exit contains an
instruction that evaluates E and is not preceded
on that path by an instruction that might kill
E. For example, at the top of the following
basic block, the expression r1 r2
is anticipated, but r51 is not.
Define the set of dataflow equations to solve
for anticipation. You should define GEN, KILL,
IN, and OUT.
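One conventional way to set this up, offered as a sketch rather than the official answer, is a backward all-paths analysis: OUT[B] is the intersection of IN over B's successors (empty at Exit), and IN[B] = GEN[B] ∪ (OUT[B] − KILL[B]), where GEN holds expressions evaluated in B before any of their operands is redefined. The CFG, block names, and expression strings below are hypothetical.

```python
def anticipated(cfg, gen, kill, universe):
    """Backward, intersection-based dataflow for anticipated expressions.

    cfg: {block: [successors]}; blocks with no successors act as Exit.
    """
    blocks = list(cfg)
    in_sets = {b: set(universe) for b in blocks}
    out_sets = {b: set(universe) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if cfg[b]:
                out = set(universe)
                for s in cfg[b]:
                    out &= in_sets[s]   # must hold on every path
            else:
                out = set()             # nothing is anticipated past Exit
            new_in = gen[b] | (out - kill[b])
            if out != out_sets[b] or new_in != in_sets[b]:
                out_sets[b], in_sets[b] = out, new_in
                changed = True
    return in_sets, out_sets

# Hypothetical diamond CFG; both arms evaluate "r1+r2", only B3 evaluates "r5+1".
universe = {"r1+r2", "r5+1"}
cfg = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
gen = {"B1": set(), "B2": {"r1+r2"}, "B3": {"r1+r2", "r5+1"}, "B4": set()}
kill = {b: set() for b in cfg}
in_sets, out_sets = anticipated(cfg, gen, kill, universe)
print(in_sets["B1"])
```

In this example only "r1+r2" is anticipated at the top of B1, because "r5+1" is missing on the path through B2.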
19
15b. Dataflow Analysis
Given the following definition of anticipation:
An expression E is anticipated at a point p if
and only if every path from p to Exit contains an
instruction that evaluates E and is not preceded
on that path by an instruction that might kill
E.
Sketch out a global CSE algorithm that
utilizes anticipation as the only source of
dataflow information.
20
16a. Modulo Scheduling
Draw the dependence graph using the notation
<delay, distance> on each edge.
Latencies: add(read=0, write=1), load(read=0, write=3);
assume min/max vals equal.
LC = 200
1: r1[-1] = load(r6[0])
2: r2[-1] = load(r2[0])
3: r4[-1] = r1[1] + r3[0]
4: r3[-1] = r2[2] + r4[-1]
5: r6[-1] = r6[0] + 4
6: brlc Loop
21
16b. Modulo Scheduling
Calculate the ResMII, RecMII, and the MII.
Assume 4 fully pipelined FUs: 2 ALUs, 1 MEM, 1 BR.
LC = 200
1: r1[-1] = load(r6[0])
2: r2[-1] = load(r2[0])
3: r4[-1] = r1[1] + r3[0]
4: r3[-1] = r2[2] + r4[-1]
5: r6[-1] = r6[0] + 4
6: brlc Loop
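The MII bound can be sketched as the larger of a resource bound and a recurrence bound. The operation counts below assume the loop has two loads, three ALU operations, and one branch, which is a reading of the loop body above and not confirmed by the slide; the recurrence cycles are hypothetical (delay, distance) pairs, not derived from the actual dependence graph.

```python
from math import ceil

def res_mii(op_counts, fu_counts):
    """Resource bound: the busiest unit class, with fully pipelined FUs."""
    return max(ceil(op_counts[fu] / fu_counts[fu]) for fu in op_counts)

def rec_mii(cycles):
    """Recurrence bound: max over cycles of ceil(total delay / total distance)."""
    return max((ceil(delay / dist) for delay, dist in cycles), default=0)

# Assumed op mix for the loop and the FU counts stated on the slide.
op_counts = {"ALU": 3, "MEM": 2, "BR": 1}
fu_counts = {"ALU": 2, "MEM": 1, "BR": 1}

# Hypothetical recurrence cycles as (total delay, total distance).
cycles = [(1, 1), (2, 1)]

res = res_mii(op_counts, fu_counts)
rec = rec_mii(cycles)
mii = max(res, rec)
print(res, rec, mii)
```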
22
16c. Modulo Scheduling
Generate the modulo schedule. Assume the
priority of the operations is the linear
order. Show the final assembly with staging
predicates assigned.
LC = 200
1: r1[-1] = load(r6[0])
2: r2[-1] = load(r2[0])
3: r4[-1] = r1[1] + r3[0]
4: r3[-1] = r2[2] + r4[-1]
5: r6[-1] = r6[0] + 4
6: brlc Loop
23
16d. Modulo Scheduling
Can you achieve an II=1 schedule by adding
resources to the processor? If so, which ones?
If not, why?
LC = 200
1: r1[-1] = load(r6[0])
2: r2[-1] = load(r2[0])
3: r4[-1] = r1[1] + r3[0]
4: r3[-1] = r2[2] + r4[-1]
5: r6[-1] = r6[0] + 4
6: brlc Loop
24
16e. Modulo Scheduling
At the 60th cycle of execution of the modulo
scheduled loop, show which instructions are
executed and which iteration each comes from.
LC = 200
1: r1[-1] = load(r6[0])
2: r2[-1] = load(r2[0])
3: r4[-1] = r1[1] + r3[0]
4: r3[-1] = r2[2] + r4[-1]
5: r6[-1] = r6[0] + 4
6: brlc Loop
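The mapping from a cycle number to (operation, iteration) pairs can be sketched as follows: an operation placed at time t in the flat schedule executes for iteration i at cycle t + i * II. The schedule times and II below are hypothetical, not the answer to 16c, and the sketch assumes cycles and iterations are both counted from 0.

```python
def ops_at_cycle(schedule, ii, cycle, trip_count):
    """schedule: {op: start time in iteration 0}. Returns [(op, iteration)]."""
    active = []
    for op, t in schedule.items():
        delta = cycle - t
        # op issues for iteration i exactly when cycle == t + i * ii
        if delta >= 0 and delta % ii == 0:
            iteration = delta // ii
            if iteration < trip_count:
                active.append((op, iteration))
    return sorted(active)

# Hypothetical flat schedule for six operations with II = 2.
schedule = {1: 0, 2: 1, 3: 3, 4: 4, 5: 0, 6: 5}
active = ops_at_cycle(schedule, ii=2, cycle=60, trip_count=200)
print(active)
```

If the exam counts cycles from 1, pass `cycle=59` instead.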
25
Important Topics Not Covered in W03 Exam
  • Region formation
  • Traces, superblocks, hyperblocks
  • Dataflow analysis
  • Liveness, reaching defs, liverange construction
  • Classical optimizations
  • CSE, induction variable elimination, dead code
    elim
  • ILP optimization
  • Renaming, accumulator/induction variable
    expansion
  • Control flow optimization
  • Loop unrolling, unreachable code elimination
  • Scheduling
  • Priority calculation, superblock vs trace
    scheduling, machine descriptions, exception
    handling with control speculation

26
Some More Problems ...
27
Region Formation - Trace Selection
Identify the non-trivial traces. Assume a
threshold probability of 60%.
[Figure: weighted CFG over basic blocks BB1 through BB8, with profile counts on blocks and edges]
28
Region Formation - Forming Superblocks
Convert the traces into superblocks. Redraw the
resultant control flow graph below. Remember that
you must infer the profile information for any
basic blocks that you create.
[Figure: weighted CFG over basic blocks BB1 through BB8, with profile counts on blocks and edges]
29
Induction Variable Detection
List the basic and derived induction variables.
r4 = r1 << 2
r5 = load(r4)
r2 = r2 10
r6 = load(r2)
r6 = r6 1
store (r1, r6)
r7 = r1 r2
p1 = cmpp_un(r7 < 4)
branch p1, Exit
r8 = r2 5
r9 = r6 << 3
r1 = r1 2
r2 = r2 1
p1 = cmpp_un(r2 > r10)
branch p1, Loop
[Figure: CFG with blocks Loop and Exit; profile weights 10, 100, 6, 4]
30
Loop Optimization
Apply induction variable strength reduction to
all derived induction variables to convert them
into basic induction variables. Show the
transformed code below.
r4 = r1 << 2
r5 = load(r4)
r2 = r2 10
r6 = load(r2)
r6 = r6 1
store (r1, r6)
r7 = r1 r2
p1 = cmpp_un(r7 < 4)
branch p1, Exit
r8 = r2 5
r9 = r6 << 3
r1 = r1 2
r2 = r2 1
p1 = cmpp_un(r2 > r10)
branch p1, Loop
[Figure: CFG with blocks Loop and Exit; profile weights 10, 100, 6, 4]
31
Dataflow Analysis
Compute the reaching definitions GEN/KILL sets
for each basic block.
BB1
1: r1
2: r2
BB2
3: r1 = load(r2)
4: r2 = r2 1
BB3
BB4
5: r3 = r1
6: r2 = r3 1
7: r8 = r8 1
8: r7 = r1 - 1
9: r1 = r3
10: r2 = r7 << 3
11: r7 = load(r2)
12: store (r7, r8)
BB5
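Reaching-definition GEN/KILL sets can be built with a single in-order pass per block. The sketch below uses one common formulation (each definition kills all other definitions of the same register, and a later definition in the block supersedes an earlier one); the blocks and definition ids are hypothetical, since the assignment of instructions to blocks in the figure above did not survive.

```python
def gen_kill(blocks):
    """blocks: {name: [(def_id, dest_reg), ...]} in program order."""
    # Collect every definition of each register across the whole program.
    defs_of = {}
    for insts in blocks.values():
        for d, reg in insts:
            defs_of.setdefault(reg, set()).add(d)

    gen, kill = {}, {}
    for name, insts in blocks.items():
        g, k = set(), set()
        for d, reg in insts:
            others = defs_of[reg] - {d}
            g = {d} | (g - others)   # GEN = G + (GEN - K)
            k = others | (k - {d})   # KILL = K + (KILL - G)
        gen[name], kill[name] = g, k
    return gen, kill

# Hypothetical program: three blocks, definitions numbered 1..6.
blocks = {
    "B1": [(1, "r1"), (2, "r2")],
    "B2": [(3, "r1"), (4, "r2")],
    "B3": [(5, "r1"), (6, "r1")],
}
gen, kill = gen_kill(blocks)
print(gen, kill)
```

In B3 the second definition of r1 replaces the first, so only definition 6 is generated.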
32
Register Allocation Live Ranges
Construct the merged live ranges for each
variable. Note, you should not merge live ranges
unless they overlap.
1: a = load(100)
2: b = load(a)

3: c = b 1

4: d = c << 2
5: store (d, 0)

6: d = b 0xFF
7: e = d - 1

8: b = load(b)
9: a = load(e)

10: store(a, -1)
[Figure: CFG over these instruction groups with edge weights 1, 20, 1, 80, 99, 80, 20, 1]