Quasi-static Scheduling for Reactive Systems

1
Quasi-static Scheduling for Reactive Systems
  • Jordi Cortadella, Universitat Politècnica de
    Catalunya, Spain
  • Alex Kondratyev, Cadence Berkeley Labs, USA
  • Luciano Lavagno, Politecnico di Torino, Italy
  • Claudio Passerone, Politecnico di Torino, Italy
  • Yosinori Watanabe, Cadence Berkeley Labs, USA
  • Joint work with Robert Clarisó, Alex Kondratyev, Luciano
    Lavagno, Claudio Passerone and Yosinori Watanabe (UPC,
    Cadence Berkeley Labs, Politecnico di Torino)

2
Outline
  • The problem
      • Synthesis of concurrent specifications
  • Previous work: Dataflow networks
      • Static scheduling of SDF networks
  • Quasi-Static Scheduling of process networks
      • Petri net representation of process networks
      • Scheduling and code generation
  • Open problems

3
Embedded Software Synthesis
  • Specification: concurrent functional netlist
    (Kahn processes, dataflow actors, SDL processes, ...)
  • Software implementation: (smaller) set of
    concurrent software tasks
  • Two sub-problems:
      • Generate code for each task
      • Schedule tasks dynamically
  • Goals:
      • minimize real-time scheduling overhead
      • maximize effectiveness of compilation

4
Environmental controller
5
Environmental controller
TEMP-FILTER:
    float sample, last;
    last = 0;
    forever {
        sample = READ(TSENSOR);
        if (sample - last > DIF) {
            last = sample;
            WRITE(TDATA, sample);
        }
    }
[Figure: process network. TSENSOR → TEMP FILTER → TDATA → CONTROLLER; HSENSOR → HUMIDITY FILTER → HDATA → CONTROLLER; CONTROLLER → AC-on, DRYER-on, ALARM-on]
6
Environmental controller
TEMP-FILTER:
    float sample, last;
    last = 0;
    forever {
        sample = READ(TSENSOR);
        if (sample - last > DIF) {
            last = sample;
            WRITE(TDATA, sample);
        }
    }
HUMIDITY-FILTER:
    float h, max;
    forever {
        h = READ(HSENSOR);
        if (h > MAX) WRITE(HDATA, h);
    }
[Figure: environmental controller process network]
7
Environmental controller
CONTROLLER:
    float tdata, hdata;
    forever {
        select (TDATA, HDATA) {
            case TDATA:
                tdata = READ(TDATA);
                if (tdata > TFIRE) WRITE(ALARM-on, 10);
                else if (tdata > TMAX) WRITE(AC-on, tdata - TMAX);
            case HDATA:
                hdata = READ(HDATA);
                if (hdata > HMAX) WRITE(DRYER-on, 5);
        }
    }
[Figure: environmental controller process network]
8
Operating system
[Figure: timeline with columns Environment, Processes and OS]
  • Tsensor event: T-FILTER wakes up, executes, sleeps
  • Hsensor event: H-FILTER wakes up, executes (sends data
    to HDATA), sleeps
  • CONTROLLER wakes up, executes (reads data from HDATA)
  • . . .
9
Operating system
  • Goal: improve performance
      • Reduce operating system overhead
      • Reduce communication overhead
  • How? Do as much as possible statically
      • Scheduling
      • Compiler optimizations

[Figure: environmental controller process network]
10
Outline
  • The problem
      • Synthesis of concurrent specifications
  • Previous work: Dataflow networks
      • Static scheduling of SDF networks
  • Quasi-Static Scheduling of process networks
      • Petri net representation of process networks
      • Scheduling and code generation
  • Open problems

11
A bit of history
  • Kahn process networks (74): formal model
  • Karp computation graphs (66): seminal work
  • Dennis Dataflow networks (75): programming
    language for MIT DF machine
  • Lee's Static Data Flow networks (86): efficient
    static scheduling
  • Several recent implementations (Ptolemy, Khoros,
    Grape, SPW, COSSAP, SystemStudio, DSPStation,
    Simulink, ...)

12
Intuitive semantics
  • (Often stateless) actors perform computation
  • Unbounded FIFOs perform communication via
    sequences of tokens carrying values
      • (matrix of) integer, float, fixed point
      • image of pixels, ...
  • Determinacy:
      • unique output sequences given unique input
        sequences
      • Sufficient condition: blocking read
        (process cannot test input queues for emptiness)

13
Intuitive semantics
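  • Example: FIR filter
      • single input sequence i(n)
      • single output sequence o(n)
      • $o(n) = c_1\, i(n) + c_2\, i(n-1)$

[Figure: dataflow graph of the filter. Input i feeds one multiplier (× c1) directly and a second multiplier (× c2) through a delay holding the initial token i(-1); an adder combines both products into output o]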
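
The FIR example can be sketched in C as a single actor. This is only an illustration: the names `fir2_t`, `fir2_fire` and `fir2_run` are hypothetical, and the delayed sample i(n-1) is assumed to be kept as actor state, initialized with the initial token i(-1).

```c
#include <assert.h>
#include <stddef.h>

/* Two-tap FIR actor: o(n) = c1*i(n) + c2*i(n-1).
 * Names are illustrative, not taken from the slides. */
typedef struct {
    double c1, c2;   /* filter coefficients */
    double prev;     /* delayed sample i(n-1); starts as the initial token i(-1) */
} fir2_t;

/* One firing: consume one input token, produce one output token. */
double fir2_fire(fir2_t *f, double in)
{
    double out = f->c1 * in + f->c2 * f->prev;
    f->prev = in;    /* the current sample becomes i(n-1) for the next firing */
    return out;
}

/* Fire the actor once per token of a finite input sequence. */
void fir2_run(fir2_t *f, const double *in, double *out, size_t n)
{
    assert(f != NULL && in != NULL && out != NULL);
    for (size_t k = 0; k < n; k++)
        out[k] = fir2_fire(f, in[k]);
}
```

Each call to `fir2_fire` is one firing with fixed rates (one token in, one token out), which is what makes the actor an SDF actor.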
14
Examples of Dataflow actors
  • SDF (Static Dataflow): fixed number of input and
    output tokens
  • BDF (Boolean Dataflow): control token determines
    number of consumed and produced tokens

[Figure: an SDF actor with rate 1 on every port, and the BDF select and merge actors with T and F branches]
15
Static scheduling of DF
  • Key property of DF networks: output sequences do
    not depend on firing sequence of actors
  • SDF networks can be statically scheduled at
    compile-time
      • execute an actor when it is known to be fireable
      • no overhead due to sequencing of concurrency
      • static buffer sizing
  • Different schedules yield different
      • code size
      • buffer size
      • pipeline utilization

16
Balance equations
  • Number of produced tokens must equal number of
    consumed tokens on every edge
  • Repetitions (or firing) vector $v_S$ of schedule S:
    number of firings of each actor in S
  • $v_S(A)\, n_p = v_S(B)\, n_c$
    must be satisfied for each edge

[Figure: edge from A to B; A produces $n_p$ tokens per firing, B consumes $n_c$ tokens per firing]
17
Balance equations
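[Figure: SDF graph with actors A, B and C; edge labels give the tokens produced and consumed per firing]
  • Balance for each edge:
      • $3\, v_S(A) - v_S(B) = 0$
      • $v_S(B) - v_S(C) = 0$
      • $2\, v_S(A) - v_S(C) = 0$
      • $2\, v_S(A) - v_S(C) = 0$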

18
Balance equations
  • $M\, v_S = 0$ iff S is periodic
  • Full rank (as in this case):
      • no non-zero solution
      • no periodic schedule
        (too many tokens accumulate on A→B or B→C)

19
Balance equations
  • Non-full rank:
      • infinite solutions exist (linear space of
        dimension 1)
  • Any multiple of $q = [1\ 2\ 2]^T$ satisfies the
    balance equations
  • ABCBC and ABBCC are minimal valid schedules
  • ABABBCBCCC is a non-minimal valid schedule

20
Static SDF scheduling
  • Main SDF scheduling theorem (Lee 86):
      • A connected SDF graph with n actors has a
        periodic schedule iff its topology matrix M has
        rank n-1
      • If M has rank n-1 then there exists a unique
        smallest positive integer solution q to
        $M q = 0$
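
As a rough sketch of how the smallest solution q could be computed, the C code below propagates firing rates as fractions along the edges and then scales them to integers. It is not the method of the slides: the names (`edge_t`, `repetition_vector`, `MAX_ACTORS`) are hypothetical, and the graph is assumed to be connected with positive rates.

```c
#include <assert.h>

#define MAX_ACTORS 16

/* Edge src -> dst: src produces `prod` tokens per firing, dst consumes `cons`. */
typedef struct { int src, dst, prod, cons; } edge_t;

static long gcd_l(long a, long b)
{
    while (b != 0) { long t = a % b; a = b; b = t; }
    return a;
}

/* Solves q[src]*prod == q[dst]*cons for every edge.
 * Returns 1 and stores the smallest positive integer solution in q,
 * or 0 if the equations are inconsistent or the graph is not connected. */
int repetition_vector(int n, const edge_t *e, int m, long *q)
{
    long num[MAX_ACTORS] = {0}, den[MAX_ACTORS];
    long l = 1, g = 0;
    int changed = 1;

    assert(n > 0 && n <= MAX_ACTORS);
    for (int i = 0; i < n; i++) den[i] = 1;
    num[0] = 1;                      /* seed: actor 0 fires at rate 1/1 */

    while (changed) {                /* propagate rates as fractions num/den */
        changed = 0;
        for (int k = 0; k < m; k++) {
            int a = e[k].src, b = e[k].dst;
            long r;
            if (num[a] != 0 && num[b] == 0) {         /* q[b] = q[a]*prod/cons */
                num[b] = num[a] * e[k].prod;
                den[b] = den[a] * e[k].cons;
                r = gcd_l(num[b], den[b]); num[b] /= r; den[b] /= r;
                changed = 1;
            } else if (num[b] != 0 && num[a] == 0) {  /* q[a] = q[b]*cons/prod */
                num[a] = num[b] * e[k].cons;
                den[a] = den[b] * e[k].prod;
                r = gcd_l(num[a], den[a]); num[a] /= r; den[a] /= r;
                changed = 1;
            }
        }
    }
    for (int i = 0; i < n; i++)
        if (num[i] == 0) return 0;   /* actor never reached */
    for (int k = 0; k < m; k++) {    /* every edge must balance */
        int a = e[k].src, b = e[k].dst;
        if (num[a] * e[k].prod * den[b] != num[b] * e[k].cons * den[a]) return 0;
    }
    for (int i = 0; i < n; i++) l = l / gcd_l(l, den[i]) * den[i];  /* lcm of denominators */
    for (int i = 0; i < n; i++) { q[i] = num[i] * (l / den[i]); g = gcd_l(g, q[i]); }
    for (int i = 0; i < n; i++) q[i] /= g;
    return 1;
}
```

With edges A→B (2, 1), B→C (1, 1) and A→C (2, 1) the function yields q = [1 2 2]; with the rates of the full-rank example (3 tokens on A→B) it reports that no solution exists.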

21
From repetition vector to schedule
  • Repeatedly schedule fireable actors up to number
    of times in repetition vector
      • $q = [1\ 2\ 2]^T$
      • Can find either ABCBC or ABBCC
  • If deadlock before original state, no valid
    schedule exists (Lee 86)
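
The procedure above can be sketched as a small simulation. The code is illustrative only: `edge_t`, `build_schedule` and the size limits are hypothetical names, all edges are assumed to start empty, and ties are broken by always picking the lowest-numbered fireable actor.

```c
#include <assert.h>

#define MAX_ACTORS 16
#define MAX_EDGES  32

/* Edge src -> dst: src produces `prod` tokens per firing, dst consumes `cons`. */
typedef struct { int src, dst, prod, cons; } edge_t;

/* An actor is fireable when every input edge holds at least `cons` tokens. */
static int fireable(int a, const edge_t *e, int m, const int *tokens)
{
    for (int k = 0; k < m; k++)
        if (e[k].dst == a && tokens[k] < e[k].cons)
            return 0;
    return 1;
}

/* Builds one period of a schedule from the repetition vector q.
 * sched must have room for the sum of q. Returns the schedule length,
 * or -1 if the simulation deadlocks before every actor a has fired q[a] times. */
int build_schedule(int n, const edge_t *e, int m, const int *q, int *sched)
{
    int tokens[MAX_EDGES] = {0};     /* assumption: no initial tokens */
    int left[MAX_ACTORS];
    int len = 0, total = 0;

    assert(n <= MAX_ACTORS && m <= MAX_EDGES);
    for (int a = 0; a < n; a++) { left[a] = q[a]; total += q[a]; }

    while (len < total) {
        int fired = 0;
        for (int a = 0; a < n && !fired; a++) {
            if (left[a] > 0 && fireable(a, e, m, tokens)) {
                for (int k = 0; k < m; k++) {
                    if (e[k].dst == a) tokens[k] -= e[k].cons;
                    if (e[k].src == a) tokens[k] += e[k].prod;
                }
                left[a]--;
                sched[len++] = a;
                fired = 1;
            }
        }
        if (!fired) return -1;       /* deadlock: no valid schedule */
    }
    return len;
}
```

For edges A→B (2, 1), B→C (1, 1), A→C (2, 1) and q = [1 2 2], this tie-breaking rule produces ABBCC; a different rule could produce ABCBC instead.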

22
Compilation optimization
  • Assumption: code stitching
    (chaining custom code for each actor)
      • More efficient than C compiler for DSP
      • Comparable to hand-coding in some cases
  • Explicit parallelism, no artificial control
    dependencies
  • Main problem: memory and processor/FU allocation
    depends on scheduling, and vice-versa

23
Code size minimization
  • Assumptions (based on DSP architecture):
      • subroutine calls expensive
      • fixed iteration loops are cheap
        (zero-overhead loops)
  • Global optimum: single appearance schedule
      • e.g. ABCBC → A (2BC), ABBCC → A (2B) (2C)
      • may or may not exist for an SDF graph
  • buffer minimization relative to single appearance
    schedules
    (Bhattacharyya 94, Lauwereins 96, Murthy 97)

24
Buffer size minimization
  • Assumption: no buffer sharing
  • Example:
      • $q = [100\ 100\ 10\ 1]^T$
      • Valid SAS: (100 A) (100 B) (10 C) D
          • requires 210 units of buffer area
      • Better (factored) SAS: (10 (10 A) (10 B) C) D
          • requires 30 units of buffer area, but
          • requires 21 loop initiations per period (instead
            of 3)
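
The buffer figures can be checked with a short simulation. The slide does not give the graph, so the sketch assumes a chain A → B → C → D with rates (1, 1), (1, 10) and (1, 10), which is consistent with q = [100 100 10 1]; all names are hypothetical.

```c
#include <assert.h>

enum { A, B, C, D };
enum { N_EDGES = 3 };

/* Assumed chain A -> B -> C -> D (not given on the slide). */
static const int prod[N_EDGES] = {1, 1, 1};    /* produced by A, B, C */
static const int cons[N_EDGES] = {1, 10, 10};  /* consumed by B, C, D */

typedef struct { int tokens[N_EDGES]; int peak[N_EDGES]; } sim_t;

/* Fire one actor and record the peak occupancy of its output edge. */
static void fire(sim_t *s, int actor)
{
    if (actor > A) {                        /* consume from the input edge */
        s->tokens[actor - 1] -= cons[actor - 1];
        assert(s->tokens[actor - 1] >= 0);  /* actor must have been fireable */
    }
    if (actor < D) {                        /* produce on the output edge */
        s->tokens[actor] += prod[actor];
        if (s->tokens[actor] > s->peak[actor])
            s->peak[actor] = s->tokens[actor];
    }
}

static void repeat(sim_t *s, int times, int actor)
{
    for (int i = 0; i < times; i++) fire(s, actor);
}

/* Without buffer sharing, the total requirement is the sum of per-edge peaks. */
static int total_peak(const sim_t *s)
{
    int sum = 0;
    for (int k = 0; k < N_EDGES; k++) sum += s->peak[k];
    return sum;
}

/* (100 A) (100 B) (10 C) D */
int flat_sas_buffer(void)
{
    sim_t s = {{0}, {0}};
    repeat(&s, 100, A); repeat(&s, 100, B); repeat(&s, 10, C); fire(&s, D);
    return total_peak(&s);
}

/* (10 (10 A) (10 B) C) D */
int factored_sas_buffer(void)
{
    sim_t s = {{0}, {0}};
    for (int i = 0; i < 10; i++) {
        repeat(&s, 10, A); repeat(&s, 10, B); fire(&s, C);
    }
    fire(&s, D);
    return total_peak(&s);
}
```

Under these assumed rates the flat schedule peaks at 100 + 100 + 10 tokens and the factored one at 10 + 10 + 10, matching the 210 and 30 units quoted above.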

25
Scheduling more powerful DF
  • SDF is limited in modeling power
  • More general DF is too powerful
      • non-Static DF is Turing-complete (Buck 93)
      • bounded-memory scheduling is not always possible
  • Boolean Data Flow: Quasi-Static Scheduling of
    special patterns
      • if-then-else, repeat-until, do-while
  • Dynamic Data Flow: run-time scheduling
      • may run out of memory or deadlock at run time
  • Kahn Process Networks: quasi-static scheduling
    using Petri nets
      • conservative: schedulable network may be declared
        unschedulable

26
Outline
  • The problem
      • Synthesis of concurrent specifications
      • Compiler optimizations across processes
  • Previous work: Dataflow networks
      • Static scheduling of SDF networks
      • Code and data size optimization
  • Quasi-Static Scheduling of process networks
      • Petri net representation of process networks
      • Scheduling and code generation
  • Open problems

27
Quasi-Static Scheduling
  • Sequentialize concurrent operations as much as
    possible
      • less communication overhead (run-time task
        generation)
      • better starting point for compilation
        (straight-line code from function blocks)
  • Must handle
      • data-dependent control
      • multi-rate communication

28
The problem
  • Given: a network of Kahn processes
      • Kahn process: sequential function + ports
      • communication: port-based, point-to-point,
        uni-directional, multi-rate
  • Find: a single task
      • functionally equivalent to the original network
        (modulo concurrency)
      • driven by input stimuli (no OS intervention)

[Figure: environmental controller process network]
29
The scheduling procedure
  • 1. Specify a network of processes
      • process: C + communication operations
      • netlist: connection between ports
  • 2. Translate to the computational model: Petri
    nets
  • 3. Find a schedule on the Petri net
  • 4. Translate the schedule to a task

30
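TEMP-FILTER:
    float sample, last;
    last = 0;
    while (1) {
        sample = READ(TSENSOR);
        if (sample - last > DIF) {
            last = sample;
            WRITE(TDATA, sample);
        }
    }
[Figure: Petri net of TEMP FILTER. Input TSENSOR; transitions "last = 0" and "sample = READ(TSENSOR)"; a choice with T and F branches; transition "last = sample; WRITE(TDATA, sample)"; output TDATA]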
31
Petri nets for Kahn process networks
  • Sequential processes (1 token per process)
  • Input/Output ports (communication with the
    environment)
  • Channels (point-to-point communication between
    processes)
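
A minimal C sketch of the underlying Petri net firing rule may help in reading this representation. The data layout (`transition_t` with `pre` and `post` weights per place) and the function names are assumptions made for illustration; they do not come from the tool described in the slides.

```c
#include <assert.h>

#define MAX_PLACES 8

/* One transition: tokens consumed from (pre) and produced into (post) each place. */
typedef struct {
    int pre[MAX_PLACES];
    int post[MAX_PLACES];
} transition_t;

/* Enabled when every place holds at least pre[p] tokens.
 * A source transition (all pre weights zero) is therefore always enabled. */
int pn_enabled(const transition_t *t, const int *marking, int n_places)
{
    for (int p = 0; p < n_places; p++)
        if (marking[p] < t->pre[p]) return 0;
    return 1;
}

/* Fires t if it is enabled: returns 1 and updates the marking, else returns 0. */
int pn_fire(const transition_t *t, int *marking, int n_places)
{
    assert(n_places <= MAX_PLACES);
    if (!pn_enabled(t, marking, n_places)) return 0;
    for (int p = 0; p < n_places; p++)
        marking[p] += t->post[p] - t->pre[p];
    return 1;
}
```

For example, a multi-rate write can be modeled as a transition that takes and returns the writer's single process token and adds two tokens to a channel place, while the matching read needs the reader's process token plus one channel token.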
32
Petri nets for Kahn process networks
  • Data-dependent choices
      • Conservative assumption (any outcome is possible)

[Figure: Petri net choices with True and False transitions]

33
Scheduling game
  • Adversary: data choice, inputs
  • Scheduler: the rest of transitions
[Figure: game between adversary and scheduler over transitions t1 to t6]
34
Scheduling game
  • Adversary: data choice, inputs
  • Scheduler: the rest of transitions
[Figure: game over transitions t1 to t6, continued; one move is marked "?"]
35
Scheduling game
  • Adversary: data choice, inputs
  • Scheduler: the rest of transitions
[Figure: game over transitions t1 to t6, continued; one move is marked "?"]
36
Schedule generation
  • Schedule is a subset of the reachability graph (RG)
      • Finite
      • Sequential
      • Live with respect to source transitions
      • All FCS transitions are fired in a state
        (FCS: always conflicting transitions)
  • Depth first traversal with backtracking

[Figure: reachability graph explored from the initial place p0]
37
Schedule generation
Await states
38
Handling infinity
  • PN with source transitions has infinite
    reachability space
  • Need for termination conditions during traversal
  • Irrelevance Criterion
      • Impose place bounds by the structure of the PN.
      • Identify irrelevant nodes in the reachability
        tree.
      • If the algorithm hits an irrelevant node,
        backtrack.
  • Bounds the reachability space!
39
Irrelevance criterion
  • Bound of a place (example in the figure):
    max(3 + 4 - 1, 1) = 6
  • v is an irrelevant node iff there is a node u such that
      • 1. v succeeds u,
      • 2. $\forall p:\ M(u, p) \le M(v, p)$
        (v is at least as capable as u),
      • 3. $\forall p$: if $M(u, p) < M(v, p)$, then
        $M(u, p) \ge$ the bound of p
        (u already hits the bounds).
  • Irrelevance is more than marking, it is
    marking + history!
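
Conditions 2 and 3 translate directly into a comparison of two markings. The C sketch below is illustrative: `is_irrelevant` is a hypothetical name, markings are assumed to be plain integer arrays indexed by place, and condition 1 (v succeeds u in the reachability tree) is assumed to be checked by the caller.

```c
#include <assert.h>

/* Returns 1 if marking v is irrelevant with respect to the marking u of one
 * of its ancestors, 0 otherwise. bound[p] is the bound of place p.
 * Note: equal markings satisfy both conditions as stated; this sketch does
 * not special-case them. */
int is_irrelevant(const int *u, const int *v, const int *bound, int n_places)
{
    assert(n_places > 0);
    for (int p = 0; p < n_places; p++) {
        if (u[p] > v[p])
            return 0;   /* condition 2 violated: M(u,p) <= M(v,p) must hold */
        if (u[p] < v[p] && u[p] < bound[p])
            return 0;   /* condition 3 violated: p grew while u was below its bound */
    }
    return 1;
}
```

A traversal would call this for v against every ancestor u on the current path and backtrack as soon as one call returns 1.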
40
Quality of irrelevance criterion
  • Heuristic for general Petri nets
  • For unique and/or free choice PNs irrelevance
    may be exact (if yes, then schedulability is
    decidable in this class): open issue

[Figure: reachability tree with a node marked irrelevant]
41
Properties of the Algorithm
  • Claim 1:
      • If the algorithm terminates successfully, a
        schedule is obtained.
  • Claim 2:
      • If the algorithm does NOT terminate successfully,
        no schedule exists under given termination
        conditions.

Semi-decision procedure!
42
Divide and conquer
43
Divide and conquer
44
Checking SSS independence
  • Marking equations
  • Consumption of tokens
  • SSS independence: necessary and sufficient condition
      • $M_0(p) + \mathrm{worst\_change}(p, a) + \mathrm{SSS\_change}(p, a) \ge 0$
      • Worst consumption of p in SSS(a)
      • Worst consumption of p in other SSSs
  • Complexity of checking: O(Σ |SSS|)
  • Composition has exponentially larger number of
    states!
45
Code generation
  • Generated code:
      • ISRs driven by input stimuli (I1 and I2)
      • Each task contains threads from one await
        state to another await state

[Figure: schedule with an Initialization thread, await states, inputs I1 and I2, and a choice with T and F branches]
46
Code generation
  • Generated code:
      • ISRs driven by input stimuli (I1 and I2)
      • Each task contains threads from one await
        state to another await state

[Figure: the schedule, with threads labeled by inputs I1 and I2 and choice outcomes T and F]
47
Code generation
  • Generated code:
      • ISRs driven by input stimuli (I1 and I2)
      • Each task contains threads from one await
        state to another await state

[Figure: the schedule annotated with Init, code fragments C1 to C11, inputs I1 and I2, and choice outcomes T and F]
48
Code generation
    enum state {S1, S2, S3} S;

[Figure: schedule with code fragments C0 to C11, inputs I1 and I2, and choice outcomes T and F]
49
Code generation
    enum state {S1, S2, S3} S;

    void Init(void) {
        C0();
        S = S1;
        return;
    }
[Figure: initialization thread, fragment C0]
50
Code generation
    enum state {S1, S2, S3} S;

    void ISR1(void) {
        switch (S) {
            case S1: C1(); C2(); S = S2; return;
            case S2: C3(); C2(); return;
            case S3: C6(); C7(); C11(); C5(); return;
        }
    }
[Figure: threads triggered by input I1, through fragments C1, C2, C3, C5, C6, C7 and C11]
51
Code generation
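    enum state {S1, S2, S3} S;

    void ISR2(void) {
        switch (S) {
            case S1: C4(); C5(); S = S3; break;
            case S2: C10(); C11(); C5(); S = S3; return;
            case S3:
                if (C8()) { C7(); C11(); C5(); return; }
                else { C9(); S = S1; return; }
        }
    }
[Figure: threads triggered by input I2, through fragments C4, C5, C7, C8, C9, C10 and C11, with choice outcomes T and F]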
52
Code generation
    enum state {S1, S2, S3} S;

    void Init(void) { C0(); S = S1; return; }

    void ISR1(void) {
        switch (S) {
            case S1: C1(); C2(); S = S2; return;
            case S2: C3(); C2(); return;
            case S3: C6(); C7(); C11(); C5(); return;
        }
    }

    void ISR2(void) {
        switch (S) {
            case S1: C4(); C5(); S = S3; break;
            case S2: C10(); C11(); C5(); S = S3; return;
            case S3:
                if (C8()) { C7(); C11(); C5(); return; }
                else { C9(); S = S1; return; }
        }
    }
[Figure: the complete schedule with code fragments C0 to C11, inputs I1 and I2, and choice outcomes T and F]
53
Code generation
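[Figure: the generated task. Reset calls Init(), input I1 triggers ISR1(), input I2 triggers ISR2(); all three share the state variable S]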
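
To try the generated task outside its target environment, the code fragments can be replaced by stubs. The sketch below is only a test harness under stated assumptions: the stubs, the `trace` log and the `c8_result` variable are hypothetical, and C8 is assumed to return the outcome of the data-dependent choice. The state machine itself follows the listing on the slides.

```c
#include <string.h>

static char trace[256];   /* log of executed fragments (hypothetical helper) */
static int  c8_result;    /* stands in for the data-dependent choice in C8 */

static void log_frag(const char *name)
{
    strncat(trace, name, sizeof trace - strlen(trace) - 1);
    strncat(trace, " ", sizeof trace - strlen(trace) - 1);
}

/* Stub for each code fragment: it only records that it ran. */
#define FRAG(n) static void n(void) { log_frag(#n); }
FRAG(C0) FRAG(C1) FRAG(C2) FRAG(C3) FRAG(C4) FRAG(C5)
FRAG(C6) FRAG(C7) FRAG(C9) FRAG(C10) FRAG(C11)
static int C8(void) { log_frag("C8"); return c8_result; }

enum state { S1, S2, S3 };
static enum state S;      /* current await state */

void Init(void) { C0(); S = S1; }

void ISR1(void)           /* triggered by input I1 */
{
    switch (S) {
    case S1: C1(); C2(); S = S2; return;
    case S2: C3(); C2(); return;
    case S3: C6(); C7(); C11(); C5(); return;
    }
}

void ISR2(void)           /* triggered by input I2 */
{
    switch (S) {
    case S1: C4(); C5(); S = S3; break;
    case S2: C10(); C11(); C5(); S = S3; return;
    case S3:
        if (C8()) { C7(); C11(); C5(); return; }
        else      { C9(); S = S1; return; }
    }
}
```

Calling `Init()`, then `ISR1()`, then `ISR2()` walks the task from S1 through S2 to S3, and `trace` lists the fragments in the order they ran.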
54
Experimental Results
  • QSS applied to a subset of the MPEG-2 Decoder
    (5 processes out of the original 11)

[Figure: MPEG-2 decoder processes Thdr, Tvld, TdecMV, Tpredict, Tisiq, Tidct and Tadd, drawn twice, with a box labeled QSS]

55
The MPEG2 decoder
  • Performance increased by 45%
      • reduction of communication (no internal FIFOs
        between statically scheduled processes)
      • reduction of run-time scheduling (OS)
      • no reduction in computation

56
Open problems
  • Is a system schedulable ? (decidability)
  • False paths in concurrent systems (data
    dependencies)
  • Synthesis for concurrent architectures
  • Timing models

57
(Quasi) Static Scheduling approaches
  • Lee et al. 86: Static Data Flow; cannot specify
    data-dependent control
  • Buck et al. 94: Boolean Data Flow; undecidable
    schedulability check, heuristic pattern-based
    algorithm
  • Thoen et al. 99: Event graph; no schedulability
    check, no task minimization
  • Lin 97: Safe Petri Net; no schedulability check,
    single-rate, reachability-based algorithm
  • Thiele et al. 99: Bounded Petri Net; partial
    schedulability check, reachability-based
    algorithm
  • Cortadella et al. 00: General Petri Net; maybe
    undecidable schedulability check, balance
    equation-based algorithm

58
The false path problem (example)
Process P1:
    while (true) {
        a = rnd();
        Write(ct, a, 1);
        if (a > 0.5) Write(dt, d, 2);
        else Write(dt, d, 1);
    }
Process P2:
    while (true) {
        Read(ct, a, 1);
        if (a > 0.5) Read(dt, d, 2);
        else Read(dt, d, 1);
    }
[Figure: P1 and P2 connected by channels ct and dt]
If P1 does Write(dt, d, 2), then P2 never does
Read(dt, d, 1), i.e. this path is false!
59
False path elimination
  • The designer manually tags sets of correlated
    conditions
  • An implicit data-dependent correlation becomes an
    explicit communication and control-dependent
    synchronization
      • the tool automatically adds synchronization
        channels to model correlation
  • Scheduling is then possible
      • synchronization channels can be deleted after
        code generation (no overhead in final
        implementation)

60
The false path problem (example)
61
False path elimination algorithm
Process P1:
    #pragma tag sync
    if (cond1) { Write(syncT, d, 1); stm1; }
    else { Write(syncF, d, 1); stm2; }
Process P2:
    #pragma tag sync
    (void) (cond2);
    switch (Select(syncT, syncF)) {
        case 0: Read(syncT, d, 1); stm3; break;
        case 1: Read(syncF, d, 1); stm4; break;
    }
[Figure: P1 and P2 connected through the added ports]
62
False path elimination algorithm
  • 1. For each correlated control pair, add two
    ports <tag>T and <tag>F to processes P1 and P2
    and connect them.
  • 2. Add Write statements at the beginning of both
    branches of if-then-else, writing on the created
    ports in process P1.
  • 3. Delete if-then-else from process P2 and add a
    switch on the output of a Select statement on the
    created ports.
  • 4. Fill in the case clauses with the appropriate
    code from the branches in process P2, reading
    data from the created ports.
  • 5. Finally, apply QSS and eliminate the added
    synchronization.

63
False path elimination algorithm
Process P, as produced by QSS:
    (void) (cond2);
    if (cond1) {
        Write(syncT, P1_d, 1); Read(syncT, P2_d, 1);
        stm1; stm3;
    } else {
        Write(syncF, P1_d, 1); Read(syncF, P2_d, 1);
        stm2; stm4;
    }
Process P, after eliminating the added synchronization:
    (void) (cond2);
    if (cond1) { stm1; stm3; }
    else { stm2; stm4; }
[Figure: processes P1 and P2 with their tagged code, connected through ports syncT and syncF, are merged by QSS into the single Process P]
64
Conclusions
  • QSS shows significant gains in real examples
  • Current theory has several open problems
  • Future extensions are
