Title: Lic Presentation
1 Lic Presentation
Memory Aware Task Assignment and Scheduling for Multiprocessor Embedded Systems
Radoslaw Szymanek / Embedded System Design
Radoslaw.Szymanek_at_cs.lth.se
http://www.cs.lth.se/home/Radoslaw_Szymanek
2 Outline
- Introduction
- Problem Formulation and Motivational Example
- CLP Introduction
- CLP Modeling
- Optimization Heuristic and Experimental Results
- Conclusions
 
3 System Level Synthesis (SLS)
- Multiprocessor embedded systems are designed using CPUs, ASICs, buses, and interconnection links
- The application areas range from signal and image processing to multimedia and telecommunication
- Task graph representation for application
- The main design activities are task assignment and scheduling for a given architecture
- Memory constraints (code and data memory)
 
4 SLS with memory constraints
[Figure: target architecture (components P1, P2, P3, A1 with ROM and RAM memories; interconnect L1, L2, B1) and annotated task graph]
5 Problem Assumptions and Formulation
- Data-dominated application represented as a directed bipartite acyclic task graph
- Each task is annotated with execution time, code and data memory requirements
- Heterogeneous architecture
- Both tasks and communications are atomic and must be performed in one step
- Find a good CLP model
- Find a good heuristic for memory-constrained, time-minimizing task assignment and scheduling that satisfies all constraints
6 Motivation
- SoC multiprocessor architectures
- Co-design methodology needs tool support
- Memory consideration to decrease cost and power consumption
- System Level design for fast evaluation
 
7 Motivating example (memory)
[Figure: task graph and architecture (P1, P2, L1), with schedule and data memory charts over time t; labels C1 to C3, DC1 to DC3; data memory axis ticks 4, 6, 8]
Task: 1 kB code memory, 4 kB data memory
Communication: 2 kB data memory
8 CLP Introduction
- "Constraint programming represents one of the closest approaches computer science has yet made to the Holy Grail of programming: the user states the problem, the computer solves it."
- Eugene C. Freuder, CONSTRAINTS, April 1997
 
9 CLP Introduction
- Relatively young and attractive approach for modeling many types of optimization problems
- Many heterogeneous applications of constraint programming exist today
- State the decision variables which constitute the solution
- State the constraints which must be satisfied by the solution
- Search for solutions using knowledge you can derive from the constraints
10 Constraint properties
- may specify partial information: a constraint need not uniquely specify the values of its variables,
- non-directional: typically one can infer a constraint on each present variable,
- declarative: specify a relationship, not a procedure to enforce this relationship,
- additive: the order of imposing constraints does not matter,
- rarely independent: typically they share variables.
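The non-directional property can be illustrated with a small bounds-propagation sketch. This is not taken from the presentation: the function name `propagate_sum`, the interval representation of domains, and the example constraint X + Y = Z are assumptions chosen only for illustration. The point is that one constraint narrows the domain of every variable it mentions, whichever of them was restricted first.

```python
def propagate_sum(x, y, z):
    """Narrow interval domains (lo, hi) of X, Y, Z under X + Y = Z.

    Hypothetical helper for illustration; repeats until nothing changes.
    """
    while True:
        # Z is bounded by the sum of the bounds of X and Y.
        new_z = (max(z[0], x[0] + y[0]), min(z[1], x[1] + y[1]))
        # The same constraint, read "backwards", bounds X and Y.
        new_x = (max(x[0], new_z[0] - y[1]), min(x[1], new_z[1] - y[0]))
        new_y = (max(y[0], new_z[0] - new_x[1]), min(y[1], new_z[1] - new_x[0]))
        if any(lo > hi for lo, hi in (new_x, new_y, new_z)):
            raise ValueError("constraint cannot be satisfied")
        if (new_x, new_y, new_z) == (x, y, z):
            return x, y, z
        x, y, z = new_x, new_y, new_z


# Restricting Z to at most 6 also restricts X, although X was never touched.
print(propagate_sum((0, 10), (3, 5), (0, 6)))
```

With these inputs the domain of X shrinks from 0..10 to 0..3 and the lower bound of Z rises to 3, without any search.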
11 A simple constraint problem
1. Specify all decision variables and their initial domains
Natural language description: There are three tasks, namely T1, T2, and T3. Each of these tasks can execute on either of two available processors, P1 and P2. Tasks T1 and T2 send data to task T3. The tasks should be assigned and scheduled in such a way that the schedule length does not exceed 10 seconds.
CLP description: TP1, TP2, TP3 :: 1..2, TS1, TS2, TS3 :: 0..10, Cost :: 0..10,
12 A simple constraint problem
2. Specify all constraints and additional variables
The execution time of task T1 is four seconds on processor P1 and two seconds on processor P2. Task T2 requires three and five seconds to complete execution on processors P1 and P2, respectively. Task T3 always needs three seconds for execution.
If TP1 = 1 then TD1 = 4, If TP1 = 2 then TD1 = 2,
If TP2 = 1 then TD2 = 3, If TP2 = 2 then TD2 = 5,
TD3 = 3,
13 A simple constraint problem
Tasks T1 and T2 must execute on different processors. Tasks T1 and T2 send data to task T3. If two communicating tasks are executed on different processors, there must be at least one second of delay between them so the data can be transferred. The tasks should be assigned and scheduled in such a way that the schedule length does not exceed 10 seconds.
TP1 != TP2,
If TP1 != TP3 then D1 = 1 else D1 = 0, TS1 + TD1 + D1 <= TS3, ...,
Cost >= TS1 + TD1, Cost >= TS2 + TD2, Cost >= TS3 + TD3.
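The three-task problem is small enough to check by plain enumeration. The sketch below is not the CLP program from the slides and uses no constraint solver: it simply tries every value of the decision variables and keeps a solution with the smallest Cost. The delay D2 between T2 and T3 is an assumption, written by symmetry with D1 because the natural language description says both T1 and T2 send data to T3. The names `TD`, `HORIZON`, and `solve` are illustrative.

```python
from itertools import product

# Execution times TD[task][processor] in seconds, as stated on the slides.
TD = {1: {1: 4, 2: 2}, 2: {1: 3, 2: 5}, 3: {1: 3, 2: 3}}
HORIZON = 10  # upper bound of the TS and Cost domains


def solve():
    """Enumerate all assignments and start times; return a minimum-Cost solution."""
    best = None
    for tp in product((1, 2), repeat=3):              # TP1, TP2, TP3 in 1..2
        tp1, tp2, tp3 = tp
        if tp1 == tp2:                                # TP1 != TP2
            continue
        td = (TD[1][tp1], TD[2][tp2], TD[3][tp3])
        d1 = 1 if tp1 != tp3 else 0
        d2 = 1 if tp2 != tp3 else 0                   # assumed, by symmetry with D1
        for ts in product(range(HORIZON + 1), repeat=3):  # TS1, TS2, TS3 in 0..10
            ts1, ts2, ts3 = ts
            # Precedence plus communication delay towards T3.
            if ts1 + td[0] + d1 > ts3 or ts2 + td[1] + d2 > ts3:
                continue
            # Smallest Cost that is >= every task's end time.
            cost = max(start + dur for start, dur in zip(ts, td))
            if cost > HORIZON:
                continue
            if best is None or cost < best["cost"]:
                best = {"cost": cost, "tp": tp, "ts": ts}
    return best


print(solve())
```

Under these assumptions the minimum schedule length is 6 seconds: T1 runs on P2 from time 0, T2 on P1 from time 0, and T3 on P1 from time 3. A constraint solver reaches the same answer while visiting far fewer combinations, because propagation removes values before they are tried.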
14 Search Tree
15 Modeling
- Constraint Logic Programming (finite domain, CHIP solver)
- Global constraints (cumulative, diffn, sequence, etc.) reduce model complexity of the synthesis problem and exploit specific features of the problem
- Global constraints are useful for modeling placement problems and graph problems
- Problem-specific search heuristic for NP-hard problem
16 CLP Model
- Decision variables for task
  - TS: start time of the task execution
  - TP: resource on which task is executed
  - TDP: exact placement of task local data in memory
- Additional variables for task
  - TD: task duration
  - TCM and TDM: the amount of code and data memory for task execution
17 CLP Model
- Decision variables for data
  - DS: start time of the data communication
  - DB: resource on which data is communicated
  - DCP and DPP: exact placement of data in memory of the producer and consumer processor
- Additional variables for data
  - DD: data communication duration
 
18 CLP Model: Task Requirements
19 CLP Model: Data Requirements
[Figure: three panels over a time axis: data mem (prod), communication time, data mem (cons); labels DM, CU, DD, DA, DCP, DB, DPP, DS, DS + DD, TSp, TSc + TDc]
20 Simple Example
[Figure: diffn constraint example; labels T1, T2, T3, P1, P2, B1, C1, D1, D2, D1_p, D1_c, D1_e, D2_p, D2_c, D2_e, D3_e]
Diffn constraint
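What diffn requires can be written as a simple check: every object is a rectangle given by an origin and a length in each dimension, and no two rectangles may overlap. The sketch below only tests a finished placement; it does not do the pruning that the global constraint performs inside the solver. The two dimensions used here, time and memory address, and the tuple layout `(start, address, duration, size)` are assumptions for illustration.

```python
from itertools import combinations


def overlaps(a, b):
    """True if rectangles a and b overlap in both dimensions.

    Each rectangle is (start, address, duration, size): an illustrative layout.
    """
    return all(a[d] < b[d] + b[d + 2] and b[d] < a[d] + a[d + 2] for d in (0, 1))


def diffn(rects):
    """True if no two rectangles in the list overlap."""
    return not any(overlaps(a, b) for a, b in combinations(rects, 2))


# Two 4 kB task data blocks reuse addresses 0..3 one after the other,
# while a 2 kB communication buffer sits above them at address 4.
placement = [(0, 0, 4, 4), (4, 0, 3, 4), (2, 4, 3, 2)]
print(diffn(placement))                   # valid placement
print(diffn(placement + [(3, 2, 2, 4)]))  # collides with the first block
```

Rectangles that only touch along an edge, such as the first two blocks above, do not count as overlapping.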
21 Code Memory Constraint
[Figure: tasks T1 to T8 stacked on a processor up to the code memory limit]
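A code memory limit can be checked by summing the code sizes of all tasks assigned to each processor. The sketch assumes that code stays resident for the whole schedule, so time plays no role; the function name `code_memory_ok` and the dictionary inputs are illustrative and not part of the presented model.

```python
from collections import defaultdict


def code_memory_ok(assignment, code_mem, limit):
    """Check that the code memory used on each processor stays within its limit.

    assignment: task -> processor, code_mem: task -> kB, limit: processor -> kB.
    Assumes code is never unloaded, so usage is a plain sum per processor.
    """
    used = defaultdict(int)
    for task, proc in assignment.items():
        used[proc] += code_mem[task]
    return all(used[proc] <= limit[proc] for proc in used)


assignment = {"T1": "P1", "T2": "P1", "T3": "P2"}
code_mem = {"T1": 1, "T2": 1, "T3": 1}  # 1 kB per task
print(code_memory_ok(assignment, code_mem, {"P1": 2, "P2": 1}))
print(code_memory_ok(assignment, code_mem, {"P1": 1, "P2": 1}))
```

The first call passes because P1 holds exactly 2 kB; the second fails because two tasks no longer fit in 1 kB.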
22 Constraint types
- precedence constraints 
 - processing resources constraints 
 - communication resource constraints 
 - pipelining constraints 
 - code memory constraints 
 - data memory constraints
 
23 Task Assignment and Scheduling Heuristic
[Flowchart]
Choose a task from the ready task set with min(max(Ti)) to minimize schedule length
Assign the task to a processor with the minimal implementation cost ci
Schedule communications so that Ti is minimal
Assign data memory
Data memory estimate no. 1 holds? (Y/N)
Data memory estimate no. 2 holds? (Y/N)
Undo all decisions; choose a task which consumes the most data
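The first steps of the flowchart resemble greedy list scheduling. The sketch below is a plain list scheduler and not the presented heuristic: it picks the ready task and processor pair with the earliest finish time as a stand-in for the selection rule and cost function of the slides, and it omits the data memory estimates and the undo step. The function name `list_schedule`, its inputs, and the fixed one-second communication delay are assumptions.

```python
def list_schedule(durations, preds, comm_delay=1):
    """Greedy list scheduling: place the (task, processor) pair finishing earliest.

    durations: task -> {processor: time}; preds: task -> list of predecessors.
    Returns task -> (processor, start, end). Illustrative only.
    """
    procs = sorted({p for per_proc in durations.values() for p in per_proc})
    proc_free = {p: 0 for p in procs}   # time at which each processor becomes idle
    placed, finish, schedule = {}, {}, {}
    remaining = set(durations)
    while remaining:
        # A task is ready once all of its predecessors have been scheduled.
        ready = [t for t in sorted(remaining)
                 if all(q in finish for q in preds.get(t, ()))]
        if not ready:
            raise ValueError("task graph has a cycle")
        best = None
        for t in ready:
            for p in procs:
                if p not in durations[t]:
                    continue
                # Data from another processor arrives comm_delay later.
                data_ready = max(
                    (finish[q] + (comm_delay if placed[q] != p else 0)
                     for q in preds.get(t, ())),
                    default=0)
                start = max(proc_free[p], data_ready)
                end = start + durations[t][p]
                if best is None or end < best[0]:
                    best = (end, start, t, p)
        end, start, t, p = best
        placed[t], finish[t], proc_free[p] = p, end, end
        schedule[t] = (p, start, end)
        remaining.remove(t)
    return schedule


durations = {"T1": {1: 4, 2: 2}, "T2": {1: 3, 2: 5}, "T3": {1: 3, 2: 3}}
print(list_schedule(durations, {"T3": ["T1", "T2"]}))
```

On the three-task example with the execution times given earlier, this greedy rule places T1 on processor 2, T2 on processor 1, and T3 on processor 1 starting at time 3, for a schedule length of 6. A greedy choice gives no such guarantee in general, which is one reason the flowchart includes an undo step.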
24 Execution Cost
Ind  LowTS/PTS  LowCM/PCM
i-th task, n-th processor
ATS: available time slots, ACM: available code memory
25 Data and Communication Cost
i-th task, n-th processor 
26 Estimates
- Estimate no. 1
  - where S (Sn) is the set of tasks already scheduled on a processor (processor Pn), tasks tj are direct successors of task ti, and dij is the amount of data communicated between ti and tj.
- Estimate no. 2 uses the global constraint diffn and takes time into account
27 MATAS System
28 Synthesis Results - H.261 example
[Figure: Video Coding Algorithm H.261, including a DCT block]
29 Experimental results: H.261 example
30 Experimental results (random task graphs)
31 Main Contributions
- Definition of the extended task assignment and scheduling problem
- Inclusion of memory constraints to decrease the cost for data dominated applications
- Specialized search heuristic to solve resource constrained task assignment and scheduling
- CLP modeling framework to facilitate an efficient, clean, and readable problem definition
32 Conclusions and Future Work
- The synthesis problem is modeled as a constraint satisfaction problem and solved by the proposed heuristic,
- Good coupling between model and search method for efficient search space pruning,
- Memory constraints and pipelined designs are taken into account,
- Heterogeneous constraints can be modeled in CLP, an important advantage over other approaches
- Need for our own constraint engine implementation, approximate solutions, mixture of techniques
- Need for better lower bounds, problem-specific global constraints, designer interaction during search
34 Related Work
- J. Madsen, P. Bjorn-Jorgensen, "Embedded System Synthesis under Memory Constraints", CODES 99 (GA, only RAM)
- S. Prakash and A. Parker, "Synthesis of Application-Specific Heterogeneous Multiprocessor Systems", VLSI Signal Processing, 94 (MILP, no ASICs, optimal)
35 A simple constraint problem
There are three tasks, namely T1, T2, and T3. Each of these tasks can execute on either of two available processors, P1 and P2. Tasks T1 and T2 send data to task T3. Tasks T1 and T2 must execute on different processors for fault-tolerance reasons. The execution time of task T1 is four seconds on processor P1 and two seconds on processor P2. Task T2 requires three and five seconds to complete execution on processors P1 and P2, respectively. Task T3 always needs three seconds for execution. When two communicating tasks are executed on different processors, there must be a one-second delay between them so the data can be transferred. The tasks should be assigned and scheduled in such a way that the schedule length does not exceed 10 seconds.