Scratchpad Allocation for Concurrent Embedded Software - PowerPoint PPT Presentation

1
Scratchpad Allocation for Concurrent Embedded
Software
  • Vivy Suhendra
  • Abhik Roychoudhury
  • Tulika Mitra
  • National University of Singapore

2
Scratchpad Memory
  • Software-managed on-chip fast memory
  • Better timing predictability than caches
  • Real-time guarantees
  • Content selection
  • Beneficial memory blocks
  • Runtime management
  • Usage by active tasks

3
Contribution
  • Sequential application
  • Individual tasks
  • Worst-case execution time (WCET)

Scratchpad Allocation
  • Concurrent application
  • Interacting tasks
  • Control/data dependency
  • Preemption
  • Worst-case response time (WCRT)

4
System Model
  • Message Sequence Chart (MSC)

[Figure: MSC showing Radio Control, SPI Control, and FBW Main; labels: task, message communication, preemptive]
5
Motivating Example
[Figure: scratchpad contents over time, with two reload points]
Time-multiplexing
More opportunities!
6
Considerations
  • Sharing decision ↔ lifetime analysis
  • Feedback loop

7
Workflow
8
Workflow
  • Initialize: empty allocation, full interference
  • Task analysis → task WCETs, memory profiles
  • WCRT analysis → task lifetimes, interference graph
  • Scratchpad sharing scheme and allocation → scratchpad allocation decision
  • Interference improves? Yes: iterate again. No: stop.
9
Task Analysis
  • Static timing analysis
  • Worst-case execution time path
  • Tool: Chronos
  • Micro-architecture modeling
  • Infeasible path detection
  • Memory profile
  • This work: code allocation (basic block)
  • Area of code blocks
  • Gain if allocated
  • Execution frequency × reduction in latency
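
The gain entry of a memory profile can be sketched as below. This is a minimal illustration, not the Chronos implementation: it assumes gain is the block's execution frequency on the WCET path times the latency saved per fetch, and that a block of `size_bytes` needs one fetch per `fetch_width` bytes. The `Block` type, its field names, and the two sample blocks are hypothetical; the default latencies and fetch width mirror the evaluation parameters listed on slide 24.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    size_bytes: int  # area of the code block
    exec_freq: int   # executions along the WCET path (assumed to be known)


def gain(block, fetch_width=16, mem_latency=100, spm_latency=1):
    """Cycles saved on the WCET path if the block is placed in scratchpad."""
    fetches = -(-block.size_bytes // fetch_width)  # ceiling division
    return block.exec_freq * fetches * (mem_latency - spm_latency)


# Hypothetical profile: a hot loop body and a cold initialisation block.
profile = [Block("loop_body", 64, 50), Block("init", 128, 1)]
gains = {b.name: gain(b) for b in profile}
```

Under these assumptions the small, frequently executed block is worth far more scratchpad space than the larger block that runs once.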

11
WCRT Analysis
  • Yen & Wolf, TPDS 1998
  • Compute earliest, latest start and finish times
  • Computation time: Finish(t) = Start(t) + WCRT(t)
  • Task dependencies: Start(u) ≥ Finish(t) when u depends on t
  • ⇒ iterative tightening of bounds
  • WCRT(t) is a function of
  • WCET(t), and
  • Delay by higher-priority tasks whose lifetimes
    overlap t's
  • ⇒ fixed-point computation
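
The fixed-point computation can be sketched as follows. This is a simplified illustration under stated assumptions, not the Yen & Wolf algorithm itself: a task's lifetime is taken as the interval from its earliest start to its latest finish, earliest times use a best-case execution time (BCET), each overlapping higher-priority task adds its WCET once as delay, a larger priority number means higher priority, and tasks ordered by a dependency never preempt each other. The function name, the tuple layout, and the three sample tasks are hypothetical.

```python
def wcrt_fixed_point(tasks, deps, max_iter=100):
    """tasks: {name: (bcet, wcet, priority)}, listed in topological order.
    deps: (t, u) pairs meaning u may start only after t finishes."""
    preds = {t: [] for t in tasks}
    for t, u in deps:
        preds[u].append(t)

    # Transitive predecessors; relies on the topological order of `tasks`.
    ancestors = {}
    for t in tasks:
        ancestors[t] = set(preds[t]).union(*(ancestors[p] for p in preds[t]))

    def ordered(t, u):
        return t in ancestors[u] or u in ancestors[t]

    # Earliest start and finish: best case, no interference.
    est, eft = {}, {}
    for t, (bcet, _, _) in tasks.items():
        est[t] = max((eft[p] for p in preds[t]), default=0)
        eft[t] = est[t] + bcet

    def latest_finish(wcrt):
        lft = {}
        for t in tasks:
            latest_start = max((lft[p] for p in preds[t]), default=0)
            lft[t] = latest_start + wcrt[t]  # Finish(t) = Start(t) + WCRT(t)
        return lft

    def bound(t, lft):
        """WCET(t) plus delay from higher-priority tasks that may overlap t.
        With lft=None every such task is assumed to overlap."""
        delay = 0
        for u, (_, wcet_u, prio_u) in tasks.items():
            if u == t or prio_u <= tasks[t][2] or ordered(t, u):
                continue
            if lft is None or (est[t] < lft[u] and est[u] < lft[t]):
                delay += wcet_u
        return tasks[t][1] + delay

    wcrt = {t: bound(t, None) for t in tasks}  # start from full interference
    for _ in range(max_iter):
        tightened = {t: bound(t, latest_finish(wcrt)) for t in tasks}
        if tightened == wcrt:  # fixed point reached
            break
        wcrt = tightened
    return wcrt, est, latest_finish(wcrt)


tasks = {"L1": (3, 4, 1), "L2": (3, 5, 1), "H": (1, 2, 2)}
wcrt, est, lft = wcrt_fixed_point(tasks, deps=[("L1", "L2")])
```

In this example the high-priority task H can only run during L1's lifetime, so the second iteration drops its delay from L2's bound.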

12
WCRT Analysis
Application WCRT
13
WCRT Analysis
  • Changed interference pattern after allocation?

Problem returns!
14
Adjustment
W2
Slack enforcement
16
Profile-based Knapsack (PK)
Distribute space among tasks based on memory
profiles
17
Profile-based Knapsack (PK)
  • Integer Linear Programming
  • Objective minimize
  • Capacity constraint
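
The capacity-constrained selection can be illustrated with a small 0/1 knapsack. The slides formulate the problem as an ILP; the sketch below swaps in a dynamic-programming knapsack instead, and assumes the goal is to maximize the total gain of the blocks placed in scratchpad. The function name, block names, sizes, and gains are hypothetical.

```python
def knapsack_allocate(blocks, capacity):
    """blocks: list of (name, size_bytes, gain).
    Returns (total_gain, chosen_names) for a scratchpad of `capacity` bytes."""
    # best[c] holds the best (gain, chosen) using at most c bytes.
    best = [(0, [])] * (capacity + 1)
    for name, size, block_gain in blocks:
        nxt = best[:]  # copy so each block is used at most once
        for c in range(size, capacity + 1):
            candidate = best[c - size][0] + block_gain
            if candidate > nxt[c][0]:
                nxt[c] = (candidate, best[c - size][1] + [name])
        best = nxt
    return best[capacity]


# Hypothetical memory profiles of two tasks sharing a 192-byte scratchpad.
blocks = [
    ("t1.b0", 64, 19800),
    ("t1.b1", 128, 792),
    ("t2.b0", 96, 9900),
    ("t2.b1", 32, 5000),
]
total_gain, chosen = knapsack_allocate(blocks, capacity=192)
```

The space each task receives falls out of the selection: here task t1 gets 64 bytes and task t2 gets 128 bytes.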

18
Interference Clustering (IC)
[Figure: tasks fm1, fm2, fm4, fr0, fr1, fs0]
  • Isolate interference in clusters
  • Intra-cluster: space distribution
  • Inter-cluster: time-multiplexing
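
One way to isolate interference in clusters is sketched below, assuming a cluster is a connected component of the interference graph. That definition is an assumption made for illustration, and the edges are hypothetical; only the task names come from the slide.

```python
def interference_clusters(tasks, edges):
    """Group tasks into connected components of the interference graph."""
    adj = {t: set() for t in tasks}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    seen, clusters = set(), []
    for t in tasks:
        if t in seen:
            continue
        seen.add(t)
        stack, component = [t], []
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            component.append(node)
            for other in adj[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        clusters.append(sorted(component))
    return clusters


tasks = ["fm1", "fm2", "fm4", "fr0", "fr1", "fs0"]
edges = [("fm1", "fr0"), ("fr0", "fs0"), ("fm2", "fr1")]  # hypothetical
clusters = interference_clusters(tasks, edges)
```

Tasks inside one cluster would then split the scratchpad space among themselves, while different clusters can each reuse the whole scratchpad at different times.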

19
Graph Coloring (GC)
  • Refined interference relation
  • Same color: time-multiplexing
  • Inter-color: space distribution

ILP adjustment
[Figure: tasks fm1, fm2, fm4, fr0, fr1, fs0]
20
Interference Reduction
Slack enforcement
Eliminate chosen interference to allow better sharing scheme
[Figure: tasks fm1, fm2, fm4, fr0, fr1, fs0, shown twice]
21
Critical Path Interference Reduction (CR)
  • Find interferences on critical path
  • (t, u): t on critical path, u preempts t
  • Identify interference with worst impact
  • Longest duration of preemption
  • Eliminate the interference
  • Impose slack
  • Re-analyze WCRT
  • Propagate lifetime shift
  • Iterate
  • Stop when no more improvement
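
The iteration above can be sketched as a greedy loop. This is only an outline under assumptions: `analyze` is a hypothetical callback standing in for the WCRT re-analysis (including slack imposition and lifetime-shift propagation), and `toy_analyze` with its numbers is invented purely to exercise the loop.

```python
def reduce_critical_interference(analyze, max_rounds=50):
    """analyze(slacks) -> (app_wcrt, critical_path, preemption), where
    preemption maps (t, u) to the time u preempts t under the given slacks."""
    slacks = set()
    best_wcrt, path, preemption = analyze(slacks)
    for _ in range(max_rounds):
        # Interferences (t, u) with t on the critical path.
        on_path = {p: d for p, d in preemption.items()
                   if p[0] in path and p not in slacks}
        if not on_path:
            break
        worst = max(on_path, key=on_path.get)  # longest preemption
        trial = slacks | {worst}               # impose slack on that pair
        wcrt, new_path, new_preemption = analyze(trial)
        if wcrt >= best_wcrt:                  # no more improvement: stop
            break
        slacks, best_wcrt = trial, wcrt
        path, preemption = new_path, new_preemption
    return best_wcrt, slacks


def toy_analyze(slacks):
    """Hypothetical stand-in for WCRT analysis, not a real timing model."""
    base = {("fm1", "fr0"): 30, ("fm2", "fr1"): 10}
    remaining = {p: d for p, d in base.items() if p not in slacks}
    # Pretend that delaying fr1 costs more than the preemption it removes.
    penalty = 15 if ("fm2", "fr1") in slacks else 0
    return 100 + sum(remaining.values()) + penalty, ["fm1", "fm2"], remaining


final_wcrt, imposed = reduce_critical_interference(toy_analyze)
```

The loop keeps the first slack because it shortens the response time, and rejects the second because the re-analysis reports a longer one.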

22
Extension to Multiprocessor
[Figure: CPU1 with SPM1, CPU2 with SPM2]
23
UAV Application: PapaBench
Autopilot
Fly-By-Wire
24
Evaluation Parameters
  • Scratchpad latency: 1 cycle
  • Main memory latency: 100 cycles
  • Fetch width: 16 B (2 instructions)
  • Task code sizes: 96 B to 6.3 KB
  • Total scratchpad size: 512 B to 8 KB
  • 1, 2, 4 processors

25
1-PE Configuration
26
2-PE Configuration
27
4-PE Configuration
28
Concluding Remarks
  • Scratchpad allocation considering concurrent
    application
  • Process interaction significantly affects
    application response time
  • Justifies interference reduction via slack
    enforcement

29
vivy, abhik, tulika _at_comp.nus.edu.sg
  • Contact Us

30
Related Work
31
Graph Coloring (GC)
  • NP-complete: need heuristics
  • Welsh-Powell algorithm
  • Initialize all nodes to uncolored
  • Traverse the nodes in decreasing order of degree
  • Assign color 1 to an uncolored node if no
    adjacent node has been assigned color 1
  • Repeat second step with colors 2, 3, etc.
  • Until no node is uncolored
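
The Welsh-Powell steps listed above can be sketched as follows. The adjacency map is hypothetical (only the task names appear in the slides), and ties between nodes of equal degree are broken alphabetically, which is a choice made here for determinism.

```python
def welsh_powell(adj):
    """Greedy coloring: adj maps each node to the set of its neighbours."""
    # Traverse nodes in decreasing order of degree (ties: alphabetical).
    order = sorted(adj, key=lambda n: (-len(adj[n]), n))
    color = {}  # nodes absent from this map are uncolored
    current = 0
    while len(color) < len(adj):
        current += 1
        for node in order:
            if node in color:
                continue
            # Assign the current color if no neighbour already has it.
            if all(color.get(nb) != current for nb in adj[node]):
                color[node] = current
    return color


# Hypothetical interference graph over the tasks named in the slides.
adj = {
    "fm1": {"fr0", "fs0"},
    "fr0": {"fm1", "fs0"},
    "fs0": {"fm1", "fr0"},
    "fm2": {"fr1"},
    "fr1": {"fm2"},
    "fm4": set(),
}
coloring = welsh_powell(adj)
```

Tasks that end up with the same color do not interfere with each other, so they are candidates for time-multiplexing the same scratchpad region.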

32
UAV Application: PapaBench
33
Algorithm Runtime