Timing Analysis and Timing Predictability

Transcript
1
Timing Analysis and Timing Predictability
  • Reinhard Wilhelm

2
Real Hard Real-Time
  • Hard real-time systems, often in safety-critical
    applications, abound:
  • aeronautics, automotive and train industries,
    industrial automation


3
Hard Real-Time Systems
  • Embedded controllers with hard deadlines.
  • Need to statically know upper bounds on the
    execution times of all tasks
  • Commonly called the Worst-Case Execution Time
    (WCET)
  • Analogously, Best-Case Execution Time (BCET)

4
Who needs this Timing Analysis?
  • TTA
  • Synchronous languages
  • Stream-oriented people
  • UML real-time
  • Hand coders
  • Timed automata

5
Basic Notions
6
Structure of the Talk
  • Timing Analysis: a good story, only slightly
    cheating
  • Prediction of Cache Behavior: but there's more
    to it!
  • Timing Predictability
  • Variability of Execution Times: mostly in the
    memory hierarchy
  • Language constructs and their timing behavior
  • Components
  • Component-wise cache behavior prediction
  • RT CORBA
  • Components and Hard Real Time: a Challenge
  • Conclusion

7
Non-exhaustive Analysis
  • Assumption: the system under analysis is too big
    for an exhaustive analysis
  • Approximation/abstraction necessary
  • Resulting uncertainty produces intervals of
    execution times

8
Timing Analysis
9
Industrial Practice
  • Measurements: computing the maximum of some
    executions. This does not guarantee an upper
    bound on all executions.
  • Measurement has acquired a bad reputation and is
    now called observed worst-case execution time.
    Heavily used outside of Old Europe.

10
Once upon a Time, the World was Compositional

u_bound(if c then s1 else s2) =
    u_bound(c) + max{ u_bound(s1), u_bound(s2) }

Instruction costs (cycles):
    add          4
    mv m, Reg    12
    mv Reg, m    14
    mv Reg, Reg  1

u_bound(x = y + z) = time(mv y, R1) + time(mv z, R2)
                   + time(add R1, R2) + time(mv R1, x)
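In this compositional world, an upper bound can be computed by a simple
recursion over the program structure. Below is a minimal Python sketch,
assuming the fixed instruction costs from the table above and a made-up
AST encoding (the cost assumed for evaluating a condition is an
assumption, not from the slide):

  # Sketch: compositional upper-bound computation with fixed,
  # context-independent instruction costs, as in the "old world" above.
  COST = {"add": 4, "mv_mem_reg": 12, "mv_reg_mem": 14, "mv_reg_reg": 1}

  def u_bound(stmt):
      kind = stmt[0]
      if kind == "assign_add":           # x = y + z: load y, load z, add, store x
          return 2 * COST["mv_mem_reg"] + COST["add"] + COST["mv_reg_mem"]
      if kind == "cond":                 # assumed: one load plus one compare
          return COST["mv_mem_reg"] + COST["add"]
      if kind == "if":                   # ("if", cond, then_branch, else_branch)
          _, c, s1, s2 = stmt
          return u_bound(c) + max(u_bound(s1), u_bound(s2))
      if kind == "seq":                  # ("seq", [statements])
          return sum(u_bound(s) for s in stmt[1])
      raise ValueError(kind)

  # if c then x = y + z else x = y + z
  prog = ("if", ("cond",), ("assign_add",), ("assign_add",))
  print(u_bound(prog))                   # 16 + max(42, 42) = 58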
11
Modern Hardware Features
  • Modern processors increase (average) performance
    by caches, pipelines, and branch prediction
  • These features make
  • execution times history dependent and
  • WCET computation difficult
  • Execution times of instructions vary widely
  • Best case: everything goes smoothly: no cache
    miss, operands ready, needed resources free,
    branch correctly predicted
  • Worst case: everything goes wrong: all loads
    miss the cache, needed resources are occupied,
    operands are not ready
  • The span may be several hundred cycles

12
Access Times
[Figure: access times on the PPC 755 for an example statement x = a + b]
13
(Concrete) Instruction Execution
[Figure: concrete pipeline execution of a mul instruction through the
stages Fetch (I-cache miss?), Issue (unit occupied?), Execute
(multicycle?), and Retire (pending instructions?); the per-stage
latencies add up to the instruction's execution time]
14
Timing Accidents and Penalties
  • Timing Accident: cause for an increase of the
    execution time of an instruction
  • Timing Penalty: the associated increase
  • Types of timing accidents:
  • Cache misses
  • Pipeline stalls
  • Branch mispredictions
  • Bus collisions
  • Memory refresh of DRAM
  • TLB miss

15
Overall Approach: Natural Modularization
  • Processor-Behavior Prediction
  • Uses Abstract Interpretation
  • Excludes as many Timing Accidents as possible
  • Determines WCET for basic blocks (in contexts)
  • Worst-case Path Determination
  • Maps control flow graph to an integer linear
    program
  • Determines upper bound and associated path
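A rough sketch of the second step, the mapping of the control-flow graph
to an integer linear program (implicit path enumeration): the tiny CFG,
its block WCETs, and the loop bound are invented for illustration, and
the resulting ILP is only printed, not solved:

  # Sketch: emit the ILP for a made-up CFG with one loop.
  # x_B = execution count of block B, c_B = WCET of B (assumed values
  # that would come from processor-behavior prediction).
  wcet = {"entry": 10, "loop_head": 5, "body": 50, "exit": 10}
  edges = [("entry", "loop_head"), ("loop_head", "body"),
           ("body", "loop_head"), ("loop_head", "exit")]
  loop_bound = 100                       # assumed, from user or loop analysis

  objective = "max: " + " + ".join(f"{c} x_{b}" for b, c in wcet.items())
  constraints = ["x_entry = 1"]
  for b in wcet:                         # structural (flow) constraints
      incoming = [f"e_{u}_{v}" for (u, v) in edges if v == b]
      outgoing = [f"e_{u}_{v}" for (u, v) in edges if u == b]
      if incoming:
          constraints.append(f"x_{b} = " + " + ".join(incoming))
      if outgoing:
          constraints.append(f"x_{b} = " + " + ".join(outgoing))
  constraints.append(f"x_body <= {loop_bound} x_entry")   # loop bound

  print(objective)
  print("\n".join(constraints))

Maximizing the objective under these constraints yields the upper bound
and, via the non-zero counts, the associated worst-case path.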

16
Overall Structure
[Figure: tool architecture: static analyses, processor-behavior
prediction, worst-case path determination]
17
Murphy's Law in Timing Analysis
  • A naïve, but safe guarantee accepts Murphy's Law:
    any accident that may happen will happen
  • Consequence: hardware overkill is necessary to
    guarantee timeliness
  • Example: Alfred Rosskopf, EADS Ottobrunn,
    measured the performance of a PPC with all caches
    switched off (corresponds to the assumption that
    all memory accesses miss the cache).
    Result: slowdown by a factor of 30!

18
Fighting Murphy's Law
  • Static Program Analysis allows the derivation of
    invariants about all execution states at a
    program point
  • Derive Safety Properties from these invariants:
    certain timing accidents will never happen.
    Example: at program point p, instruction fetch
    will never cause a cache miss
  • The more accidents excluded, the lower the upper
    bound
  • (and the more accidents predicted, the higher the
    lower bound)

Warning: This story is good, but not always true!
19
True Benchmark Results
  • Airbus with flight-control system,
  • Mälardalen Univ. in industry projects,
  • Univ. Dortmund
  • have found overestimations of 10% by aiT.

20
Caches: Fast Memory on Chip
  • Caches are used because
  • fast main memory is too expensive
  • the speed gap between CPU and memory is too large
    and increasing
  • Caches work well in the average case:
  • programs access data locally (many hits)
  • programs reuse items (instructions, data)
  • access patterns are distributed evenly across the
    cache

21
Speed gap between processor and main RAM increases
[Figure: P. Marwedel]
22
Caches: How they Work
  • The CPU wants to read/write memory address a and
    sends a request for a to the bus
  • Cases:
  • Block m containing a is in the cache (hit): the
    request for a is served in the next cycle
  • Block m is not in the cache (miss): m is
    transferred from main memory to the cache, m may
    replace some block in the cache, and the request
    for a is served asap while the transfer still
    continues
  • Several replacement strategies (LRU, PLRU,
    FIFO, ...) determine which line to replace (see
    the sketch below)
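The lookup described above can be illustrated with a small concrete cache
simulation; the line size, the number of sets, and the associativity
below are assumed, not those of any particular processor:

  # Sketch: concrete set-associative cache with LRU replacement.
  LINE, SETS, WAYS = 32, 128, 4
  cache = [[] for _ in range(SETS)]      # per set: list of tags, MRU first

  def access(addr):
      """Return 'hit' or 'miss' and update the cache as the hardware would."""
      block = addr // LINE
      s, tag = block % SETS, block // SETS
      cset = cache[s]
      if tag in cset:                    # hit: served in the next cycle
          cset.remove(tag)
          cset.insert(0, tag)            # becomes most recently used
          return "hit"
      if len(cset) == WAYS:              # miss in a full set:
          cset.pop()                     # evict the least recently used block
      cset.insert(0, tag)                # load the new block
      return "miss"

  print([access(a) for a in (0, 64, 0, 16384, 0)])
  # ['miss', 'miss', 'hit', 'miss', 'hit']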

23
A-Way Set Associative Cache
[Figure: the CPU puts an address on the bus; the address prefix is
compared with the tags of the selected set; if no tag matches, the block
is fetched from main memory; byte select & align delivers Data Out]
24
LRU Strategy
  • Each cache set has its own replacement logic ⇒
    cache sets are independent: everything is
    explained in terms of one set
  • LRU Replacement Strategy:
  • replace the block that has been Least Recently
    Used
  • modeled by ages (see the sketch below)
  • In the following: 4-way set associative cache
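The "ages" view of a single 4-way set can be sketched as follows (the
associativity is assumed): the accessed block gets age 0, every block
that was younger than it ages by one, and a block whose age reaches the
associativity is evicted:

  # Sketch: LRU for one set, modeled by ages (0 = most recently used).
  WAYS = 4

  def lru_access(ages, block):
      """ages: dict block -> age, containing only blocks in the set."""
      old = ages.get(block, WAYS)        # WAYS means "not in the set"
      for b, a in list(ages.items()):
          if a < old:
              ages[b] = a + 1            # blocks younger than `block` age
          if ages[b] >= WAYS:
              del ages[b]                # aged out: evicted
      ages[block] = 0
      return ages

  s = {}
  for b in ["a", "b", "c", "d", "a", "e"]:
      s = lru_access(s, b)
  print(s)   # {'a': 1, 'c': 3, 'd': 2, 'e': 0}; 'b' has been evicted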

25
A-Way Set Associative Cache
[Figure: the same set-associative cache diagram as on slide 23]
26
Cache Analysis
  • Static precomputation of cache contents at each
    program point
  • Must Analysis: which blocks are always in the
    cache. Determines safe information about cache
    hits. Each predicted cache hit reduces the upper
    bound.
  • May Analysis: which blocks may be in the cache.
    The complement says what is never in the cache.
    Determines safe information about cache misses.
    Each predicted cache miss increases the lower
    bound.

27
Cache with LRU Replacement: Transfer for must
28
Cache Analysis: Join (must)
[Figure: join (must) at a control-flow merge: intersection of the
incoming abstract cache states, keeping the maximal age]
An access to a memory block a that is in the must state is a cache hit
(see the sketch below).
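A sketch of the must analysis for one LRU set (associativity assumed):
an abstract state maps blocks to upper bounds on their age, the transfer
mirrors the LRU update, the join intersects the incoming states keeping
the maximal age, and membership in the state means a guaranteed hit:

  # Sketch: must analysis for one 4-way LRU set.
  WAYS = 4

  def must_update(state, block):
      """Abstract transfer for an access to `block`."""
      old = state.get(block, WAYS)
      new = {}
      for b, a in state.items():
          if b == block:
              continue
          na = a + 1 if a < old else a   # blocks younger than `block` age
          if na < WAYS:
              new[b] = na                # bound reached associativity: drop
      new[block] = 0
      return new

  def must_join(s1, s2):
      """Join at control-flow merges: intersection, maximal (safer) age."""
      return {b: max(s1[b], s2[b]) for b in s1.keys() & s2.keys()}

  def always_hit(state, block):
      return block in state              # age bound < WAYS by construction

  s = {}
  for b in ["a", "b", "c"]:
      s = must_update(s, b)
  print(s, always_hit(s, "a"), always_hit(s, "x"))
  # {'a': 2, 'b': 1, 'c': 0} True False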
29
Cache with LRU Replacement: Transfer for may
30
Cache Analysis: Join (may)
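Analogously, a sketch of the may analysis for the same 4-way set: ages
are now lower bounds, the join takes the union with the minimal age, and
a block absent from the state is definitely not in the cache, i.e. a
predicted miss:

  # Sketch: may analysis for one 4-way LRU set.
  WAYS = 4

  def may_update(state, block):
      old = state.get(block, WAYS)
      new = {}
      for b, a in state.items():
          if b == block:
              continue
          na = a + 1 if a <= old else a  # blocks that may be younger age
          if na < WAYS:
              new[b] = na
      new[block] = 0
      return new

  def may_join(s1, s2):
      """Join at control-flow merges: union, minimal (safer) age."""
      return {b: min(s1.get(b, WAYS), s2.get(b, WAYS))
              for b in s1.keys() | s2.keys()}

  def always_miss(state, block):
      return block not in state

  print(may_join({"a": 1, "b": 0}, {"a": 3, "c": 2}))
  # contains a -> 1, b -> 0, c -> 2 (iteration order may vary)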
31
Cache Analysis
Approximation of the Collecting Semantics
32
Timing Predictability
33
Basic Notions
34
Variability of Execution Times
  • is at the heart of timing unpredictability,
  • is introduced at all levels of granularity:
  • Memory reference
  • Instruction execution
  • Function
  • Task
  • Distributed system of tasks
  • Service

35
Penalties for Memory Accesses (in cycles, for the
PowerPC 755)

  Cache miss                                  40
  Cache miss with write-back                  80
  TLB miss and loading (12 reads, 1 write)   500
  Memory-mapped I/O                          800
  Page fault                                2000

Remember: penalties have to be assumed for
uncertainties!
Tendency: increasing, since clock speeds grow faster
than everything else
36
Further Penalties - Processor periphery
  • Bus protocol
  • DMA

37
Cache Impact of Language Constructs
  • Pointer to data
  • Function pointer
  • Dynamic method invocation
  • Service demultiplexing (CORBA)

38
Cache with LRU Replacement: Transfer for must
under an unknown access, e.g. an unresolved data pointer
[Figure: the transfer applied to the set of abstract cache sets]
If the address is completely undetermined, the same loss
of information occurs in every cache set (see the sketch
below)!
Analogously, for multiple unknown accesses, e.g. an
unknown function pointer: assume maximal cache damage
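A sketch of the must transfer for such a completely unresolved access,
following the rule on this slide: the access may map to any set, so every
set has to age all of its blocks by one (the whole-cache representation
and the associativity are assumed):

  # Sketch: must transfer for an access whose address is unknown.
  WAYS = 4

  def must_unknown_access(cache):
      """cache: list of per-set must states (dict block -> age upper bound)."""
      return [{b: a + 1 for b, a in s.items() if a + 1 < WAYS}
              for s in cache]

  sets = [{"a": 0, "b": 3}, {"c": 2}, {}]
  print(must_unknown_access(sets))       # [{'a': 1}, {'c': 3}, {}]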
39
Dynamic Method Invocation
  • Traversal of a data structure representing the
    class hierarchy
  • Corresponding worst-case execution time and
    resulting cache damage

40
Components
  • Component-wise cache-behavior prediction
  • a pragmatic, very simplistic notion of
    Component, i.e. a unit of analysis (or compilation)
  • A DAG of components defined by the calling
    relationship; cycles only inside components
  • RT CORBA (just to frighten you)
  • A Challenge: Components with predictable timing
    behavior

41
Component-wise I-Cache Analysis
  • So far, the analysis is done on fully linked
    executables, i.e. all allocation information is
    available
  • Allocation sensitivity:
  • Placing a module into the executable at a
    different address changes the mapping from memory
    blocks to sets
  • ⇒ Analyze the component under some allocation
    assumption; enforce a cache-equivalent allocation
    by influencing the linker
  • Cache damage due to calls to a different
    component:
  • The caller's memory blocks can be evicted by the
    callee's blocks
  • The callee's blocks stay in the cache after return
  • ⇒ Cache-damage analysis

42
Cache Damage Analysis
  • The caller's memory blocks can be evicted from the
    cache during the call: the cache damage
  • The callee's memory blocks are in the cache when
    returning from the call: the cache residue
  • Cache damage analysis computes safe bounds on the
    number of replacements in the sets:
  • Must analysis: upper bound
  • May analysis: lower bound

43
Cache Damage Analysis
  • The bound on replacements in a set is increased
    when the accessed memory block mapped to this set
    is not yet in the cache (see the sketch below)
  • Combined update and join functions
  • Use these new functions for the fixed-point
    computation
  • The cache damage update combines the results at
    the function return
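A rough sketch of this damage bookkeeping for one set, simplified to a
straight-line access sequence of the callee instead of the real
fixed-point computation over its control flow; the must update is the
one from the must-analysis sketch above:

  # Sketch: upper bound on the cache damage a callee causes in one set.
  WAYS = 4

  def must_update(state, block):         # as in the must-analysis sketch
      old = state.get(block, WAYS)
      new = {}
      for b, a in state.items():
          if b == block:
              continue
          na = a + 1 if a < old else a
          if na < WAYS:
              new[b] = na
      new[block] = 0
      return new

  def damage_for_sequence(accesses, start=None):
      """Bound on replacements plus the callee's residue for this set."""
      state = {} if start is None else dict(start)
      damage = 0
      for block in accesses:
          if block not in state:         # possible miss -> possible eviction
              damage = min(damage + 1, WAYS)   # a set cannot lose more blocks
          state = must_update(state, block)
      return damage, state

  print(damage_for_sequence(["p", "q", "p", "r"])[0])   # 3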

44
Cache Damage and Residue
[Figure: foo() calls bar(); the must and may states before and after the
call illustrate the cache damage and the cache residue]
45
Cache Damage and Residue (2)
[Figure: foo() calls bar(); must and may states, continued]
46
Proposed Analysis Method
  • Input: DAG of the inter-module call relationship
  • Bottom-up analysis (see the sketch below):
  • Start from the non-calling modules
  • For each module:
  • Analyze all functions
  • Initial assumptions:
  • Must analysis: cache is empty
  • May analysis: everything can be in the cache with
    age 0
  • For external calls, use the results of the
    cache-damage analysis
  • Store the results of the module analysis
  • They will be used during the analysis of calling
    modules
  • Compose the analysis results of all modules
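A sketch of this bottom-up traversal; the module DAG, the analyse
callback, and the toy "damage" summaries below are placeholders for the
real per-module cache analysis:

  # Sketch: bottom-up, component-wise analysis over the module DAG.
  from graphlib import TopologicalSorter     # Python 3.9+

  def analyse_components(calls, analyse):
      """calls: module -> list of called modules; callees are analysed first."""
      summaries = {}
      for module in TopologicalSorter(calls).static_order():
          callee_summaries = {c: summaries[c] for c in calls.get(module, ())}
          summaries[module] = analyse(module, callee_summaries)
      return summaries

  # toy example: each module contributes 1 plus its callees' "damage"
  calls = {"app": ["libio", "libmath"], "libio": [], "libmath": []}
  analyse = lambda m, callees: 1 + sum(callees.values())
  print(analyse_components(calls, analyse))
  # {'libio': 1, 'libmath': 1, 'app': 3} (order of the leaves may vary)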

47
Real-Time CORBA
  • Attempt to achieve end-to-end middleware
    predictability for distributed real-time systems
  • Real-Time CORBA is a middleware standard
  • Real-Time Specification for Java (RTSJ):
  • new memory management models, no GC,
  • access to physical memory,
  • strong guarantees on thread semantics

48
RT CORBA
[Figure: D. Schmidt et al., Towards Predictable Real-Time Java, 2003]
49
Making Demultiplexing Predictable
  • Dynamics in CORBA:
  • POAs are activated/deactivated dynamically
  • Servants within a POA are activated/deactivated
    dynamically
  • Interface definitions and the sets of names of
    operations are static ⇒ use perfect hashing for
    demultiplexing (see the sketch below)
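A toy sketch of perfect hashing for demultiplexing: because the set of
operation names is static, a collision-free hash can be chosen offline,
and the run-time lookup is a single probe with a constant,
input-independent cost. The operation names and the brute-forced seed
search are illustrative only, not how an actual ORB implements this:

  # Sketch: offline construction of a collision-free (perfect) hash table.
  import hashlib

  OPS = ["push", "pop", "top", "size", "is_empty"]   # hypothetical IDL operations
  SIZE = 8

  def hash_op(name, seed):
      digest = hashlib.sha256(f"{seed}:{name}".encode()).digest()
      return int.from_bytes(digest[:4], "big") % SIZE

  def find_perfect_seed(names):
      """Offline: brute-force a seed under which all names get distinct slots."""
      for seed in range(1_000_000):
          if len({hash_op(n, seed) for n in names}) == len(names):
              return seed
      raise RuntimeError("no perfect seed found; enlarge the table")

  SEED = find_perfect_seed(OPS)
  TABLE = [None] * SIZE
  for op in OPS:
      TABLE[hash_op(op, SEED)] = op                  # filled once, offline

  def demultiplex(op):
      """Run time: one probe, no collision chain, hence predictable timing."""
      entry = TABLE[hash_op(op, SEED)]
      return entry if entry == op else None

  print(SEED, [demultiplex(op) for op in OPS], demultiplex("unknown_op"))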

50
[Figure: D. Schmidt et al., Enhancing RT-CORBA]
51
Timing Predictability
  • Reconciling Predictability with X, where
  • X = (average-case) performance
  • X = fault tolerance
  • X = reusability/implementation independence

52
Components with Predictable Timing Behavior - a
Challenge -
  • Needs HW and tool support:
  • decreasing the variability of execution times by
    combining static with dynamic mechanisms, e.g.
    cache freezing, scratchpad memory
  • Needs Occam's razor for the language-concept
    design
  • Hard real-time systems, often safety critical,
    have different requirements and priorities than
    systems realized with middleware and components,
    e.g. less frequent updates, no easy
    exchangeability of components

53
Acknowledgements
  • Christian Ferdinand, whose thesis started all
    this
  • Reinhold Heckmann, Mister Cache
  • Florian Martin, Mister PAG
  • Stephan Thesing, Mister Pipeline
  • Michael Schmidt, Value Analysis
  • Henrik Theiling, Mister Frontend Path Analysis
  • Jörn Schneider, OSEK
  • Oleg Parshin, Components

54
Recent Publications
  • R. Heckmann et al.: The Influence of Processor
    Architecture on the Design and the Results of
    WCET Tools, Proceedings of the IEEE, July 2003
  • C. Ferdinand et al.: Reliable and Precise WCET
    Determination for a Real-Life Processor, EMSOFT
    2001
  • H. Theiling: Extracting Safe and Precise Control
    Flow from Binaries, RTCSA 2000
  • M. Langenbach et al.: Pipeline Modeling for
    Timing Analysis, SAS 2002
  • S. Thesing et al.: An Abstract
    Interpretation-based Timing Validation of Hard
    Real-Time Avionics Software, IPDS 2003
  • R. Wilhelm: AI + ILP is good for WCET, MC is not,
    nor ILP alone, VMCAI 2004
  • O. Parshin et al.: Component-wise Data-cache
    Behavior Prediction, ATVA 2004
  • L. Thiele, R. Wilhelm: Design for Timing
    Predictability, 25th anniversary edition of the
    Kluwer journal Real-Time Systems, Dec. 2004
  • R. Wilhelm: Determination of Bounds on Execution
    Times, CRC Handbook on Embedded Systems, 2005