Transcript and Presenter's Notes

Title: Workload Modeling and its Effect on Performance Evaluation


1
Workload Modeling and its Effect on Performance Evaluation
  • Dror Feitelson
  • Hebrew University

2
Performance Evaluation
  • In system design
  • Selection of algorithms
  • Setting parameter values
  • In procurement decisions
  • Value for money
  • Meet usage goals
  • For capacity planning

3
The Good Old Days
  • The skies were blue
  • The simulation results were conclusive
  • Our scheme was better than theirs

Feitelson & Jette, JSSPP 1997
4
  • But in their papers,
  • Their scheme was better than ours!

5
  • How could they be so wrong?

6
Performance evaluation depends on
  • The system's design
  • (What we teach in algorithms and data structures)
  • Its implementation
  • (What we teach in programming courses)
  • The workload to which it is subjected
  • The metric used in the evaluation
  • Interactions between these factors

8
Outline for Today
  • Three examples of how workloads affect
    performance evaluation
  • Workload modeling
  • Getting data
  • Fitting, correlations, stationarity
  • Heavy tails, self similarity
  • Research agenda
  • In the context of parallel job scheduling

9
  • Example 1
  • Gang Scheduling and
  • Job Size Distribution

10
Gang What?!?
  • Time slicing parallel jobs with coordinated
    context switching
  • Ousterhout matrix

Ousterhout, ICDCS 1982
11
Gang What?!?
  • Time slicing parallel jobs with coordinated
    context switching
  • Ousterhout matrix
  • Optimization: alternative scheduling

Ousterhout, ICDCS 1982
12
Packing Jobs
  • Use a buddy system for allocating processors (see the sketch below)

Feitelson & Rudolph, Computer 1990
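A minimal Python sketch of such a buddy allocator (illustrative only; the class and method names are mine, not the paper's implementation):

```python
# Buddy allocation of processors: power-of-two blocks that are split on
# allocation and merged with their "buddy" on release (illustrative sketch).

class BuddyAllocator:
    def __init__(self, total):           # total processors, a power of two
        self.total = total
        self.free = {total: [0]}         # block size -> list of base offsets

    def alloc(self, size):
        """Allocate the smallest power-of-two block covering `size`."""
        need = 1
        while need < size:
            need *= 2
        s = need
        while s <= self.total and not self.free.get(s):
            s *= 2                       # look for a larger block to split
        if s > self.total:
            return None                  # no free block: the job must wait
        base = self.free[s].pop()
        while s > need:                  # split down, freeing upper halves
            s //= 2
            self.free.setdefault(s, []).append(base + s)
        return (base, need)              # internal fragmentation: need - size

    def release(self, base, size):
        """Free a block, merging with its buddy as long as possible."""
        while size < self.total:
            buddy = base ^ size          # buddy offset differs in one bit
            blocks = self.free.get(size, [])
            if buddy not in blocks:
                break
            blocks.remove(buddy)
            base = min(base, buddy)
            size *= 2
        self.free.setdefault(size, []).append(base)
```

A job asking for 3 processors gets a block of 4 (internal fragmentation), but since all allocations fall on the same predefined block boundaries, jobs in different slots of the Ousterhout matrix can reuse each other's processors for alternative scheduling.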
17
The Question
  • The buddy system leads to internal fragmentation
  • But it also improves the chances of alternative
    scheduling, because processors are allocated in
    predefined groups
  • Which effect dominates the other?

18
The Answer (part 1)
Feitelson & Rudolph, JPDC 1996
19
The Answer (part 2)
20
The Answer (part 2)
21
The Answer (part 2)
22
The Answer (part 2)
  • Many small jobs
  • Many sequential jobs
  • Many power-of-two jobs
  • Practically no jobs use the full machine
  • Conclusion: the buddy system should work well

23
Verification
Feitelson, JSSPP 1996
24
  • Example 2
  • Parallel Job Scheduling
  • and Job Scaling

25
Variable Partitioning
  • Each job gets a dedicated partition for the
    duration of its execution
  • Resembles 2D bin packing
  • Packing large jobs first should lead to better
    performance
  • But what about correlation of size and runtime?

26
Scaling Models
  • Constant work
  • Parallelism for speedup (Amdahl's Law)
  • Large first ⇒ SJF
  • Constant time
  • Size and runtime are uncorrelated
  • Memory bound
  • Large first ⇒ LJF
  • Full-size jobs lead to blockout
  • (toy formulas for the three models are sketched below)

Worley, SIAM JSSC 1990
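The three models can be caricatured as runtime-versus-size functions. A toy Python sketch; the formulas and constants are my illustrative assumptions, not the talk's definitions:

```python
import math

def constant_work(n, work=1000.0, serial_frac=0.05):
    # Fixed total work split across n processors (Amdahl's Law):
    # larger jobs finish sooner, so Large-First behaves like SJF.
    serial = work * serial_frac
    return serial + (work - serial) / n

def constant_time(n, t=100.0):
    # Work grows with the allocation, runtime stays flat:
    # size and runtime are uncorrelated.
    return t

def memory_bound(n, per_node_time=100.0, comm_factor=0.2):
    # The problem fills each node's memory, so total work grows with n
    # and larger jobs run longer: Large-First behaves like LJF.
    return per_node_time * (1.0 + comm_factor * math.log2(n))

for n in (1, 4, 16, 64):
    print(f"{n:3d}  {constant_work(n):7.1f}  "
          f"{constant_time(n):7.1f}  {memory_bound(n):7.1f}")
```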
27
Scan Algorithm
  • Keep jobs in separate queues according to size (sizes are powers of 2)
  • Serve the queues round-robin, scheduling all jobs from each queue (they pack perfectly; sketch below)
  • Assuming the constant work model, large jobs only block the machine for a short time
  • But the memory-bound model would lead to excessive queueing of small jobs

Krueger et al., IEEE TPDS 1994
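A minimal offline sketch of Scan under these assumptions (jobs as (size, runtime) tuples with power-of-two sizes; real Scan also handles online arrivals):

```python
from collections import deque

def scan_schedule(jobs, machine_size):
    """Serve per-size queues round-robin; identical sizes pack the
    machine perfectly, machine_size // s jobs per time slot."""
    queues = {}                                    # size class -> runtimes
    for size, runtime in jobs:
        queues.setdefault(size, deque()).append(runtime)
    schedule, t = [], 0.0
    while any(queues.values()):
        for s in sorted(queues):                   # one scan over classes
            q = queues[s]
            while q:                               # drain this class
                batch = [q.popleft()
                         for _ in range(min(machine_size // s, len(q)))]
                schedule.append((t, s, len(batch)))
                t += max(batch)                    # slot ends with longest job
    return schedule

print(scan_schedule([(1, 10), (2, 5), (4, 80), (1, 3)], machine_size=4))
```

Under constant work the full-size slots are short, so blocking the whole machine is cheap; under the memory-bound model those slots are the longest ones, and the small-job queues starve behind them.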
28
The Data
29
The Data
30
The Data
31
The Data
Data: SDSC Paragon, 1995/6
32
The Data
Data: SDSC Paragon, 1995/6
33
The Data
Data: SDSC Paragon, 1995/6
34
Conclusion
  • Parallelism used for better results, not for
    faster results
  • Constant work model is unrealistic
  • Memory bound model is reasonable
  • Scan algorithm will probably not perform well in
    practice

35
  • Example 3
  • Backfilling and
  • User Runtime Estimation

36
Backfilling
  • Variable partitioning can suffer from external
    fragmentation
  • Backfilling optimization: move jobs forward to fill holes in the schedule
  • Requires knowledge of expected job runtimes

37
Variants
  • EASY backfilling: make a reservation for the first queued job (rule sketched below)
  • Conservative backfilling: make reservations for all queued jobs
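A hedged sketch of the EASY rule (the Job fields and free-node bookkeeping are illustrative, not a real scheduler's API): a candidate may backfill only if it can start now without delaying the first queued job's reservation.

```python
from dataclasses import dataclass

@dataclass
class Job:
    size: int        # processors requested
    estimate: float  # user-supplied runtime estimate

def can_backfill(candidate, first, free_now, now, release_times):
    """release_times: sorted (time, nodes_released) pairs of running jobs."""
    # Earliest time at which enough nodes accumulate for the first job.
    avail, reservation = free_now, now
    for t, n in release_times:
        if avail >= first.size:
            break
        avail += n
        reservation = t
    extra = avail - first.size           # nodes to spare at the reservation
    if candidate.size > free_now:
        return False                     # cannot start right now anyway
    # Either terminate before the reservation, or use only spare nodes.
    return (now + candidate.estimate <= reservation
            or candidate.size <= extra)

first = Job(size=64, estimate=3600.0)
candidate = Job(size=8, estimate=600.0)
print(can_backfill(candidate, first, free_now=10, now=0.0,
                   release_times=[(1800.0, 30), (2400.0, 40)]))  # True
```

Conservative backfilling runs the same check against the reservations of all queued jobs, not just the first.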

38
User Runtime Estimates
  • Lower estimates improve the chance of backfilling and hence response time
  • But estimates that are too low risk having the job killed
  • So estimates should be accurate, right?

39
They Aren't
Mualem & Feitelson, IEEE TPDS 2001
40
Surprising Consequences
  • Inaccurate estimates actually lead to improved
    performance
  • Performance evaluation results may depend on the
    accuracy of runtime estimates
  • Example: EASY vs. conservative
  • Using different workloads
  • And different metrics

41
EASY vs. Conservative
Using CTC SP2 workload
42
EASY vs. Conservative
Using Jann workload model
43
EASY vs. Conservative
Using Feitelson workload model
44
Conflicting Results Explained
  • Jann uses accurate runtime estimates
  • This leads to a tighter schedule
  • EASY is not affected too much
  • Conservative manages less backfilling of long jobs, because it respects more reservations

45
Conservative is bad for the long jobs, good for the short ones that are respected (figure comparing Conservative and EASY)
46
Conflicting Results Explained
  • Response time is sensitive to long jobs, which favor EASY
  • Slowdown is sensitive to short jobs, which favor conservative
  • All this does not happen at CTC, because
    estimates are so loose that backfill can occur
    even under conservative

47
Verification
Run CTC workload with accurate estimates
48
But What About My Model?
  • It simply does not have such small, long jobs

49
Workload Data Sources
50
No Data
  • Innovative, unprecedented systems
  • Wireless
  • Hand-held
  • Use an educated guess
  • Self similarity
  • Heavy tails
  • Zipf distribution

51
Serendipitous Data
  • Data may be collected for various reasons
  • Accounting logs
  • Audit logs
  • Debugging logs
  • Just-so logs
  • Can lead to a wealth of information

52
NASA Ames iPSC/860 log
  • 42050 jobs from Oct-Dec 1993
    user      job      nodes  runtime  date      time
    user4     cmd8      32        70   11/10/93  10:13:17
    user4     cmd8      32        70   11/10/93  10:19:30
    user42    nqs450    32      3300   11/10/93  10:22:07
    user41    cmd342     4        54   11/10/93  10:22:37
    sysadmin  pwd        1         6   11/10/93  10:22:42
    user4     cmd8      32        60   11/10/93  10:25:42
    sysadmin  pwd        1         3   11/10/93  10:30:43
    user41    cmd342     4       126   11/10/93  10:31:32

Feitelson & Nitzberg, JSSPP 1995
53
Distribution of Job Sizes
54
Distribution of Job Sizes
55
Distribution of Resource Use
56
Distribution of Resource Use
57
Degree of Multiprogramming
58
System Utilization
59
Job Arrivals
60
Arriving Job Sizes
61
Distribution of Interarrival Times
62
Distribution of Runtimes
63
User Activity
64
Repeated Execution
65
Application Moldability
66
Distribution of Run Lengths
67
Predictability in Repeated Runs
68
Recurring Findings
  • Many small and serial jobs
  • Many power-of-two jobs
  • Weak correlation of job size and duration
  • Job runtimes are bounded but have CV > 1
  • Inaccurate user runtime estimates
  • Non-stationary arrivals (daily/weekly cycle)
  • Power-law user activity, run lengths

69
Instrumentation
  • Passive: snoop without interfering
  • Active: modify the system
  • Collecting the data interferes with system behavior
  • Saving or downloading the data causes additional interference
  • Partial solution: model the interference

70
Data Sanitation
  • Strange things happen
  • Leaving them in is "safe" and faithful to the real data
  • But it risks situations in which a
    non-representative situation dominates the
    evaluation results

71
Arrivals to SDSC SP2
72
Arrivals to LANL CM-5
73
Arrivals to CTC SP2
74
Arrivals to SDSC Paragon
What are they doing at 3:30 AM?
75
3:30 AM
  • Nearly every day, a set of 16 jobs is run by the same user
  • Most probably the same set, as they typically
    have a similar pattern of runtimes
  • Most probably these are administrative jobs that
    are executed automatically

76
Arrivals to CTC SP2
77
Arrivals to SDSC SP2
78
Arrivals to LANL CM-5
79
Arrivals to SDSC Paragon
80
Are These Outliers?
  • These large activity outbreaks are easily
    distinguished from normal activity
  • They last for several days to a few weeks
  • They appear at intervals of several months to
    more than a year
  • They are each caused by a single user!
  • Therefore easy to remove

81
(No Transcript)
82
Two Aspects
  • In workload modeling, should you include this in
    the model?
  • In a general model, probably not
  • Conduct separate evaluations for special conditions (e.g. a DoS attack)
  • In evaluations using raw workload data, there is
    a danger of bias due to unknown special
    circumstances

83
Automation
  • The idea
  • Cluster daily data based on various workload attributes
  • Remove days that appear alone in a cluster
  • Repeat
  • The problem
  • Strange behavior often spans multiple days

Cirne & Berman, Wkshp. Workload Charact. 2001
84
Workload Modeling
85
Statistical Modeling
  • Identify attributes of the workload
  • Create empirical distribution of each attribute
  • Fit empirical distribution to create model
  • Synthetic workload is created by sampling from the model distributions (sketch below)
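A minimal sketch of the sampling step, assuming inverse-transform sampling on the empirical CDF (the function and data names are hypothetical):

```python
import numpy as np

def empirical_sampler(values, size, rng=np.random.default_rng(0)):
    """Draw `size` synthetic values from the empirical distribution."""
    xs = np.sort(values)
    u = rng.random(size)                         # uniform [0, 1) samples
    idx = np.minimum((u * len(xs)).astype(int), len(xs) - 1)
    return xs[idx]                               # inverse-CDF lookup

# synthetic_sizes = empirical_sampler(observed_sizes, 10_000)
```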

86
Fitting by Moments
  • Calculate model parameters to fit moments of
    empirical data
  • Problem: does not fit the shape of the distribution

87
Jann et al., JSSPP 1997
88
Fitting by Moments
  • Calculate model parameters to fit moments of
    empirical data
  • Problem: does not fit the shape of the distribution
  • Problem: very sensitive to extreme data values (moment-matching sketch below)
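A short sketch of moment matching, assuming a gamma model (the choice of distribution is mine):

```python
import numpy as np

def fit_gamma_by_moments(data):
    m, v = np.mean(data), np.var(data)
    shape = m * m / v        # gamma: mean = k*theta, variance = k*theta^2
    scale = v / m
    return shape, scale      # k, theta

# The slide's caveats in code terms: the fitted gamma reproduces m and v
# exactly, yet its shape may look nothing like the data, and one extreme
# value can move both m and v dramatically.
```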

89
Effect of Extreme Runtime Values
Downey & Feitelson, PER 1999
90
Alternative Fit to Shape
  • Maximum likelihood: which distribution parameters were most likely to lead to the given observations
  • Needs an initial guess of the functional form
  • Phase-type distributions: construct the desired shape
  • Goodness of fit:
  • Kolmogorov-Smirnov: difference in CDFs
  • Anderson-Darling: added emphasis on the tail
  • May need to sample observations (fit sketch below)
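A sketch of fitting to shape with SciPy's maximum-likelihood fit plus a Kolmogorov-Smirnov test; the lognormal functional form is just an initial guess, and the input file is hypothetical:

```python
import numpy as np
from scipy import stats

data = np.loadtxt("runtimes.txt")                    # hypothetical input
shape, loc, scale = stats.lognorm.fit(data, floc=0)  # MLE under the hood
ks_stat, p_value = stats.kstest(data, "lognorm", args=(shape, loc, scale))
print(f"K-S distance {ks_stat:.3f}, p = {p_value:.3f}")
# Anderson-Darling would weight the tail more heavily, but
# scipy.stats.anderson supports only a few distributions, so it is
# omitted here; with very large logs, test a random subsample instead.
```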

91
Correlations
  • Correlation can be measured by the correlation
    coefficient
  • It can be modeled by a joint distribution
    function
  • But neither may be very useful

92
(No Transcript)
93
Correlation Coefficient
Gives low values for the correlation of runtime and size in parallel systems
94
Distributions
A restricted version of a joint distribution
95
Modeling Correlation
  • Divide range of one attribute into sub-ranges
  • Create a separate model of other attribute for
    each sub-range
  • Models can be independent, or model parameters can depend on the sub-range (sketch below)
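A sketch of this sub-range approach (the size-class boundaries and the per-class lognormal model are my assumptions):

```python
import numpy as np
from scipy import stats

def fit_per_subrange(sizes, runtimes, bounds=(1, 2, 8, 64, 1 << 20)):
    """Fit a separate runtime distribution for each job-size class."""
    models = {}
    for lo, hi in zip(bounds, bounds[1:]):
        sel = (sizes >= lo) & (sizes < hi)
        if sel.sum() >= 10:              # need enough data in the class
            models[(lo, hi)] = stats.lognorm.fit(runtimes[sel], floc=0)
    return models                        # sub-range -> fitted parameters
```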

96
Stationarity
  • Problem of daily/weekly activity cycle
  • Not important if unit of activity is very small
    (network packet)
  • Very meaningful if unit of work is long (parallel
    job)

97
How to Modify the Load
  • Multiply interarrivals or runtimes by a factor
  • Changes the effective length of the day
  • Multiply machine size by a factor
  • Modifies packing properties
  • Add users

98
Stationarity
  • Problem of daily/weekly activity cycle
  • Not important if unit of activity is very small
    (network packet)
  • Very meaningful if unit of work is long (parallel
    job)
  • Problem of new/old system
  • Immature workload
  • Leftover workload

99
Heavy Tails
100
Tail Types
  • When a distribution has mean m, what is the
    distribution of samples that are larger than x?
  • Light: expected to be smaller than x + m
  • Memoryless: expected to be exactly x + m
  • Heavy: expected to be larger than x + m

101
Formal Definition
  • Tail decays according to a power law: Pr[X > x] ~ x^(-a), with 0 < a < 2
  • Test: the log-log complementary distribution (LLCD) plot is a straight line with slope -a (sketch below)
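A sketch of the LLCD test (plotting only): a roughly straight tail with slope -a on this plot suggests a power law.

```python
import numpy as np
import matplotlib.pyplot as plt

def llcd(samples):
    xs = np.sort(samples)
    # empirical complementary CDF: fraction of samples above each value
    ccdf = 1.0 - np.arange(1, len(xs) + 1) / len(xs)
    plt.loglog(xs[:-1], ccdf[:-1])       # drop the last point (ccdf = 0)
    plt.xlabel("x")
    plt.ylabel("Pr[X > x]")
    plt.show()
```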

102
Consequences
  • Large deviations from the mean are realistic
  • Mass disparity
  • A small fraction of the samples is responsible for a large part of the total mass
  • Most samples together account for a negligible part of the mass

Crovella, JSSPP 2001
103
Unix File Sizes Survey, 1993
104
Unix File Sizes LLCD
105
Consequences
  • Large deviations from the mean are realistic
  • Mass disparity
  • A small fraction of the samples is responsible for a large part of the total mass
  • Most samples together account for a negligible part of the mass
  • Infinite moments
  • For a ≤ 1 the mean is undefined
  • For a ≤ 2 the variance is undefined

Crovella, JSSPP 2001
106
Pareto Distribution
  • With parameter a, the density is proportional to x^-(a+1)
  • For a ≤ 1 the expectation ∫ x · x^-(a+1) dx then diverges
  • i.e. the running mean grows with the number of samples (illustrated below)
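A small illustration of the divergent mean, using numpy's Pareto generator (shifted so the support starts at 1):

```python
import numpy as np

rng = np.random.default_rng(42)
samples = 1.0 + rng.pareto(1.0, size=1_000_000)   # Pareto with shape a = 1
for n in (10**3, 10**4, 10**5, 10**6):
    print(f"running mean after {n:>9,} samples: {samples[:n].mean():8.1f}")
```

The running mean keeps climbing (roughly like ln n) instead of converging.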

107
Pareto Samples
108
Pareto Samples
109
Pareto Samples
110
Effect of Samples from Tail
  • In simulation
  • A single sample may dominate results
  • Example: response times of processes
  • In analysis
  • Average long-term behavior may never happen in
    practice

111
Real Life
  • Data samples are necessarily bounded
  • The question is how to generalize to the model
    distribution
  • Arbitrary truncation
  • Lognormal or phase-type distributions
  • Something in between

112
Solution 1 Truncation
  • Postulate an upper bound on the distribution
  • Question: where to put the upper bound
  • Probably OK for qualitative analysis
  • May be problematic for quantitative simulations

113
Solution 2 Model the Sample
  • Approximate the empirical distribution using a
    mixture of exponentials (e.g. phase-type
    distributions)
  • In particular, exponential decay beyond the highest sample
  • In some cases, a lognormal distribution provides
    a good fit
  • Good for mathematical analysis

114
Solution 3 Dynamic
  • Place an upper bound on the distribution
  • Location of the bound depends on the total number of samples required
  • Example: see the sketch below
  • Note: the bound does not change during the simulation
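One plausible reading of the dynamic bound, stated as an assumption rather than the slide's formula: if the simulation will draw N samples, truncate the Pareto where the tail probability is 1/N, so that on average at most one sample would have exceeded the bound.

```python
import numpy as np

def truncated_pareto(a, n_samples, rng=np.random.default_rng(0)):
    bound = (1.0 / n_samples) ** (-1.0 / a)   # x where Pr[X > x] = 1/n
    u = rng.random(n_samples)
    u *= 1.0 - bound ** -a                    # keep only mass below bound
    return (1.0 - u) ** (-1.0 / a)            # inverse CDF of the Pareto
```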

115
Self Similarity
116
The Phenomenon
  • The whole has the same structure as certain parts
  • Example: fractals

117
(No Transcript)
118
The Phenomenon
  • The whole has the same structure as certain parts
  • Example: fractals
  • In workloads: burstiness at many different time scales
  • Note: this relates to a time series

119
Job Arrivals to SDSC Paragon
120
Process Arrivals to SDSC Paragon
121
Long-Range Correlation
  • A burst of activity implies that values in the
    time series are correlated
  • A burst covering a large time frame implies
    correlation over a long range
  • This is contrary to assumptions about the
    independence of samples

122
Aggregation
  • Replace each subsequence of m consecutive values
    by their mean
  • If self-similar, the new series will have
    statistical properties that are similar to the
    original (i.e. bursty)
  • If independent, it will tend to average out (sketch below)
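A sketch of the aggregation step (the windowing details are mine):

```python
import numpy as np

def aggregate(series, m):
    """Mean of non-overlapping windows of m consecutive values."""
    n = (len(series) // m) * m               # drop the ragged tail
    return series[:n].reshape(-1, m).mean(axis=1)
```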

123
Poisson Arrivals
124
Tests
  • Essentially based on the burstiness-retaining
    nature of aggregation
  • Rescaled range (R/S) metric: the range (sum) of n samples as a function of n

125
R/S Metric
126
Tests
  • Essentially based on the burstiness-retaining
    nature of aggregation
  • Rescaled range (R/S) metric: the range (sum) of n samples as a function of n
  • Variance-time metric: the variance of an aggregated time series as a function of the aggregation level (sketch below)
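A sketch of the variance-time computation (aggregation levels are illustrative). For independent data the variance of the m-aggregated series decays like 1/m, a slope of -1 on a log-log plot; a slower decay indicates self-similarity.

```python
import numpy as np

def variance_time(series, levels=(1, 2, 4, 8, 16, 32, 64, 128)):
    results = []
    for m in levels:
        n = (len(series) // m) * m
        agg = series[:n].reshape(-1, m).mean(axis=1)
        results.append((m, agg.var()))
    return results                   # (aggregation level, variance) pairs
```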

127
Variance-Time Metric
128
Modeling Self Similarity
  • Generate workload by an on-off process
  • During the on period, generate work at a steady pace
  • During the off period, do nothing
  • On and off period lengths are heavy tailed
  • Multiplex many such sources (sketch below)
  • Leads to long-range correlation
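A sketch of this construction (the source count, horizon, and Pareto shape are my choices):

```python
import numpy as np

def on_off_traffic(n_sources=50, horizon=100_000, a=1.5, seed=0):
    """Sum of many on-off sources with heavy-tailed period lengths."""
    rng = np.random.default_rng(seed)
    load = np.zeros(horizon)
    for _ in range(n_sources):
        t, on = 0, bool(rng.integers(2))
        while t < horizon:
            length = int(1 + rng.pareto(a))   # heavy-tailed period length
            if on:
                load[t:t + length] += 1.0     # steady work while "on"
            t += length
            on = not on
    return load
```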

129
Research Areas
130
Effect of Users
  • Workload is generated by users
  • Human users do not behave like a random sampling
    process
  • Feedback based on system performance
  • Repetitive working patterns

131
Feedback
  • User population is finite
  • Users back off when performance is inadequate
  • Negative feedback
  • Better system stability
  • Need to explicitly model this behavior

132
Locality of Sampling
  • Users display different levels of activity at
    different times
  • At any given time, only a small subset of users
    is active

133
Active Users
134
Locality of Sampling
  • Users display different levels of activity at
    different times
  • At any given time, only a small subset of users
    is active
  • These users repeatedly do the same thing
  • Workload observed by system is not a random
    sample from long-term distribution

135
SDSC Paragon Data
136
SDSC Paragon Data
137
Growing Variability
138
SDSC Paragon Data
139
SDSC Paragon Data
140
Locality of Sampling
  • The questions
  • How does this affect the results of performance evaluation?
  • Can this be exploited by the system, e.g. by a
    scheduler?

141
Hierarchical Workload Models
  • Model of user population
  • Modify load by adding/deleting users
  • Model of a single user's activity
  • Built-in self similarity using heavy-tailed
    on/off times
  • Model of application behavior and internal
    structure
  • Capture interaction with system attributes

142
A Small Problem
  • We don't have data for these models
  • Especially for user behavior such as feedback
  • Need interaction with cognitive scientists
  • And for distribution of application types and
    their parameters
  • Need detailed instrumentation

143
  • Final Words

144
  • We like to think that we design systems based on
    solid foundations

145
  • But beware
  • the foundations might be unbased assumptions!

146
Computer Systems are Complex
  • We should have more "science" in computer science
  • Collect data rather than make assumptions
  • Run experiments under different conditions
  • Make measurements and observations
  • Make predictions and verify them
  • Share data and programs to promote good practices and ensure comparability

148
Advice from the Experts
  • Science is built of facts as a house is built of stones. But a collection of facts is no more a science than a heap of stones is a house
  • -- Henri PoincarĂ©
  • Everything should be made as simple as possible,
    but not simpler
  • -- Albert Einstein

149
Acknowledgements
  • Students: Ahuva Mualem, David Talby, Uri Lublin
  • Larry Rudolph / MIT
  • Data in Parallel Workloads Archive
  • Joefon Jann / IBM
  • Allen Downey / Wellesley
  • CTC SP2 log / Steven Hotovy
  • SDSC Paragon log / Reagan Moore
  • SDSC SP2 log / Victor Hazelwood
  • LANL CM-5 log / Curt Canada
  • NASA iPSC/860 log / Bill Nitzberg