Online Prediction of the Running Time Of Tasks

Slides: 40

Provided by: petera45

Learn more at: https://users.cs.northwestern.edu

1
Online Prediction of the Running Time Of Tasks

Peter A. Dinda
Department of Computer Science, Northwestern University
http://www.cs.northwestern.edu/pdinda

2
Overview

Predict running time of task
Application supplies task size (0.1-10 seconds currently)
Task is compute-bound (current limit)
Prediction is a confidence interval
Expresses prediction error
Statistically valid decision-making in scheduler
Based on host load prediction
Homogeneous Digital Unix hosts (current limit)
System is portable to many operating systems

Everything in talk is publicly available
3
Outline

Running time advisor
Host load results
Computing confidence intervals
Performance evaluation
Related work
Conclusions

4
A Universal Challenge in High Performance Distributed Applications

Highly variable resource availability
Shared resources
No reservations
No globally respected priorities
Competition from other users - background workload
Running time can vary drastically
Adaptation
Example goal: soft real-time for interactivity
Example mechanism: server selection
Performance queries

5
Running Time Advisor (RTA)
[Figure: an App queries the RTA about a Host that is running a background workload]
Query: What will be the running time of this 3 second task if started now?
Answer: It will be 5.3 seconds
Nominal time: running time on empty host, task size

Entirely user-level tool
No reservations or admission control
Query result is a prediction

6
Variability and Prediction
[Figure: a Predictor turns a resource signal with high availability variability over time t into a prediction error signal with low variability over t, plus a characterization of that variability (ACF)]
Exchange high resource availability variability for low prediction error variability and a characterization of that variability
7
Running Time Advisor (RTA)
[Figure: an App queries the RTA about a Host that is running a background workload]
Query: With 95% confidence, what will be the running time of this 3 second task if started now?
Answer: It will be 4.1 to 6.3 seconds
CI captures prediction error to the extent the application is interested in it
Independent of prediction techniques
8
RTA API
9
Outline

Running time advisor
Host load results
Computing confidence intervals
Performance evaluation
Related work
Conclusions

10
Host Load Traces

DEC Unix 5 second exponential average
Full bandwidth captured (1 Hz sample rate)
Long durations

http://www.cs.northwestern.edu/pdinda/LoadTraces
11
Host Load Properties

Self-similarity
long-range dependence
Epochal behavior
non-stationarity
Complex correlation structure
[LCR 98, Scientific Programming, 34, 1999]

12
Host Load Prediction

Fully randomized study on traces
MEAN, LAST, AR, MA, ARMA, ARIMA, ARFIMA models
AR(16) models most appropriate
Covariance matrix for prediction errors
Low overhead: <1% CPU
[HPDC 99, Cluster Computing, 34, 2000]
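
As a rough illustration of the simpler models compared above, the sketch below shows one-step-ahead MEAN, LAST, and AR(p) predictors over a load history. It assumes the AR coefficients have already been fitted elsewhere; the function names and the coefficient used in the usage note are made up for illustration and are not the RPS toolkit's API.

```python
def mean_predictor(history):
    """MEAN: predict the long-term average of the observed load."""
    return sum(history) / len(history)


def last_predictor(history):
    """LAST: predict that the next sample equals the most recent one."""
    return history[-1]


def ar_predictor(history, coeffs):
    """AR(p): predict the mean plus a weighted sum of the last p deviations.

    coeffs[0] weights the most recent sample. The coefficients are assumed
    to be fitted already (for example by Yule-Walker); fitting is not shown.
    """
    mu = mean_predictor(history)
    recent = history[::-1][:len(coeffs)]  # newest sample first
    return mu + sum(c * (x - mu) for c, x in zip(coeffs, recent))
```

For example, with the history [1.0, 2.0, 3.0], MEAN predicts 2.0, LAST predicts 3.0, and a hypothetical AR(1) model with coefficient 0.5 predicts 2.5.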

13
RPS Toolkit

Extensible toolkit for implementing resource
signal prediction systems
Easy buy-in for users
C and sockets (no threads)
Prebuilt prediction components
Libraries (sensors, time series, communication)
Users have bought in
Incorporated in CMU Remos, BBN QuO

CMU-CS-99-138
http://www.cs.northwestern.edu/RPS
14
Outline

Running time advisor
Host load results
Computing confidence intervals
Performance evaluation
Related work
Conclusions

15
A Model of the Unix Scheduler
[Figure: a task with nominal running time $t_{nom}$ and the background workload (actual load $\langle z_t \rangle$) feed the Unix scheduler, which produces the task's actual running time $t_{act}$]
16
A Model of the Unix Scheduler
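[Figure: a task with nominal running time $t_{nom}$ and the predicted load $\langle \hat{z}_t \rangle$ feed a model of the Unix scheduler, which produces the predicted running time $t_{exp}$]
$t_{exp} = g(t_{nom}, \langle \hat{z}_t \rangle) = t_{act} + \text{Error}$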
17
Available Time and Average Load
Available time from 0 to t: $at(t)$
Average load from 0 to t: $al(t)$
Load signal: replace with prediction of load signal
$t_{act}$ is the minimum $t$ where $at(t) = t_{nom}$
Fluid model, Processor Sharing, Idealized Round-Robin, ...
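
The formulas on this slide were images and are missing from the transcript. One reading that is consistent with the fluid (processor sharing) model named above, offered as an assumption rather than as the slide's exact notation, treats a task on a host with load $z(\tau)$ as receiving a $1/(1 + \text{load})$ share of the processor:

```latex
\begin{align*}
al(t) &= \frac{1}{t}\int_0^t z(\tau)\,d\tau
  && \text{(average load from 0 to } t\text{)} \\
at(t) &= \frac{t}{1 + al(t)}
  && \text{(available time from 0 to } t\text{)} \\
t_{act} &= \min\{\, t : at(t) = t_{nom} \,\}
\end{align*}
```

For instance, under a constant load of 1 the task gets half the processor, so $at(t) = t/2$ and a task with $t_{nom} = 3$ seconds finishes at $t_{act} = 6$ seconds.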
18
Discrete Time

No magic here: this is the obvious discretization
$\Delta$ is the sample interval
$z_{t+j}$ replaced with prediction $\hat{z}_{t+j}$
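
A minimal sketch of the discretization, assuming the average-load form of the fluid model: the average load after $i$ samples is the mean of the first $i$ predicted load values, and the available time is $i$ times the sample interval divided by one plus that average. The function name, its parameters, and the choice to round the answer up to a whole sample interval are illustrative assumptions, not the RTA implementation.

```python
from itertools import accumulate


def predict_running_time(t_nom, predicted_load, delta=1.0):
    """Return the first multiple of delta at which available time reaches t_nom.

    predicted_load[j-1] is the predicted load j samples ahead and delta is
    the sample interval in seconds. Returns None if the prediction horizon
    is too short for the task to finish.
    """
    for i, load_sum in enumerate(accumulate(predicted_load), start=1):
        avg_load = load_sum / i                      # average load over i samples
        available = i * delta / (1.0 + avg_load)     # available time after i samples
        if available >= t_nom:
            return i * delta
    return None
```

With a predicted load of 1.0 on every sample and a 1 second interval, `predict_running_time(3.0, [1.0] * 10)` returns 6.0, since the task is assumed to receive half of the processor.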

19
Confidence Intervals
$z_{t+j}$ replaced with $\hat{z}_{t+j}$ in prediction, giving $\hat{al}_i$, $\hat{at}_i$, $\hat{at}(t)$
Confidence interval for $\hat{at}(t)$ is a CI for $\hat{al}_i$ prediction errors
Since this is a sum, the central limit theorem applies
Then a 95% confidence interval is
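
The interval itself is missing from the transcript. Under the normality argument above, a standard form would be the following, where $\sigma_i^2$ denotes the variance of the prediction error of the average load over the next $i$ samples (notation assumed here, not taken from the slide):

```latex
\begin{align*}
al_i &\in \left[\, \hat{al}_i - 1.96\,\sigma_i,\;\; \hat{al}_i + 1.96\,\sigma_i \,\right]
  \quad \text{with 95\% confidence}
\end{align*}
```

Here 1.96 is the 97.5th percentile of the standard normal distribution. If available time is taken to fall as average load rises, the upper load bound gives the lower bound on available time and hence the upper bound on running time.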
20
The Variance of the Sum

Prediction errors $a_{t+j}$ are not independent
Predictor's covariance matrix captures this
Predictor makes it possible to compute this variance and thus the CI
Important detail: load discounting
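
To make the role of the covariance matrix concrete, the sketch below computes the variance of the error in an average of $i$ load predictions, then a normal-theory 95% interval for that average. It assumes `cov[j][k]` holds the covariance between the (j+1)-step-ahead and (k+1)-step-ahead prediction errors; the function names are hypothetical, and load discounting is not modeled.

```python
import math


def avg_error_variance(cov, i):
    """Variance of the mean of the first i prediction errors.

    Var((1/i) * sum_j e_j) = (1/i^2) * sum_j sum_k Cov(e_j, e_k),
    so off-diagonal terms matter when the errors are correlated.
    """
    total = sum(cov[j][k] for j in range(i) for k in range(i))
    return total / (i * i)


def avg_load_ci(predicted_loads, cov, z=1.96):
    """Normal-theory CI for the average load over the prediction horizon."""
    i = len(predicted_loads)
    center = sum(predicted_loads) / i
    half_width = z * math.sqrt(avg_error_variance(cov, i))
    return center - half_width, center + half_width
```

With two unit-variance errors, the variance of the average is 0.5 if they are independent (`[[1, 0], [0, 1]]`) but 1.0 if they are perfectly correlated (`[[1, 1], [1, 1]]`), so ignoring positive off-diagonal terms would understate the interval.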

21
Outline

Running time advisor
Host load results
Computing confidence intervals
Performance evaluation
Related work
Conclusions

22
Experimental Setup

Environment
Alphastation 255s, Digital Unix 4.0
Workload: host load trace playback [LCR 2000]
Prediction system on each host
AR(16), MEAN, LAST
Tasks
Nominal time: U(0.1,10) seconds
Interarrival time: U(5,15) seconds
95% confidence level
Methodology
Predict CIs
Run task and measure

http://www.cs.northwestern.edu/pdinda/LoadTraces/playload
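
The task stream described above could be generated along these lines. This is only a sketch of the sampling step: the function name and the seed are assumptions, and the real experiments also involve load playback and prediction on each host, which are not shown.

```python
import random


def generate_tasks(count, seed=0):
    """Return (start_time, nominal_time) pairs for a synthetic task stream.

    Nominal times are drawn from U(0.1, 10) seconds and the gaps between
    task arrivals from U(5, 15) seconds, as on the slide.
    """
    rng = random.Random(seed)
    tasks = []
    clock = 0.0
    for _ in range(count):
        clock += rng.uniform(5.0, 15.0)               # interarrival gap
        tasks.append((clock, rng.uniform(0.1, 10.0)))  # nominal running time
    return tasks
```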
23
Metrics

Coverage
Fraction of testcases within confidence interval
Ideally should equal the target 95%
Span
Average length of confidence interval
Ideally as short as possible
$R^2$ between $t_{exp}$ and $t_{act}$
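
The three metrics can be computed from a list of testcases as in the sketch below. The function names are illustrative, and $R^2$ is interpreted here as the squared Pearson correlation between predicted and actual running times, which is an assumption about what the slide intends.

```python
def coverage(intervals, actual):
    """Fraction of testcases whose actual time falls inside its CI."""
    hits = sum(lo <= t <= hi for (lo, hi), t in zip(intervals, actual))
    return hits / len(actual)


def span(intervals):
    """Average length of the confidence intervals."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


def r_squared(expected, actual):
    """Squared Pearson correlation between expected and actual times."""
    n = len(expected)
    mean_e = sum(expected) / n
    mean_a = sum(actual) / n
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(expected, actual))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov * cov / (var_e * var_a)
```

A well-calibrated advisor would show coverage close to the requested confidence level while keeping the span small.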

24
General Picture of Results

Five classes of behavior
I'll show you two
RTA Works
Coverage near 95% in most cases is possible
Predictor quality matters
Better predictors lead to smaller spans on lightly loaded hosts and to correct coverage on heavily loaded hosts
AR(16) > LAST > MEAN
Performance is slightly dependent on nominal time

25
Most Common Coverage Behavior
26
Most Common Span Behavior
27
Uncommon Coverage Behavior
28
Uncommon Span Behavior
29
Related Work

Distributed interactive applications
QuakeViz/Dv, Aeschlimann [PDPTA 99]
Quality of service
QuO, Zinky, Bakken, Schantz [TPOS, April 97]
QRAM, Rajkumar, et al. [RTSS 97]
Distributed soft real-time systems
Lawrence, Jensen [assorted]
Workload studies for load balancing
Mutka, et al. [PerfEval 91]
Harchol-Balter, et al. [SIGMETRICS 96]
Resource signal measurement systems
Remos [HPDC 98]
Network Weather Service [HPDC 97, HPDC 99]
Host load prediction
Wolski, et al. [HPDC 99] (NWS)
Samadani, et al. [PODC 95]
Hailperin [93]
Application-level scheduling
Berman, et al. [HPDC 96]

30
Conclusions

Predict running time of compute-bound task
Based on host load prediction
Prediction is a confidence interval
Confidence interval algorithm
Covariance matrix
Load discounting
Effective for domain
Digital Unix, 0.1-10 second tasks, 5-15 second interarrival
Extensions in progress

31
For More Information

All software and traces are available
RPS, RTA, RTSA: http://www.cs.northwestern.edu/RPS
Load Traces and playback: http://www.cs.northwestern.edu/pdinda/LoadTraces
Prescience Lab
Peter Dinda, Jason Skicewicz, Dong Lu
http://www.cs.northwestern.edu/plab

32
Outline

Running time advisor
Host load results
Computing confidence intervals
Performance evaluation
Related work
Conclusions

33
A Universal Problem
Which host should the application send the task to so that its running time is appropriate?
Example: Real-time
Known resource requirements
What will the running time be if I...
34
Running Time Advisor
Application notifies advisor of task's computational requirements (nominal time)
Advisor predicts running time on each host
Application assigns task to most appropriate host
[Figure: a Task with a nominal time, and the Predicted Running Time on each candidate host]
35
Real-time Scheduling Advisor
Application specifies task's computational requirements (nominal time) and its deadline
Advisor acquires predicted task running times for all hosts
Advisor chooses one of the hosts where the deadline can be met
[Figure: a Task with a nominal time and a deadline, and the Predicted Running Time on each candidate host compared with the deadline]
36
Confidence Intervals to Characterize Variability
Application specifies confidence level (e.g., 95%)
Running time advisor predicts running times as a confidence interval (CI)
Real-time scheduling advisor chooses host where CI is less than deadline
CI captures variability to the extent the application is interested in it
[Figure: a Task with a nominal time, a deadline, and a 95% confidence level; example Predicted Running Time of 3 to 5 seconds with 95% confidence, compared with the deadline]
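
The host-selection rule can be sketched as follows, reading "CI is less than deadline" as "the upper end of the CI does not exceed the deadline". The function name, the dictionary layout, and the tie-breaking rule (prefer the smallest upper bound) are assumptions made for illustration; the slides only say that one qualifying host is chosen.

```python
def choose_host(predictions, deadline):
    """Pick a host whose running-time CI lies entirely within the deadline.

    predictions maps host name -> (lower, upper) bounds in seconds.
    Returns None when no host can meet the deadline at the chosen
    confidence level.
    """
    feasible = {h: ci for h, ci in predictions.items() if ci[1] <= deadline}
    if not feasible:
        return None
    # Illustrative tie-break: the host with the smallest upper bound.
    return min(feasible, key=lambda h: feasible[h][1])
```

With predictions `{"a": (3.0, 5.0), "b": (4.1, 6.3)}` and a 5.5 second deadline, only host `a` qualifies; with a 2 second deadline the function returns `None`.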
37
Prototype System
This Paper
38
Load Discounting Motivation