Title: A Prediction-based Real-time Scheduling Advisor
1. A Prediction-based Real-time Scheduling Advisor
- Peter A. Dinda
- Prescience Lab
- Department of Computer Science
- Northwestern University
- http://www.cs.northwestern.edu/~pdinda
2. RTSA
Application: "I have a 5-second job. I want it to finish in under 10 seconds with at least 95% probability. Here is a list of hosts where I can run it. Which one should I use?"
Real-time Scheduling Advisor: "Use host 3. It'll finish there in 7 to 9 seconds."
Or: "There is no host where that is possible. The fastest is host 5, where it'll finish in 12 to 15 seconds."
3. Core Results
- RTSA based on predictive signal processing
- Layered system architecture for scalable performance prediction
- Targets commodity shared, unreserved distributed environments
- All at user level
- Randomized trace-based evaluation giving evidence of its effectiveness
- Limitations
  - Compute-bound tasks
  - Evaluation on Digital Unix platform
Publicly available as part of the RPS system
4. Outline
- Motivation: interactive applications
- Interface
- Implementation
- Performance evaluation
- Conclusions and future work
5. Interactive Applications on Shared, Unreserved Distributed Computing Environments
- Examples: visualization, games, VR
- Responsiveness requirements => soft deadlines
- No resource reservation or admission control
- Constant competition from other users
- Changing resource availability => adaptation
- Adaptation is via server selection
- Other mechanisms possible
6. Interactive Applications and the RTSA
- RTSA controls adaptation mechanisms
- Operates on behalf of a single application
- Multiple RTSAs may be running independently
- Current limitation: compute-bound tasks
7. Interface - Request
int RTSAAdviseTask(RTSARequest &req, RTSAResponse &resp);

struct RTSARequest {
  double tnom;     // Size of task in CPU-seconds
  double sf;       // Maximum slack allowed
  double conf;     // Minimum probability allowed
  Host   hosts[];  // List of hosts to choose from
};

deadline = now + tnom * (1 + sf)

Example: "I have a 5-second job. I want it to finish in under 10 seconds with at least 95% probability. Here is a list of hosts where I can run it. Which one should I use?"
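The deadline arithmetic on this slide can be checked with a small sketch. This is a Python stand-in for the request struct, not the RPS API: the `deadline` helper and the host names are hypothetical, and only the field names and the formula come from the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RTSARequest:
    """Python stand-in for the request struct on this slide."""
    tnom: float        # size of task in CPU-seconds
    sf: float          # maximum slack allowed (slack factor)
    conf: float        # minimum probability allowed
    hosts: List[str]   # list of hosts to choose from (names are made up)


def deadline(now: float, req: RTSARequest) -> float:
    """deadline = now + tnom * (1 + sf), as stated on the slide."""
    return now + req.tnom * (1.0 + req.sf)


# The spoken example: a 5-second job that must finish within 10 seconds
# with at least 95% probability corresponds to tnom = 5, sf = 1, conf = 0.95.
example = RTSARequest(tnom=5.0, sf=1.0, conf=0.95,
                      hosts=["host1", "host2", "host3"])
```

With `now = 0`, `deadline(0.0, example)` gives 10 seconds, matching the request in the example.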
8. Interface - Response
int RTSAAdviseTask(RTSARequest &req, RTSAResponse &resp);

struct RTSAResponse {
  double tnom;   // Size of task in CPU-seconds
  double sf;     // Maximum slack allowed
  double conf;   // Minimum probability allowed
  Host   host;   // Host to use
  RunningTimePredictionResponse runningtime;  // Predicted running time of task on that host
};

Example: "Use host 3. It'll finish there in 7 to 9 seconds."
9. RunningTimePredictionResponse
struct RunningTimePredictionResponse {
  Host   host;   // Host to use
  double tnom;   // Size of task in CPU-seconds
  double conf;   // Confidence level
  double texp;   // Point estimate of running time
  double tlb;    // Confidence interval of running time (lower bound)
  double tub;    // Confidence interval of running time (upper bound)
};

Example: "The most likely running time is 7.5 seconds. There is a 95% chance that the actual running time will be in the range 7 to 9 seconds."
10. Implementation
11. Underlying Components
- Host load measurement
  - Digital Unix 5-second load average, sampled at 1 Hz
  - [LCR98, SciProg99]
- Host load prediction
  - Periodic linear time series analysis (continuously monitored AR(16) predictors)
  - <1% of CPU
  - [HPDC99, Cluster00]
- Running time advisor (RTA)
  - Task size + host load predictions => confidence interval for running time of task
  - [SIGMETRICS01, HPDC01, Cluster02]
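To make the host load prediction step concrete, here is a minimal sketch of one-step-ahead prediction with an autoregressive model. It assumes the coefficients and the mean have already been fitted; `ar_predict` is a hypothetical helper, not an RPS function. The predictors named on the slide would use 16 coefficients.

```python
from typing import Sequence


def ar_predict(history: Sequence[float], coeffs: Sequence[float],
               mean: float) -> float:
    """One-step-ahead AR(p) prediction of the next load sample.

    history: load samples, oldest first (needs at least p samples)
    coeffs:  phi_1..phi_p, where phi_1 weights the most recent sample
    mean:    long-run mean of the load signal
    """
    p = len(coeffs)
    if len(history) < p:
        raise ValueError("need at least p samples of history")
    # Take the last p samples and order them most recent first.
    recent = list(history)[len(history) - p:][::-1]
    # The prediction is the mean plus a weighted sum of recent deviations.
    return mean + sum(phi * (x - mean) for phi, x in zip(coeffs, recent))
```

For example, with `coeffs=[0.5]` and `mean=1.0`, a latest load of 2.0 yields a predicted next load of 1.5: the model pulls the forecast halfway back toward the mean.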
12. RTSA Implementation - Simplified
RTA predicts running time on each host.
[Figure: a task of size tnom and its predicted running time (texp) on each candidate host]
13. RTSA Implementation - Simplified
RTSA picks randomly from among the hosts where the deadline can be met.
If there is no such host, RTSA returns the host with the lowest running time.
RTSA also returns the estimate of the running time.
[Figure: predicted running time of a task of size tnom on each host, compared against the deadline, where deadline = (1 + sf) * tnom]
14. Prediction Error
- Predictions are not perfect
- Some machines are harder to predict than others
- Need more than a point estimate (texp)
- Predictors can estimate their quality
  - Covariance matrix for prediction errors
  - Estimate of predictor error also continually monitored for accuracy
- Confidence interval captures this
- Deadline probability serves as confidence level
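One way a covariance matrix of prediction errors can turn into a confidence interval is sketched below for the average predicted load over the next m steps. The normality assumption, the clamping at zero, and the function name are assumptions made for illustration; the slides do not specify how the RTA performs this step.

```python
from statistics import NormalDist
from typing import Sequence, Tuple


def mean_load_interval(preds: Sequence[float],
                       cov: Sequence[Sequence[float]],
                       conf: float) -> Tuple[float, float, float]:
    """Return (point estimate, lower, upper) for the average predicted load.

    preds: 1..m step-ahead load predictions
    cov:   m x m covariance matrix of the corresponding prediction errors
    conf:  two-sided confidence level, e.g. 0.95
    """
    m = len(preds)
    avg = sum(preds) / m
    # Var(mean of m errors) = (sum of all covariance entries) / m^2.
    var = sum(cov[i][j] for i in range(m) for j in range(m)) / (m * m)
    # Assumption: errors are jointly normal, so a z-score gives the width.
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)
    half = z * var ** 0.5
    # Load cannot be negative, so clamp the lower bound at zero.
    return avg, max(0.0, avg - half), avg + half
```

Positively correlated errors widen the interval, which is why the full covariance matrix matters and per-step variances alone are not enough.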
15. RTSA Implementation
RTSA picks randomly from among the hosts where the deadline can be met even given the maximum running time captured in the confidence interval.
If there is no such host, RTSA returns the host with the lowest running time.
RTSA also returns the estimate of the running time.
[Figure: confidence interval (tlb to tub) of the predicted running time of a task of size tnom on each host, compared against the deadline, where deadline = (1 + sf) * tnom and conf = 95%]
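The selection rule on this slide can be sketched as follows. This is an illustrative Python version, not the RPS code: `Prediction` and `advise` are hypothetical names, "lowest running time" is interpreted as lowest `texp`, and the predictions are assumed to have been computed at the requested confidence level.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Prediction:
    host: str
    texp: float   # point estimate of running time
    tlb: float    # lower bound of the confidence interval
    tub: float    # upper bound of the confidence interval


def advise(preds: List[Prediction], tnom: float, sf: float,
           rng: Optional[random.Random] = None) -> Tuple[Prediction, bool]:
    """Return (chosen prediction, whether the deadline is predicted to be met).

    preds must be non-empty.
    """
    rng = rng or random.Random()
    allowed = (1.0 + sf) * tnom                  # deadline relative to now
    # A host qualifies only if even its worst case (tub) meets the deadline.
    feasible = [p for p in preds if p.tub <= allowed]
    if feasible:
        return rng.choice(feasible), True        # random pick among feasible hosts
    # No host qualifies: fall back to the lowest expected running time.
    return min(preds, key=lambda p: p.texp), False
```

With the numbers from the opening example (a 5-second task, sf = 1), a host predicted at 7 to 9 seconds is feasible, while a host predicted at 12 to 15 seconds is returned only as the fallback.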
16. Experimental Setup
- Environment
  - AlphaStation 255s, Digital Unix 4.0
  - Private network
  - Separate machine for submitting tasks
  - Prediction system on each host
- Background workload: host load trace playback [LCR00]
  - Traces from PSC Alpha cluster and a wide range of CMU machines
  - Reconstruct any combination of these machines (a scenario)
- Testcase: submit a synthetic task to the system, run it on the host that RTSA selects, measure the result
17. Scenarios
18. The Metrics
- Fraction of deadlines met
  - Probability of meeting deadline
- Fraction of deadlines met when predicted
  - Probability of meeting deadline if RTSA claims it is possible
- Number of possible hosts
  - Degree of randomness in RTSA's decision
  - High randomness means different RTSAs are unlikely to conflict
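The three metrics could be computed from per-testcase records along these lines. The `Outcome` record and its field names are hypothetical; the slides define the metrics but not a data format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Outcome:
    met: bool             # did the task actually finish by its deadline?
    predicted_ok: bool    # did the advisor claim the deadline could be met?
    possible_hosts: int   # number of hosts the advisor could choose among


def fraction_met(outcomes: List[Outcome]) -> float:
    """Fraction of all testcases whose deadline was met."""
    return sum(o.met for o in outcomes) / len(outcomes)


def fraction_met_when_predicted(outcomes: List[Outcome]) -> Optional[float]:
    """Fraction met among testcases where the advisor said it was possible."""
    claimed = [o for o in outcomes if o.predicted_ok]
    if not claimed:
        return None       # undefined if the advisor never made the claim
    return sum(o.met for o in claimed) / len(claimed)


def mean_possible_hosts(outcomes: List[Outcome]) -> float:
    """Average size of the set the advisor picked from."""
    return sum(o.possible_hosts for o in outcomes) / len(outcomes)
```

Note that the second metric conditions on the advisor's claim, so it is only defined for strategies that return a prediction.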
19. Testcases
- Synthetic compute-bound tasks
- Size: 0.1 to 10 seconds, uniform
- Interarrival: 5 to 15 seconds, uniform
- sf: 0 to 2, uniform
- conf: 0.95 in all cases
8,000 to 16,000 testcases for each scenario.
How do the metrics vary with scenario, size, and sf?
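A randomized testcase stream with these distributions might be generated as in the sketch below. The `Testcase` record, the generator name, and the seeding scheme are assumptions; only the parameter ranges come from the slide.

```python
import random
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Testcase:
    arrival: float   # submission time in seconds since the start of the run
    tnom: float      # task size in CPU-seconds
    sf: float        # slack factor
    conf: float      # required confidence


def generate_testcases(n: int, seed: int = 0) -> Iterator[Testcase]:
    """Yield n testcases drawn from the uniform ranges on the slide."""
    rng = random.Random(seed)            # seeded so a run can be repeated
    t = 0.0
    for _ in range(n):
        t += rng.uniform(5.0, 15.0)      # interarrival time
        yield Testcase(arrival=t,
                       tnom=rng.uniform(0.1, 10.0),
                       sf=rng.uniform(0.0, 2.0),
                       conf=0.95)        # fixed in all cases
```

At a mean interarrival time of 10 seconds, 8,000 testcases correspond to roughly 22 hours of simulated submissions per scenario.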
20. The RTSA Implementations
- AR(16)
  - RTSA as described here
  - Instantiated with the AR(16) load predictor
- MEASURE
  - Send task to host with lowest load
  - Does not return predicted running time
  - High probability of conflicts
- RANDOM
  - Send task to a random host
  - Does not return predicted running time
  - Low probability of conflicts
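The two baseline strategies are simple enough to sketch directly. The function names and the host-to-load mapping are hypothetical; the sketch also shows why MEASURE is prone to conflicts.

```python
import random
from typing import Dict, Optional


def measure_pick(loads: Dict[str, float]) -> str:
    """MEASURE: send the task to the host with the lowest measured load."""
    return min(loads, key=lambda host: loads[host])


def random_pick(loads: Dict[str, float],
                rng: Optional[random.Random] = None) -> str:
    """RANDOM: send the task to a uniformly random host, ignoring load."""
    rng = rng or random.Random()
    return rng.choice(sorted(loads))
```

Two independent advisors that see the same load measurements always make the same MEASURE choice, so their tasks land on the same host; RANDOM spreads them out at the cost of ignoring load entirely.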
21. Fraction of Deadlines Met - 4LS
Performance gain from prediction
22. Fraction of Deadlines Met - 4LS
Performance gain from prediction
23. Fraction of Deadlines Met - 4LS
Highest performance gain from prediction near critical slack
24. Fraction of Deadlines Met When Predicted - 4LS
Only predictive strategy can indicate whether meeting deadline is possible
25. Fraction of Deadlines Met When Predicted - 4LS
Only predictive strategy can indicate whether meeting deadline is possible
26. Fraction of Deadlines Met When Predicted - 4LS
Operating near critical slack is most challenging
27. Number of Possible Hosts - 4LS
Predictive strategy introduces appropriate randomness
28. Number of Possible Hosts - 4LS
Predictive strategy introduces appropriate randomness
29. Number of Possible Hosts - 4LS
Operation near critical slack is most challenging
30. Conclusions and Future Work
- Introduced RTSA concept
- Described prediction-based implementation
- Demonstrated feasibility
- Evaluated performance
- Current and future work
  - Incorporate communication, memory, disk
  - Improved predictive models
31. For More Information
- Peter Dinda
  - http://www.cs.northwestern.edu/~pdinda
- RPS
  - http://www.cs.northwestern.edu/~RPS
- Prescience Lab
  - http://www.cs.northwestern.edu/~plab