Title: Size-Based Scheduling Policies with Inaccurate Scheduling Information
1 Size-Based Scheduling Policies with Inaccurate Scheduling Information
- Dong Lu, Huanyuan Sheng, Peter A. Dinda
- Prescience Lab, Dept. of Computer Science
- Dept. of Industrial Engineering and Management Science
- Northwestern University
- Evanston, IL 60201 USA
2 Outline
- Review of size-based scheduling
- Motivation
- Simulation Setup
- Simulation Results
- New applications
- Research Summary by subjects
3 Non-size-based scheduling
- FCFS, PS, etc.
- FCFS: First Come First Served
  - Intuitive
  - Easiest to implement
- PS: Processor Sharing
  - Fair: all jobs receive equal resources
  - Also easy to implement
Problem: these policies are unaware of job size information, which results in a large mean response time
4 Review of size-based scheduling
- SRPT, FSP, etc.
- Utilize the job size (processing time, service time) information for scheduling
- Optimal in mean response time
- Fair?
- Easy to implement?
We use "job size" to refer to the processing time (service time) of the job
5 Shortest Remaining Processing Time (SRPT)
- Always serve the job with the minimum remaining processing time first; preemptive scheduling
- Yields minimum mean response time [Schrage, Operations Research, 1968]
- Performance gains of SRPT over PS do not usually come at the expense of large jobs; in other words, it is fair for heavy-tailed job size distributions [Bansal and Harchol-Balter, SIGMETRICS '01]
- Easy to implement?
  - With accurate a priori job size information, YES
  - Otherwise, NO
6 Fair Sojourn Protocol (FSP)
- Combines SRPT with PS; preemptive scheduling
- Mean response time is close to that of SRPT, and it is fairer than PS [Friedman et al., SIGMETRICS '03]
- Easy to implement?
  - With accurate a priori job size information, YES
  - Otherwise, NO
7 Motivation
- Size-based scheduling requires accurate knowledge of job sizes
- In practice, a priori job size information is not always available
- All previous work assumes perfect a priori knowledge of job sizes
- How does performance depend on the quality of job size information?
8 Correlation
We study the performance of size-based schedulers as a function of the correlation coefficient (Pearson's R) between actual job sizes and estimated job sizes.
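The correlation coefficient above can be computed directly from a list of actual sizes and a list of estimates. The following is a minimal sketch of the standard Pearson formula; the function name `pearson_r` is an illustrative choice and is not taken from the talk.

```python
import math


def pearson_r(actual, estimated):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    if n != len(estimated) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    # Covariance and variances, all without the common 1/n factor,
    # which cancels in the ratio below.
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    if var_a == 0 or var_e == 0:
        raise ValueError("R is undefined when either sequence is constant")
    return cov / math.sqrt(var_a * var_e)
```

An estimator that is a positive linear function of the true size gives R = 1, and one that is independent of the true size gives R near 0.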
10 Simulation Setup: Trace generator
[Diagram: Correlation (Pearson's R), Distribution A, and Distribution B are the inputs to the Trace Generator]
- Correlated random pairs of X and Y
  - X has distribution A
  - Y has distribution B
  - X and Y are correlated with coefficient R
11 Simulation Setup: Trace generator
- Algorithm: Normal-To-Anything
  - First developed by Cario and Nelson, INFORMS Journal on Computing 10, 1 (1998).
  - We simplified the algorithm and were the first to introduce it into simulation studies of computer systems
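The general idea behind Normal-To-Anything can be sketched as follows: draw a pair of correlated standard normals, map each to a uniform value through the normal CDF, then map each uniform value through the inverse CDF of the target distribution. This is a hedged illustration only. The slides do not describe the authors' simplified algorithm, so the names `correlated_pairs` and `exponential_inv_cdf` and the choice of exponential marginals are assumptions. Note also that `rho` is the correlation of the underlying normals; the Pearson's R of the resulting (X, Y) pairs generally differs from it, so hitting a target R requires adjusting `rho`, which this sketch omits.

```python
import math
import random


def std_normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exponential_inv_cdf(mean):
    """Inverse CDF of an exponential distribution (illustrative marginal)."""
    return lambda u: -mean * math.log(1.0 - u)


def correlated_pairs(n, rho, inv_cdf_a, inv_cdf_b, seed=0):
    """Generate n pairs (x, y): x ~ A, y ~ B, coupled through normals
    whose correlation is rho (hypothetical helper, not the authors' code)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # z2 is standard normal and has correlation rho with z1.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Clamp the uniforms away from 0 and 1 so inverse CDFs stay finite.
        u1 = min(max(std_normal_cdf(z1), 1e-12), 1.0 - 1e-12)
        u2 = min(max(std_normal_cdf(z2), 1e-12), 1.0 - 1e-12)
        pairs.append((inv_cdf_a(u1), inv_cdf_b(u2)))
    return pairs
```

For example, `correlated_pairs(1000, 0.8, exponential_inv_cdf(1.0), exponential_inv_cdf(1.0))` yields actual/estimated size pairs whose marginals are both exponential with mean 1.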
12 Scatter plot of example traces
[Figure: two scatter plots of Y against X for example traces, one with R = 0.78 and one with R = 0.13]
13 Simulation Setup: Performance metrics
- Performance metrics
  - Mean response time (sojourn time, turn-around time)
  - Slowdown: the ratio of a job's response time to its size; used as the fairness metric
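Both metrics follow directly from per-job arrival times, completion times, and sizes. A minimal sketch, with the function names `mean_response_time` and `slowdowns` chosen here for illustration:

```python
def mean_response_time(arrivals, completions):
    """Average of (completion - arrival) over all jobs."""
    responses = [c - a for a, c in zip(arrivals, completions)]
    return sum(responses) / len(responses)


def slowdowns(arrivals, completions, sizes):
    """Per-job slowdown: response time divided by the job's actual size."""
    return [(c - a) / s for a, c, s in zip(arrivals, completions, sizes)]
```

A slowdown of 1 means the job was never delayed; comparing slowdown across job sizes shows whether large jobs pay for the gains of small ones.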
14 Simulation Setup: Simulator
- Simulator
  - Written in C
  - Supports M/G/1 and G/G/n/m queuing models
- Simulator validation
  - Little's law
  - Repeat the simulations in the FSP paper [Friedman et al., SIGMETRICS '03]
  - Compare with available theoretical results [Bansal and Harchol-Balter, SIGMETRICS '01]
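One way a Little's law check (L = λW) could be applied to simulator output is sketched below. The slides do not say how the authors performed this validation, so this is an assumption: it computes the time-average number of jobs in the system from the event sequence and compares it with the arrival rate times the mean response time. It assumes time starts at 0, every job completes, and the observation window ends at the last completion. The names `time_average_jobs` and `littles_law_gap` are hypothetical.

```python
def time_average_jobs(arrivals, completions):
    """Time-average number of jobs in system over [0, last completion]."""
    events = [(t, +1) for t in arrivals] + [(t, -1) for t in completions]
    events.sort()
    area = 0.0      # integral of N(t) dt accumulated so far
    in_system = 0   # N(t) just before the current event
    last_t = 0.0
    for t, delta in events:
        area += in_system * (t - last_t)
        in_system += delta
        last_t = t
    return area / last_t


def littles_law_gap(arrivals, completions):
    """|L - lambda * W|; should be near 0 for consistent simulator output."""
    horizon = max(completions)
    n = len(arrivals)
    lam = n / horizon
    w = sum(c - a for a, c in zip(arrivals, completions)) / n
    return abs(time_average_jobs(arrivals, completions) - lam * w)
```

A gap far from zero would indicate that the simulator's queue-length bookkeeping and its per-job completion records disagree.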
15 Simulation Setup: Scheduling Policies
- PS: Processor Sharing
- Size-based scheduling policies
  - SRPT: ideal SRPT scheduler
  - SRPT-E: SRPT scheduler using estimated job size
  - FSP: ideal Fair Sojourn Protocol
  - FSP-E: FSP scheduler using estimated job size
Each simulation is repeated 20 times and we present the average
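The difference between SRPT and SRPT-E can be illustrated with a small single-server simulation. This is a sketch under stated assumptions and is not the authors' C simulator: the function name `simulate_srpt` and the job tuple layout are hypothetical, and the treatment of underestimated jobs (their believed remaining time simply goes negative, so they keep priority) is one possible design choice that the slides do not specify.

```python
import heapq


def simulate_srpt(jobs, use_estimates=False):
    """Preemptive single-server SRPT.

    jobs: list of (arrival, actual_size, estimated_size).
    With use_estimates=False priorities use actual sizes (ideal SRPT);
    with use_estimates=True they use the estimates (SRPT-E).
    Returns completion times in the same order as jobs.
    """
    n = len(jobs)
    order = sorted(range(n), key=lambda i: jobs[i][0])
    remaining = [size for _, size, _ in jobs]   # true work left per job
    completion = [None] * n
    ready = []          # heap of (believed remaining time, job index)
    now = 0.0
    next_arrival = 0    # position in `order` of the next job to arrive
    while next_arrival < n or ready:
        if not ready:
            # Server idle: jump ahead to the next arrival.
            now = max(now, jobs[order[next_arrival]][0])
        # Admit every job that has arrived by `now`.
        while next_arrival < n and jobs[order[next_arrival]][0] <= now:
            i = order[next_arrival]
            key = jobs[i][2] if use_estimates else jobs[i][1]
            heapq.heappush(ready, (key, i))
            next_arrival += 1
        key, i = heapq.heappop(ready)
        # Run job i until it finishes or the next arrival, whichever is first.
        run = remaining[i]
        if next_arrival < n:
            run = min(run, jobs[order[next_arrival]][0] - now)
        now += run
        remaining[i] -= run
        if remaining[i] <= 0.0:
            completion[i] = now
        else:
            # Preempted: requeue with its believed remaining time reduced.
            heapq.heappush(ready, (key - run, i))
    return completion
```

With jobs `[(0, 4, 4), (1, 1, 1)]`, ideal SRPT preempts the long job when the short one arrives. With the estimates swapped in relative order, `[(0, 4, 1), (1, 1, 5)]` and `use_estimates=True`, the short job waits behind the long one, which raises the mean response time from 3 to 4.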
17 Simulation Results: Mean response time
18 Simulation Results: Slowdown (R = 0.0224)
19 Simulation Results: Slowdown (R = 0.239)
20 Simulation Results: Slowdown (R = 0.4022)
21 Simulation Results: Slowdown (R = 0.5366)
22 Simulation Results: Slowdown (R = 0.7322)
23 Simulation Results: Slowdown (R = 0.9779)
24 Simulation Results: Conclusions
- Performance heavily depends on correlation
- SRPT-E and FSP-E can outperform PS given an effective job size estimator
- Crossover point of performance metrics is a function of correlation
  - Also of job size distributions (see TR NWU-CS-04-33)
26 New Applications: Web server scheduling (TR NWU-CS-04-33)
- Is file size a good estimator of a job's service time (processing time)? Not really (R ≈ 0.14)
[Figure: scatter plot of file size and service time (wall clock time)]
27 New Applications: Web server scheduling
- Domain-based estimator: much more accurate prediction of the service time at low overhead
28 New Applications: P2P server-side scheduling (LCR 04)
- The server side of current file-sharing P2P applications is superficially similar to a web server
  - Both send back files upon request.
- However, a P2P application can't even know the file size accurately a priori
  - Partial downloads
- Our ongoing work shows that SRPT-E performs well using our time-series-based job size estimators.
29 New Applications: Network backup system scheduling
- Incremental backup copies only the files that have been created or modified since a previous backup
- With incremental backup, the actual job size is difficult to know until the backup finishes
- We believe that SRPT-E or FSP-E can be applied with time-series-based job size predictors
30 Summary
- Performance of size-based scheduling policies depends on the correlation between size estimates and actual sizes
  - Fairness, mean response time, etc.
- Estimator must preserve ordering of job sizes for high performance
  - Performance degrades as correlation degrades
- Effective new estimators for Web and P2P
31 For More Information
- Prescience Laboratory
  - http://plab.cs.northwestern.edu
- Home Page of Dong Lu
  - http://www.cs.northwestern.edu/donglu/
33 Research Summary by subjects
- Grid Computing
- Internet Measurement and Prediction
- Queuing and Scheduling
- Fat-Tree Based End System Multicast
- Wireless Ad Hoc Networks
- Incentivized Protocol Design for Peer-to-Peer Systems
- Parallel Computing
34 Grid Computing
- Dong Lu, Peter Dinda, "GridG: Generating Realistic Computational Grids," ACM SIGMETRICS Performance Evaluation Review (PER), Volume 30, Number 4, 2003.
- Dong Lu, Peter Dinda, "Synthesizing Realistic Computational Grids," Proceedings of the 15th ACM/IEEE Supercomputing (SC 2003), Phoenix, AZ, November 2003.
- Peter Dinda, Dong Lu, "Nondeterministic queries in a relational Grid information service," Proceedings of the 15th ACM/IEEE Supercomputing (SC 2003), Phoenix, AZ, November 2003.
- Dong Lu, Peter Dinda, Jason Skicewicz, "Scoped and Approximated queries in a relational Grid Information Service," Proceedings of the 4th IEEE/ACM International Workshop on Grid Computing (Grid 2003), November 2003, Phoenix, AZ.
- Bin Lin, Peter Dinda, Dong Lu, "User-driven Scheduling of Interactive Virtual Machines," Proceedings of Grid 2004, Pittsburgh, PA, November 2004.
35 Internet Measurement and Prediction
- Dong Lu, Y. Qiao, Peter Dinda, and F. Bustamante, "Characterizing and Predicting TCP Throughput on the Wide Area Network," Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS 2005), June 2005, Columbus, Ohio. To appear.
- Dong Lu, Yi Qiao, Peter Dinda, Fabian Bustamante, "Modeling and Taming Parallel TCP on the Wide Area Network," Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2005), April 4-8, 2005, Denver, Colorado.
36 Queuing and Scheduling
- Dong Lu, Huanyuan Sheng, Peter Dinda, "Size-Based Scheduling Policies with Inaccurate Scheduling Information," Proceedings of MASCOTS 2004, October 2004, Volendam, The Netherlands.
- Dong Lu, Peter Dinda, Yi Qiao, Huanyuan Sheng, Fabian Bustamante, "Applications of SRPT Scheduling with Inaccurate Scheduling Information," (short paper) Proceedings of MASCOTS 2004, October 2004, Volendam, The Netherlands.
- Yi Qiao, Dong Lu, Fabian Bustamante, Peter Dinda, "Looking at the Server-Side of Peer-to-Peer Systems," Proceedings of the 7th ACM Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers (LCR 2004), October 2004, Houston, Texas.
- Dong Lu, Huanyuan Sheng, and Peter Dinda, "Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies," Technical Report NWU-CS-04-33, Department of Computer Science, Northwestern University, April 2004. In submission.
37 Fat-Tree Based End System Multicast: FatNemo
- Stefan Birrer, Dong Lu, Fabian Bustamante, Yi Qiao, Peter Dinda, "FatNemo: Building a Resilient Multi-Source Multicast Fat-Tree," Proceedings of the Ninth International Workshop on Web Content Caching and Distribution (WCW 2004), October 2004, Beijing, China. Also appeared in LNCS, Vol. 3293/2004, pp. 182-196.
- Long version in submission
38 Wireless Ad Hoc Networks
- Dong Lu, Haitao Wu, Qian Zhang, Wenwu Zhu, "PARS: Stimulating Cooperation for Power-Aware Routing in Ad-Hoc Networks," Proceedings of the 40th IEEE International Conference on Communications (ICC 2005), May 2005, Seoul, Korea. To appear.
39 Incentivized Protocol Design for Peer-to-Peer Systems
- Dong Lu, Yi Qiao, Peter Dinda, Fabian Bustamante, "MultiTorrents: Bandwidth Optimized Hybrid Multicast for Incentivized P2P File Sharing." In submission.
40 Parallel Computing
- Dong Lu, Peter Dinda, "Virtualized Audio: A Highly Adaptive Interactive High Performance Computing Application," Proceedings of the 6th Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers (LCR 2002), Washington, DC, 2002. Also to appear in LNCS.