Title: Grid Prediction and Scheduling with Variance
1Grid Prediction and Scheduling with Variance
- Jennifer M. Schopf
- Argonne National Lab
- Feb 27, 2003
2Scheduling and Prediction on the Grid
- First step of Grid computing: basic functionality
- Run my job
- Transfer my data
- Security
- Next step more efficient use of the resources
- Scheduling
- Prediction
- Monitoring
3How can these resources be used effectively?
- Efficient scheduling
- Selection of resources
- Mapping of tasks to resources
- Allocating data
- Accurate prediction of performance
- Good performance prediction modeling techniques
4Outline
- Predicting large file transfers
- Joint with Sudharshan Vazhkudai
- Scheduling with variance
- Joint with Fran Berman
- Variance scheduling with better predictors
- Joint with Lingyun Yang
5Grid Scheduling Architecture (GGF)
6Scheduling
- Gather information
- Make a decision
- Take an action
- The action can be to run a job or to transfer a file
- Replica selection is becoming more commonplace
7High Energy Physics Data Movement
[Figure: the LHC tiered data-movement model - Tier 0 (CERN Computer Centre) feeding Tier 1 regional centres (FermiLab ~4 TIPS, France, Italy, Germany), Tier 2 institutes (~0.25 TIPS), and Tier 4 physicist workstations, with link rates ranging from PBytes/sec at the offline processor farm down to about 1 MByte/sec at the desktop. Image courtesy H. Newman, Caltech and C. Kesselman, ISI.]
8Data Replication
- Extremely large data sets
- Distributed storage sites
- One file may be available from a number of different sources
- Question: where is the best source for me to copy it from?
9Replica Selection
- Why not use something like Network Weather Service (NWS) probes?
- Wolski and Swany, UCSB
- Logging and prediction
- Small data transfers
- CPU load, memory, etc.
10Predictions of Large File Transfers
- Large file transfers don't look like small file transfers
11Predicting File Transfers
- (Work with Sudharshan Vazhkudai)
- Log GridFTP file transfers
- Part of Globus Toolkit
- Allows buffer tuning, parallel streams
- De facto standard for grid file transfers
- Use standard statistical predictions
- Means, medians, autoregressive techniques
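To make the predictor list concrete, here is a minimal Python sketch of mean, median, and exponentially smoothed (autoregressive-style) predictors over a log of past transfer throughputs; the function names and the smoothing constant are illustrative assumptions, not the exact estimators from this work.

    # Simple history-based predictors for GridFTP throughput (illustrative sketch).
    from statistics import mean, median

    def predict_mean(history):
        """Predict the next throughput as the mean of past observations."""
        return mean(history)

    def predict_median(history):
        """Predict the next throughput as the median of past observations."""
        return median(history)

    def predict_exp_smooth(history, alpha=0.5):
        """Exponentially smoothed (autoregressive-style) prediction."""
        estimate = history[0]
        for value in history[1:]:
            estimate = alpha * value + (1 - alpha) * estimate
        return estimate

    # Hypothetical throughputs (MB/s) of past transfers between two sites
    past = [4.1, 3.8, 5.0, 4.6, 4.9]
    print(predict_mean(past), predict_median(past), predict_exp_smooth(past))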
12Sample of Predictions
13Evaluating Predictors 1
14Evaluating Predictors 2
15Why didn't this work better?
- Average 15-25% errors
- GridFTP file transfers are sporadic, but standard time series techniques expect periodic behavior
- The current environment may not be captured by the latest measurements
- So what if we could add information about the background behavior?
16Using NWS data
17Some Details
- Regression techniques expect a 1-to-1 mapping between data streams
- Throw away extra NWS data
- Fill GridFTP data with last value
- Fill GridFTP data with average
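A minimal sketch of the two fill strategies just listed, assuming the GridFTP log is a time-sorted list of (timestamp, throughput) pairs and the NWS series provides the timestamps to align to (names and data layout are assumptions):

    def last_value_fill(gridftp, nws_times):
        """Return one GridFTP value per NWS timestamp, carrying the most recent
        observation forward (the first observation is used before any data exists)."""
        out, i, last = [], 0, gridftp[0][1]
        for t in nws_times:
            while i < len(gridftp) and gridftp[i][0] <= t:
                last = gridftp[i][1]
                i += 1
            out.append(last)
        return out

    def average_fill(gridftp, nws_times):
        """Where no GridFTP observation matches an NWS timestamp, use the
        average of all observed GridFTP values instead."""
        obs = dict(gridftp)
        avg = sum(obs.values()) / len(obs)
        return [obs.get(t, avg) for t in nws_times]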
18Why stop at just BW data?
- Disk time is up to 30% of the transfer time, so we also looked at combining GridFTP data with I/O stat (iostat) information as well as Network Weather Service data
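One way to combine the aligned streams is an ordinary least-squares fit of GridFTP throughput against NWS bandwidth and disk activity. This NumPy sketch is illustrative only and is not the regression variant reported in the papers; all names are assumptions.

    import numpy as np

    def fit_throughput_model(nws_bw, disk_busy, gridftp_tp):
        """Fit throughput ~ intercept + a*bandwidth + b*disk_busy by least squares."""
        X = np.column_stack([np.ones_like(nws_bw), nws_bw, disk_busy])
        coeffs, *_ = np.linalg.lstsq(X, gridftp_tp, rcond=None)
        return coeffs  # [intercept, bandwidth weight, disk weight]

    def predict_throughput(coeffs, bw_now, disk_now):
        """Predict throughput for current bandwidth and disk readings."""
        return coeffs[0] + coeffs[1] * bw_now + coeffs[2] * disk_now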
19GridFTP, NWS and I/Ostat Results
20Summary of File Prediction Work
- Using additional information increases the prediction accuracy
- 7-17% error
- We've also worked on predicting the variance
21Variance
- On-average behavior may differ from variance behavior
- Example:
- Data transfer from A will take 5-7 minutes
- Data transfer from B will take 3-9 minutes
- Which to pick?
- We looked at this question in the context of data distributions on shared clusters
22Scheduling with Variance
- Scheduling techniques can be developed to make use of the dynamic performance characteristics of shared resources
- The approach:
- Structural performance models
- Stochastic values and predictions
- Stochastic scheduling techniques
- Joint work with Fran Berman, UCSD
23[Diagram: Point Value Parameters / Stochastic Value Parameters → Structural Prediction Models → Stochastic Prediction → Stochastic Scheduling]
24Successive Over-Relaxation (SOR)
- Iterative solution to Laplace's equation
- Typical stencil application
- Divided into a red phase and a black phase
- 2-d grid of data divided into strips
25SOR
26Models
27Dedicated SOR Experiments
- Platform: 2 Sparc 2s, 1 Sparc 5, 1 Sparc 10
- 10 Mbit Ethernet connection
- Quiescent machines and network
- Prediction within 3% before memory spill
28Non-dedicated SOR results
- Available CPU on workstations varied from 0.43 to 0.53
29Platforms with Scattered Range of CPU Availability
30Improving structural models
- Available CPU has a range of 0.48 +/- 0.05
- Prediction should also have a range
31Using Additional Information
- Point value
- Bandwidth reported as 7 Mbits/sec
- Single value
- Often a best guess, an estimate under ideal circumstances, or a value accurate only for a given time frame
- Stochastic value
- Bandwidth reported as 7 Mbits/sec +/- 2 Mbits/sec
- A set of possible values weighted by probabilities
- Represents a range of likely behavior
32Stochastic Structural Models
- Goal: Extend structural models so that the resulting predictions are distributions
- A structural model is an equation, so:
- Need to represent stochastic information
- Normal distribution
- Interval
- Histogram
- Need to be able to mathematically combine the stochastic values in a timely manner
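As one possible representation, a normal-distribution stochastic value supports cheap combination under an independence assumption; this is a sketch of the idea, not the thesis implementation, and the example numbers are hypothetical.

    from dataclasses import dataclass
    import math

    @dataclass
    class Normal:
        """A stochastic value carried as a normal distribution (mean, sd)."""
        mean: float
        sd: float

        def __add__(self, other):
            # Sum of independent normals is normal; variances add.
            return Normal(self.mean + other.mean,
                          math.sqrt(self.sd**2 + other.sd**2))

        def scale(self, k):
            # Scaling a normal scales the mean and the standard deviation.
            return Normal(k * self.mean, abs(k) * self.sd)

    # Hypothetical: per-unit compute time of 0.01 s +/- 0.002 s over 1000 units,
    # plus a communication term of 2.0 s +/- 0.4 s
    compute = Normal(0.01, 0.002).scale(1000)
    comm = Normal(2.0, 0.4)
    print(compute + comm)   # predicted time as a distribution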
33Practical issues when using stochastic data
- Who/what can supply stochastic data?
- User
- Data from past runs
- On-line measurement tools
- Network weather service time series data
- Time frame
- Given a time series, how much data should we
consider?
34Accuracy of stochastic results
- The result of a stochastic prediction will also be a range of values
- Need to consider how to achieve a tight (sharp) interval
- What to do if the interval isn't tight?
35How can I use these predictions in scheduling?
[Diagram repeated from slide 23: Point Value Parameters / Stochastic Value Parameters → Structural Prediction Models → Stochastic Prediction → Stochastic Scheduling]
36Using stochastic predictions
- Simplest scheduling situation: given a data-parallel application, adjust the amount of data assigned to each processor to minimize execution time
37Delay in one can cause delay in all
38Stochastic Scheduling
- Examine:
- Stochastic data represented as normal distributions
- Data-parallel codes
- Fixed set of shared resources
- Question: How should data be distributed to minimize execution time?
- Approach: Adjust the data allocation so that a high-variance machine receives less work, in order to minimize the effects of contention
39Time Balancing
- Minimize execution time by assigning data so that each processor finishes at roughly the same time (a small solver sketch follows)
- Di = data assigned to processor i
- Ui = time per unit of data on processor i
- Ci = time to distribute the data to processor i
- Di*Ui + Ci = Dj*Uj + Cj for all i, j
- Sum of Di = Dtotal
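A minimal sketch of this computation: from Di*Ui + Ci = K for all i and Sum of Di = Dtotal, the common finish time is K = (Dtotal + Sum(Ci/Ui)) / Sum(1/Ui), and then Di = (K - Ci)/Ui. Function and variable names below are mine, not from the talk.

    def time_balance(U, C, D_total):
        """U[i]: time per unit of data on processor i; C[i]: distribution time.
        Return the data amounts Di that make all processors finish together."""
        K = (D_total + sum(c / u for c, u in zip(C, U))) / sum(1.0 / u for u in U)
        return [(K - c) / u for c, u in zip(C, U)]

    # Hypothetical example: 3 processors sharing 1000 units of data
    print(time_balance(U=[1.0, 2.0, 1.5], C=[5.0, 3.0, 4.0], D_total=1000))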
40Stochastic Time Balancing
- Adapt the time to compute a unit of data (ui) to reflect stochastic information
- A larger ui means a smaller Di (less data)
- If we have normal distributions:
- The 95% confidence interval corresponds to [m - 2sd, m + 2sd]
- If we set ui = m + 2sd
- We get a 95% conservative schedule
41Stochastic Time Balancing (cont)
- The set of equations is now:
- Di*(mi + 2sdi) + Ci = Dj*(mj + 2sdj) + Cj
- for all i, j
- Sum of Di = Dtotal
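The same solver can be reused with the conservative per-unit time; TF = 2 below reproduces the 95% conservative schedule, and the tuning factor introduced a few slides later simply makes TF adjustable (this parameterization is the only addition here).

    def stochastic_time_balance(means, sds, C, D_total, TF=2.0):
        """Conservative per-unit time ui = mi + TF*sdi; TF=2 gives the 95% schedule."""
        U = [m + TF * sd for m, sd in zip(means, sds)]
        return time_balance(U, C, D_total)   # reuses the solver sketched above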
42How do policies compare in a production environment?
- 4 contended Sparcs over 10 Mbit shared Ethernet
43Set of Schedules
44Tuning factor
- The tuning factor is the knob to turn to decide how conservative a schedule should be
- For example, it can be used to determine the number of standard deviations to add to the mean
- Let ui = mi + sdi*TF
- Solve:
- Di*(mi + sdi*TF) + Ci = Dj*(mj + sdj*TF) + Cj
45Extensible approach
- Don't have to use mean and standard deviation
- TF can be defined in a variety of ways
46Defining our stochastic scheduling policy goals
- Decrease execution time
- Predictable performance
- Avoid spikes in execution behavior
- More conservative when in doubt
47System of benefits and penalties
- Based on Sih and Lee's approach to scheduling
- Benefit (give a less conservative schedule to)
- Platforms with fewer varying machines
- Low variance machines, especially those with
lower power
48Partial ordering
49Algorithm for TF
50Scheduling Experiments
- Platform:
- 4 contended PCs running Linux
- 100 Mbit shared Ethernet connection
- 3 policies run back to back:
- Mean: Ui based on runtime mean prediction
- VTF: Ui based on mean and heuristic TF evaluation
- 95TF: Ui based on 95% confidence interval
51Metrics
- Window: Which of each window of three runs has the fastest execution time?
- Compare: How often was one policy better than, worse than, or split when compared with the policy run just before and just after?
- What's the right metric?
52SOR - scheduling 1
- Window: Mean 9, CTF 27, 95TF 22 (of 57)
- Compare: Better / Mixed / Worse
- Mean: 3 / 4 / 12
- VTF: 10 / 7 / 3
- 95TF: 6 / 9 / 4
53CPU performance
54SOR - scheduling 2
- Window: Mean 8, VTF 39, 95TF 11 (of 57)
- Compare: Better / Mixed / Worse
- Mean: 3 / 7 / 9
- VTF: 15 / 2 / 3
- 95TF: 3 / 8 / 8
55CPU
56Experimental Conclusions
- Stochastic information was more beneficial when there was higher variability in the available CPU
- We almost always saw a reduction in the variation of actual execution times
- It is unclear at this point when it is better to use which heuristic scheduling policy
57What if we had a better predictor?
- Create a predictor for the average CPU load over some future time interval, and for the variation of CPU load over that interval
- Use that with performance models to create a conservative schedule
- (joint work with Lingyun Yang, UC, and Ian Foster, UC/ANL)
58Mixed Tendency Prediction
- // Determine tendency
- if ((VT-1 - VT) < 0) then Tendency = Increase
- else if ((VT - VT-1) < 0) then Tendency = Decrease
- if (Tendency == Increase) then PT+1 = VT + IncrementConstant
- IncrementConstant: adjusted by an adaptation process
- else if (Tendency == Decrease) then PT+1 = VT - VT * DecrementFactor
- DecrementFactor: adjusted by an adaptation process
- IncrementConstant is set initially to 0.1
- DecrementFactor is set to 0.01
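A runnable Python sketch of the predictor as described above; the handling of unchanged values and the omission of the adaptation of the two constants are my assumptions.

    def mixed_tendency_predict(series, inc=0.1, dec=0.01):
        """Return one-step-ahead predictions for a CPU-load time series,
        following the tendency rules sketched on this slide."""
        preds = []
        for t in range(1, len(series)):
            prev, cur = series[t - 1], series[t]
            if cur > prev:            # tendency: increasing
                preds.append(cur + inc)
            elif cur < prev:          # tendency: decreasing
                preds.append(cur - cur * dec)
            else:                     # no change observed (assumption)
                preds.append(cur)
            # the full strategy also adapts inc and dec over time; omitted here
        return preds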
59Comparison to NWS
The mixed tendency prediction strategy outperforms the NWS predictors on all of the 38 CPU load time series with different properties. It achieves a prediction error that is, on average, 36% lower than that achieved by NWS.
60- Add in slide tying this back into scheduling
61Apply the new predictor to aggregated load information
62Apply the new predictor to standard deviation data
63Use this data in a conservative scheduling algorithm
- Cactus application: a simulation of a 3D scalar field produced by two orbiting astrophysical sources
- Data distribution based on:
- Ei(Di) = start_up + (Di*Compi(0) + Commi(0)) * effective CPU load
- Open question: how to define effective CPU load?
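Under the model reconstructed above, balancing Ei(Di) across processors reduces to the earlier time-balancing solver with Ui = Compi(0) * load_i and Ci = start_up + Commi(0) * load_i. The sketch below assumes the conservative "effective CPU load" is the predicted mean interval load plus a variance term, which is one plausible reading of the CS policy described on the next slide, not the paper's exact definition; all names are assumptions.

    def conservative_distribution(comp0, comm0, startup, mean_load, sd_load,
                                  D_total, TF=1.0):
        """Illustrative only: per-processor costs scaled by a conservative
        effective load (predicted mean + TF * predicted variability)."""
        load = [m + TF * s for m, s in zip(mean_load, sd_load)]
        U = [c * l for c, l in zip(comp0, load)]               # per-unit compute time
        C = [startup + cm * l for cm, l in zip(comm0, load)]   # fixed per-processor cost
        return time_balance(U, C, D_total)   # reuses the solver sketched at slide 39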
64Compare Different Approaches
- (1) One Step Scheduling (OSS): Use the one-step-ahead prediction of the CPU load
- (2) Predicted Mean Interval Scheduling (PMIS): Use the interval load prediction
- (3) Conservative Scheduling (CS): Use the conservative load prediction - the interval load prediction added to a measure of the predicted variance
- (4) History Mean Scheduling (HMS): Use the mean of the historical CPU load for the 5 minutes preceding the application start. This approximates the estimates used in other approaches.
- (5) History Conservative Scheduling (HCS): Use the conservative estimate of CPU load - add the mean and variance of the historical CPU load collected for the 5 minutes preceding the application run as the effective CPU load. This approximates Schopf and Berman.
65Results
66Translate abbrev into policies better
67Comparing Policies
68Average Mean and Average Standard Deviation
69Summary
- Variance information gives us stochastic values to help meet the prediction needs of Grid computing
- A stochastic scheduling policy can make use of these predictions to achieve better execution times and more predictable application behavior
- Better predictions of stochastic values will result in better policies
70Collaborators/References
- Sudharshan Vazhkudai (ANL/MS State)
- IPDPS 2002, HPDC 2002, Grid2002
- The AppLeS group: Fran Berman (UCSD), Rich Wolski (UCSB)
- SC99, Schopf Thesis, UCSD 1998
- Lingyun Yang (University of Chicago) and Ian
Foster (UC, ANL) - IPDPS 2003, submitted to HPDC 2003
72Contact Information
- Jennifer Schopf
- jms@mcs.anl.gov
- http://www.mcs.anl.gov/~jms