Batch Queue Delay Prediction Tools and NWS

1
Batch Queue Delay Prediction Tools and NWS
  • Daniel Nurmi, Rich Wolski, Ryan Garver, John
    Brevik, Graziano Obertelli
  • University of California, Santa Barbara

2
Choices
  • Clusters, workstation farms, SMP machines, etc.
  • All differ in significant ways: architecture,
    connectivity, interactive access instead of a
    batch scheduler (PBS/Torque/Maui, LSF, LoadLeveler, Condor)
  • Question: If I have access to more than one
    system, which do I choose?

3
Wouldn't it be nice ...
  • ... to have an estimate of resource availability:
  • CPU usage
  • available memory
  • available disk
  • latency
  • bandwidth
  • batch queue waiting time

4
Our approach
  • Provide the user with quantile bounds, with
    quantifiable confidence, on queue wait time
  • Often the real question is "how long will I wait at
    most?"
  • Quantile prediction
  • Not an expected wait time prediction
  • Answer the question: "At most, how long will my job
    wait 95% of the time?"
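
A quantile bound with a stated confidence can be sketched with a standard binomial argument on order statistics. This is a minimal illustration, not necessarily the method behind the tool: it assumes the historical wait times are independent and identically distributed, and the function name and parameters are hypothetical.

```python
from math import exp, lgamma, log


def quantile_upper_bound(waits, q=0.95, confidence=0.95):
    """Return a value that bounds the q-quantile from above with the given
    confidence, or None if the history is too short to support the claim.

    Assumption (not from the slides): waits are i.i.d. samples. Then the
    k-th smallest sample is >= the q-quantile with probability
    P(Binomial(n, q) <= k - 1), so we pick the smallest such k.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    xs = sorted(waits)
    n = len(xs)
    cdf = 0.0
    for k in range(1, n + 1):
        j = k - 1  # number of samples allowed to fall below the quantile
        # Binomial pmf computed in log space so large histories do not overflow.
        log_pmf = (lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
                   + j * log(q) + (n - j) * log(1.0 - q))
        cdf += exp(log_pmf)
        if cdf >= confidence:
            return xs[k - 1]
    return None
```

Under these assumptions, q = 0.95 with confidence 0.95 needs at least 59 observations before any bound exists, because 1 - 0.95^n first reaches 0.95 at n = 59; with exactly 59 observations the bound is the largest wait seen so far.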

9
Network Weather Service
  • Monitors multiple resources
  • CPU
  • Memory
  • Bandwidth
  • Latency
  • Availability on Condor
  • Others

10
Thank You!
  • NSF and NMI
  • Next Generation Software (NGS) program
  • SDSC
  • MAYHEM lab
  • Questions?
  • Web Interface: http://nws.cs.ucsb.edu/batchq

11
System Overview
  • Diagram components: Nws_sensor (three instances), Nws_memory,
    Predictor, HTTP Server, browser, cmdline tool; labeled UCSB
12
Bounds and Mean
  • Previous work attempts to predict a point-value
    wait time
  • Example wait times (seconds): 10, 15, 20, 25, 50, 100, 125, 500,
    1000, 100000
  • Mean: about 10000 (2.8 hours)
  • 90% of the time: at most 1000 (17 minutes)
  • Real data has this shape
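
The gap between the mean and the bound can be checked directly. The sketch below assumes the wait times are in seconds and takes the largest value as 100000, which is what a mean of roughly 10000 implies; the helper names are illustrative.

```python
from statistics import mean

# Wait times in seconds. The largest value is taken as 100000 (an assumption),
# because that is what a mean of roughly 10000 seconds implies.
waits = [10, 15, 20, 25, 50, 100, 125, 500, 1000, 100000]


def nearest_rank_percentile(samples, pct):
    """Smallest sample with at least pct percent of samples at or below it."""
    xs = sorted(samples)
    k = max(1, (pct * len(xs) + 99) // 100)  # integer ceiling of pct * n / 100
    return xs[k - 1]


def fraction_within(samples, limit):
    """Fraction of samples that are at or below limit."""
    return sum(1 for s in samples if s <= limit) / len(samples)


avg = mean(waits)
p90 = nearest_rank_percentile(waits, 90)
print(f"mean = {avg} s ({avg / 3600:.1f} h)")
print(f"90th percentile = {p90} s ({p90 / 60:.0f} min)")
print(f"fraction of jobs waiting no longer than the mean = {fraction_within(waits, avg)}")
```

A single very long wait pulls the mean far above what nine of the ten jobs experienced, which is why a point estimate is a poor answer to "how long will I wait at most?".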

13
Shape of Delay
14
Making Predictions
  • Diagram components: NWS Memory, Historical Job Data,
    Batch Queue Simulator, Quantile Predictor, Current Predictions,
    Job Clustering, Changepoint Detection
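
The slide names changepoint detection without describing it. One simple heuristic, shown here purely as an assumption, is to treat a run of consecutive jobs that exceed the current bound as evidence that the queue's behavior has changed, and then discard the older history. The class name, the `max_misses` and `keep` parameters, and the pluggable `bound_fn` are all hypothetical.

```python
class TrimmingHistory:
    """Wait-time history that is cut back after a run of missed bounds.

    Hypothetical sketch: not taken from the slides or from the NWS code.
    """

    def __init__(self, bound_fn, max_misses=3, keep=59):
        self.bound_fn = bound_fn      # maps a history list to a bound (or None)
        self.max_misses = max_misses  # consecutive misses that signal a change
        self.keep = keep              # observations retained after a change
        self.history = []
        self.misses = 0

    def observe(self, wait):
        """Record one wait time; return True if a changepoint was declared."""
        bound = self.bound_fn(self.history) if self.history else None
        missed = bound is not None and wait > bound
        self.misses = self.misses + 1 if missed else 0
        self.history.append(wait)
        if self.misses >= self.max_misses:
            # Keep only the most recent observations and start counting afresh.
            self.history = self.history[-self.keep:]
            self.misses = 0
            return True
        return False
```

Any bound function can be plugged in, for example `TrimmingHistory(bound_fn=max, keep=5)` for a quick experiment; the thresholds would need tuning against real queue logs.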
15
Queue Waiting Time
16
Grouping Jobs
Requested Nodes 1 - 4
Requested Nodes 17 - 64
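
Grouping jobs by requested node count, so that each group gets its own wait-time history, can be sketched as below. Only the 1 - 4 and 17 - 64 ranges appear on the slide; the 5 - 16 and 65+ ranges, and all names in the code, are assumptions made to complete the example.

```python
from bisect import bisect_left
from collections import defaultdict

# Upper edges of the node-count ranges. 4 and 64 come from the slide;
# 16 (and the open-ended top range) are assumed for illustration.
UPPER_EDGES = [4, 16, 64]
LABELS = ["1-4", "5-16", "17-64", "65+"]


def node_group(requested_nodes):
    """Return the label of the range that contains requested_nodes."""
    if requested_nodes < 1:
        raise ValueError("a job must request at least one node")
    return LABELS[bisect_left(UPPER_EDGES, requested_nodes)]


def group_waits(jobs):
    """Split (requested_nodes, wait_seconds) pairs into per-group wait lists."""
    groups = defaultdict(list)
    for requested_nodes, wait_seconds in jobs:
        groups[node_group(requested_nodes)].append(wait_seconds)
    return dict(groups)
```

For example, `group_waits([(2, 30), (32, 4000), (4, 45)])` yields one list of waits for the 1-4 group and one for the 17-64 group, and a separate bound can then be computed for each.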