Scheduling in Heterogeneous Grid Environments: The Effects of Data Migration
1
Scheduling in Heterogeneous Grid Environments: The Effects of Data Migration
  • Leonid Oliker, Hongzhang Shan
  • Future Technologies Group
  • Lawrence Berkeley National Laboratory
  • Warren Smith, Rupak Biswas
  • NASA Advanced Supercomputing Division
  • NASA Ames Research Center

2
Motivation
  • Geographically distributed resources
  • Difficult to schedule and manage efficiently
  • Autonomy (local scheduler)
  • Heterogeneity
  • Lack of perfect global information
  • Conflicting requirements between users and system
    administrators

3
Current Status
  • Grid Initiatives
  • Global Grid Forum, NASA Information Power Grid,
    TeraGrid, Particle Physics Data Grid, E-Grid, LHC
    Challenge
  • Grid Scheduling Services
  • Enabling multi-site applications
  • Multi-Disciplinary Applications, Remote
    Visualization, Co-Scheduling, Distributed Data
    Mining, Parameter Studies
  • Job Migration
  • Improve Time-to-Solution
  • Avoid dependency on single resource provider
  • Optimize application mapping to target
    architecture
  • But what are the tradeoffs of data migration?

4
Our Contributions
  • Interaction between grid scheduler and local
    scheduler
  • Architectures: distributed, centralized, and ideal
  • Real workloads
  • Performance metrics
  • Job migration overhead
  • Superscheduler scalability
  • Fault tolerance
  • Multi-resource requirements

5
Distributed Architecture
(Diagram: in each local environment, a compute server with multiple PEs is
managed by a local scheduler; in the grid environment, per-site grid
schedulers and middleware exchange jobs and info over a shared
communication infrastructure.)
6
Interaction between Grid and Local Schedulers
  • AWT: Approximate Wait Time
  • CRU: Current Resource Utilization
  • JR: Job Requirements

If AWT < τ (a threshold), the job stays local; otherwise it is handed to
the grid scheduler.
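A minimal sketch of this hand-off in Python (the scheduler objects and the
threshold value are assumed for illustration, not taken from the slides):

    # Assumed tunable: the longest local wait considered acceptable.
    AWT_THRESHOLD_SEC = 60.0

    def place_job(job, local_scheduler, grid_scheduler):
        # The local scheduler reports its Approximate Wait Time (AWT).
        awt = local_scheduler.approximate_wait_time(job)
        if awt < AWT_THRESHOLD_SEC:
            local_scheduler.enqueue(job)   # wait is acceptable: stay local
        else:
            grid_scheduler.migrate(job)    # too long: hand off to the grid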
7
Sender-Initiated (S-I)
(Diagram: the host broadcasts job i's requirements to its partners; each
machine i replies with its ARTi and CRUi; the job is sent to the selected
partner and its results are returned.)

Select the machine with the smallest Approximate Response Time (ART);
break ties by CRU.
ART = Approximate Wait Time + Estimated Run Time
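A minimal sketch of the S-I selection rule in Python (query is a
hypothetical call that returns a partner's ART and CRU):

    def select_host(job, partners):
        # Send the job's requirements to every partner; each replies with
        # ART (approx wait time + estimated run time) and its CRU.
        replies = []
        for partner in partners:
            art, cru = partner.query(job.requirements)  # hypothetical RPC
            replies.append((art, cru, partner))
        # Smallest ART wins; ties are broken by the lower utilization.
        art, cru, best = min(replies, key=lambda r: (r[0], r[1]))
        return best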
8
Receiver-Initiated (R-I)
(Diagram: idle partners volunteer by sending the host a free signal; the
host then sends job i's requirements only to those volunteers, which reply
with their ART and CRU before the job is dispatched.)

Querying begins only after a free signal is received.
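A minimal sketch of the R-I flow in Python (hypothetical names): the host
stays passive and queries only the machines that have volunteered.

    def on_free_signal(host, volunteer):
        # A partner announced it is free; remember it as a volunteer.
        host.volunteers.add(volunteer)
        if not host.pending_jobs:
            return
        job = host.pending_jobs[0]
        # Query only the volunteers; each replies with (ART, CRU).
        replies = [(v.query(job.requirements), v) for v in host.volunteers]
        (art, cru), best = min(replies, key=lambda r: r[0])
        best.submit(host.pending_jobs.pop(0))
        host.volunteers.discard(best)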
9
Symmetrically-Initiated (Sy-I)
  • First, work in R-I mode
  • Change to S-I mode if no machines volunteer
  • Switch back to R-I after job is scheduled
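A minimal sketch of the Sy-I mode switch in Python, reusing the
hypothetical select_host from the S-I sketch above:

    def schedule(host, job):
        if host.volunteers:
            # R-I mode: machines have volunteered; pick the best of them.
            replies = [(v.query(job.requirements), v)
                       for v in host.volunteers]
            (art, cru), target = min(replies, key=lambda r: r[0])
        else:
            # Nobody volunteered: fall back to S-I and query all partners.
            target = select_host(job, host.partners)
        target.submit(job)
        host.volunteers.clear()  # job placed: revert to passive R-I mode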

10
Centralized Architecture
(Diagram: a single grid scheduler in the middleware layer dispatches jobs
directly to every compute server.)

Advantages: global view. Disadvantages: single point of failure, limited
scalability.
11
Performance Metrics
  • NAWT: Normalized Average Wait Time
  • NART: Normalized Average Response Time
  • FOJM: Fraction Of Jobs Migrated
  • FDVM: Fraction of Data Volume Migrated
  • DMOH: Data Migration OverHead (fraction of response
    time spent moving data)
12
Resource Configuration and Site Assignment
ServerID  Nodes  CPUs/Node  CPU Speed  Site (3 Sites)  Site (6 Sites)  Site (12 Sites)
S1        184    16         375 MHz    0               0               0
S2        305    4          332 MHz    1               1               1
S3        144    8          375 MHz    2               3               2
S4        256    4          600 MHz    1               0               3
S5        32     2          250 MHz    2               2               4
S6        128    4          400 MHz    2               5               5
S7        64     2          250 MHz    2               5               6
S8        144    8          375 MHz    1               2               7
S9        256    4          600 MHz    0               4               8
S10       32     2          250 MHz    0               1               9
S11       128    4          400 MHz    0               3               10
S12       64     2          250 MHz    1               4               11
  • Each local site network has a peak bandwidth of
    800 Mb/s (gigabit Ethernet LAN)
  • The external network offers 40 Mb/s
    point-to-point (high-performance WAN)
  • Assume all data transfers share the network
    equally (network contention is modeled)
  • Assume performance is linearly related to CPU speed
  • Assume users have pre-compiled code for each of
    the heterogeneous platforms
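A minimal sketch in Python of the equal-share contention model assumed
above (function and constant names are illustrative):

    LAN_MBPS = 800.0  # peak intra-site bandwidth (gigabit Ethernet LAN)
    WAN_MBPS = 40.0   # point-to-point inter-site bandwidth (WAN)

    def transfer_time_sec(volume_mb, same_site, concurrent_transfers):
        link = LAN_MBPS if same_site else WAN_MBPS
        share = link / max(1, concurrent_transfers)  # equal sharing
        return volume_mb * 8.0 / share               # MB -> Mb -> seconds

For example, a 300 MB input moved alone across the WAN takes
300 * 8 / 40 = 60 seconds; twenty concurrent transfers each take twenty
times as long.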

13
Job Workloads
WorkloadID  Time Period (Start-End)  # of Jobs  Avg. Input Size (MB)
W1          03/2002-05/2002          59,623     312.7
W2          03/2002-05/2002          22,941     300.8
W3          03/2002-05/2002          16,295     305.0
W4          03/2002-05/2002           8,291     237.3
W5          03/2002-05/2002          10,543      28.9
W6          03/2002-05/2002           7,591     236.1
W7          03/2002-05/2002           7,251      86.5
W8          09/2002-11/2002          27,063     293.0
W9          09/2002-11/2002          12,666     328.3
W10         09/2002-11/2002           5,236      29.3
W11         09/2002-11/2002          11,804     226.5
W12         09/2002-11/2002           6,911      53.7
  • Systems located at Lawrence Berkeley National
    Laboratory, NASA Ames Research Center, Lawrence
    Livermore National Laboratory, and the San Diego
    Supercomputer Center
  • Data volume information was not available; assume
    data volume is correlated with the volume of work
  • B is the number of Kbytes transferred per work
    unit (CPU second of runtime)
  • Our best estimate is B = 1 KB for each CPU second
    of application execution
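A short worked example of the B estimate in Python: with B = 1 KB per CPU
second, a job consuming roughly 300,000 CPU seconds is charged about
300 MB of input, the scale of the average input sizes in the table above.

    B_KB_PER_CPU_SEC = 1.0  # best estimate from this slide

    def estimated_input_mb(cpu_seconds):
        # Data volume is assumed proportional to the volume of work.
        return cpu_seconds * B_KB_PER_CPU_SEC / 1024.0  # KB -> MB

    print(estimated_input_mb(300_000))  # ~292.97 MB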

14
Scheduling Policy
(12 sites, workload B)
  • Large potential gain from using a grid superscheduler
  • Reduced average wait time by 25X compared with
    the local scheme!
  • Sender-Initiated performance is comparable to
    Centralized
  • Inverse relationship between the migration metrics
    (FOJM, FDVM) and the timing metrics (NAWT, NART)
  • Very small fraction of response time is spent moving
    data (DMOH)

15
Data Migration Sensitivity
(Sender-Initiated, 12 sites)
  • NAWT for 100B is almost 8X that of B; NART is 50% higher
  • DMOH increases to 28% and 44% for 10B and 100B,
    respectively
  • As B increases, data migration (FDVM) decreases
    due to the increasing overhead
  • FOJM is inconsistent because it measures the # of
    jobs, NOT data volume

16
Site Number Sensitivity
(Sender-Initiated)
  • 0.1B shows no site sensitivity
  • 10B shows a noticeable effect as the number of
    sites decreases from 12 to 3:
  • Decrease in time (NAWT, NART) due to the increase
    in network bandwidth
  • Increase in the fraction of data volume migrated
    (FDVM)
  • 40% increase in the fraction of response time spent
    moving data (DMOH)

17
Communication-Oblivious Scheduling
(Sender-Initiated)
  • For 10B, if data migration cost is not considered
    in the scheduling algorithm:
  • NART increases 14X and 40X for 12 sites and 3
    sites, respectively
  • NAWT increases 28X and 43X for 12 sites and 3
    sites, respectively
  • DMOH is over 96%! (only 3% for the base B setting)
  • 16% of all jobs are blocked from executing while
    waiting for data
  • Compared with practically 0% for
    communication-aware scheduling

18
Increased Workload Sensitivity
(Sender-Initiated, 12 sites, workload B)
  • Grid scheduling runs 40% more jobs compared with
    the non-grid local scheme
  • No increase in time (NAWT, NART)
  • Weighted utilization increased from 66% to 93%
  • However, there is a fine line: when the # of jobs
    increases by 45%,
  • NAWT grows 3.5X and NART grows 2.4X!

19
Conclusions
  • Studied the impact of data migration by simulating:
  • Compute servers
  • Grouping of servers into sites
  • Inter-server networks
  • Results showed huge benefits of grid scheduling:
  • S-I reduced average turnaround time by 60%
    compared with the local approach, even in the
    presence of input/output data migration
  • The algorithm can execute 40% more jobs in a grid
    environment and deliver the same turnaround times
    as the non-grid scenario
  • For large data files, it is critical to consider
    migration overhead:
  • 43X increase in NART using communication-oblivious
    scheduling

20
Future Work
  • Superscheduling scalability
  • Resource discovery
  • Fault tolerance
  • Multi-resource requirements
  • Architectural heterogeneity
  • Practical deployment issues