Title: Routing and Scheduling for Large File Transfers over Lambda Grids
1Routing and Scheduling for Large File Transfers
over Lambda Grids
- Amitabha Banerjee,
- Wu-chun Feng,
- Biswanath Mukherjee,
- Dipak Ghosal
- abanerjee_at_ucdavis.edu
2Grid computing - Introduction
- Grid computing
- Hardware and software infrastructure that
provides dependable, consistent and pervasive
access to resources to enable sharing of
computational resources, utility computing,
autonomic computing, collaboration among virtual
organizations, and distributed data processing,
among others.
- The network is as critical a resource as the CPU and
memory of a system.
- Large-scale data transfers (gigabytes to terabytes
of data) for distributed computing.
3Network - Requirements
- Backbone mesh networks for connectivity.
- Dedicated circuits (lambdas) for point-to-point
data transfers.
- Lambda reservation on demand.
- Setup and tear down of connections on demand.
- Efficient control plane mechanisms.
- Efficient and fair sharing of lambdas between
applications.
4Networks - Examples
- DOE UltraScience Net (USA)
- National LambdaRail Network (USA)
- CANARIE CA*net 4 (Canada)
- NetherLight (Netherlands)
- UKLight (UK)
- Properties
- - High RTT between end-points
- - TCP is inefficient over such paths; alternative protocols include RBUDP,
UDT, and FAST TCP
5Application: Genomes To Life (GTL) Project
[Figure: data warehouses, switches, and a supercomputer.]
Another example: OptIPuter project
6Problem Statement
- Given
- Graph G (V, E) representing the network.
- V represents switches and E the links.
- For each file f_ij (file j at node i), its size s_ij.
- Destination d ∈ V, to which all files are
destined.
- Find
- The circuit: the path along which
each file f_ij should be sent to d.
- The reservation times for this circuit: the start time at
which each file f_ij should be sent by the
corresponding node, and the end time.
- Objective: minimize the finish time for data aggregation.
- This is an integrated routing and scheduling problem.
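The objective can be illustrated with a minimal Python sketch. It is not the authors' formulation: the `Transfer` record, the node names, and the assumption that each link carries a single lambda (so two reservations on the same link may not overlap in time) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transfer:
    file_id: str   # e.g. "f11" = file 1 at node 1
    path: tuple    # links of the reserved circuit, e.g. (("n1", "d"),)
    start: float   # reservation start time, seconds
    end: float     # reservation end time, seconds

def finish_time(schedule):
    """Aggregation finish time: the latest end time over all transfers."""
    return max(t.end for t in schedule)

def is_feasible(schedule):
    """True if no link is reserved by two transfers at overlapping times.

    Assumption: one lambda per link, so a link serves one circuit at a time.
    """
    busy = {}  # link -> list of (start, end) reservations seen so far
    for t in schedule:
        for link in t.path:
            for (s, e) in busy.get(link, []):
                if t.start < e and s < t.end:
                    return False
            busy.setdefault(link, []).append((t.start, t.end))
    return True

# Hypothetical schedule: two files at node n1, one at node n2, destination d.
schedule = [
    Transfer("f11", (("n1", "d"),), 0.0, 4.0),
    Transfer("f12", (("n1", "d"),), 4.0, 5.0),
    Transfer("f21", (("n2", "n1"), ("n1", "d")), 5.0, 8.0),
]
print(finish_time(schedule), is_feasible(schedule))
```

A scheduler searching for the minimum finish time would compare feasible schedules by `finish_time` and discard those for which `is_feasible` is false.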
7Problem Statement - Example
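[Figure: example network; file f_21 has size s_21.]
Finish time = max(t^e_11, t^e_12, t^e_21, ...), where t^e_ij is the end time of the transfer of file f_ij.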
8Time Path Optimization
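[Figure: two alternative schedules for the same transfers, one with finish time 9 s and one with finish time 7 s. Labels: 4 G, 1 G, 6 G, 3 G. Each link is 1 Gbps; G = gigabit.]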
9Lambda Scheduling
- Dedicated circuits must be reserved for file
transfers.
- Duration of circuit reservation
- Sufficient for transferring the file end-to-end.
- Challenge
- How to predict file transfer time? From file
size? From past transfer profiles?
- It is important that incomplete file transfers do not
occur. Hence, the mean of past values is not a good estimate.
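The point about the mean can be made concrete with a short Python sketch. The transfer times below are hypothetical values chosen for illustration, not measurements from the talk, and `incomplete_fraction` is a helper name introduced here.

```python
import statistics

def incomplete_fraction(past_times, reservation):
    """Fraction of past transfers that would not have fit in `reservation` seconds."""
    return sum(t > reservation for t in past_times) / len(past_times)

# Hypothetical transfer times (seconds) observed for one file size.
past_times = [7.1, 7.3, 7.2, 9.8, 7.4, 7.2, 8.9, 7.3]

mean_reservation = statistics.mean(past_times)  # reserve for the average case
max_reservation = max(past_times)               # reserve for the worst case seen

print(incomplete_fraction(past_times, mean_reservation))
print(incomplete_fraction(past_times, max_reservation))
```

With a skewed history like this one, reserving the circuit for the mean leaves every slow transfer incomplete, while reserving for the worst observed time trades some idle lambda time for completeness.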
10Emulation of Long Distance Circuits
[Figure: sending host, 1 Gbps Ethernet link, dummynet (configuring RTT), 1 Gbps Ethernet link, receiving host.]
11Variability in End Host Performance
[Figure: time to transfer an 800 MB file using UDT as the
transport protocol.]
UDT: UDP-based Data Transfer Protocol
12Statistics of Transfer Time
UDT: UDP-based Data Transfer Protocol; RBUDP:
Reliable Blast UDP
13Reasons
- The receiver buffer at the NIC is sometimes overrun before
the OS schedules the transfer of data to memory.
- The transport protocol responds to the resulting NAKs much as
it would to congestion.
- Total achieved throughput varies with
- End system load.
- Which transport protocol is being used.
- How can the transfer time for a file be estimated so as to
formulate an efficient schedule?
14Possible Solutions
- Prediction models
- Model network and end-system performance to find
the bottleneck.
- Use past transfer profiles to estimate transfer
time.
- Transfer time cannot be linearly extrapolated
from file size. Use different profiles for
different ranges of file sizes.
- Use weighted values. Give higher weight to recent
transfers.
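One way to combine size-range profiles with recency weighting is sketched below in Python. The slides do not say which weighting scheme is used; the exponentially weighted moving average, the `alpha` parameter, and the `TransferProfile` class are assumptions made for this example. The 50 MB bucket width follows the profile interval quoted in the simulation parameters.

```python
class TransferProfile:
    """Per-size-range transfer-time profile with recency weighting (a sketch)."""

    def __init__(self, bucket_mb=50, alpha=0.3):
        self.bucket_mb = bucket_mb  # width of each file-size range, in MB
        self.alpha = alpha          # weight of the newest sample (assumed EWMA)
        self.estimate = {}          # bucket -> weighted transfer time, seconds
        self.worst = {}             # bucket -> longest transfer time seen

    def _bucket(self, size_mb):
        return int(size_mb // self.bucket_mb)

    def record(self, size_mb, seconds):
        """Add one completed transfer to the profile of its size range."""
        b = self._bucket(size_mb)
        if b not in self.estimate:
            self.estimate[b] = seconds
            self.worst[b] = seconds
        else:
            self.estimate[b] = (self.alpha * seconds
                                + (1 - self.alpha) * self.estimate[b])
            self.worst[b] = max(self.worst[b], seconds)

    def predict(self, size_mb):
        """Weighted estimate for this size range, or None if no history."""
        return self.estimate.get(self._bucket(size_mb))

    def worst_case(self, size_mb):
        """Longest past transfer in this size range, or None if no history."""
        return self.worst.get(self._bucket(size_mb))

profile = TransferProfile(bucket_mb=50, alpha=0.5)
profile.record(800, 8.0)    # hypothetical samples, not measured data
profile.record(810, 10.0)
print(profile.predict(805), profile.worst_case(805), profile.predict(420))
```

Keeping a separate profile per size range avoids extrapolating linearly across file sizes, and the worst-case value is available for reservations that must not be cut short.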
15Lambda Scheduling
- Off-line Scheduling
- Compute optimal routes and scheduling time
intervals using file transfer profiles.
- The route is determined and the corresponding lambdas are
reserved.
- On-line Scheduling
- After a file has been completely transferred,
modify the off-line schedule depending on the state.
16Off-line Scheduling
- Determine a file transfer schedule on the graph.
Define this as the Time Path Scheduling Problem (TPSP).
- Proven NP-complete by reduction from the
Multi-Processor Scheduling Problem (MSP).
- Modeled as a mixed-integer linear program (MILP). The MILP does not scale with network
size.
A. Banerjee et al., "A Time Path Scheduling
Problem (TPSP) for Aggregating Large Data Files
from Distributed Databases using an Optical
Burst-Switched Network," ICC 2004, Paris,
France, 2004.
17Off-line Scheduling - Heuristics
- Longest File First (LFF)
- Priority to longer files (Higher transfer time).
- For scheduling each file:
- Identify K paths to the destination randomly. (Choose
random weights for links, and apply Dijkstra's
algorithm.)
- Choose the best path for scheduling this file.
- Repeat for each file.
- Other heuristics
- Disjoint Path (DP)
- Most Distant File First (MDFF)
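The LFF steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: each link is assumed to carry one lambda, a file's transfer time is assumed not to depend on the path, "best path" is taken to mean earliest finish, and a link is only reserved after its last existing reservation (no gap filling). The names `shortest_path`, `lff_schedule`, and the three-node topology are hypothetical.

```python
import heapq
import random

def shortest_path(adj, weight, src, dst):
    """Dijkstra's algorithm; returns the path as a list of links, or None."""
    dist, prev, done = {src: 0.0}, {}, set()
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v in adj[u]:
            nd = d + weight[tuple(sorted((u, v)))]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path, node = [], dst
    while node != src:
        path.append(tuple(sorted((prev[node], node))))
        node = prev[node]
    return path[::-1]

def lff_schedule(adj, files, dst, k=5, seed=0):
    """files: list of (file_id, source_node, predicted_transfer_seconds)."""
    rng = random.Random(seed)
    links = sorted({tuple(sorted((u, v))) for u in adj for v in adj[u]})
    free_at = {link: 0.0 for link in links}  # when each link's lambda is next free
    schedule = {}
    # Longest file first: the largest predicted transfer time gets priority.
    for file_id, src, duration in sorted(files, key=lambda f: f[2], reverse=True):
        best = None
        for _ in range(k):
            # Random link weights make Dijkstra return varied candidate paths.
            weight = {link: rng.random() for link in links}
            path = shortest_path(adj, weight, src, dst)
            if path is None:
                continue
            start = max((free_at[link] for link in path), default=0.0)
            if best is None or start + duration < best[2]:
                best = (path, start, start + duration)
        if best is None:
            raise ValueError(f"no path from {src} to {dst}")
        for link in best[0]:
            free_at[link] = best[2]  # reserve the circuit end-to-end
        schedule[file_id] = best
    return schedule

# Hypothetical triangle topology with destination "d".
adj = {"a": ["b", "d"], "b": ["a", "d"], "d": ["a", "b"]}
files = [("f1", "a", 6.0), ("f2", "a", 4.0), ("f3", "b", 3.0)]
schedule = lff_schedule(adj, files, dst="d")
finish = max(end for _, _, end in schedule.values())
print(schedule, finish)
```

Because the K candidate paths are drawn at random, a larger `k` raises the chance of finding an idle alternative route for a file whose direct link is already reserved.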
18On-line Modification of Schedule
- When actual file transfer occurs
- CASE I - Early Finish
- Actual transfer time < Scheduled transfer time
- Lambda is unused for some duration of time.
- CASE II - Incomplete File Transfer.
- Actual transfer time > Scheduled transfer time
- Retransmit file (Full / Remaining).
- Determine new circuit and transfer schedule.
19Early Finish
[Figure: timeline illustrating an early finish; surviving labels: X, 0, 2.5, 6 s.]
20On-line Modify Schedule When Early Finish
- Algorithm Modify Schedule Early Finish
(Circuit, Actual Finish Time, Scheduled Finish Time)
- // All three parameters above refer to the file
transferred early.
- 1) Consider a particular virtual link on the
Circuit.
- 2) If a file transfer is scheduled to begin at
Scheduled Finish Time on this link:
- a) Identify the other virtual links for this file.
- b) Check if this file may be scheduled to start
its transfer between Actual Finish Time and
Scheduled Finish Time on these links. If yes,
modify the offline schedule to start the file transfer at
this time.
- 3) Repeat the above steps for all virtual links in
the Circuit.
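A minimal Python sketch of the early-finish procedure is given below. The schedule representation (a dictionary of reservations keyed by file id), the function signature, and the exact-equality test on start times are assumptions of this example; the circuit and scheduled finish time are looked up from the finished file's reservation rather than passed separately.

```python
def modify_schedule_early_finish(schedule, finished_id, actual_finish):
    """Pull forward transfers that were waiting for the finished file's circuit.

    schedule: {file_id: {"path": [links], "start": s, "end": e}} (assumed layout).
    Returns the ids of the transfers whose start time was moved.
    """
    done = schedule[finished_id]
    scheduled_finish = done["end"]
    done["end"] = actual_finish          # the lambda is released early
    moved = []
    for link in done["path"]:            # steps 1 and 3: every link of the circuit
        for fid, res in schedule.items():
            if fid == finished_id or link not in res["path"]:
                continue
            if res["start"] != scheduled_finish:   # step 2: was it waiting?
                continue
            # Steps 2a and 2b: all links of this file's circuit must be free.
            earliest = actual_finish
            for other in res["path"]:
                for gid, g in schedule.items():
                    if (gid != fid and other in g["path"]
                            and g["end"] <= res["start"]):
                        earliest = max(earliest, g["end"])
            if earliest < scheduled_finish:
                length = res["end"] - res["start"]
                res["start"], res["end"] = earliest, earliest + length
                moved.append(fid)
    return moved

# Hypothetical schedule: B waits for A's circuit, and C holds one of B's links until t = 4.
schedule = {
    "A": {"path": [("n1", "n2"), ("n2", "d")], "start": 0.0, "end": 6.0},
    "B": {"path": [("n3", "n2"), ("n2", "d")], "start": 6.0, "end": 9.0},
    "C": {"path": [("n3", "n2"), ("n2", "x")], "start": 0.0, "end": 4.0},
}
moved = modify_schedule_early_finish(schedule, "A", actual_finish=2.5)
print(moved, schedule["B"])
```

In this example B cannot start at 2.5 s, when A actually finishes, because one of its other links is busy until 4 s; it is moved to the earliest time at which its whole circuit is free.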
21On-line Modify Schedule When Incomplete File
Transfer
- Algorithm Incomplete File Transfer (File Number)
- 1) Along the lines of the LFF heuristic, choose K
different paths along which this file may
be transmitted.
- 2) The predicted transfer time is chosen as the
highest transfer time of past transfer
profiles. The idea here is to avoid another
incomplete transfer.
- 3) Determine the path along which the file may
be scheduled at the earliest. Add this to
the offline schedule.
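The three steps can be sketched in Python as below. The sketch assumes the K candidate paths have already been generated (for instance with random link weights and Dijkstra's algorithm, as in LFF) and that each link has a single "free from" time; `reschedule_incomplete`, `free_at`, and the sample values are hypothetical.

```python
def reschedule_incomplete(candidate_paths, free_at, past_times, now):
    """Re-reserve a circuit for a file whose transfer did not complete.

    candidate_paths: K paths (lists of links) from step 1, supplied by the caller.
    free_at: {link: time at which the link's lambda is next free}.
    past_times: transfer times of past profiles for this file size, seconds.
    Returns (path, start, end) and records the reservation in free_at.
    """
    duration = max(past_times)  # step 2: worst past time, to avoid a repeat
    best = None
    for path in candidate_paths:
        start = max([now] + [free_at[link] for link in path])
        if best is None or start < best[1]:  # step 3: earliest possible start
            best = (path, start, start + duration)
    for link in best[0]:
        free_at[link] = best[2]  # add the new reservation to the schedule
    return best

# Hypothetical state at the time the incomplete transfer is detected.
paths = [[("a", "d")], [("a", "b"), ("b", "d")]]
free_at = {("a", "d"): 12.0, ("a", "b"): 8.0, ("b", "d"): 9.5}
choice = reschedule_incomplete(paths, free_at, past_times=[7.0, 10.0, 7.5], now=8.0)
print(choice)
```

Here the two-hop path is chosen because it becomes free at 9.5 s, earlier than the direct link, and the circuit is held for the longest past transfer time of 10 s.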
22Results Topology
[Figure: National LambdaRail and DOE UltraScience Net topologies.]
23Simulation Parameters
- File size
- Uniform random distribution between 400 and 800 MB.
- File location
- Randomly distributed across DoE centers.
- File Transfer Profiles
- Consider past 200 file transfers.
- Profiles generated at file size intervals of 50
MB.
24Results - I
25Results - II
26Future Research
[Figure: end-to-end data path of a file transfer.]
- Sender End-System
- - Hard disk
- - Data transfer from hard disk to memory; e.g., the bandwidth could be 300 Mbps
- - Physical and virtual memory hierarchy
- - Writing data into the NIC buffer
- - Network Interface Card (NIC) Buffer
- Data transfer over a dedicated circuit-switched
connection; e.g., the bandwidth could be OC-192 (10
Gbps)
- Receiver End-System
- - Network Interface Card (NIC) Buffer
- - Data transferred from the NIC buffer to memory,
scheduled by the OS
- - Physical and virtual memory hierarchy
- - Writing data from memory to hard disk, scheduled
by the OS; e.g., the bandwidth could be 300 Mbps
- - Hard disk
- Rate matching to determine peak performance rate.
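As a rough illustration of rate matching, the Python sketch below treats the end-to-end path as a chain of stages and takes the slowest stage as the bound on the achievable rate. The 300 Mbps and 10 Gbps figures are the examples quoted on the slide; the stage names, the helper functions, and the convention 1 MB = 8 megabits are assumptions of this example.

```python
def bottleneck(stages_mbps):
    """Return (stage name, rate) of the slowest stage in the data path."""
    return min(stages_mbps.items(), key=lambda kv: kv[1])

def transfer_seconds(size_mb, rate_mbps):
    """Idealized transfer time, assuming 1 MB = 8 megabits and a steady rate."""
    return size_mb * 8 / rate_mbps

# Example rates taken from the slide; real values must be measured.
stages_mbps = {
    "sender disk -> memory": 300,
    "dedicated circuit (OC-192)": 10_000,
    "receiver memory -> disk": 300,
}

name, rate = bottleneck(stages_mbps)
print(name, rate, transfer_seconds(800, rate))
```

With these example numbers the disks, not the 10 Gbps circuit, limit the transfer, which is why end-system measurements matter when choosing how long to hold a lambda.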
27Tools
- Kernel level event tracing
- MAGNET
- M. Gardner, W. Feng, M. Broxton, A. Engelhart, and G. Hurwitz,
"MAGNET: A Tool for Debugging, Analysis and
Adaptation in Computing Systems," CCGrid 2003,
Tokyo, Japan, May 2003.
- Memory and I/O performance
- STREAM: memory bandwidth measurement
- BONNIE: disk I/O measurement
- Network performance
- pathLoad, pathChirp
- Iperf / netperf
28Conclusion
- Given the problem of
- Aggregating data in minimum finish time.
- Computing lambda reservation schedule.
- Achieving efficient utilization of resources.
- We observe that
- Prediction of file transfer time is important for
determining the circuit holding time.
- It is important to avoid incomplete file transfers.
- A combination of on-line and off-line scheduling
helps achieve
- Better finish time.
- Better utilization of lambdas.
29Thank you !!!
- Please contact us at
- Amitabha Banerjee abanerjee_at_ucdavis.edu
- Wu-chun Feng feng_at_lanl.gov
- Biswanath Mukherjee mukherje_at_cs.ucdavis.edu
- Dipak Ghosal ghosal_at_cs.ucdavis.edu