Lightweight Replication of Heavyweight Data

Transcript and Presenter's Notes

1
Lightweight Replication ofHeavyweight Data
  • Scott Koranda
  • University of Wisconsin-Milwaukee
  • National Center for Supercomputing Applications

2
Heavyweight Data from LIGO
  • Sites at Livingston, LA (LLO) and Hanford, WA
    (LHO)
  • 2 interferometers at LHO, 1 at LLO
  • 1000s of channels recorded at rates of 16 kHz,
    16 Hz, 1 Hz, ...
  • Output is binary frame files, each holding 16
    seconds of data with a GPS timestamp
  • ~100 MB per frame from LHO
  • ~50 MB per frame from LLO
  • ~1 TB/day in total (see the arithmetic sketch
    after this list)
  • S1 run: 2 weeks
  • S2 run: 8 weeks
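The arithmetic behind the ~1 TB/day figure, as a quick Python sketch. It assumes the 100 MB and 50 MB figures above are per 16-second frame file, which is an interpretation rather than something the slide states explicitly.

```python
# Back-of-envelope check of the raw data volume, assuming the quoted
# sizes are per 16-second frame file (an interpretation, not stated
# explicitly on the slide).
SECONDS_PER_DAY = 24 * 60 * 60
FRAME_LENGTH_S = 16                                  # each frame file holds 16 s of data
frames_per_day = SECONDS_PER_DAY / FRAME_LENGTH_S    # 5400 frames per day

lho_mb_per_frame = 100                               # Hanford (two interferometers)
llo_mb_per_frame = 50                                # Livingston (one interferometer)

total_gb = (lho_mb_per_frame + llo_mb_per_frame) * frames_per_day / 1024
print(f"raw output: ~{total_gb:.0f} GB/day")         # ~790 GB/day, i.e. roughly 1 TB/day
```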

4 km LIGO interferometer at Livingston, LA
3
Networking to IFOs Limited
  • LIGO IFOs remote, making bandwidth expensive
  • Couple of T1 lines for email/administration only
  • Ship tapes to Caltech (SAM-QFS)
  • Reduced data sets (RDS) generated and stored on
    disk
  • ~20% the size of the raw data
  • 200 GB/day

GridFedEx protocol
4
Replication to University Sites
Cardiff
MIT
AEI
UWM
PSU
CIT
UTB
5
Why Bulk Replication to University Sites?
  • Each has compute resources (Linux clusters)
  • Early plan was to provide one or two analysis
    centers
  • Now everyone has a cluster
  • Storage is cheap
  • ~$1/GB for drives
  • 1 TB of RAID-5 < $10K
  • Throw more drives into your cluster
  • Analysis applications read a lot of data
  • Different ways to slice some problems, but most
    want access to large sets of data for a
    particular instance of search parameters

6
LIGO Data Replication Challenge
  • Replicate 200 GB/day of data to multiple sites
    securely, efficiently, robustly (no babysitting)
  • Support a number of storage models at sites (see
    the configuration sketch after this list)
  • CIT → SAM-QFS (tape) and large IDE farms
  • UWM → 600 partitions on 300 cluster nodes
  • PSU → multiple 1 TB RAID-5 servers
  • AEI → 150 partitions on 150 nodes with redundancy
  • Coherent mechanism for data discovery by users
    and their codes
  • Know what data we have, where it is, and
    replicate it fast and easy
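To make the variety of storage models concrete, here is a purely illustrative Python sketch of a per-site storage description plus a partition-picking helper. The dictionary layout and the helper are hypothetical, not LDR's actual configuration format or code.

```python
import hashlib

# Illustrative only: a hypothetical description of the per-site storage
# models listed above. LDR's real configuration format is not shown here.
SITE_STORAGE = {
    "CIT": {"backend": "SAM-QFS tape + large IDE farms", "partitions": None},
    "UWM": {"backend": "cluster-node disks", "partitions": 600},
    "PSU": {"backend": "multiple 1 TB RAID-5 servers", "partitions": None},
    "AEI": {"backend": "cluster-node disks, redundant", "partitions": 150},
}

def pick_partition(site: str, lfn: str) -> str:
    """Hypothetical helper: spread logical file names evenly across a
    site's partitions; single-volume sites get one fixed store."""
    npart = SITE_STORAGE[site]["partitions"]
    if not npart:
        return "store-0"
    digest = int(hashlib.md5(lfn.encode()).hexdigest(), 16)
    return f"partition-{digest % npart:04d}"

print(pick_partition("UWM", "H-RDS-730000000-16.gwf"))  # hypothetical file name
```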

7
Prototyping Realizations
  • Need to keep pipe full to achieve desired
    transfer rates
  • Mindful of overhead of setting up connections
  • Set up a GridFTP connection with multiple parallel
    channels and tuned TCP windows and I/O buffers,
    then leave it open (see the transfer sketch after
    this list)
  • Sustained 10 MB/s between Caltech and UWM, peaks
    up to 21 MB/s
  • Need cataloging that scales and performs
  • Globus Replica Catalog (LDAP) scales to < 10^5
    entries and is not acceptable
  • Need a solution with a relational database backend
    that scales to 10^7 entries with fast updates/reads
  • No need for reliable file transfer (RFT)
  • Problem with any single transfer? Forget it, come
    back later
  • Need robust mechanism for selecting collections
    of files
  • Users/sites demand flexibility choosing what data
    to replicate
  • Need to get network people interested
  • Do your homework, then challenge them to make
    your data flow faster
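A minimal sketch of the kind of tuned transfer described above, driving globus-url-copy from Python with parallel streams and enlarged buffers. The endpoints, stream count, and buffer sizes are placeholders, not the values actually used between Caltech and UWM; keeping a connection open across many files needs a GridFTP client library (the slides note LIGO uses its own client) rather than one command per file.

```python
import subprocess

# Sketch of a tuned GridFTP transfer. The endpoints below are made-up
# placeholders, and the stream count and buffer sizes are illustrative,
# not the values the LIGO sites actually used.
SRC = "gsiftp://data.example-cit.edu/rds/H-RDS-730000000-16.gwf"
DST = "gsiftp://storage.example-uwm.edu/rds/H-RDS-730000000-16.gwf"

cmd = [
    "globus-url-copy",
    "-p", "8",             # parallel TCP streams to keep the pipe full
    "-tcp-bs", "2097152",  # 2 MB TCP buffer (window) per stream
    "-bs", "1048576",      # 1 MB I/O block size
    SRC,
    DST,
]
subprocess.run(cmd, check=True)
```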

8
LIGO, err Lightweight Data Replicator (LDR)
  • What data we have
  • Globus Metadata Catalog Service (MCS)
  • Where data is
  • Globus Replica Location Service (RLS)
  • Replicate it fast
  • Globus GridFTP protocol
  • What client to use? Right now we use our own
  • Replicate it easy
  • Logic we added (outlined in the sketch after this
    list)
  • Is there a better solution?
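A hedged outline of the "replicate it easy" glue implied by the components above: ask the metadata catalog which logical files match a site's interests, ask RLS which already have local replicas, fetch the rest over GridFTP, and register the new copies. The helper functions are stubbed stand-ins so the sketch runs, not the real MCS, RLS, or pyGlobus APIs.

```python
# The helpers below are stand-ins for the real MCS, RLS, and GridFTP
# client calls (not actual Globus API names), stubbed so the sketch runs.

def query_mcs(interest):                 # stand-in for a Metadata Catalog Service query
    return ["H-RDS-730000000-16.gwf"]

def rls_lookup(lfn, site):               # stand-in for a Replica Location Service lookup
    return [], ["gsiftp://data.example-cit.edu/rds/" + lfn]

def choose_local_storage(lfn):           # pick a partition/RAID volume at this site
    return "file:///data/partition-0042/" + lfn

def gridftp_copy(src, dst):              # stand-in for the GridFTP transfer client
    print(f"copy {src} -> {dst}")
    return True

def rls_register(lfn, pfn, site):        # advertise the new local replica in RLS
    print(f"register {lfn} at {site}: {pfn}")

def replicate_once(interest, site):
    """One pass of the subscribe/replicate cycle."""
    for lfn in query_mcs(interest):
        local, remote = rls_lookup(lfn, site)
        if local or not remote:
            continue                     # already here, or nowhere to fetch from yet
        dst = choose_local_storage(lfn)
        if gridftp_copy(remote[0], dst):
            rls_register(lfn, dst, site)
        # failed transfer? no retry bookkeeping; the next pass tries again

replicate_once("instrument = H AND type = RDS", "UWM")
```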

9
Lightweight Data Replicator
  • Replicated 20 TB to UWM thus far
  • Just deployed at MIT, PSU, AEI
  • Deployment in progress at Cardiff
  • LDRdataFindServer running at UWM

10
Lightweight Data Replicator
  • Lightweight because we think it is the minimal
    collection of code needed to get the job done
  • Logic coded in Python
  • Use SWIG to wrap Globus RLS
  • Use pyGlobus from LBL elsewhere
  • Each site is any combination of publisher,
    provider, subscriber
  • Publisher populates metadata catalog
  • Provider populates location catalog (RLS)
  • Subscriber replicates data using information
    provided by publishers and providers
  • Take the Condor approach with small, independent
    daemons that each do one thing (sketched after
    this list)
  • LDRMaster, LDRMetadata, LDRSchedule, LDRTransfer, ...
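A minimal sketch of the small-daemon pattern, assuming a simple shared work queue between a scheduling loop and a transfer loop. The daemon names come from the slide, but the loop bodies, intervals, and in-memory queue are illustrative assumptions, not LDR's implementation.

```python
import queue
import threading
import time

# Illustrative only: two of the small, independent daemons as simple
# loops around a shared work queue. The intervals and the in-memory
# queue are assumptions for the sketch, not LDR's actual mechanism.
work: "queue.Queue[str]" = queue.Queue()

def ldr_schedule(stop: threading.Event):
    """Stand-in for LDRSchedule: decide what the site still needs and queue it."""
    while not stop.is_set():
        work.put("H-RDS-730000000-16.gwf")   # would come from comparing MCS and RLS
        stop.wait(60)                        # wake up once a minute

def ldr_transfer(stop: threading.Event):
    """Stand-in for LDRTransfer: drain the queue, one GridFTP transfer at a time."""
    while not stop.is_set():
        try:
            lfn = work.get(timeout=5)
        except queue.Empty:
            continue
        print(f"transferring {lfn}")         # the real daemon calls the GridFTP client

stop = threading.Event()
for daemon in (ldr_schedule, ldr_transfer):
    threading.Thread(target=daemon, args=(stop,), daemon=True).start()
time.sleep(2)
stop.set()
```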

11
Future?
  • LDR is a tool that works now for LIGO
  • Still, we recognize a number of projects need
    bulk data replication
  • There has to be common ground
  • What middleware can be developed and shared?
  • We are looking for opportunities
  • Code for, or solve, our problems for us
  • Want to investigate Stork, DiskRouter, ...
  • Do contact me if you do bulk data replication