Title: SRM-Lite: overcoming the firewall barrier for data movement
1 SRM-Lite: overcoming the firewall barrier for data movement
- Arie Shoshani
- Alex Sim
- Viji Natarajan
- Lawrence Berkeley National Laboratory
SDM Center All-Hands Meeting, November 2007
2 Outline
- What are Storage Resource Managers (SRMs)
- Requirements for using SRM behind firewalls
- Satisfying the Requirements
- Architecture
- Potential uses
3 Storage Resource Managers
- SRMs are middleware components whose function is to provide
- dynamic space allocation AND file management in spaces
- for storage components on the local or wide-area network
- Based on a common standard
[Figure: client/user applications on top of SRM (BeStMan); GPFS shown among the examples of storage systems currently supported by SRMs]
4 Storage Resource Managers: Main concepts
- Non-interference with local policies
- Advance space reservations
- Dynamic space management
- Pinning files in spaces
- Support abstract concept of a file name: Site URL (SURL)
- Temporary assignment of file names for transfer: Transfer URL (TURL)
- Directory management and ACLs
- Multi-file requests (srmRequestToPut, srmRequestToGet, srmCopy)
- Transfer protocol negotiation
- Peer-to-peer request support
- Support for asynchronous multi-file requests
- Support abort, suspend, and resume operations
- SRM relies on other services for data movement (GridFTP, HTTPS, SCP, ...)
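A minimal sketch of what transfer protocol negotiation could look like: the client sends an ordered preference list and the first entry the server also supports is chosen. The function name `negotiate_protocol` and the list-based representation are assumptions made for illustration; they are not part of the SRM interface.

```python
def negotiate_protocol(client_preferences, server_supported):
    """Return the first protocol in the client's preference order that the
    server also supports, or None if there is no common protocol.

    Hypothetical helper: names and data shapes are illustrative only.
    """
    supported = {p.lower() for p in server_supported}
    for protocol in client_preferences:
        if protocol.lower() in supported:
            return protocol.lower()
    return None


if __name__ == "__main__":
    # The client prefers gsiftp, but this server only offers scp and https.
    print(negotiate_protocol(["gsiftp", "https", "scp"], ["scp", "https"]))
```

Keeping the client's order as the tie-breaker matches the idea that the protocol is chosen based on the request's protocol preference.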
5 Concepts: Site URL and Transfer URL
- Provide Site URL (SURL)
- URL known externally, e.g. in Replica Catalogs
- e.g. srm://ibm.cnaf.infn.it:8444/dteam/test.10193
- Get back Transfer URL (TURL)
- Path can be different from SURL: SRM internal mapping
- Protocol chosen by SRM based on request protocol preference
- e.g. gsiftp://ibm139.cnaf.infn.it:2811//gpfs/dteam/test.10193
- One SURL can have many TURLs
- Files can be replicated in multiple storage components
- Files may be in near-line and/or on-line storage
- In light-weight SRM (a single file system on disk)
- SURL can be the same as TURL except protocol
- File sharing is possible
- Same physical file, but many requests
- Needs to be managed by SRM
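For the light-weight case, where the TURL matches the SURL except for the protocol, the mapping can be sketched as a simple URL rewrite. The function `surl_to_turl` and its `protocol` and `port` parameters are hypothetical; a full SRM may also map to a different host or path, as in the example above.

```python
from urllib.parse import urlsplit


def surl_to_turl(surl, protocol="gsiftp", port=2811):
    """Rewrite an srm:// Site URL into a Transfer URL with the same host and
    path but a different protocol (and, by assumption, a different port).

    Hypothetical helper for the light-weight SRM case only.
    """
    parts = urlsplit(surl)
    if parts.scheme != "srm":
        raise ValueError(f"not a SURL: {surl!r}")
    netloc = f"{parts.hostname}:{port}" if port else parts.hostname
    return f"{protocol}://{netloc}{parts.path}"


if __name__ == "__main__":
    print(surl_to_turl("srm://ibm.cnaf.infn.it:8444/dteam/test.10193"))
```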
6 Earth System Grid Analysis Environment (in production for 4 years)
- >5000 users
- 160 TB managed
[Figure: ESG deployment across LBNL, ANL, NCAR, LLNL, ORNL, and ISI. Components shown: disk, HPSS (High Performance Storage System), MSS (Mass Storage System), HRM and DRM (Storage Resource Management), gridFTP server, gridFTP striped server, openDAPg server, Tomcat servlet engine, MyProxy server and client, CAS (Community Authorization Services) client, MCS (Metadata Cataloguing Services) client, RLS (Replica Location Services) client, GRAM gatekeeper. Links labeled gridFTP, SOAP, and RMI.]
SRMs are used and inter-communicate at several sites
7 Robust Data Movement provided by SRMs and DataMover
- Problem: move thousands of files robustly
- Takes many hours
- Need error recovery
- Mass storage system failures
- Network failures
- Solution: use Storage Resource Managers (SRMs)
- File streaming paradigm
- By reserving and releasing storage space automatically
- Problem: too slow
- Solution:
- In GridFTP
- Use parallel streams
- Use large FTP windows
- Pre-stage files from MSS
- Use concurrent transfers
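The error-recovery requirement above can be illustrated with a retry loop around a single file transfer. This is only a sketch: `transfer_with_retry`, the use of `OSError` to signal a transient failure, and the exponential backoff policy are assumptions, not a description of how SRMs or DataMover recover.

```python
import time


def transfer_with_retry(transfer, name, max_attempts=3, base_delay=1.0):
    """Call transfer(name) until it succeeds or max_attempts is reached.

    Returns the number of attempts used. A transient failure is assumed to
    surface as OSError; the wait doubles after each failed attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            transfer(name)
            return attempt
        except OSError:
            if attempt == max_attempts:
                raise  # give up and let the caller record the failure
            time.sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    seen = {}

    def flaky(name):
        # Simulated transfer that fails once per file, then succeeds.
        seen[name] = seen.get(name, 0) + 1
        if seen[name] == 1:
            raise OSError("simulated network failure")

    print(transfer_with_retry(flaky, "file_foo", base_delay=0.0))
```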
[Figure: Example setup for Earth System Grid (ESG), with sites NCAR and LBNL. Labels: DataMover (runs anywhere); get list of files; SRM-COPY (thousands of files); SRM-GET (one file at a time); SRM (performs reads), stage files from MSS; SRM (performs writes), archive files; GridFTP GET (pull mode); network transfer.]
8 File tracking shows recovery from transient failures
[Figure: file-tracking plot. Total: 45 GB]
9 Requirements for SRM-Lite
- Run SRM behind a firewall
- Cannot have third-party transfers (source/target is local)
- May not be able to run GridFTP
- Remote site may not support it
- Some communities choose not to use GSI
- Need support for multi-file transfer
- Or entire directory
- Need support for asynchronous requests
- Also support for intermediate status of request
- Need to support concurrent file transfers
10 Satisfying the Requirements: SRM-Lite
- Run SRM behind a firewall
- Must have a client tool (SRM-Lite)
- May not be able to run GridFTP
- Support high-performance SCP: use HPN-SSH from Pittsburgh Supercomputing Center
- But also use other transfer protocols (GridFTP, bbcp, https, ...)
- Need support for multi-file transfer
- Manage queues for large requests
- Need support for asynchronous requests
- SRM-Lite returns a request token; the token can be used for request status
- Need to support concurrent file transfers
- Use multi-threading to manage concurrent transfers
- Monitor transfers and recover from mid-transfer interruptions
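The combination of a request token, intermediate status, and multi-threaded transfers can be sketched as follows. The class `RequestManager`, its method names, and the state strings are hypothetical and do not reflect SRM-Lite's actual interface; the `transfer` callable stands in for whichever protocol (SCP, GridFTP, ...) is used.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor


class RequestManager:
    """Hypothetical sketch of token-based asynchronous multi-file requests."""

    def __init__(self, transfer, workers=4):
        self._transfer = transfer              # callable(name) that moves one file
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        self._requests = {}                    # token -> {file name: state}

    def submit(self, names):
        """Queue all files and return a token immediately."""
        names = list(names)
        token = uuid.uuid4().hex
        with self._lock:
            self._requests[token] = {n: "queued" for n in names}
        for n in names:
            self._pool.submit(self._run, token, n)
        return token

    def _set(self, token, name, state):
        with self._lock:
            self._requests[token][name] = state

    def _run(self, token, name):
        self._set(token, name, "running")
        try:
            self._transfer(name)
        except Exception:
            self._set(token, name, "failed")
        else:
            self._set(token, name, "done")

    def status(self, token):
        """Return a snapshot of per-file states for the request."""
        with self._lock:
            return dict(self._requests[token])

    def wait(self):
        """Block until all queued transfers have finished."""
        self._pool.shutdown(wait=True)
```

A caller would keep the token returned by `submit` and poll `status(token)` while the worker threads drain the queue.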
11 Scenario A: firewall at one site
- Process steps
- Login to ORNL using OTP
- At ORNL, invoke SRM-Lite
- User composes XML input file, srmlite.xml, for selected files/directories to copy from/to another site
- Or, user gives command line option for a selected file/directory
- SRM-Lite uses srmlite.xml or command line input to automatically
- Push/pull files to/from NERSC
- Use multiple threads for concurrent transfers
[Figure: Scenario A. Labels: OTP login; ORNL; NERSC; SRM-Lite; srmlite.xml; SSH channel (SCP); SSH server; local commands and protocols; GridFTP/FTP/BBCP/HTTP transfers; disk cache at each site.]
Put example: Source file:////my_directory/file_foo Target scp://host/target_dir/file_foo
Get example: Source GridFTP://host/target_dir/file_foo Target file:////my_directory/file_foo
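The Put and Get examples differ only in which end is the local `file:` URL, and the requirements rule out third-party transfers, so exactly one end must be local. A small sketch of that check follows; `transfer_direction` is a hypothetical helper, not an SRM-Lite function.

```python
from urllib.parse import urlsplit


def transfer_direction(source, target):
    """Classify a source/target pair as 'put' (push a local file out) or
    'get' (pull a remote file in), based on which end uses the file: scheme.

    Hypothetical helper; rejects pairs where neither or both ends are local.
    """
    source_is_local = urlsplit(source).scheme == "file"
    target_is_local = urlsplit(target).scheme == "file"
    if source_is_local and not target_is_local:
        return "put"
    if target_is_local and not source_is_local:
        return "get"
    raise ValueError("exactly one of source/target must be a local file: URL")


if __name__ == "__main__":
    print(transfer_direction("file:////my_directory/file_foo",
                             "scp://host/target_dir/file_foo"))
```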
12 Scenario B: one end has a firewall, the other end has SRM
[Figure: Scenario B. Labels: OTP login; ORNL; NERSC; SRM-Lite; srmlite.txt; SRM request; SRM; GridFTP/FTP/SCP transfers; two disk caches; HPSS.]
Put example: Source file:////my_directory/file_foo Target srm://host/target_dir/file_foo
13 Scenario C: firewalls at both ends
- Process steps
- Login to site1 using OTP
- At site1, invoke SRM-Lite
- SRM-Lite at site1 uses SSH to invoke SRM-Lite at site2
- Use SSH channel for SCP
- Same as before
- User composes XML input file, srmlite.xml, for selected files/directories to copy from/to another site
- Or, user gives command line option for a selected file/directory
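The step where one site invokes a program at the other site over SSH can be sketched by building an `ssh user@host command` argument list. The helper `build_ssh_command` is hypothetical, and the remote command is passed in by the caller because the actual SRM-Lite command line is not given here; `run-srmlite` below is a placeholder name only.

```python
import shlex


def build_ssh_command(user, host, remote_argv):
    """Build an argv list that runs remote_argv on host via ssh.

    The remote arguments are shell-quoted and joined, since ssh hands the
    command string to the remote shell.
    """
    return ["ssh", f"{user}@{host}", shlex.join(remote_argv)]


if __name__ == "__main__":
    # 'run-srmlite' is a placeholder, not the real SRM-Lite executable name.
    argv = build_ssh_command("alice", "site2.example.org",
                             ["run-srmlite", "srmlite.xml"])
    print(argv)
    # The list could then be passed to subprocess.run(argv, check=True).
```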
[Figure: Scenario C. Labels: OTP login; site1; site2; SSH channel (SCP); SSH server; srmlite.xml; two disk caches.]
14 Scenario C: SRM-Lite manages MSS access
[Figure: Scenario C with mass storage. Labels: OTP login; site1; site2; SSH channel (SCP); SSH server; srmlite.xml; two disk caches; HPSS (shown twice).]
15 GUI for SRM-Lite
- Used in ESG
- Special version for data movement to user workstations
- Called DataMover-Lite
- Versions exist for Linux, PC, Mac
16 Usage
- Combustion project
- The Applied Partial Differential Equations Center (APDEC)
- John Bell
- Efficient, robust data movement from sites behind firewalls
- At DoE and DoD sites
- Kepler-SRM-Lite actor
- To be used for managing multi-file transfers from sites behind firewalls
- Launch SRM-Lite remotely through SSH
- Initial version: help from Pierre Mouallem (NCSU)
- Two modes
- Entire request
- Streaming file requests
- To be used in CPES workflows first, with Norbert's help