BaBar Data Distribution using the Storage Resource Broker

BaBar Data Distribution using the Storage
Resource Broker
  • Adil Hasan, Wilko Kroeger (SLAC Computing
    Services),
  • Dominique Boutigny (LAPP),
  • Cristina Bulfon (INFN, Rome),
  • Jean-Yves Nief (ccin2p3),
  • Liliana Martin (Paris VI et VII),
  • Andreas Petzold (TUD),
  • Jim Cochran (ISU)
  • (on behalf of the BaBar Computing Group)
  • IX International Workshop on Advanced Computing
    and Analysis Techniques in Physics Research
  • KEK, Japan
  • 1-5 December 2003

BaBar: the parameters (computing-wise)
  • 80 institutions in Europe and North America.
  • 5 Tier A computing centers:
  • SLAC (USA), ccin2p3 (France), RAL (UK), GridKA
    (Germany), Padova (Italy).
  • Processing of data done in Padova.
  • Bulk of simulation production done by remote
    institutions.
  • BaBar computing is highly distributed.
  • Reliable data distribution essential to BaBar.

The SRB
  • The Storage Resource Broker (SRB) is developed by
    the San Diego Supercomputer Center (SDSC).
  • A client-server middleware for connecting
    heterogeneous data resources.
  • Provides a uniform method to access the
    resources.
  • Provides a relational database backend to record
    file metadata (metadata catalog called MCAT) and
    for access control lists (ACLs).
  • Can use Grid Security Infrastructure (GSI)
    authentication.
  • Also provides audit information.

The SRB
  • SRB v3:
  • Define an SRB zone comprising one MCAT and one
    or more SRB servers.
  • Provides applications to federate zones
    (synchronize MCATs, create users, data belonging
    to different zones).
  • Within one federation all SRB servers need to run
    on the same port.
  • Allows an SRB server at one site to belong to
    more than one zone.

The SRB in BaBar
  • The SRB feature-set makes it a useful tool for
    data distribution.
  • Particle Physics Data Grid (PPDG) effort
    initiated interest in SRB.
  • PPDG and BaBar collaboration effort has gone into
    testing and deploying the SRB in BaBar.

The SRB in BaBar
  • The BaBar system has 2 MCATs: one at SLAC and one
    at ccin2p3.
  • Use SRB v3 to create and federate the two zones,
    the SLAC zone and the ccin2p3 zone.
  • Advantage: a client can connect to either SLAC or
    ccin2p3 and see files at the other site.

The SRB in BaBar
[Diagram: SLAC zone and ccin2p3 zone. Legend: MCAT-enabled SRB server, SRB server, SRB clients. Arrows: data copied from SLAC; data copied from ccin2p3; data replicated from/copied to SLAC; data replicated from/copied to ccin2p3.]

Data Distribution using SRB
  • BaBar data distribution with SRB consists of the
    following steps:
  • Publish files available for distribution in MCAT
    (publication I).
  • Locate files to distribute (location).
  • Distribute files (distribution).
  • Publish distributed files (publication II).
  • Each of these steps requires the user to belong
    to some ACL (authorization).
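
The four steps can be pictured as a small pipeline in which every operation first checks that the caller belongs to the relevant ACL. The following is only a minimal sketch: an in-memory dictionary stands in for the MCAT, and the names `Catalog`, `publish`, `locate` and `distribute` are hypothetical illustrations, not part of the SRB client API.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical in-memory stand-in for an MCAT."""
    acl: set = field(default_factory=set)       # users allowed to act
    files: dict = field(default_factory=dict)   # file name -> set of sites

    def _authorize(self, user):
        if user not in self.acl:
            raise PermissionError(f"{user} is not in the ACL")

    def publish(self, user, name, site):
        """Publication I and II: record that `site` holds `name`."""
        self._authorize(user)
        self.files.setdefault(name, set()).add(site)

    def locate(self, user, site):
        """Location: names of the files held at `site`."""
        self._authorize(user)
        return sorted(n for n, sites in self.files.items() if site in sites)


def distribute(catalog, user, source, target):
    """Distribution, followed by publication II, for every file at `source`."""
    copied = catalog.locate(user, source)
    for name in copied:
        # A real transfer would happen here; the sketch only records it.
        catalog.publish(user, name, target)
    return copied
```

A user outside the ACL gets a `PermissionError` from any of the steps, which mirrors the requirement that each step is authorized separately.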

Authorization
  • BaBarGrid currently uses the European Data Grid
    Virtual Organization (VO).
  • Consists of a Lightweight Directory Access
    Protocol (LDAP) database holding certificate
    Distinguished Name (DN) strings for all BaBar
    members.
  • Used to update Globus grid-mapfiles.
  • SRB authentication is akin to the grid-mapfile:
  • Maps SRB username to DN string.
  • SRB username doesn't have to map to UNIX
    username.
  • Developing application to obtain user DN strings
    from VO.
  • App is experiment neutral.
  • Has ability to include information from Virtual
    Organization Management System.
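
A Globus grid-mapfile holds one entry per line: a quoted certificate DN followed by a local username. Since the slide describes the SRB mapping as akin to this, a DN-to-username table can be sketched in the same format. The functions `parse_mapfile` and `render_mapfile` and the example DNs used with them are hypothetical, and the sketch does not query an LDAP server or the VO.

```python
import shlex


def parse_mapfile(text):
    """Return {DN: username} from grid-mapfile-style lines: "DN" username."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) != 2:
            raise ValueError(f"malformed line: {line!r}")
        dn, username = parts
        mapping[dn] = username
    return mapping


def render_mapfile(mapping):
    """Write the mapping back out, one quoted DN per line, sorted by DN."""
    return "".join(f'"{dn}" {user}\n' for dn, user in sorted(mapping.items()))
```

Because the username is just a string in the table, nothing here requires it to match a UNIX account, which is the property the slide points out for SRB usernames.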

Publication I
  • The initial publication step (event store files)
    entails:
  • Publication of files into SRB MCAT once files
    have been produced and published in BaBar
    bookkeeping.
  • Files are grouped into collections based on run
    range, release, production type (SRB collection
    ≠ BaBar collection).
  • Extra metadata information (such as file UUID,
    BaBar collection name) stored in MCAT.
  • SRB object name contains processing spec, etc.
    that uniquely identify the object.
  • 5K event files (or SRB objects) per SRB
    collection.
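
Grouping by run range, release and production type could look roughly like the sketch below. The path layout, the fixed run-range width `RUNS_PER_COLLECTION`, and the function names are assumptions made for illustration; the talk does not give the actual naming scheme.

```python
from collections import defaultdict

RUNS_PER_COLLECTION = 1000  # assumed width of a run range; not from the talk


def collection_name(release, production, run):
    """Build a hypothetical SRB collection path for one file."""
    first = (run // RUNS_PER_COLLECTION) * RUNS_PER_COLLECTION
    last = first + RUNS_PER_COLLECTION - 1
    return f"/{release}/{production}/runs{first}-{last}"


def group_files(records):
    """Map collection name -> sorted file names.

    `records` is an iterable of (file_name, release, production, run).
    """
    collections = defaultdict(list)
    for name, release, production, run in records:
        collections[collection_name(release, production, run)].append(name)
    return {coll: sorted(names) for coll, names in collections.items()}
```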

Publication I
  • Detector conditions files are more complicated as
    files are constantly updated (i.e. not closed).
  • As files are updated in SRB, need to prevent users
    from taking an inconsistent copy.
  • Unfortunately SRB does not currently permit
    locking of collections.

Publication I
  • Have devised a workaround:
  • Register conditions file objects under
    date-specified collection.
  • Register a locator file object containing the
    conditions date-specified collection name.
  • Then, new conditions files registered under a new
    date-specified collection.
  • Locator file contents updated with new
    date-specified collection.
  • This method prevents users from taking an
    inconsistent set of files.
  • Only two sets kept at any one time.
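
The workaround amounts to publishing a complete new set first and repointing the locator last, so a reader who follows the locator never sees a half-updated set. A minimal sketch under stated assumptions: `ConditionsStore` is a hypothetical in-memory stand-in for the catalog, and collection dates are ISO strings so that sorting them gives chronological order.

```python
class ConditionsStore:
    """Hypothetical stand-in for the catalog: collections plus one locator."""

    MAX_SETS = 2  # the talk keeps only two sets at any one time

    def __init__(self):
        self.collections = {}   # ISO date string -> tuple of file names
        self.locator = None     # date string of the current complete set

    def publish(self, date, files):
        """Register a new date-specified collection, then repoint the locator."""
        self.collections[date] = tuple(files)   # 1. register the full set
        self.locator = date                     # 2. update the locator last
        # 3. drop the oldest sets beyond the limit
        for old in sorted(self.collections)[:-self.MAX_SETS]:
            del self.collections[old]

    def fetch(self):
        """Read the locator once, then take files only from that collection."""
        date = self.locator
        return date, self.collections[date]
```

Keeping the previous set around means a client that read the locator just before an update can still finish copying a consistent, if slightly older, set.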

Location and Distribution
  • Location and distribution happen in one client
    application.
  • User supplies BaBar collection name from BaBar
    bookkeeping.
  • SRB searches MCAT for files that have that
    collection name as metadata.
  • Files are then copied to target site.
  • SRB allows simple checksum to be performed.
  • But checksum is not md5 or cksum.
  • Still can be useful.
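
Locating files by collection metadata and then copying them with a checksum comparison can be sketched as follows. The catalog is assumed to be a plain dictionary from file path to metadata, the key `babar_collection` is a hypothetical name, and the byte-sum checksum is only a stand-in: the talk does not say which algorithm SRB uses.

```python
import shutil
from pathlib import Path


def simple_checksum(path):
    """Sum of all bytes modulo 2**16. An assumed stand-in, not SRB's algorithm."""
    total = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            total = (total + sum(chunk)) % 65536
    return total


def locate(catalog, collection):
    """Paths whose metadata names the requested BaBar collection."""
    return [path for path, meta in catalog.items()
            if meta.get("babar_collection") == collection]


def distribute(catalog, collection, target_dir):
    """Copy every matching file to `target_dir` and verify each copy."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in locate(catalog, collection):
        destination = target / Path(source).name
        shutil.copyfile(source, destination)
        if simple_checksum(source) != simple_checksum(destination):
            raise IOError(f"checksum mismatch for {source}")
        copied.append(destination)
    return copied
```

A weak checksum like this one catches truncated or zero-filled copies but not reordered bytes, which is one way to read the slide's remark that a simple checksum is still useful.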

Location and Distribution
  • SRB allows 3rd-party replication.
  • But, most likely we will always run distribution
    command from source or target site.
  • Also have the ability to create a logical
    resource of more than 1 physical resource.
  • Can replicate to all resources with one command.
  • Useful if more than 1 site regularly needs the
    data.
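
The idea of a logical resource can be illustrated with a table that maps one logical name to several physical resources, so that a single call records a replica on each of them. This is a sketch only: `replicate` and the resource names used with it are hypothetical and do not correspond to SRB commands.

```python
def replicate(name, logical, resources, replicas):
    """Record a replica of `name` on every physical resource behind `logical`.

    `resources` maps a logical resource name to its physical resources;
    `replicas` maps a file name to the set of resources holding it.
    """
    held = replicas.setdefault(name, set())
    held.update(resources[logical])
    return sorted(held)
```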

Publication II
  • Optionally can register copied file in MCAT
    (decision a matter of policy).
  • Extra step for data distribution to ccin2p3:
  • Publication of files in ccin2p3 MCAT.
  • Required since current SRB v3 does not allow
    replication across zones.
  • Extra step not a problem since the data need an
    integrity check before publishing anyway.
  • Important note: data can be published and accessed
    at ccin2p3 or SLAC since the MCATs will be
    synchronized regularly.

SC2003 demonstration
  • Demonstrated distribution of detector conditions
    files using scheme previously described to 5
    sites:
  • SLAC, ccin2p3, Rome, Bristol, Iowa State.
  • Data were distributed over 2 servers at SLAC and
    files were copied in a round-robin manner from
    each server.
  • Files were continuously copied and deleted at
    target site.
  • Demonstration ran 1 full week continuously
    without problems.
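
Round-robin selection between source servers can be sketched with `itertools.cycle`. The function name and the server labels in the usage note are hypothetical; the demonstration's actual scripts are not described in the talk.

```python
from itertools import cycle


def assign_round_robin(files, servers):
    """Pair each file with the next server in turn."""
    if not servers:
        raise ValueError("at least one server is required")
    # zip stops when the file list is exhausted; cycle repeats the servers.
    return list(zip(files, cycle(servers)))
```

With two servers, successive files alternate between them, so `assign_round_robin(["a", "b", "c"], ["s1", "s2"])` pairs `a` with `s1`, `b` with `s2`, and `c` with `s1` again.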

SC2003 Demonstration
[Diagram labels: MCAT; locate data; authenticate user; SRB server; request to transfer data; transfer data to target; target sites Rome, ccin2p3, etc.]

Future work
  • System currently being used, but not yet
    considered full production quality. Missing
    items:
  • SRB log file parser to automatically catch
    errors.
  • SRB server load monitor (cpu, memory).
  • Automatic generation of SRB .MdasEnv and
    .MdasAuth files.
  • Automatic generation of new and deletion of old
    users in MCAT.
  • Better packaging of client and server apps.
  • MCAT integrity checking scripts.
  • Better integration with BaBar Bookkeeping.

Future work
  • Slightly longer term:
  • Inclusion of management system to manage SRB
    requests.
  • If the system is heavily used, it will need to
    queue requests.
  • Currently looking at Stork to manage multiple
    requests (http://www.cs.wisc.edu/condor/stork/).
  • Interoperation with Replica Location Service
    (RLS) and Storage Resource Manager (SRM) (see
    Simon Metson's talk).
  • Allows integration with LCG tools.
  • Move to grid-services.
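
Queuing requests so that only a bounded number run at once can be sketched with the Python standard library. This is a generic illustration of request queuing, not Stork and not its interface; `run_queued` and its parameters are hypothetical.

```python
import queue
import threading


def run_queued(requests, handler, workers=2):
    """Run `handler` on every request with at most `workers` in flight."""
    pending = queue.Queue()
    for request in requests:
        pending.put(request)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                request = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            outcome = handler(request)
            with lock:  # protect the shared result list
                results.append((request, outcome))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Results arrive in completion order rather than submission order, so a caller that needs a stable order should sort them.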

Summary
  • Extensive testing and interaction with SRB
    developers has allowed BaBar to develop a data
    distribution system based on existing grid
    middleware.
  • Used for distributing conditions files to 5 sites
    since October.
  • Will be used to distribute:
  • Detector conditions files.
  • Event store files.
  • Random trigger files (used for simulation).
  • BaBar's data distribution system is sufficiently
    modular that it can be adapted to other
    environments.
