DIRAC: Data Transfer Framework - PowerPoint PPT Presentation

1
DIRAC Data Transfer Framework

2
Overview
  • Introduction to the Transfer Agent
  • Request DB functionality
  • Use of Replica Manager tools
  • Overview of bulk operations
  • Use of Configuration Service, File Catalog,
    Accounting Service and Monitoring Service
  • Integration with FTS

3
Data Transfer Framework
  • Data Transfer framework supplies the data
    management tools in DIRAC to allow the transfer,
    registration and removal of files.
  • Framework uses the following agents/tools/services
  • Transfer Agent
  • Request DB
  • Replica Manager
  • File Catalog Interfaces
  • Accounting Service
  • Monitoring Service
  • Configuration Service
  • File Transfer Service Client

4
Transfer Agent
  • Transfer Agent is the core of the transfer framework.
  • It is where the functionalities of the components are
    tied together.
  • Each DIRAC installation has the Transfer Agent
    installed and running by default
  • Operated using nohup runit daemon tools
  • Initialise the transfer agent
  • Execute it periodically
  • Period of execution determined in the dirac.ini
    file.
  • Of the order of a minute.
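
The initialise-then-execute-periodically cycle can be sketched as below. This is a minimal illustration, not DIRAC's agent code: the section name TransferAgent, the option PollingPeriod and the Agent class are hypothetical, and only the idea of reading the period from an ini-style file comes from the slide.

```python
import configparser
import time


def read_polling_period(ini_text, default=60.0):
    """Return the polling period in seconds from ini-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Section and option names are hypothetical, not DIRAC's real keys.
    return parser.getfloat("TransferAgent", "PollingPeriod", fallback=default)


class Agent:
    """Stand-in agent that only counts its execution cycles."""

    def __init__(self, period):
        self.period = period
        self.cycles = 0

    def initialise(self):
        self.cycles = 0

    def execute(self):
        self.cycles += 1


def run(agent, max_cycles, sleep=time.sleep):
    """Initialise once, then execute and wait for a fixed number of cycles."""
    agent.initialise()
    for _ in range(max_cycles):
        agent.execute()
        sleep(agent.period)
```

The sleep function is passed in so the loop can be exercised without real waiting; a daemon supervisor would normally restart the process if it exits.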

5
Data Management Architecture
6
Transfer Agent cont.
  • Each time Transfer Agent executes it checks
    Request DB for pending or running requests.
  • These requests can be transfer, bulk transfer,
    registration or bulk removal operations.
  • Each of these different types treated
    independently by the Transfer Agent.
  • Transfer Agent loops over the operations of each
    type synchronously performing all requests of
    that type.
  • Could operate as distinct agents.
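
A rough sketch of the per-type loop described above, assuming each request is already a dictionary with a RequestType key and that one handler function exists per type. The handler mapping and the fixed ordering are illustrative assumptions.

```python
# Request types named on the slides; the processing order is an assumption.
REQUEST_TYPES = ("transfer", "bulk transfer", "registration", "bulk removal")


def execute_cycle(requests, handlers):
    """Process all requests of one type before moving on to the next type."""
    results = []
    for request_type in REQUEST_TYPES:
        handler = handlers[request_type]
        for request in requests:
            if request["RequestType"] == request_type:
                results.append((request_type, handler(request)))
    return results
```

Because each type has its own handler and shares no state with the others, the same handlers could be run by separate agents, which is the option the last bullet mentions.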

7
Request DB
  • Request DB contains information on the requests
    to be processed by the transfer agent.
  • Requests exist as XML files within a directory/file
    database
  • Local to the DIRAC install
  • transfer/ToDo
  • transfer/Done
  • XML file contains a tag/parameter to allow
    Transfer Agent to distinguish different request
    types (transfer, bulk transfer, registration,
    bulk removal)
  • Also, contains the relevant parameters for the
    transfer to execute a particular type of job.

8
Request DB cont. (Parameters in XML files)
  • Parameters of the jobs as a key/value pair stored
    as string
  • SourceSE = CERN_Castor
  • Value can also contain a list
  • LFN = /lhcb/production/DC04/file1,
    /lhcb/production/DC04/file2...
  • When request executed additional information
    added to XML by the transfer agent
  • e.g. submission time, FTS guid etc.
  • For blocking requests (transfer/registration/bulk
    removal) additional information is added after
    the operation has finished
  • For non-blocking requests (bulk transfer) this
    additional information used next time request is
    processed

9
Request DB cont. (Populating DB and obtaining params)
  • Populating Request DB
  • Job submission through secure job receiver
    service (see Stuart's talk)
  • JDL file submitted points to XML file containing
    required Request parameters
  • JDL file contains entries to designate as
    Transfer Request.
  • JobType = "request"
  • RequestType = "transfer"
  • Parsing XML files
  • When Transfer Agent executes the XML files read
    into Agent
  • XML DOM parser converts the XML key/value pairs
    to python dictionary.
  • Python dictionary used by Transfer Agent to
    perform operation

10
Transfer Agent Behaviour (Replica Manager tools)
  • Transfer Agent has separate logic for each type
    of request
  • Single file transfers and registration use
    Replica Manager functionality (see Andrei's
    talk)
  • transfer files from local cache to grid SE
  • replicate files from one grid SE to another
    (either by third party copy or two-step
    replication through local cache)
  • registration of files and replicas
  • removal of individual files from storage
  • removal of catalog entries
  • Replica Manager tools called from within Transfer
    Agent
  • Once finished XML file transferred to Done
    directory

11
Transfer Agent Behaviour (Bulk Operations)
  • Bulk Data Management Operations
  • Bulk transfer requests
  • Using gLite's File Transfer Service (FTS)
  • Bulk physical removal operations
  • Using bulk operation of srm-advisory-delete CLI
    tool.
  • Bulk transfer requests require
    preparation/submission of FTS jobs
  • FTS used asynchronously for efficient use of
    transfer agent
  • Monitoring of FTS jobs required
  • Bulk removal requests currently blocking
    operations
  • SURLs to be submitted are resolved then submitted
    to CLI
  • Successfully removed physical files then removed
    from catalog

12
File Catalog and the Transfer Agent
  • Interactions with the File Catalog required for
    all operations
  • Transfer Agent uses File Catalog API to be
    abstracted from actual catalog implementation
    (see Juan's talk)
  • Supports use of all available catalogs
  • Reading from the catalog
  • Obtain replica information (Site and PFNs) for
    given LFN
  • Used for single file and bulk file transfer
  • Bulk transfers require the PFN to be transformed
    to SURL for use with FTS
  • Writing to the catalog
  • Registering files and replicas
  • Removing file and replicas

13
Configuration Service and the Transfer Agent
  • Configuration Service used by Transfer Agent to
    obtain Storage Element information
  • Protocols supported, host for given protocol,
    path to LHCb area for particular protocol
  • Transfer Agent uses this information when
    interacting with grid SEs.
  • Creating PFNs/SURLs when replicating data to SE
  • Creating SURLs when deleting files
  • Deciding on how to replicate a single file
    between SEs
  • If the SE doesn't support third party transfer
    protocol
  • Must perform initial copy through local cache

14
Accounting Service and the Transfer Agent
  • Transfer agent sends accounting messages to the
    transfer accounting service to create and publish
    relevant statistics on the data management
    operations.
  • Transfer Operations
  • Source/target SEs, number of successful/failed
    file transfers, size of the completed transfer,
    time taken for transfer and registration
    operations, protocol used for transfer
  • Removal operations
  • Site of file removal, number of successful/failed
    physical removals, number of successful/failed
    removal from the catalog, size of the files
    removed, protocol used for removal, time taken
    for the physical removal and catalog removal
    operations
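
One way to picture a transfer accounting message is as a small record that accumulates the quantities listed above. The field names are hypothetical; only the quantities themselves come from the slide.

```python
from dataclasses import asdict, dataclass


@dataclass
class TransferAccounting:
    source_se: str
    target_se: str
    protocol: str
    successful: int = 0
    failed: int = 0
    bytes_transferred: int = 0
    transfer_seconds: float = 0.0
    registration_seconds: float = 0.0

    def record(self, ok, size, seconds):
        """Add one file transfer attempt to the running totals."""
        if ok:
            self.successful += 1
            # Only completed transfers count towards the transferred size.
            self.bytes_transferred += size
        else:
            self.failed += 1
        self.transfer_seconds += seconds

    def as_message(self):
        """Return a plain dictionary suitable for sending to a service."""
        return asdict(self)
```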

15
Monitoring Service and the Transfer Agent
  • As transfer agent jobs can be submitted through
    the WMS they are assigned a job ID.
  • This is then used as a key to send monitoring info
    to the monitoring service.
  • Each time Transfer Agent executes with an active
    request, monitoring information on the request is
    sent.
  • Currently only implemented for bulk transfer
    operations
  • Updates the webpage to show progress of transfer
  • As percentage of the submitted job size
  • To reflect the status of the transfer job
  • matched, submitted, running, done/failed
  • This functionality could be easily extended to
    include monitoring information of bulk removal
    requests.
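
The progress figure can be sketched as a simple ratio. This assumes "job size" means the total number of bytes in the submitted files; the slide does not say whether size is counted in bytes or in files.

```python
def transfer_progress(file_sizes, done_files):
    """Return the percentage of submitted bytes that have been transferred."""
    total = sum(file_sizes.values())
    if total == 0:
        # An empty job has no meaningful progress; report zero.
        return 0.0
    done = sum(size for name, size in file_sizes.items() if name in done_files)
    return 100.0 * done / total
```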

16
File Transfer Service and the Transfer Agent
  • File Transfer Service supplied by gLite to allow
    the point to point transfer of physical files
  • FTS accepts the source/destination SURL pairs
    then performs the transfer asynchronously
  • FTS allows the submission of bulk transfers of
    >>100s of files
  • Relies on definition of channels between the
    source and destination SE.
  • At the moment we have channels set up between
    CERN-T1s and (most) T1s-T1s matrix.
  • In future FTS (sometimes referred to as FPS) will
    transparently handle the submission and routing
    of transfers from any site to any other site
    through central FTS server at CERN

17
Architecture of LHCb integration with FTS
[Architecture diagram showing the LCG and LHCb sides]
  • Transfer Agent responsible for creation of FTS
    requests
  • Asynchronous FTS Jobs submitted through client
    CLI
  • Transfer Agent requests monitoring information
    from FTS via CLI
  • This information parsed by Transfer Agent and the
    relevant action taken.

18
Transfer Agent Logic for FTS Job Submission
19
Transfer Agent Logic for FTS Job Monitoring
20
Summary
  • Transfer Agent interacts with many services to
    deliver Data Management Framework
  • Request DB, File Catalog, Configuration Service,
    Replica Manager, Accounting Service, Monitoring
    Service, gLite FTS (bulk transfers)
  • Transfer Agent supports transfer, registration,
    bulk transfer and bulk removal operations which
    are treated independently
  • Transfer and registration requests utilise the
    functionality of the replica manager module
  • Bulk transfer and bulk removal operations use
    additional functionality

21
On-going/possible future work
  • New development of bulk operations raised
    bottleneck issues
  • Integration of new catalog functionality for
    replica information
  • Use of sessions for bulk registration
  • FTS Jobs asynchronous but submission can take
    2 mins per 100 files
  • Possible to have multithreaded submission agent
  • Request DB to be transferred to SQL Service
  • Request API there so only have to implement
    backend
  • Possible deployment of separate deletion agent
    because of blocking operation
  • Integration of new FTS features and VO agent
    features
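
The multithreaded submission idea raised above could be prototyped with a thread pool that submits batches of source/destination pairs concurrently. The batch size, worker count and the submit_job callable are illustrative assumptions; a real agent would call the FTS client here.

```python
from concurrent.futures import ThreadPoolExecutor


def chunks(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def submit_in_parallel(surl_pairs, submit_job, batch_size=100, workers=4):
    """Submit batches concurrently and return results in batch order."""
    batches = chunks(surl_pairs, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the input batches.
        return list(pool.map(submit_job, batches))
```

Threads suit this case because submission time is dominated by waiting on an external service rather than by computation.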

22
Questions?