Title: DDM
1. DDM
- ATLAS Software Week, 23-27 May 2005
2. Outline
- Existing Tools
- Lessons Learned
- DQ2 Architecture
- Datasets
- Catalogs, mappings and interactions
- DQ2 Prototype
- Technology
- Servers, CLI, web page
- Data Movement, subscriptions and datablocks
- Plans for future
- SC3
- Evolution of prototype
3. Tools
- dms2
  - First version of the tool for end-users
  - Uses the DQ1 Production Servers
  - Issues: slow servers and Grid catalogs, problems with file movement
- dms3 (current version)
  - Uses an improved version of the DQ1 Servers, separate from Production
  - Includes Reliable File Transfer
  - Issues: slow Grid catalogs, dealing only with individual files is difficult
- dms4
  - Will use the new Distributed Data Management infrastructure (DQ2 Servers)
  - New Grid catalogs, native support for Datasets, automatic movement of blocks of data
4. dms3
- dms3 is currently the tool to get data from the Rome or DC-2 production
- Still based on the older DQ Production Servers and existing Grid catalogs
- Requires Grid Certificates
  - http://lcg-registrar.cern.ch
- Documentation on Twiki, including installation notes and common use cases
  - https://uimon.cern.ch/twiki/bin/view/Atlas/DonQuijoteDms3
  - (please, feel free to add notes to this page!)
- Software
  - CERN
    - /afs/cern.ch/atlas/offline/external/DQClient/dms3/dms3.py
  - External users
    - Download http://cern.ch/mbranco/cern/dms3.tar.gz
    - or contact your site administrator to have a shared installation
5. Reliable File Transfer
- DQ includes a simple file transfer service, RFT
  - MySQL backend database with transfer definitions and queue
  - Set of transfer agents fetching requests from the database (sketched below)
  - Uses only GridFTP to move data; supports SRM (get, put) and transfer priorities
  - Uses the existing DQ Servers to interface with the Grid catalogs
- Two client interfaces
  - dms3, using the replicate command: the transfer is queued into RFT
    - for end-users only; transfer priority is limited
  - super-user client
    - allows priorities to be set, transfers to be rescheduled or cancelled, tagging of transfers, ...
    - access limited to Production
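- As a rough illustration of the agent model above, the sketch below shows a transfer agent polling a MySQL queue table and handing requests to GridFTP. The table layout, column names and the submit_gridftp() helper are hypothetical, not the actual RFT schema.

    # Minimal sketch of an RFT-style transfer agent (hypothetical schema).
    import time
    import MySQLdb  # DB-API 2.0 driver for the MySQL backend

    def submit_gridftp(source_surl, dest_surl):
        """Placeholder for the actual GridFTP/SRM transfer call."""
        raise NotImplementedError

    def agent_loop(db):
        while True:
            cur = db.cursor()
            # Take the highest-priority queued request
            cur.execute("SELECT id, source_surl, dest_surl FROM transfer_queue "
                        "WHERE state='QUEUED' ORDER BY priority DESC LIMIT 1")
            row = cur.fetchone()
            if row is None:
                time.sleep(30)            # nothing queued, poll again later
                continue
            req_id, src, dst = row
            cur.execute("UPDATE transfer_queue SET state='ACTIVE' WHERE id=%s", (req_id,))
            db.commit()
            try:
                submit_gridftp(src, dst)
                state = 'DONE'
            except Exception:
                state = 'FAILED'          # a real agent would retry and log
            cur.execute("UPDATE transfer_queue SET state=%s WHERE id=%s", (state, req_id))
            db.commit()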
6. Reliable File Transfer
- Transfer rate depends mostly on the status of sites
  - Can go from smooth running to a very large number of failures
  - critical for files with a single copy on the Grid
- Bottlenecks
  - Grid catalogs for querying
  - Transferring individual files, not blocks of files
  - Lack of people to monitor transfer failures and sites
  - A single machine being used
    - 7000 transfers/day (mostly requests to transfer data to CERN)
    - Will not increase, otherwise it kills Castor@CERN for other users
    - Before Christmas, with an additional machine, we did double that amount
    - But it was killing the Castor GridFTP front-end machines very often
7. Lessons learned
- Catalogs were provided by the Grid providers and used as-is
  - Granularity: file-level
    - No datasets, no file collections
  - No scoping of queries (difficult to find data, slow)
  - No bulk operations
  - Metadata support not usable
    - Too slow
    - No valid workaround to query data per site, MD5 checksums, file sizes
    - Logical Collection Name as metadata, e.g. /datafiles/rome/
  - Catalogs not always geographically distributed
    - Single point of failure (middleware, people/timezones)
- No ATLAS resources information system (with known/negotiated QoS)
  - and unreliable information systems from the Grid providers
8. Lessons learned
- No managed and transparent data access, unreliable GridFTP
  - SRM (and GridFTP with mass storage) still not sufficient
  - Difficult to handle the different mass storage stagers from the Grid
- DQ
  - Single point of failure
  - Naïve validation procedure
    - No self-validation at sites, between site contents and global catalogs
- Operations level
  - Too centralized
  - Insufficient man-power
    - Still need to identify site contacts, at least for major sites
  - Insufficient training for users/production managers
    - Lack of coordination: launching requests for files not staged, ...
  - Also due to the lack of automatic connections between Data Management and Production System tasks
  - Monitoring: insufficient tools and people!
9. Lessons learned
- Multiple flavors of Grid Catalogs with slightly different interfaces
  - Effort wasted on developing common interfaces
  - Minimal functionality with maximum error propagation!
- No single data management tool for
  - Production
  - End-user analysis
  - (common across all Grids!)
- No reliable file transfer plugged into the Production System
  - Moving individual files is non-optimal!
- Too many sites used for permanent storage
  - Should restrict the list and comply with the Computing Model and Tier organization
10. Distributed Data Management - outline
- The Database (and Data Management) project recently took responsibility in this area (formerly Production)
- Approach: proceed by evolving Don Quijote, while revisiting requirements, design and implementation
  - Provide continuity of the needed functionality
- Add dataset management above file management (see the data-model sketch below)
  - Dataset: named collection of files + descriptive metadata
  - Container Dataset: named collection of datasets + descriptive metadata
- Design, implementation and component selection driven by startup requirements for performance and functionality
  - Covering end-user analysis (with priority) as well as production
  - Make decisions on implementation and component selection accordingly, to achieve the most capable system
  - Foresee progressive integration of new middleware over time
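- To make the dataset terminology above concrete, here is a minimal Python sketch of the data model; the class and attribute names are illustrative only, not the actual DQ2 schema.

    # Illustrative data model for the dataset concepts above (hypothetical names).

    class Dataset:
        """Named collection of files plus descriptive metadata, versioned."""
        def __init__(self, name, metadata=None):
            self.name = name
            self.metadata = metadata or {}
            self.versions = {}        # version number -> list of (lfn, guid)

        def add_version(self, version, files):
            self.versions[version] = list(files)

    class ContainerDataset:
        """Named collection of datasets plus descriptive metadata."""
        def __init__(self, name, metadata=None):
            self.name = name
            self.metadata = metadata or {}
            self.datasets = []        # names of the constituent datasets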
11. Don Quijote 2
- Moves from a file-based system to one based on datasets
- Hides file-level granularity from users
  - A hierarchical structure makes cataloging more manageable
  - However, file-level access is still possible
- Scalable global data discovery and access via a catalog hierarchy
- No global physical file replica catalog (but a global dataset replica catalog and a global logical file catalog)
12. Catalog architecture and interactions
13. Global catalogs
- Holds all dataset names and unique IDs (+ system metadata)
- Maintains versioning information and information on container datasets, i.e. datasets consisting of other datasets
- Maps each dataset to its constituent files: this catalog holds info on every logical file, so it must be highly scalable; however, it can be highly partitioned using metadata etc. (see the lookup sketch below)
- Stores the locations of each dataset
- All logically global, but may be distributed physically
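- A rough sketch of how a lookup could flow through these global catalogs, using simple in-memory mappings; the dataset names and the resolve() helper are invented for illustration.

    # Illustrative resolution flow through the global catalogs (all names hypothetical).

    # dataset name -> unique dataset ID (+ system metadata)
    dataset_repository = {"rome.A.recon.v1": "dsid-0001"}

    # dataset ID -> constituent files as (lfn, guid) pairs
    dataset_content = {"dsid-0001": [("rome.A.recon._0001.pool.root", "guid-1"),
                                     ("rome.A.recon._0002.pool.root", "guid-2")]}

    # dataset ID -> sites holding a replica of the dataset
    dataset_location = {"dsid-0001": ["CERN", "BNL"]}

    def resolve(dataset_name):
        """Find the logical files and replica locations of a dataset by name."""
        dsid = dataset_repository[dataset_name]
        return dataset_content[dsid], dataset_location[dsid]

    files, sites = resolve("rome.A.recon.v1")   # no global physical file catalog needed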
14. Local Catalogs
- Per grid/site/tier logical-to-physical file name mapping. Implementations of this catalog are Grid-specific but must use a standard interface (see the sketch below).
- Per-site storage of user claims on files and datasets. Claims are used to manage stage lifetime and resources, and to provide accounting.
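- The "standard interface" requirement could look roughly like the following; the class and method names are hypothetical and only illustrate the idea of Grid-specific back-ends behind one common interface.

    # Sketch of a common local-catalog interface with Grid-specific back-ends
    # (hypothetical names).

    class LocalReplicaCatalog:
        """Standard interface: map logical files (GUID/LFN) to physical file names at a site."""
        def get_pfns(self, guid):
            raise NotImplementedError
        def register_pfn(self, guid, lfn, pfn):
            raise NotImplementedError

    class GridNativeCatalog(LocalReplicaCatalog):
        """Back-end wrapping whatever replica catalog a given Grid flavour provides."""
        def get_pfns(self, guid):
            return []   # ... query the Grid-specific catalog service ...
        def register_pfn(self, guid, lfn, pfn):
            pass        # ... insert the mapping via the Grid-specific API ...

    class PoolXMLCatalog(LocalReplicaCatalog):
        """Back-end using a local XML POOL file catalog."""
        def get_pfns(self, guid):
            return []   # ... parse the local XML POOL FC ...
        def register_pfn(self, guid, lfn, pfn):
            pass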
15. (Some) DDM Use Cases (1)
- Data acquisition and publication
  - Publish replica locations
  - Publish dataset info
  - Publish file replica locations
  - Publish dataset file content
16. DDM Use Cases (2)
- Select datasets based on physics attributes
- Versioning / container dataset info
- Get the locations of datasets
- Local file information
- Get the files in the datasets
17. DDM Use Cases (3)
- Dataset replication (see also subscriptions later)
  - Get current dataset location, replicate, then publish the new replica info
  - Get/publish local file info
  - Get the files to replicate
- For more use cases and details see https://uimon.cern.ch/twiki/bin/view/Atlas/DonQuijoteUseCases
18. Implementation - Prototype Development Status
- Technology choices
  - Python clients/servers based on HTTP GET/POST (see the client sketch below)
  - The POOL FC interface gives us a choice of back-end (all our catalogs fit the LFN-GUID-PFN mapping scheme)
  - For the prototype a MySQL DB is used (with a planned future evaluation of the LCG File Catalog, which would give us ACLs, support for user-defined catalogs, etc.)
- Servers
  - Use HTTPS (with Globus proxy certs) for POSTs and HTTP for GETs, i.e. world-readable data (can be made secure to e.g. the ATLAS VO if required)
- Clients
  - Python command-line client per server and an overall UI client, dq2
  - Web page interface directly to the HTTP servers for querying
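- As an illustration of the HTTP GET side of this scheme, a query client can be as simple as the sketch below; the server host, path and parameter names are hypothetical, not the actual prototype API.

    # Minimal sketch of a catalog query over plain HTTP GET (Python 2-style urllib);
    # host, path and parameters are made up for illustration.
    import urllib

    def list_files_in_dataset(dataset_name, server="http://ddm-server.example.org:8000"):
        params = urllib.urlencode({"dsn": dataset_name, "version": "latest"})
        url = "%s/content/listFiles?%s" % (server, params)
        response = urllib.urlopen(url)   # world-readable query, no certificate needed
        return response.read()           # e.g. a list of LFN/GUID pairs

    # Writes (POSTs) would instead go over HTTPS with a Globus proxy certificate.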
19. dq2 commands
- Usage: dq2 <command> <args>
- Commands:
  - registerNewDataset <dataset name> <lfn1 guid1 lfn2 guid2 ...>
  - registerDatasetLocations <-i|-c> [-v <dataset version>] <dataset name> <location(s)>
  - registerNewVersion <dataset name> [<new files: lfn1 guid1 lfn2 guid2 ...>]
  - listDatasetReplicas [-i|-c] [-v <dataset version>] <dataset name>
  - listFilesInDataset [-v <dataset version>] <dataset name>
  - listDatasetsInSite [-i|-c] <site name>
  - listFileReplicas <logical file name>
  - listDatasets [-v <dataset version>] <dataset name>
  - eraseDataset <dataset name>
- -i and -c signify incomplete and complete datasets respectively (mandatory for adds, optional for queries; the default is to return both). If no -v option is supplied, the latest version is used. (Example invocation below.)
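- An illustrative invocation sequence using the commands above; the dataset name, file entries and site are made up.

    dq2 registerNewDataset rome.mydata.test lfn1 guid1 lfn2 guid2
    dq2 registerDatasetLocations -c rome.mydata.test CERN
    dq2 listDatasetReplicas rome.mydata.test
    dq2 listFilesInDataset rome.mydata.test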
20. Web browser interface
21. Datablocks
- Datablocks are defined as immutable and unbreakable collections of files
  - They are a special case of datasets
  - A site cannot hold partial datablocks
  - There are no versions for datablocks
- Used to aggregate files for convenient distribution
  - Files are grouped together by physics properties, run number, etc. (see the sketch after this list)
  - Much more scalable than file-level distribution
- The principal means of data distribution and data discovery
  - Immutability avoids consistency problems when distributing data
  - Moving data in blocks improves data distribution (bulk SRM requests)
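- A toy illustration of grouping files into datablocks by run number, as described above; the file names, run numbers and datablock naming scheme are invented.

    # Toy example: group files into immutable datablocks by run number
    # (all names and numbers made up).
    files = [
        ("rome.recon._0001.pool.root", "guid-a", 1234),   # (lfn, guid, run number)
        ("rome.recon._0002.pool.root", "guid-b", 1234),
        ("rome.recon._0003.pool.root", "guid-c", 1235),
    ]

    datablocks = {}
    for lfn, guid, run in files:
        name = "rome.recon.run%06d" % run                 # one datablock per run
        datablocks.setdefault(name, []).append((lfn, guid))

    # Once published, a datablock is never modified (no versions) and sites
    # replicate it only as a whole (no partial datablocks).
    for name, content in datablocks.items():
        print("%s: %d files" % (name, len(content)))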
22. Subscriptions
- A site can subscribe to data
  - When a new version is available, the latest version of the dataset is automatically made available through site-local specific services carrying out the required replication
- Subscriptions can be made to datasets (for file distribution) or to container datasets (for datablock distribution)
- Use cases
  - Automatic distribution of datasets holding a variable collection of datablocks (container datasets)
  - Automatic replication of files by subscribing to a mutable dataset (e.g. file-based calibration data distribution)
(Diagram: subscriptions between Site X and Site Y)
23. Subscriptions
- The system supports subscriptions for
  - Datasets
    - latest version of a dataset (triggers automatic updates whenever a new version appears; see the sketch after this list)
  - Container Datasets
    - which in turn contain datablocks or datasets
    - supports subscriptions to the latest version of a container dataset (automatically triggers updates whenever e.g. the set of datablocks making up the container dataset changes)
  - Datablocks (immutable sets of files)
  - Databuckets (see details next)
    - replication of a set of files using a notification model (whenever new content appears in the databucket, replication is triggered)
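- A rough sketch of the "latest version" subscription logic described above: a site-local agent periodically compares the latest catalogued version of each subscribed dataset with what the site last replicated, and triggers replication on change. The catalog and replication helpers are placeholders.

    # Sketch of a site-local subscription check (all helpers hypothetical).

    subscriptions = {"rome.calib.conditions": 0}   # dataset name -> last replicated version

    def latest_version(dataset_name):
        """Placeholder: query the global dataset repository/content catalogs."""
        raise NotImplementedError

    def replicate_to_local_site(dataset_name, version):
        """Placeholder: queue the missing files into the file transfer service."""
        raise NotImplementedError

    def check_subscriptions():
        for dsn, have_version in subscriptions.items():
            latest = latest_version(dsn)
            if latest > have_version:              # a new version appeared
                replicate_to_local_site(dsn, latest)
                subscriptions[dsn] = latest        # record what the site now holds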
24. Subscription Agents
(Table: the subscription agents, their function, and the file state they track in the local XML POOL FC)
25. Data buckets
(Diagram: a file-based data bucket and a remote site)
26. DQ concepts vs DQ2 concepts
- DQ
  - File, identified by GUID or by LFN
  - The only unit for data movement, querying and identifying sites (PFN), ...
27. Claims
28. Plans for future development
- Service Challenge 3: July - December
- Prototype evolution
  - Fill catalogs with real data (Rome) and test robustness and scalability
  - Implement the catalogs not yet done (hierarchy, claims)
- External components
  - Testing of gLite FTS to start soon
  - POOL FC interfaces for LFC should be available soon; will evaluate it as a suitable backend based on performance
- Users
  - Agreed with TDAQ to start discussions on whether DDM can/should be applied to EF -> T0 data movement
  - Support commissioning in the near term
  - Gradual release to the user community for analysis
29. What still needs to be done / Milestones
- To be done
  - Finish the hierarchical cataloguing system
  - Monitoring/logging of user operations
  - Security policies
  - Claims management system
- Milestones
  - June, July, August, September, October ...