Title: ARDA Reports to the LCG SC2
1 ARDA Reports to the LCG SC2
- L.A.T. Bauerdick
- for the RTAG-11/ARDA group
- Architectural Roadmap towards Distributed Analysis
2 ARDA Mandate
3 ARDA Schedule and Makeup
- ALICE: Fons Rademakers and Predrag Buncic
- ATLAS: Roger Jones and Rob Gardner
- CMS: Lothar Bauerdick and Lucia Silvestris
- LHCb: Philippe Charpentier and Andrei Tsaregorodtsev
- LCG GTA: David Foster, stand-in Massimo Lamanna
- LCG AA: Torre Wenaus
- GAG: Federico Carminati
4 ARDA mode of operation
- Thanks to an excellent committee -- large expertise, agility and responsiveness, very constructive and open-minded, and sacrificing quite a bit of the summer
- Series of weekly meetings in July and August, mini-workshop in September
- Invited talks from existing experiments' projects:
- Summary of Caltech GAE workshop (Torre)
- PROOF (Fons)
- AliEn (Predrag)
- DIAL (David Adams)
- GAE and Clarens (Conrad Steenberg)
- Ganga (Pere Mato)
- Dirac (Andrei)
- Cross-check with other projects of the emerging ARDA decomposition of services:
- Magda, DIAL -- Torre, Rob
- EDG, NorduGrid -- Andrei, Massimo
- SAM, MCRunjob -- Roger, Lothar
- BOSS, MCRunjob -- Lucia, Lothar
- Clarens, GAE -- Lucia, Lothar
- Ganga -- Rob, Torre
- PROOF -- Fons
- AliEn -- Predrag
5 Initial Picture: Distributed Analysis (Torre, Caltech w/s)
6 HEPCAL-II Analysis Use Cases
- Scenarios based on the GAG HEPCAL-II report
- Determine datasets and eventually event components
- Input data are selected via a query to a metadata catalogue
- Perform iterative analysis activity:
- Selection and algorithm are passed to a workload management system, together with a specification of the execution environment
- Algorithms are executed on one or many nodes
- User monitors progress of job execution
- Results are gathered together and passed back to the job owner
- Resulting datasets can be published to be accessible to other users
- Specific requirements from HEPCAL-II:
- Job traceability, provenance, logbooks
- Also discussed: support for finer-grained access control and enabling data sharing within physics groups
7 Analysis Scenario
- This scenario represents the analysis activity from the user's perspective. However, some other actions are carried out behind the scenes of the user interface.
- To carry out the analysis tasks, users access shared computing resources. To do so, they must be registered with their Virtual Organization (VO), authenticated, and their actions must be authorized according to their roles within the VO.
- The user specifies the necessary execution environment (software packages, databases, system requirements, etc.) and the system ensures it on the execution node. In particular, the necessary environment can be installed according to the needs of a particular job.
- The execution of the user job may trigger transfers of various datasets between a user interface computer, execution nodes and storage elements. These transfers are transparent to the user.
8 Example: Asynchronous Analysis
- Running Grid-based analysis from inside ROOT (adapted from an AliEn example)
- ROOT calling the ARDA API from the command prompt:
- // connect and authenticate to the Grid service "arda" as user "lucia"
- TGrid *arda = TGrid::Connect("arda", "lucia", "", "");
- // create a new analysis object (<unique ID>, <title>, #subjobs)
- TArdaAnalysis *analysis = new TArdaAnalysis("pass001", "MyAnalysis", 10);
- // set the program which executes the analysis macro/script
- analysis->Exec("ArdaRoot.sh", "file:/home/vincenzo/test.C"); // script to execute
- // set up the event metadata query
- analysis->Query("2003-09/V6.08.Rev.04/00110/gjetmet.root?pt>0.2");
- // specify job splitting and run
- analysis->OutputFileAutoMerge(true); // merge all produced .root files
- analysis->Split(); // split the task into subjobs
- analysis->Run(); // submit all subjobs to the ARDA queue
- // asynchronously, at any time, get the (partial or complete) results
- analysis->GetResults(); // download partial/final results and merge them
- analysis->Info(); // display job information
9 Asynchronous Analysis Model
- Extract a subset of the datasets from the virtual file catalogue using metadata conditions provided by the user.
- Split the tasks according to the location of the datasets (see the sketch after this list).
- A trade-off has to be found between best use of available resources and minimal data movement. Ideally, jobs should be executed where the data are stored. Since one cannot expect a uniform storage location distribution for every subset of data, the analysis framework has to negotiate with dedicated Grid services the balance between local data access and data replication.
- Spawn sub-jobs and submit them to Workload Management with precise job descriptions.
- The user can control the results while and after the data are processed.
- Collect and merge available results from all terminated sub-jobs on request.
- Analysis objects associated with the analysis task remain persistent in the Grid environment, so the user can go offline and reload an analysis task at a later date, check the status, merge current results, or resubmit the same task with modified analysis code.
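- A minimal sketch of the catalogue-query, split-by-location, submit and merge sequence described above; all types and functions (FileReplica, QueryCatalogue, SubmitSubJob, RunAsyncAnalysis) are hypothetical illustrations, not part of any existing ARDA or AliEn interface.

  #include <map>
  #include <string>
  #include <vector>

  // One replica of a logical file, held at a given Storage Element.
  struct FileReplica { std::string lfn; std::string storageElement; };

  // Stub: a real service would query the metadata and file catalogues.
  std::vector<FileReplica> QueryCatalogue(const std::string& /*metadataQuery*/) {
    return {};
  }

  // Stub: a real service would hand the sub-job to workload management,
  // pinned to the storage element that holds its input.
  std::string SubmitSubJob(const std::string& se,
                           const std::vector<std::string>& /*inputLfns*/,
                           const std::string& /*macro*/) {
    return "subjob@" + se;
  }

  // Catalogue query -> split by data location -> spawn sub-jobs.
  std::vector<std::string> RunAsyncAnalysis(const std::string& metadataQuery,
                                            const std::string& macro) {
    // 1. Extract the matching subset of datasets from the virtual file catalogue.
    std::vector<FileReplica> replicas = QueryCatalogue(metadataQuery);

    // 2. Group inputs by the storage element where they reside, so each
    //    sub-job runs close to its data (modulo replication trade-offs).
    std::map<std::string, std::vector<std::string>> bySE;
    for (const FileReplica& r : replicas) bySE[r.storageElement].push_back(r.lfn);

    // 3. Spawn one sub-job per storage element with a precise job description.
    std::vector<std::string> subJobIds;
    for (const auto& group : bySE)
      subJobIds.push_back(SubmitSubJob(group.first, group.second, macro));

    // 4. Partial results of finished sub-jobs can later be collected and
    //    merged on request (cf. GetResults() in the ROOT example above).
    return subJobIds;
  }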
10 Synchronous Analysis
- Scenario using PROOF in the Grid environment
- Parallel ROOT Facility, main developer Maarten Ballintijn/MIT
- PROOF already provides a ROOT-based framework to use (local) cluster computing resources:
- balances the workload dynamically, with the goal of optimizing CPU exploitation and minimizing data transfers
- makes use of the inherent parallelism in event data
- works in heterogeneous clusters with distributed storage
- Extend this to the Grid using interactive analysis services that could be based on the ARDA services (see the sketch below)
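- A minimal PROOF usage sketch in ROOT, to make the synchronous scenario concrete. Host, file and selector names are placeholders, and the calls follow the later ROOT PROOF API (TProof::Open, TChain::SetProof), which may differ in detail from the interface available at the time.

  {
    // Open a PROOF session on a (placeholder) master node.
    TProof *proof = TProof::Open("proofmaster.example.org");

    // Build a chain of event trees; in an ARDA/Grid setting the file list
    // would come from the file catalogue rather than being hard-coded.
    TChain chain("Events");
    chain.Add("root://se.example.org//store/gjetmet_1.root");
    chain.Add("root://se.example.org//store/gjetmet_2.root");

    // Route processing through PROOF: the selector is distributed, compiled
    // and run in parallel on the workers, which process the events they own.
    chain.SetProof();
    chain.Process("MySelector.C+");

    // Outputs registered by the selector (e.g. histograms) are merged on
    // the master and returned to the client session.
  }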
11 ARDA Roadmap Informed by DA Implementations
- Following SC2 advice, reviewed major existing DA projects
- Clearly AliEn today provides the most complete implementation of a distributed analysis service that is fully functional -- it also interfaces to PROOF:
- Implements the major HEPCAL-II use cases
- Presents a clean API to experiment applications, Web portals, ...
- Should address most requirements for upcoming experiment physics studies
- Existing and fully functional interface to a complete analysis package --- ROOT
- Interface to the PROOF cluster-based interactive analysis system
- Interfaces to any other system well defined and certainly feasible
- Based on Web services, with a global (federated) database giving state and persistency to the system
- ARDA approach:
- Re-factor AliEn, using the experience of the other projects, to generalize it into an architecture; consider OGSI as a natural foundation for that
- Confront ARDA services with existing projects (notably EDG, SAM, Dirac, etc.)
- Synthesize service definitions, defining their contracts and behavior
- Blueprint for an initial distributed analysis service infrastructure
12 ARDA Distributed Analysis Services
- Distributed Analysis in a Grid-services-based architecture:
- ARDA Services should be OGSI compliant -- built upon OGSI middleware
- Frameworks and applications use the ARDA API, with bindings to C, Java, Python, Perl, ...
- interface through a UI/API factory -- authentication, persistent session
- Fabric interface to resources through CE and SE services
- job description language based on Condor ClassAds and matchmaking
- Database(s) through a Dbase Proxy provide statefulness and persistence
- We arrived at a decomposition into the following key services:
- API and User Interface
- Authentication, Authorization, Accounting and Auditing services
- Workload Management and Data Management services
- File and (event) Metadata Catalogues
- Information service
- Grid and Job Monitoring services
- Storage Element and Computing Element services
- Package Manager and Job Provenance services
13 AliEn (re-factored)
14 ARDA Key Services for Distributed Analysis
15 ARDA HEPCAL matching: an example
- HEPCAL-II Use Case: Group Level Analysis (GLA)
- User specifies job information including:
- Selection criteria
- Metadata Dataset (input)
- Information about s/w (library) and configuration versions
- Output AOD and/or TAG Dataset (typical)
- Program to be run
- User submits job
- Program is run:
- Selection criteria are used for a query on the Metadata Dataset
- Event IDs satisfying the selection criteria and the Logical Dataset Names of the corresponding Datasets are retrieved
- Input Datasets are accessed
- Events are read
- Algorithm (program) is applied to the events
- Output Datasets are uploaded
- Experiment Metadata is updated
- A report summarizing the output of the jobs is prepared for the group (e.g. how many events to which stream, ...), extracting the information from the application and the Grid middleware
- ARDA services exercised in this use case:
- Authentication
- Authorization
- Metadata catalog
- Package manager
- Compute element
- File catalog
- Data management
- Storage Element
- Metadata catalog
- Job provenance
16 API to Grid services
- ARDA services present an API, called by applications like the experiment frameworks, interactive analysis packages, Grid portals, Grid shells, etc.
- In particular, note the importance of the UI/API:
- Interfaces services to higher-level software:
- Experiment framework
- Analysis shells, e.g. ROOT
- Grid portals and other forms of user interaction with the environment
- Advanced services, e.g. virtual data, analysis logbooks, etc.
- Provides experiment-specific services:
- Data and Metadata management systems
- Provides an API that others can program against
- Benefits of a common API to the framework:
- Goes beyond traditional UIs à la GANGA, Grid portals, etc.
- Benefits in interfacing to analysis applications like ROOT et al.
- Process to arrive at a common API between experiments --> prototype
- The UI/API can use Condor ClassAds as a Job Description Language (see the sketch below)
- This will maintain compatibility with existing job execution services, in particular LCG-1.
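- A minimal sketch of how an application might drive such a UI/API. The ArdaSession class and its Connect/Submit methods are invented for illustration (only TGrid/TArdaAnalysis appear in the earlier example), and the ClassAd-style attributes are an illustrative guess, not an agreed schema.

  #include <iostream>
  #include <string>

  class ArdaSession {            // would be obtained from the UI/API factory
  public:
    // Authenticate against the VO and open a persistent session.
    static ArdaSession Connect(const std::string& service,
                               const std::string& user) {
      std::cout << "connected to " << service << " as " << user << "\n";
      return ArdaSession();
    }
    // Hand a Condor-ClassAd-style job description to workload management.
    std::string Submit(const std::string& classAd) {
      std::cout << "submitted:\n" << classAd;
      return "job-0001";         // job identifier returned by the service
    }
  };

  int main() {
    ArdaSession session = ArdaSession::Connect("arda", "lucia");

    // Illustrative ClassAd/JDL attributes; the real schema is to be defined
    // by the prototype, compatible with existing services such as LCG-1.
    const std::string jobAd =
        "Executable    = \"ArdaRoot.sh\";\n"
        "Arguments     = \"test.C\";\n"
        "InputData     = {\"lfn:2003-09/V6.08.Rev.04/00110/gjetmet.root\"};\n"
        "Requirements  = Member(\"ROOT-3.05\", other.InstalledPackages);\n"
        "Rank          = other.FreeCPUs;\n";

    std::string jobId = session.Submit(jobAd);
    std::cout << "job id: " << jobId << "\n";
    return 0;
  }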
17 API and User Interface
18 File Catalogue and Data Management
- Input and output associated with any job can be registered in the VO's File Catalogue, a virtual file system in which a logical name is assigned to a file.
- Unlike real file systems, the File Catalogue does not own the files; it only keeps an association between the Logical File Name (LFN) and (possibly more than one) Physical File Name (PFN) on a real file or mass storage system. PFNs describe the physical location of the files and include the name of the Storage Element and the path to the local file (see the sketch below).
- The system should support file replication and caching and will use file location information when it comes to scheduling jobs for execution.
- The directories and files in the File Catalogue have privileges for owner, group and world. This means that every user can have exclusive read and write privileges for his portion of the logical file namespace (home directory).
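- A small sketch of the catalogue data model just described: one LFN mapping to several PFNs, each tied to a Storage Element, plus owner/group/world permissions. The types are illustrative only, not a defined ARDA schema.

  #include <map>
  #include <string>
  #include <vector>

  struct PhysicalFileName {
    std::string storageElement;  // e.g. "se.example.org"
    std::string path;            // path to the local file on that SE
  };

  struct CatalogueEntry {
    std::string lfn;                     // logical name, e.g. "/vo/user/l/lucia/aod.root"
    std::vector<PhysicalFileName> pfns;  // one or more replicas
    std::string owner, group;
    unsigned permissions;                // owner/group/world bits, as in a Unix filesystem
  };

  // The catalogue itself is then just an association LFN -> entry; it does
  // not own the file contents, only the mapping and its metadata.
  using FileCatalogue = std::map<std::string, CatalogueEntry>;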
19 Job Provenance service
- The File Catalogue is extended to include information about running processes in the system (in analogy with the /proc directory on Linux systems) and to support virtual data services.
- Each job sent for execution gets a unique id and a corresponding /proc/id directory where it can register temporary files, standard input and output, as well as all job products. In a typical production scenario, the job products are renamed and registered in their final destination in the File Catalogue only after a separate process has verified the output (see the sketch below). The entries (LFNs) in the File Catalogue have an immutable unique file id attribute that is required to support long references (for instance in ROOT) and symbolic links.
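- A sketch of the provenance flow above: job products are first registered under /proc/<job id>, and only after validation are they renamed into their final catalogue location. The Catalogue interface and all paths are invented for illustration.

  #include <string>

  struct Catalogue {               // illustrative stand-in for the File Catalogue
    void Register(const std::string& /*lfn*/, const std::string& /*pfn*/) {}
    void Rename(const std::string& /*from*/, const std::string& /*to*/) {}
  };

  void RegisterJobProducts(Catalogue& cat, const std::string& jobId, bool validated) {
    // While the job runs, its products live in the job's /proc area.
    const std::string procDir = "/proc/" + jobId + "/";
    cat.Register(procDir + "stdout",   "srm://se.example.org/scratch/" + jobId + "/stdout");
    cat.Register(procDir + "aod.root", "srm://se.example.org/scratch/" + jobId + "/aod.root");

    // A separate verification step promotes validated products to their final
    // destination; the immutable file id behind each LFN is preserved.
    if (validated)
      cat.Rename(procDir + "aod.root", "/vo/prod/2003-09/aod.root");
  }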
20 Package Manager Service
- Allows dynamic installation of application software released by the VO (e.g. the experiment or a physics group).
- Each VO can provide the Packages and Commands that can subsequently be executed. Once the corresponding files with bundled executables and libraries are published in the File Catalogue and registered, the Package Manager will install them automatically as soon as a job becomes eligible to run on a site whose policy accepts these jobs.
- While installing a package in a shared package repository, the Package Manager will resolve its dependencies on other packages and, taking into account package versions, install them as well (see the sketch below). This means that old versions of packages can be safely removed from the shared repository; if they are needed again at some later point, they will be re-installed automatically by the system. This provides a convenient and automated way to distribute the experiment-specific software across the Grid and assures accountability in the long term.
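- A sketch of the dependency resolution described above: install a versioned package into the shared repository, recursively installing its dependencies first. The Dependencies and FetchAndUnpack functions are stubs standing in for catalogue lookups and downloads; nothing here is a defined ARDA interface.

  #include <set>
  #include <string>
  #include <vector>

  // Stub: would be answered by the package metadata in the File Catalogue.
  std::vector<std::string> Dependencies(const std::string& /*pkgAndVersion*/) {
    return {};
  }

  // Stub: would download the bundled executables/libraries published in the
  // File Catalogue and unpack them under sharedRepo/pkgAndVersion.
  void FetchAndUnpack(const std::string& /*pkgAndVersion*/,
                      const std::string& /*sharedRepo*/) {}

  void InstallPackage(const std::string& pkgAndVersion,
                      const std::string& sharedRepo,
                      std::set<std::string>& installed) {
    if (installed.count(pkgAndVersion)) return;     // already present
    for (const std::string& dep : Dependencies(pkgAndVersion))
      InstallPackage(dep, sharedRepo, installed);   // dependencies first
    FetchAndUnpack(pkgAndVersion, sharedRepo);
    installed.insert(pkgAndVersion);
  }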
21 Computing Element
- The Computing Element is a service representing a computing resource. Its interface should allow submission of a job to be executed on the underlying computing facility, access to job status information, as well as high-level job manipulation commands. The interface should also provide access to the dynamic status of the computing resource, such as its available capacity, load, and number of waiting and running jobs (an interface sketch follows below).
- This service should be available on a per-VO basis.
- Etc.
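- A sketch of a Computing Element interface with the operations listed above; the names and types are illustrative only, not a defined ARDA API.

  #include <string>

  struct ResourceStatus {
    int freeSlots;        // available capacity
    double load;
    int waitingJobs;
    int runningJobs;
  };

  class ComputingElement {
  public:
    virtual ~ComputingElement() = default;

    // Submit a job (described e.g. by a ClassAd-style JDL) for execution on
    // the underlying computing facility; returns a job identifier.
    virtual std::string Submit(const std::string& jobDescription) = 0;

    // Access to job status and high-level job manipulation.
    virtual std::string Status(const std::string& jobId) = 0;
    virtual void Cancel(const std::string& jobId) = 0;

    // Dynamic status of the computing resource itself.
    virtual ResourceStatus Resources() = 0;
  };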
22 Talking Points
- Horizontally structured system of services with a well-defined API and a database backend
- Can easily be extended with additional services; new implementations can be moved in, alternative approaches tested and commissioned
- Interface to LCG-1 infrastructure:
- VDT/EDG interface through CE, SE and the use of JDL, compatible with the existing infrastructure
- ARDA VO services can build on the emerging VO management infrastructure
- ARDA initially looked at file-based datasets, not object collections:
- talk with POOL about how to extend the file concept to a more generic collection concept
- investigate experiments' metadata/file catalog interaction
- VO system and site security:
- Jobs are executed on behalf of the VO, however users are fully traceable
- How do policies get implemented, e.g. analysis priorities, MoU contributions, etc.?
- Auditing and accounting system, priorities through special optimizers
- accounting of site contributions, which depends on what resources sites expose
- Database backend for the prototype:
- Address latency, stability and scalability issues up-front; good experience exists
- In a sense, the system is the database (possibly federated and distributed) that contains all there is to know about all jobs, files, metadata and algorithms of all users within a VO
- a set of OGSI grid services provides windows/views into the database, while the API provides the user access
- allows structuring into federated grids and dynamic workspaces
23 General ARDA Roadmap
- Emerging picture of waypoints on the ARDA roadmap:
- ARDA RTAG report:
- review of existing projects, common architecture component decomposition, re-factoring
- recommendations for a prototypical architecture, definition of prototypical functionality, and a development strategy
- Development of a prototype and first release
- Integration with and deployment on LCG-1 resources and services
- Re-engineering of prototypical ARDA services, as required
- OGSI gives a framework in which to run ARDA services:
- Addresses architecture
- Provides a framework for advanced interactions with the Grid
- Need to address issues of OGSI performance and scalability up-front
- Importance of modeling, plan for scaling up, engineering of the underlying services infrastructure
24 Roadmap to a GS Architecture for the LHC
- Transition to grid services explicitly addressed in several existing projects:
- Clarens and Caltech GAE, MonALISA:
- Based on web services for communication, Jini-based agent architecture
- Dirac:
- Based on intelligent agents working within batch environments
- AliEn:
- Based on web services and communication to a distributed database backend
- DIAL:
- OGSA interfaces
- Initial work on OGSA within LCG-GTA:
- GT3 prototyping
- Leverage experience gained in Grid M/W R&D projects
25 ARDA Roadmap for Prototype
- No evolutionary path from GT2-based grids
- Recommendation: build early a prototype based on re-factoring existing implementations
- The prototype provides the initial blueprint
- Do not aim for a full specification of all the interfaces
- 4-pronged approach:
- Re-factoring of AliEn, Dirac and possibly other services into ARDA
- Initial release with an OGSI::Lite/GT3 proxy, consolidation of API, release
- Implementation of agreed interfaces, testing, release
- GT3 modeling and testing, eventual quality assurance
- Interfacing to LCG-AA software like POOL, analysis shells like ROOT
- Also an opportunity for early interfacing to complementary projects
- Interfacing to experiment frameworks:
- metadata handlers, experiment-specific services
- Provide interaction points with the community
26 Experiments and LCG Involved in Prototyping
- The ARDA prototype would define the initial set of services and their interfaces. Timescale: spring 2004.
- Important to involve experiments and LCG at the right level:
- Initial modeling of GT3-based services
- Interface to major cross-experiment packages: POOL, ROOT, PROOF, others
- Program experiment frameworks against the ARDA API, integrate with experiment environments
- Expose services and UI/API to other LHC projects to allow synergies
- Spend appropriate effort to document, package, release, deploy
- After the prototype is delivered, improve on it:
- Scale up and re-engineer as needed: OGSI, databases, information services
- Deployment and interfaces to site and grid operations, VO management, etc.
- Build higher-level services and experiment-specific functionality
- Work on interactive analysis interfaces and new functionalities
27 Major Role for Middleware Engineering
- The ARDA roadmap is based on a well-factored prototype implementation that allows evolutionary development into a complete system evolving to the full LHC scale
- The ARDA prototype would be pretty lightweight
- Stability comes from basing the system on a global database to which services talk through a database proxy:
- people know how to do large databases -- a well-founded principle (see e.g. SAM for Run II), with many possible migration paths
- HEP-specific services, however based on generic OGSI-compliant services
- Expect the LCG/EGEE middleware effort to play a major role in evolving this foundation, its concepts and implementation:
- re-casting the (HEP-specific, event-data-analysis-oriented) services into more general services, from which the ARDA services would be derived
- addressing major issues like a solid OGSI foundation, robustness, resilience, fault recovery, operation and debugging
- Expect US middleware projects to be involved in this!
28 Conclusions
- ARDA is identifying a services-oriented architecture and an initial decomposition of the services required for distributed analysis
- Recognize a central role for a Grid API which provides a factory of user interfaces for experiment frameworks, applications, portals, etc.
- The ARDA Prototype would provide a distributed physics analysis environment for distributed experimental data:
- for experiment-framework-based analysis: COBRA, Athena, Gaudi, AliRoot, ...
- for ROOT-based analysis
- interfaces to other analysis packages like JAS, event displays like IGUANA, grid portals etc. can be implemented easily