Title: Distributed Analysis in the BaBar Experiment
1 Distributed Analysis in the BaBar Experiment
- Tim Adye
- Particle Physics Department
- Rutherford Appleton Laboratory
- University of Oxford
- 11th November 2002
2 Talk Plan
- Physics motivation
- The BaBar Experiment
- Distributed analysis and the Grid
3 Where did all the Antimatter Go?
- Nature treats matter and antimatter almost identically, but the Universe is made up of just matter
- How did this asymmetry arise?
- The Standard Model of Particle Physics allows for a small matter-antimatter asymmetry in the laws of physics
  - Seen in some K0-meson decays
  - e.g. a 0.3% asymmetry
- This CP Violation in the Standard Model is not large enough to explain the cosmological matter-antimatter asymmetry on its own
- Until recently, CP Violation had only been observed in K-decays
- To understand more, we need examples from other systems
4 What BaBar is looking for
- The Standard Model also predicts that we should be able to see the effect in B-meson decays
- B-mesons can decay in 100s of different modes
- In the decays
  - B0 → J/ψ K0 and
  - anti-B0 → J/ψ K0
- we look for differences in the time-dependent decay rate between B0 and anti-B0 (B̄0); the standard form of this asymmetry is sketched below.
[Plot: measured CP asymmetry as a function of decay time]
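The slide shows this only as a plot; for reference, the standard textbook form of the asymmetry for the CP eigenstate J/ψ K0_S (not written out on the slide, and assuming no CP violation in mixing or decay and a negligible width difference) is:

    % Time-dependent CP asymmetry for B0 -> J/psi K0_S (standard textbook form)
    A_{CP}(\Delta t)
      = \frac{\Gamma\big(\bar{B}^0(\Delta t)\to J/\psi K^0_S\big)
              - \Gamma\big(B^0(\Delta t)\to J/\psi K^0_S\big)}
             {\Gamma\big(\bar{B}^0(\Delta t)\to J/\psi K^0_S\big)
              + \Gamma\big(B^0(\Delta t)\to J/\psi K^0_S\big)}
      = \sin(2\beta)\,\sin(\Delta m_d\,\Delta t)

where Δt is the decay-time difference between the two B mesons, Δm_d is the B0 mixing frequency, and sin(2β) is the CP-violation parameter being measured.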
5 First Results: Summary of the summary
- First results from BaBar (and the rival experiment, Belle) confirm the Standard Model of Particle Physics
- The observed CP Violation is too small to explain the cosmological matter-antimatter asymmetry
- But there are many, many more decay modes to examine
- We are making more than 80 measurements with different B-meson, charm, and τ-lepton decays.
6 Experimental Challenge
- Individual decays of interest are only 1 in 10^4 to 10^6 B-meson decays
- We are looking for a subtle effect in rare (and often difficult to identify) decays, so we need to record the results of a very large number of events.
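As a rough illustration of why so many events are needed (the branching fraction and efficiency below are assumed for the example, not taken from the talk): a mode occurring once per 10^5 B decays, reconstructed with ~20% efficiency, yields only

    N_signal ≈ N_B × BF × ε ≈ 10^8 × 10^-5 × 0.2 ≈ 200 events

from the full 10^8 recorded decays, which already limits the statistical precision of any asymmetry measured in that mode.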
7 The BaBar Collaboration
9 Countries, 74 Institutions, 566 Physicists
8 PEP-II e+e- Ring at SLAC
Low Energy Ring (e+, 3.1 GeV)
Linear Accelerator
High Energy Ring (e-, 9.0 GeV)
BaBar
PEP-II ring circumference 2.2 km
9 The BaBar Detector
10^8 B0B̄0 decays recorded
26th May 1999: first events recorded by BaBar
10
- To effectively analyse this enormous dataset, we need large computing facilities: more than can be provided at SLAC alone
- Distributing the analysis to other sites raises many additional research questions
  - Computing facilities
  - Efficient data selection and processing
  - Data distribution
  - Running analysis jobs at many sites
- Most of this development either has benefited, or will benefit, from Grid technologies
11 Distributed computing infrastructure
1. Facilities
- Distributed model originally partly motivated by slow networks
- Now use fast networks to make full use of hardware (especially CPU and disk) at many sites
- Currently, specialisation at different sites concentrates expertise
  - e.g. RAL is the primary repository of analysis data in the ROOT format

Tier A: Lyon, Padua, RAL
Tier C: 20 Universities, 9 in the UK
12 RAL Tier A Disk and CPU
1. Facilities
13 RAL Tier A
1. Facilities
- RAL has now relieved SLAC of most of the analysis load
- The BaBar analysis environment at RAL tries to mimic SLAC's, so external users feel at home
- Grid job submission should greatly simplify this requirement
- Impressive take-up from UK and non-UK users
14 BaBar RAL Batch Users (running at least one non-trivial job each week)
1. Facilities
A total of 153 new BaBar users registered since
December
15 BaBar RAL Batch CPU Use
1. Facilities
16 Data Processing
2. Data Processing
- The full data sample (real and simulated data) in all formats is currently 700 TB.
- Fortunately, processed analysis data is only 20 TB.
  - Still too much to store at most smaller sites
- Many separate analyses looking at different particle decay modes
- Most analyses only require access to a sub-sample of the data
  - Typically 1-10% of the total
- Cannot afford for everyone to access all the data all the time
  - Would overload the CPU or disk servers
- Currently specify 104 standard selections (skims) with more efficient access
17 Strategies for Accessing Skims
2. Data Processing
- Store an event tag with each event to allow fast selection based on standard criteria
  - Still have to read past events that aren't selected
  - Cannot distribute selected sub-samples to Tier C sites
- Index files provide direct access to selected events in the full dataset
  - File, disk, and network buffering still leaves significant overhead
  - Data distribution possible, but complicated
  - Therefore only just starting to use this
- Copy some selected events into separate files
  - Fastest access and easy distribution, but uses more disk space: a critical trade-off
  - Currently this gives us a factor of 4 overhead in disk space
  - We will reduce this when index files are deployed
18 Physics Data Selection (metadata)
2. Data Processing
- Currently have about a million ROOT files in a deep directory tree
- Need a catalogue to facilitate data distribution and allow analysis datasets to be defined.
- SQL database
  - Locates the ROOT files associated with each dataset
  - File selection based on decay mode, beam energy, etc.
- Each site has its own database
  - Includes a copy of the SLAC database with local information (e.g. files on local disk, files to import, local tape backups)
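A minimal sketch of the kind of catalogue query this implies, using Python's sqlite3 and an invented schema; the table and column names are assumptions for illustration, not BaBar's actual database layout:

    import sqlite3

    # Invented schema: one row per ROOT file, carrying the metadata used for
    # dataset selection plus site-local information.
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE files (
            path          TEXT,     -- location in the deep directory tree
            decay_mode    TEXT,     -- e.g. 'JpsiKs'
            beam_energy   TEXT,     -- e.g. 'onpeak' / 'offpeak'
            on_local_disk INTEGER   -- site-specific flag (0/1)
        )""")
    db.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", [
        ("skims/JpsiKs/run1/f1.root", "JpsiKs", "onpeak", 1),
        ("skims/JpsiKs/run2/f2.root", "JpsiKs", "onpeak", 0),
    ])

    # Define an analysis dataset: all on-peak files for one decay mode.
    dataset = db.execute(
        "SELECT path FROM files WHERE decay_mode = ? AND beam_energy = ?",
        ("JpsiKs", "onpeak")).fetchall()

    # Files this site would still need to import for that dataset.
    to_import = db.execute(
        "SELECT path FROM files WHERE decay_mode = ? AND on_local_disk = 0",
        ("JpsiKs",)).fetchall()

    print(len(dataset), "files in dataset;", len(to_import), "still to import")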
19 Data Distribution
3. Data Distribution
- Tier A analysis sites currently take all the data
  - Requires large disks, fast networks, and specialised transfer tools
  - FTP does not make good use of fast wide-area networks
  - Data imports fully automated
- Tier C sites only take some decay modes
- We have developed a sophisticated scheme to import data to Tier A and C sites based on SQL database selections
  - Can involve skimming data files to extract events from a single decay mode; this is done automatically as an integral part of the import procedure
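A toy illustration of this "skim on import" idea, with invented event records and function names (the real transfer and skimming tools are BaBar-specific):

    def import_file(remote_events, want_mode):
        """Fetch a remote file and keep only events of one decay mode."""
        return [ev for ev in remote_events if ev["mode"] == want_mode]

    remote = [{"mode": "JpsiKs", "run": 1}, {"mode": "other", "run": 1}]
    local = import_file(remote, "JpsiKs")
    print(local)   # only the JpsiKs events end up on the Tier C disk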
20 Remote Job Submission: Why?
4. Job Submission
- The traditional model of distributed computing relies on people logging in to each computing centre and building and submitting their jobs from there.
- Each user has to have an account at each site and write or copy their analysis code to that facility
- Fine for one site, maybe two. Any more is a nightmare for site managers (user registration and support) and for users (setting everything up from scratch)
21 Remote Job Submission
4. Job Submission
- A better model would allow everyone to submit jobs to the different Tier A sites directly from their home university, or even a laptop
- Simplifies local analysis code development and debugging, while providing access to the full dataset and large CPU farms
- This is a classic Grid application
- It requires significant infrastructure
  - Authentication and authorisation
  - A standardised job submission environment
    - Grid software versions, batch submission interfaces
  - The program and configuration for each job have to be sent to the executing site and the results returned at the end
- We are just now starting to use this for real analysis jobs
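As a sketch of what such a submission could look like, the snippet below writes an EDG-style job description and submits it from the user's own machine. The JDL attribute names follow the EDG convention, but the executable, sandbox contents, CE address, and exact submit command are assumptions for this example and would need checking against the local Grid installation:

    import subprocess, textwrap

    # Sketch only: names below are illustrative, not BaBar's actual job setup.
    jdl = textwrap.dedent("""\
        Executable    = "runMyAnalysis.sh";
        Arguments     = "--dataset JpsiKs-onpeak";
        StdOutput     = "analysis.out";
        StdError      = "analysis.err";
        InputSandbox  = {"runMyAnalysis.sh", "myAnalysis.tcl"};
        OutputSandbox = {"analysis.out", "analysis.err", "ntuple.root"};
        Requirements  = other.GlueCEUniqueID == "tier-a.example.ac.uk:2119/jobmanager-pbs-babar";
    """)

    with open("analysis.jdl", "w") as f:
        f.write(jdl)    # the program and configuration travel with the job (InputSandbox)

    # Submit from the user's own machine; authentication uses a Grid proxy
    # certificate (grid-proxy-init) rather than a local account at each site,
    # and the results come back in the OutputSandbox.
    subprocess.run(["edg-job-submit", "analysis.jdl"], check=True)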
22 The Wider Grid
- We are already using many of the systems being developed for the European and US DataGrids
  - Globus, EDG job submission, CA, VO, RB, high-throughput FTP, SRB
- Investigating the use of many more
  - RLS, Spitfire, R-GMA, VOMS, ...
- We are collaborating with other experiments
  - BaBar is a member of EDG WP8 and PPDG (the European and US particle physics Grid application groups)
- We are providing some of the first Grid technology use-cases
23 Summary
- BaBar is using B decays to measure matter-antimatter asymmetries and perhaps explain why the universe is matter dominated.
- Without distributing the data and computing, we could not meet the computing requirements of this high-luminosity machine.
- Our initial ad-hoc architecture is evolving towards a more automated system, borrowing ideas, technologies, and resources from the Grid, and providing ideas and experience for it.