Title: Data Grid Deployment Projects

1. Data Grid Deployment Projects
Paul Avery, University of Florida (avery@phys.ufl.edu)
GLORIAD Launch, Beijing, China, January 12, 2004
2. LHC: Key Driver for Data Grids
- Complexity: millions of individual detector channels
- Scale: PetaOps (CPU), Petabytes (data)
- Distribution: global distribution of people and resources
CMS Collaboration: 2000 physicists, 159 institutes, 36 countries
3. Global Context: Data Grid Projects
- U.S. infrastructure projects
  - GriPhyN (NSF)
  - iVDGL (NSF)
  - Particle Physics Data Grid (DOE)
  - PACIs and TeraGrid (NSF)
  - DOE Science Grid (DOE)
  - NEESgrid (NSF)
  - NSF Middleware Initiative (NSF)
- EU and Asia: major projects
  - European Data Grid (EU)
  - LHC Computing Grid (CERN)
  - EGEE (EU)
  - DataTAG (EU)
  - EDG-related national projects
  - CrossGrid (EU)
  - GridLab (EU)
  - Japanese Grid projects
  - Korea Grid project
- Not exclusively HEP (LIGO, SDSS, ESA, biology, ...)
- But most driven/led by HEP (with CS)
- Many collaborative links between projects
4. U.S. GriPhyN and iVDGL Projects
- Both funded by NSF (ITR/CISE + Physics)
  - GriPhyN: $11.9M (NSF), 2000–2005
  - iVDGL: $14.0M (NSF), 2001–2006
- Basic composition (~120 people)
  - GriPhyN: 12 universities, SDSC, 3 labs
  - iVDGL: 20 universities, SDSC, 3 labs, foreign partners
  - Experiments: CMS, ATLAS, LIGO, SDSS/NVO
- Grid research/infrastructure vs. Grid deployment
  - GriPhyN: CS research, Virtual Data Toolkit (VDT) development
  - iVDGL: Grid laboratory deployment using VDT
  - 4 physics experiments provide frontier challenges
- Extensive student involvement
  - Undergrads, grads, postdocs participate at all levels
- Strong outreach component
5. U.S. Particle Physics Data Grid
DOE funded
- Funded 1999–2004 at $9.5M (DOE)
- Driven by HENP experiments: D0, BaBar, STAR, CMS, ATLAS
- Maintains practical orientation: Grid tools for experiments
6. International Virtual Data Grid Laboratory (Fall 2003)
- Partners
  - EU
  - Brazil
  - Korea
  - Japan?
7. USA PetaScale Virtual-Data Grids
[Architecture diagram: users (production teams, workgroups, single researchers) work through interactive user tools, layered over request execution management tools, request planning and scheduling tools, and virtual data tools; these are backed by resource management services, security and policy services, and other Grid services, all running on distributed resources (code, storage, CPUs, networks), transforms, and the raw data source. Targets: PetaOps, Petabytes, performance.]
8. EU DataGrid Project
9. Sep. 29, 2003 announcement
10. Current LCG Sites
11. Grid2003: An Operational Grid
- 27 sites (U.S., Korea)
- 2300–2800 CPUs
- 700–1100 concurrent jobs
- Running since October 2003
http://www.ivdgl.org/grid2003
12. Grid2003: A Necessary Step
- Learning how to cope with large scale
  - Interesting failure modes as scale increases
  - Enormous human burden, barely possible on the SC2003 timescale
  - Previous experience from Grid testbeds was critical
- Learning how to operate a Grid
  - Add sites, recover from errors, provide information, update software, test applications, ...
  - Need tools, services, procedures, documentation, organization
  - Need reliable, intelligent, skilled people
- Learning how to delegate responsibilities
  - Multiple levels: project, VO, service, site, application
  - Essential for future growth
- Grid2003 experience critical for building useful Grids
  - See "Grid2003 Project Lessons" for details
13. US-CMS Production on Grid2003
[Chart: production split between USCMS and non-USCMS resources]
14. Grid2003 Lessons (1)
- Building something draws people in
  - (Similar to a large HEP detector)
  - Cooperation
  - Willingness to invest time
  - Striving for excellence!
- Further Grid development requires significant deployments
  - US-CMS testbed: debugging Globus, Condor (early 2002)
  - US-ATLAS testbed: early development of Grid tools
- Powerful training mechanism
- Good starting point for new institutions
15. Grid2003 Lessons (2): Packaging
- Installation and configuration (VDT + Pacman)
  - Simplifies installation and configuration of Grid tools and applications
  - Hugely important for first testbeds in 2002
  - Major advances over 13 VDT releases
  - Great improvements expected in Pacman 3
- Packaging is a strategic issue!
  - More than a convenience: crucial to our future success
  - Packaging → uniformity + automation → lower barriers to scaling
- Automation is the next frontier
  - Reduce FTE overhead, communication traffic
  - Automate installation, configuration, testing, validation
  - Automate software updates
  - Remote installation, etc.
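The automation goals above (install, configure, test, validate, in order) can be sketched as a simple pipeline driver. This is a hypothetical illustration of the idea, not the actual Pacman or VDT tooling; the step names, the SiteReport structure, and run_pipeline are all assumptions made for the sketch.

```python
# Hypothetical sketch: run a site's setup steps in order and stop at the
# first failure, so a broken install is never "validated" on top of a bad
# configuration. None of these names come from VDT or Pacman.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SiteReport:
    site: str
    passed: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.failed


def run_pipeline(site: str, steps: Dict[str, Callable[[], bool]]) -> SiteReport:
    """Execute each step in insertion order; abort on the first failure
    because later steps depend on earlier ones."""
    report = SiteReport(site)
    for name, step in steps.items():
        if step():
            report.passed.append(name)
        else:
            report.failed.append(name)
            break
    return report


# Example: a site whose "test" step fails (e.g. a probe job timed out).
steps = {
    "install": lambda: True,
    "configure": lambda: True,
    "test": lambda: False,
    "validate": lambda: True,  # never reached
}
report = run_pipeline("site-a.example.edu", steps)
print(report.ok, report.passed, report.failed)
# → False ['install', 'configure'] ['test']
```

Running the same driver over every site yields exactly the kind of uniform, low-FTE-overhead reporting the slide argues for: one code path, many sites.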
16. Grid2003 and Beyond (1)
- Continuing commitment of Grid2003 stakeholders
  - Deploy functional demonstration Grids: Grid2004, Grid2005, ...
  - New release every 6–12 months, increasing functionality and scale
- Continuing commitment to Grid and related R&D
  - CS research, VDT improvements (GriPhyN, PPDG)
  - Security (PPDG)
  - Advanced monitoring (MonALISA/GEMS, MDS, ...)
  - Collaborative tools, e.g. VRVS, AG, ...
- Continuing development of new tools and services
  - Grid-enabled analysis
  - UltraLight: infrastructures of CPU + storage + optical networks
- Continuing development and exploitation of networks
  - National: HENP WG on Internet2, National Lambda Rail
  - International: SCIC, AMPATH, world data-transfer speed records
17. UltraLight
Unified infrastructure: computing, storage, networking
- 10 Gb/s network
- Caltech, UF, FIU, UM, MIT
- SLAC, FNAL, BNL
- International partners
- Cisco, Level(3), Internet2
18. Grid2003 and Beyond (2)
- Continuing commitment to international collaboration
  - Close coordination with LHC Computing Grid
  - New international partners (GLORIAD, ITER, Brazil, Korea, ...)
- Continuing commitment to multi-disciplinary activities
  - HEP, CS, ITER, LIGO, astronomy, biology, ...
- Continuing evolution of interactions with funding agencies
  - Partnership of DOE (labs) and NSF (universities)
- Continuing commitment to coordinated outreach
  - QuarkNet, GriPhyN, iVDGL, PPDG, CHEPREO, CMS, ATLAS
  - Jan. 29–30: Needs Assessment Workshop in Miami
  - Digital Divide efforts (Feb. 15–20 Rio workshop)
19. CHEPREO: Center for High Energy Physics Research and Educational Outreach
- Physics Learning Center (Miami)
- iVDGL Grid activities
- HEP research
- AMPATH network (S. America)
Funded September 2003: $4M (MPS, CISE, EHR, ENG)
20. Grid2003 and Open Science Grid
- http://www.opensciencegrid.org/
- U.S. Grid supporting many scientific disciplines
- Grid2003: a first step towards Open Science Grid?
- Funding mechanism: DOE/NSF
  - Laboratories (DOE) and universities (NSF)
- Kickoff meeting Jan. 12 in Chicago