Title: GriPhyN: Grid Physics Network and iVDGL: International Virtual Data Grid Laboratory
1. GriPhyN: Grid Physics Network and iVDGL: International Virtual Data Grid Laboratory
2. Collaboratory Basics
- Two NSF-funded Grid projects in HENP (high energy and nuclear physics) and computer science
- MPS and CISE have oversight
- GriPhyN and iVDGL are too closely related to discuss one without discussing the other
- One does CS research and test applications; the other builds an international-scale facility to carry out these tests and to address other goals as well
- Share vision, personnel and components
- These two collaboratories are part of a larger effort to develop the components and infrastructure for supporting data-intensive science
3. Some Science Drivers
- Computation is becoming an increasingly important tool of scientific discovery
- Computationally intense analyses
- Global collaborations
- Large datasets
- The increasing importance of computation in science is more pronounced in some fields
- Complex (e.g. climate modeling) and high-volume (HEP) simulations
- Detailed rendering (e.g. biomedical informatics)
- Data-intensive science (e.g. astronomy and physics)
- GriPhyN and iVDGL were founded to provide the models and software for the data management infrastructure for four large projects
4. SDSS / NVO
- SDSS / NVO are in full production
- Explore how the Grid can be used in astronomy
- What's the benefit?
- How to integrate?
- How can the Grid be used for future sky surveys?
- Data processing pipelines are complex
- Has made the most sophisticated use of the virtual data concept
5. LIGO
- Not in full production, but real data is being taken
- LIGO I Engineering Runs
- 35 TB since 1999 and growing
- LIGO I Science Runs
- 62 TB in two science runs; an additional run is planned that will generate 135 TB
- Eventual constant operation at 270 TB/year
- LIGO II Upgrade
- Eventual operation at 1-2 PB/year
- Need the distributed computing power of the Grid
- Need virtual data catalogs for efficient dissemination of data and management of workflow
6. CMS / ATLAS
- CMS and ATLAS are two experiments being developed for the Large Hadron Collider at CERN
- Two projects, two cultures, but
- Similar data challenges
- Similar geographic distribution
- Moving closer to common tools through the LCG (LHC Computing Grid)
- Petabytes of data per year (100 PB by 2012)
7. Function Types
- GriPhyN
- Distributed Research Center
- iVDGL
- Community Data System
8. GriPhyN
9. GriPhyN Funding
- Funded in 2000 through the NSF ITR program
- $11.9M + $1.6M matching
10. GriPhyN Project Team
- Led by U. Florida and U. Chicago
- PDs: Paul Avery (UF) and Ian Foster (UC)
- 22 Participant institutions
- 13 funded
- 9 unfunded
- Roughly 82 people involved
- 2/3 of activity computer science, 1/3 physics
11. Participant Institutions
- Funded Institutions
- U. Florida
- U. Chicago
- CalTech
- U. Wisconsin - Madison
- USC / ISI
- Indiana U.
- Johns Hopkins U.
- Texas A&M
- UT Brownsville
- UC Berkeley
- U Wisconsin Milwaukee
- SDSC
- Unfunded Institutions
- Argonne NL
- Fermi NAL
- Brookhaven NL
- UC San Diego
- U. Pennsylvania
- U. Illinois - Chicago
- Stanford
- Harvard
- Boston U.
- Lawrence Berkeley Lab
12. Technology
- GriPhyN's science drivers demand timely access to very large datasets, plus the computer cycles and information-management infrastructure needed to manipulate and transform those datasets in a meaningful way
- Data Grids are an approach to data management and resource sharing in environments where datasets are very large
- Policy-driven resource sharing, distributed storage, distributed computation, replication and provenance tracking (a replica catalog is sketched below)
- GriPhyN and iVDGL aim to enable petascale virtual data grids
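To make one of the Data Grid services above concrete, here is a minimal, purely illustrative Python sketch of a replica catalog that maps a logical file name to physical copies at different sites. The class and method names (ReplicaCatalog, register, lookup, nearest) and the example URLs are assumptions for illustration only, not the actual Globus or VDT interfaces.

```python
# Toy replica catalog in the spirit of the Data Grid services described above.
# All names here are hypothetical; this is not the real Globus/VDT API.

class ReplicaCatalog:
    """Maps a logical file name (LFN) to its physical replicas (PFNs)."""

    def __init__(self):
        self._replicas = {}   # LFN -> list of (site, physical URL)

    def register(self, lfn, site, pfn):
        self._replicas.setdefault(lfn, []).append((site, pfn))

    def lookup(self, lfn):
        return self._replicas.get(lfn, [])

    def nearest(self, lfn, preferred_sites):
        """Pick a replica at a preferred site if one exists, else any replica."""
        replicas = self.lookup(lfn)
        for site, pfn in replicas:
            if site in preferred_sites:
                return pfn
        return replicas[0][1] if replicas else None


rc = ReplicaCatalog()
rc.register("sdss/run94/field012.fits", "fnal", "gsiftp://fnal.example/data/field012.fits")
rc.register("sdss/run94/field012.fits", "uchicago", "gsiftp://uchicago.example/sdss/field012.fits")
print(rc.nearest("sdss/run94/field012.fits", preferred_sites={"uchicago"}))
```

The policy-driven pieces (who may read or replicate which files, and where) would sit on top of a mapping like this.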
13. Petascale Virtual Data Grids
14. GriPhyN DataGrid Contributions
- GriPhyN has three areas of contribution for achieving the DataGrid vision
- Contributing CS research
- Virtual Data as a unifying concept
- Planning, execution and performance monitoring
- Integrating these facilities in a transparent and high-productivity manner: making the Grid as easy to use as a workstation and the web
- Disseminating this research through the Virtual Data Toolkit and other tools
- Chimera
- Pegasus
- Integrating CS research results into GriPhyN science projects
- GriPhyN experiments serve as an exciting but demanding CS and HCI laboratory
15. Virtual Data Toolkit (VDT)
- A suite of tools developed by the CS team to support science on the Grid
- The uniting theme is virtual data
- Nearly all data in physics / astronomy is virtual data: derivations of a large, well-known data set
- It is possible to represent derived data as the set of instructions that created it
- There is no need to always copy a derived data set; it can be recomputed if you have the workflow (see the sketch below)
- Virtual data also has a number of beneficial side effects, e.g. data provenance, discovery, re-creation, workflow automation
- Many packages; a few are unique to GriPhyN, others are common across many Grid projects
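The following is a minimal sketch of the virtual data idea described above: a catalog records the instructions (a transformation plus its inputs) behind each derived dataset, materializes it on first request, and reuses it afterwards. All names (VirtualDataCatalog, Derivation, define, get) are hypothetical; this illustrates the concept, not the Chimera interface itself.

```python
# Conceptual sketch of virtual data: derived data represented as the
# instructions that create it, recomputed only when no copy exists.

class Derivation:
    """A derived dataset represented as the recipe that produces it."""

    def __init__(self, transformation, inputs):
        self.transformation = transformation   # callable: inputs -> data
        self.inputs = inputs

    def recompute(self):
        return self.transformation(*self.inputs)


class VirtualDataCatalog:
    def __init__(self):
        self.derivations = {}    # dataset name -> Derivation (the recipe)
        self.materialized = {}   # dataset name -> data, if already computed

    def define(self, name, transformation, inputs):
        self.derivations[name] = Derivation(transformation, inputs)

    def get(self, name):
        """Return the dataset, reusing a materialized copy or recomputing it."""
        if name not in self.materialized:
            self.materialized[name] = self.derivations[name].recompute()
        return self.materialized[name]


vdc = VirtualDataCatalog()
vdc.define("calibrated_frame",
           lambda raw, cal: [x - cal for x in raw],   # stand-in for a real pipeline step
           inputs=([10, 12, 11], 1))
print(vdc.get("calibrated_frame"))   # computed on first request, reused afterwards
```

Because the recipe is kept alongside the data, provenance, re-creation and discovery fall out of the same catalog.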
16. Motivations
"I've come across some interesting data, but I need to understand the nature of the corrections applied when it was constructed before I can trust it for my purposes."
"I've detected a calibration error in an instrument and want to know which derived data to recompute."
(Diagram: Data is created-by a Derivation; a Derivation is an execution-of a Transformation and records the Data consumed-by and generated-by that execution. Sketched in code after this slide.)
"I want to apply an astronomical analysis program to millions of objects. If the results already exist, I'll save weeks of computation."
"I want to search an astronomical database for galaxies with certain characteristics. If a program that performs this analysis exists, I won't have to write one from scratch."
Slide courtesy Ian Foster
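One of the motivations above, tracing which derived data must be recomputed after a calibration error, falls straight out of the diagrammed relationships. Below is a hedged Python sketch of that data model; the class and function names (Transformation, Derivation, downstream_of) and the example datasets are invented for illustration and are not GriPhyN's actual schema.

```python
# Sketch of the data / transformation / derivation relationships in the figure:
# a Derivation is an execution of a Transformation that consumes and generates data.

from dataclasses import dataclass
from typing import List

@dataclass
class Transformation:
    name: str                      # e.g. a calibration or analysis program

@dataclass
class Derivation:
    execution_of: Transformation   # "execution-of" edge in the figure
    consumed: List[str]            # data consumed-by this derivation
    generated: List[str]           # data generated-by (created-by) it

def downstream_of(bad_data: str, derivations: List[Derivation]) -> List[str]:
    """Data that must be recomputed if `bad_data` turns out to be miscalibrated."""
    tainted, changed = {bad_data}, True
    while changed:
        changed = False
        for d in derivations:
            if set(d.consumed) & tainted and not set(d.generated) <= tainted:
                tainted |= set(d.generated)
                changed = True
    tainted.discard(bad_data)
    return sorted(tainted)

calib = Derivation(Transformation("calibrate"), ["raw_frame"], ["calib_frame"])
stack = Derivation(Transformation("coadd"), ["calib_frame"], ["deep_image"])
print(downstream_of("raw_frame", [calib, stack]))   # ['calib_frame', 'deep_image']
```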
17. Chimera
- The Chimera Virtual Data System is one of the core tools of the GriPhyN Virtual Data Toolkit
- Virtual Data Catalog
- Represents transformation procedures and derived data
- Virtual Data Language Interpreter
- Translates user requests into Grid workflow
18. Pegasus
- Planning for Execution in Grids
- A tool for mapping complex workflows onto the Grid
- Converts an abstract Chimera workflow into a concrete workflow, which is sent to DAGMan for execution (see the sketch below)
- DAGMan is the Condor meta-scheduler
- Determines sites and data transfers
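Below is a very rough sketch of the abstract-to-concrete planning step described above, under assumed names (AbstractJob, ConcreteJob, plan): for each logical job it picks a site where the executable is installed and marks any input lacking a local replica for transfer. The real Pegasus planner is far more sophisticated and emits DAGMan submit files rather than Python objects.

```python
# Illustrative abstract -> concrete workflow mapping, not the Pegasus API.

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class AbstractJob:
    transformation: str        # logical program name from the Chimera workflow
    inputs: List[str]          # logical file names
    outputs: List[str]

@dataclass
class ConcreteJob:
    site: str
    executable: str
    transfers_in: List[str]    # data that must be staged to the chosen site
    outputs: List[str]

def plan(abstract_jobs: List[AbstractJob],
         executables: Dict[str, Dict[str, str]],   # transformation -> {site: path}
         replicas: Dict[str, List[str]]            # logical file -> sites holding it
         ) -> List[ConcreteJob]:
    concrete = []
    for job in abstract_jobs:
        # choose any site that has the executable installed
        site, path = next(iter(executables[job.transformation].items()))
        # any input without a replica at that site needs a transfer
        transfers = [f for f in job.inputs if site not in replicas.get(f, [])]
        concrete.append(ConcreteJob(site, path, transfers, job.outputs))
    return concrete

jobs = [AbstractJob("cluster_finder", ["sdss_field.fits"], ["clusters.dat"])]
print(plan(jobs,
           executables={"cluster_finder": {"fnal": "/opt/bin/cluster_finder"}},
           replicas={"sdss_field.fits": ["jhu"]}))
```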
19. Virtual Data Processing Tools
(Diagram: VDLx descriptions pass through the abstract planner to produce an XML DAG; the Pegasus concrete planner turns that into a Condor DAG for Grid execution, while a local shell planner can instead produce a shell DAG for local runs.)
20. Example: Sloan Galaxy Cluster Finder DAG
(Diagram: the workflow DAG from Sloan Data to a galaxy cluster size distribution.)
Jim Annis, Steve Kent, Vijay Sehkri, Fermilab; Michael Milligan, Yong Zhao, Chicago
21. Example: Sloan Galaxy Cluster DAG
(Diagram: the same DAG, from Sloan Data to a galaxy cluster size distribution.)
With Jim Annis and Steve Kent, FNAL
22. Resource Diagram
23. International Virtual Data Grid Laboratory (iVDGL)
24. Some Context
- There is much more to the DataGrid world than GriPhyN
- Broad problem space, with many cooperative projects
- U.S.
- Particle Physics Data Grid (PPDG)
- GriPhyN
- Europe
- DataTAG
- EU DataGrid
- International
- iVDGL
25. Background and Goals
- U.S. portion funded in 2001 as a Large ITR through NSF
- $13.7M + $2M matching
- International partners responsible for their own funding
- Aims of iVDGL
- Establish a Global Grid Laboratory
- Conduct DataGrid tests at scale
- Promote interoperability
- Promote testbeds for non-physics applications
26. Relationship to GriPhyN
- Significant overlap
- Common management, personnel overlap
- Roughly 80 people on each project, 120 total
- Tight technical coordination
- VDT
- Outreach
- Testbeds
- Common External Advisory Committee
- Different focus - domain challenges
- GriPhyN: 2/3 CS, 1/3 Physics (IT research)
- iVDGL: 1/3 CS, 2/3 Physics (testbed deployment and operation)
27. Project Composition
- CS Research
- U.S. iVDGL Institutions
- UK e-science programme
- DataTAG
- EU DataGrid
- Testbeds
- ATLAS / CMS
- LIGO
- National Virtual Observatory
- SDSS
28. iVDGL Institutions
- U Florida CMS
- Caltech CMS, LIGO
- UC San Diego CMS, CS
- Indiana U ATLAS, iGOC
- Boston U ATLAS
- U Wisconsin, Milwaukee LIGO
- Penn State LIGO
- Johns Hopkins SDSS, NVO
- U Chicago CS
- U Southern California CS
- U Wisconsin, Madison CS
- Salish Kootenai Outreach, LIGO
- Hampton U Outreach, ATLAS
- U Texas, Brownsville Outreach, LIGO
- Fermilab CMS, SDSS, NVO
- Brookhaven ATLAS
- Argonne Lab ATLAS, CS
(Slide legend grouping the sites: T2 / Software; CS support; T3 / Outreach; T1 / Labs (not funded).)
29. US-iVDGL Sites
- Partners?
- EU
- CERN
- Brazil
- Australia
- Korea
- Japan
30. Component Projects
- iVDGL contains several core projects
- iGOC
- International Grid Operations Center
- GLUE
- Grid Laboratory Uniform Environment
- WorldGrid 2002 international demo
- Grid3 2003 deployment effort
31. iGOC
- International Grid Operations Center
- iVDGL headquarters
- Analogous to a Network Operations Center
- Located at Indiana University
- Single point of contact for iVDGL operations
- Database of contact information
- Centralized information about storage, network and compute resources
- Directory for monitoring services at iVDGL sites
32. GLUE
- Grid Laboratory Uniform Environment
- A grid interface subset specification that permits applications to run on grids from VDT and EDG sources
- Effort to ensure interoperability across numerous physics grid projects
- GriPhyN, iVDGL, PPDG
- EU DataGrid, DataTAG, CrossGrid, etc.
- Interoperability effort focuses on
- Software
- Configuration
- Documentation
- Test suites
33. WorldGrid
- Effort at a worldwide DataGrid
- Easy to deploy and administer
- Middleware based on VDT
- Chimera development
- Scalability
- Demo at SC2002
- United DataTAG and iVDGL
34. Resource Diagram
35. iVDGL Management
36. Issues across projects
- Technical readiness
- Infrastructure readiness
- Collaboration readiness
- Common ground
- Coupling of tasks
- Incentives
37. Technical readiness
- Very high
- Physics and CS are both very high on the adoption curve, generally
- Long history of infrastructure development to support national and global experiments
38. Infrastructure readiness
- Also quite high
- Not all of the pieces are in place to meet demand
- The expertise exists within these communities to build and maintain the necessary infrastructure
- Community is inventing the infrastructure
- Real understanding in the project that interoperability and standards are part of infrastructure
39. Collaboration readiness
- Again, quite high
- Physicists have a long history of large-scale collaboration
- CS collaborations built on old relationships with long-time collaborators
40. Common ground
- Perhaps a bit too high
- What you can do with a physics background:
- Win the ACM Turing Award
- Co-invent the World Wide Web
- Direct the development of the Abilene backbone
- Because the application community has a strong understanding of the required work and its technical aspects, there is some friction over how the work is divided
- History of physicists building computational tools, e.g. ROOT
41. Coupling of tasks
- Tasks decompose into subtasks that are somewhat tightly coupled
- Locate tightly coupled tasks at individual sites
42. Incentives
- Both groups are well motivated, but for different reasons
- CS is engaging in extremely cutting-edge research across a large range of activities
- Funded for deployment as well as development
- Physics is structurally committed to global collaborations
43. Some successes
- Lessons in infrastructure development
- Outreach and engagement
- Community buy-in / investment
- Achieving the CS research goals for Virtual Data and Grid execution
44. Infrastructure Dev
- Looking at the history of the Grid (electrical, not computational)
- Long phases
- Invention
- Initial production use
- Adaptation
- Standardization / regulation
- Geographically bounded dominant design
- e.g. 220 V vs. 110 V
45. Infrastructure Dev
- We don't see this with GriPhyN / iVDGL
- Projects concurrent, not consecutive
- Pipeline approach to phases of infrastructure development
- Real efforts at cooperation with other DataGrid communities
- Why?
- Deep understanding at high levels of the project that building it alone is not enough
- Directive and funding from NSF to do deployment
46. Outreach
- The GriPhyN / iVDGL community is extremely active in outreach to other projects and communities
- Evangelizing virtual data
- Distributed tools
- This is a huge win for building CI that others can use
47. Community buy-in
- Together, these projects are funded at nearly $30M over 6 years
- This does not represent the total investment that was needed to make this work
- Leveraged FTEs
- Unfunded testbed sites
- International partners
- Lots of collaboration with PPDG, some starting with the Alliance Expeditions, etc.
- This kind of community commitment is necessary for a project of this size to succeed
48. Challenges
- Staying relevant
- Building infrastructure with term-limited funding
49. Staying Relevant (1)
- The application communities are fast-paced, high-powered groups of people
- Real danger of those communities developing tools that satisfice while they wait for the tools that are optimal and fit into a greater cyberinfrastructure
- Each experiment ideally wants tools perfectly tailored to its needs
- Maintaining user engagement and meeting the needs of each community is critical, but difficult
50. Staying Relevant (2)
- In addition to staying relevant to the experiments, GriPhyN must also be relevant to the greater scientific community
- To CS researchers
- To similarly data-intensive projects
- Easy-to-understand code, concepts, APIs, etc.
- How do you accommodate both a focused client community and the broader scientific community?
- A common challenge across many CI initiatives
51. Limited Term Investment
- These projects are both funded under the NSF ITR mechanism
- 5-year limit
- Would you buy your telephone service from a company that was going to shut down after 5 years?
- Challenge to find a sustainable support mechanism for CI