Title: The Big Questions
1 The Big Questions
Life and death
Future of the planet
Nature of the universe
Consciousness
2 How Do We Answer Them?
Timeline: <0, 1700, 1950, 1990
3 The Same is True of Smaller Questions
- Designing new chemical catalysts
- Selling advertising
- Creating entertainment
- Finding parking
4 Information Technology and Science
- "Paul Erdős claimed that a mathematician is a
machine for turning coffee into theorems. The
scientist is arguably a machine for turning data
into insight."
- Service-Oriented Science, I. Foster, Science,
308, p. 814.
5 What are the Products of Science?
- Papers
- We learned this, and here's how.
- Data and Datasets
- We collected this data. Download it, write an
analysis program, and see what you can learn from
it.
- Web Portals
- We constructed this scientific model. Use our
data or bring your own, supply some parameters,
and see how it behaves.
- Requires manual operation.
- Web Services
- Here's our climate model. Integrate it with your
models for ocean currents/weather/crop
forecasts and see what happens.
- Here's our indexed data from the latest
experiment run. Run your filters against it and
see if you can find anything interesting.
- Here's our genome analysis engine. Upload your
proteins and see what they will do in a cell.
Increasing degrees of collaboration
6 Grid: An Enabler of eScience
The dubious electrical power grid analogy
- Must we buy (or travel to) a power source?
Or can we ship power to where we want to work?
Enable on-demand access to, and integration
of, diverse resources and services, regardless of
location
7 First-Generation Grids
Focus on aggregation of many resources for
massively (data-)parallel applications
EGEE
8 Second-Generation Grids
- Empower many more users by enabling on-demand
access to services
- Science gateways (TeraGrid)
- Service-oriented science
- Or, Science 2.0
Service-Oriented Science, Science, 2005
9 Web 2.0
- Software as services
- Data- and computation-rich network services
- Services as platforms
- Easy composition of services to create new
capabilities (mashups) that themselves may be
made accessible as new services
- Enabled by massive infrastructure buildout
- Google projected to spend $1.5B on computers,
networks, and real estate in 2006
- Many others are spending substantially
- Paid for by advertising
Declan Butler, Nature
10 Automating Science
- Human access to data is nice
- Automated access by software tools is
revolutionary
- "In the time that a human user takes to locate
one useful piece of information within a Web
site, a program may access and integrate data
from many sources and identify relationships that
a human might never discover unaided."
- Foster
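As a minimal sketch of what "access and integrate data from many sources" can mean in practice, the Python below joins two small catalogs on a shared object identifier and then filters for objects that stand out in both. Everything here is an assumption made for illustration: the catalog contents, the field names (`id`, `mag`, `flux`), and the helper functions `integrate` and `bright_in_both` are hypothetical, and a real tool would retrieve the records from network services rather than from in-memory lists.

```python
# Hypothetical in-memory stand-ins for two remote data sources.
# A real tool would retrieve these records from network services.
optical_catalog = [
    {"id": "obj-1", "mag": 17.2},
    {"id": "obj-2", "mag": 21.5},
    {"id": "obj-3", "mag": 16.8},
]
xray_catalog = [
    {"id": "obj-1", "flux": 3.1},
    {"id": "obj-3", "flux": 0.2},
    {"id": "obj-4", "flux": 5.0},
]


def integrate(optical, xray):
    """Join the two catalogs on 'id', keeping objects present in both."""
    xray_by_id = {row["id"]: row for row in xray}
    merged = []
    for row in optical:
        match = xray_by_id.get(row["id"])
        if match is not None:
            merged.append(
                {"id": row["id"], "mag": row["mag"], "flux": match["flux"]}
            )
    return merged


def bright_in_both(merged, max_mag, min_flux):
    """Return ids of objects that pass both cuts.

    Smaller magnitude means brighter, so the optical cut is an upper bound.
    """
    return [
        r["id"] for r in merged
        if r["mag"] <= max_mag and r["flux"] >= min_flux
    ]


merged = integrate(optical_catalog, xray_catalog)
candidates = bright_in_both(merged, max_mag=18.0, min_flux=1.0)
print(candidates)
```

The join is keyed on a dictionary so that each source is scanned once; the same pattern extends to more sources by merging one catalog at a time.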
11 Science 2.0: E.g., Virtual Observatories
Gateway
Data Archives
Figure: S. G. Djorgovski
12 Service-Oriented Science
- People create services (data or functions)
- which I discover (and decide whether to use)
- and compose to create a new function ...
- and then publish as a new service.
- I find someone else to host services, so I
don't have to become an expert in operating
services and computers!
- I hope that this someone else can manage
security, reliability, scalability, ...
Service-Oriented Science, Science, 2005
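The create, discover, compose, publish cycle on this slide can be sketched in a few lines of Python. This is an in-process analogy only, not a Web services implementation: the `ServiceRegistry` class, the `compose` helper, and the service names are all hypothetical, and hosting concerns such as security, reliability, and scalability are deliberately left out.

```python
class ServiceRegistry:
    """Toy in-process registry standing in for a service directory."""

    def __init__(self):
        self._services = {}  # name -> (callable, description)

    def publish(self, name, func, description):
        """Make a function available to others under a name."""
        self._services[name] = (func, description)

    def discover(self, keyword):
        """Return sorted names of services whose description mentions keyword."""
        key = keyword.lower()
        return sorted(
            name for name, (_, desc) in self._services.items()
            if key in desc.lower()
        )

    def get(self, name):
        """Look up the callable behind a published service."""
        return self._services[name][0]


def compose(*funcs):
    """Chain services so each one's output feeds the next one's input."""
    def pipeline(value):
        for func in funcs:
            value = func(value)
        return value
    return pipeline


registry = ServiceRegistry()

# People create services (data or functions) ...
registry.publish(
    "filter_positive",
    lambda xs: [x for x in xs if x > 0],
    "filter: keep positive data values",
)
registry.publish(
    "mean",
    lambda xs: sum(xs) / len(xs),
    "analysis: average of data values",
)

# ... which I discover, compose into a new function ...
found = registry.discover("filter")
mean_of_positive = compose(registry.get("filter_positive"), registry.get("mean"))

# ... and then publish as a new service.
registry.publish(
    "mean_of_positive",
    mean_of_positive,
    "analysis: average of positive data values",
)
print(registry.get("mean_of_positive")([-1, 2, 4]))
```

The composed function is published through the same `publish` call as the originals, so later users cannot tell (and need not care) whether a service is primitive or built from others.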
13 Are Scientists Really Developing Web Services?
14 Cancer Bioinformatics Grid
- Common system architecture (caGrid) provides the
service interface plumbing and the service
hosting capability
- Web services
- Community participants supply useful services
- Data access, analysis, modeling, filtering,
authoring, etc.
- https://cabig.nci.nih.gov/
15 The Introduce Authoring Tool
- Define service
- Create skeleton
- Discover types
- Add operations
- Configure security
- Modify service
Generates GT4-compatible Web Services
Introduce: Hastings, Saltz, et al., Ohio State
University
16 The Importance of Hosting and Management
Tell me about this star
Tell me about these 20K stars
Support 1000s of users
E.g., Sloan Digital Sky Survey, 10 TB; others
much bigger
17 SkyServer Sessions (Thanks to Alex Szalay)
18 Hosting and Management: Application Hosting Services
Diagram labels: Application providers, Application Prep Tool(s),
Application deployment, Provisioning, Application client, Users,
Admins, Resource Provider (x2), Appln Code (x3), Hosting Service,
AHS management, Authorization, Persistence, Policy management, PDP
19 Who Will Host Your Services?
- Your institution (campus resources)
- (Inter)national systems
- TeraGrid, Open Science Grid, UK Natl Grid
Service, ChinaGrid, NaukaGrid, etc.
- Science domain systems
- caBIG, NEES, Earth System Grid, Orion, LEAD,
NEON, LHC Computing Grid, etc.
- Commercial systems
- Amazon, Google, etc.
20 Examples of Production Scientific Grids
- APAC (Australia)
- China Grid
- China National Grid
- D-Grid (Germany)
- EGEE
- NAREGI (Japan)
- Open Science Grid
- Taiwan Grid
- TeraGrid
- ThaiGrid
- UK Natl Grid Service
21 Application-Infrastructure Gap
- Dynamic and/or Distributed Applications
22 Bridging the Application-Resource Gap
User Application
Database
Specialized resource
Computers
Storage
23 Grid Infrastructure
- Distributed management
- Of physical resources
- Of software services
- Of communities and their policies
- Unified treatment
- Build on Web services framework
- Use WS-RF, WS-Notification (or WS-Transfer/Man)
to represent/access state
- Common management abstractions and interfaces
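The idea of representing and accessing state behind a common interface, with interested parties notified when it changes, can be illustrated with a plain-Python analogy. This sketch is not WS-RF or WS-Notification and uses none of their message formats; the `StatefulResource` class, its methods, and the `job-42` resource are hypothetical names chosen only to show the pattern of named state properties plus change subscriptions.

```python
class StatefulResource:
    """Toy resource exposing named state properties and change notifications."""

    def __init__(self, resource_id, **properties):
        self.resource_id = resource_id
        self._properties = dict(properties)
        self._subscribers = []

    def get_property(self, name):
        """Read one piece of the resource's state."""
        return self._properties[name]

    def subscribe(self, callback):
        """Register callback(resource_id, name, old_value, new_value)."""
        self._subscribers.append(callback)

    def set_property(self, name, value):
        """Update state and notify subscribers only if the value changed."""
        old = self._properties.get(name)
        self._properties[name] = value
        if old != value:
            for callback in self._subscribers:
                callback(self.resource_id, name, old, value)


events = []
job = StatefulResource("job-42", status="pending")
job.subscribe(lambda rid, name, old, new: events.append((rid, name, old, new)))

job.set_property("status", "running")
job.set_property("status", "running")  # unchanged value: no notification
print(events)
```

Because physical resources, software services, and policies can all be modeled as resources with properties, one management client written against this interface can treat them uniformly, which is the point of the "unified treatment" bullet.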