Title: Emerging Technologies: Grid Computing
1. Emerging Technologies: Grid Computing
- Jon B. Weissman (jon_at_cs.umn.edu)
- Department of Computer Science, University of Minnesota
- http://dcsg.cs.umn.edu
2. Outline
- What is a Grid?
- What is it good for?
- Grid Evolution and Models
- Grid Initiatives at the DTC and U Minnesota
3. What is a Grid?
- Computational Grids
- ensemble of geographically-dispersed resources
- seamless, transparent
- Analogy to Power Grids
- cheap, ubiquitous, consistent
- computational grids deliver computing and data, not power
- Core Grid Features and Challenges
- single sign-on
- dynamic and shared
- highly heterogeneous
- multiple administrative domains
- sheer scale
4. What is it good for?
- Grid is an Opportunity
- High performance computing
- Distributed Supercomputing
- aggregate resources for large problems to reduce runtime
- e.g. physical process simulation, climate modeling
- High Throughput Computing
- throughput: many jobs or tasks per unit time
- exploit idle resources to increase throughput
- e.g. simulate 10 parameters each taking on 10 values: 10^10 tasks!
- Resource sharing
- Exploit idle resources
- On-demand computing: near real-time remote access
- short-term access of remote capability (CPUs, software, etc.)
- driven by cost-performance and increased functionality
- e.g. one-time access to a computer-enhanced MRI machine
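The high-throughput example above (10 parameters, 10 values each) can be checked with a short sketch. It assumes, purely for illustration, that every parameter takes the same number of values and that each combination is one independent task; the function names are hypothetical.

```python
from itertools import islice, product


def sweep_size(num_params: int, values_per_param: int) -> int:
    """Number of independent tasks in a full parameter sweep."""
    return values_per_param ** num_params


def sweep_tasks(num_params: int, values_per_param: int):
    """Lazily enumerate every parameter combination (one tuple per task)."""
    return product(range(values_per_param), repeat=num_params)


# 10 parameters with 10 values each gives 10**10 tasks, as on the slide.
total = sweep_size(num_params=10, values_per_param=10)
print(total)

# The tasks are independent, so they could be handed to idle machines
# one at a time; here we only peek at the first three combinations.
first_three = list(islice(sweep_tasks(10, 10), 3))
print(first_three)
```

Because the generator is lazy, the full set of combinations is never held in memory; a scheduler could pull tasks from it as machines become idle.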
5. Grids are Evolving
- Grid Evolution
- 1st Generation: enabling technology, "make it run"
- Globus and Legion toolkits
- focus on HPC; resources were computers and data stores
- Grid is visible
- 2nd Generation: problem-solving and commercialization, "make it familiar"
- Moving out of academic labs into govt labs
- resources include scientific instruments
- Grid is becoming invisible
- 3rd Generation: problem-solving, "make it usable and general-purpose"
- Moving into the Enterprise
- focus on integration/standards; resources include software, services, and people
- Grid will be invisible
6. 1st Gen Grid Infrastructure
- Legion/Avaki: Everything is an object
- CPU: host object, Data store: vault objects, ...
- Globus: Bag of functions
- Job submission, Remote I/O, ...
[Figure: layered stack, top to bottom: Applications, Middleware, Core Globus services, Local OS]
7. 2nd Gen (today) Grid Paradigms
- Grid Models
- Top-down: NASA IPG, DOE PPDG
- expensive resources, few resource owners
- Bottom-up: SETI@home, Genome@home, ...
- cheaper donated resources (P2P), many resource owners
- users donate for participation in new technology or other incentives
- Grid Function Diversity
- Data Grids, Compute Grids, Physics Grids, etc.
- NEESgrid, PPDG, ...
- Grid Standards
- Global Grid Forum: 500-1000 attendees, 4 times a year
- 25 corporate sponsors: IBM, HP, Sun, Microsoft, ...
8. 3rd Gen: Virtual Organizations and Grid Services
- A Virtual Organization is a logically centralized, physically distributed community that pursues common goals and objectives
- multi-institutional
- no central authority
- sharing is conditional: issues of trust, policy, negotiation, payment, ...
- dynamic: requires new capabilities synthesized from existing services
- multiple qualities of service
9. VO Examples (Foster 2002)
- Civil engineers collaborate to design, execute, and analyze shake table experiments
- A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
- 1,000 physicists worldwide pool resources for analyses of petabytes of data
- Climate scientists visualize, annotate, and analyze terabyte simulation datasets
- An emergency response team couples real-time data, weather model, and population data
- A multidisciplinary analysis in aerospace couples code and data in four companies
10. Grid Services
- Construct VOs Using Grid Services
- Services Encapsulate
- Computations, resources, information sources, ...
- Everything is a service
- Web Service Standard
- Protocols for service and interface discovery
- Web services are static
- Grid Service is a Dynamic Web Service
- Grid service has a negotiated lifetime
- Open Grid Services Architecture (OGSA)
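What a "negotiated lifetime" could look like is sketched below. This is an illustration only, not the OGSA interface: the class, its methods, and the 60-second cap are hypothetical, and the clock is passed in explicitly so the example is deterministic.

```python
class LeasedService:
    """Hypothetical service instance that lives only until its lease expires."""

    def __init__(self, name: str, now: float, requested_lifetime: float,
                 max_lifetime: float = 60.0):
        self.name = name
        self.max_lifetime = max_lifetime
        # Negotiation: the provider may grant less than the client asked for.
        self.expires_at = now + min(requested_lifetime, max_lifetime)

    def is_alive(self, now: float) -> bool:
        return now < self.expires_at

    def extend(self, now: float, extra: float) -> float:
        """Client asks for more time; returns the newly granted expiry."""
        if not self.is_alive(now):
            raise RuntimeError(f"{self.name} has already expired")
        # Never let the lease reach further than max_lifetime from now.
        self.expires_at = min(self.expires_at + extra, now + self.max_lifetime)
        return self.expires_at


svc = LeasedService("GenCompare-1", now=0.0, requested_lifetime=100.0)
granted = svc.expires_at                       # request of 100 s capped at 60 s
alive_at_30 = svc.is_alive(30.0)
new_expiry = svc.extend(now=30.0, extra=45.0)  # capped at 30 + 60 = 90
alive_at_95 = svc.is_alive(95.0)
print(granted, alive_at_30, new_expiry, alive_at_95)
```

The point of the design is that a service nobody renews simply expires, so a provider does not have to track down abandoned instances; this contrasts with a static Web service that exists until someone removes it.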
11. Grid Services: OGSA
- OGSA provides lifetime management and introspection for Grid Services
- OGSA provides several standard services including
- factory service that creates other service instances
- registry service that registers services for lookup
- myVO will be a collection of interacting Grid and non-Grid Services
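The factory and registry roles named above can be sketched in a few lines. The classes and handle format here are hypothetical and much simpler than real OGSA services; as a simplifying assumption, the factory registers each instance it creates so that it can be looked up later.

```python
from itertools import count


class Registry:
    """Hypothetical registry: maps a service type to registered handles."""

    def __init__(self):
        self._entries = {}

    def register(self, service_type: str, handle: str) -> None:
        self._entries.setdefault(service_type, []).append(handle)

    def lookup(self, service_type: str) -> list:
        # Return a copy so callers cannot mutate the registry's state.
        return list(self._entries.get(service_type, []))


class Factory:
    """Hypothetical factory: creates named instances of one service type."""

    def __init__(self, service_type: str, registry: Registry):
        self.service_type = service_type
        self._registry = registry
        self._ids = count(1)

    def create(self) -> str:
        handle = f"{self.service_type}-{next(self._ids)}"
        self._registry.register(self.service_type, handle)
        return handle


registry = Registry()
factory = Factory("GenCompare", registry)
first = factory.create()
second = factory.create()
found = registry.lookup("GenCompare")
print(first, second, found)
```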
12. Example: Genome Comparison
- A GenCompare Grid service has two interfaces
- GenCompare: comparison algorithm (SW, BLAST, ...)
- Compare (source, target) -> score
- GridService: lifetime, introspection, ...
- Query_Comparison() -> BLAST
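A service exposing both interfaces might be structured as below. This is a sketch under stated assumptions: the Python class and method names are hypothetical renderings of Compare and Query_Comparison, and the scoring is a toy position-by-position identity count, not Smith-Waterman or BLAST.

```python
from abc import ABC, abstractmethod


class GridService(ABC):
    """Introspection side: lets a client ask what the service runs."""

    @abstractmethod
    def query_comparison(self) -> str: ...


class GenCompare(ABC):
    """Application side: compare two sequences and return a score."""

    @abstractmethod
    def compare(self, source: str, target: str) -> int: ...


class IdentityCompareService(GridService, GenCompare):
    """Toy stand-in that counts positions where the two sequences agree.

    This is NOT Smith-Waterman or BLAST; a real service would plug one in
    and report its name from query_comparison().
    """

    def query_comparison(self) -> str:
        return "identity"

    def compare(self, source: str, target: str) -> int:
        return sum(a == b for a, b in zip(source, target))


service = IdentityCompareService()
algorithm = service.query_comparison()
score = service.compare("GATTACA", "GACTATA")
print(algorithm, score)
```

Keeping the two interfaces separate means a client can use introspection to learn which algorithm a service offers before it commits to calling the comparison itself.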
13. VO Scenario
[Figure: components shown are the User Application, Community Registry, Compute Service Provider, Storage Service Provider, GenCompare Factory, Target Factory, and GeneDB Services for GeneDB 1 ... GeneDB n]
- User request: "I want to compare my source sequence library against all known target sets"
14. VO Scenario
[Figure: same components as slide 13; message shown: "Find me a genome comparison service, and a target service"]
15. VO Scenario
[Figure: same components as slide 13; message shown: "Handle for GenCompare and Target factories"]
16. VO Scenario
[Figure: same components as slide 13; messages shown: "Create a Genome Compare service with initial lifetime x" and "Create a Target service with initial lifetime x"]
17. VO Scenario
[Figure: Community Registry no longer shown; two Miner instances now appear alongside the GenCompare Factory, Target Factory, GeneDB Services, service providers, and User Application]
18. VO Scenario
[Figure: same components as slide 17; message shown: "Compare MySourceDB"]
19. VO Scenario
[Figure: same components as slide 17; messages shown: "Compare MySourceDB" and "pull source lib"]
20. VO Scenario
[Figure: same components as slide 17; messages shown: "Compare MySourceDB", "pull target lib", and "get next target"]
21. VO Scenario
[Figure: same components as slide 17; messages shown: "Compare MySourceDB", "pull target lib", and "get next target"]
22. VO Scenario
[Figure: same components as slide 17; message shown: "comparison score"]
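The sequence of messages in the VO scenario slides (find factories, create services, compare, pull targets, return a score) can be walked through in one small sketch. Everything here is an assumption made for illustration: the registry is a plain dictionary, the services are local Python objects rather than remote ones, lifetimes are omitted, the sequence data is made up, and the scoring is a toy identity count rather than BLAST or Smith-Waterman.

```python
def identity_score(source: str, target: str) -> int:
    """Toy comparison: count positions where the sequences agree."""
    return sum(a == b for a, b in zip(source, target))


class TargetService:
    """Hypothetical service that hands out target sequences one at a time."""

    def __init__(self, targets: dict):
        self._items = iter(targets.items())

    def get_next_target(self):
        return next(self._items, None)  # None signals "no more targets"


class GenCompareService:
    """Hypothetical comparison service created for one user request."""

    def compare(self, source_lib: list, target_service: TargetService) -> dict:
        scores = {}
        while (item := target_service.get_next_target()) is not None:
            name, target = item
            # Best score of any source sequence against this target.
            scores[name] = max(identity_score(s, target) for s in source_lib)
        return scores


# "Find me a genome comparison service, and a target service":
# the registry answers with handles for the two factories.
registry = {
    "GenCompare": GenCompareService,
    "Target": lambda: TargetService({"geneA": "GATTACA", "geneB": "CATTAGA"}),
}
compare_factory, target_factory = registry["GenCompare"], registry["Target"]

# "Create a Genome Compare service" / "Create a Target service".
comparer, targets = compare_factory(), target_factory()

# "Compare MySourceDB": pull the source library, iterate over the targets
# with "get next target", and return the comparison scores.
my_source_db = ["GATTACA", "TTTTTTT"]
scores = comparer.compare(my_source_db, targets)
print(scores)
```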
23. Grid Initiatives at the DTC
- Research theme: make the Grid invisible
- Four points in Grid space
- basic research (NSF)
- scheduling and resource management
- infrastructure (NSF, DOE)
- community services
- programming (AHPCRC)
- component-based toolkit for a Grid API
- live Grids (DTC, NSF)
- ADCS student lab
24. Infrastructure
- Community Services Project
- the Grid should be service-oriented a la OGSA
- code -> service -> Grid service
- designing infrastructure for high-end services
- adaptive, scalable, resilient, self-managing, consistent, high performance
- reusable libraries and software that can be used by service-providers
- service provider maintains, tunes, and upgrades service automatically
- user need not have high-end resources
25. Mixture of Experts
[Figure: Service Manager and Request Manager with an adaptive code library; Request in, Result out]
26. Grid-Enabled Network Services
[Figure: Network Service Front-End with sites from the Home Site through Site N, each showing an SM and one or more RMs; Request in, Result out]
27. Gene Sequence Service Manager
[Figure: Request Manager, beowulf.cs.umn.edu, and Gene Sequence Libraries; Request in, Result out]
28. Experimental Results
- Dynamic selection of performance predictors and scheduling policies
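One plausible reading of "dynamic selection of performance predictors" is sketched below: keep several runtime predictors, charge each one its error after every completed request, and use whichever has the lowest accumulated error. This is an assumption made for illustration and not necessarily the method evaluated in the slides; the class, the two cost models, and the workload are all hypothetical.

```python
class PredictorPool:
    """Pick, per request, the runtime predictor with the lowest past error."""

    def __init__(self, predictors: dict):
        self.predictors = predictors
        self.total_error = {name: 0.0 for name in predictors}

    def best(self) -> str:
        # Ties are broken by the insertion order of the predictors dict.
        return min(self.total_error, key=self.total_error.get)

    def predict(self, size: float) -> float:
        return self.predictors[self.best()](size)

    def observe(self, size: float, actual: float) -> None:
        """After a request finishes, charge every predictor its error."""
        for name, fn in self.predictors.items():
            self.total_error[name] += abs(fn(size) - actual)


pool = PredictorPool({
    "linear": lambda n: 2.0 * n,          # hypothetical cost models
    "quadratic": lambda n: 0.5 * n * n,
})
initial_choice = pool.best()
for n in (2.0, 4.0, 6.0):
    pool.observe(n, actual=0.5 * n * n)   # the workload is really quadratic
later_choice = pool.best()
prediction = pool.predict(10.0)
print(initial_choice, later_choice, prediction)
```

The same bookkeeping could in principle be applied to scheduling policies by recording an observed cost per policy instead of a prediction error.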
29. Grid Programming Toolkit
- How to Program Grid Applications?
- many Grid applications are not just a single component or service but a web of interacting components a la OGSA
- Component Model Toolkit
- allow components and services to be composed together
- toolkit allows component interactions to be specified
- Toolkit runtime will (goal)
- schedule components/services
- select best component/service communication mechanism
30. ADCS Grid
- Shared Student Lab of 98 machines downstairs
- Outfitted with GigE and fast large disks
- Turn Lab into a large storage Grid for data processing and visualization first
- Astronomy (PPM gas dynamics/visualization) - Woodward
- Movie Making
- Biology (Plant Genomics) - Retzel
- Similarity searches on large plant genomic datasets
- Students!