Title: RealityGrid
1RealityGrid
- Software Infrastructure, Integration and
Interoperability
Stephen Pickles ltstephen.pickles_at_man.ac.ukgt http
//www.realitygrid.org Royal Society, Tuesday 17
June, 2003
2Middleware landscape
3A tour of the Middleware Landscape
- Traditional Grid computing, roots in
Supercomputing - Globus Toolkit
- US led, de-facto standard, adopted by UK
e-Science programme - UNICORE
- German and European
- Business-business integration, roots in
e-Commerce - Web services
- The great white hope
- Open Grid Services Architecture (OGSA)
4Globus Toolkit
- A software toolkit addressing key technical
problems in the development of Grid enabled
tools, services, and applications - Offer a modular bag of technologies
- Enable incremental development of grid-enabled
tools and applications - Implement standard Grid protocols and APIs
- Make available under liberal open source license
- The Globus Toolkit centers around four key
protocols - Connectivity layer
- Security Grid Security Infrastructure (GSI)
- Resource layer
- Resource Management Grid Resource Allocation
Management (GRAM) - Information Services Grid Resource Information
Protocol (GRIP) - Data Transfer Grid File Transfer Protocol
(GridFTP)
5Globus features
- "Single sign-on" through Grid Security
Infrastructure (GSI) - Remote execution of jobs
- GRAM, job-managers, Resource Specification
Language (RSL) - Grid-FTP
- Efficient file transfer third-party file
transfers - MDS (Metacomputing Directory Service)
- Resource discovery
- GRIS and GIIS
- Today, built on LDAP
- Co-allocation (DUROC)
- Limited by support from scheduling infrastructure
- Other GSI-enabled utilities
- gsi-ssh, grid-cvs, etc.
- Commodity Grid Kits (CoG-kits), Java, Python,
Perl - Widespread deployment, lots of projects, UK
e-Science Grid
6UNICOREUNIform access to COmputing REsources
- Started 1997
- German projects
- UNICORE
- UNICORE PLUS
- EU projects with UK partners
- EUROGRID
- GRIP
- www.unicore.org
7UNICORE features
- High-level abstractions
- Abstract Job Object (AJO) represents a user's job
- Task dependencies represented in a DAG
- A task may be script, a compilation or run of an
executable, a file transfer, - SubAJOs can be run on different computational
resource - Software resources (we miss this!)
- High-level services or capabilities
- Built in workflow
- Support for conditional execution and loops (tail
recursive execution of a DAG) - Graphical user interfaces for job definition and
submission - Extensible through plug-ins
- Resource broker
8Growing awareness of UNICORE
- Early UNICORE-based prototype OGSI implementation
by Dave Snelling May 2002 - Later used in first RealityGrid computational
steering demonstrations at UK e-Science All Hands
Meeting, September 2002 - Dave Snelling, senior architect of UNICORE and
RealityGrid partner is now co-chair of the OGSI
working group at Global Grid Forum - UNICORE selected as middleware for Japanese
NAREGI project (National Research Grid
Initiative) - 5 year project 2003-2007
- 17M budget in FY2003
9Recent UNICORE developments
- GridFTP already supported as alternative file
transfer mechanism - Multiple streams help even when there is only one
network link - pipelining! - Support for multiple streams and re-startable
transfers will go into UNICORE Protocol Layer
(UPL) mid-summer 2003 - UNICORE servers now support streaming (FIFOs)
- Ideal for simulation-visualization connections
RealityGrid style - Translators between UNICORE resource descriptions
and MDS exist - Looking closely at XML based representation of
UNICORE resource descriptions - Ontological model to be informed by GLUE schema
and NAREGI experiences - UNICORE programmer's guide (draft)
- OGSI implementation in progress
- With XML serialization over UPL (faster than SOAP
encapsulation, even with MIME types) - OGSI portType for Abstract Job Object (AJO)
consignment - NB. OGSI compliance alone does not imply
interoperability with GRAM! - Job Management and Monitoring portTypes to follow
10UNICORE and Globus
- UNICORE
- Java and Perl
- Packaged Software with GUI
- Top down, vertical integration
- Not originally open source
- Designed with firewalls in mind
- Easy to customise
- No delegation
- Abstract Job Object (AJO) Framework
- Workflow
- Resource Broker
- Easy to deploy
- Globus
- ANSI C, Perl, shell scripts
- Toolkit approach
- Bottom up, horizontal integration
- Open Source
- Firewalls are a problem
- Easy to build on
- Supports Delegation
- Weak, if any, higher level abstractions
11Web Services
- Loosely Coupled Distributed Computing
- Think Java RMI or C remote procedure call
- Service Oriented Architecture
- Text Based Serialization
- XML Human Readable serialization of objects
- Three Parts
- Messages (SOAP)
- Definition (WSDL)
- Discovery (UDDI)
12Web Services in Action
13Enter Grid Services
- Experiences of Grid computing (and business
process integration) suggest extending Web
Services to provide - State
- Service Data Model
- Persistence and Naming
- Two Level Naming (GSH, GSR)
- Allows dynamic migration and QoS adaptation
- Lifetime Management
- Explicit Semantics
- Grid Services specify semantics on top of Web
Service syntax. - Infrastructure support for common patterns
- Factories, registries (service groups),
notification - By integrating Grid and Web Services we gain
- Tools, investment, protocols, languages,
implementations, paths to exploitation - Political weight behind standards and solutions
14OGSA - outlook
- The Open Grid Services Infrastructure (OGSI) is
the foundation layer for the Open Grid Services
Architecture (OGSA) - OGSA has a lot of industrial and academic weight
behind it - IBM is committed, Microsoft is engaged, Sun and
others are coming on board - Globus, UNICORE and others are re-factoring for
OGSA - Goal is to push all OGSI extensions into core
WSDL - Some early successes
- OGSI specification is now in RFC after 12 months
concerted effort by many organisations and
individuals - OGSI promises "a better kind of string" to hold
the Grid together. But its not a "silver
bullet". - Lacking higher-level semantics
- Much standardisation effort still required in
Global Grid Forum and elsewhere
15OGSI implementations
- Complete
- Globus Toolkit version 3 (Apache/Axis)
- Fast and furious pre-releases trying to keep pace
with evolving standard - General release Q3 2003
- .NET implementation from Marty Humphrey,
University of Virginia - Partial
- Java/UNICORE Snelling and van den Berghe, FLE
- production-quality implementation informed by May
'02 prototype - Python Keith Jackson, LBNL
- Perl Mark Mc Keown, University of Manchester
- GGF panel will report next week.
16The pieces of RealityGrid
- Integration, Independence and Interoperability
- Or
- Putting it all together
17The pieces
- Steering library and tools
- Visualization
- See also talk by Steven Kenny, Loughborough
University - Web portal
- More on this and other topics from
Jean-Christophe Desplat, EPCC - ICENI
- More later from Steven Newhouse, LeSC, Imperial
College - Performance Control
- More later from John Gurd, CNC, University of
Manchester - Instruments LUSI, XMT (not this talk)
- Human-Computer Interfaces (HCI)
- Should not be last in the list!
- More later from Roy Kalawsky, Loughborough
University - Not to mention application codes and the
middleware landscape itself.
18Design philosophies and tensions
- Grid-enabled
- Component-based and service-oriented
- plug in and compose new components and services,
from partners and third parties - Independence and modularity
- to minimize dependencies on third-party software
- Should be able to steer locally without and Grid
middleware - to facilitate parallel development within project
- Integration and/or interoperability
- Things should work together
- Respect autonomy of application owners
- Prefer light-weight instrumentation of
application codes to wholesale re-factoring - Same source (or binary) should work with or
without steering - Dynamism and adaptibility
- Attach/detach steering client from running
application - Adapt to prevailing conditions
- Intuitive and appropriate user interfaces
19First Fast Track Demonstration
Jens Harting at UK e-Science All Hands Meeting,
September 2002
20Fast Track Steering DemoUK e-Science AHM 2002
Bezier SGI Onyx _at_ Manchester Vtk VizServer
Firewall
SGI OpenGL VizServer
UNICORE Gateway and NJS Manchester
Laptop SHU Conference Centre
Simulation Data
UNICORE Gateway and NJS QMUL
- VizServer client
- Steering GUI
- The Mind Electric GLUE web service hosting
environment with OGSA extensions - Single sign-on using UK e-Science digital
certificates
Dirac SGI Onyx _at_ QMUL LB3D with
RealityGrid Steering API
Steering (XML)
21Steering architecture today
- Communication modes
- Shared file system
- Files moved by UNICORE daemon
- GLOBUS-IO
Simulation
Client
data transfer
Data mostly flows from simulation to
visualization. Reverse direction is being
exploited to integrate NAMDVMD into RealityGrid
framework.
Visualization
Visualization
22Computational steering how?
- We instrument (add "knobs" and "dials" to)
simulation codes through a steering library,
written in C - Can be called from Fortran90, C and C
- Library distribution includes Fortran90 and C
examples - Library features
- Pause/resume
- Checkpoint and restart new in today's
demonstration - Set values of steerable parameters (parameter
steer) - Report values of monitored (read-only) parameters
(parameter watch) - Emit "samples" to remote systems for e.g. on-line
visualization - Consume "samples" from remote systems for e.g.
resetting boundary conditions - Automatic emit/consume with steerable frequency
- No restrictions on parallelisation paradigm
- Images can be displayed at sites remote from
visualization system, using e.g. SGI OpenGL
VizServer - Interactivity (rotate, pan, zoom) and shared
controls are important
23Steering client
- Built using C and Qt library currently have
execs. for Linux and IRIX - Attaches to any steerable RealityGrid application
- Discovers what commands are supported
- Discovers steerable monitored parameters
- Constructs appropriate widgets on the fly
24Implementing steering
- Steps required to instrument a code for steering
- Register supported commands (eg. pause/resume,
checkpoint) - steering_initialize()
- Register samples
- register_io_types()
- Register steerable and monitored parameters
- register_params()
- Inside main loop
- steering_control()
- Reverse communication model
- User code actions, in sequence, each command in
list returned - Support routines provided (eg. emit_sample_slice)
- When you write a checkpoint, register it
- When finished,
- steering_finalize()
25Steering in an OGSA framework
Simulation
bind
Steering library
publish
Client
connect
data transfer
Steering client
Registry
find
publish
bind
Visualization
Visualization
26Steering in OGSA continued
- Each application has an associated OGSI-compliant
Steering Grid Service (SGS) - SGS provides public interface to application
- Use standard grid service technology to do
steering - Easy to publish our protocol
- Good for interoperability with other steering
clients/portals - Future-proofed next step to move away from
file-based steering - Application still need only make calls to the
steering library - SGSs used to bootstrap direct inter-component
connections for large data transfers - Early working prototype of OGSA Steering Grid
Service exists - Based on light-weight Perl hosting environment
27Steering wishlist
- Logging of all steering activity
- Not just checkpoints
- As record of investigative process
- As basis for scripted steering
- Scripted steering
- Breakpoints ( IF (temperature gt TOO_HOT) THEN )
- Replay of previous steering actions
- Advanced checkpoint management to support
exploration of parameter space (and code
development) - Imagine a tree where nodes are checkpoints and
branches are choices made through steering
interface (cf. GRASPARC)
28On-line visualisation
- Fast track demonstrators use open source VTK for
on-line visualisation - Simple GUI built with Tk/Tcl, polls for new data
to refresh image - Some in-built parallelism
- VTK extended to use the steering library
- AVS-format data supported
- XDR-format data for sample transfer between
platforms - Volume-data focus so far but this will change as
more steerable applications bring new
visualization requirements - Volume render (parallel)
- Isosurface
- Hedgehog
- Cut-plane
29Visualization experiences
- Positive experiences of SGI VizServer
- Delivers pixels to remote displays, transparently
- Reasonable interactivity, even over long
distances - Will test further over QoS-enabled networks in
collaboration with MB-NG - Also experimenting with Chromium
- VizServer Integration
- Put GSI authentication into VizServer PAM when
released - Use VizServer API for Advance Reservation of
graphics pipes, and - Exploit new collaborative capabilities in latest
release. - Access Grid Integration trying two approaches
- VizServer clients in Access Grid nodes (lose
benefits of multicast) - Using software from ANL (developers of Chromium)
to write from VTK to video interface of Access
Grid
30Visualization Integration issues
- Traditional Modular Visualization Environments
tend to be monolithic - Incorporating the simulation in an extended
visualization pipeline makes steering possible,
but often implies lock-in to a single framework - When remote execution is supported, rarely "Grid
compatible" - But the UK e-Science project gViz is changing
this for Iris Explorer - Can't easily compose components from different
MVEs - No single visualization package meets all
requirements for features and performance
31Portal
- Portal "web client"
- Includes job launching and monitoring
- EPCC Work package 3 portal integration
- Initial design document released, work about to
start - Integration issues
- Portal must provide steering capability
- Through Steering Grid Service
- On-line visualization is harder for a web client
- Do only what's practical
- gt 40 UK e-Science projects have a portal work
package!
32ICENI (deep track)
- ICENI provides
- Metadata-rich component framework
- Exposure of components as OGSA services
- Steering and visualization capabilities of its
own - Rich set of higher level services
- Potentially a layer between RealityGrid and
middleware landscape - Recent developments
- First controlled release of ICENI within project
to CNC - Integration of "binary components" (executables)
- Dynamic composition in progress (important for
attach/detach) - More in talk by Steven Newhouse
- ICENI integration challenges
- Describe and compose the components of a "fast
track" computational steering application in
ICENI language - Reconcile "fast track" and ICENI steering
entities and concepts
33Performance Control
34Performance Control (deep track)
- Component loader supports parallel job migration
- Relies on checkpoint/restart software
- Proven in LB3D
- Plan to generalise, co-ordinating with Grid-CPR
working group of GGF - Malleable from N processors on platform X to M
processors on platform Y - Integration issues
- Must work with ICENI components
- Groundwork laid
- Application codes must be instrumented for
Performance Control - Steering Library
- Both trigger taking of checkpoints
- Steering system needs to know when a component
migrates - Embedding Performance Control in Steering Library
may be feasible
35Human Computer Interfaces
- Recommendations from HCI Audit beginning to
inform a wide range of RealityGrid developments - Steering
- Visualization
- Web portal
- Workflow
- Management of scientific process
- Sometimes architecturally separate concepts
should be presented in single interface to the
end-user - More in Roy Kalawsky's talk
36Middleware, again
- All this has to build on, and interface with, the
shifting sands of the middleware landscape - We seem to be spoiled for choice
- Globus, UNICORE, Web Services, ICENI are all on
an OGSA convergence track - with the promise of interoperability
- but at a very low level.
- Higher level services will emerge
- But will they be timely?
- Will they be backed by standards?
- Will we be able to use them?
37Widespread deployment
- Short term, GT 2 is indicated
- To gain leverage from common infrastructure of UK
e-Science programme - Grid Support Centre, Level 2 Grid, Training,
Engineering Task Force, etc. - Hence our recent work on RealityGrid-L2
- Demonstrated on Level 2 Grid at LeSC open day in
April - But it's hard
- Deployment and maintenance overhead of Globus is
high - We need "high quality" cycles for capability
computing - Gaining access to L2G resources is still painful,
but slow progress is being made - Much work required to adapt applications and
scripts to available L2G resources - Discovering useful information like queue names
and policies location of compilers, libraries,
tools, scratch space retention policies, etc. is
still problematic
38External dependencies
- Robustness of middleware
- Maturity of OGSA and availability of services
- Tooling support for web services in languages
other than Java and C - Support for advanced reservation and
co-allocation - Need to co-allocate (or co-schedule)
- (a) multiple processors to run a parallel
simulation code, - (b) multiple graphics pipes and processors on the
visualization system - More complex scenarios involve multi-component
applications.
39Conclusions
- We've achieved a lot
- But there's tough challenges ahead
40Partners
- Academic
- University College London
- Queen Mary, University of London
- Imperial College
- University of Manchester
- University of Edinburgh
- University of Oxford
- University of Loughborough
- Industrial
- Schlumberger
- Edward Jenner Institute for Vaccine Research
- Silicon Graphics Inc
- Computation for Science Consortium
- Advanced Visual Systems
- Fujitsu
41SVE _at_ Manchester Computing
Bringing Science and Supercomputers
Together http//www.man.ac.uk/sve sve_at_man.ac.uk