Developing SERVOGrid: eScience for Earthquake Simulation

Transcript and Presenter's Notes

1
Developing SERVOGrid: e-Science for Earthquake
Simulation
Marlon Pierce, Community Grids Lab, Indiana
University
2
Some slides to introduce myself
3
What is Informatics?
  • Informatics is...
  • understanding the impact technology has on
    people.
  • the development of new uses for technology.
  • the application of information technology in the
    context of another field.
  • http://www.informatics.indiana.edu/overview/what_is_informatics.asp

4
A Personal Example
  • My graduate training is in computational
    condensed matter physics.
  • I developed quantum Monte Carlo codes for
    simulating helium physically adsorbed on
    graphite.
  • My problems are not so much parallel computing,
    but:
  • Finding enough computing resources for parameter
    space studies.
  • Managing lots and lots of data files and their
    metadata.
  • How can I keep track of all the information I am
    generating?

5
A Personal Example, Cont.
  • Price of success: codes were used by others in my
    adviser's group in their Ph.D. work, but
  • How do we remove quirks of file names, parameter
    settings?
  • How can we simplify running the applications and
    reduce the learning time?
  • How can we avoid wasted computing with incorrect
    settings?
  • For example, using He-3 mass with He-4 potentials
  • How can I share results with collaborators?
  • These problems became more interesting to me and
    have been the themes of my postgraduate career.

6
What Have I Learned?
  • Science Informatics should apply appropriate
    information technologies and other tools to
    science problems
  • Working scientists vary widely in their expertise
    in tools.
  • Some technologies we will examine:
  • Web services, Web portals, Semantic Web
  • My emphasis is on application of technologies.
  • Must have a broad knowledge of available
    technologies.
  • Must avoid reinventions.
  • Avoid the "Hammer A" and "Hammer B"
    anti-patterns.
  • Hammer Fallacy A: all problems are nails.
  • Hammer Fallacy B: always use the big hammer.

7
Now introduce SERVOGrid as a science informatics
application/challenge
8
SERVOGrid: Solid Earth Research Virtual
Observatory
  • Grid Services and Portals to Support Earthquake
    Science

9
First, explain the problems, give background
10
Solid Earth Research Virtual Observatory (iSERVO)
  • Web-services (portal) based Problem Solving
    Environment (PSE)
  • Couples data with simulation, pattern recognition
    software, and visualization software
  • Enable investigators to seamlessly merge multiple
    data sets and models, and create new queries.
  • Data
  • Space-based observational data
  • Ground-based sensor data (GPS, seismicity)
  • Simulation data
  • Published/historical fault measurements
  • Analysis Software
  • Earthquake fault
  • Lithospheric modeling
  • Pattern recognition software

11
Philosophy
  • Store simulated and observed data
  • Archive simulation data with original simulation
    code and analysis tools
  • Access heterogeneous distributed data through
    cooperative federated databases
  • Couple distributed data sources, applications,
    and hardware resources through an XML-based Web
    Services framework.
  • Users access the services (and thus distributed
    resources) through Web browser-based Problem
    Solving Environment clients. 
  • The Web services approach defines standard,
    programming language-independent application
    programming interfaces, so non-browser client
    applications may also be built.

12
SERVOGrid Basics
  • Under development in collaboration with
    researchers at JPL, UC-Davis, USC, and Brown
    University.
  • Geoscientists develop simulation codes, analysis
    and visualization tools.
  • We need a way to bind distributed codes, tools,
    and data sets.
  • This is referred to as a Grid.
  • We need a way to deliver it to a larger audience
  • Instead of downloading and installing the code,
    use it as a remote service.

13
SERVOGrid Codes, Relationships
Figure: diagram relating the codes Elastic
Dislocation Inversion, Viscoelastic FEM,
Viscoelastic Layered BEM, Elastic Dislocation,
Pattern Recognizers, and Fault Model BEM.
14
SERVOGrid Application Descriptions
  • Codes range from simple rough estimate codes to
    parallel, high performance applications.
  • Disloc: Handles multiple arbitrarily dipping
    dislocations (faults) in an elastic half-space.
  • Simplex: Inverts surface geodetic displacements
    for fault parameters using simulated annealing
    downhill residual minimization.
  • GeoFEST: Three-dimensional viscoelastic finite
    element model for calculating nodal displacements
    and tractions. Allows for realistic fault
    geometry and characteristics, material
    properties, and body forces.
  • Virtual California: Program to simulate
    interactions between vertical strike-slip faults
    using an elastic layer over a viscoelastic
    half-space.
  • RDAHMM: Time series analysis program based on
    Hidden Markov Modeling. Produces feature vectors
    and probabilities for transitioning from one
    class to another.
  • PARK: Boundary element program to calculate fault
    slip velocity history based on fault frictional
    properties; a model for unstable slip on a single
    earthquake fault.
  • Preprocessors, mesh generators
  • Visualization tools: RIVA, GMT

15
Problems: Data Access and Sharing, Code
Integration
  • Codes all use custom text formats for describing
    input and output.
  • Input and output data often combined with
    code-specific information.
  • Number of iterations, array sizes, etc.
  • Data files often created by hand from journals,
    online repositories
  • Online repositories themselves use differing
    formats
  • Challenges are to develop common data formats,
    access services, and client query tools.

16
Data Formats
  • Faults, GPS or seismic data used in this project
    are retrieved from different servers.
  • Supported seismic data formats
  • SCSN
  • SCEDC
  • Dinger-Shearer
  • Hauksson
  • Supported GPS data formats
  • JPL
  • SOPAC
  • USGS

17
Next, present an overall architecture
18
SERVOGrid Architecture
  • Challenging problems like SERVOGrid are solved by
    starting with the right architecture.
  • Implementations and tools may change.
  • Having the right architecture and vision of the
    solution allows flexibility with point solutions.

19
Service Oriented Architectures
  • SERVOGrid is built around the Service Oriented
    Architecture Model.
  • W3C
  • Constituent pieces
  • Remotely accessible services
  • Capabilities are defined through interface
    definition languages (WSDL).
  • Accessible through messages and protocols (SOAP).
  • Implementations may change but interfaces must
    remain the same.
  • Client applications access remote services.
  • Client hosting environments
  • Web Portals are an example.
  • Going beyond services
  • Semantic descriptions for service and information
    modeling.
  • Programming/orchestration tools for connecting
    distributed services.
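
As a rough illustration of "accessible through messages and protocols (SOAP)", the sketch below builds and then parses a SOAP 1.1 envelope using only the Python standard library. The operation name `runDisloc`, the namespace `urn:example:servogrid`, and the `inputFile` parameter are hypothetical and are not taken from the actual SERVOGrid WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "urn:example:servogrid"  # hypothetical service namespace


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.split("}", 1)[1] if "}" in tag else tag


def parse_request(xml_text):
    """Recover the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]  # first child of Body
    return local_name(call.tag), {local_name(c.tag): c.text for c in call}


message = build_request("runDisloc", {"inputFile": "faults.in"})
operation, params = parse_request(message)
```

Either side could be rewritten in another language without the other noticing, as long as the message format stays the same.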

20
Figure: SERVOGrid architecture diagram. Labels:
Browser Interface; JSP Client Stubs; DB Service 1;
Job Sub/Mon and File Services; Viz Service; JDBC;
DB; Operating and Queuing Systems; RIVA; Host 1;
Host 2; Host 3.
21
Web Services
  • Web services are the fundamental pieces of
    distributed Service Oriented Architectures.
  • We should define lots of useful services that are
    remotely available
  • Archival data access services supporting queries,
    real time sensor access, and mesh generation all
    seem to be popular choices.
  • Web services have two important parts
  • Distributed services
  • Client applications
  • These two pieces are decoupled: one can build
    clients to remote services without caring about
    the programming language implementation of the
    remote service.
  • Java, C, Python

22
Web Services, Continued
  • Clients can be built in any number of styles
  • We build portal clients: ubiquitous, can combine
  • One can build fancier GUI client applications.
  • You can even embed Web service client stubs
    (library routines) in your application code, so
    that your code can make direct calls to remote
    data sources, etc.
  • Regardless of the client one builds, the services
    are the same in all cases
  • my portal and your application code may each use
    the same service to talk to the same database.
  • So we need to concentrate on services and let
    clients bloom as they may
  • Client applications (portals, GUIs, etc.) will
    have a much shorter lifecycle than service
    interface definitions, if we do our job
    correctly.
  • Client applications that are locked into
    particular services, use proprietary data formats
    and wire protocols, etc., are at risk.
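
A minimal sketch of the point that "my portal and your application code may each use the same service": two different clients depend only on an abstract interface, which here stands in for a WSDL definition. The class names, the fault record, and the slip-rate value are all invented for illustration.

```python
from abc import ABC, abstractmethod


class FaultService(ABC):
    """Stable service interface; plays the role a WSDL definition would."""

    @abstractmethod
    def get_fault(self, name):
        """Return a dict describing the named fault."""


class InMemoryFaultService(FaultService):
    """One possible implementation; a database-backed one could replace it."""

    def __init__(self, faults):
        self._faults = dict(faults)

    def get_fault(self, name):
        return self._faults[name]


def portal_view(service, name):
    """A 'portal' client: renders the fault record as an HTML fragment."""
    fault = service.get_fault(name)
    return f"<p>{name}: slip rate {fault['slip_mm_yr']} mm/yr</p>"


def application_code(service, name, years):
    """An 'application' client: uses the same service in a calculation."""
    return service.get_fault(name)["slip_mm_yr"] * years


# Hypothetical data, for illustration only.
service = InMemoryFaultService({"ExampleFault": {"slip_mm_yr": 25.0}})
html = portal_view(service, "ExampleFault")
slip = application_code(service, "ExampleFault", 10)
```

Both clients keep working if `InMemoryFaultService` is swapped for another implementation of `FaultService`.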

23
SERVOGrid Required Services
  • Computing Grid services
  • Remote command execution/job submission, file
    transfer, job monitoring.
  • These services
  • We may develop these using any number of toolkits
  • Globus, Apache Axis, GSoap.
  • Data Grid services
  • Access data bases and other data sources (faults,
    GPS, Seismic records).
  • Information Grid services
  • Metadata management

24
Here follow some descriptions of building
services
25
Execution Grid Service Examples (with Ahmet Sayar)
  • Simplest of these just run remote execution
    calls.
  • More interesting: combining several services into
    a single meta-service.
  • Run Disloc, when done move the output from darya
    to danube, generate a PDF image of the output
    using GMT, then pull the output back to the
    client browser for display.
  • Expressing these workflows in languages is an
    active area.
  • Simple solution: Apache Ant build tool.
  • Not a full fledged programming language, but it
    can do most of the workflow problems I encounter,
    and is easy to extend.
  • Tasks are expressible in XML, so you can build
    authoring tools to hide Ant-isms and validate
    scripts.
  • Open source and because it is generally
    applicable, likely to outlive most workflow tools.
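
The Disloc meta-service above can be sketched as an ordered list of steps, where each step runs only if the previous one succeeded. This is plain Python rather than Ant, and the four step functions are stand-ins for remote service calls, not the project's actual services. Only the host names darya and danube come from the slide.

```python
def run_workflow(steps, context):
    """Run named steps in order; stop and report at the first failure."""
    completed = []
    for name, action in steps:
        try:
            context = action(context)
        except Exception as error:
            return {"ok": False, "failed": name,
                    "completed": completed, "error": str(error)}
        completed.append(name)
    return {"ok": True, "completed": completed, "context": context}


# Stand-ins for remote calls; each returns an updated copy of the context.
def run_disloc(ctx):
    return {**ctx, "output": "disloc.out", "host": "darya"}


def move_output(ctx):
    return {**ctx, "host": "danube"}


def plot_with_gmt(ctx):
    if ctx.get("host") != "danube":
        raise RuntimeError("output is not on the plotting host")
    return {**ctx, "image": ctx["output"] + ".pdf"}


def fetch_to_client(ctx):
    return {**ctx, "delivered": ctx["image"]}


steps = [
    ("run_disloc", run_disloc),
    ("move_output", move_output),
    ("plot", plot_with_gmt),
    ("fetch", fetch_to_client),
]
result = run_workflow(steps, {"input": "faults.in"})
```

An Ant build file expresses the same idea declaratively, with targets and `depends` attributes in place of the Python list.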

26
Templating Applications and Generating Interfaces
  • Users fill in Ant templates through web forms
  • Ant execution services then invoke scripts.
  • Ant is a good way to wrap applications.
  • Ant template authoring tools simplify deployment
    of new wrapped services.
  • Ant scripts also can be used to automate user
    interface generation.
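
One way the template-filling step might look, assuming the web form arrives as a dictionary of strings. The template below is a hypothetical Ant build file that wraps one executable. The project name, executable, and file names are made up. The placeholders use Python's `string.Template` syntax, and form values are XML-escaped before substitution so that user input cannot break the script.

```python
from string import Template
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical Ant build file with placeholders for form fields.
ANT_TEMPLATE = Template("""<project name="$project" default="run">
  <target name="run">
    <exec executable="$executable">
      <arg value="$input_file"/>
      <arg value="$output_file"/>
    </exec>
  </target>
</project>
""")


def fill_template(form_fields):
    """Substitute web-form values into the template and check the result."""
    safe = {key: escape(value, {'"': "&quot;"})
            for key, value in form_fields.items()}
    script = ANT_TEMPLATE.substitute(safe)  # KeyError if a field is missing
    ET.fromstring(script)  # ParseError if the result is not well-formed XML
    return script


form = {
    "project": "disloc-run",
    "executable": "disloc",
    "input_file": "faults.in",
    "output_file": "faults.out",
}
script = fill_template(form)
```

The same list of placeholders could drive generation of the web form itself, which is the user interface automation the last bullet mentions.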

Figure Here
27
Some Screen Shots of Prototype
28
SERVOGrid Data Services (with Galip Aydin)
  • SERVO applications need real data sources
  • Online GPS and Seismic Activity catalogs
  • Lots of different formats.
  • Typically, a geoscientist downloads a catalog by
    hand, prunes out the undesired parts with
    scripts, and then runs analysis code.
  • Data services that unify formats and support
    database queries are obviously useful.
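
A small sketch of what "unify formats and support queries" could mean in code: per-format parsers map each catalog line onto one common record type, and queries run against that record only. The two line layouts and the sample events are invented for illustration and are not the real SCSN, SCEDC, or other catalog formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeismicEvent:
    """Common record that every catalog format is mapped onto."""
    time: str
    latitude: float
    longitude: float
    magnitude: float


def parse_format_a(line):
    """Hypothetical whitespace-separated layout: date time lat lon mag."""
    date, clock, lat, lon, mag = line.split()
    return SeismicEvent(f"{date}T{clock}", float(lat), float(lon), float(mag))


def parse_format_b(line):
    """Hypothetical comma-separated layout: mag, lon, lat, timestamp."""
    mag, lon, lat, stamp = [field.strip() for field in line.split(",")]
    return SeismicEvent(stamp, float(lat), float(lon), float(mag))


PARSERS = {"a": parse_format_a, "b": parse_format_b}


def load_catalog(lines, fmt):
    """Parse non-empty lines with the parser registered for this format."""
    return [PARSERS[fmt](line) for line in lines if line.strip()]


def query(events, min_magnitude):
    """Format-independent query over the unified records."""
    return [event for event in events if event.magnitude >= min_magnitude]


events = (
    load_catalog(["2003-01-01 12:00:00 34.05 -118.25 3.2"], "a")
    + load_catalog(["4.5, -117.10, 33.90, 2003-01-02T08:30:00"], "b")
)
strong = query(events, 4.0)
```

Supporting a new catalog then means adding one parser to `PARSERS`; the query code does not change.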

29
Data Sources
  • A summary of all supported formats can be found
    here:
  • http://grids.ucs.indiana.edu/gaydin/servo
  • Information about supported seismicity catalog
    formats can be found at
    http://www.scecdc.scec.org/catalogs.html
  • Information about supported GPS data formats can
    be found at http://www.scign.org
  • Future step: directly grab data from sensors
  • Ric McMullen, Knowledge Acquisition Lab

30
GML Schemas as Data Models for Services
  • Fault and GPS Schemas are based on GML-Feature
    object.
  • Seismicity Schema is based on GML-Observation
    object.
  • Working schema available from
    http://grids.ucs.indiana.edu/gaydin/schemas/

31
Metadata Management
  • Common problems in computational science
  • Where are the input and output files? When was
    this created? What parameters did I use to create
    this output? What version of the code? Is there
    a validation scenario for this code?
  • These are all metadata problems.

32
Context Management Service
  • Metadata may be organized into tree-like
    structures (see figure).
  • Context nodes hold one or more leaves and nodes.
  • Leaves are name/value pairs.
  • We usually need to create arbitrary trees.
  • Represent with recursive XML schema.
  • Search with XPath.
  • Context data storage is implementation dependent
    but service interface is independent.
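
The context tree and its XPath search might be sketched as follows. The element names `context` and `leaf`, and the sample project and session names, are assumptions for illustration and not the service's actual schema. Python's `xml.etree.ElementTree` supports only a subset of XPath, which is enough for these two lookups.

```python
import xml.etree.ElementTree as ET


def add_context(parent, name):
    """Add a child context node (contexts may nest to any depth)."""
    return ET.SubElement(parent, "context", {"name": name})


def add_leaf(context, name, value):
    """Add a name/value leaf to a context node."""
    return ET.SubElement(context, "leaf", {"name": name, "value": value})


# Hypothetical tree: user -> project1 -> session1
root = ET.Element("context", {"name": "user"})
project = add_context(root, "project1")
session = add_context(project, "session1")
add_leaf(project, "description", "test run")
add_leaf(session, "code", "disloc")
add_leaf(session, "inputFile", "faults.in")


def find_values(tree, leaf_name):
    """Search the whole tree for leaves with the given name."""
    matches = tree.findall(f".//leaf[@name='{leaf_name}']")
    return [leaf.get("value") for leaf in matches]


def leaves_at(tree, path):
    """Return the name/value pairs held directly by the context at path."""
    xpath = "." + "".join(f"/context[@name='{part}']" for part in path)
    node = tree.find(xpath)
    return {leaf.get("name"): leaf.get("value") for leaf in node.findall("leaf")}
```

For example, `leaves_at(root, ["project1", "session1"])` returns the metadata recorded for one session, and `find_values(root, "code")` lists every code name stored anywhere in the tree.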

Figure here
33
Context Manager Service Architecture
Client
SOAP/HTTP
Axis Servlet
Shared WSDL Interface
Context Manager
Internal Communication
Context Data
FS
XMLDB
34
Now describe portals
35
Grid Client Environments
  • The services we have previously described are
    headless.
  • WSDL descriptions are all you need to create
    client stubs (if not client applications).
  • Clients to services can be built with anything
  • Java, Python, .NET GUIs
  • Browser clients are an extremely common example.
  • Web Portals
  • Client Hosting Environments

36
Computational Web Portal Stack
  • Web service dream is that core services, service
    aggregation, and user interface development are
    decoupled.
  • How do I manage all those user interfaces?
  • Use portlets.

Aggregate Portals
User Interfaces
Application Web Services and Workflow
Core Web Services
37
Portal Architecture
Figure: layered portal architecture diagram, in a
hierarchical arrangement. Layer labels: Clients;
Portal Portlets (Jetspeed); Libraries; Services;
Resources. Component labels: Clients (Pure HTML,
Java Applet ..); Aggregation and Rendering;
Jetspeed Internal Services; Local Portlets; Remote
or Proxy Portlets; Portlet Class: WebForm; Portlet
Class: IFramePortlet; Portlet Class: JspPortlet;
Portlet Class: VelocityPortlet; Gateway (IU);
GridPort etc.; (Java) COG Kit; Web/Grid service
(three instances); Computing; Data Stores;
Instruments.
38
Open Grid Computing Environment Collaboratory
Members
  • University of Chicago
  • Gregor von Laszewski
  • University of Illinois/NCSA
  • Jay Alameda
  • Joe Futrelle
  • Indiana University/Community Grids Lab and CS
  • Marlon Pierce
  • Geoffrey Fox
  • Dennis Gannon
  • Beth Plale
  • University of Michigan
  • Charles Severance
  • Joseph Hardin
  • University of Texas/TACC
  • Mary Thomas
  • Jay Boisseau

39
What Are Grid Portals?
  • Computing portals provide ubiquitous,
    browser-based access to grid resources.
  • No special client software or platform needed
  • Access information in visually intuitive form
  • Provide services to support user interactions
  • Job archiving, portal metadata management services
  • Combine core grid services into custom services
  • Launch multistage jobs with dependencies
  • Couple execution, file transfer,
    visualization/analysis
  • Many, many such projects
  • A special issue of Concurrency and Computation:
    Practice and Experience described more than
    two dozen in 2001.
  • GCE Research Group of the GGF is the community
    forum.
  • Thomas, Gannon, and Fox are chairs.

40
General Portal Architectures
41
What Are the Problems?
  • NMI team members have worked in various
    combinations on other projects
  • Alliance Portal (Gannon, PI)
  • SciDAC Fusion Portal (Thomas, PI)
  • DOD Computing Portal (Thomas, PI)
  • SciDAC CMCS (contributions from Severance,
    Hardin, and von Laszewski).
  • Problems are always the same
  • How do we share portal services?
  • How do we reuse components between projects and
    groups?
  • Can we provide a standard abstraction for portal
    services and interfaces?
  • Can we provide an architecture that allows
    services and user interface components to be
    added in a standard way?
  • Need to shorten the standard service deployment
    phase so that we can concentrate on harder
    problems, specific sophisticated services
  • Fusion Grid needs very interactive, visual
    interface for setting up problems
  • Need to be able to deploy interfaces to standard
    components like MyProxy, GridFTP, etc. quickly

42
Portlets and Containers
  • Provide a portal container/component system
  • Portal components are called portlets
  • Create a packaged, easy-to-install, customizable
    portal system with standard, useful components.
  • Pick and choose from available functionality
  • Support community extensions
  • Plug useful contributions from other groups
  • We base our system on Jakarta's Jetspeed project
  • JSR 168 (released this summer) standardizes
    portlet systems
  • Commercial and open source implementations should
    interoperate
  • WSRP will provide standard ways to build remote
    portlets

43
OGCE Initial Architecture
44
Evolving Portal Architecture
45
(No Transcript)
46
What's in the Release?
  • A component-based portal container
  • Jetspeed with CHEF enhancements, patches
  • Will evolve to JSR 168 standards
  • Portlet components and services
  • Discussion boards, MOTDs, message boards, chat
  • Calendar tools
  • Newsgroups and citation/reference managers
  • Grid information services (LDAP-based, GPIR)
  • Portlet interfaces to MyProxy credential
    management
  • Portlet interfaces to GridFTP
  • Scheduled for release, SC2003

47
Deliverables: Science Portal Tools
  • Will concentrate on science applications
  • Provide services and examples for building
    Science Portals
  • Deliverables include
  • Application Manager Web Service with sample
    application
  • Portlets for IU Extreme Labs application tools:
    Application Factories, XEvents, Xdirectory,
    Xbooks
  • Portlets and services for QuakeSim Earthquake
    simulation
  • Metadata repository user interfaces and services

48
Deliverables: Portal Collaboratory
  • We will use our own tools to provide a community
    portal
  • Provide information, collaboration for portal
    building community
  • Demonstrate capabilities
  • Provide a repository for community contributions
  • CMCS, NEESGrid, GridLab, and others

49
QuakeSim Portal Shots
50
Making SERVO Semantic
  • Application of Semantic Web tools and concepts to
    SERVOGrid

51
Where Is the Semantic Web? (with Mehmet Aktas)
  • Last summer I went on a quest to find this
    somewhat elusive entity.
  • What I found: lots of great ideas, even a few
    implementations, but
  • Too much semantic, not enough web.
  • My conclusion: it really needs some driving
    applications and more distributed computing
    infrastructure.
  • Driving application: Scientific Metadata

52
Semantic Web in One Slide
RDF provides a subject/predicate/value syntax.
Predicates and values are URIs.
Figure: RDF graph. Nodes: http://.../CMCS/Entry/1,
http://.../People/DrY, H20, dry@stateu.edu.
Predicates: dc:creator, dc:title, vcard:EMAIL,
vcard:N, vcard:Family, vcard:Given.
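
The subject/predicate/value idea can be sketched with plain Python tuples, with no RDF library assumed. The entry and person URIs below use `example.org` as hypothetical stand-ins, because the slide elides the real ones. The title and e-mail values are the ones shown on the slide.

```python
DC = "http://purl.org/dc/elements/1.1/"        # Dublin Core namespace
VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"  # vCard RDF namespace

# Hypothetical URIs; the slide's own URIs are elided.
ENTRY = "http://example.org/CMCS/Entry/1"
PERSON = "http://example.org/People/DrY"

TRIPLES = [
    (ENTRY, DC + "creator", PERSON),
    (ENTRY, DC + "title", "H20"),
    (PERSON, VCARD + "EMAIL", "dry@stateu.edu"),
]


def match(triples, subject=None, predicate=None, value=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (value is None or t[2] == value)
    ]


def creator_emails(triples, entry):
    """Follow entry -> dc:creator -> vcard:EMAIL across two triples."""
    emails = []
    for _, _, person in match(triples, subject=entry, predicate=DC + "creator"):
        for _, _, email in match(triples, subject=person, predicate=VCARD + "EMAIL"):
            emails.append(email)
    return emails
```

Because the value of one triple (the creator's URI) is the subject of another, separate descriptions link together without a shared schema.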
53
Semantic Needs for SERVOGrid
  • SERVOGrid has many types of metadata: little
    ontologies
  • Computing resources
  • Applications
  • Data
  • Services
  • I have designed XML schemas and built services
    for this sort of metadata before, but they were
    too monolithic.
  • RDF has an interesting way of expressing linkages
    between different RDF fragments.
  • If we can exploit this, it will make for much
    more flexible metadata services.

54
A SERVOGrid Ontology
  • Show figure here.

55
Making It Work
  • One of the problems we encountered with
    processing RDF metadata is that tools assume all
    data is local.
  • What we really have though are metadata fragments
    scattered throughout SERVOGrid.
  • Need ways of processing RDF triplets when
    predicate values are not local.
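
One possible shape for a solution, offered only as a sketch: a resolver that fetches the fragment describing a URI the first time a query reaches it, then caches it. The `fetch` callable stands in for a remote call, and here it is simulated with a dictionary. The `ex:` predicates and `example.org` URIs are hypothetical.

```python
class FragmentResolver:
    """Answers path queries over triples that live in separate fragments."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: URI -> iterable of (subject, predicate, value)
        self._cache = {}

    def describe(self, uri):
        """Return the fragment for a URI, fetching it at most once."""
        if uri not in self._cache:
            self._cache[uri] = list(self._fetch(uri))
        return self._cache[uri]

    def follow(self, start, predicates):
        """Walk a chain of predicates, fetching each fragment on demand."""
        current = [start]
        for predicate in predicates:
            current = [
                value
                for uri in current
                for subject, pred, value in self.describe(uri)
                if subject == uri and pred == predicate
            ]
        return current


# Simulated remote fragments, keyed by the URI they describe (hypothetical data).
APP = "http://example.org/apps/disloc"
HOST = "http://example.org/hosts/danube"
REMOTE = {
    APP: [(APP, "ex:installedOn", HOST)],
    HOST: [(HOST, "ex:queue", "batch")],
}
calls = []


def fake_fetch(uri):
    calls.append(uri)  # record each simulated network round trip
    return REMOTE.get(uri, [])


resolver = FragmentResolver(fake_fetch)
queues = resolver.follow(APP, ["ex:installedOn", "ex:queue"])
```

Only the fragments the query actually touches are fetched, so no single host needs a complete local copy of the metadata.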