Title: SCIRun with PtolemyII
1Toward interactive visualization in a distributed
workflow
Steven G. Parker, Oscar Barney, Ayla Khan, Thiago Ize
2Component-Based Architectures
- Experience with numerous component-based architectures
- CCA (Parallel, Method Invocation, multi-language)
- SCIRun (Shared memory, Dataflow, C++)
- Uintah (Parallel, Method Invocation, C++)
- Kepler (Single process web services, Generalized dataflow, Java)
- SCIRun2 (Distributed/Parallel, Multi-model, multi-language)
3DOE Common Component Architecture Project
- A component architecture (CA) for large-scale scientific computation
- Component Characteristics
- May be SPMD or multi-threaded parallel objects
- Heterogeneity
- Parallel platforms to desktops and any language
- Local and Remote
- Parallel communication for remote parallel interfaces and 0-copy in-process connection
- Dynamic Composition and Integration
- Hot-swappable components, shared instances
- www.cca-forum.org
- Open forum involving DOE labs, Universities,
others
4Uintah
- CCA-ish component architecture (C++ only)
- Plus components for multiphysics structured AMR simulations
- Scales to 2000 processors
5SCIRun
6SCIRun PowerApps BioImage
7The KEPLER System for Scientific Workflows
- A framework for design, execution and deployment
of scientific workflows
- Caters specifically to the domain scientist
- Builds on Ptolemy II
- Application pull from various projects
http://kepler.ecoinformatics.org
Slide thanks to Ilkay Altintas and Efrat Jaeger, SDSC, UCSD
8Kepler Workflow
9Component Architecture Design Choices
- Degree of isolation: processes, threads, single address space?
- Mechanism for communication: dataflow, process networks, method invocation
- Synchronization
- Programming languages: expressiveness tradeoffs
- Data types explicitly supported
- Performance requirements
- Extra tools required?
- Explicit support for parallelism?
- Result: multiple designs for component architectures
- Tailored to application needs
- Islands of functionality
10SCIRun2
- SCIRun2 provides a component model for component models (metacomponents)
- Plug-ins provide support for
- CCA
- SCIRun
- VTK
- Others
- Components use native communication mechanisms to connect to similar components
- Bridges connect models
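As a rough illustration of the bridging idea, the following Python sketch connects a dataflow-style component to a method-invocation-style component through a small adapter. Every class and method name here (DataflowModule, MethodComponent, DataflowToMethodBridge) is hypothetical and chosen for illustration; none of them is a SCIRun2 API.

```python
from typing import Callable, List


class DataflowModule:
    """Hypothetical dataflow-style component: pushes values to connected listeners."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[float], None]] = []

    def connect(self, listener: Callable[[float], None]) -> None:
        self._listeners.append(listener)

    def emit(self, value: float) -> None:
        for listener in self._listeners:
            listener(value)


class MethodComponent:
    """Hypothetical method-invocation-style component: callers invoke set_value()."""

    def __init__(self) -> None:
        self.received: List[float] = []

    def set_value(self, value: float) -> None:
        self.received.append(value)


class DataflowToMethodBridge:
    """Looks like a dataflow listener on one side and makes a method call on the other."""

    def __init__(self, target: MethodComponent) -> None:
        self._target = target

    def __call__(self, value: float) -> None:
        self._target.set_value(value)


source = DataflowModule()
sink = MethodComponent()
source.connect(DataflowToMethodBridge(sink))
source.emit(1.5)
source.emit(2.5)
```

Neither component knows about the other's model; only the bridge does, which is the property the slide attributes to bridges between component models.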
11Meta-components example
12Application
13SDM Requirements
- Distributed Workflow
- Repetitive
- Shared resources
- Automatically driven
- Coarse-grained (seconds to minutes per operation)
- Interactive Visualization
- Exploratory
- Dedicated resources
- User-driven
- Fine-grained (milliseconds to seconds per
operation)
14Goal
- What the user wants
- To get work done
- Make hard things easy
- How to do this
- Combine tools with disparate strengths
- Make them work efficiently
- Focus on interfaces
- Enable consistent user interfaces
15Utah's Contribution To the SPA Group
- SCIRun can now be controlled from SPA/Kepler workflows
- Server interface
- JNI interface
- Smart Re-run capability
- Provenance framework
16Kepler Workflow
17Workflow Requirements and Wants We Address
- Seamless access to resources and services
- Smart re-runs
- Data provenance
- Reliability and fault-tolerance
- Detached execution
- From B. Ludäscher, et al. Scientific Workflow Management and the Kepler System. Concurrency and Computation: Practice & Experience, Special Issue on Scientific Workflows, to appear, 2005.
18SCIRun With SPA/Kepler
- Kepler actor sends requests to a SCIRun server
- Useful for processing batch jobs or iterating through the parameter space of a SCIRun module (actor)
- Requires existing SCIRun network, which the workflow actor will tell SCIRun to load
- JNI interface to SCIRun
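Iterating through a module's parameter space can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the request layout, the network file name, the parameter names, and the submit callable are all hypothetical, and fake_submit stands in for whatever mechanism actually hands a request to SCIRun.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List


def sweep(network: str,
          parameters: Dict[str, Iterable],
          submit: Callable[[dict], str]) -> List[str]:
    """Submit one request per parameter combination and collect result locations."""
    names = sorted(parameters)
    results = []
    for values in product(*(parameters[n] for n in names)):
        # Hypothetical request layout: which network to load, which settings to apply
        request = {"network": network, "settings": dict(zip(names, values))}
        results.append(submit(request))
    return results


def fake_submit(request: dict) -> str:
    # Stand-in for a real request to SCIRun; returns a made-up result location
    s = request["settings"]
    return f"/tmp/iso_{s['isovalue']}_res_{s['resolution']}.out"


locations = sweep("example.net",
                  {"isovalue": [0.1, 0.5], "resolution": [64, 128]},
                  fake_submit)
```

Passing submit in as a callable keeps the sweep logic independent of whether requests travel over a socket or through an in-process binding.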
19SCIRun Server
- Simple TCP/IP server that can be started remotely by Kepler
- Accepts requests from client actor in the workflow and then sends back location of results when it has finished
- Allows for the possibility of remote and/or detached execution of SCIRun
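The request/reply pattern described above can be sketched with Python's standard socketserver module. The wire format ("RUN <network>" answered by "DONE <location>") is an assumption made for illustration and is not the actual SCIRun server protocol; the handler only fabricates a result location where a real server would load and execute the network.

```python
import socket
import socketserver
import threading


class RequestHandler(socketserver.StreamRequestHandler):
    """Handle one line-based request: 'RUN <network>' is answered with 'DONE <location>'."""

    def handle(self) -> None:
        line = self.rfile.readline().decode("utf-8").strip()
        command, _, network = line.partition(" ")
        if command == "RUN" and network:
            # A real server would load and execute the network before replying
            reply = f"DONE {network}.result\n"
        else:
            reply = "ERROR bad request\n"
        self.wfile.write(reply.encode("utf-8"))


def request_run(host: str, port: int, network: str) -> str:
    """Client side, roughly what a workflow actor would do."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(f"RUN {network}\n".encode("utf-8"))
        return conn.makefile("r", encoding="utf-8").readline().strip()


# Port 0 asks the OS for any free port; the server runs in a background thread
server = socketserver.TCPServer(("127.0.0.1", 0), RequestHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()
reply = request_run("127.0.0.1", port, "example.net")
server.shutdown()
server.server_close()
```

Because the client only needs a host and port, the same code path covers a server on the local machine and one started remotely.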
20SCIRun and Kepler Dataflow Integration
- Incorporate SCIRun computation and visualization with the SPA workflow engine
- Automate SCIRun network execution with a Kepler actor driving execution through a JNI interface or a remote connection to a SCIRun server
21JNI interface with workflow
22What is provenance data?
- In general: steps taken to get a result
- Information about computational experiments or runs of scientific workflows that is needed to reproduce results
- We want to log metadata, steps applied to data, tools used to create data products
- Useful when you want to share/publish results
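One possible shape for such a log is sketched below in Python. The field names (actor, tool, parameters, inputs, outputs) and the sample values are assumptions chosen to mirror the bullets above; they are not the schema of the Kepler provenance framework.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List


@dataclass
class Step:
    """One step applied to the data (hypothetical fields)."""
    actor: str
    tool: str
    parameters: Dict[str, Any]
    inputs: List[str]
    outputs: List[str]


@dataclass
class ProvenanceLog:
    """Metadata plus the ordered steps of one workflow run."""
    workflow: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    steps: List[Step] = field(default_factory=list)

    def record(self, step: Step) -> None:
        self.steps.append(step)

    def to_json(self) -> str:
        # asdict() recurses into the nested Step dataclasses
        return json.dumps(asdict(self), indent=2, sort_keys=True)


log = ProvenanceLog("isosurface-demo", metadata={"user": "example"})
log.record(Step("Reader", "SCIRun", {"file": "volume.nrrd"}, [], ["field"]))
log.record(Step("Isosurface", "SCIRun", {"isovalue": 0.5}, ["field"], ["mesh"]))
restored = json.loads(log.to_json())
```

Serializing to plain JSON keeps the record readable outside the workflow system, which matters when results are shared or published.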
23The Standalone Provenance Framework
http://kepler-project.org/Wiki.jsp?page=KeplerProvenanceFramework
24Smart re-runs
- Instead of running a workflow from scratch, we only re-run parts of the workflow that have not been done before
- Example: we change a parameter downstream and don't want to re-run the actors that lead up to the one with the parameter change
- Especially useful in visualization pipelines and long-running workflows
25Utah and Smart Re-runs
- Uses VisTrails cache manager algorithm
- Idea is to re-run as little of the network as possible by combining intermediate results from different workflow runs
- Recreates input to actors that need to be re-fired
- L. Bavoil, et al. VisTrails: Enabling Interactive Multiple-View Visualizations. IEEE Visualization, 2005.
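The general idea of reusing intermediate results across runs can be sketched in Python as follows. This is not the VisTrails cache manager algorithm itself, only a simplified illustration: each actor gets a signature computed from its name, its parameters, and the signatures of its upstream actors, and an actor is fired only if that signature has not been seen before. The workflow representation and all names are hypothetical, and the sketch assumes an actor's name identifies its behaviour.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Tuple

# actor name -> (function, parameters, names of upstream actors)
Workflow = Dict[str, Tuple[Callable[..., Any], Dict[str, Any], List[str]]]


class SmartRunner:
    def __init__(self) -> None:
        self.cache: Dict[str, Any] = {}  # signature -> output, kept across runs
        self.fired: List[str] = []       # actors that were actually executed

    def signature(self, wf: Workflow, name: str) -> str:
        _, params, upstream = wf[name]
        payload = json.dumps(
            [name, params, [self.signature(wf, u) for u in upstream]],
            sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, wf: Workflow, name: str) -> Any:
        sig = self.signature(wf, name)
        if sig not in self.cache:
            func, params, upstream = wf[name]
            # Recreate the actor's inputs, reusing cached upstream results
            inputs = [self.run(wf, u) for u in upstream]
            self.cache[sig] = func(*inputs, **params)
            self.fired.append(name)
        return self.cache[sig]


def make_workflow(scale: float) -> Workflow:
    return {
        "load": (lambda: [1, 2, 3], {}, []),
        "square": (lambda xs: [x * x for x in xs], {}, ["load"]),
        "scale": (lambda xs, factor: [x * factor for x in xs],
                  {"factor": scale}, ["square"]),
    }


runner = SmartRunner()
first = runner.run(make_workflow(2.0), "scale")
fired_first = list(runner.fired)
runner.fired.clear()
second = runner.run(make_workflow(3.0), "scale")  # only the downstream parameter changed
fired_second = list(runner.fired)
```

In the second run only the "scale" actor fires, because the signatures of "load" and "square" are unchanged and their outputs come from the cache.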
27What is needed for Smart Re-runs
- We need to keep track of what we have done before
- Specifically, we need to know which actors have been given which inputs and produced which outputs
- Stored provenance data can give us the information we need
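A minimal sketch of such bookkeeping, using Python's built-in sqlite3 module, is shown below. The table layout, the FiringStore class, and the sample actor and file names are assumptions for illustration only; the actual provenance store may look quite different.

```python
import json
import sqlite3
from typing import Any, Optional


class FiringStore:
    """Records (actor, inputs, outputs) for each firing and looks them up later."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS firings ("
            "actor TEXT, inputs TEXT, outputs TEXT, "
            "PRIMARY KEY (actor, inputs))")

    @staticmethod
    def _key(inputs: Any) -> str:
        # Canonical JSON so that key order does not affect the lookup
        return json.dumps(inputs, sort_keys=True)

    def record(self, actor: str, inputs: Any, outputs: Any) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO firings VALUES (?, ?, ?)",
            (actor, self._key(inputs), json.dumps(outputs)))
        self.db.commit()

    def lookup(self, actor: str, inputs: Any) -> Optional[Any]:
        row = self.db.execute(
            "SELECT outputs FROM firings WHERE actor = ? AND inputs = ?",
            (actor, self._key(inputs))).fetchone()
        return None if row is None else json.loads(row[0])


store = FiringStore()
store.record("Isosurface",
             {"field": "volume.nrrd", "isovalue": 0.5},
             {"mesh": "iso_0.5.obj"})
hit = store.lookup("Isosurface", {"isovalue": 0.5, "field": "volume.nrrd"})
miss = store.lookup("Isosurface", {"field": "volume.nrrd", "isovalue": 0.7})
```

A hit means the actor does not need to be re-fired for those inputs; a miss (None) means it does. Passing a file path instead of ":memory:" would let the records survive between workflow runs.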
28Other uses for provenance data
- Recreate results
- Recover from a system failure
- Checkpoint a workflow
- Create semantic links
29Future work
- Continue work on Smart Re-runs system
- Help workflow users integrate SCIRun with their workflows
- Get provenance framework checked into Ptolemy CVS
- Work on other provenance issues
- Help SCIRun users take advantage of workflow technology
- Develop CCA to Kepler bridging mechanisms