Ilkay Altintas - PowerPoint PPT Presentation

1
KEPLER Collaboration for Scientific Workflows and
ROADNet
  • Ilkay Altintas
  • Lead, Scientific Workflow Automation Technologies
    Laboratory
  • SDSC Project Manager, Kepler Scientific Workflow
    Project
  • San Diego Supercomputer Center, UCSD

2
Cyberinfrastructure Needs
  • Goal is for NSF facilities to provide capability
    over the whole space
  • SDSC HEC DATA

Cyberinfrastructure is the coordinated
aggregate of software, hardware and other
technologies, as well as human expertise,
required to support current and future
discoveries in science and engineering.
3
Scientific Workflow Systems are a Glue
  • Tools to combine different CI technologies
  • Mission of scientific workflow systems
  • Promote scientific discovery by providing
    tools and methods to generate scientific
    workflows
  • Create a generic customizable graphical user
    interface for scientists from different
    scientific domains
  • Support computational experiment creation,
    execution, sharing, reuse and provenance
  • Design frameworks which define efficient ways to
    connect to the existing data and integrate
    heterogeneous data from multiple resources
  • Bring CI to the user's monitor!

4
SWF Systems Requirements (1/2)
  • It should work (no kidding!)
  • USER REQUIREMENTS
  • Design tools, especially for non-expert users
  • Ease of use: a fairly simple user interface with
    more complex features hidden in the background
  • Reusable generic features
  • Generic enough to serve different communities
    but specific enough to serve one domain (e.g.
    geosciences)
  • Extensibility for the expert user: almost a
    visual programming interface
  • Registration and publication of data products and
    process products (workflows), with provenance

5
SWF Systems Requirements (2/2)
  • TECHNICAL REQUIREMENTS
  • Error detection and recovery from failure
  • Logging information for each workflow
  • Allow data-intensive and compute-intensive tasks
  • (Maybe at the same time)
  • HPC Data management/integration
  • Allow status checks and on the fly updates
  • Remote execution
  • Visualization
  • Semantics, metadata based data access
  • Certification, trust, security

6
Kepler is a Scientific Workflow System
www.kepler-project.org
  • and a cross-project collaboration
  • Latest alpha release out last week!
  • Builds upon the open-source Ptolemy II framework

7
Kepler is a Team Effort
Griddles
SKIDL
Resurgence
SRB
Cypres
NLADR
Contributor names and funding info are at the
Kepler website!!
New contributors: Cheshire (UK Text Mining
Center), SCEC
LOOKING
8
Strategic Plan/Position
  • A multi-project, multi-institution,
    multi-national collaboration driven by
    application pull from each project
  • Development principle
  • Define your requirements
  • Reuse existing development if possible
  • Extend features if needed
  • Add new components if they don't exist
  • Merge features if they can be
    generalized.
  • Create a core of production-quality programmers
    who share experiences via online media like
    mailing lists, IRC, wiki, shared code and
    documents (Currently 24 developers, 10 active)
  • Develop methodology for scientific software and
    workflow development
  • Make your community happy

9
Kepler Software Practice
  • Joint CVS
  • Open-source (BSD)
  • Website and wiki
  • Communications
  • Busy IRC channel
  • Mailing lists
  • Kepler-dev
  • Kepler-users
  • 6-monthly hackathons

10
A co-development in KEPLER: GEON Dataset
Generation and Registration
Makefile > ant run
SQL database access (JDBC)
Matt, Chad, Dan et al. (SEEK)
Ilkay (SDM)
Efrat (GEON)
Yang (Ptolemy)
Xiaowen (SDM)
Edward et al. (Ptolemy)
11
Actors are the Processing Components
  • Actor
  • Encapsulation of parameterized actions
  • Interface defined by ports and parameters
  • Port
  • Communication between input and output data
  • Without call-return semantics
  • Model of computation
  • Communication semantics among ports
  • Flow of control
  • Implementation is a framework
  • Examples
  • Simulink (from The MathWorks)
  • LabVIEW (from National Instruments)
  • Easy 5x (from Boeing)
  • ROOM (Real-time object-oriented modeling)
  • ADL (Wright)
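
The actor and port ideas above can be illustrated with a minimal Python sketch. It is not Kepler or Ptolemy II code: the `Port` and `Scale` classes and their methods are hypothetical names chosen for illustration. The point it tries to show is that an actor's interface consists only of ports and parameters, and that sending a token on a port is one-way, with no return value coming back to the sender.

```python
from collections import deque


class Port:
    """A named port; output ports push tokens into connected input ports."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()   # tokens waiting on an input port
        self.receivers = []    # input ports fed by this output port

    def connect(self, other):
        self.receivers.append(other)

    def send(self, token):
        # One-way message passing: nothing is returned to the sender.
        for receiver in self.receivers:
            receiver.queue.append(token)


class Scale:
    """Hypothetical actor: multiplies each input token by a parameter."""

    def __init__(self, factor):
        self.factor = factor          # parameter
        self.input = Port("input")    # ports form the rest of the interface
        self.output = Port("output")

    def fire(self):
        # Consume every waiting input token and emit the scaled value.
        while self.input.queue:
            self.output.send(self.input.queue.popleft() * self.factor)


scale = Scale(factor=3)
sink = Port("sink")
scale.output.connect(sink)
for value in (1, 2, 3):
    scale.input.queue.append(value)
scale.fire()
result = list(sink.queue)
print(result)
```

In this sketch the actor never decides when `fire` is called; that decision is left to whatever drives the model, which is the role the next slide assigns to directors.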

Actor-Oriented Design
12
Directors are the WF Engines that
  • Implement different computational models
  • Define the semantics of
  • execution of actors and workflows
  • interactions between actors
  • Ptolemy and Kepler are unique in combining
    different execution models in heterogeneous
    models!
  • Kepler is extending Ptolemy directors with
    specialized ones for web service based workflows
    and distributed workflows.
  • Process Networks
  • Rendezvous
  • Publish and Subscribe
  • Continuous Time
  • Finite State Machines
  • Dataflow
  • Time Triggered
  • Synchronous/reactive model
  • Discrete Event
  • Wireless
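
As a rough illustration of how a director separates execution semantics from the actors themselves, the Python sketch below runs the same two-actor pipeline under two hypothetical directors: a sequential, dataflow-style one and a process-network-style one with one thread per actor and FIFO queues between them. All names (`Actor`, `sequential_director`, `process_network_director`) are assumptions made for this example, not Ptolemy II or Kepler classes.

```python
import queue
import threading


class Actor:
    """Hypothetical actor wrapping a function of one input token."""

    def __init__(self, func):
        self.func = func


def sequential_director(actors, tokens):
    """Dataflow style: fire the actors in pipeline order for each token."""
    results = []
    for token in tokens:
        for actor in actors:
            token = actor.func(token)
        results.append(token)
    return results


_STOP = object()  # sentinel marking the end of a stream


def process_network_director(actors, tokens):
    """Process-network style: one thread per actor, linked by FIFO queues."""
    channels = [queue.Queue() for _ in range(len(actors) + 1)]

    def run(actor, inbox, outbox):
        while True:
            token = inbox.get()  # blocking read
            if token is _STOP:
                outbox.put(_STOP)
                return
            outbox.put(actor.func(token))

    threads = [
        threading.Thread(target=run, args=(a, channels[i], channels[i + 1]))
        for i, a in enumerate(actors)
    ]
    for t in threads:
        t.start()
    for token in tokens:
        channels[0].put(token)
    channels[0].put(_STOP)

    results = []
    while True:
        token = channels[-1].get()
        if token is _STOP:
            break
        results.append(token)
    for t in threads:
        t.join()
    return results


pipeline = [Actor(lambda x: x + 1), Actor(lambda x: x * 10)]
seq = sequential_director(pipeline, [1, 2, 3])
pn = process_network_director(pipeline, [1, 2, 3])
print(seq, pn)
```

The actors are unchanged between the two runs; only the director differs. For this simple pipeline both directors produce the same output stream, but they schedule the work very differently.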

13
Vergil is the GUI for Kepler
Actor Search
Data Search
  • Actor ontology and semantic search for actors
  • Search -> Drag and drop -> Link via ports
  • Metadata-based search for datasets

14
Actor Search
  • Kepler Actor Ontology
  • Used in searching actors and creating conceptual
    views (folders)
  • Currently 160 Kepler actors added!

15
Some actors in place for
  • Generic Web Service Client and Web Service
    Harvester
  • Customizable RDBMS query and update
  • Command Line wrapper tools
  • Some Grid actors: Globus Job Runner,
    GridFTP-based file access, Proxy Certificate
    Generator
  • SRB support
  • Native R support
  • Interaction with Nimrod and APST
  • Communication with ORBs through actors and
    services
  • Imaging, Gridding, Vis Support
  • Textual and Graphical Output
  • more generic and domain-oriented actors

16
Data Search and Usage of Results
  • Kepler DataGrid
  • Discovery of data resources through local and
    remote services
  • SRB,
  • Grid and Web Services,
  • Database connections
  • Registry of datasets on the fly using workflows

17
Promoter Identification Workflow
20
Enter initial inputs, Run and Display results
21
Custom Output Visualizer
22
Kepler and ROADNet
  • Interaction with ORB
  • To handle different data packets
  • convert them into Kepler objects
  • textually and graphically display information
  • plot, visualize, and monitor data values
  • QA and QC of data using user constraints
  • Distribution in ROADNet stack
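
The packet-handling steps listed above (convert packets into workflow objects, then apply user constraints for QA/QC) might look roughly like the following Python sketch. The packet layout, the `Sample` class, and the constraint table are all hypothetical; the real ORB packet formats and the Kepler actors that read them are not described in these slides.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical workflow-side object built from one raw packet."""
    source: str
    time: float
    value: float


def to_sample(packet):
    """Convert a raw packet (assumed here to be a plain dict) into a Sample."""
    return Sample(packet["source"], float(packet["time"]), float(packet["value"]))


def check(sample, constraints):
    """Return True if the value lies inside the user-supplied range for its source."""
    low, high = constraints.get(sample.source, (float("-inf"), float("inf")))
    return low <= sample.value <= high


# User constraints: allowed value range per source (made-up numbers).
constraints = {"temp_sensor": (-40.0, 60.0)}
packets = [
    {"source": "temp_sensor", "time": 0.0, "value": "21.5"},
    {"source": "temp_sensor", "time": 1.0, "value": "999"},
    {"source": "wind_sensor", "time": 1.0, "value": "4.2"},
]
samples = [to_sample(p) for p in packets]
flags = [check(s, constraints) for s in samples]
for s, ok in zip(samples, flags):
    print(s.source, s.time, s.value, "ok" if ok else "REJECTED")
```

Sources without a configured constraint are accepted unchanged in this sketch; a stricter policy would reject or flag them instead.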

24
Data Packet Handling -- Streaming
26
Coming soon in Kepler
  • MORE INFRASTRUCTURE TO SUPPORT SCIENCE!
  • Full support for distributed execution
  • Plug-in Kepler archives and better versioning
    support
  • Semantic and hybrid typed actors and workflow
    construction
  • Portal support and registration of products
  • Support on process and data provenance
  • Standardization of data interfaces
  • Integration with SCIRun and SDSC vis modules
  • Documentation of generated products in addition
    to the existing manuals and documentation

27
Hot Topics in Kepler Development
28
What can LOOKING get from Kepler?
  • Streaming applications generated by visual
    programming interface
  • Resource search and usage in analysis workflows
  • Deployment of control and analysis workflows as
    web services
  • Easy archiving of and access to data through
    actors
  • 24X7 runs of analysis and control tasks on
    identified servers
  • Use it as a web service composition tool for RAD
  • Can make use of different computation models in
    different layers!

29
Questions and System Demonstration
Thanks!
Ilkay Altintas altintas_at_sdsc.edu +1 (858)
822-5453 http://www.sdsc.edu
30
Examples of Models of Computation
  • Dataflow
  • Connections represent data streams
  • Actors compute their output data streams from
    input streams
  • Useful for designing signal processing algorithms
    and sampled control laws
  • Time Triggered
  • Follow the principle of global progress of time
  • Strong composability, diagnosability, and formal
    analysis
  • Synchronous/reactive model
  • Stimulated by events from the environment, but
    responds instantaneously
  • Excellent for applications with concurrent and
    complex control logic
  • Discrete Events
  • Actors share a global notion of time and
    communicate through events placed on a
    continuous time line
  • Used in modeling hardware and software timing
    properties, communication networks, and queuing
    systems
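
To make the discrete-event model above more concrete, here is a minimal Python sketch of an event queue ordered by a shared global time. It is an illustration under stated assumptions, not the Ptolemy II DE director: the `simulate` function, the handler table, and the arrive/depart queueing scenario are all invented for this example.

```python
import heapq


def simulate(initial_events, handlers, until):
    """Process (time, name) events in global time order up to `until`."""
    pending = list(initial_events)
    heapq.heapify(pending)
    log = []
    while pending and pending[0][0] <= until:
        time, name = heapq.heappop(pending)
        log.append((time, name))
        # A handler may schedule events in the future, never in the past.
        for delay, new_name in handlers.get(name, lambda t: [])(time):
            heapq.heappush(pending, (time + delay, new_name))
    return log


# Hypothetical queueing example: each arrival departs 1.5 time units later.
handlers = {"arrive": lambda t: [(1.5, "depart")]}
log = simulate([(0.0, "arrive"), (1.0, "arrive")], handlers, until=10.0)
print(log)
```

Because every event carries a time stamp on the same time line, the order of processing is determined by the stamps rather than by the order in which events were created.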

31
Examples of Models of Computation (Cont.)
  • Process Networks
  • Asynchronous communication between processes
  • Excellent for signal processing
  • Difficult to interoperate with models including
    notion of time
  • Rendezvous
  • Synchronous communication between processes or
    threads
  • Excellent for applications where resource sharing
    is a key element
  • Poor in maintaining determinacy
  • Difficult to interoperate with models including
    notion of time
  • Publish and Subscribe
  • Connections are event streams
  • Components produce or consume events
  • Good for distributed applications
  • Continuous Time
  • Connections carry continuous-time signals
  • Actors denote the relations among these signals
  • Used in control system design for modeling
    physical dynamics and continuous control laws
  • Finite State Machines
  • Components are called states or nodes
  • Connections represent transitions, i.e. transfers
    of control between states
  • Sequential execution
  • Excellent for describing control logic
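
The finite state machine model described above can be sketched in a few lines of Python. The states, events, and the `run_fsm` helper are hypothetical and only illustrate the idea that components are states, connections are transitions, and execution is sequential.

```python
# Transition table: (state, event) -> next state. All names are made up.
TRANSITIONS = {
    ("idle", "start"): "running",
    ("running", "pause"): "paused",
    ("paused", "resume"): "running",
    ("running", "finish"): "done",
}


def run_fsm(initial, events):
    """Apply events one at a time; unknown (state, event) pairs keep the state."""
    state = initial
    trace = [state]
    for event in events:
        state = TRANSITIONS.get((state, event), state)
        trace.append(state)
    return trace


trace = run_fsm("idle", ["start", "pause", "finish", "resume", "finish"])
print(trace)
```

Note that the first "finish" arrives while the machine is paused and is ignored, which is one possible policy; a stricter design could raise an error on undefined transitions.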