Karma Provenance Framework v2: Provenance Challenge Workshop/GGF18

1
Karma Provenance Framework v2
Provenance Challenge Workshop/GGF18
  • Yogesh L. Simmhan
  • Beth Plale, Dennis Gannon, Srinath Perera
  • Indiana University

2
Outline
  • Architecture of Karma
  • Workflow Setup and Collecting Provenance
  • Provenance Traces
  • Canonical Challenge Queries
  • Suggested Variations

3
Provenance Collection: Challenges and Uses
  • Linked Environments for Atmospheric Discovery
    (LEAD) project
  • Weather and Severe Storm Prediction Applications
  • Provenance on workflow (process) and data
    products at fine granularity
  • Dynamic, long-running workflows
  • Helps scientists to search for workflows and
    data products, track workflow execution, and
    analyze and mine data products from runs

4
Karma Provenance Framework
  • Lightweight: do not duplicate existing metadata
    cataloging effort
  • myLEAD: personal metadata catalog
  • ResCat service: data registry
  • Glue to integrate metadata on data services
    with runtime workflow information
  • Scalability [1]: 500 users, 100s of workflows,
    10,000s of data products

[1] Performance Evaluation of the Karma
Provenance Framework, Simmhan, Y., et al., IPAW,
2006
5
Karma Architecture [2]
[Figure: the workflow engine orchestrates a
workflow instance made up of Service 1 through
Service 10; 10 data products are consumed (10C)
and produced (10P) by each service.]

[2] A Framework for Collecting Provenance in
Data-Centric Scientific Workflows, Simmhan, Y.,
et al., submitted to ICWS Conference, 2006
6
Provenance Challenge Workflow
  • Applications modeled as web-services
  • GFac toolkit creates service for command-line
    applications
  • Service invokes a shell-script wrapper of the
    application, passing command-line arguments
  • Created services automatically instrumented to
    generate provenance using Karma client library
  • Workflow composed as GPEL script
  • XBaya Workflow composer GUI
  • Central GPEL workflow engine orchestrates
    execution

GPEL: Grid Process Execution Language, an extension
of the Business Process Execution Language (BPEL)
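
The wrapping of a command-line application with provenance reporting can be pictured roughly as follows. This is only a sketch under stated assumptions: `run_with_provenance`, the `notify` callback, the `EchoService` name and the `InvocationFinished` event are all hypothetical, and the real instrumentation is generated by GFac using the Karma client library, whose API is not shown in the slides. Only the `ServiceInvoked` notification type appears in the deck.

```python
import subprocess
import sys
from datetime import datetime, timezone


def _now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def run_with_provenance(service_id, argv, notify):
    """Run a command-line application, reporting provenance events through notify."""
    notify({"type": "ServiceInvoked", "service": service_id,
            "args": list(argv), "time": _now()})
    result = subprocess.run(argv, capture_output=True, text=True)
    status = "completed" if result.returncode == 0 else "failed"
    # "InvocationFinished" is a made-up event name for this sketch.
    notify({"type": "InvocationFinished", "service": service_id,
            "status": status, "time": _now()})
    return result


events = []  # stands in for the provenance service
result = run_with_provenance(
    "EchoService", [sys.executable, "-c", "print('done')"], events.append)
```

Passing the sink as a callback keeps the wrapper independent of how notifications are actually delivered.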
7
Provenance Challenge Workflow
8
Provenance Traces
  • Data Provenance: getRecursiveDataProvenance
  • What (ID), where (URL), when (Timestamp)
  • How (Process, inputs)

9
Provenance Traces
  • Process Provenance: getProcessProvenance
  • What (ID), when (Timestamp), who (Invoker)
  • State (execution/completion status)
  • Input and output data products

10
Provenance Traces
  • Workflow Trace: getWorkflowTrace
  • What (ID), when (Timestamp), who (Invoker)
  • State (execution/completion status)
  • Process provenance of workflow steps
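
The fields listed on the three Provenance Traces slides can be pictured as plain records. The sketch below is illustrative only: the class and field names are assumptions chosen to mirror the bullets, not Karma's actual schema or XML format.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class DataProvenance:
    """What (ID), where (URL), when (timestamp), how (process, inputs)."""
    data_id: str
    url: str
    timestamp: datetime
    produced_by: Optional[str] = None  # process that created the data
    using_data: List[str] = field(default_factory=list)  # input data IDs


@dataclass
class ProcessProvenance:
    """What (ID), when, who (invoker), state, input and output data."""
    process_id: str
    timestamp: datetime
    invoker: str
    state: str  # execution/completion status
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


@dataclass
class WorkflowTrace:
    """What (ID), when, who (invoker), state, plus provenance of each step."""
    workflow_id: str
    timestamp: datetime
    invoker: str
    state: str
    steps: List[ProcessProvenance] = field(default_factory=list)
```

A workflow trace then simply nests the process provenance of its steps, which matches the last bullet above.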

11
(No Transcript)
12
Provenance Challenge Queries
  • Direct: answered by Karma Service API directly
  • Client: answered by Karma Service API, with
    post-processing by client
  • DB: answered by access to backend DB (SQL)
  • None: not answered

Query   1       2       3       4   5  6  7  8  9
Result  Direct  Client  Direct  DB  ?  ?  ?  ?  None
13
Provenance Challenge Queries Q1
  • Find everything that caused Atlas X Graphic to be
    as it is
  • Answered by Karma Service API directly
  • This is the recursive data provenance of the
    Atlas X Graphic file
  • A call to
    getRecursiveDataProvenance(
    lead:uuid:1157946992-atlas-x.gif)
    returns this: www
14
Provenance Challenge Queries Q2
  • Find the process that led to Atlas X Graphic,
    excluding all prior to softmean
  • Answered by Karma Service API, with
    post-processing by client
  • First call getDataProvenance
  • Then recursively get data provenance until
    SoftmeanService is seen
  • Returns this: www

1. let dataList = ['lead:uuid:1157946992-atlas-x.gif']
2. while (dataList != empty) do
     // get data provenance for this level
     a. dataProvenance = karma.getDataProvenance(dataList[0])
     // print process information and remove data from list
     b. print dataProvenance; dataList.delete(0)
     c. if (dataProvenance.getProducedBy() == 'SoftmeanService')
          break // found Softmean. Stop.
     // get input data used by this data and recurse up the tree
     d. foreach (inputData in dataProvenance.getUsingData()) do
          i. dataList.add(inputData)
3. end
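
The client-side post-processing above might look like this in Python. The method names `getDataProvenance`, `getProducedBy` and `getUsingData` are taken from the pseudocode; the `StubKarma` and `StubDataProvenance` classes, the shortened data IDs and the `ConvertService` and `SlicerService` names are hypothetical stand-ins so that the sketch runs without a Karma service.

```python
from collections import deque


class StubDataProvenance:
    """Hypothetical record exposing the two accessors used by the pseudocode."""

    def __init__(self, produced_by, using_data):
        self._produced_by = produced_by
        self._using_data = using_data

    def getProducedBy(self):
        return self._produced_by

    def getUsingData(self):
        return list(self._using_data)


class StubKarma:
    """Hypothetical stand-in for the Karma client; only one call is modelled."""

    def __init__(self, records):
        self._records = records

    def getDataProvenance(self, data_id):
        return self._records[data_id]


def provenance_until(karma, start_id, stop_service="SoftmeanService"):
    """Walk up the data provenance, stopping once stop_service is seen."""
    visited = []
    data_list = deque([start_id])
    while data_list:
        provenance = karma.getDataProvenance(data_list.popleft())
        visited.append(provenance.getProducedBy())
        if provenance.getProducedBy() == stop_service:
            break  # found Softmean; nothing earlier is wanted
        data_list.extend(provenance.getUsingData())
    return visited


records = {
    "atlas-x.gif": StubDataProvenance("ConvertService", ["atlas-x.pgm"]),
    "atlas-x.pgm": StubDataProvenance("SlicerService", ["atlas.img"]),
    "atlas.img": StubDataProvenance("SoftmeanService", ["resliced1.img"]),
}
steps = provenance_until(StubKarma(records), "atlas-x.gif")
```

Because the walk stops at the first softmean record, the inputs to softmean are never requested, which is what the query asks for.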
15
Provenance Challenge Q4
  • Find all invocations of align_warp (with
    parameter "-m 12") that ran on a Monday
  • Answered by access to backend DB (SQL)
  • Use SQL query to get matching invocations
  • Call getProcessProvenance to get description of
    align_warp
  • Returns this: www

SELECT invokee.workflow_id, invokee.service_id,
       invokee.workflow_node_id, invokee.workflow_timestep,
       invoker.workflow_id, invoker.service_id,
       invoker.workflow_node_id, invoker.workflow_timestep
FROM invocation_state_table invocation,
     entity_table invokee, entity_table invoker,
     notification_table notifications
WHERE invokee.entity_id = invocation.invokee_id
  AND invoker.entity_id = invocation.invoker_id
  AND notifications.source_id = invocation.invokee_id
  AND notifications.notification_type = 'ServiceInvoked'
  AND invokee.service_id =
      'urn:qname:http://www.extreme.indiana.edu/karma/challenge06:AlignWarpService'
  AND notifications.notification_xml LIKE
      '%<ModelMenuNumber>12</ModelMenuNumber>%'
  AND DayOfWeek(invocation.request_receive_time) = 2
      -- 1 = Sunday, 2 = Monday, ...
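
The shape of this query can be tried out against a throwaway SQLite database. The sketch below is a simplification under stated assumptions: the tables keep only the columns needed for the filter, the invoker join is dropped, the service ID is shortened to `AlignWarpService`, and the three rows are made up. SQLite has no `DayOfWeek` function, so `strftime('%w', ...)` is swapped in; it numbers days from 0 = Sunday, so Monday is 1 rather than 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity_table (entity_id INTEGER PRIMARY KEY, service_id TEXT);
CREATE TABLE invocation_state_table (invokee_id INTEGER, request_receive_time TEXT);
CREATE TABLE notification_table (
    source_id INTEGER, notification_type TEXT, notification_xml TEXT);
""")

# Three made-up align_warp invocations: (entity ID, receive time, -m value).
# 2006-09-11 was a Monday and 2006-09-12 a Tuesday.
invocations = [
    (1, "2006-09-11 10:00:00", 12),  # Monday, -m 12: should match
    (2, "2006-09-12 10:00:00", 12),  # Tuesday: wrong day
    (3, "2006-09-11 11:00:00", 10),  # Monday, but -m 10: wrong parameter
]
for entity_id, received, model in invocations:
    conn.execute("INSERT INTO entity_table VALUES (?, 'AlignWarpService')", (entity_id,))
    conn.execute("INSERT INTO invocation_state_table VALUES (?, ?)", (entity_id, received))
    conn.execute(
        "INSERT INTO notification_table VALUES (?, 'ServiceInvoked', ?)",
        (entity_id, f"<ModelMenuNumber>{model}</ModelMenuNumber>"),
    )

rows = conn.execute("""
SELECT invokee.entity_id
FROM invocation_state_table invocation,
     entity_table invokee,
     notification_table notifications
WHERE invokee.entity_id = invocation.invokee_id
  AND notifications.source_id = invocation.invokee_id
  AND notifications.notification_type = 'ServiceInvoked'
  AND invokee.service_id = 'AlignWarpService'
  AND notifications.notification_xml LIKE '%<ModelMenuNumber>12</ModelMenuNumber>%'
  AND strftime('%w', invocation.request_receive_time) = '1'  -- 0=Sunday, 1=Monday
ORDER BY invokee.entity_id
""").fetchall()
print(rows)
```

With this sample data only entity 1 satisfies both the parameter and the day-of-week conditions.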
16
Provenance Challenge Q9
  • Find all the graphical atlas sets that have
    metadata annotation studyModality with values
    speech, visual or audio, and return all other
    annotations to these files.
  • Not answered
  • We do not expect to answer such queries through
    the provenance system
  • We push the provenance information to external
    metadata management systems such as myLEAD, which
    can answer such join queries on data product
    metadata and provenance

17
Variations of Workflow
  • Workflows with loops
  • Workflows whose structure changes dynamically
  • or, as a simpler case, workflows with conditional
    branches
  • Hierarchical composition of workflows
  • workflows invoking other workflows

18
Variations of Queries
  • Find all workflows and processes with a
    particular execution status: completed, failed,
    waiting for input
  • Show the client view and service view of the
    provenance and check for differences

19
Acknowledgements
  • Alek Slominski (GPEL Engine)
  • Satoshi Shirasuna (XBaya Composer)
  • LEAD Members
  • NSF
  • Questions
  • www.extreme.indiana.edu/karma

20
Sample Activities Published
  • More here www

21
Karma DB Schema