Query Optimization over Web Services - PowerPoint PPT Presentation

Provided by: jennif280
Transcript and Presenter's Notes

Title: Query Optimization over Web Services


1
Query Optimization over Web Services
  • Utkarsh Srivastava
  • Jennifer Widom
  • Kamesh Munagala
  • Rajeev Motwani

2
Performance Numbers
[Figure: Relative Contribution to Research. y-axis: Percent Contribution (0 to 100); x-axis: Time in Program (years, 0 to 5); annotation: This Work]
3
Future Directions (sample)
  • Web services with monetary cost
  • Web services with unstable response times
  • (QoS guarantees?)
  • Multiple web services for same data
  • Caching web-service query results
  • More expressive queries, also workflows
  • Web service profiling and statistics-tracking

4
First Steps in Big Problem
New Query Optimization Problem
5
Web Services
  • Standardized way of sharing data and functionality
  • Description and discovery
  • Communication

[Diagram: web services share data and functionality with users/clients; WSDL and UDDI handle description and discovery, SOAP handles communication]
6
Example Web Services
  • WS1 (Reuters): stock symbol → company info
  • WS2 (NASDAQ): stock symbol → stock activity
7
Querying Across Web Services
Get info about all companies with high-activity stock
[Diagram: user/client submits a query and receives results; the query spans WS1 (Reuters: stock symbol → company info) and WS2 (NASDAQ: stock symbol → stock activity)]
  • Easy
  • Transparent
  • Efficient
  • Etc.

8
Same Basic Goal as Traditional DBMS
[Diagram: user/client submits a query through a declarative interface to a Database Management System holding the data, and receives results]
  • Easy
  • Transparent
  • Efficient
  • Etc.

9
Web Service Management System
  • Easy
  • Transparent
  • Efficient
  • Etc.

10
WSMS Architecture
  • WSMS exposes a declarative interface to the client and issues WS invocations to WS1, WS2, ..., WSn
  • Client: sends query and input data; receives results
  • Metadata Component: schema mapper, web service registration
  • Query Processing Component: plan selection, plan execution
  • Profiling and Statistics Component: statistics tracker, response-time profiler
11
Running Example
  • Credit card company wants to send offers to people with:
  • credit rating > 600, and
  • good payment history on prior credit card
  • Company has at its disposal:
  • L: list of potential recipients (identified by SSN)
  • WS1: SSN → credit rating
  • WS2: SSN → cc number(s)
  • WS3: cc number → payment history

12
Plan 1
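Query plan (run inside the WSMS on behalf of the client):
  1. Input L(SSN): SSNs 1, 2
  2. WS1 (SSN → cr) returns (SSN, cr): (1, 500), (2, 700)
  3. Filter on cr, keep SSN: 2
  4. WS2 (SSN → ccn) returns (SSN, ccn): (2, 123), (2, 456)
  5. WS3 (ccn → ph) returns (SSN, ccn, ph): (2, 123, bad), (2, 456, good)
  6. Filter on ph, keep SSN: 2 is returned to the client

Note: Pipelined processing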
13
Simple Representation of Plan 1
L → WS1 (SSN → cr) → WS2 (SSN → ccn) → WS3 (ccn → ph) → Results
14
Plan 2
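Query plan (run inside the WSMS on behalf of the client), with L(SSN) = 1, 2 sent down two branches:
  1. Branch via WS1: WS1 (SSN → cr) returns (SSN, cr): (1, 500), (2, 700); filter on cr, keep SSN: 2
  2. Branch via WS2 and WS3: WS2 (SSN → ccn) returns (SSN, ccn): (2, 123), (2, 456); WS3 (ccn → ph) returns (SSN, ccn, ph): (2, 123, bad), (2, 456, good); filter on ph, keep SSN: 2
  3. Join the two branches on SSN: 2 is returned to the client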
15
Simple Representation of Plan 2
L → WS1 (SSN → cr) → Results
L → WS2 (SSN → ccn) → WS3 (ccn → ph) → Results
16
Quiz
Which plan is better?
Plan 1: L → WS1 → WS2 → WS3 → Results
Plan 2: L → WS1 → Results and L → WS2 → WS3 → Results, joined on SSN
  • Cost metric: steady-state throughput
  • Assume join is free

Plan 1 is never worse
17
Query Optimization Primer
  • Possible query plans P1, ..., Pn
  • Data/access statistics S
  • Execution cost metric cost(Pi, S)
  • GOAL: Find least-cost plan

18
Query Optimization Primer
  • Possible query plans P1, ..., Pn
  • Data/access statistics S
  • Execution cost metric cost(Pi, S)
  • GOAL: Find least-cost plan

19
Queries and Plans
  • Select-Project-Join queries over input data L and set of web services WS1, ..., WSn
  • Precedence constraints
  • Output of WSi may be needed as input for WSj
  • Ex.: WS2: SSN → ccn and WS3: ccn → ph
  • Precedence DAG defines space of query plans
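
To illustrate how precedence constraints shrink the plan space, the sketch below enumerates only the linear (chain) plans that respect a list of precedence pairs. The function name `linear_plans` and the pair encoding are assumptions made for this illustration; the plan space in the talk also contains non-linear plans such as Plan 2, which this sketch does not enumerate.

```python
from itertools import permutations

def linear_plans(services, precedes):
    """Yield every ordering of `services` in which a comes before b
    for each (a, b) pair in `precedes`."""
    for order in permutations(services):
        position = {ws: k for k, ws in enumerate(order)}
        if all(position[a] < position[b] for a, b in precedes):
            yield order

# Running example: WS3 needs the ccn produced by WS2.
plans = list(linear_plans(["WS1", "WS2", "WS3"], [("WS2", "WS3")]))
for plan in plans:
    print(" -> ".join(plan))
```

Of the six possible orderings of three services, only those with WS2 ahead of WS3 survive.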

20
Query Optimization Primer
  • Possible query plans P1, ..., Pn
  • Data/access statistics S
  • Execution cost metric cost(Pi, S)
  • GOAL: Find least-cost plan

21
Statistics
  1. Web service response times
  2. Web service selectivities

New Query Optimization Problem
22
Statistics: Response Times
  • ri = per-tuple response time of WSi from client

[Diagram: client sends SSN to WS1 (SSN → cr) and receives cr; elapsed time r1]
  • ri = 1/throughput; can be reduced by batching (see paper) and parallel calls
  • Assume independent response times within query plans

New Query Optimization Problem
23
Statistics: Selectivities
  • si = selectivity of WSi
  • Average output tuples per input tuple to WSi, including post-filtering in query plan
  • WS1: SSN → cr, filter cr > 600
  • If 90% of SSNs have cr > 600 then s1 = 0.9
  • WS2: SSN → ccn
  • If on average each SSN has 2 credit cards then s2 = 2.0
  • Assume independent selectivities within query plans
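
As a small illustration of the definition, selectivity can be estimated as the mean number of output tuples observed per input tuple. The helper `selectivity` and the sample counts below are hypothetical; they are chosen only so that the results match the s1 = 0.9 and s2 = 2.0 figures on the slide.

```python
def selectivity(outputs_per_input):
    """Average number of output tuples per input tuple (after any post-filter)."""
    return sum(outputs_per_input) / len(outputs_per_input)

# WS1 with filter cr > 600: 9 of 10 hypothetical SSNs survive the filter.
s1 = selectivity([1, 1, 1, 1, 1, 1, 1, 1, 1, 0])
# WS2: hypothetical SSNs holding 1, 3, 2 and 2 credit cards.
s2 = selectivity([1, 3, 2, 2])
print(s1, s2)
```

A value below 1 marks a selective service and a value above 1 a proliferative one, in the terminology used later in the talk.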

New Query Optimization Problem
24
Query Optimization Primer
  • Possible query plans P1, ..., Pn
  • Data/access statistics S
  • Execution cost metric cost(Pi, S)
  • GOAL: Find least-cost plan

25
Bottleneck Cost Metric
New Query Optimization Problem
26
Bottleneck Cost Metric
Conference Lunch Buffet
Dish 1
Dish 2
Dish 3
Dish 4
Average per-tuple processing time = response time of slowest (bottleneck) stage in pipeline
Note: selectivities = 1 in this example
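
The buffet intuition can be checked with a toy simulation. The sketch below assumes each stage serves one tuple at a time, all tuples are available at time 0, and every selectivity is 1; the function `departure_times` and the four stage times are hypothetical.

```python
def departure_times(stage_times, n_tuples):
    """Time at which each tuple leaves the last stage of a pipeline in which
    every stage handles one tuple at a time and every selectivity is 1."""
    done = [0.0] * len(stage_times)  # when each stage last finished a tuple
    out = []
    for _ in range(n_tuples):
        ready = 0.0  # when the current tuple is available to the next stage
        for k, t in enumerate(stage_times):
            start = max(ready, done[k])  # wait for the tuple and for the stage
            done[k] = start + t
            ready = done[k]
        out.append(ready)
    return out

# Hypothetical per-tuple times for the four buffet dishes.
stage_times = [1.0, 4.0, 2.0, 1.0]
times = departure_times(stage_times, n_tuples=50)
gap = times[-1] - times[-2]  # steady-state time between consecutive results
print(gap, max(stage_times))
```

In steady state the gap between consecutive results equals the time of the slowest dish, regardless of how fast the other stages are.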
27
Cost Equation for Plan P
  • Ri(P) = predecessors of WSi in plan P
  • Fraction of input tuples seen by WSi: $\prod_{j \in R_i(P)} s_j$
  • WSi response time per input tuple: $\left( \prod_{j \in R_i(P)} s_j \right) r_i$
  • Bottleneck cost metric: $\mathrm{cost}(P) = \max_{1 \le i \le n} \left( \left( \prod_{j \in R_i(P)} s_j \right) r_i \right)$

(assumes WSMS processing is not the bottleneck)
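
A minimal sketch of the bottleneck metric follows: for each service, multiply the selectivities of its predecessors by its own response time, then take the maximum. The dictionary encoding of a plan and all response times are assumptions for illustration; only s1 = 0.9 and s2 = 2.0 appear in the talk.

```python
from math import prod

def bottleneck_cost(predecessors, r, s):
    """cost(P) = max over i of (product of s[j] for j in R_i(P)) * r[i].

    `predecessors[i]` is R_i(P): every service that precedes i in the plan.
    """
    return max(prod(s[j] for j in predecessors[i]) * r[i] for i in r)

# Hypothetical statistics; only s1 = 0.9 and s2 = 2.0 come from the talk.
r = {"WS1": 2.0, "WS2": 5.0, "WS3": 1.0}
s = {"WS1": 0.9, "WS2": 2.0, "WS3": 0.5}

# Plan 1 is the chain WS1 -> WS2 -> WS3.
plan1 = {"WS1": set(), "WS2": {"WS1"}, "WS3": {"WS1", "WS2"}}
print(bottleneck_cost(plan1, r, s))
```

With these numbers WS2 is the bottleneck: it sees 0.9 of the input tuples and spends 5.0 time units on each.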
28
Contrast with Sum Cost Metric
$\mathrm{cost}(P) = \sum_{1 \le i \le n} \left( \left( \prod_{j \in R_i(P)} s_j \right) r_i \right)$
  • Stream filter ordering
  • Expensive predicate placement

Polite Lunch Buffet
Dish 1
Dish 2
Dish 3
Dish 4
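
The two metrics can rank the same plans differently. The sketch below evaluates both chain orderings of two services under the max (bottleneck) and sum metrics; the service names and statistics are hypothetical, picked so that the metrics disagree.

```python
from math import prod

def per_service_costs(order, r, s):
    """For a chain plan, time each service spends per tuple of input L."""
    return [prod(s[j] for j in order[:k]) * r[ws] for k, ws in enumerate(order)]

# Hypothetical statistics for two services with no precedence constraint.
r = {"A": 4.0, "B": 3.0}
s = {"A": 0.1, "B": 0.9}

a_then_b = per_service_costs(("A", "B"), r, s)
b_then_a = per_service_costs(("B", "A"), r, s)
print("A -> B  bottleneck:", max(a_then_b), "sum:", sum(a_then_b))
print("B -> A  bottleneck:", max(b_then_a), "sum:", sum(b_then_a))
```

Here the sum metric favors running the highly selective A first, while the bottleneck metric favors running the faster B first.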
29
Problem Statement
  • Input
  • Web services WS1, ..., WSn
  • Response times r1, ..., rn
  • Selectivities s1, ..., sn
  • Precedence constraints among web services
  • Output
  • Web services arranged into a plan P
  • P respects all precedence constraints
  • cost(P) is minimized
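
For very small instances the problem statement can be checked by exhaustive search. The sketch below is not the algorithm from the paper: it assumes a plan can be encoded as a transitively closed, acyclic set of (predecessor, successor) pairs, tries every such set, and keeps the one with the lowest bottleneck cost. The function name `optimal_plan` and the response times are hypothetical, and the search is exponential in the number of services.

```python
from itertools import combinations, permutations
from math import prod

def optimal_plan(services, r, s, precedes):
    """Exhaustively search plans, each encoded as a set of (a, b) pairs meaning
    'a is a predecessor of b', and return (cost, pairs) for the cheapest one."""
    all_pairs = list(permutations(services, 2))
    best = None
    for size in range(len(all_pairs) + 1):
        for chosen in combinations(all_pairs, size):
            rel = set(chosen)
            if not set(precedes) <= rel:
                continue  # violates a precedence constraint
            if any((b, a) in rel for a, b in rel):
                continue  # cycle: a before b and b before a
            if any((a, d) not in rel for a, b in rel for c, d in rel if b == c):
                continue  # predecessor sets must be transitively closed
            cost = max(
                prod(s[a] for a, b in rel if b == ws) * r[ws] for ws in services
            )
            if best is None or cost < best[0]:
                best = (cost, rel)
    return best

# Hypothetical statistics; only s1 = 0.9 and s2 = 2.0 come from the talk.
r = {"WS1": 2.0, "WS2": 5.0, "WS3": 1.0}
s = {"WS1": 0.9, "WS2": 2.0, "WS3": 0.5}
cost, pairs = optimal_plan(["WS1", "WS2", "WS3"], r, s, [("WS2", "WS3")])
print(cost, sorted(pairs))
```

With these numbers the search settles on the chain WS1, WS2, WS3, because placing the selective WS1 ahead of the slow WS2 is the only way to lower WS2's per-tuple cost.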

30
No Precedence Constraints
  • All selectivities ≤ 1
  • Theorem: Optimal to order linearly by increasing ri
  • (selectivities irrelevant)
  • General case
  • (optimal)

[Diagram: selective web services ordered by response time; proliferative web services; join at WSMS; Results]
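
One reading of the general-case diagram can be sketched as follows, assuming the proliferative services are invoked in parallel on the output of the selective chain and their results are joined at the WSMS at no cost. The function names and all statistics are hypothetical, and this arrangement is an interpretation of the slide rather than a statement of the paper's construction.

```python
from math import prod

def plan_without_precedence(r, s):
    """Return (chain, parallel): selective services (s <= 1) as a chain in
    increasing response time, and proliferative services (s > 1) to invoke
    in parallel on the chain's output before the join at the WSMS."""
    chain = sorted((ws for ws in r if s[ws] <= 1), key=lambda ws: r[ws])
    parallel = [ws for ws in r if s[ws] > 1]
    return chain, parallel

def bottleneck_cost(chain, parallel, r, s):
    """Bottleneck cost of the chain-then-parallel arrangement."""
    costs = [prod(s[j] for j in chain[:k]) * r[ws] for k, ws in enumerate(chain)]
    after_chain = prod(s[j] for j in chain)  # fraction of L leaving the chain
    costs += [after_chain * r[ws] for ws in parallel]
    return max(costs)

# Hypothetical statistics: A and B are selective, C and D are proliferative.
r = {"A": 4.0, "B": 3.0, "C": 6.0, "D": 2.0}
s = {"A": 0.1, "B": 0.9, "C": 3.0, "D": 1.5}
chain, parallel = plan_without_precedence(r, s)
print(chain, parallel, bottleneck_cost(chain, parallel, r, s))
```

Note that the chain is sorted by response time alone, which mirrors the slide's remark that selectivities are irrelevant to the ordering of selective services.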
31
With Precedence Constraints
$\mathrm{cost}(P) = \max_{1 \le i \le n} \left( \left( \prod_{j \in R_i(P)} s_j \right) r_i \right)$
32
With Precedence Constraints
$\mathrm{cost}(P) = \sum_{1 \le i \le n} \left( \left( \prod_{j \in R_i(P)} s_j \right) r_i \right)$
  • Sum cost metric
  • Hard to even obtain a factor O(n?) of optimal

33
With Precedence Constraints
$\mathrm{cost}(P) = \max_{1 \le i \le n} \left( \left( \prod_{j \in R_i(P)} s_j \right) r_i \right)$
  • Bottleneck (max) cost metric
  • Surprisingly, optimal solution in polynomial time
  • O(n^5) algorithm in paper
  • Add one WS at a time to the plan
  • WS chosen by solving a linear program

34
Example Revisited
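Plan 1: L → WS1 (SSN → cr) → WS2 (SSN → ccn) → WS3 (ccn → ph) → Results
Plan 2: L → WS1 (SSN → cr) → Results and L → WS2 (SSN → ccn) → WS3 (ccn → ph) → Results
$\mathrm{cost}(P) = \max_{1 \le i \le n} \left( \left( \prod_{j \in R_i(P)} s_j \right) r_i \right)$

[Diagram labels: Selective; Proliferative; precedence constraint between WS2 and WS3]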
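
The claim that Plan 1 is never worse can be worked out directly from the bottleneck metric. The derivation below assumes the join is free and that WS1 is selective (s1 ≤ 1, as with s1 = 0.9 in the selectivity example); in Plan 1 the predecessor sets are R1 = {}, R2 = {WS1}, R3 = {WS1, WS2}, and in Plan 2 they are R1 = {}, R2 = {}, R3 = {WS2}.

```latex
\begin{align*}
\mathrm{cost}(P_1) &= \max\left( r_1,\; s_1 r_2,\; s_1 s_2 r_3 \right) \\
\mathrm{cost}(P_2) &= \max\left( r_1,\; r_2,\; s_2 r_3 \right) \\
s_1 \le 1 \;&\Rightarrow\; s_1 r_2 \le r_2 \ \text{ and } \ s_1 s_2 r_3 \le s_2 r_3
\;\Rightarrow\; \mathrm{cost}(P_1) \le \mathrm{cost}(P_2)
\end{align*}
```

Each term in the maximum for Plan 1 is bounded by the matching term for Plan 2, so the inequality holds for any response times.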
35
Implementation
  • Built prototype WSMS query processor
  • Optimizer and execution engine
  • Assumes schema issues resolved, statistics
    provided
  • Written in Java and uses Apache Axis
    (open-source SOAP implementation)
  • Experiments (see paper) validate analytical
    results

36
Isn't Problem the Same as ...?
  • Web Service composition
  • Targeted for workflow-oriented applications
  • No provably optimal strategies
  • Parallel/distributed query optimization
  • Freedom to place query operators
  • Much larger space of execution plans
  • Data integration, mediators
  • For general sources of data
  • Optimization of total resource consumption

37
Future Directions (sample)
  • Web services with monetary cost
  • Web services with unstable response times
  • (QoS guarantees?)
  • Multiple web services for same data
  • Caching web-service query results
  • More expressive queries, also workflows
  • Web service profiling and statistics-tracking

38
Conclusion
New Query Optimization Problem
39
Conclusion
New Query Optimization Problem
Our contribution
40
Questions?
[Figure: Percent Contribution (0 to 100) versus Time in Program (years, 0 to 5)]