Partial Query-Evaluation in Internet Query Engines

Description:

Want 1998 Red BMW. No accidents. 20% avg. model price ... Red Used BMW Cars (carId, model, price, otherinfo) Not Exists (carId, model, price, otherinfo) ...

Slides: 34
Provided by: jayavelsha

Transcript and Presenter's Notes


1
Partial Query-Evaluation in Internet Query Engines
  • Jayavel Shanmugasundaram, Kristin Tufte, David DeWitt,
    David Maier, Jeffrey Naughton

University of Wisconsin, Oregon Graduate Institute
2
Outline
  • Motivation
  • Desired Operator Properties
  • Implementation Alternatives
  • Performance Evaluation
  • Conclusion

3
Querying the WWW: The Present
Who won the Nobel prize for Physics in 1999?
4
Querying the WWW: The Present
[Figure: seven HTML files]
Want 1998 Red BMW. No accidents. 20% avg. model price
5
Querying the WWW: The Future?
Want 1998 Red BMW. No accidents. 20% avg. model price
6
Inside the Internet Query Engine
[Figure: query plan over the input "Red Used BMW Cars"; edge labels: (carId, model, price, otherinfo), (carId, model, price, otherinfo)]
7
The Problem
  • Return results to users as soon as possible
  • Return the results so far for queries with blocking
    operators
  • Support arbitrary blocking operators
    (e.g., Not Exists, Average, Nest)
  • Blocking operators may occur anywhere in the query,
    potentially intermixed with non-blocking operators

8
Outline
  • Motivation
  • Desired Operator Properties
  • Implementation Alternatives
  • Performance Evaluation
  • Conclusion

9
What is a Partial Result of a Query?
  • Let the Full Result of Query Q on Inputs A and B be
    Q(A, B)
  • Then a Partial Result of Query Q on Inputs A and B
    is Q(PA, PB), where
  • PA ⊆ A
  • PB ⊆ B
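
The definition above can be illustrated with a minimal Python sketch. It assumes a hypothetical query Q that returns the cars with no accident report (in the spirit of the Not Exists example used in the talk). The function name `q` and all data values are illustrative assumptions, not taken from the slides.

```python
def q(cars, accidents):
    """Hypothetical query Q: cars whose carId has no accident report."""
    reported = set(accidents)
    return [car for car in cars if car[0] not in reported]

# Full inputs A and B (illustrative data, not from the talk).
A = [(1, "Z3", 10000), (3, "400i", 20000), (5, "400i", 30000)]
B = [3]  # carIds that have an accident report

# Partial inputs: PA is a subset of A, PB is a subset of B.
PA = A[:2]
PB = []

full_result = q(A, B)        # Q(A, B)
partial_result = q(PA, PB)   # Q(PA, PB)
```

In this sketch car 3 appears in the partial result but not in the full result, because its accident report has not arrived yet. A partial result is therefore not necessarily a subset of the full result.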

10
Maximal Output Property
  • Produce correct results as soon as possible
  • Why?
  • If query is non-blocking
  • Produces results soon
  • If query is blocking
  • Return non-blocking parts soon (e.g., outer
    join)

11
Inside the Internet Query Engine
[Figure: query plan over the inputs "Red 1998 BMW Cars" and "Accident Reports"; edge labels: (carId, model, price, otherinfo), (carId), (carId, model, price, otherinfo)]
12
Anytime Property
  • Blocking operators should be able to return the
    result so far at any time
  • Why?
  • User can request partial results at any time
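
As a rough illustration of the anytime property, the sketch below shows a blocking Average operator that keeps running state, so it can report the result so far whenever asked. The class and method names are hypothetical and are not the authors' implementation.

```python
class AnytimeAverage:
    """Blocking average operator that can report its result so far."""

    def __init__(self):
        self.total = 0
        self.count = 0

    def consume(self, price):
        # Fold one input value into the running state.
        self.total += price
        self.count += 1

    def result_so_far(self):
        # Callable at any time, not only after the input is exhausted.
        # Returns None while no input has been seen.
        return self.total / self.count if self.count else None
```

A caller can interleave `consume` and `result_so_far` freely. The value returned before the input ends is a partial result and may change as more tuples arrive.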

13
Inside the Internet Query Engine
[Figure: query plan over the inputs "Red 1998 BMW Cars" and "Accident Reports"; edge labels: (carId, model, price, otherinfo), (carId), (carId, model, price, otherinfo)]
14
Non-Monotonic Input/Output Property
  • Operators should handle changes, not just
    additions to input
  • Similarly, operators should produce changes,
    not just additions to output
  • Both blocking and non-blocking operators
  • Why?
  • Partial results may represent wrong answers
  • Need to be corrected later
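
One way to picture non-monotonic output is a Not Exists operator that tags each output tuple as an addition or a deletion. The sketch below is an assumption-laden illustration: the class name, the `("add", ...)` / `("delete", ...)` tagging, and the choice to emit changes eagerly (rather than on a partial result request) are all hypothetical simplifications.

```python
class NotExists:
    """Emits ('add', car) for cars with no matching accident so far, and
    ('delete', car) if a matching accident report arrives later."""

    def __init__(self):
        self.cars = {}             # carId -> car tuple seen so far
        self.accident_ids = set()  # carIds with an accident report

    def on_car(self, car):
        self.cars[car[0]] = car
        if car[0] in self.accident_ids:
            return []              # already known to have an accident
        return [("add", car)]

    def on_accident(self, car_id):
        first_report = car_id not in self.accident_ids
        self.accident_ids.add(car_id)
        if first_report and car_id in self.cars:
            # Retract the earlier, now incorrect, output tuple.
            return [("delete", self.cars[car_id])]
        return []
```

Downstream operators then have to accept these tagged changes as input, which is why the property applies to both blocking and non-blocking operators.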

15
Inside the Internet Query Engine
[Figure: query plan over the inputs "Red 1998 BMW Cars" and "Accident Reports"; edge labels: (carId, model, price, otherinfo), (carId), (carId, model, price, otherinfo)]
16
Flexible Input Property
  • Should be able to process data from any input at
    any time
  • Processes data as it becomes available
  • Why?
  • If query is non-blocking
  • Can return results soon
  • If query is blocking
  • Faster partial result response time
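
A symmetric hash join is one well-known operator with this flexible input behavior: it keeps a hash table per input and can accept a tuple from either side at any time. The following is a minimal in-memory sketch with hypothetical names; it ignores memory limits and is not the implementation evaluated in the talk.

```python
from collections import defaultdict

class SymmetricHashJoin:
    def __init__(self):
        # One hash table per input, keyed on the join attribute.
        self.tables = (defaultdict(list), defaultdict(list))

    def insert(self, side, key, row):
        """Process one tuple from input `side` (0 = left, 1 = right).

        The tuple is stored in its own table and probed against the
        other table, so matches are produced as soon as both sides
        of a pair have arrived, regardless of arrival order.
        """
        self.tables[side][key].append(row)
        matches = self.tables[1 - side].get(key, [])
        if side == 0:
            return [(row, other) for other in matches]
        return [(other, row) for other in matches]
```

Because neither input is treated as the build side, a slow source on one input does not stop the join from consuming the other.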

17
A Note on Partial Result Accuracy
  • Focus is on producing partial results
  • Architecture is general enough to exploit
    existing techniques
  • Online aggregation (Hellerstein et al.)
  • Nested aggregates (Tan et al.)
  • Accuracy for general blocking operators?

18
Outline
  • Motivation
  • Desired Operator Properties
  • Implementation Alternatives
  • Performance Evaluation
  • Conclusion

19
Where do we start?
  • Use known flexible-input, maximal-output operator
    implementations
  • Non-blocking: select, symmetric hash join, XJoin
  • Blocking: group-by, symmetric outer join
  • Blocking operator implementations should satisfy
    the anytime property
  • All operator implementations should satisfy
    the non-monotonic input/output property

20
Non-Monotonic Input/Output
  • Re-evaluation Approach
  • On partial result request, compute results so
    far
  • Then forget all potentially incorrect inputs
  • Differential Approach
  • On partial result request, compute results so
    far
  • Update incorrect inputs for future result
    computation

21
Inside the Internet Query Engine
[Figure: query plan over the inputs "Red 1998 BMW Cars" and "Accident Reports"; edge labels: (carId, model, price, otherinfo), (carId), (carId, model, price, otherinfo)]
22
Re-evaluation Join
(1, Z3, 10000)
(Z3, 15000)
(19, Z3, 20000)
(3, 400i, 20000)
(400i, 25000)
(5, 400i, 30000)
(3, 400i, 20000)
(19, Z3, 20000)
(Z3, 15000)
(1, Z3, 10000)
(400i, 25000)
(5, 400i, 30000)
23
Re-evaluation Join
(1, Z3, 10000)
(Z3, 15000)
(19, Z3, 20000)
(3, 400i, 20000)
(400i, 23333)
(5, 400i, 30000)
(8, 400i, 20000)
(Z3, 15000)
(8, 400i, 20000)
(400i, 23333)
24
Differential Join
(1, Z3, 10000)
(Z3, 15000)
(19, Z3, 20000)
(3, 400i, 20000)
(400i, 25000)
(5, 400i, 30000)
(3, 400i, 20000)
(19, Z3, 20000)
(Z3, 15000)
(1, Z3, 10000)
(400i, 25000)
(5, 400i, 30000)
25
Differential Join
(1, Z3, 10000)
(Z3, 15000)
(19, Z3, 20000)
(3, 400i, 20000)
(400i, 25000)
(5, 400i, 30000)
update (400i, 23333)
26
Differential Join
(1, Z3, 10000)
(Z3, 15000)
(19, Z3, 20000)
(3, 400i, 20000)
(400i, 23333)
(5, 400i, 30000)
(8, 400i, 20000)
(8, 400i, 20000)
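
The differential join slides can be read as a join that keeps its stored inputs and patches them when an update arrives. The sketch below is one possible reading, using the tuple values from the slides. The class name, the `op` tags, and the shape of the joined output are assumptions made for illustration.

```python
from collections import defaultdict

class DifferentialJoin:
    """Joins (carId, model, price) tuples with (model, avgPrice) tuples."""

    def __init__(self):
        self.cars = defaultdict(list)  # model -> car tuples
        self.avg = {}                  # model -> current average price

    def on_car(self, car):
        model = car[1]
        self.cars[model].append(car)
        if model in self.avg:
            return [("add", car + (self.avg[model],))]
        return []

    def on_average(self, op, model, avg):
        # op is "add" for a new group or "update" for a corrected one.
        # The stored value is overwritten in place, and only the joined
        # tuples of the affected model are re-emitted with the same tag.
        self.avg[model] = avg
        return [(op, car + (avg,)) for car in self.cars[model]]
```

With the slide data, the update from (400i, 25000) to (400i, 23333) touches only the 400i cars; the Z3 tuples are left alone, and the later tuple (8, 400i, 20000) joins against the corrected average.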
27
Re-evaluation vs. Differential
  • Re-evaluation Approach
  • Simple: just forget partial inputs
  • Easier to extend (no changes to tuple structure)
  • Unnecessary computation
  • Differential Approach
  • Need to handle deletions/updates of inputs
  • Changes to tuple structure
  • Re-computes only what is necessary
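
The trade-off can be sketched on the producer side with a group-by average over the car tuples from the join slides. This is a hedged illustration: the class and method names are hypothetical, and integer division is used only so the output matches the rounded figure 23333 shown on the slides.

```python
from collections import defaultdict

class AveragePerModel:
    """Group-by average over (carId, model, price) tuples."""

    def __init__(self):
        self.sums = defaultdict(lambda: [0, 0])  # model -> [total, count]
        self.last_sent = {}                      # model -> average last emitted

    def consume(self, car):
        _, model, price = car
        self.sums[model][0] += price
        self.sums[model][1] += 1

    def averages(self):
        # Integer average per model (assumption, to match the slides).
        return {m: total // count for m, (total, count) in self.sums.items()}

    def partial_reevaluation(self):
        # Re-send every group; the consumer is assumed to have
        # forgotten all earlier partial tuples.
        return sorted(self.averages().items())

    def partial_differential(self):
        # Send only the groups that are new or whose value changed.
        changes = []
        for model, avg in sorted(self.averages().items()):
            if model not in self.last_sent:
                changes.append(("add", (model, avg)))
            elif self.last_sent[model] != avg:
                changes.append(("update", (model, avg)))
            self.last_sent[model] = avg
        return changes
```

After the tuple (8, 400i, 20000) arrives, the re-evaluation call re-sends both groups, including the unchanged (Z3, 15000), while the differential call sends a single update for 400i. The price of the differential version is the extra `last_sent` state and the tagged tuple format.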

28
Outline
  • Motivation
  • Desired Operator Properties
  • Implementation Alternatives
  • Performance Evaluation
  • Conclusion

29
Response Time
30
Outline
  • Motivation
  • Desired Operator Properties
  • Implementation Alternatives
  • Performance Evaluation
  • Conclusion

31
Conclusion
  • New properties for query engine operators
  • Operator implementation alternatives
  • Re-evaluation
  • Differential
  • Evaluation
  • Partial results improve response time
  • Re-evaluation approach is simpler
  • Differential approach is more efficient

32
Future Work
  • General GUI
  • Partial result accuracy for general blocking
    operators
  • Changes at finer granularities
  • Consistent partial results

33
Related Work
  • Online aggregation (Hellerstein et al.)
  • Nested aggregates (Tan et al.)
  • Online reordering (Raman et al.)
  • Symmetric hash join (Wilschut et al.)
  • Adaptive operators (Ives et al.)
  • XJoin (Urhan et al.)