Adaptive Execution of Variable-Accuracy Functions

1
Adaptive Execution of Variable-Accuracy
Functions

Matt Denny - UC Berkeley/Fred Alger, Inc.
Michael Franklin - UC Berkeley
  • VLDB Conference
  • Seoul
  • September 2006

2
Introduction
  • Many applications apply expensive functions to
    streams of data
  • Finance: real-time market monitoring with
    securities models
  • Power Management: overload prediction using
    current weather conditions
  • Supply Chain Management: inventory models using
    RFID data to find shortages in real time

3
Continuous Queries w/ UDFs
4
The Problem
  • Analytical functions can be expensive!
  • minutes or hours per data point.
  • Query processor has no control over execution of
    individual function calls.
  • UDF API is a Black Box
  • Earlier work aims to avoid UDF calls
  • predicate reordering (HS93, KMPS94, CS96)
  • memoization and caching (HN96, DF05)
  • Remaining calls can still be a showstopper.

5
The Intuition
  1. Many functions have accuracy/cost tradeoffs.
    e.g., iterative solvers.
  2. UDFs often appear in predicates and aggregates
    where exact answers are not required.

6
Our Solution
  • VAOs (Variable Accuracy Operators)
  • New query operators that
  • Expose function cost/accuracy tradeoffs using a
    new UDF API.
  • Exploit this tradeoff to avoid excess work while
    correctly answering the query.

7
VAOs - Basic Idea
  • Initially run function to obtain a coarse answer.
  • This needs to be cheaper than running to a more
    accurate answer.
  • If more accuracy needed - iterate!

8
Traditional Execution - Select
SELECT BD.bondID FROM BondData BD, IntRate IR
Rows 1 WHERE model(BD, IR.rate) > 100
9
VAO Execution - Select
SELECT BD.bondID FROM BondData BD, IntRate IR
Rows 1 WHERE model(BD, IR.rate) > 100
10
VAO Execution - Select
SELECT BD.bondID FROM BondData BD, IntRate IR
Rows 1 WHERE model(BD, IR.rate) > 100
11
VAO API
  • Use iterative interface
  • Traditional: <number> f(<args>)
  • VAO: <result object> f(<args>)
  • Fields for (conservative) error bounds
  • iterate() method: refines bounds with more work
  • Some VAOs also need estimates of the CPU cost
    and error reduction of the next iteration
  • Useful for:
  • Any sort of iterative function (e.g. root
    finders, numerical integration)
  • Any technique with iterative step refinement
    (e.g. PDEs)
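
A minimal sketch of what such a result object might look like, written in Python. The class name SqrtResult, its field names, and the bisection-based function are illustrative assumptions and not the API from the talk; only the idea of conservative error bounds plus an iterate() method (and optional cost and error-reduction estimates) comes from the slide.

```python
class SqrtResult:
    """Hypothetical VAO-style result object for f(x) = sqrt(x), x >= 0."""

    def __init__(self, x):
        self.x = x
        # Conservative error bounds: sqrt(x) always lies in [lo, hi].
        self.lo = 0.0
        self.hi = max(1.0, x)
        self.iterations = 0

    def iterate(self):
        """Do one more unit of work: halve the interval bracketing sqrt(x)."""
        mid = (self.lo + self.hi) / 2.0
        if mid * mid <= self.x:
            self.lo = mid
        else:
            self.hi = mid
        self.iterations += 1

    def est_cost(self):
        """Estimated CPU cost of the next iteration (constant for bisection)."""
        return 1.0

    def est_error_reduction(self):
        """Estimated shrinkage of the bound width after the next iteration."""
        return (self.hi - self.lo) / 2.0


# Refine a coarse answer: every call to iterate() tightens the bounds.
r = SqrtResult(2.0)
for _ in range(20):
    r.iterate()
```

The caller never sees a single number, only an interval that is guaranteed to contain the answer, and decides for itself when the interval is tight enough.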

12
Iteration Strategy
  • Selection iterates over an object until predicate
    value is known.
  • Aggregate operators more difficult
  • Answer dependent on sets of result objects
  • Need to decide how to iterate over multiple
    result objects
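
The selection case can be sketched as follows, assuming a predicate of the form f(x) > threshold as in the example query. The BoundedValue class is a hypothetical stand-in for a real result object (its bounds simply halve on each call), and select_greater_than is an illustrative name, not an operator from the talk.

```python
class BoundedValue:
    """Hypothetical result object: bounds converge on a hidden true value."""

    def __init__(self, true_value, width):
        self._true = true_value
        self.lo = true_value - width / 2.0
        self.hi = true_value + width / 2.0

    def iterate(self):
        # Stand-in for real work: each call halves the bound width.
        half = (self.hi - self.lo) / 4.0
        self.lo = self._true - half
        self.hi = self._true + half


def select_greater_than(obj, threshold, max_iters=50):
    """Iterate obj until its bounds no longer straddle the threshold.

    Returns (predicate_value, iterations_used). If the bounds still straddle
    the threshold after max_iters, fall back to comparing the midpoint.
    """
    iters = 0
    while obj.lo <= threshold <= obj.hi and iters < max_iters:
        obj.iterate()
        iters += 1
    if obj.lo > threshold:
        return True, iters
    if obj.hi < threshold:
        return False, iters
    return (obj.lo + obj.hi) / 2.0 > threshold, iters


# A value far from the threshold is decided by its coarse bounds alone;
# a value close to the threshold needs several refinements.
far = select_greater_than(BoundedValue(150.0, 40.0), 100.0)
near = select_greater_than(BoundedValue(100.5, 40.0), 100.0)
```

In this toy setup the far value is decided with no iterations at all, while the near value needs six, which is the effect the operator is designed to exploit.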

13
Example: MAX(f(x1), f(x2))
Need an iteration strategy that attempts to
minimize cost
14
Solution: Greedy Strategy
  • Iterate over the object that has the best ratio
    of benefit to CPU cost among the current choices.
  • Good strategy if functions converge
  • Later iterations likely to have less benefit/unit
    cost
  • Operator-dependent
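
The selection rule itself is small. A minimal sketch, assuming each candidate supplies an estimated benefit and a strictly positive estimated CPU cost for its next iteration; the function name pick_next and the numbers below are hypothetical.

```python
def pick_next(candidates):
    """Return the name of the candidate with the best benefit-to-cost ratio.

    candidates maps a name to (estimated_benefit, estimated_cpu_cost).
    Costs are assumed to be strictly positive.
    """
    return max(candidates, key=lambda name: candidates[name][0] / candidates[name][1])


# Hypothetical estimates: f(x2) offers less benefit but is much cheaper.
choice = pick_next({"f(x1)": (4.0, 8.0), "f(x2)": (3.0, 2.0)})
```

What counts as "benefit" is the operator-dependent part: overlap reduction for MAX, bound-width reduction for an average.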

15
Example Revisited
  • MAX(f(x1),f(x2))
  • Goal state: no overlap between the error bounds of
    f(x1) and f(x2)
  • Greedy strategy:
  • Choose the best overlap reduction per CPU cost
  • Use error reduction estimates to estimate overlap
    reduction.
  • Cost estimation depends on function.
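
A sketch of this strategy for two result objects. Everything concrete here is an assumption made for illustration: the Result class, the rule that each iteration halves the bound width and doubles the CPU cost, and the starting values. Only the goal state (disjoint bounds) and the selection criterion (estimated overlap reduction per unit of CPU cost) come from the slide.

```python
class Result:
    """Hypothetical result object whose bounds shrink toward a hidden value."""

    def __init__(self, name, true_value, width, cost):
        self.name, self._true, self.cost = name, true_value, cost
        self.lo, self.hi = true_value - width / 2.0, true_value + width / 2.0

    def predicted_bounds(self):
        # Error-reduction estimate: assume the next iteration halves the width.
        q = (self.hi - self.lo) / 4.0
        mid = (self.lo + self.hi) / 2.0
        return mid - q, mid + q

    def iterate(self):
        q = (self.hi - self.lo) / 4.0
        self.lo, self.hi = self._true - q, self._true + q
        self.cost *= 2.0  # assumption: each refinement costs twice the last


def overlap(a, b):
    """Width of the intersection of two (lo, hi) intervals, 0 if disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def greedy_max(r1, r2, max_iters=100):
    """Iterate greedily until the bounds are disjoint; return (winner, log)."""
    log = []
    for _ in range(max_iters):
        current = overlap((r1.lo, r1.hi), (r2.lo, r2.hi))
        if current == 0.0:
            break
        # Estimated overlap reduction per unit of CPU cost for each choice.
        gain1 = (current - overlap(r1.predicted_bounds(), (r2.lo, r2.hi))) / r1.cost
        gain2 = (current - overlap((r1.lo, r1.hi), r2.predicted_bounds())) / r2.cost
        chosen = r1 if gain1 >= gain2 else r2
        chosen.iterate()
        log.append(chosen.name)
    # Comparing midpoints picks the larger interval once they are disjoint.
    winner = r1 if r1.lo + r1.hi >= r2.lo + r2.hi else r2
    return winner.name, log


winner, log = greedy_max(Result("f(x1)", 105.0, 16.0, cost=4.0),
                         Result("f(x2)", 100.0, 16.0, cost=1.0))
```

With these made-up numbers the cheaper f(x2) is refined first, and f(x1) is touched only once its refinement becomes the better deal per unit cost.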

16
Example Revisited
  • Determine if f(x1) > f(x2)

Function | Overlap Red. Est. | CPU Cost Est.
f(x1)
f(x2)
17
Example Revisited
  • Determine if f(x1) > f(x2)

Function | Overlap Red. Est. | CPU Cost Est.
f(x1)
f(x2)
18
Example Revisited
  • Determine if f(x1) > f(x2)

[Figure: f(x) plotted against x, with x1 and x2 marked on the x-axis]
Function | Overlap Red. Est. | CPU Cost Est.
f(x1)
f(x2)
19
Aggregates
Operator | Goal State | Greedy Heuristic
min/max (general) | No overlap between the minimum (maximum) value and the other function error bounds | Make an educated guess for the max. Choose the iteration that reduces the most overlap between the guess and the other error bounds per cycle
avg/sum | The avg/sum of the error bounds has width less than a user-defined tolerance | Choose the iteration that reduces the avg/sum of the bounds the most per cycle
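
The avg/sum row can be sketched as below. The Result class, the halving-width and doubling-cost behaviour, and the sample values are assumptions made for illustration; the goal state (average bound width below a tolerance) and the heuristic (largest width reduction per unit cost) follow the table.

```python
class Result:
    """Hypothetical result object whose bounds shrink toward a hidden value."""

    def __init__(self, value, width, cost):
        self._true, self.cost = value, cost
        self.lo, self.hi = value - width / 2.0, value + width / 2.0

    def width(self):
        return self.hi - self.lo

    def est_reduction(self):
        # Assumption: the next iteration halves the bound width.
        return self.width() / 2.0

    def iterate(self):
        q = self.width() / 4.0
        self.lo, self.hi = self._true - q, self._true + q
        self.cost *= 2.0  # assumption: each refinement costs twice the last


def greedy_avg(results, tolerance):
    """Refine until the bounds on the average are narrower than tolerance.

    tolerance must be positive. Returns (avg_lo, avg_hi, total_cost).
    """
    n = len(results)
    total_cost = 0.0
    while sum(r.width() for r in results) / n >= tolerance:
        # Pick the iteration that shrinks the summed width most per unit cost.
        best = max(results, key=lambda r: r.est_reduction() / r.cost)
        total_cost += best.cost
        best.iterate()
    return sum(r.lo for r in results) / n, sum(r.hi for r in results) / n, total_cost


avg_lo, avg_hi, cost = greedy_avg(
    [Result(100.0, 8.0, 1.0), Result(110.0, 2.0, 1.0), Result(90.0, 2.0, 1.0)],
    tolerance=2.0,
)
```

Most of the work goes to the object with the widest bounds, since tightening it moves the average the most; a weighted average would scale each width by its weight in the same loop.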
20
Performance Setup
  • Standalone implementation of VAO framework in C
  • Used numeric bond model and bond data from DF05
  • Real Bond Data - 500 Mortgage-backed Securities.
  • Synthetic Bond Data - to stress test VAOs
  • Single Interest Rate.

21
VAO Implementation
  • Numeric bond model (S95) implemented with both the
    traditional and VAO interfaces
  • Based on a PDE solver
  • VAO iterate(): doubles the size of the PDE grid
  • Bounds and error reduction estimates derived from
    the current and previous iteration results using
    Richardson's extrapolation (BF01)
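
To illustrate how Richardson extrapolation can turn two successive grid resolutions into an error estimate, here is a sketch that swaps the PDE solver for a much simpler method, the composite trapezoid rule. The IntegralResult class, the safety factor, and the test integrand are assumptions for illustration; the bond model's actual bounds are described in the paper, not here. Bounds obtained this way are heuristic rather than guaranteed.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal subintervals."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))


class IntegralResult:
    """Hypothetical result object: each iterate() doubles the grid size."""

    def __init__(self, f, a, b, n=4, safety=2.0):
        self.f, self.a, self.b, self.n, self.safety = f, a, b, n, safety
        self.prev = trapezoid(f, a, b, n)
        self.lo, self.hi = -math.inf, math.inf  # no bounds until two results exist
        self.iterate()

    def iterate(self):
        self.n *= 2
        cur = trapezoid(self.f, self.a, self.b, self.n)
        # Richardson: the trapezoid error is O(h^2), so halving h cuts it
        # about 4x, and (cur - prev) / 3 estimates the error left in cur.
        err = self.safety * abs(cur - self.prev) / 3.0
        self.lo, self.hi = cur - err, cur + err
        self.prev = cur
        # Halving h again should shrink the bound width by about 4x.
        self.est_next_width = (self.hi - self.lo) / 4.0


# The exact integral of sin over [0, pi] is 2.
r = IntegralResult(math.sin, 0.0, math.pi)
first_width = r.hi - r.lo
r.iterate()
```

The same pattern carries over to any method with a known order of convergence: compare the current and previous results, scale the difference by the factor implied by that order, and pad it to stay conservative.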

22
Selection Performance
  • 500 bonds, 1 interest rate

Runtime depends on number of bonds close to
predicate.
23
Stress Test
  • Generate bonds with accurate values near the
    predicate
  • Gaussian distribution with mean = predicate value;
    vary the std. dev.
  • Std. dev. of real bonds: 7.78
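
Generating such a synthetic workload could look like the sketch below. The function name, the seed, and the choice to draw model values directly (rather than bond parameters that produce them) are assumptions; the predicate value of 100 is taken from the example query and 7.78 from the real-bond figure above.

```python
import random


def synthetic_values(n, predicate_value, std_dev, seed=0):
    """Draw n model values from a Gaussian centred on the predicate value."""
    rng = random.Random(seed)
    return [rng.gauss(predicate_value, std_dev) for _ in range(n)]


values = synthetic_values(500, 100.0, 7.78)
# Count the values that sit within 1.0 of the predicate value; these are
# the ones a VAO would have to refine the most.
near = sum(1 for v in values if abs(v - 100.0) < 1.0)
```

Shrinking std_dev pushes more values toward the predicate, which is what makes the test a stress test.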

24
In the Paper
  • Other Results
  • Max:
  • Real bonds: 111 sec. vs. 6953 sec.
  • Synthetic bonds: VAOs better than traditional
    above .05 std. dev.
  • Average:
  • Up to 5x improvement if a small number of bonds
    are weighted heavily in the average.
  • Details on Error and Cost estimates for PDE-based
    bond model.
  • Other types of models covered in Matt's thesis.

25
Conclusion
  • Many emerging CQ applications require the
    repeated execution of expensive functions.
  • VAOs are new operators that change how these
    functions execute
  • Use new iterative API that exposes work-accuracy
    tradeoff in functions
  • Do only enough work to answer the query using
    greedy strategy to choose iterations
  • With real bond data and models, VAOs show 1-2
    orders of magnitude improvement.
  • For more detailed information
  • mdenny_at_cs.berkeley.edu

26
The Advisor's Dodge
[Figure: "Relative Contribution to Research". y-axis: Percent Contribution (0 to 100); x-axis: Time in Program (years, 0 to 5); "This Work" marked on the plot]
Courtesy of Jennifer Widom
27
Bibliography
  • HS93 J. M. Hellerstein and M. Stonebraker,
    Predicate Migration: Optimizing Queries with
    Expensive Predicates, SIGMOD 1993.
  • HN96 J. M. Hellerstein and J. Naughton, Query
    Execution Techniques for Caching Expensive
    Methods, SIGMOD 1996.
  • DF05 M. Denny and M. J. Franklin, Predicate
    Result Range Caching for Continuous Queries,
    SIGMOD 2005.

28
Bibliography
  • S95 R. Stanton, Rational Prepayment and the
    Valuation of Mortgage-Backed Securities, The
    Review of Financial Studies, Vol. 8, No. 3,
    pp. 677-708.
  • BF01 R. L. Burden and J. D. Faires, Numerical
    Analysis, Brooks/Cole, 2001.