Engine Issues for Data Stream Processing

Transcript and Presenter's Notes

1
Engine Issues for Data Stream Processing
  • Mike Franklin
  • UC Berkeley
  • 1st Duodecennial SWiM Meeting
  • January 9, 2003

2
Panel Goals
  • Identify those key areas where existing database
    engine technology falls short for supporting data
    streams.
  • Succinct justification for the area
  • Oracle talk at CIDR showed how to do a lot of
    interesting things using tables/standard indexes/SQL
  • Identification of interesting research areas/open
    problems
  • Road map for progress
  • To point out possible solutions, non-solutions or
    just potential cool things.

3
Panel Structure
  • Approach: Panelists were asked to identify their
    #1 concern in engine design.
  • Panelists (in descending order of distance travelled)
  • Alex Buchmann
  • Ugur Cetintemel
  • Ted Johnson
  • Jennifer Widom

4
My #1 Issue(s): Sharing and Adaptivity
  • Sharing
  • Opportunity: Standing queries
  • can see and analyze most of the queries as a
    group
  • long-lived queries mean benefits accrue, costs
    are amortized
  • Benefit: Scalability
  • obvious: avoid duplicate work
  • need to keep up with the dataflow; don't want to
    stall the pipeline (similar to staged DB ideas)
  • reduce cost of entry for new queries
  • Adaptivity
  • no stats, dynamic environment
  • in particular, the query mix and workload
    intensity continually fluctuate.

5
Common Sub-expressions
  • Traditional MQO approaches suffer from same
    problems as traditional QP approaches in
    streaming environments.
  • namely, they are static
  • Insertion and removal of queries degrades global
    plan quality over time.
  • Two approaches
  • YFilter: shared XML filtering
  • TelegraphCQ: extreme adaptive QP

6
YFilter: Shared Processing (Yanlei Diao)
  • XFilter showed how to use an event-based (SAX)
    parser to drive state transitions for XML
    filtering.
  • YFilter uses an NFA-based approach to share work
    among queries.

7
Combining NFA Fragments

8
YFilter NFA Structure Matching
  • Queries: Q1 = /a/b, Q2 = /a/c, Q3 = /a/b/c, Q4 = /a//b/c,
    Q5 = /a//c, Q6 = /a//c, Q7 = /a///c, Q8 = /a/b/c
  • Key to scalability is sharing of machine states
    and processing.
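
A minimal sketch of the shared-NFA idea in Python, assuming queries are simple paths built only from child (/) and descendant (//) steps over element names. The names State, SharedNFA, add_query and filter_document are illustrative, not YFilter's actual code or API; wildcards, predicates and YFilter's other optimizations are left out. Queries with a common prefix reuse the same states, a SAX parser drives the transitions, and a stack of active state sets lets an end tag backtrack.

```python
import re
import xml.sax


class State:
    """One NFA state; shared by every query that has the same path prefix."""

    def __init__(self, self_loop=False):
        self.child = {}             # element name -> next State (child step)
        self.desc = None            # epsilon edge to a self-looping state ('//')
        self.self_loop = self_loop  # True if the state survives any start tag
        self.accepts = set()        # ids of the queries that end in this state


class SharedNFA(xml.sax.ContentHandler):
    """Combined NFA for many path queries, driven by SAX events."""

    def __init__(self):
        super().__init__()
        self.root = State()
        self.matched = set()
        self.stack = []

    def add_query(self, qid, path):
        cur = self.root
        for axis, name in re.findall(r"(//|/)([^/]+)", path):
            if axis == "//":
                # Reuse the descendant state if another query already made it.
                if cur.desc is None:
                    cur.desc = State(self_loop=True)
                cur = cur.desc
            cur = cur.child.setdefault(name, State())
        cur.accepts.add(qid)

    @staticmethod
    def _closure(states):
        """Follow epsilon edges so '//' states are active with their parent."""
        todo, seen = list(states), set(states)
        while todo:
            s = todo.pop()
            if s.desc is not None and s.desc not in seen:
                seen.add(s.desc)
                todo.append(s.desc)
        return seen

    def startDocument(self):
        self.matched = set()
        self.stack = [self._closure({self.root})]

    def startElement(self, name, attrs):
        nxt = set()
        for s in self.stack[-1]:
            if s.self_loop:
                nxt.add(s)              # '//' may skip this element
            if name in s.child:
                nxt.add(s.child[name])
        nxt = self._closure(nxt)
        for s in nxt:
            self.matched |= s.accepts
        self.stack.append(nxt)

    def endElement(self, name):
        self.stack.pop()                # backtrack to the parent's state set


def filter_document(nfa, xml_bytes):
    """Return the ids of all queries matched by one XML document."""
    xml.sax.parseString(xml_bytes, nfa)
    return nfa.matched


# A subset of the query paths listed on the slide.
queries = {"Q1": "/a/b", "Q2": "/a/c", "Q3": "/a/b/c",
           "Q4": "/a//b/c", "Q6": "/a//c"}
nfa = SharedNFA()
for qid, path in queries.items():
    nfa.add_query(qid, path)

print(sorted(filter_document(nfa, b"<a><x><c/></x></a>")))  # prints ['Q6']
```

Here Q1, Q2 and Q3 share the state reached on /a, and Q4 and Q6 share the descendant state below it, so each start tag is processed once for the whole group rather than once per query.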

9
The TelegraphCQ Approach
  • Aggressive adaptivity
  • Say no to static dataflows
  • Continuous adaptivity

  • Aggressive sharing
  • Beyond common sub-expressions
  • Easy addition of new queries
  • Sharing and Adaptivity: two sides of the same coin!
  • Use a single framework for both
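
One way sharing can go beyond common sub-expressions is to index the selection predicates of many standing queries on the same attribute, so that each tuple is checked against the whole group at once. The sketch below is a hedged illustration in Python, not TelegraphCQ code: the GroupedFilter class and its methods are hypothetical names, and it handles only predicates of the form attr > constant. Adding a query is a single sorted insert, which keeps the cost of entry low.

```python
import bisect


class GroupedFilter:
    """Indexes predicates `attr > constant` from many queries (illustrative)."""

    def __init__(self, attr):
        self.attr = attr
        self.entries = []  # sorted list of (constant, query_id)

    def add_query(self, qid, constant):
        # New queries join the shared index without touching existing ones.
        bisect.insort(self.entries, (constant, qid))

    def remove_query(self, qid):
        self.entries = [e for e in self.entries if e[1] != qid]

    def matches(self, tup):
        """Return the ids of all queries whose predicate the tuple satisfies."""
        value = tup[self.attr]
        # (value,) sorts before every (value, qid), so the slice holds
        # exactly the entries whose constant is strictly less than value.
        cut = bisect.bisect_left(self.entries, (value,))
        return {qid for _, qid in self.entries[:cut]}


gf = GroupedFilter("b")
gf.add_query("Q2", 25)   # B.b > 25, as in the deck's Q2
gf.add_query("Qx", 10)   # hypothetical extra queries for illustration
gf.add_query("Qy", 40)

print(sorted(gf.matches({"b": 30})))  # prints ['Q2', 'Qx']
```

One binary search replaces one comparison per query, and the work per tuple then depends only on how many queries actually match.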

10
Fun with Eddies and SteMs
Q2: select * from B, D where B.b = D.d and B.b > 25
Q1: select * from A, B, C, D where A.a = B.b and B.b = C.c and C.c = D.d
[Figure: Eddy with inputs A, B, C, D; SteMs on A, B, C, D; Grouped Selection Filter; Output]
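
A rough sketch of how an eddy might route tuples through SteMs for a query shaped like Q1, written in Python under strong simplifying assumptions: every input joins on equality of one key attribute, each arriving tuple is first built into its own SteM and then probes the others, and the routing policy (probe the smallest SteM first) is a stand-in chosen for brevity, not the policy used in TelegraphCQ. The SteM and Eddy classes here are illustrative names only.

```python
from collections import defaultdict


class SteM:
    """State module: a hash index over one input, keyed on its join attribute."""

    def __init__(self, key_attr):
        self.key_attr = key_attr
        self.index = defaultdict(list)
        self.size = 0

    def build(self, tup):
        self.index[tup[self.key_attr]].append(tup)
        self.size += 1

    def probe(self, key):
        return self.index.get(key, [])


class Eddy:
    """Routes each arriving tuple through the SteMs; no static join order."""

    def __init__(self, key_attrs):
        # key_attrs maps input name -> join attribute, e.g. {"A": "a"}.
        self.stems = {src: SteM(attr) for src, attr in key_attrs.items()}

    def route(self, src, tup):
        """Process one arriving tuple; return the join results it completes."""
        stem = self.stems[src]
        key = tup[stem.key_attr]
        stem.build(tup)                 # build before probing the others
        partials = [{src: tup}]
        # The probe order is decided per tuple (assumed policy: smallest first).
        others = sorted((s for s in self.stems if s != src),
                        key=lambda s: self.stems[s].size)
        for other in others:
            matches = self.stems[other].probe(key)
            partials = [{**p, other: m} for p in partials for m in matches]
            if not partials:
                break                   # no match, so the tuple stops here
        return partials


eddy = Eddy({"A": "a", "B": "b", "C": "c", "D": "d"})
for src, tup in [("A", {"a": 1}), ("B", {"b": 1}), ("C", {"c": 1})]:
    eddy.route(src, tup)

print(len(eddy.route("D", {"d": 1})))  # prints 1
```

Because every tuple is stored before it probes, each join result is produced exactly once, when its last component arrives, regardless of the order in which the inputs deliver tuples.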

11
Dynamic Query Addition