Continuous Stream Monitoring Technology

Slides: 32
Provided by: alfred2
Learn more at: https://davis.wpi.edu
Title: Continuous Stream Monitoring Technology


1
Continuous Stream Monitoring Technology
  • Elke A. Rundensteiner
  • Database Systems Research Laboratory
  • Department of Computer Science
  • Worcester Polytechnic Institute, USA
  • rundenst _at_ cs.wpi.edu
  • November 2006

2
A Database . . .
  • Vast amounts of electronic information in
    organisations, companies, and scientific institutes
    that need to be organized, stored securely, and
    accessed efficiently and easily.
  • Three common steps:
  • Design the schema
  • Load the database
  • Query the static database

Select name from employee
DBMS
Stored Database
3
  • So what's next?

Select name from employee
DBMS
Stored Database
4
A Look at Modern Data Streams!
  • Digital radio telescopes
  • Network traffic flow
  • Stock tickers/feeds
  • Sensor networks
  • Web usage transactions
  • Outpatient care
  • Environmental instruments

DSMS Filter Transform
select fft(s) from radiosignal s where source(s) = Antenna1
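
The filter-then-transform query above can be read as a standing pipeline that emits one result per matching input. A minimal sketch, assuming each stream element carries a source name and a window of samples (the Signal type, its fields, and the naive dft helper standing in for fft are all illustrative, not part of the original system):

```python
import cmath
from typing import Iterable, Iterator, List, NamedTuple


class Signal(NamedTuple):
    source: str           # hypothetical field: antenna that produced the window
    samples: List[float]  # hypothetical field: one window of raw samples


def dft(samples: List[float]) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform, standing in for fft(s)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def continuous_query(stream: Iterable[Signal], wanted: str) -> Iterator[List[complex]]:
    """Filter on the source, then transform; yields as tuples arrive."""
    for s in stream:
        if s.source == wanted:    # Filter box
            yield dft(s.samples)  # Transform box


feed = [
    Signal("Antenna1", [1.0, 0.0, 0.0, 0.0]),
    Signal("Antenna2", [1.0, 1.0, 1.0, 1.0]),
    Signal("Antenna1", [1.0, 1.0, 1.0, 1.0]),
]
results = list(continuous_query(feed, "Antenna1"))
```

Because continuous_query is a generator, it never needs the whole stream in memory; the finite list here is only for demonstration.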
5
Databases: Everything is Upside Down!
[Figure labels: static data, data, query, one-time queries]
6
Continuous Queries on Data Streams
Online Stream Monitoring
7
Motivating Applications Everywhere
  • Traffic Management: Streams of Cars and Mobile
    Requests
  • Market Analysis: Streams of Stock Exchange
    Data
  • Critical Care: Streams of Vital Sign
    Measurements
  • Physical Plant Monitoring: Streams of
    RFID/Environmental Readings
  • Emergency Response: Streams of Sensors and
    People Tracking

8
Mobile Traffic-Related Streams
- moving objects
- dynamic range query
- dynamic kNN query
9
Spatio-Temporal Continuous Tracking
Monitor the traffic in the red areas
Continuously return the area covered by the herd
during the migration
10
FireEngine Project: Sensors in Rooms
11
Fire Monitoring Queries
  • Track smoke and heat clouds (moving clusters) in
    terms of their sizes and speeds.
  • Is there an outlier (prank), or an actual fire?
  • Match sensor readings of fire with a fire stream
    simulation to determine similarity.
  • Are any sensors faulty, and thus should be ignored?

12
Dynamicity in Stream Query Processing
Register Continuous Queries
High workload of queries
Real-time and accurate responses required
Scalable Stream Query Engine
Streaming Data (push-based paradigm)
Streaming Result
May have time-varying rates and high-volumes
Available resources for executing each operator
may vary over time.
Memory- and CPU resource limitations (continuous
evaluation)
New query processing technology required.
13
Execution of Queries
[Figure: query plan with Slide and Tumble operators]
  • Queries
  • Graph: query plan
  • Boxes: query operators such as Filter or Join
  • Arcs: streams with time-stamped tuples
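
The boxes-and-arcs view can be made concrete with a small push-based sketch. The Box, Filter, and Sink classes and the (timestamp, payload) tuple layout below are assumptions for illustration, not the CAPE operator API:

```python
from typing import Callable, List, Tuple

StreamTuple = Tuple[int, dict]  # (timestamp, payload); layout is an assumption


class Box:
    """A query operator; `outputs` holds the arcs to downstream boxes."""

    def __init__(self) -> None:
        self.outputs: List["Box"] = []

    def connect(self, downstream: "Box") -> "Box":
        self.outputs.append(downstream)
        return downstream

    def emit(self, tup: StreamTuple) -> None:
        for box in self.outputs:
            box.push(tup)

    def push(self, tup: StreamTuple) -> None:
        raise NotImplementedError


class Filter(Box):
    def __init__(self, predicate: Callable[[dict], bool]) -> None:
        super().__init__()
        self.predicate = predicate

    def push(self, tup: StreamTuple) -> None:
        if self.predicate(tup[1]):
            self.emit(tup)


class Sink(Box):
    """Collects the streaming result on behalf of an application."""

    def __init__(self) -> None:
        super().__init__()
        self.received: List[StreamTuple] = []

    def push(self, tup: StreamTuple) -> None:
        self.received.append(tup)


hot = Filter(lambda payload: payload["temp"] > 50)
sink = Sink()
hot.connect(sink)
for ts, temp in enumerate([20, 75, 40, 90]):
    hot.push((ts, {"temp": temp}))
```

Here each push call runs the downstream boxes immediately; a real engine would instead queue tuples on the arcs and let a scheduler decide which box runs next.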

14
Execution of Queries
[Figure: query plans of Slide and Tumble operators delivering results to applications (App)]
Execution via Operator Scheduling
15
Adaptation Techniques in CAPE
  • On-Line Query Plan Reshaping
  • (with Yali Zhu and G. Heineman)

Published in ACM SIGMOD 2004; in submission to
the TODS journal, 2006.
16
Query Optimization
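[Figure: two alternative plans over streams A, B, C: one evaluates join AB below join BC, the other evaluates join BC below join AB]
How to optimize if the query is continuously running?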
17
Run-time Plan Re-Optimization
  • Step 1: Decide when to optimize
  • Statistics monitoring
  • Step 2: Generate new query plan
  • Query optimization
  • Step 3: Replace current plan by new plan
  • Plan migration
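
The three steps can be sketched as one decision function that is called whenever fresh statistics arrive. Everything here is a simplifying assumption: the two candidate plans, the toy cost model (output rate of the lower join), the statistics keys, and the 0.8 improvement threshold are illustrative and not taken from CAPE:

```python
from typing import Callable, Dict, List

PLANS = ("(AB)C", "A(BC)")  # the two join orders for a chain join over A, B, C


def intermediate_rate(plan: str, stats: Dict[str, float]) -> float:
    """Toy cost model (an assumption): output rate of the lower join."""
    if plan == "(AB)C":
        return stats["rate_A"] * stats["rate_B"] * stats["sel_AB"]
    if plan == "A(BC)":
        return stats["rate_B"] * stats["rate_C"] * stats["sel_BC"]
    raise ValueError(f"unknown plan: {plan}")


def reoptimize(
    current: str,
    stats: Dict[str, float],
    migrate: Callable[[str, str], None],
    threshold: float = 0.8,
) -> str:
    # Step 2: generate the cheapest plan under the monitored statistics.
    best = min(PLANS, key=lambda p: intermediate_rate(p, stats))
    # Step 1: only act when the new plan is clearly cheaper than the current one.
    worthwhile = intermediate_rate(best, stats) < threshold * intermediate_rate(current, stats)
    if best != current and worthwhile:
        migrate(current, best)  # Step 3: plan migration (strategy supplied by caller)
        return best
    return current


migrations: List[str] = []


def record(old: str, new: str) -> None:
    migrations.append(f"{old} -> {new}")


stats = {"rate_A": 10.0, "rate_B": 10.0, "rate_C": 10.0, "sel_AB": 0.1, "sel_BC": 0.5}
plan = reoptimize("(AB)C", stats, record)           # AB is selective: keep the plan
stats["sel_AB"] = 0.9                               # stream characteristics drift
plan_after_drift = reoptimize(plan, stats, record)  # now BC should run first
```

The threshold guards against migrating back and forth when two plans cost nearly the same, since every migration has its own overhead.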

18
Naïve Plan Migration Strategy
[Figure: old plan and new plan over streams A, B, C, with the order of joins AB and BC swapped]
  • Migration steps:
  • Pause execution of old plan
  • Drain out all tuples inside old plan
  • Replace old plan by new plan
  • Resume execution of new plan

Problem: works for stateless operators only
19
Stateful Operator in Streaming
  • Why stateful?
  • Need non-blocking operators
  • Operator needs to output partial results

Symmetric hash join: for each new tuple from A,
purge state B, join with state B, insert into state A.
[Figure: join AB with State A and State B, fed by inputs A and B]
Key observation: the purging of tuples in the states
relies on the processing of new tuples.
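
The purge, join, insert cycle can be sketched as follows. This is a minimal illustration, assuming a time-based sliding window, tuples shaped as (timestamp, join key, payload), and non-decreasing timestamps per input; the class and method names are hypothetical:

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Tup = Tuple[int, str, str]  # (timestamp, join key, payload); layout is an assumption


class State:
    """Window state: arrival-ordered queue plus a hash index on the join key."""

    def __init__(self) -> None:
        self.by_time: Deque[Tup] = deque()
        self.by_key: Dict[str, Deque[Tup]] = defaultdict(deque)

    def insert(self, tup: Tup) -> None:
        self.by_time.append(tup)
        self.by_key[tup[1]].append(tup)

    def purge(self, cutoff: int) -> None:
        # Arrival order is preserved inside each bucket, so the globally oldest
        # tuple is also the oldest tuple of its own bucket.
        while self.by_time and self.by_time[0][0] <= cutoff:
            old = self.by_time.popleft()
            bucket = self.by_key[old[1]]
            bucket.popleft()
            if not bucket:
                del self.by_key[old[1]]

    def probe(self, key: str) -> List[Tup]:
        return list(self.by_key.get(key, ()))

    def __len__(self) -> int:
        return len(self.by_time)


class SymmetricHashJoin:
    def __init__(self, window: int) -> None:
        self.window = window
        self.states = {"A": State(), "B": State()}

    def process(self, side: str, tup: Tup) -> List[Tuple[Tup, Tup]]:
        other = "B" if side == "A" else "A"
        self.states[other].purge(tup[0] - self.window)  # 1. purge opposite state
        matches = self.states[other].probe(tup[1])      # 2. join with opposite state
        self.states[side].insert(tup)                   # 3. insert into own state
        return [(tup, m) if side == "A" else (m, tup) for m in matches]


join = SymmetricHashJoin(window=10)
r1 = join.process("A", (0, "k", "a0"))
r2 = join.process("B", (5, "k", "b5"))
r3 = join.process("A", (20, "k", "a20"))  # purges b5 from state B; a0 stays in state A
```

After the last call, state A still holds the expired tuple a0, because only an arriving B tuple would purge it. That is the key observation in miniature: without new input, the states never empty.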
20
Naïve Migration Strategy Revisited
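[Figure: plan with joins AB and BC over inputs A, B, C]
Deadlock waiting problem: while the old plan is paused, no new
tuples are processed, so nothing triggers purging and the
states of stateful operators never drain.
  • Steps
  • (1) Pause execution of old plan
  • (2) Drain out all tuples inside old plan
  • (3) Replace old plan by new plan
  • (4) Resume execution of new plan

[Figure callouts: (2) all tuples drained; (3) old replaced by new; (4) processing resumed]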
21
Proposed Dynamic Migration Strategies
  • Moving State Strategy
  • Parallel Track Strategy

22
Moving State Strategy
  • Basic idea
  • Share common states between two boxes
  • Key Steps
  • Identify common states
  • State matching
  • Share common states
  • State moving
  • Recompute unmatched states
  • State recomputing

23
Moving State Strategy
  • State Matching
  • State in old box has unique ID
  • During rewriting, new ID given to new state in
    new box
  • When rewriting done, match states based on IDs.
  • State Moving
  • Between matched states
  • On same machine, creates new pointers for matched
    states in new box
  • What's left?
  • Unmatched states in new box

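[Figure: Old Box and New Box, each reading input queues QA, QB, QC, QD and writing QABCD. Old box joins AB, BC, CD with states SA, SB, SAB, SC, SABC, SD. New box joins BC, CD, AB with states SB, SC, SBC, SD, SA, SBCD.]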
24
Unmatched States
  • State Recomputing
  • Recursively recompute unmatched SBC and SBCD by
    joining matched states
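
State matching, moving, and recomputing can be sketched together. The sketch assumes every state is a list of (join key, constituent tuples) entries and that all joins are equi-joins on one shared key, which is a simplification for illustration; NEW_PLAN, join, and migrate_moving_state are hypothetical names:

```python
from typing import Dict, List, Optional, Tuple

Entry = Tuple[int, Tuple[str, ...]]  # (join key, constituent base tuples)
State = List[Entry]

# New box: each state ID maps to the two child states it is derived from,
# or to None for a base state fed directly by an input queue.
NEW_PLAN: Dict[str, Optional[Tuple[str, str]]] = {
    "SA": None, "SB": None, "SC": None, "SD": None,
    "SBC": ("SB", "SC"),
    "SBCD": ("SBC", "SD"),
}


def join(left: State, right: State) -> State:
    return [(k, lp + rp) for k, lp in left for k2, rp in right if k == k2]


def migrate_moving_state(
    old_states: Dict[str, State],
    new_plan: Dict[str, Optional[Tuple[str, str]]],
) -> Dict[str, State]:
    new_states: Dict[str, State] = {}

    def build(sid: str) -> State:
        if sid not in new_states:
            if sid in old_states:
                # State matching by ID, then state moving: share the same object.
                new_states[sid] = old_states[sid]
            elif new_plan[sid] is None:
                new_states[sid] = []  # base state with no counterpart in the old box
            else:
                # State recomputing: recursively join the child states.
                left, right = new_plan[sid]
                new_states[sid] = join(build(left), build(right))
        return new_states[sid]

    for sid in new_plan:
        build(sid)
    return new_states


old = {
    "SA": [(1, ("a1",))],
    "SB": [(1, ("b1",)), (2, ("b2",))],
    "SC": [(1, ("c1",))],
    "SD": [(1, ("d1",)), (2, ("d2",))],
    "SAB": [(1, ("a1", "b1"))],
    "SABC": [(1, ("a1", "b1", "c1"))],
}
new = migrate_moving_state(old, NEW_PLAN)
```

Matched states are shared by reference rather than copied, mirroring the "new pointers" idea for boxes on the same machine; SAB and SABC have no counterpart in the new box and are simply dropped.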

[Figure: new box with joins BC, CD, AB and states SB, SC, SBC, SD, SA, SBCD; inputs QA, QB, QC, QD; output QABCD]
25
MS Migration Pros and Cons
  • Pros
  • Fast when the number of tuples in the states is small
  • That is, at low input rates or with a small window size
  • Cons
  • Output silence during entire migration stage
  • Can we output results even during migration?
  • Motivation for Parallel Track Strategy

26
Parallel Track Strategy
  • Basic idea
  • Execute both old and new plans in parallel
  • Gradually push old tuples out of old box by
    purging
  • Key Steps
  • Connect new box
  • Execute both boxes in parallel
  • Remove old box once expired
  • Contains only new tuples
  • No old tuples or sub-tuples
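
The test for when the old box may be removed can be sketched in isolation. The sketch assumes each state entry records the arrival timestamps of its base sub-tuples and that purging is driven by a clock rather than by arriving tuples, which is a simplification; all names are hypothetical:

```python
from typing import Dict, List, Tuple

# A state entry is the tuple of arrival timestamps of its base sub-tuples,
# e.g. (8, 9) for an AB tuple built from A@8 and B@9. Layout is an assumption.
Entry = Tuple[int, ...]


def purge(states: Dict[str, List[Entry]], now: int, window: int) -> None:
    """Drop every entry whose oldest sub-tuple has left the window."""
    for name, entries in states.items():
        states[name] = [e for e in entries if min(e) > now - window]


def old_box_expired(states: Dict[str, List[Entry]], migration_start: int) -> bool:
    """True once no state holds an old tuple or an old sub-tuple."""
    return all(
        min(entry) >= migration_start
        for entries in states.values()
        for entry in entries
    )


old_box = {"SA": [(8,), (11,)], "SAB": [(8, 9), (9, 12)]}
MIGRATION_START, WINDOW = 10, 5

purge(old_box, now=12, window=WINDOW)
expired_at_12 = old_box_expired(old_box, MIGRATION_START)  # old sub-tuples remain

purge(old_box, now=14, window=WINDOW)
expired_at_14 = old_box_expired(old_box, MIGRATION_START)  # only new tuples left
```

Note that the entry (9, 12) counts as old even though one of its sub-tuples arrived after the migration started, which is why the condition speaks of sub-tuples and not just tuples.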

27
Parallel Track Strategy
[Figure: old box and new box connected in parallel, each reading input queues QA, QB, QC, QD and writing QABCD; callout: a tuple ABC in SABC, built from sub-tuples A, B, C]
  • Connect boxes
  • Execute in parallel
  • Until all old tuples purged
  • Disconnect old box
28
PT Migrations Pros and Cons
  • Pros
  • Keep on producing results even during migration
  • No results during MS migration
  • Cons
  • Migration duration is at least 2W, where W is the window size
  • MS may be faster, depending on the number of tuples in the states

29
Summary: Stream Plan Migration
  • Our central theme: Optimization via Adaptation
  • First run-time solution for stateful operators
  • Two migration methods
  • Moving State Strategy
  • Parallel Track Strategy
  • Cost Models for Comparative Analysis
  • System Implementation in CAPE
  • Experimental Evaluations

30
Overall Summary: So Much Left to Do!
  • Large variety of challenging stream applications
  • Generic core technology for stream processing
    engines
  • Startups starting to pop up: StreamBase for the
    stock market
  • Major DBMS players like IBM, Oracle, etc.
    joining in
  • Cool open research, great potential for real
    impact!

31
The End
http://davis.wpi.edu/dsrg
  • Questions?

32
Subset of CAPE Publications
  • [RDZ04] E. A. Rundensteiner, L. Ding, Y. Zhu, T.
    Sutherland and B. Pielech, "CAPE: A
    Constraint-Aware Adaptive Stream Processing
    Engine". Invited Book Chapter.
    http://www.cs.uno.edu/nauman/streamBook/. July 2004.
  • [ZRH04] Y. Zhu, E. A. Rundensteiner and G. T.
    Heineman, "Dynamic Plan Migration for Continuous
    Queries Over Data Streams". SIGMOD 2004, pages
    431-442.
  • [DMR04] L. Ding, N. Mehta, E. A. Rundensteiner
    and G. T. Heineman, "Joining Punctuated Streams".
    EDBT 2004, pages 587-604.
  • [DR04] L. Ding and E. A. Rundensteiner,
    "Evaluating Window Joins over Punctuated
    Streams". CIKM 2004, to appear.
  • [DRH03] L. Ding, E. A. Rundensteiner and G. T.
    Heineman, "MJoin: A Metadata-Aware Stream Join
    Operator". DEBS 2003.
  • [RDSZBM04] E. A. Rundensteiner, L. Ding, T.
    Sutherland, Y. Zhu, B. Pielech and N. Mehta, "CAPE:
    Continuous Query Engine with Heterogeneous-Grained
    Adaptivity". Demonstration Paper. VLDB 2004.
  • [SR04] T. Sutherland and E. A. Rundensteiner,
    "D-CAPE: A Self-Tuning Continuous Query Plan
    Distribution Architecture". Tech Report,
    WPI-CS-TR-04-18, 2004.
  • [SPR04] T. Sutherland, B. Pielech, Yali Zhu,
    Luping Ding, and E. A. Rundensteiner, "Adaptive
    Multi-Objective Scheduling Selection Framework
    for Continuous Query Processing". IDEAS 2005.
  • [SLJR05] T. Sutherland, B. Liu, M. Jbantova, and E.
    A. Rundensteiner, "D-CAPE: Distributed and
    Self-Tuned Continuous Query Processing". CIKM,
    Bremen, Germany, Nov. 2005.
  • [LR05] Bin Liu and E. A. Rundensteiner,
    "Revisiting Pipelined Parallelism in Multi-Join
    Query Processing". VLDB 2005.
  • [B05] Bin Liu, Yali Zhu and E. A.
    Rundensteiner, "Spill Policies for Long-Running
    Queries". ACM SIGMOD 2006, to appear.
  • CAPE Project: http://davis.wpi.edu/dsrg/CAPE/index.html