DCAPE: Distributed and Self-Tuned Continuous Query Processing

Department of Computer Science, Worcester Polytechnic Institute ... {tims, binliu, jbantova, rundenst}@cs.wpi.edu. CIKM'05 Poster ...

1
DCAPE: Distributed and Self-Tuned Continuous Query Processing
  • Tim Sutherland, Bin Liu, Mariana Jbantova, and Elke A. Rundensteiner
  • Department of Computer Science, Worcester Polytechnic Institute
  • 100 Institute Road, Worcester, MA 01609
  • Tel: 1-508-831-5857, Fax: 1-508-831-5776
  • {tims, binliu, jbantova, rundenst}@cs.wpi.edu
  • CIKM'05 Poster
  • http://davis.wpi.edu/dsrg/CAPE/index.html

2
Uncertainties in Stream Query Processing
  • Users register continuous queries and receive answers: the query workload is high, and real-time, accurate responses are required.
  • Streaming data flows into the distributed stream query engine, which produces streaming results; the input may have time-varying rates and high volumes.
  • Available resources for executing each operator may vary over time, and memory and CPU resources are limited.
  • Distribution and adaptation are required.
3
Adaptation in DCAPE: Distributed Stream Processing in a Nutshell
  • Adaptation Techniques
      • Spilling data to disk
      • Relocating work to other machines
      • Reoptimizing and migrating the query plan
  • Granularity of Adaptation
      • Operator-level distribution and adaptation
      • Partition-level distribution and adaptation
  • Integrated Methodologies
      • Consider trade-offs between spill vs. redistribute
      • Consider trade-offs between migrate vs. redistribute

4
CAPE System Architecture
[Figure: CAPE system architecture, LZ05, TLJ05. Labeled components: Query Processor, Distribution Manager, Local Plan Migrator, Connection Manager, Local Statistics Gatherer, Local Adaptation Controller, Global Plan Migrator, CAPE Continuous Query Processing Engine, Runtime Monitor, Query Plan Manager, Repository, Data Distributor, Data Receiver, Global Adaptation Controller. Also labeled: Streaming Data, Network, End User, Stream Servers, Application Servers.]

5
Random Distribution
Balanced Network Aware Distribution
  • Goal: To minimize network connectivity.
    Algorithm: Takes each query plan and creates sub-plans in which neighbouring operators are grouped together.
  • Goal: To equalize workload per machine.
    Algorithm: Iteratively takes each query operator and places it on the query processor with the fewest operators.
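The two placement ideas on this slide can be sketched roughly as follows. This is a minimal illustration, not the DCAPE implementation: it assumes a query plan is a simple linear chain of operators, and the function names (`balanced_placement`, `grouped_placement`, `cross_machine_edges`) and the operator and machine names are hypothetical.

```python
def balanced_placement(operators, processors):
    """Place each operator on the processor that currently holds the fewest operators."""
    assignment = {p: [] for p in processors}
    for op in operators:
        # min() keeps the first processor among ties, so ties follow list order.
        target = min(processors, key=lambda p: len(assignment[p]))
        assignment[target].append(op)
    return assignment


def grouped_placement(plan, processors):
    """Cut a linear plan into contiguous sub-plans, one per processor.

    Assumption: the plan is a chain, so neighbouring operators are simply
    adjacent list entries.
    """
    assignment = {p: [] for p in processors}
    size = -(-len(plan) // len(processors))  # ceiling division
    for i, op in enumerate(plan):
        assignment[processors[i // size]].append(op)
    return assignment


def cross_machine_edges(plan, assignment):
    """Count adjacent operator pairs that end up on different machines."""
    owner = {op: p for p, ops in assignment.items() for op in ops}
    return sum(owner[a] != owner[b] for a, b in zip(plan, plan[1:]))


plan = ["scan", "select", "join", "project", "aggregate", "sink"]
machines = ["M1", "M2"]
print(cross_machine_edges(plan, balanced_placement(plan, machines)))
print(cross_machine_edges(plan, grouped_placement(plan, machines)))
```

On this toy chain both placements put three operators on each machine, but grouping neighbours leaves far fewer operator-to-operator links crossing the network, which is the trade-off the two goals describe.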
6
Initial Distribution of Query Plan Across Cluster of Machines
[Figure: query plan distributed across machines M1 and M2 in two steps.]
  • Step 1: Create the distribution table using the initial distribution algorithm.
  • Step 2: Send the distribution information to the processing machines (nodes).
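As a rough illustration of the two steps, the sketch below builds a distribution table and then derives what each node would be told. The table layout (rows of operator and machine) and the function names are assumptions made for this example; the slide does not specify the table format or the message contents.

```python
from collections import defaultdict


def build_distribution_table(assignment):
    """Step 1: flatten {machine: [operators]} into (operator, machine) rows."""
    return [(op, machine) for machine, ops in assignment.items() for op in ops]


def messages_per_machine(table):
    """Step 2: group rows so each node receives only the operators it must run."""
    messages = defaultdict(list)
    for op, machine in table:
        messages[machine].append(op)
    return dict(messages)


# Hypothetical output of an initial distribution algorithm.
assignment = {"M1": ["scan", "select"], "M2": ["join", "project"]}
table = build_distribution_table(assignment)
for machine, ops in messages_per_machine(table).items():
    print(machine, ops)
```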

7
Run-Time Plan Redistribution
  • Cost per machine is determined as the percentage of memory filled with tuples.
  • Operators are redistributed based on a redistribution policy.
  • Redistribution policies in CAPE: Balance and Degradation.
[Figure: current and desired cost tables under the Balance policy; legend: M1, M2.]
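The cost measure and a balance-style decision could look roughly like the sketch below. Only the cost definition (percentage of memory filled with tuples) comes from the slide; the threshold, the per-operator cost breakdown, and the rule for picking which operator to move are assumptions for illustration, and the Degradation policy is not sketched because the slide gives no detail on it.

```python
def machine_cost(tuples_in_memory, memory_capacity):
    """Cost of a machine: percentage of its memory filled with tuples."""
    return 100.0 * tuples_in_memory / memory_capacity


def balance_step(costs, operator_costs, threshold=10.0):
    """Suggest one move from the most loaded to the least loaded machine.

    costs:          {machine: cost in percent}
    operator_costs: {machine: {operator: percent of memory it occupies}}
    Returns (operator, source, target), or None if the load is already
    within `threshold` percentage points (a hypothetical tolerance).
    """
    source = max(costs, key=costs.get)
    target = min(costs, key=costs.get)
    gap = costs[source] - costs[target]
    if gap <= threshold or not operator_costs[source]:
        return None
    # Moving an operator worth about half the gap brings both machines closest together.
    candidates = operator_costs[source]
    op = min(candidates, key=lambda o: abs(candidates[o] - gap / 2))
    return op, source, target


costs = {"M1": 80.0, "M2": 20.0}
operator_costs = {"M1": {"join": 50.0, "select": 30.0}, "M2": {"project": 20.0}}
print(balance_step(costs, operator_costs))
```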
8
Redistribution Protocol Across Machines
  • No tuples lost
  • No duplicates produced
  • No incorrect results produced
  • Seamless
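The slide lists only the guarantees, not the protocol steps. One common way to obtain such guarantees is to pause delivery to the operator, buffer tuples that arrive during the move, transfer the operator state, and then replay the buffer in order at the new location. The toy simulation below shows that idea under those assumptions; the `Operator` and `Router` classes are hypothetical and are not taken from DCAPE.

```python
class Operator:
    """Toy operator whose state is simply the list of tuples it has processed."""

    def __init__(self):
        self.state = []

    def process(self, tup):
        self.state.append(tup)


class Router:
    """Hypothetical upstream router that delivers tuples to the hosting machine."""

    def __init__(self, machines, host):
        self.machines = machines  # {machine name: Operator instance}
        self.host = host
        self.paused = False
        self.buffer = []

    def send(self, tup):
        if self.paused:
            self.buffer.append(tup)  # held back, so nothing is lost during the move
        else:
            self.machines[self.host].process(tup)

    def relocate(self, target, arriving=()):
        self.paused = True
        for tup in arriving:  # tuples that show up while the move is in progress
            self.send(tup)
        # Hand the state over exactly once, so no tuple is processed twice.
        self.machines[target].state = self.machines[self.host].state
        self.machines[self.host].state = []
        self.host = target
        self.paused = False
        pending, self.buffer = self.buffer, []
        for tup in pending:  # replay in arrival order at the new host
            self.send(tup)


machines = {"M1": Operator(), "M2": Operator()}
router = Router(machines, host="M1")
router.send(1)
router.send(2)
router.relocate("M2", arriving=[3, 4])
router.send(5)
print(machines["M2"].state)
```

After the move, the new host holds every tuple exactly once and in arrival order, which corresponds to the "no tuples lost" and "no duplicates" properties listed above.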

9
Query Plan Performance with a Query Plan of 40 Operators
  • Observations
      • Initial distribution is important for query plan performance.
      • Redistribution at run-time improves query plan performance.

10
From Operator- to Partition-level Adaptation
  • Problem of operator-level adaptation
      • Operators have large states.
      • Moving them across machines can be expensive.
  • Solution: partition-level adaptation
      • Partition state-intensive operators [Gra90, SH03, LR05]
      • Distribute the partitioned plan across multiple machines
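A minimal sketch of the partitioning idea: tuples feeding a state-intensive operator (for example a join) are hash-partitioned on a key, so each partition carries its own independent slice of state and can be placed or moved separately. The hash function, the round-robin partition-to-machine mapping, and all names here are assumptions for illustration, not the scheme used in the cited work.

```python
import zlib
from collections import defaultdict


def partition_id(key, num_partitions):
    """Map a key to a partition; crc32 is stable across runs, unlike hash() on strings."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def partition_stream(tuples, key_of, num_partitions):
    """Split a stream so that all tuples sharing a key land in the same partition."""
    parts = defaultdict(list)
    for tup in tuples:
        parts[partition_id(key_of(tup), num_partitions)].append(tup)
    return dict(parts)


def assign_partitions(num_partitions, machines):
    """Spread partitions over machines round-robin (a hypothetical initial placement)."""
    return {p: machines[p % len(machines)] for p in range(num_partitions)}


stream = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
parts = partition_stream(stream, key_of=lambda t: t[0], num_partitions=4)
placement = assign_partitions(4, ["M1", "M2"])
for pid, tuples in sorted(parts.items()):
    print(pid, placement[pid], tuples)
```

Relocating work then means changing one entry in `placement` and shipping only that partition's state, rather than the whole operator's state.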

11
CAPE Publications and Reports
  • [RDZ04] E. A. Rundensteiner, L. Ding, Y. Zhu, T. Sutherland, and B. Pielech, "CAPE: A Constraint-Aware Adaptive Stream Processing Engine." Invited Book Chapter. http://www.cs.uno.edu/nauman/streamBook/. July 2004.
  • [ZRH04] Y. Zhu, E. A. Rundensteiner, and G. T. Heineman, "Dynamic Plan Migration for Continuous Queries Over Data Streams." SIGMOD 2004, pages 431-442.
  • [DMR04] L. Ding, N. Mehta, E. A. Rundensteiner, and G. T. Heineman, "Joining Punctuated Streams." EDBT 2004, pages 587-604.
  • [DR04] L. Ding and E. A. Rundensteiner, "Evaluating Window Joins over Punctuated Streams." CIKM 2004, to appear.
  • [DRH03] L. Ding, E. A. Rundensteiner, and G. T. Heineman, "MJoin: A Metadata-Aware Stream Join Operator." DEBS 2003.
  • [RDSZBM04] E. A. Rundensteiner, L. Ding, T. Sutherland, Y. Zhu, B. Pielech, and N. Mehta, "CAPE: Continuous Query Engine with Heterogeneous-Grained Adaptivity." Demonstration Paper. VLDB 2004.
  • [SR04] T. Sutherland and E. A. Rundensteiner, "D-CAPE: A Self-Tuning Continuous Query Plan Distribution Architecture." Tech Report, WPI-CS-TR-04-18, 2004.
  • [SPR04] T. Sutherland, B. Pielech, Y. Zhu, L. Ding, and E. A. Rundensteiner, "Adaptive Multi-Objective Scheduling Selection Framework for Continuous Query Processing." IDEAS 2005.
  • [SLJR05] T. Sutherland, B. Liu, M. Jbantova, and E. A. Rundensteiner, "D-CAPE: Distributed and Self-Tuned Continuous Query Processing." CIKM, Bremen, Germany, Nov. 2005.
  • [LR05] B. Liu and E. A. Rundensteiner, "Revisiting Pipelined Parallelism in Multi-Join Query Processing." VLDB 2005.
  • [B05] B. Liu and E. A. Rundensteiner, "Partition-based Adaptation Strategies Integrating Spill and Relocation." Tech Report, WPI-CS-TR-05, 2005. (in submission)
  • CAPE Project: http://davis.wpi.edu/dsrg/CAPE/index.html