TelegraphCQ: Continuous Dataflow Processing for an Uncertain World

Slides: 12

Provided by: defau635

Learn more at: http://web.cs.wpi.edu

Transcript and Presenter's Notes

1
TelegraphCQ: Continuous Dataflow Processing for
an Uncertain World

Sirish Chandrasekaran, Owen Cooper, Amol
Deshpande, Michael J. Franklin, Joseph M.
Hellerstein, Wei Hong, Sailesh Krishnamurthy, Sam
Madden, Vijayshankar Raman, Fred Reiss, and
Mehul Shah
University of California, Berkeley
Intel Berkeley Laboratory
IBM Almaden Research Center
http://telegraph.cs.berkeley.edu/

2
Contents

3
TelegraphCQ: Background and Motivation

Adaptive Dataflow Architecture: systems that can adjust their processing on the fly in response to:
  Changes in user needs [HACO99]
  Intermittent delays in accessing data across WANs [UFA98]
Shared Processing:
  CACQ [MSHR02]
  PSoup [CF02]
Limitations of these earlier systems:
  Processing restricted to in-memory data
  No scheduling or resource management for queries with little or no overlap
  No Quality of Service (QoS) support for adapting to resource limitations
  No way to trade off flexibility against overhead

4
Telegraph - Architecture

5
Adaptive Processing: Eddies and SteMs

6
Fjords: Inter-Module Communication

Allows a mixture of push and pull connections between modules
  A pull-queue is implemented using a blocking dequeue on the consumer side and a blocking enqueue on the producer side
  A push-queue is implemented using non-blocking enqueue and dequeue; control is returned to the consumer when the queue is empty
Queries can execute over any combination of streaming and static data sources
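
The two queue types can be sketched roughly as follows. This is only an illustration in Python built on the standard queue module, not the TelegraphCQ implementation; the class names PullQueue and PushQueue and the capacity parameter are assumptions made for the example.

```python
import queue
import threading


class PullQueue:
    """Pull connection: both ends block (hypothetical sketch)."""

    def __init__(self, capacity=4):
        self._q = queue.Queue(maxsize=capacity)

    def enqueue(self, item):
        # Producer waits while the queue is full.
        self._q.put(item, block=True)

    def dequeue(self):
        # Consumer waits until a tuple is available.
        return self._q.get(block=True)


class PushQueue:
    """Push connection: neither end blocks (hypothetical sketch)."""

    def __init__(self, capacity=4):
        self._q = queue.Queue(maxsize=capacity)

    def enqueue(self, item):
        # Report failure instead of waiting when the queue is full.
        try:
            self._q.put_nowait(item)
            return True
        except queue.Full:
            return False

    def dequeue(self):
        # Hand control straight back to the consumer when empty.
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None


push = PushQueue()
print(push.dequeue())        # empty queue: returns immediately with None
push.enqueue("tuple-1")
print(push.dequeue())

pull = PullQueue()
producer = threading.Thread(target=pull.enqueue, args=("row-1",))
producer.start()
print(pull.dequeue())        # waits until the producer has enqueued
producer.join()
```

Returning None from an empty push-queue lets the consumer go on to other work, such as polling another streaming input, instead of stalling on one source.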

Flux: Scaling Up Dataflow Processing

Interposed between a producer-consumer operator pair in a pipelined, partitioned dataflow
Flux stands for Fault-tolerant, Load-balancing eXchange
Load balancing via online repartitioning of the input stream and the corresponding operator state
Fault tolerance by leveraging the same state-movement mechanisms to replicate an operator's internal state and in-flight data
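
The repartitioning idea can be illustrated with a small sketch: a router sits between producer and consumers, and moving a partition means moving its operator state and rerouting its tuples. This is not Flux itself; the Consumer and Router classes, the per-key counting operator, and the byte-sum partition function are all assumptions chosen to keep the example short and deterministic.

```python
class Consumer:
    """A consumer instance holding per-partition operator state (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.state = {}  # partition id -> {key: count}

    def process(self, pid, key):
        counts = self.state.setdefault(pid, {})
        counts[key] = counts.get(key, 0) + 1


class Router:
    """Sits between producer and consumers; owns the partition map (hypothetical)."""

    def __init__(self, consumers, num_partitions=8):
        self.num_partitions = num_partitions
        self.owner = {p: consumers[p % len(consumers)]
                      for p in range(num_partitions)}

    def partition_of(self, key):
        # Deterministic toy hash: sum of the key's bytes.
        return sum(key.encode()) % self.num_partitions

    def route(self, key):
        pid = self.partition_of(key)
        self.owner[pid].process(pid, key)

    def move(self, pid, target):
        # Online repartitioning: ship the partition's state, then reroute.
        source = self.owner[pid]
        if source is target:
            return
        target.state[pid] = source.state.pop(pid, {})
        self.owner[pid] = target


a, b = Consumer("a"), Consumer("b")
router = Router([a, b])
router.route("MSFT")
router.route("MSFT")
pid = router.partition_of("MSFT")
router.move(pid, b)          # rebalance: partition and its state go to b
router.route("MSFT")
print(b.state[pid])
```

The same move step, applied as a copy rather than a transfer, is the kind of state movement the slide describes being reused for replication.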

7
Initial CQ Approaches

CACQ
  First CQ engine to exploit the adaptive query processing framework
  Modification of Eddies: executes multiple queries as a single super-query, the disjunction of all the queries
  Tuple Lineage: per-tuple state used to determine which clients' queries a tuple satisfies
  Grouped Filters: an index over single-variable Boolean factors on the same attribute, used to optimize selections in the shared execution
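
A grouped filter can be sketched as a sorted index over the constants that many queries compare the same attribute against, so one lookup finds every query a tuple satisfies. The sketch below handles only predicates of the form attr > c; the class name, method names, and query ids are assumptions for illustration, not CACQ's actual data structure.

```python
import bisect


class GreaterThanGroupedFilter:
    """Indexes `attr > c` predicates from many queries (hypothetical sketch)."""

    def __init__(self):
        self._constants = []  # sorted constants
        self._query_ids = []  # query id for each constant, same order

    def add(self, query_id, constant):
        i = bisect.bisect_right(self._constants, constant)
        self._constants.insert(i, constant)
        self._query_ids.insert(i, query_id)

    def matching(self, value):
        # `value > c` holds exactly for the constants strictly below value,
        # which form a prefix of the sorted list.
        i = bisect.bisect_left(self._constants, value)
        return set(self._query_ids[:i])


gf = GreaterThanGroupedFilter()
gf.add("q1", 50)   # q1: price > 50
gf.add("q2", 60)   # q2: price > 60
gf.add("q3", 40)   # q3: price > 40
print(sorted(gf.matching(55)))
```

One binary search replaces a separate predicate evaluation per query, which is the saving the shared execution is after.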

8
Window Semantics in TelegraphCQ

Rich windowing schemes over both already-arrived and incoming data
Window semantics supported:
  Snapshot query: executes exactly once over one window
    e.g. Select the closing prices for MSFT on the first five days of trading
  Landmark query: fixed beginning point and a forward-moving endpoint
    e.g. Select all the days after the hundredth trading day on which the closing price of MSFT has been greater than 50. Keep this query standing in the system for a thousand trading days
  Sliding query: forward-moving beginning and end points
    e.g. On every fifth trading day starting today, calculate the average closing price of MSFT for the five most recent trading days. Keep the query standing for fifty trading days
  Temporal Band-Join: joins tuples in one stream with those in another based on timestamp
    e.g. For the five most recent trading days starting today, select all stocks that closed higher than MSFT on a given day. Keep the query standing for twenty trading days
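
The first three window types can be made concrete by listing the (start, end) day ranges each one evaluates. The sketch below is an illustration only and does not reproduce TelegraphCQ's query syntax; the function names, the inclusive day numbering, and the choice of day 200 as "today" are assumptions.

```python
def snapshot(start, end):
    """One window, evaluated exactly once."""
    yield (start, end)


def landmark(start, first_end, last_end):
    """Fixed start; the end advances one day at a time."""
    for t in range(first_end, last_end + 1):
        yield (start, t)


def sliding(first_end, width, hop, duration):
    """Both ends advance; a window of `width` days every `hop` days."""
    for t in range(first_end, first_end + duration, hop):
        yield (t - width + 1, t)


# Snapshot: the first five trading days.
print(list(snapshot(1, 5)))

# Landmark: days after the hundredth, standing for a thousand days.
lm = list(landmark(101, 101, 1100))
print(lm[0], lm[-1], len(lm))

# Sliding: five-day windows every fifth day for fifty days,
# taking "today" to be day 200 (an arbitrary choice for the example).
print(list(sliding(200, 5, 5, 50)))
```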

9
TelegraphCQ: Design Overview

Adapted the architecture of PostgreSQL
Implemented the new system in C/C++ to leverage the open-source PostgreSQL code base
Reused existing components with varying degrees of modification

10
TelegraphCQ Architecture

11
Conclusion

TelegraphCQ provides an adaptive dataflow and shared processing architecture
Eddies and SteMs form the building blocks for adaptive processing
Fjords provide inter-module communication (push and pull connections); Flux is a Fault-tolerant, Load-balancing eXchange
Shared processing builds on CACQ (tuple lineage and grouped filters) and PSoup (symmetric treatment of data and queries)
Built on the PostgreSQL code base

Thank you