STREAM: The Stanford Data Stream Management System

Slides: 10
Provided by: Andrew978
Learn more at: http://web.cs.wpi.edu

1
STREAM: The Stanford Data Stream Management System
  • Rebuttal Team
  • Mingzhu Wei
  • Di Yang
  • CS525s - Fall 2006

2
Rebuttal Areas
  • Foundation
  • Windows
  • Joins
  • Full Recalculation Strategy
  • Language Issues

3
Foundation
  • Rebuttal
  • No proof provided to guarantee correctness or
    completeness of query plans resulting from the
    combination of operators, queues, and synopses
  • Analysis
  • STREAM is based on relational database theory
  • Relational databases have been around for a long
    time
  • Proofs exist that demonstrate their correctness
  • CQL is a minor extension to SQL
  • The authors could have put more effort into a
    formal proof of correctness for CQL

4
Windows
  • Rebuttal
  • STREAM does not provide value-based windows
  • For example, without a value-based window the
    system cannot efficiently process a query such as
  • "Give me the names of the students who have the
    top 10 exam scores"
  • Analysis
  • Feature not supported by STREAM
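
To make the example query concrete, the following is a minimal sketch (not STREAM code and not CQL) of keeping the top-k exam scores over an unbounded stream with a bounded min-heap. The class `TopK` and its methods `add` and `result` are hypothetical names introduced only for illustration, and the sketch says nothing about how a value-based window would actually be specified in a query language.

```python
import heapq


class TopK:
    """Keep the k highest (score, name) pairs seen so far on a stream."""

    def __init__(self, k):
        self.k = k
        self._heap = []  # min-heap: the lowest retained score sits at index 0

    def add(self, name, score):
        item = (score, name)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, item)
        elif item > self._heap[0]:
            # The new tuple beats the weakest retained one; swap it in.
            heapq.heapreplace(self._heap, item)

    def result(self):
        """Return the retained names, highest score first."""
        return [name for _score, name in sorted(self._heap, reverse=True)]
```

Each arrival costs O(log k), and the state never grows beyond k tuples, which is the kind of efficiency the rebuttal asks for.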

5
Joins
  • Rebuttal
  • STREAM only uses the self-purge mechanism when
    performing a window-based join
  • Analysis
  • STREAM's criterion for judging when a tuple (in
    the state of the window) has expired is a
    comparison of its timestamp with that of new
    incoming tuples in the same stream
  • Cross-purge might be more efficient in some cases
  • Cross-purge: compare timestamps across the two
    streams
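
The difference between the two purge strategies can be sketched as follows. This is an illustrative sketch only, not STREAM's implementation: it assumes a time-based window of length `size`, an equi-join on a key, and tuples arriving in global timestamp order across both streams. The names `purge`, `window_join`, and `cross_purge` are hypothetical. The join condition checks timestamps explicitly, so purging affects only how much state is retained, not the results.

```python
from collections import deque


def purge(window, now, size):
    """Drop tuples from the front whose timestamp is <= now - size."""
    while window and window[0][0] <= now - size:
        window.popleft()


def window_join(events, size, cross_purge=False):
    """Join streams 'A' and 'B' on key within a time window.

    events: iterable of (stream, ts, key), ordered by ts.
    Returns (results, retained state size per stream).
    """
    windows = {"A": deque(), "B": deque()}
    results = []
    for stream, ts, key in events:
        other = "B" if stream == "A" else "A"
        purge(windows[stream], ts, size)      # self-purge: same stream only
        if cross_purge:
            purge(windows[other], ts, size)   # cross-purge: opposite stream too
        for other_ts, other_key in windows[other]:
            if other_key == key and ts - other_ts < size:
                results.append((key, ts, other_ts))
        windows[stream].append((ts, key))
    return results, {name: len(w) for name, w in windows.items()}
```

With self-purge alone, a stream that goes quiet keeps its expired tuples until its own next arrival; cross-purge lets arrivals on the other stream reclaim that state earlier.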

6
Full Recalculation Strategy
  • Rebuttal
  • STREAM uses a full recalculation strategy for
    result updating
  • Could be very inefficient with large window sizes
  • Example
  • We are trying to join two windows, each of size
    1000
  • If both windows slide by only 10 at a time,
    recalculating the whole result would be much
    more expensive than incremental result updating
  • Analysis
  • Using an Incremental Result Update Strategy might
    be more efficient in some cases
  • Keep most of the joined result and compute only
    the results involving newly arrived tuples
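
The example above can be sketched in code. This is a minimal illustration under stated assumptions, not STREAM's implementation: it assumes a nested-loop equi-join, tuples of the form `(key, id)` with unique ids per stream, and window size and slide measured in tuples. The function names `full_join`, `incremental_join`, and `comparison_counts` are hypothetical.

```python
def full_join(a, b):
    """Full recalculation: compare every pair in the two windows."""
    return {(x, y) for x in a for y in b if x[0] == y[0]}


def incremental_join(prev_result, a_old, b_old, a_new, b_new):
    """Update the previous result instead of recomputing it."""
    a_old_s, b_old_s = set(a_old), set(b_old)
    a_new_s, b_new_s = set(a_new), set(b_new)
    # Keep earlier results whose tuples are both still inside the windows.
    kept = {(x, y) for (x, y) in prev_result if x in a_new_s and y in b_new_s}
    a_added = [x for x in a_new if x not in a_old_s]
    b_added = [y for y in b_new if y not in b_old_s]
    a_retained = [x for x in a_new if x in a_old_s]
    # New A tuples probe all of B; retained A tuples probe only new B tuples.
    fresh = {(x, y) for x in a_added for y in b_new if x[0] == y[0]}
    fresh |= {(x, y) for x in a_retained for y in b_added if x[0] == y[0]}
    return kept | fresh


def comparison_counts(window, slide):
    """Pair comparisons per slide: (full recalculation, incremental).

    Ignores the cost of discarding expired results in the incremental case.
    """
    full = window * window
    incremental = slide * window + (window - slide) * slide
    return full, incremental
```

Under these assumptions, `comparison_counts(1000, 10)` gives 1,000,000 pair comparisons for full recalculation against 19,900 for the incremental update.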

7
Language Issues
  • Rebuttal
  • STREAM does not provide a stream-to-stream
    operator
  • Analysis
  • The absence of a stream-to-stream operator is not
    explicitly justified in the paper
  • Its absence is reasonable because STREAM
    operators treat all input as relations
  • STREAM does provide operators for converting
    streams to relations and for converting relations
    to streams
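
The conversion path described above (stream to relation, relational operator, relation back to stream) can be sketched as follows. This is an illustration only: `rows_window` loosely mirrors a tuple-based sliding window and `istream` loosely mirrors CQL's Istream operator, but the Python function names, the `select_even` filter, and the use of sets (which assumes distinct values, whereas CQL relations are bags) are assumptions made for the sketch.

```python
def rows_window(stream, n):
    """Stream-to-relation: after each arrival, yield the last n tuples."""
    for i in range(len(stream)):
        yield stream[max(0, i + 1 - n): i + 1]


def select_even(relations):
    """Relation-to-relation: an ordinary filter applied at each instant."""
    for rel in relations:
        yield [v for v in rel if v % 2 == 0]


def istream(relations):
    """Relation-to-stream: emit tuples present now but not at the previous instant."""
    prev = set()
    for rel in relations:
        cur = set(rel)
        yield sorted(cur - prev)
        prev = cur
```

A stream-to-stream effect is obtained by composition, for example `list(istream(select_even(rows_window([1, 2, 3, 4], 2))))`, which is why a dedicated stream-to-stream operator is arguably unnecessary.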

8
Language Issues (cont.)
  • Rebuttal
  • STREAM uses an append-only model
  • It does not provide an operator for updating data
    values in a stream
  • Analysis
  • Although not perfect, this is a common assumption
    in current stream processing papers

9
Conclusion
  • Foundation
  • Windows
  • Joins
  • Full Recalculation Strategy
  • Language Issues