Title: PSoup
1. PSoup
- Kevin Menard
- CS 561
- 4/11/2005
2. Streaming Queries over Streaming Data
Slides are modified versions of the following original presentation:
- Sirish Chandrasekaran
- UC Berkeley
- August 20, 2002
- with Michael J. Franklin
VLDB 2002
3. PSoup Insight 1
- Queries and data are duals
- Store new queries, apply them to data that arrived earlier
- Store new data, apply it to queries that arrived earlier
[Figure: an indexed Data store and an indexed Queries store]
- Multi-query processing: join of queries and data
- Supports all three types of queries: queries over the past (landmark and sliding window), continuous, and hybrid
5. Motivation?
- Why another model for continuous queries?
- What is wrong with how Aurora and STREAM supply
responses?
6. Motivation: Disconnected Operation
- Previous solutions stream out answers immediately
- Not feasible/suitable for all applications
- Intermittent connectivity, e.g., applications on hand-held devices (as in this morning's keynote address)
- Even if connected: not always interested in streaming answers
7. PSoup Insight 2
- Separate computation from delivery
- Query answers are continuously generated in the background
- Apply windows on demand to transmit current results
[Figure: a Data store (ID, R.a, R.b) and a Query store (ID, Predicate) feed a Results Structure, a matrix of T/F match bits with one row per data tuple and one column per query]
- Efficient support for disconnected operation
- Low response time; shared computation and storage across invocations
8. PSoup Query Model
- SELECT select_list
- FROM from_list
- WHERE where_clause
- BEGIN begin_time
- END end_time
- WHERE clause: conjunction of boolean factors
- BEGIN-END clause: system clock or sequence numbers
- (begin_time, end_time)
- (constant, constant): snapshot query
- (constant, variable): landmark window query
- (variable, variable): sliding window query
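The three window types above can be told apart purely by whether each bound is a constant or moves with the current time. The following is a minimal sketch of that rule; the `Bound` class, the `classify_window` and `resolve` functions, and the convention that a variable bound is stored as an offset from "now" are all assumptions made for illustration, not part of the original slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bound:
    """One end of a BEGIN-END clause (hypothetical representation)."""
    value: int       # absolute time if constant, offset from "now" if variable
    variable: bool   # True when the bound moves with the current time


def classify_window(begin: Bound, end: Bound) -> str:
    """Map (begin_time, end_time) to the query type listed on the slide."""
    if not begin.variable and not end.variable:
        return "snapshot"
    if not begin.variable and end.variable:
        return "landmark"
    if begin.variable and end.variable:
        return "sliding"
    raise ValueError("(variable, constant) is not one of the listed window types")


def resolve(bound: Bound, now: int) -> int:
    """Turn a bound into a concrete time or sequence number at invocation."""
    return now + bound.value if bound.variable else bound.value


snapshot = classify_window(Bound(10, False), Bound(20, False))
landmark = classify_window(Bound(10, False), Bound(0, True))
sliding = classify_window(Bound(-5, True), Bound(0, True))
```

Under these assumptions, a sliding window of the last five time units is written as `(Bound(-5, True), Bound(0, True))` and resolves to `(95, 100)` when invoked at time 100.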
9. Query Registration
- SELECT select_list
- FROM from_list
- WHERE where_clause
- BEGIN begin_time
- END end_time
[Figure: the Standing Query Clause (SQC) goes to the Symmetric Join; the BEGIN-END clause goes to the Windows_Table]
- QueryID: handle for future query invocations
10. Selections over Single Stream: Arrival of New Query Specification
Query Store
  ID  Predicate
  20  0 < R.a < 5
  21  R.a > 4 and R.b = 3
  22  0 > R.b > 4
  23  R.a = 4 and R.b = 3
Data Store
  ID  R.a  R.b
  48  4    3
  49  7    3
  50  3    8
  51  0    0
  52  8    4
(a) Initial State
11. Selections over Single Stream: Arrival of New Query Specification
Query Store
  ID  Predicate
  20  0 < R.a < 5
  21  R.a > 4 and R.b = 3
  22  0 > R.b > 4
  23  R.a = 4 and R.b = 3
Data Store
  ID  R.a  R.b
  48  4    3
  49  7    3
  50  3    8
  51  0    0
  52  8    4
New query: SELECT * FROM R WHERE R.a <= 4 AND R.b >= 3
(b) Arrival of new Query
12. Selections over Single Stream: Arrival of New Query Specification
Query Store (BUILD: the new query is inserted as ID 24)
  ID  Predicate
  20  0 < R.a < 5
  21  R.a > 4 and R.b = 3
  22  0 > R.b > 4
  23  R.a = 4 and R.b = 3
  24  R.a <= 4 and R.b >= 3
Data Store
  ID  R.a  R.b
  48  4    3
  49  7    3
  50  3    8
  51  0    0
  52  8    4
(c) Building Query Store
13. Selections over Single Stream: Arrival of New Query Specification
Query Store
  ID  Predicate
  20  0 < R.a < 5
  21  R.a > 4 and R.b = 3
  22  0 > R.b > 4
  23  R.a = 4 and R.b = 3
  24  R.a <= 4 and R.b >= 3
Data Store (PROBE with query 24)
  ID  R.a  R.b
  48  4    3    match
  49  7    3
  50  3    8    match
  51  0    0
  52  8    4
(d) Probing Data Store
14. Selections over Single Stream: Arrival of New Query Specification
Results (data tuples matched by the probe)
  ID  R.a  R.b
  48  4    3
  50  3    8
Results Structure (rows: data IDs; columns: query IDs)
  Data  20  21  22  23  24
  48                    ?
  49                    ?
  50                    ?
  51                    ?
  52                    ?
(e) Inserting Results
15. Selections over Single Stream: Arrival of New Query Specification
Results (data tuples matched by the probe)
  ID  R.a  R.b
  48  4    3
  50  3    8
Results Structure (rows: data IDs; columns: query IDs)
  Data  20  21  22  23  24
  48                    T
  49                    F
  50                    T
  51                    F
  52                    F
(e) Inserting Results
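The BUILD, PROBE, and insert steps for a newly arrived selection query can be sketched as follows. The Data Store contents come from the slides; the function name `register_query`, the dictionary layout, and the linear scan (the slides show an index) are illustrative assumptions. The operators `<=` and `>=` in query 24 are also an assumption, chosen because they reproduce the matches on tuples 48 and 50 shown in the Results Structure.

```python
# Data Store contents taken from the slides: data ID -> (R.a, R.b)
data_store = {48: (4, 3), 49: (7, 3), 50: (3, 8), 51: (0, 0), 52: (8, 4)}
query_store = {}   # query ID -> predicate over (a, b)
results = {}       # (data ID, query ID) -> True/False match bit


def register_query(query_id, predicate):
    """Handle arrival of a new query: BUILD, PROBE, then insert results."""
    query_store[query_id] = predicate               # BUILD into the Query Store
    matches = []
    for data_id, (a, b) in data_store.items():      # PROBE the Data Store
        hit = predicate(a, b)
        results[(data_id, query_id)] = hit          # insert into Results Structure
        if hit:
            matches.append(data_id)
    return matches


# Query 24 from the slides, with the assumed operators <= and >=
matches = register_query(24, lambda a, b: a <= 4 and b >= 3)
print(matches)  # [48, 50]
```

A real system would probe an index rather than scan every tuple; the scan keeps the sketch short.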
16. Selections over Single Stream: Arrival of New Data
Query Store
  ID  Predicate
  20  0 < R.a < 5
  21  R.a > 4 and R.b = 3
  22  0 > R.b > 4
  23  R.a = 4 and R.b = 3
  24  R.a <= 4 and R.b >= 3
Data Store
  ID  R.a  R.b
  48  4    3
  49  7    3
  50  3    8
  51  0    0
  52  8    4
(a) Initial State
17. Selections over Single Stream: Arrival of New Data
Query Store
  ID  Predicate
  20  0 < R.a < 5
  21  R.a > 4 and R.b = 3
  22  0 > R.b > 4
  23  R.a = 4 and R.b = 3
  24  R.a <= 4 and R.b >= 3
Data Store
  ID  R.a  R.b
  48  4    3
  49  7    3
  50  3    8
  51  0    0
  52  8    4
New data
  ID  R.a  R.b
  53  3    6
(b) Arrival of new Data
18. Selections over Single Stream: Arrival of New Data
Query Store
  ID  Predicate
  20  0 < R.a < 5
  21  R.a > 4 and R.b = 3
  22  0 > R.b > 4
  23  R.a = 4 and R.b = 3
  24  R.a <= 4 and R.b >= 3
Data Store (BUILD: the new tuple is inserted as ID 53)
  ID  R.a  R.b
  48  4    3
  49  7    3
  50  3    8
  51  0    0
  52  8    4
  53  3    6
(c) Building Data Store
19. Selections over Single Stream: Arrival of New Data
Query Store (PROBE with tuple 53)
  ID  Predicate
  20  0 < R.a < 5            match
  21  R.a > 4 and R.b = 3
  22  0 > R.b > 4
  23  R.a = 4 and R.b = 3
  24  R.a <= 4 and R.b >= 3  match
Data Store
  ID  R.a  R.b
  48  4    3
  49  7    3
  50  3    8
  51  0    0
  52  8    4
  53  3    6
(d) Probing Query Store
20. Selections over Single Stream: Arrival of New Data
Results (queries matched by the probe)
  ID  Predicate
  20  0 < R.a < 5
  24  R.a <= 4 and R.b >= 3
Results Structure (rows: data IDs; columns: query IDs)
  Data  20  21  22  23  24
  48
  49
  50
  51
  52
  53    ?   ?   ?   ?   ?
(e) Inserting Results
21. Selections over Single Stream: Arrival of New Data
Results (queries matched by the probe)
  ID  Predicate
  20  0 < R.a < 5
  24  R.a <= 4 and R.b >= 3
Results Structure (rows: data IDs; columns: query IDs)
  Data  20  21  22  23  24
  48
  49
  50
  51
  52
  53    T   F   F   F   T
(e) Inserting Results
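The arrival of a new data tuple is the mirror image of the arrival of a new query: the tuple is built into the Data Store and then probes the Query Store. A minimal sketch follows, assuming the predicates read as `=`, `<=`, and `>=` where the slide text has lost its operators; the name `insert_tuple` and the dictionary layout are hypothetical.

```python
# Query Store from the slides: query ID -> predicate over (a, b).
# Operators lost in the slide text are assumed to be =, <=, and >=.
query_store = {
    20: lambda a, b: 0 < a < 5,
    21: lambda a, b: a > 4 and b == 3,
    22: lambda a, b: 0 > b > 4,          # as printed on the slide; never true
    23: lambda a, b: a == 4 and b == 3,
    24: lambda a, b: a <= 4 and b >= 3,
}
data_store = {}   # data ID -> (R.a, R.b)
results = {}      # data ID -> {query ID: True/False}, one row per tuple


def insert_tuple(data_id, a, b):
    """Handle arrival of new data: BUILD, PROBE, then insert a result row."""
    data_store[data_id] = (a, b)                                   # BUILD
    row = {qid: pred(a, b) for qid, pred in query_store.items()}   # PROBE
    results[data_id] = row                                         # insert
    return [qid for qid, hit in row.items() if hit]


matching = insert_tuple(53, 3, 6)
print(matching)  # [20, 24]
```

The row stored for tuple 53 is T, F, F, F, T across queries 20 to 24, which agrees with the Results Structure on the slide.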
22. Query Invocation
- System returns the results corresponding to the current value of the BEGIN-END clause
Results Structure (rows: data IDs; columns: query IDs)
  Data  20  21  22  23  24
  48                    T
  49                    F
  50                    T
  51                    F
  52                    F
  53    T   F   F   F   T
[Figure: the BEGIN begin_time / END end_time clause selects the "Current Window" of rows in the Results Structure]
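Invocation can then be sketched as a lookup that applies the current window to match bits that were already computed in the background. The column for query 24 below is taken from the slides; the `windows_table` layout, the use of data IDs as sequence numbers, and the function name `invoke` are assumptions for illustration.

```python
# Column of the Results Structure for query 24: data ID -> match bit
results = {24: {48: True, 49: False, 50: True, 51: False, 52: False, 53: True}}

# Hypothetical Windows_Table: query ID -> (begin, end) in sequence numbers
windows_table = {24: (50, 53)}


def invoke(query_id):
    """Return IDs of matching tuples that fall inside the current window."""
    begin, end = windows_table[query_id]
    column = results[query_id]
    return [d for d in sorted(column) if begin <= d <= end and column[d]]


first = invoke(24)             # window covers tuples 50..53
windows_table[24] = (48, 49)   # the window changes; no predicate is re-evaluated
second = invoke(24)
print(first, second)  # [50, 53] [48]
```

Only the window bounds change between the two invocations, which is the sense in which computation is shared across invocations.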
23. Joins over R and S: Arrival of New Query Specification
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
R-Data Store
  ID  R.a  R.b
  10  2    5
  14  3    3
  31  4    1
  48  9    7
S-Data Store
  ID  S.a  S.b
  21  2    2
  25  3    3
  36  4    4
  49  5    5
(a) Initial State
24. Joins over R and S: Arrival of New Query Specification
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
R-Data Store
  ID  R.a  R.b
  10  2    5
  14  3    3
  31  4    1
  48  9    7
S-Data Store
  ID  S.a  S.b
  21  2    2
  25  3    3
  36  4    4
  49  5    5
New query
  ID  Predicate
  23  R.a < 5 and R.a > S.a and S.b > 1
(b) Arrival of new Query
25. Joins over R and S: Arrival of New Query Specification
Query Store (BUILD: the new query is inserted as ID 23)
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 5 and R.a > S.a and S.b > 1
R-Data Store
  ID  R.a  R.b
  10  2    5
  14  3    3
  31  4    1
  48  9    7
S-Data Store
  ID  S.a  S.b
  21  2    2
  25  3    3
  36  4    4
  49  5    5
(c) Building Query Store
26. Joins over R and S: Arrival of New Query Specification
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 5 and R.a > S.a and S.b > 1
R-Data Store (PROBE with query 23; matching tuples are collected as Matches)
  ID  R.a  R.b
  10  2    5
  14  3    3
  31  4    1
  48  9    7
S-Data Store
  ID  S.a  S.b
  21  2    2
  25  3    3
  36  4    4
  49  5    5
(d) Probing R-Data Store
27. Joins over R and S: Arrival of New Query Specification
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 5 and R.a > S.a and S.b > 1
R-Data Store
  ID  R.a  R.b
  10  2    5
  14  3    3
  31  4    1
  48  9    7
Hybrid Structs (built from the Matches)
  R.ID  Q.ID  Q.Predicate
  10    23    2 > S.a and S.b > 1
  14    23    3 > S.a and S.b > 1
  31    23    4 > S.a and S.b > 1
S-Data Store
  ID  S.a  S.b
  21  2    2
  25  3    3
  36  4    4
  49  5    5
(e) Constructing Hybrid Structs
28. Joins over R and S: Arrival of New Query Specification
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 5 and R.a > S.a and S.b > 1
R-Data Store
  ID  R.a  R.b
  10  2    5
  14  3    3
  31  4    1
  48  9    7
Hybrid Structs
  R.ID  Q.ID  Q.Predicate
  10    23    2 > S.a and S.b > 1
  14    23    3 > S.a and S.b > 1
  31    23    4 > S.a and S.b > 1
S-Data Store (PROBE with the Hybrid Structs; matches become Results)
  ID  S.a  S.b
  21  2    2
  25  3    3
  36  4    4
  49  5    5
Results (R, S, Q)
  ?
  ?
  ?
(f) Probing S-Data Store
29. Joins over R and S: Arrival of New Query Specification
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 5 and R.a > S.a and S.b > 1
R-Data Store
  ID  R.a  R.b
  10  2    5
  14  3    3
  31  4    1
  48  9    7
Hybrid Structs
  R.ID  Q.ID  Q.Predicate
  10    23    2 > S.a and S.b > 1
  14    23    3 > S.a and S.b > 1
  31    23    4 > S.a and S.b > 1
S-Data Store (PROBE with the Hybrid Structs; matches become Results)
  ID  S.a  S.b
  21  2    2
  25  3    3
  36  4    4
  49  5    5
Results (R, S, Q)
  14, 21, 23
  31, 21, 23
  31, 25, 23
(f) Probing S-Data Store
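The two-stage probe for a newly arrived join query can be sketched as follows, using the store contents from the slides. The split of query 23 into an R-only factor and a residual over (R, S) is an assumption about how the predicate is decomposed, and the names `register_join_query`, `r_only`, and `residual` are hypothetical. The hybrid struct here keeps the bound R tuple instead of a rewritten predicate such as `3 > S.a and S.b > 1`; the effect when probing S is the same.

```python
# Store contents from the slides: ID -> (a, b)
r_store = {10: (2, 5), 14: (3, 3), 31: (4, 1), 48: (9, 7)}
s_store = {21: (2, 2), 25: (3, 3), 36: (4, 4), 49: (5, 5)}


def register_join_query(query_id, r_only, residual):
    """Probe R, build hybrid structs, then probe S with each hybrid struct."""
    # Stage 1: probe the R-Data Store with the factor that mentions only R
    hybrids = [(r_id, query_id, r) for r_id, r in r_store.items() if r_only(r)]
    # Stage 2: probe the S-Data Store with the remaining factors, R values bound
    results = [
        (r_id, s_id, q_id)
        for r_id, q_id, r in hybrids
        for s_id, s in s_store.items()
        if residual(r, s)
    ]
    return hybrids, results


# Query 23: R.a < 5 and R.a > S.a and S.b > 1
hybrids, results = register_join_query(
    23,
    r_only=lambda r: r[0] < 5,
    residual=lambda r, s: r[0] > s[0] and s[1] > 1,
)
print([h[0] for h in hybrids])  # [10, 14, 31]
print(results)                  # [(14, 21, 23), (31, 21, 23), (31, 25, 23)]
```

R tuple 10 yields a hybrid struct but no result, because no S tuple has S.a below 2.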
30. Joins over R and S: Arrival of New Data
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 4 and R.b < S.b
R-Data Store
  ID  R.a  R.b
  47  4    3
  50  5    3
  51  3    8
S-Data Store
  ID  S.a  S.b
  48  4    4
  49  5    3
  52  3    2
(a) Initial State
31. Joins over R and S: Arrival of New Data
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 4 and R.b < S.b
R-Data Store
  ID  R.a  R.b
  47  4    3
  50  5    3
  51  3    8
S-Data Store
  ID  S.a  S.b
  48  4    4
  49  5    3
  52  3    2
New data
  ID  R.a  R.b
  53  5    4
(b) Arrival of new Data
32. Joins over R and S: Arrival of New Data
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 4 and R.b < S.b
R-Data Store (BUILD: the new tuple is inserted as ID 53)
  ID  R.a  R.b
  47  4    3
  50  5    3
  51  3    8
  53  5    4
S-Data Store
  ID  S.a  S.b
  48  4    4
  49  5    3
  52  3    2
(c) Building R-Data Store
33. Joins over R and S: Arrival of New Data
Query Store (PROBE with R tuple 53; matching queries are collected as Matches)
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 4 and R.b < S.b
R-Data Store
  ID  R.a  R.b
  47  4    3
  50  5    3
  51  3    8
  53  5    4
S-Data Store
  ID  S.a  S.b
  48  4    4
  49  5    3
  52  3    2
(c) Probing Query Store
34. Joins over R and S: Arrival of New Data
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 4 and R.b < S.b
R-Data Store
  ID  R.a  R.b
  47  4    3
  50  5    3
  51  3    8
  53  5    4
Hybrid Structs (being built from the Matches)
  R.ID  Q.ID  Q.Predicate
  ?     ?     4 < S.b
  53    21    ?
  53    22    ?
S-Data Store
  ID  S.a  S.b
  48  4    4
  49  5    3
  52  3    2
(d) Constructing Hybrid Structs
35. Joins over R and S: Arrival of New Data
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 4 and R.b < S.b
R-Data Store
  ID  R.a  R.b
  47  4    3
  50  5    3
  51  3    8
  53  5    4
Hybrid Structs (built from the Matches)
  R.ID  Q.ID  Q.Predicate
  53    20    4 < S.b
  53    21    4 < S.b and S.a < 10
  53    22    10 > S.a and S.b > 2
S-Data Store
  ID  S.a  S.b
  48  4    4
  49  5    3
  52  3    2
(d) Constructing Hybrid Structs
36. Joins over R and S: Arrival of New Data
Query Store
  ID  Predicate
  20  R.a = 5 and R.b < S.b
  21  R.a > 4 and R.b < S.b and S.a < 10
  22  R.b = 4 and R.a + 5 > S.a and S.b > 2
  23  R.a < 4 and R.b < S.b
R-Data Store
  ID  R.a  R.b
  47  4    3
  50  5    3
  51  3    8
  53  5    4
Hybrid Structs
  R.ID  Q.ID  Q.Predicate
  53    20    4 < S.b
  53    21    4 < S.b and S.a < 10
  53    22    10 > S.a and S.b > 2
S-Data Store (PROBE with the Hybrid Structs; matches become Results)
  ID  S.a  S.b
  48  4    4
  49  5    3
  52  3    2
Results (R, S, Q)
  53, 48, 22
  53, 49, 22
(e) Probing S-Data Store
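The data-arrival case for joins can be sketched in the same way: the new R tuple probes the Query Store, each matching query yields a hybrid struct, and the hybrid structs probe the S-Data Store. Store contents come from the slides. The decomposition of each predicate into an R-only part and a residual, the reading of the garbled operators as `=` and `+`, and the name `insert_r_tuple` are assumptions for illustration.

```python
# Store contents from the slides: ID -> (a, b)
r_store = {47: (4, 3), 50: (5, 3), 51: (3, 8)}
s_store = {48: (4, 4), 49: (5, 3), 52: (3, 2)}

# Query ID -> (factor over R only, residual over R and S); operators assumed
queries = {
    20: (lambda r: r[0] == 5, lambda r, s: r[1] < s[1]),
    21: (lambda r: r[0] > 4, lambda r, s: r[1] < s[1] and s[0] < 10),
    22: (lambda r: r[1] == 4, lambda r, s: r[0] + 5 > s[0] and s[1] > 2),
    23: (lambda r: r[0] < 4, lambda r, s: r[1] < s[1]),
}


def insert_r_tuple(r_id, r):
    """BUILD the R tuple, probe the Query Store, then probe the S-Data Store."""
    r_store[r_id] = r                                              # BUILD
    hybrids = [(r_id, q_id, residual)                              # probe queries
               for q_id, (r_only, residual) in queries.items() if r_only(r)]
    results = [(r_id, s_id, q_id)                                  # probe S
               for _, q_id, residual in hybrids
               for s_id, s in s_store.items() if residual(r, s)]
    return [q_id for _, q_id, _ in hybrids], results


matched_queries, results = insert_r_tuple(53, (5, 4))
print(matched_queries)  # [20, 21, 22]
print(results)          # [(53, 48, 22), (53, 49, 22)]
```

Queries 20 and 21 produce hybrid structs but no results, since no S tuple has S.b greater than 4.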
37. Other Queries
- N-way Joins
- Similar to 2-way joins
- Probe, generate hybrid structs, repeat
- Can be executed without intermediate tables
- Aggregations
- Performed at query invocation
- Uses n-ary ranked tree, clustered on time
38. Telegraph Background: CACQ
- CACQ [MSHR02]
- Shared execution of multiple queries with one Eddy
- Tuple lineage
- Query indices
- Queries and data treated very differently
- Only landmark continuous queries
- No support for disconnected operation
39. PSoup in Telegraph
- Leverage SteMs to store and index queries
- Changes to Eddies
- Encode queries as tuples
- Break the WHERE clause into individual boolean factors (BF)
- Encode each BF as
- R.a relop R.bS.b - constant
- Stream Prefix Consistency
- A new query or data tuple is completely processed before any other tuple: no holes in the Results Structure
- Results Structure to buffer the results
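Encoding queries as tuples can be sketched for the simplest case, where every boolean factor compares one attribute with a constant. The `BooleanFactor` layout, the whitespace-separated input format, and the function names `encode` and `satisfies` are assumptions for illustration; the slides do not specify the tuple layout, and factors that compare two attributes are not covered here.

```python
import operator
from typing import NamedTuple

# Relational operators supported by this sketch
RELOPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
          ">=": operator.ge, ">": operator.gt}


class BooleanFactor(NamedTuple):
    """One boolean factor of a WHERE clause, stored as a tuple."""
    query_id: int
    attribute: str
    relop: str
    constant: int


def encode(query_id, where_clause):
    """Split e.g. 'R.a > 4 and R.b = 3' into one tuple per boolean factor."""
    factors = []
    for part in where_clause.split(" and "):
        attribute, relop, constant = part.split()
        factors.append(BooleanFactor(query_id, attribute, relop, int(constant)))
    return factors


def satisfies(tuple_values, factors):
    """A data tuple satisfies the query if every boolean factor holds."""
    return all(RELOPS[f.relop](tuple_values[f.attribute], f.constant)
               for f in factors)


factors = encode(21, "R.a > 4 and R.b = 3")
hit = satisfies({"R.a": 7, "R.b": 3}, factors)
miss = satisfies({"R.a": 4, "R.b": 3}, factors)
```

Storing factors as plain tuples is what lets them be inserted into and probed from a store in the same way as data tuples.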
40. Experiments and Results
- Alternatives
- NoMat: no background processing
- PSoup-Partial: background processing, apply current window on invocation
- PSoup-Complete: current windows are also continuously applied in the background
- Experimental Parameters
- Unloaded server with two Intel Pentium III 666 MHz processors and 768 MB RAM
- Data arrives as fast as possible, in domain [0,255]
- Queries of form R.a relop C, where C in [0,255]
- Join queries of form R.a relop S.b +/- C
41. Experiments: Response Time vs. Window Size
- Interval Predicates, Selection Queries
42. Experiments: Response Time vs. Window Size
- Equality Predicates, Selection Queries
43. Experiments: Max data arrival rate vs. SQCs
44. PSoup in a traditional query processor
- PSoup: SQL query over data and client query streams?
- Joins: expression evaluators
- Notes
- Conventional QPs do not have tuple lineage
- Conventional QPs always use intermediate tables
45. Conclusions
- Treating queries and data the same
- Combines approaches for previously studied queries
- Queries over the past and continuous queries
- Allows new functionality: hybrid queries
- Separating result generation and delivery
- Makes disconnected operation feasible
- Efficient support for repeated query invocations