Title: Scheduling for Shared Window Joins over Data Streams
1 Scheduling for Shared Window Joins over Data Streams
Moustafa Hammad Purdue University
Michael Franklin UC Berkeley
Walid Aref Purdue University
Ahmed Elmagarmid Purdue University
2 Stream Applications and Continuous Queries
- Streaming Applications
- Sensor networks (smart building, biosensors, toll roads)
- Location-based services for mobile objects, Enhanced 911 (USA) and Enhanced 112 (Europe) [IEEE Spectrum, July 2003]
- Retail transactions
- Continuous Queries (CQ)
- A CQ reacts to new data as it arrives
- Data is continuously arriving → Query is continuously running
3 Challenges: Query Processing for Data Streams
- Data streams break a number of basic assumptions of traditional query processing technology
- Infinite streams → Infinite state (e.g., most join operations)
- Ordered execution
- Shared execution
- Multiple continuous queries (overlapping interests)
4 Shared Execution among Multiple CQs
- Sharing is a key technique in stream processing
- Resources are bound for a long time (the duration of the CQ)
- Large number of CQs, high stream rates and tight responsiveness requirements
- Join Operation, why?
- A core component: combines data from multiple streams for further processing and analysis
- A costly operation in stream processing
- Selection pull-up emphasizes sharing [Chen et al., ICDE02]
[Figure: query plans over streams S1 and S2 with selections σA and σB and a Split operator, illustrating selection pull-up]
5 Window Join Operator
- Data streams are unbounded
- Traditional Approach
- Deals with stored relations!
- Stream Processing Approach
- Maintains a window (scope of interest)
- E.g., joins tuples in the last one hour
- Q is repeatedly executed (CQ) → Sliding window join (SWJ)
- Multiple SWJs (same streams and different windows)
- Naïve Approach: No Sharing
- Shared SWJs?
- CACQ02, PSoup02: process the larger window, filter later for smaller windows (discriminates against queries with smaller window sizes)
Centralized stream processing system
6 Presentation Outline
- Problem Specification
- Scheduling Algorithms for SWJs
- LWO: Largest Window Only
- SWF: Smallest Window First
- MQT: Maximum Query Throughput
- Performance Study
- Conclusion
7 Example: Sliding Window Join Operation
- Window Join
- Example: A data center with hundreds of sensors that monitor temperature and humidity values
- Schema
- Temperature Stream (LocationID, Value, TimeStamp)
- Humidity Stream (LocationID, Value, TimeStamp)
[Figure: Data Center]
- CQ: Continuously monitor the count of sensors reporting temperature and humidity values above specific thresholds within the last one minute.
8 Problem Specification: Sliding Window Join Operation
Q1:
SELECT COUNT(DISTINCT A.LocationId)
FROM Temperature A, Humidity B
WHERE A.LocationId = B.LocationId AND
      A.Value > Threshold_t AND
      B.Value > Threshold_h
WINDOW 1 min
Q2:
SELECT A.LocationId, MAX(A.Value), MAX(B.Value)
FROM Temperature A, Humidity B
WHERE A.LocationId = B.LocationId
GROUP BY A.LocationId
WINDOW 1 hour
[Figure: example join output tuples (a9,b12), (a11,b8)]
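The join underlying Q1 and Q2 can be sketched as a symmetric, time-based sliding window join. This is only an illustration under stated assumptions: the function name `window_join`, the tuple layout `(stream, location_id, value, ts)`, and in-order arrival by timestamp are all hypothetical and not taken from the original system. Thresholds and aggregates (COUNT, MAX) would be applied on top of the emitted pairs.

```python
from collections import deque


def window_join(arrivals, window):
    """Symmetric sliding window join on location_id (illustrative sketch).

    arrivals: iterable of (stream, location_id, value, ts) in ts order,
              where stream is "A" (temperature) or "B" (humidity).
    window:   window size in the same time units as ts.
    Yields (a_tuple, b_tuple) pairs whose timestamps differ by at most `window`.
    """
    buffers = {"A": deque(), "B": deque()}  # per-stream tuples, oldest first
    for stream, loc, value, ts in arrivals:
        other = "B" if stream == "A" else "A"
        # Expire tuples of the opposite stream that fell out of the window.
        while buffers[other] and ts - buffers[other][0][2] > window:
            buffers[other].popleft()
        # Probe the opposite window, newest tuple first.
        for o_loc, o_value, o_ts in reversed(buffers[other]):
            if o_loc == loc:
                new, old = (loc, value, ts), (o_loc, o_value, o_ts)
                yield (new, old) if stream == "A" else (old, new)
        buffers[stream].append((loc, value, ts))


# Hypothetical readings: two sensors, timestamps in seconds.
arrivals = [
    ("A", 1, 30.0, 0),
    ("B", 1, 70.0, 10),
    ("A", 2, 31.0, 20),
    ("B", 2, 71.0, 90),
]
one_minute = list(window_join(arrivals, 60))
```

With a 60-second window only sensor 1 produces a joined pair, because the two readings of sensor 2 are 70 seconds apart; a larger window over the same streams would also return the sensor 2 pair, which is the overlap that shared execution exploits.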
9 Problem Specification: Shared Window Join
- Sharing must be transparent to the queries.
- Transparency Requirement
- Output order must be the same as in single (isolated) execution (property 1)
- Otherwise, sharing produces incorrect output (e.g., online MAX, online COUNT)
- Response time penalty due to sharing should be minimized (property 2)
[Figure: shared window join of streams A and B with a Joining part and a Routing part; results are delivered to Q1 (COUNT, 1 min window) and Q2 (MAX, Group By, 1 hour window)]
- In bursty workloads: on average, the system must accommodate the aggregate input rates.
10 Largest Window Only (LWO) [CACQ02]
Completely scans the largest window before serving a new tuple.
Example: 3 queries with 3 different windows (w1 < w2 < w3)
[Figure: LWO example. The join part scans streams A (a1 to a11) and B (b0 to b12) over windows w1, w2, w3; the routing part delivers the output data streams to Q1(w1), Q2(w2) and Q3(w3), e.g., (a11,b8) to Q1, (a11,b8) and (a11,b4) to Q2, and (a11,b8), (a11,b4) and (a11,b0) to Q3.]
Ordered output (satisfies property 1 of transparent execution)
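A minimal sketch of the LWO idea described on this slide: the join part scans the whole largest window for one new tuple, and the routing part forwards each match to every query whose window covers it. The function name `lwo_probe`, the `(key, ts)` tuple layout and the concrete window sizes are assumptions made for illustration, not the CACQ implementation.

```python
def lwo_probe(new_key, new_ts, other_window, query_windows):
    """Scan the largest window once and route matches (illustrative sketch).

    other_window:  tuples (key, ts) of the opposite stream, oldest first.
    query_windows: dict mapping a query name to its window size.
    Returns (query, matched_ts) pairs in the order they are produced.
    """
    w_max = max(query_windows.values())
    # Queries ordered by window size so routing is deterministic.
    by_size = sorted(query_windows.items(), key=lambda item: item[1])
    output = []
    for key, ts in reversed(other_window):  # newest first
        gap = new_ts - ts
        if gap > w_max:
            break  # beyond the largest window, nothing older can match
        if key == new_key:
            # Routing part: deliver to every query whose window covers the gap.
            for query, w in by_size:
                if gap <= w:
                    output.append((query, ts))
    return output


# Mirrors the slide's example of a11 joining b8, b4 and b0.
# The window sizes 3, 7 and 11 are made-up values with w1 < w2 < w3.
routed = lwo_probe(1, 11, [(1, 0), (1, 4), (1, 8)], {"Q1": 3, "Q2": 7, "Q3": 11})
```

Because each new tuple finishes its full scan before the next one starts, every query sees results in arrival order, but a query with a small window waits for scans it does not need.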
11 Largest Window Only (LWO)
- Analytical Analysis
- Average response time for query Qi
- Example
- 7 queries with window sizes between 1 second and 10 minutes.
- High response time for queries with small window sizes (fails property 2)
12 Proposed Alg. 1: Smallest Window First (SWF)
For all arriving tuples: scan the smallest window first, then the next larger window.
[Figure: SWF example. The join part scans streams A (a1 to a11) and B (b0 to b12) window by window (w1, then w2, then w3); the routing part delivers the output data streams to Q1(w1), Q2(w2) and Q3(w3).]
Ordered output is not straightforward: the routing part must buffer output before release.
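The scan order of SWF, and why its output needs buffering, can be sketched as follows. This is a simplified illustration, assuming a batch of pending tuples from one stream probing a fixed window of the other stream; the function name `swf_scan_order`, the `(key, ts)` layout and the window sizes are hypothetical.

```python
def swf_scan_order(pending, other_window, windows):
    """Return join results in the order SWF would produce them (sketch).

    pending:      new tuples (key, ts) waiting to be joined, oldest first.
    other_window: tuples (key, ts) of the opposite stream, oldest first.
    windows:      distinct window sizes in ascending order (w1 < w2 < ...).
    Each result is (window_index, new_ts, matched_ts); a result found at
    window_index i belongs to every query whose window is >= windows[i].
    """
    results = []
    lower = -1  # each step scans only the partial window (lower, w]
    for index, w in enumerate(windows):
        for key, ts in pending:  # all pending tuples scan this step first
            for o_key, o_ts in reversed(other_window):  # newest first
                gap = ts - o_ts
                if gap >= 0 and lower < gap <= w and o_key == key:
                    results.append((index, ts, o_ts))
        lower = w
    return results


# Two pending tuples (ts 11 and 12) probe three older tuples; windows are made up.
produced = swf_scan_order([(1, 11), (1, 12)], [(1, 0), (1, 4), (1, 8)], [3, 7, 12])

# Order the largest-window query would see in isolated execution:
# one new tuple at a time, matches from newest to oldest.
isolated = sorted(produced, key=lambda r: (r[1], -r[2]))
```

Here the pair for (12, 8) is produced before the pair for (11, 0), so the query with the largest window would receive results out of order unless the routing part holds them back, which is the buffering mentioned above.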
13 Smallest Window First (SWF)
- Analytical Analysis
- Average response time for query Qi
- Example
- 7 queries with window sizes between 1 second and 10 minutes.
- High response time for queries with large window sizes (fails property 2)
14 Proposed Alg. 2: Maximum Query Throughput (MQT)
- Observations
- LWO and SWF make wrong scheduling decisions.
- LWO and SWF ignore the count of queries per window.
- Greedy Approach
- Schedule the tuple that serves the max. number of queries in the shortest time!
- E.g., MAX{ N1/w1, (N2-N1)/(w2-w1) }
- Problem: ignores future scans (local optimum).
- Maximum Query Throughput (MQT)
- Considers all future scans at a given tuple position.
- MQT(a) = MAX{ (N2-N1)/(w2-w1), (N3-N1)/(w3-w1) }
- MQT(b) = MAX{ N1/w1 }
- Schedule (a or b) according to MAX{ MQT(a), MQT(b) }
[Figure: partial-window throughputs N1/w1 and (N2-N1)/(w2-w1) at tuple positions a and b]
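The difference between the greedy choice and MQT can be sketched with the formulas on this slide. Several things here are assumptions for illustration: N[k] is read as the cumulative number of queries served once a tuple has scanned up to window w[k] (with N[0] = w[0] = 0), each candidate tuple carries the range of future scans it may consider (taken from the slide, where a considers w2 and w3 and b considers only w1), and the numbers are invented so that the two strategies disagree.

```python
def greedy_score(start, N, w):
    """Queries served per unit of scan time by the next partial window only."""
    return (N[start + 1] - N[start]) / (w[start + 1] - w[start])


def mqt_score(start, last, N, w):
    """Best queries-per-time ratio over all future scans start -> k, k <= last."""
    return max(
        (N[k] - N[start]) / (w[k] - w[start]) for k in range(start + 1, last + 1)
    )


def choose(candidates, N, w, use_mqt):
    """Pick the candidate tuple with the highest score.

    candidates maps a tuple name to (start, last): the window index already
    scanned and the last window index it may consider. Requires start < last.
    """
    def score(name):
        start, last = candidates[name]
        if use_mqt:
            return mqt_score(start, last, N, w)
        return greedy_score(start, N, w)

    return max(candidates, key=score)


# Invented configuration: many queries share the largest window.
N = [0, 4, 5, 20]   # cumulative query counts N1=4, N2=5, N3=20
w = [0, 2, 4, 5]    # window sizes w1=2, w2=4, w3=5
candidates = {"a": (1, 3), "b": (0, 1)}  # a has scanned w1; b is new

greedy_pick = choose(candidates, N, w, use_mqt=False)
mqt_pick = choose(candidates, N, w, use_mqt=True)
```

In this configuration the greedy rule prefers b (N1/w1 = 2 against 0.5 for a's next step), while MQT prefers a because continuing a through w3 serves (N3-N1)/(w3-w1) = 16/3 queries per time unit.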
15 Maximum Query Throughput (MQT)
- Given a window-query configuration, build a matrix (MaxQT)
- MaxQT matrix: each entry is MAX( ) for all partial windows
- MaxQT is updated as new queries are added.
- Index MaxQT by relative tuple positions
- Example
- 3 queries with window sizes
- w1 = 2w, w2 = 3w, w3 = 6w
[Figure: example output tuples (a9,b12), (a11,b12), (a11,b6)]
16 Performance Study
- Performance Metrics
- Average and maximum response time.
- Memory requirements.
- Implementation
- Using a prototype database management system, PREDATOR.
- Both hash-based and nested-loop versions of the W-joins.
- Stream is an abstract data type, stream-type, with specific interface functions.
- StreamScan operator and stream manager to communicate the query execution plan to the stream type.
- Settings
- Synthetic data streams, join selectivity 0.002
- Sun Enterprise 450, Solaris 2.6 with 4 GBytes of memory
- The window is based on time units.
17 1. Varying window distributions
- Different window distributions
- Single query per window
- Hash-based implementation
- Window sizes from 1 sec to 10 minutes, λ = 100 tuples/sec
18 2. Varying level of burstiness
- Pareto distribution
- Average burst size: 15 tuples
- Window distribution: small-large
19 3. Varying Query Distribution
- Window distribution: small-large
- 80% of the total queries share a single window wi
- 20% are uniformly distributed over the other windows
- Total of 30 queries
[Figure: Small-Large window distribution; labels w2 through w7 and wi+1 - wi]
20 4. Memory Requirement
Memory Buffers
- For λ = 100 tuples/sec and w = 600 sec,
- Maximum size for joinBuffer: Smax = λ · wmax = 60K tuples
- For SWF and MQT
- Maximum size for inputBuffer is always less than 10% of Smax
- Maximum size for outputBuffer is always less than 3% of Smax
- With the basic assumption that the system can eventually keep up with the input arrival rate, the extra memory requirement is not significant for SWF and MQT.
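The buffer arithmetic on this slide can be checked with a few lines. This is a small worked example, assuming the join buffer bound is the arrival rate times the largest window and that the 10% and 3% figures are upper bounds on the extra buffers; the function name `buffer_bounds` is hypothetical.

```python
def buffer_bounds(rate, w_max, input_pct=10, output_pct=3):
    """Upper bounds on buffer sizes, in tuples (illustrative arithmetic).

    rate:  arrival rate in tuples per second.
    w_max: largest window in seconds.
    The percentages are the reported upper bounds for SWF and MQT.
    """
    s_max = rate * w_max  # a full window of tuples at the given rate
    return {
        "joinBuffer": s_max,
        "inputBuffer": s_max * input_pct // 100,
        "outputBuffer": s_max * output_pct // 100,
    }


bounds = buffer_bounds(rate=100, w_max=600)
extra = bounds["inputBuffer"] + bounds["outputBuffer"]
```

For 100 tuples/sec and a 600-second window this gives 60,000 tuples for the join buffer and at most 6,000 plus 1,800 tuples of extra buffering, i.e., 13% of the join buffer.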
21 Conclusion
- Sharing window joins is a key technique to achieve scalability and optimize system resources for CQ processing.
- We presented three algorithms:
- LWO, Largest Window Only [CACQ02]
- SWF, Smallest Window First (proposed)
- MQT, Maximum Query Throughput (proposed)
- MQT provides the best average response time among the three.
- MQT and SWF require additional processing to match isolated execution; experimentally, the gained performance outweighs the additional overhead for large window differences.
- The additional memory requirement for SWF and MQT is not significant (less than 13% of the joinBuffer size).
22 Thank You
23 Future Work
- Application of shared window execution to other window operations such as online aggregation, online GroupBy, duplicate elimination, union, intersect and difference operators.
- Starvation avoidance, e.g., for long-lasting bursty workloads.
- Load shedding approach: the filtered workload will naturally fit in our proposed approaches.
- Partial window join: explore partial areas in the overlapping windows to maximize output throughput.
- Window clustering and hierarchical filtering for a large number of queries with different windows. An interesting option to limit the number of scheduled windows.
24 2. Per-window response time
- Window distribution: uniform