Title: Fjording The Stream: An Architecture for Queries over Streaming Sensor Data
1. Fjording The Stream: An Architecture for Queries over Streaming Sensor Data
- Samuel Madden, Michael Franklin
- UC Berkeley
2. Introduction
- Telegraph Sensor Query Processing Architecture
  - Fjords
    - Enable push and pull in query plans
    - Operators for streaming data
  - Sensor proxy
    - Sensor-query mediator
(Figure: architecture diagram; labels: User, Proxy, Web, Sensors)
3. Roadmap
- Background
- Sensors
- Requirements
- Traffic Scenario
- Fjords
- Continuous Queries
- Stream-sensitive operators
- Sensor Proxies
- Querying a Sensor Network
- Results
- Graphs
- Related Work
5. Sensor Challenges
- Battery powered
  - 2 AA cells (2800 mAh, 1.01 x 10^4 J) or a coin cell (100 mAh)
  - Communication dominates power cost
  - Can be exhausted: tens to hundreds of MB of data per sensor
- Wireless
  - High loss rates (20% at 5 meters, typical)
  - Low bandwidth (10 kbps)
- Near real time
- Streaming data
  - Pushed at (user-defined) regular intervals
(Figure: TinyOS mote)
6. Requirements of Sensor Query Processing
- Power Sensitivity
- Tolerance to unbounded streams of data
- Tolerance to push
- Tolerance to intermittent, lossy connections
- Tolerance to failed sensors
7. Traffic Scenario
- The California Department of Transportation (CalTrans) has sensors all over Bay Area freeways
  - Inductive loop sensors give speed, flow, and vehicle size
  - Motes could collect this data cheaply
- Many possible queries
  - Commuters want to find congestion
  - California Highway Patrol (CHP) wants to find accidents
9. Fjords
- Query plan implementation
- Useful for streams and distributed environments
- Combine push (streaming) data and pull (static) data
  - E.g., traffic sensors with CHP accident reports
10. Push vs. Pull
- Problem: need an API to combine push and pull
- Operators (e.g., join) are data-direction agnostic
- Push vs. pull is implemented in queues (connectors)
- Contrast with
  - Iterator model
  - Exchange operator
- Allows arbitrary combinations of push and pull
11. Pull Example
- Operator
    Queue q;
    Tuple process() {
      Tuple t = q.get(), outt = null;
      if (t != null) {
        // process t, possibly setting outt
      } else {
        // do something else
      }
      return outt;
    }
- Pull Queue
    Operator parent, child;
    Tuple get() {
      Tuple t = null;
      while (t == null)
        t = child.process();   // loop until the child yields a tuple
      return t;
    }
- Notice
  - Iterator semantics by making get() blocking
  - get() can return null
  - process() can return null
(Figure: Scan operator and pull connection)
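The pull connector above can be sketched as runnable Java. This is a minimal illustration under stated assumptions, not Telegraph's code: tuples are plain `String`s, `ListScan` is a made-up in-memory source, and `done()` is a hypothetical addition so that the blocking loop in `get()` can stop at the end of a finite input.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical operator interface: process() may return null when it has
// nothing to emit on this call.
interface Operator {
    String process();
    boolean done();   // assumption: lets a blocking get() stop at end of input
}

// A scan over an in-memory list that yields a tuple only on every other
// call, to show that process() may legitimately return null.
class ListScan implements Operator {
    private final Iterator<String> it;
    private boolean skip = true;

    ListScan(List<String> rows) { this.it = rows.iterator(); }

    public String process() {
        skip = !skip;
        if (skip || !it.hasNext()) return null;
        return it.next();
    }

    public boolean done() { return !it.hasNext(); }
}

// Pull connector: get() keeps calling the child until it produces a tuple,
// which gives the parent iterator-style (blocking) semantics.
class PullQueue {
    private final Operator child;

    PullQueue(Operator child) { this.child = child; }

    String get() {
        String t = null;
        while (t == null && !child.done()) {
            t = child.process();
        }
        return t;   // null only when the child is exhausted
    }
}

class PullDemo {
    public static void main(String[] args) {
        PullQueue q = new PullQueue(new ListScan(Arrays.asList("t1", "t2", "t3")));
        for (String t = q.get(); t != null; t = q.get()) {
            System.out.println(t);   // prints t1, t2, t3 on separate lines
        }
    }
}
```

The parent never sees the child's empty calls: the loop inside `get()` absorbs them, which is what makes the connector behave like a conventional iterator.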
12. Push Example
- Push Queue
    Operator parent, child;
    Vector<Tuple> v = new Vector<>();
    Tuple get() {
      if (v.size() > 0) return v.remove(0);   // oldest tuple first
      else return null;                       // never blocks
    }
    void enqueue(Tuple t) { v.add(t); }
- Operator
    Queue q;
    Tuple process() {
      Tuple t = q.get(), outt = null;
      if (t != null) {
        // process t, possibly setting outt
      } else {
        // do something else
      }
      return outt;
    }
- Thread
    while (true) {
      Tuple t = op.process();
      if (t != null) op.outq.enqueue(t);
    }
(Figure: Scan operator and push connection)
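A runnable sketch of the push connector follows. It is illustrative only: `SensorSource` and `Collector` are hypothetical names, tuples are `String`s, and `java.util.concurrent.ConcurrentLinkedQueue` stands in for the slide's `Vector` so that the producer thread and the consumer can share the buffer safely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Push connector: the producer enqueues, the consumer polls without blocking.
class PushQueue {
    private final ConcurrentLinkedQueue<String> buf = new ConcurrentLinkedQueue<>();

    void enqueue(String t) { buf.add(t); }

    // Returns null immediately when no tuple is waiting.
    String get() { return buf.poll(); }
}

// Hypothetical sensor source that pushes a fixed number of readings.
class SensorSource implements Runnable {
    private final PushQueue out;
    private final int count;

    SensorSource(PushQueue out, int count) { this.out = out; this.count = count; }

    public void run() {
        for (int i = 0; i < count; i++) {
            out.enqueue("reading-" + i);
        }
    }
}

// Consumer operator: same shape as in the pull case; only the queue differs.
class Collector {
    private final PushQueue in;
    final List<String> seen = new ArrayList<>();

    Collector(PushQueue in) { this.in = in; }

    // Returns true if a tuple was consumed, false if the queue was empty.
    boolean process() {
        String t = in.get();
        if (t == null) return false;   // free to do other work here
        seen.add(t);
        return true;
    }
}

class PushDemo {
    public static void main(String[] args) throws InterruptedException {
        PushQueue q = new PushQueue();
        Thread producer = new Thread(new SensorSource(q, 3));
        producer.start();
        producer.join();               // wait so the demo output is deterministic

        Collector c = new Collector(q);
        while (c.process()) { }        // drain until get() returns null
        System.out.println(c.seen);    // [reading-0, reading-1, reading-2]
    }
}
```

In a long-running plan the consumer would not stop at the first empty poll; it would return control to its scheduler and be called again later, which is what lets one operator serve several inputs.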
13. Fjord Example
(Figure: example Fjord plan; connection labels: Push, Push, Pull)
15. Continuous Queries
- Given user queries over current sensor data
- Expect that many queries will be over the same data sources (e.g., traffic sensors)
- Queries over current data are always looking at the same tuples
- Those queries can share
  - Current tuples
  - Work (e.g., selections)
- Sharing reduces messages, and thereby power!
- A new query can be folded into an existing query
  - Use old instances of scans and selections
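One way to picture folding a new query into an existing one is a shared scan that evaluates each distinct selection once per reading. The sketch below is a hypothetical illustration, not the Telegraph implementation: the class names, the single "speed below threshold" predicate form, and the evaluation counter are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// One shared "speed < threshold" selection and the queries subscribed to it.
class SharedSelection {
    final int threshold;
    final List<List<Integer>> subscribers = new ArrayList<>();

    SharedSelection(int threshold) { this.threshold = threshold; }
}

// Hypothetical shared scan: each sensor reading arrives once, and each
// distinct selection is evaluated once, however many queries use it.
class SharedScan {
    private final Map<Integer, SharedSelection> selections = new LinkedHashMap<>();
    int predicateEvaluations = 0;

    // Fold a new query into the running plan, reusing an existing selection
    // instance when the predicate matches. Returns the query's result list.
    List<Integer> addQuery(int speedBelow) {
        SharedSelection s = selections.get(speedBelow);
        if (s == null) {
            s = new SharedSelection(speedBelow);
            selections.put(speedBelow, s);
        }
        List<Integer> results = new ArrayList<>();
        s.subscribers.add(results);
        return results;
    }

    // Deliver one pushed reading to every interested query.
    void onReading(int speed) {
        for (SharedSelection s : selections.values()) {
            predicateEvaluations++;
            if (speed < s.threshold) {
                for (List<Integer> out : s.subscribers) out.add(speed);
            }
        }
    }
}

class SharingDemo {
    public static void main(String[] args) {
        SharedScan scan = new SharedScan();
        List<Integer> q1 = scan.addQuery(30);
        List<Integer> q2 = scan.addQuery(30);   // same predicate: shares q1's selection
        List<Integer> q3 = scan.addQuery(50);

        for (int speed : new int[] {25, 40, 60}) scan.onReading(speed);

        System.out.println(q1 + " " + q2 + " " + q3);   // [25] [25] [25, 40]
        System.out.println(scan.predicateEvaluations);  // 6
    }
}
```

With three queries and three readings, an unshared plan would evaluate nine predicates; here the two queries with the same threshold share one selection, so only six are evaluated.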
17. Relational Operators and Streams
- Selection and projection apply naturally
- No blocking operators
  - Sorts and aggregates over the entire stream
  - Nested loops and sort-merge join
- Windowed operators
  - Sorts, aggregates, etc.
- Online, interactive QP techniques
  - In-memory symmetric hash join
  - Alternatives: ripple join, XJoin, etc.
- Partial results
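The in-memory symmetric hash join mentioned above can be sketched as follows. This is a generic textbook-style version under simplifying assumptions (integer join keys, `String` payloads, results formatted as `left|right`); it is not taken from Telegraph.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory symmetric hash join on an integer key. Each arriving tuple is
// inserted into its own side's hash table and probed against the other
// side's, so results appear incrementally and neither input blocks.
class SymmetricHashJoin {
    private final Map<Integer, List<String>> left = new HashMap<>();
    private final Map<Integer, List<String>> right = new HashMap<>();

    List<String> onLeft(int key, String payload) {
        return insertAndProbe(left, right, key, payload, true);
    }

    List<String> onRight(int key, String payload) {
        return insertAndProbe(right, left, key, payload, false);
    }

    private static List<String> insertAndProbe(Map<Integer, List<String>> mine,
                                               Map<Integer, List<String>> other,
                                               int key, String payload,
                                               boolean arrivingIsLeft) {
        // Remember the new tuple so later arrivals on the other side find it.
        mine.computeIfAbsent(key, k -> new ArrayList<>()).add(payload);

        // Probe the opposite table for tuples that already arrived.
        List<String> out = new ArrayList<>();
        List<String> matches = other.get(key);
        if (matches != null) {
            for (String m : matches) {
                out.add(arrivingIsLeft ? payload + "|" + m : m + "|" + payload);
            }
        }
        return out;
    }
}

class JoinDemo {
    public static void main(String[] args) {
        SymmetricHashJoin j = new SymmetricHashJoin();
        System.out.println(j.onLeft(7, "speed=25"));    // []
        System.out.println(j.onRight(7, "accident"));   // [speed=25|accident]
        System.out.println(j.onLeft(7, "speed=10"));    // [speed=10|accident]
    }
}
```

Both hash tables grow with the input, so over an unbounded stream a version like this would need a window or some eviction rule to bound its state.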
19. Sensor Proxies
- Mediate between sensors and Fjords
- Push operators out to sensors
- Hide query processing and knowledge of multiple queries from sensors
- Hide details of sensors from the query processor
- Enable power sensitivity
(Figure: sensor proxy diagram; label: Query Processor)
20. What Runs Where?
- (Multi-query) distributed optimization challenge
- Given a set of operators, the proxy must choose
  - Run on the sensor, or
  - Run on the local query processor
- Running on sensors saves power
  - Simple computations are cheaper than messages
  - Selection and aggregation reduce communication cost
- Sensors have limited resources
  - All queries can't run in all sensors simultaneously
  - Limited state precludes big joins or lots of groups
- Queries share operators
- Operators vary in selectivity
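To make the trade-off concrete, here is a deliberately simple placement heuristic. It is an assumption for illustration only, not a policy from the talk (which lists proxy policies as future work): operators are ranked by selectivity, and the most selective ones are pushed to the sensor until a hypothetical state budget is used up. `Candidate`, `Placement`, and the byte figures are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical description of an operator the proxy could push to a sensor.
class Candidate {
    final String name;
    final double selectivity;   // fraction of input that survives (lower = fewer messages)
    final int stateBytes;       // memory the operator needs on the sensor

    Candidate(String name, double selectivity, int stateBytes) {
        this.name = name;
        this.selectivity = selectivity;
        this.stateBytes = stateBytes;
    }
}

class Placement {
    // Greedy heuristic (an assumption, not the talk's policy): push the most
    // selective operators first while they fit in the sensor's state budget.
    // Everything not chosen runs on the local query processor.
    static List<String> chooseForSensor(List<Candidate> ops, int budgetBytes) {
        List<Candidate> sorted = new ArrayList<>(ops);
        sorted.sort(Comparator.comparingDouble((Candidate c) -> c.selectivity));

        List<String> onSensor = new ArrayList<>();
        int used = 0;
        for (Candidate c : sorted) {
            if (used + c.stateBytes <= budgetBytes) {
                onSensor.add(c.name);
                used += c.stateBytes;
            }
        }
        return onSensor;
    }
}

class PlacementDemo {
    public static void main(String[] args) {
        List<Candidate> ops = Arrays.asList(
            new Candidate("selectSlow", 0.10, 16),
            new Candidate("avgWindow", 0.05, 64),
            new Candidate("bigJoin", 0.50, 4096));
        // The join's state does not fit, so it stays on the query processor.
        System.out.println(Placement.chooseForSensor(ops, 128));   // [avgWindow, selectSlow]
    }
}
```

A real proxy would also have to account for operators shared between queries and for the energy cost of each message, which this sketch ignores.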
22. Building a Query
- How to translate a declarative query into a Fjord?
- Just like traditional query processing, except
  - Branches originating in sensors are connected by push connectors
  - Sensor proxy handles scans and selections over sensors
    - Proxy delivers tuples from sensors
    - Proxy pushes down selections transparently
  - Output of a join is push if one or both inputs are push
    - Join carefully chosen
(Figure: example query plan; labels: Pull, Data Request, Speed < 30, Push, Data)
24. Query Fjord
- Simple test query
    SELECT AVG(s.speed, w)
    FROM sensors AS s
    WHERE s.loc IN userLocs
(Figure: query plan with an Average operator; labels: Telegraph Server, BHL Server)
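The `AVG(s.speed, w)` aggregate in the test query can be illustrated with a sliding-window average. The sketch assumes that `w` is a window size counted in readings, which the slide does not state; the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sliding-window average over the most recent w readings. Unlike an average
// over the whole stream, it produces a result after every reading.
class WindowedAverage {
    private final int w;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    WindowedAverage(int w) {
        if (w <= 0) throw new IllegalArgumentException("window size must be positive");
        this.w = w;
    }

    // Consume one pushed reading and return the current window average.
    double onReading(double speed) {
        window.addLast(speed);
        sum += speed;
        if (window.size() > w) {
            sum -= window.removeFirst();   // evict the oldest reading
        }
        return sum / window.size();
    }
}

class AverageDemo {
    public static void main(String[] args) {
        WindowedAverage avg = new WindowedAverage(2);
        System.out.println(avg.onReading(60.0));   // 60.0
        System.out.println(avg.onReading(40.0));   // 50.0
        System.out.println(avg.onReading(20.0));   // 30.0
    }
}
```

State is bounded by `w` readings, which is what makes a windowed aggregate usable on an unbounded stream.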
25. Multiple Queries in a Fjord
(Figure: plan serving multiple queries; labels: Telegraph Server, BHL Server)
26. Fjord Performance (In Telegraph)
27. Sensor Proxy for Traffic
- Measure the benefit of pushing computation into sensors; in this case, vehicle identification
- Simple aggregation dramatically reduces power costs
28. Pushing Aggregates Saves Power
- Atmel (TinyOS CPU) simulator
- 100 samples / sec
- 5 vehicles / sec
- 7x power savings
30. Related Work
- Cougar
- Interactive Adaptive Query Processors
  - Tukwila, XJoin, Eddy, CONTROL
- Continuous Query Processors
  - NiagaraCQ, XFilter, CACQ
- Directed Diffusion
- Volcano / Exchange Operator
- Temporal Databases
31. Conclusions & Future Work
- Required for sensor query processing
  - Fjords: API for combining push and pull
  - Sensor proxy: mediator between sensors and QP
  - Streaming operators
- Big benefit from pushing selections and aggregates into the network
- Combining multiple queries is a win
- Extensions & future work
  - Multi-query optimization and adaptivity (CACQ, SIGMOD 2002)
  - Push-down of selections and aggregates (submitted to VLDB 2002)
  - Sensor proxy policies