Wireless Sensor Networks and Query Processing

Slides: 74

Provided by: CIT788


1
Wireless Sensor Networks and Query Processing

WSN: Wireless Sensor Networks
Routing Problems
Routing Algorithms
Real-time Query Processing
Sensor Selection and Data Aggregation

2
A typical Wireless Sensor Network
[Figure: sensor nodes (SN) attached to gateways (GW); link types shown are Bluetooth, WLAN, GPRS, and Ethernet]

Integration of Sensor Nodes (SN) and Gateways (GW)

3
MANET: Mobile Ad-hoc Networking
4
Why Wireless Sensor Networks ?

Ease of deployment
Speed of deployment
Decreased dependence on infrastructure
Self-adaptive and self-organizing
Sensors are cheap devices and can be deployed in
large numbers
Sensors can work in harsh environmental conditions,
e.g., deserts
Sensors can work continuously for monitoring and
surveillance purposes
Sensors are connected to the rest of the system
through a gateway
From the gateway, various functions and queries
may be submitted to the system to access the
sensor data

5
Today's Wireless Sensor Networks (WSN)

First generation of WSNs is available
Diverse sensor nodes, several gateways
Even with special sensors: cameras, body
temperature
Basic software
Routing, energy conservation, management
Several prototypes for different applications
Environmental monitoring, industrial automation,
wildlife monitoring
Many see new possibilities for monitoring,
surveillance, protection
Sensor networks as a cheap and flexible new
means for surveillance (e.g., security)
Monitoring and protection of goods
Chemicals, food, vehicles (car parks), machines,
containers, ...
Large application area besides military
Law enforcement, disaster recovery, industry,
private homes, ...

6
Mobile ad-hoc networks (MANET)

Network without infrastructure
Use components of participants for networking
Examples
Single-hop: all partners at most one hop apart
Bluetooth piconet, PDAs in a room, gaming
devices
Multi-hop: cover larger distances, circumvent
obstacles
Bluetooth scatternet, police network,
car-to-car networks
MANET (Mobile Ad-hoc Networking) group
Dynamic network topology
Mobile nodes

7
Many Variations

Fully Symmetric Environment
All nodes have identical capabilities and
responsibilities
Asymmetric Capabilities
Transmission ranges and radios may differ
Battery life at different nodes may differ
Processing capacity may be different at different
nodes
Speed of movement (fixed and mobile)
Asymmetric Responsibilities
Only some nodes may route packets
Some nodes may act as leaders of nearby nodes
(e.g., cluster head)

8
Many Variations

Traffic characteristics may differ in different
mobile ad hoc networks
Bit rate
Timeliness constraints
Reliability requirements
Unicast / multicast / geocast
May co-exist (and co-operate) with an
infrastructure-based network

9
Many Variations

Mobility patterns may be different
People sitting at an airport lounge
New York taxi cabs
Kids playing
Military movements
Mobility characteristics
Speed
Predictability
Direction of movement
Pattern of movement
Uniformity (or lack thereof) of mobility
characteristics among different nodes

10
Wireless Sensor Networks: Challenges

Long-lived, autonomous networks
Use environmental energy sources
Embed and forget
Self-healing
Self-configuring networks
Routing
Data aggregation
Localization
Managing wireless sensor networks
Tools for access and programming
Update distribution
Scalability, Quality of Service

11
Routing Problem

Routing: finding a route to send data from the
source to the destination
Highly dynamic network topology
Device mobility plus varying channel quality
Separation and merging of networks possible
Asymmetric connections possible

[Figure: changing topology. Nodes N1 to N7 are shown at time t1 and at time t2, connected by good links and weak links]
12
Mobile Ad Hoc Networks

May need to traverse multiple links to reach a
destination

13
Mobile Ad Hoc Networks

Mobility causes route changes

14
Routing Problems

Asymmetric links
A path from node A to B does not imply that
node B can use the same path to send packets to
node A
Redundant links
Multiple paths from A to B: which one is the best
(e.g., minimizing the hop count) and
should be chosen?
Interference
Collision: neighboring nodes send packets at the
same time
Collision -> retransmission (MAC)
Dynamic topology
Changing link quality due to movement
Need to find a new path at short intervals
because the old one no longer works
Update of path information in the intermediate
nodes
No node has complete information about the
status of all the nodes in the system
Transmission delay is changing
Difficult to do load balancing and traffic
control

15
Routing Problems

Routing Problem
To find a route to connect the source node (S) to
the destination node (D) through a sequence of
relay nodes
The route may be just for a one-time connection or
for a period of time (continuous monitoring)
Issues in routing algorithms
Minimize message overhead (no. of messages)
On-demand algorithms
Minimize the searching delay
Table-driven algorithms
Route maintenance
Minimize energy consumption rate
Power-aware routing algorithms (choosing
high-energy nodes as relay nodes)
Switching some of the mobile hosts to doze mode
to conserve energy

16
Routing Methods

Two types of routing algorithms
On-demand protocols (reactive)
A route is searched upon receipt of a connection
request
Table-driven protocols (proactive)
The topology of the whole network is maintained
When a connection is needed, the source node can
select the route from its memory directly

17
Routing Methods

Latency of route discovery
Proactive protocols may have lower latency since
routes are maintained at all times
Reactive protocols may have higher latency
because a route from X to Y will be found only
when X attempts to send to Y
Overhead of route discovery/maintenance
Reactive protocols may have lower overhead since
routes are determined only if needed
Proactive protocols can (but do not necessarily)
result in higher overhead due to continuous route
updating
Which approach achieves a better trade-off
depends on the traffic and mobility patterns

18
Routing Algorithms for Ad Hoc Networks

Flooding
Dynamic Source Routing (DSR)
Location-Aided Routing (LAR)
Power-Aware Routing (PAR)
Least Interference Routing (LIR)

19
Flooding for Data Delivery

Sender S broadcasts data packet P to all its
neighbors
Each node receiving P forwards P to its neighbors
Sequence numbers used to avoid the possibility of
forwarding the same packet more than once
Packet P reaches destination D provided that D is
reachable from sender S
Node D does not forward the packet
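
The flooding rules above can be sketched as a small simulation. This is an idealized sketch, not a protocol implementation: the topology is hypothetical (it is not the network drawn on the slides), collisions and losses are ignored, and the per-node sequence-number check is modeled by a single set because only one packet is flooded.

```python
from collections import deque

def flood(neighbors, source, dest):
    """Return the set of nodes that receive the packet flooded by `source`.

    `neighbors` maps each node to the nodes within its transmission range.
    """
    received = {source}        # stands in for the sequence-number check
    queue = deque([source])    # nodes that still have to broadcast
    while queue:
        node = queue.popleft()
        if node == dest:
            continue           # the destination does not forward the packet
        for nxt in neighbors.get(node, ()):
            if nxt not in received:   # only the first copy is processed
                received.add(nxt)
                queue.append(nxt)
    return received

# Hypothetical topology: N is reachable only through D, Z is isolated.
topology = {
    "S": ["A", "B"],
    "A": ["S", "D"],
    "B": ["S", "D"],
    "D": ["A", "B", "N"],
    "N": ["D"],
    "Z": [],
}
reached = flood(topology, "S", "D")
```

In this sketch `reached` contains S, A, B, and D. Node N is missed because its only path from S goes through D, and Z is missed because it is unreachable.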

20
Flooding for Data Delivery
[Figure: example network with nodes A to N, S, Y, and Z. Legend: a link between two nodes means they are within each other's transmission range; a marked node has received packet P]
21
Flooding for Data Delivery
[Figure: the example network during flooding. Legend: transmission of packet P (broadcast transmission); node that receives packet P for the first time]
22
Flooding for Data Delivery
[Figure: the example network at the next flooding step]

Node H receives packet P from two neighbors:
potential for collision

23
Flooding for Data Delivery
[Figure: the example network at the next flooding step]

Node C receives packet P from G and H, but does
not forward it again, because node C has already
forwarded packet P once

24
Flooding for Data Delivery
[Figure: the example network at the next flooding step]

Nodes J and K both broadcast packet P to node D
Since nodes J and K are hidden from each other,
their transmissions may collide
=> Packet P may not be delivered to node D at all,
despite the use of flooding

25
Flooding for Data Delivery
[Figure: the example network at the next flooding step]

Node D does not forward packet P, because node D
is the intended destination of packet P

26
Flooding for Data Delivery
[Figure: the example network after flooding has completed]

Flooding completed
Nodes unreachable from S do not receive packet P
(e.g., node Z)
Nodes for which all paths from S go through the
destination D also do not receive packet P
(e.g., node N)

27
Flooding for Data Delivery
[Figure: the example network after flooding has completed]

Flooding may deliver packets to too many nodes
(in the worst case, all nodes reachable from the
sender may receive the packet)

28
Flooding for Data Delivery: Advantages

Simplicity
May be more efficient than other protocols when
the rate of information transmission is low
enough that the overhead of explicit route
discovery/maintenance incurred by other protocols
is relatively higher
This scenario may occur, for instance, when nodes
transmit small data packets relatively
infrequently, and many topology changes occur
between consecutive packet transmissions
Potentially higher reliability of data delivery
Because packets may be delivered to the
destination on multiple paths

29
Flooding for Data Delivery: Disadvantages

Potentially, very high overhead
Data packets may be delivered to too many nodes
who do not need to receive them
Potentially lower reliability of data delivery
Flooding uses broadcasting -- hard to implement
reliable broadcast delivery without significantly
increasing overhead
In our example, nodes J and K may transmit to
node D simultaneously, resulting in loss of the
packet
In this case, destination would not receive the
packet at all

30
Flooding of Control Packets

Many protocols perform (potentially limited)
flooding of control packets, instead of data
packets
The control packets are used to discover routes
Discovered routes are subsequently used to send
data packet(s)
Overhead of control packet flooding is amortized
over data packets transmitted between consecutive
control packet floods

31
Dynamic Source Routing (DSR)

DSR consists of two steps
Route discovery: a node tries to discover a route
to a destination when it has something to send to
that destination
Route maintenance: if a node detects that the
current route has changed, it needs to find a new route
In route discovery, if node S wants to send a
packet to node D, but does not know a route to D,
node S initiates a route discovery (small
message)
Source node S floods a Route Request (RREQ)
Each node appends its own identifier when forwarding
the RREQ
If a node has already received the request, it
drops the request
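
A minimal sketch of the RREQ flood, assuming an ideal channel without collisions. The function name `discover_route` is hypothetical, and the topology contains only the links that can be inferred from the route labels on the slides ([S,E,F,J] and [S,C,G,K]); the full network is not recoverable from the transcript.

```python
from collections import deque

def discover_route(neighbors, source, target):
    """Flood an RREQ and return the first route record that reaches `target`."""
    seen = {source}              # nodes that have already received this RREQ
    queue = deque([[source]])    # each entry is the route record carried so far
    while queue:
        route = queue.popleft()
        node = route[-1]
        if node == target:
            return route         # the target does not forward; it would reply
        for nxt in neighbors.get(node, ()):
            if nxt in seen:
                continue         # a duplicate RREQ is dropped
            seen.add(nxt)
            queue.append(route + [nxt])   # node appends its own identifier
    return None                  # target unreachable

# Partial topology inferred from the slide labels (an assumption).
topology = {
    "S": ["E", "C"],
    "E": ["S", "F"],
    "F": ["E", "J"],
    "J": ["F", "D"],
    "C": ["S", "G"],
    "G": ["C", "K"],
    "K": ["G", "D"],
    "D": ["J", "K"],
}
route = discover_route(topology, "S", "D")
```

Because the queue is processed in arrival order, the first RREQ to reach D here carries the record S, E, F, J, D; the copy arriving via K is dropped as a duplicate.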

32
Route Discovery in DSR
[Figure: the example network with nodes A to N, S, Y, and Z. Legend: a marked node has received the RREQ for D from S]
33
Route Discovery in DSR
[Figure: S broadcasts the RREQ carrying the list [S]. Legend: [X,Y] represents the list of identifiers appended to the RREQ; transmission of the RREQ is a broadcast transmission]
34
Route Discovery in DSR
[Figure: the RREQ propagates with route records [S,E] and [S,C]]

Node H receives the RREQ from two neighbors:
potential for collision

35
Route Discovery in DSR
[Figure: the RREQ propagates with route records [S,E,F] and [S,C,G]]

Node C receives the RREQ from G and H, but does not
forward it again, because node C has already
forwarded the RREQ once

36
Route Discovery in DSR
[Figure: the RREQ propagates with route records [S,E,F,J] and [S,C,G,K]]

Nodes J and K both broadcast the RREQ to node D
Since nodes J and K are hidden from each other,
their transmissions may collide

37
Route Discovery in DSR
[Figure: the RREQ propagates with route record [S,E,F,J,M]]

Node D does not forward the RREQ, because node D
is the intended target of the route discovery

38
Route Discovery in DSR

Destination D, on receiving the first RREQ, sends
a Route Reply (RREP)
RREP is sent on a route obtained by reversing the
route appended to received RREQ
RREP includes the route from S to D on which RREQ
was received by node D

39
Route Reply in DSR
[Figure: the RREP carrying the route [S,E,F,J,D] is sent from D back to S. Legend: RREP control message]
40
Route Reply in DSR

Route Reply can be sent by reversing the route in
Route Request (RREQ) only if links are guaranteed
to be bi-directional
To ensure this, RREQ should be forwarded only if
it is received on a link that is known to be
bi-directional
If unidirectional (asymmetric) links are allowed,
then RREP may need a route discovery for S from
node D
Unless node D already knows a route to node S
If a route discovery is initiated by D for a
route to S, then the Route Reply is piggybacked
on the Route Request from D

41
Dynamic Source Routing (DSR)

Node S, on receiving the RREP, caches the route
included in the RREP
When node S sends a data packet to D, the entire
route is included in the packet header
Hence the name source routing
Intermediate nodes use the source route included
in a packet to determine to whom a packet should
be forwarded
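
As a small illustration of source routing, the sketch below shows how an intermediate node could pick the next hop from the route in the packet header, and how a reply route is obtained by reversal. The function names are hypothetical, and the reversal is only meaningful under the assumption that all links on the route are bi-directional.

```python
def next_hop(source_route, current):
    """Return the node `current` should forward to, or None at the destination."""
    i = source_route.index(current)     # position of this node in the header
    if i + 1 < len(source_route):
        return source_route[i + 1]
    return None                         # `current` is the last node on the route

def reverse_route(route):
    """Route for the RREP, assuming every link is bi-directional."""
    return route[::-1]

header = ["S", "E", "F", "J", "D"]      # route taken from the slide example
hop_after_f = next_hop(header, "F")
reply_route = reverse_route(header)
```

With the slide's route, node F forwards to J, and the reply travels D, J, F, E, S.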

42
Data Delivery in DSR
[Figure: S sends DATA to D with the source route [S,E,F,J,D] in the packet header]
Packet header size grows with route length
43
Dynamic Source Routing: Advantages

Routes are maintained only between nodes that need to
communicate
This reduces the overhead of route maintenance
Route caching can further reduce route discovery
overhead
A single route discovery may yield many routes to
the destination, due to intermediate nodes
replying from local caches

44
Dynamic Source Routing: Disadvantages

Packet header size grows with route length due to
source routing
Flood of route requests may potentially reach all
nodes in the network
Care must be taken to avoid collisions between
route requests propagated by neighboring nodes
Insertion of random delays before forwarding RREQ
Increased contention if too many route replies
come back due to nodes replying using their local
cache
Route Reply Storm problem
Reply storm may be eased by preventing a node
from sending RREP if it hears another RREP with a
shorter route

45
Enhancement to routing

There may be multiple routes from the source node
to the destination node. How to choose the route?
Interference
The number of neighboring nodes
If the number of neighboring nodes is larger, the
probability of a conflict in transmission is
higher, which means more retransmissions and
more wasted bandwidth
Energy level of the intermediate nodes
Eliminate those nodes with energy level below a
threshold value
Location area
Estimate the possible region of the destination
node
Broadcast the packets to the estimated region,
e.g., LAR
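
One way to combine the energy and interference criteria is sketched below. It is only an illustration: the function name, the scoring rule (sum of neighbor counts over the relay nodes, ties broken by route length), and all values are assumptions, not something the slides specify.

```python
def choose_route(routes, energy, neighbor_count, min_energy):
    """Pick a route whose relays all have enough energy and little interference."""
    def usable(route):
        # Intermediate nodes only: route[0] is the source, route[-1] the destination.
        return all(energy[n] >= min_energy for n in route[1:-1])

    def interference(route):
        # Assumed proxy: more neighbors around the relays means more contention.
        return sum(neighbor_count[n] for n in route[1:-1])

    candidates = [r for r in routes if usable(r)]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (interference(r), len(r)))

# Hypothetical candidate routes and node data.
routes = [["S", "A", "D"], ["S", "B", "C", "D"], ["S", "E", "D"]]
energy = {"A": 0.1, "B": 0.8, "C": 0.9, "E": 0.7}
neighbor_count = {"A": 2, "B": 2, "C": 1, "E": 5}
best = choose_route(routes, energy, neighbor_count, min_energy=0.2)
```

Here the route via A is eliminated by the energy threshold, and the route via B and C wins over the shorter route via E because its relays have fewer neighbors.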

46
Location-Aided Routing (LAR)

Exploits location information to limit scope of
route request flood
Location information may be obtained using GPS
Expected Zone is determined as a region that is
expected to hold the current location of the
destination
Expected region determined based on potentially
old location information, and knowledge of the
destination's speed
Route requests limited to a Request Zone that
contains the Expected Zone and location of the
sender node

47
Expected Zone in LAR
[Figure: Expected Zone, with X, r, and Y marked]
X = last known location of node D, at time t0
Y = location of node D at current time t1, unknown to node S
r = (t1 - t0) * estimate of D's speed
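
A short sketch of the expected-zone computation, assuming the zone is a circle of radius r = (t1 - t0) * speed centered on the last known location X. The function names and the sample coordinates are hypothetical.

```python
import math

def expected_zone_radius(speed, t0, t1):
    """Maximum distance the destination can have moved since time t0."""
    return speed * (t1 - t0)

def in_expected_zone(point, last_known, speed, t0, t1):
    """True if `point` lies within the circular expected zone around `last_known`."""
    r = expected_zone_radius(speed, t0, t1)
    return math.dist(point, last_known) <= r

# Destination last seen at (0, 0) at t0 = 10 s, estimated speed 2 m/s, now t1 = 15 s.
radius = expected_zone_radius(2.0, 10.0, 15.0)
inside = in_expected_zone((6.0, 8.0), (0.0, 0.0), 2.0, 10.0, 15.0)
outside = in_expected_zone((11.0, 0.0), (0.0, 0.0), 2.0, 10.0, 15.0)
```

The older the location information or the faster the destination, the larger the zone becomes.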
48
Request Zone in LAR
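[Figure: the Network Space contains the Request Zone; the Request Zone contains sender S and the Expected Zone (X, r, Y); nodes A and B are also shown]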
49
LAR

Only nodes within the request zone forward route
requests
Node A does not forward RREQ, but node B does
(see previous slide)
Request zone explicitly specified in the route
request
Each node must know its physical location to
determine whether it is within the request zone
If route discovery using the smaller request zone
fails to find a route, the sender initiates
another route discovery (after a timeout) using a
larger request zone
The larger request zone may be the entire network
Rest of route discovery protocol similar to DSR
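
The forwarding test can be sketched as follows, assuming a rectangular request zone: the smallest axis-aligned rectangle that contains the sender and a circular expected zone. The zone shape, the function names, and the coordinates are assumptions made for illustration.

```python
def request_zone(sender, center, r):
    """Smallest axis-aligned rectangle holding the sender and the expected zone.

    Returned as (xmin, ymin, xmax, ymax).
    """
    sx, sy = sender
    cx, cy = center
    return (min(sx, cx - r), min(sy, cy - r), max(sx, cx + r), max(sy, cy + r))

def should_forward(node_pos, zone):
    """A node forwards the RREQ only if its own position is inside the zone."""
    x, y = node_pos
    xmin, ymin, xmax, ymax = zone
    return xmin <= x <= xmax and ymin <= y <= ymax

# Hypothetical positions: sender at the origin, expected zone around (10, 10).
zone = request_zone(sender=(0.0, 0.0), center=(10.0, 10.0), r=2.0)
b_forwards = should_forward((5.0, 5.0), zone)    # inside the request zone
a_forwards = should_forward((-3.0, 4.0), zone)   # outside the request zone
```

If discovery with this zone times out, the sender could retry with a larger rectangle, up to one covering the entire network.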

50
Energy-aware routing

Only sensors with sufficient energy forward data
for other nodes
Example: routing via nodes with enough solar
power is considered free
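
The "solar relays are free" idea can be sketched as a shortest-path search over node costs. This is a sketch under stated assumptions: the cost values, the topology, and the function name `cheapest_route` are hypothetical, and the search is plain Dijkstra rather than any specific published protocol.

```python
import heapq

def cheapest_route(neighbors, relay_cost, source, dest):
    """Dijkstra over node costs: entering a node costs relay_cost[node]."""
    best = {source: 0}
    heap = [(0, source, [source])]
    while heap:
        cost, node, route = heapq.heappop(heap)
        if node == dest:
            return cost, route
        if cost > best.get(node, float("inf")):
            continue                      # stale heap entry
        for nxt in neighbors.get(node, ()):
            new_cost = cost + relay_cost[nxt]
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt, route + [nxt]))
    return None                           # destination unreachable

# Hypothetical network: B and C are solar powered (cost 0), A runs on battery.
neighbors = {
    "S": ["A", "B"],
    "A": ["S", "D"],
    "B": ["S", "C"],
    "C": ["B", "D"],
    "D": ["A", "C"],
}
relay_cost = {"S": 0, "A": 5, "B": 0, "C": 0, "D": 0}
result = cheapest_route(neighbors, relay_cost, "S", "D")
```

The longer route through the solar-powered nodes B and C is preferred over the shorter route through the battery-powered node A.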

51
System Monitoring and Surveillance

Wireless sensor systems
Need to monitor the occurrences of (simple)
events in the system environment
E.g., when the temperature is higher than 50 °C
E.g., the max and min pressure in a day
Complex events
The occurrences of multiple simple events at the
same time
The maximum temperatures of two rooms when the
pressure is higher than 1000 mmHg
The light intensity at the arrival time of a bird

52
System Monitoring

Continuous monitoring queries (CMQs)
Submitted to monitor the events occurring in the
system environment for a period of time (begin
time and end time)
A condition is defined. Once the condition is
satisfied, an alert is sent to the user
Based on the attributes defined in the condition,
a set of data items is identified as input to
the query
Access to a set of data items (pre-defined)
The data items are generated by sensor nodes
distributed in the system environment
Sensor nodes
Each sensor node may carry multiple
sensors to capture different signals of the
system environment
Fixed sampling frequency
Communicate through a low-bandwidth wireless
network
53
In-Network Processing

Processing of queries (two approaches)
(1) Send sensor data to a centralized server for
processing
(2) Process the queries at the sensor nodes
In-network processing
A query is divided into a set of sub-queries
Each sub-query is processed at the sensor node
(called a participating node) that is responsible
for generating its required data items
A coordinator node (one of the participating
sensor nodes) is responsible for aggregating the
results from the sensor nodes
Example: get the average temperature of sensors A
to D for the next 10 min if they are higher
than 100F
No need to transmit a large volume of data to a
centralized server for processing
Issues: routing and aggregation

54
System Architecture

MSPU: mobile sensor processing units
Base station: connects with the MSPUs through a
wireless link
Back-end server: maintains a database, and
provides an interface for submitting CMQs and
displaying query results including performance
statistics

55
Continuous Monitoring Query

CMQi consists of a set of sub-queries, SCMQi,1,
SCMQi,2, ..., SCMQi,n, defined according to the
distribution of the required nodes of the query
One of the nodes is the coordinator node and the
others are participating nodes
Each sub-query contains a selection condition to
evaluate on the sensor data from its node
A CMQ contains an aggregation condition for
execution,
e.g., to have the results from all the
sub-queries
Calculating the maximum value requires at least
two inputs

56
Execution of CMQs

Step 1
Evaluation on the sensor data items generated by
a sensor node using the selection condition
defined in the sub-query
Step 2
Sending sub-query evaluation results to the
coordinator node, which evaluates whether the
aggregation conditions have been satisfied
Report the query result to the client as a
function of time during the activation period of
the query

57
Execution of CMQs
58
An Example of a CMQ

Get the maximum temperature of sensors A and B,
if they are higher than 100F, from now until 15
min later
CMQ1 = (SCMQ1,1, SCMQ1,2, Operation1, 1200,
1215)
SCMQ1,1: If temperature T1 of sensor data from
MSPU1 > 100F, return the temperature
SCMQ1,2: If temperature T2 of sensor data from
MSPU2 > 100F, return the temperature
Aggregate condition1: The outputs from both
SCMQ1,1 and SCMQ1,2 are data values
Aggregate operation1: IF T1 > T2, return T1 ELSE
return T2
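
The example query can be written out as a small sketch. The function names `scmq` and `cmq1` are hypothetical, a missing result is modeled as `None`, and the begin and end times of the query are left out.

```python
def scmq(temperature, threshold=100.0):
    """Sub-query: return the reading if it exceeds the threshold, else None."""
    return temperature if temperature > threshold else None

def cmq1(t1, t2):
    """Evaluate CMQ1 on one pair of readings from MSPU1 and MSPU2."""
    r1, r2 = scmq(t1), scmq(t2)
    # Aggregate condition: both sub-queries must have produced a data value.
    if r1 is None or r2 is None:
        return None
    # Aggregate operation: IF T1 > T2 return T1 ELSE return T2.
    return r1 if r1 > r2 else r2

both_hot = cmq1(101.5, 104.0)
one_cool = cmq1(99.0, 104.0)
```

With readings of 101.5F and 104.0F the query reports 104.0; if either reading is at or below 100F, the aggregate condition fails and nothing is reported.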

59
Temporal Consistency

The sensor nodes follow their pre-set frequency
(period) to generate sensor data values
A sensor data value becomes invalid once a new version
is generated
Data version x is valid if creation time of x +
generation period > current time
The main purpose of a CMQ is to monitor system
environment
Not to miss the occurrences of any such events
Require continuous evaluation on sensor data
Ensure that all results generated from the CMQ
are correct (consistent with the real situation
in the monitoring environment)
Require each evaluation on temporally consistent
data such that they are valid at the same time
point
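
The validity rule for a single data version can be sketched as below. The lower bound (a version is not valid before it is created) is an added assumption, and the function name is hypothetical.

```python
def is_valid(creation_time, period, now):
    """A version is valid from its creation until the next version is due."""
    return creation_time <= now < creation_time + period

# A version created at t = 10 with a generation period of 5 time units.
still_valid = is_valid(10, 5, 12)
expired = is_valid(10, 5, 15)
```

At t = 15 the next version is due, so the version created at t = 10 no longer counts as valid.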

60
Temporal Inconsistency Problem

If MSPU1 is assigned to be the coordinator node,
MSPU2 will forward its sub-query results to MSPU1
Due to communication delay, the set of sub-query
results from MSPU2 received by MSPU1 will be
shifted by the transmission delay
The generated query results may become incorrect
Incorrect light intensity at the arrival time of
a bird
Incorrect maximum temperature of the two rooms

61
Temporal Inconsistency Problem
62
Temporal Inconsistency Problem

Time-stamping technique
Using time-stamp to label the validity of a data
version
From lower valid time (LVT) to upper valid time
(UVT)
Relative consistency
The intersection of the validity intervals of all
the accessed data items is non-empty
The data versions are not too old (currency
requirement)
Buffering
The coordinator node buffers the received
sub-query results
Evaluation follows the relative consistency
requirement
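
A minimal sketch of the relative consistency check, assuming each data version carries a half-open validity interval (LVT, UVT) and that the currency requirement is expressed as a maximum age of the LVT. Both the interval convention and the `max_age` parameter are assumptions.

```python
def common_interval(intervals):
    """Intersect (LVT, UVT) intervals; return None if the intersection is empty."""
    lo = max(lvt for lvt, _ in intervals)
    hi = min(uvt for _, uvt in intervals)
    return (lo, hi) if lo < hi else None

def relatively_consistent(intervals, now, max_age):
    """True if the versions overlap in time and none of them is too old."""
    if common_interval(intervals) is None:
        return False
    return all(now - lvt <= max_age for lvt, _ in intervals)

overlap = common_interval([(10, 15), (12, 17)])
disjoint = common_interval([(10, 12), (12, 17)])
ok = relatively_consistent([(10, 15), (12, 17)], now=14, max_age=5)
too_old = relatively_consistent([(10, 15), (12, 17)], now=14, max_age=3)
```

The coordinator would search its buffer for a combination of versions, one per sub-query, that passes this check before evaluating the aggregation.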

63
Aggregation Problem

How to aggregate the sub-query results from the
participating nodes?
Objectives
To minimize the aggregation cost (data
communication cost)
Fault tolerance to message loss
Minimize the processing cost at the coordinator
node
Centralized aggregation
Select a coordinator node which is close to all
the participating nodes

64
Periodic Pushing

The latest generated sub-query results form a
message and are forwarded to the coordinator node
every fixed submission period
Each message contains several sensor data
versions of a data item to minimize the message
overhead
Results are time-stamped to indicate their
validity intervals
Evaluation at the coordinator node follows the
time-stamps by searching the received data at the
buffer
Message loss can easily be detected

65
Periodic Pushing
66
Conditional Pushing

Aims to reduce the sizes of data versions for
aggregation
The scheme is the same as periodic pushing except
the data values are compressed
Successive data versions with the same value are
compressed
Although the redundancy in data values within a
message is eliminated, the redundancy in
successive messages cannot be eliminated
The amount of bandwidth saved depends on how the
data values from the sensor node change
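
The compression of successive equal values can be illustrated with a run-length style encoding. The slides do not specify the message format, so the (start time, end time, value) triple used here is an assumption.

```python
def compress(versions):
    """Collapse runs of equal values.

    Input:  list of (time, value) pairs in time order.
    Output: list of (start_time, end_time, value) runs.
    """
    runs = []
    for t, value in versions:
        if runs and runs[-1][2] == value:
            start, _, _ = runs[-1]
            runs[-1] = (start, t, value)   # extend the current run
        else:
            runs.append((t, t, value))     # start a new run
    return runs

message = [(0, 101), (1, 101), (2, 101), (3, 103), (4, 103)]
compressed = compress(message)
```

Five versions shrink to two runs here; a sensor whose value changes at every sample would see no saving.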

67
Conditional Pushing
68
Sequential Pushing

The sensor nodes of a CMQ are assumed to be close
to each other and they can directly communicate
with each other
The submission of sub-query results is triggered
by a triggering node which is one of the
participating nodes of the CMQ
The triggering node is chosen as the
participating node that has
the least number of satisfied results in
evaluation
The pushing of sub-query results follows a
sequential order according to the evaluation
results
Partial processing of the operation, which would
otherwise be performed at the coordinator
node, is performed along the way
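
A rough sketch of sequential pushing for a maximum query follows. It rests on several assumptions that the slides do not spell out: the chain stops as soon as one sub-query is false (since the aggregation needs results from all sub-queries), the last node in the order acts as coordinator, and one message is sent per hop. The function name is hypothetical.

```python
def seq_push_max(readings, order, threshold):
    """Return (result, messages_sent) for one evaluation round."""
    partial = None
    messages = 0
    for i, node in enumerate(order):
        value = readings[node]
        if value <= threshold:
            return None, messages        # false sub-query: stop pushing
        # Partial aggregation happens at each node along the way.
        partial = value if partial is None else max(partial, value)
        if i < len(order) - 1:
            messages += 1                # push the partial result to the next node
    return partial, messages

readings = {"A": 101.0, "B": 99.0, "C": 105.0}
early_stop = seq_push_max(readings, ["B", "A", "C"], 100.0)
late_stop = seq_push_max(readings, ["A", "B", "C"], 100.0)
```

Putting the node that is least likely to satisfy its condition first (B here) ends the round with no messages at all, which is the motivation for choosing it as the triggering node.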

69
Sequential Pushing
70
Sequential Pushing

Due to dynamic properties of sensor data, the
probability of satisfying the condition in a
sub-query at a node may change with time
Reordering of the nodes
Assign the false node to be the first node in
the sequence
All the nodes following the false node
remain in the same relative order to each other
All the nodes in front of the false node remain
in their original relative order. They rejoin the
node sequence after the last node
of the original sequence
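
Taken together, the three reordering rules amount to rotating the node sequence so that the false node comes first. A minimal sketch, with a hypothetical function name and node labels:

```python
def reorder(sequence, false_node):
    """Rotate the sequence so that `false_node` becomes the first node."""
    i = sequence.index(false_node)
    # Nodes from the false node onward keep their order and move to the front;
    # the nodes that were in front of it are appended after the old last node.
    return sequence[i:] + sequence[:i]

new_order = reorder(["A", "B", "C", "D"], "C")
```

With C as the false node, the order A, B, C, D becomes C, D, A, B.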

71
Sequential Pushing
72
SeqPush Vs Centralized Scheme

The total number of messages is normally smaller
especially for the case where the probability of
satisfying the aggregation conditions in all the
sub-queries is not high
The processing workload at the coordinator node
is lower as the participating nodes are
responsible for partial computation of the
aggregation operation
The processing cost in searching for relatively
consistent sensor data values will be lower due
to a false result from a sub-query

73
References

Schiller 8.3
David B. Johnson and David A. Maltz, Dynamic
Source Routing in Ad Hoc Wireless Networks
(DSR), in Mobile Computing, 1996.
Young-Bae Ko and Nitin H. Vaidya, Location-aided
Routing (LAR) in Mobile Ad Hoc Networks, in
Proceedings of 1998 ACM International Conference
on Mobile Computing and Networking
Y. Yao and J. E. Gehrke, Query Processing in
Sensor Networks, in Proceedings of the First
Biennial Conference on Innovative Data Systems
Research (CIDR 2003), Asilomar, California,
January 2003