Title: Searching the Physical World: Distributed Protocols for Data Coverage and Caching in WSNs
1Searching the Physical World: Distributed Protocols for Data Coverage and Caching in WSNs
Dimitrios Katsaros, Ph.D.
Dept. of Computer and Communication Engineering, University of Thessaly
Nicosia, June 17th, 2008
2Outline of the talk
- WSNs: A working reality
- What is the Sensory Web?
- Data Coverage issues in WSNs
- Cooperative Caching for WSNs
- Concluding remarks
4Wireless Sensor Networks (WSNs)
- Wireless Sensor Networks features
- Homogeneous devices
- Stationary nodes
- Dispersed network
- Large network size
- Self-organized
- All nodes act as routers
- No wired infrastructure
- Potential multihop routes
5WSNs - Applications
6More exotic applications of WSNs
7What's special about WSNs?
- Resource constraints: sensor nodes are battery-, memory- and processing-starved devices
- Variable channel capacity: the multi-hop nature of WSNs implies that wireless link capacity depends on the interference level among nodes
- Multimedia in-network processing: sensor nodes store rich media (image, video), and must retrieve such media from remote sensor nodes with short latency
8Challenges
- Huge network size
- Unknown/variable network topology
- Agnostic users
- Fault tolerance
- Sensor readings are simply votes
10Research areas: Ultimately → ???
Sensory Web
Mobile/Pervasive Computing
Overlay Nets
Web
Mobile Ad Hoc
Wireless Sensor Networks
Information Retrieval
11Search Engines for the Physical World
- Cooperating Sensors
- Distributed Protocols
- Energy-efficient Communication
- Short-latency Data Retrieval
- Unknown Network Topology
- Topology Control
- Storage in Flash Devices
13Querying WSNs
- Simple queries, e.g., "Report the value of the humidity"
- Aggregate queries, e.g., "Report the average humidity of all sensors in region X"
- Approximate queries, requiring data summarization to perform holistic data aggregation in the form of histograms or contour maps, e.g., "Report the contour of toxic chemical gas in region X"
- Complex queries, which, if expressed in SQL, would involve joins and nested or condition-based sub-queries, e.g., "Among regions X and Y, report the average humidity of the region with the highest temperature"
- Advanced queries, such as top-k queries, e.g., "Report the k data objects with the highest temperature"
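As a rough illustration of these query classes, the sketch below evaluates a simple, an aggregate, a complex and a top-k query over a small in-memory table. It is a centralized toy, not an in-network protocol; the readings, region labels and function names are all invented for the example.

```python
import heapq
from statistics import mean

# Hypothetical readings: (sensor_id, region, humidity, temperature)
READINGS = [
    (1, "X", 40.0, 21.0),
    (2, "X", 55.0, 24.5),
    (3, "Y", 62.0, 19.0),
    (4, "Y", 48.0, 27.0),
    (5, "X", 51.0, 22.0),
]

def simple_query(sensor_id):
    """Simple query: report the humidity of one sensor."""
    for sid, _region, humidity, _temp in READINGS:
        if sid == sensor_id:
            return humidity
    return None

def aggregate_query(region):
    """Aggregate query: average humidity of all sensors in a region."""
    return mean(h for _sid, r, h, _t in READINGS if r == region)

def complex_query(regions):
    """Complex query: average humidity of the region (among `regions`)
    that contains the highest temperature reading."""
    candidates = [row for row in READINGS if row[1] in regions]
    hottest_region = max(candidates, key=lambda row: row[3])[1]
    return aggregate_query(hottest_region)

def top_k_query(k):
    """Top-k query: the k readings with the highest temperature."""
    return heapq.nlargest(k, READINGS, key=lambda row: row[3])
```

In a real sensornet each of these would be evaluated in-network rather than over a collected table; the functions only pin down what each query is expected to return.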
14Querying limitations (1/2)
"Report the k smallest values of humidity within region X along with the sensors that sensed them"
What about sensor failures?
15Querying limitations (2/2)
"Report the k smallest values of humidity across the whole sensornet along with the sensors that sensed them"
What about small shifts in the region boundaries?
16The concept of Data Coverage
"Report the sensor(s) whose humidity value is not covered by any other humidity value across the whole sensornet"
Sensor with max humidity value
17The concept of k-Data Coverage
"Report the sensor(s) whose humidity value is covered by at most k (e.g., k = 2) other humidity values across the whole sensornet"
Sensor with max value
Sensor with 2nd max value
Sensor with 3rd max value
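A minimal sketch of k-data coverage for a single measured quantity, assuming that one value "covers" another when it is strictly larger. The function name and the sample readings are hypothetical, and the computation is centralized purely for illustration.

```python
def k_data_coverage(readings, k):
    """Return the sensor ids whose value is covered (strictly exceeded)
    by at most k other values, ordered from largest to smallest value.

    readings: dict mapping sensor id -> measured value.
    """
    answer = []
    for sid, value in readings.items():
        covered_by = sum(
            1 for other, v in readings.items() if other != sid and v > value
        )
        if covered_by <= k:
            answer.append(sid)
    return sorted(answer, key=lambda s: readings[s], reverse=True)


humidity = {"a": 30, "b": 70, "c": 55, "d": 62, "e": 41}
uncovered = k_data_coverage(humidity, 0)   # only the maximum
two_covered = k_data_coverage(humidity, 2) # max, 2nd max and 3rd max
```

With k = 0 this reduces to plain data coverage (the sensor with the maximum value); with k = 2 it returns the sensors holding the three largest values, matching the slide.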
18Feature Distribution Maps
Still, we cannot find out what happens in neighborhoods (local minima, local maxima, etc.), because these features are local rather than network-wide (global)
19The concept of d-hop k-Data Coverage
"Depict the points (i.e., sensors) with the largest humidities relative to their neighboring sensors"
- localized definition of neighborhoods
- no region prespecification
- setting d to the sensornet diameter yields network-wide k-coverage
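The definition can be made concrete with a centralized reference computation. This is not the distributed protocol presented later in the talk, only a sketch of what the query should return; it assumes one scalar value per sensor, "covers" meaning "strictly larger", and an adjacency-list graph. All names are illustrative.

```python
from collections import deque

def within_d_hops(adj, source, d):
    """Nodes reachable from `source` in at most d hops (source excluded)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == d:
            continue  # do not expand beyond d hops
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    del dist[source]
    return set(dist)

def d_hop_k_coverage(adj, value, d, k):
    """Sensors covered by at most k values within their d-hop neighborhood."""
    answer = set()
    for node in adj:
        covering = sum(
            1 for other in within_d_hops(adj, node, d)
            if value[other] > value[node]
        )
        if covering <= k:
            answer.add(node)
    return answer


# A 5-node path 0-1-2-3-4 with one reading per node.
PATH = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
VALUE = {0: 5, 1: 1, 2: 4, 3: 2, 4: 3}
```

On this path, d = 1 and k = 0 picks out the local maxima, while d = 4 (the diameter) and k = 0 leaves only the global maximum.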
20The d-hop k-Data Coverage problem
- Generalizes
- The k-skyband query
- The top-k query
- The d-hop dominating set formation problem
- Deals with
- Any number of readings by a sensor node
- Any number of measured quantities, e.g., humidity, temperature, etc.
- More generic predicates, not only maximum, minimum
21Data Coverage in Neighborhoods (DaCoN)
- Distributed protocol for processing d-hop k-data coverage queries in WSNs
- Runs localized in neighborhoods
- No network spanners, e.g., aggregation tree, spanning tree
- No demanding initialization phase to construct the spanner
- Uniform energy consumption, no hot spots of communication
- Runs in 3 phases
22DaCoN's execution
- In a 2-dimensional space, assume that we wish to maximize the first dimension and minimize the second one
- v_i.d_x denotes the x-th dimension of value v_i
- v_i covers a value v_j if it holds that
- v_i.d_1 > v_j.d_1 and v_i.d_2 < v_j.d_2
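The covering test for this 2-dimensional case is small enough to write out directly. In the sketch below a value is a tuple (d_1, d_2); the function names are illustrative and not taken from the talk.

```python
def covers(vi, vj):
    """True if vi covers vj: strictly larger in dimension 1 (maximized)
    and strictly smaller in dimension 2 (minimized)."""
    return vi[0] > vj[0] and vi[1] < vj[1]

def cover_count(values, target):
    """Number of values in `values` that cover `target`."""
    return sum(1 for v in values if covers(v, target))


VALUES = [(5, 1), (3, 2), (5, 3), (4, 0)]
```

Note that (5, 1) and (4, 0) do not cover each other: each is better in exactly one dimension, so both remain uncovered.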
23PHASE 1. First d rounds
- Each sensor sends its k largest values to all its 1-hop neighbors
- It finds the k largest values, taking into account its own values and the values that it has received from its neighbors
- It forms a message with these values and stores the message in a buffer, frb
- In the next d-1 rounds, the above procedure is repeated, with the difference that each sensor now considers as its k largest values the values of the last message in frb
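Phase 1 can be simulated with synchronous rounds. The sketch below rests on several assumptions: "k-th larger values" is read as "the k largest values", each value is a (reading, sensor_id) tuple so that copies arriving over different paths are merged, and the function and variable names are invented. It is one possible reading of the slide, not the authors' implementation.

```python
import heapq

def phase1(adj, own_values, d, k):
    """Simulate d synchronous rounds of top-k flooding.

    adj: dict node -> list of 1-hop neighbors.
    own_values: dict node -> list of (reading, sensor_id) tuples.
    Returns frb: dict node -> list of d messages (one per round),
    each message being that node's current k largest values.
    """
    current = {n: heapq.nlargest(k, set(own_values[n])) for n in adj}
    frb = {n: [] for n in adj}
    for _ in range(d):
        nxt = {}
        for n in adj:
            pool = set(current[n])       # own current top-k
            for nb in adj[n]:
                pool.update(current[nb]) # messages from 1-hop neighbors
            nxt[n] = heapq.nlargest(k, pool)
            frb[n].append(nxt[n])
        current = nxt                    # all nodes advance together
    return frb


PATH = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
OWN = {0: [(9, 0)], 1: [(2, 1)], 2: [(5, 2)], 3: [(7, 3)]}
frb = phase1(PATH, OWN, d=2, k=2)
```

After two rounds node 3 has still not heard of the reading 9 held by node 0, because that sensor is three hops away; this is the sense in which d bounds the neighborhood.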
24PHASE 2. Next d rounds
- Similar to the previous rounds, but
- Each sensor finds its k values by taking into account the previous message and the messages that it has received from its neighbors, as follows: each value v_i (1 ≤ i ≤ k) is selected by keeping the smallest among the i-th values of these messages
- These values form a message that is stored in a buffer, srb
25PHASE 3. Answer of query
- Each value v_i (1 ≤ i ≤ k) of the answer is selected as follows: the sensor compares the messages of frb and srb and tries to find pairs of values among the first i values of each message
- After the identification of all pairs of values, the sensor selects the minimum pair as the i-th value of its answer
- If a pair of values does not exist, the sensor selects the maximum of the first i values of the messages of frb
27DaCoN evaluation
- No competing methods
- Network topologies: existence and strength of clusters of sensors, density of sensor nodes, etc.
- Sensor data generator
28Impact of sensornet size: messages
29Impact of sensornet size: activated sensors
30Impact of assortativity: messages
31Impact of assortativity: activated sensors
32Impact of k (500 sensors): activated sensors
33Impact of k (1000 sensors): activated sensors
34d-hop k-data coverage
- Feature Distribution Maps
- Fully distributed solution: DaCoN
- Little overhead
- Little storage
- Light computational load
- Few messages, no hotspots in communication
- How do we improve upon the latency when the sensors need data from other sensors?
- Cooperative Caching
36Our proposal
- Cooperative caching: the NICoCa protocol
- multiple sensor nodes share and coordinate cached data to cut communication cost and exploit the aggregate cache space of cooperating sensors
- Each sensor node has a moderate local storage capacity associated with it, i.e., a flash memory
- Jim Gray predicted that flash memories will replace hard disks
37Relevant work (1/2)
- Caching in OSs, DBMSs, the Web
- No extreme resource constraints
- Caching for wireless broadcast cellular networks
- more powerful nodes
- one-hop communication with resource-rich base stations
- Most relevant research works
- cooperative caching protocols for MANETs
- GroCoca: organizes nodes into groups (based on data request pattern and mobility pattern)
- ECOR, Zone Co-operative, Cluster Cooperative: form clusters of nodes based on geographical proximity or by adopting node clustering algorithms for MANETs
38Relevant work (2/2)
- Protocols that deviated from such approaches
- CacheData: intermediate nodes cache the data to serve future requests instead of fetching data from their source
- CachePath: mobile nodes cache the data path and use it to redirect future requests to the nearby node which has the data instead of the faraway origin node
- Amalgamation of them: HybridCache, the champion cooperative caching protocol for MANETs
39NICoCa consists of
- A metric for estimating the importance of a sensor node, which will imply short latency in retrieval
- A cooperative caching protocol which strives to achieve uniform energy consumption
- Datum discovery and cache replacement component subprotocols
- Performance evaluation of the protocol and comparison with the state-of-the-art cooperative caching for MANETs, with J-Sim
40Terminology and assumptions
- The WMSN is abstracted as a graph G(V,E)
- An edge e(u,v) exists iff u is in the transmission range of v and vice versa (bidirectional links)
- The network is assumed to be connected
- N_1(v): the set of one-hop neighbours of v
- N_2(v): the set of two-hop neighbours of v
- N_12(v): the combined set of N_1(v) and N_2(v)
- LN_v: the induced subgraph of G associated with the vertices in N_12(v)
- d_G(v,u): the distance between v and u
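These neighborhood sets are straightforward to compute from an adjacency structure. The sketch below assumes the graph is a dict mapping each node to the set of its one-hop neighbours, and follows the slide literally in leaving v itself out of N_12(v); whether the intended definition includes v is not stated, so treat that as an assumption. Function names are illustrative.

```python
def n1(adj, v):
    """One-hop neighbours of v."""
    return set(adj[v])

def n2(adj, v):
    """Two-hop neighbours of v: nodes at distance exactly 2."""
    reach = set()
    for u in adj[v]:
        reach |= set(adj[u])
    return reach - n1(adj, v) - {v}

def n12(adj, v):
    """Combined one-hop and two-hop neighbourhood of v."""
    return n1(adj, v) | n2(adj, v)

def local_network(adj, v):
    """Induced subgraph of G on the vertices of N_12(v)."""
    keep = n12(adj, v)
    return {u: set(adj[u]) & keep for u in keep}


# Path graph 0-1-2-3 with bidirectional links.
G = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```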
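41A measure of sensor importance
- $\sigma_{uw} = \sigma_{wu}$: the number of shortest paths from $u \in V$ to $w \in V$ ($\sigma_{uu} = 0$)
- $\sigma_{uw}(v)$: the number of shortest paths from u to w that some vertex $v \in V$ lies on
- The node importance index NI(v) of a vertex v is $NI(v) = \sum_{u \neq v \neq w \in V} \frac{\sigma_{uw}(v)}{\sigma_{uw}}$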
42The NI index in sample graphs
43The NI index in sample graphs
- Nodes with large NI
- Articulation nodes (in bridges), e.g., 3, 4, 7, 16, 18
- With large fanout, e.g., 14, 8, U
44Centralized solution ???
- Create a broadcast tree to coordinate the identification of NIs
- many messages
- larger latency
- hot-spots in communication (nodes with large NI)
- Localized algorithms are preferable
- NIs in neighborhoods
45The NI index in a localized algorithm
2-hop neighbors of node 8
node 8 calculates the NI of its 2-hop neighbors
46The NI index in a localized algorithm
nodes 14 and 16 are more important than the
others from the viewpoint of node 8
Each node can identify its own important nodes
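The sketch below shows one way a node could compute NI values inside its own 2-hop neighborhood. It assumes NI(v) sums the ratio of shortest u-w paths through v to all shortest u-w paths over unordered pairs u, w different from v (a betweenness-style index, as the shortest-path counts defined on slide 41 suggest), and that the local subgraph includes the computing node itself. The function names are hypothetical, and a plain cubic-time loop is used for readability.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Hop distances and numbers of shortest paths from `source`."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:              # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:     # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma

def node_importance(adj):
    """NI(v) for every node of the (sub)graph `adj`."""
    info = {s: bfs_counts(adj, s) for s in adj}
    ni = {v: 0.0 for v in adj}
    for u, w in combinations(adj, 2):
        dist_u, sigma_u = info[u]
        if w not in dist_u:
            continue                       # u and w are disconnected
        for v in adj:
            if v in (u, w) or v not in dist_u:
                continue
            dist_v, sigma_v = info[v]
            # v lies on a shortest u-w path iff the distances add up.
            if w in dist_v and dist_u[v] + dist_v[w] == dist_u[w]:
                ni[v] += sigma_u[v] * sigma_v[w] / sigma_u[w]
    return ni

def local_importance(adj, v, hops=2):
    """NI computed only inside the `hops`-hop neighborhood of v."""
    dist, _ = bfs_counts(adj, v)
    keep = {u for u, du in dist.items() if du <= hops}
    local = {u: [w for w in adj[u] if w in keep] for u in keep}
    return node_importance(local)


STAR = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
CYCLE = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
PATH6 = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
```

A node would then pick as mediators the neighbors with the largest local NI. For brevity the helper searches the full graph to find the 2-hop set, whereas a real sensor would learn it from neighbor exchanges.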
47Housekeeping information in NICoCa
- Ultimate source of multimedia data: the Data Center
- Each node is aware of its 2-hop neighborhood
- Uses NI to characterize some neighbors as mediators
- Can be either a mediator or an ordinary node
- Each sensor node stores, for each cached item:
- the dataID and the actual datum
- the data size and the TTL interval
- its characterization as either O (i.e., own) or H (i.e., hosted)
- the timestamps of the K most recent accesses
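The per-item bookkeeping listed above maps naturally onto a small record type. The sketch below is one possible layout; the field names, the default K = 3, and the choice of seconds for the TTL are assumptions made for the example, not details given in the talk.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class CacheEntry:
    data_id: str
    datum: bytes
    ttl: float                 # TTL interval (unit assumed: seconds)
    status: str = "O"          # "O" = own, "H" = hosted for another node
    k: int = 3                 # how many recent access timestamps to keep
    accesses: deque = field(init=False, default=None)

    def __post_init__(self):
        if self.status not in ("O", "H"):
            raise ValueError("status must be 'O' (own) or 'H' (hosted)")
        # A bounded deque silently drops the oldest timestamp when full.
        self.accesses = deque(maxlen=self.k)

    @property
    def size(self):
        """Data size in bytes, derived from the stored datum."""
        return len(self.datum)

    def touch(self, timestamp):
        """Record an access (e.g., on a local cache hit)."""
        self.accesses.append(timestamp)


entry = CacheEntry("img-1", b"abcd", ttl=60.0, k=2)
for t in (1.0, 2.0, 3.0):
    entry.touch(t)
```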
48The cache discovery protocol (1/2)
- A sensor node issues a request for a multimedia item
- It searches its local cache and, if the item is found (local cache hit), the K most recent access timestamps are updated
- Otherwise (local cache miss), the request is broadcast and received by the mediators
- These check whether the 2-hop neighbors of the requesting node cache the datum (proximity hit)
- If none of them responds (proximity cache miss), then the request is directed to the Data Center
49The cache discovery protocol (2/2)
- When a mediator receives a request, it searches its cache
- If it deduces that the request can be satisfied by a neighboring node (remote cache hit), it forwards the request to the neighboring node with the largest residual energy
- If the request cannot be satisfied by this mediator node, then it does not forward it recursively to its own mediators, since this will be done by the routing protocol, e.g., AODV
- If none of the nodes can help, then the requested datum is served by the Data Center (global hit)
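The decision order of the discovery protocol (local hit, then a hit in the requester's neighborhood, then the Data Center) can be sketched as a single function. This is a heavy simplification: the mediators' role is collapsed into a direct lookup over the requester's 2-hop neighborhood, message exchange and routing are not modelled, and all names and data structures are invented for the example.

```python
def two_hop_neighbourhood(adj, v):
    """Nodes within two hops of v (v excluded). adj maps node -> set."""
    near = set(adj[v])
    for u in adj[v]:
        near |= adj[u]
    near.discard(v)
    return near

def discover(requester, data_id, adj, caches, energy):
    """Return (hit_type, serving_node) for a request.

    caches: dict node -> set of cached data ids.
    energy: dict node -> residual energy.
    """
    if data_id in caches[requester]:
        return ("local", requester)
    holders = [
        n for n in two_hop_neighbourhood(adj, requester)
        if data_id in caches[n]
    ]
    if holders:
        # Prefer the holder with the largest residual energy.
        return ("remote", max(holders, key=lambda n: energy[n]))
    return ("global", "DataCenter")


ADJ = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
CACHES = {0: set(), 1: {"a"}, 2: {"a", "b"}, 3: {"c"}}
ENERGY = {0: 1.0, 1: 0.4, 2: 0.9, 3: 0.7}
```

Choosing the holder with the most residual energy is what spreads the serving load and keeps energy consumption uniform across nodes.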
50The cache replacement protocol
- Each sensor node first purges the data that it has cached on behalf of some other node
- Calculate the following function for each cached datum i
- The candidate cache victim is the item which incurs the largest cost
- Inform the mediators about the candidate victim
- If it is cached by a mediator, the metadata are updated
- If not, it is forwarded to and cached at the node with the largest residual energy
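The victim-selection order can be sketched without knowing the exact cost function, which is not reproduced in this transcript. In the sketch below the cost is a caller-supplied function, hosted ("H") items are purged before own ("O") items, and eviction stops as soon as the new item fits. The stopping rule, the dictionary layout and the names are assumptions for illustration only.

```python
def select_victims(cache, needed, capacity, cost):
    """Pick items to evict so that `needed` more bytes fit in the cache.

    cache: dict data_id -> {"size": int, "status": "O" or "H", ...}
    cost:  function(entry) -> number; a larger cost is evicted first.
    Returns the evicted data ids in eviction order.
    """
    used = sum(entry["size"] for entry in cache.values())
    # Hosted items go first, then own items by decreasing cost.
    hosted = [d for d, e in cache.items() if e["status"] == "H"]
    own = sorted(
        (d for d, e in cache.items() if e["status"] == "O"),
        key=lambda d: cost(cache[d]),
        reverse=True,
    )
    victims = []
    for data_id in hosted + own:
        if used + needed <= capacity:
            break                          # enough room already
        used -= cache[data_id]["size"]
        victims.append(data_id)
    return victims


CACHE = {
    "a": {"size": 4, "status": "O"},
    "b": {"size": 3, "status": "H"},
    "c": {"size": 2, "status": "O"},
}
by_size = lambda entry: entry["size"]      # placeholder cost function
```

Each returned victim would then be offered to the mediators, which either update their metadata or pass the item to the node with the largest residual energy.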
51Evaluation setting (1/2)
- We compared NICoCa to
- HybridCache, the state-of-the-art cooperative caching protocol for MANETs
- Implementation of the protocols using the J-Sim simulation library
52Evaluation setting (2/2)
- Measured quantities
- number of hits (local, remote and global)
- residual energy level of the sensor nodes
- average latency for getting the requested data
- the number of packets dropped
- We present here only the results for the number of hits, which is representative of latency, collisions and energy consumption
- A small number of global hits → less network congestion, fewer collisions and packet drops
- A large number of remote hits → effective cooperation
- A large number of local hits does not by itself imply effective cooperation, because the cost of global hits cancels out the benefits of local hits
53Cache vs. hits (MB files, uniform access) in a sparse WMSN (d = 4)
54Cache vs. hits (MB files, uniform access) in a dense WMSN (d = 7)
55Cache vs. hits (MB files, uniform access) in a very dense WMSN (d = 10)
56Observe: MB files, uniform access
- For all network topologies (sparse, dense and very dense), NICoCa achieves more remote hits and fewer global hits than HybridCache
- This performance gap widens in favor of NICoCa as we move from sparse to denser WMSNs
- For very dense sensor deployments, NICoCa achieves double the remote hits of HybridCache and only half of its global hits
- For sparse WMSNs, HybridCache achieves slightly more local hits than NICoCa does, but this gap vanishes completely when moving to denser networks
- This small gain of HybridCache for sparse topologies is not advantageous at all, since it incurs twice as many global hits as local hits
57Cache vs. hits (KB files, Zipfian access) in a sparse WMSN (d = 4)
58Cache vs. hits (KB files, Zipfian access) in a dense WMSN (d = 7)
59Cache vs. hits (KB files, Zipfian access) in a very dense WMSN (d = 10)
60Observe: KB files, Zipfian access
- For all network topologies (sparse, dense and very dense), NICoCa achieves more remote hits and fewer global hits than HybridCache
- For very dense WMSNs, the requests reaching the Data Center for NICoCa are less than half those of HybridCache!
- NICoCa's global hits do not vary significantly with varying network topologies and varying local sensor storage
- Global hits of HybridCache are severely affected by the topology and the cache size
- For a cache equal to 1% of the total data, HybridCache's global hits increase at a pace of 50%!
- The results for Zipfian access on megabyte-sized data are even more impressively in favor of NICoCa
61Summary
- Wireless Sensor Networks (WSNs)
- Cooperation among sensors
- Distributed protocols
- A brand new world or Distributed Algorithms reloaded?
- Exploit the unknown network topology!
- Imprecise/incomplete queries!
- New storage devices (flash)
- Minimize energy consumption
- Minimize latency
62Thank you for your attention!
63Important references
- N. Dimokas, D. Katsaros, Y. Manolopoulos. Cooperative caching in wireless multimedia sensor networks. ACM Mobile Networks and Applications, accepted, May 2008
- M. Kontaki, D. Katsaros, Y. Manolopoulos. The d-hop k-data coverage query problem in wireless sensor networks. Submitted, June 2008
- D. Katsaros, Y. Manolopoulos. Prediction in wireless networks by Markov chains. IEEE Wireless Communications magazine (under second round review), April 2008
- L. Yin and G. Cao. Supporting cooperative caching in ad hoc networks. IEEE Transactions on Mobile Computing, 5(1):77-89, 2006