Title: Information Processing in Sensor Networks
1Information Processing in Sensor Networks
Anand Meka
Program Committee: D. Agrawal, S. Suri, A. K. Singh (Chair)
2Outline
- Motivation and Roadmap
- Internet-like Architecture
- Overview of each layer
- Research and Future Directions
- Plume Detection and Tracking
- Clustering Spatio-Temporal Data Distributions
4Where are they required?
- Micro-sensors, on-board processing, and wireless interfaces are feasible at very small scale, so they can monitor phenomena up close
- Enables spatially and temporally dense environmental monitoring
Contaminant Transport
Ecosystems, Biocomplexity
Marine Microorganisms
Seismic Structure Response
5Networked Sensor Node History
- LWIM III (UCLA, 1996): geophone, RFM radio, PIC, star network
- AWAIRS I (UCLA/RSC, 1998): geophone, DS/SS radio, StrongARM, multi-hop networks
- Sensor Mote (UCB, 2000): RFM radio, Atmel
- Medusa MK-2 (UCLA NESL, 2002)
- Predecessors in
- DARPA Packet Radio program
- USC-ISI Distributed Sensor Network Project (DSN)
- Manufactured at Crossbow Technology
6Telos New OEP Mote
- Single board philosophy
- Robustness, Ease of use, Lower Cost
- Integrated Humidity Temperature sensor
- First platform to use 802.15.4
- CC2420 radio, 2.4 GHz, 250 kbps (12x mica2)
- 3x RX power consumption of CC1000, 1/3 turn-on time
- Same TX power as CC1000
- Motorola HCS08 processor
- Lower power consumption, 1.8 V operation, faster wakeup time
- 40 MHz CPU clock, 4K RAM
- Package
- Integrated onboard antenna 3dBi gain
- Removed 51-pin connector
- Everything USB and Ethernet based
- 2 AA batteries
- Weatherproof packaging
- Support in upcoming TinyOS 1.1.3 Release
- Co-designed by UC Berkeley and Intel Research
- Available from Moteiv (moteiv.com)
7Sensor Node Energy Roadmap
Figure: average power (mW) per sensor node on a log scale from 10,000 down to 0.1, over 2000 to 2004. Reductions come from rehosting to low power, system-on-chip, and advanced power management algorithms.
Source: CENS04
8Communication/Computation Technology Projection
Assume a 10 kbit/s radio with 10 m range. The large cost of communication relative to computation continues.
Source: DARPA04
9Sensors
- Passive elements: seismic, acoustic, infrared, strain, salinity, humidity, temperature, etc.
- Arrays: imagers (visible, IR), biochemical (e.g., Electronic Nose at Caltech)
- Active sensors: radar, sonar
- Added computation along with sensing
- Technology trend: use of IC technology for increased robustness, lower cost, smaller size
- Network as a sensor?
11Sample Layered Architecture
Layers, top to bottom:
- User Queries, External Database
- In-network: Application processing, Data aggregation, Query processing
- Data dissemination, storage, caching
- Adaptive topology, Geo-Routing
- MAC, Time Synchronization, Localization
- Communication, sensing, actuation
Resource constraints call for more tightly integrated layers: an Internet-like architecture for such application-specific systems.
13Energy Efficiency in MAC Design
- Energy is of primary concern
- What causes energy waste?
- Collisions
- Control packet overhead
- Overhearing unnecessary traffic
- Long idle time
- Bursty traffic in sensor-net apps
- Idle listening consumes 50 to 100% of the power for receiving (Stemm97, Kasten et al)
14Case Study S-MAC
- S-MAC (Ye02)
- Tradeoffs: latency and fairness versus energy
- Major components in S-MAC
- Periodic listen and sleep
- Collision avoidance
- Overhearing avoidance (PAMAS01)
- Message passing
15Coordinated Sleeping
- Problem: Idle listening consumes significant energy
- Solution: Periodic listen and sleep
- Turn off radio when sleeping
- Reduce duty cycle to about 10% (120 ms on / 1.2 s off)
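The arithmetic behind the duty-cycle figure can be checked with a few lines. This is a minimal sketch: the 120 ms on / 1.2 s off schedule comes from the slide, while the listen and sleep power values are hypothetical placeholders, not measurements of any particular radio.

```python
def duty_cycle(on_s: float, off_s: float) -> float:
    """Fraction of each listen/sleep period during which the radio is on."""
    return on_s / (on_s + off_s)


def average_power_mw(duty: float, listen_mw: float, sleep_mw: float) -> float:
    """Time-weighted average radio power for a given duty cycle."""
    return duty * listen_mw + (1.0 - duty) * sleep_mw


# Schedule from the slide: 120 ms listening, 1.2 s sleeping.
cycle = duty_cycle(0.120, 1.2)

# Hypothetical radio figures, for illustration only.
avg = average_power_mw(cycle, listen_mw=30.0, sleep_mw=0.03)
print(f"duty cycle = {cycle:.1%}, average radio power = {avg:.2f} mW")
```

Counting the full period as on-time plus off-time gives a duty cycle slightly under 10%; if 1.2 s is instead read as the whole period, the figure is exactly 10%.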
16Collision Avoidance
- S-MAC is based on contention
- Similar to IEEE 802.11
- Carrier sense
- Randomized backoff time
- RTS/CTS for hidden terminal problem
- RTS/CTS/DATA/ACK sequence
17Overhearing Avoidance
- Problem: Receiving packets destined for others
- Solution: Sleep when neighbors talk
- Basic idea from PAMAS (Raghavendra98)
- But we only use in-channel signaling
- Who should sleep?
- All immediate neighbors of sender and receiver
- How long to sleep?
- The duration field in each packet tells other nodes the sleep interval
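The sleep decision described above can be sketched as follows. The `Packet` and `Node` classes and their field names are hypothetical and only illustrate the idea; they are not taken from the S-MAC implementation.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str          # "RTS", "CTS", "DATA" or "ACK"
    src: int
    dst: int
    duration_s: float  # time remaining until the whole exchange completes


class Node:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.sleep_until = 0.0

    def on_overhear(self, pkt: Packet, now: float) -> bool:
        """Sleep for the advertised duration if the packet is for someone else.

        Returns True when the node decides to sleep.
        """
        if self.node_id in (pkt.src, pkt.dst):
            return False  # this node takes part in the exchange and stays awake
        self.sleep_until = max(self.sleep_until, now + pkt.duration_s)
        return True

    def is_asleep(self, now: float) -> bool:
        return now < self.sleep_until


# Node 3 overhears an RTS from node 1 to node 2 and sleeps for its duration.
bystander = Node(3)
bystander.on_overhear(Packet("RTS", src=1, dst=2, duration_s=0.5), now=1.0)
```

Any neighbor of the sender hears the RTS or DATA and any neighbor of the receiver hears the CTS or ACK, which is why every immediate neighbor ends up sleeping.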
18Message Passing
- Problem: Sensor-net in-network processing requires the entire message
- Solution: Don't interleave different messages
- Long message is fragmented and sent in a burst
- RTS/CTS reserve the medium for the entire message
- Fragment-level error recovery with ACKs
- Extend Tx time and retransmit immediately
- Other nodes sleep for the whole message time
- Tradeoff: energy and message-level latency versus fairness
20Time Synchronization (Elson01)
- Crucial in many other contexts: tracking, beam forming, security, aggregation, etc.
- Global time not always needed
- New ideas
- Local timescales
- Receiver-receiver sync
- Post-facto sync
- Multihop time translation
- Mote implementation
- 10 µs single hop
- Error grows slowly over hops
- Fault-tolerant synchronization?
Figure: a sender broadcasts over the physical medium to two receivers; after the propagation time, each receiver's NIC records its own arrival time ("I saw it at t4", "I saw it at t5").
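Receiver-receiver sync and multihop time translation can be illustrated with a small sketch. It assumes two receivers timestamp the same reference broadcasts with their local clocks (integer microseconds here) and later exchange those timestamps; the function names are hypothetical and the sketch ignores clock drift.

```python
from statistics import mean


def estimate_offset(arrivals_a, arrivals_b):
    """Mean of (clock B - clock A) over broadcasts heard by both receivers."""
    if len(arrivals_a) != len(arrivals_b) or not arrivals_a:
        raise ValueError("need the same non-empty set of broadcasts at both receivers")
    return mean(b - a for a, b in zip(arrivals_a, arrivals_b))


def translate(t_a, offset_ab):
    """Convert a timestamp from A's local timescale into B's."""
    return t_a + offset_ab


def translate_multihop(t, offsets):
    """Apply pairwise offsets hop by hop along a path of receivers."""
    for offset in offsets:
        t = translate(t, offset)
    return t


# Three reference broadcasts as seen by receivers A and B (local microseconds).
seen_by_a = [1000, 2000, 3000]
seen_by_b = [1505, 2495, 3500]
offset_ab = estimate_offset(seen_by_a, seen_by_b)
```

Because both receivers hear the same broadcast at nearly the same instant, sender-side delays cancel out of the difference; averaging over several broadcasts reduces the remaining receive-side jitter.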
22Data storage and Querying
A. Store data centrally
B. Store data locally
C. Multi-resolution storage
23Storage challenges
A. Store data centrally
- Method: Transmit all data out of the network and store it in a centralized database.
- Advantage: Centralized, persistent storage and unconstrained search. Excellent initial deployment and debugging tool; deployments are currently primarily data-gathering based.
- Disadvantage: Power-inefficient; high latency due to bandwidth constraints.
24Storage challenges
B. Store data locally
- Method: Store data locally, and construct distributed index structures to reduce search cost: DCS (Ratnasamy Hotnets 2000), DIMS (Li Sensys2003), DIFS (Greenstein SPNA 2002).
- Advantage: Low communication overhead, efficient search.
- Disadvantage: Only suitable for short-term use when local storage capacity is limited.
26Data-Centric Storage (DCS)
- Data-centric: data are named
- Event data are stored, by name, at some home nodes
- Queries also go to the home nodes instead of the nodes that detected the events
27The Big Picture
- Based on
- geographic routing (GPSR- Karp00)
- P2P lookup algorithm (CAN- Ratnasamy01)
28Distributed Hash Table (DHT)
- void Put(key, value)
- Stores value at the home node in the sensor network according to key
- Value Get(key)
- Retrieves value from the home node in the sensor network according to key
29Properties of DHT
- Distributed Hash Function
- Known to everybody
- Every home node takes care of roughly the same number of event types
- Evenly distributed geographically
- Candidate message digest algorithms
- Such as SHA-1, MD5
30DHT - Example
Elephant -> MD5 -> 5a76e813d6a0a40548b91acc11557bd2 -> (11, 28)
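The Put/Get interface and the name-to-location hashing can be sketched as below. The way the MD5 digest is folded into grid coordinates, the 100 x 100 field, and the `DataCentricStore` class are all assumptions made for illustration; the slides do not say how the digest is mapped to (11, 28), so this sketch does not try to reproduce that value. The home node is picked by a global nearest-node search, standing in for what geographic routing would discover hop by hop.

```python
import hashlib
import math

GRID_W, GRID_H = 100, 100  # assumed field dimensions


def hash_to_location(key: str):
    """Map an event name to a point in the field using an MD5 digest."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    x = int.from_bytes(digest[:8], "big") % GRID_W
    y = int.from_bytes(digest[8:], "big") % GRID_H
    return (x, y)


class DataCentricStore:
    def __init__(self, node_positions):
        self.nodes = list(node_positions)
        self.storage = {pos: {} for pos in self.nodes}

    def home_node(self, key: str):
        """The node geographically closest to the hashed location."""
        target = hash_to_location(key)
        return min(self.nodes, key=lambda pos: math.dist(pos, target))

    def put(self, key: str, value) -> None:
        self.storage[self.home_node(key)].setdefault(key, []).append(value)

    def get(self, key: str):
        return list(self.storage[self.home_node(key)].get(key, []))


store = DataCentricStore([(10, 10), (50, 80), (90, 20)])
store.put("elephant", "sighting near the waterhole")
sightings = store.get("elephant")
```

Since every node evaluates the same hash function, a query for "elephant" reaches the same home node as the event that stored it, without flooding the network.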
31DCS Example Revisit
(11, 28) = Hash(elephant)
32DCS Example
Get(elephant)
(11, 28) = Hash(elephant)
33DCS Example
elephant
fire
34GPSR- ( Event Query Routing )
- Nodes know the identities and positions of their neighbors
- Location service required
- Send(value, x, y)
- Modes
- Greedy forwarding
- Packets are greedily forwarded to the neighbor closest to the destination coordinates
- Perimeter forwarding
35GPSR Greedy Forwarding
36GPSR - Void
37GPSR Perimeter Forwarding
Right-hand rule: each node that receives a packet forwards it on the next link counterclockwise about itself from the ingress link.
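Greedy forwarding and the point where it fails can be sketched as follows. This is a simplified illustration, not the GPSR implementation: positions are plain (x, y) tuples, `topology` is a hypothetical neighbor table, and perimeter forwarding is not implemented, so the route simply stops at a void.

```python
import math


def greedy_next_hop(current, neighbors, dest):
    """Neighbor closest to dest, or None if no neighbor is closer than current."""
    if not neighbors:
        return None
    best = min(neighbors, key=lambda n: math.dist(n, dest))
    if math.dist(best, dest) < math.dist(current, dest):
        return best
    return None  # void: greedy forwarding has reached a local minimum


def greedy_route(src, dest, topology):
    """Follow greedy hops from src toward dest; stop early at a void."""
    path = [src]
    current = src
    while current != dest:
        nxt = greedy_next_hop(current, topology.get(current, []), dest)
        if nxt is None:
            break  # GPSR would switch to perimeter forwarding here
        path.append(nxt)
        current = nxt
    return path


# Hypothetical neighbor table keyed by node position.
topology = {
    (0, 0): [(1, 0), (0, 1)],
    (1, 0): [(0, 0), (2, 0)],
    (2, 0): [(1, 0)],
    (0, 1): [(0, 0)],
}
route = greedy_route((0, 0), (2, 0), topology)
```

Each hop strictly reduces the distance to the destination, so the loop always terminates, either at the destination or at a void.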
38Other Routing Algorithms
- Directed Diffusion (Estrin02)
- Interests are subscribed and gradients are established
- Sinks reinforce gradients of high quality
- Rumor Routing (Braginsky02)
- Events are stored and forwarded randomly for k hops
- Query walks randomly until it hits, or selective retransmission/flooding is employed
- Gossip Routing (Li03)
- Each node sends a message to some of its neighbors instead of flooding
39Storage challenges
C. Multi-resolution storage
- Method: Provide gracefully degrading storage and query quality over time (DIMENSIONS, Ganesan03).
- Advantage: Long-term storage in storage-constrained networks, efficient search.
- Disadvantage: More communication overhead than (B).
40Progressive Local Data Storage (Ganesan03)
- Apply a progressive coding strategy to local storage
- Store the data at multiple resolutions locally and phase out data at different resolutions at different rates
- How much scaling does this provide?
- How to determine the aging periods of different resolutions?
42Plume Detection and Tracking
Plume origin
Actual plume boundary
43Application-dependent architecture design
- Goals
- Simple query: shape of plume
- Spatio-temporal query: plume location at the specified time
- Multi-resolution storage
44Plume Detection and Tracking
Plume origin
Actual plume boundary
Approx plume boundary
45Shape detection
- Region Based
- Medial Axis Transforms (Smith82)
- Moment-based Approaches (Loncaric98)
- Contour based
- Rosenfeld-Johnston method (RJ78)
- Fourier Transforms (Zahn81)
- Wavelet Descriptors (Chuang96)
46Shape Approximation
- Boundary Detection
- Perimeter traversal
- Rosenfeld-Johnston
- Wavelet Transforms
47Traversal techniques
- Rosenfeld-Johnston summarization
- Dominant (high curvature) points are retained
- Sliding window of (2k + 1) nodes is used to decide if the kth point is dominant
- Wavelet-based summarization
- Transforms are applied to each dimension independently
- Incremental construction to retain kth level coefficients
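The sliding-window test for dominant points can be sketched with the k-cosine measure. This is a simplified illustration: it assumes a closed boundary given as an ordered list of (x, y) points and keeps every point whose k-cosine exceeds a fixed, hypothetical threshold, without the local-maximum selection a full treatment would add.

```python
import math


def k_cosine(points, i, k):
    """Cosine of the angle at points[i] between the points k steps behind and ahead.

    Values near -1 mean a straight boundary; larger values mean a sharper corner.
    """
    n = len(points)
    px, py = points[i]
    bx, by = points[(i - k) % n]
    fx, fy = points[(i + k) % n]
    ax, ay = bx - px, by - py
    cx, cy = fx - px, fy - py
    norm = math.hypot(ax, ay) * math.hypot(cx, cy)
    return (ax * cx + ay * cy) / norm if norm else -1.0


def dominant_points(points, k, threshold=-0.5):
    """Keep points whose (2k + 1)-point window shows high curvature."""
    return [p for i, p in enumerate(points) if k_cosine(points, i, k) > threshold]


# Boundary of a small square, traversed counterclockwise.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
corners = dominant_points(square, k=1)
```

On this square only the four corners survive, so the summary sent upstream shrinks from eight points to four while preserving the shape.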
51Spatio-temporal Indexing
- Range Query
- Report whether the region specified by [x1, x2] x [y1, y2] was affected during the time interval [t1, t2]
- Indexing Scheme
- DIST
52Related Work
- P2P-RTree (Yilifu01)
- embeds an R-Tree onto the network.
- Root node becomes bottleneck.
- DIFS (Greenstein01)
- Range queries on a single attribute.
- multiple-rooted hierarchy.
- DIMS (Li03)
- Range queries on multi-dimensions
- Based on a locality-preserving hash
53DIST Indexing Scheme
- Movement is observed at different spatial resolutions.
- Grid is broken down into cells at various levels (resolutions).
- Lowest level depends on the size of the object.
- Cell leader keeps the time interval in which the plume was present in its cell.
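The per-cell bookkeeping and the range query it supports can be sketched as follows. This is a single-resolution, centralized stand-in for illustration only: the `CellIndex` class is hypothetical, whereas in the scheme described here each cell leader holds its own intervals and coarser levels summarize them.

```python
class CellIndex:
    """Maps each grid cell to the time intervals during which the plume was present."""

    def __init__(self):
        self.intervals = {}  # (x, y) -> list of (t_start, t_end)

    def record(self, cell, t_start, t_end):
        self.intervals.setdefault(cell, []).append((t_start, t_end))

    def affected(self, x1, x2, y1, y2, t1, t2):
        """True if any cell in [x1, x2] x [y1, y2] saw the plume during [t1, t2]."""
        for (x, y), spans in self.intervals.items():
            if x1 <= x <= x2 and y1 <= y <= y2:
                # Two closed intervals overlap when each starts before the other ends.
                if any(start <= t2 and t1 <= end for start, end in spans):
                    return True
        return False


index = CellIndex()
index.record((2, 3), t_start=10, t_end=20)
hit = index.affected(0, 4, 0, 4, t1=15, t2=30)
```

Storing one interval per cell keeps the leader's state small, and both the spatial and the temporal test can prune a query before it descends further.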
54Update Propagation
55Querying Algorithms
- SCA
- Query's spatial range is decomposed into cells at the resolution of the shape.
- Query is directed to the Smallest Common Ancestor of these cells.
- Query drills down, pruning using both spatial and temporal extents.
- Direct
- Query traverses the spanning tree constructed out of the cells.
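Finding the Smallest Common Ancestor of a query's cells can be sketched under one assumption: the hierarchy is a quadtree over a 2^depth x 2^depth grid, so a cell's ancestor at a given level is obtained by shifting its coordinates right. The function names are hypothetical.

```python
def cells_in_range(x1, x2, y1, y2):
    """Finest-level cells covered by the spatial range [x1, x2] x [y1, y2]."""
    return [(x, y) for x in range(x1, x2 + 1) for y in range(y1, y2 + 1)]


def smallest_common_ancestor(cells, depth):
    """Return (level, x, y) of the lowest cell containing every input cell.

    Level 0 is the finest resolution; level == depth is the root.
    """
    if not cells:
        raise ValueError("query must cover at least one cell")
    for level in range(depth + 1):
        ancestors = {(x >> level, y >> level) for x, y in cells}
        if len(ancestors) == 1:
            ax, ay = next(iter(ancestors))
            return (level, ax, ay)
    raise ValueError("cells lie outside the grid")


# A 2 x 2 query in the corner of an 8 x 8 grid (depth 3).
sca = smallest_common_ancestor(cells_in_range(0, 1, 0, 1), depth=3)
```

A small, well-aligned query is answered low in the hierarchy, while a query that straddles a high-level cell boundary has to start near the root before drilling down.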
56SCA Querying
57Direct Querying
58Aging
- Length of time for which a shape approximation is stored.
- Storing coarser resolution improves storage at the cost of quality.
- Exponential Aging scheme.
- Implementation
- Wavelets: (k-1)th level coefficients are computed from the kth, and the latter are discarded.
- RJ: Local perimeter traversal using the same or a different smoothing factor.
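The wavelet aging step can be illustrated with unnormalized Haar averaging, which is an assumption: the slides do not name the wavelet family. Each step derives the coarser level from the finer one and lets the finer one be discarded, halving the storage.

```python
def coarsen(coeffs):
    """One Haar averaging step: level-k approximation to level k-1, half as long."""
    if len(coeffs) % 2:
        raise ValueError("expected an even number of coefficients")
    return [(coeffs[i] + coeffs[i + 1]) / 2 for i in range(0, len(coeffs), 2)]


def age(coeffs, steps):
    """Apply several aging steps; only the coarsest result is kept."""
    for _ in range(steps):
        coeffs = coarsen(coeffs)
    return coeffs


boundary_x = [2, 4, 6, 8]      # one coordinate of a stored shape approximation
older = age(boundary_x, 1)     # half the storage
oldest = age(boundary_x, 2)    # a quarter of the storage
```

Because storage halves at every step, keeping each coarser level for proportionally longer uses roughly the same space per level, which is one way to read the exponential aging scheme.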
59Cloud Model Update Cost
60Cloud Model Querying Cost
61Cloud Model Querying Cost
62Adaptive Querying
63Extensions and Future Work
- Shape Detection
- Irregular Topologies.
- Faulty readings (Chintalapudi04).
- Path planning
- Find the safest path between specified points A and B.
- Accurate modeling of the plume motion.
- Implementation with humidity sensors.
65Discovering Climate Indices using Clustering
- Climate Indices
- Time series that summarize the behavior of selected regions of Earth's ocean and atmosphere
- Linking climate anomalies with ecosystem responses
- Discovering new phenomena (outliers)
http://www.cdc.noaa.gov/USclimate/Correlation
66Problem
- Cluster spatio-temporal data using a distributed algorithm
- Spatial Approximation
- Exploit correlations and locality in space to get compact representative summaries
- Temporal Approximation
- Compress temporal redundancies and efficiently track the cluster movement with time
67Approaches
- K-Means and Fuzzy-k-Means
- Depend on the initial k centers and might get stuck in local optima
- Hierarchical
- Start as point clusters; in every round only the most similar clusters are merged
- Mixture of Gaussians
- Fit the data to a mixture model of Gaussian distributions centred on barycentres
68Related Work
- CLARANS (Ng94)
- Medoid-based
- R-tree: to scale for disk-resident points
- DBSCAN (Ester96)
- Density-based clustering for discovering arbitrarily shaped clusters
- CURE (Guha98)
- Sampling-based hierarchical clustering
- All the above assume vector spaces
- BUBBLE (Ganti00)
- Uses the scalable MDS technique FastMap (Lin et al) to cluster in arbitrary metric spaces
69Related Work
- STING (Wang99)
- Grid-based approach to combine both hierarchy and density
- WaveCluster (Zhang00)
- Wavelet transforms are applied on the feature space to result in clusters at different scales
- Distributed Clustering
- Convex hulls of similar regions are transferred to the helper sites to integrate spatial data (Lazarevic99)
- Sample locally and use a distributed clustering algorithm based on non-parametric kernel density estimates, taking communication cost into account (Klusch03)
70Approach
- Clustering in metric spaces
- 1. Data distributions
- 2. EMD (Earth Mover's Distance), which is a measure of similarity, is a metric
- Wavelet-based histograms are used for representing the cumulative data distribution at each sensor node
- Merge two physically adjacent data distributions if they are only ? distance apart under the EMD metric
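The merge test can be sketched for the one-dimensional case, where the Earth Mover's Distance between two normalized histograms over the same bins equals the summed absolute difference of their cumulative distributions (in units of bin width). Plain histograms are used here instead of wavelet-based ones, and the `threshold` parameter is a hypothetical stand-in for the merge distance on the slide.

```python
from itertools import accumulate


def emd_1d(hist_a, hist_b):
    """EMD between two histograms over the same equal-width bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must use the same bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must contain some mass")
    cdf_a = accumulate(h / total_a for h in hist_a)
    cdf_b = accumulate(h / total_b for h in hist_b)
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))


def should_merge(hist_a, hist_b, threshold):
    """Merge two adjacent nodes' distributions if their EMD is within the threshold."""
    return emd_1d(hist_a, hist_b) <= threshold


# Readings at two neighboring nodes, bucketed into the same two bins.
node_a = [4, 0]
node_b = [3, 1]
merge = should_merge(node_a, node_b, threshold=0.3)
```

Unlike a bin-by-bin comparison, EMD grows with how far the mass has to move, so two distributions shifted by one bin score as closer than two shifted by many.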
71Distributed Clustering
- Start as point clusters (every node as root)
- Every node exchanges the sketch with its local neighbors
- Merger Process
- Local ?-condition
- Global d-condition
- If the conditions are satisfied, merge the two cluster trees
- Iterate the merger process, starting with the clusters and new ? and d, to build a hierarchy
72Extensions and Future Work
- Distributed Clustering in the presence of
- Faults
- Cluster maintenance with time
- Prediction models
- Distributed Outlier detection
- Kernel density estimates
- Implementation on motes
73New Directions
Security
Precision Agriculture
Global seismic Grids/facilities
Tropical biology
Theatre,Film,TV
Coral reef
Macro-Programming
Adaptive Sampling
High Integrity ENS
RFIDs
NIMS