Title: Growth Codes: Maximizing Sensor Network Data Persistence
1 Growth Codes: Maximizing Sensor Network Data Persistence
- Abhinav Kamra, Vishal Misra, Dan Rubenstein
- Department of Computer Science, Columbia University
- Jon Feldman, Google Labs
2 Outline
- Problem Description
- Solution Approach: Growth Codes
- Experiments and Simulations
- Conclusions and Ongoing work
3 Background: A generic sensor network
[Figure: sensor nodes holding sensed data x1 to x13; data follows a multi-hop path to the sink(s)]
- A single node failure can break the data flow
- Generic aim: collect data from all nodes at the sink(s)
4 Specific Context for our problem
- Sensor Networks in a Disaster setting
- E.g., Monitoring earthquakes, fires, floods
- Problems in this setting
- Congestion near sink(s)
- All nodes simultaneously forward data
- Overwhelm sink(s) capacity
- Network collapsing: nodes failing rapidly
- Pre-computed routes may fail
- Data from failed nodes can be lost
5 Challenges
- Networking Challenges
- Disaster scenarios: feedback often infeasible
- Frequent disruptions to the routing tree, if one is set up
- Difficult to predict node failures; sink locations unknown, surviving routes unknown
- Difficult to synchronize clocks amongst nodes
- Coding Challenges
- Data source distributed (among all sensor nodes)
- Prior approaches (Turbo codes, LDPC codes) aim at fast complete recovery
- Sensor nodes have very limited memory, CPU, bandwidth
6 Objectives
- Maximize data persistence
- Data persistence: fraction of data that eventually reaches the sink(s)
- Example: 6 of 10 symbols reach the sink, so persistence is 60%
- Preserve data from failed sensor nodes
- Deliver data to sink(s) as fast as possible
7 Limitations of Previous Work
- Channel Coding based
- (e.g. Turbo Codes [Anderson-ISIT94], LT Codes [Luby02])
- Aim for complete recovery in minimum time
- Difficult to implement with distributed sources
- Routing-based
- (e.g. Directed Diffusion [Govindan00], Cougar [Yao-SIGMOD02])
- Conjecture: is fragile (disrupted easily) in disaster scenarios
8 Our Approach
- Two main ideas
- Randomized routing and replication
- Avoid actively maintaining routes
- Replicate data to increase data survival
- Distributed channel codes (Growth Codes)
- Expedite data delivery and survivability
First (to our knowledge) distributed channel codes
9 Outline
- Problem Description
- Our Solution: Growth Codes
- Experiments and Simulations
- Conclusions and Ongoing work
10 Network Assumptions
[Figure: example network of sensor nodes 1 to 7 with sinks marked S]
- N node sensor network
- Limited storage: each node stores C data units
- Large storage at sink(s)
- All sensed data assumed independent (no source coding)
11 High Level View of the Protocol
[Figure: nodes 1 to 4 exchanging data]
Nodes send data at random times (current implementation: exponentially distributed timers)
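The random send times mentioned on this slide can be illustrated with a short sketch. It is not the authors' implementation: the function names, the `mean_interval` parameter, and the seeding are assumptions made for illustration, and only the use of exponentially distributed timers comes from the slide.

```python
import random


def next_send_delay(mean_interval: float, rng: random.Random) -> float:
    """Draw the waiting time until a node's next transmission.

    random.expovariate takes the rate (1 / mean), so the returned
    delays average `mean_interval` time units.
    """
    return rng.expovariate(1.0 / mean_interval)


def send_times(mean_interval: float, horizon: float, seed: int = 0) -> list:
    """Return the transmission times of one node up to `horizon`."""
    rng = random.Random(seed)  # hypothetical per-node seed
    times = []
    t = 0.0
    while True:
        t += next_send_delay(mean_interval, rng)
        if t > horizon:
            return times
        times.append(t)


if __name__ == "__main__":
    # Two nodes with different seeds transmit at unrelated instants.
    print(send_times(1.0, 5.0, seed=1))
    print(send_times(1.0, 5.0, seed=2))
```

Because each node draws its own delays, transmissions are spread out in time without any coordination between nodes.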
12 High Level View of the Protocol (2)
[Figure: nodes exchanging codewords along a time line marked K1, K2, K3]
After time K1, nodes start sending degree 2 codewords
Degree 2 codeword: the sender picks a random symbol and XORs it with its own symbol
Even if node 3 fails, node 3's data survives
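A minimal sketch of how a degree 2 codeword could be built and later unpacked, assuming symbols are small integers and that a node keeps the symbols it has received in a dictionary. The names `make_degree2_codeword`, `recover_missing`, and `stored` are hypothetical and not taken from the slides.

```python
import random


def make_degree2_codeword(own_id, own_symbol, stored, rng):
    """XOR the sender's own symbol with one randomly chosen stored symbol.

    `stored` maps node id -> symbol (int) for symbols received from
    other nodes. Returns (set of node ids, XOR of their symbols).
    """
    candidates = sorted(k for k in stored if k != own_id)
    other_id = rng.choice(candidates)
    return {own_id, other_id}, own_symbol ^ stored[other_id]


def recover_missing(codeword, known):
    """Recover the single unknown symbol of a codeword from the known ones."""
    ids, value = codeword
    missing = [i for i in ids if i not in known]
    if len(missing) != 1:
        raise ValueError("exactly one symbol of the codeword must be unknown")
    for i in ids:
        if i in known:
            value ^= known[i]  # XOR-ing a known symbol removes it
    return missing[0], value


if __name__ == "__main__":
    # Node 1 holds its own symbol and a copy of node 3's symbol.
    cw = make_degree2_codeword(1, 0b1010, {3: 0b0110}, random.Random(0))
    # A sink that already knows x1 recovers x3, even if node 3 has failed.
    print(recover_missing(cw, {1: 0b1010}))
```

The point of the example is that node 3's symbol lives on inside a codeword held elsewhere in the network, so it can still reach the sink after node 3 is gone.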
13 High Level View of the Protocol (3)
- After time K1, nodes start sending degree 2 codewords
- After time K2, nodes start sending degree 3 codewords
- ...
- After time Ki, nodes start sending degree i+1 codewords
Times Ki can be out of sync at different nodes
No need to tightly synchronize clocks
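The degree schedule on this slide can be sketched as a lookup against a node's local clock. The switch times below are made-up placeholder values, not the Ki used by Growth Codes, and the function name is hypothetical. Treating "after time Ki" as strictly after is also an assumption.

```python
from bisect import bisect_left


def codeword_degree(elapsed, switch_times):
    """Return the codeword degree a node uses `elapsed` time units after start.

    `switch_times` is [K1, K2, ...] in increasing order. Up to and including
    K1 the degree is 1; strictly after Ki the degree is i + 1.
    """
    return bisect_left(switch_times, elapsed) + 1


if __name__ == "__main__":
    ks = [10, 25, 45]  # placeholder values for K1, K2, K3
    for t in (0, 10, 11, 30, 100):
        print(t, codeword_degree(t, ks))
```

Since each node only consults its own elapsed time, two nodes whose clocks differ slightly simply switch degrees at slightly different moments.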
14 The Intuition behind Growth Codes
[Figure: codewords arriving over a time line, next to the set of symbols decoded at the sink]
When very few symbols are decoded:
- Easy to decode low degree codewords
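To make the decoding step concrete, here is a hedged sketch of an iterative XOR decoder a sink might run. It assumes each codeword arrives as a pair (set of symbol ids, XOR of those symbols); that representation and the name `decode` are illustrative assumptions rather than the authors' code.

```python
def decode(codewords):
    """Iteratively decode XOR codewords.

    Each codeword is (iterable of symbol ids, XOR of those symbols).
    Returns a dict mapping symbol id -> recovered symbol.
    """
    decoded = {}
    pending = [(set(ids), value) for ids, value in codewords]
    progress = True
    while progress:
        progress = False
        remaining = []
        for ids, value in pending:
            # Strip every symbol that is already known out of the codeword.
            for i in list(ids):
                if i in decoded:
                    ids.discard(i)
                    value ^= decoded[i]
            if len(ids) == 1:
                # Exactly one unknown left: the residue is that symbol.
                decoded[ids.pop()] = value
                progress = True
            elif len(ids) > 1:
                remaining.append((ids, value))
            # len(ids) == 0: codeword was redundant and is dropped.
        pending = remaining
    return decoded


if __name__ == "__main__":
    x1, x2, x3 = 5, 9, 12
    received = [({1, 2, 3}, x1 ^ x2 ^ x3), ({1, 2}, x1 ^ x2), ({1}, x1)]
    print(decode(received))
```

A degree 1 codeword decodes on its own, while a higher degree codeword has to wait until all but one of its symbols are known, which is why low degrees are the easy ones early on.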
15 The Intuition behind Growth Codes (2)
[Figure: codewords next to the set of symbols decoded at the sink]
When a significant number of symbols are decoded:
- Low degree codewords are often redundant
- Higher degree codewords are more likely to be useful
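The intuition on these two slides can be checked with a small calculation. The sketch below assumes a simplified model that is not stated on the slides: a degree d codeword XORs d distinct symbols chosen uniformly from n, the sink has already decoded r symbols, and a codeword counts as useful only if it yields a new symbol immediately, which requires exactly d - 1 of its symbols to be known.

```python
from math import comb


def prob_new_symbol(n, r, d):
    """Probability that a random degree-d codeword yields a new symbol now.

    n: total number of symbols, r: symbols already decoded at the sink,
    d: codeword degree. Counts the ways to pick d - 1 decoded symbols and
    one undecoded symbol, over all ways to pick d symbols.
    """
    if not (1 <= d <= n and 0 <= r <= n):
        raise ValueError("need 1 <= d <= n and 0 <= r <= n")
    return comb(r, d - 1) * (n - r) / comb(n, d)


if __name__ == "__main__":
    n = 10
    for r in (0, 4, 8):
        row = [round(prob_new_symbol(n, r, d), 3) for d in (1, 2, 3)]
        print(f"r={r}: degrees 1..3 -> {row}")
```

Under this model, degree 1 is at least as likely to help as degree 2 while r is at most (n - 1)/2, and higher degrees overtake it once more symbols are decoded. A codeword that is not immediately decodable may still become useful later, which this simple count ignores.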
16 Outline
- Problem Description
- Growth Codes
- Experiments and Simulations
- Conclusions and Ongoing work
17 Simulations/Experiments: Compare data persistence of various approaches
- Simulations
- Centralized Setting: compare GC with other channel coding schemes
- (e.g. Soliton, Robust Soliton; LT Codes [Luby02])
- Distributed Simulation: to assess the performance gain of coding vs no coding
- Experiment on motes
- Compare time for complete recovery of GC vs routing
- Resilience to node failures
18 Comparison with various coding schemes (N = 1500)
- Centralized Simulation
- (to compare with other channel coding schemes)
- Single source, single sink
- Source generates codewords according to coding scheme (GC, Soliton, R-Soliton)
- Zero failure rate
- No coding is fast in the beginning
- Slowdown is explained via the Coupon Collector's problem
- Soliton/R-Soliton are slow in the beginning (they use a lot more higher degree codewords)
- Growth Codes tries to decode the maximum at any time
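The slowdown of the no-coding case can be reproduced with a toy simulation. This is only a sketch under an assumed model (the sink receives one uniformly random uncoded symbol per time unit); it does not reproduce the slide's experiment, and the function name is hypothetical.

```python
import random


def arrival_times_without_coding(n, rng):
    """Simulate a sink receiving one uniformly random uncoded symbol per
    time unit. Entry k (0-based) of the result is the time at which the
    sink first held k + 1 distinct symbols."""
    seen = set()
    times = []
    t = 0
    while len(seen) < n:
        t += 1
        symbol = rng.randrange(n)
        if symbol not in seen:
            seen.add(symbol)
            times.append(t)
    return times


if __name__ == "__main__":
    times = arrival_times_without_coding(1500, random.Random(0))
    half, full = times[len(times) // 2 - 1], times[-1]
    print(f"half of the symbols after {half} receptions, all after {full}")
```

Early receptions are almost always new, but the last few missing symbols each take many receptions to show up, which is the Coupon Collector's effect named on the slide.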
19 Growth Codes vs No Coding (Varying N)
- Distributed Simulation
- (to assess the performance gain of coding)
- N sources, single sink
- Random graph topology (avg degree 10)
- Sink receives 1 codeword per time unit
- Complete recovery takes
- O(N log N) time without coding (Coupon Collector's effect)
- Linear time with Growth Codes
- Soliton/R-Soliton have no distributed implementation. How to compare?
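The O(N log N) figure for the no-coding case follows from the standard Coupon Collector's argument, sketched here under the assumption that each reception is an independent, uniformly random one of the N symbols. When k symbols are still missing, a reception is new with probability k/N, so the expected wait for the next new symbol is N/k. Summing over all stages gives the expected time T to complete recovery:

```latex
\mathbb{E}[T] \;=\; \sum_{k=1}^{N} \frac{N}{k} \;=\; N H_N \;\approx\; N \ln N + \gamma N \;=\; O(N \log N),
\qquad H_N = \sum_{k=1}^{N} \frac{1}{k}, \quad \gamma \approx 0.5772 .
```

In this model the expected recovery time exceeds the N receptions that are the minimum possible by a factor of H_N, which grows logarithmically with the network size.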
20 Experiments with (MICAz) motes
- (to measure data persistence with time)
- GC vs TinyOS's MultiHop routing protocol
- No routing setup at beginning (scenarios where sensor nodes are deployed rapidly)
- For persistence, MultiHop takes a long time to complete route setup
- Comparison with GC simulator validates simulator performance
21 Motes experiments: Resilience to node failures
- Nodes generate data every 300 seconds
- 3 nodes fail just after 3rd data generation
22 Motes experiments: Resilience to node failures
- 1st generation: GC faster, MH takes time to set up routes
- 2nd generation: routing already set up, MH very fast
- 3rd generation: MH needs to repair routes
23 Conclusions
- Data persistence in sensor networks
- First distributed channel codes (GC)
- Protocol requires minimal configuration
- Is robust to node failures
- Simulations and experiments on MICAz motes show:
- GC achieves complete recovery faster
- GC recovers more partial data at any time
24 Ongoing Work
- Adapt Growth Codes to scenarios where sensor data is correlated
- Take advantage of any available routing information (e.g. before a disaster)
- Estimate network size on the fly to use in Growth Codes
25 Thanks for your patience!
- For more information
- DNA Research Lab, Columbia University
- http://dna-wsl.cs.columbia.edu/