Title: System Management in Challenged Networks
1 - System Management in Challenged Networks
- CENS Seminar November 17th, 2006
- Martin Lukac
- Lewis Girod
- Deborah Estrin
- UCLA CENS - MIT CSAIL
2Outline
- Meso American Subduction Experiment (MASE): A Challenged Network
- Data Delivery
- System Management
- The Future
3Seismic Deployment Application Requirements
- Extensive: 500 km from Acapulco through Mexico City to Tampico
- Dense: 1 sensor every 5-10 km
- High bandwidth: data acquisition rate of 3 x 24-bit channels at 100 Hz each
- Online and reliable: semi-real-time (on the order of days), reliable data delivery to UCLA for analysis
- Online system management
- Query state, change configuration, update binaries
- Cannot interfere with data delivery
- Application-driven topology: application determines sensor placement
- Infrastructure does not (can't rely on pre-existing cell or power infrastructure)
MASE: Given these requirements, we deployed solar-powered seismic stations equipped with 802.11b
418 - A 152 - B 69 - C 77 - D 107 - E 42 -
F 81 - G 202 - H 76 - I 106 - J 95 - K 53 -
L 157 - M
MASE 13 Node Cuernavaca Line
- Network topology does not reflect the mostly linear physical topology
- Routing and other services cannot use physical topology
[Diagram: data paths among nodes labeled A through N; A is the sink, with a direct Internet connection]
5How challenged is the MASE network?
- Frequent unpredictable disconnections
- Rainy season: sites flood (some 24x7), trees grow
- Wind: misaligned antennas
- Equipment malfunction: amps burn, voltage regulators break
- Poor and unstable links
- Connectivity is a secondary concern for site selection
- Stretched links highly susceptible to weather and environment
- Human effort is a critical resource
- Installation, maintenance, protection
6Networking support needed for both data
acquisition and system management
- Data delivery: bandwidth driven
- Bandwidth: 20-40 MB per day per station
- Latency: get the data eventually, but reliably
- Many-to-one routing
- System management: latency driven
- Bandwidth: usually less than tens of KB
- Latency: as fast as possible
- One-to-all routing and back
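As a rough check on the data-delivery bandwidth figure, the raw acquisition rate can be worked out from the stated sampling parameters, reading the requirement as three 24-bit channels at 100 Hz each. The sketch below is only that arithmetic. The slides do not say why the delivered volume (20-40 MB per day) is smaller than the raw figure; compression is one plausible explanation, but that is an assumption.

```python
# Raw acquisition rate implied by the stated sampling parameters.
CHANNELS = 3          # channels per station (as read from the requirements slide)
SAMPLE_RATE_HZ = 100  # samples per second per channel
BITS_PER_SAMPLE = 24  # digitizer resolution

SECONDS_PER_DAY = 24 * 60 * 60


def raw_bytes_per_day(channels=CHANNELS, rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Uncompressed bytes produced by one station in one day."""
    bytes_per_second = channels * rate_hz * bits // 8
    return bytes_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    raw = raw_bytes_per_day()
    print(f"raw: {raw / 1e6:.2f} MB/day")  # 900 B/s * 86400 s = 77.76 MB/day
    for delivered_mb in (20, 40):
        # Ratio of the quoted delivered volume to the raw volume.
        print(f"{delivered_mb} MB/day is {delivered_mb * 1e6 / raw:.0%} of raw")
```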
7Well-known limitations of existing techniques
- Data delivery and system management techniques designed for wired or always-on wireless networks do not work well
- Typical tools use TCP to create and maintain an end-to-end session to deliver a stream of data over multiple hops
- These are online applications which expect reliable links with low latencies
- Patterns of poor links, disconnections, and disruptions
- Difficult to obtain and maintain end-to-end connections
- Intermittent end-to-end connections insufficient to achieve necessary bandwidth and latency
8Our Contributions
- Real-world application and deployment of Delay Tolerant Networking (DTN) techniques for data delivery
- Disruption Tolerant Shell (DTS): a tool for system management on challenged networks that performs better than traditional tools
9Summary
- MASE: A Challenged Network
- Poor and erratic links
- Frequent unpredictable disruptions
- Data Delivery
- System Management
- The Future
10Data Delivery using DTN Techniques
- Buffer data into hour-long bundles (1-3 MB)
- Deliberate one-hop bundle transfer
- Path to sink determined by best ETX
- Improvement over end-to-end
- Not affected by path disconnections
- Keeps retrying on single link instead of full path
- Continual progress being made towards sink
- More efficient use of bandwidth in face of disconnections and bottlenecks
[Diagram labels: nodes A, B, C, F; X marks; end-to-end vs. hop-by-hop]
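The "path to sink determined by best ETX" step can be sketched as a shortest-path computation over link ETX values. This is a minimal illustration, not the deployed code: it assumes the usual definition of link ETX as 1 / (forward delivery ratio x reverse delivery ratio), and the function names, the link table format, and the example ratios are all hypothetical.

```python
import heapq


def link_etx(forward_ratio, reverse_ratio):
    """ETX of one link: expected transmissions for one acknowledged delivery."""
    if forward_ratio <= 0 or reverse_ratio <= 0:
        return float("inf")
    return 1.0 / (forward_ratio * reverse_ratio)


def next_hops(links, sink):
    """Return {node: (next_hop, path_etx)} using Dijkstra outward from the sink.

    links maps (a, b) -> (forward_ratio, reverse_ratio) for an undirected link.
    """
    graph = {}
    for (a, b), (fwd, rev) in links.items():
        cost = link_etx(fwd, rev)
        if cost == float("inf"):
            continue  # a dead link offers no route
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {sink: (None, 0.0)}
    heap = [(0.0, sink)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best[node][1]:
            continue  # stale heap entry
        for neighbor, cost in graph.get(node, []):
            candidate = dist + cost
            if neighbor not in best or candidate < best[neighbor][1]:
                best[neighbor] = (node, candidate)
                heapq.heappush(heap, (candidate, neighbor))
    return best


if __name__ == "__main__":
    # Hypothetical delivery ratios: the direct C-A link is poor, so C relays via B.
    links = {("A", "B"): (0.9, 0.9), ("B", "C"): (0.8, 1.0), ("A", "C"): (0.5, 0.5)}
    for node, (hop, etx) in sorted(next_hops(links, "A").items()):
        print(node, hop, round(etx, 2))
```

A node holding a bundle would hand it only to its own next hop and keep retrying that single link, which is the hop-by-hop behavior the slide contrasts with end-to-end transfer.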
11Upcoming Features
- Currently piggyback data movement log with actual data
- No global time stamping of log events
- Want coarse-grained global time (one second)
- Will be able to recreate movie of file movement for entire network
- Can help spot network problems and bottlenecks
- Upload data to SensorBase.org
- Makes it easy to visualize and browse data collection status
- RSS feed can provide access to anyone who wants to monitor problems or generic status of network
12Data Acknowledgement
- Nodes keep their own bundles until ACKed by sink
- Many ways of doing ACKs
- First try for ACK implementation worked
- Push bundle ID into StateSync (disseminates information to all the nodes in the network)
- But usage model not quite right: too many entries, too much churn for StateSync (can explain better later)
- Second try
- Use file dissemination feature of DTS to distribute ACK list once a day
- Use DTS to remove list once we know all nodes have file
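The retention rule above (a node keeps its own bundles until the sink acknowledges them, and acknowledgements arrive as a periodically distributed list) can be sketched as follows. The class name, the bundle ID format, and the idea of passing the ACK list as a plain sequence are assumptions made for illustration; the slides do not describe the actual file format.

```python
class BundleStore:
    """Holds a node's own bundles until the sink acknowledges them."""

    def __init__(self):
        self._bundles = {}  # bundle_id -> payload bytes

    def add(self, bundle_id, payload):
        self._bundles[bundle_id] = payload

    def apply_ack_list(self, acked_ids):
        """Delete every held bundle named in the ACK list; return the deleted IDs."""
        deleted = sorted(set(acked_ids) & set(self._bundles))
        for bundle_id in deleted:
            del self._bundles[bundle_id]
        return deleted

    def pending(self):
        """IDs of bundles still awaiting acknowledgement."""
        return sorted(self._bundles)


if __name__ == "__main__":
    store = BundleStore()
    store.add("stationK-hour-01", b"...")  # hypothetical bundle IDs
    store.add("stationK-hour-02", b"...")
    # The daily ACK list may name bundles from other stations; those are ignored.
    print(store.apply_ack_list(["stationK-hour-01", "stationL-hour-07"]))
    print(store.pending())
```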
13Summary
- MASE: A Challenged Network
- Poor and erratic links
- Frequent unpredictable disruptions
- DTN Style Data Delivery
- Resilient to path disconnections
- Efficient use of bandwidth
- System Management
- The Future
14System Management
- Existing management tool: remote shell (ssh)
- Modified management tool: Disruption Tolerant Shell
- Asynchronous remote shell to all nodes in network simultaneously
- Provides node management capabilities when end-to-end connections are unavailable or fail
- Ensures that commands will succeed as long as there is eventually a connection between a node and any other node that already has the command
Example commands: df -h; ls /opt/dts/file_mover | wc
[Diagram: commands and responses flowing among nodes A through F]
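The delivery guarantee on this slide (a command succeeds as long as a node eventually connects to any other node that already has the command) can be illustrated with a toy simulation. This is a sketch under simplifying assumptions, not the DTS implementation: the `Node` class, the contact order, and the fake "execution" that just records a string are all hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.commands = {}   # command_id -> command text
        self.responses = {}  # (command_id, node_name) -> response text

    def issue(self, command_id, text):
        self.commands[command_id] = text
        self._run_new()

    def _run_new(self):
        # "Execute" any command this node has not yet answered.
        for command_id, text in self.commands.items():
            key = (command_id, self.name)
            if key not in self.responses:
                self.responses[key] = f"{self.name} ran: {text}"

    def sync_with(self, other):
        """One contact between neighbors: each side copies what it lacks."""
        for src, dst in ((self, other), (other, self)):
            for key, value in src.commands.items():
                dst.commands.setdefault(key, value)
            for key, value in src.responses.items():
                dst.responses.setdefault(key, value)
        self._run_new()
        other._run_new()


def demo():
    """Chain A - B - C where A and C are never directly connected."""
    a, b, c = Node("A"), Node("B"), Node("C")
    a.issue("A-1", "df -h")
    a.sync_with(b)  # B learns the command and runs it
    b.sync_with(c)  # C learns the command from B, not from A
    c.sync_with(b)  # C's response reaches B
    b.sync_with(a)  # B relays all responses back to A
    return sorted(name for (_, name) in a.responses)


if __name__ == "__main__":
    print(demo())  # ['A', 'B', 'C']
```

Each contact is a single-hop exchange, so no step requires the whole path from A to C to be up at the same time.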
15Extra Fun Features of DTS
- Guaranteed in-order execution from source node
- Reboot and crash safe
- Implicit feedback on nodes and links: spot bottlenecks, dead nodes
- Execute a command on individual nodes
- Push a file to all nodes
- Distribute new script or component
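One way to get both in-order execution per source and safety across reboots is to persist the last executed sequence number for each source and only ever run the next one. The sketch below shows that idea; it is not how DTS is known to do it. The class name, the JSON state file, and per-source sequence numbers starting at 1 are assumptions for illustration.

```python
import json
import os
import tempfile


class OrderedExecutor:
    """Runs each source's commands in sequence order, surviving restarts."""

    def __init__(self, state_path, run):
        self.state_path = state_path
        self.run = run  # callable(source, seq, text)
        self.last_done = self._load()

    def _load(self):
        try:
            with open(self.state_path) as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

    def _save(self):
        # Write to a temp file, then rename, so a crash never leaves a torn file.
        directory = os.path.dirname(os.path.abspath(self.state_path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(self.last_done, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.state_path)

    def offer(self, source, pending):
        """pending maps seq -> text for one source; run whatever is next in order."""
        nxt = self.last_done.get(source, 0) + 1
        while nxt in pending:
            self.run(source, nxt, pending[nxt])
            self.last_done[source] = nxt
            self._save()
            nxt += 1
```

Note the trade-off in this sketch: a crash after a command runs but before the state is saved would cause that command to run again on restart, so commands would need to tolerate being repeated.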
16Upcoming Features
- Web interface
- Command line interface is nice for me
- Takes a bit of getting used to
- Web interface more intuitive for asynchronous model
- Constant feeds of frequently executed commands
- Disk space, file counts, Q330/Guralp status, link quality
- SensorBase.org
- Accountability log: load all commands and responses and metadata for those
- DTS analysis and implicit network feedback: just point and click
17Reliable State Synchronization
- StateSync: reliable and efficient publish-subscribe mechanism
- Implements a broadcast dissemination protocol
- Published data is scoped
- DTS publishes commands and responses one hop
- Works well for applications that
- Require reliable delivery
- Have a few Kbytes of data to share
- Have data whose lifetime is long compared to system latency requirements
- Suitable for DTN since it does not use end-to-end connections
[Diagram: nodes A, B, and C each PUBLISH their commands and responses and SYNCHRONIZE with the neighboring node]
18DTS latency results
- Compare latency of DTS to parallel ssh
- DTS is faster 90% of the time, comparable the rest
- DTS reaches 100% of nodes
- ssh requires retries from the source node
- Latency can vary by day, but DTS is always faster than or comparable to ssh
19What makes DTS better than ssh?
- StateSync data model: tables of key-value pairs
- DTS has a command table and a response table
- Each node republishes the command and response tables one hop
- Logging mechanism
- Do not republish whole table
- Only send changes to tables: small amount of information
- More efficient use of bandwidth in face of disconnections
- Retransmission protocol
- Keeps retrying on individual links
- Not affected by path disconnections
- No overhead of creating and maintaining end-to-end connection
[Diagram: nodes A and B holding command entry Cmd A-1 and response entries Resp A-1-A, Resp A-1-B, Resp A-1-C]
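The "only send changes to tables" point can be sketched with a key-value table that keeps an append-only log, so a neighbor that has applied the first k log entries only needs the entries after k. This illustrates the idea only and is not the StateSync protocol or API; the class names and the position-based catch-up are hypothetical.

```python
class LoggedTable:
    """Key-value table that records each change in an append-only log."""

    def __init__(self):
        self.rows = {}
        self.log = []  # list of (key, value) changes, oldest first

    def put(self, key, value):
        # Log only real changes so unchanged rows cost nothing to republish.
        if key not in self.rows or self.rows[key] != value:
            self.rows[key] = value
            self.log.append((key, value))

    def changes_since(self, position):
        """Entries a neighbor is missing, given how many it has already applied."""
        return self.log[position:]


class Replica:
    """A neighbor's copy, brought up to date by applying only new log entries."""

    def __init__(self):
        self.rows = {}
        self.position = 0

    def pull(self, table):
        delta = table.changes_since(self.position)
        for key, value in delta:
            self.rows[key] = value
        self.position += len(delta)
        return len(delta)  # number of entries that had to be sent


if __name__ == "__main__":
    table, neighbor = LoggedTable(), Replica()
    table.put("Cmd A-1", "df -h")
    table.put("Resp A-1-A", "output from A")
    print(neighbor.pull(table))  # 2 entries sent on first contact
    table.put("Resp A-1-B", "output from B")
    print(neighbor.pull(table))  # 1 entry sent, not the whole table
```

If a contact is interrupted, the neighbor simply pulls again from its last applied position on the next contact, which matches the slide's point about retrying on individual links.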
20Future of StateSync
- StateSync allows data to be published N hops
- When publishing N hops, delivery is not end to end, but the data path (the flow) is expected to be maintained with refresh beacons
- If refreshes from the source or a node in the flow stop, StateSync will not propagate information
- Not ideal for frequent disconnections
- DTS publishes data one hop
- Gets around problem by republishing another node's data as its own
- StateSync only publishes one hop
- Tweaks
- Allow flows to be propagated even when no refresh from source or node along data path
- Tunable latency parameters
- Report metrics about itself
- DTS can then publish data N hops
- Lowers RAM usage, lowers number of packets
21Site Installation
- Mexico: Xyoli Pérez-Campos, Mario Islas Herrera, Oscar Martínez Susano, Jorge Soto, Aida Quezada Reyes, Arturo Iglesias, Lizbeth Espejo, Luis Antonio Placencia Gómez, Luis Edgar Rodriguez, Fernando Greene
- USA: Paul Davis, Allen Husker, Igor Stubailo, Richard Guy, Sam Irving, Martin Lukac, Alma Quezada, Steve Skinner, Irving Flores
22Our Contributions
- Real-world application and deployment of Delay Tolerant Networking (DTN) techniques for data delivery
- Disruption Tolerant Shell (DTS): a tool for system management on challenged networks that performs better than traditional tools
23Summary
- MASE: A Challenged Network
- Poor and erratic links
- Frequent unpredictable disruptions
- DTN Style Data Delivery
- Resilient to path disconnections
- Efficient use of bandwidth
- System Management
- DTS: viable tool for system management for challenged networks
- The Future
24What's Next?
- Have a tool that works
- Understand conceptually why it works better
- We have a high-level analysis: per-link bandwidth
- Network is being pulled out in February
25Work in Progress
- Need better network characterization
- Long-Distance 802.11b Links: Performance Measurements and Experience, K. Chebrolu, B. Raman, S. Sen, IIT Kanpur, MobiCom 2006
- Use their driver to collect per-packet received signal strength, silence value, MAC packet type and subtype, whether the CRC check succeeded, MAC address information, MAC sequence number information
- Is our network different from theirs? Antennas and chipsets are the same. Our network is not always way up high and does not have good link quality all the time.
- Coordinated IP-level dumps on entire network
- Can't stop data flow
- Synchronize dumps between nodes
- Coordinate with driver information
- How do the long links affect the transfers?
- Huge hidden terminal problem; does RTS/CTS seem to help?
Vinayak analyzed received signal strength (RSS) for a single source-destination pair in the UNAM line.
- Max RSS: -46 dBm (83% of data)
- Min RSS: -81 dBm (10% of data)
- Difference of 35 dB
- Max/min for IIT Kanpur: -70 dBm / -90 dBm, a difference of 20 dB
Next, do this on the Cuernavaca line; maybe it will have higher variation than that of UNAM. High variation might be from inter-link interference since RTS/CTS is off. See what RTS/CTS does. If link variation is still high, then the Mexico network is intrinsically different from that in India. Maybe our network is in between Boston's urban Roofnet and Kanpur's rural network?
26New Applications
- DTS and DTN ideas/techniques can (must?) be applied to two new CENS applications
- GeoNet
- SHM (Structural Health Monitoring)
27GeoNet Rapidly Deployable Challenged Network
- Platform to support high-data-rate, rapidly deployed, large-scale WSN
- Deploy 100-1000 nodes after event at a separation of 0.5-1 km
- Software tools for rapid deployment
- Must make real-time decision about sensor location vs. network connectivity tradeoff
- Need as much feedback from network as possible
- Power-efficient platform such as LEAP needs appropriate software architecture
- Network time synchronization when no GPS available
- Data delivery and system management
- Take advantage of dual radios?
28SHM
- SHM framework to improve safety and reliability of aerospace, civil, and mechanical infrastructure by detecting damage before it reaches a critical state
- Initially targeting tall buildings
- Still a challenged network
- Building structure (walls, ceilings), people, other networks, stuff
29Thank you!
Demo!
Thanks to Igor and Derek for all the pictures and
diagrams!
Teotihuacan, 2006
30MASE Wireless Seismic Station
- 15 dBi Yagi or 24 dBi parabolic 2.4 GHz antenna
- 70 watt solar panel, GPS mast and guy wires
- Quanterra Q330 24-bit digitizer sensor controller
- 2.4 GHz amp
- Car battery
- CDCC (CENS Data Communication Controller)
- Guralp 3T seismometer
31Following slides prepared by Rob Clayton (Caltech) and Igor Stubailo (UCLA CENS)
Science!
32The Middle America Subduction Experiment (MASE).
Why Mexico? Slab detachment theory.
- A subduction zone is an area on Earth where two tectonic plates meet and move towards one another, with one sliding underneath the other and moving down into the mantle at a speed of several inches per year.
- Typically, an oceanic plate slides underneath a continental plate, and this often creates a zone with many volcanoes and earthquakes.
[Figure: Ferrari, 2004, Geology]
33Similarities of Mexico City and Los Angeles
locations
- LA and Mexico City are major centers of commerce which sit upon compliant sedimentary basins.
- Both are subject to damaging earthquakes and to the resonant shaking that earthquakes excite.
34Great potential of high station density
- Achieve 20 times better resolution than before.
- Provide visualization of the upper mantle and the subduction process, coast to coast across Mexico.
- The data collected is very valuable to scientists in seismology, geodesy, geochemistry, geology, computational geodynamics, geophysics, and others.
35Russian Event (Kamchatka) April 20, 2006, M7.7
36First results detect flat slab with receiver
functions
Rob Clayton, Caltech, 2006
37Related Work
- P. Levis, N. Patel, D. E. Culler, and S. Shenker. Trickle: A self-regulating algorithm for code propagation and maintenance in wireless sensor networks. NSDI 2004
- K. Whitehouse, C. Sharp, E. Brewer, and D. Culler. Hood: a neighborhood abstraction for sensor networks. MobiSys '04
- A. Vahdat, D. Becker. Epidemic Routing for Partially-Connected Ad Hoc Networks. Duke Technical Report CS-2000-06
- K. Fall. A Delay-Tolerant Network Architecture for Challenged Internets. SIGCOMM 2003
- DTNRG (http://www.dtnrg.org)
38Related Work
- Sensor network epidemic dissemination and state synchronization
- P. Levis, N. Patel, D. E. Culler, and S. Shenker. Trickle: A self-regulating algorithm for code propagation and maintenance in wireless sensor networks. NSDI 2004
- K. Whitehouse, C. Sharp, E. Brewer, and D. Culler. Hood: a neighborhood abstraction for sensor networks. MobiSys '04
- Lots of ideas from all the DTN work
- K. Fall. A Delay-Tolerant Network Architecture for Challenged Internets. SIGCOMM 2003
- DTNRG (http://www.dtnrg.org)