1
Reliable Transport and Code Distribution in
Wireless Sensor Networks
  • Thanos Stathopoulos
  • CS 213 Winter 04

2
Reliability Introduction
  • Not an issue on wired networks
  • TCP does a good job
  • Link error rates are on the order of 10⁻¹⁵
  • No energy cost
  • However, WSNs have
  • Low power radios
  • Error rates of up to 30% or more
  • Limited range
  • Energy constraints
  • Retransmissions reduce lifetime of network
  • Limited storage
  • Buffer size cannot be too large
  • Highly application-specific requirements
  • No single TCP-like solution

3
Approaches
  • Loss-tolerant algorithms
  • Leverage spatial and temporal redundancy
  • Good enough for some applications
  • But what about code updates?
  • Add retransmission mechanism
  • At the link layer (e.g. SMAC)
  • At the routing/transport layer
  • At the application layer
  • Hop-by-hop or end-to-end?

4
Relevant papers
  • PSFQ: A Reliable Transport Protocol for Wireless
    Sensor Networks
  • RMST: Reliable Data Transport in Sensor Networks
  • ESRT: Event-to-Sink Reliable Transport in
    Wireless Sensor Networks

5
PSFQ Overview
  • Key ideas
  • Slow data distribution (pump slowly)
  • Quick error recovery (fetch quickly)
  • NACK-based
  • Data caching guarantees ordered delivery
  • Assumption: no congestion, losses due only to
    poor link quality
  • Goals
  • Ensure data delivery with minimum support from
    transport infrastructure
  • Minimize signaling overhead for
    detection/recovery operations
  • Operate correctly in poor link quality
    environments
  • Provide loose delay bounds for data delivery to
    all intended receivers
  • Operations
  • Pump
  • Fetch
  • Report

6
End-to-end considered harmful?
  • Probability of reception degrades exponentially
    over multiple hops
  • Not an issue in the Internet
  • Serious problem if error rates are considerable
  • ACKs/NACKs are also affected
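The exponential degradation can be sketched numerically. Assuming independent losses with per-hop success probability p, end-to-end delivery over n hops succeeds with probability p^n (the function name and numbers below are illustrative, not from the slides):

```python
# Illustrative sketch: end-to-end delivery probability over n independent
# hops, each succeeding with probability p_hop (names are hypothetical).
def end_to_end_success(p_hop: float, hops: int) -> float:
    """Probability that a packet survives every hop."""
    return p_hop ** hops

# A modest 10% per-link loss rate already loses ~47% of packets over 6 hops,
# while hop-by-hop detection keeps the per-step probability constant at 0.9.
print(round(end_to_end_success(0.9, 6), 3))  # 0.531
print(end_to_end_success(0.9, 1))            # 0.9
```

The same decay applies to ACK/NACK control traffic, which is why it worsens with network size.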

7
Proposed solution: Hop-by-Hop error recovery
  • Intermediate nodes now responsible for error
    detection and recovery
  • NACK-based: loss detection probability is now
    constant
  • Not affected by network size (scalability)
  • Exponential decrease in end-to-end
  • Cost: keeping state on each node
  • Potentially not as bad as it sounds!
  • Cluster/group based communication
  • Intermediates are usually receivers as well

8
Pump operation
  • Node broadcasts a packet to its neighbors every
    Tmin
  • Data cache used for duplicate suppression
  • Receiver checks for gaps in sequence numbers
  • If all is fine, it decrements TTL and schedules a
    transmission
  • Tmin < Ttransmit < Tmax
  • By delaying transmission, quick fetch operations
    are possible
  • Reduce redundant transmissions (don't transmit if
    4 or more nodes have forwarded the packet
    already)
  • Tmax can provide a loose delay bound for the last
    hop
  • D(n) = Tmax × (# of fragments) × (# of hops)
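The loose delay bound is a straight product of the pump timer, fragment count, and hop count; a minimal sketch (function name assumed, example values taken from the experiments slide):

```python
def psfq_delay_bound(t_max: float, fragments: int, hops: int) -> float:
    """Loose PSFQ delay bound: D(n) = Tmax * (# of fragments) * (# of hops)."""
    return t_max * fragments * hops

# With Tmax = 0.3s and a 100-fragment file delivered over 5 hops:
print(psfq_delay_bound(0.3, 100, 5))  # 150.0 (seconds)
```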

9
Fetch operation
  • Sequence number gap is detected
  • Node will send a NACK message upstream
  • Window specifies range of sequence numbers
    missing
  • NACK receivers will randomize their transmissions
    to reduce redundancy
  • It will NOT forward any packets downstream
  • NACK scope is 1 hop
  • NACKs are generated every Tr if there are still
    gaps
  • Tr < Tmax
  • This is the pump/fetch ratio
  • NACKs can be cancelled if neighbors have sent
    similar NACKs
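The gap detection that triggers a fetch can be sketched as a scan over the sequence-number space; the NACK window is then the list of missing ranges (function and representation are assumptions, not PSFQ's actual packet format):

```python
def missing_windows(received, highest):
    """Return (lo, hi) ranges of sequence numbers missing up to `highest`,
    i.e. the windows a NACK would request. Purely illustrative."""
    have = set(received)
    gaps, start = [], None
    for seq in range(highest + 1):
        if seq in have:
            if start is not None:
                gaps.append((start, seq - 1))  # close the current gap
                start = None
        elif start is None:
            start = seq                        # open a new gap
    if start is not None:
        gaps.append((start, highest))
    return gaps

# Fragments 2-3 and 6-7 were lost in transit:
print(missing_windows([0, 1, 4, 5, 8], 8))  # [(2, 3), (6, 7)]
```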

10
Proactive Fetch
  • Last segments of a file can get lost
  • Loss detection impossible: no next segment
    exists!
  • Solution: timeouts (again)
  • Node enters proactive fetch mode if the last
    segment hasn't been received and no packet has
    been delivered after Tpro
  • Timing must be right
  • Too early: wasted control messages
  • Too late: increased delivery latency for the
    entire file
  • Tpro = α (Smax − Smin) Tmax
  • A node will wait long enough until all upstream
    nodes have received all segments
  • If the data cache isn't infinite
  • Tpro = α k Tmax (Tpro is proportional to
    cache size)

11
Report Operation
  • Used as a feedback/monitoring mechanism
  • Only the last hop will respond immediately
    (create a new packet)
  • Other nodes will piggyback their state info when
    they receive the report reply
  • If there is no space left in the message, a new
    one will be created

12
Experimental results
  • Tmax = 0.3s, Tr = 0.1s
  • 100 30-byte packets sent
  • Exponential increase in delay happens at 11% loss
    rate or higher

13
PSFQ Conclusion
  • Slow data dissemination, fast data recovery
  • All transmissions are broadcast
  • NACK-based, hop-by-hop recovery
  • End-to-end behaves poorly in lossy environments
  • NACKs are superior to ACKs in terms of energy
    savings
  • No out-of-order delivery allowed
  • Uses data caching extensively
  • Several timers and duplicate suppression
    mechanisms
  • Implementing any of those on motes is challenging
    (non-preemptive FIFO scheduler)

14
RMST Overview
  • A transport layer protocol
  • Uses diffusion for routing
  • Selective NACK-based
  • Provides
  • Guaranteed delivery of all fragments
  • In-order delivery not guaranteed
  • Fragmentation/reassembly

15
Placement of reliability for data transport
  • RMST considers 3 layers
  • MAC
  • Transport
  • Application
  • Focus is on MAC and Transport

16
MAC Layer Choices
  • No ARQ
  • All transmissions are broadcast
  • No RTS/CTS or ACK
  • Reliability deferred to upper layers
  • Benefits no control overhead, no erroneous path
    selection
  • ARQ always
  • All transmissions are unicast
  • RTS/CTS and ACKs used
  • One-to-many communication done via multiple
    unicasts
  • Benefits packets traveling on established paths
    have high probability of delivery
  • Selective ARQ
  • Use broadcast for one-to-many and unicast for
    one-to-one
  • Data and control packets traveling on established
    paths are unicast
  • Route discovery uses broadcast

17
Transport Layer Choices
  • End-to-End Selective Request NACK
  • Loss detection happens only at sinks (endpoints)
  • Repair requests travel on reverse (multihop) path
    from sinks to sources
  • Hop-by-Hop Selective Request NACK
  • Each node along the path caches data
  • Loss detection happens at each node along the
    path
  • Repair requests sent to immediate neighbors
  • If data isn't found in the caches, NACKs are
    forwarded to the next hop towards the source

18
Application Layer Choices
  • End-to-End Positive ACK
  • Sink requests a large data entity
  • Source fragments data
  • Sink keeps sending interests until all fragments
    have been received
  • Used only as a baseline

19
RMST details
  • Implemented as a Diffusion Filter
  • Takes advantage of Diffusion mechanisms for
  • Routing
  • Path recovery and repair
  • Adds
  • Fragmentation/reassembly management
  • Guaranteed delivery
  • Receivers responsible for fragment retransmission
  • Receivers arent necessarily end points
  • Caching or non-caching mode determines
    classification of node

20
RMST Details (contd)
  • NACKs triggered by
  • Sequence number gaps
  • Watchdog timer inspects fragment map periodically
    for holes that have aged for too long
  • Transmission timeouts
  • Last fragment problem
  • NACKs propagate from sinks to sources
  • Unicast transmission
  • NACK is forwarded only if segment not found in
    local cache
  • Back-channel required to deliver NACKs to
    upstream neighbors

21
Evaluation
  • NS-2 simulation
  • 802.11 MAC
  • 21 nodes
  • single sink, single source
  • 6 hops
  • MAC ARQ set to 4 retries
  • Image size: 5 KB
  • 50 100-byte fragments
  • Total cost of sending the entire file: 87,818
    bytes
  • Includes diffusion control message overhead
  • All results normalized to this value

22
Results: Baseline (no RMST)
  • ARQ and S-ARQ have high overhead when error rates
    are low
  • S-ARQ is better in terms of efficiency
  • Also helps with route selection
  • No ARQ results drop considerably as error rates
    increase
  • Exponential decay of end-to-end reliability
    mechanisms

23
Results: RMST with H-b-H Recovery and Caching
  • Slight improvement for ARQ and S-ARQ results over
    baseline
  • No ARQ is better even in the 10% error rate case
  • But, many more exploratory packets were sent
    before the route was established

24
Results: RMST with E-2-E Recovery
  • No ARQ doesn't work for the 10% error rate case
  • Numerous holes that required NACKs couldn't make
    it from source to sink without link-layer
    retransmissions
  • ARQ and S-ARQ results are statistically
    indistinguishable from H-b-H results
  • NACKs were very rare when any form of ARQ was
    used

25
Results: Performance under High Error Rates
  • No ARQ doesn't work for the 30% error rate case
  • Diffusion control messages could not establish
    routes most of the time
  • In the 20% case, it took several minutes to
    establish routes

26
RMST Conclusion
  • ARQ helps with unicast control and data packets
  • In high error-rate environments, routes cannot be
    established without ARQ
  • Route discovery packets shouldn't use ARQ
  • Erroneous path selection can occur
  • RMST combines a NACK-based transport layer
    protocol with S-ARQ to achieve the best results

27
Congestion Control
  • Sensor networks are usually idle
  • Until an event occurs
  • High probability of channel overload
  • Information must reach users
  • Solution: congestion control

28
ESRT Overview
  • Places interest on events, not individual pieces
    of data
  • Application-driven
  • Application defines what its desired event
    reporting rate should be
  • Includes a congestion-control element
  • Runs mainly on the sink
  • Main goal: adjust the reporting rate of sources to
    meet the desired reliability requirements

29
Problem Definition
  • Assumption
  • Detection of an event is related to number of
    packets received during a specific interval
  • Observed event reliability ri: # of packets
    received in decision interval i
  • Desired event reliability R: # of packets
    required for reliable event detection
  • Application-specific
  • Goal: configure the reporting rate of nodes
  • Achieve required event detection
  • Minimize energy consumption

30
Reliability vs Reporting frequency
  • Initially, reliability increases linearly with
    reporting frequency
  • There is an optimal reporting frequency (fmax),
    after which congestion occurs
  • fmax decreases as the number of nodes increases

31
Characteristic Regions
  • η: normalized reliability indicator (η = r/R)
  • (NC, LR) No congestion, Low reliability
  • f < fmax, η < 1 − ε
  • (NC, HR) No congestion, High reliability
  • f < fmax, η > 1 + ε
  • (C, HR) Congestion, High reliability
  • f > fmax, η > 1
  • (C, LR) Congestion, Low reliability
  • f > fmax, η ≤ 1
  • OOR: Optimal Operating Region
  • f < fmax, 1 − ε ≤ η ≤ 1 + ε
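The region boundaries translate directly into a classifier the sink could run each decision interval; this sketch treats η as the observed-to-desired reliability ratio and uses an assumed tolerance ε (names and values are illustrative):

```python
def esrt_region(f: float, f_max: float, eta: float, eps: float = 0.05) -> str:
    """Classify the network state from reporting frequency f and
    normalized reliability eta = r / R. eps is an assumed tolerance."""
    if f > f_max:                       # congestion
        return "C,HR" if eta > 1 else "C,LR"
    if eta < 1 - eps:
        return "NC,LR"
    if eta > 1 + eps:
        return "NC,HR"
    return "OOR"                        # optimal operating region

print(esrt_region(5.0, 10.0, 0.5))   # NC,LR: raise the reporting rate
print(esrt_region(12.0, 10.0, 1.3))  # C,HR: back off to save energy
print(esrt_region(8.0, 10.0, 1.0))   # OOR
```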

32
Characteristic Regions
33
ESRT Requirements
  • Sink is powerful enough to reach all source nodes
    (i.e. single-hop)
  • Nodes must listen to the sink broadcast at the
    end of each decision interval and update their
    reporting rates
  • A congestion-detection mechanism is required

34
Congestion Detection and Reliability Level
  • Both done at the sink
  • Congestion
  • Nodes monitor their buffer queues and inform the
    sink if overflow occurs
  • Reliability Level
  • Calculated by the sink at the end of each
    interval based on packets received

35
ESRT Protocol Operation
  • (NC, LR)
  • (NC, HR)
  • (C, HR)
  • (C, LR)

36
ESRT Conclusion
  • Reliability notion is application-based
  • No delivery guarantees for individual packets
  • Reliability and congestion control achieved by
    changing the reporting rate of nodes
  • Pushes all complexity to the sink
  • Single-hop operation only

37
Code Distribution Introduction
  • Nature of sensor networks
  • Expected to operate for long periods of time
  • Human intervention impractical or detrimental to
    sensing process
  • Nevertheless, code needs to be updated
  • Add new functionality
  • Incomplete knowledge of environment
  • Predicting right set of actions is not always
    feasible
  • Fix bugs
  • Maintenance

38
Approaches
  • Transfer the entire binary to the motes
  • Advantage
  • Maximum flexibility
  • Disadvantage
  • High energy cost due to large volume of data
  • Use a VM and transfer capsules
  • Advantage
  • Low energy cost
  • Disadvantages
  • Not as flexible as full binary update
  • VM required
  • Reliability is required regardless of approach

39
Papers
  • A Remote Code Update Mechanism for Wireless
    Sensor Networks
  • Trickle A Self-Regulating Algorithm for Code
    Propagation and Maintenance in Wireless Sensor
    Networks

40
MOAP Overview
  • Code distribution mechanism specifically targeted
    for Mica2 motes
  • Full binary updates
  • Multi-hop operation achieved through recursive
    single-hop broadcasts
  • Energy and memory efficient

41
Requirements and Properties of Code Distribution
  • The complete image must reach all nodes
  • Reliability mechanism required
  • If the image doesn't fit in a single packet, it
    must be placed in stable storage until the
    transfer is complete
  • Network lifetime shouldn't be significantly
    reduced by the update operation
  • Memory and storage requirements should be moderate

42
Resource Prioritization
  • Energy: most important resource
  • Radio operations are expensive
  • TX: 12 mA
  • RX: 4 mA
  • Stable storage (EEPROM)
  • Everything must be stored and Write()s are
    expensive
  • Memory usage
  • Static RAM
  • Only 4K available on current generation of motes
  • Code update mechanism should leave ample space
    for the real application
  • Program memory
  • MOAP must transfer itself
  • Large image size means more packets transmitted!
  • Latency
  • Updates don't respond to real-time phenomena
  • Update rate is infrequent
  • Can be traded off for reduced energy usage

43
Design Choices
  • Dissemination protocol: how is data propagated?
  • All at once (flooding)
  • Fast
  • Low energy efficiency
  • Neighborhood-by-neighborhood (ripple)
  • Energy efficient
  • Slow
  • Reliability mechanism
  • Repair scope: local vs. global
  • ACKs vs. NACKs
  • Segment management
  • Indexing segments and gap detection: memory
    hierarchy vs. sliding window

44
Ripple Dissemination
  • Transfer data neighborhood-by-neighborhood
  • Single-hop
  • Recursively extended to multi-hop
  • Very few sources at each neighborhood
  • Preferably, only one
  • Receivers attempt to become sources when they
    have the entire image
  • Publish-subscribe interface prevents nodes from
    becoming sources if another source is present
  • Leverage the broadcast medium
  • If data transmission is in progress, a source
    will always be one hop away!
  • Allows local repairs
  • Increased latency

45
Reliability Mechanism
  • Loss responsibility lies with the receiver
  • Only one node to keep track of (sender)
  • NACK-based
  • In line with IP multicast and WSN reliability
    schemes
  • Local scope
  • No need to route NACKs
  • Energy and complexity savings
  • All nodes will eventually have the same image

46
Retransmission Policies
  • Broadcast RREQ, no suppression
  • Simple
  • High probability of successful reception
  • Highly inefficient
  • Zero latency
  • Broadcast RREQ, suppression based on randomized
    timers
  • Quite efficient
  • Complex
  • Latency and successful reception based on
    randomization interval

47
Retransmission Policies (contd)
  • Broadcast RREQ, fixed reply probability
  • Simple
  • Good probability of successful reception
  • Latency depends on probability of reply
  • Average efficiency
  • Broadcast RREQ, adaptive reply probability
  • More complex than the static case
  • Similar latency/reception behavior
  • Unicast RREQ, single reply
  • Smallest probability of successful reception
  • Highest efficiency
  • Simple
  • Complexity increases if source fails
  • Zero latency
  • High latency if source fails

48
Segment Management: Discovering if a segment is
present
  • No indexing
  • Nothing kept in RAM
  • Need to read from EEPROM to find if segment i is
    missing
  • Full indexing
  • Entire segment (bit)map is kept in RAM
  • Look at entry i (in RAM) to find if segment is
    missing
  • Partial indexing
  • Map kept in RAM
  • Each entry represents k consecutive segments
  • Combination of RAM and EEPROM lookup needed to
    find if segment i is missing

49
Segment Management (contd)
  • Hierarchical full indexing
  • First-level map kept in RAM
  • Each entry points to a second-level map stored in
    EEPROM
  • Combination of RAM and EEPROM lookup needed to
    find if segment i is missing
  • Sliding window
  • Bitmap of up to w segments kept in RAM
  • Starting point: last segment received in order
  • RAM lookup
  • Limited out-of-order tolerance!
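The sliding-window scheme can be sketched as a small bitmap anchored at the last in-order segment; anything beyond the window cannot be recorded, which is the limited out-of-order tolerance noted above (class shape is an assumption, not MOAP's actual code):

```python
class SlidingWindow:
    """Sketch of MOAP-style sliding-window segment tracking: a w-bit map
    anchored at the last segment received in order."""

    def __init__(self, w: int):
        self.w = w
        self.base = 0                # next expected in-order segment
        self.bits = [False] * w      # bits[i] tracks segment base + i

    def receive(self, seq: int) -> bool:
        """Record a segment; False means it fell outside the window."""
        if seq < self.base:
            return True              # duplicate, already delivered
        if seq >= self.base + self.w:
            return False             # too far out of order to track
        self.bits[seq - self.base] = True
        while self.bits[0]:          # slide over the in-order prefix
            self.bits.pop(0)
            self.bits.append(False)
            self.base += 1
        return True

win = SlidingWindow(4)
for s in (0, 2, 1):
    win.receive(s)
print(win.base)         # 3: segments 0-2 are now in order
print(win.receive(7))   # False: beyond the 4-segment window
```

A full-indexing bitmap would simply keep one bit per segment with no window limit, trading RAM for out-of-order tolerance.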

50
Retransmission Policies Comparison
51
Segment Management Comparison
52
Results: Energy efficiency
  • Significant reduction in traffic when using
    Ripple
  • Up to 90% for dense networks
  • Full Indexing performs 5-15% better than Sliding
    Window
  • Reason: better out-of-order tolerance
  • Differences diminish as network density grows

53
Results: Latency
  • Flooding is 5 times faster than Ripple
  • Full indexing is 20-30% faster than Sliding
    window
  • Again, the reason is out-of-order tolerance

54
Results: Retransmission Policies
  • Order-of-magnitude reduction when using unicasts

55
Current Mote implementation
  • Using Ripple-sliding window with unicast
    retransmission policy
  • User builds code on the PC
  • Packetizer creates segments out of binary
  • Mote attached to PC becomes original source and
    sends PUBLISH message
  • Receivers 1 hop away will subscribe, if version
    number is greater than their own
  • When a receiver gets the full image, it will send
    a PUBLISH message
  • If it doesn't receive any subscriptions for some
    time, it will COMMIT the new code and invoke the
    bootloader
  • If a subscription is received, node becomes a
    source
  • Eventually, sources will also commit

56
Current Mote Implementation (contd)
  • Retransmissions have higher priority than data
    packets
  • Duplicate requests are suppressed
  • Nodes keep track of their source's activity with
    a keepalive timer
  • Solves the NACK last-packet problem
  • If the source dies, the keepalive expiration will
    trigger a broadcast repair request
  • Late joiner mechanism allows motes that have just
    recovered from failure to participate in code
    transfer
  • Requires all nodes to periodically advertise
    their version
  • Footprint
  • 700 bytes RAM
  • 4.5K bytes ROM

57
MOAP Conclusion
  • Full binary updates over multiple hops
  • Ripple dissemination reduces energy consumption
    significantly
  • Sliding window method and unicast retransmission
    policy also reduce energy consumption and
    complexity
  • Successful updates of images up to 30K in size
  • Next steps
  • Larger experiments
  • Better Late Joiner mechanism
  • Verification phase
  • Sending diffs instead of the full image

58
Trickle Overview
  • State synchronization/code propagation mechanism
  • Suitable for VM environments, where transmitted
    code is small
  • Uses polite gossip dissemination
  • Periodic broadcasts of state summary
  • Nodes overhear transmissions and stay quiet
    unless they need to update
  • Goals
  • Propagation: install new code
  • Maintenance: detect the need for propagation

59
Basic Mechanism
  • A node will periodically transmit information
  • Only if fewer than k neighbors have already sent
    the same data
  • Cells (neighborhoods) can be in two states
  • All nodes up to date
  • Update needed
  • Node learns about new code
  • Node detects neighbor with old code
  • Since communication can be transmission or
    reception, ideally only one node per cell needs
    to transmit
  • Similar to MOAP's ideal single-source scenario

60
Maintenance
  • Time is split into periods
  • Nodes pick a random slot from 0..T
  • Transmission occurs if a node has heard less than
    K other identical transmissions
  • Otherwise, node stays quiet
  • K is small (1 or 2 usually)
  • If a node detects a neighbor that is out of date,
    it transmits the newer code
  • If a node detects it is out of date, it transmits
    its state
  • Update is triggered when other nodes receive this
    transmission
  • Nodes transmit at most once per period
  • In the presence of losses, the scaling property is
    O(log n)

61
Maintenance and timesync
  • When nodes are synchronized, everything works
    fine
  • If nodes are out of sync, some might transmit
    before others had a chance to listen
  • Short-listen problem
  • O(sqrt(n)) scaling
  • Solution: enforce a listen-only period
  • Pick a slot from T/2..T for transmission

62
Maintenance and timesync (contd)
63
Propagation
  • Large T
  • Low communication overhead (less probable to pick
    same slot)
  • High latency
  • Small T reversed
  • Solution: dynamic scaling of T
  • Use two bounds, TL and TH
  • When T expires, it doubles until it reaches TH
  • When newer state is overheard, T = TL
  • When older state is overheard, immediately send
    updates
  • When new code is installed, T = TL
  • Helps spread new code quickly
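Putting the maintenance and propagation rules together, a node's timer can be sketched as a small state machine: count identical summaries, transmit in a random slot in [T/2, T) only if fewer than k were heard, double T each quiet round up to TH, and reset to TL on any inconsistency (class and parameter names are assumptions, not Trickle's reference code):

```python
import random

class TrickleTimer:
    """Sketch of Trickle's timer: suppression counter, listen-only first
    half-interval, and dynamic interval scaling between t_low and t_high."""

    def __init__(self, t_low: float = 1.0, t_high: float = 60.0, k: int = 1):
        self.t_low, self.t_high, self.k = t_low, t_high, k
        self.t = t_low
        self._new_round()

    def _new_round(self):
        self.counter = 0
        # picking the slot from [T/2, T) avoids the short-listen problem
        self.slot = random.uniform(self.t / 2, self.t)

    def hear_consistent(self):
        self.counter += 1            # a neighbor sent the same summary

    def hear_inconsistent(self):
        self.t = self.t_low          # newer/older state seen: react fast
        self._new_round()

    def round_expired(self) -> bool:
        """End the round; True if this node should transmit its summary."""
        should_tx = self.counter < self.k
        self.t = min(2 * self.t, self.t_high)   # back off when all is quiet
        self._new_round()
        return should_tx

node = TrickleTimer(k=1)
node.hear_consistent()
print(node.round_expired())  # False: transmission suppressed (polite gossip)
print(node.round_expired())  # True: nothing overheard this round
print(node.t)                # 4.0: interval doubled twice from 1.0
```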

64
Propagation Summary
65
Trickle Conclusion
  • Efficient state synchronization protocol
  • Fast
  • Limits transmissions (localized flood)
  • Scales well with network size
  • Does not propagate code per se
  • Instead, it notifies the system that update is
    needed
  • In many cases, determining when to propagate can
    be more expensive than propagation
  • Contrast with MOAP's simple late-joiner algorithm

66
The End!