TCP Part II


1
TCP (Part II)
  • Shivkumar Kalyanaraman
  • Rensselaer Polytechnic Institute
  • shivkuma_at_ecse.rpi.edu
  • http://www.ecse.rpi.edu/Homepages/shivkuma

2
Overview
  • TCP interactive data flow
  • TCP bulk data flow
  • TCP congestion control
  • TCP timers
  • TCP futures and performance
  • Ref: Chap 19-24; RFC 793, 1323, 2001; papers by
    Jacobson, Karn/Partridge

3
Reliability models
  • Reliability fundamentally requires redundancy to
    recover from uncertain loss or other failure
    modes.
  • Two types of redundancy:
  • Spatial redundancy: independent backup copies
  • Forward error correction (FEC) codes
  • Problem: requires huge overhead, since the FEC
    is also part of the packet(s); it cannot recover
    from erasure of all packets
  • Temporal redundancy: retransmit if packets are
    lost or in error
  • Requires trading off response time for
    reliability
  • Design of status reports and retransmission
    optimization (see next slide) is important

4
Temporal Redundancy model
5
Status report design
  • Cumulative acks
  • Robust to losses on the reverse channel
  • Can work with go-back-N retransmission
  • Cannot pinpoint blocks of data which are lost
  • The first lost packet can be pinpointed because
    the receiver would generate duplicate acks
  • Selective acks
  • For a byte-stream model like TCP, need to specify
    ranges of bytes received (requires large
    overhead)
  • SACK is a TCP option over-and-above the
    cumulative acks
  • Bitmaps are not efficient because a bit is needed
    for every byte
  • NAKs have the same problems as SACKs and bitmaps,
    and are also not robust to reverse-channel losses

6
Retransmission optimization
  • Default retransmission
  • Go-back-N, i.e., retransmit the entire window.
  • Triggered by timeout or persistent loss in TCP
  • Not efficient if windows are large (high-speed
    networks)
  • Selective retransmission
  • Retransmit one packet based upon duplicate acks
  • Recovers quickly from isolated loss, but not
    from burst loss
  • SACK allows pinpointing retransmissions to just
    cover ranges of lost packets
  • Such retransmitted packets must finally be
    confirmed by acks since SACK is only an option
    and not reliable

7
TCP Interactive Data Flow
  • Problems
  • Overhead: 40 bytes header + 1 byte data
  • To batch or not to batch: response time is important
  • Batching acks
  • Delayed-ack timer: piggyback ack on echo
  • 200 ms timer (fig 19.3)
  • Batching data
  • Nagle's algorithm: don't send a small packet until
    the ack for outstanding data is received.
  • Developed because of congestion in WANs
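
A minimal sketch of the batching rule above, assuming a toy sender in which one ack covers everything in flight. The class name `NagleSender`, its fields and the `MSS` value are illustrative assumptions, not taken from the slides or any real stack.

```python
from dataclasses import dataclass, field

MSS = 536  # assumed maximum segment size in bytes (illustrative)


@dataclass
class NagleSender:
    """Toy sender that holds back small segments while data is unacked."""
    unacked: int = 0                                  # bytes currently in flight
    pending: bytearray = field(default_factory=bytearray)
    sent: list = field(default_factory=list)          # segments handed to the network

    def write(self, data: bytes) -> None:
        self.pending += data
        self._try_send()

    def on_ack(self) -> None:
        self.unacked = 0      # simplification: the ack covers all data in flight
        self._try_send()

    def _try_send(self) -> None:
        # Full-sized segments are always allowed out.
        while len(self.pending) >= MSS:
            self._emit(MSS)
        # A small segment goes out only when nothing is in flight.
        if self.pending and self.unacked == 0:
            self._emit(len(self.pending))

    def _emit(self, n: int) -> None:
        self.sent.append(bytes(self.pending[:n]))
        del self.pending[:n]
        self.unacked += n


s = NagleSender()
for ch in b"hello":          # five one-byte keystrokes
    s.write(bytes([ch]))
s.on_ack()                   # ack for the first byte releases the batch
print(s.sent)
```

The first keystroke is sent immediately; the remaining four are batched into one segment when the ack arrives.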

8
TCP Bulk Data Flow
  • Sliding window
  • Send multiple packets while waiting for acks (fig
    20.1), up to a limit (W)
  • Receiver need not ack every packet
  • Acks are cumulative.
  • Ack = largest consecutive sequence number
    received + 1
  • Two transfers of the data can have different
    dynamics (e.g., fig 20.1 vs fig 20.2)
  • Receiver window field
  • Reduced if TCP receiver short on buffers

9
TCP Bulk Data Flow (Contd)
  • End-to-end flow control
  • Window update acks: receiver ready
  • Default buffer sizes: 4096 to 16384 bytes.
  • Ideal window and receiver buffer =
    bandwidth-delay product
  • TCP window terminology: figs 20.4, 20.5, 20.6
  • Right edge, left edge, usable window
  • closes => left edge (snd_una) advances
  • opens => right edge advances (receiver buffer
    freed => receiver window increases)
  • shrinks => right edge moves to left (rare)
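
The two quantities on this slide can be worked through numerically. This is a sketch under stated assumptions: the link rate and RTT in the example are illustrative, and `snd_nxt` (next byte to send) is a conventional variable name that does not appear on the slide.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes, using integer arithmetic."""
    return bandwidth_bps * rtt_ms // 8000


def usable_window(snd_una: int, snd_nxt: int, rcv_wnd: int) -> int:
    """Bytes the sender may still transmit.

    Left edge is snd_una, right edge is snd_una + rcv_wnd, and the
    usable window is whatever lies between snd_nxt and the right edge.
    """
    right_edge = snd_una + rcv_wnd
    return max(right_edge - snd_nxt, 0)


print(bdp_bytes(1_544_000, 60))          # assumed T1 link with 60 ms RTT
print(usable_window(1000, 3000, 4096))   # 2000 bytes in flight, 4096 offered
```

With these assumed numbers the ideal window is 11580 bytes, which already exceeds the smaller default buffer sizes listed above.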

10
The Congestion Problem
  • Problem: demand outstrips available capacity
  • Q: Will the congestion problem be solved when
  • a) Memory becomes cheap (infinite memory)?

[Figure: labels "No buffer" and "Too late"]
  • b) Links become cheap (high speed links)?

[Figure: four switches (S) in series. All links 19.2 kb/s: file transfer time 5 mins. One link replaced with 1 Mb/s: file transfer time 7 hours.]

11
  • c) Processors become cheap (fast routers and
    switches)?

[Figure: hosts A, B, C and D attached to switch S. Scenario: all links 1 Gb/s; A and B send to C.]
  • Ans: None of the above solves congestion!
  • Congestion: Demand > Capacity
  • It is a dynamic problem => static solutions are
    not sufficient
  • TCP provides a dynamic solution

12
[Figure: queueing model with rates labeled $\lambda_i$, $\lambda$ and $\mu$]
  • If information about $\lambda_i$, $\lambda$ and $\mu$ is known in a
    central location where control of $\lambda_i$ can be
    effected with zero time delays, the congestion
    problem is solved.
  • Problems
  • Incomplete information (e.g., loss indications)
  • Distributed solution required
  • Congestion and control/measurement locations
    different
  • Time-varying, heterogeneous time-delays

13
TCP Congestion Control
  • Window flow control: avoid receiver overrun
  • Dynamic window congestion control: avoid/control
    network overrun
  • Observation: not a good idea to start with a
    large window and dump packets into the network
  • Treat network like a black box and start from a
    window of 1 segment (slow start)
  • Increase window size exponentially (exponential
    increase) over successive RTTs => quickly grow
    to claim available capacity.
  • Technique: for every ack, increase cwnd (new window
    variable) by 1 segment.
  • Effective window = Min(cwnd, Wrcvr)
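
The following sketch shows why "add one segment per ack" gives exponential growth per RTT. It assumes no loss, one ack per segment, and windows counted in segments; the function name `slow_start_windows` is illustrative.

```python
def slow_start_windows(rtts: int, rcv_wnd_segments: int) -> list:
    """Effective window (in segments) sent in each successive RTT."""
    cwnd = 1
    history = []
    for _ in range(rtts):
        window = min(cwnd, rcv_wnd_segments)   # effective window
        history.append(window)
        cwnd += window     # one ack per segment sent, each ack adds 1 to cwnd
    return history


print(slow_start_windows(6, 20))
```

The window doubles each RTT until the receiver window (20 segments in this example) becomes the limit.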

14
Dynamics
[Figure: packets sent in the 1st, 2nd, 3rd and 4th RTTs]
  • Rate of acks = rate of packets at the bottleneck:
    self-clocking property.

[Figure: router with queue Q between a 100 Mbps link and a 10 Mbps link]

15
Congestion Detection
  • Packet loss as an indicator of congestion.
  • Set slow start threshold (ssthresh) to min(cwnd,
    Wrcvr)/2
  • Retransmit the packet, set cwnd to 1 (re-enter slow start)

[Figure: congestion window (cwnd) versus time (units of RTTs), showing the receiver window, ssthresh, a timeout and an idle interval, with cwnd restarting from 1]

16
Congestion avoidance
  • Increment cwnd by 1 per ack until ssthresh
  • Increment by 1/cwnd per ack afterwards
    (Congestion avoidance or linear increase)
  • Idea: ssthresh estimates the bandwidth-delay
    product for the connection.
  • Initialization: ssthresh = receiver window or
    default 65535 bytes. Larger values through options.
  • If source is idle for a long time, cwnd is reset
    to one MSS.
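
A sketch of the per-ack and per-timeout rules from the last two slides, with windows counted in segments. The class name `CongestionWindow` is an assumption, and the floor of two segments on ssthresh comes from RFC 2001 rather than from the slide.

```python
class CongestionWindow:
    """Slow start below ssthresh, linear increase above it."""

    def __init__(self, rcv_wnd: float = 64.0):
        self.rcv_wnd = rcv_wnd
        self.cwnd = 1.0
        self.ssthresh = rcv_wnd            # initialization: receiver window

    def on_ack(self) -> None:
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0               # slow start: +1 per ack
        else:
            self.cwnd += 1.0 / self.cwnd   # congestion avoidance: about +1 per RTT

    def on_timeout(self) -> None:
        # Halve the effective window; RFC 2001 keeps at least 2 segments.
        self.ssthresh = max(min(self.cwnd, self.rcv_wnd) / 2, 2.0)
        self.cwnd = 1.0                    # re-enter slow start


w = CongestionWindow(rcv_wnd=64.0)
w.cwnd = 16.0
w.on_timeout()                 # ssthresh becomes 8, cwnd becomes 1
for _ in range(8):
    w.on_ack()                 # 7 acks of slow start, then 1 of linear increase
print(w.ssthresh, w.cwnd)
```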

17
  • Implications of using packet loss as congestion
    indicator
  • Late congestion detection if the buffer sizes
    are larger
  • Higher speed links or large buffers => larger
    windows => higher probability of burst loss
  • Interactions with retransmission algorithm and
    timeouts
  • Implications of ack-clocking
  • More batching of acks => bursty traffic (harder
    to manage)
  • Less batching leads to a large fraction of
    Internet traffic being just acks (huge overhead)
  • Additive Increase/Multiplicative Decrease
    Dynamics
  • TCP approximates these dynamics
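
To illustrate the additive increase/multiplicative decrease dynamics named above, here is a toy model of two flows sharing one link. The synchronous rounds, the shared loss signal, and the parameter values are all simplifying assumptions for illustration.

```python
def aimd(x1: float, x2: float, capacity: float, rounds: int,
         increase: float = 1.0, decrease: float = 0.5):
    """Evolve two flow rates under AIMD with a shared congestion signal."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Congestion: both flows cut their rate multiplicatively.
            x1, x2 = x1 * decrease, x2 * decrease
        else:
            # No congestion: both flows probe additively.
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2


a, b = aimd(1.0, 40.0, capacity=50.0, rounds=200)
print(round(a, 2), round(b, 2))
```

Additive steps leave the gap between the flows unchanged while each multiplicative cut halves it, so the two rates drift toward an equal share.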

18
Timeout and RTT Estimation
  • Timeout: for robust detection of packet loss
  • Problem: how long should the timeout be?
  • Too long => underutilization; too short =>
    wasteful retransmissions
  • Solution: adaptive timeout based on RTT
  • RTT estimation
  • Early method: exponential averaging
  • $R \leftarrow \alpha R + (1 - \alpha) M$, where $M$ = measured RTT
  • $RTO = \beta R$, where $\beta$ = delay variance factor
  • Suggested values: $\alpha = 0.9$, $\beta = 2$
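
A short sketch of the early estimator, assuming the first sample seeds the average (the slide does not say how it is initialized). The function name is illustrative.

```python
def rto_exponential_average(samples, alpha: float = 0.9, beta: float = 2.0) -> float:
    """RTO from exponentially averaged RTT samples (early method)."""
    r = samples[0]                       # assumed seed: first measurement
    for m in samples[1:]:
        r = alpha * r + (1 - alpha) * m  # smooth in each new measurement
    return beta * r


print(rto_exponential_average([100.0, 100.0, 300.0]))
```

A single RTT spike from 100 to 300 moves the smoothed value only to about 120, which shows how slowly this estimator reacts to large fluctuations.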

19
RTT Estimation
  • Jacobson (1988): this method has problems with
    large RTT fluctuations
  • New method: use mean deviation of RTT
  • $A$ = smoothed average RTT
  • $D$ = smoothed mean deviation
  • $Err = M - A$, where $M$ = measured RTT
  • $A \leftarrow A + g \cdot Err$, where $g$ = gain = 0.125
  • $D \leftarrow D + h(|Err| - D)$, where $h$ = gain = 0.25
  • $RTO = A + 4D$
  • Integer arithmetic used throughout. Complex
    initialization process ...
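
The update rules above can be sketched as follows. This version uses floating point rather than the integer arithmetic the slide mentions, and the initialization (A set to the first sample, D to half of it) is an assumption chosen for the example, not the slide's procedure.

```python
class RttEstimator:
    """Mean-deviation RTT estimator in the style described above."""

    def __init__(self, first_sample: float, g: float = 0.125, h: float = 0.25):
        self.g, self.h = g, h
        self.a = first_sample        # smoothed average RTT (assumed seed)
        self.d = first_sample / 2    # smoothed mean deviation (assumed seed)

    def update(self, m: float) -> float:
        err = m - self.a
        self.a += self.g * err
        self.d += self.h * (abs(err) - self.d)
        return self.rto()

    def rto(self) -> float:
        return self.a + 4 * self.d


est = RttEstimator(100.0)
steady = est.update(100.0)   # no error: deviation decays
spiked = est.update(200.0)   # large error: deviation term grows quickly
print(steady, spiked)
```

Because the deviation term is multiplied by 4, a single large sample raises the RTO much faster than it raises the smoothed average.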

20
Timer Backoff/Karn's Algorithm
  • Timer backoff: if timeout, RTO = 2 × RTO
    (exponential backoff)
  • Retransmission ambiguity problem
  • During retransmission, it is unclear whether an
    ack refers to a packet or its retransmission.
    Problem for RTT estimation
  • Karn/Partridge: don't update RTT estimators
    during retransmission.
  • Recompute RTO only after an ack is received for a
    segment that was not retransmitted
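
A sketch of how backoff and Karn's rule interact. The class name, the 64-second cap, and the `compute_rto` callback (standing in for whichever RTT estimator is in use) are assumptions for illustration.

```python
from typing import Callable


class RetransmitTimer:
    """Exponential backoff plus Karn's rule for ambiguous RTT samples."""

    def __init__(self, rto: float = 1.5, max_rto: float = 64.0):
        self.rto = rto
        self.max_rto = max_rto       # assumed upper bound on the backoff

    def on_timeout(self) -> float:
        self.rto = min(2 * self.rto, self.max_rto)
        return self.rto

    def on_ack(self, sample: float, retransmitted: bool,
               compute_rto: Callable[[float], float]) -> float:
        if retransmitted:
            # Ambiguous sample: keep the backed-off RTO, skip the estimator.
            return self.rto
        self.rto = compute_rto(sample)
        return self.rto


timer = RetransmitTimer(rto=1.5)
timer.on_timeout()
timer.on_timeout()                                           # 1.5 -> 3.0 -> 6.0
after_ambiguous = timer.on_ack(0.2, True, lambda m: 2 * m)   # ignored
after_clean = timer.on_ack(0.2, False, lambda m: 2 * m)      # recomputed
print(after_ambiguous, after_clean)
```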

21
Fast Retransmit and Recovery
  • Goals
  • Timeout avoidance: the 500 ms timer granularity
    can have an adverse performance impact, especially
    for high-speed networks
  • Selective retransmission: especially when packets
    are dropped due to error or light congestion
  • Fast Recovery: converge quickly to a state of
    congestion avoidance (linear increase) with
    half the current window -- the assumed ideal window
    size.
  • Observation: receivers are required to send an
    immediate duplicate acknowledgment when they
    receive out-of-order data segments.

22
Fast Retransmit and Recovery
[Figure: sequence diagram with segments 0 and 500, repeated duplicate acks (Ack 500) and the point where FRR is triggered]
  • 3 duplicate acks => assume loss
  • More duplicate acks => other packets have reached
    destination safely.
  • Wait for about 1/2 RTT, and resume transmitting
    new segments for every subsequent duplicate ack
    received. Stop this process once the ack for the
    missing segment is received

23
Fast Retransmit and Recovery
  • Fast Retransmit: on receiving the third duplicate ack
  • Set ssthresh to 1/2 of current cwnd
  • Retransmit the missing segment
  • Set cwnd to ssthresh + 3
  • Fast Recovery: for each duplicate ack thereafter
  • Increment cwnd by 1 MSS
  • New packets are transmitted once cwnd grows large
    enough.
  • If old cwnd was a pipe of length 1 RTT, the
    network gets a relief period of 1/2 RTT

24
FRR (contd)
  • Upon receiving the next (non-duplicate) Ack
  • Set cwnd to ssthresh; enter linear growth phase

[Figure: CWND versus time, dropping to CWND/2, with the interval in which new packets are sent during fast recovery marked]
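
The fast retransmit and recovery steps from the preceding slides can be sketched as a small state machine, with windows counted in segments. The class name and the `retransmits` counter are illustrative, and the floor of two segments on ssthresh follows RFC 2001 rather than the slides.

```python
class FastRecovery:
    """Window bookkeeping for fast retransmit and fast recovery."""

    def __init__(self, cwnd: float):
        self.cwnd = cwnd
        self.ssthresh = cwnd
        self.dupacks = 0
        self.in_recovery = False
        self.retransmits = 0         # counts fast retransmissions triggered

    def on_dup_ack(self) -> None:
        self.dupacks += 1
        if self.dupacks == 3 and not self.in_recovery:
            self.ssthresh = max(self.cwnd / 2, 2.0)
            self.retransmits += 1            # resend the missing segment
            self.cwnd = self.ssthresh + 3    # 3 segments have left the network
            self.in_recovery = True
        elif self.in_recovery:
            self.cwnd += 1                   # each extra dup ack inflates cwnd

    def on_new_ack(self) -> None:
        if self.in_recovery:
            self.cwnd = self.ssthresh        # deflate, continue linear growth
            self.in_recovery = False
        self.dupacks = 0


f = FastRecovery(cwnd=16.0)
for _ in range(5):
    f.on_dup_ack()
inflated = f.cwnd        # ssthresh + 3, plus 2 for the extra duplicate acks
f.on_new_ack()
print(inflated, f.cwnd, f.ssthresh)
```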
25
FRR problems
  • Burst loss of 3 pkts => timeout; window shut down
    to cwnd/8!

[Figure: CWND versus time, starting at W and falling to CWND/2 at the 1st fast retransmit, CWND/4 at the 2nd fast retransmit, and CWND/8 at the timeout]
26
TCP Performance Optimization
  • SACK: selective acknowledgments specify blocks
    of packets received at the destination.
  • Random early drop (RED): scheme spreads the
    dropping of packets more uniformly and reduces
    average queue length and packet loss rate.
  • Scheduling mechanisms: protect well-behaved flows
    from rogue flows.
  • Explicit Congestion Notification (ECN): routers
    use an explicit bit-indication for congestion
    instead of loss indications.

27
Congestion control summary
  • Sliding window: limited by receiver window.
  • Dynamic windows: slow start (exponential rise),
    congestion avoidance (linear rise),
    multiplicative decrease.
  • Adaptive timeout: needs mean RTT deviation
  • Timer backoff and Karn's algorithm during
    retransmission
  • Go-back-N or selective retransmission
  • Cumulative and selective acknowledgements
  • Timeout avoidance: FRR
  • Drop policies, scheduling and ECN

28
TCP Persist Timer
  • Receiver flow control can set window to zero
  • Receiver later sends window update acks
  • But TCP does not transmit acks reliably => update
    acks may be lost and source may be stuck at a
    zero window value
  • TCP uses persist timer to query the receiver
    periodically to find if the window has been
    increased.
  • Persist timer always bounded between 5s and 60s.
    It does exponential backoff like other timers too.
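
A sketch of a bounded, exponentially backed-off persist schedule. The 5 s and 60 s bounds come from the slide; the 1.5 s starting value is an assumption chosen for the example.

```python
def persist_intervals(n: int, base: float = 1.5,
                      low: float = 5.0, high: float = 60.0) -> list:
    """First n persist-timer intervals in seconds, clamped to [low, high]."""
    return [min(max(base * 2 ** i, low), high) for i in range(n)]


print(persist_intervals(7))
```

The early doublings are hidden by the 5 s lower bound, and once the backoff passes 60 s every later probe is sent at the 60 s ceiling.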

29
Silly Window Syndrome
  • A) The system operates at a small window (sends
    segments which are not MSS-sized) even if the
    receiver grants a large window.
  • B) Receiver advertises small windows.
  • Solution: batching
  • Receiver must not advertise small windows
  • Sender waits until segment is full before sending
    (extension of Nagle's algorithm)
  • It can transmit everything if it is not waiting
    for any ACK (or if Nagle's algorithm has been
    disabled)
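
One way to sketch the receiver-side rule is shown below. The threshold used (one full segment or half the buffer, whichever is smaller) is the commonly implemented rule from RFC 1122, not something stated on the slide, and the function name is illustrative.

```python
def receiver_advertise(free_space: int, current_adv: int,
                       mss: int, buf_size: int) -> int:
    """Window to advertise, avoiding small window increases.

    The advertised window grows only when it can grow by at least
    min(mss, buf_size // 2); otherwise the old value is kept.
    """
    if free_space - current_adv >= min(mss, buf_size // 2):
        return free_space
    return current_adv


print(receiver_advertise(100, 0, 536, 4096))   # too small: keep advertising 0
print(receiver_advertise(600, 0, 536, 4096))   # a full segment fits: open up
```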

30
TCP Keepalive timer
  • Optional timer.
  • Not part of TCP spec, but found in most
    implementations.
  • Not necessary, because connection defined by
    endpoints.
  • Connection can be up as long as
    source/destination are up.
  • Typical use: to detect idle clients or half-open
    connections and de-allocate server resources tied
    up by them. E.g., telnet, ftp.

31
Gigabit Networks
  • Higher Bandwidth Networks
  • Propagation latency unchanged.
  • Increasing bandwidth from 1.5 Mb/s to 45 Mb/s
    (factor of 29) decreases file transfer time of
    1 MB by a factor of 25.
  • But increasing from 1 Gb/s to 2 Gb/s gives an
    improvement of only 10%!
  • Transfer time = propagation time + transmission
    time + queueing/processing.
  • Design networks to minimize delay (queueing,
    processing, reduce retransmission latency)
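
The figures on this slide can be reproduced approximately with a short calculation. The 30 ms one-way latency, the 1,000,000-byte file and the 1.544 Mb/s T1 rate are assumptions chosen for the example; queueing and processing are ignored.

```python
def transfer_time(file_bytes: int, bandwidth_bps: int,
                  latency_s: float = 0.030) -> float:
    """Propagation latency plus transmission time, in seconds."""
    return latency_s + file_bytes * 8 / bandwidth_bps


FILE = 1_000_000                       # assumed file size in bytes
t1 = transfer_time(FILE, 1_544_000)    # T1
t3 = transfer_time(FILE, 45_000_000)   # T3
g1 = transfer_time(FILE, 1_000_000_000)
g2 = transfer_time(FILE, 2_000_000_000)

speedup_t1_t3 = t1 / t3
gain_1g_2g = (g1 - g2) / g1
print(f"{speedup_t1_t3:.1f}x, {gain_1g_2g:.1%}")
```

Under these assumptions the first upgrade gives roughly a 25x speedup, while doubling gigabit bandwidth saves only about 10% because the fixed latency dominates.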

32
Window Scaling Option
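  • Long Fat Pipe Networks (LFN): satellite links
  • Need very large window sizes.
  • Normally, max window = $2^{16}$ = 64 KBytes
  • Window scale: Window = $W \times 2^{Scale}$

[Option format: Kind = 3 | Length = 3 | Scale]
  • Max window = $2^{16} \times 2^{14}$ = 1 GB (Scale is a
    one-byte field, but RFC 1323 limits it to 14)
  • Option sent only in SYN and SYN-Ack segments.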
  • RFC 1323
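
A sketch of the window-scale arithmetic and the three-byte option layout. The function names are illustrative; the cap of 14 on the shift count is the RFC 1323 limit.

```python
MAX_SHIFT = 14   # RFC 1323 upper limit on the scale factor


def scaled_window(window_field: int, shift: int) -> int:
    """Actual window in bytes for a 16-bit window field and a scale shift."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    return window_field << min(shift, MAX_SHIFT)


def encode_window_scale(shift: int) -> bytes:
    """Window scale option bytes: kind=3, length=3, shift count."""
    return bytes([3, 3, shift])


print(scaled_window(0xFFFF, 14))        # just under 2**30 bytes
print(encode_window_scale(7).hex())
```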

33
Timestamp option
  • For LFNs, need accurate and more frequent RTT
    estimates.
  • Timestamp option
  • Place a timestamp value in any segment.
  • Receiver echoes timestamp value in ack
  • If acks are delayed, the timestamp value returned
    corresponds to the earliest segment being acked.
  • Segments lost/retransmitted => RTT overestimated

34
PAWS: Protection Against Wrapped Sequence Numbers
  • Largest receiver window = $2^{30}$ = 1 GB
  • Lost segment may reappear before MSL, and the
    sequence numbers may have wrapped around
  • The receiver considers the timestamp as an
    extension of the sequence number => discard
    out-of-sequence segment based on both seq and
    timestamp.
  • Requirement: timestamp values need to be monotonically
    increasing, and need to increase by at least one
    per window
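
The timestamp check behind PAWS can be sketched as a 32-bit modular comparison. The names `seg_tsval` and `ts_recent` follow the usual RFC 1323 terminology, and the function itself is an illustration that omits the sequence-number checks a full receiver also performs.

```python
def paws_accept(seg_tsval: int, ts_recent: int) -> bool:
    """True if the segment's timestamp is not older than the last one seen.

    Timestamps are 32-bit values that wrap, so "older" is decided by
    the sign of the difference modulo 2**32.
    """
    return ((seg_tsval - ts_recent) & 0xFFFFFFFF) < 0x80000000


print(paws_accept(101, 100))         # newer timestamp: accept
print(paws_accept(99, 100))          # old duplicate: discard
print(paws_accept(5, 0xFFFFFFF0))    # timestamp clock wrapped: still newer
```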

35
Summary
  • Interactive and bulk TCP flow
  • TCP congestion control
  • Informal exercises: perform some of the
    experiments described in chaps 19-21 to see
    various facets of TCP in action