Title: Module 15 TCP Flow Control and TCP Congestion Control
- Textbook sections
  - BF Section 12.6 Flow Control
  - LG Section 7.8.1 Open-loop Control
  - LG Section 7.8.2 Closed-loop Control
- Topics
  - TCP Flow Control
    - Overview
    - The small-packet problem
  - TCP Congestion Control
    - Open-loop Control
    - Closed-loop Control
  - Congestion Avoidance
1. TCP Flow Control - Overview
- TCP provides a means for the receiver to govern the amount of data sent by the sender.
- This is achieved by returning a window with every ACK, indicating a range of acceptable sequence numbers beyond the last segment successfully received. The window indicates the number of bytes that the sender may transmit before receiving further permission.
1. TCP Flow Control - The small-packet problem
- The small-packet problem
  - There is a special problem associated with small packets. For example, when TCP is used for the transmission of single-character messages originating at a keyboard, the typical result is that 41-byte packets (one byte of data, 40 bytes of header) are transmitted for each byte of useful data.
  - This 4000% overhead is annoying but tolerable on lightly loaded networks.
  - On heavily loaded networks, however, the congestion resulting from this overhead can result in lost datagrams and retransmissions, as well as excessive propagation time caused by congestion in switching nodes and gateways. In practice, throughput may drop so low that TCP connections are aborted.
1. TCP Flow Control - The small-packet problem
- The solution to the small-packet problem
  - Impose additional restrictions on the sender.
  - The solution is to inhibit the sending of new TCP segments when new outgoing data arrives from the user if any previously transmitted data on the connection remains unacknowledged. This inhibition is to be unconditional: no timers, tests for size of data received, or other conditions are required. Implementation typically requires one or two lines inside a TCP program.
1. TCP Flow Control - The small-packet problem
- Silly window syndrome/remedies
  - Syndrome created by the sender
    - Situation: the application program on the sender side is too slow.
    - Problem: the sender may create many small segments.
    - Remedy: Nagle's algorithm
      - The sending TCP sends the first piece of data it receives from the sending application program, even if it is only one byte.
      - After sending the first segment, the sending TCP accumulates data in the output buffer and waits until either the receiving TCP sends an acknowledgement or enough data has accumulated to fill a maximum-size segment. At this time, the sending TCP can send the segment.
      - The above step is repeated for the rest of the transmission.
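The sender-side rule described for Nagle's algorithm can be sketched as a small simulation. This is a minimal illustration, not a real TCP implementation: the function name `nagle_segments`, the event format, and the simplification that a single ACK acknowledges everything outstanding are all assumptions made for the example.

```python
def nagle_segments(events, mss):
    """Return the sizes of the segments sent for a sequence of events.

    events: list of ("write", nbytes) or ("ack",) tuples in time order
            (a hypothetical event format used only for this sketch).
    mss:    maximum segment size in bytes.
    """
    sent = []         # sizes of the segments handed to the network
    buffered = 0      # bytes waiting in the output buffer
    unacked = False   # True while previously sent data awaits an ACK

    def send(size):
        nonlocal buffered, unacked
        sent.append(size)
        buffered -= size
        unacked = True

    for event in events:
        if event[0] == "write":
            buffered += event[1]
        else:
            # Assumption: one ACK acknowledges all outstanding data.
            unacked = False
        # Enough data for a maximum-size segment may always be sent.
        while buffered >= mss:
            send(mss)
        # A smaller segment is sent only when nothing is unacknowledged.
        if buffered > 0 and not unacked:
            send(buffered)
    return sent


# Five one-byte keystrokes, an ACK, one more keystroke, another ACK.
typing = [("write", 1)] * 5 + [("ack",), ("write", 1), ("ack",)]
print(nagle_segments(typing, mss=100))
```

In this run the first byte goes out immediately, the next four bytes are held back and leave together when the ACK arrives, and the last byte waits for the following ACK, so three segments are sent instead of six.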
1. TCP Flow Control - The small-packet problem
- The advantages of Nagle's algorithm are its simplicity and the fact that it takes into account both the speed of the application program that creates the data and the speed of the network that transports the data.
  - If the network is slower than the application program, then segments are larger.
  - If the network is faster than the application program, then segments are smaller.
  - If the network is much faster than the application program, then segments are very small (lightly loaded networks).
1. TCP Flow Control - The small-packet problem
- Syndrome created by the receiver
  - Situation: the application program on the receiver side is too slow (the receive window becomes very small).
  - Problem: the sender may create many small segments.
  - Remedy
    - Clark's solution
      - Acknowledge as soon as the data arrives, but announce a window size of zero until either there is enough space to accommodate a segment of maximum size or half of the buffer is empty.
    - Delayed acknowledgement
      - When a segment arrives, it is not acknowledged immediately. The receiver waits until there is a decent amount of space in its incoming buffer before acknowledging the arrived segments.
      - Advantage: reduced traffic.
      - Disadvantage: the delay may cause the sender to retransmit the unacknowledged segments.
LG Figure 8.19 TCP end-to-end flow control
[Figure: the transmitter's send window and the receiver's receive window, with the octet positions defined below]
- Transmitter (send window)
  - Slast: oldest unacknowledged octet
  - Srecent: highest-numbered transmitted octet
  - Slast + WA - 1: highest-numbered octet that can be transmitted
  - Slast + WS - 1: highest-numbered octet that can be accepted from the application
- Receiver (receive window)
  - Rlast: highest-numbered octet not yet read by the application
  - Rnext: next expected octet
  - Rnew: highest-numbered octet received correctly
  - Rlast + WR - 1: highest-numbered octet that can be accommodated in the receive buffer
2. TCP Congestion Control
- Open-loop control
  - Admission control
  - Policing
    - Traffic policing: when the traffic violates the agreed-upon contract, the network may choose to discard or tag the nonconforming traffic. The tagged traffic will be carried by the network but given lower priority. If there is any congestion downstream, the tagged traffic is the first to be lost.
    - Leaky bucket algorithm
  - Traffic shaping
    - Leaky bucket traffic shaper
    - Token bucket traffic shaper
- Closed-loop control
  - TCP congestion control
LG Figure 7.53 A leaky bucket
[Figure: water poured irregularly into a leaky bucket drains at a constant rate]
Note
- The bucket depth is used to absorb the irregularities in the flow: a deep bucket for bursty flow, a shallow bucket for smooth flow.
- Once the bucket is full, any additional water entering it spills over the sides and is lost.
Leaky bucket algorithm
- A key part of bandwidth allocation is the mechanism used to specify the needed bandwidth and limit users to their allocations.
- A counter associated with each user transmitting on a connection is incremented whenever the user sends a packet and is decremented periodically. If the counter exceeds a threshold upon being incremented, the network discards the packet.
- The user specifies the rate at which the counter is decremented (this determines the average bandwidth) and the value of the threshold (a measure of burstiness).
LG Figure 7.54 Leaky bucket algorithm used for policing
[Flowchart, reconstructed as steps]
1. A packet arrives at time ta.
2. Compute X' = X - (ta - LCT).
3. If X' < 0, set X' = 0.
4. If X' > L, the packet is nonconforming.
5. Otherwise, set X = X' + I and LCT = ta; the packet is conforming.
Legend
- X = value of the leaky bucket counter
- X' = auxiliary variable
- LCT = last conformance time
2. TCP Congestion Control - Open-loop Control
Assumptions
1. Packets are assumed to be of fixed length (I).
2. The leaky bucket drains at a continuous rate of 1 unit per packet time.
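Under these assumptions, the policing flowchart of LG Figure 7.54 can be sketched in a few lines. The sketch follows one reading of that flowchart (auxiliary value X' = X - (ta - LCT), clamped at zero, compared against L). The function name `police` and the arrival times in the example are hypothetical and are not the ones used in the slides; the counter is assumed to start empty.

```python
def police(arrivals, I, L):
    """Classify packets with the leaky bucket policing algorithm.

    arrivals: non-decreasing, non-negative arrival times in packet times.
    I:        increment added to the counter per conforming packet.
    L:        limit on the counter before a packet is nonconforming.
    Returns a list of (arrival_time, conforming) pairs.
    """
    X = 0      # leaky bucket counter (assumed empty at the start)
    LCT = 0    # last conformance time
    results = []
    for ta in arrivals:
        # The bucket drains 1 unit per unit of time since the last
        # conforming packet and never goes below zero.
        X_aux = max(0, X - (ta - LCT))
        if X_aux > L:
            # Nonconforming: X and LCT are left unchanged.
            results.append((ta, False))
        else:
            X = X_aux + I
            LCT = ta
            results.append((ta, True))
    return results


# Hypothetical arrival times with I = 4 and L = 6.
for ta, ok in police([1, 2, 3, 5, 6, 7, 12, 13], I=4, L=6):
    print(ta, "conforming" if ok else "nonconforming")
```

With these values the first three back-to-back packets are accepted, the next two find the bucket above L and are flagged, and conformance resumes once enough time has passed for the bucket to drain.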
2. TCP Congestion Control - Open-loop Control
LG Figure 7.55 Behavior of leaky bucket
[Figure: packet arrivals over time, with nonconforming packets marked, and the bucket content over time on an axis marked I and L + I]
2. TCP Congestion Control - Open-loop Control
Given I = 4, L = 6, and the arrival times of packets, values in the following table were generated using the leaky bucket algorithm.
2. TCP Congestion Control - Open-loop Control
LG Figure 7.58 Possible traffic patterns at the average rate of 10 kbps
[Figure: three rate-versus-time plots over the interval 0 to 3, with rates of (a) 10 kbps, (b) 50 kbps, and (c) 100 kbps]
2. TCP Congestion Control - Open-loop Control
LG Figure 7.59 A leaky bucket traffic shaper
[Figure: incoming traffic enters a packet buffer of size N; a server releases the packets as shaped traffic]
2. TCP Congestion Control - Open-loop Control
LG Figure 7.60 Token bucket traffic shaper
[Figure: tokens arrive periodically into a token bucket of size K; incoming traffic enters a packet buffer of size N; a server releases the packets as shaped traffic]
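The token bucket shaper of LG Figure 7.60 can be illustrated with a discrete-time sketch. The figure only names the token bucket size K and the packet buffer size N, so the rest is assumed for the example: one token arrives per tick, each packet consumes one token, and packets arriving at a full buffer are dropped. The function name `token_bucket` is hypothetical.

```python
def token_bucket(arrivals, K, N, ticks):
    """Simulate a token bucket shaper in discrete time.

    arrivals: dict mapping a tick to the number of packets arriving then.
    K:        token bucket size (maximum number of stored tokens).
    N:        packet buffer size (packets beyond it are dropped).
    ticks:    number of ticks to simulate.
    Returns the number of packets sent in each tick.
    """
    tokens = 0
    queued = 0
    sent_per_tick = []
    for t in range(ticks):
        # Assumption: one token arrives per tick; the bucket holds at most K.
        tokens = min(K, tokens + 1)
        # Arriving packets join the buffer; overflow beyond N is lost.
        queued = min(N, queued + arrivals.get(t, 0))
        # Each packet sent consumes one token.
        sent = min(queued, tokens)
        queued -= sent
        tokens -= sent
        sent_per_tick.append(sent)
    return sent_per_tick


# A burst of 5 packets arrives at tick 3 after an idle period.
print(token_bucket({3: 5}, K=3, N=10, ticks=7))
```

Because tokens accumulate during the idle period, part of the burst (up to K packets) passes at once and the remainder leaves at the token rate. A leaky bucket shaper, by contrast, would release the whole burst at a constant rate.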
2. TCP Congestion Control - Closed-loop Control
- TCP congestion control
  - Early TCP implementations would start a connection with the sender injecting multiple segments into the network, up to the window size advertised by the receiver.
  - While this is acceptable when the two hosts are on the same LAN, problems can arise if there are routers and slower links between the sender and the receiver.
  - Some intermediate router must queue the packets, and it is possible for that router to run out of space.
2. TCP Congestion Control - Closed-loop Control
- Definitions
  - Receiver window (rwnd)
    - The most recently advertised receive window
    - Receive-side limit
  - Congestion window (cwnd)
    - A TCP state variable that limits the amount of data a TCP can send. At any given time, a TCP must not send data with a sequence number higher than the sum of the highest acknowledged sequence number and the minimum of the receive window and the congestion window.
    - Actual sender window size = minimum(receiver-advertised window size, congestion window size)
    - The congestion window is a sender-side limit on the amount of data the sender can transmit into the network before receiving an ACK.
  - Initial window (IW)
    - The initial window is the size of the sender's congestion window after the three-way handshake is completed.
  - Slow start threshold (congestion threshold)
    - The value of the congestion window at which the slow start phase stops and the congestion avoidance phase starts. The initial value is 65,535 bytes.
  - Segment
    - Any TCP/IP data or acknowledgement packet (or both)
2. TCP Congestion Control - Closed-loop Control
LG Figure 7.63 Dynamics of TCP congestion window
[Figure: congestion window (axis 0 to 20) versus round-trip times, showing slow start up to the threshold, then congestion avoidance until congestion occurs]
2. TCP Congestion Control - Closed-loop Control
- Slow start phase
  - Algorithm
    - Initialization of the congestion window
      - For a given connection, initialization sets the congestion window to one segment and the slow start threshold to 65,535 bytes.
    - Increment of the congestion window
      - The sender starts by transmitting one segment and waiting for its ACK. When that ACK is received, the congestion window is incremented from one to two, and two segments can be sent. When each of those two segments is acknowledged, the congestion window is increased to four. This provides exponential growth.
    - Limitation of the segments sent
      - The TCP output routine never sends more than the minimum of the congestion window and the advertised receive window.
    - Transition from slow start phase to congestion avoidance phase
      - The slow start phase stops when the congestion window reaches the slow start threshold. At this point, the congestion avoidance phase takes over.
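The slow start rules can be traced with a short sketch. For readability it counts windows in segments rather than bytes and assumes every segment sent in a round is acknowledged within one round-trip time; both are simplifications for the example. The function name `slow_start_rounds` is hypothetical.

```python
def slow_start_rounds(ssthresh, rwnd):
    """Trace slow start, one entry per round-trip time.

    ssthresh: slow start threshold, in segments (assumed unit).
    rwnd:     advertised receive window, in segments (must be >= 1).
    Returns (rounds, final_cwnd), where rounds is a list of
    (cwnd, segments_sent) pairs.
    """
    if rwnd < 1:
        raise ValueError("rwnd must be at least one segment")
    cwnd = 1                       # congestion window starts at one segment
    rounds = []
    while cwnd < ssthresh:
        # Never send more than min(cwnd, rwnd).
        in_flight = min(cwnd, rwnd)
        rounds.append((cwnd, in_flight))
        # Each ACK received adds one segment to cwnd; stop at the threshold.
        cwnd = min(cwnd + in_flight, ssthresh)
    return rounds, cwnd


rounds, final_cwnd = slow_start_rounds(ssthresh=16, rwnd=64)
print(rounds, final_cwnd)
```

When the receive window is the smaller of the two limits, fewer ACKs return per round and the growth is no longer a doubling, as `slow_start_rounds(16, 4)` shows.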
2. TCP Congestion Control - Closed-loop Control
- Slow start phase
  - Flow control
    - The advertised receiver window is flow control imposed by the receiver.
      - The advertised receiver window is related to the amount of available buffer space at the receiver for a connection.
    - The congestion window is flow control imposed by the sender.
      - The congestion window is based on the sender's assessment of perceived network congestion.
2. TCP Congestion Control - Closed-loop Control
- Congestion avoidance phase
  - This phase assumes that the pipe is running close to full utilization.
  - Algorithm
    - Increase the congestion window by one segment for each round-trip time.
    - The congestion window stops increasing when TCP detects that the network is congested.
    - When congestion is detected, the slow start threshold is first set to one-half of the current window size (the minimum of the congestion window and the advertised window, but at least two segments). Next, the congestion window is set to one maximum-sized segment.
  - Restart phase
    - With the updated slow start threshold and congestion window, the system restarts using the slow start algorithm.
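The interplay of slow start, congestion avoidance, and the restart after congestion can be traced per round-trip time. This is a simplified sketch: windows are counted in segments, halving uses integer division, and the rounds in which congestion is detected are supplied by hand through the hypothetical parameter `loss_at`. The function names are illustrative.

```python
def on_congestion(cwnd, rwnd):
    """Return (new_cwnd, new_ssthresh) after congestion is detected."""
    # Half of the current window, but at least two segments.
    ssthresh = max(min(cwnd, rwnd) // 2, 2)
    # The congestion window restarts at one segment.
    return 1, ssthresh


def next_round(cwnd, ssthresh):
    """Return cwnd for the next round-trip time when no loss occurred."""
    if cwnd < ssthresh:
        return min(2 * cwnd, ssthresh)   # slow start: exponential growth
    return cwnd + 1                      # congestion avoidance: +1 per RTT


def trace(rtts, loss_at, rwnd=64, ssthresh=16):
    """List cwnd at the start of each round-trip time."""
    cwnd = 1
    history = []
    for rtt in range(rtts):
        history.append(cwnd)
        if rtt in loss_at:
            cwnd, ssthresh = on_congestion(cwnd, rwnd)
        else:
            cwnd = next_round(cwnd, ssthresh)
    return history


# Congestion is detected during round 7 (an arbitrary choice).
print(trace(12, loss_at={7}))
```

The trace grows exponentially to the threshold, then by one segment per round, and after the loss it restarts from one segment with the threshold halved.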
2. TCP Congestion Control - Closed-loop Control
- Congestion detection
  - Time-out
    - TCP assumes that a segment has been lost to congestion in the network when an acknowledgement does not arrive before the time-out expires.
  - Duplicate ACK
    - A TCP receiver SHOULD send an immediate duplicate ACK when an out-of-order segment arrives. The purpose of this ACK is to inform the sender that a segment was received out of order and which sequence number is expected.
    - From the sender's perspective, duplicate ACKs can be caused by a number of network problems:
      - They can be caused by dropped segments. In this case, all segments after the dropped segment will trigger duplicate ACKs.
      - They can be caused by the re-ordering of data segments by the network (not a rare event along some network paths).
      - They can be caused by replication of ACK or data segments by the network.
2. TCP Congestion Control - Closed-loop Control
- Discussion
  - Weakness of the slow start / congestion avoidance algorithm
    - Periodic packet losses
      - During the congestion avoidance phase, the window increases linearly by approximately 1 for each round trip. Eventually new packet losses result in another timeout and this process repeats.
      - Oscillation of such high magnitude in the sending window size leads to drastic fluctuations in round-trip time and queue length. The periodic packet losses at each peak of the fluctuation have a severe impact on network performance.
      - Yet window oscillation is the very measure used in the multiplicative decrease/additive increase algorithm to probe the network conditions.
[Figure: sending window size versus time]
2. TCP Congestion Control - Closed-loop Control
- Duplicate ACK
  - Every time a data packet arrives at the receiving side, the receiver responds with an acknowledgement, even if this sequence number has already been acknowledged.
  - Thus, when a packet arrives out of order, TCP cannot yet acknowledge the data the packet contains because earlier data has not yet arrived. TCP resends the same acknowledgement it sent the last time.
  - This second transmission of the same acknowledgement is called a duplicate ACK.
  - When the sending side sees a duplicate ACK, it knows that the other side must have received a packet out of order, which suggests that an earlier packet might have been lost.
  - Since it is also possible that the earlier packet has only been delayed rather than lost, the sender waits until it sees some number of duplicate ACKs and then retransmits the missing packet.
  - In practice, TCP waits until it has seen three duplicate ACKs before retransmitting the packet.
2. TCP Congestion Control - Closed-loop Control
- Fast Retransmit
  - Reason
    - Relying only on TCP timeouts to detect congestion could lead to long periods during which the connection goes dead while waiting for a timer to expire.
    - Fast retransmit is a heuristic that sometimes triggers the retransmission of a dropped segment sooner than the regular timeout mechanism. It does not replace regular timeouts; it just enhances that facility.
    - Fast retransmit is based on the use of duplicate ACKs.
2. TCP Congestion Control - Closed-loop Control
- Fast Retransmit
  - Description
    - Since we do not know whether a duplicate ACK is caused by a lost segment or just a reordering of segments, we wait for a small number of duplicate ACKs to be received.
    - It is assumed that if there is just a reordering of the segments, there will be only one or two duplicate ACKs before the reordered segment is processed, which will then generate a new ACK. If three or more duplicate ACKs are received in a row, it is a strong indication that a segment has been lost.
    - We then perform a retransmission of what appears to be the missing segment, without waiting for a retransmission timer to expire.
2. TCP Congestion Control - Closed-loop Control
- Fast Recovery
  - Description
    - After fast retransmit, congestion avoidance, but not slow start, is performed. This is the fast recovery algorithm.
    - The reason for not performing slow start after fast retransmit is that the receipt of the duplicate ACKs tells us more than just that a packet has been lost. Since the receiver can only generate a duplicate ACK when another segment is received, that segment has left the network and is in the receiver's buffer. That is, there is still data flowing between the two ends, and we do not want to reduce the flow abruptly by going into slow start.
    - Fast retransmit and fast recovery are usually implemented together.
2. TCP Congestion Control - Closed-loop Control
- Fast retransmit/fast recovery algorithm
  1. When the third duplicate ACK is received, set ssthresh to one-half of the minimum of the current congestion window (cwnd) and the receiver's advertised window. Retransmit the missing segment. Set cwnd to ssthresh plus 3 times the segment size.
  2. Each time another duplicate ACK arrives, increment cwnd by the segment size and transmit a packet (if allowed by the new value of cwnd).
  3. When the next ACK arrives that acknowledges new data, set cwnd to ssthresh (the value set in step 1). This should be the ACK of the retransmission from step 1, one round-trip time after the retransmission. Additionally, this ACK should acknowledge all the intermediate segments sent between the lost packet and the receipt of the third duplicate ACK. This step is congestion avoidance, since we're slowing down to one-half the rate we were at when the packet was lost.
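The three steps can be sketched as a small sender-side state machine. The sketch only covers what the steps describe: it does not model normal window growth on new ACKs, timeouts, or actual transmission. The class name `FastRecoverySender`, the packet-numbered ACK convention (ACK n acknowledges packet n, so the missing packet is n + 1), and the byte values in the example are assumptions made for illustration.

```python
class FastRecoverySender:
    """Tracks cwnd and ssthresh through fast retransmit/fast recovery."""

    def __init__(self, cwnd, rwnd, mss):
        self.cwnd = cwnd            # congestion window, in bytes
        self.rwnd = rwnd            # receiver's advertised window, in bytes
        self.mss = mss              # segment size, in bytes
        self.ssthresh = None        # set when the third duplicate ACK arrives
        self.last_ack = None
        self.dup_acks = 0
        self.retransmitted = []     # packet numbers retransmitted so far

    def on_ack(self, ack_no):
        if ack_no == self.last_ack:
            self.dup_acks += 1
            if self.dup_acks == 3:
                # Step 1: halve the window, retransmit, inflate cwnd by 3 segments.
                self.ssthresh = min(self.cwnd, self.rwnd) // 2
                self.retransmitted.append(ack_no + 1)
                self.cwnd = self.ssthresh + 3 * self.mss
            elif self.dup_acks > 3:
                # Step 2: each further duplicate ACK inflates cwnd by one segment.
                self.cwnd += self.mss
        else:
            if self.dup_acks >= 3:
                # Step 3: an ACK for new data deflates cwnd back to ssthresh.
                self.cwnd = self.ssthresh
            self.last_ack = ack_no
            self.dup_acks = 0


sender = FastRecoverySender(cwnd=8000, rwnd=16000, mss=1000)
for ack in [1, 2, 2, 2, 2, 6]:
    sender.on_ack(ack)
print(sender.retransmitted, sender.ssthresh, sender.cwnd)
```

In this run the three duplicate copies of ACK 2 trigger the retransmission of packet 3 and raise cwnd to 7000 bytes. ACK 6 then acknowledges new data and brings cwnd down to the halved value of 4000 bytes.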
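[Figure: fast retransmit example between a sender and a receiver. The sender transmits packets 1 to 6 and the receiver returns ACK 1 and ACK 2. Packets 4, 5, and 6 each trigger a duplicate ACK 2, so the sender retransmits packet 3 and the receiver then returns ACK 6.]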
2. TCP Congestion Control - Closed-loop Control
- Discussion
  - Weakness of TCP congestion control using slow start and congestion avoidance only
    - TCP needs to create losses in order to find the available bandwidth of the connection.
    - TCP repeatedly increases the load it imposes on the network in an effort to find the point at which congestion occurs, and then it backs off from this point.
  - Alternative approaches
    - Congestion avoidance
      - Add a small amount of functionality to the routers to assist the end nodes in anticipating congestion.
3. Congestion Avoidance - DECbit
- DECbit
  - Concept
    - Based on K.K. Ramakrishnan and Raj Jain's research at Digital Equipment Corporation, titled "A Binary Feedback Scheme for Congestion Avoidance in Computer Networks".
    - More evenly split the responsibility for congestion control between the routers and the end nodes.
    - Each router monitors the load it is experiencing and explicitly notifies the end nodes when congestion is about to occur.
    - This notification is implemented by setting a binary congestion bit in the packets that flow through the router, hence the name DECbit.
    - The destination host then copies this binary congestion bit into the ACK it sends back to the source.
    - Finally, the source adjusts its sending rate so as to avoid congestion.
3. Congestion Avoidance - DECbit
- DECbit
  - Algorithm
    - Router Policy
      - Congestion detection: the router sets the congestion avoidance bit in the packet when the average queue length at the router at the time the packet arrives is greater than or equal to one.
      - Average queue length: the average queue length is determined from the number of packets at the router that are queued and in service, averaged over an interval T. The interval T is the last (busy + idle) cycle time plus the busy period of the current cycle. The router is busy when it is transmitting and idle when it is not.
3. Congestion Avoidance - DECbit
Note: The figure below shows the queue length at a router as a function of time. Essentially, the router calculates the area under the curve and divides this value by the time interval to compute the average queue length.
[Figure: queue length versus time, with the averaging interval covering the previous cycle and the current cycle up to the current time]
3. Congestion Avoidance - DECbit
- Algorithm
  - User Policy
    - The user updates the window size after receiving acknowledgements for a number of packets transmitted. This number is the sum of the previous window size (Wp) and the current window size (Wc) at which the transport connection is operating. The bits returned in the acknowledgements are stored by the user.
    - In practice, the source maintains a congestion window, just as in TCP, and watches to see what fraction of the last window's worth of packets resulted in the binary congestion bit being set. The bits corresponding to the last Wc packets for which acknowledgements are returned are examined.
    - If less than 50% of the packets had the bit set, then the source increases its congestion window by one packet.
    - If 50% or more of the last window's worth of packets had the congestion bit set, then the source decreases its congestion window to 0.875 times the previous value.
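The source's window adjustment can be written as a single function. This is a minimal sketch of the rule stated above; the function name `decbit_update` and the representation of the returned bits as a list of 0/1 values are assumptions for the example.

```python
def decbit_update(window, bits):
    """Return the new congestion window under the DECbit user policy.

    window: current congestion window, in packets.
    bits:   congestion bits (0 or 1) copied into the ACKs for the
            last window's worth of packets.
    """
    if not bits:
        return window                 # nothing to examine yet
    fraction_set = sum(bits) / len(bits)
    if fraction_set < 0.5:
        return window + 1             # additive increase by one packet
    return window * 0.875             # multiplicative decrease


print(decbit_update(8, [0, 0, 0, 1, 0, 0, 0, 0]))   # 1 of 8 bits set
print(decbit_update(8, [1, 1, 1, 1, 0, 0, 0, 0]))   # 4 of 8 bits set
```

The first call grows the window to 9 packets; the second, with exactly half of the bits set, shrinks it to 7.0.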
3. Congestion Avoidance - DECbit
- Discussion
  - Weakness of DECbit congestion avoidance
    - DECbit requires modification of the packet header.
    - DECbit requires modification of router software.
3. Congestion Avoidance - DECbit
Information flow in the DECbit congestion avoidance mechanism
[Figure: a data packet travels from the source node through routers to the destination node; a congested router sets the congestion avoidance bit, and the destination node returns the bit to the source node in the acknowledgement packet]
3. Congestion Avoidance - Random Early Detection (RED)
- Concept
  - Each router is programmed to monitor its own queue length.
  - When the router detects that congestion is imminent, it notifies the source to adjust its congestion window.
- Implicit notification
  - The router implicitly notifies the source of imminent congestion by dropping one of its packets.
  - The source is, therefore, effectively notified by the subsequent timeout or duplicate ACK.
  - RED is designed to be used in conjunction with TCP, which detects congestion by means of timeouts or duplicate ACKs.
  - The router drops a few packets before it has exhausted its buffer space completely, so as to cause the source to slow down, in the hope that it will not have to drop a lot of packets later on.
3. Congestion Avoidance - Random Early Detection (RED)
- Algorithm
  - if CurrentAvgLen ≤ MinThreshold
    - then queue the packet
  - if MinThreshold < CurrentAvgLen < MaxThreshold
    - then calculate probability P and drop the arriving packet with probability P
  - if MaxThreshold ≤ CurrentAvgLen
    - then drop the arriving packet
- Notes on the algorithm
  - If the average queue length is smaller than the lower threshold, no action is taken.
  - If the average queue length is larger than the upper threshold, then the packet is always dropped.
  - If the average queue length is between the two thresholds, then the newly arriving packet is dropped with some probability P.
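The three cases translate directly into a decision function. In this sketch the drop probability `p` is passed in rather than computed, and the random source is injectable so that the behaviour can be checked deterministically; the function name `red_decision` and the parameter `rng` are assumptions for the example.

```python
import random


def red_decision(avg_len, min_th, max_th, p, rng=random.random):
    """Return "queue" or "drop" for an arriving packet.

    avg_len: current average queue length.
    min_th:  lower threshold (MinThreshold).
    max_th:  upper threshold (MaxThreshold).
    p:       drop probability used between the two thresholds.
    rng:     function returning a float in [0, 1); injectable for testing.
    """
    if avg_len <= min_th:
        return "queue"
    if avg_len >= max_th:
        return "drop"
    # Between the thresholds: drop with probability p.
    return "drop" if rng() < p else "queue"


print(red_decision(3, min_th=5, max_th=15, p=0.1))    # below the lower threshold
print(red_decision(20, min_th=5, max_th=15, p=0.1))   # above the upper threshold
```

Between the thresholds the outcome is random, which is why the example calls only exercise the two deterministic cases.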
3. Congestion Avoidance - Random Early Detection (RED)
- Calculation of average queue length
  - CurrentAvgLen = (1 - Weight) x PreviousAvgLen + Weight x SampleLen
  - where 0 < Weight < 1, and
  - SampleLen is the length of the queue when a sample measurement is made. In most software implementations, the queue length is measured every time a new packet arrives at the router. In hardware, it might be calculated at some fixed sampling interval.
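A short numeric sketch shows how the weighted average smooths a sudden burst. The weight of 0.25 and the sample values are chosen only so that the arithmetic is easy to follow; they are not recommended settings. The function name `update_avg` is hypothetical.

```python
def update_avg(prev_avg, sample_len, weight):
    """Return the new average queue length (weighted running average)."""
    return (1 - weight) * prev_avg + weight * sample_len


# The queue jumps from empty to 8 packets and stays there for three samples.
avg = 0.0
history = []
for sample in [8, 8, 8]:
    avg = update_avg(avg, sample, weight=0.25)
    history.append(avg)
print(history)
```

The average moves only part of the way toward each new sample (2.0, then 3.5, then 4.625), so a short burst raises it far less than a sustained queue does.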
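3. Congestion Avoidance - Random Early Detection (RED)
- Calculation of Probability (P)
  - TempP = MaxP x (CurrentAvgLen - MinThreshold) / (MaxThreshold - MinThreshold)
  - P = TempP / (1 - count x TempP)
  - where count is the number of newly arriving packets that have been queued (not dropped) while CurrentAvgLen has been between the two thresholds.
[Figure: TempP versus CurrentAvgLen. TempP is zero up to MinThreshold, rises linearly to MaxP at MaxThreshold, and then jumps to 1.0.]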
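The two formulas can be evaluated with a short sketch. It applies only when the average queue length lies between the two thresholds. The guard that returns 1.0 when count x TempP reaches 1 is an assumption added here to avoid division by zero, and the numeric values are illustrative. The function name `drop_probability` is hypothetical.

```python
import math


def drop_probability(avg_len, min_th, max_th, max_p, count):
    """Return P for min_th < avg_len < max_th.

    count: packets queued since the last drop while the average
           has been between the two thresholds.
    """
    temp_p = max_p * (avg_len - min_th) / (max_th - min_th)
    if count * temp_p >= 1:
        return 1.0        # assumed guard: a drop is certain at this point
    return temp_p / (1 - count * temp_p)


# Halfway between the thresholds with MaxP = 0.02, so TempP is about 0.01.
p_first = drop_probability(10, min_th=5, max_th=15, max_p=0.02, count=0)
p_later = drop_probability(10, min_th=5, max_th=15, max_p=0.02, count=50)
print(p_first, p_later)
print(math.isclose(p_later, 2 * p_first))
```

Dividing by (1 - count x TempP) makes P grow with the number of packets accepted since the last drop, which tends to space drops out more evenly than using TempP alone.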