Title: TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581
1TCP Overview RFCs 793, 1122, 1323, 2018, 2581
- point-to-point: one sender, one receiver
- connection-oriented
- exchange control msgs first to initialize sender, receiver state
- full duplex data delivery
- bi-directional data flow over the same connection
- reliable, in-order byte stream delivery
- no message boundaries
- sender, receiver must buffer data
- flow controlled
- Prevent sender from flooding receiver
- Congestion controlled
- Reduce potential jam in the network
[Figure: socket interface between application and TCP; the application writes data into the TCP send buffer and reads data from the TCP receive buffer]
2What defines a TCP connection
- TCP uses 4 values to define a connection (a communication association)
- local-host-addr, local-port, remote-host-addr, remote-port
- each of the two ends keeps state for on-going communication
- sequence # for data sent, received, ack'ed, retransmission timer, flow and congestion window
[Figure: protocol stack with UDP and TCP above IP, above Ethernet]
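As an illustration of how the 4 values identify a connection, here is a minimal sketch of a connection table keyed by the 4-tuple. The names `ConnKey`, `ConnState`, `connections`, and `demux`, and the example addresses, are hypothetical and chosen for this sketch only; a real TCP implementation keeps far more state per connection.

```python
from typing import Dict, NamedTuple, Optional


class ConnKey(NamedTuple):
    """The 4 values that define a connection (hypothetical type name)."""
    local_addr: str
    local_port: int
    remote_addr: str
    remote_port: int


class ConnState:
    """A small subset of the per-connection state kept by one end."""

    def __init__(self, initial_seq: int) -> None:
        self.next_seq_to_send = initial_seq  # sequence # for data sent
        self.next_seq_expected = 0           # sequence # for data received
        self.rto_seconds = 3.0               # retransmission timer value


connections: Dict[ConnKey, ConnState] = {}


def demux(local_addr: str, local_port: int,
          remote_addr: str, remote_port: int) -> Optional[ConnState]:
    """Find the connection an arriving segment belongs to, if any."""
    return connections.get(ConnKey(local_addr, local_port, remote_addr, remote_port))


# Two clients using the same server port are still two distinct connections.
connections[ConnKey("10.0.0.1", 80, "10.0.0.2", 5000)] = ConnState(100)
connections[ConnKey("10.0.0.1", 80, "10.0.0.3", 5000)] = ConnState(900)
```

Because the key includes the remote address and port, segments from different peers never share state even when they arrive at the same local port.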
3Issues To Consider
- packets may be lost, duplicated, re-ordered
- packets can be delayed arbitrarily long inside the network
- the delay between two communicating ends is unknown beforehand and may vary over time
- port numbers can be reused later
- a later connection must not mistake packets from an earlier connection as its own
4TCP segment format
IP header
TCP header (each row is 32 bits wide):
| source port | dest port |
| sequence number |
| acknowledgement number |
| head len | not used | U A P R S F | rcvr window size |
| checksum | ptr to urgent data |
| Options (variable length) |
| application data (variable length) |
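The fixed 20-byte part of the header can be decoded with Python's `struct` module. This is a minimal sketch: the function name `parse_tcp_header`, the returned dictionary layout, and the sample segment are assumptions for illustration, and options are skipped rather than decoded.

```python
import struct

# Flag bits in the order U A P R S F (high bit to low bit).
FLAG_BITS = [("U", 0x20), ("A", 0x10), ("P", 0x08),
             ("R", 0x04), ("S", 0x02), ("F", 0x01)]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed TCP header fields; options are skipped, not parsed."""
    (src_port, dst_port, seq, ack, offset_byte, flags_byte,
     window, checksum, urgent_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_byte >> 4) * 4  # head len is counted in 32-bit words
    flags = "".join(name for name, bit in FLAG_BITS if flags_byte & bit)
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urgent_ptr": urgent_ptr,
        "data": segment[header_len:],
    }


# A hand-built sample segment: head len = 5 words, flags = PSH + ACK.
sample = struct.pack("!HHIIBBHHH", 5000, 80, 42, 79, 5 << 4, 0x18, 65535, 0, 0) + b"hello"
parsed = parse_tcp_header(sample)
```

The `!` prefix selects network (big-endian) byte order, which is how the fields appear on the wire.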
5TCP Connection Establishment
- initialize TCP control variables
- Initial seq. # used in each direction
- Buffer size (rcvWindow)
- Three way handshake
- 1 client host sends TCP SYN segment to server
- specifies initial seq #
- Does not carry data
- 2 server receives SYN, replies with SYN_ACK control segment (an ACK of the client's SYN plus the server's own SYN)
- 3 client end receives SYN_ACK, replies with ACK
- May carry data
[Figure: three-way handshake between client, which calls connect( ), and server, which calls listen( )]
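The sequence and acknowledgement numbers exchanged in the three steps above can be traced with a short sketch. The function name `three_way_handshake` and the dictionary representation of a segment are hypothetical; the only protocol fact relied on is that a SYN consumes one sequence number.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits wide


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three control segments as simple dictionaries."""
    syn = {"flags": "SYN", "seq": client_isn}
    syn_ack = {
        "flags": "SYN+ACK",
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_SPACE,  # SYN consumes one sequence number
    }
    ack = {
        "flags": "ACK",
        "seq": (client_isn + 1) % SEQ_SPACE,
        "ack": (server_isn + 1) % SEQ_SPACE,
    }
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
for segment in segments:
    print(segment)
```

After the third segment both ends know the other side's initial sequence number, which is the state the handshake exists to initialize.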
6TCP Connection Close
- Either end can initiate the close of its end of the connection at any time
- 1 one end (A) sends TCP FIN control segment to the other
- 2 the other end (B) receives FIN, replies with FIN_ACK; when it's ready to close too, it sends FIN
- 3 A receives FIN, replies with FIN_ACK.
- 4 B receives FIN_ACK, closes connection
- what problem does A have?
[Figure: timeline of the close exchange between client (A) and server (B); each side calls close( ), with a "?" marking A's final step]
7the well-known two-army problem
[Figure: one blue army and two red armies]
- Q: how can the 2 red armies agree on an attack time?
- Fact: the last one who sends a message does not know whether the msg is delivered
- Basic rule: one cannot send an ACK to acknowledge an ACK
8TCP Connection Close
- 1 one end (A) sends TCP FIN control segment to the other
- 2 the other end (B) receives FIN, replies with FIN_ACK; when it's ready to close too, it sends FIN
- 3 A receives FIN, replies with ACK (the FIN_ACK).
- 4 B receives FIN_ACK, closes connection
- A enters timed wait, waits for 2 min before deleting the connection state
- Abort a connection: send reset to the other end, enter closed state immediately
- All data assumed lost
[Figure: timeline of the close exchange between client (A) and server (B); each side calls close( )]
9TCP Connection Management (cont)
[Figure: TCP server lifecycle and TCP client lifecycle state diagrams, including a "wait 2 min" step]
10TCP state-transition diagram
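Transitions are written as: state -> next state: event / segment sent
- CLOSED -> LISTEN: Passive open
- CLOSED -> SYN_SENT: Active open / SYN
- LISTEN -> CLOSED: Close
- LISTEN -> SYN_RCVD: SYN / SYN ACK
- LISTEN -> SYN_SENT: Send / SYN
- SYN_SENT -> CLOSED: Close
- SYN_SENT -> SYN_RCVD: SYN / SYN ACK
- SYN_SENT -> ESTABLISHED: SYN ACK / ACK
- SYN_RCVD -> ESTABLISHED: ACK
- SYN_RCVD -> FIN_WAIT_1: Close / FIN
- ESTABLISHED -> FIN_WAIT_1: Close / FIN
- ESTABLISHED -> CLOSE_WAIT: FIN / ACK
- CLOSE_WAIT -> LAST_ACK: Close / FIN
- LAST_ACK -> CLOSED: ACK
- FIN_WAIT_1 -> FIN_WAIT_2: ACK
- FIN_WAIT_1 -> CLOSING: FIN / ACK
- FIN_WAIT_1 -> TIME_WAIT: ACK FIN / ACK
- FIN_WAIT_2 -> TIME_WAIT: FIN / ACK
- CLOSING -> TIME_WAIT: ACK
- TIME_WAIT -> CLOSED: Timeout after two segment lifetimes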
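The state-transition diagram can be encoded as a lookup table and replayed against a sequence of events. This is a sketch under assumptions: the event names such as `recv_SYN+ACK` and `timeout_2MSL`, and the helper `run`, are made up for illustration, and the transitions follow the standard diagram from RFC 793.

```python
# (current state, event) -> (next state, segment sent or None)
TRANSITIONS = {
    ("CLOSED", "passive_open"): ("LISTEN", None),
    ("CLOSED", "active_open"): ("SYN_SENT", "SYN"),
    ("LISTEN", "close"): ("CLOSED", None),
    ("LISTEN", "recv_SYN"): ("SYN_RCVD", "SYN+ACK"),
    ("LISTEN", "send"): ("SYN_SENT", "SYN"),
    ("SYN_SENT", "close"): ("CLOSED", None),
    ("SYN_SENT", "recv_SYN"): ("SYN_RCVD", "SYN+ACK"),
    ("SYN_SENT", "recv_SYN+ACK"): ("ESTABLISHED", "ACK"),
    ("SYN_RCVD", "recv_ACK"): ("ESTABLISHED", None),
    ("SYN_RCVD", "close"): ("FIN_WAIT_1", "FIN"),
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "FIN"),
    ("ESTABLISHED", "recv_FIN"): ("CLOSE_WAIT", "ACK"),
    ("CLOSE_WAIT", "close"): ("LAST_ACK", "FIN"),
    ("LAST_ACK", "recv_ACK"): ("CLOSED", None),
    ("FIN_WAIT_1", "recv_ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_1", "recv_FIN"): ("CLOSING", "ACK"),
    ("FIN_WAIT_1", "recv_FIN+ACK"): ("TIME_WAIT", "ACK"),
    ("FIN_WAIT_2", "recv_FIN"): ("TIME_WAIT", "ACK"),
    ("CLOSING", "recv_ACK"): ("TIME_WAIT", None),
    ("TIME_WAIT", "timeout_2MSL"): ("CLOSED", None),
}


def run(events, state="CLOSED"):
    """Replay events from `state`; return the final state and segments sent."""
    sent = []
    for event in events:
        state, segment = TRANSITIONS[(state, event)]  # KeyError = illegal event
        if segment is not None:
            sent.append(segment)
    return state, sent


# A client that opens, then closes first, ends up waiting in TIME_WAIT.
client_state, client_sent = run(
    ["active_open", "recv_SYN+ACK", "close", "recv_ACK", "recv_FIN"])
# A server that is closed by its peer goes through CLOSE_WAIT and LAST_ACK.
server_state, server_sent = run(
    ["passive_open", "recv_SYN", "recv_ACK", "recv_FIN", "close", "recv_ACK"])
```

Keeping the diagram as data rather than nested conditionals makes it easy to check that every state has the expected outgoing edges.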
11How to Set TCP Retransmission Timer
- TCP sets rxt timer based on measured RTT
- SRTT = EstimatedRTT (smoothed RTT)
- $\text{SRTT} = (1-\alpha) \times \text{SRTT} + \alpha \times \text{SampleRTT}$
- Setting retransmission timer
- SRTT plus safety margin
- $\text{Timer} = \text{SRTT} + 4 \times \text{rttvar}$
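12After obtaining a new RTT sample
- $\text{difference} = \text{SampleRTT} - \text{SRTT}$
- $\text{SRTT} = (1-\alpha) \times \text{SRTT} + \alpha \times \text{SampleRTT}$
- $= \text{SRTT} + \alpha \times \text{difference}$
- $\text{rttvar} = (1-\beta) \times \text{rttvar} + \beta \times |\text{difference}|$
- $= \text{rttvar} + \beta \times (|\text{difference}| - \text{rttvar})$
- Retransmission Timer: $\text{RTO} = \text{SRTT} + 4 \times \text{rttvar}$
- Typically $\alpha = 1/8$, $\beta = 1/4$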
13An Example
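Assuming SRTT = 500 msec, rttvar = 120
RTT(3) = 600 ms:
- $\text{difference} = \text{RTT} - \text{SRTT} = 100$ ms
- $\text{SRTT} = 500 + 0.125 \times 100 = 512.5$
- $\text{rttvar} = 120 + 0.25 \times (100 - 120) = 115$
- $\text{RTO} = \text{SRTT} + 4 \times \text{rttvar} = 512.5 + 460 = 972.5$ ms
RTT(4) = 650 ms (values rounded to whole msec):
- $\text{difference} = \text{RTT} - \text{SRTT} \approx 137$ ms
- $\text{SRTT} \approx 512 + 0.125 \times 137 \approx 529$
- $\text{rttvar} \approx 115 + 0.25 \times (137 - 115) \approx 120$
[Figure: sender/receiver timeline showing RTT samples 3 and 4]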
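The update rule and the worked example can be checked with a small estimator. This is a minimal sketch: the class name `RttEstimator` and its method names are assumptions for illustration, the units are milliseconds, and no rounding is applied, so the second update gives slightly different values than the rounded figures in the example.

```python
class RttEstimator:
    """Smoothed RTT and RTT variation, updated once per RTT sample."""

    def __init__(self, srtt: float, rttvar: float,
                 alpha: float = 1 / 8, beta: float = 1 / 4) -> None:
        self.srtt = srtt
        self.rttvar = rttvar
        self.alpha = alpha
        self.beta = beta

    @property
    def rto(self) -> float:
        """Retransmission timer: SRTT plus a safety margin of 4 x rttvar."""
        return self.srtt + 4 * self.rttvar

    def update(self, sample_rtt: float) -> float:
        """Fold one new RTT sample into SRTT and rttvar; return the new RTO."""
        difference = sample_rtt - self.srtt
        self.srtt += self.alpha * difference
        self.rttvar += self.beta * (abs(difference) - self.rttvar)
        return self.rto


estimator = RttEstimator(srtt=500.0, rttvar=120.0)
rto_after_third = estimator.update(600.0)   # SRTT 512.5, rttvar 115.0
rto_after_fourth = estimator.update(650.0)  # unrounded second update
```

The `abs()` matters only when a sample is smaller than SRTT; in the example both differences are positive.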
14Example RTT estimation
15How to measure RTT in cases of retransmissions?
- Options
- take the delay between first transmission and final ACK?
- take the delay between last retransmission of segment(n) and ACK(n)?
- Don't measure?
[Figure: a segment is retransmitted after a timeout; which interval is the RTT?]
16Karn's algorithm
- in case of retransmission
- do not take the RTT sample (do not update SRTT or rttvar)
- double the retransmission timer value (RTO) after each timeout
- Take RTT measure again upon next transmission (without retrans.)
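The rules above can be sketched as a timer object that ignores ambiguous samples and backs off on timeout. The class name `KarnTimer`, the `was_retransmitted` argument, and the fixed gains of 1/8 and 1/4 are assumptions of this sketch, not part of any particular TCP stack.

```python
class KarnTimer:
    """Retransmission timer that applies Karn's algorithm (times in ms)."""

    ALPHA = 1 / 8
    BETA = 1 / 4

    def __init__(self, srtt: float, rttvar: float) -> None:
        self.srtt = srtt
        self.rttvar = rttvar
        self.rto = srtt + 4 * rttvar

    def on_timeout(self) -> None:
        """Each timeout doubles the RTO (exponential backoff)."""
        self.rto *= 2

    def on_ack(self, sample_rtt: float, was_retransmitted: bool) -> None:
        """Use the sample only if the acked segment was sent exactly once."""
        if was_retransmitted:
            return  # ambiguous sample: keep SRTT, rttvar and the backed-off RTO
        difference = sample_rtt - self.srtt
        self.srtt += self.ALPHA * difference
        self.rttvar += self.BETA * (abs(difference) - self.rttvar)
        self.rto = self.srtt + 4 * self.rttvar


timer = KarnTimer(srtt=500.0, rttvar=120.0)  # RTO starts at 980 ms
timer.on_timeout()                           # RTO doubles to 1960 ms
timer.on_ack(sample_rtt=600.0, was_retransmitted=True)   # sample discarded
timer.on_ack(sample_rtt=600.0, was_retransmitted=False)  # clean sample: RTO recomputed
```

Note that the backed-off RTO stays in force until a segment that was never retransmitted is acknowledged.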
17One more question
- What initial SRTT, rttvar values to start with?
- Currently by some engineered guessing
- what if the guessed value is too small?
- Unnecessary retransmissions
- what if the guessed value is too large?
- In case of first or first few packets being lost, wait longer than necessary before retransmission
- current practice
- initial SRTT value = 3 sec, rttvar = 3 sec
- when the first RTT sample arrives: SRTT = RTT, rttvar = SRTT/2
18TCP's seq. #s and ACK #s
- Seq. #
- The number of first byte in segment's data
- ACK #
- seq # of next byte expected from other side
- cumulative ACK
A simple example
- Host A -> Host B: Seq=42, ACK=79, data (Host A sends 10-byte data)
- Host B -> Host A: Seq=79, ACK=52, data (host B ACKs receipt of 10B data from A, and sends 5-byte data)
- Host A -> Host B: Seq=52, ACK=84 (host A ACKs receipt of 5B)
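The numbers in the simple example follow mechanically from two counters per host. The sketch below reproduces them; the class name `Endpoint` and the dictionary used to represent a segment are hypothetical, and only in-order delivery is modelled.

```python
class Endpoint:
    """Tracks the next byte to send and the next byte expected from the peer."""

    def __init__(self, next_seq: int, next_expected: int) -> None:
        self.next_seq = next_seq
        self.next_expected = next_expected

    def send(self, payload: bytes) -> dict:
        """Build a segment: Seq = first byte of data, ACK = next byte expected."""
        segment = {"seq": self.next_seq, "ack": self.next_expected,
                   "len": len(payload)}
        self.next_seq += len(payload)
        return segment

    def receive(self, segment: dict) -> None:
        """Accept in-order data and advance the cumulative ACK point."""
        if segment["seq"] == self.next_expected:
            self.next_expected += segment["len"]


host_a = Endpoint(next_seq=42, next_expected=79)
host_b = Endpoint(next_seq=79, next_expected=42)

first = host_a.send(b"x" * 10)   # Seq=42, ACK=79, 10 bytes of data
host_b.receive(first)
second = host_b.send(b"y" * 5)   # Seq=79, ACK=52, 5 bytes of data
host_a.receive(second)
third = host_a.send(b"")         # Seq=52, ACK=84, pure ACK
```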
19How to guarantee seq. # uniqueness
- sequence #s will eventually wrap around
- TCP assumes Maximum Segment Lifetime (MSL) of 120 sec.
- make sure that for the same [src-addr, src-port, dest-addr, dest-port] tuple, the same sequence number does not get reused within 2xMSL
- assure that no two different data segments can bear the same sequence number, as long as data's life time < 120 sec.
20TCP reliable data transfer
- simplified sender, assuming
- one way data transfer
- no flow/congestion control
SendBase = Initial_SeqNumber
NextSeqNum = Initial_SeqNumber

loop (forever) {
    switch (event)

    event: data received from application above
        create TCP segment with seq. number NextSeqNum
        start timer for segment NextSeqNum
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout for segment with seq. number y
        retransmit segment with sequence number y
        compute new timeout interval for segment y
        restart timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {    /* cumulative ACK of all data up to y */
            SendBase = y
            if (any outstanding not-yet-ack'ed segments)
                start timer
        } else {               /* a duplicate ACK for already ACKed segment */
            increment count of duplicate ACKs received for y
            if (count of dup. ACKs received for y == 3) {
                resend segment with sequence number y
                reset dup. count
            }
        }
} /* end of loop forever */
[Figure: sender event loop. From "wait for event": data received from application -> create, send segment; timeout for segment with seq # y -> retransmit segment; ACK received, with ACK # y -> ACK processing; then back to "wait for event"]
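The simplified sender can also be written as a small runnable model in which each event is a method call. This is a sketch under assumptions: the class `SimpleTcpSender`, the `wire` list standing in for "pass segment to IP", and the boolean `timer_running` are invented for illustration; sequence-number wrap-around and the actual timeout computation are left out.

```python
class SimpleTcpSender:
    """One-way sender without flow or congestion control."""

    def __init__(self, initial_seq: int) -> None:
        self.send_base = initial_seq   # oldest not-yet-ACKed byte
        self.next_seq = initial_seq    # sequence # of the next new byte
        self.unacked = {}              # seq # -> data still awaiting an ACK
        self.dup_acks = 0
        self.timer_running = False
        self.wire = []                 # (seq #, data) handed to IP, in order

    def on_app_data(self, data: bytes) -> None:
        self.unacked[self.next_seq] = data
        self.wire.append((self.next_seq, data))
        self.timer_running = True
        self.next_seq += len(data)

    def on_timeout(self, y: int) -> None:
        self.wire.append((y, self.unacked[y]))  # retransmit segment y
        self.timer_running = True               # restart timer

    def on_ack(self, y: int) -> None:
        if y > self.send_base:                  # cumulative ACK up to y
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]
            self.send_base = y
            self.dup_acks = 0
            self.timer_running = bool(self.unacked)
        else:                                   # duplicate ACK
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.unacked:
                self.wire.append((y, self.unacked[y]))  # fast retransmit
                self.dup_acks = 0


sender = SimpleTcpSender(initial_seq=92)
sender.on_app_data(b"a" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"b" * 20)   # Seq=100, 20 bytes
sender.on_ack(100)              # first segment acknowledged
for _ in range(3):
    sender.on_ack(100)          # third duplicate triggers a resend of Seq=100
sender.on_ack(120)              # everything acknowledged, timer stops
```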
21Fast Retransmit
- Time-out period often relatively long
- long delay before resending lost packet
- Detect lost segments via duplicate ACKs.
- Sender often sends many segments back-to-back
- If segment is lost, there will likely be many
duplicate ACKs.
- If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost
- fast retransmit: resend segment before timer expires
22TCP retransmission scenarios
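[Figure: premature timeout scenario. Host A sends Seq=92 (8 bytes data) and Seq=100 (20 bytes data); Host B replies with ACK=100 and ACK=120; the Seq=92 timer expires before the ACKs arrive, so Host A retransmits Seq=92 (8 bytes data) and Host B replies with ACK=120; SendBase moves from 100 to 120]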
23TCP retransmission scenarios (more)
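[Figure: two Host A / Host B timelines. Cumulative ACK scenario: Host A sends Seq=92 (8 bytes data) and Seq=100 (20 bytes data); ACK=100 is lost (X), but ACK=120 arrives before the timeout, so SendBase = 120. Fast RXT scenario: a data segment is lost (X); remaining label: ACK592]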
24TCP Receiver: when to send ACK?
Event | TCP Receiver action
in-order segment arrival, no gaps, everything earlier already ACKed | delayed ACK: wait up to 500 ms; if nothing arrives, send ACK
in-order segment arrival, no gaps, one delayed ACK pending | immediately send one cumulative ACK
out-of-order arrival, higher-than-expected seq. #, gap detected | send duplicate ACK, indicating seq. # of next expected byte
arrival of segment that partially or completely fills a gap | immediate ACK if segment starts at the lower end of the gap
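The receiver's decision for each arriving segment can be sketched as a function of two pieces of state: whether a gap exists and whether a delayed ACK is pending. The class `AckingReceiver` and the action strings are hypothetical. The 500 ms timer itself is not modelled, and replying to an already-received (old) segment with a duplicate ACK is an assumption of this sketch.

```python
class AckingReceiver:
    """Decides which kind of ACK to send for each arriving segment."""

    def __init__(self, expected: int) -> None:
        self.expected = expected          # seq # of next expected byte
        self.buffered = {}                # out-of-order segments: seq # -> length
        self.delayed_ack_pending = False

    def on_segment(self, seq: int, length: int):
        """Return (action, ACK number) for one arriving segment."""
        if seq != self.expected:
            if seq > self.expected:       # higher than expected: gap detected
                self.buffered[seq] = length
            self.delayed_ack_pending = False
            return ("duplicate ACK", self.expected)

        had_gap = bool(self.buffered)
        self.expected += length
        while self.expected in self.buffered:   # segment filled (part of) a gap
            self.expected += self.buffered.pop(self.expected)

        if had_gap or self.delayed_ack_pending:
            self.delayed_ack_pending = False
            return ("immediate ACK", self.expected)
        self.delayed_ack_pending = True         # wait up to 500 ms for more data
        return ("delayed ACK", self.expected)


receiver = AckingReceiver(expected=92)
actions = [
    receiver.on_segment(92, 8),     # in order, nothing pending
    receiver.on_segment(100, 20),   # in order, one delayed ACK pending
    receiver.on_segment(140, 20),   # out of order: bytes 120..139 missing
    receiver.on_segment(160, 20),   # still out of order
    receiver.on_segment(120, 20),   # fills the gap at its lower end
]
```

The run of duplicate ACKs produced while the gap is open is exactly what a fast-retransmit sender counts.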
25TCP Flow Control
- receiver informs sender of (dynamically changing) amount of free buffer space
- RcvWindow field in TCP header
- sender keeps the amount of transmitted, unACKed data no more than most recently received RcvWindow
Prevent sender from overrunning receiver's buffer by transmitting too much too fast
- Special case: When RcvWindow = 0
- sender can send a 1-byte segment
- receiver can respond with current size
- receiver buffer eventually freed -> window size increased
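The sender-side rule amounts to one subtraction plus the zero-window special case. In this sketch the function names `usable_window` and `bytes_to_send` and the counters `last_byte_sent` and `last_byte_acked` are assumptions for illustration; congestion control, which would further limit the sender, is ignored.

```python
def usable_window(last_byte_sent: int, last_byte_acked: int, rcv_window: int) -> int:
    """Bytes the sender may still transmit without overrunning the receiver."""
    in_flight = last_byte_sent - last_byte_acked  # transmitted, unACKed data
    return max(0, rcv_window - in_flight)


def bytes_to_send(pending: int, last_byte_sent: int,
                  last_byte_acked: int, rcv_window: int) -> int:
    """How much of `pending` application data to send right now."""
    if rcv_window == 0:
        # Special case: send a 1-byte segment so the receiver can respond
        # with its current window size.
        return min(pending, 1)
    return min(pending, usable_window(last_byte_sent, last_byte_acked, rcv_window))


# 500 bytes are in flight and the receiver advertised 4096 bytes of space.
allowed = bytes_to_send(pending=10_000, last_byte_sent=1500,
                        last_byte_acked=1000, rcv_window=4096)
```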
26Design Choice: Counting bytes or counting packets?
- pros of counting bytes: flexibility
- need a byte counter somewhere anyway
- can repackage data for retransmission
- e.g. first sent segment-1 with 200 bytes
- 300 more bytes are passed down from application
- Segment-1 times out, send new segment with 500 bytes of data
27Counting Bytes: cons
- sequence number runs out faster
- needs a larger sequence # field
- easily fall into traps of transmitting small packets
- network overhead goes up with the number of packets transmitted
- silly window syndrome: receiver ACKed a single byte, causing sender to send single byte segment forever
28Design Choices: Understand the consequence of the design
- TCP sequence number: 32 bits -> 4 Gbytes
- wrap-around time
- 50 Kbps: about 190 hours (roughly 8 days)
- Ethernet (10 Mbps): about an hour
- FDDI (100 Mbps): 6 minutes
- at 1 Gbps: about 30 seconds
- TCP window size: 16 bits -> 64 Kbytes max
- assume RTT = 100 msec
- can keep a channel of 5 Mbps fully utilized
- OC3 (155 Mbps) x 100 msec = 1.9 MB, need a window size of at least 21 bits
- 1 Gbps x 100 msec = 12.5 MB, need a window size of at least 24 bits
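Both calculations on this slide are one-liners, so they are easy to redo for any link speed. In this sketch the function names `wraparound_seconds`, `max_rate_bps`, and `window_bits_needed` are hypothetical, and rates use decimal units (1 Mbps = 10^6 bits per second).

```python
import math


def wraparound_seconds(rate_bps: float, seq_bits: int = 32) -> float:
    """Time to send 2**seq_bits bytes, i.e. to use up the sequence space."""
    return (2 ** seq_bits) * 8 / rate_bps


def max_rate_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Highest rate a sender can sustain with one window per RTT."""
    return window_bytes * 8 / rtt_seconds


def window_bits_needed(rate_bps: float, rtt_seconds: float):
    """Bandwidth x delay in bytes, and the window field width that covers it."""
    bdp_bytes = rate_bps * rtt_seconds / 8
    return bdp_bytes, math.ceil(math.log2(bdp_bytes))


ethernet_minutes = wraparound_seconds(10e6) / 60         # a little under an hour
gigabit_seconds = wraparound_seconds(1e9)                # roughly half a minute
rate_with_16_bit_window = max_rate_bps(65_535, 0.100)    # a bit over 5 Mbps
oc3_bytes, oc3_bits = window_bits_needed(155e6, 0.100)
gig_bytes, gig_bits = window_bits_needed(1e9, 0.100)
```

The window-size limit is the reason RFC 1323, listed in the title, introduced a window scale option.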
29Always Keep the Big Picture in Mind
[Figure: Web browser and Web server exchange HTTP; on each host HTTP uses the socket interface, with TCP beneath it; the two TCP ends communicate over unreliable network data packet delivery]