Title: INFO 330 Computer Networking Technology I
1 INFO 330 Computer Networking Technology I
- Chapter 3
- The Transport Layer
- Glenn Booker
2 Transport Layer
- The Transport Layer is between the application and network layers
- We'll focus initially on the transport-network connection, and see how UDP handles it
- Then we'll address techniques for verifying data transmission and reassembly of packets, and how TCP applies them
- Finally we'll cover congestion control techniques
3 Transport Layer
- The Transport Layer handles logical communication between processes
- It's the last layer not used for routing between processes, so it's the last thing a client process sees of a packet, and the first thing a server process sees
- By "logical communication," we recognize that the means used to get between processes, and the distance covered, are irrelevant
4 Transport vs. Network
- Notice we didn't say "hosts" in the previous slide; that's because
- The network layer provides logical communication between hosts
- Mail analogy:
- Let's assume cousins (processes) want to send letters to each other between their houses (hosts)
- They use their parents (transport layer) to mail the letters, and sort the mail when it arrives
5 Transport vs. Network
- The letters travel through the postal system (network layer) to get from house to house
- The transport layer doesn't participate in network layer activities (e.g., most parents don't work in mail distribution centers)
- The transport layer protocols are localized in the hosts
- Routing isn't affected by anything the transport layer added to the messages
6 Transport vs. Network
- Following the analogy, different people might have to pick up and sort the mail; they're like different transport layer protocols
- And the transport layer protocols (parents) are often at the mercy of what services the network layer (postal system) provides
- Some services can be provided at the transport layer even if the network layer doesn't provide them (e.g., reliable data transfer or encryption)
7 Two Choices
- Here we choose between TCP and UDP
- In the transport layer, a packet is called a segment
- In the network layer, a packet is called a datagram
- The network layer is home to the Internet Protocol (IP)
- IP provides logical communication between hosts
- IP makes a "best effort" to get segments where they belong; no guarantees of delivery, delivery sequence, or delivery integrity
8 IP
- Each host has an IP address
- The common purpose of UDP and TCP is to extend delivery of IP data to the host's processes
- This is called transport-layer multiplexing and demultiplexing
- Both UDP and TCP also provide error checking
- That's it for UDP: data delivery and error checking!
9 TCP
- TCP also provides reliable data transfer (not just data delivery)
- It uses flow control, sequence numbers, acknowledgements, and timers to ensure data is delivered correctly and in order
- TCP also provides congestion control
- TCP applications share the available bandwidth (they watched Sesame Street!)
- UDP takes whatever it can get (greedy little protocol)
10 Multiplexing and Demultiplexing
- At the destination host, the transport layer gets segments from the network layer
- It needs to deliver these segments to the correct process on that host
- It does so via sockets, which connect processes to the network
- Each socket has a unique identifier, whose format varies between UDP and TCP
11 Multiplexing and Demultiplexing
- Demultiplexing is getting the transport layer segment into the correct socket
- Hence multiplexing is taking data from various sockets, applying header info, breaking it into segments, and delivering it to the network layer
- Multiplexing and demultiplexing are used in any kind of network, not just in the Internet protocols
12 Multiplexing and Demultiplexing
13 Mail Analogy
- Multiplexing is when a parent collects letters from the cousins and puts them into the mail
- Demultiplexing is getting the mail, and handing the correct mail to each cousin
- Here we need unique socket identifiers, and some place in the header for the socket identifier information
14 Segment Header
- Hence the segment header starts with the source and destination port numbers
- Each port number is a 16-bit (2 byte) value (0 to 65,535)
- Well-known port numbers are from 0 to 1023 (2^10 of them)
- After the port numbers come other headers, specific to TCP or UDP, then the message
15 UDP Multiplexing
- UDP assigns a port number from 1024 to 65,535 to each socket, unless the developer specifies otherwise
- UDP identifies a socket only by destination IP address and destination port number
- The source and destination port numbers are swapped (inverted) when a reply is sent
- So a segment from port 19157 to port 46428 generates a reply from 46428 to 19157
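This port-swapping behavior can be seen with ordinary UDP sockets over loopback; a minimal sketch (the socket names and "query"/"reply" payloads are just for illustration):

```python
import socket

# Two UDP sockets on loopback stand in for a client and a server.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))            # OS picks an ephemeral port
server_name = server.getsockname()
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.bind(("127.0.0.1", 0))
client.settimeout(2)
server.settimeout(2)

client.sendto(b"query", server_name)
data, client_addr = server.recvfrom(1024)   # client_addr carries the source port
server.sendto(b"reply", client_addr)        # reply targets that same port
reply, reply_src = client.recvfrom(1024)    # reply's source is the server's port

print(reply, reply_src == server_name)
server.close()
client.close()
```

The reply's source address is the server's bound port, i.e. the port pair of the request is simply inverted.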
16 TCP Multiplexing
- TCP is messier, of course
- TCP identifies a socket by four values:
- Source IP address, source port number, destination IP address, and destination port number
- Hence if UDP gets two segments with the same destination IP and port number, they'll both go to the same process
- TCP tells the segments apart via source IP/port
17 TCP Multiplexing
- So if you have two HTTP sessions going to the same web server and page, how can TCP tell them apart?
- Even though the destination IP and port (80) are the same, and the two sessions (processes) have the same source IP address, they have different source port numbers
18 Web Servers and TCP
- Each new client connection often uses a new process and socket to send HTTP requests and get responses
- But threads (lightweight processes) can be used instead, so one process can have multiple sockets, one per thread
19 UDP
- The most minimal transport layer has to do multiplexing and demultiplexing
- UDP does this and a little error checking and, well, um, that's about it!
- UDP was defined in RFC 768
- An app that uses UDP almost talks directly to IP
- It adds only two small data fields to the header, after the requisite source/destination port numbers
- There's no handshaking; UDP is connectionless
20 UDP for DNS
- DNS uses UDP
- A DNS query is packaged into a segment and passed to the network layer
- The DNS app waits for a response; if it doesn't get one soon enough (times out), it tries another server or reports no reply
- Hence the app must allow for the unreliability of UDP by planning what to do if nothing comes back
21 UDP Advantages
- Still, UDP is good when:
- You want the app to have detailed control over what is sent across the network; UDP changes it little
- No connection establishment delay
- No connection state data in the end hosts, hence a server can support more UDP clients than TCP clients
- Small packet header overhead per segment
- TCP uses 20 bytes of header data per segment; UDP only 8 bytes
22 UDP Apps
- Other than DNS, UDP is also used for:
- Network management (SNMP)
- Routing (RIP)
- Multimedia and telephony (various protocols)
- Remote file service (NFS)
- The lack of congestion control in UDP can be a problem; when lots of large UDP messages are being sent, they can crowd out TCP apps
23 UDP Header
- The UDP header has four two-byte fields in two lines:
- Source port number, destination port number
- Length, checksum
- Length is the total length of the segment, including headers, in bytes
- The checksum is used by the receiving host to see if errors occurred
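Those four fields can be packed with Python's struct module; a sketch (the checksum is left at zero here rather than computed, and the port numbers are arbitrary):

```python
import struct

def udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Build the 8-byte UDP header: four 16-bit big-endian fields.
    Length covers the header plus the payload, in bytes."""
    length = 8 + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)

hdr = udp_header(19157, 53, b"example DNS query")
print(len(hdr), struct.unpack("!HHHH", hdr))
```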
24 Checksum
- Noise in the transmission lines can lose bits of data or rearrange them in transit
- Checksums are a common method to detect errors (RFC 1071)
- To create a checksum:
- Find the sum of the 16-bit words of the message
- The checksum is the 1s (ones') complement of the sum
- If the message is uncorrupted, the sum of the message plus the checksum is all ones: 1111111111111111
25 1s Complement?
- The 1s complement is a mirror image of a binary number: change all the zeros to ones, and ones to zeros
- So the 1s complement of 00101110101 is 11010001010
- UDP does error checking because not all lower layer protocols do error checking
- This provides end-to-end error checking, which is more efficient than checking at every step along the way
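The checksum procedure can be sketched in Python; this is a simplified RFC 1071-style ones' complement sum over 16-bit words (real UDP also covers a pseudo-header with the IP addresses, omitted here):

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return ~total & 0xFFFF

msg = b"UDP segment!"
csum = internet_checksum(msg)
# Verification: the folded sum of the message plus its checksum is all
# ones, so complementing it again gives zero.
print(hex(csum), internet_checksum(msg + csum.to_bytes(2, "big")))
```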
26 UDP
- That's it for UDP!
- The port addresses, the message length, and a checksum to see if it got there intact
- Now see what happens when we want reliable data transfer
27 Reliable Data Transfer
- Distinguish between the service model and how it's really implemented
- Service model: from the app's perspective, it just wants a reliable transport layer to connect sending and receiving processes
- Service implementation: in reality, the transport layer has to use an unreliable network layer (IP), so transport has to make up for the unreliability below it
28 Reliable Data Transfer
- The sending process will give the transport layer a message via rdt_send (rdt = reliable data transfer)
- The transport protocol will convert it to udt_send (udt = unreliable data transfer; Fig 3.8 has a typo) and give it to the network layer
- At the receiving end, the protocol gets rdt_rcv from the network layer
- The protocol will convert it to deliver_data and give it to the receiving application process
29 Reliable Data Transfer
30 Reliable Data Transfer
- Here we'll refer to the data as packets, rather than distinguishing segments, etc.
- Also, we'll pretend we only have to send data in one direction (unidirectional data transfer)
- Bidirectional data transfer is what really occurs, but with the sending and receiving sides switched
- Time to build a reliable data transfer protocol, one piece at a time
31 Reliable Data Transfer v1.0
- For the simplest case, called rdt1.0, assume the network is completely reliable
- Finite state machines (FSMs) for the sender and receiver each have one state: waiting for a call
- The sending side (rdt_send) makes a packet (make_pkt) and sends it (udt_send)
- The receiving side (rdt_rcv) extracts data from the packet (extract), and delivers it to the receiving app (deliver_data)
32 Reliable Data Transfer v1.0
- Here a packet is the only unit of data
- No feedback to the sender is needed to confirm receipt of data, and no control over the transmission rate is needed
33 Reliable Data Transfer v2.0
- Now allow bit errors in transmission
- But all packets are received, in the correct order
- Need acknowledgements to know when a packet was correct ("OK," "10-4") versus when it wasn't ("please repeat"); these are called positive and negative acknowledgements, respectively
- These types of messages are typical for any Automatic Repeat reQuest (ARQ) protocol
34 Reliable Data Transfer v2.0
- So allowing for bit errors requires three capabilities:
- Error detection, to know if a bit error occurred
- Receiver feedback, both positive (ACK) and negative (NAK) acknowledgements
- Retransmission of incorrect packets
35 Reliable Data Transfer v2.0
36 Reliable Data Transfer v2.0
- Sending FSM (cont.)
- The left state waits for a packet from the sending app, and makes a packet with a checksum (make_pkt)
- Then the left state sends the packet (udt_send)
- It moves to the other state (waiting for ACK/NAK)
- If it gets a NAK response (errors detected), then it resends the packet (udt_send) until it gets it right
- If it gets an ACK response (no errors), then it goes back to the first state to wait for the next packet from the app
37 Reliable Data Transfer v2.0
- Notice this model does nothing until it gets the NAK/ACK, so it's a stop-and-wait protocol
- Receiving FSM:
- The receiving side uses the checksum to see if the packet was corrupted
- If it was (corrupt), send a NAK response
- If it wasn't (notcorrupt), extract and deliver the data, and send an ACK response
- But what if the NAK/ACK is corrupted?
38 Reliable Data Transfer v2.0
- Three possible ways to handle NAK/ACK errors:
- Add another type of response to have the NAK/ACK repeated; but what if that response got corrupted? Leads to a long string of messages
- Add checksum data to the NAK/ACK, plus data to recover from the error
- Resend the packet if the NAK/ACK is garbled; but this introduces possible duplicate packets
39 Reliable Data Transfer v2.1
- TCP and most reliable protocols add a sequence number to the data from the sender
- Since we can't lose packets yet, a one-bit number is adequate to tell if this is a new packet or a repeat of the previous one
- This gives our new model, rdt version 2.1
40 Reliable Data Transfer v2.1 (sender)
41 Reliable Data Transfer v2.1
- Now the number of states is doubled, since we have to handle sequence numbers 0 and 1
- So in make_pkt(1, data, checksum), the 1 is the sequence number
- The sequence number alternates 0-1-0-1 if everything works; if a packet is corrupted, the same sequence number is expected two or more times
- Start at the "Wait for call 0" state; when a packet arrives, send it to the network with sequence 0
- Then wait for an ACK or NAK with sequence 0
42 Reliable Data Transfer v2.1
- If the packet was corrupt, or got a NAK, resend that packet (upper right loop)
- Otherwise wait for a call with sequence 1 from the app
- When call 1 is received, make and send the packet with sequence 1 (desired outcome)
- Then wait for a NAK/ACK with sequence 1
- If corrupt or got a NAK, resend (lower left loop)
- Otherwise go back to waiting for a sequence 0 call from the app
- Repeat the cycle
43 Reliable Data Transfer v2.1 (receiver)
44 Reliable Data Transfer v2.1
- The receiver side also doubles its number of states
- When in the "wait for seq 0" state:
- If the packet has sequence 0 and isn't corrupt, extract and deliver the data, send an ACK, and go to the "wait for seq 1" state
- If the packet was corrupt, reply with a NAK
- If the packet has sequence 1 and was not corrupt (it's out of order), send an ACK and keep waiting for a seq 0 packet
- Mirror the above when starting from the "wait for seq 1" state
45 Reliable Data Transfer v2.2
- We could achieve the same effect without a NAK (for a corrupt packet) if we only ACK the last correctly received packet
- Two ACKs for the same packet (duplicate ACKs) means the packet after the one ACKed twice wasn't received correctly
- The NAK-free protocol is called rdt2.2
46 Reliable Data Transfer v2.2
47 Reliable Data Transfer v2.2
- Again, the send and receive FSMs are symmetric for sequence 0 and 1
- The sender must now check the sequence number of the packet being ACK'd (see the isACK message)
- The receiver must include the sequence number in the make_pkt message
- The FSM on page 211 also has a oncethru variable to help avoid duplicate ACKs
48 Reliable Data Transfer v3.0
- Now account for the possibility of lost packets
- Need to detect packet loss, and decide what to do about it
- The latter is easy with the tools we have (ACK, checksum, sequence number, and retransmission), but we need a new detection mechanism
- Of many possible loss detection approaches, focus on making the sender responsible for it
49 Reliable Data Transfer v3.0
- The sender thinks a packet is lost when the packet doesn't get to the receiver, or the ACK gets lost
- We can't wait for the worst-case transmission time, so pick a reasonable time before error recovery is started
- This could result in duplicate packets if one was still on the way, but rdt2.2 can handle that
- For the sender, retransmission is the ultimate solution, whether the packet or the ACK was lost
50 Reliable Data Transfer v3.0
- Knowing when to retransmit needs a countdown timer
- Count time from sending a packet until still not getting an ACK
- If the time is exceeded, retransmit that packet
- This works the same whether the packet is lost or the ACK is lost
- Since packet sequence numbers alternate 0-1-0-1-etc., this is called an alternating-bit protocol
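The alternating-bit behavior can be sketched as a toy simulation; the loss pattern below is a made-up deterministic schedule standing in for real losses, and a "lost" transmission simply triggers a retry, as a timer expiring would:

```python
# Toy alternating-bit (rdt3.0-style) simulation. Each message is
# transmitted until one copy "gets through"; the sequence bit flips
# only after a successful delivery.
def rdt3_send(messages, loss_pattern):
    """loss_pattern: booleans consumed per transmission; True = lost."""
    lost = iter(loss_pattern)
    delivered, seq = [], 0
    for msg in messages:
        while True:
            if next(lost, False):         # packet (or its ACK) lost
                continue                   # timeout fires -> retransmit
            delivered.append((seq, msg))  # receiver gets it and ACKs seq
            break
        seq ^= 1                           # alternate 0-1-0-1...
    return delivered

out = rdt3_send(["a", "b", "c"], [False, True, True, False, False])
print(out)  # [(0, 'a'), (1, 'b'), (0, 'c')]
```

Note that "b" is transmitted three times (two losses, then success), yet is delivered exactly once, with sequence bit 1.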
51 Reliable Data Transfer v3.0 (sender)
52 Reliable Data Transfer v3.0
- How does the receiver FSM differ from rdt2.2?
- Notice that, even allowing for lost packets, we still assume only one packet is sent completely and correctly at a time
- But rdt3.0 still stops to wait for the timeout of each packet; fix this with pipelining
53 Pipelined RDT
- Suppose we implemented rdt3.0 between NYC and LA
- The distance of 3000 miles gives an RTT of about 30 ms
- Say the transmission rate is 1 Gbps, and packets are 1 kB (8 kb)
- Transmission time is therefore only 8 kb / 1E9 b/s = 8 microseconds
- Even if ACK messages are very small (transmission time about zero), the time for one packet to be sent and ACKed is 30.008 ms
54 Pipelined RDT
- Hence we're transmitting 0.008 ms out of the 30.008 ms RTT, which equals 0.027% utilization
- How a protocol is implemented drastically affects its usefulness!
- It makes sense to send multiple packets and keep track of the ACKs for each
- Methods to do so are Go-Back-N (GBN) and selective repeat
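The utilization arithmetic from the NYC-LA example can be checked directly:

```python
# Stop-and-wait utilization: RTT = 30 ms, link rate 1 Gbps,
# packet size 1 kB = 8,000 bits.
rtt = 0.030                       # seconds
rate = 1e9                        # bits/sec
packet_bits = 8000
d_trans = packet_bits / rate      # transmission time: 8 microseconds
utilization = d_trans / (rtt + d_trans)
print(f"{d_trans * 1e6:.0f} us, utilization = {utilization:.5%}")
```

This reproduces the roughly 0.027% figure quoted above: the sender is idle for nearly the entire round trip.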
55 Go-Back-N
- In this protocol, the sender can send up to N packets without getting an ACK
- N is also called a window size, and the protocol is a.k.a. a sliding-window protocol
- Let base be the number of the first packet in the window
- The window size, N, is already defined
- Then all packets from 0 to base-1 have already been sent and acknowledged
- Why a limit at all? The need for flow and congestion control, covered later.
56 Go-Back-N
- The window currently covers packets numbered base to base+N-1; these packets can be sent before their ACK is received
- Packet sequence numbers need to have a maximum value: if k bits are in the sequence number field, the range of sequence numbers is 0 to 2^k - 1
- The sequence numbers are used in a circle, so after 2^k - 1 you use 0 again, then 1, etc.
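The circular use of a k-bit sequence-number field is just arithmetic modulo 2^k; a quick sketch:

```python
# With k bits, sequence numbers run 0 .. 2**k - 1 and then wrap to 0.
def next_seq(seq: int, k: int) -> int:
    return (seq + 1) % (2 ** k)

k = 3                      # a 3-bit field: sequence numbers 0..7
seqs, s = [], 6
for _ in range(4):         # step past the wrap-around point
    seqs.append(s)
    s = next_seq(s, k)
print(seqs)  # [6, 7, 0, 1]
```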
57 Go-Back-N
- In contrast, rdt3.0 only had sequence numbers 0 and 1
- TCP has a 32-bit sequence number range for the bytes in a byte stream
- Now the FSMs for Go-Back-N (GBN) follow
- The sender must respond to:
- A call from above (i.e., the app)
- Receipt of an ACK for any of the outstanding packets
- A timeout, which causes all un-ACKed packets to be resent
58 Go-Back-N Sender
59 Go-Back-N
- The GBN receiver does this:
- If a packet is correct and in order, send an ACK
- The sender moves the window up with each correct and in-order packet ACKed; this minimizes resending later
- In all other cases, throw away the packet, and resend the ACK for the most recent correct packet
- Hence we throw away correct but out-of-order packets; this makes receiver buffering easier
60 Go-Back-N
- GBN can be implemented in event-based programming; events here are:
- The app invokes rdt_send
- The receiver protocol receives rdt_rcv
- Timer interrupts
- In contrast, consider the selective repeat (SR) approach to pipelining
61 Selective Repeat
- A large window size and bandwidth-delay product can put a lot of packets in the pipeline under GBN, which can cause a lot of retransmission when a packet is lost
- Selective repeat only retransmits packets believed to be in error, so retransmission is on a more individual basis
- To do this, buffer out-of-order packets until the missing packets are filled in
62 Selective Repeat
- SR still uses a window of size N packets
- The SR sender responds to:
- Data from the app above; it finds the next available sequence number, and sends as soon as possible
- Timeout, which is kept for each packet individually
- An ACK received from the receiver; the sender then marks off that packet and moves the window forward, so it can transmit packets inside the new window
63 Selective Repeat
- The SR receiver responds to:
- A packet within the current window: send an ACK; deliver packets at the bottom of the window, but buffer higher-numbered (out-of-order) packets
- Packets that were previously ACKed are ACKed again
- Otherwise ignore the packet
- Notice the sender and receiver windows are generally not the same!
64 Selective Repeat
- It's possible for the sequence number range and window size to be too close, producing confusing signals
- To prevent this, the window size must be at most half of the sequence number range
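As a sketch of the rule, with a k-bit field the selective-repeat window can be at most half the sequence-number space (the function name is just illustrative; with a larger window, a retransmitted old packet would be indistinguishable from a new one):

```python
def max_sr_window(k_bits: int) -> int:
    """Largest safe SR window: half the 2**k sequence-number space."""
    return (2 ** k_bits) // 2

# A 2-bit field (seq 0..3) allows a window of 2; 3 bits (0..7) allow 4.
print(max_sr_window(2), max_sr_window(3))  # 2 4
```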
65 Packet Reordering
- Our last assumption was that packets arrive in order, if at all
- What if they arrive out of order?
- Out-of-order packets could have sequence numbers outside of either window (send or receive)
- Handle this by not allowing packets older than some maximum time
- TCP typically uses 3 minutes
66 Reliable Data Transfer Mechanisms
- Checksum, to detect bit errors in a packet
- Timer, to know when a packet or its ACK was lost
- Sequence number, to detect lost or duplicate packets
- Acknowledgement, to know a packet got to the receiver correctly
- Negative acknowledgement, to tell that a packet was received but corrupted
- Window, to pipeline many packets at once before an ACK is received for any of them
67 TCP Intro
- Now see how all this applies to TCP
- RFC 793
- Invented circa 1974 by Vint Cerf and Robert Kahn
- TCP starts with a handshake protocol, which defines many connection variables
- The connection exists only at the hosts, not in between
- Routers are oblivious to whether TCP is used!
- TCP is a full duplex service: data can flow in both directions at once
68 TCP Intro
- TCP is point-to-point, between a single sender and a single receiver
- This is in contrast with multipoint technologies
- TCP is client/server based
- The client needs to establish a socket to the server's hostname and port
- Recall that default port numbers are app-specific
- Special segments are sent by the client, server, and client again to make the three-way handshake
69 TCP Intro
- Once the connection exists, processes can send data back and forth
- The sending process sends data through the socket to the TCP send buffer
- TCP sends data from the send buffer when it feels like it
- The Max Segment Size (MSS) is based on the max frame size, or Max Transmission Unit (MTU)
- We want one TCP segment to eventually fit in the MTU
70 TCP Intro
- Typical MTU values are 512 to 1460 bytes
- The MSS is the max app data that can fit in a segment, not the total segment size (which includes headers)
- TCP adds headers to the data, creating TCP segments
- Segments are passed to the network layer to become IP datagrams, and so on into the network
71 TCP Intro
- At the server side, the segment is placed in the receive buffer
- So a TCP connection consists of two buffers (send and receive), some variables, and a socket connection on each of the corresponding processes
72 TCP Segment Structure
- A TCP segment consists of header fields and a data field
- The data field size is limited by the MSS
- Typical header size is 20 bytes
- The header is 32 bits wide (4 bytes), so it has five lines at a minimum
73 TCP Segment Structure
- The header lines are:
- Source and destination port numbers (16 bits each)
- Sequence number (32 bits)
- ACK number (32 bits)
- A bunch of little stuff (header length, URG, ACK, PSH, RST, SYN, and FIN bits), then the receive window (16 bits)
- Internet checksum and urgent data pointer (16 bits each)
- And possibly several options
74 TCP Segment Structure
- We've seen the port numbers (16 bits each), and the sequence and ACK numbers (32 bits each)
- The "bunch of little stuff" includes:
- Header length (4 bits)
- A flag field with six one-bit fields: ACK, RST, SYN, FIN, PSH, and URG
- The URG bit marks urgent data later on that line
- The receive window is used for flow control
75 TCP Segment Structure
- The checksum is used for bit error detection, as with UDP
- The urgent data pointer tells where the urgent data is located
- The options include negotiating the MSS, scaling the window size, or time stamping
76 TCP Sequence Numbers
- The sequence numbers are important for TCP's reliability
- TCP views data as an unstructured but ordered stream of bytes
- Hence the sequence number for a segment is the byte-stream number of the first byte in the segment
- Yes, each byte is numbered!
77 TCP Sequence Numbers
- So if the MSS is 1000 bytes, the first segment will be number 0, and cover bytes 0 to 999
- The second segment is number 1000, and covers bytes 1000-1999
- The third is number 2000, and covers 2000-2999, etc.
- Typically both sides start their sequences at random numbers, to avoid accidental overlap with previously used numbers
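The byte-stream numbering can be sketched as follows (file size and MSS values taken from the example above; initial_seq defaults to 0 for readability, though real TCP starts at a random number):

```python
# Sequence number of each segment = byte-stream number of its first byte.
def segment_sequences(file_size: int, mss: int, initial_seq: int = 0) -> list:
    return [initial_seq + offset for offset in range(0, file_size, mss)]

print(segment_sequences(3000, 1000))        # [0, 1000, 2000]
print(segment_sequences(2500, 1000, 100))   # [100, 1100, 2100]
```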
78 TCP Acknowledgement No.
- TCP acknowledgement numbers are weird
- The number used is the next byte number expected from the sender
- So if host B sends bytes 0-535 of data to host A, host A expects byte 536 to be the start of the next segment, so 536 is the ack number
- This is a cumulative acknowledgement, since it only goes up to the first missing byte in the byte-stream
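A cumulative ACK is simply the next byte after the first gap in what has been received so far; a small sketch (the helper name and the (first, last) tuple format are illustrative):

```python
# Cumulative ACK: acknowledge the next byte expected, i.e. the byte
# just past the first gap in the received byte ranges.
def cumulative_ack(received_ranges) -> int:
    """received_ranges: list of (first_byte, last_byte) tuples."""
    expected = 0
    for first, last in sorted(received_ranges):
        if first > expected:
            break                       # gap found; stop here
        expected = max(expected, last + 1)
    return expected

print(cumulative_ack([(0, 535)]))              # 536
print(cumulative_ack([(0, 535), (900, 999)]))  # still 536: gap at 536
```

Note the second call: even though bytes 900-999 have arrived, the ACK stays at 536 because bytes 536-899 are missing.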
79 TCP Out-of-Order Segments
- What does TCP do when segments arrive out of order?
- That's up to the TCP implementer
- TCP can either discard out-of-order segments, or keep the strays in a buffer and wait for the gaps to get filled in
- The former is easier to implement; the latter is more efficient and commonly used
80 Telnet Example
- Telnet (RFC 854) is an old app for remote login via TCP
- Telnet interactively echoes whatever was typed, to show it got to the other side
- Host A is the client, and starts a session with Host B, the server
- Suppose the client starts with sequence number 42, and the server with 79
81 Telnet Example
- The user types a single letter, 'c'
- Notice how the seq and ack numbers mirror or piggyback on each other
82 Timeout Calculation
- TCP needs a timeout interval, as discussed in the rdt example, but how long?
- Longer than the RTT, but how much longer? A week?
- Measure a SampleRTT for segments here and there (not every one)
- This SampleRTT value will fluctuate, with an average value called EstimatedRTT, which is updated with each measurement (a moving average)
83 Timeout Calculation
- Naturally, EstimatedRTT is a smoother curve than the SampleRTTs
- EstimatedRTT = 0.875 × EstimatedRTT + 0.125 × SampleRTT
- The variability of RTT is measured by DevRTT, the moving average of the magnitude of the difference between SampleRTT and EstimatedRTT:
- DevRTT = 0.75 × DevRTT + 0.25 × |SampleRTT - EstimatedRTT|
84 Timeout Calculation
- We want the timeout interval larger than EstimatedRTT, but not huge; use:
- TimeoutInterval = EstimatedRTT + 4 × DevRTT
- This is analogous to control charts, where a measurement is expected to exceed the (mean + 3 × the standard deviation) only about 1/4 of 1% of the time
- DevRTT isn't a standard deviation, but the idea is similar
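The EstimatedRTT, DevRTT, and TimeoutInterval updates can be sketched as simple moving averages (alpha = 0.125 and beta = 0.25 as in the formulas above; the ordering of the two updates, the starting state, and the sample values are illustrative):

```python
# EWMA RTT estimation and timeout, per the slide formulas.
def update_rtt(estimated, dev, sample, alpha=0.125, beta=0.25):
    estimated = (1 - alpha) * estimated + alpha * sample   # smooth the mean
    dev = (1 - beta) * dev + beta * abs(sample - estimated)  # smooth the spread
    return estimated, dev, estimated + 4 * dev             # timeout interval

est, dev = 0.100, 0.010                  # seconds: made-up starting state
for sample in (0.120, 0.090, 0.150):     # made-up SampleRTT measurements
    est, dev, timeout = update_rtt(est, dev, sample)
print(round(est, 4), round(timeout, 4))
```

Note how the late 0.150 s sample pushes DevRTT, and hence the timeout, up much faster than it moves EstimatedRTT itself.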
85 Timeout Calculation
- Notice this means the timeout interval is constantly being recalculated, which requires frequent measurement of SampleRTT to find current values for:
- EstimatedRTT, DevRTT, and TimeoutInterval
86 Reliable Data Transfer
- IP is not a reliable datagram service
- It doesn't guarantee delivery, in-order delivery, or intact delivery
- In theory we saw that separate timers for each segment would be nice; in reality TCP uses one retransmission timer for several segments (RFC 2988)
- For the next example, assume Host A is sending a big file to Host B
87 Simplified TCP
- Here the sender responds to three events:
- Data received from the application
- Then it makes segments of the data, each with a sequence number, and passes them to the IP layer
- It starts the timer if it's not already running
- Timer timeout
- Then it re-sends the segment that timed out
- ACK received
- It compares the received ACK value with SendBase, the smallest sequence number not yet acknowledged
- It restarts the timer if any un-ACKed segments are left
88 Simplified TCP
- Even this version of TCP can successfully handle lost ACKs by ignoring duplicate segments (Fig 3.34, p. 242)
- If a segment times out, later segments don't get re-sent (Fig 3.35, p. 243)
- From a later cumulative ACK, a lost ACK can still be deduced not to be a lost segment (Fig 3.36, p. 244)
89 Doubling Timeout
- After a timeout event, many TCP implementations double the timeout interval
- This helps with congestion control, since a timeout is often due to congestion, and retransmitting often makes it worse!
90 Fast Retransmit
- Waiting for the timeout can be too slow
- The sender might know to retransmit sooner if it gets duplicate ACKs
- A duplicate ACK for a given byte number means a gap was noted in the segment sequence (since there are no NAKs in TCP)
- Getting three duplicate ACKs typically forces a fast retransmit of the segment after that value
91 Go-Back-N vs. Selective Repeat?
- TCP partly looks like Go-Back-N (GBN)
- It tracks the oldest sequence number transmitted but not ACKed (SendBase) and the sequence number of the next byte to send (NextSeqNum)
- TCP partly looks like Selective Repeat (SR)
- It often buffers out-of-order segments to limit the range of segments retransmitted
- TCP can use selective acknowledgment (RFC 2018) to specify which segments arrived out of order
92 Flow Control
- TCP connection hosts maintain a receive buffer for bytes received correctly and in order
- Apps might not read from the buffer for a while, so it can overflow
- Flow control focuses on preventing overflow of the receive buffer
- So it also depends on how fast the receiving app is reading the data!
93 Flow Control
- Hence the sender in TCP maintains a receive window (RcvWindow) variable: how much room is left in the receive buffer
- The receive buffer has size RcvBuffer
- The last byte number read by the receiving app is LastByteRead
- The last byte put in the receive buffer is LastByteRcvd
- RcvWindow = RcvBuffer - (LastByteRcvd - LastByteRead)
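That last formula is easy to check numerically (the buffer size and byte counts below are made-up values):

```python
# Receive-window bookkeeping from the slide:
# RcvWindow = RcvBuffer - (LastByteRcvd - LastByteRead)
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

# 4096-byte buffer, 1000 bytes received, app has read 200 of them:
# 800 bytes sit unread, leaving 3296 bytes of room.
rwnd = rcv_window(4096, 1000, 200)
print(rwnd)  # 3296
```

When the app reads nothing and the buffer fills completely, the window drops to zero and the sender must stop.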
94 Flow Control
- So the amount of room in RcvWindow varies with time, and is returned to the sender in the receive window field of every segment (see slide 73)
- The sender also keeps track of LastByteSent and LastByteAcked; the difference between them is the amount of data in flight between sender and receiver
- Keep that difference less than RcvWindow to make sure the receive buffer isn't overflowed:
- LastByteSent - LastByteAcked <= RcvWindow
95 Flow Control
- If RcvWindow goes to zero, the sender can't send more data to the receiver, ever!
- To prevent this, TCP makes the sender transmit one-byte messages when RcvWindow is zero, so that the receiver can communicate when the buffer is no longer full
96 UDP Flow Control
- There ain't none (sic!)
- UDP adds newly arrived segments to a buffer in front of the receiving socket
- If the buffer gets full, segments are dropped
- Bye-bye data!
97 TCP Connection Management
- Now look at the TCP handshake in detail
- It's important since many security threats exploit it
- Recall the client process wants to establish a connection with a server process
- Step 1: the client sends a segment with SYN = 1 and an initial sequence number (client_isn) to the server
- Choosing a random client_isn is key for security
98 TCP Connection Management
- Step 2: the server allocates the variables needed for the connection, and sends a connection-granted segment, the SYNACK, to the client
- This SYNACK segment has SYN = 1, the ack field set to client_isn + 1, and the server's choice of its own initial sequence number (server_isn)
- Step 3: the client gets the SYNACK segment, and allocates its buffers and variables
- The client sends a segment with ack value server_isn + 1, and SYN = 0
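The sequence/ack arithmetic of the three steps can be sketched with plain dictionaries; the field names are illustrative, not a real TCP implementation:

```python
import random

# Both sides pick random initial sequence numbers (ISNs).
client_isn = random.randrange(2 ** 32)
server_isn = random.randrange(2 ** 32)

# Step 1: client -> server, SYN with the client's ISN.
syn = {"SYN": 1, "seq": client_isn}
# Step 2: server -> client, SYNACK acknowledging client_isn + 1.
synack = {"SYN": 1, "seq": server_isn, "ack": client_isn + 1}
# Step 3: client -> server, ACK of server_isn + 1; SYN drops to 0.
ack = {"SYN": 0, "seq": client_isn + 1, "ack": server_isn + 1}

print(synack["ack"] - syn["seq"], ack["ack"] - synack["seq"])  # 1 1
```

Each ack field is exactly one more than the sequence number it acknowledges, regardless of which random ISNs were chosen.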
99 TCP Connection Management
- The SYN bit stays 0 while the connection is open
- Why is a three-way handshake used?
- Why isn't two-way enough?
- Now look at closing the connection
- Either the client or the server can close the connection
100 TCP Connection Management
- One host, let's say the client, sends a segment with the FIN bit set to 1
- The server acknowledges this with a return segment, then sends a separate shutdown segment (also with FIN = 1)
- The client acknowledges the shutdown from the server, and resources in both hosts are deallocated
101 TCP State Cycle
- Another way to view the history of a TCP connection is through its state changes
- The connection starts Closed
- After the handshake is completed, it's Established
- Then the processes communicate
- Sending or receiving a FIN = 1 starts the closing process, until both sides get back to Closed
- Whoever sent a FIN waits some period (30-120 s) after ACKing the other host's FIN before closing their connection
102 Stray Segments
- Receiving a SYN segment trying to open an unknown or closed port results in:
- The server sends a reset segment (RST = 1), meaning "go away, that port isn't open"
- Similarly, a UDP packet to an unknown socket results in sending a special ICMP datagram (see next chapter)
- Nmap is a good tool for scanning for TCP or UDP ports left open on a server
103 Congestion Control
- Now address congestion control issues
- Congestion is a traffic jam in the middle of the network somewhere
- The most common cause is too many sources sending data too fast into the network
- Let's look at increasingly complex scenarios
- Suppose two hosts (A and B) send data across a single router to hosts C and D
- Host A sends only to host D; B sends only to C
104 Congestion Control
- Assume:
- Hosts A and B send data at a steady rate of λin bytes/sec each, with no retransmission
- The router has a capacity of C bytes/sec, and infinite buffer space for excess packets
- The throughput for data received at C and D is λout bytes/sec
- Each link will get up to half of the router capacity, and can deliver data no faster
105Congestion Control
- The delay between hosts (A-D or B-C) becomes
infinite once the send rate exceeds the router's
capacity, since the infinite buffer just keeps
filling
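A toy simulation makes this concrete (my own illustrative numbers, not from the slides): when the per-tick arrival rate exceeds what the router can forward per tick, the backlog, and hence the queuing delay, grows without bound; below capacity it stays empty.

```python
# Toy queue: lam_in packets arrive per tick; the router forwards at
# most R per tick; leftover packets accumulate in the (infinite) buffer.
def backlog_after(ticks, lam_in, R):
    queue = 0
    for _ in range(ticks):
        queue += lam_in         # arrivals this tick
        queue -= min(queue, R)  # router drains at most R per tick
    return queue

print(backlog_after(100, lam_in=12, R=10))  # backlog grows 2 per tick: 200
print(backlog_after(100, lam_in=8,  R=10))  # under capacity: 0
```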
106Congestion Control
- Hence as a network reaches its capacity, the
queuing delays become very large!
- Now for scenario 2
- Allow retransmission of data, make the router
buffer finite in size, and let the router rate be
R bytes/sec
- Now the send rate for hosts A and B is the offered
load, λin′ bytes/sec (equal to the steady rate
plus the retransmission rate)
107Congestion Control
- Throughput depends on how packet retransmission
is handled
- Case a. below assumes no packet loss ever, and
hence no retransmission
108Congestion Control
- Case b. assumes retransmission only when a packet
is really lost
- At max throughput, each link gets 0.333R
bytes/sec of new data, and 0.166R bytes/sec of
retransmissions
- Case c. assumes that retransmissions may be done
for delayed packets, hence there can be duplicate
packets delivered to the receiver
- The data shown assume each packet gets delivered
twice, hence the max rate is R/4 instead of the
ideal R/2
109Congestion Control
- Key lessons from cases b and c are
- A congested network forces retransmissions for
packets lost due to buffer overflow, which adds
to the congestion
- A congested network can waste its bandwidth by
sending duplicate packets which weren't lost in
the first place
- Now let's make it a really messy problem
110Congestion Control
- Scenario 3 has four hosts again, but now four
routers (R1 to R4)
- Hosts A and B only connect to C and D,
respectively, but across opposite routes, e.g.
- Host A → R1 → R2 → Host C
- Host C → R3 → R4 → Host A
- Host B → R2 → R3 → Host D
- Host D → R4 → R1 → Host B
111Congestion Control
112Congestion Control
- Assume all hosts transmit at rate λin bytes/sec,
and all routers have capacity R bytes/sec
- With little traffic being transmitted, the data
rate received at each host, λout bytes/sec, is
about the same as λin
- With lots of traffic, consider one router, R2
113Congestion Control
- Router R2 could have traffic from R1 arriving at
up to R bytes/sec
- But it also gets traffic from B toward D,
competing for the same buffer space
- Hence B-D traffic could overwhelm R2, and cut off
traffic on the A-C path, forcing everyone's
output to zero
114Congestion Control
- The work done by a router like R1 to forward a
packet is wasted when a later router (R2) drops
that packet
- The lesson: dropping a packet wastes the
transmission capacity of every upstream link that
packet crossed
- So what are our approaches for dealing with
congestion?
115Congestion Control Approaches
- Either the network provides explicit support for
congestion control, or it doesn't
- End-to-end congestion control is used when the
network doesn't provide explicit support
- The presence of congestion is inferred from
packet loss, delays, etc.
- Since TCP uses IP, this is our only option right
now
116Congestion Control Approaches
- Network-assisted congestion control is when
network components (e.g. routers) provide
congestion feedback explicitly
- IBM SNA, DECnet, and ATM use this, and proposals
for improving TCP/IP have been made
- Network equipment may provide various levels of
feedback
- Send a choke packet to tell the sender they're full
- Flag existing packets to indicate congestion
- Tell what transmission rate the router can
support at the moment
117ATM ABR Congestion Control
- ATM Available Bit-Rate (ABR) is one method of
network-assisted congestion control
- It uses a combination of virtual circuits (VC)
and resource management (RM) cells (packets) to
convey congestion information along the VC
- Data cells (packets) contain a congestion bit to
prompt sending an RM cell back to the sender
- Other bits convey whether the congestion is mild
(don't increase traffic) or severe (back off), or
tell the max rate supported along the circuit
118TCP Congestion Control
- As noted, TCP uses end-to-end congestion control,
since IP provides no congestion feedback to the
end systems
- In TCP, each sender limits its send rate based
on its perceived amount of congestion
- Each side of a TCP connection has a send buffer,
receive buffer, and several variables
- Each side also has a congestion window variable,
CongWin
119TCP Congestion Control
- The max send rate for a sender is the minimum of
CongWin and the RcvWindow
- LastByteSent − LastByteAcked ≤ min(CongWin,
RcvWindow)
- Assume for the moment that the RcvWindow is
large, so we can focus on CongWin
- If loss and transmission delay are small,
CongWin bytes of data can be sent every RTT,
for a send rate of CongWin/RTT bytes/sec
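A quick worked example of the CongWin/RTT rate, using assumed numbers (a 10-segment window of 1460-byte segments, 100 ms RTT; neither value is from the slides):

```python
# Back-of-the-envelope send rate: CongWin bytes per round-trip time.
cong_win = 10 * 1460        # 10 segments of MSS = 1460 bytes = 14600 bytes
rtt = 0.1                   # 100 ms round-trip time, in seconds
send_rate = cong_win / rtt  # bytes/sec; about 146000, i.e. roughly 1.17 Mbps
print(send_rate)
```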
120TCP Congestion Control
- Now we address how to detect congestion
- Call it a loss event when a timeout occurs or
three duplicate ACKs are received
- Congestion causes loss events in the network
- If there's no congestion, lots of happy ACKs tell
TCP to increase CongWin quickly, and hence the
transmission rate
- Conversely, slow ACK receipt slows CongWin's
increase
121TCP Congestion Control
- TCP is self-clocking, since it measures its own
feedback (ACK receipt) to determine changes in
CongWin
- Now look at how TCP defines its congestion
control algorithm in three major parts
- Additive-increase, multiplicative-decrease
- Slow start
- Reaction to timeout events
122Additive-increase, Multiplicative-decrease
- When a loss event occurs, CongWin is halved,
unless it approaches 1.0 MSS; this process is
called multiplicative-decrease
- When there's no perceived congestion, TCP
increases CongWin slowly, adding 1 MSS each RTT;
this is additive-increase
- Collectively they are the AIMD algorithm
- Recall MSS = maximum segment size
123AIMD Algorithm
- Over a long TCP connection, when theres little
congestion, AIMD will result in slow rises in
CongWin, followed by a cut in half when a loss
event occurs, followed by another slow rise,
etc., producing a grumpy sawtooth wave
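That sawtooth is easy to reproduce with a toy AIMD trace (assumed parameters: window counted in MSS units, a loss event whenever the window reaches 16 MSS):

```python
# Toy AIMD: +1 MSS per RTT, halve on a loss event. Here a loss is
# assumed to occur whenever CongWin reaches 16 MSS.
def aimd_trace(rtts, start=8, loss_at=16):
    win, trace = start, []
    for _ in range(rtts):
        trace.append(win)
        if win >= loss_at:
            win = max(1, win // 2)  # multiplicative decrease
        else:
            win += 1                # additive increase: +1 MSS per RTT
    return trace

# Climbs 8..16, halves to 8, climbs again: the sawtooth.
print(aimd_trace(20))
```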
124Slow Start
- The initial send rate is typically 1 MSS/RTT,
which is really slow
- To avoid a really long ramp-up to a fast rate, an
exponential increase in CongWin is used until the
first loss event occurs
- CongWin doubles every RTT during slow start
- Then the AIMD algorithm takes over
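Slow start's doubling looks like this in miniature (the point at which the first loss occurs, 64 MSS here, is an arbitrary assumption for illustration):

```python
# Slow start: CongWin doubles each RTT from 1 MSS until the first
# loss event (assumed here to happen at 64 MSS).
win = 1
history = []
while win < 64:
    history.append(win)
    win *= 2               # exponential growth: double per RTT
print(history)             # [1, 2, 4, 8, 16, 32]
```

Note it takes only 6 RTTs to reach a 64-MSS window, versus 63 RTTs if the window grew by 1 MSS per RTT from the start.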
125Reaction to Timeout
- Timeouts are not handled the same as triple
duplicate ACKs
- Triple duplicate ACKs are followed by halving
CongWin, then using the AIMD approach
- But true timeout events are handled differently
- The TCP sender returns to slow start, and if no
problems occur, ramps up to half of the CongWin
value from before the timeout occurred
- A variable, Threshold, stores the 0.5 × CongWin
value when a loss event occurs
126Reaction to Timeout
- Once CongWin gets back to the Threshold value, it
is allowed to increase linearly per AIMD
- So after a triple duplicate ACK, CongWin recovers
faster (called fast recovery, oddly enough)
than after a timeout
- Why do this? Because the triple duplicate ACK
proves that several other packets got there
successfully, even if one was lost
- A timeout is a more severe congestion indicator,
hence the slower recovery of CongWin
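The two reactions can be summarized in a few lines (a simplified, Reno-style sketch; real TCP also adjusts Threshold and CongWin in finer-grained ways):

```python
# Both loss events set Threshold = CongWin/2, but only a timeout also
# resets CongWin to 1 MSS (back to slow start). Windows are in MSS units.
def on_loss(cong_win, kind):
    threshold = max(1, cong_win // 2)
    if kind == "timeout":
        return 1, threshold          # restart from 1 MSS via slow start
    elif kind == "triple_dup_ack":
        return threshold, threshold  # fast recovery: resume at Threshold
    raise ValueError(kind)

print(on_loss(32, "timeout"))         # (1, 16)
print(on_loss(32, "triple_dup_ack"))  # (16, 16)
```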
127TCP Tahoe Reno
- TCP Tahoe follows the timeout recovery pattern
after any loss event
- Go back to CongWin = 1 MSS, ramp up exponentially
until reaching Threshold, then follow AIMD
- TCP Reno introduced the fast recovery from a
triple duplicate ACK
- After a loss event, cut CongWin in half, and
resume the linear increase until the next loss
event; repeat
128TCP Tahoe Reno
129TCP Throughput
- If the sawtooth pattern continues, with a loss
event occurring at the same congestion window
size consistently, then the average throughput
(rate) is
- Average throughput = 0.75 × W / RTT,
where W is the CongWin when a loss event occurs
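Plugging assumed numbers into that formula (a 20-segment window of 1460-byte segments, 100 ms RTT; my own example values):

```python
# Average throughput of the AIMD sawtooth: 0.75 * W / RTT, where W is
# the window size (in bytes) at which the loss event occurs.
W = 20 * 1460       # 20 segments of MSS = 1460 bytes = 29200 bytes
RTT = 0.1           # 100 ms
throughput = 0.75 * W / RTT
print(throughput)   # about 219000 bytes/sec, roughly 1.75 Mbps
```

The 0.75 factor comes from the sawtooth averaging between W/2 (just after a loss) and W (just before one).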
130TCP Future
- TCP will keep changing to meet the needs of the
Internet
- Obviously, many critical Internet apps depend on
TCP, so there are always changes being proposed
- See the RFC Index for current ideas
- For example, many want to support very high data
rates (e.g. 10 Gbps)
131TCP Future
- In order to support that rate, the congestion
window would have to be 83,333 segments
- And not lose any of them!
- If we have the loss rate (L) and MSS, we can
derive
- Average throughput = 1.22 × MSS / (RTT × sqrt(L))
- For 10 Gbps throughput, we need L ≈ 2 × 10^-10,
or lose only one segment in five billion!
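Solving that formula for L reproduces the slide's figure. The inputs are assumed values consistent with the example (MSS = 1500 bytes, RTT = 100 ms, target of 10 Gbps):

```python
# throughput = 1.22 * MSS / (RTT * sqrt(L))
#   =>  L = (1.22 * MSS / (RTT * throughput)) ** 2
MSS = 1500                # bytes
RTT = 0.1                 # 100 ms, in seconds
target = 10e9 / 8         # 10 Gbps expressed in bytes/sec
L = (1.22 * MSS / (RTT * target)) ** 2
print(L)                  # about 2.1e-10: one loss in roughly 5 billion segments
```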
132Fairness
- If a router has multiple connections competing
for bandwidth, is it fair in sharing?
- If two TCP connections of equal MSS and RTT are
sharing a router, and both are primarily in AIMD
mode, the throughput for each connection will
tend to balance fairly, with cyclical changes in
throughput due to changes in CongWin after packet
drops
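The fairness argument can be seen in a toy model (all numbers assumed): two AIMD flows with equal RTT share a link of capacity R; whenever their combined rate exceeds R, both see a loss and halve, otherwise both add 1. Even starting far apart, the rates converge toward an even split.

```python
# Two synchronized AIMD flows sharing a link of capacity R (in MSS/RTT).
# Halving shrinks the gap between the flows each cycle, while additive
# increase keeps it constant, so the flows drift toward equal shares.
def fair_share(ticks, w1, w2, R=100):
    for _ in range(ticks):
        if w1 + w2 > R:
            w1, w2 = w1 // 2, w2 // 2  # both halve on a shared loss
        else:
            w1, w2 = w1 + 1, w2 + 1    # both add 1 per RTT
    return w1, w2

print(fair_share(1000, w1=80, w2=10))  # ends near a 50/50 split
```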
133Fairness
- More realistically, unequal connections are less
fair
- A lower RTT gets more bandwidth (CongWin
increases faster)
- UDP traffic can force out the more polite TCP
traffic
- Multiple TCP connections from a single host
(e.g. from downloading many parts of a Web page
at once) get more bandwidth
134Are We Done Yet?
- So we've covered transport layer protocols, from
the terribly simple UDP to a seemingly exhaustive
study of TCP
- Key features along the way include
multiplexing/demultiplexing, error detection,
acknowledgements, timers, retransmissions,
sequence numbers, connection management, flow
control, and end-to-end congestion control
- So much for the edge of the Internet; next is
the network layer, to start looking at the core