Title: Section 11: Transport Layer Protocols and Interfaces
- In this section
- The Transmission Control Protocol (TCP)
- Service model
- Message format
- Connection and Disconnection
- State machine
- Flow control
- Timer management
- The User Datagram Protocol (UDP)
- Sockets
TCP - Transmission Control Protocol
- Provides a reliable end-to-end byte stream over an unreliable network.
- Designed to adapt dynamically to the properties of the network, and to handle failures robustly.
- Initial definition in standard RFC 793.
- The TCP entity accepts user data from local processes, splits it into pieces no larger than 64K octets (usually about 1500 octets in practice), and sends each piece as a separate IP datagram.
- When IP datagrams containing TCP information arrive, the TCP entity receives them and reconstructs the original byte stream.
- TCP times out and retransmits IP datagrams as needed, and reorders on reception.
The TCP Service Model
- Sender and receiver use sockets as end points.
- A socket has an address consisting of the IP address of the machine (such as 127.1.1.1), plus a 16 bit number local to the host, called a port.
- A connection is established between sockets.
- Port numbers 0 through 1023 are well known, and are reserved for standard services which are expected to be found at one of these ports.
  - Examples: FTP daemons at ports 20 and 21, HTTP server at port 80.
- All TCP connections are full duplex and point to point.
The TCP Service Model - 2
- A connection is a stream of octets, not messages.
- Four messages of 512 octets may be sent by TCP as any combination of pieces totaling 2048 octets:
  - 1 × 2048 octets
  - 2 × 1024 octets
  - 4 × 512 octets
  - 8 × 256 octets
  - ...
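The stream behaviour can be observed with a small loopback experiment. The sketch below is illustrative only: the class name `ByteStreamDemo` and its method are hypothetical, and it assumes a host where a loopback connection can be opened. It writes four 512 octet messages and counts what the reader receives; each `read` may return any number of octets, and only the total is predictable.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class ByteStreamDemo {
    /** Sends `messages` writes of `size` octets over loopback and returns the total octets read. */
    static int sendAndCount(int messages, int size) throws IOException {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket sender = new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket receiver = listener.accept()) {
            OutputStream out = sender.getOutputStream();
            for (int i = 0; i < messages; i++) {
                out.write(new byte[size]);      // one application "message"
            }
            sender.shutdownOutput();            // sends FIN, so the reader sees end of stream

            InputStream in = receiver.getInputStream();
            byte[] buffer = new byte[4096];
            int total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;                     // n is not tied to the sender's message sizes
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("octets received: " + sendAndCount(4, 512));
    }
}
```

An application that needs message boundaries has to add its own framing, for example a length prefix before each message.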
The TCP Service Model - 3
- When an application sends data to TCP, it may not be sent immediately.
  - TCP may decide to collect more data before transmission.
- The TCP push flag is used to request no transmission delay.
- The TCP urgent flag immediately sends all pending data for the connection.
  - The receiver immediately sends an interrupt to its user.
- On reception, data is buffered until retrieved by a request from the socket.
The TCP Protocol
- 32 bit sequence numbers are used.
- Separate fields for acknowledgments and the window mechanism.
- TCP entities exchange data in the form of segments.
  - Fixed 20 octet header, plus optional part, followed by zero or more data octets.
- TCP software determines segment size.
- Restrictions on maximum segment size:
  - An IP datagram is at most 65,535 octets, including IP's 20 octet header, so a segment (TCP header plus data) is at most 65,515 octets.
  - MTU at the data link layer.
- TCP uses a sliding window protocol.
  - Sender starts a timer after a segment is sent.
  - Receiver sends back a segment (which may or may not contain data) with the acknowledgment number being the next sequence number it expects to receive.
  - Sender retransmits on timeout.
TCP must handle...
- Fragmentation: only part of a transmission arrives.
- Out-of-order segments.
- Duplicates: segments delayed long enough in transit that the sender retransmits.
- Duplicates and fragmentation: a retransmission may be fragmented differently.
- Network congestion.
The TCP Segment Header
- Header layout (each row is 32 bits wide; fields are 16 or 32 bits unless noted):
  - Source Port (16 bits) | Destination Port (16 bits)
  - Sequence Number (32 bits)
  - Acknowledgment Number (32 bits)
  - Header Length (4 bits) | Unused (6 bits) | URG, ACK, PSH, RST, SYN, FIN (1 bit each) | Window size (16 bits)
  - Checksum (16 bits) | Urgent Pointer (16 bits)
  - Options (0 or more 32 bit words)
  - Data (optional)
TCP Header Fields - 1
- Source and destination ports: 16 bit numbers identifying the port at each end of the connection.
- Sequence and acknowledgment numbers:
  - Every octet is numbered in a TCP stream.
  - The acknowledgment number is the next octet number expected.
  - 32 bits each.
- Header length:
  - Needed because the options field can vary in length.
  - Number of 32 bit words in the header.
- URG: set to 1 if the urgent pointer is in use.
  - The pointer indicates the offset from the current sequence number at which urgent data is found.
  - Facility provided in lieu of interrupts.
- PSH: set to 1 to indicate pushed data.
  - The receiver is requested to deliver the data to the user immediately, instead of storing it in a buffer.
TCP Header Fields - 2
- ACK: set to 1 to indicate that the acknowledgment number is valid.
  - If 0, there is no acknowledgment in this segment.
- SYN: used to establish connections (synchronize).
  - SYN = 1, ACK = 0 in a connection request.
  - SYN = 1, ACK = 1 in a connection acceptance.
  - SYN = 0, ACK = 1 to acknowledge a connection acceptance.
- FIN: set to 1 to indicate end of user data (finished).
  - Used to close the connection in the sending direction.
  - May continue to receive data.
- RST: set to 1 to indicate reset.
  - The host has become confused due to a crash or for another reason.
  - Also used to reject a connection, or refuse an invalid segment.
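As an illustration of the field layout, the following sketch decodes the fixed 20 octet part of a segment header. The class name `TcpHeader` and its field names are hypothetical, chosen for this example; it assumes the byte array starts at the first octet of the TCP header and does not decode options.

```java
import java.nio.ByteBuffer;

final class TcpHeader {
    final int sourcePort, destinationPort;
    final long sequenceNumber, acknowledgmentNumber;   // unsigned 32 bit values held in a long
    final int headerLengthWords;                       // number of 32 bit words in the header
    final boolean urg, ack, psh, rst, syn, fin;
    final int windowSize, checksum, urgentPointer;

    TcpHeader(byte[] segment) {
        ByteBuffer b = ByteBuffer.wrap(segment);       // big-endian (network order) by default
        sourcePort = b.getShort() & 0xFFFF;
        destinationPort = b.getShort() & 0xFFFF;
        sequenceNumber = b.getInt() & 0xFFFFFFFFL;
        acknowledgmentNumber = b.getInt() & 0xFFFFFFFFL;
        int lengthAndFlags = b.getShort() & 0xFFFF;
        headerLengthWords = lengthAndFlags >>> 12;     // top 4 bits
        urg = (lengthAndFlags & 0x20) != 0;            // low 6 bits are the flags
        ack = (lengthAndFlags & 0x10) != 0;
        psh = (lengthAndFlags & 0x08) != 0;
        rst = (lengthAndFlags & 0x04) != 0;
        syn = (lengthAndFlags & 0x02) != 0;
        fin = (lengthAndFlags & 0x01) != 0;
        windowSize = b.getShort() & 0xFFFF;
        checksum = b.getShort() & 0xFFFF;
        urgentPointer = b.getShort() & 0xFFFF;
    }

    /** Offset of the first data octet; larger than 20 when options are present. */
    int headerLengthOctets() {
        return headerLengthWords * 4;
    }
}
```

For a connection request, `syn` is true and `ack` is false; `headerLengthOctets()` tells the caller where the user data begins.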
TCP Error Checking
- TCP pseudo-header (each row is 32 bits wide):
  - Source IP address
  - Destination IP address
  - 00000000 (8 bits) | Protocol = 6 (8 bits) | TCP segment length (16 bits)
- The checksum field provides error detection for the TCP segment (header and data), plus the pseudo-header shown above.
- Checksum computation:
  - Set the checksum field to all zeros.
  - Pad the user data with an extra 0 octet, if needed, so that the user data has an even number of octets.
  - Add all 16 bit words in 1's complement, and take the 1's complement of the sum.
- When the receiver performs this computation, including the checksum field, the result should be 0.
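The computation can be sketched as follows. This is a minimal illustration, not production code: the class `TcpChecksum` and its method names are hypothetical, IPv4 addresses are assumed to be passed as 4 octet arrays, and the segment array is assumed to hold the TCP header followed by the data.

```java
final class TcpChecksum {
    /** Adds the 16 bit big-endian words of `data` to `sum` in 1's complement arithmetic. */
    static int onesComplementSum(byte[] data, int sum) {
        for (int i = 0; i < data.length; i += 2) {
            int high = data[i] & 0xFF;
            int low = (i + 1 < data.length) ? (data[i + 1] & 0xFF) : 0;   // pad odd length with 0
            sum += (high << 8) | low;
            sum = (sum & 0xFFFF) + (sum >>> 16);                          // end-around carry
        }
        return sum;
    }

    /** Checksum over the pseudo-header and the whole segment (header and data). */
    static int checksum(byte[] sourceIp, byte[] destinationIp, byte[] segment) {
        byte[] pseudoHeader = new byte[12];
        System.arraycopy(sourceIp, 0, pseudoHeader, 0, 4);
        System.arraycopy(destinationIp, 0, pseudoHeader, 4, 4);
        pseudoHeader[8] = 0;                                   // the all-zero octet
        pseudoHeader[9] = 6;                                   // protocol number for TCP
        pseudoHeader[10] = (byte) (segment.length >>> 8);      // TCP segment length
        pseudoHeader[11] = (byte) segment.length;
        int sum = onesComplementSum(pseudoHeader, 0);
        sum = onesComplementSum(segment, sum);
        return ~sum & 0xFFFF;                                  // 1's complement of the sum
    }
}
```

A sender would call `checksum` with octets 16 and 17 of the segment set to zero and store the result there; a receiver calling it on the segment as received expects 0.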
TCP Connection and Disconnection
- Initial sequence number on connection:
  - Based on a clock that ticks every 4 µsec.
  - Maximum packet lifetime: MPL seconds (often 30 s to 60 s).
  - When a host crashes, it must delay 2 MPL seconds to let outstanding packets be cleared.
- Disconnection:
  - A segment with the FIN bit set indicates end of user data.
  - The connection is then shut down in that direction.
  - Data can continue to be sent in the other direction.
  - If no acknowledgment for a segment with FIN is received in 2 MPL, the connection is released.
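TCP connection: Three-way Handshake
- Host 1 to Host 2: SYN (seq = x). Propose sequence number x.
- Host 2 to Host 1: SYN + ACK (seq = y, ack = x + 1). Propose sequence number y, and confirm x.
- Host 1 to Host 2: ACK (seq = x + 1, ack = y + 1). Confirm sequence number y.
- Normal, successful connection.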
TCP simultaneous disconnection
- Host 1 to Host 2: FIN. Close connection for sending at Host 1.
- Host 2 to Host 1: FIN + ACK. Close connection for sending at Host 2, and confirm FIN reception.
- Host 1 to Host 2: ACK. Confirm FIN from Host 2.
TCP Half-close
- Host 1 to Host 2: FIN. Close connection for sending at Host 1.
- Host 2 to Host 1: ACK. Confirm FIN reception.
- (additional data transfer from Host 2 to Host 1)
- Host 2 to Host 1: FIN. Close connection for sending at Host 2.
- Host 1 to Host 2: ACK. Confirm FIN reception.
Sequencing of Control Segments
- A segment with the SYN bit on consumes one sequence number, although there is no data.
- A segment with the FIN bit on consumes one sequence number, although there is no data.
- A segment with the ACK bit on, the SYN and FIN bits off, and that does not carry data, does not consume a sequence number.
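These rules amount to a small piece of arithmetic on the 32 bit sequence space. The sketch below uses a hypothetical helper, `SequenceSpace.nextSequenceNumber`, and holds unsigned 32 bit values in a `long`.

```java
final class SequenceSpace {
    /**
     * Sequence number that follows a segment: each data octet, a SYN, and a FIN
     * consume one number each; the ACK bit consumes none. Wraps modulo 2^32.
     */
    static long nextSequenceNumber(long sequenceNumber, int dataLength, boolean syn, boolean fin) {
        long consumed = dataLength + (syn ? 1 : 0) + (fin ? 1 : 0);
        return (sequenceNumber + consumed) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(nextSequenceNumber(100, 0, true, false));    // SYN with no data
        System.out.println(nextSequenceNumber(101, 0, false, false));   // pure ACK
        System.out.println(nextSequenceNumber(101, 512, false, false)); // 512 data octets
    }
}
```

This is why the acknowledgment to a SYN carrying sequence number x is x + 1 even though no user data was sent.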
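TCP Connection Management States
TCP Connection Establishment
- State transitions, labeled event / action:
  - CLOSED to LISTEN: LISTEN / -
  - LISTEN to CLOSED: CLOSE / -
  - CLOSED to SYN SENT: CONNECT / SYN
  - SYN SENT to CLOSED: CLOSE / -
  - LISTEN to SYN RCVD: SYN / SYN + ACK
  - SYN SENT to SYN RCVD: SYN / ACK
  - SYN RCVD to ESTABLISHED: ACK / -
  - SYN SENT to ESTABLISHED: SYN + ACK / ACK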
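TCP Disconnection
- State transitions, labeled event / action:
  - ESTABLISHED to FIN WAIT 1: CLOSE / FIN
  - FIN WAIT 1 to FIN WAIT 2: ACK / -
  - FIN WAIT 1 to CLOSING: FIN / ACK
  - FIN WAIT 1 to TIMED WAIT: FIN + ACK / ACK
  - FIN WAIT 2 to TIMED WAIT: FIN / ACK
  - CLOSING to TIMED WAIT: ACK / -
  - TIMED WAIT to CLOSED: timeout / -
  - ESTABLISHED, on receiving FIN (FIN / --), checks auto-close:
    - true: -- / FIN + ACK, to LAST ACK
    - false: -- / ACK, to CLOSE WAIT
  - CLOSE WAIT to LAST ACK: CLOSE / FIN
  - LAST ACK to CLOSED: ACK / -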
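A table-driven sketch of the connection management states follows. It is an illustration under stated assumptions: the class `TcpStateMachine` and its enum names are hypothetical, the transitions are those of the standard TCP state diagram, only the state change is modeled (the action, such as sending a SYN or ACK, is omitted), and the auto-close shortcut is left out.

```java
import java.util.EnumMap;
import java.util.Map;

final class TcpStateMachine {
    enum State { CLOSED, LISTEN, SYN_SENT, SYN_RCVD, ESTABLISHED,
                 FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIMED_WAIT, CLOSE_WAIT, LAST_ACK }

    enum Event { LISTEN, CONNECT, CLOSE, SYN, ACK, SYN_ACK, FIN, FIN_ACK, TIMEOUT }

    private static final Map<State, Map<Event, State>> TABLE = new EnumMap<>(State.class);

    private static void add(State from, Event on, State to) {
        TABLE.computeIfAbsent(from, s -> new EnumMap<>(Event.class)).put(on, to);
    }

    static {
        // Connection establishment
        add(State.CLOSED, Event.LISTEN, State.LISTEN);
        add(State.CLOSED, Event.CONNECT, State.SYN_SENT);
        add(State.LISTEN, Event.CLOSE, State.CLOSED);
        add(State.LISTEN, Event.SYN, State.SYN_RCVD);
        add(State.SYN_SENT, Event.CLOSE, State.CLOSED);
        add(State.SYN_SENT, Event.SYN, State.SYN_RCVD);
        add(State.SYN_SENT, Event.SYN_ACK, State.ESTABLISHED);
        add(State.SYN_RCVD, Event.ACK, State.ESTABLISHED);
        // Disconnection, active side
        add(State.ESTABLISHED, Event.CLOSE, State.FIN_WAIT_1);
        add(State.FIN_WAIT_1, Event.ACK, State.FIN_WAIT_2);
        add(State.FIN_WAIT_1, Event.FIN, State.CLOSING);
        add(State.FIN_WAIT_1, Event.FIN_ACK, State.TIMED_WAIT);
        add(State.FIN_WAIT_2, Event.FIN, State.TIMED_WAIT);
        add(State.CLOSING, Event.ACK, State.TIMED_WAIT);
        add(State.TIMED_WAIT, Event.TIMEOUT, State.CLOSED);
        // Disconnection, passive side
        add(State.ESTABLISHED, Event.FIN, State.CLOSE_WAIT);
        add(State.CLOSE_WAIT, Event.CLOSE, State.LAST_ACK);
        add(State.LAST_ACK, Event.ACK, State.CLOSED);
    }

    private State state = State.CLOSED;

    State state() {
        return state;
    }

    /** Applies one event and returns the new state; rejects events that are invalid here. */
    State handle(Event event) {
        Map<Event, State> row = TABLE.get(state);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException(event + " is not valid in state " + state);
        }
        state = next;
        return state;
    }
}
```

A client that connects and later closes first would see CONNECT, SYN_ACK, CLOSE, ACK, FIN, TIMEOUT, ending back in CLOSED. A real implementation would answer an invalid segment with RST instead of throwing.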
TCP Flow Control
- Sequence numbers correspond to octets within user data, instead of TCP segments.
- The window size field tells how many octets can be sent, starting at the octet given by the acknowledgment number.
- Principle for the flow control window: once window space has been granted for specific octet numbers, the window space should not be revoked in a subsequent acknowledgment.
- A window size of 0 is legal: octets up to acknowledgment number - 1 have been received, but the sender is requested not to send more data.
  - Permission to restart is indicated by sending a segment with the same acknowledgment number, but with a non-zero window size.
- TCP also uses a separate congestion window: the sender may send only the number of octets given by the minimum of the two windows.
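The limit on the sender can be written as a one-line calculation. In the sketch below, `SendWindow.usableWindow` is a hypothetical helper, and "octets in flight" (sent but not yet acknowledged) is an assumption added for the example: it is subtracted because those octets already occupy part of the window.

```java
final class SendWindow {
    /**
     * Number of further octets the sender may transmit now: the smaller of the
     * receiver's advertised window and the congestion window, less what is in flight.
     */
    static long usableWindow(long receiverWindow, long congestionWindow, long octetsInFlight) {
        long limit = Math.min(receiverWindow, congestionWindow);
        return Math.max(0, limit - octetsInFlight);
    }

    public static void main(String[] args) {
        // Receiver grants 4096 octets, congestion window is 2048, 1024 octets unacknowledged.
        System.out.println(usableWindow(4096, 2048, 1024));
    }
}
```

With an advertised window of 0 the result is 0 regardless of the congestion window, which is the blocked-sender case.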
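TCP transmission policy - 1
- Example exchange between a sender and a receiver with a 4K buffer, initially empty:
  - Sender: application writes 2K of data; a 2K segment is sent.
  - Receiver: buffer holds 2K, with 2K free.
  - Sender: application writes 3K of data; only 2K can be sent.
  - Receiver: buffer is full; the window is 0 and the sender is blocked.
  - Receiver: application reads 2K of data; a window of 2K is advertised.
  - Sender: sends the remaining 1K.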
TCP transmission policy - 2
- A window size of zero normally means that the sender is blocked.
- Exceptions:
  - Sending of urgent data.
  - May send a 1 octet segment requesting the receiver to send its window size.
    - Avoids deadlock when an acknowledgment is lost.
- Senders are not required to transmit data immediately on reception from the application.
  - Attempt to avoid sending segments with 20 octets of TCP header (and 20 more octets of IP header), and only 1 octet of user data.
- Acknowledgments are often delayed up to 500 ms, in the hope of collecting some user data to send along with them.
TCP transmission policy - 3
- Nagle's algorithm for sparse user data (i.e. slow sender):
  - Send the first octet in a segment.
  - Buffer all further octets from the application at the sender, until the first acknowledgment is received or the buffer contains a maximum segment size of data.
  - Then, send the entire buffer in one segment.
  - Further data is buffered as above.
- Clark's algorithm for a receiver that is slow to read data:
  - The receiver should not send a window update until either:
    - there is enough space in the buffer to handle the maximum segment size advertised in the connection phase, or
    - the buffer is half empty.
  - Avoids sending acknowledgments with a 1 octet window size.
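The sender-side buffering rule can be sketched as a small simulation. This is a simplified model, not an implementation of a real TCP stack: the class `NagleSender` is hypothetical, it tracks only octet counts, and it assumes a single acknowledgment covers everything sent so far.

```java
import java.util.ArrayList;
import java.util.List;

final class NagleSender {
    private final int maxSegmentSize;
    private int buffered = 0;                 // octets written by the application, not yet sent
    private boolean ackOutstanding = false;   // true while sent data is unacknowledged
    final List<Integer> segmentsSent = new ArrayList<>();   // size of each segment, in order

    NagleSender(int maxSegmentSize) {
        this.maxSegmentSize = maxSegmentSize;
    }

    /** The application hands `octets` octets to TCP. */
    void write(int octets) {
        buffered += octets;
        if (!ackOutstanding) {
            flush();                          // nothing in flight: send at once
        } else {
            while (buffered >= maxSegmentSize) {
                sendSegment(maxSegmentSize);  // a full segment never waits
            }
        }
    }

    /** An acknowledgment arrives: everything buffered in the meantime goes out. */
    void onAcknowledgment() {
        ackOutstanding = false;
        flush();
    }

    private void flush() {
        while (buffered > 0) {
            sendSegment(Math.min(buffered, maxSegmentSize));
        }
    }

    private void sendSegment(int size) {
        buffered -= size;
        segmentsSent.add(size);
        ackOutstanding = true;
    }
}
```

If an application writes one octet at a time, the first octet travels alone and the octets typed while waiting for its acknowledgment travel together in the next segment, so the 40 octets of header overhead are paid far less often.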
TCP Timer Management
- How long should the retransmission timer be?
- More difficult than in the data link layer, since the variance of the round trip time is greater.
[Figure: probability distribution of round trip time (msec) at the data link layer and at the transport layer, with timeout T marked.]
- Too short: unnecessary retransmissions that will clog the network.
- Too long: performance reduction due to the long delay before retransmission.
Dynamic Timer Setting
- Algorithm due to Jacobson (1988).
- Keep a variable RTT that is the best current estimate of the round trip time to the destination.
- Measure the time M required for the acknowledgment to return for a segment.
- Update RTT by
  - $RTT = \alpha \, RTT + (1 - \alpha) \, M$
  - where $\alpha$ determines how much weight to give old and new values. A typical value for $\alpha$ is 7/8.
- Use the mean deviation D as an approximation to the standard deviation of RTT:
  - $D = \beta \, D + (1 - \beta) \, |RTT - M|$
- Set the timeout to be
  - $timeout = RTT + 4 \times D$
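The update rules can be sketched as a small estimator class. Several details are assumptions made for the example and are not given in the text: the class name `RttEstimator`, the deviation weight of 3/4, the caller-supplied initial values, and the choice to compute the deviation against the RTT estimate before it is updated.

```java
final class RttEstimator {
    private static final double RTT_WEIGHT = 7.0 / 8.0;        // weight given to the old RTT estimate
    private static final double DEVIATION_WEIGHT = 3.0 / 4.0;  // assumption: not specified in the text

    private double rtt;         // best current estimate of the round trip time (ms)
    private double deviation;   // smoothed mean deviation (ms)

    RttEstimator(double initialRtt, double initialDeviation) {
        this.rtt = initialRtt;
        this.deviation = initialDeviation;
    }

    /** Folds in one measurement M, the time for an acknowledgment to return. */
    void sample(double measured) {
        // The deviation uses the estimate from before this sample (an assumption, see above).
        deviation = DEVIATION_WEIGHT * deviation + (1 - DEVIATION_WEIGHT) * Math.abs(rtt - measured);
        rtt = RTT_WEIGHT * rtt + (1 - RTT_WEIGHT) * measured;
    }

    double rtt() {
        return rtt;
    }

    /** Retransmission timeout: the estimate plus four mean deviations. */
    double timeout() {
        return rtt + 4 * deviation;
    }
}
```

Starting from an estimate of 100 ms with a deviation of 10 ms, one slow sample of 200 ms moves the estimate only to 112.5 ms but raises the deviation to 32.5 ms, so the timeout grows to 242.5 ms. The deviation term is what lets the timer react quickly to increased variance.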
Additional TCP Timers
- Karn's algorithm (for noisy, wireless media):
  - Do not use retransmitted segments to update the value for RTT.
  - Instead, double the timeout value until segments get through.
- TCP persistence timer:
  - Prevents deadlock when the current window size is 0, and the acknowledgment increasing the window size is lost.
  - When the persistence timer expires, ask the receiver for its window size.
- TCP keep alive timer:
  - The timer expires after a long interval with no messages.
  - On expiry, send a message to the receiver asking "are you still there?"
  - The connection is terminated if there is no response.
- TCP close timer: ensures all packets die on connection termination.
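Karn's rule can be sketched as follows. The class `KarnBackoff` is hypothetical, and the sketch assumes the caller recomputes the timeout from its own RTT estimate whenever a usable measurement arrives; only the backoff and the decision to accept or discard a measurement are modeled.

```java
final class KarnBackoff {
    private double timeoutMillis;

    KarnBackoff(double initialTimeoutMillis) {
        this.timeoutMillis = initialTimeoutMillis;
    }

    double timeoutMillis() {
        return timeoutMillis;
    }

    /** The retransmission timer expired: double the timeout before retransmitting. */
    void onTimeout() {
        timeoutMillis *= 2;
    }

    /**
     * An acknowledgment arrived. Returns true if its round trip measurement may be used
     * to update the RTT estimate, which is only the case for segments sent exactly once.
     */
    boolean onAcknowledgment(boolean segmentWasRetransmitted, double recomputedTimeoutMillis) {
        if (segmentWasRetransmitted) {
            return false;   // ambiguous: the ACK may belong to either transmission
        }
        timeoutMillis = recomputedTimeoutMillis;
        return true;
    }
}
```

The measurement is discarded because an acknowledgment for a retransmitted segment cannot be matched to a particular transmission, so the measured time could be far too long or far too short.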
UDP - User Datagram Protocol
- UDP header (each row is 32 bits wide):
  - Source Port (16 bits) | Destination Port (16 bits)
  - UDP length (16 bits) | UDP checksum (16 bits)
- Provides the ability to send raw IP datagrams without establishing a connection.
- Described in RFC 768.
- Often used for client-server applications with one request, one reply.
- UDP length: number of bytes in the datagram, including the 8 byte header.
- Checksum field provides error detection.
  - Calculated in the same manner as TCP, including the pseudo-header.
  - 0 if not used.
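A one request, one reply exchange over UDP might look like the sketch below. It is illustrative only: the class `UdpRequestReply` is hypothetical, both ends run in one program over the loopback interface, the "service" just returns the request in upper case, and the 2 second receive timeouts are an assumption added so the example cannot hang if a datagram is lost.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class UdpRequestReply {
    static String exchange(String request) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port; no connection is set up.
        try (DatagramSocket server = new DatagramSocket(0, loopback);
             DatagramSocket client = new DatagramSocket(0, loopback)) {
            server.setSoTimeout(2000);
            client.setSoTimeout(2000);

            // Client: send the request as a single datagram.
            byte[] out = request.getBytes(StandardCharsets.UTF_8);
            client.send(new DatagramPacket(out, out.length, loopback, server.getLocalPort()));

            // Server: receive it, then reply to the address and port it came from.
            DatagramPacket received = new DatagramPacket(new byte[1500], 1500);
            server.receive(received);
            String text = new String(received.getData(), 0, received.getLength(), StandardCharsets.UTF_8);
            byte[] reply = text.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8);
            server.send(new DatagramPacket(reply, reply.length, received.getSocketAddress()));

            // Client: receive the reply.
            DatagramPacket answer = new DatagramPacket(new byte[1500], 1500);
            client.receive(answer);
            return new String(answer.getData(), 0, answer.getLength(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(exchange("ping"));
    }
}
```

Unlike TCP, each `send` produces exactly one datagram and each `receive` returns exactly one, so message boundaries are preserved, but nothing retransmits a lost datagram.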
Sockets: Network user interface
- Functions:
  - new Socket: create a new communication end point.
  - Bind: associate a socket with a specified port number.
  - Listen: announce willingness to accept connections.
    - Queue size for incoming connection requests can be specified.
  - Accept: block the caller until an incoming connection request arrives.
    - On receipt of a request, return.
  - Connect: actively attempt to establish a connection.
  - Send: send some data over the connection.
  - Receive: receive some data from the connection.
  - Close: release the connection.
Step 1: Create a socket
[Figure: our program holds a file descriptor or object reference (3) for the socket; the host, with network (IP) address 123.90.47.122, has a set of local ports (23, 80, 1237, ...) leading to the network, and the socket is not yet attached to any of them.]
- Creates an attachment point from the application to the network.
- May be viewed by the program as a file descriptor (as in the figure) or an object reference.
Step 2: Bind the socket to a port
[Figure: the socket (file descriptor 3) in our program is now attached to one of the local ports (23, 80, 1237) of host 123.90.47.122.]
- Chooses a port (which must be currently unbound).
- Required for servers; clients often bind implicitly to the next available port while connecting.
- Port numbers from 0 to 1023 cannot be bound by ordinary application programs: operating system or root privileges are required.
Step 3 (server): Listen for connections
[Figure: the bound socket (file descriptor 3) on host 123.90.47.122 now has buffers for incoming connection requests.]
Step 4: Incoming connection request
[Figure: the client program's socket (3), on client host 191.36.42.1 at local port 1237, sends a SYN to the server program's listening socket (3) at port 80 on host 123.90.47.122.]
- The client performs a connect operation.
- The client needs to obtain the network address of the server.
  - A call to a Domain Name System (DNS) server is used to obtain IP addresses from a domain name.
Step 5: Accept a connection request
[Figure: our program now holds two sockets at port 80 on host 123.90.47.122: the socket for listening (3) and the socket for the new connection (4).]
- A new socket is created for handling the data transfer.
- The original socket continues listening for connection requests.
Step 6: Create new process or thread
[Figure: the server process/thread keeps the listening socket (3); a handler process/thread takes the connection socket (4); both use port 80 on host 123.90.47.122.]
- Why?
  - The server can handle multiple, simultaneous connection requests.
  - A separate handler process can be created for each request.
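The server-side steps so far (create, bind, listen, accept, hand off to a handler) can be sketched in Java as follows. This is a minimal illustration under stated assumptions: the class `EchoServer`, its line-based echo protocol, and the `ask` client helper are hypothetical, the server binds to the loopback address on a free port chosen by the operating system, and one thread is created per connection with no limit on their number.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class EchoServer implements AutoCloseable {
    private final ServerSocket listener;

    private EchoServer(ServerSocket listener) {
        this.listener = listener;
    }

    /** Create, bind (port 0 = any free port), and listen, then accept in the background. */
    static EchoServer start() throws IOException {
        EchoServer server = new EchoServer(new ServerSocket(0, 50, InetAddress.getLoopbackAddress()));
        Thread acceptor = new Thread(server::acceptLoop);
        acceptor.setDaemon(true);
        acceptor.start();
        return server;
    }

    int port() {
        return listener.getLocalPort();
    }

    private void acceptLoop() {
        try {
            while (true) {
                Socket connection = listener.accept();                   // blocks until a request arrives
                Thread handler = new Thread(() -> handle(connection));   // one handler per connection
                handler.setDaemon(true);
                handler.start();
            }
        } catch (IOException e) {
            // accept() throws once the listening socket is closed; stop accepting.
        }
    }

    private static void handle(Socket connection) {
        try (Socket c = connection;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(c.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(c.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println("echo: " + line);
            }
        } catch (IOException e) {
            // This connection failed; other handlers are unaffected.
        }
    }

    @Override
    public void close() throws IOException {
        listener.close();   // the listening socket is closed independently of the connections
    }

    /** Client side: connect, send one line, and return the reply line. */
    static String ask(int port, String line) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))) {
            out.println(line);
            return in.readLine();
        }
    }
}
```

Because `accept` returns a new socket for each client, the acceptor thread goes straight back to listening while the handler threads carry out the data transfer.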
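Step 7: Data transfer phase
[Figure: the client program's socket (3), at port 1237 on client host 191.36.42.1, is connected to the handler's socket (4) at port 80 on host 123.90.47.122; the server keeps its listening socket (3).]
- Either side can send or receive data.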
Step 8: Close sockets
[Figure: the client program's socket (3) on client host 191.36.42.1, and the handler's socket (4) and the server's listening socket (3) on host 123.90.47.122, are each closed.]
- Sockets are independently closed at each side of the connection.
- The listening socket must also be closed independently if the server shuts down.
- After waiting for twice the maximum packet lifetime, the port is released and is available for use again.
Socket attributes
- Local address (port + network address), if bound
- Remote address (port + network address), if connected
- Stream vs. datagram
  - Usual value: stream
- Protocol: TCP, UDP, other
- Blocking vs. non-blocking
  - Usual value: blocking
  - A blocking socket will stop and wait for the following operations to complete:
    - Accept: wait until a connection request arrives
    - Receive: wait until data is available
The Application View of a Socket
- In C, a program views a socket as a file descriptor.
  - On Unix/Linux, socket( ) is a direct call to the operating system.
  - Calling socket( ) returns a file descriptor.
  - On Windows, you can use the basic interface directly via the winsock library.
- In C++ and Java, sockets are implemented within a class library, and then the program's view of a socket is an object reference.
Java Sockets - 1
- Primary classes:
  - Socket: used for TCP outgoing connections or data transfer streams.
  - ServerSocket: used to listen for incoming TCP connections.
  - DatagramSocket: used for UDP datagrams (if allowed by the security manager).
- Instances of Socket and ServerSocket are always blocking.
- The Java class library uses constructor parameters to combine some of the steps.
  - Creating a new Socket can also include the connect step.
  - Creating a new ServerSocket can also include the bind and listen steps.
Java Sockets - 2
- Java stream sockets have separate input and output streams.
- Typical usage:
  - Get input and/or output streams from the socket.
  - Wrap each stream in one of the stream classes.
  - Use stream write or read methods to send or receive.
  - Close streams and socket.
- All socket operations may throw IOExceptions (or more specific exceptions).
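Java Sockets Example: client
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.net.Socket;

    public class ClientExample {
        public static void main(String[] args) throws IOException, ClassNotFoundException {
            Object object1 = "request";   // placeholder: any Serializable object
            // Implicit local port bind and connect are done by the Socket constructor.
            try (Socket socket = new Socket("127.0.0.1", 80);   // remote IP address, port
                 // Get and wrap the streams. Create the output stream first: the
                 // ObjectInputStream constructor blocks until the peer's stream header arrives.
                 ObjectOutputStream outStream = new ObjectOutputStream(socket.getOutputStream());
                 ObjectInputStream inStream = new ObjectInputStream(socket.getInputStream())) {
                // Data transfer
                outStream.writeObject(object1);
                outStream.flush();
                Object object2 = inStream.readObject();   // Blocks until data arrives
                System.out.println(object2);
            }   // Leaving the block closes the streams and the socket (sends TCP FIN)
        }
    }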