Title: Protocols Underlying HTTP
Chapter 5
Web Protocols and Practice
PROTOCOLS UNDERLYING HTTP
Topics
- Protocol Definition
- Domain Name System
- Application-Layer Protocols
- Internet Protocol
- Transmission Control Protocol
Protocol Definition
A protocol defines both the syntax and the semantics of the messages exchanged between senders and receivers.
The protocol suite for the Internet consists of four main layers:
- Link layer: handles the hardware details of interfacing with the physical communication medium, such as Ethernet, Asynchronous Transfer Mode (ATM), or Synchronous Optical Network (SONET).
- Network layer: handles the delivery of individual packets of data through the network. A network-layer protocol is implemented in the routers and the end hosts.
- Transport layer: coordinates the communication between hosts on behalf of the application layer. In practice, a transport-layer protocol is typically implemented in the operating system of the end host.
- Application layer: handles the details of specific applications. In practice, an application-layer protocol is typically implemented as part of the application software, such as a Web browser or Web server.
Figure 5.1 illustrates layering of protocols.
Figure 5.1. Layering of protocols
- Application layer: HTTP, NNTP, DNS, Telnet, FTP, SMTP
- Transport layer: TCP, UDP
- Network layer: IP
- Link layer: Ethernet, ATM, SONET
The three main protocols involved in the transfer of HTTP messages are:
- Internet Protocol (IP): a network-layer protocol that coordinates the delivery of individual packets from one host to another, based on the IP address of the destination host.
- Transmission Control Protocol (TCP): a transport-layer protocol that coordinates the transmission of IP packets in order to provide the abstraction of a reliable, bidirectional connection between two communicating applications. Some applications use the User Datagram Protocol (UDP) instead.
- Domain Name System (DNS): an application-layer protocol that controls the translation of hostnames into IP addresses, and vice versa.
Domain Name System
- Domain Name System Definition
- DNS Resolver
- DNS Architecture
- DNS Protocol
- DNS Queries and the Web
Domain Name System Definition
The Domain Name System (DNS) coordinates the translation of hostnames into IP addresses and of IP addresses into hostnames.
Machines on the Internet have hostnames because:
- Remembering a hostname is much easier than remembering an IP address.
- The IP address associated with a hostname may change over time.
DNS Resolver
The resolver is a software library that is linked with Internet applications.
A DNS resolver performs two main functions:
- gethostbyname(): converts a hostname to an IP address.
- gethostbyaddr(): converts an IP address to a hostname.
The resolver interacts with one or more DNS servers to perform these functions on behalf of the application.
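The two resolver functions can be tried from Python, whose standard socket module exposes calls with the same names. The following is a minimal sketch; the wrapper names hostname_to_address and address_to_hostname are hypothetical, and the result of a reverse lookup depends on the local resolver configuration.

```python
import socket


def hostname_to_address(hostname):
    """Return one IPv4 address for hostname, as reported by the resolver."""
    return socket.gethostbyname(hostname)


def address_to_hostname(address):
    """Return the primary hostname for an IPv4 address, or None if unknown."""
    try:
        name, _aliases, _addresses = socket.gethostbyaddr(address)
        return name
    except OSError:  # socket.herror and socket.gaierror are subclasses
        return None
```

A dotted-decimal string such as "127.0.0.1" passed to hostname_to_address() is returned unchanged, so that call does not need a DNS server.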
DNS Architecture
In the early days, a single master file listed the IP addresses associated with each hostname.
Now DNS is a distributed database that consists of a hierarchical set of name servers, each responsible for a portion of the domain name and address space.
The DNS architecture reflects the hierarchy of hostnames and IP addresses. (Figure 5.10)
Figure 5.10. DNS hierarchy
- Unnamed root
- Top-level domains: generic domains (com, edu, org), country domains (such as uk and zw), and arpa
- Second-level domains: for example bar (under com), ac (under uk), and in-addr (under arpa)
- Example hostnames: www.west.bar.com, ftp.east.bar.com, user.cam.ac.uk
- Numeric labels (12, 34, 56, 57) appear below in-addr
- The top level includes the three-character generic (organizational) domains and the two-character country domains.
- The top-level domains are handled by a collection of root servers.
- The hierarchy of domain names does not correspond to the hierarchical structure of IP addresses.
- Efficient mapping of IP addresses to hostnames requires a separate hierarchy based on IP addresses.
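The separate hierarchy for IP addresses sits under in-addr.arpa, where the octets of an address are written in reverse order so that the most general part is closest to the root. A minimal sketch of that naming rule follows; the function name reverse_lookup_name is hypothetical, and the address is assembled from the numeric labels in Figure 5.10.

```python
def reverse_lookup_name(ipv4_address):
    """Return the in-addr.arpa domain name used to look up an IPv4 address."""
    octets = ipv4_address.split(".")
    valid = len(octets) == 4 and all(
        o.isascii() and o.isdigit() and int(o) <= 255 for o in octets
    )
    if not valid:
        raise ValueError(f"not a dotted-decimal IPv4 address: {ipv4_address!r}")
    # The octets are reversed: the last octet becomes the leftmost label.
    return ".".join(reversed(octets)) + ".in-addr.arpa"


if __name__ == "__main__":
    print(reverse_lookup_name("12.34.56.57"))
```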
DNS Protocol
The DNS protocol governs communication between a DNS client and a DNS server.
A DNS client sends a query for information (e.g., the IP address associated with a particular hostname) to a DNS server, and the DNS server returns a response with the requested information (e.g., the IP address).
DNS queries can be recursive or iterative:
- A recursive query requests that the receiving DNS server resolve the entire query itself.
- An iterative query requests that the receiving DNS server respond directly to the DNS client with the IP address of the next DNS server in the DNS hierarchy.
Root servers handle only iterative queries. (Figure 5.11)
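To make the query message concrete, the sketch below builds the bytes of a DNS query for a hostname's IPv4 address, following the standard DNS message layout (a 12-byte header followed by one question). The "recursion desired" flag in the header is how a client asks for a recursive rather than an iterative answer. The function name build_dns_query and the query identifier are hypothetical; the hostname comes from Figure 5.10.

```python
import struct


def build_dns_query(hostname, query_id=0x1234, recursive=True):
    """Build a DNS query message asking for the A (IPv4) record of hostname."""
    flags = 0x0100 if recursive else 0x0000  # RD (recursion desired) bit
    # Header fields: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b""
    for label in hostname.rstrip(".").split("."):
        encoded = label.encode("ascii")
        if not 0 < len(encoded) <= 63:
            raise ValueError(f"invalid label in hostname: {label!r}")
        # Each label is sent as a length byte followed by its characters.
        qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"  # zero-length label marks the root
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


if __name__ == "__main__":
    print(build_dns_query("www.west.bar.com").hex())
```

Sending these bytes in a UDP datagram to port 53 of a DNS server would produce a response in the same message format; parsing the response is omitted here.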
Figure 5.11. DNS resolver and local DNS server
- Client host: Web browser and DNS resolver
- Local DNS server with a DNS cache, on the same local area network as the client host
- Root server, top-level domain server, and second-level domain server
- DNS queries and DNS responses are numbered as steps 1 through 10
Figure 5.11 shows that for a recursive query:
- The resolver is invoked by a system call from the application (step 1).
- The resolver sends a DNS query to the local DNS server (step 2).
- The resolver waits for the reply (step 9).
- The resolver provides the IP address to the application (step 10).
Figure 5.11 shows that for an iterative query:
- The resolver is invoked by a system call from the application (step 1).
- The resolver sends a DNS query to the local DNS server (step 2).
- The local DNS server sends a query to the root DNS server (step 3).
- The local DNS server learns the names and IP addresses of the DNS servers for the zone at the next level (step 4).
- The local DNS server then sends a query to the next DNS server in the chain (steps 5, 6, 7, 8).
- Ultimately, the local DNS server responds to the resolver (step 9).
- The resolver provides the IP address to the application (step 10).
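The chain of iterative queries made by the local DNS server can be imitated with a toy in-memory hierarchy. This is only a sketch: the server names, the SERVERS table, and the address 192.0.2.10 (from a range reserved for documentation) are hypothetical, and no real DNS messages are exchanged.

```python
# Hypothetical name servers. Each one either answers for a hostname or
# refers the caller to the server responsible for a longer domain suffix.
SERVERS = {
    "root": {"referrals": {"com": "tld-com"}, "answers": {}},
    "tld-com": {"referrals": {"bar.com": "ns-bar"}, "answers": {}},
    "ns-bar": {"referrals": {}, "answers": {"www.west.bar.com": "192.0.2.10"}},
}


def query(server, hostname):
    """One iterative query: return ('answer', ip) or ('referral', next_server)."""
    zone = SERVERS[server]
    if hostname in zone["answers"]:
        return "answer", zone["answers"][hostname]
    for suffix, next_server in zone["referrals"].items():
        if hostname == suffix or hostname.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(f"{server} knows nothing about {hostname}")


def resolve_iteratively(hostname, start="root", max_hops=10):
    """Follow referrals from the root, as the local DNS server does."""
    server, path = start, []
    for _ in range(max_hops):
        path.append(server)
        kind, value = query(server, hostname)
        if kind == "answer":
            return value, path
        server = value  # ask the server named in the referral next
    raise LookupError("too many referrals")


if __name__ == "__main__":
    print(resolve_iteratively("www.west.bar.com"))
```

Each pass through the loop corresponds to one query/response pair between the local DNS server and a server in the hierarchy (steps 3 through 8 in Figure 5.11).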
DNS servers employ caching:
- To reduce the latency in responding to queries
- To reduce the amount of DNS traffic in the Internet
DNS primarily uses UDP for sending queries and responses, although TCP may also be used.
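The effect of caching can be sketched with a small class that remembers answers for a limited time. The class name DnsCache, the fixed time-to-live, and the lookup callback are assumptions made for illustration; real DNS servers take the lifetime of each entry from the response itself.

```python
import time


class DnsCache:
    """Remember hostname lookups so repeated queries skip the network."""

    def __init__(self, lookup, ttl_seconds=300.0, clock=time.monotonic):
        self._lookup = lookup      # function that performs the real query
        self._ttl = ttl_seconds    # assumed fixed lifetime for every entry
        self._clock = clock
        self._entries = {}         # hostname -> (address, expiry time)

    def resolve(self, hostname):
        now = self._clock()
        entry = self._entries.get(hostname)
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit: no query is sent
        address = self._lookup(hostname)
        self._entries[hostname] = (address, now + self._ttl)
        return address
```

Passing socket.gethostbyname as the lookup function would place this cache in front of the system resolver.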
DNS Queries and the Web
A Web client performs a gethostbyname() query before establishing a transport connection to the Web server.
In some cases, the client may not need to perform a DNS lookup:
- The request is directed to a proxy.
- The request is satisfied by the client cache.
- The client reuses the result of a previous query.
Although the Web client needs to learn the IP address of the Web server, the Web server knows the IP address of the client when receiving a request, because the client's IP address is included in the header of each IP packet.
The mapping of the Web client's IP address to a hostname is controlled by the DNS servers at the Web client's institution.
Mapping the client's IP address into a hostname often incurs significant delay.
In addition, these DNS queries consume resources at the Web server.
Application-Layer Protocols
- Application-Layer Protocols Definition
- Telnet Protocol
- File Transfer Protocol
- Simple Mail Transfer Protocol
- Network News Transfer Protocol
- Properties of Application-Layer Protocols
Application-Layer Protocols Definition
Applications execute on end hosts and communicate via application-layer protocols.
An application-layer protocol defines both the syntax and the semantics of the messages exchanged between the end systems.
Four key Internet applications are:
- Telnet
- File transfer
- E-mail
- Network news
Telnet Protocol
Telnet permits a user to connect to an account on a remote machine.
A client program running on the user's machine communicates using the Telnet protocol with a server program running on the remote machine.
The Telnet client program performs two important functions:
- Interacting with the user terminal on the local host
- Exchanging messages with the Telnet server
File Transfer Protocol
FTP allows a user to copy files to and from a remote machine.
The client program sends commands to the server program to coordinate the copying of files between the two machines on behalf of the user.
FTP uses separate TCP connections for control and data.
Simple Mail Transfer Protocol
SMTP supports the transfer of e-mail:
- SMTP is used to send an e-mail message from a local mail server to a remote mail server.
- SMTP is also used to send an e-mail message from the user's mail agent to the local mail server.
The separation of functionality between the user agent and the mail server is valuable:
- The mail agent provides rich features for a single user.
- The mail server provides reliable service for multiple users.
FTP and SMTP are text oriented and command based.
The communication between the two servers starts with a greeting message from the remote mail server.
Then the local mail server issues commands to transfer the e-mail message.
A typical exchange involves separate commands to:
- Identify the local mail server
- Identify the sender of the e-mail message
- Identify each recipient of the e-mail message
- Send the actual e-mail message
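The command sequence described above maps onto the standard SMTP commands HELO, MAIL FROM, RCPT TO, and DATA. The sketch below only generates the lines that the local mail server would send; the function name smtp_commands and the host and mailbox names are hypothetical, and the replies from the remote server are not modeled.

```python
def smtp_commands(local_server, sender, recipients, message_lines):
    """Return the lines a sending mail server issues for one message."""
    commands = [
        f"HELO {local_server}",      # identify the local mail server
        f"MAIL FROM:<{sender}>",     # identify the sender
    ]
    # One RCPT TO command per recipient.
    commands += [f"RCPT TO:<{recipient}>" for recipient in recipients]
    commands.append("DATA")          # announce the message itself
    # A leading "." in a message line is doubled so that it cannot be
    # mistaken for the end-of-message marker.
    commands += ["." + line if line.startswith(".") else line
                 for line in message_lines]
    commands.append(".")             # a lone "." ends the message
    commands.append("QUIT")
    return commands


if __name__ == "__main__":
    for line in smtp_commands("mail.example.com", "alice@example.com",
                              ["bob@example.org"],
                              ["Subject: hi", "", "Hello"]):
        print(line)
```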
In contrast to FTP, SMTP uses a single TCP connection for both:
- The command/reply exchanges
- The transfer of the e-mail message
In addition to transferring the message between mail servers, delivering an e-mail message requires two additional steps involving the mail agent:
- The transfer of the message to the local mail server
- The reception of the message from the remote mail server
Network News Transfer Protocol
NNTP supports the transfer of articles associated with electronic newsgroups.
A user agent uses NNTP to communicate with a local news server, which uses NNTP to communicate with a central repository of news articles.
The key idea is to store the messages in a central database instead of having a separate copy in each subscriber's mailbox.
The database consists of a collection of newsgroups, each associated with an ordered list of messages.
An article includes header lines such as:
- E-mail address of the person who posted the article
- Subject matter of the article
- Date/time when the article was generated
- Number of lines of text in the article
- Unique message identifier for the article
- List of newsgroups receiving the article
NNTP coordinates the transfer of messages between the local news server and the central repository.
NNTP may also be used between the user agent and the local news server.
Properties of Application-Layer Protocols
Telnet, FTP, SMTP, and NNTP have important similarities and differences:
- Command/reply: Telnet clients and servers send commands in binary format. FTP, SMTP, and NNTP commands are text-based and are sent by the client. The commands have a well-defined, fixed format, and the server responds with a three-digit reply code and an optional text message.
- Data types: Telnet, FTP, SMTP, and NNTP transmit textual data in the standard U.S. 7-bit ASCII format. FTP also supports the transfer of data in binary form.
- Transport: All four protocols rely on a reliable transport protocol, typically TCP. Telnet, SMTP, and NNTP use a single TCP connection for transmitting commands/replies and data. FTP uses separate connections for control and data.
- Directionality: FTP and NNTP can transfer data in both directions, copying data from the client and retrieving data from the server. SMTP is used to transmit e-mail messages from the client to the server.
- Statefulness: Under all four protocols, the server retains information about the session with the client.
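The three-digit reply codes used by FTP, SMTP, and NNTP are easy to handle in a client because the first digit alone indicates the outcome. A minimal sketch follows; the function names parse_reply and reply_category are hypothetical, and the category labels follow the convention defined for FTP and SMTP.

```python
def parse_reply(line):
    """Split a reply line such as '250 OK' into (code, text)."""
    code_text = line[:3]
    if len(code_text) != 3 or not (code_text.isascii() and code_text.isdigit()):
        raise ValueError(f"reply does not start with a three-digit code: {line!r}")
    if len(line) > 3 and line[3] not in " -":
        raise ValueError(f"unexpected character after reply code: {line!r}")
    return int(code_text), line[4:].strip()


def reply_category(code):
    """Classify a reply code by its first digit."""
    categories = {
        1: "positive preliminary",
        2: "positive completion",
        3: "positive intermediate",
        4: "transient negative",
        5: "permanent negative",
    }
    return categories.get(code // 100, "unknown")


if __name__ == "__main__":
    code, text = parse_reply("250 OK")
    print(code, text, reply_category(code))
```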
Internet Protocol
The Internet Protocol (IP) is the network-layer protocol underlying the Internet, a collection of interconnected networks spanning the globe.
IP provides a framework for sending individual packets.
In traveling from the sending host to the receiving host, a packet traverses a collection of routers that communicate via IP. (Figure 5.2)
Figure 5.2. Protocols involved in transferring HTTP messages
- Web client and Web server: HTTP over TCP over IP; HTTP messages are carried in TCP segments, which are carried in IP packets
- Routers between the client and the server: IP only, forwarding IP packets
- Links: Ethernet at the client and server ends, with a SONET link between the routers
- The routers in the Internet treat each packet independently and do not need to retain state across successive packets.
- A sequence of IP packets traveling from one host to another may not traverse the same path through the network.
- Packets may be lost, corrupted, or delivered out of order.
This model of the Internet is referred to as packet switching.
- Internet hosts are identified by numerical addresses (IP addresses).
- An IP address can be divided into a network part and a host part.
- Once the packet reaches the destination network, the host portion of the address is used to direct the packet to the appropriate destination machine.
- IP addresses are allocated in five classes (as discussed before in socket programming).
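The five address classes and the network/host split can be illustrated for 32-bit IPv4 addresses. This sketch follows the classful scheme the slide refers to, in which the first octet determines the class and classes A, B, and C use one, two, and three octets for the network part; the function names are hypothetical. Networks today are usually described with explicit prefix lengths instead.

```python
# Number of leading octets that form the network part in each class.
NETWORK_OCTETS = {"A": 1, "B": 2, "C": 3}


def address_class(ipv4_address):
    """Return the class letter (A to E) implied by the first octet."""
    first = int(ipv4_address.split(".")[0])
    if not 0 <= first <= 255:
        raise ValueError(f"first octet out of range: {ipv4_address!r}")
    if first < 128:
        return "A"
    if first < 192:
        return "B"
    if first < 224:
        return "C"
    if first < 240:
        return "D"  # multicast
    return "E"      # reserved


def split_address(ipv4_address):
    """Return (network part, host part) for a class A, B, or C address."""
    octets = ipv4_address.split(".")
    if len(octets) != 4:
        raise ValueError(f"not a dotted-decimal IPv4 address: {ipv4_address!r}")
    count = NETWORK_OCTETS.get(address_class(ipv4_address))
    if count is None:
        raise ValueError("class D and E addresses have no network/host split")
    return ".".join(octets[:count]), ".".join(octets[count:])


if __name__ == "__main__":
    print(address_class("12.34.56.57"), split_address("12.34.56.57"))
```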
Each IP packet has a header. The fields of the IP header are set by the operating system on the sending machine and are important for successful communication between the sender and receiver:
- Version number (4 bits)
- Header length (4 bits)
- Type of service (8 bits)
- Total length (16 bits)
- Identification (16 bits)
- IP flags (3 bits)
- Fragment offset (13 bits)
- Time-to-live (8 bits)
- Protocol (8 bits)
- Header checksum (16 bits)
- Source IP address (32 bits)
- Destination IP address (32 bits)
- IP options (variable length)
(Figure 5.4)
Figure 5.4. Format of an IP packet
The IP header is 20 bytes long without options and is laid out in 32-bit rows:
- Row 1: version (4 bits), header length (4 bits), type of service (8 bits), total length (16 bits)
- Row 2: identification (16 bits), flags (3 bits), fragment offset (13 bits)
- Row 3: time to live (8 bits), protocol (8 bits), header checksum (16 bits)
- Row 4: source IP address (32 bits)
- Row 5: destination IP address (32 bits)
- Options (0 or more)
- Data
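The header layout in Figure 5.4 can be decoded with Python's struct module. The sketch below unpacks the fixed 20-byte part of an IPv4 header into the fields listed above; the function name parse_ip_header and the dictionary keys are hypothetical, and options and checksum verification are left out.

```python
import socket
import struct


def parse_ip_header(packet):
    """Decode the fixed 20-byte part of an IPv4 header into a dictionary."""
    if len(packet) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    (ver_len, tos, total_length, identification, flags_offset,
     ttl, protocol, checksum, source, destination) = struct.unpack(
        "!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_len >> 4,
        # The header length field counts 32-bit words.
        "header_length_bytes": (ver_len & 0x0F) * 4,
        "type_of_service": tos,
        "total_length": total_length,
        "identification": identification,
        "flags": flags_offset >> 13,
        "fragment_offset": flags_offset & 0x1FFF,
        "time_to_live": ttl,
        "protocol": protocol,  # for example, 6 means TCP and 17 means UDP
        "header_checksum": checksum,
        "source": socket.inet_ntoa(source),
        "destination": socket.inet_ntoa(destination),
    }


if __name__ == "__main__":
    # A made-up header: version 4, 20-byte header, TTL 64, carrying TCP.
    sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0x4000, 64, 6, 0,
                         socket.inet_aton("192.0.2.1"),
                         socket.inet_aton("192.0.2.10"))
    print(parse_ip_header(sample))
```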
Transmission Control Protocol
The following topics will be discussed:
- Transmission Control Protocol Definition
- Opening and Closing a TCP Connection
- Sliding-Window Flow Control
- Retransmission of Lost Packets
Transmission Control Protocol Definition
The Transmission Control Protocol (TCP) coordinates the transmission of data between a pair of applications.
Applications communicate by reading from and writing to a socket that presents data as an ordered, reliable stream of bytes.
The TCP sender divides data into segments and transmits each segment in an IP packet along with a TCP header.
The TCP header includes information necessary to coordinate the ordered, reliable delivery of segments.
The sending and receiving applications should be allowed to assume that they communicate over a channel that provides an ordered, reliable byte stream.
IP does not provide this service. Instead, this abstraction is provided by TCP.
Opening and Closing a TCP Connection
The SYN, ACK, FIN, and RST flags in the TCP header are used in opening and closing a TCP connection.
Establishing a TCP connection between two applications, A and B, involves a three-way handshake (Figure 5.5):
- SYN from A to B
- SYN-ACK from B to A
- ACK from A to B
Terminating a TCP connection between two applications, A and B, involves a four-way handshake (Figure 5.5):
- FIN from B to A
- ACK from A to B
- FIN from A to B
- ACK from B to A
Figure 5.5. Timeline of a TCP connection
- Opening: SYN, SYN-ACK, ACK between A and B
- Data transfer: DATA segments and the ACKs that acknowledge them
- Closing: FIN, ACK, FIN, ACK
Sliding-Window Flow Control
The TCP sender limits the transmission of data for two reasons:
- The sender should not transmit more data than the receiver can store in its buffers.
- The sender should not transmit data more quickly than the network can handle.
Each TCP sender limits the number of unacknowledged bytes in the network, using sliding-window flow control.
To avoid overflowing the buffer at the receiver B, packets from B to A include B's receiver window in the TCP header.
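The receiver window bounds how many bytes the sender may have outstanding at once. A minimal sketch of that bookkeeping follows; the function name usable_window and its parameter names are hypothetical, and the limit imposed by the network itself is not modeled.

```python
def usable_window(receiver_window, last_byte_sent, last_byte_acked):
    """Return how many more bytes the sender may transmit right now.

    receiver_window is the window advertised by the receiver, and the
    other two arguments are byte positions in the sender's stream.
    """
    in_flight = last_byte_sent - last_byte_acked  # sent but not yet acknowledged
    return max(0, receiver_window - in_flight)


if __name__ == "__main__":
    # 2000 bytes are unacknowledged out of a 4096-byte window.
    print(usable_window(4096, last_byte_sent=3000, last_byte_acked=1000))
```

As acknowledgements arrive, last_byte_acked advances and the window "slides" forward, which lets the sender transmit new data.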
Retransmission of Lost Packets
The retransmission of lost packets plays a crucial role in how TCP provides reliable delivery of a stream of bytes.
The sender infers that a packet has been lost in two ways:
- A retransmission timeout (RTO)
- Duplicate acknowledgements
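Loss detection through duplicate acknowledgements can be sketched as a scan over the acknowledgement numbers that the sender receives. The function name find_fast_retransmits is hypothetical, and the threshold of three duplicates is an assumption based on common TCP practice rather than something stated in the slides.

```python
def find_fast_retransmits(ack_numbers, threshold=3):
    """Return the acknowledgement numbers that signal a lost packet.

    An acknowledgement that repeats the previous number is a duplicate.
    Once `threshold` duplicates of the same number have arrived, the
    sender infers that the packet starting at that number was lost.
    """
    retransmits = []
    last_ack, duplicates = None, 0
    for ack in ack_numbers:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                retransmits.append(ack)
        else:
            last_ack, duplicates = ack, 0
    return retransmits


if __name__ == "__main__":
    print(find_fast_retransmits([1000, 2000, 2000, 2000, 2000, 5000]))
```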
Selecting an appropriate value for the RTO is a delicate process:
- Setting the RTO too low results in a false alarm, and the sender unnecessarily retransmits a packet that was not actually lost.
- Setting the RTO too high postpones the detection of a lost packet, resulting in unnecessary delay in retransmitting the packet.
The right value for the RTO depends on:
- The distance between the sender and receiver
- The network congestion
The time between the transmission of a packet and the receipt of its acknowledgement is called the round-trip time (RTT).
The RTO is set to the average RTT plus an additive factor.
In some cases, the sender can infer that a packet has been lost without waiting for the retransmission timer to expire.
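One common way to compute "average RTT plus an additive factor" is to keep a smoothed RTT and a smoothed RTT variation, and to add a multiple of the variation to the average. The sketch below follows that approach as standardized for TCP; the class name RtoEstimator is hypothetical, the constants (1/8, 1/4, and 4) are assumptions taken from that standard rather than from the slides, and the lower and upper bounds that real implementations apply to the RTO are omitted.

```python
class RtoEstimator:
    """Track a smoothed RTT and derive a retransmission timeout from it."""

    def __init__(self, alpha=0.125, beta=0.25, k=4.0):
        self.alpha = alpha   # weight of a new sample in the smoothed RTT
        self.beta = beta     # weight of a new sample in the RTT variation
        self.k = k           # multiplier for the additive factor
        self.srtt = None     # smoothed (average) RTT, in seconds
        self.rttvar = None   # smoothed RTT variation, in seconds

    def update(self, rtt_sample):
        """Fold in one RTT measurement and return the new RTO."""
        if self.srtt is None:
            # First measurement: start the average at the sample itself.
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            # The variation is updated first, using the old average.
            deviation = abs(self.srtt - rtt_sample)
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * deviation
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt_sample
        return self.srtt + self.k * self.rttvar


if __name__ == "__main__":
    estimator = RtoEstimator()
    for sample in (0.100, 0.120, 0.090, 0.300):
        print(f"RTT {sample:.3f} s -> RTO {estimator.update(sample):.3f} s")
```

A sudden jump in the measured RTT raises the variation term, so the RTO grows quickly and false alarms become less likely.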