Title: DHTTP: An Efficient and Cache-Friendly Transfer Protocol for the Web
1. DHTTP: An Efficient and Cache-Friendly Transfer Protocol for the Web
- By Michael Rabinovich and Hua Wang
- Presented by Jerry Usery
2. Overview
- Introduction
- DHTTP protocol
- Implications of DHTTP
- Server design
- Performance analysis
- Future work
- Summary
3. Introduction
- Two issues addressed
- 1) Violation of the end-to-end principle by interception caches (Web proxies)
- Impersonation of origin servers
- Web interactions may be disrupted
- 2) Performance implications of client-initiated TCP (Transmission Control Protocol) as the transport protocol
- HTTP generally conceived as a file transfer protocol
- TCP connection overhead, persistence and pipelining penalties
4. Introduction: Interception Cache
5. Introduction: Performance Implications
- TCP connection overhead, persistence and pipelining penalties
- Persistent connections
- Throughput degrades as the number of open connections grows
- Pipelined transmissions
- Server must maintain connections and send responses in order
- Head-of-line delays can occur with slow responses
6. Introduction: Main Ideas of DHTTP
- Two main ideas of DHTTP (Dual-Transport HTTP Protocol)
- 1) Split Web traffic between UDP (User Datagram Protocol) and TCP
- Client typically sends requests via UDP
- Server sends response via UDP or TCP, based on:
- Response size
- Network conditions
- 2) Server establishes the TCP connection to the client
7. Introduction: UDP vs TCP
- UDP used for short responses
- Reduces the number of open server connections
- Reduces the number of TCP connection setups
- DHTTP benefits
- Improves client latency
- Fewer Web interactions wait for connections
- Increases server capacity
- Servers manage fewer open connections
- Remaining TCP connections reserved for larger objects
- Improved utilization of TCP connections
- Ordering constraints of pipelining no longer apply
8. Introduction: DHTTP Server-Established TCP Connections
- TCP connection client/server roles reversed
- Some firewall implications (existing countermeasures suffice)
- Benefits
- Retains the end-to-end Internet principle
- Interception caches operate under their true IP addresses
- Allows arbitrary deployment of interception caches
- Server-initiated TCP reduces message round trips (even with an initial UDP request)
- Message round trips decrease further with UDP
- Bottleneck removed (the server no longer accepts TCP connections)
9. DHTTP Protocol
- Web clients and servers listen on two ports
- One UDP, one TCP
- Servers use well-known ports (or URL-specified)
- Clients use ephemeral ports (short-lived, per download)
- Two channels exist between client and server
- UDP used for requests below 1460 bytes
- Most HTTP requests fit within this limit
- Server opens a TCP connection, or reuses an open one, for larger messages
- If the client sent its request over TCP, the server may use that connection
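The two-channel setup above can be sketched in Python. The socket layout follows the slide (an ephemeral UDP socket plus an ephemeral TCP listening socket for the server to connect back to, and UDP for requests under 1460 bytes); the helper names and the localhost addresses are illustrative, not part of the paper.

```python
import socket

UDP_REQUEST_LIMIT = 1460  # one Ethernet MTU payload; smaller requests go over UDP


def open_client_channels():
    """Open the two ephemeral client channels DHTTP uses:
    a UDP socket and a TCP listening socket (the server connects back)."""
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.bind(("127.0.0.1", 0))   # ephemeral UDP port for requests/responses
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.bind(("127.0.0.1", 0))
    tcp.listen(1)                # a server-established TCP connection lands here
    return udp, tcp


def send_request(udp_sock, request: bytes, server_addr):
    """Send the request over UDP when it fits under the threshold;
    otherwise signal the caller to fall back to a TCP connection."""
    if len(request) < UDP_REQUEST_LIMIT:
        udp_sock.sendto(request, server_addr)
        return "udp"
    return "tcp"  # caller would open or reuse a TCP connection instead
```

Because the client's TCP socket only listens, connection setup cost is paid by the server, and only when a response actually needs TCP.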
10. DHTTP Protocol: Message Exchange
- (a) HTTP (b) DHTTP over UDP (c) DHTTP over TCP
11. DHTTP Protocol: Message Format Description
- Responses may arrive at the client out of order
- Client assigns a request ID, then matches it against responses
- Client advises the server of its listening port numbers
- The requesting channel's own source port is already carried in the packet header
- UDP request contains the client's TCP port number
- TCP request contains the client's UDP port number
- Flag field marks duplicate (resent) requests
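A minimal sketch of the request header these bullets describe. The slides name the fields (a resend flag, a request ID, and the other channel's port number) but not a wire encoding, so the byte layout, field widths, and helper names below are assumptions for illustration only.

```python
import struct

# Hypothetical fixed header: 1-byte flags, 2-byte request ID,
# 2-byte "other channel" port, all network byte order.
HEADER = struct.Struct("!BHH")
FLAG_RESEND = 0x01  # marks a duplicate (resent) request


def pack_request(req_id: int, other_port: int, resend: bool = False,
                 body: bytes = b"") -> bytes:
    """Build a request: header fields followed by the HTTP payload."""
    flags = FLAG_RESEND if resend else 0
    return HEADER.pack(flags, req_id, other_port) + body


def unpack_request(datagram: bytes) -> dict:
    """Parse a request back into its named fields."""
    flags, req_id, other_port = HEADER.unpack_from(datagram)
    return {"resend": bool(flags & FLAG_RESEND),
            "req_id": req_id,
            "other_port": other_port,
            "body": datagram[HEADER.size:]}
```

The request ID is what lets the client match out-of-order UDP responses to their requests; the port field tells the server where the client's other channel listens.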
12. DHTTP Protocol: Message Format
13. DHTTP Protocol
- Reliability
- DHTTP stipulates that a client may resend UDP requests
- Lessens overhead
- Fighting denial-of-service attacks is beyond the paper's scope
- Nonidempotent requests (e-commerce, etc.)
- Delegated to the TCP channel
- Congestion control
- DHTTP responds to any resent client request via TCP
- Mitigates packet loss
- In congested Internet experiments, only 6% of responses were sent via UDP
14. DHTTP Protocol: Channel Selection Algorithm
- Based on response size and network condition
- Server maintains counters of fresh requests and resent requests
- Loss threshold parameter L (currently 1)
- Size threshold parameter S (1460 bytes, default)
- Algorithm
- 1) Choose TCP for all large responses, i.e., those whose size exceeds S, as well as for all resent requests.
- 2) If the ratio of the resent-request counter to the fresh-request counter exceeds L, enter high-loss mode; otherwise enter low-loss mode.
- 3) In low-loss mode, choose UDP for all small responses, i.e., those below the size threshold S.
- 4) In high-loss mode, choose TCP for a 1-L fraction of the small responses and UDP for the remaining L fraction.
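The four steps above can be sketched as follows. The slide's "L (currently 1)" is ambiguous; this sketch reads L as a loss fraction of 1%, and that reading, like the function names, is an assumption rather than the paper's definition.

```python
import random

S = 1460   # size threshold (bytes): larger responses always use TCP
L = 0.01   # loss threshold, read here as a 1% fraction (assumption)


def high_loss_mode(fresh_count: int, resent_count: int) -> bool:
    """Step 2: compare the resent/fresh request ratio to the loss threshold."""
    return fresh_count > 0 and resent_count / fresh_count > L


def choose_channel(response_size: int, resent: bool, high_loss: bool) -> str:
    # Step 1: TCP for all large responses and for all resent requests.
    if response_size > S or resent:
        return "tcp"
    # Step 3: low-loss mode sends every small response over UDP.
    if not high_loss:
        return "udp"
    # Step 4: high-loss mode keeps sending an L fraction of small
    # responses over UDP (to keep probing the path) and the rest over TCP.
    return "udp" if random.random() < L else "tcp"
```

Routing resent requests to TCP (step 1) is what makes the scheme self-correcting: the lost response is retried over a reliable, congestion-controlled channel.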
15. DHTTP Implications
- DHTTP and interception caches
- DHTTP interception caches intercept UDP requests; TCP requests pass through
- Client is aware it speaks with a cache
- Retains the end-to-end principle (even with caching)
- DHTTP uses the UDP channel for short requests
- Reduces TCP setup costs, open connections, and response time
- DHTTP allows the client or server to unilaterally close a TCP connection
- Safe because no in-transit data exists on the connection
16. DHTTP Implications: Server Design Description
- Master process
- Accepts incoming requests
- Three threads
- ReadRequest thread
- Reads incoming UDP requests, copies them into a global buffer
- PipeRequest thread
- Pipes requests from the global buffer to worker processes
- The global buffer drains requests as quickly as possible so the UDP port buffer does not fill up and requests are not dropped
- Maintenance thread checks worker process status every second
- If there are too few idle worker processes, it forks new ones
- If too many, it kills some
17. DHTTP Implications: Server Design Description
- Worker processes
- Execute individual requests and respond to clients
- Read piped requests, generate the response, choose UDP or TCP, send the response
- If TCP is chosen and a connection to the client already exists, it is reused
- One TCP connection per client, possibly many clients
- Includes a Timeout thread
- Closes TCP connections idle longer than the timeout period
18. DHTTP Implications: Server Design
- Modified Apache 1.3.6 Web server
19. Performance Analysis
- Three-pronged performance study
- Trace-driven simulation
- Measured the number and utilization of TCP connections experienced by the server (HTTP and DHTTP)
- Benchmarked Apache HTTP/DHTTP servers with clients on the same LAN
- Compared peak performance and scalability
- Tested both servers in a WAN environment with a congested Internet connection
20. Performance Analysis: Simulation
- Used an access log from AT&T low-end hosting services
- Three-month duration
- Contained over 100 million accesses
- Average response size of 13 KB
- Two size threshold values used
- 4 KB, an optimistic value covering many small Web responses
- 1460 bytes, a conservative value, one Ethernet MTU (maximum transmission unit)
21. Performance Analysis: Simulation
- Number of TCP connections at a server with three connections per client
22. Performance Analysis: Simulation
- Connection utilization with three connections per client
23. Performance Analysis: Simulation
- Number of TCP connections at a server with one connection per client
24. Performance Analysis: Simulation
- Connection utilization with one connection per client
25. Performance Analysis: Prototype Testing Results
- Apache performance (bottleneck at server)
- (a) Throughput with three connections per client
- (b) Latency with three connections per client
26. Performance Analysis: Prototype Testing Results
- Apache performance (bottleneck at server)
- (c) Throughput with one connection per client
- (d) Latency with one connection per client
27. Performance Analysis: Prototype Testing Results
- DHTTP server performance (bottleneck at server)
- (a) Throughput with three connections per client
- (b) Latency with three connections per client
28. Performance Analysis: Prototype Testing Results
- DHTTP server performance (bottleneck at server)
- (c) Throughput with one connection per client
- (d) Latency with one connection per client
29. Performance Analysis: Performance Comparison
- Comparison of Apache and DHTTP servers
- (a) Throughput (b) Latency
30. Performance Analysis: Performance Comparison
- Apache and DHTTP server performance under network congestion
- (a) Throughput (b) Latency
31. Performance Analysis: Performance Comparison
- Effectiveness of congestion detection in the DHTTP server
32. Future Work
- Dividing a response among several UDP packets
- Would likely allow higher size thresholds
- Building native support for nonidempotent requests
- Investigating finer-grained algorithms that track network conditions at the client and/or subnet level
- Dynamic policies for size threshold selection
- High-loss vs low-loss environments
33. Summary
- Introduction
- DHTTP protocol
- Implications of DHTTP
- Server design
- Performance analysis
- Future work
- Summary