Title: Introduction to Content-aware Switch
1. Introduction to Content-aware Switch
2. Content-aware Switch (CS)
[Figure: a request from the Internet reaches the switch, which forwards it to an image server, an application server, or an HTML server. Example request: "GET /cgi-bin/form HTTP/1.1", "Host: www.yahoo.com". Packet layers shown: IP, TCP, application data.]
- Front-end of a web cluster
- Routes packets based on layer 5/7 (content) information
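The routing decision on this slide can be sketched as follows. This is a minimal illustration, not the switch's actual logic: the pool names, the placeholder addresses, and the classification rules (CGI paths to the application server, image suffixes to the image server) are assumptions made for the example.

```python
# Hypothetical server pools; the addresses are placeholders, not from the slides.
SERVER_POOLS = {
    "image": ["10.0.0.11"],
    "application": ["10.0.0.21"],
    "html": ["10.0.0.31"],
}

IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")


def parse_request_line(request: bytes) -> tuple[str, str]:
    """Return (method, path) from the first line of an HTTP request."""
    first_line = request.split(b"\r\n", 1)[0].decode("ascii")
    method, path, _version = first_line.split(" ", 2)
    return method, path


def classify(path: str) -> str:
    """Pick a pool name from the URL path (layer-7 information)."""
    path_only = path.split("?", 1)[0].lower()
    if path_only.startswith("/cgi-bin/"):
        return "application"
    if path_only.endswith(IMAGE_SUFFIXES):
        return "image"
    return "html"


def route(request: bytes) -> str:
    """Return the address of the first server in the pool chosen for this request."""
    _method, path = parse_request_line(request)
    return SERVER_POOLS[classify(path)][0]


request = b"GET /cgi-bin/form HTTP/1.1\r\nHost: www.yahoo.com\r\n\r\n"
print(route(request))  # 10.0.0.21
```

A layer-4 switch could not make this choice, because the URL only becomes visible after the TCP connection is established and the request arrives.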
3. Why use CS
- Servers can be specialized for certain types of request
- Content segregation
- Exploit locality
- Affinity-based routing
- Increased performance because of the improved hit rate
- Partial replication of the server file set
- Partition the server's file set over different nodes
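One way to picture affinity-based routing over a partitioned file set is a stable hash from URL path to node, so that repeated requests for the same document reach the same server and its cache. The node names and the hash-based partitioning below are assumptions for illustration; the slides do not say how the partition is computed.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical back-end names


def node_for(path: str, nodes: list[str] = NODES) -> str:
    """Map a URL path to one node so repeated requests hit the same cache."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(nodes)
    return nodes[index]


def wrr_node(request_number: int, nodes: list[str] = NODES) -> str:
    """Round-robin with equal weights, for contrast: it ignores the content."""
    return nodes[request_number % len(nodes)]


# The same path always maps to the same node, whatever the arrival order.
print(node_for("/docs/a.html") == node_for("/docs/a.html"))  # True
# Round-robin spreads three consecutive requests for one path over all nodes.
print(sorted({wrr_node(i) for i in range(3)}))  # ['node-a', 'node-b', 'node-c']
```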
4. Content-aware Switch Architecture
- Two-way architecture: the server returns the response to the switch
- One-way architecture: the server returns the response to the client
[Figure: client, switch, server]
Ref: Valeria01
5. Layer-7 Two-way Architecture
6. Layer-7 Two-way Mechanisms
- TCP gateway
- An application-level proxy running on the web switch mediates the communication between the client and the server
- TCP splicing
- Reduces the overhead of the TCP gateway: packets are forwarded at the network level, between the network interface driver and the TCP/IP stack, directly by the OS
[Figure: user and kernel levels]
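7. TCP Splicing
[Figure: message sequence between client, content switch, and server]
step 1: client -> switch: SYN(CSEQ)
step 2: switch -> client: SYN(DSEQ), ACK(CSEQ+1)
step 3: client -> switch: DATA(CSEQ+1), ACK(DSEQ+1)
step 7: server -> switch: DATA(SSEQ+1), ACK(CSEQ+lenR+1); switch -> client: DATA(DSEQ+1), ACK(CSEQ+lenR+1)
step 8: client -> switch: ACK(DSEQ+lenD+1); switch -> server: ACK(SSEQ+lenD+1)
lenR: size of the HTTP request
lenD: size of the returned document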
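The sequence-number conversion in steps 7 and 8 can be sketched as a fixed offset between the server's numbering (SSEQ) and the numbering the switch gave the client (DSEQ). The class and method names are hypothetical, and the sketch assumes, as the diagram suggests, that client-side numbers (CSEQ) pass through unchanged.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


class Splice:
    """Translate between the server's sequence space and the client's view."""

    def __init__(self, dseq: int, sseq: int) -> None:
        # Offset added to a server sequence number to land in the DSEQ space.
        self.offset = (dseq - sseq) % MOD

    def server_to_client_seq(self, seq: int) -> int:
        """Step 7: DATA(SSEQ+1) from the server becomes DATA(DSEQ+1)."""
        return (seq + self.offset) % MOD

    def client_to_server_ack(self, ack: int) -> int:
        """Step 8: ACK(DSEQ+lenD+1) from the client becomes ACK(SSEQ+lenD+1)."""
        return (ack - self.offset) % MOD


# Illustrative numbers only.
dseq, sseq, len_d = 1000, 5000, 300
splice = Splice(dseq, sseq)
print(splice.server_to_client_seq(sseq + 1))          # 1001
print(splice.client_to_server_ack(dseq + len_d + 1))  # 5301
```

The modulo keeps the translation correct when either initial sequence number is close to the 32-bit wrap-around point.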
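8. TCP Splicing w/ Pre-forked Connections
[Figure: message sequence between client, switch, and server]
step 1: switch -> server: SYN(PSEQ)
step 2: server -> switch: SYN(SSEQ), ACK(PSEQ+1)
step 3: switch -> server: ACK(SSEQ+1)
step 4: client -> switch: SYN(CSEQ)
step 5: switch -> client: SYN(DSEQ), ACK(CSEQ+1)
step 6: client -> switch: DATA(CSEQ+1), ACK(DSEQ+1)
step 7: switch -> server: DATA(PSEQ+1), ACK(SSEQ+1)
step 8: server -> switch: DATA(SSEQ+1), ACK(PSEQ+lenR+1); switch -> client: DATA(DSEQ+1), ACK(CSEQ+lenR+1)
step 9: client -> switch: ACK(DSEQ+lenD+1); switch -> server: ACK(SSEQ+lenD+1)
lenR: size of the HTTP request
lenD: size of the returned document
Ref: Yang99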
9. Pre-Allocate Server Scheme
[Figure: client, content switch, pre-allocated server]
- Use a guessed routing decision based on IP address, port, and history
- Advantages
- Faster than TCP splicing
- Reduces session-processing overhead: no need to convert server sequence numbers
Ref: Edward
10. Degenerates to TCP Splicing If the Guess Is Wrong
[Figure: message sequence between client, content switch, pre-allocated server, and right server]
step 1: SYN(CSEQ); SYN(CSEQ)
step 2: SYN(SSEQ), ACK(CSEQ+1); SYN(SSEQ), ACK(CSEQ+1)
step 3: DATA(CSEQ+1), ACK(SSEQ+1); FIN(CSEQ+1)
step 4 (right server): DATA(RSEQ+1), ACK(CSEQ+lenR+1); DATA(SSEQ+1), ACK(CSEQ+lenR+1)
step 5: ACK(DSEQ+lenD+1); ACK(SSEQ+lenD+1)
- Sequence conversion needed
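The guess-then-verify logic of the pre-allocate scheme can be sketched as below. The history table keyed by client IP is only one possible reading of "IP/Port/History"; the class and its method names are hypothetical.

```python
class PreAllocator:
    """Guess a server before the request is seen, then check the guess."""

    def __init__(self, default_server: str) -> None:
        self.default_server = default_server
        # Assumed history: client IP -> server that handled its last request.
        self.history: dict[str, str] = {}

    def guess(self, client_ip: str) -> str:
        """Pick the pre-allocated server as soon as the SYN arrives."""
        return self.history.get(client_ip, self.default_server)

    def resolve(self, client_ip: str, guessed: str, right: str) -> bool:
        """Record the right server; return True if sequence conversion is needed."""
        self.history[client_ip] = right
        return guessed != right


alloc = PreAllocator(default_server="html-1")
first = alloc.guess("192.0.2.7")
print(first, alloc.resolve("192.0.2.7", first, "cgi-1"))    # html-1 True
second = alloc.guess("192.0.2.7")
print(second, alloc.resolve("192.0.2.7", second, "cgi-1"))  # cgi-1 False
```

A correct guess lets packets pass without rewriting the server's sequence numbers; a wrong guess falls back to the splicing-style conversion shown on this slide.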
11. Case Study
- Linux-based content-aware switch (Yang99)
- IBM Layer 5 switch (Pradhan00)
12. Functional Overview of Content-aware Distributor
Ref: Yang99
13. Results
- Overhead of the switch
- 89 usec, reduced with pre-forked connections
- CS vs. layer-4 switch
- Affinity-based routing vs. WRR
- Content segregation vs. WRR
- CGI: 27
- Static: 36
14. IBM Switch Architecture
- Switch core
- Port controller
- Identifies layer-5 packets and sends them to the CPU
- Processes all other packets
- CPU: PowerPC 603e
- Parses the HTTP request
- URL-based routing
Ref: Pradhan00
15. Flow Diagram on Layer 5 System
- Client ports vs. server ports
- Classifier: identifies packets
16. Results
- CS vs. layer-4 switch
- The entire set of files is replicated
- Some servers share files over NFS
- Partitioned file set
17. Layer-7 One-way Architecture
18. Layer-7 One-way Mechanisms
- TCP handoff
- The switch hands off the TCP connection endpoint to the server
- TCP connection hop
- Software-based proprietary solution
- Encapsulates the IP packet in an RPX packet and sends it to the server
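19. TCP Handoff
[Figure: message sequence between client, content switch, and server]
step 1: client -> switch: SYN(CSEQ)
step 2: switch -> client: SYN(DSEQ), ACK(CSEQ+1)
step 3: client -> switch: DATA(CSEQ+1), ACK(DSEQ+1)
step 4: switch -> server: Migrate(Data, CSEQ, DSEQ)
step 5: server -> client: DATA(DSEQ+1), ACK(CSEQ+lenR+1)
step 6: client -> switch: ACK(DSEQ+lenD+1); switch -> server: ACK(DSEQ+lenD+1)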
- Migrate the created TCP connection from the switch to the back-end server
- Create a TCP connection at the back-end without going through the TCP three-way handshake
- Retrieve the state of an established connection and destroy the connection without going through the normal message handshake required to close a TCP connection
- Once the connection is handed off to the back-end server, the switch must forward packets from the client to the appropriate back-end server
Ref: Pai98
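The Migrate(Data, CSEQ, DSEQ) message can be pictured as a small record that carries enough state for the back-end to continue the connection without a new handshake. The wire layout below (client address, port, two sequence numbers, request bytes) is an assumption for illustration and not the format used in Pai98.

```python
import ipaddress
import struct
from dataclasses import dataclass

MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits
# Assumed layout: client IPv4, client port, CSEQ, DSEQ, data length (network order).
_HEADER = struct.Struct("!4sHIIH")


@dataclass
class Migrate:
    client_ip: str
    client_port: int
    cseq: int    # client's initial sequence number
    dseq: int    # switch's initial sequence number towards the client
    data: bytes  # the HTTP request already received by the switch

    def pack(self) -> bytes:
        ip = ipaddress.IPv4Address(self.client_ip).packed
        header = _HEADER.pack(ip, self.client_port, self.cseq, self.dseq, len(self.data))
        return header + self.data

    @classmethod
    def unpack(cls, raw: bytes) -> "Migrate":
        ip, port, cseq, dseq, length = _HEADER.unpack_from(raw)
        data = raw[_HEADER.size:_HEADER.size + length]
        return cls(str(ipaddress.IPv4Address(ip)), port, cseq, dseq, data)

    def first_reply_numbers(self) -> tuple[int, int]:
        """Step 5: the back-end sends DATA(DSEQ+1), ACK(CSEQ+lenR+1)."""
        return (self.dseq + 1) % MOD, (self.cseq + len(self.data) + 1) % MOD


msg = Migrate("192.0.2.7", 40000, cseq=100, dseq=200, data=b"GET / HTTP/1.1\r\n\r\n")
restored = Migrate.unpack(msg.pack())
print(restored == msg)                 # True
print(restored.first_reply_numbers())  # (201, 119)
```

Because the back-end replies with DSEQ-based numbers, the response needs no translation and can bypass the switch.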
20. Case Study
- Scalable content-aware request distribution in cluster-based network servers (Aron00)
21. TCP Handoff
- (1) A client connects to the front-end
- (2) The dispatcher at the front-end accepts the connection and hands it off to a back-end server using the handoff protocol
- (3) The back-end takes over the established connection received by the handoff protocol
- (4) The server at the back-end accepts the created connection
- (5) The server at the back-end sends replies directly to the client
22. Scalability of a Single Front-end
23. Scalable Cluster Design
- Switch
- Dispatcher component
- Implements the request distribution: decides which server should handle each request
- 0.8 usec
- Distributor component
- Distributes the client requests to the servers (handoff or splicing)
- 300 usec for handoff, >750 usec for splicing
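A rough back-of-the-envelope reading of these figures, assuming each one is the CPU time spent per request on a node that does nothing else (the slide does not state this explicitly):

```python
def max_rate_per_second(cost_usec: float) -> float:
    """Upper bound on requests per second for a given per-request cost."""
    return 1_000_000 / cost_usec


DISPATCH_USEC = 0.8   # dispatcher decision
HANDOFF_USEC = 300.0  # distributor using TCP handoff
SPLICE_USEC = 750.0   # distributor using splicing; the slide gives this as a lower bound

dispatcher_rate = max_rate_per_second(DISPATCH_USEC)
handoff_rate = max_rate_per_second(HANDOFF_USEC)
splice_rate = max_rate_per_second(SPLICE_USEC)

print(f"dispatcher:  {dispatcher_rate:,.0f} requests/s")
print(f"handoff:     {handoff_rate:,.0f} requests/s per distributor")
print(f"splicing:  < {splice_rate:,.0f} requests/s per distributor")
print(f"distributors one dispatcher could feed: {HANDOFF_USEC / DISPATCH_USEC:.0f}")
```

Under that assumption the distributor is the expensive component by more than two orders of magnitude, which is the motivation for replicating distributors while keeping a single dispatcher.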
24. Cluster Operation
- (1) The layer-4 switch receives a SYN packet and chooses the least-loaded distributor
- (2) The distributor accepts the TCP connection and parses the client request
- (3) The distributor contacts the dispatcher for the assignment of the request to a server
- (4) The distributor hands off the connection to the server using the TCP handoff protocol
- (5) The server takes over the connection using its handoff protocol
- (6) The server application at the server node accepts the connection
- (7) The server sends the response directly to the client
- (8) (Not shown) The switch forwards TCP acknowledgments to the corresponding server
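The steps above can be traced with a small simulation. Everything here is a sketch under assumptions: the class names are hypothetical, load is counted as open connections, and the dispatcher's policy (keep each path on one server, spread new paths round-robin) stands in for whatever policy the real system uses.

```python
from dataclasses import dataclass, field


@dataclass
class Distributor:
    name: str
    open_connections: int = 0


@dataclass
class Dispatcher:
    servers: list[str]
    assigned: dict[str, str] = field(default_factory=dict)

    def assign(self, path: str) -> str:
        """Assumed policy: keep each path on one server, spread new paths round-robin."""
        if path not in self.assigned:
            self.assigned[path] = self.servers[len(self.assigned) % len(self.servers)]
        return self.assigned[path]


def handle_request(path: str, distributors: list[Distributor],
                   dispatcher: Dispatcher) -> tuple[str, list[str]]:
    """Walk one request through steps (1) to (7) and return (server, trace)."""
    trace = []
    # (1) The layer-4 switch picks the least-loaded distributor.
    distributor = min(distributors, key=lambda d: d.open_connections)
    distributor.open_connections += 1  # this sketch never releases connections
    trace.append(f"(1) switch -> {distributor.name}")
    trace.append(f"(2) {distributor.name} accepts and parses GET {path}")
    # (3) The distributor asks the dispatcher which server should handle it.
    server = dispatcher.assign(path)
    trace.append(f"(3) dispatcher assigns {server}")
    trace.append(f"(4)-(6) {distributor.name} hands off to {server}")
    trace.append(f"(7) {server} replies directly to the client")
    return server, trace


distributors = [Distributor("dist-1"), Distributor("dist-2")]
dispatcher = Dispatcher(servers=["srv-a", "srv-b"])
for path in ["/a.html", "/b.html", "/a.html"]:
    server, trace = handle_request(path, distributors, dispatcher)
    print(trace[0], "|", trace[2])
```

Requests alternate between the two distributors, while both requests for `/a.html` end up on the same server.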
25. Results
- The proposed cluster architecture scales far
better than the one with a single front-end node.
26. Our Current Research on CS
- IXP 1200
- StrongARM @ 233 MHz
- Microengines (6)
- IXP 2400
- XScale @ 700 MHz
- Microengines (8)
27. Our Design
28. Using TCP Splicing
29. Results
30. References
- Pradhan00: G. Apostolopoulos et al., Design, Implementation and Performance of a Content-Based Switch, Proceedings of IEEE INFOCOM 2000
- Pai98: V. S. Pai et al., Locality-Aware Request Distribution in Cluster-based Network Servers, Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, Oct. 1998
- Aron00: Mohit Aron et al., Scalable Content-aware Request Distribution in Cluster-based Network Servers, Proc. of the 2000 Annual USENIX Technical Conference, June 2000
- Edward: C. Edward Chow, Introduction to Content Switch
- Valeria01: Valeria Cardellini et al., The State of the Art in Locally Distributed Web-server Systems, IBM research report
- Yang99: Chu-Sing Yang et al., Efficient Support for Content-based Routing in Web Server Clusters, Proc. of USITS '99