Title: Implementing a Load-balanced Web Server System
1 Implementing a Load-balanced Web Server System
2 Architecture of a Cluster-based Web System
Courtesy: IBM Research Report, "The State of the Art in Locally Distributed Web-Server Systems."
3 Architecture of Our Web Server Cluster
[Figure: HTTP requests arrive at the Load Distributor (grid1.cs.ucr.edu), which passes them to Web Server 1 (File Service, Database Service) and Web Server 2 (File Service, Video-On-Demand).]
4 Our Web Server Cluster
- The whole web server cluster exposes only one visible web address to the outside world.
- Each Web Server is able to provide two kinds of web services.
- The load distributor distributes the incoming requests among the servers according to either content-aware or content-unaware load balancing strategies.
5 Tasks to Set Up the System
- Building up the web services on the servers
  - File services
  - Video-on-demand services
  - Database services
- Implementing the load distributor on the front-end node
  - Content-aware request distribution
  - Content-unaware request distribution
6 Building Up the Web Services
- File service
  - Built on top of the Apache server.
  - File set is generated by SPECWEB99.
- Video-on-demand
  - Real MPEG-2 movies are stored in a specific directory on the Apache server.
  - Client video streaming software (VideoLAN) is installed and automatically launched by the Apache server.
- Database service
  - Built on top of Apache and MySQL.
7 Video On Demand Service
- VideoLAN project (open source media streaming solution)
  - Targets multimedia streaming of MPEG-1, MPEG-2, MPEG-4 and DivX files, DVDs, digital satellite channels, digital terrestrial television channels and live videos on a high-bandwidth IPv4 or IPv6 network in unicast or multicast.
- Client-server architecture
  - Server streams MPEG-1, MPEG-2 and MPEG-4 / DivX files, DVDs and live videos on the network in unicast or multicast.
  - Client receives, decodes and displays the MPEG stream.
8 VideoLAN System
9 Building Up Video-On-Demand Service in Our Web Server
- VideoLAN client-server software is installed.
- Server can stream movies to the client in real time through UDP/RTP or HTTP/TCP.
- For video-on-demand service using HTTP/TCP, only the client is needed. The client software (vlc) is automatically launched once the Apache server detects that the requested file is a video file.
10 Load Balancing Schemes
- Content-unaware scheme
  - Choose a server before receiving the URL request
  - Round robin
- Content-aware schemes
  - Choose a server to dispatch a request to after receiving and inspecting the URL request
  - Balance load according to the type of URL request
    - For database service: Database Server
    - For video-on-demand service: Multimedia Server
    - For file service: round robin
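The two schemes above can be sketched as follows. This is an illustrative Python sketch, not the distributor itself (which the slides say was written in C). The server names, the `/db/` prefix and the video file suffixes are assumptions made for the example; the slides do not say how request types are recognised.

```python
from itertools import cycle
from urllib.parse import urlsplit

# Hypothetical back-end names and URL conventions (not given on the slides).
FILE_SERVERS = ["web-server-1", "web-server-2"]
DATABASE_SERVER = "web-server-1"
MULTIMEDIA_SERVER = "web-server-2"
DATABASE_PREFIX = "/db/"
VIDEO_SUFFIXES = (".mpg", ".mpeg")

_round_robin = cycle(FILE_SERVERS)


def choose_content_unaware():
    """Pick the next server in round-robin order without looking at the URL."""
    return next(_round_robin)


def choose_content_aware(url):
    """Pick a server after inspecting the path of the requested URL."""
    path = urlsplit(url).path.lower()
    if path.startswith(DATABASE_PREFIX):
        return DATABASE_SERVER
    if path.endswith(VIDEO_SUFFIXES):
        return MULTIMEDIA_SERVER
    # File service: fall back to round robin.
    return choose_content_unaware()
```

The content-unaware function takes no argument because the choice is made before any request data has been read; the content-aware one can only run once the request line is available.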
11 Implementing the Load Distributor
- Install TCPSP
  - TCP splicing is a technique that splices two connections inside the kernel, so that data relaying between the two connections can run at near-router speeds.
- Write the distributor program in C
  - Two load balancing strategies are implemented
  - The installed kernel module TCPSP is invoked to perform TCP splicing
- Run the distributor program at the application level
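For contrast, here is a sketch of the user-level relay that TCP splicing replaces: without splicing, every chunk of data is copied from the kernel into the distributor process and back out again. This Python sketch does not use TCPSP, whose interface the slides do not describe; the function name `relay` is hypothetical.

```python
import select
import socket


def relay(client, server, bufsize=4096):
    """Copy bytes between two connected sockets until either side closes.

    Each chunk crosses the kernel/user boundary twice (recv, then sendall),
    which is the overhead that in-kernel splicing avoids.
    """
    peer = {client: server, server: client}
    while True:
        readable, _, _ = select.select([client, server], [], [])
        for sock in readable:
            data = sock.recv(bufsize)
            if not data:          # one side closed: stop relaying
                return
            peer[sock].sendall(data)
```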
12 Flow Chart of the Load Distributor (content aware)
[Flow chart: Distributor, Child Process]
13 Flow Chart of the Load Distributor (content unaware)
[Flow chart: Distributor, Child Process]
- Listen for incoming connections on port 8888
- Accept the connection
- Choose a server according to the load balancing scheme
- Create a child process to do further processing
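The content-unaware flow can be sketched as an accept loop that forks one child per connection. This is a Python illustration under assumptions: the names `make_listener`, `serve`, `choose_server` and `handle` are hypothetical, and the child's "further processing" is left to the caller because the slides do not detail it. It relies on `os.fork`, so it is Unix-only.

```python
import os
import socket

LISTEN_PORT = 8888  # port named in the flow chart


def make_listener(port=LISTEN_PORT, host=""):
    """Create a TCP socket listening for incoming connections."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock


def serve(listener, choose_server, handle, max_connections=None):
    """Accept connections, choose a back end, and fork a child for each."""
    served = 0
    while max_connections is None or served < max_connections:
        conn, _addr = listener.accept()
        # Content unaware: the server is chosen before any request is read.
        backend = choose_server()
        if os.fork() == 0:
            # Child process: does the further processing, then exits.
            listener.close()
            try:
                handle(conn, backend)
            finally:
                conn.close()
                os._exit(0)
        conn.close()  # parent keeps only the listening socket
        served += 1
        # Reap any finished children without blocking.
        try:
            while os.waitpid(-1, os.WNOHANG)[0] != 0:
                pass
        except ChildProcessError:
            pass
```

A content-aware variant would move the server choice into the child, after the request line has been read from `conn`.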
14 Comparison with Gage
- Gage: a QoS-aware web server system
  - "Performance Guarantees for Cluster-Based Internet Services," Chang Li, State University of New York at Stony Brook.
- Gage's load distributor is implemented as a kernel module. It is faster but can only implement content-unaware load balancing.
- Gage doesn't provide a variety of web services.
15 Planned Performance Measurement
- Let all servers provide file service, and use SPECWEB99 to test the performance of the cluster-based file server.
- Compare the time taken to service a database request through the load distributor with the time taken without the load distributor.
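The with/without comparison could be scripted along these lines. This is a minimal Python sketch, not the planned measurement harness: the function names are hypothetical, and the `fetch` parameter exists only so that the timing logic can be exercised without a live server.

```python
import time
import urllib.request


def mean_response_time(url, repeats=10, fetch=None):
    """Return the mean wall-clock seconds taken to fetch `url`."""
    if fetch is None:
        fetch = lambda u: urllib.request.urlopen(u).read()
    start = time.perf_counter()
    for _ in range(repeats):
        fetch(url)
    return (time.perf_counter() - start) / repeats


def distributor_overhead(direct_url, distributed_url, repeats=10, fetch=None):
    """Mean extra seconds per request when going through the load distributor."""
    via = mean_response_time(distributed_url, repeats, fetch)
    direct = mean_response_time(direct_url, repeats, fetch)
    return via - direct
```

Here `direct_url` would point at a back-end server's database page and `distributed_url` at the same page requested through the distributor's address.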
16 SPECWEB99
17 Let's go to the lab to see the demo!