Title: Server Scalability
1. Session 11
- Server Scalability: Planning for Growth
2. Content Management Issues.
- Naming Policy.
  - File and directory naming needs design!
  - Names are difficult to 'fix' at a later stage.
  - Poor design will cause maintenance grief.
- Content Update Policy.
  - Timing: who decides?
  - Frequency: planned?
  - Impact on scalability.
3. Content Update Policy
- Without control, a large web system will quickly spawn:
  - Inconsistencies
  - Errors
  - Inaccessible data, etc.
- Do not update on demand!
- Stick to a regular update schedule.
- Consider a content management tool, e.g. Vignette's StoryServer (see notes).
4. 3-Stage Update
5. Scalability Config Issues.
- Scope for initial webserver tuning:
  - Optimise the number of child processes (or threads on Windows).
  - Review buffer sizes.
- Directives with performance implications:
  - Use of KeepAlive.
  - HostnameLookups on/off/double
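
The tuning points on this slide might translate into a fragment like the one below. It is a minimal sketch assuming a prefork-style Apache (1.3, or 2.x with the prefork MPM). Every number is an illustrative placeholder rather than a recommendation, and on Apache 2.4 MaxClients is known as MaxRequestWorkers.

```apache
# Process pool: illustrative values only, tune against measured load.
StartServers      5
MinSpareServers   5
MaxSpareServers  10
MaxClients      150

# On Windows the equivalent knob is the thread count, for example:
# ThreadsPerChild 50

# TCP send buffer size in bytes (assumed value; the default is the OS setting).
SendBufferSize 65536

# Reuse one connection for several requests, within limits.
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15

# Avoid a reverse DNS lookup for every client request.
HostnameLookups Off
```

Setting HostnameLookups to Off means access logs record IP addresses; names can be resolved later, offline, if they are needed.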
6. Scalability Config Issues (2)
- Directives (cont.):
  - UseCanonicalName on/off/dns
  - FollowSymLinks: this is an Option rather than a directive. Where it is not set (or SymLinksIfOwnerMatch is used instead), Apache wastes time checking each component of the file path for symbolic links.
- Logging of all kinds slows Apache down.
- .htaccess files add overheads.
- Large configuration files also slow Apache, so thinning here is a good idea.
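
A hedged sketch of how these points could be applied, assuming the content tree needs no .htaccess overrides. According to the Apache performance notes, the extra file-system checks arise where FollowSymLinks is absent or SymLinksIfOwnerMatch is used, so the sketch sets FollowSymLinks explicitly. The log level shown is only an example.

```apache
# Build self-referential URLs from the host name the client supplied,
# rather than from the configured ServerName or a DNS lookup.
UseCanonicalName Off

<Directory />
    # Follow symbolic links without per-component checks.
    Options FollowSymLinks
    # Do not search this tree for .htaccess files.
    AllowOverride None
</Directory>

# Keep error logging to warnings and above.
LogLevel warn
```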
7. Scalability Config Issues (3)
- CGI programs influence the performance of the website.
  - Consider FastCGI or mod_perl to speed matters up.
  - Writing efficient code is always important.
- Other tricks:
  - Force popular files to be memory resident.
  - Force secure transfers to have more bandwidth.
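
One way to keep popular files memory resident is the MMapFile directive, provided by mod_mmap_static in Apache 1.3 and by mod_file_cache in Apache 2.x. The fragment below is a minimal sketch assuming that module is loaded; the file paths are hypothetical.

```apache
# Map these files into memory when the server starts.
# The server must be restarted if the files change on disk.
MMapFile /usr/local/apache/htdocs/index.html
MMapFile /usr/local/apache/htdocs/images/logo.gif
```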
8. Proxy Servers: Performance Issues.
- An Apache proxy can:
  - Cache for speed.
  - Filter for security or decency.
- Apache's proxy functionality is encapsulated in mod_proxy.
- To use mod_proxy, use the ProxyRequests directive:
  - to enable: ProxyRequests on
  - to disable: ProxyRequests off
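
A forward proxy left open to the world is easily abused, so ProxyRequests is normally paired with an access restriction. The following is a minimal sketch in Apache 2.4 syntax (older versions use different access-control directives). The client address range is a hypothetical example.

```apache
# Act as a forward proxy for clients that are configured to use it.
ProxyRequests On

<Proxy "*">
    # Hypothetical internal network; replace with your own client range.
    Require ip 192.168.0.0/24
</Proxy>
```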
9. Proxy Customisation.
- To block particular sites from your clients, e.g.
  ProxyBlock www.badsite.com baddomain.co.uk badword
- This will block the specific host www.badsite.com, the domain baddomain.co.uk, and any site whose name contains badword.
10. Hiding Servers with a Proxy
- Suppose there are two extra servers, parallel to the dis server.
- Add the ProxyPass directive to the main www.dis.port.ac.uk server configuration file so that users.dis.port.ac.uk and secure.dis.port.ac.uk appear as directories on the main server, e.g.
  ProxyPass /users/ http://users.dis.port.ac.uk/
  ProxyPass /secure/ http://secure.dis.port.ac.uk/
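
If a hidden server issues a redirect, its Location header would expose the internal host name. The sketch below repeats the same mapping with ProxyPassReverse added to rewrite those headers. ProxyRequests Off is set on the assumption that the main server is not also meant to act as a forward proxy.

```apache
# Reverse proxying does not need forward-proxy mode.
ProxyRequests Off

ProxyPass        /users/  http://users.dis.port.ac.uk/
ProxyPassReverse /users/  http://users.dis.port.ac.uk/

ProxyPass        /secure/ http://secure.dis.port.ac.uk/
ProxyPassReverse /secure/ http://secure.dis.port.ac.uk/
```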
11. Fault Tolerance: Clustering.
- Tuning done but insufficient?
- Two basic routes to boosting performance:
  - Replace the server hardware with more powerful boxes.
  - Add more servers and distribute the load of client requests amongst them.
12. Benefits of multiple servers
- Server machines can be cheaper and easily replaceable.
- Individual servers can fall over without the website becoming unavailable.
- Increase capacity by adding another server and synchronising the data.
- No need to alter or reconfigure any of the existing servers.
13. Clustering (1).
- Cannot just add extra servers; they would need different IP addresses. Why?
- The set of servers needs to be established as a cluster so that:
  - To external clients it appears as one big, fast server with one domain name.
  - Clients are not aware that the load is being shared by a cluster of servers.
14. Clustering (2).
- Two basic ways of approaching clustering:
  - DNS load sharing.
  - Web server clustering.
- Several ways for each.
15. DNS Load Sharing.
- A mechanism is needed for synchronising the content on the multiple servers!
- The most common approach is round-robin DNS distribution.
- It works by specifying multiple IP addresses for the same host name, e.g. (using BIND syntax)
  www.dis.port.ac.uk. 60 IN A 148.197.203.1
  www.dis.port.ac.uk. 60 IN A 148.197.203.2
  www.dis.port.ac.uk. 60 IN A 148.197.203.3
16. Round Robin DNS Sharing.
- Each DNS request for www.dis.port.ac.uk returns the next IP in sequence.
- What about name caches?
  - Set a short time-to-live (TTL): the 60 in each record is the TTL in seconds.
- A lower TTL would:
  - Improve webserver load sharing,
  - But increase the load on the DNS server.
- The attraction of round-robin DNS is its simplicity.
17. Round Robin DNS Sharing (2).
- Not true load balancing, only load sharing.
- Round-robin takes no account of:
  - Which servers are loaded,
  - Which are free,
  - Or even which are actually up and running!
- Round-robin DNS makes keeping state for a user more difficult. A problem for eCommerce?
18. Hardware Load Balancing.
- LocalDirector and DistributedDirector were products from Cisco (http://www.cisco.com).
- These rewrite IP headers to redirect a connection to a local server.
- See notes for details.
19. Clustering with Apache.
- Last topic on the unit!
- Apache provides a way to cluster servers using the features of mod_rewrite and mod_proxy together.
- This avoids the DNS caching problems and the cost of hardware solutions.
- A machine is needed to act as a proxy server, handling requests to several back-end servers on which the website is actually loaded.
20. Clustering with Apache (2).
- E.g. the proxy takes the name www.dis.port.ac.uk and the back-end servers might be www1 to www6.
- Wainwright (1999) sets out a method of setting up Apache in two parts:
  - Use mod_rewrite to randomly select a back-end server for the client request.
  - Use mod_proxy's ProxyPassReverse directive to disguise the URL of the back-end server.
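
The two parts could be sketched as follows. This uses mod_rewrite's randomised plain-text map type (rnd:), which is one documented way to pick a back-end at random and is not necessarily Wainwright's exact configuration. The full host names www1.dis.port.ac.uk onwards, the map name lb, the key servers, and the map file path are all assumptions for illustration. Only three of the six back-ends are shown; the others would follow the same pattern.

```apache
# Contents of the hypothetical map file /usr/local/apache/conf/backends.map
# (a single line; "|" separates the alternatives chosen at random):
#   servers www1.dis.port.ac.uk|www2.dis.port.ac.uk|www3.dis.port.ac.uk

RewriteEngine On
RewriteMap lb rnd:/usr/local/apache/conf/backends.map

# Hand every request to a randomly chosen back-end via mod_proxy ([P]).
RewriteRule ^/(.*)$ http://${lb:servers}/$1 [P,L]

# Rewrite redirect headers so clients only ever see the proxy's name.
ProxyPassReverse / http://www1.dis.port.ac.uk/
ProxyPassReverse / http://www2.dis.port.ac.uk/
ProxyPassReverse / http://www3.dis.port.ac.uk/
```

Random selection shares load but does not balance it: like round-robin DNS, it takes no account of which back-end is busy or down.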
21. Clustering with Apache (3).
- See notes for details; not a practical proposition for the unit.
- Awareness of what the proxy server achieves is the important point.
22. Summary.
- Content Management Issues.
- Managed Updates for the webserver.
- Configuration issues for scalability.
- Proxy Servers filter and cache!
- DNS (round robin) clustering.
- Hardware clustering.
- Proxy based clustering.
23. Revision.
- Get a suitable printout of your config file: small font or two columns?
- This year's exam is two hours. Plan how you intend to use this time!
- Use your config printout to provide config examples!