New Technologies in the JANET Web Cache Service Martin Hamilton

Slides: 27

Provided by: martinh6


1
New Technologies in the JANET Web Cache Service
Martin Hamilton, George Neisser
http://wwwcache.ja.net/
support@wwwcache.ja.net
2
What is the JANET Web Cache Service?

National caching service for the UK education and
research community.
Funded by JISC.
Awarded by competitive tender to Loughborough
University Computing Services and Manchester
Computing.
Largest "site" on JANET.
155 Megabits/second aggregate traffic.
70-80 million transactions/day.
700-800 Gigabytes transferred/day.
Being used by some 170 institutions.
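
As a rough sanity check on these figures, the sketch below derives the implied average object size and average bit rate from the daily totals. It assumes decimal units (1 Gigabyte = 10^9 bytes) and traffic spread evenly over 24 hours, neither of which the slide states.

```python
# Back-of-envelope arithmetic on the slide's daily totals.
# Assumptions: decimal units and an even spread across the day.
SECONDS_PER_DAY = 24 * 60 * 60


def average_object_size_bytes(bytes_per_day, transactions_per_day):
    """Mean bytes transferred per transaction."""
    return bytes_per_day / transactions_per_day


def average_megabits_per_second(bytes_per_day):
    """Mean transfer rate over a whole day, in Megabits/second."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


# Low and high ends of the ranges quoted on the slide.
for bytes_per_day, transactions in ((700e9, 70e6), (800e9, 80e6)):
    size = average_object_size_bytes(bytes_per_day, transactions)
    rate = average_megabits_per_second(bytes_per_day)
    print(f"{size:.0f} bytes/transaction, {rate:.1f} Mbit/s average")
```

Both ends of the range work out to about 10 kilobytes per transaction, and the daily average rate sits below the 155 Megabits/second aggregate figure.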

3
What is Web Caching?

Caches keep copies of popular Internet content.
First site to fetch a URL causes it to be cached.
Subsequent visits get the cached copy.
Exceptions for things like secure (SSL) content,
cookies, and dynamic content (CGI).
Web caching seen as essential by most ISPs and
large Internet sites.
Caches can also be used for content filtering -
e.g. legal requirement for FE sites.
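
The behaviour described above can be illustrated with a toy in-memory cache. This is only a sketch and not how Squid works internally: the class name `TinyCache`, the `fetch_from_origin` callable, and the rules used to spot dynamic content (a `/cgi-bin/` path or a query string) are all assumptions made for the example.

```python
from urllib.parse import urlsplit


class TinyCache:
    """Toy URL-keyed cache; illustrative only, not Squid's design."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: url -> bytes
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def is_cacheable(url, request_headers):
        parts = urlsplit(url)
        if parts.scheme == "https":  # secure (SSL) content
            return False
        if "Cookie" in request_headers:  # per-user content
            return False
        # Assumed heuristic for dynamic (CGI) content.
        if "/cgi-bin/" in parts.path or parts.query:
            return False
        return True

    def get(self, url, request_headers=None):
        headers = request_headers or {}
        if not self.is_cacheable(url, headers):
            return self._fetch(url)  # always go to the origin
        if url in self._store:
            self.hits += 1  # subsequent visits get the cached copy
            return self._store[url]
        self.misses += 1  # first fetch causes the URL to be cached
        body = self._fetch(url)
        self._store[url] = body
        return body


origin_calls = []


def fake_origin(url):
    """Stand-in for a real origin server."""
    origin_calls.append(url)
    return b"content of " + url.encode()


cache = TinyCache(fake_origin)
cache.get("http://example.org/index.html")
cache.get("http://example.org/index.html")
print(cache.hits, cache.misses, len(origin_calls))  # 1 1 1
```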

4
Service configuration

Cache machines (34 of these) are typically
Pentium II or III processor, 512 Megabytes
memory, dual 100 Megabit/second Ethernet, and 6
or 12 Ultra2 SCSI disks for cached objects.
Small number (currently 3) of load balancers to
distribute requests between caches.
Caches and load balancers all running Linux and
the Squid Web Cache server.
Some 1.5TB of pooled cache disk.

5
New technologies covered today...

Automation of service monitoring and
availability.
Automating operations, so that a small number of
people can run a huge service.
"Glue" needed to link monitoring and management
tools with email/paging/WAP.
Incident and change logging/reporting.
Management of machines at remote sites.
Identify useful info for other service operators.

6
Problems encountered

A server goes down, e.g. crashes or locks up.
The service (e.g. Squid cache) goes down, but the
server is still up.
The machine or service is slow/overloaded.
Time taken for machines to recover after a crash
- Unix fsck process.
Knowing who changed what, and when.
Capturing long-term stats for profiling.

7
Problem: Machine goes down

Spotting the problem - can get away with using
ping for this. Many other tools available to
automate this basic testing.
Fixing may require local action (e.g. push the
reset button), but most Unix systems support
serial console access. Linux also has serial
access to the LILO boot loader.
Serial console useful for remotely managed kit,
and also remote (off-site) access to local kit in
an emergency.
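
A basic up/down test of this kind might look like the sketch below. Note that it swaps in a TCP connection attempt for ping, since ICMP normally needs raw-socket privileges. The function names are hypothetical, and port 3128 (Squid's default) is an assumption about how the caches are configured.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unresolvable, unreachable
        return False


def find_down(machines, port=3128):
    """Return the machines that did not accept a connection.

    Port 3128 is Squid's default; adjust for the service being tested.
    """
    return [name for name in machines if not is_reachable(name, port)]
```

Run from cron, `find_down([...])` with a list of machine names would give the set of hosts to raise an alert for.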

8
Solution: Linux Virtual Server
9
Linux Virtual Server explained

Layer 4 switch in software. High service
availability through redundancy.
Load balances traffic across multiple "real
servers" using a virtual IP address and per-server
weightings.
Real server death only affects current users -
traffic routes around dead servers.
Now fully deployed on the JANET caches.
Useful for other services too, e.g. Websites.
Note that e-mail and DNS have automatic fallback
already.
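
The idea of per-server weightings with traffic routed around dead servers can be sketched in user space as a weighted round-robin. LVS itself does its scheduling inside the Linux kernel, so this is purely illustrative; the class name and the server names `cache1` and `cache2` are hypothetical.

```python
import itertools


class WeightedScheduler:
    """Toy weighted round-robin over named real servers."""

    def __init__(self, weights):
        # weights maps a real server name to a positive integer weighting.
        self.weights = dict(weights)
        self.alive = set(self.weights)
        slots = [name for name, w in self.weights.items() for _ in range(w)]
        self._slot_count = len(slots)
        self._cycle = itertools.cycle(slots)

    def mark_dead(self, name):
        self.alive.discard(name)

    def mark_alive(self, name):
        if name in self.weights:
            self.alive.add(name)

    def next_server(self):
        # Look at most one full cycle ahead, skipping dead servers.
        for _ in range(self._slot_count):
            name = next(self._cycle)
            if name in self.alive:
                return name
        raise RuntimeError("no real servers available")


scheduler = WeightedScheduler({"cache1": 2, "cache2": 1})
print([scheduler.next_server() for _ in range(6)])
scheduler.mark_dead("cache1")  # new requests now route around cache1
print([scheduler.next_server() for _ in range(3)])
```

With these weightings, `cache1` receives two requests for every one sent to `cache2` until it is marked dead, after which every request goes to `cache2`.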

10
Problem: Service goes down

e.g. Squid dies when disks fill up.
Older Squids used to lose track of disk
consumption and fill disks up after a time.
Can spot if Squid is running OK by SNMP.
LVS monitor uses SNMP for service upness and
performance check.
What constitutes your service? Can you measure
its availability automatically?

11
Problem: Overloading

Performance metrics available via SNMP already,
plus addons like df and top.
Can also try to use the service, e.g. fetch via
proxy HTTP and measure performance.
Fetch a test URL via each cache at intervals.
Consider what you want to do with the info, e.g.
tune LVS weightings, make case to management for
more funding :-)
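
Fetching a test URL via each cache and timing it could be sketched as follows, using only the Python standard library. The function names are hypothetical, as is the `http://host:3128` form of the proxy addresses; the `fetch` parameter exists so that the timing loop can be tried without a network.

```python
import time
import urllib.request


def fetch_via_proxy(url, proxy, timeout=10.0):
    """Fetch url through an HTTP proxy, e.g. "http://cache.example:3128"."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy})
    )
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def time_fetches(url, proxies, fetch=fetch_via_proxy):
    """Return {proxy: seconds taken, or None if the fetch failed}."""
    results = {}
    for proxy in proxies:
        started = time.monotonic()
        try:
            fetch(url, proxy)
            results[proxy] = time.monotonic() - started
        except OSError:  # urllib's URLError is a subclass of OSError
            results[proxy] = None
    return results
```

The resulting timings are the kind of input that could feed a decision about LVS weightings, or a graph for a funding case.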

12
Solution: SNMP network monitoring
13
Solution: SNMP service monitoring
14
Problem: Filestore check (fsck)

Bugbear of traditional Unix systems.
After a crash, 6 x 9GB disks can take over half
an hour to check :-(
Possible solution - trialling Linux journalling
filesystem ReiserFS, which is also a lot faster
than the conventional ext2 filesystem.
Generally useful for server and workstation
applications. Can be a work-around for other
problems, e.g. recovery of remote systems much
less painful after a crash.

15
Tracking changes - manually

Web form - who, what, when?
Search/browse interface for analysis/reporting.
Only requires Unix, HTTP server, Perl.
Nightly summary mailshot for management.
Also being used by EMMAN and several groups at
L'boro.
Easier to use than paper record and more readily
available. Structure allows for sensible queries.
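
To show why a structured who/what/when record allows sensible queries, here is a minimal sketch of such a log. It is not the L'boro system, which the slide says is built on Perl and an HTTP server. The `machine` field, the function names, and the sample entries are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Change:
    who: str
    what: str
    when: datetime
    machine: str  # assumed extra field, not listed on the slide


def changes_between(log, start, end, machine=None):
    """Return changes with start <= when < end, optionally for one machine."""
    return [
        c for c in log
        if start <= c.when < end and (machine is None or c.machine == machine)
    ]


def summary_lines(changes):
    """Format changes oldest first, one line each, for a plain-text mail."""
    ordered = sorted(changes, key=lambda c: c.when)
    return [
        f"{c.when:%Y-%m-%d %H:%M} {c.who} [{c.machine}] {c.what}"
        for c in ordered
    ]


# Hypothetical entries.
log = [
    Change("alice", "replaced disk 3", datetime(2000, 5, 31, 14, 5), "cache1"),
    Change("bob", "upgraded Squid", datetime(2000, 5, 31, 9, 30), "cache2"),
    Change("alice", "rebooted", datetime(2000, 6, 1, 8, 0), "cache1"),
]
day = changes_between(log, datetime(2000, 5, 31), datetime(2000, 6, 1))
print("\n".join(summary_lines(day)))
```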

16
Solution: Change logging system
17
Tracking changes - automatically

Mail from service monitoring script.
Urgent warnings (e.g. machine down) gatewayed to
cellphone using sms_client and a modem.
LVS monitor logs incidents with timestamp,
machine name, and type of problem.
Mobile phone (SMS) message size very limited.
Must be careful not to send too many messages,
and to provide positive feedback - i.e. that the
service/machine recovered.
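
One way to keep the message count down while still reporting recovery is to send only on state changes. The sketch below assumes a hypothetical `send` callable standing in for the SMS gateway, and truncates to 160 characters, the size of a single GSM text message; it is not the service's actual monitoring script.

```python
SMS_LIMIT = 160  # characters in a single GSM text message


class Notifier:
    """Send one message when a machine goes down and one when it recovers."""

    def __init__(self, send):
        self._send = send  # callable taking the message text
        self._down = set()

    def report(self, machine, is_up, problem=""):
        if not is_up and machine not in self._down:
            # First sighting of the failure: alert once.
            self._down.add(machine)
            self._send(f"DOWN {machine}: {problem}"[:SMS_LIMIT])
        elif is_up and machine in self._down:
            # Positive feedback that the machine is back.
            self._down.remove(machine)
            self._send(f"OK {machine} recovered"[:SMS_LIMIT])
        # Repeated reports of an unchanged state send nothing.


sent = []
notifier = Notifier(sent.append)
notifier.report("cache1", False, "no response")
notifier.report("cache1", False, "no response")  # suppressed repeat
notifier.report("cache1", True)
print(sent)
```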

18
Long term stats

Daily log file analysis overnight (Calamaris,
squidtimes, squidclients and our own code).
Log file summaries - possible to usefully
summarise 1GB down to 5MB!
Dynamic monitoring of Ethernet traffic levels and
Squid performance metrics via SNMP and
MRTG/rrdtool. Stats can hang around forever.
30GB disk 200! Figure out what to monitor and
keep historical stats. You won't regret it.
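
A log file summary of the kind mentioned above can be sketched in a few lines. This assumes Squid's native access.log layout (timestamp, elapsed time, client, result code/status, bytes, method, URL, and so on, separated by whitespace). The function name and the sample lines are invented for illustration and are not output from the JANET caches or from Calamaris.

```python
from collections import Counter


def summarise(lines):
    """Reduce Squid native-format log lines to a few headline numbers."""
    requests = 0
    total_bytes = 0
    hits = 0
    by_client = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7 or not fields[4].isdigit():
            continue  # skip malformed lines
        client, result, size = fields[2], fields[3], int(fields[4])
        requests += 1
        total_bytes += size
        by_client[client] += 1
        if "HIT" in result.split("/")[0]:  # e.g. TCP_HIT, TCP_MEM_HIT
            hits += 1
    return {
        "requests": requests,
        "bytes": total_bytes,
        "hit_rate": hits / requests if requests else 0.0,
        "top_clients": by_client.most_common(3),
    }


# Invented sample lines in Squid's native access.log layout.
sample = [
    "959801102.123     45 10.0.0.1 TCP_HIT/200 1500 GET http://example.org/ - NONE/- text/html",
    "959801103.456    310 10.0.0.2 TCP_MISS/200 8500 GET http://example.org/a.gif - DIRECT/example.org image/gif",
    "959801104.789     12 10.0.0.1 TCP_MEM_HIT/200 500 GET http://example.org/ - NONE/- text/html",
]
summary = summarise(sample)
print(summary)
```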

19
WAP - Tomorrow Today :-)

Phones buggy - easy to crash, which can require a
trip to the service centre.
Different vendors support different features,
e.g. Nokia doesn't do tables.
Screens far too small for detailed info.
Space on "cards" very limited on some phones,
e.g. Nokia is 1397 characters.
But... very easy to create content for!

20
WAP example - LVS stats

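[WML listing; markup lost in extraction. Surviving fragments:
DOCTYPE "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml"
Timestamp: Wed May 31 19:25:02 2000
Entries: "babylon", "kair/", "wilbur"
... more cards ...]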

21
WAP in practice - 1
22
WAP in practice - 2
23
WAP redux

Phones use Wireless Markup Language (WML) instead
of HTML. WML is very simple by comparison.
One line tweak to Web server config required for
serving WML documents.
Easy to create WML automatically from monitoring
scripts.
Watch out for bugs and incompatibilities! Use
Internet emulators to save on phone bill.
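
Generating WML from a monitoring script could look like the sketch below, which builds a single-card deck using the standard WML 1.1 DOCTYPE. The function name `status_deck` and the up/down states are hypothetical; the machine names are borrowed from the fragments of the LVS stats example.

```python
from xml.sax.saxutils import escape, quoteattr

WML_HEADER = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" '
    '"http://www.wapforum.org/DTD/wml_1.1.xml">\n'
)


def status_deck(title, statuses):
    """Build a one-card WML deck from (machine, state) pairs."""
    rows = "".join(
        f"{escape(name)}: {escape(state)}<br/>\n" for name, state in statuses
    )
    return (
        WML_HEADER
        + "<wml>\n"
        + f'<card id="status" title={quoteattr(title)}>\n'
        + "<p>\n"
        + rows
        + "</p>\n"
        + "</card>\n"
        + "</wml>\n"
    )


print(status_deck("LVS stats", [("babylon", "up"), ("wilbur", "down")]))
```

WML decks are normally served with the MIME type `text/vnd.wap.wml`, which is the kind of mapping a one-line Web server tweak would add.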

24
Current and future developments

Two way WAP control for common jobs, e.g. restart
Squid, take a faulty disk out of service, reboot
a machine.
Failover of load balancers, so cluster survives
death of primary load balancer.
Mirror service integration, so that caches
automatically find mirrored resources - e.g. from
the UK Mirror Service.
"Cluster digests", to give sites an accurate
impression of JANET cache hit rates.

25
Closing thoughts...

Much of this technology is truly new - didn't
exist in 1997 when we started the JANET Web Cache
Service.
Perl and cron used extensively to glue other
tools together.
Most of the software used existed already, so it
wasn't necessary to develop it from scratch.
Don't be afraid to lead from the front - JANET
cache team members have been very active in Web
caching development internationally.

26
Useful links

LVS - http://www.linuxvirtualserver.org/
MRTG - http://www.mrtg.org/
Perl - http://www.perl.org/
ReiserFS - http://devlinux.com/namesys/
L'boro change logging system - http://lanlord.lboro.ac.uk/martin/change/
sms_client - http://www.styx.demon.co.uk/
WAP emulator - http://www.gelon.net/
