Title: Simulation Evaluation of a Heterogeneous Web Proxy Caching Hierarchy
1. Simulation Evaluation of a Heterogeneous Web Proxy Caching Hierarchy
- Mudashiru Busari (University of Saskatchewan)
- Carey Williamson (University of Calgary)
- MASCOTS 2001
2. Introduction
- The Web is both a blessing and a curse
- Blessing
  - Internet available to the masses
  - Seamless exchange of information
- Curse
  - Internet available to the masses
  - Stress on networks, protocols, servers, users
- Motivation: improve the performance and scalability of the Web (e.g., caching)
3. Example of a Web Proxy Cache
[Figure: diagram showing a proxy server and three Web servers]
4. Our Previous Work
- Evaluation of Canada's national Web caching infrastructure for CANARIE's CA*net II backbone
- Workload characterization and evaluation of the CA*net II Web caching hierarchy (IEEE Network, May/June 2000)
- Developed a Web proxy caching simulator for trace-driven simulation evaluation of Web proxy caching architectures
- Developed a synthetic Web proxy workload generator called ProWGen [Busari/Williamson, INFOCOM 2001]
5. CA*net II Web Caching Hierarchy (Dec 1998)
[Figure: diagram of the caching hierarchy; labels include USask, CANARIE (Ottawa), and a link to NLANR. Caption: selected measurement points for our traffic analyses; 6-9 months of data from each.]
6. Caching Hierarchy Overview
[Figure: three-level hierarchy of proxies above the clients (C), annotated with cache sizes and empirically observed cache hit ratios]
- Top-Level/International (20-50 GB): 5-10%
- National (10-20 GB): 15-20%
- Regional/Univ. (5-10 GB): 30-40%
7. Some Observations on Multi-Level Caching...
- Caching hierarchy not very effective
- Reason: workload characteristics change as you move up the caching hierarchy (due to filtering effects, etc.)
- Idea 1: Try different cache replacement policies at different levels of the hierarchy
- Idea 2: Limit replication of cache content in the overall hierarchy through partitioning (size, type, sharing, ...)
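The filtering effect named above can be illustrated with a small sketch. It is not the authors' simulator: the `LRUCache` class, the capacity counted in documents rather than bytes, and the toy request stream are all assumptions made for illustration. The child cache absorbs the repeated references, so the stream that reaches the parent has little locality left.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache; capacity is a document count (a simplification)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.docs = OrderedDict()

    def access(self, doc_id):
        """Return True on a hit; on a miss, admit the document and evict if full."""
        if doc_id in self.docs:
            self.docs.move_to_end(doc_id)    # mark as most recently used
            return True
        self.docs[doc_id] = True
        if len(self.docs) > self.capacity:
            self.docs.popitem(last=False)    # evict the least recently used
        return False

def miss_stream(requests, cache):
    """Requests the given cache cannot serve, in arrival order."""
    return [doc for doc in requests if not cache.access(doc)]

def hit_ratio(requests, cache):
    """Fraction of requests served by the given cache."""
    hits = sum(cache.access(doc) for doc in requests)
    return hits / len(requests) if requests else 0.0

# Hypothetical client request stream with strong temporal locality.
requests = ["a", "a", "b", "b", "a", "c", "c", "a"]

child_hr = hit_ratio(requests, LRUCache(2))
to_parent = miss_stream(requests, LRUCache(2))   # what the parent would see
parent_hr = hit_ratio(to_parent, LRUCache(2))
```

In this toy stream the child serves 5 of 8 requests, while the parent sees only first-time references and serves none of them.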
8. Research Questions: Multi-Level Caches
- In a multi-level caching hierarchy, can overall caching performance be improved by using different cache replacement policies at different levels of the hierarchy?
- In a multi-level caching hierarchy, can overall performance be improved by keeping disjoint document sets at each level of the hierarchy?
9. Experimental Methodology
- Trace-driven simulation
- Multi-factor experimental design
  - Cache size
    - 1 MB to 32 GB
  - Cache Replacement Policy
    - Least-Recently-Used (currently active docs)
    - Least-Frequently-Used (popular docs)
    - Greedy-Dual-Size (favours smaller docs)
  - Workload Characteristics
    - Degree of overlap amongst child caches
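The three replacement policies listed above differ only in how they rank documents for eviction, which a single priority-based cache can show. This is a minimal sketch under stated assumptions, not the simulator used in the study: the `PriorityCache` class and `cached_after` helper are hypothetical, LFU here is plain in-cache frequency without aging, and Greedy-Dual-Size uses a uniform cost of 1, so its priority is the inflation value L plus 1/size.

```python
class PriorityCache:
    """Byte-capacity cache that evicts the document with the lowest priority."""

    def __init__(self, capacity_bytes, policy):
        self.capacity = capacity_bytes
        self.policy = policy        # "LRU", "LFU", or "GDS"
        self.used = 0
        self.clock = 0              # logical time, used by LRU
        self.inflation = 0.0        # Greedy-Dual-Size "L" value
        self.size = {}              # doc_id -> size in bytes (must be > 0)
        self.freq = {}              # doc_id -> references while cached
        self.prio = {}              # doc_id -> eviction priority

    def _priority(self, doc_id):
        if self.policy == "LRU":
            return self.clock                 # most recent access time
        if self.policy == "LFU":
            return self.freq[doc_id]          # reference count
        return self.inflation + 1.0 / self.size[doc_id]   # GDS, cost = 1

    def access(self, doc_id, size):
        """Return True on a hit, False on a miss (the document is then admitted)."""
        self.clock += 1
        if doc_id in self.size:
            self.freq[doc_id] += 1
            self.prio[doc_id] = self._priority(doc_id)
            return True
        if size > self.capacity:
            return False                      # too large to cache at all
        while self.used + size > self.capacity:
            victim = min(self.prio, key=self.prio.get)
            if self.policy == "GDS":
                self.inflation = self.prio[victim]   # age the remaining documents
            self.used -= self.size.pop(victim)
            del self.freq[victim], self.prio[victim]
        self.size[doc_id] = size
        self.freq[doc_id] = 1
        self.used += size
        self.prio[doc_id] = self._priority(doc_id)
        return False

def cached_after(policy, capacity, requests):
    """Replay (doc_id, size) requests and return the set of cached documents."""
    cache = PriorityCache(capacity, policy)
    for doc_id, size in requests:
        cache.access(doc_id, size)
    return set(cache.size)

# Hypothetical traces chosen so that the policies disagree.
mixed_sizes = [("small", 10), ("big", 80), ("big", 80), ("new", 30)]
equal_sizes = [("a", 10), ("a", 10), ("b", 10), ("c", 10)]
```

With `mixed_sizes` and a 100-byte cache, LRU evicts both older documents to make room, while Greedy-Dual-Size evicts only the large one and keeps the small one. With `equal_sizes` and a 20-byte cache, LRU drops the twice-referenced document `a`, while LFU keeps it.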
10. Simulation Model
[Figure: simulation model diagram; labels: Web Servers, Web Clients]
11. Web Proxy Workload Used
- Synthetically generated workload using the ProWGen proxy workload generator [Busari/Williamson, INFOCOM 2001]
- Parameterized based on empirical data
- Zipf-like document popularity profile
- Lots of one-timer documents
- Heavy-tailed file size distribution
- Note: static content only
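The workload properties listed above can be imitated with a few lines of standard-library Python. This is not ProWGen and does not reproduce its algorithm or parameters: the function name `synthetic_workload`, every default value, and the choice of a Pareto distribution for file sizes are assumptions made for illustration.

```python
import random

def synthetic_workload(num_docs, num_requests, alpha=0.8, one_timer_frac=0.5,
                       pareto_shape=1.2, min_size=1000, seed=1):
    """Return a list of (doc_id, size_bytes) requests. All defaults are hypothetical."""
    rng = random.Random(seed)
    num_one_timers = int(one_timer_frac * num_docs)
    num_popular = num_docs - num_one_timers
    if num_popular < 1 or num_requests < num_one_timers:
        raise ValueError("need a popular document and one request per one-timer")

    # Heavy-tailed sizes: Pareto-distributed, never below min_size bytes.
    sizes = [int(min_size * rng.paretovariate(pareto_shape))
             for _ in range(num_docs)]

    # Zipf-like popularity: the document of rank r is drawn with weight 1 / r**alpha.
    weights = [1.0 / rank ** alpha for rank in range(1, num_popular + 1)]
    refs = rng.choices(range(num_popular), weights=weights,
                       k=num_requests - num_one_timers)

    # One-timers: each of the remaining documents is referenced exactly once.
    refs.extend(range(num_popular, num_docs))
    rng.shuffle(refs)
    return [(doc, sizes[doc]) for doc in refs]

trace = synthetic_workload(num_docs=100, num_requests=1000)
```

Documents drawn from the Zipf-like part may also end up referenced only once, or not at all, so `one_timer_frac` is a lower bound on the share of one-timers among referenced documents.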
12. Workload Characteristics
13. Zipf-like Referencing Behaviour
[Figure: document popularity plots for the two traces]
- Empirical trace: slope 0.81
- Synthetic trace: slope 0.83
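A slope like the ones quoted above is commonly estimated by fitting a straight line to reference count versus popularity rank on log-log axes. The sketch below assumes that method; the slides do not say how the slopes were computed, and `zipf_slope` is a hypothetical helper. The fitted slope is negative for a decaying popularity profile, so the slide values presumably quote the magnitude.

```python
import math
from collections import Counter

def zipf_slope(requests):
    """Least-squares slope of log(reference count) against log(popularity rank).

    Needs at least two distinct documents in the request stream.
    """
    counts = sorted(Counter(requests).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical stream in which the document of rank r is referenced 60 / r times,
# i.e. an exact Zipf profile with exponent 1.
requests = [rank for rank in range(1, 6) for _ in range(60 // rank)]
slope = zipf_slope(requests)
```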
14. Performance Metrics
- Document Hit Ratio
  - Percent of requested docs found in cache (HR)
- Byte Hit Ratio
  - Percent of requested bytes found in cache (BHR)
Notes:
- application-level simulation (files), not network-level (pkts)
- all three caches always identical in size
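The two metrics defined above can diverge sharply when hits fall mostly on small documents. A minimal sketch, assuming each simulated request is recorded as a hypothetical `(size_bytes, was_hit)` pair:

```python
def hit_ratios(outcomes):
    """Return (HR, BHR) as percentages from (size_bytes, was_hit) pairs."""
    total_docs = total_bytes = hit_docs = hit_bytes = 0
    for size, was_hit in outcomes:
        total_docs += 1
        total_bytes += size
        if was_hit:
            hit_docs += 1
            hit_bytes += size
    if total_docs == 0 or total_bytes == 0:
        return 0.0, 0.0
    return 100.0 * hit_docs / total_docs, 100.0 * hit_bytes / total_bytes

# Hypothetical outcome log: the small documents hit, the large ones miss.
outcomes = [(1000, True), (9000, False), (1000, True), (9000, False)]
hr, bhr = hit_ratios(outcomes)
```

Here half of the requests are hits, but they account for only a tenth of the requested bytes.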
15. Experiment 1: Different Policies at Different Levels of the Hierarchy (Complete Overlap)
[Figure: (a) Hit Ratio, (b) Byte Hit Ratio]
17. Experiment 2: Sensitivity to Workload Overlap
- The greater the degree of workload overlap amongst the child proxies, the greater the role for the parent cache
- In the no overlap scenario, the parent cache has negligible hit ratios, particularly when child caches are large
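The role of overlap can be seen in a toy two-level model in which child misses are forwarded to a shared parent. This is a sketch under stated assumptions, not the experiment itself: all caches are LRU with capacity counted in documents, the parent hit ratio is taken over the requests that reach the parent, and the function names and request streams are hypothetical.

```python
from collections import OrderedDict

def lru_access(cache, capacity, doc_id):
    """LRU lookup on an OrderedDict; returns True on a hit, admits on a miss."""
    if doc_id in cache:
        cache.move_to_end(doc_id)
        return True
    cache[doc_id] = True
    if len(cache) > capacity:
        cache.popitem(last=False)     # evict the least recently used
    return False

def parent_hit_ratio(requests, child_capacity, parent_capacity):
    """requests: (child_index, doc_id) pairs in arrival order."""
    children = {}
    parent = OrderedDict()
    parent_requests = parent_hits = 0
    for child, doc in requests:
        cache = children.setdefault(child, OrderedDict())
        if lru_access(cache, child_capacity, doc):
            continue                  # served by the child; the parent never sees it
        parent_requests += 1
        parent_hits += lru_access(parent, parent_capacity, doc)
    return parent_hits / parent_requests if parent_requests else 0.0

# Complete overlap: both children request the same documents.
full = [(0, "a"), (1, "a"), (0, "b"), (1, "b"), (0, "c"), (1, "c")]
# No overlap: the children request disjoint documents.
none = [(0, "a"), (1, "x"), (0, "b"), (1, "y"), (0, "c"), (1, "z")]

full_overlap_hr = parent_hit_ratio(full, child_capacity=4, parent_capacity=4)
no_overlap_hr = parent_hit_ratio(none, child_capacity=4, parent_capacity=4)
```

With complete overlap the second child's first reference to each document hits in the parent; with no overlap the parent only ever sees first-time references.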
18. Experiment 3: Size-based Partitioning
- Partition files across the two levels of the hierarchy based on size (e.g., keep small files at the lower level and large files at the upper level) (or vice versa)
- Three size thresholds for small...
  - 5,000 bytes
  - 10,000 bytes
  - 100,000 bytes
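The partitioning rule above amounts to an admission test at each level. In the sketch below the function `caching_level` is hypothetical, and treating a file of exactly the threshold size as small is an assumption, since the slides do not say how the boundary is handled.

```python
def caching_level(size_bytes, threshold_bytes, small_at_lower=True):
    """Return which level ("lower" or "upper") is allowed to cache a file."""
    is_small = size_bytes <= threshold_bytes     # boundary handling is an assumption
    return "lower" if is_small == small_at_lower else "upper"

# Hypothetical file sizes, checked against the three thresholds from the slide.
sizes = [500, 4000, 8000, 50000, 200000]
lower_counts = {
    threshold: sum(caching_level(s, threshold) == "lower" for s in sizes)
    for threshold in (5000, 10000, 100000)
}
```

Passing `small_at_lower=False` gives the reversed arrangement, with large files at the lower level and small files at the upper level.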
19. Small files at the lower level, large files at the upper level
[Figure: panel labelled Parent; size threshold 5,000 bytes]
20. Large files at the lower level, small files at the upper level
[Figure: size threshold 5,000 bytes]
21. Summary: Multi-Level Caches
- Different policies at different levels
  - LRU/LFU-Aging at the lower level and GD-Size at the upper level provided improvement in performance
  - GD-Size at both levels provided better performance in hit ratio, but with some penalty in byte hit ratio
- Size-threshold approach
  - Small files at the lower level and large files at the upper level provided improvement in performance
  - Reversing this policy offered no performance advantage
22. Conclusions
- ProWGen is a valuable tool for the evaluation of Web proxy caching architectures, using synthetic workloads
- Existing multi-level caching hierarchies are not always that effective
- Heterogeneous caching architectures may better exploit workload characteristics and improve Web caching performance
23. Future Work
- Extend and improve ProWGen
- Use of packet-level simulations to understand protocol/network-level effects
- Port ProWGen to network emulation testbed at the U of Calgary
24. For More Information...
- M. Busari, "Simulation Evaluation of Web Caching Hierarchies", M.Sc. Thesis, Dept. of Computer Science, U. Saskatchewan, June 2000
- ProWGen tool
  - http://www.cs.usask.ca/faculty/carey/software/
- Email: carey_at_cpsc.ucalgary.ca
- http://www.cpsc.ucalgary.ca/carey/