Title: Evaluation of Cooperative Web Caching with Web Polygraph
1. Evaluation of Cooperative Web Caching with Web Polygraph
- Ping Du and Jaspal Subhlok
- Department of Computer Science
- University of Houston
- presented at WCW 02, Aug 14
2. Web Cache Hierarchy
3. Motivation and Goals
- No practical methods exist to evaluate cache hierarchies under specific workload and network conditions
  - Important for designing a caching solution
- Criteria for an evaluation system
  - Model reality well
  - Applicable to different protocols and structures
  - Experiments should be repeatable
  - Use both hit rate and user response time as metrics
- Solution based on Web Polygraph
4. Cache Evaluation with Web Polygraph
[Diagram: three Polyclt clients, one Proxy Cache, three Polysrv servers]
- Synthetic HTTP clients and servers on real machines on a LAN
- Workload parameterized by size, distribution, popularity, load, and many others
5. Hierarchy Evaluation with Web Polygraph
[Diagram: three Polyclt clients, three Proxy Caches, three Polysrv servers]
6. Evaluation Framework
- Web Polygraph
  - Reports throughput, response time, hit ratio, etc. from the clients' viewpoint (but is unaware of the hierarchy)
- Dummynet
  - Used to simulate networks of different capabilities by controlling bandwidth, latency, and packet loss
- Squid cache and Squeezer log analysis tool
  - Captures cache cooperation info
  - Modified to monitor specific Polygraph phases
- Squeezer and Polygraph info has to be reconciled
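As a rough illustration of the cooperation info that Squid logs expose, the sketch below classifies entries from a Squid access.log in its native format into local hits, sibling hits, parent hits, and misses, and averages the elapsed time for each class. It is not the Squeezer tool. The sample log lines, addresses, and function names are made up for illustration; only the field layout and the result and hierarchy codes (TCP_HIT, TCP_MISS, SIBLING_HIT, PARENT_HIT, UDP_*) come from Squid.

```python
from collections import defaultdict

# Hypothetical sample lines in Squid's native access.log layout:
# time elapsed_ms client code/status bytes method URL ident hierarchy/peer type
SAMPLE_LOG = [
    "1029312000.123 12 10.0.0.5 TCP_HIT/200 4096 GET http://example.com/a - NONE/- text/html",
    "1029312000.456 85 10.0.0.5 TCP_MISS/200 4096 GET http://example.com/b - SIBLING_HIT/10.0.0.2 text/html",
    "1029312000.789 240 10.0.0.6 TCP_MISS/200 4096 GET http://example.com/c - DIRECT/10.0.1.1 text/html",
    "1029312001.000 0 10.0.0.2 UDP_MISS/000 60 ICP_QUERY http://example.com/c - NONE/- -",
]


def classify(line):
    """Return 'local_hit', 'sibling_hit', 'parent_hit', 'miss', or None to skip."""
    fields = line.split()
    if len(fields) < 9:
        return None  # malformed or truncated line
    result_code = fields[3].split("/")[0]
    hierarchy_code = fields[8].split("/")[0]
    if result_code.startswith("UDP_"):
        return None  # ICP queries from peers, not client requests
    if "HIT" in result_code:
        return "local_hit"
    if hierarchy_code == "SIBLING_HIT":
        return "sibling_hit"
    if hierarchy_code == "PARENT_HIT":
        return "parent_hit"
    return "miss"


def summarize(lines):
    """Map each request class to (count, mean elapsed time in msec)."""
    totals = defaultdict(lambda: [0, 0])  # class -> [count, total elapsed msec]
    for line in lines:
        kind = classify(line)
        if kind is None:
            continue
        totals[kind][0] += 1
        totals[kind][1] += int(line.split()[1])
    return {kind: (n, ms / n) for kind, (n, ms) in totals.items()}


for kind, (count, mean_ms) in sorted(summarize(SAMPLE_LOG).items()):
    print(f"{kind}: {count} request(s), mean {mean_ms:.1f} msec")
```

Polygraph sees only whether a reply was a hit or a miss at the client, so a per-cache breakdown like this is the part that has to be matched against Polygraph's own report for the same phase.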
7. Experimental Setup
- Experiments performed on different cache hierarchies of two, three, and four Squid caches
- Hardware configuration of all Squid machines is the same (800 MHz, 256 MB, four 30 GB disks)
- Polygraph machines and caches on the same 100 Mbps switched Ethernet network
- Balanced workload
- Cache fill-up phase not measured
8. List of Experiments
- Performance with different cache hierarchies
- Influence of network latency
- Influence of cache size
- Influence of the document sharing pattern
- One big cache compared to multiple caches
- Virtually unlimited experiment space with many parameters (e.g., request rate, public interest, cache and memory size, etc.)
9. List of Cache Hierarchies
[Diagram: cache hierarchies labeled 2OY, 3OY, 2SY, 3SY, 1ON-2OY, 1ON-2SY, 2SY-1OY, 2OY-1OY, and 1OY-2SY. Legend: Cache, Client, Sibling-sibling link, Parent-child link.]
Same memory and disk per cache, fixed total request rate, no network delay
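The sibling-sibling and parent-child relationships listed above correspond to the peer types of Squid's `cache_peer` directive. The sketch below generates such lines for one cache's squid.conf. The hostnames are hypothetical, the ports are Squid's defaults (3128 for HTTP, 3130 for ICP), and any further peer options used in the actual experiments are not known from the slides.

```python
def peer_lines(peers, http_port=3128, icp_port=3130):
    """Build squid.conf cache_peer lines.

    peers: list of (hostname, relation) pairs, relation being
    'sibling' or 'parent' as seen from the cache being configured.
    """
    lines = []
    for host, relation in peers:
        if relation not in ("sibling", "parent"):
            raise ValueError(f"unsupported peer relation: {relation!r}")
        lines.append(f"cache_peer {host} {relation} {http_port} {icp_port}")
    return lines


# Hypothetical example: a cache with one sibling and one parent.
for line in peer_lines([
    ("cache2.example.net", "sibling"),
    ("cache0.example.net", "parent"),
]):
    print(line)
```

Each cache in a hierarchy gets its own list of peers, so a two-sibling setup needs one such line on each of the two machines, pointing at the other.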
10. Simulation Results - Different Hierarchies
- Improved hit ratio overcomes overheads of peering
- Parents appear less important than siblings
11. List of Experiments
- Performance with different cache hierarchies
- Influence of network latency
  - 2 and 3 Squid caches, independent or as siblings
  - Network delay of 0, 40, or 80 msec between caches
- Influence of cache size
- Influence of the document sharing pattern
- One big cache compared to multiple caches
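The inter-cache delays listed above are the kind of setting Dummynet applies through `ipfw` pipes on FreeBSD. The sketch below only builds the command strings and does not run them. The addresses and the helper name are hypothetical, and whether the experiments applied the delay in one direction or both is an assumption left to the caller.

```python
def dummynet_commands(pipe_id, src, dst, delay_ms=0, bandwidth=None, loss=0.0):
    """Return ipfw commands that route src->dst traffic through a Dummynet pipe.

    delay_ms:  added one-way propagation delay in milliseconds
    bandwidth: optional ipfw bandwidth string such as '10Mbit/s'
    loss:      optional packet loss rate between 0 and 1
    """
    if not 0.0 <= loss <= 1.0:
        raise ValueError("loss must be between 0 and 1")
    commands = [f"ipfw add pipe {pipe_id} ip from {src} to {dst}"]
    config = f"ipfw pipe {pipe_id} config delay {delay_ms}ms"
    if bandwidth:
        config += f" bw {bandwidth}"
    if loss:
        config += f" plr {loss}"
    commands.append(config)
    return commands


# Hypothetical cache addresses; one pipe per delay setting from the slide.
for pipe_id, delay in enumerate((0, 40, 80), start=1):
    for command in dummynet_commands(pipe_id, "10.0.0.1", "10.0.0.2", delay_ms=delay):
        print(command)
```

Because the pipe matches traffic in one direction only, a symmetric delay between two caches would need a second rule with the addresses swapped.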
12. Impact of Network Latency
- Hit ratio unaffected by latency
- Hit and miss response times increase with latency
- Some increase in response time going from 0 to 40 to 80 msec
- Cache cooperation is helpful even with modest network delay
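One way to read the last point is through a simple weighted-average model of user response time: cooperation turns some misses into remote hits, which helps as long as a remote hit stays cheaper than a miss. The sketch below is an illustrative model, not the method used in the experiments, and every number in the usage lines is hypothetical rather than a measured result.

```python
def mean_response_time(local_hit, remote_hit, t_local, t_remote, t_miss):
    """Expected response time as a weighted average over request outcomes.

    local_hit, remote_hit: fractions of requests served by the local cache
                           and by a peer cache (the remainder are misses)
    t_local, t_remote, t_miss: response times for each outcome, in msec
    """
    miss = 1.0 - local_hit - remote_hit
    if local_hit < 0 or remote_hit < 0 or miss < 0:
        raise ValueError("hit fractions must be non-negative and sum to at most 1")
    return local_hit * t_local + remote_hit * t_remote + miss * t_miss


# Hypothetical numbers: isolated cache vs. the same cache with a sibling.
isolated = mean_response_time(0.5, 0.0, t_local=10, t_remote=50, t_miss=200)
cooperating = mean_response_time(0.5, 0.25, t_local=10, t_remote=50, t_miss=200)
print(f"isolated: {isolated} msec, cooperating: {cooperating} msec")
```

In this model, added inter-cache delay raises `t_remote` (and `t_miss`, if the cache waits for peer replies before going to the origin), so the benefit shrinks as latency grows but remains while `t_remote` is below `t_miss`.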
13. Conclusion
- Web Polygraph based framework to evaluate cooperative caching
- Flexible
- Works on a real network
- Workload characteristics are easy to specify
- Repeatable experiments
- Hit ratio and user response time based metrics
- Captures actual cooperation overheads
14. Future Work
- Make the toolset easily usable by the community (currently only recipe-style help is available)
- Evaluation of large hierarchies may need a combination of experimental and analytical methods
- More results on the performance of different kinds of hierarchies in different scenarios
15. Influence of Cache Size
- Two Squid caches, running isolated or as siblings
- Various total disk cache sizes
- Same total memory cache size
- Same constant request rate
- No network latency between caches
16. Simulation Results - Cache Size
- Cooperative caches
  - Higher hit and miss response times
- Miss response time is stable
- Increase in hit response time
  - Tied to the ratio of memory to disk cache size
- Performance with increase of cache size
  - Improves quickly
  - Stabilizes gradually
- Benefits of cooperation increase
17. Influence of the Document Sharing Pattern
- Two Squid caches, running isolated or as siblings
- Various document sharing patterns
  - Global URL space
  - Public interest: the percentage of all documents shared by Polygraph clients
- Same total disk cache size
- Same total memory cache size
- Same constant request rate
- No network latency between caches
18. Simulation Results - Document Sharing Pattern
- Performance improves with public interest
- Influence is mainly on remote hits
19. Working Set Size
20. Performance of one big cache compared to multiple caches
- One, two, three, and four Squid caches
- Isolated or all-siblings cache hierarchies
- Best effort workload
  - Constant rate vs. best effort workload
  - Used to get the best throughput
- Same total disk cache size
- Same total memory cache size
- No network latency between caches
21. Simulation Results - one big cache compared to multiple caches
- One cache: the worst throughput
- Two separate caches: large improvement
- More separate caches: performance declines
- All-siblings hierarchy
  - Improvement is more stable
  - Levels off quickly
  - Overheads of peering eventually outweigh the improved hit ratio
22. Methodology - Phase Schedule
- Web Polygraph provides a scheme to customize the desired workload pattern through a phase schedule
- Caches are in a stable state after the Fill phase
- Simulates a daily Web traffic pattern in a short period
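To make the idea of a phase schedule concrete, the sketch below models a schedule as a list of phases, each with a duration and a load factor that changes linearly from its start to its end. This is a plain Python model, not Polygraph's own workload language. The phase names other than "fill", the durations, and the load factors are all hypothetical.

```python
# Each phase: (name, duration in minutes, load factor at start, load factor at end).
# Hypothetical schedule: ramp up, fill the caches, measure at peak, then wind down.
PHASES = [
    ("ramp", 10, 0.0, 1.0),
    ("fill", 60, 1.0, 1.0),
    ("top", 30, 1.0, 1.0),
    ("dec", 10, 1.0, 0.1),
]


def load_factor(minute, phases=PHASES):
    """Return (phase name, load factor) at a given minute of the run."""
    start = 0.0
    for name, duration, beg, end in phases:
        if minute < start + duration:
            fraction = (minute - start) / duration
            return name, beg + fraction * (end - beg)
        start += duration
    return None, 0.0  # past the end of the schedule


def measured_phases(phases=PHASES, skip=("ramp", "fill")):
    """Names of the phases that count toward results once the caches are stable."""
    return [name for name, *_ in phases if name not in skip]


for minute in (5, 40, 85, 105):
    print(minute, load_factor(minute))
print("measured:", measured_phases())
```

Excluding the ramp and fill phases from the reported numbers mirrors the point that caches are only in a stable state after the Fill phase.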