On the Scale and Performance of Cooperative Web Proxy Caching presentation

1
On the Scale and Performance of Cooperative Web
Proxy Caching

University of Washington
Alec Wolman, Geoff Voelker, Nitin Sharma, Neal
Cardwell, Anna Karlin, Henry Levy

2
Caching for a Better Web

Performance is a major concern in the Web
Proxy caching is the most commonly used method to
improve Web performance
Duplicate requests to the same document served
from cache
Hits reduce latency, network utilization, server
load
Misses increase latency (extra hops)

3
Cache Effectiveness

Previous work has shown that hit rate increases
with population size [Duska et al. 97, Breslau
et al. 98]
A single proxy cache has practical limits
Load, network topology, organizational
constraints
One technique to scale the client population is
to have proxy caches cooperate

4
Cooperative Web Proxy Caching

Sharing and/or coordination of cache state among
multiple Web proxy cache nodes
Effectiveness of proxy cooperation depends on:
Inter-proxy communication distance
Size of client population served
Proxy utilization and load balance

5
Cooperative Web Caching

How much benefit does cooperative caching provide
in the Web environment?

6
Outline

Introduction and related work
Trace methodology
Cooperative caching for small- and medium-scale
populations
Scaling to larger client populations
Latency
Conclusions

7
Previous Research

Cooperative proxy caching is a popular research
topic
e.g. Chankhunthod et al. 96, Zhang et al. 97,
Fan et al. 98, Krishnan et al. 98, Menaud et al.
98, Tewari et al. 98, Touch 98, Karger et al. 99
...
Focus is on highly scalable algorithms
Some seek to scale to the entire Web

8
Challenges

No real understanding of document sharing across
diverse organizations
Little analytic or empirical evaluation of these
algorithms using realistic workloads for
large-scale client populations
Problem
Evaluating cooperative proxy caching requires
multiple simultaneous traces of Web proxies,
across a diverse set of organizations

9
Our Contribution

We use multi-organization traces to evaluate
cooperative proxy caching at small and medium
scales
We use analytic modelling to evaluate cooperative
caching at scales beyond those available in our
traces

10
A Multi-Organization Trace

University of Washington (UW) is a large and
diverse client population
Approximately 50K people
UW client population contains 200 independent
campus organizations
Museums of Art and Natural History
Schools of Medicine, Dentistry, Nursing
Departments of Computer Science, History, and
Music
A trace of UW is effectively a simultaneous trace
of 200 diverse client organizations

11
Cooperation Across Organizations

By considering each UW organization as an
independent company with its own clients and
its own proxy, we can empirically evaluate
cooperative caching across diverse client
populations
How much Web document reuse is there between
these organizations?
Place a proxy cache in front of each
organization. What is the benefit of cooperative
caching among these 200 proxies?

12
UW Trace Characteristics

Trace collected at UW network border (May 1999)
Filtered requests from UW clients, responses
from external Web servers
Most requests come directly from clients (0.5%
come from proxies)

13
Question

What is the benefit of cooperative caching among
the 200 UW organizational proxies?

14
Ideal Hit Rates for UW proxies

Ideal hit rate - infinite storage, ignore
cacheability, expirations
Average ideal local hit rate: 43%

15
Ideal Hit Rates for UW proxies

Ideal hit rate - infinite storage, ignore
cacheability, expirations
Average ideal local hit rate: 43%
Explore benefits of perfect cooperation rather
than a particular algorithm
Average ideal hit rate increases from 43% to 69%
with cooperative caching

16
Cacheable Hit Rates for UW proxies

Cacheable hit rate - same as ideal, but doesn't
ignore cacheability
Cacheable hit rates are much lower than ideal
(average is 20%)
Average cacheable hit rate increases from 20% to
41% with (perfect) cooperative caching
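
The ideal and cacheable hit rates defined on these slides can be illustrated with a minimal sketch. It is not the authors' simulator: the trace format (organization, URL, cacheable flag), the function name, and the toy trace are all hypothetical. It assumes infinite storage and no expirations, models perfect cooperation as one shared set of previously seen URLs, and counts an uncacheable request as a miss that is never stored.

```python
from collections import defaultdict


def average_hit_rates(trace, respect_cacheability=False):
    """Return (avg local hit rate, avg cooperative hit rate) across organizations.

    trace: iterable of (org, url, cacheable) tuples in request order.
    This tuple layout is an assumption made for illustration.
    """
    local_seen = defaultdict(set)   # one infinite cache per organization
    global_seen = set()             # union of all caches = perfect cooperation
    requests = defaultdict(int)
    local_hits = defaultdict(int)
    coop_hits = defaultdict(int)

    for org, url, cacheable in trace:
        requests[org] += 1
        if respect_cacheability and not cacheable:
            continue  # counted as a miss and never stored
        if url in local_seen[org]:
            local_hits[org] += 1
        if url in global_seen:
            coop_hits[org] += 1
        local_seen[org].add(url)
        global_seen.add(url)

    if not requests:
        return 0.0, 0.0
    n = len(requests)
    avg_local = sum(local_hits[o] / requests[o] for o in requests) / n
    avg_coop = sum(coop_hits[o] / requests[o] for o in requests) / n
    return avg_local, avg_coop


# Hypothetical toy trace: two organizations, two documents.
toy_trace = [
    ("cs", "/a", True),
    ("music", "/a", True),
    ("cs", "/a", True),
    ("music", "/b", False),
    ("cs", "/b", False),
    ("music", "/b", False),
]

ideal_local, ideal_coop = average_hit_rates(toy_trace)
cache_local, cache_coop = average_hit_rates(toy_trace, respect_cacheability=True)
```

On the toy trace, cooperation raises the average hit rate in both modes, and respecting cacheability lowers both figures, which is the qualitative pattern the slides report for the UW proxies.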

17
Scaling Cooperative Caching

Organizations of this size can benefit
significantly from cooperative caching
We don't need cooperative caching to handle the
entire UW population size
A single proxy (or small cluster) can handle this
entire population!
No technical reason to use cooperative caching
for this environment
In the real world, decisions of proxy placement
are often political or geographical
How effective is cooperative caching at scales
where a single cache will not work?

18
Hit Rate vs. Client Population

Curves similar to other studies
e.g., Duska97, Breslau98
Small organizations
Significant increase in hit rate as client
population increases
The reason why cooperative caching is effective
for UW
Large organizations
Marginal increase in hit rate as client
population increases

19
Extrapolation to Larger Client Populations

Use least squares fit to create a linear
extrapolation of hit rates
Hit rate increases logarithmically with client
population, e.g., to increase hit rate by 10%:
Need 8 UWs (ideal)
Need 11 UWs (cacheable)
Low ceiling, though
100% at 11.3M clients (UW ideal)
61% at 2.1M clients (UW cacheable)
A city-wide cooperative cache would get all the
benefit
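
The logarithmic relationship on this slide can be written as a small model. This is a sketch under stated assumptions, not the authors' fitted curve: it reads "increase hit rate by 10" as 10 percentage points gained each time the population is multiplied by a fixed factor (8 for ideal, 11 for cacheable), and the function and parameter names are hypothetical.

```python
import math

POINTS_PER_STEP = 0.10  # assumed: 10 percentage points per population multiple


def extrapolate_hit_rate(base_rate, population_factor, factor_per_step):
    """Hit rate after scaling the population by population_factor.

    base_rate: hit rate at the starting population (fraction, 0..1).
    factor_per_step: population multiple that buys POINTS_PER_STEP
                     (8 for ideal, 11 for cacheable, per the slide).
    """
    gain = POINTS_PER_STEP * math.log(population_factor, factor_per_step)
    return min(1.0, base_rate + gain)


def factor_to_reach(base_rate, target_rate, factor_per_step):
    """Population multiple needed to move from base_rate to target_rate."""
    steps = (target_rate - base_rate) / POINTS_PER_STEP
    return factor_per_step ** steps


# One more "step" of 8x population adds 10 points to an ideal hit rate.
one_step_ideal = extrapolate_hit_rate(0.69, 8, 8)

# Gaining 30 points at 8x per step needs 8**3 times the population.
growth_for_30_points = factor_to_reach(0.69, 0.99, 8)
```

Because each additional 10 points costs another 8x or 11x in clients, the required population grows exponentially with the target hit rate, which is why the curve flattens for large organizations.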

20
Question

What is the benefit of cooperative caching among
large organizations?

21
UW and Microsoft Cooperation

What if we ran a wire across Lake Washington, to
connect UW and Microsoft?
We collected a Microsoft proxy trace during same
time period as the UW trace
Combined population is 80K clients
Increases the UW population by a factor of 3.6
Increases the MS population by a factor of 1.4
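
The two scaling factors imply the approximate size of each population. The short calculation below is only a consistency check on the rounded figures from this slide; the variable and function names are hypothetical.

```python
def implied_populations(combined, uw_factor, ms_factor):
    """Back out each population from the combined size and its growth factor."""
    uw = combined / uw_factor   # combined = uw * uw_factor
    ms = combined / ms_factor   # combined = ms * ms_factor
    return uw, ms


# Figures from the slide, in thousands of clients.
uw_k, ms_k = implied_populations(80.0, 3.6, 1.4)

# The two implied populations should add back up to roughly the combined size.
rounding_gap_k = 80.0 - (uw_k + ms_k)
```

The implied sizes are roughly 22K clients for UW and 57K for Microsoft, and their sum falls within rounding error of the stated 80K.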

22
UW and Microsoft Traces
23
UW and MS Cooperative Caching

Is this worth it?

24
What about Latency?

From the client's perspective, latency matters
far more than hit rate
How does latency change with population?
Median latencies improve by only a few hundred ms
with ideal caching compared to no caching.
On average, a web page consists of 4.5 HTTP
objects

25
Conclusions

A negative result: without significant workload
changes, designing highly scalable cooperative
proxy-cache schemes is unnecessary
Largest benefit is achieved with small
populations (up to 2K-5K clients)
Limited benefit of cooperation when we combined
the UW and Microsoft populations
Document cacheability is a severe limitation with
current workloads
Analytic model results:
Confirm that most of the benefit is obtained once
you reach populations the size of a large city
Future workloads: large-scale cooperative
caching could become more relevant with different
rate-of-change characteristics

26
UW Cooperative Caching Results
27
Extrapolating UW and MS Hit Rates
28
UW Latency
29
Complete Hit Rate Graph
30
Document Cacheability