On the Scale and Performance of Cooperative Web Proxy Caching

1
On the Scale and Performance of Cooperative Web
Proxy Caching
  • University of Washington
  • Alec Wolman, Geoff Voelker, Nitin Sharma, Neal
    Cardwell, Anna Karlin, Henry Levy

2
Caching for a Better Web
  • Performance is a major concern in the Web
  • Proxy caching is the most commonly used method to
    improve Web performance
  • Duplicate requests to the same document served
    from cache
  • Hits reduce latency, network utilization, server
    load
  • Misses increase latency (extra hops)
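
The trade-off between the last two bullets can be sketched with a simple two-outcome model: a hit is served at cache latency, while a miss pays the origin latency plus the extra hop through the proxy. The function names and all millisecond values below are illustrative assumptions, not figures from the talk.

```python
def expected_latency_ms(hit_rate, hit_ms, miss_ms, extra_hop_ms):
    """Mean request latency behind a proxy under a two-outcome model.

    hit_ms, miss_ms and extra_hop_ms are hypothetical inputs: latency of a
    cache hit, latency of fetching from the origin without a proxy, and the
    overhead a miss pays for going through the proxy first.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_ms + (1.0 - hit_rate) * (miss_ms + extra_hop_ms)


def break_even_hit_rate(hit_ms, miss_ms, extra_hop_ms):
    """Hit rate at which the proxy neither helps nor hurts mean latency."""
    return extra_hop_ms / (miss_ms + extra_hop_ms - hit_ms)


# Illustrative numbers only (assumed, not measured).
h = break_even_hit_rate(hit_ms=20, miss_ms=200, extra_hop_ms=30)
print(f"break-even hit rate: {h:.3f}")
print(f"latency at 40% hits: {expected_latency_ms(0.4, 20, 200, 30):.1f} ms")
```

Below the break-even hit rate, the extra hop on misses outweighs the savings on hits under this model.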

3
Cache Effectiveness
  • Previous work has shown that hit rate increases
    with population size [Duska et al. 97, Breslau
    et al. 98]
  • A single proxy cache has practical limits:
  • Load, network topology, organizational
    constraints
  • One technique to scale the client population is
    to have proxy caches cooperate

4
Cooperative Web Proxy Caching
  • Sharing and/or coordination of cache state among
    multiple Web proxy cache nodes
  • Effectiveness of proxy cooperation depends on:
  • Inter-proxy communication distance
  • Size of client population served
  • Proxy utilization and load balance

5
Cooperative Web Caching
  • How much benefit does cooperative caching provide
    in the Web environment?

6
Outline
  • Introduction and related work
  • Trace methodology
  • Cooperative caching for small- and medium-scale
    populations
  • Scaling to larger client populations
  • Latency
  • Conclusions

7
Previous Research
  • Cooperative proxy caching is a popular research
    topic
  • e.g., [Chankhunthod et al. 96, Zhang et al. 97,
    Fan et al. 98, Krishnan et al. 98, Menaud et al.
    98, Tewari et al. 98, Touch 98, Karger et al. 99,
    ...]
  • Focus is on highly scalable algorithms
  • Some seek to scale to the entire Web

8
Challenges
  • No real understanding of document sharing across
    diverse organizations
  • Little analytic or empirical evaluation of these
    algorithms using realistic workloads for
    large-scale client populations
  • Problem:
  • Evaluating cooperative proxy caching requires
    multiple simultaneous traces of Web proxies,
    across a diverse set of organizations

9
Our Contribution
  • We use multi-organization traces to evaluate
    cooperative proxy caching at small and medium
    scales
  • We use analytic modelling to evaluate cooperative
    caching at scales beyond those available in our
    traces

10
A Multi-Organization Trace
  • University of Washington (UW) is a large and
    diverse client population
  • Approximately 50K people
  • UW client population contains 200 independent
    campus organizations
  • Museums of Art and Natural History
  • Schools of Medicine, Dentistry, Nursing
  • Departments of Computer Science, History, and
    Music
  • A trace of UW is effectively a simultaneous trace
    of 200 diverse client organizations

11
Cooperation Across Organizations
  • By considering each UW organization as an
    independent company with its own clients and
    its own proxy, we can empirically evaluate
    cooperative caching across diverse client
    populations
  • How much Web document reuse is there between
    these organizations?
  • Place a proxy cache in front of each
    organization. What is the benefit of cooperative
    caching among these 200 proxies?

12
UW Trace Characteristics
  • Trace collected at UW network border (May 1999)
  • Filtered requests from UW clients, responses
    from external Web servers
  • Most requests come directly from clients (0.5%
    come from proxies)

13
Question
  • What is the benefit of cooperative caching among
    the 200 UW organizational proxies?

14
Ideal Hit Rates for UW proxies
  • Ideal hit rate - infinite storage, ignore
    cacheability, expirations
  • Average ideal local hit rate: 43%

15
Ideal Hit Rates for UW proxies
  • Ideal hit rate - infinite storage, ignore
    cacheability, expirations
  • Average ideal local hit rate: 43%
  • Explore benefits of perfect cooperation rather
    than a particular algorithm
  • Average ideal hit rate increases from 43% to 69%
    with cooperative caching

16
Cacheable Hit Rates for UW proxies
  • Cacheable hit rate - same as ideal, but doesn't
    ignore cacheability
  • Cacheable hit rates are much lower than ideal
    (average is 20%)
  • Average cacheable hit rate increases from 20% to
    41% with (perfect) cooperative caching
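
The ideal and cacheable hit rates defined on the last few slides can be sketched as a single pass over a trace. This is a minimal illustration, not the authors' simulator: the trace format `(org, url, cacheable)` and the function name are assumptions, and "perfect cooperation" is modeled as every proxy being able to see what any other proxy has already fetched. Expirations are ignored in both modes.

```python
def infinite_cache_hit_rates(trace, respect_cacheability=False):
    """Return (local_hit_rate, cooperative_hit_rate) with infinite storage.

    trace: iterable of (org, url, cacheable) tuples in request order
           (a hypothetical format chosen for this sketch).
    respect_cacheability: False gives the "ideal" rate; True gives the
           "cacheable" rate, where uncacheable responses are never stored
           and therefore always count as misses.
    """
    cache_by_org = {}       # org -> URLs held by that organization's proxy
    cached_anywhere = set()  # URLs held by any proxy (perfect cooperation)
    requests = local_hits = coop_hits = 0

    for org, url, cacheable in trace:
        requests += 1
        org_cache = cache_by_org.setdefault(org, set())
        if respect_cacheability and not cacheable:
            continue  # counted as a miss, never stored
        if url in org_cache:
            local_hits += 1
        if url in cached_anywhere:
            coop_hits += 1  # found locally or at a cooperating proxy
        org_cache.add(url)
        cached_anywhere.add(url)

    if requests == 0:
        return 0.0, 0.0
    return local_hits / requests, coop_hits / requests
```

Because every local hit is also a cooperative hit in this model, the cooperative rate can never be lower than the local rate; the gap between the two is the benefit attributed to cooperation.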

17
Scaling Cooperative Caching
  • Organizations of this size can benefit
    significantly from cooperative caching
  • We don't need cooperative caching to handle the
    entire UW population size
  • A single proxy (or small cluster) can handle this
    entire population!
  • No technical reason to use cooperative caching
    for this environment
  • In the real world, decisions of proxy placement
    are often political or geographical
  • How effective is cooperative caching at scales
    where a single cache will not work?

18
Hit Rate vs. Client Population
  • Curves similar to other studies
  • e.g., [Duska et al. 97, Breslau et al. 98]
  • Small organizations
  • Significant increase in hit rate as client
    population increases
  • The reason why cooperative caching is effective
    for UW
  • Large organizations
  • Marginal increase in hit rate as client
    population increases

19
Extrapolation to Larger Client Populations
  • Use least squares fit to create a linear
    extrapolation of hit rates
  • Hit rate increases logarithmically with client
    population, e.g., to increase hit rate by 10%:
  • Need 8 UWs (ideal)
  • Need 11 UWs (cacheable)
  • Low ceiling, though:
  • 100% at 11.3M clients (UW ideal)
  • 61% at 2.1M clients (UW cacheable)
  • A city-wide cooperative cache would get all the
    benefit
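
A minimal sketch of this kind of extrapolation, assuming the fitted model has the form hit_rate = a + b · ln(population); the slide does not give the exact form or the fitted coefficients, so the function names and the model are assumptions. Under that form, growing the population by a factor k adds b · ln(k) points, so reading "8 UWs" as a factor of 8 for a 10-point gain would imply b = 10 / ln 8 ≈ 4.8 points per unit of ln(population).

```python
import math


def fit_log_linear(populations, hit_rates):
    """Least-squares fit of hit_rate = a + b * ln(population); returns (a, b)."""
    if len(populations) != len(hit_rates) or len(populations) < 2:
        raise ValueError("need at least two (population, hit_rate) pairs")
    xs = [math.log(p) for p in populations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(hit_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("populations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, hit_rates))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def growth_factor_for_gain(b, gain_points):
    """Factor by which the population must grow to add gain_points."""
    return math.exp(gain_points / b)


def population_for_hit_rate(a, b, target):
    """Population at which the fitted line reaches the target hit rate."""
    return math.exp((target - a) / b)
```

Fitting against ln(population) is what makes the extrapolation a straight line on a log-scaled population axis, and it is also why each additional fixed gain in hit rate requires multiplying, not adding to, the client population.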

20
Question
  • What is the benefit of cooperative caching among
    large organizations?

21
UW and Microsoft Cooperation
  • What if we ran a wire across Lake Washington, to
    connect UW and Microsoft?
  • We collected a Microsoft proxy trace during same
    time period as the UW trace
  • Combined population is 80K clients
  • Increases the UW population by a factor of 3.6
  • Increases the MS population by a factor of 1.4
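
The per-site populations are not stated on this slide, but they can be backed out from the combined total and the two growth factors. The helper name below is an assumption, and the results are only as precise as the rounded figures on the slide.

```python
def implied_population(combined, growth_factor):
    """Original population, given the combined total and its growth factor."""
    return combined / growth_factor


combined = 80_000                           # "80K clients" from the slide
uw = implied_population(combined, 3.6)      # roughly 22K clients
ms = implied_population(combined, 1.4)      # roughly 57K clients

# The two estimates should sum to about the combined total; the small
# shortfall comes from the slide's rounded factors.
print(round(uw), round(ms), round(uw + ms))
```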

22
UW and Microsoft Traces

23
UW and MS Cooperative Caching
  • Is this worth it?

24
What about Latency?
  • From the client's perspective, latency matters
    far more than hit rate
  • How does latency change with population?
  • Median latencies improve by only a few hundred ms
    with ideal caching compared to no caching.
  • On average, a web page consists of 4.5 HTTP
    objects

25
Conclusions
  • A negative result: without significant workload
    changes, designing highly scalable cooperative
    proxy-cache schemes is unnecessary
  • Largest benefit is achieved with small
    populations (up to 2K-5K clients)
  • Limited benefit of cooperation when we combined
    the UW and Microsoft populations
  • Document cacheability is a severe limitation with
    current workloads
  • Analytic model results:
  • Confirm that most of the benefit is obtained once
    you reach populations the size of a large city
  • Future workloads: large-scale cooperative
    caching could become more relevant with different
    rate-of-change characteristics

26
UW Cooperative Caching Results

27
Extrapolating UW and MS Hit Rates

28
UW Latency

29
Complete Hit Rate Graph

30
Document Cacheability
