1
Web Cache Replacement Policies: Properties,
Limitations and Implications
  • Fabrício Benevenuto, Fernando Duarte, Virgílio
    Almeida, Jussara Almeida

Computer Science Department, Federal University of
Minas Gerais, Brazil
2
Summary
  • Introduction to Web caching
  • Motivations and goals
  • Evaluation methodology
  • Performance metrics
  • Workload description
  • Caching system simulator
  • Experimental results
  • Conclusions and future work

3
Web Caching
  • Dramatic growth of the WWW in terms of content,
    users, servers and complexity
  • Web caching is a common strategy used to
      • reduce traffic over the Internet
      • increase server scalability
      • reduce network latency
  • Caching is deployed through Web proxies

4
Web Caching
  • Web proxies can be seen as intermediaries of the
    traffic between the HTTP clients and servers
  • Nowadays the Web has a hierarchical topology

5
Web Caching
  • Cache replacement is one of the issues that a
    proxy should be able to manage
  • Because the cache has finite size, how does a
    proxy choose which page to remove once the cache
    is full?
  • Much research has addressed this question, and
    several cache replacement policies can be found
    in the literature
  • Key questions
      • Is the design of new cache replacement policies
        needed?
      • Which properties should new policies exploit to
        improve a caching system?

6
Goals
  • Investigate how much a new caching policy could
    improve cache system performance
  • Explore the main causes of periods of poor and
    high performance in caching systems

7
Evaluation Methodology
  • Evaluation of different metrics over time
      • Hit ratio
      • Percentage of first-timers
      • Maximum improvement
      • Entropy
  • Time intervals of 1, 10 and 100 minutes
  • Use of real workloads

8
Performance Metric: Hit Ratio
  • Hit ratio is the percentage of requests satisfied
    by the cache
  • It is the most general metric used to evaluate
    the effectiveness of a caching policy
  • Hit ratio is measured over time to detect periods
    of performance variation
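
Measuring hit ratio over time amounts to grouping requests into fixed-width intervals and computing the ratio in each one. The following is a minimal sketch under assumed inputs: the `(timestamp, was_hit)` event format and the function name are hypothetical, not taken from the authors' tooling.

```python
from collections import defaultdict

def hit_ratio_per_interval(events, interval_minutes):
    """Group (timestamp_seconds, was_hit) events into fixed-width
    intervals and return {interval_index: hit ratio in percent}."""
    width = interval_minutes * 60
    hits = defaultdict(int)
    total = defaultdict(int)
    for ts, was_hit in events:
        bucket = int(ts // width)      # index of the interval this request falls in
        total[bucket] += 1
        if was_hit:
            hits[bucket] += 1
    return {b: 100.0 * hits[b] / total[b] for b in sorted(total)}

# Hypothetical events: (seconds since trace start, served from cache?)
events = [(5, False), (30, True), (70, True), (95, True), (130, False)]
ratios = hit_ratio_per_interval(events, interval_minutes=1)
print(ratios)
```

Calling the same function with `interval_minutes` set to 10 or 100 gives the coarser views used in the evaluation.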

9
Performance Metric: Percentage of First-Timers
  • A first-timer is the first request for an object
    in the trace
  • Caching policies cannot satisfy first-timers
      • the object of a first-timer has never been
        requested in the past
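
Counting first-timers only requires remembering which objects have already been seen. A minimal sketch, assuming the request stream is a sequence of object identifiers (the function name and sample stream are illustrative):

```python
def first_timer_percentage(object_ids):
    """Return the percentage of requests that are the first
    request for their object in the given stream."""
    seen = set()
    first_timers = 0
    total = 0
    for obj in object_ids:
        total += 1
        if obj not in seen:            # never requested before: a first-timer
            first_timers += 1
            seen.add(obj)
    return 100.0 * first_timers / total if total else 0.0

# Hypothetical stream: "a", "b" and "c" each appear for the first time once
stream = ["a", "b", "a", "c", "a", "b"]
pct = first_timer_percentage(stream)
print(pct)
```

When the metric is tracked per time interval, the `seen` set has to persist across intervals so that an object counts as a first-timer only once in the whole trace.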

10
Performance Metric: Maximum Improvement
  • The maximum improvement MI is defined as
  • Maximum improvement over LRU
      • We evaluate how much a new caching policy could
        improve the hit ratio over the simple LRU policy
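
One simple way to reason about headroom over LRU follows from the first-timer property: since first-timers can never be hits, no policy can exceed a hit ratio of 100% minus the percentage of first-timers. The sketch below computes the gap between that bound and a measured LRU hit ratio. This is an assumption made for illustration and may differ from the MI definition used in the presentation; the function names and sample numbers are hypothetical.

```python
def hit_ratio_upper_bound(first_timer_pct):
    """Upper bound on hit ratio (percent): every request that is
    not a first-timer is assumed to be a hit."""
    return 100.0 - first_timer_pct

def improvement_headroom(lru_hit_ratio_pct, first_timer_pct):
    """Gap (percentage points) between the first-timer bound and
    the hit ratio actually achieved by LRU. Illustrative only."""
    return max(0.0, hit_ratio_upper_bound(first_timer_pct) - lru_hit_ratio_pct)

# Hypothetical interval: LRU hits 40% of requests, 55% are first-timers
headroom = improvement_headroom(lru_hit_ratio_pct=40.0, first_timer_pct=55.0)
print(headroom)
```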

11
Performance Metric: Entropy
  • Taking n distinct objects, each with probability
    p_i of occurrence, the entropy H(X) of a request
    stream is calculated as
        H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i
  • Entropy measures the concentration of popularity
    of a request stream
  • The higher the value of the entropy, the lower
    the concentration of popularity
  • Caching policies should keep objects with a high
    probability of being referenced in the near
    future
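
As an illustration, the entropy of a stream can be estimated from observed request frequencies. The sketch assumes the standard Shannon definition, H(X) = -Σ p_i log2 p_i, with p_i taken as the fraction of requests for object i; the sample streams are made up.

```python
import math
from collections import Counter

def entropy(requests):
    """Shannon entropy (bits) of a request stream, with p_i
    estimated as the relative frequency of each object."""
    counts = Counter(requests)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical streams: equal popularity vs. one dominant object
uniform = ["a", "b", "c", "d"]
skewed = ["a", "a", "a", "a", "a", "a", "b", "c"]
print(entropy(uniform), entropy(skewed))
```

The skewed stream yields the lower entropy, matching the statement that lower entropy means higher concentration of popularity.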

12
Performance Metric: Normalized Entropy
  • Entropy depends on the number n of distinct
    objects
  • Use of the normalized entropy H_N, which divides
    the entropy by its maximum value
        H_N = H(X) / \log_2 n
  • Investigate the influence of popularity on
    caching performance
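
A minimal sketch of a normalized entropy, assuming it is the Shannon entropy divided by its maximum possible value log2(n) for n distinct objects, so that streams with different numbers of objects become comparable on a 0 to 1 scale. The function name and sample streams are illustrative.

```python
import math
from collections import Counter

def normalized_entropy(requests):
    """Entropy of the stream divided by log2(n), where n is the
    number of distinct objects. Returns a value in [0, 1]."""
    counts = Counter(requests)
    n = len(counts)
    if n < 2:
        return 0.0                     # a single object carries no uncertainty
    total = sum(counts.values())
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / math.log2(n)

# Hypothetical streams with different numbers of distinct objects
flat = ["a", "b", "c", "d", "e", "f"]
skewed = ["a", "a", "a", "a", "a", "a", "b", "c"]
print(normalized_entropy(flat), normalized_entropy(skewed))
```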

13
Experiment Setup
  • Real traces from proxy caches located at two
    points of the Web topology
      • Closer to clients
          • Federal University of Minas Gerais (UFMG)
      • Closer to servers
          • National Laboratory for Applied Network
            Research (NLANR)
  • Cache size: 10% of the number of distinct objects
  • Cache replacement policy: simple LRU
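
A setup of this kind (simple LRU, cache sized as a fraction of the distinct objects, a warming trace followed by an evaluation trace) can be sketched as follows. This is an assumption-laden illustration, not the authors' simulator: capacity is counted in objects rather than bytes, and all names and sample data are hypothetical.

```python
from collections import OrderedDict

def capacity_for(requests, percent=10):
    """Cache capacity as a percentage of the distinct objects (at least 1)."""
    return max(1, len(set(requests)) * percent // 100)

def simulate_lru(requests, capacity, warmup=()):
    """Return the hit ratio (percent) of an LRU cache over `requests`.
    `warmup` requests populate the cache but are not counted."""
    cache = OrderedDict()              # least recently used entry sits first

    def access(obj):
        if obj in cache:
            cache.move_to_end(obj)     # mark as most recently used
            return True
        cache[obj] = None
        if len(cache) > capacity:
            cache.popitem(last=False)  # evict the least recently used object
        return False

    for obj in warmup:
        access(obj)
    hits = sum(access(obj) for obj in requests)
    return 100.0 * hits / len(requests) if requests else 0.0

# Hypothetical trace of object identifiers
requests = ["a", "b", "a", "c", "b", "a", "d", "a"]
ratio = simulate_lru(requests, capacity=2)
print(ratio)
```

Passing one trace as `warmup` and another as `requests` mirrors the idea of warming the cache before measuring.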

14
Workload Description
Name                University 1  University 2  NLANR 1     NLANR 2
Start date          01-10-2004    01-12-2004    01-18-2005  01-20-2005
Days                2             10            2           11
Requests            1,004,747     3,459,549     1,207,075   3,427,391
Distinct objects    299,367       623,164       891,906     2,350,215
Normalized entropy  0.8532        0.8268        0.9482      0.9329
  • Traces used
      • Cache warming: University 1, NLANR 1
      • Performance evaluation: University 2, NLANR 2
  • Higher concentration of popularity in the
    university traces (lower entropy)
  • Larger fraction of distinct objects in the NLANR
    traces, which significantly diminishes caching
    performance
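
The fraction of distinct objects can be worked out directly from the request and distinct-object counts in the workload table. Because every distinct object has exactly one first request, this ratio is also the overall share of first-timers in each trace. The dictionary layout below is just a convenient encoding of the table.

```python
# (requests, distinct objects) per trace, copied from the workload table
traces = {
    "University 1": (1_004_747, 299_367),
    "University 2": (3_459_549, 623_164),
    "NLANR 1": (1_207_075, 891_906),
    "NLANR 2": (3_427_391, 2_350_215),
}

# Fraction of requests that refer to an object not seen before in the trace
distinct_fraction = {
    name: distinct / requests for name, (requests, distinct) in traces.items()
}

for name, fraction in distinct_fraction.items():
    print(f"{name}: {100 * fraction:.1f}% of requests are first requests")
```

Roughly 18 to 30 percent of the university requests are first requests, against roughly 69 to 74 percent for NLANR, which bounds the achievable hit ratio much more tightly at the proxy closer to servers.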

15
Experimental Results: Hit Ratio
[Figure: two panels, proxy closer to clients and proxy closer to servers]
  • Higher hit ratio for the university trace
  • Strong variation over time
  • What factors cause the variations in hit ratio?

16
Experimental Results: Percentage of First-Timers
[Figure: two panels, proxy closer to clients and proxy closer to servers]
  • Smaller percentage of first-timers at the proxy
    closer to clients
  • Correlation coefficient between hit ratio and the
    percentage of first-timers
      • -0.857 for NLANR and -0.962 for the university
  • Caching policies cannot satisfy first-timers,
    which are the most important factor behind both
    poor and good performance in the analyzed traces
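
A correlation coefficient of this kind can be computed from two per-interval series. The sketch below assumes the Pearson coefficient (the slides do not name the variant) and uses made-up interval values, not data from the traces.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-interval measurements, in percent
first_timers = [30.0, 45.0, 50.0, 60.0, 70.0]
hit_ratio = [55.0, 42.0, 40.0, 31.0, 22.0]
r = pearson(first_timers, hit_ratio)
print(round(r, 3))
```

A value close to -1 means that intervals with more first-timers almost always have a lower hit ratio.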

17
Experimental Results: Entropy
[Figure: two panels, proxy closer to clients and proxy closer to servers]
  • Proxy closer to clients: lower entropy → higher
    concentration of popularity
  • The LRU policy does not take advantage of all the
    locality of reference
  • Correlation coefficient between hit ratio and
    entropy
      • -0.787 for NLANR and -0.453 for the university
  • If we had a caching policy able to filter all the
    locality (entropy = 1), how much could the hit
    ratio be improved?

18
Experimental Results: Maximum Improvement
[Figure: two panels, proxy closer to clients and proxy closer to servers]
  • The hit ratio cannot be significantly improved
    for the trace closer to clients
  • A high number of first-timers diminishes the hit
    ratio
  • Improving caching performance
      • Reorganization of the hierarchy of caches (cache
        placement)
      • Caching system able to deal with first-timers
19
Conclusions and Future Work
  • Summary of main findings
      • Strong variation of hit ratio over time
      • High number of first-timers (higher close to
        servers)
          • Main cause of low hit ratio
      • The LRU policy is not able to filter the entire
        locality of a stream
          • Small correlation with hit ratio
      • The maximum improvement we could obtain over LRU
          • less than 5 percent closer to clients
          • on average 25 percent closer to servers
      • Results suggest a reorganization of the cache
        topology and a caching system able to deal with
        the higher number of first-timers
  • Future work
      • Cache placement: find the optimal cache
        organization in order to improve overall
        system performance
      • Auto-adaptive cache system able to minimize
        periods of poor performance

20
Questions?
  • Fabricio Benevenuto, Fernando Duarte,
    Virgilio Almeida, Jussara Almeida
  • {fabricio, fernando, virgilio, jussara}@dcc.ufmg.br