Evaluating Content Management Techniques for Web Proxy Caches

1
Evaluating Content Management Techniques for Web Proxy Caches
Martin Arlitt, Ludmila Cherkasova, John Dilley, Rich Friedrich and Tai Jin (Hewlett-Packard Laboratories)
In: 2nd Workshop on Internet Server Performance, in conjunction with ACM SIGMETRICS '99
Cho Joon-ho (CA Lab, CS Department, KAIST), 2001. 11. 6
2
Agenda
  • Problems
  • Quick Tour (Summary)
  • Critique
  • Design and Design Rationale
  • Data Collection and Reduction
  • Key Workload Characteristics
  • Experimental Design
  • Simulation Results
  • Virtual Cache

3
Problems
  • Current Web proxy caches use simple replacement policies
      • Relatively low hit rates
      • Additional delays
  • So what?
      • Develop a quantitative understanding of Web traffic
      • How effective are current proxy cache replacement policies for real workloads?
      • Focus on two performance metrics: hit rate and byte hit rate
  • Design new replacement policies that
      • Use frequency for higher performance
      • Are neither susceptible to cache pollution nor dependent on parameterization
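
The two metrics named above can be made concrete with a small sketch. It assumes a trace in which each request already records its size and whether the proxy served it from cache; the `Request` type and its field names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Request:
    url: str    # object identifier (hypothetical field name)
    size: int   # bytes returned to the client
    hit: bool   # True if the proxy served the object from its cache


def hit_rate(requests):
    """Fraction of requests that were served from the cache."""
    if not requests:
        return 0.0
    return sum(1 for r in requests if r.hit) / len(requests)


def byte_hit_rate(requests):
    """Fraction of requested bytes that were served from the cache."""
    total_bytes = sum(r.size for r in requests)
    if total_bytes == 0:
        return 0.0
    return sum(r.size for r in requests if r.hit) / total_bytes


# Two small hits and one large miss: good hit rate, poor byte hit rate.
trace = [
    Request("/a.html", 1_000, True),
    Request("/b.gif", 1_000, True),
    Request("/c.mpg", 8_000, False),
]
print(hit_rate(trace), byte_hit_rate(trace))
```

The toy trace shows why the two metrics can pull in different directions: caching many small objects raises the hit rate, while the byte hit rate is dominated by the few large objects.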

5
Quick Tour (Summary) 1/3
  • Problems with existing studies
      • They use short-term traces of busy proxies or long-term traces of relatively inactive proxies
      • Long-term traces from busy environments are needed
  • Trace-driven simulation
      • A total of 117,652,652 requests collected over five months
      • Uses a smaller, more compact log
  • Points to consider
      • Object size
      • Recency of reference
      • Frequency of reference
      • Turnover

6
Quick Tour (Summary) 2/3
  • Existing replacement policies
      • LRU (Least Recently Used): replaces the least recently requested object
      • Size: replaces the largest object
      • GD-Size (GreedyDual-Size): replaces the object with the lowest utility
      • LFU (Least Frequently Used): replaces the least frequently used object
  • New replacement policies
      • GDSF (GreedyDual-Size with Frequency): GD-Size + a frequency factor
      • LFU-DA (Least Frequently Used with Dynamic Aging): LFU-Aging + a dynamic aging mechanism (running age L)
  • Virtual caches
      • Logically partition the cache into N virtual caches

GD-Size: $K_i = C_i / S_i + L$
GDSF:    $K_i = F_i \cdot C_i / S_i + L$
LFU-DA:  $K_i = C_i \cdot F_i + L$
7
Quick Tour (Summary) 3/3
Figure: Analysis of Virtual Cache Performance (VC0 using GDSF-Hits, VC1 using LFU-DA)
9
Critique
  • Pros
      • Quantitative understanding of Web traffic
      • Long-term trace-driven simulation on busy proxy servers
      • Two new replacement algorithms that run efficiently
      • A new cache management method, Virtual Cache
  • Cons
      • Not novel
      • No consideration of dynamic data
      • No consideration of the processing overhead of these more complex algorithms
      • Performance improvements are insignificant

11
Data Collection and Reduction
  • Data collection
      • Long-term trace-driven simulation
      • A total of 117,652,652 requests were handled over a five-month period
      • Data include the client IP address, request time, response status, and the time the proxy took to complete its response
  • Data reduction
      • A smaller, more compact log
          • Because of storage constraints
          • To ensure that analyses and simulations could be completed in a reasonable amount of time
      • Reduction by
          • Storing data in a more efficient manner
          • Removing information of little value
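
One way such a reduction could look is sketched below. This is not the authors' log format: the record layout, the field names, and the `Interner` helper are all assumptions made for illustration. The idea is to replace repeated strings (client addresses, URLs) with small integer IDs and pack each entry into a fixed-width binary record, dropping any field the simulator does not need.

```python
import struct

# Hypothetical fixed-width record: timestamp, client ID, object ID,
# HTTP status, response size. "<" means little-endian with no padding.
RECORD = struct.Struct("<IIIHI")


class Interner:
    """Maps each distinct string to a small integer ID."""

    def __init__(self):
        self.ids = {}

    def id_for(self, text):
        # len(self.ids) is evaluated before insertion, so IDs start at 0.
        return self.ids.setdefault(text, len(self.ids))


def reduce_entry(entry, clients, objects):
    """Pack one parsed log entry (a dict) into an 18-byte record."""
    return RECORD.pack(
        entry["time"],
        clients.id_for(entry["client"]),
        objects.id_for(entry["url"]),
        entry["status"],
        entry["size"],
    )


clients, objects = Interner(), Interner()
entry = {"time": 915148800, "client": "10.0.0.7",
         "url": "http://example.com/index.html", "status": 200, "size": 4096}
packed = reduce_entry(entry, clients, objects)
print(len(packed), RECORD.unpack(packed))
```

With a layout like this, every request costs a fixed 18 bytes regardless of URL length, which is the kind of saving that makes a five-month trace practical to replay many times.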

12
Key Workload Characteristics
  • Cacheable objects
      • Most client requests (96%) are for cacheable objects
  • Object set size
      • 389 GB in total
  • Object sizes
      • Variable: median 4 KB, maximum 148 MB (a video clip)
  • Recency of reference
      • One third of all re-references occurred within one hour
  • Frequency of reference
      • Web referencing patterns are non-uniform
  • Turnover
      • Objects that were once popular are no longer requested
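
The recency figure above is the kind of statistic that falls out of a single pass over a trace. The sketch below shows one way to compute it, assuming the trace is a time-ordered list of `(timestamp_seconds, url)` pairs; that representation and the function name are assumptions for illustration.

```python
def fraction_rereferenced_within(trace, window_seconds=3600):
    """Share of re-references that arrive within `window_seconds`
    of the previous reference to the same object."""
    last_seen = {}          # url -> timestamp of its most recent request
    rereferences = 0
    within_window = 0
    for timestamp, url in trace:
        if url in last_seen:
            rereferences += 1
            if timestamp - last_seen[url] <= window_seconds:
                within_window += 1
        last_seen[url] = timestamp
    return within_window / rereferences if rereferences else 0.0


# Three re-references; only the first comes within one hour.
trace = [(0, "/a"), (600, "/b"), (1800, "/a"), (9000, "/b"), (9100, "/a")]
print(fraction_rereferenced_within(trace))
```

A high value for short windows is what makes recency-based policies such as LRU reasonable in the first place.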

13
Experimental Design 1/2
  • Least Recently Used (LRU)
      • Replaces the object requested least recently
      • Considers only a single workload characteristic (recency)
  • Size
      • Replaces the largest object
      • Tries to minimize the miss ratio, favoring hit rate over byte hit rate
      • Susceptible to cache pollution
  • GreedyDual-Size (GD-Size)
      • GD-Size(1) for hit rate
      • GD-Size(Packets) for byte hit rate

$K_i = C_i / S_i + L$
where $C_i$ is the cost associated with bringing object i into the cache, $S_i$ is the object size, and $L$ is a running age factor
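
A minimal sketch of how a GD-Size cache might use the key K_i = C_i / S_i + L is shown below. It is an illustration, not the authors' simulator: the class name, the linear scan for the victim (a real implementation would likely use a priority queue), and the default cost of 1 per object, which corresponds to the hit-rate-oriented variant, are assumptions. Object sizes are assumed to be positive.

```python
class GDSizeCache:
    """Toy GreedyDual-Size cache: evicts the object with the lowest key."""

    def __init__(self, capacity, cost=lambda size: 1.0):
        self.capacity = capacity   # total bytes available
        self.cost = cost           # C_i as a function of object size
        self.used = 0              # bytes currently stored
        self.age = 0.0             # running age factor L
        self.entries = {}          # url -> (key K_i, size S_i)

    def access(self, url, size):
        """Process one request; returns True on a cache hit."""
        if url in self.entries:
            _, size = self.entries[url]   # keep the stored size on a hit
            hit = True
        else:
            hit = False
            if size > self.capacity:
                return False              # object can never fit
            while self.used + size > self.capacity:
                victim = min(self.entries, key=lambda u: self.entries[u][0])
                victim_key, victim_size = self.entries.pop(victim)
                self.age = victim_key     # L rises to the evicted key
                self.used -= victim_size
            self.used += size
        # (Re)compute K_i = C_i / S_i + L on insertion and on every hit.
        self.entries[url] = (self.cost(size) / size + self.age, size)
        return hit


cache = GDSizeCache(capacity=10)
cache.access("a", 2)   # K = 1/2
cache.access("b", 8)   # K = 1/8
cache.access("c", 4)   # evicts "b", the lowest key; L becomes 1/8
print(sorted(cache.entries), cache.age)
```

Because L only ever rises to the key of the last evicted object, objects that are not re-referenced eventually fall below newly inserted ones without any tunable aging parameter.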
14
Experimental Design 2/2
  • LFU
      • Replaces the least frequently used object
      • LFU-Aging = LFU + aging: avoids cache pollution
      • The parameterization problem still remains
  • GreedyDual-Size with Frequency (GDSF)
      • GD-Size does not take frequency into account
  • Least Frequently Used with Dynamic Aging (LFU-DA)
      • LFU-Aging requires parameterization to perform well
      • LFU-DA uses an inflation factor as well as the frequency count

GDSF:   $K_i = F_i \cdot C_i / S_i + L$
LFU-DA: $K_i = C_i \cdot F_i + L$
where $F_i$ is a frequency count and $L$ is a running age factor
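
The difference between the two new keys, K_i = F_i * C_i / S_i + L for GDSF and K_i = C_i * F_i + L for LFU-DA, can be seen by asking each policy which object it would evict. The sketch below is only an illustration: the function names, the dictionary layout, and the cost of 1 per object are assumptions.

```python
def gdsf_key(freq, size, age, cost=1.0):
    """GDSF priority: frequency-weighted cost per byte, plus the age L."""
    return freq * cost / size + age


def lfu_da_key(freq, size, age, cost=1.0):
    """LFU-DA priority: frequency-weighted cost plus the age L.
    Size is accepted for a uniform signature but is not used."""
    return cost * freq + age


def choose_victim(objects, key_fn, age=0.0):
    """Return the name of the object with the lowest priority key."""
    return min(
        objects,
        key=lambda name: key_fn(objects[name]["freq"], objects[name]["size"], age),
    )


objects = {
    "small.html": {"freq": 2, "size": 1_000},
    "big.mpg": {"freq": 3, "size": 100_000},
}
print(choose_victim(objects, gdsf_key))    # the large object
print(choose_victim(objects, lfu_da_key))  # the less frequently used object
```

GDSF divides by size, so it tends to keep many small objects; LFU-DA ignores size and keeps whatever is referenced most often, which can include large objects. In both, the age term L lets a newly inserted object outrank a formerly popular one once L has risen far enough: with L = 6, a new object with F_i = 1 gets an LFU-DA key of 7, above an old object inserted at L = 0 with F_i = 5.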
16
Simulation Results 1/2
17
Simulation Results 2/2
19
Virtual Cache 1/2
  • An approach that can target hit rate and byte hit rate simultaneously
  • Mechanism
      • Logically partitions the cache into N virtual caches
      • Each virtual cache (VC) is managed with its own replacement policy
  • Steps
      • Initially, all objects are added to VC0
      • Replacements from VC(i) are moved to VC(i+1)
      • Replacements from the last virtual cache, VC(N-1), are evicted from the cache
      • When re-accessed, objects are reinserted in VC0
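
The steps above can be sketched as a small class. This is a simplified illustration under stated assumptions, not the authors' simulator: the class and function names are hypothetical, each virtual cache has a fixed byte capacity, each policy is expressed as a priority-key function of (frequency, size, age), and an object too large for one virtual cache is simply passed to the next.

```python
def gdsf(freq, size, age):
    return freq / size + age      # GDSF key with cost C_i = 1


def lfu_da(freq, size, age):
    return freq + age             # LFU-DA key with cost C_i = 1


class VirtualCache:
    def __init__(self, capacities, key_fns):
        self.capacities = capacities                 # bytes per virtual cache
        self.key_fns = key_fns                       # one policy per virtual cache
        self.levels = [{} for _ in capacities]       # url -> (key, size, freq)
        self.ages = [0.0] * len(capacities)          # running age L per level

    def _insert(self, level, url, size, freq):
        if level == len(self.levels):
            return                                   # evicted from the whole cache
        if size > self.capacities[level]:
            self._insert(level + 1, url, size, freq)
            return
        vc = self.levels[level]
        while sum(s for _, s, _ in vc.values()) + size > self.capacities[level]:
            victim = min(vc, key=lambda u: vc[u][0])
            key, vsize, vfreq = vc.pop(victim)
            self.ages[level] = key
            self._insert(level + 1, victim, vsize, vfreq)   # demote to next VC
        vc[url] = (self.key_fns[level](freq, size, self.ages[level]), size, freq)

    def access(self, url, size):
        """Process one request; returns True on a hit in any virtual cache."""
        for vc in self.levels:
            if url in vc:
                _, size, freq = vc.pop(url)
                self._insert(0, url, size, freq + 1)        # reinsert in VC0
                return True
        self._insert(0, url, size, 1)
        return False


cache = VirtualCache([10, 10], [gdsf, lfu_da])
cache.access("a", 6)
cache.access("b", 6)            # "a" is demoted from VC0 to VC1
print(cache.access("a", 6))     # hit in VC1; "a" returns to VC0
```

Putting a hit-rate-oriented policy in VC0 and a byte-hit-rate-oriented policy in VC1, or the reverse, gives the two configurations compared in the figures that follow.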

20
Virtual Cache 2/2
Figure 3. Analysis of Virtual Cache Performance
VC0 using GDSF-Hits, VC1 using LFU-DA
Figure 4. Analysis of Virtual Cache Performance
VC0 using LFU-DA, VC1 using GDSF-Hits