Title: Evaluating Content Management Techniques for Web Proxy Caches
1Evaluating Content Management Techniques for Web
Proxy Caches
Martin Arlitt, Ludmila Cherkasova, John Dilley,
Rich Friedrich and Tai Jin (Hewlett-Packard
Laboratories), in 2nd Workshop on Internet Server
Performance, in conjunction with ACM SIGMETRICS '99
Cho Joon-ho (CA Lab, CS Department, KAIST), 2001. 11. 6
2Agenda
- Problems
- Quick Tour (Summary)
- Critique
- Design and Design Rationale
- Data Collection and Reduction
- Key Workload Characteristics
- Experimental Design
- Simulation Results
- Virtual Cache
3Problems
- Current Web proxy caches use simple replacement policies
  - Relatively low hit rates
  - Additional delays
- So what?
  - Develop a quantitative understanding of Web traffic
  - How effective are current proxy cache replacement policies for real workloads?
- Focus on two performance metrics
  - Hit rate
  - Byte hit rate
- Design new replacement policies
  - Use frequency for higher performance
  - Are neither susceptible to cache pollution nor in need of parameterization
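The two metrics above can be made concrete with a small trace replay. The following is a minimal sketch, assuming a trace of `(url, size)` pairs and a plain LRU cache; the function name `simulate_lru` and the trace layout are illustrative and not taken from the paper.

```python
from collections import OrderedDict

def simulate_lru(trace, capacity_bytes):
    """Replay (url, size) requests through an LRU cache.

    Returns (hit rate, byte hit rate). Assumes a non-empty trace.
    """
    cache = OrderedDict()  # url -> size, least recently used entry first
    used = 0
    hits = hit_bytes = total = total_bytes = 0
    for url, size in trace:
        total += 1
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # object can never fit, so it is not cached
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict the LRU entry
            used -= evicted_size
        cache[url] = size
        used += size
    return hits / total, hit_bytes / total_bytes
```

Hit rate counts every hit equally, while byte hit rate weights each hit by the object size, so a policy that keeps many small objects can score well on the first metric and poorly on the second.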
5Quick Tour (Summary) 1/3
- The problems with existing studies
  - Short-term traces of busy proxies or long-term traces of relatively inactive proxies
  - Long-term traces from busy environments are needed
- Trace-driven simulation
  - A total of 117,652,652 requests collected over five months
  - Uses a smaller, more compact log
- The points to be considered
  - Object size
  - Recency of reference
  - Frequency of reference
  - Turnover
6Quick Tour (Summary) 2/3
- Existing replacement policies
  - LRU (Least-Recently-Used)
  - Size: replaces the largest object
  - GD-Size (GreedyDual-Size): replaces the object with the lowest utility, $K_i = C_i / S_i + L$
  - LFU: replaces the least frequently used object
- New replacement policies
  - GDSF (GreedyDual-Size with Frequency): GD-Size + a frequency factor, $K_i = F_i \cdot C_i / S_i + L$
  - LFU-DA (Least Frequently Used with Dynamic Aging): LFU-Aging + a dynamic mechanism (running age $L$), $K_i = C_i \cdot F_i + L$
- Virtual Caches
  - Logically partition the cache into N virtual caches
7Quick Tour (Summary) 3/3
Figure. Analysis of Virtual Cache Performance: VC0 using GDSF-Hits, VC1 using LFU-DA
9Critique
- Pros
  - Quantitative understanding of Web traffic
  - Long-term trace-driven simulation on busy proxy servers
  - Provides two new replacement algorithms that run efficiently
  - Provides a new cache management method, Virtual Cache
- Cons
  - Not novel
  - No consideration of dynamic data
  - No consideration of the processing overhead of these more complex algorithms
  - Performance improvements are insignificant
11Data Collection and Reduction
- Data collection
  - Long-term trace-driven simulation
  - A total of 117,652,652 requests were handled during a five-month period
  - Data include
    - Client IP address, request time, response status, and the time required for the proxy to complete its response
- Data reduction
  - Smaller, more compact log
    - Due to storage constraints
    - To ensure that analyses and simulations could be completed in a reasonable amount of time
  - Reduction by
    - Storing data in a more efficient manner
    - Removing information of little value
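One way to store log data "in a more efficient manner" is a fixed-width binary record with URLs replaced by small integer IDs. The sketch below is only an illustration of that idea: the field layout, the `RECORD` format string, and the helper names are assumptions, not the format the authors actually used.

```python
import socket
import struct

# Hypothetical layout (network byte order, no padding):
# timestamp, client IPv4, URL id, status, size in bytes, duration in ms
RECORD = struct.Struct("!I4sIHII")

def make_packer():
    """Return a pack function plus the URL-to-id table it fills in."""
    url_ids = {}  # URL string -> integer id, assigned on first sight

    def pack(ts, client_ip, url, status, size, duration_ms):
        url_id = url_ids.setdefault(url, len(url_ids))
        return RECORD.pack(ts, socket.inet_aton(client_ip),
                           url_id, status, size, duration_ms)

    return pack, url_ids

def unpack(record):
    """Decode one record back into a tuple of plain Python values."""
    ts, ip, url_id, status, size, duration_ms = RECORD.unpack(record)
    return ts, socket.inet_ntoa(ip), url_id, status, size, duration_ms
```

Each record in this layout occupies `RECORD.size` bytes (22 here), which is far smaller than a typical text log line and keeps long replays manageable.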
12Key Workload Characteristics
- Cacheable objects
  - Most client requests are for cacheable objects (96%)
- Object set size
  - 389 GB in total
- Object sizes
  - Variable: median 4 KB, maximum 148 MB (a video clip)
- Recency of reference
  - 1/3 of all re-references occurred within one hour
- Frequency of reference
  - Web referencing patterns are non-uniform
- Turnover
  - Objects that were once popular are no longer requested
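A characteristic such as "1/3 of all re-references occurred within one hour" can be measured directly from a time-ordered trace. This is a minimal sketch, assuming entries of the form `(timestamp_seconds, url)`; the function name is hypothetical.

```python
def fraction_rereferenced_within(trace, window_seconds=3600):
    """Fraction of re-references that arrive within window_seconds
    of the previous request for the same URL.

    trace: iterable of (timestamp_seconds, url), sorted by time.
    """
    last_seen = {}  # url -> timestamp of its most recent request
    rerefs = within = 0
    for ts, url in trace:
        if url in last_seen:
            rerefs += 1
            if ts - last_seen[url] <= window_seconds:
                within += 1
        last_seen[url] = ts
    return within / rerefs if rerefs else 0.0
```

A high value for short windows suggests that recency is a useful signal for a replacement policy, which is the motivation for including it among the workload characteristics.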
13Experimental Design 1/2
- Least-Recently-Used (LRU)
  - Replaces the object requested least recently
  - Considers only a single workload characteristic
- Size
  - Replaces the largest object
  - Tries to minimize the miss ratio (targets hit rate rather than byte hit rate)
  - Suffers from cache pollution
- GreedyDual-Size (GD-Size)
  - GD-Size(1) for hit rate
  - GD-Size(Packets) for byte hit rate
  - Key: $K_i = C_i / S_i + L$
    - $C_i$: the cost associated with bringing object $i$ into the cache
    - $S_i$: the object size
    - $L$: a running age factor
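The GD-Size key can be turned into a working cache in a few lines. The sketch below assumes that the object with the smallest key is evicted and that the running age factor L is then set to that key; the class name and the linear-time victim search are simplifications for readability (a priority queue would be used in practice).

```python
class GDSizeCache:
    def __init__(self, capacity_bytes, cost_fn=lambda size: 1.0):
        self.capacity = capacity_bytes
        self.cost_fn = cost_fn  # default cost of 1 per object, as in GD-Size(1)
        self.age = 0.0          # running age factor L
        self.used = 0
        self.entries = {}       # url -> (key K_i, size S_i)

    def access(self, url, size):
        """Return True on a hit, False on a miss (caching the object if it fits)."""
        hit = url in self.entries
        if not hit:
            if size > self.capacity:
                return False
            while self.used + size > self.capacity:
                victim = min(self.entries, key=lambda u: self.entries[u][0])
                victim_key, victim_size = self.entries.pop(victim)
                self.age = victim_key  # L takes the key of the evicted object
                self.used -= victim_size
            self.used += size
        # K_i = C_i / S_i + L, recomputed on every insert and every hit
        self.entries[url] = (self.cost_fn(size) / size + self.age, size)
        return hit
```

With the default cost of 1, small objects receive large keys and are retained, which favors hit rate. Passing a size-dependent `cost_fn` (for example a packet-count estimate, which is an assumption about how GD-Size(Packets) is defined) shifts the balance toward byte hit rate.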
14Experimental Design 2/2
- LFU
  - Replaces the least frequently used object
  - LFU-Aging: LFU + aging, avoids cache pollution
  - The parameterization problem still remains
- GreedyDual-Size with Frequency (GDSF)
  - GD-Size doesn't take frequency into account
  - Key: $K_i = F_i \cdot C_i / S_i + L$
- Least Frequently Used with Dynamic Aging (LFU-DA)
  - LFU-Aging requires parameterization to perform well
  - LFU-DA uses an inflation factor as well as the frequency count
  - Key: $K_i = C_i \cdot F_i + L$
- $F_i$: a frequency count; $L$: a running age factor
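Both new policies differ from their predecessors only in how the key is computed, so one small class can illustrate them side by side. This is a sketch under stated assumptions: a constant cost per object, eviction of the smallest key, and L set to the evicted key; the class and parameter names are illustrative.

```python
class FrequencyAgingCache:
    def __init__(self, capacity_bytes, policy="GDSF", cost=1.0):
        self.capacity = capacity_bytes
        self.policy = policy  # "GDSF" or "LFU-DA"
        self.cost = cost      # C_i, assumed constant for every object
        self.age = 0.0        # running age (inflation) factor L
        self.used = 0
        self.entries = {}     # url -> (key K_i, frequency F_i, size S_i)

    def _key(self, freq, size):
        if self.policy == "GDSF":
            return freq * self.cost / size + self.age  # K_i = F_i * C_i / S_i + L
        return self.cost * freq + self.age             # K_i = C_i * F_i + L

    def access(self, url, size):
        """Return True on a hit, False on a miss (caching the object if it fits)."""
        entry = self.entries.get(url)
        hit = entry is not None
        if hit:
            freq = entry[1] + 1
        else:
            if size > self.capacity:
                return False
            while self.used + size > self.capacity:
                victim = min(self.entries, key=lambda u: self.entries[u][0])
                victim_key, _, victim_size = self.entries.pop(victim)
                self.age = victim_key  # newer objects start from the inflated age
                self.used -= victim_size
            self.used += size
            freq = 1
        self.entries[url] = (self._key(freq, size), freq, size)
        return hit
```

Because L only grows, a formerly popular object that stops being requested is eventually overtaken by newly inserted objects and evicted, which is how dynamic aging limits cache pollution without a tuning parameter.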
16Simulation Results 1/2
17Simulation Results 2/2
19Virtual Cache 1/2
- An approach that can focus on both hit rate and byte hit rate simultaneously
- Mechanism
  - Logically partitions the cache into N virtual caches
  - Each virtual cache (VC) is managed with its own replacement policy
- Steps
  - Initially, all objects are in VC0
  - Replacements from VCi are moved to VCi+1
  - Replacements from the last virtual cache are evicted from the cache
  - When re-accessed, objects are reinserted in VC0
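The steps above can be sketched as a chain of partitions, each with its own key function. This is a simplified illustration rather than the authors' implementation: the class name, the per-partition byte capacities, and the `(freq, size, age)` key-function signature are all assumptions.

```python
class VirtualCache:
    def __init__(self, capacities, key_fns):
        self.capacities = capacities                # bytes available to each VC
        self.key_fns = key_fns                      # per VC: (freq, size, age) -> key
        self.ages = [0.0] * len(capacities)         # running age factor L per VC
        self.parts = [dict() for _ in capacities]   # url -> (key, freq, size)

    def _insert(self, level, url, freq, size):
        if level == len(self.parts) or size > self.capacities[level]:
            return  # past the last VC (or too large to fit): leaves the cache
        part = self.parts[level]
        while sum(s for _, _, s in part.values()) + size > self.capacities[level]:
            victim = min(part, key=lambda u: part[u][0])
            v_key, v_freq, v_size = part.pop(victim)
            self.ages[level] = v_key
            self._insert(level + 1, victim, v_freq, v_size)  # demote to the next VC
        part[url] = (self.key_fns[level](freq, size, self.ages[level]), freq, size)

    def access(self, url, size):
        """Return True on a hit in any VC, False on a miss."""
        for part in self.parts:
            if url in part:
                _, freq, _ = part.pop(url)
                self._insert(0, url, freq + 1, size)  # re-accessed: back into VC0
                return True
        self._insert(0, url, 1, size)
        return False

# Example key functions with a constant cost of 1 (an assumption)
gdsf_hits = lambda freq, size, age: freq / size + age
lfu_da = lambda freq, size, age: freq + age
```

A two-level configuration such as `VirtualCache([20, 20], [gdsf_hits, lfu_da])` mirrors the "VC0 using GDSF-Hits, VC1 using LFU-DA" arrangement: objects dropped by the hit-rate-oriented policy get a second chance under the byte-hit-rate-oriented one.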
20Virtual Cache 2/2
Figure 3. Analysis of Virtual Cache Performance: VC0 using GDSF-Hits, VC1 using LFU-DA
Figure 4. Analysis of Virtual Cache Performance: VC0 using LFU-DA, VC1 using GDSF-Hits