Title: Web Cache Replacements
1 Web Cache Replacements
- ykchang_at_mail.ncku.edu.tw
2 Introduction
- Which page should be removed from the cache?
- Finding a replacement algorithm that can yield a high hit rate.
- Differences from traditional caching
  - nonhomogeneity of the object sizes
  - with objects of the same access frequency but different sizes, considering only the hit rate favors smaller objects
  - byte hit rate
3 Introduction
- Other considerations
  - transfer time cost
  - expiration time
  - frequency
- Measurement metrics?
- Admission control?
- When or how often to perform the replacement operations?
- How many documents to remove?
4 Measurement Metrics
- Hit Rate (HR)
  - requests satisfied by the cache
  - (shows the fraction of requests not sent to the server)
- Volume measures
  - Weighted Hit Rate (WHR) / Byte Hit Ratio: client-requested bytes returned by the proxy (shows the fraction of bytes not sent by the server)
  - fraction of packets not sent
  - reduction in distance traveled (e.g., hop count)
- Latency / time
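To make the metric definitions concrete, here is a minimal Python sketch computing HR and the byte hit ratio from a request log; the trace layout is an assumption for illustration, not from the slides.

    def hit_rates(trace):
        """trace: list of (url, size_bytes, hit) tuples (assumed layout)."""
        requests = len(trace)
        total_bytes = sum(size for _, size, _ in trace)
        hits = sum(1 for _, _, hit in trace if hit)
        hit_bytes = sum(size for _, size, hit in trace if hit)
        # HR: fraction of requests not sent to the server.
        # Byte hit ratio: fraction of bytes not sent by the server.
        return hits / requests, hit_bytes / total_bytes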
5 Three Categories
- Traditional replacement policies and their direct extensions
  - LRU, LFU, ...
- Key-based replacement policies
- Cost-based replacement policies
6 Traditional replacement
- Least Recently Used (LRU) evicts the object that was requested least recently.
  - Prune off as many of the least recently used objects as is necessary to have sufficient space for the newly accessed object.
  - This may involve zero, one, or many replacements.
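A minimal runnable sketch of LRU for variable-size objects; the OrderedDict-based structure is an illustrative implementation choice, not from the slides.

    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity_bytes):
            self.capacity = capacity_bytes
            self.used = 0
            self.objects = OrderedDict()      # url -> size, oldest first

        def access(self, url, size):
            if url in self.objects:
                self.objects.move_to_end(url) # mark as most recently used
                return True                   # hit
            # Evict zero, one, or many LRU objects until the new one fits.
            while self.used + size > self.capacity and self.objects:
                _, old_size = self.objects.popitem(last=False)
                self.used -= old_size
            if size <= self.capacity:
                self.objects[url] = size
                self.used += size
            return False                      # miss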
7 Traditional replacement
- Least Frequently Used (LFU) evicts the object that is accessed least frequently.
- Pitkow/Recker evicts objects in LRU order, except if all objects were accessed within the same day, in which case the largest one is removed.
8 Key-based Replacement
- The idea in key-based policies is to sort objects
based upon a primary key, break ties based on a
secondary key, break remaining ties based on a
tertiary key, and so on.
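For illustration, this sort-by-keys idea in Python; the particular keys used here (size, last access time, frequency) are assumed choices, not prescribed by the slide.

    def removal_order(objects):
        """objects: list of dicts with 'size', 'last_access', 'freq'."""
        # Primary: largest size first; secondary: oldest access first;
        # tertiary: lowest frequency first.
        return sorted(objects,
                      key=lambda o: (-o['size'], o['last_access'], o['freq']))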
9 Key-based Replacement
- LRUMIN
  - This policy is biased in favor of smaller objects so as to minimize the number of objects replaced.
  - Let the size of the incoming object be S, and suppose that this object will not fit in the cache.
  - If there are any objects in the cache of size at least S, we remove the least recently used such object from the cache.
  - If there are no objects of size at least S, then we start removing objects in LRU order of size at least S/2, then objects of size at least S/4, and so on until enough free cache space has been created.
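A sketch of the LRUMIN eviction loop under these rules, assuming `cache` is an OrderedDict mapping url to size, kept in LRU order (oldest first).

    def lrumin_evict(cache, used, capacity, incoming_size):
        """Free room for an object of incoming_size; returns updated used."""
        threshold = incoming_size
        while used + incoming_size > capacity and cache:
            # Evict, in LRU order, objects of size at least `threshold`.
            victim = next((u for u, s in cache.items() if s >= threshold), None)
            if victim is not None:
                used -= cache.pop(victim)
            else:
                threshold //= 2   # none that large: try S/2, then S/4, ...
        return used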
10 Key-based Replacement
- SIZE policy
  - In this policy, the objects are removed in order of size, with the largest object removed first.
  - Ties on size are somewhat rare, but when they occur they are broken by the time since last access: objects with a longer time since last access are removed first.
11 Key-based Replacement
- LRU-Threshold is the same as LRU, but objects larger than a certain threshold size are never cached.
- Hyper-G is a refinement of LFU that breaks ties using the recency of last use and the size.
- Lowest Latency First minimizes average latency by evicting the document with the lowest download latency first.
12 Cost-based Replacement
- Employ a potential cost function derived from different factors, such as
  - time since last access,
  - entry time of the object in the cache,
  - transfer time cost,
  - object expiration time, and so on.
- GreedyDual-Size (GD-Size) associates a cost with each object and evicts the object with the lowest cost/size.
- Hybrid associates a utility function with each object and evicts the one with the least utility, so as to reduce the total latency.
13 Cost-based Replacement
- Lowest Relative Value evicts the object with the lowest utility value.
- Least Normalized Cost Replacement (LCN-R) employs a rational function of the access frequency, the transfer time cost, and the size.
- Bolot/Hoschka employs a weighted rational function of the transfer time cost, the size, and the time since last access.
14 Cost-based Replacement
- Size-Adjusted LRU (SLRU) orders objects by the ratio of cost to size and chooses the objects with the best cost-to-size ratio.
- The server-assisted scheme models the value of caching an object in terms of its fetching cost, size, next request time, and cache prices during the time period between requests. It evicts the object of least value.
- Hierarchical GreedyDual (Hierarchical GD) performs object placement and replacement cooperatively in a hierarchy.
15 GreedyDual
- GreedyDual was originally proposed by Young and Tarjan; it addresses the case where pages in a cache have the same size but incur different costs to fetch from secondary storage.
- A value H is associated with each cached page p when the page is brought into the cache.
- H is set to the cost of bringing p into the cache; the cost is always nonnegative.
- (1) The page with the lowest H value (minH) is replaced, and (2) all remaining pages then reduce their H values by minH.
16 GreedyDual
- If a page is accessed, its H value is restored to the cost of bringing it into the cache.
- Thus the H values of recently accessed pages retain a larger portion of the original cost than those of pages that have not been accessed for a long time.
- By reducing the H values as time goes on and restoring them upon access, GreedyDual integrates the locality and cost concerns in a seamless fashion.
17 GreedyDual-Size
- Set H to cost/size upon access to a document, where cost is the cost of bringing in the document and size is the size of the document in bytes.
- We call this extended version GreedyDual-Size.
- The definition of cost depends on the goal of the replacement algorithm; cost is set to
  - 1 if the goal is to maximize the hit ratio,
  - the downloading latency if the goal is to minimize the average latency,
  - the network cost if the goal is to minimize the total cost.
18 GreedyDual-Size
- Implementation
  - A naive implementation needs to decrement all pages in the cache by min H(q) every time a page q is replaced, which may be very inefficient.
  - The improved algorithm on the next slide avoids this by keeping an inflation value L.
  - Maintain a priority queue keyed on H.
  - Handling a hit requires O(log k) time, and handling an eviction requires O(log k) time, since in both cases the queue needs updating.
19 GreedyDual-Size
- Algorithm GreedyDual-Size(document p)
  - /* Initialize L ← 0 */
  - If p is already in memory,
    - H(p) ← L + cost(p)/size(p)
  - If p is not in memory,
    - while there is not enough room in memory for p,
      - let L ← min H(q) over all q in cache
      - evict q such that H(q) = L
    - put p into memory; set H(p) ← L + cost(p)/size(p)
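A runnable sketch of this inflation-value implementation; the lazy-deletion heap is one possible way to realize the priority queue and is an implementation assumption, not part of the slide.

    import heapq

    class GreedyDualSize:
        def __init__(self, capacity):
            self.capacity = capacity
            self.used = 0
            self.L = 0.0
            self.H = {}      # url -> current H value
            self.size = {}   # url -> size in bytes
            self.heap = []   # (H, url) entries; may hold stale ones

        def access(self, url, size, cost):
            hit = url in self.H
            if not hit:
                # Evict lowest-H documents until the new one fits.
                while self.used + size > self.capacity and self.H:
                    h, victim = heapq.heappop(self.heap)
                    if victim in self.H and self.H[victim] == h:  # skip stale
                        self.L = h                   # L <- min H(q) in cache
                        self.used -= self.size.pop(victim)
                        del self.H[victim]
                if size > self.capacity:             # cannot fit at all
                    return hit
                self.size[url] = size
                self.used += size
            self.H[url] = self.L + cost / size       # H(p) <- L + cost/size
            heapq.heappush(self.heap, (self.H[url], url))
            return hit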
20 Hybrid Algorithm (HYB)
- Motivated by Bolot and Hoschka's algorithm.
- HYB is a hybrid of several factors, considering not only download time but also the number of references to a document and the document size.
- HYB selects for replacement the document i with the lowest value of the utility expression defined on the next slide.
21 HYB
- The utility function is defined as follows; the document with the lowest value is evicted:
  - U(p) = (Cs + Wb/bs) * np^Wn / Zp
- Cs is the estimated time to connect to the server
- bs is the estimated bandwidth to the server
- Zp is the size of the document
- np is the number of times the document has been referenced
- Wb and Wn are constants that set the relative importance of the variables bs and np, respectively
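As a sketch, the value function can be computed directly from these variables; the weight defaults below are illustrative assumptions, not values taken from the slides.

    def hyb_value(c_s, b_s, z_p, n_p, w_b=8 * 1024, w_n=0.9):
        # c_s: connect-time estimate, b_s: bandwidth estimate,
        # z_p: document size, n_p: reference count.
        # w_b (bytes) and w_n are tunable weights (assumed values here).
        return (c_s + w_b / b_s) * (n_p ** w_n) / z_p  # evict the smallest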
22 Latency Estimation Algorithm (LAT)
- Motivated by estimating the time required to download a document, and then replacing the document with the smallest download time.
- Apply some function to combine (e.g., smooth) these time samples to form an estimate of how long it will take to download the document.
- Keeping a per-document estimate is probably not practical.
- Alternative: keep statistics of past downloads on a per-server basis, rather than a per-document basis (less storage).
- For each server j, the proxy maintains
  - clatj: estimated latency (time) to open a connection to the server
  - cbwj: estimated bandwidth of the connection (in bytes/second)
23 Latency Estimation Algorithm (LAT)
- When a new document is received from server j, the connection establishment latency (sclat) and the bandwidth for that document (scbw) are measured, and the estimates are updated as follows:
  - clatj = (1 - ALPHA) * clatj + ALPHA * sclat
  - cbwj = (1 - ALPHA) * cbwj + ALPHA * scbw
- ALPHA is a smoothing constant, set to 1/8 as in the TCP smoothed estimation of RTT.
- Let ser(i) denote the server on which document i resides, and si the document size. The cache replacement algorithm LAT selects for replacement the document i with the smallest download time estimate, denoted di:
  - di = clatser(i) + si / cbwser(i)
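A minimal sketch of the per-server estimators and the resulting download-time estimate; the dictionary layout is assumed for illustration.

    ALPHA = 1 / 8   # smoothing constant, as on the slide

    def update_estimates(stats, server, sclat, scbw):
        """stats: dict mapping server -> (clat, cbw)."""
        clat, cbw = stats.get(server, (sclat, scbw))
        stats[server] = ((1 - ALPHA) * clat + ALPHA * sclat,   # clat_j
                         (1 - ALPHA) * cbw + ALPHA * scbw)     # cbw_j

    def download_time_estimate(stats, server, size):
        clat, cbw = stats[server]
        return clat + size / cbw   # d_i = clat_ser(i) + s_i / cbw_ser(i)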
24 Latency Estimation Algorithm (LAT)
- One detail remains:
  - a proxy runs at the application layer of the network protocol stack, and therefore cannot obtain the connection latency samples sclat directly.
- Therefore the following heuristic is used to estimate the connection latency. A constant CONN is chosen (e.g., 2 Kbytes). The download time of every document the proxy receives whose size is less than CONN is used as an estimate of the connection latency sclat.
- Every document whose size exceeds CONN is used as a bandwidth sample as follows:
  - scbw = si / (download time of the document - current value of clatj)
25 Lowest Relative Value (LRV)
- time from the last access, t: included for its large influence on the probability of a new access
  - the probability of a new access conditioned on the time from the last access can be expressed as (1 - D(t))
- number of previous accesses, i: this parameter allows the proxy to select a relatively small number of documents with a much higher probability of being accessed again
- document size, s: this seems to be the most effective parameter for making a selection among documents with only one access
26 Distribution of interaccess times, D(t) (figure)
27 Probability density function of interaccess times, d(t) (figure)
28 Lowest Relative Value (LRV)
- We compute the probability that a document is accessed again, Pr(i, t, s), as follows:
  - Pr(i, t, s) = P1(s)(1 - D(t)) if i = 1
  - Pr(i, t, s) = Pi(1 - D(t)) otherwise
- Pi: the conditional probability that a document is referenced i+1 times given that it has been accessed i times
- P1(s): the percentage of documents of size s with at least 2 accesses
- D(t): the distribution of times between consecutive requests to the same document, derived as D(t) = 0.035 log(t+1) + 0.45(1 - e^(…))
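A sketch of this probability computation, with D(t), P1(s), and Pi left as pluggable inputs, since the fitted formula above is incomplete here.

    def pr_access_again(i, t, s, D, P1, P):
        """i: number of past accesses; t: time since last access; s: size.
        D(t): interaccess-time distribution; P1(s), P[i]: as defined above."""
        if i == 1:
            return P1(s) * (1 - D(t))
        return P[i] * (1 - D(t))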
31 Performance from Pei Cao
- Uses hit ratio, byte hit ratio, reduced latency, and reduced hops.
  - reduced latency: the sum of the downloading latencies for the pages that hit in cache, as a percentage of the sum of all downloading latencies
  - reduced hops: the sum of the network costs for the pages that hit in cache, as a percentage of the sum of the network costs of all Web pages
- Models the network cost of each document as hops.
  - Each Web server has hop value 1 or 32; 1/8 of the servers are assigned hop value 32 and 7/8 hop value 1.
  - The hop value can be thought of either as the number of network hops traveled by a document or as the monetary cost associated with the document.
32 Performance from Pei Cao
- GD-Size(1) sets the cost of each document to 1, thus trying to maximize the hit ratio.
- GD-Size(packets) sets the cost of each document to 2 + size/536, i.e., the estimated number of network packets sent and received if a miss to the document happens:
  - 1 packet for the request, 1 packet for the reply, and size/536 for the extra data packets, assuming a 536-byte TCP segment size.
  - It tries to maximize both the hit ratio and the byte hit ratio.
- Finally, GD-Size(hops) sets the cost of each document to the hop value of the document, trying to minimize network costs.
34 Weighted Hit Rate
- Results on the best primary key are inconclusive.
  - Most references are to small files, but most bytes are from large files.
- Why size?
  - Most accesses are for smaller documents.
  - A few large documents take the space of many small documents.
  - Concentration of large inter-reference times.
35 Exp. 3 Partitioning Cache by Media
- Idea
  - Do clients that listen to music degrade the performance of clients using text and graphics?
  - Could a partitioned cache, with one portion dedicated to audio and the other to non-audio documents, increase the WHR experienced by either audio or non-audio documents?
- Simulation
  - cache size: 10% of the maximum needed
  - two partitions: audio and non-audio
36 Exp. 4 Partitioning Cache by Media
- In Experiment 4:
  - a one-level cache with SIZE as the primary key
  - random as the secondary key
  - three partition sizes: dedicate 1/4, 1/2, or 3/4 of the cache to audio; the rest is dedicated to non-audio documents
39 Problems to solve
- Certain sorting keys have intuitive appeal.
- The first is document type. A sorting key that puts text documents at the front of the removal queue would ensure low latency for text in Web pages, at the expense of latency for other document types.
- The second sorting key is refetch latency. To a user of international documents, the most obvious caching criterion is one that caches documents so as to minimize overall latency.
  - A European user of North American documents would preferentially cache those documents over ones from other European servers, to avoid using heavily utilized transatlantic network links. Therefore a means of estimating the latency of refetching the documents in a cache could be used as a primary sorting key.
40 Problems to solve
- Caching dynamic documents: a cache is only useless for dynamic documents if the document content changes completely; otherwise a portion, but not all, of the cached copy remains valid.
- Allow caches to request the differences between the cached version and the latest version of a document.
41 Problems to solve
- For example, in response to a conditional GET a server could send the "diff" of the current version and the version matching the Last-Modified date sent by the client, or a specific tag could allow a server to fill in a previously cached static query response form.
- Another approach to changing semi-static pages (i.e., pages that are HTML but replaced often) is to allow Web servers to preemptively update inconsistent document copies, at least for the most popular ones.
42 Randomized Strategies
- These strategies use randomized decisions to find
an object for replacement.
43 Randomized Strategies
- 1. RAND
  - This strategy removes a random object.
- 2. HARMONIC [Hosseini-Khayat 1997]
  - Whereas RAND uses an equal probability for each object, HARMONIC removes one item from the cache at random with a probability inversely proportional to its specific cost ci/si.
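A sketch of the HARMONIC draw, assuming eviction weights proportional to 1/(ci/si) = si/ci.

    import random

    def harmonic_evict(objects):
        """objects: list of (url, cost, size) tuples."""
        weights = [size / cost for _, cost, size in objects]
        (url, _, _), = random.choices(objects, weights=weights)
        return url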
44 Randomized Strategies
- 3. LRU-C and LRU-S [Starobinski and Tse 2001]
  - LRU-C is a randomized version of LRU.
  - Let cmax = max{c1, ..., cN} be the maximum of the access costs of all N objects of a request sequence.
  - Let c'i = ci/cmax be the normalized cost of object i. When object i is requested, it is moved to the head of the cache with probability c'i; otherwise, nothing is done.
45 Randomized Strategies
- LRU-S uses the size instead of the cost. Let smin = min{s1, ..., sN} be the size of the smallest object among the N documents, and let di = smin/si be the normalized density of object i.
- LRU-S acts as LRU with probability di; otherwise the cache state is left unmodified.
- Furthermore, Starobinski and Tse [2001] proposed an algorithm which deals with both varying-size and varying-cost objects. Quantities combining cost and size are defined; upon a request for object i, the algorithm performs the same operation as LRU with a probability derived from these quantities, and otherwise leaves the cache state unmodified (the promotion rule is sketched below).
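A sketch of the probabilistic promotion rule shared by LRU-C and LRU-S; for LRU-S, substitute di = smin/si as the probability.

    import random
    from collections import OrderedDict

    def lru_c_access(cache: OrderedDict, url, cost, c_max):
        """Promote url with probability cost/c_max; otherwise do nothing."""
        if url in cache and random.random() < cost / c_max:
            cache.move_to_end(url)   # move to the head of the recency list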
46 Randomized Strategies
- 4. Randomized replacement with general value functions [Psounis and Prabhakar 2001]
  - This strategy draws N objects randomly from the cache and evicts the least useful object in the sample. The usefulness of a document can be determined by any utility function. After replacing the least useful object, the next M (M < N) least useful objects are retained in memory.
  - At the next replacement, N - M new samples are drawn from the cache, and the least useful of these N - M samples and the M previously retained objects is evicted. The M least useful of the remaining objects are stored in memory, and so on.
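A sketch of this sampling scheme; the sample sizes N and M below are arbitrary illustrative values, and `useful` is any utility function.

    import random

    def sampled_evict(cache_keys, useful, retained, N=30, M=10):
        """retained: list of keys kept from the previous round."""
        fresh = random.sample(cache_keys, N - len(retained))  # new samples
        sample = sorted(fresh + retained, key=useful)         # least useful first
        victim = sample[0]
        retained[:] = sample[1:M + 1]   # keep next M least useful in memory
        return victim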
47 Randomized Strategies Summary
- 1. Randomization presents a different approach to cache replacement.
- 2. Randomized strategies try to reduce the complexity of the replacement process without sacrificing too much quality.
48 Admission control
- Should we store the response in the cache or not?
- On the first access, do not store it.
49 Admission control
- Heuristic for making this decision: the objects accessed most frequently in the recent past will most likely be accessed again. The words "frequently" and "recently" imply that the access frequency of objects, and a decay function applied to that frequency, are needed.
- An extra space called the URL cache is introduced in memory to store the URLs and the associated access frequencies of the requested objects.
50 Admission control
- If the requested object is cacheable, the process of storing the object in the disk cache is delayed until the same object is accessed again. (In other words, cacheable objects are not stored in the disk cache unless they have been accessed before.)
- Since the access stream is infinite, the size of the URL cache must be limited. A replacement policy is also needed in the URL cache.
51 Admission control operations
- Cache hits
  - The operations are similar to the original algorithm.
  - In addition to unused non-cacheable objects and hot objects in the memory cache, cacheable objects without disk copies are also candidates for replacement in the memory cache.
  - Consider the case where a copy of the requested object exists in the memory cache but not in the disk cache: the reference count associated with the requested object in the memory cache is incremented by one, and the data is then stored in the disk cache.
  - If an object evicted from the memory cache is cacheable, its URL along with its reference count is then stored in the URL cache.
52 Admission control operations
- Cache misses for cacheable objects
  - If the requested object is cacheable, the caching algorithm checks:
  - (1) If its URL is not stored in the URL cache:
    - Replacement operations are performed to allocate enough space for holding the requested object.
    - The URL of the replaced object is then stored in the URL cache along with its reference count.
    - The replacement operations in the URL cache must be performed; the URLs evicted from the URL cache are released.
    - The requested object itself is not stored in the disk cache at this moment; thus, no replacement in the disk cache is needed.
53 Admission control operations
- Cache misses for cacheable objects
  - (2) If the URL of the requested object is stored in the URL cache:
    - Its associated record in the URL cache is removed, the requested object is stored in the disk cache, and its reference count is set to one.
    - Similarly, the replacement operations in the disk cache must be performed. The URLs of the objects evicted from the disk cache are stored in the URL cache, and again the replacement operations in the URL cache are performed.
54 Admission control operations
- Cache misses for non-cacheable objects
  - For a cache miss, if the object is non-cacheable, the operations are similar to the original algorithm. If the object evicted from the memory cache is cacheable and does not exist in the disk cache, its URL along with its reference count is stored in the URL cache.
- Notice that the proposed approach may lose some possible hits in the disk cache when objects are accessed for the second time. However, it removes all the disk activity in which the disk cache stores objects that will never be accessed again before being evicted.
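Putting the admission rule together, a sketch of the cacheable-miss path; `disk_cache.store` and `fetch` are hypothetical interfaces used only for illustration.

    def on_cacheable_miss(url, url_cache, disk_cache, fetch):
        obj = fetch(url)                  # hypothetical origin-server fetch
        if url in url_cache:              # second access: admit to disk cache
            del url_cache[url]
            disk_cache.store(url, obj, refcount=1)  # may trigger replacement
        else:                             # first access: remember the URL only
            url_cache[url] = 1            # URL plus its access frequency
        return obj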
55 Admission control
- Efficient management of the URL cache
  - A separate hash table, similar to that in the memory/disk cache, is used in the URL cache to support efficient search for the URL of a requested object.
  - The MD5 hash of the URL is employed as the search key.
  - We employ a replacement policy based on the URL access frequency: the least frequently accessed entry in the URL cache is selected for replacement first.
  - A priority queue with the access frequency as the key is a suitable implementation for such a replacement policy.
56 Admission control
- Efficient management of the URL cache
  - Each entry of the URL cache records the MD5 of the URL, the access frequency, and a few pointers for the priority queue and hash table data structures.
  - The required memory space for each entry in the URL cache is constant.
  - The size of the hash table and priority queue themselves is small and does not depend on the number of entries hashed, and thus can be ignored.
  - Based on the size of the UC trace studied in this paper, keeping all the URLs of the requests from a one-day period in the URL cache is reasonable. This accounts for 400K URLs; therefore, assuming 80 bytes per entry, 32 MB of memory space is needed for the URL cache.
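A sketch of such a URL cache, with an MD5-keyed table and a frequency-keyed heap; the lazy-deletion bookkeeping is an implementation assumption.

    import hashlib
    import heapq

    class URLCache:
        def __init__(self, max_entries=400_000):
            self.max_entries = max_entries
            self.freq = {}   # md5 digest -> access frequency (hash table)
            self.heap = []   # (frequency, md5); may hold stale entries

        def record(self, url):
            key = hashlib.md5(url.encode()).hexdigest()
            self.freq[key] = self.freq.get(key, 0) + 1
            heapq.heappush(self.heap, (self.freq[key], key))
            while len(self.freq) > self.max_entries:  # evict least frequent
                f, victim = heapq.heappop(self.heap)
                if self.freq.get(victim) == f:        # skip stale entries
                    del self.freq[victim]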
57 Hit ratio h(S)
- (figure: hit ratio HR versus cache size S for the CHU trace, comparing the curves h(S) and heff(S); y-axis from 0.55 to 0.7, x-axis from 1 to 32)
58 Removal frequency
- On-demand: run the policy when the size of the requested document exceeds the free room in the cache (the removal takes time).
- Periodically: run the policy every T time units, for some T.
- If removal is time consuming:
  - Both: run the policy periodically at the end of each day, and on demand (Pitkow/Recker [13]).
59 On-demand
- Two arguments suggest that the overhead of simply using on-demand replacement will not be significant.
- First, this class of removal policies maintains a sorted list. If the list is kept sorted as the proxy operates, then the removal policy merely removes the head of the list, which should be a fast, constant-time operation.
- Second, a proxy server keeps read-only documents. Thus there is no overhead for "writing back" a document, as there is in a virtual memory system upon removal of a page that was modified since being loaded.
60 How many to remove
- The removal process is stopped when the free cache area equals or exceeds the requested document size.
- Alternatively, replace documents until a certain threshold (Pitkow and Recker's "comfort level") is reached; both stopping rules are sketched below.
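A sketch contrasting the two stopping rules; `comfort_fraction` and the OrderedDict-style cache (url -> size, in removal order) are illustrative assumptions.

    def make_room(cache, used, capacity, need, comfort_fraction=None):
        target = capacity - need              # stop as soon as the request fits
        if comfort_fraction is not None:      # Pitkow/Recker-style comfort level
            target = min(target, capacity * comfort_fraction)
        while used > target and cache:
            _, size = cache.popitem(last=False)  # evict head of the removal list
            used -= size
        return used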