Web Prefetching: Costs, Benefits and Performance

1
Web Prefetching: Costs, Benefits and Performance

Yingyin Jiang, Min-You Wu, Wei Shu
Dept. of Electrical & Computer Engineering
The University of New Mexico
Aug 15, 2002
WCW 2002, Boulder, Colorado

2
Talk Focus

A solution space of web prefetching
Several object-selection criteria
New simple selection algorithm
Select a good set of objects
Maximize benefit/cost
Can be tuned to achieve different goals
New performance metric
Balances benefits (hit ratio improvement)
against costs (network bandwidth increase)

3
Outline

Motivation
Our approach
Performance Evaluation
Discussions and Conclusions

4
Motivation for Prefetching

Limited hit ratio by passive caching
Typical -- 20% to 40%
Limited by newly introduced, dynamically
generated data and rapid changes of objects in
the web
Prefetching can further improve hit ratio (reduce
client latency) but sacrifice network bandwidth
Predict future accesses to objects
Fetch objects before users request them

5
Key Parameters for Prefetching Algorithms

Object popularity
Zipf popularity distribution
$p_i = C / i^{\alpha}$
The probability of a request for the i-th most
popular document is inversely proportional to $i^{\alpha}$
Object lifetime
Indication of object modification
Key factor for the design of prefetching
algorithm
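The Zipf popularity model on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `zipf_probabilities` and the 1,000-object catalogue are assumptions, and C is taken to be the constant that makes the probabilities sum to 1. The exponent 0.986 is the value quoted later in the talk.

```python
def zipf_probabilities(num_objects: int, alpha: float) -> list[float]:
    """Return p_i = C / i**alpha for i = 1..num_objects.

    C is chosen so that the probabilities sum to 1 (an assumption about
    how the slide's constant C is defined).
    """
    weights = [1.0 / (i ** alpha) for i in range(1, num_objects + 1)]
    c = 1.0 / sum(weights)  # normalizing constant C
    return [c * w for w in weights]


# Hypothetical small catalogue; the talk itself uses one million objects.
probs = zipf_probabilities(num_objects=1000, alpha=0.986)
print(f"p_1 = {probs[0]:.4f}, p_10 = {probs[9]:.4f}")
```

With an exponent this close to 1, the most popular object is requested roughly ten times as often as the tenth most popular one.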

6
Solution Space for Web Prefetching

Six models
Two extreme cases
Passive caches (non-prefetching)
Prefetching all objects
The other four algorithms use different
object-selecting criteria and fetch objects with
values that exceed the threshold
Popularity
Lifetime
Good Fetch
APL

7
Two Simple Schemes

Popularity
Keep the most popular objects in the system
Update these objects immediately whenever they
are modified
Threshold: the object's popularity
Lifetime
Keep objects with longest lifetimes
Mostly consider the network resource demands
Threshold: the expected lifetime of the object

8
Good Fetch

Proposed by Venkataramani in [Venkataramani01]
Attempt to balance the benefit against the cost
of keeping an object
Threshold: the probability that a prefetched object
is accessed before it changes
Prefetch object i if $1 - (1 - p_i)^{a l_i} >$ threshold, where
$l_i$: object i's expected lifetime
$a$: avg. request arrival rate
$p_i$: object i's popularity
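The Good Fetch test can be sketched as follows. The slide's formula did not survive in this transcript, so the code assumes the criterion is the probability that at least one of the $a \cdot l_i$ requests arriving during the object's lifetime is for object i, namely $1 - (1 - p_i)^{a l_i}$. The function names and the sample numbers are hypothetical.

```python
def good_fetch_probability(p_i: float, arrival_rate: float, lifetime: float) -> float:
    """Assumed Good Fetch value: chance that object i is requested at least
    once among the arrival_rate * lifetime requests seen before it changes."""
    return 1.0 - (1.0 - p_i) ** (arrival_rate * lifetime)


def should_prefetch(p_i: float, arrival_rate: float, lifetime: float,
                    threshold: float) -> bool:
    """Prefetch when the assumed Good Fetch probability exceeds the threshold."""
    return good_fetch_probability(p_i, arrival_rate, lifetime) > threshold


# Hypothetical numbers: 10 requests/s overall, objects that live 1,000 s.
popular = good_fetch_probability(p_i=1e-3, arrival_rate=10.0, lifetime=1000.0)
unpopular = good_fetch_probability(p_i=1e-6, arrival_rate=10.0, lifetime=1000.0)
print(f"popular: {popular:.4f}, unpopular: {unpopular:.4f}")
```

Under these assumptions the popular object is almost certain to be requested before it changes, while the unpopular one is requested with a probability of about 1%, so a threshold between the two separates them.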

9
APL

Attempt to balance benefit against cost
Threshold: the expected number of requests for
object i that arrive during its lifetime
Prefetch object i if $a \, p_i \, l_i >$ threshold, where
$l_i$: object i's expected lifetime
$a$: avg. request arrival rate
$p_i$: object i's popularity
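A short worked approximation may help explain why APL and Good Fetch behave so similarly. It assumes the Good Fetch criterion has the form $1 - (1 - p_i)^{a l_i}$, which is an assumption here because the slide formula did not survive in this transcript; $T$ and $T'$ are hypothetical symbols for the two thresholds.

```latex
\begin{align*}
1 - (1 - p_i)^{a l_i}
  &= 1 - \exp\bigl(a \, l_i \ln(1 - p_i)\bigr) \\
  &\approx 1 - \exp(-a \, p_i \, l_i)
  \qquad \text{for } p_i \ll 1 . \\[4pt]
1 - \exp(-a \, p_i \, l_i) > T
  &\iff a \, p_i \, l_i > T' , \qquad T' = -\ln(1 - T) .
\end{align*}
```

Because the approximate Good Fetch value is an increasing function of $a \, p_i \, l_i$, thresholding one is close to thresholding the other for the small per-object popularities typical of a large catalogue.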

10
Enhanced APL

Enhanced APL: prefetch object i if $a \, p_i^n \, l_i >$ threshold
Motivation -- adapt to network status
When the network has abundant bandwidth, a larger
value of n can be used to fetch more popular
objects to improve hit ratio
When the network is congested, a smaller value
of n can be used, or prefetching can even be
disabled, to save bandwidth
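How the exponent n shifts the selection can be sketched as below. The score $a \, p_i^n \, l_i$ follows the APL family formula given later in the talk; the function names, the two-object catalogue, and the arrival rate are hypothetical. Because $p_i < 1$, raising n shrinks every score, so this sketch compares rankings rather than reusing one fixed threshold across different n.

```python
def apl_score(p_i: float, arrival_rate: float, lifetime: float, n: float = 1.0) -> float:
    """Enhanced APL score a * p_i**n * l_i; n = 1 gives plain APL."""
    return arrival_rate * (p_i ** n) * lifetime


def rank_objects(objects: dict, arrival_rate: float, n: float = 1.0) -> list:
    """Return object names ordered from highest to lowest score.

    `objects` maps a name to a (popularity, lifetime_in_seconds) pair.
    """
    return sorted(
        objects,
        key=lambda name: apl_score(objects[name][0], arrival_rate, objects[name][1], n),
        reverse=True,
    )


# Hypothetical catalogue: a popular short-lived object and an unpopular long-lived one.
objects = {"news": (0.01, 600.0), "archive": (0.0001, 86_400.0)}

for n in (0.5, 1.0, 2.0):
    print(n, rank_objects(objects, arrival_rate=10.0, n=n))
```

In this example the long-lived object ranks first for n = 0.5 and n = 1, while n = 2 puts the popular object first, which matches the slide's point that a larger n favors popular objects.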

11
Performance Evaluation for Prefetching

Evaluation metrics
Benefit -- hit ratio
Cost -- bandwidth
Benefit/cost -- H/B
Algorithms to be evaluated
Popularity
Lifetime
Good Fetch
APL

12
New Evaluation Metric H/B

Measure benefit/cost
Passive caching serves as a baseline for
comparison
Enhanced: $H^k/B$
Emphasize benefit -- hit ratio improvement
When system has plenty of spare bandwidth, a
small fraction of hit ratio improvement can still
be justified
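The metric can be sketched numerically. The exact formula is not preserved in this transcript, so the code assumes H is the hit ratio with prefetching divided by the passive-caching hit ratio, B is total bandwidth divided by demand bandwidth, and the enhanced form raises H to the power k. The function name is hypothetical; the sample figures are the ones quoted in the bandwidth results of this talk.

```python
def h_over_b(hit_prefetch: float, hit_demand: float,
             bw_prefetch: float, bw_demand: float, k: float = 1.0) -> float:
    """Assumed metric: (relative hit ratio)**k / (relative bandwidth).

    Passive caching is the baseline, so both ratios equal 1 without prefetching.
    """
    relative_hit = hit_prefetch / hit_demand
    relative_bw = bw_prefetch / bw_demand
    return relative_hit ** k / relative_bw


# Figures quoted in the talk: hit ratio 39.54 vs. 24.3, bandwidth 113.50 vs. 60.57 kbps.
plain = h_over_b(39.54, 24.3, 113.50, 60.57)
enhanced = h_over_b(39.54, 24.3, 113.50, 60.57, k=2)
print(f"H/B = {plain:.2f}, H^2/B = {enhanced:.2f}")
```

Under this assumed definition the example gives roughly 0.87 for k = 1 and roughly 1.41 for k = 2, which illustrates how a larger k can justify a hit ratio gain that costs proportionally more bandwidth.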

13
Evaluation Methodology

Analytical simulation evaluation
Give a proof of concept for performance of
different algorithms
Experimental settings
Poisson model of user request arrival
Workload of one million objects [Douglis97,
Breslau99, Nishikawa98]
Zipf popularity distribution, with parameter
0.986 [Breslau99]
Object lifetime distribution obtained from
[Douglis97]
Fixed object size of 10 KB [Bray96,
Williams96, Abdulla98, Arlitt99]
No correlation between lifetimes, sizes, and
popularities [Crovella98, Breslau99]
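The request side of such a setup can be sketched as a small workload generator. This is an illustration under stated assumptions, not the authors' simulator: the function name, the seed, the arrival rate, and the scaled-down catalogue of 1,000 objects are all hypothetical. Exponentially distributed gaps between requests give Poisson arrivals, and each request picks an object rank with Zipf weights.

```python
import itertools
import random


def generate_requests(num_objects: int, alpha: float, arrival_rate: float,
                      duration: float, seed: int = 0) -> list:
    """Return (time, object_rank) pairs with Poisson arrivals and Zipf-chosen ranks."""
    rng = random.Random(seed)
    ranks = range(1, num_objects + 1)
    # Cumulative Zipf weights, computed once so each draw is a cheap lookup.
    cum_weights = list(itertools.accumulate(1.0 / (i ** alpha) for i in ranks))
    requests = []
    t = rng.expovariate(arrival_rate)  # exponential gaps give a Poisson process
    while t < duration:
        rank = rng.choices(ranks, cum_weights=cum_weights, k=1)[0]
        requests.append((t, rank))
        t += rng.expovariate(arrival_rate)
    return requests


# Scaled-down hypothetical run: 1,000 objects, 5 requests/s for 200 s.
sample = generate_requests(num_objects=1000, alpha=0.986,
                           arrival_rate=5.0, duration=200.0, seed=1)
print(len(sample), sample[:3])
```

Object sizes and lifetimes are left out here; in the setup described above, sizes are fixed and lifetimes are drawn independently of popularity.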

14
Distribution of Object Lifetimes [Douglis97]

We vary the mean lifetime of objects across
several orders of magnitude
Shift factor 0: mean 3.8 months
Shift factor -2: mean 1.2 days
Shift factor -4: mean 16.7 minutes
The shift factor denotes the horizontal
displacement along the lifetime axis (on log
scale) of the Cumulative Distribution Function

15
Results -- Hit Ratio
[Figure: hit ratio vs. log10 of prefetched objects; two panels, shift factor -4 and shift factor 0]

Popularity -- the highest hit ratio
APL (n = 1) works very close to Good Fetch
Good Fetch and APL (n = 1) work closer to Lifetime
at longer mean lifetimes, and closer to Popularity
at shorter mean lifetimes
Lifetime -- the lowest hit ratio

16
Results -- Bandwidth
[Figure: bandwidth (kbps) vs. log10 of prefetched objects; panel label recovered: shift factor -4]

Popularity consumes the most network bandwidth
of all the schemes
Good Fetch and APL obtain a significant improvement
in hit ratio at the expense of a moderate bandwidth
increase
E.g., when prefetching 0.1% of objects, a 15% increase
in hit ratio (39.54% over 24.3%)
Total bandwidth < 2 × demand bandwidth (113.50 kbps
over 60.57 kbps)
Lifetime consumes the smallest amount of bandwidth

17
Results -- H/B
[Figure: H/B vs. log10 of prefetched objects; two panels, shift factor 0 and shift factor -4]

Popularity -- drops quickly, not comparable with
the others
Good Fetch and APL -- attain high H/B values and
show their effectiveness at maximizing
benefit/cost
Lifetime -- decreases slowly all the way from the
beginning

18
APL Family: Hit Ratio
[Figure: hit ratio vs. log10 of prefetched objects]

Score: $a \, p_i^n \, l_i$, with n = 0.5, 1, 2, 5
n = 1: APL -> Good Fetch, maximizes benefit/cost
n > 1: APL -> Popularity, increases hit ratio
n < 1: APL -> Lifetime, reduces bandwidth
consumption

19
APL Family: Bandwidth
[Figure: bandwidth (kbps) vs. log10 of prefetched objects]

E.g., with n = 5, APL's hit ratio is very close to
that of Popularity, and its bandwidth cost is favorably
smaller.

20
APL Family: H/B Ratio
[Figure: H/B vs. log10 of prefetched objects]

From the H/B point of view, APL can be biased toward popular
or long-lived objects without sacrificing too
much benefit/cost
n = 1: achieves the best benefit/cost
n > 1: gains more hit ratio at a fair
BW cost
n < 1: reduces bandwidth while keeping a reasonable
hit ratio

21
Enhanced $H^k/B$

Recall -- why do we extend H/B to $H^k/B$?
Emphasize hit ratio improvement when evaluating
benefit/cost
When evaluating with $H^k/B$, a small fraction of
hit ratio improvement can still be justified even
at the cost of disproportionate bandwidth
increase

22
$H^k/B$ Evaluation
[Figure: $H^k/B$ vs. log10 of prefetched objects]

A higher k allows more objects to be
prefetched -> encourages more hit ratio improvement
under $H^k/B$
For $H^k/B > 1$, how many objects can be prefetched?
k = 1 -- 700; k = 2 -- 2,000; k = 3 -- 7,000;
k = 4 -- 30,000; k = 5 -- 200,000

23
Discussions
[Figure: solution space; axes: hit ratio and bandwidth]

We obtain a solution space for prefetching where
different strategies lie along axes of hit ratio
and bandwidth with different performance

24
Conclusions

We propose a new prefetching algorithm, $AP^nL$,
that can be adapted to different network
conditions by varying n
Prefetching must consider both object popularity
and lifetime in order to significantly improve
hit ratios at modest costs
