Whole Page Performance - PowerPoint PPT Presentation

Title: Whole Page Performance


1
Whole Page Performance
  • Leeann Bent and Geoffrey M. Voelker
  • University of California, San Diego

2
Whole Page Performance?
  • Extensive previous work on how specific
    techniques affect individual object download.
  • Caching, Prefetching, CDNs, DNS caching.
  • However, users download pages of objects.
  • Not clear how individual object performance maps
    onto whole page performance.
  • Goal: study whole page performance.
  • Extent to which different optimizations are used
  • Effect on downloading whole pages of objects

3
Related Work
  • Krishnamurthy and Wills99 look at:
  • Parallel (HTTP 1.0), persistent and pipelined
    connections.
  • In addition to caching, range requests, and
    content placed on different servers.
  • Top-level pages of popular sites.
  • Focus on pages where all optimizations used.
  • Our Study
  • Follow on, with a different perspective.
  • Use real user workloads.
  • All pages, not just top level pages on popular
    servers.
  • Not all pages use optimizations.
  • Base page + embedded objects.
  • Connection optimizations + CDNs + DNS.

4
Overview
  • Introduction
  • Methodology
  • Results
  • Conclusion

5
Methodology Overview
  • Use Medusa to:
  • Record everyday browsing from six users over four
    days.
  • Replay traces, toggling performance options:
  • Parallel Connections
  • Using CDNs
  • Complete DNS caching
  • Persistent Connections
  • Compute download costs for whole pages

6
The Medusa Proxy
  • [Diagram: Medusa proxy; surviving label: Internet]

7
Page Download Time
  • Page download time
  • Time required to download base page and all
    embedded objects.
  • Reflects user-perceived web performance
  • Calculated using object download time.
  • Determine object download time from just after
    DNS lookup to connection close or full object
    return (persistent).
  • Incorporate original recorded DNS times where
    appropriate.
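
A minimal sketch of how a page download time could be derived from object download times, under stated assumptions. The function names, the greedy "next object goes to the first idle connection" rule, and the sample times are all hypothetical and not taken from Medusa; dependencies between the base page and its embedded objects are ignored.

```python
import heapq


def serial_page_time(object_times_ms):
    """Serial download: objects are fetched one after another."""
    return sum(object_times_ms)


def parallel_page_time(object_times_ms, connections=2):
    """Parallel download: each object goes to the connection that frees up first."""
    if connections < 1:
        raise ValueError("connections must be at least 1")
    # Each heap entry is the time at which one connection becomes idle.
    free_at = [0.0] * connections
    heapq.heapify(free_at)
    finish = 0.0
    for t in object_times_ms:
        start = heapq.heappop(free_at)   # earliest idle connection
        end = start + t
        heapq.heappush(free_at, end)
        finish = max(finish, end)        # page is done when the last object is
    return finish


# Hypothetical object times in ms (base page first, then embedded objects).
times = [100, 200, 300, 50]
print(serial_page_time(times), parallel_page_time(times, connections=2))
```

With one connection the parallel estimate reduces to the serial sum, which is a quick sanity check on the scheduling rule.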

8
Example
  • [Figure: individual object times and the resulting
    page download times for serial and parallel (2 conns)
    download. Surviving values: 854 ms, 259 ms, 580 ms,
    580 ms.]

9
Traces
  • Six users, April 27 - 30 (Sat. - Tues.).
  • Originally 22,228 objects and 1,455 pages.
  • Remove error pages.
  • Replay data gathered May 6-7 (Mon. - Tues.) and June
    22-27 (Sat. - Thurs.).
  • Minimize warming effects by taking median of 5
    consecutive page downloads.

10
Optimization Combinations
  • Parallel Connections (1)
  • Medusa tracks number of concurrent connections
    used during trace.
  • Used to replay parallel download.
  • CDN Usage (2)
  • When no CDN usage, remove CDN references.
  • Replace with references to origin servers.
  • When CDN usage enabled, traces left intact.
  • DNS Caching (3)
  • Simulate ideal DNS caching by excluding DNS time.
  • Normal DNS: add original DNS lookup times from
    trace.
  • Persistent Connections (4)
  • Use whichever protocol (1.0/1.1) recorded in
    original trace.
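
The four toggles above can be sketched as on/off flags. This is an illustration only: the option names, the idea of enumerating every combination, and the `object_cost_ms` helper are assumptions, not Medusa's interface, and the slides do not say that all sixteen settings were replayed.

```python
from itertools import product

# Hypothetical names for the four options listed on this slide.
OPTIONS = ("parallel", "cdn", "dns_cache", "persistent")


def all_combinations():
    """Yield every on/off setting of the four options as a dict."""
    for flags in product((False, True), repeat=len(OPTIONS)):
        yield dict(zip(OPTIONS, flags))


def object_cost_ms(transfer_ms, dns_ms, dns_cache):
    """Ideal DNS caching drops the recorded lookup time; normal DNS adds it back."""
    return transfer_ms if dns_cache else transfer_ms + dns_ms


combos = list(all_combinations())
print(len(combos))
print(object_cost_ms(100, 20, dns_cache=True), object_cost_ms(100, 20, dns_cache=False))
```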

11
Overview
  • Introduction
  • Methodology
  • Results
  • Conclusion

12
Whole Page Optimizations
  • Parallel gives large improvement.
  • CDN improvement small.
  • 2.5%
  • DNS improvement consistent.
  • 7.4%
  • 6.7%
  • Persistent connections not as helpful as
    expected.
  • 1.5%

13
Overall Trace Conclusions
  • Parallelism has the greatest effect.
  • Parallelism used aggressively on all pages.
  • All other options provide incremental benefits.
  • Does not mean other optimizations don't work.
  • Some overheads may be relatively small.
  • Average over all pages.
  • Not all pages implement all optimizations.
  • We don't simulate more aggressive use of options
    than found in original trace.
  • A closer look

14
Ideal DNS Caching
  • Average DNS costs:
  • Per object: 7.1 ms
  • Per page: 529 ms
  • DNS improvement moderate across the board.
  • 5-14% improvement across all pages.
  • Provides moderate benefit to all pages.
  • Not all objects require full DNS lookups
  • Already effective DNS caching in traces

15
Objects Per Page
  • We would expect some other optimizations to have
    a greater effect (e.g. persistent connections).
  • Looking at all pages in trace doesn't tell the
    whole story.
  • Less opportunity for connection optimizations on
    small pages.
  • Page with one object counts as much as a page
    with 152 objects.
  • Optimizations more effective on a page with 152
    objects.
  • Separate out effects of optimizations in pages
    with different numbers of objects
  • Median number of objects per page is 5.
  • Average number of objects per page is 15.
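
A small sketch of why the average and the median can differ so much, and of separating pages by object count. The page sizes below are made up for illustration; the class boundaries follow the page breakdown in this talk, with "16+" as an assumed reading of the largest class.

```python
from statistics import mean, median


def bucket_for(n_objects):
    """Map an object count to a page-size class (labels are assumptions)."""
    if n_objects < 1:
        raise ValueError("a page has at least one object")
    if n_objects == 1:
        return "1"
    if n_objects <= 5:
        return "2-5"
    if n_objects <= 15:
        return "6-15"
    return "16+"


# Hypothetical objects-per-page counts; one very large page skews the mean.
counts = [1, 1, 3, 5, 5, 8, 152]
print(median(counts), mean(counts))

# Group the pages so each size class can be analysed on its own.
by_bucket = {}
for n in counts:
    by_bucket.setdefault(bucket_for(n), []).append(n)
print(by_bucket)
```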

16
Page Breakdown
  • 1-5 objects
  • 1: 21%
  • 2-5: 63%
  • 6+ objects improvements.
  • 6-15: 157%
  • 16+: 183%
  • Persistent
  • 1.95%
  • 18.5%

17
Page Breakdown Conclusions
  • Performance optimizations dependent on number of
    objects per page.
  • Optimizations more effective when more objects
    per page.
  • Especially connection optimizations.
  • Single object pages see moderate improvement.
  • Can usually only benefit from DNS caching and
    CDNs.
  • Persistent benefit only if on same server as
    previous page.
  • And 26% of pages had one object.

18
Persistent Connections
  • Still don't see a whole lot of improvement for
    persistent connections.
  • Expected to see more benefit for 16+ objects.
  • Not all pages use persistent connections.
  • 20% of pages in our trace use them (229 pages).
  • 2211 objects or 16.1%.
  • 9.65 objects per page.
  • Look at only pages that contain persistent
    connections.

19
Persistent Connections
  • Persistent connections useful if:
  • Many objects downloaded over persistent
    connections in the original trace.
  • Objects downloaded from few servers.
  • For pages < 6 objects:
  • 2 out of 3 downloaded with persistent
    connections.
  • Average page size 3.
  • On average, 1.32 persistent objects per server.
  • For pages > 16 objects:
  • Average 18 objects with persistent connections.
  • On average, 3.92 persistent objects per server.

20
Mostly Persistent Pages
  • Know what it takes to see persistent
    optimization improvement.
  • Look at large pages where persistent connections
    used extensively (>50% of objects).
  • Pages that can benefit, do.
  • 6+ objects improve 33-50%.

21
CDN
  • Previous study showed CDNs highly effective for
    individual objects. Koletsou01
  • What is effect on whole page performance?
  • Few pages with explicit Akamai-hosted objects.
  • 48 pages or 5.2% of pages.
  • 216 objects or 1.6% of total downloaded objects.
  • Average of 4.5 CDN objects per page.
  • Looked at CDN-only page improvements.
  • CDNs improve CDN-containing pages 6 - 30%.

22
Conclusions
  • Parallel connections have greatest impact.
  • Universally applicable and easy to implement.
  • Other options give incremental performance across
    all pages.
  • Some optimizations provide consistent, but
    moderate, improvement across all pages.
  • Some optimizations are not implemented on all
    pages.
  • Provide benefit when used extensively.

23
Conclusions
  • Can we draw correlation between object and
    real-world whole page performance?
  • Depends.
  • Not all optimizations widely used.
  • When optimizations are used to full advantage,
    they are effective.

24
Medusa Available
  • http://ramp.ucsd.edu/lbent/Medusa/index.html

25
The End

26
Medusa Proxy Functionality
  • Trace and Replay
  • Record requests and replay.
  • Parallel connections.
  • Persistent connections.
  • Transformation
  • CDN/no CDN replay.
  • Performance Measurement
  • Request latency.
  • DNS overhead.
  • Optimization options
  • Use parallel connections.
  • Use persistent connections.
  • HTTP 1.0 and HTTP 1.1.
  • Always attempt, never attempt, mirror trace
    attempt.

27
Page Delimitation
  • Determining pages
  • Necessary for
  • Calculating total page costs.
  • Limiting optimizations to within one page.
  • Parallel Connections.
  • Can analyze page and draw object dependencies.
  • High overhead
  • May impact user
  • Use inter-object times in the original trace
    data.
  • Use 2 second inter-object times.
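
A minimal sketch of delimiting pages by inter-object time, assuming each request carries a timestamp in seconds and that a gap longer than 2 seconds starts a new page. The function name and the sample timestamps are hypothetical; whether the boundary itself (exactly 2 seconds) counts as a new page is also an assumption.

```python
def split_into_pages(request_times_s, gap_s=2.0):
    """Group ascending request timestamps (seconds) into pages.

    A new page starts when the gap since the previous request exceeds gap_s.
    """
    pages = []
    for t in request_times_s:
        if pages and t - pages[-1][-1] <= gap_s:
            pages[-1].append(t)      # still within the current page
        else:
            pages.append([t])        # long pause: begin a new page
    return pages


# Hypothetical request timestamps from a trace.
stamps = [0.0, 0.3, 0.9, 5.0, 5.4, 12.0]
pages = split_into_pages(stamps)
print(pages)
```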

28
Akamaized URLs
  • Akamai accounts for 85-98% of CDN-hosted objects
    ref.
  • Will not account for sites completely hosted on
    Akamai hosts.
  • Filter:
  • http://a1964.g.akamai.net/f/1964/2730/1h/app.whenu.com/image.gif
  • http://app.whenu.com/image.gif
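
The filter above could be sketched as a regular expression. The URL shape is an assumption inferred only from the single example on this slide (an `a<digits>.g.akamai.net` host followed by four path segments and then the origin host); real Akamaized URLs may differ, and the function name is hypothetical.

```python
import re

# Assumed shape, based on the one example in the slide:
#   http://a<digits>.g.akamai.net/<4 path segments>/<origin host>/<path>
AKAMAI_RE = re.compile(
    r"^http://a\d+\.g\.akamai\.net/(?:[^/]+/){4}(?P<origin>.+)$"
)


def to_origin_url(url):
    """Rewrite an Akamaized URL to its origin-server form; leave others alone."""
    m = AKAMAI_RE.match(url)
    return "http://" + m.group("origin") if m else url


cdn_url = "http://a1964.g.akamai.net/f/1964/2730/1h/app.whenu.com/image.gif"
print(to_origin_url(cdn_url))
```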

29
Interleaved Requests
  • Requests may get interleaved when recorded in
    parallel mode and replayed in serial mode.
  • E.g.:
  • Connection 0 requests www.cnn.com,
    www.cnn.com/style.css.
  • Connection 1 requests ar.atwola.com.
  • Requests may be ordered in trace as:
  • www.cnn.com, ar.atwola.com, www.cnn.com/style.css.
  • Negates benefit of parallel connections.

30
Page Characterization: Objects per Page

31
Object Types
  • Identified object type by clues in URL:
  • 80% of URLs images (.gif, .jpg).
  • 5.6% html file (.htm, .html).
  • 3.8% cgi, perl or javascript (?, .pl, .class).
  • 3.3% javascript (.js).
  • 3.6% unidentified (no suffix, pdf, txt, etc).
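
A rough sketch of classifying objects by URL clues, assuming the suffix groups listed above. The category labels ("image", "dynamic", and so on) and the function name are hypothetical, and anything not listed on the slide falls through to "unidentified".

```python
import posixpath
from urllib.parse import urlsplit

# Suffix groups taken from the slide; category names are made up for this sketch.
SUFFIX_TYPES = {
    ".gif": "image", ".jpg": "image",
    ".htm": "html", ".html": "html",
    ".pl": "dynamic", ".class": "dynamic",
    ".js": "javascript",
}


def classify(url):
    """Guess an object type from the URL alone."""
    parts = urlsplit(url)
    if parts.query:                      # a "?" in the URL suggests a script
        return "dynamic"
    ext = posixpath.splitext(parts.path)[1].lower()
    return SUFFIX_TYPES.get(ext, "unidentified")


print(classify("http://app.whenu.com/image.gif"))
print(classify("http://www.cnn.com/"))
```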

32
Persistent Connection/Browser
  • Persistent connections appear correlated with
    browser:
  • IE - 12% pgs, 15.8% objs.
  • Netscape - 19.5% pgs, 10.0% objs.
  • Omniweb - 66.0% pgs, 72.4% objs.
  • Mozilla 5.0/Gecko - 95.8% pgs, 91.3% objs.

33
Persistent Connection Pages
  • Still not as improved as expected.
  • Better than for only large pages:
  • Serial: 7.28% vs. 1.98%
  • Parallel: 24.03% vs. 18.5%
  • Medians don't show improvements in all cases.

34
Mostly Persistent Pages

35
Persistent Connections per Page

36
Same as previous 16

37
Ad-Servers
  • Identified as hosts whose names contain the
    phrases "ads" or "adserver".
  • YES: http://rmads.msn.com/images_47144_date_0429_50.jpg.
  • NO: http://graphics4.nytimes.com/ads/scottrade_sov.gif.
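
The host-name rule above can be sketched as follows. The function name is hypothetical, and matching on a plain substring is an assumption; it would also flag unrelated hosts such as one containing "downloads".

```python
from urllib.parse import urlsplit


def is_ad_server(url):
    """True when the host name (not the path) contains "ads" or "adserver"."""
    host = (urlsplit(url).hostname or "").lower()
    # "adserver" already contains "ads"; both are kept to mirror the slide.
    return "ads" in host or "adserver" in host


print(is_ad_server("http://rmads.msn.com/images_47144_date_0429_50.jpg"))
print(is_ad_server("http://graphics4.nytimes.com/ads/scottrade_sov.gif"))
```

The second URL is rejected because "ads" appears only in its path, which matches the YES/NO examples on this slide.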

38
Ad-Servers and DNS
  • Number of pages with ad-servers.
  • 9.5% of pages, 1.53% of total objects.
  • Average of 2.4 ads per page.
  • Objects not hosted on content server.
  • DNS lookup may be large part of lookup cost.
  • DNS caching doesn't give great improvement:
  • DNS caching improves parallel case 10.9%.
  • Compared with 12.2% over all pages.
  • DNS caching improves parallel, persistent case
    8%.
  • Compared with 6.3% over all pages.
  • DNS caching improves parallel, persistent w/ CDN
    4.7%.
  • Compared to 6.3%.