Title: An Analysis of Internet Content Delivery Systems


1
An Analysis of Internet Content Delivery Systems
  • Stefan Saroiu, Krishna P. Gummadi, Richard J.
    Dunn, Steven D. Gribble, and Henry M. Levy
  • Proceedings of the 5th Symposium on Operating
    Systems Design and Implementation
  • December 2002

2
Outline
  • Goals of Paper
  • Overview of Content Delivery Systems
  • Experimental Methodology
  • Results
  • Caching
  • Conclusions

3
Goals
  • Quantify the increasing importance of novel
    content delivery systems
  • Characterize the behavior of these systems from
    the perspectives of clients, objects, and servers
  • Derive implications for caching in these systems

4
Content Delivery Systems
  • HTTP Web Traffic
  • Content Delivery Networks
      • Akamai
  • Peer-to-peer file sharing networks
      • Gnutella
      • Kazaa

5
HTTP Traffic
  • Clients request objects from web servers using
    HTTP
  • Most web objects are small, 5-10KB.
  • Web object requests follow a Zipf-like
    distribution
  • Caching
      • Cache hit rate increases logarithmically with
        client population
      • Not possible for dynamic content
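One way to see why the hit rate of a shared cache grows with client population under Zipf-like popularity is a small simulation. The sketch below is an illustration only, not the paper's method: it assumes an ideal infinite cache, and the object count, Zipf exponent, and requests per client are made-up parameters.

```python
import random
from itertools import accumulate

def zipf_cum_weights(num_objects, alpha=1.0):
    """Cumulative unnormalized Zipf weights: rank r has weight 1 / r**alpha."""
    return list(accumulate(1.0 / (rank ** alpha) for rank in range(1, num_objects + 1)))

def ideal_hit_rate(num_clients, requests_per_client=20,
                   num_objects=100_000, alpha=1.0, seed=0):
    """Object hit rate of an infinite shared cache serving all clients.

    All parameter defaults are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    cum = zipf_cum_weights(num_objects, alpha)
    total = num_clients * requests_per_client
    # Draw every request from the same Zipf-like popularity distribution.
    requests = rng.choices(range(num_objects), cum_weights=cum, k=total)
    seen = set()
    hits = 0
    for obj in requests:
        if obj in seen:
            hits += 1          # object was requested before: cache hit
        else:
            seen.add(obj)      # first request is a compulsory miss
    return hits / total

if __name__ == "__main__":
    for clients in (10, 100, 1000):
        print(clients, round(ideal_hit_rate(clients), 3))
```

Each tenfold increase in clients adds to the hit rate, but with diminishing returns, which is the qualitative behavior the slide describes.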

6
Zipf Distribution Compared with Sun Log Data

7
Content Delivery Networks (CDNs)
  • Dedicated collections of servers that are
    geographically distributed
  • Provide static content, e.g. images, streaming
    video
  • Allow users to access a nearby replica of the
    content
  • Replicas are located via DNS interposition or
    URL rewriting at origin servers
  • Redirection adds overhead
  • Reduces average download response time

8
Peer-to-Peer Systems
  • Peers form a distributed system to exchange
    content
  • Batch-style downloads
  • Most peers have low availability and limited
    network capacity
  • Files transferred via direct connection between
    peers

9
Experiment Methodology
  • Use passive network monitoring to collect a trace
    of TCP traffic between the University of
    Washington (UW) and the rest of the Internet
  • Collected 9 days of data, over 20 TB

10
Some Interesting Observations
  • UW is an HTTP content provider
      • Exported 16.65 TB; imported 3.44 TB
  • Bandwidth consumption (in + out)
      • 0.2% Akamai
      • 6.04% Gnutella
      • 14.3% WWW
      • 36.9% Kazaa
      • Rest is other TCP protocols: mail, streaming
        video/audio, etc.
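The figures on this slide can be turned into approximate per-system volumes. The sketch below assumes the listed shares are percentages of the combined exported plus imported TCP traffic; that reading, and the names `traffic_breakdown` and "Other TCP", are assumptions made for illustration.

```python
# Figures taken from the slide; the interpretation as percentages of the
# combined (exported + imported) traffic is an assumption.
EXPORTED_TB = 16.65
IMPORTED_TB = 3.44
SHARES_PERCENT = {"Akamai": 0.2, "Gnutella": 6.04, "WWW": 14.3, "Kazaa": 36.9}

def traffic_breakdown(total_tb, shares_percent):
    """Convert percentage shares into terabytes and add the remainder."""
    breakdown = {name: total_tb * pct / 100.0
                 for name, pct in shares_percent.items()}
    remainder_pct = 100.0 - sum(shares_percent.values())
    breakdown["Other TCP"] = total_tb * remainder_pct / 100.0
    return breakdown

TOTAL_TB = EXPORTED_TB + IMPORTED_TB
BREAKDOWN = traffic_breakdown(TOTAL_TB, SHARES_PERCENT)

if __name__ == "__main__":
    print(f"total: {TOTAL_TB:.2f} TB")
    for name, tb in BREAKDOWN.items():
        print(f"{name}: {tb:.2f} TB")
```

The combined total agrees with the "over 20 TB" trace size given on the methodology slide.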

11
Some More Interesting Observations
  • Compared to a 1999 study
      • HTML traffic has decreased 43%
      • GIF/JPG traffic has decreased 59%
      • AVI/MPG traffic increased nearly 400%
      • MP3 traffic increased nearly 300%

12
Objects
  • Median P2P object size is 4MB.
  • Median Web object is 2KB
  • 5% of Kazaa objects are over 100MB
  • Top 1% of Kazaa objects account for 50% of bytes
    transferred
  • For Web, top 1% account for 16% of bytes
    transferred
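A concentration statistic such as "the top 1% of objects account for X% of bytes" can be computed from per-object byte counts as sketched below. The function name and the sample data are hypothetical; the paper's trace is not reproduced here.

```python
import math

def top_byte_share(bytes_per_object, top_fraction):
    """Fraction of all bytes contributed by the largest `top_fraction` of objects.

    `bytes_per_object` holds the total bytes transferred for each object.
    At least one object is always counted as "top".
    """
    if not bytes_per_object:
        raise ValueError("need at least one object")
    ranked = sorted(bytes_per_object, reverse=True)
    top_count = max(1, math.ceil(top_fraction * len(ranked)))
    return sum(ranked[:top_count]) / sum(ranked)

if __name__ == "__main__":
    # Hypothetical workload: one heavily transferred object among 100.
    sample = [100] + [1] * 99
    print(top_byte_share(sample, 0.01))
```

The same calculation applies to clients or servers if the list holds bytes per client or per server instead of bytes per object.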

13
Clients
  • For both Web and Kazaa, a small number of clients
    account for a large portion of traffic
  • In Web, top 200 clients (0.5% of the population)
    account for 13% of the traffic
  • In Kazaa, top 200 clients (4% of the population)
    account for 50% of the traffic

14
Servers
  • Would expect server load for Kazaa to be much
    more distributed than for WWW
  • This is not the case
  • Top 500 external Web servers provide 22% of the
    bytes
  • Top 500 external Kazaa servers provide 10% of
    the bytes

15
Scalability
  • With respect to bandwidth cost, adding another
    450 Kazaa clients would be equivalent to doubling
    the Web client population (from 40,000 to 80,000)
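The equivalence on this slide implies a rough per-client bandwidth ratio. The arithmetic below only restates the slide's two numbers; the helper name is hypothetical.

```python
WEB_CLIENTS = 40_000            # Web client population stated on the slide
EQUIVALENT_KAZAA_CLIENTS = 450  # Kazaa clients with the same bandwidth cost

def web_clients_per_kazaa_client(web_clients, kazaa_clients):
    """How many Web clients consume as much bandwidth as one Kazaa client."""
    return web_clients / kazaa_clients

RATIO = web_clients_per_kazaa_client(WEB_CLIENTS, EQUIVALENT_KAZAA_CLIENTS)

if __name__ == "__main__":
    print(f"one Kazaa client is roughly {RATIO:.0f} Web clients")
```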

16
CDN Caching
  • Do CDNs provide any performance benefits over a
    local proxy cache?
  • If Akamai traffic were directed to a proxy cache
    instead
      • 88% ideal object hit rate (all objects
        cacheable)
      • 50% practical hit rate
  • Conclusion: Widely deployed proxy caches reduce
    need for separate CDNs
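The gap between an ideal and a practical hit rate can be illustrated by replaying a request trace through a simulated proxy cache twice: once treating every object as cacheable, and once honoring a per-request cacheability flag. This is a minimal sketch assuming an infinite cache and a made-up trace format of `(url, cacheable)` pairs; it is not the simulator used in the paper.

```python
def object_hit_rate(trace, respect_cacheability):
    """Object hit rate of an infinite proxy cache over `trace`.

    `trace` is a sequence of (url, cacheable) pairs (hypothetical format).
    With respect_cacheability=False every object is stored (ideal case);
    with True, objects flagged uncacheable are never stored (practical case).
    """
    if not trace:
        return 0.0
    cached = set()
    hits = 0
    for url, cacheable in trace:
        if url in cached:
            hits += 1
        elif cacheable or not respect_cacheability:
            cached.add(url)
    return hits / len(trace)

if __name__ == "__main__":
    trace = [("a", True), ("a", True), ("b", False), ("b", False), ("a", True)]
    print("ideal:", object_hit_rate(trace, respect_cacheability=False))
    print("practical:", object_hit_rate(trace, respect_cacheability=True))
```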

17
P2P Caching
  • Inbound cache byte hit rate: 35%
  • Outbound cache byte hit rate: 85%
  • Hit rate increases with client population
      • 1,000 clients: 40% hit rate
      • 500,000 clients: 85% hit rate
  • Conclusion: Reverse P2P cache saves the most
    bandwidth
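Byte hit rate weights each request by the bytes it moves, which matters for P2P traffic where objects are large. The sketch below assumes an infinite cache and whole-object transfers, given as hypothetical `(object_id, size_bytes)` pairs; real P2P transfers can be partial, which this simplification ignores.

```python
def byte_hit_rate(transfers):
    """Byte hit rate of an infinite cache over (object_id, size_bytes) pairs.

    The first transfer of an object is a miss and fills the cache;
    every later transfer of that object counts all of its bytes as a hit.
    """
    cached = set()
    hit_bytes = 0
    total_bytes = 0
    for object_id, size in transfers:
        total_bytes += size
        if object_id in cached:
            hit_bytes += size
        else:
            cached.add(object_id)
    return hit_bytes / total_bytes if total_bytes else 0.0

if __name__ == "__main__":
    # Hypothetical trace: one large object fetched twice, one small object once.
    print(byte_hit_rate([("x", 100), ("y", 10), ("x", 100)]))
```

Running it separately over inbound and outbound transfers would give the two per-direction rates the slide distinguishes.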

18
Conclusions
  • P2P traffic accounts for majority of HTTP bytes
    transferred
  • P2P objects are significantly larger than Web
    objects
  • Small number of large objects account for a large
    percentage of P2P traffic
  • Small number of clients and servers responsible
    for majority of P2P traffic
  • P2P traffic creates significant bandwidth load