Building a Large and Efficient Hybrid Peer-to-Peer Internet Caching System


1
Building a Large and Efficient Hybrid
Peer-to-Peer Internet Caching System
  • Li Xiao
  • Xiaodong Zhang
  • Artur Andrzejak
  • Songqing Chen

2
Overview
  • Web contents become more diverse
  • Multiple levels of caching
  • Document duplication among different levels
  • Underutilized network between clients
  • Share cache contents
  • P2P web caching management scheme
  • Improve scalability

3
Typical web caching
  • Browser cache / Proxy cache
  • Enlarge cache size
  • Documents exist in browser cache but have been
    replaced in the proxy
  • Decrease of proxy hit ratio
  • Each document is replicated in local cache

4
Limitations due to replication
  • Overhead
  • Invalidation
  • Broadcast
  • Waste storage space
  • Close the speed gap between local and remote
    accesses
  • Transfer among clients

5
Browser aware caching
  • Share cache contents
  • Reduce document replication
  • Utilize web contents / network bandwidth
  • Hybrid P2P system
  • Issues
  • Data integrity
  • Privacy

6
Trends in web caching
  • Dynamic pages -> non-cacheable
  • Increase of cache size
  • Cannot keep up
  • Hit ratios are reduced

7
Trends in web caching (cont'd)
8
Duplication in web caching
  • Types of data
  • Requested by a single client
  • Requested by multiple clients
  • Intrasharing / Intersharing ratio
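The split between documents requested by a single client and documents requested by multiple clients can be illustrated on a request trace. The following is a minimal sketch, not the authors' analysis: the trace format (pairs of client id and URL) and the function name `classify_documents` are assumptions made for illustration, and the slide does not give the exact definitions of the intrasharing and intersharing ratios, so none are computed here.

```python
from collections import defaultdict

def classify_documents(trace):
    """Split URLs by how many distinct clients requested them.

    trace: iterable of (client_id, url) records (hypothetical format).
    Returns (single, multiple): two sets of URLs.
    """
    clients_per_url = defaultdict(set)
    for client_id, url in trace:
        clients_per_url[url].add(client_id)
    # A URL seen from exactly one client is a "single client" document.
    single = {u for u, c in clients_per_url.items() if len(c) == 1}
    multiple = set(clients_per_url) - single
    return single, multiple

# Tiny made-up trace, for illustration only.
trace = [("c1", "/a"), ("c1", "/a"), ("c2", "/b"), ("c1", "/b"), ("c3", "/c")]
single, multiple = classify_documents(trace)
```

Here `/a` and `/c` fall into the single-client set and `/b` into the multiple-client set; only the latter can benefit from sharing between clients.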

9
Architecture
  • Proxy server keeps track of client cache contents
  • Miss -> try to find in another client's cache
  • Forward data / copy in proxy cache

10
Architecture (cont'd)
  • Local hit
  • Use it
  • Proxy hit
  • (Access counter from client)
  • Store in client if counter above threshold
  • Proxy's browser index file
  • Access counter (global and specific for client)
  • Move it to proxy / cache in another client
  • Global miss
  • Fetch document / store only in client
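The lookup order on these two slides can be sketched as a small in-memory simulation. This is an illustrative sketch under stated assumptions, not the presented implementation: the class `BrowserAwareProxy`, its fields, the `fetch_origin` callback, and the default threshold are hypothetical, and the client caches are modeled inside the proxy object only to keep the example self-contained.

```python
class BrowserAwareProxy:
    """Toy model: local hit, proxy hit, browser-index hit, global miss."""

    def __init__(self, threshold=2):
        self.proxy_cache = {}    # url -> document
        self.browser_index = {}  # url -> id of a client holding a copy
        self.client_caches = {}  # client id -> {url: document}
        self.access_count = {}   # (client id, url) -> proxy hits so far
        self.threshold = threshold  # assumed value, not from the slides

    def request(self, client, url, fetch_origin):
        cache = self.client_caches.setdefault(client, {})
        if url in cache:  # local hit: use it
            return "local hit", cache[url]
        if url in self.proxy_cache:  # proxy hit: count accesses per client
            doc = self.proxy_cache[url]
            key = (client, url)
            self.access_count[key] = self.access_count.get(key, 0) + 1
            if self.access_count[key] > self.threshold:
                cache[url] = doc  # store in client above the threshold
                self.browser_index[url] = client
            return "proxy hit", doc
        holder = self.browser_index.get(url)
        if holder is not None and url in self.client_caches.get(holder, {}):
            doc = self.client_caches[holder][url]
            self.proxy_cache[url] = doc  # forward data, keep a proxy copy
            return "remote browser hit", doc
        doc = fetch_origin(url)  # global miss: store only in the client
        cache[url] = doc
        self.browser_index[url] = client
        return "global miss", doc

proxy = BrowserAwareProxy(threshold=1)
origin = lambda url: "doc:" + url  # stand-in for the web server
first = proxy.request("c1", "/x", origin)
second = proxy.request("c2", "/x", origin)
```

In this run `first` is a global miss served from the stand-in origin, and `second` is answered from client `c1`'s cache through the browser index rather than from the web server.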

11
Data structures
  • Browser index file in proxy
  • Id of client / URL
  • Client initiated invalidation
  • Browser data
  • remote accesses
  • Proxy data
  • remote accesses
  • Duplicate document?
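A browser index keyed by URL, with client-initiated invalidation, might look like the sketch below. The slide only states that the index holds client ids and URLs; the class name `BrowserIndex`, its methods, and the choice of a set of holders per URL are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class BrowserIndex:
    # url -> set of client ids whose browser cache holds the document
    holders: dict = field(default_factory=dict)

    def add(self, client_id, url):
        """Record that a client now caches this URL."""
        self.holders.setdefault(url, set()).add(client_id)

    def invalidate(self, client_id, url):
        """Client-initiated: the client reports it evicted the document."""
        clients = self.holders.get(url)
        if clients is None:
            return
        clients.discard(client_id)
        if not clients:
            del self.holders[url]  # no holder left, drop the entry

    def lookup(self, url, exclude=None):
        """Return holders of url, skipping the requesting client."""
        return sorted(c for c in self.holders.get(url, ()) if c != exclude)

index = BrowserIndex()
index.add("c1", "/a")
index.add("c2", "/a")
candidates = index.lookup("/a", exclude="c1")
```

Keeping a set of holders per URL makes it easy to see when a document is duplicated across clients, which is the question the slide raises.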

12
Evaluation: Cache size
  • Proxy memory for each client: 0.04-0.08 MB
  • Proxy disk space for each client: 7-11 MB

13
Evaluation: Proxy cache size
14
Evaluation: Browser cache size
15
Evaluation: Replacement threshold
16
Evaluation: Scaling clients
17
Evaluation: Latency
  • 21% reduction compared to basic scheme
  • 56% reduction compared to typical scheme

18
Reliability and Privacy
  • Data integrity
  • Clients can modify files
  • Digital watermark
  • Privacy
  • Do not reveal client requests / hide identities
  • Proxy acts as anonymizer
  • Receive / forward requests (where is p2p?)
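One way the proxy could detect a document modified by a client is to attach a keyed tag when the document first arrives and check it before forwarding. The sketch below swaps in a standard HMAC for the digital watermark named on the slide, which is an assumption: the slides do not describe the watermarking scheme, and the key `SECRET` and the function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key known only to the proxy, so clients cannot forge tags.
SECRET = b"proxy-only-secret"

def sign(document: bytes) -> str:
    """Tag computed by the proxy when the document is first fetched."""
    return hmac.new(SECRET, document, hashlib.sha256).hexdigest()

def verify(document: bytes, tag: str) -> bool:
    """Check a copy coming back from a client cache before forwarding it."""
    return hmac.compare_digest(sign(document), tag)

original = b"<html>cached page</html>"
tag = sign(original)
ok = verify(original, tag)
tampered_ok = verify(original + b"<script>", tag)
```

With these inputs `ok` is true and `tampered_ok` is false, so a copy altered in a client's cache would be rejected and fetched again instead of being forwarded.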

19
Implementation
  • Client daemon
  • Mozilla
  • Browser aware proxy server
  • Squid
  • For an average document size of 8 KB, 11 ms overhead
  • Fetch from web server -> 50 ms

20
Conclusion
  • Hit ratios decrease, significant duplication
  • Browser aware caching, share cache contents
  • Data integrity and privacy issues