Cooperative Caching in the Internet

1
Cooperative Caching in the Internet
Presented by Mohammad Salimullah Raunak, Feb. 15, 2000
2
Outline
  • Motivation
  • General Idea
  • Papers
  • Harvest and Squid (http://catarina.usc.edu/danzig/cache.ps)
    (http://www.ircache.net/%7ewessels/Papers/icp-squid.ps)
  • Summary Cache (http://www.cs.wisc.edu/cao/papers/summarycache.ps)
  • Cache Digests (http://www.ircache.net/%7ewessels/Papers/wcw98.dvi)
  • CRISP (http://www.research.att.com/misha/crisp/distrProxy/hotos.ps)
  • Summary

3
Motivation
  • Proxy Caching
  • Saves Bandwidth
  • Reduces Latency
  • Problems with a Single Proxy
  • Bottleneck
  • Scalability
  • Single Point of Failure

4
General Idea
5
Harvest
  • Hierarchical, Tree-like
  • Query-response
  • UDP based
  • Twice as fast as CERN object cache
  • Ten times faster than Netscape's Netsite.
  • Evolved into
  • Squid
  • NetCache

6
ICP
  • Internet Cache Protocol (ICP) is the cache-to-cache
    communication protocol developed in Harvest
  • Many commercial and research cache systems
    have implemented ICP
  • ICP is based on UDP to expedite the exchange of
    query/reply messages
  • "ICP loss" provides a rudimentary load-balancing
    mechanism
  • Suffers from the security problems of UDP
  • Less effective when cache servers are not close to
    each other
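
To make the query/reply exchange concrete, here is a minimal sketch of how an ICP query message could be packed. It assumes the ICP version 2 layout described in RFC 2186 (a 20-byte header, then the requester address and a null-terminated URL). The function names, the zeroed address fields, and the example URL are illustrative assumptions, not part of the slides.

```python
import struct

# Opcode and version values assumed from RFC 2186.
ICP_OP_QUERY = 1
ICP_OP_HIT = 2
ICP_OP_MISS = 3
ICP_VERSION = 2


def build_icp_query(url: str, request_number: int) -> bytes:
    """Pack an ICP query: 20-byte header + requester address + URL + NUL."""
    # Payload: 4-byte requester host address (zero here) and the URL.
    payload = struct.pack("!I", 0) + url.encode("ascii") + b"\x00"
    length = 20 + len(payload)
    header = struct.pack(
        "!BBHIIII",
        ICP_OP_QUERY,    # opcode
        ICP_VERSION,     # protocol version
        length,          # total message length in bytes
        request_number,  # echoed back by the replying cache
        0,               # options
        0,               # option data
        0,               # sender host address (left zero in this sketch)
    )
    return header + payload


def parse_icp_opcode(message: bytes) -> int:
    """Return the opcode, e.g. ICP_OP_HIT or ICP_OP_MISS for a reply."""
    return message[0]
```

A proxy would send these bytes as a single UDP datagram to each neighbour. Because UDP gives no delivery guarantee, a reply that never arrives is simply treated as a miss after a timeout.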

7
Summary Cache
  • Individual querying with ICP increases overhead
  • Inter-proxy traffic increases by a factor of 70
    to 90
  • In Summary Cache
  • A cache directory is maintained
  • Each proxy keeps a compact summary of the
    directory
  • Probe all the summaries and do a targeted query
  • The summaries are not 100% correct
  • Issues
  • Representation of the summary
  • Frequency of the Summary Update
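
The "probe the summaries, then do a targeted query" step can be sketched as follows. Plain Python sets stand in for the compact summaries, and the names `peers_to_query` and `fetch` are hypothetical. This illustrates the idea only and is not the implementation from the paper.

```python
def peers_to_query(url, peer_summaries):
    """Return the peers whose local summary copy claims to hold url."""
    return [peer for peer, summary in peer_summaries.items() if url in summary]


def fetch(url, peer_summaries, peer_caches):
    """Query only the promising peers; fall back to the origin server."""
    for peer in peers_to_query(url, peer_summaries):
        if url in peer_caches[peer]:
            return ("peer", peer)   # the targeted query succeeded
        # Otherwise the summary was stale or a false hit: one wasted query.
    return ("origin", None)


# Peer A's summary is out of date for u2, so the first query is wasted.
summaries = {"A": {"u1", "u2"}, "B": {"u2"}}
caches = {"A": {"u1"}, "B": {"u2"}}
result = fetch("u2", summaries, caches)
```

Only peers whose summary matches are contacted, which is where the saving over querying every neighbour with ICP comes from.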

8
Summary Cache Issues
  • Representation of the summary
  • Objective: less memory
  • Choices
  • Exact directory with MD5 signature
  • Server Name
  • Need something with less memory and minimal false hits
  • One Solution
  • Use Bloom Filters

9
Bloom Filters
  • A hash based probabilistic scheme that answers
    membership queries
  • Zero probability for false negatives
  • low probability for false positives
  • A vector v of m bits (initially set to 0)
  • k independent hash functions, h1, h2, ... hk
  • For each a in A , the bits at positions h1(a),
    h2(a), ... hk(a) in v are set to 1.

10
Bloom Filters (Contd.)
  • To query whether an object b is in A, check whether
    the bits at positions h1(b), h2(b), ... hk(b) in v
    are all set to 1
  • If any of them is 0, b is not in A
  • Otherwise, assume b is in A, with a certain
    probability of a false positive
  • Each proxy maintains a local Bloom filter to
    represent its own cached documents.
  • For each location in the bit vector, it also
    keeps a counter of how many cached documents set
    that bit, so the bit can be cleared when the
    count drops to 0.
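
The insert, query, and counter mechanics described on the last two slides can be sketched as a small counting Bloom filter. The class name, the use of salted SHA-256 to derive the k positions, and the sizes in the usage lines are illustrative assumptions. The slides do not specify the hash functions.

```python
import hashlib


class CountingBloomFilter:
    """Bit vector of m positions with one counter per position."""

    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.counters = [0] * m

    def _positions(self, item: str):
        # Derive k positions by salting a single hash (illustrative choice).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.counters[p] += 1

    def remove(self, item: str) -> None:
        # Must only be called for items that were previously added.
        for p in self._positions(item):
            self.counters[p] -= 1

    def __contains__(self, item: str) -> bool:
        # A zero at any position proves absence; all non-zero means "probably".
        return all(self.counters[p] > 0 for p in self._positions(item))

    def bit_vector(self):
        """The plain bit array that would be shared with other proxies."""
        return [1 if c > 0 else 0 for c in self.counters]


bf = CountingBloomFilter(m=1024, k=4)
bf.add("http://a.example/x")
bf.add("http://a.example/y")
bf.remove("http://a.example/x")   # cache eviction
```

Only the proxy that owns the cache needs the counters. Peers only need the bit vector, which is why the counters stay local.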

11
Bloom Filters (Contd.)
  • A proxy builds a Bloom filter from the list of
    URLs of its cached documents
  • It sends the bit array plus the specification of
    the hash functions to other proxies
  • Update can happen by sending the whole bit array
    or the changes.
  • A balanced tradeoff between the memory
    requirement and the false positive ratio
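
The memory versus false-positive tradeoff can be estimated with the standard textbook approximation for a Bloom filter holding n items in m bits with k hash functions, p ≈ (1 - e^(-kn/m))^k. The sketch below applies that general formula. It is not a figure taken from the slides, and the function names are hypothetical.

```python
import math


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Approximate false-positive probability for m bits, n items, k hashes."""
    return (1.0 - math.exp(-k * n / m)) ** k


def best_k(m: int, n: int) -> int:
    """Number of hash functions that roughly minimises the rate: (m/n) ln 2."""
    return max(1, round((m / n) * math.log(2)))


# Doubling the bits per cached document lowers the false-positive rate.
rate_8_bits = false_positive_rate(m=8000, n=1000, k=5)
rate_16_bits = false_positive_rate(m=16000, n=1000, k=5)
```

Spending more bits per document buys fewer wasted queries, at the cost of larger summaries to store and to ship between proxies.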

12
Summary Cache Issues
  • When to update?
  • Impact of delayed update
  • Update the summaries only when the percentage of
    new documents has reached a threshold
  • Use false misses as an indicator
  • A delay threshold of 1% to 10% for updating
    summaries results in a tolerable degradation of
    the cache hit ratios
  • Use either broadcast or periodic exchange for
    updating summaries
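
A threshold-based update policy of this kind might look like the sketch below. It assumes the threshold is expressed as the fraction of cached documents that are new since the last summary was sent. The class and method names are hypothetical.

```python
class SummaryUpdater:
    """Decide when a proxy should send a fresh summary to its peers."""

    def __init__(self, threshold: float):
        self.threshold = threshold      # e.g. 0.01 for 1%, 0.10 for 10%
        self.cached = set()
        self.new_since_update = 0

    def on_document_cached(self, url: str) -> bool:
        """Record a newly cached document; return True if an update is due."""
        if url not in self.cached:
            self.cached.add(url)
            self.new_since_update += 1
        return self.new_since_update / len(self.cached) >= self.threshold

    def mark_summary_sent(self) -> None:
        self.new_since_update = 0


updater = SummaryUpdater(threshold=0.10)
for i in range(90):
    updater.on_document_cached(f"old{i}")
updater.mark_summary_sent()
due = [updater.on_document_cached(f"new{i}") for i in range(12)]
```

A higher threshold means fewer update messages but more false misses, because peers keep working from an older summary for longer.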

13
Cache Digests
  • Rousskov and Wessels at NLANR
  • Very similar to Summary Cache
  • Digests allow proxies to make information about
    their cache contents available to peers in a
    compact form
  • A proxy uses digests to identify the neighbours
    likely to have the missed object
  • Digests are also based on Bloom Filters
  • While ICP generates a steady stream of small
    packets, with Bloom filter based approaches,
    transfers occur in high volume bursts

14
Summary vs. Digests
  • Same design objective
  • Same technique (Bloom filters)
  • Same update time policy
  • Different data dissemination policy
  • Summary Cache uses push for updates
  • Cache Digests uses a pull-based approach
  • Cache Digests proposes to "piggyback" the update
    messages in HTTP replies.

15
CRISP
  • Caching and Replication for Internet Service
    Performance
  • A central directory of cached objects
  • Proxies share their caches using this central
    mapping service
  • Proxies notify the mapping service any time they
    add or remove an object from the cache.
  • The updates and probing are done using unicast
    messages.
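
The central directory and its add/remove notifications can be sketched as a small in-memory map. The class and method names are hypothetical, and a real mapping service would receive these calls as unicast network messages rather than local method calls.

```python
class MappingService:
    """Central directory: which proxies currently cache which URL."""

    def __init__(self):
        self.directory = {}   # url -> set of proxy ids

    def notify_add(self, proxy: str, url: str) -> None:
        self.directory.setdefault(url, set()).add(proxy)

    def notify_remove(self, proxy: str, url: str) -> None:
        holders = self.directory.get(url)
        if holders:
            holders.discard(proxy)
            if not holders:
                del self.directory[url]

    def lookup(self, url: str, asking_proxy: str):
        """Return another proxy that holds url, or None on a directory miss."""
        for proxy in sorted(self.directory.get(url, ())):
            if proxy != asking_proxy:
                return proxy
        return None


service = MappingService()
service.notify_add("proxy1", "http://a.example/x")
holder = service.lookup("http://a.example/x", asking_proxy="proxy2")
```

On a local miss, a proxy sends one probe to the mapping service and then fetches the object either from the returned peer or from the origin server.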

16
CRISP (cont.)
17
CRISP (cont.)
  • Obvious Drawbacks
  • Centralized structure
  • Scalability problem
  • Single point of failure
  • CRISP argues
  • Requests to the map are small in size, so many can be
    served
  • The response time of a well-configured mapping
    server is well below the threshold of human
    perception
  • When needed, more mapping servers can be added and
    the URLs can be statically partitioned across the
    servers.
  • Configure the mapping server to run well below its
    saturation point
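
One way to statically partition URLs across several mapping servers is to hash each URL and take the result modulo the number of servers. The slides do not say which partitioning function CRISP uses, so the hash choice, function name, and server names below are assumptions.

```python
import hashlib


def mapping_server_for(url: str, servers: list) -> str:
    """Pick the mapping server responsible for url (static hash partition)."""
    digest = hashlib.sha256(url.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["map0", "map1", "map2"]
owner = mapping_server_for("http://a.example/x", servers)
```

Every proxy computes the same owner for a given URL, so no coordination between mapping servers is needed. The cost is that changing the number of servers reassigns most URLs.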

18
CRISP (cont.)
  • Handling Failures
  • When the mapping server fails, the cooperation
    becomes unavailable
  • Proxies serve their clients in the usual way
  • When a proxy fails, the mapping server notices it
    eventually and marks entries from that cache as
    unavailable
  • When a proxy rejoins, it registers itself and its
    cache objects are restored or re-enabled in the
    directory
  • When a mapping server comes alive from failure,
    the central map is recreated as the caches rejoin
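
The failure-handling rules above can be sketched as a directory that keeps entries for a failed proxy but hides them until the proxy registers again. All names are hypothetical, and how the mapping server actually detects a failed proxy is not covered here.

```python
class FailureAwareDirectory:
    """Directory that masks entries of proxies believed to be down."""

    def __init__(self):
        self.entries = {}          # url -> set of proxy ids
        self.unavailable = set()   # proxies currently marked as failed

    def register(self, proxy: str, urls) -> None:
        """Called when a proxy joins or rejoins; re-enables its entries."""
        self.unavailable.discard(proxy)
        for url in urls:
            self.entries.setdefault(url, set()).add(proxy)

    def mark_failed(self, proxy: str) -> None:
        self.unavailable.add(proxy)

    def holders(self, url: str) -> set:
        """Proxies that hold url and are not marked unavailable."""
        return self.entries.get(url, set()) - self.unavailable


directory = FailureAwareDirectory()
directory.register("proxy1", ["http://a.example/x"])
directory.mark_failed("proxy1")
while_down = directory.holders("http://a.example/x")
directory.register("proxy1", [])          # proxy rejoins
after_rejoin = directory.holders("http://a.example/x")
```

A mapping server that restarts after its own failure begins with an empty directory and rebuilds it as each proxy calls `register` with its current contents.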

19
Summary
  • Cooperative caching improves performance
  • A low-overhead yet highly accurate information-sharing
    mechanism is the prime objective
  • Different approaches have been tried