Title: informed content delivery across adaptive overlay networks
1 informed content delivery across adaptive overlay networks (SIGCOMM 2002)
authors: John Byers, Jeffrey Considine, Michael Mitzenmacher, Stanislav Rost
talk: Ramaprabhu Janakiraman (rama_at_cse.wustl.edu)
3 content delivery across overlay networks
advantages
- no reliance on a central server: scalable and fault-tolerant
- opportunistic downloads from multiple sources
- rich services: CDNs, multiparty games, streaming
challenges
- asynchrony
- heterogeneity
- transience
- scalability
4 stateful solutions vs. digital fountain solutions
limitations of stateful solutions
- needs per-connection state at endpoints
- parallel downloads need to be carefully orchestrated
- partial content is correlated, fewer useful peers
the digital fountain approach
- use error-correcting codes to produce encoded data
- send random encoded data instead of source data
- any commensurate subset suffices to recover original data
benefits of a digital fountain approach
- continuous encoding
- time-invariance: stateless generation of new encoded data
- tolerance: any sampling pattern by receivers
- additivity: no orchestration for parallel downloads
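To make the fountain idea concrete, here is a minimal sketch. It assumes a toy random linear code over GF(2), which is not necessarily the code used in the talk; the names `encode_symbol` and `decode` and the block values are hypothetical. Each encoded symbol is the XOR of a random subset of source blocks, and the receiver decodes by Gaussian elimination once it holds enough independent symbols, regardless of which sender produced them.

```python
import random

def encode_symbol(blocks, rng):
    """Return (mask, value): value is the XOR of the source blocks selected by mask."""
    n = len(blocks)
    mask = rng.randrange(1, 1 << n)  # random non-empty subset of block indices
    value = 0
    for i in range(n):
        if (mask >> i) & 1:
            value ^= blocks[i]
    return mask, value

def decode(symbols, n):
    """Gaussian elimination over GF(2). Returns the n blocks, or None if rank < n."""
    pivots = {}  # pivot bit -> (mask, value); no other row has that bit set
    for mask, value in symbols:
        # Clear every existing pivot bit from the incoming symbol.
        for bit, (pm, pv) in pivots.items():
            if (mask >> bit) & 1:
                mask ^= pm
                value ^= pv
        if mask == 0:
            continue  # linearly dependent on what we already hold: redundant
        bit = (mask & -mask).bit_length() - 1  # lowest set bit becomes the new pivot
        # Clear the new pivot bit from the rows stored so far.
        for b, (pm, pv) in list(pivots.items()):
            if (pm >> bit) & 1:
                pivots[b] = (pm ^ mask, pv ^ value)
        pivots[bit] = (mask, value)
    if len(pivots) < n:
        return None
    return [pivots[i][1] for i in range(n)]

# Hypothetical source data: eight one-byte blocks.
blocks = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88]
# Two independent, uncoordinated senders (additivity: no orchestration needed).
senders = [random.Random(1), random.Random(2)]

received, decoded = [], None
while decoded is None and len(received) < 200:
    rng = senders[len(received) % 2]
    received.append(encode_symbol(blocks, rng))
    decoded = decode(received, len(blocks))
```

The receiver never tells a sender which symbols it already has; it simply keeps collecting until the system of equations has full rank.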
6 reconciliation and informed delivery
need for reconciliation
- encoded symbols chosen from large, unordered universe
- senders with partial content may only send extant symbols
- goal: avoid transmitting redundant symbols
conditions for suitable applications
- use of rich overlay with multiple connections per peer
- senders have partial and correlated content
- working sets of symbols from large unordered universe
approaches proposed
- coarse-grained reconciliation
- speculative transfers
- fine-grained reconciliation
7 estimating the worth of a peer
[figure: peers B and A, labelled "wants to download from", with working sets SB and SA]
definitions
- containment of SB in SA: fraction of B's symbols that overlap with A, |SA ∩ SB| / |SB|
- resemblance of SA and SB: fraction of total symbols that overlap, |SA ∩ SB| / |SA ∪ SB|
in this example, containment of SB in SA is 1/2; their resemblance is 1/3
want to download from peers with low containment
and resemblance...
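The two definitions can be checked with a short sketch. The sets below are hypothetical, chosen only so that they reproduce the 1/2 and 1/3 quoted on the slide; the actual sets in the slide's figure are not in the transcript.

```python
from fractions import Fraction

def containment(s_b, s_a):
    """Fraction of B's symbols that A already holds: |SA ∩ SB| / |SB|."""
    return Fraction(len(s_a & s_b), len(s_b))

def resemblance(s_a, s_b):
    """Fraction of all symbols held by either peer that both hold: |SA ∩ SB| / |SA ∪ SB|."""
    return Fraction(len(s_a & s_b), len(s_a | s_b))

# Hypothetical working sets: two shared symbols, two unique to each peer.
s_a = {1, 2, 3, 4}
s_b = {3, 4, 5, 6}

print(containment(s_b, s_a))  # 1/2
print(resemblance(s_a, s_b))  # 1/3
```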
8 approximate set similarity
random sampling
- peer A sends a sample KA of size k to peer B
- peer B estimates the resemblance as |SB ∩ KA| / k
- B needs to search SB for each element of KA
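The sampling estimate can be sketched as follows, assuming KA is a uniform random sample of SA; the function name `sample_overlap` and the example sets are hypothetical. Note that under this assumption the ratio estimates |SA ∩ SB| / |SA|, a containment-style quantity rather than the resemblance defined earlier.

```python
import random

def sample_overlap(s_a, s_b, k, rng):
    """Estimate |SA ∩ SB| / |SA| from a k-element uniform sample of SA."""
    k_a = rng.sample(sorted(s_a), k)        # the sample KA that A would send to B
    hits = sum(1 for x in k_a if x in s_b)  # B looks up each sampled element in SB
    return hits / k

# Hypothetical working sets: half of A's symbols are also held by B.
s_a = set(range(0, 1000))
s_b = set(range(500, 1500))

estimate = sample_overlap(s_a, s_b, k=100, rng=random.Random(0))
```

With k = 100 the estimate should land near 0.5, at the cost of 100 set lookups on B's side.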
min-wise sketches
- fraction of matches is an unbiased estimator of resemblance
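A min-wise sketch can be illustrated as below. This is a sketch under an assumption: salted BLAKE2b hashes stand in for the random permutations of the symbol universe, which may differ from the permutations used in the talk. The function names and example sets are hypothetical.

```python
import hashlib

def _h(salt, x):
    """Salted 64-bit hash, standing in for the salt-th random permutation."""
    data = f"{salt}:{x}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def minwise_sketch(symbols, k):
    """For each of k permutations, keep only the minimum image of the set."""
    return [min(_h(i, x) for x in symbols) for i in range(k)]

def estimate_resemblance(sketch_a, sketch_b):
    """Fraction of positions at which the two sketches agree."""
    matches = sum(1 for a, b in zip(sketch_a, sketch_b) if a == b)
    return matches / len(sketch_a)

# Hypothetical working sets with true resemblance 200 / 600 = 1/3.
s_a = set(range(0, 400))
s_b = set(range(200, 600))

sk_a = minwise_sketch(s_a, k=256)
sk_b = minwise_sketch(s_b, k=256)
estimate = estimate_resemblance(sk_a, sk_b)
```

Each peer exchanges only k values, independent of the size of its working set, and the comparison needs no lookups into the full sets.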
10 exact reconciliation
approach 1
- send a list of symbols
- communication complexity is O(|SA| log u)
approach 2
- use hashing into a universe of [0, h)
- communication complexity is O(|SA| log h) instead
- chance of false positives
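Approach 2 might look like the following sketch, in which A summarises SA as a set of hash values and B uses that summary to pick symbols A lacks. The hash function, the value of h, and the function names are all assumptions made for illustration.

```python
import hashlib

def h(x, h_size):
    """Hash a symbol into the range [0, h_size)."""
    digest = hashlib.blake2b(str(x).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % h_size

def summarize(s_a, h_size):
    """What A sends: one hash per symbol, about log2(h_size) bits each."""
    return {h(x, h_size) for x in s_a}

def useful_for_a(s_b, a_hashes, h_size):
    """Symbols of B whose hash is absent from A's summary: A certainly lacks these."""
    return {x for x in s_b if h(x, h_size) not in a_hashes}

# Hypothetical working sets and hash range.
H = 2 ** 32
s_a = set(range(0, 200))
s_b = set(range(100, 300))

found = useful_for_a(s_b, summarize(s_a, H), H)
```

A false positive here means a symbol that A lacks collides with the hash of one of A's symbols, so B wrongly withholds it; a smaller h shortens the message but makes such collisions more likely.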
approach 3
- throw more math at it
- if discrepancy d is the number of symbols unique to SA or SB, then communication complexity is O(d log u) or O(d log h)
- computational complexity is O(d^3), or O(d) with more messages
in practice, exact determination not necessary!
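11 approximate reconciliation: a Bloom filter approach
what are Bloom filters?
[figure: a Bloom filter bit array indexed 0 to m-1; element x is mapped to positions by hash functions 1, 2, 3, ..., k]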
properties
- possibility of false positives
- m (# bits) and k (# hashes) can be optimally chosen
- in optimal case, false positive rate is (1/2)^k
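The (1/2)^k figure follows from the standard Bloom filter analysis. The derivation below is a sketch under the usual assumptions: n elements are inserted (n is a symbol introduced here, not on the slide) and the k hash functions behave as independent uniform hashes.

```latex
f \approx \left(1 - e^{-kn/m}\right)^{k},
\qquad
k_{\text{opt}} = \frac{m}{n}\ln 2
\;\Rightarrow\;
f_{\text{opt}} = \left(\tfrac{1}{2}\right)^{k_{\text{opt}}} \approx (0.6185)^{m/n}
```

At the optimum about half the bits of the filter are set, which is why each of the k probes hits a set bit with probability 1/2.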
approach
- A inserts its elements into Bloom filter and sends it to B
- B checks each of its elements against filter
O(|SB|) time needed to find out the difference!
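The exchange above can be sketched with a small Bloom filter. This is a minimal illustration and not the implementation from the talk: the class, the salted BLAKE2b hashing, and the parameters m = 8000 and k = 6 are assumptions.

```python
import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray(m)  # one byte per bit, for clarity rather than compactness

    def _positions(self, x):
        """The k array positions for x, one per salted hash."""
        for i in range(self.k):
            d = hashlib.blake2b(f"{i}:{x}".encode(), digest_size=8).digest()
            yield int.from_bytes(d, "big") % self.m

    def add(self, x):
        for p in self._positions(x):
            self.bits[p] = 1

    def __contains__(self, x):
        # True for every inserted element; occasionally true for others (false positive).
        return all(self.bits[p] for p in self._positions(x))

def symbols_a_lacks(bf_a, s_b):
    """B's side: one filter check per element of SB."""
    return [x for x in s_b if x not in bf_a]

# Hypothetical working sets; m/n = 8 bits per element, k close to 8 * ln 2.
s_a = set(range(0, 1000))
s_b = set(range(500, 1500))

bf_a = BloomFilter(m=8000, k=6)
for x in s_a:
    bf_a.add(x)

missing = symbols_a_lacks(bf_a, s_b)
```

Every symbol in `missing` is guaranteed to be absent from SA. A few symbols that A lacks may be omitted because of false positives, which is the approximate part of the reconciliation.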
14 performance of approximate reconciliation
- d is discrepancy (elements unique to either SA or SB)
- approach helps when containment is large
16 experimental results
scenarios
- peer-to-peer reconciliation
- peer augmented downloads
- parallel downloads from peers
collaboration methods
- uninformed: send random symbols
- speculative: use min-wise summary to recode content
- reconciled: use Bloom filter or A.R.T. (approximate reconciliation tree) to filter out useless symbols
definitions
- slack: unique symbols in system (fraction of minimum needed)
- overhead: actual symbols needed (fraction of minimum needed)
17 overhead of p2p reconciliation
18 overhead of peer-augmented downloads
19 overhead of p2p collaboration
- why do reconciled downloads have so much overhead?
- how can we fix this?
20 collaboration with periodic updates
- periodically reconcile with sending peers
- generates extra bandwidth overhead
- optimal total bandwidth: reconcile after every 5-10% of download
- marked improvement in performance, in line with other cases
21 conclusions
- advocate a digital fountain approach for flexible content sharing
- main drawback is need to reconcile correlated working sets
- use reconciliation techniques here for informed collaboration
thank you...questions?