Insights from modeling P2P content distribution

1
Insights from modeling P2P content distribution
  • Dah Ming Chiu, Chinese University of Hong Kong
  • Summer, 2007

2
What is P2P content distribution?
  • BitTorrent, CoolStreaming, PPLive
  • Sending a file or video stream to a large number
    of peers, with the help of the peers themselves
  • Producing the most Internet traffic today
  • What IP multicast tried to support
  • Application layer multicast
  • Leverage peers' uplink capacity
  • Leverage peers' storage
  • Modeling these systems yields insights

3
Some basic questions
  • Scalability: what happens when the number of
    peers increases
  • Capacity: makespan for downloading a file, or
    rate for streaming
  • Price of incentive: keeping high-contribution
    peers around longer improves system performance
  • The last piece problem, and the time it takes to
    get each piece in file sharing
  • The effect of the piece selection strategy on
    continuity and delay in streaming

4
Three kinds of models
  • Capacity models
  • Bandwidth constraint: X can help Y if X has
    enough uplink capacity
  • What is the maximum rate at which each peer gets
    served?
  • Population models
  • Content availability constraint: X can help Y if
    X has the content Y needs
  • What is the population of each type of peer, and
    hence the maximum rate?
  • Peers' content model
  • Assume homogeneous peers
  • What is in a peer's buffer for playback?

5
The uplink sharing model
  • Mundinger's thesis
  • Assume
  • only uplinks can be bottlenecks
  • heterogeneous peers (with different uplink
    bandwidths)
  • What is the minimum makespan?

6
Uplink sharing model: related work
  • Previous work
  • The broadcast problem (unidirectional telephone
    model)
  • The broadcast problem (bidirectional telephone
    model)
  • Simultaneous send/receive model
  • Uplink sharing model: heterogeneous peers
  • Solve a finite number of mixed integer linear
    programming (MILP) problems (for a small number
    of file parts M)
  • Fluid limit solution (for large M)

J Mundinger et al, Analysis of Peer-to-peer File
Dissemination amongst Users of Different Upload
Capacities, IFIP Performance, Nov 2005.

7
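Uplink sharing model: main result
  • Maximum throughput
  • $R = \min\{ C_0,\ (C_0 + \sum_{i=1}^{n} C_i)/n \}$, where $C_0$ is
    the server's uplink capacity and $C_i$ is the uplink capacity
    of peer $i$
  • R is clearly an upper bound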
  • The bound can be achieved if content is fluid
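
Two counting arguments explain why R is an upper bound. The derivation below is a sketch: it reads $C_0$ as the server's uplink capacity and $C_1, \dots, C_n$ as the peers' uplink capacities (inferred from the slides, not stated in them), and assumes every peer receives the content at the same rate R.

```latex
\begin{align*}
R &\le C_0
  && \text{(every bit must leave the server at least once)} \\
nR &\le C_0 + \sum_{i=1}^{n} C_i
  && \text{(total download rate cannot exceed total upload rate)} \\
\Rightarrow\quad
R &\le \min\Bigl\{ C_0,\ \frac{C_0 + \sum_{i=1}^{n} C_i}{n} \Bigr\}
\end{align*}
```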

8
Achieving the upper bound
  • Use multiple spanning trees
  • One 1-hop tree (the server sends directly to all
    peers)
  • n 2-hop trees (the server sends to one peer, which
    forwards the content to the rest of the peers)
  • Use the 2-hop trees as much as possible, until
    either
  • (1) the server has used up its bandwidth, or
  • (2) the peers have used up their uplink bandwidth
  • In case (2), the server uses the rest of its
    bandwidth for the 1-hop tree
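
The tree construction on this slide can be checked numerically. The following is a minimal sketch, assuming a fluid model, a server uplink `c0`, and peer uplinks `peer_caps`. The function names and the proportional split in case (1) are illustrative choices, not taken from the cited thesis.

```python
def tree_rates(c0, peer_caps):
    """Return (one_hop_rate, two_hop_rates) for the multi-tree scheme.

    c0        -- server uplink capacity (assumed meaning of C0)
    peer_caps -- uplink capacities C1..Cn of the n peers (n >= 2)
    two_hop_rates[i] is the rate the server sends to peer i, which peer i
    then forwards to the other n - 1 peers.
    """
    n = len(peer_caps)
    if n < 2:
        raise ValueError("need at least two peers")
    total = sum(peer_caps)
    if (n - 1) * c0 <= total:
        # Case (1): the server runs out first. Split its capacity across
        # the 2-hop trees in proportion to each relay peer's uplink, so
        # peer i needs (n - 1) * rate_i <= C_i.
        scale = c0 / total if total > 0 else 0.0
        return 0.0, [scale * c for c in peer_caps]
    # Case (2): the peers run out first. Each peer relays as much as its
    # uplink allows; the server's leftover capacity feeds the 1-hop tree.
    two_hop = [c / (n - 1) for c in peer_caps]
    leftover = c0 - sum(two_hop)
    return leftover / n, two_hop


def per_peer_rate(c0, peer_caps):
    """Rate each peer receives: the 1-hop stream plus all n 2-hop streams."""
    one_hop, two_hop = tree_rates(c0, peer_caps)
    return one_hop + sum(two_hop)


def upper_bound(c0, peer_caps):
    """R = min{C0, (C0 + sum Ci) / n}."""
    return min(c0, (c0 + sum(peer_caps)) / len(peer_caps))


if __name__ == "__main__":
    for c0, caps in [(10.0, [1.0, 2.0, 3.0]), (2.0, [4.0, 4.0, 4.0])]:
        print(c0, caps, per_peer_rate(c0, caps), upper_bound(c0, caps))
```

In both example configurations the constructed rate coincides with the bound, which is what "the bound can be achieved if content is fluid" asserts.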

9
Some subsequent work
  • Use the maximum capacity to show that network
    coding cannot further help throughput
  • DM Chiu et al, Can Network Coding Help in
    P2P Networks, invited talk at 2nd Network Coding
    workshop, April 2006.
  • For two (or more) classes of users, show the
    trade-off of fairness and performance
  • B Fan, DM Chiu, JCS Lui, The delicate
    trade-off in BitTorrent-like file sharing
    protocol design, IEEE ICNP, Oct 2006.
  • For streaming, show how to allocate the total
    uplink bandwidth to two classes of peers
  • R Kumar, Y Liu, K Ross, Stochastic Fluid
    Theory for P2P Streaming Systems, IEEE Infocom,
    May 2007.

10
A generalized queueing system
  • Peers are both customers and servers
  • Peers arrive at rate $\lambda$, but the service rate
    is given by the maximum rate R formula
  • For homogeneous peers, it is like an M/M/∞ queue:
    infinitely scalable, with performance depending on
    each peer's contribution
  • For heterogeneous peers, the situation is more
    interesting

11
Stochastic uplink sharing model
  • Two or more classes of peers
  • Consider different ways to share uplink capacity
  • Allocation that minimizes average downloading time
  • Max-min allocation
  • Fair allocation
  • "Fair" means: more contribution, more allocated

12
Mean value analysis
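  • $U_i \le D_i$
  • $u_i \le U_i$
  • $d_i \le D_i$
  • $\sum d_i = \sum u_i$ over all peers, i.e.
    $\sum_i N_i d_i = \sum_i N_i u_i$ over classes
  • $N_i = p_i \lambda / d_i$
  • $\sum_i u_i N_i = \lambda \;\Rightarrow\; \sum_i p_i u_i / d_i = 1$
  • $T = \sum_i N_i / \lambda = \sum_i p_i / d_i$
  • These equations give the feasible set of $u_i$ and
    $d_i$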
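
A small numeric example makes the mean value relations concrete. The sketch below assumes a unit-size file (so a class downloading at rate d_i stays 1/d_i on average), reads p_i as the fraction of arrivals in class i and λ as the total arrival rate, and uses made-up numbers. The function name is illustrative.

```python
def mean_value_analysis(arrival_rate, fractions, up_rates, down_rates):
    """Return (populations, balance, mean_time) for a unit-size file.

    populations[i] = N_i = p_i * lambda / d_i     (Little's law per class)
    balance        = sum_i p_i * u_i / d_i        (must equal 1 to be feasible)
    mean_time      = T = sum_i p_i / d_i          (average downloading time)
    """
    populations = [p * arrival_rate / d for p, d in zip(fractions, down_rates)]
    balance = sum(p * u / d for p, u, d in zip(fractions, up_rates, down_rates))
    mean_time = sum(p / d for p, d in zip(fractions, down_rates))
    return populations, balance, mean_time


if __name__ == "__main__":
    # Two classes, half the arrivals each; class 0 uploads twice as fast.
    # Here every class downloads exactly as fast as it uploads.
    pops, balance, t = mean_value_analysis(
        arrival_rate=10.0,
        fractions=[0.5, 0.5],
        up_rates=[2.0, 1.0],
        down_rates=[2.0, 1.0],
    )
    print(pops, balance, t)
```

With these numbers the balance condition holds exactly, and T equals the total population divided by the arrival rate, as Little's law requires.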

13
The performance vs fairness tradeoff
  • Performance: average downloading time
  • Fairness: variance of share ratio
  • Share ratio: $u_i/d_i$
  • Better downloading time implies a more uneven share
    ratio
  • Pareto trade-off curve
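
The trade-off can be illustrated with a one-parameter family of allocations. This is a hypothetical construction for illustration only, not the mechanism analyzed in the cited work: `alpha = 1` gives every class a download rate equal to its upload rate (equal share ratio), `alpha = 0` gives every class the same download rate, and the rates are rescaled so that the balance condition sum_i p_i u_i / d_i = 1 holds. The variance is weighted by the arrival fractions p_i, which is also an assumption.

```python
def allocation(fractions, up_rates, alpha):
    """Download rates d_i proportional to alpha*u_i + (1 - alpha)*mean(u),
    rescaled so that sum_i p_i * u_i / d_i = 1 (upload equals download)."""
    mean_u = sum(p * u for p, u in zip(fractions, up_rates))
    raw = [alpha * u + (1 - alpha) * mean_u for u in up_rates]
    scale = sum(p * u / r for p, u, r in zip(fractions, up_rates, raw))
    return [scale * r for r in raw]


def evaluate(fractions, up_rates, alpha):
    """Return (mean downloading time, variance of share ratio u_i/d_i)."""
    down = allocation(fractions, up_rates, alpha)
    mean_time = sum(p / d for p, d in zip(fractions, down))
    ratios = [u / d for u, d in zip(up_rates, down)]
    mean_ratio = sum(p * r for p, r in zip(fractions, ratios))
    variance = sum(p * (r - mean_ratio) ** 2 for p, r in zip(fractions, ratios))
    return mean_time, variance


if __name__ == "__main__":
    fractions, up_rates = [0.5, 0.5], [2.0, 1.0]
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
        t, var = evaluate(fractions, up_rates, alpha)
        print(f"alpha={alpha:.2f}  T={t:.4f}  share-ratio variance={var:.4f}")
```

In this example the equal-share-ratio end has zero variance but the larger average downloading time, while the equal-rate end is faster on average and less even.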

14
What does BitTorrent do?
  • Tit-for-tat: equal share ratio (roughly)
  • Optimistic unchoking: random
  • Random selection gives max-min sharing (roughly)
  • BT peer selection is 4/5 tit-for-tat and 1/5
    random
  • A different ratio can be used to obtain a different
    trade-off

15
Three kinds of models - continued
  • Capacity models
  • Population models
  • Peers' content model

16
Coupon replication model
  • A collection of coupons
  • Each peer arrives with one coupon initially
  • Each peer collects additional coupons till
    completion
  • A Markov Chain
  • Can solve it for special cases
  • Get coupon from peer in same layer (layered)
  • Random (flat)

L Massoulie and M Vojnovic, Coupon Replication
Systems, ACM Sigmetrics 2005, best paper

17
Some results from coupon collection
  • For a scaled-up system
  • Get the population size (of each layer)
  • Get the sojourn time in each layer and the total
    time (via Little's Law)
  • For layered: $T = K + O(1)$
  • For random (flat): $T \le K + O(\sqrt{K})$
  • Note: in this model there is no uplink bandwidth
    constraint

18
Extension to the Coupon paper
  • MH Lin et al, Stochastic Analysis of File
    Swarming Systems, to appear IFIP Performance
    2007
  • Obtained a tighter bound: $T = K + \log(K) + O(1)$
  • Also tighter results for the time in each layer,
    $T_i$, and for the last piece problem
  • When peers have an uplink bandwidth constraint, the
    sojourn time bound is increased by a factor of
    $1/(1 - e^{-1})$

19
Three kinds of models - continued
  • Capacity models
  • Population models
  • Peers' content model

20
Simple peer model
  • M homogeneous peers
  • Each has a playback buffer
  • In each time slot, the server uploads to one peer,
    each peer being chosen with probability 1/M
  • Without the P2P network, continuity $= p(n) = 1/M$

[Figure: the server uploads to each of the M peers with probability 1/M; each peer plays back from its buffer]

YP Zhou et al, A simple model for analyzing P2P
streaming protocols, to appear ICNP 2007

21
Sliding window
  • Each peer's buffer is a sliding window
  • In each time slot, each peer downloads from a
    random neighbor
  • $q(i)$: the probability that Buf[i] gets filled

[Figure: the sliding window at time t and at time t+1; p(1) = 1/M, p(n) = ?]

  • $p(1) = 1/M$, $p(i+1) = p(i) + q(i)$
  • $q(i) = w(i)\,h(i)\,s(i)$
  • $w(i)$: the peer wants to fill Buf[i]
  • $h(i)$: the selected peer has the content for
    Buf[i]
  • $s(i)$: Buf[i] is the one picked by the chunk
    selection strategy

22
Chunk selection strategies
  • Greedy
  • try to fill the empty buffer closest to playback
  • Rarest First
  • try to fill the empty buffer for the newest
    chunk
  • since p(i) is an increasing function of i, the
    newest chunk is also the rarest, hence Rarest First

23
Chunk selection strategy - cont
  • Greedy
  • $p(i+1) = p(i) + p(i)\,(1 - p(i))\,(1 - p(1) - p(n) + p(i+1))$
  • Rarest first
  • $p(i+1) = p(i) + p(i)\,(1 - p(i))^2$
  • Also studied
  • continuous forms for these difference equations
  • simulation

24
Which strategy is better?
  • What do you mean by better?
  • Playback continuity: $p(n)$ as large as possible
  • Start-up latency: $\sum_i p(i)$ as small as possible
  • Given buffer size (n) and relatively large peer
    population (M)
  • Rarest first is better in continuity!
  • Greedy is better in start-up latency

25
Numerical result
  • $M = 1000$
  • $n = 40$
  • In simulation,
  • neighbors = 60
  • At most 2 uploads in each time slot

26
Mixed strategy
  • Partition the buffer into $[1, m]$ and $[m+1, n]$
  • Use RF for $[1, m]$ first
  • If no chunks are available for download by RF, use
    Greedy for $[m+1, n]$
  • The difference equations become
  • $p(1) = 1/M$
  • $p(i+1) = p(i) + p(i)\,(1 - p(i))^2$
    for $i = 1, \dots, m-1$
  • $p(i+1) = p(i) + p(i)\,(1 - p(i))\,(1 - p(m) - p(n) + p(i+1))$
    for $i = m, \dots, n-1$
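
The mixed recursion can be evaluated the same way as the pure strategies. The sketch below assumes the reading p(i+1) = p(i) + p(i)(1 - p(i))^2 for i < m and p(i+1) = p(i) + p(i)(1 - p(i))(1 - p(m) - p(n) + p(i+1)) for i >= m, and resolves the unknown p(n) by bisection. The bisection and the function names are illustrative choices, and m = 10 is an arbitrary example value. Setting m = 1 reproduces the Greedy equation and m = n reproduces Rarest First.

```python
def mixed_given_guess(M, n, m, guess):
    """Iterate the mixed recursion with p(n) replaced by a guessed value."""
    p = [1.0 / M]
    for _ in range(m - 1):            # i = 1 .. m-1: Rarest First part
        cur = p[-1]
        p.append(cur + cur * (1 - cur) ** 2)
    c = 1 - p[m - 1] - guess          # 1 - p(m) - p(n)
    for _ in range(n - m):            # i = m .. n-1: Greedy part
        cur = p[-1]
        a = cur * (1 - cur)
        # The Greedy step is linear in p(i+1); solve it in closed form.
        p.append((cur + a * c) / (1 - a))
    return p


def mixed_curve(M, n, m, iterations=60):
    """Return p(1..n), bisecting until the computed p(n) matches the guess."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mixed_given_guess(M, n, m, mid)[-1] > mid:
            lo = mid
        else:
            hi = mid
    return mixed_given_guess(M, n, m, (lo + hi) / 2)


if __name__ == "__main__":
    M, n = 1000, 40
    for label, m in (("Greedy (m=1)", 1), ("Mixed (m=10)", 10), ("RF (m=n)", n)):
        p = mixed_curve(M, n, m)
        print(f"{label:13s} continuity p(n)={p[-1]:.4f}  sum p(i)={sum(p):.2f}")
```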

27
Comparison
  • For different buffer sizes
  • Mixed achieves better continuity than both RF
    and Greedy
  • Mixed has better start-up latency than RF

28
Closer look with simulation
  • Simulate 2000 time slots
  • Continuity is the average of all peers
  • Continuity for Mixed is more consistent, as well
    as the highest

29
Adapting m to unknown population
  • Adjust m so that p(m) achieves a target
    probability (e.g. 0.3)
  • In simulation study, 100 new peers arrive every
    100 slots
  • m adapts to a larger value as population increases
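
The trend in the last bullet can be reproduced offline from the Rarest First recursion. The sketch below computes the smallest m at which the model's p(m) reaches the target, for a known population M. This differs from the slide's setting, where peers adapt m without knowing the population, so it only illustrates why a larger population calls for a larger m. The function name is hypothetical.

```python
def smallest_m_reaching(M, target=0.3, max_m=10_000):
    """Smallest m with p(m) >= target, where p(1) = 1/M and
    p(i+1) = p(i) + p(i) * (1 - p(i))**2 (Rarest First recursion)."""
    p = 1.0 / M
    m = 1
    while p < target and m < max_m:
        p = p + p * (1 - p) ** 2
        m += 1
    return m


if __name__ == "__main__":
    for M in (100, 1000, 10000):
        print(M, smallest_m_reaching(M))
```

Because p(i) roughly doubles per buffer position while it is small, m grows by a few positions for each tenfold increase in M.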

30
Related works
  • CoolStreaming
  • A prototype and experiments, demonstrating that an
    RF-like strategy works well for large-scale P2P
    streaming
  • X Zhang et al, CoolStreaming/DONet: A
    data-driven overlay network for efficient live
    media streaming, IEEE Infocom 2005
  • BiToS
  • Proposed something similar to our Mixed
    strategy, with a simulation study
  • A Vlavianos, M Iliofotou, M Faloutsos, BiToS:
    Enhancing BitTorrent for supporting streaming
    applications, IEEE Infocom 2006

31
The major contributors
  • Yipeng Zhou: 1st year M.Phil student, from USTC
  • Minghong Lin: 1st year M.Phil student, from USTC
  • Bin Fan: also from USTC; finished M.Phil; won a
    generous fellowship to study for a PhD at CMU