Title: Insights from modeling P2P content distribution
1Insights from modeling P2P content distribution
- Dah Ming Chiu, Chinese University of Hong Kong
- Summer, 2007
2What is P2P content distribution?
- BitTorrent, CoolStreaming, PPLive
- Sending a file or video stream to a large number of peers, with the help of the peers themselves
- Produces the most Internet traffic today
- What IP multicast tried to support
- Application layer multicast
- Leverages peers' uplink capacity
- Leverages peers' storage
- Modeling these systems yields insights
3Some basic questions
- Scalability: what happens when the number of peers increases?
- Capacity: makespan for downloading a file, or rate for streaming
- Price of incentive: keeping high-contribution peers around longer improves system performance
- The last piece problem, and the time it takes to get each piece in file sharing
- The effect of piece selection strategy on continuity and delay in streaming
4Three kinds of models
- Capacity models
- Bandwidth constraint: X can help Y if X has enough uplink capacity
- What is the maximum rate each peer gets served?
- Population models
- Content availability constraint: X can help Y if X has the content Y needs
- What is the population of each type of peers, hence max rate?
- Peers content model
- Assume homogeneous peers
- What is in a peer's buffer for playback?
5The uplink sharing model
- Mundinger's thesis
- Assume
- only uplinks can be bottlenecks
- heterogeneous peers (with different uplink bandwidths)
- What is the minimum makespan?
6Uplink sharing model: related work
- Previous work
- The broadcast problem (unidirectional telephone model)
- The broadcast problem (bidirectional telephone model)
- Simultaneous send/receive model
- Uplink sharing model: heterogeneous peers
- Solve a finite number of mixed integer linear programming (MILP) problems (for a small number of file parts M)
- Fluid limit solution (for large M)
J Mundinger et al, Analysis of Peer-to-peer File Dissemination amongst Users of Different Upload Capacities, IFIP Performance, Nov 2005.
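7Uplink sharing model: main result
- Maximum throughput
- $R = \min\{C_0,\ (C_0 + \sum_i C_i)/n\}$, where $C_0$ is the server's uplink capacity, $C_i$ is the uplink capacity of peer $i$, and $n$ is the number of peers
- R is clearly an upper bound
- The bound can be achieved if content is fluid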
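The claim that R is an upper bound can be spelled out in two lines. This is a sketch of the standard argument, reading C_0 as the server's uplink capacity, C_i as peer i's uplink capacity, and n as the number of peers (these readings are assumptions based on the slide's symbols):

```latex
% Every bit a peer receives originates at the server,
% so no peer can be served faster than the server can send:
R \le C_0 .

% All n peers download at rate R, and total download
% cannot exceed the total upload capacity in the system:
n R \le C_0 + \sum_{i=1}^{n} C_i
\quad\Longrightarrow\quad
R \le \frac{C_0 + \sum_{i=1}^{n} C_i}{n} .

% Combining the two constraints:
R \le \min\Bigl\{ C_0,\ \frac{C_0 + \sum_{i=1}^{n} C_i}{n} \Bigr\} .
```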
8Achieving the upper bound
- Use multiple spanning trees
- One 1-hop tree (server sends to all peers)
- n 2-hop trees (server sends to 1 peer, which forwards content to the rest of the peers)
- Use the 2-hop trees as much as possible, until either
- Case 1: the server has used up its bandwidth
- Case 2: the peers have used up their uplink bandwidth
- In case 2, the server uses the rest of its bandwidth for the 1-hop tree
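The tree construction above can be sketched in a few lines of Python. This is an illustrative reading of the slide, not the author's code: the function names, the proportional split in case 1, and the fluid (infinitely divisible) content assumption are all choices made here for the example.

```python
def max_throughput(c0, peer_caps):
    """Upper bound R = min{C0, (C0 + sum of Ci) / n} from the previous slide."""
    n = len(peer_caps)
    return min(c0, (c0 + sum(peer_caps)) / n)


def tree_rates(c0, peer_caps):
    """Return (one_hop_rate, two_hop_rates) for fluid content.

    two_hop_rates[i] is the rate the server sends into the 2-hop tree whose
    relay is peer i; one_hop_rate is the rate the server sends directly to
    every peer in the 1-hop tree. Names and layout are hypothetical.
    """
    n = len(peer_caps)
    if n < 2:
        raise ValueError("need at least two peers to build 2-hop trees")
    total = sum(peer_caps)
    # Peer i forwards its tree's stream to the other n - 1 peers,
    # so tree i can carry at most Ci / (n - 1).
    peer_limit = total / (n - 1)
    if total > 0 and c0 <= peer_limit:
        # Case 1: the server runs out first. Split C0 across the 2-hop
        # trees in proportion to peer capacity (one possible split).
        two_hop = [c0 * c / total for c in peer_caps]
        one_hop = 0.0
    else:
        # Case 2: peers run out first. Saturate every 2-hop tree and
        # spend the server's leftover bandwidth on the 1-hop tree.
        two_hop = [c / (n - 1) for c in peer_caps]
        one_hop = (c0 - peer_limit) / n
    return one_hop, two_hop


if __name__ == "__main__":
    one_hop, two_hop = tree_rates(10.0, [1.0, 2.0, 3.0])
    # Each peer receives every 2-hop stream plus the 1-hop stream.
    print(one_hop + sum(two_hop), max_throughput(10.0, [1.0, 2.0, 3.0]))
```

In both cases the per-peer receive rate, `one_hop + sum(two_hop)`, equals the bound returned by `max_throughput`, which is the sense in which the bound is achieved.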
9Some subsequent work
- Use the maximum capacity to show that network coding cannot further help throughput
- DM Chiu et al, Can Network Coding Help in P2P Networks, invited talk at 2nd Network Coding workshop, April 2006.
- For two (or more) classes of users, show the trade-off of fairness and performance
- B Fan, DM Chiu, JCS Lui, The delicate trade-off in BitTorrent-like file sharing protocol design, IEEE ICNP, Oct 2006.
- For streaming, show how to allocate the total uplink bandwidth to two classes of peers
- R Kumar, Y Liu, K Ross, Stochastic Fluid Theory for P2P Streaming Systems, IEEE Infocom, May 2007.
10A generalized queueing system
- Peers are both customers and servers
- Peers arrive with rate lambda, but the service rate is given by the maximum rate R formula
- For homogeneous peers, it is like an M/M/inf queue: infinitely scalable; performance depends on each peer's contribution
- For heterogeneous peers, the situation is more interesting
11Stochastic uplink sharing model
- Two or more classes of peers
- Consider different ways to share uplink capacity
- Allocation that minimizes avg downloading time
- Max-min allocation
- Fair allocation
- fair: more contribution, more allocated
12Mean value analysis
- $U_i \le D_i$
- $u_i \le U_i$
- $d_i \le D_i$
- ?di ?ui
- $N_i = p_i \lambda / d_i$
- $\sum_i u_i N_i = \lambda \ \Rightarrow\ \sum_i p_i u_i / d_i = 1$
- $T = \sum_i N_i / \lambda = \sum_i p_i / d_i$
- These equations give the feasible set of $u_i$ and $d_i$
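A minimal numeric sketch of these mean-value equations follows. It assumes a reading the slide does not state explicitly: lambda is the total arrival rate, p_i is the fraction of arrivals in class i, u_i and d_i are the per-peer upload and download rates of class i, and the file size is normalized to 1. The function name is hypothetical.

```python
def mean_value(lam, p, u, d):
    """Evaluate the mean-value equations for given per-class rates.

    Returns (N, T, balance):
      N[i]    = p[i] * lam / d[i]       class-i population (Little's law)
      T       = sum(N) / lam            average downloading time
      balance = sum(u[i] * N[i]) / lam  equals 1 when the rates are feasible
    """
    N = [pi * lam / di for pi, di in zip(p, d)]
    T = sum(N) / lam
    balance = sum(ui * ni for ui, ni in zip(u, N)) / lam
    return N, T, balance


if __name__ == "__main__":
    # Two equally likely classes; both download at rate 2.5,
    # while uploading at rates 1 and 4 respectively.
    N, T, balance = mean_value(lam=2.0, p=[0.5, 0.5], u=[1.0, 4.0], d=[2.5, 2.5])
    print(N, T, balance)
```

A pair (u, d) is in the feasible set when `balance` is 1 and each rate respects its capacity limit.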
13The performance vs fairness tradeoff
- Performance: avg downloading time
- Fairness: variance of share ratio
- Share ratio: $u_i/d_i$
- Better downloading time implies a more uneven share ratio
- Pareto trade-off curve
14What does BitTorrent do?
- Tit-for-tat: equal share ratio (roughly)
- Optimistic unchoking: random
- max-min sharing (roughly)
- BT peer selection is 4/5 tit-for-tat and 1/5 random
- Possible to use a different ratio for a different trade-off
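The two extremes of the trade-off can be compared numerically. The sketch below is an illustration under stated assumptions, not the analysis from the cited paper: peers upload at full capacity U_i, download capacity is never the bottleneck, the file size is 1, and the variance of the share ratio is weighted by the arrival fractions p_i. Function names are hypothetical.

```python
def avg_download_time(p, d):
    """T = sum_i p_i / d_i."""
    return sum(pi / di for pi, di in zip(p, d))


def share_ratio_variance(p, u, d):
    """Variance of u_i / d_i across classes, weighted by p_i (an assumption)."""
    ratios = [ui / di for ui, di in zip(u, d)]
    mean = sum(pi * r for pi, r in zip(p, ratios))
    return sum(pi * (r - mean) ** 2 for pi, r in zip(p, ratios))


def tit_for_tat(U):
    """Equal share ratio: every class downloads exactly what it uploads."""
    return list(U), list(U)  # (u, d)


def equal_download(p, U):
    """Max-min-like sharing: every class gets the same download rate.

    Feasibility (sum_i p_i * u_i / d_i = 1) with u_i = U_i gives
    d = sum_i p_i * U_i.
    """
    d = sum(pi * ui for pi, ui in zip(p, U))
    return list(U), [d] * len(U)


if __name__ == "__main__":
    p, U = [0.5, 0.5], [1.0, 4.0]
    for name, (u, d) in (("tit-for-tat", tit_for_tat(U)),
                         ("equal download", equal_download(p, U))):
        print(name, avg_download_time(p, d), share_ratio_variance(p, u, d))
```

With these inputs, equal download rates give the shorter average downloading time while tit-for-tat gives zero variance in share ratio, which is the direction of the trade-off described on the previous slide.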
15Three kinds of models - continued
- Capacity models
- Population models
- Peers content model
16Coupon replication model
- A collection of coupons
- Each peer arrives with one coupon initially
- Each peer collects additional coupons till completion
- A Markov chain
- Can solve it for special cases
- Get coupon from peer in same layer (layered)
- Random (flat)
L Massoulie and M Vojnovic, Coupon Replication Systems, ACM Sigmetrics 2005, best paper
17Some results from coupon collection
- For scaled up system
- Get population size (of each layer)
- Get sojourn time in each layer and total time (via Little's Law)
- For layered, $T = K + O(1)$
- For random, 1
- $T \le K + O(\sqrt{K})$
- Note: in this model, there is no uplink bandwidth constraint
18Extension to the Coupon paper
- MH Lin et al, Stochastic Analysis of File Swarming Systems, to appear IFIP Performance 2007
- Obtained a tighter bound on T of $K + \log(K) + O(1)$
- Also tighter results for the time in each layer, $T_i$, and the last piece problem
- When peers have an uplink bandwidth constraint, the sojourn time bound is increased by a factor of $1/(1 - e^{-1})$
19Three kinds of models - continued
- Capacity models
- Population models
- Peers content model
20Simple peer model
- M homogeneous peers
- Each has a playback buffer
- In each time slot, the server uploads a chunk to one peer; each peer is chosen with probability 1/M
- Without P2P network
- continuity $p(n) = 1/M$
[Figure: the server uploads to each of the M peers with probability 1/M; each peer's buffer feeds playback]
YP Zhou et al, A simple model for analyzing P2P streaming protocols, to appear ICNP 2007
21Sliding window
- Each peer's buffer is a sliding window
- In each time slot, each peer downloads from a random neighbor
- q(i): the probability that Buf[i] gets filled
[Figure: the sliding window at time t and at time t+1, with p(1) = 1/M and p(n) = ?]
- $p(1) = 1/M$, $p(i+1) = p(i) + q(i)$
- $q(i) = w(i)\,h(i)\,s(i)$
- w(i): the peer wants to fill Buf[i]
- h(i): the selected peer has the content for Buf[i]
- s(i): Buf[i] is the position determined by the chunk selection strategy
22Chunk selection strategies
- Greedy
- try to fill the empty buffer position closest to playback
- Rarest First
- try to fill the empty buffer position for the newest chunk
- since p(i) is an increasing function, this means Rarest First
23Chunk selection strategy - cont
- Greedy
- $p(i+1) = p(i) + p(i)\,(1 - p(i))\,(1 - p(1) - p(n) + p(i+1))$
- Rarest first
- $p(i+1) = p(i) + p(i)\,(1 - p(i))^2$
- Also studied
- continuous forms for these difference equations
- simulation
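The two difference equations can be solved numerically with a short script. This is a sketch, not the solver used in the cited paper: the Rarest First recursion runs forward from p(1) = 1/M, while the Greedy equation contains p(n) and p(i+1) on the right-hand side, so the code solves each step for p(i+1) and bisects on the unknown p(n). Function names and the bisection approach are choices made for this example.

```python
def rarest_first(M, n):
    """p(1) = 1/M, p(i+1) = p(i) + p(i) * (1 - p(i))**2. Returns [p(1..n)]."""
    p = [1.0 / M]
    for _ in range(n - 1):
        cur = p[-1]
        p.append(cur + cur * (1.0 - cur) ** 2)
    return p


def _greedy_forward(M, n, pn_guess):
    """Run the Greedy recursion forward for an assumed value of p(n)."""
    p = [1.0 / M]
    c = 1.0 - p[0] - pn_guess
    for _ in range(n - 1):
        cur = p[-1]
        a = cur * (1.0 - cur)
        # p(i+1) = p(i) + a * (c + p(i+1)) is linear in p(i+1); solve it.
        nxt = (cur + a * c) / (1.0 - a)
        p.append(min(1.0, max(0.0, nxt)))  # keep a valid probability
    return p


def greedy(M, n, iterations=100):
    """Bisect on p(n) until the assumed and the computed values agree."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if _greedy_forward(M, n, mid)[-1] > mid:
            lo = mid
        else:
            hi = mid
    return _greedy_forward(M, n, 0.5 * (lo + hi))


if __name__ == "__main__":
    rf, gr = rarest_first(1000, 40), greedy(1000, 40)
    print("continuity p(n):  RF", rf[-1], " Greedy", gr[-1])
    print("sum of p(i):      RF", sum(rf), " Greedy", sum(gr))
```

The last element of each list is the continuity p(n); the sum of the list is the quantity the slides use for start-up latency.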
24Which strategy is better?
- What do you mean by better?
- Playback continuity: $p(n)$ as large as possible
- Start-up latency: $\sum_i p(i)$ as small as possible
- Given buffer size (n) and relatively large peer population (M)
- Rarest first is better in continuity!
- Greedy is better in start-up latency
25Numerical result
- M = 1000
- N = 40
- In simulation,
- neighbors = 60
- Uploads at most 2 in each time slot
26Mixed strategy
- Partition the buffer into [1, m] and [m+1, n]
- Use RF for [1, m] first
- If no chunks are available for download by RF, use Greedy for [m+1, n]
- Difference equations become
- $p(1) = 1/M$
- $p(i+1) = p(i) + p(i)\,(1 - p(i))^2$ for $i = 1, \dots, m-1$
- $p(i+1) = p(i) + p(i)\,(1 - p(i))\,(1 - p(m) - p(n) + p(i+1))$ for $i = m, \dots, n-1$
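The Mixed equations can be solved the same way as the pure strategies. The sketch below is an illustration under assumptions made here: the Rarest First part is iterated forward to obtain p(m), each Greedy step is solved for p(i+1), and the unknown p(n) is found by bisection. The function name `mixed` and the solver are hypothetical, not taken from the cited paper.

```python
def mixed(M, n, m, iterations=100):
    """Solve the Mixed-strategy difference equations. Returns [p(1..n)].

    Positions 1..m follow the Rarest First recursion; positions m..n follow
    the Greedy-style recursion with p(1) replaced by p(m).
    """
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")

    # Rarest First part: p(1), ..., p(m).
    head = [1.0 / M]
    for _ in range(m - 1):
        cur = head[-1]
        head.append(cur + cur * (1.0 - cur) ** 2)
    p_m = head[-1]

    def forward(pn_guess):
        """Greedy part for an assumed p(n): p(m+1), ..., p(n)."""
        p = list(head)
        c = 1.0 - p_m - pn_guess
        for _ in range(n - m):
            cur = p[-1]
            a = cur * (1.0 - cur)
            # p(i+1) appears on both sides; the equation is linear in it.
            nxt = (cur + a * c) / (1.0 - a)
            p.append(min(1.0, max(0.0, nxt)))  # keep a valid probability
        return p

    # Bisect until the assumed p(n) matches the computed p(n).
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if forward(mid)[-1] > mid:
            lo = mid
        else:
            hi = mid
    return forward(0.5 * (lo + hi))


if __name__ == "__main__":
    p = mixed(M=1000, n=40, m=10)
    print("continuity p(n):", p[-1], " sum of p(i):", sum(p))
```

Setting m = n reduces this to pure Rarest First, and m = 1 reduces it to the pure Greedy equation, which gives a quick consistency check against the two earlier recursions.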
27Comparison
- For different buffer sizes
- Mixed achieves better continuity than both RF and Greedy
- Mixed has better start-up latency than RF
28Closer look with simulation
- Simulate 2000 time slots
- Continuity is the average of all peers
- Continuity for Mixed is more consistent, as well as the highest
29Adapting m to unknown population
- Adjust m so that p(m) achieves a target probability (e.g. 0.3)
- In simulation study, 100 new peers arrive every 100 slots
- m adapts to a larger value as population increases
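The effect of population size on m can be illustrated with the model itself. The sketch below is not the adaptive mechanism from the simulation study, where peers do not know M; it assumes M is known and uses the Rarest First recursion to find the smallest m whose p(m) reaches the target. The function name `choose_m` is hypothetical.

```python
def choose_m(M, n, target=0.3):
    """Smallest m in 1..n with p(m) >= target under the RF recursion.

    Uses p(1) = 1/M and p(i+1) = p(i) + p(i) * (1 - p(i))**2.
    Returns n if the target is never reached inside the buffer.
    """
    p = 1.0 / M
    for m in range(1, n + 1):
        if p >= target:
            return m
        p = p + p * (1.0 - p) ** 2
    return n


if __name__ == "__main__":
    for population in (100, 1000, 10000):
        print(population, choose_m(population, n=40))
```

For M = 1000 and n = 40 this gives m = 10, and a larger population never yields a smaller m, which matches the direction of adaptation reported on the slide.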
30Related works
- Coolstreaming
- A prototype and experiments, demonstrating that an RF-like strategy works well for large-scale P2P streaming
- X Zhang et al, CoolStreaming/DONet: A data-driven overlay network for efficient live media streaming, IEEE Infocom 2005
- BiToS
- Proposed something similar to our Mixed strategy, with a simulation study
- A Vlavianos, M Iliofotou, M Faloutsos, BiToS: Enhancing BitTorrent for supporting streaming applications, IEEE Infocom 2006
31The major contributors
Yipeng Zhou: 1st year M.Phil student from USTC
Minghong Lin: 1st year M.Phil student from USTC
Bin Fan: also from USTC; finished M.Phil; won a generous fellowship to study for a PhD at CMU