IEG5270 Advanced Topics in P2P Networking: Modeling and analysis of p2p content distribution algorithms

1
IEG5270 Advanced Topics in P2P Networking
Modeling and analysis of p2p content distribution algorithms
Part 1
  • Dah Ming Chiu
  • Chinese University of Hong Kong

2
Recap
  • The secrets of P2P content distribution
  • Use multiple trees!
  • Tree-based (push)
  • Data-driven (pull)
  • Utilize peers' resources to satisfy goal
  • File downloading (as fast as possible)
  • Streaming (as continuous as possible at given
    playback rate)
  • BitTorrent
  • A working, data-driven, p2p file-downloading
    algorithm
  • Bootstrapping: metainfo, tracker
  • Peer selection algorithm
  • Chunk selection algorithm

3
Models
  • Why do we need models?
  • To understand the limits of a p2p system
  • To understand the key factors, for designing
    better algorithms
  • To help operate a p2p system
  • The complexity of modeling a p2p system
  • It is a large scale distributed system
  • A large number of peers, and each with its own
    state
  • The complexity of modeling the network conditions
  • Links, routers, ISPs, bottlenecks
  • The complexity of modeling the peer dynamics
  • How do they arrive and leave the system
  • Accuracy versus simplicity/usefulness

4
Two types of models
  • Throughput models
  • Try to understand the capacity (maximum
    throughput) of BT-like systems
  • A large number of models in this category
  • Different levels of abstraction
  • Different peer dynamics
  • Different variations of the BT algorithm
  • Exploring different metrics
  • P2P streaming models
  • YP Zhou et al is the first (?)
  • Model the playback buffer to study playback
    performance

5
The uplink sharing model
  • Assume away the network complexity
  • Only the uplink can be the bottleneck in peer
    communications
  • Practically all models make this assumption
  • It's like the Poisson arrival assumption in
    studying a queue
  • The term was coined in Mundinger's thesis

6
Mundinger's thesis
  • Summarized in
  • Optimal Scheduling of P2P File Dissemination
  • To appear in Journal of Scheduling
  • Study the problem using a discrete model
  • Closer to real situation
  • Linkage to previous work on scheduling
  • The results can be easily appreciated using the
    fluid assumption
  • Most other models make the fluid assumption: there
    are many chunks, each very small
  • Key assumptions
  • Uplink sharing model
  • Static peer population

7
Related work
  • The broadcast problem
  • Makespan: the number of rounds for all N nodes
    to receive M messages from a sender
  • Unidirectional communication: 2M + log2(N)
  • Bidirectional communication: M + log2(N)
  • In general, if peer i's uplink capacity is Ci
    (the server's is C0), what is the minimum makespan?
  • In this case, peers serve each other
    asynchronously

8
Convert to a synchronous problem
  • First show minimum makespan can be achieved using
    a synchronous system
  • Upload to one peer at a time
  • Upload chunks in discrete time slots

9
Prove equal capacity case by construction
  • The M=2, N=3 case
  • The first phase of the M>2, N>3 case
  • In the subsequent phase, peers start
    back-filling
  • Construct a strategy to finish all pieces
  • See paper for details
10
Discussion
  • The meaning of the formula
  • For sufficiently large M, the minimum makespan
    changes very slowly with increasing peer
    population
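The formula this slide discusses is not reproduced in the transcript. As a rough illustration only, the sketch below assumes a makespan of the form M + floor(log2 N) time slots, each slot carrying one chunk (1/M of the file). That form, and the function name, are assumptions made for this example, so check them against the paper before relying on them.

```python
import math


def normalized_makespan(num_chunks: int, num_peers: int) -> float:
    """Makespan in units of 'time for one uplink to send the whole file'.

    ASSUMPTION (not stated in the text): the makespan is
    M + floor(log2 N) slots, where one slot transfers one chunk.
    """
    slots = num_chunks + math.floor(math.log2(num_peers))
    return slots / num_chunks


# With M = 1000 chunks, growing N from ten to a million peers
# changes the normalized makespan by well under 2 percent.
for n in (10, 1_000, 1_000_000):
    print(n, normalized_makespan(1000, n))
```

Under this assumption the peer population enters only through a logarithmic term that is divided by M, which is one way to read the remark that the makespan changes very slowly with N once M is large.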

11
The unequal capacity case
  • What is the minimum makespan for peers with
    different uplink capacities?
  • It is a MILP (mixed integer linear programming)
    problem
  • See paper for details
  • The following lemma is used to discretize the
    problem

But there is a simple closed-form solution,
asymptotically when M is very large!
12
Consider simple examples (small N and M)
  • C1 = C2 = ... = CN, but CS can be different
  • Example 1: N=2, M=1, two cases
  • Both peers download from server
  • Peer 1 downloads from server, peer 2 downloads from
    peer 1

13
Example 2
  • N=2, M=2, four cases
  • Everything downloaded from server
  • One peer downloads from server, 2nd peer downloads
    from 1st
  • One peer downloads from server, 2nd peer downloads
    partly from 1st and partly from server
  • Each peer downloads exactly one chunk from
    server, and the other chunk from the other peer

14
Example 2 analysis
  • N=2, M=2, four cases
  • Everything downloaded from server
  • One peer downloads from server, 2nd peer downloads
    from 1st
  • One peer downloads from server, 2nd peer downloads
    partly from 1st and partly from server
  • Each peer downloads exactly one chunk from
    server, and the other chunk from the other peer

15
Minimum makespan for N=M=2
16
The M→∞ case: fluid assumption
  • Instead of makespan, think in terms of throughput
  • Given the uplink bandwidth from the server and
    peers, how do we allocate it so content can reach
    each peer at the (same) maximum rate?
  • Mundinger proved something more general

For us, it is more intuitive, and more
immediately useful to look at the single server
case
17
The simpler/more useful result
  • Given the uplink sharing model (right figure) and
    very large M, the maximum throughput is
    R = min{ C0, (C0 + sum_i(Ci))/n }

This is a special case of Theorem 4.
DM Chiu et al, Can network coding help in P2P
networks, invited paper at 2nd NETCOD workshop,
2006
18
Proof step 1: R is an upper bound
  • C0 is the uplink bandwidth from the server
  • The server must be able to send the content out at
    least once, so C0 is a clear upper bound
  • C0 + sum_j(Cj) is the total uplink bandwidth from
    the server and all peers
  • Each peer must receive the server's content from
    somewhere
  • The total demand is therefore nR, which cannot
    exceed the total supply
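The two bounds in this step (the server must send the content at least once, and the total demand nR cannot exceed the total uplink supply) combine into a single expression, min(C0, (C0 + sum of peer uplinks)/n). A minimal sketch, with a hypothetical function name chosen for this example:

```python
def max_throughput(server_uplink: float, peer_uplinks: list[float]) -> float:
    """Upper bound on the common rate R at which every peer can receive.

    Bound 1: R <= C0, since the server must send the content out once.
    Bound 2: n * R <= C0 + sum(Ci), since demand cannot exceed supply.
    """
    n = len(peer_uplinks)
    return min(server_uplink, (server_uplink + sum(peer_uplinks)) / n)


# Strong server, weak peers: the total-supply bound is the binding one.
print(max_throughput(10, [1, 1, 1, 1]))   # 3.5
# Weak server, strong peers: the server uplink is the binding one.
print(max_throughput(1, [4, 4, 4, 4]))    # 1
```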

19
Proof step 2: Realizing R by construction
  • One 1-hop tree (from sender)
  • N 2-hop trees (one via each peer)
  • Assign rates optimally, satisfying the uplink
    constraints
  • Two cases
  • C0 > C/(n-1), where C = sum_i(Ci)
  • C0 < C/(n-1)

20
Maximum rate achieved
  • Case 1: C0 > C/(n-1), where C = sum_i(Ci)
  • Assign rate Ci/(n-1) to the ith 2-hop spanning
    tree
  • The ith spanning tree can deliver Ci/(n-1) to
    each other peer
  • Assign rate C0 - C/(n-1) to the 1-hop spanning
    tree
  • The server can deliver (C0 - C/(n-1))/n to each
    peer
  • Each peer receives C/(n-1) + (C0 - C/(n-1))/n = (C0 + C)/n
  • Case 2: C0 < C/(n-1)
  • Assign rate C0*Ci/C to the ith 2-hop spanning
    tree, which can then deliver to all other peers
    at that rate
  • Each peer receives sum_i(C0*Ci/C) = C0
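The construction can be checked numerically. The sketch below assigns the tree rates for both cases and then reports the server load, each peer's upload load, and the rate each peer receives. The function names are hypothetical, and the 1-hop rate is taken to be the server's total leftover bandwidth, split evenly over the n peers, as in Case 1.

```python
def tree_rates(c0, peers):
    """Return (one_hop_total, two_hop) rates for the 1-hop and 2-hop trees."""
    n = len(peers)
    total = sum(peers)
    if c0 > total / (n - 1):
        # Case 1: every peer uplink is fully used; server has bandwidth left.
        two_hop = [ci / (n - 1) for ci in peers]
        one_hop_total = c0 - total / (n - 1)
    else:
        # Case 2: the server is the bottleneck; split C0 in proportion to Ci.
        two_hop = [c0 * ci / total for ci in peers]
        one_hop_total = 0.0
    return one_hop_total, two_hop


def check(c0, peers):
    """Return (server_load, peer_loads, rate_received_by_each_peer)."""
    n = len(peers)
    one_hop_total, two_hop = tree_rates(c0, peers)
    # The server feeds each 2-hop tree once, plus the 1-hop tree.
    server_load = sum(two_hop) + one_hop_total
    # Peer i relays its own tree's stream to the other n-1 peers.
    peer_loads = [(n - 1) * r for r in two_hop]
    # Each peer gets every 2-hop stream, plus its 1/n share of the 1-hop tree.
    received = sum(two_hop) + one_hop_total / n
    return server_load, peer_loads, received


print(check(10, [1, 2, 3]))  # Case 1: received is (10 + 6) / 3
print(check(2, [1, 2, 3]))   # Case 2: received is C0 = 2
```

In both runs the server load does not exceed C0 and no peer uploads more than its own Ci, so the assigned rates respect the uplink constraints.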

21
Summary
  • Simple static model
  • Uplink sharing model
  • Large M: fluid assumption
  • Static population (1 server, n peers)
  • Peers always able to help each other
  • One simple extension: more seeders
  • m seeders and n downloaders
  • Reason: some peers stay around after becoming
    seeders
  • Try to work this out yourself

22
Dynamic peers model
  • Qiu and Srikant, Modeling and performance
    analysis of BitTorrent-like peer-to-peer
    networks, Sigcomm 2004 (Google Scholar 200)

In Mundinger's model, the corresponding parameters take the values:
a constant n, a constant 1, 0, a constant Ci, unbounded, and 0;
the sharing efficiency is assumed to be 1.
23
A fluid model of population
  • System equations
  • In steady state
  • where

η > 0 assumed
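For reference, here is the fluid model as it is usually stated from the cited Qiu and Srikant paper. The notation is recalled from that paper rather than taken from this transcript, so treat it as a sketch to check against the original: x(t) is the number of downloaders, y(t) the number of seeds, λ the arrival rate, θ the rate at which downloaders abort, c the download bandwidth, μ the upload bandwidth, η the sharing efficiency, and γ the seed departure rate.

```latex
\[
\begin{aligned}
\frac{dx}{dt} &= \lambda - \theta\,x(t) - \min\{\,c\,x(t),\ \mu(\eta\,x(t) + y(t))\,\} \\
\frac{dy}{dt} &= \min\{\,c\,x(t),\ \mu(\eta\,x(t) + y(t))\,\} - \gamma\,y(t)
\end{aligned}
\]

% Steady state, from dx/dt = dy/dt = 0:
\[
\bar{x} = \frac{\lambda}{\beta\,(1 + \theta/\beta)}, \qquad
\bar{y} = \frac{\lambda}{\gamma\,(1 + \theta/\beta)}, \qquad
\frac{1}{\beta} = \max\left\{ \frac{1}{c},\ \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right) \right\}
\]
```

The first term inside the max corresponds to the downlink being the bottleneck, the second to the uplink being the bottleneck.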
24
Downloading time
  • Apply Little's Law
  • Average number of peers who will complete
    downloads = (average rate at which downloads are
    completed) x (average downloading time)
25
Observations
  • T is not dependent on the arrival rate λ!
  • When the sharing efficiency η increases, T decreases
  • When the seeders' departure rate γ increases, T
    increases
  • Initially, when the downloading rate c increases, T
    decreases, but once c is large enough (uplinks become
    the bottleneck), c no longer affects T
  • Same observation about the peer uplink rate μ
  • Normally c > μ for a given peer, but if γ < μ,
    then there will be abundant uplink capacity (due
    to helpers), and the downlink will be the bottleneck
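These observations can be reproduced with a few lines of code. The sketch assumes the closed form from the cited Qiu and Srikant paper, T = 1/(θ + β) with 1/β = max{1/c, (1/η)(1/μ - 1/γ)}, which is not spelled out in the text here. The parameter names (eta for sharing efficiency, gamma for the seed departure rate, mu for the peer uplink rate, theta for the abort rate) follow that paper, and the function name is hypothetical.

```python
def avg_download_time(c, mu, gamma, eta, theta=0.0):
    """Average downloading time T under the ASSUMED Qiu-Srikant closed form.

    The arrival rate does not appear as a parameter at all.
    """
    inv_beta = max(1.0 / c, (1.0 / eta) * (1.0 / mu - 1.0 / gamma))
    return 1.0 / (theta + 1.0 / inv_beta)


base = dict(c=1.0, mu=0.2, gamma=0.5, eta=1.0)
print(avg_download_time(**base))                   # uplink-limited baseline
print(avg_download_time(**{**base, "eta": 0.5}))   # lower efficiency: larger T
print(avg_download_time(**{**base, "gamma": 1.0})) # seeds leave faster: larger T
print(avg_download_time(**{**base, "c": 10.0}))    # more downlink: T unchanged
print(avg_download_time(**{**base, "gamma": 0.1})) # gamma < mu: T = 1/c
```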

26
Effect of sharing efficiency η
  • In the capacity equation (Mundinger), η = 1 is
    assumed
  • The analysis so far assumes 0 < η < 1
  • When η = 0 (no sharing by downloaders), two cases
  • If the rate of seeds leaving (γ) is less than the
    rate at which a seed can help a peer download a
    file (μ), then the system is limited by the
    downloading capacity: T = 1/c
  • Otherwise (γ > μ), the equation for y means y will
    decrease to 0, and the system dies.
  • Note, in this model, there is no server, only
    seeders.

27
Stability
  • It is not sufficient to derive a steady state
    population by setting dx/dt = dy/dt = 0
  • Must also show that the system will converge to
    the steady state
  • See detailed discussion of local stability in
    paper
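Convergence can at least be illustrated numerically, although this is no substitute for the stability proof. The sketch below integrates the fluid model with forward Euler steps, assuming the Qiu and Srikant equations dx/dt = λ - θx - min{cx, μ(ηx + y)} and dy/dt = min{cx, μ(ηx + y)} - γy as recalled from the cited paper. The parameter values and the function name are made up for this example. For these values the steady state worked out by hand is x = 3, y = 2.

```python
def simulate(lam, c, mu, gamma, eta, theta=0.0, t_end=200.0, dt=0.01):
    """Forward-Euler integration of the ASSUMED fluid model, from x = y = 0."""
    x, y = 0.0, 0.0
    for _ in range(round(t_end / dt)):
        # Total download rate: limited by downlinks or by available uplinks.
        served = min(c * x, mu * (eta * x + y))
        dx = lam - theta * x - served
        dy = served - gamma * y
        x += dt * dx
        y += dt * dy
    return x, y


x, y = simulate(lam=1.0, c=1.0, mu=0.2, gamma=0.5, eta=1.0)
print(round(x, 4), round(y, 4))  # settles near x = 3, y = 2
```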

28
Simulation results
29
Comparison to a queuing system
  • A regular queue models a client-server system;
    when the arrival rate exceeds the service rate,
    the system blows up (queue goes to infinity)
  • The analysis of a p2p system is similar to that
    for a queue, except the customers are also
    servers
  • The arrival of a customer also brings some
    additional service capacity (η > 0 case)
  • This makes T stay finite: the p2p queue is
    infinitely scalable!
  • A queue may have different service models: FCFS,
    LCFS, priority, PS
  • What is the effect of different service models in
    a p2p system?

30
Tradeoff in rate allocation in uplink sharing
model
  • B Fan, DM Chiu and J Lui, The delicate tradeoffs
    in BitTorrent-like file sharing protocol design,
    ICNP 2006
  • Similar to Qiu and Srikant's model
  • But multiple classes of peers
  • Arrival rate
  • Uplink capacity
  • Downlink capacity
  • fat and thin peers if 2 classes
  • No seeders

31
Feasible rate allocation space
  • Any u and d satisfying

32
Performance metric 1: average downloading time
  • In steady state
  • Number of type i peers
  • Total upload capacity
  • C??
  • ci = ui/di
  • ci is the share ratio of peer i
  • Download time of type i peers

33
Performance metric 2: fairness
  • Fairness index
  • It can be applied to any quantity x. In this
    case, let xi = ci
  • Therefore, fairness index

34
Tradeoff
  • Each rate allocation (u,d) yields a different T
    and F
  • The space of different strategies and achievable
    (T,F)
  • Consider three specific allocations
  • Optimal downloading time
  • Optimal fairness
  • in terms of share ratio
  • Max-min rate allocation
  • Equalize the downloading rate of all peers,
    unless it is limited by the downlink capacity

35
Optimizing downloading time
  • Problem definition
  • Note
  • assuming uploading rate = uplink capacity
  • Smaller type number means higher uplink
    capacity
  • Applying standard Lagrange multiplier methods and
    KKT conditions
  • Type 1 peer gets less downloading rate
    proportionally speaking
  • Type 1 peer may even get less downloading rate
    than thin peers

36
Optimizing fairness
  • This means equalizing the share ratios
  • Therefore
  • By same Lagrange multiplier methods

37
Max-min rate allocation
  • Problem definition
  • Can use the water filling algorithm to find the
    solution.
  • If none of the downlinks are bottlenecks
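The water filling idea can be sketched for a simplified setting: a single pool of upload capacity is shared among individual peers, and each peer's rate is capped by its own downlink. This per-peer version (rather than per class, with class populations) and the function name are assumptions made for this example.

```python
def max_min_allocation(total_capacity, downlink_caps):
    """Max-min fair split of total_capacity, with per-peer downlink caps."""
    alloc = [0.0] * len(downlink_caps)
    remaining = total_capacity
    # Visit peers from the smallest cap upward.
    active = sorted(range(len(downlink_caps)), key=lambda i: downlink_caps[i])
    while active:
        share = remaining / len(active)
        i = active[0]
        if downlink_caps[i] <= share:
            # This peer is downlink-limited: freeze it at its cap.
            alloc[i] = downlink_caps[i]
            remaining -= downlink_caps[i]
            active.pop(0)
        else:
            # Nobody left is downlink-limited: equal share for the rest.
            for j in active:
                alloc[j] = share
            break
    return alloc


print(max_min_allocation(10, [1, 8, 8]))  # [1, 4.5, 4.5]
print(max_min_allocation(6, [5, 5, 5]))   # [2.0, 2.0, 2.0]
```

The second call is the case where none of the downlinks are bottlenecks, so every peer simply gets the same rate.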

38
Compare the results
  • Since we have the closed form results for these
    cases
  • The concept of Pareto optimal, or non-inferior
    allocations

39
Apply it to distributed algorithms
  • Rate allocation analysis so far assumes knowledge
    only available in centralized algorithms
  • Examples of distributed algorithms found in BT
  • Tit-for-tat
  • This yields the most fair rate allocation, in
    equilibrium.
  • Peers form clusters
  • Random
  • This gives a solution similar to max-min
  • Each peer gets average uploading rate in steady
    state
  • Possible to adopt a weighted average of the two
  • Select ns neighbors using tit-for-tat, and na
    neighbors randomly
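A mixed selection rule of this kind might look like the sketch below. It reads tit-for-tat as "prefer the peers that recently uploaded the most to us". That reading, the function name, and the upload_received dictionary are hypothetical choices made for this example, not the algorithm from the paper.

```python
import random


def select_neighbors(upload_received, ns, na, rng=None):
    """Pick ns neighbors by tit-for-tat and na more at random.

    upload_received maps a peer id to the rate we recently got from it
    (a hypothetical input chosen for this sketch).
    """
    rng = rng or random.Random()
    # Tit-for-tat part: the ns peers that uploaded the most to us.
    ranked = sorted(upload_received, key=upload_received.get, reverse=True)
    tft = ranked[:ns]
    # Random part: na peers drawn uniformly from everyone else.
    rest = ranked[ns:]
    randomly_chosen = rng.sample(rest, min(na, len(rest)))
    return tft, randomly_chosen


rates = {"a": 5, "b": 1, "c": 9, "d": 3, "e": 7}
tft, extra = select_neighbors(rates, ns=2, na=2, rng=random.Random(0))
print(tft)    # ['c', 'e']
print(extra)  # two of 'a', 'b', 'd'
```

Setting na = 0 gives pure tit-for-tat and ns = 0 gives pure random selection, the two extremes described above.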

40
Different mix of TFT and random
For small na, the mixed strategy is not able to
find best matched neighbors (total number of
peers = 100)
41
Recap
  • So far, we considered three models
  • Mundinger's model
  • Gives throughput capacity for small N,M, or very
    large M
  • But static peer population
  • Qiu-Srikant model
  • Gives average delay (hence also capacity) for
    dynamic peer population, and various other
    parameters
  • But sharing efficiency modeled by one parameter η
  • Single class of peers
  • Fan-Chiu-Lui model
  • Multiple classes of peers, and the tradeoff of
    throughput and fairness
  • Assumes perfect sharing efficiency
  • How to more accurately model sharing efficiency?