IEG5270 Advanced Topics in P2P Networking: Modeling and analysis of p2p content distribution algorithms


1
IEG5270 Advanced Topics in P2P Networking
Modeling and analysis of p2p content distribution algorithms
Part 1

Dah Ming Chiu
Chinese University of Hong Kong

2
Recap

The secrets of P2P content distribution
Use multiple trees!
Tree-based (push)
Data-driven (pull)
Utilize peers' resources to satisfy goal
File downloading (as fast as possible)
Streaming (as continuous as possible at given
playback rate)
BitTorrent
A working, data-driven, p2p file-downloading
algorithm
Bootstrapping metainfo, tracker
Peer selection algorithm
Chunk selection algorithm

3
Models

Why do we need models?
To understand the limit of p2p system
To understand the key factors, for designing
better algorithms
To help operate a p2p system
The complexity of modeling a p2p system
It is a large scale distributed system
A large number of peers, and each with its own
state
The complexity of modeling the network conditions
Links, routers, ISPs, bottlenecks
The complexity of modeling the peer dynamics
How do they arrive and leave the system
Accuracy versus simplicity/usefulness

4
Two types of models

Throughput models
Try to understand the capacity (maximum
throughput) of BT-like systems
A large number of models in this category
Different levels of abstraction
Different peer dynamics
Different variations of the BT algorithm
Exploring different metrics
P2P streaming models
YP Zhou et al is the first (?)
Model the playback buffer to study playback
performance

5
The uplink sharing model

Assume away the network complexity
Only the uplink can be the bottleneck in peer
communications
Practically all models make this assumption
It's like the Poisson arrival assumption in
studying a queue
The term is coined in Mundinger's thesis

6
Mundinger's thesis

Summarized in
Optimal Scheduling of P2P File Dissemination
To appear in Journal of Scheduling
Study the problem using a discrete model
Closer to real situation
Linkage to previous work on scheduling
The results can be easily appreciated using fluid
assumption
Most other models make the fluid assumption: there
are many chunks, each very small
Key assumptions
Uplink sharing model
Static peer population

7
Related work

The broadcast problem
Makespan: the number of rounds for all N nodes
to receive M messages from a sender
Unidirectional communication 2M log2(N)
Bidirectional communication M log2(N)
In general, if peer i's uplink capacity is Ci
(the server's is C0), what is the minimum makespan?
In this case, peers serve each other
asynchronously

8
Convert to a synchronous problem

First show minimum makespan can be achieved using
a synchronous system
Upload to one peer at a time
Upload chunks in discrete time slots

9
Prove equal capacity case by construction
The M=2, N=3 case
The first phase of the M>2, N>3 case
In the subsequent phase, peers start back-filling
Construct a strategy to finish all pieces
See paper for details
10
Discussion

The meaning of the formula
For sufficiently large M, the minimum makespan
changes very slowly with increasing peer
population

11
The unequal capacity case

What is the minimum makespan for peers with
different uplink capacities?
It is a MILP (mixed integer linear programming)
problem
See paper for details
The following lemma is used to discretize the
problem

But there is a simple closed-form solution,
asymptotically when M is very large!
12
Consider simple examples(small N and M)

C1 = C2 = ... = CN, but CS can be different
Example 1: N=2, M=1, two cases
Both peers download from server
Peer 1 downloads from server, peer 2 downloads from
peer 1
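
The two cases of Example 1 can be compared numerically. This is a minimal sketch, assuming a single chunk of unit size that cannot be forwarded until it is fully received, and the uplink sharing model (only uplinks limit transfers). The function names and the sample capacities are hypothetical, not from the slides.

```python
def makespan_both_from_server(cs):
    """Case 1: the server uploads the one-chunk file twice (2 units at rate cs)."""
    return 2.0 / cs


def makespan_relay(cs, c1):
    """Case 2: server -> peer 1 takes 1/cs, then peer 1 -> peer 2 takes 1/c1."""
    return 1.0 / cs + 1.0 / c1


def best_case(cs, c1):
    """Return the smaller of the two makespans for N=2, M=1."""
    return min(makespan_both_from_server(cs), makespan_relay(cs, c1))


# A fast server (cs > c1) should serve both peers itself;
# a slow server (cs < c1) should let peer 1 relay.
fast_server = (makespan_both_from_server(2.0), makespan_relay(2.0, 1.0))
slow_server = (makespan_both_from_server(1.0), makespan_relay(1.0, 2.0))
```

Under these assumptions the crossover is at CS = C1, which is why CS is allowed to differ from the peers' common capacity in these examples.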

13
Example 2

N=2, M=2, four cases
Everything downloaded from server
One peer downloads from server, 2nd peer downloads
from 1st
One peer downloads from server, 2nd peer downloads
partly from 1st and partly from server
Each peer downloads exactly one chunk from
server, and the other chunk from each other

14
Example 2 analysis

N=2, M=2, four cases
Everything downloaded from server
One peer downloads from server, 2nd peer downloads
from 1st
One peer downloads from server, 2nd peer downloads
partly from 1st and partly from server
Each peer downloads exactly one chunk from
server, and the other chunk from each other

15
Minimum makespan for N=M=2
16
The M→∞ case: fluid assumption

Instead of makespan, think in terms of throughput
Given the uplink bandwidth from the server and
peers, how do we allocate it so content can reach
each peer at the (same) maximum rate?
Mundinger proved something more general

For us, it is more intuitive, and more
immediately useful to look at the single server
case
17
The simpler/more useful result

Given the uplink sharing model (right figure) and
very large M, the maximum throughput is
R = min{C0, (C0 + sum_j(Cj))/n}

This is a special case of Theorem 4.
DM Chiu et al, Can network coding help in P2P
networks, invited paper at 2nd NETCOD workshop,
2006
18
Proof step 1: R is an upper bound

C0 is the uplink bandwidth from the server
The server must be able to send the content out at least
once, so C0 is a clear upper bound
C0 + sum_j(Cj) is the total uplink bandwidth from
the server and all peers
Each peer must receive the server's content from
somewhere.
The total demand is therefore nR, which cannot
exceed the total supply

19
Proof step 2: Realizing R by construction

One 1-hop tree (from sender)
N 2-hop trees (from each peer)
Assign rates optimally, satisfying the uplink
constraints
Two cases
C0 > C/(n-1), where C = sum_i(Ci)
C0 < C/(n-1)

20
Maximum rate achieved

Case 1: C0 > C/(n-1), where C = sum_i(Ci)
Assign rate Ci/(n-1) to the ith 2-hop spanning
tree
The ith spanning tree can deliver Ci/(n-1) to
each other peer
Assign rate C0 - C/(n-1) to the 1-hop spanning
tree
The server can deliver (C0 - C/(n-1))/n to each
peer
Each peer receives (C0 + C)/n
Case 2: C0 < C/(n-1)
Assign rate C0*Ci/C to the ith 2-hop spanning
tree, which can then deliver to all other peers
at that rate
Each peer receives sum_i(C0*Ci/C) = C0
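
The two-case construction can be checked numerically. This is a minimal sketch, assuming one server with uplink c0 and n >= 2 peers with uplinks given as a list. The function names and sample capacities are hypothetical; the rates follow the tree assignment described on this slide.

```python
def max_throughput(c0, peer_caps):
    """Upper bound from proof step 1: min of server uplink and average total uplink."""
    n = len(peer_caps)
    return min(c0, (c0 + sum(peer_caps)) / n)


def tree_rates(c0, peer_caps):
    """Return (server uplink given to the 1-hop tree, rate of each 2-hop tree)."""
    n = len(peer_caps)
    total = sum(peer_caps)
    if c0 >= total / (n - 1):
        # Case 1: peers are fully used, leftover server uplink feeds the 1-hop tree.
        two_hop = [ci / (n - 1) for ci in peer_caps]
        one_hop = c0 - total / (n - 1)
    else:
        # Case 2: server uplink is the bottleneck and is split in proportion to Ci.
        two_hop = [c0 * ci / total for ci in peer_caps]
        one_hop = 0.0
    return one_hop, two_hop


def per_peer_rate(c0, peer_caps):
    """Every peer receives each 2-hop tree's rate plus 1/n of the 1-hop tree's uplink."""
    one_hop, two_hop = tree_rates(c0, peer_caps)
    return sum(two_hop) + one_hop / len(peer_caps)


def uplinks_respected(c0, peer_caps, tol=1e-9):
    """Check that neither the server nor any peer uploads more than its capacity."""
    one_hop, two_hop = tree_rates(c0, peer_caps)
    n = len(peer_caps)
    server_ok = one_hop + sum(two_hop) <= c0 + tol
    peers_ok = all((n - 1) * r <= ci + tol for r, ci in zip(two_hop, peer_caps))
    return server_ok and peers_ok
```

For example, with c0 = 10 and peer uplinks [2, 4, 6] the construction is in Case 1, and with c0 = 3 it is in Case 2; in both, per_peer_rate matches max_throughput.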

21
Summary

Simple static model
Uplink sharing model
Large M fluid assumption
Static population (1 server n peers)
Peers always able to help each other
One simple extension: more seeders
m seeders and n downloaders
Reason: some peers stay around after becoming
seeders
Try to work this out yourself

22
Dynamic peers model

Qiu and Srikant, Modeling and performance
analysis of BitTorrent-like peer-to-peer
networks, Sigcomm 2004 (Google Scholar 200)

Parameter values in Mundinger's model (table; parameter names not recoverable)
a constant n
a constant 1
0
Constant Ci
Unbounded
0
Sharing efficiency: this is assumed to be 1
23
A fluid model of population

System equations
In steady state
where

η>0 assumed
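
The system equations on this slide did not survive extraction. As recalled from the cited Qiu and Srikant paper, the fluid model tracks x(t) downloaders and y(t) seeds: dx/dt = λ - θx - min{cx, μ(ηx + y)} and dy/dt = min{cx, μ(ηx + y)} - γy. Here λ is the arrival rate, c the download rate, μ the upload rate, η the sharing efficiency, γ the seed departure rate, and θ the rate at which downloaders abort. The sketch below integrates these equations with a simple Euler step and compares the result with the closed-form steady state; the step size, horizon, and parameter values are illustrative assumptions.

```python
def simulate(lam, c, mu, gamma, eta, theta=0.0, dt=0.01, steps=20000):
    """Euler integration of the fluid model, starting from an empty system."""
    x, y = 0.0, 0.0
    for _ in range(steps):
        # Service rate is limited by either total download or total upload capacity.
        served = min(c * x, mu * (eta * x + y))
        dx = lam - theta * x - served
        dy = served - gamma * y
        x = max(x + dt * dx, 0.0)
        y = max(y + dt * dy, 0.0)
    return x, y


def steady_state(lam, c, mu, gamma, eta, theta=0.0):
    """Closed-form fixed point obtained by setting dx/dt = dy/dt = 0 (needs eta > 0)."""
    inv_beta = max(1.0 / c, (1.0 / eta) * (1.0 / mu - 1.0 / gamma))
    beta = 1.0 / inv_beta
    x_bar = lam / (theta + beta)
    y_bar = beta * x_bar / gamma
    return x_bar, y_bar


# Illustrative (made-up) parameters: uplink-constrained regime.
x_sim, y_sim = simulate(lam=1.0, c=2.0, mu=0.5, gamma=1.0, eta=0.5)
x_bar, y_bar = steady_state(lam=1.0, c=2.0, mu=0.5, gamma=1.0, eta=0.5)
```

With these sample values the trajectory settles at the closed-form point (x, y) = (2, 1), which also illustrates why a separate stability argument is needed: the fixed point is only meaningful if the dynamics actually converge to it.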
24
Downloading time

Apply Little's Law

Average number of peers who will complete
downloads
Average rate downloads are completed
Average downloading time
where
25
Observations

T is not dependent on arrival rate λ!
When sharing efficiency η increases, T decreases
When seeders' departure rate γ increases, T
increases
Initially, when downloading rate c increases, T
decreases, but once c is large enough (uplinks become
the bottleneck), c no longer affects T
Same observation about peer uplink rate μ
Normally c > μ for a given peer, but if γ < μ,
then there will be abundant uplink capacity (due
to helpers), and the downlink will be the bottleneck
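
These observations can be reproduced from the closed-form downloading time. As recalled from the cited Qiu and Srikant paper, T = 1/(θ + β) with 1/β = max{1/c, (1/η)(1/μ - 1/γ)}, where θ is the downloader abort rate (0 by default below). The sketch and its parameter values are illustrative assumptions, not figures from the slides.

```python
def download_time(c, mu, gamma, eta, theta=0.0):
    """Average downloading time T; note that the arrival rate does not appear."""
    inv_beta = max(1.0 / c, (1.0 / eta) * (1.0 / mu - 1.0 / gamma))
    return 1.0 / (theta + 1.0 / inv_beta)


base = download_time(c=2.0, mu=0.5, gamma=1.0, eta=0.5)

# Higher sharing efficiency shortens T.
better_sharing = download_time(c=2.0, mu=0.5, gamma=1.0, eta=1.0)

# Seeds leaving faster lengthens T.
impatient_seeds = download_time(c=2.0, mu=0.5, gamma=2.0, eta=0.5)

# Once uplinks are the bottleneck, a larger download rate c changes nothing.
huge_downlink = download_time(c=100.0, mu=0.5, gamma=1.0, eta=0.5)

# If seeds leave more slowly than they upload (gamma < mu), the downlink limits T.
patient_seeds = download_time(c=2.0, mu=0.5, gamma=0.4, eta=0.5)
```

In the last call the second term of the max is negative, so T collapses to 1/c, matching the remark that the downlink becomes the bottleneck.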

26
Effect of sharing efficiency η

In the capacity equation (Mundinger), η=1 is
assumed
The analysis so far assumes 0<η<1
When η=0 (no sharing by downloaders), two cases
If the rate of seeds leaving (γ) is less than the
rate a seed can help a peer download a file (μ),
then the system is limited by the downloading
capacity: T = 1/c
Otherwise
This equation means y will decrease to 0, and the
system dies.
Note, in this model, there is no server, only
seeders.

27
Stability

It is not sufficient to derive a steady state
population by setting dx/dt = dy/dt = 0
Must also show that the system will converge to
the steady state
See detailed discussion of local stability in
paper

28
Simulation results
29
Comparison to a queuing system

A regular queue models a client-server system;
when the arrival rate exceeds the service rate,
the system blows up (queue goes to infinity)
The analysis of a p2p system is similar to that
for a queue, except the customers are also
servers
The arrival of a customer also brings some
additional service capacity (η>0 case)
This makes T stay finite: the p2p queue is
infinitely scalable!
A queue may have different service models: FCFS, LCFS,
priority, PS
What is the effect of different service models in
a p2p system?

30
Tradeoff in rate allocation in the uplink sharing
model

B Fan, DM Chiu and J Lui, The delicate tradeoffs
in BitTorrent-like file sharing protocol design,
ICNP 2006
Similar to the Qiu-Srikant model
But multiple classes of peers
Arrival rate
Uplink capacity
Downlink capacity
fat and thin peers if 2 classes
No seeders

31
Feasible rate allocation space

Any u and d satisfying

32
Performance metric 1: average downloading time

In steady state
Number of type i peers
Total upload capacity
C??
ci = ui/di
ci is the share ratio of peer i
Download time of type i peers

33
Performance metric 2: fairness

Fairness index
It can be applied to any quantity x. In this
case, let xi = ci
Therefore, fairness index
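
The fairness formula itself was lost from this slide. One well-known index that can be applied to any quantity x is Jain's fairness index, (sum of xi)^2 / (n * sum of xi^2); whether the slide used exactly this form is an assumption. The sketch below applies it to hypothetical share ratios ci = ui/di.

```python
def jain_fairness(xs):
    """Jain's index: 1.0 when all values are equal, 1/n when one value dominates."""
    total = sum(xs)
    squares = sum(x * x for x in xs)
    return (total * total) / (len(xs) * squares)


def share_ratios(uploads, downloads):
    """Share ratio ci = ui / di for each peer class."""
    return [u / d for u, d in zip(uploads, downloads)]


# Hypothetical allocation: every class downloads exactly what it uploads.
equal_ratios = share_ratios([4.0, 2.0, 1.0], [4.0, 2.0, 1.0])

# Hypothetical allocation: every class downloads at the same rate regardless of upload.
unequal_ratios = share_ratios([4.0, 2.0, 1.0], [2.0, 2.0, 2.0])

fair = jain_fairness(equal_ratios)
less_fair = jain_fairness(unequal_ratios)
```

Equal share ratios give an index of 1, and the index drops as the ratios spread apart.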

34
Tradeoff

Each rate allocation (u,d) yields a different T
and F
The space of different strategies and achievable
(T,F)
Consider three specific allocations
Optimal downloading time
Optimal fairness
in terms of share ratio
Max-min rate allocation
Equalize the downloading rate of all peers,
unless it is limited by the downlink capacity

35
Optimizing downloading time

Problem definition
Note
assuming uploading rate = uplink capacity
Smaller type number means higher uplink
capacity
Applying standard Lagrange multiplier methods and
KKT conditions
Type 1 peer gets less downloading rate
proportionally speaking
Type 1 peer may even get less downloading rate
than thin peers

36
Optimizing fairness

This means equalizing the share ratios
Therefore
By same Lagrange multiplier methods

37
Max-min rate allocation

Problem definition
Can use the water filling algorithm to find the
solution.
If none of the downlinks are bottlenecks
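
The water-filling idea can be sketched as follows, assuming a fixed total upload capacity shared among individual peers, each with its own downlink cap. The function name and the sample numbers are hypothetical, and per-class populations are ignored for simplicity.

```python
def max_min_rates(total_upload, downlink_caps):
    """Water filling: raise all rates together, freezing a peer once it hits its cap."""
    n = len(downlink_caps)
    rates = [0.0] * n
    remaining = float(total_upload)
    # Visit peers from the smallest downlink cap to the largest.
    active = sorted(range(n), key=lambda i: downlink_caps[i])
    while active:
        share = remaining / len(active)
        i = active[0]
        if downlink_caps[i] <= share:
            # This peer cannot absorb an equal share; cap it and redistribute the rest.
            rates[i] = float(downlink_caps[i])
            remaining -= rates[i]
            active.pop(0)
        else:
            # No remaining peer is downlink-limited: split what is left equally.
            for j in active:
                rates[j] = share
            break
    return rates


limited = max_min_rates(10.0, [1.0, 4.0, 100.0])
unlimited = max_min_rates(6.0, [10.0, 10.0, 10.0])
```

When no downlink is a bottleneck, as in the second call, every peer simply gets the same rate.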

38
Compare the results

Since we have the closed form results for these
cases
The concept of Pareto optimal, or non-inferior
allocations

39
Apply it to distributed algorithms

Rate allocation analysis so far assumes knowledge
only available in centralized algorithms
Examples of distributed algorithms found in BT
Tit-for-tat
This yields the fairest rate allocation, in
equilibrium.
Peers form clusters
Random
This gives a solution similar to max-min
Each peer gets average uploading rate in steady
state
Possible to adopt a weighted average of the two
Select ns neighbors using tit-for-tat, and na
neighbors randomly
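
A minimal sketch of the mixed strategy, assuming each peer keeps a table of the upload rate it has observed from every other peer. The function name, the rate table, and the peer labels are hypothetical; real BitTorrent clients implement choking with more detail than shown here.

```python
import random


def select_neighbors(peer, rates_from, n_s, n_a, rng=random):
    """Pick n_s neighbors by tit-for-tat (best observed uploaders) and n_a at random."""
    candidates = [p for p in rates_from if p != peer]
    # Tit-for-tat part: favor the peers that uploaded to us the fastest.
    by_rate = sorted(candidates, key=lambda p: rates_from[p], reverse=True)
    tft = by_rate[:n_s]
    # Random part: sample from everyone not already chosen.
    rest = by_rate[n_s:]
    randomly_chosen = rng.sample(rest, min(n_a, len(rest)))
    return tft, randomly_chosen


observed = {"b": 5.0, "c": 1.0, "d": 3.0, "e": 2.0}
tft, randomly_chosen = select_neighbors("a", observed, n_s=2, n_a=1, rng=random.Random(0))
```

Setting n_a to 0 gives pure tit-for-tat and setting n_s to 0 gives pure random selection, so the pair (n_s, n_a) plays the role of the weighting between the two.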

40
Different mix of TFT and random
For small na, the mixed strategy is not able to
find best matched neighbors (total number of
peers: 100)
41
Recap

So far, we considered three models
Mundinger's model
Gives throughput capacity for small N,M, or very
large M
But static peer population
Qiu-Srikant model
Gives average delay (hence also capacity) for
dynamic peer population, and various other
parameters
But sharing efficiency modeled by one parameter η
Single class of peers
Fan-Chiu-Lui model
Multiple classes of peers, and the tradeoff of
throughput and fairness
Assumes perfect sharing efficiency
How to more accurately model sharing efficiency?