Title: Stochastic Analysis of File Swarming Systems
1Stochastic Analysis of File Swarming Systems
John C.S. Lui
- The Chinese University of Hong Kong
Collaborators: D.M. Chiu, M.H. Lin, B. Fan
2Background
- Traditional Client/Server Sharing
- Performance deteriorates rapidly as the number of clients increases
- IP Multicast
- Application Multicast (e.g., CDN, ESM)
- reliability, unused resources at leaf nodes
- P2P (e.g., Napster, Gnutella)
- Free riders only download without contributing to the network.
- BitTorrent P2P systems
- Good scalability
- Built-in incentive mechanism to contribute
3BT Components
- On a public domain site, obtain the torrent file, for example:
- http://bt.btchina.net
- http://bt.ydy.com/
Web Server
The Lord of Ring.torrent
Harry Potter.torrent
Transformer.torrent
4BT Components
- The .torrent file
- Static metainfo file containing the necessary information
- File name
- Number of chunks, size
- Checksum
- IP address of the tracker, etc.
- A BitTorrent tracker
- Non-content-sharing node
- Track peers
- File
- Chunk size (256 KB); each chunk has an individual hash code in the torrent file
- Types of peers
- Leechers
- Seeders
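The per-chunk hash codes stored in the torrent file can be sketched as follows. This is a minimal illustration, assuming SHA-1 digests (the hash BitTorrent metainfo files use for pieces) and the 256 KB chunk size from the slide. The helper names `chunk_hashes` and `verify_chunk` are hypothetical and not part of any BitTorrent library.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # 256 KB, as on the slide


def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Split `data` into fixed-size chunks and hash each one with SHA-1."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]


def verify_chunk(chunk, index, hashes):
    """Check a downloaded chunk against the hash published for that index."""
    return hashlib.sha1(chunk).digest() == hashes[index]


# Usage: a 600 KB payload yields three chunks (256 KB, 256 KB, 88 KB).
payload = bytes(600 * 1024)
hashes = chunk_hashes(payload)
```

A downloader would call `verify_chunk` on every received chunk, which is what lets it accept data from untrusted peers.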
5BT publishing a file
[Figure: a web server, a tracker, seeder John, and downloaders Moe, Larry, and Curly]
6Simple example
[Figure: seeder John holds chunks 1 to 10; downloaders Larry and Moe are shown with partial chunk sets (1,2,3; 1,2,3,5; 1,2,3,4; 1,2,3,4,5)]
7BT internal Chunk Selection mechanisms
- Strict Priority
- First Priority
- Rarest First
- General rules
- Random First Piece
- Special case, at the beginning
- Endgame Mode
- Special case
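The Rarest First rule can be sketched as below. This is an illustration only, assuming each neighbor's bitmap is available as a Python set of chunk indices; `rarest_first` is a hypothetical helper, and the real client's bookkeeping is not described on the slide.

```python
import random
from collections import Counter


def rarest_first(my_chunks, neighbor_bitmaps, rng=random):
    """Pick a missing chunk held by the fewest neighbors (ties broken at random)."""
    counts = Counter()
    for bitmap in neighbor_bitmaps:
        counts.update(bitmap - my_chunks)  # count only chunks we still need
    if not counts:
        return None  # no neighbor holds anything useful
    rarest = min(counts.values())
    candidates = sorted(c for c, n in counts.items() if n == rarest)
    return rng.choice(candidates)


mine = {1, 2, 3}
neighbors = [{1, 2, 3, 4, 5}, {1, 2, 3, 4}, {1, 2, 3, 5, 6}]
choice = rarest_first(mine, neighbors)  # chunk 6 is held by only one neighbor
```

Random First Piece and Endgame Mode would replace this rule at the very start and very end of a download, respectively.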
8BT internal mechanism
- Built-in incentive mechanism (where all the magic happens)
- Choking Algorithm
- Optimistic Unchoking
9BT internal mechanism
- Choking is a temporary refusal to upload
- Each peer unchokes a fixed number of peers
- Reasons for choking
- Avoid free riders
- Network congestion
- Contribute to useful peers
[Figure: peers Andy Yao and John C.S. Lui, with two connections marked Choked]
10BT internal mechanism (optimistic unchoking)
- A BitTorrent peer has a single optimistic unchoke, to which it uploads regardless of the current download rate from it. This peer rotates every 30s
- Reasons
- To discover whether currently unused connections are better than the ones being used
- To provide minimal service to new peers
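Choking and optimistic unchoking together can be sketched as follows. This is a minimal sketch under stated assumptions: the number of regular unchoke slots (`regular_slots`) is a free parameter here, the rates are illustrative values in kb/s, and `choose_unchoked` is a hypothetical helper rather than the client's actual routine.

```python
import random


def choose_unchoked(download_rates, regular_slots=4, rng=random):
    """Return (regular, optimistic): top peers by rate, plus one random choked peer."""
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    regular = ranked[:regular_slots]       # reward the best uploaders to us
    choked = ranked[regular_slots:]
    optimistic = rng.choice(choked) if choked else None  # rate is ignored here
    return regular, optimistic


# Illustrative rates (kb/s) that each peer is currently giving us.
rates = {"Moe": 70, "Melinda": 10, "Larry": 20, "Curly": 30, "John": 5, "Andy": 15}
regular, optimistic = choose_unchoked(rates, regular_slots=3)
```

Calling the function again every 30 s would rotate the optimistic slot, which is how a new peer with nothing to offer yet can still receive its first chunks.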
11Example optimistic unchoking
[Figure: Andy Yao and downloaders Moe, Melinda, Larry, Curly, and John Lui; link rates shown are 100, 40, 70, 70, 10, and 110 kb/s and 70, 10, 20, 30, 5, and 15 kb/s]
12P2P content distribution
- BitTorrent
- Sending a file to a large number of peers, with the help of peers
- Producing the most Internet traffic today (over 50% of traffic, creates contention but ....)
- What IP multicast tried to support
- Modeling these systems: insights
13Why Study BitTorrent-like Systems?
- BitTorrent is very efficient.
- Which features make it perform so well?
- Motivating questions
- What is the effect of bandwidth constraints?
- Is the Rarest First policy really necessary?
- Must nodes perform seeding after file downloading?
- How serious is the Last Piece Problem?
- Is source coding useful?
- Does the incentive mechanism affect the
performance much?
Our aim is to develop mathematical models of
file swarming systems, allowing us to
investigate these issues via analytical means.
14Model for the File Swarming System
- A file has K non-overlapping chunks.
- Peers arrive according to a Poisson process. Each peer is initialized with one random chunk.
- Peers leave the system immediately when they finish downloading.
- The system is slotted: downlink bandwidth is one chunk per time slot for all peers (download constraint).
- In each time slot, each peer contacts m neighbors chosen uniformly from the system to see whether they are useful. If some neighbors are useful, it randomly chooses one and requests a random useful chunk.
- If a peer receives several requests, it will satisfy all of them (without upload constraint) or one chosen at random (with upload constraint).
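The model rules above can be turned into a small slotted simulation. This is a sketch, not the authors' simulator: the parameter defaults, the order of events within a slot (arrivals, then requests, then transfers, then departures), and the names `simulate` and `poisson` are all assumptions made for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Sample a Poisson(lam) variate (Knuth's method; fine for small lam)."""
    threshold, k, p = math.exp(-lam), 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate(K=20, lam=2.0, m=1, upload_constraint=False, slots=300, seed=0):
    """Return the mean download time (in slots) of peers that finished."""
    rng = random.Random(seed)
    peers = []      # each peer: {"chunks": set of chunk ids, "born": arrival slot}
    finished = []   # sojourn times of departed peers
    for t in range(slots):
        # Poisson arrivals, each starting with one random chunk.
        for _ in range(poisson(lam, rng)):
            peers.append({"chunks": {rng.randrange(K)}, "born": t})
        n = len(peers)
        requests = {}  # target index -> list of (requester index, chunk)
        for i, peer in enumerate(peers):
            # Contact m distinct peers, uniformly among the others.
            picks = rng.sample(range(n - 1), min(m, n - 1))
            contacts = [j if j < i else j + 1 for j in picks]
            useful = [j for j in contacts if peers[j]["chunks"] - peer["chunks"]]
            if useful:
                j = rng.choice(useful)
                chunk = rng.choice(sorted(peers[j]["chunks"] - peer["chunks"]))
                requests.setdefault(j, []).append((i, chunk))
        # Serve all requests, or a single random one under the upload constraint.
        for reqs in requests.values():
            served = [rng.choice(reqs)] if upload_constraint else reqs
            for i, chunk in served:
                peers[i]["chunks"].add(chunk)
        # Peers holding all K chunks leave immediately.
        staying = []
        for peer in peers:
            if len(peer["chunks"]) == K:
                finished.append(t + 1 - peer["born"])
            else:
                staying.append(peer)
        peers = staying
    return sum(finished) / len(finished) if finished else float("nan")


free = simulate(upload_constraint=False)
limited = simulate(upload_constraint=True)
```

Because each peer issues at most one request per slot, no peer can finish in fewer than K-1 slots, so the returned mean can be compared directly against that optimum.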
15Model for the File Swarming System
[Figure: example with m = 2, shown without and with upload constraint. Peers A to E exchange HELLO and Bitmap messages; labels show Request C5, chunk C5, Request C1, and chunk C1]
The case m = 1 with no upload constraint was studied by L. Massoulié et al. in "Coupon replication systems".
16Model 1: Download Constraint Only
- Classify peers into K-1 types. Peers holding i chunks are named type i peers. Denote the number of type i peers [formula missing]
- We are interested in the average sojourn time Ti for type i peers.
- The average downloading time [formula missing]
- For a type i peer, the probability that a type j peer is useful [formula missing]
- For a type i peer, the probability that a randomly picked peer is useful [formula missing]
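The formulas for these quantities did not survive in this transcript. The following is one plausible form, not a quotation of the slide. The notation is assumed: $n_i$ is the number of type $i$ peers, $N = \sum_i n_i$, $p_{i,j}$ is the probability that a type $j$ peer is useful to a type $i$ peer, and $p_i$ is the probability that a randomly picked peer is useful. It further assumes that a type $j$ peer holds a uniformly random $j$-subset of the $K$ chunks, so it is useless to a type $i$ peer exactly when its chunks all lie inside that peer's $i$ chunks.

```latex
\begin{align}
T &= \sum_{i=1}^{K-1} T_i \\
p_{i,j} &= 1 - \binom{i}{j} \Big/ \binom{K}{j}
  \qquad \Big(\text{with } \tbinom{i}{j} = 0 \text{ for } j > i\Big) \\
p_i &= \sum_{j=1}^{K-1} \frac{n_j}{N}\, p_{i,j}
\end{align}
```

Under the same assumptions, a peer that contacts $m$ independent neighbors finds at least one useful peer with probability $1 - (1 - p_i)^m$. For a type $K-1$ peer, $p_{K-1,j} = j/K$ for every $j \le K-1$, which illustrates why the last chunk is the hardest to find.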
17Model 1: Download Constraint Only
- The system state [formula missing] is a multi-dimensional, infinite state-space Markov process
- It is hard to solve this Markov chain directly
- Transform the Markov chain into a density dependent jump Markov process
- Focus on its steady state and asymptotic behavior
- We derive tight bounds.
18Model 1: Download Constraint Only
The average downloading time [formula missing].
The case m = 1 has been studied in [1], in which the authors gave a looser bound.
[1] L. Massoulié, M. Vojnović, "Coupon replication systems," SIGMETRICS, 2005.
19Lower bound vs. Upper bound (K = 200)
[Figure: two panels, m = 1 and m = 2]
Last Piece Problem: It takes a peer longer to download the last few chunks of the file, since it gets increasingly difficult to find other peers that can help.
20Bounds vs. Simulation (K = 200)
[Figure: two panels, m = 1 and m = 2]
The simulation shows the accuracy of our model.
How can the last piece problem be relieved?
21System with Source Coding
[Figure: a source with K = 4 and Q = 6, peers A to E, and chunk C1]
22System with Source Coding
The source encodes the original K chunks into Q chunks. Any peer can reconstruct the original file after it receives any K distinct chunks.
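The slides do not say which code the source uses. As one illustration of the "any K of Q" property, the sketch below uses a polynomial (Reed-Solomon style) code over the prime field of size 257. That choice, the reduction of each chunk to a single symbol, and the helper names `encode`, `decode`, and `lagrange_eval` are all assumptions made for brevity.

```python
from itertools import combinations

P = 257  # prime modulus (assumed); large enough to hold one byte per symbol


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P  # Python 3.8+
    return total


def encode(data, q):
    """Extend K data symbols to Q coded symbols; the first K equal the data."""
    points = list(enumerate(data))
    return [lagrange_eval(points, x) for x in range(q)]


def decode(received, k):
    """Recover the K data symbols from any K entries of {index: symbol}."""
    if len(received) < k:
        raise ValueError("need at least k distinct coded symbols")
    points = list(received.items())[:k]
    return [lagrange_eval(points, x) for x in range(k)]


# Usage with the figure's parameters: K = 4 original chunks, Q = 6 coded chunks.
data = [10, 200, 33, 7]
coded = encode(data, 6)
recovered = [
    decode({i: coded[i] for i in subset}, 4)
    for subset in combinations(range(6), 4)
]
```

Every one of the 15 possible 4-chunk subsets decodes to the original data, so a peer no longer has to hunt for one specific missing chunk.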
23Source Coding vs. No Coding (K = 200)
[Figure: two curves, m = 1 with no coding and m = 1 with source coding]
Source coding eliminates the Last Piece Problem!
24Download constraint only
[Figure: two panels, K = 500 with m = 1 and K = 200 with m = 1]
25Download Constraint
[Figure: two panels, K = 500 with m = 2 and K = 200 with m = 2]
26Model 2: Download and Upload Constraints
m = 1
[Figure: peers A to E exchanging HELLO and Bitmap messages; labels show Request C5, Request C1, and chunk C1]
27Model 2: Download and Upload Constraints
m = 1
- Stage One: Requesting
- The same as Model 1.
- Stage Two: Downloading
- The distribution of the number of requests one peer would receive (depending on its type).
- Only one request will be satisfied.
- Still a density dependent jump Markov process
- The transition rates are more complicated.
28Model 2: Download and Upload Constraints
m = 1
1.58
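No derivation for the value 1.58 survives in this transcript. One reading, offered as an assumption rather than as the authors' argument: if each of N peers sends one request to a uniformly chosen other peer, every contacted peer is useful, and each peer serves only one request, then the fraction of requests served tends to 1 - 1/e, so a chunk takes about e/(e-1), roughly 1.58, slots on average. The sketch below checks that figure numerically; `slots_per_chunk` and its parameters are illustrative.

```python
import math
import random


def slots_per_chunk(n_peers=2000, rounds=50, seed=0):
    """Estimate mean slots per chunk when each peer serves one request per slot."""
    rng = random.Random(seed)
    served = 0
    for _ in range(rounds):
        targets = set()
        for i in range(n_peers):
            j = rng.randrange(n_peers - 1)
            targets.add(j if j < i else j + 1)  # uniform over the other peers
        served += len(targets)  # each requested peer serves exactly one request
    success = served / (n_peers * rounds)  # fraction of requests served
    return 1.0 / success  # mean number of tries until a request succeeds


estimate = slots_per_chunk()
limit = math.e / (math.e - 1)  # about 1.582
```

If this reading is right, the upload constraint alone costs roughly 58% extra download time compared with the one-chunk-per-slot optimum.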
29Bounds vs. Simulation (K = 200, without source coding)
[Figure: m = 1, satisfying one request]
Ti is NOT close to 1 any more, i.e., the downloading time is far from optimal.
30Model 3: An Incentive Mechanism
Assume peers are matched randomly at the beginning of each time slot. Each pair performs a chunk transfer iff both peers are useful to each other.
[Figure: peers A to E; labels show Request C5, Request C2, and Request C1, and chunks C5 and C2]
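The pairwise rule of this model can be sketched as follows. It is an illustration under assumptions: chunk sets are Python sets, each side receives one random useful chunk per slot, and `exchange` and `run_slot` are hypothetical helper names.

```python
import random


def exchange(a, b, rng=random):
    """Swap one chunk each way iff both peers are useful to each other."""
    a_needs = sorted(b - a)  # chunks b holds that a lacks
    b_needs = sorted(a - b)  # chunks a holds that b lacks
    if not a_needs or not b_needs:
        return False  # one side has nothing to offer, so no transfer happens
    a.add(rng.choice(a_needs))
    b.add(rng.choice(b_needs))
    return True


def run_slot(peers, rng=random):
    """Match peers into random pairs; return the number of successful exchanges."""
    order = list(range(len(peers)))
    rng.shuffle(order)
    return sum(exchange(peers[i], peers[j], rng)
               for i, j in zip(order[::2], order[1::2]))


a, b = {1, 2}, {2, 3}
traded = exchange(a, b)      # both sides gain a chunk
c, d = {1, 2}, {1}
refused = exchange(c, d)     # d has nothing c needs, so nothing moves
```

A peer holding few chunks rarely has anything its partner lacks, so under this rule it is refused more often than a peer with many chunks.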
31Model 3: An Incentive Mechanism
32Bounds vs. Simulation (K = 200, without source coding)
First Piece Problem
It is not easy to download the first few chunks when a peer enters the system, but one can solve this in various ways.
33Incentive Mechanism
[Figure: two panels, K = 500 with m = 1 and K = 200 with m = 1]
34Conclusion
- Given many peers, a steady state, and some mechanism to ensure file availability (e.g., some seeders):
- The nature of swarming makes P2P systems very efficient.
- The Rarest First policy is not necessary for performance. If peers are cooperative, a random policy is good enough, though Rarest First may help to enhance file availability.
- Peers do not need to perform seeding after file downloading.
- Simple strategies (everything is random) can make the downloading time near optimal.
- Source coding is useful for relieving the last piece problem.
- With a suitable incentive mechanism, the downloading time can still approach optimal.
Our mathematical models provide a basis for designing new BT-like protocols.
35Research Questions
- What about fairness?
- How can file swarming be extended to multimedia streaming? For Joost?
- What about wide area network exchange?
- What happens if there is network congestion? What is the impact?
- Network Coding?
- Security?
36Q & A