Title: Towards Efficient Distribution of High-volume Content
1. Towards Efficient Distribution of High-volume Content
- Mukund Seshadri
- (mukunds_at_cs.berkeley.edu)
- Ph.D. Student, U.C. Berkeley
- Thesis Advisor: Prof. Randy Katz
2. Introduction
- Increasingly high volumes of content are carried on the Internet
  - Major P2P networks, in 2004, had 10 million users online simultaneously, sharing over 10,000,000 GB of data [2].
- Typical usage
  - Early 90s: Web pages
  - Y2K: Music, 3 MB files
  - Now: Movies, short videos, live broadcasts, software updates.
- Issue
  - Higher volumes and larger client sets can stress the bandwidth capacity of traditional distribution methods.
[Figure: MLBtv screenshot, "watch baseball online"]
[2] CacheLogic press release, http://www.cachelogic.com/home/pages/news/pr040715.php
3. Introduction
- Current approaches to distributing content
  - 1/few Server(s) -> Many Clients
    - e.g. YouTube
  - Distributed Set of Caches
    - e.g. Akamai
  - Multicast Trees (Overlay or IP)
    - e.g. End-System Multicast [1]
  - P2P Networks
    - e.g. Gnutella, BitTorrent
- Emphasis so far
  - Uncoordinated distribution of files to individuals
- Our focus
  - Coordinated distribution of large files to large, well-understood communities.
[1] Chu et al., "A Case for End-System Multicast", SIGMETRICS '00.
[2] Jannotti et al., "Overcast", OSDI '00.
4. Our Motivating Scenario
1 Server
File: 100 MB
1000 clients, low churn rate (e.g. UCB students' home PCs; e.g. registered WinXP users: SP2 was 260 MB!)
Environment: server-client and client-client bandwidth is the bottleneck.
- Target problem: minimize the time by which all clients have received the file.
  - i.e., Completion Time
5. Our Goals
- Obtain the best possible algorithm for the target problem.
- Establish the performance of currently known approaches.
- Consider several different environments
  - Cooperative vs. non-cooperative clients
  - Homogeneous vs. heterogeneous clients/bandwidths
  - Different types of content/application: bulk downloads vs. streaming video
6. Talk Roadmap
- Introduction
- Background
- Research Outline
- Research Details: Analysis of the Cooperative Homogeneous Scenario
- Research Details: Algorithms for Heterogeneous Scenarios
- Research Snapshot: Other Scenarios
- Future Work
7. Background
- Types of Distribution Methods
  - 1/few Server(s) -> Many Clients
    - e.g. YouTube
    - Issue: need to provision source bandwidth proportional to the number of clients
    - Clients' upload capacity is completely unutilized
  - Distributed Set of Caches
    - e.g. Akamai
    - Issue: distributing content to the caches/edges
  - Multicast tree-based methods
  - P2P distribution
8. Multicast Trees: Background
- d-ary tree multicast [1,2]
  - Operation: file (parts) sent by each internal tree node to its d children, which propagate it similarly.
  - Target: client reception rate, in-order delivery, low delay
  - Inefficiency: leaf node bandwidths are unutilized
  - Network-layer vs. overlay implementation
  - e.g. ESM [1], Overcast [2]
[1] Chu et al., "A Case for End-System Multicast", SIGMETRICS '00.
[2] Jannotti et al., "Overcast", OSDI '00.
9. P2P Methods
- Napster, Gnutella, FastTrack
  - Distributed, scalable search
  - Download files from a few end-hosts.
  - Large-file, single-source download is not optimized
- SplitStream and parallel trees [3,4]
  - Pro: utilizes leaf nodes' upload capacities
  - Con: useful upload capacity growth is sub-optimal
  - Target: load-balance, fairness
- BitTorrent (bittorrent.com)
  - Useful for distributing large files to many people with low server bandwidths
  - Target: per-client download time, incentivizing cooperation
[3] Karger et al., "Consistent Hashing and Random Trees", STOC '97.
[4] Castro et al., "SplitStream", SOSP '03.
10. BitTorrent (bittorrent.com): Background
- Tracker enables client rendezvous
- Clients in a random overlay graph
- Utilizes clients' upload capacity
- Tit-for-tat: prioritize transmissions to neighbors by incoming bandwidth from them
- Targets individual performance, incentives, and dynamicity, but performance/optimality in each dimension is not clear.
- Completion time has not been adequately researched
11. Background: Summary and Issues
- Summary of methods
  - d-ary tree multicast [1,2]
    - Target: client reception rate, in-order delivery
  - Parallel trees [3,4]
    - Target: load-balance, fairness
  - BitTorrent (bittorrent.com)
    - Target: per-client download time, incentivizing cooperation
- Different methods => different goals; none is targeted at reducing completion time.
- Completion time of these methods is not yet adequately understood.
- What is the best, or optimal, algorithm?
[1] Chu et al., "A Case for End-System Multicast", SIGMETRICS '00.
[2] Jannotti et al., "Overcast", OSDI '00.
[3] Karger et al., "Consistent Hashing and Random Trees", STOC '97.
[4] Castro et al., "SplitStream", SOSP '03.
13. Research Outline
- Approach
  - Simple initial scenario: cooperative, homogeneous, static client set, bulk content.
    - Given some file size and client-set size.
    - Find the best possible, or optimal, algorithm that would minimize completion time from 1 server.
    - Use theory if possible; else simulate.
    - Compare to other known methods, like BitTorrent.
  - Remove simplifying assumptions one by one.
  - Assume static clients throughout.
14. Contributions
- Cooperative Homogeneous (Bulk Content) Scenario
  - Provably optimal algorithm
  - Investigated BitTorrent's completion time
    - Simulations
- Heterogeneous Scenario
  - Randomized heuristic-based algorithm
- Non-cooperative Scenario
  - Credit-limited barter scheme: analysis and simulation
- App-specific content delivery/ordering
  - Priority-based distribution heuristic.
16. Cooperative Distribution: Model
Block size B: quantum of data transmission (a block cannot be transmitted before it is fully received)
File F: k blocks B1, B2, ..., Bk
- T(k,n): time taken for all clients to receive all blocks.
- Time unit: 1 tick = B/U.
Goal: find the lowest possible value of T(k,n) and the algorithm that achieves this value.
17. Lower Bound
e.g. 1 block, 7 nodes: Binomial Tree is optimal
[Figure: binomial tree rooted at Server S, delivering block Bj to clients C1-C7 starting at Tick 1]
- Observations
  - k blocks take at least k ticks to leave the server.
  - The last block takes another ⌈log2 n⌉ - 1 ticks to reach every node.
Lower bound for T(k,n): k + ⌈log2 n⌉ - 1 (ticks)
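The two observations combine into the bound through a doubling argument. The derivation below is a sketch under stated assumptions: n counts every node including the server, each node uploads at most one block per tick, and h_j is a symbol introduced here (not on the slide) for the number of nodes holding the last block j ticks after it first leaves the server.

```latex
\begin{align*}
h_0 &= 2, \qquad h_{j+1} \le 2\,h_j
  && \text{(server plus one client; each holder uploads one copy per tick)} \\
h_j &\le 2^{\,j+1}
  && \text{(by induction)} \\
h_j \ge n &\;\Rightarrow\; j \ge \lceil \log_2 n \rceil - 1
  && \text{(all } n \text{ nodes must hold } B_k\text{)} \\
T(k,n) &\ge k + \lceil \log_2 n \rceil - 1
  && \text{(} B_k \text{ cannot leave the server before tick } k\text{)}
\end{align*}
```

For the slide's example (1 block, a server plus 7 clients, so n = 8 under this assumption) the bound gives 1 + 3 - 1 = 3 ticks, which the binomial tree achieves.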
18. Hypercube Algorithm
- Rule: round-robin transmission to neighbours
[Figure: 3-dimensional hypercube with Server S-000 (holding B1, B2, B3) and clients C-001, C-010, C-011, C-100, C-101, C-110, C-111, showing the blocks each node holds after Ticks 1, 2 and 3]
19. Hypercube Algorithm
- Rule: transmit highest-numbered block
[Figure: the same hypercube at Tick 4, showing the blocks each node holds]
20. Hypercube Algorithm
- Completes in optimal time!
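The two rules from the preceding slides (round-robin over hypercube dimensions, transmit the highest-numbered block) can be checked with a small synchronous simulation. This is a minimal sketch, not the author's implementation. Its assumptions: n = 2^L nodes with the server as node 0, the server sends block min(t, k) at tick t, and every other node sends the highest-numbered block that it holds and its current neighbour lacks. The function name `hypercube_completion_ticks` and the `max_ticks` guard are hypothetical.

```python
def hypercube_completion_ticks(k, L, max_ticks=100_000):
    """Ticks until all 2**L - 1 clients hold blocks 1..k (node 0 is the server)."""
    n = 1 << L
    full = set(range(1, k + 1))
    have = [set() for _ in range(n)]
    have[0] = set(full)                      # the server starts with the whole file
    tick = 0
    while any(have[c] != full for c in range(1, n)):
        tick += 1
        if tick > max_ticks:
            raise RuntimeError("simulation did not converge")
        # Round-robin over dimensions, most significant bit first (as in the figure).
        bit = 1 << (L - 1 - (tick - 1) % L)
        sends = []                           # decided from the state at the start of the tick
        for x in range(n):
            y = x ^ bit                      # x's neighbour along this tick's dimension
            if x == 0:
                block = min(tick, k)         # server: B_t, then B_k once t > k
                if block not in have[y]:
                    sends.append((y, block))
            else:
                missing = have[x] - have[y]  # blocks x holds that y lacks
                if missing:
                    sends.append((y, max(missing)))
        for y, block in sends:               # each node receives at most one block per tick
            have[y].add(block)
    return tick
```

For the 8-node, 3-block walkthrough, `hypercube_completion_ticks(3, 3)` returns 5, which equals k + L - 1.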
21. Arbitrary n
- Use a hypercube of logical nodes
  - A logical node can have 1 or 2 physical nodes
  - Dimension of hypercube: L = Floor(log2 n)
  - At most one block mismatch within a logical node
- This finishes in k + ⌈log2 n⌉ - 1 ticks
Our optimal algorithm design is complete.
22. Performance of Some Distribution Methods
- Completion times T(k,n) for:
  - Server serves each client: kn
  - Linear pipeline: k + n - 1
  - Multicast tree of degree d: d(k + log_d n - 2)
  - SplitStream with d parallel trees: k + d log_d n
All of the above are sub-optimal. Compare with k + ⌈log2 n⌉ - 1 (ticks).
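To get a feel for how far apart these expressions are, they can be evaluated side by side. The sketch below reads the slide's formulas as kn, k + n - 1, d(k + log_d n - 2), k + d log_d n and k + log2 n - 1. Rounding every logarithm up to an integer is an assumption, as are the helper names and the sample point (k = 1000 blocks, n = 1024 nodes).

```python
def ceil_log(n, base):
    """Smallest integer e with base**e >= n, using integer arithmetic only."""
    e, power = 0, 1
    while power < n:
        power *= base
        e += 1
    return e


def completion_times(k, n, d=2):
    """Ticks predicted by each formula for k blocks, n nodes and tree degree d."""
    return {
        "server_to_each_client": k * n,
        "linear_pipeline": k + n - 1,
        "multicast_tree": d * (k + ceil_log(n, d) - 2),
        "splitstream": k + d * ceil_log(n, d),
        "optimal": k + ceil_log(n, 2) - 1,
    }


if __name__ == "__main__":
    # Hypothetical sample point, e.g. a 100 MB file cut into 100 KB blocks.
    for name, ticks in completion_times(1000, 1024, d=2).items():
        print(f"{name:>22}: {ticks} ticks")
```

At this sample point the optimal bound is 1009 ticks, against 1020 for the SplitStream expression and about 2000 for the pipeline and binary-tree expressions.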
23. BitTorrent Comparison
- Asynchronous simulator modeling client/client messages in the BitTorrent spec.
- Small fixed no. of neighbours unchoked
  - Chosen in order of reverse data rate (tit-for-tat)
  - Decision revisited periodically (choke interval)
  - Ties broken by bandwidth to neighbour.
- 1 neighbour unchoked optimistically
  - Stays unchoked for 3 x choke interval
24. BitTorrent Results: Snapshot
- Assumed k blocks and n nodes (all arriving at time 0)
- Varied k and n from 10 to 2000
- Metric: completion time T (of all nodes)
- Least-squares estimate of T(k,n): 2.2k + 47 log2 n - 173
  - With default parameters
- This can be improved to 1.3k + 9.8 log2 n - 9
  - By tuning parameters: increasing the choke interval and decreasing the number of simultaneous uploads.
BitTorrent can be 2.2x worse than optimal (in completion time). That factor can fall to 1.3x by changing certain features (at the risk of weakening the tit-for-tat scheme).
26. Adapting to Heterogeneous Clients
- Hypercube algorithm requires a synchronized communication pattern
  - Can extend to some simple, specific heterogeneous cases.
  - Not suited to the general heterogeneous scenario
- Key operation: optimal mapping of nodes that need a block to nodes that have that block,
  - to ensure maximal utilization of client upload capacity
- Can we do this mapping randomly?
  - Random overlay graph.
  - Neighbor selection, e.g. Random
  - Block selection, e.g. Rarest-First
  - Transmit 1 block
  - Notify neighbors of block reception
  - Repeat
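The loop above (pick a neighbour, pick a block, transmit, repeat) can be sketched as a synchronous, tick-based simulation for the homogeneous case. This is an illustrative sketch, not the simulator used for the talk. Its assumptions: node 0 is the server, every node uploads and downloads at most one block per tick, the overlay is a ring plus random extra links (the ring only guarantees connectivity), and "rarest" is measured over the sender's own neighbours. The name `simulate_rand` and its parameters are hypothetical.

```python
import random


def simulate_rand(k, n, degree=6, seed=0):
    """Ticks until all n nodes (node 0 is the server) hold all k blocks."""
    rng = random.Random(seed)
    nbrs = [set() for _ in range(n)]
    for x in range(n):                       # ring: keeps the overlay connected
        y = (x + 1) % n
        if y != x:
            nbrs[x].add(y)
            nbrs[y].add(x)
    for x in range(n):                       # random extra links up to the target degree
        while len(nbrs[x]) < min(degree, n - 1):
            y = rng.randrange(n)
            if y != x:
                nbrs[x].add(y)
                nbrs[y].add(x)

    have = [set() for _ in range(n)]
    have[0] = set(range(k))
    ticks = 0
    while any(len(h) < k for h in have):
        ticks += 1
        receiving, transfers = set(), []
        order = list(range(n))
        rng.shuffle(order)
        for x in order:
            # Neighbour selection: random among neighbours that need something from x.
            candidates = [y for y in sorted(nbrs[x])
                          if y not in receiving and have[x] - have[y]]
            if not candidates:
                continue
            y = rng.choice(candidates)
            # Block selection: rarest-first among x's neighbours, ties by block index.
            useful = have[x] - have[y]
            block = min(useful,
                        key=lambda b: (sum(b in have[z] for z in nbrs[x]), b))
            transfers.append((y, block))
            receiving.add(y)
        for y, block in transfers:           # "notify": state becomes visible next tick
            have[y].add(block)
    return ticks
```

Calling `simulate_rand(20, 32)` gives one sample of the completion time for 20 blocks and 32 nodes. By the lower-bound argument it can never be below 20 + 5 - 1 = 24 ticks.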
27. The Price of Randomization
- Synchronous simulations of homogeneous clients
- Metric: completion time T(k,n)
  - Constant B; T in ticks (B/U).
- Overall range: k = 10-10000, n = 10-10000
- Least-squares estimate of T(k,n): 1.01k + 4.4 log2 n + 3.2
Randomized algorithm is close to optimal when k >> log2 n. Reduces completion time by a factor of 1.3-2.2 compared to BitTorrent (depending on tuning).
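The fitted expressions can be compared at a single point to see where the quoted factors come from. The sketch below takes the least-squares fits as 2.2k + 47 log2 n - 173 (BitTorrent, default), 1.3k + 9.8 log2 n - 9 (BitTorrent, tuned) and 1.01k + 4.4 log2 n + 3.2 (randomized), with k + ⌈log2 n⌉ - 1 as the optimum. The function name and the sample point are assumptions, and the BitTorrent fits were obtained for k, n up to 2000 only.

```python
import math


def fitted_completion_times(k, n):
    """Evaluate the fitted completion-time models (in ticks) at one (k, n) point."""
    lg = math.log2(n)
    return {
        "optimal": k + math.ceil(lg) - 1,
        "randomized": 1.01 * k + 4.4 * lg + 3.2,
        "bittorrent_tuned": 1.3 * k + 9.8 * lg - 9,
        "bittorrent_default": 2.2 * k + 47 * lg - 173,
    }


if __name__ == "__main__":
    times = fitted_completion_times(1000, 1024)   # hypothetical sample point
    for name, ticks in times.items():
        print(f"{name:>20}: {ticks:8.1f} ticks "
              f"({ticks / times['optimal']:.2f}x optimal)")
```

At k = 1000, n = 1024 the randomized fit gives about 1057 ticks against an optimum of 1009, so the additive log term matters little once k is much larger than log2 n.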
28. Heterogeneous Conditions
- Issues
  - Client upload bandwidths can be in a wide range.
  - Bandwidth can depend on the destination client too.
- Nodes can consider neighbour bandwidths when selecting a neighbour to transmit to.
  - BitTorrent's selection policies are not targeted at a completely cooperative scenario.
  - Always selecting higher-bandwidth neighbours can lead to starvation of lower-bandwidth nodes (worse completion times)
[Figure: transmissions of block B1, one of them over a lower-bandwidth path]
29. Heterogeneous Case: Heuristics
- Proposal: HRAND
- Intuition
  - Nodes have a queue of servable blocks
  - Make sure no queue becomes too short/empty.
- Demand metric for node Y accounts for
  - Blocks required by Y to serve its neighbors.
  - The no. of such neighbours
  - The bandwidth at which N can serve those neighbours.
    - Use a bandwidth threshold
- Lowest-supply node is chosen to send a block to.
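The slide lists the ingredients of the metric but not a formula, so the following is only one possible reading, not the HRAND definition. It assumes that a node's "supply" is the number of (neighbour, block) pairs it could usefully serve over links whose bandwidth meets a threshold, and that a sender picks the needy neighbour with the lowest supply. The names `supply` and `pick_receiver` and the `bw` dictionary are hypothetical.

```python
def supply(y, have, nbrs, bw, threshold):
    """Count (neighbour, block) pairs y can serve over links at or above the threshold."""
    return sum(
        len(have[y] - have[z])       # blocks y holds that neighbour z still lacks
        for z in nbrs[y]
        if bw[(y, z)] >= threshold   # ignore links that are too slow
    )


def pick_receiver(x, have, nbrs, bw, threshold):
    """Among x's neighbours that need a block from x, pick the one with lowest supply."""
    needy = [y for y in sorted(nbrs[x]) if have[x] - have[y]]
    if not needy:
        return None
    # Ties are broken by node id so the choice is deterministic.
    return min(needy, key=lambda y: (supply(y, have, nbrs, bw, threshold), y))


if __name__ == "__main__":
    nbrs = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}}
    have = {0: {0, 1, 2}, 1: {0}, 2: set(), 3: set()}
    bw = {(a, b): 1.0 for a in nbrs for b in nbrs[a]}
    # Node 2 has nothing to serve, so node 0 feeds it before node 1.
    print(pick_receiver(0, have, nbrs, bw, threshold=0.5))
```

Under this reading, a node whose queue of servable blocks is about to run dry is refilled first, which matches the stated intuition of keeping every queue non-empty.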
30. Simulation Methodology
- Algorithms simulated
  - RAND: random selection, no neighbour heuristic
  - HRAND: our heuristic, uses U(N)
  - GRAND: greedy neighbour selection; approximates BitTorrent, minus choke mechanisms.
- Client bandwidth distributions considered
  - Homogeneous
  - 2-Level
  - Clustered
31. Results
- Clustered model: clients grouped into 10 clusters; bandwidth within clusters is 10x the bandwidth across clusters
  - E.g. clusters can be topological or geographical.
- GRAND actually performs worse than RAND in this scenario
- HRAND outperforms GRAND by around a factor of 1.6-2.1.
33. Non-Cooperative Clients
- Proposal: incentive scheme based on barter of data blocks
- Credit-limited barter
  - X uploads to Y only if the net no. of blocks from X to Y is < S.
  - Degree limit required to limit free blocks.
- Advantages
  - Strictly defined invariant relationship between peers
  - No timing parameter (which adversely affects performance in BitTorrent's case).
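The credit-limited rule is a simple invariant on per-pair counters, which a few lines of code make concrete. This is a minimal sketch of the rule as stated (X may upload to Y only while blocks sent X to Y minus blocks sent Y to X is below S). The class name `BarterLedger` and its methods are hypothetical, and the degree limit is not modelled.

```python
from collections import defaultdict


class BarterLedger:
    """Tracks uploads between peers and enforces the credit limit S."""

    def __init__(self, credit_limit):
        self.credit_limit = credit_limit     # S on the slide
        self.sent = defaultdict(int)         # sent[(x, y)] = blocks uploaded from x to y

    def net(self, x, y):
        """Net number of blocks from x to y."""
        return self.sent[(x, y)] - self.sent[(y, x)]

    def may_upload(self, x, y):
        """The barter rule: x uploads to y only if net(x, y) < S."""
        return self.net(x, y) < self.credit_limit

    def record_upload(self, x, y):
        if not self.may_upload(x, y):
            raise ValueError(f"{x} has exhausted its credit towards {y}")
        self.sent[(x, y)] += 1


if __name__ == "__main__":
    ledger = BarterLedger(credit_limit=2)
    ledger.record_upload("X", "Y")
    ledger.record_upload("X", "Y")
    print(ledger.may_upload("X", "Y"))       # blocked until Y reciprocates
    ledger.record_upload("Y", "X")
    print(ledger.may_upload("X", "Y"))
```

Because the check depends only on block counts, no choke interval or other timing parameter is involved.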
34. Barter Results (snapshot)
- Approach
  - Focus on completion time of the above algorithms
  - Analysis of specific cases; simulations for the general case.
  - Not in scope: analysis of the strength of the incentive scheme.
35. Customizing Delivery
- Different applications => different requirements on data delivery
  - Download-in-order => download full movie, but start watching quickly.
  - Live video stream
  - Rewind and fast-forward semantics
- Proposal: block priority-based distribution
  - Minimal change to the distribution algorithm
App-specific Layer
Block priorities
App-independent Distribution (HRAND/Hypercube)
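The layering can be illustrated with a block-selection function that takes its priorities from the application layer and otherwise behaves like rarest-first. This is a hedged sketch of the idea only. The two priority functions (in-order and sliding window), the `rarity` mapping, and all names are assumptions introduced for illustration.

```python
def in_order_priority(block):
    """Download-in-order: earlier blocks are more urgent (smaller value = higher priority)."""
    return block


def sliding_window_priority(playhead, window):
    """Return a priority function that favours blocks in [playhead, playhead + window)."""
    def priority(block):
        return 0 if playhead <= block < playhead + window else 1
    return priority


def select_block(useful, priority, rarity):
    """App-independent layer: most urgent block first, then rarest, then lowest index."""
    return min(useful, key=lambda b: (priority(b), rarity[b], b))


if __name__ == "__main__":
    useful = {2, 5, 7}                 # blocks the sender has and the receiver lacks
    rarity = {2: 3, 5: 2, 7: 1}        # number of neighbours holding each block
    print(select_block(useful, in_order_priority, rarity))
    print(select_block(useful, sliding_window_priority(4, 2), rarity))
```

Only the `priority` argument changes between applications. The distribution layer keeps its own tie-breaking, which is the sense in which the change to the algorithm is minimal.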
36. Summary of Contributions
- Hypercube algorithm for optimal completion time in a homogeneous scenario.
- For heterogeneous scenarios, we proposed a randomized heuristic-based algorithm (and evaluated it by simulations)
- The above two algorithms are faster, simpler, and more general than related prior work: BitTorrent, Qiu et al. SIGCOMM '04, Yang et al. INFOCOM '04, Bar-Noy et al. DAM '00, SplitStream
- Established BitTorrent's completion time by simulations.
- Adapted to the non-cooperative scenario by proposing fast barter-based schemes
- Proposed a mechanism to enable application-specific customization of block ordering.
37. Future Work
- Real-world experience
- Implementation on PlanetLab
- Impact of messaging overheads
- Simulate real-world traces
- Reliability and Dynamicity
- Impact of network failure and node churn.
- Explore distributed/replicated tracker-state
- Consider resilient overlay routing to tracker
- More Customized Applications
- Emulate TIVO semantics.
- Algorithms for cyclic barter
- The hypercube satisfies cyclic barter, optimally.
38. Backup slides follow
39. Optimal Algorithm/Proof
- Binomial Pipeline (n = 2^L) [5]
  - Opening phase of L ticks
    - Nodes in L groups: Gi has 2^(L-i) nodes.
  - Middle phase
    - Match and swap!
  - End: server keeps sending Bk
After tick k-1, Bk moves along a binomial tree => Optimal!
[5] Yang et al., "Service Capacity of Peer-to-Peer Networks", INFOCOM '04, discusses a version of this algorithm for n = power of 2.
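A quick count shows that the L groups account for every client. The check below assumes, as an interpretation not spelled out on the slide, that G_i is the set of clients holding block B_i at the end of the opening phase, that n = 2^L counts the server, and that k is at least L.

```latex
\begin{align*}
\sum_{i=1}^{L} |G_i| \;=\; \sum_{i=1}^{L} 2^{\,L-i} \;=\; 2^{L-1} + 2^{L-2} + \dots + 1 \;=\; 2^{L} - 1 \;=\; n - 1
\end{align*}
```

For L = 3 this gives groups of 4, 2 and 1 clients, matching the 7 clients of the 8-node hypercube example.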
40. HRAND Results
- 2-Level model: 50% of clients have 10x the bandwidth of the remaining clients
  - e.g. Cable-Dialup mix or Cable-CampusNetwork mix.
- HRAND reduces the completion time by a factor of 1.2-1.8, compared to GRAND and RAND.
41. BACKUP: Barter Models (snapshot)
- Strict barter: lower bound k + n/2.
  - If download capacity > 2U, we have an algorithm with T(k,n) = k + n - 1.
  - High start-up cost => high completion times
- Relaxed barter
  - X uploads to Y only if the net no. of blocks from X to Y is < S.
  - But Y can get S x (degree) free blocks
  - So S has to impose a degree limit (issuing tokens to allow peering)
- Special-case analyses of relaxed barter indicate much lower completion times than strict barter
  - S = 2, n = power of 2: Hypercube algorithm can be used.
  - S = 1: T(k,n) upper-bounded by k + n - 2.
- Simulations for general cases.
42. Barter Results (snapshot)
- Random block selection requires high graph degree
- Low (near-optimal) completion time can be achieved
- Rarest-first block selection policy is necessary to maintain low degree.
43. Results (snapshot)
- Evaluation approach
  - 2 candidate upper layers
    - IBULK: Download-in-order
    - ISTREAM: Fixed-rate CBR
  - Priority scheme: sliding window (HRAND)
- Simulations
  - Comparison algorithms
    - Rarest-lowest block
    - GRW
      - Greedy bandwidth-based neighbor selection
      - Rarest-first block selection
      - Priority window
  - Metric: highest uninterrupted data rate, loss rate