Title: A Framework for Lazy Replication in P2P VoD
1 A Framework for Lazy Replication in P2P VoD
- Bin Cheng (1), Lex Stein (2), Hai Jin (1), Zheng Zhang (2)
- (1) Huazhong University of Science and Technology (HUST)
- (2) Microsoft Research Asia (MSRA)
- NOSSDAV 2008, Braunschweig, Germany, May 30, 2008
2 Background
- VoD, popular Internet service
  - YouTube, Hulu
- Can P2P help VoD?
  - Feasibility
  - Performance improvement
- P2P, useful technology
  - File sharing, live streaming
  - BitTorrent, PPLive
- GridCast with caching
  - 36% decrease
  - 43% departure misses
- Replication in P2P VoD
3 Outline
4 Motivation: what does GridCast look like?
http://www.gridcast.cn
5 Motivation: GridCast system overview
- Hybrid architecture (client-server + P2P)
- Tracker indexes all joined peers
- Source Server stores a complete copy of every video
- Peer fetches chunks from source servers or other peers
- Web Portal provides the video catalog
[Diagram: tracker, web portal, source server]
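The fetch path described on this slide can be sketched as follows. This is a minimal illustration, not GridCast's actual code: the `Peer` class, the `fetch_chunk` function, and the `"source-server"` label are hypothetical names, and the tracker and bandwidth limits are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical peer holding a set of cached (video_id, chunk_index) pairs."""
    name: str
    online: bool = True
    cache: set = field(default_factory=set)


def fetch_chunk(chunk, neighbors):
    """Return the name of whoever serves `chunk`.

    Online neighbors are tried first. The source server is the fallback,
    since it stores a complete copy of every video.
    """
    for peer in neighbors:
        if peer.online and chunk in peer.cache:
            return peer.name
    return "source-server"


# Illustrative data only: peer A is offline, peer B caches chunk 1 of video v1.
neighbors = [
    Peer("A", online=False, cache={("v1", 0)}),
    Peer("B", cache={("v1", 1)}),
]
served_by = [fetch_chunk(("v1", i), neighbors) for i in range(3)]
```

Every chunk that no online neighbor can supply ends up at the source server, which is the load that replication aims to reduce.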
6 Motivation: trace collection
- GridCast has been deployed on CERNET since May 2006
- Network (CERNET)
  - 1,500 universities, 20 million hosts
  - Good bandwidth, 2 to 100 Mbps to the desktop (core is complicated)
- Content
  - 2,000 videos
  - 48 minutes on average
  - 400 to 800 Kbps, 610 Kbps on average
7 Motivation: trace analysis
- Classify misses by their causes
- Chunk X does not hit in the peer cache. Why?
  - New content
    - Never fetched by any peer
  - Peer departed
    - Fetched by some peers, but all of them are offline
  - Peer evicted
    - Fetched by an online peer, but evicted
  - Cannot connect
    - Cached by some online peer that is not in the neighborhood
  - Insufficient bandwidth
    - Cached by some neighbor, but cannot retrieve it
43%
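The miss classification above can be read as a decision chain. The sketch below is one possible encoding under stated assumptions: the set-based inputs and the names `Miss` and `classify_miss` are hypothetical, and the trace analysis itself may have used different bookkeeping.

```python
from enum import Enum


class Miss(Enum):
    NEW_CONTENT = "new content"
    PEER_DEPARTED = "peer departed"
    PEER_EVICTED = "peer evicted"
    CANNOT_CONNECT = "cannot connect"
    INSUFFICIENT_BANDWIDTH = "insufficient bandwidth"


def classify_miss(fetchers, online, holders, neighbors):
    """Classify why a chunk missed in the peer cache.

    fetchers:  peers that ever fetched the chunk
    online:    peers currently online
    holders:   peers that still cache the chunk
    neighbors: peers in the requester's neighborhood
    All four arguments are sets of peer identifiers (an assumption).
    """
    if not fetchers:
        return Miss.NEW_CONTENT            # never fetched by any peer
    if not (fetchers & online):
        return Miss.PEER_DEPARTED          # every fetcher is offline
    online_holders = holders & online
    if not online_holders:
        return Miss.PEER_EVICTED           # an online fetcher evicted it
    if not (online_holders & neighbors):
        return Miss.CANNOT_CONNECT         # cached, but outside the neighborhood
    return Miss.INSUFFICIENT_BANDWIDTH     # a neighbor has it but cannot serve it
```

The checks run from the least to the most recoverable cause, so each miss is assigned exactly one category.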
8 Motivation: challenges and chances
Caching is not enough. Can we do better?
9 Replication: three key questions
Framework
10 Replication: fundamental tradeoff
- Benefit
  - Reduce departure misses
  - Reduce some eviction misses if the cache is not full
- Cost
  - Increase network traffic
  - Increase bandwidth misses
  - Increase some eviction misses if the cache is full
11 Replication: eager replication
- Replicate all missed chunks
- Use all of the unused bandwidth
[Diagram: peers A, B, and C in a neighborhood]
12 Replication: lazy replication
- Based on two predictors
  - Peer departure predictor
  - Chunk request predictor
- Lazy-oracle and lazy-simple
- Lazy factor
  - How much of the remaining bandwidth can be used
- Target peer selection
  - Random, sequential, file locality first
[Diagram: neighborhood]
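A rough sketch of the two knobs named on this slide, the lazy factor and target peer selection. It assumes the lazy factor is a fraction in [0, 1] applied to the unused bandwidth, and it guesses at what the three selection policies mean; the function names, the candidate tuple layout, and the policy strings are all hypothetical.

```python
import random


def replication_budget(unused_kbps, lazy_factor):
    """Bandwidth (Kbps) allowed for replication.

    Assumption: the lazy factor is the fraction of unused bandwidth
    that replication may consume.
    """
    if not 0.0 <= lazy_factor <= 1.0:
        raise ValueError("lazy_factor must be in [0, 1]")
    return unused_kbps * lazy_factor


def pick_target(candidates, policy, rng=random):
    """Pick a target peer for a replica.

    candidates: list of (peer_name, same_file_chunks_cached) tuples,
                in neighborhood order (hypothetical layout).
    """
    if not candidates:
        return None
    if policy == "random":
        return rng.choice(candidates)[0]
    if policy == "sequential":
        # Assumed meaning: take neighbors in list order.
        return candidates[0][0]
    if policy == "file_locality_first":
        # Assumed meaning: prefer the peer already caching the most
        # chunks of the same file.
        return max(candidates, key=lambda c: c[1])[0]
    raise ValueError(f"unknown policy: {policy}")
```

For example, with 400 Kbps unused and a lazy factor of 0.25, this sketch would let replication use 100 Kbps.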
13 Replication: peer departure predictor
- Based on the observation of online time
  - 50% of user sessions last less than 10 minutes
  - A peer with a longer online time is likely to stay longer
- Simple departure predictor
  - Online time below 10 minutes: leave
  - Online time above 10 minutes: stay
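The simple departure predictor amounts to a single threshold test on online time. In the sketch below the 10-minute threshold comes from the slide; the function name and the choice to treat exactly 10 minutes as "stay" are assumptions.

```python
STAY_THRESHOLD_MINUTES = 10  # threshold stated on the slide


def predicts_stay(online_minutes, threshold=STAY_THRESHOLD_MINUTES):
    """Return True if the peer is predicted to stay online.

    Assumption: the boundary is inclusive, so a peer that has been
    online for exactly `threshold` minutes is predicted to stay.
    """
    return online_minutes >= threshold
```

A peer that has been online for 25 minutes is predicted to stay, while one that joined 3 minutes ago is treated as likely to leave.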
14 Replication: chunk request predictor
- Chunks requested recently are more likely to be requested in the near future
- Simple chunk request predictor
  - Use the chunk access history of the last several hours
  - Give higher weight to recent requests
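One way to give recent requests a higher weight is exponential decay over a bounded history window. This is only a sketch: the slides do not say which weighting was used, so the decay form, the half-life, the default window, and the name `request_score` are all assumptions.

```python
def request_score(request_times, now, window_s=3600.0, half_life_s=900.0):
    """Score a chunk from its recent request timestamps (in seconds).

    Requests older than `window_s` are ignored. Each remaining request
    contributes a weight that halves every `half_life_s` seconds, so
    recent requests count more. Both parameters are illustrative.
    """
    score = 0.0
    for t in request_times:
        age = now - t
        if 0 <= age <= window_s:
            score += 0.5 ** (age / half_life_s)
    return score
```

Chunks with higher scores would be better replication candidates, since they are the ones most likely to be requested again soon.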
15 Performance Evaluation: simulation setup
- Trace-driven
- 1 GB
- Realized bandwidth
- Last 1 hour of history for the chunk request predictor
- 10-minute interval for the peer departure predictor
- Use the existing neighborhood
- Metrics
  - Benefit: decrease of chunks served by the source servers
  - Cost: increase of chunks replicated between peers
  - Efficiency: Benefit / Cost
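The three metrics can be computed from chunk counts taken from two simulation runs, one without and one with replication. The sketch assumes absolute chunk counts (the slides do not say whether counts or percentages were used); the function name and the numbers in the usage note are made up for illustration.

```python
def replication_metrics(server_chunks_baseline,
                        server_chunks_with_replication,
                        replicated_chunks):
    """Return (benefit, cost, efficiency) for one replication policy.

    benefit:    decrease of chunks served by the source servers
    cost:       chunks replicated between peers (zero in the baseline)
    efficiency: benefit / cost, defined as 0.0 when nothing was replicated
    """
    benefit = server_chunks_baseline - server_chunks_with_replication
    cost = replicated_chunks
    efficiency = benefit / cost if cost else 0.0
    return benefit, cost, efficiency
```

With hypothetical inputs of 1000 baseline server chunks, 850 with replication, and 300 replicated chunks, the benefit is 150, the cost is 300, and the efficiency is 0.5.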
16 Performance Evaluation: exploring configurations
17 Performance Evaluation: lazy factor
- More chunks have their replication delayed until the peer leaves
- Smaller lazy factor, more efficient
18 Performance Evaluation: comparison
- Lazy-simple is close to lazy-oracle in terms of benefits
- Lazy-simple is better than eager in terms of efficiency
- Lazy-simple: 15% decrease of server load
19 Conclusions
20 Thank you! Any questions?