Title: Protocol and System Design, Reliability, and Energy Efficiency in Peer-to-Peer Communication Systems
- Salman Abdul Baset
- salman_at_cs.columbia.edu
- Thesis defense
- October 29, 2010
2Motivation and Contributions
3Client-server communication system
node C
node D
ip addr A, B, C, D
proxy server / registrar
media relay server
register (ip addr D)
Up to 40% of Internet VoIP calls need media relays
register (ip addr C)
Scaling for millions of users: economic costs
- servers
- bandwidth
- management
register (ip addr A)
register (ip addr B)
signaling
signaling
media traffic (voice, video, IM)
media traffic (voice, video, IM)
NAT / firewall
NAT / firewall
node A
node B
4What is a p2p communication system?
node C
node D
media traffic (voice, video, IM)
media traffic over TCP (voice, video, IM)
ip addr B,C
ip addr A
signaling
ip addr of B
register (ip addr D)
register (ip addr C)
What is the ip address of B
register (ip addr A)
register (ip addr B)
signaling
ip addr D
media traffic (voice, video, IM)
NAT / firewall
NAT / firewall
node A
node B
5Challenges: designing, building, and analyzing p2p
communication systems
- Protocol and system design
- can we design a p2p communication protocol for
diverse deployments such as ad hoc, enterprise,
and Internet?
- Reliability
- are p2p communication systems reliable (dropped
calls)?
- Session quality
- what is the quality of real-time traffic over TCP
or through other nodes?
- Energy efficiency
- are p2p communication systems more energy
efficient than client-server?
- Measurement
- how can we analyze the performance of p2p systems
such as Skype?
6Contributions
- Protocol and system design
- first standardized and interoperable protocol
for building p2p communication systems
- Peer-to-peer protocol (P2PP) and RELOAD [IETF
I-Ds 2007 and 2010]
- OpenVoIP: a P2PP proof-of-concept system
[SIGCOMM08 demo]
- Reliability
- a simple model to investigate the reliability of
p2p communication systems [IPTCOMM10]
- Session quality
- a comprehensive study of TCP's feasibility for
real-time traffic [SIGMETRICS08]
- Energy efficiency
- first energy-efficiency study of VoIP systems
[GreenNetworking10]
- Measurement
- novel techniques for analyzing the workings of
p2p communication systems [GI08, Infocom06]
7In this talk
- Protocol and system design
- first standardized and interoperable protocol
for building p2p communication systems
- Peer-to-peer protocol (P2PP) and RELOAD [IETF
I-Ds 2007 and 2010]
- OpenVoIP: a P2PP proof-of-concept system
[SIGCOMM08 demo]
- Reliability
- a simple model to investigate the reliability of
p2p communication systems [IPTCOMM10]
- Session quality
- a comprehensive study of TCP's feasibility for
real-time traffic [SIGMETRICS08]
- Energy efficiency
- first energy-efficiency study of VoIP systems
[GreenNetworking10]
- Measurement
- novel techniques for analyzing the workings of
p2p communication systems [GI08, Infocom06]
8Related work
- Protocol and system design
- Skype: proprietary commercial system
- Distributed Hash Table (DHT) design
[Rhea05Sigcomm]
- nodes on PlanetLab run OpenDHT and store data
- no relaying service, only storage
- P2PP allows nodes with good connectivity to
provide storage and relaying service, not just
PlanetLab nodes
- feasibility of making Session Initiation Protocol
(SIP) peer-to-peer [Bryan06AAA, Singh06NOSSDAV]
- explored the feasibility of distributing the SIP
registrar
- protocol tied to SIP, could not be used by
non-SIP protocols
- P2PP can be used by non-SIP protocols
9Related work
- Reliability analysis
- Node isolation [Leonard07ToN]
- probability that a node outlives its neighbors
- I build a framework for understanding reliability
of calls
- Minimizing churn [Godfrey06Sigcomm]
- not sufficient
- I devise techniques to improve reliability
- Energy efficiency study
- p2p is more energy efficient than client-server
for file-sharing [Valancius09CoNEXT,
Nedevschi08Hotpower]
- I analyze whether this is also the case with VoIP
systems
- I also study where the energy inefficiencies are in
VoIP systems
10Outline
- Introduction
- Related work
- Protocol and system design
- Reliability analysis
- Energy efficiency study of VoIP systems
- Conclusion
11Protocol and system design
- Goal
- design an open, standardized, and interoperable p2p
communication protocol for building systems
- that distribute the functionality of proxy
server, registrar, and media relay server to
end-points
- that work in diverse deployments, e.g., ad hoc,
enterprise, Internet
- that run on heterogeneous devices
- that are extensible
12Requirements
node C
node D
- Multiple overlay algorithms
- Node heterogeneity
- NAT and firewall traversal
- Bootstrap and join
desktop
Emergency p2p
Internet p2p
portable
NAT / firewall
NAT / firewall
new node
node A
node B
13Requirements
ip addr A
node C
node D
- Multiple overlay algorithms
- Node heterogeneity
- NAT and firewall traversal
- Bootstrap and join
- Resilience
- Message reliability
- Request routing
- Security
- Data model
- Monitoring and diagnostics
message
node A
node B
14Peer-to-peer protocol (P2PP)
- Protocol stack of a P2PP node
- A request / response binary protocol
- protocol methods
- Reuses existing protocols
- Not a new DHT or bit-encapsulation
- How does it meet the requirements?
Published in IETF P2PSIP working group
15Meeting the requirements (1)
node C
node D
- Multiple overlay algorithms
- Multiple overlay algorithms
- LookupPeer method
- find a peer in the overlay to fill a node's routing
table
- method customized for each overlay algorithm
- Chord ($[X + 2^i, X + 2^{i+1})$), Kademlia (XOR), etc.
- ExchangeTable method
- exchange routing tables
- KeepAlive method
- check liveness
node A
node B
Routing table
node B
node D
Routing table
node C
node D
In another overlay algorithm
In one overlay algorithm
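To illustrate how one LookupPeer method can be customized per overlay, the sketch below contrasts the two distance notions named on this slide: Chord's clockwise (modulo) distance with finger intervals starting at X + 2^i, and Kademlia's XOR distance. The identifier width `M` and all function names are assumptions chosen for illustration; they are not taken from the P2PP specification.

```python
M = 8            # assumed identifier width in bits (toy value)
RING = 1 << M    # size of the identifier space


def chord_distance(a: int, b: int) -> int:
    """Clockwise distance from a to b on the ring (asymmetric)."""
    return (b - a) % RING


def chord_finger_interval(x: int, i: int) -> tuple:
    """Half-open interval [x + 2^i, x + 2^(i+1)) covered by finger i, modulo the ring."""
    return ((x + (1 << i)) % RING, (x + (1 << (i + 1))) % RING)


def kademlia_distance(a: int, b: int) -> int:
    """XOR distance between two identifiers (symmetric)."""
    return a ^ b


def chord_successor(key: int, peers: list) -> int:
    """Peer reached first when walking clockwise from key."""
    return min(peers, key=lambda p: chord_distance(key, p))


def kademlia_closest(key: int, peers: list) -> int:
    """Peer with the smallest XOR distance to key."""
    return min(peers, key=lambda p: kademlia_distance(key, p))
```

With the same peer set, the two overlays can pick different peers for the same key, which is why the routing-table entries shown for node A differ between overlay algorithms.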
16Meeting the requirements (2)
node C
node D
- Multiple overlay algorithms
- Node heterogeneity
- NAT traversal
- Bootstrap and join
- Node heterogeneity
- peers (super nodes) and clients (ordinary nodes)
- peer vs. client decision left to the system
designer
- NAT traversal
- a node encodes its host, NAT, and a relay IP
address in every message
- then performs connectivity checks
- Bootstrap and join
- Bootstrap, Join, Leave methods
- Bootstrap server
portable
node A
node B
node A
NAT / firewall
NAT / firewall
bootstrap server
new node
17Meeting the requirements (3)
node C
node D
- Multiple overlay algorithms
- Node heterogeneity
- NAT traversal
- Bootstrap and join
- Resilience
- Message reliability
- Request routing
- Resilience
- KeepAlive method
- Message reliability
- hop-by-hop, e2e
- ACK-based mechanism for unreliable transports
- Request routing
- recursive vs. iterative
- specified per message or per overlay
message
node B
node A
18Meeting the requirements (4)
node C
node D
- Multiple overlay algorithms
- Node heterogeneity
- NAT traversal
- Bootstrap and join
- Resilience
- Message reliability
- Request routing
- Data model
- Security
- Monitoring and diagnostics
- Data model
- Publish, Lookup, Replicate methods
- flexible data model
- key / type-value pairs
- data integrity (hash)
- Security
- identity (user, nodes)
- enrollment server, X.509 certificates
- Enroll method
- message confidentiality
- TLS, DTLS
- Monitoring and diagnostics gathering (e.g., CPU,
uptime)
- GetDiagnostics method
node B
node A
enrollment server
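The data model on this slide (key / type-value pairs with hash-based integrity) can be sketched as a small in-memory store. Everything here is hypothetical: the class and method names mirror the Publish and Lookup methods only loosely, the store is local rather than distributed, and SHA-256 is an assumed choice of hash, since the slide does not name one.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredObject:
    type: str      # application-defined type, e.g. "sip-contact" (hypothetical)
    value: bytes   # opaque value
    digest: str    # hex digest of value, used for integrity checks


def verify(obj: StoredObject) -> bool:
    """Recompute the hash of the value and compare it with the stored digest."""
    return hashlib.sha256(obj.value).hexdigest() == obj.digest


class ToyStore:
    """Local stand-in for the overlay storage reached by Publish / Lookup."""

    def __init__(self) -> None:
        self._objects = {}  # key -> list of StoredObject

    def publish(self, key: str, type_: str, value: bytes) -> StoredObject:
        obj = StoredObject(type_, value, hashlib.sha256(value).hexdigest())
        self._objects.setdefault(key, []).append(obj)
        return obj

    def lookup(self, key: str, type_: str) -> list:
        """Return objects stored under key with the given type that pass verification."""
        return [o for o in self._objects.get(key, [])
                if o.type == type_ and verify(o)]
```

Several types can live under one key, so a lookup names both the key and the type it wants.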
19P2PP summary
- Design summary
- methods for implementing the common aspects of
overlay algorithms
- overlay algorithm defines components of specific
methods
- separation of mechanism vs. policy
- P2PP now part of RELOAD protocol being
standardized in the IETF
- Limitations
- not a replacement for network file systems
- no permissions, store ephemeral data
- does not replace delay-tolerant network protocols
- Does it work in practice?
20OpenVoIP a P2PP proof of concept
bootstrap server
monitoring server / Google maps
Overlay 2
node C
node D
Overlay 1
node B
node A
NAT / firewall
NAT / firewall
node F
node E
SIGCOMM (demo) 2008
21OpenVoIP key facts and lessons learned
- 1000-node network on 500 PlanetLab machines
- DHTs: Kademlia, Bamboo, Chord
- App: Windows XP / Vista, Linux
- Code used and modified by Ericsson Labs, Nokia
Labs, Telecom Italia, and many universities
around the world
- Lessons learned
- DHT-specific part is only 10-15% of the total
code
- want to test a new p2p protocol?
- use the library provided
22Outline
- Introduction
- Related work
- Protocol and system design
- Reliability analysis
- Energy efficiency study of VoIP systems
- Conclusion
23Reliability of P2P comm. systems
- Reliability: proportion of completed calls (e.g.,
99.9%)
- Goals
- understand reasons for call failure
- devise techniques to address them
- Reasons for call failure
- (1) distributed search fails to find online
callee
- (2) distributed search fails to find a suitable
relay
- (3) relay fails during voice/video session
Recall: up to 40% of VoIP calls in the Internet need
relaying
IPTCOMM2010
24Understanding reliability of relayed calls (1)
- For a desired reliability, what is the minimum number
of relays k per call?
- Model
- when the $i$th relay fails, the call is switched to the
$(i+1)$st relay, which is instantly selected from the
global pool of all relays
- $R_i$: residual lifetime of a relay candidate
(i.i.d.)
- let $D$ denote the call duration
[Figure: call duration D covered by the successive relay residual lifetimes R_1, R_2, ..., R_{k-1}, R_k; target reliability 99.9%]
Qualitatively: if node lifetime >> call
duration, k is small, and vice versa
25Understanding reliability of relayed calls (2)
Exponential node lifetimes
Mean node lifetime / mean call duration | Min # of relays k
6 | 4
3 | 5
1 | 10
Skype node lifetimes (approximated as Pareto)
Node lifetime | Min # of relays k
12 hours (mean), 4 hours (median) | 3 (mean call holding time one hour)
95% of Skype relay calls last less than 1 hour
For one-hour Skype calls, a minimum of 3 relays is
needed to maintain a 99.9% success rate
What if the system does not have enough relays?
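The exponential-lifetime figures on this slide can be checked with a short calculation. The sketch below assumes, beyond what the slide states, that call durations are also exponentially distributed. Under that assumption, memorylessness gives each relay a probability 1/(1 + rho) of failing before the call ends, where rho is the ratio of mean node lifetime to mean call duration, so a call that can switch through k relays in sequence is dropped with probability (1/(1 + rho))^k. The function names are illustrative.

```python
def drop_probability(rho: float, k: int) -> float:
    """Probability that all k sequential relays fail before the call ends.

    rho = mean node lifetime / mean call duration; both are assumed exponential.
    """
    return (1.0 / (1.0 + rho)) ** k


def min_relays(rho: float, target_reliability: float = 0.999) -> int:
    """Smallest k whose call completion probability reaches the target."""
    k = 1
    while 1.0 - drop_probability(rho, k) < target_reliability:
        k += 1
    return k


for rho in (6, 3, 1):
    print(rho, min_relays(rho))
```

For ratios of 6, 3, and 1 this yields k = 4, 5, and 10 at 99.9% reliability. The Skype figure depends on the measured, heavy-tailed lifetime data and is not reproduced by this closed form.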
26Approaches for addressing the reliability of
relayed calls
- Approaches
- No-replacement (NR)
- select k relays at the beginning of a call
- do not replace failed relays
- With-replacement (WR)
- select k relays at the beginning of a call
- replace failed relays after a repair (search) time µ
- Skype uses a 2-relay with-replacement scheme
- Model
- complicated for arbitrary distributions
- for exponential lifetimes, I used Markov analysis
pure death process
(1) Why not make k arbitrarily large?
(2) Isn't WR always better than NR?
27Comparing the approaches for reliability of
relayed calls
- (1) Why not make k arbitrarily large?
- i.e., add more relays?
- diminishing returns
- liveness checks overhead
- (2) Isn't WR always better than NR?
- yes, but the percentage improvement varies
- depends on mean lifetime, call duration, repair
time
Num relays | MTTF improvement (%)
2 | 50
3 | 22
4 | 13
Skype: 2-relay with-replacement, search time = 60 s
Skype node lifetime: mean = 12 hours, median = 4 hours
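The diminishing returns from adding relays can be illustrated with the pure death process for the no-replacement scheme. The sketch assumes k relays with i.i.d. exponential lifetimes that are all selected at the start of the call; the mean time until the last one fails is then the mean lifetime times (1 + 1/2 + ... + 1/k). Reading the slide's MTTF column as the gain of the k-th relay over k-1 relays is an interpretation, not something the slide states.

```python
def mttf_no_replacement(k: int, mean_lifetime: float = 1.0) -> float:
    """Mean time until all k relays have failed (pure death process).

    With j relays still alive, the next failure arrives after mean_lifetime / j
    on average, so the total is mean_lifetime * (1 + 1/2 + ... + 1/k).
    """
    return mean_lifetime * sum(1.0 / j for j in range(1, k + 1))


def marginal_improvement_pct(k: int) -> float:
    """Percentage MTTF gain from using k relays instead of k - 1 (k >= 2)."""
    before = mttf_no_replacement(k - 1)
    after = mttf_no_replacement(k)
    return 100.0 * (after - before) / before


for k in (2, 3, 4):
    print(k, round(marginal_improvement_pct(k), 1))
```

Under these assumptions the gains are 50.0%, 22.2%, and 13.6% for the second, third, and fourth relay, close to the slide's 50, 22, and 13. The gain shrinks with each added relay, while every added relay still costs liveness checks.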
28Outline
- Introduction
- Related work
- Protocol and system design
- P2PP
- OpenVoIP
- Reliability analysis
- Energy efficiency study of VoIP systems
- Conclusion
29Energy efficiency study
- Goals
- (1) Where is energy consumed in IP-telephony
systems?
- (2) How do different design choices (p2p vs.
client-server) affect the energy consumption?
- (3) How can we make IP-telephony more
energy-efficient?
- IP-communication system classification
- p2p vs. client-server
- PSTN replacement (always on, providing
emergency calling) vs. communication addendum
GreenNetworking2010
30Sources of energy consumption in
IP-communication systems
- End-point
- handsets
- VoIP conversion boxes
- PCs
- NATs and firewalls
- Core
- signaling / directory
- media relaying
- PSTN / mobile gateways
- cooling
- power usage effectiveness (PUE)
- ratio of total data center power draw to IT equipment power draw
- Network
- joules per bit
31Approach
- Measurements
- End points
- desktop clients
- laptop clients
- hardware SIP phones
- Skype peers
- Core
- SIP server
- relay server
- Data (from client-server VoIP provider)
- 100K users (mostly business)
- 15 calls per second (CPS) peak
- 5K calls in system
- NAT keep-alive traffic
- all calls relayed
- Modeling
- P2P
- Client-server
32(1) Where is energy consumed? PSTN replacement
- VoIP servers consume less than 0.06% of total!
- 1 server: 500k users, 200 W
- 1 server: 50k simultaneous calls, 200 W
- 500k phones, each phone 5-7 W
- even after a redundancy factor of 2 and a
conservative PUE of 2!
Make PSTN replacement green? Reduce end-device
energy consumption
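The order of magnitude of the server share can be checked with simple arithmetic. The sketch below assumes a deployment of two 200 W servers (one for signaling, one for media relaying) serving 500k always-on phones; the server count is an assumption, since the slide gives per-server capacities rather than a total. The function and parameter names are illustrative.

```python
def server_share_pct(phones: int,
                     watts_per_phone: float,
                     servers: int,
                     watts_per_server: float = 200.0,
                     redundancy: float = 2.0,
                     pue: float = 2.0) -> float:
    """Server power as a percentage of total (servers + phones) power."""
    server_watts = servers * watts_per_server * redundancy * pue
    phone_watts = phones * watts_per_phone
    return 100.0 * server_watts / (server_watts + phone_watts)


for watts in (5.0, 6.0, 7.0):
    print(watts, round(server_share_pct(500_000, watts, servers=2), 3))
```

Across the 5-7 W phone range the server share stays at a few hundredths of a percent, the same range as the slide's figure, so the end devices dominate total consumption.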
33Where is energy consumed? Non-PSTN replacement
- Typically run on desktops, laptops as soft phones
- If soft phone draws little additional power
- still likely that end-device biggest component
- but may not dominate consumption
- If users leave PCs on just as phones
- possibly even worse than PSTN!
34(2) Client-server vs. peer-to-peer?
- Client-server model
- C/S power consumption: $p_{c/s} = \text{servers} \times \text{Watts/server} \times \text{redundancy factor} \times \text{PUE}$
- P2P model
- $S$: super nodes active
- $p_s$: Watts/super node
P2P is more energy efficient when $S \cdot p_s < p_{c/s}$
- One active super node per relayed call (Skype)
- 30% of calls relayed
- super nodes are 1.5% of total nodes
$p_s < 162$ mW
P2P may consume more than client-server!
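The break-even condition can be worked through numerically. The sketch below takes the provider data quoted earlier in the talk (100K users, 5K calls in the system) together with the 30% relay fraction, which gives S = 1500 active super nodes, or 1.5% of all nodes. The client-server power of 243 W is back-computed here from the slide's 162 mW bound (0.162 W x 1500) and is therefore an implied value, not one the slide states. The function name is illustrative.

```python
def break_even_watts_per_super_node(p_cs_watts: float, active_super_nodes: int) -> float:
    """Largest extra power per super node for which S * p_s stays below p_c/s."""
    return p_cs_watts / active_super_nodes


users = 100_000
calls_in_system = 5_000
relayed_fraction = 0.30

# One active super node per relayed call (the Skype-like assumption on the slide).
S = round(calls_in_system * relayed_fraction)
super_node_share_pct = 100.0 * S / users

p_cs_implied = 0.162 * S  # implied client-server power in Watts (back-computed)
print(S, super_node_share_pct, break_even_watts_per_super_node(p_cs_implied, S))
```

Because a relaying PC is likely to draw far more than a fraction of a Watt of additional power, the P2P design can end up consuming more than the client-server one.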
35(3) How can we make IP-telephony greener?
- Phones
- make phones energy efficient
- LCD, processor, Wake-on-LAN for phones?
- PCs
- wakeup on receiving calls
- NATs and firewalls
- eliminate NATs (IPv6 at least in theory)
36Conclusion
- Protocol and system design
- first standardized and interoperable protocol
for building p2p communication systems
- Peer-to-peer protocol (P2PP) and RELOAD [IETF
I-Ds 2007 and 2010]
- OpenVoIP: a P2PP proof-of-concept system
[SIGCOMM08 demo]
- Reliability
- a simple model to investigate the reliability of
p2p communication systems [IPTCOMM10]
- Session quality
- a comprehensive study of TCP's feasibility for
real-time traffic [SIGMETRICS08]
- Energy efficiency
- first energy-efficiency study of VoIP systems
[GreenNetworking10]
- Measurement
- novel techniques for analyzing the workings of
p2p communication systems [GI08, Infocom06]
37Publications
- Journal and magazine
- Eli Brosh, Salman A. Baset, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne, "The Delay-Friendliness of TCP for Real-time Traffic," IEEE/ACM Transactions on Networking, accepted.
- Salman A. Baset and Henning Schulzrinne, "Reliability and Relay Selection in Peer-to-Peer Communication Systems," in submission.
- Salman A. Baset and Henning Schulzrinne, "Making Peer-to-Peer Video Conferencing Work," in submission.
- Conference and workshop
- Salman A. Baset, Joshua Reich, Jan Janak, Pavel Kasparek, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne, "How Green is IP-telephony?," in Proc. of SIGCOMM Green Networking workshop, New Delhi, India, August 2010.
- Salman A. Baset and Henning Schulzrinne, "Reliability and Relay Selection in Peer-to-Peer Communication Systems," in Proc. of IPTCOMM, Munich, Germany, August 2010 (Best paper).
- Omer Boyaci, Andrea Forte, Salman A. Baset, and Henning Schulzrinne, "vDelay: A Tool to Measure Capture-to-Display Latency and Frame Rate," in Proc. of International Symposium on Multimedia (ISM), San Diego, CA, USA, December 2009.
- Katerina Argyraki, Salman A. Baset, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Eddie Kohler, Maziar Manesh, Sergiu Nedevschi, and Sylvia Ratnasamy, "Can Software Routers Scale?," in Proc. of second PRESTO workshop, Seattle, WA, USA, August 2008.
- Eli Brosh, Salman A. Baset, Dan Rubenstein, and Henning Schulzrinne, "The Delay-Friendliness of TCP," in Proc. of ACM SIGMETRICS, Annapolis, MD, USA, June 2008.
- Wookyun Kho, Salman A. Baset, and Henning Schulzrinne, "Skype Relay Calls: Measurements and Experiments," in Proc. of IEEE Global Internet Symposium, Phoenix, AZ, USA, April 2008.
- Salman A. Baset and Henning Schulzrinne, "An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol," in Proc. of IEEE INFOCOM, Barcelona, Spain, April 2006.
- Kishore Dhara, Salman A. Baset, and Venkatesh Krishnaswamy, "Dynamic Peer-To-Peer Overlays for Voice Systems," in Proc. of 3rd IEEE Workshop on Mobile Peer-to-Peer Computing, Pisa, Italy, March 2006.
- Demo
- Omer Boyaci, Andrea Forte, Salman A. Baset, and Henning Schulzrinne, "vDelay: A Tool to Measure Capture-to-Display Latency and Frame Rate," in Proc. of International Symposium on Multimedia (ISM), San Diego, CA, USA, December 2009.
- Salman A. Baset, Gaurav Gupta, and Henning Schulzrinne, "OpenVoIP: An Open Peer-to-Peer VoIP and IM System," in Proc. of SIGCOMM (demo), Seattle, WA, August 2008.
- Poster
- Salman A. Baset, Eli Brosh, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne, "Understanding the Behavior of TCP for Real-time Workloads," in Proc. of CoNEXT, Lisbon, Portugal, December 2006.
38References
- [Bryan06AAA] David A. Bryan, Bruce Lowekamp, and Cullen Jennings, "SoSIMPLE: A SIP/SIMPLE based P2P VoIP and IM system," in Proc. of AAA workshop, Orlando, FL, USA, July 2005.
- [Godfrey06Sigcomm] P. Brighten Godfrey, Scott Shenker, and Ion Stoica, "Minimizing churn in distributed systems," in Proc. of SIGCOMM, Pisa, Italy, August 2006.
- [Leonard07ToN] Derek Leonard, Zhongmei Yao, Vivek Rai, and Dmitri Loguinov, "On lifetime-based node failure and stochastic resilience of decentralized peer-to-peer networks," in IEEE/ACM Transactions on Networking, June 2007.
- [Nedevschi08Hotpower] Sergiu Nedevschi, Jitendra Padhye, and Sylvia Ratnasamy, "Hot data centers vs. cool peers," in Proc. of HotPower, San Diego, CA, USA, December 2008.
- [Rhea05Sigcomm] Sean Rhea, "OpenDHT: A publicly accessible DHT service," PhD thesis, University of California at Berkeley, Berkeley, CA, USA, 2005.
- [Singh06NOSSDAV] Kundan Singh and Henning Schulzrinne, "Peer-to-peer Internet telephony using SIP," in Proc. of NOSSDAV, Stevenson, WA, USA, June 2005.
- [Valancius09CoNEXT] Vytautas Valancius, Nikolaos Laoutaris, Laurent Massoulie, Christophe Diot, and Pablo Rodriguez, "Greening the Internet with nano data centers," in Proc. of CoNEXT, Rome, Italy, December 2009.
39Backup
40IP-based communication systems
Client-server
Peer-to-Peer
- Basic services
- establish voice, video, IM sessions
- voicemail
- Advanced services
- conferencing, telepresence
- voicemail to text
41Client-server IP communication system
SIP registrar / proxy / presence server
SIP registrar / proxy server
IP-PSTN gateway
REGISTER (ip addr)
REGISTER (ip addr)
(1) signaling
(1) signaling
(2) media (voice, video, IM)
User agent
User agent
Utopian Internet No NATs or firewalls
42Client-server IP communication system
NAT
packet
packet
Src-IP
Dst-IP
Pub-IP
Dst-IP
Network
Src-IP
Pub-IP
Pr-IP
Dst-IP
aka server-reflexive address
43P2P communication vs. file sharing
- P2P file-sharing systems
- tit-for-tat
- open NAT ports
- reduce download rate of files for nodes behind
NATs
- P2P communication systems
- no tit-for-tat
- opening NAT ports is a hassle
- cannot reduce rate, will impact quality
44Percentage of VoIP calls in the Internet that
need relaying?
- the provider knows?
- Some client-server VoIP providers relay all calls
- 15-20% of calls for a commercial client-server IM /
VoIP application
- Microsoft Messenger: 40%
- 341 relayed calls in 20 days for Skype
[Suh05Infocom]
- 17 per day for a super node (50K super nodes)
- NAT studies
45Protocol and system design
46Shared and different aspects
- Shared aspects
- Connectivity
- NAT traversal
- bootstrap
- Resilience
- recovery from node churn
- Request routing
- recursive vs. iterative
- parallel vs. sequential
- Heterogeneity of nodes
- mobile, desktop
- super node vs. ordinary node
- Security
- identity (user, nodes)
- message confidentiality
- Data model
- addressing, storage, integrity
- Message reliability
- hop-by-hop, e2e
- Different aspects
- Next-hop determination
- depends on the overlay algorithm
- Chord, Kademlia, Gia,
- proximity aware etc.
- Methods for implementing the common aspects
- Overlay algorithm defines components of specific
methods
Now part of RELOAD protocol being standardized in
the IETF
47Peer-to-peer protocol (P2PP)
- Now part of RELOAD protocol being standardized
in the IETF
- Not a new DHT or bit-encapsulation
- Geared towards IP telephony but applicable to
streaming, VoD etc.
- A request / response binary protocol
- Shared methods
- Join, Leave, Publish, Lookup, KeepAlive etc.
- Overlay-specific methods
- FindPeer, ExchangeTable
- Support different overlay algorithms (Chord,
Kademlia etc.)
- Application-level API
- Security
- enrollment server, X.509 certificates
- TLS, DTLS for message confidentiality
[Figure: protocol stack of a node, with layers SIP, API, P2PP, ICE, TLS / SSL]
IETF P2PSIP working group
48Peer-to-peer protocol (P2PP)
- Node heterogeneity
- peers (super nodes) and clients (ordinary nodes)
- decision left to the system designer
- use of peers as relays
- NAT traversal built-in
- a node exchanges its host, NAT, and a relay IP
address in requests and responses
- then uses ICE (Interactive Connectivity
Establishment) for NAT traversal
- Message reliability
- hop-by-hop, e2e
- Data model
- key / value pairs
- data integrity
- Monitoring and diagnostics gathering
49Implementation design
[Figure: OpenVoIP implementation design, annotated with lines of code (LoC) per component. Application API: publish (key, value, callback), lookup (key, callback), callback (resp). Components: Client, Bootstrap, KadPeer, BambooPeer, Diagnostic, Node, Parser / encoder, Routing table, Distance, Transactions, Neighbor table, BigInt, NAT, Data storage, Other (2771), Transport / timers, Sys, DTLS, TLS, UDP, TCP. Labels: app. pluggability, multiplatform.]
Non-DHT LoC: 15783
50Why not the best DHT?
- Is there any such thing as the best DHT?
- Chord: widely cited but not widely deployed
- DHTs are parameterized
- base, hash algorithm
- symmetric vs. asymmetric distance
- Chord (modulo) vs. Kademlia
- next-hop determination may be based purely on
the DHT or on a combination of DHT and proximity-aware
routing
- debugging and deployment
51P2PP and RELOAD
- Commonalities
- pluggable overlay algorithm
- feasibility demonstrated in OpenVoIP
- security
- self-signed and CA-signed certificates
- DTLS, TLS
- data integrity
- routing
- recursive, iterative, direct response
- NAT traversal
- core part of the protocol
52P2PP and RELOAD
- Differences
- message model
- P2PP: all messages can be routed in a recursive or
iterative manner
- RELOAD: only one message permitted for iterative
- data model
- P2PP: opaque blob, only the app can interpret the
data
- RELOAD: single value, array, dictionary
- message fragmentation over unreliable transports
- P2PP: use TCP
- RELOAD: handle UDP
- NAT traversal
- RELOAD: explicit connection establishment
- P2PP: encode host, server-reflexive, and relay
addresses in every message
53Skype using P2PP?
- Why not open the Skype protocol?
- sure, but
- Skype protocol tied with VoIP
- To use P2PP, Skype will have to
- abandon its own protocol ?
- use SIP for call establishment
- TLS, DTLS for security
- STUN, TURN, ICE protocols for NAT traversal
54OpenVoIP geographical interface
55OpenVoIP lessons learned
- Bootstrap
- maintain bootstrap nodes and ensure their
availability
- Randomization is our best friend!
- send the maintenance messages within a bounded
random time
- Churn recovery
- is on demand and periodic
- Insert a new entry in the routing table after
checking liveness
- Periodically republish SIP records
- not feasible for large records
- Avoid overly complex mechanisms
- can backfire!
56P2P video conferencing
- Send video to every participant (NxN)
- But
- uplink capacity
- NAT and firewalls
- downlink capacity
- Solution
- centralized
- costly, hardware-based
- peer-to-peer
- use helpers (idle Skype users)
- construct an application layer multicast tree
rooted at every participant
57P2P video conferencing
- Challenges
- optimize latency or number of helpers or both
(within a threshold)?
- select helpers close to source or final
recipients?
- recipients behind NAT and firewalls?
- helper churn
- backup for every helper?
- participant join and churn?
- who searches for the helper?
- root or new recipient?
- share helpers across trees?
- participants as helpers?
58Number of helpers
Nine-party conference
Number of helpers per tree = 4; total helpers = 36,
down from 63!
[Figure: multicast tree for one participant, with links labeled 1/4]
59Number of helpers
Total number of helpers, by number of participants and helper outdegree (HO)
Participants | Participant outdegree = 3: HO = 2, 3, 4, 5, 6 | Participant outdegree = 1: HO = 2, 3, 4, 5, 6
3 | 0, 0, 0, 0, 0 | 3, 3, 3, 3, 3
4 | 4, 4, 4, 4, 4 | 8, 4, 4, 4, 4
6 | 12, 6, 6, 6, 6 | 24, 12, 12, 6, 6
10 | 60, 30, 20, 20, 20 | 80, 40, 30, 20, 20
backup for every helper?
6 helpers per tree
60Number of helpers
Total number of helpers, by number of participants and helper outdegree (HO)
Participants | Participant outdegree = 3: HO = 2, 3, 4, 5, 6 | Participant outdegree = 1: HO = 2, 3, 4, 5, 6
3 | 0, 0, 0, 0, 0 | 3, 3, 3, 3, 3
4 | 4, 4, 4, 4, 4 | 8, 4, 4, 4, 4
5 | 5, 5, 5, 5, 5 | 15, 10, 5, 5, 5
6 | 12, 6, 6, 6, 6 | 24, 12, 12, 6, 6
7 | 21, 14, 7, 7, 7 | 35, 21, 14, 14, 7
8 | 32, 16, 16, 8, 8 | 48, 24, 16, 16, 16
9 | 45, 27, 18, 18, 9 | 63, 36, 27, 18, 18
10 | 60, 30, 20, 20, 20 | 80, 40, 30, 20, 20
backup for every helper?
6 helpers per tree
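The helper counts in this table follow a regular pattern that can be captured in a few lines. The sketch below is a fit inferred from the table values, not the construction described in the talk. It assumes that each participant roots one tree, that a source with outdegree d serves d - 1 recipients directly and hands the rest to a helper subtree, and that a subtree with L recipients as leaves and helpers of outdegree h needs ceil((L - 1) / (h - 1)) helpers, with a minimum of one. All names are illustrative.

```python
def helpers_per_tree(participants: int,
                     participant_outdegree: int,
                     helper_outdegree: int) -> int:
    """Helpers needed in the multicast tree rooted at one participant."""
    recipients = participants - 1
    direct = participant_outdegree - 1      # inferred: one source link feeds the helpers
    if recipients <= direct:
        return 0                            # the source reaches everyone directly
    via_helpers = recipients - direct
    # Integer ceiling of (via_helpers - 1) / (helper_outdegree - 1).
    needed = -(-(via_helpers - 1) // (helper_outdegree - 1))
    return max(1, needed)


def total_helpers(participants: int,
                  participant_outdegree: int,
                  helper_outdegree: int) -> int:
    """One tree per participant, with no sharing of helpers across trees."""
    return participants * helpers_per_tree(
        participants, participant_outdegree, helper_outdegree)


print(total_helpers(9, 1, 2), total_helpers(9, 1, 3))
```

For a nine-party conference with participant outdegree 1, raising the helper outdegree from 2 to 3 reduces the total from 63 to 36 helpers, matching the earlier slide.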
61Split the stream
participant outdegree = 1, helper outdegree = 2, split
the stream = 3: total helpers = 3 x 6 = 18
participant outdegree = 1, helper outdegree = 2: total
helpers = 4 x 6 = 24
62Tree construction for video conferencing
- Splitting the stream decreases the number of helpers
- gain increases as the number of helpers increases (> 4)
- Source selects helpers close to itself
- Helper pool
- Related work
- ALM: 1->many; video conferencing: many->many
- participant churn, bandwidth, managed servers as
helpers
- 1-hop tree construction without split and
incorporate participants as helpers
63Reliability analysis
64Understanding reliability of relayed calls
- Model and simulations
- event driven, $10^7$ calls
- synthetic (exponential, Pareto) and real Skype
lifetime data sets
- Skype node lifetime data set (1,740 Skype nodes)
- Skype (uptime mean = 12 hours, median = 4 hours)
- approximated using shifted Pareto and exponential
- 95% of relayed Skype calls are less than 60
minutes [Guha06]
- desired reliability: 99.9%
1-relay failure error (15)
For 95% of Skype call durations, a minimum of 3 relays
is needed to maintain a 99.9% success rate
65Improving reliability of relayed calls
- Approach 1 -- no-replacement
- select k relays in the beginning of a call
- do not replace failed relays
- Approach 2 -- with-replacement
- select k relays in the beginning of a call
- replace failed relays after µ
- no failure during switch over
- Skype uses 2-relay with-replacement scheme
pure death process
Bir04
66Distributed relay selection
- Goal O(1) hop
- 2-level hierarchical network
Give me a relay
Here is a randomly selected relay
IP address RTT Bandwidth
IP address RTT Bandwidth
NAT
search performance
dropped calls
close-by
NAT
1-relay
local-random scheme
67Distributed relay selection
- Results
- strategies perform similarly near the system collapse
point
- minimizing latency increases annoyance and the number of
jobs per relay, and vice versa
- threshold approach performs reasonably well
- Comparison with existing approaches
- OneHop DHT
- Efficient routing for peer-to-peer overlays
[Gupta04NSDI]
- Direct comparison not possible
- we do not create a one-hop DHT
- leverage the connectivity information of a peer
- Delay
- User annoyance
- interference with user applications
- file sharing (draft idle peers)
- spare capacity
- random
- mindelay
- select relay with minimum delay
- netmax
- select relay with maximum spare bw
- threshold
- select relays with delay < 150 ms and maximum
spare capacity
68Distributed relay selection
69Understanding TCP behavior for real-time traffic
70Real-time traffic over TCP
- Why over TCP?
- restrictive NAT and firewalls
- Why not?
- TCP is likely to exhibit poor performance for
VoIP and live video (first tried in 1970s, but
TCP has evolved)
- Our result
- acceptable performance for VoIP and video
(streaming, conferencing) under certain
conditions
- Why is VoIP and video over TCP feasible?
- (1) Factors impacting delay
- (2) Working region
71Factors impacting delay
- (1) Packet size (small is better)
- during backlog, VoIP packets (200 bytes) can be
combined in one MSS (1500 B)
- (2) Congestion window regulation (implicitly
favors small packet size)
- TCP regulates cwnd based on the number of ACKs
received (ACK-counting)
- for two flows with the same packet rate
but different bit rates, this works in favor of the smaller
bit rate
- Delay friendliness for VoIP
- No Nagle or delayed ACKs
- ACK-counting
72Factors impacting delay
- Byte-counting increases VoIP delays by 10-20%
- VoIP delays are significantly lower than video
for the same packet rate
- TCP-induced delay: AIMD, HOL (head-of-line)
- VoIP: HOL dominates; video: AIMD dominates
- (CBR) VoIP: 64 kb/s (173 bytes/packet); video: 573
kb/s (MSS bytes/packet)
Delay improvement: for VoIP, modify TCP recv(); for video,
use parallel connections or inflate the window
73Working region
Application | RTT | Loss rate
VoIP | 100 ms | 2%
Video (stream) | 100 ms | 3%
Video (conf) | 100 ms | 1%
- Video conferencing has the most
constrained region
- acceptable performance for VoIP
74Playout buffer setting
- Time to recover a lost packet
- Fast retransmit: $1.5\,RTT + 3/f$
- Timeout: $4\,RTT$
[Figure: recovery timeline for lost packet i, with intervals RTT, 3/f, and 0.5 RTT]
VoIP: RTT 100 ms, lr 2; Video (conf): RTT 100 ms,
lr 1; Video (stream): RTT 100 ms, lr 2
75Playout buffer setting
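- Time to recover a lost packet
- Fast retransmit: $1.5\,RTT + 3/f$ (CBR flows)
- Timeout: $4\,RTT$
[Figure: two recovery timelines for lost packet i, with intervals RTT, 3/f, and 0.5 RTT; one at RTT = 100 ms, lr = 1 and one at RTT = 100 ms, lr = 3]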
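The recovery times on this slide translate directly into a lower bound for the playout buffer. The sketch below reads f as the packet rate of the CBR flow in packets per second, so that 3/f is the time needed to send the three further packets that trigger duplicate ACKs; that reading and the example rate of 50 packets/s are assumptions, as are the function names.

```python
def fast_retransmit_recovery_s(rtt_s: float, packet_rate_hz: float) -> float:
    """Time to recover a lost packet via fast retransmit: 1.5 * RTT + 3 / f."""
    return 1.5 * rtt_s + 3.0 / packet_rate_hz


def timeout_recovery_s(rtt_s: float) -> float:
    """Time to recover a lost packet via retransmission timeout: 4 * RTT."""
    return 4.0 * rtt_s


rtt = 0.100   # 100 ms, as on the slide
rate = 50.0   # assumed VoIP packet rate (one packet every 20 ms)
print(fast_retransmit_recovery_s(rtt, rate), timeout_recovery_s(rtt))
```

At a 100 ms RTT, fast retransmit recovers a loss in about 210 ms under these assumptions, while a timeout takes about 400 ms, so a playout buffer sized for fast retransmit does not absorb timeout-driven recoveries.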
76Related work
- Supporting low-latency TCP based media streams
[IWQoS02]
- TCP stack modification at sender
- TCP-RC: a receiver-centered TCP protocol for
delay-sensitive applications [MMCN05]
- TCP stack modification at receiver
77Energy Efficiency
78Does VoIP consume more energy than PSTN?
- Insufficient information
- Columbia phone system (presently)
- system 40K watts
- cooling 50K watts
- phone lines 13,848
- per user 6.4W
- VoIP
- Cisco phone 5-7W
79Servers needed
Transport | NAT keep-alive | 100k | 1M | 10M | 100M | Watts / server
UDP | Yes, NOTIFY/s | 1 | 2 | 20 | 200 | 210
UDP | No | 1 | 1 | 10 | 100 | 190
TLS | No | 3 | 25 | 250 | 2500 | 209
server as of totals c/s
calls relayed | 100k | 1M | 10M | 100M
0 | 0 | 0 | 0 | 0
30 | 1 | 2 | 10 | 96
100 | 1 | 4 | 32 | 320
calls relayed | 100k | 1M | 10M | 100M
UDP (NAT) | 0.4 | 0.1 | 0.05 | 0.04
UDP
TLS 0.2
80Skype
81Measurement Skype
- Super node, ordinary node, login server
- Actively guards against reverse engineering
- LD_PRELOAD
- forcing Skype to use a modified shared library
- Voice and video calls
- relaying
- over TCP
- Ports no default listening port
- opens port 80 (HTTP) and 443 (TLS)
- Contact list
- stored centrally, initially distributed
- Video conferencing
- using central servers ?
INFOCOM06
82Is Skype free-riding on universities' bandwidth?
- Two Skype clients at Columbia University forced
to use a relay
- 6,000 relay calls
- Median latency: 95 ms
- 46% of calls through relays with a .edu suffix
- 8% of calls through Columbia Skype users
- Is it deliberate?
- probably not
- relay selection biased towards high-capacity
nodes which happen to be in universities
our lab
NAT
NAT
GI08
83Future work
84Directions for future research
- A holistic framework for reliability,
performance, and energy tradeoffs in data centers
- virtualization, consolidation
- Comparing VoIP and PSTN energy consumption
- Preventing data lock-in for social networks and
cloud-based services
- enabling seamless data migration across different
cloud providers
- holy grail: one-click data migration
85Client-server IP communication system
SIP registrar / proxy server
media server
IP-PSTN gateway
- What is centralized?
- directory service
- call signaling
- media session and conferencing
- PSTN connectivity
(1) signaling
NAT / firewall
Peer-to-Peer distribute to user agents
NAT / firewall
(1) signaling
(2) media (voice, video, IM) (UDP or TCP)
User agent
User agent
- Scaling for millions of users
- servers
- bandwidth costs
- management overhead
Why is this a problem?
- How many calls need media relaying?
- 15-40%
- some ISPs relay all calls