1
Large Scale Overlay Measurements
  • Scott A Crosby
  • Dan S Wallach
  • (Rice University)

2
Overview
  • Describe use of DHT in BitTorrent
  • Design and implementation of the DHT
  • Design bugs
  • Measurements
  • Insights
  • Future work

3
Why BitTorrent
  • Structured overlay
  • Massive user count
  • Examined 2 overlays, >4 million nodes
  • Real world measurements
  • Churn
  • Timeouts
  • NAT

4
BitTorrent
  • File broken up into pieces
  • Peers exchange with each other
  • Peers connected in random graph

5
BitTorrent
  • Rendezvous point
  • Where peers learn about each other
  • Centralized tracker

Tracker
Tracker
6
Distributed Tracker
  • Implement tracker using a DHT
  • DHT on top of Kademlia structured P2P overlay
  • Examine two different implementations
  • Azureus
  • Mainline BitTorrent

7
Part 1: Implementations
  • Kademlia's Design
  • BitTorrent Implementation
  • Azureus Implementation

8
Kademlia
  • By Maymounkov and Mazières in 2002
  • Prefix-based routing table
  • XOR metric
  • Soft state
  • Stored on nearest k nodes to ID
  • Iterative search

9
Prefix Routing Table
  • B: a most-recently-seen (MRS) queue
  • Parameterized by:
  • b: branching factor
  • k: queue length
  • A node sharing a prefix with the targetID knows a node sharing a longer
    prefix (see the sketch after the figure)

[Figure: example prefix routing table, branching factor 8 (digits 0-7). Each row is a bucket B of up to k entries; rows are labeled with successively longer prefixes (x, 2x, 26x, 264x, 72645x).]
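Below is a minimal sketch of how such a prefix table could be organized, assuming 160-bit IDs and a branching factor of 8 to match the figure; the names are illustrative and not taken from Azureus or Mainline.

```python
# Illustrative prefix routing table (not Azureus/Mainline code).
# IDs are 160-bit integers; branching factor 8 means 3 bits per digit,
# so a contact's row is the number of leading digits it shares with our ID.

ID_BITS = 160
BITS_PER_DIGIT = 3          # branching factor b = 2**3 = 8, as in the figure

def shared_digits(my_id: int, other_id: int) -> int:
    """Count the leading base-8 digits other_id shares with my_id."""
    shared = 0
    for level in range(ID_BITS // BITS_PER_DIGIT):
        shift = ID_BITS - (level + 1) * BITS_PER_DIGIT
        if (my_id >> shift) != (other_id >> shift):
            break
        shared += 1
    return shared

class RoutingTable:
    """One bucket B (a bounded queue of length k) per shared-prefix length."""
    def __init__(self, my_id: int, k: int = 8):
        self.my_id, self.k = my_id, k
        self.buckets = [[] for _ in range(ID_BITS // BITS_PER_DIGIT + 1)]

    def add(self, node_id: int) -> None:
        bucket = self.buckets[shared_digits(self.my_id, node_id)]
        if node_id not in bucket and len(bucket) < self.k:
            bucket.append(node_id)   # a full bucket silently ignores newcomers in this sketch
```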
10
Routing
  • Priority queue
  • Use XOR distance from targetID
  • Iterative search
  • 5-way Concurrency

11
Routing
  • Use XOR distance metric
  • Priority queue
  • Iterative search
  • 5-way Concurrency

12
Routing
  • Initialized from own routing table

Closer to destination
13
Routing
  • Send out 5 concurrent requests

14
Routing
  • Get a reply

15
Routing
  • Current priority queue

16
Routing
  • Send out another request

17
Routing
  • Get another Reply

18
Routing
  • Current priority queue

19
Routing
  • Send out another request

20
Routing
  • Get a reply

21
Routing
  • Current priority queue

22
Routing
  • Send Request

23
Routing
  • Get a reply

24
Routing
  • Current priority queue

25
Routing
  • Send a request

26
Routing
  • Get a reply

27
Routing
  • Current priority queue

28
Routing
  • Send a request

29
Routing
  • All nodes dead. Full RPC timeout.

30
Routing
  • All nodes dead. Full RPC timeout.

31
Routing
  • Sending requests
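The walkthrough on the preceding slides can be condensed into a short sketch. It assumes a hypothetical rpc_find_node(node, target) call that returns a node's k closest known contacts, or None after a full RPC timeout; the concurrency is only simulated sequentially here.

```python
import heapq

def xor_distance(a: int, b: int) -> int:
    return a ^ b

def iterative_lookup(target: int, seeds, rpc_find_node, alpha: int = 5, k: int = 20):
    """Iterative Kademlia lookup sketch: keep querying the closest unqueried
    candidates, up to alpha at a time, seeded from our own routing table."""
    queue = [(xor_distance(n, target), n) for n in seeds]   # priority queue by XOR distance
    heapq.heapify(queue)
    queried, live = set(), []
    while queue and len(live) < k:
        # Pull up to alpha closest unqueried candidates (real implementations
        # issue these RPCs concurrently; here they run one after another).
        batch = []
        while queue and len(batch) < alpha:
            _, node = heapq.heappop(queue)
            if node not in queried:
                queried.add(node)
                batch.append(node)
        for node in batch:
            contacts = rpc_find_node(node, target)
            if contacts is None:
                continue              # dead node: we just paid a full RPC timeout
            live.append(node)
            for c in contacts:
                if c not in queried:
                    heapq.heappush(queue, (xor_distance(c, target), c))
    return sorted(live, key=lambda n: xor_distance(n, target))[:k]
```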

32
Maintenance
  • Implicitly learn nodes
  • Incoming queries
  • Nodes lazily removed
  • Ping each bucket
  • Random lookup in every idle bucket
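A possible sketch of the last rule above (a random lookup in every idle bucket), assuming each bucket carries a prefix and a last-activity timestamp; the 30-minute interval matches the refresh period on the next slides, everything else is illustrative.

```python
import random
import time

REFRESH_INTERVAL = 30 * 60   # seconds; matches the 30-minute refresh on the next slides

def random_id_in_bucket(prefix: int, prefix_bits: int, id_bits: int = 160) -> int:
    """Pick a random id_bits-bit ID whose leading prefix_bits equal `prefix`,
    i.e. an ID that falls in that bucket's region of the ID space."""
    low_bits = id_bits - prefix_bits
    return (prefix << low_bits) | random.getrandbits(low_bits)

def refresh_idle_buckets(buckets, do_lookup, now=None):
    """Lazy maintenance: any bucket with no recent traffic gets a lookup for a
    random ID inside its prefix, which exercises (and so prunes) stale entries
    and implicitly learns fresh ones."""
    now = time.time() if now is None else now
    for b in buckets:   # each bucket here: {"prefix", "prefix_bits", "last_activity"}
        if now - b["last_activity"] > REFRESH_INTERVAL:
            do_lookup(random_id_in_bucket(b["prefix"], b["prefix_bits"]))
```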

33
Mainline Implementation
  • Close implementation of design
  • Branching = 2, k = 8, 8-way concurrency
  • Refresh every 30 minutes
  • Active and used by default

34
Azureus
  • More complex
  • Branching = 16, k = 20, 5-way concurrency
  • Stateful
  • Migration and Replication
  • Check every 30 minutes
  • Churn
  • Load balancing
  • Reinsert every 8 hours
  • Active, but not used by default

35
Part 2: Kademlia Design Bugs
  • Iterative lookups
  • Pruning dead nodes
  • Implicit route table learning
  • Find() nearest k nodes
  • XOR distance metric

36
Pruning Dead Nodes
  • Random lookups barely use my own routing table
  • Only discover death during refresh or explicit pings

[Figure: the same routing table, with each row annotated by the probability (1/8^1 down to 1/8^5) that a random lookup exercises it.]
37
Implicit Route Table Learning
  • Incoming lookups come from random source IDs
  • So deep rows rarely learn eligible nodes
  • New nodes are only discovered during refresh

[Figure: the same routing table again, with each row annotated by the probability (1/8^1 down to 1/8^5) that an incoming query's source ID is eligible for it.]
38
Why Iterative?
  • Recursive routing pays a full RPC timeout for each dead node on the path
  • Iterative routing pays roughly one RPC timeout per ⟨concurrency⟩ dead
    nodes, since their timeouts overlap (sketch below)
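A back-of-the-envelope model of that claim, with made-up inputs (25 contacted nodes, 40% dead entries) roughly in line with the measurements later in the talk:

```python
def expected_full_timeouts(nodes_contacted: int, dead_fraction: float, concurrency: int) -> float:
    """Rough model of the claim above: a recursive lookup waits out one full
    RPC timeout per dead node it forwards through, while an iterative lookup
    with `concurrency` requests in flight lets those timeouts overlap, costing
    roughly one timeout interval per `concurrency` dead nodes hit."""
    dead_hits = nodes_contacted * dead_fraction
    return dead_hits / concurrency

# Example: a lookup that touches 25 nodes, with ~40% of routing entries dead.
print(expected_full_timeouts(25, 0.4, concurrency=1))  # recursive-like: ~10 timeout waits
print(expected_full_timeouts(25, 0.4, concurrency=5))  # 5-way iterative: ~2 timeout waits
```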

39
Nearest K nodes
  • We want
  • k nodes nearest to targetID
  • We get, from each host:
  • The k nearest nodes it knows to the destination
  • Not all of them are live!
  • Effective size ≈ 0.60k
  • Bad replication set
  • Kademlia doesn't define an algorithm for this

40
XOR Distance Metric
  • Cannot enumerate nodes
  • No global order
  • Nodes only orderable in order of distance to key
  • Changes for every lookup key
  • No predecessors/successors
  • No leaf set

41
XOR Distance Metric
  • Only way to determine who is responsible for key
    is to route to it (even for replicas)
  • In a perfect world:
  • The node responsible for a key need not have any other replica of that
    key in its routing table
  • So it won't know when to replicate/migrate the key
  • In an imperfect world:
  • The replicas might not even know each other

42
XOR Distance Metric
  • Fragile
  • Poor hard-state support
  • Correctness bugs
  • Migration bugs
  • Security problems
  • Cannot locally identify responsibilities and duties

43
Example
  • NN(x): the nearest neighbor to x
  • NS(x): the k nearest neighbors to x, ordered by distance
  • NN() is not commutative
  • NS() is not transitive

44
Example
  • Assume
  • Replication factor k
  • Bucket size k
  • Perfect network
  • Static
  • If there is room in a bucket for an eligible node, that node will be in
    the bucket
  • Nodes return the k nearest to a given search key

45
Example Details
  • Store(K): the replica nearest to K won't have any other node responsible
    for K in its local routing table
  • KeyID K = 00 0000 0000, stored on M and the A_i's
  • M's ID = 01 0000 0000, with B_i's in its routing table
  • A_i = 10 0000 xxxx
  • B_i = 11 0000 xxxx
  • M, the nearest neighbor to K, knows no other replica storing K
  • Note that in a perfect world the A_i's must have M in their routing
    tables; in an imperfect world, the A_i's and M might not know of each
    other! (See the sketch below.)
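A small numeric check of this example, using the 10-bit IDs as written; the bucket contents are one possible unlucky fill chosen to show the failure, and real IDs are of course 160 bits.

```python
K = 0b00_0000_0000                              # the key
M = 0b01_0000_0000                              # nearest neighbor to K
A = [0b10_0000_0000 | i for i in range(8)]      # A_i's: the other replicas of K
B = [0b11_0000_0000 | i for i in range(8)]      # B_i's: not responsible for K

xor = lambda x, y: x ^ y
nearest_to_K = sorted([M] + A + B, key=lambda n: xor(n, K))
print([bin(n) for n in nearest_to_K[:9]])       # M first, then the A_i's: these store K

# With branching factor 2, every node whose first bit differs from M's
# (every A_i and every B_i) competes for the same bucket in M's table.
# If that bucket (size k = 8) fills with B_i's, then M, the node nearest
# to K, knows none of the other replicas of K.
ms_bucket = B[:8]
assert not set(ms_bucket) & set(A)
```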

46
Part 3: Measurements
  • Lookup performance
  • Node lifespan
  • Node count estimate
  • DHT traffic
  • Bad migrations (Azureus)

47
Collecting Measurements
  • Ran several Sybils
  • 10 for Azureus (have tested 50)
  • 10 for Mainline (have tested 200)
  • No defense against the Sybil attack

48
Methodology
  • A measurement is unbiased if:
  • The measured value is independent of the nodes being sampled
  • Used for
  • Node lifespan
  • Dead routing table entries

49
Lookup Performance
  • Random lookups
  • 30k measurements
  • Atrocious
  • median > 60 seconds
  • Mainline bug:
  • 50% of searches end on dead nodes
  • Removed from results

50
Lookup Hops
51
Lookup Time
52
Dead nodes in routing table
  • Too many dead nodes in routing table
  • 42% in Azureus
  • 41% in Mainline
  • 95% of lookups hit 25 dead nodes
  • Each dead node consumes request slot
  • until the RPC timeout
  • Mainline has higher concurrency

53
Fraction Alive CDF
54
Lookup Time w/o timeouts
55
Conclusion
  • RPC timeouts kill us alive
  • Residual performance stinks
  • Concurrency in lookup helps
  • But can't deal with 20 dead nodes

56
Node Timeline
[Figure: node lifetime timeline. Phases: Discovery, Uptime, Pre-Zombie, Zombie, delimited by the events Join, Seen, Last Alive, First Dead, Removed.]
57
Infant Mortality
  • Nodes that can't be reached
  • Or are only seen once
  • 50% of Mainline nodes die young
  • 70% of Azureus nodes die young

58
Azureus Lifetime CDF
59
Conclusion
  • Azureus has bugs in node refresh
  • Not conforming to Kademlia design

60
Mainline Lifetime CDF
61
How to count the nodes?
  • NodeID Density
  • Can't enumerate the nodes in a range
  • Instead:
  • Search for the k nodes nearest a random ID
  • Order the nodes by distance to the search ID
  • Distance between IDs
  • One sample

62
DHT Node Count Estimate
  • Algorithm 1
  • Adjacent Pairwise
  • Estimate = 2^160 / (ID difference)
  • Gives us 7/19 distributions (Mainline/Azureus)

63
DHT Node Count Estimate
  • Algorithm 2
  • Non-Adjacent Pairwise
  • Estimate = N · 2^160 / (ID difference)
  • Gives 7 distributions (Mainline BT); see the sketch below
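One plausible reading of the two estimators as code, with a hypothetical helper and 160-bit IDs: order the k returned nodes by XOR distance to the random target and turn each gap into a population sample.

```python
def size_estimates(result_ids, target, id_bits=160):
    """For the k nodes returned by a lookup for `target`, ordered by XOR
    distance to it, a pair of results separated by N positions yields the
    population sample N * 2**id_bits / (difference of their XOR distances);
    N = 1 is the adjacent-pairwise estimator, larger N the non-adjacent one."""
    space = 2 ** id_bits
    distances = sorted(node_id ^ target for node_id in result_ids)
    samples = {}
    for n_sep in range(1, len(distances)):
        samples[n_sep] = [n_sep * space // (distances[i + n_sep] - distances[i])
                          for i in range(len(distances) - n_sep)
                          if distances[i + n_sep] > distances[i]]
    return samples
```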

64
Azureus Node Count CDF
Estimated size 2.3M ± 100k, 73k measurements
65
Mainline Node Count CDF
Estimated size 2.0M ± 500k, 30k measurements
66
Conclusion
  • CDF distributions should be identical
  • But they aren't!
  • We're not getting the k nearest
  • Empirically proves
  • Replication sets for Mainline are bad!!
  • Kademlia fragility
  • Azureus better
  • 20-way replication
  • Handles dead nodes better in search

67
Azureus Migration Errors
  • 40% of directly stored keys rejected
  • 30% of migrated keys rejected
  • Empirically proves:
  • Azureus can't reliably determine ownership

68
DHT Traffic
  • Break time into 5 second windows
  • Plot the CDF of the fraction of events that occur in a window with X
    events (see the sketch below)
  • PLOTS TODO
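A minimal sketch of the windowing step (the plotting itself is omitted):

```python
from collections import Counter

def events_per_window(event_times, window_seconds=5.0):
    """Bucket event timestamps (in seconds) into fixed 5-second windows and
    count how many events land in each; the slide's CDF is then computed
    from these per-window counts."""
    return Counter(int(t // window_seconds) for t in event_times)

# e.g. comparing the busiest window against the median one, as on the next slide:
# counts = sorted(events_per_window(times).values())
# ratio = counts[-1] / counts[len(counts) // 2]
```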

69
DHT Traffic
  • Busiest 5 second window has only 4x the events of
    the median 5 second window.

70
Measuring BitTorrent Swarm Sizes
71
Part 4: Insights
  • RPC Timeouts
  • Kill performance
  • Big impact in large DHT
  • Concurrency isn't a fix
  • Recursive/Iterative routing
  • Recursive routing implicitly verifies node
    liveness

72
Part 4: Insights
  • FindReplicationSet(id,k)
  • Need primitive to find replication set
  • Even with dead nodes in search
  • Nearest neighbor distance metric bad
  • Not transitive or commutative
  • Latency/bandwidth tradeoff:
  • Maintenance bandwidth for liveness checking
  • vs. timeouts from hitting dead nodes

73
Future Work
  • Use insights to create better P2P design

74
Conclusion
  • Measurements of a large scale DHT
  • Identified many design problems
  • Insights in P2P network design