Brahms - PowerPoint PPT Presentation


Transcript and Presenter's Notes

1
Brahms
  • Byzantine-Resilient Random Membership Sampling

Bortnikov, Gurevich, Keidar, Kliot, and Shraer
2
Edward (Eddie) Bortnikov
Maxim (Max) Gurevich
Idit Keidar
Alexander (Alex) Shraer
Gabriel (Gabi) Kliot
3
Why Random Node Sampling?
  • Gossip partners
    • Random choices make gossip protocols work
  • Unstructured overlay networks
    • E.g., among super-peers
    • Random links provide robustness, expansion
  • Gathering statistics
    • Probe random nodes
  • Choosing cache locations

4
The Setting
  • Many nodes, n
    • 10,000s, 100,000s, 1,000,000s, ...
  • Come and go
    • Churn
  • Every joining node knows some others
    • Connectivity
  • Full network
    • Like the Internet
  • Byzantine failures

5
Byzantine Fault Tolerance (BFT)
  • Faulty nodes (portion f)
    • Arbitrary behavior: bugs, intrusions, selfishness
    • Choose their ids arbitrarily
  • No CA, but no panacea for Sybil attacks
  • May want to bias samples
    • Isolate nodes, DoS nodes
    • Promote themselves, bias statistics

6
Previous Work
  • Benign gossip membership
    • Small (logarithmic) views
    • Robust to churn and benign failures
    • Empirical study: Lpbcast, Scamp, Cyclon, PSS
    • Analytical study: Allavena et al.
    • Never proven uniform samples
    • Spatial correlation among neighbors' views: PSS
  • Byzantine-resilient gossip
    • Full views: MMR, MS, Fireflies, Drum, BAR
    • Small views, some resilience: SPSS
    • We are not aware of any analytical work

7
Our Contributions
  • Gossip-based BFT membership
    • Tolerates a linear portion f of Byzantine failures
    • O(n^(1/3))-size partial views
    • Correct nodes remain connected
    • Mathematically analyzed, validated in simulations
  • Random sampling
    • Novel memory-efficient approach
    • Converges to provably independent uniform samples

The view is not all bad
Better than benign gossip
8
Brahms
  1. Sampling - local component
  2. Gossip - distributed component

(Diagram: Gossip maintains the view; the view's id stream feeds the Sampler, which outputs the sample.)
9
Sampler Building Block
  • Input: data stream, one element at a time
    • Bias: some values appear more than others
    • Used with the stream of gossiped ids
  • Output: uniform random sample
    • of unique elements seen thus far
    • Independent of other Samplers
    • One element at a time (converging)

(Diagram: next → Sampler → sample)
10
Sampler Implementation
  • Memory: stores one element at a time
  • Use a random hash function h
    • From a min-wise independent family [Broder et al.]
    • For each set X and every x ∈ X: Pr[h(x) = min h(X)] = 1/|X|

(Diagram: init chooses a random hash function; next keeps the id with the smallest hash so far; sample returns it.)
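A minimal sketch of this building block in Python. It assumes SHA-256 keyed with a fresh random seed per Sampler as a stand-in for drawing h from a min-wise independent family; a real implementation would use an actual (approximately) min-wise family.

```python
import hashlib
import os

class Sampler:
    """Keeps the id whose hash is smallest so far (memory: one element)."""

    def __init__(self):
        self.seed = os.urandom(16)  # "choose random hash function" (assumed: seeded SHA-256)
        self.elem = None            # current sample
        self.min_hash = None        # smallest hash seen so far

    def _h(self, elem: str) -> bytes:
        return hashlib.sha256(self.seed + elem.encode()).digest()

    def next(self, elem: str) -> None:
        """Feed one stream element; keep it if its hash is the new minimum."""
        h = self._h(elem)
        if self.min_hash is None or h < self.min_hash:
            self.min_hash, self.elem = h, elem

    def sample(self):
        """Uniform (in expectation) over the unique elements seen so far."""
        return self.elem
```

Because h is fixed per Sampler, the output depends only on the set of unique ids seen, not on how often an adversary repeats an id, which is exactly the bias-resistance the slide relies on.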
11
Component S: Sampling and Validation

(Diagram: the id stream from gossip is fed, via next, to an array of Samplers; each Sampler's output passes through a Validator that checks liveness using pings, and the validated outputs together form the sample of component S; a failed validation re-inits the Sampler.)
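Component S can be sketched as an array of independent samplers, each paired with a validator that pings its current sample and re-initializes the sampler when the sampled node is unresponsive. The `ping` hook and the dict-based sampler here are illustrative assumptions, not the paper's exact interfaces.

```python
import hashlib
import os

def make_sampler():
    # One min-wise sampler: random seed, current minimum hash, current element.
    return {"seed": os.urandom(16), "min": None, "elem": None}

def sampler_next(s, elem):
    h = hashlib.sha256(s["seed"] + elem.encode()).digest()
    if s["min"] is None or h < s["min"]:
        s["min"], s["elem"] = h, elem

class ComponentS:
    """An array of samplers; validators evict unresponsive samples."""

    def __init__(self, n_samplers, ping):
        self.samplers = [make_sampler() for _ in range(n_samplers)]
        self.ping = ping  # ping(id) -> bool; an assumed network hook

    def update(self, ids):
        """Feed the gossiped id stream into every sampler."""
        for s in self.samplers:
            for i in ids:
                sampler_next(s, i)

    def validate(self):
        """Re-init any sampler whose sampled node fails a ping."""
        for k, s in enumerate(self.samplers):
            if s["elem"] is not None and not self.ping(s["elem"]):
                self.samplers[k] = make_sampler()

    def sample(self):
        return [s["elem"] for s in self.samplers]
```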
12
Gossip Process
  • Provides the stream of ids for S
  • Needs to ensure connectivity
  • Use a bag of tricks to overcome attacks

13
Gossip-Based Membership Primer
  • Small (sub-linear) local view V
  • V constantly changes - essential due to churn
  • Typically evolves in (unsynchronized) rounds
  • Push: send my id to some node in V
    • Reinforces underrepresented nodes
  • Pull: retrieve the view of some node in V
    • Spreads knowledge within the network
  • Allavena et al. '05: both are essential
    • Low probability of partitions and star topologies

14
Brahms Gossip Rounds
  • Each round:
    • Send pushes and pulls to random nodes from V
    • Wait to receive pulls and pushes
    • Update S with all received ids
    • (Sometimes) re-compute V
  • Tricky! Beware of adversary attacks
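The round structure above can be sketched as follows. The network hooks (`send_push`, `pull_from`, `recv_pushes`) and the fan-out of two are assumptions for illustration; view re-computation is deliberately left out, since the following slides explain the safeguards it needs.

```python
import random

def brahms_round(view, update_samplers, send_push, pull_from, recv_pushes):
    """One simplified Brahms gossip round.

    Assumed hooks: send_push(id) pushes my id to a node; pull_from(id)
    returns that node's view; recv_pushes() returns ids pushed to us.
    """
    # Send my id (push) to random nodes from the view.
    for target in random.choices(view, k=min(2, len(view))):
        send_push(target)

    # Pull views from random nodes in the view.
    pulled = []
    for target in random.choices(view, k=min(2, len(view))):
        pulled.extend(pull_from(target))

    # Collect the pushes that arrived this round.
    pushed = recv_pushes()

    # Update S with ALL received ids -- the samplers, not the view,
    # are what converges to a uniform sample.
    update_samplers(pushed + pulled)

    # (Sometimes) re-compute V -- omitted here; see tricks 1-4.
    return pushed, pulled
```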

15
Problem 1: Push Drowning

(Illustration: while correct nodes push single ids - "Push Alice", "Push Bob", "Push Carol", "Push Dana", "Push Ed" - faulty nodes M flood the target with "Push Mallory", "Push Malfoy", and more, drowning out the correct pushes.)
16
Trick 1: Rate-Limit Pushes
  • Use limited messages to bound faulty pushes system-wide
    • E.g., computational puzzles / virtual currency
  • Faulty nodes can send a portion p of them
  • Views won't be all bad

17
Problem 2: Quick Isolation

(Illustration: faulty nodes concentrate their pushes on one victim until her view is all faulty - "Ha! She's out! Now let's move on to the next guy!" - then shift the attack to the next target.)
18
Trick 2: Detection & Recovery
  • Do not re-compute V in rounds when too many pushes are received
  • Slows down isolation; does not prevent it

(Illustration: a swamped node - "Hey! I'm swamped! I better ignore all of 'em pushes" - discards the round's pushes instead of updating V.)
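The detection test can be as simple as comparing the round's push count against the expected rate. The threshold below is an assumed illustration, not the paper's exact rule.

```python
def should_recompute_view(pushes_received, expected_pushes, slack=2.0):
    """Trick 2 sketch: skip view re-computation in rounds that look swamped.

    expected_pushes is the per-round push rate a correct node anticipates;
    slack is an assumed tolerance factor.
    """
    return pushes_received <= slack * expected_pushes
```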
19
Trick 3: Balance Pulls & Pushes
  • Control the contribution of push - α|V| ids - versus the contribution of pull - β|V| ids
    • Parameters α, β
  • Pull-only ⇒ eventually all faulty ids
    • Pulls from faulty nodes yield all faulty ids; from correct nodes, some faulty ids
  • Push-only ⇒ quick isolation of the attacked node
  • Push ensures ids are not all bad system-wide
  • Pull slows down (but does not prevent) isolation

20
Trick 4: History Samples
  • Attacker influences both push and pull
  • Feedback: γ|V| random ids from S
    • Parameters α + β + γ = 1
  • Attacker loses control - samples are eventually perfectly uniform

Yoo-hoo, is there any good process out there?
21
View and Sample Maintenance

(Diagram: the new view V is recomputed from α|V| pushed ids, β|V| pulled ids, and γ|V| ids fed back from the sample S; all received ids also feed S.)
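The re-computation sketched above, with the α|V| / β|V| / γ|V| split, might look like this (a sketch; it assumes each source holds enough distinct ids and that duplicates were already removed):

```python
import random

def recompute_view(pushed, pulled, history, view_size,
                   alpha=0.45, beta=0.45, gamma=0.1):
    """New view = alpha*|V| pushed ids + beta*|V| pulled ids
    + gamma*|V| history samples from S, with alpha + beta + gamma = 1."""
    n_push = int(alpha * view_size)
    n_pull = int(beta * view_size)
    n_hist = view_size - n_push - n_pull  # remainder goes to history
    view = random.sample(pushed, min(n_push, len(pushed)))
    view += random.sample(pulled, min(n_pull, len(pulled)))
    view += random.sample(history, min(n_hist, len(history)))
    return view
```

With γ = 0.1 the history contribution is small, yet per slide 23 it is the amplification step that lets the protocol cope with any adversary.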
22
Key Property
  • Samples take time to help
    • Assume the attack starts when samples are empty
  • With appropriate parameters:
    • E.g.,
    • Time to isolation > time to convergence

Prove lower bound using tricks 1, 2, 3 (not using samples yet)
Prove upper bound on the time until some good sample persists forever
Self-healing from partitions
23
History Samples: Rationale
  • Judicious use essential
    • Bootstrap; avoid slow convergence
    • Deal with churn
  • With a little bit of history samples (10%) we can cope with any adversary
    • Amplification!

24
Analysis
  1. Sampling - mathematical analysis
  2. Connectivity - analysis and simulation
  3. Full system simulation

25
Connectivity ⇒ Sampling
  • Theorem: If the overlay remains connected indefinitely, samples are eventually uniform

26
Sampling ⇒ Connectivity Ever After
  • Perfect sample of a sampler with hash h: the id with the lowest h(id) system-wide
    • If correct, it sticks once the sampler sees it
  • Correct perfect sample ⇒ self-healing from partitions ever after
  • We analyze PSP(t): the probability of a perfect sample at time t

27
Convergence to 1st Perfect Sample
  • n = 1000
  • f = 0.2
  • 40 unique ids in stream

28
Scalability
  • Analysis says:
  • For scalability, we want a small and constant convergence time
    • independent of system size, e.g., when

29
Connectivity Analysis 1: Balanced Attacks
  • Attack all nodes the same
    • Maximizes faulty ids in views system-wide in any single round
  • If repeated, the system converges to a fixed-point ratio of faulty ids in views, which is < 1 if
    • γ = 0 (no history) and p < 1/3, or
    • history samples are used (any p)

There are always good ids in views!
30
Fixed Point Analysis: Push

(Diagram: local views of correct nodes 1..i between time t and t+1; arrows mark a push, a lost push, and a push from a faulty node.)

x(t): portion of faulty ids in correct nodes' views at round t
Portion of faulty pushes to correct nodes: p / (p + (1 − p)(1 − x(t)))
31
Fixed Point Analysis: Pull

(Diagram: local views of correct nodes 1..i between time t and t+1; a pull from node i is faulty with probability x(t); a pull from a faulty node returns only faulty ids.)

E[x(t+1)] = α · p / (p + (1 − p)(1 − x(t))) + β · (x(t) + (1 − x(t)) · x(t)) + γ · f
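The recurrence can be iterated numerically to find the fixed point. This is a sketch: the parameter values mirror the simulations on the following slides, and the history term is assumed (as in the analysis) to contribute a faulty fraction of f.

```python
def next_x(x, p=0.2, alpha=0.45, beta=0.45, gamma=0.1, f=0.2):
    """One application of E[x(t+1)] from the slide."""
    push = p / (p + (1 - p) * (1 - x))  # faulty fraction among received pushes
    pull = x + (1 - x) * x              # faulty fraction among pulled ids
    return alpha * push + beta * pull + gamma * f

def fixed_point(x0=0.0, iters=500):
    """Iterate the recurrence from x0 until it settles."""
    x = x0
    for _ in range(iters):
        x = next_x(x)
    return x
```

With these defaults the iteration settles near x ≈ 0.52: strictly below 1, matching the claim that views are never entirely poisoned.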
32
Faulty Ids in Fixed Point
History assumed perfect in the analysis; real history in simulations
With a few history samples, any portion of bad nodes can be tolerated
Fixed points and convergence perfectly validated
33
Convergence to Fixed Point
  • n = 1000
  • p = 0.2
  • α = β = 0.5
  • γ = 0

34
Connectivity Analysis 2: Targeted Attack
Roadmap
  • Step 1: analysis without history samples
    • Isolation in logarithmic time
    • but not too fast, thanks to tricks 1, 2, 3
  • Step 2: analysis of history-sample convergence
    • Time-to-perfect-sample < time-to-isolation
  • Step 3: putting it all together
    • Empirical evaluation
    • No isolation happens

35
Targeted Attack Step 1
  • Q How fast (lower bound) can an attacker isolate
    one node from the rest?
  • Worst-case assumptions
  • No use of history samples (? 0)
  • Unrealistically strong adversary
  • Observes the exact number of correct pushes and
    complements it to aV
  • Attacked node not represented initially
  • Balanced attack on the rest of the system

36
Isolation w/out History Samples
  • n = 1000
  • p = 0.2
  • α = β = 0.5
  • γ = 0

Isolation time for |V| = 60
Depends on α, β, p
37
Step 2 Sample Convergence
  • n = 1000
  • p = 0.2
  • α = β = 0.5, γ = 0
  • 40 unique ids

Perfect sample in 2-3 rounds
Empirically verified
38
Step 3: Putting It All Together - No Isolation with History Samples
  • n = 1000
  • p = 0.2
  • α = β = 0.45
  • γ = 0.1

Works well despite small PSP
39
Sample Convergence (Balanced)
  • p = 0.2
  • α = β = 0.45
  • γ = 0.1

Convergence twice as fast with
40
Summary
  • O(n^(1/3))-size views
  • Resist Byzantine failures of a linear portion
  • Converge to provably uniform samples
  • Precise analysis of the impact of failures

41
Balanced Attack Analysis (1)
  • Assume (roughly) equal initial node degrees
  • x(t): portion of faulty ids in correct nodes' views at time t
  • Compute E[x(t+1)] as a function of x(t), p, α, β, γ
  • Result 1: Short-Term Optimality
    • Any non-balanced schedule yields a smaller x(t+1) in a single round

42
Balanced Attack Analysis (2)
  • Result 2: Existence of a Fixed Point X
    • E[x(t+1)] = x(t) = X
    • Analyze X (a function of p, α, β, γ)
    • Conditions for uniqueness
    • For α = β = 0.5 and p < 1/3, there exists X < 1
    • The view is not entirely poisoned; history samples are not essential
  • Result 3: Convergence to the fixed point
    • From any initial portion < 1 of faulty ids
    • Via [Hillam 1975] (sequence convergence)