Increasing Intrusion Tolerance Via Scalable Redundancy

Transcript and Presenter's Notes

1
Increasing Intrusion Tolerance Via Scalable
Redundancy
  • Greg Ganger
  • greg.ganger@cmu.edu
  • Natassa Ailamaki, Mike Reiter, Priya Narasimhan,
    Chuck Cranor

2
Technical Objective
  • To design, implement and evaluate new protocols
    for implementing intrusion-tolerant services that
    scale better
  • Here, scale refers to efficiency as number of
    servers and number of failures tolerated grows
  • Targeting three types of services
  • Read-write data objects
  • Custom flat object types for particular
    applications, notably directories for
    implementing an intrusion-tolerant file system
  • Arbitrary objects that support object nesting

3
Expected Impact
  • Significant efficiency and scalability benefits
    over today's protocols for intrusion tolerance
  • For example, for data services, we anticipate
  • At least twofold latency improvement over current
    best, even at small configurations (e.g.,
    tolerating 3-5 Byzantine server failures)
  • And improvements will grow as system scales up
  • A twofold improvement in throughput, again
    growing with system size
  • Without such improvements, intrusion tolerance
    will remain relegated to small deployments in
    narrow application areas

4
The Problem Space
  • Distributed services manage redundant state
    across servers to tolerate faults
  • We consider tolerance to Byzantine faults, as
    might result from an intrusion into a server or
    client
  • A faulty server or client may behave arbitrarily
  • We also make no timing assumptions in this work
  • An asynchronous system
  • Primary existing practice: replicated state
    machines
  • Offers no load dispersion, requires data
    replication, and degrades as system scales with
    O(N²) messages

5
Our approach
  • Combine techniques to eliminate work in common
    cases
  • Server-side versioning
  • allows optimism with read-time repair, if necessary
  • allows work to be off-loaded to clients in lieu
    of server agreement
  • Quorum systems (and erasure coding)
  • allows load dispersion (and more efficient
    redundancy for bulk data)
  • Several others applied to defend against
    Byzantine actions
  • Major risk?
  • could be complex for arbitrary objects

6
Evaluation
  • We are in Scenario I: the centralized server setting
  • Baseline: the BFT library
  • Popular, publicly available implementation of
    Byzantine fault-tolerant state machine
    replication (by Castro & Liskov)
  • Reported to be an efficient implementation of
    that approach
  • Two measures
  • Average latency of operations, from the client's
    perspective
  • Peak sustainable throughput of operations
  • Our consistency definition: linearizability of
    invocations

7
Outline
  • Overview
  • Read-write storage protocol
  • Some results
  • Continuing work

8
Read-write block storage
  • Clients erasure-code/replicate blocks into
    fragments
  • Storage-nodes version fragments on every write

[Diagram: a client erasure-codes/replicates a data block into fragments stored on the storage-nodes]
9
Challenges: Concurrency
  • Concurrent updates can violate linearizability

[Diagram: two clients concurrently write different versions of a data block across servers 1-5, leaving servers holding a mix of fragments from each write]
10
Challenges: Server Failures
  • Can attempt to mislead clients
  • Typically addressed by voting

[Diagram: servers 1-5 return fragments to a reading client; a faulty server returns a bogus fragment, so the client cannot tell which value is correct without voting]
11
Challenges: Client Failures
  • Byzantine client failures can also mislead
    clients
  • Typically addressed by submitting a request via
    an agreement protocol

[Diagram: a Byzantine client writes inconsistent fragments to servers 1-5, so later readers can decode different values for the same block]
12
Consistency via versioning
  • Leverage versioning storage nodes for consistency
  • Allow writes to proceed with versioning
  • All writes create new data versions
  • Partial writes and concurrency won't destroy data
  • Reader detects and resolves update conflicts
  • Concurrency rare in FS workloads (typically < 1%)
  • Offloads work to client resulting in greater
    scalability
  • Only perform extra work when needed
  • Optimistically assume fault-free,
    concurrency-free operation
  • Single round-trip for reads and writes in common
    case
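
To make this concrete, here is a minimal sketch of a versioning storage-node (illustrative names, not the PASIS implementation): every write appends a new timestamped version instead of overwriting, and a read returns the latest version so the client can detect and resolve conflicts across nodes.

class VersioningNode:
    """Minimal sketch of a versioning storage-node (illustrative only)."""

    def __init__(self):
        self.versions = {}  # block_id -> list of (timestamp, fragment) tuples

    def write(self, block_id, timestamp, fragment):
        # Every write appends a new version; nothing is overwritten, so
        # partial or concurrent writes cannot destroy existing data.
        self.versions.setdefault(block_id, []).append((timestamp, fragment))
        return True

    def read_latest(self, block_id):
        # Return the highest-timestamped version; the client compares the
        # answers from several nodes and repairs conflicts only when needed.
        history = self.versions.get(block_id, [])
        return max(history, default=None)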

13
Our system model
  • Crash-recovery storage-node fault model
  • Up to t total bad storage-nodes
    (crashed/Byzantine)
  • Up to b ≤ t Byzantine (arbitrary faults)
  • So, t - b faults are crash-recovery faults
  • Client fault model
  • Any number of crash or Byzantine clients
  • Asynchronous timing model
  • Point-to-point authenticated channels
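
As a hedged illustration (the class and method names are ours, not from the slides), the fault-model parameters and the N = 2t + 2b + 1 configuration used in the experiments later can be captured as a small check:

from dataclasses import dataclass

@dataclass
class SystemModel:
    n: int   # total number of storage-nodes
    t: int   # max storage-nodes that may be bad (crashed or Byzantine)
    b: int   # max of those that may be Byzantine, b <= t

    def is_valid(self) -> bool:
        # N = 2t + 2b + 1 is the configuration evaluated later
        return 0 <= self.b <= self.t and self.n >= 2 * self.t + 2 * self.b + 1

# Example: tolerating t = 1 total failures, all of which may be Byzantine
assert SystemModel(n=5, t=1, b=1).is_valid()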

14
Read/write protocol
  • Unit of update: a block
  • Complete blocks are read and written
  • Erasure-coding may be used for space-efficiency
  • Update semantics: Read/write
  • No guarantee of contents between read and write
  • Sufficient for block-based storage
  • Consistency: Linearizability
  • Liveness: wait-freedom

15
R/W protocol Write
  • Client erasure-codes data-item into N
    data-fragments
  • Client tags write requests with logical timestamp
  • Round-trip required to read logical time
  • Client issues requests to at least W
    storage-nodes
  • Storage-nodes validate integrity of request
  • Storage-nodes insert request into version history
  • Write completes after W requests have completed
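
A sketch of the client-side write path following these steps; the encode argument stands in for the m-of-N erasure coder, and the logical-time helper reads version histories directly for brevity, so none of these names are the actual PASIS interfaces:

def logical_time(nodes, block_id):
    # Round-trip to read the current logical time from the storage-nodes.
    times = [ts for node in nodes
             for ts, _ in node.versions.get(block_id, [])]
    return max(times, default=0)

def write_block(nodes, block_id, data, encode, W):
    # 'encode' stands in for the erasure coder; any function returning
    # len(nodes) data-fragments will do (e.g. plain replication).
    fragments = encode(data, len(nodes))
    ts = logical_time(nodes, block_id) + 1           # tag write with logical timestamp
    acks = sum(1 for node, frag in zip(nodes, fragments)
               if node.write(block_id, ts, frag))    # nodes validate and version
    if acks < W:                                     # write completes after W acks
        raise RuntimeError("write did not complete at W storage-nodes")
    return ts

# Example use with the VersioningNode sketch from slide 12:
# nodes = [VersioningNode() for _ in range(5)]
# write_block(nodes, "blk0", b"data", encode=lambda d, n: [d] * n, W=3)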

16
R/W protocol Read
  • Client reads latest version from storage-node
    subset
  • Read set guaranteed to intersect with latest
    complete write
  • Client determines latest candidate write
    (candidate)
  • Set of responses containing the latest timestamp
  • Client classifies the candidate as one of
  • Complete
  • Incomplete
  • Repairable

For consistency, only complete writes can be
returned
17
R/W protocol Read classification
  • Based on the client's (limited) system knowledge
  • Failures and asynchrony lead to imperfect
    information
  • Candidate classification rules
  • Complete: candidate exists on ≥ W nodes
  • candidate is decoded and returned
  • Incomplete: candidate exists on < W nodes
  • Read previous version to determine new candidate
  • Iterate: perform classification on new candidate
  • Repairable: candidate may exist on ≥ W nodes
  • Repair and return data-item
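
A simplified sketch of these classification rules; the real PASIS thresholds also account for b and t, so this version uses only W and the number of nodes not yet heard from:

def classify(candidate_count, unknown_count, W):
    # candidate_count: nodes seen holding the latest candidate timestamp
    # unknown_count:  nodes not yet heard from (might also hold it)
    if candidate_count >= W:
        return "complete"      # decode and return the candidate
    if candidate_count + unknown_count < W:
        return "incomplete"    # step back to the previous version, re-classify
    return "repairable"        # may still reach W: repair, then return it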

18
Example: Successful read
(N=5, W=3, t=1, b=0)
[Diagram: version histories on storage-nodes 1-5 over time. D1 is the latest candidate but is classified incomplete, so D0 becomes the candidate; D0 is determined complete and is returned.]
19
Example: Repairable read
(N=5, W=3, t=1, b=0)
[Diagram: version histories on storage-nodes 1-5 over time. D2 is the latest candidate and is classified repairable, so the client repairs D2 and returns it.]
20
Protecting against Byzantine storage-nodes
  • Must defend against servers that modify data in
    their possession
  • Solution: Cross checksums [Gong 89]
  • Hash each data-fragment
  • Concatenate all N hashes
  • Append cross checksum to each fragment
  • Clients verify hashes against fragments and use
    cross checksums as votes

[Diagram: the data-item is encoded into data-fragments; each fragment is hashed, and the concatenated hashes form the cross checksum appended to every fragment]
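
A minimal sketch of the construction described above, using SHA-256 as an example hash (function names are illustrative, not the PASIS implementation):

import hashlib

def cross_checksum(fragments):
    # Hash each data-fragment and concatenate the N hashes; the result is
    # appended to every fragment so readers can use it as a vote.
    return b"".join(hashlib.sha256(f).digest() for f in fragments)

def fragment_matches(fragment, index, checksum):
    # A reader verifies a received fragment against the hash stored at its
    # position in the cross checksum before counting its vote.
    size = hashlib.sha256().digest_size
    return checksum[index * size:(index + 1) * size] == hashlib.sha256(fragment).digest()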

21
Protecting against Byzantine clients
  • Must ensure all fragment sets decode to same
    value
  • Solution: Validating timestamps
  • Write: place hash of cross checksum in timestamp
  • also prevents multiple values being written at
    same timestamp
  • Storage-nodes validate their fragment against
    corresponding hash
  • Read: regenerate fragments and cross checksum

Example: Byzantine encoding with a poisonous fragment
[Diagram: one corrupted data-fragment causes different fragment subsets to decode to different data-items]
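
A sketch of the validation step, assuming the timestamp carries the hash of the cross checksum; the timestamp encoding here is an assumption, not the PASIS wire format, and fragment_matches is the check from the cross-checksum sketch above:

import hashlib

def make_timestamp(logical_time, checksum):
    # Embedding the hash of the cross checksum ties the timestamp to exactly
    # one encoding, so a Byzantine client cannot write two different values
    # at the same timestamp.
    return (logical_time, hashlib.sha256(checksum).digest())

def node_accepts(fragment, index, checksum, timestamp):
    # A storage-node validates the cross checksum against the hash carried
    # in the timestamp, and its own fragment against the cross checksum.
    _, checksum_hash = timestamp
    return (hashlib.sha256(checksum).digest() == checksum_hash
            and fragment_matches(fragment, index, checksum))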
22
Experimental setup
  • Prototype system PASIS
  • 20 node cluster
  • Dual 1 GHz Pentium III storage-nodes
  • Single 2 GHz Pentium IV clients
  • 100 Mb switched Ethernet
  • 16 KB data-item size (before encoding)
  • Blowup of N/m over the data-item size
  • Each fragment is 1/m of the data-item size
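
As a worked example (assuming m-of-N erasure coding and the m = 2, N = 5 configuration used in the throughput experiments):

# Illustrative arithmetic only: each fragment is 16 KB / m = 8 KB, the total
# stored is 5 * 8 KB = 40 KB, a blowup of N/m = 2.5x over the data-item.
data_item_kb, m, n = 16, 2, 5
fragment_kb = data_item_kb / m   # 8 KB per fragment
blowup = n / m                   # 2.5x over the data-item size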

23
PASIS response time
N = 2t + 2b + 1
Fault models: b = t and b = 1
[Chart: mean response time (ms) vs. total failures tolerated (t = 1 to 4) for reads and writes under both fault models. Annotations mark the decode computation, the network delay for redundant fragments, and the 1-way 16 KB ping baseline.]
24
Throughput experiment
  • Same system set-up as resp. time experiment
  • Clients issue read or write requests
  • Increase number of clients to increase load
  • Demonstrate value of erasure-codes
  • Increase m to reduce per storage-node load
  • Compare with Byzantine atomic broadcast
  • BFT library [Castro & Liskov 99]
  • Supports arbitrary operations
  • Replication (with multicast) limits write throughput
  • O(N²) messages limits performance scalability

25
PASIS vs. BFT: Write throughput
(b = t = 1)

          m   N
  PASIS   2   5
  PASIS   3   6
  BFT     1   4

PASIS has higher write throughput than BFT. BFT
uses replication, which increases per storage-node
load; PASIS reduces per storage-node load with
erasure-codes.

[Chart: write throughput (req/s) vs. number of clients (0 to 8)]
26
PASIS vs. BFT: Read throughput
(b = t = 1)

          m   N
  PASIS   2   5
  PASIS   3   6
  BFT     1   4

[Chart: read throughput (req/s) vs. number of clients (0 to 8)]
27
Continuing work
  • New testbed: 70 servers connected with a switched
    Gbit/sec network
  • experiments can then explore higher scalability
    points
  • baseline and our results will come from this
    testbed
  • Protocol for arbitrary deterministic functions on
    objects
  • built from same basic primitives
  • Protocol for objects with nested objects
  • adds requirement of replicated invocations

28
Summary
  • Goal To design, implement and evaluate new
    protocols for implementing intrusion-tolerant
    services that scale better
  • Here, scale refers to efficiency as number of
    servers and number of failures tolerated grows
  • Started with a protocol for read-write storage
  • based on versioning and quorums
  • scales efficiently (and much better than BFT)
  • also flexible (can add assumptions to reduce
    costs)
  • Going forward (in progress)
  • generalize types of objects and operations that
    can be supported

29
Questions?
30
Garbage collection
  • Pruning old versions is necessary to reclaim
    space
  • Versions prior to latest complete write can be
    pruned
  • Storage-nodes need to know latest complete write
  • In isolation they do not have this information
  • Perform read operation to classify latest
    complete write
  • Many possible policies exist for when to clean
    what
  • Best to clean during idle time (if possible)
  • Rank blocks in order of greatest potential gains
  • Work remains in this area
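
A hedged sketch of the pruning rule, reusing the versions map from the versioning-node sketch on slide 12: once the timestamp of the latest complete write is known (via a read-style classification), strictly older versions of the block can be discarded.

def garbage_collect(node, block_id, latest_complete_ts):
    # Prune versions strictly older than the latest complete write; anything
    # at or after that timestamp is kept, since a reader may still need to
    # classify the newer (possibly incomplete) writes.
    history = node.versions.get(block_id, [])
    node.versions[block_id] = [(ts, frag) for ts, frag in history
                               if ts >= latest_complete_ts]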