Title: Increasing Intrusion Tolerance Via Scalable Redundancy
1. Increasing Intrusion Tolerance Via Scalable Redundancy
- Greg Ganger
- greg.ganger_at_cmu.edu
- Natassa Ailamaki, Mike Reiter, Priya Narasimhan, Chuck Cranor
2. Technical Objective
- To design, implement, and evaluate new protocols for implementing intrusion-tolerant services that scale better
- Here, "scale" refers to efficiency as the number of servers and the number of failures tolerated grows
- Targeting three types of services:
  - Read-write data objects
  - Custom flat object types for particular applications, notably directories for implementing an intrusion-tolerant file system
  - Arbitrary objects that support object nesting
3. Expected Impact
- Significant efficiency and scalability benefits over today's protocols for intrusion tolerance
- For example, for data services, we anticipate:
  - At least a twofold latency improvement over the current best, even at small configurations (e.g., tolerating 3-5 Byzantine server failures)
  - And improvements will grow as the system scales up
  - A twofold improvement in throughput, again growing with system size
- Without such improvements, intrusion tolerance will remain relegated to small deployments in narrow application areas
4. The Problem Space
- Distributed services manage redundant state across servers to tolerate faults
- We consider tolerance to Byzantine faults, as might result from an intrusion into a server or client
  - A faulty server or client may behave arbitrarily
- We also make no timing assumptions in this work
  - An asynchronous system
- Primary existing practice: replicated state machines
  - Offers no load dispersion, requires data replication, and degrades as the system scales, with O(N²) messages
5. Our approach
- Combine techniques to eliminate work in common cases
- Server-side versioning
  - allows optimism with read-time repair, if necessary
  - allows work to be off-loaded to clients in lieu of server agreement
- Quorum systems (and erasure coding)
  - allow load dispersion (and more efficient redundancy for bulk data)
- Several other techniques applied to defend against Byzantine actions
- Major risk?
  - could be complex for arbitrary objects
6. Evaluation
- We are in Scenario I: the centralized server setting
- Baseline: the BFT library
  - Popular, publicly available implementation of Byzantine fault-tolerant state machine replication (by Castro and Liskov)
  - Reported to be an efficient implementation of that approach
- Two measures:
  - Average latency of operations, from the client's perspective
  - Peak sustainable throughput of operations
- Our consistency definition: linearizability of invocations
7. Outline
- Overview
- Read-write storage protocol
- Some results
- Continuing work
8. Read-write block storage
- Clients erasure-code/replicate blocks into fragments
- Storage-nodes version fragments on every write
[Figure: a client encodes a data block into fragments that are stored across the storage-nodes]
9. Challenges: Concurrency
- Concurrent updates can violate linearizability
[Figure: two concurrent writes interleave their fragments across servers 1-5, mixing the two data items]
10. Challenges: Server Failures
- Can attempt to mislead clients
- Typically addressed by voting
[Figure: a faulty server returns a bogus fragment during a read, leaving the client unable to decode the data]
11. Challenges: Client Failures
- Byzantine client failures can also mislead clients
- Typically addressed by submitting a request via an agreement protocol
[Figure: a Byzantine client writes inconsistent fragments, so different readers may reconstruct different or unusable values]
12. Consistency via versioning
- Leverage versioning storage-nodes for consistency
- Allow writes to proceed with versioning
  - All writes create new data versions
  - Partial writes and concurrency won't destroy data
- Reader detects and resolves update conflicts
  - Concurrency is rare in FS workloads (typically < 1%)
  - Offloads work to the client, resulting in greater scalability
- Only perform extra work when needed
  - Optimistically assume fault-free, concurrency-free operation
  - Single round-trip for reads and writes in the common case
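As a toy illustration of the versioning idea (not code from the slides): a storage-node keeps every write as a separate version keyed by logical timestamp, so a later read can always fall back to an older, complete version.

```python
# Toy version history for one block on one storage-node: every write inserts
# a new entry keyed by its logical timestamp and never overwrites older data,
# so a partial or concurrent write cannot destroy an earlier complete version.
history = {}
history[1] = b"D0"                 # an earlier, complete write
history[2] = b"D1"                 # a later write that may turn out to be partial
latest_ts = max(history)           # readers start from the newest version...
previous = history[latest_ts - 1]  # ...and can step back if it proves incomplete
```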
13. Our system model
- Crash-recovery storage-node fault model
  - Up to t total bad storage-nodes (crashed or Byzantine)
  - Up to b ≤ t Byzantine (arbitrary faults)
  - So, t - b faults are crash-recovery faults
- Client fault model
  - Any number of crash or Byzantine clients
- Asynchronous timing model
- Point-to-point authenticated channels
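Purely as an illustration of these parameters (not project code), a small Python sketch that checks a configuration against the N = 2t + 2b + 1 sizing quoted with the response-time results; the write threshold W here is simply the one used in the example slides, not a derived protocol bound.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Config:
    n: int   # total number of storage-nodes
    t: int   # total storage-node faults tolerated (crash-recovery or Byzantine)
    b: int   # of those t faults, how many may be Byzantine
    w: int   # write threshold: storage-nodes that must accept a write

    def check(self) -> None:
        assert 0 <= self.b <= self.t                   # Byzantine faults fit in the t budget
        assert self.n >= 2 * self.t + 2 * self.b + 1   # sizing used in the evaluation slides
        assert self.t < self.w <= self.n               # a write quorum must survive t failures

# The configuration used in the read examples: N=5, W=3, t=1, b=0.
Config(n=5, t=1, b=0, w=3).check()
```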
14. Read/write protocol
- Unit of update: a block
  - Complete blocks are read and written
  - Erasure-coding may be used for space-efficiency
- Update semantics: read/write
  - No guarantee of contents between a read and a write
  - Sufficient for block-based storage
- Consistency: linearizability
- Liveness: wait-freedom
15. R/W protocol: Write
- Client erasure-codes the data-item into N data-fragments
- Client tags write requests with a logical timestamp
  - Round-trip required to read the logical time
- Client issues requests to at least W storage-nodes
- Storage-nodes validate the integrity of the request
- Storage-nodes insert the request into their version history
- Write completes after W requests have completed
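A toy, in-memory sketch of the client write path just listed (Python, not the PASIS implementation): replication stands in for erasure coding, and in the real protocol storage-nodes also validate request integrity before accepting it.

```python
def make_nodes(n):
    # Each storage-node is modelled as a version history for one block:
    # a dict mapping logical timestamp -> stored fragment (here a full replica).
    return [dict() for _ in range(n)]

def write(nodes, data, w):
    # Round-trip to read the current logical time, then advance it for this write.
    ts = max((max(h) for h in nodes if h), default=0) + 1
    # Issue write requests to at least W storage-nodes; each inserts the request
    # into its version history rather than overwriting older versions.
    for h in nodes[:w]:
        h[ts] = data
    return ts   # the write is complete once W requests have completed

nodes = make_nodes(5)
write(nodes, b"D0", w=3)   # usage: N=5, W=3 as in the example slides
```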
16. R/W protocol: Read
- Client reads the latest version from a storage-node subset
  - Read set guaranteed to intersect with the latest complete write
- Client determines the latest candidate write (the "candidate")
  - Set of responses containing the latest timestamp
- Client classifies the candidate as one of:
  - Complete
  - Incomplete
  - Repairable
- For consistency, only complete writes can be returned
17. R/W protocol: Read classification
- Based on the client's (limited) system knowledge
  - Failures and asynchrony lead to imperfect information
- Candidate classification rules:
  - Complete: candidate exists on ≥ W nodes
    - candidate is decoded and returned
  - Incomplete: candidate exists on < W nodes (cannot reach W)
    - Read the previous version to determine a new candidate
    - Iterate: perform classification on the new candidate
  - Repairable: candidate may exist on ≥ W nodes
    - Repair and return the data-item
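A simplified classifier for the rules above (illustrative Python, assuming full replicas and b = 0; the real rules also account for Byzantine responses and erasure-coded fragments).

```python
def classify(responses, n, w):
    # responses: {node_id: (timestamp, value)} for the storage-nodes the client
    # actually heard from; n is the total node count, w the write threshold.
    latest_ts = max(ts for ts, _ in responses.values())        # latest candidate
    value = next(v for ts, v in responses.values() if ts == latest_ts)
    seen = sum(1 for ts, _ in responses.values() if ts == latest_ts)
    unread = n - len(responses)                                 # nodes not contacted

    if seen >= w:
        return "complete", latest_ts, value      # decode and return the candidate
    if seen + unread >= w:
        return "repairable", latest_ts, value    # may exist on >= W nodes: repair it
    return "incomplete", latest_ts, value        # cannot reach W: step back a version

# First step of the "successful read" example: D1 seen on one node, D0 elsewhere.
# D1 is classified incomplete, so the client would then step back to D0.
print(classify({1: (1, "D0"), 2: (2, "D1"), 4: (1, "D0"), 5: (1, "D0")}, n=5, w=3))
```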
18. Example: Successful read (N=5, W=3, t=1, b=0)
[Figure: timeline across storage-nodes 1-5. D0 is written to at least W nodes; a later write D1 reaches only one node. The reader finds D1 as the latest candidate and classifies it incomplete; the previous version D0 becomes the latest candidate, is determined complete, and is returned.]
19. Example: Repairable read (N=5, W=3, t=1, b=0)
[Figure: timeline across storage-nodes 1-5. D0 is written completely, D1 only partially, and the latest write D2 reaches only some nodes. The reader finds D2 as the latest candidate and classifies it repairable; the client repairs D2 (completing its write to W nodes) and returns D2.]
20. Protecting against Byzantine storage-nodes
- Must defend against servers that modify data in their possession
- Solution: cross checksums [Gong 89]
  - Hash each data-fragment
  - Concatenate all N hashes
  - Append the cross checksum to each fragment
  - Clients verify hashes against fragments and use cross checksums as votes
[Figure: a data-item is encoded into data-fragments; the hashes of all fragments are concatenated to form the cross checksum]
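A minimal sketch of the cross-checksum construction in Python; the hash function (SHA-256 here) is an assumption of this sketch, not a detail taken from the slides.

```python
import hashlib

HASH = hashlib.sha256        # assumed hash for this sketch
DIGEST = HASH().digest_size  # bytes per fragment hash

def cross_checksum(fragments):
    # Hash each of the N data-fragments and concatenate all N hashes.
    return b"".join(HASH(f).digest() for f in fragments)

def tag(fragments):
    # Append the same cross checksum to every fragment before it is stored.
    cc = cross_checksum(fragments)
    return [(f, cc) for f in fragments]

def verify(index, fragment, cc):
    # A client checks the fragment against its slot in the cross checksum;
    # matching cross checksums returned by different nodes then act as votes.
    return cc[index * DIGEST:(index + 1) * DIGEST] == HASH(fragment).digest()

stored = tag([b"frag0", b"frag1", b"frag2"])
assert verify(1, stored[1][0], stored[1][1])
```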
21. Protecting against Byzantine clients
- Must ensure all fragment sets decode to the same value
- Solution: validating timestamps
  - Write: place a hash of the cross checksum in the timestamp
    - also prevents multiple values being written at the same timestamp
  - Storage-nodes validate their fragment against the corresponding hash
  - Read: regenerate fragments and the cross checksum
[Figure: example of a Byzantine encoding with a poisonous fragment, where different fragment subsets decode to different data-items]
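An illustrative sketch (not the actual PASIS message format) of binding the cross checksum into the timestamp so a Byzantine client cannot write different values under one timestamp; SHA-256 and the tuple layout are assumptions of this sketch.

```python
import hashlib

D = hashlib.sha256().digest_size

def timestamp(logical_time, cross_checksum):
    # The timestamp carries a hash of the cross checksum, so one timestamp
    # can name at most one encoded value.
    return (logical_time, hashlib.sha256(cross_checksum).digest())

def node_accepts(ts, index, fragment, cross_checksum):
    # A storage-node validates the cross checksum against the timestamp, and
    # its own fragment against the corresponding hash in the cross checksum.
    _, cc_hash = ts
    if hashlib.sha256(cross_checksum).digest() != cc_hash:
        return False
    return cross_checksum[index * D:(index + 1) * D] == hashlib.sha256(fragment).digest()

# On read, clients additionally regenerate the fragments from the decoded value
# and recompute the cross checksum to catch a "poisonous" (inconsistent) encoding.
```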
22. Experimental setup
- Prototype system: PASIS
- 20-node cluster
  - Dual 1 GHz Pentium III storage-nodes
  - Single 2 GHz Pentium 4 clients
  - 100 Mb switched Ethernet
- 16 KB data-item size (before encoding)
  - Blowup of N/m over the data-item size
  - Each fragment is 1/m of the data-item size
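A worked example of the space blowup, assuming an m-of-N erasure code (the N/m figure is inferred from the encoding parameters rather than quoted from the slides).

```python
data_kb = 16                    # data-item size before encoding
m, n = 2, 5                     # assumed m-of-N code: any m fragments decode the item
fragment_kb = data_kb / m       # each fragment is 1/m of the data-item: 8 KB
total_kb = n * fragment_kb      # 40 KB stored across all N storage-nodes
blowup = total_kb / data_kb     # 2.5x, versus 5x for full replication at N = 5
```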
23. PASIS response time
- Configuration: N = 2t + 2b + 1; fault models b = t and b = 1
[Figure: mean response time (ms, 0-20) vs. total failures tolerated (t = 1 to 4) for reads and writes under b = t and b = 1. Annotated cost components: decode computation, the N-W delay for redundant fragments, and the 1-way 16 KB ping baseline.]
24. Throughput experiment
- Same system set-up as the response-time experiment
- Clients issue read or write requests
  - Increase the number of clients to increase load
- Demonstrate the value of erasure codes
  - Increase m to reduce per-storage-node load
- Compare with Byzantine atomic broadcast
  - BFT library [Castro Liskov 99]
  - Supports arbitrary operations
  - Replication (even with multicast) limits write throughput
  - O(N²) messages limit performance scalability
25. PASIS vs. BFT: Write throughput
- Configurations (b = t = 1): PASIS m=2 N=5, PASIS m=3 N=6, BFT m=1 N=4
- PASIS has higher write throughput than BFT
  - BFT uses replication, which increases per-storage-node load
  - Erasure codes reduce per-storage-node load
[Figure: write throughput (req/s, up to ~3500) vs. number of clients (0-8) for the three configurations]
26. PASIS vs. BFT: Read throughput
- Configurations (b = t = 1): PASIS m=2 N=5, PASIS m=3 N=6, BFT m=1 N=4
[Figure: read throughput (req/s, up to ~3500) vs. number of clients (0-8) for the three configurations]
27. Continuing work
- New testbed: 70 servers connected by a switched Gbit/sec network
  - experiments can then explore higher scalability points
  - baseline and our results will come from this testbed
- Protocol for arbitrary deterministic functions on objects
  - built from the same basic primitives
- Protocol for objects with nested objects
  - adds the requirement of replicated invocations
28. Summary
- Goal: to design, implement, and evaluate new protocols for implementing intrusion-tolerant services that scale better
  - Here, "scale" refers to efficiency as the number of servers and the number of failures tolerated grows
- Started with a protocol for read-write storage
  - based on versioning and quorums
  - scales efficiently (and much better than BFT)
  - also flexible (can add assumptions to reduce costs)
- Going forward (in progress)
  - generalize the types of objects and operations that can be supported
29. Questions?
30. Garbage collection
- Pruning old versions is necessary to reclaim space
- Versions prior to the latest complete write can be pruned
- Storage-nodes need to know the latest complete write
  - In isolation they do not have this information
  - Perform a read operation to classify the latest complete write
- Many possible policies exist for when to clean what
  - Best to clean during idle time (if possible)
  - Rank blocks in order of greatest potential gains
- Work remains in this area
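A toy sketch of the pruning step (Python, illustrative only), assuming the storage-node has already learned the latest complete write's timestamp via a read-style classification.

```python
def garbage_collect(history, latest_complete_ts):
    # history: {logical timestamp -> fragment} for one block on one storage-node.
    # Versions strictly older than the latest complete write can be pruned;
    # the node cannot determine that timestamp in isolation.
    for ts in [t for t in history if t < latest_complete_ts]:
        del history[ts]
    return history

history = {1: b"D0", 2: b"D1", 3: b"D2"}
garbage_collect(history, latest_complete_ts=3)   # leaves only version 3
```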