Title: ITUA
1Intrusion Tolerance by Unpredictable Adaptation
Presented by Partha Pal and William Sanders
OASIS PI Meeting, August 21, 2002
2Outline
- What is ITUA?
- Status report
- Technical updates
- Decentralized management
- Intrusion tolerant gateway
- Future plans
- Accomplishments
3What is ITUA
Development of Intrusion Tolerance Technology
Range of Adaptive Response: Local Rapid to Coordinated
Intrusion Tolerance at Multiple Levels
Theme of Unpredictability
Block IP add./ Restore file
Block? Drop? Timed?
Application Level
Convict corrupt replica
IT Gateway
Restore just a file? Tree?
Replace corrupt replica
Placement of new replica
IT GCS
Isolate corrupt security domains
Which application object?
Loops
Select application objects
Architecture
Validation
Tech Transfer
Create Scientific Basis for Probabilistically Quantifying Survivability
- Validate ITUA Technologies
- Probabilistic Evaluation
- Measurement
- Internal Red Teaming
Boeing's IEIST Application
Army CECOM SMS
An architecture where ITUA tolerance technologies are integrated
OASIS Dem/Val
4Status Report
- Started July 2000
- End Date December 2003
- Approximately 65 percent done
- Ongoing tasks
- Completion of the decentralized manager implementation
- Combining IT replication, group membership, and reliable multicast into intrusion-tolerant gateway
- Including more IEIST components (Boeing) in the integrated demonstration
- Validation methodology development and evaluation of ITUA technology
5Technical Update 1: Decentralized Redundancy Management
[Figure: two panels, "Initial Concept" and "Current Approach", each showing Domain 1, Domain 2, and Domain 3 with nodes labeled M and S]
Scalable, two-stage information dissemination is more corruption prone, and harder to analyze.
Elimination of subordinate groups makes it simpler.
Scalability of multicast need not be fused with the issue of surviving corruption.
6Organizing Management Components
- Manager: one per host; each consists of two components, Replication Controller and Security Adviser
- Hosts are organized into security domains, each consisting of hosts that are at risk of being compromised together.
- Managers communicate with each other through a manager group composed of all managers.
[Figure: Manager Group spanning Domain 1 through Domain 4; each HOST runs one Manager (M)]
7Manager Features
- Decentralized management of redundant resources
- Not possible to compromise management function completely unless a majority of the managers are corrupt
- No notion of a leader: probabilistic and consensus-based algorithms
- Coordinated response, but actions are taken locally
- A manager can start and stop replicas on its own machine
- A manager may place a suspected manager in its own purgatory
- Validation study: one manager per domain yields better survivability characteristics
8The Notion of Purgatory
- A facility/abstraction to temporarily prevent replicas started by a suspected manager from joining a replication group
- Suspicion: bad behavior, but w/o sufficient proof
- Timing delays in a partially synchronous model
- If M1 suspects M2, M2 is in M1's purgatory
- If enough managers put M2 in their purgatory, replicas started by M2 cannot join a replication group
- If a replica R owned by M2 misbehaves (and enough managers agree), R will be excluded by the IT GCS, and M2 will be asked to remove R
- If M2 does not oblige, it becomes a provably bad behavior on M2's part
- Eliminate M2's domain
- Different times for different crimes: past actions and the type of crime may bring different sentences.
- By not (always) permanently ostracizing a domain, temporary non-malicious problems do not remove domains permanently
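The purgatory rules above can be illustrated with a minimal Python sketch. It is not the ITUA implementation: the `Purgatory` class, the `may_join` function, the numeric `threshold` standing in for "enough managers", and the rule that the longer sentence wins are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Purgatory:
    """One manager's local purgatory: suspect id -> time the sentence ends."""
    sentences: dict = field(default_factory=dict)

    def suspect(self, manager_id: str, now: float, sentence: float) -> None:
        # Hypothetical policy: a new suspicion never shortens an existing sentence.
        release = now + sentence
        self.sentences[manager_id] = max(release, self.sentences.get(manager_id, 0.0))

    def contains(self, manager_id: str, now: float) -> bool:
        # A suspect leaves purgatory on its own once the sentence has expired.
        release = self.sentences.get(manager_id)
        return release is not None and now < release


def may_join(owner_id: str, purgatories: dict, now: float, threshold: int) -> bool:
    """Replicas started by owner_id may join unless at least `threshold`
    other managers currently hold owner_id in their purgatory."""
    votes = sum(
        1
        for holder, purgatory in purgatories.items()
        if holder != owner_id and purgatory.contains(owner_id, now)
    )
    return votes < threshold


# Usage: M1 and M3 suspect M2 with different sentence lengths.
managers = {name: Purgatory() for name in ("M1", "M2", "M3", "M4")}
managers["M1"].suspect("M2", now=0.0, sentence=30.0)
managers["M3"].suspect("M2", now=0.0, sentence=10.0)
blocked_early = not may_join("M2", managers, now=5.0, threshold=2)
allowed_later = may_join("M2", managers, now=20.0, threshold=2)
```

Because sentences expire, a manager that was suspected only because of timing delays regains the ability to start replicas without any explicit pardon.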
9Signed Messages and Provable Bad Behavior
- Intrusion-Tolerant Gateway reports misbehavior by signed messages
- If a replica becomes corrupt, owners of remaining replicas will be notified
- Each owner (the manager that started that replica) receiving the notification verifies the signature, and multicasts a signed message to the manager group
- A manager is expected to receive independent corroboration of the event vouched for by multiple owners
- By checking the message content and signature, it is possible to accurately identify if a manager is injecting bad data
- Signing but not checking values: a corrupt manager can pass on bad information
- Checking values but not signing: anyone can pass bad information
- Domain of a manager caught this way will be isolated
- ITUA does not currently consider replenishing dropped domains
ITUA assumes a Trusted CA
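Why both checks are needed can be shown with a small Python sketch. It rests on stated assumptions: HMAC with per-sender secrets (the `KEYS` table) is only a stand-in for the CA-certified public-key signatures the slide implies, and `classify_reports` with its `quorum` parameter is a hypothetical helper, not an ITUA API.

```python
import hashlib
import hmac

# ASSUMPTION: HMAC secrets stand in for CA-certified public keys.
# A real deployment would use asymmetric signatures so that anyone can verify.
KEYS = {"M1": b"m1-secret", "M2": b"m2-secret", "M3": b"m3-secret"}


def sign(sender: str, payload: str) -> str:
    return hmac.new(KEYS[sender], payload.encode(), hashlib.sha256).hexdigest()


def verify(sender: str, payload: str, signature: str) -> bool:
    if sender not in KEYS:
        return False
    return hmac.compare_digest(sign(sender, payload), signature)


def classify_reports(reports, quorum):
    """reports: list of (sender, payload, signature) from replica owners.
    Returns (corroborated payload or None, senders caught injecting bad data)."""
    # Check 1, signatures: unsigned or forged reports are dropped outright.
    valid = [(s, p) for s, p, sig in reports if verify(s, p, sig)]
    # Check 2, values: count distinct owners vouching for each payload.
    vouchers = {}
    for sender, payload in valid:
        vouchers.setdefault(payload, set()).add(sender)
    agreed = next((p for p, who in vouchers.items() if len(who) >= quorum), None)
    # A validly signed report that contradicts the corroborated event is
    # attributable to its signer, which makes the bad behavior provable.
    liars = {s for s, p in valid if agreed is not None and p != agreed}
    return agreed, liars


# Usage: M3 signs a false report; a fourth report carries a forged signature.
event = "replica R3 corrupt, seq 17"
false_event = "replica R1 corrupt, seq 17"
reports = [
    ("M1", event, sign("M1", event)),
    ("M2", event, sign("M2", event)),
    ("M3", false_event, sign("M3", false_event)),
    ("M2", "forged", "00"),
]
agreed, liars = classify_reports(reports, quorum=2)
```

In this sketch the forged report is ignored rather than blamed on M2, since without a valid signature nothing ties it to M2.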
10Probabilistic Replica Start Algorithm
- When a member R of a replication group G leaves (removed by managers/crashes), managers that don't have a member of G become available to start one
- An available manager M decides to start a replica with probability p, where
- p = 1 / ((# domains without a replica) × (# managers in M's domain)), if M's domain does not have a member of G
- p = 0, otherwise
- If M decides to start a replica, it will consult the trusted CA to obtain the new replica's credentials, and multicast its intention and the replica's public key.
- Private key will be handed to the new replica process via stdin when it starts, after the mcast operation returns
- Other managers with members of G, on receiving this message, will instruct replicas to admit the newcomer if
- M is not in their purgatory and M is the first to propose in M's domain
- Note: multiple managers in a domain may propose, and multiple replicas may start in multiple domains, but on average one will be started; if none starts within a specified period, retry
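The claim that one replica is started on average follows from the form of p: each domain without a replica contributes (# managers in the domain) × p = 1 / (# domains without a replica), and these contributions sum to 1. A minimal Python sketch of the calculation follows. The function names, the dictionary layout, and the example domain sizes are assumptions for illustration only.

```python
import random
from fractions import Fraction


def start_probability(domain, managers_per_domain, domains_with_replica):
    """p for a single manager in `domain`, following the rule on the slide."""
    if domain in domains_with_replica:
        return Fraction(0)
    empty = [d for d in managers_per_domain if d not in domains_with_replica]
    return Fraction(1, len(empty) * managers_per_domain[domain])


def expected_starts(managers_per_domain, domains_with_replica):
    """Expected number of replicas started in one round, summed over all managers."""
    return sum(
        count * start_probability(d, managers_per_domain, domains_with_replica)
        for d, count in managers_per_domain.items()
    )


def simulate_round(managers_per_domain, domains_with_replica, rng):
    """Each manager decides independently; returns the (domain, index) proposers."""
    proposals = []
    for d, count in managers_per_domain.items():
        p = float(start_probability(d, managers_per_domain, domains_with_replica))
        for i in range(count):
            if rng.random() < p:
                proposals.append((d, i))
    return proposals


# Usage: four domains of different sizes; D1 still hosts a member of G.
managers_per_domain = {"D1": 2, "D2": 3, "D3": 1, "D4": 2}
with_replica = {"D1"}
p_d2 = start_probability("D2", managers_per_domain, with_replica)
mean = expected_starts(managers_per_domain, with_replica)
proposals = simulate_round(managers_per_domain, with_replica, random.Random(0))
```

A round may yield zero or several proposers. The retry timeout covers the first case and the first-to-propose-per-domain rule limits the second.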
11Tolerated Manager Failures
- N: set of domains in the system
- F: set of faulty domains
- M(X): number of managers in domain set X
- We assume that loss of faulty domains does not cause the system to stop operating, i.e., [formula missing] is small, e.g. [formula missing].
- [formula missing] and [formula missing]
- In order to successfully multicast to the manager group (gateway level), more than 2/3 of the participating managers need to be non-faulty.
- Given these, manager algorithms (above the gateway level) will operate correctly as long as a majority of participating managers are non-faulty.
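The two thresholds can be compared with a short Python sketch. It assumes that "participating managers" means all managers in N, so M(N) participate and M(F) of them are faulty. The slide does not spell this out, and the function names and example sizes are hypothetical.

```python
def managers_in(domains, managers_per_domain):
    """M(X): number of managers in the domain set X."""
    return sum(managers_per_domain[d] for d in domains)


def multicast_ok(participating: int, faulty: int) -> bool:
    """Gateway level: more than 2/3 of participating managers are non-faulty."""
    return 3 * (participating - faulty) > 2 * participating


def manager_algorithms_ok(participating: int, faulty: int) -> bool:
    """Above the gateway: a majority of participating managers are non-faulty."""
    return 2 * (participating - faulty) > participating


# Usage: 12 managers in four domains of unequal size.
managers_per_domain = {"D1": 2, "D2": 3, "D3": 3, "D4": 4}
N = set(managers_per_domain)
total = managers_in(N, managers_per_domain)

one_bad = managers_in({"D1"}, managers_per_domain)         # 2 faulty managers
two_bad = managers_in({"D1", "D2"}, managers_per_domain)   # 5 faulty managers

status = {
    "one domain faulty": (multicast_ok(total, one_bad), manager_algorithms_ok(total, one_bad)),
    "two domains faulty": (multicast_ok(total, two_bad), manager_algorithms_ok(total, two_bad)),
}
```

With two faulty domains in this example, a majority of managers is still non-faulty but the 2/3 multicast condition fails, so the gateway-level requirement is the tighter of the two.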
12Technical Update 2: Intrusion Tolerant Gateway
- ITUA IT Gateway
- Provides CORBA interface to applications.
- Implements specific communication strategies (handlers).
- Adaptable to multiple GCS implementations.
- ITUA IT GCS
- Provides the necessary multicast and group membership properties.
[Figure: gateway architecture diagram with labels Application Object, ORB, Replication Group Factory, Naming Service, DII Processor, Handler Factory, Gateway, Replication Group, Connection Group, Group Member, GCS Adaptation Layer, Protocol stack, Intrusion-Tolerant GCS]
13Infrastructure Details
- Group Member
- Communicates with the handlers and DII Processor in the Gateway above and GCS Adaptor below.
- Replication/Connection group members derived from this class
- Facilitates secure state transfer
- Three components
- Sending Processor
- Receiving Processor
- Secure State Transfer Processor
- GCS Adaptation Layer acts as interface to the GCS below
- The gateway is designed to work on top of several GCSs providing reliable delivery, group membership and total order.
[Figure: layer diagram with labels Replication Group, Connection Group, Secure State Transfer Processor, GCS Adaptation Layer, Group Communication System]
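One way to picture the role of the GCS Adaptation Layer is as a narrow interface that hides which GCS sits underneath. The Python sketch below is purely illustrative: `GCSAdapter`, its method names, and the in-process `LoopbackGCS` are hypothetical and do not come from the ITUA code base.

```python
from abc import ABC, abstractmethod
from typing import Callable, List


class GCSAdapter(ABC):
    """Hypothetical interface the gateway's group members would program against."""

    @abstractmethod
    def join(self, group: str, member: str,
             on_deliver: Callable[[str, bytes], None]) -> None: ...

    @abstractmethod
    def leave(self, group: str, member: str) -> None: ...

    @abstractmethod
    def multicast(self, group: str, sender: str, payload: bytes) -> None: ...

    @abstractmethod
    def members(self, group: str) -> List[str]: ...


class LoopbackGCS(GCSAdapter):
    """In-process stand-in. It is single-threaded, so delivery is trivially
    reliable and totally ordered; a real GCS has to work for these properties."""

    def __init__(self):
        self._groups = {}  # group name -> {member name: delivery callback}

    def join(self, group, member, on_deliver):
        self._groups.setdefault(group, {})[member] = on_deliver

    def leave(self, group, member):
        self._groups.get(group, {}).pop(member, None)

    def multicast(self, group, sender, payload):
        for callback in list(self._groups.get(group, {}).values()):
            callback(sender, payload)

    def members(self, group):
        return sorted(self._groups.get(group, {}))


# Usage: two replicas join a group and observe the same message sequence.
gcs = LoopbackGCS()
log_a, log_b = [], []
gcs.join("G", "replica-a", lambda s, p: log_a.append((s, p)))
gcs.join("G", "replica-b", lambda s, p: log_b.append((s, p)))
gcs.multicast("G", "replica-a", b"request-1")
gcs.multicast("G", "replica-b", b"request-2")
gcs.leave("G", "replica-b")
gcs.multicast("G", "replica-a", b"request-3")
```

Swapping in a different GCS would then mean supplying another `GCSAdapter` subclass, leaving the group member code unchanged.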
14IT Handler Algorithm
15Example PseudoCode-Message Arrival Step 2
FromConnectionCastStep2(m)
    if (VerifyProof(m))
        if (m.source == myRepGroup)
            MulticastLatencyTimer.stop()
            if (MajorityReached(m))
                MajorityBuffer.remove(m)
            else
                MajorityDelayBuffer.store(m)
        else
            if (isLeader())
                ReplicationGroupMulticast(m)
            elseif (McastDelayBuf.find(m))
                McastDelayBuf.remove(m)
                Deliver(m)
            else
                totalOrderBuffer.add(m)
                RebroadcastTimer.start()
- Timers are used to ensure that leaders broadcast in a timely manner. They prevent leaders from being able to stall the protocol.
- Buffers are used to ensure that messages are delivered in a consistent total order even if a leader fails.
- Proofs are used to ensure that a leader cannot falsely invoke a request in another group without consensus from the replication group.
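The branching in the step 2 pseudocode can be exercised with a small Python sketch. It reflects one reading of the slide and makes several assumptions: the nesting of the branches, the comparison `m.source == myRepGroup`, the dictionary message format, and the plain sets and flags that stand in for the real timers and buffers. What happens when the proof fails is not stated on the slide, so the sketch only returns "rejected".

```python
class Step2Handler:
    """Toy model of a replica handling a connection-group cast (step 2)."""

    def __init__(self, my_rep_group: str, is_leader: bool):
        self.my_rep_group = my_rep_group
        self.is_leader = is_leader
        self.majority_buffer = set()        # ids whose majority was already reached
        self.majority_delay_buffer = set()  # ids still waiting for a majority
        self.mcast_delay_buffer = set()     # ids already multicast by the leader
        self.total_order_buffer = []        # ids held back to preserve total order
        self.rep_group_multicasts = []      # ids the leader forwarded to the group
        self.delivered = []
        self.latency_timer_running = True
        self.rebroadcast_timer_running = False

    def from_connection_cast_step2(self, m: dict) -> str:
        if not m.get("proof_ok", False):    # stands in for VerifyProof(m)
            return "rejected"
        if m["source"] == self.my_rep_group:
            # Our own leader cast the message in time, so stop watching it.
            self.latency_timer_running = False
            if m["id"] in self.majority_buffer:
                self.majority_buffer.remove(m["id"])
                return "majority-cleared"
            self.majority_delay_buffer.add(m["id"])
            return "awaiting-majority"
        if self.is_leader:
            self.rep_group_multicasts.append(m["id"])
            return "multicast"
        if m["id"] in self.mcast_delay_buffer:
            self.mcast_delay_buffer.remove(m["id"])
            self.delivered.append(m["id"])
            return "delivered"
        # The leader has not forwarded it yet: hold the message and start
        # the timer that guards against a stalling leader.
        self.total_order_buffer.append(m["id"])
        self.rebroadcast_timer_running = True
        return "buffered"


# Usage: a follower and a leader in group "A" receive casts from group "B".
follower = Step2Handler("A", is_leader=False)
leader = Step2Handler("A", is_leader=True)
msg = {"id": 7, "source": "B", "proof_ok": True}
outcomes = [
    leader.from_connection_cast_step2(msg),
    follower.from_connection_cast_step2(msg),
    follower.from_connection_cast_step2({"id": 8, "source": "B", "proof_ok": False}),
    follower.from_connection_cast_step2({"id": 9, "source": "A", "proof_ok": True}),
]
```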
16Fault Reporting in Handler
- Report_Error(BAD_VOTE, msgSeqNo, ReplicaID)
- Vote for a request that never reaches a majority.
- Never sent votes for a particular message (step 1).
- Invalid signature accompanying replication group broadcast in step 1.
- Failure to include sufficient proof in a connection group cast in step 2.
- Sending a message that does not match the majority of requests generated by other replicas.
- Leader sends a broadcast with an incorrect message or an out-of-sequence message in step 3.
- Report_Error(NO_VOTE, msgSeqNo, ReplicaID)
- Report_Error(BAD_SIGNATURE, msgSeqNo, ReplicaID)
- Report_Error(BAD_PROOF, msgSeqNo, ReplicaID)
- Report_Error(BAD_PROOF, msgSeqNo, ReplicaID)
- Report_Error(BAD_SIGNATURE, msgSeqNo, ReplicaID)
17Recent Accomplishments/Next Steps
- Technology Development
- Interaction of Security Adviser-Replication Controller in manager
- Key handoff between Management level and gateway/GCS
- Next-generation IT GCS and its integration
- Validation (more in next presentation)
- Model-based studies, whiteboard/red team (internal) type experiment
- Transition
- IEIST: addition of intrusion-tolerant capabilities to increase the survivability of the fighter guardian agent
- SMS and Dem/Val
- Papers/Demos/Reports
- DSN 2002
- Full Paper on GCS
- Validation Fast Abstract
- Workshop papers on Gateway and ITUA architecture
- DSN Red Teaming Session
- DARPA Tech Demo
- Validation Report (in final review)
- Pacific Rim: full paper on formal verification of group membership protocol (to appear)