Title: Probabilistic Validation of Intrusion Tolerance
1. Probabilistic Validation of Intrusion Tolerance
Intrusion Tolerance by Unpredictable Adaptation (ITUA)
Presented by William Sanders and Michel Cukier
OASIS PI Meeting, August 21, 2002
2. Motivation
- Aim of intrusion tolerance
  - Increase the likelihood that an application will be able to continue to operate correctly in spite of malicious attacks that may occur and may result in successful intrusions
- Before intrusion tolerance can be accepted as an approach to providing security, techniques must be developed to validate its efficacy
- Validation should be done
  - During all phases of the design process, to make design choices
  - During testing and operation, to gain confidence that the amount of intrusion tolerance provided is as advertised
3. Our Approach
- We take a total-lifecycle approach to validation, using
  - Probabilistic analytic models throughout the system lifecycle
  - Detailed simulation models as the design matures
  - Intrusion injection and controlled experimentation on an implemented prototype
  - Detailed models combined with results from experimentation to build an overall model
  - Red teaming on the complete prototype
- Models have two components
  - a model of the attacker, the system, and the workload the system is required to support
  - a set of measures that provide estimates of the desired survivability properties
4. Validation Throughout the System Life Cycle
[Figure: validation flow through the life cycle. At each stage (high-level design, detailed design, prototype implementation), a specification of the attacks/faults to consider and a specification of the workload feed the corresponding evaluation: a high-level analytic/simulation model, detailed analysis/simulation, and finally intrusion injection, controlled experimentation, and red teaming on the prototype. At the prototype stage, the specification covers the attacks/faults to inject.]
5. Proposed Survivability Model Structure
[Figure: model structure. A workload and an attacker act on the system, which comprises the application, the intrusion-tolerance mechanism, and the resource/privilege state; the model yields a survivability measure.]
6. Outline
- Will report on two new models
  - High-level analytic model of the IP Tables/SNORT control loop
  - Detailed model, numerically analyzed to understand fine-grained tradeoffs between system and environmental parameter values in the ITUA Replication Management Scheme
- Will report on preliminary results obtained from data collection and the link to model parameters
7. Modeling an ITUA Control Loop: Resource Consumption
- The IP Tables/SNORT control loop
  - Monitor network ports for suspicious activity
  - Respond to suspected attacks by filtering traffic
  - When monitoring and response are local to a host, response can be quick
- Model
  - Attack uses a single port
    - each request on this port reserves a resource
    - the resource is limited
  - Attack requests cannot initially be distinguished from legitimate requests (e.g., the source is spoofed and random)
    - otherwise, selective filtering would be possible
  - Defense times out attack requests
    - e.g., in a TCP SYN flood the attacker does not correctly execute the protocol
    - the timeout interval is randomly varied to prevent prediction
  - Attack has a maximum rate
8. Possible Defense Strategy: Periodically Close the Port
[Figure: timeline of the periodic defense. The port alternates between open and close intervals; reserved resources are released after the average timeout, and resource exhaustion occurs if attack requests accumulate faster than timeouts release them.]
9. Optimal Strategy
- Maximize availability for legitimate requests
- The periodic strategy has the good property that availability is constant regardless of attack duration, provided the resource is never exhausted
- The optimal periodic strategy minimizes close-time while making resource exhaustion rare in the steady state
  - the optimum close-time is a function of the open-time, attack rate, timeout, and resource limit
  - availability decreases only slowly if resource exhaustion is allowed, because an exhausted resource is still significantly available due to timeouts
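The open/close tradeoff above can be illustrated with a small discrete-time simulation. This is a hedged sketch, not the actual analytic model from the talk: the function name, the one-request-per-step granularity, and the uniform timeout distribution are my assumptions.

```python
import random

def simulate_availability(open_time, close_time, attack_rate, mean_timeout,
                          resource_limit, horizon, seed=0):
    """Fraction of time steps in which the port is open AND a free
    resource slot remains for a legitimate request.  While the port is
    open, `attack_rate` attack requests arrive per step and each reserves
    one slot; slots are freed after a randomized timeout (uniform around
    `mean_timeout`, mimicking the unpredictable timeout of the defense)."""
    rng = random.Random(seed)
    period = open_time + close_time
    expiries = []                    # expiry times of reserved slots
    available_steps = 0
    for t in range(horizon):
        expiries = [e for e in expiries if e > t]    # timed-out slots freed
        if (t % period) < open_time:                 # port currently open
            for _ in range(attack_rate):             # attack consumes slots
                if len(expiries) < resource_limit:
                    expiries.append(t + rng.uniform(0.5 * mean_timeout,
                                                    1.5 * mean_timeout))
            if len(expiries) < resource_limit:       # room for a legit request
                available_steps += 1
    return available_steps / horizon
```

Sweeping `close_time` for a fixed `open_time`, attack rate, timeout, and resource limit shows the slide's point: the smallest close-time that keeps exhaustion rare maximizes availability.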
10. Stochastic Activity Network Model of ITUA Intrusion-Tolerant Replication
- Probabilistic modeling of the intrusion-tolerant replication management system using Stochastic Activity Networks (SANs)
- A detailed model that includes attackers with varying degrees of sophistication, correlations between different phases of a single attack, attacks against individual processes and hosts, unpredictable defense strategies, and several layers of intrusion detection
- Multiple measures are considered
  - Unavailability and unreliability
  - Process load on a host
  - Fraction of security domains excluded
  - Fraction of hosts corrupted before a domain is excluded
11. SAN Models
Composed Rep/Join Model
- Replica submodel: models the behavior of a single replica, including starting of the replica, attacks on the replica, detection of infiltration by the IDS, and misbehavior by infiltrated replicas and its detection by other replicas.
- Host submodel: models the activities on a single host, including attacks on the host, detections and false alarms by the IDS, starting of replicas and management entities, and shutting down.
- Management submodel: models the process of recovery by the management infrastructure through the starting of new replicas.
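As a toy illustration of the unavailability measure only (not the SAN model itself, which also captures attack correlation, detection, and recovery), one can compute the chance that an application with 7 replicas loses its fault-tolerance quorum; the independent-compromise assumption and the threshold of 2 tolerated faulty replicas are illustrative choices, not taken from the talk.

```python
from math import comb

def quorum_loss_probability(n_replicas, max_faulty, p):
    """Probability that more than `max_faulty` of `n_replicas` replicas
    are compromised, assuming each replica is compromised independently
    with probability `p` (a deliberate simplification of the SAN model,
    which handles correlated attacks)."""
    return sum(comb(n_replicas, k) * p**k * (1 - p)**(n_replicas - k)
               for k in range(max_faulty + 1, n_replicas + 1))
```

The binomial tail makes the replication benefit concrete: with 7 replicas tolerating 2 faults, quorum loss stays small while per-replica compromise probability is low, but grows quickly as that probability rises.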
12. Comparative Performance Under Different Distributions of a Constant Number of Hosts Into Domains
12 hosts distributed into 1, 2, 3, 4, 6, or 12 domains; as hosts per domain increase, the number of domains decreases. 2, 4, 6, or 8 applications with 7 replicas each. One time unit = one hour.
- Observations
  - Low unavailability is possible even when the system is left without any human intervention.
  - Unavailability for a particular application does not change much as the number of applications increases.
  - Unavailability increases significantly as hosts per domain increase. The fraction of domains excluded also increases with hosts per domain; combined with the decrease in the total number of domains, both reduce the number of good domains available for recovery, hence the increase in unavailability.
13. Comparative Performance Under Different Numbers of Hosts Distributed Into a Constant Number of Domains
Number of domains fixed at 10; number of hosts per domain varied from 1 to 4. 4 applications with 7 replicas each. One time unit = one hour.
- Observations
  - Slight increase in unavailability with more hosts per domain. Reason: the probability of successful intrusion into a host is the same in all experiments, so more hosts in a domain means a greater chance that one of them is corrupted (and detected), resulting in exclusion of its entire domain.
  - Considerable waste of resources with more hosts per domain, since the domain is excluded as soon as a small number of hosts are infiltrated.
14. Comparison of Domain-Exclusion and Host-Exclusion Management Algorithms
The domain-exclusion algorithm excludes an entire domain if a host (or a replica on it) is found to be corrupt (a preemptive strike); the host-exclusion algorithm excludes only the relevant host. Infiltration of the host OS and services quintuples the vulnerability of the replicas and management entity on that host. 10 domains with 3 hosts each; 4 applications with 7 replicas each. The rate of spread of an attack within a domain (which determines how quickly and how much the infiltration of one host affects other hosts) is increased from 0 to 1.
- Observations
  - Unavailability does not change much with increasing rate of spread under the domain-exclusion scheme, while the host-exclusion scheme is quite sensitive to it.
15. Effect of Rate of Misbehavior by Infiltrated Replicas
4 applications with 7 replicas each. Normal IDS rates: detection of script-based attacks on hosts 95%, exploratory attacks 75%, innovative attacks 40%, and attacks on management entities and application replicas 80% each. For 20% efficacy, each rate is reduced to 20% of normal. The study uses 10 domains with 2 hosts each. An attack on a host increases the vulnerability of the replicas running on it 5 times. Cumulative base attack rate of 5, with direct attacks on hosts, attacks on replicas, and attacks on management distributed in a 3:1:1 ratio.
- Observations
  - Intrusion tolerance improves as the misbehavior rate of infiltrated replicas increases.
  - For higher misbehavior rates, systems with better intrusion detection perform worse! Reason: the majority of the base attack rate comes from attacks on hosts. When the IDS is good, a host infiltration is detected before it has time to spread to the replicas on the host, resulting in exclusion of the domain (and host) even though the replicas on it may not be corrupt.
16. Data Collection
- Data collection
  - Needed for estimating parameter values of the models
  - Focus on different vulnerabilities, attacks, workloads, ...
- Status: network vulnerabilities
  - Use of Nessus
  - Run on 3 networks at UIUC (data collected on 225 hosts, not yet analyzed)
- Status: host vulnerabilities
  - Development of a new tool, Ferret, with
    - Perl plug-ins (one plug-in for each vulnerability checked)
    - an open-source license
    - addition of plug-ins from the security community
  - Analysis of an earlier data collection performed at LAAS-CNRS
- This presentation focuses on the data collection performed at LAAS-CNRS and links some results to model parameters
17. Data Collection: Host Vulnerabilities
- Data collection performed at LAAS-CNRS over 21 months (1995-1997)
- Host vulnerabilities collected on a network used by more than 800 users
- Results based on
  - Some well-known host vulnerabilities/configuration features
  - Guessable passwords found by Crack (limited dictionary/rules)
- Experiment
  - Goal: observe the behavior of the users of a real computer network without interference (e.g., results were not reported to system administrators)
  - The LAAS-CNRS network is used by researchers and students in various branches of engineering (not only computer science)
  - The LAAS-CNRS network is representative of a moderately secure network (at that time)
18. Results: Guessable Passwords
- Observations
  - Overall increase of the ratio of vulnerabilities to users
  - Sharp decrease at day 470, probably due to action by system administrators
  - Jump at day 285, probably due to a change of the dictionary used by Crack
  - The number of vulnerabilities changes, but the rate of new vulnerabilities is stable (0.41 before day 282 and 0.4 after day 287)
  - The rate of removed vulnerabilities is also stable
- Use constant values for the rates of new/removed guessable passwords as the characterization of vulnerabilities due to guessable passwords?
19. Results: Vulnerabilities in Configuration Files and .rhosts
- Configuration files
  - .login, .logout, .xinitrc, ...
  - Sharp increases/decreases combined with periods of stability
  - Model the ratio of vulnerabilities to users as a step function?
- .rhosts file
  - Remarkable stability of the ratio of vulnerabilities to users
  - Ratios between 2.5 and 4.5
  - Use a constant value (interval or average) to characterize the ratio?
[Figures: .rhosts files; vulnerabilities in configuration files]
20. Proposed Models
- We focus on guessable passwords and vulnerabilities in configuration files from now on
- Guessable passwords
  - Combination of a linear function and a step function
- Vulnerabilities in configuration files
  - Step function
[Figures: guessable passwords; vulnerabilities in configuration files]
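The proposed "linear plus step" form can be written down directly. This is a sketch of how one might instantiate it: the function name, the jump representation, and the example slope of 0.4 new vulnerabilities per day (the stable rate observed for guessable passwords) are my choices, not the talk's.

```python
def vulnerability_count(t, slope, jumps):
    """Linear-plus-step model of the vulnerability count at day t:
    steady arrival of new vulnerabilities at `slope` per day, plus
    discrete jumps given as (day, size) pairs, e.g. a Crack dictionary
    change (positive step) or an administrator clean-up (negative step).
    Setting slope = 0 gives the pure step model proposed for
    configuration-file vulnerabilities."""
    return slope * t + sum(size for day, size in jumps if t >= day)
```

For instance, `vulnerability_count(300, 0.4, [(285, 20)])` reproduces the rough shape of the guessable-password curve around the day-285 jump (the jump size 20 is purely hypothetical).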
21. Linking Collected Data to Model Parameters
- Link to model parameters
  - D: time before infiltration of a security domain
  - attack_host: rate of attacks on a host
  - D and attack_host are functions of the number of vulnerabilities and attacks
- Let us assume a constant rate of attacks to exploit guessable passwords (r_attack_passwords) and vulnerabilities in configuration files (r_attack_configuration)
- As a first approximation we have
- More work is needed to
  - Obtain the distributions of attacks on various vulnerabilities
  - Confirm the models of vulnerabilities
- However, this first analysis already gives some hints on the link between the collected data and the parameter value estimations
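The slide's first-approximation equation did not survive extraction, so the following is only a guess at its shape, stated as an assumption: a per-host attack rate linear in the two vulnerability counts, using the constant per-vulnerability rates r_attack_passwords and r_attack_configuration defined above.

```python
def attack_host(n_passwords, n_config,
                r_attack_passwords, r_attack_configuration):
    """HYPOTHETICAL first approximation (the actual slide equation is not
    reproduced in the source): each guessable password is attacked at the
    constant rate r_attack_passwords and each configuration-file
    vulnerability at r_attack_configuration, so the per-host attack rate
    is linear in the two vulnerability counts."""
    return n_passwords * r_attack_passwords + n_config * r_attack_configuration
```

Under this assumed form, the vulnerability models of slide 20 translate directly into a time-varying attack_host, and D would follow from the time for such attacks to infiltrate a domain.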
22. Summary
- Probabilistic validation is a useful technique for validating intrusion-tolerant systems
- It should be used in all phases of a system's lifecycle
- Models are useful for making comparative studies and evaluating design alternatives, even if exact parameter values are not known
- Better parameter value estimation is necessary for implemented systems, to quantify the intrusion tolerance obtained
- More work is needed to build better models and to better determine input parameter values