Title: AWDRAT: Architectural Differencing, Wrappers, Diagnosis, Recovery, Adaptive Software and Trust Management
1. AWDRAT: Architectural Differencing, Wrappers, Diagnosis, Recovery, Adaptive Software and Trust Management
- Howie Shrobe MIT CSAIL
- Bob Balzer Teknowledge
2. AWDRAT: What are we trying to do?
- Applications that continue to render useful services even after a successful attack.
  - Particularly for Legacy Systems
3. Overview Questions
- How is it done now?
  - Hand-inserted tests, assertions, error handlers. Rarely done systematically.
  - For legacy systems, you're often SOL.
- How do we make a difference?
  - Systematizing checking, diagnosis and recovery, and providing the core of each service.
- Risks and Mitigations
  - Too big or too small a system: start smallish (MAF editor), grow as capable.
  - Build individual facilities that have independent value.
4. AWDRAT: How do we show success?
- Detect incorrect application behavior
- Correctly diagnose the cause
- Choose an appropriate alternative method to realize the goal
- Red Team experiments
- In-lab experiments
5. AWDRAT: Technical Approach
[Figure: block diagram of the approach. Labels: System Models, Wrapper Synthesizer, Architectural Differencing, Other Sensors / Intrusion Detectors, Attack Plan Recognition, Attack Plan, Diagnosis, Trust Model (Behavior, Compromises, Attacks), Adaptive Software, Recovery and Regeneration.]
6. The "On One Foot" Story
- We use Adaptive Software that selects one of several possible methods based on expected net benefit
- The code is annotated by Wrapper Generators
- We run the code in parallel with a model
- Wrappers send events to Architectural Differencing
- Deviations between model predictions and observations from the wrappers are symptoms
- Diagnosis infers possible compromises of the underlying resources and updates a Trust Model
- Recovery is effected by restoring corrupted data resources and picking a new method in light of the updated Trust Model
7. Distinguishing AWDRAT and PMOP
- AWDRAT
  - Detecting misbehaving software
  - Hijacks, overprivileged scripts, trap doors, faults
- PMOP
  - Detecting misbehaving (human) operators
  - Malicious intent, operator error
8. JBI DemVal Dataflow (via Publish/Subscribe)
9. What We've Got
The Good, The Bad, The Ugly
- End-To-End Demonstration (demo shortly)
- Working Prototypes of AWDRAT components
- Working models and rules of target application
- Working integration of AWDRAT components
- A day late and a JVM incompatibility behind
- Architecture Visualizer (demo shortly)
- Event-Sequence diagrams
- Architecture dataflow
10. What We're Missing
The Good, The Bad, The Ugly
- Realistic Rules (Domain Knowledgeable)
- Would be created by SMEs in real deployment
- Comprehensive Rule Set
- Would be created by SMEs in real deployment
- ??
11. Accommodations
The Good, The Bad, The Ugly
- Java code base
- Created wrapper infrastructure for Java
- Limited Library of Alternative Java Methods
- Utilized alternative Windows Libraries
- Available JBI components to wrap
- Detailed on next slide
12. JBI DemVal Dataflow (via Publish/Subscribe)
[Figure: publish/subscribe dataflow among JBI DemVal components. Labels: External, AODB, MAF, Proposed MI, AS, Approved MI, CAF, LOC, JEES, EDC, JW, SPI, TAP, ATO, Chem Hazard, CHW, CHI, TNL, Targeting, WLC, Weather Hazard, CHA, WH, Combat Ops.]
13. AWDRAT
14. A system adapts by having many methods for each service
The system selects the method that maximizes net benefit
15. Methods are Described at the Plan Level
- Executable Code
- Selection Meta Data
  - Resource requirements
  - Constraints on the resources
  - Qualities of service delivered given a set of resources
- Architectural model
  - Decomposition into sub-modules
  - Data and Control Flow
  - Pre, Post and Maintain conditions for each component
  - Causal links between these conditions
  - Primitive, directly executable sub-modules
  - Expected and prohibited events
  - Timing constraints
16. Selection Meta Data Language
(define-service (image-load image-loaded ?image ?path)
  (speed fast slow)
  (image-quality high low)
  (safety checked unchecked))

(define-method native-image-load
  service image-load
  features ((speed fast)
            (image-quality ?quality-of-image-type)
            (safety unchecked))
  other-parameters (?path)
  resources (?image-file)
  resource-constraints (image-file-exists ?path ?image-type ?image-file
                        image-type-consistent-with-method ?image-type native-image-load
                        image-quality ?image-type ?quality-of-image-type))

(define-method pure-java-image-load
  service image-load
  features ((speed slow)
            (image-quality ?quality-of-image-type)
            (safety checked))
  other-parameters (?path)
  resources (?image-file)
  resource-constraints (image-file-exists ?path ?image-type ?image-file
                        image-type-consistent-with-method ?image-type pure-java-image-load
                        image-quality ?image-type ?quality-of-image-type))
17. Utility Language
(defun decide-how-to-load-images (max-value path)
  (let ((utility-function
          (utility-function-for-service 'image-load
            '((speed fast (>> 1.1) speed slow)
              (image-quality high (>> 1.5) image-quality low)
              (safety checked (>> 2) safety unchecked)
              (speed fast (>> 1.1) safety checked)
              (image-quality high (>> 1.2) safety checked)
              (speed fast (>> 1.5) image-quality high))
            max-value)))
    (find-em 'image-load utility-function nil (list path))))
18. Preferences and Utility Functions
- Utility functions are used to assign a numerical value to a particular way of doing a task.
- Utility functions are not a natural way for people to express themselves
- What people can state easily is their preferences
  - E.g. Security and Speed are twice as good as High Image Quality
- A typical set of preference statements
  - I prefer convenience of use to high security if I'm not under attack.
  - I prefer high security to convenience if I'm under attack
- Preferences are compiled into Utility Functions
19. How to Compile a Utility Function
- Convert preferences into bit-vectors of variables
- Each multi-valued attribute is assigned a sub-vector to cover its range of values
- Bit-vectors form the nodes of a graph
- Preferences are compiled into weighted arcs
- Leaf nodes have value 1
- Other nodes have the least value consistent with the arc weights
- Compute using Dynamic Programming
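The compilation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the AWDRAT compiler: nodes are tuples of attribute values rather than bit-vectors, only same-attribute preferences are handled, and the names `ATTRIBUTES`, `PREFERENCES`, `arcs_from` and `utility` are hypothetical.

```python
from functools import lru_cache

# Attribute order fixes the layout of each node (one slot per attribute).
ATTRIBUTES = ("speed", "image-quality", "safety")

# (attribute, preferred value, weight, less-preferred value).
# Weights are illustrative; cross-attribute preferences are not modeled here.
PREFERENCES = [
    ("speed", "fast", 1.1, "slow"),
    ("image-quality", "high", 1.5, "low"),
    ("safety", "checked", 2.0, "unchecked"),
]


def arcs_from(node):
    """Yield (weight, worse_node) for every preference that applies to node."""
    for attribute, better, weight, worse in PREFERENCES:
        i = ATTRIBUTES.index(attribute)
        if node[i] == better:
            yield weight, node[:i] + (worse,) + node[i + 1:]


@lru_cache(maxsize=None)  # memoization plays the role of dynamic programming
def utility(node):
    """Least value with U(node) >= weight * U(worse) on every arc; leaves get 1."""
    return max((w * utility(worse) for w, worse in arcs_from(node)), default=1.0)
```

For example, `utility(("slow", "low", "unchecked"))` is the leaf value 1.0, and every other node's value is forced upward only as far as its outgoing arcs require.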
20. Method Selection
- Given a service name and a utility function, method selection uses a Prolog-like query language to
  - Find relevant methods
  - Find resources meeting that method's constraints
  - Bind the service qualities
- For each successful query it
  - Calculates the resource cost
  - Calculates the utility of the service parameters
  - Calculates net benefit
- Selects the method that maximizes net benefit
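A rough sketch of this selection loop in Python, assuming a plain generator in place of the Prolog-like query language. The tables `METHODS`, `IMAGE_QUALITY` and `UTILITY` and both function names are hypothetical; the utility values are simply copied from the "Good Case" slide.

```python
# Hypothetical method catalog for the image-load service.
METHODS = [
    {"name": "native-image-load", "service": "image-load",
     "speed": "fast", "safety": "unchecked", "types": {"gif", "jpg"}},
    {"name": "pure-java-image-load", "service": "image-load",
     "speed": "slow", "safety": "checked", "types": {"gif", "jpg"}},
]
IMAGE_QUALITY = {"gif": "high", "jpg": "high"}

# Utility per (speed, image-quality, safety); values taken from the Good Case slide.
UTILITY = {
    ("fast", "high", "unchecked"): 5.0,
    ("slow", "high", "checked"): 4.444444,
}


def candidates(service, files):
    """Find relevant methods, resources meeting their constraints, and bound qualities."""
    for method in METHODS:
        if method["service"] != service:
            continue
        for path in files:
            image_type = path.rsplit(".", 1)[-1]
            if image_type in method["types"]:
                qualities = (method["speed"], IMAGE_QUALITY[image_type], method["safety"])
                yield method["name"], path, qualities


def select_method(service, files, resource_cost=0.0):
    """Return (net benefit, method, resource) with the highest net benefit."""
    scored = [(UTILITY[q] - resource_cost, name, path)
              for name, path, q in candidates(service, files)]
    return max(scored, key=lambda item: item[0])


best = select_method("image-load", ["/foo/bar/baz.gif", "/foo/bar/baz.jpg"])
```

With no evidence of compromise, the fast native loader wins on net benefit, as in the Good Case slide.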
21. Decision Making with Compromises
- M is a method; the vector R is an assignment of specific resources to M
- Each resource R_i in R can be in one of several specific modes
- A resource state RS is an assignment of a specific mode R_i,j to each resource R_i in R
- The Trust Model (via diagnosis) assigns a probability to each resource state
- Given a method M and a resource state RS, we can calculate the vector of service qualities SQ(M, RS) that will be delivered.
- The Utility function U assigns a numeric value to a vector of service qualities SQ consistent with the requestor's preferences
22. Expected Benefit of a Resourced Method
- Where P(RS_k) is the joint probability that each resource R_i is in the mode indicated by RS_k, as assigned by the Trust Model.
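The formula on this slide did not survive in the transcript. A reconstruction consistent with the definitions of U, SQ and the resource states RS_k given on the preceding slide would be the following; the symbol EB(M, R) is an assumption about the original notation, not a confirmed quotation.

```latex
\[
  EB(M, R) \;=\; \sum_{k} P(RS_k)\, U\bigl(SQ(M, RS_k)\bigr)
\]
```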
23. Successful Execution
- Each method M has a set of preconditions Pre_i(M) that are necessary for M to execute successfully.
- Some preconditions may not be directly observable, particularly at diagnosis and recovery time. Instead, diagnosis assigns a probability to each of these and to their conjunction, P(AND_i Pre_i(M)).
- The expected benefit EBsuccess of successful execution is the expected benefit conditioned by this probability
24. Failing Execution
- If the preconditions of M don't hold, then the method will fail.
- The failure can be assigned a cost FailCost(M)
  - This is ideally calculated by using a simulation model of the organization (necessary for insider threat).
  - But it can be provided by table lookup
- The expected cost of failure is this cost weighted by the probability that the method will fail due to the preconditions not being satisfied.
25. Expected Net Cost Benefit
- The total expected benefit is the difference between the expected benefit of success and the expected cost of failure.
  - TotalEB(M, R_i) = EBsuccess(M, R_i) - ECfail(M)
- Each vector of resources R_i has a cost RC(R_i)
- The Expected Cost Benefit is the difference between the Total Expected Benefit and the cost of the resources
  - ECB(M, R_i) = TotalEB(M, R_i) - RC(R_i)
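The chain of definitions on the last few slides can be put together as a short Python sketch. It assumes that "conditioned by this probability" means multiplying by the probability that the preconditions hold; the function and parameter names are hypothetical, and the input values below are illustrative rather than taken from the deck.

```python
def expected_benefit(state_probs, utility_of_state):
    """Sum over resource states RS_k of P(RS_k) * U(SQ(M, RS_k))."""
    return sum(p * utility_of_state[state] for state, p in state_probs.items())


def expected_cost_benefit(eb, p_pre, fail_cost, resource_cost):
    """ECB = (EBsuccess - ECfail) - RC, with p_pre = P(all preconditions hold)."""
    eb_success = p_pre * eb                 # assumption: conditioning as a product
    ec_fail = (1.0 - p_pre) * fail_cost     # failure cost weighted by P(failure)
    total_eb = eb_success - ec_fail
    return total_eb - resource_cost


# Illustrative inputs (assumed): one fully trusted resource state with utility 5.0,
# a 10% chance that the preconditions hold, and a failure cost of 10.
eb = expected_benefit({"normal": 1.0}, {"normal": 5.0})
ecb = expected_cost_benefit(eb, p_pre=0.1, fail_cost=10.0, resource_cost=0.0)
```

With these assumed inputs the sketch gives an expected success benefit of 0.5, an expected failure cost of 9.0 and a net value of -8.5, which can be compared with the first row of the "Bad Case" slide.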
26. Optimal Resource Assignment
- The system should select the method and set of resources that maximize the Expected Cost Benefit
27. Decision Making: Good Case
- Method: NATIVE-IMAGE-LOAD
  - Features: SPEED FAST, IMAGE-QUALITY HIGH, SAFETY UNCHECKED
  - Resources: /foo/bar/baz.gif
  - Resource Cost: 0, Failure Cost: 0, Utility: 5.0, Tradeoff: 5.0
- Method: NATIVE-IMAGE-LOAD
  - Features: SPEED FAST, IMAGE-QUALITY HIGH, SAFETY UNCHECKED
  - Resources: /foo/bar/baz.jpg
  - Resource Cost: 0, Failure Cost: 0, Utility: 5.0, Tradeoff: 5.0
- Method: PURE-JAVA-IMAGE-LOAD
  - Features: SPEED SLOW, IMAGE-QUALITY HIGH, SAFETY CHECKED
  - Resources: /foo/bar/baz.gif
  - Resource Cost: 0, Failure Cost: 0.0, Utility: 4.444444, Tradeoff: 4.444444
- Method: PURE-JAVA-IMAGE-LOAD
  - Features: SPEED SLOW, IMAGE-QUALITY HIGH, SAFETY CHECKED
28. Decision Making: Bad Case
- Method: NATIVE-IMAGE-LOAD
  - Features: SPEED FAST, IMAGE-QUALITY HIGH, SAFETY UNCHECKED
  - Resources: /foo/bar/baz.gif
  - Resource Cost: 0, Failure Cost: 9.0, Utility: 0.5, Tradeoff: -8.5
- Method: NATIVE-IMAGE-LOAD
  - Features: SPEED FAST, IMAGE-QUALITY HIGH, SAFETY UNCHECKED
  - Resources: /foo/bar/baz.jpg
  - Resource Cost: 0, Failure Cost: 9.9, Utility: 0.050000004, Tradeoff: -9.849999
- Method: PURE-JAVA-IMAGE-LOAD
  - Features: SPEED SLOW, IMAGE-QUALITY HIGH, SAFETY CHECKED
  - Resources: /foo/bar/baz.gif
  - Resource Cost: 0, Failure Cost: 0.4995, Utility: 0.0044444446, Tradeoff: -0.49505556
- Method: PURE-JAVA-IMAGE-LOAD
  - Features: SPEED SLOW, IMAGE-QUALITY HIGH, SAFETY CHECKED
  - Resources: /foo/bar/baz.jpg
  - Resource Cost: 0, Failure Cost: 0.49995, Utility: 4.4444445e-4, Tradeoff: -0.49950555
29. AWDRAT
30. Wrappers Make the System Transparent
- Methods are executed as raw code (particularly when it's a legacy system)
- How do we know what's going on?
- Wrappers inserted in good places
  - Architectural model tells us what those are
- Wrappers intercept events
- Wrappers squirrel away important information safely
31. Wrapper Synthesis
[Figure: wrapper synthesis diagram. Labels: Real Component (in, out), Translator, Simulated Component (in', out'), Differencer, Backup, D1. Caption: Automatically generate Probes, Plumbing, Monitoring, Backup Data.]
32. Data Provisioning
[Figure: same diagram as the Wrapper Synthesis slide. Labels: Real Component (in, out), Translator, Simulated Component (in', out'), Differencer, Backup, D1. Caption: Provide backup copies of data resources.]
33. JavaWrap
- Facility to insert Wrappers around Java code without changing the source code
- Depends on JVMTI to rewrite byte-code at class loading time.
- Three types of Wrappers
  - Tracers: like the LISP trace facility, print customizable entry and exit information
  - Monitors: get control both before and after the real method
  - Transformers: get control before and after; control whether the real method is invoked and with what arguments; control what value is returned.
  - Transformers are used to implement dynamic dispatch
- Specified at start up with an XML spec
  <METHOD signature="(Ljava/lang/String;Z)V"
          monitor="tek.mafMed.Mediators.ConstructMission" />
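The three wrapper roles can be illustrated with Python decorators. This is only an analogy under stated assumptions: JavaWrap itself rewrites Java byte-code at class-load time, whereas the hypothetical `tracer`, `monitor` and `transformer` helpers below wrap functions at definition time. The trust flag and loader functions are invented for the example.

```python
import functools


def tracer(fn):
    """Print entry and exit information around the real function."""
    @functools.wraps(fn)
    def wrapped(*args, **kwargs):
        print(f"enter {fn.__name__} args={args} kwargs={kwargs}")
        result = fn(*args, **kwargs)
        print(f"exit  {fn.__name__} -> {result!r}")
        return result
    return wrapped


def monitor(before, after):
    """Give callbacks control before and after the real function."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapped(*args, **kwargs):
            before(fn.__name__, args)
            result = fn(*args, **kwargs)
            after(fn.__name__, result)
            return result
        return wrapped
    return decorate


def transformer(mediate):
    """mediate(real, args, kwargs) decides whether to call real and what to return."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapped(*args, **kwargs):
            return mediate(fn, args, kwargs)
        return wrapped
    return decorate


EVENTS = []                      # stands in for events sent to the differencer
TRUSTED = {"native": False}      # hypothetical trust-model flag


def pure_python_load(path):
    return f"checked:{path}"


def dispatch(real, args, kwargs):
    """Dynamic dispatch: use the real method only while it is trusted."""
    if TRUSTED["native"]:
        return real(*args, **kwargs)
    return pure_python_load(*args, **kwargs)


@monitor(lambda name, args: EVENTS.append(("entry", name, args)),
         lambda name, result: EVENTS.append(("exit", name, result)))
@transformer(dispatch)
def native_load(path):
    return f"unchecked:{path}"


result = native_load("a.gif")
```

Because the native loader is marked untrusted, the transformer silently reroutes the call to the checked alternative while the monitor still records the entry and exit events.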
34. AWDRAT Execution Architecture
[Figure: execution architecture. Labels: JBI Server.]
35. Data Flow Demo
36. Event Diagram Demo
37. AWDRAT
38. Architectural Differencing
- The Architectural model is part of a method's description.
- The Architectural model is interpreted in parallel with method execution
- Wrappers send events to the Arch Diff coordinator
- The coordinator checks that the state of the system at event time is consistent with predictions
  - In particular, it checks the prerequisite and post-conditions of each sub-module
- The failure of a condition check initiates diagnostic reasoning
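A minimal sketch of such a coordinator in Python, assuming the model is a table of predicate functions per step. The `Differencer` class, the `add-leg` step and the state dictionaries are hypothetical stand-ins for the real plan model and wrapper events.

```python
class Differencer:
    """Checks wrapper events against a model of pre- and post-conditions."""

    def __init__(self, model):
        self.model = model        # step name -> {"pre": predicate, "post": predicate}
        self.conflicts = []       # failed checks; these would trigger diagnosis

    def on_event(self, step, kind, state):
        """kind is 'pre' (method entry) or 'post' (method exit)."""
        holds = self.model[step][kind](state)
        if not holds:
            self.conflicts.append((step, kind))
        return holds


# Hypothetical model: adding a leg must insert it and must not delete anything.
model = {
    "add-leg": {
        "pre": lambda s: s["leg"] not in s["legs"],
        "post": lambda s: s["leg"] in s["legs"]
                          and len(s["legs"]) == s["size_before"] + 1,
    }
}

differencer = Differencer(model)
legs = ["A-B"]
differencer.on_event("add-leg", "pre", {"leg": "B-C", "legs": legs})
size_before = len(legs)
legs[:] = ["B-C"]   # faulty add: inserts the new leg but drops the old one
differencer.on_event("add-leg", "post",
                     {"leg": "B-C", "legs": legs, "size_before": size_before})
```

The precondition check passes, but the postcondition check fails because the list did not grow, so the differencer records one conflict for the `add-leg` step.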
39. Architectural Differencing
[Figure: Real Environment (Implementation) containing the Real Component (in, out) and Real Output; Simulated Environment (Model) containing the Simulated Component (in', out') and Simulated Output; connected through Reflection, a Translator and a Differencer, which produces a List of Conflicts.]
40. MAF - CAF Architectural Differencing
- MAF is a Flight Plan Graphic Editor
- Typical GUI Program
  - GUI actions invoke actions on core data structures
  - MissionObject, Events, Legs, Sorties, Movements
- Differencer checks the validity of basic operations on these data structures
  - Consistency of structures
  - Add operators don't delete, and actually insert the intended stuff
- Maintains a running simulation of these operations
  - Uses Lisp-Java integration to do this in Lisp
41. Architecture Differencer Demo
42. AWDRAT
43. Model-Based Diagnosis
- Focus is on diagnosing misbehaviors of computations in order to assess the health of the underlying resources
- Given
  - Plan structure of the computation describing expected behavior
  - Observation of actual behavior that deviates from expectations
- Produce
  - Localization: which computational steps misbehaved
  - Characterization: what did they do wrong
  - Inferences about the compromised state of the computational resources involved.
  - Inferences about what attacks enabled the compromise to occur
  - The likelihood that other resources have been compromised
  - The likelihood that critical constraints have not been satisfied
44. Ontology of the Diagnostic Task
- A Computation is the execution of a piece of code on some computational Host
- Computations utilize a set of resources (e.g. host computers, binary executable files, databases etc.)
- Resources have vulnerabilities
- Vulnerabilities enable attacks
- A successful attack on a resource causes that resource to enter a compromised state
- A computation that utilizes a compromised resource may exhibit a misbehavior.
- Misbehaviors are the symptoms which initiate diagnostic activity, leading to updated assessments of the state of the computational environment. These form the Trust Model.
45. Dependency Maintenance
- Architectural Differencing actively checks those prerequisites, post-conditions and other constraints in the plan that are easily observable.
- The Diagnostic executive builds a dependency graph between checked and inferred conditions
  - Post-conditions and events within a step are justified with a link to the assumption that the step executed normally and to the prerequisite conditions.
  - Preconditions are justified by the causal link in the plan that connects them to a set of post-conditions of prior steps
  - If a check succeeds, that condition is justified as a premise
  - If a check fails, diagnosis is initiated.
46. Dependency Chains Built by Model Simulation
[Figure: dependency graph. Labels: Step1, Step1 Normal Mode, Preconditions Step1, Post-Condition1, Post-Condition2, Preconditions Step2, Preconditions Step3. Annotations: "Checked, treated as premise"; "Solid arrows are P = 1".]
47. Diagnosis with Fault Models
- In addition to modeling the normal behavior of each component, we provide models of known abnormal behaviors.
- A Leak Model covers unknown failures.
- These alternative behavioral models are called computational modes.
- The diagnostic task is to find an assignment of a mode to each computational step such that the behavior predicted by the models associated with those modes is consistent with the observations.
- A set of assignments consistent with observations is a diagnosis; there may be several diagnoses.
- A set of assignments at variance with observations is a conflict.
[Figure label: Delay2,4]
48. Dependency Chains: Computational Modes
[Figure: dependency graph with alternative modes. Labels: Step1, Step1 Normal Mode, Step1 Abnormal Mode1, Preconditions Step1, Post-Condition1, Post-Condition2, Bogus Condition, Preconditions Step2, Preconditions Step3. Annotations: "Checked, treated as premise"; "Unjustified if in abnormal mode"; "Solid arrows are P = 1".]
49. Modeling Underlying Resources
- The misbehavior of the software component may actually be due to a compromise of resources used in that computation.
- We extend the modeling framework, showing the dependence of computations on the resources
- Each resource has models of its state of compromise (i.e. its modes)
- The modes of the resources are linked to the modes of the computation by conditional probabilities
- E.g. if a computation resides on a node which hosts a parasitic process, then the computation is likely to be slowed down.
[Figure: Component 1 "uses resource" Image-1; each "has models". Mode labels: Normal (probability 90%), Hacked (probability 10%); Normal, Highjacked. Link labels: conditional probability .2, conditional probability .3.]
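As a worked illustration of how such conditional probabilities feed back into the assessment of a resource, the following Python sketch applies Bayes' rule to a single resource. The priors follow the 90/10 split shown on this slide; the likelihood values and the `posterior` helper are hypothetical and are not the conditional probabilities from the figure.

```python
def posterior(prior, likelihood):
    """prior: mode -> P(mode); likelihood: mode -> P(observed misbehavior | mode)."""
    joint = {mode: prior[mode] * likelihood[mode] for mode in prior}
    total = sum(joint.values())
    return {mode: value / total for mode, value in joint.items()}


# Priors from the slide; likelihoods are assumed for illustration only.
prior = {"normal": 0.9, "hacked": 0.1}
likelihood = {"normal": 0.05, "hacked": 0.3}

post = posterior(prior, likelihood)
```

Under these assumed numbers a single observed misbehavior raises the probability that the resource is hacked from 0.1 to 0.4, which is the kind of update the Trust Model records.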
50. Bayesian Dependency Diagram
[Figure: dependency graph annotated with probabilities. Labels: Host1 Normal Mode, IMAGE1 Abnormal Mode, Step1, Step1 Normal Mode, Step1 HIGHJACKED, Preconditions Step1, Post-Condition1, Post-Condition2, Bogus Condition, Preconditions Step2, Preconditions Step3. Probability labels: P = .8, P = .9. Annotations: "Logical and probability table"; "Logical or probability table"; "Checked, pinned at P = 1"; "Solid arrows are P = 1".]
51. Adding Attack Models
- An Attack Model specifies the set of attacks that are believed to be possible in the environment
  - Each resource has a set of vulnerabilities
  - Vulnerabilities enable attacks on that resource
  - Computational Vulnerability Analysis of the actual configuration can determine the possible attack model
- Given a vulnerability and an attack that can exploit the vulnerability, it is possible that the attack compromised the resource with the vulnerability
  - This is a conditional probability
52. Bayesian Dependency Diagram
[Figure: dependency graph extended with an attack node. Labels: Bad Image File Attack, Host1 Normal Mode, Host1 HighJacked, Step1, Step1 Normal Mode, Step1 Abnormal Mode1, Preconditions Step1, Post-Condition1, Post-Condition2, Bogus Condition, Preconditions Step2, Preconditions Step3. Probability labels: P = .8, P = .9, P = .7. Annotations: "Logical and probability table"; "Logical or probability table"; "Checked, pinned at P = 1".]
53. Diagnostic Algorithm
- Start with each computation step in the normal mode
- Repeat: check for consistency of the current model with observations
- If inconsistent, then it's a conflict
  - Add a new node to the Bayesian Dependency network
  - This node represents the logical-and of the modes in the conflict.
  - Its truth-value is pinned at FALSE.
  - Prune out all possible solutions which are a superset of the conflict set.
  - Pick another set of models from the remaining solutions
- If consistent, add to the set of possible diagnoses
- Continue until all inconsistent sets of models (conflicts) are found
- Solve the Bayesian network
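The search part of this loop can be sketched in Python as follows. The sketch covers only conflict detection and superset pruning; building and solving the Bayesian network is left out. The `diagnose` function, the two-step delay model and the observations are all hypothetical.

```python
from itertools import product


def diagnose(modes, find_conflict):
    """modes: step -> list of mode names, normal mode first.
    find_conflict(assignment) returns the (step, mode) pairs responsible for an
    inconsistency with the observations, or None if the assignment is consistent."""
    steps = list(modes)
    conflicts, diagnoses = [], []
    # product() yields the all-normal assignment first.
    for combo in product(*(modes[step] for step in steps)):
        assignment = dict(zip(steps, combo))
        pairs = frozenset(assignment.items())
        if any(conflict <= pairs for conflict in conflicts):
            continue                     # pruned: superset of a known conflict
        conflict = find_conflict(assignment)
        if conflict is None:
            diagnoses.append(assignment)
        else:
            conflicts.append(frozenset(conflict))
    return diagnoses, conflicts


# Toy model: each mode predicts a delay; observations are completion times.
DELAY = {"normal": 1, "slow": 4}
OBSERVED = {"after_step1": 4, "after_step2": 5}


def find_conflict(assignment):
    t1 = DELAY[assignment["step1"]]
    if t1 != OBSERVED["after_step1"]:
        return {("step1", assignment["step1"])}
    if t1 + DELAY[assignment["step2"]] != OBSERVED["after_step2"]:
        return set(assignment.items())
    return None


diagnoses, conflicts = diagnose(
    {"step1": ["normal", "slow"], "step2": ["normal", "slow"]}, find_conflict)
```

Here the conflict on step1 being normal prunes one candidate without checking it, and the only surviving diagnosis is that step1 ran slow while step2 ran normally.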
54. What the Bayesian Network Tells You
- After adding all conflict nodes to the Bayesian network:
- The posterior probabilities of the underlying resource modes tell you how likely each compromised (or healthy) mode is.
  - This is an aggregate estimate
  - These probabilities are part of the trust model and guide resource selection in recovery and future computations.
- The posterior probability of each post-condition assertion
  - This is an aggregate estimate
  - This gives us an estimate of what conditions are actually true. This also guides recovery.
- The posterior probability of each possible attack
  - This implies possible compromises of other similar resources that have not yet been observed, and that will also guide recovery.
55. Three-Tiered Model
[Figure label: Program Memory]
- The resource tier couples the modes of the computation tier
- The attack tier couples the modes of the resource tier.
- We have 2 tiers of common mode failures.
- Common mode coupling also precludes certain diagnoses, on the grounds that no single attack could have caused the compromises necessary to cause the components to misbehave as observed.
56. Summary of Diagnosis
- The result of Diagnosis is the construction of a Bayesian network coupling attacks, resource vulnerabilities, compromised states of the resources and finally the observed behavior of a computation.
- This network assigns posterior probabilities to
  - Assertions modeling the state of the computation
    - These assertions are the prerequisite and post-conditions of the various computational steps in the plan diagram
  - Compromised modes of the resources used by the computation
- The recovery task is to find a new plan and a new set of resources that is most likely to achieve the main goal of the plan, given this updated probabilistic information about the world.
57. Example of MAF Diagnosis
58. AWDRAT
59. The Nature of a Trust Model
- Trust is a continuous, probabilistic notion
  - All computational resources must be considered suspect to some degree.
- Trust is a dynamic notion
  - The degree of trustworthiness may change with further compromises
  - The degree of trustworthiness may change with efforts at amelioration
  - The degree of trustworthiness may depend on the political situation and the motivation of a potential attacker
- Trust is a multidimensional notion
  - A system may be trusted to deliver a message while not being trusted to preserve its privacy.
  - A system may be unsafe for one user but relatively safe for another.
60. Three Tiers of a Trust Model
- Attack Level: history of events that suggest multi-stage attacks and intent of attackers
  - Penetration, denial of service, unusual access, flooding
- Compromise Level: state of the mechanisms that provide key properties
  - Login control
  - Job admission control
  - Scheduler
  - Key manager
  - DLLs, databases, source code
- Trust Level: degree of confidence in key properties
  - Privacy: stolen passwords, stolen data, packet snooping
  - Integrity: parasitized, changed data, changed code
  - Authentication: changed keys, stolen keys
  - QoS: slow execution
61. How Do We Know What Attacks Are Possible?
- Build a Model of the Computational Environment
  - System Structure, Resources, Permissions
- Plan against it as if you're the Red Team
- Reason abstractly
  - Typical attackers
  - Typical resource of a specific type
  - Reason about control and dependency
- Develop a multistage plan for compromising a typical resource
62. Modeling System Structure
[Figure: component model of a computer system. Nodes: File System, files, resources, Operating System, Access Controller, Hardware, Logon Controller, User Set, Processor, Job Admitter, Memory, Device Controllers, Scheduler, Work Load, Devices, Device Drivers, Scheduler Policy. Relations: Part-of, controls, Input-to, Resides-In.]
63. Modeling the Topology
[Figure: network topology. Example machine annotation: Machine name: sleepy; OS Type: Windows-NT; Server Suite: IIS ..; User Authentication Pool: Dwarfs. Other elements: Switch (subnet restrictions ..), Switch (subnet restrictions ..), Router (Enclave restrictions ..).]
Topology tells you who can share (and sniff) which packets, and who can affect what types of connections to whom.
64. Key Notions: Dependency and Control
- Start with the desirable properties of systems
  - Reliable performance
  - Privacy of communications
  - Integrity and/or privacy of data
- Analyze which system components impact those properties
  - Performance - scheduler
  - Privacy - access-controller
- Rule 1: To affect a desirable property, control a component that contributes to the delivery of that property
65. Controlling Components (1)
- One way to gain control of a component is to directly exploit a known vulnerability
  - One way to control a Microsoft IIS web server is to use a buffer overflow attack on it.
[Figure labels: MAF Editor, Image Loading, "Is vulnerable to", "Takes control of", Malformed Image File Attack, Malformed File Attack.]
66. Controlling Components (2)
- Another way to control a component is to find an input to the component and then find a way to modify the input
  - Modify the scheduler policy parameters
[Figure labels: Scheduler, "Input to", "control by", Scheduler Policy Parameters.]
67. Modifying Data
- One way to modify data is to find a component which controls the data and then to find a way to gain control of that component
[Figure labels: Scheduler, "Input-of", "control by", Workload, "Controls", Job Admitter, Attack.]
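The control and dependency rules of the last four slides lend themselves to a small backward-chaining planner. The Python sketch below is an assumption-laden illustration, not the vulnerability analyzer used in AWDRAT: the relation tables mirror the scheduler / workload / job admitter example, and the buffer-overflow vulnerability of the job admitter is invented for the example.

```python
# Hypothetical system model; relation names follow the slides.
VULNERABLE_TO = {"job-admitter": ["buffer-overflow"]}      # assumed vulnerability
INPUTS = {"scheduler": ["workload", "scheduler-policy"]}   # data that is Input-to a component
CONTROLLED_BY = {"workload": ["job-admitter"]}             # component that controls the data
DELIVERS = {"performance": ["scheduler"]}                  # property -> contributing components


def plans_to_control(component, seen=frozenset()):
    """Yield step lists that end with the attacker controlling component."""
    if component in seen:          # avoid cycles in the dependency graph
        return
    seen = seen | {component}
    for attack in VULNERABLE_TO.get(component, []):
        yield [f"use {attack} on {component}"]
    for data in INPUTS.get(component, []):
        for plan in plans_to_modify(data, seen):
            yield plan + [f"control {component} via modified {data}"]


def plans_to_modify(data, seen):
    """Yield step lists that end with the attacker having modified data."""
    for controller in CONTROLLED_BY.get(data, []):
        for plan in plans_to_control(controller, seen):
            yield plan + [f"modify {data} through {controller}"]


def plans_to_affect(prop):
    """Rule 1: to affect a property, control a component that delivers it."""
    for component in DELIVERS.get(prop, []):
        for plan in plans_to_control(component):
            yield plan + [f"degrade {prop}"]


plans = list(plans_to_affect("performance"))
```

With this toy model the planner finds one multistage plan: attack the job admitter, use it to modify the workload, thereby control the scheduler, and so degrade performance.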
68. Affecting Data Integrity of MAF Plan
69. AWDRAT
70. The Recovery Process
- Recovery is driven by the Trust Assessments developed during diagnosis
  - World State
    - Can the prerequisite conditions of a method be assumed to hold, and with what probability
  - Compromise state of resources
    - Which resources are compromised
    - In what way are they compromised
- Three core problems
  - What resources to regenerate
  - Where to restart
  - How to continue after restart
- Regenerate if the delta in Expected Benefit is greater than the cost of regeneration
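The regeneration rule in the last bullet is simple enough to state directly in code. In this Python sketch the function name and the example figures are hypothetical.

```python
def resources_to_regenerate(candidates):
    """candidates: name -> (EB if regenerated, EB if left alone, regeneration cost).
    A resource is regenerated when the gain in expected benefit exceeds the cost."""
    return [name
            for name, (eb_regenerated, eb_as_is, cost) in candidates.items()
            if eb_regenerated - eb_as_is > cost]


# Illustrative figures only.
chosen = resources_to_regenerate({
    "image-file": (5.0, 0.5, 1.0),   # gain 4.5 > cost 1.0: regenerate
    "host": (5.0, 4.8, 3.0),         # gain 0.2 < cost 3.0: leave alone
})
```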
71. MAF-CAF Recovery
- During execution we've captured the execution history and the intended state of the data structures
- We've also updated the trust model based on diagnosis of the last failure
- Recovery can then be accomplished by restarting and replaying the history
- Recovery can also be accomplished by restarting and setting up the data structures to the intended state (if a complete enough trace was built)
- In either case, method selection will be driven by updated trust estimates
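The two recovery routes can be contrasted with a toy example. This Python sketch assumes a hypothetical `MissionEditor` with two operations standing in for the MAF data structures; the history format and both recovery functions are likewise invented for illustration.

```python
class MissionEditor:
    """Toy stand-in for the editor's core data structures."""

    def __init__(self):
        self.legs = []

    def add_leg(self, leg):
        self.legs.append(leg)

    def remove_leg(self, leg):
        self.legs.remove(leg)


def recover_by_replay(history):
    """Restart with a fresh editor and replay the captured operation history."""
    editor = MissionEditor()
    for operation, argument in history:
        getattr(editor, operation)(argument)
    return editor


def recover_by_restore(intended_state):
    """Restart and set the data structures directly to the intended state."""
    editor = MissionEditor()
    editor.legs = list(intended_state)
    return editor


history = [("add_leg", "A-B"), ("add_leg", "B-C"), ("remove_leg", "A-B")]
replayed = recover_by_replay(history)
restored = recover_by_restore(["B-C"])
```

Both routes end in the same state here; replay needs only the operation log, while direct restoration needs a complete enough record of the intended state.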
72. Client Reconstitution Demo
74. Technology Developed
- Java Wrappers
  - Windows DLL Wrappers ported to Java
  - Java methods wrapped with Java mediation code
- Architecture Differencer
  - Fine-grained checking of MAF-CAF execution
- Architecture Visualizer
- Vulnerability Analysis for JBI Scenario
- Diagnosis for MAF compromises
- Method diversity and dynamic dispatch for image loading
- Data Provisioning and Client Reconstitution