Title: AWDRAT: Architectural Differencing, Wrappers, Diagnosis, Recovery, Adaptive Software and Trust Management
1AWDRAT Architectural-Differencing, Wrappers,
Diagnosis, Recovery, Adaptive Software and Trust
Management
- Howie Shrobe MIT CSAIL
- Bob Balzer Teknowledge
2AWDRAT What are we trying to do?
- Applications that continue to render useful services even after a successful attack.
- Particularly for legacy information systems
Adaptive Software
Recovery and Regeneration
Trust Model Behavior Compromises Attacks
Diagnosis
Attack Plan Recognition
Architectural Differencing
Other Sensors Intrusion Detectors
Attack Plan
System Models
Wrapper Synthesizer
3The On One Foot Story
- We use Adaptive Software that selects one of several possible methods based on expected net benefit
- The code is annotated by Wrapper Generators
- We run the code in parallel with a model
- Wrappers send events to Architectural Differencing
- Deviations between model predictions and observations from the wrappers are symptoms
- Diagnosis infers possible compromises of the underlying resources and updates a Trust Model
- Recovery is effected by restoring corrupted data resources and picking a new method in light of the updated Trust Model
4Outline
- Review of AWDRAT
- Emphasis on major changes to the system
- Red Team
- Other Experimentation
- Detection, Diagnosis, Correction
- Modeling
- Next Steps
- New Directions
5AWDRAT
Adaptive (Decision Theoretic) Method Selection
Recovery and Regeneration
Trust Model Behavior Compromises Attacks
Diagnosis
Architectural Differencer
Wrapper Synthesizer
System Models
6Wrappers Make The System Transparent
- Methods are executed as raw code (particularly when it's a legacy system)
- How do we know what's going on?
- Wrappers inserted in good places
- Architectural model tells us what those are
- All Action Performed Methods (for Swing)
- Key Data Structure Manipulators
- Wrappers intercept events
- Wrappers squirrel away important information
safely
7JavaWrap
- Facility to insert Wrappers around Java code without changing the source code
- Depends on JVMTI to rewrite byte-code at class loading time.
- Three types of Wrappers
- Tracers: Like the LISP trace facility, print customizable entry and exit information
- Monitors: Get control both before and after the real method
- Transformers: Get control before and after, control whether the real method is invoked and with what arguments, and control what value is returned.
- Transformers are used to implement dynamic dispatch
- Specified at start up with XML spec
- <METHOD signature="(Ljava/lang/String;Z)V" monitor="tek.mafMed.Mediators.ConstructMission" />
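The three wrapper types can be illustrated with a minimal Python sketch. This is only an analogy: JavaWrap itself rewrites Java byte-code, and the decorator names, the `events` list, and the `load_image` and `construct_mission` functions below are hypothetical stand-ins, not part of JavaWrap.

```python
import functools

events = []  # hypothetical stand-in for the event stream fed by wrappers


def tracer(fn):
    """Tracer: record entry and exit information; never alters the call."""
    @functools.wraps(fn)
    def wrapped(*args):
        events.append(("trace-entry", fn.__name__, args))
        result = fn(*args)
        events.append(("trace-exit", fn.__name__, result))
        return result
    return wrapped


def monitor(before, after):
    """Monitor: gets control before and after the real method."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapped(*args):
            before(fn.__name__, args)
            result = fn(*args)
            after(fn.__name__, args, result)
            return result
        return wrapped
    return decorate


def transformer(mediator):
    """Transformer: the mediator decides whether the real method runs,
    with which arguments, and what value is returned."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapped(*args):
            return mediator(fn, args)
        return wrapped
    return decorate


def dispatch_load_image(real_method, args):
    # Assumed policy, for illustration only.
    (path,) = args
    if path.endswith(".untrusted"):
        return "placeholder-image"      # real method is not invoked
    return real_method(path.lower())    # invoked with a rewritten argument


@transformer(dispatch_load_image)
def load_image(path):
    return "image:" + path


@monitor(before=lambda name, args: events.append(("entry", name, args)),
         after=lambda name, args, result: events.append(("exit", name, result)))
def construct_mission(name, urgent):
    return (name, urgent)
```

The transformer is the only wrapper type that can change behavior, which is why it is the one suited to dynamic dispatch.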
8SafeFamily Wrappers
- SafeFamily wrappers watch all application accesses to OS resources (files, registry keys, IPs, and so on)
- SafeFamily wrappers can detect, prevent or contain accesses to critical resources
9Role in AWDRAT
- Attempts to access, or modify, critical resources indicate the system has been corrupted or hijacked, and should be reconstituted.
- Attempts to open communications ports at inappropriate times indicate the system has been hijacked
- Execution in unprotected memory indicates memory hijack
- Rule violations are forwarded to the Diagnostic Controller for further action
10The Execution Monitor Generator
- Originally we hand coded
- The JavaWrap specification
- The Wrapper Methods
- Architectural Differencer
- This was tedious, time consuming and error prone
- Solution: Monitor Generator
Wrapper Spec
System Model
Wrappers
Execution Monitor
11- Each component can be annotated with
- Entry Events
- Exit Events
- Allowable Events
Control Flow
Data Flow
12Example Model
(define-ensemble maf-create-events
  entry-events ((create-action-performed entry (the-model)))
  exit-events ((mission-builder-submit exit (the-model)))
  allowable-events ((set-initial-info exit))
  inputs (the-model)
  outputs (the-model)
  components ((get-next-cmd type maf-get-next-cmd models (normal))
              (get-event-info type maf-get-event-info models (normal compromised))
              ... )
  dataflows ((the-model maf-create-events the-model join-exit-exit )
  controlflows ((after more-events?-exit before join-exit-exit)
                (after takeoff?-get-additional-info before get-leg)
                )
  splits ((more-events? maf-more-events? (cmd) (build-event exit)) )
  joins ((join-events (the-model) (take-off non-take-off)) )
  resources ((code-files loadable-files (normal .8) (hacked .2)))
  resource-mappings ((get-event-info code-files) )
  vulnerabilities ((code-files loads-code)))
13Generating Plumbing
- For every event in the model
- Generate an entry in the JavaWrap Specification
- Generate a Java wrapping method that creates an entry event, calls the main method, then generates an exit event
- Certain wrapping clichés (startup method, imposed dispatch, specified by single keyword)
- For every data flow generate a trigger to move the data (similar for control flow)
- For every component generate a forward chaining rule that triggers when all inputs are present and checks the prerequisite conditions
- For every component generate a forward chaining rule that triggers on completion and asserts the post-conditions.
14AWDRAT
- Architectural monitoring
- Wrapper synthesis
- Diagnosis
- Recovery and regeneration
- Adaptive method selection
- Trust modeling
Recovery and Regeneration
Adaptive Method Selection
Trust Model Behavior Compromises Attacks
Compromise Descriptions
Failure Localization
Diagnosis
Execution Discrepancies
Architecture level Execution Monitor
Event Stream
System Model
15AWDRAT Monitoring Implementation
[Diagram labels: Execution Monitor; Java Mediators; Lisp Mediators; Original Java Program; Application Tracking; Methods surrounded by Wrappers; Event Stream; Method Selection; Application Scripting; Data Model; Duplicated Data Model; Integrity Checks; Reconstitution]
16AWDRAT
Adaptive (Decision Theoretic) Method Selection
Recovery and Regeneration
Trust Model Behavior Compromises Attacks
Diagnosis
Architectural Differencer
Wrapper Synthesizer
System Models
17Architectural Differencing
[Diagram labels: Real Environment (Implementation) with Real Component, in, out, Real Output; Reflection; Translator; Differencer producing List of Conflicts; Simulated Environment (Model) with Simulated Component, in', out', Simulated Output]
18Architectural Differencing
- The Architectural model is part of a method's description.
- The Architectural model is interpreted in parallel with method execution
- Wrappers send events to the Arch Diff coordinator
- Coordinator checks that the state of the system at event time is consistent with predictions
- In particular it checks the prerequisite and post-conditions of each sub-module
- It also checks that the event received is allowable
- The failure of a condition check initiates diagnostic reasoning
19The Execution Monitor
- Hierarchical Task Network
- Pre, Post conditions
- Entry, Exit, Allowable Events
- Data Flow, Control Flow, Splits and Joins
- AWDRAT Generates
- Plumbing to pick up the events and create event stream
- State Machine corresponding to task network, receives event stream, checks for validity
- Module States
- Inactive (data not available, preconditions not satisfied)
- Ready (data available)
- Running (initiating event seen)
- Completed (terminating event seen)
- An unclaimed event initiates diagnosis
20Monitor Algorithm
- Initially all modules are inactive
- When the system starts up it creates a startup event for the top level module
- Top level module is put into its running state
- When a module enters running state
- Instantiate its sub-network, propagate input data along data flows and control along control flow links.
- When data arrives at an input port
- Check if all data is available
- If so enter ready state
- Check preconditions, signal if check fails
- When an event arrives
- Check if this is the initiating event of a module in ready state; if so change state to running, capture input data in event and apply to input ports
- Check if it's a terminating event of a running module; if so change state to completed, capture output data in event and apply to output ports, check post-conditions, signal if check fails
- Check if it's an allowable event of a module in running state; if so capture data in event and apply to output ports
- Otherwise signal an unclaimed event error
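The per-module state machine above can be sketched as follows. This is a simplified illustration, not the AWDRAT implementation: it omits the hierarchical sub-network instantiation and data-flow propagation, and the class names, the `problems` list, and the example module are assumptions made for the sketch.

```python
from enum import Enum


class State(Enum):
    INACTIVE = "inactive"
    READY = "ready"
    RUNNING = "running"
    COMPLETED = "completed"


class Module:
    def __init__(self, name, inputs, entry_event, exit_event, allowable=(),
                 precondition=lambda data: True,
                 postcondition=lambda data: True):
        self.name = name
        self.inputs = list(inputs)
        self.entry_event = entry_event
        self.exit_event = exit_event
        self.allowable = set(allowable)
        self.precondition = precondition
        self.postcondition = postcondition
        self.state = State.INACTIVE
        self.data = {}


class Monitor:
    def __init__(self, modules):
        self.modules = list(modules)
        self.problems = []  # symptoms that would be handed to diagnosis

    def data_arrives(self, module, port, value):
        module.data[port] = value
        all_present = all(p in module.data for p in module.inputs)
        if module.state is State.INACTIVE and all_present:
            module.state = State.READY
            if not module.precondition(module.data):
                self.problems.append(("precondition-failed", module.name))

    def event_arrives(self, event, payload=None):
        payload = payload or {}
        for m in self.modules:
            if m.state is State.READY and event == m.entry_event:
                m.state = State.RUNNING
                m.data.update(payload)
                return
            if m.state is State.RUNNING and event == m.exit_event:
                m.state = State.COMPLETED
                m.data.update(payload)
                if not m.postcondition(m.data):
                    self.problems.append(("postcondition-failed", m.name))
                return
            if m.state is State.RUNNING and event in m.allowable:
                m.data.update(payload)
                return
        # No module claimed the event.
        self.problems.append(("unclaimed-event", event))
```

An entry event that arrives before a module is ready is reported as unclaimed, which matches the rule that only modules in the ready state may start running.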
21Behavior Models
- Each mode (good, bad, ...) of each component has a behavior model
- Preconditions, Post-conditions
- Generally about data-structure integrity
- At a very abstract level data-structures are about sets, sequences, mappings
- Introduced simple data modeling language
- Add-to-set, Add-to-mapping, Insert-in-sequence
- Delete-from-set, Delete-from-mapping, Delete-from-sequence
- Default implementation for each
- Predicate to force consistency check
22Example Behavior Model
(defbehavior-model (maf-add-event-to-model normal)
  inputs (the-event the-model)
  outputs (the-model event-number)
  prerequisites (dscs ?the-event event good
                 dscs ?the-model mission-builder good)
  post-conditions (dscs ?the-model mission-builder good
                   add-to-map (events ?the-model) ?event-number ?the-event
                   ?before-maf-add-event-to-model))
23AWDRAT
Adaptive (Decision Theoretic) Method Selection
Recovery and Regeneration
Trust Model Behavior Compromises Attacks
Diagnosis
Architectural Differencer
Wrapper Synthesizer
System Models
24Dependency Maintenance
- Architectural Differencing actively checks those prerequisites, post-conditions and other constraints in the plan that are easily observable.
- The Diagnostic executive builds a dependency graph between checked and inferred conditions
- Post-conditions and events within a step are justified with a link to the assumption that the step executed normally and to the prerequisite conditions.
- Preconditions are justified by the causal link in the plan that connects them to a set of post-conditions of prior steps
- If a check succeeds, that condition is justified as a premise
- If a check fails, diagnosis is initiated.
25Diagnosis with Fault Models
- In addition to modeling the normal behavior of each component, we provide models of known abnormal behaviors.
- A Leak Model covers unknown failures.
- These alternative behavioral models are called computational modes.
- The diagnostic task is to find an assignment of a mode to each computational step such that the behavior predicted by the models associated with those modes is consistent with the observations.
- A set of assignments consistent with observations is a diagnosis; there may be several diagnoses.
- A set of assignments at variance with observations is a conflict.
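The mode-assignment search can be sketched by brute-force enumeration. The step names, mode names and delay ranges below are invented for illustration, the only observation is a total delay, and conflicts are reported as complete assignments rather than minimal sets, which is a simplification.

```python
from itertools import product

# Hypothetical mode models: each mode predicts a (min, max) delay for its step.
MODES = {
    "step1": {"normal": (2, 4), "slow": (5, 30)},
    "step2": {"normal": (1, 3), "slow": (4, 30)},
}


def consistent(assignment, observed_total_delay):
    """True if the observed total delay lies inside the range predicted
    by the chosen mode of every step."""
    low = sum(MODES[step][mode][0] for step, mode in assignment.items())
    high = sum(MODES[step][mode][1] for step, mode in assignment.items())
    return low <= observed_total_delay <= high


def diagnose(observed_total_delay):
    """Enumerate every assignment of one mode per step and split them into
    diagnoses (consistent) and conflicts (inconsistent)."""
    steps = sorted(MODES)
    diagnoses, conflicts = [], []
    for modes in product(*(MODES[step] for step in steps)):
        assignment = dict(zip(steps, modes))
        if consistent(assignment, observed_total_delay):
            diagnoses.append(assignment)
        else:
            conflicts.append(assignment)
    return diagnoses, conflicts
```

With an observed total delay of 9, the all-normal assignment is a conflict and the three assignments with at least one slow step are all diagnoses, which shows why a probabilistic ranking of diagnoses is useful.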
26Modeling Underlying Resources
- The misbehavior of the software component may actually be due to a compromise of resources used in that computation.
- We extend the modeling framework to show the dependence of computations on the resources
- Each resource has models of its state of compromise (i.e. its modes)
- The modes of the resources are linked to the modes of the computation by conditional probabilities
- E.g. if a computation resides on a node which hosts a parasitic process, then the computation is likely to be slowed down.
[Diagram: Component 1 uses resource Image-1; each has models. Labels: Normal (probability 90), Hacked (probability 10); Normal, Highjacked; conditional probabilities .2 and .3]
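How an observation about a computation shifts belief about a resource can be sketched with Bayes' rule. The prior below reuses the 90/10 split from the slide; the likelihoods of observing a slowed computation are assumed values chosen only for illustration.

```python
def posterior(prior, likelihood):
    """Bayes' rule over resource modes.

    prior[mode]      = P(resource is in mode)
    likelihood[mode] = P(observed behavior | resource is in mode)
    """
    joint = {mode: prior[mode] * likelihood[mode] for mode in prior}
    total = sum(joint.values())
    return {mode: p / total for mode, p in joint.items()}


# Prior from the slide; likelihoods are hypothetical.
prior = {"normal": 0.9, "hacked": 0.1}
p_slow_given_mode = {"normal": 0.05, "hacked": 0.6}

updated_trust = posterior(prior, p_slow_given_mode)
```

Under these assumed numbers, seeing the computation slowed down raises the probability that the resource is hacked from 0.1 to roughly 0.57.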
27Adding Attack Models
- An Attack Model specifies the set of attacks that are believed to be possible in the environment
- Each resource has a set of vulnerabilities
- Vulnerabilities enable attacks on that resource
- Computational Vulnerability Analysis of the actual configuration can determine the possible attack model
- Given a vulnerability and an attack that can exploit the vulnerability it is possible that the attack compromised the resource with the vulnerability
- This is a conditional probability
28Bayesian Dependency Diagram
[Diagram labels, in extraction order: Host1 Normal Mode; P = .8; Step1; Step1 Normal Mode; Preconditions Step2; Logical and probability table; Checked, Pinned at P = 1; Post-Condition1; Preconditions Step1; Logical or probability table; Preconditions Step3; Post-Condition2; Logical and probability table; Step1 Abnormal Mode1; Bogus Condition; P = .9; Checked, Pinned at P = 1; Host1 HighJacked; Bad Image File Attack; P = .7]
29Summary of Diagnosis
- The result of Diagnosis is the construction of a Bayesian network coupling attacks, resource vulnerabilities, compromised states of the resources and finally the observed behavior of a computation.
- This network assigns posterior probabilities to
- Assertions modeling the state of the computation
- These assertions are the prerequisite and post-conditions of the various computational steps in the plan diagram
- Compromised modes of the resources used by the computation
- The recovery task is to find a new plan and a new set of resources that is most likely to achieve the main goal of the plan, given this updated probabilistic information about the world.
30AWDRAT
Adaptive (Decision Theoretic) Method Selection
Recovery and Regeneration
Trust Model Behavior Compromises Attacks
Diagnosis
Architectural Differencer
Wrapper Synthesizer
System Models
31The Trust Model
- The Trust Model includes, for each resource, the probability that it is in a compromised state.
- Diagnosis updates the Trust Model
- Trust Model is read in upon system startup
- Trust Model guides method selection
32AWDRAT
Recovery and Regeneration
Adaptive Software
Trust Model Behavior Compromises Attacks
Diagnosis
Architectural Differencer
Wrapper Synthesizer
System Models
33A system adapts by having many methods for each
service
Net benefit
The system selects the method which maximizes net
benefit
34Method Selection
- Given a service name and a utility function, method selection uses a Prolog-like query language to
- Find relevant methods
- Find resources meeting that method's constraints
- Bind the service qualities
- For each successful query it
- Calculates the resource cost
- Calculates the utility of the service parameters
- Calculates net benefit
- Selects the method that maximizes net benefit
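The selection step can be sketched as follows, assuming the query phase has already produced a list of candidate bindings. The `Candidate` class, the method names and all numbers are hypothetical; the Prolog-like query language itself is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One successful query result: a method bound to specific resources."""
    method: str
    resources: tuple
    utility: float        # utility of the bound service qualities
    resource_cost: float  # cost of the resources the method would use

    @property
    def net_benefit(self):
        return self.utility - self.resource_cost


def select_method(candidates):
    """Pick the candidate with the largest net benefit."""
    return max(candidates, key=lambda c: c.net_benefit)


# Hypothetical candidates for a load-image service.
candidates = [
    Candidate("native-load-image", ("native-lib",), utility=10.0, resource_cost=3.0),
    Candidate("java-load-image", ("jvm-classes",), utility=8.0, resource_cost=2.0),
]
chosen = select_method(candidates)
```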
35Extension to Hierarchical Planning
- Utility function returns max and min utility, by iterating over all combinations of unbound parameters
- A method can require sub-services
- Sub-service parameters are unified with top-level parameters
- Depth first search
- Resources (and cost) accumulate
- More parameters bound -> Max utility decreases
- Net utility decreases
- Branch-and-bound: if current utility < max so far, then backtrack
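The branch-and-bound idea can be sketched with a simplified model in which a plan picks one method per sub-service and each method contributes a fixed utility and cost. This additive model, the option names and the numbers are assumptions; parameter unification is not modeled.

```python
def best_plan(options_per_service):
    """Depth-first branch-and-bound over one choice per sub-service.

    options_per_service: list of lists of (name, utility, cost) tuples.
    Returns (list of chosen names, net benefit).
    """
    n = len(options_per_service)
    # Optimistic utility still obtainable from services i..n-1 (ignores cost).
    max_rest = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        max_rest[i] = max_rest[i + 1] + max(u for _, u, _ in options_per_service[i])

    best = {"net": float("-inf"), "plan": None}

    def search(i, chosen, utility, cost):
        # Bound: even the optimistic estimate cannot beat the best plan so far.
        if utility + max_rest[i] - cost <= best["net"]:
            return
        if i == n:
            best["net"], best["plan"] = utility - cost, list(chosen)
            return
        for name, u, c in options_per_service[i]:
            chosen.append(name)
            search(i + 1, chosen, utility + u, cost + c)
            chosen.pop()

    search(0, [], 0.0, 0.0)
    return best["plan"], best["net"]


# Hypothetical two-level service with two alternatives each.
services = [
    [("native-load", 10, 2), ("java-load", 8, 1)],
    [("fast-save", 5, 4), ("safe-save", 4, 1)],
]
plan, net = best_plan(services)
```

As cost accumulates along a branch, the optimistic bound can only shrink, so branches that fall to or below the incumbent are abandoned without being completed.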
36Decision Making with Compromises
- M is a method; the vector R is an assignment of specific resources to M
- Each resource Ri in R can be in one of several specific modes
- A resource state RS is an assignment of a specific mode Ri,j to each resource Ri in R
- The Trust Model (via diagnosis) assigns a probability to each resource state
- Given a method M and a resource state RS we can calculate the vector of service qualities SQ(M, RS) that will be delivered.
- The Utility function U assigns a numeric value to a vector of service qualities SQ consistent with the requestor's preferences
37Successful Execution
- Each method M has a set of preconditions Prei(M) that are necessary for M to execute successfully.
- Some preconditions may not be directly observable, particularly at diagnosis and recovery time. Instead diagnosis assigns a probability to each of these and to their conjunction P(∧i Prei(M)).
- The expected benefit EBsuccess of successful execution is the expected benefit conditioned by this probability
38Failing Execution
- If the preconditions of M don't hold, then the method will fail.
- The failure can be assigned a cost FailCost(M)
- This is ideally calculated by using a simulation model of the organization (necessary for insider threat).
- But it can be provided by table lookup
- The expected cost of failure is this cost weighted by the probability that the method will fail due to the preconditions not being satisfied.
39Expected Net Cost Benefit
- The total expected benefit is the difference between the expected benefit of success and the expected cost of failure.
- TotalEB(M, Ri) = EBsuccess(M, Ri) - ECfail(M)
- Each vector of resources Ri has a cost RC(Ri)
- The Expected Cost Benefit is the difference between Total Expected Benefit and the cost of the resources
- ECB(M, Ri) = TotalEB(M, Ri) - RC(Ri)
40Optimal Resource Assignment
- The system should select that method and set of
resources that maximize the Expected Cost Benefit
difference
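The quantities defined on the preceding slides can be combined in a short sketch. The slide formulas for EBsuccess and ECfail did not survive extraction, so the forms used here are assumptions read off the prose: EBsuccess is taken as P(preconditions) times the utility expected over resource states, and ECfail as (1 - P(preconditions)) times FailCost. All numbers and method names are hypothetical.

```python
def expected_cost_benefit(p_pre, state_probs, utility_of_state,
                          fail_cost, resource_cost):
    """ECB(M, R) = TotalEB(M, R) - RC(R), with
    TotalEB = EBsuccess - ECfail (assumed forms, see lead-in)."""
    expected_utility = sum(p * utility_of_state[state]
                           for state, p in state_probs.items())
    eb_success = p_pre * expected_utility
    ec_fail = (1.0 - p_pre) * fail_cost
    total_eb = eb_success - ec_fail
    return total_eb - resource_cost


# Hypothetical situation: the image file is suspect after diagnosis.
ecb = {
    "native": expected_cost_benefit(
        p_pre=0.5,
        state_probs={"normal": 0.5, "hacked": 0.5},
        utility_of_state={"normal": 10.0, "hacked": 2.0},
        fail_cost=4.0,
        resource_cost=0.5),
    "java": expected_cost_benefit(
        p_pre=0.75,
        state_probs={"normal": 1.0},
        utility_of_state={"normal": 8.0},
        fail_cost=4.0,
        resource_cost=1.0),
}
best_method = max(ecb, key=ecb.get)
```

With these assumed values the lower-utility but more trustworthy method wins, which is the qualitative behavior the deck describes when the image is suspect.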
41AWDRAT
Adaptive (Decision Theoretic) Method Selection
Recovery and Regeneration
Trust Model Behavior Compromises Attacks
Diagnosis
Architectural Differencer
Wrapper Synthesizer
System Models
42The Recovery Process
- Recovery is driven by the Trust Assessments developed during diagnosis
- World State
- Can the prerequisite conditions of a method be assumed to hold and with what probability
- Compromise state of resources
- Which resources are compromised
- In what way are they compromised
- Three core problems
- What resources to regenerate
- Where to restart
- How to continue after restart
- Regenerate if the delta in Expected Benefit is greater than the cost of regeneration
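The regeneration rule in the last bullet amounts to a simple comparison per resource. A minimal sketch, with hypothetical resource names and numbers:

```python
def resources_to_regenerate(resources):
    """resources maps a name to (expected benefit now,
    expected benefit after regeneration, cost of regeneration).
    Regenerate when the gain in expected benefit exceeds the cost."""
    return [name
            for name, (eb_now, eb_after, regen_cost) in resources.items()
            if eb_after - eb_now > regen_cost]


# Hypothetical assessments after diagnosis.
assessments = {
    "image-file": (1.0, 6.0, 2.0),   # gain 5 > cost 2: regenerate
    "class-files": (5.0, 6.0, 3.0),  # gain 1 < cost 3: leave as is
}
to_regenerate = resources_to_regenerate(assessments)
```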
43MAF-CAF Current Recovery
- During execution we've captured the execution history and the intended state of the data structures
- We've also updated the trust model based on diagnosis of the last failure
- Recovery can then be accomplished by restarting, replaying the startup history (login, queries, ...)
- Recovery continues by setting up the data structures to the intended state
- Method selection is driven by updated trust estimates
- Including choice of code resources, class path, etc.
44Initial Situation
All Images Trusted Native Method Selected
Create Initial Mission Data
Create Leg
Add Leg to Plan
Publish Plan
Save Plan
Execution Proceeds This Far, No Discrepancies
45The Takeover
Create Initial Mission Data
Create Leg
Add Leg to Plan
Publish Plan
Save Plan
Execution Proceeds This Far, Data Structures
Corrupted
46The Recovery
AWDRAT Controller
Update Trust Model
Dump out good version of mission plan
47Regeneration Through Adaptation
Complete Operation and Continue past the attack
Regenerate Mission-Plan
- Avoid Unsafe Methods
- Avoid Corrupted Code Resources by using backup
copies
Update Trust Model
Suspect Image, Java Method Wins
JAVA Method
Load-Image Service
Trusted Image, Native Method Wins
Native Method
48Experimentation
49AWDRAT Goals in Protecting MAF
- The data structures maintained by the application should accurately reflect user requests
- Each field of each data structure should be what the GUI requested
- No other fields should be present
- All the fields requested by the GUI should be present
- The application should not be able to open a port to any application other than to the JBI servers with which it must communicate (i.e. the JBOSS server).
- The application should not be able to open a port to a foreign host
- The application should not be able to write files except to the specified directories that constitute its temporary storage areas.
50Blue View of Red Team Exercise
- Red-Team provided valuable system-wide analysis
- Red and Blue did not spend nearly enough time reaching common understanding
- Focus of our research
- Difficulty of testing client application
- Difficulty of testing Java application (few standard exploits)
- Blue didn't stage test system far enough in advance
- Configuration errors, system crash 2 days before test
- Several things that should have worked failed due to simple bugs
- Red-Team was not prepared to test what mattered to us
- On the fly, collaborative, ad hoc inserting of bad code
- Definition of success wasn't always clear
- Red-Team provided valuable long term suggestions
51Experimental Methodology
- Modification of class files.
- Hand insert attack code into a Java source file, compile and place the class file into the appropriate directory on the Java Classpath.
- Random attacks.
- The AWDRAT controller launches a parasitic thread on startup. This thread waits for a random delay and then attempts to violate one of the constraints. Which type of violation (and in the case of data integrity which field of which data structure) is chosen randomly.
- Wrapped methods.
- A wrapped method is used as a place from which to launch a simulated attack; for example, by wrapping the loadImage method, one can simulate an attack payload that is carried by a corrupted image file
52MAF API Levels
MAF API
Lower API
53Experimental Results
Diagnosis includes identification of where the failure occurred, the nature of the failure (bad event, corrupted data) and whether code files or data files are the likely candidates.
54Modeling Experiment 1
- Re-implemented AWDRAT framework after last PI meeting
- At least 1 intent was to simplify building of model
- Used knowledge gained in prior experience to inform model development
- Modeled more of MAF system at much finer grain
- Included code files as resources
- About 1 week of effort intermixed with debugging underlying infrastructure
- 448 lines, could be shorter
- 39 events
- 23 hooked events
55Modeling Experiment 2
- Took a modest sized sub-system of Intelligent Room software
- Distributed Java-based system
- Server for device switchboard
- Decent documentation available
- Sketched coarse-grained architectural model
- Coded up first cut model
- Total Time
- 1 summer intern week
- 2 master student weeks
- Both starting from dead start on AWDRAT and MAF
56What improvements are possible
- Coordination between SafeFamily and Execution Monitor
- SafeFamily Wrappers are but not context sensitive
- It's allowed to write to directory xyz
- But only at specific points of the program
- SafeFamily should block, tell Execution Monitor which can either initiate diagnosis or return and allow the action
- Develop more primitives to simplify behavior modeling
- Use in both server and client of larger application
- Application to less safe implementation language
- Hardening AWDRAT infrastructure
- Execution monitor in separate process
- Safe and reliable communication between the two
- Use redundant storage for backup data
57Finer Grained Recovery
- Currently we always recover by restarting
- Could recover by dynamic reloading of code (class) files in running application
- Modularize code resources into loadable chunks
- Perhaps JAR file level
- Rebuild data structures after code reload
- Need to know whether to switch code-file resources
- Fingerprint code files
- Need finer grained modeling of dependencies between execution and code resources
58Use of Wrappers for Exploratory Modeling
- Building an understanding of the application is the first step of modeling
- Use wrappers to gather data for machine (and/or human) learning.
- Main approach is observational, explanation-based generalization
- Method wrappers collect control and data flow
- Multiple runs inform generalization
- DLL wrappers (SafeFamily) detect dangerous actions
- Method wrappers (JavaWrap) provide the context
- Statistical techniques characterize non-functional properties
59Significant Improvements
60Broadening Detection and Diagnosis
61Planning and Counter Planning
- AWDRAT currently reasons backward from misbehavior to compromise, to attack
- Use planning technology to generate attack plans
- We have already done some of this
- Planning as if we were the Red Team
- Non-linear, temporally extended plans
- Plan recognition technology generates recognizer that projects forward from tell-tales of attack to probability of compromise and misbehavior
- Game-theoretic counter planning
- Adaptive method selection
- Limited horizon expected value mini-max
- Insertion of counter measures (e.g. taste-tester, honey-pots, increased run-time variability, contained execution)
62Controlling components (2)
- Another way to control a component is to find an input to the component and then find a way to modify the input
- Modify the scheduler policy parameters
Scheduler
Scheduler
Input to
control by
Scheduler Policy Parameters
63Counter Planning to Attack
64Use of Wrappers as Sensors
- Current sensors are imprecise and uninformative
- Network based sensors don't observe effect on host
- Host based sensors are typically profile based
- Pervasive use of wrappers in core host software
- Using such wrappers we can follow development of an attack through the host(s) processes
- Models (even coarse grained) can identify when OS/DLLs are taken off-course
- Need to reason about where to place them
- Normal program flow
- Attacker plan flow
- Cost-benefit
65Machine Learning
- Approach is observational, explanation-based generalization
- Single-shot, and few example learning
- Relies on extensive knowledge base and reasoning capabilities
- Observing program traces leads to AWDRAT system model
- Planning and counter-planning leads to attack plan recognition
- Sensitivity to attacker delays and lack of delays
66AWDRAT
- Self Monitoring is important and feasible
- Even coarse grain monitoring provides information
- Diagnosis can provide useful Trust Information
- Even with coarse grained modeling
- Trust Information can inform Adaptive Choice
- Adaptive Choice can avoid compromised resources
- Wrappers can provide visibility and control for legacy applications
- Without (extensive) rewriting in our experience.