Title: Carnegie Mellon University DYNAMiX Technologies
1 Carnegie Mellon University, DYNAMiX Technologies
RAPID: Representation and Analysis
of Probabilistic Intelligence Data
- July 19, 2007
- Kick-off Meeting
2 People
- Carnegie Mellon University: Prof. Jaime Carbonell, Dr. Eugene Fink, Dr. Chun Jin, two graduate students
- DYNAMiX Technologies: Dr. Ganesh Mani, Mr. Dwight Dietrich, development team
3 Motivation
- We are developing tools for the analysis of
dynamically evolving intelligence, which may
include uncertain and partially missing data.
- These tools will help analysts to draw
conclusions based on available intelligence,
identify critical uncertainties, and develop
strategies for proactive collection of additional
intelligence.
4 Puzzle-solving analogy
Which missing parts are most helpful and when?
[Figure: puzzle diagram; labels: Available knowledge, Observable facts, Hidden facts, Initial knowledge]
- Knowledge sources
- Public domain
- Intelligence data collection
- Inferences and conclusions
5 Example
- Assessment of the potential WMD capabilities of
hostile countries.
- Identify a potential threat based on available
intelligence: The nation of Akbarstan may be
developing nuclear weapons.
- Formulate the related specific hypotheses:
Akbarstan may be secretly acquiring fissionable
materials and building an underground nuclear
facility to the north of its capital.
- If the available intelligence is insufficient for
validating or refuting these hypotheses, collect
additional intelligence: Use UAVs to track the
deliveries to the suspected nuclear facility.
- Re-evaluate the threat based on new
intelligence: Akbarstan may be developing
chemical rather than nuclear weapons.
7 Innovative claims
- Suite of intelligent tools for identification of
hidden patterns in uncertain intelligence data
- Automated analysis of critical uncertainties and
development of intelligence-collection plans
- Collaboration between human analysts and
automated data-processing engines
8 Previous work: ARGUS
- ARGUS project sponsored by DTO/ARDA:
identification and tracking of novel patterns in
massive databases and data streams.
[Figure: ARGUS architecture diagram; labels: Data, Model, Events, Create, Detect, Historical, Background, Novel, Re-cluster, Clusters, Generate, Update, Match, Profiles, Tracked, New, Alerts, Analyst]
9 ARGUS novelty detection
- Estimate the density function at t0
- Grow the cluster for a period of Δt while
reducing the weight of old records
- Estimate the new density function at t0 + Δt
- Compare the two estimates
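The four steps above can be sketched as follows. This is only an illustration under stated assumptions, not the ARGUS implementation: it assumes one-dimensional records, a weighted Gaussian kernel density estimate, and a single decay factor applied to all old records. The names `weighted_kde`, `density_increase`, `decay`, and `bandwidth`, as well as the sample data, are hypothetical.

```python
import math

def weighted_kde(points, weights, x, bandwidth=1.0):
    """Weighted Gaussian kernel density estimate at position x."""
    norm = sum(weights) * bandwidth * math.sqrt(2 * math.pi)
    return sum(
        w * math.exp(-0.5 * ((x - p) / bandwidth) ** 2)
        for p, w in zip(points, weights)
    ) / norm

def density_increase(old, new, grid, decay=0.5, bandwidth=1.0):
    """Compare the density at t0 with the density at t0 + delta-t.

    old:   records available at t0 (weight 1 at t0, weight `decay` afterwards)
    new:   records that arrived during delta-t (weight 1)
    Returns (largest density increase, grid position where it occurs).
    """
    before_w = [1.0] * len(old)
    after_pts = old + new
    after_w = [decay] * len(old) + [1.0] * len(new)
    increases = [
        (weighted_kde(after_pts, after_w, x, bandwidth)
         - weighted_kde(old, before_w, x, bandwidth), x)
        for x in grid
    ]
    return max(increases)

# Hypothetical data: old records cluster near 0, new records near 5.
old_records = [0.0, 0.2, -0.1, 0.1, -0.3, 0.3]
new_records = [5.0, 5.1, 4.9, 5.2]
grid = [i * 0.5 for i in range(-4, 15)]  # -2.0, -1.5, ..., 7.0

best_increase, best_x = density_increase(old_records, new_records, grid)
print("largest density increase", round(best_increase, 3), "at x =", best_x)
```

A large increase in a region that was nearly empty at t0 is a candidate novel cluster; what counts as "large" is left to the analyst in this sketch.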
10 ARGUS novelty detection
[Figure: re-clustering illustration; labels: Respiratory Diseases, SARS, Re-clustering, t0, Δt]
11 Previous work: RADAR
- RADAR project sponsored by DARPA: analysis and
management of volatile crisis situations based on
uncertain data.
[Figure: RADAR architecture diagram; labels: Top-level control and learning, Process new data, Analyst]
12 Previous work: RADAR
We have applied the system to repairing a conference
schedule after a crisis that caused a loss of rooms.
13 Proposed RAPID functionality
- Representation of uncertainty
- Inferences from uncertain data
- Analysis of critical uncertainties
- Predictive Markov models
- Graphical user interface
- Unlike ARGUS:
  - Represents and analyzes uncertainty
  - Supports complex inferences
  - Analyzes possible adversarial actions
- Unlike RADAR:
  - Scales to massive intelligence datasets
  - Analyzes complex external situations
  - Develops intelligence-collection plans
14 Proposed functionality
[Figure: RAPID architecture diagram; labels: Learning of new knowledge, Knowledge editing, Knowledge base, Fast matching and retrieval, Inference rules, Markov models, Uncertain situation assessment, Real-time responses, New intelligence, Massive databases, Analyst, Intelligent tools for data analysis, Contingency analysis, Adversarial search, Explanation of inferences, Identification of critical uncertainties]
15 Representation of uncertain intelligence data
- Uncertain nominals, numbers, strings, spatial
data, graph topologies, and functions
- Indexing of massive uncertain data, and fast
retrieval of exact and approximate matches
[Figure; label: Possible values]
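To make the idea of uncertain values and approximate matches concrete, here is a minimal sketch. It is not the RAPID representation: it assumes that an uncertain nominal is a probability distribution over possible values, that an uncertain number is uniform over an interval, and that fields are independent. It also uses a linear scan rather than an index. The classes `UncertainNominal` and `UncertainNumber`, the function `retrieve`, and the sample records are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class UncertainNominal:
    """A nominal value known only as a distribution over possible values."""
    probs: dict  # possible value -> probability

    def match(self, value):
        """Probability that the true value equals `value`."""
        return self.probs.get(value, 0.0)

@dataclass
class UncertainNumber:
    """A number assumed to be uniformly distributed over [low, high]."""
    low: float
    high: float

    def match(self, lo, hi):
        """Probability that the true value lies in the query range [lo, hi]."""
        if self.high == self.low:
            return 1.0 if lo <= self.low <= hi else 0.0
        overlap = min(self.high, hi) - max(self.low, lo)
        return max(0.0, overlap) / (self.high - self.low)

def retrieve(records, query, threshold):
    """Linear-scan retrieval of records whose match probability >= threshold.

    query maps a field name to the arguments of that field's match() method.
    Fields are treated as independent (an assumption of this sketch).
    A threshold of 1.0 gives exact matches; lower values give approximate ones.
    """
    hits = []
    for rec_id, fields in records.items():
        p = 1.0
        for name, args in query.items():
            p *= fields[name].match(*args)
        if p >= threshold:
            hits.append((rec_id, p))
    return sorted(hits, key=lambda hit: -hit[1])

# Hypothetical records about two sites.
records = {
    "site-A": {"material": UncertainNominal({"uranium": 0.7, "chlorine": 0.3}),
               "depth_m": UncertainNumber(20, 60)},
    "site-B": {"material": UncertainNominal({"chlorine": 1.0}),
               "depth_m": UncertainNumber(0, 10)},
}

approx = retrieve(records, {"material": ("uranium",), "depth_m": (30, 100)}, 0.5)
exact = retrieve(records, {"material": ("chlorine",)}, 1.0)
print(approx)
print(exact)
```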
16 Inferences from uncertain intelligence data
- Representation of dependencies among data by
inference rules
- Fast propagation of inferences through
large-scale networks of dependencies
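One simple way to picture propagation through a network of dependencies is a worklist that re-evaluates only the rules affected by a changed belief. The sketch below is an illustration, not the proposed RAPID mechanism. It assumes each rule has a strength between 0 and 1, that premises are independent, and that a conclusion keeps the highest probability derived by any rule. The function `propagate`, the proposition names, and all numbers are hypothetical.

```python
from collections import deque

def propagate(facts, rules):
    """Propagate probabilities from facts through inference rules.

    facts: dict mapping a proposition to its probability
    rules: list of (premises, conclusion, strength) tuples
    Returns a dict of beliefs for all facts and derived conclusions.
    """
    beliefs = dict(facts)
    # Index rules by premise so a change triggers only the dependent rules.
    by_premise = {}
    for rule in rules:
        for premise in rule[0]:
            by_premise.setdefault(premise, []).append(rule)

    queue = deque(beliefs)
    while queue:
        changed = queue.popleft()
        for premises, conclusion, strength in by_premise.get(changed, []):
            if not all(p in beliefs for p in premises):
                continue  # some premise has no belief yet
            prob = strength
            for premise in premises:
                prob *= beliefs[premise]
            # Update only on a real increase, so propagation terminates.
            if prob > beliefs.get(conclusion, 0.0) + 1e-12:
                beliefs[conclusion] = prob
                queue.append(conclusion)
    return beliefs

# Hypothetical facts and rules in the spirit of the Akbarstan example.
facts = {"fissile_purchases": 0.8, "tunnel_construction": 0.6}
rules = [
    (("fissile_purchases",), "acquiring_material", 0.9),
    (("tunnel_construction",), "underground_facility", 0.7),
    (("acquiring_material", "underground_facility"), "nuclear_program", 0.8),
]

beliefs = propagate(facts, rules)
for name, prob in beliefs.items():
    print(name, round(prob, 4))
```

Indexing rules by premise keeps each update local, which is the property that matters when the dependency network is large.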
17 Proactive collection of intelligence data
- Automated identification of critical
uncertainties
- Planning of proactive intelligence collection
- Contingency analysis of alternative scenarios
[Figure: processing flow; labels: New intelligence, Filtering and processing of new intelligence, Propagation of inferences, Analysis of key indicators, Development of an intelligence collection plan]
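As one possible reading of "critical uncertainty", the sketch below ranks candidate collection actions by how much they are expected to reduce the entropy of the current hypotheses. This swaps in a standard expected-information-gain calculation for illustration; the slides do not say which criterion RAPID uses, and collection costs are ignored here. The hypotheses, action names, and likelihood values are all hypothetical.

```python
import math

def entropy(dist):
    """Shannon entropy (in bits) of a distribution given as a dict."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_information_gain(prior, likelihood):
    """Expected entropy reduction from observing one collection action.

    prior:      dict hypothesis -> probability
    likelihood: dict hypothesis -> dict outcome -> P(outcome | hypothesis)
    """
    outcomes = next(iter(likelihood.values())).keys()
    expected_posterior_entropy = 0.0
    for outcome in outcomes:
        p_outcome = sum(prior[h] * likelihood[h][outcome] for h in prior)
        if p_outcome > 0:
            # Bayes update of the hypotheses for this outcome.
            posterior = {h: prior[h] * likelihood[h][outcome] / p_outcome
                         for h in prior}
            expected_posterior_entropy += p_outcome * entropy(posterior)
    return entropy(prior) - expected_posterior_entropy

def rank_collection_actions(prior, actions):
    """Order action names from most to least informative."""
    return sorted(actions,
                  key=lambda a: expected_information_gain(prior, actions[a]),
                  reverse=True)

# Hypothetical hypotheses and candidate collection actions.
prior = {"nuclear": 0.5, "chemical": 0.5}
actions = {
    "uav_track_deliveries": {
        "nuclear": {"centrifuge_parts": 0.8, "other": 0.2},
        "chemical": {"centrifuge_parts": 0.1, "other": 0.9},
    },
    "satellite_photo": {
        "nuclear": {"large_site": 0.6, "small_site": 0.4},
        "chemical": {"large_site": 0.5, "small_site": 0.5},
    },
}

ranking = rank_collection_actions(prior, actions)
print(ranking)
```

An action whose outcomes are equally likely under every hypothesis has zero expected gain, so it would never be chosen under this criterion.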
18 Predictive Markov models
- Hypothesis validation and identification of key
indicators
- Automated improvement of model topologies
[Figure: Markov model diagram; labels: Hidden reality, Development goal, Material acquisition, Facility construction, Personnel hiring, Observations, WMD facilities, Qualified personnel, Available material, New observations, Past, Present; variables X1, X2, Y1, Y2, Z11, Z12, Z21, Z22]
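Hypothesis validation with a model of hidden reality and observations can be illustrated with standard forward filtering in a hidden Markov model. This is a generic textbook technique shown under assumptions, not the specific model on the slide: the two hidden states, the observation symbols, and every probability below are hypothetical.

```python
def forward_filter(initial, transition, emission, observations):
    """Return P(hidden state | observations so far) after each observation.

    initial:    dict state -> prior probability
    transition: dict state -> dict next_state -> probability
    emission:   dict state -> dict observation -> probability
    """
    states = list(initial)
    belief = dict(initial)
    history = []
    for step, obs in enumerate(observations):
        if step > 0:
            # Predict: advance the belief one time step.
            belief = {s2: sum(belief[s1] * transition[s1][s2] for s1 in states)
                      for s2 in states}
        # Update: weight by the likelihood of the observation, then normalize.
        unnormalized = {s: belief[s] * emission[s][obs] for s in states}
        total = sum(unnormalized.values())
        belief = {s: v / total for s, v in unnormalized.items()}
        history.append(belief)
    return history

# Hypothetical two-state model of a hidden development goal.
initial = {"civilian": 0.8, "weapons": 0.2}
transition = {
    "civilian": {"civilian": 0.9, "weapons": 0.1},
    "weapons": {"civilian": 0.05, "weapons": 0.95},
}
emission = {
    "civilian": {"material_purchase": 0.1, "no_activity": 0.9},
    "weapons": {"material_purchase": 0.6, "no_activity": 0.4},
}

history = forward_filter(initial, transition, emission,
                         ["material_purchase", "material_purchase"])
for step, belief in enumerate(history):
    print(step, {s: round(p, 3) for s, p in belief.items()})
```

An observation type whose emission probabilities differ strongly between hidden states shifts the belief the most, which is one way to think about a key indicator.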
19 Graphical user interface
- Integrated access to all proposed tools
- Visualization and explanation of proactive
intelligence-collection strategies
20 Test data
- Phase 1 (July 2007 to Dec 2008)
  - Patient data from MS Health Data Consortium: 1.6
  million records, 70 attributes
  - Network event database from CyDAT Center: 10
  billion records
- Phases 2-4 (Jan 2008 to Dec 2011)
  - Challenge problems provided by PAINT/DTO
Related question: Can we get any preliminary
information about the types of target data?
21 Project plan
Task                            From       To
Representation of uncertainty   July 2007  June 2008
Inferences from uncertain data  July 2007  Dec 2008
Analysis of uncertainties       Jan 2009   Dec 2010
Predictive Markov models        July 2007  Dec 2009
Graphical user interface        Jan 2008   Dec 2010
System integration              Jan 2008   Dec 2011
22 First year
- Representation (July to Dec 2007)
  - Uncertain data and inferences
  - Task hierarchies and utilities
  - Markov model input
- Basic operations (Jan to June 2008)
  - Indexing of uncertain data
  - Propagation of inferences
  - Predictive Markov modeling
- Initial GUI design (Jan to June 2008)
23 Collaborations
We plan to explore collaborations with other
participants of the PAINT program.
- New Vectors and Set Corporation: integration of
multiple predictive models
- Fair Isaac: analysis of unstructured intelligence
data
- Least Squares
We will also explore collaborations with
companies that may be interested in converting RAPID
into a commercial product.