RAPID: Representation and Analysis of Probabilistic Intelligence Data

Slides: 48
Provided by: DTO83
Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

Title: RAPID: Representation and Analysis of Probabilistic Intelligence Data


1
RAPID: Representation and Analysis
of Probabilistic Intelligence Data
PAINT
  • Carnegie Mellon University
  • PI: Prof. Jaime G. Carbonell / jgc_at_cs.cmu.edu /
    (412) 268-7279
  • Dr. Eugene Fink / e.fink_at_cs.cmu.edu /
    (412) 268-6593
  • Dr. Anatole Gershman / anatoleg_at_cs.cmu.edu /
    (412) 268-8259
  • DYNAMiX Technologies
  • POC: Dr. Ganesh Mani /
    gmani_at_dynamixtechnologies.com / (412) 401-0121
  • Mr. Dwight Dietrich /
    ddietrich_at_dynamixtechnologies.com / (724) 940-4304

2
People
  • Carnegie Mellon
  • Faculty: Jaime G. Carbonell, Eugene Fink, Anatole
    Gershman
  • Students: Bin Fu, Diwakar Punjani, Andrew Yeager
  • DYNAMiX
  • Principals: Dwight Dietrich, Ganesh Mani
  • Engineers: Atul Bhandari, Jeremy Hermann, Veera
    Manda

3
Outline of the presentation
  • RAPID functionality
  • Preliminary demo
  • Architecture and main components
  • Integration with REALISM
  • Current results and work plan

4
Analysis of uncertain intelligence
  • RAPID is a probabilistic reasoning engine for the
    analysis of dynamically evolving intelligence
    data.
  • RAPID will help
  • Identify important holes
  • Locate the most crucial missing pieces
  • Insert these pieces
  • Knowledge sources
  • Public domain
  • Intelligence
  • Inferences

5
Analysis of uncertain intelligence
  • RAPID will help intelligence analysts to
    accomplish the following tasks.
  • Draw probabilistic conclusions from available
    intelligence, including uncertain and missing
    data
  • Identify potentially surprising developments
  • Formulate and assess hypotheses
  • Identify critical uncertainties
  • Develop strategies for proactive collection of
    additional intelligence to resolve uncertainties,
    based on the analysis of cost / benefit trade-offs
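
As a rough illustration of the cost / benefit trade-off in the last item, the sketch below ranks probes by expected reduction in uncertainty per unit cost. The entropy measure, the `Probe` fields, and all probe names and numbers are assumptions made for illustration; the slides do not specify how RAPID scores probes.

```python
import math
from dataclasses import dataclass


def entropy(p: float) -> float:
    """Binary entropy (in bits) of a hypothesis with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


@dataclass
class Probe:
    """Hypothetical probe: a yes/no observation with known reliability."""
    name: str
    cost: float            # cost of running the probe, arbitrary units
    true_positive: float   # P(probe says yes | hypothesis true)
    false_positive: float  # P(probe says yes | hypothesis false)


def expected_entropy_after(prior: float, probe: Probe) -> float:
    """Expected posterior entropy of the hypothesis after seeing the probe result."""
    p_yes = prior * probe.true_positive + (1 - prior) * probe.false_positive
    p_no = 1 - p_yes
    result = 0.0
    if p_yes > 0:
        post_yes = prior * probe.true_positive / p_yes  # Bayes' rule
        result += p_yes * entropy(post_yes)
    if p_no > 0:
        post_no = prior * (1 - probe.true_positive) / p_no
        result += p_no * entropy(post_no)
    return result


def rank_probes(prior: float, probes: list[Probe]) -> list[tuple[str, float]]:
    """Sort probes by expected entropy reduction per unit cost, best first."""
    scored = [
        (p.name, (entropy(prior) - expected_entropy_after(prior, p)) / p.cost)
        for p in probes
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up probes for a hypothesis that is currently a coin flip.
probes = [
    Probe("open-source search", cost=1.0, true_positive=0.7, false_positive=0.4),
    Probe("expert interview", cost=5.0, true_positive=0.9, false_positive=0.1),
]
ranking = rank_probes(prior=0.5, probes=probes)
```

With these numbers the more expensive but more reliable probe ranks first, because its information gain outweighs its higher cost.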

6
Underlying functionality
  • Representation of uncertainty: Novel
    representation of massive uncertain data, which
    supports fast matching and inferences
  • Inferences from uncertain data: Scalable
    inference mechanism for reasoning about uncertain
    intelligence
  • Analysis of critical uncertainties: Assessment of
    uncertain situations, evaluation of data utility,
    and identification of important missing data
  • Proactive intelligence planning: Evaluation of
    available probes and construction of optimized
    intelligence-collection plans

7
Outline of the presentation
  • RAPID functionality
  • Preliminary demo
  • Architecture and main components
  • Integration with REALISM
  • Current results and work plan

8
Preliminary demo
Uncertainty analysis and probe evaluation,
integrated into Excel.
9
Outline of the presentation
  • RAPID functionality
  • Preliminary demo
  • Architecture and main components
  • Integration with REALISM
  • Current results and work plan

10
Architecture
Advanced analysis of incomplete data,
identification of critical uncertainties,
evaluation and selection of probes,
what-if analysis, and visualization.
Uncertainty calculus and proactive probe planning
Excel extension for the analysis of uncertainty,
probes, and proactive data collection
11
Architecture
Analyst interface
Uncertain situation assessment and
data-collection planning
Fast database operations on a stream of newly
incoming data, and integration of this
stream with the static database.
Uncertainty calculus and proactive probe planning
Excel extension for the analysis of uncertainty,
probes, and proactive data collection
Processing of data streams
Real-time matching of queries and inference rules
against a massive stream of new data
Scalable assessment of uncertain intelligence
Relational database of uncertain data and
inference rules
External API
OTHER PAINT SYSTEMS
12
Architecture
Analyst interface
Approved plans for proactive data collection
Uncertain situation assessment and
data-collection planning
Uncertainty calculus and proactive probe planning
Proactive intelligence collection
Excel extension for the analysis of uncertainty,
probes, and proactive data collection
Processing of data streams
Real-time matching of queries and inference rules
against a massive stream of new data
Massive new intelligence
Scalable assessment of uncertain intelligence
General intelligence collection
Relational database of uncertain data and
inference rules
External API
Value-added reasoning tools
OTHER PAINT SYSTEMS
13
Uncertainty calculus and proactive probe planning
Microsoft Excel
14
Scalable assessment of uncertain intelligence
Uncertain facts
Uncertain inference rules
Semantic network
15
Value-added reasoning tools
The available intelligence data and inference
rules are in Excel tables, and in the uncertainty
database integrated with Excel.
Uncertainty calculus and proactive probe planning
Excel extension for the analysis of uncertainty,
probes, and proactive data collection
16
Analyst interface
  • Optional extension of the Excel interface
  • Visualization and explanation of intelligence
    data, inferences, and data-collection plans

17
Outline of the presentation
  • RAPID functionality
  • Preliminary demo
  • Architecture and main components
  • Integration with REALISM
  • Current results and work plan

18
Integration goals
We will integrate the text-extraction system
developed by HNC / Fair Isaac with the
uncertainty-analysis system developed by CMU /
DYNAMiX. The integrated system will support the
following capabilities.
  • Extraction of facts, relations, and causal links
    from natural-language documents
  • Evaluation of given hypotheses
  • Proactive information gathering
  • Application to the analysis of Iranian
    nano-technology plans and capabilities

19
Inputs and outputs
REALISM
  • Input
  • Requirements and filters for the information
    extraction
  • Natural-language documents
  • World-wide web
  • Output
  • Large structured tables of relevant facts and
    entities, which include uncertainty
  • Inference-rule representation of relations and
    causal links, also including uncertainty
RAPID
  • Input
  • Tables of uncertain facts
  • Uncertain inference rules
  • Queries for specific data
  • Analyst hypotheses
  • Output
  • Inferences from uncertain data
  • Exact and approximate matches for given queries
  • Hypothesis assessment
  • Proactive plans for collecting additional data
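
To make the RAPID inputs and outputs more concrete, here is a minimal sketch of a table of uncertain facts, a few uncertain inference rules, and a forward-inference pass over them. The independence assumption, the noisy-OR combination, and every fact name, rule, and probability are illustrative assumptions, not RAPID's actual uncertainty calculus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical uncertain rule: if all premises hold, the conclusion
    holds with probability `strength`."""
    premises: tuple[str, ...]
    conclusion: str
    strength: float


def infer(facts: dict[str, float], rules: list[Rule], rounds: int = 10) -> dict[str, float]:
    """Forward-chain over the rules, treating premises as independent.

    Support from several rules for the same conclusion is merged with
    noisy-OR: P = 1 - prod(1 - p_i). Input facts keep their probabilities.
    """
    given = dict(facts)
    beliefs = dict(facts)
    for _ in range(rounds):
        support: dict[str, list[float]] = {}
        for rule in rules:
            p = rule.strength
            for premise in rule.premises:
                p *= beliefs.get(premise, 0.0)  # unknown premise counts as 0
            support.setdefault(rule.conclusion, []).append(p)
        updated = dict(given)
        for conclusion, ps in support.items():
            if conclusion in given:
                continue  # never overwrite a probability supplied as input
            miss = 1.0
            for p in ps:
                miss *= 1.0 - p
            updated[conclusion] = 1.0 - miss
        if updated == beliefs:
            break  # nothing changed, so the beliefs have converged
        beliefs = updated
    return beliefs


# Made-up table of uncertain facts and rules.
facts = {"imports_equipment": 0.8, "publishes_research": 0.6}
rules = [
    Rule(("imports_equipment",), "lab_capability", 0.5),
    Rule(("publishes_research",), "lab_capability", 0.5),
    Rule(("lab_capability",), "active_program", 0.9),
]
beliefs = infer(facts, rules)
```

The returned dictionary plays the role of "inferences from uncertain data": each derived statement carries a probability that an analyst could compare against a hypothesis.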

20
Architecture
Analyst interface
Uncertain situation assessment and
data-collection planning
Information requests
Topic filters
Uncertainty calculus and proactive probe planning
Structured facts and entities
TEXT DOCUMENTS
REALISM
External API
Structured relations and causal links
Scalable assessment of uncertain intelligence
WEB
RAPID: CMU / DYNAMiX
REALISM: HNC / Fair Isaac
21
Outline of the presentation
  • RAPID functionality
  • Preliminary demo
  • Architecture and main components
  • Integration with REALISM
  • Current results and work plan

22
Initial results
  • Detailed technical plan of uncertain situation
    assessment and proactive probe planning:
    architecture, functionality, and algorithms
  • Uncertain intelligence scenario based on public
    data about Iranian nano-technology

CONCEPTUAL
23
Current work
  • Uncertainty calculus, integrated with Excel
  • Proactive probe planning
  • Scalable uncertainty assessment, integrated with
    a relational database
  • Integration with REALISM
  • Initial analyst interface

24
Short-term plan
Prototype of uncertainty calculus: March
Prototype of probe-planning tools: March
Initial RAPID / REALISM integration: May
Initial analyst interface (extended Excel): June
Prototype of uncertainty database: July
25
Long-term plan
All versions of RAPID will demonstrate all main
capabilities, with increasing functionality over
time.
Uncertain situation assessment and proactive probe planning: July 2008
Discrimination among competing hypotheses and identification of critical uncertainties: July 2009
Fully integrated deployable prototype: July 2009
Advanced proactive-intelligence planning and learning of inference rules: July 2010
Value-added tools, which may include data-stream processing, entity co-reference, adversarial search, and Markov reasoning: July 2011
Fully integrated deliverable system: Jan 2012
27
Evaluation
We expect that RAPID will provide significant
advantage over available off-the-shelf tools,
such as standard spreadsheets and database
systems.
To support this claim, we plan to compare the
productivity of analysts using RAPID with that of
analysts who perform the same tasks using
commercially available tools.
We will view RAPID as a success if it consistently
outperforms the standard tools and the analysts
report an overall positive experience of using
it.
28
Adjustment of the earlier plan
  • We need to adjust the plan to the new budget. We
    will deliver the full core functionality, but we
    propose to reduce the work on value-added tools.
  • Reduced work
  • Processing of data streams
  • Advanced contingency analysis
  • Analyst interface
  • Suspended work
  • Predictive Markov models
  • Analysis of adversarial actions

29
APPENDICES
30
Appendices
  • Previous work
  • Empirical evaluation
  • PAINT contributions

31
ARGUS
  • ARGUS project sponsored by DTO/ARDA:
    Identification and tracking of novel patterns in
    massive databases and data streams.

[Diagram: ARGUS data flow. Labels: historical and new data; background and novel clusters; re-clustering; create and detect; event models; tracked and new events; generate, update, and match profiles; alerts; analysts.]
32
ARGUS
  • Estimate the density function at t0
  • Grow the cluster for a period of Δt while
    reducing the weight of old records
  • Estimate the new density function at t0 + Δt
  • Compare the two estimates
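
The four steps above can be sketched with a weighted kernel density estimate. The Gaussian kernel, the fixed down-weighting of old records, the L1 distance between the two estimates, and all parameter values are assumptions chosen for illustration; the slides do not say which estimators ARGUS uses.

```python
import math


def weighted_kde(points, weights, grid, bandwidth=1.0):
    """Weighted Gaussian kernel density estimate evaluated on `grid`."""
    total = sum(weights)
    norm = bandwidth * math.sqrt(2 * math.pi)
    density = []
    for x in grid:
        s = sum(
            w * math.exp(-0.5 * ((x - p) / bandwidth) ** 2)
            for p, w in zip(points, weights)
        )
        density.append(s / (total * norm))
    return density


def density_shift(old_points, new_points, grid, decay=0.5, bandwidth=1.0):
    """Compare the density at t0 with the density at t0 + Δt.

    Old records keep a reduced weight `decay`; records that arrived
    during Δt get weight 1. Returns the approximate L1 distance between
    the two estimates, computed as a Riemann sum over `grid`.
    """
    step = grid[1] - grid[0]
    before = weighted_kde(old_points, [1.0] * len(old_points), grid, bandwidth)
    grown_points = list(old_points) + list(new_points)
    grown_weights = [decay] * len(old_points) + [1.0] * len(new_points)
    after = weighted_kde(grown_points, grown_weights, grid, bandwidth)
    return step * sum(abs(a - b) for a, b in zip(after, before))


grid = [i * 0.1 for i in range(-50, 151)]            # covers -5.0 to 15.0
old = [0.0, 0.5, -0.5, 1.0, -1.0]                    # records known at t0
same = density_shift(old, [0.2, -0.2], grid)         # new data resembles the old
novel = density_shift(old, [9.0, 9.5, 10.0], grid)   # new data far from the old
```

A large shift, as in the `novel` case, would flag the grown cluster as a candidate novel pattern, while a small shift suggests that the new records fit the existing cluster.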

33
ARGUS
Respiratory Diseases
SARS
Re-clustering
t0
Δt
34
RADAR
  • RADAR project sponsored by DARPA: Analysis and
    management of volatile crisis situations based on
    uncertain data.

Top-level control and learning
Process new data
Analysts
35
RADAR
We have applied the system to repair a conference
schedule after a crisis that caused a loss of rooms.
36
RAPID
  • Unlike ARGUS
  • Represents and analyzes uncertainty
  • Supports complex inferences
  • Unlike RADAR
  • Scales to massive intelligence datasets
  • Analyzes complex external situations
  • Develops intelligence-collection plans

37
Appendices
  • Previous work
  • Empirical evaluation
  • PAINT contributions

38
Evaluation goals
We expect that RAPID will provide significant
advantage over available off-the-shelf tools,
such as standard spreadsheets and database
systems.
39
Experimental setup
We expect to recruit retired intelligence
analysts for the system evaluation, and ask them
to perform several tasks based on given uncertain
data.
  • Identify the data most relevant to given tasks
  • Evaluate the validity of given hypotheses
  • Find relevant hidden patterns
  • Identify critical missing data and propose
    a cost-effective plan for collecting this data

40
Performance measurements
We will measure the following main factors to
evaluate the performance of analysts:
  • Number of high-level tasks completed within the
    experiment time frame
  • Accuracy of hypothesis evaluation
  • Number and relevance of identified patterns
  • Effectiveness and costs of data-collection plans

We will also ask analysts to complete a
questionnaire on their overall experience.
41
Expected results
  • We will view the proposed work as a success if
  • RAPID consistently outperforms the off-the-shelf
    tools in all four performance factors,
  • the performance difference for each factor is
    statistically significant, and
  • analysts report an overall positive experience
    of using the system.
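
One way to check the statistical-significance criterion is an exact paired permutation (sign-flip) test on per-analyst scores. The choice of test, the paired design, the function name, and the scores below are assumptions for illustration; the slides do not name a specific test.

```python
import itertools


def paired_permutation_p_value(rapid_scores, baseline_scores):
    """Exact one-sided sign-flip test for paired scores.

    Null hypothesis: the sign of each paired difference is arbitrary.
    Returns the fraction of sign assignments whose summed difference is
    at least as large as the observed one. The enumeration is exhaustive,
    so it is only practical for a small number of pairs.
    """
    diffs = [r - b for r, b in zip(rapid_scores, baseline_scores)]
    observed = sum(diffs)
    count = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        # Small tolerance guards against rounding when scores are floats.
        if sum(s * d for s, d in zip(signs, diffs)) >= observed - 1e-12:
            count += 1
    return count / total


# Made-up scores: tasks completed by the same 8 analysts under each condition.
with_rapid = [7, 6, 8, 5, 7, 9, 6, 8]
with_baseline = [5, 5, 6, 4, 6, 7, 5, 6]
p_value = paired_permutation_p_value(with_rapid, with_baseline)
```

The same function could be applied separately to each of the four performance factors, with the resulting p-values compared against a significance level chosen in advance.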

42
RAPID / REALISM evaluation
  • Component evaluation
  • We will measure the following performance
    factors
  • Accuracy and completeness of text extraction
  • Accuracy of hypothesis evaluation
  • Effectiveness of data-collection plans
  • Speed of each system component

Component utility: We will also evaluate the
utility of REALISM and RAPID by comparing the
productivity of subjects under the following
three conditions:
  • Use of the integrated system
  • Use of REALISM without RAPID
  • Use of RAPID without REALISM

43
Appendices
  • Previous work
  • Empirical evaluation
  • PAINT contributions

44
Main contributions
Strategy Generation and Exploration
Response Options
Data
Dynamic Simulation Models
Feedback
  • Representation of massive uncertain knowledge
  • Automated discovery of causal relationships
  • Fast probabilistic integration of all evidence
  • Analysis of possible future developments
  • Identification of critical uncertainties
  • Planning of proactive intelligence gathering
45
Inputs and outputs
General intelligence collection
CONTINUOUS DATA STREAM
Uncertain intelligence and analyst opinions
Massive stream of structured records
Proactive intelligence collection
RAPID
Data-search queries
Query matches
Uncertain situation assessment
Specific hypotheses
Evaluation of hypotheses
INTERACTIVE DATA ANALYSIS
New learned rules
Inference rules
Domain knowledge
Plans for proactive intelligence collection
46
Inputs
  • From other PAINT components
  • Available intelligence data and its certainty
  • Hypotheses about unknown factors
  • Available domain knowledge
  • From analysts
  • Intelligence-analysis tasks and priorities
  • Hypotheses and related opinions
  • Responses to RAPID-generated probes
  • Additional domain knowledge
  • From other sources
  • Databases with available intelligence
  • Public databases with relevant data

47
Outputs
  • Inferences from available uncertain data
  • Evaluation of given hypotheses
  • New hypotheses and their certainties
  • Plans for proactive intelligence collection
  • Learned inference rules