Title: Analysis of uncertain data: Selection of probes for information gathering
Eugene Fink
May 27, 2009
Outline
- High-level part
- Research interests and dreams
- Proactive learning under uncertainty
- Military intelligence applications
- Technical part
- Evaluation of given hypotheses
- Choice of relevant observations
- Selection of effective probes
High-Level Part
Research interests and dreams
- Semi-automated representation changes
- Problem reformulation and simplification
- Selection of search and learning algorithms
- Trade-offs among completeness, accuracy, and
speed of these algorithms
Research interests and dreams
- Semi-automated representation changes
- Semi-automated reasoning under uncertainty
- Conclusions from incomplete and imprecise data
- Passive and active learning
- Targeted information gathering
Research interests and dreams
- Semi-automated representation changes
- Semi-automated reasoning under uncertainty
Recent projects:
- Scheduling based on uncertain resources and constraints
- Excel tools for uncertain numeric and nominal data
- Analysis of military intelligence and targeted data gathering
Representation changes
- Semi-automated representation changes
- Semi-automated reasoning under uncertainty
- Theoretical foundations of AI
- Formalizing messy AI techniques
- AI-complexity and AI-completeness
Representation changes
- Semi-automated representation changes
- Semi-automated reasoning under uncertainty
- Theoretical foundations of AI
- Algorithm theory
- Generalized convexity
- Indexing of approximate data
- Compression of time series
- Smoothing of probability densities
Subject of the talk
- Semi-automated representation changes
- Semi-automated reasoning under uncertainty
- Analysis of military intelligence
- Targeted information gathering
- Theoretical foundations of AI
- Algorithm theory
Learning under uncertainty
Learning is almost always a response to
uncertainty.
If we knew everything, we would not need to learn.
Learning under uncertainty
Construction of predictive models, response
mechanisms, etc. based on available data.
Learning under uncertainty
- Passive learning
- Active learning
Targeted requests for additional data, based on
simplifying assumptions.
- The oracle can answer any question.
- The answers are always correct.
- All questions have the same cost.
Learning under uncertainty
- Passive learning
- Active learning
- Proactive learning
Extensions to active learning aimed at removing
these assumptions.
- Different questions incur different costs.
- We may not receive an answer.
- An answer may be incorrect.
- The information value depends on the intended use
of the learned knowledge.
Proactive learning architecture
[Architecture diagram: Top-Level Control; Model Construction; Model Evaluation; Question Selection; Reasoning or Optimization; Data Collection; flows include the current model, questions, answers, and model utility and limitations.]
Military intelligence applications
We have studied proactive learning in the context of military intelligence and homeland security.
The purpose is to develop tools for:
- Drawing conclusions from available intelligence.
- Planning of additional intelligence gathering.
Modern military intelligence
Gather and analyze:
Front end: Massive data collection, including satellite and aerial imaging, interviews, human intelligence, etc.
Back end: Sifting through massive data sets, both public and classified.
Almost no feedback loop: back-end analysts are passive learners, who do not give tasks to front-end data collectors.
Traditional goals
- Gather and analyze massive data
- Draw (semi-)reliable conclusions
- Propose actions that are likely to accomplish
given objectives
Novel goals
Identify critical missing intelligence and plan
effective information gathering.
- Targeted observations (expensive).
- Active probing (very expensive).
Analysis of leadership and pathways
We can evaluate the intent and possible future
actions of an adversary through the analysis of
its leadership and pathways.
Analysis of leadership and pathways
Leadership: Social networks, goals, and pet projects of decision makers.
Pathways: Typical projects and their sequences in research, development, and production.
[Pathway diagram with nodes: research on enhanced orcs; secret orc development; mass orc production; military orc deployment.]
Analysis of leadership and pathways
- Construct models of social networks and production pathways.
- For each set of reasonable assumptions about the adversary's intent, use these models to predict observable events.
- Check which of the predictions match actual observations.
Example
Model predictions: If Sauron were secretly forging a new ring:
- 80% chance we would observe deliveries of black-magic materials to Mordor.
- 60% chance we would observe an unusual concentration of orcs.
Intelligence: The aerial imaging by eagles shows black-magic deliveries but no orcs.
What can we conclude?
Technical Part
Anatole Gershman, Eugene Fink, Bin Fu, and Jaime
G. Carbonell
General problem
We have to distinguish among n mutually exclusive hypotheses, denoted H1, H2, …, Hn. We base the analysis on m observable features, denoted obs1, obs2, …, obsm. Each observation is a variable that takes one of several discrete values.
Input
- Prior probabilities: For every hypothesis, we know its prior; thus, we have an array of n priors, prior[1..n].
- Possible observations: For every observation, obs[a], we know the number of its possible values, num[a]. Thus, we have the array num[1..m] with the number of values for each observation.
- Observation distributions: For every hypothesis, we know the related probability distribution of each observation. Thus, we have a matrix chance[1..n, 1..m], where each element is a probability-density function. Every element chance[i, a] is itself a one-dimensional array with num[a] elements, which represent the probabilities of possible values of obs[a].
- Actual observations: We know a specific value of each observation, which represents the available intelligence. Thus, we have an array of m observed values, val[1..m].
Output
We have to evaluate the posterior probabilities of the n given hypotheses, denoted post[1..n].
Approach
We can apply the Bayesian rule, but we have to address two complications.
- The hypotheses may not cover all possibilities. Sauron may be neither working on a new ring nor doing white-magic research.
- The observations may not be independent, and we usually do not know the dependencies. The concentration of orcs may or may not be directly related to the black-magic deliveries.
Simple Bayesian case
We have one observed value, val[a], and the sum of the prior[1..n] probabilities is exactly 1.0.
Integrated likelihood of observing val[a]:
likelihood(val[a]) = chance[1, a][val[a]] · prior[1] + … + chance[n, a][val[a]] · prior[n].
Posterior probability of Hi:
post[i] = prob(Hi | val[a]) = chance[i, a][val[a]] · prior[i] / likelihood(val[a]).
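The update on this slide can be sketched directly in Python; the names `priors`, `chance`, `a`, and `val_a` are placeholders for the arrays defined on the Input slide, not code from the talk.

```python
def simple_bayes(priors, chance, a, val_a):
    """Posteriors when the priors sum to exactly 1.0.

    priors -- prior[1..n] as a Python list
    chance -- chance[i][a] is the list of value probabilities of obs[a] under H_i
    a      -- index of the single observed feature
    val_a  -- the observed value of that feature
    """
    n = len(priors)
    # Integrated likelihood: sum of chance[i, a][val[a]] * prior[i] over all i.
    likelihood = sum(chance[i][a][val_a] * priors[i] for i in range(n))
    # Bayesian rule: post[i] = chance[i, a][val[a]] * prior[i] / likelihood.
    return [chance[i][a][val_a] * priors[i] / likelihood for i in range(n)]
```

For two hypotheses with priors [0.6, 0.4] and value distributions [0.8, 0.2] and [0.3, 0.7], observing value 0 gives likelihood 0.6 and posteriors [0.8, 0.2].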
Rejection of all hypotheses
We have one observed value, val[a], and the sum of the prior[1..n] probabilities is less than 1.0.
We consider the hypothesis H0, representing the belief that all n hypotheses are incorrect:
prior[0] = 1.0 − prior[1] − … − prior[n].
Posterior probability of H0:
post[0] = prior[0] · prob(val[a] | H0) / prob(val[a]) = prior[0] · prob(val[a] | H0) / (prior[0] · prob(val[a] | H0) + likelihood(val[a])).
Rejection of all hypotheses
Bad news: We do not know prob(val[a] | H0).
Good news: post[0] monotonically depends on prob(val[a] | H0); thus, if we obtain lower and upper bounds for prob(val[a] | H0), we also get bounds for post[0]:
post[0] = prior[0] · prob(val[a] | H0) / (prior[0] · prob(val[a] | H0) + likelihood(val[a])).
Plausibility principle
Unlikely events normally do not happen; thus, if we have observed val[a], then its likelihood must not be too small.
Plausibility threshold: We use a global constant plaus, which must be between 0.0 and 1.0. If we have observed val[a], we assume that prob(val[a]) ≥ plaus / num[a].
We use it to obtain bounds for prob(val[a] | H0):
Lower: (plaus / num[a] − likelihood(val[a])) / prior[0].
Upper: 1.0.
Plausibility principle
We substitute these bounds into the dependency of post[0] on prob(val[a] | H0), thus obtaining the bounds for post[0]:
Lower: 1.0 − likelihood(val[a]) · num[a] / plaus.
Upper: prior[0] / (prior[0] + likelihood(val[a])).
We have derived bounds for the probability that none of the given hypotheses is correct.
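A minimal sketch of these bounds, assuming the quantities from the preceding slides; the clamp at 0.0 is an added safeguard, since the lower-bound formula can go negative for sufficiently likely values.

```python
def post0_bounds(prior0, likelihood_val, num_a, plaus=0.1):
    """Lower and upper bounds on post[0], the posterior probability
    that all given hypotheses are incorrect.

    likelihood_val -- likelihood(val[a]) from the simple Bayesian case
    plaus          -- plausibility threshold (the talk uses 0.1)
    """
    # Lower bound: 1.0 - likelihood(val[a]) * num[a] / plaus, clamped at 0.0.
    lower = max(0.0, 1.0 - likelihood_val * num_a / plaus)
    # Upper bound: prior[0] / (prior[0] + likelihood(val[a])).
    upper = prior0 / (prior0 + likelihood_val)
    return lower, upper
```

For example, with prior[0] = 0.3, likelihood 0.02, and a binary observation, the bounds are 0.6 and 0.9375.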
Judgment calls
A human has to specify a plausibility threshold and decide between the use of the lower and the upper bounds.
- Plausibility threshold: Reducing it leads to more reliable conclusions at the expense of a looser lower bound. We have used 0.1, which tends to give good practical results.
- Lower vs. upper bound: We should err on the pessimistic side. If H0 is a pleasant surprise, use the lower bound; else, use the upper bound.
Multiple observations
We have multiple observed values, val[1..m]. We have tried several approaches:
- Joint distributions: We usually cannot obtain joint distributions or information about dependencies.
- Independence assumption: We usually get terrible practical results, which are no better (and sometimes worse) than random guessing.
- Use of one most relevant observation: We usually get surprisingly good practical results.
Most relevant observation
We identify the highest-utility observation and do not use other observations to corroborate it.
Pay attention only to black-magic deliveries and ignore observations of orc armies.
Advantage: We use a conservative approach, which never leads to excessive over-confidence.
Drawback: We may significantly underestimate the value of available observations.
Most relevant observation
We identify the highest-utility observation and do not use other observations to corroborate it.
Selection procedure:
- For each of the m observable values:
  - Compute the posteriors based on this value.
  - Evaluate their information utility.
- Select the observable value that gives the highest information utility of the posteriors.
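The selection procedure above can be sketched as follows; `utility` stands for any of the utility measures discussed next, and the function name is a placeholder.

```python
def most_relevant_observation(priors, vals, chance, utility):
    """Pick the single observed value whose posteriors maximize `utility`.

    vals    -- observed value val[a] of each of the m features
    chance  -- chance[i][a] is the value distribution of obs[a] under H_i
    utility -- function mapping a posterior list to a number
    """
    n = len(priors)
    best = (None, float("-inf"), None)
    for a in range(len(vals)):
        # Posteriors based on this observed value alone.
        likelihood = sum(chance[i][a][vals[a]] * priors[i] for i in range(n))
        post = [chance[i][a][vals[a]] * priors[i] / likelihood for i in range(n)]
        u = utility(post)
        if u > best[1]:
            best = (a, u, post)
    return best[0], best[2]  # chosen observation index and its posteriors
```

With a sharp observation and an uninformative one, a certainty-rewarding utility selects the sharp observation.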
Alternative utility measures
Negation of Shannon's entropy:
post[0] · log post[0] + … + post[n] · log post[n].
It rewards high certainty, that is, situations in which the posteriors clearly favor one hypothesis over all others. It is high when the probability of some hypothesis is close to 1.0; it is low when all hypotheses are about equally likely.
Drawback: It may reward unwarranted certainty.
Alternative utility measures
Negation of Shannon's entropy:
post[0] · log post[0] + … + post[n] · log post[n].
Kullback-Leibler divergence:
post[0] · log (post[0] / prior[0]) + … + post[n] · log (post[n] / prior[n]).
It rewards situations in which the posteriors are very different from the priors. It tends to give preference to observations that have the potential for paradigm shifts.
Drawback: It may encourage unwarranted departure from the right conclusions.
Alternative utility measures
Task-specific utilities: We may construct better utility measures by analyzing the impact of posterior estimates on our future actions and evaluating the related rewards and penalties, but it involves more lengthy formulas.
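The two closed-form measures can be written directly as code; the function names are mine, and zero-probability terms are treated as contributing zero (the limit of p · log p as p approaches 0).

```python
import math

def neg_entropy(post):
    """Negation of Shannon's entropy; high when one posterior is near 1.0."""
    return sum(p * math.log(p) for p in post if p > 0.0)

def kl_divergence(post, priors):
    """Kullback-Leibler divergence of the posteriors from the priors;
    high when the posteriors depart far from the priors."""
    return sum(p * math.log(p / q) for p, q in zip(post, priors) if p > 0.0)
```

Note that a uniform posterior minimizes neg_entropy, while posteriors equal to the priors give zero KL divergence.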
Probe selection
We may obtain additional intelligence by probing
the adversary, that is, affecting it by external
actions and observing its response.
Increase the cost of black-magic materials
through market manipulation and observe whether
Sauron continues purchasing them.
We have to select among k available probes.
Additional input
- Probe costs: For every probe, we know its expected cost; thus, we have an array of k numeric costs, cost[1..k].
- Observation distributions: The likelihood of specific observed values depends on (1) which hypothesis is correct and (2) which probe has been applied. For every hypothesis and every probe, we know the related probability distribution of each observation. Thus, we have an array with n × m × k elements, chance[1..n, 1..m, 1..k], where each element is a probability-density function. Every element chance[i, a, j] is itself a one-dimensional array with num[a] elements, which represent the probabilities of possible values of obs[a].
Selection procedure
- For each of the k probes:
  - Consider the related observation distributions.
  - Select the most relevant observation.
  - Compute the expected gain as the difference between the expected utility of the posterior probabilities and the probe cost.
- Select the probe with the highest gain.
- If this gain is positive, recommend its application.
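A sketch of this procedure, assuming the chance[1..n, 1..m, 1..k] array from the Additional input slide as a nested list `chance[i][a][j]`, and reading "expected utility" as the posterior utility averaged over the possible values of an observation, weighted by their likelihoods; the names are placeholders.

```python
def select_probe(priors, chance, costs, utility):
    """Return (probe index, expected gain) for the best of the k probes.

    chance[i][a][j] -- value distribution of obs[a] under H_i after probe j
    The slide recommends applying the probe only if its gain is positive.
    """
    n, m = len(priors), len(chance[0])
    best = (None, float("-inf"))
    for j in range(len(costs)):
        # Most relevant observation for probe j: highest expected
        # posterior utility over the possible values of the observation.
        best_obs = float("-inf")
        for a in range(m):
            expected = 0.0
            for v in range(len(chance[0][a][j])):
                likelihood = sum(chance[i][a][j][v] * priors[i] for i in range(n))
                if likelihood > 0.0:
                    post = [chance[i][a][j][v] * priors[i] / likelihood
                            for i in range(n)]
                    expected += likelihood * utility(post)
            best_obs = max(best_obs, expected)
        gain = best_obs - costs[j]
        if gain > best[1]:
            best = (j, gain)
    return best
```

With a certainty-rewarding utility, an informative probe wins over a cheaper but uninformative one, since its expected posterior utility is much higher.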
Extensions
- Task-specific utility functions.
- Accounting for the probabilities of observation and probe failures.
- Selection of multiple observations based on their independence or joint distributions.
- Application of parameterized probes.
Analysis of Uncertain Data