Analysis of uncertain data: Selection of probes for information gathering
1
Analysis of uncertain data: Selection of probes
for information gathering
Eugene Fink
May 27, 2009
2
Outline
  • High-level part
  • Research interests and dreams
  • Proactive learning under uncertainty
  • Military intelligence applications
  • Technical part
  • Evaluation of given hypotheses
  • Choice of relevant observations
  • Selection of effective probes

3
High-Level Part
4
Research interests and dreams
  • Semi-automated representation changes
  • Problem reformulation and simplification
  • Selection of search and learning algorithms
  • Trade-offs among completeness, accuracy, and
    speed of these algorithms

5
Research interests and dreams
  • Semi-automated representation changes
  • Semi-automated reasoning under uncertainty
  • Conclusions from incomplete and imprecise data
  • Passive and active learning
  • Targeted information gathering

6
Research interests and dreams
  • Semi-automated representation changes
  • Semi-automated reasoning under uncertainty

Recent projects
  • Scheduling based on uncertain resources and
    constraints
  • Excel tools for uncertain numeric and nominal
    data
  • Analysis of military intelligence and targeted
    data gathering

7
Representation changes
  • Semi-automated representation changes
  • Semi-automated reasoning under uncertainty
  • Theoretical foundations of AI
  • Formalizing messy AI techniques
  • AI-complexity and AI-completeness

8
Representation changes
  • Semi-automated representation changes
  • Semi-automated reasoning under uncertainty
  • Theoretical foundations of AI
  • Algorithm theory
  • Generalized convexity
  • Indexing of approximate data
  • Compression of time series
  • Smoothing of probability densities

9
Subject of the talk
  • Semi-automated representation changes
  • Semi-automated reasoning under uncertainty
  • Analysis of military intelligence
  • Targeted information gathering
  • Theoretical foundations of AI
  • Algorithm theory

10
Learning under uncertainty
Learning is almost always a response to
uncertainty.
If we knew everything, we would not need to learn.
11
Learning under uncertainty
  • Passive learning

Construction of predictive models, response
mechanisms, etc. based on available data.
12
Learning under uncertainty
  • Passive learning
  • Active learning

Targeted requests for additional data, based on
simplifying assumptions.
  • The oracle can answer any question.
  • The answers are always correct.
  • All questions have the same cost.

13
Learning under uncertainty
  • Passive learning
  • Active learning
  • Proactive learning

Extensions to active learning aimed at removing
these assumptions.
  • Different questions incur different costs.
  • We may not receive an answer.
  • An answer may be incorrect.
  • The information value depends on the intended use
    of the learned knowledge.

14
Proactive learning architecture
[Architecture diagram: Top-Level Control oversees
Model Construction, Model Evaluation, and Question
Selection; labeled flows include the current model
(to Reasoning or Optimization), questions and
answers (exchanged with Data Collection), and the
model's utility and limitations.]
15
Military intelligence applications
We have studied proactive learning in the context
of military intelligence and homeland security.
  • The purpose is to develop tools for
  • Drawing conclusions from available intelligence.
  • Planning of additional intelligence gathering.

16
Modern military intelligence
Gather and analyze
Front end: Massive data collection, including
satellite and aerial imaging, interviews, human
intelligence, etc.
Back end: Sifting through massive data sets, both
public and classified.
Almost no feedback loop: back-end analysts are
passive learners, who do not give tasks to
front-end data collectors.
17
Traditional goals
  • Gather and analyze massive data
  • Draw (semi-)reliable conclusions
  • Propose actions that are likely to accomplish
    given objectives

18
Novel goals
Identify critical missing intelligence and plan
effective information gathering.
  • Targeted observations (expensive).
  • Active probing (very expensive).

19
Analysis of leadership and pathways
We can evaluate the intent and possible future
actions of an adversary through the analysis of
its leadership and pathways.
20
Analysis of leadership and pathways
We can evaluate the intent and possible future
actions of an adversary through the analysis of
its leadership and pathways.
Leadership: Social networks, goals, and pet
projects of decision makers.
Pathways: Typical projects and their sequences in
research, development, and production.
[Pathway diagram: stages include research on
enhanced orcs, secret orc development, mass orc
production, and military orc deployment.]
21
Analysis of leadership and pathways
22
Analysis of leadership and pathways
  • Construct models of social networks and
    production pathways.
  • For each set of reasonable assumptions about the
    adversary's intent, use these models to predict
    observable events.
  • Check which of the predictions match actual
    observations.

23
Example
Model predictions
If Sauron were secretly forging a new ring:
  • 80% chance we would observe deliveries of
    black-magic materials to Mordor.
  • 60% chance we would observe an unusual
    concentration of orcs.

What can we conclude?
Intelligence The aerial imaging by eagles shows
black-magic deliveries but no orcs.
24
Technical Part
Anatole Gershman, Eugene Fink, Bin Fu, and Jaime
G. Carbonell
25
General problem
We have to distinguish among n mutually exclusive
hypotheses, denoted H1, H2, …, Hn. We base the
analysis on m observable features, denoted obs1,
obs2, …, obsm. Each observation is a variable
that takes one of several discrete values.
26
Input
  • Prior probabilities: For every hypothesis, we
    know its prior; thus, we have an array of n
    priors, prior[1..n].
  • Possible observations: For every observation,
    obs_a, we know the number of its possible values,
    num_a. Thus, we have the array num[1..m] with
    the number of values for each observation.
  • Observation distributions: For every hypothesis,
    we know the related probability distribution of
    each observation. Thus, we have a matrix
    chance[1..n, 1..m], where each element is a
    probability-density function. Every element
    chance[i, a] is itself a one-dimensional array
    with num_a elements, which represent the
    probabilities of possible values of obs_a.
  • Actual observations: We know a specific value of
    each observation, which represents the available
    intelligence. Thus, we have an array of m
    observed values, val[1..m].
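As a concrete illustration (not from the slides), the input arrays above might be laid out as follows; the numbers are hypothetical:

```python
# Toy layout of the input arrays, following the slides' notation
# (n = 2 hypotheses, m = 1 observation; all values are illustrative).
n, m = 2, 1
prior = [0.5, 0.3]        # prior[1..n]; here the priors sum to less than 1.0
num = [2]                 # num[1..m]: number of possible values per observation
# chance[1..n, 1..m]: each element is a distribution over num[a] values
chance = [
    [[0.8, 0.2]],         # distribution of observation 1 under hypothesis 1
    [[0.4, 0.6]],         # distribution of observation 1 under hypothesis 2
]
val = [0]                 # val[1..m]: the actually observed values
```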

27
Output
We have to evaluate the posterior probabilities
of the n given hypotheses, denoted post[1..n].
28
Approach
We can apply the Bayesian rule, but we have to
address two complications.
  • The hypotheses may not cover all possibilities.
    Sauron may be neither working on a new ring nor
    doing white-magic research.
  • The observations may not be independent, and we
    usually do not know the dependencies. The
    concentration of orcs may or may not be directly
    related to the black-magic deliveries.

29
Simple Bayesian case
We have one observed value, val_a, and the sum
of the prior[1..n] probabilities is exactly 1.0.
Integrated likelihood of observing val_a:
likelihood(val_a) = chance[1, a][val_a] × prior[1]
+ … + chance[n, a][val_a] × prior[n].
Posterior probability of Hi:
post[i] = prob(Hi | val_a)
= chance[i, a][val_a] × prior[i] / likelihood(val_a).
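As a sketch (not part of the original slides), the simple Bayesian case can be coded directly from these two formulas; the names follow the slides' arrays, zero-indexed:

```python
def posteriors(prior, chance, a, val_a):
    """Simple Bayesian case: the priors sum to exactly 1.0 and we have one
    observed value val_a of observation a. A sketch of the slide's formulas;
    chance[i][a] is the value distribution of observation a under hypothesis i."""
    n = len(prior)
    # Integrated likelihood: sum over hypotheses of chance[i, a][val_a] * prior[i]
    likelihood = sum(chance[i][a][val_a] * prior[i] for i in range(n))
    # Posterior of Hi: chance[i, a][val_a] * prior[i] / likelihood(val_a)
    return [chance[i][a][val_a] * prior[i] / likelihood for i in range(n)]
```

For example, with priors [0.5, 0.5] and binary-observation distributions [0.8, 0.2] and [0.4, 0.6], observing value 0 gives likelihood 0.6 and posteriors [2/3, 1/3].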
30
Rejection of all hypotheses
We have one observed value, val_a, and the sum
of the prior[1..n] probabilities is less than 1.0.
We consider the hypothesis H0 representing the
belief that all n hypotheses are incorrect:
prior[0] = 1.0 - prior[1] - … - prior[n].
Posterior probability of H0:
post[0] = prior[0] × prob(val_a | H0) / prob(val_a)
= prior[0] × prob(val_a | H0)
/ (prior[0] × prob(val_a | H0) + likelihood(val_a)).
31
Rejection of all hypotheses
Bad news: We do not know prob(val_a | H0).
Good news: post[0] monotonically depends on
prob(val_a | H0); thus, if we obtain lower and
upper bounds for prob(val_a | H0), we also get
bounds for post[0].
Posterior probability of H0:
post[0] = prior[0] × prob(val_a | H0) / prob(val_a)
= prior[0] × prob(val_a | H0)
/ (prior[0] × prob(val_a | H0) + likelihood(val_a)).
32
Plausibility principle
Unlikely events normally do not happen; thus, if
we have observed val_a, then its likelihood must
not be too small.
Plausibility threshold: We use a global constant
plaus, which must be between 0.0 and 1.0. If we
have observed val_a, we assume that
prob(val_a) ≥ plaus / num_a.
We use it to obtain bounds for prob(val_a | H0):
Lower: (plaus / num_a - likelihood(val_a)) / prior[0].
Upper: 1.0.
33
Plausibility principle
We substitute these bounds into the dependency of
post[0] on prob(val_a | H0), thus obtaining the
bounds for post[0]:
Lower: 1.0 - likelihood(val_a) × num_a / plaus.
Upper: prior[0] / (prior[0] + likelihood(val_a)).
We use it to obtain bounds for prob(val_a | H0):
Lower: (plaus / num_a - likelihood(val_a)) / prior[0].
Upper: 1.0.
We have derived bounds for the probability that
none of the given hypotheses is correct.
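These bounds can be sketched in code as follows; clamping the lower bound at 0.0 (for observations more likely than the plausibility threshold requires) is an added assumption, not part of the slides:

```python
def post0_bounds(prior, chance, a, val_a, num_a, plaus=0.1):
    """Bounds on post[0], the posterior that all n hypotheses are incorrect,
    under the plausibility principle; a sketch of the slides' formulas."""
    prior0 = 1.0 - sum(prior)                 # prior of H0
    likelihood = sum(chance[i][a][val_a] * prior[i] for i in range(len(prior)))
    # Lower bound: 1.0 - likelihood(val_a) * num_a / plaus, clamped at 0.0
    lower = max(0.0, 1.0 - likelihood * num_a / plaus)
    # Upper bound: prior[0] / (prior[0] + likelihood(val_a))
    upper = prior0 / (prior0 + likelihood)
    return lower, upper
```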
34
Judgment calls
A human has to specify a plausibility threshold
and decide between the use of the lower and the
upper bounds.
  • Plausibility threshold: Reducing it leads to more
    reliable conclusions at the expense of a looser
    lower bound. We have used 0.1, which tends to
    give good practical results.
  • Lower vs. upper bound: We should err on the
    pessimistic side. If H0 is a pleasant surprise,
    use the lower bound; else, use the upper bound.

35
Multiple observations
We have multiple observed values, val[1..m].
We have tried several approaches:
  • Joint distributions: We usually cannot obtain
    joint distributions or information about
    dependencies.
  • Independence assumption: We usually get terrible
    practical results, which are no better (and
    sometimes worse) than random guessing.
  • Use of one most relevant observation: We usually
    get surprisingly good practical results.

36
Most relevant observation
We identify the highest-utility observation and
do not use other observations to corroborate it.
Pay attention only to black-magic deliveries and
ignore observations of orc armies.
Advantage: We use a conservative approach, which
never leads to excessive over-confidence.
Drawback: We may significantly underestimate the
value of available observations.
37
Most relevant observation
We identify the highest-utility observation and
do not use other observations to corroborate it.
  • Selection procedure:
  • For each of the m observable values:
  • Compute the posteriors based on this value.
  • Evaluate their information utility.
  • Select the observable value that gives the
    highest information utility of the posteriors.

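The selection procedure above can be sketched as follows; the `utility` argument is any function over posteriors (such as the measures discussed on the next slides), and the function names are hypothetical:

```python
def most_relevant(prior, chance, val, utility):
    """Pick the single observed value whose induced posteriors have the
    highest information utility; a sketch of the slide's procedure."""
    n, m = len(prior), len(val)
    best_a, best_u, best_post = None, float("-inf"), None
    for a in range(m):                         # for each of the m observed values
        likelihood = sum(chance[i][a][val[a]] * prior[i] for i in range(n))
        if likelihood == 0.0:
            continue                           # implausible value; skip it
        # posteriors based on this single value
        post = [chance[i][a][val[a]] * prior[i] / likelihood for i in range(n)]
        u = utility(post)                      # information utility of posteriors
        if u > best_u:
            best_a, best_u, best_post = a, u, post
    return best_a, best_post
```

With two hypotheses, a non-discriminating observation and a discriminating one, the procedure picks the discriminating one.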
38
Alternative utility measures
Negation of Shannon's entropy:
post[0] × log post[0] + … + post[n] × log post[n].
It rewards high certainty, that is, situations in
which the posteriors clearly favor one hypothesis
over all others. It is high when the probability of
some hypothesis is close to 1.0; it is low when
all hypotheses are about equally likely.
Drawback: It may reward unwarranted certainty.
39
Alternative utility measures
Negation of Shannon's entropy:
post[0] × log post[0] + … + post[n] × log post[n].
Kullback-Leibler divergence:
post[0] × log (post[0] / prior[0]) + …
+ post[n] × log (post[n] / prior[n]).
It rewards situations in which the posteriors are
very different from the priors. It tends to give
preference to observations that have the potential
for paradigm shifts.
Drawback: It may encourage unwarranted departure
from the right conclusions.
40
Alternative utility measures
Negation of Shannon's entropy:
post[0] × log post[0] + … + post[n] × log post[n].
Kullback-Leibler divergence:
post[0] × log (post[0] / prior[0]) + …
+ post[n] × log (post[n] / prior[n]).
Task-specific utilities: We may construct better
utility measures by analyzing the impact of
posterior estimates on our future actions and
evaluating the related rewards and penalties, but
it involves more lengthy formulas.
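The first two utility measures are direct sums over the posterior vector; a minimal sketch, with the usual convention that a zero-probability term contributes nothing:

```python
import math

def neg_entropy(post):
    """Negation of Shannon's entropy: sum of post[i] * log(post[i]).
    Highest when one hypothesis dominates; 0 * log 0 is taken as 0."""
    return sum(p * math.log(p) for p in post if p > 0.0)

def kl_divergence(post, prior):
    """Kullback-Leibler divergence of the posteriors from the priors:
    sum of post[i] * log(post[i] / prior[i])."""
    return sum(p * math.log(p / q) for p, q in zip(post, prior) if p > 0.0)
```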
41
Probe selection
We may obtain additional intelligence by probing
the adversary, that is, affecting it by external
actions and observing its response.
Increase the cost of black-magic materials
through market manipulation and observe whether
Sauron continues purchasing them.
We have to select among k available probes.
42
Additional input
  • Probe costs: For every probe, we know its
    expected cost; thus, we have an array of k
    numeric costs, cost[1..k].
  • Observation distributions: The likelihood of
    specific observed values depends on (1) which
    hypothesis is correct and (2) which probe has
    been applied. For every hypothesis and every
    probe, we know the related probability
    distribution of each observation. Thus, we have
    an array with n × m × k elements,
    chance[1..n, 1..m, 1..k], where each element is a
    probability-density function. Every element
    chance[i, a, j] is itself a one-dimensional array
    with num_a elements, which represent the
    probabilities of possible values of obs_a.

43
Selection procedure
  • For each of the k probes:
  • Consider the related observation distributions.
  • Select the most relevant observation.
  • Compute the expected gain as the difference
    between the expected utility of the posterior
    probabilities and the probe cost.
  • Select the probe with the highest gain.
  • If this gain is positive, recommend its
    application.
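The probe-selection loop can be sketched as follows; the expected utility averages the posterior utility over the possible values of the probe's most relevant observation, and the array layout and helper names are assumptions, not from the slides:

```python
def expected_utility(prior, dist, utility):
    """Expected posterior utility for one observation, where dist[i] is the
    value distribution under hypothesis i (a hypothetical helper)."""
    n = len(prior)
    total = 0.0
    for v in range(len(dist[0])):             # over all possible observed values
        likelihood = sum(dist[i][v] * prior[i] for i in range(n))
        if likelihood == 0.0:
            continue
        post = [dist[i][v] * prior[i] / likelihood for i in range(n)]
        total += likelihood * utility(post)   # weight by chance of seeing v
    return total

def select_probe(prior, chance, cost, utility):
    """Sketch of the slide's procedure; chance[i][a][j] is the distribution of
    observation a under hypothesis i when probe j is applied."""
    n, m, k = len(prior), len(chance[0]), len(cost)
    best_j, best_gain = None, 0.0
    for j in range(k):                        # for each of the k probes
        # most relevant observation under this probe: highest expected utility
        u = max(expected_utility(prior, [chance[i][a][j] for i in range(n)],
                                 utility)
                for a in range(m))
        gain = u - cost[j]                    # expected utility minus probe cost
        if gain > best_gain:                  # recommend only a positive gain
            best_j, best_gain = j, gain
    return best_j, best_gain
```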

44
Extensions
  • Task-specific utility functions.
  • Accounting for the probabilities of observation
    and probe failures.
  • Selection of multiple observations based on their
    independence or joint distributions.
  • Application of parameterized probes.

45
Analysis of Uncertain Data