Title: Uncertainty in Artificial Intelligence Research at USC: Research Presentation for Graduate Students
1Uncertainty in Artificial Intelligence Research at USC: Research Presentation for Graduate Students
- September 10, 2004
- Marco Valtorta
- SWRG 3A55
- mgv_at_cse.sc.edu
2Uncertainty in Artificial Intelligence
- Artificial Intelligence (AI)
- Robotics
- Automated Reasoning
- Theorem Proving, Search, etc.
- Reasoning Under Uncertainty
- Fuzzy Logic, Possibility Theory, etc.
- Normative Systems
- Bayesian Networks
- Influence Diagrams (Decision Networks)
3Research Interests
- Algorithms for Probability Update in BNs
- factor tree method, with Mark Bloemeke
- Modeling of uncertain evidence
- observation variables, with Young-Gyun Kim and Jirka Vomlel
- Soft Evidential Update in BNs
- and the big clique algorithm, with Young-Gyun Kim and Jirka Vomlel
- Causal Bayesian networks
- Learning
- CB algorithm, with Moninder Singh and Bing Xia
- the effect of data quality on learning, with
Valerie Sessions
4Algorithms and Modeling
- Algorithms for probability update in BNs
- factor tree method, with Mark Bloemeke
- Modeling of uncertain evidence with observation variables, with Young-Gyun Kim and Jirka Vomlel
- Soft evidential update in BNs and the big clique algorithm, with Young-Gyun Kim and Jirka Vomlel
- Causal Bayesian networks, with Yimin Huang
5Correlation vs. Causation
- The genotype theory (Fisher, 1958) of smoking and lung cancer: smoking and lung cancer are both effects of a genetic predisposition
- Three-node network
- X (smoking) and Y (lung cancer) are in lockstep
- X precedes Y in time (smoking before cancer)
- But X does not cause Y, because if we set X, Y does not change; Y changes only according to the value of U (the genotype)
[Figure: three-node network with arrows U -> X and U -> Y]
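A minimal numerical sketch of this point, in Python. The probabilities are hypothetical (the slide gives none) and are chosen only so that X and Y are strongly correlated through U. Conditioning on X shifts the distribution of Y, while setting X by intervention leaves it unchanged, because Y depends only on U.

```python
# Hypothetical parameters for the genotype model U -> X, U -> Y.
p_u = {0: 0.7, 1: 0.3}                           # P(U = u)
p_x_given_u = {0: {0: 0.9, 1: 0.1},              # p_x_given_u[u][x] = P(X = x | U = u)
               1: {0: 0.2, 1: 0.8}}
p_y_given_u = {0: {0: 0.95, 1: 0.05},            # p_y_given_u[u][y] = P(Y = y | U = u)
               1: {0: 0.4, 1: 0.6}}


def p_y_given_x(y, x):
    """Observational P(Y = y | X = x): seeing X tells us something about U."""
    num = sum(p_u[u] * p_x_given_u[u][x] * p_y_given_u[u][y] for u in p_u)
    den = sum(p_u[u] * p_x_given_u[u][x] for u in p_u)
    return num / den


def p_y_do_x(y, x):
    """Interventional P(Y = y | do(X = x)).

    Setting X removes the arrow U -> X, and Y has no arrow from X,
    so the result does not depend on x at all.
    """
    return sum(p_u[u] * p_y_given_u[u][y] for u in p_u)


if __name__ == "__main__":
    for x in (0, 1):
        print("x =", x,
              "P(Y=1 | X=x) =", round(p_y_given_x(1, x), 3),
              "P(Y=1 | do(X=x)) =", round(p_y_do_x(1, x), 3))
```

With these numbers, the observational probabilities differ sharply between x = 0 and x = 1, while the interventional probability is the same for both.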
6An Example (Cochran, through Pearl, 2000)
Soil fumigants (X) are used to increase oat crop
yields (Y) by controlling the eelworm population
(Z). Last year's eelworm population (Z0) is an
unknown quantity that is strongly correlated with
this year's population. Through laboratory
analysis of soil samples, we can determine the
eelworm populations before and after the
treatment (Z1 and Z2). Furthermore, we assume
that the fumigants do not affect the growth of
eelworms surviving the treatment. Instead,
eelworm growth depends on the population of
birds (B), which is correlated with last year's
eelworm population and hence with the treatment
itself. Z3 here represents the eelworm population
at the end of the season.
We wish to assess the total effect of the
fumigants on yields. But controlled randomized
experiments are infeasible and Z0 is unknown.
Given a correct model, can we obtain a consistent
estimate of the target quantity, the total
effect of the fumigants on yields, from
observations?
7Nonidentifiability
- The identifiability of the effect of X on Y ensures that it is possible to infer the effect of action do(X=x) on Y from passive observations and the causal graph G, which specifies which variables participate in the determination of each variable in the domain
- To prove nonidentifiability, it is sufficient to present two sets of structural equations that induce identical distributions over observed variables but have different causal effects
- X and Y are observable, U is not. All of them are binary variables
- Let P(X=0 | U) = (0.5, 0.5)
- P(Y=0 | X, U) is given by the table on the right
- We cannot observe U, so we do not know P(U)
- When P(U=0) = 0.5, P(Y | X=0) = (0.45, 0.55)
- When P(U=0) = 0.1, P(Y | X=0) = (0.73, 0.27)
- So, P(Y | do(X)) is non-identifiable
[Figure: causal graph with arrows U -> X, U -> Y, and X -> Y]
P(Y=0 | X, U)   X=0   X=1
U=0             0.1   0.2
U=1             0.8   0.7
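The two numerical bullets on this slide can be reproduced with a short calculation. The Python sketch below assumes that the pair (0.5, 0.5) gives P(X=0 | U=0) and P(X=0 | U=1), and that the table entries are P(Y=0 | X, U). The function and variable names are illustrative only.

```python
# P(Y = 0 | X = x, U = u) from the slide's table, keyed by (x, u).
p_y0_given_xu = {(0, 0): 0.1, (1, 0): 0.2,
                 (0, 1): 0.8, (1, 1): 0.7}

# Assumed reading of the slide: P(X = 0 | U = u) is 0.5 for both values of u.
p_x0_given_u = {0: 0.5, 1: 0.5}


def p_y_given_x0(p_u0):
    """Return (P(Y=0 | X=0), P(Y=1 | X=0)) for a given value of P(U=0)."""
    p_u = {0: p_u0, 1: 1.0 - p_u0}
    # P(X = 0), obtained by summing out the unobserved U.
    p_x0 = sum(p_u[u] * p_x0_given_u[u] for u in (0, 1))
    # P(Y = 0, X = 0), again summing out U.
    p_y0_x0 = sum(p_u[u] * p_x0_given_u[u] * p_y0_given_xu[(0, u)]
                  for u in (0, 1))
    y0 = p_y0_x0 / p_x0
    return (y0, 1.0 - y0)


if __name__ == "__main__":
    print([round(v, 2) for v in p_y_given_x0(0.5)])  # about (0.45, 0.55)
    print([round(v, 2) for v in p_y_given_x0(0.1)])  # about (0.73, 0.27)
```

The arithmetic is 0.5 x 0.1 + 0.5 x 0.8 = 0.45 in the first case and 0.1 x 0.1 + 0.9 x 0.8 = 0.73 in the second.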
8Smoking and the genotype theory
- Consider the relation between smoking (X) and lung cancer (Y).
- The tobacco industry has managed to forestall antismoking legislation by arguing that the observed correlation between smoking and lung cancer could be explained by some sort of carcinogenic genotype (U) that involves an inborn craving for nicotine.
- Suppose that Z is the amount of tar deposited in a person's lungs and we believe in the causal model shown on the right.
- Can we now recover P(Y | do(X)) from observational data only?
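One way to explore this question is a numerical check of Pearl's front-door adjustment. The sketch below assumes that the model on the right is the graph X -> Z -> Y with an unobserved U pointing into both X and Y (the figure is not reproduced here, so this is an assumption), and all parameter values are hypothetical. It compares the effect of do(X=x) computed with full knowledge of U against the front-door formula, which uses only the observed joint distribution of X, Z, and Y.

```python
from itertools import product

# Hypothetical parameters for the assumed graph U -> X, X -> Z, Z -> Y, U -> Y.
p_u = {0: 0.6, 1: 0.4}                                   # P(U = u)
p_x_u = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}       # p_x_u[u][x] = P(X = x | U = u)
p_z_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.25, 1: 0.75}}     # p_z_x[x][z] = P(Z = z | X = x)
p_y1_zu = {(0, 0): 0.05, (0, 1): 0.3,                    # P(Y = 1 | Z = z, U = u)
           (1, 0): 0.4, (1, 1): 0.9}


def p_y_zu(y, z, u):
    """P(Y = y | Z = z, U = u)."""
    return p_y1_zu[(z, u)] if y == 1 else 1.0 - p_y1_zu[(z, u)]


def joint(x, z, y):
    """Observed joint P(x, z, y), with the unobserved U summed out."""
    return sum(p_u[u] * p_x_u[u][x] * p_z_x[x][z] * p_y_zu(y, z, u)
               for u in (0, 1))


def p_x(x):
    return sum(joint(x, z, y) for z, y in product((0, 1), repeat=2))


def p_z_given_x(z, x):
    return sum(joint(x, z, y) for y in (0, 1)) / p_x(x)


def p_y_given_xz(y, x, z):
    return joint(x, z, y) / sum(joint(x, z, yy) for yy in (0, 1))


def front_door(y, x):
    """Front-door estimate of P(y | do(x)), using observed quantities only."""
    return sum(p_z_given_x(z, x)
               * sum(p_x(xp) * p_y_given_xz(y, xp, z) for xp in (0, 1))
               for z in (0, 1))


def truth(y, x):
    """P(y | do(x)) computed from the full model, including U."""
    return sum(p_z_x[x][z] * p_u[u] * p_y_zu(y, z, u)
               for z in (0, 1) for u in (0, 1))


if __name__ == "__main__":
    for x in (0, 1):
        print("x =", x, "front-door:", front_door(1, x), "truth:", truth(1, x))
```

Under this assumed graph the two computations agree up to rounding, which suggests that measuring the intermediate variable Z is what makes the effect recoverable from observational data.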
9Learning
- Parallel learning with background knowledge, with Bhaskara Moole
- CB algorithm, with Moninder Singh and Bing Xia
- Effect of data quality on learning, with Valerie Sessions
10An Example of Learning: Chernobyl
11A Bayesian Network Model
12Simulation
13Simulation File Conversion
14Sample(s)
Key Yes 1 Read 1 Received 1 Heard
1 Received 1 No 2
15Visual CB
- CB (Singh and Valtorta, 1993, 1995)
- in Visual C++ (Bing Xia, MS, 2002)
16Learning
17Result on Chernobyl Example
18Results II
19Results III
20Results IV
21Applications
- Assessment of the risk of mental retardation in infants, with Subramani Mani and Suzanne McDermott
- Agent-based intrusion detection with soft evidence, with Vaibhav Gowadia and Csilla Farkas
- Support for intelligence analysis, with Michael Huhns, Hrishi Goradia, Jiangbo Dang, and Jingshan Huang
- Modeling damage in critical resources, with Yimin Huang and Bill Full
22MENTOR
23The OmniSeer Project
- Represent prior knowledge to support intelligence analysis
- Explicate formerly tacit knowledge for use and collaboration
- Support relevance analysis, evidence gathering, and novelty detection
- with Bayesian networks!
24OmniSeer Functional Architecture
The massive data might be filtered by preferences
and interests specified in the UConn User Model
Outdated fragments are removed periodically from
the set of partially instantiated fragments
Tacit Knowledge
Matcher
BN fragments represent an analyst's prior
knowledge about terrorist activities or other
domains of interest specified in the UConn User
Model
Differences between an analyst's conclusion and
the situation-specific scenario lead to
explication of formerly tacit knowledge,
represented as new BN fragments
Forgetter
Alerts
The noun-phrase analyzer from UConn processes
messages; a third-party tagger processes news feeds
The analyst explores which information should be
acquired to reduce uncertainty and assesses the
robustness of conclusions
The analyst is notified of surprises and
interesting situations, as specified in the UConn
User Model
Relevant facts extracted from the documents and
messages fill in the details of the BN fragments
of interest
Composer
Instantiated BN fragments are composed into
scenarios specific to the situation at hand
25Competence and Resources
- Several faculty members in the CSE department have worked in normative probabilistic reasoning for many years
- Some colleagues and students in the Statistics department are also interested
- Tools for editing BNs and IDs, propagation, interface with relational databases, soft evidential update, learning, etc., have been acquired or developed and used in projects and courses (CSCE 582 and CSCE 822)
26Some Local UAI Researchers (Notably Missing: Juan Vargas)
Billy Turkett, Ph.D. (Wake Forest)
Young-Gyun Kim, Ph.D. (S.C. State)
Wayne Smith, Ph.D. (Presbyterian College)
Clif Presser, Ph.D. (Gettysburg College)
Miguel Barrientos, Ph.D.
27Judea Pearl and Finn V. Jensen
28Additional Information
- Bayesian networks journal club
- meets every two weeks on Wednesdays; next meeting on September 15 at 1pm in 3A75
- http://www.cse.sc.edu/mgv/BNSeminar/index.html
- 3A55, 777-4641
- mgv_at_cse.sc.edu
- www.cse.sc.edu/mgv