1
Intelligent Systems (2II40) C7
  • Alexandra I. Cristea

October 2003
2
VI. Uncertainty
  • VI.1. Decision theory basics
  • Uncertainty
  • Probability
  • Syntax
  • Semantics
  • Inference Rules
  • VI.2. Probabilistic reasoning
  • Conditional independence
  • Bayesian networks: syntax and semantics
  • Exact inference
  • Approximate inference

3
VI.2.B. Belief networks (Bayesian networks)
4
Return to belief network example
  • Neighbors John and Mary promised to call if the
    alarm goes off; sometimes it is triggered by
    earthquakes. Is there a burglar?
  • Variables: Burglary, Earthquake, Alarm,
    JohnCalls, MaryCalls (n = 5 variables)

5
Belief network example cont.
6
Constructing belief networks
  • Choose an ordering of variables X1, ..., Xn
  • For i = 1 to n
  • add Xi to the network
  • select parents from X1, ..., Xi-1 such that
  • P(Xi | Parents(Xi)) = P(Xi | X1, ..., Xi-1)
    (the resulting factorization is sketched in Python below)

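To make the resulting factorization concrete, here is a minimal Python sketch of the burglary network built with such an ordering: each variable stores its chosen parents and its CPT, and the full joint is the product of the local conditionals. The dict representation and the helper names prob and joint are illustrative choices, not from the slides; the CPT numbers are the standard textbook values for this example.

# Burglary network: variable -> (parents, CPT giving P(var = True | parent values)).
burglary_net = {
    "Burglary":   ((), {(): 0.001}),
    "Earthquake": ((), {(): 0.002}),
    "Alarm":      (("Burglary", "Earthquake"),
                   {(True, True): 0.95, (True, False): 0.94,
                    (False, True): 0.29, (False, False): 0.001}),
    "JohnCalls":  (("Alarm",), {(True,): 0.90, (False,): 0.05}),
    "MaryCalls":  (("Alarm",), {(True,): 0.70, (False,): 0.01}),
}

def prob(net, var, value, event):
    """P(var = value | parents(var)), looked up in the CPT for a full event."""
    parents, cpt = net[var]
    p_true = cpt[tuple(event[p] for p in parents)]
    return p_true if value else 1.0 - p_true

def joint(net, event):
    """Full joint P(x1, ..., xn) as the product of the local conditionals."""
    p = 1.0
    for var in net:                        # dict order is a topological order here
        p *= prob(net, var, event[var], event)
    return p

event = {"Burglary": True, "Earthquake": False, "Alarm": True,
         "JohnCalls": True, "MaryCalls": True}
print(joint(burglary_net, event))          # approx. 0.000591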
7
Constructing belief networks example
8
Constructing belief networks example
9
Constructing belief networks example
10
Constructing belief networks example
11
Constructing belief networks example
12
Example car diagnosis
  • Initial evidence: engine won't start
  • Testable variables (thin ovals), diagnosis
    variables (thick ovals), hidden variables
    (shaded) ensure sparse structure, reduce
    parameters

13
Example car insurance
  • Predict claim (medical, liability, property)
  • Given data on application form (other unshaded
    nodes)

14
Efficient conditional distributions
  • CPT grows exponentially w. no. of parents
  • CPT becomes infinite w. continuous variables
  • Other, more compact methods are needed

15
Compact conditional distributions - cont.
  • Noisy-OR distributions model multiple
    noninteracting causes
  • Parents U1, ..., Uk include all causes (can add leak
    node)
  • Independent failure probability qi for each cause
    alone ⇒ P(X | U1, ..., Uj, ¬Uj+1, ..., ¬Uk) = 1 - ∏i=1..j qi
    (a small sketch follows below)

Number of parameters linear in number of parents
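A minimal Python sketch of the noisy-OR entry above; the per-cause failure probabilities in the usage line (a fever caused by cold, flu, or malaria) are the usual textbook illustration, not values from these slides.

def noisy_or(q, active):
    """P(X = true | causes) = 1 - product of qi over the causes that are present.
    q: per-cause failure probabilities qi; active: which causes are true."""
    p_all_fail = 1.0
    for q_i, cause_present in zip(q, active):
        if cause_present:
            p_all_fail *= q_i
    return 1.0 - p_all_fail

# e.g., fever with q = 0.6 (cold), 0.2 (flu), 0.1 (malaria); flu and malaria present:
print(noisy_or([0.6, 0.2, 0.1], [False, True, True]))   # 1 - 0.2 * 0.1 = 0.98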
16
Hybrid (discrete + continuous) networks
  • Discrete (Subsidy? and Buys?)
  • Continuous (Harvest and Cost)
  • How to deal with this?

17
Probability density functions
  • Instead of probability distributions
  • For continuous variables
  • Ex.: let X denote tomorrow's maximum temperature
    in the summer in Eindhoven
  • Belief that X is distributed uniformly between 18
    and 26 degrees Celsius
  • P(X = x) = U[18,26](x)
  • P(X = 20.5) = U[18,26](20.5) = 0.125/°C
    (sketched below)

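A one-function sketch of the uniform density used in the example; nothing is assumed beyond the 18-26 °C interval given above.

def uniform_density(x, lo=18.0, hi=26.0):
    """U[lo, hi](x): constant density 1 / (hi - lo) inside the interval, 0 outside."""
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0

print(uniform_density(20.5))   # 1/8 = 0.125 (per degree Celsius)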
18
(No Transcript)
19
(No Transcript)
20
(No Transcript)
21
Hybrid (discrete + continuous) networks
  • Discrete (Subsidy? and Buys?)
  • Continuous (Harvest and Cost)
  • How to deal with this?

22
Hybrid (discrete + continuous) networks
  • Discrete (Subsidy? and Buys?)
  • Continuous (Harvest and Cost)
  • Option 1: discretization
  • possibly large errors, large CPTs
  • Option 2: finitely parameterized canonical
    families
  • Continuous variable, discrete + continuous
    parents (e.g., Cost)
  • Discrete variable, continuous parents (e.g.,
    Buys?)

23
a) Continuous child variables
  • Need one conditional density function for child
    variable given continuous parents, for each
    possible assignment to discrete parents
  • Most common is the linear Gaussian model, e.g.
    (sketched below):
  • Mean Cost varies linearly w. Harvest, variance is
    fixed
  • Linear variation is unreasonable over the full
    range, but works OK if the likely range of
    Harvest is narrow

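A small sketch of a linear Gaussian conditional density for Cost given Harvest, for one assignment of the discrete parent; the coefficients a, b and the standard deviation sigma are made-up illustrative values, not taken from the slides.

import math

def linear_gaussian_pdf(c, h, a=-1.0, b=10.0, sigma=1.0):
    """Density of Cost = c given Harvest = h: Normal(a * h + b, sigma^2)."""
    mu = a * h + b                 # mean varies linearly with the continuous parent
    z = (c - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

print(linear_gaussian_pdf(c=5.0, h=5.0))   # density at the mean, approx. 0.399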
24
Continuous child variables ex.
  • All-continuous network w. LG distributions ⇒
    full joint is a multivariate Gaussian
  • Discrete + continuous LG network is a conditional
    Gaussian network, i.e., a multivariate Gaussian
    over all continuous variables for each
    combination of discrete variable values

25
b) Discrete child, continuous parent
  • P(buys | Cost = c) = Φ((-c + μ) / σ)
  • with μ the threshold for buying
  • Probit distribution
  • Φ is the integral of the standard normal
    distribution
  • Logit distribution
  • uses the sigmoid function
    (both are sketched below)

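A sketch of both models for P(buys | Cost = c); the threshold mu and spread sigma are illustrative values, and the logit version uses one common sigmoid parameterization rather than anything specified on the slide.

import math

def probit_p_buys(c, mu=5.0, sigma=1.0):
    """Probit model: P(buys | Cost = c) = Phi((-c + mu) / sigma), Phi = standard normal CDF."""
    return 0.5 * (1.0 + math.erf((-c + mu) / (sigma * math.sqrt(2.0))))

def logit_p_buys(c, mu=5.0, sigma=1.0):
    """Logit model: the same S-shape via the sigmoid function."""
    return 1.0 / (1.0 + math.exp(-2.0 * (-c + mu) / sigma))

print(probit_p_buys(4.0), logit_p_buys(4.0))   # buying is likely when cost is below mu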
26
VI.2. Probabilistic reasoning
  • Conditional independence
  • Bayesian networks: syntax and semantics
  • Exact inference
  • Exact inference by enumeration
  • Exact inference by variable elimination
  • Approximate inference
  • Approximate inference by stochastic simulation
  • Approximate inference by Markov chain Monte Carlo

27
Exact inference w. enumeration
Naive evaluation of the nested sum: O(n d^n) time, i.e., O(n 2^n) for Boolean variables
Reusing repeated subexpressions: O(d^n), i.e., O(2^n)
28
Enumeration algorithm
  • Exhaustive depth-first enumeration: O(n) space,
    O(d^n) time (a sketch follows below)

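A compact sketch of the depth-first enumeration described above, run on the sprinkler network that the sampling slides use later; the dict representation and function names are illustrative, and the CPT values are the standard textbook ones for this example.

NET = {
    "Cloudy":    ((), {(): 0.5}),
    "Sprinkler": (("Cloudy",), {(True,): 0.1, (False,): 0.5}),
    "Rain":      (("Cloudy",), {(True,): 0.8, (False,): 0.2}),
    "WetGrass":  (("Sprinkler", "Rain"),
                  {(True, True): 0.99, (True, False): 0.90,
                   (False, True): 0.90, (False, False): 0.00}),
}
VARS = list(NET)                           # topological order

def p(var, value, ev):
    parents, cpt = NET[var]
    p_true = cpt[tuple(ev[q] for q in parents)]
    return p_true if value else 1.0 - p_true

def enumerate_all(vars_left, ev):
    """Depth-first sum over the hidden variables: O(n) space, O(d^n) time."""
    if not vars_left:
        return 1.0
    y, rest = vars_left[0], vars_left[1:]
    if y in ev:
        return p(y, ev[y], ev) * enumerate_all(rest, ev)
    return sum(p(y, v, ev) * enumerate_all(rest, {**ev, y: v})
               for v in (True, False))

def enumeration_ask(query, evidence):
    """Normalized distribution P(query | evidence)."""
    dist = {v: enumerate_all(VARS, {**evidence, query: v}) for v in (True, False)}
    norm = sum(dist.values())
    return {v: d / norm for v, d in dist.items()}

print(enumeration_ask("Rain", {"Sprinkler": True}))   # {True: 0.3, False: 0.7}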
29
Inference by variable elimination
  • Enumeration is inefficient: repeated computation
  • e.g., computes P(J = true | a) P(M = true | a) for each
    value of e
  • Variable elimination: summation from right to
    left, storing intermediate results (factors) to
    avoid recomputation

30
Variable elimination basic operations
  • Pointwise product of factors f1 and f2:
  • f1(x1, ..., xj, y1, ..., yk) × f2(y1, ..., yk, z1, ..., zl)
    = f(x1, ..., xj, y1, ..., yk, z1, ..., zl)
  • e.g., f1(a,b) × f2(b,c) = f(a,b,c)
  • Summing out a variable from a product of factors:
  • move any constant factors outside the summation:
    Σx f1 × ... × fk = f1 × ... × fi Σx fi+1 × ... × fk
    = f1 × ... × fi × fX̄
  • assuming f1, ..., fi do not depend on X
    (both operations are sketched below)

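A sketch of the two basic operations above, assuming a factor is represented as a pair (variable list, table) with Boolean variables and the table keyed by value tuples in that order; all names and the toy numbers are illustrative.

from itertools import product

def pointwise_product(f1, f2):
    """f1(X, Y) x f2(Y, Z) = f(X, Y, Z): multiply matching entries over the union of variables."""
    v1, t1 = f1
    v2, t2 = f2
    vs = v1 + [v for v in v2 if v not in v1]
    table = {}
    for vals in product([True, False], repeat=len(vs)):
        a = dict(zip(vs, vals))
        table[vals] = (t1[tuple(a[v] for v in v1)] *
                       t2[tuple(a[v] for v in v2)])
    return vs, table

def sum_out(var, f):
    """Sum a variable out of a factor, producing a factor over the remaining variables."""
    vs, t = f
    i = vs.index(var)
    rest = vs[:i] + vs[i + 1:]
    table = {}
    for vals, value in t.items():
        key = vals[:i] + vals[i + 1:]
        table[key] = table.get(key, 0.0) + value
    return rest, table

# e.g., f1(A, B) x f2(B, C) = f(A, B, C), then sum out B:
f1 = (["A", "B"], {k: 0.1 for k in product([True, False], repeat=2)})
f2 = (["B", "C"], {k: 0.2 for k in product([True, False], repeat=2)})
print(sum_out("B", pointwise_product(f1, f2))[1])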
31
Example pointwise product
32
Example pointwise product
33
Variable elimination algorithm
34
Complexity of exact inference
  • Polytrees (singly connected networks): networks in
    which there is at most one undirected path
    between any two nodes
  • Time, space complexity of exact inference on
    polytrees: linear in size of network
  • Multiply connected networks (non-polytrees):
  • Variable elimination can have exponential time
    and space complexity
  • inference in Bayesian networks is NP-hard
  • includes inference in propositional logic as a
    special case

35
  • VI.2. D. Approximate inference

36
Inference by stochastic simulation
  • Basic idea
  • Draw N samples from a sampling distribution S
  • Compute an approximate posterior probability P̂
  • Show it converges to the true probability P

37
VI.2. D. Approximate inference
  • Sampling from an empty network
  • Rejection sampling: reject samples disagreeing w.
    evidence
  • Likelihood weighting: use evidence to weight
    samples
  • MCMC: sample from a stochastic process whose
    stationary distribution is the true posterior

38
i. Sampling from an empty network
  • function PRIOR-SAMPLE(bn) returns an
    event sampled from the prior specified by bn
  • x ← an event w. n elements
  • for i = 1 to n do
  • xi ← a random sample from
    P(Xi | parents(Xi))
  • return x
  • P(Cloudy) = <0.5, 0.5>
    (a Python rendering follows below)

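A direct Python rendering of the PRIOR-SAMPLE pseudocode above, run on the sprinkler network used in the following slides; the dict representation and the CPT values are the standard illustrative ones, not taken verbatim from the slides.

import random

NET = {
    "Cloudy":    ((), {(): 0.5}),
    "Sprinkler": (("Cloudy",), {(True,): 0.1, (False,): 0.5}),
    "Rain":      (("Cloudy",), {(True,): 0.8, (False,): 0.2}),
    "WetGrass":  (("Sprinkler", "Rain"),
                  {(True, True): 0.99, (True, False): 0.90,
                   (False, True): 0.90, (False, False): 0.00}),
}

def prior_sample(net):
    """Sample each variable in topological order from P(Xi | parents(Xi))."""
    x = {}
    for var, (parents, cpt) in net.items():
        p_true = cpt[tuple(x[q] for q in parents)]
        x[var] = random.random() < p_true
    return x

print(prior_sample(NET))   # one event drawn from the prior, e.g. {'Cloudy': True, ...}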
39
i. Sampling from an empty network cont.
  • Probability that PRIOR-SAMPLE generates a
    particular event:
  • S_PS(x1, ..., xn) = ∏i=1..n P(xi | Parents(Xi)) = P(x1, ..., xn)
  • N_PS(Y = y): no. of samples generated for which Y = y,
    for any set of variables Y.
  • Then, P̂(Y = y) = N_PS(Y = y) / N and
  • lim N→∞ P̂(Y = y) = Σh S_PS(Y = y, H = h)
  • = Σh P(Y = y, H = h)
  • = P(Y = y)
  • ⇒ estimates derived from PRIOR-SAMPLE are
    consistent

40
ii. Rejection sampling example
  • Estimate P(Rain | Sprinkler = true) using
    100 samples
  • 27 samples have Sprinkler = true; out of these,
  • 8 have Rain = true and
  • 19 have Rain = false.
  • P̂(Rain | Sprinkler = true)
  • = NORMALIZE(<8, 19>) = <0.296, 0.704>
  • Similar to a basic real-world empirical
    estimation procedure

41
ii. Rejection sampling
  • P̂(X | e) is estimated from samples agreeing with
    evidence e

PROBLEM: a lot of collected samples are thrown
away!! (rejection sampling is sketched below)
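A sketch of rejection sampling for the query on the previous slide, P(Rain | Sprinkler = true); the network representation, prior_sample and the sample count are the same illustrative choices as in the earlier sketches.

import random

NET = {
    "Cloudy":    ((), {(): 0.5}),
    "Sprinkler": (("Cloudy",), {(True,): 0.1, (False,): 0.5}),
    "Rain":      (("Cloudy",), {(True,): 0.8, (False,): 0.2}),
    "WetGrass":  (("Sprinkler", "Rain"),
                  {(True, True): 0.99, (True, False): 0.90,
                   (False, True): 0.90, (False, False): 0.00}),
}

def prior_sample(net):
    x = {}
    for var, (parents, cpt) in net.items():
        x[var] = random.random() < cpt[tuple(x[q] for q in parents)]
    return x

def rejection_sampling(query, evidence, net, n=10000):
    counts = {True: 0, False: 0}
    for _ in range(n):
        s = prior_sample(net)
        if all(s[v] == val for v, val in evidence.items()):   # keep only agreeing samples
            counts[s[query]] += 1
    total = sum(counts.values()) or 1      # everything else is thrown away
    return {v: c / total for v, c in counts.items()}

print(rejection_sampling("Rain", {"Sprinkler": True}, NET))   # roughly {True: 0.3, False: 0.7}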
42
iii. Likelihood weighting
  • Idea:
  • fix evidence variables E,
  • sample only nonevidence variables X, Y
  • weight each sample by the likelihood it accords to
    the evidence E

43
iii. Likelihood weighting example
  • Estimate P(Rain | Sprinkler = true, WetGrass = true)

44
iii. Likelihood weighting example
  • Sample generation process:
  • w ← 1.0
  • Sample P(Cloudy) = <0.5, 0.5>; say true
  • Sprinkler has value true, so
  • w ← w × P(Sprinkler = true | Cloudy = true) = 0.1
  • Sample P(Rain | Cloudy = true) = <0.8, 0.2>; say true
  • WetGrass has value true, so
  • w ← w × P(WetGrass = true | Sprinkler = true, Rain = true)
    = 0.099
    (the full procedure is sketched below)

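A sketch of likelihood weighting for P(Rain | Sprinkler = true, WetGrass = true) that mirrors the weight updates worked out above; the network representation and sample count are the same illustrative choices as before.

import random

NET = {
    "Cloudy":    ((), {(): 0.5}),
    "Sprinkler": (("Cloudy",), {(True,): 0.1, (False,): 0.5}),
    "Rain":      (("Cloudy",), {(True,): 0.8, (False,): 0.2}),
    "WetGrass":  (("Sprinkler", "Rain"),
                  {(True, True): 0.99, (True, False): 0.90,
                   (False, True): 0.90, (False, False): 0.00}),
}

def weighted_sample(net, evidence):
    """Fix evidence variables, sample the rest, weight by the evidence likelihood."""
    w, x = 1.0, {}
    for var, (parents, cpt) in net.items():
        p_true = cpt[tuple(x[q] for q in parents)]
        if var in evidence:
            x[var] = evidence[var]
            w *= p_true if evidence[var] else 1.0 - p_true
        else:
            x[var] = random.random() < p_true
    return x, w

def likelihood_weighting(query, evidence, net, n=10000):
    weight = {True: 0.0, False: 0.0}
    for _ in range(n):
        x, w = weighted_sample(net, evidence)
        weight[x[query]] += w
    total = sum(weight.values())
    return {v: w / total for v, w in weight.items()}

print(likelihood_weighting("Rain", {"Sprinkler": True, "WetGrass": True}, NET))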
45
iii. Likelihood weighting function
46
iii. Likelihood weighting analysis
  • Sampling probability for WEIGHTED-SAMPLE is
  • S_WS(y, e) = ∏i=1..l P(yi | Parents(Yi))
  • Note: pays attention to evidence in ancestors
    only ⇒ somewhere in between prior and posterior
    distribution
  • Weight for a given sample y, e is
  • w(y, e) = ∏i=1..m P(ei | Parents(Ei))
  • Weighted sampling probability is
  • S_WS(y, e) w(y, e)
    = ∏i=1..l P(yi | Parents(Yi)) × ∏i=1..m P(ei | Parents(Ei))
    = P(y, e) by standard global semantics of network
  • Hence, likelihood weighting is consistent
  • But performance still degrades w. many evidence
    variables

47
iv. MCMC inference
  • State of network = current assignment to all
    variables
  • Generate next state by sampling one variable
    given its Markov blanket
  • Sample each variable in turn, keeping evidence
    fixed
  • Approaches stationary distribution: long-run
    fraction of time spent in each state is exactly
    proportional to its posterior probability
    (a Gibbs-sampling sketch follows below)

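A sketch of MCMC inference as Gibbs sampling on the sprinkler network: each step resamples one nonevidence variable from its distribution given its Markov blanket, and the long-run state counts estimate the posterior. Representation, CPT values and sample count are the same illustrative choices as in the earlier sketches.

import random

NET = {
    "Cloudy":    ((), {(): 0.5}),
    "Sprinkler": (("Cloudy",), {(True,): 0.1, (False,): 0.5}),
    "Rain":      (("Cloudy",), {(True,): 0.8, (False,): 0.2}),
    "WetGrass":  (("Sprinkler", "Rain"),
                  {(True, True): 0.99, (True, False): 0.90,
                   (False, True): 0.90, (False, False): 0.00}),
}

def p(var, value, state):
    parents, cpt = NET[var]
    p_true = cpt[tuple(state[q] for q in parents)]
    return p_true if value else 1.0 - p_true

def sample_given_markov_blanket(var, state):
    """P(var | mb(var)) is proportional to P(var | parents) * prod over children of P(child | its parents)."""
    children = [c for c, (ps, _) in NET.items() if var in ps]
    weight = {}
    for v in (True, False):
        s = {**state, var: v}
        w = p(var, v, s)
        for c in children:
            w *= p(c, s[c], s)
        weight[v] = w
    return random.random() < weight[True] / (weight[True] + weight[False])

def gibbs_ask(query, evidence, n=20000):
    nonevidence = [v for v in NET if v not in evidence]
    state = {**evidence, **{v: random.random() < 0.5 for v in nonevidence}}
    counts = {True: 0, False: 0}
    for _ in range(n):
        for v in nonevidence:              # resample each nonevidence variable in turn
            state[v] = sample_given_markov_blanket(v, state)
        counts[state[query]] += 1          # long-run fraction estimates the posterior
    return {v: c / n for v, c in counts.items()}

print(gibbs_ask("Rain", {"Sprinkler": True, "WetGrass": True}))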
48
Markov blanket - reminder
  • Each node is conditionally independent of all
    others given its Markov blanket: parents +
    children + children's parents
    (see the sketch below)

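A small sketch that computes a node's Markov blanket from the parent lists of a network; the parents_of dict for the sprinkler network is the illustrative example.

def markov_blanket(var, parents_of):
    """Parents, children, and the children's other parents of var."""
    children = [c for c, ps in parents_of.items() if var in ps]
    blanket = set(parents_of[var]) | set(children)
    for c in children:
        blanket |= set(parents_of[c])
    blanket.discard(var)
    return blanket

parents_of = {"Cloudy": [], "Sprinkler": ["Cloudy"], "Rain": ["Cloudy"],
              "WetGrass": ["Sprinkler", "Rain"]}
print(markov_blanket("Rain", parents_of))   # {'Cloudy', 'Sprinkler', 'WetGrass'}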
49
MCMC algorithm
50
Homework 7
  • Continue with your project up to step 8.