1
Bayesian Optimization Algorithm, Decision Graphs,
and Occam's Razor
  • Martin Pelikan, David E. Goldberg,
  • and Kumara Sastry
  • IlliGAL Report No. 2000020
  • May 2000.

2
Abstract
  • The use of various scoring metrics for Bayesian
    networks.
  • The use of decision graphs in Bayesian networks
    to improve the performance of the BOA.
  • BDe metric for Bayesian networks with decision
    graphs.

3
Bayesian Networks
  • Two basic components in Bayesian networks
  • A scoring metric that discriminates between
    networks
  • A search algorithm that finds the network with
    the best scoring-metric value
  • BOA (in previous works)
  • The complexity of the considered models was
    bounded by the maximum number of incoming edges
    into any node.
  • To search the space of networks, a simple greedy
    algorithm was used due to its efficiency.

4
Bayesian-Dirichlet Metric
  • BDe metric combines the prior knowledge about the
    problem and the statistical data from a given
    data set.
  • Bayes' theorem: p(B|D) = p(B) p(D|B) / p(D)
  • The higher p(B|D) is, the more likely the
    network B is a correct model of the data.
  • p(B|D) is the Bayesian scoring metric, or the
    posterior probability.
  • Moreover, the data set D is fixed, so p(D) is
    the same for every network being compared.
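
Because D is fixed, p(D) is the same for every candidate network, so ranking networks only needs p(B) p(D|B). A minimal sketch of that comparison in log space follows; the two candidate networks and their prior and likelihood values are made-up numbers for illustration, not results from the report.

```python
import math

def log_posterior_score(log_prior, log_likelihood):
    """Unnormalized log posterior: log p(B) + log p(D|B).

    log p(D) is dropped because it is identical for every network
    scored against the same data set D.
    """
    return log_prior + log_likelihood

# Hypothetical candidates: name -> (log p(B), log p(D|B)).
candidates = {
    "B1": (math.log(0.5), -120.0),
    "B2": (math.log(0.5), -118.5),
}

scores = {name: log_posterior_score(lp, ll) for name, (lp, ll) in candidates.items()}
best = max(scores, key=scores.get)
print(best)
```

With equal priors the ranking is decided by the likelihood alone, which is why the prior p(B) is the only place to express a preference for simpler networks.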

5
Bayesian-Dirichlet Metric
  • p(B): prior probability of the network B
  • BDe metric gives preference to simpler networks
  • But it's not enough!

6
Bayesian-Dirichlet Metric
  • p(B|D)
  • Data is a multinomial sample
  • Parameters are independent
  • The parameters associated with each variable are
    independent (global parameter independence)
  • The parameters associated with each instance of
    the parents of a variable are independent (local
    parameter independence)
  • Dirichlet distribution
  • No missing data (complete data)

7
Bayesian-Dirichlet Metric
  • Often referred to as the K2 metric
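
The formula on this slide did not survive in the transcript. As a rough sketch, the K2 metric is usually stated as the Bayesian-Dirichlet likelihood with all prior counts set to 1, which gives one factor per variable and per observed parent configuration. The function names, the data layout (a list of tuples of small integers), and the toy data below are assumptions made for illustration.

```python
from collections import Counter
from math import lgamma

def k2_log_score_node(data, node, parents, arity=2):
    """Log K2 term for one variable given its parents.

    Assumes every variable takes values in range(arity) and that all
    Dirichlet prior counts equal 1 (the K2 choice).
    """
    joint = Counter()     # counts of (parent configuration, node value)
    marginal = Counter()  # counts of parent configuration
    for row in data:
        pa = tuple(row[p] for p in parents)
        joint[(pa, row[node])] += 1
        marginal[pa] += 1
    score = 0.0
    # Unobserved parent configurations contribute a factor of 1, so
    # iterating over the observed ones is enough.
    for pa, m_pa in marginal.items():
        score += lgamma(arity) - lgamma(arity + m_pa)
        for value in range(arity):
            score += lgamma(1 + joint[(pa, value)])
    return score

def k2_log_score(data, parent_sets, arity=2):
    """Whole-network score: the sum of the per-variable terms."""
    return sum(k2_log_score_node(data, i, ps, arity)
               for i, ps in enumerate(parent_sets))

# Toy data in which x1 always copies x0.
data = [(0, 0), (1, 1), (0, 0), (1, 1)]
no_parent = k2_log_score_node(data, 1, [])
with_parent = k2_log_score_node(data, 1, [0])
print(no_parent, with_parent)
```

On this toy data the term for x1 is higher when x0 is its parent, which is the behaviour a scoring metric needs in order to reward a dependency that the data supports.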

8
Minimum Description Length Metric
  • Not good for using prior information

9
Constructing a Network
  • Constructing the best network is NP-complete.
  • Most of the commonly used metrics can be
    decomposed into independent terms each of which
    corresponds to one variable.
  • Empirical results show that more sophisticated
    search algorithms do not improve the obtained
    result significantly.
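
The decomposability point can be sketched as a greedy edge-addition search: since the score is a sum of one term per variable, adding an edge into a node only requires recomputing that node's term. The code below is an illustrative sketch, not the report's implementation; it uses a K2-style term as a stand-in for any decomposable metric, and the names `node_score`, `creates_cycle`, `greedy_network`, and the `max_parents` bound are assumptions.

```python
from collections import Counter
from math import lgamma

def node_score(data, node, parents, arity=2):
    """Log K2-style term for one variable (any decomposable metric would do)."""
    joint, marginal = Counter(), Counter()
    for row in data:
        pa = tuple(row[p] for p in parents)
        joint[(pa, row[node])] += 1
        marginal[pa] += 1
    score = 0.0
    for pa, m_pa in marginal.items():
        score += lgamma(arity) - lgamma(arity + m_pa)
        score += sum(lgamma(1 + joint[(pa, v)]) for v in range(arity))
    return score

def creates_cycle(parents, src, dst):
    """True if adding src -> dst closes a cycle, i.e. dst is src or an ancestor of src."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def greedy_network(data, n_vars, max_parents=2):
    """Repeatedly add the single edge with the largest score gain."""
    parents = [[] for _ in range(n_vars)]
    scores = [node_score(data, i, parents[i]) for i in range(n_vars)]
    while True:
        best_gain, best_edge = 0.0, None
        for dst in range(n_vars):
            if len(parents[dst]) >= max_parents:
                continue
            for src in range(n_vars):
                if src in parents[dst] or creates_cycle(parents, src, dst):
                    continue
                # Decomposability: only the term of dst changes.
                gain = node_score(data, dst, parents[dst] + [src]) - scores[dst]
                if gain > best_gain:
                    best_gain, best_edge = gain, (src, dst)
        if best_edge is None:
            return parents
        src, dst = best_edge
        parents[dst].append(src)
        scores[dst] += best_gain

# Toy data: x0 and x1 are always equal, x2 is independent of both.
data = [(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)] * 5
result = greedy_network(data, 3)
print(result)
```

On this toy data the search links x0 and x1 with a single edge and leaves x2 without parents, because every further edge lowers the score.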

10
Decision Graphs in Bayesian Networks
  • The use of local structures such as decision
    trees, decision graphs, and default tables to
    represent equalities among parameters has been
    proposed.
  • The network construction algorithm takes
    advantage of decision graphs by directly
    manipulating the network structure through the
    graphs.

11
Decision Graphs
  • A decision graph is an extension of a decision
    tree in which each non-root node can have
    multiple parents.
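
A small sketch of such a structure may help: internal nodes test one variable, leaves stand for one shared set of parameters, and a leaf may be reachable from more than one internal node. The class name `GraphNode`, the helper `find_leaf`, and the example graph for p(x0 | x1, x2) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass(eq=False)
class GraphNode:
    """Internal node if `variable` is set, leaf if it is None."""
    variable: Optional[int] = None
    children: Dict[int, "GraphNode"] = field(default_factory=dict)

    def is_leaf(self):
        return self.variable is None

def find_leaf(root, assignment):
    """Follow the tested variables' values down to the leaf for `assignment`."""
    node = root
    while not node.is_leaf():
        node = node.children[assignment[node.variable]]
    return node

# Decision graph for p(x0 | x1, x2). The leaf `shared` has two parents
# (root and inner), which a decision tree would not allow.
shared = GraphNode()
other = GraphNode()
inner = GraphNode(variable=2, children={0: shared, 1: other})
root = GraphNode(variable=1, children={0: shared, 1: inner})

print(find_leaf(root, {1: 1, 2: 0}) is shared)
```

Here two leaves cover all four configurations of (x1, x2), so the conditional distribution of x0 needs two parameter sets instead of the four a full table would use.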

12
Advantages of Decision Graph
  • Far fewer parents can be used to represent a
    model
  • A more complex class of models, called
    Bayesian multinets, can be learned
  • The construction performs smaller and more
    specific steps, which results in better models
    with respect to their likelihood.
  • A network complexity measure can be incorporated
    into the scoring metric

13
Bayesian Score for Networks with Decision Graphs

14
Operators on Decision Graphs
  • split
  • merge

15
Constructing BN with DG
  1. Initialize a decision graph G_i for each node
    x_i to a graph containing only a single leaf.
  2. Initialize the network B to an empty network.
  3. Choose the best split or merge that does not
    result in a cycle in B.
  4. If the best operator does not improve the score,
    finish.

16
Constructing BN with DG
  5. Execute the chosen operator.
  6. If the operator was a split, update the network B
    by adding a new edge.
  7. Go to (3).
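
The loop above can be sketched in a simplified form. The sketch below is an assumption-heavy illustration rather than the report's algorithm: it handles binary variables only, implements the split operator but omits merge (so every graph stays a tree), represents a leaf as a dictionary of the variable values on its path, and scores leaves with a K2-style term. The names `leaf_score`, `is_ancestor`, and `build_network` are hypothetical.

```python
from math import lgamma

def leaf_score(data, target, constraints):
    """Log K2-style term of one leaf: rows matching `constraints`, binary target."""
    rows = [r for r in data if all(r[v] == val for v, val in constraints.items())]
    ones = sum(r[target] for r in rows)
    zeros = len(rows) - ones
    return lgamma(2) - lgamma(2 + len(rows)) + lgamma(1 + zeros) + lgamma(1 + ones)

def is_ancestor(parents, a, b):
    """True if a == b or a is an ancestor of b in the network."""
    stack, seen = [b], set()
    while stack:
        node = stack.pop()
        if node == a:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def build_network(data, n_vars):
    graphs = [[{}] for _ in range(n_vars)]    # step 1: one leaf per G_i
    parents = [set() for _ in range(n_vars)]  # step 2: empty network B
    while True:
        best_gain, best_op = 0.0, None
        for i in range(n_vars):               # step 3: best admissible split
            for leaf in graphs[i]:
                base = leaf_score(data, i, leaf)
                for j in range(n_vars):
                    if j in leaf:
                        continue
                    # A new edge j -> i is a cycle if i is j or an ancestor of j.
                    if j not in parents[i] and is_ancestor(parents, i, j):
                        continue
                    gain = sum(leaf_score(data, i, {**leaf, j: v}) for v in (0, 1)) - base
                    if gain > best_gain:
                        best_gain, best_op = gain, (i, leaf, j)
        if best_op is None:                   # step 4: no improvement, finish
            return graphs, parents
        i, leaf, j = best_op
        graphs[i].remove(leaf)                # step 5: execute the split
        graphs[i] += [{**leaf, j: 0}, {**leaf, j: 1}]
        parents[i].add(j)                     # step 6: add edge j -> i to B

# Toy data: x0 and x1 are always equal, x2 is independent of both.
data = [(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)] * 5
graphs, parents = build_network(data, 3)
print(parents)
```

A merge operator would join two leaves of the same graph into one and would be scored the same way, by comparing the merged leaf's term with the sum of the two terms it replaces.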

17
Experiments
  • One-max
  • 3-deceptive
  • Spin-glass
  • Graph bisection
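
For reference, the first two test problems are easy to write down. The sketch below uses the definitions commonly given for them in the BOA literature: one-max counts the ones, and the 3-deceptive function sums a trap over groups of three bits, with values 0.9, 0.8, 0, and 1 for 0, 1, 2, and 3 ones in a group. Treating consecutive bits as one group is an assumption of this sketch; the transcript does not state the partitioning.

```python
def onemax(bits):
    """One-max: the number of ones in the string."""
    return sum(bits)

# Trap value by number of ones in a 3-bit group.
TRAP3 = {0: 0.9, 1: 0.8, 2: 0.0, 3: 1.0}

def deceptive3(bits):
    """3-deceptive: sum of the trap over consecutive groups of three bits."""
    if len(bits) % 3 != 0:
        raise ValueError("string length must be a multiple of 3")
    return sum(TRAP3[sum(bits[i:i + 3])] for i in range(0, len(bits), 3))

print(onemax([1, 0, 1, 1]), deceptive3([1, 1, 1, 0, 0, 0]))
```

The trap rewards moving toward all zeros within a group unless all three bits are already one, which is what makes the function misleading for algorithms that treat bits independently.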