Two Approaches to Bayesian Network Structure Learning

1
Two Approaches to Bayesian Network Structure Learning
  • Goal: Compare an algorithm that learns a BN tree structure (TAN) with
    an algorithm that learns a constraint-free structure (Build-BN).
  • Problem Definition:
  • Finding an exact BN structure for complete discrete data.
  • Known to be NP-hard.
  • A maximization problem over a defined score.
  • Build-BN Algorithm
  • Algorithm's Attributes:
  • No structural constraints.
  • Straightforward approach that does not avoid any computation.
  • Feasible only for small networks (< 30 variables).
  • Crucial facts at the core of the algorithm:
  • There are scoring functions that are decomposable into local scores
    (we used BIC for the algorithm).
  • Every DAG has at least one node with no outgoing arcs (a sink).
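
The decomposability fact can be illustrated with a small sketch. It assumes complete discrete data stored as a list of dicts and a DAG stored as a child-to-parents mapping; the function names `local_bic` and `bic` and the choice to read variable cardinalities off the observed values are assumptions made for illustration, not the presenters' implementation.

```python
import math
from collections import Counter

def local_bic(data, child, parents):
    """Local BIC of `child` given a tuple of `parents` (hypothetical helper).

    data: list of dicts mapping variable name -> discrete value (no missing values).
    """
    n = len(data)
    # Counts of (parent configuration, child value) and of the parent configuration alone.
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marginal = Counter(tuple(row[p] for p in parents) for row in data)
    # Maximized log-likelihood: sum over cells of N_ijk * log(N_ijk / N_ij).
    loglik = sum(c * math.log(c / marginal[cfg]) for (cfg, _), c in joint.items())
    # Free parameters: (r - 1) per parent configuration.
    # Assumption: cardinalities are taken from the values observed in the data.
    r = len({row[child] for row in data})
    q = 1
    for p in parents:
        q *= len({row[p] for row in data})
    return loglik - 0.5 * math.log(n) * (r - 1) * q

def bic(data, dag):
    """BIC of a whole DAG {child: tuple_of_parents}: just the sum of local scores."""
    return sum(local_bic(data, x, ps) for x, ps in dag.items())

# Toy data in which B is an exact copy of A.
rows = [{"A": i % 2, "B": i % 2} for i in range(8)]
print(bic(rows, {"A": (), "B": ("A",)}), bic(rows, {"A": (), "B": ()}))
```

Because the total score is a sum over variables, each (variable, parent set) pair can be scored once and reused by every network that contains it.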

Yael Kinderman, Tali Goren
Implementation Note: Build-BN requires a lot of memory. Therefore, the
implementation makes heavy use of the file system.
2
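Algorithm Flow
  • Step I: Find Local Scores
  • For every $x \in V$ and every var-set $vs \subseteq V \setminus \{x\}$
    ($V$ is the set of all variables), calculate the local BIC score of $x$
    with parent set $vs$.

All in all, $n \cdot 2^{n-1}$ scores are calculated in this step.
Step II: Find Best Parents
For every $x \in V$ and every var-set $vs \subseteq V \setminus \{x\}$, find
the best parents of $x$ in the var-set. Traversing the var-sets in
lexicographic order (smaller to larger) results in a time complexity of
$O((n-1) 2^{n-1})$.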
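
Step II can be sketched as a small dynamic program over subsets. This is a minimal illustration under stated assumptions: `local_score` is a hypothetical callable standing in for the scores stored in Step I, var-sets are represented as frozensets, and the toy score table is invented for the example.

```python
from itertools import combinations

def best_parents(x, candidates, local_score):
    """For every subset C of `candidates`, find the best parent set of x inside C.

    local_score(x, frozenset_of_parents) -> float is assumed to be available
    from Step I. Returns {C: (best_score, best_parent_set)}.
    """
    best = {}
    # Visit subsets from smaller to larger so results for C minus one
    # variable already exist when C is processed.
    for size in range(len(candidates) + 1):
        for combo in combinations(sorted(candidates), size):
            c = frozenset(combo)
            # Option 1: use all of C as the parents of x.
            top = (local_score(x, c), c)
            # Option 2: inherit the best choice found for C minus one variable.
            for v in c:
                sub = best[c - {v}]
                if sub[0] > top[0]:
                    top = sub
            best[c] = top
    return best

# Invented toy scores for a variable "C" with candidate parents A and B.
toy = {
    frozenset(): -10.0,
    frozenset({"A"}): -5.0,
    frozenset({"B"}): -8.0,
    frozenset({"A", "B"}): -7.0,
}
table = best_parents("C", {"A", "B"}, lambda x, ps: toy[ps])
print(table[frozenset({"A", "B"})])
```

Each subset is compared with at most as many smaller subsets as it has elements, which is where the bounded amount of work per var-set comes from.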
3
Algorithm Flow (cont.)
  • Step III: Find Best Sinks
  • For each of the $2^n$ var-sets we find a best sink.
  • Let Sink(W) be the best sink of a var-set W.
    Then Sink(W) can be found by

    $$\mathrm{Sink}(W) = \arg\max_{s \in W} \left[ \mathrm{score}\big(s,\, g_s(W \setminus \{s\})\big) + \mathrm{score}\big(G(W \setminus \{s\})\big) \right]$$

  • where $g_s(\text{var-set})$ is the best set of parents for
    s in the var-set, and
  • $G(\text{var-set})$ is the highest-scoring
    network for a var-set.
  • We traverse the var-sets in lexicographic order and
    reuse scores that were calculated in previous
    iterations.
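
The best-sink recursion can be sketched as follows. This is an illustration only: `best_parent_score` is a hypothetical lookup for the Step II results (score of a variable with its best parents chosen inside a given var-set), the empty network is assumed to score 0, and the toy local scores are invented.

```python
from itertools import combinations

def subsets(items):
    """Yield all subsets of `items` as frozensets, smaller sets first."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def best_sinks(variables, best_parent_score):
    """Return (sink, net_score): the best sink and best network score per var-set.

    best_parent_score(s, U): score of s with its best parents chosen inside U
    (assumed to come from Step II).
    """
    net_score = {frozenset(): 0.0}  # assumption: the empty network scores 0
    sink = {}
    for w in subsets(variables):
        for s in sorted(w):
            rest = w - {s}  # smaller set, already processed
            total = best_parent_score(s, rest) + net_score[rest]
            if w not in sink or total > net_score[w]:
                net_score[w] = total
                sink[w] = s
    return sink, net_score

# Invented toy local scores for two variables.
local = {
    ("A", frozenset()): -3.0, ("A", frozenset({"B"})): -1.0,
    ("B", frozenset()): -3.0, ("B", frozenset({"A"})): -2.0,
}

def toy_best_parent_score(s, u):
    # Brute force over parent sets inside u; fine for a toy example.
    return max(local[(s, p)] for p in subsets(u))

sink, net_score = best_sinks({"A", "B"}, toy_best_parent_score)
print(sink[frozenset({"A", "B"})], net_score[frozenset({"A", "B"})])
```

In the toy example, making A the sink (with B as its parent) gives a total of -4, which beats the -5 obtained with B as the sink.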

4
Algorithm Flow (cont.)
Step IV: Find Best Order. Best sinks immediately
yield the best ordering (in reverse order):

$$\mathrm{ord}_i(V) = \mathrm{Sink}\left( V \setminus \bigcup_{j=i+1}^{|V|} \{\mathrm{ord}_j(V)\} \right)$$

  • Step V: Find Best Network
  • Having the best order ($\mathrm{ord}_i(V)$) and the best parents
    ($g(W)$) for each $W \subseteq V$, we can find the network as follows:
    the i-th variable in the optimal ordering picks its best parents from
    the var-set that contains all the variables that precede it in the
    ordering.
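
Steps IV and V can be sketched in a few lines. The sketch assumes the earlier steps produced a `sink` table (var-set to best sink) and a `best_parents` lookup (variable and var-set to best parent set); both names and the toy tables are hypothetical.

```python
def best_order(variables, sink):
    """Peel off best sinks: the sink of V comes last, the sink of the rest before it, etc."""
    remaining = frozenset(variables)
    reversed_order = []
    while remaining:
        s = sink[remaining]
        reversed_order.append(s)
        remaining = remaining - {s}
    return reversed_order[::-1]

def best_network(order, best_parents):
    """Each variable picks its best parents among its predecessors in the order."""
    network = {}
    for i, x in enumerate(order):
        network[x] = best_parents(x, frozenset(order[:i]))
    return network

# Invented toy tables for two variables: A is the best sink of {A, B}.
toy_sink = {
    frozenset({"A", "B"}): "A",
    frozenset({"A"}): "A",
    frozenset({"B"}): "B",
}

def toy_best_parents(x, allowed):
    # In this toy, A prefers every allowed variable as a parent; B prefers none.
    return allowed if x == "A" else frozenset()

order = best_order({"A", "B"}, toy_sink)
print(order, best_network(order, toy_best_parents))
```

Since every variable draws its parents only from its predecessors, the resulting graph is acyclic by construction.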
5
Using the BN for Prediction
  • 5-fold cross-validation: 80% of the data is used for
    building the structure and CPDs, 20% for
    label prediction.
  • The label C of a given sample is predicted
    using the learned structure and CPDs.
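
One standard prediction rule, assumed here for illustration, is to pick the label value that maximizes the joint probability of the completed sample under the learned network (equivalent to maximizing the posterior of the label given the other variables). The data layout, the function name `predict_label`, and the toy CPDs are hypothetical.

```python
import math

def predict_label(sample, label, label_values, parents, cpds):
    """Return the label value with the highest joint log-probability.

    sample: {var: value} for all variables except `label`.
    parents: {var: tuple_of_parent_names} (the learned structure).
    cpds: {var: {(parent_values_tuple, value): probability}}.
    Assumption: every needed CPD entry is strictly positive (e.g. smoothed).
    """
    def log_joint(row):
        return sum(
            math.log(cpds[v][(tuple(row[p] for p in parents[v]), row[v])])
            for v in parents
        )
    return max(label_values, key=lambda c: log_joint({**sample, label: c}))

# Invented toy network C -> X with binary variables.
toy_parents = {"C": (), "X": ("C",)}
toy_cpds = {
    "C": {((), 0): 0.5, ((), 1): 0.5},
    "X": {((0,), 0): 0.9, ((0,), 1): 0.1, ((1,), 0): 0.2, ((1,), 1): 0.8},
}
print(predict_label({"X": 1}, "C", [0, 1], toy_parents, toy_cpds))
```

Working in log space avoids numerical underflow when the network has many variables.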

6
Test on the Famous Student Model
  • Testing our implementation on synthetic data.
  • We simulated 300 samples according to the BN and
    the CPDs as presented in class.
  • Prediction was performed using TAN and Build-BN.

[Figures: Build-BN result; TAN result]
Note: In Build-BN, 4 out of the 5 cross-validation
folds gave the above net.
7
Experimental Results
Data taken from the UCI machine learning database.
  • Possible explanations for the last 2 results:
  • Zoo: only 101 instances.
  • Vehicle: what's wrong with this data?!
  • Note the low in-degrees (models induced by
    data sets are by nature close to trees).

8
Example I: Corral. Build-BN does not force
irrelevant variables to be linked into the BN.
[Figure: Build-BN result]
9
Example II: Tic-Tac-Toe. Having no constraints on the
structure enables better prediction.
References: Tomi Silander, Petri Myllymäki (HIIT).
A Simple Approach for Finding the Globally
Optimal Bayesian Network Structure.