Two Approaches to Bayesian Network Structure Learning

Slides: 10

Provided by: Imag89

Transcript and Presenter's Notes

1
Two Approaches to Bayesian Network Structure Learning

Goal: Compare an algorithm that learns a BN tree structure (TAN) with an algorithm that learns a constraint-free structure (Build-BN).
Problem Definition:
Finding an exact BN structure for complete discrete data.
Known to be NP-hard.
A maximization problem over a defined score.
Build-BN Algorithm
Algorithm's Attributes:
No structural constraints.
Straightforward approach that does not avoid any computation.
Feasible only for small networks (< 30 variables).
Crucial facts at the core of the algorithm:
There are scoring functions that decompose into local scores (we used BIC for the algorithm).
Every DAG has at least one node with no outgoing arcs (a sink).
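
To make the decomposability point concrete, here is a minimal sketch of a local BIC score for discrete data. The function names (`local_bic`, `network_bic`) and the data layout (a list of tuples, one column per variable) are assumptions made for illustration, and variable arities are estimated from the values observed in the data; the slides do not show how the authors computed the score.

```python
import math
from collections import Counter

def local_bic(data, child, parents):
    """BIC contribution of column `child` given the columns in `parents`.

    data: list of tuples of discrete values (hypothetical layout).
    """
    n = len(data)
    parents = sorted(parents)
    # Counts of (parent configuration, child value) and of parent configuration.
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    # Maximized log-likelihood of the child given its parents.
    loglik = sum(c * math.log(c / parent_counts[pa]) for (pa, _), c in joint.items())
    # Free parameters: (child arity - 1) * product of parent arities.
    r_child = len({row[child] for row in data})
    q = 1
    for p in parents:
        q *= len({row[p] for row in data})
    return loglik - 0.5 * math.log(n) * (r_child - 1) * q

def network_bic(data, parent_sets):
    """Score of a whole network: the sum of the local scores."""
    return sum(local_bic(data, x, ps) for x, ps in parent_sets.items())
```

Because the network score is a plain sum over variables, each (variable, parent set) pair can be scored once and reused, which is what the rest of the algorithm relies on.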

Yael Kinderman, Tali Goren
Implementation Note: Build-BN requires a lot of memory. Therefore, the implementation relies heavily on the file system.
2
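Algorithm's Flow

Step I: Find Local Scores
For every $x \in V$ and every var-set $vs \subseteq V \setminus \{x\}$ ($V$ is the set of all variables), calculate the local BIC score of $x$ given $vs$.
All in all, $n 2^{n-1}$ scores are calculated in this step.
Step II: Find Best Parents
For every $x \in V$ and every var-set $vs \subseteq V \setminus \{x\}$, find the best parents of $x$ in the var-set.
Traversing the var-sets in lexicographic order (smaller to larger) results in a time complexity of $O((n-1) 2^{n-1})$.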
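
Steps I and II can be sketched as a small dynamic program over var-sets. This is an illustration under assumptions, not the authors' code: `best_parents_for` and `local_score` are hypothetical names, var-sets are represented as frozensets, and var-sets are visited by increasing size, which gives the same guarantee as the smaller-to-larger traversal (every subset is solved before its supersets).

```python
from itertools import combinations

def best_parents_for(x, variables, local_score):
    """For each var-set not containing x, find x's best parents inside it.

    local_score(x, parents) is any decomposable local score (for example BIC).
    Returns a dict: var-set -> (best score, best parent set).
    """
    others = [v for v in variables if v != x]
    best = {}
    for size in range(len(others) + 1):
        for combo in combinations(others, size):
            vs = frozenset(combo)
            # Option 1: use the whole var-set as the parent set.
            cand = (local_score(x, vs), vs)
            # Option 2: reuse the best answer of a var-set with one variable removed.
            for v in vs:
                sub = best[vs - {v}]
                if sub[0] > cand[0]:
                    cand = sub
            best[vs] = cand
    return best
```

The returned table has $2^{n-1}$ entries per variable, which is where the memory pressure mentioned in the implementation note comes from.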
3
Algorithm's Flow (cont.)

Step III: Find Best Sinks
For each of the $2^n$ var-sets we find a best sink.
Let Sink(W) be the best sink of a var-set W.
Then Sink(W) can be found by

$\mathrm{Sink}(W) = \arg\max_{s \in W} \left[ \mathrm{score}(s, g_s(W \setminus \{s\})) + \mathrm{score}(G(W \setminus \{s\})) \right]$

where $g_s$(var-set) is the best set of parents for s in the var-set, and G(var-set) is the highest-scoring network for a var-set.
We traverse the var-sets in lexicographic order and use scores that were calculated in previous iterations.
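
The best-sink recursion can be sketched as follows. The names `best_sinks` and `best_parent_score` are hypothetical; `best_parent_score(s, U)` is assumed to return the best local score of `s` when its parents are chosen from `U` (the output of Step II). Var-sets are visited by increasing size so that the score of every smaller var-set is already available.

```python
from itertools import combinations

def best_sinks(variables, best_parent_score):
    """Return (sink, total): the best sink and best network score per var-set."""
    total = {frozenset(): 0.0}  # score of the best network over each var-set
    sink = {}
    for size in range(1, len(variables) + 1):
        for combo in combinations(variables, size):
            w = frozenset(combo)
            # Try each s in W as the sink: its best local score with parents
            # from W \ {s}, plus the best network over the remaining variables.
            scored = [
                (best_parent_score(s, w - {s}) + total[w - {s}], s)
                for s in sorted(w)
            ]
            total[w], sink[w] = max(scored)
    return sink, total
```

Ties between candidate sinks are broken arbitrarily here (by the ordering of the variable names); any tied choice yields a network with the same score.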

4
Algorithm's Flow (cont.)
Step IV: Find Best Order
Best sinks immediately yield the best ordering (in reverse order):

$\mathrm{ord}_i(V) = \mathrm{Sink}\left( V \setminus \bigcup_{j=i+1}^{|V|} \{ \mathrm{ord}_j(V) \} \right)$

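Step V: Find Best Network
Having the best order ($\mathrm{ord}_i(V)$) and the best parents ($g(W)$) for each $W \subseteq V$, we can find the network as follows:

$\mathrm{Parents}(\mathrm{ord}_i(V)) = g_{\mathrm{ord}_i(V)}\left( \bigcup_{j=1}^{i-1} \{ \mathrm{ord}_j(V) \} \right)$

In other words, the i-th variable in the optimal ordering picks its best parents from the var-set that contains all the variables that precede it in the ordering.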
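
Steps IV and V amount to peeling sinks off the full variable set and then letting each variable choose parents among its predecessors. The sketch below assumes a `sink` table keyed by frozenset (as produced in Step III) and a hypothetical `best_parents(x, U)` lookup returning the best parent set of `x` within `U`; both names are illustrative.

```python
def best_order(variables, sink):
    """Repeatedly remove the best sink; the removal order is the reverse ordering."""
    remaining = frozenset(variables)
    reversed_order = []
    while remaining:
        s = sink[remaining]
        reversed_order.append(s)
        remaining = remaining - {s}
    return reversed_order[::-1]

def best_network(order, best_parents):
    """Each variable picks its best parents among its predecessors in the order."""
    network = {}
    predecessors = frozenset()
    for x in order:
        network[x] = best_parents(x, predecessors)
        predecessors = predecessors | {x}
    return network
```

Since every variable only takes parents that come earlier in the ordering, the result is acyclic by construction.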
5
Using the BN for Prediction

5-fold cross validation: 80% of the data is used for building the structure and CPDs, 20% for label prediction.
Predicting the label C of a given sample is
done using
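
A common rule for this kind of prediction, assumed here rather than taken from the slides, is to choose the label value that maximizes the joint probability of the label and the observed attributes under the learned BN. The sketch below uses a hypothetical CPD layout (`cpds[var] = (parents, table)` with `table[parent_values][value]` giving a probability) and assumes all probabilities are strictly positive, for example after smoothing.

```python
import math

def predict_label(sample, label, label_values, cpds):
    """Return the label value with the highest joint probability.

    sample: dict var -> observed value (the label itself is absent).
    cpds:   dict var -> (parents tuple, table), hypothetical layout.
    """
    def log_joint(assignment):
        # The joint factorizes over the network: sum of log P(var | parents).
        return sum(
            math.log(table[tuple(assignment[p] for p in parents)][assignment[var]])
            for var, (parents, table) in cpds.items()
        )
    return max(label_values, key=lambda c: log_joint({**sample, label: c}))
```

Maximizing the joint is equivalent to maximizing the posterior of the label given the attributes, because the two differ only by a factor that does not depend on the label.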

6
Test on the Famous Student Model

Testing our implementation on synthetic data:
We simulated 300 samples according to the BN and the CPDs as presented in class.
Prediction was performed using TAN and Build-BN.

Build-BN result
TAN result
Note: In Build-BN, 4 out of the 5 cross-validation folds gave the above net.
7
Experimental Results
Data taken from UCI machine learning DB

Possible explanations for the last 2 results:
Zoo: only 101 instances.
Vehicle: what's wrong with this data?!
Note the low in-degrees (models induced by the data-sets are by nature close to trees).

8
Example I: Corral
Build-BN does not force irrelevant variables to be linked into the BN.
Build-BN result
9
Example II: Tic-Tac-Toe
Having no constraints on the structure enables better prediction.
References:
Tomi Silander, Petri Myllymaki, HIIT. A Simple Approach for Finding the Globally Optimal Bayesian Network Structure.
