1
Learning Bayesian Network Structure from Massive
Datasets: The Sparse Candidate Algorithm
  • Nir Friedman, Iftach Nachman, and Dana Pe'er
  • Presenter: Kyu-Baek Hwang

2
Abstract
  • Learning a Bayesian network can be viewed as
  • an optimization problem (in machine learning), or
  • a constraint satisfaction problem (in statistics).
  • The search space is extremely large.
  • The search procedure spends most of its time
    examining extremely unreasonable candidate
    structures.
  • If we can reduce the search space, faster learning
    becomes possible.
  • The approach: restrict the set of candidate parent
    variables for each variable.
  • Application domain: bioinformatics

3
Learning Bayesian Network Structures
  • As a constraint satisfaction problem
  • χ²-test of independence
  • As an optimization problem
  • Scoring metrics: BDe, MDL
  • Learning means finding the structure that maximizes
    these scores.
  • Search techniques
  • Finding an optimal structure is generally NP-hard.
  • Greedy hill-climbing, simulated annealing
  • O(n²) possible local changes to examine per step
  • If the number of examples and the number of
    attributes are large, the computational cost is
    too high to obtain a result in reasonable time.

4
Combining Statistical Properties
  • Most of the candidates considered during the
    search procedure can be eliminated in advance,
    based on our statistical understanding of the
    domain.
  • If X and Y are almost independent in the data, we
    might decide not to consider Y as a parent of X.
  • Measure: mutual information
  • Restrict the possible parents of each variable to a
    small candidate set of size k, with k << n − 1
    (a sketch follows this list).
  • The key idea is to use the network structure
    found at the last stage to find better candidate
    parents.
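
Since the bullets leave the restriction step abstract, here is a minimal
sketch of first-round, mutual-information-based candidate selection,
assuming discrete data in a 2-D NumPy array with one column per variable;
the function names are illustrative, not the authors' code.

    import numpy as np

    def mutual_information(x, y):
        """Empirical mutual information I(X;Y) of two discrete columns."""
        n = len(x)
        joint, px, py = {}, {}, {}
        for a, b in zip(x, y):
            joint[(a, b)] = joint.get((a, b), 0) + 1
            px[a] = px.get(a, 0) + 1
            py[b] = py.get(b, 0) + 1
        # I(X;Y) = sum over (a,b) of P(a,b) * log(P(a,b) / (P(a) * P(b)))
        return sum((c / n) * np.log(c * n / (px[a] * py[b]))
                   for (a, b), c in joint.items())

    def restrict_candidates(data, k):
        """First Restrict step: for each variable, keep the k other
        variables with the highest pairwise mutual information."""
        n_vars = data.shape[1]
        candidates = {}
        for i in range(n_vars):
            scored = sorted(((mutual_information(data[:, i], data[:, j]), j)
                             for j in range(n_vars) if j != i), reverse=True)
            candidates[i] = [j for _, j in scored[:k]]
        return candidates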

5
Background
  • A Bayesian network for X = {X1, X2, …, Xn}
  • B = <G, Θ>
  • The problem of learning a Bayesian network:
  • given a training set D = {x1, x2, …, xN},
  • find the B that best matches D.
  • Scoring metrics: BDe, MDL
  • Score(G : D) = Σi Score(Xi, Pa(Xi) : N_{Xi, Pa(Xi)})
  • Greedy hill-climbing search
  • At each step, all possible local changes are
    examined, and the change that brings the maximal
    gain in score is applied.
  • The calculation of sufficient statistics is the
    computational bottleneck (a sketch follows below).
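
To make the bottleneck concrete, this is a minimal sketch of collecting
the sufficient statistics N_{Xi, Pa(Xi)} for one family from complete
discrete data; the function name and data layout are assumptions, not the
paper's code.

    from collections import Counter

    def family_counts(data, child, parents):
        """Sufficient statistics for one family: how often each value of
        the child co-occurs with each configuration of its parents.
        Decomposable scores such as BDe and MDL are computed from these
        counts alone."""
        counts = Counter()
        for row in data:
            parent_config = tuple(row[p] for p in parents)
            counts[(row[child], parent_config)] += 1
        return counts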

6
Simple Intuitions
  • Using mutual information or correlation
  • If the true structure is X → Y → Z, then
    I(X;Z) > 0, I(Y;Z) > 0, I(X;Y) > 0, and
    I(X;Z | Y) = 0.
  • Basic idea of the Sparse Candidate algorithm
  • For each variable X, find a set of variables
    {Y1, Y2, …, Yk} that are the most promising
    candidate parents for X.
  • This gives us a smaller search space.
  • The main drawback of this idea:
  • A mistake in the initial stage can lead us to an
    inferior-scoring network.
  • The remedy: iterate the basic procedure, using the
    previously constructed network to reconsider the
    candidate parents.

7
Outline of the Sparse Candidate Algorithm
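The body of this slide was a figure that did not survive transcription.
Below is a minimal sketch of the loop as the paper describes it, with
restrict, maximize, and score passed in as assumed helpers (restrict
generalizes the first-iteration sketch shown earlier by also consulting
the current network), not the authors' actual implementations.

    def sparse_candidate(data, k, restrict, maximize, score):
        """Outer loop of the Sparse Candidate algorithm (sketch).

        restrict(data, B, k) -> candidate sets {i: C_i}, |C_i| <= k,
                                required to contain Pa_B(X_i);
        maximize(data, candidates) -> best network B found with
                                      Pa(X_i) a subset of C_i;
        score(B, data) -> decomposable network score (e.g. BDe or MDL).
        """
        B = None                      # start with the empty network
        prev = float("-inf")
        while True:
            candidates = restrict(data, B, k)   # Restrict step
            B = maximize(data, candidates)      # Maximize step
            s = score(B, data)
            if s <= prev:                       # score stopped improving
                return B
            prev = s
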
8
Convergence Properties of the Sparse Candidate
Algorithm
  • We require that, in the Restrict step, the selected
    candidates for Xi's parents include Xi's current
    parents:
  • Pa_Gn(Xi) ⊆ Ci^(n+1)
  • This requirement implies that the winning network
    Bn is still a legal structure in the (n+1)-th
    iteration, so
  • Score(Bn+1 : D) ≥ Score(Bn : D)
  • Stopping criterion:
  • Score(Bn) = Score(Bn−1)

9
Mutual Information
  • Mutual information (defined below)
  • Example:
  • I(A;C) > I(A;D) > I(A;B)
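
The formula on this slide was an image that did not survive transcription;
the standard definition, over the empirical distribution P̂ estimated from
the data, is:

    I(X;Y) = \sum_{x,y} \hat{P}(x,y) \log \frac{\hat{P}(x,y)}{\hat{P}(x)\,\hat{P}(y)}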

[Figure: an example network over the variables A, B, C, and D]
10
Discrepancy Test
  • The initial iteration uses mutual information;
    every later iteration uses a discrepancy measure
    (given below) instead.
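
The measure itself was an image on the slide; in the paper, the
discrepancy between Xi and Xj given the current network B is the KL
divergence between the empirical joint distribution and the joint
distribution that B predicts:

    M_{disc}(X_i, X_j \mid B) = D_{KL}\left( \hat{P}(X_i, X_j) \,\|\, P_B(X_i, X_j) \right)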

11
Other Tests
  • Conditional mutual information (given below)
  • Penalizing structures with more parameters
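
The formulas here were images as well; in the paper, the conditional
measure gauges how much information Xj adds about Xi beyond Xi's current
parents in B:

    M_{shield}(X_i, X_j \mid B) = I\left( X_i ; X_j \mid Pa_B(X_i) \right)

The parameter-penalizing variant instead scores Xj by the change in a
penalized score (such as MDL) when Xj is added to Xi's current parents.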

12
Learning with Small Candidate Sets
  • Standard heuristics
  • Unconstrained:
  • Space O(C(n, k))
  • Time O(n²)
  • Constrained by small candidate sets:
  • Space O(2^k)
  • Time O(k·n)
  • Divide-and-conquer heuristics (a sketch of the
    constrained move enumeration follows this list)
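
A minimal sketch of why the constraint shrinks the per-step work: with
candidate sets of size k, only O(k·n) edge additions need scoring instead
of O(n²). Names and the simplified move set (additions only, no
deletions, reversals, or acyclicity checks) are illustrative assumptions.

    def legal_moves(n_vars, candidates, current_parents):
        """Enumerate edge additions allowed by the candidate sets.
        candidates[i] lists the at-most-k candidate parents of X_i;
        current_parents[i] is the set of X_i's parents so far."""
        moves = []
        for i in range(n_vars):
            for j in candidates[i]:             # at most k per variable
                if j not in current_parents[i]:
                    moves.append(("add", j, i)) # add edge X_j -> X_i
        return moves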

13
Strongly Connected Components
  • Decomposing H into strongly connected components
    takes linear time (a sketch follows below).
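
A minimal sketch of the linear-time decomposition using Kosaraju's
algorithm (the slide relies only on the fact that such an algorithm
exists); adj is an assumed adjacency map from each node of H to its
successors, with every node present as a key.

    def strongly_connected_components(adj):
        """Kosaraju's algorithm: SCCs of a digraph in O(V + E) time."""
        order, seen = [], set()

        def dfs(u, graph, out):
            # Iterative post-order DFS: append u to out after its subtree.
            stack = [(u, iter(graph.get(u, ())))]
            seen.add(u)
            while stack:
                node, it = stack[-1]
                advanced = False
                for v in it:
                    if v not in seen:
                        seen.add(v)
                        stack.append((v, iter(graph.get(v, ()))))
                        advanced = True
                        break
                if not advanced:
                    stack.pop()
                    out.append(node)

        for u in adj:                       # pass 1: record finish order
            if u not in seen:
                dfs(u, adj, order)

        rev = {u: [] for u in adj}          # reverse every edge
        for u, vs in adj.items():
            for v in vs:
                rev[v].append(u)

        seen, components = set(), []
        for u in reversed(order):           # pass 2: on the reversed graph
            if u not in seen:
                comp = []
                dfs(u, rev, comp)
                components.append(comp)
        return components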

14
Separator Decomposition
  • The bottleneck is the separator S.
  • We can order the variables in S so as to disallow
    any cycle in H1 ∪ H2.

15
Experiments on Synthetic Data
16
Experiments on Real-Life Data
17
Conclusions
  • Sparse candidate sets enable us to search for a
    good structure efficiently.
  • A better candidate-selection criterion is necessary.
  • The authors applied these techniques to
    Spellman's cell-cycle data.
  • The exploitation of the network structure for the
    search in H needs to be improved.