Title: Model Averaging with Discrete Bayesian Network Classifiers
1. Model Averaging with Discrete Bayesian Network Classifiers
- Denver Dash and Gregory F. Cooper
- In the Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics (AISTATS 2003)
2. Contents
- Model averaging over a class of discrete Bayesian network classifiers
  - A partial ordering and a bounded in-degree k.
- Theoretical results (for N nodes)
  - The class contains super-exponentially many distinct structures.
  - The exact summation can nevertheless be performed in time polynomial in N (for fixed k).
  - Approximate averaging in O(N) time.
- Experiments
  - The technique can be beneficial even when the generating distribution is not a member of the class.
  - Characterize the performance over several parameters.
3. Bayesian network classifiers
- Naïve Bayes classifier
- General Bayesian network classifiers
[Figure: naïve Bayes structure — the class node C is the sole parent of the feature nodes F1, F2, ..., FN]
- Optimal under zero-one loss.
- Poor generalization performance could be improved by Bayesian model averaging, but the space of network structures is super-exponential.
[Figure: a general Bayesian network classifier — C and F1, F2, ..., FN connected by an arbitrary DAG]
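As a concrete baseline for the classifiers discussed above, here is a minimal sketch of a multinomial naïve Bayes classifier with Dirichlet/Laplace smoothing; the function names and data layout are mine, not from the paper:

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples, alpha=1.0):
    """Fit class priors and per-feature conditionals from (label, features)
    pairs, smoothing each count by the hyperparameter alpha."""
    class_counts = Counter()
    feat_counts = defaultdict(Counter)   # (feature index, class) -> value counts
    feat_values = defaultdict(set)       # feature index -> observed values
    for label, feats in examples:
        class_counts[label] += 1
        for i, v in enumerate(feats):
            feat_counts[(i, label)][v] += 1
            feat_values[i].add(v)
    return class_counts, feat_counts, feat_values, alpha

def predict(model, feats):
    """Return argmax over classes of log P(C) + sum_i log P(F_i | C)."""
    class_counts, feat_counts, feat_values, alpha = model
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for label, nc in class_counts.items():
        score = math.log(nc / total)
        for i, v in enumerate(feats):
            num = feat_counts[(i, label)][v] + alpha
            den = nc + alpha * len(feat_values[i])
            score += math.log(num / den)
        if score > best_score:
            best, best_score = label, score
    return best
```

This is the structure on the left of the slide: C alone conditions every feature, so the joint factorizes into one conditional per feature.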
4. In this paper
- Bayesian model averaging over a restricted class of Bayesian network classifiers
  - A partial order (π) and a bounded in-degree (k).
- Contributions
  - Apply the factorization of the conditionals to the task of classification.
  - Show that MA over this class can be approximated by a single network S, allowing calculation in O(N) time.
  - Empirical evaluation of the method compared with
    - a single naïve Bayes classifier,
    - a single Bayesian network learned by a greedy search,
    - exact MA over naïve Bayes classifiers.
5. Notation
- The classification problem
  - A set of features F = {F1, F2, ..., FN}.
  - Relabel X0 = C, X1 = F1, ..., XN = FN → X (the variables of the Bayesian network).
  - A set of classes C = {C1, C2, ..., CNC}.
  - A database D = {D1, D2, ..., DR}.
- A Bayesian network
  - G(X): a DAG structure.
  - Each Xi: a multinomial distribution.
  - Pi: the parents of Xi.
  - θijk: a parameter; θ: the full parameter set.
- Other assumptions: parameter independence, Dirichlet priors, complete data.
6. Fixed network structures
- With a fixed network structure, the parameters θ can be averaged out.
- Bayesian averaging over the parameters with conjugate priors has a closed form.
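That closed form is the standard result for multinomial networks with Dirichlet priors, sketched here in the usual counting notation (the index conventions are mine):

$$P(X = x \mid D, S) \;=\; \prod_{i=0}^{N} \frac{N_{ijk} + \alpha_{ijk}}{N_{ij} + \alpha_{ij}},$$

where, for each node Xi, j is the configuration of its parents Pi that x assigns, k is the value x assigns to Xi, N_ijk is the number of cases in D matching that (value, configuration) pair, N_ij = Σ_k N_ijk, and the α terms are the corresponding Dirichlet hyperparameters with α_ij = Σ_k α_ijk.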
7. Averaging with a fixed ordering (1)
- For a structural feature, e.g., the edge XL → XM:
  - the posterior probability P(XL → XM | D),
  - structure modularity (the structure prior decomposes over families),
  - the marginal likelihood (decomposable).
8. Averaging with a fixed ordering (2)
- Then, the posterior probability of a structural feature can be represented as,
9. Averaging with a fixed ordering (3)
- Enumerating the possible parent sets of Xi given a partial ordering
  - π = <X1, X3, X2, X4>, k = 2.
  - P20 = {}, P21 = {X1}, P22 = {X3}, P23 = {X1, X3}.
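The enumeration above can be sketched in a few lines; given a (total) ordering and the in-degree bound k, each node's candidate parent sets are just the subsets of its predecessors of size at most k:

```python
from itertools import combinations

def candidate_parent_sets(order, k):
    """For each variable, enumerate every parent set of size <= k drawn
    from its predecessors in the ordering."""
    sets = {}
    for i, x in enumerate(order):
        preds = order[:i]
        sets[x] = [frozenset(c)
                   for r in range(min(k, len(preds)) + 1)
                   for c in combinations(preds, r)]
    return sets

# Slide example: ordering <X1, X3, X2, X4> with k = 2 gives, for X2,
# exactly the four sets P20..P23 listed above.
ps = candidate_parent_sets(["X1", "X3", "X2", "X4"], 2)
```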
10. Averaging with a fixed ordering (4)
11. Averaging with a fixed ordering (5)
12. Averaging with a fixed ordering (6)
- Dynamic programming solution
- Finally,
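The key consequence of structure modularity plus a decomposable marginal likelihood is that, in the posterior of a local feature, the sums over every other node's parent sets cancel, leaving a ratio of sums local to one node. A hedged sketch (the `family_score` callback, returning the prior-weighted marginal likelihood of one family, is a hypothetical stand-in for the paper's local score):

```python
def edge_posterior(child, parent, parent_sets, family_score):
    """P(parent -> child | D), averaged over all structures consistent
    with the ordering and in-degree bound.  Decomposability reduces the
    super-exponential structure sum to two sums over the child's own
    candidate parent sets."""
    total = sum(family_score(child, P) for P in parent_sets[child])
    with_edge = sum(family_score(child, P)
                    for P in parent_sets[child] if parent in P)
    return with_edge / total
```

Computing `total` and `with_edge` for every node costs one pass over the enumerated parent sets, which is the polynomial-time summation the slides refer to.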
13. Model averaging for predictions
- The probability of a new example can be calculated similarly to the probability of a structural feature.
- The parameter value θijk is used in place of the Kronecker delta function.
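Concretely, once the Kronecker delta is replaced by the averaged parameter, the structure sum factorizes per node; one way to write the result (the weights w_i are my notation for the normalized local family scores, not the paper's):

$$P(x \mid D) \;=\; \prod_{i=0}^{N} \sum_{P_i} w_i(P_i)\,\hat{\theta}_{ijk}(P_i),$$

where the inner sum ranges over the candidate parent sets of Xi, and \hat{\theta}_{ijk}(P_i) is the posterior-expected parameter for the value and parent configuration that x assigns to Xi under P_i.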
14. Approximating the model averaging
- The time bound is still severe even for moderate cases (k = 3 or 4).
- One approximation:
  - Order the set of possible parents for Xi by a scoring function f(Xi, Pi | D) and prune them.
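A hedged sketch of that pruning step, under my own assumptions about its shape: score each predecessor individually with a hypothetical relevance function `f` (e.g. a local marginal-likelihood or mutual-information estimate), keep only the top few, and enumerate parent sets from that small pool:

```python
from itertools import combinations

def prune_parent_candidates(order, k, f, max_candidates):
    """Shrink each node's candidate-parent pool to its top-scoring
    predecessors before enumerating parent sets, reducing the per-node
    enumeration from O(N^k) sets toward a constant."""
    pruned = {}
    for i, x in enumerate(order):
        pool = sorted(order[:i], key=lambda p: f(x, p),
                      reverse=True)[:max_candidates]
        pruned[x] = [frozenset(c)
                     for r in range(min(k, len(pool)) + 1)
                     for c in combinations(pool, r)]
    return pruned
```

With a constant-size pool per node, the remaining per-node work is constant, which is how an O(N) overall approximation becomes possible.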
15. Experimental evaluation (1)
- Performance metric: d = (R1 - R2) / (T - R2)
- Synthetic data sets
- Comparisons between exact averaging and the approximation
16. Experimental evaluation (2)
- Approximate model averaging vs. greedy thick-thin search
17. Experimental evaluation (3)
- Synthetic data from the ALARM network
- AMA vs. GTT
18. Experimental evaluation (4)
- Real classification data sets from the UCI repository
19. Discussion
- Approximate model averaging outperforms a single BN classifier.
- Simplicity of the implementation.
- Future work
  - Find a better method for optimizing the ordering.
  - Applications to real-world problems.
  - Relax the assumption of complete data.