Title: Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines
1. Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines
- Yu Nishiyama and Sumio Watanabe
- Tokyo Institute of Technology, Japan
2. Background
Learning machines used in information systems:
- Pattern recognition: mixture models
- Natural language processing: hidden Markov models
- Gene analysis: Bayesian networks
Mathematically, these are singular statistical models, for which Bayes learning is effective.
3. Problem
Calculations which involve the Bayes posterior require a huge computational cost.
Mean field approximation: the Bayes posterior is approximated by a trial distribution.
Issues:
- Accuracy of the approximation (stochastic complexity)
- Difference from regular statistical models
- Model selection
4. The asymptotic behavior of mean field stochastic complexities has been studied for:
- Mixture models (K. Watanabe et al., 2004)
- Reduced rank regressions (Nakajima et al., 2005)
- Hidden Markov models (Hosino et al., 2005)
- Stochastic context-free grammars (Hosino et al., 2005)
- Neural networks (Nakano et al., 2005)
5. Purpose
- We derive an upper bound of the mean field stochastic complexity of complete bipartite graph-type Boltzmann machines.
Boltzmann machines are graphical models and correspond to spin systems in statistical physics.
6. Table of Contents
- Bayes Learning
- Mean Field Approximation
- Boltzmann Machines (Complete Bipartite Graph-type)
- Main Theorem
- Outline of the Proof
- Discussion and Conclusion
7. Bayes Learning
- True distribution
- Model
- Prior
- Bayes posterior
- Bayes predictive distribution
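These objects can be sketched in standard notation (the symbols q, p, φ, X^n are conventional assumptions, not taken from the slides, whose equations are not reproduced here):

```latex
% True distribution q(x), model p(x|w), prior \varphi(w),
% training data X^n = (x_1, \dots, x_n).
p(w \mid X^n) = \frac{1}{Z_n}\,\varphi(w)\prod_{i=1}^{n} p(x_i \mid w),
\qquad
Z_n = \int \varphi(w)\prod_{i=1}^{n} p(x_i \mid w)\,dw .
% Bayes predictive distribution:
p(x \mid X^n) = \int p(x \mid w)\, p(w \mid X^n)\, dw .
% Stochastic complexity (Bayes free energy):
F(n) = -\log Z_n .
```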
8. Mean Field Approximation (1)
The Bayes posterior can be rewritten in a form suitable for approximation.
We consider a Kullback distance from a trial distribution to the Bayes posterior.
9. Mean Field Approximation (2)
When we restrict the trial distribution to a factorized form, the trial distribution which minimizes the Kullback distance to the Bayes posterior is called the mean field approximation.
The minimum value of the corresponding functional is called the mean field stochastic complexity.
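As an illustration of this restrict-and-minimize procedure, here is a minimal sketch in Python. It assumes a correlated Gaussian "posterior" (not the slides' Boltzmann-machine posterior), because for a Gaussian target the KL-optimal factorized approximation is known in closed form:

```python
import numpy as np

# Toy mean field approximation: approximate a correlated 2-D Gaussian
# "posterior" p(w) = N(mu, Sigma) by a factorized q(w) = q1(w1) q2(w2).
# Illustrative sketch only; mu and Sigma below are made-up values.

mu = np.array([1.0, -1.0])
Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])
Lambda = np.linalg.inv(Sigma)          # precision matrix

# For a Gaussian target, the KL-optimal factorized q has the same means
# and variances 1 / Lambda_ii (smaller than the true marginal variances).
mf_mean = mu
mf_var = 1.0 / np.diag(Lambda)

def kl_gauss(m_q, S_q, m_p, S_p):
    """KL( N(m_q, S_q) || N(m_p, S_p) ) for full-covariance Gaussians."""
    d = len(m_q)
    S_p_inv = np.linalg.inv(S_p)
    diff = m_p - m_q
    return 0.5 * (np.trace(S_p_inv @ S_q) + diff @ S_p_inv @ diff
                  - d + np.log(np.linalg.det(S_p) / np.linalg.det(S_q)))

q_cov = np.diag(mf_var)
kl_opt = kl_gauss(mf_mean, q_cov, mu, Sigma)

# Any other factorized q (e.g. one using the true marginal variances)
# attains a strictly larger Kullback distance.
kl_marginal = kl_gauss(mu, np.diag(np.diag(Sigma)), mu, Sigma)
print(kl_opt, kl_marginal)
```

The residual `kl_opt` is strictly positive whenever the target is correlated, which is exactly the approximation error that the mean field stochastic complexity accounts for.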
10. Complete Bipartite Graph-type Boltzmann Machines
The visible units (input and output units) and the hidden units form a complete bipartite graph.
Each unit takes binary values, and the parametric model is the marginal distribution of the visible units obtained by summing over the hidden units.
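A minimal sketch of such a machine in Python, assuming ±1 spins and weights only between the two layers (the names M, H, W and the absence of bias terms are illustrative assumptions, not the slides' exact parameterization):

```python
import itertools
import numpy as np

# Complete bipartite graph-type Boltzmann machine:
# visible spins x in {-1,+1}^M, hidden spins h in {-1,+1}^H, with weights W
# only between the two layers (no intra-layer connections).
# The model is the marginal over hidden spins:
#   p(x | W) = (1/Z) * sum_h exp(x^T W h).

def bm_distribution(W):
    M, H = W.shape
    xs = list(itertools.product([-1, 1], repeat=M))
    hs = list(itertools.product([-1, 1], repeat=H))
    unnorm = np.array([sum(np.exp(np.array(x) @ W @ np.array(h)) for h in hs)
                       for x in xs])
    return xs, unnorm / unnorm.sum()

rng = np.random.default_rng(0)
W = 0.5 * rng.standard_normal((3, 2))   # M=3 visible units, H=2 hidden units
xs, p = bm_distribution(W)
print(p.sum())   # the 2^M probabilities sum to 1
```

Brute-force enumeration is only feasible for tiny M and H, but it makes the bipartite structure of the model explicit.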
11. True Distribution
We assume that the true distribution is included in the parametric model, and that the number of hidden units of the true distribution is not greater than that of the learning machine.
12. Main Theorem
The mean field stochastic complexity of complete bipartite graph-type Boltzmann machines has an upper bound determined, up to a constant, by:
- the number of input and output units,
- the number of hidden units of the learning machine,
- the number of hidden units of the true distribution.
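Bounds of this kind are asymptotic in the number of training data n; as a sketch in conventional symbols (M, N, H, H_0 are assumptions, not the slide's own notation, and the explicit coefficient is given in the theorem itself):

```latex
% n : number of training data
% M, N : numbers of input and output units
% H : hidden units of the learning machine, H_0 : of the true distribution
\bar{F}(n) \;\le\; \lambda(M, N, H, H_0)\,\log n + C ,
\qquad C : \text{constant} .
```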
13. Outline of the Proof (Methods)
- The Kullback information depends on the Boltzmann machine.
- The trial distribution is restricted to the normal distribution family.
- A prior is fixed.
14. Outline of the Proof
Lemma: For the Kullback information, suppose there exists a value of the parameter at which the Hessian matrix is zero and the number of elements of a certain index set is less than or equal to a given number. Then the mean field stochastic complexity has the following upper bound.
15. We apply this lemma to the Boltzmann machines.
The Kullback information is given in closed form, and its second-order differential is computed explicitly.
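The Kullback information in question is the standard one; in conventional symbols (q for the true distribution, p(x|w) for the model, an assumption rather than the slides' own notation):

```latex
K(w) = \sum_{x} q(x)\,\log \frac{q(x)}{p(x \mid w)} ,
\qquad
K(w) \ge 0, \quad K(w) = 0 \iff p(x \mid w) = q(x) .
```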
16. We choose a parameter that realizes the true distribution. At this true parameter the Kullback information vanishes and the required second-order differentials are zero, so the conditions of the lemma hold. By using the lemma, we obtain the upper bound.
17. Discussion
Comparison with other studies: the figure plots stochastic complexity against the number of training data in the asymptotic region, for a regular statistical model, the upper bound derived by algebraic geometry (Yamazaki), the mean field upper bound derived here, and exact Bayes learning.
18. Conclusion
- We derived an upper bound of the mean field stochastic complexity of complete bipartite graph-type Boltzmann machines.
Future work:
- Comparison with experimental results