Title: Expressive Graphical Models in Variational Approximations: Chain-Graphs and Hidden Variables
Slide 1: Expressive Graphical Models in Variational Approximations: Chain-Graphs and Hidden Variables
- Tal El-Hay, Nir Friedman
- School of Computer Science & Engineering, The Hebrew University
Slide 2: Inference in Graphical Models
- Exact inference
  - NP-hard, in general
  - Can be efficient for certain classes of models
- What do we do when exact inference is intractable?
  - Resort to approximate methods
- Approximate inference is also NP-hard
- But specific approximation methods work for specific classes of models
- ⇒ Need to enrich approximate methods
Slide 3: Variational Approximations
- Approximate the posterior of a complex model using a simpler distribution
- Choice of the simpler model ⇒ method: mean field, structured approximations, and mixture models
Slide 7: Enhancing Variational Approximations
- Basic tradeoff: accuracy ⇔ complexity
- Goal: new families of approximating distributions ⇒ a better tradeoff
Slide 8: Outline
- Structured variational approximations (review)
- Using chain graphs
- Adding hidden variables
- Discussion
Slide 9: Structured Approximations
Slide 10: Structured Approximations
- Goal: maximize the functional F[Q]
- ⇒ F[Q] is a lower bound on the log-likelihood
- If Q is tractable, then F[Q] might be tractable as well
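The functional on this slide appeared as an equation image in the original deck; it is presumably the standard variational free energy, which for hidden variables T and evidence e would read:

```latex
\mathcal{F}[Q]
= \mathbb{E}_{Q}\big[\log P(\mathbf{T}, \mathbf{e})\big] + H_Q(\mathbf{T})
= \log P(\mathbf{e}) - \mathrm{KL}\big(Q(\mathbf{T}) \,\|\, P(\mathbf{T} \mid \mathbf{e})\big)
\le \log P(\mathbf{e}).
```

Since the KL divergence is non-negative, maximizing F[Q] over a tractable family both tightens the bound on the log-likelihood and brings Q closest to the true posterior.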
Slide 11: Structured Approximations
- To characterize the maximum point, we define the generalized functional
- Differentiation yields a fixed-point equation
- ⇒ approximates the posterior using the lower bound on the local distribution
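The fixed-point equation itself is not reproduced in this transcript; in the fully factorized (mean-field) special case it takes the well-known form:

```latex
Q_i(t_i) \;\propto\; \exp\Big( \mathbb{E}_{Q}\big[ \log P(\mathbf{T}, \mathbf{e}) \,\big|\, T_i = t_i \big] \Big),
```

i.e., each local distribution is set to the exponentiated expected log-joint, with the expectation taken over the remaining variables under Q; the structured case replaces single variables by larger tractable components.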
Slide 12: Structured Approximations
- Optimization:
  - Asynchronous updates guarantee convergence
  - Efficient calculation of the update formulas
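As a concrete illustration of the asynchronous update scheme, here is a toy mean-field example on a small binary pairwise model (a hypothetical model chosen for illustration, not the structured updates of the talk). Each coordinate update is the exact maximizer of the bound with respect to one factor, so F[Q] never decreases:

```python
import math

# Toy binary pairwise model (hypothetical, not from the talk):
#   p(x) ∝ exp( sum_i b[i] x_i + sum_{(i,j)} W[(i,j)] x_i x_j ),  x_i ∈ {0, 1}
b = [0.5, -0.2, 0.3]
W = {(0, 1): 1.0, (1, 2): -0.8, (0, 2): 0.4}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neighbors(i):
    """Yield (j, w) for every edge touching variable i."""
    for (a, c), w in W.items():
        if a == i:
            yield c, w
        elif c == i:
            yield a, w

def free_energy(mu):
    """Mean-field lower bound F[Q] on log Z, Q(x) = prod_i mu_i^{x_i}(1-mu_i)^{1-x_i}."""
    f = sum(b[i] * m for i, m in enumerate(mu))
    f += sum(w * mu[i] * mu[j] for (i, j), w in W.items())
    for m in mu:                      # entropy of each independent factor
        for p in (m, 1.0 - m):
            if p > 0.0:
                f -= p * math.log(p)
    return f

def sweep(mu):
    """One asynchronous pass; each update exactly maximizes F w.r.t. mu[i]."""
    for i in range(len(mu)):
        mu[i] = sigmoid(b[i] + sum(w * mu[j] for j, w in neighbors(i)))

mu = [0.5, 0.5, 0.5]                  # initial fully uncertain approximation
fs = [free_energy(mu)]
for _ in range(20):
    sweep(mu)
    fs.append(free_energy(mu))        # the bound rises with every sweep
```

Because every single update is an exact coordinate maximization of F, the sequence of bound values is monotonically non-decreasing, which is what guarantees convergence of the asynchronous scheme.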
Slide 13: Chain Graph Approximations
- Posterior distributions can be modeled as chain graphs
Slide 14: Chain Graph Approximations
- Chain graph distributions, where the potentials are functions on subsets of T
- Generalize both Bayesian networks and Markov networks
- A simple approximation example
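The chain-graph factorization omitted from this transcript presumably has the standard form: the variables are partitioned into chain components T_i, each with a conditional distribution that is itself a normalized product of potentials:

```latex
Q(\mathbf{T}) = \prod_i Q\big(\mathbf{T}_i \mid \mathrm{Pa}(\mathbf{T}_i)\big),
\qquad
Q\big(\mathbf{T}_i \mid \mathrm{Pa}(\mathbf{T}_i)\big)
= \frac{1}{Z_i\big(\mathrm{Pa}(\mathbf{T}_i)\big)} \prod_j \phi_{ij}\big(\mathbf{C}_{ij}\big),
```

where each φ_ij is a potential function over a subset C_ij of the component and its parents. With singleton components this reduces to a Bayesian network; with a single undirected component, to a Markov network, which is the sense in which chain graphs generalize both.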
Slide 15: Chain Graph Approximations
Slide 16: Adding Hidden Variables
- Potential pitfall: multi-modal distributions
- Jaakkola & Jordan: use mixture models
  - Modeling assumption: factorized mixture components
- Generalization: structured approximation with an extra set of hidden variables
- Approximating distribution
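The approximating distribution on this slide is presumably a mixture indexed by the extra hidden variables V, with structured components:

```latex
Q(\mathbf{T}) = \sum_{v} Q(v)\, Q(\mathbf{T} \mid v),
```

where each component Q(T | v) is itself a tractable structured distribution, rather than the fully factorized components of the original mixture approach.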
Slide 17: Adding Hidden Variables: Intuition
- Lower bound improvement potential, where I(T;V) is the mutual information
- Capture correlations in a compact manner
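The improvement-potential identity referred to here is, in the Jaakkola–Jordan mixture setting:

```latex
\mathcal{F}\big[Q_{\mathrm{mix}}\big] = \sum_{v} Q(v)\, \mathcal{F}\big[Q(\cdot \mid v)\big] + I(\mathbf{T}; V),
```

so the mixture bound equals the average of the component bounds plus the mutual information I(T;V) ≥ 0. Since I(T;V) ≤ H(V) ≤ log |V|, the possible gain grows at most logarithmically in the number of mixture components.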
Slide 18: Adding Hidden Variables: Prospects
- Lower bound improvement potential, where I(T;V) is the mutual information
- Describing correlations in a compact manner
Slide 19: Relaxing the Lower Bound
- Rewriting the lower bound on the log-likelihood
- The conditional entropy does not decompose
- ⇒ The lower bound is intractable
Slide 20: Relaxing the Lower Bound
- Using the following convexity bound
- Introducing extra variational parameters
- The relaxed lower bound becomes tractable
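The convexity bound on this slide is presumably the standard tangent bound on the logarithm:

```latex
-\log x \;\ge\; -\lambda x + \log \lambda + 1 \qquad (\lambda > 0),
```

with equality at λ = 1/x. Applying it inside the entropy term replaces the intractable logarithm of a mixture by a linear function of Q, at the cost of the extra variational parameters λ (the smoothing parameters optimized on the next slide).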
Slide 21: Optimization
- Bayesian network parameters
- Smoothing parameters
- Asynchronous updates guarantee convergence
Slide 22: Results
[Figure: KL bound as a function of the number of time slices]
Slide 23: Discussion
- Extending the representational features of approximating distributions ⇒ a better tradeoff?
- Adding hidden variables improves the approximation
- Derivations of the different methods use a uniform machinery
- Future directions:
  - Saving computations by planning the order of updates
  - Structure of the approximating distribution