A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation


1
A Collapsed Variational Bayesian Inference
Algorithm for Latent Dirichlet Allocation
  • Yee W. Teh, David Newman and Max Welling
  • Published at NIPS 2006

Discussion led by Iulian Pruteanu
2
Outline
  • Introduction
  • Approximate inference for LDA
  • Collapsed VB inference for LDA
  • Experimental results
  • Conclusions

3
Introduction (1/2)
  • Latent Dirichlet Allocation (LDA) is suitable for many
    applications, from document modeling to computer
    vision.
  • Collapsed Gibbs sampling seems to be the preferred
    choice for large-scale problems; however, it has
    its own problems (e.g., it is hard to assess the
    convergence of the sampler).
  • The CVB algorithm, which makes use of a simple
    approximation, is easy to implement and more
    accurate than standard VB.

4
Introduction (2/2)
  • This paper
    - proposes an improved VB algorithm based on
      integrating out the model parameters;
    - assumes that the latent variables are
      mutually independent;
    - uses a Gaussian approximation for
      computational efficiency.

5
Approximate inference for LDA (1/3)
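The equations originally on this slide did not survive the transcript. As background (this is standard LDA, using the paper's notation: \(\theta_j\) are the mixing proportions of document \(j\), \(\phi_k\) the word distribution of topic \(k\), and \(z_{ij}\) the topic assignment and \(x_{ij}\) the observed word at position \(i\) of document \(j\)), the generative process is:

$$
\theta_j \sim \mathrm{Dirichlet}(\alpha), \quad
\phi_k \sim \mathrm{Dirichlet}(\beta), \quad
z_{ij} \mid \theta_j \sim \mathrm{Mult}(\theta_j), \quad
x_{ij} \mid z_{ij}, \phi \sim \mathrm{Mult}(\phi_{z_{ij}}).
$$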
6
Approximate inference for LDA (2/3)
Given the observed words \(\mathbf{x} = \{x_{ij}\}\), the
task of Bayesian inference is to compute the
posterior distribution over the latent variables
\(\mathbf{z} = \{z_{ij}\}\) and the parameters
\(\theta = \{\theta_j\}\), \(\phi = \{\phi_k\}\).
  1. Variational Bayes
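The slide's equations are missing from the transcript; in its standard form, VB posits a fully factorized variational posterior and maximizes the corresponding lower bound on the log marginal likelihood:

$$
\hat{q}(\mathbf{z}, \theta, \phi) = \hat{q}(\theta)\,\hat{q}(\phi) \prod_{ij} \hat{q}(z_{ij}),
\qquad
\log p(\mathbf{x} \mid \alpha, \beta) \;\ge\; \mathbb{E}_{\hat{q}}\!\left[\log p(\mathbf{x}, \mathbf{z}, \theta, \phi \mid \alpha, \beta)\right] + \mathcal{H}(\hat{q}).
$$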

7
Approximate inference for LDA (3/3)
2. Collapsed Gibbs sampling
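The slide's equation is also missing; the standard collapsed Gibbs conditional for LDA, with \(n_{jk}\) the number of words in document \(j\) assigned to topic \(k\), \(n_{kw}\) the number of times word \(w\) is assigned to topic \(k\), \(n_k = \sum_w n_{kw}\), and the superscript \(\neg ij\) denoting counts that exclude word \(i\) of document \(j\), is:

$$
p(z_{ij} = k \mid \mathbf{z}^{\neg ij}, \mathbf{x}) \;\propto\;
\frac{\left(\alpha + n_{jk}^{\neg ij}\right)\left(\beta + n_{k x_{ij}}^{\neg ij}\right)}{W\beta + n_{k}^{\neg ij}}.
$$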
8
Collapsed VB inference for LDA and
marginalization of the model parameters
  • In the variational Bayesian approximation we assume
    a factorized form for the approximating posterior
    distribution. However, this is not a good assumption,
    since changes in the model parameters (\(\theta, \phi\))
    will have a considerable impact on the latent
    variables (\(\mathbf{z}\)).
  • CVB is equivalent to marginalizing out the model
    parameters before approximating the posterior
    over the latent variables \(\mathbf{z}\).
  • The exact implementation of CVB has a closed form
    but is computationally too expensive to be
    practical. Therefore, the authors propose a
    simple Gaussian approximation, which seems to work
    very accurately; see the sketch after this list.
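In symbols (reconstructed from the paper, since the slide's equations are missing), CVB keeps the exact dependence of the parameters on the latent variables and factorizes only over \(\mathbf{z}\):

$$
\hat{q}(\mathbf{z}, \theta, \phi) = \hat{q}(\theta, \phi \mid \mathbf{z}) \prod_{ij} \hat{q}(z_{ij}).
$$

Under \(\hat{q}\) the counts \(n_{jk}, n_{kw}, n_k\) are sums of independent Bernoulli indicators, so their means and variances are cheap to track, and a second-order Taylor expansion yields the Gaussian-approximated update

$$
\hat{q}(z_{ij}=k) \;\propto\;
\frac{\left(\alpha + \mathbb{E}[n_{jk}^{\neg ij}]\right)\left(\beta + \mathbb{E}[n_{k x_{ij}}^{\neg ij}]\right)}{W\beta + \mathbb{E}[n_{k}^{\neg ij}]}
\exp\!\left(
- \frac{\mathrm{Var}[n_{jk}^{\neg ij}]}{2\left(\alpha + \mathbb{E}[n_{jk}^{\neg ij}]\right)^{2}}
- \frac{\mathrm{Var}[n_{k x_{ij}}^{\neg ij}]}{2\left(\beta + \mathbb{E}[n_{k x_{ij}}^{\neg ij}]\right)^{2}}
+ \frac{\mathrm{Var}[n_{k}^{\neg ij}]}{2\left(W\beta + \mathbb{E}[n_{k}^{\neg ij}]\right)^{2}}
\right).
$$

The Python/NumPy sketch below implements one sweep of this update. It is a minimal illustration under the assumptions above (symmetric Dirichlet priors, dense count matrices), not the authors' code, and all variable names are mine.

    import numpy as np

    def cvb_sweep(docs, gamma, K, W, alpha, beta):
        """One in-place sweep of collapsed VB for LDA with the Gaussian
        (second-order) approximation.  docs[j] is a list of word ids in
        [0, W); gamma[j][i] is a length-K array holding q(z_ij = k)."""
        D = len(docs)
        # Means/variances of the topic counts, treating the z_ij as
        # independent under q (sums of independent Bernoulli indicators).
        E_njk = np.zeros((D, K)); V_njk = np.zeros((D, K))  # doc-topic
        E_nkw = np.zeros((K, W)); V_nkw = np.zeros((K, W))  # topic-word
        for j, doc in enumerate(docs):
            for i, w in enumerate(doc):
                g = gamma[j][i]
                E_njk[j] += g;     V_njk[j] += g * (1 - g)
                E_nkw[:, w] += g;  V_nkw[:, w] += g * (1 - g)
        E_nk = E_nkw.sum(axis=1); V_nk = V_nkw.sum(axis=1)  # topic totals

        for j, doc in enumerate(docs):
            for i, w in enumerate(doc):
                g = gamma[j][i]
                # "neg ij" statistics: remove this word's own contribution.
                a = alpha + E_njk[j] - g
                b = beta + E_nkw[:, w] - g
                c = W * beta + E_nk - g
                va = V_njk[j] - g * (1 - g)
                vb = V_nkw[:, w] - g * (1 - g)
                vc = V_nk - g * (1 - g)
                # Mean-field term times the Gaussian correction factor.
                new = (a * b / c) * np.exp(-va / (2 * a**2)
                                           - vb / (2 * b**2)
                                           + vc / (2 * c**2))
                new /= new.sum()
                # Fold the updated q(z_ij) back into the statistics.
                E_njk[j] += new - g;    V_njk[j] += new*(1-new) - g*(1-g)
                E_nkw[:, w] += new - g; V_nkw[:, w] += new*(1-new) - g*(1-g)
                E_nk += new - g;        V_nk += new*(1-new) - g*(1-g)
                gamma[j][i] = new
        return gamma

To run it, initialize each gamma[j][i] with, e.g., np.random.dirichlet(np.ones(K)) and call cvb_sweep repeatedly until the gamma stop changing; the paper's CVB algorithm corresponds to iterating this update to convergence.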

9
Experimental results
Figure (learning curves; only the caption is recoverable
from the transcript):
Left: results for KOS (D = 3,430 documents, W = 6,909 word
vocabulary, N = 467,714 words). Right: results for NIPS
(D = 1,675 documents, W = 12,419 word vocabulary,
N = 2,166,029 words). 10% of the words were held out for
testing, over 50 random runs. Plotted: variational bounds
and per-word test log probabilities as a function of the
number of iterations.
10
Conclusions
  • Variational approximations are much more efficient
    computationally than Gibbs sampling, with almost
    no loss in accuracy.
  • The CVB inference algorithm is easy to
    implement, computationally efficient (thanks to the
    Gaussian approximation), and more accurate than
    standard VB.