Collapsed Variational Dirichlet Process Mixture Models

Collapsed Variational Dirichlet Process Mixture Models, by Kenichi Kurihara, Max Welling and Yee W. Teh. Published at IJCAI 07. Discussion led by Qi An.

Slides: 15

Learn more at: http://people.ee.duke.edu

Transcript and Presenter's Notes

1
Collapsed Variational Dirichlet Process Mixture Models

Kenichi Kurihara, Max Welling and Yee W. Teh
Published at IJCAI 07

Discussion led by Qi An
2
Outline

Introduction
Four approximations to DP
Variational Bayesian Inference
Optimal cluster label reordering
Experimental results
Discussion

3
Introduction

The Dirichlet process (DP) is suitable for many density estimation and
data clustering applications.
Gibbs sampling is not efficient enough to scale to large problems.
The truncated stick-breaking approximation is
formulated in the space of explicit,
non-exchangeable cluster labels.

4
Introduction

This paper:
proposes an improved VB algorithm based on
integrating out the mixture weights
compares the stick-breaking representation against
the finite symmetric Dirichlet approximation
maintains an optimal ordering of cluster labels in
the stick-breaking VB algorithm

5
Approximations to DP

Truncated stick-breaking (TSB) representation

The joint distribution can be expressed as
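The equation from this slide did not survive in the transcript. As a sketch in standard notation (the symbols below are assumptions, not taken from the slide: N data points x_n with labels z_n, truncation level T, concentration α, stick lengths v_i, component parameters η_i), a truncated stick-breaking joint is usually written as:

```latex
\begin{aligned}
v_i &\sim \mathrm{Beta}(1, \alpha) \quad (i = 1, \dots, T-1), \qquad v_T = 1, \\
\pi_i(v) &= v_i \prod_{j < i} (1 - v_j), \\
p(x, z, v, \eta) &= \Bigl[ \prod_{n=1}^{N} p(x_n \mid \eta_{z_n}) \, \pi_{z_n}(v) \Bigr]
                    \Bigl[ \prod_{i=1}^{T} p(\eta_i) \Bigr]
                    \Bigl[ \prod_{i=1}^{T-1} \mathrm{Beta}(v_i; 1, \alpha) \Bigr].
\end{aligned}
```

Setting v_T = 1 is what makes the T weights sum to one; the weight of label i depends on all earlier sticks, which is why the labels are not exchangeable.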
6
Approximations to DP

Finite symmetric Dirichlet (FSD) approximation

The joint distribution can be expressed as
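The equation from this slide did not survive in the transcript. As a sketch in standard notation (the symbols are assumptions: N data points x_n with labels z_n, K components, concentration α, mixture weights π, component parameters η_k), a finite symmetric Dirichlet joint is usually written as:

```latex
\begin{aligned}
\pi &\sim \mathrm{Dirichlet}\bigl(\tfrac{\alpha}{K}, \dots, \tfrac{\alpha}{K}\bigr), \\
p(x, z, \pi, \eta) &= \Bigl[ \prod_{n=1}^{N} p(x_n \mid \eta_{z_n}) \, \pi_{z_n} \Bigr]
                      \Bigl[ \prod_{k=1}^{K} p(\eta_k) \Bigr]
                      \mathrm{Dirichlet}\bigl(\pi; \tfrac{\alpha}{K}, \dots, \tfrac{\alpha}{K}\bigr).
\end{aligned}
```

Because every component receives the same prior parameter α/K, permuting the labels leaves this joint unchanged.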
The essential difference from the TSB representation
is that the cluster labels remain interchangeable
under this formulation.
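One way to see the difference in label symmetry is to draw mixture weights from both priors and compare the average weight per label. The following is a minimal sketch using NumPy; the function names and the parameter choices (α, truncation level, number of draws) are illustrative assumptions, not taken from the paper or the slides.

```python
import numpy as np


def sample_tsb_weights(alpha, T, n_draws, rng):
    """Draw mixture weights from a truncated stick-breaking prior.

    Assumes v_i ~ Beta(1, alpha) for i < T and v_T = 1 (standard truncation).
    """
    v = rng.beta(1.0, alpha, size=(n_draws, T))
    v[:, -1] = 1.0  # last stick takes all remaining mass
    # Mass left before breaking stick i: prod_{j<i} (1 - v_j)
    left = np.cumprod(1.0 - v, axis=1)
    left = np.hstack([np.ones((n_draws, 1)), left[:, :-1]])
    return v * left


def sample_fsd_weights(alpha, K, n_draws, rng):
    """Draw mixture weights from a symmetric Dirichlet(alpha/K, ..., alpha/K)."""
    return rng.dirichlet(np.full(K, alpha / K), size=n_draws)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tsb = sample_tsb_weights(alpha=1.0, T=5, n_draws=20000, rng=rng)
    fsd = sample_fsd_weights(alpha=1.0, K=5, n_draws=20000, rng=rng)
    # TSB: average weight depends on the label index (ordered prior).
    print("TSB mean weight per label:", tsb.mean(axis=0))
    # FSD: average weight is the same for every label (exchangeable prior).
    print("FSD mean weight per label:", fsd.mean(axis=0))
```

Under these assumptions the stick-breaking averages shrink with the label index, while the symmetric Dirichlet averages are equal across labels up to sampling noise.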
7
The Dirichlet process is most naturally defined over the
space of partitions, whereas both TSB and FSD are
defined over the space of cluster labels. Moreover,
TSB and FSD themselves live in different spaces: TSB labels
are non-exchangeable, while FSD labels are interchangeable.
8
Marginalization

In the variational Bayesian approximation, we assume
a factorized form for the posterior distribution.
However, this is not a good assumption, since changes
in the mixture weights π have a considerable impact on z.

If we integrate out π, we obtain a collapsed joint
distribution over the remaining variables, with one form
for the TSB representation and another for the FSD representation.
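The collapsed expressions from this slide did not survive in the transcript. As a sketch, they can be rederived from the standard Beta and Dirichlet integrals; the notation is an assumption (N_i is the number of points with label i, N_{>i} the number with a label above i, N_{\ge i} = N_i + N_{>i}, T the truncation level, K the number of FSD components):

```latex
\begin{aligned}
\text{TSB:}\quad p(z) &= \prod_{i=1}^{T-1}
    \frac{\Gamma(1+\alpha)}{\Gamma(\alpha)}\,
    \frac{\Gamma(1 + N_i)\,\Gamma(\alpha + N_{>i})}{\Gamma(1 + \alpha + N_{\ge i})}, \\
\text{FSD:}\quad p(z) &= \frac{\Gamma(\alpha)}{\Gamma(\alpha + N)}
    \prod_{k=1}^{K} \frac{\Gamma\bigl(\tfrac{\alpha}{K} + N_k\bigr)}{\Gamma\bigl(\tfrac{\alpha}{K}\bigr)}, \\
p(x, z, \eta) &= p(z) \prod_{n=1}^{N} p(x_n \mid \eta_{z_n}) \prod_{i} p(\eta_i).
\end{aligned}
```

The FSD expression depends on the counts only as an unordered set, so it is invariant to relabelling; the TSB expression depends on which counts come first.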
9
VB inference

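We can then apply VB inference to each of the four
approximations (TSB and FSD, each with or without marginalization).

VB inference maximizes a lower bound on the log marginal
likelihood under a factorized approximate posterior, which
takes a different form for TSB and for FSD.
Depending on whether the model is marginalized, v and π may
be integrated out of the approximate posterior.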
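The bound and the factorizations from this slide did not survive in the transcript. A sketch of the usual mean-field form follows; the notation is an assumption (W stands for all hidden variables, z_n is the label of data point n, η the component parameters, v the TSB stick lengths, π the FSD mixture weights):

```latex
\begin{aligned}
\log p(x) &\ge \mathbb{E}_{q}\bigl[\log p(x, W)\bigr] - \mathbb{E}_{q}\bigl[\log q(W)\bigr], \\
q_{\mathrm{TSB}}(z, \eta, v) &= \prod_{n=1}^{N} q(z_n) \prod_{i=1}^{T} q(\eta_i) \prod_{i=1}^{T-1} q(v_i), \\
q_{\mathrm{FSD}}(z, \eta, \pi) &= \prod_{n=1}^{N} q(z_n) \prod_{k=1}^{K} q(\eta_k) \; q(\pi).
\end{aligned}
```

In the collapsed variants the factors q(v_i) and q(π) are dropped, because v and π have already been integrated out of the model.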
10
Gaussian approximation

For the collapsed approximations, computing
q(z_ij) appears intractable because it requires an expectation
over the exponentially large space of assignments of all the other z_ij.

Using the central limit theorem and a Taylor expansion,
these expectations over the other z_ij are approximated
in terms of their means and variances.
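To make the idea concrete, consider an expectation of the form E[log(a + N)], where N is a count made of independent binary assignments with probabilities q_n. A second-order Taylor expansion around the mean gives log(a + E[N]) - Var[N] / (2 (a + E[N])^2). The sketch below compares this with the exact value; the function names and the offset `a` are illustrative assumptions, and the exact form of the terms used in the paper is not reproduced here.

```python
import numpy as np


def count_pmf(q):
    """Exact pmf of N = sum of independent Bernoulli(q_n), by repeated convolution."""
    pmf = np.array([1.0])
    for p in q:
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf


def exact_expected_log(q, a):
    """E[log(a + N)] computed from the exact pmf (feasible only for moderate sizes)."""
    pmf = count_pmf(q)
    return float(np.sum(pmf * np.log(a + np.arange(len(pmf)))))


def approx_expected_log(q, a):
    """Second-order Taylor approximation using only the mean and variance of N."""
    q = np.asarray(q, dtype=float)
    mean = q.sum()
    var = np.sum(q * (1.0 - q))
    return float(np.log(a + mean) - var / (2.0 * (a + mean) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.uniform(0.05, 0.95, size=200)  # hypothetical assignment probabilities
    print("exact :", exact_expected_log(q, a=1.0))
    print("approx:", approx_expected_log(q, a=1.0))
```

The approximation needs only two sums over the data, whereas the exact expectation requires the full distribution of the count.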
11
Optimal cluster label reordering

For the TSB representation, the prior assumes a
certain ordering of the clusters.
The authors claim that the optimal relabelling of the
clusters is obtained by sorting the clusters by size
in decreasing order.
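The reordering step can be sketched as follows: compute the (expected) size of each cluster from the assignment probabilities, sort the labels by decreasing size, and permute the columns accordingly. The helper names are hypothetical, and `tsb_log_prior` uses hard counts with the collapsed stick-breaking prior obtained from the standard Beta integral, as an illustration rather than the paper's exact objective.

```python
from math import lgamma, log

import numpy as np


def tsb_log_prior(counts, alpha):
    """Collapsed stick-breaking log p(z) as a function of ordered label counts.

    The last label is the truncation level and contributes no factor of its own.
    """
    counts = np.asarray(counts, dtype=float)
    total = 0.0
    for i in range(len(counts) - 1):
        n_i = counts[i]
        n_gt = counts[i + 1:].sum()  # points assigned to later labels
        total += (log(alpha) + lgamma(1.0 + n_i) + lgamma(alpha + n_gt)
                  - lgamma(1.0 + alpha + n_i + n_gt))
    return total


def reorder_by_size(resp):
    """Permute cluster labels so that expected cluster sizes are decreasing.

    resp: (N, T) array of assignment probabilities q(z_n = i).
    Returns the permuted array and the permutation applied to the columns.
    """
    sizes = resp.sum(axis=0)
    order = np.argsort(-sizes, kind="stable")
    return resp[:, order], order


if __name__ == "__main__":
    # Hypothetical hard assignment: cluster sizes 1, 5 and 10 in label order.
    labels = np.repeat([0, 1, 2], [1, 5, 10])
    resp = np.eye(3)[labels]
    sorted_resp, order = reorder_by_size(resp)
    print("permutation:", order)
    print("log prior before:", tsb_log_prior(resp.sum(axis=0), alpha=1.0))
    print("log prior after :", tsb_log_prior(sorted_resp.sum(axis=0), alpha=1.0))
```

In this small example the prior probability of the assignment increases after sorting, which is consistent with the claim on the slide; the same permutation would also have to be applied to the per-cluster parameters in a full implementation.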

12
Experimental results
13
Experimental results
14
Discussion

There is very little difference between
variational Bayesian inference in the reordered
stick-breaking representation and in the finite
mixture model with symmetric Dirichlet priors.
Label reordering is important for the
stick-breaking representation.
Variational approximations are much more efficient
computationally than Gibbs sampling, with almost
no loss in accuracy.
