Topic Models - PowerPoint PPT Presentation


Title: Topic Models


1
Topic Models
Presented by Iulian Pruteanu, Friday, July 28th, 2006
2
Outline
  1. Introduction
  2. Exchangeable topic models (L. Fei-Fei et al.,
    CVPR 2005)
  3. Dynamic topic models (D. Blei et al., ICML 2006)

3
Introduction
  • Topic models are tools for automatically
    organizing, searching and browsing large
    collections (documents, images, etc.).
  • The discovered patterns often reflect the
    underlying topics which, combined, form corpora.
  • Exchangeable (static) topic models: the words
    (patches) of each document (image) are assumed to
    be independently drawn from a mixture of
    multinomials; the mixture components (topics) are
    shared by all documents.
  • Dynamic topic models capture the evolution of
    topics in a sequentially organized corpus of
    documents (images).
4
Exchangeable topic models (CVPR 2005)
  • Used for learning natural scene categories.
  • A key idea is to use intermediate representations
    (themes) before classifying scenes.
  • Avoid using manually labeled or segmented images
    to train the system.
  • Local regions are first clustered into different
    intermediate themes, and then into categories. No
    supervision is needed apart from a single
    category label for each training image.
  • The algorithm provides a principled approach to
    learning relevant intermediate representations of
    scenes, without supervision.
  • The model is able to group categories of images
    into a sensible hierarchy.

5
Exchangeable topic models (CVPR 2005)
6
Exchangeable topic models (CVPR 2005)
7
Exchangeable topic models (CVPR 2005)
  • A patch x is the basic unit of an image.
  • An image is a sequence of N patches.
  • A category is a collection of I images.
  • K is the total number of themes.
  • Intermediate themes are represented as K-dim
    unit vectors.
  • The total number of codewords is the size of the
    codebook.
8
Exchangeable topic models (CVPR 2005)
Bayesian decision
For convenience, the prior over categories is always
assumed to be a fixed uniform distribution.
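A minimal sketch of such a decision rule, assuming the uniformly distributed quantity is the prior over categories: the log prior is then the same constant for every category, so the decision reduces to picking the category whose model gives the image the highest likelihood. The function name, category names and log-likelihood values below are hypothetical and only illustrate the rule; they are not taken from the paper.

```python
import math

def classify(log_likelihoods):
    """Return the category with the highest posterior under a uniform prior.

    log_likelihoods: dict mapping category name -> log p(image | category).
    """
    num_categories = len(log_likelihoods)
    log_prior = -math.log(num_categories)  # uniform prior over categories
    # Unnormalized log posterior: log likelihood + log prior.
    log_posterior = {c: ll + log_prior for c, ll in log_likelihoods.items()}
    return max(log_posterior, key=log_posterior.get)

# Hypothetical log-likelihoods for one test image (illustrative values only).
scores = {"coast": -1250.3, "forest": -1198.7, "street": -1301.9}
print(classify(scores))  # forest
```

Because the prior term is constant, dropping it would give the same answer; it is kept here only to make the Bayesian form of the rule visible.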
9
Exchangeable topic models (CVPR 2005)
Learning: variational inference
10
Exchangeable topic models (CVPR 2005)
Features and codebook
  1. Evenly sampled grid
  2. Random sampling
  3. Kadir and Brady saliency detector
  4. Lowe's DoG detector
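As a rough illustration of the first two sampling strategies and of how a codebook is used, the sketch below generates patch locations on an evenly sampled grid and at random, then maps a patch descriptor to its nearest codeword. All names, the step size, and the choice of squared Euclidean distance are assumptions made for this example; the slide does not specify them, and the saliency and DoG detectors are not covered here.

```python
import random

def grid_locations(width, height, step):
    """Top-left corners of patches on an evenly sampled grid."""
    return [(x, y) for y in range(0, height, step)
                   for x in range(0, width, step)]

def random_locations(width, height, count, seed=0):
    """Top-left corners of `count` randomly placed patches."""
    rng = random.Random(seed)
    return [(rng.randrange(width), rng.randrange(height))
            for _ in range(count)]

def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to a patch descriptor.

    Uses squared Euclidean distance (an assumption for this sketch).
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(codebook)),
               key=lambda i: sq_dist(descriptor, codebook[i]))

# Hypothetical 2-D descriptors; real patch descriptors are much longer.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(grid_locations(4, 4, 2))
print(nearest_codeword([0.9, 0.1], codebook))
```

After this step each image is reduced to a sequence of codeword indices, which is the representation a topic model works with.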

11
Exchangeable topic models (CVPR 2005)
Experimental setup and results
A model for each category was obtained from the
training images.
12
Exchangeable topic models (CVPR 2005)
Experimental setup and results
13
Exchangeable topic models (CVPR 2005)
Experimental setup and results
14
Dynamic topic models (ICML 2006)
  • Topic models are tools for automatically
    organizing, searching and browsing large
    collections (documents, images, etc.).
  • The discovered patterns often reflect the
    underlying topics which, combined, form
    documents.
  • Exchangeable (static) topic models: the words
    (patches) of each document (image) are assumed to
    be independently drawn from a mixture of
    multinomials; the mixture components (topics) are
    shared by all documents.
  • Dynamic topic models capture the evolution of
    topics in a sequentially organized corpus of
    documents (images).
15
Dynamic topic models (ICML 2006)
Static topic model review
Each document (image) is assumed drawn from the
following generative process:
  1. Choose topic proportions from a distribution
    over the (K-1)-simplex, such as a Dirichlet.
  2. For each word (patch), choose a topic
    assignment, then choose a word (patch).

This process assumes that images (documents) are
drawn exchangeably from the same set of topics. In a
dynamic topic model, we suppose that the data are
divided by time slice, for example by year. The
images of each slice are modeled with a K-component
topic model, where the topics associated with slice t
evolve from the topics associated with slice t-1.
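The static generative process reviewed on this slide can be sketched as follows, assuming a Dirichlet for the topic proportions. The function names, the parameter `alpha`, and the toy topic matrix are hypothetical and exist only for this illustration.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(alpha, topics, num_words, rng):
    """Generate one document (image) as a list of word (codeword) indices.

    alpha:  Dirichlet parameters, one per topic (length K).
    topics: K rows, each a distribution over the vocabulary.
    """
    theta = sample_dirichlet(alpha, rng)  # step 1: topic proportions
    topic_ids = range(len(topics))
    vocab = range(len(topics[0]))
    words = []
    for _ in range(num_words):
        z = rng.choices(topic_ids, weights=theta)[0]   # step 2: topic assignment
        w = rng.choices(vocab, weights=topics[z])[0]   # step 2: word (patch)
        words.append(w)
    return theta, words

# Toy example: K = 2 topics over a 4-word vocabulary (illustrative values).
toy_topics = [[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]]
theta, words = generate_document([1.0, 1.0], toy_topics, 10, random.Random(0))
print(theta, words)
```

Every document draws its own proportions but reuses the same `topics`, which is what "drawn exchangeably from the same set of topics" amounts to in this sketch.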
16
Dynamic topic models (ICML 2006)
Dynamic topic models
Extension of the logistic normal distribution to
time-series simplex data
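One way to picture this extension is a Gaussian random walk on a topic's unconstrained natural parameters, mapped back to the simplex at each time slice. The sketch below assumes independent Gaussian steps with a single standard deviation `sigma` and a softmax mapping; the names and values are illustrative, not the paper's notation.

```python
import math
import random

def softmax(beta):
    """Map unconstrained natural parameters to a point on the simplex."""
    m = max(beta)  # subtract the max for numerical stability
    exps = [math.exp(b - m) for b in beta]
    total = sum(exps)
    return [e / total for e in exps]

def evolve_topic(beta0, num_slices, sigma, rng):
    """Gaussian random walk on one topic's natural parameters.

    Returns one word distribution per time slice.
    """
    beta = list(beta0)
    trajectory = [softmax(beta)]
    for _ in range(num_slices - 1):
        # The parameters at slice t are those at slice t-1 plus Gaussian noise.
        beta = [b + rng.gauss(0.0, sigma) for b in beta]
        trajectory.append(softmax(beta))
    return trajectory

# Toy example: a 3-word vocabulary followed over 5 time slices.
for dist in evolve_topic([0.0, 0.0, 0.0], 5, 0.5, random.Random(1)):
    print([round(p, 3) for p in dist])
```

Working in the unconstrained space lets the noise be Gaussian while each slice's distribution still sums to one.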
17
Dynamic topic models (ICML 2006)
Approximate inference
In the dynamic topic model, the latent variables are
the topics, the mixture proportions and the topic
indicators. The authors optimize the free parameters
of a distribution over the latent variables so that
the distribution is close in KL divergence to the
true posterior. The full derivations are given in the
paper.
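To make "close in KL divergence" concrete, the sketch below computes the KL divergence between two discrete distributions and compares two candidate approximations to a target. The distributions are toy values chosen for illustration; in the actual model the true posterior is intractable, so the divergence is never computed directly like this.

```python
import math

def kl_divergence(q, p):
    """KL(q || p) for two discrete distributions over the same support.

    Assumes p[i] > 0 wherever q[i] > 0; terms with q[i] == 0 contribute 0.
    """
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Toy target distribution and two candidate approximations.
target = [0.7, 0.2, 0.1]
candidate_a = [0.6, 0.3, 0.1]
candidate_b = [0.2, 0.3, 0.5]

# The candidate with the smaller divergence is the better approximation.
print(kl_divergence(candidate_a, target))
print(kl_divergence(candidate_b, target))
```

Tuning the free parameters of the approximating distribution amounts to searching for the candidate that makes this quantity as small as possible.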
18
Dynamic topic models (ICML 2006)
Experimental setup and results
The data are a subset of 30,000 articles from the
journal Science, 250 from each of the 120 years
between 1881 and 1999. The corpus is made up of
approximately 7.5 million words. To explore the
corpus and its themes, a 20-component dynamic topic
model was estimated.
19
Dynamic topic models (ICML 2006)
20
Dynamic topic models (ICML 2006)
Discussion
A sequential topic model for discrete data was
developed by using Gaussian time series on the
natural parameters of the multinomial topics and
logistic normal topic proportion models. The most
promising extension to the method presented here is
to incorporate a model of how new topics in the
collection appear or disappear over time, rather than
assuming a fixed number of topics.
21
References
  • Blei, D., Ng, A., and Jordan, M. (JMLR 2003).
    Latent Dirichlet allocation.
  • Blei, D. and Lafferty, J. D. (NIPS 2006).
    Correlated topic models.
  • Fei-Fei, L. and Perona, P. (IEEE CVPR 2005). A
    Bayesian hierarchical model for learning natural
    scene categories.
  • Blei, D. and Lafferty, J. D. (ICML 2006). Dynamic
    topic models.