Transcript and Presenter's Notes

Title: Bayesian topic models


1
Bayesian topic models
  • Tom Griffiths
  • UC Berkeley

2
(No Transcript)
3
(No Transcript)
4
Latent structure
probabilistic process
Observed data
5
Latent structure (meaning)
probabilistic process
Observed data (words)
6
Latent structure (meaning)
statistical inference
Observed data (words)
7
Outline
  • Topic models
  • Latent Dirichlet allocation
  • More complex models
  • Conclusions

8
Outline
  • Topic models
  • Latent Dirichlet allocation
  • More complex models
  • Conclusions

9
Topic models
  • Hierarchical Bayesian models
  • share parameters (topics) across densities
  • Define a generative model for documents
  • each document is a mixture of topics
  • each word is chosen from a single topic
  • An idea that is widely used in NLP
  • e.g., Bigi et al., 1997; Blei et al., 2003;
    Hofmann, 1999; Iyer & Ostendorf, 1996; Ueda &
    Saito, 2003

10
A generative model for documents
topic 1: w ~ P(w | z = 1)
HEART 0.2  LOVE 0.2  SOUL 0.2  TEARS 0.2  JOY 0.2
SCIENTIFIC 0.0  KNOWLEDGE 0.0  WORK 0.0  RESEARCH 0.0  MATHEMATICS 0.0
topic 2: w ~ P(w | z = 2)
HEART 0.0  LOVE 0.0  SOUL 0.0  TEARS 0.0  JOY 0.0
SCIENTIFIC 0.2  KNOWLEDGE 0.2  WORK 0.2  RESEARCH 0.2  MATHEMATICS 0.2
11
Choose mixture weights for each document, then
generate a bag of words:
(P(z = 1), P(z = 2)) = (0, 1):
MATHEMATICS KNOWLEDGE RESEARCH WORK MATHEMATICS RESEARCH WORK SCIENTIFIC MATHEMATICS WORK
(0.25, 0.75):
SCIENTIFIC KNOWLEDGE MATHEMATICS SCIENTIFIC HEART LOVE TEARS KNOWLEDGE HEART
(0.5, 0.5):
MATHEMATICS HEART RESEARCH LOVE MATHEMATICS WORK TEARS SOUL KNOWLEDGE HEART
(0.75, 0.25):
WORK JOY SOUL TEARS MATHEMATICS TEARS LOVE LOVE LOVE SOUL
(1, 0):
TEARS LOVE JOY SOUL LOVE TEARS SOUL SOUL TEARS JOY
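This generative step is easy to state in code. A minimal sketch in Python (NumPy), using the toy vocabulary and topic distributions from the previous slide; the function name and structure are illustrative, not from the original deck:

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = ["HEART", "LOVE", "SOUL", "TEARS", "JOY",
         "SCIENTIFIC", "KNOWLEDGE", "WORK", "RESEARCH", "MATHEMATICS"]

# P(w | z): row j is topic j's distribution over the vocabulary
topics = np.array([
    [0.2, 0.2, 0.2, 0.2, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0],  # topic 1
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.2],  # topic 2
])

def generate_document(theta, n_words=10):
    """For each word: draw a topic z ~ theta, then a word w ~ P(w | z)."""
    words = []
    for _ in range(n_words):
        z = rng.choice(len(theta), p=theta)      # pick a topic
        w = rng.choice(len(vocab), p=topics[z])  # pick a word from that topic
        words.append(vocab[w])
    return words

for theta in [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]:
    print(theta, " ".join(generate_document(np.array(theta))))
```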
12
Geometric interpretation
[Figure: the word simplex, with axes P(TEARS), P(LOVE), P(RESEARCH)]
13
Geometric interpretation
[Figure: documents shown as points in the word simplex, with axes P(TEARS), P(LOVE), P(RESEARCH)]
14
Geometric interpretation
[Figure: TOPIC 1 and TOPIC 2 shown as points in the word simplex, with axes P(TEARS), P(LOVE), P(RESEARCH)]
15
Geometric interpretation
[Figure: documents lie between TOPIC 1 and TOPIC 2 in the word simplex, with axes P(TEARS), P(LOVE), P(RESEARCH)]
16
Geometric interpretation
[Figure: the document distribution P(w) lies in the word simplex S_W;
it is fit by projecting onto the sub-simplex spanned by P(w | z = 1),
P(w | z = 2), and P(w | z = 3): try to minimize KL divergence (the
minimum KL projection)]
(Hofmann, 1999)
17
Matrix factorization interpretation
[Diagram: the words x documents matrix P(w) factorizes into a words x
topics matrix P(w | z) times a topics x documents matrix P(z)]
Maximum-likelihood estimation finds the
factorization that minimizes KL divergence
(Hofmann, 1999)
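The factorization P(w | d) = Σ_z P(w | z) P(z | d) is literally a matrix product. A small sketch with random matrices (sizes and names are toy choices, not from the deck):

```python
import numpy as np

W, T, D = 10, 2, 5  # toy sizes: words, topics, documents
rng = np.random.default_rng(1)

phi = rng.dirichlet(np.ones(W), size=T).T    # W x T; column j is P(w | z = j)
theta = rng.dirichlet(np.ones(T), size=D).T  # T x D; column d is P(z | d)

p_w_given_d = phi @ theta                    # W x D; column d is P(w | d)
assert np.allclose(p_w_given_d.sum(axis=0), 1.0)  # each column is a distribution
```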
18
Three nice properties of topic models
  • Interpretability of topics
  • useful in many applications

19
Interpretable topics
JOB WORK JOBS CAREER EXPERIENCE EMPLOYMENT OPPORTUNITIES WORKING TRAINING SKILLS CAREERS POSITIONS FIND POSITION FIELD OCCUPATIONS REQUIRE OPPORTUNITY EARN ABLE
SCIENCE STUDY SCIENTISTS SCIENTIFIC KNOWLEDGE WORK RESEARCH CHEMISTRY TECHNOLOGY MANY MATHEMATICS BIOLOGY FIELD PHYSICS LABORATORY STUDIES WORLD SCIENTIST STUDYING SCIENCES
BALL GAME TEAM FOOTBALL BASEBALL PLAYERS PLAY FIELD PLAYER BASKETBALL COACH PLAYED PLAYING HIT TENNIS TEAMS GAMES SPORTS BAT TERRY
FIELD MAGNETIC MAGNET WIRE NEEDLE CURRENT COIL POLES IRON COMPASS LINES CORE ELECTRIC DIRECTION FORCE MAGNETS BE MAGNETISM POLE INDUCED
STORY STORIES TELL CHARACTER CHARACTERS AUTHOR READ TOLD SETTING TALES PLOT TELLING SHORT FICTION ACTION TRUE EVENTS TELLS TALE NOVEL
MIND WORLD DREAM DREAMS THOUGHT IMAGINATION MOMENT THOUGHTS OWN REAL LIFE IMAGINE SENSE CONSCIOUSNESS STRANGE FEELING WHOLE BEING MIGHT HOPE
DISEASE BACTERIA DISEASES GERMS FEVER CAUSE CAUSED SPREAD VIRUSES INFECTION VIRUS MICROORGANISMS PERSON INFECTIOUS COMMON CAUSING SMALLPOX BODY INFECTIONS CERTAIN
WATER FISH SEA SWIM SWIMMING POOL LIKE SHELL SHARK TANK SHELLS SHARKS DIVING DOLPHINS SWAM LONG SEAL DIVE DOLPHIN UNDERWATER
each column shows words from a single topic, ordered by P(w | z)
20
Three nice properties of topic models
  • Interpretability of topics
  • useful in many applications
  • Handling of multiple meanings or senses
  • better than spatial representations (e.g., LSI)

21
Handling multiple senses
[Same eight topics as on slide 19; note that FIELD appears with high
probability in the employment, sports, and magnetism topics, so its
sense is disambiguated by the topic that generated it]
each column shows words from a single topic, ordered by P(w | z)
22
Three nice properties of topic models
  • Interpretability of topics
  • useful in many applications
  • Handling of multiple meanings or senses
  • better than spatial representations (e.g., LSI)
  • Well-formulated generative models
  • many options for inferring topics from documents
  • supports extensions of the basic model

23
Outline
  • Topic models
  • Latent Dirichlet allocation
  • More complex models
  • Conclusions

24
Latent Dirichlet allocation (Blei, Ng, & Jordan, 2001; 2003)
θ(d) ~ Dirichlet(α)
z_i ~ Discrete(θ(d))
φ(j) ~ Dirichlet(β)
w_i ~ Discrete(φ(z_i))
[Plate diagram: z_i and w_i repeated for the N_d words in each of D
documents; φ(j) repeated for the T topics]
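The generative process in the plate diagram above, as a runnable sketch (symmetric hyperparameters assumed; names are illustrative):

```python
import numpy as np

def lda_generate(n_docs, doc_len, n_topics, n_words, alpha, beta, rng):
    """theta(d) ~ Dir(alpha); phi(j) ~ Dir(beta);
    z_i ~ Discrete(theta(d)); w_i ~ Discrete(phi(z_i))."""
    phi = rng.dirichlet(np.full(n_words, beta), size=n_topics)  # one phi per topic
    docs = []
    for _ in range(n_docs):
        theta = rng.dirichlet(np.full(n_topics, alpha))         # one theta per document
        z = rng.choice(n_topics, size=doc_len, p=theta)         # topic of each token
        w = [rng.choice(n_words, p=phi[zi]) for zi in z]        # word of each token
        docs.append(np.array(w))
    return docs, phi

rng = np.random.default_rng(0)
docs, phi = lda_generate(n_docs=100, doc_len=50, n_topics=5,
                         n_words=200, alpha=0.1, beta=0.01, rng=rng)
```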
25
Dirichlet priors
  • Multivariate equivalent of the Beta distribution
  • Hyperparameters α determine the form of the prior

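For reference, the standard form of this prior (not spelled out on the slide): a Dirichlet over a T-dimensional multinomial θ has density

$$ p(\theta \mid \alpha) = \frac{\Gamma\!\left(\sum_{j=1}^{T}\alpha_j\right)}{\prod_{j=1}^{T}\Gamma(\alpha_j)} \prod_{j=1}^{T} \theta_j^{\alpha_j - 1} $$

With symmetric hyperparameters, α_j < 1 favors sparse θ while α_j > 1 favors near-uniform θ, which is what the hyperparameter slide below exploits.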
26
Relationship to other models
  • A fully generative aspect model (Hofmann, 1999)
  • A multinomial version of PCA (Buntine, 2002)
  • Poisson models
  • non-negative matrix factorization (Lee & Seung, 1999)
  • GaP model (Canny, 2004)
  • discussed in detail in Buntine & Jakulin (2005)
  • Statistics: grade of membership (GoM) models (Erosheva, 2002)
  • Genetics: structure of populations (Pritchard, Stephens, & Donnelly, 2000)

27
Inverting the generative model
  • Maximum likelihood estimation (EM)
  • e.g., Hofmann (1999)
  • Deterministic approximate algorithms
  • variational EM: Blei, Ng, & Jordan (2001; 2003)
  • expectation propagation: Minka & Lafferty (2002)
  • Markov chain Monte Carlo
  • full Gibbs sampler: Pritchard et al. (2000)
  • collapsed Gibbs sampler: Griffiths & Steyvers (2004)

28
The collapsed Gibbs sampler
  • Using the conjugacy of the Dirichlet and
    multinomial distributions, integrate out the
    continuous parameters θ and φ
  • This defines a distribution on the discrete
    ensembles of topic assignments z

29
The collapsed Gibbs sampler
  • Sample each z_i conditioned on z_-i
  • This is nicer than your average Gibbs sampler:
  • memory: counts can be cached in two sparse
    matrices
  • optimization: no special functions, simple
    arithmetic
  • the distributions on θ and φ are analytic given z
    and w, and can later be found for each sample

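The conditional used in this step, as given by Griffiths & Steyvers (2004), is

$$ P(z_i = j \mid \mathbf{z}_{-i}, \mathbf{w}) \;\propto\; \frac{n^{(w_i)}_{-i,j} + \beta}{n^{(\cdot)}_{-i,j} + W\beta} \cdot \frac{n^{(d_i)}_{-i,j} + \alpha}{n^{(d_i)}_{-i,\cdot} + T\alpha} $$

where $n^{(w)}_{-i,j}$ counts assignments of word w to topic j excluding token i, W is the vocabulary size, and T the number of topics. A compact, unoptimized sketch of the sampler follows; the two count matrices are the cached "memory" mentioned above, and the per-token denominator constant is dropped since it does not affect the normalized conditional:

```python
import numpy as np

def collapsed_gibbs(docs, n_topics, n_words, alpha, beta, n_iters, rng):
    """Resample each z_i from its conditional given all other assignments."""
    nwt = np.full((n_words, n_topics), beta)     # word-topic counts (+ beta)
    ndt = np.full((len(docs), n_topics), alpha)  # doc-topic counts (+ alpha)
    nt = np.full(n_topics, n_words * beta)       # tokens per topic (+ W * beta)
    z = [rng.integers(n_topics, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):               # initialize the counts
        for i, w in enumerate(doc):
            t = z[d][i]
            nwt[w, t] += 1; ndt[d, t] += 1; nt[t] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                      # remove token i from the counts
                nwt[w, t] -= 1; ndt[d, t] -= 1; nt[t] -= 1
                p = (nwt[w] / nt) * ndt[d]       # conditional, up to a constant
                t = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = t                      # add it back under the new topic
                nwt[w, t] += 1; ndt[d, t] += 1; nt[t] += 1
    return z, nwt, ndt

# e.g., on the corpus sampled earlier:
# z, nwt, ndt = collapsed_gibbs(docs, n_topics=5, n_words=200,
#                               alpha=0.1, beta=0.01, n_iters=200, rng=rng)
```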
30
Gibbs sampling in LDA
[Slides 30-38: animation of the word-topic assignments z being
resampled, shown at iteration 1, iteration 2, ..., and iteration 1000]
39
A visual example: bars
sample each pixel from a mixture of topics
(pixel = word, image = document)
40
(No Transcript)
41
(No Transcript)
42
Effects of hyperparameters
  • α and β control the relative sparsity of θ and φ
  • smaller α: fewer topics per document
  • smaller β: fewer words per topic
  • Good assignments z strike a compromise between
    the two kinds of sparsity
[Figure: log Γ(x) as a function of x]
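A quick way to see the sparsity effect (a sketch; it prints how concentrated a symmetric Dirichlet draw is for several values of α):

```python
import numpy as np

rng = np.random.default_rng(0)
for a in [10.0, 1.0, 0.1, 0.01]:
    theta = rng.dirichlet(np.full(10, a))  # symmetric Dirichlet over 10 topics
    print(f"alpha={a:>5}: max weight={theta.max():.2f}, "
          f"weights > 1%: {(theta > 0.01).sum()}")
```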
43
Analysis of PNAS abstracts
  • Test topic models with a real database of
    scientific papers from PNAS
  • All 28,154 abstracts from 1991-2001
  • All words occurring in at least five abstracts
    and not on the stop list (20,551 word types)
  • Total of 3,026,970 tokens in corpus

44
A selection of topics
STRUCTURE ANGSTROM CRYSTAL RESIDUES STRUCTURES STRUCTURAL RESOLUTION HELIX THREE HELICES DETERMINED RAY CONFORMATION HELICAL HYDROPHOBIC SIDE DIMENSIONAL INTERACTIONS MOLECULE SURFACE
NEURONS BRAIN CORTEX CORTICAL OLFACTORY NUCLEUS NEURONAL LAYER RAT NUCLEI CEREBELLUM CEREBELLAR LATERAL CEREBRAL LAYERS GRANULE LABELED HIPPOCAMPUS AREAS THALAMIC
TUMOR CANCER TUMORS HUMAN CELLS BREAST MELANOMA GROWTH CARCINOMA PROSTATE NORMAL CELL METASTATIC MALIGNANT LUNG CANCERS MICE NUDE PRIMARY OVARIAN
MUSCLE CARDIAC HEART SKELETAL MYOCYTES VENTRICULAR MUSCLES SMOOTH HYPERTROPHY DYSTROPHIN HEARTS CONTRACTION FIBERS FUNCTION TISSUE RAT MYOCARDIAL ISOLATED MYOD FAILURE
HIV VIRUS INFECTED IMMUNODEFICIENCY CD4 INFECTION HUMAN VIRAL TAT GP120 REPLICATION TYPE ENVELOPE AIDS REV BLOOD CCR5 INDIVIDUALS ENV PERIPHERAL
FORCE SURFACE MOLECULES SOLUTION SURFACES MICROSCOPY WATER FORCES PARTICLES STRENGTH POLYMER IONIC ATOMIC AQUEOUS MOLECULAR PROPERTIES LIQUID SOLUTIONS BEADS MECHANICAL
45
Cold topics
Hot topics
[Figure: topic prevalence in PNAS abstracts over time, contrasting
cold (declining) and hot (rising) topics]
46
Cold topics
Hot topics:
2 SPECIES GLOBAL CLIMATE CO2 WATER ENVIRONMENTAL YEARS MARINE CARBON DIVERSITY OCEAN EXTINCTION TERRESTRIAL COMMUNITY ABUNDANCE
134 MICE DEFICIENT NORMAL GENE NULL MOUSE TYPE HOMOZYGOUS ROLE KNOCKOUT DEVELOPMENT GENERATED LACKING ANIMALS REDUCED
179 APOPTOSIS DEATH CELL INDUCED BCL CELLS APOPTOTIC CASPASE FAS SURVIVAL PROGRAMMED MEDIATED INDUCTION CERAMIDE EXPRESSION
47
Cold topics:
37 CDNA AMINO SEQUENCE ACID PROTEIN ISOLATED ENCODING CLONED ACIDS IDENTITY CLONE EXPRESSED ENCODES RAT HOMOLOGY
289 KDA PROTEIN PURIFIED MOLECULAR MASS CHROMATOGRAPHY POLYPEPTIDE GEL SDS BAND APPARENT LABELED IDENTIFIED FRACTION DETECTED
75 ANTIBODY ANTIBODIES MONOCLONAL ANTIGEN IGG MAB SPECIFIC EPITOPE HUMAN MABS RECOGNIZED SERA EPITOPES DIRECTED NEUTRALIZING
Hot topics:
2 SPECIES GLOBAL CLIMATE CO2 WATER ENVIRONMENTAL YEARS MARINE CARBON DIVERSITY OCEAN EXTINCTION TERRESTRIAL COMMUNITY ABUNDANCE
134 MICE DEFICIENT NORMAL GENE NULL MOUSE TYPE HOMOZYGOUS ROLE KNOCKOUT DEVELOPMENT GENERATED LACKING ANIMALS REDUCED
179 APOPTOSIS DEATH CELL INDUCED BCL CELLS APOPTOTIC CASPASE FAS SURVIVAL PROGRAMMED MEDIATED INDUCTION CERAMIDE EXPRESSION
48
Outline
  • Topic models
  • Latent Dirichlet allocation
  • More complex models
  • Conclusions

49
Extensions to the basic model
  • No need for a pre-segmented corpus
  • useful for modeling discussions, meetings

50
A model for meetings
s_u ~ Bernoulli(π)
θ(u) | s_u = 0 ~ Delta(θ(u-1))
θ(u) | s_u = 1 ~ Dirichlet(α)
z_i ~ Discrete(θ(u))
φ(j) ~ Dirichlet(β)
w_i ~ Discrete(φ(z_i))
[Plate diagram: z_i and w_i repeated for the N_u words in each of U
utterances; φ(j) repeated for the T topics]
51
Sample of the ICSI meeting corpus (25 meetings)
  • no it's o_k.
  • it's it'll work.
  • well i can do that.
  • but then i have to end the presentation in the
    middle so i can go back to open up javabayes.
  • o_k fine.
  • here let's see if i can.
  • alright.
  • very nice.
  • is that better.
  • yeah.
  • o_k.
  • uh i'll also get rid of this click to add notes.
  • o_k. perfect
  • NEW TOPIC (not supplied to algorithm)
  • so then the features we decided or we decided we
    were talked about.
  • right.
  • uh the the prosody the discourse verb choice.
  • you know we had a list of things like to go and
    to visit and what not.
  • the landmark-iness of uh.

52
Topic segmentation applied to meetings
Inferred Segmentation
Inferred Topics
53
Comparison with human judgments
Topics recovered are much more coherent than
those found using random segmentation, no
segmentation, or an HMM
54
Extensions to the basic model
  • No need for a pre-segmented corpus
  • Learning the number of topics

55
Learning the number of topics
  • Can use standard Bayes factor methods to evaluate
    models of different dimensionality
  • e.g., importance sampling via MCMC
  • Alternative: nonparametric Bayes
  • fixed number of topics per document, unbounded
    number of topics per corpus
    (Blei, Griffiths, Jordan, & Tenenbaum, 2004)
  • unbounded number of topics for both: the
    hierarchical Dirichlet process
    (Teh, Jordan, Beal, & Blei, 2004)

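One standard instance of this, the estimator used by Griffiths & Steyvers (2004), is the harmonic mean of the likelihoods of MCMC samples from the posterior:

$$ P(\mathbf{w} \mid T) \approx \left( \frac{1}{M}\sum_{m=1}^{M} \frac{1}{P(\mathbf{w} \mid \mathbf{z}^{(m)})} \right)^{-1}, \qquad \mathbf{z}^{(m)} \sim P(\mathbf{z} \mid \mathbf{w}, T) $$

It is cheap to compute from the Gibbs samples, though known to have high variance.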
56
Extensions to the basic model
  • No need for a pre-segmented corpus
  • Learning the number of topics
  • Learning topic hierarchies

57
Learning topic hierarchies
  • Fixed hierarchies: Hofmann & Puzicha (1998)
  • Learning hierarchies: Blei et al. (2004)
[Figure: a topic hierarchy with root Topic 0, children Topic 1.1 and
Topic 1.2, and leaves Topic 2.1, Topic 2.2, Topic 2.3]
58
Learning topic hierarchies
The topics in each document form a path from root to leaf
  • Fixed hierarchies: Hofmann & Puzicha (1998)
  • Learning hierarchies: Blei et al. (2004)
[Figure: the same topic hierarchy, with root Topic 0, children Topic
1.1 and Topic 1.2, and leaves Topic 2.1, Topic 2.2, Topic 2.3]
59
Twelve years of NIPS
(Blei, Griffiths, Jordan, & Tenenbaum, 2004)
60
Extensions to the basic model
  • No need for a pre-segmented corpus
  • Learning the number of topics
  • Learning topic hierarchies
  • Modeling authors as well as documents

61
The Author-Topic model (Rosen-Zvi, Griffiths, Smyth, & Steyvers, 2004)
θ(a) ~ Dirichlet(α)
x_i ~ Uniform(A(d))
z_i ~ Discrete(θ(x_i))
φ(j) ~ Dirichlet(β)
w_i ~ Discrete(φ(z_i))
[Plate diagram: x_i, z_i, and w_i repeated for the N_d words in each
of D documents; θ(a) repeated for the A authors; φ(j) for the T topics]
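A generative sketch of the model above (illustrative names; symmetric hyperparameters assumed):

```python
import numpy as np

def author_topic_generate(doc_authors, doc_len, n_topics, n_words,
                          n_authors, alpha, beta, rng):
    """x_i ~ Uniform(A(d)); z_i ~ Discrete(theta(x_i)); w_i ~ Discrete(phi(z_i))."""
    theta = rng.dirichlet(np.full(n_topics, alpha), size=n_authors)  # theta(a)
    phi = rng.dirichlet(np.full(n_words, beta), size=n_topics)       # phi(j)
    docs = []
    for authors in doc_authors:        # authors: list of author ids for this doc
        words = []
        for _ in range(doc_len):
            x = rng.choice(authors)                       # pick one of the authors
            z = rng.choice(n_topics, p=theta[x])          # topic from that author
            words.append(rng.choice(n_words, p=phi[z]))   # word from that topic
        docs.append(words)
    return docs
```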
62
Four example topics from NIPS
63
Who wrote what? (the digit after each content word marks the author, 1 or 2, inferred for that word)
  • A method1 is described which like the kernel1
    trick1 in support1 vector1 machines1 SVMs1 lets
    us generalize distance1 based2 algorithms to
    operate in feature1 spaces usually nonlinearly
    related to the input1 space This is done by
    identifying a class of kernels1 which can be
    represented as norm1 based2 distances1 in Hilbert
    spaces It turns1 out that common kernel1
    algorithms such as SVMs1 and kernel1 PCA1 are
    actually really distance1 based2 algorithms and
    can be run2 with that class of kernels1 too As
    well as providing1 a useful new insight1 into how
    these algorithms work the present2 work can form
    the basis1 for conceiving new algorithms
  • This paper presents2 a comprehensive approach for
    model2 based2 diagnosis2 which includes proposals
    for characterizing and computing2 preferred2
    diagnoses2 assuming that the system2 description2
    is augmented with a system2 structure2 a
    directed2 graph2 explicating the interconnections
    between system2 components2 Specifically we first
    introduce the notion of a consequence2 which is a
    syntactically2 unconstrained propositional2
    sentence2 that characterizes all consistency2
    based2 diagnoses2 and show2 that standard2
    characterizations of diagnoses2 such as minimal
    conflicts1 correspond to syntactic2 variations1
    on a consequence2 Second we propose a new
    syntactic2 variation on the consequence2 known as
    negation2 normal form NNF and discuss its merits
    compared to standard variations Third we
    introduce a basic algorithm2 for computing
    consequences in NNF given a structured system2
    description We show that if the system2
    structure2 does not contain cycles2 then there is
    always a linear size2 consequence2 in NNF which
    can be computed in linear time2 For arbitrary1
    system2 structures2 we show a precise connection
    between the complexity2 of computing2
    consequences and the topology of the underlying
    system2 structure2 Finally we present2 an
    algorithm2 that enumerates2 the preferred2
    diagnoses2 characterized by a consequence2 The
    algorithm2 is shown1 to take linear time2 in the
    size2 of the consequence2 if the preference
    criterion1 satisfies some general conditions

Written by (1) Scholkopf_B
Written by (2) Darwiche_A
64
Extensions to the basic model
  • No need for a pre-segmented corpus
  • Learning the number of topics
  • Learning topic hierarchies
  • Modeling authors as well as documents
  • Combining topics and syntax

65
Combining topics and syntax
semantics: probabilistic topics
syntax: probabilistic regular grammar
Factorization of language based on statistical
dependency patterns: long-range, document-specific
dependencies (semantics) vs. short-range
dependencies constant across all documents (syntax)
[Figure: graphical model in which a document distribution θ generates
topic assignments z for content words w, while a Markov chain over
classes x captures the short-range syntactic structure]
(Griffiths, Steyvers, Blei, & Tenenbaum, 2005)
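A sketch of the composite generative process (my own simplification: class 0 is assumed to be the single semantic class, and all parameters are toy inputs rather than the paper's learned values):

```python
import numpy as np

def generate(doc_len, theta, start, trans, topics, classes, rng):
    """HMM over classes x_i; the semantic class emits from the document's
    topic mixture, other classes emit from class-specific multinomials."""
    words = []
    x = rng.choice(len(start), p=start)               # initial syntactic class
    for _ in range(doc_len):
        if x == 0:                                    # semantic class -> topic model
            z = rng.choice(len(theta), p=theta)       # z_i ~ Discrete(theta(d))
            words.append(rng.choice(topics.shape[1], p=topics[z]))
        else:                                         # syntactic class -> multinomial
            words.append(rng.choice(classes.shape[1], p=classes[x]))
        x = rng.choice(len(trans[x]), p=trans[x])     # next class via HMM transition
    return words
```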
66
Semantic topics
PLANTS PLANT LEAVES SEEDS SOIL ROOTS FLOWERS WATER FOOD GREEN SEED STEMS FLOWER STEM LEAF ANIMALS ROOT POLLEN GROWING GROW
GOLD IRON SILVER COPPER METAL METALS STEEL CLAY LEAD ADAM ORE ALUMINUM MINERAL MINE STONE MINERALS POT MINING MINERS TIN
BEHAVIOR SELF INDIVIDUAL PERSONALITY RESPONSE SOCIAL EMOTIONAL LEARNING FEELINGS PSYCHOLOGISTS INDIVIDUALS PSYCHOLOGICAL EXPERIENCES ENVIRONMENT HUMAN RESPONSES BEHAVIORS ATTITUDES PSYCHOLOGY PERSON
CELLS CELL ORGANISMS ALGAE BACTERIA MICROSCOPE MEMBRANE ORGANISM FOOD LIVING FUNGI MOLD MATERIALS NUCLEUS CELLED STRUCTURES MATERIAL STRUCTURE GREEN MOLDS
DOCTOR PATIENT HEALTH HOSPITAL MEDICAL CARE PATIENTS NURSE DOCTORS MEDICINE NURSING TREATMENT NURSES PHYSICIAN HOSPITALS DR SICK ASSISTANT EMERGENCY PRACTICE
BOOK BOOKS READING INFORMATION LIBRARY REPORT PAGE TITLE SUBJECT PAGES GUIDE WORDS MATERIAL ARTICLE ARTICLES WORD FACTS AUTHOR REFERENCE NOTE
MAP NORTH EARTH SOUTH POLE MAPS EQUATOR WEST LINES EAST AUSTRALIA GLOBE POLES HEMISPHERE LATITUDE PLACES LAND WORLD COMPASS CONTINENTS
FOOD FOODS BODY NUTRIENTS DIET FAT SUGAR ENERGY MILK EATING FRUITS VEGETABLES WEIGHT FATS NEEDS CARBOHYDRATES VITAMINS CALORIES PROTEIN MINERALS
67
Syntactic classes
BE MAKE GET HAVE GO TAKE DO FIND USE SEE HELP KEEP GIVE LOOK COME WORK MOVE LIVE EAT BECOME
ONE SOME MANY TWO EACH ALL MOST ANY THREE THIS EVERY SEVERAL FOUR FIVE BOTH TEN SIX MUCH TWENTY EIGHT
HE YOU THEY I SHE WE IT PEOPLE EVERYONE OTHERS SCIENTISTS SOMEONE WHO NOBODY ONE SOMETHING ANYONE EVERYBODY SOME THEN
MORE SUCH LESS MUCH KNOWN JUST BETTER RATHER GREATER HIGHER LARGER LONGER FASTER EXACTLY SMALLER SOMETHING BIGGER FEWER LOWER ALMOST
ON AT INTO FROM WITH THROUGH OVER AROUND AGAINST ACROSS UPON TOWARD UNDER ALONG NEAR BEHIND OFF ABOVE DOWN BEFORE
THE HIS THEIR YOUR HER ITS MY OUR THIS THESE A AN THAT NEW THOSE EACH MR ANY MRS ALL
GOOD SMALL NEW IMPORTANT GREAT LITTLE LARGE BIG LONG HIGH DIFFERENT SPECIAL OLD STRONG YOUNG COMMON WHITE SINGLE CERTAIN
SAID ASKED THOUGHT TOLD SAYS MEANS CALLED CRIED SHOWS ANSWERED TELLS REPLIED SHOUTED EXPLAINED LAUGHED MEANT WROTE SHOWED BELIEVED WHISPERED
68
Outline
  • Topic models
  • Latent Dirichlet allocation
  • More complex models
  • Conclusions

69
Conclusions
  • Topic models are a flexible class of models that
    can capture the content of documents
  • Dirichlet priors can be used to assert a
    preference for sparsity in multinomials
  • The collapsed Gibbs sampler makes it possible to
    exploit these priors, and is simple to use
  • easy to extend to other aspects of language
  • and applicable in a broad range of models

70
(No Transcript)
71
(Dumais, Landauer)
[Figure with axis label P(w)]
72
(No Transcript)
73
(No Transcript)
74
NIPS support vector topic
75
NIPS neural network topic
76
NIPS Semantics
KERNEL SUPPORT VECTOR SVM KERNELS SPACE FUNCTION MACHINES SET
NETWORK NEURAL NETWORKS OUTPUT INPUT TRAINING INPUTS WEIGHTS OUTPUTS
EXPERTS EXPERT GATING HME ARCHITECTURE MIXTURE LEARNING MIXTURES FUNCTION GATE
MEMBRANE SYNAPTIC CELL CURRENT DENDRITIC POTENTIAL NEURON CONDUCTANCE CHANNELS
IMAGE IMAGES OBJECT OBJECTS FEATURE RECOGNITION VIEWS PIXEL VISUAL
DATA GAUSSIAN MIXTURE LIKELIHOOD POSTERIOR PRIOR DISTRIBUTION EM BAYESIAN PARAMETERS
STATE POLICY VALUE FUNCTION ACTION REINFORCEMENT LEARNING CLASSES OPTIMAL
NIPS Syntax
IN WITH FOR ON FROM AT USING INTO OVER WITHIN
IS WAS HAS BECOMES DENOTES BEING REMAINS REPRESENTS EXISTS SEEMS
I X T N - C F P
SEE SHOW NOTE CONSIDER ASSUME PRESENT NEED PROPOSE DESCRIBE SUGGEST
MODEL ALGORITHM SYSTEM CASE PROBLEM NETWORK METHOD APPROACH PAPER PROCESS
HOWEVER ALSO THEN THUS THEREFORE FIRST HERE NOW HENCE FINALLY
USED TRAINED OBTAINED DESCRIBED GIVEN FOUND PRESENTED DEFINED GENERATED SHOWN
(Griffiths, Steyvers, Blei, & Tenenbaum, 2005)
77
(No Transcript)
78
Varying α
79
Varying β
80
Number of words per topic
[Figure: number of words assigned to each topic]
81
(No Transcript)