Title: The Dynamic Hierarchical Dirichlet Process
1 The Dynamic Hierarchical Dirichlet Process
The 25th International Conference on Machine Learning (ICML 2008)
Lu Ren, David B. Dunson and Lawrence Carin
Presenter: John Paisley
Duke University
07/06/08
2 Outline
- Dirichlet process (DP) mixture model
- Sharing statistical strength (HDP and dynamic DP)
- Dynamic hierarchical Dirichlet process (dHDP)
- dHDP for music segmentation (HMM)
- Time-evolving model for gene analysis (GMM)
- Conclusions and future work
3 Dirichlet Process (DP)
- Dirichlet process (DP): a measure on measures
- $G \sim \mathrm{DP}(\alpha, G_0)$, with precision parameter $\alpha$ and base measure $G_0$
- Good clustering property
- Nonparametric Bayesian prior for density estimation
- Explicit mathematical form: the stick-breaking process
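The stick-breaking form mentioned above can be illustrated with a minimal sketch. It assumes a truncation to a finite number of atoms and a standard normal base measure; the function name, the truncation level and the choice of base measure are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a DP with precision alpha."""
    # Each v_k is the fraction broken off the remaining stick.
    v = rng.beta(1.0, alpha, size=n_atoms)
    v[-1] = 1.0  # truncation: the last atom takes all remaining mass
    # Length of stick left before the k-th break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=2.0, n_atoms=50, rng=rng)
# Atom locations drawn from the base measure (assumed here to be N(0, 1)).
atoms = rng.normal(size=50)
```

A smaller `alpha` concentrates mass on the first few atoms, which is the clustering property noted in the list above.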
4 DP Mixture Model
Assume we have data points
- Infinite number of atoms
- infinite mixture model
Figure 1. Graphical model for DP mixture
Independence assumption
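The model equations on this slide did not survive extraction. For orientation, the standard DP mixture hierarchy is given below; the symbols $x_i$, $\theta_i$, $F$ and $N$ are assumed notation and may differ from the original slide.

```latex
\begin{align*}
x_i \mid \theta_i &\sim F(\theta_i), \qquad i = 1, \dots, N, \\
\theta_i \mid G &\sim G, \\
G &\sim \mathrm{DP}(\alpha, G_0).
\end{align*}
```

Because $G$ is almost surely discrete, several $\theta_i$ share the same value, which induces the clustering of the data points.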
5 Sharing Statistical Strength
- A recurring theme
- Separate observations into groups
- The groups remain linked
Two methods
1. Hierarchical Dirichlet Process (HDP)
- Assume the data are subdivided into groups
- Parameters are shared among groups
- Different groups are exchangeable
6 Hierarchical Dirichlet Process
Stick-breaking construction
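The equations for this slide were lost. As a reference point, the commonly used stick-breaking construction of the HDP (following Teh et al.) is reproduced below; the symbols $\gamma$, $\alpha_0$, $H$, $\beta_k$ and $\pi_{jk}$ are assumed notation and may not match the original slide.

```latex
\begin{align*}
\beta'_k &\sim \mathrm{Beta}(1, \gamma), &
\beta_k &= \beta'_k \prod_{l=1}^{k-1} (1 - \beta'_l), &
G_0 &= \sum_{k=1}^{\infty} \beta_k \, \delta_{\theta_k}, \quad \theta_k \sim H, \\
\pi'_{jk} &\sim \mathrm{Beta}\!\Big(\alpha_0 \beta_k,\; \alpha_0 \big(1 - \sum_{l=1}^{k} \beta_l\big)\Big), &
\pi_{jk} &= \pi'_{jk} \prod_{l=1}^{k-1} (1 - \pi'_{jl}), &
G_j &= \sum_{k=1}^{\infty} \pi_{jk} \, \delta_{\theta_k}.
\end{align*}
```

Every group measure $G_j$ uses the same atoms $\theta_k$ and differs only in its weights $\pi_{jk}$, which is how parameters are shared among groups.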
7 Dynamic Mixture DPs
2. Dynamic mixture of DPs (DMDP)
- Accommodates autocorrelation in the distributions
- The number of atoms (parameters) might be huge
8 Problem Definition
- Motivation
  - Data collected sequentially
  - Temporal evolution is assumed
  - Nonparametric prior and sparse model
- Solution
  - Dynamic hierarchical Dirichlet process (dHDP)
  - The parameters are shared globally
  - The mixture weights change dynamically
9 Dynamic HDP
10 Dynamic HDP
11 Dynamic HDP
Fig. 3 Graphical model for dHDP
12 Dynamic HDP
Two indicator variables are introduced for each observation.
The model specification
Fig. 4 Stick-breaking representation for dHDP
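The model equations on these slides were lost in extraction. The sketch below assumes a construction in which each group measure mixes the previous group's measure with a fresh innovation measure, G_j = (1 - w_{j-1}) G_{j-1} + w_{j-1} H_{j-1}, with every measure sharing the atoms of one global G0. This is an assumption about the intended model. The function names, the truncation level and all hyperparameter values are hypothetical.

```python
import numpy as np

def gem_weights(concentration, n_atoms, rng):
    """Truncated stick-breaking weights for the global measure."""
    v = rng.beta(1.0, concentration, size=n_atoms)
    v[-1] = 1.0  # truncation: last atom absorbs the remaining mass
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

def dhdp_group_weights(n_groups, n_atoms, gamma, alpha0, a_w, b_w, seed=0):
    """Simulate time-evolving mixture weights over globally shared atoms."""
    rng = np.random.default_rng(seed)
    beta = gem_weights(gamma, n_atoms, rng)
    # A DP whose base measure has finitely many atoms has Dirichlet weights.
    # The floor only guards against numerically zero parameters.
    base = np.maximum(alpha0 * beta, 1e-8)
    weights = np.empty((n_groups, n_atoms))
    weights[0] = rng.dirichlet(base)
    for j in range(1, n_groups):
        innovation = rng.dirichlet(base)   # weights of the innovation measure
        w = rng.beta(a_w, b_w)             # probability of innovation
        weights[j] = (1.0 - w) * weights[j - 1] + w * innovation
    return weights

group_weights = dhdp_group_weights(
    n_groups=6, n_atoms=20, gamma=1.0, alpha0=5.0, a_w=1.0, b_w=5.0
)
```

With `b_w` larger than `a_w`, the innovation probability is usually small, so neighbouring groups keep similar weights while the atoms stay shared across all groups.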
13 Dynamic HDP
Theorem 1
Theorem 2
14 Posterior Inference
- A modification of the block Gibbs sampler
- Collect the samples for each random variable
- Approximate the posterior distribution
- Details depend on the specific application:
1. For the HMM mixture
2. For the GMM
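The slides do not give the modified sampler itself. As a point of reference, one step of the standard truncated blocked Gibbs sampler for a plain DP (Ishwaran and James) resamples the stick-breaking weights from the current cluster counts. The sketch below shows only that step; it is not the dHDP sampler, and the function name and example counts are hypothetical.

```python
import numpy as np

def sample_stick_weights(counts, alpha, rng):
    """Resample truncated stick-breaking weights given cluster counts."""
    counts = np.asarray(counts, dtype=float)
    # tail[k] = number of observations assigned to components after k.
    tail = np.concatenate((np.cumsum(counts[::-1])[::-1][1:], [0.0]))
    # Conjugate update: v_k ~ Beta(1 + n_k, alpha + sum_{l > k} n_l).
    v = rng.beta(1.0 + counts[:-1], alpha + tail[:-1])
    v = np.append(v, 1.0)  # truncation: last stick takes the remainder
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

rng = np.random.default_rng(0)
new_weights = sample_stick_weights([5, 3, 2, 0], alpha=1.0, rng=rng)
```

In a full sampler this step alternates with resampling the component parameters and the indicator variables, and the collected draws approximate the posterior.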
15 Music Segmentation
- dHDP HMM mixture
- Contiguous parts are clustered together
- Segment changes are detected as innovations
The music: the Largo-Allegro movement of Beethoven's Piano Sonata No. 17, also referred to as "The Tempest".
Fig. 5 Auditory waveform of the sonata
- MFCC features extracted
- Discretized by a vector quantization (VQ) technique
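The discretization step can be sketched with a small k-means vector quantizer. The slides do not say which VQ algorithm or codebook size was used, so k-means, the codebook size of 8 and the random 13-dimensional frames standing in for MFCC features are all assumptions made for illustration.

```python
import numpy as np

def vector_quantize(features, n_codes, n_iter=20, seed=0):
    """Map each feature frame to the index of its nearest codeword."""
    rng = np.random.default_rng(seed)
    # Initialise the codebook from randomly chosen frames.
    start = rng.choice(len(features), size=n_codes, replace=False)
    codebook = features[start].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every frame to every codeword.
        d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        codes = d.argmin(axis=1)
        for k in range(n_codes):
            members = features[codes == k]
            if len(members) > 0:  # leave empty codewords where they are
                codebook[k] = members.mean(axis=0)
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1), codebook

rng = np.random.default_rng(1)
frames = rng.normal(size=(200, 13))  # placeholder for real MFCC frames
symbols, codebook = vector_quantize(frames, n_codes=8)
```

The resulting `symbols` sequence is the kind of discrete observation stream a discrete-emission HMM can model.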
16 Music Segmentation
Fig. 6 Segmentation result on the auditory waveform of the sonata
- Dominant and temporally localized auditory phenomena
- Makes the model sparse and keeps the temporal coherence
- Automatically annotates the music in a Bayesian setting
17 Music Segmentation
(a) dHDP-HMMs
(b) HDP-HMMs
Fig. 7 Similarity matrix from HMM mixture modeling
Temporal dependence makes the dHDP-HMM segmentation less sensitive to local temporal bursts than the HDP-HMM segmentation.
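The slides do not state how the similarity matrix was computed. One common choice, assumed here, is the posterior co-clustering frequency: the fraction of collected samples in which two time segments are assigned to the same mixture component. The function name and the toy indicator samples are hypothetical.

```python
import numpy as np

def coclustering_similarity(indicator_samples):
    """Fraction of posterior samples in which two items share a component."""
    z = np.asarray(indicator_samples)  # shape (n_samples, n_items)
    same = z[:, :, None] == z[:, None, :]
    return same.mean(axis=0)

# Three toy posterior samples of component indicators for four segments.
samples = np.array([
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [2, 2, 1, 1],
])
similarity = coclustering_similarity(samples)
```

Component labels only need to be consistent within a sample, so the measure is unaffected by label switching between samples.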
18 Gene Analysis
- Problem
  - Time-evolving modeling: disease development
  - Related genes: high-dimensional gene expression
Time after infection (t):  3 hr   6 hr   12 hr   24 hr   48 hr   72 hr
Number of samples:         10     12     12      10      12      9
Step 1: Prune the genes with the Fisher score
Step 2: dHDP mixture model developed for further analysis
Assume samples at each time point.
Consider:
1. Individual diversity of samples
2. Similar temporal pattern of infection level
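The Fisher-score pruning in Step 1 can be sketched as follows. The slides do not say which grouping the score was computed against, so the sketch assumes each sample carries a binary label and uses the usual two-class form (difference of class means squared over the sum of class variances). The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fisher_scores(X, y):
    """Two-class Fisher score for every column (gene) of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0)
    return num / (den + 1e-12)  # small constant avoids division by zero

def prune_genes(X, y, n_keep):
    """Indices of the n_keep genes with the highest Fisher score."""
    scores = fisher_scores(X, y)
    keep = np.argsort(scores)[::-1][:n_keep]
    return np.sort(keep)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))          # 20 samples, 5 synthetic genes
y = np.array([0] * 10 + [1] * 10)     # assumed binary labels
X[y == 1, 0] += 10.0                  # make gene 0 strongly discriminative
kept = prune_genes(X, y, n_keep=1)
```

Pruning in this way reduces the dimensionality before the mixture model is fitted, which matters when only about ten samples are available per time point.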
19 Gene Analysis
3. At each time point, a hidden variable represents the virus infection level for each sample; a p-dimensional vector has each component drawn i.i.d. from a Student-t distribution
Fig. 8 Time-evolving model for gene analysis (nodes T1, T2, T3, ..., TJ)
20 Gene Analysis
4. For each time point, the hidden variables are assumed to be drawn from a Gaussian mixture
Fig. 9 Median values and associated uncertainty based on posterior distributions of the hidden variables.
21 Gene Analysis
Fig. 10 The dHDP GMM modeling for the gene expression data. (a) Posterior distribution. (b) Similarity matrix.
22 Gene Analysis
Fig. 11 The first ten inferred important genes (colored red and blue) and the relatively unrelated genes (colored green).
23 Gene Analysis
dHDP encourages proper sharing to improve parameter estimation when the amount of data is limited.
Fig. 12 Similarity matrix with four samples for each temporal group. (a) HDP, (b) dHDP.
24 Gene Analysis
Fig. 13 Comparison of dHDP and HDP with box plots of the hidden variables when the sample size is reduced to four for each temporal group (the standard deviation based on dHDP is reduced by 12.1% on average relative to HDP; the means are very similar).
25 Gene Analysis
Fig. 14 Similarity matrix between data at different time points based on the correlation coefficients (Theorem 2), as computed from the dHDP posterior. (a) Using all available data, (b) using four samples for each temporal group.
26 Conclusions
- A nonparametric prior, the dynamic HDP, is proposed
- Time dependence is explored
- A modification of the block Gibbs sampler
- The dHDP HMM mixture for music modeling
- A temporally dependent model for analyzing the Dengue gene expression data
27 Thanks!