1
Dirichlet Prior Sieves in Finite Normal Mixtures
By: Hemant Ishwaran and Mahmoud Zarepour
  • John Paisley

2
Dirichlet Distribution
Reparameterize:
Written this way, alpha is a scalar and g is a pmf
(the display is reconstructed below).
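The slide's display did not survive the transcript; a standard way to
write this reparameterization (my reconstruction, not copied from the
slide) is

\[
(\pi_1,\dots,\pi_k) \sim \mathrm{Dirichlet}(\alpha_1,\dots,\alpha_k)
  = \mathrm{Dirichlet}(\alpha g_1,\dots,\alpha g_k),
\qquad
\alpha = \sum_{i=1}^{k} \alpha_i, \quad g_i = \alpha_i / \alpha .
\]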
3
How do you draw from a Dirichlet distribution?
  • Two finite, exact methods (a sketch of both
    follows this list)
  • Using Gamma distributions
  • Using exact stick-breaking
  • With \pi_1 set to v_1 and \pi_k getting the rest
    of the stick
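A minimal sketch of the two exact methods (NumPy; the function names
and the toy values of alpha and g are mine, not from the slides):

import numpy as np

def dirichlet_via_gammas(alpha, g, rng):
    # Exact draw: normalize independent Gamma(alpha * g_i, 1) variables.
    gam = rng.gamma(shape=alpha * np.asarray(g, dtype=float), scale=1.0)
    return gam / gam.sum()

def dirichlet_via_stick_breaking(alpha, g, rng):
    # Exact finite stick-breaking: V_i ~ Beta(alpha*g_i, alpha*sum_{j>i} g_j),
    # with pi_1 = V_1 and pi_k taking whatever stick is left.
    a = alpha * np.asarray(g, dtype=float)
    k = len(a)
    pi = np.zeros(k)
    remaining = 1.0
    for i in range(k - 1):
        v = rng.beta(a[i], a[i + 1:].sum())
        pi[i] = v * remaining
        remaining *= 1.0 - v
    pi[-1] = remaining
    return pi

rng = np.random.default_rng(0)
g = np.array([0.2, 0.3, 0.5])   # hypothetical base pmf
alpha = 5.0
print(dirichlet_via_gammas(alpha, g, rng))
print(dirichlet_via_stick_breaking(alpha, g, rng))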

4
How do you draw from a Dirichlet distribution? (2)
  • Two converging methods
  • Polya urn (a sketch follows this list)
  • alpha balls, distributed by color according to g,
    are placed in an urn. Balls are then drawn with
    replacement, and an extra ball of the same color
    as the one drawn is placed back in the urn. The
    empirical distribution of colors in the urn
    converges to a draw from the Dirichlet
    distribution as the number of draws approaches
    infinity.
  • Sethuraman's Stick-Breaking
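A sketch of the Polya urn scheme as described above (the function name
and toy inputs are mine):

import numpy as np

def polya_urn(alpha, g, n_draws, rng):
    # Start with pseudo-counts alpha * g_i per color; repeatedly draw a
    # color with probability proportional to its current count and add
    # one more ball of that color.  The urn composition converges to a
    # Dirichlet(alpha * g) draw as n_draws grows.
    counts = alpha * np.asarray(g, dtype=float)
    for _ in range(n_draws):
        color = rng.choice(len(counts), p=counts / counts.sum())
        counts[color] += 1.0
    return counts / counts.sum()

rng = np.random.default_rng(1)
print(polya_urn(alpha=5.0, g=[0.2, 0.3, 0.5], n_draws=10_000, rng=rng))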

5
Sethuraman's Stick-Breaking
  • To draw from the distribution below (reconstructed
    after this list), do the following
  • Sethuraman's discovery allows one to draw from a
    Dirichlet distribution by drawing the weights and
    the components independently of one another. The
    final pi values are not independent.
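The display did not survive the transcript; the standard statement of
Sethuraman's construction (my reconstruction) is

\[
G = \sum_{i=1}^{\infty} \pi_i \, \delta_{Y_i}, \qquad
Y_i \overset{iid}{\sim} g, \qquad
V_i \overset{iid}{\sim} \mathrm{Beta}(1, \alpha), \qquad
\pi_1 = V_1, \quad \pi_i = V_i \prod_{j<i} (1 - V_j).
\]

With a finite base pmf g, summing the weights of atoms that land on
the same category gives a vector distributed as
\mathrm{Dirichlet}(\alpha g_1, \dots, \alpha g_k); with a continuous
base measure G_0, the same construction gives a draw from the
Dirichlet process DP(\alpha G_0).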

6
Dirichlet Process
(Movie in the original slides.) In the DP, we enforce
that G_0(B_i) = 1/k. What does that look like as a
function of k? I don't have the math proof, but I say
that these regions converge to points, and a uniform
draw of a region (Dirichlet component) is the same as
drawing a point from G_0.
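The property being appealed to here (a standard defining fact about
the DP; the display is not in the transcript): for any measurable
partition B_1, \dots, B_k of the space,

\[
\bigl( G(B_1), \dots, G(B_k) \bigr) \sim
\mathrm{Dirichlet}\bigl( \alpha G_0(B_1), \dots, \alpha G_0(B_k) \bigr),
\]

so a partition with G_0(B_i) = 1/k gives a symmetric
\mathrm{Dirichlet}(\alpha/k, \dots, \alpha/k) over the k regions.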
7
Letting k go to infinity
  • Using a uniform prior, the breaks will become
    infinitely small, and since the probability of
    drawing the same Y twice is zero with a uniform
    G_0(B_i), you are left with G_0 in the limit.
  • For finite alpha, you have the Dirichlet process
    (obvious when this is written in the
    stick-breaking representation; a compact
    statement of the limit follows this list).
  • Also, notice that the exact methods are now
    impractical, but the two converging methods can
    be used to approximate the draw.
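A compact statement of the limit being described (the finite sieve
written in the setup above; the exact display is not in the
transcript, so this is my reconstruction):

\[
P_k = \sum_{i=1}^{k} \pi_i \, \delta_{Y_i}, \qquad
(\pi_1, \dots, \pi_k) \sim \mathrm{Dirichlet}(\alpha/k, \dots, \alpha/k),
\qquad Y_i \overset{iid}{\sim} G_0,
\]

and P_k converges in distribution to a draw from DP(\alpha G_0) as
k \to \infty.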

8
This Paper: How the DD Approximates the DP for
Mixture Modeling
  • After parameters are drawn, the mathematical
    forms are exactly the same (with a truncated DP)
  • For the DD, \pi comes from a Dirichlet distribution
  • For the DP, \pi comes from stick-breaking
  • The \thetas are drawn iid from G_0 in each case
  • The only question is: how does \pi differ between
    the two? (A small comparison follows this list.)
  • Answer: we assume for the DD prior that draws of Y
    never reduplicate, which is where we differ from
    the DP. However, if we set k large enough and
    alpha small enough for the DD, there can be a good
    probability that most of the stick (defined by the
    truncation error) will be allocated before
    duplicate copies of Y are drawn. When this
    happens, we've coincidentally drawn from a DP.
    Increasing k and/or reducing alpha increases the
    probability of this happening.
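A small comparison of the two weight priors (the toy values alpha = 1
and k = 500 are mine; both vectors put most of the stick on a few
components):

import numpy as np

rng = np.random.default_rng(2)
alpha, k = 1.0, 500

# DD (sieve) weights: symmetric Dirichlet(alpha/k, ..., alpha/k).
pi_dd = rng.dirichlet(np.full(k, alpha / k))

# Truncated DP weights: stick-breaking with V_i ~ Beta(1, alpha);
# the truncated vector sums to slightly less than 1.
v = rng.beta(1.0, alpha, size=k)
pi_dp = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))

print(np.sort(pi_dd)[::-1][:5])   # largest DD weights
print(np.sort(pi_dp)[::-1][:5])   # largest stick-breaking weights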

9
Summary
  • The DP is a sparseness-promoting prior on the
    weights, with a mathematical relation to the atoms
    that makes everything logical.
  • The DD, when alpha < k, is also a
    sparseness-promoting prior on the weights, but
    there is no strict mathematical relation to the
    components, making it more ad hoc (but not really).
  • In practice, however, they both serve the same
    ends.

10
Aside
  • For the generation of a DD mixture model, if we
    were to also draw the Y's just for fun, then on
    the occasions when the Y's are all unique, that
    specific draw is exactly the same as a draw from
    a DP (as I understand it).
  • Using Sethuraman's definition, and given a
    specific truncation error for the stick, we can
    get an expected number of necessary breaks (to
    meet that error) as a function of alpha, which is
    also the number of Y's we need to draw; call it
    N.
  • We can then calculate the probability of drawing
    duplicate Y's: we are making N uniform draws with
    replacement from the k components, so we can find
    the probability that all N draws are unique.
    Therefore, we can determine the probability that
    a draw from a DD mixture model is also a draw
    from a DP mixture model. (A sketch of this
    calculation follows this list.)
  • This paper says you can use a DD to approximate a
    DP. What this value would do is give you a
    probability, as a function of alpha and N, that is
    a measure of how close you are to a DP. Clearly,
    when N is greater than the number of components,
    that probability is zero.
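A sketch of the calculation described above (the formulas and names
are my own reading of the aside: the expected leftover stick after N
breaks is (alpha/(1+alpha))**N, and the all-unique probability is the
birthday-problem product over the k components):

import numpy as np

def breaks_for_error(alpha, eps):
    # Smallest N with expected leftover stick (alpha/(1+alpha))**N <= eps.
    return int(np.ceil(np.log(eps) / np.log(alpha / (1.0 + alpha))))

def prob_all_unique(n_draws, k):
    # P(all n_draws labels are distinct) when drawing uniformly with
    # replacement from k components; zero whenever n_draws > k.
    if n_draws > k:
        return 0.0
    return float(np.prod((k - np.arange(n_draws)) / k))

N = breaks_for_error(alpha=1.0, eps=1e-3)   # about 10 breaks
print(N, prob_all_unique(N, k=500))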