Computing the Marginal Likelihood - PowerPoint PPT Presentation

1
Computing the Marginal Likelihood
David Madigan
2
Bayesian Criterion
  • The marginal likelihood p(D | M) is typically impossible to compute analytically
  • All sorts of Monte Carlo approximations are available
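The quantity at issue (shown only as an image on the slide) is the marginal likelihood of the data under model M, in standard notation:

```latex
p(D \mid M) \;=\; \int p(D \mid \theta, M)\, p(\theta \mid M)\, d\theta
```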

3
Laplace Method for p(D | M)
Define h(θ) as the log of the integrand divided by n, i.e. h(θ) = (1/n) log[ p(D | θ) p(θ) ]
Laplace's Method expands h about its mode
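The slide's equation was not transcribed; the standard Laplace approximation it refers to, with θ̃ the posterior mode, d = dim(θ), and Σ̃ the inverse of the negative Hessian of the log-integrand at θ̃, is:

```latex
p(D \mid M) \;\approx\; (2\pi)^{d/2}\, |\tilde\Sigma|^{1/2}\, p(D \mid \tilde\theta)\, p(\tilde\theta)
```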
4
Laplace cont.
  • Tierney and Kadane (1986, JASA) show the
    approximation error is O(n^-1)
  • Using the MLE instead of the posterior mode is
    also O(n^-1)
  • Using the expected information matrix in place of
    the observed one is O(n^-1/2), but convenient since
    it is often computed by standard software
  • Raftery (1993) suggested approximating the posterior
    mode by a single Newton step starting at the MLE
  • Note the prior is explicit in these approximations

5
Monte Carlo Estimates of p(D | M)
Draw iid θ(1), …, θ(m) from the prior p(θ)
In practice this estimator has large variance
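The estimator referred to here averages the likelihood over prior draws: p̂(D | M) = (1/m) Σᵢ p(D | θ(i)) with θ(i) iid from p(θ). A minimal sketch on a toy beta-binomial model (the model and numbers are illustrative assumptions, not from the slides), chosen because the exact marginal likelihood is known in closed form:

```python
import math
import random

random.seed(0)

# Toy model (an assumption for illustration): Beta(1,1) = Uniform(0,1)
# prior on a coin's heads probability theta; data D = 7 heads in 10 tosses.
n, k = 10, 7

def likelihood(theta):
    # Binomial likelihood p(D | theta)
    return math.comb(n, k) * theta**k * (1 - theta)**(n - k)

# Simple Monte Carlo: average the likelihood over iid prior draws.
m = 200_000
estimate = sum(likelihood(random.random()) for _ in range(m)) / m

# Under a Beta(1,1) prior the exact marginal likelihood is 1/(n+1).
exact = 1.0 / (n + 1)
print(estimate, exact)
```

With a diffuse prior most draws land where the likelihood is tiny, which is exactly the large-variance problem the slide notes.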
6
Monte Carlo Estimates of p(D | M) (cont.)
Draw iid θ(1), …, θ(m) from the posterior p(θ | D)
Importance Sampling
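The slide's formula was not transcribed; the standard importance-sampling estimator it points to, for an importance density q (of which the posterior p(θ | D) is one choice), is:

```latex
\hat p(D \mid M) \;=\; \frac{1}{m} \sum_{i=1}^{m}
\frac{p(D \mid \theta^{(i)})\, p(\theta^{(i)})}{q(\theta^{(i)})},
\qquad \theta^{(i)} \stackrel{iid}{\sim} q(\theta)
```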
7
Monte Carlo Estimates of p(D | M) (cont.)
Newton and Raftery's Harmonic Mean Estimator:
unstable in practice and needs modification
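The harmonic mean estimator (the slide's equation was an image; this is the standard Newton-Raftery form) uses posterior draws only:

```latex
\hat p(D \mid M) \;=\; \left\{ \frac{1}{m} \sum_{i=1}^{m}
\frac{1}{p(D \mid \theta^{(i)})} \right\}^{-1},
\qquad \theta^{(i)} \sim p(\theta \mid D)
```

Its instability comes from occasional draws with very small likelihood, which dominate the sum of reciprocals.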
8
p(D | M) from Gibbs Sampler Output
First note the following identity (for any θ):
p(D | θ) and p(θ) are usually easy to
evaluate. What about p(θ | D)?
Suppose we decompose θ into (θ1, θ2) such that
p(θ1 | D, θ2) and p(θ2 | D, θ1) are available in
closed form
Chib (1995)
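The identity in question (untranscribed on the slide) is the basic marginal likelihood identity underlying Chib's method:

```latex
p(D \mid M) \;=\; \frac{p(D \mid \theta)\, p(\theta)}{p(\theta \mid D)}
\qquad \text{for any } \theta
```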
9
p(D | M) from Gibbs Sampler Output
The Gibbs sampler gives (dependent) draws from
p(θ1, θ2 | D) and hence marginally from p(θ2 | D)
Rao-Blackwellization
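The Rao-Blackwellized estimate of the posterior ordinate at a fixed point θ1* (the slide's formula was an image; this is the standard form) averages the closed-form conditional over the Gibbs draws of θ2:

```latex
\hat p(\theta_1^{*} \mid D) \;=\; \frac{1}{m} \sum_{g=1}^{m}
p(\theta_1^{*} \mid D, \theta_2^{(g)}),
\qquad \theta_2^{(g)} \sim p(\theta_2 \mid D)
```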
10
What about 3 parameter blocks?
Factor p(θ1*, θ2*, θ3* | D) = p(θ1* | D) × p(θ2* | D, θ1*) × p(θ3* | D, θ1*, θ2*):
the first and third factors are OK, but the middle
factor p(θ2* | D, θ1*) is not directly available.
To get the draws needed for it, continue the Gibbs
sampler with θ1 fixed at θ1*, sampling in turn from
p(θ2 | D, θ1*, θ3) and p(θ3 | D, θ1*, θ2)
11
p(D | M) from Metropolis Output
Chib and Jeliazkov (JASA, 2001)
12
p(D | M) from Metropolis Output
E1 is taken with respect to the posterior p(θ | y)
E2 is taken with respect to the proposal density q(θ*, θ)
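The Chib-Jeliazkov identity for the posterior ordinate at a point θ*, with α(θ, θ′ | y) the Metropolis-Hastings acceptance probability and q the proposal density (the slide's equation was not transcribed; this is the standard form):

```latex
p(\theta^{*} \mid y) \;=\;
\frac{E_{1}\!\left[\, \alpha(\theta, \theta^{*} \mid y)\, q(\theta, \theta^{*}) \,\right]}
     {E_{2}\!\left[\, \alpha(\theta^{*}, \theta \mid y) \,\right]}
```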
13
Bayesian Information Criterion
(SL is the negative log-likelihood)
  • BIC gives an O(1) approximation to log p(D | M)
  • Circumvents the need for an explicit prior
  • The approximation improves to O(n^-1/2) for a class
    of priors called unit information priors
  • No free lunch (Weakliem (1998) example)
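With the slide's convention that SL is the negative log-likelihood (evaluated at the MLE θ̂, with d parameters and n observations), the criterion and its connection to the marginal likelihood are:

```latex
\mathrm{BIC} \;=\; 2\, SL(\hat\theta) \;+\; d \log n,
\qquad \log p(D \mid M) \;\approx\; -\tfrac{1}{2}\,\mathrm{BIC}
```

Sign and scaling conventions for BIC vary across texts; this form matches the SL convention stated above.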

14
(No Transcript)
15
(No Transcript)
16
Score Functions on Hold-Out Data
  • Instead of penalizing complexity, look at
    performance on hold-out data
  • Note that even with hold-out data, performance
    results can be optimistically biased
  • Pseudo-hold-out Score
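The pseudo-hold-out score is commonly defined as the sum of leave-one-out log predictive densities (the slide's formula was not transcribed; this is the standard cross-validated form, with y_{-i} denoting the data with observation i removed):

```latex
\sum_{i=1}^{n} \log p(y_i \mid y_{-i}, M)
```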
