Bayesian Analysis of the Normal Distribution, Part II
1
Bayesian Analysis of the Normal Distribution,
Part II
  • Set-up of the basic model of a normally
    distributed random variable with unknown mean and
    variance (a two-parameter model).
  • Discussion of philosophies of prior selection.
  • Implementation of different priors, with a
    discussion of MCMC methods.

2
Different types of Bayesians choose different
priors
  • Classical Bayesians: the prior is a necessary
    evil.
  • → choose priors that inject the least
    information possible.
  • Modern Parametric Bayesians: the prior is a
    useful convenience.
  • → choose prior distributions with desirable
    properties (e.g. conjugacy). Given a
    distributional choice, prior parameters are
    chosen to inject the least information
    possible.
  • Subjective Bayesians: the prior is a summary of
    old beliefs.
  • → choose prior distributions based on previous
    knowledge, either the results of earlier studies
    or non-scientific opinion.

3
Modern Parametric Bayesians and the normal model
with unknown mean and variance
  • y ~ N(μ, σ²), where μ and σ² are both unknown
    random variables.
  • What prior distribution would a modern parametric
    Bayesian choose to satisfy the demands of
    convenience?
  • What if we used the definition of conditional
    probability, so that p(μ, σ²) = p(μ | σ²) p(σ²)?

4
Modern Parametric Bayesians and the normal model
with unknown mean and variance
  • y ~ N(μ, σ²), where μ and σ² are both unknown
    random variables.
  • A modern parametric Bayesian would typically
    choose a conjugate prior.
  • For the normal model with unknown mean and
    variance, the conjugate prior for the joint
    distribution of μ and σ² is the normal
    inverse-gamma distribution (i.e. the
    normal-inverse-χ²).
  • p(μ, σ²) ~ N-Inv-χ²(μ₀, σ₀²/κ₀; ν₀, σ₀²)
  • There are four parameters in the prior: μ₀, κ₀,
    ν₀ and σ₀².
5
  • Suppose p(μ, σ²) ~ N-Inv-χ²(μ₀, σ₀²/κ₀; ν₀, σ₀²).
  • It can be shown that the above expression can be
    factored such that
  • p(μ, σ²) = p(μ | σ²) p(σ²),
  • where μ | σ² ~ N(μ₀, σ²/κ₀) and
    σ² ~ Inv-χ²(ν₀, σ₀²).
  • Because this is a conjugate distribution for the
    normal distribution with unknown mean and
    variance, the posterior distribution will also be
    normal-inverse-χ².

6
The posterior distribution if y ~ N(μ, σ²) and
p(μ, σ²) ~ N-Inv-χ²(μ₀, σ₀²/κ₀; ν₀, σ₀²)
  • p(μ, σ² | y) ~ N-Inv-χ²(μₙ, σₙ²/κₙ; νₙ, σₙ²)

The posterior mean μₙ is a weighted average of the
prior mean and the data; the posterior sum of
squares νₙσₙ² is a weighted sum of the prior
variance, the sample variance, and the distance
between the sample and prior means (the standard
updating formulas are given below).
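The slide's formula images are not reproduced in
this transcript. As a reference sketch, assuming
the usual parameterization (e.g. Gelman et al.,
Bayesian Data Analysis, ch. 3), the standard
conjugate updates are:

\[
\mu_n = \frac{\kappa_0}{\kappa_0 + n}\,\mu_0
      + \frac{n}{\kappa_0 + n}\,\bar{y},
\qquad
\kappa_n = \kappa_0 + n,
\qquad
\nu_n = \nu_0 + n,
\]
\[
\nu_n \sigma_n^2 = \nu_0 \sigma_0^2 + (n-1)s^2
  + \frac{\kappa_0 n}{\kappa_0 + n}\,(\bar{y} - \mu_0)^2,
\]

where \(\bar{y}\) is the sample mean and \(s^2\) is
the sample variance.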
7
If p(μ, σ² | y) ~ N-Inv-χ²(μₙ, σₙ²/κₙ; νₙ, σₙ²), we
can factor the posterior distribution just like
the prior: μ | σ², y ~ N(μₙ, σ²/κₙ) and
σ² | y ~ Inv-χ²(νₙ, σₙ²).
These are essentially the same posterior
distributions as we found with the improper
priors. To implement the conjugate priors in
WinBugs we would use the same code, but
substitute the values μₙ, σₙ²/κₙ, νₙ and σₙ² into
the posterior.
8
Implementation using the Gibbs Sampler
  • Recall from last time that the Gibbs Sampler is a
    Markov chain Monte Carlo method for sampling from
    the posterior distribution.
  • Consider a problem with two parameters θ₁ and θ₂,
    and let y denote the data.
  • Suppose further that we know the conditional
    distributions p(θ₁ | θ₂, y) and p(θ₂ | θ₁, y).
  • We need to find p(θ₁ | y) and p(θ₂ | y).

9
  • The Gibbs Sampler proceeds by choosing some
    initial point, which we will denote (θ₁⁰, θ₂⁰),
    from the parameter space.
  • → This can be any reasonable value of θ.
  • Then, we take draws from the two conditional
    distributions in the following sequence:
  • θ₁¹ ~ p(θ₁ | θ₂⁰, y)
  • θ₂¹ ~ p(θ₂ | θ₁¹, y)
  • θ₁² ~ p(θ₁ | θ₂¹, y)
  • θ₂² ~ p(θ₂ | θ₁², y)
  • θ₁³ ~ p(θ₁ | θ₂², y)
  • θ₂³ ~ p(θ₂ | θ₁³, y)
  • etc.

This sequence of draws is a Markov Chain because
the values at step t only depend on the values at
step t−1. If allowed to run long enough, the
Gibbs sampler will converge to the true
posterior distribution. If allowed to run for
sufficiently long after convergence, the Gibbs
sampler produces a complete sample from the
distribution of θ.
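As an illustration (not from the original slides),
here is a minimal Python sketch of this
two-parameter scheme. The functions draw_theta1
and draw_theta2 are assumed to sample from the two
full conditionals, with the data y baked in:

    def gibbs_two_block(draw_theta1, draw_theta2,
                        theta1, theta2, n_iter=10000):
        # draw_theta1(theta2) samples from p(theta1 | theta2, y);
        # draw_theta2(theta1) samples from p(theta2 | theta1, y).
        draws = []
        for _ in range(n_iter):
            theta1 = draw_theta1(theta2)  # theta1^t ~ p(theta1 | theta2^(t-1), y)
            theta2 = draw_theta2(theta1)  # theta2^t ~ p(theta2 | theta1^t, y)
            draws.append((theta1, theta2))
        # After burn-in, these approximate draws from p(theta1, theta2 | y).
        return draws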
10
  • This algorithm can be easily generalized to the
    n-parameter case θ₁, θ₂, …, θₙ.
  • For the tth iteration of the Gibbs Sampler we
    have
  • θ₁ᵗ ~ p(θ₁ | θ₂ᵗ⁻¹, θ₃ᵗ⁻¹, …, θₙᵗ⁻¹, y)
  • θ₂ᵗ ~ p(θ₂ | θ₁ᵗ, θ₃ᵗ⁻¹, …, θₙᵗ⁻¹, y)
  • ⋮
  • θₙᵗ ~ p(θₙ | θ₁ᵗ, θ₂ᵗ, …, θₙ₋₁ᵗ, y)

11
The Gibbs Sampler is not rocket science, even if
the original application was in physics
  • We have already created a Gibbs Sampler: we did
    so when we implemented Method 2 last class.
  • In that code, we used WinBugs random number
    generators rather than allowing the program to
    identify the conditional distributions itself,
    which will be the usual case.
  • We had to do this because WinBugs does not allow
    researchers to use improper priors.
  • We can even use Excel to do MCMC.

12
Excel Implementation of the Normal model with
conjugate priors for the mean and variance
  • To implement a Gibbs Sampler in this case, we
    need to perform the following steps.
  • Everything subscripted with an n is a function of
    quantities that are known.

In Excel, for each iteration t:
  σ²⁽ᵗ⁾ = 1/GAMMAINV(RAND(), νₙ/2, 2/(νₙσₙ²))
  μ⁽ᵗ⁾ = NORMINV(RAND(), μₙ, SQRT(σ²⁽ᵗ⁾/κₙ))
where the σ²⁽ᵗ⁾ spreadsheet cell corresponds to
the tth draw.
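For comparison, a numpy version of the same two
draws might look like the sketch below. The values
for mu_n, kappa_n, nu_n and sigma2_n are
hypothetical placeholders, not numbers from the
slides:

    import numpy as np

    rng = np.random.default_rng(1)
    mu_n, kappa_n, nu_n, sigma2_n = 128.0, 21.0, 21.0, 150.0  # hypothetical values

    def one_draw():
        # sigma^2 ~ scaled-Inv-chi^2(nu_n, sigma2_n): the reciprocal of a
        # Gamma(shape=nu_n/2, scale=2/(nu_n*sigma2_n)) draw.
        sigma2 = 1.0 / rng.gamma(nu_n / 2.0, 2.0 / (nu_n * sigma2_n))
        # mu | sigma^2 ~ N(mu_n, sigma^2/kappa_n); numpy takes a std. deviation.
        mu = rng.normal(mu_n, np.sqrt(sigma2 / kappa_n))
        return mu, sigma2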
13
Lazy Modern Parametric Bayesians and the normal
model with unknown mean and variance
  • Suppose that y ~ N(μ, τ), where τ is the
    precision.
  • From here on, when we talk about the normal
    distribution, you should expect that we will
    speak in terms of the precision τ = 1/σ² rather
    than the variance σ². This is because WinBugs is
    programmed to use τ rather than σ².
  • Suppose also that you don't want to think too
    hard about the prior joint distribution of μ and
    τ, and assume that
  • p(μ, τ) = p(μ) p(τ).
  • What distributions would you choose for p(μ) and
    p(τ)?

14
Suppose that y ~ N(μ, τ). What priors would you
choose for μ and τ?
  • I would choose
  • μ ~ N(0, t) (where the precision t is small,
    indicating little precision)
  • This is because, if we expect something like the
    central limit theorem to hold, then the
    distribution of the sample mean should be
    approximately normal for large n.
  • τ ~ Gamma(a, b) (where a and b are small
    numbers)
  • This is because this distribution is bounded
    below at zero and, unlike the χ² distribution
    which shares this property, it is not constrained
    to have the mean proportional to the variance
    (see the note below).
  • → Note how we now have to talk about the mean of
    the distribution of the variance.

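A standard fact (not from the slides) behind the
last point: under the rate parameterization that
WinBugs uses, a Gamma(a, b) variable has

\[
\mathbb{E}[\tau] = \frac{a}{b}, \qquad
\operatorname{Var}(\tau) = \frac{a}{b^{2}},
\]

so the mean and variance can be tuned separately,
whereas a χ²ᵥ variable is forced to have mean ν
and variance 2ν.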
15
Estimation of the lazy parametric Bayesian's
approach to the normal model
  • Suppose yᵢ ~ N(μ, τ) for i = 1, …, n,
  • where μ ~ N(0, t) and τ ~ Gamma(a, b).
  • By Bayes' Rule, posterior ∝ prior × likelihood:
  • p(μ, τ | y) ∝ [N(0, t) × Gamma(a, b)] × N(μ, τ),
  • where the bracketed term is the joint prior
    distribution.
  • To estimate the marginal distributions from this
    model would be a bit difficult analytically, but
    it is pretty easy to implement using the Gibbs
    Sampler.
  • What do we need to know to implement the Gibbs
    Sampler?

16
  • In order to use the Gibbs Sampler, we must
    identify the conditional distributions for the
    posterior
  • p(μ, τ | y) ∝ N(0, t) × Gamma(a, b) × N(μ, τ).
  • In other words, we need to identify
  • p(μ | y, τ) and p(τ | y, μ).
  • In this case, the conditional distributions are
    easy to find.

The proportionality follows because p(τ, y) does
not depend on μ. Therefore, to find the
conditional for μ we only need to pick out the
terms from p(μ, τ, y) that involve μ. We follow
an analogous process for p(τ | μ, y).
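The slide's images of the resulting conditionals
are not reproduced in this transcript. As a hedged
reconstruction, the standard full conditionals for
this semi-conjugate model (prior mean 0 and prior
precision t for μ; rate parameterization for the
Gamma, as in WinBugs' dgamma) are:

\[
\mu \mid \tau, y \sim
N\!\left( \frac{\tau \sum_{i} y_i}{t + n\tau},\;
\frac{1}{t + n\tau} \right),
\]
\[
\tau \mid \mu, y \sim
\operatorname{Gamma}\!\left( a + \frac{n}{2},\;
b + \frac{1}{2}\sum_{i}(y_i - \mu)^2 \right),
\]

where the second argument of the normal is a
variance and the second argument of the Gamma is a
rate.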
17
Once we identify the conditional posterior
distributions, we simply take random draws using
the Gibbs Sampling approach introduced before.
18
  • Congdon's Blood Pressure example again
  • model {
  •   for (i in 1:N) {
  •     y[i] ~ dnorm(mu, tau)
  •   }
  •   mu ~ dnorm(0, .001)
  •   tau ~ dgamma(.01, .001)
  • }
  • list(N = 20, y = c(98, 160, 136, 128, 130, 114,
    123, 134, 128, 107, 123, 125, 129, 132, 154, 115,
    126, 132, 136, 130))
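For readers without WinBugs, here is a minimal
numpy Gibbs sampler for this same model and data,
using the full conditionals sketched on slide 16.
This is an illustration, not code from the
original presentation:

    import numpy as np

    y = np.array([98, 160, 136, 128, 130, 114, 123, 134, 128, 107,
                  123, 125, 129, 132, 154, 115, 126, 132, 136, 130], float)
    n = len(y)
    t, a, b = 0.001, 0.01, 0.001       # mu ~ N(0, precision t); tau ~ Gamma(a, b)

    rng = np.random.default_rng(0)
    mu, tau = y.mean(), 1.0 / y.var()  # starting values
    keep = []
    for it in range(25000):
        prec = t + n * tau             # mu | tau, y ~ N(tau*sum(y)/prec, 1/prec)
        mu = rng.normal(tau * y.sum() / prec, np.sqrt(1.0 / prec))
        rate = b + 0.5 * np.sum((y - mu) ** 2)
        tau = rng.gamma(a + n / 2.0, 1.0 / rate)  # tau | mu, y ~ Gamma(a + n/2, rate)
        if it >= 5000:                 # discard burn-in
            keep.append(mu)
    print("posterior mean of mu:", np.mean(keep))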

19
WinBugs results for the model where y ~ N(μ, τ)
and μ ~ N(0, .001), τ ~ Gamma(.01, .001)
Notice that the mean of the posterior distribution
of μ is less than 128 (the sample mean), despite
the fact that we are using what appear to be
diffuse prior beliefs.