Title: Bayesian Analysis of the Normal Distribution, Part II
1. Bayesian Analysis of the Normal Distribution, Part II
- Set-up of the basic model of a normally distributed random variable with unknown mean and variance (a two-parameter model).
- Discussion of philosophies of prior selection.
- Implementation of different priors, with a discussion of MCMC methods.
2. Different types of Bayesians choose different priors
- Classical Bayesians: the prior is a necessary evil.
  - They choose priors that interject the least information possible.
- Modern Parametric Bayesians: the prior is a useful convenience.
  - They choose prior distributions with desirable properties (e.g. conjugacy). Given a distributional choice, prior parameters are chosen to interject the least information possible.
- Subjective Bayesians: the prior is a summary of old beliefs.
  - They choose prior distributions based on previous knowledge, either the results of earlier studies or non-scientific opinion.
3. Modern Parametric Bayesians and the normal model with unknown mean and variance
- y ~ N(μ, σ²), where μ and σ² are both unknown random variables.
- What prior distribution would a modern parametric Bayesian choose to satisfy the demands of convenience?
- What if we used the definition of conditional probability, so that p(μ, σ²) = p(μ | σ²) p(σ²)?
4. Modern Parametric Bayesians and the normal model with unknown mean and variance
- y ~ N(μ, σ²), where μ and σ² are both unknown random variables.
- A modern parametric Bayesian would typically choose a conjugate prior.
- For the normal model with unknown mean and variance, the conjugate prior for the joint distribution of μ and σ² is the normal-inverse-gamma distribution (equivalently, the normal-inverse-χ² distribution).
- p(μ, σ²) ~ N-Inv-χ²(μ₀, σ₀²/κ₀; ν₀, σ₀²)
- There are four parameters in the prior: μ₀, κ₀, ν₀, and σ₀².
5.
- Suppose p(μ, σ²) ~ N-Inv-χ²(μ₀, σ₀²/κ₀; ν₀, σ₀²).
- It can be shown that the above expression can be factored such that
  - p(μ, σ²) = p(μ | σ²) p(σ²),
  - where μ | σ² ~ N(μ₀, σ²/κ₀) and σ² ~ Inv-χ²(ν₀, σ₀²).
- Because this is a conjugate distribution for the normal distribution with unknown mean and variance, the posterior distribution will also be normal-Inv-χ².
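To make the factorization concrete, here is a minimal Python sketch (not from the slides; it assumes numpy, and the prior hyperparameter values are made up purely for illustration) that draws (μ, σ²) from this prior by first drawing σ² and then μ given σ²:

  import numpy as np

  rng = np.random.default_rng(0)

  # Illustrative (made-up) prior hyperparameters: mu0, kappa0, nu0, sigma0^2
  mu0, kappa0, nu0, sigma0_sq = 120.0, 1.0, 1.0, 100.0

  def draw_prior(rng):
      # sigma^2 ~ Inv-chi^2(nu0, sigma0^2): draw X ~ chi^2 with nu0 df, set sigma^2 = nu0*sigma0_sq/X
      sigma_sq = nu0 * sigma0_sq / rng.chisquare(nu0)
      # mu | sigma^2 ~ N(mu0, sigma^2/kappa0)
      mu = rng.normal(mu0, np.sqrt(sigma_sq / kappa0))
      return mu, sigma_sq

  prior_draws = [draw_prior(rng) for _ in range(5)]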
6. The posterior distribution if y ~ N(μ, σ²) and (μ, σ²) ~ N-Inv-χ²(μ₀, σ₀²/κ₀; ν₀, σ₀²)
- p(μ, σ² | y) ~ N-Inv-χ²(μₙ, σₙ²/κₙ; νₙ, σₙ²)
- μₙ is a weighted average of the prior mean and the data.
- νₙσₙ² is a weighted sum of the prior variance, the sample variance, and the distance between the sample and prior means.
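For reference, the standard conjugate updating formulas behind those two annotations (reconstructed here because they do not survive in the slide text; they follow the usual normal-inverse-χ² parameterization, as in Gelman et al.'s Bayesian Data Analysis) are:

  μₙ = (κ₀μ₀ + nȳ) / (κ₀ + n)
  κₙ = κ₀ + n
  νₙ = ν₀ + n
  νₙσₙ² = ν₀σ₀² + (n − 1)s² + [κ₀n / (κ₀ + n)] (ȳ − μ₀)²

where ȳ is the sample mean and s² is the sample variance.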
7. If p(μ, σ² | y) ~ N-Inv-χ²(μₙ, σₙ²/κₙ; νₙ, σₙ²), we can factor the posterior distribution just like the prior
- That is, μ | σ², y ~ N(μₙ, σ²/κₙ) and σ² | y ~ Inv-χ²(νₙ, σₙ²).
- These are essentially the same posterior distributions as we found with the improper priors. To implement the conjugate priors in WinBugs we would use the same code, but substitute the values μₙ, σₙ²/κₙ, νₙ, and σₙ² into the posterior.
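A small Python sketch of that substitution step (not from the slides; it simply implements the updating formulas given above, with illustrative variable names):

  import numpy as np

  def posterior_hyperparameters(y, mu0, kappa0, nu0, sigma0_sq):
      # Conjugate update for the normal model with unknown mean and variance
      n = len(y)
      ybar = np.mean(y)
      s_sq = np.var(y, ddof=1)                      # sample variance
      kappa_n = kappa0 + n
      nu_n = nu0 + n
      mu_n = (kappa0 * mu0 + n * ybar) / kappa_n
      nu_sigma = nu0 * sigma0_sq + (n - 1) * s_sq + (kappa0 * n / kappa_n) * (ybar - mu0) ** 2
      sigma_n_sq = nu_sigma / nu_n
      return mu_n, kappa_n, nu_n, sigma_n_sq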
8. Implementation using the Gibbs Sampler
- Recall from last time that the Gibbs Sampler is a method for drawing samples from a posterior distribution.
- Consider a problem with two parameters, θ₁ and θ₂, and let X denote the data.
- Suppose further that we know the conditional distributions p(θ₁ | θ₂, X) and p(θ₂ | θ₁, X).
- We need to find p(θ₁ | X) and p(θ₂ | X).
9.
- The Gibbs Sampler proceeds by choosing some initial point, which we will denote (θ₁⁰, θ₂⁰), from the parameter space.
  - This can be any reasonable value of θ.
- Then we take draws from the two conditional distributions in the following sequence:
  - θ₁¹ ~ p(θ₁ | θ₂⁰, Y)
  - θ₂¹ ~ p(θ₂ | θ₁¹, Y)
  - θ₁² ~ p(θ₁ | θ₂¹, Y)
  - θ₂² ~ p(θ₂ | θ₁², Y)
  - θ₁³ ~ p(θ₁ | θ₂², Y)
  - θ₂³ ~ p(θ₂ | θ₁³, Y)
  - etc.
- This sequence of draws is a Markov chain because the values at step t depend only on the values at step t − 1. If allowed to run long enough, the Gibbs sampler will converge to the true posterior distribution. If allowed to run for sufficiently long after convergence, the Gibbs sampler produces a complete sample from the distribution of θ.
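As a concrete, self-contained illustration of this two-parameter scheme (not from the slides), here is a short Python sketch for a toy target where both full conditionals are known exactly: a bivariate normal with unit variances and correlation ρ, for which θ₁ | θ₂ ~ N(ρθ₂, 1 − ρ²) and θ₂ | θ₁ ~ N(ρθ₁, 1 − ρ²):

  import numpy as np

  rho = 0.8                                  # correlation of the toy bivariate normal target
  rng = np.random.default_rng(1)

  def gibbs_bivariate_normal(n_iter, theta1=0.0, theta2=0.0):
      draws = np.empty((n_iter, 2))
      for t in range(n_iter):
          # theta1 | theta2 ~ N(rho * theta2, 1 - rho^2)
          theta1 = rng.normal(rho * theta2, np.sqrt(1 - rho ** 2))
          # theta2 | theta1 ~ N(rho * theta1, 1 - rho^2)
          theta2 = rng.normal(rho * theta1, np.sqrt(1 - rho ** 2))
          draws[t] = theta1, theta2
      return draws

  draws = gibbs_bivariate_normal(10_000)
  # After a burn-in, np.corrcoef(draws[1_000:].T) should be close to rho.

The same alternating pattern extends directly to the n-parameter case described on the next slide.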
10.
- This algorithm can be easily generalized to the n-parameter case θ₁, θ₂, …, θₙ.
- For the t-th iteration of the Gibbs Sampler we have:
  - θ₁ᵗ ~ p(θ₁ | θ₂ᵗ⁻¹, θ₃ᵗ⁻¹, …, θₙᵗ⁻¹, Y)
  - θ₂ᵗ ~ p(θ₂ | θ₁ᵗ, θ₃ᵗ⁻¹, …, θₙᵗ⁻¹, Y)
  - …
  - θₙᵗ ~ p(θₙ | θ₁ᵗ, θ₂ᵗ, …, θₙ₋₁ᵗ, Y)
11. The Gibbs Sampler is not rocket science, even if the original application was in physics
- We already created a Gibbs Sampler when we implemented Method 2 last class.
- In that code, we used WinBugs random number generators rather than letting the program identify the conditional distributions itself, which is the usual case.
- We had to do this because WinBugs does not allow researchers to use improper priors.
- We can even use Excel to do MCMC.
12. Excel implementation of the normal model with conjugate priors for the mean and variance
- To implement a Gibbs Sampler in this case, we need to perform the following steps.
- Everything subscripted with an n is a function of quantities that are known.
- In Excel, for each iteration t:
  σ²(t) = 1/GAMMAINV(RAND(), νₙ/2, 2/(νₙ·σₙ²))
  μ(t) = NORMINV(RAND(), μₙ, SQRT(σ²(t)/κₙ))
  where the σ²(t) and μ(t) spreadsheet cells correspond to the t-th draw.
  (GAMMAINV takes a shape and a scale, and NORMINV takes a standard deviation rather than a variance, so these formulas correspond to σ² ~ Inv-χ²(νₙ, σₙ²) and μ | σ² ~ N(μₙ, σ²(t)/κₙ).)
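An equivalent sketch in Python (not from the slides; it assumes numpy and that the posterior hyperparameters μₙ, κₙ, νₙ, σₙ² have already been computed, for example with the posterior_hyperparameters function sketched earlier):

  import numpy as np

  rng = np.random.default_rng(2)

  def draw_posterior(mu_n, kappa_n, nu_n, sigma_n_sq, n_draws=10_000):
      # sigma^2 | y ~ Inv-chi^2(nu_n, sigma_n_sq): draw X ~ chi^2 with nu_n df, set sigma^2 = nu_n*sigma_n_sq/X
      sigma_sq = nu_n * sigma_n_sq / rng.chisquare(nu_n, size=n_draws)
      # mu | sigma^2, y ~ N(mu_n, sigma^2/kappa_n)
      mu = rng.normal(mu_n, np.sqrt(sigma_sq / kappa_n))
      return mu, sigma_sq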
13. Lazy Modern Parametric Bayesians and the normal model with unknown mean and variance
- Suppose that y ~ N(μ, τ), where τ is the precision (the reciprocal of the variance).
- From here on, when we talk about the normal distribution you should expect that we will speak in terms of the precision τ rather than the variance σ². This is because WinBugs is programmed to use τ rather than σ².
- Suppose also that you don't want to think too hard about the joint prior distribution of μ and τ, and assume that
  - p(μ, τ) = p(μ) p(τ).
- What distributions would you choose for p(μ) and p(τ)?
14. Suppose that y ~ N(μ, τ). What priors would you choose for μ and τ?
- I would choose:
- μ ~ N(0, t), where t is small, indicating little precision.
  - This is because, if we expect something like the central limit theorem to hold, then the distribution of the sample mean should be approximately normal for large n.
- τ ~ Γ(a, b), where a and b are small numbers.
  - This is because the gamma distribution is bounded below at zero and, unlike the χ² distribution (which shares this property), it is not constrained to have its mean proportional to its variance.
  - Note how we now have to talk about the mean of the distribution of the variance.
15. Estimation of the lazy parametric Bayesian's approach to the normal model
- Suppose yᵢ ~ N(μ, τ) for i = 1, …, n,
  - where μ ~ N(0, t) and τ ~ Γ(a, b).
- By Bayes' Rule, posterior ∝ prior × likelihood:
  - p(μ, τ | y) ∝ N(0, t) Γ(a, b) N(μ, τ),
  - where the first two terms are the joint prior distribution (the kernel is written out after this list).
- To estimate the marginal distributions from this model would be a bit difficult analytically, but it is pretty easy to implement using the Gibbs Sampler.
- What do we need to know to implement the Gibbs Sampler?
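Written out (using the WinBugs convention, assumed throughout these slides, that N(0, t) is parameterized by mean and precision), the kernel of the joint posterior is

  p(μ, τ | y) ∝ exp(−t μ²/2) · τ^(a−1) exp(−bτ) · τ^(n/2) exp(−(τ/2) Σᵢ (yᵢ − μ)²),

where the first factor comes from the prior for μ, the next two from the prior for τ, and the last two from the likelihood.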
16.
- In order to use the Gibbs Sampler, we must identify the conditional distributions for the posterior
  - p(μ, τ | y) ∝ N(0, t) Γ(a, b) N(μ, τ).
- In other words, we need to identify
  - p(μ | y, τ) and p(τ | y, μ).
- In this case, the conditional distributions are easy to find.
- The proportionality p(μ | τ, y) ∝ p(μ, τ, y) follows because p(τ, y) does not depend on μ. Therefore, to find the conditional for μ we only need to pick out the terms from p(μ, τ, y) that involve μ. We follow an analogous process for p(τ | μ, y).
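The resulting full conditionals (reconstructed here, since the slide's formulas do not survive in the text; they assume the priors μ ~ N(0, t) in (mean, precision) form and τ ~ Γ(a, b)) are:

  μ | τ, y ~ N( nτȳ / (t + nτ), t + nτ ), in (mean, precision) notation, i.e. with precision t + nτ;
  τ | μ, y ~ Γ( a + n/2, b + (1/2) Σᵢ (yᵢ − μ)² ).

Note that the conditional mean of μ is a precision-weighted average of the prior mean (0) and the sample mean ȳ.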
17. Once we identify the conditional posterior distributions, we simply take random draws using the Gibbs Sampling approach introduced before.
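A minimal Python sketch of that sampler (not from the slides; it uses the full conditionals given above, simulated data, and hyperparameter values t = 0.001, a = 0.01, b = 0.001 chosen to echo the WinBugs example on the next slide):

  import numpy as np

  rng = np.random.default_rng(3)
  y = rng.normal(128.0, 14.0, size=20)      # simulated data, for illustration only
  n, ybar = len(y), y.mean()

  t, a, b = 0.001, 0.01, 0.001              # priors: mu ~ N(0, t), tau ~ Gamma(a, b)

  def gibbs_normal(n_iter=20_000, mu=0.0, tau=1.0):
      draws = np.empty((n_iter, 2))
      for i in range(n_iter):
          # mu | tau, y ~ N(n*tau*ybar/(t + n*tau), precision = t + n*tau)
          prec = t + n * tau
          mu = rng.normal(n * tau * ybar / prec, 1.0 / np.sqrt(prec))
          # tau | mu, y ~ Gamma(a + n/2, rate = b + 0.5*sum((y - mu)^2))
          rate = b + 0.5 * np.sum((y - mu) ** 2)
          tau = rng.gamma(a + n / 2, 1.0 / rate)   # numpy's gamma takes a scale, so pass 1/rate
          draws[i] = mu, tau
      return draws

  draws = gibbs_normal()
  mu_mean = draws[2_000:, 0].mean()          # posterior mean of mu after discarding burn-in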
18. Congdon's blood pressure example again

model {
  for (i in 1:N) {
    y[i] ~ dnorm(mu, tau)
  }
  mu ~ dnorm(0, 0.001)
  tau ~ dgamma(0.01, 0.001)
}

list(N = 20, y = c(98, 160, 136, 128, 130, 114, 123, 134, 128, 107, 123, 125, 129, 132, 154, 115, 126, 132, 136, 130))
19. WinBugs results for the model where y ~ N(μ, τ), μ ~ N(0, .001), and τ ~ Γ(.01, .001)
Notice that the posterior mean of μ is less than 128 (the sample mean) despite the fact that we are using what appear to be diffuse prior beliefs: the prior on μ is centered at 0, so it pulls the posterior mean slightly below the sample mean.