Other MCMC features in MLwiN and the MLwiN->WinBUGS interface PowerPoint PPT Presentation

presentation player overlay
About This Presentation
Transcript and Presenter's Notes

Title: Other MCMC features in MLwiN and the MLwiN->WinBUGS interface


1
Lecture 8
  • Other MCMC features in MLwiN and the
    MLwiN-gtWinBUGS interface

2
Lecture Contents
  • Convergence diagnostics.
  • Starting values.
  • MLwiN demonstration.
  • MLwiN to WinBUGS interface.
  • t distributed residuals example.

3
Recap of practical session
  • The practical covered two features of MCMC
    sampling that we will cover in more detail here
  • Convergence diagnostics and summary statistics.
  • Choosing starting values for unknown parameters.

4
Reunion Island Model revisited
  • We firstly fitted the following model to the
    dataset

5
Trajectories window
  • By default we have the last 500 iterations
  • Clearly some chains mix better than others!

6
MCMC diagnostics for ?0
  • We will describe each pane separately

7
Trace plot
  • This graph plots the generated values of the
    parameter against the iteration number.
  • A crude test of mixing is the blue finger test.
  • This chain doesnt mix that well but could be
    worse!

8
Kernel Density plot
  • This plot is like a smoothed histogram.
  • Instead of counting the estimates into bins of
    particular widths like a histogram, the effect of
    each iteration is spread around the estimate via
    a Kernel function e.g. a normal distribution.
  • This means that at each point we get the sum of
    the Kernel function parts for each iteration.
  • The Kernel density plot has a smoothness
    parameter that can be modified.

9
Time series diagnostics
Here we have the Auto correlation function (ACF)
and partial autocorrelation function (PACF)
plots. The ACF measures how correlated the values
in the chain are with their close neighbours. The
lag is the distance between the two chains to be
compared. An independent chain will have
approximately zero autocorrelation at each lag. A
Markov chain should have a power relationship in
the lags i.e. if ACF(1) ? then ACF(2) ?2 etc.
This is known as an AR(1) process. The PACF
measures discrepancies from such a process and so
should normally have values 0 after lag 1.
10
Accuracy Diagnostics
  • MLwiN has 2 accuracy diagnostics
  • Raftery-Lewis works on quantiles of distribution.
  • Brooks-Draper works on quoting the mean to n
    significant figures. Its formulae uses the
    estimate, its s.d. and the ACF and it can often
    give very small or very large values!

11
Raftery Lewis Diagnostic
  • This diagnostic is based on the properties of a 2
    state Markov chain.
  • For a given quantile the chain is dichotomized
    into values that are greater (1) or less than (0)
    the value of the quantile.
  • For an independent chain the 0s and 1s should
    appear randomly. If there is large clustering (in
    time) of 0s or 1s then the chain is mixing
    badly.

12
Summary Statistics
Three estimates of location are given Mean
from the chain. Mode from the kernel
plot. Median (50 quantile) by sorting the
chain and finding the middle value. The SD is
calculated from the chain and the other quantiles
are used to give (possibly) non-symmetric
interval estimates. The MCSE and ESS will be
discussed next.
13
Monte Carlo Standard Error
The Monte Carlo Standard Error (MCSE) is an
indication of how much error is in the estimate
due to the fact that MCMC is used. As the number
of iterations increases the MCSE?0. For an
independent sampler it equals the SD/?n. However
it is adjusted due to the autocorrelation in the
chain. The graph above gives estimates for the
MCSE for longer runs.
14
Effective Sample Size
  • This quantity gives an estimate of the equivalent
    number of independent iterations that the chain
    represents.
  • This is related to the ACF and the MCSE.
  • Its formula is
  • For this parameter our 5,000 actual iterations
    are equivalent to only 344 independent iterations!

15
MCMC diagnostics for ?2u
This is the worst mixing parameter with a
suggestion of a run of over 100k iterations being
required.
16
MCMC diagnostics for ?2uAfter 100,000 iterations
  • Note that now the diagnostics are all satisfied.

17
Starting Values in MLwiN
  • By default MLwiN uses the values currently stored
    for each parameter.
  • As we usually run MCMC directly after IGLS this
    means we use IGLS estimates as starting values.
  • These values are a good place to start and hence
    burnin will often be short.
  • This is OK for most multilevel models as we know
    they should be unimodal.
  • However it is better to try several starting
    values to ensure we have convergence.
  • WinBUGS allows multiple chains to be run which
    allows multiple chain MCMC diagnostics.

18
Diagnostics in WinBUGS
  • WinBUGS uses a multiple chain diagnostic by
    Brooks, Gelman and Rubin.
  • This is based on the ratio of betweenwithin
    chain variance (ANOVA).
  • WinBUGS produces plots of
  • Average 80 interval within-chains (blue) and
    pooled 80 interval between chains (green) which
    should converge to stable values.
  • Ratio pooledaverage interval widths (red)
    should converge to 1.0.
  • To print values of the diagnostic double click on
    the plot and press CTRL left hand mouse button.

19
MCSE in WinBUGS
  • Accuracy of the posterior estimates can be
    assessed by the Monte Carlo standard error for
    each parameter.
  • If samples were generated independently could
    estimate SE of mean as
  • is the sample variance.
  • This will underestimate the true MCSE due to
    autocorrelation in the chain.
  • WinBUGS uses a batch means method replace
    sample variance S2 by variance of batched means
    that are assumed independent.
  • MLwiN replaces actual sample size with effective
    sample size.

20
BUGS history
  • Bayesian inference Using Gibbs Sampling
  • Language for specifying complex Bayesian models.
  • Constructs object-oriented internal
    representation of the model.
  • Builds up an arbitrary complex model through
    specification of local structure.
  • Simulation from full conditionals using Gibbs
    sampling
  • Current version (WinBUGS 1.4) runs in Windows,
    and incorporates a script language for running in
    batch mode.
  • Classic BUGS available for UNIX but this is an
    old version.
  • WinBUGS is freely available from
    http//www.mrc-bsu.cam.ac.uk

21
WinBUGS common distributions
Expression Distribution Usage
dbin binomial r dbin(p,n)
dnorm normal x dnorm(mu,tau)
dpois Poisson r dpois(lambda)
dunif uniform x dunif(a,b)
dgamma gamma x dgamma(a,b)
NB As noted earlier the Normal is parameterised
in terms of its mean and precision 1/variance
1/sd2. See Model Specification/The BUGS
language stochastic nodes/ Distributions in
manual for full syntax. Note Functions cannot be
used as argument in distributions!! You need to
create additional nodes
22
Some Built in WinBUGS functions
  • p lt- step(x-.5) 1 if x 0.5, 0 otherwise
    hence can be used to monitor when x 0.5
  • p lt- equals(x,.5) 1 if x0.5, 0 otherwise.
  • tau lt- 1/pow(s,2) sets
  • s lt- 1/sqrt(tau) sets
  • pi,k lt- inprod(pi, Lambdai,,k) sets
  • See Model Specification/Logical nodes in manual
    for full syntax.

23
Graphical Models
  • Model building
  • Statistical modelling of complex systems involve
    usually many interconnected random variables.
  • How to we build the connections?
  • Key idea conditional independence
  • It is helpful to represent the modelling process
    by a graph
  • Nodes all random quantities
  • Links (directed or undirected) association
    between the nodes
  • Directed edges natural ordering of association,
    causal influence
  • Undirected edges symmetric association,
    correlation
  • The graph is used to represent a set of
    conditional independence statements

24
Graphical Model Doodle in WinBUGS
  • The following is a doodle for the rats example in
    WinBUGS

25
The MLwiN to WinBUGS interface
  • MLwiN has an interface that will generate WinBUGS
    code for an MLwiN model.

The interface will save the model in two possible
versions of WinBUGS and will input back the
WinBUGS output in CODA format and use MLwiN
diagnostics.
26
Model Description
  • model
  • Level 1 definition
  • for(i in 1N)
  • lncfsi dnorm(mui,tau)
  • muilt- beta1 consi
  • u2cowi consi
  • u3herdi consi
  • Higher level definitions
  • for (j in 1n2)
  • u2j dnorm(0,tau.u2)
  • for (j in 1n3)
  • u3j dnorm(0,tau.u3)
  • Priors for fixed effects
  • beta1 dflat()
  • Priors for random terms

WinBUGS describes the model in a mixture of
deterministic and stochastic statements. The
interface takes the names from the MLwiN columns
and uses some other standard names e.g. beta,
sigma2. Note this means there is some redundancy
e.g. cons is included. Caution Care must be
taken with some column names.
27
Data description
  • Note that the data is stored in the file after
    the word list and between parentheses.
  • For example
  • list(N 3027, n2 1575, n3 50,
  • cow c(1,1,2,2,3,4,5,5,5,5,6,6,6,7,7,7,8,8,8,9,
  • 9,9,9,10,10,10,11,11,12,12,13,14,14,14,14,15,15,15
    ,16,16,
  • 16,17,17,17,17,18,18,18,18,19,20,20,20,21,21,21,22
    ,22,22,22,
  • 23,23,23,23,24,24,25,26,27,27,28,29,29,29,30,30,31
    ,31,32,33,
  • 33,33,33,34,34,34,35,35,35,36,36,37,38,39,39,40,40
    ,40,41,41,
  • 41,41,42,42,42,43,43,43,44,44,44,44,45,45,45,46,46
    ,47,47,47,
  • .),.
  • Note some notation e.g. c for vectors borrowed
    from R and S-plus.

28
Initial values in WinBUGS
  • Note that the initial values are also stored in
    the file after the word list and between
    parentheses.
  • For example
  • list(beta c(4.217674),
  • u2 c( 0.062732,-0.020885,0.054452,0.125001,-0.06
    3034,-0.028254,-0.013445,-0.137141,-0.056420,-0.01
    7268,
  • 0.025402,-0.013377,-0.013452,-0.018461,-0.103801,-
    0.095923,0.031627,-0.019062,-0.026862,-0.037929,
  • 0.152077,0.008395,-0.015221,0.047518,-0.029718,0.0
    41764,-0.040790,-0.055156,0.048479,-0.036886,
  • The MLwiN-gtWinBUGS interface has given the IGLS
    starting values to WinBUGS.

29
Running the reunion island dataset in WinBUGS
  • The model was run for 3 parallel chains using
    MLwiN starting values and 2 other sets.
  • We run the 3 chains for 5,000 after a 500 burnin
  • Even the worst mixing parameter shows overlap in
    the three chains (see below)

30
Running the reunion island dataset in WinBUGS
  • Brooks,Gelman Rubin Diagnostic
  • Summary Statistics for the run
  • node mean sd (2.5, 97.5)
  • beta1 4.217 0.0192 (4.18,4.255)
  • sigma2 0.135 0.0048 (0.126,0.145)
  • sigma2.u2 0.0193 0.0041 (0.0117,0.0274)
  • sigma2.u3 0.0152 0.0039 (0.0092,0.0244)

31
Testing the sensitivity to the Normal assumption
  • When we look at plots of residuals in a
    multilevel model we often see outlying residuals.
  • These may be genuine outlying higher level units
    that do not follow the same distribution or it
    may be that the Normal distribution is not
    appropriate.
  • The Normal is a member of the t family of
    distributions and other members of the family
    have longer tails and so may be more appropriate.
  • We will investigate this in the practical.

32
Information on the practical
  • In the practical we will look at how to fit the
    variance components model for the education
    dataset using WinBUGS.
  • We will also examine sensitivity to the Normal
    assumption by considering instead a t
    distribution with unknown degrees of freedom and
    with fixed degrees of freedom (8).
Write a Comment
User Comments (0)
About PowerShow.com