Introduction to WinBUGS - PowerPoint PPT Presentation

About This Presentation
Title: Introduction to WinBUGS
Description: White stork (Ciconia ciconia) in Alsace, 1948-1970. Climate (temperature, rainfall, etc.) related to the population of White storks breeding in Baden-Württemberg.
Slides: 83
Provided by: byronm8

Transcript and Presenter's Notes

Title: Introduction to WinBUGS


1
Introduction to WinBUGS
  • Olivier Gimenez
  • University of St Andrews, Scotland

2
A brief history
  • 1989: project began with a Unix version called BUGS
  • 1998: first Windows version, WinBUGS, was born
  • Initially developed by the MRC Biostatistics Unit in Cambridge; now joint work with Imperial College School of Medicine at St Mary's, London
  • WinBUGS = Windows Bayesian inference Using Gibbs Sampling
  • Software for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo (MCMC) methods

3
Who?
Nicky Best, Imperial College Faculty of Medicine, London (UK)
Andrew Thomas, University of Helsinki, Helsinki (Finland)
Wally Gilks, MRC Biostatistics Unit, Institute of Public Health, Cambridge (UK)
David Spiegelhalter, MRC Biostatistics Unit, Institute of Public Health, Cambridge (UK)
Freely downloadable from http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml
4
Key principle
  • You specify the prior and build up the likelihood
  • WinBUGS computes the posterior by running a Gibbs sampling algorithm, based on
  • π(θ | D) ∝ L(D | θ) π(θ)
  • WinBUGS computes some convergence diagnostics that you have to check
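The proportionality above is all an MCMC sampler needs: moves can be accepted or rejected using posterior ratios, so the normalizing constant never has to be computed. A minimal random-walk Metropolis sketch in Python (an illustration only, not the Gibbs sampler WinBUGS actually runs), targeting a standard normal "posterior":

```python
import math
import random

def metropolis(log_post, n_iter=20000, start=0.0, step=1.0, seed=42):
    """Random-walk Metropolis: a toy stand-in for the MCMC WinBUGS runs."""
    rng = random.Random(seed)
    x, samples = start, []
    lp = log_post(x)
    for _ in range(n_iter):
        prop = x + rng.gauss(0.0, step)
        lp_prop = log_post(prop)
        # accept with probability min(1, posterior ratio)
        if math.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples.append(x)
    return samples

# Standard normal "posterior" (log density up to an additive constant)
draws = metropolis(lambda x: -0.5 * x * x)
mean = sum(draws) / len(draws)
```

With a log-posterior in hand, the same loop works for any one-parameter model; WinBUGS automates the multivariate version of this idea.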

5
A biological example throughout: White stork (Ciconia ciconia) in Alsace, 1948-1970
Demographic components (fecundity, breeding success, survival, etc.)
Climate (temperature, rainfall, etc.)
6
WinBUGS Linear Regression
T (°C)  R (mm)  Y
15.1    67      2.55
13.3    52      1.85
15.3    88      2.05
13.3    61      2.88
14.6    32      3.13
15.6    36      2.21
13.1    72      2.43
13.1    43      2.69
15.0    92      2.55
11.7    32      2.84
15.3    86      2.47
14.4    28      2.69
14.4    57      2.52
12.7    55      2.31
11.7    66      2.07
11.9    26      2.35
15.9    28      2.98
13.4    96      1.98
14.0    48      2.53
13.9    90      2.21
12.9    86      2.62
15.1    78      1.78
13.0    87      2.30

Y = number of chicks per pair
T = temperature in May (°C); R = rainfall in May (mm)
7
WinBUGS Linear Regression
1. Do temperature and rainfall affect the number of chicks?
2. Regression model:
   Yi = α + βr Ri + βt Ti + εi,  i = 1,...,23,  εi i.i.d. N(0, σ²)
   Equivalently: Yi i.i.d. N(μi, σ²), i = 1,...,23, with μi = α + βr Ri + βt Ti
3. Estimation of parameters α, βr, βt, σ
4. Frequentist inference uses t-tests
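The frequentist fit on the next slide can be reproduced with ordinary least squares; a sketch using NumPy (an assumption: the deck itself obtains this output from R) on the 23 observations of slide 6:

```python
import numpy as np

# The 23 observations from slide 6: May temperature (C), May rainfall (mm),
# and number of chicks per pair.
T = np.array([15.1, 13.3, 15.3, 13.3, 14.6, 15.6, 13.1, 13.1, 15.0, 11.7,
              15.3, 14.4, 14.4, 12.7, 11.7, 11.9, 15.9, 13.4, 14.0, 13.9,
              12.9, 15.1, 13.0])
R = np.array([67, 52, 88, 61, 32, 36, 72, 43, 92, 32, 86, 28, 57, 55, 66,
              26, 28, 96, 48, 90, 86, 78, 87], dtype=float)
Y = np.array([2.55, 1.85, 2.05, 2.88, 3.13, 2.21, 2.43, 2.69, 2.55, 2.84,
              2.47, 2.69, 2.52, 2.31, 2.07, 2.35, 2.98, 1.98, 2.53, 2.21,
              2.62, 1.78, 2.30])

X = np.column_stack([np.ones_like(T), T, R])      # design matrix [1, T, R]
(alpha, beta_t, beta_r), *_ = np.linalg.lstsq(X, Y, rcond=None)
# Expect roughly Y = 2.451 + 0.031 T - 0.007 R, as reported on the slide.
```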
8
Linear Regression using Frequentist approach
15.1 67  13.3 52  15.3 88  13.3 61  14.6 32  15.6 36  ...  13.0 87
2.55  1.85  2.05  2.88  3.13  2.21  ...  2.30
Y = number of chicks per pair
T = temperature in May (°C); R = rainfall in May (mm)

Y = 2.451 + 0.031 T - 0.007 R

             Estimate   Std. Error  t value  Pr(>|t|)
temperature   0.031069   0.054690    0.568   0.57629
rainfall     -0.007316   0.002897   -2.525   0.02011
9
Linear Regression using Frequentist approach
(same data as the previous slide)

Y = 2.451 + 0.031 T - 0.007 R

             Estimate   Std. Error  t value  Pr(>|t|)
temperature   0.031069   0.054690    0.568   0.57629
rainfall     -0.007316   0.002897   -2.525   0.02011

⇒ Influence of rainfall only
10
Running WinBUGS: What do you need?
11
Running WinBUGS: The model
12
Running WinBUGS: Data and initial values
13
Running WinBUGS: Overall
14
Running WinBUGS: At last!!
1. Check model
2. Load data
3. Compile model
4. Load initial values
5. Generate burn-in values
6. Choose parameters to be monitored
7. Perform the sampling to generate posteriors
8. Check convergence and display results
15
Running WinBUGS: 1. Check model
16
Running WinBUGS: 1. Check model
Highlight 'model'
17
Running WinBUGS: 1. Check model
Open the Model Specification Tool
18
Running WinBUGS: 1. Check model
Now click 'check model'
19
Running WinBUGS: 1. Check model
Watch out for the confirmation at the foot of the screen
20
Running WinBUGS: 2. Load data
Now highlight the 'list' in the data window
21
Running WinBUGS: 2. Load data
Then click 'load data'
22
Running WinBUGS: 2. Load data
Watch out for the confirmation at the foot of the screen
23
Running WinBUGS: 3. Compile model
Next, click 'compile'
24
Running WinBUGS: 3. Compile model
Watch out for the confirmation at the foot of the screen
25
Running WinBUGS: 4. Load initial values
Highlight the 'list' in the data window
26
Running WinBUGS: 4. Load initial values
Click 'load inits'
27
Running WinBUGS: 4. Load initial values
Watch out for the confirmation at the foot of the screen
28
Running WinBUGS: 5. Generate burn-in values
Open the Model Update Tool
29
Running WinBUGS: 5. Generate burn-in values
Give the number of burn-in iterations (1000)
30
Running WinBUGS: 5. Generate burn-in values
Click 'update' to do the sampling
31
Running WinBUGS: 6. Monitor parameters
Open the Inference Samples Tool
32
Running WinBUGS: 6. Monitor parameters
Enter 'intercept' in the node box and click 'set'
33
Running WinBUGS: 6. Monitor parameters
Enter 'slope_temperature' in the node box and click 'set'
34
Running WinBUGS: 6. Monitor parameters
Enter 'slope_rainfall' in the node box and click 'set'
35
Running WinBUGS: 7. Generate posterior values
Enter the number of samples you want to take (10000)
36
Running WinBUGS: 7. Generate posterior values
Click 'update' to do the sampling
37
Running WinBUGS: 8. Summarize posteriors
Enter '*' in the node box and click 'stats'
38
Running WinBUGS: 8. Summarize posteriors
Mean, median and credible intervals
39
Running WinBUGS: 8. Summarize posteriors
95% credible intervals tell us the same story:

             Estimate   Std. Error  t value  Pr(>|t|)
temperature   0.031069   0.054690    0.568   0.57629
rainfall     -0.007316   0.002897   -2.525   0.02011

40
Running WinBUGS: 8. Summarize posteriors
95% credible intervals tell us the same story (same frequentist output as above, for comparison)
41
Running WinBUGS: 8. Summarize posteriors
Click 'history'
42
Running WinBUGS: 8. Summarize posteriors
Click 'auto cor'
⇒ Problem of autocorrelation
43
Coping with autocorrelation: use standardized covariates
44
Coping with autocorrelation: use standardized covariates
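Why standardizing works, in a small Python illustration (not part of the deck): after centering and scaling, a covariate column is orthogonal to the intercept's column of ones, so the intercept and slope are far less confounded and the Gibbs chain mixes with less autocorrelation.

```python
# Raw May temperatures from the stork data
temp = [15.1, 13.3, 15.3, 13.3, 14.6, 15.6, 13.1, 13.1, 15.0, 11.7, 15.3,
        14.4, 14.4, 12.7, 11.7, 11.9, 15.9, 13.4, 14.0, 13.9, 12.9, 15.1, 13.0]

m = sum(temp) / len(temp)
sd = (sum((t - m) ** 2 for t in temp) / (len(temp) - 1)) ** 0.5
std_temp = [(t - m) / sd for t in temp]

# The inner product of the standardized column with the intercept column
# (all ones) is now zero: the two regression parameters decouple.
dot_with_ones = sum(std_temp)
```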
45
Re-running WinBUGS: 1, 2, ..., 7, and 8. Summarize posteriors
Click 'auto cor'
⇒ Autocorrelation OK
46
Re-running WinBUGS: 1, 2, ..., 7, and 8. Summarize posteriors
Click 'density'
47
Re-running WinBUGS: 1, 2, ..., 7, and 8. Summarize posteriors
Click 'quantiles'
48
Running WinBUGS: 8. Checking for convergence using the Brooks-Gelman-Rubin criterion
  • A way to identify non-convergence is to simulate multiple sequences from over-dispersed starting points
  • Intuitively, the behaviour of all of the chains should be basically the same
  • In other words, the variance within the chains should be the same as the variance across the chains
  • In WinBUGS, stipulate the number of chains after 'load data' and before 'compile' (obviously, as many sets of initial values as chains have to be loaded, or generated)

49
Running WinBUGS: 8. Checking for convergence using the Brooks-Gelman-Rubin criterion
The normalized width of the central 80% interval of the pooled runs is green; the normalized average width of the 80% intervals within the individual runs is blue
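WinBUGS plots an interval-width version of this diagnostic; the closely related variance-ratio form (the classic Gelman-Rubin R-hat) can be sketched in a few lines of Python. This is an illustration with simulated chains, not WinBUGS output; values near 1 indicate the within-chain and between-chain variances agree.

```python
import random

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat from several equal-length chains."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance B and average within-chain variance W
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_plus = (n - 1) / n * W + B / n
    return (var_plus / W) ** 0.5

rng = random.Random(1)
# Over-dispersed starts matter for a real sampler; here the chains are
# i.i.d. draws from the same distribution, so R-hat should be close to 1.
chains = [[rng.gauss(0, 1) for _ in range(2000)] for _ in range(3)]
rhat = gelman_rubin(chains)
```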
50
Re-running WinBUGS: 1, 2, ..., 7, and 8. Summarize posteriors: others...
  • Click 'coda' to produce lists of data suitable for external treatment via the coda R package
  • Click 'trace' to produce a dynamic history changing in real time

51
Another example: logistic regression

T (°C)  R (mm)  Y (successes / pairs)
15.1    67      151 / 173
13.3    52      105 / 164
15.3    88       73 / 103
13.3    61      107 / 113
14.6    32      113 / 122
15.6    36       87 / 112
13.1    72       77 / 98
13.1    43      108 / 121
15.0    92      118 / 132
11.7    32      122 / 136
15.3    86      112 / 133
14.4    28      120 / 137
14.4    57      122 / 145
12.7    55       89 / 117
11.7    66       69 / 90
11.9    26       71 / 80
15.9    28       53 / 67
13.4    96       41 / 54
14.0    48       53 / 58
13.9    90       31 / 39
12.9    86       35 / 42
15.1    78       14 / 23
13.0    87       18 / 23

Y = proportion of nests with success (> 0 young)
T = temperature in May (°C); R = rainfall in May (mm)
52
Performing a logistic regression with WinBUGS
53
Performing a logistic regression with WinBUGS
54
Performing a logistic regression with WinBUGS: noninformative priors
55
Performing a logistic regression with WinBUGS: data and initial values
56
Performing a logistic regression with WinBUGS: the results
  • Influence of rainfall, but not temperature (see credible intervals)

57
Performing a logistic regression with WinBUGS: the results
  • Additional parameters come as a by-product of the MCMC samples: just add them in the model as parameters to be monitored
  • Geometric mean:
    geom <- pow(prod(p[]), 1/N)
  • Odds-ratios:
    odds.rainfall <- exp(slope.rainfall)
    odds.temperature <- exp(slope.temperature)

58
Performing a logistic regression with WinBUGS: the results
  • Additional parameters as a by-product of the MCMC samples
  • Geometric mean probability of success: around 82% (81-84%)
  • Odds-ratio: -16% for an increase of rainfall of 1 unit
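The odds-ratio figure is just the exponentiated logistic slope; a quick check in Python, using the posterior mean of slope.rainfall (about -0.153 on the standardized scale) reported later in the R2WinBUGS output:

```python
import math

slope_rainfall = -0.153   # posterior mean on the standardized scale
odds_ratio = math.exp(slope_rainfall)
percent_change = 100 * (odds_ratio - 1)
# One (standardized) unit more rainfall multiplies the odds of nesting
# success by about 0.86, i.e. roughly a 14-16% decrease.
```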

59
Running WinBUGS from R: the R2WinBUGS package
The logistic regression example revisited
  • It may be awkward to read complex sets of data and initial values
  • It is also quite boring to specify the parameters to be monitored in each run
  • It might be interesting to save the output and read it into R for further analyses
  • Solution 1: WinBUGS can be used in batch mode using scripts
  • Solution 2: R2WinBUGS allows WinBUGS to be run from R
  • We can work with the results after importing them back into R again
  • ⇒ create graphical displays of data and posterior simulations, or use WinBUGS in Monte Carlo simulation studies

60
Running WinBUGS from R: the R2WinBUGS package
The logistic regression example revisited
To call WinBUGS from R:
  • 1. Write a WinBUGS model in an ASCII file.
  • 2. Go into R.
  • 3. Prepare the inputs to the 'bugs' function and run it.
  • 4. A WinBUGS window will pop up and R will freeze up. The model will now run in WinBUGS. You will see things happening in the Log window within WinBUGS. When WinBUGS is done, its window will close and R will work again.
  • If an error message appears, re-run with 'debug = TRUE'.

61
Running WinBUGS from R: the R2WinBUGS package
1 - Write the WinBUGS code in an ASCII file

# This covers a logistic regression model using two explicative variables.
# The White storks in Baden-Württemberg (Germany) data set is provided.
model {
  for (i in 1:N) {
    # A binomial distribution as a likelihood
    nbsuccess[i] ~ dbin(p[i], nbpairs[i])
    # The probability of success is a function of both rainfall and temperature
    logit(p[i]) <- intercept
      + slope.temperature * (temperature[i] - mean(temperature[])) / sd(temperature[])
      + slope.rainfall * (rainfall[i] - mean(rainfall[])) / sd(rainfall[])
  }
  # priors for regression parameters
  intercept ~ dnorm(0, 0.001)
  slope.temperature ~ dnorm(0, 0.001)
  slope.rainfall ~ dnorm(0, 0.001)
}
62
Running WinBUGS from R: the R2WinBUGS package
3 - Prepare the inputs to the 'bugs' function

nbsuccess nbpairs temperature rainfall
151       173     15.1        67
105       164     13.3        52
73        103     15.3        88
107       113     13.3        61
113       122     14.6        32
87        112     15.6        36
...
53        58      14.0        48
31        39      13.9        90
35        42      12.9        86
14        23      15.1        78
18        23      13.0        87
63
Running WinBUGS from R: the R2WinBUGS package
The logistic regression example revisited

# Load R2WinBUGS package
library(R2WinBUGS)
# Data (R 'list' format)
N <- 23
data <- read.table("datalogistic.dat", header = T)
attach(data)
datax <- list("N", "nbsuccess", "nbpairs", "temperature", "rainfall")
# MCMC details
nb.iterations <- 10000
nb.burnin <- 1000
64
Running WinBUGS from R: the R2WinBUGS package
The logistic regression example revisited

# Initial values
init1 <- list(intercept = -1, slope.temperature = -1, slope.rainfall = -1)
init2 <- list(intercept = 0, slope.temperature = 0, slope.rainfall = 0)
init3 <- list(intercept = 1, slope.temperature = 1, slope.rainfall = 1)
inits <- list(init1, init2, init3)
nb.chains <- length(inits)
# Parameters to be monitored
parameters <- c("intercept", "slope.temperature", "slope.rainfall")
# MCMC simulations
res.sim <- bugs(datax, inits, parameters, "logisticregressionstandardized.bug",
                n.chains = nb.chains, n.iter = nb.iterations, n.burnin = nb.burnin)
# Save results
save(res.sim, file = "logistic.Rdata")
65
Running WinBUGS from R: the R2WinBUGS package
The logistic regression example revisited

# Summarize results
res.sim$summary  # use print(res.sim) alternatively

                          mean         sd        50%       97.5%      Rhat
intercept           1.55122555 0.05396574    1.55100   1.6589750 1.0020641
slope.temperature   0.03030854 0.06128879    0.03148   0.1510975 0.9997848
slope.rainfall     -0.15302837 0.06206554   -0.15295  -0.0243025 1.0014894
deviance          204.60259481 2.48898337  203.90000 211.2000000 1.0002280
66
Running WinBUGS from R: the R2WinBUGS package
The logistic regression example revisited

# Numerical summaries for slope.rainfall
quantile(slope.rainfall, c(0.025, 0.975))
#       2.5%      97.5%
# -0.2693375 -0.0243025

# Calculate the odds-ratio
odds.rainfall <- exp(slope.rainfall)
quantile(odds.rainfall, c(0.025, 0.975))
#      2.5%     97.5%
# 0.7638855 0.9759904
67
Running WinBUGS from R: the R2WinBUGS package
The logistic regression example revisited

# Graphical summaries for slope.rainfall
plot(density(slope.rainfall), xlab = "", ylab = "",
     main = "slope.rainfall a posteriori density")
68
Recent developments in WinBUGS: model selection using RJMCMC
  • We consider data relating to a population of White storks breeding in Baden-Württemberg (Germany).
  • Interest lies in the impact of climate variation (rainfall) in their wintering area on their population dynamics (adult survival rates).
  • Mark-recapture data from 1956-71 are available.
  • The covariates relate to the amount of rainfall between June and September each year from 10 weather stations located around the Sahel region.
  • Interest lies in identifying which rainfall locations explain the adult survival rates.

69
Bayesian Model Selection
  • Discriminating between different models can often
    be of particular interest, since they represent
    competing biological hypotheses.
  • How do we decide which covariates to use? Often there may be a large number of possible covariates.

70
Example (cont.)
  • We express the survival rate as a logistic function of the covariates:
  • logit(φt) = α + βᵀxt + εt
  • where xt denotes the set of covariate values at time t and the εt are random effects,
  • εt ~ N(0, σ²).
  • However, which rainfalls explain the survival rates for the adults?
  • Alternatively, which values of β are non-zero?

71
Model Selection
  • In the classical framework, likelihood ratio tests or information criteria (e.g. AIC) are often used.
  • There is a similar Bayesian statistic: the DIC.
  • This is programmed within WinBUGS; however, its implementation is not suitable for hierarchical models (e.g. random effect models).
  • In addition, the DIC is known to give fallacious results in even simple problems.
  • Within the general Bayesian framework, there is a more natural way of dealing with the issue of model discrimination.

72
Bayesian Approach
  • We treat the model itself as an unknown parameter to be estimated.
  • Then, applying Bayes' Theorem, we obtain the posterior distribution over both parameter and model space:
  • π(θm, m | data) ∝ L(data | θm, m) p(θm | m) p(m).
  • Here θm denotes the parameters in model m.

73
Posterior Model Probabilities
  • The Bayesian approach then allows us to quantitatively discriminate between competing models via posterior model probabilities:
  • π(m | data) = ∫ π(θm, m | data) dθm
  • ∝ p(m) ∫ L(data | θm, m) p(θm | m) dθm
  • Note that we need to specify priors on both the parameters and now also on the models themselves.
  • Thus we need to specify a prior probability for each model to be considered.

74
MCMC-based estimates
  • We have a posterior distribution (over parameter and model space) defined up to proportionality:
  • π(θm, m | data) ∝ L(data | θm, m) p(θm | m) p(m)
  • If we can sample from this posterior distribution then we are able to obtain posterior estimates of summary statistics of interest.
  • In particular, the posterior model probabilities can be estimated as the proportion of time that the chain is in each model.
  • So, all we need to do is define how we construct such a Markov chain!!
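Counting visits is all the estimation step takes once such a chain exists; a sketch in Python with a made-up model-indicator trace (the labels and values below are hypothetical):

```python
from collections import Counter

# Hypothetical model-indicator trace from a trans-dimensional (RJ) chain,
# recorded at each iteration after burn-in.
chain_models = ["m1", "m1", "m2", "m1", "m3", "m1", "m2", "m1", "m1", "m2"]

counts = Counter(chain_models)
post_prob = {m: c / len(chain_models) for m, c in counts.items()}
# post_prob["m1"] is the proportion of iterations the chain spent in model m1.
```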

75
Reversible Jump MCMC
  • The reversible jump MCMC algorithm allows us to construct a Markov chain with stationary distribution equal to the posterior distribution.
  • It is simply an extension of the Metropolis-Hastings algorithm that allows moves between different dimensions.
  • This algorithm is needed because the number of parameters, θm, in model m may differ between models.
  • Note that this algorithm needs only one Markov chain, irrespective of the number of models!!

76
Markov chain
  • Each iteration of the Markov chain essentially involves two steps:
  • Within-model moves: updating each parameter using standard MCMC moves (Gibbs sampler, Metropolis-Hastings)
  • Between-model moves: updating the model using a reversible jump type move.
  • Then, standard MCMC-type processes apply, such as using an appropriate burn-in, obtaining summary statistics of interest, etc.
  • To illustrate the RJ algorithm, we consider a particular example relating to variable selection (e.g. covariate analysis).

77
WinBUGS
  • General RJ updates cannot currently be programmed into WinBUGS.
  • Bespoke code needs to be written instead.
  • However, the recent add-on called 'jump' allows two particular RJ updates to be performed in WinBUGS:
  • Variable selection
  • Splines
  • See http://www.winbugs-development.org.uk/rjmcmc.html
  • So, in particular, WinBUGS can be used for model selection in the White storks example.

78
Example: White Storks
  • Recall our white storks example: there are 10 possible covariates, hence a total of 2^10 possible models (1024!).
  • We specify a prior probability of 0.5 that each covariate is in the model.
  • Then, conditional on the covariate being present, we specify
  • β ~ N(0, 10).
  • Finally, for the random effect error variance,
  • σ² ~ Γ⁻¹(0.001, 0.001).

79
Example: White Storks
  • WinBUGS demonstration.
  • RJMCMC is performed on the betas only, not on σ or on the intercept α.

80
Results
Models with largest posterior support:

Model        Posterior prob  Cumulative
0001000000   0.5178058297    0.5178058297
0000000000   0.07629791141   0.5941037411
0001010000   0.0474087675    0.6415125086
0001100000   0.03814092265   0.6796534313
0001000001   0.03085379849   0.7105072297
1001000000   0.02549001607   0.7359972458
0001000100   0.02374569658   0.7597429424
0001001000   0.02336699564   0.783109938
0001000010   0.0229240303    0.8060339683
0000000100   0.02123479458   0.8272687629
0101000000   0.01809960982   0.8453683727
0011000000   0.01540739041   0.8607757631
1000000000   0.01186825798   0.8726440211
0000010000   0.01103282075   0.8836768419
81
Results
Additionally, the (marginal) posterior probability that each covariate influences the survival rates:

node       mean   sd    posterior marg. prob
effect1    0.01   0.04  0.06
effect2    0.00   0.03  0.04
effect3    0.00   0.02  0.03
effect4    0.30   0.17  0.83
effect5   -0.01   0.04  0.07
effect6    0.01   0.05  0.09
effect7   -0.00   0.03  0.04
effect8    0.01   0.06  0.07
effect9    0.01   0.04  0.05
effect10  -0.01   0.04  0.05
p          0.91   0.01
sdeps      0.20   0.14
82
Results: survival rates