Title: Structure and Uncertainty
1Structure and Uncertainty
- Peter Green, University of Bristol, 10 July 2003
2
- Statistician: a term that is more or less equivalent to that of Statesman. (Francis Galton, Memories of My Life, Chapter XXI)
- We are not concerned with the very poor. They are unthinkable, and only to be appreciated by the statistician or the poet. (E.M. Forster, Howards End)
- "Organic chemist!" said Tilley expressively. "Probably knows no statistics whatever." (Nigel Balchin, The Small Back Room)
- Before the curse of statistics fell upon mankind we lived a happy, innocent life, full of merriment and go, and informed by fairly good judgment. (Hilaire Belloc, The Silence of the Sea)
- Like dreams, statistics are a form of wish fulfillment. (Jean Baudrillard, Cool Memories)
- It is commonly believed that anyone who tabulates numbers is a statistician. This is like believing that anyone who owns a scalpel is a surgeon. (R. Hooke, How to Tell the Liars from the Statisticians)
3Statistics and science
- If your experiment needs statistics, you ought
to have done a better experiment
Ernest Rutherford (1871-1937)
4Statistics and science
- Organic chemist! said Tilley expressively.
Probably knows no statistics whatever.
Nigel Balchin (1908-1970) The Small Back Room
5Graphical models
Modelling
Mathematics
Inference
Algorithms
6Contingency tables
Graphical models
7Modular structure
- Basis for
- understanding the real system
- capturing important characteristics statistically
- defining appropriate methods
- computation
- inference and interpretation
8 1. Modelling
Modelling
Mathematics
Inference
Algorithms
9Structured systems
- A framework for building models, especially probabilistic models, for empirical data
- Key idea:
- understand complex system
- through global model
- built from small pieces
- comprehensible
- each with only a few variables
- modular
12 Mendelian inheritance - a natural structured model
[Figure: pedigree diagram with node labels A, AB, A, O, O; label: Mendel]
13 Ion channel model
[Diagram node labels: model indicator; transition rates; hidden state; binary signal; levels & variances; data]
Hodgson and Green, Proc Roy Soc Lond A, 1999
14
[Diagram node labels: model indicator; C1, C2, C3, O1, O2; transition rates; hidden state; binary signal; levels & variances; data]
15 Gene expression using Affymetrix chips
[Figure: image of hybridised array, with zoom onto a hybridised spot. Labels: single stranded, labeled RNA sample; oligonucleotide element; millions of copies of a specific oligonucleotide sequence element; approx. ½ million different complementary oligonucleotides; expressed genes; non-expressed genes; 20µm; 1.28cm]
Slide courtesy of Affymetrix
16Gene expression is a hierarchical process
- Substantive question
- Experimental design
- Sample preparation
- Array design & manufacture
- Gene expression matrix
- Probe level data
- Image level data
17Mapping of rare diseases
Larynx cancer in females in France, 1986-1993
(standardised ratios)
18Mapping of rare diseases
regression coefficient
covariate
random spatial effects
relative risks
observed counts
19Mapping of rare diseases
Estimated posterior probability of excess risk, using hidden Markov model (G & Richardson, 2002)
20Mapping of rare diseases using Hidden Markov model
Larynx cancer in females in France, 1986-1993
(standardised ratios)
Posterior probability of excess risk
G & Richardson, 2002
22Probabilistic expert systems
23 2. Mathematics
Modelling
Mathematics
Inference
Algorithms
24Graphical models
- Use ideas from graph theory to
- represent structure of a joint probability distribution
- by encoding conditional independencies
25Where does the graph come from?
- Genetics
- pedigree (family connections)
- Lattice systems
- interaction graph (e.g. nearest neighbours)
- Gaussian case
- graph determined by non-zeroes in inverse
variance matrix
26 Inverse of (co)variance matrix: independent case
[Figure: 4 x 4 matrix with rows and columns labelled A, B, C, D; graph on nodes A, B, C, D]
27 Inverse of (co)variance matrix: dependent case
[Figure: 4 x 4 matrix with rows and columns labelled A, B, C, D, annotated "non-zero ? non-zero"; graph on nodes A, B, C, D]
Few links implies few parameters - Occam's razor
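The link between zeros in the inverse (co)variance matrix and missing links in the graph can be checked numerically. The following is a minimal sketch, not taken from the talk: the covariance matrix (correlation 1/2 to the power of the distance along a chain A - B - C - D) and the helper `invert` are assumptions chosen for illustration, and exact fractions are used so that zeros are exact.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment with the identity matrix, using exact rational arithmetic.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap a row with a non-zero pivot into place, then normalise it.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


names = ["A", "B", "C", "D"]
rho = Fraction(1, 2)  # ASSUMED correlation between neighbours on the chain

# Dense covariance matrix: every pair of variables is marginally correlated.
covariance = [[rho ** abs(i - j) for j in range(4)] for i in range(4)]

# Its inverse is sparse: non-zero off-diagonal entries mark the links.
precision = invert(covariance)
links = [(names[i], names[j])
         for i in range(4) for j in range(i + 1, 4)
         if precision[i][j] != 0]
print(links)
```

Under these assumptions the covariance matrix has no zero entries, yet only the three links A-B, B-C and C-D survive in the inverse, so three off-diagonal parameters describe the whole dependence structure.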
28
- Few links implies few parameters
- stable estimation
- Parsimony
- Occam's razor
29Conditional independence
- X and Z are conditionally independent given Y if, knowing Y, discovering Z tells you nothing more about X: p(X | Y, Z) = p(X | Y)
- X ⊥ Z | Y
[Graph on nodes X, Y, Z]
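For reference, the definition can be written in several equivalent ways. The block below is a standard restatement rather than something shown on the slide; lower-case letters denote values of X, Y, Z, and each identity is understood to hold wherever the conditioning events have positive probability.

```latex
\begin{align*}
X \perp Z \mid Y
  &\iff p(x \mid y, z) = p(x \mid y) \\
  &\iff p(x, z \mid y) = p(x \mid y)\, p(z \mid y) \\
  &\iff p(x, y, z) = \frac{p(x, y)\, p(y, z)}{p(y)}
\end{align*}
```

The last form shows the joint distribution assembled from two small pieces, p(x, y) and p(y, z), that overlap only in Y.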
30Conditional independence
- as seen in data on perinatal mortality vs.
ante-natal care.
Does survival depend on ante-natal care?
.... what if you know the clinic?
31Conditional independence
[Graph on nodes ante, survival, clinic]
- survival and clinic are dependent
- ante and clinic are dependent
- but survival and ante are conditionally independent (CI) given clinic
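A small numerical illustration of this pattern follows. The counts are HYPOTHETICAL, invented so that the arithmetic is exact; they are not the perinatal mortality data referred to in the talk, and the names `counts`, `within` and `pooled` are illustrative only.

```python
from fractions import Fraction

# HYPOTHETICAL counts, not the data from the talk:
# counts[clinic][ante-natal care] = (died, survived)
counts = {
    "A": {"less": (4, 196), "more": (6, 294)},
    "B": {"less": (20, 180), "more": (2, 18)},
}


def death_rate(died, survived):
    """Proportion of deaths among all births in a cell."""
    return Fraction(died, died + survived)


# Within each clinic, compare mortality for less vs. more ante-natal care.
within = {clinic: {ante: death_rate(*cell) for ante, cell in table.items()}
          for clinic, table in counts.items()}

# Pool over clinics, i.e. ignore which clinic each mother attended.
pooled = {}
for ante in ("less", "more"):
    died = sum(counts[clinic][ante][0] for clinic in counts)
    survived = sum(counts[clinic][ante][1] for clinic in counts)
    pooled[ante] = death_rate(died, survived)

for clinic, rates in within.items():
    print(clinic, {ante: float(rate) for ante, rate in rates.items()})
print("pooled", {ante: float(rate) for ante, rate in pooled.items()})
```

With these invented counts the mortality rate is the same for both levels of care inside each clinic (2% in A, 10% in B), while the pooled rates differ (6% against 2.5%) because the clinics differ both in mortality and in how much care they give.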
32Conditional independence provides a mathematical
basis for splitting up a large system into
smaller components
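Two standard factorisations make this splitting explicit. They are textbook results quoted here for orientation, not formulas from the slides; the notation pa(v) for the parents of node v, and C and S for the cliques and separators of a junction tree, is assumed.

```latex
% Directed acyclic graph: one small factor per variable
p(x_1, \dots, x_n) = \prod_{v=1}^{n} p\bigl(x_v \mid x_{\mathrm{pa}(v)}\bigr)

% Decomposable undirected graph: clique marginals over separator marginals
p(x) = \frac{\prod_{C} p(x_C)}{\prod_{S} p(x_S)}
```

In both cases each factor involves only a few variables, which is what makes the global model modular.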
33
[Figure: graph node labels C, D, D, F, B, E, B, E, A]
34 3. Inference
Modelling
Mathematics
Inference
Algorithms
35 or non-Bayesian
36Bayesian paradigm in structured modelling
- borrowing strength
- automatically integrates out all sources of uncertainty
- properly accounting for variability at all levels
- including, in principle, uncertainty in model itself
- avoids over-optimistic claims of certainty
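"Borrowing strength" can be illustrated with the simplest hierarchical model. The sketch below is an assumption-laden toy, not a method from the talk: each group has one observation y_i ~ N(theta_i, noise_var), the group means satisfy theta_i ~ N(mu, between_var), both variances are treated as known, and mu has a flat prior. Under exactly those assumptions the posterior mean of each theta_i is a weighted average of its own observation and the grand mean.

```python
def posterior_means(observed, noise_var, between_var):
    """Posterior means of group effects in a normal-normal hierarchy.

    ASSUMES known variances and a flat prior on the common mean mu.
    """
    # Weight on a group's own observation; the rest goes to the grand mean.
    weight = between_var / (between_var + noise_var)
    grand_mean = sum(observed) / len(observed)
    return [weight * y + (1.0 - weight) * grand_mean for y in observed]


# HYPOTHETICAL observations for three groups.
observed = [2.0, 5.0, 11.0]
estimates = posterior_means(observed, noise_var=4.0, between_var=4.0)
print(estimates)
```

With equal variances the weight is one half, so each estimate sits halfway between the group's own value and the grand mean of 6: the groups inform one another through the shared parameter mu.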
37 Repeated measures: seizure counts in a randomised trial of anti-convulsant therapy in epilepsy
38Bayesian structured modelling
- borrowing strength
- automatically integrates out all sources of uncertainty
- for example in forensic statistics with DNA probe data...
39(thanks to J Mortera)
41Bayesian structured modelling
- borrowing strength
- automatically integrates out all sources of uncertainty
- for example in modelling complex biomedical systems like ion channels...
42 4. Algorithms
Modelling
Mathematics
Inference
Algorithms
43Algorithms for probability and likelihood
calculations
- Exploiting graphical structure
- Markov chain Monte Carlo
- Probability propagation (Bayes nets)
- Expectation-Maximisation
- Variational methods
44Markov chain Monte Carlo
- Subgroups of one or more variables updated randomly,
- maintaining detailed balance with respect to target distribution
- Ensemble converges to equilibrium target distribution (e.g. Bayesian posterior)
45Markov chain Monte Carlo
[Figure: graph with two labels shown as "?" in the transcript]
Updating
- need only look at neighbours
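The locality of the update can be seen in a single-site Gibbs sampler, one particular MCMC scheme in which each update satisfies detailed balance with respect to the target. The sketch below is illustrative only: the chain graph A - B - C - D, the Ising-type target p(x) proportional to exp(coupling * sum over links of x_i * x_j) with x_i in {-1, +1}, and the value of `coupling` are all assumptions, not a model from the talk.

```python
import math
import random

# HYPOTHETICAL graph: a chain A - B - C - D of +/-1 variables.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
coupling = 0.5  # ASSUMED interaction strength on every link


def gibbs_sweep(state, rng):
    """Update every site once, in random order, from its full conditional."""
    sites = list(state)
    rng.shuffle(sites)
    for site in sites:
        # The full conditional depends only on the neighbours' current values.
        field = coupling * sum(state[n] for n in neighbours[site])
        p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))
        state[site] = 1 if rng.random() < p_plus else -1


rng = random.Random(2003)
state = {site: 1 for site in neighbours}
n_sweeps, burn_in, total = 20000, 1000, 0
for sweep in range(burn_in + n_sweeps):
    gibbs_sweep(state, rng)
    if sweep >= burn_in:
        total += state["A"] * state["B"]

estimate = total / n_sweeps   # Monte Carlo estimate of E[x_A * x_B]
exact = math.tanh(coupling)   # known value for this chain model
print(estimate, exact)
```

For this toy target the neighbour correlation is known to equal tanh(coupling), which gives a simple check that the ensemble has settled on the intended distribution.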
46Probability propagation
[Figure: graph with nodes numbered 1 to 7]
47 Message passing in junction tree
[Figure: junction tree with the root marked]
48 Message passing in junction tree
[Figure: junction tree with the root marked]
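Message passing is easiest to see in its simplest special case, a chain, where the junction tree is itself a chain of pairs and the messages are the familiar forward and backward recursions. The sketch below is not the general junction tree algorithm and uses HYPOTHETICAL numbers; `chain_posteriors` and `brute_force_posteriors` are illustrative names. It computes posterior marginals by local messages and by summing over every configuration, so the two can be compared.

```python
from itertools import product


def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]


def chain_posteriors(prior, trans, evidence):
    """Posterior marginals on a chain via forward and backward messages."""
    n, states = len(evidence), range(len(prior))
    # Forward messages carry information from the first node outwards.
    forward = [[prior[s] * evidence[0][s] for s in states]]
    for k in range(1, n):
        forward.append([evidence[k][j] *
                        sum(forward[k - 1][i] * trans[i][j] for i in states)
                        for j in states])
    # Backward messages carry information back from the last node.
    backward = [[1.0 for _ in states] for _ in range(n)]
    for k in range(n - 2, -1, -1):
        backward[k] = [sum(trans[i][j] * evidence[k + 1][j] * backward[k + 1][j]
                           for j in states) for i in states]
    return [normalise([forward[k][s] * backward[k][s] for s in states])
            for k in range(n)]


def brute_force_posteriors(prior, trans, evidence):
    """Same marginals by summing over all joint configurations."""
    n, states = len(evidence), range(len(prior))
    totals = [[0.0 for _ in states] for _ in range(n)]
    for config in product(states, repeat=n):
        weight = prior[config[0]] * evidence[0][config[0]]
        for k in range(1, n):
            weight *= trans[config[k - 1]][config[k]] * evidence[k][config[k]]
        for k, s in enumerate(config):
            totals[k][s] += weight
    return [normalise(t) for t in totals]


# HYPOTHETICAL binary chain X1 - X2 - X3 - X4, with X4 observed in state 1.
prior = [0.6, 0.4]                      # p(x1)
trans = [[0.7, 0.3], [0.2, 0.8]]        # p(x_{k+1} | x_k)
evidence = [[1, 1], [1, 1], [1, 1], [0, 1]]

by_messages = chain_posteriors(prior, trans, evidence)
by_enumeration = brute_force_posteriors(prior, trans, evidence)
print(by_messages[0], by_enumeration[0])
```

The message-passing cost grows linearly with the length of the chain, whereas the enumeration grows exponentially; on larger graphs the same economy is what the junction tree provides.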
49Emission tomography
Industry standard reconstruction, using Radon
transform
50Emission tomography, continued
Bayesian Reconstruction (G, 1994)
51 Learning structure
Learning a Bayesian network, for an ICU ventilator management system, from 10000 cases on 37 variables (Spirtes & Meek, 1995)
52Structured systems success stories include...
- Genomics & bioinformatics: DNA & protein sequencing, gene mapping, evolutionary genetics
- Spatial statistics: image analysis, environmetrics, geographical epidemiology, ecology
- Temporal problems: longitudinal data, financial time series, signal processing
53thanks to many
54The role of statistical modelling
- Discipline in creation of methodology
- Framework
- for study of foundations
- for expressing principles
- for provision of computational tools
- Use more to communicate ideas
- break down barriers between theory and practice?