UK Biobank and biobank harmonization

1
UK Biobank and biobank harmonization
  • Paul Burton
  • Dept of Health Sciences
  • Dept of Genetics
  • University of Leicester

2
Structure of talk
  • What is UK Biobank?
  • Scientific rationale?
  • Statistical power of nested case-control studies
  • Expected event rates in UK Biobank
  • Biobank Harmonization
  • Conclusions

3
What is UK Biobank?
4
Basic design features
  • A prospective cohort study
  • 500,000 adults across UK
  • Middle aged (40-69 years)
  • A population-based biobank
  • Not disease or exposure based
  • Recruitment via electronic GP lists
  • Broad spectrum, not fully representative
  • Individuals not families
  • Funded by MRC, Wellcome Trust, DH, Scottish Executive: £61M

5
Basic design features
  • Longitudinal health tracking
  • Nested case-control studies
  • Long time-horizon
  • Owned by the Nation
  • Central administration: Manchester
  • PI: Prof Rory Collins (Oxford)
  • 6 collaborating groups of university scientists

6
The Fosse Way
UK Biobank Fosse Way Regional Collaboration Centre
Local collection centres: Birmingham, Exeter, Leicester, Nottingham,
Plymouth, Sheffield, Truro, Warwick
(Figure: map of the collection centres, not reproduced in the transcript)
7
Scientific Rationale
8
Justification for UK Biobank
  • Primary justifications: roles that can best be fulfilled by a new large
    cohort study of the type represented by UK Biobank
  • Secondary justifications: roles that could be provided by other types of
    study but, given that UK Biobank is to go ahead anyway, can be taken on
    at relatively low marginal cost

9
A platform for research in biomedical science
  • Studies of the joint effects of genes and
    environment/life-style
  • Genotype-based studies
  • The genetics of disease progression
  • Direct association of genes with disease
  • Universal controls
  • Family-based studies

10
Statistical power and sample size
11
Issues that are often ignored in standard power
calculations
  • Multiple testing/low prior probability of
    association
  • Interactions
  • Unobserved frailty
  • Misclassification of genotype, environmental determinant, or
    case-control status
  • Subgroup analyses
  • Population substructure

12
Power calculations
  • Work with least powerful setting
  • Binary disease, binary genotype, binary
    environmental exposure
  • Logistic regression: interactions are departures
    from a multiplicative model
  • Complexity
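
One way to write down the model these bullets describe, as a sketch only: G and E are assumed to be 0/1 indicators for genotype and environmental exposure, D is disease status, and the symbol names are illustrative rather than taken from the slides.

```latex
\[
\operatorname{logit} P(D = 1 \mid G, E)
  = \beta_0 + \beta_G\, G + \beta_E\, E + \beta_{GE}\, G E
\]
\[
\mathrm{OR}_G = e^{\beta_G}, \qquad
\mathrm{OR}_E = e^{\beta_E}, \qquad
\mathrm{OR}_{GE} = e^{\beta_{GE}}
\]
\[
\beta_{GE} = 0
\iff
\mathrm{OR}(G{=}1, E{=}1 \text{ vs } G{=}0, E{=}0) = \mathrm{OR}_G \times \mathrm{OR}_E
\]
```

Under this parameterisation the interaction odds ratio measures how far the joint effect departs from the product of the two main-effect odds ratios.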

13
Summarise power using minimum detectable odds ratios (MDORs) calculated by
iterative simulation
  • Estimate minimum ORs detectable with 80% power at
    stated level of statistical significance under
    specified scenario

14
Whole genome scan
  • Genetic main effect, p < 10^-7

15
Summary
  • 80% power for genotype frequency 0.1 (allele
    frequency ≈ 0.05 under dominant model)
  • Genetic main effect ≈ 1.5, p < 10^-4 → 5,000 cases
  • Genetic main effect ≈ 1.3, p < 10^-4 → 10,000 cases
  • Genetic main effect ≈ 1.2, p < 10^-4 → 20,000 cases
  • Genetic main effect ≈ 1.4, p < 10^-7 → 10,000 cases
  • Genetic main effect ≈ 1.3, p < 10^-7 → 20,000 cases
  • (allele frequency 0.1 →
    10,000 cases)
  • G×E interaction with environmental exposure
    prevalence 0.2 ≈ 2.0, p < 10^-4 → 20,000
    cases
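
The link between a genotype frequency of 0.1 and an allele frequency of about 0.05 can be checked with a few lines of Python. This is a sketch that assumes Hardy-Weinberg proportions (an assumption not stated on the slide); the function name is illustrative.

```python
import math


def allele_freq_dominant(genotype_freq: float) -> float:
    """Risk-allele frequency q for a dominant model.

    Assumes Hardy-Weinberg proportions, so the carrier frequency is
    1 - (1 - q)**2; solving for q gives 1 - sqrt(1 - carrier frequency).
    """
    return 1.0 - math.sqrt(1.0 - genotype_freq)


q = allele_freq_dominant(0.1)
print(round(q, 4))  # 0.0513
```

A carrier frequency of 0.1 therefore corresponds to an allele frequency slightly above 0.05, in line with the figure quoted on the slide.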

16
Expected event rates in UK Biobank
17
Taking account of
  • Age range at recruitment 40-69 years
  • Recruitment over 5 years
  • All cause mortality
  • Disease incidence (healthy cohort effect)
  • Migration overseas
  • Comprehensive withdrawal (max 1/500 p.a.)
  • Partial withdrawal (c.f. 1958 Birth Cohort)
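
A rough sketch of how factors like these can be combined into an expected case count. It is not the calculation used in the talk: it uses constant annual rates instead of age-specific ones, ignores the healthy cohort effect and partial withdrawal, and every rate except the 1/500 p.a. withdrawal ceiling is a hypothetical placeholder.

```python
def expected_cases(n_recruits, recruit_years, follow_years,
                   incidence, mortality, emigration, withdrawal):
    """Expected cumulative incident cases at the end of each study year.

    All rates are treated as constant annual probabilities (a simplification).
    """
    per_year = n_recruits / recruit_years   # uniform recruitment
    loss = mortality + emigration + withdrawal
    at_risk = 0.0
    cases = 0.0
    cumulative = []
    for year in range(follow_years):
        if year < recruit_years:
            at_risk += per_year             # new recruits join the cohort
        new_cases = at_risk * incidence
        cases += new_cases
        # Incident cases and people lost to follow-up leave the at-risk pool.
        at_risk -= new_cases + at_risk * loss
        cumulative.append(cases)
    return cumulative


# Hypothetical rates, for illustration only.
totals = expected_cases(500_000, recruit_years=5, follow_years=20,
                        incidence=0.001, mortality=0.01,
                        emigration=0.002, withdrawal=1 / 500)
print(f"Expected cases after 20 years: {totals[-1]:.0f}")
```

Replacing the constants with age- and sex-specific rates would give a table of expected events by year of follow-up for each condition.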

18
No need to contact subjects
19
Smaller sample sizes
20
Conclusions
  • Having taken account of realistic bioclinical
    complexity, UK Biobank is just about large enough
    to be of great value as a stand-alone research
    infrastructure
  • Its value will be greatly augmented if it proves
    possible to set up a coherent and scientifically
    harmonized international network of Biobanks and
    large cohort studies

21
Harmonizing biobanks internationally
22
Why harmonize?
  • Investigate less common (but not rare) conditions
  • UKBB: stomach cancer, 2,500 cases in 29 years
  • 6 UKBB equivalents → 10,000 cases in 20 years
  • Investigate smaller ORs
  • Genetic main effect 1.5 → 1.2 requires 5,000 → 20,000 cases
  • 4 UKBB equivalents
  • Analysis based on subsets: homogeneous classes
    of phenotype, or e.g. by sex

23
Why harmonize?
  • Earlier analyses
  • UKBB: Alzheimer's disease, 10,000 cases in 18 years
  • 5 UKBB equivalents → 9 years
  • Events at younger ages
  • Broad range of environmental exposures
  • Aim for 4-6 UKBB equivalents
  • 2M to 3M recruits

25
Harmonization initiatives
  • Population Biobanks
  • FP6 Co-ordination Action
  • Camilla Stoltenberg, Paul Burton, Leena Peltonen,
    George Davey Smith ...
  • GenomeEUhealth
  • Proposed FP6 Integrated Project
  • Leena Peltonen ...
  • Public Population Project in Genomics (P3G)
  • Canada and Europe
  • Tom Hudson, Bartha Knoppers ...

26
Extra slides
27
Genetic main effects
  • p < 10^-4

28
Gene-environment interaction
  • 20,000 cases, p < 10^-4

29
Rarer genotypes
  • Genetic main effects

30
Necessary to contact subjects
32
Summarise power using MDORs calculated by
iterative simulation
  • Want minimum ORs detectable with 80% power at
    stated level of statistical significance
  • 1. Guess starting values for ORs
  • 2. Simulate population under specified scenario
  • 3. Sample required number of cases and controls
  • 4. Analyse resultant case-control study in
    standard way
  • 5. Repeat steps 2, 3 and 4 1,000 times
  • 6. Use empirical statistical power results from
    the 1,000 analyses to update ORs to new values
    expected to generate a power of 80%
  • Repeat steps 2-6 until all ORs have 80% power
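
The loop above can be sketched in Python for the simplest case, a single binary genotype main effect. Several choices here are assumptions rather than the method used in the talk: carrier counts are drawn directly from binomial distributions instead of simulating a full population, the analysis is a Wald test on the log odds ratio of the 2x2 table, the update rule is plain bisection, and a 1:1 case-control ratio is used. The sketch ignores misclassification and the other complexities listed earlier, so it will give more optimistic values than those quoted in the talk.

```python
import numpy as np
from statistics import NormalDist


def empirical_power(odds_ratio, n_cases, n_controls, geno_freq, alpha,
                    n_sims=1000, seed=0):
    """Fraction of simulated case-control studies that reject OR = 1."""
    rng = np.random.default_rng(seed)
    # Carrier frequency among cases implied by the OR; geno_freq is taken
    # to be the carrier frequency among controls.
    p_case = geno_freq * odds_ratio / (1 - geno_freq + geno_freq * odds_ratio)
    x = rng.binomial(n_cases, p_case, n_sims)        # carrier cases
    y = rng.binomial(n_controls, geno_freq, n_sims)  # carrier controls
    # 2x2 table cells, with 0.5 added so that no cell is zero.
    a, b = x + 0.5, n_cases - x + 0.5
    c, d = y + 0.5, n_controls - y + 0.5
    z = np.log(a * d / (b * c)) / np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided Wald test
    return float(np.mean(np.abs(z) > z_crit))


def minimum_detectable_or(n_cases, n_controls, geno_freq, alpha,
                          target_power=0.8, lo=1.0, hi=5.0, n_iter=20):
    """Bisect on the OR until the empirical power is close to the target."""
    for _ in range(n_iter):
        mid = (lo + hi) / 2
        power = empirical_power(mid, n_cases, n_controls, geno_freq, alpha)
        if power < target_power:
            lo = mid   # too little power: a larger OR is needed
        else:
            hi = mid   # enough power: try a smaller OR
    return (lo + hi) / 2


mdor = minimum_detectable_or(n_cases=5000, n_controls=5000,
                             geno_freq=0.1, alpha=1e-4)
print(f"MDOR ~ {mdor:.2f}")
```

Reusing the same seed for every power evaluation keeps the bisection from chasing simulation noise; a fuller version would add interaction terms and misclassification to the simulated data before the analysis step.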

33
Proposed assessment visit model
34
Hattersley AT, McCarthy MI. A question of standards: what makes a good
genetic association study? Lancet 2005 (in press).