Title: Genomics Workshop
1 Genomics Workshop
Demography of Aging Centers Biomarker Network Meeting, in conjunction with the Annual Meeting of the PAA
April 14, 9:00 AM to 3:30 PM, Hyatt Regency, Dallas, Texas
Sponsored by the USC/UCLA Center of Biodemography and Population Health
Organized by Teresa Seeman, Steven Cole, Eileen Crimmins
2 Workshop agenda
- Tactical aspects of study administration and sample capture/storage
- Biological overview of genetics and functional genomics
- Strategic aspects of study design and data analysis
- Lunch
- Technical aspects of study design and data analysis
- Perspectives on the State of the Field
- Application clinic
3 Tactical aspects of study administration and sample capture/storage
- DNA
  - New sample capture
    - Methods: e.g., Oragene, leukocytes
    - Consent and administrative issues
  - Retrospective analyses
    - Sources: blood spots, cheek swabs, etc.
    - Consent and administrative issues
- Epigenetics
  - DNA methylation
  - Histone acetylation and chromatin dynamics
  - Tissue specificity (vs. DNA)
- Tactical issues: Reports from the Field
  - "I wish I'd known then"
- RNA
  - Identifying appropriate target tissues
    - Whole blood, PBMC, saliva, hair, pathology specimens
  - Sample capture/storage
  - Consent and administrative issues
11-14 [Diagram built up across slides; labels: IL6 gene, DNA, RNA, Health]
16 Biological overview of genetics and functional genomics
- Theoretical framework: genes, environments, transcription, and health
- Genetic influences (missing h², penetrance, R-square, etc.)
- Functional genomics
  - Transcription factors
  - Epigenetics
- Gene-Environment interactions
  - Regulatory polymorphism
  - Coding polymorphism
- System dynamics
  - Feedback, network pleiotropy
  - Recursive developmental trajectories
19-27 [Diagram built up across slides; labels: IL6 gene, DNA, RNA, Health, Social Environment]
28-33 IL6 gene transcription
[Diagram built up across slides; labels: NE, PKA, P, GATA1, and the IL6 sequence TCT TGCGATGCTA AAG]
34 Socio-environmental regulation of IL6
[Figure; p = .008]
42-45 [Diagram built up across slides; labels: IL6 gene, G/C, DNA, RNA, Health, Social Environment]
46-49 Gene x Environment Interaction: In silico
[Diagram built up across slides; labels: IL6 sequence TCT TGCGATGCTA AAG, V$GATA1_01 .943, C]
50-51 Gene x Environment Interaction: In silico, In vitro
[Figure: transcriptional activity (fold-change) of the IL6 promoter, WT vs. -174C, at norepinephrine (mM) 0 and 10; difference p < .0001]
52-53 Gene x Environment Interaction
[Figure: two panels, IL6 -174 GG and IL6 -174 CC/GC; p = .008 and p = .439]
55-57 [Diagram built up across slides; labels: IL6 gene, G/C, DNA, RNA, RNA2, Health, Health2, Social Environment]
61-67 Gene-Environment Correlation
[Diagram built up across slides; labels: IL6 gene, DNA, RNA, Health, Social Environment, Behavior, Recursive Molecular Remodeling]
68-76 Recursive developmental remodeling
[Diagram built up across slides; labels: Body1, Environment1, Behavior1, RNA1; Time 2: Body2, Environment2; "RNA intra-organismic adaptation"]
Cole (2009) Current Directions in Psychological Science
78 Strategic aspects of study design and data analysis
- Basic substantive objectives and study designs
- Gene discovery (e.g., genetic epidemiology)
- Environmental regulation of health (via transcription)
- Gene-Environment interaction
79-86 [Diagrams built up across slides; labels: IL6 gene, G/C, DNA, RNA, Health]
87 Strategic aspects of study design and data analysis (continued)
- Antagonistic pleiotropy
88-90 Antagonistic pleiotropy
[Figure: two panels, Older Adult and Adolescent; y-axis: CRP mg/L / Adversity SD, from -3.0 to 3.0; x-axis: IL6 -174 genotype (CC, GC, GG); p = .007 and p = .032]
Evolution deletes disadvantage, particularly to the young
91-98 Fisher's regression
[Plots: Outcome by genotype (GG, GC, CC), shown for Environment A and Environment B]
Main-effect model: $y = a + b(G) + e$
Interaction model: $y = a + b(G) + c(\mathrm{Env}) + d(G \times \mathrm{Env}) + e$
When the environment terms are omitted, the error term of the main-effect model absorbs them: $e \leftarrow c(\mathrm{Env}) + d(G \times \mathrm{Env}) + e$
Consequences: reduced power, increased parameter estimate bias, Marginal = 0
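The contrast between the two regression models above can be sketched with a few lines of Python. This is a minimal illustration on hypothetical toy data, not workshop data. It assumes G is coded as the number of G alleles (CC = 0, GC = 1, GG = 2) and that the outcome rises with G in Environment A and falls with G in Environment B. With Env coded 0 for A and 1 for B, the interaction coefficient d equals the difference between the two within-environment slopes.

```python
def slope(xs, ys):
    """Least-squares slope b for the simple regression y = a + b*x + e."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical toy data (not workshop data).
# Assumed coding: G = number of G alleles (CC = 0, GC = 1, GG = 2).
g = [0, 1, 2, 0, 1, 2]
y_env_a = [1.0 + 1.0 * gi for gi in g]  # Environment A: outcome rises with G
y_env_b = [3.0 - 1.0 * gi for gi in g]  # Environment B: outcome falls with G

b_env_a = slope(g, y_env_a)                   # genotype slope within Environment A
b_env_b = slope(g, y_env_b)                   # genotype slope within Environment B
b_marginal = slope(g + g, y_env_a + y_env_b)  # main-effect model, environments pooled
d_interaction = b_env_b - b_env_a             # G x Env term (Env: A = 0, B = 1)

print("slope in A:", b_env_a)
print("slope in B:", b_env_b)
print("marginal slope:", b_marginal)
print("interaction d:", d_interaction)
```

In this toy case the pooled (marginal) slope is zero even though genotype matters in both environments, which is the situation the main-effect model cannot detect.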
99-100 Strategic aspects of study design and data analysis (continued)
Valid statistical models are one major reason that substantive interests (environments) matter.
OK, then, let's have lunch.
101 Technical aspects of study design and data analysis
- Study designs, assay technologies, and statistical methods
- Gene discovery (e.g., genetic epidemiology)
  - Candidate gene studies
  - Genome-wide association studies
  - The bioinformatic middle road
- Environmental regulation of health (via transcription)
  - Candidate transcript studies
  - Genome-wide approaches
- Gene-Environment interaction
  - Statistical issues
  - Revisiting the bioinformatic middle road
102 Technical aspects of study design and data analysis
- Study designs, assay technologies, and statistical methods
- Gene discovery (e.g., genetic epidemiology)
  - Candidate gene studies
    - Candidate identification
    - Targeted genotyping
      a. PCR
      b. High-throughput approaches
    - Statistical models
      a. Fisher's basic regression model
      b. Multivariate mapping / association / recombination
         i. Recombination
         ii. Haplotype blocks
      c. Confounding
         i. Linkage disequilibrium and haplotype analyses
         ii. Ethnic stratification
            - Phenotypic ascertainment
            - Genetic ancestry
         iii. Mendelian randomization
113
Well  ID1  ID2  RFU1     RFU2     Ct1    Ct2    Call
A01   053  053  1094.39  956.90   42.53  41.36  Heterozygote
A02   065  065  -43.33   1519.25  60.00  40.39  Allele2
A03   075  075  1126.77  890.96   42.82  42.02  Heterozygote
A04   079  079  2095.09  25.36    42.84  60.00  Allele1
A05   087  087  2187.80  18.09    41.27  60.00  Allele1
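As a small illustration of what can be done with calls like those in the table above, the sketch below tallies genotype counts and allele frequencies for the five wells shown. It assumes that a call of "Allele1" or "Allele2" denotes a homozygote for that allele; the slide does not state this, so treat it as an assumption. Five samples are far too few for any real inference.

```python
from collections import Counter

# Calls transcribed from the genotyping table above (well -> call).
calls = {
    "A01": "Heterozygote",
    "A02": "Allele2",
    "A03": "Heterozygote",
    "A04": "Allele1",
    "A05": "Allele1",
}

# Assumption: "Allele1" / "Allele2" calls are homozygotes for that allele.
# Values are (copies of allele 1, copies of allele 2) per sample.
ALLELES_PER_CALL = {
    "Allele1": (2, 0),
    "Allele2": (0, 2),
    "Heterozygote": (1, 1),
}


def allele_frequencies(call_by_well):
    """Return (freq of allele 1, freq of allele 2) across all samples."""
    n1 = n2 = 0
    for call in call_by_well.values():
        a1, a2 = ALLELES_PER_CALL[call]
        n1 += a1
        n2 += a2
    total = n1 + n2
    return n1 / total, n2 / total


genotype_counts = Counter(calls.values())
freq1, freq2 = allele_frequencies(calls)

print(dict(genotype_counts))
print("allele 1 frequency:", freq1)
print("allele 2 frequency:", freq2)
```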
119-125 Fisher's regression
[Plot: Outcome by genotype (GG, GC, CC)]
$y = a + b(G)$
$y = a + b(GG) + c(GC) + d(CC)$
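The difference between the allele-count model and the genotype-specific model above can be sketched as follows. The data are hypothetical, chosen so that GC and GG have the same outcome (a dominance pattern), and the coding of G as the number of G alleles is an assumption. The genotype-specific model is fit here as one mean per genotype, which is equivalent to dropping one of the three indicator terms as a reference category.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical toy data: (genotype, outcome). Not workshop data.
data = [("CC", 1.0), ("GC", 3.0), ("GG", 3.0)]

# Assumed additive coding: number of G alleles.
G_COUNT = {"CC": 0, "GC": 1, "GG": 2}

# Allele-count model: y = a + b(G)
a, b = fit_line([G_COUNT[gt] for gt, _ in data], [y for _, y in data])
additive_pred = {gt: a + b * G_COUNT[gt] for gt in G_COUNT}

# Genotype-specific model: one fitted mean per genotype.
genotype_pred = {}
for gt in G_COUNT:
    values = [y for g2, y in data if g2 == gt]
    genotype_pred[gt] = sum(values) / len(values)


def sse(pred):
    """Sum of squared residuals for a genotype -> prediction mapping."""
    return sum((y - pred[gt]) ** 2 for gt, y in data)


sse_additive = sse(additive_pred)
sse_genotype = sse(genotype_pred)

print("additive slope:", b)
print("SSE additive:", sse_additive)
print("SSE genotype-specific:", sse_genotype)
```

The genotype-specific model fits the dominance pattern exactly, at the cost of an extra parameter; the allele-count model cannot.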
131-135 Fisher's regression
[Plot: Outcome by genotype (GG, GC, CC); label: HapMap Tag SNP]
$y = a + b(\text{G rs1800795})$
$y = a + b(\text{G rs1800795}) + c(\text{T rs20937}) + \dots$
$y = a + b(\text{Haplotype containing rs1800795})$
$y = a + b(\text{ATTCGTAC})$
137-138 Linkage-driven indirect association gradients
[Figure]
141 [Diagram; labels: Culture/behavior/exposure, Environment]
144 Ancestry classification via mitochondrial haplogroups (also Y haplogroups for paternal lineage)
148-151 [Diagram built up across slides; labels: CRP, CVD, IL-6]
153-155 Technical aspects of study design and data analysis
- Study designs, assay technologies, and statistical methods
- Gene discovery (e.g., genetic epidemiology)
  - Candidate gene studies
  - Genome-wide association studies
    - Marker selection for blind search: tag SNPs
    - Massively parallel genotyping
      a. Array-based strategies
      b. Deep resequencing
    - Statistical models
      a. Main effect models
      b. Interaction models
      c. Managing Type I error
         - Bonferroni, FDR
         - Internal cross-validation
         - External replication
162-164 Fisher's regression
[Plots: Outcome by genotype (GG, GC, CC), shown for Environment A and Environment B]
Main effect models:
$y = a + b(G)$
$y = a + b(GG) + c(GC) + d(CC)$
Interaction models:
$y = a + b(G) + c(\mathrm{Env}) + d(G \times \mathrm{Env})$
$y = a + b(GG) + c(GC) + d(CC) + e(\mathrm{Env}) + f(\mathrm{Env} \times GG) + g(\mathrm{Env} \times GC) + h(\mathrm{Env} \times CC)$
166-169 Type 1 / false positive error
- Confirmatory hypothesis testing (candidate genes)
  - 1 hypothesis, 1 t-test, 1 p-value: no problem
  - p < .05 = p < .05
- Gene mapping (exploratory association testing)
  - Gene expression: 22,000 p-values = 1,100 false positives (p < .05)
    - p(false discovery > 0) = .999999999999999999999999
  - Gene polymorphism: 10,000,000 p-values = 500,000 false positives (p < .05)
    - p(false discovery > 0) = .999999999999999999999999
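The arithmetic behind these figures can be checked with a short sketch. It assumes every test is a true null and that the tests are independent, a simplification that real expression or genotype data will not satisfy exactly. Because the probability of zero false positives is far too small to show by subtracting from 1 in floating point, the sketch reports its base-10 logarithm instead.

```python
import math

ALPHA = 0.05


def expected_false_positives(n_tests, alpha=ALPHA):
    """Expected number of p < alpha results when every null is true."""
    return n_tests * alpha


def log10_prob_no_false_positive(n_tests, alpha=ALPHA):
    """log10 of P(zero false positives), assuming independent null tests.

    P(at least one false positive) is 1 minus 10 raised to this value.
    """
    return n_tests * math.log10(1.0 - alpha)


for label, n_tests in [("gene expression", 22_000),
                       ("gene polymorphism", 10_000_000)]:
    print(label,
          "expected false positives:", expected_false_positives(n_tests),
          "log10 P(no false positive):", log10_prob_no_false_positive(n_tests))
```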
170-181 What to do?
1. Increase stringency (intra-study)
   - Bonferroni correct (p = .05/22,000 = .00000227)
     - Choice: huge samples or massive Type 2 / false negative error
   - Model/simulate error
     - Randomization test or FDR modeling: less conservative bias
     - Unimpressive yield: p = .00000300 if you're lucky.
     - Still too conservative, and biased (omitted true effects in error term)
   - Use a better sampling design
     [Figure: Population prevalence design vs. Outcome-stratified design]
2. Replicate (inter-study or intra-study cross-validation)
   - .05 x .05 x .05 = .000125 x 22,000 = 2.75 false positives (vs. 1,100)
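The numbers above, and the contrast between Bonferroni correction and FDR control, can be sketched as follows. The Benjamini-Hochberg step-up procedure is used here as one common way to do "FDR modeling"; the slides do not say which FDR method was meant, and the five p-values are hypothetical.

```python
N_TESTS = 22_000
ALPHA = 0.05

# Bonferroni: per-test threshold that keeps the family-wise error rate at ALPHA.
bonferroni_threshold = ALPHA / N_TESTS

# Replication: expected false positives if a result must reach p < ALPHA
# in three independent studies.
replicated_false_positives = ALPHA ** 3 * N_TESTS


def bonferroni(pvalues, alpha=ALPHA):
    """Reject where p <= alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, q=ALPHA):
    """Benjamini-Hochberg step-up procedure; returns reject flags in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    largest_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= q * rank / m:
            largest_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[i] = True
    return rejected


# Hypothetical p-values, for illustration only.
pvals = [0.5, 0.001, 0.045, 0.012, 0.035]
bonf_flags = bonferroni(pvals)
bh_flags = benjamini_hochberg(pvals)

print("Bonferroni threshold:", bonferroni_threshold)
print("False positives after 3-fold replication:", replicated_false_positives)
print("Bonferroni rejections:", bonf_flags)
print("BH rejections:", bh_flags)
```

On these toy values Benjamini-Hochberg rejects one more hypothesis than Bonferroni, which illustrates the "less conservative" point on the slide.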
184-185 Technical aspects of study design and data analysis
- Study designs, assay technologies, and statistical methods
- Gene discovery (e.g., genetic epidemiology)
  - Candidate gene studies
  - Genome-wide association studies
  - The bioinformatic middle road: biological hypotheses buy power
    - Candidate set selection
      a. Regulatory polymorphism
      b. Coding polymorphism
    - Statistical considerations
      a. Power
      b. Differential enrichment
186-189 In silico prediction of Gene x Environment Interaction
[Figures built up across slides:
 In silico: IL6 sequence TCT TGCGATGCTA AAG
 In vitro: transcriptional activity (fold-change) of the IL6 promoter, WT vs. -174C, at norepinephrine (mM) 0 and 10; difference p < .0001
 In vivo: IL6 -174 GG and IL6 -174 CC/GC; p = .439 and p = .008]
190 1205 GRE-modifying SNPs
191 Gene set enrichment analysis
193-194 [Figure; labels: Population prevalence design, Outcome-stratified design, GEscan]
199 Technical take-home points
- Strengths and weaknesses of alternative approaches
- Candidate gene studies: focus on 1 candidate
  - Advantages
    - Scientifically tractable, incremental, cross-validatable
    - Maximal statistical power (focused hypothesis)
  - Disadvantages
    - Can only discover what we already know (i.e., biased)
- Genome-wide association studies: focus on all candidates
  - Advantages
    - Unbiased de novo discovery
  - Disadvantages
    - Minimal statistical power, particularly for interactions
- The bioinformatic middle road: focus on a small set of causally plausible candidates (unbiased search of regulatory and coding