Title: Examples are Tangier disease in Tangier Island off the coas
1 Genetic Epidemiology
M. Tevfik DORAK http://www.dorak.info/epi/genetepi.html
2 Approaches to the identification of susceptibility genes
Rebbeck TR. Cancer 1999 (www)
3 Palmer LJ. Webcast (www)
4 GENETIC EPIDEMIOLOGIC RESEARCH METHODS
Handbook of Statistical Genetics (John Wiley & Sons) Fig. 28-1 (www)
5 GENETIC EPIDEMIOLOGY: Flow of research (question addressed: type of study)
Disease characteristics: Descriptive epidemiology
Familial clustering: Family aggregation studies
Genetic or environmental: Twin/adoption/half-sibling/migrant studies
Mode of inheritance: Segregation analysis
Disease susceptibility loci: Linkage analysis
Disease susceptibility markers: Association studies
6 Autosomal recessive disorders are usually more common in populations with a high level of inbreeding (restricted gene pool). Examples are:
Tangier disease in Tangier Island off the coast of Virginia, USA
many genetic disorders in Ashkenazi Jews (Tay-Sachs disease, Gaucher disease, Fanconi anaemia, Niemann-Pick disease)
congenital adrenal hyperplasia (CAH) due to 21-hydroxylase deficiency in Yupik Eskimos
CAH due to 11-beta hydroxylase deficiency in Moroccan Jews
thalassaemias (beta and alpha) in Cyprus and Sardinia
Populations like Finland, Iceland and Newfoundland exhibit an increased prevalence of rare recessive diseases (for example, congenital nephrotic syndrome of the Finnish type and Newfoundland rod-cone dystrophy).
7 Study Designs in Genetic Epidemiology
nuclear families (index case and parents)
affected relative pairs (sibs, cousins, any two members of the family)
extended pedigrees
twins (monozygotic and dizygotic)
unrelated population samples
9 Risk Ratio (Lambda)
Genetics in Clinical Research (www)
10 Risk Ratio (Lambda)
Genetics in Clinical Research (www)
11 Sibling Recurrence Risk / Sibling Risk Ratio (λS)
Curnow & Smith. J Roy Stat Soc 1975;138:139-169
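For orientation, the sibling risk ratio is conventionally written as the recurrence risk in siblings of affected individuals divided by the population prevalence. The symbols K_S and K below are notation assumed here for illustration; they do not appear on the slide.

```latex
\lambda_S = \frac{K_S}{K},
\qquad K_S = P(\text{sibling affected} \mid \text{proband affected}),
\qquad K = P(\text{affected in the general population})
```

Under this definition, a value of λS close to 1 indicates no familial aggregation, and larger values indicate stronger clustering among siblings.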
12 ROCHE Genetic Education (www)
14 (MacGregor, 2000)
15 ROCHE Genetic Education (www)
16 ROCHE Genetic Education (www)
17 Adoption Studies
1. Compare the risk in biological relatives with adoptive relatives of affected adoptees (beware of adoption bias)
2. Compare the risk in biological relatives with adoptive relatives of unaffected adoptees
18 Migrant Studies
Liao CK et al. Endometrial cancer in Asian migrants to the United States and their descendants. Cancer Causes Control 2003;14:357-60 (www)
Flood DM et al. Colorectal cancer incidence in Asian migrants to the United States and their descendants. Cancer Causes Control 2000;11:403-11 (www)
Feltbower RG et al. Trends in the incidence of childhood diabetes in south Asians and other children in Bradford, UK. Diabet Med 2002;19:162-6 (www)
Children in south Asia have a low incidence of type 1 diabetes but migrants to the UK have similar overall rates to the indigenous population. However, a more steeply rising incidence is seen in the south Asian population, and our data suggest that incidence in this group may eventually outstrip that of the non-south Asians. Genetic factors are unlikely to explain such a rapid change, implying an influence of environmental factors in disease aetiology.
20 (www)
21 Washington University (www)
22 Modes of inheritance
24 (www)
25 ROCHE Genetic Education (www)
26 Differences between linkage and association
27 Risch NJ. Nature 2000
30 Association Studies
Population-based: cases and unrelated population controls from the same study base
Family-based: child-family trios with the TDT design are the most common
31 Odds ratio 3.6 (95% CI 1.3 to 10.4)
ROCHE Genetic Education (www)
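As a rough sketch of how an odds ratio and its 95% confidence interval can be obtained from a 2x2 case-control table, the snippet below uses Woolf's log-based method. The counts are hypothetical and are not the data behind the figures quoted on the slide; the function name `odds_ratio_ci` is likewise an assumption made for illustration.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only to exercise the function.
or_, lo, hi = odds_ratio_ci(20, 30, 10, 40)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

Woolf's interval is an approximation that can behave poorly with small cell counts; exact methods may be preferable in that setting.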
32 Genetic Models and Case-Control Association Data Analysis
The data may also be analysed assuming a prespecified genetic model. For example, with the hypothesis that carrying allele B increases risk of disease (dominant model), the AB and BB genotypes are pooled, giving a 2x2 table. This is particularly relevant when allele B is rare, with few BB observations in cases and controls. Alternatively, under a recessive model for allele B, cells AA and AB would be pooled.
Analysing by alleles provides an alternative perspective for case-control data. This breaks down genotypes to compare the total number of A and B alleles in cases and controls, regardless of the genotypes from which these alleles are constructed. This analysis is counter-intuitive, since alleles do not act independently, but it provides the most powerful method of testing under a multiplicative genetic model, where risk of developing a disease increases by a factor r for each B allele carried: risk r for genotype AB and r^2 for genotype BB. If a multiplicative genetic model is appropriate, both case and control genotypes will be in Hardy-Weinberg equilibrium, and this can be tested for.
A fourth possible genetic model is additive, with an increased disease risk of r for AB genotypes and 2r for BB genotypes. This model shows a clear trend of an increased number of AB and BB genotypes, with the risk for AB genotypes approximately half that for BB genotypes. The additive genetic model can be tested for using Armitage's test for trend.
Lewis CM. Brief Bioinform 2002 (www)
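The pooling schemes described above can be sketched in a few lines. The snippet below is a minimal illustration, assuming hypothetical genotype counts for cases and controls in the order (AA, AB, BB); the function names are invented for this example. It collapses the 2x3 table under the dominant, recessive and allelic analyses and computes a Cochran-Armitage trend statistic with weights 0, 1, 2 for the additive model.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for rows (a, b) and (c, d)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_1df(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

def collapse(cases, controls, model):
    """Collapse (AA, AB, BB) counts into 2x2 cells: case row first, then control row."""
    (c_aa, c_ab, c_bb), (n_aa, n_ab, n_bb) = cases, controls
    if model == "dominant":    # AA versus AB + BB
        return c_aa, c_ab + c_bb, n_aa, n_ab + n_bb
    if model == "recessive":   # AA + AB versus BB
        return c_aa + c_ab, c_bb, n_aa + n_ab, n_bb
    if model == "allelic":     # count of A alleles versus count of B alleles
        return 2 * c_aa + c_ab, c_ab + 2 * c_bb, 2 * n_aa + n_ab, n_ab + 2 * n_bb
    raise ValueError(f"unknown model: {model}")

def armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend chi-square (1 df) across the three genotypes."""
    totals = [c + k for c, k in zip(cases, controls)]
    n, r = sum(totals), sum(cases)
    sum_rx = sum(x * c for x, c in zip(weights, cases))
    sum_nx = sum(x * t for x, t in zip(weights, totals))
    sum_nx2 = sum(x * x * t for x, t in zip(weights, totals))
    return n * (n * sum_rx - r * sum_nx) ** 2 / (r * (n - r) * (n * sum_nx2 - sum_nx ** 2))

# Hypothetical genotype counts (AA, AB, BB); not taken from any study.
cases = (30, 50, 20)
controls = (50, 40, 10)

results = {m: chi2_2x2(*collapse(cases, controls, m))
           for m in ("dominant", "recessive", "allelic")}
results["additive (trend)"] = armitage_trend(cases, controls)

for model, chi2 in results.items():
    print(f"{model}: chi2 = {chi2:.2f}, P = {p_value_1df(chi2):.4f}")
```

Running several models on the same data raises a multiple-testing concern, so in practice the model would ideally be specified before the analysis, as the text suggests.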
33 ROCHE Genetic Education (www)
34 Linkage disequilibrium and population demography
Mapping disease genes by association requires the identification of linkage disequilibrium (LD) between a marker and a disease phenotype. Several studies of African populations have indicated that levels and patterns of LD in these populations differ from those in non-African populations owing to the age of African populations, admixture with other African and non-African populations, and historical differences in population size and substructure.
A disease mutation (shown in violet) that occurs on a single haplotype background will initially be in complete LD with flanking markers on that chromosome (see panel a). In each generation, LD between a marker and a disease allele decays owing to recombination between the sites, and also because of the effects of mutation and gene conversion at marker loci. Young populations, and those that have undergone recent bottlenecks (as probably occurred during the migration of ancestral humans out of Africa), will have haplotype blocks of large to moderate size (panel b, shown in green). In older and larger African populations, in which there has been more recombination, the size of haplotype blocks will probably be smaller (panel c).
LD can also be established by a founder event, with the strength and extent of the LD depending on the severity and length of the bottleneck event. Population substructure increases LD owing to a smaller effective population size and to higher levels of genetic drift in subdivided populations. So, if a pooled sample derived from several African populations was analysed, spurious LD would be detected, even if the haplotypes in each subpopulation were in linkage equilibrium. This could lead to erroneous conclusions about the association between genetic markers and disease phenotype.
Small populations of stable size are expected to show LD between closely linked loci as a result of increased genetic drift, and larger populations will have fewer sites in LD. New mutations are less likely to be in LD in growing populations owing to the smaller effect of genetic drift, but allelic associations that exist before population expansion might persist for a longer period of time in an expanding population than in a population of constant size.
Tishkoff, Nat Reviews Genet 2002 (www)
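Two of the points above lend themselves to a small numerical sketch: the decay of LD through recombination, and the spurious LD that appears when subpopulations are pooled. The snippet assumes the textbook simplification that the LD coefficient D shrinks by a factor (1 - theta) per generation under random mating, ignoring mutation, gene conversion and drift. All function names and numeric values are hypothetical.

```python
def ld_after(d0, theta, generations):
    """LD coefficient D after a number of generations, given recombination fraction theta.

    Assumes random mating and recombination as the only force acting on D.
    """
    return d0 * (1.0 - theta) ** generations

def pooled_d(subpops):
    """D in a pooled sample of subpopulations that are each in linkage equilibrium.

    subpops: list of (mixing weight, frequency of allele A, frequency of allele B);
    the weights are expected to sum to 1.
    """
    p_ab = sum(w * pa * pb for w, pa, pb in subpops)   # within each subpopulation D = 0
    p_a = sum(w * pa for w, pa, _ in subpops)
    p_b = sum(w * pb for w, _, pb in subpops)
    return p_ab - p_a * p_b

# Decay of an initial D of 0.25 between loci with 1% recombination (hypothetical values).
decay = {t: ld_after(0.25, 0.01, t) for t in (0, 10, 100, 1000)}
for t, d in decay.items():
    print(f"generation {t}: D = {d:.4f}")

# Two equally weighted subpopulations with very different allele frequencies.
spurious = pooled_d([(0.5, 0.9, 0.9), (0.5, 0.1, 0.1)])
print(f"D in pooled sample = {spurious:.2f}")
```

The second calculation shows a nonzero D in the pooled sample even though D is zero within each subpopulation, which is the stratification artefact the text warns about.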
35 Mapping Disease Susceptibility Genes by Association Studies
(www)
36 Mapping Disease Susceptibility Genes by Association Studies
Plot of minus log of P value for case-control test for allelic association with AD, for SNPs immediately surrounding APOE (<100 kb)
Martin, 2000 (www)
38 Sample size requirements for different genetic models
Palmer & Cardon, Lancet 2005 (www)
39 Sample size requirements as a function of allele frequencies
Johnson GC et al. Nat Genet 2001 (www)
40 Sample size requirements as a function of the strength of association
Botstein & Risch. Nat Genet 2003 (www)
42 SNP Selection for Association Studies
- Regulatory / Functional SNPs -
(www)
FastSNP (www)
Yuan, 2006 (www)
43 SNP Selection for Association Studies
- Regulatory / Functional SNPs -
(www)
Yue, 2006 (www)
44 SNP Selection for Association Studies
- Haplotype Tagging SNPs -
(www)
(www)
45 Haplotype Association
Tabor HK et al. Nature Rev Genetics 2002 (www)
46 Illustration of tagging SNPs
a. The diagram shows five haplotypes. Twelve single nucleotide polymorphisms (SNPs) are localized in order along the chromosome. The letters on the top indicate groups of SNPs that have perfect pairwise linkage disequilibrium (LD) with one another, and the numbers on the bottom indicate each of the 12 SNPs. SNP 9 is the causal variant, which in this simple example determines drug response: allele C results in a therapeutic response, whereas allele G results in an adverse reaction. In this example, the selection of just one SNP from each of the groups A-E would be sufficient to fully represent all of the haplotype diversity. Each haplotype can be identified by just five tagging SNPs (tSNPs), and the causal variant would be tagged even if it were not itself typed (in fact, multi-marker approaches to tSNP selection would reduce the set of tags to fewer than five, but this is ignored for simplicity). So, tSNP profiles that are highlighted predict an adverse reaction to the medicine. Normally, LD patterns are not so clear-cut and statistical methods are required to select appropriate sets of tSNPs.
b. The diagram depicts the same 12 SNPs, but with different associations among them, as might happen in a different population group. Because patterns of LD are different, some patients would be misclassified if the same five tSNPs were used and interpreted in the same way; that is, using the same SNP profiles as defined in population A, haplotype profiles 1, 2 and 3 are predicted to have allele C at the causal SNP 9 (a therapeutic response), whereas haplotype profiles 4 and 5 are predicted to have an adverse response. However, because the pattern of association has changed, the new haplotypes 6 and 7 are misclassified as haplotype patterns 6 and 7 in population B.
Goldstein, Nat Rev Genet 2003 (www)
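The idea of picking one tSNP per group of perfectly correlated SNPs can be sketched as follows. This is a toy illustration only: the four six-SNP haplotypes are invented (they are not the twelve-SNP example in the figure), and the function names are assumptions. SNPs are grouped when their alleles split the set of haplotypes in exactly the same way, which is what perfect pairwise LD means for biallelic SNPs.

```python
def ld_groups(haplotypes):
    """Group SNP positions (1-based) that are in perfect pairwise LD.

    Two biallelic SNPs are placed in the same group when their alleles
    partition the haplotypes identically, whatever the allele labels are.
    """
    groups = {}
    for j in range(len(haplotypes[0])):
        column = [h[j] for h in haplotypes]
        # Pattern of "same allele as the first haplotype" is label-independent.
        pattern = tuple(allele == column[0] for allele in column)
        groups.setdefault(pattern, []).append(j + 1)
    return list(groups.values())

def tag_profiles(haplotypes, tags):
    """Alleles carried by each haplotype at the chosen tag SNPs."""
    return ["".join(h[t - 1] for t in tags) for h in haplotypes]

# Hypothetical haplotypes over six SNPs.
haplotypes = ["ACGTAC", "ACGAAT", "GTGTCC", "GTCACT"]

groups = ld_groups(haplotypes)
tags = [g[0] for g in groups]            # one tSNP per group (first member, arbitrarily)
profiles = tag_profiles(haplotypes, tags)

print("LD groups:", groups)
print("tag SNPs:", tags)
print("tag profiles:", profiles)
```

As the slide notes, real LD patterns are rarely this clean, so practical tSNP selection relies on statistical thresholds rather than exact matching, and tags chosen in one population may not transfer to another.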
47 Erichsen & Chanock. Br J Cancer 2004 (www)
48 Associations with Ancestral Haplotypes
(Schork, 1998)
49 Dorak, 2002 (www)
Ayala, 1994 (www)
50 Palmer LJ. Webcast (www)
51 Wacholder, 2002 (www)
52 Population Stratification
Marchini, 2004 (www)
53 Population Stratification
Cardon & Palmer, 2003 (www)
54 Multiple Comparisons Spurious Associations
Diepstra, Lancet 2005 (www)
55 (www)
56 Family-based association study designs
Haplotype Relative Risk (HRR) Method (Falk & Rubinstein, 1987; Knapp, 1993)
Affected Family-Based Controls (AFBAC) Method (Thomson, 1995)
Transmission Disequilibrium/Distortion Test (TDT) (Spielman, 1993; 1994; Ewens & Spielman, 1995)
Reviews (Thomson, 1995; Gauderman, 1999)
58 - AN EXAMPLE OF TDT -
TRANSMISSION DISEQUILIBRIUM OF HLA-B62 TO THE PATIENTS WITH CHILDHOOD AML (Dorak et al, BSHI 2002)
Out of 13 parents heterozygous for B62, 12 transmitted B62 to the affected child and 1 did not.
McNemar's test result: P = 0.006 (with continuity correction); odds ratio 12.0, 95% CI 1.8 to 513
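The test statistic in this example can be checked with a short calculation. The sketch below applies McNemar's chi-square with continuity correction to the transmitted and non-transmitted counts given above (12 and 1); the function name is an assumption, and the confidence interval is not recomputed here because the slide does not say which method produced it.

```python
import math

def tdt_mcnemar(transmitted, not_transmitted, continuity=True):
    """McNemar chi-square (1 df) and P value for transmissions from heterozygous parents."""
    diff = abs(transmitted - not_transmitted)
    if continuity:
        diff = max(diff - 1, 0)          # continuity correction
    chi2 = diff ** 2 / (transmitted + not_transmitted)
    p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return chi2, p_value

chi2, p_value = tdt_mcnemar(12, 1)
odds_ratio = 12 / 1                      # transmitted / not transmitted

print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}, odds ratio = {odds_ratio:.1f}")
```

With these counts the corrected statistic is 100/13, and the resulting P value should fall close to the 0.006 reported on the slide.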
59 Multifactorial Etiology
ROCHE Genetic Education (www)
60 Models of gene-environment interactions
Hunter, 2005 (www)
61 Sample size requirement for gene-environment interaction studies
Hunter, 2005 (www)
62 An example of a gene-environment interaction
In Alzheimer disease, the risk of cognitive decline as measured by the TICS test is particularly high in APOE4 carriers who have untreated hypertension (APOE4/HT).
Hunter, 2005 (www)
63 Falconer's polygenic threshold model for dichotomous nonmendelian characters
Liability to the condition is polygenic and normally distributed (upper curve). People whose liability is above a certain threshold value are affected. Their sibs (lower curve) have a higher average liability than the population mean and a greater proportion of them have liability exceeding the threshold. Therefore the condition tends to run in families (Falconer DS, 1967).
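The threshold argument can be made concrete with a rough numerical sketch. The snippet assumes a standard normal liability, a hypothetical population prevalence and heritability of liability, and the simplifying approximation that liability in relatives keeps unit variance; the function name and all input values are illustrative and are not taken from the slide.

```python
from statistics import NormalDist

def relative_risk_under_threshold(prevalence, heritability, relatedness=0.5):
    """Approximate risk to relatives of affected individuals under a liability threshold model.

    prevalence:   population prevalence of the condition
    heritability: heritability of liability (hypothetical input)
    relatedness:  coefficient of relationship (0.5 for sibs)
    Assumes liability in relatives is normal with unit variance (an approximation).
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1 - prevalence)             # liability threshold
    mean_affected = std_normal.pdf(threshold) / prevalence     # mean liability of affected people
    mean_relatives = relatedness * heritability * mean_affected  # shifted mean in relatives
    return 1 - std_normal.cdf(threshold - mean_relatives)

# Hypothetical inputs: 1% prevalence, heritability of liability 0.8.
prevalence = 0.01
sib_risk = relative_risk_under_threshold(prevalence, 0.8)
print(f"population risk = {prevalence:.3f}, approximate sib risk = {sib_risk:.3f}")
```

Because the mean liability of sibs is shifted towards the threshold, a larger share of them exceeds it, which is the familial clustering the model is meant to explain.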
64 M. Tevfik DORAK http://www.dorak.info