Title: Slide 1
2EPISTASIS SIZE DOES MATTER
Antonio Salas Centro Nacional de Genotipado
(CeGen Santiago de Compostela) E-mail
apimlase_at_usc.es
3The genetic architecture of the disease
(1) the number of genes that impact disease
susceptibility
(2) the distribution of alleles and genotypes at
those genes
(3) the manner in which the alleles and genotypes
impact disease susceptibility (Weiss, 1993)
4There are likely to be many susceptibility genes
each with combinations of rare and common alleles
and genotypes that impact disease susceptibility
primarily through nonlinear interactions with
genetic and environmental factors
5COMPLICATING FACTORS
6 7 8 9HETEROGENEITY
INTERACTIONS
Analytically, it can be difficult to distinguish
between heterogeneity and interactions. Many of
the methods that address heterogeneity might be
equally applicable to uncovering interactions.
10What are gene-gene interactions?
William Bateson (1861-1926) used the term
epistasis to describe distortions of mendelian
segregation ratios that were due to one gene
masking the effects of another.
Ronald Fisher (1890-1962) described epistasis as
deviations from additivity in a linear
statistical model.
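Fisher's definition can be written out for two loci. This is a minimal sketch, assuming genotypes at loci A and B are coded as allele counts x_A and x_B taking values 0, 1 or 2. The coding and the symbols are illustrative assumptions, not taken from the slides.

```latex
y = \mu + \beta_A x_A + \beta_B x_B + \beta_{AB}\, x_A x_B + \varepsilon
```

Under this coding, statistical epistasis corresponds to \beta_{AB} \neq 0: the joint effect of the two loci is not the sum of their separate effects.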
11Genetical, biological and statistical epistasis
There is a very close relationship between
genetical and biological epistasis, with each
occurring at the level of the individual.
Differences in genetical and biological epistasis
among individuals in a population give rise to
statistical epistasis. It is entirely possible
for genetical and biological epistasis to occur
in the absence of statistical epistasis.
Does evidence of statistical epistasis
necessarily imply genetical or biological
epistasis?
12Why are gene-gene interactions likely to be
common?
(1) The recognition that deviations from Mendelian
ratios are due to interactions between genes has
been around for nearly 100 years. This is
important because it is an idea that has
prevailed through time and is still recognized
today.
13(2) The ubiquity of biomolecular interactions in gene
regulation and biochemical and metabolic systems
suggests that the relationship between DNA
sequence variations and clinical endpoints is
likely to involve gene-gene interactions.
14(3) Positive results from studies of single
polymorphisms typically do not replicate across
independent samples. Gene-gene interactions
appear to play an important role in this lack
of replication.
15(4) Gene-gene interactions are commonly found when
properly investigated (Templeton, 2000)
16Epistasis is, most probably, the principal
mechanism that explains the inter-individual
phenotypic variability of genetic diseases.
17Why are gene-gene interactions difficult to
detect?
Epistasis is difficult to detect and characterize
using traditional parametric statistical methods,
because of the sparseness of the data in high
dimensions.
e.g. 100 binary (biallelic) SNPs, three genotypes each:
number of combinations = 3^100 ≈ 5.2E47
EPISTASIS, SIZE DOES MATTER
18The result of this added dimensionality is that
exponentially larger sample sizes are needed to
have enough data to estimate the interaction
effects.
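The growth in dimensionality can be made concrete with a short calculation. This is a minimal sketch, assuming three genotypes per SNP. The function names and the study size of 1000 individuals are hypothetical choices for illustration.

```python
def genotype_combinations(n_snps: int, genotypes_per_snp: int = 3) -> int:
    """Number of multilocus genotype combinations for n_snps loci."""
    return genotypes_per_snp ** n_snps


def mean_individuals_per_cell(sample_size: int, n_snps: int) -> float:
    """Average number of sampled individuals per genotype combination."""
    return sample_size / genotype_combinations(n_snps)


if __name__ == "__main__":
    sample_size = 1000  # hypothetical study size, not from the slides
    for k in (1, 2, 5, 10, 100):
        cells = genotype_combinations(k)
        per_cell = mean_individuals_per_cell(sample_size, k)
        print(f"{k:>3} SNPs: {cells:.2e} combinations, "
              f"{per_cell:.2e} individuals per combination")
```

With a fixed sample, the average number of individuals per genotype combination falls below one after only a handful of SNPs, so most combinations are never observed.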
This phenomenon has been referred to as the
curse of dimensionality (Bellman, 1961) and, for
methods such as logistic regression, can lead to
parameter estimates that have very large standard
errors resulting in an increase in type I errors
(Concato et al., 1993; Peduzzi et al., 1996;
Hosmer and Lemeshow, 2000).
19Most common human diseases are likely to have
complex etiologies. Methods of analysis that
allow for or exploit the phenomenon of epistasis
are clearly of growing importance in the genetic
dissection of complex disease and response to
drugs. By allowing for epistatic interactions
between potential disease loci, we may succeed in
identifying genetic variants that might otherwise
have remained undetected.
20- MULTIVARIATE ADAPTIVE REGRESSION SPLINES (MARS)
- CLASSIFICATION AND REGRESSION TREES (CART)
- COMBINATORIAL PARTITIONING METHOD (CPM)
- RESTRICTED-PARTITION METHOD (RPM)
- MULTIFACTOR DIMENSIONALITY REDUCTION (MDR)
- ARTIFICIAL NEURAL NETWORKS
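As an illustration of one method in this list, the core step of MDR can be sketched as follows: each multilocus genotype cell is pooled into a high-risk or low-risk group according to its case:control ratio, which reduces several SNPs to a single binary attribute. This is a minimal sketch under stated assumptions, not a full implementation. The function names, the threshold of 1.0 and the toy data are hypothetical, and the cross-validation and permutation testing used by the full method are omitted.

```python
from collections import Counter
from itertools import combinations


def mdr_high_risk_cells(genotypes, status, loci, threshold=1.0):
    """Return the set of multilocus genotype cells labelled high-risk.

    genotypes: list of tuples, one genotype code per SNP
    status:    list of 1 (case) / 0 (control), same length as genotypes
    loci:      indices of the SNPs to combine
    """
    cases, controls = Counter(), Counter()
    for g, s in zip(genotypes, status):
        cell = tuple(g[i] for i in loci)
        (cases if s == 1 else controls)[cell] += 1
    # A cell is high-risk when its case:control ratio reaches the threshold
    # (threshold = 1.0 is an assumption suited to balanced data).
    return {c for c in set(cases) | set(controls)
            if cases[c] >= threshold * controls[c]}


def balanced_accuracy(genotypes, status, loci, high_risk):
    """Balanced accuracy of predicting 'case' for high-risk cells.

    Assumes the data contain at least one case and one control.
    """
    tp = fn = tn = fp = 0
    for g, s in zip(genotypes, status):
        predicted = tuple(g[i] for i in loci) in high_risk
        if s == 1:
            tp, fn = tp + predicted, fn + (not predicted)
        else:
            fp, tn = fp + predicted, tn + (not predicted)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def best_pair(genotypes, status):
    """Exhaustively score every SNP pair; return (accuracy, loci)."""
    n_snps = len(genotypes[0])
    scored = []
    for loci in combinations(range(n_snps), 2):
        cells = mdr_high_risk_cells(genotypes, status, loci)
        scored.append((balanced_accuracy(genotypes, status, loci, cells), loci))
    return max(scored)


if __name__ == "__main__":
    # Toy data: disease depends on SNPs 0 and 1 jointly; SNP 2 is noise.
    toy_genotypes = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
                     (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
    toy_status = [0, 0, 1, 1, 1, 1, 0, 0]
    print(best_pair(toy_genotypes, toy_status))
```

In the toy data no single SNP separates cases from controls, yet the pair of SNPs 0 and 1 does so perfectly. This is the kind of purely interactive effect that single-locus tests miss.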
22Many software and bioinformatics tools
exist to carry out common
computational tasks in the analysis of complex
diseases and pharmacogenetics (e.g. SNP and
haplotype association tests, haplotype
reconstruction, LD, etc.).
The epistasis problem, however, is of a different
dimension.
23FUTURE WORK IN THE CEGEN NODE OF SANTIAGO DE
COMPOSTELA
EXPLORING METHODS (SVM, CART, Bagging, Boosting,
classification trees, Random Forest, kNN, MARS,
Neural Networks, etc)
DEVELOPMENT OF NEW METHODS AND SOFTWARE
24MULTIDISCIPLINARY TEAM (CEGEN-SANTIAGO NODE)
- Medical doctors
- Pharmacists
- Biologists
- Geneticists
- Physicists
- Mathematicians
- Statisticians
- Computer scientists
25http://www.cesga.es/ga
26Thank you very much for your attention
Antonio Salas Centro Nacional de Genotipado
(CeGen Santiago de Compostela) Contact e-mail
apimlase_at_usc.es