Linkage analysis: basic principles

1
Linkage analysis: basic principles
Manuel Ferreira & Pak Sham
Boulder Advanced Course 2005
2
Outline
1. Aim
2. The Human Genome
3. Principles of Linkage Analysis
4. Parametric Linkage Analysis
5. Nonparametric Linkage Analysis
3
1. Aim
4
For a heritable trait...
Linkage: localizes the region of the genome where a locus (or loci) that regulates the trait is likely to be harboured
Family-specific phenomenon: affected individuals in a family share the same ancestral predisposing DNA segment at a given trait locus
Association: identifies a locus that regulates the trait
Population-specific phenomenon: affected individuals in a population share the same ancestral predisposing DNA segment at a given trait locus
5
2. Human Genome
6
DNA structure
A DNA molecule is a linear backbone of alternating sugar residues and phosphate groups
Attached to carbon atom 1 of each sugar is a nitrogenous base: A, C, G or T
Two DNA molecules are held together in anti-parallel fashion by hydrogen bonds between bases (Watson-Crick rules): antiparallel double helix
A gene is a segment of DNA which is transcribed to give a protein or RNA product
Only one strand is read during gene transcription
Nucleotide = 1 phosphate group + 1 sugar + 1 base
7
DNA polymorphisms
RFLPs
Minisatellites
Microsatellites: >100,000. Many alleles, (CA)n, very informative, evenly distributed, easily automated
SNPs: 10,054,521 (25 Jan 05). Most with 2 alleles (up to 4), not very informative, evenly distributed, easily automated
8
DNA organization
[Figure: mitosis, illustrated with loci A and B on chr1. Haploid gametes (22 + 1 chromosomes each) form a one-cell diploid zygote (2 × (22 + 1) chromosomes), which passes through G1, S and M phases to give a diploid zygote of more than one cell.]
9
DNA recombination
[Figure: meiosis in a diploid gamete precursor cell (2 × (22 + 1) chromosomes), illustrated with loci A and B on chr1. The haploid gamete precursors and the haploid gametes (22 + 1 chromosomes) are labelled recombinant (R) or non-recombinant (NR) for A and B, with the recombination fraction θ indicated.]
10
DNA recombination between linked loci
[Figure: meiosis in a diploid gamete precursor (2 × (22 + 1) chromosomes) when loci A and B are linked. All haploid gamete precursors and haploid gametes shown (22 + 1 chromosomes) are non-recombinant (NR) for A and B.]
11
Human Genome - summary
DNA is a linear sequence of nucleotides partitioned into 23 chromosomes
Two copies of each chromosome (2 × 22 autosomes + XY), from paternal and maternal origins. During meiosis in gamete precursors, recombination can occur between maternal and paternal homologs
Recombination fraction between loci A and B (θ): proportion of gametes produced that are recombinant for A and B
If A and B are very far apart: 50% R, 50% NR, so θ = 0.5
If A and B are very close together: <50% R, so 0 ≤ θ < 0.5
Recombination fraction (θ) can be converted to genetic distance (cM)
Haldane: e.g. θ = 0.17, cM = 20.8
Kosambi: e.g. θ = 0.17, cM = 17.7
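The two conversions can be checked numerically. The sketch below assumes the standard textbook forms of the Haldane and Kosambi map functions (the formulas themselves did not survive in this transcript); the function names are illustrative.

```python
import math

def haldane_cm(theta: float) -> float:
    """Haldane map distance in cM: d = -50 * ln(1 - 2*theta)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * theta)

def kosambi_cm(theta: float) -> float:
    """Kosambi map distance in cM: d = 25 * ln((1 + 2*theta) / (1 - 2*theta))."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))

# The slide's example: theta = 0.17
print(round(haldane_cm(0.17), 1), round(kosambi_cm(0.17), 1))
```

For θ = 0.17 these give about 20.8 cM (Haldane) and 17.7 cM (Kosambi), matching the slide.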
12
3. Principles of Linkage Analysis
13
Linkage Analysis requires genetic markers
[Figure: a trait locus Q shown at three example positions along a chromosome carrying markers M1, M2, ..., Mn, with the recombination fraction θ between Q and each marker (values up to 0.5).]
14
Linkage Analysis: Parametric vs. Nonparametric
[Figure: diagram with labels Gene, Chromosome, Recombination (between marker M and trait locus Q), Mode of inheritance, Correlation, genetic factors (Q, A, D), environmental factors (C, E) and phenotype (Phe).]
Adapted from Weiss & Terwilliger 2000
15
4. Parametric Linkage Analysis
16
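Linkage with informative, phase-known meiosis
Marker alleles M1..6 and trait alleles Q1,2 on the same chromosome, separated by recombination fraction θ
Autosomal dominant, Q1 predisposing allele
Grandparents: M2M5Q2Q2 and M1M6Q1Q?
Parents: M1M2Q1Q2 (informative, phase known: M1Q1/M2Q2) and M3M4Q2Q2
Offspring: M1Q1/M3Q2, M2Q2/M3Q2, M1Q1/M4Q2, M1Q1/M4Q2, M2Q2/M4Q2, M2Q1/M3Q2
Gametes from the informative parent: NR = M1Q1, M2Q2; R = M1Q2, M2Q1
θMQ = 1/6 = 0.17 (20.8 cM)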
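The counting step behind the 1/6 estimate can be written out as a small sketch. It assumes the six transmitted haplotypes are read off the offspring genotypes listed on this slide and that the informative parent's phase is M1Q1/M2Q2; the helper names are illustrative.

```python
from collections import Counter

# Haplotypes transmitted by the informative parent, one per offspring
# (read from the six offspring genotypes on the slide).
transmitted = ["M1Q1", "M2Q2", "M1Q1", "M1Q1", "M2Q2", "M2Q1"]

def classify(haplotype, phase):
    """A gamete is non-recombinant (NR) if it matches one parental haplotype."""
    return "NR" if haplotype in phase else "R"

def estimate_theta(gametes, phase):
    """Return (proportion of recombinant gametes, counts of R and NR)."""
    counts = Counter(classify(g, phase) for g in gametes)
    return counts["R"] / len(gametes), counts

theta_hat, counts = estimate_theta(transmitted, ("M1Q1", "M2Q2"))
print(counts["NR"], counts["R"], round(theta_hat, 2))
```

With a known phase, the estimate of θ is simply the proportion of recombinant gametes: here 1 R out of 6.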
17
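Linkage with informative, phase-unknown meiosis
Grandparents: Q2Q2 and Q1Q?
Parents: M1M2Q1Q2 (informative, phase unknown) and M3M4Q2Q2
Offspring: M1Q1/M3Q2, M2Q2/M3Q2, M1Q1/M4Q2, M1Q1/M4Q2, M2Q2/M4Q2, M2Q1/M3Q2
Possible phases of the informative parent: M1Q1/M2Q2 or M1Q2/M2Q1
P = probability of the gamete class, N = number of gametes observed

Gamete   Phase M1Q1/M2Q2      Phase M1Q2/M2Q1
         Class  P     N       Class  P     N
M1Q1     NR     1-θ   3       R      θ     3
M2Q2     NR     1-θ   2       R      θ     2
M1Q2     R      θ     0       NR     1-θ   0
M2Q1     R      θ     1       NR     1-θ   1
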
18
Parametric LOD score calculation
Overall LOD score for a given θ is the sum of all family LOD scores at θ
e.g. LOD = 3 for θ = 0.28
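As a rough illustration of summing family LOD scores, the sketch below assumes the usual definition LOD(θ) = log10[L(θ) / L(θ = 0.5)], with L(θ) = θ^R (1-θ)^NR when phase is known and the average of the two phase-specific likelihoods when it is not. The slide's own formula did not survive in this transcript, so this form, the function names and the two toy families are assumptions.

```python
import math

def lod_phase_known(n_rec, n_nonrec, theta):
    """LOD for a phase-known family: L(theta) = theta^R * (1 - theta)^NR."""
    n = n_rec + n_nonrec
    likelihood = theta ** n_rec * (1.0 - theta) ** n_nonrec
    return math.log10(likelihood / 0.5 ** n)

def lod_phase_unknown(n_a, n_b, theta):
    """LOD for a phase-unknown family: average the likelihood over both phases.

    n_a gametes are recombinant under phase 1 (non-recombinant under phase 2);
    n_b gametes are non-recombinant under phase 1 (recombinant under phase 2).
    """
    n = n_a + n_b
    phase1 = theta ** n_a * (1.0 - theta) ** n_b
    phase2 = theta ** n_b * (1.0 - theta) ** n_a
    return math.log10(0.5 * (phase1 + phase2) / 0.5 ** n)

def total_lod(theta):
    """Overall LOD: sum over two toy families, each with 1 R and 5 NR gametes."""
    return lod_phase_known(1, 5, theta) + lod_phase_unknown(1, 5, theta)

thetas = [i / 100 for i in range(1, 51)]  # grid over 0 < theta <= 0.5
best_theta = max(thetas, key=total_lod)
print(best_theta, round(total_lod(best_theta), 2))
```

The same gamete counts carry less information when phase is unknown: at θ = 1/6 the phase-known family contributes a LOD of about 0.63 and the phase-unknown family about 0.33.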
19
Parametric Linkage Analysis - summary
[Figure: trait locus Q and markers M1, M2, ..., Mn, with the recombination fraction θ between Q and each marker (values between 0.1 and 0.5).]
For each marker, estimate the θ that yields the highest LOD score across all families
This θ (and the LOD) will depend upon the mode of inheritance (MOI) assumed: the MOI determines the genotype at the trait locus Q and thus the number of meioses that are recombinant or nonrecombinant. Limited to Mendelian diseases.
Markers with a significant parametric LOD score (>3) are said to be linked to the trait locus with recombination fraction θ
20
Practical
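1. Identify informative individual(s)
2. Reconstruct possible phase(s)
3. Classify gametes as R or NR
4. Count R and NR gametes
5. Express
6. Express LOD score
Grandparents: Q1Q1 and Q2Q?
Parents: M1M2Q1Q1 and M3M4Q1Q2
Offspring: M2M3Q1Q1, M1M4Q1Q2, M1M4Q1Q1, M2M4Q1Q2
Possible phases of the informative parent: M3Q1/M4Q2 or M3Q2/M4Q1
P = probability of the gamete class, N = number of gametes observed

Gamete   Phase M3Q1/M4Q2      Phase M3Q2/M4Q1
         Class  P     N       Class  P     N
M3Q1     NR     1-θ   1       R      θ     1
M4Q2     NR     1-θ   2       R      θ     2
M3Q2     R      θ     0       NR     1-θ   0
M4Q1     R      θ     1       NR     1-θ   1
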
21
Practical II
Talk example
Practical example
Graph each
22
Outline
1. Aim
2. The Human Genome
3. Principles of Linkage Analysis
4. Parametric Linkage Analysis
5. Nonparametric Linkage Analysis
23
5. Nonparametric Linkage Analysis
24
Approach
Parametric: genotype at marker locus and genotype at trait locus (latter inferred from phenotype according to a specific disease model). Parameter of interest: θ between marker and trait loci
Nonparametric: genotype at marker locus and phenotype. If a trait locus truly regulates the expression of a phenotype, then two relatives with similar phenotypes should have similar genotypes at a marker in the vicinity of the trait locus, and vice-versa. Interest: correlation between phenotypic similarity and marker genotypic similarity
No need to specify mode of inheritance, allele frequencies, etc...
25
Phenotypic similarity between relatives
Squared trait differences
Squared trait sums
Trait cross-product
Trait variance-covariance matrix
Affection concordance
[Figure: plot with axes T1 and T2.]
26
Genotypic similarity between relatives
IBS: alleles shared Identical By State look the same and may have the same DNA sequence, but they are not necessarily derived from a known common ancestor
IBD: alleles shared Identical By Descent are a copy of the same ancestral allele
[Figure: sib-pair pedigree with parents M1M2 (Q1Q2) and M3M3 (Q3Q4) and sibs M1M3 (Q1Q3) and M1M3 (Q1Q4): IBS = 2, IBD = 1. The inheritance vector (M) is also shown.]
27
Genotypic similarity between relatives
[Figure: three sib pairs, each with its number of alleles IBD, proportion of alleles IBD and inheritance vector (M).]

Sib 1          Sib 2          Number IBD   Proportion IBD
M2M3 (Q2Q4)    M1M3 (Q1Q3)    0            0
M1M3 (Q1Q3)    M1M3 (Q1Q4)    1            0.5
M1M3 (Q1Q3)    M1M3 (Q1Q3)    2            1
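The difference between IBS and IBD sharing can be made concrete with a short sketch. It assumes, as an interpretation of the figure, that the labels Q1 to Q4 identify the four distinct parental alleles while M1 to M3 are the observed allele states; the data layout and function names are illustrative.

```python
from collections import Counter

def shared(a, b):
    """Number of items two genotypes have in common (multiset intersection)."""
    return sum((Counter(a) & Counter(b)).values())

def ibs_ibd(sib1, sib2):
    """Each genotype is a pair of (allele_state, parental_allele_label) tuples.

    Returns (alleles IBS, alleles IBD, proportion of alleles IBD).
    """
    ibs = shared([state for state, _ in sib1], [state for state, _ in sib2])
    ibd = shared([label for _, label in sib1], [label for _, label in sib2])
    return ibs, ibd, ibd / 2

# Sib pairs as read from the slide (parents M1M2 / Q1Q2 and M3M3 / Q3Q4).
pair_a = ([("M2", "Q2"), ("M3", "Q4")], [("M1", "Q1"), ("M3", "Q3")])
pair_b = ([("M1", "Q1"), ("M3", "Q3")], [("M1", "Q1"), ("M3", "Q4")])
pair_c = ([("M1", "Q1"), ("M3", "Q3")], [("M1", "Q1"), ("M3", "Q3")])

for sib1, sib2 in (pair_a, pair_b, pair_c):
    print(ibs_ibd(sib1, sib2))
```

The second pair shares two alleles by state (M1 and M3) but only one by descent, because the two M3 alleles are different copies carried by the M3M3 parent.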
28
Genotypic similarity between relatives
2^(2n)
29
Statistics that incorporate both phenotypic and genotypic similarities
[Figure: phenotypic similarity plotted against genotypic similarity (0, 0.5, 1).]
30
Haseman-Elston regression: quantitative traits
[Figure: phenotypic dissimilarity plotted against genotypic similarity (0, 0.5, 1), with a regression line labelled b and c.]
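A minimal sketch of the original Haseman-Elston idea: regress the squared trait difference of each sib pair (phenotypic dissimilarity) on the pair's proportion of alleles IBD at the marker, where a negative slope is the pattern expected under linkage. The data below are made-up toy values for illustration only, and the function name is hypothetical.

```python
def haseman_elston(pairs):
    """Ordinary least squares of squared trait difference on proportion IBD.

    pairs: list of (trait_sib1, trait_sib2, proportion_ibd).
    Returns (slope, intercept).
    """
    x = [pi_hat for _, _, pi_hat in pairs]
    y = [(t1 - t2) ** 2 for t1, t2, _ in pairs]
    n = len(pairs)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy sib pairs (invented values): pairs sharing more alleles IBD are more alike.
toy_pairs = [
    (1.0, 3.0, 0.0), (2.0, 0.0, 0.0),
    (1.0, 2.0, 0.5), (3.0, 2.0, 0.5),
    (2.0, 2.5, 1.0), (1.5, 1.5, 1.0),
]
slope, intercept = haseman_elston(toy_pairs)
print(slope, intercept)
```

In practice the slope would be tested against zero (one-sided, for a negative slope); that step is omitted here.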
31
VC ML method: quantitative and categorical traits
[Figure: comparison of H1 and H0, with axis values 0, 0.5 and 1; e.g. LOD = 3.]
32
Genome-wide linkage analysis (e.g. VC)
Individual LOD scores can be expressed as P values (pointwise)

LOD    Chi-sq (n df)    P value
2.1    9.67             0.0009

(Chi-sq = LOD × 4.6)
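The conversion can be sketched in a few lines. The factor 4.6 is 2 ln(10); the P value below assumes a chi-square with 1 df and a one-sided test (the tail probability is halved), an assumption that reproduces the slide's 2.1, 9.67, 0.0009 example. Function names are illustrative.

```python
import math

def lod_to_chisq(lod):
    """Chi-square statistic from a LOD score: 2 * ln(10) * LOD (about 4.6 * LOD)."""
    return 2.0 * math.log(10.0) * lod

def chisq1_upper_tail(x):
    """P(X > x) for a chi-square variable with 1 df: erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

def lod_to_pointwise_p(lod):
    """One-sided pointwise P value, assuming 1 df (assumption, see lead-in)."""
    return 0.5 * chisq1_upper_tail(lod_to_chisq(lod))

print(round(lod_to_chisq(2.1), 2), round(lod_to_pointwise_p(2.1), 4))
```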
33
Statistics for selected samples
Mean IBD sharing statistics (Risch & Zhang 1995, 1996)
[Figure: plot with axes T1 and T2; for each of two selected groups of pairs, the mean IBD sharing under H0 (no linkage) and under H1 (linkage) is indicated.]
Other Linkage statistics
Dependent variable: phenotypes. Independent variable: genotypic similarity (IBD)
Extensions to Haseman-Elston (Wright 1997, Drigalenko 1998, Elston et al. 2000, Forrest 2001, Visscher & Hopper 2001, Xu et al. 2000, Sham & Purcell 2001)
VC ML with mixture distribution (Eaves et al. 1996)
Dependent variable: genotypic similarity (IBD). Independent variable: phenotypes
Pedwide-regression analysis (reverse HE) (Sham et al. 2002)
Reverse VC ML (Sham et al. 2000)
Statistics for affection traits
Based on IBD scoring functions, e.g. Sall (Whittemore & Halpern 1994, Kong & Cox 1997)
Mixed statistic (Forrest & Feingold 2000)
35
Nonparametric Linkage Analysis - summary
No need to specify mode of inheritance
Models phenotypic and genotypic similarity of
relatives
Expression of phenotypic similarity, calculation
of IBD
HE and VC are the most popular statistics used
for linkage of quantitative traits
Other statistics available, especially for affection traits
Type I error? Power?
36
Type I error
[Figure: LOD score with threshold k, showing type I error and true positive regions.]
Theoretical (Lander & Kruglyak 1995)
Empirical
37
Theoretical genome-wide thresholds
Genome-wide threshold for significant linkage: LOD score that occurs by chance alone on average once per 20 scans. LOD = 3.6, Chi-sq = 16.7, pointwise P = 0.000022
Genome-wide threshold for suggestive linkage: LOD score that occurs by chance alone on average once per scan. LOD = 2.2, Chi-sq = 10.1, pointwise P = 0.00074
38
Empirical genome-wide thresholds
Genome-wide threshold for significant linkage: LOD score that occurs by chance alone on average once per 20 scans
Genome-wide threshold for suggestive linkage: LOD score that occurs by chance alone on average once per scan
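One common way to obtain an empirical threshold for significant linkage is to run many genome scans on data simulated under no linkage, record the highest LOD of each scan, and take the value that is exceeded in about 1 scan out of 20. The sketch below covers only that percentile step and approximates "once per 20 scans" by the 95th percentile of the null maxima; the function name is hypothetical and the null maxima are random placeholder numbers, not real simulation results. The suggestive threshold (once per scan on average) needs a count of peaks per scan and is not handled here.

```python
import random

def empirical_threshold(null_max_lods, scans_per_false_positive=20):
    """LOD exceeded by the genome-wide maximum in at most about
    1 of every `scans_per_false_positive` null scans."""
    if scans_per_false_positive < 2:
        raise ValueError("use at least 2 scans per false positive")
    ordered = sorted(null_max_lods)
    n = len(ordered)
    # Leave roughly n / scans_per_false_positive null maxima above the cut-off.
    index = min(n - 1, n - n // scans_per_false_positive)
    return ordered[index]

# Placeholder null distribution: in practice each value would be the maximum
# LOD of one genome scan simulated or permuted under no linkage.
random.seed(1)
null_max_lods = [max(random.expovariate(2.0) for _ in range(20)) for _ in range(1000)]

threshold = empirical_threshold(null_max_lods)
exceed_rate = sum(m > threshold for m in null_max_lods) / len(null_max_lods)
print(round(threshold, 2), exceed_rate)
```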