Linkage analysis - PowerPoint PPT Presentation

Title:

Linkage analysis

Slides: 57

Provided by: pet5193

Transcript and Presenter's Notes

1
Linkage analysis
6

Jan Hellemans

2
Finding causal mutations

2 opposing strategies
sequence then select
select then sequence
Sequencing
traditional Sanger sequencing: only possible after selection
massively parallel sequencing: possible prior to or after selection
RNA sequencing
exome sequencing
genome sequencing

3
Finding causal mutations

Selection
positional (prior to sequencing)
linkage analysis
GWAS
structural variations (e.g. microdeletions)
functional (prior to or after sequencing)
candidate genes selected based on known function
or involvement in related disorders
filtering of variants based on functional
predictions
overlap (after sequencing)
looking for genes / variants that occur in
multiple independent patients
mostly a combination is used

4
exome sequencing
5
Aims

Interpret microsatellite results
Add genotypes to pedigrees
Create pedigree and genotype files
Calculate and interpret LOD-scores
Delineate linkage intervals
Basic principles of linkage analysis
Analyze other types of markers
Association studies
Learn how to work with specific pedigree programs

6
Starting linkage analysis
7
Preparations

Clearly define the phenotype
If it is not specific enough, you may be analyzing different disorders that map to different genomic loci
LOD scores are additive
Find suitable families
larger is better
more patients is better
Collect genomic DNA from as many family members as possible
Determine the type of inheritance
Calculate the power to prove linkage with the available material (SLink, not part of this course)

8
Linkage analysis types

Directed linkage analysis
Evaluate linkage at a specific locus such as a candidate gene
Common approach: evaluate an intragenic, a 5' and a 3' marker (often microsatellites)
Genome wide linkage analysis
Screen for linkage with markers spread across the entire genome
Microsatellites: 400 markers spaced at about 10cM
SNPs: 500k SNP array
Homozygosity mapping
Screen only affected individuals in inbred
families
Select homozygous markers (typically SNP markers)
Very efficient technology
Fine mapping
Some linked markers are known, but the borders of
the linkage interval still need to be defined

9
Exercise Part 1

2 inbred families with a recessive disorder
Homozygosity mapping based on 500k SNP arrays identified 2 candidate regions

Chromosome 4
Patient 1 homozygous for
6.052Mb - 14.488Mb
21.008Mb - 37.477Mb
Patient 2 homozygous for
11.186Mb - 37.219Mb
Task: find microsatellite markers to confirm linkage
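The two candidate regions follow from intersecting the homozygous stretches of both patients. A minimal sketch of that step, assuming both families are linked to the same locus (the helper name `intersect` is made up for illustration):

```python
def intersect(regions_a, regions_b):
    """Return the overlaps between two lists of (start, end) intervals in Mb."""
    overlaps = []
    for a_start, a_end in regions_a:
        for b_start, b_end in regions_b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # keep only real overlaps
                overlaps.append((start, end))
    return sorted(overlaps)


# Homozygous stretches on chromosome 4, taken from the slide
patient_1 = [(6.052, 14.488), (21.008, 37.477)]
patient_2 = [(11.186, 37.219)]

shared = intersect(patient_1, patient_2)
print(shared)  # [(11.186, 14.488), (21.008, 37.219)]
```

Markers just outside a shared stretch can still be useful as flanking markers when delineating the interval.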

10
Find additional flanking markers

Find physical position of marker in NCBI > UniSTS
NCBI map viewer: http://www.ncbi.nlm.nih.gov/mapview/
Go to Homo sapiens and to the right chromosome
Maps & options: show
DeCode, Généthon and Marshfield (genetic maps)
Genes
Set region: e.g. 2Mb up- and downstream of your marker
Click "Data as table view"
Click on STS behind a marker to see its details
Select markers that
locate to only 1 genomic location
have a PCR product with an extended size range (one size → not polymorphic)

11
http://www.ncbi.nlm.nih.gov/projects/mapview
12
http://www.ncbi.nlm.nih.gov/projects/mapview
13
http://www.ncbi.nlm.nih.gov/projects/mapview
14
Exercise Part 1 > possible solution

Markers in 1st candidate region
D4S3017 (21.078Mb)
D4S3044 (25.189Mb)
D4S1618 (33.857Mb)
D4S3350 (33.857Mb)
D4S2988 (36.889Mb)
Markers in 2nd candidate region
D4S1582 (10.311Mb)
D4S2906 (12.321Mb)
D4S2944 (13.141Mb)
D4S1602 (14.059Mb)
D4S2960 (15.437Mb)
→ Order primers and analyze them on all family members

15
Analyzing microsatellite data
16
Microsatellites > basics

Repeats of short sequences (e.g. 2bp): NNNNAC(AC)nACNNNN
Number of repeats is variable (unstable sequence)
Number of repeats determines the allele
Number of repeats corresponds to specific length of PCR product
allele 1: NNNNACACACACACNNNN (5 AC → 18bp)
allele 2: NNNNACACACACACACNNNN (6 AC → 20bp)
allele 3: NNNNACACACACACACACNNNN (7 AC → 22bp)
...
Determine length to know the allele (sequencer)
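The relation between product length and repeat number in the example above is simple arithmetic: length = flanking sequence + repeats × repeat length. A small sketch, assuming the 8bp of flanking sequence (NNNN on each side) and the 2bp repeat of the slide example; the function name is made up for illustration:

```python
def repeat_count(product_bp, flank_bp=8, repeat_bp=2):
    """Return the number of repeats for a PCR product of the given length.

    flank_bp is the total non-repeat sequence (4 + 4 N's in the slide example).
    """
    repeat_part = product_bp - flank_bp
    if repeat_part < 0 or repeat_part % repeat_bp != 0:
        raise ValueError("length does not fit a whole number of repeats")
    return repeat_part // repeat_bp


for length in (18, 20, 22):
    print(length, "bp ->", repeat_count(length), "AC repeats")
```

For real markers the flanking length depends on the primer positions, so in practice alleles are usually just numbered by size.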

17
Microsatellites > basics
18
Microsatellites > determine size

Use internal size standard (other color)

230bp
220bp
225bp
19
Microsatellites > heterozygotes
230bp
220bp
225bp
223bp
20
Microsatellites > stutter peaks

Repeats are difficult to copy → polymerase slips
Some amplicons have 1 repeat less; a few even lose multiple repeats
Small repeats are more prone to slippage and show more pronounced stutter peaks
Largest product is the correct one
Distance between peaks = length of a repeat

21
Microsatellites > stutter peaks
allelic peak
1st stutter peak
2nd stutter peak
22
Microsatellites > stutter peaks

Allelic peaks are the highest
Stutter peaks are lower

A1
A2
23
Microsatellites > stutter peaks
A1
A2
24
Microsatellites > +A peaks

Taq polymerase tends to add an extra A at the 3' end
Variable proportion of products with or without this extra A
Do not confuse with stutter peaks (only 1bp difference)

allelic peak
allelic peak + A
1st stutter peak
1st stutter peak + A
2nd stutter peak
2nd stutter peak + A
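The peak labels above can be derived from the distance to the allelic peak: stutter peaks sit a whole number of repeats below it, and each of them can have a companion 1bp higher. A rough sketch, assuming a 2bp repeat, a sizing tolerance of 0.4bp, and that the peak without the extra A is taken as reference (all assumptions for illustration, as is the function name):

```python
def label_peak(size_bp, allele_bp, repeat_bp=2, tolerance_bp=0.4):
    """Label a peak by its offset from the allelic peak (without extra A)."""
    ordinals = {1: "1st", 2: "2nd", 3: "3rd"}
    offset = size_bp - allele_bp
    for repeats_lost in range(0, 4):
        for extra_a in (0, 1):
            expected = -repeats_lost * repeat_bp + extra_a
            if abs(offset - expected) <= tolerance_bp:
                name = ("allelic peak" if repeats_lost == 0
                        else f"{ordinals[repeats_lost]} stutter peak")
                return name + (" + A" if extra_a else "")
    return "unexplained"


for size in (226.0, 225.0, 224.0, 223.0, 221.1):
    print(size, "->", label_peak(size, allele_bp=225.0))
```

This only names peaks once the allelic peak is known; deciding which peak is allelic still relies on the height and position rules from the previous slides.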
25
Microsatellites > complex plots (stutter + A)
A1
A2
26
Microsatellites > multiplex

Combine multiple markers in a single analysis
Different size range
Multicolor
Commercial kits e.g. 16 markers / lane

27
Microsatellite plots examples
28
(No Transcript)
29
(No Transcript)
30
(No Transcript)
31
(No Transcript)
32
(No Transcript)
33
(No Transcript)
34
(No Transcript)
35
(No Transcript)
36
Genotyping pedigrees
37
Genotyping pedigrees

Screen one or multiple markers for some or all family members
For every marker
Make a list of all occurring allele sizes
Due to technical variation in sizing, the same allele can have a slightly different size in different measurements (-0.4bp to +0.4bp). Give all alleles within this range the same allele number
Add the allele numbers to the pedigree at the corresponding individual/marker combination
Find the right phase
Advanced software like GeneMapper can generate tables with allele numbers for every sample / marker
Advanced pedigree programs like Progeny can store genotype information for family members
Verify inheritance
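Turning measured sizes into allele numbers can be sketched as a simple binning step. This is an illustration only, not how GeneMapper does it: sizes are sorted and a new allele number starts whenever a size lies more than 0.8bp (twice the 0.4bp tolerance, an assumption) above the first size of the current bin. The measured sizes are invented.

```python
def assign_allele_numbers(sizes_bp, tolerance_bp=0.4):
    """Map each measured size to an allele number (1, 2, 3, ...)."""
    allele_numbers = {}
    bin_starts = []  # smallest size seen in each bin
    for size in sorted(set(sizes_bp)):
        if not bin_starts or size - bin_starts[-1] > 2 * tolerance_bp:
            bin_starts.append(size)  # open a new allele bin
        allele_numbers[size] = len(bin_starts)
    return allele_numbers


# Hypothetical sizes measured for one marker across a family
measured = [220.1, 219.8, 222.0, 224.2, 221.9]
numbers = assign_allele_numbers(measured)
print(numbers)  # {219.8: 1, 220.1: 1, 221.9: 2, 222.0: 2, 224.2: 3}
```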

38
Exercise Part 2

Genotype 3 markers in all available individuals of 2 families
Pedigrees + microsatellite plots in ExercisePart2-GenotypingData.pdf
Add allele numbers for the 3 markers to the pedigree
Interpret the genotyped pedigrees: linked?

39
Family 1
40
Family 2
41
Exercise Part 2 > Conclusions

D4S1582
Mendelian error → cannot be interpreted
D4S2944
Linked
D4S3017
Not linked: unaffected individuals with the same genotype as a patient
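A Mendelian error means a child carries an allele combination that cannot come from one paternal and one maternal allele. A minimal check for a single parent-child trio could look as follows; genotypes are pairs of allele numbers with 0 for unknown, and treating unknowns as "cannot be checked" is an assumption of this sketch. Dedicated tools such as PedCheck do far more than this.

```python
def is_mendelian_consistent(child, father, mother):
    """Return True if the child's two alleles can come from father and mother."""
    if 0 in child + father + mother:
        return True  # unknown genotype: nothing to contradict
    first, second = child
    return ((first in father and second in mother)
            or (second in father and first in mother))


print(is_mendelian_consistent((1, 2), (1, 3), (2, 4)))  # True
print(is_mendelian_consistent((1, 1), (1, 2), (2, 3)))  # False
```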

42
Calculate LOD scores
43
EasyLinkage

EasyLinkage: UI for linkage analysis
http://genetik.charite.de/hoffmann/easyLINKAGE/index.html#start
Bioinformatics. 2005 Feb 1;21(3):405-7. PMID: 15347576
Bioinformatics. 2005 Sep 1;21(17):3565-7. PMID: 16014370
Interface for many linkage analysis programs
Input
Pedigree file (linkage format)
Genotype file(s)
Marker information (already provided for popular
markers)
Settings

44
Pedigree file

Naming requirements for EasyLinkage: p_xxx.pro → e.g. p_SMMD.pro
Format
Tab delimited text file
1 individual per row
Columns
1 → family ID
2 → person ID
3 → father ID
4 → mother ID
5 → sex (1 = male, 2 = female, 0 = unknown)
6 → affection status (1 = unaffected, 2 = affected, 0 = unknown)
7 → DNA availability (optional, relevant for power calculations)
8 → liability class (to be provided if multiple liability classes are used)
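As an illustration of this layout, the sketch below writes the first six columns for an invented nuclear family. The family, the IDs and the file name `p_DEMO.pro` are hypothetical; founders get 0 as father and mother ID, as is usual in linkage-format pedigree files.

```python
import csv
import io

# (family ID, person ID, father ID, mother ID, sex, affection status)
pedigree = [
    ("F1", "1", "0", "0", 1, 1),  # father, unaffected founder
    ("F1", "2", "0", "0", 2, 1),  # mother, unaffected founder
    ("F1", "3", "1", "2", 1, 2),  # affected son
    ("F1", "4", "1", "2", 2, 1),  # unaffected daughter
]


def pedigree_to_text(rows):
    """Return the rows as tab-delimited text, one individual per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


pedigree_text = pedigree_to_text(pedigree)
with open("p_DEMO.pro", "w") as handle:  # follows the p_xxx.pro naming rule
    handle.write(pedigree_text)
print(pedigree_text)
```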

45
Genotype files

Person IDs have to match exactly with those provided in the pedigree file
Naming requirements for EasyLinkage: MarkerName_xxx.abi → e.g. D1S1609_SMMD.abi
Format
Tab delimited text file
1 individual per row
Columns (for microsatellite based analysis)
1 → marker (same as in file name and matching a marker in an available marker set)
2 → custom information (content doesn't matter, but column must be present)
3 → individual ID (match person ID in pedigree file)
4 + 5 → genotypes for 2 alleles (unknown = 0)
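A genotype file in this layout can be produced along the same lines. The sketch below uses invented genotypes and a hypothetical project suffix `DEMO`; the marker name D4S2944 is taken from the exercise, and the content of the custom column ("x") is arbitrary.

```python
import csv
import io

marker = "D4S2944"
project = "DEMO"  # hypothetical project suffix

# person ID -> (allele 1, allele 2); 0 = unknown
genotypes = {"1": (1, 2), "2": (2, 3), "3": (2, 2), "4": (0, 0)}


def genotypes_to_text(marker_name, calls):
    """Return tab-delimited rows: marker, custom info, person ID, allele 1, allele 2."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for person_id, (allele_1, allele_2) in calls.items():
        writer.writerow([marker_name, "x", person_id, allele_1, allele_2])
    return buffer.getvalue()


genotype_text = genotypes_to_text(marker, genotypes)
file_name = f"{marker}_{project}.abi"  # follows the MarkerName_xxx.abi naming rule
with open(file_name, "w") as handle:
    handle.write(genotype_text)
print(file_name)
print(genotype_text)
```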

46
Marker information

Contains information on the chromosome and
position of every marker
Already available for a number of commercial
SNP-arrays and for the microsatellite markers
from
Genethon
Marshfield
DeCode
Custom marker sets can be created (see manual)

47
EasyLinkage settings

Choose a program
FastLink → Parametric, single-point
SuperLink → Parametric, single-/multipoint
SPLink → Nonparametric, single-point
Genehunter → Nonpara-/parametric, single-/multipoint
Genehunter Plus → Nonpara-/parametric, single-/multipoint
Genehunter MOD → Nonpara-/parametric, single-/multipoint
Genehunter Imprinting → Nonpara-/parametric, single-/multipoint
GeneHunter TwoLocus → Parametric, two-locus, single-/multipoint
Merlin → Nonpara-/parametric, single-/multipoint
SimWalk → Nonparametric, single-/multipoint
Allegro → Nonpara-/parametric, single-/multipoint simulation, single-/multi-point
PedCheck → Mendelian error check
FastSLink → Simulation, single-/multi-point

48
EasyLinkage settings

Parametric <-> non-parametric
Single point <-> multipoint
Frequency of the disease allele
Penetrance vectors (wt/wt, wt/mt, mt/mt)
Standard dominant: 0 1 1
Standard recessive: 0 0 1
Reduced penetrance: replace 1 by the penetrance (e.g. 0.9)
Phenocopy: replace 0 by the phenocopy rate (e.g. 0.1)
Example: 0.01 0.9 0.99
1% chance to show a similar phenotype despite a normal genotype
90% chance to show the phenotype with 1 mutant allele (dominant with incomplete penetrance)
99% likelihood to present with the phenotype if both alleles are mutant
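The replacement rules above can be written down as a small helper. This is a sketch with a made-up function name; it uses one penetrance value for all genotypes that are expected to be affected, so it cannot reproduce a vector such as 0.01 0.9 0.99, where heterozygotes and homozygotes differ.

```python
def penetrance_vector(mode, penetrance=1.0, phenocopy=0.0):
    """Return (wt/wt, wt/mt, mt/mt) penetrances for a simple disease model."""
    if mode == "dominant":
        return (phenocopy, penetrance, penetrance)
    if mode == "recessive":
        return (phenocopy, phenocopy, penetrance)
    raise ValueError("mode must be 'dominant' or 'recessive'")


print(penetrance_vector("dominant"))              # (0.0, 1.0, 1.0)
print(penetrance_vector("recessive"))             # (0.0, 0.0, 1.0)
print(penetrance_vector("dominant", 0.9, 0.01))   # (0.01, 0.9, 0.9)
```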

49
Evaluate calculated LOD-scores

Maximum LOD-scores can be seen in EasyLinkage
Details about LOD-scores at different recombination fractions can be found in text files generated by EasyLinkage → process in Excel (generate graphs, ...)
Standard rules for LOD-scores
> 3 → significant linkage
2 < LOD < 3 → suggestive linkage
-2 < LOD < 2 → uninformative
< -2 → significant absence of linkage
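To make these thresholds concrete, the sketch below computes a two-point LOD score for the simplest textbook case, phase-known and fully informative meioses, where LOD(θ) = log10[(1-θ)^(n-k) · θ^k / 0.5^n] for k recombinants in n meioses. Programs such as SuperLink evaluate full pedigree likelihoods instead, so this is only an illustration. The function names, the invented family data and the handling of scores exactly on a threshold are assumptions.

```python
import math


def two_point_lod(n_meioses, n_recombinants, theta):
    """LOD score at recombination fraction theta (0 <= theta <= 0.5)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must be between 0 and 0.5")
    n, k = n_meioses, n_recombinants
    if theta == 0.0:
        # any recombinant excludes theta = 0
        return float("-inf") if k > 0 else n * math.log10(2.0)
    return ((n - k) * math.log10(1.0 - theta)
            + k * math.log10(theta)
            + n * math.log10(2.0))


def interpret_lod(lod):
    """Apply the standard rules; exact boundary values fall in the lower class."""
    if lod > 3:
        return "significant linkage"
    if lod > 2:
        return "suggestive linkage"
    if lod > -2:
        return "uninformative"
    return "significant absence of linkage"


# LOD scores are additive: sum two hypothetical families at the same theta
family_lods = [two_point_lod(6, 0, 0.0), two_point_lod(5, 0, 0.0)]
total = sum(family_lods)
print(round(total, 2), "->", interpret_lod(total))
```

Neither hypothetical family reaches 3 on its own (about 1.81 and 1.51), which shows why combining families matters.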

50
Interpreting LOD plots
51
Exercise Part 3

Generate one pedigree file containing all family
members of both families (use Global IDs)
Generate a genotype file for each of the tested
markers
Run SuperLink analysis with the right settings
Evaluate results

52
Exercise Part 3 > Results
53
Strengthen the evidence