Title: Computational Genetics Lecture 1
1Computational Genetics, Lecture 1
Background Readings: Chapter 23 of An
Introduction to Genetic Analysis, Griffiths et al. 2000,
Seventh Edition (CS/Fishbach/other libraries).
This class has been edited from several sources,
primarily from Terry Speed's homepage at Stanford
and the Technion course Introduction to
Genetics. Changes made by Dan Geiger.
5Course Goals
- Learning about computational and mathematical
methods for genetic analysis.
- We will focus on gene hunting: finding genes for
simple human diseases.
- Methods covered in depth: linkage analysis (using
pedigree data), association analysis (using
random samples).
- Another goal is to learn more about the use of
Bayesian networks for genetic linkage analysis.
6Human Genome
- Most human cells contain
- 46 chromosomes
- 2 sex chromosomes (X,Y)
- XY in males.
- XX in females.
- 22 pairs of chromosomes, named autosomes.
7Genetic Information
- Gene: basic unit of genetic information. Genes
determine the inherited characters.
- Genome: the collection of genetic information.
- Chromosomes: storage units of genes.
8Sexual Reproduction
Meiosis
9The Double Helix
Source: Alberts et al.
10Central Dogma
Transcription
Translation
Cells express different subsets of their genes in
different tissues and under different conditions.
11Chromosome Logical Structure
- Marker: genes, SNPs, tandem repeats.
- Locus: location of a marker.
- Allele: one variant form of a marker.
Locus 1: possible alleles A1, A2
Locus 2: possible alleles B1, B2, B3
12Alleles - the ABO locus example
Genotype      Phenotype
A/A, A/O      A
B/B, B/O      B
A/B           AB
O/O           O
- O is recessive to A.
- A is dominant over O.
- A and B are codominant.
- Multiple alleles: A, B, O.
Trait = Character = Phenotype
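The dominance rules of the ABO example can be written as a small lookup. This is a minimal sketch; the function name abo_phenotype and the "X/Y" genotype string format are assumptions made for illustration, following the notation of the table.

```python
def abo_phenotype(genotype: str) -> str:
    """Map an ABO genotype such as 'A/O' to its blood-group phenotype.

    A and B are each dominant over O and codominant with each other,
    so the phenotype is simply the set of non-O alleles present.
    """
    alleles = genotype.split("/")
    if len(alleles) != 2 or any(a not in ("A", "B", "O") for a in alleles):
        raise ValueError(f"not an ABO genotype: {genotype!r}")
    expressed = sorted(set(alleles) - {"O"})  # O is recessive, so it is hidden
    return "".join(expressed) if expressed else "O"


for g in ("A/A", "A/O", "B/B", "B/O", "A/B", "O/O"):
    print(g, "->", abo_phenotype(g))
```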
13Definitions
- AA and aa are homozygotes (Homozygote); Aa is a
heterozygote (Heterozygote).
- Multiple alleles, e.g. (A, B, O).
14 X-linked
genotype
phenotype
- b is the dominant allele. Namely, (b,b) and (b,w)
are Black.
- w is the recessive allele. Namely, only (w,w) is
White.
- This is an example of an X-linked
trait/character.
- For males, b alone is Black and w alone is White.
- There is no homologous gene on the Y chromosome.
15Mendel's Work
Modern genetics began with Mendel's experiments
on garden peas (although the ramifications of his
work were not realized during his lifetime). He
studied seven contrasting pairs of characters,
including:
- The form of ripe seeds: round, wrinkled
- The color of the seed albumen: yellow, green
- The length of the stem: long, short
Mendel, Gregor. 1866. Experiments on Plant
Hybridization. Transactions of the Brünn Natural
History Society.
16Mendel's first law: Characters are controlled by
pairs of genes which separate during the
formation of the reproductive cells (meiosis).
Parent: A a
Gametes: A, a
17P: AA X aa
F1: Aa
F1 X F1: Aa X Aa
Gametes   A    a
A         AA   Aa
a         Aa   aa
F2: 1 AA : 2 Aa : 1 aa
Phenotype: 3 A : 1 a
Test cross: Aa X aa
Gametes   A    a
a         Aa   aa
Phenotype: 1 A : 1 a
18Conclusions
1. Crossing F1 with itself gives an F2 in which the
dominant and recessive phenotypes appear in a
3:1 ratio.
2. A test cross of F1 with the recessive
homozygote gives the two phenotypes in a
1:1 ratio.
19Mendel's First Law.
Results of crosses in which parents differed for
one character
Parental phenotype            F1        F2                         F2 ratio
1. Round X wrinkled seeds     round     5474 round, 1850 wrinkled  2.96:1
2. Yellow X green seeds       yellow    6022 yellow, 2001 green    3.01:1
3. Purple X white petals      purple    705 purple, 224 white      3.15:1
4. Inflated X pinched pods    inflated  882 inflated, 299 pinched  2.95:1
5. Green X yellow pods        green     428 green, 152 yellow      2.82:1
6. Axial X terminal flowers   axial     651 axial, 207 terminal    3.14:1
7. Long X short stems         long      787 long, 277 short        2.84:1
Conclusion, first law: The two members of a gene
pair segregate from each other into the gametes.
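The F2 ratio column can be recomputed directly from the counts. The sketch below does this for the seven crosses and for the pooled data; the names f2_counts and f2_ratio are illustrative and do not come from the lecture.

```python
# (dominant count, recessive count) in F2 for each of the seven crosses
f2_counts = {
    "Round X wrinkled seeds": (5474, 1850),
    "Yellow X green seeds": (6022, 2001),
    "Purple X white petals": (705, 224),
    "Inflated X pinched pods": (882, 299),
    "Green X yellow pods": (428, 152),
    "Axial X terminal flowers": (651, 207),
    "Long X short stems": (787, 277),
}


def f2_ratio(dominant: int, recessive: int) -> float:
    """Dominant-to-recessive ratio; the first law predicts about 3."""
    return dominant / recessive


for cross, (dom, rec) in f2_counts.items():
    print(f"{cross}: {f2_ratio(dom, rec):.2f}:1")

# Pooling all seven experiments gives a ratio even closer to 3:1.
total_dom = sum(dom for dom, _ in f2_counts.values())
total_rec = sum(rec for _, rec in f2_counts.values())
pooled = f2_ratio(total_dom, total_rec)
print(f"Pooled: {pooled:.2f}:1")
```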
21Polydactyly: a dominant mutation
22Brachydactyly: a dominant mutation
23Mendel's second law: When two or more pairs of
genes segregate simultaneously, they do so
independently.
Parent: A a B b
Gametes: A B, A b, a B, a b
P(AB) = P(A) x P(B)
P(Ab) = P(A) x P(b)
P(aB) = P(a) x P(B)
P(ab) = P(a) x P(b)
25Mendel's Second Law.
A dihybrid cross for color and shape of pea seeds
P: wrinkled and yellow (rrYY) X round and green (RRyy)
F1: round yellow (Rr Yy) X Rr Yy
F2:
round yellow       315
round green        108
wrinkled yellow    101
wrinkled green      32
Total              556
a. Check the segregation pattern for each allele in
F2:
416 yellow : 140 green (2.97:1)
423 round : 133 wrinkled (3.18:1)
Conclusion: both traits behave as single genes,
each carrying two different alleles.
26Question: Is there independent assortment of
alleles of the different genes?
- Probability to get yellow is 3/4, probability to
get round is 3/4: probability to get yellow round
is 3/4 X 3/4, namely 9/16.
- Probability to get yellow is 3/4, probability to
get wrinkled is 1/4: probability to get yellow
wrinkled is 3/4 X 1/4, namely 3/16.
- Probability to get green is 1/4, probability to
get round is 3/4: probability to get green round
is 1/4 X 3/4, namely 3/16.
- Probability to get green is 1/4, probability to
get wrinkled is 1/4: probability to get green
wrinkled is 1/4 X 1/4, namely 1/16.
27A standard presentation in terms of counts
Phenotype          Expected ratio   Expected count   Observed
yellow round       9                312.75           315
yellow wrinkled    3                104.25           101
green round        3                104.25           108
green wrinkled     1                34.75            32
Total              16               556              556
Conclusion, second law: Different gene pairs
assort independently in gamete formation.
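One common way to quantify how well the observed counts match the 9:3:3:1 expectation is Pearson's chi-square statistic. The lecture does not name a test, so this is an assumption added for illustration; the function name and the hard-coded 5% critical value for 3 degrees of freedom (7.815) are likewise part of the sketch.

```python
observed = {
    "yellow round": 315,
    "yellow wrinkled": 101,
    "green round": 108,
    "green wrinkled": 32,
}
expected_ratio = {
    "yellow round": 9,
    "yellow wrinkled": 3,
    "green round": 3,
    "green wrinkled": 1,
}


def chi_square(observed: dict, ratio: dict) -> float:
    """Pearson statistic: sum of (observed - expected)^2 / expected."""
    total = sum(observed.values())
    parts = sum(ratio.values())
    stat = 0.0
    for phenotype, count in observed.items():
        expected = total * ratio[phenotype] / parts
        stat += (count - expected) ** 2 / expected
    return stat


CRITICAL_5PCT_DF3 = 7.815  # chi-square critical value, 3 d.o.f., 5% level

stat = chi_square(observed, expected_ratio)
print(f"chi-square = {stat:.3f}")
print("consistent with 9:3:3:1" if stat < CRITICAL_5PCT_DF3 else "deviates from 9:3:3:1")
```

A statistic far below the critical value means the data give no reason to doubt independent assortment for these two genes.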
28Exceptions to Mendel's Second Law
Morgan's fruit fly data (1909): 2,839 flies
Eye color: A red, a purple
Wing length: B normal, b vestigial
Crosses: AABB x aabb, then AaBb x aabb
Offspring   AaBb    Aabb   aaBb   aabb
Expected    710     710    710    710
Observed    1,339   151    154    1,195
The alleles A and B stick together more often than
expected from Mendel's law.
29Morgan's explanation
F1
F2
30Parental types: AaBb, aabb. Recombinants: Aabb,
aaBb.
The proportion of recombinants between the two
genes (or characters) is called the
recombination fraction between these two genes.
It is usually denoted by r or θ. For
Morgan's traits:
r = (151 + 154)/2839 = 0.107
If r < 1/2, the two genes are said to be linked.
If r = 1/2, there is independent segregation
(Mendel's second law).
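The estimate of r for Morgan's data is just the share of recombinant offspring. A minimal sketch, with illustrative names (offspring, recombination_fraction) that are not from the lecture:

```python
# Observed offspring counts from the test cross AaBb x aabb
offspring = {"AaBb": 1339, "Aabb": 151, "aaBb": 154, "aabb": 1195}
recombinant_types = {"Aabb", "aaBb"}  # the non-parental combinations


def recombination_fraction(counts: dict, recombinants: set) -> float:
    """Proportion of offspring that carry a recombinant genotype."""
    total = sum(counts.values())
    return sum(counts[g] for g in recombinants) / total


r = recombination_fraction(offspring, recombinant_types)
print(f"r = {r:.3f}")
print("linked" if r < 0.5 else "independent segregation")
```

Comparing r with 1/2 by eye is enough here because the sample is large and r is far from 1/2; with small samples a formal test would be needed.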
31Recombination Phenomenon (Happens during Meiosis)
Male or female
The recombination fraction between two loci on
the same chromosome is the probability that they
end up in regions of different colors.
33Example: ABO, AK1 on Chromosome 9
Phase inferred
The Hardy-Weinberg law of population genetics permits
calculation of genotype frequencies from allele
frequencies:
P(a) = frequency of allele a in the population
P(aa) = P(a)^2
P(ab) = 2 P(a) P(b) for two different alleles a and b
Hardy-Weinberg equilibrium corresponds to a random
union of two gametes into a zygote.
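The Hardy-Weinberg calculation can be sketched for any number of alleles. The allele frequencies used below for A, B and O are hypothetical values chosen only to make the example run; they are not taken from the lecture.

```python
from itertools import combinations_with_replacement


def hardy_weinberg(allele_freqs: dict) -> dict:
    """Genotype frequencies under random union of two gametes.

    Homozygote (a, a): P(a)^2.  Heterozygote (a, b): 2 P(a) P(b).
    """
    genotype_freqs = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        p = allele_freqs[a] * allele_freqs[b]
        genotype_freqs[(a, b)] = p if a == b else 2 * p
    return genotype_freqs


# Hypothetical allele frequencies, for illustration only
abo_alleles = {"A": 0.3, "B": 0.1, "O": 0.6}
genotypes = hardy_weinberg(abo_alleles)
for (a, b), freq in genotypes.items():
    print(f"{a}/{b}: {freq:.2f}")
```

The genotype frequencies sum to 1 whenever the allele frequencies do, which is a quick sanity check on any such table.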
34Example: ABO, AK1 on Chromosome 9
Phase inferred
Recombination fraction is 12/100 in males and
20/100 in females. One centimorgan means one
recombination every 100 meioses. One centimorgan
corresponds to approximately 1M nucleotides (with
large variance), depending on location and sex.
35Conventions
36Maximum Likelihood Principle
What is the probability of the data for this
pedigree, assuming a recessive mutation?
What is the probability of the data for this
pedigree, assuming a dominant mutation?
Maximum likelihood principle: Choose the model
that maximizes the probability of the data.
37Linkage Equilibrium
- Linkage equilibrium: the haplotype frequency is the
product of the underlying allele frequencies
(independence).
- Exceptions occur for tightly linked loci.
38One locus: founder probabilities
Founders are individuals whose parents are not
in the pedigree. They may or may not be typed
(namely, their genotype measured). Either way, we
need to assign probabilities to their actual or
possible genotypes. This is usually done by
assuming Hardy-Weinberg equilibrium (H-W). If the
frequency of D is .01, then H-W says
pr(Dd) = 2 x .01 x .99
Genotypes of founder couples are
(usually) treated as independent.
pr(pop Dd, mom dd) = (2 x .01 x .99) x (.99)^2
(Figures: founder 1 with genotype Dd; founder
couple 1 Dd and 2 dd.)
39 One locus: transmission probabilities
Children get their genes from their parents'
genes, independently, according to Mendel's laws;
also independently for different children.
(Figure: parents 1 Dd and 2 Dd; child 3 dd.)
pr(kid 3 dd | pop 1 Dd, mom 2 Dd) = 1/2 x 1/2
40One locus: transmission probabilities - II
(Figure: parents 1 Dd and 2 Dd; children 3 dd,
4 Dd, 5 DD.)
pr(3 dd, 4 Dd, 5 DD | 1 Dd, 2 Dd) =
(1/2 x 1/2) x (2 x 1/2 x 1/2) x (1/2 x 1/2). The
factor 2 comes from summing over the two mutually
exclusive and equiprobable ways 4 can get a D
and a d.
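The transmission probabilities can be derived by enumerating the four equally likely (paternal allele, maternal allele) pairs. This is a minimal sketch; the function names and the two-character genotype strings such as "Dd" are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def child_distribution(father: str, mother: str) -> dict:
    """Distribution of a child's unordered genotype given both parents."""
    dist = Counter()
    # Each parent transmits either of its two alleles with probability 1/2,
    # so every (paternal, maternal) pair has probability 1/4.
    for paternal, maternal in product(father, mother):
        child = "".join(sorted(paternal + maternal))  # 'dD' and 'Dd' both become 'Dd'
        dist[child] += Fraction(1, 4)
    return dict(dist)


def transmission_prob(children: list, father: str, mother: str) -> Fraction:
    """Joint probability of the children's genotypes, independent across children."""
    dist = child_distribution(father, mother)
    prob = Fraction(1)
    for genotype in children:
        prob *= dist.get("".join(sorted(genotype)), Fraction(0))
    return prob


print(child_distribution("Dd", "Dd"))
print(transmission_prob(["dd", "Dd", "DD"], "Dd", "Dd"))
```

The probability 1/2 for a Dd child arises as the sum of two 1/4 terms, which is the factor 2 in the product above.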
41One locus: penetrance probabilities
Pedigree analyses usually suppose that, given the
genotype at all loci, and in some cases age and
sex, the chance of having a particular phenotype
depends only on the genotype at one locus, and is
independent of all other factors: genotypes at
other loci, environment, genotypes and phenotypes
of relatives, etc.
Complete penetrance: pr(affected | DD) = 1
Incomplete penetrance: pr(affected | DD) = .8
42One locus: penetrance - II
Age- and sex-dependent penetrance (liability
classes):
pr(affected | DD, male, 45 y.o.) = .6
43Incomplete Penetrance
A healthy daughter transmits the mutation to her
daughter.
44One locus: putting it all together
Assume penetrances pr(affected | dd) = .1,
pr(affected | Dd) = .3, pr(affected | DD) = .8,
and that allele D has frequency .01. The
probability of the data for this pedigree under
these penetrances is the product
(2 x .01 x .99 x .7) x (2 x .01 x .99 x .3) x
(1/2 x 1/2 x .9) x (2 x 1/2 x 1/2 x .7) x (1/2 x
1/2 x .8)
This is a function of the penetrances. By the
maximum likelihood principle, the penetrance
values that maximize this probability are the ML
estimates.
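The product above combines founder, penetrance and transmission terms, and can be written as a function of the penetrances. The pedigree figure is not available in this transcript, so the genotypes and affection statuses below are inferred from the factors of the product (.7 = 1 - .3 for an unaffected Dd, .9 = 1 - .1 for an unaffected dd, and so on); treat them as an assumption. The function name is illustrative.

```python
def pedigree_likelihood(pen_dd: float, pen_Dd: float, pen_DD: float,
                        freq_D: float = 0.01) -> float:
    """Probability of the pedigree data as a function of the penetrances.

    Assumed data (inferred from the product in the text):
      founder 1: Dd, unaffected      founder 2: Dd, affected
      child 3:   dd, unaffected      child 4:   Dd, unaffected
      child 5:   DD, affected
    """
    penetrance = {"dd": pen_dd, "Dd": pen_Dd, "DD": pen_DD}

    def phenotype_prob(genotype: str, affected: bool) -> float:
        f = penetrance[genotype]
        return f if affected else 1.0 - f

    het = 2 * freq_D * (1 - freq_D)  # Hardy-Weinberg probability of Dd
    founders = (het * phenotype_prob("Dd", False)) * (het * phenotype_prob("Dd", True))
    # Transmission from Dd x Dd parents: dd 1/4, Dd 1/2, DD 1/4
    children = ((0.25 * phenotype_prob("dd", False))
                * (0.5 * phenotype_prob("Dd", False))
                * (0.25 * phenotype_prob("DD", True)))
    return founders * children


likelihood = pedigree_likelihood(0.1, 0.3, 0.8)
print(f"L = {likelihood:.4e}")
```

Evaluating the function at other penetrance values and keeping the largest result is the maximum likelihood principle in its simplest form.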
45Fully penetrant Recessive Disease
(Figure: pedigree with individuals 1 to 5.)
Let q be the probability of the disease allele.
The probability of the data for this pedigree
assuming full penetrance is the product
L = (1-q) x q x (1-q) x q x (3/4)(3/4)(1/4)
Exercise: write the likelihood for a fully
penetrant dominant disease.
46Linkage Recombination
- Tutorial 2
- by Maayan Fishelson
47Crossing Over
- Sometimes in meiosis, homologous chromosomes
exchange parts in a process called crossing-over.
- New combinations are obtained, called the
crossover products.
48Recombination During Meiosis
Recombinant gametes
49Linkage
- 2 genes on separate chromosomes assort
independently at meiosis.
- 2 genes far apart on the same chromosome can also
assort independently at meiosis.
- 2 genes close together on the same chromosome
pair do not assort independently at meiosis.
- A recombination frequency << 50% between 2 genes
shows that they are linked.
50Two Loci Inheritance
Recombinant
51Linkage Maps
- Let U and V be 2 genes on the same chromosome.
- In every meiosis, chromatids cross over at random
along the chromosome.
- If the chromatids cross over between U and V, then
a recombinant is produced.
52Recombination Fraction
- The recombination fraction θ between two loci
is the percentage of times a recombination
occurs between the two loci.
- θ is a monotone, nonlinear function of the
physical distance separating the loci
on the chromosome.
53Centimorgan (cM)
- 1 cM (or 1 genetic map unit, m.u.) is the
distance between genes for which the
recombination frequency is 1%.
54Interference
- Crossovers in adjacent chromosome regions are
usually not independent. This interaction is
called interference.
- A crossover in one region usually decreases the
probability of a crossover in an adjacent region.
55Building Genetic Maps
- At first, only genes with variant alleles
producing detectably different phenotypes were
used as markers for mapping.
- Problem: the chromosomal intervals between the
genes were too large, so the resolution of the maps
was not high enough.
- Solution: use of molecular markers (a site of
heterozygosity for some type of silent DNA
variation not associated with any measurable
phenotypic variation).
56Linkage Mapping by Recombination in Humans.
- Problems:
- It is impossible to make controlled crosses in
humans.
- Human progenies are rather small.
- The human genome is immense. The distances
between genes are large on average.
57Lod Score for Linkage Testing by Pedigrees
- The results of many identical matings are
combined to get a more reliable estimate of the
recombination fraction.
- Calculate the probabilities of obtaining a set of
results in a family on the basis of (a)
independent assortment and (b) a specific degree
of linkage.
- Calculate the Lod score = log10(b/a).
A Lod score of 3 is considered convincing
support for a specific recombination fraction.
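The Lod calculation can be sketched for the simplest case: phase-known, fully informative meioses, where each meiosis is a recombinant with probability θ. That simplification, the function name lod_score, and the counts used (2 recombinants in 20 meioses) are assumptions for illustration, not data from the lecture.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 of L(theta) / L(1/2) for phase-known, informative meioses.

    (b) linkage at theta:      theta^k * (1 - theta)^(n - k)
    (a) independent assortment: (1/2)^n
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    k, n = recombinants, meioses
    log_linked = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked


# Hypothetical data: 2 recombinants observed in 20 meioses
grid = [i / 100 for i in range(1, 51)]
best_theta = max(grid, key=lambda t: lod_score(2, 20, t))
print(f"best theta on grid = {best_theta}")
print(f"Lod at best theta  = {lod_score(2, 20, best_theta):.2f}")
```

Because Lod scores are logarithms of likelihood ratios, scores from independent families at the same θ add, which is how results from many matings are combined.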