Title: Quantitative genetics and breeding theory
1Quantitative genetics and breeding theory
Mini-course by Dag Lindgren, Dag.Lindgren_at_genfys.slu.se
Umeå 03-03-10-12, Raleigh, USA
Acknowledgements to Darius Danusevicius for
assistance with the layout
2Message from a senior and old professor of Forest
Genetics
- At least one week of attention to the concepts
behind TBT is needed every five years
- For all who call themselves forest tree breeders
- For all who get a doctor's degree in forest
genetics in the future
- For most professional forest geneticists
3General website: http://www.genfys.slu.se/staff/dagl/
In particular Tree Breeding Tools (TBT): http://www.genfys.slu.se/staff/dagl/Breed_Home_Page/
The start of this course is given in almost identical form at
http://www.genfys.slu.se/staff/dagl/Breed_Home_Page/Tutorials/Quant_Gen/Kurs01A_for_site.htm
This mini-course is very much my
personal view of the use of quantitative genetics
applied to forest tree breeding. Other schools
have other emphases. Some concepts are
established. Others I, or collaborators, have
coined. Most, but not all, of the material presented is
published somewhere.
4Common assumptions
- One character but may be composite!
- Diploid zygotes and haploid gametes
[Figure labels: meiosis, haploid gametes, diploid zygote, mitosis, diploid progeny]
5Semantics
- Many misunderstandings and conflicts are semantic
(a matter of definitions)
- It is important to speak the same language and use
the same symbols, at least within a group. The
second best is to understand that people speak
different languages.
[Cartoon: "This is a tree" / "No, it's a plant"]
6The art of breeding is combining a lot of things
in a good way!
7To do that effectively, we must have quantitative
concepts and measures
To optimize, a quantitative measure must be
defined and maximized!
8Some concepts useful for quantitative genetics
Identical by descent (IBD) means that
genes at the same locus are copies of the same
original gene in some ancestor.
The chance that both homologous genes in the same
zygote are identical by descent is called
inbreeding (F) (or coefficient of inbreeding).
9Coancestry (θ, f) between a pair of individuals is
the probability that genes, taken at random from
each of the concerned individuals, are identical
by descent (coefficient of coancestry). It is a
quantification of relatedness. We will widen
that concept!
Self-coancestry: an individual's coancestry with
itself is 0.5(1+F). This can be realised e.g. by
considering that coancestry in the previous
generation becomes inbreeding in the next, and then
considering selfing.
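The self-coancestry expression can also be checked by a short probability argument, sketched here as a worked equation. Two genes are drawn at random, with replacement, from the same individual: half of the time the same gene is drawn twice (IBD with certainty), and half of the time the two homologous genes are drawn (IBD with probability F). The symbols f_ii and F_i are notation assumed for this sketch.

```latex
f_{ii} \;=\; \tfrac{1}{2}\cdot 1 \;+\; \tfrac{1}{2}\cdot F_i \;=\; \tfrac{1}{2}\,(1 + F_i)
```

For a non-inbred individual (F = 0) this gives 0.5, and for a fully inbred individual (F = 1) it gives 1.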
If two individuals mate, their coancestry becomes
the inbreeding of their offspring.
Note that inbreeding and coancestry are relative
to a situation with no inbreeding or relatedness.
10Founder population is the starting point of
calculations. If all inbreeding and coancestry of
the founder population is known, inbreeding and
coancestry can be calculated from a pedigree. It
is usually practical and convenient to set
inbreeding and coancestry to zero in the "wild
forest" (or source population) and see the
founders (plus trees) as a sample from the wild
forest. Inbreeding and coancestry are relative to
some real or imaginary "base" or "reference" or
"source" population. Most conveniently this is
the founder population or the wild
forest.
11Gene pool means all genes in a population. It is
convenient to consider genes at one locus. The
gene pool is independent of how (or if) a
population is organised in zygotes.
Gene pool: a population with N zygotes has 2N
genes in the gene pool,
e.g. 2 zygotes with 4 genes in the picture.
[Figure: gene pool with genes 1 ... 2N]
12Each gene has the frequency 1/(2N)
Arrows: sampling with replacement (or infinite
copies of each gene).
Probability of sampling the same gene twice: 1/(2N).
Probability that different genes will be sampled: 1 - 1/(2N) = 1 - 0.5/N.
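The two sampling probabilities can be checked by brute-force enumeration. The following is a minimal sketch, assuming every gene in the pool is equally likely to be drawn and the two draws are independent; the function name `same_gene_probability` is made up for illustration.

```python
from fractions import Fraction
from itertools import product


def same_gene_probability(n_zygotes: int) -> Fraction:
    """Exact probability of drawing the same gene twice (with replacement)."""
    genes = range(2 * n_zygotes)            # each diploid zygote carries 2 genes
    pairs = list(product(genes, repeat=2))  # all ordered draws, equally likely
    same = sum(1 for first, second in pairs if first == second)
    return Fraction(same, len(pairs))


p_same = same_gene_probability(2)  # the example with 2 zygotes and 4 genes
p_diff = 1 - p_same
print(p_same, p_diff)  # 1/4 3/4
```

With N = 2 the result agrees with 1/(2N) = 1/4 and 1 - 0.5/N = 3/4.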
13Genes can be IBD (identical by descent). The
probability is the coancestry (f).
14The gene pool is often structured in individuals
15The probability that the two different genes in
the same zygote are IBD is the coefficient of
inbreeding (F).
[Figure label: F = inbreeding]
16Different mechanisms by which genes sampled from a
population may be IBD
- 1. The same gene is sampled twice (drift),
- 2. The genes are homologous genes from the same
individual (inbreeding),
- 3. The genes originate from different individuals
(relatedness).
17Pair-wise coancestries arranged in a coancestry
matrix
Ind 1 2 3
1 0.5 0.25 0
2 0.25 0.5 0
3 0 0 1
We denote a certain value by, e.g., f2,1 = 0.25.
The matrix is symmetric, thus f2,1 = f1,2.
The values along the diagonal (self-coancestries)
appear only once.
Coefficients of relationship are often arranged in
such a matrix (numerator matrix); in the absence of
inbreeding these values are twice as large as the coancestries.
18Examples of coancestry
Coancestries are probabilities, thus 0 ≤ f ≤ 1.
19Group coancestry
20Group coancestry: Let's put all homologous genes
in a big pool and select two (at random, with
replacement). The probability that the two are IBD we
define as group coancestry (Θ; this term was
introduced by Cockerham 1967).
To get the overall probability, average over all
individual probabilities, f. Group coancestry
equals the average of all N^2 coancestry values
among all combinations of the N individuals in a
population (or the average of all 4N^2
combinations of individual genes). We could as
well define group coancestry as this average; the
advantage of the probabilistic definition appears
in more complex situations.
21Ind 1 2 3
1 0.5 0.25 0
2 0.25 0.5 0
3 0 0 1
Sum of the 9 values in the matrix = 2.5
Average = group coancestry = 2.5/9 = 0.278
Note that self-coancestries appear once, while other
coancestries appear twice (reciprocals).
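The averaging step can be sketched in a few lines of code. This is a minimal illustration using the three-individual matrix of this slide; the function name `group_coancestry` is an assumption made for the sketch, not part of any published tool.

```python
def group_coancestry(matrix):
    """Average of all N*N values of a square coancestry matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    return total / (n * n)


# Coancestry matrix for individuals 1, 2 and 3 from the slide.
coancestry = [
    [0.5, 0.25, 0.0],
    [0.25, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]

theta = group_coancestry(coancestry)
print(round(theta, 3))  # 0.278
```

Summing the full matrix counts each self-coancestry once and each pair twice, which is exactly the weighting described in the note on reciprocals.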
22If all individuals in a population are related in
the same pattern, it is enough to calculate the N
coancestries for a single individual.
Self-coancestry is the group coancestry for a population
with a single member.
All members of a full-sib
family have equal coancestries to all other
individuals. Thus it is enough to construct the
coancestry matrix for full-sib families (and do
some thinking).
Group coancestry depends on
relatedness, not on how uniting gametes are
arranged. A brother is equally related to his
brother as to his sister, even though his
gametes are able to unite only with those of his
sister.
23Group coancestry for families
Family size n, no inbreeding
Half sibs
Full sibs
Self sibs
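As a hedged illustration of how family values can be derived, the sketch below averages the family coancestry matrix for a single family of size n. It assumes unrelated, non-inbred parents: sibs in half-sib and full-sib families then have self-coancestry 1/2 and pairwise coancestry 1/8 and 1/4 respectively. For self sibs it assumes a non-inbred selfed parent, so the offspring have F = 1/2 (self-coancestry 3/4) and pairwise coancestry 1/2. The function names are made up for this sketch, and the closed forms in the comments follow from these assumptions only.

```python
from fractions import Fraction


def family_group_coancestry(n, self_coancestry, sib_coancestry):
    """Average of the n diagonal and n*(n-1) off-diagonal matrix values."""
    total = n * self_coancestry + n * (n - 1) * sib_coancestry
    return total / (n * n)


def half_sibs(n):
    # One shared non-inbred parent: reduces to (n + 3) / (8n).
    return family_group_coancestry(n, Fraction(1, 2), Fraction(1, 8))


def full_sibs(n):
    # Two shared unrelated, non-inbred parents: reduces to (n + 1) / (4n).
    return family_group_coancestry(n, Fraction(1, 2), Fraction(1, 4))


def self_sibs(n):
    # Assumption: selfing of a non-inbred parent: reduces to (2n + 1) / (4n).
    return family_group_coancestry(n, Fraction(3, 4), Fraction(1, 2))


for n in (1, 5, 40):
    print(n, half_sibs(n), full_sibs(n), self_sibs(n))
```

Under these assumptions, large families approach group coancestries of 1/8, 1/4 and 1/2 respectively.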
24Group Coancestry may be expressed
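One way to write this, following the definition of group coancestry as the average of all N^2 pairwise coancestry values, is given below. The symbols Θ (group coancestry) and f_ij (coancestry between individuals i and j) are notation assumed here.

```latex
\Theta \;=\; \frac{1}{N^{2}} \sum_{i=1}^{N} \sum_{j=1}^{N} f_{ij}
```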
25Cross-coancestry and Self-coancestry
The term cross-coancestry is used here for the
average of all coancestry values among different
individuals, excluding self-coancestry. Using the bare term
"coancestry" for average cross-coancestry
invites misunderstanding.
Group coancestry can be separated into two components:
self-coancestry and cross-coancestry.
26Ind 1 2 3
1 0.5 0.25 0
2 0.25 0.5 0
3 0 0 1
Average cross-coancestry for this matrix is
2 × 0.25 / 6 = 0.083
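The same calculation can be sketched in code by averaging only the off-diagonal elements. This is a minimal illustration on the three-individual matrix; the function name `average_cross_coancestry` is an assumption of the sketch.

```python
def average_cross_coancestry(matrix):
    """Average of the N*(N-1) off-diagonal values of a coancestry matrix."""
    n = len(matrix)
    off_diagonal = sum(
        matrix[i][j] for i in range(n) for j in range(n) if i != j
    )
    return off_diagonal / (n * (n - 1))


coancestry = [
    [0.5, 0.25, 0.0],
    [0.25, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]

print(round(average_cross_coancestry(coancestry), 3))  # 0.083
```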
27Cross-coancestry, Inbreeding and Group Coancestry
relations
- A population can be described by
- Inbreeding (or average self-coancestry)
- Group-coancestry
- Average Cross-coancestry
- If two are known, the third can be derived
28Using the following relationships, group
coancestry and average cross-coancestry can be
derived,
where Θ = group coancestry, N = number of individuals,
f = average cross-coancestry and F = average
inbreeding.
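A form of these relationships that follows from the definitions already given can be sketched as below. It uses only two facts stated earlier: each of the N self-coancestries equals 0.5(1+F), and the remaining N(N-1) matrix elements average to the cross-coancestry. The bar over f and F marks population averages and is notation assumed here.

```latex
\Theta \;=\; \frac{N \cdot \tfrac{1}{2}(1+\bar{F}) \;+\; N(N-1)\,\bar{f}}{N^{2}}
       \;=\; \frac{1+\bar{F}}{2N} \;+\; \frac{N-1}{N}\,\bar{f}
\qquad\Longleftrightarrow\qquad
\bar{f} \;=\; \frac{N\,\Theta \;-\; \tfrac{1}{2}(1+\bar{F})}{N-1}
```

For the three-individual matrix, N = 3, average F = 1/3 and Θ = 2.5/9 give an average cross-coancestry of 0.083, in agreement with the direct calculation.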
29Linking generations
Group coancestry changes at
generation shifts can be calculated
retrospectively from a known pedigree linking to
the founders.
Future group coancestry can be
calculated with knowledge or assumptions about
future pedigrees. For other cases predictions may
be made, but this is often far from
trivial.
Note that there may be doubt whether the
assumptions are realistic (neutral selection,
many genes with infinitesimal action etc.).
30The link between generations is the gametes.
[Figure labels: parents, offspring]
The gene pool of the offspring is identical to
the gene pool of the successful gametes of the
parents.
31Consider a pair of genes, which may equivalently
be regarded as in offspring zygotes or in
parental successful gametes!
A pair of genes in
offspring may be IBD as they are copies of the
same gene in the parent population. This may
happen if a parent has more than one offspring.
32A pair of genes may originate from homologous
genes of the same parental zygote in the parental
generation; if that zygote was inbred, the considered
genes may be IBD.
[Figure labels: F; 2N_parents (parents); 2N_offspring (offspring)]
Different gametes from a parent get coancestry
(1 + F_parent)/2.
Sibs sharing that parent (half sibs) get
coancestry (1 + F_parent)/8.
33If the considered gene pair originates from
different parents, the coancestry will be f_parent
(the coancestry between those parents).
[Figure labels: 2N_parents (parents); 2N_offspring (offspring)]
34IBD may occur by the following mechanisms
- 1. The same gene in the current generation is
sampled twice,
- 2. The genes are copies of the same gene in the
parental generation,
- 3. The genes originate from homologous genes in the
same inbred parent,
- 4. The genes come from different, but related,
parents.
35Gene diversity!
36Group coancestry and gene diversity
- Group coancestry is the probability that two
genes are IBD
- Diversity means that things are different
- Gene Diversity means that genes are different.
- Evidently 1 - group coancestry is the
probability that the genes are non-identical,
thus diverse.
37GD = 1 - group coancestry is the probability
that the genes are non-identical, thus diverse.
GD is Gene Diversity!
Group coancestry is a measure of gene diversity
lost! That seems to be something worth knowing!
38This way of thinking sees all genes in the source
(reference) populations as unique (tagged).
GD is similar to expected average heterozygosity
(the chance that two genes are different).
Group coancestry based measures are (like
inbreeding) relative to some reference
population. For forest tree breeding the wild
forest usually constitutes a good reference. The
gene diversity of the wild forest is 1, and the
group coancestry is the share of the initial gene
diversity lost.
Monitor group coancestry in tree improvement
operations! That says how much gene diversity has
been lost since the initiation of the breeding
program!
39Deriving coancestry and group coancestry
An algorithm for calculation of coancestry and
group coancestry (example from Lindgren et al
1997).
40Tabulate the pedigree for the population, with points (.)
for founders. Parents are always defined before they are used
as parents. Task: calculate the group coancestry of the
reds!
Individuals 1, 2, 3, 4, 9 and one parent of 13 can be considered
founders.
41Calculation of the coancestry matrix. Pedigree
for the population in the example. Fill the matrix
(thus the coancestries of all pairs of the 13
individuals) using the pedigree information. This
can be done step by step:
- Fill rows from left to right
- Start with the diagonal element
- Proceed rightwards to the row's end
42- As the matrix is symmetric, column
values can be filled from the row
- Start with the next diagonal element
43The matrix below has been filled to element
(6,6). Individual 6 has parents 2 and 3; it is
demonstrated how the diagonal element (6,6) is
filled.
44The matrix below has been filled to element
(6,7). Individual 8 has parents 3 and 4; it is
demonstrated how the off-diagonal element (6,8) is
filled.
45The full coancestry matrix. Group coancestry is
wanted for individuals 10-13.
The red population gets the red coancestry values;
the group coancestry for the population 10-13 is
the average of the red values
(2.875/16 = 0.1797).
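The row-by-row filling procedure can be sketched in code. The following is a minimal illustration of the tabular method under these assumptions: parents are listed before their offspring, unknown parents are treated as unrelated non-inbred founders, a diagonal element is 0.5(1 + coancestry between the two parents), and an off-diagonal element is the mean coancestry of one individual with the two parents of the other. The pedigree below is hypothetical and is NOT the 13-individual example of Lindgren et al. 1997; all function names are made up for the sketch.

```python
def coancestry_matrix(pedigree):
    """Tabular method. pedigree[i] = (parent_a, parent_b), None if unknown.

    Parents must have a lower index than their offspring.
    Unknown parents are treated as unrelated, non-inbred founders.
    """
    n = len(pedigree)
    f = [[0.0] * n for _ in range(n)]

    def with_parent(i, parent):
        # Coancestry between individual i and one parent of another individual.
        return 0.0 if parent is None else f[i][parent]

    for i, (a, b) in enumerate(pedigree):
        # Diagonal: 0.5 * (1 + F), where F is the coancestry of the parents.
        inbreeding = 0.0 if a is None or b is None else f[a][b]
        f[i][i] = 0.5 * (1.0 + inbreeding)
        # Proceed rightwards along the row; mirror each value into the column.
        for j in range(i + 1, n):
            c, d = pedigree[j]
            f[i][j] = 0.5 * (with_parent(i, c) + with_parent(i, d))
            f[j][i] = f[i][j]
    return f


def group_coancestry(f, members):
    """Average of all coancestry values among the chosen members."""
    total = sum(f[i][j] for i in members for j in members)
    return total / len(members) ** 2


# Hypothetical pedigree with 0-based indices (illustration only).
pedigree = [
    (None, None),  # 0: founder
    (None, None),  # 1: founder
    (None, None),  # 2: founder
    (0, 1),        # 3
    (0, 1),        # 4: full sib of 3
    (1, 2),        # 5: half sib of 3 and 4
    (3, 4),        # 6: offspring of a full-sib mating
]

f = coancestry_matrix(pedigree)
print(f[3][4], f[3][5], f[6][6])
print(group_coancestry(f, [3, 4, 5, 6]))
```

In this hypothetical pedigree the full sibs get coancestry 0.25, the half sibs 0.125, and the offspring of the full-sib mating gets self-coancestry 0.625 (F = 0.25). The last line averages the sub-matrix of the selected individuals, in the same way as the red values are averaged for individuals 10-13.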
46Status number
- Status number is half the inverse of group
coancestry
47- Or, equivalently
- Status number is half the inverse of the
probability that two genes drawn at random are
IBD.
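The definition translates directly into a one-line calculation. The sketch below is a minimal illustration; the function name `status_number` is an assumption, and the two inputs are the group coancestries 2.5/9 (the three-individual matrix) and 2.875/16 (individuals 10-13) calculated earlier.

```python
def status_number(group_coancestry):
    """Status number = half the inverse of group coancestry."""
    if not 0.0 < group_coancestry <= 1.0:
        raise ValueError("group coancestry must be in (0, 1]")
    return 0.5 / group_coancestry


print(round(status_number(2.5 / 9), 2))     # 1.8
print(round(status_number(2.875 / 16), 2))  # 2.78
```

For N unrelated, non-inbred individuals the group coancestry is 0.5/N, so the function returns N.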
48Status Number
An attractive property of the status number is
that it is the same as the census number for a
population of unrelated, non-inbred trees.
Status number is an intuitively appealing way of
presenting group coancestry, as it connects to
the familiar concept of number (population size).
Status number is an effective number. It relates
a real population to an ideal population. The
ideal population consists of unrelated,
non-inbred trees with the same probability of
IBD.
49Gene diversity as a function of status number
Note that 1/2N is familiar in genetics!
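Combining the two definitions already given (gene diversity is 1 minus group coancestry, and status number is half the inverse of group coancestry) gives the following relation. The symbols GD, Θ and N_S are notation assumed for this sketch.

```latex
GD \;=\; 1 - \Theta \;=\; 1 - \frac{1}{2N_{S}}
```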
50The status number says that the probability of
drawing two genes IBD is the same as if that
many unrelated, non-inbred individuals were
contributing to the gene pool. Therefore we can
call it an effective number. The ratio of the
status number to the census number is useful,
thus Nr = Ns/N. I call this the relative status
number.
51An example of the predicted drop of status number
over time in a breeding program.
POPSIM simulation: BP = 100; four controlled matings were made
for each member of the breeding population; the
family size was 40; the next generation was
recruited from the previous by phenotypic
selection; the initial heritability was 0.2
(Lindgren et al. 1997).
52The drop of Gene Diversity
The same data look
less drastic when considering gene
diversity! This is to exemplify what may happen
to Gene Diversity during breeding (from Lindgren
et al. 1997). Data from a simulated breeding
program. POPSIM simulation: breeding
population = 100; four controlled matings were made for
each member of the breeding population; the
family size was 40; the next generation was
recruited from the previous by phenotypic
selection (selecting the best 100 among the
offspring considering only the phenotype); the
initial heritability was 0.2.
54Some properties of status number
- NS can never be higher than the census number (N)
- NS can never be lower than 0.5 (NS of a gamete)
- NS considers relatedness and inbreeding
- NS may be derived for any hypothetical
population (with known relatedness patterns to a
known source population). It is irrelevant if
"population members" belong to the same
generation or the same subpopulation
- NS cannot exceed the minimum N in any of the
preceding generations, if all ancestors are
confined to a range of discrete generations
- NS does not care about the gender of the
population members
- NS after a generation shift depends only on
the number of offspring for each parent
- NS is independent of the mating patterns of
the parents it is derived from
- NS describes a gene pool, not how it is
organised
- NS usually declines at generation shifts, but
it can rise if the initial genomes get a more
equal representation after a generation shift
than before
- Mating patterns matter for the development of NS in
later generations, and they constrain the
possible values of NS; thus they are a relevant
matter, even if not formally
- NS is closely associated with inbreeding, but the
associations become clearer with the concept of
group coancestry, and they are better developed in
connection with that concept
55Status number and group coancestry measure gene
dispersion! Cockerham (1969) concluded that the
variance of the gene frequency (that is, the mean
of the occurrence of a gene) is
this can equivalently be expressed
This is the binomial expression of the variance
for the gene frequency in a population with Ns
non-inbred, unrelated members!
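A form consistent with the binomial interpretation given here is sketched below. It is an assumption of this sketch, not a quotation from Cockerham (1969), that p denotes the gene frequency in the reference population and that the variance is taken over replicate samplings of the gene pool. The second equality uses only the definition of status number as half the inverse of group coancestry.

```latex
\sigma^{2}_{p} \;=\; \Theta\, p\,(1-p) \;=\; \frac{p\,(1-p)}{2N_{S}}
```

The right-hand side is the variance of a binomial proportion based on 2N_S independently sampled genes.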
56Status number is the number of unrelated, non-inbred
trees sampled from the reference population
which would have the same drift as the accumulated
drift of the population under study (compared to
the reference population).
57Effective number
An effective number (size) is
an attempt to characterise a complicated system by
the number of individuals in a simpler and more
ideal system which has the same characteristic
value or behaviour in some important
respect.
Effective population size in the
inbreeding or variance sense
To understand how
these concepts are usually used in genetics, one
has to understand that they compare population
dynamics to that of an "ideal population"
(Caballero, 1994 p 658 from Fisher 1930): "An
idealized population consists of infinite,
randomly mated base populations subdivided into
infinitely many subpopulations, each with a
constant number, N, of breeding individuals per
generation. In each subpopulation, parents
produce an infinite number of male and female
gametes into a large pool from which only 2N are
sampled and united to produce the N zygotes of
the following generation...Both the sampling of
the gametes and their union (including
self-fertilization) are random, so that all
parents have an equal chance of producing
offspring.... Generations do not overlap." (There
are some less important omissions.)
58Based on this: The effective size of a population
is defined as the size of an idealized population
which would give rise to the variance of change
in gene frequency or the rate of inbreeding
observed in the actual population under
consideration.
Thus, effective population size
says how a studied population develops over
generations compared to the development of the
ideal population. Note that the number is not
associated with any particular generation.
59Usually two variants of effective population
size are recognised, the inbreeding sense and
the variance sense (there are more). The
classical effective population size is the size
of an ideal population which accumulates
inbreeding or widens variance at the same rate as
the population under study. The status number does not
do that. The status number measures a state; the
classical effective population size measures a
rate.
60The status number and the "traditional" effective
population size are sometimes similar when
studied after, respectively over, the first
generation turnover, in particular when large
progenies are considered. An analogy: a distance
and a speed may appear the same if studied over
a unit of time from a common starting point. E.g.
many results concerning diversity from Lindgren
and Wei (e.g. Wei 1995) and others can be
considered as status number results even if
different variants of effective numbers or
diversity have been used. But when families are
limited there is no equivalence.
61Currently I believe effective population size in
the inbreeding sense is a concept we had better
forget about in forest tree improvement; it is
much better just to try to predict the inbreeding
than to stray around in never-needed, but complex,
calculations of an abstract and often
misleading entity. I find it easier to see the need for,
and to accept some intuitively odd characteristics
of, the effective population size in the variance
sense; there may be a need for such calculations,
and status number may be viewed as a complement.
62Status number, group coancestry and variance
effective number
These concepts may (in an
over-simplified world) be linked,
where NS = status number, NV = variance effective
number and t = generations. This can also be expressed
63The initial founders matter, so the formulas are
more relevant for the development over
generations than for the absolute values.
64Different effective numbers for the same
object
It may be of interest to see how
different effective numbers compare for the same
object; this has been done by Kjaer and
Wellendorf (1998) for a Norway spruce seed
orchard and its crop.
Note that the effective population size expresses
changes between the 100 parental clones and their
progeny, while the status number expresses the
relationship between the orchard seeds and the
base population of unrelated, non-inbred trees
(the wild forest) from which the seed orchard clones were
drawn.
65Status number may be interpreted
Status number is the number of clones drawn from
the wild forest which has the same group
coancestry and gene diversity as the seeds
harvested in the orchard.
Inbreeding
interpretation: Status number is the number of
clones drawn from the wild forest which, following
random mating, would produce as much inbreeding as
expected in the seed crop of the forest created
with the seed orchard crop.
Drift
interpretation: Status number is the number of
clones drawn from the wild forest which has the
same expected drift in gene frequencies as the
seeds harvested in the orchard. Note that the
variance effective population size is a measure
of the drift between the seed orchard and its
seed.
66Inbreeding follows group coancestry
67Group coancestry and Wright's F-statistics
What is called FIS expresses the difference between
inbreeding and cross-coancestry. Under
Hardy-Weinberg equilibrium they are equal (the same
chance of IBD whether the genes are in the same or in
different individuals). I have developed the
relations with Θ as follows
68Forest tree breeding and status number
The status number concept is more useful to
forest tree breeders than to other breeders or
geneticists. Forest tree breeders
- are still very close to the founders
- thus close to the "wild forest", a natural
reference point for evaluating the impact of breeding
- deal with few generations
- change strategy between generations
- structure population in sublines
- "own" and control the breeding population