1
A Modeling Example
  • CpG islands in DNA sequences

2
Methylation Silencing
  • One way cells differentiate is methylation
  • Addition of a CH3 group to C nucleotides
  • Silences genes in the region
  • CG (denoted CpG) often mutates to TG when
    methylated
  • In each cell, one copy of the X chromosome is
    silenced; methylation plays a role
  • Methylation is inherited during cell division

3
Example: CpG Islands
CpG dinucleotides in the genome are frequently
methylated. (We write CpG to avoid confusion with the
C-G base pair.)
C → methyl-C → T
Methylation is often suppressed around genes and
promoters → CpG islands
4
Example: CpG Islands
  • In CpG islands:
  • CG is more frequent
  • Other pairs (AA, AG, AT, …) have different
    frequencies
  • Question: detect CpG islands computationally

5
A Model of CpG Islands (1): Architecture
[Diagram: an 8-state HMM. States A+, C+, G+, T+ model
the CpG island; states A-, C-, G-, T- model the
background (not a CpG island).]
6
A Model of CpG Islands (2): Transitions
  • How do we estimate the parameters of the model?
  • Emission probabilities: 1/0 (each state emits its
    own nucleotide with probability 1)
  • Transition probabilities within CpG islands
  • Established from known CpG islands
  • (Training Set)
  • Transition probabilities within other regions
  • Established from known non-CpG islands
  • (Training Set)
  • Note: these transitions out of each state add up
    to one, leaving no room for transitions between
    (+) and (-) states

[Table: estimated transition probabilities within the
(+) states and within the (-) states; each row sums
to 1]
7
Log Likelihoods: Telling CpG Island from Non-CpG
Island
Another way to see the effect of transitions: log
likelihood ratios
L(u, v) = log [ P(uv | +) / P(uv | -) ]
Given a region x = x1…xN, a quick-and-dirty way to
decide whether the entire x is a CpG island:
P(x is CpG) > P(x is not CpG) ⇔ Σi L(xi, xi+1) > 0
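A minimal sketch of this scoring rule in Python. The transition tables are the classic illustrative estimates from Durbin et al.'s textbook, not values from these slides; in practice they would be estimated from your own training set.

    import math

    # Transition probabilities within (+) and (-) regions; illustrative
    # textbook estimates, to be replaced by your own training counts.
    A_PLUS = {
        'A': {'A': .180, 'C': .274, 'G': .426, 'T': .120},
        'C': {'A': .171, 'C': .368, 'G': .274, 'T': .188},
        'G': {'A': .161, 'C': .339, 'G': .375, 'T': .125},
        'T': {'A': .079, 'C': .355, 'G': .384, 'T': .182},
    }
    A_MINUS = {
        'A': {'A': .300, 'C': .205, 'G': .285, 'T': .210},
        'C': {'A': .322, 'C': .298, 'G': .078, 'T': .302},
        'G': {'A': .248, 'C': .246, 'G': .298, 'T': .208},
        'T': {'A': .177, 'C': .239, 'G': .292, 'T': .292},
    }

    def cpg_score(x):
        """Sum of log likelihood ratios L(x_i, x_{i+1}); > 0 suggests CpG."""
        return sum(math.log(A_PLUS[u][v] / A_MINUS[u][v])
                   for u, v in zip(x, x[1:]))

    print(cpg_score("CGCGCGAT") > 0)   # True: CG-rich, island-like
    print(cpg_score("ATATTAAT") > 0)   # False: background-like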
8
A Model of CpG Islands (2): Transitions
  • What about transitions between (+) and (-)
    states?
  • They affect:
  • Avg. length of a CpG island
  • Avg. separation between two CpG islands

[Diagram: two states X and Y; X returns to itself
with probability p and moves to Y with probability
1-p; Y returns to itself with probability q and moves
back to X with probability 1-q.]
Length distribution of region X:
P[lX = 1] = 1-p
P[lX = 2] = p(1-p)
…
P[lX = k] = p^(k-1) (1-p)
E[lX] = 1/(1-p)
Geometric distribution, with mean 1/(1-p)
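A quick numeric check of the geometric mean; the self-transition probability p = 0.998 below is a made-up value, not from the slides.

    import random

    p = 0.998                       # hypothetical self-transition probability
    print(1 / (1 - p))              # analytic mean region length: 500.0

    def sample_length(p):
        """Stay in the state with probability p; leave with probability 1-p."""
        k = 1
        while random.random() < p:
            k += 1
        return k

    lengths = [sample_length(p) for _ in range(100_000)]
    print(sum(lengths) / len(lengths))   # empirical mean, close to 500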
9
Applications of the model
  • Given a DNA region x,
  • The Viterbi algorithm predicts locations of CpG
    islands (a generic Viterbi sketch follows this
    list)
  • Given a nucleotide xi (say xi = A),
  • The Viterbi parse tells whether xi is in a CpG
    island in the most likely general scenario
  • The Forward/Backward algorithms can calculate
  • P(xi is in a CpG island) = P(πi = A+ | x)
  • Posterior Decoding can assign locally optimal
    predictions of CpG islands:
  • π̂i = argmaxk P(πi = k | x)
  • Advantage: each nucleotide is more likely to be
    called correctly
  • Disadvantage: the overall parse will be choppy,
    with CpG islands that are too short

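A generic log-space Viterbi sketch. The toy model at the bottom collapses the islands into a single '+' state and the background into '-', with made-up probabilities; the slides' actual model has eight states.

    import math

    def viterbi(x, states, trans, emit, init):
        """Most likely state path for observations x (log-space Viterbi).

        trans[k][l] = a_kl, emit[k][b] = e_k(b), init[k] = P(start in k).
        """
        V = [{k: math.log(init[k]) + math.log(emit[k][x[0]]) for k in states}]
        ptr = []
        for b in x[1:]:
            row, back = {}, {}
            for l in states:
                best = max(states, key=lambda k: V[-1][k] + math.log(trans[k][l]))
                row[l] = (V[-1][best] + math.log(trans[best][l])
                          + math.log(emit[l][b]))
                back[l] = best
            V.append(row)
            ptr.append(back)
        path = [max(states, key=lambda k: V[-1][k])]   # best final state
        for back in reversed(ptr):                     # trace back
            path.append(back[path[-1]])
        return path[::-1]

    states = ['+', '-']
    trans = {'+': {'+': .9, '-': .1}, '-': {'+': .05, '-': .95}}
    emit = {'+': {'A': .1, 'C': .4, 'G': .4, 'T': .1},
            '-': {'A': .25, 'C': .25, 'G': .25, 'T': .25}}
    init = {'+': .5, '-': .5}
    print(''.join(viterbi("ATGCGCGCGTATAT", states, trans, emit, init)))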
10
What if a new genome comes?
  • We just sequenced the porcupine genome
  • We know CpG islands play the same role in this
    genome
  • However, we have no known CpG islands for
    porcupines
  • We suspect the frequency and characteristics of
    CpG islands are quite different in porcupines
  • How do we adjust the parameters in our model?
  • LEARNING

11
Learning
  • Re-estimate the parameters of the model based on
    training data

12
Two learning scenarios
  • Estimation when the right answer is known
  • Examples:
  • GIVEN: a genomic region x = x1…x1,000,000 where
    we have good (experimental) annotations of the
    CpG islands
  • GIVEN: the casino player lets us observe him
    one evening, as he changes dice and produces
    10,000 rolls
  • Estimation when the right answer is unknown
  • Examples:
  • GIVEN: the porcupine genome; we don't know how
    frequent the CpG islands are there, nor do we
    know their composition
  • GIVEN: 10,000 rolls of the casino player, but
    we don't see when he changes dice
  • QUESTION: update the parameters θ of the model to
    maximize P(x | θ)

13
1. When the right answer is known
  • Given x = x1…xN
  • for which the true π = π1…πN is known,
  • Define:
  • Akl = # of times the k→l transition occurs in π
  • Ek(b) = # of times state k in π emits b in x
  • We can show that the maximum likelihood
    parameters θ (maximizing P(x | θ)) are:
  • akl = Akl / Σi Aki
  • ek(b) = Ek(b) / Σc Ek(c)
  • (a counting sketch in code follows this list)
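A counting sketch of these formulas; function and argument names are mine, and the pseudocount arguments r_a, r_e anticipate slides 15-16 (zero by default, giving pure maximum likelihood).

    def estimate_params(x, pi, states, symbols, r_a=0.0, r_e=0.0):
        """Estimate a_kl and e_k(b) from a sequence x with known path pi."""
        A = {k: {l: r_a for l in states} for k in states}   # A_kl (+ r_kl)
        E = {k: {b: r_e for b in symbols} for k in states}  # E_k(b) (+ r_k(b))
        for k, l in zip(pi, pi[1:]):
            A[k][l] += 1                    # count each k -> l transition
        for k, b in zip(pi, x):
            E[k][b] += 1                    # count each emission of b in k
        def norm(row):
            total = sum(row.values())
            # a state never visited (total 0) is left uniform as a fallback
            return {key: (val / total if total else 1 / len(row))
                    for key, val in row.items()}
        return ({k: norm(A[k]) for k in states},
                {k: norm(E[k]) for k in states})

    # The 10-roll casino example from the next slide:
    a, e = estimate_params("2156123623", "FFFFFFFFFF", "FL", "123456")
    print(a['F'])         # {'F': 1.0, 'L': 0.0}
    print(e['F']['4'])    # 0.0: the overfitting problem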

14
1. When the right answer is known
  • Intuition: when we know the underlying states,
    the best estimate is the normalized frequency of
    the transitions and emissions that occur in the
    training data
  • Drawback:
  • Given little data, there may be overfitting:
  • P(x | θ) is maximized, but θ is unreasonable
  • 0 probabilities: BAD
  • Example:
  • Given 10 casino rolls, we observe
  • x = 2, 1, 5, 6, 1, 2, 3, 6, 2, 3
  • π = F, F, F, F, F, F, F, F, F, F
  • Then:
  • aFF = 1; aFL = 0
  • eF(1) = eF(3) = .2
  • eF(2) = .3; eF(4) = 0; eF(5) = eF(6) = .1

15
Pseudocounts
  • Solution for small training sets:
  • Add pseudocounts
  • Akl = # of times the k→l transition occurs in π
    + rkl
  • Ek(b) = # of times state k in π emits b in x
    + rk(b)
  • rkl, rk(b) are pseudocounts representing our
    prior belief
  • Larger pseudocounts → strong prior belief
  • Small pseudocounts (ε < 1): just to avoid 0
    probabilities
  • (a usage sketch in code follows this list)
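Reusing the estimate_params sketch from slide 13 on the same 10-roll example, with the (arbitrary) choice rkl = rk(b) = 1:

    a, e = estimate_params("2156123623", "FFFFFFFFFF", "FL", "123456",
                           r_a=1.0, r_e=1.0)
    print(a['F']['L'])    # 1/11, no longer exactly 0
    print(e['F']['4'])    # 1/16 = 0.0625 instead of 0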

16
Pseudocounts
  • Example: dishonest casino
  • We will observe the player for one day, 600 rolls
  • Reasonable pseudocounts:
  • r0F = r0L = rF0 = rL0 = 1
  • rFL = rLF = rFF = rLL = 1
  • rF(1) = rF(2) = … = rF(6) = 20 (strong belief
    fair is fair)
  • rL(1) = rL(2) = … = rL(6) = 5 (wait and see for
    loaded)
  • The numbers above are arbitrary; assigning priors
    is an art

17
2. When the right answer is unknown
  • We don't know the true Akl, Ek(b)
  • Idea:
  • We estimate our best guess of what Akl, Ek(b)
    are
  • Or, we start with random / uniform values
  • We update the parameters of the model, based on
    our guess
  • We repeat

18
2. When the right answer is unknown
  • Starting with our best guess of a model M,
    parameters θ:
  • Given x = x1…xN,
  • for which the true π = π1…πN is unknown,
  • We can get to a provably more likely parameter
    set θ′
  • i.e., a θ′ that increases the probability
    P(x | θ′)
  • Principle: EXPECTATION MAXIMIZATION
  • 1. Estimate Akl, Ek(b) in the training data
  • 2. Update θ according to Akl, Ek(b)
  • Repeat 1 & 2, until convergence

19
Estimating new parameters
  • To estimate Akl (assume θ = θCURRENT in all
    formulas below):
  • At each position i of sequence x, find the
    probability that transition k→l is used:
  • P(πi = k, πi+1 = l | x)
    = [1/P(x)] P(πi = k, πi+1 = l, x1…xN) = Q/P(x)
  • where
    Q = P(x1…xi, πi = k, πi+1 = l, xi+1…xN)
    = P(πi+1 = l, xi+1…xN | πi = k) P(x1…xi, πi = k)
    = P(πi+1 = l, xi+1 xi+2…xN | πi = k) fk(i)
    = P(xi+2…xN | πi+1 = l) P(xi+1 | πi+1 = l)
      P(πi+1 = l | πi = k) fk(i)
    = bl(i+1) el(xi+1) akl fk(i)
  • So:
    P(πi = k, πi+1 = l | x, θ)
    = fk(i) akl el(xi+1) bl(i+1) / P(x | θCURRENT)

20
Estimating new parameters
  • So, Akl is the expected number of times the k→l
    transition is used, given the current θ:
  • Akl = Σi P(πi = k, πi+1 = l | x, θ)
    = [1 / P(x | θ)] Σi fk(i) akl el(xi+1) bl(i+1)
  • Similarly,
  • Ek(b) = [1 / P(x | θ)] Σ{i : xi = b} fk(i) bk(i)
  • (a code sketch follows the diagram below)

[Diagram: the sequence is split as x1…xi (reaching
state k, forward value fk(i)), the transition akl,
the emission el(xi+1) of xi+1 in state l, and the
remainder xi+2…xN (backward value bl(i+1)).]
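These two formulas in code, as a sketch. The forward and backward matrices f and b (f[k][i] = fk(i), b[k][i] = bk(i), 0-based positions, unscaled) are assumed precomputed; full forward/backward code appears with the Baum-Welch sketch on the next slide.

    def expected_counts(x, states, a, e, f, b, Px):
        """E-step: expected transition counts A and emission counts E.

        A_kl   = (1/P(x)) * sum_i f_k(i) a_kl e_l(x_{i+1}) b_l(i+1)
        E_k(b) = (1/P(x)) * sum over {i : x_i = b} of f_k(i) b_k(i)
        """
        A = {k: {l: 0.0 for l in states} for k in states}
        E = {k: {} for k in states}
        for i in range(len(x) - 1):
            for k in states:
                for l in states:
                    A[k][l] += (f[k][i] * a[k][l]
                                * e[l][x[i + 1]] * b[l][i + 1]) / Px
        for i, sym in enumerate(x):
            for k in states:
                E[k][sym] = E[k].get(sym, 0.0) + f[k][i] * b[k][i] / Px
        return A, E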
21
The Baum-Welch Algorithm
  • Initialization:
  • Pick the best guess for model parameters
  • (or arbitrary ones)
  • Iteration:
  • 1. Forward
  • 2. Backward
  • 3. Calculate Akl, Ek(b), given θCURRENT
  • 4. Calculate new model parameters θNEW: akl,
    ek(b)
  • 5. Calculate the new log-likelihood P(x | θNEW)
  • GUARANTEED TO BE HIGHER BY
    EXPECTATION-MAXIMIZATION
  • Until P(x | θ) does not change much
  • (a self-contained sketch in code follows this
    list)
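A self-contained Baum-Welch sketch. It uses unscaled probabilities, so it is only suitable for short sequences (real implementations rescale f and b or work in log space), and it holds the initial-state distribution fixed for brevity.

    import math

    def forward(x, states, a, e, init):
        """f[k][i] = P(x_0..x_i, pi_i = k), unscaled."""
        f = {k: [0.0] * len(x) for k in states}
        for k in states:
            f[k][0] = init[k] * e[k][x[0]]
        for i in range(1, len(x)):
            for l in states:
                f[l][i] = e[l][x[i]] * sum(f[k][i - 1] * a[k][l]
                                           for k in states)
        return f

    def backward(x, states, a, e):
        """b[k][i] = P(x_{i+1}..x_N | pi_i = k), unscaled."""
        b = {k: [0.0] * len(x) for k in states}
        for k in states:
            b[k][-1] = 1.0
        for i in range(len(x) - 2, -1, -1):
            for k in states:
                b[k][i] = sum(a[k][l] * e[l][x[i + 1]] * b[l][i + 1]
                              for l in states)
        return b

    def baum_welch(x, states, symbols, a, e, init, max_iter=100, tol=1e-6):
        """EM re-estimation of a and e until P(x | theta) stops improving."""
        old_ll = -math.inf
        for _ in range(max_iter):
            f = forward(x, states, a, e, init)
            b = backward(x, states, a, e)
            Px = sum(f[k][-1] for k in states)
            # E-step: expected counts, as on slide 20.
            A = {k: {l: 0.0 for l in states} for k in states}
            E = {k: {s: 0.0 for s in symbols} for k in states}
            for i in range(len(x) - 1):
                for k in states:
                    for l in states:
                        A[k][l] += (f[k][i] * a[k][l]
                                    * e[l][x[i + 1]] * b[l][i + 1]) / Px
            for i, s in enumerate(x):
                for k in states:
                    E[k][s] += f[k][i] * b[k][i] / Px
            # M-step: normalize the expected counts into new parameters.
            a = {k: {l: A[k][l] / sum(A[k].values()) for l in states}
                 for k in states}
            e = {k: {s: E[k][s] / sum(E[k].values()) for s in symbols}
                 for k in states}
            new_ll = math.log(Px)
            if new_ll - old_ll < tol:   # EM guarantees no decrease
                break
            old_ll = new_ll
        return a, e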

22
The Baum-Welch Algorithm
  • Time complexity:
  • # iterations × O(K²N)
  • Guaranteed to increase the log likelihood
    P(x | θ)
  • Not guaranteed to find globally best parameters
  • Converges to a local optimum, depending on
    initial conditions
  • Too many parameters / too large a model →
    overtraining

23
Alternative: Viterbi Training
  • Initialization: same
  • Iteration:
  • 1. Perform Viterbi, to find π*
  • 2. Calculate Akl, Ek(b) according to π* +
    pseudocounts
  • 3. Calculate the new parameters akl, ek(b)
  • Until convergence
  • Notes:
  • Not guaranteed to increase P(x | θ)
  • Guaranteed to increase P(x | θ, π*)
  • In general, worse performance than Baum-Welch
    (a compact sketch in code follows this list)
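A compact sketch of this loop, reusing the viterbi and estimate_params functions sketched on earlier slides, with a single pseudocount r for both transitions and emissions:

    def viterbi_training(x, states, symbols, a, e, init, n_iter=20, r=1.0):
        """Re-estimate parameters from the single best path each round."""
        old_pi = None
        for _ in range(n_iter):
            pi = viterbi(x, states, a, e, init)   # 1. best path
            if pi == old_pi:                      # converged: path is stable
                break
            a, e = estimate_params(x, pi, states, symbols,
                                   r_a=r, r_e=r)  # 2.-3. count and normalize
            old_pi = pi
        return a, e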