Evolution - PowerPoint PPT Presentation

Doug Raiford Lesson 4 – PowerPoint PPT presentation

1
Evolution
  • Doug Raiford
  • Lesson 4

2
Darwinian Evolution
  • Heritable traits
  • Variation in population (parental combinations
    and mutations)
  • Visible, phenotypical differences lead to
    different survival rates

Parental combinations and mutational changes
Natural Selection
3
Meiosis
  • Parental combinations

Each of us has two copies of each
chromosome (diploid)
  • One allele from each parent
  • Allele: one of a series of different forms of a
    gene
  • Each chromatid has a copy of each gene

Sperm and egg have only one copy (haploid)
4
DNA and evolution
  • Over time, genes accumulate mutations
  • Environmental factors
  • Radiation
  • Oxidation
  • Mistakes in replication or repair
  • Evolution: change in allele frequency over time

5
Classification
  • In the past, phenotypic differences were used to
    classify
  • Now that we have sequenced genomes, can look at
    similarity of DNA
  • Chimps and humans are about 99% identical
  • Diverged about 6 million years ago

6
Homologs, Paralogs, and Orthologs
  • To compare species, must look at similar genes
  • Homologous genes: similar due to shared ancestry
  • Orthologous genes: homologs separated by speciation
  • Paralogous genes: homologs similar due to a gene
    duplication event

7
Why do homologs drift apart? Types of mutations
  • Point mutations
  • Insertions, deletions
  • Duplications, inversions, translocations
  • Remember paralogs?
  • Causes
  • Radiation (cosmic, UV, X-ray)
  • Replication (mitosis) or crossover (meiosis)

8
If we want to compare two homologous genes
  • Point mutations (substitutions), easy
    ACGTCTGATACGCCGTATAGTCTATCT
    ACGTCTGATTCGCCCTATCGTCTATCT
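
Point mutations are "easy" because the two sequences have the same length, so they can be compared position by position without any alignment. A minimal Python sketch using the two sequences from this slide; the function name `point_mutations` is illustrative and not from the lecture.

```python
def point_mutations(seq1, seq2):
    """Return the 0-based positions where two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(seq1, seq2)) if a != b]


original = "ACGTCTGATACGCCGTATAGTCTATCT"
mutated = "ACGTCTGATTCGCCCTATCGTCTATCT"

positions = point_mutations(original, mutated)
print(len(positions), positions)  # 3 [9, 14, 18]
```

The number of differing positions is the Hamming distance between the two sequences.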

9
Insertions and deletions
  • Indels are difficult, must align sequences
    Unaligned:
    ACGTCTGATACGCCGTATAGTCTATCT
    CTGATTCGCATCGTCTATCT
    Aligned:
    ACGTCTGATACGCCGTATAGTCTATCT
    ----CTGATTCGC---ATCGTCTATCT
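
Once two sequences are aligned, the differences can be read off column by column. The sketch below counts matches, substitutions, and gap columns for the aligned pair on this slide. The function name `summarize_alignment` and the choice of `-` as the gap symbol are assumptions made for illustration.

```python
def summarize_alignment(row1, row2, gap="-"):
    """Count matches, mismatches, and gap columns in two aligned rows."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    matches = mismatches = gaps = 0
    for a, b in zip(row1, row2):
        if a == gap or b == gap:
            gaps += 1          # an indel column
        elif a == b:
            matches += 1
        else:
            mismatches += 1    # a substitution
    return matches, mismatches, gaps


row1 = "ACGTCTGATACGCCGTATAGTCTATCT"
row2 = "----CTGATTCGC---ATCGTCTATCT"
print(summarize_alignment(row1, row2))  # (18, 2, 7)
```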

10
Deletions
  • Codon deletion
    ACG ATA GCG TAT GTA TAG CCG
  • Effect depends on the protein, position, etc.
  • Almost always deleterious
  • Sometimes lethal
  • Frame shift mutation (e.g., muscular dystrophy;
    sickle-cell, by contrast, is a point substitution)
    ACG ATA GCG TAT GTA TAG CCG
    ACG ATA GCG ATG TAT AGC CG?
  • Almost always lethal
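
The difference between the two kinds of deletion shows up when the sequence is re-split into codons. In this Python sketch, deleting a whole codon leaves the downstream codons unchanged, while deleting a single base shifts every codon after it. The helper names `codons`, `delete_base`, and `delete_codon` are illustrative.

```python
def codons(seq):
    """Split a sequence into consecutive 3-letter codons (the last may be partial)."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def delete_base(seq, index):
    """Return seq with the single base at the 0-based index removed."""
    return seq[:index] + seq[index + 1:]


def delete_codon(seq, codon_index):
    """Return seq with the whole codon at the 0-based codon index removed."""
    start = 3 * codon_index
    return seq[:start] + seq[start + 3:]


gene = "ACGATAGCGTATGTATAGCCG"
print(" ".join(codons(gene)))                   # ACG ATA GCG TAT GTA TAG CCG
print(" ".join(codons(delete_codon(gene, 3))))  # ACG ATA GCG GTA TAG CCG
print(" ".join(codons(delete_base(gene, 9))))   # ACG ATA GCG ATG TAT AGC CG
```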

11
Insertion or deletion?
  • Comparing two genes, it is generally impossible to
    tell whether an indel is an insertion in one gene
    or a deletion in the other, unless ancestry is
    known
    ACGTCTGATACGCCGTATCGTCTATCT
    ACGTCTGAT---CCGTATCGTCTATCT

12
Does it change the protein?
  • Synonymous
  • Serine
  • UCU to UCC
  • No change in protein
  • Non-synonymous
  • Serine to tyrosine
  • UCU to UAU
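
Whether a substitution changes the protein can be checked by translating the codon before and after the change. The Python sketch below uses a deliberately partial codon table that covers only the serine and tyrosine codons needed for this slide; a real check would need the full genetic code. The names `CODON_TO_AMINO_ACID` and `is_synonymous` are illustrative.

```python
# Partial RNA codon table (assumption: only the entries needed for this example).
CODON_TO_AMINO_ACID = {
    "UCU": "Ser", "UCC": "Ser", "UCA": "Ser", "UCG": "Ser",
    "UAU": "Tyr", "UAC": "Tyr",
}


def is_synonymous(codon_before, codon_after):
    """True if both codons translate to the same amino acid."""
    return CODON_TO_AMINO_ACID[codon_before] == CODON_TO_AMINO_ACID[codon_after]


print(is_synonymous("UCU", "UCC"))  # True: serine stays serine
print(is_synonymous("UCU", "UAU"))  # False: serine becomes tyrosine
```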

13
Nature experiments
  • Gene duplication event
  • One copy can continue to perform the function
  • The other can accumulate mutations (the experiment)
  • Result: paralogs

14
Why align sequences?
  • Already said
  • Can then measure differences between genes to
    determine evolutionary distance
  • See where indels and substitutions are
  • Why else?
  • What if we wanted to do a database search?
  • Databases are great at perfect matches
  • But to find homologous genes, need fuzzy matches

Database of sequences
Sequence query
15
But, how to align?
  • Exhaustively, could try all possible alignments

From
---ACGT
ACT----
To
ACGT---
----ACT
And everything in between
16
Exhaustive placement of spaces
  • Could set up a loop and place gaps in all possible
    locations

Or, could solve recursively
---ACGT   --A-CGT   --AC-GT   . . .   ACGT---
ACT----   ACT----   ACT----           ----ACT
  • Tricky
  • Have to avoid all gap-gap situations
  • Must find a way to look at ALL possibilities

17
Recursion
  • A function that calls itself
  • Can often be an elegant solution to difficult
    problems
  • Elegant: a non-obvious solution that is much
    simpler in design than the problem would suggest

Example: factorial
Definition of f(n): return n * f(n-1)
18
A more practical example
  • Factorial recurrence relation
  • factorial(n) = n * factorial(n-1)
  • Define f(n): return n * f(n-1)
  • Example: f(3)
  • 3 * f(2)
  • 2 * f(1)
  • 1
  • 3 * 2 * 1 = 6
  • How did the program know to stop at f(1)?

19
Base case
  • To know when to stop, must have a base case
  • Base case for factorial is when n equals 1

Example:
my $answer = factorial(10);
print "$answer\n";

sub factorial {
    my $passedArg = shift;
    # check base case
    if ($passedArg == 1) { return 1; }
    # else, if base case not satisfied, recurse
    else { return $passedArg * factorial($passedArg - 1); }
}
20
What does this have to do with aligning?
  • Recursive solutions can usually be found by
    breaking a problem into sub problems
  • Insert no gap, recurse on rest
  • Insert a gap in string 1, recurse on rest
  • Insert a gap in string 2, recurse on rest

Three sub-problems: add score from matching the first column to score from aligning the rest
First column                                               Rest
First character of string 1 / First character of string 2  rest of string 1 / rest of string 2
Gap / First character of string 2                          string 1 / rest of string 2
First character of string 1 / Gap                          rest of string 1 / string 2
21
Example
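  • atg
  • acg

First level of recursion (first column | rest):
a | tg     _ | atg    a | tg
a | cg     a | cg     _ | acg

Second level, expanding the first branch (tg, cg):
t | g      _ | tg     t | g
c | g      c | g      _ | cg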
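
The three-way branching in this example can be written directly as a recursive function that lists every possible alignment. This is a sketch under stated assumptions: the function name `all_alignments` and the use of `_` as the gap symbol follow the slide's notation but are not the lecture's code. Because each step consumes a character from at least one string, a gap is never paired with a gap.

```python
def all_alignments(s1, s2, gap="_"):
    """Return every alignment of s1 and s2 as a (row1, row2) pair."""
    if not s1 and not s2:
        return [("", "")]  # nothing left to align
    results = []
    if s1 and s2:  # no gap: pair the two first characters
        for r1, r2 in all_alignments(s1[1:], s2[1:], gap):
            results.append((s1[0] + r1, s2[0] + r2))
    if s2:  # gap in string 1, opposite the first character of string 2
        for r1, r2 in all_alignments(s1, s2[1:], gap):
            results.append((gap + r1, s2[0] + r2))
    if s1:  # gap in string 2, opposite the first character of string 1
        for r1, r2 in all_alignments(s1[1:], s2, gap):
            results.append((s1[0] + r1, gap + r2))
    return results


alignments = all_alignments("atg", "acg")
print(len(alignments))  # 63
print(alignments[0])    # ('atg', 'acg')
```

Even for two three-letter strings there are 63 alignments, which hints at how quickly exhaustive enumeration grows.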
22
Scoring
  • When scoring alignments there must be a gap
    penalty, a mismatch penalty, and a bonus for a
    match
  • For any two strings the best alignment score will
    be the maximum of three possibilities
  • Recurrence relations

allign(string1, string2) = max of
    match or mismatch of first chars + allign(rest of string1, rest of string2)
    gap penalty + allign(string1, rest of string2)
    gap penalty + allign(string1 starting at pos 2, string2)
23
What is the base case?
  • If down to the empty string for either
  • Return gap penalty * the length of the non-empty
    string (return 0 if both empty)

Base Case
24
Pseudo code
  • Definition of allign(string1, string2)
  • If base case satisfied, return base score
  • Otherwise
  • Return the max of
  • Gap penalty + allign(string1, rest of string2)
  • Match or mismatch of first chars + allign(rest of
    string1, rest of string2)
  • Gap penalty + allign(string1 starting at pos 2,
    string2)
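
The pseudo code translates almost line for line into a runnable function. The Python sketch below keeps the slides' spelling `allign`. The score values (+1 match, -1 mismatch, -1 gap) are assumptions chosen for illustration, since the slides do not give specific numbers.

```python
# Illustrative scores (assumption: not specified in the slides).
MATCH = 1
MISMATCH = -1
GAP = -1


def allign(string1, string2):
    """Best alignment score by naive recursion (name spelled as in the slides)."""
    # Base case: one string is empty, so the rest of the other pairs with gaps.
    # If both are empty this returns 0.
    if not string1 or not string2:
        return GAP * (len(string1) + len(string2))
    first = MATCH if string1[0] == string2[0] else MISMATCH
    return max(
        first + allign(string1[1:], string2[1:]),  # no gap
        GAP + allign(string1, string2[1:]),        # gap in string 1
        GAP + allign(string1[1:], string2),        # gap in string 2
    )


print(allign("atagcgcc", "ataggcc"))  # 6: seven matches and one gap
```

With these scores the best result for the two example strings is seven matches and a single gap, which corresponds to the alignment atagcgcc / atag_gcc.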

25
Example
  • These two strings
  • atagcgcc
  • ataggcc
  • Align like
  • atagcgcc
  • atag_gcc

Have now taken a problem in biology and mapped it
to a common problem-solving technique in computer
science: recursion
26
Life is good, but
  • The previous example (an 8 character string
    aligned with a 7 character string) took 103,342
    invocations of allign
  • Why?
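
One way to see why is to count the calls instead of computing a score. The sketch below assumes the recursion stops as soon as either string is empty and makes three recursive calls otherwise. The exact totals depend on how the base case is written, so they need not match the figure quoted above, but the growth pattern is the same. The name `count_calls` is illustrative.

```python
def count_calls(m, n):
    """Number of invocations of the naive recursion on strings of length m and n."""
    if m == 0 or n == 0:
        return 1  # base case: this call, no further recursion
    # This call plus the three sub-problems it spawns.
    return (1
            + count_calls(m - 1, n - 1)
            + count_calls(m, n - 1)
            + count_calls(m - 1, n))


for m, n in [(2, 2), (4, 4), (6, 6), (8, 7)]:
    print(m, n, count_calls(m, n))
```

Each additional character multiplies the work several times over, because every call branches into three more calls and the same sub-problems are solved again and again.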

27
Is exponential bad?
  • Aligning two strings of size 500
  • More invocations of align than there are
    subatomic particles in the universe
  • If each invocation took one nanosecond
  • Universe is 14 billion years old
  • It would take 8.2 × 10^208 times the age of the
    universe to calculate the alignment score

Exponential is bad. Corollary: a recursion tree is exponential
28
Complexity analysis
  • Fixed: best
  • Linear: next best
  • Polynomial (n^2): not bad
  • Exponential (3^n): very bad
  • Big O notation
  • O(1), O(n), O(n^3), O(3^n)

Big O Notation
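
To get a feel for how differently these classes grow, the following Python sketch tabulates n, n^2, n^3, and 3^n for a few input sizes. The function name `growth` and the chosen sizes are arbitrary and for illustration only.

```python
def growth(n):
    """Return the step counts n, n^2, n^3, and 3^n for input size n."""
    return {"n": n, "n^2": n ** 2, "n^3": n ** 3, "3^n": 3 ** n}


for n in (5, 10, 20, 40):
    row = growth(n)
    print(row["n"], row["n^2"], row["n^3"], row["3^n"])
```

At n = 10 the polynomial columns are 100 and 1000 while 3^n is already 59049, and the gap widens rapidly from there.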
29
Next lesson
  • Speeding things up
  • Dynamic programming solution

Dynamic Programming