Geometric Crossover for Biological Sequences

1
EuroGP 2006
Geometric Crossover for Biological Sequences
Alberto Moraglio, Riccardo Poli & Rolv Seehuus
2
Contents
  • Geometric Crossover
  • Geometric Crossover for Sequences
  • Is Biological Recombination Geometric?

3
I. Geometric Crossover
4
Geometric Crossover
  • Representation-independent generalization of
    traditional crossover
  • Informally: all offspring are between parents
  • Search space: all offspring are on shortest paths
    connecting parents

5
Geometric Crossover & Distance
  • Search space is a metric space: d(A,B) = length of
    shortest paths between A and B
  • Metric space: all offspring C are in the segment
    between parents
  • $C \in [A,B]_d \iff d(A,C) + d(C,B) = d(A,B)$
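
The segment condition above can be checked mechanically for any distance function. The following is a minimal sketch, not code from the talk: the names `in_segment` and `abs_diff` and the tolerance parameter are illustrative assumptions.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def in_segment(a: T, b: T, c: T, d: Callable[[T, T], float], tol: float = 1e-9) -> bool:
    """Return True if c lies in the metric segment [a, b] under distance d,
    i.e. d(a, c) + d(c, b) == d(a, b) up to a small tolerance."""
    return abs(d(a, c) + d(c, b) - d(a, b)) <= tol


def abs_diff(x: int, y: int) -> float:
    """Toy metric on integers, used only to illustrate the check."""
    return abs(x - y)


print(in_segment(2, 9, 5, abs_diff))   # True: 3 + 4 == 7
print(in_segment(2, 9, 11, abs_diff))  # False: 9 + 2 != 7
```

A crossover is geometric under d when every offspring it can produce passes this check for its two parents.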

6
Example 1: Traditional Crossover
  • Traditional Crossover is Geometric Crossover
    under Hamming Distance

Parent1: 011101
Parent2: 010111
Child:   011111
$HD(P1,C) + HD(C,P2) = HD(P1,P2)$, i.e. $1 + 1 = 2$
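
The example on this slide can be reproduced with a small mask-based crossover. This is a sketch under assumptions: the function names are illustrative, and the convention that a mask bit of 1 copies from the second parent is a choice made here, not something the slide states.

```python
import random


def hamming(x: str, y: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(x, y))


def mask_crossover(p1: str, p2: str, mask: str) -> str:
    """Copy position i from p2 where mask[i] == '1', otherwise from p1."""
    return "".join(b if m == "1" else a for a, b, m in zip(p1, p2, mask))


p1, p2 = "011101", "010111"
child = mask_crossover(p1, p2, "000011")
print(child)  # 011111
print(hamming(p1, child), hamming(child, p2), hamming(p1, p2))  # 1 1 2

# The segment equality holds for every mask, not just this one.
rng = random.Random(0)
for _ in range(100):
    mask = "".join(rng.choice("01") for _ in p1)
    c = mask_crossover(p1, p2, mask)
    assert hamming(p1, c) + hamming(c, p2) == hamming(p1, p2)
```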
7
Example 2: Blending Crossover
  • Blending Crossover for real vectors is geometric
    under Euclidean Distance

$ED(P1,C) + ED(C,P2) = ED(P1,P2)$
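
As an illustration, the sketch below assumes that blending means taking a point on the line segment between the parents, with a single mixing weight `alpha` in [0, 1]. The slide does not give a formula, so this form and the function name are assumptions.

```python
import math


def blend_crossover(p1: list[float], p2: list[float], alpha: float) -> list[float]:
    """Return (1 - alpha) * p1 + alpha * p2, a point on the segment p1..p2."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1] to stay between the parents")
    return [(1 - alpha) * a + alpha * b for a, b in zip(p1, p2)]


p1, p2 = [0.0, 0.0], [4.0, 3.0]
child = blend_crossover(p1, p2, 0.25)
lhs = math.dist(p1, child) + math.dist(child, p2)

print(child)                                 # [1.0, 0.75]
print(math.isclose(lhs, math.dist(p1, p2)))  # True: 1.25 + 3.75 == 5.0
```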
8
Many Recombinations are Geometric
  • Traditional Crossover for multary strings
  • Box and Discrete recombinations for real vectors
  • PMX, Cycle and Order Crossovers for permutations
  • Homologous Crossover for GP trees
  • Ask me for more examples over a coffee!

9
Being a geometric crossover is important because:
  • We know how the search space is going to be
    searched by geometric crossover for any
    representation: convex search
  • We know a rule of thumb on what type of
    landscapes geometric crossover will perform well:
    smooth landscapes
  • This is just the beginning of a general theory; in
    the future we will know more!

10
II. Geometric Crossover for Sequences
11
Sequences & Edit Distance
  • Sequence: variable-length string of characters
    from an alphabet A
  • Edit distance: minimum number of edit operations
    (insertion, deletion, substitution) needed to
    transform one sequence into the other
  • A = {a,c,t,g}, Seq1 = agcacaca, Seq2 = acacacta
  • Seq1 = agcacaca → acacaca → acacacta = Seq2
  • ED(Seq1,Seq2) = 2 (g deleted, t inserted)
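
The edit distance defined above can be computed with the standard dynamic-programming recurrence. This is a minimal sketch with unit costs for all three operations; the function name is illustrative.

```python
def edit_distance(s: str, t: str) -> int:
    """Unit-cost edit distance (insertion, deletion, substitution)."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty string
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from s
                curr[j - 1] + 1,         # insert b into s
                prev[j - 1] + (a != b),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


print(edit_distance("agcacaca", "acacacta"))  # 2
```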

12
Sequence Alignment (on contents)
  • Alignment: put spaces (-) in both sequences so
    that they become the same length
  • Seq1: agcacac-a
  • Seq2: a-cacacta
  • Alignment score: number of mismatches = 2
  • Optimal alignment: minimal-score alignment (Best
    Inexact Alignment on Contents)
  • The score of the optimal alignment of two
    sequences equals their edit distance:
    ED(Seq1,Seq2) = Score(A) = 2
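
One way to obtain an optimal alignment is to fill the edit-distance table and trace back through it. The sketch below assumes unit costs and an illustrative function name. Optimal alignments are generally not unique, so the alignment it returns may differ from the one on the slide while having the same score.

```python
def align(s: str, t: str) -> tuple[str, str, int]:
    """Return (aligned_s, aligned_t, score) for one optimal alignment."""
    n, m = len(s), len(t)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j] + 1,                           # gap in t
                cost[i][j - 1] + 1,                           # gap in s
                cost[i - 1][j - 1] + (s[i - 1] != t[j - 1]),  # match/mismatch
            )
    # Trace back from the bottom-right corner to recover the columns.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (s[i - 1] != t[j - 1]):
            a.append(s[i - 1]); b.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            a.append(s[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(t[j - 1]); j -= 1
    return "".join(reversed(a)), "".join(reversed(b)), cost[n][m]


row1, row2, score = align("agcacaca", "acacacta")
print(row1)
print(row2)
print(score)  # 2, equal to the edit distance
```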

13
Homologous Crossover
  • Optimally align the two parent sequences
  • Randomly generate a crossover mask as long as the
    alignment
  • Recombine as in traditional crossover
  • Remove dashes from the offspring

Mask:           111111000
Seq1:           agcacac-a
Seq2:           a-cacacta
SeqC (aligned): a-cacac-a
SeqC:           acacaca
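
The recombination steps above can be sketched as follows, starting from two parents that are already aligned. The function name is illustrative. The convention that a mask bit of 1 copies the column from Seq2 is inferred from the slide's example, which does not state it explicitly.

```python
import random


def homologous_crossover_aligned(a1: str, a2: str, mask: str) -> tuple[str, str]:
    """Recombine two aligned parents column by column.

    Returns (aligned_child, child); child has the gap symbols removed.
    Assumption: mask bit '1' copies the column from a2, '0' from a1.
    """
    if not (len(a1) == len(a2) == len(mask)):
        raise ValueError("aligned parents and mask must have equal length")
    aligned_child = "".join(y if m == "1" else x for x, y, m in zip(a1, a2, mask))
    return aligned_child, aligned_child.replace("-", "")


a1, a2 = "agcacac-a", "a-cacacta"
print(homologous_crossover_aligned(a1, a2, "111111000"))
# ('a-cacac-a', 'acacaca')

# A random mask, as in the second step of the procedure.
rng = random.Random(1)
mask = "".join(rng.choice("01") for _ in a1)
print(mask, homologous_crossover_aligned(a1, a2, mask)[1])
```

Because offspring lose their gap symbols, they can be shorter or longer than either parent.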
14
Theorem: Geometricity of HC
  • Homologous Crossover is geometric crossover under
    edit distance
  • Seq1 = agcacaca → SeqC = acacaca → acacacta = Seq2
  • $ED(Seq1,SeqC) + ED(SeqC,Seq2) = ED(Seq1,Seq2)$,
    i.e. $1 + 1 = 2$
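
For the example alignment, the theorem can be checked exhaustively over all crossover masks. This is only a sanity check on one instance, not a proof, and the helper names are illustrative.

```python
from functools import lru_cache
from itertools import product


def edit_distance(s: str, t: str) -> int:
    """Unit-cost edit distance via memoized recursion on prefix lengths."""
    @lru_cache(maxsize=None)
    def d(i: int, j: int) -> int:
        if i == 0:
            return j
        if j == 0:
            return i
        return min(d(i - 1, j) + 1,
                   d(i, j - 1) + 1,
                   d(i - 1, j - 1) + (s[i - 1] != t[j - 1]))
    return d(len(s), len(t))


def offspring_in_segment(a1: str, a2: str) -> set[str]:
    """Enumerate every mask on the aligned parents a1, a2 and assert that
    each offspring satisfies ED(s1, c) + ED(c, s2) == ED(s1, s2)."""
    s1, s2 = a1.replace("-", ""), a2.replace("-", "")
    total = edit_distance(s1, s2)
    children = set()
    for mask in product("01", repeat=len(a1)):
        child = "".join(y if m == "1" else x
                        for x, y, m in zip(a1, a2, mask)).replace("-", "")
        assert edit_distance(s1, child) + edit_distance(child, s2) == total
        children.add(child)
    return children


print(sorted(offspring_in_segment("agcacac-a", "a-cacacta")))
```

The two parents themselves appear among the offspring, since the all-zeros and all-ones masks copy one parent unchanged.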

15
More theory on HC in the paper
  • Extension to weighted edit distances
  • Extension to block ins/del edit distances
  • Peculiarity of metric segments in edit distance
    spaces
  • Bounds on offspring size due to parents size

16
III. Is Biological Recombination Geometric?
17
Recombination at a molecular level
  • DNA strands align on the contents, not
    positionally
  • DNA strands are flexible and can be stretched or
    folded to align better with each other
  • DNA strands do not need to be aligned at the
    extremities
  • Some pair matchings are preferred to others
  • DNA strands can form loops
  • Crossover points tend to occur where DNA strands
    align better
  • Not all details worked out yet!

18
Homologous Crossover as a Model of Biological
Recombination
Many possible variants of edit distance that fit
many real requirements of biological
recombination
19
Minimum Free Energy Edit Distance
  • DNA strands align optimally according to edit
    distance because:
  • (i) The alignment of two DNA strands
    (macromolecules) obeys chemistry: it is the state
    of minimum free energy
  • (ii) The weights of the edit moves can be
    interpreted as repulsion forces at the level of a
    single base
  • (iii) The best alignment under edit distance is the
    best trade-off, in which the global effect of
    repulsion forces is minimized: the minimum free
    energy alignment

20
Is Biological Recombination Geometric? Yes?!
21
So what?
22
Bridging Natural and Artificial Evolution
  • Bridging natural and artificial evolution into a
    common theoretical framework
  • Change in perspective: this allows us to study real
    biological evolution as a computational process
  • In the paper we use geometric arguments to claim
    that biological evolution performs efficient
    adaptation!

23
Summary
  • Geometric crossover
      • Geometric crossover: offspring are between
        parents
      • Many recombinations are geometric
      • Some general theory for geometric crossover
  • Homologous crossover
      • Homologous crossover for sequences: alignment
        on contents before recombination
      • Homologous crossover is geometric under edit
        distance
  • Biological recombination
      • Homologous crossover models biological
        recombination at DNA level, so it is geometric
      • Geometric theory applies to biological
        recombination, bridging biological and
        artificial evolution

24
Questions?