Coevolving Solutions to the Shortest Common Superstring Problem

1
Coevolving Solutions to the Shortest Common
Superstring Problem
  • Assaf Zaritsky, Moshe Sipper
  • Ben-Gurion University, Israel
  • www.cs.bgu.ac.il/assafza

2
Outline
  • The Shortest Common Superstring problem.
  • DNA sequencing and the input domain.
  • Simple genetic algorithm (GA).
  • Cooperative coevolutionary algorithm.
  • Experimental results.

3
The Shortest Common Superstring Problem (SCS)
  • Let S = {s1, ..., sn} be a set of strings (blocks)
    over some alphabet Σ. A superstring of S is a
    string x such that each si in S is a substring of
    x.
  • Problem: Find a shortest (common) superstring of S.
  • NP-Complete.
  • MAX-SNP hard.

4
SCS Example
  • S = {ate, half, lethal, alpha, alfalfa}
  • A trivial superstring is atehalflethalalphaalfalfa,
    of length 25 (a simple concatenation of all
    blocks).
  • A shortest common superstring is
    lethalphalfalfate, of length 17.
  • Note that any permutation of the blocks, compressed
    by overlapping consecutive blocks as far as
    possible, is itself a superstring.
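The example above can be checked with a few lines of Python. This is a minimal sketch; the helper names `overlap`, `compress`, and `is_superstring` are hypothetical and do not come from the slides. It compresses a permutation of the blocks by overlapping each block with the string built so far.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def compress(blocks: list[str]) -> str:
    """Merge blocks left to right, reusing overlaps with the string so far."""
    result = ""
    for block in blocks:
        if block in result:  # block is already covered
            continue
        result += block[overlap(result, block):]
    return result


def is_superstring(x: str, blocks: list[str]) -> bool:
    """True if every block occurs as a substring of x."""
    return all(block in x for block in blocks)


blocks = ["ate", "half", "lethal", "alpha", "alfalfa"]
trivial = "".join(blocks)  # plain concatenation, length 25
short = compress(["lethal", "alpha", "half", "alfalfa", "ate"])
print(len(trivial), short, len(short))
```

Under these assumptions, the permutation lethal, alpha, half, alfalfa, ate compresses to the 17-character superstring given on the slide.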

5
Approximation Algorithms
  • Several linear approximation algorithms for SCS have been
    proposed, most of which rely on greedy
    approaches.
  • GREEDY:
  • The most widely used heuristic in DNA
    sequencing.
  • Conjecture (Blum 1994, Sweedyk 1999): the superstring
    produced by GREEDY is at most twice as long as
    the optimal one.
  • We are not aware of any previous evolutionary
    approach to the SCS problem.
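The slides name GREEDY without spelling it out. The sketch below follows the usual textbook formulation: repeatedly merge the pair of strings with the largest overlap until one string remains. The function names and the tie-breaking order are assumptions made for illustration.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(blocks: list[str]) -> str:
    """Textbook GREEDY: merge the maximally overlapping pair until one string is left."""
    # Drop duplicates and blocks contained in another block (assumed preprocessing).
    unique = sorted(set(blocks))
    strings = [s for s in unique if not any(s != t and s in t for t in unique)]
    while len(strings) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(strings):
            for j, b in enumerate(strings):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:  # ties keep the first pair found
                        best_k, best_i, best_j = k, i, j
        merged = strings[best_i] + strings[best_j][best_k:]
        strings = [s for idx, s in enumerate(strings) if idx not in (best_i, best_j)]
        strings.append(merged)
    return strings[0] if strings else ""


blocks = ["ate", "half", "lethal", "alpha", "alfalfa"]
result = greedy_superstring(blocks)
print(result, len(result))
```

The result depends on how ties between equal overlaps are broken, so different implementations may return different superstrings for the same input.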

6
DNA Sequencing
The most common application of the SCS problem.
7
The Input Domain
The input strings used in the experiments were
inspired by DNA sequencing.
8
Input Generation Setup Parameters
N.B.: increasing the number of blocks results in
exponential growth of the problem's complexity.
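One rough way to see this growth, assuming candidate solutions are orderings of the blocks (an assumption, since the slide does not state how complexity is measured), is to count the orderings of n blocks, which is n!. The helper name `ordering_count` is hypothetical.

```python
import math


def ordering_count(num_blocks: int) -> int:
    """Number of distinct orderings (permutations) of num_blocks blocks."""
    return math.factorial(num_blocks)


# Block counts used in the experiments reported later in the talk.
for n in (50, 80):
    digits = len(str(ordering_count(n)))
    print(f"{n} blocks: about 10^{digits - 1} orderings")
```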
9
Simple GA for the SCS Problem
  • Given a set of strings as input, generate initial
    population of random candidate solutions.
  • The fitness of each individual depends on its
    length and accuracy.
  • The GA uses selection, recombination, and
    mutation to create the next generation, each
    individual of which is then evaluated.
  • These steps are repeated a predefined number of
    times or until the solution is deemed
    satisfactory.
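The loop described above can be sketched as follows. This is not the authors' implementation. It assumes a permutation representation, so every block is always covered and fitness reduces to the length of the compressed superstring. Tournament selection, order crossover, swap mutation, and all parameter values are illustrative choices.

```python
import random


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def compress(blocks):
    """Merge blocks left to right, reusing overlaps with the string so far."""
    result = ""
    for block in blocks:
        if block not in result:
            result += block[overlap(result, block):]
    return result


def order_crossover(p1, p2, rng):
    """Copy a slice of p1, then fill the remaining slots in p2's order."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    kept = set(p1[i:j + 1])
    fill = iter(g for g in p2 if g not in kept)
    return [p1[k] if i <= k <= j else next(fill) for k in range(len(p1))]


def swap_mutation(perm, rng, rate):
    """With probability rate, swap two randomly chosen positions."""
    perm = perm[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(perm)), 2)
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def simple_ga(blocks, pop_size=40, generations=60, mutation_rate=0.3, seed=0):
    """Evolve permutations of block indices; shorter superstrings are fitter."""
    rng = random.Random(seed)
    n = len(blocks)

    def length(perm):
        return len(compress([blocks[g] for g in perm]))

    def tournament(population):
        a, b = rng.sample(population, 2)
        return a if length(a) <= length(b) else b

    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    best = min(population, key=length)
    for _ in range(generations):
        population = [
            swap_mutation(
                order_crossover(tournament(population), tournament(population), rng),
                rng, mutation_rate)
            for _ in range(pop_size)
        ]
        best = min(population + [best], key=length)
    return compress([blocks[g] for g in best])


blocks = ["ate", "half", "lethal", "alpha", "alfalfa"]
solution = simple_ga(blocks)
print(solution, len(solution))
```

A representation that allows missing or repeated blocks would need an accuracy term in the fitness as well, which is what the "length and accuracy" wording on the slide suggests.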

10
Simple GA for the SCS Problem (contd)
  • Blocks of the input set are atomic components.
  • Representation.
  • Permutation representation: good or bad?
  • Evaluation.
  • Genetic operators: selection, recombination,
    mutation.

11
Coevolution
  • Simultaneous evolution of two or more species
    with coupled fitness.
  • Coevolving species either compete or cooperate.

12
Cooperative Coevolution
13
Cooperative Coevolution (contd)
  • Cooperative Coevolution involves a number of
    independently evolving species.
  • Interaction between species occurs via fitness
    function only.
  • The fitness of an individual depends on its
    ability to collaborate with individuals from
    other species.

14
Cooperative Coevolution (contd)
Source: Potter and DeJong (1997)
15
Cooperative Coevolutionary Algorithm for the SCS
problem
  • Two species evolve simultaneously.
  • First species contains prefixes of candidate
    solutions to the SCS problem at hand.
  • Second species contains candidate suffixes.
  • Fitness of an individual in each species depends
    on how well it interacts with representatives
    from the other species to construct a global solution.
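The evaluation of one species against a representative of the other can be sketched as below. The slides do not give the merge rule or the fitness formula. This sketch assumes that individuals are lists of blocks, that a prefix and a suffix are merged by compressing their concatenation, and that the cost is the candidate's length plus a penalty for each uncovered block. All names and the penalty value are hypothetical.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def compress(blocks):
    """Merge blocks left to right, reusing overlaps with the string so far."""
    result = ""
    for block in blocks:
        if block not in result:
            result += block[overlap(result, block):]
    return result


def cost(candidate, blocks):
    """Lower is better: length plus an assumed penalty per uncovered block."""
    penalty = sum(len(b) for b in blocks)
    missing = sum(1 for b in blocks if b not in candidate)
    return len(candidate) + penalty * missing


def evaluate_species(population, representative, blocks, role):
    """Score each individual by merging it with the other species' representative."""
    scores = []
    for individual in population:
        if role == "prefix":
            candidate = compress(individual + representative)
        else:  # role == "suffix"
            candidate = compress(representative + individual)
        scores.append(cost(candidate, blocks))
    return scores


blocks = ["ate", "half", "lethal", "alpha", "alfalfa"]
prefixes = [["lethal", "alpha"], ["ate", "half"]]
suffix_representative = ["half", "alfalfa", "ate"]
print(evaluate_species(prefixes, suffix_representative, blocks, "prefix"))
```

Here the first prefix combines with the representative suffix into a complete 17-character superstring, while the second leaves two blocks uncovered and is penalized accordingly.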

16
Cooperative Coevolutionary Algorithm for the SCS
problem (evaluation process)
Merge
17
Cooperative Coevolutionary Algorithm for the SCS
problem (evaluation process)
Evaluate
18
Experiments
Compare GREEDY, the simple GA, and cooperative
coevolution.
19
Experimental Setup
Each type of GA was executed twice on each
problem instance; the better run of the two was
used for statistical purposes.
20
Results: Experiment I (50 blocks)
21
Results: Experiment II (80 blocks)
22
Results Summary
[Table: size of shortest common superstring, by algorithm (Greedy, Genetic, Cooperative) and problem size (50 blocks, 80 blocks); values not in transcript]
23
Conclusions
  • Evolutionary algorithms can be applied to the SCS
    problem.
  • Cooperative Coevolution outperforms simple GA and
    GREEDY.
  • The collaboration between the two populations
    results in a good decomposition of the problem
    into two smaller sub-problems.

24
Future Work
  • Tackle larger problem instances.
  • Construct new species on the fly (as suggested by
    Potter and DeJong 2000).
  • Improved method: the puzzle algorithm.

26
Current Work - The Puzzle Algorithm
27
Puzzle Algorithm: The Idea
  • Improve the recombination operator.
  • Preserve good building blocks discovered by the GA
    by selecting recombination loci that do not
    destroy them.
  • Result: assembly of good building blocks to
    construct better solutions (as in a puzzle).