Biological Computing - PowerPoint PPT Presentation

Provided by: josipd
Learn more at: http://www.cs.gsu.edu

Title: Biological Computing


1
Biological Computing: DNA solution
Presented by Wooyoung Kim, 4/8/09
CSc 8530 Parallel Algorithms, Spring 2009, Dr. Sushil K. Prasad
2
Outline
  • NP and NP-complete
  • Biological computation
  • Hamiltonian path problem (HPP)
  • Satisfaction problem
  • Generalized SAT
  • Discussion

3
NP and NP-complete
  • NP vs. NP-complete
  • NP problems: solvable in non-deterministic
    polynomial time, i.e. a proposed solution can be
    verified in polynomial time.
  • NP-complete: a problem in NP to which all NP
    problems can be reduced; if it has an efficient
    solution, then so do all NP problems.
  • No efficient general solution is known for any
    NP-complete problem.

4
Biological computation: advantages
  • The speed of any computer is determined by two
    factors:
  • How many parallel processes it has.
  • How many steps each process can perform per unit
    time.
  • Biological computations could potentially have
    vastly more parallelism.
  • Ex.: 3 g of water contains more than $10^{22}$
    molecules.
  • The second factor favors conventional computers,
    since a biological machine completes only a small
    fraction of an experiment per unit time.
  • However, the advantage in parallelism is so large
    that the slower individual steps are not a
    problem.

5
Biological computation: disadvantages
  • Even with massive parallelism, a brute-force
    approach is not always feasible: the search space
    grows exponentially with problem size.
  • With brute force, a biological computer can solve
    any HPP instance of 70 or fewer edges.
  • In practice, though, there is little need to
    solve instances of that size.

6
Hamiltonian Path Problem
  • L.M. Adleman. "Molecular Computation of Solutions
    to Combinatorial Problems," Science, vol. 266,
    1994, pp. 1021-1024.
  • Uses DNA to solve a small instance of the
    Hamiltonian Path Problem (HPP) by massively
    parallel search.

7
Hamiltonian Path Problem
0 → 1 → 2 → 3 → 4 → 5 → 6
8
Algorithm for HPP
  1. Generate random paths through the graph.
  2. Keep only those paths that begin with $v_{in}$
    and end with $v_{out}$.
  3. If the graph has n vertices, keep only those
    paths that enter exactly n vertices.
  4. Keep only those paths that enter each vertex of
    the graph at least once.
  5. If any paths remain, say "Yes"; otherwise say
    "No".
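
The five steps above can be sketched in software as a generate-and-filter search. This is only an illustration: the graph `EDGES`, the stopping probability and the sample count are assumptions made for this example (the transcript does not include the original graph), and a set of Python tuples stands in for the DNA molecules in the test tube.

```python
import random

# Hypothetical 7-vertex directed graph, chosen for illustration only.
EDGES = {
    0: [1, 3],
    1: [2, 4],
    2: [3],
    3: [4, 2],
    4: [5, 1],
    5: [6, 2],
    6: [],
}


def random_path(edges, rng, max_len):
    """Step 1: start at a random vertex and follow random edges."""
    path = [rng.choice(list(edges))]
    # Stop at a dead end, at max_len, or at random (probability 0.1 per step).
    while len(path) < max_len and edges[path[-1]] and rng.random() < 0.9:
        path.append(rng.choice(edges[path[-1]]))
    return tuple(path)


def hamiltonian_path_exists(edges, v_in, v_out, samples=20_000, seed=0):
    rng = random.Random(seed)
    n = len(edges)
    pool = {random_path(edges, rng, 2 * n) for _ in range(samples)}  # step 1
    pool = {p for p in pool if p[0] == v_in and p[-1] == v_out}      # step 2
    pool = {p for p in pool if len(p) == n}                          # step 3
    pool = {p for p in pool if set(p) == set(edges)}                 # step 4
    return bool(pool), pool                                          # step 5


found, paths = hamiltonian_path_exists(EDGES, v_in=0, v_out=6)
print("Yes" if found else "No", sorted(paths))
```

Unlike the test tube, this sketch samples paths one at a time, so a "No" answer only means that no Hamiltonian path was sampled.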

9
Implementing Step 1
  • Generate random paths through the graph.
  • Ligation reaction (annealing)
  • Each vertex i is encoded by a random 20 bp
    sequence (Oi).
  • Approximately $3 \times 10^{13}$ copies of each
    oligonucleotide (a short nucleic acid polymer)
    were added.

Vertex 2 (O2): TATCGGATCG GTATATCCGA
Vertex 3 (O3): GCTATTCGAG CTTAAAGCTA
Edge 2→3: GTATATCCGA GCTATTCGAG
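
The example strings on this slide are consistent with a simple rule: the edge i→j is the last 10 bases of Oi followed by the first 10 bases of Oj. The sketch below checks that rule against the slide's sequences. The function names are hypothetical, and the base-wise complement ignores strand orientation, which is a simplification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def edge_oligo(o_i, o_j):
    """Edge i->j: second half of O_i followed by first half of O_j."""
    half = len(o_i) // 2
    return o_i[half:] + o_j[:half]


def watson_crick(strand):
    """Base-wise Watson-Crick complement (orientation ignored here)."""
    return strand.translate(COMPLEMENT)


O2 = "TATCGGATCGGTATATCCGA"
O3 = "GCTATTCGAGCTTAAAGCTA"

O2_3 = edge_oligo(O2, O3)
print(O2_3)  # GTATATCCGAGCTATTCGAG, the edge string shown on the slide

# The complement of O3 pairs with the second half of edge 2->3, so it can
# hold that edge next to any edge that leaves vertex 3 during annealing.
print(watson_crick(O3)[:10] == watson_crick(O2_3[10:]))  # True
```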
10
Implementing Step 2
  • Keep only those paths that begin with $v_{in}$
    (O0) and end with $v_{out}$ (O6).
  • The product of step 1 was amplified by PCR
    (polymerase chain reaction) using primers for O0
    (starting point) and O6 (ending point).
  • Thus only those molecules encoding paths that
    begin with $v_{in}$ and end with $v_{out}$ were
    amplified.

11
Implementing Step 3
  • If the graph has n vertices, keep only those
    paths that enter exactly n vertices.
  • The product of step 2 was run on an agarose gel.
  • The 140 bp band (corresponding to double-stranded
    (ds) DNA encoding paths that enter exactly seven
    vertices, 7 × 20 bp) was excised and soaked in
    ddH2O to extract the DNA.

12
Implementing Step 4
  • Keep only those paths that enter each vertex of
    the graph at least once.
  • The product of step 3 was affinity-purified with
    a biotin-avidin magnetic bead system by:
  • First generating single-stranded (ss) DNA from
    the dsDNA of step 3.
  • Then incubating the ssDNA with the complement of
    O1 conjugated to magnetic beads.
  • Only those ssDNA molecules containing O1 annealed
    to the bound complement and were retained.
  • The process was repeated in turn for O2 through
    O5.

13
Implementing Step 5
  • If any paths remain, say "Yes"; otherwise say
    "No".
  • The product of step 4 was amplified by PCR and
    run on a gel.

14
Drawbacks
  • 7 days of lab work.
  • Step 4 (magnetic bead separation) is the most
    labor-intensive step.
  • Possibility of errors:
  • Pseudo-paths
  • Inexact reactions
  • Hairpin loops

15
Advantages
  • The number of different oligonucleotides required
    grows linearly with the number of edges:
  • O(n)
  • The fastest supercomputer vs. a DNA computer:
  • $10^{6}$ op/sec vs. $10^{14}$ op/sec
  • $10^{9}$ op/J vs. $10^{19}$ op/J (in the ligation
    step)
  • 1 bit per $10^{12}$ nm$^3$ vs. 1 bit per 1 nm$^3$
    (videotape vs. molecules)

16
Satisfaction problem
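  • SAT (Boolean satisfiability) consists of a Boolean
    formula of the form
    $C_1 \wedge C_2 \wedge \dots \wedge C_m$, where
    each $C_i$ is a clause of the form
    $V_1 \vee V_2 \vee \dots \vee V_k$.
    Each $V_j$ is a variable or its negation. Ex.:
    $(x \vee y) \wedge (\neg x \vee \neg y)$.
  • Problem: find values of the variables so that
    the formula is 1.
  • If we have n variables, then there are $2^n$
    choices to search.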
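
As a point of comparison, a conventional brute-force search over all assignments can be sketched as follows. The clause encoding (lists of `(variable_index, wanted_value)` pairs) is an assumption made for this example, and the formula used is the two-variable example that the later slides work through, (x OR y) AND (NOT x OR NOT y).

```python
from itertools import product

# Each clause is a list of (variable_index, wanted_value) literals.
# (x OR y) AND (NOT x OR NOT y), with x at index 0 and y at index 1.
FORMULA = [[(0, 1), (1, 1)], [(0, 0), (1, 0)]]


def satisfying_assignments(formula, n):
    """Try all 2**n assignments; keep those that satisfy every clause."""
    return [
        bits
        for bits in product((0, 1), repeat=n)
        if all(any(bits[i] == v for i, v in clause) for clause in formula)
    ]


solutions = satisfying_assignments(FORMULA, n=2)
print(solutions)  # [(0, 1), (1, 0)]
```

The loop visits the candidates one after another; the DNA approach instead holds all of them in one test tube at the same time.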

17
Satisfaction problem
  • Graph formulation

unprimed → 1
primed → 0
  • Suppose the formula has n variables. The graph
    has nodes $a_1, \dots, a_{n+1}$, with an unprimed
    and a primed node for each variable between
    consecutive $a_i$.
  • The graph is constructed so that every path from
    $a_1$ to $a_{n+1}$ encodes an n-bit binary number.
  • At each stage, a path has exactly two choices:
    unprimed → 1, primed → 0.
  • Ex.: the path $a_1 \to x' \to a_2 \to y \to a_3$
    encodes 01, that is, x is 0 and y is 1.
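
A minimal sketch of this graph, assuming node names `a1..a(n+1)` for the junctions and `xk` / `xk'` for the unprimed and primed node of variable k (the naming is hypothetical). It enumerates every path from the first to the last junction and decodes each path into its bit string.

```python
def build_graph(n):
    """Junction nodes a1..a(n+1) with an unprimed/primed pair in between."""
    edges = {}
    for k in range(1, n + 1):
        a, nxt = f"a{k}", f"a{k + 1}"
        one, zero = f"x{k}", f"x{k}'"   # unprimed -> 1, primed -> 0
        edges[a] = [one, zero]
        edges[one] = [nxt]
        edges[zero] = [nxt]
    edges[f"a{n + 1}"] = []
    return edges


def all_paths(edges, node, goal):
    """Every directed path from node to goal (the graph is acyclic)."""
    if node == goal:
        return [[node]]
    return [[node] + rest
            for nxt in edges[node]
            for rest in all_paths(edges, nxt, goal)]


def decode(path):
    """Read the bit string off a path: primed node = 0, unprimed node = 1."""
    return "".join("0" if v.endswith("'") else "1"
                   for v in path if v.startswith("x"))


graph = build_graph(2)
codes = sorted(decode(p) for p in all_paths(graph, "a1", "a3"))
print(len(graph), codes)  # 7 ['00', '01', '10', '11']
```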

18
Satisfaction problem
  • Example
  • Number of variables: n = 2 (x and y)
  • Number of clauses: m = 2
  • Construct a graph with $(n+1) + 2n = 7$ nodes and
    connect them as follows:

19
Satisfaction problem
  • Graph paths and SAT problem
  • If a path from $a_1$ to $a_{n+1}$ survives the
    extraction steps, each variable has been assigned
    0 or 1 and the formula is satisfied.
  • If no path from start to end remains, the
    formula has no solution (it is not
    satisfiable).
  • Using the properties of DNA annealing
    (Watson-Crick complement binding), we can
    construct a graph representing the variables, and
    using test tubes, we end up with either paths
    (satisfiable) or no paths at all (not
    satisfiable).

20
Satisfaction problem
  • Assign a random DNA string to each vertex
    (e.g. of length 8).
  • Then derive the DNA string of each edge from the
    strings of its two endpoints.


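[Figure: example graph whose vertices and edges are labeled with the 8-base DNA strings TATCCCGA, GGCTCGTT, GCAACCTA, CCTTATAG, GGCTAATG, CCCACCGA, ATTCGGAA, TTACGGGT, GGATTCCA, CCCAGGGT, TAATCCTA, CCTTCGAT, TCGAAATG, CCCAATTA, GCTAAGCT.]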
21
Satisfaction problem
  • In an initial test tube t0, put many copies of
    the DNA strings corresponding to each vertex and
    each edge.
  • Also add the complement of the first half of a1
    and the complement of the last half of a3 to mark
    the start and the end of each path.


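[Figure: contents of t0. Vertex and edge strings (TATCCCGA, GGCTCGTT, ATTCGGAA, TTACGGGT, GGATTCCA, CCCAATTA, GCTAAGCT) plus the start marker TAAG, the complement of ATTC, and the end marker AGGT, the complement of TCCA.]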
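
The strings listed on these two slides fit a simple encoding rule, sketched here as an assumption: an edge strand is the complement of the last half of its source vertex followed by the complement of the first half of its target vertex. Which strings play the roles of a1 and a3 is inferred from the markers TAAG and AGGT, and the name `v` for a neighbour of a1 is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def comp(strand):
    """Base-wise Watson-Crick complement."""
    return strand.translate(COMPLEMENT)


def edge_strand(src, dst):
    """Complement of src's last half, then complement of dst's first half."""
    half = len(src) // 2
    return comp(src[half:]) + comp(dst[:half])


a1 = "ATTCGGAA"   # assumed start vertex (its first half pairs with TAAG)
a3 = "GGATTCCA"   # assumed end vertex (its last half pairs with AGGT)
v = "TATCCCGA"    # a vertex string from the slide, assumed adjacent to a1

print(edge_strand(a1, v))   # CCTTATAG, which appears in the slide's list
start_marker = comp(a1[:4])
end_marker = comp(a3[4:])
print(start_marker, end_marker)  # TAAG AGGT
```

With this rule an edge strand overlaps half of each endpoint, so annealing can chain vertex and edge strands into a full path.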
22
Satisfaction problem
  1. Let t0 be an initial test tube containing all the
    DNA strings of vertices and edges.
  2. Since the first clause is $(x \vee y)$, first
    apply E(t0, 1, 1), which extracts the strands
    whose first variable x is 1 (patterns 10 and 11),
    and put them in t0-1.
  3. Put the remainder (patterns 00 and 01) in t0-1'
    and apply E(t0-1', 2, 1), which extracts the
    strands whose second variable y is 1, and put
    them in t0-2.
  4. Pour t0-1 and t0-2 together to form test tube
    t1.
  5. t1 now holds the patterns 01, 10 and 11, exactly
    the assignments that satisfy the first clause.

23
Satisfaction problem
  1. Repeat the same process for the second clause,
    starting from t1.
  2. Since the second clause is
    $(\neg x \vee \neg y)$, apply E(t1, 1, 0) and put
    the extracted strands in test tube t1-1.
  3. Put the remainder in t1-1' and form t1-2 by
    applying E(t1-1', 2, 0).
  4. Pour t1-1 and t1-2 together into test tube t2.
  5. Check whether any DNA is left in the last
    tube.
  6. The satisfying assignments are exactly those in
    this final test tube.

24
Satisfaction problem

Test tube   Operation                      Values
t0          initial                        00, 01, 10, 11
t0-1        E(t0, 1, 1)                    10, 11
t0-1'       remainder of t0                00, 01
t0-2        E(t0-1', 2, 1)                 01
t1          pour t0-1 and t0-2 together    01, 10, 11
t1-1        E(t1, 1, 0)                    01
t1-1'       remainder of t1                10, 11
t1-2        E(t1-1', 2, 0)                 10
t2          pour t1-1 and t1-2 together    01, 10
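
The table can be reproduced with a short simulation in which a Python set of bit strings stands in for a test tube. This is a sketch under that assumption; the function names and the clause encoding (lists of `(variable_position, value)` pairs, positions counted from 1) are chosen for this example.

```python
from itertools import product


def extract(tube, i, value):
    """E(t, i, a): the strands in `tube` whose i-th bit (1-based) is `value`."""
    return {s for s in tube if s[i - 1] == str(value)}


def solve(clauses, n):
    tube = {"".join(bits) for bits in product("01", repeat=n)}   # t0
    for clause in clauses:
        remainder, kept = set(tube), set()
        for i, value in clause:
            hit = extract(remainder, i, value)   # one extraction step
            kept |= hit                          # pour into the clause's tube
            remainder -= hit                     # what is left for the next literal
        tube = kept                              # t1, t2, ...
    return tube


# (x OR y) AND (NOT x OR NOT y)
CLAUSES = [[(1, 1), (2, 1)], [(1, 0), (2, 0)]]
final_tube = solve(CLAUSES, n=2)
print(sorted(final_tube))  # ['01', '10']
```

Each literal costs one extraction, so the number of extraction steps grows with the total number of literals in the formula.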
25
Satisfaction problem
  • For a general formula with n variables and m
    clauses, we need only O(m) test tubes (for each
    clause of bounded length, a constant number of
    additional test tubes is constructed).
  • The last tube is checked for any remaining
    patterns (paths) from the start vertex to the end
    vertex.


26
Generalized SAT
  • Generalize this to problems that correspond to
    any Boolean formula.
  • Formulas are defined recursively:
  • Any variable x is a formula.
  • If F is a formula, then so is $\neg F$.
  • If F and G are formulas, then so are
    $F \vee G$ and $F \wedge G$.

27
Generalized SAT
  • Size of the formula, S: the number of operations
    used to build the formula.
  • SAT problem: given a formula, find an assignment
    of Boolean values to the variables so that the
    formula is true. → NP-complete.
  • Claim: O(S) DNA experiments can solve this SAT
    problem.


28
Generalized SAT step1
  • Construct a contact network for the formula.
  • A contact network is a directed graph with source
    s and sink t.
  • Each edge is labeled x or $\neg x$ for some
    variable x.
  • Given any assignment, an edge is connected if its
    label evaluates to 1.

For example, the network in the figure evaluates to 1 only if w = 1 or
x = y = z = 1.
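
A minimal sketch of evaluating a contact network under one assignment. The figure is not part of the transcript, so the network below is reconstructed from the caption alone: one edge labeled w from s to t, and a chain of edges labeled x, y, z through two intermediate nodes whose names (`u`, `v`) are hypothetical.

```python
# Each edge is (tail, head, variable, value that closes the contact).
NETWORK = [
    ("s", "t", "w", 1),
    ("s", "u", "x", 1),
    ("u", "v", "y", 1),
    ("v", "t", "z", 1),
]


def connected(network, assignment, source="s", sink="t"):
    """True if closed contacts form a directed path from source to sink."""
    reached, frontier = {source}, [source]
    while frontier:
        node = frontier.pop()
        for tail, head, var, val in network:
            if tail == node and assignment[var] == val and head not in reached:
                reached.add(head)
                frontier.append(head)
    return sink in reached


print(connected(NETWORK, {"w": 1, "x": 0, "y": 0, "z": 0}))  # True
print(connected(NETWORK, {"w": 0, "x": 1, "y": 1, "z": 1}))  # True
print(connected(NETWORK, {"w": 0, "x": 1, "y": 0, "z": 1}))  # False
```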
29
Generalized SAT step2
  • Solve the SAT problem of a contact network by
    deciding
  • whether there is an assignment of values to the
    variables such that there is a directed
    connected path from s to t.
  • If two edges have the same label, they must be
    given the same value (the assignment is
    consistent).
  • How many DNA experiments? O(S)


30
Generalized SAT claims
  • The result follows from two claims:
  • Given any formula of size S, there is a contact
    network of size linear in S such that the network
    is satisfiable exactly when the formula is.
  • Given any contact network of size S, the SAT
    problem for the network can be solved in O(S) DNA
    experiments.

31
Generalized SAT claim 1
Existence of contact network for given formula
Simple formula: any formula can be placed into a
normal form, with negations applied only to
variables, using De Morgan's laws.

32
Generalized SAT claim 1
Existence of contact network for given formula
General formula:
G is a network for E, H is a network for F.
  • (A), (B): the networks for $E \vee F$ (G and H in
    parallel) and $E \wedge F$ (G and H in series).
33
Generalized SAT claim2
  • Solve the SAT problem for any contact network
    using O(S) DNA experiments.
  • Associate a test tube Pv with each node v in the
    contact network.
  • The test tube Pt associated with the sink t is
    the answer.
  • Suppose that v → u is an edge with the label x
    and that Pv is already constructed. Then construct
    Pu by doing the extraction E(Pv, x, 1) (for the
    label $\neg x$, use E(Pv, x, 0)).
  • If several edges leave a vertex v, use an
    amplify step to get multiple copies of test tube
    Pv.
  • If several edges enter a vertex v, pour the
    resulting test tubes together to form Pv.
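
The procedure above can be simulated with sets of bit strings standing in for test tubes. This is a sketch under several assumptions: the network is acyclic, `nodes` lists its nodes in topological order from `s` to `t`, and the example network (w in parallel with the chain x, y, z, using hypothetical node names `u` and `v`) is reconstructed from the earlier caption.

```python
from itertools import product


def extract(tube, var, value, order):
    """E(P, x, a): strands whose bit for variable `var` equals `value`."""
    i = order.index(var)
    return {s for s in tube if s[i] == str(value)}


def solve_network(edges, order, nodes):
    """Fill one tube per node; the sink's tube holds the satisfying strands."""
    tubes = {v: set() for v in nodes}
    tubes["s"] = {"".join(b) for b in product("01", repeat=len(order))}
    for v in nodes:                       # P_v is complete when v is reached
        for tail, head, var, val in edges:
            if tail == v:
                # Amplify P_v, extract along the edge, pour into P_head.
                tubes[head] |= extract(tubes[v], var, val, order)
    return tubes["t"]


EDGES = [
    ("s", "t", "w", 1),
    ("s", "u", "x", 1),
    ("u", "v", "y", 1),
    ("v", "t", "z", 1),
]
answer = solve_network(EDGES, order=["w", "x", "y", "z"],
                       nodes=["s", "u", "v", "t"])
print(len(answer), "0111" in answer, "0110" in answer)  # 9 True False
```

One extraction is performed per edge, which matches the O(S) bound, and consistency between edges with the same label comes for free because every strand encodes a single assignment.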


34
Discussion
  • Can we actually build DNA computers?
  • All the methods described here assume that every
    operation is performed perfectly, without error.
  • In practice, however, the operations are not
    perfect.
  • It is hoped that DNA-based computers will become
    a practical means of solving hard problems.


35
Reference
  • R.J. Lipton. "DNA Solution of Hard Computational
    Problems," Science, vol. 268, 1995, pp. 542-545.
  • L.M. Adleman. "Molecular Computation of Solutions
    to Combinatorial Problems," Science, vol. 266,
    1994, pp. 1021-1024.
  • R.J. Lipton. "Speeding Up Computations via
    Molecular Biology," unpublished manuscript,
    available at www.cs.princeton.edu/rjl/