DNA Computing

1
DNA Computing

Charles Ormsby III
CSE 497
4/15/2004

2
Outline

DNA Computing Characteristics
Different Approaches
Lipton's Paper
DNA Solution of Hard Computational Problems
Practical Purposes
Future Work/Funding
References

DNA Computing Characteristics
(Advantages and Disadvantages)

4
DNA Computation Characteristics

Parallel Processing
Processes all possible solutions simultaneously!
Well, kind of, but it is not instantaneous
AND, it is a physical process!
Therefore, the molecular steps required to
process the solution set can take weeks
But we are finding ways to improve time efficiency!
More to come

5
DNA Computation Characteristics

Read/Write Rate of DNA
DNA replication rate: 500 base pairs per second
- 10 times faster than human cells
- Very low error rates
But that is only 1000 bits/sec (2 bits per base pair).
Compared to the data throughput of an average hard drive: SLOW!!!
Can anyone think of an advantage that DNA-based
computers might have over the way today's PCs
interact with memory?

http://www.arstechnica.com/reviews/2q00/dna/dna-2.html
6
DNA Computation Characteristics

YES, copies of the replication enzymes can work
on DNA in parallel
Bonus - Replication enzymes can start on the
second replicated strand of DNA even before
they're finished copying the first one. So
already the data rate jumps to 2000 bits/sec
Electronic computers are incapable of such a feat!

7
DNA Computation Characteristics

Read/Write Rate of DNA (cont'd)
Look what happens after each replication
iteration:
The number of DNA strands increases exponentially
(2^n after n iterations)
Data rate increases by 1000 bits/sec per strand
After 10 iterations, the replication rate is about 1 Mbit/sec
And after 30 iterations it increases to about 1000
Gbits/sec
This is well beyond the sustained data rates of
the fastest hard drives!!!
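
The growth figures above can be checked with a short sketch. It assumes each base pair carries 2 bits (four possible bases), which is what turns 500 base pairs per second into 1000 bits/sec. The function and constant names are illustrative only.

```python
# Assumption: each base pair carries 2 bits, since there are four possible bases.
BITS_PER_BASE_PAIR = 2
BASE_PAIRS_PER_SECOND = 500  # replication rate quoted on the slides


def replication_rate_bits_per_sec(iterations: int) -> int:
    """Aggregate copy rate after `iterations` doublings of the strand count."""
    strands = 2 ** iterations
    return strands * BASE_PAIRS_PER_SECOND * BITS_PER_BASE_PAIR


for n in (0, 1, 10, 30):
    rate = replication_rate_bits_per_sec(n)
    print(f"after {n:2d} iterations: {rate:,} bits/sec")
```

Ten doublings give roughly 1 Mbit/sec and thirty give roughly 1000 Gbits/sec, matching the slide.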

8
DNA Computation Characteristics

Data density (A, T, C, G)
Bases spaced every 0.35 nanometers
1-dimension: about 18 Mbytes per inch (roughly 145 Mbits, at 2 bits per base)
2-dimension: over one million Gbits per square
inch (assuming one base per square nanometer)
Typical high-performance hard drive
data density: 7 Gbits per square inch
A factor of over 100,000 smaller!!
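
A rough check of the density figures, assuming 2 bits per base and the spacings quoted above. Under these assumptions the linear density works out to about 145 Mbits (roughly 18 Mbytes) per inch, and the areal density to about 1.3 million Gbits per square inch. The constant names are illustrative.

```python
NM_PER_INCH = 2.54e7       # 1 inch = 2.54 cm = 25,400,000 nm
BITS_PER_BASE = 2          # assumption: four bases -> 2 bits each
BASE_SPACING_NM = 0.35     # spacing along the strand, from the slide
HARD_DRIVE_BITS_PER_SQ_INCH = 7e9  # 7 Gbits per square inch, from the slide

# One dimension: bases laid end to end along one inch of strand.
linear_bits_per_inch = NM_PER_INCH / BASE_SPACING_NM * BITS_PER_BASE

# Two dimensions: one base per square nanometer, as the slide assumes.
areal_bits_per_sq_inch = NM_PER_INCH ** 2 * BITS_PER_BASE

density_ratio = areal_bits_per_sq_inch / HARD_DRIVE_BITS_PER_SQ_INCH

print(f"linear: {linear_bits_per_inch / 1e6:.0f} Mbits per inch")
print(f"areal:  {areal_bits_per_sq_inch / 1e9:,.0f} Gbits per square inch")
print(f"ratio to hard drive: {density_ratio:,.0f}x")
```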

9
DNA Computation Characteristics

Double-stranded nature
- Every DNA sequence has a natural complement
If S = ATTACGTCG, then
S' = TAATGCAGC is its complement
DNA's complementary nature makes it a unique data
structure for computation and can be exploited in
many ways, such as error correction
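
The complement rule (A pairs with T, C pairs with G) is easy to sketch in code. This is a minimal illustration using the example strand from the slide; the function name is made up for the example.

```python
# Base-by-base Watson-Crick pairing: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement of a strand."""
    return strand.translate(COMPLEMENT)


s = "ATTACGTCG"
print(complement(s))  # TAATGCAGC
```

Taking the complement twice returns the original strand, which is the property the error-correction idea relies on.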

10
DNA Computation Characteristics

DNA Error Rates
Biological error rate: 1 per 10^9 copied bases
Hard drive read error rate: 1 per 10^13
Error Correction: errors occur due to many
factors, for example
- Incorrect insertions/deletions
- Damage from thermal energy and UV energy from the
sun
However, if the error occurs in one of the
strands of double-stranded DNA, repair enzymes
can restore the proper DNA sequence by using the
complement strand as a reference.
Analogous to a RAID 1 (mirrored) array
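
A toy sketch of the mirroring idea, not a model of real repair enzymes. It assumes the damaged positions are already known and marked with the placeholder letter N (a convention chosen for this example), and rebuilds them from the partner strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def repair(damaged: str, partner: str) -> str:
    """Rebuild bases marked 'N' in `damaged` from the complementary partner strand."""
    if len(damaged) != len(partner):
        raise ValueError("strands must have equal length")
    return "".join(
        COMPLEMENT[p] if d == "N" else d
        for d, p in zip(damaged, partner)
    )


print(repair("ATTNCGTCG", "TAATGCAGC"))  # ATTACGTCG
```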

http://www.arstechnica.com/reviews/2q00/dna/dna-1.html
11
DNA Computation Characteristics

The Statistics of Randomness
Pertaining to Adleman's method
All HDPP paths are equally likely to be formed
during the random production of sequences
In other words, over a large, well-distributed
solution set, all solutions (or at least a great
majority) should be present
This is key because in order for the DNA
computer to arrive at the correct solution, the
solution must first exist in the solution set
Statistics: If only 99% of the solutions exist
in the solution set, then the method will have a
success rate of only ...?
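
One way to put a number on "a great majority" is sketched here. It assumes each strand forms one of M possible paths independently and uniformly at random, which is an idealization and not a claim about the chemistry. Under that assumption the expected fraction of paths present among N strands is 1 - (1 - 1/M)^N.

```python
def expected_coverage(num_solutions: int, num_strands: int) -> float:
    """Expected fraction of distinct solutions present among `num_strands`
    strands, each formed independently and uniformly at random."""
    return 1.0 - (1.0 - 1.0 / num_solutions) ** num_strands


m = 2 ** 20  # hypothetical problem with 20 binary choices
for multiple in (1, 5, 10):
    coverage = expected_coverage(m, multiple * m)
    print(f"{multiple:2d}x as many strands as paths: {coverage:.5f}")
```

Coverage climbs quickly once there are several times more strands than possible paths, which is why amplification matters.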

Different Approaches
Free Floating vs. DNA Chips

13
Free Floating

Approach 1: Bits of DNA float freely in a test
tube
(pioneered by Leonard M. Adleman)

14
Free Floating

Advantages
- Strong general problem solving application
- Increased freedom in experimentation
i.e. immediate scalability by amplification
(could the freedom also be considered a
disadvantage?)
- Can encode unique problems
- Scales very well
Can you think of any other advantages?

HAHA, neither could I
15
DNA-based Chips

Approach 2: A gold-plated square of glass (one
inch square) anchors as many as a trillion
individual strands of DNA.
Microarrays

http://www.dhgp.de/ethics/ethics02.html
16
DNA-based Chips

Advantages
- Easier to handle, specific orientation
- Keeps out impurities
- Serves as a building block to scale upwards
- Programmable interfaces (in the future)
- Very useful for storing information about
Bio-agents
Business Quiz:
Why is this approach more appealing to
corporations and institutions that fund research?

17
DNA-based Chips

Can be manufactured!!!

Lipton's Paper
DNA Solution of Hard Computational Problems
Lipton, Richard J., DNA Solution of Hard
Computational Problems. Science, New Series, Vol.
268, No. 5210 (April 28, 1995), 542-545

19
Richard Lipton's DNA Solution of Hard
Computational Problems

Two factors limit any computer's performance
1) Parallel processing capabilities
3 grams of water contains about 10^22 molecules
2) Computations per unit time
100 million instructions per second
Human Time vs. Computation Time

20
Richard Lipton's DNA Solution of Hard
Computational Problems

State-of-the-art supercomputer:
100 million instructions per second
Biological computers are limited to only a
fraction of an experiment per second
Doesn't the complexity of the experiment
determine the difference?
However, DNA computers counter the instruction
time disparity with parallelism

21
Richard Lipton's DNA Solution of Hard
Computational Problems

Traveling Salesman Revisited
A conventional computer can solve a tour with 70
cities, but would fail with 100 or more cities
Even with 10^23 parallel processors, brute force
is too inefficient
However, are DNA computers only advantageous for
problems with very large solution sets?
No, Adleman's work can be extended to solve, in much
less time, problems that are within reach of
traditional CPUs as well as problems that are not

22
Richard Lipton's DNA Solution of Hard
Computational Problems

NP-complete: The Satisfiability Problem
(SAT)
SAT is a simple search problem, and was one of
the first problems shown to be NP-complete
Consider
F = (x ∨ y) ∧ (¬x ∨ ¬y)
Current best method: test all 2^n assignments for
n variables
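
The brute-force approach for the example formula can be sketched as follows. It simply tries all 2^n assignments (here n = 2) and prints the resulting truth table; the function names are illustrative.

```python
from itertools import product


def formula(x: bool, y: bool) -> bool:
    """F = (x OR y) AND (NOT x OR NOT y)."""
    return (x or y) and (not x or not y)


def satisfying_assignments():
    """Try every assignment of (x, y) and keep those that satisfy F."""
    return [
        (x, y)
        for x, y in product([False, True], repeat=2)
        if formula(x, y)
    ]


print("x y F")
for x, y in product([False, True], repeat=2):
    print(int(x), int(y), int(formula(x, y)))
```

Only the assignments 01 and 10 satisfy F, which is the answer the DNA procedure later arrives at.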

23
Richard Lipton's DNA Solution of Hard
Computational Problems

Truth Table
Current best method: test all 2^n assignments for
n variables

24
Richard Lipton's DNA Solution of Hard
Computational Problems

Initial Assumptions/Conditions
This model is simple and idealized
Ignores many known complex effects, but is an
excellent first-order approximation
Strands of DNA are just sequences
a1, ..., ak over the alphabet {A, C, G, T}
Double-stranded DNA is a pair of sequences
a1, ..., ak and b1, ..., bk, both over {A, C, G, T},
where for i = 1, ..., k each ai must
complement bi, meaning A↔T or C↔G
Only consider strands with a length of 20
nucleotides

25
Richard Lipton's DNA Solution of Hard
Computational Problems

Five simple operations that can be performed on
test tubes that contain DNA strands
1) Synthesize: it is possible to make a large number of copies
of any single strand
2) Anneal: annealing produces a double strand from a single
strand and its complementary strand
3) Extract: given a test tube of DNA, one can extract the
strands that contain some simple pattern of
length l
4) Detect: using a Polymerase Chain Reaction (PCR), one can
detect whether there are any DNA strands at all in
the test tube
5) Amplify: all of the DNA in the test tube may be amplified
by replicating the strands in the test tube

26
The Theory

One fixed test tube
The set in the test tube corresponds to the
following graph Gn

27

All paths that travel from a1 to an+1 encode an
n-bit binary string
At each stage, a path has exactly two choices
Unprimed node encodes a 1
Primed node encodes a 0
Therefore, the example path a1 x' a2 y a3 encodes 01
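
The path-to-bits encoding can be sketched by listing every path through the graph. The node names x1, x2, ... (with a trailing ' for the primed copy) are a naming choice for this example; the slide uses x and y for the two-variable case.

```python
from itertools import product


def paths(n: int):
    """Yield (nodes, bits) for every path from a1 to a(n+1) in the graph Gn."""
    for bits in product("01", repeat=n):
        nodes = []
        for k, bit in enumerate(bits, start=1):
            nodes.append(f"a{k}")
            # Unprimed node encodes 1, primed node encodes 0.
            nodes.append(f"x{k}" if bit == "1" else f"x{k}'")
        nodes.append(f"a{n + 1}")
        yield nodes, "".join(bits)


for nodes, bits in paths(2):
    print(bits, " ".join(nodes))
```

For n = 2 this lists four paths, one per 2-bit string, so the tube of all paths doubles as the set of all candidate assignments.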

28
The Solution Set Discovery

1) Encode the graph's vertices in DNA
2) Encode edges in DNA
3) Encode starting and ending points in DNA

29
Step 1 - vertices in DNA

The graph is encoded in a test tube of DNA
Each vertex of the graph is assigned a random
pattern of length l over {A, C, G, T}
Each encoding is referred to as the name of the
vertex and is composed of two parts
1st half → pi
2nd half → qi
Therefore, each vertex can be referenced by piqi

30
Step 2 - edges in DNA

Then, fill a test tube with the following
For each vertex, add many copies of a 5' → 3'
DNA sequence of the form piqi
For each edge i → j, add many copies of a 3' →
5' sequence of the form q̂i p̂j, where x̂ denotes the complement of x
If
Vertex i = ATCGGCTACTCCTGACTTGA
pi = ATCGGCTACT
qi = CCTGACTTGA
Vertex j = AGGTTCAGTCAGGCCTATTC
pj = AGGTTCAGTC
qj = AGGCCTATTC
Therefore, for edge i → j a sequence like the
following would be added
q̂i = GGACTGAACT, p̂j = TCCAAGTCAG
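
The worked example above can be reproduced with a short sketch: the edge strand is the complement of the second half of vertex i followed by the complement of the first half of vertex j. Function names are illustrative, and the strand is written left to right in the 3' to 5' direction as on the slide, so no reversal is applied.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def halves(vertex: str):
    """Split a vertex name into its first half (p) and second half (q)."""
    mid = len(vertex) // 2
    return vertex[:mid], vertex[mid:]


def edge_strand(vertex_i: str, vertex_j: str) -> str:
    """Strand for edge i -> j: complement of q_i followed by complement of p_j."""
    _, q_i = halves(vertex_i)
    p_j, _ = halves(vertex_j)
    return (q_i + p_j).translate(COMPLEMENT)


vertex_i = "ATCGGCTACTCCTGACTTGA"
vertex_j = "AGGTTCAGTCAGGCCTATTC"
print(edge_strand(vertex_i, vertex_j))  # GGACTGAACTTCCAAGTCAG
```

Because the edge strand overlaps the tail of one vertex and the head of the next, it acts as a splint that holds the two vertex strands together.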

31
Step 3 - end points in DNA

Then, add the following DNA strands
Add a 3' → 5' sequence of length l/2 that is
complementary to the first half of the initial
vertex
Similarly, add a 3' → 5' sequence of length l/2
that is complementary to the last half of the
final vertex
In other words, add p̂1 and q̂n (the complements of p1 and qn)
If the initial vertex was
ACTTGCCATCTCCGATACTT
and the final vertex was
TCGCCTAATCTACGATCTTA
then add
TGAACGGTAG and ATGCTAGAAT
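
The two end-point strands in this example can be derived the same way. This is a small sketch with an illustrative function name, using the vertex sequences from the slide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def end_caps(initial_vertex: str, final_vertex: str):
    """Return the strands complementary to the first half of the initial
    vertex and to the last half of the final vertex."""
    start_half = initial_vertex[: len(initial_vertex) // 2]
    end_half = final_vertex[len(final_vertex) // 2:]
    return start_half.translate(COMPLEMENT), end_half.translate(COMPLEMENT)


start_cap, end_cap = end_caps("ACTTGCCATCTCCGATACTT", "TCGCCTAATCTACGATCTTA")
print(start_cap, end_cap)  # TGAACGGTAG ATGCTAGAAT
```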

32
Goal of Initial Solution Set

KEY: Every legal path in Gn corresponds to
a correctly matched sequence of vertices and
edges
Any path through the graph must contain a
sequence that alternates between vertex, edge,
vertex, edge,...
Try this visual:
Consider the edge v → u: any path that passes
through v and then passes through u must fit
together like bricks

33

So, the top 5' → 3' strand represents a series of
vertices
Whereas the bottom 3' → 5' strand represents an edge
Furthermore
Vertex v is encoded as pvqv
Edge v → u is encoded as q̂v p̂u (the complements of qv and pu)

34

Why is this ordering significant?

35

The end of the vertex and the beginning of the
edge can anneal because they are complementary!
Similarly, the end of the edge and the beginning
of the next vertex can anneal too!
High probability of no inadvertent paths
1) Sequences are chosen at random
2) The sequence lengths are large
After the annealing, all of the possible paths
through the graph will be encoded as DNA
sequences, each representing an n-bit string

36
Similarity Between Sequences

At any given vertex in a path, the choice is
simply left or right, therefore, all paths are
similar
What does this mean?
All paths are equally likely to be formed
during the random production of sequences
In other words, over a large well distributed
solution set, all solutions (or at least a great
majority) should be present
This is key because in order for the computer
to arrive at the correct solution, the solution
must first exist in the solution set
Statistics!

37
Extraction Operations

Notation
E(t, i, a) denotes all sequences in test tube t
whose i-th bit equals a
This is done with one extract operation that
checks for the sequence that corresponds to the
name of xi if a = 1,
and for the name of xi' if a = 0

38
Extraction Operations

1) Construct a series of test tubes
Test tube              Values present
t0 (all assignments)   00, 01, 10, 11
t1  = E(t0, 1, 1)      10, 11
t1' = remainder of t0  00, 01
t2  = E(t1', 2, 1)     01
Pour t1 and t2 together to form t3
t3  = t1 + t2          01, 10, 11

39
Extraction Operations

2) Construct a series of test tubes
Test tube              Values present
t4  = E(t3, 1, 0)      01
t4' = remainder of t3  10, 11
t5  = E(t4', 2, 0)     10
Pour t4 and t5 together to form t6
t6  = t4 + t5          01, 10

40
Extraction Operations

3) Check to see if there are any DNA strands
left in t6
Those left in t6 are the satisfying assignments!
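
The test-tube sequence above can be traced with Python sets standing in for tubes. This is only a simulation of the bookkeeping, assuming error-free operations; the helper name extract and the variable names are chosen for this example.

```python
def extract(tube: set, i: int, a: int):
    """E(tube, i, a): return (strands whose i-th bit equals a, the remainder).

    Bit positions are 1-indexed, as on the slides.
    """
    kept = {s for s in tube if s[i - 1] == str(a)}
    return kept, tube - kept


t0 = {"00", "01", "10", "11"}

# Clause 1: (x OR y)
t1, t1_rest = extract(t0, 1, 1)
t2, _ = extract(t1_rest, 2, 1)
t3 = t1 | t2  # pour t1 and t2 together

# Clause 2: (NOT x OR NOT y)
t4, t4_rest = extract(t3, 1, 0)
t5, _ = extract(t4_rest, 2, 0)
t6 = t4 | t5  # pour t4 and t5 together

print(sorted(t3))  # ['01', '10', '11']
print(sorted(t6))  # ['01', '10']
```

The final detect step corresponds to checking whether t6 is non-empty.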

41
Understanding How it Works

Test tube t3 consists of all the sequences that
satisfy the first clause: 01, 10, 11
Similarly, t6 consists of all those that
satisfy the second clause and are contained in t3

42
More General Case

Any SAT problem on
n variables, and
m clauses,
can be solved with on the order of m extract steps
(with one detect step at the end)
Lipton's Acknowledgments
Operations are assumed perfect and without error
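
A sketch of the general procedure, again with sets standing in for test tubes and assuming perfect operations. The clause format, a list of (variable index, wanted bit) pairs, and the name solve_sat are choices made for this example. The sketch counts one extract per literal, so the total grows in proportion to the number of clauses when clause length is bounded.

```python
from itertools import product


def solve_sat(num_vars: int, clauses):
    """Simulate the extract-and-pour procedure on a CNF formula.

    clauses: list of clauses, each a list of (variable_index, wanted_bit)
    literals with 1-indexed variables. Returns (surviving strands, extract count).
    """
    tube = {"".join(bits) for bits in product("01", repeat=num_vars)}
    extract_steps = 0
    for clause in clauses:
        satisfied, rest = set(), tube
        for index, wanted in clause:
            # One extract: pull out strands whose bit matches this literal.
            hit = {s for s in rest if s[index - 1] == str(wanted)}
            satisfied |= hit
            rest = rest - hit
            extract_steps += 1
        tube = satisfied  # pour the extracted tubes together
    return tube, extract_steps


# F = (x OR y) AND (NOT x OR NOT y)
result, steps = solve_sat(2, [[(1, 1), (2, 1)], [(1, 0), (2, 0)]])
print(sorted(result), steps)
print("satisfiable" if result else "unsatisfiable")  # the detect step
```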

43

Practical Purposes

44
Purposes

Counter Bioterrorism/Monitor Genetic Progression
Institute for Countermeasures against
Agricultural Bioterrorism (ICAB)
Plan
1) Obtain DNA sequences from crops, animals,
bio-agents, etc.
2) Deploy DNA-chip technology to identify and
characterize
3) Build geo-referenced information system
4) Predict and track the spread of bio-agents
after introduction
5) Create powerful DNA-based tools for
monitoring and enhanced diagnosis
DNA microarrays (DNA-based chips)
- Can store 1,000 to 100,000 different
diagnostic DNA sequences
Next generation will contain one million tags!

http://icab.tamu.edu/
45
Purposes

Predictive Gene Testing

http://www.dhgp.de/ethics/ethics02.html
46

Poker Playing

DNA Computing 7th International Workshop on DNA
Based Computers, Dna7, Tampa, Florida, June
10-13, 2001 Revised Papers
47

Weighted-Recursive Algorithms

DNA Computing 7th International Workshop on DNA
Based Computers, Dna7, Tampa, Florida, June
10-13, 2001 Revised Papers
48
Pessimism

1) Too fragile and prone to error
2) The field is dominated by hard-core
enthusiasts who will be forced to "slog through
and do the heavy research" before there is a
major breakthrough

Optimism
However, keep in mind that the first commercially
available electronic computer was not well
received, and IBM in 1951 had to reinvent what
it had spent millions of dollars and years working
on to fit customers' needs (such as payroll)
http://www.jsonline.com/alive/news/0607dna.stm
49
The Future of DNA Computing

Commercial application by 2010
Alternative to traditional computing by 2020
Vision: Today we have not one but several
companies making "DNA chips," where DNA strands
are attached to a silicon substrate in large
arrays (for example, Affymetrix's GeneChip).
Production technology of MEMS is advancing
rapidly, allowing for novel integrated small-scale
DNA processing devices. The Human Genome
Project is producing rapid innovations in
sequencing technology. The future of DNA
manipulation is speed, automation, and
miniaturization.

50
Research Funding

Funding
National Science Foundation
Pentagon's Defense Advanced Research Projects
Agency - Much of the military's interest arises
from the increasing sophistication of encryption
techniques that other countries can use to encode
their data. As a result, Washington needs
ever-more-powerful computers for code breaking

Internet References
http://chronicle.com/data/articles.dir/art-44.dir/issue-4.dir/14a02301.htm
http://www.jsonline.com/alive/news/0607dna.stm
http://www.arstechnica.com/reviews/2q00/dna/dna-1.html
Book/Papers References
Lipton, Richard J., DNA Solution of Hard
Computational Problems. Science, New Series, Vol.
268, No. 5210 (April 28, 1995), 542-545
DNA Computing: 8th International Workshop on DNA
Based Computers, DNA8, Sapporo, Japan, June
10-13, 2002, Revised Papers (Lecture Notes in
Computer Science, 2568)
DNA Computing: 7th International Workshop on DNA
Based Computers, DNA7, Tampa, Florida, June
10-13, 2001, Revised Papers
Future References
http://www.nas.nasa.gov/
http://www.nas.nasa.gov/Research/Reports/reportsarchive.html
