UC Riverside Talk

1
Engineering a Scalable
Placement Heuristic for DNA Arrays
A.B. Kahng (UCSD), P. Pevzner (UCSD), S. Reda (UCSD),
A. Zelikovsky (GSU)
2
Outline
  • DNA probe arrays and unwanted illumination
  • Synchronous array design (2-D placement)
  • Asynchronous array design (3-D placement)
  • Experimental results
  • Extensions
  • Conclusions

3
DNA Arrays
  • DNA arrays
    • Short DNA probes bound to a glass substrate
    • Detect matching single-strand DNA molecules
  • Growing number of applications
    • Diagnosis of genetically based conditions
    • Point-of-care diagnosis (low-cost, real-time)
    • Targeted treatment
      • E.g., antibiotic sensitivity
    • Drug discovery
    • Sequencing, genotyping, gene expression
      monitoring
    • Agricultural research, environmental impact,
      bio-warfare agents

4
Scaling Trends and Challenges
  • Smaller is better for DNA arrays
    • Reduced reagent consumption
    • Higher reaction speed
    • Higher parallelism
  • but brings challenging system complexity and new
    dominant physical effects
    • ½ million probes / array → 100 million probes /
      next-generation array
    • Unwanted illumination caused by light diffraction
  • → Emerging DNA array design automation field
    • Need scalable design tools, mature methodologies
    • Great potential for transfer of techniques and
      methodologies from VLSI design automation

5
Array Manufacturing Process
  • Very Large-Scale Immobilized Polymer Synthesis
  1. Treat substrate with chemically protected linker
     molecules, creating a rectangular array
     • Site size approx. 10x10 microns
  2. Selectively expose array sites to light
     • Light deprotects exposed molecules,
       activating further synthesis
  3. Flush chip surface with solution of
     protected A, C, G, T
     • Binding occurs at previously
       deprotected sites
  4. Repeat steps 2-3 until desired
     probes are synthesized

6
Unwanted Illumination Effect
  • Unwanted illumination → erroneous probes
  • Effect gets worse with technology scaling

7
Example Probe Synthesis
8
Example Probe Synthesis
9
Example Probe Synthesis
10
Measure of Unwanted Illumination
Unwanted illumination is measured by border length
11
Synchronous Synthesis
  • Periodic deposition sequence, e.g., (ACTG)^k

→ Number of border conflicts between adjacent probes
= 2 x Hamming distance
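
The relation above can be checked with a small sketch. It assumes that in synchronous synthesis the i-th nucleotide of every probe is deposited during the i-th period of the deposition sequence, and that a border conflict is a deposition step in which exactly one of two adjacent probes is exposed. The function names are illustrative, not taken from the talk.

```python
def synchronous_flags(probe, period="ACTG"):
    """Exposure pattern of a probe: one 0/1 flag per deposition step.

    Assumption: nucleotide i of the probe is deposited in period i.
    """
    flags = []
    for nucleotide in probe:
        flags.extend(1 if step == nucleotide else 0 for step in period)
    return flags


def border_conflicts(probe_a, probe_b, period="ACTG"):
    """Count steps in which exactly one of the two neighbours is exposed."""
    flags_a = synchronous_flags(probe_a, period)
    flags_b = synchronous_flags(probe_b, period)
    return sum(a != b for a, b in zip(flags_a, flags_b))


def hamming(probe_a, probe_b):
    """Number of positions at which two equal-length probes differ."""
    return sum(x != y for x, y in zip(probe_a, probe_b))


a, b = "ACGT", "AGGA"
print(border_conflicts(a, b), 2 * hamming(a, b))  # both values are 4
```

Each position where the probes differ contributes two conflicting steps, one for each of the two nucleotides involved, which is where the factor of 2 comes from.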
12
2D Placement Problem
Edge cost = 2 x Hamming distance
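
As a rough illustration of the objective, the sketch below sums the edge cost over all horizontally and vertically adjacent sites of a grid. Representing a placement as a list of rows of probe strings is an assumption made for the example.

```python
def hamming(p, q):
    """Number of positions at which two equal-length probes differ."""
    return sum(x != y for x, y in zip(p, q))


def placement_cost(grid):
    """Total border cost: 2 x Hamming distance over all adjacent site pairs."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # right neighbour
                total += 2 * hamming(grid[r][c], grid[r][c + 1])
            if r + 1 < rows:  # bottom neighbour
                total += 2 * hamming(grid[r][c], grid[r + 1][c])
    return total


good = [["ACG", "ACT"],
        ["TCG", "TCT"]]
bad = [["ACG", "TCT"],
       ["ACT", "TCG"]]
print(placement_cost(good), placement_cost(bad))  # 8 12
```

The same four probes give different totals depending on where they sit, which is what the placement problem tries to exploit.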
13
Previous Approaches
  • Hubbell, 1990s
  • Find TSP tour w.r.t. Hamming distance
  • Thread TSP tour onto grid row by row
  • TSP-based methods do not scale to > 10^6 probes
  • → Transfer scalable techniques from VLSI
    placement!
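
A minimal sketch of the tour-then-thread idea follows. It is not the original method: a nearest-neighbour heuristic stands in for a real TSP solver, and the serpentine order (every other row reversed) is an assumption about how the tour is threaded. The number of probes is assumed to be a multiple of the row width.

```python
def hamming(p, q):
    """Number of positions at which two equal-length probes differ."""
    return sum(x != y for x, y in zip(p, q))


def greedy_tour(probes):
    """Nearest-neighbour tour; a cheap stand-in for a TSP heuristic."""
    remaining = list(probes)
    tour = [remaining.pop(0)]
    while remaining:
        last = tour[-1]
        nearest = min(remaining, key=lambda p: hamming(last, p))
        remaining.remove(nearest)
        tour.append(nearest)
    return tour


def thread_rows(tour, cols):
    """Lay the tour onto a grid row by row, reversing every other row
    so that consecutive tour elements stay adjacent at row ends."""
    rows = [tour[i:i + cols] for i in range(0, len(tour), cols)]
    return [row if i % 2 == 0 else row[::-1] for i, row in enumerate(rows)]


probes = ["AAA", "TTT", "AAT", "TTA"]
tour = greedy_tour(probes)
print(tour)                  # ['AAA', 'AAT', 'TTT', 'TTA']
print(thread_rows(tour, 2))  # [['AAA', 'AAT'], ['TTA', 'TTT']]
```

The quadratic scan in `greedy_tour` already hints at the scaling problem the slide mentions for arrays with millions of probes.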

14
2D Placement: Sliding-Window Matching
  • Slide window over entire chip
  • Repeat until improvement drops below certain
    threshold
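
The slide gives only the outer loop. The sketch below assumes one plausible per-window step: probes on mutually non-adjacent sites of the window are optimally reassigned among those same sites. A brute-force search over permutations stands in for a matching solver, so the window must stay small. The names `improve_window`, `size`, `step` and `min_gain` are illustrative.

```python
from itertools import permutations


def hamming(p, q):
    return sum(x != y for x, y in zip(p, q))


def site_cost(grid, r, c, probe):
    """Hamming cost of `probe` at (r, c) against its four grid neighbours."""
    cost = 0
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
            cost += hamming(probe, grid[rr][cc])
    return cost


def total_cost(grid):
    rows, cols = len(grid), len(grid[0])
    horizontal = sum(hamming(grid[r][c], grid[r][c + 1])
                     for r in range(rows) for c in range(cols - 1))
    vertical = sum(hamming(grid[r][c], grid[r + 1][c])
                   for r in range(rows - 1) for c in range(cols))
    return horizontal + vertical


def improve_window(grid, top, left, size, parity):
    """Optimally permute the probes on same-parity sites of one window.

    Same-parity sites never share a border, so each site's cost depends
    only on neighbours that stay fixed during this step.
    """
    cells = [(r, c)
             for r in range(top, min(top + size, len(grid)))
             for c in range(left, min(left + size, len(grid[0])))
             if (r + c) % 2 == parity]
    probes = [grid[r][c] for r, c in cells]
    best = min(permutations(probes),
               key=lambda perm: sum(site_cost(grid, r, c, p)
                                    for (r, c), p in zip(cells, perm)))
    for (r, c), p in zip(cells, best):
        grid[r][c] = p


def sliding_window_matching(grid, size=3, step=2, min_gain=1):
    """Slide the window over the chip until a full pass gains < min_gain."""
    cost = total_cost(grid)
    while True:
        for parity in (0, 1):
            for top in range(0, len(grid), step):
                for left in range(0, len(grid[0]), step):
                    improve_window(grid, top, left, size, parity)
        new_cost = total_cost(grid)
        if cost - new_cost < min_gain:
            return grid
        cost = new_cost
```

Because the unchanged arrangement is always among the candidates, a window step never increases the total cost, so the loop terminates once a full pass stops paying off.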

15
Effect of Window Size
16
2D Placement: Epitaxial Growth
  • Simulates crystal growth
  • O(N^(3/2)) row-order implementation, where N =
    number of probes

17
Asynchronous Synthesis
  • Probes grow at different speeds
  • Number of border conflicts between adjacent probes
    depends on their embedding into the nucleotide
    deposition sequence


→ 3D placement problem
18
Single-Probe Embedding
  • Dynamic programming algorithm similar to LCS
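
A sketch of such a dynamic program is given below, under stated assumptions: the neighbours' embeddings are fixed and given as 0/1 exposure flags per deposition step, and the cost of an embedding is the number of (step, neighbour) pairs in which exactly one of the two probes is exposed. The function name and data layout are illustrative.

```python
def embed_probe(probe, deposition, neighbour_flags):
    """Return (min_conflicts, flags) for embedding `probe` into `deposition`.

    cost[i][j] = fewest conflicts when the first i nucleotides of the probe
    are embedded within the first j deposition steps.
    """
    inf = float("inf")
    m, t = len(probe), len(deposition)
    # Conflicts at step j if the probe is masked / exposed, respectively.
    exposed = [sum(flags[j] for flags in neighbour_flags) for j in range(t)]
    hidden = [len(neighbour_flags) - e for e in exposed]

    cost = [[inf] * (t + 1) for _ in range(m + 1)]
    cost[0][0] = 0
    for j in range(1, t + 1):
        for i in range(m + 1):
            skip = cost[i][j - 1] + exposed[j - 1]
            use = inf
            if i > 0 and probe[i - 1] == deposition[j - 1]:
                use = cost[i - 1][j - 1] + hidden[j - 1]
            cost[i][j] = min(skip, use)
    if cost[m][t] == inf:
        raise ValueError("probe cannot be embedded into deposition sequence")

    # Trace back one optimal embedding.
    flags = [0] * t
    i = m
    for j in range(t, 0, -1):
        if (i > 0 and probe[i - 1] == deposition[j - 1]
                and cost[i][j] == cost[i - 1][j - 1] + hidden[j - 1]):
            flags[j - 1] = 1
            i -= 1
    return cost[m][t], flags


neighbour = [0, 0, 0, 0, 1, 0, 0, 1]  # "AT" embedded in the second period
print(embed_probe("AT", "ACGTACGT", [neighbour]))
# (0, [0, 0, 0, 0, 1, 0, 0, 1])
```

The table has (probe length + 1) x (sequence length + 1) entries, the same shape as an LCS table, which matches the slide's comparison.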

19
Post-Placement Embedding Optimization
  • 2D placement fixed; only probe embeddings are
    allowed to change
  • Greedy: optimally re-embed the probe with the
    largest gain
  • Chessboard algorithm: alternate re-embedding of
    red and green probes
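
The scheduling idea behind the chessboard algorithm can be sketched as below. Sites of one colour never share a border, so re-embedding them does not change the neighbours each of them is optimized against. Which parity is called "red" is arbitrary here, and `reembed` is a hypothetical callback that would optimally re-embed the probe at one site.

```python
def chessboard_classes(rows, cols):
    """Split grid sites into two colour classes with no shared borders."""
    red = [(r, c) for r in range(rows) for c in range(cols)
           if (r + c) % 2 == 0]
    green = [(r, c) for r in range(rows) for c in range(cols)
             if (r + c) % 2 == 1]
    return red, green


def chessboard_pass(rows, cols, reembed):
    """Re-embed all red sites, then all green sites.

    `reembed(site)` is a hypothetical callback supplied by the caller.
    """
    red, green = chessboard_classes(rows, cols)
    for site in red:
        reembed(site)
    for site in green:
        reembed(site)


visited = []
chessboard_pass(2, 2, visited.append)
print(visited)  # [(0, 0), (1, 1), (0, 1), (1, 0)]
```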

20

Embedding Optimization Results
  • Chessboard is 5-6% better than greedy
  • Within 21% of lower bound

21

Comparison of Placement Algorithms
Chip size: 100x100 to 500x500
  • SWM 600x faster (5 min. vs. 30 hours) with up to
    4% border conflict decrease
  • 20 Row-Epitaxial 6-10% better than
    TSP+Threading, >10x faster for 500x500

22
Practical Extensions
  • Distance-dependent border conflict weights
    • Take into account conflicts between 2-, 3-hop
      neighbors rather than only immediate neighbors
  • Position-dependent border conflict weights
    • In alignment DP for two sequences, take into
      account the importance of conflicts in the middle
      of probes: alignment cost has weights on conflicts
      that depend on conflict position
  • Perfect match/mismatch probes
    • Pairs of probes that differ only in middle
      position
    • Should be placed and aligned together

23
Alignment DP for 2-SNPs
Optimal Embedding of AC,TT
24
Summary
  • Contributions
    • Epitaxial placement → extra 10% reduction over
      the previously best known method
    • Asynchronous placement problem formulation
    • Post-placement improvement by extra 15.5-21.8%
    • Lower bounds
    • Scalable placements (1000x1000 in 20 min)
  • Ongoing work
    • Comparison on industrial benchmarks
    • Experiments with algorithms for extended
      formulations (SNPs, distance-dependent weights,
      etc.)

25
Summary
  • Results demonstrate effectiveness of VLSI
    placement techniques applied to DNA probe placement
  • Currently exploring other VLSI placement
    techniques, e.g., recursive 4-way partition based
    on linear-time clustering methods
  • Algorithms validated on industry data
  • Extended to handle practical constraints such as
    control probes, match/mismatch probe pairs
  • 5% border length improvement over industry
    placements
  • → Improved design results in fewer erroneous
    probes, smaller array area, and/or more probes
    per array

26
Simplified DNA Array Flow
Input: gene sequences, position of SNPs, etc.
Flow: Probe Selection → Probe Placement →
Probe Alignment (Mask Design) → Mask Manufacturing →
Array Manufacturing → Hybridization Experiment →
Analysis of Hybridization Intensities
Domain labels in figure: Soft/Computational Domain,
Hard/Biochemistry Domain