Title: Graph Formations of Partial-Order Multiple-Sequence Alignments Using Nano- and Micro-Scale Reconfigurable Meshes
1. Graph Formations of Partial-Order Multiple-Sequence Alignments Using Nano- and Micro-Scale Reconfigurable Meshes
- Mary M. Eshaghian-Wilner, Ling Lau, Shiva Navab, and David Shen
- Department of Electrical Engineering, University of California, Los Angeles
- maryew_at_ee.ucla.edu, jonlau_at_ucla.edu, shiva_n_at_ee.ucla.edu, dshen727_at_ucla.edu
The authors are listed alphabetically by last name.
2. Overview
- Background
- Bioinformatics sequence processing
- Multiple Sequence Alignment
- Partial-order graph formation algorithm
- Spin-wave reconfigurable mesh
- Electrical VLSI reconfigurable mesh
3. Motivation and Background
- Many problems in computational biology involve sequence alignment.
- Sequences in Biological Data
- DNA sequences are written over 4 bases: A, T, C, G.
- An average human gene contains as many as 30,000 base pairs.
- In proteins, there are different types of amino acid sequences.
4. Sequence Alignment
- Graph Formation Problem Description
- Form graphical representation using sorted sequences
- Example of how to obtain sorted sequences
- Alignment algorithms
- CLUSTALW, T-COFFEE, MAFFT, POA (Partial Order Alignment)
5. Partial Order Alignment
- POA aligns unsorted sequences using dynamic programming (Chris Lee 2002).
- We denote the graph formation of the sorted sequences as
- Partial Order Multiple Sequence Alignment Graph (PO-MSAG)
Chris Lee. MSA using Partial Order Graph, 2002
6. Sequence Alignment Example
There are four sequences with four similar subsequences and three different ones.
The similar subsequences within the four sequences are now extracted while the different ones are kept separate.
Now expand this problem to a larger scale.
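The idea of extracting shared pieces and keeping differing pieces separate can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `build_po_graph`, the toy input, and the choice to key nodes by (column, symbol) are hypothetical and are not taken from the slides.

```python
from collections import defaultdict


def build_po_graph(aligned):
    """Merge aligned, equal-length sequences into one graph.

    Nodes are (column, symbol) pairs, so identical symbols in the same
    column collapse into a single node. Edges link a node to whatever
    follows it in the next column of any sequence.
    """
    edges = defaultdict(set)
    for seq in aligned:
        for col in range(len(seq) - 1):
            edges[(col, seq[col])].add((col + 1, seq[col + 1]))
    return {node: sorted(targets) for node, targets in edges.items()}


# Hypothetical toy input (not the sequences shown on the slide).
aligned = ["ACGT", "ACTT", "AGGT"]
graph = build_po_graph(aligned)
for node, targets in sorted(graph.items()):
    print(node, "->", targets)
```

Shared symbols such as the leading A become a single node with several outgoing edges, which is the branching structure the later slides build on the mesh.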
7. Partial Order Graph Formation on an Architecture
- Goal
- To form the PO-MSAG on an architecture after the given input sequence data has been aligned.
- Architecture: Reconfigurable Mesh
- Micro-scale: Standard Electrical VLSI Reconfigurable Mesh
- Nano-scale: Spin-wave Reconfigurable Mesh (Eshaghian et al., 2006)
8. Standard VLSI Reconfigurable Mesh
Figures obtained from slides prepared by Heiko Schröder, 1998
9. Spin-wave Reconfigurable Mesh
10. Spin-wave Signal Propagation
- Each node can transmit/receive at a unique frequency
- Signal propagation through the entire bus takes O(1) time
- Define this constant time as t0 = L / V (L = length of bus, V = velocity of signal)
- Example of bus
[Figure: a bus with a switch and four transmitter/receiver nodes operating at Freq. A, Freq. B, Freq. C, and Freq. D]
11. Directing the Spin-wave Signal Propagation
- Sent wave signals travel in two opposite directions
- A closed switch forces both waves in the same direction
[Figure: transmitter/receiver, switch, and bus]
12. Directing the Spin-wave Signal Propagation
- Switch closed while sending to force direction
- Switch opened so that all signals can travel through the bus
[Figure: transmitter/receiver, switch, and bus]
13. PO-MSAG Formation
- Overall idea
- To record the connectivity between the nodes
- To eliminate all repeated nodes using parallel processing
- To form the graph based on the connections
- Problems considered
- Dataset with constant number of variables
- Dataset with as many as O(N) variables
14. PO-MSAG for Constant Variation Using Spin-wave: Overview
- Initial Mapping
- Neighbor Recording
- Node Elimination
- Memory Update
15. PO-MSAG for Constant Variation Using Spin-wave: Example
- Example sequence: DNA (restricted to 5 variables: A, T, C, G, and a fifth symbol, presumably the alignment gap)
- Number of sequences: N; length of each sequence: L
- Architecture: Spin-wave Reconfigurable Mesh
- Size of mesh: N × L
16. PO-MSAG for Constant Variation Using Spin-wave: Initial Mapping
- Each node on the reconfigurable mesh has 7 memory slots (the number of different variables + 2).
- O(1) time to place each node's own data in its second memory slot.
[Figure: mesh nodes holding A, A, C, G, C, G, each stored in memory slot 2]
17. PO-MSAG for Constant Variation Using Spin-wave: Neighbor Recording
- Each node records its own left and right neighbors.
- Store these variables into the node's first and third memory slots, respectively.
- O(1) time complexity
[Figure: example nodes with their left neighbors recorded in memory slot 1 and their right neighbors in memory slot 3]
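A minimal sequential sketch of the initial mapping and neighbor recording steps follows. On the mesh every node does this at once; the loops here only simulate that. The name `map_and_record`, the use of `None` for an empty slot or a missing neighbor, and the gap character `"-"` as the fifth symbol are assumptions made for illustration.

```python
GAP = "-"  # assumed fifth symbol; the slides do not show it
ALPHABET = ["A", "T", "C", "G", GAP]
NUM_SLOTS = len(ALPHABET) + 2  # 7 memory slots per node


def map_and_record(sequences):
    """Return mesh[row][col] as a list of 7 slots (list index 0 is slot 1)."""
    mesh = []
    for seq in sequences:
        row = []
        for col, symbol in enumerate(seq):
            slots = [None] * NUM_SLOTS
            # Slot 1: left neighbor (None at the left edge).
            slots[0] = seq[col - 1] if col > 0 else None
            # Slot 2: the node's own data (initial mapping).
            slots[1] = symbol
            # Slot 3: right neighbor (None at the right edge).
            slots[2] = seq[col + 1] if col + 1 < len(seq) else None
            row.append(slots)
        mesh.append(row)
    return mesh


mesh = map_and_record(["ACG", "ATG"])
print(mesh[0][1])
```

Slots 4 through 7 stay empty at this stage; they are the space a representative node later uses for additional right neighbors.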
18. PO-MSAG for Constant Variation Using Spin-wave: Node Elimination 1
- Goal
- To select one node of each type to represent the repeated nodes of that type in every column
- Challenge
- Within a column, the rows in which each type of node lies are unknown, so choosing one node is difficult.
- Solution
- To select the first (uppermost) node of each type that appears in a column
19. PO-MSAG for Constant Variation Using Spin-wave: Node Elimination 2
- Close the switch above, forcing all sent signals downward
- All nodes send a signal downward and open the switch above
- Disable nodes that receive a signal
- One node of each type remains to represent the disabled nodes of that type
- Nodes of every type (A, T, C, G, and the fifth symbol) perform this simultaneously, each type using its own frequency channel
[Figure: example of the third column; on each frequency channel only one node remains, e.g. only one A remains on Freq. A]
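The effect of node elimination on a single column can be sketched as below. This is a sequential stand-in for the parallel, per-frequency signaling, and the function name `eliminate_column` and the sample column are hypothetical.

```python
def eliminate_column(column):
    """Return one flag per node: True if it stays active, False if disabled.

    `column` lists the symbols from top to bottom. A node is disabled when
    a node of the same type sits above it, because that node's downward
    signal reaches it on their shared frequency.
    """
    seen = set()  # types whose downward signal is already on the bus
    active = []
    for symbol in column:
        active.append(symbol not in seen)
        seen.add(symbol)
    return active


# Hypothetical column, top to bottom.
flags = eliminate_column(["A", "G", "A", "A", "G"])
print(flags)
```

Only the uppermost A and the uppermost G stay active, matching the "first node of each type" rule.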
20. PO-MSAG for Constant Variation Using Spin-wave: Memory Update 1
- All disabled, repeated nodes still hold connectivity information via their right neighbors
- Want to avoid losing connectivity information
- Have disabled nodes send their right neighbors to their respective representative node
- The representative node stores the received information
21. PO-MSAG for Constant Variation Using Spin-wave: Memory Update 2
- Close the switch below, forcing signals upward.
- Each disabled node checks whether its right neighbor is A; if so, it sends an A signal upward while opening the switch below
- If the representative node of a type receives an A signal, it places an A in its next available right-neighbor memory slot if it is not already there
- This still runs in O(1) time.
Example using the second column
[Figure: second column with nodes G, T, A, A, A, each type on its own frequency channel; a representative node receives an A signal. There is already an A in that node's right-neighbor memory slot 3 (from the neighbor recording step), so no additional A is added.]
22. PO-MSAG for Constant Variation Using Spin-wave: Memory Update 3
- All nodes perform the previous procedure sequentially for all other right neighbors (checking memory slot 3 for T, C, G, and the fifth symbol in turn).
- Memory update for one right neighbor is done in constant time, so doing it for four more is still constant time: O(1) + 4 O(1) = O(1).
- PO-MSAG formation with constant variation using the Spin-wave Reconfigurable Mesh is complete (graph retrieval/drawing is beyond our scope, but can be done with a third-party graphing program).
[Figure: example of the second and third columns]
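The memory update can be sketched for one column as five constant-time rounds, one per candidate right-neighbor symbol. The code simulates the rounds sequentially; `memory_update`, the sample data, and the gap character `"-"` as the fifth symbol are illustrative assumptions.

```python
ALPHABET = ["A", "T", "C", "G", "-"]  # "-" is an assumed fifth symbol


def memory_update(column, right_neighbors):
    """Collect each type's distinct right neighbors at its representative.

    `column` and `right_neighbors` list, top to bottom, each node's symbol
    and the right neighbor it recorded in slot 3.
    """
    rep_slots = {}
    # The representative (uppermost node of a type) starts with its own slot 3.
    for symbol, right in zip(column, right_neighbors):
        if symbol not in rep_slots:
            rep_slots[symbol] = [right]
    # One round per candidate symbol: nodes holding that right neighbor
    # signal it, and the representative stores it only if it is new.
    for candidate in ALPHABET:
        for symbol, right in zip(column, right_neighbors):
            if right == candidate and candidate not in rep_slots[symbol]:
                rep_slots[symbol].append(candidate)
    return rep_slots


# Hypothetical column and recorded right neighbors.
slots = memory_update(["A", "T", "A", "A", "G"], ["T", "A", "G", "T", "A"])
print(slots)
```

The outer loop always runs five times regardless of N, which is the reason the slides treat the whole update as O(1) on the mesh.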
23. PO-MSAG for Constant Variation Using Spin-wave: Overall Performance
24. PO-MSAG for Constant Variation Using Electrical VLSI
- Overall highly similar procedure
- Differences
- Node elimination must be done sequentially since VLSI does not have frequency channels
- Memory update stage must also be done sequentially
- Same overall performance of O(1), since a constant number of variations processed sequentially still takes constant time
25. PO-MSAG for Constant Variation Using Electrical VLSI: Overall Performance
26. PO-MSAG for O(N) Variation: Overview
- Assumptions
- Extend mesh size to 2N × 2L
- Each node has O(1) access to its row index
- Highlighted empty columns are called graph columns
- Algorithm
- Same initial mapping, neighbor recording, and node elimination
- Difference in Graph Formation
- Count disabled nodes
- Place active nodes in graph column
- Place right neighbors in graph column
- Disable repeated right neighbors
[Figure: example 2N × 2L mesh in which each data column has a highlighted graph column to its right]
27. Spin-wave Signal Superposition
- Close all switches
- All nodes transmit a signal with amplitude 1 in their own frequency and open all switches
- Signals in the same frequency superpose as they meet
- When the first signal reaches the end, all signals will have superposed
- Superposition is used to count the number of disabled nodes
- Example of superposition in one frequency
[Figure: transmitter/receiver nodes and switches on a bus]
28. PO-MSAG for O(N) Variation Using Spin-wave: Counting Disabled Nodes
- Close all switches
- All disabled nodes transmit an amplitude-1 signal in their own frequency and then open the switches
- Active nodes receive for t0 amount of time (the time taken for the first signal to reach the end of the bus)
- Example of the first column
[Figure: first column (nodes G, A, A, A, A) on a bus running from top to bottom; the active A receives 3 and the active G receives 0]
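Counting by superposition amounts to summing amplitude-1 contributions per frequency. A small sketch, assuming one frequency per symbol type and simulating the bus with a dictionary; `count_disabled` and the sample column are hypothetical names and data.

```python
def count_disabled(column):
    """Map each active node's row to the amplitude it receives.

    `column` lists symbols top to bottom. The uppermost node of each type
    is active; every other node of that type is disabled and transmits
    amplitude 1 on the type's frequency.
    """
    amplitude = {}   # superposed amplitude per frequency (symbol type)
    active_row = {}  # row of the active node for each type
    for row, symbol in enumerate(column):
        if symbol not in active_row:
            active_row[symbol] = row
            amplitude[symbol] = 0
        else:
            amplitude[symbol] += 1  # a disabled node adds amplitude 1
    return {active_row[s]: amplitude[s] for s in active_row}


# Hypothetical column: one active A with three repeats, and one G.
received = count_disabled(["A", "A", "G", "A", "A"])
print(received)
```

The active A receives 3 and the active G receives 0, the same values as in the slide's first-column example.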
29. PO-MSAG for O(N) Variation Using Spin-wave: Indexing Active Nodes
Example using only the first two columns
- Open all switches
- All active nodes communicate over a common frequency fActive
- The first active node closes the switch behind it and broadcasts its number of repeated nodes + 2 down the channel
- When each successive active node receives that signal, it superposes its own number of repeated nodes + 2 onto it.
- The magnitude received by an active node is its index in the graph column to the right
- Example of first column
- Time complexity: O(1)
[Figure: first column (nodes G, A, A, A, A) on a bus running from top to bottom. The active A has 3 repeated nodes, receives 0, and sends amplitude 3 + 2 = 5 on fActive; the active G has 0 repeated nodes, receives 5, and raises the amplitude to 5 + (0 + 2) = 7. The three disabled A nodes are not on the fActive channel.]
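The indexing of active nodes behaves like an exclusive running sum of (repeated nodes + 2). A minimal sketch, with the hypothetical helper `index_active` standing in for the signal travelling down the fActive channel:

```python
def index_active(repeated_counts):
    """Return (indices, final_amplitude) for the active nodes of a column.

    `repeated_counts` gives, top to bottom, the number of disabled
    (repeated) nodes represented by each active node.
    """
    indices = []
    amplitude = 0  # amplitude currently on the shared fActive channel
    for repeated in repeated_counts:
        indices.append(amplitude)  # magnitude received = graph column index
        amplitude += repeated + 2  # node superposes its own contribution
    return indices, amplitude


# Values from the slide's first-column example: A has 3 repeats, G has 0.
indices, total = index_active([3, 0])
print(indices, total)
```

Each active node reserves one row for itself, one for its own right neighbor, and one per repeated node, which is where the "+ 2" comes from.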
30. PO-MSAG for O(N) Variation Using Spin-wave: Placing Active Nodes
- Using the magnitudes previously received as indices, each active node copies itself to the graph column on its right
- These active node copies are given label 1 and are shown in bold
- Active nodes now retrieve their own right neighbors and copy them directly below
- These right neighbor copies are given label 2 and are shown in italics
Example using only the first two columns
[Figure: graph column with active node copies in bold and right neighbor copies in italics]
Remember that there are actually 2N rows, not just the seven shown.
31. PO-MSAG for O(N) Variation Using Spin-wave: Indexing Disabled Nodes
- Each active node of each type sends its graph column index + 2 downward in its own frequency
- Each disabled node of that type receives the signal and superposes a +1 signal
- The signal received by a disabled node is its graph column index
- Time complexity: O(1)
[Figure: column of nodes G, A, A, A, A on a bus running from top to bottom. On Freq. A the amplitude goes 0 + 2 = 2, 2 + 1 = 3, 3 + 1 = 4, 4 + 1 = 5; the three disabled A nodes receive 2, 3, and 4.]
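The indexing of disabled nodes can be sketched the same way, this time with one running amplitude per symbol type. The function name `index_disabled` and the input layout are assumptions for illustration.

```python
def index_disabled(column, active_index):
    """Map each disabled node's row to the graph column index it receives.

    `column` lists symbols top to bottom; `active_index` maps each symbol
    type to the graph column index of its active (uppermost) node.
    """
    amplitude = {}  # amplitude per frequency (symbol type)
    received = {}
    for row, symbol in enumerate(column):
        if symbol not in amplitude:
            # Active node: sends its own index + 2 downward.
            amplitude[symbol] = active_index[symbol] + 2
        else:
            # Disabled node: reads the amplitude, then superposes +1.
            received[row] = amplitude[symbol]
            amplitude[symbol] += 1
    return received


# A is active at index 0 and G at index 5, as in the slide's example.
rows = index_disabled(["A", "A", "G", "A", "A"], {"A": 0, "G": 5})
print(rows)
```

The three disabled A nodes receive 2, 3, and 4, so their right neighbors land directly below the active A and its own right neighbor.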
32. PO-MSAG for O(N) Variation Using Spin-wave: Placing Disabled Nodes' Right Neighbors
- The magnitude previously received by each disabled node is its graph column index
- Each disabled node retrieves its right neighbor and copies it to its graph column at that index
- These right neighbor copies are given label 2 and are shown in italics
- Eliminate repeated right neighbors under a single active node type. This can be done in O(1) using our Node Elimination procedure.
Example using only the first two columns
[Figure: disabled nodes place their right neighbors in the graph column at the indices they received]
Note that this B is not eliminated because it belongs to the active node G, not A.
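Putting the placement steps together, one graph column can be sketched as a list of (label, symbol, kept) entries. This is a sequential illustration only; `build_graph_column`, the tuple layout, and the sample right neighbors are hypothetical, and repeated right neighbors are flagged rather than removed so that the row indices stay the same as on the mesh.

```python
def build_graph_column(column, right_neighbors):
    """Return the graph column as (label, symbol, kept) tuples.

    Label 1 marks an active node copy, label 2 a right-neighbor copy.
    `kept` is False for a right neighbor that repeats an earlier one
    under the same active node.
    """
    groups = {}  # symbol type -> right neighbors, representative first
    for symbol, right in zip(column, right_neighbors):
        groups.setdefault(symbol, []).append(right)

    graph_column = []
    for symbol, rights in groups.items():  # order of first appearance
        graph_column.append((1, symbol, True))
        seen = set()
        for right in rights:
            graph_column.append((2, right, right not in seen))
            seen.add(right)
    return graph_column


# Hypothetical column and recorded right neighbors.
result = build_graph_column(["A", "A", "G", "A", "A"],
                            ["B", "K", "B", "B", "D"])
for index, entry in enumerate(result):
    print(index, entry)
```

The second B under A is flagged as repeated, while the B under G is kept because duplicates are only eliminated within one active node's block.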
33. PO-MSAG for O(N) Variation Using Spin-wave: Summary
- Graph formed in highlighted graph columns.
- Bold nodes in a graph column represent nodes in the actual graph.
- Italicized nodes in a graph column are the right-side connections of nodes in the actual graph
34. PO-MSAG for O(N) Variation Using Spin-wave: Graph Representation
After the actual retrieval and drawing of the graph
35. PO-MSAG for O(N) Variation Using Spin-wave: Overall Performance
36. PO-MSAG for O(N) Variation Using Electrical VLSI: Overview
- Algorithm similar to that for constant variation
- Major differences from spin-wave
- Nodes no longer have their own frequency-based communication channels
- Node elimination and memory update are done sequentially via the same method used in the case with constant variation
- Time complexity degrades to O(N)
37. PO-MSAG for O(N) Variation Using Electrical VLSI: Overall Performance
38. Summary
39. Conclusion
- Techniques for Partial-Order Multiple-Sequence Alignment Graph formation using spin-wave and VLSI reconfigurable meshes
- Future Work
- Extend the algorithms to large-scale graph databases
- Extend the problem to incorporate biological pathways, sequence splicing, or any other areas that demand efficient computing tools for sequence alignment.
40. References
- Bromberg, Martin, "Partial-Order Alignment of RNA Structures," Undergraduate thesis, Brown University, RI, 2005.
- Raphael, Benjamin, Degui Zhi, Haixu Tang, and Pavel Pevzner, "A Novel Method for Multiple Alignment of Sequences with Repeated and Shuffled Elements," Genome Research 14: 2336-2346, 2004.
- Eshaghian-Wilner, Mary M., "Integrated Architectural Solutions for Protein Sequence-Structure Alignment," Proceedings of the Sixth World Multi-Conference on Systemics, Cybernetics, and Informatics, SCI2002, Florida, July 2002.
- Eshaghian-Wilner, Mary M., Alex Khitun, Shiva Navab, and Kang Wang, "A Nano-Scale Reconfigurable Mesh with Spin Waves," ACM International Conference on Computing Frontiers, Ischia, Italy, 2006.
- Eshaghian-Wilner, Mary M., "Mapping Arbitrary Heterogeneous Task Graphs onto Arbitrary Heterogeneous System Graphs," International Journal on Foundation of Computer Science, Volume 12, Number 5, pages 599-628, 2001.
- Eshaghian-Wilner, Mary M., Russ Miller, "The Systolic Reconfigurable Mesh," Journal of Parallel Processing Letters, Volume 14, Numbers 3 and 4, 337-350, 2004.
- Grasso, C., C. Lee, "Combining Partial Order Alignment and Progressive Multiple Sequence Alignment Increases Alignment Speed and Scalability to Very Large Alignment Problems," Bioinformatics (Oxford, England) 20 (10): 1546-5, 2004.
- Grasso, C., M. Quist, M. Ke, C. Lee, "POAVIZ: a Partial Order Multiple Sequence Alignment Visualizer," Bioinformatics (Oxford, England) 19 (11): 1446-8, 2003.
- Grasso, C., B. Modrek, Y. Xing, C. Lee, "Genome-Wide Detection of Alternative Splicing in Expressed Sequences Using Partial Order Multiple Sequence Alignment Graphs," Pacific Symposium on Biocomputing, World Scientific, 29-41, 2004.
- Lee, C., "Generating Consensus Sequences from Partial Order Multiple Sequence Alignment Graphs," Bioinformatics (Oxford, England) 19 (8): 999-1008, 2003.
- Lee, C., C. Grasso, M.F. Sharlow, "Multiple Sequence Alignment Using Partial Order Graphs," Bioinformatics (Oxford, England) 18 (3): 452-64, 2002.
- Miller, Russ, V. K. Prasanna-Kumar, Dionisios I. Reisis, and Quentin F. Stout, "Parallel Computations on Reconfigurable Meshes," IEEE Transactions on Computers 42 (6): 678-692, 1993.
- Zhang, Xu, and Tamer Kahveci, "A New Approach for Alignment of Multiple Proteins," Pacific Symposium on Biocomputing, Maui, 11: 339-350, 2006.
- Ye, Yuzhen, Adam Godzik, "Multiple Flexible Structure Alignment Using Partial Order Graphs," Bioinformatics (Oxford, England) 21 (10): 2362-2369, 2005.
41. Thank you for attending this presentation!