Using Sparse Matrix Reordering Algorithms for Cluster Identification

1
Using Sparse Matrix Reordering Algorithms for
Cluster Identification
  • Chris Mueller
  • Dec 9, 2004

2
Visualizing a Graph as a Matrix
Each row and column in the matrix corresponds to
a node in the graph. The nodes are ordered the
same in the rows and columns, so node 10 is
represented by row 10 and column 10. Each edge
between two nodes (a,b) is rendered as a dot at
(i,j), where i is the row for a and j is the
column for b. The solid diagonal shows the
identity relationship for each node.
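A minimal sketch of the rendering described on this slide, assuming an undirected graph given as an edge list with nodes numbered from 0. The function names and the 5-node example graph are illustrative and not taken from the presentation.

```python
def adjacency_matrix(num_nodes, edges):
    """Return an n x n 0/1 matrix; row i and column i both stand for node i."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for node in range(num_nodes):
        matrix[node][node] = 1      # solid diagonal: identity of each node
    for a, b in edges:
        matrix[a][b] = 1            # dot at (row of a, column of b)
        matrix[b][a] = 1            # assumption: edges are undirected
    return matrix


def render(matrix):
    """Draw the matrix as text: '*' for a dot, '.' for an empty cell."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


# Hypothetical 5-node graph, used only for illustration.
example_edges = [(0, 3), (1, 2), (3, 4)]
example_matrix = adjacency_matrix(5, example_edges)
print(render(example_matrix))
```

Because both axes use the same node order, an undirected graph always produces a picture that is symmetric about the diagonal.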
3
Visually Identifying Clusters
Reordering the nodes (rows/cols) can reduce the
noise in the display and highlight clusters.
Dense areas in the matrix reveal potential
clusters. Some dense areas may be in the same
row or column as others, suggesting a
relationship.
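The effect of reordering can be sketched on a small matrix. This example assumes the new node order is already known (here it is chosen by hand); the matrix and the `reorder` helper are hypothetical and only show that applying one permutation to both rows and columns gathers scattered dots into dense blocks.

```python
def reorder(matrix, order):
    """Apply the same node order to rows and columns.

    order[k] is the original index of the node placed at position k.
    """
    return [[matrix[i][j] for j in order] for i in order]


# Hypothetical graph: nodes {0, 2, 4} form one cluster and {1, 3} another,
# but the original numbering interleaves them.
scattered = [
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
]

grouped = reorder(scattered, [0, 2, 4, 1, 3])
for row in grouped:
    print("".join("*" if cell else "." for cell in row))
```

After reordering, the two clusters appear as a 3 x 3 block and a 2 x 2 block along the diagonal, with no dots outside them.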
4
(Some) Previous Work
The basic idea of visualizing relational data as
a reordered matrix has been around since the
early days of computer science. Some examples
are:
  • The Reorderable Matrix (Bertin, 1981)
  • GAP: Generalized Association Plots (Chen, 2002)
  • Block Clustering (Hartigan, 1972)
Bertin (1981), Graphics and Graphic Information
Processing. From http://www.math.yorku.ca/SCS/Gallery/bright-ideas.html
www.stat.sinica.edu.tw/SLR/PDF/???-Cluster_Lecture_040206-new.pdf
5
Sparse Matrices
  • Matrices are the basic data structure for most
    numerical computations.
  • Sparse matrices are matrices that do not need
    explicit values for each element.
  • Note that zeros may be important and cannot
    always be excluded from the matrix.
  • Sparse matrices can be stored in memory in data
    structures that are more compact than 2D arrays.
  • Sparse matrix reordering algorithms reorder the
    elements in the matrix to achieve better use of
    memory or computational resources.
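As an illustration of compact storage, here is a sketch of compressed sparse row (CSR) storage. The slide does not name a specific format, so CSR is an assumption, as are the function name and the example data. `None` marks an element with no explicit value, which lets an important zero be stored like any other value.

```python
def to_csr(matrix):
    """Convert a 2D list to CSR arrays, keeping every non-None element."""
    values = []        # explicit values, row by row
    col_indices = []   # column of each stored value
    row_ptr = [0]      # row r occupies values[row_ptr[r]:row_ptr[r + 1]]
    for row in matrix:
        for col, value in enumerate(row):
            if value is not None:
                values.append(value)
                col_indices.append(col)
        row_ptr.append(len(values))
    return values, col_indices, row_ptr


# Hypothetical 4 x 4 matrix; the 0 in row 1 is an explicit, meaningful zero.
dense = [
    [4, None, None, None],
    [None, 0, None, 2],
    [None, None, None, None],
    [1, None, None, 3],
]

values, col_indices, row_ptr = to_csr(dense)
print(values, col_indices, row_ptr)
```

The 16-cell array shrinks to 5 stored values plus index arrays; the saving grows with the size and sparsity of the matrix.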
The bandwidth is the number of diagonals required
to store the matrix. In this example, the
bandwidth is 4.
The banded representation stores only the
diagonals that have values.
Swapping columns 1 and 2 reduced the bandwidth to
3, decreased the amount of storage required by 2
elements, and removed 2 empty elements.
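The bandwidth definition used on this slide can be sketched as follows. The code assumes the band runs contiguously from the lowest to the highest occupied diagonal and that the matrix has at least one nonzero. The matrix is a hypothetical example, not the one shown on the slide, and its columns are numbered from 0.

```python
def diagonal_band(matrix):
    """Return (lowest, highest) occupied diagonal, where offset = col - row."""
    offsets = [
        col - row
        for row, values in enumerate(matrix)
        for col, value in enumerate(values)
        if value != 0
    ]
    return min(offsets), max(offsets)


def bandwidth(matrix):
    """Number of diagonals needed to cover every nonzero element."""
    low, high = diagonal_band(matrix)
    return high - low + 1


def swap_columns(matrix, c1, c2):
    """Return a copy of the matrix with columns c1 and c2 exchanged."""
    swapped = [row[:] for row in matrix]
    for row in swapped:
        row[c1], row[c2] = row[c2], row[c1]
    return swapped


# Hypothetical matrix chosen so that one column swap narrows the band.
original = [
    [0, 5, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 7, 1],
    [0, 0, 0, 2],
]
improved = swap_columns(original, 0, 1)
print(bandwidth(original), bandwidth(improved))
```

In this example the swap moves two off-diagonal values onto the main diagonal, so the band shrinks from three diagonals to two.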
6
Sparse Matrix Reordering Algorithms
Bandwidth Minimization: Reverse Cuthill-McKee (RCM)
and King's Algorithm
RCM(matrix):
    Represent the matrix as a graph
    Choose a suitable starting node
    For each node reachable from the current node:
        Output the node
        Find all unvisited neighbors
        Order them based on increasing degree
        Visit them in that order
    Reverse the output order
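The RCM steps above can be sketched in plain Python. Several choices here are assumptions rather than part of the presentation: the graph is a dict mapping each node to its set of neighbors, the starting node of each component is its lowest-degree node, and ties are broken by node label to make the result repeatable. The final reversal of the output is the standard step that distinguishes RCM from plain Cuthill-McKee.

```python
from collections import deque


def reverse_cuthill_mckee(adjacency):
    """Return a node order from a breadth-first, degree-sorted traversal."""
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    visited = set()
    order = []
    # Try starting nodes by increasing degree; this also covers every component.
    for start in sorted(adjacency, key=lambda n: (degree[n], n)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)                      # output the node
            unvisited = [m for m in adjacency[node] if m not in visited]
            unvisited.sort(key=lambda m: (degree[m], m))  # increasing degree
            visited.update(unvisited)
            queue.extend(unvisited)                 # visit them in that order
    order.reverse()                                 # the "reverse" in RCM
    return order


def max_edge_span(adjacency, order):
    """Largest distance between the positions of two connected nodes."""
    position = {node: k for k, node in enumerate(order)}
    return max(
        abs(position[a] - position[b]) for a in adjacency for b in adjacency[a]
    )


# Hypothetical path graph 0-3-1-4-2 with deliberately scrambled labels.
path_graph = {0: {3}, 1: {3, 4}, 2: {4}, 3: {0, 1}, 4: {1, 2}}
rcm_order = reverse_cuthill_mckee(path_graph)
print(rcm_order)
print(max_edge_span(path_graph, [0, 1, 2, 3, 4]), max_edge_span(path_graph, rcm_order))
```

For this path graph, the largest edge span drops from 3 under the original labels to 1 under the RCM order, which places every dot directly next to the diagonal.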
King's algorithm is similar, but it orders based
on edges out of the current cluster rather than
total edges.
Note that these algorithms are stochastic in the
choice of starting node and in the ordering of
nodes with the same degree.
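The difference in King's ordering rule can be sketched as follows. This is one reading of the description above, not a reference implementation: among the nodes adjacent to the already-ordered cluster, pick the one with the fewest neighbors that lie outside both the cluster and its frontier. The function name, the label-based tie-breaking, the restriction to a single connected component, and the example graph are all assumptions.

```python
def king_order(adjacency, start):
    """Order one connected component, growing the cluster as slowly as possible."""
    ordered = [start]
    in_cluster = {start}
    frontier = set(adjacency[start])    # nodes adjacent to the cluster

    def growth(node):
        # Edges leaving the cluster and frontier if this node is added next.
        return sum(
            1 for m in adjacency[node]
            if m not in in_cluster and m not in frontier
        )

    while frontier:
        chosen = min(frontier, key=lambda n: (growth(n), n))
        frontier.remove(chosen)
        in_cluster.add(chosen)
        ordered.append(chosen)
        frontier.update(m for m in adjacency[chosen] if m not in in_cluster)
    return ordered


# Hypothetical graph: node 1 is a hub with many outside neighbors.
hub_graph = {
    0: {1, 2},
    1: {0, 3, 4, 5},
    2: {0, 6},
    3: {1},
    4: {1},
    5: {1},
    6: {2},
}
print(king_order(hub_graph, 0))
```

Starting from node 0, the hub (node 1) is deferred until nodes 2 and 6 have been ordered, because adding it would pull three new nodes into the frontier.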
7
Reordering the COG Database
Basic Protocol
  • Filter edges based on FASTA score
      • cmp2 is the original data; cmp90 and cmp200 are filtered
  • Shuffle the data
  • For each sorted and shuffled graph:
      • Identify the connected components
      • Apply RCM and King's algorithm to each component
      • Apply MMD (multiple minimum degree) to the entire graph
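The first steps of the protocol can be sketched as below. The slide does not give the filtering rule, so keeping edges whose score is at least a threshold is an assumption, as are the function names, the seeded shuffle, and the small scored edge list standing in for the COG data.

```python
import random


def filter_edges(scored_edges, min_score):
    """Keep edges whose score meets the threshold (assumed filtering rule)."""
    return [(a, b) for a, b, score in scored_edges if score >= min_score]


def shuffle_labels(nodes, edges, seed):
    """Randomly relabel nodes so the input order carries no information."""
    shuffled = list(nodes)
    random.Random(seed).shuffle(shuffled)
    relabel = dict(zip(nodes, shuffled))
    return [(relabel[a], relabel[b]) for a, b in edges]


def connected_components(nodes, edges):
    """Return the connected components as sorted lists of nodes."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


# Hypothetical scored edges: (node, node, score).
nodes = list(range(6))
scored = [(0, 1, 250), (1, 2, 95), (2, 3, 40), (3, 4, 210), (4, 5, 120)]

edges_90 = filter_edges(scored, 90)
edges_200 = filter_edges(scored, 200)
print(connected_components(nodes, edges_90))
print(connected_components(nodes, edges_200))
print(connected_components(nodes, shuffle_labels(nodes, edges_90, seed=1)))
```

Raising the threshold splits the graph into more, smaller components. Shuffling changes which labels end up in each component but not the number or sizes of the components, so each component can then be passed to a reordering routine independently of the input order.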

8
Results by the Numbers
(but the pictures show sooo much more)
9
Visualization Key
Green lines show the extent of a COG family.
Red dots are edges.
Black dots show the elements in the family.
Blue dots are the COG families for the node in
column j.
Both axes have the nodes in the same order.
10
Discussion
  • All algorithms worked as expected.
  • However, the matrix ordering goals were too
    simple to yield good clusters.
  • Possible Future Work
      • Extended algorithms that allow more information
        to be used
      • Exploit features of ordering strategies to do a
        second pass that generates better clusters?
      • Hypergraph reordering
  • (demo of reordering by hand)