Title: Overlapping Matrix Pattern Visualization: a Hypergraph Approach
1 Overlapping Matrix Pattern Visualization: a Hypergraph Approach
- Ruoming Jin
- Kent State University
- Joint with Yang Xiang, David Fuhry, and Feodor F.
Dragan (KSU)
2 The Problem
- Given a set of discovered submatrices, how can we
reorder the rows and columns of the data matrix
to best display these submatrices and their
relationship?
3 Motivation: Overlapping Bicluster Visualization
- Gene expression profiles (rows: genes, columns:
conditions, matrix entries: expression levels)
- Biclustering: homogeneous submatrices (genes ×
conditions)
- Biclustering visualization problem [GMM06, KG07]
4 Motivation: Transactional Data Visualization
- Shopping-basket data (rows: transactions, columns:
items, binary matrix)
- Transactional data summarization using a set of
dense submatrices [CK07, WK06, XJFD08]
[Figure: example summarization; summarization cost = 88521]
5 Roadmap
- Problem Definition
- Visualization cost
- Hardness of the visualization problem
- Hypergraph ordering problem
- Minimum linear arrangement (MLA)
- Algorithm
- Leveraging MLA and local convergence
- Experimental Results
6 Submatrix Visualization Cost
- Given a display of the matrix (a fixed row order
and column order), how can we measure the
goodness of visualization of a submatrix?
[Figure: two displays of the submatrix {t1, t2, t7, t8} × {i1, i2, i8, i9} under different row and column orders]
Why is the second display intuitively better than the
first one?
7 Submatrix Visualization Cost
[Figure: displays of the submatrix {t1, t2, t7, t8} × {i1, i2, i8, i9} with their bounding boxes]
- Area: 8×8, 6×6, 4×4, 4×4
- Perimeter: 8+8, 6+6, 4+4, 4+4
- Given a row order and a column order, the
visualization cost of a submatrix is the sum of
  - the difference between its first and last row
  w.r.t. the row order
  - the difference between its first and last column
  w.r.t. the column order
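The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name and the two example displays are hypothetical. The slide's area and perimeter figures appear to count inclusive extents (a contiguous block of four rows has extent 4), whereas the sketch follows the written definition and returns last position minus first position. The two conventions differ only by a constant per submatrix.

```python
def submatrix_cost(rows, cols, row_order, col_order):
    """Row span plus column span of a submatrix under a given display."""
    row_pos = {r: i for i, r in enumerate(row_order)}
    col_pos = {c: i for i, c in enumerate(col_order)}
    r = [row_pos[x] for x in rows]
    c = [col_pos[x] for x in cols]
    # "difference between first and last" = last position - first position
    return (max(r) - min(r)) + (max(c) - min(c))


rows = ["t1", "t2", "t7", "t8"]
cols = ["i1", "i2", "i8", "i9"]

# Hypothetical display 1: rows and columns in their natural order.
natural_rows = [f"t{k}" for k in range(1, 9)]
natural_cols = [f"i{k}" for k in range(1, 10)]
scattered = submatrix_cost(rows, cols, natural_rows, natural_cols)

# Hypothetical display 2: the submatrix's rows and columns moved together.
grouped_rows = ["t1", "t2", "t7", "t8", "t3", "t4", "t5", "t6"]
grouped_cols = ["i1", "i2", "i8", "i9", "i3", "i4", "i5", "i6", "i7"]
compact = submatrix_cost(rows, cols, grouped_rows, grouped_cols)

print(scattered, compact)
```

The grouped display has the smaller cost, which matches the intuition that a submatrix drawn as one tight block is easier to read.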
8 Matrix Visualization Cost
- Given a row order, a column order, and a set
of submatrices, the matrix visualization cost is
the sum of the visualization costs of these submatrices.
- Matrix Optimal Visualization Problem
- Find the row order and column order such
that the matrix visualization cost is minimal.
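To make the optimization problem concrete, the sketch below sums the submatrix costs for one display and then searches every pair of row and column orders. The exhaustive search is only an illustration for a tiny, made-up 4×4 instance. It is not the method proposed in this talk, and all names are hypothetical.

```python
from itertools import permutations, product


def span(labels, order):
    """Distance between the first and last of `labels` in `order`."""
    pos = {x: i for i, x in enumerate(order)}
    hits = [pos[x] for x in labels]
    return max(hits) - min(hits)


def matrix_cost(submatrices, row_order, col_order):
    """Sum of the submatrix visualization costs for one display."""
    return sum(span(r, row_order) + span(c, col_order) for r, c in submatrices)


def exhaustive_best_display(rows, cols, submatrices):
    """Try every (row order, column order) pair; feasible only for tiny inputs."""
    return min(
        product(permutations(rows), permutations(cols)),
        key=lambda rc: matrix_cost(submatrices, rc[0], rc[1]),
    )


# Made-up instance: two submatrices given as (row labels, column labels).
rows = ["t1", "t2", "t3", "t4"]
cols = ["i1", "i2", "i3", "i4"]
submatrices = [(["t1", "t3"], ["i1", "i4"]), (["t2", "t4"], ["i2", "i3"])]

initial_cost = matrix_cost(submatrices, rows, cols)
best_rows, best_cols = exhaustive_best_display(rows, cols, submatrices)
best_cost = matrix_cost(submatrices, best_rows, best_cols)
print(initial_cost, best_cost)
```

The number of candidate displays grows factorially with the matrix size, so this kind of search stops being usable almost immediately.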
9 Roadmap
- Problem Definition
- Visualization cost
- Hardness of the visualization problem
- Hypergraph ordering problem
- Minimum linear arrangement (MLA)
- Algorithm
- Leveraging MLA and local convergence
- Experimental Results
10 Hypergraph Ordering
- Hypergraph HG = (V, X)
- V is the set of vertices
- X = {x1, x2, ...} is the set of hyperedges, where each
hyperedge is a set of vertices
- Hyperedge cost and hypergraph cost
- Hypergraph Ordering Problem
[Figure: vertices 0-6 in a linear order; hyperedge {0, 2, 3, 4} has cost 4, hyperedge {1, 3, 5} has cost 4, hypergraph cost = 16]
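The cost formulas themselves did not survive in the slide text, so the sketch below infers them from the figure's numbers: a hyperedge costs the distance between its first and last vertex in the order, and the hypergraph cost is the sum over hyperedges. Treat both the definition and the function names as assumptions. The slide's total of 16 includes further hyperedges that are not listed in the text, so only the two named hyperedges are used here.

```python
def hyperedge_cost(hyperedge, order):
    """Distance between the first and last vertex of the hyperedge in `order`."""
    pos = {v: i for i, v in enumerate(order)}
    hits = [pos[v] for v in hyperedge]
    return max(hits) - min(hits)


def hypergraph_cost(hyperedges, order):
    """Sum of hyperedge costs under one vertex order."""
    return sum(hyperedge_cost(x, order) for x in hyperedges)


order = [0, 1, 2, 3, 4, 5, 6]
hyperedges = [{0, 2, 3, 4}, {1, 3, 5}]

cost_a = hyperedge_cost(hyperedges[0], order)
cost_b = hyperedge_cost(hyperedges[1], order)
partial_total = hypergraph_cost(hyperedges, order)
print(cost_a, cost_b, partial_total)
```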
11 The Link between Matrix Visualization and
Hypergraph Ordering
- Relationship between matrix visualization cost
and hypergraph cost
- Finding the minimum visualization (or hypergraph)
cost is NP-hard
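One way to read the relationship, inferred from the two cost definitions rather than taken from the slide text: every submatrix contributes its row set as a hyperedge of a row hypergraph and its column set as a hyperedge of a column hypergraph, and the matrix visualization cost is the sum of the two hypergraph costs. The sketch below checks this on a made-up example. All names are hypothetical.

```python
def span(labels, order):
    """Distance between the first and last of `labels` in `order`."""
    pos = {x: i for i, x in enumerate(order)}
    hits = [pos[x] for x in labels]
    return max(hits) - min(hits)


def matrix_cost(submatrices, row_order, col_order):
    return sum(span(r, row_order) + span(c, col_order) for r, c in submatrices)


def hypergraph_cost(hyperedges, order):
    return sum(span(x, order) for x in hyperedges)


submatrices = [
    ({"t1", "t2", "t7", "t8"}, {"i1", "i2", "i8", "i9"}),
    ({"t3", "t4"}, {"i3", "i4", "i5"}),  # second submatrix is made up
]
row_order = [f"t{k}" for k in range(1, 9)]
col_order = [f"i{k}" for k in range(1, 10)]

# One hyperedge per submatrix in each of the two hypergraphs.
row_hyperedges = [rows for rows, _ in submatrices]
col_hyperedges = [cols for _, cols in submatrices]

total = matrix_cost(submatrices, row_order, col_order)
row_part = hypergraph_cost(row_hyperedges, row_order)
col_part = hypergraph_cost(col_hyperedges, col_order)
print(total, row_part, col_part)
```

Under this reading the row order and the column order can be optimized independently, each as its own hypergraph ordering problem.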
12 The Hypergraph Ordering Problem Is a Generalization
of MLA
- Graph cost w.r.t. a vertex order
- MLA (Minimum Linear Arrangement): find an optimal
vertex ordering that minimizes the graph cost
[Figure: a graph on vertices 0-6 drawn under two vertex orders, with graph cost = 16 and graph cost = 18]
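The graph cost that MLA minimizes is the sum, over all edges, of the distance between the two endpoints in the vertex order. A minimal sketch follows. The small graph is made up for illustration, because the edge list of the slide's figure is not recoverable from the text.

```python
def graph_cost(edges, order):
    """Sum of |pos(u) - pos(v)| over all edges (the linear arrangement cost)."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)


# Made-up graph on vertices 0-6 (vertex 6 is isolated).
edges = [(0, 2), (2, 3), (3, 4), (1, 3), (3, 5)]

cost_identity = graph_cost(edges, [0, 1, 2, 3, 4, 5, 6])
cost_swapped = graph_cost(edges, [0, 2, 1, 3, 4, 5, 6])
print(cost_identity, cost_swapped)
```

A graph is the special case of a hypergraph in which every hyperedge has exactly two vertices, and in that case the hyperedge cost and the edge length coincide.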
13 Roadmap
- Problem Definition
- Visualization cost
- Hardness of the visualization problem
- Hypergraph ordering problem
- Minimum linear arrangement (MLA)
- Algorithm
- Leveraging MLA and local convergence
- Experimental Results
14 Basic Idea for Hypergraph Ordering
- Much existing work addresses the MLA problem
(heuristics and bounded approximations)
- Instead of working from scratch on the
hypergraph ordering problem, can we somehow
leverage the MLA algorithms?
- The answer is YES!
15 Basic Procedure
- Given the hypergraph HG = (V, X), start with
a random vertex order π
- Step 1: Transform the hypergraph HG into a
graph G = (V, E) based on the vertex order π
  - cost(HG, π) = cost(G, π)
- Step 2: Run an MLA algorithm on graph G to produce
a new, optimized vertex order π'
  - cost(G, π') ≤ cost(G, π)
- Step 3: If the new order improves the hypergraph
cost, cost(HG, π) > cost(HG, π'), then use π' as
the new order (π ← π') and repeat Steps 1 and 2
  - cost(G, π') ≥ cost(HG, π')
cost(HG, π) = cost(G, π) ≥ cost(G, π') ≥ cost(HG, π')
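The three steps can be sketched as a short loop. This is an illustration under stated assumptions, not the authors' implementation: the hyperedge-to-path transformation is the one named on the following slide, the MLA step is a brute-force stand-in that is only viable for a handful of vertices (a real system would plug in an MLA heuristic or approximation algorithm), and all names and the example hypergraph are hypothetical.

```python
from itertools import permutations


def hypergraph_cost(hyperedges, order):
    pos = {v: i for i, v in enumerate(order)}
    return sum(max(pos[v] for v in x) - min(pos[v] for v in x) for x in hyperedges)


def graph_cost(edges, order):
    pos = {v: i for i, v in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)


def to_path_graph(hyperedges, order):
    """Step 1: replace each hyperedge by a path through its vertices in `order`."""
    pos = {v: i for i, v in enumerate(order)}
    edges = []
    for x in hyperedges:
        vs = sorted(x, key=pos.__getitem__)
        edges.extend(zip(vs, vs[1:]))  # duplicates kept so the costs stay equal
    return edges


def mla_brute_force(edges, vertices):
    """Step 2 stand-in: exact MLA by exhaustive search (tiny graphs only)."""
    return list(min(permutations(vertices), key=lambda o: graph_cost(edges, o)))


def order_hypergraph(hyperedges, order):
    order = list(order)
    while True:
        graph = to_path_graph(hyperedges, order)       # Step 1
        candidate = mla_brute_force(graph, order)      # Step 2
        # Step 3: accept only a strict improvement, so the loop terminates.
        if hypergraph_cost(hyperedges, candidate) < hypergraph_cost(hyperedges, order):
            order = candidate
        else:
            return order


hyperedges = [{0, 2, 3, 4}, {1, 3, 5}, {4, 6}]  # made-up example
start = list(range(7))
initial_cost = hypergraph_cost(hyperedges, start)
final_order = order_hypergraph(hyperedges, start)
final_cost = hypergraph_cost(hyperedges, final_order)
print(initial_cost, final_cost, final_order)
```

On this example the loop improves the cost from 10 to 8 and then stops, even though the order 6, 4, 0, 2, 3, 1, 5 has cost 6. The procedure converges to a local optimum, not necessarily a global one.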
16 (Step 1) Transformation: Hyperedge -> Path
[Figure: each hyperedge on vertices 0-6 is replaced by a path through its vertices in the current vertex order]
Hyperedge cost = path cost!
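The equality follows from a telescoping sum. The notation here is an assumption, since the slide text carries no formula: let π(v) be the position of vertex v in the current order, and let v_1, ..., v_k be the vertices of a hyperedge x sorted by position. The path connects consecutive vertices, so every term is non-negative and the intermediate positions cancel:

```latex
\[
\mathrm{cost}(\mathrm{path}_x, \pi)
  = \sum_{j=1}^{k-1} \bigl(\pi(v_{j+1}) - \pi(v_j)\bigr)
  = \pi(v_k) - \pi(v_1)
  = \mathrm{cost}(x, \pi).
\]
```

Summing over all hyperedges gives cost(G, π) = cost(HG, π) for the order π that was used to build the paths.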
17 Step 1 -> Step 2
[Figure: the hypergraph, the derived graph, and the rearranged graph on vertices 0-6]
Step 1 (Hypergraph -> Graph): cost(G, π) = 16 = cost(HG, π)
Step 2 (MLA): cost(G, π') = 13 < cost(G, π)
18 Step 1 -> Step 2 -> Step 3
Step 1 (Hypergraph -> Graph): cost(G, π) = cost(HG, π) = 16
Step 2 (MLA): cost(G, π') = 13 < cost(G, π)
[Figure: the graph and the hypergraph redrawn with the vertices in the order 0, 2, 3, 5, 6, 1, 4]
With the new ordering, hyperedge cost ≤ path cost!
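Once the vertices are reordered, a path built under the old order may zigzag, so by the triangle inequality its length can only match or exceed the distance between the hyperedge's first and last vertex. The sketch below checks this for one hyperedge, first on a single hand-picked new order and then on every order of seven vertices. Function names and the new order are hypothetical.

```python
from itertools import permutations


def hyperedge_cost(hyperedge, order):
    pos = {v: i for i, v in enumerate(order)}
    hits = [pos[v] for v in hyperedge]
    return max(hits) - min(hits)


def graph_cost(edges, order):
    pos = {v: i for i, v in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)


def path_for(hyperedge, order):
    """Path through the hyperedge's vertices, sorted by position in `order`."""
    pos = {v: i for i, v in enumerate(order)}
    vs = sorted(hyperedge, key=pos.__getitem__)
    return list(zip(vs, vs[1:]))


old_order = list(range(7))
hyperedge = {0, 2, 3, 4}
path = path_for(hyperedge, old_order)  # built once, under the old order

new_order = [2, 0, 4, 3, 1, 5, 6]      # hand-picked for illustration
path_cost_new = graph_cost(path, new_order)
edge_cost_new = hyperedge_cost(hyperedge, new_order)

# The path never underestimates the hyperedge cost, whatever the new order.
never_underestimates = all(
    graph_cost(path, o) >= hyperedge_cost(hyperedge, o)
    for o in permutations(old_order)
)
print(path_cost_new, edge_cost_new, never_underestimates)
```

This is why an order that lowers the graph cost also bounds the hypergraph cost from above.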
19 Step 1 -> Step 2 -> Step 3
Step 1 (Hypergraph -> Graph): cost(G, π) = cost(HG, π) = 16
Step 2 (MLA): cost(G, π') = 13 < cost(G, π)
Step 3: cost(HG, π') = 10 < cost(G, π') = 13
[Figure: the hypergraph and the derived graph on vertices 0-6 before and after reordering]
cost(HG, π) = cost(G, π) > cost(G, π') > cost(HG, π')
20 Run Iteratively and Local Convergence
21 Other conversions of a hyperedge
- Converting a hyperedge to a cycle
- Converting a hyperedge to multicycles
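The slide names the cycle conversion without defining it. One plausible reading, stated here as an assumption, is to take the path through the hyperedge's vertices in the current order and add a closing edge from the last vertex back to the first. Under the order used to build it, such a cycle costs exactly twice the hyperedge cost. The multicycle variant is not sketched because the text gives no detail about it.

```python
def hyperedge_cost(hyperedge, order):
    pos = {v: i for i, v in enumerate(order)}
    hits = [pos[v] for v in hyperedge]
    return max(hits) - min(hits)


def graph_cost(edges, order):
    pos = {v: i for i, v in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)


def to_cycle(hyperedge, order):
    """Assumed conversion: sorted path plus one edge closing the loop."""
    pos = {v: i for i, v in enumerate(order)}
    vs = sorted(hyperedge, key=pos.__getitem__)
    return list(zip(vs, vs[1:] + vs[:1]))


order = list(range(7))
hyperedge = {0, 2, 3, 4}
cycle = to_cycle(hyperedge, order)
cycle_cost = graph_cost(cycle, order)
print(cycle, cycle_cost)
```

Because the factor of two applies to every hyperedge, the graph cost under the building order is exactly twice the hypergraph cost. Whether this variant guides the MLA step better than paths is a question for the experiments.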
22 Roadmap
- Problem Definition
- Visualization cost
- Hardness of the visualization problem
- Hypergraph ordering problem
- Minimum linear arrangement (MLA)
- Algorithm
- Leveraging MLA and local convergence
- Experimental Results
23 Visualization effects
24 Visualization effects (continued)
25 Visualization effects (continued)
26 Cost and running time
27 Conclusion
- We found an interesting link from the matrix
visualization problem to a well-known
graph-theoretic problem: the minimum linear
arrangement (MLA) problem.
- Theoretically, we introduce a generalization of
the MLA problem to hypergraphs and develop
a novel local convergence algorithm.
- Our method can be incorporated into an
interactive visualization environment to allow
users to focus on different parts of the data and
patterns.
28 Thanks!!