Title: PROXIMUS, an error-bounded compression framework
1PROXIMUS, an error-bounded compression framework
2Motivation
- Analysis of discrete data sets generally leads
to NP-complete/hard problems, especially when
physically interpretable results in discrete
spaces are desired.
- The focus here is on effective heuristics for
reducing the problem size of discrete data.
- Two possible approaches
- Probabilistic subsampling
- Data reduction
3Proposed Approach
- Binary (0, 1) nonorthogonal matrix transformation
to extract dominant patterns.
- In the proposed approach, elements of singular
vectors of a binary-valued matrix are constrained
to binary entries with an associated singular
value of 1.
4PROXIMUS
- PROXIMUS is a nonorthogonal matrix transform
based on recursive partitioning of a data set
depending on the distance of a relation from the
dominant pattern.
- The dominant pattern is computed as a binary
approximation vector of the matrix of relations.
- Each discovered pattern has a physical
interpretation at all levels in the hierarchy of
the recursive process.
5- PROXIMUS provides a framework that captures the
properties of discrete data sets more accurately
and takes advantage of their binary nature to
improve both the quality and efficiency of the
analysis.
6Error-bounded approximation of binary matrices
7Binary rank-one approximation
- Binary rank-one approximation of an m x n binary matrix
A is the problem of finding two binary vectors x and y that
maximize the number of zeros in the residual matrix $A - xy^T$.
- y is the pattern vector, the approximation obtained by
optimizing the objective function.
- x is the presence vector, indicating the rows of
A approximated by y.
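The quantity being optimized can be sketched in a few lines of Python. This is a minimal illustration, assuming the matrix in question is the residual A - xy^T, so that the error is the number of entries where A and the outer product of x and y disagree. The function name `rank_one_error` and the small matrix are hypothetical, chosen only for this example.

```python
def rank_one_error(A, x, y):
    """Count the entries where A differs from the outer product x y^T."""
    return sum(
        1
        for i, row in enumerate(A)
        for j, a in enumerate(row)
        if a != x[i] * y[j]
    )

# Illustrative 3 x 3 binary matrix (not taken from the slides).
A = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
x = [1, 1, 0]  # presence vector: rows 0 and 1 follow the pattern
y = [1, 1, 0]  # pattern vector

print(rank_one_error(A, x, y))  # 1 (only A[2][2] is not reproduced)
```

Maximizing the number of zeros in the residual is the same as minimizing this count.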
8Example
(Figure: example matrix with its pattern vector and presence vector)
9Complexity of optimal binary rank-one
approximation
- The authors claim that this problem is NP-hard.
- Reasons
- It is closely related to maximum cliques in
graphs.
- Issue: the reference given for NP-hardness is not
directly for this problem.
10Drawbacks of SVD for binary matrices
- The resulting decomposition contains non-integral
vector values.
- Due to the orthogonality constraint of SVD, features
contained in some of the previously discovered
patterns are extracted again.
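Both drawbacks can be observed on a tiny matrix. The following is a rough sketch, not the example from the slides: it computes the two leading right singular vectors with plain power iteration and deflation (a stand-in for a real SVD routine), on a hypothetical matrix with two overlapping item sets. The helper names are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_times(A, v):
    """Return (A^T A) v without forming A^T A explicitly."""
    Av = [dot(row, v) for row in A]
    return [dot(col, Av) for col in zip(*A)]

def right_singular_vectors(A, k, iters=200):
    """Leading k right singular vectors by power iteration with deflation."""
    found = []
    for _ in range(k):
        v = [1.0 + 0.1 * j for j in range(len(A[0]))]  # fixed, asymmetric start
        for _step in range(iters):
            v = gram_times(A, v)
            for u in found:  # remove directions already extracted
                c = dot(v, u)
                v = [a - c * b for a, b in zip(v, u)]
            norm = math.sqrt(dot(v, v))
            v = [a / norm for a in v]
        found.append(v)
    return found

# Two overlapping item sets: columns 1 and 2 belong to both.
A = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]
v1, v2 = right_singular_vectors(A, 2)
print([round(a, 3) for a in v1])  # [0.316, 0.632, 0.632, 0.316]
print(min(v2) < 0 < max(v2))      # True
```

The first vector is non-integral and blends the two item sets into one pattern. The second must be orthogonal to it, so it carries entries of both signs and acts as a correction to the first pattern rather than as an item set of its own.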
11Example of SVD for an overlapping item set
(Figure: 1st, 2nd, and 3rd singular vectors)
12Modification for SDD
- SDD (semi-discrete decomposition) has a
similar problem.
- Modification
- Binary elements instead of -1, 0, or 1.
- Recursive partitioning using binary rank-one
approximation. This method provides a
hierarchical representation of dominant patterns.
13Heuristic for BR1A
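- The objective function can be written as
$\|A - xy^T\|_F^2 = \|A\|_F^2 - 2x^TAy + \|x\|_2^2\|y\|_2^2$
- Minimizing it is equivalent to maximizing
$2x^TAy - \|x\|_2^2\|y\|_2^2$
- If y is fixed, then s = Ay is fixed. The optimal x is
$x_i = 1$ if $2s_i \ge \|y\|_2^2$, and $x_i = 0$ otherwise
- Note that this approach is very similar to SDD.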
14Recall the Greedy Algorithm for SDD
15Heuristic for BR1A
- The main steps of the heuristic for BR1A are identical
to those of SDD.
- Alternately fix x or y, then find the
best possible vector for the other, until no
further improvement can be made.
- But how should the initial fixed vector
for BR1A be set up?
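The alternating scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The update rules come from expanding the squared error for binary vectors: with y fixed, row i is marked present when it shares at least half of the ones in y, and the rule for y with x fixed is symmetric. The names `update_x`, `update_y`, `score`, and `binary_rank_one`, the tie-breaking with `>=`, and the sample matrix are all hypothetical.

```python
def update_x(A, y):
    """Best presence vector for a fixed pattern vector y."""
    weight = sum(y)
    return [1 if 2 * sum(a * b for a, b in zip(row, y)) >= weight else 0
            for row in A]

def update_y(A, x):
    """Best pattern vector for a fixed presence vector x."""
    weight = sum(x)
    return [1 if 2 * sum(a * b for a, b in zip(col, x)) >= weight else 0
            for col in zip(*A)]

def score(A, x, y):
    """2 x^T A y - |x| |y|; larger means fewer mismatches in A - x y^T."""
    xAy = sum(xi * a * yj for xi, row in zip(x, A) for a, yj in zip(row, y))
    return 2 * xAy - sum(x) * sum(y)

def binary_rank_one(A, y0, max_iter=100):
    """Alternate between x and y until the score stops improving."""
    y = list(y0)
    x = update_x(A, y)
    for _ in range(max_iter):
        y_next = update_y(A, x)
        x_next = update_x(A, y_next)
        if score(A, x_next, y_next) <= score(A, x, y):
            break
        x, y = x_next, y_next
    return x, y

# Illustrative matrix: rows 0-2 share one pattern, rows 3-4 another.
A = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
]
x, y = binary_rank_one(A, A[1])  # start from row 1 as the initial pattern
print(x, y)  # [1, 1, 1, 0, 0] [1, 1, 1, 0, 0]
```

Because an update is accepted only when the score strictly increases, the loop stops at a local optimum, and the result depends on the starting vector `y0`.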
16Initialization of y
- Partition
- Arbitrarily select a column c
- Identify the rows with a nonzero in column c
- Use the centroid of these rows as the initial fixed y
- Greedy graph growing
- Randomly select a row r
- Starting from r, grow the set with rows that share a
nonzero, up to half of the rows
- Use the centroid of this set as the initial fixed y
- Use a randomly selected row as the initial fixed y
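The three initialization schemes can be sketched as below. This is a hedged illustration only: the slides do not say how a centroid is turned into a binary vector, so rounding by majority is an assumption here, as are the function names, the cap of floor(m/2) rows for the growing step, and the sample matrix.

```python
import random

def binary_centroid(rows):
    """Assumed rounding: 1 where at least half of the rows have a 1."""
    return [1 if 2 * sum(col) >= len(rows) else 0 for col in zip(*rows)]

def init_partition(A, c):
    """Centroid of the rows with a nonzero in column c (c must not be empty)."""
    return binary_centroid([row for row in A if row[c]])

def init_greedy_growing(A, r):
    """Grow from row r through shared nonzeros, up to half of the rows."""
    chosen = [r]
    covered = {j for j, a in enumerate(A[r]) if a}
    limit = max(1, len(A) // 2)
    grew = True
    while len(chosen) < limit and grew:
        grew = False
        for i, row in enumerate(A):
            if i not in chosen and any(row[j] for j in covered):
                chosen.append(i)
                covered.update(j for j, a in enumerate(row) if a)
                grew = True
                if len(chosen) >= limit:
                    break
    return binary_centroid([A[i] for i in chosen])

def init_random_row(A, rng):
    """Copy of a randomly selected row."""
    return list(rng.choice(A))

A = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
]
print(init_partition(A, 2))       # [1, 1, 1, 0, 0]
print(init_greedy_growing(A, 3))  # [0, 0, 1, 1, 1]
print(init_random_row(A, random.Random(0)) in A)  # True
```

In `init_greedy_growing` the seed row is passed in explicitly so the example stays deterministic; a caller would normally draw it at random.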
17Recursive decomposition
- Idea: use the presence vector to partition the
given matrix A into two sub-matrices A0 and A1.
18Example
19Recursive decomposition
- Apply binary rank-one decomposition to the two
sub-matrices recursively until a stopping
criterion is met.
- Criteria
- All rows of Ai are present in the presence vector.
- The Hamming radius (defined on the next slide) of the
sub-matrix with respect to the pattern vector is within the
prescribed bound.
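The recursion can be sketched as follows, under several assumptions that are not stated on the slides: a sub-matrix becomes a leaf only when both criteria hold together; the Hamming radius is taken as the largest Hamming distance from any row to the pattern vector; the first row seeds each rank-one step; and when every row is present but the radius is still too large, the farthest row is simply split off so that the recursion always makes progress. `rank_one` is a compact restatement of the alternating heuristic so the block runs on its own. All names are illustrative.

```python
def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def rank_one(rows, y0, max_iter=50):
    """Alternating heuristic: returns presence vector x and pattern vector y."""
    def solve_x(y):
        return [1 if 2 * sum(a * b for a, b in zip(r, y)) >= sum(y) else 0
                for r in rows]
    def solve_y(x):
        return [1 if 2 * sum(a * b for a, b in zip(c, x)) >= sum(x) else 0
                for c in zip(*rows)]
    def score(x, y):
        xAy = sum(xi * a * yj for xi, r in zip(x, rows) for a, yj in zip(r, y))
        return 2 * xAy - sum(x) * sum(y)
    y = list(y0)
    x = solve_x(y)
    for _ in range(max_iter):
        y2 = solve_y(x)
        x2 = solve_x(y2)
        if score(x2, y2) <= score(x, y):
            break
        x, y = x2, y2
    return x, y

def decompose(rows, ids, eps, leaves):
    """Recursively split rows; append (pattern, row ids) for every leaf."""
    x, y = rank_one(rows, rows[0])  # assumption: first row as initial pattern
    dist = [hamming(r, y) for r in rows]
    present = [i for i, xi in enumerate(x) if xi]   # rows of A1
    absent = [i for i, xi in enumerate(x) if not xi]  # rows of A0
    if not absent:
        if max(dist) <= eps:  # both stopping criteria hold
            leaves.append((y, ids))
            return
        far = dist.index(max(dist))  # assumption: peel off the farthest row
        present = [i for i in range(len(rows)) if i != far]
        absent = [far]
    for part in (present, absent):
        decompose([rows[i] for i in part], [ids[i] for i in part], eps, leaves)

A = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
]
leaves = []
decompose(A, list(range(len(A))), 1, leaves)
print(leaves)
# [([1, 1, 1, 0, 0], [0, 1, 2]), ([0, 0, 0, 1, 1], [3, 4])]
```

With a bound of 1, the five rows compress to two patterns. Tightening the bound to 0 forces further splits until every leaf contains identical rows.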
20Hamming Radius
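One common way to define this quantity, stated here as an assumption rather than as the slide's own formula: the Hamming radius of a sub-matrix $A_i$ with respect to a pattern vector $y$ is the largest Hamming distance between $y$ and any row of $A_i$. The symbols $r$, $h$, $m_i$, and $A_i(j)$ (the $j$-th row of $A_i$) are notation introduced for this sketch.

```latex
r(A_i, y) = \max_{1 \le j \le m_i} h\bigl(A_i(j),\, y\bigr),
\qquad
h(u, v) = \sum_{k=1}^{n} \lvert u_k - v_k \rvert
```

Under this definition, a bound on the radius guarantees that every row in the sub-matrix differs from the pattern vector in at most that many positions.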
21Example hierarchical structure
22Result from a sample binary matrix
(Figure: original matrix and panels (a) to (f))
Which one has the best result?
23Result from a sample binary matrix
(Figure: panels (a) to (f); labels shown: original matrix, quantizing SVD rank-29, quantizing most dominant singular vector, SVD rank-6, PROXIMUS, k-means rank-6)
24Running time scalability