PROXIMUS: an error-bounded compression framework

Slides: 25

Provided by: BAO72


1
PROXIMUS: an error-bounded compression framework
2
Motivation

Analysis of discrete data sets generally leads
to NP-complete/NP-hard problems, especially when
physically interpretable results in discrete
spaces are desired.
The focus here is on effective heuristics for
reducing the problem size of discrete data.
Two possible approaches:
Probabilistic subsampling
Data reduction

3
Proposed Approach

Binary (0, 1) nonorthogonal matrix transformation
to extract dominant patterns
In the proposed approach, elements of singular
vectors of a binary valued matrix are constrained
to binary entries with an associated singular
value of 1.

4
PROXIMUS

PROXIMUS is a nonorthogonal matrix transform
based on recursive partitioning of a data set
depending on the distance of a relation from the
dominant pattern.
The dominant pattern is computed as a binary
approximation vector of the matrix of relations.
Each discovered pattern has a physical
interpretation at all levels in the hierarchy of
the recursive process.

PROXIMUS provides a framework that captures the
properties of discrete data sets more accurately
and takes advantage of their binary nature to
improve both the quality and efficiency of the
analysis.

6
Error-bounded approximation of binary matrices
7
Binary rank-one approximation

Binary rank-one approximation of an m x n binary
matrix A is the problem of finding two binary
vectors x and y that maximize the number of zeros
in the residual matrix A - x y^T.
y is the pattern vector: the dominant pattern
that approximates the rows of A.
x is the presence vector: it indicates which rows
of A are approximated by y.
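
To make the objective concrete, here is a minimal sketch that counts the nonzeros of A - x y^T for a given pair of vectors. The matrix, the vectors, and the function name rank_one_error are made up for illustration and do not come from the slides.

```python
import numpy as np

def rank_one_error(A, x, y):
    """Count the entries where A differs from the outer product x y^T."""
    A = np.asarray(A, dtype=int)
    residual = A - np.outer(x, y)
    return int(np.count_nonzero(residual))

# Illustrative 3 x 3 binary matrix (not the example from the slides).
A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])
y = np.array([1, 1, 0])  # pattern vector: columns that form the pattern
x = np.array([1, 1, 0])  # presence vector: rows that contain the pattern

error = rank_one_error(A, x, y)   # -> 1 (only the last row's single 1 is missed)
zeros = A.size - error            # -> 8 zeros in the residual
```

Maximizing the number of zeros in the residual is the same as minimizing this error count.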

8
Example
(Figure: an example matrix with its pattern vector and presence vector)
9
Complexity of optimal binary rank-one
approximation

The authors claim that this problem is NP-hard.
Reason: it is closely related to finding maximum
cliques in graphs.
Issue: the reference given for NP-hardness does
not directly address this problem.

10
Drawbacks of SVD for binary matrices

The resulting decomposition contains non-integral
vector values.
Due to the orthogonality constraint of SVD,
features already contained in previously
discovered patterns are extracted from later
singular vectors.
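
Both drawbacks can be observed on a tiny example. The sketch below is an illustration only: the matrix is made up (two item sets that overlap in one column) and is not the example shown on the next slide.

```python
import numpy as np

# Two item sets that overlap in the middle column, each occurring twice.
A = np.array([[1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1],
              [0, 0, 1, 1, 1]], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
first, second = Vt[0], Vt[1]

# The first right singular vector is proportional to [1, 1, 2, 1, 1]
# (up to sign), so its entries are not integers.
has_fractional = bool(np.any(np.abs(first - np.round(first)) > 1e-9))

# The second must be orthogonal to the first, so it is proportional to
# [1, 1, 0, -1, -1]: it mixes positive and negative entries and no longer
# looks like either item set.
has_mixed_signs = bool(np.any(second > 1e-9) and np.any(second < -1e-9))
```

Neither singular vector can be read directly as an item set, which is the interpretability problem the slide refers to.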

11
Example of SVD for an overlapping item set
(Figure: 1st, 2nd, and 3rd singular vectors)
12
Modification for SDD

SDD (semi-discrete decomposition) has a similar
problem.
Modification:
Use binary elements instead of -1, 0, or 1.
Partition recursively using binary rank-one
approximation. This method provides a
hierarchical representation of dominant patterns.

13
Heuristic for BR1A

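The objective function (the number of nonzeros in
A - x y^T) can be written as
||A - x y^T||_F^2 = ||A||_F^2 - 2 x^T A y + ||x||_2^2 ||y||_2^2
Minimizing it is equivalent to maximizing
C(x, y) = 2 x^T A y - ||x||_2^2 ||y||_2^2
If y is fixed, then s = Ay is fixed. The optimal x is
x(i) = 1 if 2 s(i) >= ||y||_2^2, and x(i) = 0 otherwise
Note: this approach is very similar to SDD.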
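
The update for x with y held fixed can be sketched as follows, assuming the objective to maximize is 2 x^T A y - ||x||^2 ||y||^2 (the expansion of the error ||A - x y^T||_F^2 with the constant term dropped). The function names and the small matrix are illustrative, and breaking ties with >= is an assumption. A brute-force search over all binary x checks that the rule is optimal on this example.

```python
import itertools
import numpy as np

def objective(A, x, y):
    """2 x^T A y - ||x||^2 ||y||^2; for binary vectors the norms are sums."""
    return int(2 * (x @ A @ y) - x.sum() * y.sum())

def best_presence(A, y):
    """For fixed y, row i contributes 2*s_i - ||y||^2, so keep it if >= 0."""
    s = A @ y  # s_i = number of pattern columns present in row i
    return (2 * s >= y.sum()).astype(int)  # tie-breaking with >= is an assumption

A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1]])
y = np.array([1, 1, 0, 0])

x = best_presence(A, y)  # -> [1, 1, 1, 0]

# Exhaustive check over all 2^4 presence vectors.
brute_best = max(objective(A, np.array(c), y)
                 for c in itertools.product([0, 1], repeat=A.shape[0]))
```

Because each row's contribution is independent of the others once y is fixed, the update costs one matrix-vector product.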

14
Recall the Greedy Algorithm for SDD
15
Heuristic for BR1A

The main steps of the heuristic for BR1A are
identical to those of SDD:
Alternately fix x or y, then find the best
possible vector for the other, until no
further improvement can be made.
But how should the initial fixed vector for BR1A
be set up?
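
The alternating scheme described on this slide can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes the objective 2 x^T A y - ||x||^2 ||y||^2, takes an initial pattern vector y0 as given, breaks ties with >=, and uses a hypothetical max_iter cap as a safety limit.

```python
import numpy as np

def objective(A, x, y):
    return int(2 * (x @ A @ y) - x.sum() * y.sum())

def rank_one_heuristic(A, y0, max_iter=100):
    """Alternate between solving for x (y fixed) and for y (x fixed)."""
    A = np.asarray(A, dtype=int)
    y = np.asarray(y0, dtype=int)
    x = (2 * (A @ y) >= y.sum()).astype(int)          # best x for the initial y
    score = objective(A, x, y)
    for _ in range(max_iter):
        y_new = (2 * (A.T @ x) >= x.sum()).astype(int)          # best y for x
        x_new = (2 * (A @ y_new) >= y_new.sum()).astype(int)    # best x for y_new
        new_score = objective(A, x_new, y_new)
        if new_score <= score:   # no further improvement: stop
            break
        x, y, score = x_new, y_new, new_score
    return x, y

A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1]])
x, y = rank_one_heuristic(A, y0=A[0])
# x -> [1, 1, 1, 0], y -> [1, 1, 0, 0]
```

The result depends on y0, which is why the choice of initial vector matters.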

16
Initialization of y

Partition
  Arbitrarily select a column c.
  Identify the rows with a nonzero in column c.
  Use the centroid of these rows as the initial (fixed) y.
Greedy graph growing
  Randomly pick a row r.
  Starting from r, grow a set of rows that share
  nonzeros, up to half of the rows.
  Use the centroid of this set as the initial (fixed) y.
Random row
  Use a randomly selected row as the initial (fixed) y.
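
The three initialization schemes can be sketched as follows. Several details are assumptions that the slide does not specify: the centroid is rounded to a binary vector at a threshold of 0.5, the greedy growth adds the row with the largest overlap at each step, and all function names are hypothetical.

```python
import numpy as np

def binary_centroid(rows):
    """Centroid of a set of rows, rounded to binary (threshold 0.5 is assumed)."""
    return (rows.mean(axis=0) >= 0.5).astype(int)

def init_partition(A, c):
    """Centroid of the rows that have a nonzero in column c."""
    rows = A[A[:, c] != 0]
    if rows.shape[0] == 0:
        return np.zeros(A.shape[1], dtype=int)
    return binary_centroid(rows)

def init_greedy_growing(A, r):
    """Grow a set of rows sharing nonzeros with row r, up to half of the rows."""
    chosen = [r]
    covered = A[r].astype(bool)            # columns touched by the grown set
    target = max(1, A.shape[0] // 2)
    while len(chosen) < target:
        overlap = A @ covered.astype(int)  # shared nonzeros with the grown set
        overlap[chosen] = -1               # never pick a row twice
        nxt = int(np.argmax(overlap))      # assumption: take the largest overlap
        if overlap[nxt] <= 0:              # nothing left that shares a nonzero
            break
        chosen.append(nxt)
        covered |= A[nxt].astype(bool)
    return binary_centroid(A[chosen])

def init_random_row(A, rng):
    """Use one randomly selected row as the initial vector."""
    return A[rng.integers(A.shape[0])].copy()

A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1]])
y_part = init_partition(A, c=0)          # -> [1, 1, 0, 0]
y_grow = init_greedy_growing(A, r=0)     # -> [1, 1, 0, 0]
y_rand = init_random_row(A, np.random.default_rng(0))
```

The random-row scheme is the cheapest of the three, since it needs no pass over the matrix.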

17
Recursive decomposition

Idea: use the presence vector to partition the
given matrix A into two sub-matrices A0 and A1.
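
A minimal sketch of this split is shown below. It assumes that A1 collects the rows where the presence vector is 1 and A0 the rows where it is 0; the slide does not state which is which, and the function name is illustrative.

```python
import numpy as np

def split_by_presence(A, x):
    """Return (A0, A1): rows with x_i = 0 and rows with x_i = 1 (assumed naming)."""
    mask = np.asarray(x, dtype=bool)
    return A[~mask], A[mask]

A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1]])
x = np.array([1, 1, 1, 0])  # presence vector from a rank-one approximation

A0, A1 = split_by_presence(A, x)
# A1 holds the three rows that contain the pattern, A0 the remaining row.
```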

18
Example
19
Recursive decomposition

Apply binary rank-one approximation to the two
sub-matrices recursively until a stopping
criterion is met.
Stopping criteria:
All rows of Ai are present in the presence vector.
The Hamming radius (defined on the next slide) of
a sub-matrix around its pattern vector is within
the prescribed bound.
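
The recursion with both stopping criteria can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the Hamming radius is taken to be the largest Hamming distance from any row of the sub-matrix to the pattern vector (the definition did not survive in this transcript), the radius test is applied to the rows marked present, each rank-one step starts from the first row of the sub-matrix, and eps names the prescribed bound.

```python
import numpy as np

def rank_one(A, y, max_iter=100):
    """Alternating heuristic started from pattern vector y."""
    x = (2 * (A @ y) >= y.sum()).astype(int)
    for _ in range(max_iter):
        y_new = (2 * (A.T @ x) >= x.sum()).astype(int)
        x_new = (2 * (A @ y_new) >= y_new.sum()).astype(int)
        if np.array_equal(x_new, x) and np.array_equal(y_new, y):
            break
        x, y = x_new, y_new
    return x, y

def hamming_radius(A, y):
    """Assumed definition: maximum Hamming distance from a row of A to y."""
    return int(np.abs(A - y).sum(axis=1).max())

def decompose(A, eps):
    """Return a list of (pattern vector, row indices) pairs."""
    A = np.asarray(A, dtype=int)
    result = []
    stack = [np.arange(A.shape[0])]
    while stack:
        idx = stack.pop()
        if idx.size == 0:
            continue
        sub = A[idx]
        x, y = rank_one(sub, sub[0].copy())
        mask = x.astype(bool)
        if mask.all() or not mask.any():
            # Criterion 1: all rows present (an empty split is also a leaf here).
            result.append((y, idx))
            continue
        if hamming_radius(sub[mask], y) <= eps:
            # Criterion 2: the present rows are within the bound of the pattern.
            result.append((y, idx[mask]))
        else:
            stack.append(idx[mask])
        stack.append(idx[~mask])
    return result

A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1]])
patterns = decompose(A, eps=1)
# Two patterns: [1, 1, 0, 0] for rows 0-2 and [0, 0, 1, 1] for row 3.
```

Every non-leaf split produces two strictly smaller sub-matrices, so the recursion terminates.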

20
Hamming Radius
21
Example hierarchical structure
22
Result from a sample binary matrix
(Figure: panels (a)-(f) and the original matrix)
Which one has the best result?
23
Result from a sample binary matrix
(Figure: panels (a)-(f), labeled: original matrix; quantizing SVD rank-29; quantizing most dominant singular vector; SVD rank-6; PROXIMUS; k-means rank-6)
24
Running time scalability
