PROXIMUS an error bounded compression framework - PowerPoint PPT Presentation

Provided by: BAO72
1
PROXIMUS an error bounded compression framework
2
Motivation
  • Analysis of discrete data sets generally leads
    to NP-complete or NP-hard problems, especially
    when physically interpretable results in discrete
    spaces are desired.
  • The focus here is on effective heuristics for
    reducing the problem size of discrete data.
  • Two possible approaches:
  • Probabilistic subsampling
  • Data reduction

3
Proposed Approach
  • Binary (0, 1) nonorthogonal matrix transformation
    to extract dominant patterns
  • In the proposed approach, elements of singular
    vectors of a binary valued matrix are constrained
    to binary entries with an associated singular
    value of 1.

4
PROXIMUS
  • PROXIMUS is a nonorthogonal matrix transform
    based on recursive partitioning of a data set
    depending on the distance of a relation from the
    dominant pattern.
  • The dominant pattern is computed as a binary
    approximation vector of the matrix of relations.
  • Each discovered pattern has a physical
    interpretation at all levels in the hierarchy of
    the recursive process.

5
  • PROXIMUS provides a framework that captures the
    properties of discrete data sets more accurately
    and takes advantage of their binary nature to
    improve both the quality and efficiency of the
    analysis.

6
Error-bounded approximation of binary matrices
7
Binary rank-one approximation
  • Binary rank-one approximation of an m x n binary
    matrix A is the problem of finding two binary
    vectors x and y that maximize the number of zeros
    in the residual matrix A - x y^T.
  • y is the pattern vector (length n): the dominant
    pattern used to approximate rows of A.
  • x is the presence vector (length m) indicating
    the rows of A approximated by y.
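
To make the objective concrete, here is a minimal sketch that counts the nonzeros of A - x y^T for 0/1 data, that is, the entries where A and x y^T disagree. The function name `approximation_error` and the small matrix are illustrative assumptions, not taken from the slides.

```python
def approximation_error(A, x, y):
    """Count the entries where the 0/1 matrix A differs from x y^T."""
    return sum(
        1
        for i, row in enumerate(A)
        for j, a in enumerate(row)
        if a != x[i] * y[j]
    )


# Hypothetical 3 x 3 example: rows 0 and 1 share the pattern (1, 1, 0).
A = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
x = [1, 1, 0]  # presence vector: rows 0 and 1 are approximated by y
y = [1, 1, 0]  # pattern vector
error = approximation_error(A, x, y)
print(error)
```

A smaller count means more zeros in the residual, so maximizing the zeros and minimizing this count are the same problem.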

8
Example
(figure labels: pattern vector, presence vector)
9
Complexity of optimal binary rank-one approximation
  • The authors claim that this problem is NP-hard.
  • Reason: it is closely related to finding maximum
    cliques in graphs.
  • Issue: the reference given for NP-hardness does
    not address this problem directly.

10
Drawbacks of SVD for binary matrices
  • Resulting decomposition contains non-integral
    vector values.
  • Due to the orthogonality constraint of SVD,
    features contained in some of the previously
    discovered patterns are extracted.
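
The first drawback can be seen on a tiny case. The sketch below uses power iteration on A^T A (a stand-in for a full SVD routine, chosen to avoid any library dependency) to approximate the dominant right singular vector of a hypothetical binary matrix with overlapping item sets. The matrix and the function name are assumptions for illustration.

```python
import math


def dominant_right_singular_vector(A, iterations=200):
    """Approximate the top right singular vector of A by power iteration on A^T A."""
    m, n = len(A), len(A[0])
    v = [1.0] * n
    for _ in range(iterations):
        Av = [sum(a * b for a, b in zip(row, v)) for row in A]       # A v
        w = [sum(A[i][j] * Av[i] for i in range(m)) for j in range(n)]  # A^T (A v)
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


# Hypothetical overlapping item sets: {0, 1}, {0, 1, 2}, {1, 2}.
A = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
v = dominant_right_singular_vector(A)
print([round(c, 3) for c in v])
```

Every entry of the resulting vector lies strictly between 0 and 1, so it cannot be read directly as a set of items.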

11
Example of SVD for an overlapping item set
(figure labels: 1st, 2nd, and 3rd singular vectors)
12
Modification for SDD
  • SDD (semi-discrete decomposition) has a similar
    problem.
  • Modifications:
  • Binary elements (0 or 1) instead of -1, 0, or 1.
  • Recursive partitioning using binary rank-one
    approximation. This method provides a
    hierarchical representation of dominant patterns.

13
Heuristic for BR1A
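  • The objective function can be written as
    ||A - x y^T||_F^2 = ||A||_F^2 - 2 x^T A y + ||x||_2^2 ||y||_2^2
  • Minimizing it is equivalent to maximizing
    C(x, y) = 2 x^T A y - ||x||_2^2 ||y||_2^2
  • If y is fixed, then s = A y is fixed. The optimal x is
    x(i) = 1 if 2 s(i) >= ||y||_2^2, and x(i) = 0 otherwise
  • Note that this approach is very similar to SDD.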
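
The fixed-y step can be sketched as follows, assuming the usual thresholding rule for this kind of objective: with s = A y, row i is marked present when 2 s(i) >= ||y||^2. The function name `best_presence_vector` and the example data are hypothetical, and breaking ties in favor of 1 is an assumption.

```python
def best_presence_vector(A, y):
    """For a fixed 0/1 pattern y, pick each x(i) independently to maximize
    2 x^T A y - ||x||^2 ||y||^2."""
    s = [sum(a * b for a, b in zip(row, y)) for row in A]  # s = A y
    y_norm_sq = sum(c * c for c in y)
    # Row i contributes 2*s(i) - ||y||^2 when x(i) = 1, and 0 otherwise.
    return [1 if 2 * s_i >= y_norm_sq else 0 for s_i in s]


A = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
y = [1, 1, 0]
x = best_presence_vector(A, y)
print(x)
```

Because each row's contribution is independent of the others, the rule is optimal for the given y and costs one pass over the matrix.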

14
Recall the Greedy Algorithm for SDD
15
Heuristic for BR1A
  • The main steps of the heuristic for BR1A are
    identical to those of SDD.
  • Alternately fix x or y and find the best possible
    vector for the other, until no further
    improvement can be made.
  • But how should the initial fixed vector for BR1A
    be set up?
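
A minimal sketch of the alternating iteration, assuming the same thresholding rule is applied in both directions (to A for x, and to the transpose of A for y). The names `threshold_update` and `rank_one_heuristic`, the iteration cap, and the choice of the first row as the initial y are illustrative assumptions.

```python
def threshold_update(M, v):
    """Best 0/1 vector u for fixed v: u(i) = 1 when 2 * (M v)(i) >= ||v||^2."""
    norm_sq = sum(c * c for c in v)
    return [
        1 if 2 * sum(a * b for a, b in zip(row, v)) >= norm_sq else 0
        for row in M
    ]


def rank_one_heuristic(A, y0, max_iters=100):
    """Alternate between solving for x and y until neither vector changes."""
    At = [list(col) for col in zip(*A)]  # transpose, used for the y update
    y = list(y0)
    x = threshold_update(A, y)
    for _ in range(max_iters):
        new_y = threshold_update(At, x)
        new_x = threshold_update(A, new_y)
        if new_x == x and new_y == y:
            break
        x, y = new_x, new_y
    return x, y


A = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
x, y = rank_one_heuristic(A, A[0])  # assumed start: first row as pattern
print(x, y)
```

The result is a local optimum that depends on the starting vector, which is why the choice of initialization matters.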

16
Initialization of y
  • Partition:
  • Arbitrarily select a column c.
  • Identify the rows with a nonzero in column c.
  • Use the centroid of these rows as the initial
    fixed vector y.
  • Greedy graph growing:
  • Randomly pick a row r.
  • Starting from r, grow the set with rows that
    share nonzeros, up to half of the rows.
  • Use the centroid of these rows as the initial
    fixed vector y.
  • Random row: use a randomly selected row as the
    initial fixed vector y.
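
As an illustration of the partition scheme, the sketch below reads the slide as taking the centroid of the selected rows as the starting pattern vector y (an interpretation, since a centroid of rows has one entry per column). The function name `partition_init` and the example matrix are hypothetical.

```python
def partition_init(A, c):
    """Centroid of the rows of A that have a nonzero in column c."""
    rows = [row for row in A if row[c] != 0]
    if not rows:
        raise ValueError("column c has no nonzero entries")
    # Column-wise mean of the selected rows; entries may be fractional.
    return [sum(col) / len(rows) for col in zip(*rows)]


A = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
y0 = partition_init(A, 0)
print(y0)
```

The centroid is generally not binary; in this sketch it only seeds the first update, after which the iterates are 0/1 vectors.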

17
Recursive decomposition
  • Idea: use the presence vector x to partition the
    rows of the given matrix A into two sub-matrices,
    A1 (rows where x is 1) and A0 (rows where x is 0).
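
The split itself is a simple row filter. This sketch assumes A1 collects the rows marked 1 in the presence vector and A0 the rows marked 0; the function name `split_by_presence` is hypothetical.

```python
def split_by_presence(A, x):
    """Partition the rows of A by the 0/1 presence vector x."""
    A1 = [row for row, present in zip(A, x) if present == 1]
    A0 = [row for row, present in zip(A, x) if present == 0]
    return A0, A1


A = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
A0, A1 = split_by_presence(A, [1, 1, 0])
print(A0, A1)
```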

18
Example
19
Recursive decomposition
  • Apply binary rank-one decomposition to the two
    sub-matrices recursively until a stopping
    criterion is met.
  • Stopping criteria:
  • All rows of Ai are present in the presence vector.
  • The Hamming radius (defined on the next slide) of
    the sub-matrix around the pattern vector is within
    the prescribed bound.

20
Hamming Radius
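
The definition on this slide did not survive in the transcript. A common reading, assumed here, is that the Hamming radius of a sub-matrix around a pattern vector is the largest Hamming distance between any of its rows and that vector. The function names below are hypothetical.

```python
def hamming_distance(u, v):
    """Number of positions where two equal-length 0/1 vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def hamming_radius(rows, y):
    """Assumed definition: maximum Hamming distance from any row to y."""
    return max(hamming_distance(row, y) for row in rows)


rows = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
radius = hamming_radius(rows, [1, 1, 0])
print(radius)
```

Under this reading, a radius within the prescribed bound means no row is misrepresented by the pattern in more than that many positions.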
21
Example hierarchical structure
22
Result from a sample binary matrix
(figure: panels (a) to (f), one of which is the original matrix)
Which one has the best result?
23
Result from a sample binary matrix
(figure: panels (a) to (f); labels: original matrix, quantizing SVD rank-29,
quantizing most dominant singular vector, SVD rank-6, PROXIMUS, k-means rank-6)
24
Running time scalability