Fast SDP Relaxations of Graph Cut Clustering, Transduction, and Other Combinatorial Problems
Slide 1
Fast SDP Relaxations of Graph Cut Clustering, Transduction, and Other Combinatorial Problems (JMLR 2006)
Tijl De Bie and Nello Cristianini
Presented by Lihan He, March 16, 2007
Slide 2
Outline
  • Statement of the problem
  • Spectral relaxation and eigenvector
  • SDP relaxation and Lagrange dual
  • Generalization between spectral and SDP
  • Transduction and side information
  • Experiments
  • Conclusions

Slide 3
Statement of the problem
Data set: S
Affinity matrix: A
Objective (graph cut clustering): divide the data points into two sets, P and N, so that the cut between them is cheap.
With no labels: clustering. With some labels: transduction.
Slide 4
Statement of the problem
Normalized graph cut problem (NCut)
The cost function trades off the cut cost against how well the two clusters are balanced.
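The cost function on the original slide is an image and is lost in this transcript. As a reference sketch, the standard normalized-cut cost (Shi and Malik's formulation, on which this line of work builds), for a partition of the vertex set V into P and N, is:

```latex
\mathrm{NCut}(P,N) \;=\; \frac{\mathrm{cut}(P,N)}{\mathrm{assoc}(P,V)} \;+\; \frac{\mathrm{cut}(P,N)}{\mathrm{assoc}(N,V)},
\qquad
\mathrm{cut}(P,N) = \sum_{i \in P,\, j \in N} A_{ij},
\qquad
\mathrm{assoc}(P,V) = \sum_{i \in P,\, j \in V} A_{ij}.
```

The first factor is the cut cost; the normalization by the associations is what rewards balanced clusters.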
Slide 5
Statement of the problem
Normalized graph cut problem (NCut)
Unknown label vector: y, with entries y_i in {+1, -1}.
Writing the NCut cost in matrix form turns NCut into a combinatorial optimization problem (1).
This problem is NP-complete: solving it exactly requires searching a label space that grows exponentially with the number of data points.
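The matrix form (1) is likewise an image lost in the transcript. A hedged sketch of the standard matrix formulation, writing D = diag(A·1) for the degree matrix and L = D − A for the graph Laplacian:

```latex
\min_{\tilde y}\;\; \frac{\tilde y^{\top} L\, \tilde y}{\tilde y^{\top} D\, \tilde y}
\quad \text{s.t.} \quad
\tilde y_i \in \{1, -b\}, \qquad \tilde y^{\top} D\, \mathbf{1} = 0,
```

where b > 0 is determined by the balance of the two clusters; the discrete constraint on the entries of the label vector is what makes the problem combinatorial.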
Slide 6
Spectral Relaxation
Substitute a continuous vector v for the discrete labels. Relaxing the constraints by adding a normalization constraint and dropping the combinatorial (entrywise) constraints on v, we obtain the spectral clustering relaxation (2).
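The relaxation (2) is also an image on the slide. Under the standard formulation (degree matrix D = diag(A·1), Laplacian L = D − A), it reads, as a sketch:

```latex
\min_{v}\;\; v^{\top} L\, v
\quad \text{s.t.} \quad
v^{\top} D\, v = 1, \qquad v^{\top} D\, \mathbf{1} = 0.
```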
Slide 7
Spectral Relaxation: eigenvector
Solution: the eigenvector corresponding to the second smallest generalized eigenvalue of the pair (L, D).
Solving the constrained optimization via the Lagrangian shows that stationary points satisfy the generalized eigenvalue equation L v = λ D v. The second constraint is automatically satisfied: generalized eigenvectors are D-orthogonal, and the constant vector is the eigenvector of the smallest eigenvalue (λ = 0).
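The slide's solution can be computed directly. A minimal sketch in NumPy/SciPy, assuming the standard Laplacian L = D − A and thresholding the second generalized eigenvector at zero to read off a two-way cut (the function name `spectral_cut` is illustrative, not from the paper):

```python
import numpy as np
from scipy.linalg import eigh

def spectral_cut(A):
    """Two-way spectral clustering via the generalized eigenproblem L v = lam D v."""
    d = A.sum(axis=1)
    D = np.diag(d)
    L = D - A                       # graph Laplacian
    # eigh solves the symmetric-definite generalized problem; eigenvalues ascend
    vals, vecs = eigh(L, D)
    v = vecs[:, 1]                  # second smallest generalized eigenvector
    return np.where(v >= 0, 1, -1)  # threshold at zero -> cluster labels

# Toy graph: two dense blocks joined by one weak edge
A = np.zeros((6, 6))
A[:3, :3] = 1
A[3:, 3:] = 1
np.fill_diagonal(A, 0)
A[2, 3] = A[3, 2] = 0.1             # weak link between the blocks
labels = spectral_cut(A)
```

The smallest eigenvalue is 0 with eigenvector 1, so the second eigenvector is the first informative one; its sign pattern separates the two weakly connected blocks.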
Slide 8
SDP Relaxation
Lifting the label vector to a matrix variable (Γ = y yᵀ in the combinatorial problem) makes the cost linear in Γ. Note that any such Γ is positive semidefinite with unit diagonal. Relaxing by adding these (convex) constraints and dropping the nonconvex rank-one requirement, we obtain the SDP relaxation (3).
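The SDP (3) is an image on the slide. As a hedged sketch, the max-cut-style lifted problem over the Laplacian L (the paper's exact normalization of the NCut objective may differ) is:

```latex
\min_{\Gamma}\;\; \langle L, \Gamma \rangle
\quad \text{s.t.} \quad
\operatorname{diag}(\Gamma) = \mathbf{1}, \qquad \Gamma \succeq 0,
```

obtained from Γ = y yᵀ by keeping the diagonal and positive-semidefiniteness constraints and dropping rank(Γ) = 1.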
Slide 9
SDP Relaxation: Lagrange dual
Forming the Lagrangian and minimizing over the primal variable yields the dual problem (4). Strong duality holds, so the dual optimum equals the SDP optimum.
The dual has n + 1 variables.
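For the max-cut-style primal sketched earlier (min ⟨L, Γ⟩ subject to diag(Γ) = 1, Γ ⪰ 0), the standard Lagrange dual has the same shape as the paper's (4); this plain version has n dual variables, and the paper's extra normalization constraint contributes one more, giving n + 1:

```latex
\max_{\lambda \in \mathbb{R}^{n}}\;\; \mathbf{1}^{\top} \lambda
\quad \text{s.t.} \quad
L - \operatorname{diag}(\lambda) \succeq 0.
```

The derivation: the Lagrangian ⟨L, Γ⟩ − λᵀ(diag(Γ) − 1) = ⟨L − diag(λ), Γ⟩ + 1ᵀλ is bounded below over Γ ⪰ 0 exactly when L − diag(λ) ⪰ 0.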
Slide 10
Generalization between spectral and SDP
A cascade of relaxations, tighter than the spectral relaxation and looser than the SDP, is obtained by restricting the SDP variable to the span of the columns of a matrix W.
The full SDP imposes n constraints; the restricted version imposes only m constraints (m ≤ n), so it is looser than the SDP, and its dual has m + 1 variables.
Designing how to relax the constraints amounts to designing the structure of W.
Slide 11
Generalization between spectral and SDP
  • rank(W) = n: original SDP relaxation.
  • rank(W) = 1 (m = 1), W = d: spectral relaxation.
  • A relaxation is tighter than another if the
    column space of the matrix W used in the first
    one contains the full column space of the W of the
    second.
  • If d is chosen within the column space of W, then
    all relaxations in the cascade are tighter than
    the spectral relaxation.
  • One approach to designing W, proposed by the
    authors:
  • Sort the entries of the label vector (the second
    eigenvector) from the spectral relaxation.
  • Reorder the data points by this sorted order.
  • Construct a partition into m subsets that are roughly
    equally large (about n/m points each).
  • Take W as the block indicator matrix of this
    partition: column j has ones on the j-th subset
    and zeros elsewhere.
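The W construction above can be sketched in a few lines; `block_indicator_W` and the example vector are illustrative names, not from the paper:

```python
import numpy as np

def block_indicator_W(v, m):
    """Build the block indicator matrix W from a spectral label vector v.

    Data points are sorted by v, split into m roughly equal consecutive
    subsets, and W[i, j] = 1 iff point i falls in subset j.
    (A sketch of the W-design heuristic described on the slide.)
    """
    n = len(v)
    order = np.argsort(v)            # sort points by the 2nd eigenvector
    W = np.zeros((n, m))
    # np.array_split yields m contiguous chunks of size about n/m
    for j, chunk in enumerate(np.array_split(order, m)):
        W[chunk, j] = 1
    return W

v = np.array([0.9, -0.5, 0.1, -0.8, 0.4, -0.1])
W = block_indicator_W(v, 3)          # 6 points, 3 blocks of 2
```

Each data point lands in exactly one block, so every row of W has a single 1, and each column marks one of the m contiguous chunks of the sorted order.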
Slide 12
Transduction
Given some labels, written as a label vector yt, the problem becomes transductive.
Reparameterize the SDP variable so that the label constraints are imposed by construction:
  • Rows (columns) corresponding to oppositely
    labeled training points automatically are
    each other's opposites.
  • Rows (columns) corresponding to same-labeled
    training points are equal to each other.
Slide 13
Transduction
Transductive NCut relaxation: after the reparameterization, the dual has only n_test + 2 variables.
Slide 14
General constraints
  • An equivalence constraint between two sets of
    data points specifies that they belong to the
    same class.
  • An inequivalence constraint specifies that two sets of
    data points belong to opposite classes.
  • In neither case is detailed label information provided.
Slide 15
Experiments
1. Toy problems
Affinity matrix
Slide 16
Experiments
2. Clustering and transduction on text
Data set: 195 articles, in 4 languages and on several topics.
Distance between two articles: cosine distance on the bag-of-words representation, over a defined dictionary.
Affinity matrix: 20-nearest-neighbor based, with A(i, j) = 1 if i and j are among each other's 20 nearest neighbors, 0.5 if only one of them is among the other's 20 nearest neighbors, and 0 otherwise.
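The affinity construction can be sketched as follows; `knn_affinity` and the random data are illustrative stand-ins (the paper uses bag-of-words vectors of the 195 articles):

```python
import numpy as np

def knn_affinity(X, k=20):
    """Symmetric k-NN affinity: 1 for mutual neighbors, 0.5 one-sided, 0 otherwise.

    Rows of X are bag-of-words vectors; cosine distance = 1 - cosine similarity,
    so the k nearest neighbors are the k most cosine-similar rows.
    (A sketch of the affinity construction described on the slide.)
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # row-normalize
    sim = Xn @ Xn.T                                     # cosine similarity
    np.fill_diagonal(sim, -np.inf)                      # exclude self-matches
    n = len(X)
    # N[i, j] = 1 iff j is among i's k nearest neighbors (largest similarity)
    N = np.zeros((n, n))
    for i in range(n):
        N[i, np.argsort(sim[i])[-k:]] = 1
    return (N + N.T) / 2    # 1 if mutual, 0.5 if one-sided, 0 otherwise

rng = np.random.default_rng(0)
X = rng.random((10, 5)) + 0.1   # toy "documents"; offset avoids zero rows
A = knn_affinity(X, k=3)
```

Averaging N with its transpose is what produces exactly the three affinity levels 1, 0.5, and 0.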
Slide 17
Experiments
2. Clustering and transduction on text: cost
[Figure: cost vs. fraction of labeled data points, for the by-language and by-topic tasks. Curves: spectral and SDP, each shown as a randomized-rounding cost and as a relaxation lower bound. For each method, lower bound ≤ optimal cost ≤ randomized-rounding cost.]
Slide 18
Experiments
2. Clustering and transduction on text: accuracy
[Figure: accuracy vs. fraction of labeled data points, for the by-language and by-topic tasks, comparing SDP (randomized rounding) against spectral (randomized rounding).]
Slide 19
Conclusions
  • Proposed a new cascade of SDP relaxations of the
    NP-complete normalized graph cut optimization
    problem.
  • One extreme of the cascade is the spectral relaxation;
    the other extreme is the newly proposed SDP relaxation.
  • Applies to unsupervised and semi-supervised learning,
    and to more general equivalence/inequivalence constraints.
  • The cascade balances computational cost against the
    accuracy of the relaxation.