Title: Pattern Analysis using Convex Optimization: Part 2 of Chapter 7 Discussion
Slide 1: Pattern Analysis using Convex Optimization: Part 2 of Chapter 7 Discussion
Slide 2: About Today's Discussion
- Last time: discussed convex optimization
- Today: apply what we learned to four pattern analysis problems given in the book:
  - (1) Smallest enclosing hypersphere (one-class SVM)
  - (2) SVM classification
  - (3) Support vector regression (SVR)
  - (4) On-line classification and regression
Slide 3: About Today's Discussion
- This time, for the most part, we will:
  - Describe the problems
  - Derive the solutions ourselves on the board!
  - Apply our convex optimization knowledge to solve them
- Mostly board work today
Slide 4: Recall KKT Conditions
- What we will use today (a refresher is sketched below)
- The key to remembering Chapter 7:
- Complementary slackness -> sparse dual representation
- Convexity -> efficient global solution
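For reference, a standard statement of the KKT conditions for a convex problem min f(x) subject to g_i(x) <= 0 (sketched here from the usual definitions, not copied from the slides):

    \begin{aligned}
    \nabla f(x^*) + \sum_i \alpha_i^* \nabla g_i(x^*) &= 0 && \text{(stationarity)} \\
    g_i(x^*) &\le 0 && \text{(primal feasibility)} \\
    \alpha_i^* &\ge 0 && \text{(dual feasibility)} \\
    \alpha_i^*\, g_i(x^*) &= 0 && \text{(complementary slackness)}
    \end{aligned}

Complementary slackness forces alpha_i^* = 0 whenever g_i(x^*) < 0, which is exactly why only the active ("support") constraints appear in the dual representation.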
Slide 5: Novelty Detection Hypersphere
- Use the training data to learn the support of the distribution
- Capture it with a hypersphere
- Points falling outside are flagged as novel, abnormal, or anomalous
- A smaller sphere gives more fine-tuned novelty detection
Slide 6: First Problem: Smallest Enclosing Hypersphere
- Given a training set S
- Find the center c of the smallest hypersphere containing S
Slide 7: S.E.H. Optimization Problem
- The optimization problem (stated below)
- Let's solve it using the Lagrangian and KKT conditions, and discuss
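In symbols, a standard statement of the problem (with feature map phi induced by the kernel, as in the book's setup):

    \begin{aligned}
    \min_{c,\, r} \quad & r^2 \\
    \text{s.t.} \quad & \|\phi(x_i) - c\|^2 \le r^2, \qquad i = 1, \dots, \ell
    \end{aligned}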
Slide 8: Cheat
Slide 9: S.E.H. Solution
- Dual problem (reconstructed; this is the standard S.E.H. dual in the book's setup):

    \begin{aligned}
    \max_{\alpha} \quad & W(\alpha) = \sum_{i=1}^{\ell} \alpha_i\, \kappa(x_i, x_i) - \sum_{i,j=1}^{\ell} \alpha_i \alpha_j\, \kappa(x_i, x_j) \\
    \text{s.t.} \quad & \sum_{i=1}^{\ell} \alpha_i = 1, \qquad \alpha_i \ge 0
    \end{aligned}

- Primal solution recovered at the optimum: c^* = \sum_i \alpha_i^* \phi(x_i), with the squared radius equal to the optimal dual value (a numerical sketch follows)
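A minimal numerical sketch of this dual using cvxpy (my own toy example, not the book's code; seh_dual and the linear-kernel choice are illustrative):

    import numpy as np
    import cvxpy as cp

    def seh_dual(K):
        """Solve the smallest-enclosing-hypersphere dual for kernel matrix K."""
        n = K.shape[0]
        a = cp.Variable(n)
        # maximize sum_i a_i K_ii - a^T K a
        objective = cp.Maximize(cp.sum(cp.multiply(a, np.diag(K))) - cp.quad_form(a, K))
        problem = cp.Problem(objective, [cp.sum(a) == 1, a >= 0])
        problem.solve()
        return a.value

    X = np.random.randn(20, 2)                    # toy 2-D data
    K = X @ X.T + 1e-8 * np.eye(len(X))           # linear kernel (jitter keeps K numerically PSD)
    alpha = seh_dual(K)
    center = alpha @ X                            # c* = sum_i alpha_i x_i
    r2 = alpha @ np.diag(K) - alpha @ K @ alpha   # squared radius = optimal dual value

Only points on the sphere's surface receive alpha_i > 0, matching the complementary-slackness argument above.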
Slide 10: Theorem on the Bound of False Positives
Slide 11: Hypersphere That Only Contains Some Data (Soft Hypersphere)
- Balance missing some points against reducing the radius
- Robustness: a single point could throw off the hard sphere
- Introduce slack variables (an approach we will use repeatedly)
- Slack is 0 within the sphere, and the squared distance outside it
Slide 12: Hypersphere Optimization Problem
- Now with a trade-off between the radius and the training-point error (stated below)
- Let's derive the solution again
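With slack variables xi_i and a trade-off parameter C, the standard soft formulation reads:

    \begin{aligned}
    \min_{c,\, r,\, \xi} \quad & r^2 + C \sum_{i=1}^{\ell} \xi_i \\
    \text{s.t.} \quad & \|\phi(x_i) - c\|^2 \le r^2 + \xi_i, \qquad \xi_i \ge 0
    \end{aligned}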
Slide 13: Cheat
Slide 14: Soft Hypersphere Solution
Slide 15: Linear Kernel Example
Slide 16: Similar Theorem
Slide 17: Remarks
- If the data lies in a subspace of the feature space:
  - The hypersphere overestimates the support in the perpendicular directions
  - Kernel PCA can help (next week's discussion)
- If the data is normalized (k(x, x) = 1):
  - The problem corresponds to separating the data from the origin with a hyperplane
Slide 18: Maximal Margin Classifier
- Data and a linear classifier
- Hinge loss, with margin gamma (definitions below)
- Linearly separable if some (w, b) achieves a positive margin on every training point
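The standard definitions, for reference: with decision function g(x) = <w, phi(x)> + b, the hinge loss at functional margin gamma and the margin of the training set are

    \mathcal{L}_{\text{hinge}}\bigl(y, g(x)\bigr) = \max\bigl(0,\; \gamma - y\, g(x)\bigr),
    \qquad
    \gamma(S) = \min_{1 \le i \le \ell} y_i\, g(x_i)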
Slide 19: Margin Example
Slide 20: Typical Formulation
- The typical formulation fixes the functional margin gamma to 1 and lets the norm of w vary; since rescaling doesn't affect the decision boundary, the geometric margin is proportional to 1/||w||
- Here we instead fix ||w|| = 1 and let the functional margin gamma vary
Slide 21: Hard Margin SVM
- We arrive at the optimization problem below
- Let's solve it
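In the fixed-norm form just described, the optimization problem is:

    \begin{aligned}
    \max_{w,\, b,\, \gamma} \quad & \gamma \\
    \text{s.t.} \quad & y_i \bigl(\langle w, \phi(x_i)\rangle + b\bigr) \ge \gamma, \quad i = 1, \dots, \ell, \\
    & \|w\|^2 = 1
    \end{aligned}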
Slide 22: Cheat
Slide 23: Solution
Slide 24: Example with a Gaussian Kernel
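To play with this kind of example numerically, scikit-learn's SVC with an RBF kernel is a convenient stand-in for the board example (a sketch; the data and parameter values here are my own, not the book's):

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1.0, 0.5, (20, 2)),   # class -1 cluster
                   rng.normal(+1.0, 0.5, (20, 2))])  # class +1 cluster
    y = np.array([-1] * 20 + [1] * 20)

    # Very large C approximates the hard margin; gamma sets the Gaussian width.
    clf = SVC(kernel="rbf", C=1e6, gamma=1.0).fit(X, y)
    print("support vectors:", clf.support_.size)     # sparsity from complementary slackness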
Slide 25: Soft Margin Classifier
- Non-separable case
- Introduce slack variables as before
- Trade off against the 1-norm of the error vector (standard formulation below)
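In the more common gamma-fixed-to-1 form (the book's fixed-norm variant differs only by the rescaling discussed on slide 20), the 1-norm soft margin problem is:

    \begin{aligned}
    \min_{w,\, b,\, \xi} \quad & \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{\ell} \xi_i \\
    \text{s.t.} \quad & y_i \bigl(\langle w, \phi(x_i)\rangle + b\bigr) \ge 1 - \xi_i, \qquad \xi_i \ge 0
    \end{aligned}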
Slide 26: Solve the Soft Margin SVM
Slide 27: Soft Margin Solution
Slide 28: Soft Margin Example
Slide 29: Support Vector Regression
- A similar idea to classification, except turned inside-out
- Epsilon-insensitive loss instead of the hinge loss
- (Ridge regression, by contrast, uses the squared-error loss)
Slide 30: Support Vector Regression
- But we want to encourage sparseness
- That requires inequality constraints
- Hence the epsilon-insensitive loss
Slide 31: Epsilon-Insensitive Loss
- Defines a band around the function within which the loss is zero (in symbols below)
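In symbols, the linear epsilon-insensitive loss:

    \mathcal{L}^{\epsilon}\bigl(y, g(x)\bigr) = \max\bigl(0,\; |y - g(x)| - \epsilon\bigr)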
Slide 32: SVR (Linear Epsilon-Insensitive Loss)
- The optimization problem (below)
- Let's solve it again
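A standard statement of the primal, with one set of slacks for each side of the epsilon-band:

    \begin{aligned}
    \min_{w,\, b,\, \xi,\, \hat{\xi}} \quad & \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{\ell} \bigl(\xi_i + \hat{\xi}_i\bigr) \\
    \text{s.t.} \quad & \bigl(\langle w, \phi(x_i)\rangle + b\bigr) - y_i \le \epsilon + \xi_i, \\
    & y_i - \bigl(\langle w, \phi(x_i)\rangle + b\bigr) \le \epsilon + \hat{\xi}_i, \\
    & \xi_i,\, \hat{\xi}_i \ge 0
    \end{aligned}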
Slide 33: SVR Dual and Solution
Slide 34: On-line Learning
- So far: batch learning, with all data processed at once
- Many tasks require the data to be processed one example at a time, from the start
- The learner:
  - Makes a prediction
  - Gets feedback (the correct value)
  - Updates its hypothesis
- A conservative learner updates only on non-zero loss
Slide 35: Simple On-line Algorithm: the Perceptron
- Thresholded linear function h(x) = sgn(<w, phi(x)>)
- At step t+1, the weights are updated if the prediction was in error
- Dual update rule: on a mistake on example i, alpha_i <- alpha_i + 1
- The update fires if y_i g(x_i) <= 0 (a mistake)
Slide 36: Algorithm Pseudocode
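A runnable sketch of the dual (kernel) perceptron the pseudocode describes, with zero threshold; the function and variable names here are mine:

    import numpy as np

    def kernel_perceptron(K, y, epochs=100):
        """Dual perceptron: K is the kernel matrix, y has entries in {-1, +1}."""
        n = len(y)
        alpha = np.zeros(n)
        for _ in range(epochs):
            mistakes = 0
            for i in range(n):
                # dual decision function: g(x_i) = sum_j alpha_j y_j K(x_j, x_i)
                if y[i] * ((alpha * y) @ K[:, i]) <= 0:
                    alpha[i] += 1          # conservative: update only on a mistake
                    mistakes += 1
            if mistakes == 0:              # separable case: converged (Novikoff)
                break
        return alpha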
Slide 37: Novikoff's Theorem
- A convergence bound for the hard-margin case
- If the training points are contained in a ball of radius R around the origin,
- w* is the hard-margin SVM solution (no bias) with geometric margin gamma,
- and the initial weight vector is w_0 = 0,
- then the number of updates is bounded by (R/gamma)^2
Slide 38: Proof
- The proof follows from two inequalities (spelled out below)
- Putting these together bounds the inner product against the norm
- Which leads to the bound t <= (R/gamma)^2
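The two inequalities, reconstructed from the standard proof (w_0 = 0, t counts updates, and ||w*|| = 1):

    \langle w_t, w^* \rangle \ge \langle w_{t-1}, w^* \rangle + \gamma \ge t\gamma,
    \qquad
    \|w_t\|^2 \le \|w_{t-1}\|^2 + R^2 \le t R^2

Putting these together via Cauchy-Schwarz, t\gamma \le \langle w_t, w^* \rangle \le \|w_t\| \le \sqrt{t}\, R, hence t \le (R/\gamma)^2.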
Slide 39: Kernel Adatron
- A simple modification of the perceptron that models the hard-margin SVM with zero threshold (a sketch follows)
- At convergence each alpha_i stops changing: either alpha_i is positive and the right-hand term is zero, or the right-hand term is negative (and alpha_i stays at zero), that is, the KKT conditions hold
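A sketch of the kernel Adatron loop as just described (the learning rate eta, the tolerance, and the np.clip handling of C are my choices; pass a finite C to get the 1-norm soft margin version of the next slide):

    import numpy as np

    def kernel_adatron(K, y, eta=0.1, epochs=1000, tol=1e-6, C=np.inf):
        """Hard-margin kernel Adatron with zero threshold (finite C => soft margin)."""
        n = len(y)
        alpha = np.zeros(n)
        for _ in range(epochs):
            biggest_change = 0.0
            for i in range(n):
                # "right term": gradient step on the dual objective for alpha_i
                step = eta * (1.0 - y[i] * ((alpha * y) @ K[:, i]))
                new = float(np.clip(alpha[i] + step, 0.0, C))
                biggest_change = max(biggest_change, abs(new - alpha[i]))
                alpha[i] = new
            if biggest_change < tol:   # alphas stopped changing -> KKT satisfied
                break
        return alpha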
Slide 40: Kernel Adatron: Soft Margin
- 1-norm soft margin version:
  - Add an upper bound C on the values of alpha
- 2-norm soft margin version:
  - Add a constant to the diagonal of the kernel matrix
- SMO:
  - To allow a variable threshold, updates must be made on a pair of examples at once
  - This results in the SMO algorithm
- The rate of convergence of both algorithms is sensitive to the order in which examples are processed
- Good heuristics exist, e.g. choose the points that most violate the conditions first
Slide 41: On-line Regression
- The same approach also works for the regression case
- Basic gradient ascent on the dual, with additional constraints
Slide 42: On-line SVR
Slide 43: Questions