Anomaly detection through Bayesian Support Vector Machines

AMSC663 Project Proposal

Transcript and Presenter's Notes
1
Anomaly detection through Bayesian Support Vector Machines
  • Vasilis A. Sotiris
  • Michael Pecht

2
Detection Algorithm
[Figure: flow diagram of the detection algorithm. Training data in the
input space R^(n x m) are decomposed by the Karhunen-Loève expansion into
a model space R^(k x m), k < n, and a residual space R^(l x m), l < n;
a decision boundary (positive vs. negative class) is trained in each
subspace, and the two outputs are combined into the final decision.]
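
The Karhunen-Loève expansion here amounts to a principal-component split
of the data. A minimal sketch of that decomposition, assuming it is
computed via an SVD of the centered data (function and variable names
are mine, not from the talk):

    import numpy as np

    def kl_decompose(X, k):
        """Split X (n features x m observations) into projections onto the
        top-k Karhunen-Loève (principal) directions (model space) and the
        remaining directions (residual space)."""
        # Center each feature (row) before the expansion
        Xc = X - X.mean(axis=1, keepdims=True)
        # Principal directions from the SVD of the centered data
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        P_model = U[:, :k]          # dominant directions (model space)
        P_resid = U[:, k:]          # remaining directions (residual space)
        X_model = P_model.T @ Xc    # k x m projection
        X_resid = P_resid.T @ Xc    # (n-k) x m projection
        return X_model, X_resid

    # Example: 10 features, 200 observations, 3-dimensional model space
    X = np.random.randn(10, 200)
    X_model, X_resid = kl_decompose(X, k=3)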
3
Basics: Linear Classification - Separable
  • For separable data the SVM finds a function D(x)
    that best separates the two classes (maximum
    margin M)
  • The function D(x) can be used as a classifier
  • Through the support vectors we can
  •   compress the input space
  •   detect anomalies
  • By minimizing the norm of w we find the line or
    linear hyperplane that best separates the two
    classes (see the formulation after the figure)
  • The decision function is linear in the weight
    vector w

[Figure: Normal and Abnormal classes in the (x1, x2) plane, separated by
the optimal line D(x) with margin M and normal vector w; the training
support vectors lie on the margin and carry the Lagrange multipliers.]
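
The optimization these bullets refer to is the standard hard-margin SVM;
the slide does not spell it out, so the following is the textbook form:

    \min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^2
    \quad\text{subject to}\quad y_i\,(w \cdot x_i + b) \ge 1,
    \quad i = 1, \dots, m

The margin is M = 2 / ||w||, and with the Lagrange multipliers αi
(nonzero only for the support vectors) the decision function becomes

    D(x) = w \cdot x + b
         = \sum_{i \in \text{SV}} \alpha_i\, y_i\,(x_i \cdot x) + b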
4
Basics: Linear Classification - Inseparable
  • Maximize the margin M and minimize the sum of
    slack errors ξi (soft-margin formulation, sketched
    after the figure)
  • The function D(x) can again be used as a classifier
    (incorporating a degree of error)

[Figure: inseparable case in the (x1, x2) plane; points on the wrong side
of the margin M carry slack errors ξi. Legend: training support vectors,
new observation vector.]
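
The soft-margin objective behind this slide, in its standard form (C is
the user-set trade-off parameter that slide 7 revisits):

    \min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w \rVert^2
        + C \sum_{i=1}^{m} \xi_i
    \quad\text{subject to}\quad
    y_i\,(w \cdot x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0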
5
Nonlinear classification
  • For inseparable data the SVM finds a nonlinear
    function D(x) that best separates the two classes
    through
  •   a kernel map k(·,·), where
  •   k(xi, x) = Φ(xi) · Φ(x)
  • An example of a feature map: Φ(x) = [x², √2·x, 1]^T
    (see the check below the figure)
  • The decision function D(x) requires only the dot
    product of the feature map Φ, so it
  •   uses the same mathematical framework as the
      linear classifier
  • The class y of a data point is determined by the
    sign of D(x)

[Figure: nonlinear decision boundary in the (x1, x2) plane separating the
+1 and -1 classes.]
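
For this particular feature map the induced kernel is the quadratic
polynomial k(x, z) = (xz + 1)², since Φ(x)·Φ(z) = x²z² + 2xz + 1. A small
sketch verifying the identity numerically (the function names are mine):

    import numpy as np

    def phi(x):
        # Feature map from the slide: Phi(x) = [x^2, sqrt(2)*x, 1]^T
        return np.array([x**2, np.sqrt(2) * x, 1.0])

    def k(x, z):
        # Polynomial kernel inducing the same inner product
        return (x * z + 1.0) ** 2

    x, z = 0.7, -1.3
    print(np.dot(phi(x), phi(z)))  # 0.0081
    print(k(x, z))                 # 0.0081; same value, no explicit map needed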
6
Nonlinear classification for detection
[Figure: three panels. Data in the input space are mapped to the feature
space by Φ, the linear solution is found there, and the resulting
nonlinear decision boundary is mapped back to the input space.]
  • Given a training data set that contains the
    normal and artificial abnormal data points (blue
    crosses and red circles, respectively)
  • Solve the linear optimization problem to find w
    and b in the feature space
  • Form a nonlinear decision function by mapping
    back to the input space using the same kernel
    mapping
  • The result is a decision boundary fit to the
    given training set that can be used to classify
    new observations (a sketch follows)
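
A minimal sketch of this detection scheme using scikit-learn's SVC with an
RBF kernel (a substitution on my part; the slides name neither a kernel
nor a library), training on normal data plus artificially generated
abnormal points:

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    # Normal training data (blue crosses): a tight cluster at the origin
    X_normal = rng.normal(0.0, 1.0, size=(200, 2))
    # Artificial abnormal data (red circles): points scattered outside it
    X_abnormal = rng.uniform(-6.0, 6.0, size=(40, 2))
    X_abnormal = X_abnormal[np.linalg.norm(X_abnormal, axis=1) > 3.5]

    X = np.vstack([X_normal, X_abnormal])
    y = np.hstack([np.ones(len(X_normal)), -np.ones(len(X_abnormal))])

    # Kernel SVM: linear in feature space, nonlinear boundary in input space
    clf = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X, y)

    # Classify new observations by the sign of D(x)
    x_new = np.array([[0.2, -0.4], [5.0, 5.0]])
    print(clf.predict(x_new))            # e.g. [ 1. -1.]
    print(clf.decision_function(x_new))  # signed score D(x)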

7
Need for a soft decision boundary
  • Class predictions are not probabilistic
  • The SVM output is a hard binary decision
  • We want to estimate the conditional distribution
    p(y|x) to capture uncertainty
  • User-defined model parameters such as C can lead
    to poor generalization
  • Bayesian methods can determine the model
    parameters (one common softening of the output is
    sketched below)

[Figure: a point near the hard decision boundary that could be a false
alarm; soft vs. hard decision boundaries, and the likelihood function
plotted against D(x).]
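
One standard way to obtain p(y|x) from an SVM score is Platt scaling,
which fits a sigmoid p(y=1|x) = 1 / (1 + exp(A·D(x) + B)) to held-out
scores; scikit-learn does this when probability=True. This is an
illustration only, not the talk's Bayesian treatment:

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (200, 2)),   # normal class (+1)
                   rng.normal(5, 1, (40, 2))])   # artificial abnormal (-1)
    y = np.hstack([np.ones(200), -np.ones(40)])

    # probability=True adds Platt scaling on top of the SVM scores
    clf = SVC(kernel="rbf", gamma=0.5, probability=True).fit(X, y)
    x_new = np.array([[0.0, 0.0], [2.5, 2.5], [5.0, 5.0]])
    print(clf.predict(x_new))        # hard decisions: sign of D(x)
    print(clf.predict_proba(x_new))  # soft decisions: p(y|x) per class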
8
Validation
  • Training data: simulate an (n x m) matrix of
    observations
  • Test data: use the training data and inject a fault
  • Construct D(x) with the BSVM on the training data
  • Validation criteria (see the sketch below):
  •   detect the injected faults
  •   reduce false alarms (compared to the standard SVM)
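
A minimal sketch of this validation loop. Since the BSVM itself is the
project's deliverable, a standard kernel SVM stands in for it here, and
the fault model (a mean shift on the last 30 observations) is my
assumption:

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(1)
    n, m = 2, 300  # n features, m observations (kept small for the sketch)

    # Simulated healthy training data (scikit-learn wants rows = observations)
    X_train = rng.normal(0.0, 1.0, size=(m, n))

    # Test data: copy the training data and inject a fault, here a mean
    # shift on the last 30 observations
    X_test = X_train.copy()
    X_test[-30:] += 4.0

    # Stand-in detector: kernel SVM on healthy data plus artificial abnormal
    # points, as on slide 6 (the project's BSVM would replace this)
    X_abn = rng.uniform(-8.0, 8.0, size=(300, n))
    X_abn = X_abn[np.linalg.norm(X_abn, axis=1) > 3.5]  # keep clearly abnormal
    X = np.vstack([X_train, X_abn])
    y = np.hstack([np.ones(m), -np.ones(len(X_abn))])
    clf = SVC(kernel="rbf", gamma=0.5).fit(X, y)

    pred = clf.predict(X_test)
    print("faults detected:", int((pred[-30:] == -1).sum()), "of 30")
    print("false alarms:   ", int((pred[:-30] == -1).sum()), "of", m - 30)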

9
BACKUP - Bayesian Classifier Design
  • Loss functions for a soft and a hard decision
    boundary (standard forms given below the figure)

[Figure: loss as a function of D(x), plotted for y = -1 and y = +1, for
the hard and soft decision boundaries.]
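
The plotted curves are consistent with the standard forms, assumed here
since the slide shows only the plot: the hinge loss for the hard boundary
and the negative log-likelihood of a sigmoid for the soft one:

    \ell_{\text{hard}}\bigl(y, D(x)\bigr) = \max\bigl(0,\ 1 - y\,D(x)\bigr)
    \qquad
    \ell_{\text{soft}}\bigl(y, D(x)\bigr) = \ln\bigl(1 + e^{-y\,D(x)}\bigr)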