The Perceptron Algorithm (dual form) - PowerPoint PPT Presentation

1
The Perceptron Algorithm (dual form)
2
  • The perceptron algorithm works by
  • adding misclassified positive training examples,
    or
  • subtracting misclassified negative ones from an
    initial, arbitrary weight vector
  • We assumed that the initial weight vector is the
    zero vector, so the final hypothesis will be a
  • linear combination of the training points:
    w = Σi ai yi xi

ai: the number of times misclassification of xi has
caused the weight vector to be updated
(the update step of the perceptron algorithm, primal form).
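The primal update described above can be sketched in code as follows; the function name, the epoch cap, and the explicit bias term are my own illustrative choices, not from the slides.

```python
# A minimal sketch of the primal-form perceptron update.
# Names (perceptron_primal, max_epochs) are illustrative, not from the slides.
import numpy as np

def perceptron_primal(X, y, max_epochs=100):
    """Cycle through (X, y) until no point is misclassified.

    y holds +1/-1 labels.  The slides assume the initial weight
    vector is the zero vector.  Returns the weights w and bias b.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:   # misclassified (or on the boundary)
                w += yi * xi             # add a positive / subtract a negative example
                b += yi
                mistakes += 1
        if mistakes == 0:                # converged on separable data
            break
    return w, b
```

Because every update adds or subtracts a training point, the final w is exactly the linear combination of training points described above.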
3
ai: the number of times misclassification of xi has
caused the weight vector to be updated.
  • Points that have caused fewer mistakes will have
    smaller ai,
  • whereas difficult points will have large values.
  • This quantity is sometimes referred to as the
    embedding strength of the pattern xi.

4
  • Once a sample S has been fixed, one can think of
    the vector a as an alternative representation of
    the hypothesis in different, dual coordinates.
  • This expansion is not unique: different a can
    correspond to the same hypothesis w.
  • One can also regard ai as an indication of the
    information content of the example xi.

5
  • The decision function can be rewritten in dual
    coordinates as follows:

h(x) = sgn(⟨w · x⟩ + b) = sgn( Σj aj yj ⟨xj · x⟩ + b ),
i.e., the inner product of x with w expands into a sum
over the training examples.
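The dual decision function can be written directly in code; the names below (`decide_dual`, `alpha`) are my own illustration.

```python
# A sketch of the dual decision function
#   h(x) = sgn( sum_j a_j y_j <x_j, x> + b ).
# Function and variable names are illustrative, not from the slides.
import numpy as np

def decide_dual(x, X, y, alpha, b=0.0):
    """Classify x using the dual coefficients alpha.

    X holds the training points (one per row), y the +1/-1 labels,
    alpha[j] the number of updates triggered by x_j.
    """
    return 1.0 if np.sum(alpha * y * (X @ x)) + b > 0 else -1.0
```

Note that the test point x enters only through the inner products `X @ x`, which is the point of the dual representation.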
6
The Perceptron algorithm (primal form): each mistake
updates the weight vector w.
The Perceptron algorithm (dual form): each mistake on xi
instead updates ai (the component of a corresponding
to xi).
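The dual-form algorithm contrasted above can be sketched like this; the names and the Gram-matrix precomputation are my own illustration. Only the counters ai (and the bias) are ever updated, never a weight vector.

```python
# A sketch of the dual-form perceptron: instead of updating w,
# each mistake on x_i increments its counter alpha_i.
# Names (train_dual, max_epochs) are illustrative, not from the slides.
import numpy as np

def train_dual(X, y, max_epochs=100):
    n = len(y)
    alpha = np.zeros(n)          # one update counter per training point
    b = 0.0
    G = X @ X.T                  # Gram matrix: G[i, j] = <x_i, x_j>
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            # the dual decision value for x_i needs only column i of G
            if y[i] * (np.sum(alpha * y * G[:, i]) + b) <= 0:
                alpha[i] += 1    # adds exactly 1 to one component of alpha
                b += y[i]
                mistakes += 1
        if mistakes == 0:
            break
    return alpha, b
```

On separable data, `alpha.sum()` (the 1-norm of a) equals the total number of mistakes made, in line with Remark 2.10.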
7
  • The dual form's property:
  • The points that have larger ai can be used to
    rank the data according to their information
    content.
  • Remark 2.10: Since the number of updates equals
    the number of mistakes, and each update causes 1
    to be added to exactly one of the components of a,
    the 1-norm of the vector a satisfies
    ||a||1 = Σi ai = (total number of mistakes).

We can regard the 1-norm of a as the complexity of the
target concept in the dual representation.
8
  • Remark 2.11
  • The training data only enter the algorithm
    through the entries of the matrix G = (⟨xi · xj⟩),
    known as the Gram matrix.
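For concreteness, the Gram matrix of a small sample can be computed as below; the three data points are my own illustration.

```python
# The Gram matrix of a sample: G[i, j] = <x_i, x_j>.
# The sample points are illustrative, not from the slides.
import numpy as np

X = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
G = X @ X.T
# G == [[1, 0, 1],
#       [0, 1, 1],
#       [1, 1, 2]]
```

G is symmetric, and once it is formed the algorithm never needs the individual attributes of the data again.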

9
  • Dual Representation of Linear Machines
  • The data only appear through entries in the Gram
    matrix, and never through their individual
    attributes.
  • In the decision function, only the inner products
    of the data with the new test point are needed.