1
Overview of multilayer neural networks
  • Chapter 6 in Duda et al.

"There is nothing particularly magical about
multilayer neural networks; they implement linear
discriminants, but in a space where the inputs
have been mapped nonlinearly." (Duda, Hart, Stork)
2
Multilayer neural networks
  • In general a NN implements a non-linear mapping g: R^d -> R^c
  • For classification:
  • Input is the d-dimensional feature vector x
  • Output is the c discriminant functions g_1(x), ..., g_c(x)
  • We strive to obtain outputs that approximate the Bayes
    discriminant functions, g_k(x) \approx P(\omega_k \mid x)
  • Example: 3-d feature vectors, two-category case, neural
    network with 5 hidden units
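
The example network can be written out directly. Below is a minimal
sketch (in NumPy) of the forward pass of a fully-connected network with
d = 3 inputs, 5 hidden units, and c = 2 outputs; the weight values,
function names, and logistic non-linearity are illustrative assumptions,
not code from the slides.

```python
import numpy as np

def sigmoid(net):
    """Logistic non-linearity, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-net))

# Illustrative dimensions from the slide: d = 3, 5 hidden units, c = 2.
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(5, 3))   # input-to-hidden weights w_ji
b_hidden = rng.normal(size=5)        # hidden bias weights w_j0
W_output = rng.normal(size=(2, 5))   # hidden-to-output weights w_kj
b_output = rng.normal(size=2)        # output bias weights w_k0

def forward(x):
    """Map a 3-d feature vector x to the c = 2 discriminant values g_k(x)."""
    y = sigmoid(W_hidden @ x + b_hidden)   # hidden-unit outputs
    g = sigmoid(W_output @ y + b_output)   # output discriminants
    return g

x = np.array([0.5, -1.2, 0.3])             # a 3-d feature vector
g = forward(x)
print("discriminants:", g, "-> decide category", g.argmax() + 1)
```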
3
Terminology of neural networks
(The original slide is a labelled diagram of a three-layer network;
the labels are:)
  • Weights (synapses)
  • Bias weights
  • Target vector
  • Non-linearity (activation function)
  • A hidden unit
  • Net activation
  • Input layer
  • Output layer
  • Hidden layer
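
To tie these labels together, here is the standard notation (after Duda
et al.; the index convention is an assumption, chosen to match the
"weight indices distinguish layers" remark on the next slide):

net_j = \sum_{i=1}^{d} w_{ji} x_i + w_{j0}, \qquad y_j = f(net_j)   (hidden unit j)
net_k = \sum_{j=1}^{n_H} w_{kj} y_j + w_{k0}, \qquad z_k = f(net_k)   (output unit k)

Here f is the non-linearity (activation function), w_{j0} and w_{k0}
are the bias weights, net_j and net_k are the net activations, and the
outputs z_k are compared against the target vector during training.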
4
Structure of a neural network
  • We will study fully-connected, three-layer
    networks with a fixed non-linearity
  • We train the NN by optimizing the weights
    according to some criterion
  • Generalizations:
  • Different non-linearities in each node
  • Other network topologies: not fully connected,
    feedback paths

Sloppy notation! Weight indices are used to
distinguish between layers (w_{ji} for input-to-hidden,
w_{kj} for hidden-to-output)
5
Sigmoid non-linearities
  • Sigmoids are non-decreasing, scalar functions
    that saturate, i.e. f(net) approaches finite limits
    as net \to -\infty and net \to +\infty
  • Examples: the logistic function f(net) = 1 / (1 + e^{-net}),
    the hyperbolic tangent f(net) = tanh(net), and the
    hard limiter or step function
  • For training it is beneficial (if not crucial)
    that the sigmoid is differentiable
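
A small sketch of these non-linearities and of why differentiability
matters for training; the derivative identity f' = f(1 - f) for the
logistic function is standard, but the code itself is illustrative,
not from the slides.

```python
import numpy as np

def logistic(net):
    """Smooth sigmoid with range (0, 1); differentiable everywhere."""
    return 1.0 / (1.0 + np.exp(-net))

def logistic_deriv(net):
    """f'(net) = f(net) * (1 - f(net)): cheap to reuse during backprop."""
    f = logistic(net)
    return f * (1.0 - f)

def hard_limiter(net):
    """Step function: a sigmoid in the loose sense, but its derivative
    is zero almost everywhere, so gradient training cannot use it."""
    return np.where(net >= 0, 1.0, 0.0)

net = np.linspace(-4, 4, 9)
print(logistic(net))
print(np.tanh(net))        # another common sigmoid, with range (-1, 1)
print(hard_limiter(net))
```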
6
Expressive power of neural networks
  • Neural networks can implement any
    multidimensional mapping
  • Kolmogorov (1957): a finite number of hidden units,
    but unknown and arbitrarily complex scalar
    non-linearities
  • Hornik (Neural Networks, vol. 4, 1991) and many
    others: fixed scalar non-linearities (continuous,
    bounded, non-constant) but arbitrarily many
    hidden units
  • The latter situation is closer to practice, where we
    typically use differentiable sigmoids and vary
    the number of hidden units until the performance
    is satisfactory (a sketch of this follows below)
  • In practice, engineering skills matter more than
    these theoretical guarantees:
  • Application-specific knowledge that guides the
    choice of network topology
  • Number of hidden layers
  • Number of units in each hidden layer
  • Feedback networks
  • Pruning techniques
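
As a hedged illustration of "vary the number of hidden units until the
performance is satisfactory", the sketch below uses scikit-learn's
MLPClassifier on toy data; the data, candidate widths, and selection
criterion are all assumptions, not from the slides.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy data standing in for real training/validation sets (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] ** 2 > 0.5).astype(int)
X_train, X_val, y_train, y_val = X[:150], X[150:], y[:150], y[150:]

# Try increasingly wide hidden layers; keep the lowest held-out error.
best_err, best_n = float("inf"), None
for n_hidden in [1, 2, 5, 10, 20]:
    net = MLPClassifier(hidden_layer_sizes=(n_hidden,),
                        activation="logistic", max_iter=2000,
                        random_state=0).fit(X_train, y_train)
    err = 1.0 - net.score(X_val, y_val)
    if err < best_err:
        best_err, best_n = err, n_hidden
print(f"best width: {best_n} hidden units, validation error {best_err:.2f}")
```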

7
Backpropagation training of neural networks
  • Supervised learning
  • For each feature vector x there is an
    associated target vector t
  • A gradient descent algorithm that modifies the
    weights iteratively so that the MSE is
    minimized
  • Often a stochastic gradient descent algorithm is
    used
  • For each input vector we consider the error
    J(w) = \frac{1}{2} \| t - z \|^2
  • Calculate the stochastic gradient with respect to
    all weights and update by w \leftarrow w - \eta \nabla_w J(w)
    (see the sketch after this list)
  • How do we choose the target vectors?
  • We do not know the posterior probabilities!
  • Somehow the target vector should indicate the
    category
  • For the batch version we have J(w) = \sum_n J_n(w),
    where the sum runs over training data from all categories
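
A minimal sketch of stochastic-gradient backpropagation for the
three-layer network above, assuming logistic units, one-of-c target
vectors, and the squared-error criterion J(w) = (1/2)||t - z||^2;
the variable names, learning rate, and toy data are illustrative
assumptions, not the slides' own example.

```python
import numpy as np

def logistic(net):
    return 1.0 / (1.0 + np.exp(-net))

def train_sgd(X, T, n_hidden=5, eta=0.1, epochs=1000, seed=0):
    """Stochastic gradient descent on J(w) = 1/2 ||t - z||^2."""
    rng = np.random.default_rng(seed)
    d, c = X.shape[1], T.shape[1]
    W1 = rng.normal(scale=0.5, size=(n_hidden, d)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(c, n_hidden)); b2 = np.zeros(c)
    for _ in range(epochs):
        for n in rng.permutation(len(X)):       # one pattern at a time
            x, t = X[n], T[n]
            y = logistic(W1 @ x + b1)           # hidden-unit outputs
            z = logistic(W2 @ y + b2)           # network outputs
            # Backpropagate: delta terms for output and hidden layers.
            delta_out = (z - t) * z * (1 - z)
            delta_hid = (W2.T @ delta_out) * y * (1 - y)
            # Update each weight by w <- w - eta * dJ/dw.
            W2 -= eta * np.outer(delta_out, y); b2 -= eta * delta_out
            W1 -= eta * np.outer(delta_hid, x); b1 -= eta * delta_hid
    return W1, b1, W2, b2

# Two-category toy problem with one-of-c (one-hot) target vectors:
# the target vector indicates the category.
X = np.array([[0., 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]])
labels = np.array([0, 1, 1, 0])
T = np.eye(2)[labels]
W1, b1, W2, b2 = train_sgd(X, T)
z = logistic(W2 @ logistic(W1 @ X.T + b1[:, None]) + b2[:, None])
print("predicted categories:", z.argmax(axis=0))
```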
8
What more for session 6?
  • Read sections 6.1-6.6
  • Derivation of the backpropagation algorithm
  • Convergence of gradient algorithms
  • Interpretations of neural networks
  • Mapping of feature vectors to a space where they
    can be linearly separated
  • MSE approximation of the Bayes discriminant
    functions
  • This gives one idea of how to specify the target
    vectors
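
For reference, the interpretation above is also what motivates the
target-vector choice (Duda et al., section 6.6): with one-of-c targets
(t_k = 1 when x belongs to \omega_k, and 0 otherwise), minimizing the
MSE over enough training data drives each output toward the Bayes
discriminant function,

z_k(x) \approx g_k(x) = P(\omega_k \mid x),

which is why a one-hot target vector is a natural choice.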