Arizona State University DMML - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Arizona State University DMML


1
Kernel Methods Gaussian Processes
  • Presented by Shankar Bhargav

2
Gaussian Processes
  • Extending the role of kernels to probabilistic
    discriminative models leads to the framework of
    Gaussian processes
  • In the linear regression view we evaluate the
    posterior distribution over the weights w
  • Gaussian processes instead define a probability
    distribution over functions directly

3
Linear regression
  • x - input vector
  • w - M-dimensional weight vector
  • The prior distribution of w is given by the Gaussian
        p(w) = N(w | 0, α⁻¹ I)
  • The prior distribution over w induces a probability
    distribution over the function y(x) = wᵀφ(x)

4
Linear regression
  • y = Φw is a linear combination of Gaussian-distributed
    variables given by the elements of w,
    where Φ is the design matrix with elements Φₙₖ = φₖ(xₙ)
  • We need only the mean and covariance to find the
    joint distribution of y:
        E[y] = 0,   cov[y] = (1/α) ΦΦᵀ = K
  • where K is the Gram matrix with elements
        Kₙₘ = k(xₙ, xₘ) = (1/α) φ(xₙ)ᵀφ(xₘ)
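The construction above can be sketched numerically. This is a minimal illustration, not from the slides: the Gaussian basis functions, their centres, and the precision alpha are all made-up choices.

```python
import numpy as np

# Assumed setup: N = 5 inputs, M = 3 Gaussian basis functions,
# prior precision alpha on the weights w.
alpha = 2.0
x = np.linspace(-1.0, 1.0, 5)
centers = np.linspace(-1.0, 1.0, 3)

# Design matrix Phi with elements Phi[n, k] = phi_k(x_n)
Phi = np.exp(-0.5 * (x[:, None] - centers[None, :]) ** 2 / 0.2 ** 2)

# Gram matrix K with K[n, m] = (1/alpha) * phi(x_n)^T phi(x_m)
K = (1.0 / alpha) * Phi @ Phi.T

# As a covariance, K must be symmetric and positive semi-definite
print(K.shape, np.allclose(K, K.T))
```

Because K = (1/α) ΦΦᵀ by construction, its symmetry and positive semi-definiteness hold for any choice of basis functions.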

5
Gaussian Processes
  • Definition: a probability distribution over functions
    y(x) such that the set of values of y(x) evaluated
    at an arbitrary set of points x₁, …, x_N jointly
    has a Gaussian distribution
  • The mean is assumed to be zero
  • The covariance of y(x) evaluated at any two values of
    x is given by the kernel function:
        E[y(xₙ) y(xₘ)] = k(xₙ, xₘ)
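The definition can be made concrete by sampling functions from a GP prior. The RBF kernel and its length scale below are illustrative assumptions; any valid kernel would do.

```python
import numpy as np

# Squared-exponential (RBF) kernel; length_scale is an assumed value.
def rbf_kernel(xa, xb, length_scale=0.3):
    return np.exp(-0.5 * (xa[:, None] - xb[None, :]) ** 2 / length_scale ** 2)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
K = rbf_kernel(x, x)

# Values of y at these 50 points are jointly Gaussian: N(0, K).
# A tiny jitter keeps K numerically positive definite for sampling.
samples = rng.multivariate_normal(np.zeros(len(x)), K + 1e-10 * np.eye(len(x)), size=3)
print(samples.shape)
```

Each row of `samples` is one draw of the function y(x) evaluated on the grid, which is exactly the finite-dimensional view in the definition above.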

6
Gaussian Processes for regression
  • To apply Gaussian process models to regression
    we need to take account of noise on the observed
    target values, tₙ = yₙ + εₙ
  • Consider a noise process with a Gaussian
    distribution
        p(tₙ | yₙ) = N(tₙ | yₙ, β⁻¹)
  • with β the precision of the noise
  • To find the marginal distribution over t we need to
    integrate over y:
        p(t) = ∫ p(t | y) p(y) dy = N(t | 0, C)
  • where the covariance matrix C
  • has elements
        C(xₙ, xₘ) = k(xₙ, xₘ) + β⁻¹ δₙₘ
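In code, the marginal covariance is just the kernel matrix plus independent noise on the diagonal. A small sketch, with an assumed RBF kernel and an assumed noise precision beta:

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=0.3):
    return np.exp(-0.5 * (xa[:, None] - xb[None, :]) ** 2 / length_scale ** 2)

beta = 25.0                       # assumed noise precision; variance = 1/beta
x = np.linspace(0.0, 1.0, 10)
K = rbf_kernel(x, x)

# C(x_n, x_m) = k(x_n, x_m) + (1/beta) * delta_nm
C = K + (1.0 / beta) * np.eye(len(x))

# Only the diagonal changes: the delta term adds noise variance there
print(np.allclose(np.diag(C), np.diag(K) + 1.0 / beta))
```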

7
Gaussian Processes for regression
  • The joint distribution over t₁, …, t_{N+1}
    is given by
        p(t_{N+1}) = N(t_{N+1} | 0, C_{N+1})
  • The conditional distribution of t_{N+1} given t
    is a Gaussian distribution with mean and
    covariance given by
        m(x_{N+1}) = kᵀ C_N⁻¹ t
        σ²(x_{N+1}) = c − kᵀ C_N⁻¹ k
  • where k has elements k(xₙ, x_{N+1}),
    c = k(x_{N+1}, x_{N+1}) + β⁻¹, and C_N is the
    N×N covariance matrix
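The two predictive equations above translate directly into code. Everything below (the sine data, seed, kernel, and beta) is an illustrative assumption, not from the slides.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=0.3):
    return np.exp(-0.5 * (xa[:, None] - xb[None, :]) ** 2 / length_scale ** 2)

beta = 100.0
rng = np.random.default_rng(2)
x_train = np.linspace(0.0, 1.0, 20)
t = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, x_train.shape)

# C_N = K + (1/beta) I on the training inputs
C_N = rbf_kernel(x_train, x_train) + (1.0 / beta) * np.eye(len(x_train))

def gp_predict(x_star):
    k = rbf_kernel(x_train, np.array([x_star]))[:, 0]  # k_n = k(x_n, x*)
    c = 1.0 + 1.0 / beta                               # k(x*, x*) + 1/beta (RBF diag = 1)
    mean = k @ np.linalg.solve(C_N, t)                 # m(x*)  = k^T C_N^{-1} t
    var = c - k @ np.linalg.solve(C_N, k)              # s2(x*) = c - k^T C_N^{-1} k
    return mean, var

mean, var = gp_predict(0.25)
print(mean, var)
```

The predictive mean at x* = 0.25 should land near sin(2π · 0.25) = 1, with a small positive variance, since the test point sits inside a densely observed region.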

8
Learning the hyperparameters
  • Rather than fixing the covariance function we can
    use a parametric family of functions and then
    infer the parameter values from the data
  • Evaluate the likelihood function p(t | θ), where θ
    denotes the hyperparameters of the Gaussian
    process model
  • The simplest approach is to make a point estimate of
    θ by maximizing the log likelihood function
        ln p(t | θ) = −½ ln|C_N| − ½ tᵀ C_N⁻¹ t − (N/2) ln 2π
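A point estimate can be found by scoring candidate hyperparameter values with the log likelihood and taking the maximiser. The sketch below does a simple grid search over an assumed RBF length scale on made-up sine data; a gradient-based optimiser would be the usual refinement.

```python
import numpy as np

rng = np.random.default_rng(3)
beta = 100.0
x = np.linspace(0.0, 1.0, 30)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, x.shape)

def log_likelihood(length_scale):
    # C_N for this hyperparameter setting
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / length_scale ** 2)
    C = K + (1.0 / beta) * np.eye(len(x))
    # ln p(t|theta) = -1/2 ln|C| - 1/2 t^T C^{-1} t - N/2 ln(2 pi)
    sign, logdet = np.linalg.slogdet(C)
    return -0.5 * logdet - 0.5 * t @ np.linalg.solve(C, t) \
           - 0.5 * len(x) * np.log(2.0 * np.pi)

grid = [0.01, 0.03, 0.1, 0.3, 1.0]
best = max(grid, key=log_likelihood)
print(best)
```

Length scales that are far too short (fitting noise) or far too long (unable to track the oscillation) both score poorly, so the maximiser lands at an intermediate value.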

9
Gaussian Process for classification
  • We can adapt Gaussian processes to classification
    problems by transforming the output using an
    appropriate nonlinear activation function
  • Define a Gaussian process over a function a(x) and
    transform it using the logistic sigmoid function
    σ(a) = 1 / (1 + e⁻ᵃ); we obtain a non-Gaussian
    stochastic process over functions y(x) = σ(a(x))

10
The left plot shows a sample from the Gaussian
process prior over functions a(x); the right plot
shows the result of transforming this sample
using the logistic sigmoid function.
The probability distribution over the target
variable t is then given by the Bernoulli
distribution p(t | a) = σ(a)ᵗ (1 − σ(a))¹⁻ᵗ,
shown here on a one-dimensional input space.
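The transformation in the figure can be reproduced in a few lines: draw a(x) from a GP prior and squash it through the sigmoid, giving a function in (0, 1) that parameterises a Bernoulli distribution over t. The kernel variance and length scale are illustrative assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 100)

# GP prior over a(x): assumed RBF kernel with variance 9, length scale 0.1
K = 9.0 * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1 ** 2)
a = rng.multivariate_normal(np.zeros(len(x)), K + 1e-9 * np.eye(len(x)))

# y(x) = sigma(a(x)) lies in (0, 1): a probability for the Bernoulli label
y = sigmoid(a)
print(y.min(), y.max())
```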
11
Gaussian Process for classification
  • To determine the predictive distribution
    p(t_{N+1} = 1 | t_N)
  • we introduce a Gaussian process prior over the
    vector a_{N+1}; the Gaussian prior takes
    the form
        p(a_{N+1}) = N(a_{N+1} | 0, C_{N+1})
  • The predictive distribution is given by
        p(t_{N+1} = 1 | t_N) = ∫ σ(a_{N+1}) p(a_{N+1} | t_N) da_{N+1}
  • where the covariance matrix C has elements
        C(xₙ, xₘ) = k(xₙ, xₘ) + ν δₙₘ
    with ν added to ensure C is positive definite

12
Gaussian Process for classification
  • The integral is analytically intractable, so it may
    be approximated using sampling methods
  • Alternatively, techniques based on analytical
    approximation can be used:
  • Variational inference
  • Expectation propagation
  • Laplace approximation
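To make the intractability concrete: once an approximation makes p(a_{N+1} | t_N) Gaussian, say N(a | μ, s²), the remaining integral ∫ σ(a) N(a | μ, s²) da still has no closed form, but a standard analytical approximation replaces the sigmoid with a scaled probit, giving σ(κ(s²) μ) with κ(s²) = (1 + πs²/8)^(−1/2). The numerical comparison below uses assumed values of μ and s².

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

mu, s2 = 0.8, 0.5   # assumed Gaussian posterior over a

# Brute-force quadrature of  integral sigma(a) N(a | mu, s2) da
a = np.linspace(mu - 10 * np.sqrt(s2), mu + 10 * np.sqrt(s2), 20001)
gauss = np.exp(-0.5 * (a - mu) ** 2 / s2) / np.sqrt(2.0 * np.pi * s2)
numeric = np.sum(sigmoid(a) * gauss) * (a[1] - a[0])

# Probit-based analytical approximation: sigma(kappa * mu)
kappa = 1.0 / np.sqrt(1.0 + np.pi * s2 / 8.0)
approx = sigmoid(kappa * mu)
print(numeric, approx)
```

The two values agree to a few decimal places, which is why this closed-form expression is the usual final step after a Laplace or variational fit.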

13
Illustration of a Gaussian process for
classification. The optimal decision boundary is
shown in green, and the decision boundary from
the Gaussian process classifier in black.
14
Connection to Neural Networks
  • For a broad class of prior distributions over w,
    the distribution of functions generated by a
    neural network will tend to a Gaussian process as
    the number of hidden units M → ∞
  • In this Gaussian process limit the output
    variables of the neural network become
    independent.

15
Thank you