Nonparametric Density Estimation

1
Nonparametric Density Estimation
  • Riu Baring
  • CIS 8526 Machine Learning
  • Temple University
  • Fall 2007

Christopher M. Bishop, Pattern Recognition and
Machine Learning, Chapter 2.5. Some slides from
http://courses.cs.tamu.edu/rgutier/cpsc689_f07/
2
Overview
  • Density Estimation
  • Given a finite set of observations $x_1, \dots, x_N$
  • Task: model the probability distribution p(x)
  • Parametric Distribution
  • Governed by adaptive parameters
  • E.g., the mean and variance of a Gaussian distribution
  • Need a procedure to determine suitable values for
    the parameters
  • Discrete random variables: binomial and multinomial
    distributions
  • Continuous random variables: Gaussian distributions

3
Nonparametric Method
  • Attempt to estimate the density directly from the
    data without making any parametric assumptions
    about the underlying distribution

4
Histogram
  • Divide the sample space into a number of bins and
    approximate the density at the center of each bin
    by the fraction of points in the training data
    that fall into the corresponding bin
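The histogram estimate can be sketched in a few lines. This is a minimal illustration, not code from the slides: the function name `histogram_density` and the range arguments `x_min` and `x_max` are assumptions made for the example. To turn the fraction of points in a bin into a density, the fraction is also divided by the bin width, so that the estimate integrates to one.

```python
import numpy as np

def histogram_density(data, bin_width, x_min, x_max):
    """Histogram density estimate: p_i = n_i / (N * bin_width) for each bin i."""
    n_bins = int(np.ceil((x_max - x_min) / bin_width))
    edges = x_min + bin_width * np.arange(n_bins + 1)
    # counts[i] is the number of points that fall into bin i
    counts, _ = np.histogram(data, bins=edges)
    density = counts / (len(data) * bin_width)
    return edges, density

# Illustrative data: 1000 draws from a standard Gaussian
rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=1000)
edges, density = histogram_density(data, bin_width=0.5, x_min=-5.0, x_max=5.0)

# The estimate integrates to (almost exactly) one when all points lie in range
total_mass = float(np.sum(density * 0.5))
print(total_mass)
```

Points outside `[x_min, x_max]` are silently ignored by `np.histogram`, so the range should cover the data.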

5
Histogram
  • Parameter: bin width

6
Histogram - Drawbacks
  • The discontinuities of the estimate are not due
    to the underlying density; they are only an
    artifact of the chosen bin locations
  • These discontinuities make it very difficult (to
    the naïve analyst) to grasp the structure of the
    data
  • A much more serious problem is the curse of
    dimensionality, since the number of bins grows
    exponentially with the number of dimensions
  • In high dimensions we would require a very large
    number of examples or else most of the bins would
    be empty
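The exponential growth in the number of bins is easy to check numerically. The sketch below is only an illustration under assumed settings (10 bins per dimension, 1000 uniform samples on the unit hypercube); the helper names `count_bins` and `empty_bin_fraction` are made up for the example.

```python
import numpy as np

def count_bins(bins_per_dim, n_dims):
    """Total number of histogram bins: M bins per axis gives M**D bins."""
    return bins_per_dim ** n_dims

def empty_bin_fraction(n_samples, bins_per_dim, n_dims, seed=0):
    """Fraction of bins left empty by uniform samples in [0, 1)^D."""
    rng = np.random.default_rng(seed)
    samples = rng.random((n_samples, n_dims))
    # Integer bin index along each dimension for every sample
    idx = np.floor(samples * bins_per_dim).astype(int)
    occupied = len({tuple(row) for row in idx})
    return 1.0 - occupied / count_bins(bins_per_dim, n_dims)

for d in (1, 2, 4, 6):
    print(d, count_bins(10, d), empty_bin_fraction(1000, 10, d))
```

With a fixed sample size, at most 1000 bins can be occupied, so once the bin count passes that number most bins are necessarily empty.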

7
Nonparametric DE
8
Nonparametric DE
9
Nonparametric DE
10
Kernel Density Estimator
11
Kernel Density Estimator
12
k-nearest-neighbors
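  • To estimate p(x)
  • Consider a small sphere centered on the point x
  • Allow the radius of the sphere to grow until it
    contains exactly K data points
  • The density estimate is then $p(x) \approx K / (N V)$,
    where V is the volume of the resulting sphere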
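The sphere-growing procedure can be sketched as follows, using the standard estimate p(x) ≈ K / (N V), where N is the number of data points and V is the volume of the sphere that reaches the K-th nearest neighbor. The function names and the choice of Euclidean distance are assumptions for this example, not part of the slides.

```python
import math
import numpy as np

def unit_ball_volume(d):
    """Volume of the unit ball in d dimensions."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def knn_density(x, data, k):
    """K-nearest-neighbor density estimate p(x) ~= K / (N * V).

    data has shape (N, D); x has shape (D,).
    """
    data = np.asarray(data, dtype=float)
    x = np.asarray(x, dtype=float)
    n, d = data.shape
    # Sorted Euclidean distances from x to every stored data point
    dists = np.sort(np.linalg.norm(data - x, axis=1))
    # Radius of the smallest sphere around x that contains k points
    radius = dists[k - 1]
    volume = unit_ball_volume(d) * radius ** d
    return k / (n * volume)

# Illustrative 1-D data
data = [[0.0], [1.0], [2.0], [3.0]]
print(knn_density([0.5], data, k=2))
```

The estimate is undefined when the radius is zero, for example when x coincides with a data point and k is 1, so a practical version would need to guard against that case.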

13
k-nearest-neighbors
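  • Data set comprising $N_k$ points in class $C_k$, so
    that $\sum_k N_k = N$
  • Suppose the sphere has volume V and contains $K_k$
    points from class $C_k$
  • Density estimate for each class: $p(x \mid C_k) = K_k / (N_k V)$
  • Unconditional density: $p(x) = K / (N V)$
  • Class prior: $p(C_k) = N_k / N$
  • Posterior probability of class membership:
    $p(C_k \mid x) = K_k / K$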
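The posterior follows from Bayes' theorem. As a worked step, assuming the standard K-nearest-neighbor estimates for a sphere of volume V that contains K points in total, of which $K_k$ belong to class $C_k$ (class-conditional density $K_k / (N_k V)$, unconditional density $K / (N V)$, class prior $N_k / N$), the volume and the sample counts cancel:

```latex
\begin{align*}
p(C_k \mid x)
  &= \frac{p(x \mid C_k)\, p(C_k)}{p(x)} \\
  &= \frac{\dfrac{K_k}{N_k V} \cdot \dfrac{N_k}{N}}{\dfrac{K}{N V}} \\
  &= \frac{K_k}{K}
\end{align*}
```

The posterior is therefore just the fraction of the K neighbors that belong to class $C_k$.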

14
k-nearest-neighbors
  • To classify a new point x
  • Identify the K nearest neighbors in the training data
  • Assign x to the class having the largest number of
    representatives among them
  • Parameter: K
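The classification rule can be sketched as a majority vote over the K nearest training points. This is a minimal illustration rather than the presenter's code: the function name `knn_classify`, the Euclidean distance, and the toy data are all assumptions. The returned vote fractions correspond to the posterior estimate, the share of the K neighbors that belong to each class.

```python
from collections import Counter
import numpy as np

def knn_classify(x, train_x, train_y, k):
    """Return the majority class among the k nearest neighbors of x,
    together with the fraction of neighbors in each class."""
    train_x = np.asarray(train_x, dtype=float)
    dists = np.linalg.norm(train_x - np.asarray(x, dtype=float), axis=1)
    # Indices of the k closest training points, nearest first
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    label, _ = votes.most_common(1)[0]
    posteriors = {c: n / k for c, n in votes.items()}
    return label, posteriors

# Toy data: two well-separated classes
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["a", "a", "a", "b", "b", "b"]
print(knn_classify([0.2, 0.2], train_x, train_y, k=3))
```

Ties are resolved here in favor of the class that appears first among the neighbors when ordered by distance; other tie-breaking rules are equally valid.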

15
My thoughts
  • KDE and KNN require the entire training data set
    to be stored
  • This leads to expensive computation for large data sets
  • Parameters must be tuned
  • KDE: bandwidth, h
  • KNN: number of neighbors, K
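Both points, the stored training set and the bandwidth parameter, show up in a small kernel density estimator. The sketch below assumes a Gaussian kernel, which is one common choice; the function name `gaussian_kde` and the toy data are made up for the example. Every evaluation loops over all N stored points, which is where the cost comes from.

```python
import numpy as np

def gaussian_kde(x, data, h):
    """Gaussian kernel density estimate with bandwidth h.

    p(x) = (1/N) * sum_n exp(-||x - x_n||^2 / (2 h^2)) / (2 pi h^2)^(D/2)
    data has shape (N, D); x has shape (D,).
    """
    data = np.asarray(data, dtype=float)
    x = np.asarray(x, dtype=float)
    n, d = data.shape
    # Squared distance from x to every stored training point
    sq_dists = np.sum((data - x) ** 2, axis=1)
    norm = (2.0 * np.pi * h ** 2) ** (d / 2)
    return float(np.mean(np.exp(-sq_dists / (2.0 * h ** 2)) / norm))

# Illustrative 1-D data with two clusters
data = [[-1.0], [-0.8], [1.0], [1.2]]
for h in (0.05, 0.3, 2.0):
    # Evaluate midway between the clusters for several bandwidths
    print(h, gaussian_kde([0.0], data, h))
```

A small h gives a spiky estimate that is close to zero between the clusters, while a large h smooths the two clusters into a single bump, so h has to be chosen to suit the data.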