A Bayesian Approach for Bandwidth Selection in Kernel Density Estimation PowerPoint PPT Presentation


Title: A Bayesian Approach for Bandwidth Selection in Kernel Density Estimation


1
A Bayesian Approach for Bandwidth Selection in
Kernel Density Estimation
  • C.N.Kuruwita
  • Department of Mathematical Sciences
  • Clemson University

2
What is Density Estimation?
  • The process of obtaining an estimate for an
    unknown probability density function.
  • There are two main approaches.
  • Parametric.
  • Nonparametric.

3
Parametric Approach.
  • Assumes the data come from a known
    parametric family with density f0(. ; θ).
  • Then estimates the unknown parameter θ
    using a suitable parameter estimation
    method, e.g. maximum likelihood.
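
As an illustration of the parametric route, the sketch below fits a lognormal family by maximum likelihood. The lognormal choice, the function names, and the simulated sample are assumptions made for this example only; for this family the maximum likelihood estimates have a closed form (the mean and the divide-by-n standard deviation of the log data).

```python
import math
import random

def fit_lognormal_mle(data):
    """Closed-form maximum likelihood estimates (mu, sigma) for a lognormal sample."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu = sum(logs) / n
    # The MLE of the variance divides by n, not n - 1.
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma

def lognormal_pdf(x, mu, sigma):
    """Density of the fitted lognormal at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

# Hypothetical data: a simulated lognormal sample with mu = 1.0, sigma = 0.5.
rng = random.Random(0)
sample = [rng.lognormvariate(1.0, 0.5) for _ in range(2000)]
mu_hat, sigma_hat = fit_lognormal_mle(sample)
print(mu_hat, sigma_hat, lognormal_pdf(2.0, mu_hat, sigma_hat))
```

The fitted density can only ever be lognormal in shape, which is the restriction the next slide discusses.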

4
Problems with the Parametric Approach
  • Restricting the estimator to a certain
    parametric family can leave important
    features of the data undetected.

[Figure: parametric estimate with a lognormal density vs. nonparametric estimate. Reference: Wand and Jones (1995), Kernel Smoothing.]

5
Nonparametric Approach
  • Let the data speak for themselves.
  • Problems:
  • Properties of these estimators are hard to
    determine.
  • Computationally intensive.

6
Common Nonparametric Density Estimation Methods.
  • Kernel density estimators.
  • Nearest neighbor method.
  • Maximum penalized likelihood estimators.
  • Orthogonal series estimators.

7
Kernel Density Estimator.
  • Definition
  • K(.) is (usually) a symmetric pdf.
  • h is called the bandwidth or smoothing
    parameter.
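
The formula for this slide is not in the transcript. A minimal sketch, assuming the standard kernel density estimator f_hat(x) = (1 / (n h)) * sum_i K((x - X_i) / h) with a Gaussian K, is given below. The data values and the bandwidth are made up for illustration.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, a common symmetric choice for K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, data, h):
    """f_hat(x) = (1 / (n h)) * sum_i K((x - X_i) / h)."""
    n = len(data)
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (n * h)

data = [0.3, 0.7, 1.1, 1.8, 2.4, 3.9]   # hypothetical observations
h = 0.5                                  # hypothetical bandwidth

# Riemann sum over [-5, 10] to check that the estimate integrates to about 1.
grid = [-5.0 + 0.01 * k for k in range(1501)]
area = sum(kde(x, data, h) for x in grid) * 0.01
print(area)
```

A larger h gives a smoother estimate and a smaller h a rougher one, which is why the choice of h is the central problem in what follows.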

8
The Problem
  • Spill over effect.
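
One way to see the spill over effect is to compute how much probability a symmetric-kernel estimate places on negative values when the data are nonnegative (lifetimes, for example). The sketch below assumes a Gaussian kernel and made-up lifetimes; each kernel is a normal density centred at X_i with standard deviation h, so its mass below zero is Phi(-X_i / h).

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mass_below_zero(data, h):
    """Probability that a Gaussian KDE with bandwidth h assigns to x < 0."""
    return sum(normal_cdf(-xi / h) for xi in data) / len(data)

lifetimes = [0.05, 0.1, 0.2, 0.4, 0.9, 1.5, 2.2]   # hypothetical positive data
spill = mass_below_zero(lifetimes, h=0.5)
print(spill)
```

Shrinking h reduces the leaked mass but roughens the estimate elsewhere, so the bandwidth alone cannot remove the problem.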

9
  • Data driven bandwidth selection.

10
The Approach
  • Use an asymmetric kernel with positive support
    to avoid the spill over effect.
  • Assign a prior density to the smoothing
    parameter h.
  • Derive the density estimator in a Bayesian
    framework.

11
The Lognormal Kernel Density Estimator
  • Definition
  • K(.) is a lognormal pdf with scale parameter
    h.
  • s_j is the jump size of the Kaplan-Meier estimator
    at the j-th observation.
  • h is assigned an inverted gamma prior with
    shape parameter α and scale parameter β.
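
The estimator's formula is not in the transcript. The sketch below assumes the weighted form sum_j s_j * K(x; X_j, h), with K a lognormal density whose log-location is ln(X_j), and computes the Kaplan-Meier jumps for right-censored data. The kernel parameterization, the function names, and the data are assumptions for illustration, not the presentation's exact definition.

```python
import math

def kaplan_meier_jumps(times, events):
    """Return (sorted_times, jumps): the Kaplan-Meier mass at each observation.

    events[i] is 1 for an observed failure and 0 for a censored time.
    Censored observations get jump size 0.
    """
    # Sort by time, placing failures before censored values at ties.
    order = sorted(range(len(times)), key=lambda i: (times[i], -events[i]))
    n = len(times)
    surv = 1.0
    out_t, out_s = [], []
    for rank, i in enumerate(order):
        at_risk = n - rank
        new_surv = surv * (1.0 - events[i] / at_risk)
        out_t.append(times[i])
        out_s.append(surv - new_surv)
        surv = new_surv
    return out_t, out_s

def lognormal_kernel(x, xj, h):
    """Lognormal density in x with log-location ln(xj) and scale h (assumed form)."""
    z = (math.log(x) - math.log(xj)) / h
    return math.exp(-0.5 * z * z) / (x * h * math.sqrt(2.0 * math.pi))

def lognormal_kde(x, times, jumps, h):
    """Weighted kernel estimate sum_j s_j * K(x; X_j, h) for x > 0."""
    return sum(s * lognormal_kernel(x, t, h) for t, s in zip(times, jumps) if s > 0)

times = [0.5, 0.8, 1.2, 1.5, 2.0, 2.6]   # hypothetical observed times
events = [1, 1, 0, 1, 1, 1]              # the third observation is censored
t_sorted, s = kaplan_meier_jumps(times, events)
density_at_1 = lognormal_kde(1.0, t_sorted, s, h=0.4)
print(s, density_at_1)
```

Because every kernel lives on the positive half-line, the estimate puts no mass below zero. When the largest observation is censored the jumps sum to less than 1, so the estimate would then integrate to less than 1 unless it is renormalized.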

12
Resulting Bayesian Bandwidth.
  • The Bayesian estimator of the smoothing
    parameter h under squared error loss is given
    as
  • where
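
The closed-form expression is not in the transcript. Under squared error loss the Bayes estimator is the posterior mean, so a numerical stand-in can be sketched. This assumes that the posterior of h at a point x is proportional to the kernel estimate at x (viewed as a function of h) times the prior, and that the prior is an inverse gamma density on h in the common parameterization h^-(alpha+1) * exp(-beta / h). Both assumptions, and all names and values below, are illustrative and may differ from the presentation's derivation.

```python
import math

def lognormal_kernel(x, xj, h):
    """Lognormal density in x with log-location ln(xj) and scale h (assumed form)."""
    z = (math.log(x) - math.log(xj)) / h
    return math.exp(-0.5 * z * z) / (x * h * math.sqrt(2.0 * math.pi))

def inv_gamma_unnormalized(h, alpha, beta):
    """Inverse gamma prior on h up to a constant: h^-(alpha+1) * exp(-beta / h)."""
    return h ** (-(alpha + 1.0)) * math.exp(-beta / h)

def bayes_bandwidth(x, times, weights, alpha, beta, h_max=10.0, steps=4000):
    """Posterior mean of h at the point x, by the midpoint rule on (0, h_max)."""
    dh = h_max / steps
    num = den = 0.0
    for k in range(steps):
        h = (k + 0.5) * dh
        like = sum(w * lognormal_kernel(x, t, h) for t, w in zip(times, weights))
        post = like * inv_gamma_unnormalized(h, alpha, beta)
        num += h * post
        den += post
    # The constant dh and the prior's normalizing constant cancel in the ratio.
    return num / den

times = [0.5, 0.8, 1.5, 2.0, 2.6]      # hypothetical uncensored observations
weights = [0.2] * 5                     # equal jump sizes when nothing is censored
h_at_1 = bayes_bandwidth(1.0, times, weights, alpha=3.0, beta=1.0)
print(h_at_1)
```

The result depends on x, which is why these are local bandwidths: each evaluation point gets its own value of h. The integral is truncated at h_max, which is adequate only when the posterior has negligible mass beyond it.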

13
Asymptotic Properties
  • The lognormal KDE converges almost surely (a.s.)
    to the true pdf as n → ∞.
  • The Bayesian local bandwidths converge almost
    surely (a.s.) to zero as n → ∞.

14
Simulation Study
  • Assess the effect of the lognormal kernel:
    compare with the inverse Gaussian kernel.
  • Assess the effectiveness of the Bayesian
    bandwidths: compare with the cross-validated
    bandwidth.
  • Simulated failure data from Weibull distributions
    with scale 1, with the shape chosen to give:
  • Decreasing failure rate (DFR), shape < 1
  • Constant failure rate (CFR), shape = 1
  • Increasing failure rate (IFR), shape > 1
  • Censoring levels: 10, 20, 50.
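
A simulation of this kind could be set up roughly as follows. The censoring mechanism (independent exponential censoring times), the shape values 0.5, 1.0 and 2.0, and the censoring rate are assumptions chosen for illustration; the transcript does not say how censoring was generated.

```python
import random

def simulate_censored_weibull(n, shape, censor_rate, seed=0):
    """Draw n Weibull(shape, scale=1) lifetimes with independent exponential censoring.

    Returns (observed_times, events), where events[i] is 1 if the failure was
    observed and 0 if the lifetime was censored.
    """
    rng = random.Random(seed)
    times, events = [], []
    for _ in range(n):
        # random.weibullvariate takes (scale, shape) in that order.
        failure = rng.weibullvariate(1.0, shape)
        censor = rng.expovariate(censor_rate)
        times.append(min(failure, censor))
        events.append(1 if failure <= censor else 0)
    return times, events

for label, shape in [("DFR", 0.5), ("CFR", 1.0), ("IFR", 2.0)]:
    t, d = simulate_censored_weibull(500, shape, censor_rate=0.25)
    print(label, "censored fraction:", 1.0 - sum(d) / len(d))
```

In the CFR case the lifetimes are Exp(1), so a censoring rate of 0.25 gives an expected censored fraction of 0.25 / 1.25 = 0.2. For the other shapes the rate would need tuning to reach a target censoring level.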

15
Assessment Criteria
  • Pointwise Estimated MSE Ratio
  • Estimated Mean Integrated Squared Error
  • where N is the number of simulations.
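
The formulas are missing from the transcript. The sketch below assumes the usual estimate of MISE: the integrated squared error between the estimate and the true density, averaged over N simulated samples. The true density (Exp(1), i.e. Weibull with shape 1 and scale 1), the fixed bandwidth, the integration grid, and the uncensored equal-weight lognormal estimator are all simplifying assumptions.

```python
import math
import random

def lognormal_kde(x, data, h):
    """Equal-weight lognormal kernel estimate at x > 0 (no censoring)."""
    total = 0.0
    for xj in data:
        z = (math.log(x) - math.log(xj)) / h
        total += math.exp(-0.5 * z * z) / (x * h * math.sqrt(2.0 * math.pi))
    return total / len(data)

def integrated_squared_error(estimate, truth, grid):
    """Trapezoidal approximation of the integral of (estimate - truth)^2 over grid."""
    err = [(estimate(x) - truth(x)) ** 2 for x in grid]
    return sum(0.5 * (err[k] + err[k + 1]) * (grid[k + 1] - grid[k])
               for k in range(len(grid) - 1))

def estimated_mise(n_sims, n_obs, h, seed=0):
    """Average the ISE over n_sims simulated samples (N in the slide's notation)."""
    rng = random.Random(seed)
    truth = lambda x: math.exp(-x)                  # Exp(1) density
    grid = [0.02 + 0.04 * k for k in range(150)]    # roughly 0.02 to 5.98
    total = 0.0
    for _ in range(n_sims):
        data = [rng.expovariate(1.0) for _ in range(n_obs)]
        total += integrated_squared_error(
            lambda x: lognormal_kde(x, data, h), truth, grid)
    return total / n_sims

mise = estimated_mise(n_sims=20, n_obs=50, h=0.5)
print(mise)
```

Running the same loop for two estimators on the same simulated samples gives directly comparable MISE values. A pointwise MSE ratio can be formed the same way by averaging the squared error at a fixed x instead of integrating.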

16
Estimated Mean Integrated Squared Errors.
[Figure; legend: IG-Bayes, LN-CV, LN-Bayes]

17
Application to Real Data
  • The Problem: Estimation of the probability
    density of debond strength of carbon fibers.
  • Data: Due to the complexity of the experiment
    there were only 12 observations, 3 of which
    are censored.
  • Reference: Harwell, M. (1995). Microbond Tests
    for Ribbon Fibers. M.S. thesis, Department of
    Chemical Engineering, Clemson University.

18
Density Estimates
19
Conclusion and Future Work
  • The lognormal KDE with the Bayesian bandwidths
    shows a lot of potential as a density estimator.
  • Need to explore the boundary effect at the right
    end of the support, i.e. when the support is a
    finite interval starting at 0.

20
Thank You