The Kolmogorov-Smirnov Test

Slides: 12
Provided by: VasileiosH9

1
The Kolmogorov-Smirnov Test
  • Vasileios Hatzivassiloglou
  • University of Texas at Dallas

2
Kolmogorov-Smirnov test
  • A fully non-parametric test for comparing two
    distributions
  • Does not depend on approximations for the
    distribution

3
Empirical distribution function
  • For a random variable X and a sample x1, x2,
    ..., xn the empirical distribution function of X
    is defined as
    $$F_X(x) = \frac{1}{n} \sum_{i=1}^{n} I(x_i \le x)$$
  • where I(condition) is the indicator function,
    i.e., 1 if the condition is true and 0 otherwise
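
The definition above can be sketched directly in Python. This is a minimal illustration, not code from the presentation: the function name `ecdf` and the small sample are made up for the example.

```python
def ecdf(sample):
    """Return the empirical distribution function of `sample` as a callable."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")

    def F(x):
        # Average of the indicator I(x_i <= x) over the sample.
        return sum(1 for xi in sample if xi <= x) / n

    return F


# Hypothetical four-point sample, chosen only to show the step behaviour.
F = ecdf([0.5, 1.0, 1.0, 2.0])
print(F(0.0), F(1.0), F(3.0))  # prints: 0.0 0.75 1.0
```

The function is a step function that jumps by 1/n at each sample point (by 2/n at the repeated value 1.0 here).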

4
Example data
  • FX is an estimate of the cumulative probability
    function of X
  • Consider the following example data
  • 1.26, 0.34, 0.70, 1.75, 50.57, 1.55, 0.08, 0.42,
    0.50, 3.20, 0.15, 0.49, 0.95, 0.24, 1.37, 0.17,
    6.98, 0.10, 0.94, 0.38
  • n = 20
  • Is this data normal?

5
Examining the data
  • Sorted data
  • 0.08, 0.10, 0.15, 0.17, 0.24, 0.34, 0.38, 0.42,
    0.49, 0.50, 0.70, 0.94, 0.95, 1.26, 1.37, 1.55,
    1.75, 3.20, 6.98, 50.57
  • Mean 3.61, Standard deviation 11.2
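
The summary values on this slide can be reproduced with Python's standard `statistics` module. The sketch below assumes the slide reports the sample standard deviation (n - 1 denominator), which is what `statistics.stdev` computes; the population form would give a slightly smaller value.

```python
import statistics

# Example data from the slides (n = 20).
data = [1.26, 0.34, 0.70, 1.75, 50.57, 1.55, 0.08, 0.42,
        0.50, 3.20, 0.15, 0.49, 0.95, 0.24, 1.37, 0.17,
        6.98, 0.10, 0.94, 0.38]

mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)

print(sorted(data))
print(round(mean, 2), round(sd, 1))  # prints: 3.61 11.2
```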

6
Examining the data for normality
  • For normal data,
  • about 16% of the samples should lie more than one
    s.d. below the mean (3.61 - 11.2 = -7.59)
  • here, none of the samples are even negative
  • about 2% should lie more than two standard deviations
    above the mean (3.61 + 2 × 11.2 = 26.01)
  • here we have one in 20 samples (5%) way beyond that
    value (50.57)
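
As a rough check of this argument, the tail fractions expected under a normal distribution can be compared with the fractions observed in the sample. This is only a sketch: it plugs the rounded mean and standard deviation quoted in the slides into `statistics.NormalDist`, and the variable names are illustrative.

```python
from statistics import NormalDist

data = [1.26, 0.34, 0.70, 1.75, 50.57, 1.55, 0.08, 0.42,
        0.50, 3.20, 0.15, 0.49, 0.95, 0.24, 1.37, 0.17,
        6.98, 0.10, 0.94, 0.38]

mean, sd = 3.61, 11.2            # rounded values quoted in the slides
normal = NormalDist(mean, sd)

low = mean - sd                  # one s.d. below the mean
high = mean + 2 * sd             # two s.d. above the mean

# Fractions a normal distribution with these parameters would predict.
expected_below = normal.cdf(low)
expected_above = 1 - normal.cdf(high)

# Fractions actually seen in the 20 samples.
observed_below = sum(x < low for x in data) / len(data)
observed_above = sum(x > high for x in data) / len(data)

print(f"below one s.d.: expected {expected_below:.3f}, observed {observed_below:.3f}")
print(f"above two s.d.: expected {expected_above:.3f}, observed {observed_above:.3f}")
```

Both tails disagree with the normal prediction: the lower tail is empty and the upper tail holds a single extreme value.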

7
Example empirical distribution

8
Log transformation

9
The Kolmogorov-Smirnov test
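  • Given two cumulative probability functions FX and
    FY, the test statistics are
    $$D^+ = \max_x \bigl(F_X(x) - F_Y(x)\bigr), \qquad D^- = \max_x \bigl(F_Y(x) - F_X(x)\bigr)$$
  • Usually the value D = max(D+, D-) is used (although
    its distribution is harder to study than either
    D+ or D-)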
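
For two samples, the statistics can be computed from the two empirical distribution functions. The sketch below assumes the common convention that D+ is the largest value of FX - FY and D- the largest value of FY - FX; the function name `ks_statistics` and the toy samples are hypothetical. Because both empirical functions are step functions that change only at sample points, it is enough to evaluate them at the pooled sample values.

```python
from bisect import bisect_right


def ks_statistics(xs, ys):
    """Return (D_plus, D_minus, D) for two samples xs and ys."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(xs), sorted(ys)
    d_plus = d_minus = 0.0
    for t in xs + ys:
        # Empirical distribution functions evaluated at t:
        # bisect_right counts the sample values <= t.
        fx = bisect_right(xs, t) / len(xs)
        fy = bisect_right(ys, t) / len(ys)
        d_plus = max(d_plus, fx - fy)
        d_minus = max(d_minus, fy - fx)
    return d_plus, d_minus, max(d_plus, d_minus)


# Toy samples: the second is the first shifted right by 2.
print(ks_statistics([1, 2, 3, 4], [3, 4, 5, 6]))  # prints: (0.5, 0.0, 0.5)
```

This only computes the statistics; turning D into a p-value requires the distribution of D under the null hypothesis, which is not covered here.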

10
Comparing distributions

11
Advantages of Kolmogorov-Smirnov
  • It is non-parametric and hence robust
  • It does not rely on the location of the means only
    (as the t-test does)
  • It works for non-normal data (the t-test can fail
    if the data is too far from normal)
  • It is not sensitive to scaling
  • It is more powerful than the χ² test
  • However, it is less sensitive than the t-test if the
    data is indeed normal