Comparison of data distributions: the power of Goodness-of-Fit Tests


1
Comparison of data distributions: the power of
Goodness-of-Fit Tests
  • B. Mascialino (1), A. Pfeiffer (2), M.G. Pia (1),
    A. Ribon (2), P. Viarengo (3)
  • (1) INFN Genova, Italy
  • (2) CERN, Geneva, Switzerland
  • (3) IST National Institute for Cancer Research,
    Genova, Italy

IEEE NSS 2006, San Diego, October 29 - November 5,
2006
2
Goodness of Fit testing
Goodness-of-fit testing is the mathematical
foundation for the comparison of data
distributions
Use cases in experimental physics
  • Regression testing
      • Throughout the software life-cycle
  • Online DAQ
      • Monitoring detector behaviour w.r.t. a reference
  • Simulation validation
      • Comparison with experimental data
  • Reconstruction
      • Comparison of reconstructed vs. expected
        distributions
  • Physics analysis
      • Comparisons of experimental distributions
      • Comparison with theoretical distributions

4
GoF algorithms in the Statistical Toolkit
TWO-SAMPLE PROBLEM
  • Binned distributions
      • Anderson-Darling test
      • Chi-squared test
      • Fisz-Cramer-von Mises test
      • Tiku test (Cramer-von Mises test in chi-squared
        approximation)
  • Unbinned distributions
      • Anderson-Darling test
      • Anderson-Darling approximated test
      • Cramer-von Mises test
      • Generalised Girone test
      • Goodman test (Kolmogorov-Smirnov test in
        chi-squared approximation)
      • Kolmogorov-Smirnov test
      • Kuiper test
      • Tiku test (Cramer-von Mises test in chi-squared
        approximation)
      • Weighted Kolmogorov-Smirnov test
      • Weighted Cramer-von Mises test
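
Several of the unbinned tests listed above are built on the difference between the two empirical cumulative distribution functions. The sketch below is a minimal illustration in plain Python, not the Statistical Toolkit's own code: the function names `ecdf_differences` and `two_sample_statistics` are hypothetical, and only the textbook Kolmogorov-Smirnov, Kuiper and Cramer-von Mises statistics are computed (no p-values).

```python
from bisect import bisect_right


def ecdf_differences(x, y):
    """Return F1(t) - F2(t) at every point t of the pooled sample."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    return [bisect_right(xs, t) / n - bisect_right(ys, t) / m for t in xs + ys]


def two_sample_statistics(x, y):
    """Textbook two-sample statistics (illustrative helper, hypothetical name)."""
    n, m = len(x), len(y)
    diffs = ecdf_differences(x, y)
    d_plus = max(0.0, max(diffs))    # largest excess of F1 over F2
    d_minus = max(0.0, -min(diffs))  # largest excess of F2 over F1
    return {
        "KS": max(d_plus, d_minus),   # Kolmogorov-Smirnov: sup |F1 - F2|
        "Kuiper": d_plus + d_minus,   # Kuiper: D+ + D-
        # Cramer-von Mises: squared ECDF differences summed over the pooled sample
        "CvM": n * m / (n + m) ** 2 * sum(d * d for d in diffs),
    }


if __name__ == "__main__":
    print(two_sample_statistics([0.1, 0.4, 0.7, 1.3], [0.2, 0.9, 1.1, 2.5]))
```

The Kolmogorov-Smirnov statistic looks only at the single largest distance, whereas the Cramer-von Mises statistic accumulates the distance over the whole range, which is one reason the tests can respond differently to the same pair of distributions.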

5
Performance of the GoF tests
6
Power of GoF tests
  • Do we really need such a wide collection of GoF
    tests? Why?
  • Which is the most appropriate test to compare two
    distributions?
  • How good is a test at recognizing truly
    equivalent distributions and rejecting fake ones?
  • No comprehensive study of the relative power of
    GoF tests exists in the literature
      • novel research in statistics (not only in physics
        data analysis!)
  • Systematic study of all existing GoF tests in
    progress
      • made possible by the extensive collection of
        tests in the Statistical Toolkit

7
Method for the evaluation of power
The power of a test is the probability of
correctly rejecting the null hypothesis when it
is false
Pseudo-experiment: a random drawing of two
samples from two parent distributions
  • Sample 1 (size n) is drawn from parent
    distribution 1
  • Sample 2 (size n) is drawn from parent
    distribution 2
  • The GoF test is applied to the two samples
  • N = 10000 Monte Carlo replicas
  • Confidence level: 0.05
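
The procedure on this slide can be sketched as a small Monte Carlo loop: the empirical power is the fraction of pseudo-experiments in which the test rejects the null hypothesis. The sketch below is an illustration only. It assumes the Kolmogorov-Smirnov test with its large-sample critical value in place of the Statistical Toolkit's implementations, and the names `ks_statistic` and `empirical_power` are hypothetical.

```python
import math
import random


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov distance: max |F1(t) - F2(t)|."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        t = min(xs[i], ys[j])
        while i < n and xs[i] <= t:
            i += 1
        while j < m and ys[j] <= t:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def empirical_power(draw1, draw2, n, replicas=10000, alpha=0.05, seed=1):
    """Fraction of pseudo-experiments in which the KS test rejects at level alpha.

    draw1 and draw2 take a random.Random instance and return one value from
    parent distribution 1 or 2. The rejection threshold is the asymptotic KS
    critical value, an approximation that improves as n grows.
    """
    rng = random.Random(seed)
    critical = math.sqrt(-0.5 * math.log(alpha / 2.0)) * math.sqrt(2.0 / n)
    rejections = 0
    for _ in range(replicas):
        sample1 = [draw1(rng) for _ in range(n)]
        sample2 = [draw2(rng) for _ in range(n)]
        if ks_statistic(sample1, sample2) > critical:
            rejections += 1
    return rejections / replicas


if __name__ == "__main__":
    gaussian = lambda r: r.gauss(0.0, 1.0)
    exponential = lambda r: r.expovariate(1.0)
    for size in (10, 30):
        print(size, empirical_power(gaussian, exponential, n=size, replicas=1000))
```

When both parents are the same distribution, the value returned is the false rejection rate rather than a power, and it should stay close to or below `alpha`.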
8
Analysis cases
  • Data samples drawn from different parent
    distributions
  • Data samples drawn from the same parent
    distribution
      • Applying a scale factor
      • Applying a shift
  • Use cases in experimental physics
      • Signal over background
      • Hot channel, dead channel
      • etc.
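
The "same parent" cases above can be illustrated with a short sketch: both samples are drawn from one parent distribution, and the second sample is then shifted or rescaled. The helper name `draw_pair` and its parameters are hypothetical, chosen only for this illustration.

```python
import random
import statistics


def draw_pair(parent, n, rng, shift=0.0, scale=1.0):
    """Draw two samples of size n from the same parent; transform the second.

    parent takes a random.Random instance and returns one value.
    shift=0.0 and scale=1.0 leave the second sample untransformed.
    """
    first = [parent(rng) for _ in range(n)]
    second = [scale * parent(rng) + shift for _ in range(n)]
    return first, second


if __name__ == "__main__":
    rng = random.Random(42)
    gaussian = lambda r: r.gauss(0.0, 1.0)
    reference, shifted = draw_pair(gaussian, 1000, rng, shift=0.5)
    reference, scaled = draw_pair(gaussian, 1000, rng, scale=2.0)
    print("mean of shifted sample:", statistics.mean(shifted))
    print("standard deviation of scaled sample:", statistics.stdev(scaled))
```

Pairs produced this way can be passed to any two-sample test to see how large a shift or scale factor must be before the test starts rejecting.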

Power analysis on a set of reference mathematical
distributions
Power analysis on some typical physics
applications
Is there any recipe to identify the best test to
use?
9
Parent reference distributions
10
[Figure: parent distributions characterised by tailweight and skewness]
11
Compare different distributions: Parent1 ≠ Parent2
Unbinned distributions
12
The power increases as a function of the sample
size
No clear winner
13
The power varies as a function of the parent
distributions' characteristics
General recipe
p < 0.0001
14
Quantitative evaluation of the power of GoF tests
We propose a quantitative method to evaluate the
power of various GoF tests.
15
Binned distributions
Compare different distributions: Parent1 ≠ Parent2
16
Preliminary results
CvM test: more powerful, faster (CPU time)
17
Physics use case
18
[Figure: empirical power versus sample size for the K, AD, KS, CvM, W and WKSAD tests; three panels labelled ? = 0.25, µ = 2.0; ? = 0.25, µ = 0.5; ? = 0.75, µ = 3.5]
19
Conclusions
  • No clear winner for all the considered
    distributions in general
      • the performance of a test depends on its
        intrinsic features as well as on the features of
        the distributions to be compared
  • Practical recommendations
      • first classify the type of the distributions in
        terms of skewness and tailweight
      • then choose the most appropriate test for that
        type of distribution, evaluating the best test by
        means of the proposed quantitative model
  • Systematic study of the power in progress
      • for both binned and unbinned distributions
  • Topic still subject to research activity in the
    domain of statistics
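
The first practical recommendation, classifying distributions by skewness and tailweight, can be sketched with quantile-based descriptors. The transcript does not give the formulas used in the presentation, so the definitions below are an assumption: a common quantile-based choice in which both descriptors are close to 1 for a Gaussian. The function name `shape_descriptors` is hypothetical.

```python
import random
import statistics

# Assumed normalisation: (x_0.975 - x_0.025) / (x_0.875 - x_0.125) for a Gaussian.
GAUSSIAN_TAIL_RATIO = 1.704


def shape_descriptors(sample):
    """Return (skewness, tailweight) from sample quantiles (assumed definitions).

    skewness   = (x_0.975 - x_0.5) / (x_0.5 - x_0.025)
    tailweight = (x_0.975 - x_0.025) / (x_0.875 - x_0.125) / 1.704
    """
    q = statistics.quantiles(sample, n=40)  # cut points at 0.025, 0.050, ..., 0.975
    x025, x125, x500, x875, x975 = q[0], q[4], q[19], q[34], q[38]
    skewness = (x975 - x500) / (x500 - x025)
    tailweight = (x975 - x025) / (x875 - x125) / GAUSSIAN_TAIL_RATIO
    return skewness, tailweight


if __name__ == "__main__":
    rng = random.Random(0)
    parents = {
        "gaussian": lambda: rng.gauss(0.0, 1.0),
        "uniform": lambda: rng.random(),
        "exponential": lambda: rng.expovariate(1.0),
    }
    for name, draw in parents.items():
        print(name, shape_descriptors([draw() for _ in range(20000)]))
```

Under these assumed definitions, a skewness above 1 indicates a longer right tail, and a tailweight below or above 1 indicates tails lighter or heavier than a Gaussian's.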