Title: Comparison of data distributions: the power of Goodness-of-Fit Tests
1 Comparison of data distributions: the power of Goodness-of-Fit Tests
- B. Mascialino (1), A. Pfeiffer (2), M.G. Pia (1), A. Ribon (2), P. Viarengo (3)
- (1) INFN Genova, Italy
- (2) CERN, Geneva, Switzerland
- (3) IST National Institute for Cancer Research, Genova, Italy
IEEE NSS 2006, San Diego, October 29-November 5, 2006
2 Goodness-of-Fit testing
Goodness-of-fit testing is the mathematical foundation for the comparison of data distributions.
Use cases in experimental physics:
- Regression testing
  - Throughout the software life-cycle
- Online DAQ
  - Monitoring detector behaviour w.r.t. a reference
- Simulation validation
  - Comparison with experimental data
- Reconstruction
  - Comparison of reconstructed vs. expected distributions
- Physics analysis
  - Comparisons of experimental distributions
  - Comparison with theoretical distributions
4 GoF algorithms in the Statistical Toolkit
TWO-SAMPLE PROBLEM
- Binned distributions
  - Anderson-Darling test
  - Chi-squared test
  - Fisz-Cramer-von Mises test
  - Tiku test (Cramer-von Mises test in chi-squared approximation)
- Unbinned distributions
  - Anderson-Darling test
  - Anderson-Darling approximated test
  - Cramer-von Mises test
  - Generalised Girone test
  - Goodman test (Kolmogorov-Smirnov test in chi-squared approximation)
  - Kolmogorov-Smirnov test
  - Kuiper test
  - Tiku test (Cramer-von Mises test in chi-squared approximation)
  - Weighted Kolmogorov-Smirnov test
  - Weighted Cramer-von Mises test
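To make the two-sample problem concrete, the sketch below runs three of the unbinned tests listed above on a pair of samples. It does not use the Statistical Toolkit; it assumes SciPy's implementations (`ks_2samp`, `cramervonmises_2samp`, `anderson_ksamp`) as stand-ins, and the two Gaussian parents, the sample size and the seed are illustrative choices only.

```python
import warnings

import numpy as np
from scipy import stats

# Illustrative parents (an assumption, not from the slides): two unit-width
# Gaussians whose means differ by one standard deviation.
rng = np.random.default_rng(12345)
sample_1 = rng.normal(loc=0.0, scale=1.0, size=200)
sample_2 = rng.normal(loc=1.0, scale=1.0, size=200)

# Kolmogorov-Smirnov and Cramer-von Mises two-sample tests return a p-value.
ks = stats.ks_2samp(sample_1, sample_2)
cvm = stats.cramervonmises_2samp(sample_1, sample_2)

# The k-sample Anderson-Darling test warns when its p-value is outside the
# tabulated range, so the statistic is compared with the 5% critical value.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    ad = stats.anderson_ksamp([sample_1, sample_2])
ad_rejects_at_5_percent = ad.statistic > ad.critical_values[2]

print(f"Kolmogorov-Smirnov: D = {ks.statistic:.3f}, p = {ks.pvalue:.3g}")
print(f"Cramer-von Mises:   T = {cvm.statistic:.3f}, p = {cvm.pvalue:.3g}")
print(f"Anderson-Darling:   A2 = {ad.statistic:.3f}, rejects at 5%: {ad_rejects_at_5_percent}")
```

All three tests address the same null hypothesis (both samples come from one parent distribution) but weight the difference between the empirical distribution functions differently, which is why their power can differ.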
5 Performance of the GoF tests
6 Power of GoF tests
- Do we really need such a wide collection of GoF tests? Why?
- Which is the most appropriate test to compare two distributions?
- How good is a test at recognizing truly equivalent distributions and rejecting fake ones?
- No comprehensive study of the relative power of GoF tests exists in the literature
  - novel research in statistics (not only in physics data analysis!)
- Systematic study of all existing GoF tests in progress
  - made possible by the extensive collection of tests in the Statistical Toolkit
7 Method for the evaluation of power
The power of a test is the probability of
rejecting the null hypothesis correctly
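In the usual textbook notation (the symbols are not taken from the slides), with $H_0$ the null hypothesis that the two samples share one parent distribution, $H_1$ the alternative, $\alpha$ the probability of a false rejection and $\beta$ the probability of a missed rejection, this definition reads:

```latex
\[
  \text{power} \;=\; P(\text{reject } H_0 \mid H_1 \text{ true}) \;=\; 1 - \beta ,
  \qquad
  \alpha \;=\; P(\text{reject } H_0 \mid H_0 \text{ true}) .
\]
```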
Pseudo-experiment: a random drawing of two samples (Sample 1 and Sample 2, each of size n) from two parent distributions (Parent distribution 1 and Parent distribution 2), followed by a GoF test comparing the two samples
- N = 10000 Monte Carlo replicas
- Confidence level = 0.05
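The procedure above can be sketched as a short Monte Carlo loop. This is a minimal illustration, not the authors' code: it assumes SciPy's `ks_2samp` as the GoF test, reads the 0.05 as the p-value threshold below which the null hypothesis is rejected, and uses fewer replicas than the 10000 quoted above so that it runs quickly. The helper names (`empirical_power`, `standard_normal`, `shifted_normal`) and the Gaussian parents are hypothetical.

```python
import numpy as np
from scipy import stats


def empirical_power(draw_1, draw_2, test, n, n_replicas=500, alpha=0.05, seed=0):
    """Fraction of pseudo-experiments in which `test` rejects the null hypothesis."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_replicas):
        # One pseudo-experiment: two samples of size n, one from each parent.
        sample_1 = draw_1(rng, n)
        sample_2 = draw_2(rng, n)
        if test(sample_1, sample_2).pvalue < alpha:
            rejections += 1
    return rejections / n_replicas


def standard_normal(rng, n):
    """Hypothetical parent distribution 1: Gaussian with mean 0 and width 1."""
    return rng.normal(0.0, 1.0, size=n)


def shifted_normal(rng, n):
    """Hypothetical parent distribution 2: Gaussian with mean 1 and width 1."""
    return rng.normal(1.0, 1.0, size=n)


# Different parents: the rejection fraction estimates the power.
power_ks = empirical_power(standard_normal, shifted_normal, stats.ks_2samp, n=50)
# Identical parents: the rejection fraction estimates the false-rejection rate.
size_ks = empirical_power(standard_normal, standard_normal, stats.ks_2samp, n=50)

print(f"empirical power (different parents): {power_ks:.3f}")
print(f"false-rejection rate (same parent):  {size_ks:.3f}")
```

Running the same loop with identical parents is a useful sanity check: the rejection fraction should then stay close to (or below) the chosen threshold.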
8 Analysis cases
Power analysis on a set of reference mathematical distributions:
- Data samples drawn from different parent distributions
- Data samples drawn from the same parent distribution
  - Applying a scale factor
  - Applying a shift
Power analysis on some typical physics applications:
- Use cases in experimental physics
  - Signal over background
  - Hot channel, dead channel
  - etc.
Is there any recipe to identify the best test to use?
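The "same parent distribution" cases can be illustrated as follows. This is a sketch under stated assumptions: the exponential parent, the scale factor of 2.0, the shift of 0.5 and the sample size are arbitrary illustrative values, and SciPy's `cramervonmises_2samp` stands in for the toolkit's Cramer-von Mises test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2006)
n = 500

# Reference sample and three comparison samples drawn from the same parent
# (an exponential, chosen only for illustration).
reference = rng.exponential(scale=1.0, size=n)
cases = {
    "same parent": rng.exponential(scale=1.0, size=n),
    "scale factor 2.0": 2.0 * rng.exponential(scale=1.0, size=n),
    "shift +0.5": rng.exponential(scale=1.0, size=n) + 0.5,
}

# Compare each case with the reference using a two-sample Cramer-von Mises test.
p_values = {
    name: stats.cramervonmises_2samp(reference, sample).pvalue
    for name, sample in cases.items()
}

for name, p_value in p_values.items():
    print(f"{name:>16}: p = {p_value:.3g}")
```

Smaller scale factors or shifts make the two samples harder to tell apart, which is the regime where the choice of test matters most.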
9 Parent reference distributions
10 Tailweight and skewness
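The definitions of skewness and tailweight used on this slide are not preserved in the transcript. As an illustration only, the sketch below assumes one common quantile-based choice: skewness as the ratio of the upper to the lower spread around the median, and tailweight as the ratio of the 2.5%-97.5% range to the 12.5%-87.5% range. The function names and the three example parents are hypothetical.

```python
import numpy as np


def skewness_index(sample):
    """Assumed definition: (x_0.975 - x_0.5) / (x_0.5 - x_0.025); 1 if symmetric."""
    q025, q500, q975 = np.quantile(sample, [0.025, 0.5, 0.975])
    return (q975 - q500) / (q500 - q025)


def tailweight_index(sample):
    """Assumed definition: (x_0.975 - x_0.025) / (x_0.875 - x_0.125)."""
    q025, q125, q875, q975 = np.quantile(sample, [0.025, 0.125, 0.875, 0.975])
    return (q975 - q025) / (q875 - q125)


rng = np.random.default_rng(42)
parents = {
    "Gaussian": rng.normal(size=100_000),
    "Exponential": rng.exponential(size=100_000),
    "Cauchy": rng.standard_cauchy(size=100_000),
}

indices = {
    name: (skewness_index(sample), tailweight_index(sample))
    for name, sample in parents.items()
}

for name, (skewness, tailweight) in indices.items():
    print(f"{name:>12}: skewness = {skewness:.2f}, tailweight = {tailweight:.2f}")
```

With these definitions a symmetric parent has skewness close to 1, a right-skewed parent has skewness above 1, and heavier tails give a larger tailweight.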
11 Compare different distributions: Parent1 ≠ Parent2
Unbinned distributions
12 The power increases as a function of the sample size
No clear winner
13 The power varies as a function of the characteristics of the parent distributions
General recipe
p < 0.0001
14 Quantitative evaluation of the power of GoF tests
We propose a quantitative method to evaluate the
power of various GoF tests.
15 Binned distributions
Compare different distributions: Parent1 ≠ Parent2
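For the binned case, a comparison of two histograms can be sketched with a chi-squared test of homogeneity. This is an illustration under assumptions, not the toolkit's implementation: it uses SciPy's `chi2_contingency` on the 2 x k table of bin contents, and the Gaussian parents, the binning and the seed are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Illustrative parents (an assumption): Gaussians with different widths.
sample_1 = rng.normal(0.0, 1.0, size=1000)
sample_2 = rng.normal(0.0, 1.5, size=1000)

# Fill both histograms with the same bin edges.
edges = np.linspace(-8.0, 8.0, 33)
counts_1, _ = np.histogram(sample_1, bins=edges)
counts_2, _ = np.histogram(sample_2, bins=edges)

# Bins that are empty in both histograms carry no information and would give
# zero expected frequencies, so they are dropped before the test.
occupied = (counts_1 + counts_2) > 0
table = np.vstack([counts_1[occupied], counts_2[occupied]])

chi2, p_value, dof, _ = stats.chi2_contingency(table, correction=False)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p_value:.3g}")
```

The chi-squared approximation becomes unreliable when many bins have low contents, so in practice sparse tail bins are often merged before the comparison.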
16 Preliminary results
CvM test: more powerful, faster (CPU time)
17 Physics use case
18 [Figure: empirical power as a function of the sample size, in three panels labelled "? = 0.25, µ = 2.0", "? = 0.25, µ = 0.5" and "? = 0.75, µ = 3.5". Legend entries: K, AD, KS, CvM, W and WKSAD across the first two panels; AD, CvM and WKSAD in the third.]
19 Conclusions
- No clear winner for all the considered distributions in general
  - the performance of a test depends on its intrinsic features as well as on the features of the distributions to be compared
- Practical recommendations
  - first classify the type of the distributions in terms of skewness and tailweight
  - choose the most appropriate test given the type of distributions, evaluating the best test by means of the quantitative model proposed
- Systematic study of the power in progress
  - for both binned and unbinned distributions
- Topic still subject to research activity in the domain of statistics