Title: Omeostatico nuova codifica
1 A Toolkit for statistical comparison of data distributions
Monte Carlo 2005 - Chattanooga, April 2005
2 Data analysis
- Provide tools for the statistical comparison of distributions
- equivalent reference distributions
- experimental measurements
- data from reference sources
- functions deriving from theoretical calculations or fits
Detector monitoring, simulation validation, reconstruction vs. expectation, regression testing, physics analysis
3 GoF statistical toolkit
A project to develop a statistical comparison system
4 Software process guidelines
- Unified Software Development Process, specifically tailored to the project
- practical guidance and tools from the RUP
- both rigorous and lightweight
- mapping onto ISO 15504
- Guidance from ISO 15504
- Incremental and iterative life cycle model
5 Architectural guidelines
- The project adopts a solid architectural approach
- to offer the functionality and the quality needed by the users
- to be maintainable over a long time scale
- to be extensible, to accommodate future evolutions of the requirements
- Component-based approach
- to facilitate re-use and integration in different frameworks
- AIDA
- adopt a (HEP) standard
- no dependence on any specific analysis tool
7 The tests are specialised on the kind of distribution (binned/unbinned)
8 G.A.P. Cirrone, S. Donadio, S. Guatelli, A. Mantero, B. Mascialino, S. Parlati, M.G. Pia, A. Pfeiffer, A. Ribon, P. Viarengo, "A Goodness-of-Fit Statistical Toolkit", IEEE Transactions on Nuclear Science 51 (5) (2004) 2056-2063.
Release StatisticsTesting-V1-01-00 downloadable from the web: http://www.ge.infn.it/geant4/analysis/HEPstatistics/
9
- Applies to binned distributions
- It can also be useful for unbinned distributions, but the data must be grouped into classes
- Cannot be applied if the theoretical frequency in any class is < 5
- If some classes fall below this threshold, one could try to merge contiguous classes until the minimum theoretical frequency is reached
- Otherwise one could use Yates' correction
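The class-merging step can be sketched as follows. This is a minimal illustration, not the toolkit's implementation. It assumes the slide refers to Pearson's chi-squared test. The function names, the left-to-right merging order and the sample counts are all hypothetical.

```python
def merge_classes(observed, expected, min_expected=5.0):
    """Merge contiguous classes, left to right, until every merged class
    has a theoretical (expected) frequency of at least min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0.0, 0.0
    # A leftover tail below the threshold is folded into the last merged class.
    if acc_exp > 0.0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def chi2_statistic(observed, expected):
    """Pearson's statistic: sum over classes of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: the outer classes have theoretical frequencies below 5.
observed = [1, 3, 8, 12, 9, 4, 2]
expected = [2.0, 4.0, 7.0, 11.0, 8.0, 4.0, 3.0]

obs_m, exp_m = merge_classes(observed, expected)
chi2 = chi2_statistic(obs_m, exp_m)
dof = len(obs_m) - 1  # degrees of freedom when no parameter is fitted
```

Merging reduces the number of classes and therefore the degrees of freedom, which is one way the test loses information on sparsely populated distributions.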
10 Supremum statistics (unbinned distributions): D_mn
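As an illustration of a supremum statistic, the sketch below computes the two-sample Kolmogorov-Smirnov distance: the largest absolute difference between the two empirical distribution functions. It assumes that D_mn on the slide denotes this quantity for samples of sizes m and n. The function name and the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample KS distance: max over x of |F_m(x) - G_n(x)|.

    Both empirical distribution functions are step functions, so it is
    enough to evaluate the difference at the pooled data points.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    m, n = len(a), len(b)
    d = 0.0
    for x in a + b:
        f = bisect_right(a, x) / m  # fraction of sample_a that is <= x
        g = bisect_right(b, x) / n  # fraction of sample_b that is <= x
        d = max(d, abs(f - g))
    return d


# Hypothetical unbinned samples.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.5, 3.5, 4.5, 5.5]
d_mn = ks_distance(a, b)
```

No binning is involved: the statistic uses every individual observation, which is why it suits unbinned distributions.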
11 binned/unbinned distributions
12
- Simple user layer
- Only deal with AIDA objects and choice of comparison algorithm
13 Test process
Rigorous software process adopted
Unit tests, integration tests, system tests
- Testing focuses primarily on evaluating the quality of the software product, guaranteeing its correctness and robustness
- finding and documenting defects in software quality
- validating that the software product functions as designed
- validating that the requirements have been implemented appropriately
- Test result summaries are included as part of the documentation of the Toolkit release and are available on the web.
14
- Weighted KS tests
- Weighted CVM tests
- CVM approximation to a χ² (Tiku test)
- Exact Anderson-Darling test
- Watson test
- Watson approximation to a χ² (Tiku test)
- With these tests the GoF Statistical Toolkit will be the most complete toolkit for the two-sample problem in physics as well as in the statistics domain.
[Group labels on the slide: supremum statistics, quadratic statistics, unbinned distributions, binned/unbinned distributions]
15 The power of a test is the probability of correctly rejecting the null hypothesis, i.e. of rejecting it when it is false
In terms of power
- χ² loses information in a test for unbinned distributions by grouping the data into cells
- Kac, Kiefer and Wolfowitz (1955) showed that the Kolmogorov-Smirnov test requires n^{4/5} observations compared to n observations for χ² to attain the same power
- Cramer-von Mises and Anderson-Darling statistics are expected to be superior to Kolmogorov-Smirnov's, since they compare the two distributions all along the range of x, rather than looking for a marked difference at one point
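The last point can be made concrete with a small sketch. The Kolmogorov-Smirnov statistic keeps only the largest gap between the two empirical distribution functions. A quadratic statistic of the Cramer-von Mises type accumulates the squared gap over all pooled observations. The two-sample form used here, T = mn/(m+n)^2 times the sum of squared gaps, is the standard textbook definition and is not taken from the toolkit's code. The function names and sample values are hypothetical.

```python
from bisect import bisect_right


def edf_gaps(sample_a, sample_b):
    """Differences F_m(x) - G_n(x) evaluated at every pooled observation."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    m, n = len(a), len(b)
    return [bisect_right(a, x) / m - bisect_right(b, x) / n
            for x in sorted(a + b)]


def ks_statistic(sample_a, sample_b):
    """Supremum statistic: only the single largest gap matters."""
    return max(abs(g) for g in edf_gaps(sample_a, sample_b))


def cvm_statistic(sample_a, sample_b):
    """Quadratic statistic: squared gaps are summed along the whole range."""
    m, n = len(sample_a), len(sample_b)
    gaps = edf_gaps(sample_a, sample_b)
    return m * n / (m + n) ** 2 * sum(g * g for g in gaps)


# Hypothetical samples: the second is shifted with respect to the first.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.5, 3.5, 4.5, 5.5]
d = ks_statistic(a, b)
t = cvm_statistic(a, b)
```

In this example the gap of 0.5 occurs at three different points. The supremum statistic registers it once, while the quadratic statistic receives a contribution from each of them.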
17 K. Amako, S. Guatelli, V. Ivanchenko, M. Maire, B. Mascialino, K. Murakami, P. Nieminen, L. Pandola, S. Parlati, A. Pfeiffer, M. G. Pia, M. Piergentili, T. Sasaki, L. Urban, "Precision validation of Geant4 electromagnetic physics"
p-value stability study
[Plots; legend entries: Geant4 LowE Penelope, Geant4 Standard, Geant4 LowE EEDL, NIST - XCOM; axis label: Z]
The three Geant4 models are equivalent
18 Radioprotection applications in manned space missions
inflatable habitat
- Comparison of inflatable and conventional rigid habitat concepts
- Effect of different shielding materials
- Effect of shielding thickness
- E.m. and hadronic interactions
thanks to Susanna Guatelli
KS TEST
S. Guatelli, B. Mascialino, P. Nieminen, M. G. Pia, "Radioprotection for interplanetary manned missions"
19 χ² not appropriate (< 5 entries in some bins, physical information would be lost if rebinned)
Anderson-Darling: A_c (95%) = 0.752
A. Mantero, B. Mascialino, P. Nieminen, M. G. Pia, A. Owens, M. Bavdaz, A. Peacock, "A library for simulated X-ray emission from planetary surfaces"
thanks to Alfonso Mantero
20 Kolmogorov-Smirnov test
[Two plots: dose vs. distance (mm)]
thanks to Michela Piergentili

Range (mm)     D       p-value
-56 to -35     0.26    0.89
-34 to -22     0.43    0.42
-21 to 21      0.38    0.08
22 to 32       0.26    0.98
33 to 36       0.57    0.13

Range (mm)     D       p-value
-84 to -60     0.385   0.23
-59 to -48     0.27    0.90
-47 to 47      0.43    0.19
48 to 59       0.30    0.82
60 to 84       0.40    0.10

F. Foppiano, B. Mascialino, M. G. Pia, M. Piergentili, "Geant4 simulation of an accelerator head for intensity modulated radiotherapy"
21 Conclusions
- This is a new, up-to-date, easy-to-handle and powerful tool for statistical comparison in particle physics.
- It is the first tool supplying such a variety of sophisticated and powerful statistical tests in HEP.
- Released and downloadable from the web.
- AIDA interfaces allow its integration in any other data analysis tool.
Applications in HEP, astrophysics, medical
physics,