Title: Alberto Ribon
1. Tutorial of the Statistical Toolkit
Geant4 Workshop Vancouver, September 2003
http://www.ge.infn.it/geant4/analysis/TandA
2. Download/setup the Statistical Toolkit
http://www.ge.infn.it/geant4/analysis/HEPstatistics
Download the StatisticsTesting-V1-00-00.tgz tarball.
You also need to install the following libraries:
1) GSL (GNU Scientific Library)
2) AIDA 3.0
3) Anaphe 5.0.5
Edit the script buildAll.py to set the proper GSL path:
GSL_DIR = /afs/cern.ch/sw/lhcxx/specific/redhat73/gcc3.2/PublicDomainPackages/2.0.0/
Run the script:
./buildAll.py
3. Statistical Tests available
Currently the available statistical tests are:
1) Chi2 test (for binned distributions)
2) Kolmogorov-Smirnov test (for unbinned distributions)
3) Cramer-von Mises test (for both binned and unbinned distributions)
4) Anderson-Darling test (for both binned and unbinned distributions)
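To make the idea of an unbinned "distance" concrete, here is a minimal standalone sketch of the two-sample Cramer-von Mises statistic, T = N*M/(N+M)^2 * sum over all pooled points of [F_N(x) - G_M(x)]^2, where F_N and G_M are the empirical distribution functions of the two samples. The names ecdfAt and cramerVonMisesDistance are hypothetical and are not taken from the toolkit, whose actual implementation may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Empirical CDF of a sorted sample: fraction of entries that are <= x.
double ecdfAt(const std::vector<double>& sorted, double x) {
    const auto it = std::upper_bound(sorted.begin(), sorted.end(), x);
    return static_cast<double>(it - sorted.begin()) /
           static_cast<double>(sorted.size());
}

// Two-sample Cramer-von Mises statistic for unbinned samples (hypothetical helper):
//   T = N*M / (N+M)^2 * sum over pooled points of [F_N(x) - G_M(x)]^2
double cramerVonMisesDistance(std::vector<double> a, std::vector<double> b) {
    assert(!a.empty() && !b.empty());
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    double sum = 0.0;
    for (double x : a) {  // contribution of the points of the first sample
        const double d = ecdfAt(a, x) - ecdfAt(b, x);
        sum += d * d;
    }
    for (double x : b) {  // contribution of the points of the second sample
        const double d = ecdfAt(a, x) - ecdfAt(b, x);
        sum += d * d;
    }

    const double n = static_cast<double>(a.size());
    const double m = static_cast<double>(b.size());
    return n * m / ((n + m) * (n + m)) * sum;
}
```

Two identical samples give a distance of zero; the value grows as the two empirical distributions separate.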
4. The AIDA classes you need to know
AIDA::IHistogram1D : for 1-dimensional binned distribution
AIDA::ICloud1D : for 1-dimensional unbinned distribution
AIDA::IDataPointSet : vector of IDataPoint
IDataPoint : is a vector of IMeasurement
IMeasurement : (value, errorPlus, errorMinus)
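The nesting of these three types can be pictured with plain structs. This is only an illustration of the layout, assuming hypothetical names (Measurement, DataPoint, DataPointSet, makePoint2D); the real AIDA types are abstract interfaces created through factories, not structs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins that mirror the nesting described above.
// They are NOT the AIDA interfaces themselves.
struct Measurement {
    double value;
    double errorPlus;
    double errorMinus;
};

struct DataPoint {
    std::vector<Measurement> coordinates;  // one Measurement per dimension
    std::size_t dimension() const { return coordinates.size(); }
};

struct DataPointSet {
    std::vector<DataPoint> points;
    std::size_t size() const { return points.size(); }
};

// Build a 2-dimensional point: x without errors, y with asymmetric errors.
DataPoint makePoint2D(double x, double y, double yErrorPlus, double yErrorMinus) {
    DataPoint p;
    p.coordinates.push_back(Measurement{x, 0.0, 0.0});
    p.coordinates.push_back(Measurement{y, yErrorPlus, yErrorMinus});
    return p;
}
```

A set of N points of dimension 2 is then simply N entries, each holding an x and a y measurement.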
5. The StatisticsTesting classes you need to know
namespace StatisticsTesting {

  template < class Algorithm >  // For instance: Chi2ComparisonAlgorithm,
                                //               KolmogorovSmirnovComparisonAlgorithm.
  class StatisticsComparator {
  public:
    ComparisonResult compare( const AIDA::IDataPointSet& dps1,
                              const AIDA::IDataPointSet& dps2 );
    ComparisonResult compare( const AIDA::IHistogram1D& histo1,
                              const AIDA::IHistogram1D& histo2 );
    ComparisonResult compare( const AIDA::ICloud1D& cloud1,
                              const AIDA::ICloud1D& cloud2 );
    ...
  };

  class ComparisonResult {
  public:
    double distance();
    double quality();
    double ndf();
    ...
  };

}
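The comparator takes the algorithm as a template parameter, so the same compare() call works with any test. The pattern can be sketched in isolation as follows; SketchResult, MaxDifferenceAlgorithm and SketchComparator are hypothetical names invented for this illustration, and the "algorithm" is a deliberately trivial one, not one of the toolkit's tests.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type, loosely modelled on the slide's ComparisonResult.
struct SketchResult {
    double distance;
    double ndf;
};

// Hypothetical algorithm policy: largest absolute bin-by-bin difference.
struct MaxDifferenceAlgorithm {
    static SketchResult compare(const std::vector<double>& a,
                                const std::vector<double>& b) {
        assert(a.size() == b.size());
        double worst = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) {
            worst = std::max(worst, std::fabs(a[i] - b[i]));
        }
        return SketchResult{worst, static_cast<double>(a.size())};
    }
};

// The comparator only forwards to the algorithm chosen at compile time.
template <class Algorithm>
class SketchComparator {
public:
    SketchResult compare(const std::vector<double>& a,
                         const std::vector<double>& b) const {
        return Algorithm::compare(a, b);
    }
};
```

Switching test then means changing only the template argument, while the calling code stays the same.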
6. Chi2 test between histograms
#include <iostream>
#include <memory>
#include "AIDA/AIDA.h"
#include "StatisticsTesting/StatisticsComparator.h"
#include "Chi2ComparisonAlgorithm.h"
#include "ComparisonResult.h"
using namespace StatisticsTesting;

// Create the AIDA factories and an in-memory tree.
std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() );
std::auto_ptr<AIDA::ITreeFactory> tf( af->createTreeFactory() );
std::auto_ptr<AIDA::ITree> tree( tf->create() );
std::auto_ptr<AIDA::IHistogramFactory> hf( af->createHistogramFactory( *tree ) );

// Book two histograms with the same binning and fill them.
AIDA::IHistogram1D& hA = *( hf->createHistogram1D( "A", 100, 0.0, 50.0 ) );
AIDA::IHistogram1D& hB = *( hf->createHistogram1D( "B", 100, 0.0, 50.0 ) );
hA.fill( 15.7 ); ...
hB.fill( 23.4 ); ...

// Compare the two histograms with the Chi2 algorithm.
StatisticsComparator< Chi2ComparisonAlgorithm > comparator;
ComparisonResult result = comparator.compare( hA, hB );
std::cout << " distance = " << result.distance()
          << " ndf = " << result.ndf()
          << " p-value = " << result.quality() << std::endl;
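For readers who want to see what a Chi2 distance between two binned distributions looks like numerically, here is a standalone sketch using the standard two-histogram homogeneity formula, chi2 = sum_i (sqrt(M/N) n_i - sqrt(N/M) m_i)^2 / (n_i + m_i), with N and M the total contents. It is an assumption that the toolkit uses this exact convention; Chi2Sketch and chi2BetweenHistograms are hypothetical names, and the p-value (which needs the chi2 cumulative distribution, e.g. from GSL) is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result holder for this sketch.
struct Chi2Sketch {
    double chi2;
    int ndf;
};

// Chi2 between two unweighted histograms given as vectors of bin contents.
// Bins that are empty in both histograms are skipped.
// Assumption: ndf = (number of used bins) - 1.
Chi2Sketch chi2BetweenHistograms(const std::vector<double>& n,
                                 const std::vector<double>& m) {
    assert(n.size() == m.size());
    double totalN = 0.0;
    double totalM = 0.0;
    for (std::size_t i = 0; i < n.size(); ++i) {
        totalN += n[i];
        totalM += m[i];
    }
    assert(totalN > 0.0 && totalM > 0.0);

    // Scale factors that account for different total numbers of entries.
    const double scaleN = std::sqrt(totalM / totalN);
    const double scaleM = std::sqrt(totalN / totalM);

    double chi2 = 0.0;
    int usedBins = 0;
    for (std::size_t i = 0; i < n.size(); ++i) {
        const double sum = n[i] + m[i];
        if (sum <= 0.0) continue;  // empty in both histograms
        const double diff = scaleN * n[i] - scaleM * m[i];
        chi2 += diff * diff / sum;
        ++usedBins;
    }
    return Chi2Sketch{chi2, usedBins - 1};
}
```

Identical histograms give chi2 = 0, and a chi2 much larger than ndf signals that the two distributions are unlikely to share the same parent.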
7. Kolmogorov-Smirnov test between clouds
#include <iostream>
#include <memory>
#include "AIDA/AIDA.h"
#include "StatisticsTesting/StatisticsComparator.h"
#include "KolmogorovSmirnovComparisonAlgorithm.h"
#include "ComparisonResult.h"
using namespace StatisticsTesting;

// Create the AIDA factories and an in-memory tree.
std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() );
std::auto_ptr<AIDA::ITreeFactory> tf( af->createTreeFactory() );
std::auto_ptr<AIDA::ITree> tree( tf->create() );
std::auto_ptr<AIDA::IHistogramFactory> hf( af->createHistogramFactory( *tree ) );

// Book two clouds (unbinned distributions) and fill them.
AIDA::ICloud1D& cloudA = *( hf->createCloud1D( "A" ) );
AIDA::ICloud1D& cloudB = *( hf->createCloud1D( "B" ) );
cloudA.fill( 15.7 ); ...
cloudB.fill( 23.4 ); ...

// Compare the two clouds with the Kolmogorov-Smirnov algorithm.
StatisticsComparator< KolmogorovSmirnovComparisonAlgorithm > comparator;
ComparisonResult result = comparator.compare( cloudA, cloudB );
std::cout << " K-S distance = " << result.distance()
          << " p-value = " << result.quality() << std::endl;
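The K-S distance printed above is, by definition, the largest absolute difference between the two empirical cumulative distributions. A minimal standalone sketch of that computation on two unbinned samples is given below; kolmogorovSmirnovDistance is a hypothetical name, and the toolkit's own implementation (and its p-value calculation) may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Two-sample Kolmogorov-Smirnov distance: D = max over x of |F_a(x) - F_b(x)|,
// where F_a and F_b are the empirical CDFs of the two samples.
double kolmogorovSmirnovDistance(std::vector<double> a, std::vector<double> b) {
    assert(!a.empty() && !b.empty());
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    const double na = static_cast<double>(a.size());
    const double nb = static_cast<double>(b.size());
    std::size_t i = 0;
    std::size_t j = 0;
    double distance = 0.0;

    // Walk through the pooled values in increasing order. Once one sample is
    // exhausted its CDF stays at 1, so the difference can only shrink.
    while (i < a.size() && j < b.size()) {
        const double x = std::min(a[i], b[j]);
        while (i < a.size() && a[i] <= x) ++i;  // entries of a that are <= x
        while (j < b.size() && b[j] <= x) ++j;  // entries of b that are <= x
        const double fa = static_cast<double>(i) / na;
        const double fb = static_cast<double>(j) / nb;
        distance = std::max(distance, std::fabs(fa - fb));
    }
    return distance;
}
```

The distance lies between 0 (identical samples) and 1 (completely separated samples).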
8. Example of an XML data file
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE aida SYSTEM "http://aida.freehep.org/schemas/3.0/aida.dtd">
<aida version="3.0">
  <implementation package="Anaphe" version="5.0.0"/>
  <dataPointSet dimension="2" name="attenuation coefficient" path="/"
                title="attenuation coefficient in Ge">
    <annotation>
      <item key="Title" value="attenuation coefficient in Ge"/>
      <item key="Name" value="attenuation coefficient"/>
      <item key="Size" value="24"/>
    </annotation>
    <dataPoint>
      <measurement value="1.000e-03"/>
      <measurement errorMinus="9.465e+01" errorPlus="9.465e+01" value="1.893e+03"/>
    </dataPoint>
    <dataPoint>
      <measurement value="1.500e-03"/>
      <measurement errorMinus="1.3555e+02" errorPlus="2.7375e+02" value="5.475e+03"/>
    </dataPoint>
    <dataPoint>
      <measurement value="1.500e+01"/>
      <measurement errorMinus="1.670e-03" errorPlus="1.670e-03" value="3.340e-02"/>
    </dataPoint>
  </dataPointSet>
</aida>
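To show how the dataPoint and measurement elements map onto (x, y, errors) values, here is a small hand-rolled writer that emits only the dataPointSet element in the layout shown above. It is a sketch for illustration, not the AIDA writer: XmlPoint and toDataPointSetXml are hypothetical names, the annotation block is omitted, and attribute values are not XML-escaped.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical 2-dimensional point: x without errors, y with asymmetric errors.
struct XmlPoint {
    double x;
    double y;
    double yErrorPlus;
    double yErrorMinus;
};

// Serialize points as a <dataPointSet> element in the layout shown above.
// Assumption: name and title contain no characters that need XML escaping.
std::string toDataPointSetXml(const std::string& name,
                              const std::string& title,
                              const std::vector<XmlPoint>& points) {
    std::ostringstream out;
    out << std::scientific;
    out.precision(3);  // e.g. 1.893e+03

    out << "<dataPointSet dimension=\"2\" name=\"" << name
        << "\" path=\"/\" title=\"" << title << "\">\n";
    for (const XmlPoint& p : points) {
        out << "  <dataPoint>\n"
            << "    <measurement value=\"" << p.x << "\"/>\n"
            << "    <measurement errorMinus=\"" << p.yErrorMinus
            << "\" errorPlus=\"" << p.yErrorPlus
            << "\" value=\"" << p.y << "\"/>\n"
            << "  </dataPoint>\n";
    }
    out << "</dataPointSet>\n";
    return out.str();
}
```

Each dataPoint holds one measurement per dimension: the first is the x coordinate, the second is the y value with its errors.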
9. How to write an XML data file
// Create the factories and an XML-backed tree (readOnly = false, createNew = true).
aFact = AIDA_createAnalysisFactory();
treeFact = aFact->createTreeFactory();
theTree = treeFact->create( "test50.xml", "xml", false, true, "uncompress" );

// Book a 2-dimensional data point set.
dataPointFactory = aFact->createDataPointSetFactory( *theTree );
particleTransmissionDataPoint = dataPointFactory->create( "Transmission test", 2 );

// Add one point and set its x value and its y value with errors.
particleTransmissionDataPoint->addPoint();
AIDA::IDataPoint* point = particleTransmissionDataPoint->point( PointNumber );
AIDA::IMeasurement* coordinateX = point->coordinate( 0 );
coordinateX->setValue( primaryParticleEnergy );
AIDA::IMeasurement* coordinateY = point->coordinate( 1 );
coordinateY->setValue( TransFraction );
coordinateY->setErrorPlus( TransError );
coordinateY->setErrorMinus( TransError );

// Write the tree to the file and close it.
theTree->commit();
theTree->close();
10. How to compare XML dataPointSets
#include <iostream>
#include <memory>
#include "AIDA/AIDA.h"
#include "StatisticsTesting/StatisticsComparator.h"
#include "Chi2ComparisonAlgorithm.h"
#include "ComparisonResult.h"
using namespace StatisticsTesting;

std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() );
std::auto_ptr<AIDA::ITreeFactory> tf( af->createTreeFactory() );
std::auto_ptr<AIDA::ITree> tree( tf->create() );
std::auto_ptr<AIDA::IDataPointSetFactory> dpsf( af->createDataPointSetFactory( *tree ) );

// Open the two XML files read-only (readOnly = true, createNew = false).
std::auto_ptr<AIDA::ITree> treeXML1( tf->create( "gamma_lowE_Ge.xml", "xml", true, false ) );
std::auto_ptr<AIDA::ITree> treeXML2( tf->create( "NIST_attenuationGamma_Ge.xml", "xml", true, false ) );

// Retrieve the data point sets by name from each tree.
AIDA::IDataPointSet& dps1 =
  *( dynamic_cast<AIDA::IDataPointSet*>( treeXML1->find( "Gamma attenuation coefficient test" ) ) );
AIDA::IDataPointSet& dps2 =
  *( dynamic_cast<AIDA::IDataPointSet*>( treeXML2->find( "Gamma attenuation coefficient test" ) ) );
std::cout << " title = " << dps1.title()
          << " size = " << dps1.size()
          << " dimension = " << dps1.dimension() << std::endl;

// Compare the two data point sets with the Chi2 algorithm.
StatisticsComparator< Chi2ComparisonAlgorithm > comparator;
ComparisonResult result = comparator.compare( dps1, dps2 );
std::cout << " distance = " << result.distance()
          << " ndf = " << result.ndf()
          << " p-value = " << result.quality() << std::endl;
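When both data point sets carry errors on y, one common way to build a Chi2 distance is to sum the squared differences weighted by the combined variances, chi2 = sum_i (y1_i - y2_i)^2 / (sigma1_i^2 + sigma2_i^2). The sketch below assumes this convention, symmetric errors, points taken at the same x values, and ndf equal to the number of points; none of this is confirmed to be what the toolkit does. PointWithError, DpsChi2 and chi2BetweenPointSets are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical y value with a symmetric error.
struct PointWithError {
    double y;
    double sigma;
};

// Hypothetical result holder for this sketch.
struct DpsChi2 {
    double chi2;
    int ndf;
};

// Chi2 between two sets of points taken at the same x values.
// Assumptions: symmetric errors; ndf = number of points.
DpsChi2 chi2BetweenPointSets(const std::vector<PointWithError>& a,
                             const std::vector<PointWithError>& b) {
    assert(a.size() == b.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double variance = a[i].sigma * a[i].sigma + b[i].sigma * b[i].sigma;
        assert(variance > 0.0);  // at least one of the two errors must be non-zero
        const double diff = a[i].y - b[i].y;
        chi2 += diff * diff / variance;
    }
    return DpsChi2{chi2, static_cast<int>(a.size())};
}
```

With asymmetric errors, as in the XML example, a choice has to be made about which error to use on each side; that choice is not covered here.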
11. Conclusions
- The Statistical Toolkit already provides some important functionalities
- It is already used in test50 for Geant4 physics tests and regression
- It is simple to install and to use
- We are working on the documentation
- Other statistical tests are under development
- Various long-term extensions are foreseen