Statistical Testing Project - PowerPoint PPT Presentation

About This Presentation

Title:

Statistical Testing Project

Description:

Maria Grazia Pia, INFN Genova, on behalf of the Statistical Testing Team. LCG-Application Meeting, CERN, 27 November 2002. http://www.ge.infn.it/geant4/analysis/TandA

Slides: 37

Transcript and Presenter's Notes

Title: Statistical Testing Project

1
Statistical Testing Project

Maria Grazia Pia, INFN Genova
on behalf of the Statistical Testing Team

LCG-Application Meeting CERN, 27 November 2002
http://www.ge.infn.it/geant4/analysis/TandA
2
History and background
3
What is it?
A project to develop a statistical analysis
system, to be used in Geant4 testing
Main application areas in Geant4: physics validation, regression testing, system testing

Provide tools for the statistical comparison of
distributions
equivalent reference distributions (for
instance, regression testing)
experimental measurements
data from reference sources
functions deriving from theoretical calculations
or from fits

4
History

Statistical testing agreed in the Geant4
Collaboration as a major objective for 2002
Initial ideas presented at Geant4 TSB meeting,
November 2001
Open brainstorming session at a Geant4-WG
workshop, 31 May 2002
Inception phase, summer 2002
Informal discussions with STT, Geant4
collaborators and interested potential developers
Initial collection of user requirements in Geant4
First version of software process deliverables
Vision, URD, Risk List
Presentation at Geant4 Workshop parallel
sessions, October 2002
http://www.ge.infn.it/geant4/talks/G4workshop/CERN/pia/tanda-2002.ppt

Launch of the project
5
The team
interested collaborators are welcome!

Development team
Pablo Cirrone, INFN Southern National Lab
Stefania Donadio, Univ. and INFN Genova
Susanna Guatelli, CERN/IT/API Technical Student
and INFN Genova
Alberto Lemut, Univ. and INFN Genova
Barbara Mascialino, Univ. and INFN Genova
Sandra Parlati, INFN Gran Sasso National Lab
Andreas Pfeiffer, CERN/IT/API
Maria Grazia Pia, INFN Genova
Geant4 system integration team
Gabriele Cosmo, CERN/IT/API - Geant4 Release
Manager
Sergei Sadilov, CERN/IT/API - Geant4 System
Testing Coordinator
Statistical consultancy
Paolo Viarengo, Univ. Genova, Statistician

requirements, suggestions, beta-testing by many
other Geant4 Collaborators (M. Maire, A. Ribon,
L. Urban et al.)
6
The vision
7
Vision: the basics

Have a vision for the project
An internal tool for Geant4 physics STT?
Also for Geant4 physics validation in the
experiments?
Other parties than Geant4 interested?

Clearly define scope, objectives

Who are the stakeholders?
Who are the users?
Who are the developers?

Clearly define roles

Rigorous software process

Software quality
Flexible, extensible, maintainable system

Build on a solid architecture

8
Scope of the project

The project will provide tools for statistical
testing of Geant4
physics comparisons and regression testing
multiple comparison algorithms
Generality (for application also in other areas)
should be pursued
facilitated by a component-based architecture
The statistical tools should be used in Geant4
(and in other frameworks)
tool to be used in testing frameworks
not a testing framework itself
Re-use existing tools whenever possible
no attempt to re-invent the wheel
but critical, scientific evaluation of candidate
tools

9
Architectural guidelines

The project adopts a solid architectural approach
to offer the functionality and the quality needed
by the users
to be maintainable over a large time scale
to be extensible, to accommodate future
evolutions of the requirements
Component-based approach
Geant4-specific components + general components
to facilitate re-use and integration in diverse
frameworks
AIDA
adopt a (HEP) standard
no dependence on any specific analysis tool
Python
The approach adopted is compatible with the
recommendations of the LCG Architecture
Blueprint RTAG

10
The reason why we are here

Core statistics comparison component + user layer
can be generalised
to wider scope than Geant4 only
This is the reason why we present the project to
LCG
to establish a scientific discussion on a topic
of common interest
to see if there are any interested users
to see if there are any interested collaborators
We would all benefit from a collaborative approach
to a common problem
share expertise, ideas, tools, resources

11
Software process guidelines

Significant experience in the team
in Geant4 and in other projects
Guidance from ISO 15504
standard!
USDP, specifically tailored to the project
practical guidance and tools from the RUP
both rigorous and lightweight
mapping onto ISO 15504
Open to use tools provided by the LCG Software
Process Infrastructure project

12
Who are the stakeholders?
13
Who are the users?

Other potential users
users of the Geant4 Toolkit, wishing to compare
the results of their applications to reference
data or to their own experimental results
other projects with requirements for statistical
comparisons of distributions
(e.g. the LHC Computing Grid project)

14
Some use cases

Regression testing
Throughout the software life-cycle
Online DAQ
Monitoring detector behaviour w.r.t. a reference
Simulation validation
Comparison with experimental data
Reconstruction
Comparison of reconstructed vs. expected
distributions
Physics analysis
Comparisons of experimental distributions (ATLAS
vs. CMS Higgs?)
Comparison with theoretical distributions (data
vs. Standard Model)

15
What do the users want?

User requirements from Geant4 (physics, system
testing) elicited, analysed, specified and
reviewed with the users
User Requirements Document
http://www.ge.infn.it/geant4/analysis/TandA/URD_TandA.html
Use case model in progress
Specific user requirements related to the core
statistical component
Detail in progress (URD in preparation)
Input from LCG?
Requirement traceability
Analysis/design, implementation, test,
documentation, results

16
Are there any constraints?

Geant4 constraint requirements
Based on AIDA
No concrete dependencies on specific AIDA
implementations should appear in the code of the
system tests
Available on Geant4 supported platforms
The system should not require additional licenses
beyond those required for Geant4 development
Other non-functional requirements?

17
The core statistical component
18
HBOOK, PAW & Co.
HBOOK manual, 1994
Based on considerations such as those given
above, as well as considerable computational
experience, it is generally believed that tests
like the Kolmogorov or Smirnov-Cramer-Von-Mises
(which is similar but more complicated to
calculate) are probably the most powerful for the
kinds of phenomena generally of interest to
high-energy physicists. The value of PROB
returned by HDIFF is calculated such that it will
be uniformly distributed between zero and one for
compatible histograms, provided the data are not
binned. The value of PROB should not be
expected to have exactly the correct distribution
for binned data.
but
CDF Collaboration, Inclusive jet cross section
in p pbar collisions at sqrt(s) = 1.8 TeV, Phys.
Rev. Lett. 77 (1996) 438
19
Goodness-of-fit tests

Pearson's χ² test
Kolmogorov test
Kolmogorov-Smirnov test
Lilliefors test
Cramer-von Mises test
Anderson-Darling test
Kuiper test

It is a difficult domain
Implementing algorithms is easy
But comparing real-life distributions is not easy
Incremental and iterative software process
Collaboration with statistics experts
Patience, humility, time
System open to extension and evolution
Suggestions welcome!
20
Pearson's χ²

Applies to discrete distributions
It can be useful also in case of continuous
distributions, but the data must be grouped into
classes
Cannot be applied if the theoretical (expected)
frequency in any class is < 5
When this happens, one can merge
contiguous classes until the minimum theoretical
frequency is reached
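
The merging rule described on this slide can be sketched in Python as follows. This is an illustrative sketch only, not the project's implementation: the function names `merge_small_classes` and `pearson_chi2` are hypothetical, and merging from left to right (folding any leftover tail into the last class) is just one possible strategy. The threshold of 5 is the one quoted on the slide.

```python
def merge_small_classes(observed, expected, min_expected=5.0):
    """Merge contiguous classes, left to right, until each merged class
    has an expected frequency of at least min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0.0, 0.0
    # A leftover tail that never reached the threshold is folded
    # into the last merged class (or kept alone if there is none).
    if acc_exp > 0.0 or acc_obs > 0.0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def pearson_chi2(observed, expected):
    """Pearson's chi-squared statistic: sum over classes of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


obs, exp = merge_small_classes([1, 3, 10, 12, 4, 2], [2, 4, 9, 11, 3, 3])
chi2 = pearson_chi2(obs, exp)
```

After merging, the number of degrees of freedom has to be counted from the merged classes, not from the original ones.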

21
Kolmogorov test

The simplest of the non-parametric tests
Tests the goodness of fit of a sample drawn from a
continuous random variable
Based on the maximum distance between the
empirical cumulative distribution function $F_O$ and the
theoretical one $F_T$
Test statistic
$D = \sup_x |F_O(x) - F_T(x)|$
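
As a minimal sketch of the statistic on this slide, the Python code below computes the maximum distance between the empirical distribution of an unbinned sample and a theoretical cumulative distribution function supplied by the caller. The function names `kolmogorov_d` and `kolmogorov_pvalue` are hypothetical. The p-value uses the standard asymptotic Kolmogorov series, which is only an approximation for small samples; the cut at 0.2 is an implementation choice made here because the series converges poorly near zero.

```python
import math


def kolmogorov_d(sample, cdf):
    """Maximum distance between the empirical CDF of `sample`
    and the theoretical CDF `cdf` (a callable)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f_t = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x: check both sides.
        d = max(d, (i + 1) / n - f_t, f_t - i / n)
    return d


def kolmogorov_pvalue(d, n, terms=100):
    """Asymptotic probability of observing a distance >= d
    when the sample really follows the theoretical distribution."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the alternating series converges poorly here; value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Example: three points tested against the uniform distribution on [0, 1].
d = kolmogorov_d([0.1, 0.4, 0.7], lambda x: x)
p = kolmogorov_pvalue(d, 3)
```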

22
Kolmogorov-Smirnov test

The two-sample problem
mathematically similar to Kolmogorov's
Instead of comparing an empirical distribution
with a theoretical one, find the maximum
difference between the empirical distributions of the two
samples, $F_n$ and $G_m$
$D_{mn} = \sup_x |F_n(x) - G_m(x)|$
Can be applied only to continuous random
variables
Conover (1971) and Gibbons and Chakraborti (1992)
tried to extend it to cases of discrete random
variables
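
The two-sample distance on this slide can be sketched as a single merge pass over the two sorted samples. The function name `ks_two_sample_d` is hypothetical, and the code assumes unbinned samples from continuous variables, as the slide requires; ties are handled by advancing both empirical distributions past the tied value before comparing them.

```python
def ks_two_sample_d(a, b):
    """Maximum distance between the empirical CDFs of two unbinned samples."""
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(xs[i], ys[j])
        # Advance both empirical CDFs past every value <= x.
        while i < n and xs[i] <= x:
            i += 1
        while j < m and ys[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Once one sample is exhausted its CDF is 1 and the other can only
    # approach it, so the maximum has already been seen.
    return d


interleaved = ks_two_sample_d([1, 3, 5], [2, 4, 6])   # 1/3
separated = ks_two_sample_d([1, 2, 3], [4, 5, 6])     # 1.0
```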

23
Lilliefors test

Similar to Kolmogorov test
Based on the null hypothesis that the
continuous random variable is normally distributed,
$N(\mu, \sigma^2)$, with $\mu$ and $\sigma^2$ unknown
Performed by comparing the empirical cumulative distribution
function $F_O$ of the standardized sample $(z_1, z_2, ..., z_n)$ with that of the
standardized normal distribution, $\Phi(z)$
$D = \sup_z |F_O(z) - \Phi(z)|$
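
A minimal sketch of the Lilliefors distance: the sample is standardized with its own mean and standard deviation, then compared with the standard normal cumulative distribution. The names `normal_cdf` and `lilliefors_d` are hypothetical, and the use of the n-1 variance estimator is an assumption of this sketch. Because the parameters are estimated from the data, the resulting distance must be judged against Lilliefors critical values (tabulated or simulated), not against the ordinary Kolmogorov ones; those values are not reproduced here.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lilliefors_d(sample):
    """Kolmogorov-type distance between the standardized sample
    and the standard normal distribution."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    zs = sorted((x - mean) / sd for x in sample)
    d = 0.0
    for i, z in enumerate(zs):
        phi = normal_cdf(z)
        d = max(d, (i + 1) / n - phi, phi - i / n)
    return d


data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8]
d = lilliefors_d(data)
# Standardization makes the distance independent of location and scale.
d_rescaled = lilliefors_d([3.0 * x + 7.0 for x in data])
```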

24
Cramer-von Mises test

Based on the test statistic
$\omega^2 = \int [F_O(x) - F_T(x)]^2 \, dF_T(x)$
Can be performed both on continuous and discrete
variables
Satisfactory for symmetric and right-skewed
distributions
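
For an unbinned sample and a continuous theoretical distribution, the integral on this slide is usually evaluated with a closed-form sum over the sorted sample. The sketch below uses the standard computing formula, which returns n times the integral; the function name `cramer_von_mises_t` is hypothetical, and the discrete case mentioned on the slide is not covered.

```python
def cramer_von_mises_t(sample, cdf):
    """n * omega^2 for an unbinned sample against the theoretical CDF `cdf`,
    using T = 1/(12 n) + sum_i [F_T(x_i) - (2 i - 1) / (2 n)]^2."""
    xs = sorted(sample)
    n = len(xs)
    # enumerate is 0-based, so (2 * i + 1) plays the role of (2 i - 1).
    return 1.0 / (12.0 * n) + sum(
        (cdf(x) - (2 * i + 1) / (2.0 * n)) ** 2 for i, x in enumerate(xs)
    )


# Evenly spaced points are the best possible match to a uniform
# distribution: the sum vanishes and only the 1/(12 n) term remains.
t = cramer_von_mises_t([0.125, 0.375, 0.625, 0.875], lambda x: x)
```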

25
Anderson-Darling test

Based on the test statistic
$A^2 = \int \frac{[F_O(x) - F_T(x)]^2}{F_T(x)\,[1 - F_T(x)]} \, dF_T(x)$
Can be performed both on continuous and discrete
variables
Seems to be suitable to any data-set (Aksenov and
Savageau - 2002) with any skewness (symmetric
distributions, left or right skewed)
Seems to be sensitive to fat tails of
distributions
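
The weight 1 / (F_T (1 - F_T)) in the statistic on this slide is what emphasizes the tails. For an unbinned sample and a continuous theoretical distribution, the integral has a standard closed form over the sorted sample, sketched below; it returns n times the integral. The function name `anderson_darling_a2` is hypothetical, and the sketch assumes every theoretical CDF value lies strictly between 0 and 1 so that the logarithms are defined.

```python
import math


def anderson_darling_a2(sample, cdf):
    """A^2 = -n - (1/n) * sum_i (2 i - 1) [ln u_i + ln(1 - u_{n+1-i})],
    where u_i are the sorted theoretical CDF values of the sample."""
    u = sorted(cdf(x) for x in sample)
    n = len(u)
    # enumerate-style 0-based index: (2 * i + 1) plays the role of (2 i - 1).
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
            for i in range(n))
    return -n - s / n


# Same central points, but one sample has a point far out in the tail.
central = anderson_darling_a2([0.2, 0.4, 0.6, 0.8], lambda x: x)
with_tail = anderson_darling_a2([0.2, 0.4, 0.6, 0.9999], lambda x: x)
```

In this toy comparison `with_tail` comes out larger than `central`, which is the tail sensitivity the slide refers to.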

26
Kuiper test

Based on a quantity that remains invariant for
any shift or re-parameterization
Does not work well on tails
$D = \max_x [F_O(x) - F_T(x)] + \max_x [F_T(x) - F_O(x)]$
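
A minimal sketch of the Kuiper statistic as the sum of the largest positive and the largest negative deviation between the empirical and theoretical distributions. The function name `kuiper_v` is hypothetical. The example illustrates the shift invariance on data wrapped onto the unit interval; wrapping the data is an assumption made purely for the demonstration.

```python
def kuiper_v(sample, cdf):
    """Sum of the maximum deviations above and below the theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return d_plus + d_minus


def uniform_cdf(x):
    return x


original = [0.1, 0.4, 0.7]
# Shift every point by 0.5 and wrap around the unit interval.
shifted = [(x + 0.5) % 1.0 for x in original]

v_original = kuiper_v(original, uniform_cdf)
v_shifted = kuiper_v(shifted, uniform_cdf)
```

Both values agree up to rounding, whereas the individual one-sided deviations change with the shift.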

27
Work in progress
28
OOAD

Preliminary design of the statistical component
in progress
Core statistics comparison package
User layer
Policy-based class design
http://www.ge.infn.it/geant4/rose/statistics/
Validation of the design through use cases
Some open issues identified, to be addressed in
next design iteration
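
The split between a core comparison package and a user layer, with the algorithm supplied as a policy, can be illustrated with a short Python analogue. Policy-based design is normally a C++ template idiom; here the policy is simply a callable passed to the engine. All names (`ComparisonEngine`, `max_ecdf_distance`, `mean_difference`) and the plain threshold on the distance are hypothetical and are not taken from the project's design documents.

```python
from typing import Callable, Sequence

# A policy is any callable that maps two samples to a distance.
Algorithm = Callable[[Sequence[float], Sequence[float]], float]


def max_ecdf_distance(a, b):
    """Policy 1: maximum distance between the two empirical CDFs."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


def mean_difference(a, b):
    """Policy 2: absolute difference of the sample means."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


class ComparisonEngine:
    """User-layer object: holds a policy and a threshold, knows nothing
    about how the distance itself is computed."""

    def __init__(self, algorithm: Algorithm, threshold: float):
        self.algorithm = algorithm
        self.threshold = threshold

    def compatible(self, a, b) -> bool:
        return self.algorithm(a, b) <= self.threshold


engine = ComparisonEngine(max_ecdf_distance, threshold=0.5)
same_shape = engine.compatible([1, 3, 5], [2, 4, 6])
different = engine.compatible([1, 2, 3], [4, 5, 6])
```

Swapping `max_ecdf_distance` for `mean_difference` changes the algorithm without touching the engine, which is the kind of extensibility the slide asks for.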

29
more algorithms
work in progress
30
work in progress
31
Use case: compare two continuous distributions
work in progress
32
Work in progress

Implementation and test of preliminary design
What can be re-used?
Algorithms in GSL, NAG libraries (to be
evaluated)
Studies in progress
Transformation between continuous-discrete
distributions
Strategies to use Kolmogorov-Smirnov with
discrete distributions (E. Dagum original
ideas)
How to deal with experimental errors (not only
statistical!)
Multi-dimensional distributions
Bayesian approach
In the to-do list
Conversion from AIDA objects to distributions
Pythonisation
Revision of the initial documents (Vision, URD,
Risks)
Based on the recent evolutions in the project
Input from today's meeting?

33
Work in progress: Geant4-specific

Development of general physics tests in the E.M.
domain, for comparison of reference distributions
Compilation of existing tests
Evaluation, documentation of tests
Elicitation of requirements for tests among the
Geant4 physics groups
Collection of reference data/distributions
Prototype for automated comparison w.r.t.
reference databases
NIST, Sandia etc., directly downloaded from the
web
Prototype as a risk mitigation strategy
Integration in the Geant4 system testing
framework
Integration in Geant4 physics testing frameworks

34
Where?

Geant4-specific stuff
In Geant4
May be included in public distribution, if of
interest to users
Core statistical component
Developed in an independent CVS repository
Code, documentation, software process
deliverables
Web site
http://www.ge.infn.it/geant4/analysis/TandA/index.html
Contact persons
Andreas.Pfeiffer@cern.ch, Maria.Grazia.Pia@cern.ch

35
Time scale

Aggressive time scale driven by Geant4 needs
incremental and iterative software process
OOAD implementation already started
Prototype at CHEP
Advanced functional system: summer 2003
Open to the needs/suggestions of LCG
compatible with the available resources and
Geant4 needs

36
Conclusions

Geant4 requires a statistical testing system for
physics validation and regression testing
to provide a high quality product to its user
communities
Core statistical component (of potential general
interest)
Geant4-specific components
Project compatible with LCG architecture
blueprint
component-based approach, AIDA, Python
Rigorous software process
to contribute to the quality of the product
Aggressive time scale dictated by Geant4 needs
Open to scientific collaboration
