Statistical Testing Project

Maria Grazia Pia, INFN Genova, on behalf of the Statistical Testing Team. LCG-Application Meeting, CERN, 27 November 2002. http://www.ge.infn.it/geant4/analysis/TandA

1
Statistical Testing Project
  • Maria Grazia Pia, INFN Genova
  • on behalf of the Statistical Testing Team

LCG-Application Meeting, CERN, 27 November 2002
http://www.ge.infn.it/geant4/analysis/TandA
2
History and background
3
What is it?
A project to develop a statistical analysis
system, to be used in Geant4 testing:
physics validation, regression testing, system
testing
Main application areas in Geant4
  • Provide tools for the statistical comparison of
    distributions
  • equivalent reference distributions (for
    instance, regression testing)
  • experimental measurements
  • data from reference sources
  • functions deriving from theoretical calculations
    or from fits

4
History
  • Statistical testing agreed in the Geant4
    Collaboration as a major objective for 2002
  • Initial ideas presented at Geant4 TSB meeting,
    November 2001
  • Open brainstorming session at a Geant4-WG
    workshop, 31 May 2002
  • Inception phase, summer 2002
  • Informal discussions with STT, Geant4
    collaborators and interested potential developers
  • Initial collection of user requirements in Geant4
  • First version of software process deliverables:
    Vision, URD, Risk List
  • Presentation at Geant4 Workshop parallel
    sessions, October 2002
  • http://www.ge.infn.it/geant4/talks/G4workshop/CERN/pia/tanda-2002.ppt

Launch of the project
5
The team
interested collaborators are welcome!
  • Development team
  • Pablo Cirrone, INFN Southern National Lab
  • Stefania Donadio, Univ. and INFN Genova
  • Susanna Guatelli, CERN/IT/API Technical Student
    and INFN Genova
  • Alberto Lemut, Univ. and INFN Genova
  • Barbara Mascialino, Univ. and INFN Genova
  • Sandra Parlati, INFN Gran Sasso National Lab
  • Andreas Pfeiffer, CERN/IT/API
  • Maria Grazia Pia, INFN Genova
  • Geant4 system integration team
  • Gabriele Cosmo, CERN/IT/API - Geant4 Release
    Manager
  • Sergei Sadilov, CERN/IT/API - Geant4 System
    Testing Coordinator
  • Statistical consultancy
  • Paolo Viarengo, Univ. Genova, Statistician

Requirements, suggestions, β-testing by many
other Geant4 Collaborators (M. Maire, A. Ribon,
L. Urban et al.)
6
The vision
7
Vision: the basics
  • Have a vision for the project
  • An internal tool for Geant4 physics STT?
  • Also for Geant4 physics validation in the
    experiments?
  • Other parties than Geant4 interested?

Clearly define scope, objectives
  • Who are the stakeholders?
  • Who are the users?
  • Who are the developers?

Clearly define roles
  • Rigorous software process

Software quality
Flexible, extensible, maintainable system
  • Build on a solid architecture

8
Scope of the project
  • The project will provide tools for statistical
    testing of Geant4
  • physics comparisons and regression testing
  • multiple comparison algorithms
  • Generality (for application also in other areas)
    should be pursued
  • facilitated by a component-based architecture
  • The statistical tools should be used in Geant4
    (and in other frameworks)
  • tool to be used in testing frameworks
  • not a testing framework itself
  • Re-use existing tools whenever possible
  • no attempt to re-invent the wheel
  • but critical, scientific evaluation of candidate
    tools

9
Architectural guidelines
  • The project adopts a solid architectural approach
  • to offer the functionality and the quality needed
    by the users
  • to be maintainable over a large time scale
  • to be extensible, to accommodate future
    evolutions of the requirements
  • Component-based approach
  • Geant4-specific components + general components
  • to facilitate re-use and integration in diverse
    frameworks
  • AIDA
  • adopt a (HEP) standard
  • no dependence on any specific analysis tool
  • Python
  • The approach adopted is compatible with the
    recommendations of the LCG Architecture
    Blueprint RTAG

10
The reason why we are here
  • The core statistics comparison component and
    the user layer can be generalised
  • to wider scope than Geant4 only
  • This is the reason why we present the project to
    LCG
  • to establish a scientific discussion on a topic
    of common interest
  • to see if there are any interested users
  • to see if there are any interested collaborators
  • We would all benefit from a collaborative
    approach to a common problem
  • share expertise, ideas, tools, resources

11
Software process guidelines
  • Significant experience in the team
  • in Geant4 and in other projects
  • Guidance from ISO 15504
  • standard!
  • USDP, specifically tailored to the project
  • practical guidance and tools from the RUP
  • both rigorous and lightweight
  • mapping onto ISO 15504
  • Open to use tools provided by the LCG Software
    Process Infrastructure project

12
Who are the stakeholders?
13
Who are the users?
  • Other potential users
  • users of the Geant4 Toolkit, wishing to compare
    the results of their applications to reference
    data or to their own experimental results
  • other projects with requirements for statistical
    comparisons of distributions
  • (e.g. the LHC Computing Grid project)

14
Some use cases
  • Regression testing
  • Throughout the software life-cycle
  • Online DAQ
  • Monitoring detector behaviour w.r.t. a reference
  • Simulation validation
  • Comparison with experimental data
  • Reconstruction
  • Comparison of reconstructed vs. expected
    distributions
  • Physics analysis
  • Comparisons of experimental distributions (ATLAS
    vs. CMS Higgs?)
  • Comparison with theoretical distributions (data
    vs. Standard Model)

15
What do the users want?
  • User requirements from Geant4 (physics, system
    testing) elicited, analysed, specified and
    reviewed with the users
  • User Requirements Document
  • http://www.ge.infn.it/geant4/analysis/TandA/URD_TandA.html
  • Use case model in progress
  • Specific user requirements related to the core
    statistical component
  • Detail in progress (URD in preparation)
  • Input from LCG?
  • Requirement traceability
  • Analysis/design, implementation, test,
    documentation, results

16
Are there any constraints?
  • Geant4 constraint requirements
  • Based on AIDA
  • No concrete dependencies on specific AIDA
    implementations should appear in the code of the
    system tests
  • Available on Geant4 supported platforms
  • The system should not require additional licenses
    w.r.t. what is required for Geant4 development
  • Other non-functional requirements?

17
The core statistical component
18
HBOOK, PAW & Co.
HBOOK manual, 1994
Based on considerations such as those given
above, as well as considerable computational
experience, it is generally believed that tests
like the Kolmogorov or Smirnov-Cramer-Von-Mises
(which is similar but more complicated to
calculate) are probably the most powerful for the
kinds of phenomena generally of interest to
high-energy physicists. The value of PROB
returned by HDIFF is calculated such that it will
be uniformly distributed between zero and one for
compatible histograms, provided the data are not
binned. The value of PROB should not be
expected to have exactly the correct distribution
for binned data.
but
CDF Collaboration, Inclusive jet cross section
in p pbar collisions at sqrt(s) = 1.8 TeV, Phys.
Rev. Lett. 77 (1996) 438
19
Goodness-of-fit tests
  • Pearson's $\chi^2$ test
  • Kolmogorov test
  • Kolmogorov-Smirnov test
  • Lilliefors test
  • Cramer-von Mises test
  • Anderson-Darling test
  • Kuiper test

It is a difficult domain
Implementing algorithms is easy
But comparing real-life distributions is not easy
Incremental and iterative software process
Collaboration with statistics experts
Patience, humility, time
System open to extension and evolution
Suggestions welcome!
20
Pearson's $\chi^2$
  • Applies to discrete distributions
  • It can also be useful for continuous
    distributions, but the data must be grouped into
    classes
  • Cannot be applied if the theoretical (expected)
    frequency in any class is < 5
  • When that happens, one could try to merge
    contiguous classes until the minimum theoretical
    frequency is reached
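
The class-merging idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `merge_sparse_classes` and `pearson_chi2` are hypothetical and not part of the project, and the greedy left-to-right merging strategy is one possible choice among several.

```python
def merge_sparse_classes(observed, expected, min_expected=5.0):
    """Greedily merge contiguous classes, left to right, until each
    merged class has an expected frequency of at least min_expected.
    (Hypothetical helper; the merging strategy is an assumption.)"""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0.0, 0.0
    # A trailing group that never reached the threshold is folded
    # into the last complete class (or kept alone if there is none).
    if acc_exp > 0.0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def pearson_chi2(observed, expected):
    """Pearson's chi-squared statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


obs, exp = merge_sparse_classes([1, 3, 10, 12, 4, 2],
                                [2.0, 4.0, 9.0, 11.0, 3.0, 3.0])
chi2 = pearson_chi2(obs, exp)
```

Merging reduces the number of classes, so the number of degrees of freedom used to convert the statistic into a probability has to be computed from the merged classes, not the original ones.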

21
Kolmogorov test
  • The simplest of the non-parametric tests
  • Tests whether a sample is compatible with a
    given continuous random variable
  • Based on the computation of the maximum distance
    between the empirical distribution function $F_O$
    and the theoretical one $F_T$
  • Test statistic
  • $D = \sup_x |F_O(x) - F_T(x)|$
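
As an illustration of the maximum-distance statistic, here is a minimal Python sketch for an unbinned sample. The function name `kolmogorov_statistic` is hypothetical, and the theoretical cumulative distribution is assumed to be supplied by the caller as a Python callable.

```python
def kolmogorov_statistic(sample, cdf):
    """Maximum distance between the empirical distribution function
    of `sample` and the theoretical cumulative distribution `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical distribution jumps from i/n to (i+1)/n at x,
        # so both sides of the step have to be checked.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Example: three points tested against a uniform distribution on [0, 1].
d_uniform = kolmogorov_statistic([0.1, 0.4, 0.7], lambda x: x)
```

The sketch only computes the distance; turning it into a probability requires the Kolmogorov distribution, which is not shown here.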

22
Kolmogorov-Smirnov test
  • The two-sample problem
  • mathematically similar to Kolmogorov's
  • Instead of comparing an empirical distribution
    with a theoretical one, find the maximum
    difference between the empirical distributions
    of the two samples, $F_n$ and $G_m$
  • $D_{mn} = \sup_x |F_n(x) - G_m(x)|$
  • Can be applied only to continuous random
    variables
  • Conover (1971) and Gibbons and Chakraborti (1992)
    tried to extend it to cases of discrete random
    variables
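
The two-sample distance can be sketched as follows for unbinned data. The function name `ks_two_sample_statistic` is hypothetical; the sketch evaluates both empirical distribution functions at every observed point, which is simple rather than optimal.

```python
import bisect


def ks_two_sample_statistic(sample_a, sample_b):
    """Maximum distance between the empirical distribution functions
    of two samples of sizes n and m."""
    xs, ys = sorted(sample_a), sorted(sample_b)
    n, m = len(xs), len(ys)
    d = 0.0
    # The supremum is reached at one of the observed points.
    for x in xs + ys:
        fn = bisect.bisect_right(xs, x) / n  # fraction of sample_a <= x
        gm = bisect.bisect_right(ys, x) / m  # fraction of sample_b <= x
        d = max(d, abs(fn - gm))
    return d


d_disjoint = ks_two_sample_statistic([1, 2, 3], [4, 5, 6])
d_interleaved = ks_two_sample_statistic([1, 3], [2, 4])
```

Two samples with no overlap give the maximum possible distance of 1, while identical samples give 0.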

23
Lilliefors test
  • Similar to the Kolmogorov test
  • Based on the null hypothesis that the continuous
    random variable is normally distributed,
    $N(\mu, \sigma^2)$, with $\mu$ and $\sigma^2$
    unknown
  • Performed by comparing the empirical
    distribution function $F_O(z)$ of the
    standardized sample $z_1, z_2, \ldots, z_n$ with
    the cumulative distribution function $\Phi(z)$
    of the standard normal distribution
  • $D = \sup_z |F_O(z) - \Phi(z)|$
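
A minimal Python sketch of the statistic described above, assuming the mean and variance are estimated from the sample itself (with the usual n - 1 denominator for the variance). The function name `lilliefors_statistic` is hypothetical.

```python
import math


def lilliefors_statistic(sample):
    """Kolmogorov-type distance between the standardized sample and
    the standard normal distribution, with mean and variance
    estimated from the data."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    sd = math.sqrt(var)
    zs = sorted((x - mean) / sd for x in sample)
    d = 0.0
    for i, z in enumerate(zs):
        # Standard normal cumulative distribution via the error function.
        phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        d = max(d, (i + 1) / n - phi, phi - i / n)
    return d


data = [1.0, 2.0, 4.0, 7.0, 11.0]
d_raw = lilliefors_statistic(data)
d_shifted = lilliefors_statistic([10.0 + 3.0 * x for x in data])
```

Because the parameters are fitted to the same data, the statistic does not change under a shift or rescaling of the sample, and its critical values differ from those of the plain Kolmogorov test; dedicated Lilliefors tables are needed to obtain a probability.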

24
Cramer-von Mises test
  • Based on the test statistic
  • $\omega^2 = \int [F_O(x) - F_T(x)]^2 \, dF_T(x)$
  • Can be performed both on continuous and discrete
    variables
  • Satisfactory for symmetric and right-skewed
    distributions
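
For an unbinned sample the integral is usually evaluated through an equivalent sum over the ordered data. The sketch below uses that textbook computing formula and returns the statistic multiplied by the sample size n, which is the form normally tabulated; the function name `cramer_von_mises_statistic` is hypothetical.

```python
def cramer_von_mises_statistic(sample, cdf):
    """n times the Cramer-von Mises integral, computed as
    1/(12 n) + sum_i ((2 i - 1)/(2 n) - F(x_(i)))^2 over the
    ordered sample, with i counted from 1."""
    xs = sorted(sample)
    n = len(xs)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs):
        # enumerate counts from 0, so (2 i - 1) becomes (2 i + 1).
        total += ((2 * i + 1) / (2.0 * n) - cdf(x)) ** 2
    return total


# Four points placed exactly at the mid-quantiles of a uniform
# distribution on [0, 1] give the minimum value 1/(12 n).
t_min = cramer_von_mises_statistic([0.125, 0.375, 0.625, 0.875],
                                   lambda x: x)
```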

25
Anderson-Darling test
  • Based on the test statistic
  • $A^2 = \int \frac{[F_O(x) - F_T(x)]^2}{F_T(x)\,[1 - F_T(x)]} \, dF_T(x)$
  • Can be performed both on continuous and discrete
    variables
  • Seems to be suitable to any data-set (Aksenov and
    Savageau - 2002) with any skewness (symmetric
    distributions, left or right skewed)
  • Seems to be sensitive to fat tails of
    distributions
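
A minimal Python sketch for an unbinned sample, using the textbook computing formula for the weighted integral multiplied by the sample size n. The function name `anderson_darling_statistic` is hypothetical, and the theoretical cumulative distribution is assumed to return values strictly between 0 and 1 at every data point.

```python
import math


def anderson_darling_statistic(sample, cdf):
    """n times the Anderson-Darling integral, computed as
    -n - (1/n) sum_i (2 i - 1) [ln F(x_(i)) + ln(1 - F(x_(n+1-i)))]
    over the ordered sample, with i counted from 1."""
    xs = sorted(sample)
    n = len(xs)
    # cdf values of exactly 0 or 1 would make the logarithms diverge.
    f = [cdf(x) for x in xs]
    total = 0.0
    for i in range(n):
        # i counts from 0 here, so (2 i - 1) becomes (2 i + 1) and
        # x_(n+1-i) becomes f[n - 1 - i].
        total += (2 * i + 1) * (math.log(f[i]) + math.log(1.0 - f[n - 1 - i]))
    return -n - total / n


a2_single = anderson_darling_statistic([0.5], lambda x: x)
```

The weight in the denominator grows near the ends of the distribution, which is why the statistic reacts more strongly to discrepancies in the tails than the unweighted Cramer-von Mises form.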

26
Kuiper test
  • Based on a quantity that remains invariant for
    any shift or re-parameterization
  • Does not work well on tails
  • $D = \max_x [F_O(x) - F_T(x)] + \max_x [F_T(x) - F_O(x)]$
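
The sum of the largest positive and largest negative deviations can be sketched as follows for an unbinned sample. The function name `kuiper_statistic` is hypothetical, and the theoretical cumulative distribution is assumed to be supplied as a callable.

```python
def kuiper_statistic(sample, cdf):
    """Sum of the largest deviation of the empirical distribution
    above the theoretical one and the largest deviation below it."""
    xs = sorted(sample)
    n = len(xs)
    d_plus = 0.0   # empirical above theoretical
    d_minus = 0.0  # theoretical above empirical
    for i, x in enumerate(xs):
        f = cdf(x)
        d_plus = max(d_plus, (i + 1) / n - f)
        d_minus = max(d_minus, f - i / n)
    return d_plus + d_minus


v_uniform = kuiper_statistic([0.1, 0.4, 0.7], lambda x: x)
```

For the same three points the Kolmogorov distance would be 0.3, the larger of the two one-sided deviations; the Kuiper statistic adds the smaller one (0.1) to it.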

27
Work in progress
28
OOAD
  • Preliminary design of the statistical component
    in progress
  • Core statistics comparison package
  • User layer
  • Policy-based class design
  • http://www.ge.infn.it/geant4/rose/statistics/
  • Validation of the design through use cases
  • Some open issues identified, to be addressed in
    next design iteration

29
more algorithms
work in progress
30
work in progress
31
Use case: compare two continuous distributions
work in progress
32
Work in progress
  • Implementation and test of preliminary design
  • What can be re-used?
  • Algorithms in GSL, NAG libraries (to be
    evaluated)
  • Studies in progress
  • Transformation between continuous-discrete
    distributions
  • Strategies to use Kolmogorov-Smirnov with
    discrete distributions (E. Dagum original
    ideas)
  • How to deal with experimental errors (not only
    statistical!)
  • Multi-dimensional distributions
  • Bayesian approach
  • In the to-do list
  • Conversion from AIDA objects to distributions
  • Pythonisation
  • Revision of the initial documents (Vision, URD,
    Risks)
  • Based on the recent evolutions in the project
  • Input from today's meeting?

33
Work in progress: Geant4-specific
  • Development of general physics tests in the E.M.
    domain, for comparison of reference distributions
  • Compilation of existing tests
  • Evaluation, documentation of tests
  • Elicitation of requirements for tests among the
    Geant4 physics groups
  • Collection of reference data/distributions
  • Prototype for automated comparison w.r.t.
    reference databases
  • NIST, Sandia etc., directly downloaded from the
    web
  • Prototype as a risk mitigation strategy
  • Integration in the Geant4 system testing
    framework
  • Integration in Geant4 physics testing frameworks

34
Where?
  • Geant4-specific stuff
  • In Geant4
  • May be included in public distribution, if of
    interest to users
  • Core statistical component
  • Developed in an independent CVS repository
  • Code, documentation, software process
    deliverables
  • Web site
  • http://www.ge.infn.it/geant4/analysis/TandA/index.html
  • Contact persons
  • Andreas.Pfeiffer_at_cern.ch,
    Maria.Grazia.Pia_at_cern.ch

35
Time scale
  • Aggressive time scale driven by Geant4 needs
  • incremental and iterative software process
  • OOAD implementation already started
  • Prototype at CHEP
  • Advanced functional system: summer 2003
  • Open to the needs/suggestions of LCG
  • compatible with the available resources and
    Geant4 needs

36
Conclusions
  • Geant4 requires a statistical testing system for
    physics validation and regression testing
  • to provide a high quality product to its user
    communities
  • Core statistical component (of potential general
    interest)
  • Geant4-specific components
  • Project compatible with LCG architecture
    blueprint
  • component-based approach, AIDA, Python
  • Rigorous software process
  • to contribute to the quality of the product
  • Aggressive time scale dictated by Geant4 needs
  • Open to scientific collaboration
