Analysis Software Strategy

Description:

Title: API Group Presentation Author: Juergen Knobloch Last modified by: Francesco Forti Created Date: 11/16/2000 11:03:47 AM Document presentation format – PowerPoint PPT presentation

1
Analysis Software Strategy
AIDA ANAPHE LIZARD
  • Jürgen Knobloch
  • HTASC, DESY, 9 October 2001

2
Anaphe Lizard - AIDA
  • Anaphe - Libraries for HEP Computing
  • Full replacement of CERNLIB
  • Open source, mostly license-free
  • Lizard is based on Anaphe components and the
    Python scripting language (through SWIG)
  • PAW functionality
  • young, but has a very solid base in the mature Anaphe
  • real plug-in structure
  • AIDA project (Abstract Interfaces for Data
    Analysis)
  • Interface definitions (classes) for C++ and Java

3
ANAPHE Components
  • Lizard: Commander
  • User Interface - using Abstract Types
  • Abstract types: Histograms, NTuples, Fitting, Plotting,
    VectorOfPoints, Functions, Analyzer
  • AIDA (Abstract Interfaces for Data Analysis)
  • Implementations (HEP-specific): HTL, Tags (HepODBMS),
    Gemini/HepFitting, Qplotter, VectorOfPoints
  • non-HEP components: Python, SWIG, Objectivity/DB, NAG-C, Qt
  • CLHEP: Class Libraries for HEP

4
ANAPHE - History
  • LHC++: Libraries for HEP Computing
  • 1995: project initiated with the aim to develop a
    modular replacement of CERNLIB (lib and tools)
  • using standards (STL, OpenGL, ...) and commercial
    components (e.g., NAG_C, OpenInventor,
    IRIS Explorer)
  • First iteration on physics data analysis tool
  • data-driven approach (based on IRIS Explorer)
  • GUI-based, not command-line driven
  • Request to create new physics analysis tool
  • September 99: new requirements defined together
    with experiments
  • identified categories/components and Abstract
    Types

5
AIDA - History
  • started in Sept. 1999 (HepVis 99, Orsay)
  • several (mini-) workshops since then
  • main ones: Paris 2000 and Boston 2001
  • release 1.0: summer 2000
  • concentrated on developers' view
  • Histogram package only
  • IAxis, IHistogram, IHistogram1D, IHistogram2D,
    IHistogramFactory
  • release 2.0: May 2001 (Boston release)
  • about 20 Interface classes
  • aiming at discussion and gathering feedback

6
History and Status - Lizard
  • Started after CHEP-2000
  • Full version out since June 2001
  • PAW-like analysis functionality, plus:
  • on-demand loading of compiled code using shared
    libraries
  • gives full access to experiments' analysis code
    and data
  • based on Abstract Interfaces
  • flexible and extensible
  • License-free version available

7
USDP-like approach
  • Start with OO analysis
  • collection of user requirements
  • OO design phase
  • define categories and classes
  • find patterns
  • Create prototype
  • considered throw away
  • get feedback from users
  • Iterate, iterate, iterate ...

8
User requirements
  • Ease of use (like PAW)
  • Foresee customization/integration
  • e.g., use persistency/messaging/... from
    experiment
  • Framework used will not be exposed/imposed
  • needs to be compatible with the experiments'
    frameworks
  • Plan for extensions
  • code for now, design for the future
  • Maximize flexibility/interoperability

9
Architecture Overview
  • Maximize flexibility and re-use
  • Abstract Interfaces allow each component to
    develop independently
  • not bound to a specific implementation of any
    component (plugin style)
  • re-use of existing packages to implement
    components reduces start-up time significantly
  • Identify and use patterns - avoid anti-patterns
  • learn from other people's experiences/failures

10
Ignominy: Tool to Quantify Modularity
  • NCCD (spaghetti index)
  • ≈ 1.0: good toolkit
  • < 1.0: independent packages
  • > 1.0: strongly coupled

[Figure: NCCD values for ATLAS, ROOT, ORCA (CMS), GEANT4, COBRA (CMS), Anaphe / Lizard and IGUANA; the plot carries the note "Includes Fortran". See 8-024.]
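
For reference, a minimal sketch of how an NCCD value can be computed, assuming the definition from Lakos's "Large-Scale C++ Software Design": the cumulative component dependency (CCD) sums, over all components, the number of components each one needs (itself included), and NCCD divides that sum by the CCD of a balanced binary tree with the same number of components. The three package graphs are hypothetical, and whether Ignominy computes the index in exactly this way is an assumption.

```python
import math


def ccd(deps):
    """Cumulative component dependency.

    deps maps every component to the components it depends on directly.
    For each component, count itself plus its transitive dependencies,
    then sum those counts.
    """
    def closure(start):
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(deps.get(node, ()))
        return seen

    return sum(len(closure(component)) for component in deps)


def nccd(deps):
    """CCD normalized to a balanced binary tree of the same size."""
    n = len(deps)
    balanced_tree_ccd = (n + 1) * math.log2(n + 1) - n
    return ccd(deps) / balanced_tree_ccd


# Hypothetical three-package systems (not taken from the slide's plot).
independent = {"a": [], "b": [], "c": []}
layered = {"app": ["hist", "fit"], "hist": [], "fit": []}
cyclic = {"a": ["b"], "b": ["c"], "c": ["a"]}

for name, graph in [("independent", independent),
                    ("layered", layered),
                    ("cyclic", cyclic)]:
    print(name, round(nccd(graph), 2))
```

The independent packages score below 1.0, the cleanly layered system scores 1.0, and the cyclic system scores above 1.0, matching the three ranges listed on the slide.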
11
CLHEP
  • HEP foundation class library
  • Random number generators
  • Physics vectors
  • 3- and 4-vectors
  • Geometry
  • Linear algebra
  • System of units
  • more packages recently added
  • will continue to evolve
  • wwwinfo.cern.ch/asd/lhc/clhep/

12
2D Graphics libraries
  • Qt
  • multi-platform C++ GUI toolkit
  • C++ class library, not a wrapper around C libs
  • superset of Motif and MFC
  • available on Unix and MS Windows
  • no change for developer
  • commercial but with public domain version
  • www.troll.no
  • Qplotter
  • add-on functionality for HEP
  • HIGZ/HPLOT

13
Basic 3D Graphic Libraries
  • OpenGL (basic graphics)
  • De-facto industry standard for basic 3D graphics
  • Used in CAD/CAE, games, VR, medical imaging
  • OpenInventor (scene mgmt.)
  • OO 3D toolkit for graphics
  • Cubes, polygons, text, materials
  • Cameras, lights, picking
  • 3D viewers/editors, animation
  • Based on OpenGL/MesaGL

14
Mathematical Libraries
  • NAG (Numerical Algorithms Group) C Library
  • Covers a broad range of functionality
  • Linear algebra
  • differential equations
  • quadrature, etc.
  • Special functions of CERNLIB added to Mark-6
    release
  • mostly for theory and accelerator
  • Quality assurance
  • extensive testing done by NAG
  • www.nag.com

15
Histograms: the HTL package
  • Histograms are the basic tool for physics
    analysis
  • Statistical information of density distributions
  • Histogram Template Library (HTL)
  • design based on C++ templates
  • Modular: separation between sampling and
    display
  • Extensible: open for user-defined binning
    systems
  • Flexible: supports transient and persistent
    histograms at the same time
  • Open: large use of abstract interfaces
  • recent addition: 3D histograms
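
To make "separation between sampling and display" and "user-defined binning" concrete, here is a minimal sketch in Python (HTL itself is a C++ template library). All class and function names below are hypothetical illustrations, not the HTL API.

```python
from bisect import bisect_right


class EvenBinning:
    """Equal-width bins on [low, high)."""

    def __init__(self, nbins, low, high):
        self.nbins, self.low, self.high = nbins, low, high

    def index(self, x):
        if not (self.low <= x < self.high):
            return None
        i = int((x - self.low) / (self.high - self.low) * self.nbins)
        return min(i, self.nbins - 1)  # guard against rounding at the edge


class VariableBinning:
    """User-defined bin edges, e.g. logarithmic bins."""

    def __init__(self, edges):
        self.edges = list(edges)
        self.nbins = len(self.edges) - 1

    def index(self, x):
        if not (self.edges[0] <= x < self.edges[-1]):
            return None
        return bisect_right(self.edges, x) - 1


class Histogram1D:
    """Sampling only: accumulates weights and knows nothing about display."""

    def __init__(self, binning):
        self.binning = binning
        self.contents = [0.0] * binning.nbins
        self.out_of_range = 0.0

    def fill(self, x, weight=1.0):
        i = self.binning.index(x)
        if i is None:
            self.out_of_range += weight
        else:
            self.contents[i] += weight


def render_text(hist, width=20):
    """Display lives outside the histogram and only reads its contents."""
    peak = max(hist.contents) or 1.0
    return ["#" * round(width * c / peak) for c in hist.contents]


h = Histogram1D(EvenBinning(4, 0.0, 4.0))
for x in [0.5, 1.5, 1.7, 3.2, 9.0]:
    h.fill(x)

log_h = Histogram1D(VariableBinning([1.0, 10.0, 100.0]))
for x in [5.0, 10.0, 100.0]:
    log_h.fill(x)

print("\n".join(render_text(h)))
```

Swapping `EvenBinning` for `VariableBinning` changes neither `Histogram1D` nor `render_text`, which is the kind of extensibility the slide describes.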

16
Fitting and Minimization
  • Fitting and Minimization Library (FML)
  • common OO interface
  • NAG-C, MINUIT
  • based on Abstract Interfaces
  • IVector, IModelFunction, ...
  • Gemini
  • common minimization interface
  • very thin layer
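
A minimal sketch of the "common minimization interface" idea: the fitting code talks only to an abstract minimizer, so the engine behind it can be exchanged. The names `IMinimizer`, `PatternSearch` and `fit` are hypothetical, and the simple pattern search merely stands in for a real engine such as NAG-C or MINUIT; none of this is the Gemini or FML API.

```python
from abc import ABC, abstractmethod


class IMinimizer(ABC):
    """Hypothetical thin interface shared by all minimization engines."""

    @abstractmethod
    def minimize(self, fcn, start):
        """Return a parameter list that (locally) minimizes fcn."""


class PatternSearch(IMinimizer):
    """Stand-in engine: coordinate moves with a shrinking step size."""

    def __init__(self, step=1.0, tol=1e-8):
        self.step, self.tol = step, tol

    def minimize(self, fcn, start):
        x, step = list(start), self.step
        best = fcn(x)
        while step > self.tol:
            improved = False
            for i in range(len(x)):
                for delta in (step, -step):
                    trial = x[:]
                    trial[i] += delta
                    value = fcn(trial)
                    if value < best:
                        x, best, improved = trial, value, True
            if not improved:
                step /= 2.0  # no move helped: refine the step
        return x


def fit(model, points, start, minimizer):
    """Least-squares fit that depends only on the IMinimizer interface."""
    def chi2(params):
        return sum((y - model(x, params)) ** 2 for x, y in points)

    return minimizer.minimize(chi2, start)


def line(x, params):
    return params[0] + params[1] * x


points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
intercept, slope = fit(line, points, [0.0, 0.0], PatternSearch())
print(intercept, slope)
```

Replacing `PatternSearch()` with another `IMinimizer` subclass leaves `fit` and the user's model function untouched.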

17
Tags, Ntuples and Events
  • NtupleTag Library
  • Ntuple navigation and analysis
  • common OO interface for different storage
  • ODBMS
  • HBook (CERNLIB)
  • Exploiting Tag concept
  • enhanced Ntuples
  • associated with an underlying persistent store
  • optional association to the Event may be used to
    navigate to any other part of the Event
  • even from an interactive visualization program
  • main use: speed up data selection for analysis
  • Tag data is typically better clustered than the
    original data
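
The Tag concept can be sketched as follows: cuts are evaluated on small summary records, and the optional association is followed to the full event only for tags that pass. The `Event`, `Tag` and `EventStore` classes and the `make_tag`/`select` helpers are hypothetical stand-ins for illustration, not the NtupleTag Library API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    """Stand-in for a full event in the persistent store."""
    event_id: int
    track_pts: List[float]


@dataclass
class Tag:
    """A few summary attributes plus an optional link to the event."""
    n_tracks: int
    max_pt: float
    event_id: Optional[int] = None


class EventStore:
    """Counts how often the (expensive) full events are read."""

    def __init__(self, events):
        self._events = {e.event_id: e for e in events}
        self.reads = 0

    def load(self, event_id):
        self.reads += 1
        return self._events[event_id]


def make_tag(event):
    return Tag(len(event.track_pts),
               max(event.track_pts, default=0.0),
               event.event_id)


def select(tags, store, cut):
    """Apply the cut to the tags; load only the events that pass."""
    return [store.load(t.event_id)
            for t in tags
            if t.event_id is not None and cut(t)]


events = [Event(1, [0.4, 2.5]), Event(2, [0.3]), Event(3, [5.1, 1.2, 0.9])]
store = EventStore(events)
tags = [make_tag(e) for e in events]

selected = select(tags, store, lambda t: t.n_tracks >= 2 and t.max_pt > 2.0)
print([e.event_id for e in selected], "full reads:", store.reads)
```

Only the events whose tags pass the cut are read from the store, which is where the speed-up in data selection comes from.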

18
Interactive Data Analysis
  • Aim: OO replacement for PAW
  • analysis of ntuple-like data (Tags,
    Ntuples, ...)
  • visualisation of data (Histograms, scatter-plot,
    Vectors)
  • fitting of histograms (and other data)
  • access to experiment specific data/code
  • Maximize flexibility and re-use
  • plug-in structure
  • careful design with limited source and binary
    dependencies
  • Foresee customization/integration
  • allow use from within the experiment's s/w framework!

19
Lizard Interfaces
20
Anaphe components
21
Scripting (Architectural issue)
  • Typical use of scripting is quite different from
    programming (reconstruction, analysis, ...)
  • history: go back to where I was before
  • repetition/looping - with modifiable parameters
  • SWIG to (semi-) automatically create connection
    to chosen scripting language
  • allows flexibility to choose amongst several
    scripting languages
  • Python, Perl, Tcl, Guile, Ruby, (Java)
  • Python - OO scripting, no strange !-variables
  • other scripting languages possible (through SWIG)
  • Can be enhanced and/or replaced by a GUI
  • scripting window within GUI application

22
Sample session
23
Future Enhancements
  • Access to other implementations of components
  • HBOOK histograms and ntuples (RWN): mostly done
  • OpenScientist, ROOT histograms?
  • Adding other scripting languages
  • Perl, Tcl, CINT?
  • Communication with Java tools/packages
  • via AIDA
  • JAS
  • WIRED

24
Distributed Computing
  • Motivation
  • move code to data
  • parallel analysis
  • Techniques
  • services via Abstract Interfaces (AI)
  • late binding
  • plug-in architecture
  • End-user (Lizard)
  • look-and-feel of local analysis
  • R&D started and first prototype available soon
  • CORBA


25
The AIDA project
  • AIDA project (Abstract Interfaces for Data
    Analysis)
  • Presently active: mainly developers from existing
    packages
  • Tony Johnson (JAS)
  • Andreas Pfeiffer (Lizard/Anaphe)
  • Guy Barrand (OpenScientist)
  • Mark Dönszelmann (Wired)
  • Developers from LHCb/Gaudi

26
AIDA
  • Design Interfaces for Data Analysis (in HEP)
  • The goals of the AIDA project are to define
    abstract interfaces for common physics analysis
    tools, such as histograms. The adoption of these
    interfaces should make it easier for developers
    and users to select to use different tools
    without having to learn new interfaces or change
    their code. In addition it should be possible to
    exchange data (objects) between AIDA compliant
    applications. (http://aida.freehep.org)
  • Open for contributions of any kind
  • questions, suggestions, code, implementations

27
Motivation
  • Minimize coupling between components
  • Provide flexibility to interchange
    implementations of these interfaces
  • Allow for, and try to, re-use existing packages
  • even across language boundaries
  • e.g., C++ analysis using Java Histograms
  • Allow for faster turn-around time

28
Components: Abstract Interfaces
  • User Code uses only Interface classes
  • IHistogram1D* hist = histoFactory->create1D("track quality", 100, 0., 10.)
  • Actual implementations are selected at run-time
  • loading of shared libraries
  • No change at all to user code but keep freedom to
    choose implementation
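
The pattern on this slide can be sketched in Python with abstract base classes. The names `IHistogram1D`, `IHistogramFactory` and `create1D` follow the slide, but the method signatures, the two implementations and the `load_factory` registry are assumptions made for illustration; the registry merely stands in for loading a shared library at run time.

```python
from abc import ABC, abstractmethod


class IHistogram1D(ABC):
    """User code sees only this interface."""

    @abstractmethod
    def fill(self, x, weight=1.0): ...

    @abstractmethod
    def entries(self): ...


class IHistogramFactory(ABC):
    @abstractmethod
    def create1D(self, title, nbins, low, high): ...


def _bin_index(x, nbins, low, high):
    """Shared helper: bin number, or None if x is out of range."""
    if not (low <= x < high):
        return None
    return min(int((x - low) / (high - low) * nbins), nbins - 1)


class DenseHistogram1D(IHistogram1D):
    """Hypothetical implementation: one list slot per bin."""

    def __init__(self, title, nbins, low, high):
        self._axis = (nbins, low, high)
        self._bins = [0.0] * nbins
        self._entries = 0

    def fill(self, x, weight=1.0):
        i = _bin_index(x, *self._axis)
        if i is not None:
            self._bins[i] += weight
            self._entries += 1

    def entries(self):
        return self._entries


class SparseHistogram1D(IHistogram1D):
    """Hypothetical implementation: stores only non-empty bins."""

    def __init__(self, title, nbins, low, high):
        self._axis = (nbins, low, high)
        self._bins = {}
        self._entries = 0

    def fill(self, x, weight=1.0):
        i = _bin_index(x, *self._axis)
        if i is not None:
            self._bins[i] = self._bins.get(i, 0.0) + weight
            self._entries += 1

    def entries(self):
        return self._entries


class Factory(IHistogramFactory):
    def __init__(self, hist_class):
        self._hist_class = hist_class

    def create1D(self, title, nbins, low, high):
        return self._hist_class(title, nbins, low, high)


IMPLEMENTATIONS = {"dense": DenseHistogram1D, "sparse": SparseHistogram1D}


def load_factory(name):
    """Stand-in for selecting an implementation at run time."""
    return Factory(IMPLEMENTATIONS[name])


def user_code(histoFactory):
    """Written once against the interfaces; never names an implementation."""
    hist = histoFactory.create1D("track quality", 100, 0.0, 10.0)
    for x in (1.2, 3.4, 3.45, 42.0):
        hist.fill(x)
    return hist.entries()


print(user_code(load_factory("dense")), user_code(load_factory("sparse")))
```

`user_code` is identical for both implementations; only the name passed to `load_factory` changes.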

29
Architectural issue: Abstract Interfaces
  • Abstract Interfaces
  • Only pure virtual methods, inheritance only from
    other Abstract Interfaces
  • Components use other components only through
    their Abstract Interface
  • Defines a kind of protocol for a component
  • Allow each component to develop independently
  • reduces maintenance effort significantly
  • Maximize flexibility and re-use of packages
  • Re-use of existing packages to implement
    components reduces start-up time significantly

30
Architectural issue: Components (I)
  • Identify components by functionality
  • Define protocol using Abstract Interfaces
  • Emphasize separation of different aspects for
    each component
  • Example: Histogram
  • statistical entity (density distribution of a
    physics quantity)
  • view of a collection of data points (which can
    be a density distribution but also a detector
    efficiency curve)
  • command to manipulate/store/plot/fit/...

31
Architectural issue: Components (II)
  • User's view is different from implementor's
    (developer's) view
  • separate Abstract Interfaces for both aspects
  • command-layer vs. implementation-layer(s)
  • UserInterface as a separate component
  • by definition couples to most of the other
    components
  • Facade pattern
  • promotes weak coupling between the other
    components
  • interfaces to scripting and/or GUI
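
A minimal sketch of the Facade idea described above: the user interface is the only component that knows all the others, so the histogram, fitting and plotting components never reference each other. All class names and methods are hypothetical and deliberately simplistic.

```python
class HistogramStore:
    """Knows only about histograms."""

    def __init__(self):
        self._contents = {}

    def book(self, name, nbins):
        self._contents[name] = [0.0] * nbins

    def fill(self, name, bin_index, weight=1.0):
        self._contents[name][bin_index] += weight

    def contents(self, name):
        return list(self._contents[name])


class MeanFitter:
    """Knows only about sequences of numbers."""

    def fit_constant(self, values):
        return sum(values) / len(values)


class TextPlotter:
    """Knows only about sequences of numbers."""

    def draw(self, values):
        return "\n".join("#" * int(v) for v in values)


class UserInterface:
    """Facade: couples to every component so they need not couple to each other."""

    def __init__(self, store, fitter, plotter):
        self._store, self._fitter, self._plotter = store, fitter, plotter

    def book(self, name, nbins):
        self._store.book(name, nbins)

    def fill(self, name, bin_index, weight=1.0):
        self._store.fill(name, bin_index, weight)

    def fit(self, name):
        return self._fitter.fit_constant(self._store.contents(name))

    def plot(self, name):
        return self._plotter.draw(self._store.contents(name))


ui = UserInterface(HistogramStore(), MeanFitter(), TextPlotter())
ui.book("h", 3)
for bin_index in (0, 0, 1, 2, 2, 2):
    ui.fill("h", bin_index)

print(ui.fit("h"))
print(ui.plot("h"))
```

A scripting layer or a GUI would call `UserInterface` in the same way, which is why the slide places both behind this one component.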

32
Initial Categories and dependencies
33
Across the languages
  • JAida: C++ access to Java libs
  • using C++ proxies implementing the C++ Abstract
    Interfaces to the Java interfaces

34
Infrastructure (I)
  • Location of repository (Java and C++)
  • Anonymous CVS access
  • :pserver:anoncvs@cvs.freehep.org:/cvs/aida
    (passwd: aida)
  • module: aida
  • Release area for C++ code
  • /afs/cern.ch/sw/contrib/AIDA/<version>/AIDA/
  • #include <AIDA/IHistogram.h>
  • version: x.y.p
  • tarballs (C++) available in
    /afs/cern.ch/sw/contrib/AIDA/tar/AIDA-<version>.tar.gz

35
Infrastructure (III)
  • Mailing lists (archived)
  • <listName>@cern.ch
  • project-aida-dev (open)
  • project-aida (open)
  • project-aida-announce (posting moderated,
    subscription open)
  • Web page (http://aida.freehep.org/)
  • Updated automatically from repository
  • On web page: links to implementations
  • from whoever provides one (and informs us)

36
Release schedule / plan
  • Release for discussion and feedback
  • even if not (yet) complete
  • V 2.0: mid-May 2001 (Boston release)
  • C++ and Java versions of the Interfaces
  • V 2.1: Aug. 2001 (Genova release)
  • updates from discussions at Geant-4 workshop
  • fixed some problems with C++ version
  • Aiming for 2-3 month release frequency

37
More information
  • cern.ch/Anaphe
  • cern.ch/Anaphe/Lizard
  • aida.freehep.org/
  • cern.ch/DB
  • wwwinfo.cern.ch/asd/lhc/clhep/

38
Use-cases of AIDA
  • Java reference implementation in FreeHEP
    repository
  • JAS, OpenScientist and Lizard/Anaphe plan for
    implementations of version 2.x by Dec. 2001
  • Used by Gaudi/Athena (LHCb, Atlas, Harp)
  • Gaudi people involved in design
  • Adopted and used in Geant-4 examples/testing
  • new category created in Geant-4 for analysis
  • No need to go for least common denominator
  • use reasonable superset and concentrate on
    design

39
AIDA - Summary
  • Design of Abstract Interfaces for Data Analysis
  • Maximize flexibility and re-use
  • Allow for faster turn-around time
  • Allow for, and try to, re-use existing packages
  • No need to go for least common denominator
  • use reasonable superset
  • concentrate on proper design