Distributed Simulation with Geant4


1
Distributed Simulation with Geant4
Preliminary results of the LowE / DIANE joint project
Jakub T. Mościcki, CERN/IT
credits also to Alfonso Mantero, INFN Genova
2
History
  • Parallelization of Geant4 simulation is a joint
    project between Geant4, DIANE and Anaphe
  • DIANE is an R&D project in IT/API to study
    distributed analysis and simulation and create a
    prototype
  • initiated early 2001 with very limited resources
  • Anaphe is an analysis project supported by IT
  • provides the analysis framework for HEP
  • The pilot programme includes G4 simulation which
    produces AIDA/Anaphe histograms
  • Collaboration started late spring 2002

3
Sequential Geant4 Simulation
  • the goal of simulation
  • optimize the detectors used for x-ray
    fluorescence emission from Mercury's crust in the
    context of Hermes, Bepi Colombo ESA mission.
  • requires high statistics → many events
  • 20 Mio events: 3 hours
  • up to 100 Mio events might be useful
  • estimated time: 16 hours

4
Parallel Geant4 Simulation
  • increase performance
  • shift from batch to semi-interactive simulation
  • speed up the analysis cycle
  • generate more events, debug simulation faster
  • from sequential to parallel simulation
  • preserve reproducibility of the results
  • minimize deployment overhead
  • when moving from sequential to parallel
    simulation
  • both in terms of time and amount of code/expertise
    one must invest

5
Performance Increase
6
Benchmarking environment
  • parallel cluster configuration
  • lxplus: 70 Red Hat 6.1 nodes
  • 7 Intel STL2 (2 x PIII 1 GHz, 512 MB)
  • 31 ASUS P2B-D (2 x PIII 600 MHz, 512 MB)
  • 15 Celsius 620 (2 x PIII 550 MHz, 512 MB)
  • the rest: Kayak (2 x PIII 450 MHz, 128 MB)
  • reference sequential machine
  • pcgeant2 (2 x Xeon 1700 MHz, 1 GB)

7
Benchmarking Caveat
  • non-exclusive access to interactive machines
  • 'load-noise' background, unpredictable load peaks
  • different CPU and RAM on nodes
  • AFS used to fetch physics config data
  • try to remove the noise
  • repeat simulations many times to get the correct
    mean
  • work at night and off-peak hours (what about US
    people using CERN computing facilities ?)
  • etc...
  • conclusion
  • results should be taken with caution and are
    approximate

8
Structure of the simulation
  • initialization phase (constant)
  • load 10-15 MB of physics tables, config data
    etc.
  • reference sequential machine: 4 minutes (user
    time)
  • cluster nodes: 5-6 minutes
  • beamOn: f(number of events)
  • small job: 1-5 Mio events
  • medium job: 20-40 Mio events
  • big job: > 50 Mio events
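
The two-phase structure above suggests a simple timing model: a constant initialization cost plus a term proportional to the number of events. A minimal sketch in Python, assuming ideal, identical workers; the function names, the 5-minute initialization and the event rate (taken from the sequential figure of 20 Mio events in 3 hours) are illustrative assumptions, not benchmark results:

```python
# Rough timing model: job time = constant initialization + events / rate.
# All numbers below are illustrative assumptions, not measurements.

INIT_MINUTES = 5.0                  # assumed initialization time per worker
EVENTS_PER_MINUTE = 20e6 / 180.0    # assumed rate: 20 Mio events in 3 hours


def sequential_minutes(n_events: float) -> float:
    """Estimated wall time of a single sequential job."""
    return INIT_MINUTES + n_events / EVENTS_PER_MINUTE


def parallel_minutes(n_events: float, n_workers: int) -> float:
    """Estimated wall time with events split evenly over identical workers.

    Every worker pays the full initialization cost, so only the
    event-dependent part shrinks with the number of workers.
    """
    return INIT_MINUTES + n_events / (EVENTS_PER_MINUTE * n_workers)


def speedup(n_events: float, n_workers: int) -> float:
    """Ratio of sequential to parallel estimated wall time."""
    return sequential_minutes(n_events) / parallel_minutes(n_events, n_workers)


if __name__ == "__main__":
    for label, n in [("small", 1e6), ("medium", 20e6), ("big", 50e6)]:
        print(f"{label:6s} {n / 1e6:5.0f} Mio events: "
              f"speedup on 10 workers ~ {speedup(n, 10):.1f}")
```

Under these assumptions a small job gains much less than a big one, because the constant initialization time dominates its total run time.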

9
Scalability test (job time)
10
Normalized efficiency
11
Benchmarking (comments)
  • results are approximate
  • scaling factors for different CPU speeds
  • but seem to be in agreement with expectations
  • move from batch to semi-interactive simulation
    is feasible
  • small jobs do not gain so much: large constant
    initialization time

12
Problems & solutions
  • time of job execution is set by the slowest
    machine...
  • ...or most loaded one at the moment
  • often had to wait a long time for last worker to
    finish
  • possible solution
  • use larger number of smaller workers
  • fast machines run workers sequentially many
    times, but...
  • constant initialization time is rather important
  • initialize once, beamOn many times... to be
    checked
  • if this problem is solved we may move towards
    more interactive simulation
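
The trade-off described on this slide can be illustrated with a toy scheduling model in which idle nodes pull the next task as soon as they are free. This is only a sketch: the node speeds, event counts and initialization time are made-up values, and `makespan` is a hypothetical helper, not part of DIANE or Geant4.

```python
import heapq


def makespan(node_speeds, total_events, n_tasks, init_time):
    """Finish time when n_tasks equal tasks are pulled by nodes as they free up.

    node_speeds: events per unit time for each node (hypothetical values).
    Each task pays init_time once, then processes its share of the events.
    """
    events_per_task = total_events / n_tasks
    # Heap of (time at which the node becomes free, node index).
    free_at = [(0.0, i) for i in range(len(node_speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(n_tasks):
        t, i = heapq.heappop(free_at)   # earliest available node
        t_done = t + init_time + events_per_task / node_speeds[i]
        finish = max(finish, t_done)
        heapq.heappush(free_at, (t_done, i))
    return finish


if __name__ == "__main__":
    speeds = [2.0, 1.0, 1.0, 0.5]       # one fast, two medium, one slow node
    for n_tasks in (4, 12, 120):
        print(n_tasks, "tasks ->", makespan(speeds, 1200, n_tasks, 10.0))
```

With these made-up numbers, 12 tasks finish sooner than 4 because the fast node takes over work from the slow one, while 120 tasks are slower again than 12 because every task repeats the constant initialization.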

13
From sequential to parallel simulation
14
Reproducibility
  • initial seed of the random engine
  • make sure that every parallel simulation starts
    with a seed uniquely determined by the job's
    initial seed
  • number of times engine is used depends on the
    initial seed
  • make sure that correlations between the workers'
    seeds are avoided
  • our solution
  • use two uncorrelated random engines
  • one to generate a table of initial seeds (one
    seed for each worker)
  • another for the simulation inside the worker
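
The two-engine scheme can be sketched as follows. The slides do not name the engines used, so both choices here are stand-ins: a small SplitMix64 generator fills the seed table, and Python's built-in Mersenne Twister plays the role of the simulation engine. `worker_seeds` and `run_worker` are hypothetical names.

```python
import random
from typing import List

MASK64 = (1 << 64) - 1


class SplitMix64:
    """Small generator used only to fill the table of worker seeds."""

    def __init__(self, seed: int) -> None:
        self.state = seed & MASK64

    def next(self) -> int:
        self.state = (self.state + 0x9E3779B97F4A7C15) & MASK64
        z = self.state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        return z ^ (z >> 31)


def worker_seeds(job_seed: int, n_workers: int) -> List[int]:
    """Table of initial seeds, one per worker, fixed by the job's seed."""
    gen = SplitMix64(job_seed)
    return [gen.next() for _ in range(n_workers)]


def run_worker(seed: int, n_events: int) -> float:
    """Stand-in for a worker's simulation, using its own separate engine."""
    engine = random.Random(seed)    # Mersenne Twister, distinct from SplitMix64
    return sum(engine.random() for _ in range(n_events))


if __name__ == "__main__":
    seeds = worker_seeds(job_seed=12345, n_workers=4)
    print([run_worker(s, 1000) for s in seeds])
```

Rerunning with the same job seed and the same number of workers reproduces the same seed table and therefore the same per-worker results.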

15
Reproducibility
  • parameters which need to be fixed to reproduce
    the simulation
  • total number of events
  • initial seed
  • ... but also
  • number of workers
  • number of events per worker
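
One way to keep these parameters together is a small job record from which the per-worker event counts are derived deterministically. A sketch only: `JobSpec` and `events_per_worker` are hypothetical names, and the even split is an assumption, not necessarily how the project divided events.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class JobSpec:
    """Hypothetical record of everything needed to rerun a parallel job."""
    total_events: int
    initial_seed: int
    n_workers: int


def events_per_worker(spec: JobSpec) -> List[int]:
    """Split the events as evenly as possible, in a fixed, repeatable order."""
    base, extra = divmod(spec.total_events, spec.n_workers)
    # The first `extra` workers take one additional event each.
    return [base + (1 if i < extra else 0) for i in range(spec.n_workers)]


if __name__ == "__main__":
    print(events_per_worker(JobSpec(total_events=10, initial_seed=42, n_workers=3)))
    print(events_per_worker(JobSpec(total_events=10, initial_seed=42, n_workers=4)))
```

Changing the number of workers changes the split, and with it which seed processes which share of the events, which is why it has to be fixed along with the seed and the event count.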

16
Minimizing deployment overhead
17
Ease of use
  • user-friendliness
  • G4 simulation developer should not need to fight
    with irrelevant technical problems when moving
    from sequential to parallel G4 simulation
  • as non-intrusive as possible
  • minimize necessary code changes in original
    simulation
  • good separation of the subsystems
  • G4 simulation does not need to know that it runs
    in parallel...
  • the distributed framework (DIANE) does not need
    to care about what actually is being simulated
    (see Slide 20)

18
What is DIANE?
  • R&D project in IT/API
  • semi-interactive parallel analysis for LHC
  • middleware technology evaluation & choice
  • CORBA, MPI, Condor, LSF...
  • also see how to integrate API products with GRID
  • prototyping (focus on ntuple analysis)
  • time scale and resources
  • Jan 2001: start (< 1 FTE)
  • June 2002: running prototype exists
  • sample Ntuple analysis with Anaphe
  • event-level parallel Geant4 simulation

19
What is DIANE?
  • framework for parallel cluster computation
  • application-oriented
  • master-worker model common in HEP applications
  • application-independent
  • apps dynamically loaded in a plugin style
  • callbacks to applications via abstract interfaces
  • component-based
  • subsystems and services packaged into component
    libraries
  • core architecture uses CORBA and CCM (CORBA
    Component Model )
  • integration layer between applications and the
    GRID
  • environment and deployment tools
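
The idea of application-independent callbacks via abstract interfaces can be sketched as below. This is not DIANE's API: the `Application` interface, its method names and `run_master` are hypothetical, and threads in a single process stand in for worker nodes connected through middleware.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor
from typing import Any, List


class Application(ABC):
    """Hypothetical interface the framework calls back into."""

    @abstractmethod
    def split(self, n_tasks: int) -> List[Any]:
        """Return n_tasks independent task descriptions."""

    @abstractmethod
    def run_task(self, task: Any) -> Any:
        """Execute one task on a worker and return its partial result."""

    @abstractmethod
    def merge(self, partial_results: List[Any]) -> Any:
        """Combine partial results into the final result."""


def run_master(app: Application, n_workers: int) -> Any:
    """Generic master: knows nothing about what the application computes."""
    tasks = app.split(n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(app.run_task, tasks))
    return app.merge(partial)


class EventCounter(Application):
    """Toy application standing in for a simulation: it only counts events."""

    def __init__(self, total_events: int) -> None:
        self.total_events = total_events

    def split(self, n_tasks: int) -> List[int]:
        base, extra = divmod(self.total_events, n_tasks)
        return [base + (1 if i < extra else 0) for i in range(n_tasks)]

    def run_task(self, task: int) -> int:
        return task  # a real worker would simulate `task` events here

    def merge(self, partial_results: List[int]) -> int:
        return sum(partial_results)


if __name__ == "__main__":
    print(run_master(EventCounter(1000), n_workers=7))
```

The master only touches the three interface methods, so a different application can be plugged in without changing the framework code.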

20
Master/Worker model
  • applications share the same computation model
  • so also share a big part of the framework code
  • but have different non-functional requirements
  • CPU vs IO intensive
  • semi-interactive vs batch etc....

21
What DIANE is not
  • DIANE is not
  • a replacement for a GRID and its services
  • a hardwired analysis toolkit

22
DIANE and GRID
  • DIANE as a GRID computing element
  • ...via a gateway that understands Grid/JDL
  • ... Grid/JDL must be able to describe parallel
    jobs/tasks
  • DIANE as a user of (low level) Grid services
  • ...authentication, security, load balancing...
  • and profit from existing 3rd party
    implementations
  • python environment is a rapid prototyping
    platform
  • and may provide a convenient connection between
    DIANE and Globus Toolkit via pyGlobus API

23
Architecture Overview
  • layering abstract middleware interfaces and
    components
  • plugin-style application loading

24
Conclusions
  • prototype deployment of G4-DIANE
  • significant performance improvement possible
  • scalability tests
  • 140 Mio Events
  • 70 nodes in the cluster
  • 1 hour total parallel execution
  • putting together DIANE and G4 is fairly easy
  • done in several days...
  • DIANE may bridge G4 to the GRID world
  • without necessarily waiting for fully-fledged
    GRID infrastructure to become available