Title: Ganga Status Update Will Reece
1Ganga Status Update, Will Reece
2Outline
- User Statistics
- User Experiences
- New Features in 4.3.0
- Upcoming Features
- Reference Manual
- Testing Tools
- Summary
3User Statistics
25 Users
- 557 Unique Users Since Jan 1, 110 per Week
- 113 LHCb Users, 25 Unique per Week
http://gangamon.cern.ch:8888/
4User Experiences
- Feedback from Active LHCb Users
- Helps prioritize features
- Tells us what Needs Improvement
- and what is already good!
- Mailing Lists Good Source
- Will Look at Some Case Studies
5Robert Lambert
- Used Gauss to Generate 70m Events
- Studying final state asymmetries → custom decay
- Needed 10^-3 precision across 10 pT bins
- Compared Custom Decay with DC06
- Used Ganga and DIRAC → 4000 Jobs
- 2 Years of CPU Time!
- Very Happy with DIRAC Success rate
- Ganga Front-end Really Easy!
- Likes SplitByFiles (but Replica Issues)
- Wants Merge of Subjobs
6Eduardo Rodrigues
- Toy MC Used for γ Sensitivity Studies
- Bs→Dsπ, Bs→DsK channels
- Needed large data set → Used Ganga and LCG
- Uses ROOT and RooFit → Root App
- Ran 3000 toy experiments
- Each experiment takes 2-3 hours → 1 year CPU!
- Had some problems with LCG → Planning to use Dirac
- Using PyROOT for e.g. Simplified Studies
- Root App and LCG Backend with standard python modules
- Has had good experience both with LSF and Grid
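The "1 year CPU" figure can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch using the numbers quoted on the slide (3000 toy experiments, 2-3 hours each); the function name is illustrative.

```python
# Rough CPU-time estimate for the toy study, using the slide's numbers.
N_EXPERIMENTS = 3000
HOURS_PER_EXPERIMENT = (2.0, 3.0)  # quoted range per toy experiment
HOURS_PER_YEAR = 24 * 365


def cpu_years(n_jobs, hours_per_job):
    """Total CPU time in years for n_jobs, each taking hours_per_job."""
    return n_jobs * hours_per_job / HOURS_PER_YEAR


low, high = (cpu_years(N_EXPERIMENTS, h) for h in HOURS_PER_EXPERIMENT)
print(f"{low:.2f} to {high:.2f} CPU years")
```

The range brackets one year, consistent with the round figure on the slide.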
7Mitesh Patel
- Uses Ganga to Study Small Backgrounds
- B → (D0/anti-D0)(→Kπ, KK, ππ)K (LHCB-2006-066)
- Looking at suppressed (10^-7) decays to measure γ
- Bd → Kμμ as New Physics Probe (LHCB-2007-038)
- Uses full sample b→μ, b→μ and b→c→μ to ntuple
- Likes Splitters but Would Like More Warnings
- Has Submitted 1000s of Jobs
- Benefited from Developer Support
- More Examples Would be Nice
8New in 4.3.0
- GNU GPL License
- Sun Grid Engine Support
- Core Updates
- Oracle backend for remote repository
- Subjob access to job repository optimized
- DIRAC Support for Root Application
- PyROOT
- Run python jobs using the ROOT libraries
- Gaudi Updates: ROOT Map files
- Many Bugfixes → Improved Stability!
- Testing framework
http://ganga.web.cern.ch/ganga/release/4.3.0/
9Ganga Goes GPL
- 4.3.0 is First GPL Release
- Aim is to protect project
- Applies to Future Releases
- Ganga Used Commercially
- Clear license needed
http://www.gnu.org/licenses/gpl.html
10SGE Backend Now Supported
- Sun Grid Engine Support Added
- Common batch system
- Can Use Following Applications
- Executable
- Root
- Any Gaudi
11DIRAC Submission for ROOT
- Submit Jobs Using ROOT to DIRAC
- Uses new functionality in DIRAC v2r13
- DIRAC Recommended for Remote ROOT Jobs
- Improved reliability
- Superior job debugging info
- Excellent job monitoring
DIRAC is LHCb Standard for Distributed Analysis
12PyROOT Support
- ROOT Provides Python Bindings
- Python is quick and easy to write → Productive!
- Ganga Now Supports Use in Root App
- Need Correct Python Version for ROOT
- Determined Automatically
- LHCb Configuration uses LCG versions
- /afs/cern.ch/sw/lcg/external/
- Can be controlled in .gangarc file
14PyROOT Support
- Root Documentation Updated
- help(Root) in Ganga
15Gaudi Updates: ROOT Map
- ROOT Map used to Auto-load Libraries
- Found via CMT
- Now Preparing for 4.3.x
- Expect new LHCb Functionality in 4.3.2
16Upcoming Features
Features planned for 4.3.x or 4.4
- Framework for Job Merging
- Merge text and ROOT files
- Job Slices
- LFC Aware Splitter for Gaudi
- Caching for Datasets
- Summary Printing of Objects
- Improved Credential Management
https://twiki.cern.ch/twiki/bin/view/ArdaGrid/GangaIndexGangaFourFour
17Merging of Jobs and Subjobs
Ganga 4.3.x
- Jobs may have Many Subjobs
- Hand Merge?
- Time Consuming and Error Prone → Automate
- Merge Subjobs
- Combines subjob output
- Can Run on Master Job Completion
- or from Command Line
- Merging Text and ROOT Files Supported
- What else is needed?
- Can Merge Lists of Jobs
18Automatic Merge
- Attach Merge Object to Job
- Merge run on completion
19Command Line Merge
- Create List of Jobs to Merge
- Will recursively merge subjobs
- Run Merge on Command Line
- Support Job Slices in Ganga 4.4
20Types of Merge
- TextMerger: Concatenate Text
- Unordered, but adds headers
- RootMerger: Combines ROOT Files
- Uses hadd → Adds histograms and trees
- MultipleMerger: Chain Merge Objects
- SmartMerger: Merge by Extension
- Associations in .gangarc file
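The text-merge and merge-by-extension ideas can be sketched in plain Python. This is not the Ganga implementation: the function names and the extension table below are hypothetical stand-ins (in Ganga the associations live in the .gangarc file, and ROOT files would go through hadd rather than a text concatenation).

```python
import os


def text_merge(paths, out_path):
    """Concatenate text files, writing a header line before each input."""
    with open(out_path, "w") as out:
        for path in paths:
            out.write(f"# ---- {path} ----\n")
            with open(path) as src:
                out.write(src.read())


# Hypothetical extension table for this sketch only.
MERGERS_BY_EXTENSION = {".txt": text_merge, ".log": text_merge}


def smart_merge(paths, out_path):
    """Pick a merge function from the output file's extension."""
    ext = os.path.splitext(out_path)[1]
    try:
        merger = MERGERS_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError(f"no merger registered for {ext!r}")
    merger(paths, out_path)
```

The headers make it possible to trace each block of merged output back to the subjob file it came from, which matters because the inputs are not ordered.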
21Job Slices
Ganga 4.4
- Change Semantics of jobs Object
- Support slices → jobs[-1], jobs[0:5]
- Index by Job ID → use __call__ e.g. jobs(45)
- Allow Job Operations on Slices
- copy, fail, kill, peek, remove, resubmit, submit
- Job Subjobs also a Job Slice
- Can Create Job Slice with select
- select(time='yesterday')
- select(status='failed')
https://twiki.cern.ch/twiki/bin/view/ArdaGrid/GangaJobIndexingSlices
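The proposed semantics (positional slicing, lookup by job ID through __call__, selection by attribute, operations applied to a whole slice) can be illustrated with a small stand-alone class. FakeJob and JobSlice below are hypothetical stand-ins written for this sketch, not Ganga classes, and only a status-style select and kill are shown.

```python
class FakeJob:
    """Stand-in for a Ganga job; only the fields used in this sketch."""

    def __init__(self, job_id, status):
        self.id = job_id
        self.status = status


class JobSlice:
    def __init__(self, jobs):
        self._jobs = list(jobs)

    def __len__(self):
        return len(self._jobs)

    def __iter__(self):
        return iter(self._jobs)

    def __getitem__(self, index):
        # Positional access: a slice gives a new JobSlice, an int gives one job.
        if isinstance(index, slice):
            return JobSlice(self._jobs[index])
        return self._jobs[index]

    def __call__(self, job_id):
        # Lookup by job ID rather than by position.
        for job in self._jobs:
            if job.id == job_id:
                return job
        raise KeyError(job_id)

    def select(self, **criteria):
        # Keep jobs whose attributes match every keyword given.
        return JobSlice(j for j in self._jobs
                        if all(getattr(j, k) == v for k, v in criteria.items()))

    def kill(self):
        # An operation on a slice applies to every job in it.
        for job in self._jobs:
            job.status = "killed"


jobs = JobSlice([FakeJob(40, "completed"), FakeJob(45, "failed"),
                 FakeJob(46, "running")])
```

With this, jobs[-1] is the last job by position, jobs(45) is the job whose ID is 45, and jobs.select(status='failed') returns another slice that supports the same operations.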
22LFC Aware Splitter for Gaudi
- Gaudi Provides SplitByFiles
- Splits job into subjobs with subset of data files
- Data Files not Available in all Sites
- Some subjobs are unrunnable
- DIRAC v2r14 Allows Query of LFC
- Sort files by location → optimal splitting
- New DiracSplitter
- Splits files by file locations. Must use LFNs
- Protects against mistyped file names → Error
Ganga 4.4
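One way to picture location-aware splitting is to group LFNs that share the same set of sites and then cut each group into subjobs. The sketch below assumes the replica query returns a mapping from LFN to site names; the function name, the mapping shape and the max_files parameter are assumptions for illustration, not the DiracSplitter interface.

```python
from collections import defaultdict


def split_by_location(replicas, max_files=10):
    """Group LFNs that share the same set of sites, then chunk each group.

    replicas maps LFN -> list of site names (assumed query result shape).
    Returns a list of (sites, lfns) pairs, one per subjob.
    """
    missing = [lfn for lfn, sites in replicas.items() if not sites]
    if missing:
        # A mistyped LFN has no replicas, so fail before splitting.
        raise ValueError(f"no replicas found for: {missing}")

    groups = defaultdict(list)
    for lfn, sites in replicas.items():
        groups[frozenset(sites)].append(lfn)

    subjobs = []
    for sites, lfns in groups.items():
        lfns.sort()
        for i in range(0, len(lfns), max_files):
            subjobs.append((sorted(sites), lfns[i:i + max_files]))
    return subjobs
```

Every subjob produced this way has at least one site that holds all of its input files, which is the property that avoids unrunnable subjobs.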
23Performance of LFC Replica Query
- Last SW Week
- DIRAC v2r13 LFC Query Slow
- 0.5s per file → 5min for 600 files
- DIRAC v2r14 Bulk Query
- Much Improved Performance
- A factor of 10 faster
- 30s for 600 files
- Thanks to DIRAC Team!
[Plots: DIRAC v2r13 Single Query; DIRAC v2r14 Multiple Query]
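The quoted timings are consistent with each other, as a short calculation shows. Only the numbers from the slide are used (600 files, 0.5 s per file, 30 s for the bulk query); the variable names are illustrative.

```python
FILES = 600
SINGLE_QUERY_S_PER_FILE = 0.5  # DIRAC v2r13, one query per file
BULK_QUERY_TOTAL_S = 30.0      # DIRAC v2r14, bulk query for the same files

single_total_s = FILES * SINGLE_QUERY_S_PER_FILE
speedup = single_total_s / BULK_QUERY_TOTAL_S
print(f"single: {single_total_s / 60:.0f} min, "
      f"bulk: {BULK_QUERY_TOTAL_S:.0f} s, speed-up x{speedup:.0f}")
```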
24Performance of LFC Replica Query
- Further Speed Up Needed?
- Multithreaded query worse
- Limited by LFC
- Queue system used?
- Use Replica Caching
- Cache stored per file
- Cache date stored
- Users Query with Dataset
- updateReplicaCache()
- DiracSplitter Still Slow
- Will print time estimate at start
1397 Unique Files Queried
Error bars show σ of 5 measurements
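A per-file cache with a stored date might look roughly like the sketch below. The class, its method names and the expiry policy are assumptions made for illustration (the slide names only updateReplicaCache()); the query callable stands in for whatever bulk LFC lookup is available.

```python
import time


class ReplicaCache:
    """Per-file replica cache that remembers when each entry was stored."""

    def __init__(self, query, max_age_s=24 * 3600):
        self._query = query          # callable: list of LFNs -> {LFN: [sites]}
        self._max_age_s = max_age_s  # hypothetical expiry policy
        self._entries = {}           # LFN -> (timestamp, sites)

    def update(self, lfns, now=None):
        """Query only the LFNs that are missing or stale; return all replicas."""
        now = time.time() if now is None else now
        stale = [lfn for lfn in lfns
                 if lfn not in self._entries
                 or now - self._entries[lfn][0] > self._max_age_s]
        if stale:
            for lfn, sites in self._query(stale).items():
                self._entries[lfn] = (now, list(sites))
        return {lfn: self._entries[lfn][1]
                for lfn in lfns if lfn in self._entries}
```

Because only missing or stale entries are sent to the catalogue, repeated splitting of the same dataset pays the query cost once per file rather than once per run.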
25Printing Summary of Objects
- Printing Verbose
- E.g. Job object with many subjobs
- Summary as Default
- Lists show length
- Objects define own summary
- Get Full Print
- full_print(j)
- Same on object attributes
Ganga 4.4
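The idea of a short default printout with the full version on request can be sketched as two helper functions. The names summary, full and DemoJob are hypothetical and exist only for this illustration; they are not the Ganga implementation of full_print.

```python
def summary(obj, max_items=3):
    """Short description: long lists are reduced to their length."""
    parts = []
    for name, value in vars(obj).items():
        if isinstance(value, list) and len(value) > max_items:
            parts.append(f"{name}=<list of {len(value)} items>")
        else:
            parts.append(f"{name}={value!r}")
    return f"{type(obj).__name__}({', '.join(parts)})"


def full(obj):
    """Verbose description with every attribute written out."""
    parts = [f"{name}={value!r}" for name, value in vars(obj).items()]
    return f"{type(obj).__name__}({', '.join(parts)})"


class DemoJob:
    """Stand-in object with one short and one long attribute."""

    def __init__(self, name, subjobs):
        self.name = name
        self.subjobs = subjobs
```

For a DemoJob with 100 subjobs, summary() reports only the list length, while full() writes out every element.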
27Improved Credential Management
Ganga 4.4
- Ganga Manages Credentials That Expire
- AFS Token, Grid Proxy
- Expiring Tokens Affect Ganga Session
- Ganga May Not Clean-Up Services on Exit
- Introducing InternalService Objects
- Ensures correct clean-up
- Services not used when expired
- Alert Users Before Credentials Expire
- Ganga Shuts Down Gracefully
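The alert-before-expiry behaviour can be pictured as a periodic check over the managed credentials. The class, the function and the one-hour warning threshold below are assumptions for this sketch, not the InternalService design.

```python
class Credential:
    """Stand-in for an expiring credential such as an AFS token or Grid proxy."""

    def __init__(self, name, expires_at):
        self.name = name
        self.expires_at = expires_at  # seconds since the epoch

    def time_left(self, now):
        return self.expires_at - now


def check_credentials(credentials, now, warn_before_s=3600):
    """Return (warnings, expired) lists of credential names."""
    warnings, expired = [], []
    for cred in credentials:
        left = cred.time_left(now)
        if left <= 0:
            expired.append(cred.name)
        elif left <= warn_before_s:
            warnings.append(cred.name)
    return warnings, expired
```

A session could alert the user for names in the first list and stop using services that depend on names in the second.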
28Upcoming Feature: Remote Workspaces
- Roaming Ganga Profile
- Store Workspace Remotely
- Access input and output files anywhere
- Work across multiple machines
- Local Cache Created on Demand
- Currently at Prototyping Stage
- Exciting new functionality!
- Release Schedule is Uncertain
Ganga 4.x
29The Ganga Reference Manual
- Aim is to Show Ganga Help Online
- Same information as help in Ganga
- Documentation Generated from Source
- Have Prototype Online
- Missing documentation to be filled in (on-going!)
- Manual will be Generated with Release
- Feedback on Documentation Appreciated
- Let us know if anything is not clear
http://ganga.web.cern.ch/ganga/user/GPI/
31Testing Tools
- Use Test Framework
- Based on unittest
- Reports with Release
- Helps Find Bugs!
- Now Collect Coverage
- Use Figleaf Library
- Should improve testing
- Identifies untested code
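For readers unfamiliar with unittest, a minimal test case looks like the sketch below. The chunk function is a made-up example under test, not Ganga code, and no coverage tool is invoked here.

```python
import unittest


def chunk(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


class ChunkTest(unittest.TestCase):
    def test_even_split(self):
        self.assertEqual(chunk([1, 2, 3, 4], 2), [[1, 2], [3, 4]])

    def test_remainder(self):
        self.assertEqual(chunk([1, 2, 3], 2), [[1, 2], [3]])

    def test_bad_size(self):
        with self.assertRaises(ValueError):
            chunk([1], 0)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ChunkTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A coverage tool would then show, for example, whether the ValueError branch is exercised by any test.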
33The LHCb Distributed Analysis Mailing List
- Replaces Current List for LHCb Users
- project-ganga_at_cern.ch
- lhcb-distributed-analysis_at_cern.ch
- Can sign up at http://simba2.cern.ch
- Encourages User Community
- Less support burden for developers!
https://mmm.cern.ch/public/archive-list/l/lhcb-distributed-analysis/
34Summary
- User Statistics: 557 Unique Users in '07
- Ganga is de facto Grid front end tool for LHCb
- Ganga has New Features in 4.3.0
- Dirac Handler for Root, PyROOT Support, etc.
- Interesting Features Upcoming
- Merge framework, DiracSplitter
- Reference Manual Coming Soon
http://ganga.web.cern.ch/ganga/