1
Nicola De Filippis
Dipartimento Interateneo di Fisica
dell'Università e del Politecnico di Bari and INFN
  • Outline
  • goals of the procedure
  • preparation of the POOL catalogues
  • attaching the runs to the META data and validation
  • conclusions

2
  • to provide easy and fast local access to data for the people performing analysis on physics channels
  • to test the handling of MCinfo, Hit, Digi and Pileup information at a Tier-1/2 site, also via Grid tools
  • to understand the relationship between the META data and the event files in COBRA and POOL (still not so clear!)
  • to gain experience with data management and
    transfer

3
Datasets (PCP04) and POOL xml catalogues with Hits, Digis and pileup fully attached to virgin META data at Bari
Signal events:
eg03_hzz_2e2mu_160  eg03_hzz_2e2mu_170  eg03_hzz_2e2mu_180  eg03_hzz_2e2mu_190
eg03_hzz_2e2mu_200  eg03_hzz_2e2mu_250  eg03_hzz_2e2mu_300  eg03_hzz_2e2mu_450
eg03_hzz_2e2mu_500  eg03_hzz_2e2mu_600  hg03_hzz_2e2mu_115a hg03_hzz_2e2mu_120a
hg03_hzz_2e2mu_130a hg03_hzz_2e2mu_140a hg03_hzz_2e2mu_150a

Background:
eg03_zz_2e2mu  hg03_zbb_2e2mu_compHEP  eg03_tt_2e2mu (copying)
hg03_zbb_cc_2e2mu_compHEP  hg03_zbb_lc_2e2mu_compHEP (the last two still in production)
  1. The samples not available locally were transferred using castorgrid or SRB (a transfer sketch is shown below)
  2. A set of scripts was created to prepare the local catalogues and to attach the runs
  3. The analysis jobs ran on a cluster (70 CPUs) with a hybrid configuration, in order to run CMS production and analysis both locally and in the Grid environment

D. Giordano is one of the people responsible for the H → ZZ → 2e2μ analysis
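As a rough illustration of point 1, a transfer could be scripted as below; the GridFTP door, the CASTOR and SRB paths, the file name and the local directory are placeholders, not the actual ones used for these samples.

  #!/bin/sh
  # Hedged sketch of point 1: fetch a missing sample either through
  # castorgrid (a GridFTP front-end to CASTOR at CERN) or through SRB.
  # Endpoint, remote paths, file name and local directory are
  # illustrative placeholders only.

  SAMPLE=eg03_tt_2e2mu
  LOCALDIR=/data/pcp04/${SAMPLE}
  mkdir -p ${LOCALDIR}

  # (a) castorgrid route: needs a valid Grid proxy
  grid-proxy-init
  globus-url-copy -vb \
    "gsiftp://castorgrid.cern.ch/castor/cern.ch/cms/${SAMPLE}/EVD0_Hit.root" \
    "file://${LOCALDIR}/EVD0_Hit.root"

  # (b) SRB route: initialise the session, get the file, close
  Sinit
  Sget /cms/pcp04/${SAMPLE}/EVD0_Hit.root ${LOCALDIR}/
  Sexit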
4
  • (a) Preparation of the local POOL xml catalogues in a few steps
  • Download the virgin META data (without runs) from CERN at http://cmsdoc.cern.ch/cms/production/www/cgi/data/META and prepare the related POOL xml catalogue
  • Prepare the POOL xml catalogue of the HITs and DIGIs runs by extracting the POOL compressed string of the runs from RefDB (the pileup data and catalogue are assumed to be already available locally)
  • Publish the POOL catalogues of the META data, hits, digis and pileup in one complete POOL file catalogue (a sketch follows below)
  • Change the physical filenames in the catalogue according to the local path of the files or to the rfio path
  • Make sure that the META data are accessed locally and not via rfio

The POOL catalogue is READY to be used
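As a rough sketch of the publishing and path-rewriting steps above (not the actual kit_for_analysis scripts): the catalogue file names, the dummy PFN prefix and the local data path are assumptions, and the sed step simply exploits the fact that the POOL catalogue is a plain xml file.

  #!/bin/sh
  # Hedged sketch: merge the single catalogues into one complete POOL
  # file catalogue and point the PFNs to the local disk. File names,
  # the dummy prefix and the local prefix are illustrative only.

  DEST=xmlcatalog_file:PoolFileCatalog.xml

  # publish the META data, hit, digi and pileup catalogues into one
  for cat in META_catalog.xml hits_catalog.xml digis_catalog.xml pileup_catalog.xml
  do
    FCpublish -d ${DEST} -u xmlcatalog_file:${cat}
  done

  # rewrite the dummy PFN prefix into the local path (or an rfio path);
  # editing the xml directly is much faster than FCrenamePFN on large
  # catalogues (see the known problems later on)
  sed 's|pfn="/dummy/path/|pfn="/data/pcp04/|g' PoolFileCatalog.xml > tmp.xml \
    && mv tmp.xml PoolFileCatalog.xml

  # cross-check the resulting physical file names
  FClistPFN -u ${DEST}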
5
  • (b) Attaching the runs to the virgin META data in a few steps
  • Extract the CARFResume runid string of the Digis from RefDB, or from the summary files if available
  • Attach the runs, fix the final collection and check the attached META data with dsDump
  • Validate by running the ExSimHitStatistics and ExDigiStatistics ORCA executables to check that hits and digis can be accessed locally (a minimal sketch is given below)

The Data sample is READY to be analysed
In my experience, running the ORCA analysis codes requires attaching the DIGI runs only, plus access to the META data and to the EVD of the hits (which do not need to be attached).
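A minimal sketch of the validation step, assuming a suitable .orcarc (pointing the executables at the local POOL catalogue and at the attached collection) is already in the working directory; the log handling is illustrative.

  #!/bin/sh
  # Hedged sketch of the validation: run the two ORCA statistics
  # executables against the attached dataset and make sure hits and
  # digis are actually read back. Assumes a prepared .orcarc.

  for exe in ExSimHitStatistics ExDigiStatistics
  do
    echo "running ${exe} ..."
    ${exe} > ${exe}.log 2>&1 || { echo "${exe} FAILED"; exit 1; }
  done

  # a quick look at the logs usually confirms the event access
  grep -i -e event -e error Ex*Statistics.log | tail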
6
  1. all the procedures are based on parsing the RefDB web pages and depend strongly on the structure of the RefDB tables, which can change because of multiple hit or digi fields used for tests or because of empty fields in the tables.
  2. it can happen that, after the decompression of the POOL strings, some spurious characters (like 41435a) are left in the POOL fragment related to a run; in this case they have to be removed with the sed command already included in the scripts (a sketch is shown after this list).
  3. in addition to the expected META data related to the right owners, other ones (mostly Configuration files) sometimes have to be downloaded and published in the POOL xml catalogue as well. This problem comes from an occasionally wrong initialization procedure of the Digi META data at CERN and cannot be avoided; about 5-10% of the datasets should be affected by it.
  4. problems related to an old version of POOL: the catalogue has to be migrated to the new format with the command XMLmigrate_POOL1toPOOL1.4.
  5. FCrenamePFN is very slow at replacing the dummy paths with the local paths of the files in large xml catalogues!
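The cleanup of point 2 is essentially a one-line sed filter; the fragment file name and the spurious sequence below are illustrative (41435a is the example quoted above).

  #!/bin/sh
  # Hedged sketch of point 2: strip a spurious character sequence from
  # the decompressed POOL fragment of a run before attaching it.
  # The fragment file name is a placeholder.

  RUNFRAG=run_pool_fragment.xml
  sed 's/41435a//g' ${RUNFRAG} > ${RUNFRAG}.clean && mv ${RUNFRAG}.clean ${RUNFRAG}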


7
  1. Sometimes the CARFResume runid string in the smry files is different from the one in RefDB because of multiple submitted jobs (in RefDB the tables are only updated the first time the smry file is sent), so it is better to extract them from RefDB in order to access validated information (see the sketch after this list).
  2. Sometimes the runid extracted from the RefDB tables is not correct, so the script has to be tuned to work properly (only the field number has to be changed).
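A hedged sketch of the comparison in point 1; the saved RefDB page, the field separator, the field number and the smry naming are assumptions, and the field number is exactly the value that has to be retuned when the RefDB tables change (point 2).

  #!/bin/sh
  # Hedged sketch of points 1-2: take the CARFResume runid string from
  # a locally saved RefDB page and compare it with the smry files.
  # Page dump name, field separator/number and smry naming are
  # illustrative assumptions.

  REFDB_DUMP=refdb_dataset_page.txt   # RefDB web page saved beforehand

  # the field number ($4 here) is the part that typically needs retuning
  awk -F'|' '/CARFResume/ {print $4}' ${REFDB_DUMP} > runid_from_refdb.txt

  grep -h CARFResume *.smry > runid_from_smry.txt 2>/dev/null || true

  diff runid_from_refdb.txt runid_from_smry.txt >/dev/null \
    && echo "smry and RefDB agree" \
    || echo "mismatch: trust the RefDB value"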

8
  1. Publishing data for analysis at a Tier-1/2 site is possible now!
  2. The global procedure can be optimized and automatized (this is under discussion in DAPROM)
  3. I'm in contact with KA (A. Schmidt) about creating scripts for analysis job submission on the Grid

All the scripts and the documentation are available in the file kit_for_analysis.tar.gz at http://webcms.ba.infn.it/cms-software/orca
For information, mail to Nicola.Defilippis@ba.infn.it