Enabling Grid Computer for HEP - PowerPoint PPT Presentation

About This Presentation
Description:

Guinea Pig: James. Goal: integration and support. Don't care with computers, grid, popcorn machine: if available, they use them ... – PowerPoint PPT presentation

Slides: 22
Provided by: Jam991

Transcript and Presenter's Notes


1
Enabling Grid Computer for HEP
  • Babar Team at
  • University of Manchester
  • Resources www.hep.man.ac.uk/u/jamwer

2
Human resource strategy
Jobs with 5 events instead of millions.
3
Resources Strategy
4
Grid Test Bed
5
(No Transcript)
6
Software: 850 packages.
Tau datasets range between 60 files 1GB and 150 files 1GB.
Total: 4,000 GB, 10,000 files.
7
Analysis Submission to Grid
(Prototype)
  • Single command: ./easygrid dataset_name
  • Performs handler management and submission
  • Software based on a state machine
  • Verify that skimdata is available
  • If not available, run BbkDatasetTCL to
    generate skimData. Each file will be a job.
  • Verify whether there are handlers pending
  • If not, generate the script (gera.c) with
    edg-job-submit and ClassAds, and execute
    it. Nest for submission policy and
    optimisation.
  • If yes, verify job status. When all jobs
    have ended, recover results in the user folder.
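
The decision flow in the bullets above can be sketched as a small state machine. This is only an illustration in Python, not the prototype's code: the state names and the three boolean inputs are hypothetical, and the real tool would obtain them by querying skimdata, stored handlers and the grid.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical state names, chosen to mirror the slide's bullets.
    CHECK_SKIMDATA = auto()
    GENERATE_SKIMDATA = auto()
    CHECK_HANDLERS = auto()
    SUBMIT = auto()
    CHECK_STATUS = auto()
    RECOVER = auto()
    WAIT = auto()
    DONE = auto()


def next_state(state, skimdata_ready, handlers_pending, all_jobs_ended):
    """Return the state that follows `state` under the given conditions."""
    if state is State.CHECK_SKIMDATA:
        return State.CHECK_HANDLERS if skimdata_ready else State.GENERATE_SKIMDATA
    if state is State.GENERATE_SKIMDATA:
        return State.CHECK_HANDLERS
    if state is State.CHECK_HANDLERS:
        return State.CHECK_STATUS if handlers_pending else State.SUBMIT
    if state is State.CHECK_STATUS:
        return State.RECOVER if all_jobs_ended else State.WAIT
    # SUBMIT, RECOVER and WAIT all end the current invocation.
    return State.DONE


def run(skimdata_ready, handlers_pending, all_jobs_ended):
    """Walk the machine from the start and return the visited states."""
    state = State.CHECK_SKIMDATA
    visited = [state]
    while state is not State.DONE:
        state = next_state(state, skimdata_ready, handlers_pending, all_jobs_ended)
        visited.append(state)
    return visited


if __name__ == "__main__":
    # First invocation: no skimdata yet, no handlers stored.
    print([s.name for s in run(False, False, False)])
```

Each invocation of the command walks the machine once, which is why the same command line can be used for submission, status checking and result recovery.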

8
Generation and submission
  • [jamwer@bfb babar]$ ./easygrid SP-1005-Tau11-R14
  • Invalid configuration filename
    /opt/edg/etc/vomses
  • Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
  • Enter GRID pass phrase for this identity:
  • Creating temporary proxy .........................
    ................................ Done
  • Creating proxy ...................................
    ................. Done
  • Searching pre selected skimdata.
  • Searching previous handlers.
  • Handlers not found. Submiting to GRID . Wait end
    of process...

9
Job Status
  • [jamwer@bfb babar]$ ./easygrid SP-1005-Tau11-R14
  • Invalid configuration filename
    /opt/edg/etc/vomses
  • Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
  • Enter GRID pass phrase for this identity:
  • Creating temporary proxy .........................
    ... Done
  • Creating proxy ...............................
    Done
  • Searching pre selected skimdata.
  • Searching previous handlers. Checking if jobs
    finished.
  • Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg
  • Current Status: Scheduled
  • https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg still pendent.
  • Handle -> https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA
  • Current Status: Scheduled
  • https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA still pendent.
  • 4 jobs did not finished ! Try again later.
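
The status check shown in this output can be sketched as follows. This is a hedged illustration in Python: the function names are hypothetical, the status table is passed in as a plain dictionary rather than queried from the Resource Broker, and treating only "Done" as finished is an assumption based on the statuses visible on the slides.

```python
# Assumption: only "Done" counts as finished; "Scheduled", "Ready" and any
# other status are treated as still pending.
FINISHED = {"Done"}


def pending_handles(statuses):
    """Return the handles whose jobs have not finished yet."""
    return [handle for handle, status in statuses.items() if status not in FINISHED]


def summary(statuses):
    """Build a one-line summary similar in spirit to the tool's last line."""
    pending = pending_handles(statuses)
    if pending:
        return f"{len(pending)} jobs did not finish. Try again later."
    return "All jobs done. Recovering results."


if __name__ == "__main__":
    statuses = {
        "https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg": "Scheduled",
        "https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA": "Scheduled",
    }
    print(summary(statuses))
```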

10
Job Status and recovery
  • [jamwer@bfb babar]$ ./easygrid SP-1005-Tau11-R14
  • Invalid configuration filename
    /opt/edg/etc/vomses
  • Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
  • Enter GRID pass phrase for this identity:
  • Creating temporary proxy .........................
    ................. Done
  • Creating proxy ...................................
    ........................ Done
  • Searching pre selected skimdata. Searching
    previous handlers.
  • Checking if jobs finished.
  • Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg
  • Current Status: Done
  • Exit code: 0
  • Handle -> https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA
  • Current Status: Done
  • Exit code: 0
  • 0 jobs did not finished ! Try again later.
  • All jobs done. Recovering results in your folder.
    Results in the following folders:
    /home/jamwer/grid_sub/babar/jamwer_foRHhWyeDBnbqA9JkDADLg
    /home/jamwer/grid_sub/babar/jamwer_8DdK3xruxtevNpei3zZbaA
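
The result folders listed above appear to be named after the user and the last component of each job handle. A minimal Python sketch of that naming rule, assuming the handle is a URL of the form https://host:9000/id (the function name is hypothetical and the rule is inferred from the output, not taken from the prototype's source):

```python
import posixpath
from urllib.parse import urlparse


def result_folder(base_dir, user, handle):
    """Build the folder path <base_dir>/<user>_<job id> for a job handle."""
    # The job id is the path part of the handle URL, without the leading slash.
    job_id = urlparse(handle).path.lstrip("/")
    return posixpath.join(base_dir, f"{user}_{job_id}")


if __name__ == "__main__":
    print(result_folder(
        "/home/jamwer/grid_sub/babar",
        "jamwer",
        "https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA",
    ))
```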

11
Monte Carlo Submission to Grid
(Prototype)
  • Single command: ./mcgrid JobName num_copies
  • Performs handler management and submission.
  • Software based on a state machine
  • Verify whether there are handlers pending
  • If not, generate a script (geramc.c) with
    edg-job-submit and ClassAds for each copy, and
    execute it. Nest for submission policy and
    optimisation.
  • If yes, verify job status. When all jobs
    have ended, recover results in the user folder.
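
The per-copy script generation step can be illustrated with a short sketch that writes one job description (ClassAd-style JDL text) per copy. This is written in Python for brevity and is not geramc.c: the function names, the executable name run_mc.sh and the file naming are all hypothetical, and the JDL attributes shown are only a plausible minimal set.

```python
def make_jdl(job_name, copy_index, executable="run_mc.sh"):
    """Return JDL text for one Monte Carlo copy (names are illustrative)."""
    tag = f"{job_name}_{copy_index}"
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{job_name} {copy_index}";',
        f'StdOutput = "{tag}.out";',
        f'StdError = "{tag}.err";',
        f'InputSandbox = {{"{executable}"}};',
        f'OutputSandbox = {{"{tag}.out", "{tag}.err"}};',
    ]
    return "\n".join(lines) + "\n"


def make_all(job_name, num_copies):
    """Map a JDL file name to its contents for copies 1..num_copies."""
    return {
        f"{job_name}_{i}.jdl": make_jdl(job_name, i)
        for i in range(1, num_copies + 1)
    }


if __name__ == "__main__":
    # Mirrors the command line "./mcgrid MCteste 3" used on the next slides.
    for name, text in make_all("MCteste", 3).items():
        print(f"--- {name}\n{text}")
```

In such a scheme each generated file would then be handed to edg-job-submit, and the returned handle stored for the later status checks.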

12
MC Submission
  • [jamwer@bfb mcgrid1]$ ./mcgrid MCteste 3
  • Invalid configuration filename
    /opt/edg/etc/vomses
  • Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
  • Enter GRID pass phrase for this identity:
  • Creating temporary proxy .........................
    ........ Done
  • Creating proxy ...................................
    .................... Done
  • Searching previous handlers. Handlers not found.
  • Submiting to GRID . Wait end of process...

13
Job Status
  • [jamwer@bfb mcgrid1]$ ./mcgrid MCteste 3
  • Invalid configuration filename
    /opt/edg/etc/vomses
  • Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
  • Enter GRID pass phrase for this identity:
  • Creating temporary proxy .........................
    ............... Done
  • Creating proxy ...................................
    .... Done
  • Searching previous handlers. Checking if jobs
    finished.
  • Handle -> https://lxn1188.cern.ch:9000/9WzceoIMEQoTK24a-UvOmw
  • Current Status: Scheduled
  • https://lxn1188.cern.ch:9000/9WzceoIMEQoTK24a-UvOmw still pendent.
  • Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/c4iCB8vioozaGteI9hybIg
  • Current Status: Ready
  • https://lcgrb01.gridpp.rl.ac.uk:9000/c4iCB8vioozaGteI9hybIg still pendent.
  • Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/L5BD1OE--eckTm5RXkp2nA
  • Current Status: Ready
  • https://lcgrb01.gridpp.rl.ac.uk:9000/L5BD1OE--eckTm5RXkp2nA still pendent.
  • 3 jobs did not finished ! Try again later.

14
Job status and recovery
  • [jamwer@bfb mcgrid1]$ ./mcgrid MCteste 3
  • Invalid configuration filename
    /opt/edg/etc/vomses
  • Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
  • Enter GRID pass phrase for this identity:
  • Creating temporary proxy .........................
    ......................... Done
  • Creating proxy ...................................
    ................. Done
  • Searching previous handlers. Checking if jobs
    finished.
  • Handle -> https://lxn1188.cern.ch:9000/9WzceoIMEQoTK24a-UvOmw
  • Current Status: Done
  • Exit code: 0
  • Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/c4iCB8vioozaGteI9hybIg
  • Current Status: Done
  • Exit code: 0
  • 0 jobs did not finished ! Try again later.
  • All jobs done. Recovering results in your folder.
    Results in the following folders:
    /home/jamwer/grid_sub/mcgrid1/jamwer_9WzceoIMEQoTK24a-UvOmw
    /home/jamwer/grid_sub/mcgrid1/jamwer_c4iCB8vioozaGteI9hybIg
    /home/jamwer/grid_sub/mcgrid1/jamwer_L5BD1OE--eckTm5RXkp2nA

15
Testing Submission Script
  • Load range: worker load x files
  • 16 x 60 files = 960 jobs pending
  • 16 x 150 files = 2400 jobs pending
  • Test with Submission script

"sslv3 alert handshake failure". "Please wait
job enter the Done status": this never
happens! The Resource Broker is not reliable or robust.
Sometimes it fails 3 days a week, or takes hours to
submit/dispatch to a CE (empty!).
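
The load figures on this slide follow from one job per file multiplied by the worker load. A small Python check of that arithmetic (the function name is hypothetical):

```python
def pending_jobs(worker_load, files_per_dataset):
    """Number of pending jobs when every file becomes one job per unit of load."""
    return worker_load * files_per_dataset


if __name__ == "__main__":
    # Smallest and largest Tau datasets quoted in the talk: 60 and 150 files.
    for files in (60, 150):
        print(f"16 x {files} files = {pending_jobs(16, files)} jobs")
```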
16
Pending Infrastructure => Course of action
  • Babar software know-how is not available at
    Manchester => Web Page Network skills.
  • Quality Assurance => We are OK! from benchmark (E
    x P)
  • A real application to perform the complete cycle,
    acquire know-how, and serve as a grid proof-of-concept is
    missing => Partnership with physicists
  • CERN does NOT recognise the Babar community => Let's
    reduce their priority!
  • RB at Manchester => 60MB binaries and policies
    freedom.
  • SE/RC at Manchester => policies and submission
    jobs freedom.
  • Mass storage (10TB) for Babar purposes => CAP!
  • UI in the AFS => wide access to Manchester farms.
  • Apprenticeship at RAL and later at SLAC
    production and experiment => improve where others
    fail
  • Configuration for optimal job performance/submission
    at Tier 2 (1 CE x 50 WN? Performance of dCache
    with Babar software? Why 10TB if Liverpool bought
    80TB? Electricity bill?) => analyse procedures to
    improve QoS and better site configuration
  • Update (software and data) and operational
    policies => operational standards to achieve high
    QoS

17
Aimed Hardware Architecture
(Redundant RB with alternate access)
18
Aimed Software Architecture
19
Production Job Submission Package
  • Operational policies/integration with RB
    (application level).
  • Recovery of aborted status.
  • Resources optimisation.
  • Integration with RC (application level) for
    developing replica policies.
  • Interactive data visualisation (Useful?)
  • Integration with GridSite (Data visualisation,
    analysis, performance monitor, and submission)
  • Professional version.

20
Summary
Integrate LCG2 and job submission with
Babar/CM2 at the University of Manchester for Tau
physics modelling, analysis and MC generation.
  • We aim to be soon:
  • The largest site in the UK.
  • A leader in grid computing and HEP

21
Conclusion
  • Babar CM2 is running at Manchester!
  • LCG2 Grid is running with real world experiment!
  • Babar submission prototype to Grid is running !
  • LCG is not LHC-only software! It is Babar's too.
  • We are doing today what will take you years to
    achieve. Let's work together!