Expanding the PHENIX Reconstruction Universe

1
Expanding the PHENIX Reconstruction Universe
  • C.F. Maguire, P. Sheldon, A. Tackett
  • Vanderbilt University

2
Outline
  • Why we must augment PHENIX reconstruction sites
  • Description of the ACCRE facility
  • What ACCRE can propose to PHENIX
  • Missing information or infrastructure?
  • How should we proceed?

3
Need to Expand PHENIX Reconstruction Universe
  • Run4 Experience (data from Table 1 of Run6 BUP)
  • 270 TBytes of AuAu 200 GeV data taken,
    corresponding to 241 μb^-1, with data taking
    completed by June 2004 (10 TBytes of 62.4 GeV
    AuAu and 35 TBytes of pp 200 GeV data also
    taken)
  • Last of Run4 data reconstruction and analysis
    completed only slightly before (May-June) QM2005
    - a long wait by all
  • Run6 Planning
  • Hope to obtain a factor of 4 increase in AuAu
    200 GeV data size (last run for AuAu with
    minimum radiation length in central arm)
  • How do we plan to reconstruct this 1 PByte data
    set? Can we have significant amounts of data in
    time for QM06 (Nov. 2006?), QM07?
  • Deliberately provocative statement to us all from
    the spokesperson: "Reconstruction time is
    unsolved and unmanageable at this point."
  • One solution: expand the universe of PHENIX
    reconstruction facilities, building on what we
    learn from similar efforts in Run5

4
Off-Site Reconstruction in Run5 (as quoted in
Run6 BUP)
  • Level 2 triggered data reconstructed at ORNL
  • Impressive showing of J/Psi CuCu results at
    QM05
  • Excellent near-real-time feedback on quality of
    J/Psi data during the run itself
  • ORNL wants to expand on its capability for future
    Runs
  • Run5 pp polarized data to CC-J
  • Well publicized 60-day continuous transfer of
    data from counting house buffer boxes to Riken
    computer center in Japan
  • Highlighted at last month's JPS/DNP meeting in
    Maui. Also a main article in the CERN Courier
    newsletter this summer
  • 270 TBytes of data were transferred, corresponding
    to a sustained rate of 60 MBytes/second (special
    network topology)
  • Data stored in HPSS at CC-J to be reconstructed
    later for analysis presentations during the October
    2005 PANIC meeting.
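
A quick consistency check on the transfer figures above can be written in a few lines of Python. This is only a sketch: it assumes decimal units (1 TByte = 10^12 bytes) and a transfer spread evenly over the full 60 days, neither of which the slide states, and the function name is illustrative.

```python
# Sanity check of the Run5 transfer to CC-J quoted on this slide.
TB = 1e12                  # assumption: decimal TBytes
MB = 1e6                   # assumption: decimal MBytes
SECONDS_PER_DAY = 24 * 3600


def average_rate_mb_per_s(volume_tb, days):
    """Average rate needed to move volume_tb in the given number of days."""
    return volume_tb * TB / (days * SECONDS_PER_DAY) / MB


# 270 TBytes over the 60-day continuous transfer.
run5_average = average_rate_mb_per_s(270, 60)
print(f"Average over 60 days: {run5_average:.0f} MBytes/second")
```

Under these assumptions the 60-day average comes out somewhat below the quoted 60 MBytes/second. The slide does not say whether the quoted figure is a whole-period average or the rate while data were actually flowing, so the difference may simply reflect rounding or idle periods.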

5
What is ACCRE at Vanderbilt?
  • ACCRE
  • Advanced Computing Center for Research and
    Education
  • Collaborative $8.5M computing resource funded by
    Vanderbilt
  • Presently consists of over 1500 processors and 50
    TB of disk (VU group has its own dedicated 4.5 TB
    for PHENIX simulations)
  • Much work by Medical Center and Engineering
    school researchers as well as by Physics
    Department groups
  • ACCRE eager to get into physics experiment
    reconstruction: first PHENIX and then CMS
  • Previous PHENIX Use of ACCRE
  • First used extensively for supporting QM02
    simulations
  • Order-of-magnitude increase in work for the QM05
    simulations
  • QM05 simulation effort hardly came close to
    tapping ACCRE's full potential use for PHENIX
  • Discovered that the major roadblock to expanding
    use was the need to gain an order of magnitude
    increase in sustained, reliable I/O rate back to
    BNL

6
What ACCRE Can Propose (subject to actual
benchmarking on ACCRE CPUs)
  • Assume the PHENIX Run6 BUP scenario
  • Begin with 13 weeks of AuAu at 200 GeV, goal of
    1 nb^-1
  • Data will be a mix of triggered and min bias
  • Assume that 1 PByte will eventually be generated;
    this corresponds to 127 MBytes/second (!) in a
    13 week period (can DAQ really do this?)
  • ACCRE proposes to process 15% of these data (150
    TBytes)
  • Corresponds to 19 MBytes/second sustained
    transfer to ACCRE. This is 1/3 the rate achieved
    to CC-J from the BNL counting house
  • Data would be reconstructed in near-real time at
    ACCRE, since no large archival system is available
    at Vanderbilt
  • 10K Min Bias Run4 events reconstructed in 7 CPU
    hours (Carla Vale e-mail)
  • Run4: 270 TBytes = 1.3 billion events -> 720
    million events net to ACCRE
  • Steady state requires 230 CPUs running
    continuously for the 13 weeks in order to
    reconstruct these 720 million events (~500K
    CPU-hrs total for 150 TBytes)
  • Realistic duty (safety) factor 0.7 means 330 CPUs
    should be available
  • Reconstructed output must be returned immediately
    to BNL
  • Assume reconstructed output data size is 25% of
    input data size (?)
  • This would require 5 MBytes/second sustained on
    the return trip to BNL
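
The estimates on this slide can be reproduced with a short back-of-envelope script. It is a sketch rather than a benchmark: it assumes decimal units (1 TByte = 10^12 bytes), a 13-week run of 91 full days, and that Run6 events have the same size and per-event reconstruction time as the Run4 figures quoted above. All variable names are illustrative.

```python
# Back-of-envelope check of the ACCRE proposal numbers.
TB = 1e12                      # assumption: decimal TBytes
MB = 1e6                       # assumption: decimal MBytes
RUN_WEEKS = 13
run_seconds = RUN_WEEKS * 7 * 24 * 3600
run_hours = run_seconds / 3600

# Data volumes: 1 PByte total, 15% of it sent to ACCRE.
total_bytes = 1000 * TB
accre_bytes = 0.15 * total_bytes

daq_rate_mb = total_bytes / run_seconds / MB      # rate out of the DAQ
accre_rate_mb = accre_bytes / run_seconds / MB    # rate into ACCRE

# Run4 scaling: 270 TBytes held 1.3 billion events, and
# 10K min bias events took 7 CPU hours to reconstruct.
events_per_byte = 1.3e9 / (270 * TB)
cpu_hours_per_event = 7 / 10_000

accre_events = accre_bytes * events_per_byte
total_cpu_hours = accre_events * cpu_hours_per_event
steady_cpus = total_cpu_hours / run_hours         # CPUs at 100% duty
cpus_needed = steady_cpus / 0.7                   # duty (safety) factor 0.7

# Output assumed to be 25% of the input size.
return_rate_mb = 0.25 * accre_rate_mb

print(f"DAQ rate:        {daq_rate_mb:.0f} MBytes/second")
print(f"Rate to ACCRE:   {accre_rate_mb:.0f} MBytes/second")
print(f"Events at ACCRE: {accre_events / 1e6:.0f} million")
print(f"CPU-hours:       {total_cpu_hours / 1e3:.0f}K")
print(f"Steady CPUs:     {steady_cpus:.0f}")
print(f"CPUs with duty:  {cpus_needed:.0f}")
print(f"Return rate:     {return_rate_mb:.1f} MBytes/second")
```

Under these assumptions every value lands within rounding of the slide's figures. The per-event time is the most uncertain input, since it comes from Run4 reconstruction and not from a benchmark on ACCRE CPUs.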

7
Missing Information and Infrastructure
  • What will RHIC be running in Run6 and when?
  • Does it make more sense to reconstruct the Level2
    triggered events instead of the MB events?
  • This is what ORNL did for Run5 with many fewer
    CPUs
  • What are the event reconstruction times on ACCRE
    CPUs?
  • Missing infrastructure?
  • We must transfer the data while it is still on
    the buffer boxes
  • Can the special network topology created for the
    Run5 pp data transfer to CC-J be expanded to
    accommodate transfers to ACCRE? Can the buffer
    boxes handle the additional I/O load?
  • QM05 simulations used the BBFTP tool to RCF but
    this was too slow. We want to start with gridFTP
    on ACCRE (must still be demonstrated)
  • How much additional disk space do we need at
    ACCRE? At 25 MBytes/second, 30 TBytes
    corresponds to a two-week buffer.
  • What about newer alternatives to gridFTP, e.g.
    IBP depots?
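
The disk-buffer estimate above follows from a simple rate-times-time calculation, sketched below. The helper names are hypothetical, and the sketch assumes decimal units (1 TByte = 10^12 bytes, 1 MByte = 10^6 bytes) and a constant I/O rate over the whole period.

```python
# Disk-buffer sizing at a constant sustained I/O rate.
TB = 1e12                  # assumption: decimal TBytes
MB = 1e6                   # assumption: decimal MBytes
SECONDS_PER_DAY = 24 * 3600


def disk_needed_tb(days, rate_mb_per_s):
    """Disk space (TBytes) filled in `days` at the given rate."""
    return days * SECONDS_PER_DAY * rate_mb_per_s * MB / TB


def buffer_days(disk_tb, rate_mb_per_s):
    """Days of data that `disk_tb` TBytes can hold at the given rate."""
    return disk_tb * TB / (rate_mb_per_s * MB) / SECONDS_PER_DAY


print(f"Two weeks at 25 MBytes/second: {disk_needed_tb(14, 25):.1f} TBytes")
print(f"30 TBytes at 25 MBytes/second: {buffer_days(30, 25):.1f} days")
```

The same helpers can be reused to see how the buffer requirement changes if the achieved transfer rate differs from the assumed 25 MBytes/second.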

8
How Should We Proceed?
  • Coordination needed within PHENIX and with RCF
  • The Run5 remote sites will want to continue their
    efforts in Run6
  • Coordination needed between the sites to share
    available bandwidth. There was obvious BBFTP
    competition between CC-J and VU in the summer
  • What new infrastructure is needed at BNL to
    support this effort? Will transfer of
    reconstructed output into HPSS become an issue?
  • A proposal will be made to DOE to support this
    effort
  • This work should not become a net cost to ACCRE
  • DOE is getting the benefit of 15% faster
    turnaround in the analysis
  • The 330 CPUs are available for sure in Run6, but
    how do we ensure that another VU group doesn't
    budget for them in the future?