Status GridKa - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Status GridKa


1
Status GridKa / ALICE T2 in Germany
  • Kilian Schwarz
  • GSI Darmstadt

2
ALICE T2
  • Present status
  • Plans and timelines
  • Issues and problems

3
Status GridKa
  • Pledged 600 kSI2k, delivered 133%; 11% of ALICE jobs (last month)

[Monitoring plots: FZK, CERN]
4
GridKa main issue
  • Resources provided according to megatable
  • The share among Tier1s comes automatically when
    considering the Tier2s connecting to this Tier1
  • GridKa pledges 2008: tape 1.5 PB, disk 1 PB
  • Current megatable: tape 2.2 PB!!!
  • → Much more than pledged, more than all other experiments together; most of the additional demand is due to the Russian T2 (0.8 PB)

The point is that the money is fixed. In principle a switch between tape/disk/CPU should be possible, though not on short notice. For 2009 things can still be changed.
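As a quick cross-check of the tape numbers, using only the figures quoted on this slide:

```python
# Gap between the current megatable request and the 2008 GridKa tape pledge (PB).
pledged_tape = 1.5        # GridKa 2008 tape pledge
megatable_tape = 2.2      # current megatable request
russian_t2 = 0.8          # additional demand attributed to the Russian T2

gap = megatable_tape - pledged_tape
print(f"requested beyond pledge: {gap:.1f} PB")        # 0.7 PB
print(f"Russian T2 contribution: {russian_t2:.1f} PB") # 0.8 PB
```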
5
GridKa: one more issue
  • Disk cache in front of the mass storage: how to compute this value?
  • Suggestion:
  • It depends strongly on the ALICE computing model, and therefore the formula to compute it should be the same for all T1 centres.
  • The various parameters in the formula should be defined by the individual sites, according to the actual MSS implementation (dCache, DPM, xrootd, ...); a sketch of such a formula follows below.
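A minimal sketch of what such a common formula could look like; the sizing rule, parameter names, and example numbers are illustrative assumptions, not the ALICE computing model:

```python
def mss_disk_cache_tb(ingest_rate_mb_s: float,
                      retention_days: float,
                      overhead: float = 1.2) -> float:
    """Hypothetical disk-cache size (TB) for a buffer in front of the MSS:
    sustained ingest rate times the time files stay on disk before migration,
    plus a safety overhead. Each T1 would plug in its own site-specific
    parameters (dCache, DPM, xrootd, ...)."""
    seconds = retention_days * 24 * 3600
    return ingest_rate_mb_s * seconds * overhead / 1e6  # MB -> TB

# Example: 60 MB/s sustained ingest, files kept 5 days before migration.
print(f"{mss_disk_cache_tb(60, 5):.1f} TB")  # ~31 TB
```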

6
ALICE T2 present status
[Site diagram: CERN and GridKa connect to GSI via the Grid over a 150 Mbps link. GSI runs a vobox and an LCG RB/CE in front of the GSI batch farm (39 nodes / 252 cores for ALICE) with GSIAF (14 nodes) for PROOF/batch. Storage: 30 TB + 120 TB ALICE::GSI::SE::xrootd, plus 55 TB directly attached disk as ALICE::GSI::SE_tactical::xrootd.]
7
Present Status
  • ALICE::GSI::SE::xrootd
  • >30 TB disk on fileservers (8 fileservers with 4 TB each)
  • 120 TB disk on fileservers
  • 20 fileservers, 3U, 15×500 GB disks, RAID 5
  • 6 TB user space per server
  • Batch farm/GSIAF and ALICE::GSI::SE_tactical::xrootd
  • nodes dedicated to ALICE (39 nodes / 252 cores; cross-checked below)
  • 15 D-Grid funded boxes, each:
  • 2×2-core 2.67 GHz Xeon, 8 GB RAM
  • 2.1 TB local disk space on 3 disks + system disk
  • additionally 24 new boxes, each:
  • 2×4-core 2.67 GHz Xeon, 16 GB RAM
  • 2.0 TB local disk space on 4 disks, including system
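A quick consistency check of the box counts above against the "39 nodes / 252 cores for ALICE" figure from the site overview, assuming the 2×2-core and 2×4-core readings:

```python
# D-Grid funded boxes: 2 x 2-core Xeon; newer boxes: 2 x 4-core Xeon.
old_boxes, old_cores_per_box = 15, 2 * 2
new_boxes, new_cores_per_box = 24, 2 * 4

nodes = old_boxes + new_boxes
cores = old_boxes * old_cores_per_box + new_boxes * new_cores_per_box
print(nodes, cores)  # 39 252
```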

8
ALICE T2 short term plans
  • Extend GSIAF to all 39 nodes
  • Study the coexistence of interactive and batch processes on the same machines; develop the possibility to increase/decrease the number of batch jobs on the fly to give priority to analysis (sketched below)
  • Add the newly bought fileservers (about 120 TB disk space) to ALICE::LCG::SE::xrootd
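A minimal sketch of the on-the-fly batch/interactive balancing mentioned above; the helper, its signature, and the one-core-per-session policy are illustrative assumptions, not GSI's actual setup:

```python
def batch_slots(cores_per_node: int, active_proof_sessions: int,
                cores_per_session: int = 1) -> int:
    """Give interactive GSIAF/PROOF users priority: reserve cores for the
    active PROOF sessions and hand the remainder of the node to batch jobs."""
    reserved = min(active_proof_sessions * cores_per_session, cores_per_node)
    return max(cores_per_node - reserved, 0)

# Example: an 8-core node with 3 analysts running PROOF keeps 5 batch slots.
print(batch_slots(8, 3))  # 5
```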

9
ALICE T2 medium term plans
  • Add 25 additional nodes to the GSI batch farm/GSIAF, to be financed via a 3rd-party project (D-Grid)
  • Upgrade the GSI network connection to 1 Gb/s, either as a dedicated line to GridKa (a direct T2 connection to the T0 is problematic) or as a general internet connection

10
ALICE T2 ramp up plans
 http://lcg.web.cern.ch/LCG/C-RRB/MoU/WLCGMoU.pdf
11
Plans for the ALICE Tier 2/3 at GSI
  • Remarks:
  • 2/3 of that capacity is for the Tier 2 (ALICE central use, fixed via the WLCG MoU)
  • 1/3 is for the Tier 3 (local usage; may be used via Grid)
  • according to the ALICE computing model, no tape for the Tier 2
  • tape for the Tier 3 is independent of the MoU
  • heavy-ion run in October → upgrade operational in Q3 each year

12
ALICE T2/T3
Language definition according to the GSI interpretation: ALICE T2 = central use, ALICE T3 = local use. T3 resources may be used via Grid, but they are not pledged resources.
  • Remarks related to ALICE T2/3:
  • at the T2 centres are the physicists who know what they are doing
  • analysis can be prototyped quickly with the experts close by
  • GSI requires flexibility for optimising the ratio of calibration/analysis vs. simulation at the Tier 2/3

13
ALICE T2 use cases (see computing model)
  • Three kinds of data analysis:
  • Fast pilot analysis of the data just collected, to tune the first reconstruction, at the CERN Analysis Facility (CAF)
  • Scheduled batch analysis using the GRID (Event Summary Data and Analysis Object Data)
  • End-user interactive analysis using PROOF and the GRID (AOD and ESD)

CERN: does first-pass reconstruction; stores one copy of RAW, calibration data and first-pass ESDs.
T1: does reconstructions and scheduled batch analysis; stores the second collective copy of RAW, one copy of all data to be kept, and disk replicas of ESDs and AODs.
T2: does simulation and end-user interactive analysis; stores disk replicas of AODs and ESDs.
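Restated as a small lookup table, only summarising the computing-model text above:

```python
# ALICE computing-model roles per tier, as listed on this slide.
TIER_ROLES = {
    "CERN": {"does": ["first-pass reconstruction"],
             "stores": ["one copy of RAW", "calibration data", "first-pass ESDs"]},
    "T1":   {"does": ["reconstructions", "scheduled batch analysis"],
             "stores": ["second collective copy of RAW", "one copy of all data to be kept",
                        "disk replicas of ESDs and AODs"]},
    "T2":   {"does": ["simulation", "end-user interactive analysis"],
             "stores": ["disk replicas of AODs and ESDs"]},
}
```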
14
Data reduction in ALICE
[Diagram: data reduction chain, RAW 14 MB/ev and RAW 1.1 MB/ev]

15
Data transfers CERN → GSI
  • Motivation: the calibration model and algorithms need to be tested before October
  • Test the functionality of the current T0/T1 → T2 transfer methods
  • At GSI the CPU and storage resources are available, but how do we bring the data here?

16
Data transfer CERN → GSI
  • The system is not yet ready for generic use; therefore expert control by a mirror master at CERN is necessary.
  • In principle, individual file transfer now works fine. Plan the next transfers with Pablo's new collection-based commands and a webpage where transfer requests can be entered and the transfer status can be followed up.
  • So far about 700 ROOT files have been successfully transferred, corresponding to about 1 TB of data.
  • 30 of the newest requests are still pending.
  • Maximum speed achieved so far: 15 MB/s (almost the complete bandwidth of GSI), but only for a relatively short time (see the estimate below).
  • Since August 8 no relevant transfers anymore. Reasons:
  • August 8: pending xrootd update at the Castor SE
  • August 14: GSI SE failure due to network problems
  • August 20: instability of the central AliEn services; production comes first
  • until recently: AliEn update
  • GSI plans to analyse the transferred data ASAP
    and to continue with more transfers. Also PDC
    data need to be transferred for prototyping and
    testing of analysis code.
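A back-of-the-envelope check of the transfer figures, using only the ~1 TB / 700 files and 15 MB/s peak values quoted on this slide:

```python
# Rough numbers from the transfer status above.
total_bytes = 1e12   # ~1 TB transferred so far
n_files = 700        # ROOT files
peak_rate = 15e6     # 15 MB/s peak, close to GSI's full bandwidth

print(f"average file size ~ {total_bytes / n_files / 1e9:.1f} GB")       # ~1.4 GB
print(f"1 TB at the peak rate ~ {total_bytes / peak_rate / 3600:.1f} h") # ~18.5 h
```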

17
Data transfer CERN → GSI
18
ALICE T2 problems and issues
  • Where do we get our kSI2k values from for the monitoring of CPU usage? Currently http://www.spec.org/cpu/results (but, as e.g. HEPiX notes for Intel CPUs, the full published performance is not available to typical HEP applications, since the SPEC results are optimised with Intel compilers, etc.); a sketch of one possible accounting follows below.
  • How to do the comparison between the values published in ALICE and in WLCG?
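A minimal sketch of one possible kSI2k accounting, assuming the site assigns each CPU type a SPEC-based rating; the rating used here is a made-up placeholder, and which published value to trust is exactly the open question above:

```python
# Hypothetical conversion of consumed CPU time into kSI2k-hours.
KSI2K_PER_CORE = {"xeon_2.67GHz": 1.5}   # assumed rating, not an official number

def ksi2k_hours(cpu_hours: float, cpu_type: str) -> float:
    """Scale raw CPU-hours by the per-core kSI2k rating of the CPU type."""
    return cpu_hours * KSI2K_PER_CORE[cpu_type]

print(ksi2k_hours(1000, "xeon_2.67GHz"))  # 1500.0 kSI2k-hours
```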