1
GRID-based data handling for CDF-UK
  • Ian McArthur
  • Physics Department, Oxford University
  • Ian.McArthur_at_physics.ox.ac.uk

2
The CDF Experiment
  • Tevatron proton-antiproton collider generating 7 million
    collisions/sec
  • CDF data rates: 250 KB/event, roughly 2 TB/sec off the
    detector
  • A 3-stage trigger reduces this to 100 events/sec (25 MB/sec),
    which goes to tape
  • Runs about 1/3 of the time, generating 700 GB/day
  • 250 TB/year over 4 years (a quick sanity check of these
    figures follows below)
  • Data expected March 2001
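As a rough cross-check (not part of the original slides; the constants below are simply the figures quoted above), the numbers hang together as follows:

```java
// Sanity check of the CDF data-rate figures quoted on this slide.
// All constants come from the slide; nothing here is measured.
public class CdfRateCheck {
    public static void main(String[] args) {
        double eventKB = 250;            // KB per event
        double collisionsPerSec = 7e6;   // collisions/sec at the detector
        double triggeredPerSec = 100;    // events/sec after the 3-stage trigger
        double dutyCycle = 1.0 / 3.0;    // runs about 1/3 of the time

        double rawTBperSec  = collisionsPerSec * eventKB / 1e9;       // ~1.75 TB/sec
        double tapeMBperSec = triggeredPerSec * eventKB / 1e3;        // 25 MB/sec
        double gbPerDay     = tapeMBperSec * 86400 * dutyCycle / 1e3; // ~720 GB/day
        double tbPerYear    = gbPerDay * 365 / 1e3;                   // ~260 TB/year

        System.out.printf("raw %.2f TB/s, to tape %.0f MB/s, %.0f GB/day, %.0f TB/yr%n",
                rawTBperSec, tapeMBperSec, gbPerDay, tbPerYear);
    }
}
```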

3
Processing/analysis scheme
  • Reconstruction carried out at Fermilab (FNAL).
    Reduces data volume to about 50KB/event. Writes
    out Physics Analysis Data (PAD) containing
    objects like hits, tracks etc.
  • PAD data has about 10 streams sorted by Physics
    interest.
  • Each stream grows by typically 6TB/year,
    (15GB/day).
  • UK physicists will be interested in a few of
    these streams.
  • Data analysis runs on PAD data and calculates physics
    parameters, writing out a brief summary per event called an
    NTUPLE (10 KB/event).
  • The final stage of analysis selects and summarises these
    NTUPLEs (outputs histograms).

4
Processing Requirements in the UK
  • Code Development and testing
  • Runs many times
  • Need to run on databases of approx 30GB
  • Aim for near-interactive performance: 5 mins for 30 GB,
    i.e. about 100 MB/sec
  • Need to locate datasets containing most
    up-to-date or best data
  • Analysis
  • Run only a few times
  • Typically run on large dataset, approx 6TB
  • Aim for overnight turnaround: 16 hrs for 6 TB, i.e. about
    100 MB/sec (the implied throughputs are checked in the sketch
    after this list)
  • Aim to keep at least one stream up-to-date via network
    transfers (15 GB/day, about 1.4 Mbps sustained)
  • Analysis summary
  • Runs many times
  • Datasets of a few 100 GB.
  • Most interactive phase, fastest possible
    turnaround required.
  • Monte Carlo simulated events
  • Can be very effectively distributed
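A similar check of the throughput targets above (again not from the original slides; the dataset sizes and turnaround times are the ones quoted on this slide):

```java
// Bandwidths implied by the UK processing targets quoted on this slide.
public class UkThroughputCheck {
    public static void main(String[] args) {
        double devMBperSec = 30e3 / (5.0 * 60);       // 30 GB in 5 min -> 100 MB/sec
        double anaMBperSec = 6e6 / (16.0 * 3600);     // 6 TB in 16 hrs -> ~104 MB/sec
        double linkMbps    = 15e9 * 8 / 86400 / 1e6;  // 15 GB/day      -> ~1.4 Mbps

        System.out.printf("code dev %.0f MB/s, analysis %.0f MB/s, stream link %.1f Mbps%n",
                devMBperSec, anaMBperSec, linkMbps);
    }
}
```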

5
UK Hardware Status
  • CDF collaborators in the UK applied for a JIF grant for IT
    equipment in 1998. £1.67M was awarded in summer 2000.
  • First half of grant will buy
  • Multiprocessor systems plus 1TB of disk for 4
    Universities
  • 2 multiprocessors plus 2.5 TB of disk for RAL
  • A 32 CPU farm for RAL (Monte Carlo generation)
  • 5 TB of disk and 8 high end workstations for
    FNAL
  • Emphasis on high IO throughput super-workstations
    for end-stage analysis and code
    development/testing. Systems will also be used
    for running analysis jobs.
  • A dedicated network link from UK to FNAL

6
CDF-UK JIF Equipment
7
CDF-UK GRID Project
  • The JIF proposal only covered hardware, but in the meantime
    the GRID has arrived!
  • Use existing tools to make the existing FNAL
    system more distributed.
  • Concentrate on solving real-user issues. Develop an
    architecture for data location, data replication and job
    submission.
  • Make routine tasks automated and intuitive.
  • Gain experience in GRID and the Globus toolkit.

8
Some Requirements
  • We must make best use of existing packages and minimise
    modifications to existing software. We have chosen to emulate
    the FNAL system so that analysis code can run without
    modification; this locks us into an existing architecture.
  • Aim to provide a scheme that allows efficient use of the
    user's time.
  • Make best use of the systems available by placing
    jobs and data for optimal turnaround time.
  • Produce a simple but useful system ASAP.
  • Build a system on which more intelligent layers
    can be built later.

9
Design principles
  • All sites are equal
  • All sites hold meta-data describing only local
    data. This stores relationships between files and
    datasets, run conditions and luminosity.
  • Use LDAP to publish meta-data (sketched after this list),
    kept in:
  • Oracle at FNAL
  • msql, also supported by the software, for local metadata
    (mySQL may be possible later)
  • Use pre- and post-processing at the endpoints of data
    transfer to:
  • take better account of resources, e.g. use of near-line
    storage and disk space management
  • execute consistency and sanity checks
  • Use the existing Disk Inventory Manager (DIM) from FNAL.
  • Use Globus tools throughout, so a single authentication
    serves all sites.
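As an illustration of the LDAP publication step (this is not from the original deck: the server URL, directory names and attribute names below are invented, and the real CDF metadata schema would differ), a local metadata entry could be written through JNDI roughly like this:

```java
// Hypothetical sketch: publish one file/dataset metadata entry to the
// site's local LDAP server via JNDI. Host, DNs and attribute names are
// invented for illustration only.
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.BasicAttribute;
import javax.naming.directory.BasicAttributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

public class PublishMetadata {
    public static void main(String[] args) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://localhost:389"); // local metadata server

        DirContext ctx = new InitialDirContext(env);

        // Attributes describing one PAD file and the dataset it belongs to.
        BasicAttributes attrs = new BasicAttributes(true);        // ignore attribute case
        attrs.put(new BasicAttribute("objectClass", "cdfFile"));  // hypothetical schema
        attrs.put(new BasicAttribute("datasetName", "bhmu0a"));
        attrs.put(new BasicAttribute("fileSizeMB", "1024"));
        attrs.put(new BasicAttribute("runConditions", "run-2a"));

        // Create the entry under the site's dataset subtree.
        ctx.createSubcontext("fileName=bp0001.pad,ou=datasets,o=cdf-uk", attrs);
        ctx.close();
    }
}
```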

10
Functionality at a site
  • Publication of the local metadata
  • Publication of information about other system resources
    (CPU, disk, batch queues, etc.).
  • Transmission of data via network.
  • This may involve staging of data from tape to
    disk before transmission.
  • Receive data from the network or from tapes.
  • Copy or construct metadata
  • A mechanism to allow jobs from participating
    sites to be run.
  • Some sites may have reduced functionality (a sketch of this
    per-site interface follows below)
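Purely as a way of visualising how this per-site functionality might be grouped (none of the names below come from the deck or from the FNAL software), a minimal Java sketch of a site-service interface:

```java
// Hypothetical grouping of the per-site functions listed above.
// Interface and method names are invented for illustration only.
import java.nio.file.Path;
import java.util.List;

public interface CdfSiteServices {
    /** Publish the local file/dataset metadata (e.g. into LDAP). */
    void publishMetadata();

    /** Publish information about CPUs, disks and batch queues. */
    void publishResourceInfo();

    /** Send files to another site, staging from tape first if needed. */
    void sendFiles(List<Path> files, String destinationSite);

    /** Accept files arriving over the network or imported from tape. */
    void receiveFiles(List<Path> files, String sourceSite);

    /** Copy or construct the metadata describing newly received files. */
    void registerFiles(List<Path> files, String datasetName);

    /** Run a job submitted by a participating site (via Globus). */
    void runJob(String jobDescription);
}
```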

11
Data Location/Copy
12
Software architecture (layer diagram): an Applications layer (User
Interface, Dataset maintainer, ...) sits on top of the core services
(Data locator, Data copier, Job Submission), which in turn are built
on the Globus toolkit. (A hypothetical end-to-end use of these
layers is sketched below.)
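To make the layering concrete (not from the original deck; every class and method name below is invented, and the stubs only print what a real implementation would do), a user request might flow through these layers roughly as follows:

```java
// Hypothetical flow through the layers shown above: locate a dataset,
// replicate any missing files, then submit the analysis job where the
// data now lives. All class and method names are invented.
public class AnalysisRequest {

    static class DataLocator {                 // would query each site's LDAP metadata
        String findBestSite(String dataset) { return "ral.example.ac.uk"; }
    }

    static class DataCopier {                  // would replicate missing files via Globus
        void replicate(String dataset, String site) {
            System.out.println("copying " + dataset + " to " + site);
        }
    }

    static class JobSubmitter {                // would submit the job via Globus at that site
        void submit(String site, String command) {
            System.out.println("submitting '" + command + "' at " + site);
        }
    }

    public static void main(String[] args) {
        String dataset = "bhmu0a";             // example dataset name
        DataLocator locator = new DataLocator();
        String site = locator.findBestSite(dataset);
        new DataCopier().replicate(dataset, site);
        new JobSubmitter().submit(site, "run_pad_analysis " + dataset);
    }
}
```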
13
Scope
  • Plan to install at:
  • 4 UK universities (Glasgow, Liverpool, Oxford, UCL)
  • RAL
  • FNAL (although this would have reduced functionality: a data
    and metadata exporter)
  • More non-UK sites could be included
  • Intend to have basic utilities in place at time
    of equipment installation (May 2001)

14
Work so far
  • Week-long meeting with the author of the existing data
    handling software (DIM).
  • Project plan under development.
  • UK Universities have agreed to participate.
    Expect continuing support from RAL.
  • Globus installed at a number of sites. Remote
    execution of shell commands checked.
  • LDAP/database connectivity (S. Fisher, RAL):
  • LDAP to Oracle via Python script
  • Java to LDAP via JNDI; JNDI (Java Naming and Directory
    Interface) gives a very elegant interface to LDAP (see the
    sketch below)
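For example (not part of the original deck; the server, directory names and filter below are hypothetical), a JNDI query against a site's metadata server could look like this:

```java
// Hypothetical JNDI query against a site's LDAP metadata server.
// Host, base DN and attribute names are invented for illustration.
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class FindDatasetFiles {
    public static void main(String[] args) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://localhost:389");

        DirContext ctx = new InitialDirContext(env);
        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

        // List every file registered locally for one (hypothetical) dataset.
        NamingEnumeration<SearchResult> results =
                ctx.search("ou=datasets,o=cdf-uk", "(datasetName=bhmu0a)", controls);
        while (results.hasMore()) {
            System.out.println(results.next().getNameInNamespace());
        }
        ctx.close();
    }
}
```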

15
Longer Term plans and needs
  • Need to understand how to track status of
    distributed tasks.
  • Understand whether this strategy is scalable to 100s of
    systems. Are there ways to organise LDAP so that it is not
    too big an overhead?
  • Could Monte Carlo generation take advantage of the same
    scheme of job placement and data cataloguing? This is
    generally even more distributed (high CPU/IO ratio).
  • The 1st phase of equipment purchase will probably be SMP
    machines: easy management, single system image, and usable
    from day 1. With the right tools, the 2nd phase could be more
    distributed.
  • Need to work on selective expansion of C++ objects to
    minimise CPU overheads.
  • Expect the user interface to be implemented as a Java
    application to give platform independence.
  • Make the UI intelligent enough to automate or suggest
    strategies for replicating data, submitting jobs, etc.; look
    to generic solutions.
  • Extend the UI to provide a more complete working environment,
    integrating histogram presenters etc.; generic solutions may
    be possible.

16
Effort
  • Universities are finding small fractions of people's time,
    adding up to around 0.5 FTE each. Larger chunks of more
    people's time are needed.
  • Oxford has created a new technician post which
    will free up some effort for GRID.
  • Italian Computer Science student to join for 6
    month project from Easter 2001.
  • Share experience by involving the same staff in both CDF and
    other GRID projects; ideally full-time GRID staff.