DISc NIKHEF: Driving Grid Development with Real Applications, Kors Bos, NIKHEF



1
DISc @ NIKHEF: Driving Grid Development with Real Applications
Kors Bos, NIKHEF
15 October 2004
http://www.nikhef.nl/grid
EGEE is a project funded by the European Union
under contract IST-2003-508833
2
Contents
  • The HEP Computing Problem
  • Grid research and dissemination activities at
    NIKHEF
  • Community forming
  • Large-scale infrastructure operation
  • Data Intensive Sciences on the Grid

3
The HEP Computing Problem
  • Collect, distribute, and archive data
  • Transform these data to useful quantities
  • Mine the transformed data looking for interesting
    stuff

4
Collect, distribute, and archive data
5
Scales: Data Archival
  • One of the four LHC detectors: 40 MHz (40 TB/sec)
  • Online system: multi-level trigger to filter out
    background and reduce data volume
  • After level 1 (special hardware): 75 kHz (75 GB/sec)
  • After level 2 (embedded processors): 5 kHz (5 GB/sec)
  • After level 3 (PCs): 100 Hz (100 MB/sec)
  • Then data recording and offline analysis
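The trigger figures above imply an event size and a rejection factor at each level. A minimal sketch of that arithmetic follows. The rates are the ones quoted on the slide; the decimal units (1 TB = 10^12 bytes) and the stage and function names are assumptions made for illustration.

```python
# Trigger chain figures as quoted on the slide:
# (stage name, event rate in Hz, data rate in bytes per second).
# Decimal units (1 MB = 1e6 bytes) are an assumption.
TB, GB, MB = 1e12, 1e9, 1e6

STAGES = [
    ("detector output", 40e6, 40 * TB),
    ("after level 1", 75e3, 75 * GB),
    ("after level 2", 5e3, 5 * GB),
    ("after level 3", 100.0, 100 * MB),
]


def event_size_bytes(rate_hz, bytes_per_s):
    """Average event size implied by an event rate and a data rate."""
    return bytes_per_s / rate_hz


def rejection_factors(stages):
    """Factor by which each trigger level reduces the event rate."""
    return [
        (later[0], earlier[1] / later[1])
        for earlier, later in zip(stages, stages[1:])
    ]


if __name__ == "__main__":
    for name, rate, bandwidth in STAGES:
        size_mb = event_size_bytes(rate, bandwidth) / MB
        print(f"{name}: {rate:g} Hz, about {size_mb:g} MB per event")
    for name, factor in rejection_factors(STAGES):
        print(f"{name}: rate reduced by a factor of about {factor:.0f}")
```

Every stage works out to roughly 1 MB per event, so the trigger reduces volume by discarding events, not by shrinking them.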
6
Collect, distribute, and archive data
10 PB
Few Gb/s
Few to tens of GB/s depending on how you do it
7
Transform the data to useful quantities
8
Transform the Data to Useful Quantities
  • Place event info on 3D map
  • Trace trajectories through hits
  • Assign type to each track
  • Find particles you want
  • Needle in a haystack!
  • This is a relatively easy case

9
Already Doing Data Transformation on Global Scale!
10
Scales: Data Transformation
  • 90 seconds per event to reconstruct and analyze
  • 100 incoming events per second
  • Even to keep up, need either
    • A computer that is nine thousand times faster, or
    • nine thousand computers working together
  • It's worse than just keeping up: each event will
    need to be analyzed several times
  • Each event is an independent entity: pleasantly
    parallel
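The nine-thousand figure is the per-event processing time multiplied by the arrival rate. A minimal sketch of that estimate follows, together with a toy illustration of why independent events are easy to farm out. The `passes` parameter, the `reconstruct` placeholder, and the thread pool are assumptions made for illustration; this is not the experiments' software.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cpus_needed(seconds_per_event, events_per_second, passes=1):
    """CPUs required just to keep up with the incoming event rate.

    `passes` is a hypothetical knob for "each event will need to be
    analyzed several times"; the slide does not give a number.
    """
    return math.ceil(seconds_per_event * events_per_second * passes)


def reconstruct(event):
    """Placeholder for the real reconstruction: sums the hit values."""
    return sum(event)


def process_events(events, workers=4):
    """Process events independently; no event needs data from another.

    A thread pool stands in for a farm of worker nodes here. A real
    CPU-bound workload would run as separate processes or grid jobs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct, events))


if __name__ == "__main__":
    print(cpus_needed(90, 100))            # slide figures: 90 s, 100 events/s
    print(cpus_needed(90, 100, passes=3))  # assumed three analysis passes
    print(process_events([[1, 2], [3, 4], [5]]))
```

Because no event depends on another, the work can be split across any number of machines without coordination beyond handing out events and collecting results.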

11
Results: Data Challenge 04
  • Up to 3000 simultaneous jobs per experiment,
    globally distributed
  • Equivalent of 2.2 million hours (250 yrs) of CPU
    time (2.0 GHz) in a period of one month
  • Total data volume produced > 25 TB

For LHCb: NIKHEF 6% of worldwide total
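The CPU-time figure can be cross-checked with a short calculation. This is a minimal sketch that assumes a 365-day year and a 30-day month; the function names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # assumes a 365-day year


def cpu_years(cpu_hours):
    """Convert CPU hours to CPU years."""
    return cpu_hours / HOURS_PER_YEAR


def average_busy_cpus(cpu_hours, days=30):
    """Average number of CPUs kept busy over the period (assumed 30 days)."""
    return cpu_hours / (days * 24)


if __name__ == "__main__":
    total_cpu_hours = 2.2e6  # figure quoted on the slide
    print(f"about {cpu_years(total_cpu_hours):.0f} CPU years")
    print(f"about {average_busy_cpus(total_cpu_hours):.0f} CPUs busy on average")
```

Under these assumptions, 2.2 million hours is about 251 CPU years, in line with the quoted 250 yrs, and corresponds to roughly 3000 CPUs running continuously for the month.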
12
Community Management
  • Effective Grid infrastructures need
    • Low threshold for forming user communities
    • Effective mechanisms for arranging resource
      sharing and federation
    • Effective tracking of resource usage at site
      level
  • NIKHEF research focuses on enabling sites to
    support
    • dynamic user communities
    • securely
    • with minimal effort

13
Large-Scale Infrastructure Operation
  • Grids ultimately need to scale to
    • 10,000 to 100,000 processors
    • Petabytes to exabytes of data
    • Hundreds of user communities
    • Thousands of users
  • When NIKHEF started in 2001, scales were
    • 100s of processors
    • Gigabytes of data
    • Six user communities
    • Tens of users
  • We learn at each increase in scale
    • Our users at NIKHEF are the ideal test team: they
      really use the stuff, and want to use it at the
      largest scales possible
    • We find out how to solve scaling problems on our
      local facility
    • We transmit design requirements to our
      software-engineer partners
    • We transmit experience and fixes to our
      production-facility partner SARA
  • Crucial that NIKHEF has its own relatively large
    facility!

14
Data Intensive Science Research
  • The HEP use cases are reasonably generic
  • If they work for HEP, they work for others too
    • Astronomy (radio telescopes, VLBI)
    • Bioinformatics (fMRI)
    • Earth Observation (Ozone Profile Processing)
    • Biodiversity (tracking bird migration patterns)
  • Didn't detail the data mining use case; we do
    metadata research here, which is useful for many
    other data miners
  • Link to these groups via
    • Hosting part-time presence at NIKHEF
    • Leadership of Data Intensive Sciences track in
      the VL-E (Virtual Lab for e-Science) project