Global Lambdas and Grids for Particle Physics in the LHC Era

Transcript and Presenter's Notes

1
  • Global Lambdas and Grids for Particle Physics in
    the LHC Era

Harvey B. Newman, California Institute of Technology
SC2005, Seattle, November 14-18, 2005
2
Beyond the SM: Great Questions of Particle
Physics and Cosmology
You Are Here.
  1. Where does the pattern of particle families and
    masses come from?
  2. Where are the Higgs particles, and what is the
    mysterious Higgs field?
  3. Why do neutrinos and quarks oscillate?
  4. Is Nature Supersymmetric?
  5. Why is any matter left in the universe?
  6. Why is gravity so weak?
  7. Are there extra space-time dimensions?

We do not know what makes up 95% of the
universe.
3
Large Hadron Collider, CERN, Geneva: 2007 Start
  • pp, √s = 14 TeV, L = 10^34 cm^-2 s^-1
  • 27 km Tunnel in Switzerland and France

(Diagram: the LHC ring and its experiments: CMS, TOTEM (pp, general purpose, HI); ATLAS; ALICE (HI); LHCb (B-physics))
5000 Physicists, 250 Institutes, 60 Countries
Physics: Higgs, SUSY, Extra Dimensions, CP Violation, QG
Plasma, the Unexpected
Challenges: Analyze petabytes of complex data
cooperatively; harness global computing, data and
network resources
4
LHC Data Grid Hierarchy
Emerging Vision: A Richly Structured, Global
Dynamic System
5
Long Term Trends in Network Traffic Volumes:
300-1000X/10Yrs
ESnet Accepted Traffic 1990-2005: Exponential
Growth of 82%/Year for the Last 15 Years; 400X
Per Decade
R. Cottrell; W. Johnston
(Chart: network traffic growth 1990-2005, in Terabytes
per month; progress in steps, reaching the 10 Gbit/s range)
  • SLAC Traffic Growth in Steps: 10X/4 Years
  • Projected 2 Terabits/s by 2014
  • Summer '05: 2 x 10 Gbps links, one for
    production, one for R&D
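
As a quick cross-check of the growth rates quoted on this slide, the short Python calculation below compounds 82% per year over a decade and extends the SLAC pattern of 10X every 4 years; it uses only the figures given above.

```python
# Cross-check of the traffic-growth rates quoted on this slide.
esnet_decade = 1.82 ** 10        # 82%/year compounded over 10 years
slac_decade = 10 ** (10 / 4)     # 10X every 4 years, extended over 10 years

print(f"82%/year for a decade   -> ~{esnet_decade:.0f}X")  # ~400X, as quoted
print(f"10X per 4 years, decade -> ~{slac_decade:.0f}X")   # ~316X, within 300-1000X/10Yrs
```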

6
Internet2 Land Speed Record (LSR)
(Chart: Internet2 LSRs; Blue: HEP; 7.2 Gbps X 20.7 kkm)
  • IPv4 Multi-stream record with FAST TCP: 6.86 Gbps
    X 27 kkm, Nov 2004
  • IPv6 record: 5.11 Gbps between Geneva and
    Starlight, Jan. 2005
  • Disk-to-disk Marks: 536 MBytes/sec (Windows),
    500 MBytes/sec (Linux)
  • End System Issues: PCI-X Bus, Linux Kernel, NIC
    Drivers, CPU

(Chart: Throughput (Petabit-m/sec) of successive LSR marks; Nov. 2004 record network shown)
NB: Manufacturers' Roadmaps for 2006: One Server
Pair to One 10G Link
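
The LSR figure of merit is throughput multiplied by terrestrial path length. A minimal Python check of the numbers on this slide, using no data beyond what is quoted above:

```python
# Internet2 LSR figure of merit: throughput x distance, in petabit-metres per second.
def lsr_metric(gbps, kkm):
    bits_per_second = gbps * 1e9
    metres = kkm * 1e6            # 1 kkm = 1000 km
    return bits_per_second * metres / 1e15

print(lsr_metric(7.2, 20.7))      # chart label: 7.2 Gbps x 20.7 kkm -> ~149 Pb-m/s
print(lsr_metric(6.86, 27.0))     # Nov 2004 IPv4 mark               -> ~185 Pb-m/s
```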
7
HENP Bandwidth Roadmap for Major Links (in Gbps)
Continuing Trend: 1000 Times Bandwidth Growth
Per Decade
HEP: Co-Developer as well as Application Driver of
Global Nets
8
LHCNet, ESnet Plan 2006-2009: 20-80 Gbps US-CERN,
ESnet MANs, IRNC
LHCNet US-CERN Wavelength Triangle: 10/05: 10G CHI +
10G NY; 2007: 20G + 20G; 2009: 40G + 40G
(Map: planned ESnet and LHCNet topology. Production IP ESnet core (10 Gbps enterprise IP traffic) with hubs at SEA, SNV, CHI, NYC, DEN, DC, ALB, SDG, ATL, ELP; a 2nd core, the Science Data Network (30-50 Gbps, 40-60 Gbps circuit transport); Metropolitan Area Rings; ESnet MANs to FNAL and BNL, with dark fiber (60 Gbps) to FNAL; LHCNet Data Network (2 to 8 x 10 Gbps US-CERN); IRNC links; high-speed cross connects with Internet2/Abilene; NSF/IRNC circuit GVA-AMS via SURFnet or GEANT2; connections to AsiaPac, Europe (GEANT2, SURFnet, IN2P3), Japan, Australia, CERN; lab-supplied and major international links also shown.)
9
Global Lambdas for Particle Physics: Caltech/CACR
and FNAL/SLAC Booths
  • Preview global-scale data analysis of the LHC Era
    (2007-2020), using next-generation networks and
    intelligent grid systems
  • Using state-of-the-art WAN infrastructure and
    Grid-based Web service frameworks, based on the
    LHC Tiered Data Grid Architecture
  • Using a realistic mixture of streams: organized
    transfer of multi-TB event datasets, plus
    numerous smaller flows of physics data that
    absorb the remaining capacity
  • The analysis software suites are based on the
    Grid-enabled Analysis Environment (GAE) developed
    at Caltech and U. Florida, as well as Xrootd from
    SLAC and dCache from FNAL
  • Monitored by Caltech's MonALISA global monitoring
    and control system

10
Global Lambdas for Particle Physics: Caltech/CACR
and FNAL/SLAC Booths
  • We used twenty-two 10 Gbps waves to carry
    bidirectional traffic between Fermilab, Caltech,
    SLAC, BNL, CERN and other partner Grid Service
    sites including Michigan, Florida, Manchester,
    Rio de Janeiro (UERJ) and Sao Paulo (UNESP) in
    Brazil, Korea (KNU), and Japan (KEK)
  • Results
  • 151 Gbps peak, 100 Gbps of throughput sustained
    for hours; 475 Terabytes of physics data
    transported in < 24 hours
  • 131 Gbps measured by the SCInet BWC team on 17 of
    our waves
  • Using real physics applications and production as
    well as test systems for data access, transport
    and analysis: bbcp, xrootd, dCache and gridftp,
    plus grid analysis tool suites
  • An optimized Linux kernel for TCP-based protocols,
    including Caltech's FAST (see the buffer-sizing
    arithmetic below)
  • Far surpassing our previous SC2004 BWC Record
    of 101 Gbps
  • 15 waves at the Caltech/CACR Booth and 7 at the
    FNAL/SLAC Booth
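
Sustaining multi-Gbps single streams over transoceanic paths is mainly a question of how much data TCP must keep in flight, which is what the FAST kernel work addresses. A minimal sketch of the bandwidth-delay-product arithmetic; the round-trip times are illustrative assumptions, not values measured during the Bandwidth Challenge:

```python
# Bandwidth-delay product: bytes a single TCP stream must keep in flight
# to fill a link of the given speed at the given round-trip time.
def bdp_megabytes(gbps, rtt_ms):
    return gbps * 1e9 * (rtt_ms / 1e3) / 8 / 1e6

# RTTs below are assumed for illustration, not SC05 measurements.
for label, rtt_ms in [("US coast-to-coast (~70 ms, assumed)", 70),
                      ("US-CERN (~120 ms, assumed)", 120),
                      ("transpacific (~180 ms, assumed)", 180)]:
    print(f"10 Gbps, {label}: ~{bdp_megabytes(10, rtt_ms):.0f} MB in flight")
```

At windows of this size a standard loss-based TCP recovers very slowly after a single drop, which is one motivation for delay-based protocols such as FAST.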

11
Monitoring NLR, Abilene/HOPI, LHCNet, USNet,
TeraGrid, PWave, SCInet, Gloriad, JGN2, WHREN,
other Int'l R&E Nets, and 14000 Grid Nodes
Simultaneously
I. Legrand
12
(No Transcript)
13
(No Transcript)
14
Switch and Server Interconnections at the
Caltech Booth (428)
  • 15 10G Waves
  • 72 nodes with 280 Cores
  • 64 10G Switch Ports: 2 Fully Populated Cisco
    6509Es
  • 45 Neterion 10 GbE NICs
  • 200 SATA Disks
  • 40 Gbps (20 HBAs) to StorCloud
  • Thursday-Sunday Setup

http://monalisa-ul.caltech.edu:8080/stats?page=nodeinfo_sys
15
(No Transcript)
16
(No Transcript)
17
(No Transcript)
18
Fermilab
  • Our BWC data sources are the Production Storage
    Systems and File Servers used by
  • CDF
  • US CMS Tier 1
  • Sloan Digital Sky Survey
  • Each of these produces, stores and moves multi-TB
    to PB-scale data: tens of TB per day
  • 600 gridftp servers (of 1000s) directly
    involved

19
Fermilab
  • Mass Storage Facilities
  • Over 3.3 PB stored
  • Ingest: 200 TB/month
  • 20 to 300 TB/day read
  • Disk pools: dCache, backed by tape through
    Enstore
  • Multiple data transfer protocols
  • WAN: gsiftp, http
  • LAN: dcap (presents a POSIX I/O interface); see
    the client-side sketch below
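
To illustrate how these protocols are typically exercised from a client host, the sketch below shells out to the standard command-line tools (globus-url-copy for gsiftp, dccp for dcap). The door hostname, namespace path, and local destination are hypothetical placeholders, and the availability of these clients at a given site is an assumption.

```python
# Illustrative sketch only: drive a WAN gsiftp transfer and a LAN dcap copy
# using the standard command-line clients. Hostnames and paths are hypothetical.
import subprocess

DOOR = "door.example.fnal.gov"                 # hypothetical dCache door
PNFS_PATH = "/pnfs/example/cms/run123.root"    # hypothetical file in the namespace

# WAN: gsiftp via the Globus client (requires a valid grid proxy).
subprocess.run(
    ["globus-url-copy",
     f"gsiftp://{DOOR}{PNFS_PATH}",
     "file:///data/local/run123.root"],
    check=True)

# LAN: dcap via dccp, reading the same data through the dcap door.
subprocess.run(
    ["dccp", f"dcap://{DOOR}{PNFS_PATH}", "/data/local/run123.root"],
    check=True)
```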

20
(No Transcript)
21
Xrootd Server Performance
A. Hanushevsky
  • Scientific Results
  • Ad hoc Analysis of Multi-TByte Archives
  • Immediate exploration
  • Spurs novel discovery approaches
  • Linear Scaling
  • Hardware Performance
  • Deterministic Sizing
  • High Capacity
  • Thousands of clients
  • Hundreds of Parallel Streams
  • Very Low Latency
  • 12 µs Transfer Cost
  • Device/NIC Limited

Excellent Across WANs
22
Xrootd Clustering
  • Unbounded Clustering
  • Self organizing
  • Total Fault Tolerance
  • Automatic real-time reorganization
  • Result
  • Minimum Admin Overhead
  • Better Client CPU Utilization
  • More results in less time at less cost

(Diagram: Xrootd clustering. A client asks the Redirector (Head Node) to open file X. The redirector asks its data servers "Who has file X?"; servers holding the file reply "I have", and the client is told "go to C" to open the file there. A Supervisor (sub-redirector) applies the same pattern within its own cluster ("go to F"). The client sees all servers as xrootd data servers.)
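
To make the redirection flow in the diagram concrete, here is a toy, self-contained Python model of the redirector's logic. It illustrates open-by-redirection only; it is not the xrootd protocol or code base.

```python
# Toy model of xrootd-style open-by-redirection (illustration only).
class DataServer:
    def __init__(self, name, files):
        self.name, self.files = name, set(files)

    def has(self, path):                 # the redirector's "Who has file X?" query
        return path in self.files

class Redirector:
    def __init__(self, servers):
        self.servers = servers

    def open(self, path):
        # Ask every subscribed server; redirect the client to the first holder.
        for server in self.servers:
            if server.has(path):
                return f"go to {server.name}"   # client then opens file X there
        raise FileNotFoundError(path)

cluster = Redirector([DataServer("A", []),
                      DataServer("C", ["/store/fileX"]),
                      DataServer("E", [])])
print(cluster.open("/store/fileX"))      # -> "go to C"
```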
23
Remote Sites: Caltech, UFL, Brazil...
  • Authenticated users automatically discover and
    initiate multiple transfers of physics datasets
    (ROOT files) through secure Clarens-based GAE
    services
  • Transfers are monitored through MonALISA
  • Once data arrives at the target sites, (remote)
    analysis can be started by authenticated users
    using the ROOT analysis framework (see the
    sketch below)
  • Using the Clarens ROOT viewer or the COJAC event
    viewer, remote data can be presented transparently
    to the user

(Diagram: GAE Services at each site, with ROOT Analysis clients at the remote sites)
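
On the client side of this workflow, ROOT can open files across the wide area directly through an xrootd URL. A minimal PyROOT sketch, assuming a ROOT build with xrootd support; the server name and file path are hypothetical placeholders:

```python
# Minimal PyROOT sketch: open a remote ROOT file via an xrootd URL and list it.
import ROOT

f = ROOT.TFile.Open("root://xrootd.example.edu//store/demo/events.root")
if f and not f.IsZombie():
    f.ls()       # list the objects (trees, histograms) in the remote file
    f.Close()
```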
24
SC05 Abilene and HOPI Waves
25
GLORIAD 10 Gbps Optical Ring Around the Globe
by March 2007
  • GLORIAD Circuits Today
  • 10 Gbps Hong Kong-Daejon-Seattle
  • 10 Gbps Seattle-Chicago-NYC (CANARIE
    contribution to GLORIAD)
  • 622 Mbps Moscow-AMS-NYC
  • 2.5 Gbps Moscow-AMS
  • 155 Mbps Beijing-Khabarovsk-Moscow
  • 2.5 Gbps Beijing-Hong Kong
  • 1 GbE NYC-Chicago (CANARIE)

China, Russia, Korea, Japan, US, Netherlands
Partnership
US NSF IRNC Program
26
ESLEA/UKLight SC05 Network Diagram
(Diagram: 6 x 1 GE and OC-192 links)
27
KNU (Korea): Main Goals
  • Uses the 10 Gbps GLORIAD link from Korea to the
    US, called BIG-GLORIAD, also part of UltraLight
  • Try to saturate this BIG-GLORIAD link with
    servers and storage clusters connected at
    10 Gbps
  • Korea is planning to be a Tier-1 site for the LHC
    experiments

(Diagram: the BIG-GLORIAD link between Korea and the U.S.)
28
KEK (Japan) at SC05: 10GE Switches on the
KEK-JGN2-StarLight Path
  • JGN2: 10G Network Research Testbed
  • Operational since 4/04
  • 10 Gbps L2 between Tsukuba and Tokyo Otemachi
  • 10 Gbps IP to Starlight since August 2004
  • 10 Gbps L2 to Starlight since September 2005
  • Otemachi-Chicago OC-192 link replaced by 10GE
    WAN PHY in September 2005

29
Brazil HEPGrid: Rio de Janeiro (UERJ) and Sao
Paulo (UNESP)
30
Global Lambdas for Particle Physics: A Worldwide
Network Grid Experiment
  • We have previewed the IT Challenges of Next
    Generation Science at the High Energy
    Frontier (for the LHC and other major programs)
  • Petabyte-scale datasets
  • Tens of national and transoceanic links at 10
    Gbps (and up)
  • 100 Gbps aggregate data transport sustained for
    hours; we reached a Petabyte/day transport
    rate for real physics data
  • We set the scale and learned to gauge the
    difficulty of the global networks and
    transport systems required for the LHC mission
  • But we set up, shook down and successfully ran
    the system in < 1 week
  • We have substantive take-aways from this marathon
    exercise
  • An optimized Linux (2.6.12 + FAST + NFSv4)
    kernel for data transport, after 7 full
    kernel-build cycles in 4 days (see the tuning
    sketch below)
  • A newly optimized application-level copy
    program, bbcp, that matches the performance of
    iperf under some conditions
  • Extension of Xrootd, an optimized low-latency
    file access application for clusters, across the
    wide area
  • Understanding of the limits of 10 Gbps-capable
    systems under stress
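
The kernel tuning referred to above is largely about letting a single TCP stream hold a large bandwidth-delay product in flight. The sketch below shows the kind of buffer limits involved, written against the standard /proc/sys TCP knobs of a 2.6 kernel; the 256 MB ceiling is an illustrative value, not the SC05 configuration, and enabling FAST itself required the Caltech kernel patch rather than a sysctl.

```python
# Illustrative sketch (not the SC05 settings): raise Linux TCP buffer limits so a
# single stream can hold a large bandwidth-delay product in flight. Requires root.
MAX_BUF = 256 * 1024 * 1024          # 256 MB, an assumed ceiling from the BDP arithmetic

settings = {
    "/proc/sys/net/core/rmem_max": str(MAX_BUF),
    "/proc/sys/net/core/wmem_max": str(MAX_BUF),
    # min / default / max values used by TCP buffer autotuning
    "/proc/sys/net/ipv4/tcp_rmem": f"4096 87380 {MAX_BUF}",
    "/proc/sys/net/ipv4/tcp_wmem": f"4096 65536 {MAX_BUF}",
}

for path, value in settings.items():  # Linux only
    with open(path, "w") as knob:
        knob.write(value)
```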

31
Global Lambdas for Particle Physics: A Worldwide
Network Grid Experiment
  • We are grateful to our many network partners:
    SCInet, LHCNet, Starlight, NLR, Internet2's
    Abilene and HOPI, ESnet, UltraScience Net, MiLR,
    FLR, CENIC, Pacific Wave, UKLight, TeraGrid,
    Gloriad, AMPATH, RNP, ANSP, CANARIE and JGN2
  • And to our partner projects: US CMS, US ATLAS,
    D0, CDF, BaBar, US LHCNet, UltraLight,
    LambdaStation, Terapaths, PPDG, GriPhyN/iVDGL,
    LHCNet, StorCloud, SLAC IEPM, ICFA/SCIC and Open
    Science Grid
  • Our Supporting Agencies: DOE and NSF
  • And for the generosity of our vendor supporters,
    especially Cisco Systems, Neterion, HP, IBM, and
    many others, who have made this possible
  • And the Hudson Bay Fan Company

32
(No Transcript)
33
Extra Slides Follow
34
Global Lambdas for Particle Physics Analysis:
SC05 Bandwidth Challenge Entry
Caltech, CERN, Fermilab, Florida, Manchester,
Michigan, SLAC, Vanderbilt, Brazil, Korea, Japan,
et al.
CERN's Large Hadron Collider experiments:
Data/Compute/Network Intensive
Discovering the Higgs, SuperSymmetry, or Extra
Space Dimensions, with a Global Grid
Worldwide Collaborations of Physicists Working
Together while Developing Next-Generation Global
Network and Grid Systems
35
  • Authentication
  • Access control on Web Services.
  • Remote file access (and access control on files).
  • Discovery of Web Services and Software.
  • Shell service: shell-like access to remote
    machines (managed by access control lists)
  • Proxy certificate functionality
  • Virtual Organization management and role
    management.

(Diagram: the portal as the user's gateway to Grid services: Analysis Sandbox, Catalog, Storage, Network)
  • User's point of access to a Grid system
  • Provides an environment where the user can
  • Access Grid resources and services
  • Execute and monitor Grid applications
  • Collaborate with other users
  • A one-stop shop for Grid needs
  • Portals can lower the barrier for users to access
    Web Services and to use Grid-enabled applications
    (see the sketch below)
(Diagram: workflow: select dataset, then start (remote) analysis)
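
As an illustration of how a portal service of this kind might be called programmatically, the sketch below uses certificate-authenticated XML-RPC from Python. The endpoint URL, method names, and proxy-certificate path are hypothetical assumptions, not the actual Clarens API.

```python
# Hypothetical sketch of calling a certificate-authenticated web service in the
# style of a Clarens/GAE portal. Endpoint, methods and paths are assumptions.
import ssl
import xmlrpc.client

ctx = ssl.create_default_context()
ctx.load_cert_chain("/tmp/x509up_u1000")   # assumed grid proxy certificate file

portal = xmlrpc.client.ServerProxy("https://gae.example.edu:8443/clarens",
                                   context=ctx)

# Assumed method names, shown only to illustrate the discover-then-transfer flow.
datasets = portal.catalog.list("/store/physics")
portal.transfer.start(datasets[0], "destination-site")
```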
36
(No Transcript)
37
(No Transcript)
38
(No Transcript)
39
(No Transcript)