1
Investigating the Network Performance of Remote
Real-Time Computing Farms For ATLAS Trigger DAQ.
Richard Hughes-Jones, University of Manchester
In collaboration with:
Bryan Caron, University of Alberta
Krzysztof Korcyl, IFJ PAN Krakow
Catalin Meirosu, Politehnica University of Bucuresti & CERN
Jakob Langgard Nielsen, Niels Bohr Institute
2
  • Introduction
  • Poster: On the potential use of Remote Computing Farms in the ATLAS TDAQ System

3
Atlas Computing Model
[Diagram: data flow from the detector through the trigger and event builder to the tiered computing centres]
  • Detector output ~PByte/sec into the Trigger / Event Builder; ~10 GByte/sec into
    the Event Filter (~7.5 MSI2k; a 2004 PC ~ 1 kSpecInt2k); 320 MByte/sec out to Tier 0
  • Tier 0: CERN Centre with PBytes of disk and a tape robot (Castor);
    ~5 PByte/year (no simulation); 75 MB/s per Tier 1 for ATLAS
  • Tier 1: Regional Centres (UK Regional Centre RAL, US, French and Dutch
    Regional Centres) with MSS; ~2 PByte/year per Tier 1; 622 Mb/s - 1 Gbit/s links
  • Tier 2: centres of ~200 kSI2k each (e.g. Lancaster ~0.25 TIPS, Sheffield,
    Manchester, Liverpool); ~200 TByte/year per Tier 2; physics data cache and
    desktops on 100 - 1000 MB/s links
  • High Bandwidth Network
  • Many Processors
  • Experts at Remote sites
4
Remote Computing Concepts
Remote Event Processing Farms: Copenhagen, Edmonton, Krakow, Manchester
[Diagram: the ATLAS detectors and Level 1 Trigger in the experimental area feed
the Level 2 Trigger and the Event Builders at CERN (B513); events go either to
the local event processing farms and mass storage, or over GÉANT lightpaths to
the remote event processing farms]
5
ATLAS Remote Farms Network Connectivity
6
ATLAS Application Protocol
  • Event request
  • The EFD requests an event from the SFI
  • The SFI replies with the event (~2 Mbytes)
  • Processing of the event
  • Return of the computation
  • The EF asks the SFO for buffer space
  • The SFO sends OK
  • The EF transfers the results of the computation
  • tcpmon - an instrumented TCP request-response
    program that emulates the Event Filter EFD to SFI
    communication (a minimal sketch follows below).
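The request-response pattern above is easy to emulate outside the TDAQ software. Below is a minimal sketch in the spirit of tcpmon, not the actual tool: the client plays the EFD, sending a 64-byte request and timing the arrival of a ~2 Mbyte "event" from a server playing the SFI (the port number and message sizes are assumptions).

```python
# Sketch of a tcpmon-style request/response test (assumed names, port and sizes;
# not the ATLAS tcpmon code). Client = EFD role, server = SFI role.
import socket
import time

PORT = 5000                     # assumed port
REQUEST_SIZE = 64               # small request, as in the measurements here
EVENT_SIZE = 2 * 1024 * 1024    # ~2 Mbyte event

def serve_events(port=PORT):
    """SFI role: answer every small request with one full event."""
    event = b"\0" * EVENT_SIZE
    with socket.create_server(("", port)) as srv:
        conn, _ = srv.accept()
        with conn:
            while conn.recv(REQUEST_SIZE):
                conn.sendall(event)

def request_events(host, port=PORT, n=10):
    """EFD role: request n events and print the time each one takes."""
    with socket.create_connection((host, port)) as s:
        for i in range(n):
            t0 = time.time()
            s.sendall(b"R" * REQUEST_SIZE)          # 64-byte request
            received = 0
            while received < EVENT_SIZE:            # read the ~2 Mbyte response
                chunk = s.recv(65536)
                if not chunk:
                    raise ConnectionError("server closed the connection")
                received += len(chunk)
            print(f"event {i}: {time.time() - t0:.3f} s")
```

With a 20 ms round trip the first event is dominated by TCP slow start, exactly as the Web100 traces later in the talk show.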

7
  • Networks and End Hosts

8
End Hosts & NICs: CERN-nat-Manc.
Throughput, Packet Loss, Re-Order
  • Use UDP packets to characterise the Host, NIC and
    Network (see the sketch after this list)
  • SuperMicro P4DP8 motherboard
  • Dual Xeon 2.2 GHz CPUs
  • 400 MHz system bus
  • 64 bit 66 MHz PCI / 133 MHz PCI-X bus

Request-Response Latency
  • The network can sustain 1 Gbit/s of UDP traffic
  • The average server can lose smaller packets
  • Packet loss is caused by lack of processing power in the
    PC receiving the traffic
  • Out-of-order packets are due to the WAN routers
  • Lightpaths look like extended LANs and have no
    re-ordering
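A minimal sketch of this kind of UDP characterisation (in the spirit of UDPmon, not the actual tool; the port and packet layout are assumptions): the sender numbers each packet, and the receiver counts loss and re-ordering.

```python
# UDP loss / re-ordering probe (assumed port and packet layout; not UDPmon itself).
import socket
import struct

PORT = 14196                      # assumed port
PKT_SIZE = 1400                   # UDP payload size in bytes
SEQ = struct.Struct("!I")         # 4-byte sequence number at the start of each packet

def send_stream(host, count=10000):
    """Send `count` numbered UDP packets to the receiver."""
    payload = b"\0" * (PKT_SIZE - SEQ.size)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        for seq in range(count):
            s.sendto(SEQ.pack(seq) + payload, (host, PORT))

def receive_stream(count=10000):
    """Count received, lost and re-ordered packets."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind(("", PORT))
        s.settimeout(5.0)                      # stop when the stream dries up
        received, reordered, highest = 0, 0, -1
        try:
            while received < count:
                data, _ = s.recvfrom(PKT_SIZE)
                (seq,) = SEQ.unpack(data[:SEQ.size])
                if seq < highest:              # arrived after a later packet
                    reordered += 1
                highest = max(highest, seq)
                received += 1
        except socket.timeout:
            pass
        print(f"received {received}, lost {count - received}, re-ordered {reordered}")
```

The real tests also record inter-packet spacing and timestamps to measure throughput and jitter.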

9
  • Using Web100 TCP stack instrumentation
    to analyse the application protocol with tcpmon

10
tcpmon: TCP Activity Manc-CERN Req-Resp
  • Round trip time 20 ms
  • 64 byte request (green), 1 Mbyte response (blue)
  • TCP in slow start
  • 1st event takes 19 rtt, or 380 ms

11
tcpmon: TCP activity Manc-CERN Req-Resp, TCP stack tuned
  • Round trip time 20 ms
  • 64 byte request (green), 1 Mbyte response (blue)
  • TCP starts in slow start
  • 1st event takes 19 rtt, or 380 ms
  • The TCP congestion window grows nicely
  • Response takes 2 rtt after 1.5 s
  • Rate 10/s (with 50 ms wait)
  • Transfer achievable throughput grows to 800 Mbit/s
    (a buffer-sizing sketch follows below)
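"TCP stack tuned" here essentially means socket buffers at least as large as the bandwidth-delay product, so the 1 Mbyte response is never throttled by the receive window. A minimal sketch, assuming a 1 Gbit/s path and the 20 ms rtt above (the kernel must also permit buffers this large, e.g. via net.core.rmem_max on Linux):

```python
# Sketch: size TCP socket buffers to the bandwidth-delay product (illustrative
# values for a 1 Gbit/s path with a 20 ms round trip; not the T/DAQ code).
import socket

LINK_RATE_BPS = 1_000_000_000                 # assumed 1 Gbit/s path
RTT_S = 0.020                                 # Manchester-CERN round trip time
BDP_BYTES = int(LINK_RATE_BPS * RTT_S / 8)    # = 2.5 Mbyte

def tuned_socket():
    """TCP socket with send/receive buffers sized to the path BDP."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the buffers before connect() so the window scale option is negotiated to match.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BDP_BYTES)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BDP_BYTES)
    return s
```

The later observation that the T/DAQ application could not set its buffers this large is what caps the gigabit node at 6 events/s.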

12
tcpmon: TCP activity Alberta-CERN Req-Resp, TCP stack tuned
  • Round trip time 150 ms
  • 64 byte request (green), 1 Mbyte response (blue)
  • TCP starts in slow start
  • 1st event takes 11 rtt, or 1.67 s
  • TCP congestion window in slow start to 1.8 s,
    then congestion avoidance
  • Response in 2 rtt after 2.5 s
  • Rate 2.2/s (with 50 ms wait)
  • Transfer achievable throughput grows slowly from
    250 to 800 Mbit/s

13
SC2004 Disk-Disk bbftp
  • The bbftp file transfer program uses TCP/IP
  • UKLight path London-Chicago-London; PCs with
    Supermicro motherboards and 3Ware RAID0
  • MTU 1500 bytes, socket size 22 Mbytes, rtt 177 ms, SACK off
    (the socket size matches the bandwidth-delay product, as checked below)
  • Move a 2 Gbyte file
  • Web100 plots
  • Standard TCP
  • Average 825 Mbit/s
  • (bbcp 670 Mbit/s)
  • Scalable TCP
  • Average 875 Mbit/s
  • (bbcp 701 Mbit/s, 4.5 s of overhead)
  • Disk-TCP-Disk at 1 Gbit/s is here!
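The 22 Mbyte socket size above is essentially the bandwidth-delay product of the path; a quick check, assuming the full 1 Gbit/s line rate:

$$
\mathrm{BDP} = C \times RTT = 10^{9}\,\mathrm{bit/s} \times 0.177\,\mathrm{s}
\approx 1.8 \times 10^{8}\,\mathrm{bit} \approx 22\ \mathrm{Mbyte}
$$

With the socket, and hence the TCP window, at least this large, a single stream can keep the 177 ms pipe full.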

14
Time Series of Request-Response Latency
  • Manchester - CERN
  • Round trip time 20 ms
  • 1 Mbyte of data returned
  • Stable for 18 s at 42.5 ms
  • Then alternate points at 29 and 42.5 ms
  • Alberta - CERN
  • Round trip time 150 ms
  • 1 Mbyte of data returned
  • Stable for 150 s at 300 ms
  • Falls to 160 ms with 80 µs variation

15
  • Using the Trigger DAQ Application

16
Time Series of T/DAQ event rate
  • Manchester - CERN
  • Round trip time 20 ms
  • 1 Mbyte of data returned
  • 3 nodes: one Gigabit Ethernet node and two 100 Mbit nodes
  • 2 nodes: the two 100 Mbit nodes
  • 1 node: one 100 Mbit node
  • Event rate
  • Use the tcpmon transfer time of 42.5 ms
  • Add the time to return the data: 95 ms per event in total
  • Expected rate 10.5/s (worked through below)
  • Observed 6/s for the gigabit node
  • Reason: the TCP buffers could not be set large enough
    in the T/DAQ application
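The expected rate above is simply the inverse of the per-event time (the 42.5 ms tcpmon transfer plus the time to return the data, 95 ms in total):

$$
\mathrm{rate} = \frac{1}{95\ \mathrm{ms}} \approx 10.5\ \mathrm{events/s}
$$

The observed 6/s corresponds to roughly 165 ms per event, consistent with the transfers being throttled by the small TCP buffers in the application.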

17
  • Tcpdump of the Trigger DAQ Application

18
tcpdump of the T/DAQ dataflow at SFI (1)
CERN-Manchester, 1.0 Mbyte event; the remote EFD
requests the event from the SFI
  • Incoming event request, followed by an ACK
  • SFI sends the event, limited by the TCP receive
    buffer; time 115 ms (~4 events/s)
  • When TCP ACKs arrive, more data is sent
  • N x 1448 byte packets
19
Tcpdump of TCP Slow Start at SFI (2)
CERN-Manchester, 1.0 Mbyte event; the remote EFD
requests the event from the SFI
  • First event request
  • SFI sends the event, limited by TCP slow start;
    time 320 ms
  • N x 1448 byte packets
  • When ACKs arrive, more data is sent
20
tcpdump of the T/DAQ dataflow for SFI and SFO
  • CERN-Manchester, another test run
  • 1.0 Mbyte event
  • Remote EFD requests events from the SFI
  • Remote EFD sends the computation back to the SFO
  • Links closed by the application

Link setup and TCP slow start
21
Some First Conclusions
  • The TCP protocol dynamics strongly influence the
    behaviour of the application.
  • Care is required with the application design, e.g.
    the use of timeouts.
  • With the correct TCP buffer sizes:
  • It is not throughput but the round-trip nature of
    the application protocol that determines performance.
  • Requesting the 1-2 Mbytes of data takes 1 or 2
    round trips.
  • TCP slow start (the opening of Cwnd) considerably
    lengthens the time for the first block of data.
  • Implementation "improvements" (Cwnd reduction)
    kill performance!
  • When the TCP buffer sizes are too small (the default):
  • The amount of data sent is limited on each rtt
    (see the worked example below)
  • Data is sent and arrives in bursts
  • It takes many round trips to send 1 or 2 Mbytes
  • The end hosts themselves:
  • CPU power is required for the TCP/IP stack as well
    as for the application
  • Packets can be lost in the IP stack due to lack
    of processing power
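To see how small default buffers cap the rate, recall that TCP can have at most one receive window of data in flight per round trip. With an illustrative 64 KByte default buffer (an assumed figure, not a measurement) on the 20 ms Manchester-CERN path:

$$
\mathrm{throughput} \le \frac{W}{RTT} = \frac{64 \times 1024 \times 8\ \mathrm{bit}}{0.020\ \mathrm{s}} \approx 26\ \mathrm{Mbit/s}
$$

A 1 Mbyte event then needs on the order of 16 round trips, which is exactly the bursty, rtt-limited transfer seen in the tcpdump traces.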

22
Summary
  • We are investigating the technical feasibility of
    remote real-time computing for ATLAS.
  • We have exercised multiple 1 Gbit/s connections
    between CERN and Universities located in Canada,
    Denmark, Poland and the UK
  • Network providers are very helpful and interested
    in our experiments
  • Developed a set of tests for characterization of
    the network connections
  • Network behaviour is generally good, e.g. little
    packet loss observed
  • Backbones tend to be over-provisioned
  • However, access links and campus LANs need care
  • Properly configured end nodes are essential for
    getting good results with real applications
  • Collaboration between the experts from the
    Application and Network teams is progressing well
    and is required to achieve performance.
  • Although the application is ATLAS-specific, the
    information presented on the network interactions
    is applicable to other areas including
  • Remote iSCSI
  • Remote database accesses
  • Real-time Grid computing, e.g. real-time
    interactive medical image processing

23
Thanks to all who helped, including
  • National Research Networks: Canarie, Dante,
    DARENET, Netera, PSNC and UKERNA
  • ATLAS remote farms
  • J. Beck Hansen, R. Moore, R. Soluk, G.
    Fairey, T. Bold, A. Waananen, S. Wheeler, C. Bee
  • ATLAS online and dataflow software
  • S. Kolos, S. Gadomski, A. Negri, A. Kazarov,
    M. Dobson, M. Caprini, P. Conde, C. Haeberli, M.
    Wiesmann, E. Pasqualucci, A. Radu

24
More Information Some URLs
  • Real-Time Remote Farm site: http://csr.phys.ualberta.ca/real-time
  • UKLight web site: http://www.uklight.ac.uk
  • DataTAG project web site: http://www.datatag.org/
  • UDPmon / TCPmon kit write-up: http://www.hep.man.ac.uk/rich/ (Software Tools)
  • Motherboard and NIC tests:
    http://www.hep.man.ac.uk/rich/net/nic/GigEth_tests_Boston.ppt
    http://datatag.web.cern.ch/datatag/pfldnet2003/
  • "Performance of 1 and 10 Gigabit Ethernet Cards with Server Quality
    Motherboards", FGCS Special Issue 2004:
    http://www.hep.man.ac.uk/rich/ (Publications)
  • TCP tuning information may be found at:
    http://www.ncne.nlanr.net/documentation/faq/performance.html
    http://www.psc.edu/networking/perf_tune.html
  • TCP stack comparisons: "Evaluation of Advanced TCP Stacks on Fast
    Long-Distance Production Networks", Journal of Grid Computing 2004:
    http://www.hep.man.ac.uk/rich/ (Publications)
  • PFLDnet: http://www.ens-lyon.fr/LIP/RESO/pfldnet2005/
  • Dante PERT: http://www.geant2.net/server/show/nav.00d00h002

25
  • Any Questions?

26
  • Backup Slides

27
End Hosts & NICs: CERN-Manc.
Throughput, Packet Loss, Re-Order
  • Use UDP packets to characterise the Host and NIC
  • SuperMicro P4DP8 motherboard
  • Dual Xeon 2.2 GHz CPUs
  • 400 MHz system bus
  • 66 MHz 64 bit PCI bus

Request-Response Latency
28
TCP (Reno) Details
  • The time for TCP (Reno) to recover its throughput after a
    single lost packet is given by
    $\tau = \dfrac{C \times RTT^{2}}{2 \times MSS}$
    (worked through below)
  • For an rtt of 200 ms the recovery time reaches minutes
    ("2 min" is marked on the plot)
[Plot: recovery time vs rtt, with typical round-trip times marked: UK 6 ms, Europe 20 ms, USA 150 ms]
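A quick evaluation of the recovery-time formula, assuming a 1 Gbit/s capacity and the 1448-byte segments seen in the tcpdump traces (the plotted curves may assume a different rate and segment size):

$$
\tau(20\ \mathrm{ms}) = \frac{10^{9}\,\mathrm{bit/s} \times (0.020\,\mathrm{s})^{2}}{2 \times 1448 \times 8\,\mathrm{bit}} \approx 17\ \mathrm{s},
\qquad
\tau(150\ \mathrm{ms}) = \frac{10^{9}\,\mathrm{bit/s} \times (0.150\,\mathrm{s})^{2}}{2 \times 1448 \times 8\,\mathrm{bit}} \approx 970\ \mathrm{s}
$$

Under these assumptions a single loss on the Alberta-CERN path would take standard Reno on the order of a quarter of an hour to recover from, which is why alternative stacks such as Scalable TCP are attractive on long paths.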