Title: Investigating the Network Performance of Remote Real-Time Computing Farms For ATLAS Trigger DAQ.
1 Investigating the Network Performance of Remote Real-Time Computing Farms for ATLAS Trigger DAQ
Richard Hughes-Jones, University of Manchester
In collaboration with Bryan Caron (University of Alberta), Krzysztof Korcyl (IFJ PAN Krakow), Catalin Meirosu (Politehnica University of Bucuresti / CERN) and Jakob Langgard Nielsen (Niels Bohr Institute)
2 Introduction
- Poster: On the potential use of Remote Computing Farms in the ATLAS TDAQ System
3 ATLAS Computing Model
[Diagram: the tiered ATLAS computing model, with approximate data rates and capacities]
- Detector to Level-1 trigger: ~PByte/s; Event Builder to Event Filter (~7.5 MSI2k, where one 2004 PC ≈ 1 kSpecInt2k): ~10 GByte/s
- Event Filter to Tier 0: ~320 MByte/s; the CERN centre (Castor, PBytes of disk, tape robot) stores ~5 PByte/year, with no simulation load
- Tier-1 regional centres (UK RAL, US, French and Dutch centres) with mass storage: 75 MB/s per Tier 1 for ATLAS, ~2 PByte/year per Tier 1
- Tier-2 centres (~200 kSI2k each; e.g. Lancaster (~0.25 TIPS), Sheffield, Manchester and Liverpool) with physics data caches: 622 Mbit/s - 1 Gbit/s links, ~200 TByte/year per Tier 2
- Desktops connect over 100-1000 MB/s links
The model rests on:
- a high-bandwidth network
- many processors
- experts at the remote sites
4 Remote Computing Concepts
[Diagram: event flow from the experimental area to local and remote farms]
- In the experimental area, the ATLAS detectors feed the Level-1 trigger; the Level-2 trigger and the Event Builders follow
- Local event processing farms and mass storage are in CERN B513
- Remote event processing farms in Copenhagen, Edmonton, Krakow and Manchester are reached over GÉANT and dedicated lightpaths
5 ATLAS Remote Farms Network Connectivity
6 ATLAS Application Protocol
- Event request
  - The EFD requests an event from the SFI
  - The SFI replies with the event (~2 Mbytes)
- Processing of the event
- Return of the computation
  - The EF asks the SFO for buffer space
  - The SFO sends OK
  - The EF transfers the results of the computation
- tcpmon: an instrumented TCP request-response program that emulates the Event Filter EFD-to-SFI communication (a sketch of such a client follows below)
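
The EFD-SFI exchange above is a plain request-response pattern over TCP. Below is a minimal sketch of a tcpmon-style client in Python; the host, port and fixed message sizes are illustrative assumptions, and the real EFD/SFI messages carry their own framing rather than raw padding.

    # Minimal tcpmon-style request-response client (sketch). The endpoint
    # and message sizes are illustrative assumptions.
    import socket, time

    SFI_HOST, SFI_PORT = "sfi.example.cern.ch", 10000   # hypothetical endpoint
    REQUEST_SIZE, RESPONSE_SIZE = 64, 1_000_000         # 64-byte request, 1-Mbyte event

    def request_event(sock):
        """Send one fixed-size request, then read back one full event."""
        t0 = time.perf_counter()
        sock.sendall(b"R" * REQUEST_SIZE)
        received = 0
        while received < RESPONSE_SIZE:
            chunk = sock.recv(65536)
            if not chunk:
                raise ConnectionError("SFI closed the connection early")
            received += len(chunk)
        return time.perf_counter() - t0

    with socket.create_connection((SFI_HOST, SFI_PORT)) as sock:
        for i in range(10):
            dt = request_event(sock)
            print(f"event {i}: {dt*1000:.1f} ms, "
                  f"{RESPONSE_SIZE * 8 / dt / 1e6:.0f} Mbit/s")

Timing each request-response cycle in this way is what produces the latency and throughput series shown on the later slides.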
8 End Hosts and NICs, CERN-nat-Manchester: Throughput, Packet Loss, Re-Ordering
- Use UDP packets to characterise the host, NIC and network
  - SuperMicro P4DP8 motherboard
  - Dual Xeon 2.2 GHz CPUs
  - 400 MHz system bus
  - 64-bit 66 MHz PCI / 133 MHz PCI-X bus
- Request-response latency
- Findings:
  - The network can sustain 1 Gbit/s of UDP traffic
  - The average server can lose smaller packets
  - Packet loss is caused by lack of processing power in the PC receiving the traffic
  - Out-of-order packets are due to WAN routers
  - Lightpaths look like extended LANs and have no re-ordering
(A sketch of a UDPmon-style loss/re-order test follows below.)
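
A minimal sketch of a UDPmon-style test in Python: the sender emits evenly spaced packets carrying sequence numbers, and the receiver counts what arrives to estimate loss and detect re-ordering. The port, packet count and spacing are illustrative assumptions; the real UDPmon paces packets far more precisely than time.sleep() can.

    import socket, struct, sys, time

    PORT, NPACKETS, PKT = 14196, 10000, 1472   # illustrative values

    def send(dest):
        """Sender: evenly spaced packets, each carrying a sequence number."""
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        for seq in range(NPACKETS):
            s.sendto(struct.pack("!I", seq).ljust(PKT, b"\0"), (dest, PORT))
            time.sleep(12e-6)   # ~12 us spacing ~ 1 Gbit/s for 1472-byte frames

    def receive():
        """Receiver: count packets seen, estimate loss, detect re-ordering."""
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("", PORT))
        s.settimeout(5.0)
        seen = reordered = 0
        last = -1
        try:
            while seen < NPACKETS:
                data, _ = s.recvfrom(65536)
                (seq,) = struct.unpack_from("!I", data)
                if seq < last:
                    reordered += 1   # arrived behind a later-numbered packet
                last = max(last, seq)
                seen += 1
        except socket.timeout:
            pass                     # assume the sender has finished
        # loss estimate assumes the highest-numbered packet arrived
        print(f"received {seen}, lost ~{last + 1 - seen}, re-ordered {reordered}")

    if __name__ == "__main__":
        receive() if sys.argv[1] == "recv" else send(sys.argv[1])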
9 Using Web100 TCP Stack Instrumentation to analyse the application protocol: tcpmon
10 tcpmon: TCP Activity, Manc-CERN Req-Resp
- Round-trip time 20 ms
- 64-byte request (green), 1-Mbyte response (blue)
- TCP in slow start
- The 1st event takes 19 rtt, or 380 ms
11 tcpmon: TCP Activity, Manc-CERN Req-Resp, TCP stack tuned
- Round-trip time 20 ms
- 64-byte request (green), 1-Mbyte response (blue)
- TCP starts in slow start; the 1st event takes 19 rtt, or 380 ms
- The TCP congestion window grows nicely
- A response takes 2 rtt after 1.5 s
- Rate 10/s (with a 50 ms wait)
- The achievable transfer throughput grows to 800 Mbit/s
12 tcpmon: TCP Activity, Alberta-CERN Req-Resp, TCP stack tuned
- Round-trip time 150 ms
- 64-byte request (green), 1-Mbyte response (blue)
- TCP starts in slow start; the 1st event takes 11 rtt, or 1.67 s
- The TCP congestion window is in slow start up to 1.8 s, then in congestion avoidance
- A response takes 2 rtt after 2.5 s
- Rate 2.2/s (with a 50 ms wait)
- The achievable transfer throughput grows slowly from 250 to 800 Mbit/s
(A rough slow-start model follows below.)
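
A rough model of why the first event is so slow: during slow start the congestion window roughly doubles each rtt (or grows about 1.5x per rtt with delayed ACKs), so the first 1-Mbyte response costs many round trips regardless of link speed. The sketch below, with an assumed initial window and assumed growth factors, reproduces the order of magnitude of the measured 19 rtt; connection setup and the request/response turnarounds add further round trips on top.

    # Rough slow-start model (sketch): how many rtts to deliver a 1-Mbyte
    # response? The initial window and per-rtt growth factors are assumptions.
    MSS = 1448                 # segment payload seen in the tcpdump traces
    EVENT = 1_000_000          # 1-Mbyte event
    for growth in (2.0, 1.5):  # without / with delayed ACKs, roughly
        cwnd, sent, rtts = 2 * MSS, 0, 0
        while sent < EVENT:
            sent += cwnd       # one window of data per round trip
            cwnd *= growth
            rtts += 1
        print(f"growth x{growth}: ~{rtts} rtts "
              f"(~{rtts * 20} ms at a 20 ms rtt)")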
13 SC2004 Disk-Disk bbftp
- The bbftp file transfer program uses TCP/IP
- UKLight path London-Chicago-London; PCs: Supermicro with 3Ware RAID0
- MTU 1500 bytes, socket size 22 Mbytes, rtt 177 ms, SACK off
- Moving a 2-Gbyte file; Web100 plots
- Standard TCP: average 825 Mbit/s (bbcp: 670 Mbit/s)
- Scalable TCP: average 875 Mbit/s (bbcp: 701 Mbit/s, 4.5 s of overhead)
- Disk-TCP-Disk at 1 Gbit/s is here!
14 Time Series of Request-Response Latency
- Manchester-CERN
  - Round-trip time 20 ms
  - 1 Mbyte of data returned
  - Stable for 18 s at 42.5 ms
  - Then alternate points at 29 and 42.5 ms
- Alberta-CERN
  - Round-trip time 150 ms
  - 1 Mbyte of data returned
  - Stable for 150 s at 300 ms
  - Falls to 160 ms with 80 µs variation
15 Using the Trigger DAQ Application
16 Time Series of T/DAQ Event Rate
- Manchester-CERN
  - Round-trip time 20 ms
  - 1 Mbyte of data returned
  - 3 nodes: one Gigabit Ethernet node and two 100 Mbit nodes
  - 2 nodes: two 100 Mbit nodes
  - 1 node: one 100 Mbit node
- Event rate
  - Using the tcpmon transfer time of 42.5 ms and adding the time to return the data gives a 95 ms cycle
  - Expected rate: 10.5/s (see the arithmetic below)
  - Observed: 6/s for the gigabit node
  - Reason: the TCP buffers could not be set large enough in the T/DAQ application
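
The expected rate follows from simple arithmetic on the measured times. A sketch (the split of the 95 ms cycle into transfer and return phases is inferred, not measured separately):

    # Expected event rate from the slide's numbers (sketch).
    transfer = 0.0425   # s, 1-Mbyte transfer time measured with tcpmon
    cycle    = 0.095    # s, total per-event cycle including the return
    print(f"return phase ~{(cycle - transfer)*1000:.1f} ms")
    print(f"expected rate {1/cycle:.1f} events/s")   # ~10.5/s, as on the slide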
17 tcpdump of the Trigger DAQ Application
18 tcpdump of the T/DAQ dataflow at the SFI (1)
- CERN-Manchester, 1.0-Mbyte event; the remote EFD requests the event from the SFI
- Incoming event request, followed by the ACK
- The SFI sends the event, limited by the TCP receive buffer: time 115 ms (4 events/s)
- When TCP ACKs arrive, more data is sent
- N × 1448-byte packets
19 tcpdump of TCP Slow Start at the SFI (2)
- CERN-Manchester, 1.0-Mbyte event; the remote EFD requests the event from the SFI
- First event request
- The SFI sends the event, limited by TCP slow start: time 320 ms
- N × 1448-byte packets
- When ACKs arrive, more data is sent
20 tcpdump of the T/DAQ dataflow for SFI and SFO
- CERN-Manchester, another test run
- 1.0-Mbyte event
- The remote EFD requests events from the SFI
- The remote EFD sends the computation back to the SFO
- Links are closed by the application
- [Plot annotations: link setup, TCP slow start]
(A sketch for post-processing such traces follows below.)
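
A sketch of post-processing a tcpdump text trace into the kind of time-sequence view shown on these slides, extracting the timestamp and TCP sequence range of each data segment. It assumes output in the style of a recent tcpdump run with -tt -n; older tcpdump versions print sequence numbers in a different format.

    import re, sys

    # Matches e.g. "1101234567.123456 IP 10.0.0.1.32770 > 10.0.0.2.10000:
    # Flags [.], seq 1:1449, ack 1, win 5840, length 1448"
    pat = re.compile(r"^(\d+\.\d+) IP .* seq (\d+):(\d+)")

    t0 = None
    for line in open(sys.argv[1]):          # text trace from tcpdump -tt -n
        m = pat.match(line)
        if not m:
            continue                        # skip ACK-only and non-TCP lines
        t, lo, hi = float(m.group(1)), int(m.group(2)), int(m.group(3))
        t0 = t if t0 is None else t0
        print(f"{(t - t0) * 1000:10.3f} ms  bytes {lo}-{hi}  len {hi - lo}")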
21 Some First Conclusions
- The TCP protocol dynamics strongly influence the behaviour of the application.
- Care is required with the application design, e.g. the use of timeouts.
- With the correct TCP buffer sizes:
  - It is not throughput but the round-trip nature of the application protocol that determines performance.
  - Requesting the 1-2 Mbytes of data takes 1 or 2 round trips.
  - TCP slow start (the opening of Cwnd) considerably lengthens the time for the first block of data.
  - Implementation "improvements" (Cwnd reduction) kill performance!
- When the TCP buffer sizes are too small (the default):
  - The amount of data sent is limited on each rtt.
  - Data is sent, and arrives, in bursts.
  - It takes many round trips to send 1 or 2 Mbytes.
- The end hosts themselves:
  - CPU power is required for the TCP/IP stack as well as for the application.
  - Packets can be lost in the IP stack due to lack of processing power.
(A sketch of setting the TCP buffer sizes follows below.)
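
A minimal sketch of the buffer-size fix in Python: setting the socket buffers to the bandwidth-delay product before the connection is established. The 1 Gbit/s x 150 ms path is the Alberta-CERN assumption; note that Linux silently caps these values at net.core.rmem_max / wmem_max, so the kernel limits must be raised as well.

    import socket

    BDP = int(1e9 / 8 * 0.150)    # ~18.75 Mbytes: 1 Gbit/s x 150 ms rtt

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Buffers must be set before connect()/accept() so that the TCP window
    # scale option is negotiated to match.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BDP)
    print("sndbuf:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
          "rcvbuf:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))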
22 Summary
- We are investigating the technical feasibility of remote real-time computing for ATLAS.
- We have exercised multiple 1 Gbit/s connections between CERN and universities located in Canada, Denmark, Poland and the UK.
- Network providers are very helpful and interested in our experiments.
- We have developed a set of tests for characterising the network connections.
- Network behaviour is generally good, e.g. little packet loss is observed.
- Backbones tend to be over-provisioned; however, access links and campus LANs need care.
- Properly configured end nodes are essential for getting good results with real applications.
- Collaboration between the experts from the application and network teams is progressing well, and is required to achieve performance.
- Although the application is ATLAS-specific, the information presented on the network interactions is applicable to other areas, including:
  - Remote iSCSI
  - Remote database accesses
  - Real-time Grid computing, e.g. real-time interactive medical image processing
23 Thanks to all who helped, including:
- National research networks: Canarie, Dante, DARENET, Netera, PSNC and UKERNA
- ATLAS remote farms: J. Beck Hansen, R. Moore, R. Soluk, G. Fairey, T. Bold, A. Waananen, S. Wheeler, C. Bee
- ATLAS online and dataflow software: S. Kolos, S. Gadomski, A. Negri, A. Kazarov, M. Dobson, M. Caprini, P. Conde, C. Haeberli, M. Wiesmann, E. Pasqualucci, A. Radu
24 More Information: Some URLs
- Real-Time Remote Farm site: http://csr.phys.ualberta.ca/real-time
- UKLight web site: http://www.uklight.ac.uk
- DataTAG project web site: http://www.datatag.org/
- UDPmon / TCPmon kit and writeup: http://www.hep.man.ac.uk/rich/ (Software Tools)
- Motherboard and NIC tests: http://www.hep.man.ac.uk/rich/net/nic/GigEth_tests_Boston.ppt and http://datatag.web.cern.ch/datatag/pfldnet2003/
- "Performance of 1 and 10 Gigabit Ethernet Cards with Server Quality Motherboards", FGCS special issue, 2004: http://www.hep.man.ac.uk/rich/ (Publications)
- TCP tuning information: http://www.ncne.nlanr.net/documentation/faq/performance.html and http://www.psc.edu/networking/perf_tune.html
- TCP stack comparisons: "Evaluation of Advanced TCP Stacks on Fast Long-Distance Production Networks", Journal of Grid Computing, 2004: http://www.hep.man.ac.uk/rich/ (Publications)
- PFLDnet: http://www.ens-lyon.fr/LIP/RESO/pfldnet2005/
- Dante PERT: http://www.geant2.net/server/show/nav.00d00h002
27 End Hosts and NICs, CERN-Manchester: Throughput, Packet Loss, Re-Ordering
- Use UDP packets to characterise the host and NIC
  - SuperMicro P4DP8 motherboard
  - Dual Xeon 2.2 GHz CPUs
  - 400 MHz system bus
  - 66 MHz 64-bit PCI bus
- Request-response latency
28 TCP (Reno) Details
- The time for TCP to recover its throughput after one lost packet is τ = C · rtt² / (2 · MSS), where C is the link capacity and MSS the maximum segment size.
- [Plot: recovery time versus rtt, with example round-trip times marked: UK 6 ms, Europe 20 ms, USA 150 ms; for an rtt of 200 ms the recovery time is about 2 min]
(The formula is evaluated numerically below.)
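
Evaluating the recovery-time formula numerically, as a sketch; the capacities and the MSS are assumed values, chosen to bracket the links discussed on these slides:

    # Reno recovery time tau = C * rtt^2 / (2 * MSS) for one lost packet.
    MSS = 1460                             # bytes, an assumed segment size
    for gbps in (0.1, 1.0):                # 100 Mbit/s and 1 Gbit/s
        C = gbps * 1e9 / 8                 # capacity in bytes/s
        for rtt_ms in (6, 20, 150, 200):   # UK, Europe, USA, slide example
            rtt = rtt_ms / 1000
            tau = C * rtt ** 2 / (2 * MSS) # time to recover from one loss
            print(f"{gbps:3} Gbit/s, rtt {rtt_ms:3} ms: recovery {tau:7.1f} s")

The quadratic dependence on rtt is the point: a loss that costs a fraction of a second on a UK path costs minutes on a transatlantic one.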