Transcript and Presenter's Notes

Title: GNEW2004 CERN March 2004


1
End-2-End Network Monitoring: What do we do?
What do we use it for?
Richard Hughes-Jones (many people are involved)
2
DataGrid WP7 Network Monitoring Architecture for
Grid Sites
  • Diagram components: Grid Apps (GridFTP), LDAP Schema,
    local LDAP Server, Local Network Monitoring Store,
    Analysis of Data (Access)
  • Backend LDAP script to fetch metrics; Monitor
    process to push metrics
  • Monitoring tools: PingER (RIPE TTB), iperf, UDPmon,
    rTPL, NWS, etc.
  • Grid Application access via the LDAP Schema to
    - monitoring metrics
    - location of monitoring data
    (a minimal query sketch follows this slide)
  • Access to current and historic data and metrics
    via the Web, i.e. WP7 NM Pages; access to metric
    forecasts
Robin Tasker
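As an illustration of the access path sketched above, here is a minimal query sketch in Python using the ldap3 library; the host, base DN, object class and attribute names are assumptions for illustration and are not the actual WP7 LDAP schema.

    # Hypothetical sketch: query a site's local LDAP server for published
    # network metrics. Host, DN, object class and attributes are assumptions.
    from ldap3 import Server, Connection, ALL

    server = Server('ldap://monitor.example-site.org', get_info=ALL)
    conn = Connection(server, auto_bind=True)        # anonymous bind

    conn.search(
        search_base='mds-vo-name=local,o=grid',      # assumed base DN
        search_filter='(objectClass=networkMetric)', # assumed object class
        attributes=['metricName', 'metricValue', 'sourceHost',
                    'destinationHost', 'timestamp'],
    )

    for entry in conn.entries:
        # e.g. metricName=udpmon.throughput with a value in Mbit/s
        print(entry.metricName, entry.metricValue, entry.timestamp)

    conn.unbind()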
3
WP7 Network Monitoring Components
4
WP7 MapCentre Grid Monitoring Visualisation
  • Grid network monitoring architecture uses LDAP
    and R-GMA (DataGrid WP7)
  • Central MySQL archive hosting all network metrics
    and GridFTP logging
  • Probe Coordination Protocol deployed, scheduling
    tests
  • MapCentre also provides site and node Fabric health
    checks

Franck Bonnassieux CNRS Lyon
5
WP7 MapCentre Grid Monitoring Visualisation
  • Plots: CERN-RAL UDP, CERN-IN2P3 UDP, CERN-RAL TCP,
    CERN-IN2P3 TCP
6
UK e-Science Network Monitoring
  • Technology Transfer
  • DataGrid WP7 (Manchester)
  • UK e-Science (Daresbury Laboratory)
  • DataGrid WP7 (Manchester)
  • Architecture

7
UK e-Science Network Problem Solving
24 Jan to 4 Feb 04: TCP iperf from RAL to HEP sites. Only 2
sites > 80 Mbit/s; RAL -> DL 250-300 Mbit/s
8
Tools: UDPmon - Latency and Throughput
  • UDP/IP packets sent between end systems
  • Latency
  • Round-trip times using Request-Response UDP
    frames
  • Latency as a function of frame size
  • Slope s given by the sum of the inverse data-transfer
    rates along the path: mem-mem copy(s), PCI, Gig
    Ethernet, PCI, mem-mem copy(s)
  • Intercept indicates processing times and HW
    latencies
  • Histograms of singleton measurements
  • UDP Throughput (a minimal probe sketch follows this
    list)
  • Send a controlled stream of UDP frames spaced at
    regular intervals
  • Vary the frame size and the frame transmit
    spacing; measure:
  • The time of first and last frames received
  • The number of packets received, lost, out of order
  • Histogram of inter-packet spacing of received packets
  • Packet loss pattern
  • 1-way delay
  • CPU load
  • Number of interrupts
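The following is a minimal sketch of the sender side of such a throughput test in Python, assuming a hypothetical receiver host and port that count frames and report back; the frame size, spacing and report format are illustrative assumptions, not UDPmon itself.

    # Minimal sketch of a UDPmon-style throughput probe (sender side).
    # The receiver host/port, frame size and spacing are assumptions for
    # illustration; this is not the UDPmon implementation.
    import socket
    import time

    DEST = ("receiver.example.org", 5001)   # assumed receiver host/port
    FRAME_SIZE = 1472                       # UDP payload filling a 1500-byte MTU
    SPACING_US = 100                        # inter-frame transmit spacing
    N_FRAMES = 10000

    def send_stream():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        payload = bytearray(FRAME_SIZE)
        spacing = SPACING_US * 1e-6
        t_start = time.perf_counter()
        next_send = t_start
        for seq in range(N_FRAMES):
            payload[0:4] = seq.to_bytes(4, "big")   # sequence number for loss / reorder checks
            sock.sendto(payload, DEST)
            next_send += spacing
            while time.perf_counter() < next_send:  # busy-wait to hold the spacing
                pass
        t_end = time.perf_counter()
        sent_bits = N_FRAMES * FRAME_SIZE * 8
        print("offered load: %.1f Mbit/s" % (sent_bits / (t_end - t_start) / 1e6))
        # The receiver would report frames received, lost and out of order,
        # plus the inter-packet spacing histogram, giving the achieved throughput.

    if __name__ == "__main__":
        send_stream()

Sweeping FRAME_SIZE and SPACING_US maps out the kind of throughput scan described in the list above.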

9
UDPmon Example: 1 Gigabit NIC, Intel PRO/1000
  • Motherboard: Supermicro P4DP6
  • Chipset: E7500 (Plumas)
  • CPU: Dual Xeon 2.2 GHz with 512k L2 cache
  • Memory bus 400 MHz; PCI-X 64 bit 66 MHz
  • HP Linux Kernel 2.4.19 SMP
  • MTU 1500 bytes
  • Intel PRO/1000 XT

Plots: Throughput, Latency, Bus Activity
10
Tools: Trace-Rate - Hop-by-hop measurements
  • A method to measure the hop-by-hop capacity,
    delay, and loss up to the path bottleneck
  • Not intrusive
  • Operates in a high-performance environment
  • Does not need cooperation of the destination
  • Based on the Packet Pair Method (see the worked
    example below)
  • Send sets of back-to-back packets with increasing
    time to live
  • For each set, filter noise from the RTTs
  • Calculate the spacing and hence the bottleneck
    bandwidth
  • Robust regarding the presence of invisible nodes

Figure: effect of the bottleneck on a packet pair; L is
the packet size, C is the capacity
Figure: examples of parameters that are iteratively
analysed to extract the capacity mode
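As a worked illustration of the packet-pair idea mentioned above (the standard dispersion argument, with assumed example numbers rather than figures from the Trace-Rate work): the bottleneck link serialises each packet of size L at capacity C, so the spacing of the pair after the bottleneck gives the capacity directly.

    % Packet-pair dispersion (example numbers are assumptions)
    \Delta t = \frac{L}{C}
    \quad\Longrightarrow\quad
    C = \frac{L}{\Delta t}
    % e.g. L = 1500 bytes and a measured \Delta t = 12\,\mu\text{s} give
    C = \frac{1500 \times 8\ \text{bit}}{12 \times 10^{-6}\ \text{s}} = 1\ \text{Gbit/s}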
11
Tools: Trace-Rate - Some Results
  • Capacity measurements as a function of load in
    Mbit/s from tests on the DataTAG Link
  • Comparison of the number of packets required
  • Validated by simulations in NS-2
  • Linux implementations, working in a
    high-performance environment
  • Research report: http://www.inria.fr/rrrt/rr-4959.html
  • Research paper: ICC2004, International
    Conference on Communications, Paris, France, June
    2004. IEEE Communications Society.

12
Network Monitoring as a Tool to study:
  • Protocol Behaviour
  • Network Performance
  • Application Performance
  • Tools include:
  • web100
  • tcpdump
  • Output from the test tool
  • UDPmon, iperf,
  • Output from the application
  • GridFTP, bbcp, Apache

13
Protocol Performance: RUDP
Hans Blom
  • Monitoring from a Data Moving Application /
    Network Test Program
  • DataTAG WP3 work
  • Test Setup
  • Path: Amsterdam - Chicago - Amsterdam, Force10
    loopback
  • Moving data from the DAS-2 cluster with RUDP, a
    UDP-based transport
  • Apply 1111 TCP background streams from iperf
  • Conclusions
  • RUDP performs well
  • It does back off and share bandwidth
  • Rapidly expands when bandwidth is free
14
Performance of the GÉANT Core Network
  • Test Setup
  • Supermicro PCs in the London and Amsterdam GÉANT PoPs
  • Smartbits in the London and Frankfurt GÉANT PoPs
  • Long link: UK-SE-DE2-IT-CH-FR-BE-NL
  • Short link: UK-FR-BE-NL
  • Network Quality Of Service
  • LBE, IP Premium
  • High-Throughput Transfers
  • Standard and advanced TCP stacks
  • Packet re-ordering effects

15
Tests on the GÉANT Core: Packet re-ordering
  • Effect of LBE background
  • Amsterdam - London
  • BE test flow
  • Packets sent at 10 µs spacing (line speed)
  • 10,000 sent
  • Packet loss 0.1%
  • Re-order distributions

16
Application Throughput: Web100
  • 2 Gbyte file transferred from RAID0 disks
  • Web100 output every 10 ms
  • GridFTP
  • See alternating 600/800 Mbit/s and zero
  • Apache web server, curl-based client
  • See steady 720 Mbit/s

17
VLBI Project: Throughput, Jitter, 1-way Delay, Loss
  • 1472 byte packets Manchester -> Dwingeloo (JIVE)
  • 1472 byte packets Manchester -> JIVE
  • FWHM 22 µs (back-to-back 3 µs)
  • Packet loss distribution
  • Prob. Density Function P(t) = λ e^(-λt)
  • Mean λ = 2360 /s, i.e. 426 µs
  • 1-way delay: note the packet loss (points with
    zero 1-way delay)

18
Passive Monitoring
  • Time-series data from Routers and Switches
  • Immediate, but usually historical - MRTG
  • Usually derived from SNMP (a rate-from-counters
    sketch follows this list)
  • Misconfigured / infected / misbehaving End
    Systems (or Users?)
  • Note: Data Protection Laws and confidentiality
  • Site, MAN and backbone topology and load
  • Helps the user/sysadmin to isolate a problem, e.g. a
    low TCP transfer rate
  • Essential for Proof of Concept tests or Protocol
    testing
  • Trends used for capacity planning
  • Control of P2P traffic
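To show how MRTG-style time series are derived from SNMP counters, here is a minimal sketch assuming two ifInOctets samples have already been fetched; the counter values and the 300 s polling interval are illustrative assumptions.

    # Minimal sketch: derive an interface rate from two SNMP ifInOctets
    # samples, as MRTG-style tools do. Sample values and interval are
    # assumptions for illustration.
    COUNTER_MAX = 2**32          # a 32-bit Counter32 wraps at 2^32

    def rate_mbit_per_s(octets_t0, octets_t1, interval_s, counter_max=COUNTER_MAX):
        """Average rate in Mbit/s between two counter samples,
        allowing for a single counter wrap."""
        delta = octets_t1 - octets_t0
        if delta < 0:                    # counter wrapped between the samples
            delta += counter_max
        return delta * 8 / interval_s / 1e6

    # Example with assumed samples taken 300 s apart (a typical MRTG interval):
    print(rate_mbit_per_s(3912004118, 4130518230, 300))   # about 5.8 Mbit/s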

19
Users: The Campus and the MAN (1)
Pete White, Pat Myers
  • NNW to SJ4 access: 2.5 Gbit/s PoS; hits 1 Gbit/s,
    50%
  • Manchester to NNW access: 2 x 1 Gbit Ethernet

20
Users: The Campus and the MAN (2)
  • Message
  • Not a complaint
  • Continue to work with your network group
  • Understand the traffic levels
  • Understand the network topology
  • LMN to site 1 access: 1 Gbit Ethernet
  • LMN to site 2 access: 1 Gbit Ethernet

21
VLBI Traffic Flows
Only testing - could be worse!
  • Manchester - NetNorthWest - SuperJANET access
    links
  • Two 1 Gbit/s
  • Access links: SJ4 to GÉANT, GÉANT to
    SURFnet

22
GGF Hierarchy Characteristics Document
  • Network Measurement Working Group
  • A Hierarchy of Network Performance
    Characteristics for Grid Applications and
    Services
  • Document defines terms and relations:
  • Network characteristics
  • Measurement methodologies
  • Observation
  • Discusses Nodes and Paths
  • For each Characteristic:
  • Defines the meaning
  • Attributes that SHOULD be included
  • Issues to consider when making an observation
  • Status
  • Originally submitted to the GFSG as a Community
    Practice Document, draft-ggf-nmwg-hierarchy-00.pdf,
    Jul 2003
  • Revised to Proposed Recommendation:
    http://www-didc.lbl.gov/NMWG/docs/draft-ggf-nmwg-hierarchy-02.pdf,
    7 Jan 04
  • Now in a 60-day public comment period from 28 Jan 04;
    18 days to go.

23
GGF Schemata for Network Measurements
  • Request Schema
  • Ask for results / ask to make a test
  • Schema Requirements Document produced
  • Use DAMED-style names, e.g. path.delay.oneWay
  • Send: Characteristic, Time, Subject (node /
    path), Methodology, Statistics
  • Response Schema
  • Interpret results
  • Includes Observation environment
  • Much work in progress
  • Common components
  • Drafts almost done
  • 2 (3) proof-of-concept implementations
  • 2 implementations using XML-RPC by Internet2 and
    SLAC (a hypothetical request sketch follows this list)
  • Implementation in progress using Document/Literal
    by DL and UCL
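To make the request idea concrete, the following is a hypothetical sketch of such an exchange from a client over XML-RPC; the endpoint URL, method name and field names are assumptions for illustration and do not reproduce the NMWG schema or the Internet2 / SLAC implementations.

    # Hypothetical sketch of an NMWG-style measurement request over XML-RPC.
    # Endpoint, method name and dictionary keys are assumptions only.
    import xmlrpc.client

    proxy = xmlrpc.client.ServerProxy("http://measurement.example.org/nmwg")

    request = {
        "characteristic": "path.delay.oneWay",       # DAMED-style name
        "subject": {"src": "host-a.example.org",     # node / path under test
                    "dst": "host-b.example.org"},
        "time": {"start": "2004-03-15T00:00:00Z",
                 "end": "2004-03-16T00:00:00Z"},
        "statistics": "mean",                        # requested summary
    }

    # Ask for existing results (a real request schema could also ask to run a test)
    response = proxy.getResults(request)             # hypothetical method name
    for observation in response:
        print(observation)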

24
So What do we Use Monitoring for? A Summary
  • Detect or X-check problem reports
  • Isolate / determine a performance issue
  • Capacity planning
  • Publication of data: network cost for
    middleware
  • RBs (Resource Brokers) for optimized matchmaking
  • WP2 Replica Manager
  • Capacity planning
  • SLA verification
  • Isolate / determine a throughput bottleneck; work
    with real user problems
  • Test conditions for Protocol/HW investigations
  • Protocol performance / development
  • Hardware performance / development
  • Application analysis
  • Input to middleware, e.g. GridFTP throughput
  • Isolate / determine a (user) performance issue
  • Hardware / protocol investigations
  • End2End Time Series
  • Throughput UDP/TCP
  • RTT
  • Packet loss
  • Passive Monitoring
  • Routers and Switches: SNMP, MRTG
  • Historical: MRTG
  • Packet/Protocol Dynamics
  • tcpdump
  • web100
  • Output from Application tools

25
More Information - Some URLs
  • DataGrid WP7 Mapcenter: http://ccwp7.in2p3.fr/wp7archive/
  • http://mapcenter.in2p3.fr/datagrid-rgma/
  • UK e-Science monitoring: http://gridmon.dl.ac.uk/gridmon/
  • MB-NG project web site: http://www.mb-ng.net/
  • DataTAG project web site: http://www.datatag.org/
  • UDPmon / TCPmon kit writeup: http://www.hep.man.ac.uk/rich/net
  • Motherboard and NIC tests: www.hep.man.ac.uk/rich/net
  • IEPM-BW site: http://www-iepm.slac.stanford.edu/bw

26
(No Transcript)
27
  • Network Monitoring to Grid Sites
  • Network Tools Developed
  • Using Network Monitoring as a Study Tool
  • Applications and Network Monitoring: real users
  • Passive Monitoring
  • Standards: links to the GGF

28
Data Flow: SuperMicro 370DLE, SysKonnect NIC
  • Motherboard: SuperMicro 370DLE; Chipset:
    ServerWorks III LE
  • CPU: PIII 800 MHz; PCI 64 bit 66 MHz
  • RedHat 7.1, Kernel 2.4.14
  • 1400 bytes sent
  • Wait 100 µs
  • 8 µs for send or receive
  • Stack and Application overhead 10 µs / node

29
10 GigEthernet Throughput
  • 1500 byte MTU gives 2 Gbit/s
  • Used 16144 byte MTU, max user length 16080 bytes
  • DataTAG Supermicro PCs
  • Dual 2.2 GHz Xeon CPU FSB 400 MHz
  • PCI-X mmrbc 512 bytes
  • wire rate throughput of 2.9 Gbit/s
  • SLAC Dell PCs
  • Dual 3.0 GHz Xeon CPU, FSB 533 MHz
  • PCI-X mmrbc 4096 bytes
  • giving a wire rate of 5.4 Gbit/s
  • CERN OpenLab HP Itanium PCs
  • Dual 1.0 GHz 64 bit Itanium CPU FSB 400 MHz
  • PCI-X mmrbc 4096 bytes

30
Tuning PCI-X: Variation of mmrbc (max memory read byte count), IA32
  • 16080 byte packets every 200 µs
  • Intel PRO/10GbE LR Adapter
  • PCI-X bus occupancy vs mmrbc
  • Plots show:
  • Measured times
  • Times based on PCI-X times from the logic
    analyser
  • Expected throughput

31
10 GigEthernet at SC2003 BW Challenge
  • Three Server systems with 10 GigEthernet NICs
  • Used the DataTAG altAIMD stack, 9000 byte MTU
  • Sent mem-mem iperf TCP streams from the SLAC/FNAL
    booth in Phoenix to:
  • Palo Alto PAIX
  • rtt 17 ms, window 30 MB
  • Shared with the Caltech booth
  • 4.37 Gbit/s HSTCP I5
  • Then 2.87 Gbit/s I16
  • Fall corresponds to 10 Gbit/s on the link
  • 3.3 Gbit/s Scalable I8
  • Tested 2 flows, sum 1.9 Gbit/s I39
  • Chicago Starlight
  • rtt 65 ms, window 60 MB
  • Phoenix CPU 2.2 GHz

32
Summary and Conclusions
  • Intel PRO/10GbE LR Adapter and driver gave stable
    throughput and worked well
  • Need a large MTU (9000 or 16114); 1500 bytes gives
    2 Gbit/s
  • PCI-X tuning: mmrbc of 4096 bytes gives an increase of
    55% (3.2 to 5.7 Gbit/s)
  • PCI-X sequences clear on transmit; gaps 950 ns
  • Transfers: transmission (22 µs) takes longer than
    receiving (18 µs)
  • Tx rate 5.85 Gbit/s, Rx rate 7.0 Gbit/s (Itanium)
    (PCI-X max 8.5 Gbit/s)
  • CPU load considerable: 60% Xeon, 40% Itanium
  • Bandwidth of the memory system is important: data
    crosses it 3 times!
  • Sensitive to OS / driver updates
  • More study needed

33
PCI Activity: Read Multiple data blocks, 0 wait
  • Read 999424 bytes
  • Each data block:
  • Setup CSRs
  • Data movement
  • Update CSRs
  • For 0 wait between reads:
  • Data blocks 600 µs long, take 6 ms
  • Then a 744 µs gap
  • PCI transfer rate 1188 Mbit/s (148.5 Mbytes/s)
  • Read_sstor rate 778 Mbit/s (97 Mbyte/s)
  • PCI bus occupancy 68.44%
  • Concern about Ethernet traffic: 64 bit 33 MHz PCI
    needs 82% occupancy for 930 Mbit/s; expect 360 Mbit/s

Plot labels: Data transfer; Data Block 131,072 bytes; CSR Access; PCI Burst 4096 bytes
34
PCI Activity: Read Throughput
  • Flat, then 1/t dependence
  • 860 Mbit/s for read blocks > 262144 bytes
  • CPU load 20%
  • Concern about the CPU load needed to drive a Gigabit
    link

35
BaBar Case Study: RAID Throughput and PCI Activity
  • 3Ware 7500-8, RAID5, parallel EIDE
  • 3Ware forces the PCI bus to 33 MHz
  • BaBar Tyan to MB-NG SuperMicro: network mem-mem
    619 Mbit/s
  • Disk-to-disk throughput with bbcp: 40-45 Mbytes/s
    (320-360 Mbit/s)
  • PCI bus effectively full!

Plots: Read from RAID5 Disks, Write to RAID5 Disks
36
BaBar: Serial ATA RAID Controllers
  • 3Ware 66 MHz PCI
  • ICP 66 MHz PCI

37
VLBI Project: Packet Loss Distribution
  • Measure the time between lost packets in the time
    series of packets sent.
  • Lost 1410 in 0.6 s
  • Is it a Poisson process?
  • Assume the Poisson process is stationary: λ(t) = λ
  • Use the Prob. Density Function P(t) = λ e^(-λt)
  • Mean λ = 2360 /s, i.e. 426 µs (a short check follows
    this list)
  • Plot log slope -0.0028; expect -0.0024
  • Could be an additional process involved
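A quick numerical check of the figures quoted above, using only the numbers on the slide; treating the plotted log slope as a natural-log slope per microsecond is an assumption about the plot's scaling.

    # Check the Poisson-loss numbers quoted on the slide.
    lost, duration_s = 1410, 0.6

    lam = lost / duration_s          # loss rate: 2350 /s, close to the quoted 2360 /s
    mean_interval_us = 1e6 / lam     # ~425 µs, close to the quoted 426 µs

    # For an exponential PDF P(t) = lam * exp(-lam * t), the natural-log slope
    # of the inter-loss-time histogram is -lam; expressed per microsecond:
    expected_slope_per_us = -lam * 1e-6
    print(lam, mean_interval_us, expected_slope_per_us)   # ~ -0.0024, as expected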