Internet2 E2E piPEs Project - PowerPoint PPT Presentation

1
Internet2 E2E piPEs Project
  • Eric L. Boyd

2
Internet2 E2E piPEs
  • Project: End-to-End Performance Initiative
    Performance Environment System (E2E piPEs)
  • Approach: Collaborative project combining the
    best work of many organizations, including
    DANTE/GEANT, Daresbury, EGEE, GGF NMWG,
    NLANR/DAST, UCL, Georgia Tech, etc.

3
Internet2 E2E piPEs Goals
  • Enable end-users and network operators to
  • determine E2E performance capabilities
  • locate E2E problems
  • contact the right person to get an E2E problem
    resolved
  • Enable remote initiation of partial path
    performance tests
  • Make partial path performance data publicly
    available
  • Interoperate with other performance measurement
    frameworks

4
Measurement Infrastructure Components
5
Sample piPEs Deployment
6
Project Phases
  • Phase 1: Tool Beacons
  • BWCTL (Complete), http://e2epi.internet2.edu/bwctl
  • OWAMP (Complete), http://e2epi.internet2.edu/owamp
  • NDT (Complete), http://e2epi.internet2.edu/ndt
  • Phase 2: Measurement Domain Support
  • General Measurement Infrastructure (Prototype)
  • Abilene Measurement Infrastructure Deployment
    (Complete), http://abilene.internet2.edu/observatory
  • Phase 3: Federation Support
  • AA (Prototype: optional AES key, policy file,
    limits file)
  • Discovery (Measurement Nodes, Databases)
    (Prototype: nearest NDT server, web page)
  • Test Request/Response Schema Support (Prototype:
    GGF NMWG Schema)

7
piPEs Deployment
8
NDT (Rich Carlson)
  • Network Diagnostic Tester
  • Developed at Argonne National Lab
  • Ongoing integration into piPEs framework
  • Redirects from well-known host to nearest
    measurement node
  • Detects common performance problems in the first
    mile (edge to campus DMZ)
  • In deployment on Abilene
  • http//ndt-seattle.abilene.ucaid.edu7123

9
NDT Milestones
  • New Features added
  • Configuration file support
  • Scheduling/queuing support
  • Simple server discovery protocol
  • Federation mode support
  • Command line client support
  • Open Source Shared Development
  • http://sourceforge.net/projects/ndt/

10
NDT Future Directions
  • Focus on improving problem detection algorithms
  • Duplex mismatch
  • Link detection
  • Complete deployment in Abilene POPs
  • Expand deployment into University campus/GigaPoP
    networks

11
How Can You Participate?
  • Set up BWCTL, OWAMP, NDT Beacons
  • Set up a measurement domain
  • Place tool beacons intelligently
  • Determine locations
  • Determine policy
  • Determine limits
  • Register beacons
  • Install piPEs software
  • Run regularly scheduled tests
  • Store performance data
  • Make performance data available via web service
  • Make visualization CGIs available
  • Solve Problems / Alert us to Case Studies
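The "run regularly scheduled tests / store performance data" steps above can be sketched as a small driver loop. This is a minimal sketch, not piPEs code: the function names are hypothetical, and the stubbed test function stands in for a real BWCTL/Iperf wrapper.

```python
import time
from datetime import datetime, timezone

def run_scheduled_tests(run_test, targets, store, rounds=1, interval_s=0.0):
    """Run one measurement per target per round and persist each result.

    run_test(target) -> throughput in Mb/s (in practice, a wrapper
    around a BWCTL/Iperf invocation); store(record) saves one result.
    """
    for round_no in range(rounds):
        for target in targets:
            store({
                "target": target,
                "time": datetime.now(timezone.utc).isoformat(),
                "mbps": run_test(target),
            })
        # Pause before the next round (skipped after the last one).
        if interval_s and round_no < rounds - 1:
            time.sleep(interval_s)

# Stubbed usage: a fake tester instead of a real bwctl invocation.
results = []
run_scheduled_tests(lambda t: 940.0, ["nyc-pmp", "chi-pmp"], results.append)
```

In a real deployment the stored records would feed the web service and visualization CGIs listed above.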

12
Example piPEs Use Cases
  • Edge-to-Middle (On-Demand)
  • Automatic 2-Ended Test Set-up
  • Middle-to-Middle (Regularly Scheduled)
  • Raw Data feeds for 3rd-Party Analysis Tools
  • http://vinci.cacr.caltech.edu:8080/
  • Quality Control of Network Infrastructure
  • Edge-to-Edge (Regularly Scheduled)
  • Quality Control of Application Communities
  • Edge-to-Campus DMZ (On-Demand)
  • Coupled with Regularly Scheduled Middle-to-Middle
  • End User determines who to contact about
    performance problem, armed with proof

13
Test from the Edge to the Middle
  • Divide and conquer: Partial Path Analysis
  • Install OWAMP and/or BWCTL
  • Where are the nodes?
  • http://e2epi.internet2.edu/pipes/pmp/pmp-dir.html
  • Begin testing!
  • http://e2epi.internet2.edu/pipes/ami/bwctl/
  • Key Required
  • http://e2epi.internet2.edu/pipes/ami/owamp/
  • No Key Required

14
Example piPEs Use Cases
  • Edge-to-Middle (On-Demand)
  • Automatic 2-Ended Test Set-up
  • Middle-to-Middle (Regularly Scheduled)
  • Raw Data feeds for 3rd-Party Analysis Tools
  • http://vinci.cacr.caltech.edu:8080/
  • Quality Control of Network Infrastructure
  • Edge-to-Edge (Regularly Scheduled)
  • Quality Control of Application Communities
  • Edge-to-Campus DMZ (On-Demand)
  • Coupled with Regularly Scheduled Middle-to-Middle
  • End User determines who to contact about
    performance problem, armed with proof

15
Abilene Measurement Domain
  • Part of the Abilene Observatory
  • http://abilene.internet2.edu/observatory
  • Regularly scheduled OWAMP (1-way latency) and
    BWCTL/Iperf (Throughput, Loss, Jitter) Tests
  • Web pages displaying
  • Latest results: http://abilene.internet2.edu/ami/bwctl_status.cgi/TCP/now
  • Weathermap: http://abilene.internet2.edu/ami/bwctl_status_map.cgi/TCP/now
  • Worst 10 Performing Links: http://abilene.internet2.edu/ami/bwctl_worst_case.cgi/TCP/now
  • Data available via web service
  • http://abilene.internet2.edu/ami/webservices.html

16
Quality Control of Abilene Measurement
Infrastructure (1)
  • Problem-Solving Approach
  • Ongoing measurements start detecting a problem
  • Ad-hoc measurements used for problem diagnosis
  • Ongoing Measurements
  • Expect Gbps flows on Abilene
  • Stock TCP stack (albeit tuned)
  • Very sensitive to loss
  • Canary in a coal mine
  • Web100 just deployed for additional reporting
  • Skeptical eye
  • Apparent problem could reflect interface
    contention

17
Quality Control of Abilene Measurement
Infrastructure (2)
  • Regularly Scheduled Tests
  • Track TCP and UDP Flows (BWCTL/Iperf)
  • Track One-way Delays (OWAMP)
  • IPv4 and IPv6
  • Observe
  • Worst 10 TCP flows
  • First-percentile TCP flow
  • Fiftieth-percentile TCP flow
  • What percentile breaks the 900 Mbps threshold?
  • General Conclusions
  • On Abilene, IPv4 and IPv6 are statistically
    indistinguishable
  • Consistently low values to one host or across one
    path indicate a problem
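The observations above (worst-10 flows, first and fiftieth percentile, share of flows under the 900 Mbps threshold) reduce to simple order statistics over the scheduled-test results. A minimal sketch, with illustrative data only:

```python
def worst_n(flows, n=10):
    """Return the n lowest-throughput flows, worst first.

    flows: list of (link, mbps) pairs from regularly scheduled tests.
    """
    return sorted(flows, key=lambda f: f[1])[:n]

def percentile(values, p):
    """Nearest-rank percentile (p in 0..100) of a list of throughputs."""
    ranked = sorted(values)
    k = int(round(p / 100.0 * (len(ranked) - 1)))
    return ranked[min(max(k, 0), len(ranked) - 1)]

def percent_below(values, threshold_mbps):
    """Percentage of flows under a throughput threshold."""
    return 100.0 * sum(v < threshold_mbps for v in values) / len(values)
```

For example, `percentile(samples, 1)` gives the first-percentile flow tracked in the slides, and `percent_below(samples, 900)` answers "what percentile breaks the 900 Mbps threshold".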

18
A (Good) Day in the Life of Abilene
19
First two weeks in March: 50th percentile right at
980 Mb/s; 1st percentile about 900 Mb/s. Take it as
a baseline.
20
Beware the Ides of March: 1st percentile down to
522 Mb/s. Circuit problems along the west coast.
N.b. the 50th percentile is very robust.
21
Recovery, sort of, through 29 April: 1st
percentile back up to the mid-800s, but lower and
shakier. N.b. the 50th percentile is still very robust.
22
Ah, sudden improvement through 5 May: 1st
percentile back up above 900 Mb/s and more
stable. But why?
23
Then, while Matt Z is tearing up the tracks, the 1st
percentile drops back down to the 500s. Diagnosis:
something is killing Seattle. Oh, and Sunnyvale
is off the air.
24
Matt fixes Sunnyvale, and things get (slightly)
worse: both Seattle and Sunnyvale are bad. 1st
percentile right at 500 Mb/s. Diagnosis: a web100
interaction.
25
Matt fixes the web100 interaction. 1st percentile
cruising through 700 Mb/s. Life is good.
26
Friday the (almost) 13th: a JUNOS upgrade induces
packet loss for about four hours along many
links. The 1st percentile falls to 63
Mb/s. Long-distance paths are chiefly impacted.
27
A Known Problem
  • Mid-May: routers all got a new software load to
    enable a new feature
  • Everything seemed to come up, but on some links,
    utilization did not rebound
  • Worst-10 reflected very low performance across
    those links
  • Cause: a QoS parameter configuration format change

28
(No Transcript)
29
Nice weekend. 1st percentile rises to 968
Mb/s. But why?
30
(No Transcript)
31
We Found It First
  • Streams over the SNVA-LOSA link all showed problems
  • NOC responded: found errors on the SNVA-LOSA link
  • (NOC is now tracking errors more closely)
  • Live (URL subject to change): http://abilene.internet2.edu/ami/bwctl_percentile.cgi/TCPV4/1/50/14118254811367342080_14169839516075950080

32
Example piPEs Use Cases
  • Edge-to-Middle (On-Demand)
  • Automatic 2-Ended Test Set-up
  • Middle-to-Middle (Regularly Scheduled)
  • Raw Data feeds for 3rd-Party Analysis Tools
  • http://vinci.cacr.caltech.edu:8080/
  • Quality Control of Network Infrastructure
  • Edge-to-Edge (Regularly Scheduled)
  • Quality Control of Application Communities
  • ESNet / ITECs (3x3): see Joe Metzger's talk to
    follow
  • eVLBI
  • Edge-to-Campus DMZ (On-Demand)
  • Coupled with Regularly Scheduled Middle-to-Middle
  • End User determines who to contact about
    performance problem, armed with proof

33
Example Application Community VLBI (1)
  • Very-Long-Baseline Interferometry (VLBI) is a
    high-resolution imaging technique used in radio
    astronomy.
  • VLBI techniques involve using multiple radio
    telescopes simultaneously in an array to record
    data, which is then stored on magnetic tape and
    shipped to a central processing site for
    analysis.
  • Goal: Using high-bandwidth networks, electronic
    transmission of VLBI data (known as e-VLBI).

34
Example Application Community VLBI (2)
  • Haystack <-> Onsala
  • Abilene, Eurolink, GEANT, NorduNet, SUNET
  • Users: David Lapsley, Alan Whitney
  • Constraints
  • Lack of administrative access (needed for Iperf)
  • Heavily scheduled, limited windows for testing
  • Problem
  • Insufficient performance
  • Partial Path Analysis with BWCTL/Iperf
  • Isolated packet loss to local congestion in the
    Haystack area
  • Upgraded the bottleneck link

35
Example Application Community VLBI (3)
  • Result
  • First demonstration of real-time, simultaneous
    correlation of data from two antennas (32 Mbps,
    work continues)
  • Future
  • Optimize time-of-day for non-real-time data
    transfers
  • Deploy BWCTL at 3 more sites beyond Haystack,
    Onsala, and Kashima

36
TSEV8 Experiment
  • Intensive experiment
  • Data
  • 18 scans, 13.9 GB of data
  • Antennas
  • Westford, MA and Kashima, Japan
  • Network
  • Haystack, MA to Kashima, Japan
  • Initially, 100 Mbps commodity Internet at each
    end; the Kashima link was upgraded to 1 Gbps just
    prior to the experiment
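Back-of-the-envelope arithmetic shows why the link speeds matter for the 13.9 GB dataset. This is an idealized estimate (raw link rate, no protocol overhead, decimal gigabytes assumed):

```python
def transfer_seconds(gigabytes, link_mbps):
    """Idealized transfer time: dataset size over raw link rate."""
    return gigabytes * 8e9 / (link_mbps * 1e6)

# The 13.9 GB of TSEV8 scan data at the two link speeds:
t_slow = transfer_seconds(13.9, 100)    # ~1112 s, roughly 18.5 minutes
t_fast = transfer_seconds(13.9, 1000)   # ~111 s, under 2 minutes
```

Real TCP transfers over a long path would be slower than the ideal, which is why the throughput troubleshooting in the next slides mattered.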

37
TSEV8 e-VLBI Network
38
Network Issues
  • In the week leading up to the experiment, the
    network showed extremely poor throughput: 1 Mbps!
  • Network analysis/troubleshooting required
  • Traditionally: pair-wise iperf testing between
    hosts along the transfer path, plus step-by-step
    tracing of link utilization via Internet2 and
    TransPAC/APAN network monitoring websites
  • Time-consuming, error-prone, not conclusive
  • New approach: automated iperf testing using
    Internet2's bwctl tool (allows partial path
    analysis), with link utilization statistics
    integrated into one single website
  • No maintenance required once set up; for the first
    time, an overall view of the network and bandwidth
    on a segment-by-segment basis
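The partial path idea above is to test each segment pair-wise and flag the weakest hop. A minimal sketch of that logic; the hop names and throughput numbers below are made up for illustration, not real measurements:

```python
def locate_bottleneck(segment_mbps):
    """Return the (segment, throughput) pair with the lowest rate.

    segment_mbps: dict mapping (src, dst) -> Mb/s from pair-wise
    BWCTL/Iperf runs along the transfer path.
    """
    return min(segment_mbps.items(), key=lambda kv: kv[1])

# Illustrative per-segment results along a Haystack-to-Kashima path:
path = {
    ("haystack", "abilene-east"): 940.0,
    ("abilene-east", "transpac"): 1.2,   # suspiciously low hop
    ("transpac", "kashima"): 920.0,
}
segment, mbps = locate_bottleneck(path)
```

Automating this over every segment is what turned the old step-by-step tracing into a single at-a-glance view.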

39
E-VLBI Network Monitoring
http://web.haystack.mit.edu/staff/dlapsley/tsev7.html
40
E-VLBI Network Monitoring
http://web.haystack.mit.edu/staff/dlapsley/tsev7.html
41
E-VLBI Network Monitoring
  • Use of centralized/integrated network monitoring
    helped to enable identification of bottleneck
    (hardware fault)
  • Automated monitoring allows view of network
    throughput variation over time
  • Highlights route changes, network outages
  • Automated monitoring also helps to highlight any
    throughput issues at end points
  • E.g. Network Interface Card failures, untuned TCP
    stacks
  • Integrated monitoring provides overall view of
    network behavior at a glance

42
Result
  • Successful UT1 experiment completed June 30, 2004.
  • New record time for transfer and calculation of
    UT1 offset
  • 4.5 hours (down from 21 hours)

43
Acknowledgements
  • Yasuhiro Koyama, Masaki Hirabaru and colleagues
    at National Institute for Information and
    Communications Technology
  • Brian Corey, Mike Poirier and colleagues from MIT
    Haystack Observatory
  • Internet2, TransPAC/APAN, JGN2 networks
  • Staff at APAN Tokyo XP
  • Tom Lehman - University of Southern California -
    Information Sciences Institute East

44
Example piPEs Use Cases
  • Edge-to-Middle (On-Demand)
  • Automatic 2-Ended Test Set-up
  • Middle-to-Middle (Regularly Scheduled)
  • Raw Data feeds for 3rd-Party Analysis Tools
  • http://vinci.cacr.caltech.edu:8080/
  • Quality Control of Network Infrastructure
  • Edge-to-Edge (Regularly Scheduled)
  • Quality Control of Application Communities
  • Edge-to-Campus DMZ (On-Demand)
  • Coupled with Regularly Scheduled Middle-to-Middle
  • End User determines who to contact about
    performance problem, armed with proof

45
How Can You Participate?
  • Set up BWCTL, OWAMP, NDT Beacons
  • Set up a measurement domain
  • Place tool beacons intelligently
  • Determine locations
  • Determine policy
  • Determine limits
  • Register beacons
  • Install piPEs software
  • Run regularly scheduled tests
  • Store performance data
  • Make performance data available via web service
  • Make visualization CGIs available
  • Solve Problems / Alert us to Case Studies

46
(No Transcript)
47
Extra Slides
48
American / European Collaboration Goals
  • Awareness of ongoing Measurement Framework
    Efforts / Sharing of Ideas (Good / Not
    Sufficient)
  • Interoperable Measurement Frameworks (Minimum)
  • Common means of data extraction
  • Partial path analysis possible along
    transatlantic paths
  • Open Source Shared Development (Possibility, In
    Whole or In Part)
  • End-to-end partial path analysis for
    transatlantic research communities
  • VLBI: Haystack, Mass. <-> Onsala, Sweden
  • HENP: Caltech, Calif. <-> CERN, Switzerland

49
American / European Collaboration Achievements
  • UCL E2E Monitoring Workshop 2003
  • http://people.internet2.edu/eboyd/ucl_workshop.html
  • Transatlantic Performance Monitoring Workshop
    2004
  • http://people.internet2.edu/eboyd/transatlantic_workshop.html
  • Caltech <-> CERN Demo
  • Haystack, USA <-> Onsala, Sweden
  • piPEs Software Evaluation (In Progress)
  • Architecture Reconciliation (In Progress)

50
Example Application Community ESnet / Abilene (1)
  • 3x3 Group
  • US Govt. Labs: LBL, FNAL, BNL
  • Universities: NC State, OSU, SDSC
  • http://measurement.es.net/
  • Observed
  • 400 usec 1-way Latency Jump
  • Noticed by Joe Metzger
  • Detected
  • Circuit connecting the router in the CentaurLab to
    the NCNI edge router moved to a different path on a
    metro DWDM system
  • 60 km optical distance increase
  • Confirmed by John Moore
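The 60 km figure is consistent with the observed latency jump: light in silica fiber travels at roughly c/1.47, about 2.04e8 m/s, i.e. near 4.9 µs of one-way delay per kilometre. That refractive-index figure is a typical value assumed here, not from the slide:

```python
C_FIBER_M_PER_S = 2.04e8  # approximate speed of light in silica fiber

def fiber_delay_us(km):
    """One-way propagation delay in microseconds over km of fiber."""
    return km * 1000.0 / C_FIBER_M_PER_S * 1e6

extra = fiber_delay_us(60)  # ~294 us for the 60 km detour
```

That puts the detour at roughly 300 µs, the same order as the observed 400 µs jump; the remainder could plausibly come from added equipment along the new path.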

51
Example Application Community ESnet / Abilene (2)
52
American/European Demonstration Goals
  • Demonstrate ability to do partial path analysis
    between Caltech (Los Angeles Abilene router)
    and CERN.
  • Demonstrate ability to do partial path analysis
    involving nodes in the GEANT network.
  • Compare and contrast measurement of a lightpath
    versus a normal IP path.
  • Demonstrate interoperability of piPEs and
    analysis tools such as Advisor and MonALISA

53
Demonstration Details
  • Path 1: Default route between LA and CERN is
    across Abilene to Chicago, then across the Datatag
    circuit to CERN
  • Path 2: Announced addresses so that the route
    between LA and CERN traverses GEANT via the London
    node
  • Path 3: Lightpath (discussed earlier by Rick
    Summerhill)
  • Each measurement node consists of a BWCTL box
    and an OWAMP box next to the router.

54
All Roads Lead to Geneva
55
Results
  • BWCTL: http://abilene.internet2.edu/ami/bwctl_status_eu.cgi/BW/14123130651515289600_14124243902743445504
  • OWAMP: http://abilene.internet2.edu/ami/owamp_status_eu.cgi/14123130651515289600_14124243902743445504
  • MonALISA
  • NLANR Advisor

56
Insights (1)
  • Even with shared source and a single team of
    developer-installers, inter-administrative-domain
    coordination is difficult.
  • Struggled with the basics of multiple paths
  • IP addresses, host configuration, software
    (support for source addresses, etc.)
  • Struggled with cross-domain administrative
    coordination issues
  • AA (accounts), routes, port filters, MTUs, etc.
  • Struggled with performance tuning of measurement
    nodes
  • Host tuning, asymmetric routing, MTUs

57
Insights (2)
  • Connectivity takes a large amount of coordination
    and effort; performance takes even more of the
    same.
  • Current measurement approaches have limited
    visibility into lightpaths.
  • Having hosts participate in the measurement is
    one possible solution.

58
Insights (3)
  • Consider interaction with security: lack of
    end-to-end transparency is problematic.
  • Security filters are set up based on expected
    traffic patterns
  • Measurement nodes create new traffic
  • Lightpaths bypass expected ingress points