Active Measurements on the AT&T IP Backbone

About This Presentation
Title:

Active Measurements on the AT&T IP Backbone

Description:

... network behavior (e.g. AT&T's Network Status Site http://www.att.com/ipnetwork) ... Presented at the IETF 50 IPPM meeting by Al Morton ... – PowerPoint PPT presentation

Provided by: gomathiram

Transcript and Presenter's Notes

Title: Active Measurements on the AT&T IP Backbone


1
Active Measurements on the AT&T IP Backbone
  • Len Ciavattone, Al Morton, Gomathi Ramachandran
  • AT&T Labs

2
Colleagues on This Project
  • Nicole Kowalski
  • Ron Kulper
  • George Holubec
  • Shashi Pulakurti

3
Measurements for Large Networks
  • Must be
    • Easily understood
    • Able to estimate or assess customer performance
    • Useful for alarming and associated actions
    • Not likely to generate false positives
    • As close as possible to real-time notification
    • Part of the traditional fault/passive management system

4
Traditional Measurements
  • Fault
    • Triggered by hard failures (link, card, router, etc.)
    • Near real-time alarms
  • Passive
    • Element-level monitoring
    • Traffic, drops, device health, card performance monitored
    • Performance alarming possible per interface
  • What can be added to traditional measurements?
    • Path-level performance information
    • Delay and delay variation measurements
    • Indication of customer degradation (except hard failures)

5
Active Measurements
  • Active measurements introduce synthetic traffic into the network
  • Advantages
    • Traffic flow follows a sampled customer path
    • Delay, delay variation and sampled loss are directly measurable
    • Possible to estimate the customer impact of element-level degradation
    • A well-designed sampling methodology allows sound estimation of the levels of degradation seen
    • Can be used to give customers a sense of network behavior (e.g. AT&T's Network Status Site, http://www.att.com/ipnetwork)
  • Disadvantages
    • Need to introduce traffic into the network
    • Based on sampling, not customer traffic

6
Practical Considerations
  • From a practical standpoint, what limits the measurements?
    • Amount of data generated
    • Desire to use a standard/unmodified UNIX kernel
    • Expense of bigger and more powerful servers
    • Cost of deploying new servers in COs
    • Difficulty of acquiring an appropriate GPS feed

7
Measurement Design
[Figure: timeline of 15-minute test cycles repeating over 24 hours]
  • Poisson Sequence
    • 15 minute duration
    • λ = 0.3 pkts/sec
    • Type UDP
    • 278 bytes total
    • Packet loss threshold is a minimum of 3 s
  • Periodic Sequence
    • 1 minute duration
    • Random start time
    • 20 ms spacing
    • Type UDP, IPv4
    • 60 bytes total
    • Packet loss threshold is a minimum of 3 s
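
The two probe streams above can be illustrated with a minimal sketch of how their send times might be generated. This is not the authors' tool: the function names are hypothetical, and the choice to confine the random start so that the whole 1-minute test fits inside the 15-minute cycle is an assumption.

```python
import random

def poisson_schedule(rate_per_s, duration_s, rng):
    """Send times (seconds) whose gaps are exponentially distributed."""
    times = []
    t = rng.expovariate(rate_per_s)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_per_s)
    return times

def periodic_schedule(spacing_s, duration_s, cycle_s, rng):
    """Evenly spaced send times beginning at a random offset in the cycle.

    Assumption: the start is drawn so the whole test fits in the cycle.
    """
    start = rng.uniform(0.0, cycle_s - duration_s)
    count = round(duration_s / spacing_s)
    return [start + i * spacing_s for i in range(count)]

rng = random.Random(1)  # fixed seed only to make the sketch repeatable
poisson = poisson_schedule(0.3, 15 * 60, rng)        # about 0.3 pkts/sec for 15 min
periodic = periodic_schedule(0.020, 60, 15 * 60, rng)  # 20 ms spacing for 1 min
print(len(poisson), len(periodic))
```

At 0.3 pkts/sec the Poisson stream sends roughly 270 packets per 15-minute cycle, while the periodic stream sends 3000 packets in its 1-minute burst.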

Presented at the IETF 50 IPPM meeting by Al Morton
8
Sampling and Event Detection
  • Poisson Sequence
    • All 15 minutes tested, with an average inter-arrival time of 3.33 s
    • Assume 10 s congestion events (minimum length)
    • If λ = 0.3 pkts/sec and the event lasts T = 10 s, the probability of detection by one or more packets is 1 - exp(-λT) = 1 - exp(-3) ≈ 0.95
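
As a worked check of the Poisson case, the sketch below computes the chance that at least one probe falls inside a 10 s congestion event when probes arrive as a Poisson process with a 3.33 s average inter-arrival time (0.3 pkts/sec). It uses the standard Poisson result P = 1 - exp(-λT); the function name is hypothetical.

```python
import math

def poisson_detection_probability(rate_per_s, event_s):
    """Probability that one or more Poisson arrivals land within the event."""
    return 1.0 - math.exp(-rate_per_s * event_s)

p = poisson_detection_probability(0.3, 10.0)
print(f"{p:.3f}")  # 0.950
```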

9
Sampling and Event Detection
  • Periodic sequence
    • 1-min test in a 15-min test cycle (2 if considering RT processes)
    • Assume 10 s congestion events (minimum length), assume 1 event per test cycle
    • Consider that only recurring events are actionable
    • Average number of cycles to detection (one-way) = 1/0.0777 ≈ 13 test cycles
  • The Poisson Probe sequence detects accurately; the Periodic Probe sequence is used to characterize recurring events
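
The 0.0777 figure for the periodic sequence can be reproduced under one plausible reading, sketched below: the 10 s event starts at a uniformly random point in the 15-minute cycle, and any overlap with the 1-minute test counts as a detection. That reading is an assumption, not something the slide states, and the function name is hypothetical.

```python
def periodic_overlap_probability(test_s, event_s, cycle_s):
    """Chance that an event starting uniformly in the cycle overlaps the test."""
    return (test_s + event_s) / cycle_s

p = periodic_overlap_probability(60, 10, 15 * 60)
cycles_to_detection = 1 / p  # mean of a geometric distribution
print(f"{p:.4f} {cycles_to_detection:.1f}")  # 0.0778 12.9
```

Rounding 12.9 up gives the 13 test cycles quoted above, which is why the periodic sequence suits recurring events rather than one-off detection.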

10
Metrics
  • Round Trip (RT) Loss
  • RT Delay (std dev, 95th percentile, min, mean)
  • Inter-Packet Delay Variation (IPDV) and DV jitter
  • Out of sequence events (non-reversing sequence
    definition -- up for consideration in the IETF
    IPPM)
  • Approximate one-way loss
  • Degraded seconds or minutes
  • Loss pattern (number of consecutive losses)
  • Distributions of delay variations
  • Traceroutes performed at the beginning of each
    test
  • 85 Metrics kept indefinitely
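
Two of the metrics listed above, the RT delay summary and the loss pattern, can be sketched as follows. This is an illustration only: the function names are hypothetical, the 95th percentile uses the nearest-rank method, and the standard deviation is the population form, both of which are assumptions about choices the slides do not specify.

```python
import statistics

def delay_summary(delays_ms):
    """Min, mean, standard deviation and 95th percentile of delay samples."""
    ordered = sorted(delays_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), nearest-rank
    return {
        "min": ordered[0],
        "mean": statistics.fmean(ordered),
        "std_dev": statistics.pstdev(ordered),
        "p95": ordered[rank - 1],
    }

def loss_runs(received_flags):
    """Lengths of each run of consecutive losses (False = packet lost)."""
    runs, current = [], 0
    for received in received_flags:
        if received:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs

summary = delay_summary([10, 10, 10, 15, 10, 10, 10, 10, 10, 10])
runs = loss_runs([True, False, False, True, False, True])
print(summary, runs)
```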

11
IPDV Definition and Example
IPDV is a measure of transfer delay variation.
For packet n, IPDV(n) = Delay(n) - Delay(n-1).
If the nominal transfer time is 10 msec, and packet 2 is delayed in transit for an additional 5 msec, then two IPDV values will be affected:
IPDV(2) = 15 - 10 = 5 msec
IPDV(3) = 10 - 15 = -5 msec
IPDV(4) = 10 - 10 = 0 msec
[Figure: timing diagram with Tx, Rcv, and Playout timelines for packets 1-4, marking Δt (inter-packet arrival time, longer than the send interval) and the time spent in transit and in the receive buffer]
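
The IPDV example on this slide can be reproduced with a short sketch that applies IPDV(n) = Delay(n) - Delay(n-1) to the four delays in the example. The function name is hypothetical.

```python
def ipdv(delays_ms):
    """Difference between each packet's delay and the previous packet's delay."""
    return [later - earlier for earlier, later in zip(delays_ms, delays_ms[1:])]

delays = [10, 15, 10, 10]  # packets 1..4; packet 2 is held up for 5 msec
print(ipdv(delays))  # [5, -5, 0] -> IPDV(2), IPDV(3), IPDV(4)
```

A single delayed packet therefore produces a positive value followed by an equal negative value, which is why two IPDV values are affected.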
12
IP Packet Sequence
Arriving packets are compared with the next expected RefNum. Packet 2 arrives Out-of-Sequence, since Packet 3 has arrived and the next expected packet is Packet 4. Packet 2 is Offset by 1 packet, or Late by the difference between the arrival times of Packet 2 and Packet 3, Δt.
[Figure: timing diagram with Src, Dst, and Playout timelines for packets 1-4, marking Δt, the label "Tolerance on R2 arrival with 2 Packet Buffer", and the time spent in transit and in the receive buffer]
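
The comparison against the next expected RefNum can be sketched as below. This is one possible reading of the slide, not the authors' implementation: the function name, the arrival timestamps, and the choice to measure offset and lateness against the earliest earlier arrival with a higher RefNum are all assumptions.

```python
def find_out_of_sequence(arrivals, first_ref_num=1):
    """arrivals: list of (ref_num, arrival_time_s) tuples in arrival order."""
    next_expected = first_ref_num
    events = []
    for index, (ref_num, arrived_at) in enumerate(arrivals):
        if ref_num >= next_expected:
            # In sequence: the expectation only ever moves forward.
            next_expected = ref_num + 1
            continue
        # Out of sequence: find the earliest earlier arrival with a higher
        # RefNum. Duplicates have no such arrival and are silently skipped.
        for earlier_index in range(index):
            earlier_ref, earlier_time = arrivals[earlier_index]
            if earlier_ref > ref_num:
                events.append({
                    "ref_num": ref_num,
                    "offset_packets": index - earlier_index,
                    "late_by_s": arrived_at - earlier_time,
                })
                break
    return events

# Hypothetical timestamps: packet 2 arrives 10 ms after packet 3.
arrivals = [(1, 0.000), (3, 0.040), (2, 0.050), (4, 0.060)]
events = find_out_of_sequence(arrivals)
print(events)
```

Because the expectation never moves backwards, packet 4 is still counted as in sequence after the late arrival of packet 2, in line with the non-reversing sequence definition mentioned under Metrics.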
13
Common Problems Detected
  • Route Changes
  • Card degradation
  • Low-level fiber errors
  • Effects of Maintenance (Card swaps etc)

14
Examples of Detection
  • Bit errors that cause low-level (0.03) loss can
    be detected accurately using this method and can
    be fixed before customers feel the impact
  • Typically in such cases the degradation is subtle
    enough that traditional IP alarms do not show the
    problem clearly
  • Customers aren't complaining... yet
  • In the case shown, no customer complaints were
    made and the problem was fixed proactively

15
Increasing Bit Errors
[Figure: loss over time, with the following annotations]
  • More occasional loss was seen with the Poisson Probe Sequence
  • Fiber span taken out of service
  • Two packet losses per Periodic test
  • Single packet loss per Periodic test
16
Detection of Route Changes
[Figure: RT delay versus time for the Periodic Sequence]
17
Poisson Probe Route change detection
18
Periodic probe (same incident)
19
The Blenders
  • First shown by Steve Casner et al. at the NANOG 22
    conference (May 20-22, 2001, "A Fine-Grained View
    of High Performance Networking",
    http://www.nanog.org/mtg-0105/agenda.html)
  • Seem to be properties of route loops
  • Rare events, but interesting as they may shed
    light on some properties of route convergence

20
Simple Blender
  • 88 packets arrive within 64 ms
  • 79 OOS packets, 9 in sequence
  • 7 sequence discontinuities.
  • Zero Loss
  • Delay and IPDV actually describe this event best

21
Simple Blender Magnified
22
Blender 2
  • Scattered loss throughout
  • 250 packets in event
  • 10 separate sequence discontinuities
  • Delay of first packet: 6 s

23
Blender 2
24
Summary
  • Active measurements
    • Can provide a view of customer performance
    • Can be used to alert maintenance personnel proactively
    • Can provide insight into network behavior
    • Can be used to improve planned maintenance