Title: Fault and Performance Management
1Fault and Performance Management for Next
Generation IP Communication Alan Clark, Telchemy
2Outline
- Problems affecting VoIP performance
- Tools for Measuring and Diagnosing Problems
- Protocols for Reporting QoS
- Performance Management Architecture
- What to ask for/ integrate?
3Enterprise VoIP Deployment
IP Phone
IP Phones
IP VPN
Branch Office
Teleworker
Gateway
IP Phone
4VoIP Deployment - Issues
ROUTE FLAPPING, LINK FAIL
IP Phone
IP Phones
IP VPN
CODEC DISTORTION
ECHO
Gateway
LAN CONGESTION, DUPLEX MISMATCH, LONG CABLES.
ACCESS LINK CONGESTION
IP Phone
5Call Quality Problems
- Packet Loss
- Jitter (Packet Delay Variation)
- Codecs and PLC
- Delay (Latency)
- Echo
- Signal Level
- Noise Level
6Packet Loss and Jitter
Jitter Buffer
Codec
IP Network
Distorted Speech
Packets lost in network
Packets discarded due to jitter
7Routers, Loss and Jitter
Queuing delay
Serialization delay
Processing delay
Queuing delay
Output queue
Input queue
Arriving packets
Prioritize/ Route
Voice packet delayed by one or more data packets
Packet loss due to buffer Overflow or RED
8Queuing Delays
Added delay due to waiting for data packets to be sent
Jitter
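The wait behind a data packet can be estimated from the packet size and the link rate. The sketch below is illustrative only; the 1500-byte packet and the 512 kbit/s link are hypothetical values, not figures from the slide.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time needed to clock one packet onto the link, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000.0


# Hypothetical case: one 1500-byte data packet is queued ahead of a voice
# packet on a 512 kbit/s access link.
wait_ms = serialization_delay_ms(1500, 512_000)
print(f"voice packet waits about {wait_ms:.1f} ms")
```

Each additional data packet ahead in the queue adds the same amount again, which is why slow access links are a common source of jitter.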
9Jitter
Average jitter level (PPDV): 4.5 ms
Peak jitter level: 60 ms
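Average and peak packet-to-packet delay variation (PPDV) figures like these can be derived from send and receive timestamps. A minimal sketch, assuming both timestamp lists are in milliseconds and in sequence order; the sample values are made up for illustration.

```python
def ppdv_ms(send_ms, recv_ms):
    """Absolute change in transit time between each pair of consecutive packets."""
    variations = []
    for i in range(1, len(send_ms)):
        transit_prev = recv_ms[i - 1] - send_ms[i - 1]
        transit_cur = recv_ms[i] - send_ms[i]
        variations.append(abs(transit_cur - transit_prev))
    return variations


# Hypothetical timestamps for five packets sent every 20 ms.
send = [0, 20, 40, 60, 80]
recv = [50, 70, 95, 112, 130]

values = ppdv_ms(send, recv)
average = sum(values) / len(values)
peak = max(values)
print(average, peak)
```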
10WiFi can also cause jitter
11Effects of Jitter
- Low levels of jitter are absorbed by the jitter buffer
- High levels of jitter
- lead to packets being discarded
- cause an adaptive jitter buffer to grow, increasing delay but reducing discards
- Packets that the jitter buffer drops because they arrive too late are regarded as discarded
- Packets that arrive extremely late are regarded as lost, hence "lost" packets sometimes did actually arrive
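The distinction between played, discarded and lost packets can be sketched with a fixed jitter buffer. This is a simplified illustration, not an adaptive buffer; the function name, the 40 ms buffer depth and the 200 ms "extremely late" threshold are assumptions chosen for the example.

```python
def classify_packets(arrivals_ms, frame_ms=20, buffer_ms=40, lost_after_ms=200):
    """Classify each packet as 'played', 'discarded' or 'lost'.

    arrivals_ms[i] is the arrival time of packet i, or None if it never
    arrived. Packet 0 is assumed to arrive and sets the playout clock.
    """
    base = arrivals_ms[0]
    result = []
    for seq, arrival in enumerate(arrivals_ms):
        deadline = base + buffer_ms + seq * frame_ms  # scheduled playout time
        if arrival is None:
            result.append("lost")
        elif arrival - deadline > lost_after_ms:
            result.append("lost")        # arrived, but far too late to matter
        elif arrival > deadline:
            result.append("discarded")   # arrived after its playout time
        else:
            result.append("played")
    return result


# Hypothetical arrival times: packet 3 is late, 4 never arrives, 5 is extremely late.
states = classify_packets([0, 22, 45, 130, None, 400, 121])
print(states)
```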
12Packet Loss
Average packet loss rate: 2.1%
Peak packet loss: 30%
13Packet Loss is bursty
- Packet loss (and packet discard) tends to occur in sparse bursts, say 20-30% in density and one second or so in length
- Terminology
- Consecutive burst
- Sparse burst
- Burst of Loss vs Loss/Discard
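Burst density can be measured by grouping loss/discard events that are separated by only a few good packets. The sketch below is a simplified version of the burst definition used in RFC 3611 (the default of 16 follows the Gmin value recommended there); the function name and the sample pattern are illustrative assumptions.

```python
def burst_stats(received, gmin=16):
    """Return (packets_in_bursts, losses_in_bursts).

    received is a list of booleans: True for a packet that was played out,
    False for one that was lost or discarded. Losses separated by fewer
    than gmin good packets belong to the same burst.
    """
    loss_idx = [i for i, ok in enumerate(received) if not ok]
    spans = []
    start = prev = None
    for i in loss_idx:
        if start is None:
            start = prev = i
        elif i - prev - 1 < gmin:
            prev = i                      # still inside the same burst
        else:
            spans.append((start, prev))   # close the burst, open a new one
            start = prev = i
    if start is not None:
        spans.append((start, prev))
    # A span holding a single loss is an isolated loss, not a burst.
    bursts = [(s, e) for s, e in spans if e > s]
    packets = sum(e - s + 1 for s, e in bursts)
    losses = sum(1 for s, e in bursts for i in range(s, e + 1) if not received[i])
    return packets, losses


# Hypothetical 100-packet call: a sparse burst at packets 30-39 and one isolated loss.
pattern = [True] * 100
for lost in (30, 34, 39, 70):
    pattern[lost] = False

packets, losses = burst_stats(pattern)
print(f"burst density {100 * losses / packets:.0f}%")
```

In this pattern the burst spans 10 packets with 3 losses, a sparse burst rather than a consecutive one.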
14Example Packet Loss Distribution
Consecutive loss
20 percent burst density (sparse burst)
15Loss and Discard
- Loss is often associated with periods of high congestion
- Jitter is (usually) due to congestion and leads to packet discard
- Hence Loss and Discard often coincide
- Other factors can apply, e.g. duplex mismatch, link failures, etc.
16Example Loss/Discard Distribution
17Leads To Time Varying Call Quality
High jitter/ loss/ discard
18Packet Loss Concealment
Estimated by PLC
- Mitigates impact of packet loss/discard by replacing lost speech segments
- Very effective for isolated lost packets, less effective for bursty loss/discard
- But isn't loss/discard bursty?
- Need to be able to deal with 10-20-30% loss!
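Why concealment works for isolated losses but degrades in bursts can be seen with a very simple strategy: repeat the last good frame with a fading gain. This is only an illustration, not the PLC algorithm of any particular codec; the function name and the 0.5 attenuation factor are assumptions.

```python
def conceal(frames, frame_len, attenuation=0.5):
    """Replace missing frames (None) with an attenuated copy of the last good frame."""
    output = []
    last_good = None
    gain = 1.0
    for frame in frames:
        if frame is not None:
            output.append(list(frame))
            last_good = frame
            gain = 1.0
        elif last_good is None:
            output.append([0.0] * frame_len)   # nothing to repeat yet: silence
        else:
            gain *= attenuation                # fade further with each consecutive loss
            output.append([s * gain for s in last_good])
    return output


# Hypothetical two-sample frames with two consecutive losses in the middle.
repaired = conceal([[1.0, -1.0], None, None, [0.5, 0.5]], frame_len=2)
print(repaired)
```

A single missing frame is replaced by a plausible neighbour, while a long burst fades toward silence, which is audible.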
19Effectiveness of PLC
Codec distortion
Impact of loss/ discard and PLC
20Call Quality Problems
- Packet Loss
- Jitter (Packet Delay Variation)
- Codecs and PLC
- Delay (Latency)
- Echo
- Signal Level
- Noise Level
21Effect of Delay on Conversational Quality
22Causes of Delay
External delay
Accumulate and encode
RTP
IP UDP TCP
CODEC
Echo Control
Network delay
Jitter buffer, decode and playout
RTP
IP UDP TCP
CODEC
Echo Control
23Cause of Echo
Gateway
IP
Echo Canceller
Acoustic Echo
Line Echo
Round trip delay: typically 50 ms
Additional delay introduced by VoIP makes existing echo problems more obvious
Also: convergence echo
24Echo problems
- Echo with very low delay sounds like sidetone
- Echo with some delay makes the line sound hollow
- Echo with over 50 ms delay sounds like... echo
- Echo Return Loss
- 55 dB or above is good
- 25 dB or below is bad
25Call Quality Problems
- Packet Loss
- Jitter (Packet Delay Variation)
- Codecs and PLC
- Delay (Latency)
- Echo
- Signal Level
- Noise Level
26Signal Level Problems
Amplitude clipping occurs -- speech sounds loud and buzzy
0 dBm0
-36 dBm0
Temporal clipping occurs with VAD or echo suppressors -- gaps in speech, start/end of words missing
27Noise
- Noise can be due to
- Low signal level
- Equipment/ encoding (e.g. quantization noise)
- External local loops
- Environmental (room) noise
- From a service provider perspective, how to distinguish between
- Room noise (not my problem)
- Network/equipment/circuit noise (is my problem)
28Measuring VoIP performance
Analog signal based
VoIP Specific
VQmon ITU G.107
ITU P.862 (PESQ)
Active Test - Measure test calls
VQmon ITU P.VTQ
ITU P.563
Passive Test - Measure live calls
29Gold Standard - ACR Test
- Speech material
- Phonetically balanced speech samples 8-10 seconds in length
- Test designed to eliminate bias (e.g. presentation order different for each listener)
- Known files included as anchors (e.g. MNRU)
- Listening conditions
- Panel of listeners
- Controlled conditions (quiet environment with known level of background noise)
30Example ACR test results
- Extract from an ITU subjective test
- Mean Opinion Score (MOS) was 2.4
- 1 = Unacceptable
- 2 = Poor
- 3 = Fair
- 4 = Good
- 5 = Excellent
31Packet based approaches
Test Call
VoIP Test System
VoIP Test System
IP
Measure call
Live Call
VQmon, G.107, P.VTQ
VoIP End System
VoIP End System
IP
Passive Test
Passive Test
32Packet based approaches
- ITU G.107: R = Ro - Is - Ie - Id + A
- Really a network planning tool
- Missing many essential monitoring features
- VQmon
- ITU G.107 + ETSI TS 101 329-5 Annex E
- Proprietary but widely used (superset of G.107 and P.VTQ)
- ITU P.VTQ
- Available late 2005, very limited functionality
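The G.107 R factor is a simple sum of impairment terms, and G.107 also defines a mapping from R to an estimated MOS. A minimal sketch of both follows; the function names and the input values in the example are illustrative assumptions, not measurements from this presentation.

```python
def r_factor(ro, i_s, i_d, i_e, a=0.0):
    """E-model rating: base signal-to-noise term minus impairments plus advantage factor."""
    return ro - i_s - i_d - i_e + a


def r_to_mos(r):
    """Map an R factor to an estimated MOS using the G.107 conversion curve."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


# Hypothetical impairment values for one call.
r = r_factor(ro=93.0, i_s=1.0, i_d=5.0, i_e=15.0)
print(f"R = {r:.1f}, MOS = {r_to_mos(r):.2f}")
```

Because the mapping is nonlinear, the same drop in R costs more MOS in the middle of the scale than near the top.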
33Extended E Model - VQmon
4-state Markov model: gathers detailed packet loss info in real time
Arriving packets
Loss/ Discard events
Discarded
Jitter buffer
Signal level Noise level Echo level
CODEC
Call Quality Scores Diagnostic Data
Metrics Calculation
34Modeling transient effects
Ie(burst)
Measured Call quality
User Reported Call quality
Ie(VQmon)
Ie(gap)
Time axis: 10 to 35 seconds
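The idea that perceived quality lags behind the instantaneous impairment can be sketched as an exponential tracker that moves between the gap level and the burst level. This is a hypothetical illustration of the concept, not the VQmon algorithm; the function name and the 5 s and 15 s time constants are assumptions chosen for the example.

```python
import math


def perceived_ie(instant_ie, dt_s=1.0, tau_rise_s=5.0, tau_fall_s=15.0):
    """Track the instantaneous Ie with an exponential lag.

    Quality is assumed to degrade faster (tau_rise_s) than it recovers
    (tau_fall_s). instant_ie holds one Ie value per dt_s seconds.
    """
    level = instant_ie[0]
    trace = []
    for target in instant_ie:
        tau = tau_rise_s if target > level else tau_fall_s
        level += (target - level) * (1 - math.exp(-dt_s / tau))
        trace.append(level)
    return trace


# Hypothetical call: 10 s of gap (Ie 5), a 5 s burst (Ie 40), then 20 s of gap.
ie_gap, ie_burst = 5.0, 40.0
trace = perceived_ie([ie_gap] * 10 + [ie_burst] * 5 + [ie_gap] * 20)
print(f"peak {max(trace):.1f}, end of call {trace[-1]:.1f}")
```

The tracked value never reaches the burst level during a short burst and has not fully recovered by the end of the call, which is the kind of time-varying behaviour the slide contrasts with Ie(burst) and Ie(gap).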
35VQmon - computational model
Burst loss rate
Perceptual model
Calculate R-LQ MOS-LQ
Ie mapping
Gap loss rate
ETSI TS101 329-5
Recency model
Calculate Ro, Is
Signal level Noise level
ITU-T G.107
Calculate R-CQ MOS-CQ
Calculate Id
Echo Delay
36Accuracy Non-bursty conditions
37Accuracy Bursty conditions
- G.107
- Well established model for network planning
- No way to represent jitter
- Few codec models
- Inaccurate for bursty loss
- Conversational Quality only
- VQmon
- Extended G.107
- Transient impairment model
- Wide range of codec models
- Narrowband and wideband
- Jitter Buffer Emulator
- Listening and Conversational Quality
VQmon
E Model
Comparison of VQmon and E Model for severely time-varying conditions
38Signal based approaches
P.862 Tester
Test Call
VoIP End System
VoIP End System
IP
P.862 is an Active Test Approach
VoIP End System
VoIP End System
IP
P.563 Tester
P.563 is a Passive Test Approach
39ITU P.862 - Active testing
Tested segment of connection
IP
PESQ
Time align
Audio files
FFT
Compare
PESQ Score
FFT
40ITU P.862 - Active testing
- Send speech file
- Compare received file with original using FFT
- Takes typically 50-100 MIPS per call
- MOS-like score in the range -0.5 to 4.5
- Widely used within the industry
Results for the G.729A codec for a set of speech files (i.e. for each packet loss rate, the only thing changed is the speech source file)
41ITU P.563 - Passive monitoring
- Analyses received speech file (single ended)
- Produces a MOS score
- Correlates well with MOS when averaged over many calls
- Requires 100 MIPS per call
Comparison of P.563 estimated MOS scores with actual ACR test scores. Each point is the average per-file ACR MOS with 16 listeners compared to the P.563 score
42Performance Monitoring - Passive Test
Embedded Monitoring Function
RTCP XR
SIP QoS Report
43SLA Monitoring - Active Test
Test call
Active Test Functions
44Active or Passive Testing?
- Active testing
- Works for pre-deployment testing and on-demand troubleshooting
- But IP problems are transient
- Passive monitoring
- Monitors every call made, but needs a call to monitor
- Captures information on transient problems
- Provides data for post-analysis
- Therefore you need both
45VoIP Performance Management Framework
Network Management System
Call Server and CDR database
Signaling Based QoS Reporting
SNMP Reporting
Network Probe, Analyzer or Router
VQ
VoIP Gateway
VQ
VoIP Endpoint
VQ
RTP stream (possibly encrypted)
Embedded Monitoring
Embedded Monitoring
Media Path Reporting (RTCP XR)
46VoIP Performance Management Framework
- Embedded monitoring function in IP phones, residential gateways
- Close to the user
- Least cost, widest coverage
- Protocol support developed
- RTCP XR (RFC 3611), SIP, MGCP, H.323, Megaco
- Draft SNMP MIB
- Works in encrypted environments
- Already being deployed by equipment vendors
47The role of RTCP XR
RTCP XR (RFC3611)
- Provides a useful set of metrics for VoIP performance monitoring and diagnosis
- Supports both real time monitoring and post-analysis
- Extracts signal level, noise level and echo level from DSP software in the endpoint
- Exchanges info on endpoint delay and echo to allow remote endpoint to assess echo impact
- Provides midstream probes/analyzers access to analog metrics if secure RTP is used
- Goes through firewalls
48RFC3611 - RTCP XR
49SIP Service Quality Reporting Event
PUBLISH sip:collector@example.com SIP/2.0
Via: SIP/2.0/UDP pc22.example.com;branch=z9hG4bK3343d7
Content-Type: application/rtcpxr
Content-Length: ...

VQSessionReport
LocalMetrics:
TimeStamps=START:10012004.18.23.43 STOP:10012004.18.26.02
SessionDesc=PT:0 PD:G.711 SR:8000 FD:20 FPP:2 PLC:3 SSUP:on
CallID=1890463548@alice.uac.chicago.com
Signal=SL:2 NL:10 RERL:14
QualityEst=RLQ:90 RCQ:85 EXTR:90 MOSLQ:3.4 MOSCQ:3.3 QoEEstAlg:VQMonv2.1
DialogID:38419823470834;to-tag=8472761;from-tag=9123dh311
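A collector receiving such a report needs to split each metric group into named values. The sketch below assumes a `Name=KEY:value KEY:value` layout per line; the exact separators are an assumption here, and the function name is hypothetical. The values are the ones shown in the report above.

```python
def parse_metric_line(line):
    """Split one 'Name=KEY:value KEY:value ...' line into (name, dict of fields)."""
    name, _, rest = line.partition("=")
    fields = {}
    for token in rest.split():
        key, _, value = token.partition(":")
        fields[key] = value
    return name, fields


name, quality = parse_metric_line("QualityEst=RLQ:90 RCQ:85 EXTR:90 MOSLQ:3.4 MOSCQ:3.3")
print(name, float(quality["MOSLQ"]), int(quality["RLQ"]))
```

Values are kept as strings so the caller can decide which fields are numeric.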
50RTCP XR MIB
History table
Session table
Basic parameters
Alerting
Call quality metrics
51Passive Monitoring Framework
[Diagram: enterprise deployment with IP Phones, IP VPN, Branch Office, Teleworker and Gateway, with VQ monitoring points throughout, reporting to the NMS via RTCP XR, SNMP and SIP QoS Report]
52What to Implement/ Ask For
- Embedded monitoring functionality in IP Phones and Gateways (e.g. VQmon)
- RTCP XR for mid-call data exchange between endpoints
- SIP Service Quality Events for reporting end of call quality
- RTCP XR MIB for SNMP support
53Summary
- Problems affecting VoIP performance
- Tools for Measuring and Diagnosing Problems
- Protocols for Reporting QoS
- Performance Management Architecture
- What to ask for/ integrate?