Title: Tools and Processes for Testing VoIP
1 Tools and Processes for Testing VoIP
Chris Bajorek, Director, CT Labs
www.ct-labs.com
2 About the Speaker
Chris Bajorek, Director and Founder, CT Labs
Chris Bajorek is a 25-year
veteran of computer telephony and converged
communications. Bajorek has led the company to
its industry-leading position in testing services
which include real-world performance testing,
interoperability verification, and usability and
quality analysis. Customers include first-tier
enterprise and carrier-grade next-generation
network product manufacturers. Prior to
founding CT Labs, Bajorek founded Telephone
Response Technologies, Inc. (TRT), which
developed and sold turnkey voice response and
unified messaging products as well as
award-winning toolkits for rapid development of
voice-based applications. Prior to TRT he worked
for Integrated Office Systems and Time and Space
Processing where he performed pioneering work on
voicemail and digital voice communications
products. Bajorek holds a B.S.E.E. from Cal
Poly, San Luis Obispo.
3 For Today's Talk: Taking a Developer's Perspective to VoIP Test
- Much of CT Labs' business is with R&D and QA groups of VoIP product manufacturers
- Would like to provide a window into some of our VoIP test experiences, including:
  - Common VoIP test myths
  - Testing tips and suggestions
- Focus on voice quality testing, a hot area for VoIP
4 Myths around VoIP Deployment
- Voice quality is a given
- VoIP is easy to deploy
- VoIP is inexpensive to deploy
- All VoIP-enabled phones are created equal
- Once you have your VoIP network set up, you can
leave it alone
5 VoIP Requires a Lifecycle Approach
- Lack of a proper lifecycle will:
  - Drive costs up
  - Reduce VoIP reliability / availability
  - Risk complete failure of deployment
- New VoIP products should be designed with this in mind
6 VoIP Troubleshooting Areas: The Big Picture
- Call Processing (i.e., call connectivity, service availability)
- Voice Quality
- Interoperability / Feature Interaction
- Configuration / Registration
- Routing
- Security
- Applications (conferencing, IVR, voicemail, ...)
7 Troubleshooting Example
- Symptom: sporadic call failures
- Common causes:
  - Gateway and switch misconfiguration
  - Interoperability issues between equipment
  - Capacity limitations
  - Performance issues and delays triggering timeouts
  - Feature interaction issues such as conflicting call-forwarding settings
8 VoIP Deployment Segments
- Residential (Voice over Broadband)
- Enterprise
- Next-Gen Network Carriers and Service Providers
- All three areas are quite active now
9 VoIP Products, by Segment (products that touch the media stream)
- Residential
  - Analog terminal adapters, VoIP softphones, residential routers
- Enterprise
  - IP PBXs, IP contact centers, VoIP phones and softphones, firewalls/ALGs, media servers (conferencing, voice mail)
- Next-Gen Carriers and Service Providers
  - Session border controllers, media servers, media gateways, transcoding/VQ enhancement processors
10 VoIP Testing Areas of Focus
- Service reliability
  - i.e., availability of service, call connectivity
- Voice quality
  - Includes measurement of VQ, latency, levels, echo cancellation, etc.
- Phone features
  - CLASS features, such as call park, transfer, etc.
- VoIP access to enhanced services
  - Voice mail, conferencing, IVR, etc.
- Each of these areas has its own set of testing challenges, but one thing is clear: all relate to the end-user Quality Experience and must be validated
11 Active versus Passive VoIP Testing
- Active tests
  - Involve driving real 2-way calls through the VoIP network
  - Benefits: more accurate, use mature standards (PESQ, etc.) for automated quality assessment
  - Negatives: consume network resources
- Passive tests
  - Involve passive evaluation of call-based packet flows
  - Ignore (or model) VoIP endpoint-specific responses to network conditions
12 Post-Deployment, Passive Testing is Key
- Deployed VoIP networks should:
  - Continuously monitor passive VQ, call completion rates, network packet loss, jitter, latency
  - Set alarm thresholds for VoIP call performance that degrades below adaptive-corrective levels
- Assumption: pre-deployment tests resulted in:
  - Clean bill of network health
  - Baseline characterization of network during peak, off-peak times
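A minimal sketch of the alarm-threshold idea, assuming a passive monitor that reports per-call MOS, packet loss, jitter, and latency. The `CallMetrics` shape, the `check_alarms` helper, and every limit value are hypothetical placeholders; real limits should come from the pre-deployment baseline.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CallMetrics:
    """Per-call values as a passive monitor might report them (hypothetical shape)."""
    mos: float              # estimated listening-quality MOS
    packet_loss_pct: float  # percent of packets lost
    jitter_ms: float
    latency_ms: float       # one-way latency


# Illustrative limits only; real ones should come from the pre-deployment baseline.
DEFAULT_LIMITS: Dict[str, float] = {
    "mos_min": 3.6,
    "packet_loss_pct_max": 1.0,
    "jitter_ms_max": 30.0,
    "latency_ms_max": 150.0,
}


def check_alarms(m: CallMetrics,
                 limits: Optional[Dict[str, float]] = None) -> List[str]:
    """Return the names of all metrics that crossed their alarm threshold."""
    lim = DEFAULT_LIMITS if limits is None else limits
    alarms = []
    if m.mos < lim["mos_min"]:
        alarms.append("mos")
    if m.packet_loss_pct > lim["packet_loss_pct_max"]:
        alarms.append("packet_loss")
    if m.jitter_ms > lim["jitter_ms_max"]:
        alarms.append("jitter")
    if m.latency_ms > lim["latency_ms_max"]:
        alarms.append("latency")
    return alarms


print(check_alarms(CallMetrics(mos=3.2, packet_loss_pct=2.5,
                               jitter_ms=12.0, latency_ms=80.0)))
```

Keeping the limits in a separate table makes it easy to load different values for peak and off-peak periods.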
13 Passive Monitoring: Embedded Components for Product Developers
- Products incorporating these can quickly adapt to changing IP network conditions
- Real-time access to estimated MOS, round-trip latency
- Access to level and echo information for estimate of MOS-Conversational Quality
- VQMon from Telchemy (www.telchemy.com)
- PsyVoIP from Psytechnics (www.psytechnics.com)
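As one illustration of how a packet-based quality rating can be turned into an estimated MOS, the ITU-T G.107 E-model defines a conversion from its rating factor R to MOS. The sketch below implements only that conversion. It is not a description of how VQMon or PsyVoIP work internally, and the function name `r_to_mos` is chosen here for illustration.

```python
def r_to_mos(r: float) -> float:
    """Convert an E-model rating factor R to an estimated MOS (G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


# R = 93.2 is the E-model's default rating with no impairments.
print(round(r_to_mos(93.2), 2))
print(round(r_to_mos(70.0), 2))
```

Note that this mapping tops out at 4.5 rather than 5.0, which is one reason MOS values from different engines are not directly comparable.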
14 A few things about Codecs
- Waveform codecs
  - Produce a waveform as identical as possible to the original (G.711 PCM, G.726 ADPCM)
- Source codecs
  - Use a model of how speech is generated
  - Can significantly alter the time-domain waveform while sounding very similar to the input (G.729a/729, G.723.1)
15 A few things about Codecs
- Hybrid codecs
  - Combine techniques from waveform and source codecs
  - Use different modes and bit rates depending on network conditions
  - AMR
    - Bit rate 4.75-12.2 kbps, MIPS complexity 15-20
  - AMR-WB / G.722.2 (wideband, 7 kHz signal bandwidth)
    - Bit rate 6.6-23.85 kbps, MIPS complexity 38 (incl. VAD and CNG)
- Why knowledge of codec method(s) is useful for VQ analysis
16 Devices that can affect a User's VoIP Experience
- SBCs (Border Controllers)
- Media Servers
- Firewalls/ALGs
- Messaging Servers
- Conference Bridges
- IP PBXs
- IP Phones / VoIP endpoints
- Media Gateways
- IVR / Voice portals
17 Voice Quality versus Intelligibility
- Voice quality: the acceptability of speech
- Intelligibility: the clarity of speech
- Subjective tests: Diagnostic Rhyme Test, Modified Rhyme Test
- Higher frequencies are more important for intelligibility, a good benefit of wideband codecs
- Lower quality affects intelligibility but not necessarily vice versa
18 Voice Quality Measurement: A Hot Topic
- What is considered the gold standard way to measure voice quality?
- Answer: with humans, and the more of them in a listening session, the better the resolution of the resulting quality scores
- However, conducting a live-listener test is not as easy or cheap as you may think
19 MOS Subjective Testing
- It's a standard: ITU-T P.800 (1996)
- The technique rates quality using the absolute category rating (ACR) method on a 5-grade scale:
  5 = excellent, 4 = good, 3 = fair, 2 = poor, 1 = bad
20 MOS Subjective Testing
- How it's done
  - Requires use of a group of 32-64 naive listeners
  - Standardized male, female, and child phrases are used
  - Calibrating reference degraded conditions are intermixed with actual samples
  - The identical speech sample sets are played to all listeners
  - Listeners judge the quality of each phrase using the ACR scale
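The arithmetic behind a MOS value is a plain average of the listeners' ACR ratings. The sketch below also reports a rough 95% confidence half-width, which shows why a larger listener panel gives better score resolution. The normal approximation (z = 1.96), the helper name `mos_with_ci`, and the ratings are illustrative assumptions, not part of the P.800 procedure described above.

```python
import math
import statistics
from typing import List, Tuple


def mos_with_ci(ratings: List[int], z: float = 1.96) -> Tuple[float, float]:
    """Return (MOS, confidence half-width) for a list of 1-5 ACR ratings."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate spread")
    mos = statistics.mean(ratings)
    # Standard error of the mean shrinks with the square root of panel size.
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


small_panel = [4, 4, 3, 5, 4, 3, 4, 4]   # made-up ratings from 8 listeners
large_panel = small_panel * 4            # same spread, 32 listeners

for panel in (small_panel, large_panel):
    mos, hw = mos_with_ci(panel)
    print(f"n={len(panel):2d}  MOS={mos:.2f} +/- {hw:.2f}")
```

Two test conditions whose intervals overlap heavily cannot be ranked with confidence, which is part of why many conditions or small panels drive up cost.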
21 MOS Subjective Testing
- Strengths
  - Provides the definitive answer to "which sounds best?"
- Weaknesses
  - High cost, especially when many different test conditions or sample sets must be evaluated
  - Takes time to schedule test and get results
22 Objective VQ Standards
- All automated VQ measurement techniques are designed to estimate the way humans perceive voice quality
- PSQM: P.861 (1996)
  - PSQM+ handled higher distortion levels than PSQM
- PESQ: P.862 (2001)
  - Solved variable delay (alignment) problem of PSQM
23 What PESQ VQ Testing is designed for
- PESQ is a way to quickly and cost-effectively estimate the effects of one-way speech distortion and noise on speech quality
- PESQ is endpoint-agnostic: it can be used for VoIP-to-VoIP, VoIP-to-PSTN calls, etc.
- PESQ can be used for VQ assessment of wideband codecs if your test platform supports it (if not, 3.1 kHz signal bandwidth applies)
24 PESQ Narrowband vs. Wideband
25 What PESQ VQ Testing is not designed for
- PESQ does not evaluate the effects of loudness loss, fixed latency, sidetone, or echo as related to two-way caller interactions
- PESQ cannot safely be used to declare a VQ "winner" when the PESQ score differential is small (i.e., < 0.25)
  - "Opposite conclusion" errors are very possible, so the bigger the score differential the better
  - Especially true when comparing samples with more than a single changed variable
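The small-differential caution can be built into automated comparisons so that a script never reports a winner it cannot justify. A minimal sketch, assuming the 0.25 guideline from this slide as the default margin; `compare_pesq` is a hypothetical helper, not part of any PESQ tool.

```python
def compare_pesq(score_a: float, score_b: float, min_diff: float = 0.25) -> str:
    """Name the better sample, or 'inconclusive' if the gap is too small."""
    diff = score_a - score_b
    if abs(diff) < min_diff:
        return "inconclusive"
    return "A" if diff > 0 else "B"


print(compare_pesq(3.9, 3.8))
print(compare_pesq(4.1, 3.5))
```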
26 Objective VQ Testing
- Strengths
  - Provides excellent estimate of voice quality
  - Tests can be performed quickly
  - Tests are very repeatable
- Weaknesses
  - Not good for reliably resolving small differences in quality scores
27 Troubleshooting VQ Issues
- Must look at all the metrics of VoIP calls exactly as transmitted on the network
  - Packet loss?
  - Jitter?
  - Delay?
  - Voice quality?
- Measurement is critical for problem resolution
[Figure: jitter distribution graph]
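For the jitter metric, one common definition is the running interarrival jitter estimate from RFC 3550 (RTP). The sketch below applies that formula to per-packet send and receive times in milliseconds. RFC 3550 itself works in RTP timestamp units, so using milliseconds is a simplification, and the sample timings are made up.

```python
from typing import Sequence


def interarrival_jitter(send_ms: Sequence[float], recv_ms: Sequence[float]) -> float:
    """Running jitter estimate per RFC 3550: J += (|D| - J) / 16."""
    jitter = 0.0
    for i in range(1, len(send_ms)):
        # D = change in transit time between consecutive packets.
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter


send = [0, 20, 40, 60, 80]          # packets sent every 20 ms
recv = [50, 70, 98, 111, 130]       # arrival times with varying delay
print(round(interarrival_jitter(send, recv), 3))
```

A constant network delay yields zero jitter under this definition; only changes in delay between consecutive packets contribute.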
28 Tip: How to test end-to-end VQ of VoIP phones
- 1. It's usually not enough to evaluate VQ by just looking at the packet streams (i.e., E-model)
- 2. Must evaluate quality all the way to the phone's earpiece and microphone wires
  - This lets you evaluate the proper operation of the phone's internal VoIP gateway, including automatic gain control (AGC), voice activity detection (VAD), comfort noise generation (CNG), echo cancellation, codecs, jitter buffer management, and packet loss concealment algorithms
  - In other words, there is much that can go wrong
29 Tip: How to test end-to-end VQ of VoIP phones
- 3. Must evaluate under expected LAN/WAN impairment conditions
  - Packet loss, jitter, latency
  - Effective bandwidth of IP connection (i.e., broadband versus dialup)
- 4. Don't forget interoperability testing against other VoIP devices
  - Verify VQ against other expected manufacturers' devices
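Effective bandwidth matters because the IP-level rate of a call is well above the codec bit rate. A rough sketch of the arithmetic, assuming 20 ms packetization, 40 bytes of IPv4/UDP/RTP headers per packet, no link-layer overhead, no silence suppression, and a nominal 56 kbps dialup link (all assumptions, not figures from this talk):

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12   # IPv4 + UDP + RTP, no link-layer overhead
DIALUP_KBPS = 56.0                      # nominal modem rate, assumed for illustration


def ip_bandwidth_kbps(codec_kbps: float, packet_ms: float = 20.0) -> float:
    """One-way IP bandwidth of a single call for a constant-bit-rate codec."""
    payload_bytes = codec_kbps * 1000.0 / 8.0 * packet_ms / 1000.0
    packets_per_sec = 1000.0 / packet_ms
    return (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8.0 * packets_per_sec / 1000.0


for name, rate in (("G.711", 64.0), ("G.729", 8.0)):
    bw = ip_bandwidth_kbps(rate)
    verdict = "fits" if bw < DIALUP_KBPS else "does not fit"
    print(f"{name}: {bw:.0f} kbps per direction, {verdict} in {DIALUP_KBPS:.0f} kbps")
```

Under these assumptions G.711 needs 80 kbps per direction and G.729 needs 24 kbps, so only the latter is a candidate for dialup-class links.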
30 Testing end-to-end VQ of VoIP phones
- The automated VQ test
  - Important for verifying VQ under many conditions
  - Vary one dimension at a time during subsequent test runs
- The manual VQ "real user" test
  - Conduct 2-way calls with real users who are familiar with potential echo cancellation and other 2-way effects
  - Include handset and speakerphone test calls
31 Testing end-to-end VQ of VoIP phones
- Test setup examples
  - Softphone to softphone test
  - VoIP phone to VoIP phone test (in lab)
  - VoIP phone to PSTN call test
  - Variations on these themes are easily set up
- Wideband codecs used? If so, be sure to verify that all test equipment in the audio/media signal path can support 8 kHz.
32 Testing Softphone-to-Softphone
Media may flow peer-to-peer or through the VoIP
Network component
PESQ evaluated off-line via batch process
33 Testing VoIP Phone-to-VoIP Phone
Good setup when an isolated device performance test is needed. Phone calls are manually placed with this setup.
34 Testing VoIP Phone to PSTN calls
35 Example WAN Impairment Conditions for VQ Test
Conditions suitable for emulation of overseas Internet dialup conditions
Broadband and Dialup IP bandwidths for each condition below

Packet Loss      Burst Size     Latency / Jitter
0%               n/a            10/30 mSec (uniform distributed latency model)
Random 2.5%      n/a            10/30 mSec
Burst 5.0%       1-5 packets    50/80 mSec
Burst 10.0%      1-8 packets    125/250 mSec
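For lab work without a hardware impairment emulator, the burst-loss conditions above can be approximated in software. The sketch below generates a drop pattern for a target loss level and burst-size range. It reads the loss figures as percentages and draws burst sizes uniformly between the minimum and maximum; both are assumptions, since the slide does not state the distribution, and `burst_loss_pattern` is a hypothetical helper.

```python
import random
from typing import List


def burst_loss_pattern(n_packets: int, loss_pct: float,
                       min_burst: int = 1, max_burst: int = 5,
                       seed: int = 0) -> List[bool]:
    """Return a list where True marks a dropped packet."""
    rng = random.Random(seed)
    loss = loss_pct / 100.0
    mean_burst = (min_burst + max_burst) / 2.0
    # Chance of starting a burst on any packet that is not already inside one,
    # chosen so the long-run loss fraction matches `loss`.
    p_start = loss / (mean_burst * (1.0 - loss) + loss)
    dropped = []
    remaining = 0  # packets still to drop in the current burst
    for _ in range(n_packets):
        if remaining == 0 and rng.random() < p_start:
            remaining = rng.randint(min_burst, max_burst)
        if remaining > 0:
            dropped.append(True)
            remaining -= 1
        else:
            dropped.append(False)
    return dropped


pattern = burst_loss_pattern(100_000, loss_pct=5.0, min_burst=1, max_burst=5)
print(f"measured loss: {100.0 * sum(pattern) / len(pattern):.2f}%")
```

Fixing the seed keeps the drop pattern identical across runs, which helps when varying one other dimension at a time.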
36 Watch out for
- Do not try to compare MOS scores derived from different sources or evaluation engines
  - Even the numeric ranges from worst to best can vary (i.e., best = 4.5, not 5.0)
  - Especially, don't compare passive with active VQ results
37 Real-World Next-Gen Network Product Testing
www.ct-labs.com, 916-577-2100
Chris Bajorek, chris_at_ct-labs.com, 916-577-2110 (direct line)