Using Speech Recognition to Predict VoIP Quality

Provided by: henningsc

Learn more at: http://www.cs.columbia.edu

1
Using Speech Recognition to Predict VoIP Quality

Wenyu Jiang
IRT Lab
April 3, 2002

2
Introduction to Voice Quality

Quality factors in Voice over IP (VoIP)
Packet loss, delay, and jitter
Choice of voice codec
Quality metric: Mean Opinion Score (MOS)
Widely used
Human based
Time consuming
Labor intensive
Results not available in real time

3
Motivation

Features of a speech recognizer
Automatic speech recognition (ASR): no human listeners needed
Accuracy of recognition is apparently coupled with the quality of input speech
Recognition can be done in real time, allowing online quality monitoring
Recognition performance may be related to speech intelligibility as well as quality

4
Related Work

ITU-T E-model (G.107/G.108)
An analytical model for estimating perceived quality
Provides loss-to-MOS mapping for some common codecs (G.729, G.711, G.723.1)
Chernick et al. study speech recognition performance with the DoD-CELP codec
Effect of bit error rate instead of packet loss
Phoneme (instead of word) recognition ratio
Some MOS results, but not accurate enough

5
Experiment Setup

Speech recognition engine
IBM ViaVoice on Linux
Wrote software for both voice model training and
performance testing
Training and Testing
2 scripts: Script 1 for training, Script 2 for testing.
2 speakers, A and B; both read both scripts.
Script 2 is split into 25 audio clips, with 5 clips per loss condition (0%, 2%, 5%, 10%, 15%)
Codec: G.729
Training uses G.729-processed audio

6
Experiment Setup, contd.

Performance metric
Absolute word recognition ratio
Relative word recognition ratio
p is packet loss probability
MOS listening tests: 22 listeners
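The formulas for the two metrics did not survive in this transcript. A plausible reconstruction, assuming the usual word-accuracy definition and the use of Rabs(0) as the per-speaker baseline (as slide 14 suggests), is given below. The symbols N_correct and N_total are introduced here for illustration and are not from the slides.

```latex
% Assumed definitions; N_{correct} and N_{total} are illustrative symbols.
R_{abs}(p) = \frac{N_{correct}(p)}{N_{total}},
\qquad
R_{rel}(p) = \frac{R_{abs}(p)}{R_{abs}(0)}
```

Under this reading, Rrel(0) equals 1 for every speaker, which is consistent with the later claim that Rrel is speaker-independent while Rabs is not.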

7
Recognition Ratio vs. MOS

Both MOS and Rabs decrease as packet loss increases

Then eliminate the intermediate variable p to relate Rabs directly to MOS
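Eliminating p can be sketched as follows: pair the recognition ratio and the MOS measured at the same loss condition, drop p, and interpolate along the ratio axis. This is a minimal sketch, not the method used in the talk; the function names and the piecewise-linear interpolation are assumptions, and the numbers in the usage example are made-up placeholders rather than measured results.

```python
from bisect import bisect_left


def build_ratio_to_mos(loss_to_ratio, loss_to_mos):
    """Pair ratio and MOS measured at the same loss rate p, then drop p."""
    shared = sorted(set(loss_to_ratio) & set(loss_to_mos))
    # Sort the pairs by ratio so we can interpolate along the ratio axis.
    return sorted((loss_to_ratio[p], loss_to_mos[p]) for p in shared)


def predict_mos(pairs, ratio):
    """Piecewise-linear interpolation of MOS at a given recognition ratio."""
    ratios = [r for r, _ in pairs]
    # Clamp to the end points outside the measured range.
    if ratio <= ratios[0]:
        return pairs[0][1]
    if ratio >= ratios[-1]:
        return pairs[-1][1]
    i = bisect_left(ratios, ratio)
    (r0, m0), (r1, m1) = pairs[i - 1], pairs[i]
    return m0 + (m1 - m0) * (ratio - r0) / (r1 - r0)


# Placeholder values for illustration only (keys are loss percentages).
example_pairs = build_ratio_to_mos({0: 0.75, 10: 0.25}, {0: 4.0, 10: 2.0})
example_mos = predict_mos(example_pairs, 0.5)
```

A monotone curve fit could replace the linear interpolation if the measured points are noisy.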

8
Properties of ASR Performance

When loss probability is low
Recognition ratio changes slowly
Possibly due to robustness in ViaVoice
MOS prediction is less accurate in this case
Importance of voice training method
Training audio should use same codec as testing

9
Speaker Dependence in ASR

ViaVoice SDK cites 90% accuracy for
An average speaker without a heavy accent
Sampling at 22 kHz, PCM linear-16
For speaker A, we achieved
About 42% accuracy with no packet loss
Reasons
8 kHz sampling and G.729 compression
Accent and talking speed
Does not interfere with MOS prediction, but we need to check for speaker dependence

10
Speaker Dependence Check

Absolute recognition ratio is
70% for speaker B, but 42% for speaker A
So it depends on the speaker

But the relative recognition ratio Rrel is
universal and speaker-independent

11
Rrel as Universal MOS Predictor

Mapping from relative recognition ratio Rrel to
MOS

12
Human Recognition Results

Listeners are asked to transcribe what they hear
in addition to MOS grading.
Human recognition result curves are less smooth
than MOS curves.

13
Human Results, contd.

Two flat regions in loss-human curve
2-5% loss (some loss but not very high)
10-15% loss (loss is already too high)

Mapping between machine and human recognition
performance

14
Application Scenarios

Sender transmits a pre-recorded audio clip of a
speaker known to receiver.
Receiver does the following
Looks up Rabs(0) for this speaker
Performs speech recognition
Compares the result to the original text and computes Rrel
No need to store the original audio clip
Just the text is sufficient, so less storage is needed
Need not know packet loss probability
Suitable for e2e black-box measurements
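The receiver-side steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the recognizer output is taken as a plain string produced elsewhere, words are matched in order with Python's difflib (the talk does not say how recognized words were scored), and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def absolute_ratio(reference_text, recognized_text):
    """Fraction of reference words found, in order, in the recognizer output."""
    ref = reference_text.lower().split()
    hyp = recognized_text.lower().split()
    if not ref:
        raise ValueError("reference text is empty")
    # Word-level alignment; autojunk is disabled so common words still count.
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)


def relative_ratio(r_abs, r_abs_no_loss):
    """Normalize by the speaker's stored zero-loss ratio, Rabs(0)."""
    if r_abs_no_loss <= 0:
        raise ValueError("zero-loss ratio must be positive")
    return r_abs / r_abs_no_loss
```

Only the reference text and one number per speaker, Rabs(0), need to be stored at the receiver; the packet loss probability never enters the computation.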

15
Conclusions

Evaluation of speech recognition performance as a
MOS predictor
Used ViaVoice speech engine
Performance metric: word recognition ratio
The relative word recognition ratio is a
universal, speaker-independent metric
Also analyzed human recognition performance
Future work: evaluate other codecs, e.g., G.726, GSM.
