Understanding and Optimizing Speech Recognition

1
Understanding and Optimizing Speech Recognition

Vocera Customer Webinar Series
July 29, 2009

2
Objectives

Understand how the Nuance engine works
Measure the performance of the Nuance engine
Analyze reports from the Vocera Report Server
Maximize speech recognition performance with best
practices

3
Speakers

Kathy Brown
Vocera, Inc.
Director of Professional Services
Steve Blair
Vocera, Inc.
Director of Engineering

4
Measuring Speech Recognition
5
Some Speech Rec Questions

What is good speech recognition?
How do you know you have it?
What is bad speech recognition?
What can you do about it?

6
Measuring Speech Recognition

Two methodologies
Supervised
Unsupervised
Unsupervised: automated reporting
Reports generated automatically from call log data
VRS Reports
Supervised: manual reporting
Manually transcribe what the user actually said
Compare transcribed utterance to call log data and grammars

Transcription: written text corresponding to the actual spoken utterance
7
Typical VRS Speech Rec Summary
Overall speech recognition statistics (category labels not preserved)
Count     Percent
27,243    69.10
8,388     21.30
3,036     7.70
729       1.80
19        0.00
Total 39,415    100
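
The summary figures can be cross-checked with a few lines of Python. This is only a sketch: the counts are copied from the slide, the category labels were lost in the transcript so rows are identified by position, and the function name `percentages` is made up for illustration. It shows that the five counts add up to the reported total and that the published percentages match each count's share when rounded to one decimal place.

```python
# Counts copied from the summary slide; row labels are unknown,
# so each row is identified only by its position.
counts = [27_243, 8_388, 3_036, 729, 19]
reported_total = 39_415


def percentages(values, total):
    """Return each value as a percentage of total, rounded to one decimal."""
    return [round(100 * v / total, 1) for v in values]


total = sum(counts)
shares = percentages(counts, total)

for count, share in zip(counts, shares):
    print(f"{count:>7,}  {share:5.1f}")
print(f"{total:>7,}  total")
```

Note that these are shares of call outcomes only. Nothing in them says whether an accepted result matched what the user actually said.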
8
Measuring Accuracy

Unsupervised reporting has built-in limitations
You must compare the Nuance interpretation to the
user utterance to determine accuracy
This is a manual process
Unsupervised reporting is an automated process

You cannot measure accuracy without transcriptions
9
Types of Utterances

First step in measuring accuracy is to separate
utterances into two groups

Utterances
Out-of-Grammar
In-Grammar
10
In and Out of Grammar

In-grammar
User says something that is in the defined grammar
Example: "Call Brian Sturges"
Out-of-grammar
User says something that is not in the defined grammar
Example: "Give me Brian Sturges"

11
Why Separate Utterances?

Distinction between in- and out-of-grammar is
critically important
In-grammar utterances
When not correctly recognized, need recognizer
and dictionary tuning
Out-of-grammar utterances
Indicate training, usability, and database issues

12
In-Grammar Measurements

In-Grammar Correct Accept (IG-CA)
The recognizer returned the correct answer
In-Grammar False Accept (IG-FA)
The recognizer returned an incorrect answer
In-Grammar False Reject (IG-FR)
The recognizer could not find a good match and
rejected rather than return an answer

13
In-Grammar Measurements

In-Grammar Correct Accept

In-Grammar False Accept

In-Grammar False Reject

14
Out-of-Grammar Measurements

Out-of-Grammar Correct Reject (OG-CR)
The recognizer correctly rejected the input
Out-of-Grammar False Accept (OG-FA)
The recognizer returned a wrong answer because
the input was not in the grammar
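
The five categories from the in-grammar and out-of-grammar slides can be expressed as a small scoring function. The sketch below is illustrative rather than a Vocera or Nuance tool. It assumes each transcribed utterance has been reduced to an `expected` interpretation (`None` when the utterance is out-of-grammar) and that `result` is the recognizer's interpretation (`None` when it rejected). The sample interpretations such as `Call u-bsturges` are hypothetical.

```python
from collections import Counter
from typing import Optional


def classify(expected: Optional[str], result: Optional[str]) -> str:
    """Label one utterance as IG-CA, IG-FA, IG-FR, OG-CR, or OG-FA.

    expected: interpretation taken from the manual transcription,
              or None if the utterance was out-of-grammar.
    result:   interpretation returned by the recognizer,
              or None if the recognizer rejected the utterance.
    """
    if expected is not None:          # in-grammar utterance
        if result is None:
            return "IG-FR"            # rejected a valid utterance
        return "IG-CA" if result == expected else "IG-FA"
    # out-of-grammar utterance
    return "OG-CR" if result is None else "OG-FA"


# Hypothetical transcribed samples: (expected, result)
samples = [
    ("Call u-bsturges", "Call u-bsturges"),   # IG-CA
    ("Call u-bsturges", "Call u-pshiferaw"),  # IG-FA
    ("Call u-bsturges", None),                # IG-FR
    (None, None),                             # OG-CR
    (None, "Call u-bsturges"),                # OG-FA
]

tally = Counter(classify(expected, result) for expected, result in samples)
in_grammar_total = tally["IG-CA"] + tally["IG-FA"] + tally["IG-FR"]
in_grammar_accuracy = tally["IG-CA"] / in_grammar_total
print(dict(tally), f"in-grammar accuracy: {in_grammar_accuracy:.0%}")
```

Keeping the in-grammar and out-of-grammar tallies separate mirrors the slides: the first group points at recognizer and dictionary tuning, the second at training, usability, and database issues.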

15
Out-of-Grammar Measurements

Out-of-Grammar Correct Reject

Out-of-Grammar False Accept

16
True Recognition Accuracy

Typical recognition accuracy is usually much greater than 70%

More representative speech recognition accuracy
Actual customer data from medium-sized hospital
17
Limits of VRS Reports

VRS reports do not measure accuracy!
Why?

TIME                   Sun Apr 20 08:05:59 2008
BEGIN_TIME             14.25
SESSION_ID             3d53190a_000009d8_480b5bd7_84a2_0000
START_OF_SPEECH_DELAY  4.36
SPEECH_DURATION        3.2
STATUS                 RECOGNITION
LATENCY                0.219
PERCENT_CPU_RT         0.274689
SERVER_HOSTNAME        absmc-vcdb2
NUM_RESULTS            2
RESULT0                call patricia shiferaw please
CONFIDENCE0            47
PROBABILITY0           -11201
NUM_NL_INTERPRETATIONS0  1
NL_INTERPRETATION00    <snAction Call> <snName1 u-pshiferaw>
CONFIDENCE00snAction   65
CONFIDENCE00snName1    48

User utterance
Nuance call log data
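
A short sketch makes the "Why?" concrete. It assumes a call log record laid out as one `KEY value` pair per line, which is a simplification of the slide's excerpt and not necessarily the exact Nuance file format; `parse_record` is a made-up helper name. The parsed record contains what the recognizer decided and how confident it was, but no field holding what the user actually said.

```python
# Abbreviated record based on the slide's excerpt, one "KEY value" per line.
# The layout is an assumption made for this illustration.
SAMPLE_RECORD = """\
TIME Sun Apr 20 08:05:59 2008
STATUS RECOGNITION
NUM_RESULTS 2
RESULT0 call patricia shiferaw please
CONFIDENCE0 47
"""


def parse_record(text):
    """Split each non-empty line into a key and the rest of the line."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        record[key] = value
    return record


record = parse_record(SAMPLE_RECORD)
recognized = record["RESULT0"]
confidence = int(record["CONFIDENCE0"])
print(f"recognized: {recognized!r} (confidence {confidence})")
# There is no key for the user's real utterance, so accuracy cannot be
# computed from the log alone; a manual transcription is still required.
```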
18
Speech Recognition Baseline

Performed intensive survey
Collected 41,845 in-field utterances and Nuance
logs from 5 large Vocera hospitals (8 sites)
Transcribed 7932 utterances
Analyzed results for accuracy (intent = result?)
Result highlights
18% of utterances are out-of-grammar
89% of in-grammar utterances were successfully recognized
71% of utterances were CALL subject
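
The highlights can be combined in a short worked calculation. This sketch assumes the 18% and 89% figures both describe the 7,932 transcribed utterances, which the slide does not state outright, and the variable names are made up for illustration.

```python
transcribed = 7932              # utterances transcribed in the survey
out_of_grammar_rate = 0.18      # share of utterances that were out-of-grammar
in_grammar_success_rate = 0.89  # share of in-grammar utterances recognized

in_grammar_rate = 1 - out_of_grammar_rate
# Share of ALL utterances that were both in-grammar and recognized.
overall_success_share = in_grammar_rate * in_grammar_success_rate

approx_out_of_grammar = round(transcribed * out_of_grammar_rate)

print(f"in-grammar share:           {in_grammar_rate:.0%}")
print(f"in-grammar and recognized:  {overall_success_share:.1%}")
print(f"out-of-grammar utterances:  about {approx_out_of_grammar}")
```

Under that assumption, roughly 73% of all utterances were both in-grammar and recognized, even though the recognizer handled 89% of the utterances it could reasonably be expected to handle.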

19
What You Can Do: Optimizing Speech Recognition
20
Use ASNs Sparingly

Alternate Spoken Names (ASNs) increase the size
of the grammar
Larger grammars increase the likelihood of
misrecognition
Incorrect ASNs create problems that are hard to
debug

21
When to Use an ASN

Use ASNs for nicknames
Victoria Noel, who is known as Tori
Charles Gregory Drew, who goes by his middle
name, Greg
Ernest Alexander, who is called Skip by
everyone
Use ASNs for titles
Benjamin Spock, known as Doctor Spock
Dwight Eisenhower, called General Eisenhower

22
Other Uses of ASNs

Name most commonly used
Adult I C U Lead = Seventh Floor Lead
Maiden name
Rebecca Barry = Rebecca Nunn
Many users cannot pronounce the name
Mazowiecki Popieluszko = Mazzy Pop

23
Quasi-Phonetic Spellings

Carefully use ASNs to spell non-English names
phonetically
Spell as English words or names
Spell as phonetically simple syllables
Pseudo-phonetic spellings are a slippery slope!
Examples of useful phonetic ASNs
Pacita
Nuance thinks it is Pacheetah
Could be spelled as Paseeta
Cindy Landola
Could be spelled as Lantola

24
How to Add ASNs

Identify a commonly mispronounced name
Record the user's pronunciation
Make sure the name is in the database
Call the name to see if it works for you
Respell the name phonetically
Call the name to check your work

Test, test, and test again!
25
When to Use Learned Names

Learned names
Effective when only a few users mispronounce the
name
Alternate spoken names
Effective when many users mispronounce the name

26
When to Use Department Names

Department name allows use of first name only
Michael Gebrekristos = Michael in Administration
Aster Cabansag = Aster in E K G
Department name disambiguates similar names
Peri Gunay in Imaging
Eric Auneit in Transport Services
Assigning departments to each user is akin to
adding an ASN for each user

27
Specify Length of Phone Extensions

Specify the number of digits in fixed length
extensions and pager numbers
Recognition improves significantly by limiting
the length of the digit string
Accuracy degrades 1% per digit
Set values in properties.txt
TelExtensionLength
Number of digits in a phone extension
TelPagerNumberLength
Number of digits in an inside pager number
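
As a rough illustration of what the two settings express, the sketch below reads them from properties-style text and checks digit strings against the configured lengths. The `name = value` syntax, the values 4 and 7, and the helper names are assumptions made for this example. Check the Vocera documentation for the actual properties.txt format before editing a live server.

```python
# Hypothetical properties.txt excerpt; syntax and values are assumptions.
SAMPLE_PROPERTIES = """
# Fixed-length dialing settings
TelExtensionLength = 4
TelPagerNumberLength = 7
"""


def read_properties(text):
    """Parse 'name = value' lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            props[name.strip()] = value.strip()
    return props


def has_configured_length(digits, props, name):
    """True if digits is all numeric and exactly as long as props[name]."""
    return digits.isdigit() and len(digits) == int(props[name])


props = read_properties(SAMPLE_PROPERTIES)
print(has_configured_length("3145", props, "TelExtensionLength"))
print(has_configured_length("31456", props, "TelExtensionLength"))
```

With a fixed length configured, the recognizer only has to consider digit strings of that one length, which is the narrowing the slide credits for the improvement.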

29
Speech Recognition Support

Vocera Professional Services Documents
Vocera Speech Recognition Professional Services
Support Data Sheet
Vocera Professional Services Brochure

Download both from Live Meeting by clicking the Handouts button in the top right area of your screen.
These documents can also be found on the Vocera customer portal at http://vocera.com/portal/View/Default.aspx?id112parentServiceandSupport
30
Contact Information and Questions
Vocera Speech Recognition Questions
Vocera Professional Services or your Vocera Sales Associate, toll free 1 888 9 Vocera (1 888 986 2372)
Webinar Information
To register for upcoming webinars or view recordings of previous webinars, visit http://vocera.com/news/webinars.aspx
About Vocera Webinars
If you have questions about the Vocera Customer Webinar Series, please send an email to webinars@vocera.com
31
Thank You
