1
Understanding and Optimizing Speech Recognition
  • Vocera Customer Webinar Series
  • July 29, 2009

2
Objectives
  • Understand how the Nuance engine works
  • Measure the performance of the Nuance engine
  • Analyze reports from the Vocera Report Server
  • Maximize speech recognition performance with best
    practices

3
Speakers
  • Kathy Brown
  • Vocera, Inc.
  • Director of Professional Services
  • Steve Blair
  • Vocera, Inc.
  • Director of Engineering

4
Measuring Speech Recognition
5
Some Speech Rec Questions
  • What is good speech recognition?
  • How do you know you have it?
  • What is bad speech recognition?
  • What can you do about it?

6
Measuring Speech Recognition
  • Two methodologies: supervised and unsupervised
  • Unsupervised = automated reporting
  • Reports generated automatically from call log
    data
  • VRS Reports
  • Supervised = manual reporting
  • Manually transcribe what the user actually said
  • Compare transcribed utterance to call log data
    and grammars

Transcription = written text corresponding to the
actual spoken utterance
7
Typical VRS Speech Rec Summary
  27,243 (69.10%)
   8,388 (21.30%)
   3,036 (7.70%)
     729 (1.80%)
      19 (0.00%)
  Total: 39,415 (100%)
Overall speech recognition statistics
8
Measuring Accuracy
  • Unsupervised reporting has built-in limitations
  • You must compare the Nuance interpretation to the
    user utterance to determine accuracy
  • This is a manual process
  • Unsupervised reporting is an automated process

You cannot measure accuracy without transcriptions
9
Types of Utterances
  • First step in measuring accuracy is to separate
    utterances into two groups

Utterances
Out-of-Grammar
In-Grammar
10
In and Out of Grammar
  • In-grammar
  • User says something that is in the defined
    grammar
  • Example: "Call Brian Sturges"
  • Out-of-grammar
  • User says something that is not in the defined
    grammar
  • Example: "Give me Brian Sturges"
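As a sketch of the distinction, a toy grammar checker can label each utterance as in- or out-of-grammar. The command patterns and names here are illustrative only; a real Vocera/Nuance grammar is far richer:

```python
import re

# Hypothetical name database and command patterns (illustrative only).
NAMES = {"brian sturges", "victoria noel"}
COMMAND_PATTERNS = [
    re.compile(r"^call (?P<name>.+)$"),
    re.compile(r"^broadcast to (?P<name>.+)$"),
]

def in_grammar(utterance: str) -> bool:
    """True if the utterance matches a defined command with a known name."""
    text = utterance.lower().strip()
    for pattern in COMMAND_PATTERNS:
        m = pattern.match(text)
        if m and m.group("name") in NAMES:
            return True
    return False

print(in_grammar("Call Brian Sturges"))     # in-grammar
print(in_grammar("Give me Brian Sturges"))  # out-of-grammar
```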

11
Why Separate Utterances?
  • Distinction between in- and out-of-grammar is
    critically important
  • In-grammar utterances
  • When not correctly recognized, need recognizer
    and dictionary tuning
  • Out-of-grammar utterances
  • Indicate training, usability, and database issues

12
In-Grammar Measurements
  • In-Grammar Correct Accept (IG-CA)
  • The recognizer returned the correct answer
  • In-Grammar False Accept (IG-FA)
  • The recognizer returned an incorrect answer
  • In-Grammar False Reject (IG-FR)
  • The recognizer could not find a good match and
    rejected rather than return an answer

13
In-Grammar Measurements
  • In-Grammar Correct Accept
  • In-Grammar False Accept
  • In-Grammar False Reject

14
Out-of-Grammar Measurements
  • Out-of-Grammar Correct Reject (OG-CR)
  • The recognizer correctly rejected the input
  • Out-of-Grammar False Accept (OG-FA)
  • The recognizer returned a wrong answer because
    the input was not in the grammar

15
Out-of-Grammar Measurements
  • Out-of-Grammar Correct Reject
  • Out-of-Grammar False Accept
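Given transcriptions, the five outcome labels from the last few slides can be assigned mechanically. A minimal sketch (the sample utterances and names are illustrative, not real data):

```python
from collections import Counter
from typing import Optional

def classify(in_grammar: bool, result: Optional[str], truth: Optional[str]) -> str:
    """Label one utterance with the in-/out-of-grammar outcome categories.
    in_grammar: did the transcription match the defined grammar?
    result: the recognizer's answer, or None if it rejected.
    truth: the correct answer for an in-grammar utterance."""
    if in_grammar:
        if result is None:
            return "IG-FR"  # rejected a valid utterance
        return "IG-CA" if result == truth else "IG-FA"
    return "OG-CR" if result is None else "OG-FA"

# Hypothetical transcribed sample: (in-grammar?, recognizer output, truth)
sample = [
    (True,  "call brian sturges", "call brian sturges"),  # correct accept
    (True,  "call brian stevens", "call brian sturges"),  # false accept
    (True,  None,                 "call brian sturges"),  # false reject
    (False, None,                 None),                  # correct reject
    (False, "call brian sturges", None),                  # false accept
]
counts = Counter(classify(*row) for row in sample)
print(dict(counts))
```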

16
True Recognition Accuracy
  • Typical recognition accuracy is usually much
    greater than 70%

More representative speech recognition accuracy
Actual customer data from medium-sized hospital
17
Limits of VRS Reports
  • VRS reports do not measure accuracy!
  • Why?

TIME Sun Apr 20 08:05:59 2008
BEGIN_TIME 14.25
SESSION_ID 3d53190a_000009d8_480b5bd7_84a2_0000
START_OF_SPEECH_DELAY 4.36
SPEECH_DURATION 3.2
STATUS RECOGNITION
LATENCY 0.219
PERCENT_CPU_RT 0.274689
SERVER_HOSTNAME absmc-vcdb2
NUM_RESULTS 2
RESULT0 call patricia shiferaw please
CONFIDENCE0 47
PROBABILITY0 -11201
NUM_NL_INTERPRETATIONS0 1
NL_INTERPRETATION00 <snAction Call> <snName1 u-pshiferaw>
CONFIDENCE00snAction 65
CONFIDENCE00snName1 48
User utterance
Nuance call log data
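Records like the one above are simple enough to pull apart programmatically. A minimal sketch, assuming a "KEY value" line layout like the sample (field names come from that record; real log layouts may differ between Nuance/Vocera versions):

```python
# Parse one Nuance-style call-log record into a dict of fields.
def parse_log_record(record: str) -> dict:
    fields = {}
    for line in record.strip().splitlines():
        key, _, value = line.partition(" ")  # key is the first token
        fields[key] = value.strip()
    return fields

record = """\
TIME Sun Apr 20 08:05:59 2008
SPEECH_DURATION 3.2
RESULT0 call patricia shiferaw please
CONFIDENCE0 47"""

fields = parse_log_record(record)
print(fields["RESULT0"], fields["CONFIDENCE0"])
```

Note that even a fully parsed record gives only the recognizer's interpretation and confidence, never the actual user utterance, which is why transcription remains necessary for accuracy measurement.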
18
Speech Recognition Baseline
  • Performed intensive survey
  • Collected 41,845 in-field utterances and Nuance
    logs from 5 large Vocera hospitals (8 sites)
  • Transcribed 7,932 utterances
  • Analyzed results for accuracy (intent = result?)
  • Result highlights
  • 18% of utterances were out-of-grammar
  • 89% of in-grammar utterances were successfully
    recognized
  • 71% of utterances were "Call subject"

19
What You Can Do: Optimizing Speech Recognition
20
Use ASNs Sparingly
  • Alternate Spoken Names (ASNs) increase the size
    of the grammar
  • Larger grammars increase the likelihood of
    misrecognition
  • Incorrect ASNs create problems that are hard to
    debug

21
When to Use an ASN
  • Use ASNs for nicknames
  • Victoria Noel, who is known as Tori
  • Charles Gregory Drew, who goes by his middle
    name, Greg
  • Ernest Alexander, who is called Skip by
    everyone
  • Use ASNs for titles
  • Benjamin Spock, known as Doctor Spock
  • Dwight Eisenhower, called General Eisenhower

22
Other Uses of ASNs
  • Name most commonly used
  • Adult I C U Lead = Seventh Floor Lead
  • Maiden name
  • Rebecca Barry = Rebecca Nunn
  • Many users cannot pronounce the name
  • Mazowiecki Popieluszko = Mazzy Pop

23
Quasi-Phonetic Spellings
  • Carefully use ASNs to spell non-English names
    phonetically
  • Spell as English words or names
  • Spell as phonetically simple syllables
  • Pseudo-phonetic spellings are a slippery slope!
  • Examples of useful phonetic ASNs
  • Pacita
  • Nuance thinks it is "Pacheetah"
  • Could be spelled as "Paseeta"
  • Cindy Landola
  • Could be spelled as "Lantola"

24
How to Add ASNs
  • Identify a commonly mispronounced name
  • Record the user's pronunciation
  • Make sure the name is in the database
  • Call the name to see if it works for you
  • Respell the name phonetically
  • Call the name to check your work

Test, test, and test again!
25
When to Use Learned Names
  • Learned names
  • Effective when only a few users mispronounce the
    name
  • Alternate spoken names
  • Effective when many users mispronounce the name

26
When to Use Department Names
  • Department name allows use of first name only
  • Michael Gebrekristos = Michael in Administration
  • Aster Cabansag = Aster in E K G
  • Department name disambiguates similar names
  • Peri Gunay in Imaging
  • Eric Auneit in Transport Services
  • Assigning departments to each user is akin to
    adding an ASN for each user

27
Specify Length of Phone Extensions
  • Specify the number of digits in fixed length
    extensions and pager numbers
  • Recognition improves significantly by limiting
    the length of the digit string
  • Accuracy degrades 1% per digit
  • Set values in properties.txt
  • TelExtensionLength
  • Number of digits in a phone extension
  • TelPagerNumberLength
  • Number of digits in an inside pager number
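As a sketch, a site with 4-digit phone extensions and 7-digit inside pager numbers might set the two properties like this. The property names come from the slide; the digit values and exact file syntax are illustrative, so check your Vocera administration guide before editing properties.txt:

```
TelExtensionLength = 4
TelPagerNumberLength = 7
```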

28
  • Q&A

29
Speech Recognition Support
  • Vocera Professional Services Documents
  • Vocera Speech Recognition Professional Services
    Support Data Sheet
  • Vocera Professional Services Brochure

Download both from Live Meeting by clicking the
Handouts button in the top right area of your
screen. These documents can also be found on the
Vocera customer portal at
http://vocera.com/portal/View/Default.aspx?id=112&parent=ServiceandSupport
30
Contact Information and Questions
Vocera Speech Recognition Questions: contact
Vocera Professional Services or your Vocera Sales
Associate, toll free 1-888-9-Vocera (1-888-986-2372).
Webinar Information: To register for upcoming
webinars or view recordings of previous webinars,
visit http://vocera.com/news/webinars.aspx
About Vocera Webinars: If you have questions
about the Vocera Customer Webinar Series, please
send an email to webinars@vocera.com
31
Thank You