Voice as a User Interface: Tuning Speech

1
Voice as a User Interface: Tuning Speech
  • Baiju D. Mandalia, PhD
  • Senior Technical Staff Member, IBM Corporation

2
Voice as a User Interface
  • Human factors standards
  • Error handling
  • Prompt Design
  • Dialog Design
  • Grammar Design
  • Tuning voice applications

3
ETSI ES 202 077 for spoken commands
  • Context-independent common commands (Main Menu, Operator, Goodbye, etc.)
  • Context-dependent common commands (Help, Repeat)
  • Core commands (Yes, No, Stop)
  • Digits (including Zero, Oh, Double, etc.)
  • Name and digit dialing (Home, Work, Mobile)
  • Basic call handling (Answer, Divert all calls to, Transfer, etc.)
  • Media control (Play, Pause, Continue, etc.)
  • Browsable list navigation (Next, Continue, Details)
  • Editing commands (Delete, Save, Record, etc.)
  • Device settings (Volume up/Louder, Volume down/Quieter)
  • Word-spotting mode (Wake-up, to activate the other modal functions in
    the list above)

4
Dialog Components for handling errors

5
Prompt Design
  • Pre-recorded vs Text to Speech
  • Guiding the caller
  • SSML
  • Dictionaries
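
As a rough illustration of the SSML item above, the sketch below builds a short text-to-speech prompt with a pause and an emphasized keyword, using only Python's standard library. The function name build_prompt and the prompt wording are hypothetical and not taken from the presentation; the speak, break, and emphasis elements are defined by the W3C SSML specification.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_prompt(intro: str, keyword: str) -> str:
    """Return an SSML string: intro text, a short pause, then an emphasized keyword.

    build_prompt is a hypothetical helper for illustration only.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )
    speak.text = intro
    # A brief pause helps the caller separate the guidance from the keyword.
    pause = ET.SubElement(speak, "break", {"time": "300ms"})
    pause.tail = "say "
    emphasis = ET.SubElement(speak, "emphasis")
    emphasis.text = keyword
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_prompt("For the front desk, ", "Operator"))
```

Whether a given text-to-speech engine honors every element is platform dependent, so a prompt like this would still need to be listened to before deployment.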

6
Dialog Design
  • Consistency
      • Make sure the caller experience is uniform throughout the call
  • Barge-in
      • Design the dialog so that experienced users can interact faster
        by barging in over prompts
  • Using N-best
      • Exploit the N-best list to disambiguate in complex tasks
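
One way to picture N-best disambiguation is the sketch below: when the runner-up hypotheses score close to the top one, the dialog offers the caller a short list instead of guessing. The Hypothesis class, the choices_to_offer function, and the margin and limit values are all assumptions made for illustration; a real recognizer reports its N-best list in its own format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    text: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def choices_to_offer(nbest: List[Hypothesis], margin: float = 0.15,
                     limit: int = 3) -> List[str]:
    """Return the hypotheses whose confidence is within `margin` of the best.

    A single entry means the top result stands on its own; several entries
    mean the dialog should ask the caller to choose among them.
    """
    ranked = sorted(nbest, key=lambda h: h.confidence, reverse=True)
    if not ranked:
        return []
    best = ranked[0].confidence
    return [h.text for h in ranked if best - h.confidence <= margin][:limit]


if __name__ == "__main__":
    results = [
        Hypothesis("Boston", 0.58),
        Hypothesis("Austin", 0.62),
        Hypothesis("Houston", 0.20),
    ]
    options = choices_to_offer(results)
    if len(options) > 1:
        print("Did you say " + " or ".join(options) + "?")
```

The margin is a tuning parameter: a wider margin asks more clarifying questions, a narrower one risks acting on the wrong hypothesis.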

7
Grammar Design
  • Unknown pronunciations
  • Acoustic Confusability
  • Grammar Coverage
  • Complexity
  • Dynamic Application Development
  • Weighting of more common words
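
The weighting item can be sketched as follows: phrase counts from call logs are turned into relative weights on the alternatives of a grammar rule. The weight attribute on item elements comes from the W3C SRGS XML grammar format; the weighted_rule function and the counts shown are hypothetical, and how strongly a recognizer applies weights is engine specific.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def weighted_rule(rule_id: str, counts: Dict[str, int]) -> str:
    """Build an SRGS <rule> whose <item> weights are relative phrase frequencies.

    `counts` maps each phrase to how often callers said it (illustrative data).
    """
    total = sum(counts.values())
    if counts and total <= 0:
        raise ValueError("counts must contain at least one positive value")
    rule = ET.Element("rule", {"id": rule_id})
    one_of = ET.SubElement(rule, "one-of")
    # List the most common phrases first so the grammar is easy to review.
    for phrase, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        item = ET.SubElement(one_of, "item", {"weight": f"{n / total:.2f}"})
        item.text = phrase
    return ET.tostring(rule, encoding="unicode")


if __name__ == "__main__":
    print(weighted_rule("commands",
                        {"operator": 30, "main menu": 60, "goodbye": 10}))
```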

8
Tuning Voice Applications
  • Timeouts
  • Lexicons
  • Weighting
  • Confidence levels
  • Speed vs Accuracy
  • Sensitivity
  • Acoustic model adaptation
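
For the confidence-levels item, a common pattern is to split the confidence range into three bands: reject, confirm, and accept. The sketch below assumes confidence scores between 0.0 and 1.0; the function name classify_result and both threshold values are placeholders that would be tuned against real call data, not values from the presentation.

```python
def classify_result(confidence: float, reject_below: float = 0.35,
                    accept_at: float = 0.75) -> str:
    """Map a recognition confidence score to a dialog action.

    Scores below `reject_below` trigger a reprompt, scores from
    `reject_below` up to (but not including) `accept_at` trigger an
    explicit confirmation, and higher scores are accepted directly.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence < reject_below:
        return "reject"
    if confidence < accept_at:
        return "confirm"
    return "accept"


if __name__ == "__main__":
    for score in (0.20, 0.50, 0.90):
        print(score, classify_result(score))
```

Raising accept_at reduces false accepts at the cost of more confirmation turns; lowering reject_below reduces false rejects at the cost of confirming more doubtful results.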

9
WebSphere Voice Server Tuning Tools

10
Tuning tools features
  • Eclipse based
  • User-friendly graphical interface
  • Tightly integrated for repetitive testing
  • Assistive tools like pronunciation builder for
    tuning

11
Validating grammars on the MRCP Server

12
Enumerating a grammar (random)

13
Testing grammars with text

14
Pronunciation Builder

15
Testing grammars with speech

16
Voice Trace Analyzer for Tuning
  • Set the Voice Server trace specification.
  • Run your voice application (generate trace data).
  • Run the WVS Collector tool (optional for Integrated Runtime
    Environment).
  • Import the data into the Voice Trace Analyzer.

17
Voice Trace Analyzer Views

18
Transcriptions within tool
  • In Grammar
  • Accuracy
  • Correct Accept (CA)
  • Correct Reject (CR)
  • False Accept (FA)
      • FA-In
      • FA-Out
  • False Reject (FR)
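
The categories above can be tallied from transcribed utterances as sketched below. The slide does not define the terms, so the sketch assumes their usual meanings: CA is an in-grammar utterance accepted with the right result, FA-In is an in-grammar utterance accepted with the wrong result, FA-Out is an out-of-grammar utterance that was accepted, FR is an in-grammar utterance that was rejected, and CR is an out-of-grammar utterance that was rejected. The function names and the sample log are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def outcome(in_grammar: bool, accepted: bool, correct: bool) -> str:
    """Classify one utterance using the assumed definitions above.

    `correct` is only meaningful for accepted in-grammar utterances.
    """
    if accepted:
        if not in_grammar:
            return "FA-Out"
        return "CA" if correct else "FA-In"
    return "FR" if in_grammar else "CR"


def tally(utterances: Iterable[Tuple[bool, bool, bool]]) -> Counter:
    """Count outcomes over (in_grammar, accepted, correct) tuples."""
    return Counter(outcome(*u) for u in utterances)


if __name__ == "__main__":
    log = [
        (True, True, True),     # CA
        (True, True, False),    # FA-In
        (False, True, False),   # FA-Out
        (True, False, False),   # FR
        (False, False, False),  # CR
    ]
    print(dict(tally(log)))
```

Counts like these are what make threshold changes comparable from one tuning pass to the next.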

19
References
  • WebSphere Voice Server
    http://www.ibm.com/software/pervasive/voice_server
  • WebSphere Voice Server Information Center
    http://publib.boulder.ibm.com/infocenter/pvcvoice/51x/index.jsp
  • WebSphere Voice Zone
    http://www.ibm.com/developerworks/websphere/zones/voice
  • IBM WVS for Multiplatforms V5.1.1/V5.1.2 Handbook
    http://www.redbooks.ibm.com/abstracts/sg246447.html?Open
  • Speech User Interface Guide
    http://www.redbooks.ibm.com/redpieces/abstracts/redp4106.html?Open
