Multimedia Communications: The Voice User Interface - PowerPoint PPT Presentation

Slides: 13
Provided by: btcnet
Transcript and Presenter's Notes
1
Multimedia Communications: The Voice User Interface
mmc10.01
2
Dimensions of the Voice Interface
Input/Output Acoustics
Speech Compression
Automatic Speech Recognition (ASR)
Text-to-Speech Synthesis (TTS)
Speaker Verification
Natural Language Dialogue
mmc10.02
3
Applications of ASR
Voice Dialing
Voice-Controlled Answering Machine
Call Routing
Airline Information System
Data Entry, Form Filling
Network Agent
Keyboard Replacement
Internet Access
mmc10.03
4
Dimensions of ASR Performance
Recognition Error Rate
Size of Vocabulary
Permitted Fluency
Tolerance to Acoustics, Context, and Speaker
mmc10.04
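The slide names recognition error rate without defining it. One commonly used form is the word error rate: the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below assumes that definition and whitespace tokenization; the function name `word_error_rate` is illustrative and not taken from the presentation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("call home now", "call phone now")` counts one substitution against three reference words.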
5
Isolated Word Recognition
[Block diagram: Spoken Word Input -> Feature Measurement -> Pattern Similarity -> Decision Rule -> Recognized Pattern (Word); Word Reference Templates feed Pattern Similarity]
mmc10.05
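The stages on this slide (feature measurement, pattern similarity against word reference templates, decision rule) can be sketched as a small template matcher. The slide does not say which similarity measure is used; dynamic time warping (DTW) is assumed here because it is a common choice for comparing utterances of different lengths. The one-dimensional feature sequences, the template values, and the names `dtw_distance` and `recognize` are all hypothetical.

```python
from typing import Dict, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Pattern similarity: DTW cost between two feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b
                cost[i][j - 1],      # stretch a
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def recognize(features: Sequence[float],
              templates: Dict[str, Sequence[float]]) -> str:
    """Decision rule: pick the word whose template is closest."""
    return min(templates, key=lambda word: dtw_distance(features, templates[word]))


# Hypothetical word reference templates (made-up feature values).
TEMPLATES = {
    "yes": [1.0, 3.0, 4.0, 3.0, 1.0],
    "no": [5.0, 5.0, 2.0, 1.0],
}
```

A slower utterance such as `[1.0, 1.0, 3.0, 4.0, 4.0, 3.0, 1.0]` still aligns with the "yes" template, because DTW lets one template frame match several input frames.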
6
Hidden Markov Model
Number of urns
Number of states
Observation in each state
Color of ball drawn
Transition probabilities between states
Model of sequence
mmc10.06 Rabiner and Juang 84
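The urn-and-ball picture can be made concrete with a small sketch: each urn is a hidden state, the color of the ball drawn is the observation, and the transition probabilities decide which urn is used next. The code below uses the forward algorithm to compute the probability of an observed color sequence. All numbers, the two-urn setup, and the function name `forward_probability` are assumptions for illustration, not values from the slide.

```python
from typing import Dict, Sequence

# Hypothetical two-urn model (all probabilities are made up).
INITIAL = {"urn1": 0.5, "urn2": 0.5}
TRANSITION = {
    "urn1": {"urn1": 0.6, "urn2": 0.4},
    "urn2": {"urn1": 0.5, "urn2": 0.5},
}
EMISSION = {
    "urn1": {"red": 0.8, "blue": 0.2},
    "urn2": {"red": 0.3, "blue": 0.7},
}


def forward_probability(
    observations: Sequence[str],
    initial: Dict[str, float],
    transition: Dict[str, Dict[str, float]],
    emission: Dict[str, Dict[str, float]],
) -> float:
    """Probability of the observation sequence, summed over all urn paths."""
    if not observations:
        raise ValueError("need at least one observation")
    states = list(initial)
    # alpha[s] = P(observations so far, current state is s)
    alpha = {s: initial[s] * emission[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emission[s][obs] * sum(alpha[p] * transition[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())
```

Calling `forward_probability(["red", "blue"], INITIAL, TRANSITION, EMISSION)` scores one short color sequence; in a recognizer built this way, one such model per word would be scored and the highest-probability word chosen.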
7
Evolution of ASR Technology
[Figure: ASR capability over Time. Axes: Speaking Style (Isolated Words to Spontaneous Speech) and Vocabulary Size (2 to 20000)]
mmc10.07
8
Applications of TTS
Voice Announcements and Readouts
Dual Party Telephone Relay
Reverse Directory Assistance
Spoken E-mail and Fax
Talking Books
Internet Access
mmc10.08
9
Dimensions of TTS Performance
Intelligibility
Naturalness
Expressiveness
Customization of Voice Quality
Multiple-Language Capability
mmc10.09
10
TTS Block Diagrams
mmc10.10 Bell Labs 90
11
Evolution of TTS Technology
[Figure: Quality (Percent, up to 100) versus Year (1970 to 2000). Curves: Intelligibility, Prosody]
mmc10.11
12
Research Challenges in ASR and TTS
Phrase and Sentence Recognition
Synthesis of Paragraph-Length Input
Natural Language Human-Machine Dialogue
Automatic Language Translation
mmc10.12