John McCoey - PowerPoint PPT Presentation

Transcript and Presenter's Notes


1
Methods for Improving Readability of Speech Recognition Transcripts
  • John McCoey

2
What is a Speech Recognition Transcript?
  • Direct output from a Speech-to-Text Translation
    (STT) system
  • Uses
    • Everyday communication and word processing
    • Recording lectures and court cases
    • Assistive technology for the hearing impaired
      • Especially focusing on classroom settings

3
What is Readability?
  • Readability is not the same as word accuracy
  • Change of speaker, change of thought, accent,
    pauses, disfluent words, capitalization,
    punctuation
  • Example

Images from Jones, Douglas, et al., "Measuring the Readability of
Automatic Speech-to-Text Transcripts," Proc. Eurospeech,
pp. 1585-1588, 2003.

4
Measuring Readability
  • Definitive measure for scoring the Accuracy and
    Readability of a Speech-to-Text Transcript
  • Word Accuracy Percentage Score
    • (Words Spoken - Word Errors) / Words Spoken × 100
  • Readability Percentage Score
    • ((Words Spoken + Sentences + Speaker Changes) -
      (Word Errors + Sentence Errors + Speaker Change Errors)) /
      (Words Spoken + Sentences + Speaker Changes) × 100
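
The two scores above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function and parameter names are made up for this example, and the counts in the usage note are hypothetical, not measured results.

```python
def word_accuracy(words_spoken: int, word_errors: int) -> float:
    """Word Accuracy Percentage Score: share of spoken words transcribed correctly."""
    if words_spoken <= 0:
        raise ValueError("words_spoken must be positive")
    return 100 * (words_spoken - word_errors) / words_spoken


def readability(
    words_spoken: int,
    sentences: int,
    speaker_changes: int,
    word_errors: int,
    sentence_errors: int,
    speaker_change_errors: int,
) -> float:
    """Readability Percentage Score: also counts sentence and speaker-change errors."""
    total = words_spoken + sentences + speaker_changes
    errors = word_errors + sentence_errors + speaker_change_errors
    if total <= 0:
        raise ValueError("total scored items must be positive")
    return 100 * (total - errors) / total


# Hypothetical counts, chosen only to show the calculation.
accuracy_score = word_accuracy(words_spoken=200, word_errors=10)
readability_score = readability(
    words_spoken=200, sentences=12, speaker_changes=3,
    word_errors=10, sentence_errors=4, speaker_change_errors=1,
)
```

With these made-up counts the word accuracy is 95.0, while the readability score is roughly 93.0: missed sentence boundaries and speaker changes lower the score even though no additional words are wrong.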

R. Stuckless. "Recognition means more than just getting the words
right: Beyond accuracy to readability." Speech Technology,
Oct./Nov. 1999, pp. 30-35.

5
What Factors Negatively Affect Readability?
  • Searching: number of recognizable words
  • What if a word isn't recognized?
  • Discontinuous words, pauses, or unexpected
    changes of thought
  • Capitalization and punctuation
  • Implementation
  • Change in speaker

6
STT Systems and Algorithms
  • CMU Sphinx
    • Uses a large vocabulary and Hidden Markov Models
      to determine the probability of the next spoken word
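
As a rough illustration of how a Hidden Markov Model picks the most probable word sequence, here is a minimal Viterbi decoder over a toy model. It is not CMU Sphinx's implementation: the two-word vocabulary, the sound labels "s" and "b", and every probability are invented for this sketch.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[s] = (probability of the best path ending in state s, that path)
    best = {s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        step = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            prob, path = max(
                ((best[prev][0] * trans_p[prev][s], best[prev][1]) for prev in states),
                key=lambda item: item[0],
            )
            step[s] = (prob * emit_p[s][obs], path + [s])
        best = step
    return max(best.values(), key=lambda item: item[0])


# Toy model (all values hypothetical): hidden states are candidate words,
# observations are coarse sound labels heard by the recognizer.
states = ["speech", "beach"]
start_p = {"speech": 0.6, "beach": 0.4}
trans_p = {
    "speech": {"speech": 0.7, "beach": 0.3},
    "beach": {"speech": 0.4, "beach": 0.6},
}
emit_p = {
    "speech": {"s": 0.8, "b": 0.2},
    "beach": {"s": 0.1, "b": 0.9},
}

probability, path = viterbi(["s", "b", "b"], states, start_p, trans_p, emit_p)
```

For the observation sequence s, b, b the decoder returns the path speech, beach, beach. A real recognizer works with far larger vocabularies and acoustic models, so this only shows the shape of the computation.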

7
STT Systems and Algorithms
Mosur K. Ravishankar. "Efficient Algorithms for Speech Recognition."
Ph.D. Thesis, Technical Report CMU-CS-96-143, Computer Science
Department, Carnegie Mellon University, 1996.

8
VUST System
Image from Richard Kheir and Thomas Way. "Inclusion of Deaf Students
in Computer Science Classes using Real-Time Speech Transcription."
ITiCSE '07. Applied Computing Technology Laboratory, Department of
Computing Sciences, Villanova University, 2007.

9
Classroom Use
  • Real-time Text Display
    • Disadvantages?
  • Note-taking / Study Guide
    • Missed class
    • Review for later
  • Personal laptop connection
    • Accessed only by individuals who require access
      (hearing impaired)
    • Ability to save to a personal computer, again for
      use as a future study guide

10
Proposed Work
  • Incorporate Pauses in Training / DiBS
  • Pause Detection Software
    • Short
      • Comma, semicolon
    • Normal
      • End of sentence
    • Long
      • End of paragraph, change of speaker, etc.
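
The short / normal / long mapping above could be prototyped along these lines. This is a sketch under stated assumptions, not the proposed system: the thresholds (0.15 s, 0.4 s, 1.0 s), the function names, and the (word, start, end) input format are all hypothetical, and short pauses are always rendered as commas for simplicity.

```python
def classify_pause(seconds, min_pause=0.15, short_max=0.4, normal_max=1.0):
    """Map a silence length to None, 'short', 'normal', or 'long'.

    All thresholds are hypothetical; a real system would tune them per speaker.
    """
    if seconds < min_pause:
        return None  # too brief to count as a pause
    if seconds < short_max:
        return "short"
    if seconds < normal_max:
        return "normal"
    return "long"


def punctuate(timed_words):
    """Insert punctuation based on the gaps between (word, start, end) tuples."""
    parts = []
    capitalize = True  # the first word starts a sentence
    for i, (word, start, end) in enumerate(timed_words):
        text = word[:1].upper() + word[1:] if capitalize else word
        is_last = i + 1 == len(timed_words)
        # The final word always closes a sentence.
        kind = "normal" if is_last else classify_pause(timed_words[i + 1][1] - end)
        if kind == "short":
            text += ", "      # short pause: comma
        elif kind == "normal":
            text += ". "      # normal pause: end of sentence
        elif kind == "long":
            text += ".\n\n"   # long pause: end of paragraph
        else:
            text += " "
        capitalize = kind in ("normal", "long")
        parts.append(text)
    return "".join(parts).strip()


# Hypothetical word timings in seconds.
words = [
    ("today", 0.0, 0.4), ("we", 0.45, 0.6), ("begin", 0.65, 1.0),
    ("please", 1.3, 1.6), ("sit", 1.65, 1.9),
    ("open", 2.5, 2.8), ("books", 2.85, 3.2),
    ("next", 5.0, 5.3), ("topic", 5.35, 5.7),
]
transcript = punctuate(words)
```

Here `transcript` reads "Today we begin, please sit. Open books." followed by a paragraph break and "Next topic." Distinguishing an end of paragraph from a change of speaker would need more than pause length, so the sketch treats every long pause as a paragraph break.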

11
Any Questions?
  • John McCoey
  • CSC 3990-001
  • Villanova University
  • October 24, 2007