Automatic Speech Attribute Transcription (ASAT)

Provided by: LucentOMD5
1
Automatic Speech Attribute Transcription (ASAT)
  • Project Period: 10/01/04 to 9/30/08
  • The ASAT Team
  • Mark Clements (clements_at_ece.gatech.edu)
  • Sorin Dusan (sdusan_at_speech.rutgers.edu)
  • Eric Fosler-Lussier (fosler_at_cse.ohio-state.edu)
  • Keith Johnson (kjohnson_at_ling.ohio-state.edu)
  • Fred Juang (juang_at_ece.gatech.edu)
  • Larry Rabiner (lrr_at_caip.rutgers.edu)
  • Chin Lee (Coordinator, chl_at_ece.gatech.edu)
  • NSF HLC Program Director (mharper_at_nsf.gov)

2
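ASAT Paradigm and SoW
[Figure: ASAT paradigm block diagram with numbered components 1 to 5. Only the label of component 5 survives: Overall System Prototypes and Common Platform]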
3
Bank of Speech Attribute Detectors
  • Each detected attribute is represented by a time
    series (event)
  • Example: frame-based detector (output in 0-1,
    simulating posterior probability)
  • ANN-based attribute detectors
  • Example: nasal and stop detectors
  • Sound-specific parameters and feature detectors
  • Example: VOT for voiced/unvoiced (V/UV) stop
    discrimination
  • Biologically motivated processors and detectors
  • Analog detectors, short-term and long-term
    detectors
  • Perceptually motivated processors and detectors
  • Converting speech into neural activity level
    functions
  • Others?
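
A frame-based detector of the kind listed above can be sketched as follows. This is only an illustration, not the ASAT implementation: the two-dimensional features, the weights, and the names `detect_attribute` and `nasal_series` are hypothetical, and a single sigmoid unit stands in for a trained ANN. The point is the output format: one value between 0 and 1 per frame, forming a time series.

```python
import math


def sigmoid(x: float) -> float:
    """Squash an activation into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def detect_attribute(frames, weights, bias):
    """Score each feature frame; return one value in (0, 1) per frame."""
    scores = []
    for frame in frames:
        activation = bias + sum(w * x for w, x in zip(weights, frame))
        scores.append(sigmoid(activation))
    return scores


# Hypothetical 2-dimensional features per frame (made up for illustration).
frames = [[0.1, 0.0], [0.9, 0.8], [1.0, 0.9], [0.2, 0.1]]

# Assumed, untrained parameters for a "nasal" detector.
nasal_weights, nasal_bias = [4.0, 2.0], -3.0

nasal_series = detect_attribute(frames, nasal_weights, nasal_bias)
print([round(p, 2) for p in nasal_series])
```

A bank of detectors would run one such function per attribute (stop, nasal, vowel, and so on) over the same frames, giving one time series per attribute.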

4
An Example: More Visible than Spectrogram?
[Figure: detector outputs (Stop, XX, Nasal, Vowel) over the utterance "jve ding zii ji gong he guo de ming vn"]
Early acoustic-to-linguistic mapping!
5
Event Merger
  • Merge multiple time series into another time
    series
  • Maintaining the same detector output
    characteristics
  • Combine temporal events
  • Example: combining phones into words (word
    detectors)
  • Combine spatial events
  • Example: combining vowel and nasal features
    into nasalized vowels
  • Extreme: build a 20K-word recognizer by
    implementing 20K keyword detectors
  • Others: OOV, partial recognition
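
The two kinds of merging above can be sketched roughly as below. The slides do not say how events are combined, so the rules here are assumptions: the spatial merge multiplies frame-aligned scores (treating them as independent posteriors), and the temporal merge pairs the current score of the second event with the best score of the first event in a short preceding window. The names `merge_spatial`, `merge_temporal`, and `max_gap` are hypothetical. Both functions return a series in the same 0-1 range as their inputs.

```python
def merge_spatial(series_a, series_b):
    """Frame-wise product: both attributes present in the same frame."""
    if len(series_a) != len(series_b):
        raise ValueError("detector outputs must be time-aligned")
    return [a * b for a, b in zip(series_a, series_b)]


def merge_temporal(first, second, max_gap):
    """Score 'first, then second' at each frame.

    Uses the best `first` score in the max_gap frames before frame t,
    multiplied by the `second` score at frame t.
    """
    merged = []
    for t, score in enumerate(second):
        window = first[max(0, t - max_gap):t]
        merged.append(max(window, default=0.0) * score)
    return merged


# Made-up detector outputs, one value per frame.
vowel = [0.1, 0.8, 0.9, 0.7, 0.2]
nasal = [0.0, 0.1, 0.8, 0.9, 0.3]

nasalized_vowel = merge_spatial(vowel, nasal)            # spatial merge
vowel_then_nasal = merge_temporal(vowel, nasal, max_gap=2)  # temporal merge
print(nasalized_vowel)
print(vowel_then_nasal)
```

Because each merged output is again a 0-1 time series, mergers can be stacked, for example attributes into phones and phones into words.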

6
Evidence Verifier
  • Provide confidence measures for events and
    evidence
  • Utterance verification algorithms can be used
  • Output recognized evidence (words and others)
  • Hypothesis testing is needed in every stage
  • Prune event and evidence lattices
  • Pruning threshold decisions
  • Minimum verification error (MVE) verifiers
  • Many new theories can be developed
  • Others?
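
One common way to frame verification as hypothesis testing is a likelihood-ratio test against an anti-model, followed by threshold pruning. The sketch below assumes that form; the slides do not specify it. The log-likelihood values, the sigmoid mapping to a confidence, and the names `confidence` and `prune` are all hypothetical.

```python
import math


def confidence(target_loglik, anti_loglik):
    """Map the log-likelihood ratio (target vs. anti-model) into (0, 1)."""
    llr = target_loglik - anti_loglik
    return 1.0 / (1.0 + math.exp(-llr))


def prune(events, threshold):
    """Keep only events whose confidence reaches the pruning threshold."""
    kept = []
    for label, target_ll, anti_ll in events:
        score = confidence(target_ll, anti_ll)
        if score >= threshold:
            kept.append((label, score))
    return kept


# Made-up hypotheses: (label, target log-likelihood, anti-model log-likelihood).
events = [
    ("one", -10.0, -14.0),
    ("won", -12.0, -11.5),
    ("nine", -13.0, -13.0),
]

accepted = prune(events, threshold=0.5)
print([label for label, _ in accepted])
```

Raising the threshold trades false acceptances for false rejections, which is the decision the "pruning threshold" bullet refers to.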

7
Word and Phone Verifiers (/w/ /ʌ/ /n/: "one")
8
Knowledge Sources: Definition and Evaluation
  • Explore large body of speech science literature
  • Define training, evaluation and testing databases
  • Develop Objective Evaluation Methodology
  • Defining detectors, mergers, verifiers,
    recognizers
  • Defining/collecting evaluation data for all
  • Document all pieces on the web
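
As one possible objective measure for detectors and verifiers, frame-level miss and false-alarm rates can be computed against reference labels. This is an assumed metric chosen for illustration; the slides do not name the evaluation methodology. The scores, labels, and the function name `detection_error_rates` are hypothetical.

```python
def detection_error_rates(scores, labels, threshold):
    """Return (miss rate, false-alarm rate) for thresholded detector scores.

    labels: 1 where the attribute is truly present, 0 where it is absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    misses = 0
    false_alarms = 0
    for score, label in zip(scores, labels):
        detected = score >= threshold
        if label == 1 and not detected:
            misses += 1
        elif label == 0 and detected:
            false_alarms += 1
    miss_rate = misses / positives if positives else 0.0
    false_alarm_rate = false_alarms / negatives if negatives else 0.0
    return miss_rate, false_alarm_rate


# Made-up detector scores and reference labels, one per frame.
scores = [0.1, 0.7, 0.9, 0.4, 0.6, 0.2]
labels = [0, 1, 1, 1, 0, 0]

miss_rate, false_alarm_rate = detection_error_rates(scores, labels, threshold=0.5)
print(miss_rate, false_alarm_rate)
```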

9
Prototype ASR Systems and Platform
  • Continuous Phone Recognition: TIMIT?
  • Continuous Speech Recognition
  • Connected digit recognition
  • Wall Street Journal
  • Switchboard?
  • Establishment of a collaborative platform
  • Implementing divide-n-conquer strategy
  • Developing a user community

10
Summary
  • ASAT Goal: Go beyond the state of the art
  • ASAT Spirit: Work for team excellence
  • ASAT team member responsibilities
  • MAC: Event Fusion
  • SD: Perception-based processing
  • EF: Knowledge Integration (Event Merger)
  • KJ: Acoustic Phonetics
  • BHJ: Evidence Verifier
  • LRR: Attribute Detector
  • CHL: Overall