Notes on ICASSP 2004

Slides: 8
Provided by: Arthu61
Learn more at: http://www.cs.cmu.edu
1
Notes on ICASSP 2004
  • Arthur Chan
  • May 24, 2004

2
This Presentation (5 pages)
  • Brief notes on ICASSP 2004
  • NIST RT 04
  • Evaluation results
  • Other interesting things related to CALO

3
NIST RT 04 Meeting Transcription Headlines.
  • Meeting Transcription
    • A challenge to core technology, evaluation and
      resource preparation.
  • Core technology
    • Speaker Segmentation
    • Speech to Text (STT)
  • Evaluation
    • A new evaluation scheme has been devised for
      overlapped speech.
  • Resource preparation
    • LDC has a big headache in preparing the data.

4
Speaker Segmentation
  • Segmenting the speech
    • Search for the number of speakers.
    • Get speaker turns.
    • Measured by diarization error rate.
  • Insights (from ISL)
    • The more speakers, the harder the task.
    • A new measure called speaker speaking time
      entropy is proposed.
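
A minimal sketch of how a speaker speaking time entropy could be computed. The slide does not give ISL's definition, so this assumes the ordinary Shannon entropy (in bits) over each speaker's share of the total speaking time. The function name and the input format (a dict of seconds per speaker) are hypothetical, chosen for illustration only.

```python
import math


def speaking_time_entropy(seconds_per_speaker):
    """Shannon entropy (bits) of the speaking-time distribution.

    Assumption: each speaker's probability is their share of the total
    speaking time. This is an illustrative definition, not necessarily
    the one proposed by ISL.
    """
    total = sum(seconds_per_speaker.values())
    if total <= 0:
        raise ValueError("total speaking time must be positive")

    entropy = 0.0
    for seconds in seconds_per_speaker.values():
        if seconds > 0:  # silent speakers contribute nothing
            share = seconds / total
            entropy -= share * math.log2(share)
    return entropy


# One dominant speaker gives a low value; evenly shared time gives a high one.
lecture_like = speaking_time_entropy({"A": 540.0, "B": 30.0, "C": 30.0})
discussion_like = speaking_time_entropy({"A": 200.0, "B": 200.0, "C": 200.0})
```

Under this assumed definition, a meeting dominated by one speaker scores close to 0, while a meeting with N equally active speakers scores log2(N), which fits the observation that more speakers make the task harder.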

5
STT
  • Very hard task
  • ICSI and ISL use state-of-the-art technology
    • Constrained linear transform
    • Discriminative training (DT-MAP)
    • Speaker adaptive training
  • Individual headphone results: WER 34.8 for
    non-overlapping speech.
  • Some meetings are very hard: many people are
    speaking at the same time.
  • Trained on 4 different subsets of data; ICSI data
    is just one of them (70 of the total)
  • Insights
    • (ICSI) Feature-based techniques don't help too
      much.
    • Multiple distant microphone and microphone
      array techniques help.
  • Conclusion: we will also have a hard time.
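
For reference, the WER figure quoted above is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below shows that standard computation for a single, non-overlapping utterance. It is not the NIST scoring tool, and it assumes plain whitespace tokenization with no text normalization; the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate in percent for one reference/hypothesis pair.

    WER = (substitutions + deletions + insertions) / reference words,
    computed with a word-level Levenshtein distance.
    Assumption: words are separated by whitespace, no normalization.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edit distance between the reference prefix seen so
    # far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return 100.0 * previous[-1] / len(ref_words)


# One substitution ("starts" -> "start") and one insertion ("today")
# against a five-word reference.
wer = word_error_rate("the meeting starts at noon",
                      "the meeting start at noon today")
```

Because insertions count as errors, the value can exceed 100 when the hypothesis is much longer than the reference.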

6
Evaluation and Resource Preparation
  • Evaluation
    • Overlapped speech requires different schemes for
      evaluation.
    • Will require multiple string matching. (Details
      unknown yet.)
  • Resource Preparation
    • Currently, no tool can satisfy the need of
      transcribing multiple channels of speech with
      interaction.
    • Professional transcribers failed.

7
Other interesting news from ICASSP related to CALO
  • Project EARS
    • Lightly supervised training
    • 3000 hours of closed-captioned speech are used
    • Discriminative training is found to be useful for
      some sites.
  • Others