Accounting for STT Uncertainty in MDE


Transcript and Presenter's Notes

1
Accounting for STT Uncertainty in MDE
  • Dustin Hillard, Mari Ostendorf, Andreas Stolcke
  • SSLI Lab, University of Washington
  • ICSI and SRI International

2
Outline
  • Review
  • Why confusion networks for metadata?
  • N-best list decoding for metadata
  • Updated and New Eval Results
  • Moving to Lattice Decoding

3
Example: Using multiple ASR hypotheses
REF:      any easier for the president . -- The united states was set
1st Best: any easier for the president OF the united states . WHAT set
2nd Best: any easier for the president . -- The united states was set
3rd Best: AN easier for the president . UH The united states -- set
Should be an SU here.
Idea: If the detected SUs in several N-best hypotheses have high probability, then the combined score could provide a better solution than using only the 1-best.

4
SU Confusion Networks
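[Figure: SU confusion networks. Each N-best hypothesis ("president of the", "president -- the", "president uh the") is drawn as a word sequence in which every word is followed by an SU / no-event slot with a probability. The per-hypothesis networks are then merged into one confusion network whose word alternatives (of / -- / uh) and event slots carry combined scores. Numeric labels on the slide: 1, .4, .3, .3, .54; their positions in the diagram are not recoverable from the transcript.]
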
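The idea of combining SU evidence across hypotheses can be sketched as a weighted vote. This is a minimal illustration, not the authors' system: it assumes the hypotheses are already aligned to shared slots (as a confusion network would provide), and the function name `combine_su_posteriors`, the slot name, and all probabilities below are hypothetical.

```python
from collections import defaultdict


def combine_su_posteriors(hypotheses):
    """Combine per-hypothesis SU probabilities into one score per slot.

    hypotheses: list of (hypothesis_posterior, [(slot_id, p_su), ...]).
    Slots are assumed to be aligned across hypotheses already.
    """
    total = sum(weight for weight, _ in hypotheses)
    combined = defaultdict(float)
    for weight, events in hypotheses:
        for slot_id, p_su in events:
            # Each hypothesis votes in proportion to its normalized posterior.
            combined[slot_id] += (weight / total) * p_su
    return dict(combined)


# Hypothetical numbers: the 1-best misses the SU after "president",
# while the 2nd and 3rd hypotheses detect it with high probability.
nbest = [
    (0.4, [("after_president", 0.2)]),
    (0.3, [("after_president", 0.9)]),
    (0.3, [("after_president", 0.8)]),
]

scores = combine_su_posteriors(nbest)
# Decide SU wherever the combined probability exceeds 0.5.
decisions = {slot: p > 0.5 for slot, p in scores.items()}
print(scores, decisions)
```

With these made-up inputs the combined score is about 0.59, so an SU is placed after "president" even though the 1-best alone (0.2) would not place one.
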
5
CTS MDE Eval Results
  • Slot error rate (SER) results: (insertions + deletions)
    / truth, no subtype
  • N-best gives a 0.7% improvement over the 1-best system
    on the pruned list, but no gain relative to the
    unpruned 1-best
  • Differences from the 1-best system:
      • WER increase of 1-2% due to use of pruned N-best
        lists
      • Problems defining the turn feature in N-best lists
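The slot error rate defined on this slide can be written as a one-line computation. The sketch below assumes "truth" means the number of reference events and that subtype errors are not counted, as the slide states; the function name and the counts are hypothetical.

```python
def slot_error_rate(insertions, deletions, reference_slots):
    """SER = (insertions + deletions) / number of reference events."""
    if reference_slots <= 0:
        raise ValueError("reference_slots must be positive")
    return (insertions + deletions) / reference_slots


# Hypothetical counts, not results from the presentation.
ser = slot_error_rate(insertions=30, deletions=70, reference_slots=400)
print(f"SER = {ser:.1%}")
```

Because insertions are counted against the reference total, SER can exceed 100% when a system over-generates events.
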

6
Moving to Lattice Decoding
  • Prosodic Features for Lattices
      • Implemented new software for efficient
        computation of prosodic features over all
        lattice hypotheses
      • Decreases computational redundancy and
        allows for a broader search space
      • Provides processing speed-up by orders
        of magnitude
  • Decoding Metadata in the Lattice
      • Insert metadata after each word in the lattice
      • Prosodic and language model scores for each event
        node
      • Optimize score weights to reduce metadata error
      • Decode lattice with confusion networks
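The scoring and weight-tuning steps above can be illustrated with a small sketch. It makes several simplifying assumptions that are not in the slides: the lattice is reduced to a linear chain of event nodes, each node is decided independently, the two scores are combined log-linearly with a single weight, and the weight is chosen by grid search on labeled data. All names and probabilities are hypothetical.

```python
import math


def event_score(p_prosody, p_lm, lm_weight):
    """Log-linear mix of the prosodic and language model scores for one label."""
    return (1.0 - lm_weight) * math.log(p_prosody) + lm_weight * math.log(p_lm)


def decode(nodes, lm_weight):
    """Pick the highest-scoring label at each event node independently."""
    return [
        max(node, key=lambda label: event_score(*node[label], lm_weight))
        for node in nodes
    ]


def tune_weight(nodes, reference, grid):
    """Return the first grid weight that minimizes the number of event errors."""
    def errors(weight):
        return sum(h != r for h, r in zip(decode(nodes, weight), reference))
    return min(grid, key=errors)


# One event node after each word: label -> (prosodic score, LM score).
nodes = [
    {"SU": (0.7, 0.4), "no-event": (0.3, 0.6)},
    {"SU": (0.6, 0.1), "no-event": (0.4, 0.9)},
    {"SU": (0.9, 0.45), "no-event": (0.1, 0.55)},
]
reference = ["SU", "no-event", "SU"]

best = tune_weight(nodes, reference, grid=[0.0, 0.25, 0.5, 0.75, 1.0])
print(best, decode(nodes, best))
```

In this toy data neither knowledge source is right everywhere on its own, so an intermediate weight gives fewer event errors than either extreme. A real lattice decoder would search over paths rather than treat nodes independently.
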

7
Conclusions and Future Work
  • Using multiple hypotheses reduces SER
      • Previous reductions: 1% absolute for CTS SU, 3%
        absolute for CTS IP
      • New reductions: 0.7% for CTS SU
  • New Findings
      • HMM+Maxent improves N-best results over single
        models
      • Gains from newest (V6) training data also
        transfer
      • Gains from N-best have reduced as the 1-best SER
        and WER decreased
  • Future Work
      • Optimize lattice decoding for WER to investigate
        whether including metadata information can lower WER