a principled approach for rejection threshold optimization

1
a principled approach for rejection threshold optimization
  • Dan Bohus www.cs.cmu.edu/dbohus
  • Alexander I. Rudnicky www.cs.cmu.edu/air
  • Computer Science Department
  • Carnegie Mellon University
  • Pittsburgh, PA, 15217

2
understanding errors and rejection
  • systems often misunderstand
  • use confidence scores
  • common design pattern
  • compare input confidence against a threshold
  • reject utterance if confidence is too low
  • may lead to false rejections
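
The design pattern above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_outcome`, the sample turns, and the threshold value 0.5 are assumptions, not taken from the slides.

```python
def classify_outcome(confidence: float, understood_correctly: bool,
                     threshold: float) -> str:
    """Label one user turn given its confidence score and a rejection threshold."""
    accepted = confidence >= threshold
    if accepted:
        # Accepting a wrong hypothesis is a misunderstanding.
        return "correct acceptance" if understood_correctly else "misunderstanding"
    # Rejecting a correct hypothesis is a false rejection.
    return "false rejection" if understood_correctly else "correct rejection"


# Hypothetical (confidence, understood_correctly) pairs for illustration only.
turns = [(0.92, True), (0.71, False), (0.48, True), (0.35, False)]
threshold = 0.5  # assumed value

for confidence, ok in turns:
    print(f"{confidence:.2f} -> {classify_outcome(confidence, ok, threshold)}")
```

Raising the threshold turns some misunderstandings into correct rejections, but also turns some correct acceptances into false rejections.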

3
rejection tradeoff
  • misunderstandings vs. false rejections

[Figure: tradeoff between false rejections and misunderstandings]

4
rejection tradeoff
  • misunderstandings vs. false rejections
  • correctly vs. incorrectly transferred concepts

[Figure: correctly transferred concepts per turn vs. incorrectly transferred concepts per turn]

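
As a rough illustration of this view of the tradeoff, the sketch below computes correctly and incorrectly transferred concepts per turn at several thresholds. The data layout (confidence, number of correct concepts, number of incorrect concepts) and all values are assumptions made for the example; it also assumes that a rejected turn transfers no concepts.

```python
def transferred_per_turn(turns, threshold):
    """Return (CTC, ITC): correctly and incorrectly transferred concepts per turn.

    Each turn is (confidence, n_correct_concepts, n_incorrect_concepts).
    Only turns with confidence >= threshold transfer their concepts.
    """
    n = len(turns)
    ctc = sum(c for conf, c, _ in turns if conf >= threshold) / n
    itc = sum(i for conf, _, i in turns if conf >= threshold) / n
    return ctc, itc


# Hypothetical turns for illustration only.
turns = [(0.9, 2, 0), (0.6, 1, 1), (0.4, 1, 0), (0.2, 0, 2)]

for th in (0.0, 0.5, 1.0):
    ctc, itc = transferred_per_turn(turns, th)
    print(f"threshold={th:.1f}  CTC/turn={ctc:.2f}  ITC/turn={itc:.2f}")
```

Both quantities fall as the threshold rises, which is the tradeoff the plot describes.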
5
question
  • given this trade-off, how can we optimize the
    rejection threshold in a principled fashion?

6
outline
  • current solutions
  • proposed approach
  • data
  • results
  • conclusion

7
current solutions
  • follow the ASR manual [Nuance documentation]
  • acknowledge the tradeoff and postulate costs
  • misunderstandings are X times more costly than
    false rejections [Raymond et al., 2004; Kawahara
    et al., 2000; Cuayahuitl et al., 2002]
  • costs are likely to differ
  • across domains / systems
  • across dialog states within a system
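
A minimal sketch of the postulated-cost approach, assuming a cost of the form X * misunderstandings + false rejections and a small grid of candidate thresholds. The function names, the data, and the values of X are illustrative assumptions.

```python
def postulated_cost(turns, threshold, x):
    """Cost when a misunderstanding is assumed to be x times as costly as a false rejection."""
    misunderstandings = sum(1 for conf, ok in turns if conf >= threshold and not ok)
    false_rejections = sum(1 for conf, ok in turns if conf < threshold and ok)
    return x * misunderstandings + false_rejections


def best_threshold(turns, x, grid):
    """Pick the grid threshold with the lowest postulated cost."""
    return min(grid, key=lambda th: postulated_cost(turns, th, x))


# Hypothetical (confidence, understood_correctly) pairs for illustration only.
turns = [(0.9, True), (0.7, True), (0.6, False),
         (0.4, True), (0.3, False), (0.1, False)]
grid = [0.0, 0.2, 0.5, 0.8]

print(best_threshold(turns, x=0.5, grid=grid))
print(best_threshold(turns, x=3, grid=grid))
```

The selected threshold depends directly on the postulated X, which is why a value that suits one domain or dialog state may be poor in another.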

8
proposed approach
  • derive costs in a principled fashion

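2. choose a dialog performance metric: task completion (binary, kappa), TC
3. build a regression model: $\mathrm{logit}(TC) \leftarrow C_0 + C_{CTC} \cdot CTC + C_{ITC} \cdot ITC$
4. optimize the threshold to maximize performance: $th^{*} = \arg\max_{th} \left( C_{CTC} \cdot CTC(th) + C_{ITC} \cdot ITC(th) \right)$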
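
The optimization step can be sketched as follows, assuming the regression has already been fitted. The coefficient values, the candidate grid, and the turn data are hypothetical; in practice the coefficients would come from a logistic regression over per-session CTC, ITC, and task completion.

```python
import math

# Hypothetical fitted coefficients (assumptions, not values from the slides).
C0, C_CTC, C_ITC = -0.5, 2.0, -3.0


def transferred_per_turn(turns, threshold):
    """CTC and ITC per turn; rejected turns are assumed to transfer nothing."""
    n = len(turns)
    ctc = sum(c for conf, c, _ in turns if conf >= threshold) / n
    itc = sum(i for conf, _, i in turns if conf >= threshold) / n
    return ctc, itc


def utility(turns, threshold):
    """Threshold-dependent part of the model: C_CTC*CTC(th) + C_ITC*ITC(th)."""
    ctc, itc = transferred_per_turn(turns, threshold)
    return C_CTC * ctc + C_ITC * itc


def optimize_threshold(turns, grid):
    """Return the grid threshold that maximizes the utility."""
    return max(grid, key=lambda th: utility(turns, th))


def predicted_task_success(turns, threshold):
    """Inverse logit of C0 + utility, i.e. the modeled probability of task success."""
    return 1.0 / (1.0 + math.exp(-(C0 + utility(turns, threshold))))


# Hypothetical (confidence, n_correct, n_incorrect) turns for illustration only.
turns = [(0.9, 2, 0), (0.6, 1, 1), (0.4, 1, 0), (0.2, 0, 2)]
grid = [0.0, 0.3, 0.5, 0.7]

best = optimize_threshold(turns, grid)
print(best, round(predicted_task_success(turns, best), 3))
```

Because C0 does not depend on the threshold, maximizing the utility is equivalent to maximizing the modeled logit of task completion.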
9
state-specific costs
  • costs are different in different dialog states
  • CTC and ITC on a per-state basis
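  • $\mathrm{logit}(TC) \leftarrow C_0$
    $+\; C_{CTC,\text{state1}} \cdot CTC_{\text{state1}} + C_{ITC,\text{state1}} \cdot ITC_{\text{state1}}$
    $+\; C_{CTC,\text{state2}} \cdot CTC_{\text{state2}} + C_{ITC,\text{state2}} \cdot ITC_{\text{state2}}$
    $+\; C_{CTC,\text{state3}} \cdot CTC_{\text{state3}} + C_{ITC,\text{state3}} \cdot ITC_{\text{state3}}$
  • optimize a separate threshold for each state
  • $th^{*}_{\text{state\_x}} = \arg\max_{th} \left( C_{CTC,\text{state\_x}} \cdot CTC_{\text{state\_x}}(th) + C_{ITC,\text{state\_x}} \cdot ITC_{\text{state\_x}}(th) \right)$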
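
A sketch of the per-state optimization, under the assumption that each state has its own already-fitted pair of coefficients. The coefficient values and turns are hypothetical, and normalizing by the number of turns in each state is a choice made for this example (it does not change the argmax).

```python
# Hypothetical per-state coefficients (C_CTC, C_ITC); assumed, not from the slides.
COEFFS = {
    "open-request": (1.0, -4.0),
    "request(bool)": (2.0, -1.0),
}

# Hypothetical turns: (state, confidence, n_correct, n_incorrect).
turns = [
    ("open-request", 0.8, 2, 0),
    ("open-request", 0.5, 1, 1),
    ("open-request", 0.3, 0, 1),
    ("request(bool)", 0.8, 1, 0),
    ("request(bool)", 0.5, 1, 0),
    ("request(bool)", 0.3, 1, 1),
]


def state_utility(turns, state, threshold):
    """C_CTC*CTC(th) + C_ITC*ITC(th), computed only over turns in the given state."""
    c_ctc, c_itc = COEFFS[state]
    in_state = [t for t in turns if t[0] == state]
    accepted = [t for t in in_state if t[1] >= threshold]
    ctc = sum(t[2] for t in accepted) / len(in_state)
    itc = sum(t[3] for t in accepted) / len(in_state)
    return c_ctc * ctc + c_itc * itc


def per_state_thresholds(turns, grid):
    """Optimize one rejection threshold per dialog state."""
    return {
        state: max(grid, key=lambda th: state_utility(turns, state, th))
        for state in COEFFS
    }


grid = [0.0, 0.4, 0.6]
print(per_state_thresholds(turns, grid))
```

With these made-up numbers, the state where incorrect concepts are costly ends up with a high threshold, while the state where they are cheap ends up accepting everything.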

10
outline
  • current solutions
  • proposed approach
  • data
  • results
  • conclusion

11
data
  • collected using RoomLine
  • phone-based, mixed-initiative spoken dialog
    system
  • conference room reservations
  • Sphinx-2
  • utterance-level confidence annotator, scores in [0, 1]
  • 46 participants (first-time users)
  • 10 scenario-driven interactions each
  • corpus
  • 449 dialog sessions
  • 8278 user turns
  • correctness of decoded concepts manually labeled

12
roomline states
  • 71 dialog states total
  • clustered into 3 classes
  • open-request
  • How may I help you?
  • request(bool)
  • Would you like a reservation for this room?
  • Would you like a room with a projector?
  • request(non-bool)
  • For what time would you like to reserve the room?

13
results: task success model
  • model predicting binary task success

14
results: threshold optimization
[Figure: threshold optimization plot for the open-request state]

15
results: threshold optimization
  • utility profiles are different across the three
    states
  • task duration models lead to similar results

16
conclusion
  • principled method for optimizing rejection
    threshold
  • determine costs for various types of
    understanding errors
  • data-driven approach
  • can derive state-specific costs
  • bridge mismatches between off-the-shelf
    confidence annotators and domain

17
thank you
18
fit for task success model
19
expected changes in task success
Remains to be seen
20
task duration model
21
Model 2: resulting fit and coefficients
$R^2 = 0.56$
intro data collection rejection threshold