a principled approach for rejection threshold optimization

Description:

correctly vs. incorrectly transferred concepts. rejection tradeoff ... incorrectly transferred. 5 ... incorrectly transferred concepts per turn. utility = 0.55 ... – PowerPoint PPT presentation

Slides: 22

Provided by: danb7

Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

1
a principled approach for rejection threshold
optimization

Dan Bohus www.cs.cmu.edu/dbohus
Alexander I. Rudnicky www.cs.cmu.edu/air
Computer Science Department
Carnegie Mellon University
Pittsburgh, PA, 15217

2
understanding errors and rejection

systems often misunderstand
use confidence scores
common design pattern
compare input confidence against a threshold
reject utterance if confidence is too low
may lead to false rejections
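
The design pattern above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in any particular system: the function name `should_reject`, the example confidence scores, and the threshold value are all assumptions made for the example.

```python
def should_reject(confidence: float, threshold: float) -> bool:
    """Return True when the utterance confidence falls below the threshold."""
    return confidence < threshold


# Hypothetical confidence scores in [0, 1] and a hypothetical threshold.
THRESHOLD = 0.30
decisions = {score: should_reject(score, THRESHOLD) for score in (0.12, 0.30, 0.85)}
for score, rejected in decisions.items():
    print(f"confidence={score:.2f} -> {'reject' if rejected else 'accept'}")
```

Raising the threshold filters out more misrecognized utterances, but it also rejects more correctly recognized ones, which is where false rejections come from.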

3
rejection tradeoff

misunderstandings vs. false rejections

[figure: tradeoff between false rejections and misunderstandings]

4
rejection tradeoff

misunderstandings vs. false rejections
correctly vs. incorrectly transferred concepts

[figure: correctly transferred concepts / turn vs. incorrectly transferred concepts / turn]

5
question

given this trade-off, how can we optimize the
rejection threshold in a principled fashion?

6
outline

current solutions
proposed approach
data
results
conclusion

7
current solutions

follow the ASR manual [Nuance documentation]

acknowledge the tradeoff and postulate costs
misunderstandings are X times more costly than false rejections [Raymond et al., 2004; Kawahara et al., 2000; Cuayahuitl et al., 2002]
costs are likely to differ
across domains / systems
across dialog states within a system

8
proposed approach

derive costs in a principled fashion

2. choose a dialog performance metric: task completion (binary, kappa), TC
3. build a regression model: $\operatorname{logit}(TC) = C_0 + C_{CTC} \cdot CTC + C_{ITC} \cdot ITC$
4. optimize the threshold to maximize performance: $th^* = \arg\max_{th} \left( C_{CTC} \cdot CTC(th) + C_{ITC} \cdot ITC(th) \right)$

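
The threshold optimization step can be sketched as a simple sweep over candidate thresholds. This is a minimal sketch under stated assumptions, not the authors' code: the turn data, the coefficient values `c_ctc` and `c_itc`, and the helper names `transfer_rates` and `best_threshold` are hypothetical. In practice the coefficients would come from the fitted regression model and the turns from a labeled corpus.

```python
from typing import List, Tuple

# One turn: (confidence, correctly transferred concepts, incorrectly transferred concepts)
Turn = Tuple[float, int, int]


def transfer_rates(turns: List[Turn], th: float) -> Tuple[float, float]:
    """Average CTC and ITC per turn when turns with confidence below th are rejected."""
    ctc = sum(c for conf, c, _ in turns if conf >= th)
    itc = sum(i for conf, _, i in turns if conf >= th)
    return ctc / len(turns), itc / len(turns)


def best_threshold(turns: List[Turn], c_ctc: float, c_itc: float,
                   steps: int = 100) -> Tuple[float, float]:
    """Sweep thresholds in [0, 1]; return the first one maximizing the utility."""
    best_th, best_u = 0.0, float("-inf")
    for k in range(steps + 1):
        th = k / steps
        ctc, itc = transfer_rates(turns, th)
        utility = c_ctc * ctc + c_itc * itc
        if utility > best_u:
            best_th, best_u = th, utility
    return best_th, best_u


# Hypothetical labeled turns and hypothetical regression coefficients.
turns = [(0.9, 2, 0), (0.7, 1, 0), (0.4, 1, 1), (0.2, 0, 2), (0.1, 0, 1)]
th, utility = best_threshold(turns, c_ctc=1.0, c_itc=-2.0)
print(th, utility)
```

The intercept of the regression model is left out of the sweep because it does not depend on the threshold and so does not change where the maximum falls.
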
9
state-specific costs

costs are different in different dialog states
CTC and ITC on a per-state basis
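$\operatorname{logit}(TC) = C_0 + C_{CTC}^{\text{state}_1} \cdot CTC_{\text{state}_1} + C_{ITC}^{\text{state}_1} \cdot ITC_{\text{state}_1} + C_{CTC}^{\text{state}_2} \cdot CTC_{\text{state}_2} + C_{ITC}^{\text{state}_2} \cdot ITC_{\text{state}_2} + C_{CTC}^{\text{state}_3} \cdot CTC_{\text{state}_3} + C_{ITC}^{\text{state}_3} \cdot ITC_{\text{state}_3}$
optimize a separate threshold for each state
$th^*_{\text{state}_x} = \arg\max_{th} \left( C_{CTC}^{\text{state}_x} \cdot CTC_{\text{state}_x}(th) + C_{ITC}^{\text{state}_x} \cdot ITC_{\text{state}_x}(th) \right)$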
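
A per-state version of the threshold search might look like the sketch below. Everything in it is illustrative: the turns, the per-state coefficient pairs in `costs`, and the function name `optimize_per_state` are assumptions, not values or code from the presentation.

```python
from collections import defaultdict


def optimize_per_state(turns, costs, steps=100):
    """Pick one rejection threshold per dialog state.

    turns: list of (state, confidence, n_correct, n_incorrect)
    costs: {state: (c_ctc, c_itc)}, hypothetical regression coefficients
    """
    by_state = defaultdict(list)
    for state, conf, n_correct, n_incorrect in turns:
        by_state[state].append((conf, n_correct, n_incorrect))

    thresholds = {}
    for state, items in by_state.items():
        c_ctc, c_itc = costs[state]

        def utility(th):
            # Only turns at or above the threshold transfer concepts.
            kept = [(c, i) for conf, c, i in items if conf >= th]
            ctc = sum(c for c, _ in kept) / len(items)
            itc = sum(i for _, i in kept) / len(items)
            return c_ctc * ctc + c_itc * itc

        # max() keeps the first (lowest) threshold among ties.
        thresholds[state] = max((k / steps for k in range(steps + 1)), key=utility)
    return thresholds


# Hypothetical data: errors are assumed costlier in request(bool) than in open-request.
turns = [
    ("open-request", 0.8, 2, 0), ("open-request", 0.3, 1, 1), ("open-request", 0.1, 0, 1),
    ("request(bool)", 0.9, 1, 0), ("request(bool)", 0.5, 0, 1), ("request(bool)", 0.2, 1, 0),
]
costs = {"open-request": (1.0, -0.5), "request(bool)": (1.0, -3.0)}
thresholds = optimize_per_state(turns, costs)
print(thresholds)
```

With these made-up numbers the state with the larger penalty on incorrectly transferred concepts ends up with the higher threshold, which is the kind of difference a single global threshold cannot express.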

10
outline

current solutions
proposed approach
data
results
conclusion

11
data

collected using RoomLine
phone-based, mixed-initiative spoken dialog
system
conference room reservations
sphinx-2
utterance-level confidence annotator (scores between 0 and 1)
46 participants (first-time users)
10 scenario-driven interactions
corpus
449 dialog sessions
8278 user turns
manually labeled decoded concept correctness

12
roomline states

71 dialog states total
clustered into 3 classes
open-request
How may I help you?
request(bool)
Would you like a reservation for this room?
Would you like a room with a projector?
request(non-bool)
For what time would you like to reserve the room?

13
results: task success model

model predicting binary task success
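
A model of this form can be fitted with ordinary logistic regression. The sketch below uses plain gradient ascent on the log-likelihood in pure Python so that it stays self-contained. The session data are synthetic and chosen only so the fit is well behaved, and the names `fit_logit` and `cell` are hypothetical. It is not the fitting procedure or the data behind the reported results.

```python
import math


def fit_logit(rows, lr=1.0, epochs=5000):
    """Fit logit(TC) = c0 + c_ctc * CTC + c_itc * ITC by gradient ascent.

    rows: list of (ctc, itc, task_success) with task_success in {0, 1}.
    Returns (c0, c_ctc, c_itc).
    """
    w = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for ctc, itc, y in rows:
            x = (1.0, ctc, itc)
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of task success
            for j in range(3):
                grad[j] += (y - p) * x[j]
        # Step along the average gradient of the log-likelihood.
        w = [wi + lr * g / n for wi, g in zip(w, grad)]
    return tuple(w)


def cell(ctc, itc, successes, total=4):
    """Synthetic sessions sharing the same CTC and ITC values."""
    return [(ctc, itc, 1)] * successes + [(ctc, itc, 0)] * (total - successes)


# Synthetic sessions: success is more frequent with high CTC and low ITC.
rows = cell(1.0, 0.1, 3) + cell(1.0, 0.5, 2) + cell(0.5, 0.1, 2) + cell(0.5, 0.5, 1)
c0, c_ctc, c_itc = fit_logit(rows)
print(c0, c_ctc, c_itc)
```

On data like this the CTC coefficient comes out positive and the ITC coefficient negative, and those two coefficients are what the threshold optimization treats as the derived costs.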

14
results: threshold optimization

[figure: threshold optimization plot for the open-request state]

15
results: threshold optimization

utility profiles are different across the three
states
task duration models lead to similar results

16
conclusion

principled method for optimizing rejection
threshold
determine costs for various types of
understanding errors
data-driven approach
can derive state-specific costs
bridge mismatches between off-the-shelf
confidence annotators and domain

17
thank you

18
fit for task success model

19
expected changes in task success
Remains to be seen

20
task duration model

21
Model 2: resulting fit and coefficients
$R^2 = 0.56$
intro data collection rejection threshold
