Focus of Attention Based on Speech Recognizer Hypothesis - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Focus of Attention Based on Speech Recognizer Hypothesis
Michael Katzenmaier
Interactive Systems Labs, University of Karlsruhe
2
Outline

Motivation
Basics
Settings
Features
Methods of Classification
Experiments
Preliminary Testing of Features
Results and Comparison
Summary and Outlook

3
Motivation

Machines are increasingly involved in human interaction, e.g.
intelligent room with voice-operated equipment (lighting, video/audio, etc.)
household robot
home multimedia terminal

Improved usability through integration of human-machine communication (HM) into human-human communication (HH)
=> Requires determination of the focus of attention

4
Possible Input Modalities for Tracking Focus of Attention
Inputs: hypothesis of speech recognizer, eye gaze, state of dialogue, place of speaker, gesture, ...
Classifier output: conversation, human-human (HH), or commands, human-machine (HM)
5
In this work

restriction to the hypothesis of the speech recognizer
features extracted from hypothesis and transcription

6
Experiment Settings

data:
- approx. 10 min. of real collected dialogues
- some 100 text-only sentences (commands)
(2 persons, 1 room/robot)
for every turn X:
transcript X_Trans
N-gram recognizer hypothesis X_N-Gram
CFG recognizer hypothesis X_CFG
corresponding features
hand-crafted CFG customized for the Human Robots Project
N-gram recognizer: JANUS Verbmobil evaluation system
Stuttgart Neural Network Simulator (SNNS) for NN experiments

7
Features

sentence length of X_N-Gram: S(X) ∈ {1, 2, ...}
CFG-parseable: Z(X) ∈ {0, 1}
perplexity ∈ [0, ∞):
with Verbmobil LM: PerpHH1(X)
with VODIS LM: PerpHM1(X)
with transcribed HM monologs: PerpHM2(X)
with transcribed HH dialogs: PerpHH2(X)
correlation between X_N-Gram and X_CFG ∈ [0, 1]:
of words: Kwrd(X_CFG, X_N-Gram)
of letters: Kltr(X_CFG, X_N-Gram)
occurrence of the word "robot": R(X) ∈ {0, 1}
number of imperatives: I(X) ∈ {1, 2, ...}
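
A minimal sketch of how a few of these features could be computed from a pair of recognizer outputs. The slides do not define how the correlations Kwrd and Kltr are calculated, so `difflib.SequenceMatcher.ratio()` is used here purely as a stand-in similarity in [0, 1]; the function names and the example sentences are hypothetical.

```python
from difflib import SequenceMatcher


def sentence_length(hyp: str) -> int:
    """S(X): number of words in the N-gram hypothesis."""
    return len(hyp.split())


def robot_occurs(hyp: str) -> int:
    """R(X): 1 if the word 'robot' occurs in the hypothesis, else 0."""
    return int("robot" in hyp.lower().split())


def word_similarity(hyp_cfg: str, hyp_ngram: str) -> float:
    """Stand-in for Kwrd: similarity of the two word sequences, in [0, 1]."""
    return SequenceMatcher(None, hyp_cfg.split(), hyp_ngram.split()).ratio()


def letter_similarity(hyp_cfg: str, hyp_ngram: str) -> float:
    """Stand-in for Kltr: similarity of the two character strings, in [0, 1]."""
    return SequenceMatcher(None, hyp_cfg, hyp_ngram).ratio()


# Hypothetical recognizer outputs for one turn X (not from the presentation).
x_cfg = "robot bring me the cup"
x_ngram = "robot bring me a cup"

features = {
    "S": sentence_length(x_ngram),
    "R": robot_occurs(x_ngram),
    "Kwrd": word_similarity(x_cfg, x_ngram),
    "Kltr": letter_similarity(x_cfg, x_ngram),
}
```

The intuition suggested by the slides is that a command spoken to the machine should be recognized similarly by the CFG and the N-gram recognizer, so high similarity hints at HM.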

8
Methods of Classification

simple comparison (threshold, etc.)
perpHM(X) < ? > perpHH(X); hypothesis: perpHM(X ∈ HM) < perpHH(X ∈ HM)
K(X_CFG, X_N-Gram) > ? threshold t; hypothesis: X ∈ HM ⇒ K(X_CFG, X_N-Gram) > t
Bayes classifier
P(HM|X) < ? > P(HH|X)
Multilayer Perceptron
input: features perpHM(X), perpHH(X), S(X), K(X_CFG, X_N-Gram), Z(X)
output: HM or HH, i.e. P(HM|X) and P(HH|X)
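
The simple perplexity comparison can be sketched as follows. This is only an illustration under stated assumptions: the presentation uses the Verbmobil and VODIS language models, whereas the sketch trains two tiny add-one smoothed unigram models on hypothetical sentences, and all function names are invented for the example.

```python
import math
from collections import Counter


def train_unigram(sentences, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def perplexity(model, sentence):
    """exp of the average negative log-probability per word.

    Words outside the vocabulary raise KeyError in this toy version.
    """
    words = sentence.split()
    log_prob = sum(math.log(model[w]) for w in words)
    return math.exp(-log_prob / len(words))


def classify(sentence, lm_hm, lm_hh):
    """Decide HM if perpHM(X) < perpHH(X), otherwise HH."""
    if perplexity(lm_hm, sentence) < perplexity(lm_hh, sentence):
        return "HM"
    return "HH"


# Hypothetical training sentences (not the data used in the presentation).
hm_data = ["robot switch on the light", "robot bring the cup"]
hh_data = ["did you see the game", "i think you are right"]
vocab = sorted({w for s in hm_data + hh_data for w in s.split()})

lm_hm = train_unigram(hm_data, vocab)
lm_hh = train_unigram(hh_data, vocab)
```

With these toy models, `classify("robot bring the light", lm_hm, lm_hh)` yields `"HM"` and `classify("did you see the game", lm_hm, lm_hh)` yields `"HH"`.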
9
Preliminary Testing 1
[Figure: perplexities perpHH1, perpHH2, perpHM1 and perpHM2 for HH and HM turns, on transcript and on hypothesis (X_N-Gram)]
10
Preliminary Testing 2
[Figure: frequency distributions of the correlation of words and the correlation of letters for HH and HM turns]
11
Preliminary Testing 3
[Figure: frequency of sentence length (X_N-Gram) for HH and HM turns]
[Figure: "CFG parseable?" for HH and HM turns, on hypothesis and on transcript; extracted values: 56, 15, 13, 6]
12
Preliminary Testing 4
[Figure: two bar charts, "Imperatives intact?" and "Does robot occur?", for HH and HM turns on hypothesis (X_N-Gram) and on transcript; extracted values: 44, 13, 2, 2]
13
Simple Linear Classifier using Perplexity
LM          without data   with data
error       26             39
precision   63             75
recall      50             33

LM          without data   with data
error       26             39
precision   69             88
recall      50             39
14
Simple Linear Classifier using Correlation
[Figure: frequency of the correlation of words and of letters for HH and HM turns]
results     wrd   ltr
errors      20    25
precision   100   50
recall      25    13
threshold: t_wrd = 10, t_ltr = 50
15
Bayes Classifier

estimate two models p(e(X)|HH) and p(e(X)|HM) with Gaussian distribution
e(X) ⊆ {perpHM(X), perpHH(X), S(X), K(X_CFG, X_N-Gram), Z(X)}
estimate P(HH) and P(HM) by counting
classify: C = argmax over C ∈ {HH, HM} of P(e(X)|C) P(C)

The three best results:
e(X)        perp2(X)   perpHH1/2(X), S(X)   perp1/2(X), S(X), Z(X)
error       23         20                   28
precision   75         75                   47
recall      19         38                   56
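
The Bayes decision rule above can be sketched as follows. The sketch assumes independent per-feature Gaussians (a diagonal covariance), which the slides do not specify, and the feature vectors are hypothetical toy values rather than data from the experiments.

```python
import math


def fit_gaussian(rows):
    """Per-feature mean and variance (diagonal covariance is an assumption)."""
    n = len(rows)
    dims = range(len(rows[0]))
    means = [sum(r[d] for r in rows) / n for d in dims]
    variances = [
        max(sum((r[d] - means[d]) ** 2 for r in rows) / n, 1e-6) for d in dims
    ]
    return means, variances


def log_likelihood(e, params):
    """log p(e(X)|C) under independent Gaussians."""
    means, variances = params
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for x, m, v in zip(e, means, variances)
    )


def train(data):
    """data maps a class label to its list of feature vectors e(X).

    The priors P(HH) and P(HM) are estimated by counting.
    """
    total = sum(len(rows) for rows in data.values())
    return {
        label: (math.log(len(rows) / total), fit_gaussian(rows))
        for label, rows in data.items()
    }


def classify(e, model):
    """C = argmax over classes of log P(C) + log p(e(X)|C)."""
    return max(model, key=lambda c: model[c][0] + log_likelihood(e, model[c][1]))


# Hypothetical toy vectors e(X) = (perpHM(X), perpHH(X), S(X)).
toy = {
    "HM": [(40.0, 300.0, 4.0), (55.0, 280.0, 5.0), (35.0, 350.0, 3.0)],
    "HH": [
        (400.0, 90.0, 9.0),
        (350.0, 120.0, 12.0),
        (500.0, 80.0, 7.0),
        (420.0, 100.0, 10.0),
    ],
}
model = train(toy)
```

Working in log space avoids numerical underflow when several feature likelihoods are multiplied; the argmax is unchanged because the logarithm is monotonic.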
16
Multi-layer Perceptron
Used architectures:
A1 = PerpHH1(X), PerpHM1(X)
A2 = A1 + S(X), Z(X)
A3 = A2 + PerpHH2(X), PerpHM2(X)
A4 = A3 + K(X)
T1 = same as A3
T2 = A4 + R(X), I(X)
A1 to A4 on hypothesis (X_N-Gram); T1 and T2 on transcript

Results, values in extracted column order (column labels extracted as A1, T1, T2, A2, A3, A4):
error       23    23   13   13   33   18
precision   100   56   90   83   43   65
recall      12    56   56   62   81   69
Best results with 2, 4, 6 and 7 features.
17
Comparison of Methods
method    error   precision   recall
Com-PP    26      69          50
Correl.   20      100         25
Bayes     20      75          38
MLP       18      65          69
18
Summary and Outlook

MLP gives better results than all other methods
(Com-PP > Bayes > Correl.; decision criterion: F-measure)
Focus of attention on hypothesis only possible with 65% precision and 69% recall
results on transcript are better than on hypothesis
High expectations for further modalities, especially state of dialogue and gaze
=> diploma thesis (further work)
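
The ranking by F-measure can be checked with a few lines. The slides do not say which F-measure variant was used, so the balanced F-measure (harmonic mean of precision and recall) is an assumption here; the precision and recall values are those of the Comparison of Methods slide.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (harmonic mean); returns 0 if both inputs are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) in percent, taken from the comparison slide.
results = {
    "Com-PP": (69, 50),
    "Correl.": (100, 25),
    "Bayes": (75, 38),
    "MLP": (65, 69),
}

scores = {name: f_measure(p, r) for name, (p, r) in results.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under this assumption the resulting order is MLP, Com-PP, Bayes, Correl., which agrees with the ranking stated in the summary.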
