A Question of Questions: Prosodic Cues to Question Form and Function

1
A Question of Questions: Prosodic Cues to
Question Form and Function
  • Julia Hirschberg
  • (Joint work with)
  • Jennifer Venditti and Jackson Liscombe

2
Questioning in Dialogue
  • A fundamental activity in conversation
  • Elicit information
  • Elicit action
  • But
  • How to define a question?
  • Bolinger '57: fundamentally an attitude; an
    utterance that "craves" a verbal or other
    semiotic response
  • Ginzburg & Sag '00: the semantic object
    associated with the attitude of wondering and the
    speech act of questioning
  • How to identify a question as such?
  • How to represent its semantics? The intention of
    the questioner?

3
Distinguishing Question Form and Function
  • Questions may take many syntactic forms
  • Is it a question? What is a question? It's a
    question, isn't it? Is it a question or an
    answer? Right? It's a question?
  • Questions may serve many pragmatic functions
  • Clarification-seeking? Information-seeking?
    Confirmation-seeking?
  • Possible Indicators
  • Syntactic cues
  • Context
  • Intonation

4
Questions in Spoken Dialogue Systems
  • Goals
  • Examine question form and function
  • How are they related?
  • What features characterize them?
  • Identify form and function automatically in an
    Intelligent Tutoring domain

5
Previous Studies
  • Integrating a prosodic tree model with a
    word-based language model yields the best
    accuracy in detecting questions/question form
    (Shriberg et al. '98: English)
  • Some corpus-based (MapTask) studies have examined
    tune/accent types w.r.t. question function
    (Kowtko '96: Glaswegian English; Grice et al. '95:
    German, Italian, Bulgarian)
  • Studies of different types (functions) of
    clarification questions (Rodríguez &
    Schlangen '94: German; Edlund et al. '95: Swedish)
  • Our goal: a comprehensive quantitative analysis
    of question form and function in English which
    will permit question form/function identification
6
Domain: Intelligent Tutoring Systems
  • ITSs must be able to recognize both the form and
    function of student questions
  • Students ask human tutors many questions
  • More questions → better learning
  • Different question FORMs seek different
    information
  • e.g. polar questions seek a yes/no answer
  • wh-questions seek different information
  • Different question FUNCTIONs also often require
    different types of answers

7
  • Wh-questions, e.g.
  • Information-seeking
  • (S has just submitted an essay to the tutor)
  • S: Ok, what do you think about that?
  • T: Uh, well that uh you have uh there are too
    many parameters here which uh need definition ...
  • Clarification-seeking
  • T: So if there is if the only force on an object
    in earth's gravity then what is its motion
    called?
  • S: What was the motion called?
  • T: Yes, what's the name for this motion?

8
  • Yes-no questions, e.g.
  • Information-seeking → tutor provides additional
    information
  • Clarification → clarification subdialogue
  • Successful ITSs must be able to recognize the
    presence of a question in a student turn and its
    form and function

9
Question Corpus
  • Human-human tutoring dialogs collected by Litman
    et al. '04 for development of ITSpoke, a
    speech-enabled ITS designed to teach physics
  • Why2-Atlas (Kurt VanLehn (U. Pitt), Art Graesser
    (U. Memphis))
  • Corpus includes 1030 student questions
  • Question defined à la Bolinger '57 as an
    utterance that "craves" a response
  • 25.2 Qs/hour
  • 13.3% of total student speaking time
  • This study: a subset of 643 tokens

10
pr01_sess00_prob58
11
Question Detection
  • what symbol are you talking about
  • do i have to rewrite this again
  • am i ok with that
  • so it'd be one meter per second squared

12
Coding question type
  • Form coding based on surface syntax
  • Declarative question (dQ): It's a vector? A
    vector?
  • Yes-no question (ynQ): Is it a vector?
  • Wh-question (whQ): What is a vector?
  • Tag question (ynTAG): It's a vector, isn't it?
  • Alternative question (altQ): Is it a vector or a
    scalar?
  • Particle (part): Huh?
  • Function coding derived from Stenström '84
  • Confirmation-seeking check question (chk)
  • Clarification-seeking question (clar)
  • Information-seeking question (info)
  • Other (oth)

13
Form/Function Distribution
        chk     clar    info   oth    N     (%)
dQ      257     81      2      4      344   (53.5)
ynQ     53      80      27     5      165   (25.7)
whQ     -       47      21     -      68    (10.6)
ynTAG   41      5       -      -      46    (7.2)
altQ    6       5       1      -      12    (1.9)
part    -       8       -      -      8     (1.2)
N       357     226     51     9      643
(%)     (55.5)  (35.1)  (7.9)  (1.4)        (100)
14
Falling (L-L%) F0 contours
        chk     clar    info    oth     N     (%)
dQ      3       4       -       -       7     (2.0)
ynQ     -       4       5       2       11    (6.7)
whQ     -       12      17      -       29    (42.6)
ynTAG   1       1       -       -       2     (4.3)
altQ    2       5       1       -       8     (66.7)
part    -       -       -       -       -
N       6       26      23      2       57
(%)     (1.7)   (11.5)  (45.1)  (22.2)        (100)
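
The percentages in the falling-contour table appear to be the share of each form's tokens that end in a fall, that is, the falling count divided by that form's total in the Form/Function Distribution table. A minimal Python sketch of that arithmetic, using only counts from the two tables above (treating the "-" entry for particles as zero is an assumption):

```python
# Totals per form from the Form/Function Distribution table (column N).
total_by_form = {"dQ": 344, "ynQ": 165, "whQ": 68, "ynTAG": 46, "altQ": 12, "part": 8}

# Falling-contour counts per form (column N of the falling table).
# Assumption: the "-" shown for particles means zero falling tokens.
falling_by_form = {"dQ": 7, "ynQ": 11, "whQ": 29, "ynTAG": 2, "altQ": 8, "part": 0}


def falling_share(form: str) -> float:
    """Percentage of tokens of the given form that carry a falling contour."""
    return round(100.0 * falling_by_form[form] / total_by_form[form], 1)


if __name__ == "__main__":
    for form in total_by_form:
        print(f"{form:6s} {falling_share(form):5.1f}%")
```

Read this way, falls are rare for declarative and yes-no questions but common for wh- and alternative questions.
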
15
F0 measures of non-falling questions
  • Quantitative analysis of F0 height in the 573
    non-falling tokens w/sufficient data for analysis
  • Examined question nucleus (nucF0) and tail (btF0)
    only
  • Speaker-normalized (z-score) F0 of
  • 1. nuclear accent (nucF0)
  • 2. rightmost edge of question (btF0)
  • 3. difference between 1 and 2 (riserange)
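
The three measures above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which F0 samples define each speaker's mean and standard deviation, whether the population or sample standard deviation was used, or the sign convention for riserange (here taken as btF0 minus nucF0). The function names and the F0 values are hypothetical.

```python
from statistics import mean, pstdev


def zscore(value_hz, speaker_f0_hz):
    """Speaker-normalize one F0 value against that speaker's F0 samples.

    Assumption: population standard deviation over all of the speaker's samples.
    """
    mu = mean(speaker_f0_hz)
    sigma = pstdev(speaker_f0_hz)
    return (value_hz - mu) / sigma


def question_f0_measures(nuc_hz, bt_hz, speaker_f0_hz):
    """Return normalized nucF0, btF0, and their difference (riserange)."""
    nuc = zscore(nuc_hz, speaker_f0_hz)
    bt = zscore(bt_hz, speaker_f0_hz)
    # Assumption: riserange is the right-edge value minus the nuclear accent value.
    return {"nucF0": nuc, "btF0": bt, "riserange": bt - nuc}


if __name__ == "__main__":
    # Hypothetical speaker samples in Hz (mean 200, standard deviation 10).
    speaker = [190.0, 210.0, 190.0, 210.0]
    print(question_f0_measures(nuc_hz=210.0, bt_hz=230.0, speaker_f0_hz=speaker))
```

Normalizing per speaker lets F0 heights be compared across students with different pitch ranges.
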

16
Question Form and F0
  • DeclQs and YNQs both thought to rise (H* H-H% vs.
    L* H-H%?). Are there F0 height differences
    between them?
  • 2-way ANOVA on form x function
  • FORM: nucF0 F(5) = 19.34, p = 0
  • btF0 F(5) = 10.71, p = 0
  • riserange F(5) = 3.6, p < .01
  • Planned comparisons (Tukey, alpha = .01) show no
    difference between declarative Qs and yes-no Qs
  • Main effect of form caused by yes-no tags (low
    F0) and particles (high F0)

17
Normalized means at nucF0 and btF0
18
Question Function and F0
  • Question dialog acts thought to correlate with
    F0. Does question FUNCTION affect F0?
  • 2-way ANOVA on form x function
  • FUNCTION: nucF0 F(3) = 16.6, p = 0
  • btF0 F(3) = 8.56, p < .001
  • riserange F(3) = 3.94, p < .01
  • Main effect: planned comparisons show
  • clarQ > chkQ (nucF0, btF0)
  • infoQ > clarQ/chkQ (nucF0)
  • No interactions for any measure

19
Clarification types and F0
Clark '96 levels of coordination: sources of
communication problems
1 Channel: problem hearing if the tutor actually said something or not (Huh?, Hm?)
2 Perception: problem hearing what the tutor said (G as in God?, Did you say a word or a letter?), including reprise/echo questions (A what?)
3 Understanding: problem with reference resolution (This up here?, What did I imply or what does the statement imply?), or with general understanding (Is that the same thing or is that different?, What do you mean?)
4 Intention: problem determining what the tutor intended by his utterance (You want an exact number?, Uh are you asking me another characteristic of freefall?)
Non-interlocutor-related (NIR): problem understanding the task (Am I supposed to speak this or type it?), or clarification of the examination question (Should I assume both vehicles are going at the same speed?)
20
Effects of Clarification Type
  • One-way ANOVA combining levels 1 and 2 into a
    single acoustic/perceptual category
  • nucF0 F(3) = 5.41, p = .001
  • btF0 F(3) = 6.6, p < .001
  • riserange F(3) = 2.59, p = .05
  • Main effect for clarification type
  • Ranking for each measure
  • higher F0 >>> lower F0
  • acoust/percept > understanding > NIR > intention
  • Planned comparisons (Tukey, alpha = .01)
    show that the only significant comparison was
    acoust/percept > intention
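
For readers who want to see where an F(3) statistic of this kind comes from, here is a minimal pure-Python sketch of a one-way ANOVA over four groups (which gives 3 between-group degrees of freedom). The group values are hypothetical z-scores invented for illustration, not the study's data, and the function name is an assumption.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of observations

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical normalized nucF0 values per clarification type.
    groups = {
        "acoust/percept": [0.4, 0.5, 0.6],
        "understanding": [0.3, 0.4, 0.5],
        "NIR": [0.2, 0.3, 0.4],
        "intention": [0.1, 0.2, 0.3],
    }
    f_stat, df_b, df_w = one_way_anova_f(list(groups.values()))
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A significant F only says that some group means differ; the pairwise Tukey comparisons reported above are what identify which pairs do.
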

21
Can Prosody Distinguish Question Form? Question
Function?
  • Only a few question forms are prosodically
    distinct in our study; lexico/syntactic
    information can help
  • Question function more successfully
    differentiated prosodically where there is less
    reliable lexico/syntactic information
  • Can we use prosodic information with
    lexico-syntactic information to help identify
    question form and function automatically?

22
Detecting Student Questions
  • Syntax
  • Wh-words, subject/auxiliary inversion
  • Prosody
  • Phrase-final rising intonation (Pierrehumbert &
    Hirschberg '90)
  • Duration and pausing (Shriberg et al. '98)
  • Lexico-pragmatics
  • personal pronouns, utterance-initial pronouns
    (Geluykens 1987; Beun 1990)

23
Corpus
  • 141 ITSpoke dialogues
  • 5 hours of student speech
  • Student turns average 2.5 seconds
  • 1,030 questions
  • 25 questions per hour
  • 70% of turns consist entirely of the question
  • 89% of questions are turn-final

24
Question Form Distribution in ITSpoke
Form          Example                         Distr. (%)
yes/no        Is that right?                  24
wh-           What do you mean?               10
yes/no tag    It will stay the same, right?   7
alternative   Force or something?             3
particle      Huh?                            2
declarative   The weight?                     54
25
Question-Bearing Turns
  • Contain one or more questions
  • N = 918

26
Features Extracted
  • Prosodic
  • pitch
  • loudness
  • pausing
  • speaking rate
  • calculated over entire turn and last 200 ms
  • Syntactic
  • unigram and bigram part-of-speech tags

27
Feature Extraction
  • Lexical
  • unigrams and bigrams from hand-labeled
    transcriptions
  • Student and task dependent
  • pre-test score
  • gender
  • correctness
  • previous tutor dialogue act
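
The unigram and bigram features mentioned for both the part-of-speech and lexical streams can be sketched as simple n-gram counts. The function name and feature-key layout below are hypothetical, and the part-of-speech tags are hand-assigned for illustration only; the slides do not say which tagger or tag set was used.

```python
from collections import Counter


def ngram_features(tokens):
    """Count unigrams and adjacent-pair bigrams over a token sequence."""
    feats = Counter()
    for tok in tokens:
        feats[("uni", tok)] += 1
    # zip pairs each token with its right-hand neighbour.
    for left, right in zip(tokens, tokens[1:]):
        feats[("bi", left, right)] += 1
    return feats


if __name__ == "__main__":
    # Student turn taken from the Question Detection slide.
    words = "what symbol are you talking about".split()
    # Assumption: illustrative Penn-Treebank-style tags, assigned by hand.
    tags = ["WDT", "NN", "VBP", "PRP", "VBG", "IN"]

    lexical = ngram_features(words)
    syntactic = ngram_features(tags)
    print(lexical[("bi", "are", "you")], syntactic[("uni", "WDT")])
```

The same function serves both streams because only the token sequence changes.
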

28
Machine Learning Experiments
  • Question-bearing vs. non-question-bearing
  • Down-sampled to 50/50 distribution
  • Experimented by feature type
  • Adaboosted C4.5 decision trees
  • 5-fold cross validation
  • Best results with all features
  • Accuracy: 79.7%
  • Precision = Recall = F-measure = 0.8
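
The evaluation setup above (balanced classes, 5-fold cross-validation, precision/recall/F-measure) can be sketched in plain Python. This is not the authors' pipeline: the AdaBoosted C4.5 learner itself is omitted, all function names are hypothetical, and the folds here are contiguous and unshuffled for simplicity.

```python
import random


def downsample(examples, labels, seed=0):
    """Randomly drop majority-class items so both classes are equally frequent."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    keep = sorted(rng.sample(pos, n) + rng.sample(neg, n))
    return [examples[i] for i in keep], [labels[i] for i in keep]


def k_fold_indices(n_items, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_items) if i < start or i >= stop]
        yield train, test
        start = stop


def precision_recall_f(gold, predicted):
    """Precision, recall and balanced F-measure for the positive class (label 1)."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical labels: 1 = question-bearing turn, 0 = other turn.
    gold = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    pred = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
    print(precision_recall_f(gold, pred))
```

With a 50/50 class split, chance accuracy is 50%, which is the baseline against which the reported 79.7% should be read.
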

29
Accuracy by Feature Type
Feature type                          Accuracy (%)
prosody: pausing and speaking rate    52.6
student and task dependent            56.1
prosody: loudness                     61.8
syntactic                             65.3
lexical                               67.2
prosody: last 200 ms                  70.3
prosody: pitch                        72.6
prosody: all                          74.5
30
Feature Type Discussion
  • Which features most informative?
  • pitch slope of last 200 ms and entire turn
  • maximum and mean pitch of turn
  • Which features most often used in learning?
  • pre-test score
  • slope of last 200 ms
  • maximum pitch of entire turn
  • cumulative pause duration
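
One plausible way to compute a pitch-slope feature over the whole turn or over only its last 200 ms is a least-squares line fit to the F0 track. The slides do not specify the method, so the fit, the function name, the millisecond time base and the sample values below are all assumptions for illustration.

```python
def pitch_slope(times_ms, f0_hz, window_ms=None):
    """Least-squares slope of F0 in Hz per ms.

    If window_ms is given, only frames within that many milliseconds of the
    final frame are used (e.g. window_ms=200 for the end of the turn).
    """
    points = list(zip(times_ms, f0_hz))
    if window_ms is not None:
        cutoff = times_ms[-1] - window_ms
        points = [(t, f) for t, f in points if t >= cutoff]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two F0 frames to fit a slope")

    mean_t = sum(t for t, _ in points) / n
    mean_f = sum(f for _, f in points) / n
    # Standard simple-regression slope: covariance over variance of time.
    num = sum((t - mean_t) * (f - mean_f) for t, f in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    return num / den


if __name__ == "__main__":
    # Hypothetical F0 track: flat, then a final rise.
    times = [0, 100, 200, 300, 400, 500]
    f0 = [200.0, 200.0, 200.0, 200.0, 220.0, 240.0]
    print(pitch_slope(times, f0))                 # whole turn
    print(pitch_slope(times, f0, window_ms=200))  # last 200 ms only
```

In this toy track the final-window slope is steeper than the whole-turn slope, which is the kind of phrase-final rise these features are meant to capture.
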

31
Other Observations
  • Syntactic features were informative
  • personal pronoun + verb, wh-pronoun, interjection
  • Lexical features were informative
  • yes, right, what, I, you

32
Conclusions
  • Most questions in our tutoring corpus are
    declarative in form
  • More than syntax is needed to identify these as
    questions
  • Prosodic features are very important
  • Detecting question-bearing turns is possible
  • Detecting question function is needed

33
Question Forms in ITSpoke
Form          Distr. (%)   Example
declarative   54           The weight?
yes/no        24           Is that right?
wh-           10           What do you mean?
yes/no tag    7            It will stay the same, right?
alternative   3            Force or something?
particle      2            Huh?