Spoken Dialogue for Intelligent Tutoring Systems: Opportunities and Challenges



1
Spoken Dialogue for Intelligent Tutoring Systems: Opportunities and Challenges
  • Diane Litman
  • Computer Science Department
  • Learning Research and Development Center
  • University of Pittsburgh
  • HLT-NAACL 2006

2
Outline
  • Motivation and History
  • The ITSPOKE System and Corpora
  • Opportunities and Challenges
  • Performance Evaluation
  • Affective Reasoning
  • Discourse Analysis
  • Summing Up

3
What is Tutoring?
  • A one-on-one dialogue between a teacher and a
    student for the purpose of helping the student
    learn something.
  • (Evens and Michael 2006)
  • Human Tutoring Excerpt
  • Thanks to Natalie Person and Lindsay
    Sears, Rhodes College

4
Intelligent Tutoring Systems
  • Students who receive one-on-one instruction
    perform as well as the top two percent of
    students who receive traditional classroom
    instruction (Bloom 1984)
  • Unfortunately, providing every student with a
    personal human tutor is infeasible
  • Develop computer tutors instead

5
Tutorial Dialogue Systems
  • Why is one-on-one tutoring so effective?
  • ...there is something about discourse and
    natural language (as opposed to sophisticated
    pedagogical strategies) that explains the
    effectiveness of unaccomplished human tutors.
  • (Graesser, Person et al. 2001)
  • Working hypothesis regarding learning gains
  • Human Dialogue > Computer Dialogue > Text

6
Spoken Tutorial Dialogue Systems
  • Most human tutoring involves face-to-face spoken
    interaction, while most computer dialogue tutors
    are text-based
  • Can the effectiveness of dialogue tutorial
    systems be further increased by using spoken
    interactions?

7
A Brief History
  • 1970 - Mid 1980s
  • SCHOLAR (Carbonell)
  • WHY (Stevens and Collins)
  • SOPHIE (Burton and Brown)
  • Meno-Tutor (Woolf and McDonald)
  • Late 1980s - 1990s
  • CIRCSIM-Tutor (Evens, Michael and Rovick)
  • SHERLOCK II (Lesgold)
  • Unix Consultant (Wilensky et al. )
  • EDGE (Cawsey)
  • Currently
  • Why2-AutoTutor (Graesser et al.) (speech
    synthesis)
  • Why2-Atlas (VanLehn et al.)
  • CyclePad (Rose et al.)
  • Beetle (Moore et al.)
  • DIAG-NLG (Di Eugenio)
  • SCoT (Peters et al.) (spoken dialogue)
  • ITSPOKE (Litman et al.) (spoken dialogue)

8
Potential Benefits of Speech I
  • Self-explanation correlates with learning (Chi et
    al. 1994) and occurs more in speech (Hausmann and
    Chi 2002)
  • Tutor: The right side pumps blood to the lungs,
    and the left side pumps blood to the other parts
    of the body. Could you explain how that works?
  • Student 1 (self-explains): So the septum is a
    divider so that the blood doesn't get mixed up.
    So the right side is to the lungs, and the left
    side is to the body. So the septum is like a wall
    that divides the heart into two parts...it kind
    of like separates it so that the blood doesn't
    get mixed up...
  • Student 2 (doesn't self-explain): right side
    pumps blood to lungs

11
Potential Benefits of Speech II
  • Speech contains prosodic information, providing
    new sources of information about the student for
    dialogue adaptation (Fox 1993; Litman and
    Forbes-Riley 2003; Pon-Barry et al. 2005)
  • A correct but uncertain student turn
  • ITSPOKE: How does his velocity compare to that of
    his keys?
  • STUDENT: his velocity is constant

12
Potential Benefits of Speech III
  • Spoken computational environments may foster
    social relationships that may enhance learning
  • AutoTutor (Graesser et al. 2003)

13
Potential Benefits of Speech IV
  • Some applications inherently involve spoken
    language
  • Spoken Conversational Interface for Language
    Learning
  • Thanks to Stephanie Seneff, MIT and Cambridge
  • Reading Tutors (Mostow, Cole)
  • Others require hands-free interaction
  • Circuit Fix-It Shop (Smith 1992)

14
Why Should NLP Researchers Care?
  • Many reasons why tutoring researchers are
    interested in spoken dialogue
  • Why should spoken dialogue researchers become
    interested in tutoring?
  • Tutoring applications differ in many ways from
    typical spoken dialogue applications
  • Opportunities and Challenges!

15
Outline
  • Motivation and History
  • The ITSPOKE System and Corpora
  • Opportunities and Challenges
  • Performance Evaluation
  • Affective Reasoning
  • Discourse Analysis
  • Summing Up

16
  • Back-end is Why2-Atlas system (VanLehn et al.
    2002)
  • Sphinx2 speech recognition and Cepstral
    text-to-speech

19
Two Types of Tutoring Corpora
  • Human Tutoring
  • 14 students / 128 dialogues (physics problems)
  • 5948 student turns, 5505 tutor turns
  • Computer Tutoring
  • ITSPOKE v1
  • 20 students / 100 dialogues
  • 2445 student turns, 2967 tutor turns
  • ITSPOKE v2
  • 57 students / 285 dialogues
  • both synthesized and pre-recorded tutor voices

20
ITSPOKE Experimental Procedure
  • College students without physics
  • Read a small background document
  • Took a multiple-choice Pretest
  • Worked 5 problems (dialogues) with ITSPOKE
  • Took an isomorphic Posttest
  • Goal was to optimize Learning Gain
  • e.g., Posttest - Pretest

21
Outline
  • Motivation and History
  • The ITSPOKE System and Corpora
  • Opportunities and Challenges
  • Performance Evaluation
  • Affective Reasoning
  • Discourse Analysis
  • Summing Up

22
Predictive Performance Modeling
  • Opportunity
  • Spoken dialogue system evaluation methodologies
    can improve our understanding of how dialogue
    facilitates student learning (Forbes-Riley and
    Litman 2006)
  • Challenges
  • How to measure system performance?
  • What are predictive interaction parameters?

23
Predictive Performance Modeling
  • Understand why a spoken dialogue system fails or
    succeeds
  • PARADISE (Walker et al. 1997)
  • Measure parameters (interaction costs and
    benefits) and performance in a system corpus
  • Train model via multiple linear regression over
    parameters, predicting performance
  • System Performance = $\sum_{i=1}^{n} w_i \cdot p_i$
  • Test model on new corpus
  • Predict performance during future system design

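The train-then-predict steps above can be sketched in a few lines. This is a minimal illustration, not the PARADISE implementation: it fits the weights w_i by ordinary least squares in pure Python, it adds an intercept term w0 that the slide's formula does not show, and it skips the parameter normalization a real study would likely apply. The function names and the toy numbers are hypothetical.

```python
from typing import List, Sequence


def fit_linear_weights(params: Sequence[Sequence[float]],
                       performance: Sequence[float]) -> List[float]:
    """Least-squares fit of performance ~ w0 + sum_i w_i * p_i.

    Solves the normal equations (X^T X) w = X^T y by Gauss-Jordan
    elimination. Returns [w0, w1, ..., wn]; w0 is an assumed intercept.
    """
    rows = [[1.0, *map(float, p)] for p in params]  # prepend intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * y for r, y in zip(rows, performance))]
           for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("parameters are collinear; cannot fit weights")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(weights: Sequence[float], p: Sequence[float]) -> float:
    """Apply the fitted model to one dialogue's interaction parameters."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], p))


# Hypothetical toy corpus: two interaction parameters per dialogue.
train_params = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
train_performance = [1, 3, -2, 0, 2]
weights = fit_linear_weights(train_params, train_performance)
print(weights, predict(weights, (3, 2)))
```

In practice the fitted weights would be checked on a held-out corpus before being used to guide design decisions, as the last two bullets suggest.
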
24
Challenges
  • System Performance
  • Prior evaluations used User Satisfaction
  • Is Student Learning more relevant for the
    tutoring domain?
  • Interaction Parameters
  • Prior applications used Generic parameters
  • Are Task-Specific and Affective parameters also
    useful?

25
Findings
  • Using PARADISE to predict Learning
  • Posttest = .86 * Time + .65 * Pretest - .54 *
    Neutrals
  • Useful Predictors
  • Traditional parameters
  • e.g., Elapsed Time, Dialogue and Turn Length
  • New parameters
  • e.g., Affect, Correctness

26
Contrasts with Non-Tutorial Dialogue
  • User Satisfaction models are less useful
  • Tutoring systems are not designed to maximize
    User Satisfaction
  • Interaction parameters for learning
  • Posttest = .86 * Time + .65 * Pretest - .54 *
    Neutrals
  • longer dialogues are better
  • speech recognition problems don't seem to matter
  • lack of some types of affect is bad

29
Outline
  • Motivation and History
  • The ITSPOKE System and Corpora
  • Opportunities and Challenges
  • Performance Evaluation
  • Affective Reasoning
  • Discourse Analysis
  • Summing Up

30
Detecting and Responding to Student Affective States
  • Opportunity
  • Affective spoken dialogue system technology can
    improve student learning and other measures of
    performance (Aist et al. 2002; Pon-Barry et al.
    2006)
  • Challenges
  • What to detect?
  • How to respond?
  • Pedagogical versus spoken dialogue performance?

31
Monitoring Student State (motivation)
  • Tutor26: Let me ask my question in a more
    general way. What happens to the velocity of a
    body when there is no force acting on it?
  • Student27: dammit (ASR: it is)
  • Tutor28: Could you please repeat that?
  • Student29: same (ASR: i same)
  • Tutor30: Right. And what happens to the
    HORIZONTAL velocity of a body when there is no
    HORIZONTAL force acting on it?
  • Student31: zero (ASR: the zero)
  • Tutor32: Well... OK, let's consider how forces
    in the vertical direction affect the motion of a
    body in a horizontal direction in a different
    scenario
  • Student33: oh gravity you already said this (ASR:
    beats gravity you're exert less)
  • Tutor34: Fine. Are there any other forces acting
    on the apple as it falls?
  • Student35: no why are you doing this again (ASR:
    no y and to it yes)
  • Tutor36: Correct. The only force acting on the
    apple as it falls to the earth is the earth's
    gravitational force. In which direction does
    gravitational force act?
  • Student37: downward you computer (ASR: downward
    you computer)

32
Affective Spoken Dialogue Systems: Standard Methodology
  • Manual Annotation of Affect and Attitudes
  • Naturally-occurring spoken dialogue data (Ang et
    al. 2002; Lee et al. 2002; Batliner et al. 2003;
    Devillers et al. 2003; Shafran et al. 2003;
    Liscombe et al. 2005)
  • Prediction via Machine Learning
  • Automatically extract features from user turns
  • Use different feature sets (e.g. prosodic,
    lexical) to predict affect
  • Significant reduction of baseline error
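
One way to read "reduction of baseline error" is as a relative error reduction over a simple baseline. The sketch below assumes a majority-class baseline, which the slide does not specify, and uses hypothetical label counts and a hypothetical classifier error rate purely for illustration.

```python
from collections import Counter
from typing import Sequence


def majority_baseline_error(labels: Sequence[str]) -> float:
    """Error rate of always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return 1.0 - most_common_count / len(labels)


def relative_error_reduction(baseline_error: float, model_error: float) -> float:
    """Fraction of the baseline error that the model removes."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - model_error) / baseline_error


# Hypothetical annotated turns: 6 neutral, 3 negative, 1 positive.
turn_labels = ["neutral"] * 6 + ["negative"] * 3 + ["positive"]
baseline = majority_baseline_error(turn_labels)
# Hypothetical error rate of a classifier trained on prosodic and lexical features.
print(relative_error_reduction(baseline, model_error=0.3))
```
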

34
Challenge 1: What emotions to detect?
  • Communicator and Customer Care Systems
  • Negative: Angry, Annoyed, Frustrated, Tired
  • Positive/Neutral: Amused, Cheerful, Delighted,
    Happy, Serious
  • (Ang et al. 2002; Shafran et al. 2003; Lee and
    Narayanan 2005; Liscombe et al. 2005)
  • Tutorial Dialogue Systems
  • Negative: Angry, Annoyed, Frustrated, Bored,
    Confused, Uncertain, Contempt, Disgusted, Sad
  • Positive/Neutral: Certain, Curious, Enthusiastic,
    Eureka
  • (Litman and Forbes-Riley 2006; D'Mello et al.
    2006)

35
Example Student States in ITSPOKE
  • ITSPOKE: What else do you need to know to find
    the box's acceleration?
  • Student: the direction [UNCERTAIN]
  • ITSPOKE: If you see a body accelerate, what
    caused that acceleration?
  • Student: force [CERTAIN]
  • ITSPOKE: Good job. Say there is only one force
    acting on the box. How is this force, the box's
    mass, and its acceleration related?
  • Student: velocity [UNCERTAIN]
  • ITSPOKE: Could you please repeat that?
  • Student: velocity [ANNOYED]

36
Challenge 2: How to respond?
  • In tutoring, not all negatively-valenced emotions
    are bad!
  • While frustration/anger/annoyance is often
    frustrating
  • Frustration can also be an opportunity to learn
  • Example from AutoTutor
  • neutral → flow → confusion → frustration →
    neutral
  • Thanks to Sidney D'Mello and Arthur Graesser,
    University of Memphis

37
Example from ITSPOKE
  • ITSPOKE: What is the net force acting on the
    truck equal to?
  • Student: I don't know [Uncertain, Frustrated]
  • ITSPOKE: Since net force is defined as the
    vector sum of all the forces exerted on an
    object, let's first find the forces acting on
    the truck. Try to name them.
  • Student: The impact force, the normal force, and
    gravity [Certain, Neutral]
  • ITSPOKE: Yep.

38
Uncertainty is also a Learning Opportunity
  • Uncertainty represents one type of learning
    impasse (VanLehn et al. 2003)
  • An impasse motivates a student to take an
    active role in constructing a better
    understanding of the principle.
  • Uncertainty is also associated with cognitive
    disequilibrium (Craig et al. 2004)
  • A state of failed expectations causing
    deliberation aimed at restoring equilibrium
  • Uncertainty positively correlates with learning

39
Do Human Tutors Respond to Student Uncertainty?
  • A data-driven method for designing dialogue
    systems adaptive to student state (Forbes-Riley
    and Litman 2005)
  • extraction of dialogue bigrams from annotated
    human tutoring corpora
  • χ² analysis to identify dependent bigrams
  • generalizable to any domain with corpora labeled
    for user state and system response
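
The dependency test in this method can be sketched as a Pearson chi-square test of independence over a contingency table of bigram counts. The table layout (student state in rows, whether the tutor's next turn contains a given act in columns) and all counts below are hypothetical, chosen only to show the computation; the function assumes every row and column total is positive.

```python
from typing import Sequence, Tuple


def chi_square_independence(table: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical bigram counts.
# Rows: student turn labeled certain, uncertain, mixed, neutral.
# Columns: tutor's next turn has positive feedback, does not.
bigram_counts = [
    [60, 40],
    [45, 55],
    [20, 30],
    [50, 200],
]
statistic, dof = chi_square_independence(bigram_counts)
print(statistic, dof)
```

The statistic is then compared with the critical value for the table's degrees of freedom; for a 4 x 2 table like this one there are 3 degrees of freedom, where the critical value at p = .001 is 16.27.
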

40
Example Human Tutoring Excerpt
  • S: So the- when you throw it up the acceleration
    will stay the same? [Uncertain]
  • T: Acceleration uh will always be the same
    because there is- that is being caused by force
    of gravity which is not changing. [Restatement,
    Expansion]
  • S: mm-k. [Neutral]
  • T: Acceleration is it is in- what is the
    direction uh of this acceleration- acceleration
    due to gravity? [Short Answer Question]
  • S: It's- the direction- it's downward. [Certain]
  • T: Yes, it's vertically down. [Positive
    Feedback, Restatement]

41
Bigram Dependency Analysis
- Student Certainness - Tutor Positive Feedback Bigrams
- χ² = 225.92 (critical χ² value at p = .001 is 16.27)
43
Bigram Dependency Analysis (cont.)
- Less Tutor Positive Feedback after Student Neutral turns
- More Tutor Positive Feedback after Emotional turns
44
Findings
  • Statistically significant dependencies exist
    between students' state of certainty and the
    responses of an expert human tutor
  • After uncertain, tutor Bottoms Out and avoids
    expansions
  • After certain, tutor Restates
  • After mixed, tutor Hints
  • After any emotion, tutor increases Feedback
  • Dependencies suggest adaptive strategies for
    implementation in computer tutoring systems

45
Challenge 3: Pedagogical versus spoken dialogue performance?
  • Negative user emotions (e.g. frustration) are
    often associated with speech recognition problems
    (Boozer et al. 2003; Goldberg et al. 2003)
  • Is this also true in tutoring?
  • Speech recognition problems negatively correlate
    with user satisfaction (Walker et al. 2002;
    Pon-Barry et al. 2006)
  • Is this also true for learning?

46
Findings
  • Statistically significant dependencies exist
    between student state and speech recognition
    problems (Rotaru and Litman 2006)
  • Frustrated/Angry turns are rejected more than
    expected
  • Uncertain turns have more problems than expected
    (certain turns have less)
  • Incorrect turns have more problems than expected
    (correct turns have less)
  • Learning opportunities (e.g. uncertain and
    incorrect student states) have more speech
    recognition problems
  • However, speech recognition problems have not
    negatively correlated with learning (Litman and
    Forbes-Riley 2005; Pon-Barry et al. 2005)

47
Outline
  • Motivation and History
  • The ITSPOKE System and Corpora
  • Opportunities and Challenges
  • Performance Evaluation
  • Affective Reasoning
  • Discourse Analysis
  • Summing Up

48
Discourse Structure
  • Opportunity
  • Dialogues with tutoring systems have more complex
    hierarchical discourse structures compared to
    many other types of dialogues
  • Challenges
  • How can discourse structure be exploited in the
    context of spoken dialogue systems?

49
Exploiting Discourse Structure (Motivation)
  • Average ITSPOKE dialogue is 20 minutes
  • Student turns are hierarchically structured
  • Level 1: 1350 (57.3%)
  • Level 2: 643 (27.3%)
  • Level 3: 248 (10.5%)
  • Levels 4-6: 113 (4.8%)

50
Discourse Structure: Annotation and Transitions
  • Based on the Grosz and Sidner theory of discourse
    structure
  • Discourse segment → Discourse segment purpose
  • Hierarchy of discourse segments
  • Tutoring information encoded in a hierarchical
    structure
  • Human tutor manually authored dialogue paths for
    ITSPOKE
  • Automatic traversal of logs places utterances
    into the structure
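
A rough sketch of how transitions could be read off such a hierarchy follows. It assumes that each tutor question carries a dotted ID such as Q2.1 whose number of dots gives its depth, and it collapses the annotation into three coarse labels (Push, Pop, Advance). That is a simplification made for illustration: the actual scheme distinguishes finer-grained transitions such as PopUp and PopUpAdvance.

```python
from typing import List


def depth(question_id: str) -> int:
    """Nesting depth of an ID such as 'Q2.1' (top level is depth 1)."""
    return question_id.count(".") + 1


def label_transitions(question_ids: List[str]) -> List[str]:
    """Label each move between consecutive tutor questions by depth change."""
    labels = []
    for prev, curr in zip(question_ids, question_ids[1:]):
        if depth(curr) > depth(prev):
            labels.append("Push")      # entering a sub-dialogue
        elif depth(curr) < depth(prev):
            labels.append("Pop")       # returning to a higher level
        else:
            labels.append("Advance")   # next question at the same level
    return labels


# Hypothetical logged path through one dialogue.
print(label_transitions(["Q1", "Q2", "Q2.1", "Q2.2", "Q3"]))
```
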

51
ITSPOKE Behavior and Discourse Structure Annotation
[Figure: discourse structure with questions Q1, Q2, Q3 and sub-questions Q2.1, Q2.2]
52
Discourse structure transitions
53
Findings
  • Student correctness is predictive of student
    learning, but only after particular discourse
    transitions (Rotaru and Litman 2006)
  • e.g., After Pops (PopUp, PopUpAdvance)
  • incorrect turns negatively predict learning
  • correct turns positively predict learning
  • Student certainness is more predictive only after
    particular transitions

54
Findings (cont.)
  • While single discourse transitions are not
    predictive of learning, patterns in the discourse
    structure are
  • e.g., Advance-Advance and Push-Push both
    positively correlate with learning
  • Statistically significant dependencies exist
    between discourse transitions and speech
    recognition
  • e.g., after both Pushes and Pops, more
    misrecognitions

55
Outline
  • Motivation and History
  • The ITSPOKE System and Corpora
  • Opportunities and Challenges
  • Performance Evaluation
  • Affective Reasoning
  • Discourse Analysis
  • Summing Up

56
Summing Up I
  • Spoken Dialogue Systems are of great interest to
    researchers in Intelligent Tutoring
  • One-on-one tutoring is a powerful technique for
    helping students learn
  • Natural language dialogue contributes in a
    powerful way to the efficacy of
    one-on-one-tutoring
  • Using presently available NLP technology,
    computer tutors can be built and can serve as a
    valuable aid to student learning

58
Summing Up II
  • Intelligent Tutoring in turn provides many
    opportunities and challenges for researchers in
    Spoken Dialogue Systems
  • Performance Evaluation
  • Affective Reasoning
  • Discourse Analysis
  • and many more!
  • Initiative, Cohesion/Coherence, Dialogue Acts,
    Turn-Taking, Reinforcement Learning, User
    Simulation, Question-Answering

59
Acknowledgements
  • ITSPOKE group
  • Hua Ai, Kate Forbes-Riley, Alison Huettner,
    Beatriz Maeireizo-Tokeshi, Greg Nicholas, Amruta
    Purandare, Mihai Rotaru, Scott Silliman, Joel
    Tetreault, Art Ward
  • Columbia Collaborators: Julia Hirschberg, Jackson
    Liscombe, Jennifer Venditti
  • NLP@Pitt
  • Jan Wiebe, Rebecca Hwa, Wendy Chapman, Paul
    Hoffmann, Behrang Mohit, Carol Nichols, Swapna
    Somasundaran, Theresa Wilson, Chenhai Xi
  • Why2-Atlas and Human Tutoring groups
  • Kurt VanLehn, Pam Jordan, Uma Pappuswamy, Carolyn
    Rose
  • Micki Chi, Scotty Craig, Bob Hausmann,
    Margueritte Roy
  • Art Graesser, Natalie Person, Sidney D'Mello,
    Lindsay Sears
  • Stephanie Seneff
  • Martha Evens

60
Thank You!
  • Questions?
  • Further Information
  • http://www.cs.pitt.edu/litman/itspoke.html
  • And in September, come to Pittsburgh for
    Interspeech 2006!