Title: Recognizing Authority in Dialogue with an Integer Linear Programming Constrained Model
1 Recognizing Authority in Dialogue with an Integer Linear Programming Constrained Model
- Elijah Mayfield
- Computational Models of Discourse
- February 9, 2011
2 Outline
- Goal of Negotiation Framework
- Comparison to other NLP tasks
- Our coding scheme for Negotiation
- Computational modeling
- Results and Conclusion
3 Goal
- How can we measure how speakers position themselves as information givers/receivers in a discourse?
- Several related questions:
- Initiative/Control
- Speaker Certainty
- Dialogue Acts
4 Initiative and Control
- Tightly related concepts from turn-taking research
- Conveys who is being addressed and who is starting discourse segments
- Does not account for authority over content, just over discourse structure
5 Speaker Certainty
- Measures a speaker's confidence in what they are talking about
- Captures a speaker's self-evaluation of knowledge and authority over content
- Does not model interaction between speakers
6 Dialogue Acts
- Separates utterances into multiple categories based on discourse function
- Covers concepts from both utterance content and discourse structure
- Overly general and difficult to separate into high/low authority tags
7 The Negotiation Framework
- Labels moves in dialogue based on:
- Authority (primary vs. secondary)
- Focus (action vs. knowledge)
- Interactions over time (delays and followups)
- We must maintain as much insight as possible from Negotiation while making these analyses fully automatic.
8 The Negotiation Framework
- In the original framework, lines of dialogue can be marked as:

             Knowledge   Action
  Primary    K1          A1
  Secondary  K2          A2

  Delay   Standard   Followup
  dX      X          Xf
  cl      ch         tr
  rcl     rch        rtr

  ...and more in other research
9 The Negotiation Framework
- With these codes, dialogue can be examined at a very fine-grained level
10 The Negotiation Framework
- But these codes are always applied by the researcher's intuition.
- Many interpretations exist, depending on the context and the researcher's goals.
- Quantitative measures of reproducibility between analysts are not highly valued.
11 Computationalizing Negotiation
- We developed a consistent coding manual for a pared-down Negotiation.
- Consulted with sociocultural researchers, education researchers, sociolinguists, computational linguists, computer scientists, interaction analysts, learning scientists, etc.
- Also consulted with James Martin, the researcher most associated with this framework.
12 Computationalizing Negotiation
Code   Meaning            Example
K1     Primary Knower     This is the end.
K2     Secondary Knower   Is this the end?
A1     Primary Actor      I'm going to the end.
A2     Secondary Actor    Go to the end.
ch     Challenge          I don't have an end marked.
o      Other              So
13 Computationalizing Negotiation
- These codes are more complex than equivalent surface structures such as statement/question/command.
Speaker Example Surface Code
Giver Ready? Question o
Giver You should go to the bridge. Statement A2
Follower I should go to the bridge. Statement o
Giver The bridge. Fragment o
Follower Right. Fragment A1
14 Computationalizing Negotiation
- Our coding also has a notion of sequences in discourse.
Speaker Text Code
Giver Have you got farmed land? K2
Follower No. K1
Follower Have I got to follow the babbling brook? K2
Giver Not yet. K1
Giver Further down you've got to cross at the fork. A2
Follower Oh I see, okay. A1
Giver Right. o
15 Computationalizing Negotiation
- Thus our simplified model goes from over twenty codes to six.
- In parallel is a binary same/new segmentation problem at each line.
- Inter-rater reliability for coding this by hand reached kappa above 0.7.
16 Results from Manual Coding
- We first checked to see whether our simplified coding scheme is useful.
- Defined the Authoritativeness Ratio (computation sketched below) as:

  Authoritativeness Ratio = (K1 + A2) / (K1 + K2 + A1 + A2)

- Looked for correlation with other factors.
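A minimal sketch of this computation (the function name and example labels are hypothetical; o and ch moves are excluded, matching the denominator above):

  from collections import Counter

  def authoritativeness_ratio(labels):
      # labels: list of Negotiation codes for one speaker, e.g. ["K1", "o", "A2"].
      counts = Counter(labels)
      denom = sum(counts[c] for c in ("K1", "K2", "A1", "A2"))
      if denom == 0:
          return 0.0  # assumption: a speaker with no K/A moves gets ratio 0
      return (counts["K1"] + counts["A2"]) / denom

  print(authoritativeness_ratio(["K1", "A2", "K1", "K2", "o"]))  # 0.75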
17 Results from Manual Coding
- First test: Cyber-bullying
- Corpus: 36 conversations, each between two sixth-grade students
Speaker Text Bullying
zoo bitch i sed hold on!!
zoo lol
donan NO IM NOT GONNA RELAZ DAMN LOL
Shia Hold on donan
Shia Relax
donan BITE ME LOL
baby omg zoo please stop
18 Results from Manual Coding
- First test: Cyber-bullying
- Corpus: 36 conversations, each between two sixth-grade students
- 18 pairs of students, each with two conversations over two days
- Results:
- Bullies are more authoritative than non-bullies (p < .05)
- Non-bullies become less authoritative over time (p < .05)
19 Results from Manual Coding
- Second test: Collaborative Learning
- 54 conversations, each between 2 sophomore Engineering undergraduates
- Results:
- Authoritativeness is correlated with learning gains from tutoring (R² = 0.41, p < .05)
- Authoritativeness has a significant interaction with self-efficacy (R² = 0.12, p < .01)
20 Results from Manual Coding
- We have evidence that our coding scheme tells us something useful.
- Now, can we automate it?
21 Computational Modeling
- 20 dialogues coded from the MapTask corpus
Code   Meaning            Count   %
K1 Primary Knower 984 22.5
K2 Secondary Knower 613 14.0
A1 Primary Actor 471 10.8
A2 Secondary Actor 708 16.2
ch Challenge 129 2.9
o Other 1469 33.6
Total 4374 100
22 Computational Modeling
- Baseline model: bag-of-words SVM (a rough sketch follows below)
- Advanced model adds features:
- Bigrams and part-of-speech bigrams
- Cosine similarity with the previous utterance
- Previous utterance label (on-line prediction)
- Separate segmentation models for short (1-3 word) and long (4+ word) utterances
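As an illustration only (scikit-learn here is an assumption, not the toolkit used in this work), a bag-of-words SVM baseline of this kind might look as follows; the utterances and labels are hypothetical:

  from sklearn.feature_extraction.text import CountVectorizer
  from sklearn.pipeline import make_pipeline
  from sklearn.svm import LinearSVC

  # Hypothetical data: one Negotiation label per utterance.
  utterances = ["have you got farmed land", "no", "go to the bridge"]
  labels = ["K2", "K1", "A2"]

  # Baseline: unigram bag-of-words features fed to a linear SVM.
  # The advanced model would add bigrams (ngram_range=(1, 2)),
  # part-of-speech bigrams, cosine similarity with the previous
  # utterance, and the previously predicted label as features.
  baseline = make_pipeline(CountVectorizer(ngram_range=(1, 1)), LinearSVC())
  baseline.fit(utterances, labels)
  print(baseline.predict(["is this the end"]))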
23 Computational Modeling
- At each line of dialogue, we must select a label from {K1, K2, A1, A2, o, ch}
- We can also build a segmentation model to select from {new, same}
- But how does this segmentation affect the classification task?
24 Constraint-Based Approach
- Remember that our coding has been segmented into sequences based on rules in the coding manual
- We can impose these expectations on our model's output through Integer Linear Programming.
25 Constraint-Based Approach
- We now jointly optimize the assignment of labels and segmentation boundaries (see the ILP sketch below).
- When the most likely label is overruled, the model must choose to:
- Back off to the most likely allowed label, or
- Start a new sequence, based on the segmentation classifier.
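A minimal sketch of how label assignment under one such constraint can be posed as an integer linear program, assuming per-utterance classifier scores, a single fixed sequence (no segmentation variables), and the PuLP library; this is an illustration, not the toolkit or formulation used in this work:

  import pulp

  LABELS = ["K1", "K2", "A1", "A2", "o", "ch"]

  # Hypothetical classifier scores for a three-utterance sequence.
  scores = [
      {"K1": 0.6, "K2": 0.3, "A1": 0.0, "A2": 0.0, "o": 0.1, "ch": 0.0},
      {"K1": 0.1, "K2": 0.7, "A1": 0.0, "A2": 0.0, "o": 0.2, "ch": 0.0},
      {"K1": 0.8, "K2": 0.1, "A1": 0.0, "A2": 0.0, "o": 0.1, "ch": 0.0},
  ]
  n = len(scores)

  prob = pulp.LpProblem("negotiation_labels", pulp.LpMaximize)
  # y[i][l] = 1 iff utterance i receives label l.
  y = pulp.LpVariable.dicts("y", (range(n), LABELS), cat="Binary")

  # Objective: total classifier score of the chosen labels.
  prob += pulp.lpSum(scores[i][l] * y[i][l] for i in range(n) for l in LABELS)

  # Exactly one label per utterance.
  for i in range(n):
      prob += pulp.lpSum(y[i][l] for l in LABELS) == 1

  # Example constraint (within one sequence): a primary move (K1/A1)
  # may not occur before a secondary move (K2/A2).
  for i in range(n):
      for j in range(i + 1, n):
          for prim in ("K1", "A1"):
              for sec in ("K2", "A2"):
                  prob += y[i][prim] + y[j][sec] <= 1

  prob.solve(pulp.PULP_CBC_CMD(msg=False))
  print([next(l for l in LABELS if y[i][l].value() > 0.5) for i in range(n)])

Here the second utterance's preferred K2 forces the first utterance away from its preferred K1; the solver backs off to the highest-scoring allowed combination.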
26 Constraint-Based Approach
- We use a toolkit that allows us to define constraints as Boolean statements.
- These constraints define things that must be true in a correctly labeled sequence.
- These correspond to rules defined in our human coding manual.
27 Constraint-Based Approach
- Constraints
- In a sequence, a primary move cannot occur before a secondary move (formalized below).
Key: u_i = the ith utterance in the dialogue; s = the sequence containing u_i; u_i^l = the label assigned to u_i; u_i^s = the speaker of u_i.
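One way to write this rule in the key's notation (a sketch of the logical form, not necessarily the exact encoding used in the system):

  \forall u_i, u_j \in s,\; i < j:\quad \neg\bigl(u_i^l \in \{\mathrm{K1},\mathrm{A1}\} \,\wedge\, u_j^l \in \{\mathrm{K2},\mathrm{A2}\}\bigr)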
28 Constraint-Based Approach
- Constraints
- In a sequence, action moves and knowledge moves cannot both occur (formalized below).
Key: u_i = the ith utterance in the dialogue; s = the sequence containing u_i; u_i^l = the label assigned to u_i; u_i^s = the speaker of u_i.
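A corresponding sketch in the same notation (again an illustrative formalization, not the system's exact constraint):

  \forall u_i, u_j \in s:\quad \neg\bigl(u_i^l \in \{\mathrm{K1},\mathrm{K2}\} \,\wedge\, u_j^l \in \{\mathrm{A1},\mathrm{A2}\}\bigr)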
29 Constraint-Based Approach
- Constraints
- Non-contiguous primary moves cannot occur in the same sequence (formalized below).
Key: u_i = the ith utterance in the dialogue; s = the sequence containing u_i; u_i^l = the label assigned to u_i; u_i^s = the speaker of u_i.
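An illustrative formalization (sketch only), with P = {K1, A1} denoting primary labels: any utterance between two primary moves in a sequence must itself be primary:

  \forall u_i, u_k, u_j \in s,\; i < k < j:\quad \bigl(u_i^l \in P \,\wedge\, u_j^l \in P\bigr) \;\Rightarrow\; u_k^l \in P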
30 Constraint-Based Approach
- Constraints
- Speakers cannot answer their own questions or follow their own commands (formalized below).
Key: u_i = the ith utterance in the dialogue; s = the sequence containing u_i; u_i^l = the label assigned to u_i; u_i^s = the speaker of u_i.
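An illustrative formalization (sketch only): within a sequence, the same speaker may not supply both the secondary move and a later primary move of a knowledge or action pair:

  \forall u_i, u_j \in s,\; i < j,\; u_i^s = u_j^s:\quad \neg\bigl(u_i^l = \mathrm{K2} \,\wedge\, u_j^l = \mathrm{K1}\bigr) \;\wedge\; \neg\bigl(u_i^l = \mathrm{A2} \,\wedge\, u_j^l = \mathrm{A1}\bigr)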
31 Experiments
- We measure our performance using three metrics (sketched below):
- Accuracy: fraction of correctly predicted labels
- Kappa: accuracy improvement over chance agreement
- Ratio Prediction R²: how well our model predicts speaker Authoritativeness Ratio
- All results given are from 20-fold leave-one-conversation-out cross-validation
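A minimal sketch of computing these three metrics with scikit-learn (an assumption; all variable names and values here are hypothetical, and the ratio-prediction R² may in practice be a squared correlation rather than the coefficient of determination):

  from sklearn.metrics import accuracy_score, cohen_kappa_score, r2_score

  # Hypothetical fold output: gold vs. predicted Negotiation labels.
  gold_labels = ["K1", "K2", "A2", "o", "K1"]
  pred_labels = ["K1", "K2", "A1", "o", "K1"]

  # Hypothetical per-speaker Authoritativeness Ratios (gold vs. predicted).
  gold_ratios = [0.75, 0.40, 0.55]
  pred_ratios = [0.70, 0.45, 0.60]

  print("Accuracy:", accuracy_score(gold_labels, pred_labels))
  print("Kappa:", cohen_kappa_score(gold_labels, pred_labels))
  print("Ratio R2:", r2_score(gold_ratios, pred_ratios))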
32 Experiments
Classifier ILP? Acc. Kappa R2
Basic No 59.7 0.465 0.354
33 Experiments
Classifier ILP? Acc. Kappa R2
Basic No 59.7 0.465 0.354
Basic Yes 61.6 0.488 0.663
Accuracy improved (p < 0.009); correlation improved (p < 0.0003)
34 Experiments
Classifier ILP? Acc. Kappa R2
Basic No 59.7 0.465 0.354
Basic Yes 61.6 0.488 0.663
Advanced No 66.7 0.565 0.908
Accuracy improved (p < 0.0001); correlation improved (p < 0.0001)
35 Experiments
Classifier ILP? Acc. Kappa R2
Basic No 59.7 0.465 0.354
Basic Yes 61.6 0.488 0.663
Advanced No 66.7 0.565 0.908
Advanced Yes 68.4 0.584 0.947
Accuracy improved (p < 0.005); correlation improved (p < 0.0001)
36 Error Analysis
- Biggest source of error is o vs. not-o
- Is there any content at all in the utterance?
- High accuracy between the four K/A codes if content is identified, though
- An A2-A1 exchange often looks identical to a K1-o exchange.
37 Conclusion
- We've formulated the Negotiation framework in a reliable way.
- Machine learning models can reproduce this coding highly accurately.
- Local context and structure, enforced through ILP, help in this classification.