Title: gesture features for coreference
1gesture features for coreference
- Jacob Eisenstein
- Randall Davis
- MIT CSAIL
2coreference resolution
- when do two noun phrases refer to the same thing?
- "This circle is rotating clockwise and this piece
of wood is attached at this point and this point
but it can rotate. So as the circle rotates,
this moves in and out. So this whole thing is
just going back and forth."
4coreference resolution
This Wheel
The same?
This
This Bar
5coreference resolution
- "This Wheel": demonstrative NP, singular, neutral gender
- "This": pronoun, singular, neutral gender
- "This Bar": demonstrative NP, singular, neutral gender
- Traditional coreference resolution: the same?
6coreference
- annotated cheaply and reliably
- a building block for NLP applications
- summarization
- segmentation
- information retrieval
7coreference and catchments
- recurring gesture features match semantic patterns
- when gesture features disambiguate coreference, that is a catchment
- studying coreference gives a quantitative analysis of catchments
8dataset
- new corpus of spontaneous multimodal communication
- nine speaker-listener pairs
- explanations of mechanical device behavior
- manipulation of which modalities are available
- speech diagram sketch gesture only
- for this study, it's speech + diagram only
- more deixis, easier to interpret
- Total of 16 documents, 2-3 minutes in length
9tracking hand position
- motion, color, and edge cues are used to guide an articulated upper-body model
- 13 DOF, 2.5D
10particle filtering
- online search of model configurations
- sampled representation to maintain multiple hypotheses
- at each time step
- update weights based on new observation
- resample particles (with replacement)
- drift to capture system dynamics
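
The three per-step operations above can be sketched on a toy problem. This is a minimal illustration, not the tracker from the talk: it follows a single 1-D position rather than a 13 DOF body model, and the Gaussian observation likelihood, the Gaussian drift, and all names and parameter values (`particle_filter_step`, `track`, `obs_sigma`, `drift_sigma`) are assumptions made for the example.

```python
import math
import random


def particle_filter_step(particles, weights, observation, rng,
                         drift_sigma=1.0, obs_sigma=2.0):
    """One time step: reweight, resample with replacement, then drift."""
    # 1. Update weights based on the new observation
    #    (Gaussian likelihood is an assumption of this sketch).
    new_weights = [
        w * math.exp(-0.5 * ((observation - p) / obs_sigma) ** 2)
        for p, w in zip(particles, weights)
    ]
    total = sum(new_weights)
    if total == 0.0:
        # Every particle is far from the observation: fall back to uniform.
        new_weights = [1.0 / len(particles)] * len(particles)
    else:
        new_weights = [w / total for w in new_weights]

    # 2. Resample particles with replacement, in proportion to their weights.
    resampled = rng.choices(particles, weights=new_weights, k=len(particles))

    # 3. Drift: perturb each particle to model the system dynamics.
    drifted = [p + rng.gauss(0.0, drift_sigma) for p in resampled]

    # After resampling, all particles carry equal weight again.
    uniform = [1.0 / len(drifted)] * len(drifted)
    return drifted, uniform


def track(observations, n_particles=500, seed=0):
    """Run the filter over a sequence of observations; return mean estimates."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 100.0) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    estimates = []
    for obs in observations:
        particles, weights = particle_filter_step(particles, weights, obs, rng)
        estimates.append(sum(particles) / len(particles))
    return estimates
```

Calling `track([50.0] * 20)` returns one estimate per time step. Because the particle set holds many hypotheses at once, the estimate can recover after a poor observation instead of committing to a single configuration.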
11extracted data
- position, velocity, acceleration
- hands, arms, body and head
- occlusion modeled directly
- manually annotated speech transcripts
- force-aligned for time synchronization
- coreference annotations
12gesture features
- features on pairs of gestures
- to predict coreference
- features on individual gestures
- to predict whether an NP introduces a new entity
- to predict whether gesture is relevant to
coreference
13features on pairs of gestures
- distance between gestures
- is the same hand gesturing?
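
As a rough sketch of how these two pairwise features could be computed: the `Gesture` record, its field names, and the choice of Euclidean distance in image coordinates are assumptions for illustration, not details given in the talk.

```python
import math
from dataclasses import dataclass


@dataclass
class Gesture:
    """Hypothetical summary of one gesture: active hand and its position."""
    hand: str   # "left" or "right"
    x: float    # horizontal position in pixels
    y: float    # vertical position in pixels


def pairwise_features(a, b):
    """Features on a pair of gestures, used to predict coreference."""
    return {
        # Euclidean distance between the two gesture positions, in pixels.
        "distance": math.hypot(a.x - b.x, a.y - b.y),
        # Is the same hand gesturing in both cases?
        "same_hand": a.hand == b.hand,
    }
```

For example, `pairwise_features(Gesture("left", 0, 0), Gesture("left", 3, 4))` gives a distance of 5.0 pixels with `same_hand` set to `True`.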
14features on individual gestures
- speed
- jitter
- purpose = speed / jitter
- bimanual synchronization
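
The slides do not give formulas for these features, so the following is only one plausible reading. It assumes speed is the mean frame-to-frame displacement per second, jitter is the mean absolute change in speed between consecutive frames, and purpose is the ratio of speed to jitter. The function names, the 30 fps default, and the `eps` guard are hypothetical; bimanual synchronization is not covered.

```python
import math


def frame_speeds(track, dt):
    """Speed between consecutive (x, y) samples of a hand track."""
    return [
        math.hypot(x1 - x0, y1 - y0) / dt
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]


def single_gesture_features(track, dt=1.0 / 30.0, eps=1e-9):
    """Speed, jitter, and purpose for one gesture (needs >= 2 samples)."""
    s = frame_speeds(track, dt)
    mean_speed = sum(s) / len(s)
    # Jitter (assumed definition): mean absolute frame-to-frame speed change.
    changes = [abs(b - a) for a, b in zip(s, s[1:])]
    jitter = sum(changes) / len(changes) if changes else 0.0
    return {
        "speed": mean_speed,
        "jitter": jitter,
        # Purpose (assumed definition): fast, smooth motion scores high.
        "purpose": mean_speed / (jitter + eps),
    }
```

Under these definitions a steady sweep scores a much higher purpose than a stop-and-go movement of similar average speed.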
15results pairwise features
- distance between gestures (pixels)
- coreferent mean distance 48.4
- non-coreferent mean distance 74.8
16results single-gesture features
- does the NP have parents?
- not predicted by these features
- does the NP have children?
- predicted by speed, purpose
17results meta-features
- correlate single-gesture features with discriminability of pairwise distance
- speed, purpose (r = -.17)
- x distance from body center (r = .22)
- regression of single gesture features (r = .42)
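
The r values above are presumably Pearson correlation coefficients; that reading is an assumption, since the slides do not say. A minimal sketch of the statistic, with a hypothetical function name and no connection to the talk's actual data:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

Here `xs` would hold one single-gesture feature per gesture pair (for example speed) and `ys` a measure of how well pairwise distance separated coreferent from non-coreferent pairs.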
18when do catchments happen?
- what types of NP coreference are disambiguated by gesture?
- we assumed pronouns and "this"; not so
- definite NPs are not predicted well by gesture
19when do catchments happen?
- there's a lot of research on gesture-speech synchronization
- typically measures time at beginning of motion
- this is a different way to measure gesture-speech synchronization quite precisely
20where do catchments happen?
21future work
- move beyond deictic data, features
- we have data without diagrams, which includes more representational gestures
- recognize or annotate hand shape
- pairwise features that compare gesture trajectories
22done?
23does gesture actually improve coreference resolution?
- initial evaluation described in NAACL 2006
- the answer is yes, but not by as much as you'd hope
- 54.9 with gestures, 52.8 without
- coreference resolution in spoken dialogues is hard
- better feature combination techniques may improve performance, as with prosody
- need to figure out how to use the meta-features
24All Done!