Title: Building the Design Studio of the Future
1. Building the Design Studio of the Future
- Aaron Adler
- Jacob Eisenstein
- Michael Oltmans
- Lisa Guttentag
- Randall Davis
- October 23, 2004
2. Sketching Is Not the Whole Picture
3. Speech and Gestures
- Express more precise spatial and structural relationships
- Relate sketches that are drawn from different perspectives
- Describe temporal events or paths of motion
- Disambiguate using multiple modalities
4. Our Vision
- Previous work focused on one or two modalities
- Step back and take a look at the big picture
- Design Studio of the Future
- Understand users' sketching, speech, and gesture inputs
- Respond naturally, like talking to another human
5. Roadmap
Multimodal Communication
Multimodal Understanding
Recognition Components
Multimodal Intelligent Design Studio
Exploratory Studies
Dialogue Management
Human Dialogue
- Out-of-vocabulary terms
WoZ Dialogue Studies
Usability
- Possible design mistakes
- Illuminate unclear parts
Domain Modeling
Domain Reasoning
6. Outline
- Sketching is not the whole picture
- Roadmap
- Trimodal Studies
- Preliminary Components
- Speech and Sketching
- Gesture and Speech
8. Interesting Questions to Study
- Sketch
- What information is best communicated with sketching?
- How do people sketch when they're explaining things?
- How do they sketch when they're taking notes?
9. Interesting Questions to Study
- Speech
- What information is best communicated with speech?
- How do people talk when they're drawing? How can ASR better support speech in the context of other modalities?
- How should computers generate and understand back-channel feedback?
10. Interesting Questions to Study
- Gesture
- What information is best communicated with gesture?
- Which features of gesture are idiosyncratic? Which are universal?
- What does gesture tell us about discourse structure?
- Are gestures cues for disfluencies? Sentence breaks?
- How can we build embodied agents with realistic body language?
11. Trimodal Interaction Study
- How are speech, gesture, and sketching used together?
- 15 pairs of participants
- Two video cameras, a Mimio, and a Tablet PC
- 4.5 hours of data
12. Looking Ahead to Recognition
- Captured digital video
- Colored gloves used for future hand tracking
- Headset microphones
- Time-stamped sketch data
13. Diagrams and objects
14. Trimodal Setup
Speaker
Listener
15. Trimodal Setup
16. Details to Observe
- Structure then function
- Back-channel communication
- Collaboration between speaker and listener
17. Movie Clip I
18. Movie Clip II
19. Movie Clip III
20. Early Results
- We identified two phases in the explanations
- Structure phase
- Function phase
- Participants used back-channel communication during the explanations
- The amount of sketching the listener did varied
22. Results: Speech and Sketching
- Filled pauses are important: "ahh", "umm"
- Key phrases: "there are", "and"
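As a rough illustration of how such cues might be picked out of a transcript, the Python sketch below flags filled pauses and key phrases in a word sequence. The word lists contain only the examples named on the slide, and the function name `find_cues` and the sample utterance are assumptions made for illustration; the slides do not describe the actual detection method.

```python
# Cue lists hold only the examples from the slide; a real system would need more.
FILLED_PAUSES = {"ahh", "umm"}
KEY_PHRASES = [("there", "are"), ("and",)]


def find_cues(words):
    """Return (index, kind, text) for each filled pause or key phrase found."""
    lowered = [w.lower().strip(".,") for w in words]
    cues = []
    for i, word in enumerate(lowered):
        if word in FILLED_PAUSES:
            cues.append((i, "filled_pause", word))
        for phrase in KEY_PHRASES:
            # Compare the window of words starting at i against the phrase.
            if tuple(lowered[i:i + len(phrase)]) == phrase:
                cues.append((i, "key_phrase", " ".join(phrase)))
    return cues


# Hypothetical utterance, not taken from the study data.
utterance = "Umm there are three pendulums and ahh they swing".split()
for cue in find_cues(utterance):
    print(cue)
```

Cue positions like these could then serve as candidate boundaries when grouping sketch strokes with the speech that describes them.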
23. Multimodal Understanding and Integration
- By combining speech and sketching, we can make some operations easier to perform
There are three identical pendulums.
There are three touching pendulums.
24. Speech-Gesture Integration
Time
- Align referential pronouns and gestures to the diagram
- "this piece is made of wood"
- "it moves back and forth like this"
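One simple way to picture this alignment is to pair each referential word with the gesture closest to it in time. The Python sketch below does that under stated assumptions: the `Word` and `Gesture` records, the pronoun list, the `max_gap` threshold, and all timings are hypothetical, and nearest-in-time matching is a naive baseline, not the method used in this work.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float


@dataclass
class Gesture:
    label: str
    start: float  # seconds
    end: float


# Assumed list of referential words; not taken from the slides.
REFERENTIAL = {"this", "that", "it", "here", "there"}


def midpoint(item):
    return (item.start + item.end) / 2.0


def align(words, gestures, max_gap=1.0):
    """Pair each referential word with the gesture nearest in time.

    A pair is kept only if the two midpoints are within max_gap seconds.
    """
    if not gestures:
        return []
    pairs = []
    for word in words:
        if word.text.lower() not in REFERENTIAL:
            continue
        nearest = min(gestures, key=lambda g: abs(midpoint(g) - midpoint(word)))
        if abs(midpoint(nearest) - midpoint(word)) <= max_gap:
            pairs.append((word.text, nearest.label))
    return pairs


# Hypothetical timings for the first example utterance.
words = [Word("this", 0.0, 0.2), Word("piece", 0.2, 0.5), Word("is", 0.5, 0.6),
         Word("made", 0.6, 0.8), Word("of", 0.8, 0.9), Word("wood", 0.9, 1.2)]
gestures = [Gesture("point_at_block", 0.0, 0.4)]
print(align(words, gestures))
```

A baseline like this ignores linguistic structure entirely, which is one reason a richer feature set can be expected to help.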
25. Results: Speech and Gesture
- Corpus-based approach
- Feature set based on linguistic knowledge
- Improves performance
- 84% F-measure for baseline system
- 95% F-measure for optimization system
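For reference, the F-measure is commonly computed as the harmonic mean of precision and recall. The Python sketch below shows the balanced (F1) form, on the assumption that this is the variant reported; the slides do not say which weighting was used, and the counts in the usage line are made up for illustration.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from the study.
print(round(f_measure(true_pos=90, false_pos=10, false_neg=10), 2))
```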
26. Conclusion
- Intelligent Design Studio of the Future
- Trimodal Corpus
- Empirical Results
- Initial Components