Title: Natural Tasking of Robots Based on Human Interaction Cues
1. Natural Tasking of Robots Based on Human Interaction Cues
- Rodney Brooks
- Cynthia Breazeal
- Brian Scassellati
- MIT Artificial Intelligence Lab
2. Objective: Instructing Robots Through Imitation
- A commander in the field will be able to task a robot just as he tasks a soldier
- Robots will be usable without special training or programming skills
- Robots will be taskable in unique and dynamic situations
3. Research Issues
- Knowing what to imitate
  - What sensory signals are relevant to the current task?
- Mapping between bodies
  - How can the sensory view of another agent be converted into a mapping onto the robot's own body?
- Correcting failures and recognizing success
  - How can the robot automatically develop success measures?
- Chaining actions together
  - How can the learned pieces be chained together into complete task sequences?
- Generalizing to more complex tasks
  - How can the robot use invariants and common-sense knowledge to adapt learned sequences to new situations?
- Making the interaction intuitive for the human
  - How can natural social signals make it intuitive for humans and robots to communicate with each other?
4. Approach
- Four aspects of our research methodology address these six research issues:
  - Capitalize on social cues from the commander
  - Build adaptive systems with a developmental progression to limit complexity
  - Exploit the advantages of the robot's physical embodiment
  - Integrate multiple sensory and motor systems to provide robust and stable behavioral constraints
5. Humanoid Robot Platforms
- Upper-torso humanoid (Cog)
  - 21 DOF
  - Visual, vestibular, auditory, tactile, and kinesthetic senses
- Active vision head with facial expressions (Kismet)
  - 6 DOF head/neck, 15 DOF expressions
  - Visual and auditory sensing
6. New Heads for Cog and Kismet
- Developed under other funding (but used in MARS)
- Modeled after human anatomy.
- 8 DOF neck and eyes
- Very close to human range of motion and speed
- Assembly and initial testing completed for three
systems
7. Mobile Platform 1: M4
- Force-controlled quadruped walker constructed under another DARPA contract
- 19 DOF with visual, vestibular, thermal, and kinesthetic sensing
- Currently completing design specs
8. Mobile Platform 2: Coco
- Knuckle-walking, gorilla-like robot
- 15 DOF with visual, auditory, kinesthetic, and tactile sensing
- Currently in assembly
9. Development of Imitation Skills
Diagram: developmental map of imitation-related skills, including speech prosody, vocal cue production, directing the instructor's attention, robot teaching, face finding, eye contact, gaze direction, gaze following, an intentionality detector, recognizing the instructor's knowledge states, recognizing beliefs, desires, and intentions, arm and face gesture recognition, facial expression recognition, recognizing pointing, familiar face recognition, motion detection, object saliency, object segmentation, object permanence, expectation-based representations, body-part segmentation, human motion models, depth perception, long-term knowledge consolidation, an attention system, task-based guided perception, action sequencing, schema creation, social script sequencing, instructional sequencing, turn taking, VOR/OKR, smooth pursuit and vergence, multi-axis orientation, mapping the robot body to the human body, kinesthetic body representation, self-motion models, tool use, reaching around obstacles, object manipulation, line-of-sight reaching, simple grasping, and active object exploration.
10. Current Research Topics
- Perceiving people
  - Finding faces
  - Finding eyes
- Adaptive motor control
  - Simulated muscular motor control
  - Oscillator-based locomotion
- Training from non-linguistic vocal cues
  - Recognizing prosody (getting feedback)
  - Expressive vocalization system (giving feedback)
- Integration
  - Oculo-motor control systems
  - Imitation as a mechanism for self-recognition
  - Social dynamics
11. Finding Faces: Two Methods
1) Ratio template: looks for a structure of grayscale gradients to detect frontal views; also extracts eye locations.
Figure: raw image, ratio template, detected face, and located eyes.
2) Oval model: looks for a partial ellipse based on edge gradients to detect a variety of head orientations.
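The ratio-template method tests relative brightness between coarse facial regions (eyes darker than forehead and cheeks, for instance) instead of absolute pixel values. Below is a minimal Python sketch of that idea; the region layout, relation set, margin, and threshold are illustrative placeholders, not the exact template used on the robots.

```python
import numpy as np

# Illustrative region boxes (row0, row1, col0, col1) inside a 14x16 window.
# A real Sinha-style template uses a specific hand-designed layout;
# these regions and relations are placeholders for the sketch.
REGIONS = {
    "forehead":    (0, 4, 2, 14),
    "left_eye":    (4, 7, 2, 7),
    "right_eye":   (4, 7, 9, 14),
    "nose":        (7, 11, 6, 10),
    "left_cheek":  (7, 11, 2, 6),
    "right_cheek": (7, 11, 10, 14),
}

# Ordinal brightness relations expected for a frontal face: (brighter, darker).
RELATIONS = [
    ("forehead", "left_eye"), ("forehead", "right_eye"),
    ("left_cheek", "left_eye"), ("right_cheek", "right_eye"),
    ("nose", "left_eye"), ("nose", "right_eye"),
]

def region_mean(window, box):
    r0, r1, c0, c1 = box
    return float(window[r0:r1, c0:c1].mean())

def ratio_template_score(window, margin=1.05):
    """Count how many expected brightness relations hold in a 14x16 window."""
    means = {name: region_mean(window, box) for name, box in REGIONS.items()}
    return sum(means[a] > margin * means[b] for a, b in RELATIONS)

def is_face(window, min_relations=5):
    """Declare a frontal face if most relations hold; eye boxes give eye locations."""
    return ratio_template_score(window) >= min_relations

# Toy pattern: uniform face patch with darker eye regions.
window = np.full((14, 16), 120.0)
window[4:7, 2:7] = window[4:7, 9:14] = 60.0
print(is_face(window))   # True for this toy pattern
```

Because only ordinal brightness relations are tested, the check is largely invariant to overall illumination, which is what makes a template of this kind workable on live video.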
12. Adaptive Eye Finding
Block diagram: raw image, skin-color filter, multi-layer perceptron, heuristic filter, and target selector.
- Trained on hand-labeled images from real environments
- Currently using this and the ratio-template eye finder to begin to extract gaze direction, which leads to an understanding of joint reference (a pipeline sketch follows below)
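A rough sketch of how such an eye-finding pipeline can fit together, assuming a skin-color pre-filter, a small multi-layer perceptron scored over candidate patches, and a simple "keep the two best" heuristic as the target selector. The thresholds, patch size, and network shape are illustrative; scikit-learn's MLPClassifier stands in for the trained perceptron, and the random placeholder training data exists only to keep the sketch runnable.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def skin_mask(rgb):
    """Crude skin-color filter in normalized-RGB space (thresholds are illustrative)."""
    rgb = rgb.astype(np.float32) + 1e-6
    total = rgb.sum(axis=2)
    r, g = rgb[..., 0] / total, rgb[..., 1] / total
    return (r > 0.35) & (r < 0.55) & (g > 0.28) & (g < 0.38)

def candidate_patches(gray, mask, size=12, stride=4):
    """Yield small windows centered on skin-colored areas as eye candidates."""
    for y in range(0, gray.shape[0] - size, stride):
        for x in range(0, gray.shape[1] - size, stride):
            if mask[y:y + size, x:x + size].mean() > 0.5:
                yield (y, x), gray[y:y + size, x:x + size].ravel()

# Perceptron scored over eye / non-eye patches; in practice it is trained on
# hand-labeled images from real environments. Random data keeps the sketch runnable.
eye_net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000)
rng = np.random.default_rng(0)
eye_net.fit(rng.random((40, 144)), rng.integers(0, 2, 40))

def find_eyes(gray, rgb, top_k=2):
    """Score every candidate patch and keep the k most eye-like locations."""
    mask = skin_mask(rgb)
    scored = [(eye_net.predict_proba([patch])[0, 1], loc)
              for loc, patch in candidate_patches(gray, mask)]
    scored.sort(reverse=True)
    return [loc for _, loc in scored[:top_k]]   # heuristic: take the two best
```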
13. Current Research Topics (agenda recap)
14. Simulated Muscular Motor Control
- Dynamical properties similar to those of human musculature, as opposed to typical position/velocity-based control
- Compliance/equilibrium-point control provides a model of muscles (a minimal sketch follows this list)
- Fatigue model: joint strength is modulated by activity
- Simulated multi-joint muscles
- Oscillatory circuits that simulate spinal mechanisms
- Implementation: roaming bands of modelers
  - Build sensory-motor correlations to generate functional models of the causal relationships between different systems
  - Over time, will build up a "body image"
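A minimal sketch of equilibrium-point control with a fatigue term, assuming a single joint modeled as a spring-damper pulled toward a commanded equilibrium angle, with available strength reduced by sustained activity. All gains, rates, and the one-joint simulation loop are illustrative values, not the parameters of the actual simulated musculature.

```python
import numpy as np

def muscle_torque(theta, theta_dot, theta_eq, stiffness, damping):
    """Equilibrium-point control: the joint behaves like a spring-damper pulled
    toward a commanded equilibrium angle rather than tracking a position setpoint."""
    return -stiffness * (theta - theta_eq) - damping * theta_dot

def update_fatigue(strength, activity, dt, decay=0.2, recovery=0.05):
    """Fatigue model: sustained activity lowers available joint strength,
    and rest lets it recover toward 1.0 (rates here are illustrative)."""
    strength += (-decay * activity + recovery * (1.0 - strength)) * dt
    return float(np.clip(strength, 0.1, 1.0))

# One-joint simulation: the arm relaxes toward a newly commanded equilibrium angle.
theta, theta_dot, strength = 0.0, 0.0, 1.0
inertia, dt = 0.05, 0.001
for step in range(2000):
    tau = strength * muscle_torque(theta, theta_dot, theta_eq=1.0,
                                   stiffness=4.0, damping=0.4)
    theta_dot += (tau / inertia) * dt
    theta += theta_dot * dt
    strength = update_fatigue(strength, activity=abs(tau), dt=dt)
print(f"final angle {theta:.2f} rad, remaining strength {strength:.2f}")
```

The point of the formulation is that commands specify where the joint should rest and how stiffly it should resist disturbance, so contact forces and fatigue reshape the trajectory gracefully instead of fighting a stiff position servo.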
15. Oscillator-based Locomotion
- Previous work on using neural oscillators for arm control
- Application of oscillators to control locomotion (a coupling sketch follows this list)
  - Coupling between the joints determines a gait
  - Robust to changes in environment and starting conditions
  - Simplifies high-level control
  - Natural distributed control
  - No explicit model of the dynamics is used
Figure: oscillator connections used to achieve a trotting gait.
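A minimal sketch of how joint coupling can determine a gait, using coupled phase oscillators rather than the full neural (Matsuoka-style) oscillators used in the earlier arm-control work; the trot offsets come from diagonal leg pairs moving together, while the frequency, coupling gain, and hip amplitude are illustrative values.

```python
import numpy as np

# Trot: diagonal leg pairs move in phase, the two pairs a half-cycle apart.
# Desired phase offsets (radians) for legs [LF, RF, LH, RH].
TROT_OFFSETS = np.array([0.0, np.pi, np.pi, 0.0])

def step_oscillators(phases, omega, coupling, dt):
    """Advance one step of coupled phase oscillators; each leg's oscillator is
    pulled toward its neighbours' phases shifted by the gait's desired offsets."""
    dphi = np.full_like(phases, omega)
    for i in range(len(phases)):
        for j in range(len(phases)):
            if i != j:
                desired = TROT_OFFSETS[i] - TROT_OFFSETS[j]
                dphi[i] += coupling * np.sin(phases[j] - phases[i] + desired)
    return phases + dphi * dt

# Each hip command is a sinusoid of its oscillator phase; no explicit model
# of the body dynamics is used anywhere in the controller.
phases = np.random.default_rng(1).uniform(0, 2 * np.pi, 4)   # arbitrary start
for _ in range(5000):
    phases = step_oscillators(phases, omega=2 * np.pi * 1.5, coupling=4.0, dt=0.002)
hip_commands = 0.3 * np.sin(phases)                          # illustrative amplitude
print(np.round((phases - phases[0]) % (2 * np.pi), 2))       # converges to ~[0, pi, pi, 0]
```

Because each oscillator reacts only to its neighbours' phases, the control is naturally distributed and settles into the trot phasing from arbitrary starting conditions.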
16. Oscillator-based Locomotion (continued)
- Tested in simulation
- Gait and motion determined by oscillator coupling
- Force-control provides robustness
Plots: output of connected vs. unconnected oscillators.
17. Current Research Topics (agenda recap)
18. Recognizing Prosody
- Fernald's (1989, 1993) cross-linguistic analysis of infant-directed speech suggests four distinct pitch contours: approval, attention, prohibition, and comfort.
- Re-implemented a classifier built by Slaney and McRoberts (1998) to automatically distinguish between approval, attention, and prohibition.
- Signal processing extracts the pitch contour, MFCCs, and energy from each utterance.
- 32 features: pitch mean, variance, slope, delta-MFCC, energy variance, etc.
- Training and classification: Gaussian mixture model, evaluated with the .632 bootstrapping procedure (Efron, 1993); feature subsets chosen by sequential forward feature selection.
- Figure: classification performance of individual features. (A sketch of the classifier follows below.)
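A sketch of the classification stage, under the assumption of one Gaussian mixture per prosodic class scored by log-likelihood; scikit-learn's GaussianMixture stands in for the actual implementation, feature extraction is omitted, and the random placeholder data exists only so the example runs.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

CLASSES = ["approval", "attention", "prohibition"]

class ProsodyClassifier:
    """One Gaussian mixture per class over utterance-level prosodic features
    (e.g., pitch mean/variance/slope, energy variance, delta-MFCC statistics);
    an utterance is assigned to the class whose mixture scores it highest."""

    def __init__(self, n_components=2):
        self.models = {c: GaussianMixture(n_components=n_components,
                                          covariance_type="diag",
                                          random_state=0)
                       for c in CLASSES}

    def fit(self, features, labels):
        features, labels = np.asarray(features), np.asarray(labels)
        for c, gmm in self.models.items():
            gmm.fit(features[labels == c])
        return self

    def predict(self, features):
        scores = np.column_stack([self.models[c].score_samples(features)
                                  for c in CLASSES])
        return [CLASSES[i] for i in scores.argmax(axis=1)]

# Placeholder data standing in for real labeled utterances (8 global features each).
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 8)) + np.repeat(np.arange(3)[:, None], 30, axis=0)
y = np.repeat(CLASSES, 30)
clf = ProsodyClassifier().fit(X, y)
print(clf.predict(X[:3]))
```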
19. Initial Prosody Results
- Classification accuracy (best performance obtained using sequential forward selection):
  - original system, all 32 features: 57.5%
  - replicated system, all 32 features: 60.8%
  - sequential forward selection, 8 global features: 71.5%
- Future work:
  - incorporate prior knowledge in selecting new features, such as utterance duration
  - speaker-dependent classification
20. Expressive Vocalization System
- Provide expressive vocal feedback to aid learning
- Convert expressive parameters of the voice (such as pitch and timing) into articulatory parameters for a speech synthesizer (a sketch of the mapping follows this list)
- Choose from six basic emotions (anger, disgust, fear, happiness, sadness, surprise) plus calm
- Expressive parameters: pitch, timing, voice quality, and articulation
- Input utterances:
  - Manually entered utterances: text-based
  - Self-generated utterances: phoneme-based with in-line stress markers
- Output: articulatory parameters for the synthesizer
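A sketch of the emotion-to-parameter mapping stage, assuming the expressive parameters from the slide (pitch, timing, voice quality, articulation) are blended between a calm baseline and the chosen emotion before being handed to the synthesizer. The VoiceSettings fields, the numeric values, and the vocalize helper are hypothetical illustrations, not the actual synthesizer control codes.

```python
from dataclasses import dataclass

@dataclass
class VoiceSettings:
    """Synthesizer-facing parameters derived from an emotional state."""
    pitch_baseline_hz: float   # average pitch
    pitch_range: float         # 1.0 = neutral excursion around the baseline
    speech_rate: float         # 1.0 = neutral timing
    breathiness: float         # 0..1 voice-quality term
    precision: float           # 0..1 articulation crispness

# Illustrative mappings from the basic emotions to expressive parameters;
# a real system converts these further into synthesizer control codes.
EMOTION_MAP = {
    "anger":    VoiceSettings(220, 1.6, 1.2, 0.1, 0.9),
    "disgust":  VoiceSettings(180, 0.8, 0.8, 0.2, 0.6),
    "fear":     VoiceSettings(260, 1.8, 1.4, 0.5, 0.7),
    "happy":    VoiceSettings(240, 1.5, 1.1, 0.2, 0.7),
    "sad":      VoiceSettings(170, 0.6, 0.7, 0.6, 0.4),
    "surprise": VoiceSettings(250, 1.9, 1.0, 0.3, 0.8),
    "calm":     VoiceSettings(200, 1.0, 1.0, 0.3, 0.6),
}

def vocalize(phonemes_with_stress, emotion, intensity=1.0):
    """Blend the calm baseline toward the chosen emotion and attach the result
    to a phoneme string carrying in-line stress markers."""
    base, target = EMOTION_MAP["calm"], EMOTION_MAP[emotion]
    blend = lambda a, b: a + intensity * (b - a)
    settings = VoiceSettings(*(blend(getattr(base, f), getattr(target, f))
                               for f in base.__dataclass_fields__))
    return {"phonemes": phonemes_with_stress, "settings": settings}

print(vocalize("dh ih s ' ih z f ah n", "happy", intensity=0.8))
```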
21. Current Research Topics (agenda recap)
22. Oculo-motor Control
- Integrating target fixation with attention
- Architecture (block diagram): frame-grabber daemons for the wide and left/right foveal cameras feed skin, color, motion, and face detectors; a habituation-weighted attention system and target selector combine these cues under the influence of behaviors and motivations; the wide tracker, foveal tracker, and foveal disparity supply smooth-pursuit and vergence targets; the winning behavior selects among smooth pursuit and vergence with neck compensation, VOR, saccades with neck compensation, fixed action patterns, and affective postural shifts with gaze compensation; conjunctive, disjunctive, and ballistic (saccade) movement commands pass through an arbiter to eye-head-neck control and the motion control daemon, which drives the eye-neck motors.
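A sketch of how the attention and target-selection stage of such an architecture can combine detector outputs, assuming per-detector gains that behaviors and motivations can adjust, a habituation map that suppresses targets stared at too long, and an argmax target selector. The map sizes, weights, and the top_down_bias hook are illustrative assumptions.

```python
import numpy as np

def attention_target(feature_maps, weights, habituation, top_down_bias=None):
    """Combine detector feature maps into one saliency map and pick a target.

    feature_maps: dict of equally sized 2-D arrays (skin, color, motion, face).
    weights: per-detector gains, which behaviors/motivations can raise or lower.
    habituation: 2-D map that suppresses regions fixated for too long.
    """
    saliency = sum(weights[name] * fmap for name, fmap in feature_maps.items())
    saliency *= habituation                     # decays where the robot has been staring
    if top_down_bias is not None:               # e.g., a "seek faces" behavior
        saliency += top_down_bias
    return np.unravel_index(np.argmax(saliency), saliency.shape)

# Illustrative use: a behavior seeking social contact boosts the face channel.
rng = np.random.default_rng(2)
maps = {name: rng.random((48, 64)) for name in ("skin", "color", "motion", "face")}
weights = {"skin": 0.5, "color": 0.3, "motion": 0.7, "face": 1.5}
habituation = np.ones((48, 64))
y, x = attention_target(maps, weights, habituation)
print("saccade target (row, col):", y, x)
```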
23. Imitation and Self-recognition
- Use imitation as a way of recognizing:
  - the robot's own body
  - other socially responsive agents
Plots: interaction quality when the robot follows vs. when the robot leads.
24. Social Dynamics
- Models of social dynamics must be integrated with:
  - Perception
  - Motor control
  - Emotion models and expressive systems
- Future work:
  - Turn taking
  - Social gaze
25. Natural Tasking of Robots
- Robots can be naturally tasked using human social cues
- Robots need to understand intuitive human interactions
- Robots need to map their own capabilities to those of the human instructor
- Perceiving People
- finding faces
- finding eyes
- Adaptive Motor Control
- simulated muscular motor control
- oscillator-based locomotion
- Using non-linguistic vocal cues
- recognizing prosody
- expressive vocalization system
- Integration
- A commander in the field will be able to task a robot just as he tasks a soldier
- Robots will be usable without special training or programming skills
- Robots will be taskable in unique and dynamic situations
26. Acknowledgements
- Eduardo Torres-Jara
- Naoki Sadakuni
- Juan Velasquez
- Charlie Kemp
- Lijin Aryananda
- Matthew Marjanovic
- Jessica Banks
- Artur Arsenio
- Paul Fitzpatrick
- Paulina Varchavskaia
- Chris Morse
- Chris Scarpino
- Bryan Adams