RRL: A Rich Representation Language for the Description of Agent Behaviour in NECA


1
RRL: A Rich Representation Language for the
Description of Agent Behaviour in NECA

Paul Piwek, ITRI, Brighton
Brigitte Krenn, OFAI, Vienna
Marc Schröder, DFKI, Saarbrücken
Martine Grice, IPUS, Saarbrücken
Stefan Baumann, IPUS, Saarbrücken
Hannes Pirker, OFAI, Vienna

2
(No Transcript)
3
NECA

Duration: 2.5 years
Start: October 2001
A new generation of mixed multi-user / multi-agent
virtual spaces for the internet
Populated by affective conversational agents

4
Affective Conversational Agents

Express themselves through
Emotional speech and
synchronised non-verbal expression

5
Application Scenarios
The NECA Platform will be evaluated in two
concrete application scenarios

Socialite
a multi-user web-application in the social domain
eShowRoom
a novel approach to the presentation of products
in e-Commerce applications

6
Socialite
7
(No Transcript)
8
NECA's Architecture
User Input
Affective Reasoner (AR)
Scene Generator
Scene Description
9
NECA's Architecture
User Input
Affective Reasoner (AR)
Scene Generator
Scene Description
Multi-modal Natural Language Generator (M-NLG)
Multi-modal Output
10
NECA's Architecture
User Input
Affective Reasoner (AR)
Scene Generator
Scene Description
Multi-modal Natural Language Generator (M-NLG)
Multi-modal Output
Text/Concept to Speech Synthesis (CTS)
Emotional Speech
Phonetic/Prosodic Information
11
NECA's Architecture
User Input
Affective Reasoner (AR)
Scene Generator
Scene Description
Multi-modal Natural Language Generator (M-NLG)
Multi-modal Output
Text/Concept to Speech Synthesis (CTS)
Emotional Speech
Phonetic/Prosodic Information
Gesture Assignment Module (GA)
Animation directives
12
NECA's Architecture
User Input
Affective Reasoner (AR)
Scene Generator
Scene Description
Multi-modal Natural Language Generator (M-NLG)
Multi-modal Output
Text/Concept to Speech Synthesis (CTS)
Emotional Speech
Phonetic/Prosodic Information
Gesture Assignment Module (GA)
Animation directives
Player-Specific Rendering
Animation Control Sequence
13
NECA's Architecture
User Input
Affective Reasoner (AR)
Scene Generator
RRL
Scene Description
Multi-modal Natural Language Generator (M-NLG)
RRL
Multi-modal Output
Text/Concept to Speech Synthesis (CTS)
Emotional Speech
RRL
Phonetic/Prosodic Information
Gesture Assignment Module (GA)
RRL
Animation directives
Player-Specific Rendering
Animation Control Sequence
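
The module chain on this slide can be sketched as a simple pipeline in which each module adds one representation for the next module to consume. This is only an illustrative sketch: the names `make_stage`, `PIPELINE` and `run_pipeline` are hypothetical, the Affective Reasoner and the emotional speech output are left out because the slide does not say how they connect, and the real NECA modules exchange RRL documents rather than strings.

```python
from typing import Callable, List

# A stage takes the representations produced so far and returns them
# extended with its own output (hypothetical simplification of a module).
Stage = Callable[[List[str]], List[str]]


def make_stage(name: str, output: str) -> Stage:
    """Build a placeholder module that records which output it produced."""
    def stage(representations: List[str]) -> List[str]:
        return representations + [f"{name} -> {output}"]
    return stage


# Module order and output names are taken from the architecture slide.
PIPELINE: List[Stage] = [
    make_stage("Scene Generator", "Scene Description"),
    make_stage("M-NLG", "Multi-modal Output"),
    make_stage("CTS", "Phonetic/Prosodic Information"),
    make_stage("GA", "Animation directives"),
    make_stage("Player-Specific Rendering", "Animation Control Sequence"),
]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run every stage in order, threading the growing list through."""
    representations: List[str] = []
    for stage in stages:
        representations = stage(representations)
    return representations


if __name__ == "__main__":
    for line in run_pipeline(PIPELINE):
        print(line)
```

According to the slide, the first four hand-overs are expressed in RRL; the final Animation Control Sequence is player-specific.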
14
Requirements for RRL

Application Domain
Represent combinations of different types of
information
Expressivity
Processing Modules
Ease of manipulation/search (incremental/fast)
Developers (Maintainability)
Predictability
Locality
Conciseness
Intelligibility

15
Scene Description
SG
What is a Scene?
"I Theatr. 1 A subdivision of (an act of) a play,
in which the time is continuous and the setting
fixed; the action and dialogue comprised in any
one of these subdivisions." (New Shorter Oxford
English Dictionary, 1996)
M-NLG
TTS/CTS
GA
16
Scene Descriptions in a Nutshell

Network representations
Flat, uniform
Use the Description Logic distinction between
T-box and A-box. The T-box defines types,
subtypes, attributes and constants
Can emulate CFGs, so we can include, e.g.,
semantic representation languages such as Discourse
Representation Theory (Kamp & Reyle, 1994)
Reification of expressions in the network provides
useful handles for interleaving different types
of information
Lends itself well to graphical representation
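
A minimal sketch of the idea, assuming a toy vocabulary: the T-box holds the type hierarchy, the A-box is a flat list of assertions, and every node (including the reified relation `e1`) has a handle that other kinds of information can attach to. All type names, handles and the `emphasis` attribute are hypothetical and are not taken from the RRL specification.

```python
from typing import Dict, List, Optional, Tuple

# T-box: each type maps to its supertype (None marks a root type).
TBOX: Dict[str, Optional[str]] = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "seat": "entity",
    "relation": None,
    "has": "relation",
}

# A-box: a flat, uniform list of (handle, attribute, value) assertions.
# The relation "car has seat" is reified as node e1, so further
# information can point at the relation itself via its handle.
ABOX: List[Tuple[str, str, str]] = [
    ("x1", "type", "car"),
    ("x2", "type", "seat"),
    ("e1", "type", "has"),
    ("e1", "arg1", "x1"),
    ("e1", "arg2", "x2"),
    ("e1", "emphasis", "high"),  # non-semantic info attached to the handle
]


def is_subtype(type_name: Optional[str], ancestor: str) -> bool:
    """Walk up the T-box hierarchy from type_name looking for ancestor."""
    while type_name is not None:
        if type_name == ancestor:
            return True
        type_name = TBOX.get(type_name)
    return False


def instances_of(ancestor: str) -> List[str]:
    """Handles of all A-box nodes whose type is ancestor or a subtype."""
    return [h for h, attr, value in ABOX
            if attr == "type" and is_subtype(value, ancestor)]


def attributes(handle: str) -> Dict[str, str]:
    """Collect every assertion made about one handle."""
    return {attr: value for h, attr, value in ABOX if h == handle}


if __name__ == "__main__":
    print(instances_of("entity"))
    print(attributes("e1"))
```

Because the network is flat, a module looking for a particular kind of information only needs to filter the assertion list by handle or attribute.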

17
Scene Descriptions in a Nutshell

Further Features of (RRL) Scene Descriptions
For communication between modules: XML syntax
Temporal relations are explicitly represented.
Meta-conditions used in DRT for WH-questions,
Topics and Bridging Anaphora

18
eShowRoom Example
19
eShowRoom Example
20
eShowRoom Example
21
eShowRoom Example
22
Multimodal Output
SG

Multimodal Natural Language Generation (M-NLG)
supplies
Information on emotional state
Conceptually rich input for Speech Synthesis
Initial specification of gestures and facial
expressions for later use in Gesture Assignment

M-NLG
TTS/CTS
GA
23
NECA's Speech Synthesis: Emotions
SG

Not restricted to prosody (pitch, duration)
Several voice databases
diphone inventories for different voice qualities
(modal, loud, soft)
Emotive interjections
Gradual emotional states
Shades of emotion / changing over time

M-NLG
TTS/CTS
GA
24
NECA's Speech Synthesis: Concept-to-Speech
SG

Concept-to-Speech instead of Text-to-Speech
approach
Part of Speech tags
Syntactic structure
Information status (given/new)
Information structure (theme/rheme)

M-NLG
TTS/CTS
GA
25
CTS-specific information
SG

<sentence>
  <text>This car has leather seats.</text>
  <gesture modality="voice" meaning="beautiful"/>
</sentence>

M-NLG
TTS/CTS
GA
26
CTS-specific information
SG

<sentence>
  <text>This car has leather seats.</text>
  <gesture modality="voice" meaning="beautiful"/>
  <word text="This" pos="PDAT"/>
  <word text="car" pos="NN"/>
  <word text="has" pos="VAFIN"/>
  <word text="leather seats" pos="NN"/>
  <punct text="." pos="."/>
</sentence>

M-NLG
TTS/CTS
GA
27
CTS-specific information
SG

<sentence>
  <text>This car has leather seats.</text>
  <gesture modality="voice" meaning="beautiful"/>
  <synPhrase category="NP" function="SB">
    <word text="This" pos="PDAT"/>
    <word text="car" pos="NN"/>
  </synPhrase>
  <synPhrase phrase="VP" function="PD">
    <word text="has" pos="VAFIN"/>
    <synPhrase phrase="NP" function="OA">
      <word text="leather seats" pos="NN"/>
    </synPhrase>
    <punct text="." pos="."/>
  </synPhrase>
</sentence>

M-NLG
TTS/CTS
GA
28
CTS-specific information
SG

<sentence>
  <text>This car has leather seats.</text>
  <gesture modality="voice" meaning="beautiful"/>
  <synPhrase category="NP" function="SB">
    <word text="This" pos="PDAT"/>
    <infoStatus type="referent-given">
      <word text="car" pos="NN"/>
    </infoStatus>
  </synPhrase>
  <synPhrase phrase="VP" function="PD">
    <word text="has" pos="VAFIN"/>
    <synPhrase phrase="NP" function="OA">
      <word text="leather seats" pos="NN"/>
    </synPhrase>
    <punct text="." pos="."/>
  </synPhrase>
</sentence>

M-NLG
TTS/CTS
GA
29
CTS-specific information
SG

<sentence>
  <text>This car has leather seats.</text>
  <gesture modality="voice" meaning="beautiful"/>
  <infoStruct part="theme">
    <synPhrase category="NP" function="SB">
      <word text="This" pos="PDAT"/>
      <infoStatus type="referent-given">
        <word text="car" pos="NN"/>
      </infoStatus>
    </synPhrase>
  </infoStruct>
  <infoStruct part="rheme">
    <synPhrase phrase="VP" function="PD">
      <word text="has" pos="VAFIN"/>
      <synPhrase phrase="NP" function="OA">
        <word text="leather seats" pos="NN"/>
      </synPhrase>
      <punct text="." pos="."/>
    </synPhrase>
  </infoStruct>
</sentence>

M-NLG
TTS/CTS
GA
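
To illustrate how a consumer such as a CTS module might read these layered annotations, here is a small sketch using Python's standard `xml.etree.ElementTree`. It assumes a well-formed reading of the markup on this slide (angle brackets restored, every element closed); the function name `annotate_words` is hypothetical, and nothing here reflects the actual NECA implementation.

```python
import xml.etree.ElementTree as ET

# Well-formed reading of the slide's example (an assumption about the
# original markup, whose angle brackets were lost in the transcript).
CTS_XML = """
<sentence>
  <text>This car has leather seats.</text>
  <gesture modality="voice" meaning="beautiful"/>
  <infoStruct part="theme">
    <synPhrase category="NP" function="SB">
      <word text="This" pos="PDAT"/>
      <infoStatus type="referent-given">
        <word text="car" pos="NN"/>
      </infoStatus>
    </synPhrase>
  </infoStruct>
  <infoStruct part="rheme">
    <synPhrase phrase="VP" function="PD">
      <word text="has" pos="VAFIN"/>
      <synPhrase phrase="NP" function="OA">
        <word text="leather seats" pos="NN"/>
      </synPhrase>
      <punct text="." pos="."/>
    </synPhrase>
  </infoStruct>
</sentence>
"""


def annotate_words(element, part=None, status=None):
    """Yield (text, pos, theme/rheme, information status) for each word.

    The enclosing infoStruct and infoStatus values are passed down the
    recursion, so each word inherits the annotations that wrap it.
    """
    if element.tag == "infoStruct":
        part = element.get("part")
    elif element.tag == "infoStatus":
        status = element.get("type")
    elif element.tag == "word":
        yield (element.get("text"), element.get("pos"), part, status)
    for child in element:
        yield from annotate_words(child, part, status)


if __name__ == "__main__":
    for row in annotate_words(ET.fromstring(CTS_XML)):
        print(row)
```

Each word ends up with its part-of-speech tag plus the theme/rheme and given/new labels of the elements that enclose it, which is the kind of conceptual input that plain text-to-speech does not have.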
30
Prosodic/Phonetic Information for GA
SG

Phonetics
exact timing of speech sounds, pauses and
interjections
Prosody
boundary locations for
syllables
words
prosodic phrases

M-NLG
TTS/CTS
GA
31
Prosodic/Phonetic Information for GA
SG

information on
syllables bearing word-stress
position and type of sentence accents
position and type of prosodic boundaries

M-NLG
TTS/CTS
GA
32
Animation directives
SG

Phonetic information (phonemes) used for
specifying
Visemes
breathing

M-NLG
TTS/CTS
GA
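
As a rough illustration of how timed phonemes could drive visemes and breathing, the sketch below maps each phoneme to a mouth shape and treats long pauses as slots for a breath. The phoneme symbols, the `PHONEME_TO_VISEME` table, the pause marker `_` and the 0.2 s threshold are all assumptions made for this example, not values from NECA.

```python
from typing import List, Tuple

# (symbol, start time in seconds, end time in seconds)
TimedPhoneme = Tuple[str, float, float]

PAUSE = "_"  # hypothetical marker for silence in the phoneme stream

# Hypothetical phoneme-to-viseme table; real tables are much larger.
PHONEME_TO_VISEME = {
    "p": "closed", "b": "closed", "m": "closed",
    "f": "lip-teeth", "v": "lip-teeth",
    "a": "open", "i": "spread", "u": "rounded",
    PAUSE: "rest",
}


def viseme_track(phonemes: List[TimedPhoneme]) -> List[Tuple[str, float, float]]:
    """Map phonemes to visemes, merging adjacent identical mouth shapes."""
    track: List[Tuple[str, float, float]] = []
    for symbol, start, end in phonemes:
        viseme = PHONEME_TO_VISEME.get(symbol, "neutral")
        if track and track[-1][0] == viseme and track[-1][2] == start:
            # Same shape continues without a gap: extend the last segment.
            track[-1] = (viseme, track[-1][1], end)
        else:
            track.append((viseme, start, end))
    return track


def breath_slots(phonemes: List[TimedPhoneme],
                 min_pause: float = 0.2) -> List[Tuple[float, float]]:
    """Return pauses long enough to place a breathing animation in."""
    return [(start, end) for symbol, start, end in phonemes
            if symbol == PAUSE and end - start >= min_pause]


if __name__ == "__main__":
    demo = [("m", 0.0, 0.08), ("a", 0.08, 0.2), ("m", 0.2, 0.28),
            ("p", 0.28, 0.35), (PAUSE, 0.35, 0.6)]
    print(viseme_track(demo))
    print(breath_slots(demo))
```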
33
Animation directives
SG

Prosodic information (stress, accents, phrasing)
used for specifying
synchronization of gestures with speech
eye-blinking
gaze

M-NLG
TTS/CTS
GA
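
One way to picture the synchronization of gestures with speech is to time a gesture so that its stroke falls on the first accented syllable of a phrase. The sketch below does only that; the `Syllable` class, the `schedule_gesture` function and the fixed preparation time are hypothetical and stand in for whatever strategy the Gesture Assignment module actually uses.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Syllable:
    start: float      # onset time in seconds
    end: float        # offset time in seconds
    accented: bool    # True if the syllable carries a sentence accent


def schedule_gesture(syllables: List[Syllable],
                     preparation: float = 0.25) -> Optional[Tuple[float, float]]:
    """Return (gesture start, stroke time) aligned to the first accent.

    The stroke is placed at the onset of the first accented syllable and
    the gesture starts `preparation` seconds earlier, but never before
    time zero. Returns None if no syllable is accented.
    """
    for syllable in syllables:
        if syllable.accented:
            stroke = syllable.start
            return (max(0.0, stroke - preparation), stroke)
    return None


if __name__ == "__main__":
    phrase = [Syllable(0.0, 0.25, False),
              Syllable(0.25, 0.5, False),
              Syllable(0.5, 0.75, True)]
    print(schedule_gesture(phrase))
```

The same accent and boundary times could equally be used to place eye-blinks or gaze shifts, which the slide lists alongside gestures.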
34
Conclusions

RRL is a representation language for the wide
range of expert knowledge required at the
interfaces of NECA modules.
Scene Descriptions: uniform representation and
integration of different types of information
(illustrated with the integration of DRT) using
handles
Speech Synthesis: conceptually rich input as
opposed to text
Gesture Assignment: access to exact timing of
speech