1. Artificial Companions: Explorations in machine personality and dialogue
- Yorick Wilks
- Computer Science, University of Sheffield, and Oxford Internet Institute
- MLMI04, Martigny CH, June 2004
2. What the talk contains
- Two natural language technologies I work within:
- Human dialogue modelling
- Information extraction from the web
- What drives NLP dialogue models: ML, speech?
- Conversational agents as essential for:
- personalizing the web
- making it tractable
- Companions for the non-technical as a cosier kind of persistent agent
- For niche groups, some of them non-technical or handicapped
- As an interface to the web
- As an interface to their stored lives
3. Machine dialogue: problems with available theory
- Dialogue is the Cinderella of NLP
- It can be vacuous: dialogues are systems of turn-taking
- Speech act analysis initially led to implausibly deep levels of reasoning--you don't need plans to sell an air ticket
- For some researchers, dialogue theory is still a question of how best to deploy logic
- Much conversation is not task-oriented at all, nor does it have plausible info-states
4. Important historical systems have all the modern traits and functionalities in miniature
- Colby's PARRY (Stanford, 1971)
- Winograd's SHRDLU (MIT, 1971)
- Perrault, Cohen and Allen's speech act system (Toronto, 1979)
5. Colby's PARRY
- Perhaps the best performance ever: many users, robust, but not a normal subject (i.e. a paranoid)
- Primitive individual models, some control of the dialogue process--but it had lots to say!
- Primitive simulation of intentionality, and emotion in output choice
- No syntax analysis, but fast pattern matching
- Far, far better than ELIZA
6. PARRY conversation
- Have you been hospitalized before?
- THIS IS THE FIRST TIME
- How long have you been there?
- ABOUT TWO WEEKS
- Any headaches?
- MY HEALTH IS FINE
- Are you having memory difficulties?
- JUST A FEW
7. SHRDLU example: "Does the little block that the hatched pyramid's support supports support anything black?"
8. Perrault, Cohen and Allen at Toronto
- System has domain knowledge and reasoning power
- Was the first to assign speech act labels to dialogue items
- User must have one of two goals: meeting or catching a train
- Passenger/User: "Do you know when the Windsor train arrives?"
- This is labelled as a REQUEST, not a REQUEST-INFORM (Y/N), because the system knows the user knows it knows!
- But again, virtually no performance
9. 1970s division of approaches to machine conversation
- Domain-dependent systems with coded world knowledge and some parsing and reasoning
- VERSUS
- Wide, shallow systems with little knowledge and high performance
- Published AI academic systems were all in the first group
- Only the second group (i.e. PARRY) performed at all
10. Academic systems have moved towards performance
- Best is Traum's TRAINS system--a descendant of Allen's work (Toronto-Rochester tradition)
- Semi-empiricist: uses corpora but retains reasoning
- Also TRINDIKIT at Gothenburg/Edinburgh, which uses a large rule base
- Pressure from ASR advances
- Pressure from the Loebner competition to perform?
11. AI people joined the performers: the Loebner Competition
- Well-designed competition over the last ten years--essentially a modified Turing test
- Domain-dependent entries till 1998, then domain-free
- The game is for journalists to separate and rank human and machine partners
- Search "Loebner" on the web for full transcripts
12. Sheffield University/Intelligent Systems won in 1998!
- The CONVERSE program pretended to be Catherine, a 34-year-old British journalist living in New York (to explain any usage problems)
- She could talk about 80 appropriate topics
- (80 topics = domain-free???) clothes, shopping, food, abortion, fitness, sex, etc.
13. Loebner Competition 1998--Sheffield's winning entry!
- PROGRAM [10:18:04] Hi, my name is Catherine, what's yours?
- JUDGE04 [10:18:32] My name is Elliott.
- PROGRAM [10:18:34] Well, it's nice to meet you, Elliott.
- PROGRAM [10:18:48] Did you see that story on CNN last night about the lesbian
- PROGRAM [10:19:04] couple who came out at a White House party on Sunday?
- JUDGE04 [10:19:55] Yes, I did. I think it may all be a publicity stunt for Ellen.
- PROGRAM [10:20:01] Ellen Degeneres was one of them she was kissing her lover
- PROGRAM [10:20:11] in front of Bill Clinton. Do you think he was very
- PROGRAM [10:20:25] embarrassed?
- JUDGE04 [10:21:00] I think he want the gay vote. He will do anything to get it and keep it.
- JUDGE04 [10:21:15] He does not have a conservative base.
- PROGRAM [10:21:21] Are you still a Friend of Bill Clinton?
14. The CONVERSE prototype 1998
- "Push-me-pull-you" architecture
- Strong driving top-down scripts (80) in a re-enterable network with complex output functions
- Bottom-up parsing of user input adapted from a statistical prose parser
- Minimal models of individuals
- Contained WordNet and Collins proper names
- Some machine learning from past Loebners and the BNC
- It owed more to PARRY than to Toronto!
15. Sheffield dialogue circa 2002
- Empirical corpus-based stochastic dialogue grammar that maps utterances directly to dialogue acts, and uses IE to match concepts with templates to provide semantic content
- A better virtual machine for script-like (DAF) objects encapsulating both the domain moves and conversational strategy (cf. PARRY and Grosz), to maintain the push-pull (alias mixed-initiative) approach
- The Dialogue Action Frames provide domain context, and the stack provides topic change and re-access to partially fulfilled DAFs
16. Resources vs. highest-level structure
- Need for resources to build belief-system representations and quasi-linguistic models of dialogue structure, scripts etc., and to provide a base for learning optimal Dialogue Act assignments
- A model of speakers, incrementally reaching VIEWGEN-style ascription-of-belief procedures to give dialogue act reasoning functionality
- Cf. A. Ballim and Y. Wilks (1991), Artificial Believers, Erlbaum
17. How this research is funded
- AMITIES is an EU-US cooperative R&D project (2001-2005) to automate call centers
- University of Sheffield (EU prime)
- SUNY Albany (US prime)
- Duke U. (US)
- LIMSI Paris (Fr)
- IBM (US)
- COMIC is an EU R&D project (2001-2005) to model multimodal dialogue
- Max Planck Inst. (Nijmegen) (Coordinator)
- University of Edinburgh
- Max Planck Inst. (Tuebingen)
- KUL Nijmegen
- University of Sheffield
- ViSoft GmbH
18. COMIC
- Three-year project
- Focussed on multimodal dialogue
- Speech and pen input/output
- Bathroom design application
- Helps the customer to make bathroom design decisions
- Will be based on existing bathroom design software
- Spoken output is done with a talking head which includes facial expressions etc.
19. Design of a Dialogue Action Manager
- General-purpose DAM where domain-dependent features are separated from the control structure
- The domain-dependent features are stored as Dialogue Action Frames (DAFs), which are similar to Augmented Transition Networks (ATNs)
- The DAFs represent general-purpose dialogue manoeuvres as well as application-specific knowledge
- The control mechanism is based on a basic stack structure, where DAFs are pushed and popped during the course of a user session
- The control mechanism together with the DAFs provides a flexible means for guiding the user through the system goals (allowing for topic change and barge-in where needed)
- User "push" is given by the ability to suspend and stack a new DAF at any point (for a topic change or any user manoeuvre)
- System "push" is given by the pre-stacked DAFs corresponding to what the system wants to show or elicit
- Research question: how much of the stack's unpopped DAFs can/should be re-accessed (cf. Grosz's limits on reachability)?
20. Dialogue Management
- DAFs model the individual topics and conversational manoeuvres in the application domain
- The stack structure will be preloaded with those DAFs which are necessary for the COMIC bathroom design task, and the dialogue ends when the Good-bye DAF is popped
- The DAF and stack interpreters together control the flow of the dialogue
- Greeting DAF
- Room measurement DAF
- Style DAF
- Good-bye DAF
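The push/pop control regime above can be sketched in a few lines. This is a minimal illustrative toy, not the COMIC implementation; all class names, prompts and the single-prompt-per-turn loop are assumptions, and real DAFs are ATN-like networks rather than flat prompt lists.

```python
class DAF:
    """A Dialogue Action Frame, reduced here to a named list of
    remaining system moves (real DAFs are ATN-like networks)."""
    def __init__(self, name, prompts):
        self.name = name
        self.prompts = list(prompts)

    def next_prompt(self):
        return self.prompts.pop(0)

    @property
    def finished(self):
        return not self.prompts

class DialogueManager:
    def __init__(self, preloaded):
        # System "push": pre-stacked DAFs for the task (top of stack = end of list).
        self.stack = list(preloaded)

    def user_push(self, daf):
        # User "push": a topic change suspends the current DAF by
        # stacking a new one on top of it.
        self.stack.append(daf)

    def step(self):
        # Discard finished frames; the dialogue ends when the stack
        # empties, i.e. when the Good-bye DAF is popped.
        while self.stack and self.stack[-1].finished:
            self.stack.pop()
        if not self.stack:
            return None
        return self.stack[-1].next_prompt()

# The bathroom-design stack from the slide, pushed in reverse order so
# Greeting is handled first and Good-bye last.
dm = DialogueManager([
    DAF("Good-bye", ["Goodbye!"]),
    DAF("Style", ["Which tile style do you prefer?"]),
    DAF("Room measurement", ["How wide is the room?", "And how long?"]),
    DAF("Greeting", ["Hello, let's design your bathroom."]),
])

turns = []
t = dm.step()
while t is not None:
    turns.append(t)
    t = dm.step()
```

A user-initiated topic change would call `user_push` mid-session, suspending the current frame until the new one is popped, which is exactly the mixed-initiative behaviour the slides describe.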
21. DAF example
22. Current work: learning to segment the dialogue corpora
- Segmenting the corpora we have with a range of tiling-style and MDL algorithms (by topic and by strategic manoeuvre)
- To segment them plausibly, hopefully into segments that correspond to structures for DM (i.e. Dialogue Action Frames)
- Being done on the annotated corpus (i.e. a corpus word model) and on the corpus annotated by Information Extraction semantic tags (a semantic model of the corpus)
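A tiling-style segmenter of the kind mentioned above hypothesises topic boundaries where lexical cohesion between adjacent windows of utterances drops. The toy below is only a sketch of that idea, not the project's algorithm; the window size, threshold and mini-dialogue are all illustrative assumptions.

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    num = sum(a[w] * b[w] for w in a if w in b)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def segment(utterances, window=2, threshold=0.3):
    """Return indices i such that a topic boundary is hypothesised
    before utterance i (cohesion between the two windows around i
    falls below the threshold)."""
    boundaries = []
    for i in range(window, len(utterances) - window + 1):
        left = Counter(w for u in utterances[i - window:i] for w in u.lower().split())
        right = Counter(w for u in utterances[i:i + window] for w in u.lower().split())
        if cosine(left, right) < threshold:
            boundaries.append(i)
    return boundaries

calls = [
    "i want to check my balance",
    "your balance is two hundred pounds",
    "can i also change my address",
    "sure what is the new address",
]
# The vocabulary shifts from balances to addresses between turns 2 and 3.
print(segment(calls))  # → [2]
```

Each hypothesised segment would then be a candidate for a DAF in the sense of the previous slides.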
23. AMITIÉS Objectives
- Call center/customer access automation
- Multilingual access to customer information and services
- Now: speech over the telephone (call centers)
- Later: speech, text and pointing over the Internet (e-service)
- Multilingual natural language dialogue
- Unscripted, spontaneous conversation
- Models derived from real call center data
- Tested and verified in a real call center environment
- Showcase applications at real call centers
- Financial services centers (English, French, German)
- Expand into public service/government applications (US, EC)
24. Corpora
- GE Financial call centres
- 1k English calls (transcribed, annotated)
- 1k French calls (transcribed, annotated)
- IBM software support call centre
- 5k English calls (transcribed)
- 5k French calls (transcribed)
- AGF insurance claim call centre
- 5k French calls (recording)
- VIEL et CIE
- 100 French calls (transcribed, annotated)
25. AMITIÉS System
- Data-driven dialogue strategy
- Similar to Colorado's Communicator system
- Statistical: a dialogue transition graph is derived from a large body of transcribed, annotated conversations
- Task and ID identification
- Task identification: automatically trained vector-based approach (Chu-Carroll & Carpenter 1999)
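The vector-based routing idea attributed to Chu-Carroll & Carpenter can be sketched as follows: each task is a term vector built from training calls, and a new caller utterance goes to the most similar task. The task names, training utterances and bag-of-words preprocessing here are illustrative assumptions, not the AMITIÉS setup.

```python
from collections import Counter
import math

def vec(texts):
    """Bag-of-words term vector over a list of example utterances."""
    return Counter(w for t in texts for w in t.lower().split())

def cosine(a, b):
    num = sum(a[w] * b[w] for w in a if w in b)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

# Hypothetical routing tasks, each trained from a few example calls.
tasks = {
    "balance-enquiry": vec(["what is my balance", "how much is in my account"]),
    "address-change": vec(["i moved house", "please update my address"]),
}

def identify(utterance):
    """Route a new utterance to the task with the closest vector."""
    u = vec([utterance])
    return max(tasks, key=lambda t: cosine(u, tasks[t]))

print(identify("could you tell me my account balance"))  # → balance-enquiry
```

In a real router the vectors would be weighted (e.g. tf-idf style) and trained from thousands of transcribed calls rather than two examples per task.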
26. Sheffield does the post-ASR fusion in AMITIES
- Language understanding
- Use of ANNIE IE for robust extraction
- Partial matching (creates a list of possible entities)
- Dialogue Act classifier
- Recognises domain-independent dialogue acts
- Works well (86% accuracy) for a subset of Dialogue Act labels
27. Evaluation
- 10 native speakers of English
- Each made 9 calls to the system, following scenarios they were given
- Overall call success was 70%
- Compare this to Communicator scores of 56%
- Similar number of concepts/scenario (9)
- Word error rates:
- 17% for successful calls
- 22% for failed calls
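Word error rate figures like these are standardly computed as the word-level edit (Levenshtein) distance between reference transcript and ASR hypothesis, divided by the reference length. A minimal implementation of that standard metric:

```python
def wer(reference, hypothesis):
    """Word error rate: edit distance over words / reference length."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # deletions only
    for j in range(len(h) + 1):
        d[0][j] = j  # insertions only
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

# One substitution in a six-word reference gives WER ≈ 0.17, roughly
# the 17% reported for successful calls.
print(wer("i want to book a flight", "i want to book a fight"))
```

The example sentences are invented for illustration; the real figures come from the evaluation transcripts.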
28. Evaluation: interesting numbers
- Avg. num. turns/dialogue: 18.28
- Avg. num. words/user turn: 6.89
- High in comparison to Communicator scores, reflecting:
- Lengthier responses to open prompts
- Responses to requests for multiple attributes
- Greater user initiative
- Avg. user satisfaction score: 20.45 (range 5-25)
29. Learning to tag for Dialogue Acts: initial work
- Samuel et al. (1998): TBL learning on n-gram DA cues, Verbmobil corpus (75%)
- Stolcke et al. (2000): full language modelling (including DA sequences), more complex Switchboard corpus (71%)
30. Starting with a naive classifier for DAs
- Direct predictivity of DAs by n-grams as a preprocess to any ML algorithm
- Get P(d|n) for all 1-4 word n-grams and the DA set over the Switchboard corpus, and take the DA indicated by the n-gram with the highest predictivity (thresholded for probability level)
- Do 10-fold cross-validation (which lowers scores)
- Gives a best cross-validated score of around 63% over Switchboard, but using only some of the data Stolcke needed
- Single highest score currently 71.2%--higher than that reported in Stolcke
- Up to 86% with a small (6) DA set
31. Extending the pretagging with TBL
- Gives 66% (Stolcke's 71%) over the Switchboard data, but only 3% is due to TBL (rather than the naive classifier)
- Samuel was unable to see what TBL was doing for him
- This is just a base for a range of more complex ML algorithms (e.g. WEKA)
32. Dialogue Research Challenges
- Will a dialogue manager raise the DA 75/85% ceiling top-down?
- Multimodal dialogue managers: are they completely independent of the modality? Are they really language-independent?
- What is the best virtual machine for running a dialogue engine? Do DAFs + stack provide a robust and efficient mechanism for doing dialogue management, e.g. topic change (vs. simple rule systems)?
- Will they offer any interesting discoveries on stack access to, and discarding of, incomplete topics (cf. stacks and syntax)?
- Applying machine learning to transcripts so as to determine the content of dialogue management, i.e. the scope and content of candidate DAFs
- Can the state set of DAFs and a stack be trained with reinforcement learning (like a finite-state matrix)?
- Can we add a strong belief/planning component to this and populate it empirically?
- Fusion with QA functionality
33. What is the most structure that might be needed, and how much of it can be learned?
- Steve Young (Cambridge) says: learn all modules, with no need for rich a priori structures (cf. MT history and Jelinek at IBM)
- Availability of data (dialogue is unlike MT)?
- Learning to partition the data into structures
- Learning the semantic/speech act interpretation of inputs alone has now reached a (low) ceiling (75/85%)
34. Young's strategy is not quite like Jelinek's MT strategy of 1989!
- Which was non/anti-linguistic, with no intermediate representations hypothesised
- Young assumes roughly the same intermediate objects as we do, but in very simplified forms
- The aim is to obtain training data for all of them, so the whole process becomes a single Partially Observable Markov model
- It remains unclear how to train complex state models that may not represent tasks, let alone belief and intention models
35. There are now four, not two, competing approaches to machine dialogue in NLP
- Logic-based systems with reasoning (traditional, and still unvalidated by performance)
- Extensions of speech engineering methods: machine learning and no structure (new)
- Simple hand-coded finite-state systems in VoiceXML (chatbots and commercial systems)
- Rational hybrids based on structure and machine learning
36. Modes of dialogue with machine agents
- Current mode of phone/multimodal interactions at terminals
- The internet (possibly becoming the Semantic Web) will be for machine agents that understand its content, and with which users dialogue, e.g. "Find me the best camera under 500"
- Interaction with mobile phone agents (more or less monomodal)
- Some or all of these services as part of the function of persistent, more personal, cosy, lifelong Companion agents
37. The Companions: a new economic and social goal for dialogue systems
38. An idea for integrating the dialogue research agenda in a new style of application...
- That meets social and economic needs
- That is not simply a product, but everyone will want one if it succeeds
- That cannot be done now, but could be in a few years via a series of staged prototypes
- That modularises easily for large-project management, and whose modules cover the research issues
- Whose speech and language technology components are now basically available
39. A series of intelligent and sociable COMPANIONS
- The SeniorCompanion
- The EU will have more and more old people who find technological life hard to handle, but who will have access to funds
- The SC will sit beside you on the sofa, but be easy to carry about--like a furry handbag--not a robot
- It will explain the plots of TV programs and help choose them for you
- It will know you and what you like and don't
- It will send your messages, make calls and summon emergency help
- It will debrief your life
41. Other COMPANIONS
- The JuniorCompanion
- Teaches and advises, maybe from a backpack
- Warns of dangerous situations
- Helps with homework and web search
- Helps with languages
- Always knows where the child is
- Explains ambient signals and information
- It's what e-learning might really mean!
43. The Senior Companion is a major technical and social challenge
- It could represent old people as their agent and help in difficult situations, e.g. with landlords, or guess when to summon human assistance
- It could debrief an elderly user about events and memories in their lives
- It could aid them to organise their life-memories (this is now hard!) (see Lifelog and Memories for Life)
- It would be a repository for relatives later
- Has Loebner chat aspects as well as information--it is to divert, like a pet, not just inform
- It is a persistent and personal social agent interfacing with Semantic Web agents
44. Other issues for Companions we can hardly begin to formulate
- Companion identity as an issue that can be settled many ways--like that of the owner's web identity, now a hot issue?
- Responsibilities of Companion agents--to whom?
- Communications between agents, and our access to them
- Are simulations of emotional behaviour or politeness desirable in a Companion?
- Protection of the vulnerable (young and old here)
- What happens to your Companion when you are gone?
45. Companions and the Web
- A new kind of agent as the answer to a passive web
- The web/internet must become more personal to be tractable as it gets bigger (and more structured, or unstructured?)
- Personal agents will need to be autonomous and trusted (like spacecraft on missions)
- But also personal and persistent, particularly for large sections of populations now largely excluded from the web
- The Semantic Web is a start at structuring the web for comprehension and activity, but web agents are currently abstract and transitory
- The old are a good group to start with (growing, and with funds)
46. The technologies for a Companion are all there already
- ASR for a single user (who may be dysarthric)
- Ascribing personality? Remember Tamagotchi?
- Quite intelligent people rushed home to feed one (and later Furby), even though they knew it was a simple empty mechanism
- And Tamagotchi could not even talk!
- People with pets live longer
- Wouldn't you like a warm pet to remind you what happened in the last episode of your favourite TV soap?
- No? OK, but perhaps millions of your compatriots would?!
47. This isn't just about furry talking handbags on sofas, but about any persistent and personalised entity that will interface to information sources--phones above all--and deal with the web in a more personal manner. "...claim the internet is killing their trade because customers seem to prefer an electronic serf with limitless memory and no conversation." (Guardian, 8.11.03)
48. Conclusions
- Companions are a plausible binding concept for exploring and evaluating a richer concept of human-machine interaction (useful too!)
- Interactions beyond simple task-driven dialogues
- That require more interesting theories underpinning them, even ones we cannot immediately see how to reinforce/learn
- Interactions with persistent personality, affect, emotion, interesting beliefs and goals
- Above all, we need a more sophisticated and generally accepted evaluation regime