Artificial Companions: Explorations in machine personality and dialogue

1
Artificial Companions Explorations in machine
personality and dialogue
  • Yorick Wilks
  • Computer Science, University of Sheffield
  • and
  • Oxford Internet Institute
  • MLMI04, Martigny CH, June 2004

2
What the talk contains
  • Two natural language technologies I work within
  • Human dialogue modelling
  • Information extraction from the web
  • What drives NLP dialogue models: ML, speech?
  • Conversational agents as essential for
  • personalizing the web
  • making it tractable
  • Companions for the non-technical as a cosier kind
    of persistent agent
  • For niche groups, some of them non-technical, or
    handicapped
  • As an interface to the web
  • As an interface to their stored lives

3
Machine dialogue: problems with available theory
  • Dialogue: the Cinderella of NLP
  • It can be vacuous: "dialogues are systems of
    turn-taking"
  • Speech act analysis initially led to
    implausibly deep levels of reasoning: you don't
    need plans to sell an air ticket.
  • For some researchers, dialogue theory is still a
    question of how best to deploy logic
  • Much conversation is not task orientated at all,
    nor does it have plausible info-states.

4
Important historical systems have all the modern
traits and functionalities in miniature
  • Colby's PARRY (Stanford, 1971)
  • Winograd's SHRDLU (MIT, 1971)
  • Perrault, Cohen and Allen's speech act system
    (Toronto, 1979)

5
Colby's PARRY
  • Perhaps best ever performance, many users,
    robust, but not a normal subject (i.e. a
    paranoid)
  • primitive individual models, some control of
    dialogue process but it had lots to say!
  • Primitive simulation of intentionality, and
    emotion in output choice
  • not syntax analysis but fast pattern matching
  • Far, far, better than ELIZA

6
PARRY conversation
  • Have you been hospitalized before?
  • THIS IS THE FIRST TIME
  • How long have you been there?
  • ABOUT TWO WEEKS
  • Any headaches?
  • MY HEALTH IS FINE
  • Are you having memory difficulties?
  • JUST A FEW

7
Does the little block that the hatched pyramid's
support supports support anything black?
8
Perrault, Cohen & Allen at Toronto
  • System has domain knowledge and reasoning power
  • was the first to assign speech act labels to
    dialogue items
  • User must have one of two goals, meeting or
    catching a train
  • Passenger/User: Do you know when the Windsor
    train arrives?
  • This is labelled as a REQUEST not a
    REQUEST-INFORM (Y/N) because the system knows the
    user knows it knows!
  • but again virtually no performance

9
1970s division of approaches to machine
conversation.
  • Domain-dependent systems with coded world
    knowledge and some parsing and reasoning
  • VERSUS
  • Wide shallow systems with little knowledge and
    high performance
  • Published AI academic systems all in first group
  • Only the second group (i.e. PARRY) performed at
    all

10
Academic systems have moved towards performance
  • Best is Traum's TRAINS system, a descendant of
    Allen's work (Toronto-Rochester tradition)
  • Semi-empiricist: uses corpora but retains
    reasoning
  • Also TRINDIKIT at Gothenburg/Edinburgh: uses a
    large rule base
  • Pressure from ASR advances
  • Pressure from the Loebner competition to perform?

11
AI people joined the performers: the Loebner
Competition
  • Well designed competition over the last ten
    years: essentially a modified Turing test.
  • Domain dependent entries till 1998, then domain
    free.
  • The game is for journalists to separate and rank
    human and machine partners
  • Search Loebner on web for full transcripts

12
Sheffield University/Intelligent Systems won in
1998!
  • The CONVERSE program pretended to be Catherine, a
    34-year-old British journalist living in New York
    (to explain any usage problems).
  • She could talk about 80 appropriate topics
  • (80 = domain free???): clothes, shopping, food,
    abortion, fitness, sex, etc.

13
Loebner Competition 1998: Sheffield's winning
entry!
  • PROGRAM[10:18:04] Hi, my name is Catherine, what's
    yours?
  • JUDGE04[10:18:32] My name is Elliott.
  • PROGRAM[10:18:34] Well, it's nice to meet you,
    Elliott.
  • PROGRAM[10:18:48] Did you see that story on CNN
    last night about the lesbian
  • PROGRAM[10:19:04] couple who came out at a White
    House party on Sunday?
  • JUDGE04[10:19:55] Yes, I did. I think it may all
    be a publicity stunt for Ellen.
  • PROGRAM[10:20:01] Ellen Degeneres was one of them
    she was kissing her lover
  • PROGRAM[10:20:11] in front of Bill Clinton. Do you
    think he was very
  • PROGRAM[10:20:25] embarrassed?
  • JUDGE04[10:21:00] I think he want the gay vote. He
    will do anything to get it and keep it.
  • JUDGE04[10:21:15] He does not have a conservative
    base.
  • PROGRAM[10:21:21]
  • PROGRAM[10:21:21] Are you still a Friend of Bill
    Clinton?

14
The CONVERSE prototype 1998
  • Push-me-pull-you architecture
  • strong driving top-down scripts (80) in a
    re-enterable network with complex output
    functions
  • bottom-up parsing of user input adapted from
    statistical prose parser
  • minimal models of individuals
  • contained WordNet and Collins Proper Names
  • some machine learning from past Loebners and the BNC
  • It owed more to PARRY than to Toronto!

15
Sheffield dialogue circa 2002
  • Empirical corpus-based stochastic dialogue
    grammar that maps utterances directly to
    dialogue acts and uses IE to match concepts with
    templates to provide semantic content.
  • A better virtual machine for script-like (DAF)
    objects encapsulating both the domain moves and
    conversational strategy (cf. PARRY and Grosz) to
    maintain the push-pull (alias mixed-initiative)
    approach.
  • The Dialogue Action Frames provide domain
    context, and the stack topic change and reaccess
    to partially fulfilled DAFs

16
Resources vs. highest level structure
  • Need for resources to build belief system
    representations and quasi-linguistic models of
    dialogue structure, scripts etc., and to provide
    a base for learning optimal Dialogue Act
    assignments
  • A model of speakers, incrementally reaching
    VIEWGEN style ascription of belief procedures to
    give dialogue act reasoning functionality
  • Cf. A. Ballim & Y. Wilks, 1991, Artificial
    Believers, Erlbaum.

17
How this research is funded
  • AMITIES is an EU-US cooperative R&D project
    (2001-2005) to automate call centers.
  • University of Sheffield (EU prime)
  • SUNY Albany (US prime)
  • Duke U. (US)
  • LIMSI Paris (Fr)
  • IBM (US)
  • COMIC is an EU R&D project (2001-2005) to
    model multimodal dialogue
  • Max Planck Inst (Nijmegen) (Coordinator)
  • University of Edinburgh
  • Max Planck Inst (Tuebingen)
  • KUL Nijmegen
  • University of Sheffield
  • ViSoft GmbH

18
COMIC
  • Three-year project
  • Focussed on Multi Modal Dialogue
  • Speech and pen input/output
  • Bathroom Design Application
  • Helps the customer to make bathroom design
    decisions
  • Will be based on existing bathroom design
    software
  • Spoken output is done with a talking head which
    includes facial expressions etc.

19
Design of a Dialogue Action Manager
  • General purpose DAM where domain dependent
    features are separated from the control
    structure.
  • The domain dependent features are stored as
    Dialogue Action Frames (DAFs) which are similar
    to Augmented Transition Networks (ATNs)
  • The DAFs represent general purpose Dialogue
    manoeuvres as well as application specific
    knowledge.
  • The control mechanism is based on a basic stack
    structure where DAFs are pushed and popped during
    the course of a user session.
  • The control mechanism together with the DAFs
    provide a flexible means for guiding the user
    through the system goals (allowing for topic
    change and barge-in where needed).
  • User push is given by the ability to suspend
    and stack a new DAF at any point (for a topic or
    any user maneuver)
  • System push is given by the pre-stacked DAFs
    corresponding to what the system wants to show or
    elicit.
  • Research question of how much of the stack's
    unpopped DAFs can/should be reaccessed (cf. Grosz's
    limits on reachability).
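
The stack control described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `DAF` and `DialogueActionManager`, the slot-filling notion of a "complete" frame, and the example frames are all hypothetical and are not taken from the COMIC system itself.

```python
from dataclasses import dataclass, field


@dataclass
class DAF:
    """Hypothetical Dialogue Action Frame: a named topic with slots to fill."""
    name: str
    slots: dict = field(default_factory=dict)  # slot name -> value, or None

    def is_complete(self) -> bool:
        return all(value is not None for value in self.slots.values())


class DialogueActionManager:
    """Stack-based control, kept separate from the domain-specific DAFs."""

    def __init__(self, prestacked):
        # System push: the first DAF in the list ends up on top of the stack.
        self.stack = list(reversed(prestacked))
        self.finished = False

    def current(self):
        return self.stack[-1] if self.stack else None

    def user_push(self, daf):
        # User push: suspend the current DAF by stacking a new one above it.
        self.stack.append(daf)

    def fill(self, slot, value):
        # Fill a slot in the current DAF, then pop every completed DAF on top.
        self.current().slots[slot] = value
        popped = []
        while self.stack and self.current().is_complete():
            daf = self.stack.pop()
            popped.append(daf.name)
            if daf.name == "Goodbye":  # assumed convention: this ends the session
                self.finished = True
        return popped


dam = DialogueActionManager([
    DAF("Greeting", {"user_name": None}),
    DAF("Room measurement", {"width": None, "length": None}),
    DAF("Goodbye"),
])
dam.fill("user_name", "Ann")                 # Greeting is completed and popped
dam.user_push(DAF("Style", {"style": None}))  # user changes topic mid-task
dam.fill("style", "modern")                  # Style popped, measurement resumes
```

The sketch only ever looks at the top of the stack. The open question on the slide, how far down unpopped DAFs may be reaccessed, would correspond to letting `fill` search below the top element.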

20
Dialogue Management
  • DAFs model the individual topics and
    conversational manoeuvres in the application
    domain.
  • The stack structure will be preloaded with those
    DAFs which are necessary for the COMIC bathroom
    design task and the dialogue ends when the
    Goodbye DAF is popped.
  • DAFs and stack interpreters together control the
    flow of the dialogue

[Figure: preloaded DAF stack showing Greeting DAF, Room measurement DAF, Style DAF and Good-bye DAF]
21
DAF example
22
Current work: learning to segment the dialogue
corpora
  • Segmenting the corpora we have with a range of
    tiling-style and MDL algorithms (by topic and by
    strategic maneuver)
  • To segment it plausibly, hopefully into segments
    that correspond to structures for DM (i.e.
    Dialogue Action Frames)
  • Being done on the annotated corpus (i.e. a corpus
    word model) and on the corpus annotated by
    Information Extraction semantic tags (a semantic
    model of the corpus)
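
As a rough illustration of what a tiling-style segmenter does, the sketch below compares the vocabulary of the turns just before and just after each gap, and cuts where the two share almost nothing. It is a deliberately simplified stand-in, not the algorithm used in this work and not an MDL method: the function names, the window size, the fixed similarity threshold and the toy turns are all assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment(utterances, window=2, threshold=0.1):
    """Split a dialogue where adjacent windows of turns share few words."""
    bags = [Counter(u.lower().split()) for u in utterances]
    boundaries = []
    for gap in range(1, len(bags)):
        left = sum(bags[max(0, gap - window):gap], Counter())
        right = sum(bags[gap:gap + window], Counter())
        if cosine(left, right) < threshold:  # assumed fixed cut-off
            boundaries.append(gap)
    cuts = [0] + boundaries + [len(utterances)]
    return [utterances[start:end] for start, end in zip(cuts, cuts[1:])]


# Invented turns covering two topics.
turns = [
    "white bath please",
    "white bath with chrome taps",
    "chrome taps suit white tiles",
    "check my account balance",
    "account balance looks wrong",
    "balance shows wrong amount",
]
segments = segment(turns)
```

Running the same procedure over semantic tags instead of words would only require changing how each turn is turned into a bag of symbols.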

23
AMITIÉS Objectives
  • Call Center/Customer Access Automation
  • multilingual access to customer information and
    services.
  • Now: speech over the telephone (call centers)
  • Later: speech, text and pointing over the
    Internet (e-service)
  • Multilingual natural language dialogue
  • unscripted, spontaneous conversation
  • models derived from real call center data
  • tested and verified in real call center
    environment
  • Showcase applications at real call centers
  • financial services centers (English, French,
    German)
  • expand into public service gov. applications
    (US & EC)

24
Corpora
  • GE Financial call centres
  • 1k English calls (transcribed, annotated)
  • 1k French calls (transcribed, annotated)
  • IBM software support call centre
  • 5k English calls (transcribed)
  • 5k French calls (transcribed)
  • AGF insurance claim call centre
  • 5k French calls (recording)
  • VIEL et CIE
  • 100 French calls (transcribed, annotated)

25
AMITIÉS System
  • Data driven dialogue strategy
  • Similar to Colorado's Communicator system
  • Statistical: a dialogue transition graph is
    derived from a large body of transcribed,
    annotated conversations
  • Task and ID identification
  • Task identification: automatically trained
    vector-based approach (Chu-Carroll & Carpenter
    1999)
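
One plausible reading of "a dialogue transition graph derived from annotated conversations" is a table of relative frequencies between consecutive dialogue act labels. The sketch below shows that reading only. The function name, the start/end markers and the act labels are invented for illustration and do not come from the AMITIES annotation scheme.

```python
from collections import Counter, defaultdict


def build_transition_graph(dialogues):
    """Estimate P(next_act | act) from sequences of dialogue act labels."""
    counts = defaultdict(Counter)
    for acts in dialogues:
        # Pad with markers so openings and closings are modelled too.
        padded = ["<start>"] + list(acts) + ["<end>"]
        for prev, nxt in zip(padded, padded[1:]):
            counts[prev][nxt] += 1
    graph = {}
    for prev, next_counts in counts.items():
        total = sum(next_counts.values())
        graph[prev] = {nxt: c / total for nxt, c in next_counts.items()}
    return graph


# Toy annotated calls; the act labels are hypothetical.
calls = [
    ["greeting", "request_id", "give_id", "request_task", "closing"],
    ["greeting", "request_task", "request_id", "give_id", "closing"],
]
graph = build_transition_graph(calls)
print(graph["greeting"])
```

A dialogue strategy could then follow the most probable outgoing edge from the current act, which is one simple sense in which the strategy is data driven.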

26
Sheffield does the post ASR fusion in AMITIES
  • Language Understanding
  • Use of ANNIE IE for robust extraction
  • Partial matching (creates list of possible
    entities)
  • Dialogue Act Classifier
  • Recognise domain-independent dialogue acts
  • Works well (86% accuracy) for subset of Dialogue
    Act labels

27
Evaluation
  • 10 native speakers of English
  • Each made 9 calls to the system, following
    scenarios they were given
  • Overall call success was 70%
  • Compare this to Communicator scores of 56%
  • Similar number of concepts/scenario (9)
  • Word Error Rates
  • 17% for successful calls
  • 22% for failed calls

28
Evaluation: interesting numbers
  • Avg. num turns/dialogue: 18.28
  • Avg. num words/user turn: 6.89
  • High in comparison to Communicator scores,
    reflecting
  • Lengthier response to open prompts
  • Responses to requests for multiple attributes
  • Greater user initiative
  • Avg. user satisfaction score: 20.45
  • (range 5-25)

29
Learning to tag for Dialogue Acts: initial work
  • Samuel et al. (1998): TBL learning on n-gram DA
    cues, Verbmobil corpus (75%)
  • Stolcke et al. (2000): full language modelling
    (including DA sequences), more complex
    Switchboard corpus (71%)

30
Starting with a naive classifier for DAs
  • Direct predictivity of DAs by n-grams as a
    preprocess to any ML algorithm.
  • Get P(d|n) for all 1-4 word n-grams and the DA
    set over the Switchboard corpus, and take DA
    indicated by n-gram with highest predictivity
    (threshold for probability levels)
  • Do 10-fold cross validation (which lowers scores)
  • Gives a best cross validated score of around 63%
    over Switchboard but using only some of the data
    Stolcke needed.
  • Single highest score currently 71.2%, higher
    than that reported in Stolcke
  • Up to 86% with small (6) DA set
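
The naive classifier on this slide can be sketched as follows: estimate how strongly each 1-4 word n-gram predicts a dialogue act, then label a new utterance with the act of its single most predictive n-gram. This is a minimal sketch, not the Sheffield implementation. The function names, the `min_count` cut-off, the probability threshold, the default act and the toy training data are all assumptions.

```python
from collections import Counter, defaultdict


def ngrams(tokens, max_n=4):
    """Yield every 1- to max_n-word n-gram in a token list."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def train(tagged_utterances, min_count=2):
    """Estimate P(d | n): how strongly n-gram n predicts dialogue act d."""
    counts = defaultdict(Counter)
    for text, act in tagged_utterances:
        # Count each n-gram once per utterance.
        for gram in set(ngrams(text.lower().split())):
            counts[gram][act] += 1
    model = {}
    for gram, act_counts in counts.items():
        total = sum(act_counts.values())
        if total >= min_count:  # assumed frequency cut-off against noise
            act, c = act_counts.most_common(1)[0]
            model[gram] = (act, c / total)
    return model


def classify(model, text, default="statement", threshold=0.5):
    """Return the act indicated by the single most predictive n-gram."""
    best_act, best_p = default, threshold
    for gram in ngrams(text.lower().split()):
        act, p = model.get(gram, (default, 0.0))
        if p > best_p:  # only n-grams above the threshold can decide
            best_act, best_p = act, p
    return best_act


# Invented training data; Switchboard uses a much larger DA tag set.
data = [
    ("do you know the time", "question"),
    ("do you like jazz", "question"),
    ("i like jazz", "statement"),
    ("i know the time", "statement"),
    ("thank you very much", "thanks"),
    ("thank you", "thanks"),
]
model = train(data)
print(classify(model, "do you want tea"))
```

Because the decision rests on one cue n-gram, the output of such a classifier is cheap to compute and can serve as a pretagging step for a further learner.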

31
Extending the pretagging with TBL
  • Gives 66% (Stolcke's 71%) over the Switchboard
    data, but only 3% is due to TBL (rather than the
    naive classifier).
  • Samuel unable to see what TBL is doing for him.
  • This is just a base for a range of more complex
    ML algorithms (e.g. WEKA).

32
Dialogue Research Challenges
  • Will a Dialogue manager raise the DA 75%/85%
    ceiling top-down?
  • Multimodal dialogue managers. Are they completely
    independent of the modality? Are they really
    language independent?
  • What is the best virtual machine for running a
    dialogue engine? Do DAFs + stack provide a robust
    and efficient mechanism for doing Dialogue
    Management e.g. topic change? (vs. simple rule
    systems)
  • Will they offer any interesting discoveries on
    stack access to, and discarding, incomplete
    topics (cf. stacks and syntax)?
  • Applying machine learning to transcripts so as to
    determine the content of dialogue management,
    i.e. the scope and content of candidate DAFs.
  • Can the state set of DAFs and a stack be trained
    with reinforcement learning (like a Finite State
    matrix)?
  • Can we add a strong belief/planning component to
    this and populate it empirically?
  • Fusion with QA functionality

33
What is the most structure that might be needed
and how much of it can be learned?
  • Steve Young (Cambridge) says learn all modules
    and no need for rich a priori structures (cf MT
    history and Jelinek at IBM)
  • Availability of data (dialogue is unlike MT)?
  • Learning to partition the data into structures.
  • Learning the semantic speech act interpretation
    of inputs alone has now reached a (low) ceiling
    (75%/85%).

34
Young's strategy not quite like Jelinek's MT
strategy of 1989!
  • Which was non/anti-linguistic with no
    intermediate representations hypothesised
  • Young assumes roughly the same intermediate
    objects as we do but in very simplified forms.
  • The aim is to obtain training data for all of
    them so the whole process becomes a single
    Partially Observable Markov model.
  • It remains unclear how to train complex state
    models that may not represent tasks, let alone
    belief and intention models.

35
There are now four, not two, competing approaches
to machine dialogue in NLP
  • Logic-based systems with reasoning (traditional
    and still unvalidated by performance)
  • Extensions of speech engineering methods, machine
    learning and no structure (new)
  • Simple handcoded finite state systems in VoiceXML
    (Chatbots and commercial systems)
  • Rational hybrids based on structure and machine
    learning.

36
Modes of dialogue with machine agents
  • Current mode of phone/multimodal interactions at
    terminals.
  • The internet (possibly becoming the semantic
    web) will be for machine agents that understand
    its content, and with which users dialogue, e.g.
    "Find me the best camera under 500."
  • Interaction with mobile phone agents (more or
    less monomodal)
  • Some or all of these services as part of function
    of persistent, more personal, cosy, lifelong
    Companion agents.

37
The Companions: a new economic and social goal
for dialogue systems
38
An idea for integrating the dialogue research
agenda in a new style of application...
  • That meets social and economic needs
  • That is not simply a product but everyone will
    want one if it succeeds
  • That cannot be done now but could in a few years
    by a series of staged prototypes
  • That modularises easily for large project
    management, and whose modules cover the research
    issues.
  • Whose speech and language technology components
    are now basically available

39
A series of intelligent and sociable COMPANIONS
  • The SeniorCompanion
  • The EU will have more and more old people who
    find technological life hard to handle, but will
    have access to funds
  • The SC will sit beside you on the sofa but be
    easy to carry about--like a furry handbag--not a
    robot
  • It will explain the plots of TV programs and help
    choose them for you
  • It will know you and what you like and don't
  • It will send your messages, make calls and
    summon emergency help
  • It will debrief your life.

40
(No Transcript)
41
Other COMPANIONS
  • The JuniorCompanion
  • Teaches and advises, maybe from a backpack
  • Warns of dangerous situations
  • Helps with homework and web search
  • Helps with languages
  • Always knows where the child is
  • Explains ambient signals and information
  • It's what e-learning might really mean!

42
(No Transcript)
43
The Senior Companion is a major technical and
social challenge
  • It could represent old people as their agents and
    help in difficult situations e.g. with landlords,
    or guess when to summon human assistance
  • It could debrief an elderly user about events
    and memories in their lives
  • It could aid them to organise their life-memories
    (this is now hard!) (see Lifelog and Memories for
    Life)
  • It would be a repository for relatives later
  • Has "Loebner chat aspects" as well as
    information: it is to divert, like a pet, not
    just inform
  • It is a persistent and personal social agent
    interfacing with Semantic Web agents

44
Other issues for Companions we can hardly begin
to formulate
  • Companion identity as an issue that can be
    settled many ways,
  • like that of the owner's web identity: now a
    hot issue?
  • Responsibilities of Companion agents--who to?
  • Communications between agents and our access to
    them
  • Are simulations of emotional behaviour or
    politeness desirable in a Companion?
  • Protection of the vulnerable (young and old here)
  • What happens to your Companion when you are gone?

45
Companions and the Web
  • A new kind of agent as the answer to a passive
    web
  • The web/internet must become more personal to be
    tractable, as it gets bigger (and more structured
    or unstructured?)
  • Personal agents will need to be autonomous and
    trusted (like space craft on missions)
  • But also personal and persistent, particularly
    for large sections of populations now largely
    excluded from the web.
  • The semantic web is a start to structure the web
    for comprehension and activity, but web agents
    are currently abstract and transitory.
  • The old are a good group to start with (growing
    and with funds).

46
The technologies for a Companion are all there
already
  • ASR for a single user (but may be dysarthric)
  • Ascribing personality? Remember Tamagotchi?
  • Quite intelligent people rushed home to feed one
    (and later Furby) even though they knew it was a
    simple empty mechanism.
  • And Tamagotchi could not even talk!
  • People with pets live longer.
  • Wouldn't you like a warm pet to remind you what
    happened in the last episode of your favourite TV
    soap?
  • No, OK, but perhaps millions of your compatriots
    would?!

47
This isn't just about furry talking handbags on
sofas, but any persistent and personalised entity
that will interface to information sources,
phones above all, and for dealing with the web in
a more personal manner. "...claim the internet
is killing their trade because customers seem to
prefer an electronic serf with limitless memory
and no conversation." (Guardian 8.11.03)
48
Conclusions
  • Companions are a plausible binding concept for
    exploring and evaluating a richer concept of
    human-machine interaction (useful too!!)
  • Interactions beyond simple task-driven dialogues.
  • That require more interesting theories
    underpinning them, even ones we cannot
    immediately see how to reinforce/learn.
  • Interactions with persistent personality, affect,
    emotion, interesting beliefs and goals
  • Above all, we need a more sophisticated and
    generally accepted evaluation regime