Microevaluation


1
Micro-evaluation
Dagstuhl working group 3
Final preliminary report

Jonas Beskow
Justine Cassell
Dirk Heijlen

Han Noot Patrick Olivier Danielle Pele
Emiel Krahmer Andrew Marriott Dominic Massaro
EECA, Dagstuhl, March 15-19, 2004
2
Plan

What micro-evaluation is.
The state of the art.
Discussion.

3
What is micro-evaluation?

A method to test whether the designer's model, as
implemented in an ECA, is understood by subjects
in the intended way.
One important motivation: the added value of ECAs in
applications cannot be proven without being sure
that the underlying models are correct.

4
Micro-evaluation paradigm
5
Topics that were discussed

Audio-visual speech.
Non-verbal behaviour.
Natural language content.
Dialogue control and interaction.
Personality, emotion, mood, culture.

6
Plan

What micro-evaluation is.
The state of the art.
Discussion.

7
About models

No lack of models (if you know where to look).
Phonetics
Conversational analysis
Cognitive science
Social psychology, etc.
Main complication:
Many of these models are incomplete and typically
lack ECA-relevant information.

8
So, we need to collect data

Standard research methodology applies.
Social sciences: any research methodology textbook.
Facial analysis: Ekman et al. (1972, 1982), Wagner (1993).
Talking heads / AV speech: Massaro, Perceiving Talking Faces (ch. 13).
ECAs: Ruttkay and Pelachaud (2004).

9
Elicitation studies

Record people.
Paraphrasing Ekman et al. (1972/1982):
Elicitation circumstances must be representative.
There must be an independent criterion.
Data sampling must be representative.
Issues:
One speaker vs. a group of speakers.
Naturalistic vs. experimental.
Ecological vs. functional validity.

10
Data validation studies

Annotation.
Multiple judges
Good coding scheme
Kappa statistics
Coverage of the model.
Training vs testing
Accuracy, precision, recall, F-measure.
Perception studies: see judgement studies.
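
The annotation and coverage measures named on this slide can be sketched in a few lines. This is a minimal illustration, not the working group's own tooling: it assumes two judges labelling the same items with categorical codes, and the function names, label names, and example annotations are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two judges."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Proportion of items on which the two judges agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each judge's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both judges used one identical label throughout
    return (observed - expected) / (1 - expected)


def precision_recall_f1(gold, predicted, positive):
    """Precision, recall and F1 for one target label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical eyebrow annotations from two judges (illustration only).
judge_a = ["raise", "raise", "none", "frown", "none", "raise", "none", "frown"]
judge_b = ["raise", "none", "none", "frown", "none", "raise", "frown", "frown"]

kappa = cohen_kappa(judge_a, judge_b)
precision, recall, f1 = precision_recall_f1(judge_a, judge_b, positive="raise")
print(f"kappa={kappa:.3f} precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Kappa applies to the multiple-judges step; precision, recall and F apply when a model's output is scored against held-out annotated data. With more than two judges, a multi-rater statistic such as Fleiss' kappa would be needed instead.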

11
Judgement studies

Implement model or data in ECA and test with
human subjects.
If possible, compare to a no-ECA baseline and a
human top-line / gold standard.
Task and Data Analysis
Choose appropriate tasks / scenarios
Choose behavioral measures / metrics
Choose appropriate analyses
Formative Evaluation
Apply to next generation or different ECA
Repeat evaluation paradigm
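
One way to read the baseline / top-line comparison is to place the ECA's score on a scale running from the no-ECA baseline (0) to the human top-line (1). The sketch below assumes a single task score per subject and per condition; the function name and the scores are hypothetical, and the slides do not prescribe this particular measure.

```python
from statistics import mean


def relative_gain(baseline_scores, eca_scores, topline_scores):
    """Share of the baseline-to-top-line gap that the ECA condition closes.

    0.0 means no better than the no-ECA baseline,
    1.0 means as good as the human top-line.
    """
    baseline = mean(baseline_scores)
    eca = mean(eca_scores)
    topline = mean(topline_scores)
    if topline == baseline:
        raise ValueError("top-line and baseline means coincide; gap is undefined")
    return (eca - baseline) / (topline - baseline)


# Hypothetical per-subject task scores in percent correct (illustration only).
no_eca = [50, 60, 40]
with_eca = [70, 80, 60]
human = [90, 100, 80]

print(relative_gain(no_eca, with_eca, human))
```

A ratio like this only summarises condition means; whether the differences are reliable still has to come from an appropriate statistical analysis of the per-subject data.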

12
But

The devil is in the details.
It may be difficult to find the right task or
scenario to test your model.
Never ask directly: "Does my ECA have property x?"
Look for a specific paradigm that forces
subjects to make functional use of the ECA's
behavior.
This is the creative part, which makes
micro-evaluation fun!

13
Case study: Cassell (in prep)

How to show that gestures actually support a
user's understanding of the information presented
by an ECA?
Let the ECA tell a story about houses, with or without
gestures.
Forced choice selection paradigm.
Which house was described?
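
A forced-choice design like this yields one count of correct selections per condition, which can be compared directly. The sketch below uses a pooled two-proportion z-test as one possible analysis; the slides do not say which analysis was used, and the counts are invented for illustration only.

```python
import math


def two_proportion_z(correct_a, n_a, correct_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = correct_a / n_a, correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in responses; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts: subjects who picked the described house (illustration only).
z, p = two_proportion_z(correct_a=18, n_a=20,   # story told with gestures
                        correct_b=11, n_b=20)   # story told without gestures
print(f"z={z:.2f}, p={p:.4f}")
```

The normal approximation is rough for groups this small, so an exact test may be preferable in practice. Each condition should also be checked against the chance level implied by the number of houses offered.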

14
Plan

What micro-evaluation is.
The state of the art.
Discussion.

15
One: Relation to other working groups

Group 1
Micro-evaluation methods give an
operationalization of the collection and use of
corpora for ECA design.
Group 2
Micro-evaluation methods apply to realism and
hyperrealism alike and offer a mechanism to
verify empirical issues there.

Group 4
Macro-evaluation involves micro-evaluation
methodology.
Micro-evaluation should precede macro-evaluation.
Group 5
Criteria/methods for micro-evaluation should be
used for the ECA contest.
Good methodology helps for sharing resources
(i.e., experimental findings).

17
Two: Model status

Many more problems arose when discussing micro-evaluation of
emotion and personality than with audio-visual
speech and non-verbal communication.
Why?
Conscious versus unconscious?
Displaying versus feeling / being?
More real, underlying work is needed to fill
in the more complicated models.

18
Three: Micro- vs. macro-evaluation

How to make sure micro-evaluation results stay
valid in a macro setting?
Introduce cognitive load as a factor in the
micro-evaluation methods.
E.g., noise in audio-visual speech.
E.g., using a secondary task.

19
Four: How to get this all done?

Doing all this work is a lifetime research project,
enough to fill many PhDs.
Try to engage researchers from outside the ECA
community by raising different kinds of
questions.
Try to initiate more multi-disciplinary research.
The experimental results are also relevant beyond
the ECA community (e.g., better understanding of
human cognition).

20
Five: Where do we go from here?

We will be doing better micro-evaluation studies
from now on.
Try to compile our notes into a coherent and
readable whole with references, methodological
best practices, etc.
