Title: Evaluating Interfaces
Evaluating Interfaces
- Goals of evaluation
- Lab versus field based evaluation
- Evaluation methods
- Design-oriented
- Implementation-oriented
The Human Centred Design Cycle
From ISO 13407 - Human Centred Design Process for Interactive Systems (1999):
- Plan the user-centred process
- Understand and specify the context of use (users, tasks, hardware, software, materials, physical and social environments)
- Specify the user and organisational requirements
- Produce design solutions
- Evaluate designs against user requirements
- Repeat until the design meets the requirements
Goals of evaluation
- To ensure that the interface behaves as required and meets user needs
- Assess the extent of its functionality
- Identify specific problems
- Assess the usability of the interface
- Assess its impact on the user
Laboratory versus field studies
- Laboratory studies
  - The user comes to the evaluator
  - A well-equipped laboratory may contain sophisticated recording facilities, two-way mirrors, instrumented computers, etc.
  - Can control or deliberately manipulate the context of use
  - The only option for some dangerous or extreme interfaces
  - But cannot reproduce the natural working context of a user's environment: social interaction and contingencies
  - Difficult to evaluate long-term use
- Field studies
  - The evaluator goes to the user
  - Captures the actual context
  - Captures real working practice and social interaction
  - Not possible for some applications
  - It can also be difficult to capture data
  - Cannot prove specific hypotheses
Different kinds of evaluation are appropriate at different stages of design
- Early on: formative evaluation of the design, which may involve only designers and other experts
- Later on: evaluation of the implementation, which is detailed, rigorous and conducted with end-users
Evaluation Methods
- Design-oriented evaluation methods
  - Heuristic/expert inspections
  - Cognitive walkthrough
- Implementation-oriented methods
  - Observation
  - Query techniques: interviews and surveys
  - (Controlled experiments)
Heuristic/Expert Inspections
- Experts assess the usability of an interface, guided by usability principles and guidelines (heuristics)
- Suited to early design, when some kind of representation/prototype is available
- It's only as good as the experts you can afford
The Process of Heuristic/Expert Inspections
- Briefing session
  - Several experts are all given an identical description of the product, its context of use and the goals of the evaluation
- Evaluation period
  - Each expert spends several hours independently critiquing the interface
  - At least two passes through the interface: one for overall appreciation and others for detailed assessment
- Debriefing session
  - Experts meet to compare findings, prioritise problems and propose solutions
  - They report/present their findings to decision makers and other stakeholders
Cognitive Walkthrough
- A predictive technique in which designers, and possibly experts, simulate the user's problem-solving process at each step of the human-computer dialogue
- Originated in code walkthroughs from software engineering
- Used mainly to consider ease-of-learning issues, especially how users might learn by exploring the interface
Stages of Cognitive Walkthrough
- Begins with:
  - A detailed description of the prototype (e.g., menu layouts)
  - A description of typical tasks the user will perform
  - A written list of the actions required to complete the tasks with the prototype
  - An indication of who the users are and what kind of experience and knowledge they may have
- For each task, evaluators step through the necessary action sequences, imagining that they are a new user (!) and asking the following questions:
  - Will the user know what to do next?
  - Can the user see how to do it?
  - Will they know that they have done the right thing?
- It is vital to document the walkthrough:
  - Who did what and when
  - Problems that arose and severity ratings
  - Possible solutions
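The documentation requirements above can be captured in a simple record structure. A minimal sketch in Python; the field names and the severity scale are illustrative assumptions, not prescribed by the method:

```python
from dataclasses import dataclass, field

@dataclass
class WalkthroughProblem:
    """One documented problem from a cognitive walkthrough.

    Records who found what and when it arose, a severity rating,
    and proposed solutions; the exact fields are our own choice.
    """
    step: str                 # the action being evaluated
    evaluator: str            # who found the problem
    description: str          # what went wrong for the imagined user
    severity: int             # e.g. 1 (cosmetic) to 4 (catastrophic)
    solutions: list = field(default_factory=list)

# Example record for the photocopier walkthrough discussed below
p = WalkthroughProblem(
    step="turn on the power",
    evaluator="PC",
    description="No power-on indicator; unlabelled switch hidden on the back",
    severity=3,
    solutions=["add an indicator light", "move the switch to the front"],
)
```

Keeping each problem as a structured record makes it easy to sort by severity and merge findings from several evaluators during the debriefing.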
A short fragment of cognitive walkthrough
- From Philip Craiger's page at http://istsvr03.unomaha.edu/gui/cognitiv.htm
- Evaluating the interface to a personal desktop photocopier
- A design sketch shows a numeric keypad, a "Copy" button, and a push button on the back to turn on the power
- The specification says the machine automatically turns itself off after 5 minutes of inactivity
- The task is to copy a single page, and the user could be any office worker
- The actions the user needs to perform are to turn on the power, put the original on the machine, and press the "Copy" button
- Now tell a believable story about the user's motivation and interaction at each action
The user wants to make a copy and knows that the machine has to be turned on. So they push the power button. Then they go on to the next action.

But this story isn't very believable. We can agree that the user's general knowledge of office machines will make them think the machine needs to be turned on, just as they will know it should be plugged in. But why shouldn't they assume that the machine is already on? The interface description didn't specify a "power on" indicator. And the user's background knowledge is likely to suggest that the machine is normally on, like it is in most offices.

Even if the user figures out that the machine is off, can they find the power switch? It's on the back, and if the machine is on the user's desk, they can't see it without getting up. The switch doesn't have any label, and it's not the kind of switch that usually turns on office equipment (a rocker switch is more common).

The conclusion of this single-action story leaves something to be desired as well. Once the button is pushed, how does the user know the machine is on? Does a fan start up that they can hear? If nothing happens, they may decide this isn't the power switch and look for one somewhere else.
Observation
- Observe users interacting with the interface in the laboratory or the field
- Typically requires functional prototypes
- Record interactions using:
  - Pen and paper
  - Audio
  - Video
  - Computer logging
  - User notebooks and diaries
  - Think-aloud techniques
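Computer logging, one of the recording techniques above, can be as simple as appending timestamped events to a file. A minimal sketch, assuming a newline-delimited JSON log format; the event names and file name are hypothetical:

```python
import json
import time

class InteractionLogger:
    """Appends timestamped UI events to a newline-delimited JSON log."""

    def __init__(self, path):
        self.path = path

    def log(self, event, **details):
        # One JSON object per line: timestamp, event name, free-form details
        record = {"t": time.time(), "event": event, **details}
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")

# Usage: call log() from the interface's event handlers
logger = InteractionLogger("session.log")
logger.log("click", target="copy_button")
logger.log("key", key="Enter")
```

Logs like this can later be replayed alongside video, as the analysis slide below notes, provided the clocks of the logger and the cameras are synchronised.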
- Analysis
  - Document illustrative fragments
  - Detailed transcription and coding
  - Post-task walkthroughs
  - Specialised analysis software can replay video alongside system data and help the analyst synchronise notes and data
Savannah
- An educational game for six players at a time
- A virtual savannah is overlaid on an empty school
playing field
Studying Savannah
- Six trials over three days
- Two video recordings from the field
- Game replay interface
Impala Sequence
The Impala Sequence Revealed
- Elsa suddenly stops
- Circular formation
- Counting aloud
- Nala and Elsa cannot see the impala
- Replay shows them stopped on edge of locale
- GPS drift carries them over the boundary
- The boy who passed through attacked first
Evaluation through questionnaires
- A fixed set of written questions, usually with written answers
- Advantages
  - Gives the user's point of view; good for evaluating satisfaction
  - Quick and cost-effective to administer and score, so it can deal with large numbers of users
  - The user doesn't have to be present
- Disadvantages
  - Only tells you how the user perceives the system
  - Not good for some kinds of information:
    - Things that are hard to remember (e.g., times and frequencies)
    - Things that involve status or are sensitive to disclose
  - Usually not very detailed
What to ask
- Background questions on the users
  - Name, age, gender
  - Experience with computers in general and this kind of interface specifically
  - Job responsibilities and other relevant information
  - Availability for further contact, such as an interview
- Interface-specific questions
Questions
- Three types of question:
  - Factual: asks about observable information
  - Opinion: what the user thinks about something (outward facing)
  - Attitude: how the user feels about something (inward facing), e.g., do they like the system? Do they feel in control?
- Two general styles of question:
  - Closed: the user chooses from among a set number of options; quick to complete and easy to summarise with statistics
  - Open: the user gives free-form answers; captures more information, but is slower to complete and harder to summarise statistically (may require coding)
- Most questionnaires mix open and closed questions
Options for closed questions
- Likert scales capture strength of opinion
  - The granularity of the scale depends upon the respondents' expertise
- Present results numerically and graphically
Deploying questionnaires
- Post
- Interview
- Email
- As part of the interface
- The web brings important advantages:
  - Ease of deployment
  - Reliable data collection
  - Semi-automated analysis
Designing questionnaires
- What makes a good or bad questionnaire?
  - Reliability: the ability to give the same results when filled out by like-minded people in similar circumstances
  - Validity: the degree to which the questionnaire is actually measuring or collecting data about what you think it should
  - Clarity, length and difficulty
- Designing a good questionnaire is difficult: pilot, pilot, pilot!
- Statistically valid questionnaire design is a specialised skill: use an existing questionnaire
System Usability Scale (SUS)
- I think I would like to use this system frequently
- I found the system unnecessarily complex
- I thought the system was easy to use
- I think I would need the support of a technical person to be able to use this system
- I found the various functions in this system were well integrated
- I thought there was too much inconsistency in this system
- I would imagine that most people would learn to use this system very quickly
- I found the system very cumbersome to use
- I felt very confident using the system
- I needed to learn a lot of things before I could get going with this system
Calculating a rating from SUS
- For odd-numbered questions, score = scale position - 1
- For even-numbered questions, score = 5 - scale position
- Multiply all scores by 2.5 (so each question counts for 10 points)
- Final score for an individual = sum of the multiplied scores for all questions (out of 100)
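The scoring rules above translate directly into a small function (a sketch; the function name is ours):

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from ten 1-5 Likert responses.

    Odd-numbered items contribute (position - 1), even-numbered
    items contribute (5 - position); the sum is scaled by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Strongly agree on the positive (odd) items,
# strongly disagree on the negative (even) items:
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # best possible: 100.0
```

Scaling each 0-4 item score by 2.5 is equivalent to scaling the total, which is how the calculation is usually described.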
Evaluation during active use
- System refinement based on experience or in response to changes in users
  - Interviews and focus-group discussions
  - Continuous user-performance data logging
    - Frequent and infrequent error messages
    - Analyse sequences of actions to suggest improvements or new actions
    - BUT respect people's rights and consult them first!
  - User feedback mechanisms
    - Online forms, email and bulletin boards
    - Workshops and conferences
How many users?
- 5-12 as a rough rule of thumb
- Nielsen and Landauer (1993): www.useit.com/alertbox/20000319.html
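Nielsen and Landauer model the proportion of usability problems found by n users as 1 - (1 - L)^n, where L is the proportion a single user uncovers (around 31% averaged across their projects). A quick sketch of why a handful of users goes a long way:

```python
def problems_found(n, l=0.31):
    """Proportion of usability problems found by n test users,
    following Nielsen and Landauer's model: 1 - (1 - L)**n."""
    return 1 - (1 - l) ** n

# With L = 0.31, five users already uncover most problems,
# and twelve leave very little undiscovered
for n in (1, 5, 12):
    print(f"{n:2d} users: {problems_found(n):.0%}")
```

This diminishing return is the reasoning behind the 5-12 rule of thumb above: beyond roughly a dozen users, each additional participant finds very few new problems.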
Ethical Issues
- Explain the purpose of the evaluation to participants, including how their data will be used and stored
- Get their consent, preferably in writing
  - Get parental consent for children
- Anonymise data
  - As stored: use anonymous IDs, not names
  - As reported: in text and also in images
  - Do not include quotes that reveal identity
- Gain approval from your ethics committee and/or professional body
Example consent form
- I state that I am [specific requirements] and wish to participate in a study being conducted by [names of researchers/evaluators] at [organisation name]
- The purpose of the study is [general study aims]
- The procedures involve [generally what will happen]
- I understand that I will be asked to [specific tasks being given]
- I understand that all information collected in the study is confidential, and that my name will not be identified at any time
- I understand that I am free to withdraw from participation at any time without penalty
- Signature of participant and date
Good practice
- Inform users that it is the system under test, not them
- Put users at ease
- Do not criticise their performance or opinions
- Ideally, you should reward or pay participants
- It may be polite, and a good motivator, to make results available to participants
Which method to choose
- Design or implementation?
- Laboratory or field studies?
- Subjective or objective?
- Qualitative or quantitative?
- Performance or satisfaction?
- Level of information provided?
- Immediacy of response?
- Intrusiveness?
- Resources and cost?