Title: An Introduction to Usability Testing
1An Introduction to Usability Testing
- Bill Killam, MA CHFP
- Adjunct Professor
- University of Maryland
- bkillam_at_user-centereddesign.com
2Definitions
- Usability testing is the common name for
user-based system evaluation
- Popularized in the media by Jakob Nielsen and
usually thought of as related to web site design
in the 1990s
- Usability testing is one of the activities of
Human Factors Engineering
3Some Historical Examples
- 1911 - Taylor's study of "What is the best way to
do a job?" and "What should constitute a day's
work?" to determine time standards for basic tasks
- 1911 - Frank (an Industrial Engineer) and Lillian
(a Psychologist) Gilbreth studied the motions
involved in bricklaying and reduced them from 18
to 5; development of the therblig unit of
motion
- Late 1940s - psychologists at Wright-Patterson Air
Force Base studied crashes to determine the
cause; it was not what was expected
- A study on the effect of a redundant,
high-centered taillight on rear-end car crashes
4What is the System
- Sid Smith's user-system interface (compared to
user-computer interface)
- Systems are made up of users performing some
activity within a context
- Can't redesign users but we can design equipment,
so our goal as designers is to design equipment
to optimize system performance
5What does usability mean?
- Accepted Definition
- The ability of a specific group of users to
perform a specific set of activities within a
specific environment
- ISO 9241 Definition
- The ability of a specific group of users to
perform a specific set of activities within a
specific environment with effectiveness,
efficiency, and satisfaction (emphasis added)
- Nielsen
- Satisfaction
- Efficiency
- Learnability
- Low Errors
- Memorability
6Contributors to Usability
- Functional Suitability - does the product contain
the functionality required by the user?
- Ease-of-learning - can the user figure out how to
exercise the functionality provided?
- Ease-of-use - can the user exercise the
functionality accurately and efficiently?
(includes accessibility issues)
- Ease-of-recall - can the knowledge of operation
be easily maintained over time?
- Subjective Preference
7our brains work against us
There are a lot of factors
11Red
Green
Blue
Orange
Yellow
Black
12Stroop
Stroop
Stroop
Stroop
Stroop
Stroop
13Orange
Yellow
Green
Black
Blue
Red
16we have trouble with patterned data
25we can be fooled
36we make perceptual assumptions
41- Jack and Jill went
- went up the
- Hill to fetch a
- a pail of milk
42- FINISHED FILES ARE THE RE-
- SULT OF YEARS OF SCIENTIF-
- IC STUDY COMBINED WITH THE
- EXPERIENCE OF MANY YEARS
43our perceptual abilities are limited in the
presence of noise
44THE QUICK BROWN FOX JUMPED OVER THE LAZY DOGS
BACK.
45The quick brown fox jumped over the lazy dogs
back.
46THE QUICK BROWN FOX JUMPED OVER THE LAZY DOGS
BACK.
47The quick brown fox jumped over the lazy dogs
back.
48our cognitive abilities are limited
49The Game of 15s
- Let's play the game of 15. The pieces of the
game are the numbers 1, 2, 3, 4, 5, 6, 7, 8, and
9. Each player takes a digit in turn. Once a
digit is taken, the other player cannot use it.
The first player to get three digits that sum to
15 wins.
- Here's a sample game: Player A takes 8. Player B
takes 2. Then A takes 4, and B takes 3. A takes
5. What digit should B take?
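The sample game can be checked mechanically. The following is a minimal sketch, not part of the original slides; the function name `winning_digits` and the variable names are made up for illustration. It lists, for each player, the free digits that would complete three digits summing to 15.

```python
from itertools import combinations

DIGITS = set(range(1, 10))  # the nine pieces of the game


def winning_digits(held, taken):
    """Return the free digits that would give `held` three digits summing to 15."""
    free = DIGITS - set(taken)
    return sorted(
        d for d in free
        if any(a + b + d == 15 for a, b in combinations(held, 2))
    )


# State of the sample game after A's third move.
a_digits = [8, 4, 5]
b_digits = [2, 3]
taken = a_digits + b_digits

b_wins = winning_digits(b_digits, taken)     # digits that win for B right now
a_threats = winning_digits(a_digits, taken)  # digits that would win for A next turn

print("B can win with:", b_wins)
print("B must block:", a_threats)
```

Under the rules as stated, B has no immediate win (2 + 3 would need a 10), while A threatens 4 + 5 + 6, so B has to take 6. Most people find this hard to see from the digits alone, which is the point of the exercise.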
52our memory affects our abilities
53our psychology affects our abilities
54and we need protection from ourselves
561st Dimension of Usability
- Functional Visibility through an obvious
visible structure and adequate feedback
- (1) Affordances
- Perceived and actual properties of things
- Chairs are for sitting on
- Doors are for opening
- Flat surfaces are for placing things on
- Buttons are for pressing
- Plates are for pushing
- Handles are for pulling
- Knobs are for turning
- Slots are for inserting things
571st Dimension of Usability (concluded)
- (2) Constraints
- Obvious intended use or limitations (e.g.,
intended use: a mouse is for grabbing with the
hand, a trackball is for rolling with the thumb;
limitation: buttons are pressed with the finger,
pedals are stepped on)
- (3) Mapping
- Relationship between parts (e.g., control
movement and effects in the real world)
- Natural mapping conforms to cultural norms or
physical analogies
- Adequate Feedback required to support, confirm,
and reinforce the visible structures
582nd Dimension of Usability
- Good (conceptual) model to predict the effects of
our actions
- what actions are intended
- what the consequences of our actions will be
- what to do if something goes wrong
- Examples
- Real World Metaphors - Trash Can icon, Shopping
Carts
- Difficult Models - multiple drive types, phone
purchase, Rubber Stamp
59Issues with Conceptual Models
- Sowa
- The development of percepts into a mosaic
- Characteristics of a Bad Conceptual Model
- Incomplete
- Inconsistent
- Self-Contradictory
- Examples
- Freezer Controls
- Ollie North
603rd Principle of Usability
- Design for the Intended user (and not for
yourself)
- Match the Representation to the Task
- Example: the Game of 15s in the real world
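A common way to make this point, and an assumption here since the slides that followed have no transcript, is that the Game of 15s is tic-tac-toe in disguise: place the digits 1 to 9 in a 3x3 magic square and every row, column, and diagonal sums to 15. The sketch below (names are illustrative only) checks that the eight lines of the board are exactly the eight digit triples that sum to 15.

```python
from itertools import combinations

# A 3x3 magic square: every row, column, and diagonal sums to 15.
MAGIC = [
    [2, 7, 6],
    [9, 5, 1],
    [4, 3, 8],
]


def board_lines(square):
    """Return the rows, columns, and diagonals of a 3x3 board as sets of digits."""
    rows = [frozenset(r) for r in square]
    cols = [frozenset(c) for c in zip(*square)]
    diags = [
        frozenset(square[i][i] for i in range(3)),
        frozenset(square[i][2 - i] for i in range(3)),
    ]
    return set(rows + cols + diags)


# Every way to pick three distinct digits from 1..9 that sum to 15.
triples_summing_to_15 = {
    frozenset(t) for t in combinations(range(1, 10), 3) if sum(t) == 15
}

same_game = board_lines(MAGIC) == triples_summing_to_15
print("Game of 15s is tic-tac-toe on the magic square:", same_game)
```

With the board representation, the sample game's forced move becomes a simple block of three in a row, which is far easier to spot than an arithmetic search.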
634th Principle of Usability
- Design for Errors and Slips (or Don't Blame the
User)
- Desired Characteristics
- immediately detectable feedback
- impact should be minimal
- results should be reversible
- Levels of Error Design
- Eliminate
- Protect
- Warn
- Train
64How does Usability Testing differ from other
forms of User-Based activities?
65We're Not Doing Behavioral Analysis
- Behavior is a specific action or set of actions
- Performance is acting towards a goal or objective
with a specific measure of success
- Behavior and performance are not the same, and
forcing a behavior rarely ensures proper or even
adequate performance
66Market Research
- Market Research (Qualitative and Quantitative) is
directed primarily at understanding the actual or
potential user population for a product or
service
- Size of markets
- Reasons for purchasing
67Basic or Even Applied Research
- Large scale effort that requires knowledge of the
degrees of freedom between the target population
and the sample population and requires
statistical analysis to determine the level of
significance of results
68User Acceptance Testing
- A script-driven process to ensure the functional
specifications have been met
- i.e., the functionality exists (regardless of
whether it is usable)
- It occurs after development, just before a
product is shipped
69When and What
70When is Usability Testing Conducted?
- Once
- A few times during development
- Throughout the Development Cycle
- Longitudinally
71What do you test?
- Isolating the Variable
- Conceptual Designs
- Architectures
- Labels
- Wireframe Design
- Visual Design
- Mock-up versus Prototypes
- The much maligned paper prototype
- Non-operational interfaces and the Wizard of Oz
technique
- Comparative Evaluations
72The Basics
73Usability Testing
- Human Factors Engineering (including usability
testing) occurs as part of the design process
(typically referred to as a user-centered design
process) to allow designers to make design
decisions based on collected data
- Usability testing may be applied as part of an
IV&V effort, but criteria must be established
prior to testing
74Types of Usability Testing Used During Design
- Non-User Based
- Expert Review
- Compliance Reviews
- Heuristic Evaluations
- Cognitive Walkthroughs
- User-based
- User Surveys
- Ethnographic Observation
- Performance-based
- Think Aloud
- Co-Discovery
75Non-User based Testing: Expert Review
- One or two usability experts review a product,
application, etc.
- Free format review
- Subjective but based on sound usability
principles
- Lowest cost usability testing
- Highly dependent on the qualifications of the
reviewer(s)
76Non-User based Testing: Compliance Testing
- Style Guide-based Testing
- Interpretation
- Checklists
- Scope Limitations
- Interface Specification Testing
- Interpretation
- Standards-based Testing
- ADA Testing
- Ex. Public Law 508 Testing
- Ex. DOD DII HCI Interface Specification Testing
77Non-User based Testing: Heuristic Evaluation
- Structured review based on known rules of thumb
- Nielsen's 10 Most Common Mistakes Made by Web
Developers (three versions)
- Shneiderman's 8 Golden Rules
- Norman's 4 Principles of Usability
78Non-User based Testing: Cognitive Walkthrough
- Team Approach
- Issues related to cognition (understanding) more
than presentation or sequence control
- Subjective but based on sound psychological
issues
- Also lowest cost usability testing
- Highly dependent on the qualifications of the
reviewer(s)
79Sidebar: Intrusive versus Non-intrusive
- Projected Responding
- The Heisenberg Uncertainty Principle of Usability
- Non-intrusive Testing
- Field Data Collection
- Ethnographic Observation (aka Contextual Inquiry)
- Intrusive testing
- The use of controlled environments
(repeatability)
- Isolation of specific variables
80User-Based Testing: User Surveys
- Pro: Inexpensive and can be conducted remotely
- Pro: Can provide trend data
- Con: Relies on user self-reported data
- Con: The vocal minority
81User-Based Testing: Ethnographic Observation
- Pro: The most real
- Con: Analysis intensive
- Con: No interaction (follow-up)
- Con: Ethically challenging
82User-Based Testing: Think Aloud Protocols
- Probably the most common form of usability
testing in use today
- Pro: Designed to capture participants'
understanding
- Con: A disruptive test, cannot be used to
evaluate performance
- Con: Biased against performance measures
83User-Based Testing: Co-Discovery Protocols
- Variation on Think Out Loud protocol
- Multiple participant perspective
- Pro: More natural interaction than Think Out Loud
- Pro: More fun, more revealing
- Con: Potentially more difficult participant
selection/matching issues
84User-Based Testing: Performance-based
- Semi-Intrusive
- Pro: Provides an objective measure
- Pro: Good for comparative evaluations
- Con: Not a complete picture of usability,
possibly misleading
- Critical Incident Analysis
- Pro: Provides a combination of performance and
think aloud in a single session
- Con: Risks confabulation
85User-Based Testing: Mental Workload
- Formats
- Physiological measures
- Mental fatigue (performance measures)
- Blink rate
- Subjective assessment (Cooper-Harper)
- Secondary Task Problem
- Pro: Non-intrusive
- Con: Difficult to administer and interpret
86User-Based Evaluation: Subjective Measures
- Self-reported ease-of-use measures (summative
evaluations)
- SUS
- QUIS
- SUMI
- Aesthetic value
- User preferences
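As an illustration of how one of these instruments is scored, here is a minimal sketch of the standard SUS (System Usability Scale) calculation. It assumes the usual 10-item questionnaire answered on a 1 to 5 scale with alternating positively and negatively worded items; the function name `sus_score` is made up for this example.

```python
def sus_score(responses):
    """Convert ten SUS item responses (each 1-5) to a 0-100 score.

    Odd-numbered items are positively worded and contribute (response - 1).
    Even-numbered items are negatively worded and contribute (5 - response).
    The summed contributions (0-40) are multiplied by 2.5.
    """
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs exactly ten responses, each from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1
        else:
            total += 5 - response
    return total * 2.5


# Hypothetical participant: fairly positive on every item.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```

The result is a score on a 0 to 100 scale, not a percentage, and it is usually reported as an average across participants.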
87How do you Design and Conduct a User-based Test?
88Test Set-up
- Select a Protocol
- Define Your Variables
- Dependent and Independent Variables
- Confounding Variables
- Operationalize Your Variables
- Formats
- Between Subject Designs
- Within Subject Designs
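In a within-subject design every participant sees every condition, so presentation order can become a confounding variable. One common remedy, not named on the slide and offered here only as an illustration, is to counterbalance the order with a Latin square. The sketch below uses a simple cyclic square; the function and participant names are hypothetical.

```python
def latin_square_orders(conditions):
    """Build a cyclic Latin square: each condition appears once per position."""
    n = len(conditions)
    return [
        [conditions[(row + col) % n] for col in range(n)]
        for row in range(n)
    ]


def assign_orders(participants, conditions):
    """Cycle participants through the rows of the Latin square."""
    orders = latin_square_orders(conditions)
    return {
        person: orders[index % len(orders)]
        for index, person in enumerate(participants)
    }


# Hypothetical study: three designs, six participants.
plan = assign_orders(["P1", "P2", "P3", "P4", "P5", "P6"], ["A", "B", "C"])
for person, order in plan.items():
    print(person, order)
```

A cyclic square balances position but not which condition follows which; a between-subject design avoids order effects altogether at the cost of needing more participants.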
89Participant Issues
- Addressing All Possible User-types (good luck)
- How many?
- Relationship to statistical significance
- Pilot Study format
- Discount Usability - whose rule?
- Selecting subjects
- Screeners
- Getting Subjects
- Convenience Sampling
- Recruiting
- Participant stipends
- Over recruiting
- Scheduling
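On the "How many?" question, the discount usability argument is usually tied to the Nielsen and Landauer problem-discovery model, in which the share of problems found by n participants is 1 - (1 - p)^n. The slide does not say which rule it has in mind, so treat the sketch below as one reading; the default p = 0.31 is the average those authors reported, and real projects can differ widely.

```python
import math


def proportion_found(n, p=0.31):
    """Expected share of usability problems found by n participants.

    p is the chance that a single participant reveals a given problem
    (0.31 is an assumed average; it varies by product and task).
    """
    return 1 - (1 - p) ** n


def participants_needed(target, p=0.31):
    """Smallest n for which proportion_found(n, p) reaches `target` (0 < target < 1)."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


for n in (1, 3, 5, 10):
    print(n, "participants ->", round(proportion_found(n), 2))
print("Needed for 80%:", participants_needed(0.80))
```

This model says nothing about statistical significance; it only estimates how quickly additional participants stop revealing new problems.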
90Defining Task Scenarios
- Areas of concern, redesign, or client interest
- Short, unambiguous tasks to be performed
- Wording is critical
- In the user's own terms
- Does not contain seeds to the correct solution
- Enough to form a complete test but able to stay
within the time limit
91Preparing Test Materials
- Consent form
- Video release form
- Receipt and confidentiality agreement
- Demographic form
- Introductory comments
- Participant task descriptions
- Questionnaires, SUS, etc.
- Note Taker's Forms
- Facilitator's Guide
92Piloting the Design
- Dry running the entire experiment
- Getting subjects
- Convenience sampling
- Collect data
93Collecting Data
- Superman versus Facilitator/Observer
- Collecting interaction data
- Collecting observed data
- Behavior
- Reactions
- Collecting participant comments
- Collecting subjective data
- Pre-test data
- Post scenario data
- Post test data
94Analyzing Data
- Descriptive versus Predictive Statistics
- Statistical Analysis
- T-test
- F-test
- Statistical Significance vs. The Principle of
Inter-ocular Drama
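To make the t-test item concrete, here is a minimal sketch that compares task times from two designs. It uses Welch's variant of the t-test (which does not assume equal variances), chosen for this illustration; the slide only says "T-test". The data are invented and the function name is hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Invented task-completion times in seconds for two designs.
design_a = [41, 38, 45, 50, 36]
design_b = [30, 28, 35, 33, 29]

t_stat, dof = welch_t(design_a, design_b)
print("t =", round(t_stat, 2), "df =", round(dof, 1))
```

Turning t and the degrees of freedom into a p-value requires a t-distribution table or a statistics library. With samples this small, a plot of the raw times is often just as persuasive, which is the inter-ocular point.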
95Reporting the Results
- Briefing Results
- Written Reports
- Highlights Tapes
- The "a picture is worth a thousand words"
principle
- NIST CIF
96Types of Usability Testing Used During IV&V
- Non-User Based
- Expert Review
- Compliance Reviews
- Heuristic Evaluations
- Cognitive Walkthroughs
- User-based
- User Surveys
- Ethnographic Observation
- Performance-based
- Think Aloud
- Co-Discovery
97Wrap-up
98Misc. Things
- When is Usability Testing Conducted?
- Once
- A few times during development
- Throughout the Development Cycle
- Longitudinally
- Mock-up versus Prototypes versus Actual
Systems/Applications
- The much maligned paper prototype
- Non-operational interfaces and the Wizard of Oz
technique
- Comparative Evaluations
99Testing Special Situations
- PDAs, Cell Phones and other Handheld devices
- Telephone and IVR interfaces
- Remote Usability Testing
100Testing Special Populations
- Kids
- Consent
- Incentives
- Parental Issues
- Process Management
- Administering Questionnaires and other data
collection issues
- Users with Disabilities
- Special Interface devices
- Interaction protocols
- Communications
- Administering Questionnaires and other data
collection issues
101References
102References
- Handbook of Human Factors Testing and Evaluation,
O'Brien & Charlton
- A Practical Guide to Usability Testing, Dumas &
Redish
- Handbook of Usability Testing: How to Plan,
Design, and Conduct Effective Tests, Jeffrey
Rubin
- Cost-Justifying Usability, Randolph Bias &
Deborah Mayhew