Title: Practical Results from Large-Scale Web Usability Testing

Slide 1: Practical Results from Large-Scale Web Usability Testing
- Rolf Molich
- DialogDesign
Slide 2: http://www.dialogdesign.dk/cue2.htm
- Slides in Microsoft PowerPoint 97 format
- Download test reports and slides
Slide 3: How It All Started...
- A recent survey shows that 80% of all Danish drivers think that their driving skills are above average.
- How about usability testers?
Slide 4: How It All Started...
- Too much emphasis on one-way mirrors and scan converters
- Little knowledge of REAL usability testing procedures - mainly beautified descriptions
- Too little emphasis on usability test procedures and quality control (Who checks the checker?)
Slide 5: Who Checks the Checker?
- When did YOU last have an objective check of your usability testing skills?
- Who would you trust as an evaluator of your skills?
Slide 6: Comparative Evaluations

  Test  End     Test object                  Student teams  Professional teams
  1     Oct 97  9 Danish web-sites           50             0
  2     Dec 97  CUE-1: Win calendar program  0              4
  3     Oct 98  9 Danish web-sites           50             0
  4     Dec 98  CUE-2: www.hotmail.com       23             7
  5     Mar 99  Web text - encyclopedia      0              4
Slide 7: Student Tests
- Introductory course in Human-Computer Interaction at the Technical University of Copenhagen
- Two courses, in the fall of 1997 and 1998
- 120 students per course
- Fifty teams of one to three students
- 2 x 9 Danish web-sites, each tested by four to nine teams with at least four test participants
- Three weeks to complete the test and write the report
Slide 8: Can Students Do Usability Testing?
- Quality of usability tests and reports is acceptable, considering that most teams used 20-50 hours
- Some teams wrote quite professional reports after just one month of the course (Surprise?)
- Few false problems and opinions
- Limited overlap between findings
Slide 9: (no transcript)
Slide 10: www.bokus.com - Bookstore
- Buttons in lower right corner:
  - Empty shopping basket
  - Change order
  - Continue shopping
  - Go on with your purchase
- Would a human bookseller act like this?
Slide 11: Conclusions
- Inhuman treatment of users on many e-commerce web-sites
- On-site searching seldom works; users are better off without it
- Many web-sites focus on the company, not the user
Slide 12: Conclusions
- Nice layout and graphics
- Good response time
- Give correct results
Slide 13: Problem Example
- User task: You want to take your business to BG Bank. Make an appointment with the bank.
- Hard to find in the menu structure
- Users entered "appointment" as a keyword for Search
Slide 14: (no transcript)
Slide 15: How to Improve Search
- Tolerate user input errors
- Provide human, constructive error messages
- Recommend the index and the site map
- Handle frequent keywords specially
- Show the user's search keywords in context
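The first and fourth points above can be sketched in a few lines: normalize the query, check it against a hand-curated table of frequent keywords, and fall back to fuzzy matching before giving up. A minimal sketch, assuming a small hypothetical keyword table and target pages (none of these names come from the talk):

```python
# Sketch of a "tolerant" site search front end. The keyword table and
# target URLs below are hypothetical examples, not from the CUE studies.
import difflib

# Frequent keywords mapped to curated answer pages (the slide's
# "special handling of frequent keywords").
FREQUENT_KEYWORDS = {
    "appointment": "/contact/book-a-meeting",
    "opening hours": "/about/opening-hours",
}

def tolerant_search(query: str) -> str:
    """Return a target page, tolerating spelling errors in the query."""
    normalized = query.strip().lower()
    if normalized in FREQUENT_KEYWORDS:
        return FREQUENT_KEYWORDS[normalized]
    # Tolerate input errors: fuzzy-match against the frequent keywords.
    close = difflib.get_close_matches(
        normalized, list(FREQUENT_KEYWORDS), n=1, cutoff=0.8
    )
    if close:
        return FREQUENT_KEYWORDS[close[0]]
    # Constructive fallback (recommend the site map) instead of "0 hits".
    return "/sitemap"

print(tolerant_search("apointment"))  # -> /contact/book-a-meeting
```

A misspelled query like "apointment" still reaches the right page, and an unmatched query gets a constructive recommendation rather than an empty result list.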
Slide 16: CUE-1 - Comparative Usability Evaluation 1
- Four professional teams usability tested the same Windows calendar program
- Two US teams (Sun, Rockwell), one English (NPL), and one Irish (HFRG, Univ. Cork)
- Results published in a panel and a paper at UPA 98
- Main conclusions similar to CUE-2
Slide 17: CUE-2 - Comparative Usability Evaluation 2
- Nine teams have usability tested the same web-site:
  - Five professional teams
  - Two semi-professional teams
  - Two student teams
  - (plus three student teams from TUD)
- Test web-site: www.hotmail.com
Slide 18: Purposes of CUE-2
- Provide a survey of the state of the art in professional usability testing of web-sites
- Set a benchmark against which other usability labs can measure their usability testing skills
- Investigate the reproducibility of usability test results
- Give participating teams an idea of the strengths and weaknesses in their approach to usability testing
Slide 19: Non-Purposes of CUE-2
- To pick a winner
- To make a profit
Slide 20: Basis for Usability Test
- Web-site address: www.hotmail.com
- Client scenario (written by Erika Kindlund and Meeta Arcuri)
- Access to the client through an intermediary (Erika Kindlund)
- One month to carry out the test; web-site address not disclosed until the start of the test period
Slide 21: What Each Team Did
- Familiarize themselves with Hotmail
- Define test scenarios
- Define a user profile and recruit test participants
- Run a suitable number of tests, determined by the team
- Write a usability test report in standard company format and anonymize it
Slide 22: Problems Found

                             CUE-1      CUE-2
  Total number of problems   141        300
  Found by seven teams       -          1
  Found by six teams         -          1
  Found by five teams        -          4
  Found by four teams        1          4
  Found by three teams       1          15
  Found by two teams         11         49
  Found only by one team     128 (91%)  226 (75%)
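The percentages in parentheses are the share of problems found by only a single team; a quick check of the arithmetic, using the counts from the table:

```python
# Verify the single-team percentages from the slide's counts
# (rounded to whole percent).
cue1_total, cue1_single_team = 141, 128
cue2_total, cue2_single_team = 300, 226

cue1_pct = round(100 * cue1_single_team / cue1_total)
cue2_pct = round(100 * cue2_single_team / cue2_total)
print(cue1_pct, cue2_pct)  # -> 91 75
```

Both match the slide: roughly nine out of ten CUE-1 problems, and three out of four CUE-2 problems, were reported by only one team.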
Slide 23: Comparison of Tests
- Based mainly on the test reports
- Focus on significant differences
- Selection of parameters for comparison based on two generally recognized textbooks:
  - Dumas and Redish, A Practical Guide to Usability Testing
  - Jeff Rubin, Handbook of Usability Testing
Slide 24: Resources

  Team                        A    B    C    D     E    F    G    H    J
  Person-hours used for test  136  123  84   (16)  130  50   107  45   218
  Usability professionals     2    1    1    1     3    1    1    3    6
  Number of tests             7    6    6    50    9    5    11   4    6
Slide 25: Usability Test Reports

  Team            A   B   C   D   E   F   G   H   J
  Pages           16  36  10  5   36  19  18  11  22
  Exec summary    Y   Y   N   N   N   Y   N   Y   Y
  Screen shots    10  0   8   0   1   2   1   2   0
  Severity scale  2   3   2   1   2   1   1   3   4
Slide 26: Usability Results

  Team                            A    B    C    D    E    F    G    H    J
  Positive findings               0    8    4    7    24   25   14   4    6
  Problems                        26   150  17   10   58   75   30   18   20
  Exclusive problems (%)          42   71   24   10   57   51   33   56   60
  Core problems found (100% = 26) 38   73   35   8    58   54   50   27   31
  Person-hours used for test      136  123  84   NA   130  50   107  45   218
Slide 27: Results
- There are overwhelmingly many usability problems
- There are many serious usability problems
- Limited overlap between team findings
Slide 28: Conclusions
- In most cases, no form of cost-effective testing will find all or most of the problems - or even most of the serious ones
- Claims like "Method X finds at least 80% of all serious usability problems" are not in accordance with the results of this study
Slide 29: Problems Found in CUE-2
- Total number of different usability problems found: 300
- Found by seven teams: 1
- Found by six teams: 1
- Found by five teams: 4
- Found by four teams: 4
- Found by three teams: 15
- Found by two teams: 49
- Found only by one team: 226
Slide 30: Problem Found by Seven Teams
- During the registration process, Hotmail users are asked to provide a password hint question. The corresponding text box must be filled in.
- Most users did not understand the meaning of the password hint question. Some entered their Hotmail password in the text box.
- Clever but unusual mechanisms like the password hint question must be explained carefully to users.
Slide 31: (no transcript)
Slide 32: Problem Example
- Users consistently glanced briefly at this screen and then, without hesitation, clicked the button "I Accept"
- The button "I Accept" is very conveniently placed (usable), but the text is quite difficult to read. The text is written in legalese, not in "webbish".
- Users want text that they can skim, skim, and read
- Do unusable Terms of Service have any legal value?
Slide 33: (no transcript)
Slide 34: Problems with Terms of Service
- Difficult to read - legalese, not English
- Does not answer important user questions about privacy and cost
- Not in the user's native language
- Signals "Don't waste your time on this"
- The button "I Agree" is too usable
- No information on how to return to the Terms of Service
Slide 35: (no transcript)

Slide 36: (no transcript)
Slide 37: Language-Related Problems
Examples of language-related problems that were detected by European teams:
- Send Mail: the term "Compose" is difficult to understand. Use "Create new message" or "Write Mail" (5/9 teams)
- Create new account: the "State/Province" text box is required but does not make sense in many countries (2/9 teams)
Slide 38: Language-Related Problems
Some language-related problems suggested by US teams were not confirmed by the European test teams:
- Change "last name" to "family name"
- The meaning of "U.S. Residents Only" and "Non-U.S. Residents Only" is unclear
Slide 39: (no transcript)
Slide 40: Advice for a Usable Usability Report
- List problems with severity and the affected users
- Distinguish clearly between:
  - Personal opinions
  - Expert opinions
  - User opinions
  - User findings
Slide 41: Some State-of-the-Art Boundaries
- No power-user test, although four teams also recruited power users
- Few tests that require a complicated setup. Examples: attachments; boundary testing, e.g. a large number of e-mails in the in-box
- Teams completed their usability tests within schedule, but they never compared their results to those from the other teams
Slide 42: Conclusions
- The total number of usability problems for each tested web-site is huge - much larger than you can hope to find in one series of usability tests
- Usability testing techniques can be improved considerably
- We need more awareness of the usability of usability work
Slide 43: http://www.dialogdesign.dk/cue2.htm
- Slides in Microsoft PowerPoint 97 format
- CUE-2 panel: Tuesday at 4.30 p.m.
- Download test reports and slides
Slide 44: (no transcript)