Title: Usability Testing II
1Usability Testing II
2Reminder
- Preparation
- Goals, users, tasks, materials, equipment, team
- Pilot study
- Check method and make changes
- Conduct the study
- Analyse and present results
3Considerations when running a test
- Conduct a pilot test
- Check Equipment
- Make users feel at ease
- Ensure you have all necessary consents
- Use a procedural checklist
- Devise a labelling scheme for all materials
- De-brief participants
4Results
- List of problems
- Data on times, errors etc.
- Data on participants' subjective ratings
- Participant comments
- Video tapes
5Triangulation
- Handling multiple data sources to find major usability problems
- User comments, team observations and quantitative data are combined to give the problem set
6Determinants of analysis methods
- Single system usability evaluation
- Test(s) focus on areas of concern
- Primary aim: to identify usability problems and recommend fixes
- Experimental evaluation
- Test multiple systems so that a comparison can be made
- Aim to test a particular hypothesis
7Data Gathering Techniques and Data Types
Technique: data type
- Task completion: task failure (Quant)
- Time on task: speed data (Quant)
- Error counts: problem counts (Quant)
- Verbal data: users' mental models (Qual)
- Retrospective ratings: satisfaction scores (Quant)
- Video data: comments, gestures (Qual)
Quant = quantitative; Qual = qualitative
8Structure of what follows
- Verbal data analysis
- Scale analysis
- Time, error analysis
- Classifying Problem severity
9Verbal Data
- Comments made by participant or team that relate to experienced difficulties, frustration, inferences
- Verbal protocol data
- Moment-by-moment decision making
- Can be difficult to elicit
- Often seen to be unnatural
10Rating Scales (Likert Scales)
- Write a set of statements, some positive and others negative in tone
- Decide upon a scoring scheme
- 5 for strongly agreeing with a positive statement
- 1 for strongly disagreeing with a positive statement
- 1 for strongly agreeing with a negative statement
- 5 for strongly disagreeing with a negative statement
- So the higher the score, the more positive the perceptions expressed by the user.
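The scoring scheme above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the function names and the sample answers are hypothetical, and the only rule taken from the slide is that negative statements are reverse-scored so that a higher total always means a more positive perception.

```python
# Map response labels to the 1-5 agreement scale.
AGREEMENT = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Uncertain": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def score_item(response, positive):
    """Score one item so that higher always means a more positive view."""
    raw = AGREEMENT[response]
    # Negative statements are reverse-scored: strong agreement scores 1, not 5.
    return raw if positive else 6 - raw


def score_questionnaire(answers):
    """Sum item scores; `answers` is a list of (response, positive) pairs."""
    return sum(score_item(response, positive) for response, positive in answers)


# Hypothetical answers from one participant (not data from a real test).
answers = [
    ("Agree", True),           # positively worded statement
    ("Strongly Agree", True),  # positively worded statement
    ("Disagree", False),       # negatively worded statement
]
total = score_questionnaire(answers)
print(total)  # 4 + 5 + 4 = 13
```

Keeping the positive/negative flag with each item means the items can be shuffled on the questionnaire without changing how they are scored.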
11Likert Scales
Statement                         Strongly Disagree  Disagree  Uncertain  Agree  Strongly Agree
The system was easy to navigate           1              2         3        4          5
The system was easy to use                1              2         3        4          5
Link labels were ambiguous                5              4         3        2          1
12Likert Scale analysis
- Sum the scores
- E.g. on the previous slide the sum is 12
- When comparing two systems you could see if there are any significant differences between the two.
- Alternatively use an off-the-peg scale such as SUMI, WAMMI or SUS to give an overall satisfaction score.
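As an illustration of an off-the-peg scale, the sketch below applies the standard SUS scoring rule: ten items answered on a 1 to 5 scale, odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name and the example responses are hypothetical.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for ten responses, each from 1 to 5."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs exactly ten responses in the range 1-5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5


# Hypothetical responses from one participant, items 1 to 10 in order.
example = [4, 2, 4, 2, 5, 1, 4, 2, 4, 2]
print(sus_score(example))  # 80.0
```

Note that SUS alternates positive and negative items, which is the same guard against response set described for hand-written Likert scales.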
13Likert Scales Problems and Caveats
- Requires users to reflect on interaction
- Hence problems with accuracy
- Subject to response set: users always agreeing or always disagreeing
- Randomize items, half positive and half negative in tone
- Reverse the scale for negative items
- Limited statistics
14Time and error data
- Tabulate data for each participant and each task
- Look for trends across tasks and participants
- Examine outliers
- Look for evidence to explain these trends: triangulation
15Tabulation example - search speed
        T1     T2     T3     T4    Mean   Total
P1     33.6   32.1   23.4   29.5   29.7   118.6
P2     26.9   33.5   25.7   28.8   28.7   114.9
P3     48.6   60.1   54.9   48.7   53.1   212.3
P4     42.0   40.0   37.8   40.0   39.9   159.8
P5     31.0   31.1   28.2   31.7   30.5   122.0
P6     68.0   20.9   20.6   32.8   35.6   142.3
P7     33.0   34.1   36.9   39.8   35.9   143.8
P8     28.7   19.0   26.9   29.6   26.1   104.2
Total 311.8  270.8  254.4  280.9
Mean   38.9   33.9   31.8   35.1
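The tabulation can be reproduced with a short script. The sketch below is illustrative only: the assignment of each time to tasks T1 to T4 is an assumption, chosen so that the column totals match those on the slide, and the outlier rule (a time more than 1.5 times the participant's own median) is an arbitrary threshold picked for the example, not a standard from the lecture.

```python
from statistics import mean, median

TASKS = ["T1", "T2", "T3", "T4"]

# Search times per participant, listed in task order T1..T4.
# The task order within each row is an assumption (see the note above).
times = {
    "P1": [33.6, 32.1, 23.4, 29.5],
    "P2": [26.9, 33.5, 25.7, 28.8],
    "P3": [48.6, 60.1, 54.9, 48.7],
    "P4": [42.0, 40.0, 37.8, 40.0],
    "P5": [31.0, 31.1, 28.2, 31.7],
    "P6": [68.0, 20.9, 20.6, 32.8],
    "P7": [33.0, 34.1, 36.9, 39.8],
    "P8": [28.7, 19.0, 26.9, 29.6],
}


def row_summary(values):
    """Return (total, mean) for one participant, rounded to one decimal."""
    return round(sum(values), 1), round(mean(values), 1)


def column_totals(table):
    """Return the total time per task across all participants."""
    return [round(sum(row[i] for row in table.values()), 1)
            for i in range(len(TASKS))]


def flag_outliers(table, factor=1.5):
    """Flag times more than `factor` times the participant's own median."""
    flagged = []
    for participant, values in table.items():
        limit = factor * median(values)
        for task, value in zip(TASKS, values):
            if value > limit:
                flagged.append((participant, task, value))
    return flagged


for participant, values in times.items():
    print(participant, *row_summary(values))
print("Column totals:", column_totals(times))
print("Outliers:", flag_outliers(times))
```

With this rule the only flagged cell is P6 on T1 (68.0), which is the kind of value to follow up in the video and verbal data. A slow participant overall, such as P3, is not flagged by a per-participant rule and would need a comparison across participants instead.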
16Examining usability concerns
- Tasks should be linked to your usability concerns
- Look at all data sources for key tasks
- Time
- Video
- Screen capture
17Organising problems by scope and severity
- Scope
- How widespread the problem is: does it affect the majority of users?
- Severity
- How critical the problem is: does it prevent task completion?
18Scope
- Local problems
- Are restricted to one screen or page
- E.g. a date entry field does not specify if the date should be dd/mm/yy or mm/dd/yy
- Global problems
- Are of major importance
- The scope of global problems extends across a number of screens or pages
- E.g. menu items are difficult to find; users spend a lot of time looking through various menus
19Severity
- Grade 1: Problems that prevent task completion
- E.g. continual selection of the wrong menu option
- Grade 2: Problems that create a major delay or cause frustration
- E.g. repeating a task due to lack of feedback
- Grade 3: Problems that have a minor effect
- Causing the user to question, e.g. using different terminology for the same thing: "send a computer message" and "send an email message"
- Grade 4: Subtle problems that can often lead to product enhancements
- E.g. "It would be nice if ..."
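One way to keep a problem list organised by scope and severity is to record both for each problem and sort on them. The sketch below is a hypothetical structure, not a format from the lecture: the `Problem` class, the `prioritise` function and the grades given to the sample problems are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    description: str
    severity: int     # 1 (prevents task completion) to 4 (subtle)
    is_global: bool   # True if the problem spans several screens or pages


def prioritise(problems):
    """Most severe grade first; within a grade, global before local."""
    return sorted(problems, key=lambda p: (p.severity, not p.is_global))


# Sample problems based on the slide examples; the grades are illustrative.
problems = [
    Problem("Date field does not state dd/mm/yy or mm/dd/yy", 3, False),
    Problem("Inconsistent terminology for sending messages", 3, True),
    Problem("Menu items are difficult to find", 1, True),
    Problem("No feedback after submitting, so the task is repeated", 2, False),
]

for p in prioritise(problems):
    scope = "global" if p.is_global else "local"
    print(f"Grade {p.severity} ({scope}): {p.description}")
```

Sorting on the pair (severity, scope) puts the Grade 1 global problem at the top of the report, which matches the advice to focus on global problems first.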
20Tests with multiple conditions
21Use of Statistics
- Usability Tests of a single system
- Descriptive statistics
- Frequency counts of errors
- Average task completion times: mean
- Score variability: standard deviation
- Depict data in tables or bar charts
22Standard deviation
- Describes the variability of the data
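A short sketch of why the standard deviation matters alongside the mean, using the four search times recorded for P6 and P7 in the tabulation example. It assumes the sample standard deviation (dividing by n - 1), which is what Python's `statistics.stdev` computes; the variable names are illustrative.

```python
from statistics import mean, stdev

# The four search times for two participants from the tabulation example.
p6 = [32.8, 20.6, 20.9, 68.0]
p7 = [39.8, 36.9, 34.1, 33.0]

for name, values in (("P6", p6), ("P7", p7)):
    # stdev() is the sample standard deviation:
    # sqrt(sum((x - mean)^2) / (n - 1))
    print(f"{name}: mean = {mean(values):.1f}, sd = {stdev(values):.2f}")
```

The two means are almost the same (roughly 36), but the standard deviation for P6 is about 22 against about 3 for P7. The mean alone would hide the single very slow task in P6's data.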
23Experiments
- Performance with system B is significantly better in terms of search speed than system A
- Inferential statistics
- Dangerous in the wrong hands
- Need to understand the principles and tests used.
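To make "significantly better" concrete, the sketch below computes a paired t statistic using only the standard library. Everything in it is an assumption for illustration: the data are invented, the design is assumed to be within-subjects (the same eight participants use both systems), and a paired t-test is only one of several tests that could be appropriate. The critical value 2.365 is the two-tailed 5% value of t for 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical search times for the same eight participants (invented data).
system_a = [30.0, 28.0, 52.0, 40.0, 31.0, 36.0, 35.0, 27.0]
system_b = [27.0, 27.0, 47.0, 36.0, 30.0, 31.0, 33.0, 25.0]


def paired_t(first, second):
    """Return the paired t statistic and its degrees of freedom."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


t, df = paired_t(system_a, system_b)
CRITICAL_T = 2.365  # two-tailed, alpha = 0.05, df = 7
print(f"t({df}) = {t:.2f}")
print("significant at the 5% level" if abs(t) > CRITICAL_T else "not significant")
```

The statistic only answers whether the difference is unlikely to be chance under the test's assumptions. Whether those assumptions hold for a given study is the part that needs understanding of the principles behind the test.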
24Change Recommendation
- Use triangulation to find the cause of problems
- Consider the whole product or site
- Focus on global problems
- These may affect many aspects of the interface
- Probably more symptoms than those found in the
test.
25Presenting the Results
- Usually in report format
- Detail problems with impact and cause (report format)
- Write for the intended audience
- Keep the report succinct and to the point
- Often supported by a highlight DVD
- Presentation of key issues
- Supported by video data
26Summary
- Usability tests are empirical, therefore problems must be grounded in evidence
- Use triangulation to uncover cause and effect
- Make use of appropriate descriptive statistics
- Organise problems by scope and severity
- Global problems are key
- Write concise, succinct reports