Title: Automating Assessment of Web Site Usability
Marti Hearst, University of California, Berkeley
2. The Usability Gap
- 196M new Web sites in the next 5 years (Nielsen '99)
- Most sites have inadequate usability (Forrester, Spool, Hurst: users can't find what they want 39-66% of the time)
4. Usability affects the bottom line
- IBM case study 1999
- Spent millions to redesign site
- 84% decrease in help usage
- 400% increase in sales
- Attributed to improvements in information architecture
5. Usability affects the bottom line
- Creative Good study 1999
- Studied 10 e-commerce sites
- 59% of attempts failed
- If 25% of these had succeeded: estimated additional $3.9B in sales
6. Talk Outline
- Web Site Design
- Automated Usability Evaluation
- Our approach: WebTANGO
- Some Empirical Results
- Wrap-up
- Joint work with Melody Ivory and Rashmi Sinha
7. Web Site Design (Newman et al. 00)
- Information design
- structure, categories of information
- Navigation design
- interaction with information structure
- Graphic design
- visual presentation of information and navigation (color, typography, etc.)
Courtesy of Mark Newman
8. Web Site Design (Newman et al. 00)
- Information Architecture
- includes management and more responsibility for content
- User Interface Design
- includes testing and evaluation
Courtesy of Mark Newman
9. Web Site Design Process
[Process diagram] Start → Discovery: assemble information relevant to the project
Courtesy of Mark Newman
11. Usability Evaluation: Standard Techniques
- User studies
- Potential users use the interface to complete some tasks
- Requires an implemented interface
- "Discount" Usability Evaluation
- Heuristic Evaluation
- Usability expert assesses guidelines
12. Automated UE
- We looked at 124 methods
- AUE is greatly under-explored
- Only 36 of all methods
- Fewer methods for the web (28)
- Most techniques require some testing
- Only 18 are free from user testing
- Only 6 for the web
13. Survey of Automated UE
- Predominant methods (Web)
- Structural analysis (4)
- Bobby; Scholtz & Laskowski 98; Stein 97
- Guideline reviews (11)
- Log file analysis (9)
- Chi et al. 00, Drott 98, Fuller & de Graaff 96, Guzdial et al., Sullivan 97, Theng & Marsden 98
- Simulation (2)
- WebCriteria (Max), Chi et al. 00
14. Existing Metrics
- Web metric analysis tools report on what is easy to measure
- Predicted download time
- Depth/breadth of site
- We want to worry about
- Content
- User goals/tasks
- We also want to compare alternative designs.
15. WebTANGO: Tool for Assessing NaviGation and Organization
- Goal: automated support for comparing design alternatives
- How: assess usability of the information architecture
- Approximate information-seeking behavior
- Output: quantitative usability metrics
16. Benefits/Tradeoffs
- Benefits
- Less expensive than traditional methods
- Use early in design process
- Tradeoffs
- Accuracy?
- Validate methodology with user studies
- Illustrates different problems than traditional methods
- For comparison purposes only
- Does not capture subjective measures
17. Information-Centric Sites
18. Guidelines
- There are many usability guidelines
- A survey of 21 sets of Web guidelines found little overlap (Ratner et al. 96)
- Why?
- Our hypothesis: they have not been empirically validated
- So let's figure out what works!
19. An Empirical Study
Which features distinguish well-designed web pages?
20. Methodology
- Collect quantitative measures from 2 groups
- Ranked: sites rated favorably via expert review or user ratings
- Unranked: sites that have not been rated favorably
- Statistically compare the groups
- Predict group membership
21. Quantitative Measures
- Identified 42 aspects from the literature
- Page composition (e.g., words, links, images)
- Page formatting (e.g., fonts, lists, colors)
- Overall page characteristics (e.g., information layout quality, download speed)
22. Metrics
- Word Count
- Body Text Percentage
- Emphasized Body Text Percentage
- Text Positioning Count
- Text Cluster Count
- Link Count
- Page Size
- Graphic Percentage
- Graphics Count
- Color Count
- Font Count
- Reading Complexity
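As an illustration of how metrics like these can be computed, here is a minimal sketch (not the actual WebTANGO metrics tool) that tallies a few of the counts above from raw HTML using Python's standard-library parser; the sample page is invented:

```python
# Sketch only: counting words, links, images, and font tags on one page
# with the stdlib HTMLParser. Not the WebTANGO implementation.
from html.parser import HTMLParser

class PageMetrics(HTMLParser):
    """Accumulates simple page-composition counts while parsing."""
    def __init__(self):
        super().__init__()
        self.word_count = 0
        self.link_count = 0
        self.graphics_count = 0
        self.font_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(k == "href" for k, _ in attrs):
            self.link_count += 1          # Link Count
        elif tag == "img":
            self.graphics_count += 1      # Graphics Count
        elif tag == "font":
            self.font_count += 1          # Font Count (rough proxy)

    def handle_data(self, data):
        self.word_count += len(data.split())   # Word Count

html = '<p>Welcome to our <a href="/help">help</a> pages.</p><img src="x.gif">'
m = PageMetrics()
m.feed(html)
print(m.word_count, m.link_count, m.graphics_count)  # 5 1 1
```

A real tool must also handle frames, style sheets, and scripts, which this sketch ignores.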
23. Data Collection
- Collected data for 2,015 information-centric pages from 463 sites
- Education, government, newspaper, etc.
- Data constraints
- At least 30 words
- No e-commerce pages
- Exhibit high self-containment (i.e., no style sheets, scripts, applets, etc.)
- 1,054 pages fit constraints (52%)
24. Data Collection
- Ranked pages
- Favorably assessed by expert review or user rating on expert-chosen sites
- Sources
- Yahoo! 101 (ER)
- Web 100 (UR)
- PC Mag Top 100 (ER)
- WiseCats Top 100 (ER)
- Webby Awards (ER), People's Voice (UR)
25. Data Collection
- Unranked pages
- Not favorably assessed by expert review or user rating on expert-chosen sites
- Do not assume unranked means unfavorable
- Sources
- WebCriteria's Industry Benchmark
- Yahoo Business & Economy category
- Others
26. Data Analysis
- 428 pages analyzed
- 214 ranked pages
- 214 of the 840 unranked pages, chosen randomly
27. Findings
- Several features are significantly associated with ranked sites
- Several pairs of features correlate strongly
- Correlations mean different things in ranked vs. unranked pages
- Significant features are partially successful at predicting whether a site is ranked
28. Significant Differences
29. Significant Differences
- Ranked pages have:
- More text clustering (facilitates scanning)
- More links (facilitates info-seeking)
- More bytes (more content → facilitates info-seeking)
- More images (clustered graphics → facilitates scanning)
- More colors (facilitates scanning)
- Lower reading complexity (close to the best numbers in the Spool study → facilitates scanning)
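The slides do not say which reading-complexity formula was used. As one concrete possibility, the widely used Flesch-Kincaid grade level can be computed as below; the syllable counter is a rough heuristic, and the sample sentence is invented:

```python
# Illustrative only: Flesch-Kincaid grade level as one possible
# "reading complexity" metric. Not necessarily the study's formula.
import re

def count_syllables(word):
    # Heuristic: count runs of consecutive vowels (y included).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text):
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    n = len(words)
    # Standard Flesch-Kincaid grade-level coefficients.
    return 0.39 * (n / sentences) + 11.8 * (syllables / n) - 15.59

print(round(fk_grade("The cat sat on the mat. It was warm."), 1))
```

Short sentences of monosyllabic words score very low (even negative), matching the finding that ranked pages keep reading complexity down.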
30. Metric Correlations
31. Metric Correlations
- Created hypotheses based on correlations
- Ranked pages
- Colored display text
- Link clustering
- → Both patterns on all pages in random sample
- Unranked pages
- Display text coloring plus body text emphasis or clustering
- Link coloring or clustering
- Image links, simulated image maps, bulleted links
- → At least 2 patterns in 70% of random sample
- Confirmed by sampling
32. Two Examples
33. Ranked Page
Colored display text; link clustering
34. Unranked Page
Body text emphasis; image links
35. Predicting Web Page Rating
- Linear regression
- Explains 10% of the difference between groups
- 63% accuracy (better at predicting unranked pages)
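The regression setup can be sketched as follows on toy data; the feature values, labels, and the 0.5 threshold are our invention, not the study's:

```python
# Sketch of the prediction step: fit a least-squares line to a 0/1
# "ranked" label on a single metric, then classify by thresholding
# the fitted value at 0.5. Toy data, not the study's 1,054 pages.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope          # intercept, slope

# Hypothetical text-cluster counts per page; label 1 = ranked.
clusters = [1, 2, 2, 3, 4, 5, 5, 6]
ranked   = [0, 0, 0, 0, 1, 1, 1, 1]
a, b = fit_line(clusters, ranked)
predict = lambda x: 1 if a + b * x >= 0.5 else 0
accuracy = sum(predict(x) == y for x, y in zip(clusters, ranked)) / len(ranked)
print(accuracy)  # 1.0 on this separable toy data
```

On the real data the same style of model reached only 63% accuracy, since the groups overlap far more than this toy example does.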
36. Predicting Web Page Rating
- Home vs. non-home pages
- Text cluster count predicts home page ranking
- 66% accuracy
- Consistent with the primary goal of home pages
- Non-home page prediction
- Consistent with full-sample results
- 4 of 6 metrics (link count, text positioning count, color count, reading complexity)
37. Another Rating System
- Web site ratings from RateItAll.com
- User ratings on a 5-point scale (1 = Terrible!, 5 = Great!)
- No rating criteria
- Small set of 59 pages (61% ranked)
- 54% of pages classified consistently
- Only 17% unranked with high rating → unranked sites properly labeled
- 29% ranked with medium rating → difference between expert and non-expert review
- Ranking predicted by graphics count with 70% accuracy
- → Carefully design studies with non-experts
38. Second Study (new results)
- Better rating data
- Webby Awards
- Sites organized into categories
- New metrics-computation tool
- More quantitative measures
- Processes style sheets, inline frames
- Larger sample of pages
39. Webby Awards 2000
- 27 categories
- We used finance, education, community, living, health, services
- 100 judges
- 6 criteria
- 3 rounds of judging
- We used the first round only
- 2000 sites initially
40. Webby Awards 2000
- 6 criteria
- Content
- Structure & navigation
- Visual design
- Functionality
- Interactivity
- Overall experience
- Factor analysis: first factor accounted for 91% of the variance
- Judgements somewhat normally distributed, with skew
41. New Metrics
42. Methodology
- Data collection
- 1,108 pages
- 163 sites
- 3 levels per site
- 14 metrics
- About 85% accurate
- Text cluster and text positioning counts less accurate
43. Preliminary Results
- Linear regression to predict Webby judges' ratings
- Top 30 vs. bottom 30
- Prediction accuracy
- 72% if categories not taken into account
- 83% if categories assessed separately
44. Significant Metrics by Category
45. Category-Based Profiles
- K-means clustering of good sites, according to the metrics
- Preliminary results suggest the sites do cluster
- Can use clusters to create profiles of good and poor sites for each category
- These can be used as empirically verified guidelines
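The clustering step can be illustrated with a minimal k-means sketch. Here k = 2 on a single invented metric, whereas the study clustered sites on all 14 metrics:

```python
# Minimal k-means sketch (k=2, one dimension). The data values below
# are hypothetical "word count" figures, not from the study.
def kmeans_1d(points, iters=10):
    c1, c2 = min(points), max(points)        # simple initialization
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute means.
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted([c1, c2])

centers = kmeans_1d([80, 90, 100, 400, 420, 450])
print(centers)
```

The two resulting centers would serve as the "profile" values for the two groups; with 14 metrics the centers become 14-dimensional vectors, but the algorithm is the same.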
46. Ramifications
- It is remarkable that such simple metrics predict so well
- Perhaps good design is good overall
- There may be other factors
- A foundation for a new methodology
- Empirical, bottom-up
- Does this reflect cognitive principles?
- But there is no one path to good design
47. Longer-Term Goal: A Simulator for Comparing Site Designs
48. Monte Carlo Simulation
- Have a model of the information structure
- Have a set of user goals
- Want to assess the navigation structure
- Compare alternatives/tradeoffs
- Identify bottlenecks
- Identify critically important pages/links
- Check all pairs of start/end points
- Check overall reachability before and after a change
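The reachability check in the last bullet can be sketched as a breadth-first search over the site's link graph; the page names below are hypothetical (echoing the Renter Support example on the next slides):

```python
# Sketch: BFS reachability over a site's link graph, to check which
# pages can still reach a target before and after a design change.
from collections import deque

def reachable_from(graph, start):
    """Return the set of pages reachable from `start` by following links."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in graph.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical site: page -> list of linked pages.
site = {
    "home": ["products", "support"],
    "products": ["renter-support"],
    "support": [],
    "renter-support": [],
}
print("renter-support" in reachable_from(site, "home"))     # True
print("renter-support" in reachable_from(site, "support"))  # False
```

Running this for every start page gives the all-pairs check; rerunning it on the modified graph shows whether a proposed change cuts off any goal page.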
49. [Figure] One Monte Carlo simulation step for Design 1, Task 1. The simulation starts from the home page; the target information is at Renter Support.
50. [Figure] Monte Carlo simulation results for Design 1, Task 1. Simulation runs start from all pages in the site. Average navigation times are shown for Tasks 2 & 3.
51. Monte Carlo Simulation
- At each step in the simulation
- Assume a probability distribution over a set of next choices
- The next choice is a function of
- The current goal
- The understandability of the choice
- Prior interaction history
- The overall complexity of the page
- Varying the distribution corresponds to varying properties of the links
- Spot-check important choices
53. In Summary
- Automated usability assessment should help close the Web usability gap
- We can empirically distinguish between highly rated Web pages and other pages
- Empirical validation of design guidelines
- Can build profiles of good vs. poor sites
- We are validating expert judgements against usability assessments via a user study
- Web use simulation is an under-explored and promising new approach
54. Current Projects
- Automating Web Usability (Tango)
- Melody Ivory, Rashmi Sinha
- Text Data Mining (Lindi)
- Barbara Rosario, Steve Tu
- Metadata in Search Interfaces (Flamenco)
- Ame Elliott, Andy Chou
- Web Intranet Search (Cha-Cha)
- Mike Chen, Jamie Laflen
55. More Information
- http://www.cs.berkeley.edu/~ivory/web
- http://www.sims.berkeley.edu/~hearst
57. Automated Usability Evaluation
- Logging/capture
- Pro: easy
- Con: requires an implemented system
- Con: don't know the user task (Web)
- Con: doesn't present alternatives
- Con: doesn't distinguish error from success
- Analytical modeling
- Pro: doable at design phase
- Con: models an expert
- Con: academic exercise
- Simulation
58. Research Issues: Navigation Predictions
- Develop a model for predicting link selection
- Requirements
- Information need (task metadata)
- Representation of pages (page metadata)
- Method for selecting links (relevance ranking)
- Maintaining the user's conceptual model during site traversal (scent: Fur97, LC98, Pir97)
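As a minimal stand-in for the relevance-ranking requirement (real scent models such as Fur97 and Pir97 are far richer), links can be ordered by word overlap between the information need and their anchor text; the need and link texts below are invented:

```python
# Sketch: rank candidate links by word overlap with the information need.
# A toy substitute for a real relevance/scent model, not the talk's method.
def score_links(need, links):
    need_words = set(need.lower().split())
    # Sort by descending overlap; Python's sort is stable, so ties
    # keep their original page order.
    return sorted(
        links,
        key=lambda text: -len(need_words & set(text.lower().split())),
    )

need = "find renter support contact"
links = ["About Us", "Renter Support", "Site Map", "Contact Support"]
print(score_links(need, links))
# ['Renter Support', 'Contact Support', 'About Us', 'Site Map']
```

In a full simulator this score would feed the next-choice distribution, combining with page metadata and interaction history as the requirements above list.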