Transcript and Presenter's Notes

Title: Evaluation


1
Evaluation
  • CS 7450 - Information Visualization
  • November 9, 2000

2
Area Focus
  • Most of the research in InfoVis that we've
    learned about this semester has been the
    introduction of a new visualization technique or
    tool
  • Fisheyes, cone trees, hyperbolic displays,
    tilebars, themescapes, sunburst, jazz, ...
  • "Isn't my new visualization cool?"

3
Evaluation
  • How does one judge the quality of work?
  • Different measures
  • Impact on community as a whole, influential ideas
  • Assistance to people in the tasks they care about

4
Strong View
  • Unless a new technique or tool helps people in
    some kind of problem or task, it doesn't have any
    value

5
Broaden Thinking
  • Sometimes the chain of influence can be long and
    drawn out
  • System X influences System Y influences System Z
    which is incorporated into a practical tool that
    is of true value to people
  • This is what research is all about (typically)

6
Evaluation in HCI
  • Takes many different forms
  • Qualitative, quantitative, objective, subjective,
    controlled experiments, interpretive
    observations, ...
  • Which ones are best for evaluating InfoVis
    systems?

7
Controlled Experiments
  • Good for measuring performance or comparing
    multiple techniques
  • What do we measure?
  • Performance, time, errors, ... (see the timing
    sketch below)
  • Strengths, weaknesses?
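A minimal sketch (not from the original slides) of the kind of per-task logging such measurements rely on, written in Python. The console prompt, task IDs, and single-string answers are all hypothetical; a real study would instrument the visualization tool itself.

```python
import time

def run_task(task_id, correct_answer):
    """Hypothetical task harness: times one task and counts wrong attempts."""
    errors = 0
    start = time.perf_counter()
    while True:
        answer = input(f"Task {task_id}: your answer? ").strip()
        if answer == correct_answer:
            break
        errors += 1  # each incorrect attempt counts as an error
    elapsed = time.perf_counter() - start
    return {"task": task_id, "time_s": round(elapsed, 1), "errors": errors}

# One hypothetical session with two made-up tasks.
if __name__ == "__main__":
    log = [run_task("T1", "Lawnmower"), run_task("T2", "Photography")]
    print(log)
```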

8
Subjective Assessments
  • Find out people's subjective views on tools
  • Was it enjoyable, confusing, fun, difficult, ...?
  • This kind of personal judgment strongly influences
    use and adoption, sometimes even overcoming
    performance deficits

9
Qualitative, Observational Studies
  • Watch systems being used (you can learn a lot)
  • Is it being used in the way you expected?
  • Ecological validity
  • Can suggest new designs and improvements

10
Running Studies
  • Beyond our scope here
  • Take CS 6750 (HCI) and you'll learn more about this

11
Confounds
  • Very difficult in InfoVis to compare apples to
    apples
  • UI can influence utility of visualization
    technique
  • Different tools were built to address different
    user tasks

12
Example
  • Let's design an experiment to compare the utility
    of Kohonen maps, VIBE and Themescapes in finding
    documents of interest...

13
Examples
  • Let's look at a couple of example studies that
    attempt to evaluate different InfoVis systems
  • Both are taken from a good journal issue whose
    focus is Empirical Studies of Information
    Visualizations
  • International Journal of Human-Computer Studies,
    Nov. 2000, Vol. 53, No. 5

14
InfoVis for Web Content
  • Study compared three techniques for finding and
    accessing information within typical web
    information hierarchies
  • Windows Explorer style tool
  • Snap/Yahoo style category breakdown
  • 3D hyperbolic tree with 2D list view (XML3D)

Risden, Czerwinski, Munzner and Cook '00
15
XML3D
16
Snap
17
Folding Tree
18
Information Space
  • Took 12,000 node Snap hierarchy and ported it to
    2D tree and XML3D tools
  • Fast T1 connection

19
Hypothesis
  • Since XML3D has more information encoded it will
    provide better performance
  • But maybe 3D will throw people off

20
Methodology
  • 16 participants
  • Tasks broken out by
  • Old category vs. New category
  • One parent vs. Multiple parents
  • Participants used XML3D and one of the other
    tools per session (vary order)
  • Time to complete task measured, as well as
    judgment on quality of task response

21
Example Tasks
  • Old - one
  • Find the Lawnmower category
  • Old - multiple
  • Find photography category, then learn what
    different paths can take someone there
  • New - one
  • Create new Elementary Schools category and
    position appropriately
  • New - multiple
  • Create new category, position it, determine one
    other path to take people there

22
Results
  • General
  • Used ANOVA technique (see the sketch below)
  • No difference between the two 2D tools, so their
    data were combined
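As an illustration of the ANOVA step, here is a minimal Python sketch using scipy.stats.f_oneway on made-up completion times. The real study used a within-subjects design and its raw data are not reproduced here, so this only shows the general shape of the analysis: first check the two 2D tools against each other, then pool them and compare against XML3D.

```python
from scipy import stats

# Hypothetical task-completion times (seconds); not the study's data.
explorer = [41.2, 38.5, 45.0, 52.3, 40.1, 47.8, 39.9, 44.4]
snap     = [43.0, 36.8, 48.1, 50.2, 42.5, 46.3, 41.7, 45.9]
xml3d    = [30.4, 28.9, 35.2, 33.0, 29.7, 36.1, 31.5, 32.8]

# Step 1: compare the two 2D-style tools against each other.
f_2d, p_2d = stats.f_oneway(explorer, snap)
print(f"2D vs 2D: F={f_2d:.2f}, p={p_2d:.3f}")

# Step 2: the study found no difference between the 2D tools,
# so their data were pooled and compared against XML3D.
combined_2d = explorer + snap
f, p = stats.f_oneway(combined_2d, xml3d)
print(f"pooled 2D vs XML3D: F={f:.2f}, p={p:.3f}")
```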

23
Results
  • Speed
  • Participants completed tasks faster with XML3D
    tool
  • Participants were faster on tasks with an existing
    category; the advantage was larger when a single
    parent was involved

24
Results
  • Consistency
  • No significant difference across all conditions
  • -> Quality of placements, etc., was pretty much
    the same throughout

25
Results
  • Feature Usage
  • What aspect of XML3D tool was important?
  • Analyzed people's use of parts of the tool (see
    the sketch below)
  • 2D list elements - 43.9% of time
  • 3D graph - 32.5% of time
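A minimal sketch of how feature-usage proportions like these could be computed from a timestamped interaction log. The component names and durations below are invented for illustration, not the study's data.

```python
from collections import Counter

# Hypothetical interaction log: (UI component, seconds spent) pairs,
# the kind of record one could derive from timestamped UI events.
events = [
    ("2d_list", 12.0), ("3d_graph", 8.5), ("2d_list", 20.3),
    ("search_box", 4.1), ("3d_graph", 15.2), ("2d_list", 9.6),
]

totals = Counter()
for component, seconds in events:
    totals[component] += seconds

grand_total = sum(totals.values())
for component, seconds in totals.items():
    print(f"{component}: {100 * seconds / grand_total:.1f}% of interaction time")
```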

26
Results
  • Subjective ratings (see the sketch below)
  • Conventional 2D received a slightly higher
    satisfaction rating, 4.85 vs. 4.5 on a 1-to-7 scale
  • Not significant
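The slide does not say which statistical test was applied to the ratings; as an illustration only, here is one way such a comparison could be run on hypothetical 1-7 satisfaction scores.

```python
from scipy import stats

# Hypothetical Likert ratings (1-7); not the study's raw data.
ratings_2d    = [5, 6, 4, 5, 5, 6, 4, 5]
ratings_xml3d = [5, 4, 5, 4, 6, 4, 5, 4]

t, p = stats.ttest_ind(ratings_2d, ratings_xml3d)
print(f"mean 2D = {sum(ratings_2d)/len(ratings_2d):.2f}, "
      f"mean XML3D = {sum(ratings_xml3d)/len(ratings_xml3d):.2f}, p = {p:.3f}")
# Ordinal Likert data is often analyzed nonparametrically instead,
# e.g. with stats.mannwhitneyu(ratings_2d, ratings_xml3d).
```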

27
Discussion
  • XML3D provides more focus+context than the
    other two tools, which may aid performance
  • Appeared that integration of 3D graph plus the 2D
    list view was important
  • Maybe new visualization techniques like this work
    best when coupled with more traditional displays

28
Space-Filling Hierarchy Views
  • Compare Treemap and Sunburst with users
    performing typical file/directory-related tasks
  • Evaluate task performance on both correctness and
    time

Stasko, Catrambone, Guzdial and McDonald '00
29
Tools Compared
Treemap
SunBurst
30
Hierarchies Used
  • Four in total
  • Used sample files and directories from our own
    systems (better than random)

Small Hierarchy (500 files): A, B
Large Hierarchy (3000 files): A, B
31
Methodology
  • 60 participants
  • Participant only works with a small or large
    hierarchy in a session
  • Training at start to learn tool
  • Vary order across participants

Order conditions: SB A, TM B / TM A, SB B / SB B, TM A / TM B, SB A
32 participants on small hierarchies, 28 on large hierarchies
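A minimal sketch of how participants could be assigned to the four order conditions listed above. The round-robin scheme and participant IDs are assumptions for illustration, not the study's actual procedure.

```python
from itertools import cycle

# The four order conditions from the slide (SB = SunBurst, TM = Treemap;
# A and B are the two hierarchies of a given size).
orders = [("SB-A", "TM-B"), ("TM-A", "SB-B"), ("SB-B", "TM-A"), ("TM-B", "SB-A")]

def assign_orders(n_participants, orders):
    """Round-robin assignment so each order condition is used equally often."""
    return {f"P{i+1:02d}": order
            for i, order in zip(range(n_participants), cycle(orders))}

# 32 participants on the small hierarchies, 28 on the large ones (per the slide).
small = assign_orders(32, orders)
large = assign_orders(28, orders)
print(small["P01"], large["P01"])
```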
32
Tasks
  • Identification (naming or pointing out) of a
    file based on size, specifically, the
    largest and second largest files (Questions 1-2)
  • Identification of a directory based on size,
    specifically, the largest (Q3)
  • Location (pointing out) of a file, given the
    entire path and name (Q4-7)
  • Location of a file, given only the file name
    (Q8-9)
  • Identification of the deepest subdirectory
    (Q10)
  • Identification of a directory containing files
    of a particular type (Q11)
  • Identification of a file based on type and size,
    specifically, the largest file of a
    particular type (Q12)
  • Comparison of two files by size (Q13)
  • Location of two duplicated directory structures
    (Q14)
  • Comparison of two directories by size (Q15)
  • Comparison of two directories by number of files
    contained (Q16)

33
Hypothesis
  • Treemap will be better for comparing file sizes
  • Uses more of the area
  • Sunburst would be better for searching files and
    understanding the structure
  • More explicit depiction of structure
  • Sunburst would be preferred overall

34
Try It Out
  • Conduct a couple example sessions

35
Small Hierarchy
Correct task completions (out of 16 possible)
36
Large Hierarchy
Correct task completions (out of 16 possible)
37
Performance Results
  • Ordering effect for Treemap on large hierarchies
  • Participants did better after seeing SB first
  • Performance was relatively mixed, trends favored
    Sunburst, but not clear-cut
  • Oodles of data!

38
Subjective Preferences
  • Subjective preference: SB (51), TM (9), unsure (1)
  • People felt that TM was better for size tasks
    (not borne out by data)
  • People felt that SB was better for determining
    which directories were inside others
  • Identified it as being better for structure

39
Strategies
  • How a person searched for files etc. mattered
  • Jump out to total view, start looking
  • Go level by level

40
Summary
  • Why do evaluation of InfoVis systems?
  • We need to be sure that new techniques are really
    better than old ones
  • We need to know the strengths and weaknesses of
    each tool, to know when to use which tool

41
Challenges
  • There are no standard benchmark tests or
    methodologies to help guide researchers
  • Moreover, there's simply no one correct way to
    evaluate
  • Defining the tasks is crucial
  • Would be nice to have a good task taxonomy
  • Data sets used might influence results
  • What about individual differences?
  • Can you measure abilities (cognitive, visual,
    etc.) of participants?

42
References
  • All referred-to papers
  • Martin and Mirchandani F'99 slides