Title: EVALUATING VISUALIZATIONS: USING A TAXONOMIC GUIDE
1EVALUATING VISUALIZATIONS USING A
TAXONOMIC GUIDE
- By
- E. MORSE, M. LEWIS, K. A. OLSEN
- PRESENTERS
- CHANNA P. WITANA
- CALVIN OR
2CONTENT
- Introduction
- Visual and domain tasks
- Methodology
- Tasks
- Results
- Discussion
- Conclusion
3INTRODUCTION
- Previously published papers
- Morse, E., and Lewis, M. (1997). Why information retrieval visualizations sometimes fail. Proceedings of the 1997 IEEE International Conference on Systems, Man, and Cybernetics, Oct. 12-15, Orlando, FL.
- Morse, E., Lewis, M., Korfhage, R., and Olsen, K. (1998). Evaluation of text, numeric and graphical presentations for information retrieval interfaces: User preference and task performance measures. Proceedings of the 1998 IEEE International Conference on Systems, Man, and Cybernetics, Oct. 12-14, San Diego, CA, 1026-1031.
4Information Retrieval Visualization Systems
- Bead (Chalmers, 1996)
- InfoCrystal (Spoerri, 1993)
- BIRD
- GUIDO
- VIBE
- These have been developed as visual information
exploration tools to aid in retrieval tasks.
5In TILE BARS (Hearst, 1995)
- Paragraphs on X-axis
- Query items on Y-axis
- Each query term tile is shaded according to how well the paragraph matches the query term.
- By glancing at the TileBar, a user can see which query terms match, the most relevant sections, and the distribution and coincidence of topics throughout the document.
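The shading idea above can be sketched in a few lines. This is a minimal illustration, not the TileBars implementation: it assumes that "how well the paragraph matches" is approximated by raw term frequency scaled by the largest count for that term, and the names `tile_shades` and `render` are hypothetical.

```python
from typing import List


def tile_shades(paragraphs: List[str], query_terms: List[str], levels: int = 4) -> List[List[int]]:
    """Return one row per query term and one column per paragraph.

    Each cell is a shade from 0 (term absent) to `levels` (strongest match).
    Assumption: match quality is raw term frequency, scaled per term.
    """
    grid = []
    for term in query_terms:
        # Count whole-word occurrences of the term in each paragraph.
        counts = [p.lower().split().count(term.lower()) for p in paragraphs]
        peak = max(counts) if counts else 0
        # Integer ceiling of count * levels / peak, so any match gets shade >= 1.
        row = [0 if peak == 0 else (c * levels + peak - 1) // peak for c in counts]
        grid.append(row)
    return grid


def render(grid: List[List[int]], palette: str = " .:*#") -> List[str]:
    """Turn shade levels into one text row per query term (darker = more hits)."""
    return ["".join(palette[min(cell, len(palette) - 1)] for cell in row) for row in grid]
```

Reading a rendered row left to right shows where in the document a term is concentrated. Reading down a column shows which terms coincide in one paragraph.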
6In VIBE
- VIBE represents query terms as moveable circles
with documents as variously sized rectangles
suspended between them
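One way to read "suspended between them" is as a weighted average: each document sits closer to the query terms it scores higher on. The sketch below assumes that reading. The function name `vibe_position` and the score values are illustrative, and the slide does not say how rectangle size is chosen, so size is left out.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def vibe_position(term_positions: Dict[str, Point], term_scores: Dict[str, float]) -> Point:
    """Place a document at the score-weighted average of the term circles.

    Assumption: a document's pull toward a term is proportional to its score
    for that term; terms missing from `term_scores` count as zero.
    """
    total = sum(term_scores.get(term, 0.0) for term in term_positions)
    if total == 0:
        raise ValueError("document matches none of the displayed query terms")
    x = sum(term_scores.get(t, 0.0) * pos[0] for t, pos in term_positions.items()) / total
    y = sum(term_scores.get(t, 0.0) * pos[1] for t, pos in term_positions.items()) / total
    return (x, y)
```

With two terms, a document scoring 1 on the first and 3 on the second lands three quarters of the way toward the second circle. Moving a circle changes every document position computed from it.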
7VISUAL AND DOMAIN TASKS
- Basic Forms
- Map Systems
- Dimensions Reference Point Systems
- Visualization Types
8Task Classification of WEHREND & LEWIS
- Locate
- Identify
- Distinguish
- Categorize
- Cluster
- Distribution
- Rank
- Compare between relations
- Associate
- Correlate
9ZHOU & FEINER Visual Task Taxonomy
10METHODOLOGY
- Dependent Variables
- Number of correct answers.
- Time to completion of a task set.
- Independent Variables
- Display Type
- Order of Presentation
- Individual Task
- Scenario Difficulty
- 195 subjects took part in the study over the web
- Each subject was randomly given either the 2-term or the 3-term test
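The design above can be sketched as a small assignment routine. This is only an illustration under stated assumptions: the four display names (word, icon, table, spring) come from the results slides, while `assign_subject`, `TrialRecord`, and the field names are hypothetical and not taken from the study software.

```python
import random
from dataclasses import dataclass
from typing import Dict, List

DISPLAYS = ["word", "icon", "table", "spring"]  # display types named in the results
STUDIES = ["2-term", "3-term"]                  # between-subjects factor


@dataclass
class TrialRecord:
    """One subject's result for one display (the two dependent variables)."""
    display: str
    position: int            # 1 = seen first in the series
    correct_answers: int
    completion_time_s: float


def assign_subject(rng: random.Random) -> Dict[str, object]:
    """Pick a study at random and shuffle the order of the displays."""
    order: List[str] = rng.sample(DISPLAYS, k=len(DISPLAYS))
    return {"study": rng.choice(STUDIES), "display_order": order}
```

Passing in a seeded `random.Random` makes an assignment reproducible, which helps when checking the order-of-presentation analysis later.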
11PROTOTYPE
12Generating Experimental Tasks
- Sample as broadly as possible rather than deeply
- Select tasks whose parameter lists varied
significantly
132-Term Test
2.1 Are there more documents that contain ONLY the term Romania or ONLY the term Czechoslovakia?
2.2 Which is the most frequent key term in this set of documents?
    A. Oil
    B. York
2.3 One of the documents is unlike any of the others. Can you identify it? Place the document number in the text box.
2.4 Rank documents A, B, and C with respect to the amount of the term Soviet that they contain.
2.5 Which of the following documents are most similar with respect to the relative amount of the key terms?
2.6 Which of the following statements is true?
    A. There are no documents that contain roughly equal amounts of the two terms.
    B. If a document talks about Oil then it also talks about Texas.
    C. Texas and Oil are not very highly related.
    D. A and C
    E. All of the above
2.7 Location
143-Term Test
3-1. Are there more documents that contain ONLY the term earthquake or ONLY the term California or ONLY the term death?
3-2. Which is the most frequent key term in this set of documents?
    A. Vatican
    B. Embassy
    C. Noriega
3-3. One of the documents is unlike any of the others. Can you identify it? Place the document number in the text box.
3-4. Rank documents A, B, and C with respect to the amount of the term Company that they contain.
3-5. Which of the following documents are most similar with respect to the relative amount of the key terms?
3-6. Which of the following statements is true?
    A. At least one document contains all three terms.
    B. At least one document contains the terms Arab and bomb.
    C. Vatican and Arab are not very highly related.
    D. B and C
    E. All of the above.
3-7. Location
15Results: subjects
- 1. Subjects
- Variables compared: gender, current educational level, native language.
- No significant differences between the studies for any of these variables.
- Mean age in the 2- and 3-term studies was 23.2 and 23.6 years.
- The skill levels of subjects in the 2- and 3-term groups did not differ significantly.
17Results: time to completion
- 2. For the 2-term study
- Significant differences among the display types with respect to completion time (p < 0.001).
- Using the spring display as the pivot case, all of the other display types took significantly longer to complete the task.
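A pivot-case comparison like the one above can be sketched as a set of paired contrasts, each display against the pivot. The slides do not state which contrast statistic was used, so the paired t statistic here is an assumption, and `paired_t` and `contrasts_against_pivot` are hypothetical names.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Dict, List


def paired_t(a: List[float], b: List[float]) -> float:
    """Paired t statistic for per-subject times a[i] vs b[i].

    Positive values mean condition `a` took longer than condition `b`.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need equal-length samples with at least two subjects")
    diffs = [x - y for x, y in zip(a, b)]
    spread = stdev(diffs)
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (spread / sqrt(len(diffs)))


def contrasts_against_pivot(times: Dict[str, List[float]], pivot: str = "spring") -> Dict[str, float]:
    """Compare every display with the pivot display, one contrast per display."""
    return {
        name: paired_t(values, times[pivot])
        for name, values in times.items()
        if name != pivot
    }
```

Each list holds one completion time per subject, in the same subject order for every display. Turning a t value into a p-value needs the t distribution with n - 1 degrees of freedom, which is left to a statistics package.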
18Results: time to completion
- 2. For the 3-term study
- The ANOVA shows that the four displays were significantly different (p < 0.001).
- Using spring as the pivot case, its completion time is significantly different from that of each of the other displays.
[Table: Within-subjects contrasts for 3-term display]
19Results: time to completion
- 2. Analysis by pair-wise contrasts
- The word and table displays were roughly equivalent in speed of performance.
- The icon display was faster.
- The spring display was fastest.
22Results: time to completion
- 2. Comparison across study types (2- and 3-term)
- Between-subjects factor.
- For the word, icon, and table displays, subjects required more time to complete the tasks in the 3-term conditions than in the corresponding 2-term conditions.
- For the spring display, the difference did not reach significance (p = 0.086).
[Figure: Effect of display type on time to complete task set, 2-term vs. 3-term]
25Results: correctness of answers
- 3. Correctness of answers
- Second method of assessing performance.
- The word display shows a lower number of correct answers than the other displays (pair-wise comparisons all p < 0.001).
- No significant differences in number of correct answers between the 2-term and 3-term studies.
26Results: order of presentation
- 4. Order effect (time performance)
- The order of presentation of the display types was randomized.
- Performance was poorer when a display was presented first in the series.
- Completion time decreased progressively over the subsequent trials.
[Figure panels: 2-term study; 3-term study]
28Results: order of presentation
- 4. Order effect (correctness of answers)
- There was no significant effect of the presentation order on performance as measured by the correctness of answers.
30Results: performance with respect to task types
- 5. Performance with respect to task types
- Associate, identify, and rank tasks were performed in very short time periods and were associated with a very high fraction of correct answers.
- Cluster, locate, and some of the compare tasks took significantly longer to perform and had a high fraction of errors.
34Results: preferences
- 6. Preferences (for both the 2- and 3-term studies)
- Analysis showed no relationship between completion time and preferences.
- However, there was a correlation between correctness and preferences.
- In the non-parametric analysis, there was no correlation between the position in which any display was seen and the ranking assigned to it by the subjects.
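A non-parametric check of position against preference ranking can be sketched with Spearman's rank correlation. The slides do not name the statistic that was used, so Spearman is an assumption here, and `average_ranks` and `spearman` are illustrative names.

```python
from math import sqrt
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least two items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (sx * sy)
```

Here `x` would hold the position (1 to 4) at which a subject saw a display and `y` the preference rank that subject gave it. A value near zero across subjects would match the "no correlation" finding reported above.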
35Evaluating visualizations using a taxonomic guide
- The word and text displays were always associated with poor time performance (preliminary studies reported earlier).
- The spring display is superior in producing quick responses.
- A visual taxonomy promises to be a useful guide for developing visual interfaces in general and IR interfaces in particular.
36Evaluating visualizations using a taxonomic guide
- Following a back-to-basics strategy, the visualization techniques themselves were tested, not the systems.
- The studies show that the spring and icon displays can provide an efficient and effective way to present information.
- The way the questions were asked could be redesigned to improve internal validity.
- ---- END ----
37Q & A
40Results: order of presentation
- 4. Order effect (time performance)
- Statistical analysis shows that the first point was different from the others.
- However, the subsequent presentations were not different from each other.
[Figure panels: 2-term study; 3-term study]
41Results: order of presentation
- 4. Order effect (time performance)
- The slopes of the lines are initially steeper.
- The curve for the spring display appears flatter than the other curves.
[Figure panels: 2-term study; 3-term study]
42Results: order of presentation
- 4. Order effect (correctness of answers)
- There was no significant effect of the presentation order on performance as measured by the correctness of answers.
- The spring display is the only display that is not influenced by the increased complexity of the 3-term conditions.
47Results: performance with respect to task types
- For paired contrasts, using the first question (compare) as the pivot group, both performance measures (completion time and correct answers) showed a significant difference for each pair of values, EXCEPT
- 1. for the Distinguish question for time, and
- 2. for the Rank question for correctness.
[Figure: Correctness]