Title: When/How/Why to use Grouping/Categorizing/Clustering in Search Interfaces
1 When/How/Why to use Grouping/Categorizing/Clustering in Search Interfaces
Marti Hearst January 21, 2005
2 Main Points
- Grouping search results is desirable
- However, getting good groups is difficult
- Furthermore, incorporation of groups into interfaces has not been done well
- Good news: improvements are happening
3 Talk Outline
- Definition of categories and clusters
- Studies showing failure of clustering in interfaces
- New developments in results grouping
4 The Need to Group
- Interviews with lay users often reveal a desire for better organization of retrieval results
- Useful for suggesting where to look next
- People prefer links over generating search terms
- But only when the links are for what they want
- Three main approaches for text and images
- Group items according to pre-defined categories
- Group items into automatically-created clusters
- Group items according to common keywords (new!)
Ojakaar and Spool, Users Continue After Category Links, UIETips Newsletter, http://world.std.com/uieweb/Articles/, 2001
5 Categories
- Human-created
- But often automatically assigned to items
- Arranged in hierarchy, network, or facets
- Can assign multiple categories to items
- Or place items within categories
- Usually restricted to a fixed set
- So help reduce the space of concepts
- Intended to be readily understandable
- To those who know the underlying domain
- Provide a novice with a conceptual structure
- There are many already made up!
- However, until recently, their use in interfaces has
- Been under-investigated
- Not met their promise
6 Clustering
- The art of finding groups in data
- Kaufman and Rousseeuw
- Groups are formed according to associations and commonalities among the data's features.
- There are dozens of algorithms, more all the time
- Most need a way of determining similarity or difference between a pair of items
- In text clustering, documents are usually represented as a vector of weighted features, which are some transformation on the words
- Similarity between documents is a weighted measure of feature overlap
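One common way to make the last two bullets concrete is TF-IDF weighting with cosine similarity. The sketch below is only an illustration under that assumption; the talk does not say which weighting or similarity measure was used, and the function names (`tokenize`, `tfidf_vectors`, `cosine`) are made up for this example.

```python
import math
from collections import Counter

def tokenize(text):
    # Crude tokenizer (an assumption): lowercase, whitespace split, letters only.
    return [w for w in text.lower().split() if w.isalpha()]

def tfidf_vectors(docs):
    """Return one sparse {word: weight} vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each word occurs.
    df = Counter(w for toks in tokenized for w in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Weight = term frequency * (log inverse document frequency + 1).
        vectors.append({w: c * (math.log(n / df[w]) + 1.0) for w, c in tf.items()})
    return vectors

def cosine(a, b):
    """Weighted feature overlap, normalized by vector length."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

docs = [
    "breast cancer surgery",
    "breast cancer radiation",
    "jaguar car history",
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # shared words give higher similarity
```

Documents that share no words get similarity 0, which is one reason very heterogeneous collections are hard to cluster on word overlap alone.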
7 Clustering
- Potential benefits
- Find the main themes in a set of documents
- Potentially useful if the user wants a summary of the main themes in the subcollection
- Potentially harmful if the user is interested in less dominant themes
- More flexible than pre-defined categories
- There may be important themes that have not been anticipated
- Disambiguate ambiguous terms
- ACL
- Clustering retrieved documents tends to group those relevant to a complex query together
Hearst, Pedersen, Revisiting the Cluster Hypothesis, SIGIR 96
8 Scatter/Gather Clustering
- Developed at PARC in the late 80s/early 90s
- Top-down approach
- Start with k seeds (documents) to represent k clusters
- Each document assigned to the cluster with the most similar seeds
- To choose the seeds
- Cluster in a bottom-up manner
- Hierarchical agglomerative clustering
- Start with n documents, compare all by pairwise similarity, combine the two most similar documents to make a cluster
- Now compare both clusters and individual documents to find the most similar pair to combine
- Continue until k clusters remain
- Use the centroid of each of these as seeds
- Centroid: average of the weighted vectors
- Can recluster a cluster to produce a hierarchy of clusters
Pedersen, Cutting, Karger, Tukey, Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections, SIGIR 1992
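The two phases on this slide can be sketched roughly as follows. This is a simplified illustration, not the published Scatter/Gather implementation: it runs the bottom-up step on all documents, compares groups by the cosine of their centroids (the slide does not specify a linkage rule), and all function names are hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Average of the weighted vectors in one group."""
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}

def choose_seeds(vectors, k):
    """Bottom-up step: merge the two most similar groups until k remain."""
    groups = [[v] for v in vectors]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                s = cosine(centroid(groups[i]), centroid(groups[j]))
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]
        del groups[j]
    # The centroid of each remaining group becomes a seed.
    return [centroid(g) for g in groups]

def assign(vectors, seeds):
    """Top-down step: each document joins the cluster of its most similar seed."""
    clusters = [[] for _ in seeds]
    for idx, v in enumerate(vectors):
        best = max(range(len(seeds)), key=lambda c: cosine(v, seeds[c]))
        clusters[best].append(idx)
    return clusters

# Toy weighted vectors (made-up weights) for four documents.
docs = [
    {"mastectomy": 1.0, "prosthesis": 1.0},
    {"prosthesis": 1.0, "reconstruction": 1.0},
    {"radiation": 1.0, "dosage": 1.0},
    {"radiation": 1.0, "implant": 1.0},
]
print(assign(docs, choose_seeds(docs, 2)))
```

Reclustering a cluster, as in the last bullet, would amount to calling the same two functions again on the documents of one cluster.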
9 Clustering Example: Medical Text
- Query: mastectomy on a breast cancer collection
- 250 documents retrieved
- Summary of cluster themes (subjective)
- prophylactic mastectomy (preventative)
- prostheses and reconstruction
- conservative vs radical surgery
- side effects of surgery
- psychological effects of surgery
- The first two clusters found themes for which there was no corresponding MeSH category
Hearst, The Use of Categories and Clusters for Organizing Retrieval Results, in Natural Language Information Retrieval, Kluwer, 1999
10 A Clustering Failure
- Query: implant and prosthesis
- Four clusters returned
- use of implants to administer radiation dosages
- complications resulting from breast implants
- other issues surrounding breast implants
- other kinds of prostheses
- Reclustering clusters 2 and 3 does not find cohesive subgroups
- An examination of the documents indicates that a valid subdivision was possible
- type of surgical procedure
- risk factors
- This seems to happen when there are too many features in common
- Perhaps a better clustering algorithm can help in this case
11 Clustering Interface Problems
- Big problem
- Clusters used primarily as part of a visualization
- This just doesn't work
- Every usability study says so
- Lots of dots scattered about the screen is meaningless to users
- There is no inherent spatial relationship among the documents
- Need text to understand content
- Another big problem
- Clustering images according to an approximation of visual similarity
- This just doesn't work
- What limited studies have been done say so
- Instead group according to textual categories
12 Visualizing Clustering Results
- Use clustering to map the entire huge multidimensional document space into a huge number of small clusters.
- Use dimension reduction and then project these onto a 2D/3D graphical representation
13 Clustering Multi-Dimensional Document Space (image from Wise et al. 95)
14 Clustering Multi-Dimensional Document Space (image from Wise et al. 95)
15 Kohonen Feature Maps on Text (from Chen et al., JASIS 49(7))
16 Is it useful?
- 4 Clustering Visualization Usability Studies
17 Clustering for Search Study 1
- This study compared
- a system with 2D graphical clusters
- a system with 3D graphical clusters
- a system that shows textual clusters
- Novice users
- Only textual clusters were helpful (and they were difficult to use well)
Kleiboemer, Lazear, and Pedersen. Tailoring a retrieval system for naive users. SDAIR 96
18 Clustering Study 2: Kohonen Feature Maps
- Comparison: Kohonen Map and Yahoo
- Task
- Window shop for interesting home page
- Repeat with other interface
- Results
- Starting with map could repeat in Yahoo (8/11)
- Starting with Yahoo unable to repeat in map (2/14)
Chen, Houston, Sewell, Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques. JASIS 49(7) 582-603 (1998)
19 Kohonen Feature Maps (Lin 92, Chen et al. 97)
20 Study 2 (cont.)
- Participants liked
- Correspondence of region size to documents
- Overview (but also wanted zoom)
- Ease of jumping from one topic to another
- Multiple routes to topics
- Use of category and subcategory labels
21 Study 2 (cont.)
- Participants wanted
- hierarchical organization
- other ordering of concepts (alphabetical)
- integration of browsing and search
- correspondence of color to meaning
- more meaningful labels
- labels at same level of abstraction
- fit more labels in the given space
- combined keyword and category search
- multiple category assignment (sports + entertain)
- (These can all be addressed with faceted hierarchical categories)
22 Clustering Study 3: NIRVE
- Each rectangle is a cluster. Larger clusters closer to the pole. Similar clusters near one another. Opening a cluster causes a projection that shows the titles.
23 Study 3
- This study compared
- 3D graphical clusters
- 2D graphical clusters
- textual clusters
- 15 participants, between-subject design
- Tasks
- Locate a particular document
- Locate and mark a particular document
- Locate a previously marked document
- Locate all clusters that discuss some topic
- List more frequently represented topics
Visualization of search results: a comparative evaluation of text, 2D, and 3D interfaces. Sebrechts, Cugini, Laskowski, Vasilakis and Miller, SIGIR 99.
24 Study 3
- Results (time to locate targets)
- Text clusters fastest
- 2D next
- 3D last
- With practice (6 sessions) 2D neared text results; 3D still slower
- Computer experts were just as fast with 3D
- Certain tasks equally fast with 2D and text
- Find particular cluster
- Find an already-marked document
- But anything involving text (e.g., find title) much faster with text.
- Spatial location rotated, so users lost context
- Helpful viz features
- Color coding (helped text too)
- Relative vertical locations
25 Clustering Study 4
- Compared several factors
- Findings
- Topic effects dominate (this is a common finding)
- Strong difference in results based on spatial ability
- No difference between librarians and other people
- No evidence of usefulness for the cluster visualization
Aspect windows, 3-D visualizations, and indirect comparisons of information retrieval systems, Swan, Allan, SIGIR 1998.
26 Summary: Visualizing for Search Using Clusters
- Huge 2D maps may be inappropriate focus for information retrieval
- cannot see what the documents are about
- space is difficult to browse for IR purposes
- (tough to visualize abstract concepts)
- Perhaps more suited for pattern discovery and gist-like overviews
27 Clustering Algorithm Problems
- Doesn't work well if data is too homogeneous or too heterogeneous
- Often is difficult to interpret quickly
- Automatically generated labels are unintuitive and occur at different levels of description
- Often the top level can be ok, but the subsequent levels are very poor
- Need a better way to handle items that fall into more than one cluster
28 How do people want to search and browse images?
- Ethnographic studies of people who use images intensely find
- Finding specific objects is easy
- Find images of the Empire State Building
- Browsing is hard
- In a usability study with architects, to our surprise we found their response to an image-browsing interface mock-up was they wanted to see more text (categories).
Elliott, A. (2001). "Flamenco Image Browser: Using Metadata to Improve Image Search During Architectural Design," in the Proceedings of CHI 2001.
29 An Alternative
- In the Flamenco project, we have shown that hierarchical faceted metadata, paired with a good interface, is highly effective for browsing image collections
- Flamenco.berkeley.edu
- (But that's a different talk)
30 Study 5: Comparing Textual Cluster Interfaces to Category Interfaces
- DynaCat system
- Decide on important question types in advance
- What are the adverse effects of drug D?
- What is the prognosis for treatment T?
- Make use of MeSH categories
- Retain only those types of categories known to be useful for this type of query.
Pratt, W., Hearst, M., and Fagan, L. A Knowledge-Based Approach to Organizing Retrieved Documents. AAAI-99
31 DynaCat Interface
32 DynaCat Study
- Design
- Three queries
- 24 cancer patients
- Compared three interfaces
- ranked list, clusters, categories
- Results
- Participants strongly preferred categories
- Participants found more answers using categories
- Participants took same amount of time with all three interfaces
33 Study 6: Categories vs. Lists
- One study found users preferred one level of categories over lists, and were faster at finding answers
- Only 13 top-level categories shown
- Secondary-level categories not very accurate
- However, the queries appeared to be somewhat set up to optimize the usefulness of the clusters
- Example
- Query word: indian
- Task: find indian motorcycles
- Query: alaska
- Task: find yachting adventures in alaska
Chen, Dumais, Bringing order to the web: Automatically categorizing search results. CHI 2000
34 What about Textual Displays of Clusters?
- Text-based clustering is more promising
- Text-based clustering on the Web
- In the early days, Excite had a mockup on about 10 documents that pretended to do Scatter/Gather (when it was called Architext)
- Quickly removed it and started providing standard search
- For a while NorthernLight had a clustering interface
- Didn't really get anywhere
- The latest entry is Vivisimo
- Has a lot of problems
- BUT there's a new development from Vivisimo called Clusty
- Seems to have much improved clustering and interface
35 An Analysis of Vivisimo
- Query: barcelona
- Query: dog pregnancy
39 An Analysis of Vivisimo
- Query: barcelona
- Hotels and Travel Guide are both at top level
- Also, Barcelona City
- But Travel Guide contains
- Hotels
- Spain, Spanish
- Not really helping to make useful distinctions
42 An Analysis of Vivisimo
- Query: pregnant dog
- What does the category "pregnant" mean here?
- Why does it have a subcategory of whelping, when there is also a main category of whelping?
- And what is the relationship to Pregnancy and Birth?
- The pages shown don't seem strongly related to one another
- How to follow up?
- There is a "find in clusters" box, but not very helpful because no hints about which words might work
43 Search within Results
44 Then along came Clusty
- Announced a few months ago
- Produced by Vivisimo
- Much better interface
- Much better clusters
50 Clusty Improvements
- Labels tend to be more at the same level of description
- Subcategories are more cautious, reflecting groups of very similar documents
- Do a better job of really showing subcategories
- Nice interface touches
- Better use of color for distinguishing
- Small icons are inviting
- Incorporation of encyclopedia results high up
- Search results are better
- (Not always: pregnant dog not much better)
- Using metasearch
- May be throwing out some docs to get more distribution in the types of results found
- Looks like they are focusing on term proximity to get more meaningful grouping
- Don't allow very many results
54 Clusty Improvements
- Doing sense disambiguation for abbreviations like ACL
- However, no good follow-up for how to make use of this
- E.g., to search on ACL (meaning comp ling) plus some other concepts
- On the other hand, using multiple terms is how most disambiguation is done now
- ACL disambiguation
- Jaguar prey
- So not clear if there is a net benefit
- Trying to approximate faceted queries
- Under Jaguar query, for history, show both history of band with history of car and video game
56 Analysis
- Is it really helping? Or are the categories now too general and overlapping?
- The main effect seems to be that the search results are better due to the metasearch and term proximity
58 More Analysis
- Reflects the frequency of topics in the data
- So no discussion of nukes in the Spain categories
- No discussion of hotels in the North Korea categories
- Is this good or bad? It depends.
59 Brand New Results!!
- Mika Kaki, Findex: Search Result Categories Help Users when Document Ranking Fails
- To appear at CHI in April
- Two innovations
- Used very simple method to create the groupings, so that it is not opaque to users
- Based on frequent keywords
- Allows docs to appear in multiple categories
- Did a naturalistic, longitudinal study of use
- Other things done correctly
- Took care to ensure good response time
- Analyzed the results in interesting ways
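The first innovation, grouping by frequent keywords with documents allowed in several groups, might look roughly like the sketch below. It is not Findex's actual algorithm: the stopword list, the `min_docs` threshold, the function name, and the choice to drop the query's own words are all assumptions made for illustration. The default of 15 categories matches the default reported later in the talk.

```python
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the study).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "for", "to", "is", "on"}

def keyword_categories(snippets, query="", max_categories=15, min_docs=2):
    """Group result snippets under their most frequent keywords.

    A snippet is listed under every frequent keyword it contains,
    so categories may overlap.
    """
    skip = STOPWORDS | set(query.lower().split())
    doc_words = [
        {w for w in s.lower().split() if w.isalpha() and w not in skip}
        for s in snippets
    ]
    # Count in how many snippets each keyword occurs.
    doc_freq = Counter(w for words in doc_words for w in words)
    # Most frequent first; ties broken alphabetically to keep output stable.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    frequent = [w for w, n in ranked if n >= min_docs][:max_categories]
    return {w: [i for i, words in enumerate(doc_words) if w in words] for w in frequent}

results = [
    "jaguar car dealer",
    "jaguar car history",
    "jaguar cat habitat",
    "history of the jaguar band",
]
print(keyword_categories(results, query="jaguar"))
```

In this toy run, result 1 appears under both "car" and "history", which is the overlapping-category behavior the slide describes. Because each group is just "results containing this word", a user can see at a glance why a result is in a group.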
62 Study Design
- 16 academics
- 8F, 8M
- No CS
- Frequent searchers
- 2 months of use
- Special Log
- 3099 queries issued
- 3232 results accessed
- Two surveys (start and end)
- Google as search engine; rank order retained
63 Key Findings (all significant)
- Category use takes almost 2 times longer
- First doc selected in 24.4 sec vs 13.7 sec
- No difference in average number of docs opened per search (1.05 vs. 1.04)
- However, when categories used, users select >1 doc in 28.6% of the queries (vs 13.6%)
- Number of searches with 0 result selections is lower when the categories are used
- Median position of selected doc when
- Using categories: 22 (sd=38)
- Just ranking: 2 (sd=8.6)
65 Key Findings
- Category Selections
- 1915 category selections in 817 searches
- Used in 26.4% of the searches
- During the last 4 weeks of use, the proportion of searches using categories stayed above the average (27-39%)
- When categories used, selected 2.3 cats on average
- Labels of selected cats used 1.9 words on average (average in general was 1.4 words)
- Out of 15 cats (default)
- First quartile at 2nd cat
- Median at 5th
- Third quartile at 9th
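As a quick sanity check, the headline ratios can be recomputed from the raw counts given in the talk. The assumption here, which the slides do not state outright, is that the 26.4% is measured against the 3099 queries listed on the study design slide; variable names are illustrative.

```python
queries_total = 3099             # queries issued (study design slide)
searches_with_categories = 817   # searches in which categories were used
category_selections = 1915       # category selections across those searches
t_with, t_without = 24.4, 13.7   # seconds until first doc selected

share = searches_with_categories / queries_total
per_search = category_selections / searches_with_categories
slowdown = t_with / t_without

print(f"{share:.1%}")       # 26.4%
print(f"{per_search:.1f}")  # 2.3
print(f"{slowdown:.2f}")    # 1.78
```

The figures are consistent with the slides: roughly a quarter of searches used categories, about 2.3 categories were selected per such search, and the first selection took about 1.8 times as long ("almost 2 times longer").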
66 Survey Results
- Qualitative views improved over time
- Realization that categories useful only some of the time
- Freeform responses indicate that categories useful when queries vague, broad or ambiguous
- Second survey indicated that people felt that their search habits began to change
- Consider query formulation less than before (27%)
- Use less precise search terms (45%)
- Use less time to evaluate results (36%)
- Use categories for evaluating results (82%)
68 Conclusions from Kaki Study
- Simplicity of category assignment made groupings understandable
- (my view, not stated by them)
- Keyword-based Categories
- Are beneficial when result ranking fails
- Find results lower in the ranking
- Reduce empty results
- May make it easier to access multiple results
- Availability changed user querying behavior
69 Summary
- Grouping search results is desirable
- Often requested by lay users
- Very positive results for category interface
- However, until recently getting good groups has been difficult
- Two main approaches
- Predefined category sets: too hard to get, don't reflect data
- Automatically created clusters: too hard to understand
- An alternative
- Frequent keywords, overlapping categories
- Findex, and Clusty
- Finally, a believable, well-done study of category use for search results reveals some insight!
- Not always useful, but not harmful if understandable (my assertion) and fast
- Useful in the situations we have surmised
- Interesting result: people change behavior.
70 More Recent Attempts
- Analyzing retrieval results
- KartOO: http://www.kartoo.com/
- Grokker: http://www.groxis.com/service/grok
75 References
- Chen, Houston, Sewell, and Schatz, JASIS 49(7)
- Chen and Yu, Empirical studies of information visualization: a meta-analysis, IJHCS 53(5), 2000
- Dumais, Cutrell, Cadiz, Jancke, Sarin and Robbins, Stuff I've Seen: A system for personal information retrieval and re-use. SIGIR 2003.
- Hearst, English, Sinha, Swearingen, Yee. Finding the Flow in Web Site Search, CACM 45(9), 2002.
- Hearst, User Interfaces and Visualization, Chapter 10 of Modern Information Retrieval, Baeza-Yates and Ribeiro-Neto (Eds), Addison-Wesley 1999.
- Johnson, Manning, Hagen, and Dorsey. Specialize Your Site's Search. Forrester Research, (Dec. 2001), Cambridge, MA
76 References
- Sebrechts, Cugini, Laskowski, Vasilakis and Miller, Visualization of search results: a comparative evaluation of text, 2D, and 3D interfaces, SIGIR 99.
- Swan and Allan, Aspect windows, 3-D visualizations, and indirect comparisons of information retrieval systems, SIGIR 1998.
- Yee, Swearingen, Li, Hearst, Faceted Metadata for Image Search and Browsing, Proceedings of CHI 2003