The Failure of Clustering in Search Interfaces

Description:

The Failure of Clustering in Search Interfaces, or When/How/Why Clustering Can Be Successful in Search Interfaces. Marti Hearst, UC Berkeley, Oct 6, 2004.

1
The Failure of Clustering in Search Interfaces
or: When/How/Why Clustering Can Be Successful
in Search Interfaces
Marti Hearst, UC Berkeley, Oct 6, 2004
http://www.sims.berkeley.edu/hearst
2
Main Points
  • Grouping search results is desirable
  • However, getting good groups is difficult
  • Furthermore, incorporation of groups into
    interfaces has not been done well
  • Good news: improvements are happening

3
Talk Outline
  • Why search interfaces are difficult to design
  • Definition of categories and clusters
  • Studies showing failure of clustering in
    interfaces
  • A new development in clustering in web search
  • How to remedy these problems

4
Clustering Interface Problems
  • Big problem
  • Clusters used primarily as part of a
    visualization
  • This just doesn't work
  • Every usability study says so
  • Lots of dots scattered about the screen is
    meaningless to users
  • There is no inherent spatial relationship among
    the documents
  • Need text to understand content
  • Another big problem
  • Clustering images according to an approximation
    of visual similarity
  • This just doesn't work
  • What limited studies have been done say so
  • Instead, group according to textual categories

5
Search interfaces are difficult to design
  • Content and queries are hugely varying
  • The scope of what people search for is all of
    human knowledge and experience (!)
  • Interfaces must accommodate human differences in
  • Knowledge / life experience
  • Cultural background and expectations
  • Reading / scanning ability and style
  • Methods of looking for things (pilers vs. filers)

6
Abstractions Are Difficult to Represent
  • Text describes abstract concepts
  • Difficult to show the contents of text in a
    visual or compact manner
  • Exercise:
  • How would you show the preamble of the US
    Constitution visually?
  • How would you show the contents of Joyce's
    Ulysses visually? How would you distinguish it
    from Homer's The Odyssey or McCourt's Angela's
    Ashes?
  • The point: it is difficult to show text without
    using text

7
Lack of Technical Understanding
  • Most people don't understand the underlying
    methods by which search engines work.
  • Without appropriate explanations, most of 14
    people had strong misconceptions about:
  • ANDing vs. ORing of search terms
  • Some assumed the ANDing search engine indexed a
    smaller collection; most had no explanation at
    all
  • For empty results for the query "to be or not to be"
  • 9 of 14 could not give an explanation that
    remotely resembled stop word removal
  • For term order variation ("boat fire" vs. "fire
    boat")
  • Only 5 out of 14 expected different results

Muramatsu & Pratt, Transparent Queries:
Investigating Users' Mental Models of Search
Engines, SIGIR 2001.
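
The three behaviors that confused participants (ANDing vs. ORing, stop word removal, and term order) can be illustrated with a toy search function. This is only a sketch: the documents, the stop word list, and the function names are made up for illustration and do not come from the cited study or from any real search engine.

```python
# Toy illustration only: documents, stop words, and function names are
# hypothetical and not taken from the cited study.
STOP_WORDS = {"to", "be", "or", "not", "the", "a", "of"}

DOCS = {
    1: "fire boat crews train in the harbor",
    2: "boat fire destroys marina",
    3: "the fire department buys a new truck",
}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def search_and(query):
    """Ids of documents containing every (non-stop) query term."""
    terms = tokenize(query)
    if not terms:
        return []  # every query term was a stop word, so nothing is left to match
    return [i for i, text in DOCS.items()
            if all(t in tokenize(text) for t in terms)]

def search_or(query):
    """Ids of documents containing at least one (non-stop) query term."""
    terms = tokenize(query)
    return [i for i, text in DOCS.items()
            if any(t in tokenize(text) for t in terms)]

def search_phrase(query):
    """Order-sensitive match: query terms must appear adjacent and in order."""
    terms = tokenize(query)
    if not terms:
        return []
    n = len(terms)
    hits = []
    for i, text in DOCS.items():
        words = tokenize(text)
        if any(words[j:j + n] == terms for j in range(len(words) - n + 1)):
            hits.append(i)
    return hits
```

Under these assumptions, ANDing returns fewer documents than ORing over the same collection, the query "to be or not to be" comes back empty because every term is a stop word, and "boat fire" and "fire boat" differ only when the engine is order-sensitive.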
8
Other Issues
  • Vocabulary Disconnect
  • If you ask a set of people to describe a set of
    things there is little overlap in the results.
  • If one person assigns a name, the probability of
    it NOT matching with another person's is about
    80%
  • It is difficult to represent content compactly
  • Small details matter
  • People are reluctant to change search interfaces

Furnas, et al., The Vocabulary Problem in
Human-System Communication. Commun. ACM 30(11):
964-971 (1987)
9
The Need to Group
  • Interviews with lay users often reveal a desire
    for better organization of retrieval results
  • Useful for suggesting where to look next
  • People prefer links over generating search terms
  • But only when the links are for what they want
  • Two main approaches for text and images
  • Group items according to pre-defined categories
  • Group items into automatically-created clusters

Ojakaar and Spool, Users Continue After Category
Links, UIETips Newsletter,
http://world.std.com/uieweb/Articles/, 2001
10
Categories
  • Human-created
  • But often automatically assigned to items
  • Arranged in hierarchy, network, or facets
  • Can assign multiple categories to items
  • Or place items within categories
  • Usually restricted to a fixed set
  • So help reduce the space of concepts
  • Intended to be readily understandable
  • To those who know the underlying domain
  • Provide a novice with a conceptual structure
  • There are many already made up!
  • However, until recently, their use in interfaces
    has:
  • Been under-investigated
  • Not met its promise

11
Category System Examples
13
Category System Examples
eat.epicurious.com
15
Example of Faceted Metadata: Medical Subject
Headings (MeSH)
  • Facets
  • 1. Anatomy [A]
  • 2. Organisms [B]
  • 3. Diseases [C]
  • 4. Chemicals and Drugs [D]
  • 5. Analytical, Diagnostic and Therapeutic
    Techniques and Equipment [E]
  • 6. Psychiatry and Psychology [F]
  • 7. Biological Sciences [G]
  • 8. Physical Sciences [H]
  • 9. Anthropology, Education, Sociology and
    Social Phenomena [I]
  • 10. Technology and Food and Beverages [J]
  • 11. Humanities [K]
  • 12. Information Science [L]
  • 13. Persons [M]
  • 14. Health Care [N]
  • 15. Geographic Locations [Z]

16
Each Facet Has Hierarchy
  • 1. Anatomy [A]
  • Body Regions [A01]
  • Musculoskeletal System [A02]
  • Digestive System [A03]
  • Respiratory System [A04]
  • Urogenital System [A05]
  • 2. [B]
  • 3. [C]
  • 4. [D]
  • 5. [E]
  • 6. [F]
  • 7. [G]
  • 8. Physical Sciences [H]
  • 9. [I]
  • 10. [J]
  • 11. [K]
  • 12. [L]
  • 13. [M]
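
One way to picture a facet with a hierarchy is as a set of tree codes in which a child's code extends its parent's. The sketch below uses only the handful of codes shown on these slides. The item names, their category assignments, and the helper functions are hypothetical.

```python
# A few codes from the slides; a real category system is far larger.
MESH = {
    "A": "Anatomy",
    "A01": "Body Regions",
    "A02": "Musculoskeletal System",
    "A03": "Digestive System",
    "A04": "Respiratory System",
    "A05": "Urogenital System",
    "C": "Diseases",
    "H": "Physical Sciences",
}

# Hypothetical items; an item may be assigned categories from several facets.
ITEMS = {
    "doc1": ["A04", "C"],
    "doc2": ["A01"],
    "doc3": ["H"],
}

def facet_of(code):
    """Name of the top-level facet a code belongs to (its first letter)."""
    return MESH[code[0]]

def descendants(code):
    """All known codes below the given code in the hierarchy."""
    return sorted(c for c in MESH if c.startswith(code) and c != code)

def items_under(code):
    """Items assigned the given category or anything beneath it."""
    return sorted(item for item, codes in ITEMS.items()
                  if any(c.startswith(code) for c in codes))
```

Here `items_under("A")` gathers everything filed anywhere under Anatomy, while `doc1` also remains reachable through the Diseases facet, which is the multiple-assignment property the earlier Categories slide mentions.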

17
Clustering
  • "The art of finding groups in data"
  • Kaufman and Rousseeuw
  • Groups are formed according to associations and
    commonalities among the data's features.
  • There are dozens of algorithms, more all the time
  • Most need a way of determining similarity or
    difference between a pair of items
  • In text clustering, documents are usually represented
    as a vector of weighted features which are some
    transformation on the words
  • Similarity between documents is a weighted
    measure of feature overlap
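
As a concrete instance of "a vector of weighted features" and "a weighted measure of feature overlap", the sketch below uses TF-IDF weights and cosine similarity. Both are common choices but are assumptions here, since the slide does not name a particular weighting or similarity measure. The sample documents are invented.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Turn each document into a sparse {word: weight} vector.

    Weight = term frequency * (log(N / document frequency) + 1),
    one of several common TF-IDF variants.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(word for words in tokenized for word in set(words))
    vectors = []
    for words in tokenized:
        tf = Counter(words)
        vectors.append({w: count * (math.log(n / df[w]) + 1.0)
                        for w, count in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity: weighted overlap of shared features, normalized."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Invented sample documents for illustration.
DOCS = [
    "star galaxy telescope",
    "star film actor",
    "galaxy telescope astronomy",
]
VECS = tf_idf_vectors(DOCS)
```

With these sample documents, the first and third share two words and so score higher than the first and second, which share only "star"; the second and third share nothing and score zero.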

18
Clustering
  • Potential benefits
  • Find the main themes in a set of documents
  • Potentially useful if the user wants a summary of
    the main themes in the subcollection
  • Potentially harmful if the user is interested in
    less dominant themes
  • More flexible than pre-defined categories
  • There may be important themes that have not been
    anticipated
  • Disambiguate ambiguous terms
  • e.g., ACL
  • Clustering retrieved documents tends to group
    those relevant to a complex query together

Hearst, Pedersen, Revisiting the Cluster
Hypothesis, SIGIR '96
19
Scatter/Gather Clustering
  • Developed at PARC in the late 80s/early 90s
  • Top-down approach
  • Start with k seeds (documents) to represent k
    clusters
  • Each document assigned to the cluster with the
    most similar seed
  • To choose the seeds:
  • Cluster in a bottom-up manner
  • Hierarchical agglomerative clustering
  • Start with n documents, compare all by pairwise
    similarity, combine the two most similar
    documents to make a cluster
  • Now compare both clusters and individual
    documents to find the most similar pair to
    combine
  • Continue until k clusters remain
  • Use the centroid of each of these as seeds
  • Centroid: average of the weighted vectors
  • Can recluster a cluster to produce a hierarchy of
    clusters

Pedersen, Cutting, Karger, Tukey, Scatter/Gather:
A Cluster-based Approach to Browsing Large
Document Collections, SIGIR 1992
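
The steps on this slide can be sketched as follows: merge bottom-up until k clusters remain, take their centroids as seeds, then assign every document to its most similar seed. This is a minimal sketch and not the PARC implementation. It assumes raw word counts as features, cosine similarity, and centroid-to-centroid comparison when merging, none of which the slide specifies. All function names are hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Raw word counts as a sparse vector (an assumed, simplistic weighting)."""
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def centroid(vectors):
    """Average of the weighted vectors in a cluster."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: w / len(vectors) for t, w in total.items()}

def choose_seeds(vectors, k):
    """Bottom-up: repeatedly merge the two most similar clusters until k remain."""
    clusters = [[v] for v in vectors]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]

def scatter(docs, k):
    """Top-down step: assign each document to the most similar seed."""
    vectors = [vectorize(d) for d in docs]
    seeds = choose_seeds(vectors, k)
    groups = [[] for _ in seeds]
    for doc, vec in zip(docs, vectors):
        nearest = max(range(len(seeds)), key=lambda s: cosine(vec, seeds[s]))
        groups[nearest].append(doc)
    return groups

# Invented sample documents for illustration.
DOCS = [
    "star galaxy telescope",
    "galaxy telescope astronomy",
    "film star actor",
    "actor film review",
]
GROUPS = scatter(DOCS, 2)
```

Calling `scatter` again on one of the returned groups corresponds to reclustering a cluster to produce a hierarchy. Comparing all pairs at every merge is expensive, so a practical system would need to limit how many documents go through the bottom-up step.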
21
The Scatter/Gather Interface
22
S/G Example: query on star
  • Encyclopedia text
  • 14 sports
  • 8 symbols 47 film, tv
  • 68 film, tv (p) 7 music
  • 97 astrophysics
  • 67 astronomy(p) 12 stellar phenomena
  • 10 flora/fauna 49 galaxies, stars
  • 29 constellations
  • 7 miscellaneous
  • Clustering and re-clustering is entirely
    automated

25
S/G Example: query on star
  • Newspaper/Magazine text
  • 22 products / business
  • 41 software / computers 35 hollywood
  • 58 restaurants / food (reviews) 54
    astronomers/movies
  • 98 movies / tv (reviews) 9 film mini-reviews
  • 31 wall street / finance
  • Topics quite different from encyclopedia text

26
Two Queries: Two Clusterings
The main differences are the clusters that are
central to the query
27
Clustering Example: Medical Text
  • Query: mastectomy on a breast cancer collection
  • 250 documents retrieved
  • Summary of cluster themes (subjective)
  • prophylactic mastectomy (preventative)
  • prostheses and reconstruction
  • conservative vs radical surgery
  • side effects of surgery
  • psychological effects of surgery
  • The first two clusters found themes for which
    there was no corresponding MeSH category

Hearst, The Use of Categories and Clusters for
Organizing Retrieval Results, in Natural Language
Information Retrieval, Kluwer, 1999
28
A Clustering Failure
  • Query: implant and prosthesis
  • Four clusters returned
  • use of implants to administer radiation dosages
  • complications resulting from breast implants
  • other issues surrounding breast implants
  • other kinds of prostheses
  • Reclustering clusters 2 and 3 does not find
    cohesive subgroups
  • An examination of the documents indicates that a
    valid subdivision was possible
  • type of surgical procedure
  • risk factors
  • This seems to happen when there are too many
    features in common
  • Perhaps a better clustering algorithm can help in
    this case

29
Clustering Algorithm Problems
  • Doesn't work well if data is too homogeneous or
    too heterogeneous
  • Often difficult to interpret quickly
  • Automatically generated labels are unintuitive
    and occur at different levels of description
  • Often the top-level can be ok, but the subsequent
    levels are very poor
  • Need a better way to handle items that fall into
    more than one cluster

30
Visualizing Clustering Results
  • Use clustering to map the entire huge
    multidimensional document space into a huge
    number of small clusters.
  • Use dimension reduction and then project these
    onto a 2D/3D graphical representation

31
Clustering Multi-Dimensional Document
Space (image from Wise et al. 95)
32
Clustering Multi-Dimensional Document
Space (image from Wise et al. 95)
33
Kohonen Feature Maps on Text (from Chen et al.,
JASIS 49(7))
34
Is it useful?
  • 4 Clustering Visualization Usability Studies

35
Clustering for Search: Study 1
  • This study compared
  • a system with 2D graphical clusters
  • a system with 3D graphical clusters
  • a system that shows textual clusters
  • Novice users
  • Only textual clusters were helpful (and they were
    difficult to use well)

Kleiboemer, Lazear, and Pedersen. Tailoring a
retrieval system for naive users. SDAIR '96
36
Clustering Study 2: Kohonen Feature Maps
  • Comparison: Kohonen Map and Yahoo
  • Task
  • Window shop for interesting home page
  • Repeat with other interface
  • Results
  • Starting with map: could repeat in Yahoo (8/11)
  • Starting with Yahoo: unable to repeat in map (2/14)

Chen, Houston, Sewell, Schatz, Internet Browsing
and Searching: User Evaluations of Category Map
and Concept Space Techniques. JASIS 49(7):
582-603 (1998)
37
Kohonen Feature Maps (Lin 92, Chen et al. 97)
38
Study 2 (cont.)
  • Participants liked
  • Correspondence of region size to documents
  • Overview (but also wanted zoom)
  • Ease of jumping from one topic to another
  • Multiple routes to topics
  • Use of category and subcategory labels

Chen, Houston, Sewell, Schatz, Internet Browsing
and Searching: User Evaluations of Category Map
and Concept Space Techniques. JASIS 49(7):
582-603 (1998)
39
Study 2 (cont.)
  • Participants wanted
  • hierarchical organization
  • other ordering of concepts (alphabetical)
  • integration of browsing and search
  • correspondence of color to meaning
  • more meaningful labels
  • labels at same level of abstraction
  • fit more labels in the given space
  • combined keyword and category search
  • multiple category assignment (sports + entertainment)
  • (These can all be addressed with faceted
    hierarchical categories)

Chen, Houston, Sewell, Schatz, Internet Browsing
and Searching: User Evaluations of Category Map
and Concept Space Techniques. JASIS 49(7):
582-603 (1998)
40
Clustering Study 3: NIRVE
  • Each rectangle is a cluster. Larger clusters
    closer to the pole. Similar clusters near one
    another. Opening a cluster causes a projection
    that shows the titles.

41
Study 3
  • This study compared
  • 3D graphical clusters
  • 2D graphical clusters
  • textual clusters
  • 15 participants, between-subject design
  • Tasks
  • Locate a particular document
  • Locate and mark a particular document
  • Locate a previously marked document
  • Locate all clusters that discuss some topic
  • List more frequently represented topics

Sebrechts, Cugini, Laskowski, Vasilakis and
Miller, Visualization of search results: a
comparative evaluation of text, 2D, and 3D
interfaces, SIGIR 99.
42
Study 3
  • Results (time to locate targets)
  • Text clusters fastest
  • 2D next
  • 3D last
  • With practice (6 sessions) 2D neared text
    results; 3D still slower
  • Computer experts were just as fast with 3D
  • Certain tasks equally fast with 2D and text
  • Find particular cluster
  • Find an already-marked document
  • But anything involving text (e.g., find title)
    much faster with text.
  • Spatial location rotated, so users lost context
  • Helpful viz features
  • Color coding (helped text too)
  • Relative vertical locations

Sebrechts, Cugini, Laskowski, Vasilakis and
Miller, Visualization of search results: a
comparative evaluation of text, 2D, and 3D
interfaces, SIGIR 99.
43
Clustering Study 4
  • Compared several factors
  • Findings
  • Topic effects dominate (this is a common finding)
  • Strong difference in results based on spatial
    ability
  • No difference between librarians and other people
  • No evidence of usefulness for the cluster
    visualization

Aspect windows, 3-D visualizations, and indirect
comparisons of information retrieval systems,
Swan, Allan, SIGIR 1998.
44
Summary: Visualizing for Search Using Clusters
  • Huge 2D maps may be an inappropriate focus for
    information retrieval
  • cannot see what the documents are about
  • space is difficult to browse for IR purposes
  • (tough to visualize abstract concepts)
  • Perhaps more suited for pattern discovery and
    gist-like overviews

45
How do people want to search and browse images?
  • Ethnographic studies of people who use images
    intensely find
  • Finding specific objects is easy
  • Find images of the Empire State Building
  • Browsing is hard
  • In a usability study with architects, to our
    surprise we found their response to an
    image-browsing interface mock-up was they wanted
    to see more text (categories).

Elliott, A. (2001). "Flamenco Image Browser:
Using Metadata to Improve Image Search During
Architectural Design," in the Proceedings of CHI
2001.
46
Clustering in Image Search
  • Using Visual Content
  • Extract color, texture, shape
  • QBIC (Flickner et al. 95)
  • Blobworld (Carson et al. 99)
  • Body Plans (Forsyth & Fleck 00)
  • Piction: images + text (Srihari et al. 91, 99)
  • Two uses
  • Show a clustered similarity space
  • Show those images similar to a selected one

47
K. Rodden, Evaluating Similarity-Based
Visualisations as Interfaces for Image Browsing,
PhD thesis, 2001; K. Rodden, W. Basalaj, D.
Sinclair, and K. Wood, Does Organisation by
Similarity Assist Image Browsing?, CHI 2001
50
Image Clustering Study Results
  • Searching was faster with the random arrangement
  • Preference for the clustered arrangement was not
    overwhelmingly stronger than for random
  • 2 out of 10 participants preferred random and 3
    had no preference
  • Median satisfaction for clustered was 4.5 and for
    random was 4.0

K. Rodden, Evaluating Similarity-Based
Visualisations as Interfaces for Image Browsing,
PhD thesis, 2001; K. Rodden, W. Basalaj, D.
Sinclair, and K. Wood, Does Organisation by
Similarity Assist Image Browsing?, CHI 2001
51
An Alternative
  • In the Flamenco project, we have shown that
    hierarchical faceted metadata, paired with a good
    interface, is highly effective for browsing image
    collections
  • flamenco.berkeley.edu
  • (But that's a different talk)

52
Study 5: Comparing Textual Cluster Interfaces to
Category Interfaces
  • DynaCat system
  • Decide on important question types in advance
  • What are the adverse effects of drug D?
  • What is the prognosis for treatment T?
  • Make use of MeSH categories
  • Retain only those types of categories known to be
    useful for this type of query.

Pratt, W., Hearst, M, and Fagan, L. A
Knowledge-Based Approach to Organizing Retrieved
Documents. AAAI-99
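
The idea on this slide (fix the question types ahead of time, then keep only the kinds of categories known to help with each type) can be sketched roughly as below. This is not the DynaCat implementation. The question-type table, the documents, their category tags, and the `organize` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical knowledge base: for each question type decided on in advance,
# the kinds of categories assumed to be useful. Invented for illustration.
USEFUL_CATEGORY_KINDS = {
    "adverse_effects": {"Diseases", "Psychiatry and Psychology"},
    "prognosis": {"Health Care"},
}

# Hypothetical retrieved documents, each tagged with (kind, category label).
RETRIEVED = [
    ("doc1", [("Diseases", "Nausea"), ("Chemicals and Drugs", "Tamoxifen")]),
    ("doc2", [("Diseases", "Nausea"),
              ("Psychiatry and Psychology", "Depression")]),
    ("doc3", [("Geographic Locations", "California")]),
]

def organize(retrieved, question_type):
    """Group documents under only those categories useful for the question type."""
    useful = USEFUL_CATEGORY_KINDS[question_type]
    groups = defaultdict(list)
    for doc_id, tags in retrieved:
        for kind, label in tags:
            if kind in useful:
                groups[label].append(doc_id)
    return dict(groups)
```

For an adverse-effects question, this toy table keeps the disease and psychology labels and drops the drug and geography labels, so the groups shown are the ones likely to answer the question.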
53
DynaCat Interface
Pratt, W., Hearst, M, and Fagan, L. A
Knowledge-Based Approach to Organizing Retrieved
Documents. AAAI-99
54
DynaCat Study
  • Design
  • Three queries
  • 24 cancer patients
  • Compared three interfaces
  • ranked list, clusters, categories
  • Results
  • Participants strongly preferred categories
  • Participants found more answers using categories
  • Participants took same amount of time with all
    three interfaces

Pratt, W., Hearst, M, and Fagan, L. A
Knowledge-Based Approach to Organizing Retrieved
Documents. AAAI-99
55
Study 6: Categories vs. Lists
  • One study found users preferred one level of
    categories over lists, and were faster at finding
    answers
  • Only 13 top-level categories shown
  • Secondary-level categories not very accurate
  • However, the queries appeared to be somewhat
    set up to optimize the usefulness of the clusters
  • Example
  • Query word: indian
  • Task: find Indian motorcycles
  • Query: alaska
  • Task: find yachting adventures in Alaska

Chen, Dumais, Bringing order to the web:
Automatically categorizing search results. CHI
2000
56
What about Textual Displays of Clusters?
  • Text-based clustering is more promising
  • Text-based clustering on the Web
  • In the early days, Excite had a mockup on about
    10 documents that pretended to do Scatter/Gather
    (when it was called Architext)
  • Quickly removed it and started providing standard
    search
  • For a while NorthernLight had a clustering
    interface
  • Didn't really get anywhere
  • The latest entry is Vivisimo
  • Has a lot of problems
  • BUT there's a new development from Vivisimo
    called Clusty
  • Seems to have much improved clustering and
    interface

57
An Analysis of Vivisimo
  • Query: barcelona
  • Query: dog pregnancy

61
An Analysis of Vivisimo
  • Query: barcelona
  • Hotels and Travel Guide are both at top level
  • Also, Barcelona City
  • But Travel Guide contains
  • Hotels
  • Spain, Spanish
  • Not really helping to make useful distinctions

64
An Analysis of Vivisimo
  • Query: pregnant dog
  • What does the category pregnant mean here?
  • Why does it have a subcategory of whelping, when
    there is also a main category of whelping?
  • And what is the relationship to Pregnancy and Birth?
  • The pages shown don't seem strongly related to
    one another
  • How to follow up?
  • There is a "find in clusters" box, but not very
    helpful because no hints about which words might
    work

65
Search within Results
66
Then along came Clusty
  • Announced less than a week ago
  • Produced by Vivisimo
  • Much better interface
  • Much better clusters

72
Clusty Improvements
  • Labels tend to be more at the same level of
    description
  • Subcategories are more cautious, reflecting
    groups of very similar documents
  • Do a better job of really showing subcategories
  • Nice interface touches
  • Better use of color for distinguishing
  • Small icons are inviting
  • Incorporation of encyclopedia results high up
  • Search results are better
  • (Not always: pregnant dog not much better)
  • Using metasearch
  • May be throwing out some docs to get more
    distribution in the types of results found
  • Looks like they are focusing on term proximity to
    get more meaningful grouping
  • Don't allow very many results

76
Clusty Improvements
  • Doing sense disambiguation for abbreviations like
    ACL
  • However, no good followup for how to make use of
    this
  • E.g., to search on ACL (meaning comp ling) plus
    some other concepts
  • On the other hand, using multiple terms is how
    most disambiguation is done now
  • ACL disambiguation
  • Jaguar prey
  • So not clear if there is a net benefit
  • Trying to approximate faceted queries
  • Under Jaguar query, for history, show both
    history of band with history of car and video
    game

78
Analysis
  • Is it really helping? Or are the categories now
    too general and overlapping?
  • The main effect seems to be that the search
    results are better due to the metasearch and term
    proximity

80
More Analysis
  • Reflects the frequency of topics in the data
  • So no discussion of nukes in the Spain categories
  • No discussion of hotels in the North Korea
    categories
  • Is this good or bad? It depends.

87
More analysis
  • Adding a related term (Degas, Cezanne) brings up
    relations between the two that don't appear with
    the general term Degas alone
  • Impressionists
  • Pissarro, in particular (should be under
    impressionists)
  • Also leads to messier results

88
Summary
  • Grouping search results is desirable
  • Often requested by lay users
  • Very positive results for category interface
  • However, getting good groups is difficult
  • Two main approaches
  • Predefined category sets
  • Automatically created clusters
  • Furthermore, incorporation of groups into
    interfaces has not been done well
  • Notable Failures in Search Interfaces
  • Visualization of clusters
  • Unintuitive clusters and labels
  • Clustering of images according to visual
    attributes
  • Poor incorporation of categories into search
    interfaces (not covered)
  • Good news: improvements are happening
  • Improved clustering that takes better account of
    good display principles as seen in Clusty
  • Flamenco: Flexible search and navigation via
    faceted category hierarchies (not discussed here)

89
A Promising Direction: Combining Categories and
Clusters
  • Mehran Sahami's work on combining categories and
    clusters
  • Ray Larson's work on clustering results of
    categorization
  • Would be interesting to cluster MeSH category
    labels
  • Work using UMLS to select subsets of MeSH has
    been successful for analysis tasks

90
Conclusions
  • In order to use clustering in an interface, must
    pay attention to what makes the groupings
    intuitive
  • Much work has been too much of a "science
    project"
  • Up to now, clustering hasn't succeeded on web
    search results, but Clusty shows marked
    improvements that are promising

91
Thank you!
  • Marti Hearst
  • www.sims.berkeley.edu/hearst

92
More Recent Attempts
  • Analyzing retrieval results
  • KartOO: http://www.kartoo.com/
  • Grokker: http://www.groxis.com/service/grok

97
References
  • Chen, Houston, Sewell, and Schatz, JASIS 49(7)
  • Chen and Yu, Empirical studies of information
    visualization: a meta-analysis, IJHCS 53(5), 2000
  • Dumais, Cutrell, Cadiz, Jancke, Sarin and
    Robbins, Stuff I've Seen: A system for personal
    information retrieval and re-use. SIGIR 2003.
  • Hearst, English, Sinha, Swearingen, Yee. Finding
    the Flow in Web Site Search, CACM 45(9), 2002.
  • Hearst, User Interfaces and Visualization,
    Chapter 10 of Modern Information Retrieval,
    Baeza-Yates and Ribeiro-Neto (Eds),
    Addison-Wesley 1999.
  • Johnson, Manning, Hagen, and Dorsey. Specialize
    Your Site's Search. Forrester Research, (Dec.
    2001), Cambridge, MA

98
References
  • Sebrechts, Cugini, Laskowski, Vasilakis and
    Miller, Visualization of search results: a
    comparative evaluation of text, 2D, and 3D
    interfaces, SIGIR 99.
  • Swan and Allan, Aspect windows, 3-D
    visualizations, and indirect comparisons of
    information retrieval systems, SIGIR 1998.
  • Yee, Swearingen, Li, Hearst, Faceted Metadata for
    Image Search and Browsing, Proceedings of CHI 2003