Considering a Faceted Search-based Model

1
Considering a Faceted Search-based Model
  • Marti Hearst
  • UCB SIMS
  • hearst_at_sims.berkeley.edu
  • NAS CSTB DNS Meeting on
  • Internet Navigation and the Domain Name System
  • Technical Alternatives and Policy Implications
  • July 12, 2001

2
Outline
  • The Klensin proposal
  • Synopsis
  • Issues
  • Recommendations
  • UIs and faceted search

3
A Proposal
  • A Search-based access model for the DNS
  • IETF Internet-Draft by John Klensin
  • http://www.ietf.org/internet-drafts/draft-klensin-dns-search-00.txt
  • A multi-layer approach to naming
  • Faceted descriptions are used to facilitate both
    flexible naming and inexact search
  • This talk
  • What does research tell us about the search
    issues?

4
Klensin's proposal
  • Layer 4: Free-text Search (unregulated)
  • Layer 3: Faceted System (detailed, unregulated)
  • Layer 2: Faceted Classification System (simple, regulated)
  • Layer 1: DNS (unchanged)

5
Layer 2
  • Language: Spanish
  • Name: Jose's Pizza
  • Industry Category: Restaurant
  • Geolocation: Miami
  • Network Location

6
Layer 2
Faceted System (simple, regulated)
  • Inputs: search values for one or more facets
  • Outputs: appropriate DNS names and all tuples with matched facets
  • Allow for partial (fuzzy) match
  • Example tuples: Jose's Pizza, Miami; Alberto's Pizza, Miami; Jose's
    Bistro, Miami; Jose's Pizza, Saratoga; Joe's Pizza, Miami
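
The Layer 2 lookup described above (facet values in, DNS names and matching tuples out, with partial matching allowed) can be sketched roughly as follows. This is only an illustration: the record layout, the `.example` DNS names, the `lookup` function, and the 0.6 similarity threshold are all assumptions, not part of the Klensin draft.

```python
from difflib import SequenceMatcher

# Hypothetical layer-2 records: facet values plus the DNS name they resolve to.
RECORDS = [
    {"name": "Jose's Pizza", "geolocation": "Miami", "dns": "josespizza-miami.example"},
    {"name": "Alberto's Pizza", "geolocation": "Miami", "dns": "albertospizza.example"},
    {"name": "Jose's Bistro", "geolocation": "Miami", "dns": "josesbistro.example"},
    {"name": "Jose's Pizza", "geolocation": "Saratoga", "dns": "josespizza-saratoga.example"},
    {"name": "Joe's Pizza", "geolocation": "Miami", "dns": "joespizza.example"},
]


def similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def lookup(query, threshold=0.6):
    """Return (score, record) pairs, best first.

    A record is kept only if every queried facet matches it at least
    as well as the (assumed) threshold.
    """
    hits = []
    for rec in RECORDS:
        scores = [similarity(value, rec.get(facet, "")) for facet, value in query.items()]
        if scores and min(scores) >= threshold:
            hits.append((sum(scores) / len(scores), rec))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


if __name__ == "__main__":
    for score, rec in lookup({"name": "Joses Pizza", "geolocation": "Miami"}):
        print(round(score, 2), rec["name"], rec["geolocation"], rec["dns"])
```

Because several near matches can come back for one query, the caller still has to present the candidates and let the user confirm the right one, which is the user interface requirement raised on the following slides.
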
7
Layer 2 Selling Points
  • Allows sharing of name space among different
    (commercial) entities
  • Allows specification according to meaningful
    attributes

8
Layer 2 DNS Issues
  • How to guarantee uniqueness?
  • How to determine appropriate descriptors?
  • How to use in a hyperlink?
  • Requires a user interface for confirmation of
    correct choice

9
Layer 2 Descriptor Issues
  • Emphasis on geolocation may be problematic
  • May be too spare
  • SFMOMA
  • SFMOMA exhibits
  • SFMOMA exhibit on digital art called 101010

10
Layer 3
  • Not centrally coordinated (provided by commercial services)
  • More detailed facets
  • Allow for inheritance
  • Context-sensitive (e.g., restaurant has menu attribute; auto repair
    has services, etc.)
  • Inputs: service-dependent
  • Outputs: layer 2 names
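
One way to picture inherited, context-sensitive facets is a small schema in which each category adds attributes to those of its parent. The sketch below is purely illustrative; the category names, the `cuisine` facet, and the `facets_for` helper are hypothetical and do not come from the proposal.

```python
# Hypothetical facet schema: each category lists its own facets and an optional parent.
SCHEMA = {
    "business": {"parent": None, "facets": ["name", "geolocation", "language"]},
    "restaurant": {"parent": "business", "facets": ["menu", "cuisine"]},
    "auto repair": {"parent": "business", "facets": ["services"]},
}


def facets_for(category):
    """Collect the facets that apply to a category, inherited ones first."""
    chain = []
    while category is not None:
        chain.append(category)
        category = SCHEMA[category]["parent"]
    facets = []
    for cat in reversed(chain):  # walk from the root down to the category
        facets.extend(SCHEMA[cat]["facets"])
    return facets


if __name__ == "__main__":
    print(facets_for("restaurant"))
    print(facets_for("auto repair"))
```

Here a restaurant gets a menu facet and an auto repair shop gets a services facet, while both inherit the general business facets.
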
11
Layer 4
Free-text Search (unregulated)
Use standard search to find sites that discuss
topics that relate to the query (as web search
works today)
12
Relation to Web Search
  • Web search is perceived to work better today than
    two years ago. Why?
  • Finds appropriate starting points
  • Also known as source selection
  • Search for "toyota" no longer returns "Tony's
    Toyota" pages as the top-ranked hit
  • Before the web, source selection was a separate
    operation from free text search
  • Also, queries tended to be longer
  • Web search engines could do this exclusively,
    but they do other things as well.

13
Recommendations on Klensin Proposal
  • A promising, intriguing approach
  • One tweak
  • Combine layers 2 and 3
  • Have a partly regulated portion, and an open
    portion
  • This however is susceptible to spamming
  • Not clear if this should be regulated

14
General Pitfalls of Controlled Vocabularies
  • Difficult to get agreement on the set of labels
  • Difficult to assign labels consistently
  • Granularity
  • Salience / Emphasis
  • Context
  • Connotations
  • New labels always appearing; old ones shift in
    meaning
  • Lay people won't know the system

15
How to do it wrong: Force into a Hierarchy
Let's try to find UCB
16
How to do it wrong
17
How to do it wrong
18
What is the problem?
  • Two deeply hierarchical facets
  • Region
  • Education
  • Forced in convoluted ways into one hierarchy with
    irregular cross links

19
Two Approaches
  • Statistical approaches map words into metadata
    terms
  • Create flexible user interfaces that
    progressively reveal appropriate subparts of the
    system
  • (How to do so is a topic of our research.)
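
As a rough illustration of the first approach, mapping words into metadata terms statistically, the sketch below counts how often words co-occur with labels in a few labeled examples and lets the query words vote. The example texts, the labels, and the `train` and `suggest` functions are invented for illustration; real systems use far more data and better statistics.

```python
from collections import Counter, defaultdict

# Hypothetical labeled examples: free text paired with a metadata term.
EXAMPLES = [
    ("wood fired pizza and pasta", "Restaurant"),
    ("pizza delivery open late", "Restaurant"),
    ("brake and transmission repair", "Auto Repair"),
    ("oil change and brake service", "Auto Repair"),
]


def train(examples):
    """Count how often each word co-occurs with each metadata term."""
    counts = defaultdict(Counter)
    for text, term in examples:
        for word in text.lower().split():
            counts[word][term] += 1
    return counts


def suggest(counts, query):
    """Return the metadata term with the most co-occurrence votes, or None."""
    votes = Counter()
    for word in query.lower().split():
        votes.update(counts.get(word, {}))
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    model = train(EXAMPLES)
    print(suggest(model, "pizza near me"))
    print(suggest(model, "brake repair"))
```
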

20
The Practice
  • Using descriptors under the hood
  • The limited empirical work indicates
  • Combining free text and descriptors works best
  • Some e-commerce sites do this for finding
    products
  • Can sometimes match queries to standard
    information needs
  • buy palm
  • review crouching tiger
  • berkeley gap
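
Matching a query to a standard information need can be as simple as spotting a trigger word, as in the example queries above. The following is a minimal sketch under that assumption; the trigger table and the `classify` function are hypothetical, and a query such as "berkeley gap" would need extra knowledge (a store name plus a place) that this sketch does not attempt.

```python
# Hypothetical trigger words mapped to "standard information needs".
INTENT_TRIGGERS = {"buy": "purchase", "review": "review"}


def classify(query):
    """Split a query into (information need, remaining terms)."""
    words = query.lower().split()
    for i, word in enumerate(words):
        if word in INTENT_TRIGGERS:
            rest = words[:i] + words[i + 1:]
            return INTENT_TRIGGERS[word], " ".join(rest)
    return "unknown", " ".join(words)


if __name__ == "__main__":
    for q in ("buy palm", "review crouching tiger", "berkeley gap"):
        print(q, "->", classify(q))
```
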

21
walmart.com: Uses metadata under the hood
22
The Promise
  • Using descriptors in the User Interface
  • Use faceted metadata for navigation
  • Query Previews
  • Tailored Search Forms
  • Tightly Combine Navigation and Search

23
Facets
  • Orthogonal sets of descriptors
  • Gets complicated when they are hierarchical
  • Example: recipes

24
Metadata Facets
Advantage: Great for Mixing and Matching
25
Faceted Recipe Metadata
26
Sunset.com: Not the right way
27
Dynamic Previews
  • Avoid empty result sets
  • Show the possible next steps
  • A way to seamlessly integrate
  • Related topics
  • User preferences (personalization)
  • Context-sensitivity
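
A dynamic preview can be computed by counting, for the current selection, how many items each possible next choice would return. Values with a count of zero are simply never offered, which is how empty result sets are avoided. The recipe data and the `previews` function below are assumptions made for illustration, not the interface of any site discussed here.

```python
from collections import Counter

# Hypothetical recipe collection; each facet may hold several values.
RECIPES = [
    {"title": "Gazpacho", "course": ["soup"], "ingredient": ["tomato", "cucumber"]},
    {"title": "Caprese", "course": ["salad"], "ingredient": ["tomato", "mozzarella"]},
    {"title": "Minestrone", "course": ["soup"], "ingredient": ["tomato", "bean"]},
    {"title": "Hummus", "course": ["appetizer"], "ingredient": ["chickpea"]},
]


def matches(item, selected):
    """True if the item carries every selected facet value."""
    return all(value in item[facet] for facet, value in selected.items())


def previews(items, selected):
    """Count results for each possible next selection.

    Only values that occur in the remaining items are counted, so a
    choice that would lead to an empty result set never appears.
    """
    remaining = [item for item in items if matches(item, selected)]
    counts = {}
    for item in remaining:
        for facet, values in item.items():
            if facet == "title" or facet in selected:
                continue  # skip the label and facets already chosen
            counts.setdefault(facet, Counter()).update(values)
    return counts


if __name__ == "__main__":
    print(previews(RECIPES, {}))
    print(previews(RECIPES, {"course": "soup"}))
```

For simplicity this sketch stops offering a facet once a value has been chosen from it; an interface that allows several values per facet would keep counting the remaining values instead.
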

28-31
[Slides 28-31: images only; no text recovered]
32
Metadata Usage in Epicurious
  • Can choose category types in any order
  • But categories never more than one level deep
  • And can never use more than one instance of a
    category
  • Even though items may be assigned more than one
    of each category type
  • Items (recipes) are dead-ends
  • Don't link to "more like this"
  • Not fully integrated with search

33
Epicurious Metadata Usage
  • Problem: lacks integration with search

34
This is fixed in marthastewart.com
35
Advanced search: more specific than sunset.com;
also allows for disjunction, thus less likely to
get null results
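
The effect of disjunction can be sketched with a filter that treats the values chosen within one facet as alternatives (OR) while still requiring every chosen facet to match (AND). The data and the `search` function are hypothetical and are not a description of how any of the sites mentioned actually work.

```python
# Hypothetical recipe collection; each facet may hold several values.
RECIPES = [
    {"title": "Gazpacho", "course": ["soup"], "season": ["summer"]},
    {"title": "Borscht", "course": ["soup"], "season": ["winter"]},
    {"title": "Caprese", "course": ["salad"], "season": ["summer"]},
]


def search(items, selected):
    """Filter items by facet selections.

    `selected` maps a facet to a set of acceptable values: any one of
    them may match (OR within a facet), but every facet listed must
    match (AND across facets).
    """
    return [
        item
        for item in items
        if all(set(item[facet]) & wanted for facet, wanted in selected.items())
    ]


if __name__ == "__main__":
    narrow = search(RECIPES, {"course": {"salad"}, "season": {"winter"}})
    wide = search(RECIPES, {"course": {"salad", "soup"}, "season": {"winter"}})
    print([r["title"] for r in narrow])
    print([r["title"] for r in wide])
```

With a single course selected the winter query comes back empty; allowing a second course as an alternative yields a result.
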
36
UIs for faceted metadata
  • Use dynamic previews
  • Allow user to select metadata in any order
  • At each step, show different types of relevant
    metadata, based on prior steps and personal
    history; include # of documents
  • Previews restricted to only those metadata types
    that might be helpful
  • Tightly integrate with keyword search

37
The Flamenco Research Project
  • Systematically determine what works for
    integrating metadata into search interfaces
  • Develop recommendations that reflect both the
    task structure and the richness of the
    information structure
  • http://bailando.sims.berkeley.edu/flamenco.html

38
Summary
  • Agreement on metadata descriptors and their
    assignment is difficult to achieve
  • Descriptors need to be constantly updated
  • Layer 2 is probably not rich enough
  • Assigning specifiers is quite different from
    searching for specified items
  • Fuzzy search can help, but
  • Requires a UI for confirmation of correct choices
  • This will end up looking like a search service
  • Can make search more meaningful and task-based

39
Summary
  • Web search engines can do source selection, but
  • Sometimes users do want source selection,
  • But search hits based on the content of pages
    are often closer to what users want
  • We need to be certain not to confuse source
    selection with content search