Title: Endeca and faceted browsing: Giving the user a useful catalog


1
Endeca and faceted browsing: Giving the user a
useful catalog
  • Scott Warren
  • NCSU Libraries
  • South Carolina Library Association Annual Meeting
  • June 7, 2007

2
Outline
  1. Problem and Context
  2. Online searching, shopping, and examples
  3. Demo
  4. Faceted Navigation
  5. Implementation Challenges
  6. Facet Usage Statistics
  7. Reflections

3
The Context
4
Online Catalogs
"Most integrated library systems, as they are
currently configured and used, should be removed
from public view." - Roy Tennant, CDL
5
What is the problem?
  • Existing catalogs are hard to use
  • known item searching works pretty well, but
  • users often do keyword searching and get large
    result sets returned in system sort order (last
    in, etc.)
  • catalogs are unforgiving on spelling errors,
    stemming
  • Authority searching completely mystifying

6
Catalog metadata is buried
  • Subject headings are not leveraged in searching
  • they should be browsed or linked from, not
    searched
  • Data from the item record is not leveraged
  • should be able to filter by item type, location,
    circulation status, popularity

7
Word of the Day for Saturday, May 5, 2007
  • moil \MOYL\, intransitive verb
  • 1. To work with painful effort; to labor; to
    toil; to drudge. 2. To churn or swirl about
    continuously. 3. Toil; hard work; drudgery. 4.
    Confusion; turmoil.

8
What's the big picture?
  • Improve the quality of the library catalog user
    experience.
  • Exploit our existing metadata infrastructure
    (make MARC work harder).
  • Build a more flexible catalog tool that can be
    integrated with discovery tools of the future.

9
What is Endeca?
  • Software company based in Cambridge, MA
  • Search/information access technology provider for
    a number of major e-commerce websites
  • Developers of the Endeca Information Access
    Platform

10
Why Endeca?
  • Customized relevance ranking of results
  • Better subject access by leveraging available
    metadata through facets
  • Improved response time
  • Enhanced natural language searching through spell
    correction, etc.
  • Browse

11
A question
  • How is the new generation of library catalog
    being developed?
  • informed and enhanced by search technologies
    developed outside of the library
  • based on how our users know how to search, not on
    how we want them to search
  • What does search look like for our users?

12
Examples
15
Faceted Navigation on the Web
17
Facet
Value
21
Faceted Navigation in Libraries
22
Faceted Navigation in Libraries
23
Faceted Navigation in Libraries
24
Demonstration
25
Faceted Navigation
26
What is Faceted Navigation?
27
What is Faceted Navigation?
  • Search and browse in a single interface
  • Facets can vary in scope
  • What is the item about?
  • What kind of item is it?
  • Where is it?
  • Enables users to narrow results
  • Macroscopic behavior of results set
  • Clues to being on the right path

28
Origins of Facets
  • 1930s: Ranganathan
  • Colon Classification

29
Cartesian Coordinates
30
Coordinate System
Axes: (x, y, z) = (Library, LCSH, Format)
Example points: (Branch 1, History, Book), (Branch 2, History, DVD)
Multiple records could be associated with each coordinate point. Each
point is associated with at least one record.
[Figure: three-axis plot. Library axis: Branch 1, Branch 2. LCSH axis: Art, History. Format axis: Book, DVD. The point (Branch 1, History, Book) is labeled.]
31
Another way to think about it
  • 11-dimensional lattice space
  • All points associated with at least one
    item/record
  • Records can be associated with more than one point
  • Keyword search selects subset of points with
    word(s) in record
  • Facets shown are those dimensions corresponding
    to the points in that set (nonzero values).
  • Choosing a facet value is equivalent to slicing
    through the multidimensional lattice on a plane
    along that facet value and reducing the lattice's
    dimension by 1.
  • Choose enough facets and you will get down to a
    few items (never a null set)
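
The lattice picture above can be sketched in a few lines of Python. This is a minimal illustration, not Endeca's implementation: the record titles are hypothetical, only three of the dimensions are used, and each record sits at a single point (the slide notes that real records can sit at more than one).

```python
from collections import Counter

# Hypothetical records; facet names follow the (Library, LCSH, Format) example.
RECORDS = [
    {"title": "History of art", "Library": "Branch 1", "LCSH": "History", "Format": "Book"},
    {"title": "Civil War history", "Library": "Branch 2", "LCSH": "History", "Format": "DVD"},
    {"title": "Modern art", "Library": "Branch 1", "LCSH": "Art", "Format": "Book"},
]
FACETS = ("Library", "LCSH", "Format")


def keyword_search(records, word):
    """Select the subset of records whose title contains the word."""
    return [r for r in records if word.lower() in r["title"].lower()]


def facet_counts(records):
    """Count facet values present in the result set.

    Only values with at least one record are listed, so any value the
    user can pick leads to a non-empty result set.
    """
    return {facet: Counter(r[facet] for r in records) for facet in FACETS}


def refine(records, facet, value):
    """Slice the result set along one facet value."""
    return [r for r in records if r[facet] == value]


results = keyword_search(RECORDS, "history")
counts = facet_counts(results)
narrowed = refine(results, "Format", "DVD")
```

Because the choices offered come from `facet_counts` on the current result set, refining by any offered value always leaves at least one record, which is the "never a null set" property.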

32
Implementation
33
Implementation Challenges
  • Facet selection
  • Interface design
  • Data issues

34
Endeca at NCSU
  • Endeca used to improve the discovery portion of
    the library catalog
  • Endeca software indexes 1.6 million MARC records
    exported nightly from Sirsi Unicorn ILS
  • Backend functions of ILS remain intact

35
Facets Implemented at NCSU
  • Availability
  • Author
  • Library
  • Format
  • Language
  • Browse New
  • LC Classification
  • Subject Topic
  • Subject Genre
  • Subject Region
  • Subject Era

36
Facet Selection
37
Interface Design
  • Iterative approach using wireframes
  • Eight major revisions in a four-month period
  • Still lots of room for improvement

38
Technical Overview
  • Endeca co-exists with SirsiDynix Unicorn ILS and
    Web2 online catalog
  • Endeca handles keyword search
  • Web2 handles authority search and detail page
    display
  • Endeca indexes MARC records exported nightly from
    Unicorn
  • Endeca is the discovery portion of the ILS

39
Technical Overview
[Diagram labels: Information Access Platform; Data Foundry; MDEX Engine; NCSU exports and reformats; raw MARC data; flat text files; parse text files; indices; HTTP; NCSU Web Application.]
40
Technical Overview
Offline - Nightly
[Diagram with the same labels as the previous Technical Overview slide.]
41
Technical Overview
Always Online
[Diagram with the same labels as the previous Technical Overview slide.]
42
Implementation Team
  • Seven-member team
  • 5 IT staff,
  • 1 cataloging librarian,
  • 1 reference librarian
  • Timeline
  • License / negotiation: Spring 2005
  • Software acquisition: Summer 2005
  • Implementation: Aug 2005 to Jan 2006

43
Data Issues
  • ILS data with MARC-8 encoding -> text data with
    UTF-8 encoding
  • Data consistency between ILS and Endeca catalog
    indexes (updates!)
  • Data issues revealed by exposing metadata (e.g.,
    subject headings) in facets
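
One way to watch for drift between the ILS and the search index is to compare record identifiers and modification dates after each nightly load. The sketch below assumes both sides can be reduced to a mapping of record ID to last-modified date; the function name, the IDs, and the dates are hypothetical, and the MARC-8 to UTF-8 conversion itself needs a MARC-aware library and is not shown.

```python
def diff_catalogs(ils_records, indexed_records):
    """Compare two {record_id: last_modified} mappings.

    Returns the IDs that need to be added to the index, removed from
    it, or re-indexed because the ILS copy has changed.
    """
    ils_ids = set(ils_records)
    index_ids = set(indexed_records)
    return {
        "missing_from_index": sorted(ils_ids - index_ids),
        "stale_in_index": sorted(index_ids - ils_ids),
        "out_of_date": sorted(
            rid for rid in ils_ids & index_ids
            if ils_records[rid] != indexed_records[rid]
        ),
    }


# Hypothetical sample data for illustration only.
ils = {"a1": "2007-05-01", "a2": "2007-05-03", "a3": "2007-05-04"}
index = {"a1": "2007-05-01", "a2": "2007-05-02", "a9": "2007-04-30"}
report = diff_catalogs(ils, index)
```

Here `a3` is new in the ILS, `a9` has been withdrawn but is still indexed, and `a2` was updated after it was last indexed.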

44
Outcomes
45
Added search tools
  • Automatic spell correction
  • "Did you mean" suggestions
  • Automatic stemming
  • Bookmark-ability
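
To give a feel for "Did you mean" suggestions, here is a small sketch using Python's standard `difflib` module. It is not Endeca's algorithm; the vocabulary and the similarity cutoff are assumptions chosen for illustration.

```python
import difflib

# Hypothetical list of terms that occur in the indexed records.
VOCABULARY = ["history", "chemistry", "textiles", "engineering"]


def did_you_mean(term, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest indexed term for a misspelled query term.

    Returns None when the term is already in the vocabulary or when
    nothing is similar enough to suggest.
    """
    term = term.lower()
    if term in vocabulary:
        return None
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


suggestion = did_you_mean("histroy")
```

A production system would typically also weigh how often each candidate term occurs in the collection, which this sketch ignores.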

46
True browse
  • Regain ability to browse catalog without entering
    any search terms

47
July 06 - Jan 07
48
July 06 - Jan 07
49
July 06 - Jan 07
50
July 06 - Jan 07
51
July 06 - Jan 07
52
Dimension          Value                         Requests
New                NEW                             56,286
Format             Book                            16,188
LC Classification  Q - Science                     12,462
Library            Textiles                        11,160
Library            D.H. Hill                       11,060
Availability       Available                        9,276
Library            Online Resources                 8,164
LC Classification  T - Technology                   8,052
Subject Topic      History                          7,915
Format             Online                           7,858
LC Classification  P - Language and literature      7,005
LC Classification  H - Social Sciences              6,953
Language           English                          6,854
Subject Region     United States                    6,298
Format             Journal, Magazine, or Serial     4,621
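
The figures above can also be rolled up by dimension to see which facets draw the most requests. The sketch below uses only the fifteen rows listed on this slide, so the totals describe this top-values list and not overall facet usage.

```python
from collections import defaultdict

# (dimension, value, requests) rows copied from the slide.
REQUESTS = [
    ("New", "NEW", 56286),
    ("Format", "Book", 16188),
    ("LC Classification", "Q - Science", 12462),
    ("Library", "Textiles", 11160),
    ("Library", "D.H. Hill", 11060),
    ("Availability", "Available", 9276),
    ("Library", "Online Resources", 8164),
    ("LC Classification", "T - Technology", 8052),
    ("Subject Topic", "History", 7915),
    ("Format", "Online", 7858),
    ("LC Classification", "P - Language and literature", 7005),
    ("LC Classification", "H - Social Sciences", 6953),
    ("Language", "English", 6854),
    ("Subject Region", "United States", 6298),
    ("Format", "Journal, Magazine, or Serial", 4621),
]


def totals_by_dimension(rows):
    """Sum the request counts for each dimension."""
    totals = defaultdict(int)
    for dimension, _value, count in rows:
        totals[dimension] += count
    return dict(totals)


totals = totals_by_dimension(REQUESTS)
# Dimensions ordered from most to least requested.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Within these rows, New leads with 56,286 requests, followed by LC Classification (34,472), Library (30,384), and Format (28,667).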
53
Usability testing
  • 10 undergraduate students
  • 5 with new Endeca-based interface
  • 5 with old catalog interface
  • Identical searching tasks
  • Data collected
  • Task difficulty/failure
  • Task duration

54
Usability testing
55
Usability testing
56
Usability testing
  • For students, relevance ranking is key.
  • July 06 - Jan 07: 19 continued to page 2
  • Faceted navigation is intuitive, even for
    students who don't use it.
  • Beware of library jargon
  • keyword anywhere, keyword in subject
  • User behavior is influenced by previous
    experience.

57
Reflections
  • Faceted navigation enables new ways to discover
    resources
  • Library collections often contain rich
    descriptive metadata: exploit this!
  • We have much to learn about how to optimize these
    interfaces for the user
  • Great for collection analysis

58
Analyzing collections
59
Conclusions
60
Features Not Supported
  • Work level aggregations / roll-up
  • Customization / personalization
  • Folksonomies / user contributed content
  • Recommender functionality
  • Shopping cart functionality

61
QuickSearch
62
Future directions
  • Experiment with FRBR search/display through
    partnership with OCLC.
  • Integrate catalog w/other tools through web
    services
  • OpenSearch, RSS
  • Enrich catalog through external web services
  • book jackets, reviews, etc. (Amazon/OCLC)
  • Build modular shopping cart functionality.
  • Use Endeca to index local collections.

63
Big Issues
  • Benchmarking
  • Just how much better is it? For whom? When is it
    not better?
  • Natural Language
  • Revolutionary War problem
  • Experimenting
  • What is the optimal interface?
  • Power Search?

64
Big Wins
  • Relevance ranking
  • Speed / performance
  • Locally managed presentation interface
  • Persistent parameter-based entry points
  • Proving it could be done
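
A parameter-based entry point can be thought of as a URL whose query string carries the whole search state, so it can be bookmarked or linked from another page. The sketch below is an assumption about how such URLs might be built and read back; the base URL and the parameter names (`q`, `format`, `library`) are hypothetical, not the ones used at NCSU.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

BASE_URL = "https://catalog.example.edu/search"  # placeholder address


def build_entry_point(keywords, facets):
    """Encode keywords and selected facet values as a bookmarkable URL."""
    params = [("q", keywords)] + sorted(facets.items())
    return BASE_URL + "?" + urlencode(params)


def parse_entry_point(url):
    """Recover the keywords and facet selections from such a URL."""
    params = parse_qsl(urlsplit(url).query)
    keywords = next((value for key, value in params if key == "q"), "")
    facets = {key: value for key, value in params if key != "q"}
    return keywords, facets


url = build_entry_point("civil war", {"format": "Book", "library": "D.H. Hill"})
keywords, facets = parse_entry_point(url)
```

Passing an empty keyword string with only facet values gives a link that browses the catalog without any search terms.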