Title: Creating a New Faceted Browsing Function for Millennium WebPAC Pro
- Li, Yiu On, Senior Assistant Librarian
- Leung, Roger, Information Technology Officer
- Hong Kong Baptist University Library
9th HKIUG Meeting, University of Hong Kong Library
8 December 2009
Outline
- What is Faceted Browsing?
- Implementations of Faceted Browsing in Traditional WebPAC: Two Approaches
- Architecture of the New Faceted Browsing Function in WebPAC Pro
- BU Faceted Browsing Function and Encore: A Comparison
- Conclusion
1. What is Faceted Browsing?
1.1 Definition of Faceted Browsing
- Faceted browsing
- is also known as faceted searching or faceted navigation
- is a special navigation interface designed for record searching and browsing
- displays aspects of result sets in multiple classification and categorization schemes (e.g. related authors, titles, subject headings, material types, locations, languages, publication years)
1.2 Advantages of Faceted Browsing
- Unlike a single, pre-determined, hierarchical scheme, faceted browsing gives users the ability
- To find items from multiple dimensions and attributes
- To explore new directions in dynamic taxonomies (i.e. divisions into ordered groups or categories)
- To refine or narrow down their searches
1.2 Advantages of Faceted Browsing (Cont)
- To switch easily between searching and browsing: users can search in their own terminology while browsing the organizations and categories suggested by faceted classifications
- To display the number of records in each suggested category
1.2 Advantages of Faceted Browsing (Cont)
- For experienced Web users, faceted navigation isn't something that needs to be explained
- -- Marshall Breeding. "Next-Generation Library Catalogs". Library Technology Reports, vol. 43, no. 4, July-August 2007, p. 12.
1.3 Use of Faceted Browsing in Commercial Sites
- Indeed, faceted browsing has become part of a well-established user interface convention
- A 2003 survey reported that 69 of 75 leading commercial sites made use of faceted browsing. In fact, all sites in the computers, gifts, kitchenware, and music/video categories used faceted browsing
- -- Use of Faceted Classification, http://www.webdesignpractices.com/navigation/facets.html
- e.g. Amazon, the largest online bookstore
- In Amazon, faceted browsing includes
- New Releases
- Department
- Formats
- Binding
- Shipping Options
- Award Winners
- Promotion
- Avg. Customer Review
- Condition
1.4 Implementation of Faceted Browsing in WebPAC
- If librarians can implement this common faceted browsing function in the WebPAC environment, then
- we can change WebPAC from a traditional searching tool into a powerful information discovery tool
2. Implementations of Faceted Browsing in Traditional WebPAC: Two Approaches
2.1 Need for Adding New Web 2.0 Functions to WebPAC
- More and more librarians are dissatisfied with the limited functionality of traditional WebPAC interfaces
- To win the support of the new generation of web users, we need to add new Web 2.0 technologies such as faceted browsing, interactive cloud tags, federated search, and social networking tools
2.2 Next Generation WebPAC
- Names for a WebPAC equipped with new Web 2.0 functions include
- Next Generation Library Catalog,
- SmartCat,
- Library Catalog 2.0.
2.3 Two Different Approaches
- In Hong Kong, two different approaches have been adopted to build the Next Generation Library Catalog. They are
- New Functions in New WebPAC (NFNW)
- New Functions in Current WebPAC (NFCW)
- (Note: in this presentation, we use the faceted browsing function as a representative example of the Web 2.0 functions)
2.4 NFNW Development Logic
- The development logic of the New Functions in New WebPAC approach may be summarized as
- We MUST add a faceted browsing function to WebPAC
- The existing Millennium WebPAC Pro environment is too old and CANNOT accommodate this transformation
- Thus, we need to develop a new WebPAC to implement new Web 2.0 technology
- (NOTE: this argument is invalid; we return to it under NFCW later)
2.5 Two Models of NFNW
- Two different development models of New Functions in New WebPAC
- Encore
- III product
- Relatively high annual subscription fee
- Purchased by CUHK, HKU, and PolyU
- Scriblio
- an open-source software
- an enhanced version is used at HKUST
2.6 Disadvantages of NFNW
- Many existing powerful functions of WebPAC Pro are missing in Encore
- Exact Author, Title, Subject Searching
- Scope searching
- Limit results to items with "Available" status
- Search History
- Author/title/subject authority list (e.g. Author search: Strauss, Johann, 1825-1899)
- Modify/Limit this Search command
- Advanced Keyword Search Form
Existing powerful WebPAC Pro functions are missing in Encore
2.7 NFNW: Dual WebPAC System
- As a result, Encore cannot replace the traditional WebPAC Pro
- If patrons want to use those old advanced search functions, they have to use the traditional WebPAC Pro
- Thus, the Encore (New Functions in New WebPAC) approach is, in reality, a dual WebPAC system
- Encore (New WebPAC) + WebPAC Pro (Traditional WebPAC)
- Click through to access WebPAC Pro for more old/classic advanced searching capabilities
- Encore in University of Queensland Library
- http://encore.library.uq.edu.au/iii/encore/search/C%7CSStrauss%7COrightresult%7CU1?lang=eng&suite=def
2.8 Disadvantages of a Dual WebPAC System
- Patrons have to learn how to use two different WebPAC systems. This may cause inconvenience and confusion
- Library staff spend more time and effort maintaining two searching interfaces, so the maintenance cost is high
- Systems staff spend time re-inventing a new interface rather than concentrating on the design of the faceted browsing function
2.9 Building Next Generation Library Catalog: the Second Approach
- The development logic of the New Functions in New WebPAC approach is based on an invalid argument
- that the existing Millennium WebPAC Pro environment is too old and CANNOT accommodate any Web 2.0 functions
- But our study shows that WebPAC Pro is a comparatively open and flexible environment, and we can add in-house developed scripts to the interface
2.9 Building Next Generation Library Catalog: the Second Approach (Cont)
- Thus, we decided to add a faceted browsing function to the existing WebPAC Pro interface
- This is a more logical, simple and direct approach, and I call it
- New Functions in Current WebPAC (NFCW)
2.10 Merits of the NFCW Approach
- Faceted browsing is inserted into WebPAC Pro and becomes an integral part of it
- All the existing powerful WebPAC Pro functions are kept
- The new add-on faceted browsing functions are fully compatible with the existing WebPAC Pro functions
- The new add-on faceted browsing functions strengthen the existing WebPAC Pro searching capabilities
2.10 Merits of the NFCW Approach (Cont)
- A single interface avoids the unnecessary inconvenience, inconsistency and confusion caused by a dual WebPAC system
- Saves library staff's time and effort in maintaining two different WebPAC systems
- No need to re-invent a new WebPAC interface; therefore, software development cost and cycle time are greatly reduced
2.11 New Faceted Browsing in HKBU WebPAC Pro
- Based on the NFCW development logic, HKBU has recently installed a new faceted browsing function on the staging port of WebPAC Pro
- Currently, only some 222,000 records are uploaded to this database for testing
2.11 New Faceted Browsing in HKBU WebPAC Pro (Cont)
- HKBU WebPAC Pro staging port
- http://hkbulib.hkbu.edu.hk:2082/search~S11/?searchtype=a&searcharg=plato&searchscope=11&SORT=D&extended=0&searchlimits=&searchorigarg=asmith+adam
New In-house Developed Faceted Browsing Function in HKBU WebPAC Pro
3. Architecture of the New Faceted Browsing Function in WebPAC Pro
3.1 Systems Requirements
- Hardware
- x86-based PC/Server
- Our server configuration
- Dual Xeon quad-core CPU
- 32 GB memory
- 1 TB HDD space
- NOTE: 220,000 bib records are uploaded for testing, and use 7.8 GB for MySQL and Sphinx
3.1 Systems Requirements (Cont)
- Software
- Perl 5 with marc2xml and marc-charset (for MARC to XML conversion)
- MySQL 5 (for data storage)
- Sphinx (for building index and searching data)
- IIS with ASP 3.0 (for user interface data conversion)
3.1 Systems Requirements (Cont)
- Systems requirement is minimal
- Don't need a dedicated server
- Don't require a special high-end programming language (Perl, MySQL, and Sphinx are freeware)
3.2 Program Workflow
- Two major parts
- Construct a bibliographic record database for facet analysis
- Create a special iFrame in WebPAC Pro for displaying facets
3.3 Construction of a New Bibliographic Database
- Metadata are required to calculate facets
- Thus, we build a separate database to store the raw data for creating facets instead of using the records in the Innopac system
- All MARC records are exported from the Innopac system and uploaded to our in-house developed bibliographic database
3.3 Construction of a New Bibliographic Database (Cont)
- An indexing program is designed to extract facets according to the 11 categories below

Variable Fields     Fixed Fields
1. Author           5. Scope
2. Title            6. Language
3. Subject          7. Material Type
4. Publisher        8. Location
                    9. Publication Year
                    10. Call No. (browsing only)
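
The extraction step for the variable-field categories can be illustrated with a minimal Python sketch. This is not the HKBU indexing program (which the deck says is built on Perl, MySQL and Sphinx). The tag-to-facet mapping follows common MARC 21 usage and is an assumption, as are the function name `extract_variable_facets`, the simplified record layout, and the made-up sample record.

```python
# Hypothetical mapping from facet category to MARC (tag, subfield) pairs.
# The tags follow common MARC 21 usage; the deck does not state which
# fields the HKBU indexing program actually reads.
VARIABLE_FIELD_FACETS = {
    "author": [("100", "a"), ("700", "a")],
    "title": [("245", "a")],
    "subject": [("650", "a")],
    "publisher": [("260", "b")],
}


def extract_variable_facets(record):
    """Return {facet category: [values]} for one parsed record.

    `record` is assumed to be a list of (tag, {subfield code: value})
    tuples, a simplified stand-in for a parsed MARC record.
    """
    facets = {name: [] for name in VARIABLE_FIELD_FACETS}
    for name, sources in VARIABLE_FIELD_FACETS.items():
        for tag, code in sources:
            for field_tag, subfields in record:
                value = subfields.get(code)
                if field_tag == tag and value:
                    # Trim trailing ISBD punctuation such as " /" or ","
                    cleaned = value.strip().rstrip(" /:,;")
                    if cleaned and cleaned not in facets[name]:
                        facets[name].append(cleaned)
    return facets


# Made-up sample record, for illustration only.
sample_record = [
    ("100", {"a": "Smith, Adam,", "d": "1723-1790"}),
    ("245", {"a": "The wealth of nations /", "c": "Adam Smith"}),
    ("260", {"a": "Oxford :", "b": "Oxford University Press,", "c": "1993"}),
    ("650", {"a": "Economics"}),
]
sample_facets = extract_variable_facets(sample_record)
```

Storing the cleaned values per record, one row per value, is what would later allow a simple grouped count per category.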
3.4 Facets: Variable Fields
- Below are the variable-field facets for
- Author Search: Smith, Adam
3.5 Facets: Fixed Fields
- Below are the fixed-field facets for
- Author Search: Smith, Adam
3.6 Insert Facets on WebPAC Pro
- WebPAC Pro is an open environment: we can insert scripts and create an iFrame to display facets on the brief citation browse page and the bib record page
3.7 iFrame Tag on Briefcit.html
- An example of an iFrame tag
- <iFrame src="http://lib.hkbu.edu.hk/facet/browse/index.asp?searchterm= " width="100" height="100" frameborder="0" scrolling="no"></iFrame>
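
As a rough illustration of how such a tag could be generated for a given search term, here is a minimal Python sketch. The deck does not show how the term is inserted into `searchterm`, so the helper `facet_iframe` and the use of URL encoding are assumptions. The attribute values are copied from the slide as extracted.

```python
from html import escape
from urllib.parse import urlencode

# Facet page address taken from the slide.
FACET_PAGE = "http://lib.hkbu.edu.hk/facet/browse/index.asp"


def facet_iframe(search_term):
    """Build an iframe tag whose src carries the URL-encoded search term."""
    src = FACET_PAGE + "?" + urlencode({"searchterm": search_term})
    # Escape the URL so it is safe inside a double-quoted HTML attribute.
    return (
        '<iframe src="{}" width="100" height="100" '
        'frameborder="0" scrolling="no"></iframe>'
    ).format(escape(src, quote=True))


example_tag = facet_iframe("smith, adam")
```

Encoding the term keeps commas and spaces in author headings from breaking the query string.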
3.8 Facet Display Program
- Obtain the search variable by extracting the search term from the WebPAC search URL, http://hkbulib.hkbu.edu.hk:2082/search~S11/?searchtype=X&searcharg=china&searchscope=11&SORT=DZ&extended=0&SUBMIT=Search&searchlimits=&searchorigarg=Xchina
- Pass the search term to the bibliographic database and make a SQL search query
- Extract search results from the in-house bibliographic database, and then calculate and group the facet values
- Display faceted categories and values in the iFrame
1. Create facet iFrame in briefcit.html
2. Extract search term and pass to iFrame
3. SQL to bibliographic db
4. Return facet values
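
The four steps can be sketched end to end in Python. This is a hedged illustration, not the ASP program described in the deck: it assumes the search URL carries the term in a `searcharg` query parameter, uses an in-memory SQLite table named `bib` in place of the MySQL database, and uses a plain `LIKE` match in place of Sphinx. The table layout and sample rows are made up.

```python
import sqlite3
from urllib.parse import urlparse, parse_qs


def extract_search_term(webpac_url):
    """Pull the `searcharg` value out of a WebPAC search URL."""
    query = parse_qs(urlparse(webpac_url).query)
    return query.get("searcharg", [""])[0]


def facet_counts(conn, term, facet_column):
    """Count matching records per value of one facet column."""
    # Column names cannot be bound as parameters, so whitelist them.
    if facet_column not in {"author", "publisher", "language"}:
        raise ValueError("unknown facet column: " + facet_column)
    sql = (
        "SELECT {col}, COUNT(*) AS n FROM bib "
        "WHERE title LIKE ? GROUP BY {col} ORDER BY n DESC, {col}"
    ).format(col=facet_column)
    return conn.execute(sql, ("%" + term + "%",)).fetchall()


# In-memory SQLite stands in for the MySQL bibliographic database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bib (title TEXT, author TEXT, publisher TEXT, language TEXT)"
)
conn.executemany(
    "INSERT INTO bib VALUES (?, ?, ?, ?)",
    [
        ("China in transition", "Author A", "Publisher X", "eng"),
        ("Modern China", "Author B", "Publisher X", "eng"),
        ("History of China", "Author A", "Publisher Y", "chi"),
        ("Plato's Republic", "Author C", "Publisher Y", "eng"),
    ],
)

url = (
    "http://hkbulib.hkbu.edu.hk:2082/search~S11/"
    "?searchtype=X&searcharg=china&searchscope=11"
)
term = extract_search_term(url)
publisher_facets = facet_counts(conn, term, "publisher")
language_facets = facet_counts(conn, term, "language")
```

Each `(value, count)` pair corresponds to one facet link with its record count, which is what the iFrame would render.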
4. BU Faceted Browsing Function and Encore: A Comparison
4.1 Facets: Variable Fields
- Encore cannot provide facet values for variable fields like Author, Title, and Subject
- Thus, Encore cannot provide a meaningful refinement alternative for variable field categories
- An example: Author search for Smith, Adam
- Encore Variable Field Facets
- No facet values
- Indeed, only keyword search links are provided
- Keyword-Author
- Keyword-Title
- Keyword-Subject
- Fails to provide meaningful alternatives for users to refine/limit their search
- Author Facets in HKBU
- Names of Chinese translators are provided
- Ebook collections
- Title Facets in HKBU
- List of Adam Smith's most important works
- Wealth of Nations
- Theory of Moral Sentiments
- Chinese translation titles for Wealth of Nations are provided
- Subject Facets in HKBU
- Contributions of Adam Smith in subject areas
- Economics
- Ethics
4.2 New Facets Variable Field -- Publisher
- BU provides Publisher as a new variable-field facet
- Users may choose to refine the search by publisher, e.g. Oxford University Press
4.3 Publication Year Facets
- In BU, publication years are grouped into 10-year ranges instead of a long list of single years as in Encore
- A 10-year list is easier for browsing, searching, and collection analysis
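
A minimal Python sketch of the 10-year grouping follows. The deck does not say how the ranges are aligned, so the alignment to calendar decades (1770-1779, 1780-1789, and so on), the function names, and the sample years are all assumptions.

```python
from collections import Counter


def decade_label(year):
    """Map a publication year to its 10-year range, e.g. 1776 -> '1770-1779'."""
    start = (year // 10) * 10
    return "{}-{}".format(start, start + 9)


def year_facets(years):
    """Count records per decade, newest decade first."""
    counts = Counter(decade_label(y) for y in years)
    # Labels start with a four-digit year, so string order matches time order.
    return sorted(counts.items(), key=lambda item: item[0], reverse=True)


# Made-up publication years, for illustration only.
sample_years = [1776, 1759, 1904, 1909, 1976, 1979, 1971]
decade_counts = year_facets(sample_years)
```

Seven single-year entries collapse into four decade facets here, which is the shortening effect the slide describes.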
4.4 Encore Only Provides Keyword Searching
- In Encore, keyword searching is the only searching capability
- Without exact Author, Title, and Subject search, the searching process becomes more complex and difficult
- In BU, users can still use exact Author, Title, and Subject search, and refine the search by facets
4.4 Encore Only Provides Keyword Searching (Cont)
- It is difficult to do a keyword search on authors with common last names and first names
- e.g., Adam Smith
- Find all records containing Adam and Smith
- NOTE: the first two are not written by Adam Smith, the British economist we are looking for
- An exact Author search is much easier and more straightforward
- Adam Smith was born in the 18th century, so entry 2 is the one we are looking for
- Facets are also helpful for refining the search
4.4 Encore Only Provides Keyword Searching (Cont)
- It is also difficult to search by keyword for subject headings containing common terms
- e.g. Philosophy History -- China
- In Encore, find all records containing philosophy and history and China
- NOTE: Many are irrelevant records
- An exact Subject search is much easier and more straightforward
- Facets are very useful for refining the search
4.5 Call Number Analysis
- Unavailable in Encore
- e.g. Keyword: Plato
- To help users browse the class number list, the program displays both the class number and the scope of its content
4.5 Call Number Analysis (Cont)
4.6 Fully Compatible with All WebPAC Searching Functions
- Unavailable in Encore
- Exact Author, Title, Subject Searching
- Scope searching
- Limit results to items with "Available" status
- Search History
- Author/title/subject authority list (e.g. Author search: Strauss, Johann, 1825-1899)
- Modify/Limit this Search command
- Advanced Keyword Search Form
5. Conclusion
5.1 Benefit of NFCW
- Creating a new faceted browsing function in the existing WebPAC Pro environment is beneficial
- WebPAC Pro can be re-engineered and upgraded to become a Next Generation Library Catalog
- This upgrade keeps all the advanced searching functionality of WebPAC Pro
5.1 Benefit of NFCW (Cont)
- Upgrade cost is low because there is no need to re-invent a new WebPAC
- Annual maintenance cost is low because there is no need to maintain two different WebPAC interfaces
- The development cycle is faster because we can concentrate our work on designing new functions
5.2 Future Development
- In the second phase, we will add the following new Web 2.0 functions
- Cloud Tagging
- Newly Added Book List
- RSS
- User Book Rating
- User Book Review/Comment
- Adding Google/Amazon table of contents
- Most Common Search Terms
- Not available in Encore
RSS
Cloud Tagging
User Book Rating
Recently Added List
User Book Comment/Review
Google/Amazon table of contents
Demo
Thank you
Demo examples
- BU WebPAC Staging Port
- http://hkbulib.hkbu.edu.hk:2082/search/X?SEARCH=plato&SORT=D&l=&m=&p=&b=&Da=&Db=&searchscope=11
- KW: Plato (scope: Multimedia)
- AU: ??? (publisher: ??????)
- AU: Strauss, Johann, 1825-1899 (subject: Waltzes (Orchestra))
- SU: Kant (author: ???)
- KW: China (pub year: pre-1900)