Faceted Search - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Faceted Search


1
Faceted Search
  • Some examples of applied faceted search on websites developed by the EP

Jerry Hilbert, European Parliament
2
Faceted search -
Definition
Faceted search, also called faceted navigation or
faceted browsing, is a technique for accessing a
collection of information represented using a
faceted classification, allowing users to explore
by filtering available information. A faceted
classification system allows the assignment of
multiple classifications to an object, enabling
the classifications to be ordered in multiple
ways, rather than in a single, pre-determined,
taxonomic order. Each facet typically corresponds
to the possible values of a property common to a
set of digital objects. Facets are often derived
by analysis of the text of an item using entity
extraction techniques or from pre-existing fields
in the database such as author, descriptor,
language, and format. This approach permits
existing web-pages, product descriptions or
articles to have this extra metadata extracted
and presented as a navigation facet.
Source: Wikipedia
3
Faceted search -
Technology
Several search engines now offer faceted
search. The EP uses Solr, which is based on
Lucene. Solr is an open source
enterprise search platform from the Apache Lucene
project. Its major features include powerful
full-text search, hit highlighting, faceted
search, dynamic clustering, database integration,
and rich document (e.g., Word, PDF) handling.
Providing distributed search and index
replication, Solr is highly scalable. Solr is
written in Java and runs as a standalone
full-text search server within a servlet
container such as Apache Tomcat. Solr uses the
Lucene Java search library at its core for
full-text indexing and search, and has REST-like
HTTP/XML and JSON APIs that make it easy to use
from virtually any programming language. Solr's
powerful external configuration allows it to be
tailored to almost any type of application
without Java coding, and it has an extensive
plugin architecture when more advanced
customization is required.
Source: Wikipedia
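
As an illustration of the HTTP API described above, the sketch below builds a Solr search URL that asks for facet counts together with the hits. `q`, `rows`, `wt`, `fq`, `facet`, `facet.field` and `facet.mincount` are standard Solr request parameters; the host, the path and the field names (`political_group`, `parliamentary_term`) are assumptions for the example and are not taken from the EP configuration.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_facet_query_url(base_url, text, facet_fields, filters=(), rows=10):
    """Build a Solr select URL that requests facet counts alongside the hits."""
    params = [
        ("q", text),               # free-text query
        ("rows", str(rows)),       # number of hits to return
        ("wt", "xml"),             # ask for an XML response
        ("facet", "true"),         # switch faceting on
        ("facet.mincount", "1"),   # hide facet values that match nothing
    ]
    # One facet.field parameter per facet to compute.
    params += [("facet.field", field) for field in facet_fields]
    # One fq (filter query) per facet value the user has already selected.
    params += [("fq", flt) for flt in filters]
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and field names, for illustration only.
url = build_facet_query_url(
    "http://localhost:8983/solr/select",
    "waste directive",
    ["political_group", "parliamentary_term"],
    filters=['parliamentary_term:"7"'],
)
print(url)
```

Each facet value the user clicks becomes one more `fq` filter, so the free-text query itself never has to be rewritten.
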
4
Faceted search -
Technology
How is Lucene/Solr used? XML IN, XML OUT
  • XML IN: data is structured in XML when it is
    submitted for indexing
  • XML OUT: data is returned as XML (including
    facet details) as the result of a search
  • Also: configuration of the search engine for
    free text (number of terms to match; relevance
    of the terms, according to the field they are
    associated with)
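
The XML IN / XML OUT round trip can be sketched as follows, assuming Solr's XML update message (`<add><doc><field name="...">`) for indexing and the `facet_counts` / `facet_fields` section of its XML response for reading facets. The field names and the sample response are made up for the example; a real response carries more sections than shown here.

```python
import xml.etree.ElementTree as ET


def to_solr_add_xml(doc):
    """Serialise one document (field name -> value or list of values) as an <add> message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            # Multi-valued fields are sent as repeated <field> elements.
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


def read_facet_counts(response_xml):
    """Extract {facet field: {value: count}} from an XML search response."""
    root = ET.fromstring(response_xml)
    counts = {}
    path = "./lst[@name='facet_counts']/lst[@name='facet_fields']/lst"
    for field in root.findall(path):
        counts[field.get("name")] = {
            item.get("name"): int(item.text) for item in field.findall("int")
        }
    return counts


# Hand-written sample in the shape of a Solr XML response (illustrative values).
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="3" start="0"/>
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="procedure_type">
        <int name="COD">2</int>
        <int name="INI">1</int>
      </lst>
    </lst>
  </lst>
</response>"""

print(to_solr_add_xml({"id": "doc-1", "language": ["en", "fr"]}))
print(read_facet_counts(SAMPLE_RESPONSE))
```
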
5
Faceted search
EP websites
  • The following slides show examples of faceted search
    as applied on websites developed by the EP:
  • The Legislative Observatory of the EP
  • Public Register of documents
  • IPEX
  • ECPRD

6
Faceted search
EP websites
OEIL: Legislative Observatory of the EP
http://www.europarl.europa.eu/oeil (old version of the site)
7
Faceted search
EP websites: OEIL Legislative Observatory
  • OEIL contains legislative, budgetary, non-legislative
    and internal parliamentary procedures, such as:
  • Co-decision, consultation and assent procedure
  • budgetary and discharge procedures
  • own-initiative reports by the European
    Parliament
  • appointments, waivers of immunity and changes to
    the Rules of Procedure (i.e. internal EP
    procedures) 
  • resolutions and recommendations adopted by the
    European Parliament
  • documents forwarded for information from the
    Commission (during the last 9 months).

8
Faceted search
EP websites: OEIL Legislative Observatory
Situation before implementing faceted search
9
Faceted search
EP websites: OEIL Legislative Observatory
10
Faceted search
EP websites: OEIL Legislative Observatory
  • Challenges for applying facets in OEIL
  • Sequence of facets
      • Parliamentary term,
  • Protocol order for returned matches in the facets
      • Political groups, Commission DGs, etc.
  • Facets that return a very large number of additional criteria
      • Rapporteurs (possibly a few hundred)
  • Facets for structured keyword lists
      • Legal Basis (Treaty to Article)
  • Length of words
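
Two of the challenges above, protocol order and very large facets, can be handled after the counts come back from the search engine. The sketch below is one possible approach, not the EP implementation: values are sorted by a fixed protocol list when one is given, otherwise by descending count, and long lists are cut off with a count of hidden values. The group and rapporteur names are placeholders.

```python
def order_facet_values(counts, protocol_order=None, limit=None):
    """Return (values_to_show, number_hidden) for one facet.

    counts          maps facet value -> number of matches
    protocol_order  optional list giving the required display order
    limit           optional maximum number of values to show
    """
    if protocol_order is not None:
        rank = {value: i for i, value in enumerate(protocol_order)}
        # Values missing from the protocol list go last, alphabetically.
        ordered = sorted(
            counts.items(),
            key=lambda kv: (rank.get(kv[0], len(rank)), kv[0]),
        )
    else:
        # Default: most frequent first, ties broken alphabetically.
        ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    if limit is None:
        return ordered, 0
    return ordered[:limit], max(0, len(ordered) - limit)


# Placeholder data: a short facet with a protocol order...
groups = {"Group C": 40, "Group A": 5, "Group B": 12}
shown_groups, hidden_groups = order_facet_values(
    groups, protocol_order=["Group A", "Group B", "Group C"]
)

# ...and a facet with a few hundred values, truncated for display.
rapporteurs = {"Rapporteur %03d" % i: i for i in range(1, 301)}
shown_rapporteurs, hidden_rapporteurs = order_facet_values(rapporteurs, limit=10)
```

The hidden count can feed a "show more" link, so a facet with a few hundred rapporteurs does not push the other facets off the page.
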

11
Faceted search
EP websites: OEIL Legislative Observatory
12
Faceted search
EP websites: OEIL Legislative Observatory
  • Where facets can help out
  • Date range searches
  • Structured references of procedures or documents
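
One way facets can stand in for date-range inputs and structured reference fields is to split both into separate facetable fields at indexing time. The sketch below assumes a reference of the form `YYYY/NNNN(TYP)`; that pattern, the function name and the field names are assumptions for the example, not the actual OEIL schema.

```python
import re
from datetime import date

# Assumed reference layout: four-digit year, four-digit number, three-letter type.
REFERENCE_PATTERN = re.compile(
    r"^(?P<year>\d{4})/(?P<number>\d{4})\((?P<type>[A-Z]{3})\)$"
)


def derive_facet_fields(reference, event_date):
    """Derive facetable fields from a structured reference and a date."""
    fields = {
        "year": event_date.year,                 # coarse date facet
        "month": event_date.strftime("%Y-%m"),   # finer date facet
    }
    match = REFERENCE_PATTERN.match(reference)
    if match:
        fields["ref_year"] = int(match.group("year"))
        fields["ref_number"] = int(match.group("number"))
        fields["ref_type"] = match.group("type")
    return fields


example_fields = derive_facet_fields("2010/0123(COD)", date(2010, 6, 15))
print(example_fields)
```

With fields like these in the index, picking a year or a reference type is a single facet click instead of a typed date range or reference.
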

13
Faceted search
EP websites
  • Public Register of documents
  • http://www.europarl.europa.eu/registre/

14
Faceted search
EP websites: Public Register
  • Documents accessible through the Register
  • 5 main categories of documents
  • Parliamentary activity
  • EP general information
  • From other institutions and Member States
  • Documents from third parties
  • Budgetary procedure
  • 125 types of documents
  • 362.217 References
  • 2.386.485 Documents (All LV)
  • List defined by EP Bureau

                References   Documents
December 2007   207.069      1.306.059
December 2008   262.000      1.682.774
December 2009   310.760      1.998.330
December 2010   362.217      2.386.485
15
Faceted search
EP websites: Public Register
  • Public Register / Metadata
  • Usually for each document
  • reference number
  • title
  • dates
  • summary
  • authorities
  • authors
  • relations

16
Faceted search
EP websites: Public Register
Situation before implementing faceted search
17
Faceted search
EP websites: Public Register
Situation before implementing faceted search
18
Faceted search
EP websites: Public Register
19
Faceted search
EP websites: Public Register
20
Faceted search
EP websites: Public Register
21
Faceted search
EP websites
IPEX: Interparliamentary EU Information Exchange
http://www.ipex.eu (old version of the site)
22
Faceted search
IPEX: Interparliamentary EU Information Exchange
The IPEX Database contains a complete catalog of
Commission documents from 2006. From each
Commission document users can click on "Related
dossiers" and from there access national scrutiny
pages. Each national scrutiny page contains
documents from the individual national
parliaments relating to the specific Commission
document or legislative procedure. IPEX also
hosts a calendar of interparliamentary
cooperation which contains information concerning
all interparliamentary meetings relating to the
European Union.
23
Faceted search
IPEX: Interparliamentary EU Information Exchange
24
Faceted search
IPEX: Interparliamentary EU Information Exchange
25
Faceted search
IPEX: Interparliamentary EU Information Exchange
Challenge: how to guarantee that the result list
presents the information in its context
  • Dossier
  • Scrutinies
  • Documents
  • Private forums
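
A minimal sketch of one way to keep hits in context: group the flat, relevance-ordered hit list by the dossier each hit belongs to, so that scrutinies, documents and forum entries are shown under their dossier. The `dossier`, `type` and `title` keys and the sample values are hypothetical; the presentation does not describe the actual IPEX data model.

```python
def group_hits_by_dossier(hits):
    """Group relevance-ordered hits under their dossier.

    Dossiers appear in the order of their best-ranked hit, and hits keep
    their relative order inside each dossier.
    """
    groups = {}  # dicts keep insertion order, so ranking is preserved
    for hit in hits:
        group = groups.setdefault(
            hit["dossier"], {"dossier": hit["dossier"], "items": []}
        )
        group["items"].append((hit["type"], hit["title"]))
    return list(groups.values())


# Hypothetical hits, already sorted by relevance.
sample_hits = [
    {"dossier": "D-1", "type": "document", "title": "Proposal text"},
    {"dossier": "D-2", "type": "scrutiny", "title": "Opinion of chamber X"},
    {"dossier": "D-1", "type": "scrutiny", "title": "Opinion of chamber Y"},
]
grouped = group_hits_by_dossier(sample_hits)
```
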
26
Faceted search
IPEX: Interparliamentary EU Information Exchange
27
Faceted search
EP websites
ECPRD: European Center for Parliamentary Research and Documentation
http://www.ecprd.org (private site)
28
Faceted search
ECPRD: European Center for Parliamentary Research and Documentation
29
Faceted search
ECPRD: European Center for Parliamentary Research and Documentation
Situation before implementing faceted search
30
Faceted search
ECPRD: European Center for Parliamentary Research and Documentation
31
Faceted search
ECPRD: European Center for Parliamentary Research and Documentation
32
Faceted search
ECPRD: European Center for Parliamentary Research and Documentation
For the next release, an extension to the current
(new) search implementation is foreseen: using
the key facet for thesaurus entries as a
privileged entry point to find relevant objects
on the site (i.e. taking advantage of the
XML-structured facet output as a way to
navigate to the right records).
33
Faceted search
Conclusions
34
Faceted search
  • Conclusions
  • One size does not fit all
  • Advanced search may be required for
    pre-selection
  • Challenges show when large result lists are
    returned
  • Site-wide searches need to recall the context
    of the object
  • Analysis starts when indexing, not when
    producing result lists

35
Faceted search
  • Conclusions
  • Easily comprehensible and powerful drill-up
    and drill-down feature
  • Flexible to adapt to future queries
  • No empty result lists when drilling down
  • Statistical overview of the results to expect
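
The last two points can be illustrated with a small in-memory sketch (plain Python, not Solr): facet counts are computed from the current result set only, so every value offered to the user matches at least one record, and the count tells the user how many results to expect before clicking. The records and field names are made up for the example.

```python
from collections import Counter


def facet_counts(docs, fields):
    """Count, per field, how many of the given documents carry each value."""
    counts = {field: Counter() for field in fields}
    for doc in docs:
        for field in fields:
            counts[field][doc[field]] += 1
    return counts


def drill_down(docs, field, value):
    """Keep only the documents that carry the selected facet value."""
    return [doc for doc in docs if doc[field] == value]


# Made-up records for illustration.
docs = [
    {"type": "report", "language": "en", "year": 2009},
    {"type": "report", "language": "en", "year": 2010},
    {"type": "resolution", "language": "en", "year": 2010},
    {"type": "report", "language": "fr", "year": 2010},
]

english = drill_down(docs, "language", "en")
remaining = facet_counts(english, ["type", "year"])

# Every value still on offer leads to a non-empty result list.
for field, values in remaining.items():
    for value in values:
        assert drill_down(english, field, value)
```

Drilling up is the reverse step: drop the last selected value and recompute the counts on the wider result set.
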

36
Faceted search
Thanks for your attention! Questions?