Is Metasearching Really Better Searching?

1
Is Metasearching Really Better Searching?
STM Innovations Seminar, London, Friday 2 December 2005
Pete Johnston, Research Officer, UKOLN, University of Bath
UKOLN is supported by
www.bath.ac.uk
2
Is Metasearching Better Searching?
  • What is metasearch?
  • Making metasearch work
  • The NISO Metasearch Initiative
  • Metasearch today
  • Metasearch and Google
  • Metasearch and "social bookmarking"

3
What is metasearch?
4
What is metasearch?
Metasearch, parallel search, federated search,
broadcast search, cross-database search, search
portal are a familiar part of the information
community's vocabulary. They speak to the need
for search and retrieval to span multiple
databases, sources, platforms, protocols, and
vendors at one time.
NISO MetaSearch Initiative, http://www.niso.org/committees/MS_initiative.html
5
The search problem
  • User wants to find, access, and use items made
    available by multiple content providers
  • Content providers make their collections
    available through their own separate
    presentation services
  • User interacts with multiple services in
    succession, e.g.
      • Query Resource Discovery Network (RDN) for Web
        resources
      • Query Zetoc for journal articles
      • etc

6
The search problem
7
The search problem
  • User has to
      • Discover different services
      • Manage different authentication/access
        requirements
      • Use different user interfaces for search
      • Interpret different result sets
          • different metadata
      • Manipulate different result sets
          • human-readable (HTML)
          • but difficult to merge, reuse
  • May still not have access to (appropriate copy
    of) resource

8
The metasearch solution
  • The provision of "metasearch" services that
      • enable user to search across the metadata
        databases of multiple content providers from a
        single interface
      • manage multiple result sets and present to user
      • manage authentication/access
      • (etc!)
  • Seamless (to the user) discovery of and access to
    heterogeneous, distributed resources!
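
One way to picture "manage multiple result sets and present to user" is the sketch below. It maps each provider's field names onto a common form and merges records that share a title. The provider names, field names, and the title-based matching rule are all assumptions made for illustration; the slides do not prescribe a merging method.

```python
def normalise(record, field_map):
    """Rename provider-specific fields to common ones (mapping is hypothetical)."""
    return {common: record[local] for local, common in field_map.items() if local in record}


def merge_result_sets(result_sets):
    """Merge records from several providers, treating equal titles as duplicates."""
    merged = {}
    for source, (records, field_map) in result_sets.items():
        for rec in records:
            item = normalise(rec, field_map)
            key = item["title"].strip().lower()  # naive match key, for illustration only
            entry = merged.setdefault(key, {**item, "sources": []})
            entry["sources"].append(source)
    return list(merged.values())


# Two hypothetical providers describing the same item with different field names.
result_sets = {
    "provider_a": (
        [{"Title": "Metasearch overview", "URL": "http://example.org/a/1"}],
        {"Title": "title", "URL": "link"},
    ),
    "provider_b": (
        [{"dc_title": "metasearch overview ", "dc_identifier": "http://example.org/b/9"}],
        {"dc_title": "title", "dc_identifier": "link"},
    ),
}

merged = merge_result_sets(result_sets)
```

Here `merged` holds a single entry that records both providers as sources. Matching on title alone is fragile, which is part of why consistent metadata across providers matters.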

9
Approaches to metasearch (1): cross-searching
  • Metasearch service accepts user query
  • Sends query to multiple content provider search
    targets
  • Receives responses from targets
  • Presents result sets to user
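
The four steps above can be sketched as a small fan-out routine. This is a minimal illustration, assuming in-memory stand-ins for the remote targets; the target names and their contents are hypothetical, and a real service would issue network requests through a protocol such as Z39.50 or SRU.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "targets" standing in for remote search services.
TARGETS = {
    "rdn": ["Metasearch tutorial", "Chemistry gateway"],
    "zetoc": ["Metasearch in libraries", "Protein folding review"],
}


def search_target(name, query):
    """Run the query against one target and return (target name, matching titles)."""
    hits = [title for title in TARGETS[name] if query.lower() in title.lower()]
    return name, hits


def cross_search(query, targets=TARGETS):
    """Send the same query to every target in parallel and collect the result sets."""
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = [pool.submit(search_target, name, query) for name in targets]
        return dict(future.result() for future in futures)


results = cross_search("metasearch")
```

The overall response time is bounded by the slowest target, since the service waits for every future before presenting results.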

10
Z39.50, SRW, SRU, etc
11
Approaches to metasearch (2): harvesting
  • Metasearch service periodically gathers metadata
    records from content provider repositories into
    local database
  • Metasearch service accepts user query
  • Executes query on local database
  • Presents result sets to user
  • Some harvesting services may also harvest/index
    copy of resource

12
OAI-PMH
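
As a rough illustration of harvesting with OAI-PMH, the sketch below parses a ListRecords response into simple records for a local database. The sample XML is a hand-written, abbreviated response (identifier, title, and token values are invented); the namespaces and the `resumptionToken` element are those defined by OAI-PMH 2.0 and unqualified Dublin Core.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Abbreviated, hand-written sample response; all values are hypothetical.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Metasearch overview</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>batch-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return (records, resumption token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext(".//dc:title", namespaces=NS)
        records.append({"id": identifier, "title": title})
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, token


records, token = parse_list_records(SAMPLE_RESPONSE)
# The harvester would store the records locally and answer user queries from there.
local_index = {rec["id"]: rec for rec in records}
```

A non-empty token means the repository has more records; the harvester would repeat the request with that token until no token is returned.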
13
Cross-searching and harvesting
  • Metasearch service may use both in combination!
  • Cross-search
      • Latest results returned
      • Content provider controls searches available
      • May slow overall performance
  • Harvesting
      • Better performance for user query
      • Options for normalisation etc by harvester
      • Only as up-to-date as last harvest

14
A hospitable climate for metasearch?
  • Metasearch service depends on access to metadata
  • Web Services
      • Standards for providing machine interfaces to
        applications on Web
      • Based on HTTP and XML
      • SOAP (messaging protocol), WSDL (service
        description), WS-* (!!)
      • WS not just for search!
      • Service-oriented approaches, modular applications
      • Google and Amazon provide Web Services
  • "Web 2.0"
      • "The Web as platform"
      • Recombining data and services from multiple
        sources

15
The problems with metasearch
  • User requires/expects resources from increasing
    range of content providers
  • What if content provider doesn't implement
    standard search/harvest interface?
  • Some proprietary APIs, "XML Gateways"
  • Scalability
  • Some "screen-scraping"
  • Parsing of HTML pages to obtain metadata
  • Rights issues
  • Scalability, volatility
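
To make "screen-scraping" concrete, here is a minimal sketch that pulls metadata out of HTML `<meta>` elements using Python's standard `html.parser`. The sample page and its `DC.*` names are assumptions for illustration; many pages expose no such tags, and a scraper then has to depend on page layout, which is where the volatility comes from.

```python
from html.parser import HTMLParser


class MetaTagScraper(HTMLParser):
    """Collect name/content pairs from <meta> elements in an HTML page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr = dict(attrs)
            if attr.get("name") and attr.get("content"):
                self.metadata[attr["name"]] = attr["content"]


# Hypothetical page; a real target page may look quite different.
SAMPLE_PAGE = """<html><head>
<meta name="DC.title" content="Metasearch overview">
<meta name="DC.creator" content="A. Author">
</head><body><p>Page body</p></body></html>"""

scraper = MetaTagScraper()
scraper.feed(SAMPLE_PAGE)
scraped = scraper.metadata
```

If the content provider renames or removes these elements, the scraper silently returns less metadata, so this approach needs ongoing maintenance per target.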

16
The problems with metasearch
  • Metasearch services work, but...
  • For service provider
      • complex, laborious
      • fragile, susceptible to change by content
        provider
      • duplication of effort by service providers
  • For content provider
      • concerns over efficiency
      • concerns over access management
      • rights, branding, results presentation/ranking

17
Making metasearch work
18
Making metasearch work
  • Effective metasearch requires agreements between
    content providers and service providers
  • Transport protocol(s)
  • Query language(s)
      • syntax and semantics
  • Metadata schemas
      • syntax and semantics
  • Metadata quality
      • presence of values, formats of literals etc
  • Intellectual property rights issues
      • how metadata records and resources are presented,
        used
  • Authorisation / authentication
  • Disclosure / discovery of collections and services

Andy Powell, "Metasearching: an overview",
Presentation to BCS EPSG Seminar, July 2004
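
The "metadata quality" point (presence of values, formats of literals) could be checked along the lines of the sketch below. The list of required fields and the date pattern are assumptions chosen for illustration; an actual agreement between providers would define its own rules.

```python
import re

# Hypothetical agreement: these fields must be present and non-empty.
REQUIRED_FIELDS = ("title", "identifier", "date")
# Assumed literal format for dates: YYYY, YYYY-MM, or YYYY-MM-DD.
ISO_DATE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def quality_problems(record):
    """Return a list of human-readable problems found in one metadata record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field, "").strip():
            problems.append(f"missing {field}")
    date = record.get("date", "").strip()
    if date and not ISO_DATE.match(date):
        problems.append(f"date not in YYYY[-MM[-DD]] form: {date}")
    return problems


good = {"title": "Metasearch overview", "identifier": "oai:example.org:1", "date": "2005-12-02"}
bad = {"title": "Untitled", "identifier": "", "date": "2 December 2005"}

good_report = quality_problems(good)
bad_report = quality_problems(bad)
```

A service provider could run such checks at harvest time and report problems back to the content provider, rather than discovering them when a user query fails to match.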
19
The NISO Metasearch Initiative
  • Response to concerns of librarians, systems
    vendors, content providers
  • Aims to enable
      • metasearch service providers to offer more
        effective and responsive services
      • content providers to deliver enhanced content and
        protect their intellectual property
      • libraries to deliver services that distinguish
        their services from Google and other free web
        services

NISO MetaSearch Initiative, http://www.niso.org/committees/MS_initiative.html
20
Task Group 1: Access Management
  • Conducted survey of authentication methods in use
  • Developed use cases for authentication in
    metasearch context
  • Ranked methods by ability to satisfy needs of use
    cases
  • Recommends either
      • IP-Authentication with a Proxy Server, or
      • Username/Password authentication
  • Liaison with Shibboleth community

21
Task Group 2: Collection Description
  • Metasearch service needs information about
    targets available for search/harvest
  • Discover collections of potential interest
  • Obtain sufficient information to identify a
    collection
  • Select one or more collections from amongst a
    number of discovered collections
  • Discover the services that provide access to the
    collection
  • Select a service with which to interact
  • Interact with service

Collection description
Service description
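
The discover/select/interact sequence can be sketched with two small record types, one for collections and one for the services that give access to them. The class names, fields, registry entries, and access points are all hypothetical; they are not taken from the NISO specifications.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    protocol: str        # e.g. "SRU", "Z39.50", "OAI-PMH"
    access_point: str    # hypothetical address of the service


@dataclass
class Collection:
    title: str
    subjects: list
    services: list = field(default_factory=list)


# Hypothetical registry of collection descriptions with their service descriptions.
REGISTRY = [
    Collection("Engineering e-prints", ["engineering"],
               [Service("OAI-PMH", "http://example.org/oai")]),
    Collection("Medical journals index", ["medicine"],
               [Service("SRU", "http://example.org/sru"),
                Service("Z39.50", "example.org:210")]),
]


def discover(subject):
    """Discover collections of potential interest by subject."""
    return [c for c in REGISTRY if subject in c.subjects]


def select_service(collection, supported):
    """Pick the first service whose protocol the metasearch service supports."""
    for service in collection.services:
        if service.protocol in supported:
            return service
    return None


found = discover("medicine")
chosen = select_service(found[0], supported={"SRU"})
```

Keeping collection and service descriptions separate lets one collection be reached through several services, which matches the collection/service distinction drawn on the slide.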
23
Task Group 2: Collection Description
  • Collection Description Specification
      • Metadata schema for collection-level description
      • Closely aligned with DCMI Collection Description
        Application Profile
      • Title, Subject, Size, Language, Item Type, Owner,
        Collector, Audience, Rights etc
      • Whole/Part relationships
      • Collection/Catalogue relationships
      • Collection/Service relationships

24
Task Group 2: Collection Description
  • Information Retrieval Service Description
    Specification
      • Describe those digital services that provide
        access to collections
      • ZeeRex
      • Indicates protocol used
      • Describes access point(s) for service
      • Describes authentication/authorization
        requirements
      • Lists operations/queries supported

25
Task Group 3: Search/Retrieve
  • Result Set Metadata
      • Metadata schema to describe result set and record
        within result set
      • To support ranking, branding etc
  • Citation Metadata
      • Metadata schema for citation components (based on
        subset of OpenURL)
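
The NISO citation schema itself is not shown in the slides, so the sketch below only illustrates the OpenURL idea it builds on: citation components encoded as key/value pairs. It uses the OpenURL (Z39.88-2004) key/value format for journal articles; the citation values are invented.

```python
from urllib.parse import urlencode, parse_qs


def citation_to_openurl_query(citation):
    """Encode citation components as an OpenURL key/value query string."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
    ]
    # Each citation component becomes an rft.* ("referent") key.
    params += [(f"rft.{key}", value) for key, value in citation.items()]
    return urlencode(params)


# Hypothetical citation, for illustration only.
citation = {
    "atitle": "Metasearch overview",
    "jtitle": "Example Journal",
    "volume": "12",
    "spage": "34",
    "date": "2005",
}

query = citation_to_openurl_query(citation)
decoded = parse_qs(query)
```

Appending such a query string to a link resolver's base address is how a result record can lead the user to an appropriate copy of the item.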

26
Task Group 3: Search/Retrieve
  • NISO XML Gateway
      • Based on SRU ("non-conformant subset")
      • Query encoded in URI, transmitted in HTTP GET,
        response as XML document
      • Three levels of implementation
          • Level 0: Any query grammar
          • Level 1: Provide description record for database
          • Level 2: Support CQL
  • Liaison with A9 OpenSearch
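
"Query encoded in URI, transmitted in HTTP GET" can be illustrated with a standard SRU searchRetrieve request. The parameter names below are those of SRU 1.1; the base address is hypothetical, and since the gateway is described as a non-conformant subset of SRU, its exact parameters may differ from this sketch.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_sru_url(base_url, cql_query, max_records=10):
    """Build an SRU 1.1 searchRetrieve URL carrying a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical target address; the query is a simple CQL title search.
url = build_sru_url("http://example.org/sru", 'dc.title = "metasearch"')
decoded = parse_qs(urlsplit(url).query)
```

Because the whole request is a URL, a target can be tested from a browser, and the XML response can be parsed with ordinary XML tools.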

27
Metasearch today
28
Metasearch and Google
  • Google
      • Harvests full-text of Web pages by following
        links
      • Makes indexes available for search
      • Result ranking based on number of links to page
  • Index coverage limited to "visible Web"
  • Problems with
      • Authentication controls
      • Non-persistent URIs
      • Non-textual resources
  • Even if indexed, low ranking if few links
  • No fielded searching
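
The point "low ranking if few links" can be seen in a toy model that ranks pages purely by how many other pages link to them. This is a deliberate simplification of link-based ranking, using a made-up link graph; it is not a description of how Google actually computes rankings.

```python
from collections import Counter

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def rank_by_inlinks(links):
    """Order pages by number of incoming links, most-linked first."""
    counts = Counter({page: 0 for page in links})
    for targets in links.values():
        counts.update(targets)
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(counts, key=lambda page: (-counts[page], page))


ranking = rank_by_inlinks(LINKS)
```

Page "d" is indexed but has no incoming links, so it ranks last whatever its content, which is the situation many repository items are in.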

29
Metasearch and Google
  • "Success is as much about what you don't search
    as what you do"
  • Selection is important
  • Relevance of results not determined only by
    links, citations
      • e.g. often useful/vital to select/filter by
        audience, purpose of resource

Roy Tennant, "Is Metasearch Dead?", http://www.niso.org/news/events_workshops/OpenURL-05-Agen-FINAL.html
30
Metasearch and Google
  • Google interest in indexing "hidden Web"
      • Collaborations with repository providers, OCLC
        etc
      • Google Scholar
  • Google interest in metadata-based approach?
      • Google Base
  • Google and Metasearch as complementary approaches
    to discovery

31
Metasearch and "social bookmarking"
del.icio.us, http://del.icio.us/
32
Metasearch and "social bookmarking"
Connotea, http://www.connotea.org/
33
Metasearch and "social bookmarking"
  • Simple user-generated metadata
      • Typically description plus "tags"
      • Capture user perceptions of resources
      • Some services adding richer metadata
  • Social merging of personal collections
  • Bookmarking services as discovery services
      • Connotea as "community-driven recommendation
        system" (Lund et al)
  • Metadata available via RSS or simple API
  • Can metasearch services use/integrate metadata
    from bookmarking services?
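
As one possible answer to the closing question, the sketch below reads bookmark metadata from an RSS feed and counts tags across items. It assumes an RSS 2.0 feed that carries tags in `<category>` elements; the feed content is invented, and real bookmarking services may encode tags differently.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample feed; titles, links, and tags are hypothetical.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example bookmarks</title>
<item><title>NISO Metasearch Initiative</title><link>http://example.org/1</link>
<category>metasearch</category><category>standards</category></item>
<item><title>OAI-PMH primer</title><link>http://example.org/2</link>
<category>metasearch</category><category>harvesting</category></item>
</channel></rss>"""


def bookmarks_from_rss(xml_text):
    """Extract title, link, and tags for each bookmarked item in the feed."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "tags": [cat.text for cat in item.findall("category")],
        }
        for item in root.iter("item")
    ]


bookmarks = bookmarks_from_rss(SAMPLE_FEED)
tag_counts = Counter(tag for b in bookmarks for tag in b["tags"])
```

A metasearch service could treat such a feed as one more harvested source, using the tags as additional subject terms alongside provider-supplied metadata.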

34
Is Metasearching Better Searching?
  • Technical components for metasearch available
  • User expectations of coverage mean metasearch is
    a cross-domain problem
  • However, quality of metasearch dependent on
      • metadata quality
      • metadata consistency
      • across multiple providers
  • Metasearch can complement other approaches
  • Metasearch as "enabler"
      • supporting construction of many different services
