Topic-based approaches to searching

1
Topic-based approaches to searching
Judith Pearce and Basil Dewhurst, National Library
of Australia
2
What I'll cover
  • The ongoing need for controlled vocabularies and
    authority control
  • How they are currently being deployed in resource
    discovery
  • Doing more with authority files through a
    topic-based approach to searching
  • People Australia as exemplar

3
Assertion
  • Findability is improved through the use of
    controlled vocabularies

4
Why
  • A single word may have multiple meanings
  • Different words may mean the same thing
  • A concept may be a subset of a broader concept or
    related to another concept
  • People may have many names
  • The same name may be used by different people

5
Types of controlled vocabularies
  • Simple enumerated lists
  • Taxonomies
  • Synonym rings
  • Thesauri
  • Rules-based
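As a rough illustration of how the first four types differ as data structures, here is a minimal Python sketch. All terms and the helper names (`expand_with_ring`, `ancestors`, `preferred`) are hypothetical sample material, not taken from the presentation, and rules-based vocabularies are not modelled.

```python
# Hypothetical sample data; none of these terms come from the presentation.

# Simple enumerated list: a closed set of allowed values.
FORMS = {"book", "picture", "map"}

# Synonym ring: terms treated as equivalent, with no preferred form.
SYNONYM_RINGS = [{"film", "movie", "motion picture"}]

# Taxonomy: each term points to its single broader term.
BROADER = {"geology": "earth sciences", "earth sciences": "science"}

# Thesaurus: USE references lead from non-preferred to preferred terms.
USE = {"movies": "motion pictures", "films": "motion pictures"}


def expand_with_ring(term):
    """Return every term in the same synonym ring, or just the term itself."""
    for ring in SYNONYM_RINGS:
        if term in ring:
            return set(ring)
    return {term}


def ancestors(term):
    """Walk the taxonomy upwards and list the broader terms in order."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


def preferred(term):
    """Map a non-preferred term to its preferred form, if one is recorded."""
    return USE.get(term, term)
```

For example, `ancestors("geology")` returns `["earth sciences", "science"]`, and `preferred("films")` returns `"motion pictures"`.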

6
IFLA conceptual models
  • Functional Requirements for Bibliographic Records
    (FRBR)
  • Functional Requirements for Authority Data (FRAD)
  • Functional Requirements for Subject Authority
    Records (FRSAR)

7
Bibliographic entities
  • Person, Family, Corporate body
  • Concept, Object, Event, Place
  • Work, Expression, Manifestation, Item
  • Names, Identifiers, Controlled access point
8
International thesaurus standards
  • ISO 2788:1985 Guidelines for the establishment
    and development of monolingual thesauri
    (BS 5723:1987)
  • ISO 5964:1985 Guidelines for the establishment
    and development of multilingual thesauri
    (BS 6723:1985)

9
New work item proposal
  • Revise the two ISO standards based on
    BS 8723:2005 Structured vocabularies for
    information retrieval
      • Definitions, symbols and abbreviations
      • Thesauri
      • Vocabularies other than thesauri
      • Interoperability between vocabularies
      • Exchange formats and protocols for
        interoperability

10
Other work
  • ANSI/NISO Z39.19-2003 Guidelines for the
    Construction, Format, and Management of
    Monolingual Thesauri
      • Freely available
      • Single part
      • Readable
      • Addresses interoperability issues
      • Touches on multilingual issues

11
Schemas
  • Authority data
      • International Standard Archival Authority
        Record for Corporate Bodies, Persons, and
        Families (ISAAR)
      • MARC format for authority data
      • Metadata Authority Description Schema (MADS)
  • Thesauri
      • Z39.50 ZTHES
      • Simple Knowledge Organization System (SKOS)
  • Parties
      • Encoded Archival Context (EAC)
      • DC Agents
      • International Standard Name Identifier (ISNI)
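To give a feel for one of the thesaurus schemas listed, the sketch below renders a concept as SKOS in Turtle syntax using plain string formatting. The function name `skos_concept` and the `example.org` URIs are assumptions made for illustration; only the SKOS namespace and the `prefLabel`, `altLabel` and `broader` properties are part of SKOS itself. Labels are not escaped, so this is not suitable for arbitrary input.

```python
# Prefix declaration that must precede the concepts in a Turtle document.
PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def skos_concept(uri, pref_label, alt_labels=(), broader=None, lang="en"):
    """Render one concept as a Turtle statement using core SKOS properties.

    Assumes labels contain no double quotes or backslashes.
    """
    props = [f'skos:prefLabel "{pref_label}"@{lang}']
    props += [f'skos:altLabel "{alt}"@{lang}' for alt in alt_labels]
    if broader:
        props.append(f"skos:broader <{broader}>")
    body = " ;\n    ".join(["a skos:Concept"] + props)
    return f"<{uri}>\n    {body} ."


if __name__ == "__main__":
    print(PREFIX)
    print(skos_concept(
        "http://example.org/concept/geology",      # hypothetical URI
        "Geology",
        alt_labels=["Earth history"],              # hypothetical entry term
        broader="http://example.org/concept/earth-sciences",
    ))
```

The preferred label plays the role of the controlled access point, while alternative labels carry the non-preferred forms.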

12
Deployment (traditional)
  • Keyword searching
      • Part of resource description
  • Limits and filters
      • Coded values not directly searchable
  • Fielded searching
      • Two-step process to exploit non-preferred
        forms
  • Subject hierarchies
  • Authority files
      • Mainly a librarian's tool
      • Need for user training and intermediation
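The two-step process mentioned for fielded searching can be sketched as follows: first resolve what the user typed against the authority file, then run the fielded search on the authorised heading. The authority entries, heading forms and record titles below are illustrative assumptions, not data from any real authority file or catalogue.

```python
# Hypothetical authority file: non-preferred form -> authorised heading.
AUTHORITY = {
    "mawson, d.": "Mawson, Douglas, Sir",
    "douglas mawson": "Mawson, Douglas, Sir",
}

# Hypothetical catalogue records indexed under authorised headings only.
RECORDS = [
    {"id": 1, "title": "Record one", "author": "Mawson, Douglas, Sir"},
    {"id": 2, "title": "Record two", "author": "Mawson, Douglas, Sir"},
    {"id": 3, "title": "Record three", "author": "Another, Person"},
]


def authorised_heading(name):
    """Step 1: look the name up in the authority file (None if unknown)."""
    return AUTHORITY.get(name.strip().lower())


def fielded_search(field, value):
    """Step 2: exact-match search on a single field; returns record ids."""
    return [r["id"] for r in RECORDS if r[field] == value]


def two_step_search(name):
    """Resolve the name first, then search on the authorised heading."""
    heading = authorised_heading(name) or name
    return fielded_search("author", heading)
```

With this data, `fielded_search("author", "Douglas Mawson")` finds nothing, while `two_step_search("Douglas Mawson")` returns records 1 and 2, which is why the traditional approach depends on the user (or an intermediary) knowing to consult the authority file first.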

13
Deployment (new generation)
  • Google-like search
  • Relevance ranking
  • Faceted clustering
      • Form (books, pictures, maps)
      • Names
      • Subjects
      • Dates
      • Audience
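Faceted clustering amounts to counting the values of a field across the current result set. The sketch below shows one way to do that with `collections.Counter`; the result records and field names are hypothetical.

```python
from collections import Counter

# Hypothetical result set; the field names are assumptions for illustration.
RESULTS = [
    {"title": "Record A", "form": "book", "subjects": ["Antarctica", "Exploration"]},
    {"title": "Record B", "form": "picture", "subjects": ["Antarctica"]},
    {"title": "Record C", "form": "book", "subjects": ["Geology"]},
]


def facet_counts(results, field):
    """Count how many results carry each value of the given facet field.

    Single-valued and multi-valued (list) fields are both accepted.
    Returns (value, count) pairs, most frequent first.
    """
    counts = Counter()
    for record in results:
        value = record.get(field, [])
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    return counts.most_common()


def apply_facet(results, field, wanted):
    """Keep only the results whose facet field contains the wanted value."""
    kept = []
    for record in results:
        value = record.get(field, [])
        values = value if isinstance(value, list) else [value]
        if wanted in values:
            kept.append(record)
    return kept
```

Here `facet_counts(RESULTS, "form")` gives `[("book", 2), ("picture", 1)]`, and `apply_facet(RESULTS, "form", "book")` narrows the set to Records A and C.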

14
Outstanding issues
  • What the user keys in may not match controlled
    access points
  • Limitations of faceted clustering
      • How many topics to display
      • Selecting a facet limits the search
      • Increases precision but not recall
  • Topic attributes not exploited in the simple
    keyword search
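The point that selecting a facet increases precision but not recall follows from the definitions: a facet only removes records from the retrieved set, so it cannot add relevant records that the original search missed. A small worked sketch, using made-up record ids:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical record ids: four records are actually relevant to the user.
RELEVANT = {1, 2, 3, 4}

# A keyword search finds two relevant records and two irrelevant ones.
KEYWORD_HITS = {1, 2, 5, 6}

# Choosing a facet removes the irrelevant records but finds nothing new.
AFTER_FACET = {1, 2}

# Matching a non-preferred name form as well pulls in one more relevant record.
AFTER_EXPANSION = {1, 2, 3, 5, 6}

if __name__ == "__main__":
    print(precision_recall(KEYWORD_HITS, RELEVANT))     # (0.5, 0.5)
    print(precision_recall(AFTER_FACET, RELEVANT))      # (1.0, 0.5)
    print(precision_recall(AFTER_EXPANSION, RELEVANT))  # (0.6, 0.75)
```

Recall only moves when the search itself can reach records indexed under other forms of the topic, which is the gap that exploiting topic attributes is meant to close.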

15
Topic-based searching
16
Another way of looking at FRBR
  • Topic: Party, Concept, Event, Place, Resource
  • Resource: Work, Expression, Manifestation,
    Item / Object
17
People Australia
  • An online service that:
      • Takes a topic-based approach to resource
        discovery
      • Provides access to records about people and
        organisations
      • Allows users to find and get resources by or
        about a person or organisation
      • Interoperates with contributors' services

18
Precursors
  • Bright Sparcs
  • Australian Women's Archives Project
  • Australian Trade Union Archive
  • Dictionary of National Biography
  • Dictionary of Australian Artists Online
  • Archive manuscript collections
  • Australia Dancing
  • Music Australia

19
Content
  • Australian Name Authority File
      • 200,000 Australian names
      • Some contextual information
  • Contributor data
      • 24 archives, museums and libraries
      • Over 47 services
      • Rich sources of information
  • Resources by or about Australian people in
    Australian collections

20
Architecture
  • Harvesting contribution model
  • Records clustered under a single identifier
  • Harvested records fully searchable
  • Hub or lens pages for each person
  • Links to information by or about the person
  • Deep linking to contributor services
  • Both people and organisations
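The idea of clustering harvested records under a single identifier and exposing a hub page per person can be sketched as below. This assumes the matching step has already assigned a shared `party_id` to each harvested record; the field names, identifiers and contributor URLs are all hypothetical and say nothing about how People Australia actually matches or stores records.

```python
from collections import defaultdict

# Hypothetical harvested records, already matched to a shared party_id.
HARVESTED = [
    {"source": "contributor-a", "party_id": "party-1",
     "name": "Mawson, Douglas", "url": "https://contributor-a.example/rec/10"},
    {"source": "contributor-b", "party_id": "party-1",
     "name": "Mawson, Douglas, Sir", "url": "https://contributor-b.example/p/77"},
    {"source": "contributor-a", "party_id": "party-2",
     "name": "Example Organisation", "url": "https://contributor-a.example/rec/11"},
]


def build_clusters(records):
    """Group harvested records by the identifier they were matched to."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record["party_id"]].append(record)
    return dict(clusters)


def lens_page(party_id, clusters):
    """Summarise one cluster: all name forms plus deep links to contributors."""
    records = clusters[party_id]
    return {
        "identifier": party_id,
        "names": sorted({r["name"] for r in records}),
        "links": [(r["source"], r["url"]) for r in records],
    }
```

Calling `lens_page("party-1", build_clusters(HARVESTED))` yields one page carrying both name forms and a deep link back to each contributor's own record.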

21
Example
  • Sir Douglas Mawson was a geologist and explorer
    famous for leading the Australasian Antarctic
    Expedition, 1911-1914
  • A People Australia page for Mawson would provide:
      • Biographical information from contributor
        records
      • Bibliographic records from Libraries Australia
      • Image records from Picture Australia
      • Newspaper articles from the new Newspapers
        Discovery Service
      • Related scholarship from the ARROW Discovery
        Service
      • Museum objects
      • Archival resources

22
Data Schema
  • Based on the Encoded Archival Context (EAC)
    standard
  • Alternatives considered
      • MARC/MARCXML/MADS
      • Interparty
      • Dublin Core Agent
      • Music Australia Party Schema (MAPS)

23
System interfaces
  • Harvest
      • OAI-PMH
  • Search
      • SRU, OpenSearch, Z39.50 (via SRU gateway)
  • Augment
      • SRU Update, OAI-PMH
  • Syndicate
      • OAI-PMH, COinS, RSS, etc.
  • Resolve
      • Info URI (to lens page)
      • Search protocols or OpenURL (to metadata)
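As an illustration of the Search interface, an SRU request is an ordinary HTTP GET whose parameters carry the operation and a CQL query. The sketch below only builds the URL. The base URL and the `record_schema` value are placeholders, since the presentation does not give the service's endpoint or schema identifiers; the parameter names are those of SRU 1.1.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, max_records=10, record_schema=None):
    """Build an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    if record_schema:
        # Which schema identifiers a server accepts is server-specific.
        params["recordSchema"] = record_schema
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint; a bare term is a valid CQL query.
    print(sru_search_url("https://sru.example.org/people", "Mawson"))
```

A client would fetch that URL and parse the XML `searchRetrieveResponse`; which record schemas are on offer would need to be confirmed against the server's explain record.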

24
OAI-PMH
  • Harvesting
      • Metadata formats
          • EAC
          • Dublin Core
          • MARCXML
  • Syndicating
      • EAC
      • Dublin Core
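A harvest over OAI-PMH is a loop of `ListRecords` requests, following the `resumptionToken` until the repository stops returning one. The sketch below builds the request URLs and parses a response using only the standard library. The base URL and sample response are made up and heavily abbreviated; `oai_dc` is the Dublin Core prefix every OAI-PMH repository must support, while the prefixes used for EAC or MARCXML depend on the repository and are not stated in the presentation.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    path = ".//oai:record/oai:header/oai:identifier"
    identifiers = [el.text for el in root.findall(path, OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Abbreviated, made-up response: real responses also carry responseDate,
# request and a metadata element for each record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>token-123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""
```

A harvester would call `parse_list_records` on each response and request `list_records_url(base, resumption_token=token)` until the token comes back as `None`.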

25
Project progress
  • Work is proceeding on building the Search and
    Harvest services
  • Search interfaces will be available in the
    second quarter of 2008
  • Prototypes will be made widely available for
    user testing and feedback
  • A public website will be created to provide:
      • Information about the project
      • Regular news
      • Technical information

26
Topic-based approaches to searching
  • Judith Pearce
      • Phone 02 6262 1425
      • Email jpearce_at_nla.gov.au
  • Basil Dewhurst
      • Phone 02 6262 1046
      • Email bdewhurst_at_nla.gov.au