The Virtual International Authority File


1
The Virtual International Authority File
  • Thomas Hickey
  • ACIG
  • 2009 July
  • ALA, Chicago IL

2
VIAF participants
  • Bibliothèque nationale de France
  • Deutsche Nationalbibliothek
  • Library of Congress/NACO
  • OCLC
  • National Library of the Czech Republic
  • Egypt (Bibliotheca Alexandrina)
  • National Library of Australia
  • National Library of Israel
  • Italy (ICCU)
  • National Library of Portugal
  • National Library of Spain
  • National Library of Sweden
  • Swiss National Library
  • Vatican Library

3
Goals of the Virtual International Authority File
  • Link national-level authority records
  • Expand the concept of universal bibliographic
    control
  • Allow national or regional variations in
    authorized form to co-exist
  • Support needs for variations in preferred
    language, script and spelling
  • Play a role in the emerging semantic web

4
Scope of VIAF
  • Personal names
  • Geographic
  • Corporate
  • Title
  • Family
  • Events
  • Everything but concepts is considered in scope
  • National level, but willing to consider other
    sources

5
A standard problem: One name, multiple people
Fournier, Marcel, 1945-
Fournier, Marcel
Fournier, Marcel, 1946-
6
Another standard problem: One person, multiple personas
Robb, J. D., 1950-
7
Fundamental to VIAF: One persona, many representations
viaf.org/viaf/29541064
8
  • Matching process

9
Brief LC authority
  • 010 n 84044261
  • 040 DLC $c DLC $d DLC
  • 100 1 Larson, Jack.
  • 670 Thomson, V. The cat, c1982 $b t.p. (Jack Larson)

10
Enhancing the authorities
Authority Record
11
Mining the bibliographic record
LDR 00826ccm 2200289 a 4500
001 ocm10025532
005 20031229650847.0
008 840627s1982 nyuuua n eng
010 $a 84758340
040 $a DLC $c DLC
019 $a 17706440
020 $c 2.95
028 22 $a 48418 $b G. Schirmer
045 2 $b d198006 $b d198007
048 $b va01 $b ve01 $a ka01
050 00 $a M1529.3 $b .T
100 1 $a Thomson, Virgil, $d 1896-
245 14 $a The cat $b duet for soprano and baritone / $c Virgil Thomson words by Jack Larson.
260 $a New York $b G. Schirmer, $c c1982.
300 $a 1 score (11 p.) $c 31 cm.
500 $a For soprano, baritone, and piano.
650 0 $a Vocal duets with piano.
600 10 $a Larson, Jack $x Musical settings.
700 1 $a Larson, Jack.
12
Information in bibliographic records
  • He is a lyricist
  • His primary subject area is music
  • He was published in the 80s and 90s by G.
    Schirmer and Belwin Mills in New York
  • Worked with Virgil Thomson and Gerhard Samuel
  • Jack Larson is the only name he has used on his
    publications
  • Etc.
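
The kind of mining listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VIAF's actual code: the field list is a hand-typed subset of the bibliographic record for "The cat", "$" is assumed as the subfield delimiter, and subfields(), mine_bib() and the keys of the result are hypothetical names.

```python
import re

# A few fields from the bibliographic record for "The cat", written as
# (tag, field text) pairs. The "$" subfield delimiter is an assumption.
FIELDS = [
    ("100", "$a Thomson, Virgil, $d 1896-"),
    ("260", "$a New York $b G. Schirmer, $c c1982."),
    ("650", "$a Vocal duets with piano."),
    ("700", "$a Larson, Jack."),
]


def subfields(text):
    """Split '$a foo $b bar' into {'a': 'foo', 'b': 'bar'}, trimming ' ,.'."""
    pairs = re.findall(r"\$(\w)([^$]*)", text)
    return {code: value.strip(" ,.") for code, value in pairs}


def mine_bib(fields, name):
    """Collect facts about `name` from one bibliographic record (hypothetical helper)."""
    facts = {"coauthors": [], "place": None, "publisher": None,
             "year": None, "subjects": []}
    for tag, text in fields:
        sf = subfields(text)
        if tag in ("100", "700"):
            # Any other name on the record is treated as a collaborator.
            if sf.get("a") and sf["a"] != name:
                facts["coauthors"].append(sf["a"])
        elif tag == "260":
            facts["place"] = sf.get("a")
            facts["publisher"] = sf.get("b")
            year = re.search(r"\d{4}", sf.get("c", ""))
            facts["year"] = int(year.group()) if year else None
        elif tag == "650":
            facts["subjects"].append(sf["a"])
    return facts


facts = mine_bib(FIELDS, "Larson, Jack")
```

Repeating this over every bibliographic record that carries the name would accumulate the publishers, date ranges, collaborators and subjects summarized in the bullets above.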

13
VIAF data flow
(Diagram; labels: VIAF, Deduplication/Disambiguation, VIAF History)
14
Current state
  • Personal names from 16 files
  • Names are clustered
  • 10.4 million names
  • 8.7 million clusters
  • Identifiers assigned
  • http://viaf.org/viaf/77390479
  • Preliminary work done on geographic names
  • Unicode throughout
  • UNIMARC and MARC-21 supported
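
With 10.4 million names in 8.7 million clusters, the average cluster holds about 1.2 names. One common way to turn pairwise matches into clusters is a union-find structure, sketched below. The slides do not say how VIAF actually builds its clusters, so this is an assumed technique, and the record identifiers are invented for illustration.

```python
def cluster(record_ids, matched_pairs):
    """Group records into clusters given pairwise matches (union-find sketch)."""
    parent = {r: r for r in record_ids}

    def find(r):
        # Follow parent links to the root, halving the path as we go.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in matched_pairs:
        parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for r in record_ids:
        clusters.setdefault(find(r), []).append(r)
    return list(clusters.values())


# Hypothetical record identifiers from three source files.
records = ["LC:1", "DNB:1", "BNF:1", "LC:2"]
pairs = [("LC:1", "DNB:1"), ("DNB:1", "BNF:1")]
result = cluster(records, pairs)

average = 10.4e6 / 8.7e6  # names per cluster, from the figures above
```

Note that LC:1 and BNF:1 end up in the same cluster even though they were never matched directly. That transitivity is convenient, but it is also how a single weak link can produce a long chain of records.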

15
VIAF interface is built on top of SRU
  • SRU grew out of Z39.50
  • VIAF is SRU plus URL-rewrite rules and
    content-negotiation
  • Also modified to allow the return of records without
    the SRU XML wrapper
  • New query parameter: HTTP Accept
  • http://viaf.org/search?query=cql.any+all+"dempsey"
    &httpAccept=application/rss%2Bxml
  • Allows support of OpenSearch (RSS returned)

16
URI Patterns and Linked Data
  • VIAF Record
  • Content negotiation
  • HTTP headers or SRU extension
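
The two content-negotiation routes can be compared side by side. The sketch below only builds the requests and does not send them. The httpAccept parameter name and the /search endpoint come from the SRU slide; the application/rdf+xml media type in the header example is an assumption about what the server offers, and via_header() and via_parameter() are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import Request

RECORD_URI = "http://viaf.org/viaf/77390479"  # identifier from the "Current state" slide
SEARCH_URL = "http://viaf.org/search"         # endpoint as shown on the slides


def via_header(uri, media_type):
    """Route 1: ask for a representation with a standard HTTP Accept header."""
    return Request(uri, headers={"Accept": media_type})


def via_parameter(cql_query, media_type):
    """Route 2: carry the same preference in the httpAccept query parameter."""
    params = {"query": cql_query, "httpAccept": media_type}
    return SEARCH_URL + "?" + urlencode(params)


# "application/rdf+xml" is an assumed media type, used only for illustration.
req = via_header(RECORD_URI, "application/rdf+xml")
url = via_parameter('cql.any all "dempsey"', "application/rss+xml")
```

The parameter route is useful where the client cannot set headers, for example a plain link in a web page or an OpenSearch description.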

17
SRU Searching
  • Retrieve record by internal control number
  • http://viaf.org/search?query=cql.any+all+"NKCjn19990008936"
  • Results list for George Washington
  • http://viaf.org/search
  • ?query=local.mainHeadingEl+all+"george%20washington"
  • &stylesheet=xsl/results.xsl
  • &sortKeys=holdingscount
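
Search URLs like these are easier to get right when a library does the escaping. A minimal sketch, assuming the endpoint, index names and extra parameters shown on the slide; cql_clause() and search_url() are hypothetical helpers, and the encoding produced by urlencode (for example "+" for spaces) may differ cosmetically from the URLs on the slide.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://viaf.org/search"  # endpoint as shown on the slides


def cql_clause(index, relation, term):
    """Build one CQL clause, quoting the term and escaping embedded quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'{index} {relation} "{escaped}"'


def search_url(index, term, relation="all", **extra):
    """Assemble a search URL; extra keyword arguments become query parameters."""
    params = {"query": cql_clause(index, relation, term)}
    params.update(extra)
    return BASE + "?" + urlencode(params)


url = search_url("local.mainHeadingEl", "george washington",
                 stylesheet="xsl/results.xsl", sortKeys="holdingscount")

# Decoding the URL again recovers the original CQL query and parameters.
decoded = parse_qs(urlsplit(url).query)
```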

18
  • Matching

19
What makes a match?
  • 1,705,555 Title
  • 846,722 Double date
  • 123,487 Joint author
  • 71,851 LCCN
  • 24,587 Partial date and partial title
  • 11,010 Partial date and publisher
  • 9,179 Partial title and publisher
  • 6,415 Name as subject
  • 3,168 Standard number

20
Consensus
21
Little consensus
22
Date variations are common
23
Occasional long chain
24
  • Example

25
Search results for Sharabi
27
Next steps
  • More participants
  • More name types (geographics, corporates, etc.)
  • More variety of sources
  • Rights agencies, ISNI
  • Regional files
  • Specialized files

28
Possible applications within OCLC
  • FRBR matching
  • Better matching of non-English metadata
  • Uniform identifier across all languages
  • Authority control for cataloging
  • Better regionalization of WorldCat.org
  • Minimize differences across languages of
    cataloging

29
Discussion
  • How would you use VIAF?
  • How important is VIAF?
  • Will anyone use linked-data URIs?