Metadata: Promise and Practice

Description:

Metadata: Promise and Practice. Jeffrey Beall. Nebraska Library Association ... 2006 |b (Carhenge, locale, Box Butte County, Nebraska, 42°09'40"N 102°51'32"W) ...

1
Metadata: Promise and Practice
  • Jeffrey Beall
  • Nebraska Library Association
  • Technical Services Round Table
  • Spring Meeting, April 25, 2008

2
Outline
  • Introduction
  • 8 theses of my talk
  • About me
  • Metadata and high-quality information retrieval
    value of browse displays
  • Four types of searching in libraries
  • The weaknesses of full-text searching
  • The future of cataloging and the debate
  • Next-generation library interfaces

3
000 01005cz a2200217n 450
001 6940590
005 20061110002734.0
008 060822 anannbabn a ana c
035 __ |a (DLC)6940590
035 __ |a (DLC)sh2006006354
035 __ |a (DLC)351667
906 __ |t 0645 |u te04 |v 0
010 __ |a sh2006006354
040 __ |a CoU-DA |b eng |c DLC
150 __ |a Carhenge (Box Butte County, Neb.)
550 __ |w g |a Monuments |z Nebraska
670 __ |a Work cat.: Carhenge, genius or junk? VR 2005.
670 __ |a GNIS, Aug. 21, 2006 |b (Carhenge, locale, Box Butte County, Nebraska, 42°09'40"N 102°51'32"W)
670 __ |a Wikipedia, Aug. 21, 2006 |b (Carhenge is a replica of England's Stonehenge located near the town of Alliance, Nebraska on the High Plains. Instead of being made from stones, Carhenge is constructed of vintage American automobiles, all covered with gray spray paint. Built by Jim Reinders, it was dedicated at summer solstice in June of 1987.)
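A record like the one above is a list of tagged fields, each split into coded subfields. The following is a minimal Python sketch of reading one display line, assuming each subfield is marked by a pipe plus a one-letter code (a display convention; real MARC records use a binary delimiter). The function name `parse_marc_line` is hypothetical and not part of any library.

```python
import re

def parse_marc_line(line):
    """Split one display line into (tag, indicators, subfields).

    Assumes the form 'TAG II |a value |b value'. The pipe-plus-code
    subfield marker is a display convention assumed for this sketch.
    """
    tag, rest = line[:3], line[3:].strip()
    if tag < "010":
        # The leader line (shown as 000) and control fields 001-009
        # have no indicators or subfields.
        return tag, "", [("", rest)]
    indicators, _, data = rest.partition(" ")
    # A capture group in re.split keeps the subfield codes in the result.
    parts = re.split(r"\|(\w)\s*", data)
    subfields = [(code, value.strip())
                 for code, value in zip(parts[1::2], parts[2::2])]
    return tag, indicators, subfields

heading = parse_marc_line("150 __ |a Carhenge (Box Butte County, Neb.)")
broader = parse_marc_line("550 __ |w g |a Monuments |z Nebraska")
control = parse_marc_line("001 6940590")
```

Here `heading` carries the authorized subject heading in subfield `a`, and `broader` carries the broader-term reference (`w` is `g`) to Monuments, subdivided geographically by Nebraska.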
4
Favorite funny subject headings
  • Golf and war
  • Electric donkeys
  • Infants -- Congresses
  • World Wide Web -- Early works to 1800
  • Automobile driving -- Religious aspects
  • Dance -- France
  • Women, Kukukuku (Changed to Women, Hamtai)
  • Ugly contests
  • Host-fungus relationships
5
Favorite funny subject headings
  • Weapons of mass destruction -- Safety measures
  • Pomegranate seeds in literature
  • Infants -- Books and reading
  • Eskimos -- Hunting
  • Headache patients' writings
  • Bird surveys
  • Violin -- Methods (Fiddling)
  • Global warming -- Fiction
  • Body, Human -- Catalogs
  • Mentally ill parents
  • Appalachian Region -- Intellectual life
6
Favorite funny subject headings
  • Tax exemption -- Taxation
  • Dinosaurs as pets
  • Labor disputes -- Poetry
  • Crappie fishing
  • Reality -- Fiction
  • Historic buildings -- Design and construction
  • Public toilets in motion pictures
  • Domestic asses
  • Hurling managers
  • Uranus probes
  • 110 10 |a United States. |b Office of Solid Waste
7
Theses
  1. Libraries should provide high-quality information
    discovery and information retrieval.
  2. The best way to achieve this is with systems that
    sufficiently exploit rich, standard, and
    comprehensive metadata.
  3. Rich, standard, and comprehensive metadata
    requires controlled vocabularies for subject
    metadata, name disambiguation, granularity of
    description, and collocation.

8
Theses (continued)
  4. Full-text searching, while not devoid of value,
    is a low-quality IR/ID system for the type of
    searching done in libraries, especially serious
    research and scholarship.
  5. At this time, computers, which do not understand
    the nuances of human language, are not able to
    create metadata that is of sufficient quality for
    use in library IR systems.

9
Theses (continued)
  6. Information discovery often requires mediation.
    IR systems don't have to be dumbed down and made
    simple. Many things in the world are complicated,
    so it's natural that the organization of
    information will reflect that. It's okay to have
    to learn to use a library catalog or other IR
    system.

10
Theses (continued)
  7. Library IR systems should not abandon
    alphabetical browse displays in favor of
    relevance ranking.
  8. The creation, maintenance, and sharing of
    metadata for intellectual resources should not be
    made so complicated that it reduces the amount or
    quality of metadata being created.

11
About me
Auraria Campus
12
The value of metadata
  • Elements of metadata
  • The value of rich metadata
  • The library technology graveyard: analyses of
    low-quality, emerging library technologies
  • Defining quality in library IR systems

13
Left-anchored subject browse display
14
The value of left-anchored browse displays
  1. Simplicity
  2. Structure
  3. Parsing advantage
  4. References
  5. Truncation
  6. Concept consolidation
  7. Collocation of inverted terms
  8. Typographical errors
  9. Classification display
  10. Completeness
  11. Skill transference
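
A left-anchored browse can be sketched as a position lookup in an alphabetically sorted heading list: the display opens where the typed string would file, so a truncated or mistyped query still lands among its neighbors. This is a minimal Python illustration, not any catalog's actual implementation; the `browse` function and the sample headings are assumptions made for the example.

```python
import bisect

def browse(headings, query, window=3):
    """Return the entries at and after the point where `query` would file."""
    ordered = sorted(headings, key=str.casefold)
    keys = [h.casefold() for h in ordered]
    # bisect_left finds the first heading that sorts at or after the query.
    start = bisect.bisect_left(keys, query.casefold())
    return ordered[start:start + window]

# Hypothetical heading list for illustration only.
headings = [
    "Monuments -- Nebraska",
    "Carhenge (Box Butte County, Neb.)",
    "Monuments -- Colorado",
    "Money",
    "Monuments",
]
display = browse(headings, "Monu")
```

With this sample list, `display` holds "Monuments" followed by its two geographic subdivisions, which is the collocation effect a relevance-ranked result list does not guarantee.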

15
The four categories of searching in libraries
  • Deterministic searching
  • Full text searching
  • Metatext searching
  • Metadata-enhanced stochastic searching

16
Deterministic searching
  • An author, title, subject, or number search in an
    online library catalog
  • Only searches metadata; results sorted
    alphanumerically
  • Can use cross-references
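
As a rough sketch of the points above, a deterministic subject search can be modeled as an exact lookup that first follows a cross-reference and then returns results in alphanumeric order. The data and the names `SEE_REFERENCES`, `CATALOG`, and `subject_search` are hypothetical.

```python
# Hypothetical authority data: a variant term points to the authorized heading.
SEE_REFERENCES = {"Cars": "Automobiles", "Motorcars": "Automobiles"}

# Hypothetical catalog: authorized subject heading -> titles.
CATALOG = {
    "Automobiles": ["Vintage American automobiles", "Auto repair basics"],
    "Monuments": ["Roadside monuments of the plains"],
}

def subject_search(term):
    """Exact-match subject search that follows a 'see' reference first."""
    heading = SEE_REFERENCES.get(term, term)
    # Results are sorted alphanumerically rather than relevance ranked.
    return heading, sorted(CATALOG.get(heading, []))

result = subject_search("Cars")
```

A search on the variant "Cars" is redirected to "Automobiles", so the searcher does not need to know the authorized form in advance.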

17
Full-text searching
  • Matches words in a search with words in documents
  • Advantages: free, good for rare terms, good for
    casual information seeking
  • Also called stochastic searching or probabilistic
    searching

18
Metatext searching
  • Is a full-text search but only of metadata
  • A keyword search in a library catalog is an
    example
  • Advantages: good for rare words; good for novice
    searchers
  • Disadvantages: may miss abbreviated terms; is a
    full-text search, but not of the full text itself
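
The abbreviation disadvantage can be shown with a small sketch of a keyword search that looks only at metadata fields. The records below are made up for illustration (the first borrows a heading form with the abbreviation "Neb."), and `keyword_search` is a hypothetical helper.

```python
import re

# Hypothetical catalog records: only these metadata fields are searched.
RECORDS = [
    {"title": "Carhenge, genius or junk?",
     "subject": "Carhenge (Box Butte County, Neb.)"},
    {"title": "Monuments of the High Plains",
     "subject": "Monuments -- Nebraska"},
]

def words(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_search(query):
    """Return titles whose metadata contains every word of the query."""
    wanted = words(query)
    return [r["title"] for r in RECORDS
            if wanted <= words(" ".join(r.values()))]

nebraska_hits = keyword_search("nebraska")
carhenge_hits = keyword_search("carhenge")
```

In this sample, a keyword search for "nebraska" misses the Carhenge record because its heading spells the state as "Neb.", even though the record is about a place in Nebraska.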

19
Metadata-enhanced stochastic searching
  • Is a full-text search but also uses metadata to
    limit results
  • Google advanced search is an example
  • Google staff mode: how do they encode metadata?
    What's their metadata scheme?
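
One way to picture this category is a full-text match followed by a metadata filter. The sketch below is an assumption-laden illustration, not a description of how Google or any other engine works; the documents, field names, and `search` function are invented for the example.

```python
# Hypothetical documents: full text plus a few metadata fields.
DOCUMENTS = [
    {"text": "Carhenge is a replica of Stonehenge built from automobiles.",
     "year": 2006, "language": "eng"},
    {"text": "Stonehenge is a prehistoric monument in England.",
     "year": 1998, "language": "eng"},
    {"text": "Stonehenge est un monument en Angleterre.",
     "year": 2006, "language": "fre"},
]

def search(query, **limits):
    """Substring match on the full text, then keep documents matching every limit."""
    terms = query.lower().split()
    hits = [d for d in DOCUMENTS
            if all(t in d["text"].lower() for t in terms)]
    return [d for d in hits
            if all(d.get(field) == value for field, value in limits.items())]

all_hits = search("stonehenge")
limited = search("stonehenge", language="eng", year=2006)
```

The unrestricted search returns all three documents; adding the language and year limits narrows the result to the first one.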

20
(No Transcript)
21
The weaknesses of full-text searching
  • The synonym problem
  • The homonym problem
  • Inability to search by facets
  • Spamming
  • The "aboutness" problem
  • Figurative language
  • Word lists
  • Abstract topics

22
The weaknesses of full-text searching (continued)
  • The incognito problem
  • Difficult-to-search paired topics
  • Search engine variability
  • The opaque web
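
The synonym and homonym problems from the list of weaknesses can be reproduced with a few lines of Python. This is a toy sketch over invented documents; `full_text_search` is a hypothetical whole-word matcher, far simpler than a real search engine.

```python
# Hypothetical document collection for illustration only.
DOCUMENTS = {
    "doc1": "How to fix a flat tire on your car",
    "doc2": "Automobile tire repair for beginners",
    "doc3": "Mercury levels in freshwater fish",
    "doc4": "Mercury, the planet closest to the sun",
}

def full_text_search(word):
    """Return ids of documents whose text contains `word` as a whole word."""
    word = word.lower()
    return sorted(doc_id for doc_id, text in DOCUMENTS.items()
                  if word in text.lower().replace(",", " ").split())

# Synonym problem: the search misses doc2, which says "automobile".
car_hits = full_text_search("car")
# Homonym problem: the search mixes the element with the planet.
mercury_hits = full_text_search("mercury")
```

A controlled subject vocabulary addresses both cases: one authorized heading gathers the synonyms, and qualified headings separate the homonyms.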

23
Search fatigue
24
Miscellaneous
  • What computers still cannot do
  • Gresham's Law
  • Still need metadata surrogates
  • The debate about the future of cataloging
  • My strategy
  • "Next-generation" library catalogs

25
WorldCat.org: example of a next-generation,
FRBRized search engine
  • Facets
  • Metatext search
  • Hope for catalogers
  • Can also be sorted by author, title, date

26
jeffrey.beall_at_ucdenver.edu
Discussion