Title: University of Illinois Experiences Using OAI Protocol for Metadata Harvesting
1 University of Illinois Experiences Using OAI Protocol for Metadata Harvesting
Joanne Kaczmarek, University of Illinois at Urbana-Champaign
ALA ETEXT Discussion Group June 15, 2002
2 Outline
- OAI Protocol for Metadata Harvesting
- Illinois Project Objectives
- OAI Cultural Heritage Repository
- Research Issues
3 OAI Protocol for Metadata Harvesting
- Harvesting Approach to Interoperability
- Provider Services and Harvesting Services
- Relies on HTTP and XML
- Applicability Beyond E-Print Archives
[Diagram: the Service Provider (harvester) sends 6 OAI HTTP requests to the Metadata Provider (repository), which returns 6 HTTP responses.]
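The 6 HTTP requests above correspond to the six verbs defined by the OAI protocol. As a minimal sketch, the snippet below builds the GET URL a harvester might send for each verb. The base URL is a hypothetical placeholder, not an endpoint from this project, and `build_request` is an illustrative helper name.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs; each one is sent as an HTTP request.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)

# Hypothetical base URL, used only for illustration.
BASE_URL = "http://example.org/oai"


def build_request(base_url, verb, **params):
    """Return the GET URL for one OAI-PMH request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # The verb always comes first, followed by any verb-specific arguments.
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    print(build_request(BASE_URL, "Identify"))
    print(build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

The repository answers each request with an XML document over HTTP, which is why the approach needs nothing beyond HTTP and XML on either side.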
4 2. Illinois Project Objectives
- Cultural Heritage Resources
- Create Harvesting and Provider Tools
- Heterogeneous Cross-Collection Repository
5 Project Objectives (continued)
- Focus on...
- CIC Resources and EAD Finding Aids
- Interface for Scholars and K-12 Communities
- Coordinate with University of Michigan OAIster Project
6 Native Metadata View
7 Views of Metadata: OAI-XML
8 Views of Metadata: Illinois Repository
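To make the OAI-XML view concrete, here is a minimal sketch of reading the Dublin Core elements out of a harvested record. The sample record is invented for illustration, and the namespace URIs assume version 2.0 of the protocol's `oai_dc` format; a repository running an earlier version would use a different `oai_dc` namespace.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented sample record, for illustration only.
SAMPLE = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Sample photograph</dc:title>
  <dc:type>image</dc:type>
  <dc:subject>Example subject</dc:subject>
  <dc:subject>Another subject</dc:subject>
</oai_dc:dc>"""


def dc_fields(xml_text):
    """Collect Dublin Core elements into {element name: [values]}."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        # ElementTree spells qualified tags as "{namespace}localname".
        if child.tag.startswith("{" + DC_NS + "}"):
            name = child.tag.split("}", 1)[1]
            fields.setdefault(name, []).append((child.text or "").strip())
    return fields


if __name__ == "__main__":
    for name, values in dc_fields(SAMPLE).items():
        print(name, values)
```

Every Dublin Core element is repeatable, so each element name maps to a list of values.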
9 3. OAI Cultural Heritage Repository
- 1,004,500 records from 34 different institutions
  - 3 museums
  - 12 academic libraries
  - 4 historical or cultural societies
  - 1 state library
  - 14 digital libraries
- http://oai.grainger.uiuc.edu/search/
10 Participating Institutions
- Alliance Library System
- American Museum of Natural History
- American Numismatic Society
- American Philosophical Society
- Bentley Historical Library, University of Michigan
- Celebration of Women Writers
- Colorado Digitization Project
- CIMI
- Cornell University Library
- Formations
- Harvard University Library
- Ibiblio Digital Library Project
- Illinois State Library
- Indiana University Digital Library Program
- Library of Congress
- Lincoln Trail Libraries System
- Michigan State University
- Minnesota Historical Society
- National Library of Australia
- Northwestern University Library
- Ohio Historical Society
- Online Archive of California
- Open Video Project
- Perseus Digital Library
- Spurlock Museum, University of Illinois
- University of Chicago Libraries
- University of Illinois Library (Archives, DIMTI)
- University of Iowa
- University of Michigan Library
- University of Minnesota
- University of Pennsylvania, Schoenberg Center
- University of Tennessee Libraries
- University of Texas, Austin
11 Simple Search Interface
12 Advanced Search Interface
13 Brief Record Results Listing
16 4. Research Issues
- Variations in use of metadata
- Organization of resources
- Alternative data presentation
- Rights and permissions
17 Variations in use of metadata
- Mapping other schemas into DC
  - MARC
  - EAD
  - Local
- Mapping variance
- Normalizing
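Mapping another schema into DC can be pictured as a crosswalk table. The sketch below is illustrative only: the four MARC tags and their DC targets are a simplified assumption, not the mapping used by the project, and the `(tag, value)` record representation is hypothetical. Tags with no entry in the table are returned separately, since different providers making different choices for such fields is one source of mapping variance.

```python
# Illustrative, simplified crosswalk; not the project's actual mapping.
MARC_TO_DC = {
    "100": "creator",      # main entry, personal name
    "245": "title",        # title statement
    "520": "description",  # summary note
    "650": "subject",      # topical subject heading
}


def marc_to_dc(marc_fields):
    """Map (tag, value) pairs to DC; return (dc_record, unmapped_tags)."""
    dc_record = {}
    unmapped = []
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            # No rule for this tag: report it instead of guessing.
            unmapped.append(tag)
            continue
        dc_record.setdefault(element, []).append(value)
    return dc_record, unmapped


if __name__ == "__main__":
    # Invented record, for illustration only.
    sample = [
        ("245", "Sample title"),
        ("100", "Sample, Author"),
        ("650", "Example topic"),
        ("999", "Local note"),
    ]
    print(marc_to_dc(sample))
```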
19 Mapping variance
20 Normalizing DC Type element
- Review existing vocabularies used
  - Dublin Core Type Vocabulary
  - LC Thesaurus for Graphic Materials
- Select terms to add or replace
- Continuous application of vocabulary through filters
- Provide end-user options
  - Drop down menus in interface
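A vocabulary filter of this kind can be sketched as a lookup applied to every harvested Type value. The lookup table below is hypothetical and is not the project's actual term list; the target terms (Image, Text, Sound) come from the Dublin Core Type Vocabulary. Values that match nothing are kept aside for review rather than dropped.

```python
# Hypothetical lookup from local Type values to Dublin Core Type Vocabulary terms.
TYPE_FILTER = {
    "photograph": "Image",
    "photographs": "Image",
    "image": "Image",
    "book": "Text",
    "letter": "Text",
    "text": "Text",
    "sound recording": "Sound",
}


def normalize_type(raw_value):
    """Return the controlled term for one raw value, or None if unknown."""
    # Lowercase and collapse whitespace before looking the value up.
    key = " ".join(raw_value.lower().split())
    return TYPE_FILTER.get(key)


def apply_filter(values):
    """Return (normalized terms without duplicates, unmatched raw values)."""
    normalized, unmatched = [], []
    for value in values:
        term = normalize_type(value)
        if term is None:
            unmatched.append(value)
        elif term not in normalized:
            normalized.append(term)
    return normalized, unmatched


if __name__ == "__main__":
    print(apply_filter(["Photographs", " book ", "Letter", "map"]))
```

The normalized terms form a short, fixed list, which is what makes a drop-down menu in the search interface practical.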
21 Use of Type element
- 1400 different values
- Used 1 to 873 times per collection
- Variety of controlled and local vocabularies
  - Dublin Core Type Vocabulary
  - LC Thesaurus for Graphic Materials II
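Figures like these can be produced by tallying Type values per collection. The following is a minimal sketch under the assumption that harvested records are available as `(collection, type value)` pairs; the sample data and the name `type_usage` are invented for illustration.

```python
from collections import Counter


def type_usage(records):
    """Tally Type values per collection.

    records: iterable of (collection, type_value) pairs.
    Returns (per-collection Counters, number of distinct values overall).
    """
    per_collection = {}
    for collection, type_value in records:
        per_collection.setdefault(collection, Counter())[type_value] += 1
    distinct = {value for counts in per_collection.values() for value in counts}
    return per_collection, len(distinct)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = [
        ("museum-a", "image"),
        ("museum-a", "image"),
        ("museum-a", "Physical object"),
        ("library-b", "text"),
        ("library-b", "image"),
    ]
    usage, distinct_count = type_usage(sample)
    print(distinct_count)
    for collection, counts in usage.items():
        print(collection, counts.most_common())
```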
24 Organization of Resources
25 Type of resource or institution
- Online access to text/image collections
- Personal papers, correspondence, manuscripts
- Visual materials (photos, paintings, etc.)
- Physical objects/artifacts
- Textual materials (books, letters, etc.)
- Institution type (museum, library, archive, historical society)
- Individual collection descriptions
26 Need collocation to enhance utility of heterogeneous repository
- Difficult to provide efficient automatic indexing of DC subject element
- Testing with NCSA data mining tools
- Metadata representing resource, not actual resource
27 Alternative Data Presentation
Native EAD Records
http://oai.grainger.uiuc.edu/ead/search
31 Test Scalability and Sustainability
- 45 million catalog records
- Low cost for metadata providers
- Moderate cost for harvesting service providers
- Long-term funding