Title: OAI-PMH
1 OAI-PMH
- The Open Archives Initiative Protocol for Metadata Harvesting
Presenter: Knud Möller, Friday, 30.07.2004
2 Content
- Basic idea behind OAI-PMH
- Architectural Overview
- Repositories and Harvesters
- Resources, Items and Records
- Internal Record Format
- Sets
- Selective Harvesting
- Response Format
- Command Overview
3 Basic idea behind OAI-PMH
- provide a standard protocol for the harvesting/querying of metadata about any kind of resource
  - What kind of resources can you provide and what are their properties?
- OAI-PMH is only the protocol, needs to be implemented
- some implementations exist
  - Emblem Project Utrecht
    - http://emblems.let.uu.nl/emblems/html/techoai.html
  - Virginia Tech (VTOAI)
    - http://www.dlib.vt.edu/projects/OAI/software/vtoai/vtoai.html
4 Architectural Overview: Repositories and Harvesters
[Diagram: one Repository surrounded by four Harvesters]
- Harvesters issue OAI-PMH requests for metadata via HTTP.
- A Repository processes the OAI-PMH requests and has to implement the protocol.
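A minimal sketch of what "issuing an OAI-PMH request via HTTP" can look like from the harvester side: the verb and its arguments are sent as ordinary query parameters of a GET request. The helper name build_request_url is hypothetical, and the base URL is only the placeholder used in the response example of this deck, not a real repository.

```python
from urllib.parse import urlencode

# Placeholder base URL (assumption); a real harvester uses the repository's own.
BASE_URL = "http://an.oa.org/OAI-script"


def build_request_url(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request.

    The verb and all further arguments become query parameters.
    """
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


url = build_request_url(
    BASE_URL,
    "GetRecord",
    identifier="oai:arXiv.org:cs/0112017",
    metadataPrefix="oai_dc",
)
print(url)
```

Passing the resulting URL to urllib.request.urlopen would send the request; that step is left out here because the base URL is a placeholder.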
7 Architectural Overview: Resources, Items and Records
- Resource: anything (a physical artifact, a digital resource, a concept, etc.). Whatever the metadata is about.
- Item: representation of the resource in the repository. Can disseminate metadata in various formats. Must always provide Dublin Core. Has a unique identifier, e.g. oai:arXiv.org:cs/0112017
- Record: XML-encoded byte stream of the actual metadata.
10 Internal Record Format I
<record>
  <header> <!-- blabla --> </header>
  <metadata> <!-- blabla --> </metadata>
  <about> <!-- blabla --> </about>
</record>

<header>
  <identifier>oai:arXiv.org:cs/0112017</identifier>
  <datestamp>2002-02-28</datestamp>
  <setSpec>cs</setSpec>
  <setSpec>math</setSpec>
</header>
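To make the header structure concrete, here is a small sketch that reads the three kinds of header fields with Python's standard xml.etree.ElementTree. It assumes the header is given on its own and without a namespace, exactly as on the slide; inside a real response the elements sit in the OAI-PMH namespace. The function name parse_header is hypothetical.

```python
import xml.etree.ElementTree as ET

# Header copied from the slide (no namespace declared, as shown there).
HEADER_XML = """
<header>
  <identifier>oai:arXiv.org:cs/0112017</identifier>
  <datestamp>2002-02-28</datestamp>
  <setSpec>cs</setSpec>
  <setSpec>math</setSpec>
</header>
"""


def parse_header(xml_text):
    """Extract identifier, datestamp and all setSpec values from a header."""
    header = ET.fromstring(xml_text)
    return {
        "identifier": header.findtext("identifier"),
        "datestamp": header.findtext("datestamp"),
        # setSpec may occur several times, so collect it as a list.
        "sets": [element.text for element in header.findall("setSpec")],
    }


header_info = parse_header(HEADER_XML)
print(header_info)
```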
12 Internal Record Format II
<metadata>
  <oai_dc:dc
      xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
                          http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
    <dc:title>Using Structural Metadata to Localize Experience of Digital Content</dc:title>
    <dc:creator>Dushay, Naomi</dc:creator>
    <dc:subject>Digital Libraries</dc:subject>
    <dc:description>With the increasing ..bla.. to particular communities of users.</dc:description>
    <dc:date>2001-12-14</dc:date>
    <dc:type>e-print</dc:type>
    <dc:identifier>http://arXiv.org/abs/cs/0112017</dc:identifier>
  </oai_dc:dc>
</metadata>
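The Dublin Core part of a record is namespaced XML, so a harvester has to resolve the oai_dc and dc prefixes when reading it. The sketch below shows one way to do that with the standard library; the description element and the schema attributes are omitted for brevity, and the function name dublin_core_fields is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared on the slide.
NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened copy of the slide's example (description and schema attributes left out).
METADATA_XML = """
<metadata>
  <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Using Structural Metadata to Localize Experience of Digital Content</dc:title>
    <dc:creator>Dushay, Naomi</dc:creator>
    <dc:subject>Digital Libraries</dc:subject>
    <dc:date>2001-12-14</dc:date>
    <dc:type>e-print</dc:type>
    <dc:identifier>http://arXiv.org/abs/cs/0112017</dc:identifier>
  </oai_dc:dc>
</metadata>
"""


def dublin_core_fields(xml_text):
    """Collect the children of oai_dc:dc into a dict of lists.

    Lists are used because Dublin Core elements may be repeated.
    """
    root = ET.fromstring(xml_text)
    dc_root = root.find("oai_dc:dc", NS)
    fields = {}
    for element in dc_root:
        # ElementTree tags look like "{namespace-uri}title"; keep the local name.
        local_name = element.tag.split("}", 1)[1]
        fields.setdefault(local_name, []).append(element.text.strip())
    return fields


dc_fields = dublin_core_fields(METADATA_XML)
print(dc_fields["title"], dc_fields["creator"])
```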
13 Internal Record Format III
<about>
  <provenance
      xmlns="http://www.openarchives.org/OAI/2.0/provenance"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/provenance
                          http://www.openarchives.org/OAI/2.0/provenance.xsd">
    <originDescription harvestDate="2002-02-02T14:10:02Z" altered="true">
      <baseURL>http://the.oa.org</baseURL>
      <identifier>oai:r2.org:klik001</identifier>
      <datestamp>2002-01-01</datestamp>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </originDescription>
  </provenance>
</about>
14 Sets
- Items can be organized into sets.
- Sets can either be organized flat or
hierarchically.
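A short sketch of how a hierarchical set organization can be handled in code. It relies on the convention from the OAI-PMH specification that a hierarchical setSpec separates its levels with colons; the slide itself does not show this. The set names below ("math:algebra" and so on) and both function names are made up for illustration.

```python
def belongs_to_set(item_set_spec, requested_set_spec):
    """True if the item's set is the requested set or one of its subsets."""
    return (
        item_set_spec == requested_set_spec
        or item_set_spec.startswith(requested_set_spec + ":")
    )


def set_with_ancestors(set_spec):
    """List the set and all of its ancestors, outermost first."""
    parts = set_spec.split(":")
    return [":".join(parts[: i + 1]) for i in range(len(parts))]


# Hypothetical set names, for illustration only.
print(belongs_to_set("math:algebra", "math"))
print(belongs_to_set("mathematics", "math"))
print(set_with_ancestors("math:algebra:groups"))
```

A flat organization is simply the case where no setSpec contains a colon, so the same functions cover both layouts.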
15 Selective Harvesting
- Harvesters can specify some constraints on which items they are interested in
- Regarding datestamps
  - only items that were created, modified or deleted (optional) in a certain time period
- Regarding sets
  - only items that belong to a specific set (or any of its subsets)
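The two kinds of constraints map onto request arguments. The sketch below builds a ListRecords URL using the argument names from the OAI-PMH specification (from, until, set); the function name selective_harvest_url is hypothetical and the base URL is the placeholder from this deck's response example.

```python
from urllib.parse import urlencode


def selective_harvest_url(base_url, metadata_prefix,
                          from_date=None, until_date=None, set_spec=None):
    """Build a ListRecords URL restricted by datestamp range and/or set.

    Constraints that are None are left out, which means "no restriction".
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date      # lower bound on the datestamp
    if until_date is not None:
        params["until"] = until_date    # upper bound on the datestamp
    if set_spec is not None:
        params["set"] = set_spec        # the set (its subsets are included)
    return base_url + "?" + urlencode(params)


harvest_url = selective_harvest_url(
    "http://an.oa.org/OAI-script",  # placeholder base URL (assumption)
    "oai_dc",
    from_date="2002-01-01",
    until_date="2002-02-28",
    set_spec="math",
)
print(harvest_url)
```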
16 Response Format
<?xml version="1.0" encoding="UTF-8" ?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
                             http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2002-05-01T19:20:30Z</responseDate>
  <request verb="GetRecord"
           identifier="oai:arXiv.org:hep-th/9901001"
           metadataPrefix="oai_dc">http://an.oa.org/OAI-script</request>
  <GetRecord>
    <record>...</record>
  </GetRecord>
</OAI-PMH>
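Every response uses the same envelope: a response date, an echo of the request, and an element named after the verb. The following sketch reads that envelope with the standard library. The schema attributes are left out of the sample for brevity, and the function name summarize_response is hypothetical.

```python
import xml.etree.ElementTree as ET

# Default namespace of the response, bound to a prefix for lookups.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Shortened copy of the slide's example (schema attributes left out).
RESPONSE_XML = """<?xml version="1.0" encoding="UTF-8" ?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-05-01T19:20:30Z</responseDate>
  <request verb="GetRecord"
           identifier="oai:arXiv.org:hep-th/9901001"
           metadataPrefix="oai_dc">http://an.oa.org/OAI-script</request>
  <GetRecord>
    <record>...</record>
  </GetRecord>
</OAI-PMH>
"""


def summarize_response(xml_bytes):
    """Return the envelope fields of a GetRecord response as a dict."""
    root = ET.fromstring(xml_bytes)
    request = root.find("oai:request", OAI_NS)
    return {
        "response_date": root.findtext("oai:responseDate", namespaces=OAI_NS),
        "verb": request.get("verb"),        # attributes carry no namespace
        "base_url": request.text.strip(),   # element text is the base URL
        "record_count": len(root.findall("oai:GetRecord/oai:record", OAI_NS)),
    }


# Parse from bytes so the encoding declaration is honored by the parser.
summary = summarize_response(RESPONSE_XML.encode("utf-8"))
print(summary)
```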
17 Command Overview I
- GetRecord: get a specific record; must specify the item's identifier (a URI) and a metadata prefix
- Identify: retrieve information about a repository (name, protocol version, supports deletion, ...)
- ListRecords: get either all records or a subset; must specify a metadata prefix
- ListIdentifiers: like ListRecords, but retrieves only headers
18 Command Overview II
- ListMetadataFormats: lists the available metadata formats of a repository
- ListSets: returns the set structure of a repository
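The six commands and their mandatory arguments can be summarized as a lookup table, which a harvester might use to check a request before sending it. This is a simplified sketch: it covers only the required arguments named on the slides (with ListIdentifiers treated like ListRecords) and ignores optional arguments and the resumption-token mechanism of the specification. The names REQUIRED_ARGUMENTS and missing_arguments are hypothetical.

```python
# Required arguments per verb, as far as the slides state them.
REQUIRED_ARGUMENTS = {
    "GetRecord": {"identifier", "metadataPrefix"},
    "Identify": set(),
    "ListRecords": {"metadataPrefix"},
    "ListIdentifiers": {"metadataPrefix"},
    "ListMetadataFormats": set(),
    "ListSets": set(),
}


def missing_arguments(verb, arguments):
    """Return the sorted names of required arguments that were not supplied.

    Raises ValueError for a verb that is not one of the six commands.
    """
    if verb not in REQUIRED_ARGUMENTS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    return sorted(REQUIRED_ARGUMENTS[verb] - set(arguments))


print(missing_arguments("GetRecord", {"identifier": "oai:arXiv.org:cs/0112017"}))
print(missing_arguments("ListSets", {}))
```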
19 References
- OAI-PMH specification: http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
20 Thanks and goodbye!