Title: Implementation of PREMIS in METS
1 Implementation of PREMIS in METS
- Rebecca Guenther
- Sr. Networking Standards Specialist, Library of Congress
- rgue_at_loc.gov
- PREMIS Implementation Fair
- San Francisco, CA
- October 7, 2009
2
- METS records the (possibly hierarchical) structure of digital objects, the names and locations of the files that comprise those objects, and the associated metadata
- A METS document may be a unit of storage (e.g. OAIS AIP) or a transmission format (e.g. OAIS SIP or DIP)
- METS is extensible and modular
- METS uses the XML Schema facility for combining vocabularies from different Namespaces
- The METS Editorial Board has endorsed PREMIS as an extension schema
- Many institutions are trying to use PREMIS within the METS context
3 Structure of a METS file
4 OAIS, METS and PREMIS
[Diagram: the OAIS information model mapped onto METS elements and extension schemas.]
- OAIS concepts: Archival Information Package, Packaging Information, Descriptive Information, Content Information, Data Object, Representation Information (Structure, Semantics), Preservation Description Information (Reference Information, Context Information, Provenance Information, Fixity Information)
- METS elements: <METS>, <dmdSec>, <amdSec>, <techMD>, <rightsMD>, <sourceMD>, <digiProvMD>, <mdRef>, <fileGrp>, <file>, <structMap>
- Extension schemas: MODS, MARCXML, DC, metsRights, premis:rights, premis:event, premis:object, textMD, MIX
- Relationship labels: described by, delimited by, identifies, derived from, further described by; one box is labeled "File formats"
- Legend: black Arial = OAIS; red Times New Roman = METS primary schema; blue Times New Roman italics = extension schema
5 METS extension schemas
- Wrappers or "sockets" where elements from other schemas can be plugged in
- Provides extensibility
- Uses the XML Schema facility for combining vocabularies from different Namespaces
- Endorsed extension schemas
  - Descriptive: MODS, DC, MARCXML
  - Technical metadata: MIX (image), textMD (text)
  - Preservation related: PREMIS
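The "socket" idea can be made concrete with a small sketch. The Python below builds a METS techMD section and plugs a PREMIS object into its mdWrap/xmlData wrapper. It assumes the PREMIS 2.x namespace URI; the helper names (`q`, `wrap_in_techmd`) are hypothetical, and the result is not validated against either schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumption: PREMIS 2.x; adjust for other versions

ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def q(namespace, name):
    """Return the ElementTree qualified name '{namespace}name'."""
    return f"{{{namespace}}}{name}"


def wrap_in_techmd(section_id, md_type, payload):
    """Plug an extension-schema element into a METS techMD 'socket'."""
    tech_md = ET.Element(q(METS_NS, "techMD"), {"ID": section_id})
    md_wrap = ET.SubElement(tech_md, q(METS_NS, "mdWrap"), {"MDTYPE": md_type})
    xml_data = ET.SubElement(md_wrap, q(METS_NS, "xmlData"))
    xml_data.append(payload)  # the payload keeps its own namespace
    return tech_md


# A minimal PREMIS object carrying only a size, for illustration.
premis_object = ET.Element(q(PREMIS_NS, "object"))
characteristics = ET.SubElement(premis_object, q(PREMIS_NS, "objectCharacteristics"))
size = ET.SubElement(characteristics, q(PREMIS_NS, "size"))
size.text = "184302"

tech_md = wrap_in_techmd("TMD1PREMIS", "PREMIS", premis_object)
print(ET.tostring(tech_md, encoding="unicode"))
```

The same wrapper would accept a MIX or textMD element instead; only the MDTYPE value and the payload namespace change.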
6 Why do we need guidelines for using PREMIS with METS?
- Contents of each information package may vary depending on its function within a repository
- Need to determine how to include representation metadata and associate it with package components
- PREMIS data entities (objects, events, rights, agents) do not map perfectly to METS categories for representation metadata (techMD, digiProvMD, rightsMD, sourceMD)
- There are redundant elements between the two standards
- Both have extensibility mechanisms
- Flexibility of both standards requires implementation choices
7 Development of Guidelines for Using PREMIS with METS for Exchange
- PREMIS in METS Guidelines Working Group
- Consists of PREMIS and METS experts
- Focuses on the METS document as a mechanism of exchange of digital objects and their metadata (SIP or DIP)
- Facilitates communication when internal requirements and technical environments vary
- Tension between flexibility and being prescriptive to facilitate interoperability
- Consider usage scenarios
  - If a SIP, it may get unwrapped and stored in different structures
  - If a DIP, it is converted from internal structures to PREMIS
  - A more liberal approach is possible for a SIP than a DIP
- Establishing guidelines, a METS profile, and examples
  - http://www.loc.gov/standards/premis/guidelines-premismets.pdf
8 Implementation issues in using PREMIS with METS
- Location of PREMIS metadata within METS documents
- Whether to record elements redundantly if they occur in both PREMIS and METS
- Relationship of different structural metadata mechanisms in PREMIS and METS
- How to record PREMIS Agent entities in METS documents
- Use of identifiers to link elements in PREMIS and METS
- How to record elements that are also part of a format specific technical metadata schema (e.g. MIX)
9 Some recommendations from Guidelines
- METS sections
  - Use Object in techMD or digiProvMD
  - Use Event in digiProvMD
  - Use Rights in rightsMD
  - Use Agent in digiProvMD or rightsMD
- PREMIS Container -- use only if keeping all PREMIS metadata together. Do not use if separating PREMIS metadata into different amdSec subelements
- PREMIS and METS redundancies -- choosing which options to use is an implementation decision; document it in the profile, e.g. METS <size> element attributes and subelements of <objectCharacteristics> in PREMIS
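A minimal sketch of the section recommendations, assuming one option has been picked wherever the guidelines allow two (Object in techMD, Agent in digiprovMD). The mapping table and the helper names are hypothetical; the element name is spelled `digiprovMD` as in the METS schema, and the sections are emitted in the order the METS schema expects inside amdSec (techMD, rightsMD, sourceMD, digiprovMD). No PREMIS container is used, because the entities are split across sections.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumption: PREMIS 2.x namespace

# Hypothetical local policy: one choice per PREMIS entity.
SECTION_FOR_ENTITY = {
    "object": "techMD",
    "event": "digiprovMD",
    "rights": "rightsMD",
    "agent": "digiprovMD",
}
AMDSEC_ORDER = ["techMD", "rightsMD", "sourceMD", "digiprovMD"]


def q(namespace, name):
    return f"{{{namespace}}}{name}"


def local_name(element):
    """Strip the '{namespace}' prefix from an element tag."""
    return element.tag.rsplit("}", 1)[-1]


def build_amdsec(entities):
    """entities: list of (section_id, premis_element) pairs, in any order."""
    staged = []
    for section_id, entity in entities:
        section = SECTION_FOR_ENTITY[local_name(entity)]
        staged.append((AMDSEC_ORDER.index(section), section, section_id, entity))
    amd_sec = ET.Element(q(METS_NS, "amdSec"))
    # sorted() is stable, so entities bound for the same section keep input order.
    for _, section, section_id, entity in sorted(staged, key=lambda item: item[0]):
        wrapper = ET.SubElement(amd_sec, q(METS_NS, section), {"ID": section_id})
        md_wrap = ET.SubElement(wrapper, q(METS_NS, "mdWrap"), {"MDTYPE": "PREMIS"})
        ET.SubElement(md_wrap, q(METS_NS, "xmlData")).append(entity)
    return amd_sec


amd_sec = build_amdsec([
    ("DP1EVENT", ET.Element(q(PREMIS_NS, "event"))),
    ("TMD1PREMIS", ET.Element(q(PREMIS_NS, "object"))),
    ("DP1AGENT", ET.Element(q(PREMIS_NS, "agent"))),
])
layout = [(local_name(child), child.get("ID")) for child in amd_sec]
print(layout)
```

Whichever mapping an institution chooses, the table is the kind of decision the guidelines ask to have documented in the METS profile.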
10 Recommendations (cont.)
- Structural relationship elements -- use the METS structMap to record structural relationships; use PREMIS relationship elements to record preservation and derivation relationships, and structural relationships if desired
- ID/IDREF and PREMIS identifier elements -- use METS ID/IDREF mechanisms; best practices for using these ID/IDREF mechanisms apply
- Use PREMIS extensibility mechanism for format specific technical metadata
- Document decisions in METS profiles
11
    <fileSec><fileGrp>
      <file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT" CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1">
        <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG" />
      </file></fileGrp></fileSec>
    <techMD ID="TMD1PREMIS">
      <mdWrap MDTYPE="PREMIS">
        <xmlData> <premis:object>
          <objectCharacteristics> <fixity>
            <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
            <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
            <messageDigestOriginator>EchoDep</messageDigestOriginator> </fixity>
            <size>184302</size> </objectCharacteristics>
- Elements defined in both METS and PREMIS
  - METS Checksum, Checksumtype
    - attribute of <file>
    - not repeatable
  - PREMIS fixity
    - also includes messageDigestOriginator
    - allows multiples
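When the same value is recorded in both places, the two copies can drift apart. The sketch below checks that a file's METS SIZE and CHECKSUM attributes agree with the PREMIS size and fixity in the sections its ADMID points to. The sample document is a completed, well-formed variant of the excerpt above (namespace prefixes and closing tags are added, and the PREMIS 2.x namespace is assumed); the function name is hypothetical. Because PREMIS allows several fixity blocks, the METS checksum only has to match one of them.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/", "premis": "info:lc/xmlns/premis-v2"}

DOC = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                    xmlns:premis="info:lc/xmlns/premis-v2">
  <mets:amdSec>
    <mets:techMD ID="TMD1PREMIS"><mets:mdWrap MDTYPE="PREMIS"><mets:xmlData>
      <premis:object>
        <premis:objectCharacteristics>
          <premis:fixity>
            <premis:messageDigestAlgorithm>SHA-1</premis:messageDigestAlgorithm>
            <premis:messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</premis:messageDigest>
          </premis:fixity>
          <premis:size>184302</premis:size>
        </premis:objectCharacteristics>
      </premis:object>
    </mets:xmlData></mets:mdWrap></mets:techMD>
  </mets:amdSec>
  <mets:fileSec><mets:fileGrp>
    <mets:file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS"
               CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf"
               CHECKSUMTYPE="SHA-1"/>
  </mets:fileGrp></mets:fileSec>
</mets:mets>"""


def redundancy_mismatches(root):
    """List disagreements between METS file attributes and linked PREMIS values."""
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    problems = []
    for file_el in root.iterfind(".//mets:file", NS):
        file_id = file_el.get("ID")
        mets_size = file_el.get("SIZE")
        mets_digest = (file_el.get("CHECKSUMTYPE"), (file_el.get("CHECKSUM") or "").lower())
        for ref in file_el.get("ADMID", "").split():
            section = by_id.get(ref)
            if section is None:
                continue  # dangling IDREFs are a separate check
            for chars in section.iterfind(".//premis:objectCharacteristics", NS):
                premis_size = chars.findtext("premis:size", default="", namespaces=NS).strip()
                if mets_size and premis_size and mets_size != premis_size:
                    problems.append(f"{file_id}: SIZE {mets_size} != PREMIS size {premis_size}")
                digests = {
                    (
                        fx.findtext("premis:messageDigestAlgorithm", default="", namespaces=NS).strip(),
                        fx.findtext("premis:messageDigest", default="", namespaces=NS).strip().lower(),
                    )
                    for fx in chars.iterfind("premis:fixity", NS)
                }
                if mets_digest[1] and digests and mets_digest not in digests:
                    problems.append(f"{file_id}: CHECKSUM not found among PREMIS fixity values")
    return problems


print(redundancy_mismatches(ET.fromstring(DOC)))
```

A repository that records only one of the two copies would skip this check entirely; that choice belongs in the profile.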
12
    <fileSec><fileGrp>
      <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT" MIMETYPE="image/jpeg">
        <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
      </file></fileGrp></fileSec>
    <techMD ID="TMD1PREMIS">
      <mdWrap MDTYPE="PREMIS">
        <xmlData>
          <premis:object>
            <objectCharacteristics>
              <format>
                <formatDesignation>
                  <formatName>image/jpeg</formatName>
                  <formatVersion>1.02</formatVersion>
                </formatDesignation></format>
            </objectCharacteristics>
- Elements defined both in METS and PREMIS
  - METS MIMETYPE
    - attribute of <file>
13
    <fileSec> <fileGrp>
      <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT">
    <techMD ID="TMD1PREMIS">
      <linkingEventIdentifier>
        <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
        <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
      </linkingEventIdentifier>
    <digiprovMD ID="DP1EVENT">
      <premis:event>
        <eventIdentifier>
          <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
          <eventIdentifierValue>echo12345</eventIdentifierValue>
        </eventIdentifier>
        <eventType>ingestion</eventType>
        <eventDateTime>2006-05-02T15:12:53</eventDateTime></premis:event>
- Elements defined both in METS and PREMIS
  - METS ID/IDREF used to associate metadata in different sections and for different files
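The two linking mechanisms can be exercised side by side. This sketch resolves a file's ADMID tokens against the ID attributes present in the document (the METS ID/IDREF mechanism), and separately matches PREMIS linkingEventIdentifier values to eventIdentifier values. The sample document is a completed, well-formed variant of the excerpt (prefixes, mdWrap wrappers and closing tags are added; the PREMIS 2.x namespace is assumed), and the function names are hypothetical. The TMD1MIX and DP1AGENT sections are deliberately left out, as in the excerpt, so they show up as dangling.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/", "premis": "info:lc/xmlns/premis-v2"}

DOC = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                    xmlns:premis="info:lc/xmlns/premis-v2">
  <mets:amdSec>
    <mets:techMD ID="TMD1PREMIS"><mets:mdWrap MDTYPE="PREMIS"><mets:xmlData>
      <premis:object>
        <premis:linkingEventIdentifier>
          <premis:linkingEventIdentifierType>ECHODEP Hub Event</premis:linkingEventIdentifierType>
          <premis:linkingEventIdentifierValue>echo12345</premis:linkingEventIdentifierValue>
        </premis:linkingEventIdentifier>
      </premis:object>
    </mets:xmlData></mets:mdWrap></mets:techMD>
    <mets:digiprovMD ID="DP1EVENT"><mets:mdWrap MDTYPE="PREMIS"><mets:xmlData>
      <premis:event>
        <premis:eventIdentifier>
          <premis:eventIdentifierType>ECHODEP Hub Event</premis:eventIdentifierType>
          <premis:eventIdentifierValue>echo12345</premis:eventIdentifierValue>
        </premis:eventIdentifier>
        <premis:eventType>ingestion</premis:eventType>
        <premis:eventDateTime>2006-05-02T15:12:53</premis:eventDateTime>
      </premis:event>
    </mets:xmlData></mets:mdWrap></mets:digiprovMD>
  </mets:amdSec>
  <mets:fileSec><mets:fileGrp>
    <mets:file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"/>
  </mets:fileGrp></mets:fileSec>
</mets:mets>"""


def text(element, path):
    return element.findtext(path, default="", namespaces=NS).strip()


def dangling_admids(root):
    """Map each file ID to the ADMID tokens that match no ID in the document."""
    known = {el.get("ID") for el in root.iter() if el.get("ID")}
    missing = {}
    for file_el in root.iterfind(".//mets:file", NS):
        unresolved = [ref for ref in file_el.get("ADMID", "").split() if ref not in known]
        if unresolved:
            missing[file_el.get("ID")] = unresolved
    return missing


def linked_events(root):
    """Pair each PREMIS linking identifier with the type of the event it names."""
    events = {}
    for event in root.iterfind(".//premis:event", NS):
        key = (text(event, "premis:eventIdentifier/premis:eventIdentifierType"),
               text(event, "premis:eventIdentifier/premis:eventIdentifierValue"))
        events[key] = text(event, "premis:eventType")
    links = []
    for link in root.iterfind(".//premis:linkingEventIdentifier", NS):
        key = (text(link, "premis:linkingEventIdentifierType"),
               text(link, "premis:linkingEventIdentifierValue"))
        links.append((key[1], events.get(key)))  # None when no event matches
    return links


root = ET.fromstring(DOC)
print(dangling_admids(root))
print(linked_events(root))
```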
14
    <structMap TYPE="physical">
      <div ORDER="1" TYPE="text">
        <fptr FILEID="FID9"/>
        <div ORDER="1" TYPE="page" LABEL="Page 1">
          <fptr FILEID="FID1"/></div>
        <div ORDER="2" TYPE="page" LABEL="Page 2">
          <fptr FILEID="FID2"/></div>
      </div>
    <relationship>
      <relationshipType>structural</relationshipType>
      <relationshipSubType>is sibling of</relationshipSubType>
      <relatedObjectIdentification>
        <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
        <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
        <relatedObjectSequence>1</relatedObjectSequence>
- Elements defined both in METS and PREMIS
  - METS structMap
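To see why the structural relationship is redundant, note that the sibling information recorded in the PREMIS relationship element can be derived from the structMap alone. The sketch below walks a completed, well-formed variant of the structMap (a namespace prefix and the closing tag are added), lists the pages in order, and finds the siblings of a file. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
NS = {"mets": METS_NS}

STRUCT_MAP = """<mets:structMap xmlns:mets="http://www.loc.gov/METS/" TYPE="physical">
  <mets:div ORDER="1" TYPE="text">
    <mets:fptr FILEID="FID9"/>
    <mets:div ORDER="1" TYPE="page" LABEL="Page 1"><mets:fptr FILEID="FID1"/></mets:div>
    <mets:div ORDER="2" TYPE="page" LABEL="Page 2"><mets:fptr FILEID="FID2"/></mets:div>
  </mets:div>
</mets:structMap>"""


def pages_in_order(struct_map):
    """Return (ORDER, LABEL, FILEID) for every page div, sorted by ORDER."""
    pages = []
    for div in struct_map.iter(f"{{{METS_NS}}}div"):
        if div.get("TYPE") == "page":
            fptr = div.find("mets:fptr", NS)
            pages.append((int(div.get("ORDER")), div.get("LABEL"), fptr.get("FILEID")))
    return sorted(pages)


def siblings_of(struct_map, file_id):
    """Return FILEIDs whose div shares a parent div with the div of file_id."""
    for parent in struct_map.iter(f"{{{METS_NS}}}div"):
        ids = [
            fptr.get("FILEID")
            for child in parent.findall("mets:div", NS)
            for fptr in child.findall("mets:fptr", NS)
        ]
        if file_id in ids:
            return [other for other in ids if other != file_id]
    return []


struct_map = ET.fromstring(STRUCT_MAP)
print(pages_in_order(struct_map))
print(siblings_of(struct_map, "FID1"))
```

Here the structMap yields FID2 as the sibling of FID1, which is what the PREMIS relationship in the excerpt states explicitly.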
15 Some METS profiles with PREMIS
- UCSD simple and complex object
- UC Berkeley
- ECHO Dep Generic METS Profile for Preservation and Digital Repository Interoperability
- LC Profile for Recorded Events
- Australian METS Profile
- TIPR
- many others
16 Additional changes to Guidelines
- Make extensibility mechanism consistent with METS
  - significantPropertiesExtension
  - objectCharacteristicsExtension
  - creatingApplicationExtension
  - environmentExtension
  - signatureInformationExtension
  - eventOutcomeDetailExtension
  - rightsExtension
17 Additional changes to Guidelines (cont.)
- Add the same elements and attributes as in METS to PREMIS extension elements in schema and data dictionary
  - mdRef, mdWrap
  - binData, xmlData
  - Attributes: ID, LABEL, MDTYPE, MIMETYPE, SIZE, CREATED, CHECKSUM, CHECKSUMTYPE
  - Allow URI or string for MDTYPE
- Add use cases/examples to illustrate choices made
- Clarify structural relationships
18 Implementing an Exchange Standard
- PREMIS Implementation Tool
- Some tools documented on the PREMIS website: http://www.loc.gov/standards/premis/tools_for_premis.php
- PiM tool developed by Florida Center for Library Automation
- Further work to generate metadata from digital files in PREMIS elements