Title: Metadata for Digital Collections in Tools of History
1Metadata for Digital Collections in Tools of History
- South Central Regional Library Council
- October 30, 2008
2Workshop Topics
- Definitions of Metadata
- Types of Metadata
- Purposes of Metadata
- Metadata Schemes
- Dublin Core
- Standardization of content
- Controlled vocabularies/thesauri, authority files
- Content standards
- Metadata for Tools of History
- Quality Control
3What is Metadata?
- Simple definition: data about data, or information about information
- Many more definitions exist
4Metadata Definitions
- Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. (NISO, Understanding Metadata, 2004)
- The sum total of what one can say about any information object at any level of aggregation. (Murtha Baca, Introduction to Metadata, Getty Research Institute)
- Structured information used to find, access, use and manage information resources primarily in a digital environment. (International Encyclopedia of Information and Library Science, 2003)
5One more definition
- Structured data that describes a resource, identifies relationships among resources, supports the discovery, management and effective use of Web resources, and exists in many environments.
- (Sherry Vellucci, UCSD Metadata Services Talk, 2006, http://tpot.ucsd.edu/Cataloging/VellucciPresentation.ppt#295,18,What Is Metadata?)
6Metadata serves many purposes
- Search, browse, discover, access, describe, identify, use, manage, share (interoperability)
- Metadata allows you to describe, locate, manage, and preserve your digital items
- Metadata allows users to discover, access, identify, understand, and use your digital items
- Metadata allows machines to process, share, and manipulate your digital items
7Types of Metadata
- Descriptive
- Factual information: who, what, when, where
- Analytical information: what is it about (subject analysis)
- Increases access by providing searchable terms
- Structural
- Information that identifies the structure of complex objects (e.g. books)
- File formats
- Administrative
- Rights, permissions, restrictions
- Identifiers
- Provenance information
- Preservation/Technical: information about the digital file itself, including how it was created
8Metadata Schemes
- The alphabet soup of metadata
- AACR2/MARC
- TEI
- EAD
- LOM
- VRA Core
- CDWA
- METS
- MODS
- PREMIS
- MIX
- DC (Dublin Core; the HRVH Metadata Style Guide is based on DC)
- There are many more, but you get the idea!
9Dublin Core
10Dublin Core Background
- Developed in 1995
- International, cross-disciplinary collaboration
- Primarily descriptive metadata
- Simple and flexible
- 15 Core elements
- All elements are optional (none are mandatory),
all are repeatable
11Simple Dublin Core Elements
12Simple Dublin Core Elements
- Title: A name given to the resource.
- Creator: An entity primarily responsible for making the resource.
- Subject: The topic of the resource.
- Description: An account of the resource.
- Publisher: An entity responsible for making the resource available.
13Simple Dublin Core Elements
- Contributor: An entity responsible for making contributions to the resource.
- Date: A point or period of time associated with an event in the lifecycle of the resource.
- Type: The nature or genre of the resource.
- Format: The file format, physical medium, or dimensions of the resource.
- Identifier: An unambiguous reference to the resource within a given context.
14Simple Dublin Core Elements
- Source: The resource from which the described resource is derived.
- Language: A language of the resource.
- Relation: A related resource.
- Coverage: The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant.
- Rights: Information about rights held in and over the resource.
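As a rough illustration of the 15 simple elements, the sketch below stores one record as a Python dictionary of lists (every element optional, every element repeatable) and serializes it to XML. The sample values and the `record` wrapper element are made up for illustration; only the element names and the `http://purl.org/dc/elements/1.1/` namespace come from Dublin Core.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 simple Dublin Core elements, in the order presented on the slides.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def to_xml(record):
    """Serialize a dict of element name -> list of values as simple DC XML."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for element in DC_ELEMENTS:
        # Missing elements are skipped; repeated values become repeated tags.
        for value in record.get(element, []):
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample record: no creator, two subjects.
sample = {
    "title": ["Main Street, looking north"],
    "subject": ["Streets", "Commercial buildings"],
    "date": ["1920-06-30"],
    "type": ["Image"],
}
xml_text = to_xml(sample)
print(xml_text)
```

Keeping every value in a list, even single ones, avoids special-casing repeatable elements later.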
15Qualified Dublin Core
- DCMI developed qualifiers to refine the use/meaning of the simple DC fields, making terms more specific
- CONTENTdm allows for Qualified Dublin Core
- Examples
- Qualifiers for Date: Date-Created, Date-Issued
- Qualifiers for Relation (sometimes used in pairs):
- Relation-References, Relation-Is Referenced By
- Relation-Is Part Of, Relation-Has Part
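Because a qualifier only refines a simple element, a qualified field can always be read as its simple element (DCMI calls this the "dumb-down" principle). A minimal sketch, assuming field names are written `Element-Qualifier` as in the examples above; the record layout is hypothetical:

```python
def simple_element(qualified_name):
    """Return the simple element for a name written as 'Element-Qualifier'."""
    return qualified_name.split("-", 1)[0].strip()


def dumb_down(record):
    """Collapse qualified fields into their simple elements, keeping all values."""
    simple = {}
    for name, values in record.items():
        simple.setdefault(simple_element(name), []).extend(values)
    return simple


# Made-up qualified record.
qualified = {
    "Title": ["Scene from the American Revolution"],
    "Date-Created": ["1850"],
    "Date-Issued": ["1851"],
    "Relation-Is Part Of": ["Smallville Paintings"],
}
print(dumb_down(qualified))
```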
16Controlling the data
- Dublin Core has no inherent rules for formatting content within the fields.
- Data should be formatted within the fields so that records are interoperable with records from other collections.
- The who, what, when, and where should be standardized
- Controlled vocabularies, thesauri
- Authority Files
- Encoding standards (e.g. for languages and dates)
- Data content standards
- AACR2, DACS, CCO
17Controlling Names
- Personal, Corporate, Geographic Names
- Examples
- Library of Congress Name Authority File
- Union List of Artist Names (The Getty)
- Thesaurus of Geographic Names (The Getty)
- Provide preferred format of name
- Typically have cross-references
18Controlled Vocabularies/Thesauri
- Used for subject indexing
- List of authorized terms, cross-references, and scope notes
- Cross-references
- Synonym control: "see" references
- Related terms: "see also" references
- Narrower and Broader terms
- Examples
- Thesaurus for Graphic Materials
- Library of Congress Subject Headings
- Sears Subject Headings
- Chenhall's Nomenclature
- Art & Architecture Thesaurus
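The cross-reference types above can be pictured as a small data structure. This is only a sketch: the terms are invented for illustration and are not copied from TGM, LCSH, or any other published vocabulary.

```python
# Hypothetical mini-thesaurus keyed by authorized term.
THESAURUS = {
    "Automobiles": {
        "use_for": ["Cars", "Autos"],  # synonym control ("see" references)
        "related": ["Garages"],        # related terms ("see also" references)
        "broader": ["Vehicles"],
        "narrower": ["Taxicabs"],
    },
    "Vehicles": {
        "use_for": [],
        "related": [],
        "broader": [],
        "narrower": ["Automobiles"],
    },
}


def preferred_term(term):
    """Return the authorized term for an entry term, or None if unknown."""
    wanted = term.casefold()
    for authorized, entry in THESAURUS.items():
        if wanted == authorized.casefold():
            return authorized
        if wanted in (synonym.casefold() for synonym in entry["use_for"]):
            return authorized
    return None


print(preferred_term("cars"))
```

Resolving every entry term to one authorized form is what lets a search for "Cars" and a search for "Automobiles" retrieve the same items.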
19Metadata is not a perfect science
- Subjectivity, biases, different views, different content, different formats, different purposes, different audiences = different results
- The simplicity of and lack of rules associated with the Dublin Core is both a blessing and a curse!
- The nature of the Dublin Core allows it to be applied in a variety of ways
- DC implementers develop style guides that impose rules on the creation of the data within the fields (specify use of certain vocabularies and standards)
20Let's examine some records
21Creating Shareable Metadata
- The concept of shareable metadata comes from the OAI-PMH community.
- Shareable metadata uses standards and rules similar to those used by others, making records more interoperable.
- Think outside of your local box (organization)
- Include information that is assumed in local context
- Exclude information that only has meaning in local context
- Record should be understandable on its own (when separated from the resource).
- http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC
22Principles of Good Metadata
- Good metadata should be appropriate to the materials in the collection, users of the collection, and intended, current and likely future use of the digital object.
- Good metadata supports interoperability.
- Good metadata uses authority control and content standards such as controlled vocabularies that are in line with user expectations to describe the content of objects and collocate related objects.
- A Framework of Guidance for Building Good Digital Collections, NISO
- http://www.niso.org/framework/framework2.html
23Principles of Good Metadata, cont
- Good metadata includes a clear statement on the conditions and terms of use for the digital object.
- Good metadata records are objects themselves and therefore should have the qualities of good objects, including archivability, persistence, unique identification, etc. Good metadata should be authoritative and verifiable.
- Good metadata supports the long-term management of objects in collections.
- A Framework of Guidance for Building Good Digital Collections, NISO
- http://www.niso.org/framework/framework2.html
24Metadata as Communication
- H.P. Grice's maxims governing communication
- Make your contribution as informative as possible
- Do not make your contribution more informative than is required
- Do not say what you believe to be false
- Do not say that for which you lack adequate evidence
- Be relevant
- Avoid obscurity of expression
- Avoid ambiguity
- Be brief
- Be orderly
- "Structures, standards, and the people who make them meaningful" by David Bade
- http://www.loc.gov/bibliographic-future/meetings/docs/bade-may9-2007.pdf
25Tips for Metadata Success
26Tips for Metadata Success
- What are the mission, goals, and objectives of your project?
- Who is your audience? How much information will they need/want?
- Think about other potential uses/users.
- How much information do you already have? Do you have legacy data or are you starting from scratch?
27More Tips
- It is helpful to have an understanding of the entire digital collection before you begin.
- Ideally physical items should already be inventoried, cataloged, accessioned, etc.
- Do some research.
- If you don't know, don't guess.
28Still More Tips
- Take your time.
- Analyze the who, what, when, and where.
- What is the significance of the item?
- How will users find the item?
- How will you manage the item?
- What do users need to know to understand and use the item?
- How will you bring similar resources together?
- And one more tip.
29HAVE FUN!!
30Metadata for Tools of History
31Title
- The first things users see when searching or browsing are the image thumbnail and the title.
- How much information do you include in the title? Dates, locations?
- If an existing label or caption is not descriptive, consider creating your own title.
- Remember that the title is searchable. Are there keywords you should use?
32Creator
- Consult name authority files before creating your own heading.
- Union List of Artist Names
- Library of Congress Name Authority File
- If you find a heading for the person in an authority file, enter it exactly as you found it.
- Names should be inverted when creating your own heading. Include birth/death dates if known.
- Palmentiero, Jennifer B., 1971-
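For headings you have to create yourself, the inversion rule can be sketched as a small helper. The function name and its "last word is the surname" heuristic are assumptions for illustration; the heuristic fails for compound surnames and suffixes such as "Jr.", so treat the output as a draft to review, and always check the authority files first.

```python
def name_heading(full_name, birth=None, death=None):
    """Build an inverted heading such as 'Surname, Forenames, 1971-'.

    Assumes the last whitespace-separated word is the surname.
    """
    parts = full_name.split()
    if len(parts) < 2:
        heading = full_name.strip()
    else:
        heading = f"{parts[-1]}, {' '.join(parts[:-1])}"
    if birth or death:
        # An open-ended range (birth only) keeps the trailing hyphen.
        heading += f", {birth or ''}-{death or ''}"
    return heading


print(name_heading("Jennifer B. Palmentiero", birth=1971))
```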
33Dates
- Dates have always been problematic
- 3/4/07: March 4, 2007 vs. April 3, 2007
- 1920 June 30 vs. June 30, 1920
- Best practice is to use the ISO 8601 standard as defined in a profile by the W3C (W3CDTF): YYYY-MM-DD
- Uncertain dates (circa, approximate, date ranges, unknown dates)
- Original Date and Digital Date
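To make the ambiguity concrete, the sketch below converts the same string under two different local conventions and checks whether a value already has a W3CDTF date-only shape. The function names are illustrative, and uncertain dates (circa, ranges) are deliberately not handled here.

```python
import re
from datetime import date, datetime

# W3CDTF date-only forms: YYYY, YYYY-MM, or YYYY-MM-DD.
W3CDTF_DATE = re.compile(r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$")


def is_w3cdtf_date(text):
    """True if text is YYYY, YYYY-MM, or a real calendar date as YYYY-MM-DD."""
    if not W3CDTF_DATE.match(text):
        return False
    if len(text) == 10:
        try:
            date.fromisoformat(text)  # rejects dates such as 2007-02-30
        except ValueError:
            return False
    return True


def to_w3cdtf(text, source_format):
    """Convert a date written in a known local format to YYYY-MM-DD."""
    return datetime.strptime(text, source_format).date().isoformat()


us_reading = to_w3cdtf("3/4/07", "%m/%d/%y")  # month first: March 4, 2007
eu_reading = to_w3cdtf("3/4/07", "%d/%m/%y")  # day first: April 3, 2007
print(us_reading, eu_reading)
```

The conversion only works when you already know which convention the source used, which is why the unambiguous form belongs in the record itself.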
34Contributors
- Someone who contributes to the intellectual content of the resource
- Illustrators
- Photographers of photographs in books, articles, etc.
- Filled-in forms (e.g. government forms)
- Reproductions
- Photographs, postcards of works of art (paintings, sculpture, etc.)
- Photographs of architecture
- Know your audience
35Publisher
- Original Publisher and Digital Publisher
- HRVH uses Publisher.Original if the original item was published and if publisher is known
- Newspapers and Clippings
- Books
- Postcards
- Brochures
- Digital Publisher is the organization responsible for making the digital item available (that's you!).
36Description
- What terms need to be included to help users find the resource? Think about synonyms for subject terms.
- What information do they need to understand the resource?
- Identify, interpret, both?
- Don't guess or make assumptions.
- How much is too much?
37Subject
- Some general words about subject headings
- Subjects can be topics or names (personal, corporate, geographic).
- Concentrate on item in hand.
- Assign terms from controlled vocabularies and thesauri; this is important for bringing similar items together.
- Include form/genre term(s)
- Best practice is to identify source of term (LCSH, TGM, AAT, etc.)
38Subject Headings - Thesaurus for Graphic Materials
- The TGM comes bundled with CONTENTdm.
- The TGM is also available free online.
- Use the list of TGM terms in CONTENTdm in conjunction with TGM online.
- Let's explore the TGM
39Other sources for subject headings
- TGM is not exhaustive. You may find that you need to use other vocabularies
- Library of Congress Subject Headings
- Sears Subject Headings
- Art & Architecture Thesaurus (The Getty)
- Chenhall's Nomenclature
- Use a vocabulary that you are familiar with, that you have access to, and that is appropriate for your collection.
40Local subject headings
- Non-standard terms used locally by your organization should be avoided if concepts can be expressed using established, standard headings.
- Use locally created subject headings sparingly and consistently.
- Examples
- GOOD: Black dirt farming (a relevant and important concept in our region, not represented in thesauri and controlled vocabularies)
- NOT SO GOOD: Slavery/Bondage/Indentures (for an image of an ankle iron). These are three related, but different concepts. Better to use Slavery and Shackles as two separate terms from the TGM.
41Subject - Personal Names
- Enter a personal name when a person is the SUBJECT of the resource.
- Consult authority files before creating your own heading.
- When you have to create your own heading, invert the name and use the fullest form
- Palmentiero, Jennifer B., 1971-
42Personal Names cont
- Choosing among different names (nicknames, married vs. maiden names, multiple marriages, name changes)
- What about a person who has had multiple names throughout her life and who is not listed in the LOC name authority?
- For example: Born Elnora Stephanie Fothe, 1925. School changed given name to Eleanor, which she continued to use throughout her life. First marriage to a Mr. Otto. Second marriage to Willard Haynes Patrick. Third marriage to Reed Dean. Nicknames "Crisco Kid", "Ellie", "Ed." Died 1997.
43DACS for construction of name headings
- Determine the name by which a person is commonly known from the following sources and in the order of preference given:
- the name that appears most frequently in the person's published works (if any)
- the name that appears most frequently in the archival materials being described
- the name that appears in reference sources
- the latest name
- DACS, Chapter 12
44More from DACS
- If a person's name shows a nickname in quotation marks or within parentheses as part of other forename(s), omit the nickname in formulating the heading
- Name used: Martin (Bud) Schulman
- Heading: Schulman, Martin
- If a married woman's name shows her own forenames in parentheses as part of her married name, omit the parenthesized elements in formulating the heading
- Name used: Mrs. John A. (Edna I.) Spies
- Heading: Spies, John A., Mrs.
- DACS, Chapter 12
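The nickname rule lends itself to a short sketch. The helper names are hypothetical, the surname is again assumed to be the last word, and the married-name rule ("Mrs. John A. (Edna I.) Spies") is not covered because it needs the title moved as well.

```python
import re


def strip_nickname(name):
    """Drop a nickname given in parentheses or double quotes inside a name."""
    cleaned = re.sub(r'\([^)]*\)|"[^"]*"', " ", name)
    return " ".join(cleaned.split())  # collapse the leftover double space


def heading_without_nickname(name):
    """Invert 'Forenames Surname' after removing any nickname."""
    parts = strip_nickname(name).split()
    if len(parts) < 2:
        return " ".join(parts)
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


print(heading_without_nickname("Martin (Bud) Schulman"))
```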
45Subject - Corporate Names
- Enter a corporate name when a corporate entity is the SUBJECT of the resource
- Consult LCNAF
- What is considered a corporate entity? List available here: http://www.itsmarc.com/crs/auth1320.htm
46Subject - Geographic Names
- Enter a geographic name when a geographic location is a SUBJECT of the resource
- Avoid using when the location is not explicitly represented
- Thesaurus of Geographic Names or LC?
47SCRLC Topics
- Broad Topic categories for browsing
- Not intended to replace more specific subject headings
- Assign 1-3 as appropriate
48Language
- Language of the content of the resource (not the metadata record)
- Use a language code for text resources
- Assign three-letter code from ISO 639-2
- Full word for the language may be used in the Description field
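A lookup table is usually enough for assigning language codes. The sketch below holds only a small hand-picked subset of ISO 639-2 bibliographic codes; the full list is maintained by the Library of Congress, and the function name is an assumption.

```python
# Partial, hand-picked subset of ISO 639-2 (bibliographic) codes.
ISO_639_2 = {
    "english": "eng",
    "french": "fre",
    "german": "ger",
    "spanish": "spa",
    "dutch": "dut",
    "italian": "ita",
}


def language_code(language_name):
    """Return the three-letter code for a language name, or None if not listed."""
    return ISO_639_2.get(language_name.strip().casefold())


print(language_code("Dutch"))
```

Returning `None` for anything outside the table forces a manual lookup instead of a guessed code.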
49Coverage
- Spatial location or temporal (time) period of the content of the resource.
- Coverage.Spatial: used mainly with maps to record geographic coordinates.
- Coverage.Temporal: may be used if the creation date of the resource is different from the date/time period represented in the content of the resource.
- Example: Painting created in 1850 depicting a scene from the American Revolution
- Date.Original: 1850
- Coverage.Temporal: 18th Century OR 1775-1783
50Format
- Original Format
- Format.Original in HRVH provides users with physical details of original resource
- Very difficult to standardize in a shared metadata environment
- Local practices vary
- Use of different vocabularies and standards
- Digital Format
- Format.Digital in HRVH provides users with format of digital resource
- image/jpeg
- image/jp2
- application/pdf
- video/mpeg
- audio/mp3
51Type
- DCMI Type Vocabulary used in HRVH
- Examples from DCMI Type Vocabulary
- Image
- Still Image
- Moving Image
- Text
- Physical Object
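The labels above are written with spaces, while the DCMI Type Vocabulary's term names are single words (for example `StillImage`). A minimal sketch of normalizing and validating a Type value follows; the function name is an assumption.

```python
# Term names of the DCMI Type Vocabulary.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}


def dcmi_type(label):
    """Map a label such as 'Still Image' to its term name, or None if not in the vocabulary."""
    candidate = "".join(label.split()).casefold()
    for term in DCMI_TYPES:
        if term.casefold() == candidate:
            return term
    return None


print(dcmi_type("Still Image"))
```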
52Relation
- Use qualifiers when appropriate to specify the nature of the relationship between two resources
- An item references another (letter referencing a photograph)
- An item is part of a larger resource (page of a book, newspaper clipping)
- The Relation element should not be used to bring together items with similar subjects/content. Subject headings and keywords are used for this purpose.
53Source
- Use the Source field to record information about the source collection
- Name of physical collection the original item is part of
- Information that will help you locate the original item (box/folder numbers, accession numbers, call numbers)
54Resource Identifier
- Most of your items will not have a standard number associated with them (ISBN, ISSN, etc.)
- File names are typically used in HRVH
55Digital Collection
- Map to Dublin Core Relation-Is Part Of
- Important field if your collection will have sub-collections
- Topic/Theme-based
- Format-based (map collections, photograph collections)
- Institution-based (for consortia groups)
- Allows users to find all items in that collection
- Consistency in data entry is important
- Should be unique: People, Railroads, Buildings are too broad (Smallville Buildings is better)
56Holding Institution
- Map to Dublin Core Source
- Enter the name of your organization consistently (use a template in CONTENTdm)
57Contact Information
- Include information that will easily allow users to contact your organization
- Avoid email addresses
- May include link to your organization's Web site
58Rights
- Identifies rights holders
- Lets users know what they can do (or not do) with a resource
- May be a statement or a link to a statement on your Web site
59Technical Data
- Information about the digital file (Master file)
- File size
- File dimension
- File format
- Compression
- Information about the digital conversion process
- Capture device (scanner, camera) make and model
- Software used
- Resolution
- Bit-depth
- Was image altered/enhanced for the web? Consider
including this information
60Transcripts
- Map to Dublin Core Description
- Must have this field if you want your multi-page
text resources to be full-text searchable in
CONTENTdm.
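Several of the local fields above are mapped to Dublin Core. The sketch below collects only the three mappings stated on the slides into a crosswalk table. The exact local field labels, the sample values, and the choice to pass unmapped fields through unchanged are assumptions for illustration.

```python
# Local field -> Dublin Core field, limited to the mappings named on the slides.
FIELD_MAP = {
    "Digital Collection": "Relation-Is Part Of",
    "Holding Institution": "Source",
    "Transcript": "Description",
}


def crosswalk(local_record):
    """Regroup a local record's values under their mapped Dublin Core names."""
    mapped = {}
    for field, value in local_record.items():
        target = FIELD_MAP.get(field, field)  # unmapped fields keep their name
        mapped.setdefault(target, []).append(value)
    return mapped


# Made-up local record.
local = {
    "Title": "Letter to the editor",
    "Description": "Handwritten letter, two pages.",
    "Transcript": "Dear Sir, ...",
    "Holding Institution": "Smallville Historical Society",
    "Digital Collection": "Smallville Buildings",
}
print(crosswalk(local))
```

Note that the transcript and the description end up as two values of the same Dublin Core element, which is allowed because every element is repeatable.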
61Quality Control
62Quality Control
- Check for accurate, complete, and consistent information.
- Watch for typos. They do happen and users will find them.
- Were standards and best practices adhered to?
- Will users be able to find and understand the resources using the metadata you created?
63Quality Control
- Review your own records.
- Ideally have someone else review them. Reviewer
should know what is being described without
having to look at the item.
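Some consistency checks can be automated before a human review. The sketch below is one possible set of checks, not a complete quality-control process: the list of required fields is a hypothetical local rule, and the date check only tests the YYYY, YYYY-MM, or YYYY-MM-DD shape.

```python
import re

DATE_SHAPE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")  # YYYY, YYYY-MM, YYYY-MM-DD

# Hypothetical local rule: fields every record is expected to carry.
REQUIRED = ("Title", "Date.Original", "Rights")


def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field in REQUIRED:
        if not record.get(field, "").strip():
            problems.append(f"missing {field}")
    for field, value in record.items():
        if value != value.strip() or "  " in value:
            problems.append(f"stray whitespace in {field}")
    date_value = record.get("Date.Original", "").strip()
    if date_value and not DATE_SHAPE.match(date_value):
        problems.append(f"Date.Original not in YYYY-MM-DD form: {date_value}")
    return problems


# Made-up record with three problems.
draft = {"Title": "Main Street ", "Date.Original": "June 30, 1920"}
for problem in check_record(draft):
    print(problem)
```

A script like this catches mechanical slips only; whether a description is accurate and understandable still needs a second reader.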
64Metadata as Communication
- H.P. Grice's maxims governing communication
- Make your contribution as informative as possible.
- Do not make your contribution more informative than is required.
- Do not say what you believe to be false.
- Do not say that for which you lack adequate evidence.
- Be relevant.
- Avoid obscurity of expression.
- Avoid ambiguity.
- Be brief.
- Be orderly.
- "Structures, standards, and the people who make them meaningful" by David Bade
- http://www.loc.gov/bibliographic-future/meetings/docs/bade-may9-2007.pdf
65Final Thoughts
- Metadata is time consuming
- Metadata is important
- Metadata is fun
- Contact me at any time with questions
- jennifer_at_senylrc.org
- 845.883.9065 x16
- Thank You!!