Title: An Introduction to Metadata
1An Introduction to Metadata and (some) Metadata Standards
Making Sense of Metadata, Society of Archivists EAD/Data Exchange SIG
London, Thursday 17 November 2005
Pete Johnston, Research Officer, UKOLN, University of Bath
UKOLN is supported by
www.bath.ac.uk
2An Introduction to Metadata and (some) Metadata
Standards
- Metadata in action: an example
- What is metadata?
- Some metadata standards
- Current issues, challenges
3Metadata in action: an example
4A metadata-driven/metadata-dependent device!
5[Screenshots: mp3 player menus listing Albums, Artists, Genres and Playlists]
6[Screenshot: "Now Playing" screen showing Herbst; Barbara Morgenstern and Robert Lippok; Seasons; 1 of 4; 0:01:38]
7- Simple metadata describing each mp3 file
- Track title
- Artist name
- Album title
- Sequence on album
- Genre
- Length
- Sequence in playlist
- Used to find, select, organise, access files
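A minimal sketch of the kind of record listed above, and of how it might be used to find and organise files. The class name, field names and sample values are illustrative assumptions, not taken from any particular player. A playlist is then just an ordered list of such records, which gives "sequence in playlist".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """Simple metadata for one mp3 file (field names are illustrative)."""
    title: str
    artist: str
    album: str
    album_position: int   # sequence on album
    genre: str
    length_seconds: int


# Hypothetical sample records, invented for illustration.
LIBRARY = [
    Track("Second Song", "Artist A", "First Album", 2, "Ambient", 245),
    Track("First Song", "Artist A", "First Album", 1, "Ambient", 198),
    Track("Other Song", "Artist B", "Other Album", 1, "Electro", 312),
]


def find_by_genre(tracks, genre):
    """Select the tracks whose genre matches, ignoring case."""
    return [t for t in tracks if t.genre.lower() == genre.lower()]


def album_order(tracks, album):
    """Organise one album's tracks by their sequence on the album."""
    return sorted((t for t in tracks if t.album == album),
                  key=lambda t: t.album_position)


ambient = find_by_genre(LIBRARY, "ambient")
ordered = album_order(LIBRARY, "First Album")
print([t.title for t in ordered])
```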
10http://www.last.fm/
15http://www.bloglines.com/
17The mp3 example
- Track metadata obtained from network services
- supplied by users
- Metadata embedded in mp3 file (ID3)
- Extracted/indexed by desktop mp3 player, portable mp3 player
- discovery, management
- Used in "play" metadata posted to network services
- basis for statistics, recommendation services, "collaborative filtering"
18The mp3 example
- Metadata about different types of resources
- Tracks, albums, artists, "plays", people.
- Metadata obtained from various sources
- Created by different agents
- Metadata moving between different applications/services
- Metadata supporting multiple functions
- Effective (re)use of metadata
- minimal user effort
- "making (meta)data work harder" (Lorcan Dempsey)
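As one illustration of metadata being reused with minimal user effort, "play" records posted by players could drive a simple "people who played this also played" list. This is only a co-occurrence count, a much cruder technique than the collaborative filtering real services use. The listener names, track names and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical "play" records: (listener, track) pairs posted by players.
PLAYS = [
    ("ann", "Track A"), ("ann", "Track B"), ("ann", "Track C"),
    ("bob", "Track A"), ("bob", "Track B"),
    ("cat", "Track B"), ("cat", "Track C"),
]


def co_play_counts(plays):
    """Count, for each pair of tracks, how many listeners played both."""
    by_listener = {}
    for listener, track in plays:
        by_listener.setdefault(listener, set()).add(track)
    counts = Counter()
    for tracks in by_listener.values():
        for a, b in combinations(sorted(tracks), 2):
            counts[(a, b)] += 1
    return counts


def also_played(plays, track):
    """Rank other tracks by how many listeners of `track` also played them."""
    ranked = Counter()
    for (a, b), n in co_play_counts(plays).items():
        if a == track:
            ranked[b] += n
        elif b == track:
            ranked[a] += n
    return ranked.most_common()


suggestions = also_played(PLAYS, "Track A")
print(suggestions)
```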
19What is metadata?
20What is metadata? Some simple definitions
- Structured data about data.
- Dublin Core Metadata Initiative FAQ, 2005
- http://dublincore.org/resources/faq/
- Machine-understandable information about Web resources or other things.
- Tim Berners-Lee, W3C, 1997
- http://www.w3.org/DesignIssues/Metadata
21"Web resources or other things"
- Metadata might be "about" anything!
- HTML documents
- digital images
- databases
- books
- museum objects
- archival records
- metadata records
- Web sites
- collections
- services
- physical places
- people
- organisations
- works
- formats
- concepts
- events
22What is metadata? Towards a "functional" view
- Data associated with objects which relieves their potential users of having to have full advance knowledge of their existence or characteristics.
- Lorcan Dempsey and Rachel Heery, "Metadata: a current view of practice and issues", 1998
- http://www.ukoln.ac.uk/metadata/publications/jdmetadata/
23What is metadata? Towards a "functional" view
- Structured data about resources that can be used to help support a wide range of operations.
- Michael Day, "Metadata in a Nutshell", 2001
- http://www.ukoln.ac.uk/metadata/publications/nutshell/
24What might metadata "say"?
- What is this called?
- What is this about?
- Who made this?
- When was this made?
- Where do I get (a copy of) this?
- When does this expire?
- What format does this use?
- Who is this intended for?
- What does this cost?
- Can I copy this?
- Can I modify this?
- What are the component parts of this?
- What else refers to this?
- What did "users" think of this?
- (etc!)
25What operations/functions?
- resource disclosure and discovery
- resource retrieval, use
- resource management, including preservation
- verification of authenticity
- intellectual property rights management
- commerce
- content-rating
- authentication and authorisation
- personalisation and localisation of services
- (etc!)
26What operations/functions?
- Different functions, different metadata
- Metadata (and metadata standards) sometimes classified according to function
- Descriptive: primarily for discovery, retrieval
- Administrative: primarily for management
- Structural: relationships between component parts of resources
- Contextual: relationships between resources
- No "one size fits all" solution!
27Where is metadata?
Metadata embedded in resource
e.g. ID3 metadata in MP3; meta elements in HTML docs; TEI header; summary properties in word processor docs; IPTC, EXIF data in image formats
- Can resource support embedding of metadata?
- Does metadata creator have write access to resource?
- Can metadata consumer extract embedded metadata?
- What happens when resource deleted?
- Metadata about aggregates of resources?
- Metadata about people, places, concepts?
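To make "can metadata consumer extract embedded metadata?" concrete, here is a minimal sketch that reads an ID3v1 tag, the simplest ID3 variant: a fixed 128-byte block at the end of the file that begins with "TAG". Many files carry the richer ID3v2 format instead, which this sketch does not handle. The function name and sample values are assumptions made for illustration.

```python
import struct

# ID3v1 layout: "TAG", title, artist, album, year, comment, genre code.
ID3V1_LAYOUT = "<3s30s30s30s4s30sB"


def _text(raw):
    """Decode a fixed-width field, dropping null/space padding."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def read_id3v1(data):
    """Return ID3v1 fields from the bytes of an mp3 file, or None if absent."""
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = struct.unpack(
        ID3V1_LAYOUT, data[-128:])
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "genre_code": genre,  # numeric index into the ID3v1 genre list
    }


# Hypothetical file content: some stand-in audio bytes followed by a tag.
sample = b"\x00" * 16 + struct.pack(
    ID3V1_LAYOUT, b"TAG", b"Example Title", b"Example Artist",
    b"Example Album", b"2005", b"", 12)
tag = read_id3v1(sample)
print(tag)
```

Because the tag sits inside the file, deleting the file deletes the metadata too, which is one of the questions the slide raises.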
28Where is metadata?
Metadata record as separate object; record identifier embedded in resource
e.g. link rel="meta" elements in HTML docs
- Metadata record may be remote from resource
- Can resource support embedding of link?
- Does metadata creator have write access to resource?
- Can metadata consumer extract link to metadata record?
- What happens when resource deleted?
- Metadata about aggregates of resources?
- Metadata about people, places, concepts?
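A minimal sketch of a consumer handling an HTML document, using Python's standard html.parser: it collects metadata embedded in meta elements and also the links to separate metadata records given by link elements with rel="meta". The class name, sample document and example.org URL are hypothetical.

```python
from html.parser import HTMLParser


class MetadataFinder(HTMLParser):
    """Collect <meta name=...> values and <link rel="meta"> targets."""

    def __init__(self):
        super().__init__()
        self.embedded = {}      # metadata carried inside the document
        self.record_links = []  # pointers to separate metadata records

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and "name" in attributes:
            self.embedded[attributes["name"]] = attributes.get("content", "")
        elif tag == "link":
            rel_values = (attributes.get("rel") or "").lower().split()
            if "meta" in rel_values and attributes.get("href"):
                self.record_links.append(attributes["href"])


# Hypothetical document used only to exercise the parser.
PAGE = """
<html><head>
  <title>Example page</title>
  <meta name="DC.title" content="Example page">
  <meta name="DC.creator" content="Example Author">
  <link rel="meta" href="http://example.org/records/page1.rdf">
  <link rel="stylesheet" href="style.css">
</head><body></body></html>
"""

finder = MetadataFinder()
finder.feed(PAGE)
print(finder.embedded)
print(finder.record_links)
```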
29Where is metadata?
Metadata record as separate object; resource identifier in metadata record
e.g. (lots!)
- Metadata record may be remote from resource
- Does not require embedding of metadata or link
- Does not require metadata creator to have write access to resource
- Metadata record created independently of resource; possibly multiple records
- Metadata consumer uses metadata records independently of resource
- Metadata record may persist after resource deleted
- Metadata record can describe anything (with identifier)
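A minimal sketch of this third arrangement: records live in their own store and point at the thing described by its identifier. The class, method names and identifiers are hypothetical. The point is that several agents can describe the same resource, and that the resource need not be digital or even still exist.

```python
from collections import defaultdict


class RecordStore:
    """Metadata records held separately from the resources they describe."""

    def __init__(self):
        self._by_resource = defaultdict(list)

    def add(self, resource_id, source, properties):
        """Store one record; `resource_id` identifies what it is about."""
        record = {"about": resource_id, "source": source, **properties}
        self._by_resource[resource_id].append(record)
        return record

    def records_about(self, resource_id):
        """Return every record describing the identified resource."""
        return list(self._by_resource.get(resource_id, []))


store = RecordStore()

# Two agents describe the same (hypothetical) Web document independently.
doc = "http://example.org/report.pdf"
store.add(doc, "cataloguer", {"title": "Annual report"})
store.add(doc, "user", {"tag": "useful"})

# A record about a person: no embedding is possible, only an identifier.
store.add("http://example.org/people/42", "cataloguer", {"name": "A. Person"})

print(store.records_about(doc))
```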
30Metadata as managed resource
- Metadata
- may be used independently of resource
- may grow/change independently of resource
- may be used in different subsets, multiple formats
- may be the subject of metadata!
- requires management
- Metadata typically stored in some form of database, repository
- Exposed/exported as required
31Metadata as managed resource
32Who/what creates metadata?
- Information professionals ("cataloguers")
- Resource creators
- Resource managers
- Resource distributors/publishers
- Indexing/abstracting services (and similar)
- Resource users
- Software applications
- Probably others I've forgotten
33User-created metadata
- Growing interest in user-created metadata
- user annotation, ratings, comments, "reviews"
- e.g. Amazon, OCLC OpenWorldCat
- "tagging", folksonomy
- e.g. Flickr, del.icio.us
- Capture user perceptions of resources
- Capture user knowledge of resources
- Questions of authority, accuracy, trust, etc
34Application-captured/generated metadata
- Human metadata creation costs time/effort/money
- "experts" cost even more!
- Software applications can obtain metadata from
- operating system, Web server etc
- size, MIME types etc
- resource itself
- email headers etc
- metadata created by authoring applications (e.g. MS Word)
- automated analysis of resource content (e.g. citation analysis, keyword extraction, automated classification)
- usage records, transaction logs
- e.g. people who bought/used/played this also bought these
- "joining up" metadata from different sources
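A minimal sketch of application-captured metadata using only the Python standard library: file size and a guessed MIME type from the operating system and file name, and a few headers from an email message. The function names, file name and message are hypothetical.

```python
import mimetypes
import os
import tempfile
from email import message_from_string


def file_metadata(path):
    """Capture metadata the operating system already knows about a file."""
    info = os.stat(path)
    mime_type, _encoding = mimetypes.guess_type(path)
    return {
        "size_bytes": info.st_size,
        "mime_type": mime_type,       # guessed from the file extension
        "modified": info.st_mtime,    # seconds since the epoch
    }


def email_metadata(raw_message):
    """Pull descriptive headers out of the resource itself."""
    message = message_from_string(raw_message)
    wanted = ("From", "Date", "Subject")
    return {name: message[name] for name in wanted if message[name] is not None}


# Exercise both functions on throwaway, hypothetical content.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")
    with open(path, "wb") as handle:
        handle.write(b"hello")
    captured = file_metadata(path)

headers = email_metadata("From: a@example.org\nSubject: Test\n\nBody text")
print(captured["size_bytes"], captured["mime_type"], headers)
```

None of this needs a human cataloguer, which is the cost argument the slide makes. The values are only as good as their source, for example a MIME type guessed from a misleading file extension.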
35Some metadata standards
36Metadata standards
- Typically defined by "resource management communities"
- Different traditions, perspectives, functional requirements
- Typically comprise
- A "conceptual model" (sometimes not explicit)
- A set of named components ("terms", "elements" etc) and documentation on their meaning and use
- A specification of how to represent a metadata instance in a digital format (binding)
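To illustrate the difference between a set of terms and a binding, this sketch serialises one description in two formats, JSON and XML. The element names (record, title, creator, date) and the layout of both formats are assumptions for illustration and do not follow any published standard's binding. The values come from this presentation's title slide.

```python
import json
import xml.etree.ElementTree as ET

# One metadata instance, independent of any serialisation.
description = {
    "title": "An Introduction to Metadata and (some) Metadata Standards",
    "creator": "Pete Johnston",
    "date": "2005-11-17",
}


def to_json(instance):
    """Hypothetical JSON binding: one object, one key per term."""
    return json.dumps(instance, sort_keys=True)


def to_xml(instance):
    """Hypothetical XML binding: one child element per term."""
    root = ET.Element("record")
    for name, value in sorted(instance.items()):
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Read the XML binding back into the same instance."""
    return {child.tag: child.text for child in ET.fromstring(text)}


as_json = to_json(description)
as_xml = to_xml(description)
print(as_json)
print(as_xml)
```

The terms and their meaning stay the same in both outputs; only the representation changes.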
37Bibliographic Metadata standards
- Machine-Readable Cataloguing (MARC)
- primary library cataloguing standard
- supports discovery and management of library resources
- maintained by Library of Congress
- Metadata Object Description Schema (MODS)
- represents subset of MARC
- XML Schema
- maintained by Library of Congress
- ONIX
- information provided by publishers to retailers
- some use of ONIX to enhance library catalogue records
- maintained by EDItEUR/Book Industry Communication
38Archival/Records Management Metadata standards
- ISAD(G)
- not in itself machine-processable?
- but used as basis of database schemas in e.g. CALM
- Encoded Archival Description (EAD)
- metadata about archival records (and aggregations of records)
- may include some metadata about organisations, individuals
- Encoded Archival Context (EAC)
- metadata about organisations, individuals
- Records Management Metadata e.g.
- National Archives ERMS Metadata Standard
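A minimal sketch of reading an EAD-style description of a record aggregation and one of its components. The fragment is heavily simplified, has not been validated against the EAD DTD, and its reference codes and titles are invented. The element names used (archdesc, did, unitid, unittitle, unitdate, dsc, c01) are from EAD 2002.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical EAD-style fragment (not a complete or valid file).
EAD_FRAGMENT = """
<archdesc level="fonds">
  <did>
    <unitid>EX/1</unitid>
    <unittitle>Papers of an example organisation</unittitle>
    <unitdate>1900-1950</unitdate>
  </did>
  <dsc>
    <c01 level="series">
      <did>
        <unitid>EX/1/1</unitid>
        <unittitle>Minutes</unittitle>
      </did>
    </c01>
  </dsc>
</archdesc>
""".strip()


def describe(element):
    """Summarise one level of description from its <did> block."""
    did = element.find("did")
    return {
        "level": element.get("level"),
        "reference": did.findtext("unitid"),
        "title": did.findtext("unittitle"),
    }


root = ET.fromstring(EAD_FRAGMENT)
fonds = describe(root)
series = [describe(component) for component in root.iter("c01")]
print(fonds)
print(series)
```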
39Museum Metadata standards
- SPECTRUM
- Museum documentation standard
- Describes
- Procedures
- Information requirements ("units of information")
- Metadata about objects, events, agents etc
- CIMI XML Schema for SPECTRUM
- Maintained by mda
40Image Metadata standards
- VRA Core
- "works of visual culture as well as the images that document them"
- Image as visual representation of Work
- maintained by Visual Resources Association
- NISO Data Dictionary of Technical Metadata for Digital Still Images
- To facilitate technical interoperability, also management, curation/preservation
- Encoded/serialised using MIX XML Schema
41Government Metadata standards
- UK e-Government Metadata Standard
- based on Dublin Core
- also incorporates components from NA ERMS
- specifies constraints on values e.g. Integrated Public Sector Vocabulary
- primarily to support resource discovery, retrieval/access, some records management
- eGMS v3.0 provides large set of terms
- in practice, deployed in subsets
42Learning Metadata standards
- IEEE Learning Object Metadata (LOM)
- To support the disclosure/discovery and use/reuse of "learning objects"
- UK LOM Core as "application profile" of LOM
- IMS Specifications
- Learner Information Profile (people)
- Learning Design (learning activities etc)
- Enterprise (groups/classes etc)
- Resource List Interoperability (reading lists etc)
- etc!
43Multimedia Metadata standards
- MPEG-7
- to describe the content of audio-video streams
- "making audio-visual material as searchable as text"
- designed to be incorporated into the production process
- create metadata at various stages
- extensible through the use of a Description Definition Language (DDL)
- metadata may be embedded in resource or located separately
44Some current challenges
45Metadata standards interoperability
- Standardisation (mainly) within communities/domains
- but on the Web
- resources/metadata moving between/across "communities"
- services operating on metadata from multiple "communities"
46Metadata standards interoperability
- How to minimise costly, complex, lossy mappings/translations?
- The "railroad gauge dilemma"
- (Stuart Weibel, "Border Crossings", D-Lib, Jul 2005)
- How to maximise effective reuse of existing metadata?
- How to realise aspirations to extensibility, modularity?
- Does the W3C's Resource Description Framework (RDF) offer a solution?
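A minimal sketch of why mappings between standards are lossy: a crosswalk from a richer, domain-specific record to a small general-purpose element set carries over only the fields that have an equivalent. The source field names, the mapping table and the sample record are hypothetical and not taken from any published crosswalk.

```python
# Hypothetical crosswalk: richer source field -> simple target element.
CROSSWALK = {
    "unit_title": "title",
    "creator_name": "creator",
    "creation_dates": "date",
    "scope_and_content": "description",
}


def crosswalk(record, mapping):
    """Translate a record, reporting the fields that could not be mapped."""
    converted, dropped = {}, []
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            dropped.append(field)  # no equivalent: information is lost
        else:
            converted.setdefault(target, []).append(value)
    return converted, sorted(dropped)


# Hypothetical source record from an archival description system.
source = {
    "unit_title": "Example papers",
    "creator_name": "Example Organisation",
    "creation_dates": "1900-1950",
    "level": "fonds",
    "custodial_history": "Transferred in 1960",
}

converted, dropped = crosswalk(source, CROSSWALK)
print(converted)
print("lost in translation:", dropped)
```

Translating back from the simple record cannot restore the dropped fields, so each hop between "communities" risks losing more.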
47Summary
- Metadata is used almost everywhere
- Metadata enables people and software applications to do things
- Not only about "discovery"
- Different functions require different metadata
- Metadata creation is potentially costly
- Clarify functional requirements
- Exploit existing sources
- Many metadata standards established/emerging
- But challenges remain in working across standards, using standards in combination
48P.S.
49http://base.google.com/