Title: Lifecycle Metadata for Digital Objects
1Lifecycle Metadata for Digital Objects
- September 18, 2002
- Implementing Metadata in XML
2Review of orders of data
- First order: language (segmentation)
- Second order: encoding
- Third order: meaning
- Fourth order: function
- Fifth order: groups of 3 and/or 4
- Note that each order is meta with respect to
the one below and data with respect to the one
above (cf. Gödel)
- Hence you mark up the order you wish to
objectify and access (examples: TEI, EAD)
3Two fancy wrappers (what orders are involved?)
- The XML document as metadata repository
- XML document contains all the metadata
- Objects themselves are in separate files pointed
to by the document (XLinks)
- The XML document as the whole enchilada
- Object is marked up in XML too
- Metadata is added as additional elements to the
original object
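The first wrapper can be sketched in Python with the standard-library ElementTree parser. This is only an illustration: the element names (record, title, object) and the file path are hypothetical, not taken from any particular metadata standard; the XLink namespace URI is the standard one.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical metadata record: the object itself lives in a separate file.
RECORD = """\
<record xmlns:xlink="http://www.w3.org/1999/xlink">
  <title>Letter to the editor</title>
  <object xlink:type="simple" xlink:href="objects/letter-001.tif"/>
</record>
"""


def object_location(record_xml: str) -> str:
    """Return the href of the external object a metadata record points to."""
    root = ET.fromstring(record_xml)
    pointer = root.find("object")
    # Namespaced attributes are keyed as "{namespace-uri}localname".
    return pointer.get(f"{{{XLINK_NS}}}href")


location = object_location(RECORD)
print(location)
```

In the second wrapper the object's own text would sit inside the same document, so no pointer would be needed.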
4Why not mark up the object (i.e., place markup
within the object)?
- If the object is not a text!
- If the object is a text, but the text is too
complex to mark up in XML (the hierarchical model
doesn't suit everything: the overlap problem)
5Why mark up the object?
- If the object is a text
- If the text is well-formed as a hierarchical
structure (problem of overlaps not solved in XML)
- Advantage is that the object carries its own
metadata
6Best of both worlds
- XML metadata tags
- (Text) object marked up in XML
- Original (text) object pointed to in separate
file for preservation
7XML Syntax: rules for well-formed XML
- An element containing text or elements must have
start and end tags
- An empty element's tag must have a slash (/)
before the end bracket
- Case is significant
- All attribute values must be in quotes
- Elements may not overlap
- Isolated markup characters may not appear in
parsed content
- Element names may not use all characters
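Each rule above can be checked with any XML parser. A minimal sketch in Python, using the standard-library ElementTree parser; the sample strings are made up for illustration, one accepted and one rejected per rule.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# rule: (well-formed sample, sample that breaks the rule)
SAMPLES = {
    "start and end tags": ("<a><b>text</b></a>", "<a><b>text</a>"),
    "empty-element slash": ("<a><br/></a>", "<a><br></a>"),
    "case is significant": ("<a><b></b></a>", "<a><B></b></a>"),
    "quoted attributes": ('<a x="1"/>', "<a x=1/>"),
    "no overlap": ("<a><b><c/></b></a>", "<a><b><c></b></c></a>"),
    "no isolated markup": ("<a>1 &lt; 2</a>", "<a>1 < 2</a>"),
    "restricted names": ("<a1/>", "<1a/>"),
}

results = {
    rule: (is_well_formed(good), is_well_formed(bad))
    for rule, (good, bad) in SAMPLES.items()
}

for rule, (good_ok, bad_ok) in results.items():
    print(f"{rule}: accepted={good_ok}, rejected={not bad_ok}")
```

Well-formedness is only the syntax check; whether the elements are the right ones is a separate question answered by a DTD or schema.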
8What constitutes the XML environment?
- XML editor (note that it can't do anything
automatic until you load a DTD or schema or have
entered a number of elements)
- XML parser/validator
- Display program (e.g. browser)
- DTD or schema to define elements
- Style sheet for display of elements
- XSLT engine to convert to other formats (e.g.
database)
9The XML Document
- Document prologue
- XML declaration
- Document type declaration
- Points to root element
- Points to external standards (DTDs, namespaces)
- Document itself
- Bracketed by root element
- Contains elements, attributes, entities
10XML Declaration
- Gives version of XML
- <?xml version="1.0"?>
- Defines character encoding
- <?xml version="1.0" encoding="UTF-8"?>
- Indicates presence of other needed files
- <?xml version="1.0" encoding="UTF-8" standalone="no"?>
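To see what a parser actually reads from the declaration, here is a small Python sketch using the standard-library expat bindings and their XmlDeclHandler callback. The function name read_declaration and the root element are illustrative choices only.

```python
import xml.parsers.expat


def read_declaration(data: bytes) -> dict:
    """Collect what the XML declaration says, via expat's XmlDeclHandler."""
    found = {}

    def on_declaration(version, encoding, standalone):
        # expat reports standalone as 1 (yes), 0 (no), or -1 (not stated).
        found.update(version=version, encoding=encoding, standalone=standalone)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(data, True)
    return found


minimal = read_declaration(b'<?xml version="1.0"?><example/>')
full = read_declaration(
    b'<?xml version="1.0" encoding="UTF-8" standalone="no"?><example/>'
)
print(minimal)
print(full)
```

A standalone value of "no" tells the parser that declarations outside the document (such as an external DTD) may affect its content.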
11Document type declaration
- Points first to root element
- <!DOCTYPE example>
- Then points to any external source for definition
of document structure
- <!DOCTYPE example SYSTEM "c:\My Documents\classes\metadata\example.dtd">
- Then adds any overriding elements (internal
subset)
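The three parts of the document type declaration (root element, external source, internal subset) can be read back with expat's StartDoctypeDeclHandler. A sketch in Python; the relative file name example.dtd is a stand-in chosen for this illustration, and expat does not fetch the external DTD unless told to, so the file need not exist.

```python
import xml.parsers.expat

DOCUMENT = b"""<?xml version="1.0"?>
<!DOCTYPE example SYSTEM "example.dtd" [
  <!ELEMENT example (#PCDATA)>
]>
<example>text</example>
"""


def read_doctype(data: bytes) -> dict:
    """Report the root name, external identifiers, and internal-subset flag."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.update(
            root=name,
            system_id=system_id,
            public_id=public_id,
            internal_subset=bool(has_internal_subset),
        )

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(data, True)
    return found


doctype = read_doctype(DOCUMENT)
print(doctype)
```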
12Function of the DTD
- Document Type Definition: not expressed in XML
- Defines the language in which you will be talking
about objects and against which the XML markup
may be validated: the grammar of the XML document
that refers to it
- Equivalent to declaration of data types in a
programming language: allows you to define your
own types (a private, or SYSTEM, DTD)
- Or you can use a preexisting DTD (a PUBLIC DTD;
example: EAD)
13Element declarations in the DTD
- Occur within the DTD, or locally to give a
definition overriding the DTD
- <!ELEMENT name content-model>
- Content-models
- (#PCDATA) for character data
- (element, element, element) modified by
the occurrence indicators +, *, ?
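To make the occurrence indicators concrete (? for optional, * for zero or more, + for one or more), here is a deliberately small Python sketch that turns a flat sequence content model into a regular expression and checks an element's children against it. It is not a validator: it assumes a single comma-separated sequence with no nested groups, choices, or #PCDATA, and the element names (record, title, author, note, keyword) are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def model_to_regex(content_model: str) -> re.Pattern:
    """Translate e.g. "(title, author+, note?)" into a regex over child names."""
    inner = content_model.strip()[1:-1]  # drop the outer parentheses
    parts = []
    for item in inner.split(","):
        item = item.strip()
        suffix = item[-1] if item[-1] in "+?*" else ""
        name = item.rstrip("+?*")
        # Each child is written as "name " so names cannot run together.
        parts.append(f"(?:{re.escape(name)} ){suffix}")
    return re.compile("".join(parts))


def children_match(xml_text: str, content_model: str) -> bool:
    """Check the root element's children against the content model."""
    root = ET.fromstring(xml_text)
    sequence = "".join(child.tag + " " for child in root)
    return model_to_regex(content_model).fullmatch(sequence) is not None


MODEL = "(title, author+, note?, keyword*)"
ok = children_match(
    "<record><title/><author/><author/><keyword/></record>", MODEL
)
missing_author = children_match("<record><title/><keyword/></record>", MODEL)
wrong_order = children_match("<record><author/><title/></record>", MODEL)
print(ok, missing_author, wrong_order)
```

A real validating parser does this for the whole document tree, using the declarations in the DTD rather than a hand-written pattern.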
14Attribute declarations in the DTD
- All attributes for one element declared in
attribute list
- Gives attribute name, attribute's datatype,
attribute's behavior
- <!ATTLIST elementname
attname1 atttype1 attdesc1
attname2 atttype2 attdesc2
>
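A short Python sketch of an attribute list in use. The item element and its id and status attributes are hypothetical; the declaration sits in the internal subset so the example is self-contained. Note that Python's standard parser is non-validating: it applies declared default values but does not enforce #REQUIRED.

```python
import xml.etree.ElementTree as ET

# name / datatype / behavior for each attribute of <item>
DOCUMENT = """\
<!DOCTYPE item [
  <!ELEMENT item (#PCDATA)>
  <!ATTLIST item
    id     ID    #REQUIRED
    status CDATA "draft"
  >
]>
<item id="a1">text</item>
"""

root = ET.fromstring(DOCUMENT)
# "status" is not written in the document; the parser supplies the default.
attributes = dict(root.attrib)
print(attributes)
```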
15Entity declarations in the DTD
- General entities are like variables: they assign
a name and define a type for quoted text, text
from an external source, or other data from an
external source
- <!ENTITY title "Temporary crazy title">
- <!ENTITY logo SYSTEM "images/logo.gif" NDATA gif>
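The two declarations on this slide can be inspected with expat's EntityDeclHandler, which reports an internal entity's quoted value or an external entity's system identifier and notation. A Python sketch; the NOTATION declaration for gif is an addition made here so the NDATA reference has something to point to, and its identifier is an assumption.

```python
import xml.parsers.expat

DOCUMENT = b"""<!DOCTYPE example [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY title "Temporary crazy title">
  <!ENTITY logo SYSTEM "images/logo.gif" NDATA gif>
]>
<example/>
"""


def list_entities(data: bytes) -> dict:
    """Map each declared entity name to its value, source, and notation."""
    entities = {}

    def on_entity(name, is_parameter, value, base, system_id, public_id, notation):
        entities[name] = {
            "value": value,          # quoted text, or None if external
            "system_id": system_id,  # external source, or None if internal
            "notation": notation,    # set only for non-XML (NDATA) data
        }

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity
    parser.Parse(data, True)
    return entities


entities = list_entities(DOCUMENT)
for name, info in entities.items():
    print(name, info)
```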
16Elements in XML document
- Container elements
- <name attribute="value">chardata</name>
- Empty elements
- <name attribute="value" />
17Attributes in XML document
- Used to provide more details about an element
- <elementname attname="value">
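A brief Python sketch of how container elements, empty elements, and attributes look once parsed, using the standard-library ElementTree module and the placeholder names from the slides.

```python
import xml.etree.ElementTree as ET

container = ET.fromstring('<name attribute="value">chardata</name>')
empty = ET.fromstring('<name attribute="value" />')

# A container element carries character data; an empty element carries none.
print(container.text)
print(empty.text)

# Attributes are available on both kinds of element.
print(container.get("attribute"))
print(empty.attrib)
```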
18Entities in XML document
- Within the document, the entity name is used
preceded by an ampersand (and followed by a semicolon)
- <greeting> Dear &name;, </greeting>
- When the document is displayed or used, the
entity value will be substituted for the name
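The substitution can be observed directly. A minimal Python sketch with the standard-library parser; the entity value "Alice" is a made-up example, and the entity is declared in the internal subset so the document is self-contained.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """\
<!DOCTYPE greeting [
  <!ENTITY name "Alice">
]>
<greeting>Dear &name;,</greeting>
"""

# The parser replaces the entity reference with the declared value.
greeting = ET.fromstring(DOCUMENT).text
print(greeting)
```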
19Tools for working with XML
- Authoring, display
- Amaya (free W3C browser/authoring software)
- XML Cooktop (free XML authoring software)
- Display
- Internet Explorer
- Netscape 6
- Mozilla
- Database
- Apache Xindice
20XML Cooktop editor screenshot
21Amaya screenshot
22How does all this relate to databases?
- By defining a language for markup in XML, you
create categories
- Even freely-occurring objects can thus be found
and grouped (e.g., TEI grammatical markup)
- Compare to accepted method of placing text in a
relational table in order to process it
- Especially useful for regularly-occurring
metadata
- This is why the structure of a markup scheme is
so important: you get what you pay for
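As a rough illustration of finding and grouping marked-up objects, the Python sketch below collects words by grammatical category, much as a database query would group rows. The markup is only TEI-like: the s and w elements and the pos attribute values are assumptions made for this example, not a validated TEI document.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEXT = """\
<s>
  <w pos="det">The</w>
  <w pos="noun">archive</w>
  <w pos="verb">keeps</w>
  <w pos="noun">records</w>
</s>
"""


def group_by_category(xml_text: str) -> dict:
    """Group the text of every <w> element under its pos category."""
    groups = defaultdict(list)
    for word in ET.fromstring(xml_text).iter("w"):
        groups[word.get("pos")].append(word.text)
    return dict(groups)


groups = group_by_category(TEXT)
print(groups)
```

Only what the markup scheme names can be retrieved this way, which is why the choice of categories matters.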