Title: Digital Libraries INFO 653 Week 6 Xia Lin College of Information Science and Technology Drexel University
1 Digital Libraries, INFO 653, Week 6
Xia Lin, College of Information Science and Technology, Drexel University
2 Content Organization
- From metadata to subject access
- From thesaurus to KOS (Knowledge Organization Systems)
- From digital collections to Semantic Web
3 Collection Building
- Important reading
- Creating a Framework of Guidance for Building Good Digital Collections (2002).
- By Timothy W. Cole
- First Monday, 7(5), May 2002.
4 Principles of Good Digital Objects
- A good digital object will be produced in a way that ensures it supports collection priorities.
- A good object is persistent.
- A good object is digitized in a format that supports intended current and likely future use, or that supports the development of access copies that support those uses.
- A good object will be named with a persistent, unique identifier that conforms to a well-documented scheme.
- A good object should be authenticated.
- A good object should be associated with metadata.
5 Principles of Good Metadata
- Good metadata should be appropriate to the materials in the collection, users of the collection, and intended, current and likely use of the digital object.
- Good metadata supports interoperability.
- Good metadata uses standard controlled vocabularies to reflect the what, where, when and who of the content.
- Good metadata includes a clear statement on the conditions and terms of use for the digital object.
- Good metadata records are objects themselves and therefore should have the qualities of good objects, including archivability, persistence, unique identification, etc. Good metadata should be authoritative and verifiable.
- Good metadata supports the long-term management of objects in collections.
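A minimal sketch of how some of these principles might be checked automatically. The field names (loosely modeled on Dublin Core elements), the required-field list, the controlled vocabulary, and the sample record are all assumptions made for illustration; none of them come from the lecture or from the cited framework.

```python
# Hypothetical controlled vocabulary for the "subject" field (illustration only).
SUBJECT_VOCABULARY = {"Digital libraries", "Metadata", "Knowledge organization"}

# Fields this sketch treats as required; the selection is an assumption.
REQUIRED_FIELDS = ("identifier", "title", "creator", "date", "subject", "rights")


def check_record(record):
    """Return a list of problems found in a metadata record (empty if none)."""
    problems = []
    # Every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing field: {field}")
    # Subject terms must come from the controlled vocabulary.
    for term in record.get("subject", []):
        if term not in SUBJECT_VOCABULARY:
            problems.append(f"uncontrolled subject term: {term}")
    return problems


# Made-up sample record; the identifier scheme is hypothetical.
sample = {
    "identifier": "example:0001",
    "title": "Sample digital object",
    "creator": "Example, Author",
    "date": "2002-05",
    "subject": ["Metadata", "Folksonomy"],
    "rights": "Free for educational use",
}

print(check_record(sample))
```

In this sketch the only problem reported for the sample record is the subject term that is not in the vocabulary; a record without a rights statement would be flagged as missing its terms of use.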
6 Principles of Good Digital Collections
- A good digital collection is created according to an explicit collection development policy.
- Collections should be described so that a user can discover important characteristics of the collection.
- A collection should be sustainable over time.
- A good collection is broadly available and avoids unnecessary impediments to use.
- A good collection respects intellectual property rights.
- A good collection provides some measurement of use.
- A good collection fits into the larger context of significant related national and international digital library initiatives.
7 Important Reading
- Library functions, scholarly communication, and the foundation of the digital library: Laying claim to the control zone.
- By Ross Atkinson
- Library Quarterly, 66(3), 1996.
8 Do you agree?
- A library, digital or otherwise, is always a
highly selective subset of available information
objects, segregated and favored, to which access
is enhanced and to which the attention of
client-users is drawn in opposition to objects
excluded.
9 Control Zone
- The Web is an open zone, and a digital library is a control zone.
- By creating a control zone that selects some objects and excludes others, information professionals are using their expertise to point users to documents that hold a particular value.
- Add access value to those objects of higher content value from the perspective of the individual client-user.
- Responsibility and focus
10 Features of Control Zone
- Core definition
- Particularization (Specialization)
- Maintenance
- Certification
- Standardization
11 Controlled Zone
- Must be organized
- Classification
- Thesauri
- Ontologies
12 What are the disadvantages of a controlled zone?
- Selection might limit access.
- Selection might be biased.
- Knowledge organization supports pre-existing concepts, not new concepts.
- Not user-oriented: individual users' needs are different.
- Lost the tail?
13 From metadata to subject access
- Metadata is only the first step for subject access
- provide access entries for searching and browsing.
- make implicit knowledge explicit.
- make connections among related digital objects.
- reduce ambiguity.
14 Systems of Knowledge Organization for Digital Libraries
- The term knowledge organization systems (KOS) is intended to encompass all types of schemes for organizing information and promoting knowledge management.
- Includes traditional classification schemes, subject headings, thesauri, etc.
- Also includes less traditional schemes such as semantic networks and ontologies.
- All digital libraries use one or more KOS.
15 A Taxonomy of KOS
16 Common Characteristics of KOS
- KOS impose a particular view of the world.
- The same entity can be characterized in different ways depending on the KOS that is used.
- There must be sufficient commonality between the concept in a KOS and the real-world object to which it refers.
17 KOS Approaches
- KOS imposes a particular view of the world on a collection through:
- Providing a controlled list
- Controlling synonyms or equivalents
- Linking DL resources to related resources
- Making semantic relationships explicit
18 1. Provide a controlled list
- Examples
- Authority files
- Glossaries
- Dictionaries
- Gazetteers
- The controlled lists
- provide a standard vocabulary.
- eliminate ambiguity.
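A minimal sketch of how an authority file can supply a standard vocabulary: variant forms of a name are mapped to one authorized heading. The table contents and the function name `authorized_form` are assumptions for illustration, not part of the lecture.

```python
# Hypothetical authority file: normalized variant form -> authorized heading.
AUTHORITY_FILE = {
    "twain, mark": "Twain, Mark, 1835-1910",
    "mark twain": "Twain, Mark, 1835-1910",
    "clemens, samuel langhorne": "Twain, Mark, 1835-1910",
}


def authorized_form(name):
    """Return the authorized heading for a name, or None if it is not listed."""
    # Normalize case and whitespace so trivial variants map to the same key.
    key = " ".join(name.lower().split())
    return AUTHORITY_FILE.get(key)


print(authorized_form("  Mark   TWAIN "))
print(authorized_form("Clemens, Samuel Langhorne"))
print(authorized_form("Unknown Person"))
```

Returning `None` for an unlisted name, rather than guessing, keeps the list controlled: a new entry has to be added deliberately.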
19 Example: Use of pick lists (BestCellars.com)
20 2. Controlling synonyms or equivalents
- Synonymy: two or more terms representing the same concept or meaning.
- Synonymy exists when two or more different terms represent the same or similar concept.
21 Synonym Rings
A synonym ring connects a set of words that are
defined as equivalent for retrieval.
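A minimal sketch of a synonym ring used for query expansion. The rings, the tiny document set, and the function names are hypothetical; a real system would sit on top of a search index rather than a dictionary of strings.

```python
# Hypothetical synonym rings: each set lists terms treated as equivalent.
SYNONYM_RINGS = [
    {"car", "automobile", "auto"},
    {"film", "movie"},
]

# Tiny made-up collection: document id -> text.
DOCUMENTS = {
    1: "used automobile prices",
    2: "car repair manual",
    3: "movie reviews",
}


def expand_query(term):
    """Return every term in the ring containing `term`, or just the term itself."""
    term = term.lower()
    for ring in SYNONYM_RINGS:
        if term in ring:
            return sorted(ring)
    return [term]


def search(term):
    """Return ids of documents containing the term or any of its ring equivalents."""
    terms = set(expand_query(term))
    return sorted(
        doc_id
        for doc_id, text in DOCUMENTS.items()
        if terms & set(text.lower().split())
    )


print(expand_query("car"))
print(search("car"))
print(search("film"))
```

A search for "car" retrieves both the "automobile" and the "car" document, because no term in a ring is preferred over the others; this is what distinguishes a ring from a thesaurus with preferred terms.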
22 3. Linking resources
- Use KOS to link digital resources
- Linking sequence numbers to biosequence databanks
- Linking individual industrial codes to the full scheme
- Linking organism names to taxonomic records
- Linking chemical names to molecular structures
- Linking personal names to biographical information
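A minimal sketch of the linking idea, using organism names as the example: names found in a text are resolved through a KOS lookup table to record locators. The record identifiers, the base URL, and the function name are placeholders invented for illustration and do not point at any real databank.

```python
import re

# Hypothetical KOS table: organism name -> made-up taxonomic record id.
TAXON_RECORDS = {
    "homo sapiens": "taxon-0001",
    "escherichia coli": "taxon-0002",
}

# Placeholder base URL; not a real service.
BASE_URL = "https://kos.example.org/taxon/"


def link_organism_names(text):
    """Return (name, url) pairs for every known organism name found in the text."""
    links = []
    for name, record_id in TAXON_RECORDS.items():
        # Whole-phrase, case-insensitive match.
        if re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE):
            links.append((name, BASE_URL + record_id))
    return links


print(link_organism_names("Samples of Escherichia coli were cultured."))
```

The same pattern would apply to the other examples on the slide (industrial codes, chemical names, personal names) by swapping in a different lookup table.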
23 4. Making semantic relationships explicit
- Use existing thesauri
- Use existing classification schemes
- Create topic maps or ontologies
- Create semantic web
24 Planning and Implementing KOS
- Analyzing user needs
- how a KOS might be used with a particular digital library for the intended users
- Locating KOS
- Find if there is a suitable KOS
- Build locally if necessary
- Planning the infrastructure for KOS
- Maintaining KOS
- Presenting KOS to the user
- Acquiring intellectual property of KOS
25 Topic Maps
- A key component of the Semantic Web
- A new ISO standard
- ISO 13250: Topic Maps
- XML-like syntax
- XML Schema
- XTM: XML Topic Maps
- XTM Home
26 TAO of Topic Maps
- <topicmap>
- TOPIC
- topname
- basename
- dispname
- sortname
- OCCURS
- ASSOC
- assocrl
- facet
- fvalue
- addthms
- </topicmap>
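A minimal sketch of the three core constructs behind the element list above: topics, associations, and occurrences. The Python class and method names are hypothetical and only loosely mirror the slide's element names (basename, OCCURS, ASSOC); this is not an implementation of the ISO 13250 or XTM syntax, and the occurrence URL is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    topic_id: str
    base_name: str
    occurrences: list = field(default_factory=list)  # locators of related resources


@dataclass
class Association:
    assoc_type: str  # label for the relationship, e.g. "includes"
    members: tuple   # ids of the topics taking part


class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, topic_id, base_name, occurrences=()):
        self.topics[topic_id] = Topic(topic_id, base_name, list(occurrences))

    def associate(self, assoc_type, *member_ids):
        # Refuse associations that mention topics not yet in the map.
        unknown = [m for m in member_ids if m not in self.topics]
        if unknown:
            raise KeyError(f"unknown topics: {unknown}")
        self.associations.append(Association(assoc_type, member_ids))

    def related(self, topic_id):
        """Base names of topics that share an association with `topic_id`."""
        names = []
        for assoc in self.associations:
            if topic_id in assoc.members:
                names.extend(
                    self.topics[m].base_name for m in assoc.members if m != topic_id
                )
        return names


tm = TopicMap()
tm.add_topic("kos", "Knowledge organization systems")
tm.add_topic("thesaurus", "Thesaurus", ["https://example.org/thesaurus-intro"])
tm.add_topic("ontology", "Ontology")
tm.associate("includes", "kos", "thesaurus")
tm.associate("includes", "kos", "ontology")
print(tm.related("kos"))
```

Keeping occurrences as plain locators on the topic, separate from the resources themselves, reflects the idea that a topic map is a layer over existing documents rather than a container for them.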
28 Topic Maps for Knowledge Representation
- Establishing an associative network between resources which represent concepts
- Organizing legacy resources into a new information/knowledge space, by relating them to topics, and associating those topics, in a structured way
- Enabling disparate sets of information resources to be used together, by interrelating them using a unifying conceptual framework
29 Ontology
- An ontology is a specification of a conceptualization.
- An ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents.
- An ontology is a commitment to use the shared vocabulary in a coherent and consistent manner.
30 Work Force Digital Library Ontology
[Diagram: ontology graph. Node labels: Concepts (taxonomy and ontology), Workforce Programs, Projects, Policy and regulation Documents, Info Resources, Government, Organizations, People, Events (conferences, workshops, ...), Document, Guides, Handbooks, Presentations, Cases that worked, Lessons learned, Peter Creticos. Relation labels: example-of, describes, represents, refers-to, sponsors, uses, is-part-of, initiates, is-related-to, write, includes.]
31 Why Develop an Ontology?
- To make domain knowledge explicit
- To share common understanding of information structure among people and software agents.
- To enable a machine or multiple machines to use and share knowledge in some application.
- To help other people understand some area of knowledge.
- To help people reach a consensus in their understanding of some area of knowledge.
32 Tools to build ontology
- Protégé: http://protege.stanford.edu/
33 Ontology and thesaurus
- Ontology inherits the ideas, purposes, and functions of the thesaurus.
- Ontology extends relationships among concepts beyond those in thesaurus (NT, BT, RT, Synonyms).
- Ontology intends to be consumed by both human and machine.
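A minimal sketch contrasting the fixed thesaurus relations (BT, NT, RT) with the open-ended, typed relations of an ontology, both stored as (subject, relation, object) triples. The terms, the relation directions, and the helper names are assumptions for illustration; the relation labels "sponsors" and "is-part-of" are borrowed from the Work Force ontology slide, but the statements built with them here are invented.

```python
# Thesaurus: a small, fixed set of relation types between terms.
THESAURUS = [
    ("Thesauri", "BT", "Knowledge organization systems"),
    ("Knowledge organization systems", "NT", "Thesauri"),
    ("Thesauri", "RT", "Subject headings"),
]

# Ontology: relation types are defined per domain (hypothetical statements).
ONTOLOGY = [
    ("Organization", "sponsors", "Event"),
    ("Person", "writes", "Document"),
    ("Project", "is-part-of", "Program"),
]


def objects(triples, subject, relation):
    """Return everything related to `subject` by `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]


def relation_types(triples):
    """Return the distinct relation labels used in a set of triples."""
    return sorted({r for _, r, _ in triples})


print(relation_types(THESAURUS))
print(relation_types(ONTOLOGY))
print(objects(ONTOLOGY, "Organization", "sponsors"))
```

Because both structures share the same triple shape, the same query function serves either one; what changes is only how many kinds of relationship the vocabulary allows.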
34 Bridge the Gap
[Diagram labels: Knowledge, Knowledge-based, Information-based, Technical Architecture, Information Architecture, Collection-based, Interface-based, Documents]
35 Conclusions
- Collection building and content organization are among the major challenges of digital libraries.
- There is increasing demand for formalized (marked-up) knowledge.
- There are a growing number of tools and specifications for subject access (or knowledge access) to the Web and to digital libraries.