Creating Topic Maps: Topic Maps and Knowledge Organization

1
Creating Topic Maps: Topic Maps and Knowledge Organization
  • Steve Pepper
  • pepper.steve_at_gmail.com
  • Oslo University College, 2007-09-15

2
Course agenda
  • Week 37 (09-08): Introduction to Topic Maps, Part 1
  • Week 38 (09-15): Creating a topic map
  • Week 39 (09-22): Introduction to Topic Maps, Part 2
  • Week 42 (10-13): Ontology-driven editing
  • Week 43 (10-20): The machinery of Topic Maps
  • Week 46 (11-10): (Semantic Web)
  • Week 48 (11-24): (Ontologies)
  • Terminology
  • Topic Maps: the technology and the standard
  • topic maps: the artefacts (documents) we create

3
Today's agenda
  • Quick recap: basic concepts and building blocks
  • Topic Maps and Knowledge Organization
  • Metadata, taxonomies, thesauri, faceted
    classification
  • Interchange syntaxes
  • XTM, LTM and CTM
  • Demo: Creating a topic map using LTM
  • Pay close attention...

4
Recap: Core concepts
  • The TAO of Topic Maps

5
Recap: Basic building blocks
  • Basic building blocks are
  • Topics, e.g. Puccini, Lucca, Tosca
  • Associations, e.g. Puccini was born in Lucca
  • Occurrences, e.g. http://www.opera.net/puccini/bio.html
    is a biography of Puccini
  • Each of these constructs can be typed
  • Topic types: composer, city, opera
  • Association types: born in, composed by
  • Occurrence types: biography, street map,
    synopsis
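
The three building blocks and their types can be sketched as plain data structures. This is only an illustration, not a Topic Maps API: the class and field names are assumptions made for the example, while the sample values come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    topic_id: str
    name: str
    types: list = field(default_factory=list)  # ids of the topic's types


@dataclass
class Association:
    assoc_type: str   # id of the association type, e.g. "born-in"
    players: list     # ids of the topics taking part


@dataclass
class Occurrence:
    topic_id: str     # the topic this resource is relevant to
    occ_type: str     # id of the occurrence type, e.g. "biography"
    locator: str      # address of the information resource


# Topics, each typed by another topic (composer, city, opera).
puccini = Topic("puccini", "Puccini", ["composer"])
lucca = Topic("lucca", "Lucca", ["city"])
tosca = Topic("tosca", "Tosca", ["opera"])

# "Puccini was born in Lucca" as a typed association.
born_in = Association("born-in", ["puccini", "lucca"])

# A biography of Puccini as a typed occurrence.
biography = Occurrence(
    "puccini", "biography", "http://www.opera.net/puccini/bio.html"
)
```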

6
Topic Maps and Knowledge Organization
  • Keywords and controlled vocabularies
  • Taxonomies, thesauri and classifications
  • Indexes and glossaries
  • Ontologies

7
Bibliographic languages
  • Work language
  • Author language
  • Title language
  • Edition language
  • Subject language
  • Classification language
  • Index language
  • Document language
  • Production language
  • Carrier language
  • Location language
  • Svenonius, Elaine (2000). The Intellectual
    Foundation of Information Organization. Cambridge,
    MA: MIT Press (p. 54)
  • Work languages
  • Work languages describe information entities,
    their intellectual (as opposed to physical)
    attributes, and relationships among them. (p. 87)
  • Document languages
  • A document is a particular space-time embodiment
    of information; a document language describes and
    provides access to this embodiment. (p. 107)
  • Subject languages
  • A subject language is used to depict what a
    document is about. (p. 127)

8
Two perspectives
  • Works have tended to be conflated with documents
  • So in practice there have been two kinds of
    language
  • Document languages
  • describe the work and its manifestations
  • document-centric (or resource-centric), e.g.
  • document metadata (Dublin Core)
  • bibliographic records (MARC)
  • Subject languages
  • describe the subject space in which the work
    exists
  • subject-centric, e.g.
  • thesauri, taxonomies (ICD)
  • classification schemes (LCSH, DDC)
  • faceted classification (Colon)

9
Metadata
  • Data about data
  • Information about documents
  • e.g. author, title, publisher, date, format,
    keywords
  • Useful for managing the content
  • Especially suitable for librarians
  • Somewhat useful for searching
  • Especially for experts
  • Less useful for end-users
  • the user starts out wanting to know more about a
    subject
  • traditional metadata, however, focuses on the
    document
  • if aboutness is provided at all, it gets squeezed
    into a single field

10
Keywords
  • Primitive form of subject-based classification
  • The keywords are used to describe the subject
  • Cheap and simple: folksonomies and tagging
  • But also problematic because authors
  • misspell keywrods,
  • use different keywords/terms/tags for the same
    thing, and
  • use keywords that make no sense
  • Secondary problem
  • No way for the user to find out what keywords
    have been used
  • A keyword is a topic name

11
Controlled vocabularies
  • Solution: create a list of legal keywords!
  • Requires somewhere to keep the list, and a
    process for new terms
  • Benefits
  • Solves problems of misspelling and duplicates
    (synonyms)
  • Disadvantages
  • Introduces some overhead (a flat list is
    difficult to manage)
  • Users can still search using the wrong terms
  • Users (and authors) still have difficulty finding
    terms
  • A controlled vocabulary is a well-defined set of
    topics with one name per topic
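
A minimal sketch of what a controlled vocabulary buys you, assuming the vocabulary is simply a set of legal terms. The function name and the sample terms are made up for illustration.

```python
def check_keywords(keywords, vocabulary):
    """Split keywords into those found in the vocabulary and those not."""
    accepted = [k for k in keywords if k in vocabulary]
    rejected = [k for k in keywords if k not in vocabulary]
    return accepted, rejected


# The controlled vocabulary: one legal name per topic.
vocabulary = {"opera", "composer", "libretto"}

# An author supplies one legal term, one misspelling and one unlisted term.
accepted, rejected = check_keywords(["opera", "compser", "aria"], vocabulary)
```

The check catches the misspelling, but it also shows the remaining disadvantage: the author of "aria" still has no help in finding the term the list actually uses.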

12
Taxonomies
  • Organize the keywords into a tree
  • Most general at the top, more specific further
    down
  • Common structure used by Yahoo!, etc.
  • The folder metaphor
  • file systems, email, favourites
  • Requires relationships between terms
  • Relationships state that one term is more
    specific than another
  • Advantage: terms are somewhat easier to find
  • Disadvantage: the real world does not fit neatly
    into a hierarchy
  • A taxonomy is a set of topics related through a
    specific type of hierarchical association
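
As a rough sketch, a taxonomy can be held as a single mapping from each term to its broader term, which corresponds to one hierarchical association type. The terms below are hypothetical examples.

```python
# narrower term -> broader term; each entry is one hierarchical association
broader = {
    "opera": "music",
    "music": "culture",
    "football": "sport",
}


def path_to_root(term, broader):
    """Walk from a term up to the most general term above it."""
    path = [term]
    while path[-1] in broader:
        path.append(broader[path[-1]])
    return path


def narrower_terms(term, broader):
    """List the terms directly below a term, in alphabetical order."""
    return sorted(child for child, parent in broader.items() if parent == term)
```

Because every term has at most one broader term, anything that belongs in two places (the disadvantage noted on the slide) cannot be expressed in this structure.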

13
Thesauri
  • Like a taxonomy, but with some extensions
  • Also better defined: there are ISO standards for
    thesauri
  • Relationship types
  • BT: Broader term; NT: Narrower term
  • USE: Preferred term; UF: Non-preferred terms
  • RT: Related term
  • SN: Scope note
  • A thesaurus is a set of topics related through
    particular, predefined association types
  • BT/NT (hierarchical) and RT (untyped,
    associative)
  • (Scope notes are a kind of occurrence)
  • (USE and UF represent multiple names for the same
    concept/topic)
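
The mapping from thesaurus relationships to topic map constructs might be sketched as follows. The data layout, the function names and the sample terms are assumptions for illustration only.

```python
# Topic names: the first name is the preferred term (USE), the others are
# non-preferred terms (UF) for the same topic.
names = {
    "opera": ["opera", "music drama"],
    "music": ["music"],
    "libretto": ["libretto"],
}

# Associations as (type, topic_a, topic_b). For "broader-narrower",
# topic_a is the broader term (BT) and topic_b the narrower term (NT).
associations = [
    ("broader-narrower", "music", "opera"),
    ("related", "opera", "libretto"),
]

# Scope notes (SN) are kept as occurrences attached to a topic.
scope_notes = {"opera": "Dramatic works set to music for the stage."}


def use(term):
    """Resolve any name, preferred or not, to the preferred term."""
    for topic_names in names.values():
        if term in topic_names:
            return topic_names[0]
    return None


def narrower(term):
    """NT: terms directly narrower than the given term."""
    return [b for t, a, b in associations
            if t == "broader-narrower" and a == term]


def related(term):
    """RT: related terms; the association is symmetric."""
    found = []
    for t, a, b in associations:
        if t == "related" and term in (a, b):
            found.append(b if a == term else a)
    return found
```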

14
Faceted classification
  • Invented by S. R. Ranganathan in the 1930s
  • Defines a number of facets or dimensions
  • Defines a set of terms within each facet
  • Sometimes these terms are arranged in a taxonomy
  • Documents are classified against each facet
    separately
  • A faceted classification is a collection of topic
    hierarchies
  • Each hierarchy contains topics whose names are
    used as terms within a particular facet
  • XFML: an XML interchange syntax for faceted
    classification, inspired by Topic Maps
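
Classifying each document against every facet separately can be sketched like this. The facet names, terms and document ids are hypothetical.

```python
# Each facet has its own set of terms.
facets = {
    "genre": {"opera", "symphony"},
    "language": {"italian", "german"},
}

# Each document is classified once per facet.
documents = {
    "doc1": {"genre": "opera", "language": "italian"},
    "doc2": {"genre": "opera", "language": "german"},
    "doc3": {"genre": "symphony", "language": "german"},
}


def find(documents, **criteria):
    """Return ids of documents matching every given facet=term pair."""
    return sorted(
        doc_id
        for doc_id, classification in documents.items()
        if all(classification.get(f) == t for f, t in criteria.items())
    )
```

For example, `find(documents, genre="opera")` narrows by one facet, and adding `language="german"` narrows the same result by a second, independent facet.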

15
Expressivity progression
(Diagram scale, top to bottom: open model, fixed model, no model)
  • Topic maps
  • use any types, properties, and relationships you
    like
  • Faceted classification
  • multiple vocabularies, taxonomies or thesauri
    (one per facet)
  • Thesauri
  • more formal taxonomy; still no topic types; two
    association types
  • Taxonomy
  • terms arranged in a hierarchy; no topic types;
    single association type
  • Controlled vocabulary, folksonomies
  • just a list of terms; no topic types; no
    associations

16
Document-centric approaches
  • Traditional metadata is document-centric
  • Provides substantial descriptive power for
    documents
  • Allows connection into subject-based
    classification
  • Crucial for the management of content
  • However, users are most interested in the
    subjects
  • Taxonomies, thesauri, and faceted classification
    are also document-centric
  • These are methods for subject-based
    classification
  • They provide hardly any descriptive power for
    subjects

17
Subject-centric approaches
  • Topic maps are subject-centric
  • They provide great descriptive power for subjects
  • Good as finding aids, because subjects are what
    users care about
  • Documents can be treated as subjects
  • This enables topic maps to capture metadata as
    well
  • It also enables topic maps to stitch metadata and
    subject-based classification together into one
    seamless whole
  • Topic Maps is the knowledge model par excellence
  • A subject-centric knowledge model that
    encompasses every other kind of knowledge
    organization model
  • Topic Maps can therefore be used to relate and
    combine taxonomies, indexes, thesauri,
    classifications, etc. etc.
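
A small sketch of treating a document as a subject, so that its metadata and its subject classification live in one structure. The "is-about" association type, the "format" occurrence type and all ids are assumptions made for the example.

```python
# Topics keyed by id; a document is just another topic.
topics = {
    "puccini": {"type": "composer", "name": "Puccini"},
    "lucca": {"type": "city", "name": "Lucca"},
    "puccini-bio": {"type": "document", "name": "Biography of Puccini"},
}

# Document metadata becomes occurrences on the document topic.
occurrences = [
    ("puccini-bio", "format", "HTML"),
]

# Associations link documents to subjects, and subjects to each other.
associations = [
    ("is-about", "puccini-bio", "puccini"),
    ("born-in", "puccini", "lucca"),
]


def documents_about(subject_id):
    """Subject-centric lookup: which documents are about this subject?"""
    return [a for t, a, b in associations
            if t == "is-about" and b == subject_id]


def metadata_for(topic_id):
    """Document-centric lookup: the metadata recorded for a topic."""
    return {occ_type: value
            for tid, occ_type, value in occurrences if tid == topic_id}
```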

18
Syntaxes
  • XTM, LTM and CTM
  • What are they?
  • When should I use which?

19
Topic Maps Syntaxes
  • HyTM (HyTime Topic Maps)
  • Original syntax, expressed in terms of SGML and
    HyTime
  • No longer part of ISO 13250
  • XTM (XML Topic Maps Syntax)
  • Later, XML-based syntax, recently moved to
    version 2.0
  • Easy to understand but very verbose
  • LTM (Linear Topic Map Notation)
  • Defined by Ontopia in 2001 and supported by other
    products
  • A simple ASCII syntax for rapid prototyping
  • CTM (Compact Topic Maps Syntax)
  • ISO standard replacement for LTM
  • Complete draft exists, but no implementations yet

20
Topic Map: XTM 1.0 Syntax
<!ELEMENT topicMap ( topic | association | mergeMap )* >
<!ATTLIST topicMap
  id           ID     #IMPLIED
  xmlns        CDATA  #FIXED 'http://www.topicmaps.org/xtm/1.0/'
  xmlns:xlink  CDATA  #FIXED 'http://www.w3.org/1999/xlink'
  xml:base     CDATA  #IMPLIED
>

<?xml version="1.0" encoding="ISO-8859-1"?>
<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- topics, associations, and mergeMap elements go here -->
</topicMap>

21
Topic Map: LTM Syntax
/* topics, associations, and occurrences go here */

22
Topic: XTM 1.0 Syntax
<!ELEMENT topic ( instanceOf*, subjectIdentity?, ( baseName | occurrence )* ) >
<!ATTLIST topic id ID #REQUIRED >

<topic id="italy"> ... </topic>
<topic id="puccini"> ... </topic>

23
Topic: LTM Syntax
[topic-id]

[italy]
[puccini]

24
Topic Name: XTM 1.0 Syntax (1 of 2)
<!ELEMENT baseName ( scope?, baseNameString, variant* ) >
<!ATTLIST baseName id ID #IMPLIED >
<!ELEMENT baseNameString ( #PCDATA ) >
<!ATTLIST baseNameString id ID #IMPLIED >
<!ELEMENT variant ( parameters, variantName?, variant* ) >
<!ATTLIST variant id ID #IMPLIED >
<!ELEMENT variantName ( resourceRef | resourceData ) >
<!ATTLIST variantName id ID #IMPLIED >

25
Topic Name: XTM 1.0 Syntax (2 of 2)
<topic id="la-boheme">
  <baseName>
    <baseNameString>La Bohème</baseNameString>
    <variant>
      <parameters>
        <subjectIndicatorRef xlink:href="http://www.topicmaps.org/xtm/1.0/core.xtm#sort"/>
      </parameters>
      <variantName>
        <resourceData>Bohème, La</resourceData>
      </variantName>
    </variant>
  </baseName>
</topic>

26
Topic Name: LTM Syntax
[topic-id = "basename"; "sortname"?; "dispname"?]

[la-boheme = "La Bohème"; "Bohème, La"]

27
Topic Type: XTM 1.0 Syntax
Use <instanceOf> subelement

<topic id="opera"> ... </topic>
<topic id="tosca">
  <instanceOf><topicRef xlink:href="#opera"/></instanceOf>
</topic>
<topic id="boito">
  <instanceOf><topicRef xlink:href="#composer"/></instanceOf>
  <instanceOf><topicRef xlink:href="#librettist"/></instanceOf>
</topic>

28
Topic Type: LTM Syntax
[topic-id : topic-type]

[tosca : opera]
[boito : composer librettist]

29
Occurrence: XTM 1.0 Syntax
Use <occurrence> subelement
External/internal resources: <resourceRef> or <resourceData>

<!ELEMENT occurrence ( instanceOf?, scope?, ( resourceRef | resourceData ) ) >
<!ATTLIST occurrence id ID #IMPLIED >

<topic id="la-boheme">
  <occurrence>
    <instanceOf><topicRef xlink:href="#homepage"/></instanceOf>
    <resourceRef xlink:href="http://www.opera.it/Opere/La-Boheme/La-Boheme.html"/>
  </occurrence>
  <occurrence>
    <instanceOf><topicRef xlink:href="#premiere-date"/></instanceOf>
    <resourceData>1896 (1 Feb)</resourceData>
  </occurrence>
</topic>

30
Occurrence: LTM Syntax
{topic-id, occurrence-type, "URL" | [[data]]}

{la-boheme, homepage, "http://www.opera.it/Opere/La-Boheme/La-Boheme.html"}
{la-boheme, premiere-date, [[1896 (1 Feb)]]}

31
Topic: Complete XTM 1.0 Syntax
<topic id="la-boheme">
  <instanceOf><topicRef xlink:href="#opera"/></instanceOf>
  <baseName>
    <baseNameString>La Bohème</baseNameString>
    <variant>
      <parameters>
        <subjectIndicatorRef xlink:href="http://www.topicmaps.org/xtm/1.0/core.xtm#sort"/>
      </parameters>
      <variantName><resourceData>Boheme, La</resourceData></variantName>
    </variant>
  </baseName>
  <occurrence>
    <instanceOf><topicRef xlink:href="#homepage"/></instanceOf>
    <resourceRef xlink:href="http://www.opera.it/Opere/La-Boheme/La-Boheme.html"/>
  </occurrence>
  <occurrence>
    <instanceOf><topicRef xlink:href="#premiere-date"/></instanceOf>
    <resourceData>1896 (1 Feb)</resourceData>
  </occurrence>
</topic>

32
Topic: Complete LTM Syntax
[la-boheme : opera = "La Bohème"; "Boheme, La"]
{la-boheme, homepage, "http://www.opera.it/Opere/La-Boheme/La-Boheme.html"}
{la-boheme, premiere-date, [[1896 (1 Feb)]]}

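
If topic data already lives in a script or spreadsheet export, LTM lines like the ones on this slide can be generated rather than typed. The sketch below assumes the LTM forms [id : type = "name"; "sortname"] for topics and {id, type, "URL"} or {id, type, [[data]]} for occurrences. The function names are made up, and values containing quotes or brackets are not escaped.

```python
def topic_to_ltm(topic_id, topic_type, name, sort_name=None):
    """Format one typed topic with a base name and optional sort name."""
    names = f'"{name}"'
    if sort_name:
        names += f'; "{sort_name}"'
    return f"[{topic_id} : {topic_type} = {names}]"


def occurrence_to_ltm(topic_id, occ_type, value, is_url):
    """Format one occurrence: quoted for a URL, [[...]] for inline data."""
    body = f'"{value}"' if is_url else f"[[{value}]]"
    return f"{{{topic_id}, {occ_type}, {body}}}"


lines = [
    topic_to_ltm("la-boheme", "opera", "La Bohème", "Boheme, La"),
    occurrence_to_ltm(
        "la-boheme", "homepage",
        "http://www.opera.it/Opere/La-Boheme/La-Boheme.html", True,
    ),
    occurrence_to_ltm("la-boheme", "premiere-date", "1896 (1 Feb)", False),
]
ltm_text = "\n".join(lines)
```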
33
Association: XTM 1.0 Syntax
<!ELEMENT association ( instanceOf?, scope?, member+ ) >
<!ATTLIST association id ID #IMPLIED >
<!ELEMENT member ( roleSpec?, ( topicRef | ... )* ) >
<!ATTLIST member id ID #IMPLIED >
<!ELEMENT roleSpec ( topicRef | ... ) >

<association>
  <instanceOf><topicRef xlink:href="#composed-by"/></instanceOf>
  <member>
    <roleSpec><topicRef xlink:href="#composer"/></roleSpec>
    <topicRef xlink:href="#puccini"/>
  </member>
  <member>
    <roleSpec><topicRef xlink:href="#work"/></roleSpec>
    <topicRef xlink:href="#tosca"/>
  </member>
</association>

34
Association: LTM Syntax
assoc-type( role-player, role-player, ... )

composed-by( puccini, tosca )

Note 1: There can be more than two role-players in an
association. We'll talk about that next week.

Note 2: The above is an oversimplification, because we
have not yet talked about role types. We'll do that next
week. The exact syntax is as follows:

assoc-type( role-player : role-type, role-player : role-type, ... )

composed-by( puccini : composer, tosca : work )

When omitted, the role type is assumed to be identical to
the type of the role-playing topic. This can be a useful
shorthand and we will use it for now, but it is not always
what you want...

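
The defaulting rule for omitted role types can be sketched as below. This is an illustration of the rule as stated on the slide, not the behaviour of any particular LTM parser, and it assumes every role player has exactly one topic type. The function name is made up.

```python
# Topic types taken from the earlier examples.
topic_types = {"puccini": "composer", "tosca": "opera"}


def expand_roles(players, topic_types):
    """Fill in omitted role types from the type of the role-playing topic.

    players is a list of (role_player, role_type) pairs, where role_type
    is None when it was left out.
    """
    return [
        (player, role if role is not None else topic_types[player])
        for player, role in players
    ]


# composed-by( puccini, tosca ) with both role types omitted
short_form = expand_roles([("puccini", None), ("tosca", None)], topic_types)

# composed-by( puccini : composer, tosca : work ) with explicit role types
full_form = expand_roles(
    [("puccini", "composer"), ("tosca", "work")], topic_types
)
```

In the short form, tosca ends up playing the role "opera" rather than "work", which is the kind of result the slide warns is not always what you want.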
35
Subject Identity: XTM 1.0 Syntax
<!ELEMENT topic ( instanceOf*, subjectIdentity?, ... ) >
<!ELEMENT subjectIdentity ( resourceRef?, ( topicRef | subjectIndicatorRef )* ) >

<!-- Refer to a resource as subject -->
<topic id="foo">
  <subjectIdentity>
    <resourceRef xlink:href="http://www.ontopia.net"/>
  </subjectIdentity>
  <baseName>
    <baseNameString>The Ontopia Website</baseNameString>
  </baseName>
</topic>

<!-- Refer to a subject indicator -->
<topic id="bar">
  <subjectIdentity>
    <subjectIndicatorRef xlink:href="http://www.ontopia.net/about.html"/>
  </subjectIdentity>
  <baseName>
    <baseNameString>Ontopia</baseNameString>
  </baseName>
</topic>

36
Subject Identity: LTM Syntax
[topic-id = names %"subject-address-URL"]
[topic-id = names @"subject-indicator-URL"]

/* Refer to a resource as subject */
[foo = "The Ontopia Website" %"http://www.ontopia.net"]

/* Refer to a subject indicator */
[bar = "Ontopia" @"http://www.ontopia.net/about.html"]

37
Scope: XTM 1.0 Syntax
<!-- "scope" subelements on baseName, occurrence,
     and association (also "parameters" on variantName) -->

<topic id="composed-by">
  <baseName>
    <baseNameString>composed by</baseNameString>
  </baseName>
  <baseName>
    <scope><topicRef xlink:href="#composer"/></scope>
    <baseNameString>composer of</baseNameString>
  </baseName>
</topic>

<topic id="la-boheme2">
  <baseName>
    <baseNameString>La Bohème (Leoncavallo)</baseNameString>
  </baseName>
  <baseName>
    <scope><topicRef xlink:href="#leoncavallo"/></scope>
    <baseNameString>La Bohème</baseNameString>
  </baseName>
</topic>

38
Scope: LTM Syntax
(name or occurrence or association) / scoping-topic(s)

[composed-by = "composed by"
             = "composer of" / composer]

[la-boheme1 = "La Bohème (Puccini)"
            = "La Bohème" / puccini]

[la-boheme2 = "La Bohème (Leoncavallo)"
            = "La Bohème" / leoncavallo]

39
Demo: Creating a topic map

40
Home assignment
  • Prerequisites
  • You have installed Java and the OKS Samplers
  • You know the basics of LTM
  • http://www.ontopia.net/download/ltm.html
  • Create your first topic map
  • Decide what domain you want to cover
  • Write LTM in a text editor (Notepad, TextPad,
    emacs, ...)
  • Keep it in its own directory
  • Copy to .../apache-tomcat/webapps/omnigator/WEB-INF/topicmaps
    for testing in the Omnigator
  • Use Reload function

41
Your own topic map
  • Choose something that really interests you
  • It's much more fun than something boring!
  • Some ideas
  • Sport (football, cricket, ...)
  • Culture (music, film, literature, theatre, ...)
  • Study courses
  • Project management
  • Conference website
  • Languages
  • Geography
  • This first topic map is your own personal one
  • The next one will be a group project for term
    assessment
  • Requirements
  • Minimum 4 topic types, 4 association types, 4
    occurrence types
  • Minimum 10 topics, 20 associations, 10
    occurrences
  • Send to pepper.steve_at_gmail.com by Monday 29
    September
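
A quick way to check a draft against the minimum counts is sketched below. It assumes the topic map has already been listed as simple tuples (it does not parse LTM), and it counts only instance topics as topics, which is one possible reading of the requirement. The names are made up for the example.

```python
MINIMUMS = {
    "topic types": 4,
    "association types": 4,
    "occurrence types": 4,
    "topics": 10,
    "associations": 20,
    "occurrences": 10,
}


def count_constructs(topics, associations, occurrences):
    """Count constructs and distinct types.

    topics: (topic_id, topic_type) pairs
    associations: (assoc_type, players) pairs
    occurrences: (topic_id, occ_type, value) triples
    """
    return {
        "topic types": len({topic_type for _, topic_type in topics}),
        "association types": len({assoc_type for assoc_type, _ in associations}),
        "occurrence types": len({occ_type for _, occ_type, _ in occurrences}),
        "topics": len(topics),
        "associations": len(associations),
        "occurrences": len(occurrences),
    }


def unmet_requirements(counts):
    """Return the names of the requirements that are not yet satisfied."""
    return [name for name, minimum in MINIMUMS.items() if counts[name] < minimum]


# A very early draft, far below every minimum.
draft_topics = [("puccini", "composer"), ("tosca", "opera"), ("lucca", "city")]
draft_associations = [
    ("composed-by", ("puccini", "tosca")),
    ("born-in", ("puccini", "lucca")),
]
draft_occurrences = [
    ("puccini", "biography", "http://www.opera.net/puccini/bio.html"),
]

counts = count_constructs(draft_topics, draft_associations, draft_occurrences)
missing = unmet_requirements(counts)
```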

42
Next lecture
  • Monday September 22
  • Same time, same place
  • Agenda
  • Advanced features (roles, scope, identity,
    reification)
  • Help with home assignment