Title: Introduction to Topic Maps and Subject-centric Computing
1Introduction to Topic Maps and Subject-centric Computing
- Bors István
- pepper.steve_at_gmail.com
- Budapest, 2009-07-21
2Agenda
- Basic Concepts (TAO of Topic Maps)
- Advanced Concepts (scope and roles)
- Writing a Simple Topic Map in LTM
- Ontology-driven Editing with Ontopoly
- Breaks
- 10.30-11.00 (coffee)
- 12.30-13.30 (lunch)
- 15.00-15.30 (coffee)
3The Copernican revolution
- For thousands of years people thought that the sun revolved around the earth
- Actually, some Greek, Indian and Muslim scholars knew better, but the view of Aristotle, Ptolemy and the Christian Church was dominant
- The publication of On the Revolutions of the Celestial Spheres (1543) by Nicolaus Copernicus changed all that
- The heliocentric theory turned our understanding of the universe upside down, or inside out
4Computing has a similar problem
- Today we face a similar situation in computing and information management
- Our computing universe has applications (and documents) at the centre
- This is wrong, because it does not reflect how humans think
- Humans think in terms of subjects (or concepts)
5The subject-centric revolution
- We must put subjects at the centre, because that's what really interests us
- For example, when looking for information
- This is the subject-centric approach
- It represents a radically different way of organizing information and knowledge
- Subject-centric computing is what Topic Maps is really all about
6What is Topic Maps?
- An ISO standard for computer-based information and knowledge management
- Provides the ability to control infoglut and share knowledge by connecting any kind of information from any kind of source based on its meaning
- A semantic technology
- Cf. Semantic Web (RDF, OWL)
- A form of knowledge representation
- Widely used for web-based delivery of information
- Plus Information Integration, eLearning, Business Process Modeling, Product Configuration, Business Rules Management, Asset Management, Knowledge Management, ...
7Background to Topic Maps
- Emerged from the SGML community in the 1990s
- Initial use case: how to merge (digital) back-of-book indexes
- Some input from library science
- Precious little input from computer scientists before 2001
- Most of the SGML community came from the humanities
- ISO 13250 first published in 2000 (recently revised)
- A model for representing knowledge organization structures (indexes, glossaries, thesauri, encyclopedias)
- Plus interchange syntax, query language, constraint language, ...
- Widely adopted in Norway (esp. public sector)
- And gaining ground elsewhere
8Basic Concepts: The TAO of Topic Maps
- Topics
- Associations
- Occurrences
9The TAO of Topic Maps
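Callas, Maria .............. 42
Cavalleria Rusticana ....... 71, 203-204
Mascagni, Pietro
  Cavalleria Rusticana ..... 71, 203-204
Pavarotti, Luciano ......... 45
Puccini, Giacomo ........... 23, 26-31
  Tosca .................... 65, 201-202
Rustic Chivalry, see Cavalleria Rusticana
singers .................... 39-52
  baritone ................. 46
  bass ..................... 46-47
  soprano .................. 41-42, 337
  tenor .................... 44-45
  see also Callas, Pavarotti
Tosca ...................... 65, 201-202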
- The core concepts are derived from the back-of-book index
- Extended and generalized for use with digital information
- Consider a two-layer model consisting of
- a set of information resources (below)
- a knowledge map (above)
- This is like the division of a book into content and index
[Figure: knowledge layer (INDEX) above, information layer (CONTENT) below]
10(1) The information layer
- The lower layer contains the content
- usually digital, but need not be
- can be in any format or notation or location
- can be text, graphics, video, audio, whatever
- This is like the content of the book to which the back-of-book index belongs
[Figure: information layer (CONTENT)]
11(2) The knowledge layer
- The upper layer consists of (typed) topics and associations
- Topics represent the subjects that the information is about
- Like the list of topics that forms a back-of-book index
- Associations represent relationships between those subjects
- Like "see also" relationships in a back-of-book index
[Figure: knowledge layer (INDEX) for the domain Italian opera. Topics: Tosca, Madame Butterfly, Puccini, Lucca. Associations: composed by (Tosca, Puccini), composed by (Madame Butterfly, Puccini), born in (Puccini, Lucca)]
12Occurrences link the layers
- Occurrences represent relationships between information resources and the subjects that they are about
- The links (or locators) are like page numbers in a back-of-book index
- Occurrences can also be typed (e.g. bio, map, synopsis)
13Summary of core concepts
Let's look at some TAOs in the Omnigator
Plus topic types, association types, occurrence types, each of which is represented by a topic...
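The core concepts can be sketched as a small data structure. This is only an illustration under stated assumptions, not the Topic Maps data model itself: the class names, field names and the example.org locator are hypothetical, and the point it shows is simply that every type used by a topic, occurrence or association is itself a topic in the same map.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Topic:
    """A proxy for one subject; its type is the id of another topic."""
    id: str
    name: str
    type: Optional[str] = None
    # Each occurrence is an (occurrence type id, locator) pair.
    occurrences: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class Association:
    """A typed relationship; each member is a (role type id, topic id) pair."""
    type: str
    members: List[Tuple[str, str]]


# Typing topics and instance topics live side by side in one map.
topics = {
    "composer": Topic("composer", "Composer"),
    "opera": Topic("opera", "Opera"),
    "work": Topic("work", "Work"),
    "bio": Topic("bio", "Biography"),
    "composed-by": Topic("composed-by", "Composed by"),
    "puccini": Topic("puccini", "Puccini", type="composer"),
    "tosca": Topic("tosca", "Tosca", type="opera"),
}
# Hypothetical locator, used only for illustration.
topics["puccini"].occurrences.append(("bio", "http://example.org/puccini-bio"))

associations = [
    Association("composed-by", [("work", "tosca"), ("composer", "puccini")]),
]


def typing_topics_used():
    """Collect every topic type, occurrence type, association type and role type in use."""
    used = {t.type for t in topics.values() if t.type}
    used |= {occ_type for t in topics.values() for occ_type, _ in t.occurrences}
    for assoc in associations:
        used.add(assoc.type)
        used |= {role for role, _ in assoc.members}
    return used


def instances_of(type_id):
    """Return the ids of all topics whose type is type_id."""
    return sorted(t.id for t in topics.values() if t.type == type_id)
```

Calling typing_topics_used() and comparing the result with the keys of topics is a quick way to check that no type is used without being represented by a topic.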
14Omnigator interface
Demo
15The power of the TAO model (1)
- Represent subjects explicitly
- Topics represent the things your users are interested in
- Capture relationships between subjects
- Associations provide user-friendly navigation paths to information ("navigation as we may think")
- Associations promote serendipitous knowledge discovery through browsing
- Make information findable
- Topics provide a one-stop shop for everything that is known about a subject (collocation of information and knowledge)
- Occurrences allow information about a common subject to be linked across multiple systems
16The power of the TAO model (2)
- Represent taxonomies and thesauri
- Associations may represent hierarchical relationships
- Topic Maps permits multiple, interlinked hierarchies and faceted classification
- Transcend simple hierarchies
- Rich associative structures capture the complexity of knowledge and reflect the way people think
- Manage knowledge
- The topic map is the embodiment of corporate memory
- It provides a structured way to capture people's knowledge of things, events, relationships, etc.
17Querying topic maps
- Topic Maps is based on a formal data model
- This means that topic maps can be queried, like databases
- Topic Maps Query Language (TMQL)
- Allows more powerful use of taxonomies to retrieve information
- Permits queries that would make Google boggle (see below)
- Based on Ontopia's query language, tolog
- (Demo of querying in the Omnigator)
- Query example
- Give me all composers that composed operas that were based on plays that were written by Shakespeare
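The query example amounts to chaining three association types. The sketch below is plain Python rather than TMQL or tolog, and the association names and the fact list are assumptions chosen for illustration; it only shows why a formal model makes such a question answerable.

```python
# Hypothetical facts: (association type, first player, second player).
facts = [
    ("composed-by", "otello", "verdi"),
    ("composed-by", "falstaff", "verdi"),
    ("composed-by", "tosca", "puccini"),
    ("based-on", "otello", "othello-play"),
    ("based-on", "falstaff", "merry-wives-of-windsor"),
    ("based-on", "tosca", "la-tosca-play"),
    ("written-by", "othello-play", "shakespeare"),
    ("written-by", "merry-wives-of-windsor", "shakespeare"),
    ("written-by", "la-tosca-play", "sardou"),
]


def pairs(assoc_type):
    """Return the (first, second) player pairs of one association type."""
    return [(a, b) for (t, a, b) in facts if t == assoc_type]


def composers_of_operas_based_on_plays_by(writer):
    """Follow written-by, then based-on, then composed-by."""
    plays = {play for play, author in pairs("written-by") if author == writer}
    operas = {opera for opera, source in pairs("based-on") if source in plays}
    return sorted({c for opera, c in pairs("composed-by") if opera in operas})
```

A full-text search engine has no notion of these typed relationships, which is why a question like this is hard to answer without an explicit knowledge layer.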
18Advanced Concepts: Scope and Roles
19The problem of context
- A topic map captures knowledge, but...
- Some knowledge is only valid in a certain context
- Reality is ambiguous
- Knowledge has a subjective dimension
- People have different opinions
- Context is handled using scope
- Enables the expression of contextual validity
- Allows the expression of multiple world views
20How scope works
- We make statements about topics
- names, occurrences, associations
- Every statement is valid within some context
- Statements are qualified by scope
- the name "Allemagne" for the topic Germany in the scope "French"
- a certain information occurrence in the scope "technician"
- a given association is true in the scope "(according to) Authority X"
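One way to picture scope is as a set of context topics attached to each statement. The sketch below assumes a simple selection rule (a name applies when all of its scope topics are part of the current context, and the most specific applicable name wins); that rule, the "german" entry and the function name are illustrative assumptions, not something the standard prescribes for applications.

```python
# (topic id, name, scope); an empty scope means the name is valid everywhere.
names = [
    ("germany", "Germany", frozenset()),
    ("germany", "Allemagne", frozenset({"french"})),
    ("germany", "Deutschland", frozenset({"german"})),  # assumed extra example
]


def name_in_context(topic_id, context):
    """Return the most specific name whose scope is contained in the context."""
    candidates = [
        (len(scope), name)
        for tid, name, scope in names
        if tid == topic_id and scope <= context
    ]
    # The largest applicable scope is treated as the most specific match.
    return max(candidates)[1] if candidates else None
```

The same filtering idea applies to occurrences (for example, showing only those in the scope "technician") and to associations asserted by a particular authority.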
21Topics play roles in associations
- Associations have no direction
- They represent relationships and are inherently multidirectional
- "Puccini was born in Lucca"
- "Lucca was the birthplace of Puccini"
- Two ways to express the same relationship
- Impression of direction caused by use of natural language
- One of the topics viewed as the subject and the other as the object
- Instead of direction, associations use roles
- Puccini plays the role of person and Lucca plays the role of place
- person and place are association role types (or role types, for short)
- Labels are assigned based on role perspective
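22Anatomy of an association
[Figure: the association (A) of type born-in has two roles (R), of types person and place, played by the topics (T) Puccini and Lucca. The association type and the role types are themselves topics (T).]
- Role types characterize the nature of the subject's involvement in the relationship
- They are also topics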
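A direction-free association with role-based labels might be sketched as follows. The dictionary layout, the label wording and the describe function are hypothetical; the sketch only illustrates that one stored relationship can be read from either role's perspective.

```python
# One association, no direction: each role type maps to the topic playing it.
born_in = {"type": "born-in", "roles": {"person": "puccini", "place": "lucca"}}

# One label per (association type, role perspective); wording is illustrative.
labels = {
    ("born-in", "person"): "was born in",
    ("born-in", "place"): "was the birthplace of",
}
display = {"puccini": "Puccini", "lucca": "Lucca"}


def describe(assoc, viewpoint_role):
    """Render a binary association as seen from the topic playing viewpoint_role."""
    (other_role,) = [r for r in assoc["roles"] if r != viewpoint_role]
    subject = display[assoc["roles"][viewpoint_role]]
    obj = display[assoc["roles"][other_role]]
    return f"{subject} {labels[(assoc['type'], viewpoint_role)]} {obj}"
```

Because the roles are named rather than ordered, the same structure extends to ternary and n-ary associations by adding more role entries.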
23Associations need not be binary
- Unary associations are not common
- Useful for representing properties that have boolean values
- e.g., the property of being "unfinished"
- Binary associations are the most common
- Often correspond to verb(subject, object) constructs
- Ternary associations are quite common
- Often correspond to verb(subject, direct-object, indirect-object) constructs
- N-ary associations (where n > 3)
- Less common but sometimes useful
- Many n-ary associations are better represented as (n-1) binary associations...
24The Topic Maps standards
- ISO/IEC 13250: Topic Maps
- Part 1: Overview and Basic Concepts
- Part 2: Data Model
- Part 3: XML Syntax
- Part 4: Canonicalization
- Part 5: Reference Model
- Part 6: Compact Syntax
- Part 7: Graphical Notation
- ISO/IEC 18048
- Topic Maps Query Language
- ISO/IEC 19756
- Topic Maps Constraint Language
- ISO/IEC TR 29111
- Expressing Dublin Core Metadata Using Topic Maps
25Creating a topic map: Interchange syntaxes
- HyTM, XTM, LTM and CTM
- Using LTM
26Interchange syntaxes
- HyTM (HyTime Topic Maps)
- Original syntax, expressed in terms of SGML and HyTime
- No longer part of ISO 13250
- XTM (XML Topic Maps Syntax)
- Later, XML-based syntax, recently moved to version 2.0
- Easy to understand but very verbose
- LTM (Linear Topic Map Notation)
- Defined by Ontopia in 2001 and supported by other products
- A simple ASCII syntax for rapid prototyping
- CTM (Compact Topic Maps Syntax)
- ISO standard replacement for LTM
- Complete draft exists, but not yet finalized
27XTM 1.0 Syntax example
<topic id="la-boheme">
  <instanceOf><topicRef xlink:href="#opera"/></instanceOf>
  <baseName>
    <baseNameString>La Bohème</baseNameString>
    <variant>
      <parameters>
        <subjectIndicatorRef xlink:href="http://www.topicmaps.org/xtm/1.0/core.xtm#sort"/>
      </parameters>
      <variantName><resourceData>Boheme</resourceData></variantName>
    </variant>
  </baseName>
  <occurrence>
    <instanceOf><topicRef xlink:href="#homepage"/></instanceOf>
    <resourceRef xlink:href="http://www.opera.it/Opere/La-Boheme/La-Boheme.html"/>
  </occurrence>
</topic>
28LTM Syntax example
[la-boheme : opera = "La Bohème"; "Boheme"]
{la-boheme, homepage, "http://www.opera.it/Opere/La-Boheme/La-Boheme.html"}
29LTM basics
- Topic: [topic-id]
  [puccini : composer = "Puccini"]
  [lucca : city = "Lucca"]
- Association: assoc-type(player : role, player : role)
  born-in(puccini : person, lucca : place)
- Occurrence: {topic-id, occurrence-type, "URL"}
  {topic-id, occurrence-type, [[string]]}
  {la-boheme, homepage, "http://www.opera.it/Opere/La-Boheme/La-Boheme.html"}
  {la-boheme, premiere-date, [[1896 (1 Feb)]]}
- Scope: (name | occurrence | association) / topic-id
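When LTM is generated from other data rather than typed by hand, the three basic statement forms can be produced with small helper functions. This is a sketch under the assumption that LTM uses square brackets for topics, parentheses for associations and curly braces for occurrences; the function names are hypothetical and no escaping of quotes inside names or URLs is attempted.

```python
def ltm_topic(topic_id, type_id, name):
    """Format a topic with one type and one base name."""
    return f'[{topic_id} : {type_id} = "{name}"]'


def ltm_association(assoc_type, players):
    """Format an association from (player id, role type id) pairs."""
    members = ", ".join(f"{player} : {role}" for player, role in players)
    return f"{assoc_type}({members})"


def ltm_occurrence(topic_id, occurrence_type, url):
    """Format an external occurrence pointing at a URL."""
    return f'{{{topic_id}, {occurrence_type}, "{url}"}}'


lines = [
    ltm_topic("puccini", "composer", "Puccini"),
    ltm_topic("lucca", "city", "Lucca"),
    ltm_association("born-in", [("puccini", "person"), ("lucca", "place")]),
    ltm_occurrence(
        "la-boheme", "homepage",
        "http://www.opera.it/Opere/La-Boheme/La-Boheme.html",
    ),
]
```

Joining lines with newlines gives a fragment that can be pasted into an .ltm file for rapid prototyping.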
30Demo: Creating a topic map with LTM
- A simple knowledge management application to capture skills and experience
31What the topic map is about
- People are employed by organizations in certain professions.
- They have email addresses and other contact information.
- They are members of certain professional associations and they speak various languages to varying degrees.
- They attend various events (workshops, conferences) and write papers.
- Organizations have web sites and are located in certain cities.
32Some data
- Bognárné Lovász Katalin
- katalinbognarlovasz_at_gmail.com
- 36 305739349
- University of West Hungary
- Association of Hungarian School Librarians
- XI. Summer School for School Librarians
- School librarian and/or manager?
- Topic Maps Workshop
- Hungarian fluent
- English advanced
- German basic
- Fancy dress and tea in the school library(?)
- Horváthné Szandi Ágnes
- szandi_at_bolyai.nyme.hu
- University of West Hungary
- http://www.bdtf.hu/
- Szombathely
- Association of Hungarian School Librarians
- XI. Summer School for School Librarians
- http://www.ktep.hu/NYA2009
- Summer conference held every second year in different locations
- Association of Hungarian School Librarians
- Budapest
- http://www.ktep.hu/
33Advanced Concepts: Merging and Identity
34Merging topic maps
- Topic Maps can be merged automatically
- Arbitrary topic maps can be merged into a single topic map
- This cannot be done with databases or XML documents
- Merging enables many advanced applications
- Information integration across repositories
- Sharing and reusing taxonomies
- Automated content aggregation
- Distributed knowledge management
35Principles of merging
- By definition: every topic represents exactly one subject
- Our goal: every subject represented by just one topic
- When two topic maps are merged, topics that represent the same subject should be merged to a single topic
- When two topics are merged, the resulting topic has the union of the characteristics of the two original topics
Merge the two topics together...
(Demo of merging in the Omnigator)
36Subject identity
- Precondition for successful merging
- Knowing when two topics represent the same subject
- What makes merging possible?
- NOT the use of names, which are notoriously unreliable
- Names are not unambiguous (the homonym problem)
- Many topics have multiple names (the synonym problem)
- Achievement of the collocation objective
- Only possible through the use of unique global identifiers
- If subjects have unique identifiers, people are free to use whatever names they like, and topic maps can still be merged successfully
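Identifier-based merging can be sketched in a few lines. This is a simplified illustration, not the merging procedure of the standard: topics are reduced to a set of identifiers and a set of names, the psi.example.org identifiers are made up, and associations and occurrences are left out.

```python
def merge(*topic_maps):
    """Merge topic maps; topics sharing any subject identifier become one topic."""
    merged = []
    for topic_map in topic_maps:
        for topic in topic_map:
            combined = {
                "identifiers": set(topic["identifiers"]),
                "names": set(topic["names"]),
            }
            untouched = []
            for existing in merged:
                if existing["identifiers"] & combined["identifiers"]:
                    # Same subject: take the union of all characteristics.
                    combined["identifiers"] |= existing["identifiers"]
                    combined["names"] |= existing["names"]
                else:
                    untouched.append(existing)
            merged = untouched + [combined]
    return merged


# Hypothetical identifiers, for illustration only.
PUCCINI = "http://psi.example.org/Giacomo_Puccini"
TOSCA_OPERA = "http://psi.example.org/Tosca"
TOSCA_CHARACTER = "http://psi.example.org/Floria_Tosca"
LUCCA = "http://psi.example.org/Lucca"

map_a = [
    {"identifiers": {PUCCINI}, "names": {"Puccini"}},
    {"identifiers": {TOSCA_OPERA}, "names": {"Tosca"}},
]
map_b = [
    {"identifiers": {PUCCINI}, "names": {"Giacomo Puccini"}},      # synonym
    {"identifiers": {TOSCA_CHARACTER}, "names": {"Tosca"}},        # homonym
    {"identifiers": {LUCCA}, "names": {"Lucca"}},
]

result = merge(map_a, map_b)
```

In result, the two Puccini topics collapse into one topic carrying both names, while the two topics named "Tosca" stay separate because their identifiers differ. Matching on names would have got both cases wrong.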
37Subjects and Topics
- Topics are surrogates, or proxies (inside the
computer) for the ineffable subjects that you
want to talk about, such as Puccini, love, these
slides, or the second law of thermodynamics
38The identity of subjects
- Topics exist in order to allow us to talk about subjects
- The relationship between the two is sometimes called intentionality
- We need to know exactly which subject a topic represents
- That is, we need to establish its subject identity
- The collocation objective depends on knowing when applications are talking about the same thing
39Subject identifiers
- The identity of most subjects can only be established indirectly
- An information resource can provide an indication of the subject's identity to a human
- Such a resource is called a subject descriptor
- A subject descriptor has an address, even though the subject it indicates does not
- Computers can use the address of the subject descriptor to establish identity
- Such addresses are called subject identifiers
- Subject descriptors and subject identifiers are the two sides of the human-computer dichotomy
40Advice on subject identifiers
- Always use them for your typing topics
- Makes your topic map and your ontology more portable
- The more serious your application, the more extensively you should use them for instances
- Remember: merging with other topic maps will not be successful without identifiers
- LTM code for subject identifiers
- See ItalianOpera.ltm
- Example
- [composer = "Composer" @"http://psi.ontopedia.net/Composer"]
41My conventions for PSIs
- URI prefix
- http://psi.ontopedia.net/
- Note: not all my identifiers have corresponding descriptors
- URI suffix
- Initial cap for topic types and role types (e.g. Composer)
- Lower case for association, occurrence and name types (e.g. born_in)
- Wikipedia conventions for instances
- Replace spaces with underscores
- Check Norwegian Opera for examples
- Do not use the Italian Opera Topic Map: its conventions are outdated
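The naming conventions above are mechanical enough to sketch as a helper. The function name and the kind labels are hypothetical, and the sketch does not percent-encode characters that are not allowed in URIs.

```python
PREFIX = "http://psi.ontopedia.net/"


def psi(name, kind):
    """Build an identifier from a name, following the conventions listed above."""
    suffix = name.strip().replace(" ", "_")
    if kind in ("topic type", "role type"):
        # Initial capital for topic types and role types.
        suffix = suffix[:1].upper() + suffix[1:]
    elif kind in ("association type", "occurrence type", "name type"):
        # Lower case for association, occurrence and name types.
        suffix = suffix.lower()
    # Instances keep their capitalization (Wikipedia-style titles).
    return PREFIX + suffix
```

For example, psi("composer", "topic type") yields the identifier used in the LTM example for the Composer type.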
42Ontology-driven editing
- Creating topic maps using Ontopoly
43What is an ontology?
- Shorter Oxford English Dictionary
- Ontology: The science or study of being; that department of metaphysics which relates to the being or essence of things, or to being in the abstract.
- Russell & Norvig, Artificial Intelligence
- A particular theory of the nature of being or existence
- Tom Gruber
- A specification of a conceptualization: a description of the concepts and relationships that can exist for an agent or a community of agents
- Wikipedia
- A data model that represents a set of concepts within a domain and the relationships between those concepts
- John Sowa, Knowledge Representation
- A classification of the types and subtypes of concepts and relations necessary to describe everything in the application domain
44Topic Maps terminology
- Ontology
- the set of typing topics that is used within a given topic map, or that defines a class of topic maps
- i.e. the topic types, association types, occurrence types, etc.
- Constraints
- rules governing classes of objects (i.e. typing topics)
- Schema
- the combination of an ontology and constraints
- Schema language
- a language for writing schemas
- e.g. TMCL and OSL (Ontopia Schema Language)
45Why you need an ontology
- An ontology in Topic Maps corresponds to
- the set of element types and attributes in XML
- the set of tables and columns in an RDBMS
- It determines the kinds of things that can exist in the topic map
- In other words, the ontology determines what you can say
- For example
- You can't express the fact that X and Y are organizations unless you have an "organization" topic type
- You can't express the fact that person A is employed by organization B unless you have an "employed by" association type
- etc.
46Expressing the ontology
- The ontology itself is part of the topic map
- Puccini is a topic of type composer
- Lucca is a topic of type city
- composer and city are also topics that are present in the same map
- The association between Puccini and Lucca is of type born-in, where Puccini plays the role of person and Lucca plays the role of place
- born-in, person and place are also topics in the same map
- Lucca has an occurrence of type map and Puccini an occurrence of type bio
- map and bio are also topics
- Etc.
47What is ontology-driven editing?
- A user-friendly way to create topic maps
- The equivalent of syntax-directed editing in XML
- The principle is simple
- The ontology describes what kind of things can exist in the topic map
- It also includes constraints on
- Which types of statement are used with which types of topics
- What cardinality they have
- Based on this, the interface is automatically configured for data entry
- The benefits
- Easier user interface: no need to understand syntax
- More consistent topic maps
- Ontopoly is such an editor
48How to use Ontopoly
- Read the Ontopoly User Guide!
- It will save you a lot of grief in the long run
- Start the program from OKS Samplers / Ontopoly Home
- Open an existing Ontopoly topic map
- Import an existing non-Ontopoly topic map
- Or create a new topic map
- Use the Description tab to describe the topic map
- (Also to validate it and a few other things)
- Use the Ontology tab to define the ontology
- topic types, type hierarchy, association types, role types, name types, occurrence types
- fields (names, identifiers, occurrences, and associations) that apply to each topic type, their order and cardinality
- Use the Instances tab to populate the data
- Uses an automatically configured forms-based interface
49Some tips on ontology creation
- Sketch out the basic ontology on paper first
- Create the type hierarchy in Ontopoly
- Keep it simple
- Create association types and role types
- Specify what the role-playing topic types are
- Create occurrence types and name types
- Go to each topic type in turn, starting at the top of each type hierarchy, and assign additional fields
- Review the ontology
- Don't add data until you are fairly comfortable with the ontology
- Later changes to the ontology that invalidate the data may cause extra work
50Some comments on Ontopoly
- Does not (yet) support scope or variant names
- Use typed names instead of scoped names
- Includes system information in the topic map
- The topic map can be exported without this information
- It can be hidden in the Omnigator
- Customize → Nontopoly model
- Important points to remember
- Clicking on any link submits the HTML form, but does not save to disk
- You MUST click on the Save button regularly
- Changing the ontology when you have already entered data can lead to invalid data
51Demo: Creating a topic map with Ontopoly
- A simple knowledge management application to capture skills and experience
52Conclusion
53Making information findable
- Intuitive navigational interfaces for humans
- The topic/association layer mirrors the way people think, learn and remember
- Powerful semantic queries for applications
- A formal underlying data structure
- Customized views based on individual needs
- Personalized information delivery using scope
- Information aggregation across systems and organizations
- Topic Maps can be merged automatically
- But there is more to Topic Maps than that...
54- Today our desktops are application-centric and document-centric
- Icons represent applications and documents
55- Why can't they be subject-centric, with icons that represent the subjects we are interested in?
- With links between related icons?
- And with context menus that allow us to find everything related to a particular subject?
[Figure: desktop with icons labelled gambia, K185, opera, topic maps, LING 2110, OOXML, tm2008, rana, INF 2820, janacek, bantu semantics, keynote, bayreuth, håkon]
56References (1/2)
- Articles
- The TAO of Topic Maps: http://www.ontopia.net/topicmaps/materials/tao.html
- ELIS article on Topic Maps: http://www.ontopedia.net/pepper/papers/ELIS-TopicMaps.pdf
- ISO standards
- http://www.isotopicmaps.org/
- Conferences
- International Topic Maps Users Conference (Oslo): http://www.topicmaps.com
- Topic Maps Research and Applications (Leipzig): http://www.tmra.de
57References (2/2)
- Mailing lists
- http://www.infoloom.com/mailman/listinfo/topicmapmail
- http://www.isotopicmaps.org/mailman/listinfo/sc34wg3
- Tools
- Overview of tools: http://www.garshol.priv.no/tmtools/
- Ontopia (Open Source Java engine): http://www.ontopia.net/
- Blogs, websites, etc.
- http://www.topicmap.com
- http://topicmaps.bouvet.no/blog/
59Topic Maps and RDF
- Similarities
- Differences
- Interoperability
60Semantic Web Layer Cake
61Two households, both alike in dignity
- During the late 1990s the W3C and ISO developed two semantic technologies in parallel
- Two communities, largely unaware of each other
- Tackling the same fundamental problems
- Findability
- Semantic interoperability
- The results were RDF and Topic Maps
62How the two families stack up
[Figure: the two families side by side, both built on XML]
- ORG: ISO (Topic Maps) | W3C (Semantic Web)
- SYNTAX: XTM, LTM, CTM | RDF/XML, RDF/A, N3
- MODEL: Topic Maps | RDF
- CONSTRAINTS (Topic Maps) / REASONING (RDF): TMCL | RDF Schema, OWL
- QUERY: TMQL | SPARQL
63Striking similarities
- Both extend XML into the realm of semantics
- Both allow assertions to be made about things in the real world
- Both define abstract, associative (graph-based) models
- Both have URI-based models of identity
- Both allow forms of inferencing or reasoning
- Both have XML-based interchange syntaxes
- Both have constraint languages and query languages
- But they are also different in some crucial respects...
64Important differences
- Different roots
- Topic Maps has its roots in traditional finding aids (indexes, thesauri, etc.)
- RDF has its roots in document metadata and formal logic
- Different levels of semantics
- RDF is more low-level; Topic Maps has more higher-level semantics
- Different models
- Identity, scope, association roles, n-ary relationships, variant names, ...
- Different goals
- RDF: an artificially intelligent web for software agents
- Topic Maps: findability and knowledge integration for humans
65The Most Crucial Differences
- RDF/OWL is for machines; Topic Maps is for humans.
- RDF/OWL is optimized for inferencing; Topic Maps is optimized for findability.
- RDF/OWL is based on formal logic; Topic Maps is not based on formal logic.
- RDF/OWL is to mathematics as Topic Maps is to language.
66Who can tell me what this is?
- Is it an H or an A?
- (Human or Agent)
- The point is that fuzziness is a fact.
- Humans can handle it; machines can't.
67Different capabilities
- RDF/OWL, to support logic-based inferencing, cannot allow fuzziness
- Topic Maps, because it is for humans, has to support fuzziness
- OWL ontologies tend to be very stringent and complex
- Topic Maps ontologies tend to be simpler and less formal
- OWL has properties for things that Topic Maps doesn't need
- Topic Maps has features that would be too complex for OWL
- So you need to decide what it is you really need
68RDF or Topic Maps?
- RDF is more low-level: oriented towards machines
- Topic Maps is more high-level: oriented towards humans
- OWL is oriented towards artificial intelligence
- Do you simply want to encode document metadata?
- RDF is ideal and you won't need OWL
- Do you want to achieve subject-based classification of content?
- Topic Maps provides the best combination of flexibility and user-friendliness
- Do you want both metadata and subject-based classification?
- Go straight for Topic Maps because it also supports metadata
- Do you want to develop agent-based applications?
- Use RDF/OWL; if you already have Topic Maps, you're halfway there
- Whatever you choose, you can always move your data between Topic Maps and RDF, thanks to RDFTM
69RDFTM
- RDF/Topic Maps Interoperability Task Force
- A task force within the Semantic Web Best Practices and Deployment Working Group
- Chartered to deliver two documents
- Survey of Existing Interoperability Proposals
- Guidelines for RDF/Topic Maps Interoperability
- Survey published in February 2006
- http://www.w3.org/TR/rdftm-survey/
- Draft guidelines published in June 2006
- http://www.w3.org/2001/sw/BestPractices/RDFTM/guidelines-20060630.html
- The task force is now disbanded and the work will be finalized by SC34
71Applications of Topic Maps
- Taxonomy Management
- Metadata Management
- Semantic Portals
- Information Integration
- eLearning
- Business Process Modelling
- Product Configuration
- Business Rules Management
- IT Asset Management
- Asset Management (Manufacturing)
72Taxonomy management
- For managing unstructured content
- Organization by subject, because that's how users search
- A taxonomy is a simple form of topic map
- Topic Maps provides subject-based organization de luxe
- Using Topic Maps offers many benefits
- Standards-based means vendor independence and data longevity
- Associative model allows for evolution beyond simple hierarchies
- The taxonomy can also be used as a thesaurus, a glossary or an index
- Identity model permits merging and reuse
- Dutch Tax and Customs Administration (Belastingdienst) uses Topic Maps as the basis of a taxonomy management system
- http://www.idealliance.org/papers/dx_xmle04/papers/04-01-03/04-01-03.html
- Capability can be added to any Content Management System
73Metadata management
- A Metadata Server based on Topic Maps
- Management of metadata for government publications
- Used in the central public information portal (ODIN)
- Primary goal
- Ensure much greater consistency in the use of metadata across different government publications in order to improve findability for users
- ODIN now re-architected as regjeringen.no
- Solution based on Topic Maps
74Semantic portals
- Topic Maps as the Information Architecture
- for web-based publishing (web sites, portals, intranets, etc.)
- Site structure is defined as a topic map
- Each page represents a topic (subject-centric)
- User-friendly navigation paths defined by associations
- Topics used to classify content
- Potential for subject-based portal connectivity
- Smooth evolution into Knowledge Management solutions
75Enterprise information integration
- Topic Maps are designed for ease of merging
- Generate topic maps from structured data (or create topic map views of that data)
- Merge topic maps to provide a unified view of the whole
- Easy to filter
- Create personalized views of this unified model
- Advantages
- Consolidated access to all related information
- No need to migrate existing content
- Standards-based
76Enterprise information integration
- Example: Elmer project at Starbase (Borland)
- Integration server for software information
- Multiple disparate applications hold related data
- Unified topic map layer enables search across repositories
- Data integration without changing the underlying applications
- Portal interface
- Intuitive navigation
- Full-text and structured queries
- Smart tags integration
- Elmer terms (topic names) highlighted
- Provide links into the portal
77E-learning: BrainBank
- Topic maps are associative knowledge structures
- They reflect how people acquire and retain knowledge
- Students describe what they have learned
- Pilot users: 11-13 year olds
- Key learning concepts are
- captured, named, described
- associated with other concepts
- Students are able to
- capture the essence of a subject
- describe what they have learned
- keep track of their knowledge
- Teachers are able to
- monitor students' understanding
78Business processes
- Multinational petrochemical company
- Uses TMs to manage business process models
- Flexible model allows arbitrary relationships to be captured easily
- Processes are modelled in terms of
- Steps involved, their preconditions, their successors, etc.
- Processes related through
- Composition (one process is part of another),
- Sequencing (one process is followed by another),
- Specialization (one process is a special case of a more general process)
79Product configuration
- Managing product configuration for mobile phones
- Products belong to families
- Features belong to products or product families and are grouped in feature sets
- There are dependencies between features and they apply in different regions, etc.
- Network of dependencies is already quite complex
- Now throw versioning into the mix!
- Managing all this data is not easy
- Dependencies modelled in a topic map
- Product configuration engineers use this to configure products using a very user-friendly interface
- System is driven by inference rules
- These work on the topic map
- Easily capture complex logic
- Also integrates with product documentation
80Business rules
- US Department of Energy: rules for security classification
- Information about the production of nuclear weapons subject to thousands of rules
- Rules published in 100s of documents
- Most documents are derived from more general documents
- Guidance topics form a complex web of relationships
- Captured in a topic map (KB)
- Concepts connected to if-then-else rules
- KB used with inference engine
- automatically classifies information (documents, emails, ...), and
- "redacts" information (PDF, email, ...)
- Benefits
- Model expressive enough to capture complexity of the rules
- ISO standard: stability, longevity
81IT assets
- University of Oslo: management of IT assets
- Servers, clusters, databases, etc. described in a TM (KB)
- Used to answer questions like
- If operating system Z is upgraded, what apps are affected?
- Service X is down, who do I call?
- If I take Y down, what else goes?
- Uses composite topic map
- Partly autogenerated
- Partly handcoded
- Two applications
- Whitney: online
- Houston: offline (for use in emergencies)
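A question like "If I take Y down, what else goes?" is a transitive walk over dependency associations. The sketch below uses made-up asset names and a hypothetical depends-on association type; it says nothing about how the University of Oslo applications are actually implemented.

```python
# Hypothetical "depends-on" associations: (dependent, dependency).
depends_on = [
    ("payroll-app", "db-cluster"),
    ("db-cluster", "server-1"),
    ("wiki", "server-2"),
    ("server-1", "os-z"),
]


def affected_by(asset):
    """Return everything that directly or indirectly depends on the given asset."""
    affected, frontier = set(), [asset]
    while frontier:
        current = frontier.pop()
        for dependent, dependency in depends_on:
            if dependency == current and dependent not in affected:
                affected.add(dependent)
                frontier.append(dependent)
    return sorted(affected)
```

With these sample associations, upgrading os-z touches server-1, the database cluster on it and the payroll application, while the wiki on server-2 is unaffected.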
82Manufacturing assets
- US Department of Energy
- Topic map describes Y-12 manufacturing facility
- Provides overview of
- equipment,
- processes,
- materials required,
- parts already built,
- etc.
83Tools (http://www.garshol.priv.no/tmtools/)
- ATop
- CmapTools
- ctm-mode
- dtddoc
- Escenic Topic Maps module
- Knowledge Concierge
- ltm-mode
- mappa
- OfficeNet Knowledge Portal
- Ontopia
- Perl TM
- QuaaxTM
- Ruby Topic Maps
- ThinkGraph
- tinyTiM
- TM/XMLtoXTM1 Converter
- TM
- TM4J
- TM4JScript
- TM4L
- TM4Web
- TMAPIX
- TMCore
- TMCore EPiServer Module
- TMCore Sharepoint Module
- TMCore Sitecore Module
- tmedit
- TMNav
- TMTab
- Topincs
- Wandora
- Wordpress Topic Maps
- xSiteable
- XTM1toXTM2 Converter
- xtm2xhtml
- xtm4xmldb
- ZTM
85Topic types, type hierarchies and other hierarchies
86Topic types
- A topic type defines a class of things
- It's a particular kind of category that has instances
- You can also think of it as a set of things that have one or more properties in common
- Rule 1: If it doesn't have instances, it isn't a type!
- "Music" is a category, but not a type (there are no instances)
- nothing "is a music"
- "Opera" is a type, because there are things which "are operas"
- Tosca "is an opera"
- A diagnostic for deciding if "foo" is a type
- If you can think of things which "are foos", the answer is yes
- But be careful: is "wine" a type?
- If the answer is no, ask what kind of thing "foo" is
- Now, that really is a type!
87ISA and type-instance
- The relationship between a type and its instance is actually a special kind of association
- We call it (guess what) a type-instance relationship
- It's also often called an ISA relationship
- It can be represented as an association in XTM or LTM
- But there's no real point
- Use the syntactic shortcut instead
- [tosca : opera]
- (Diagram: tosca "is a" opera)
88Rules of thumb for topic types
- Choose an appropriate level of generality
- "Countries" is better than "Countries in South-East Asia"
- The domain of the topic map tells you which countries it includes
- If it doesn't, an association would be a better solution
- located-in(Thailand, South-East_Asia)
- But don't make it so general as to be useless
- "Places" instead of "countries" would mix countries and cities
- Keep the name short
- That makes it easier to display
- Use the singular form
- Experience shows this to be most useful, so "Country", not "Countries"
- Use initial capitals
- A matter of taste, but I think it looks most tidy
89Type hierarchies
- Some topic types can be arranged in hierarchies
- Type hierarchies are a natural way to order parts of the world
- Humans are quite familiar with tree structures
- Type hierarchies provide
- more user-friendly navigation
- more powerful querying/inferencing
- more compact schemas and ontologies
- greater clarity about the relationships between types
- Use hierarchies, but beware of two pitfalls
- Not all hierarchies are type hierarchies...
- It's easy to confuse your ISAs and your AKOs
90Type hierarchies: AKO
A dog is A Kind Of canine, a canine is A Kind Of mammal, etc.
91Dragon 1: Mixing ISAs and AKOs
- Steve is a homo sapiens
- A homo sapiens is a mammal
- Therefore Steve is a mammal
- Steve is a homo sapiens
- Homo sapiens is a species
- Therefore Steve is a species (?)
92Types, subtypes and instances
93How type hierarchies work
- The superclass-subclass relationship has defined semantics
- Therefore make sure you use it correctly
- Software (tolog, for example) will assume you mean what you say
- If you abuse the semantics you will get incorrect results!
- If A is a superclass of B, then
- Both A and B must be classes
- If C is an instance of B, it must also be an instance of A
- If C is a subclass of B, it must also be a subclass of A (in which case an instance of C is also an instance of B and an instance of A)
- If in doubt, define your own association type
- merging it with superclass/subclass later is trivial
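The rules above can be illustrated with a small sketch. It is not how tolog or any Topic Maps engine implements them. It assumes, for simplicity, that each type has at most one direct supertype and that the hierarchy has no cycles, and the names `SUPERTYPE_OF`, `INSTANCE_OF`, `supertypes` and `types_of` are made up for the example.

```python
# Hypothetical topics; the two tables mirror subtypeOf and type-instance.
SUPERTYPE_OF = {            # subtype -> direct supertype
    "homo_sapiens": "primate",
    "primate": "mammal",
}
INSTANCE_OF = {"steve": "homo_sapiens"}   # instance -> direct type


def supertypes(topic_type):
    """All direct and inherited supertypes of a type, nearest first."""
    result = []
    while topic_type in SUPERTYPE_OF:
        topic_type = SUPERTYPE_OF[topic_type]
        result.append(topic_type)
    return result


def types_of(instance):
    """Direct type of an instance plus every supertype of that type."""
    direct = INSTANCE_OF[instance]
    return [direct] + supertypes(direct)
```

Here `supertypes("homo_sapiens")` yields primate and mammal (a subclass of B is also a subclass of A), and `types_of("steve")` yields homo_sapiens, primate and mammal (an instance of B is also an instance of A).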
94Being both type and instance
- Most modelling paradigms distinguish between type and instance
- In most paradigms something cannot be both
- In Topic Maps something can be both type and instance
- (or class/category and individual)
- For example, homo sapiens can be both
- a type (supertype: primate, instance: Steve), and
- an instance (type: species)
- So be careful!
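One way to see why care is needed: when checking whether something "is a" given type, only supertype links may be followed repeatedly, while the instance-of step is taken exactly once. The sketch below is an illustration under that assumption, with hypothetical data and a hypothetical `is_a` helper; it is not taken from any Topic Maps engine.

```python
# Hypothetical data: homo_sapiens appears both as a type and as an instance.
instance_of = {("steve", "homo_sapiens"), ("homo_sapiens", "species")}
subtype_of = {("homo_sapiens", "primate"), ("primate", "mammal")}


def is_a(topic, topic_type):
    """True if topic is an instance of topic_type, directly or via supertypes.

    Only subtype_of links are followed transitively; instance_of is used
    for exactly one step, so an ISA never chains through another ISA.
    """
    frontier = {t for (i, t) in instance_of if i == topic}
    seen = set()
    while frontier:
        current = frontier.pop()
        if current == topic_type:
            return True
        seen.add(current)
        frontier |= {sup for (sub, sup) in subtype_of if sub == current} - seen
    return False
```

With this data, `is_a("steve", "mammal")` and `is_a("homo_sapiens", "species")` both hold, but `is_a("steve", "species")` does not, even though homo sapiens is both Steve's type and an instance of species.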
95Representing a type hierarchy
- Use associations between typing topics
- subtypeOf(homo_sapiens : subtype, primate : supertype)
- subtypeOf(primate : subtype, mammal : supertype)
- XTM 1.0 defined identifiers for these three subjects
- subtypeOf (or superclass-subclass): http://www.topicmaps.org/xtm/1.0/core.xtm#superclass-subclass
- supertype (or superclass): http://www.topicmaps.org/xtm/1.0/core.xtm#superclass
- subtype (or subclass): http://www.topicmaps.org/xtm/1.0/core.xtm#subclass
- Topic Maps software understands these and implements the semantics for you
96Type hierarchies in LTM
/* Techquila hierarchy PSIs */
[hierarchical-relation-type = "Hierarchical relation type"
  @"http://www.techquila.com/psi/hierarchy/#hierarchical-relation-type"]
[superordinate-role-type = "Superordinate role type"
  @"http://www.techquila.com/psi/hierarchy/#superordinate-role-type"]
[subordinate-role-type = "Subordinate role type"
  @"http://www.techquila.com/psi/hierarchy/#subordinate-role-type"]

/* XTM superclass-subclass PSIs */
[subtypeOf : hierarchical-relation-type
  = "Subtype of" = "Supertype of" / supertype
  @"http://www.topicmaps.org/xtm/1.0/core.xtm#superclass-subclass"]
[subtype : subordinate-role-type = "Subtype"
  @"http://www.topicmaps.org/xtm/1.0/core.xtm#subclass"]
[supertype : superordinate-role-type = "Supertype"
  @"http://www.topicmaps.org/xtm/1.0/core.xtm#superclass"]

/* An example type hierarchy */
subtypeOf( composer : subtype , musician : supertype )
97Dragon 2: Non-type hierarchies
- Not all hierarchies are type hierarchies
- For example
- geographical containment
- part of relationships
- subject classifications
- These relationships are not supertype-subtype
- located in
- part of
- subtopic of
- So again, be careful!
Norway is NOT a kind of Europe...
A piston is NOT a kind of submarine...
An opera is NOT a kind of music...
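Keeping each kind of hierarchy under its own association type makes the mistake hard to commit by accident. The sketch below is one possible illustration: the association type names follow the bullets above, but the sample topics ("musical-work", "engine", "Oslo") and the `broader` helper are invented, and treating every one of these relations as transitive is an assumption of the example.

```python
# Hypothetical typed associations: (association type, narrower, broader).
ASSOCIATIONS = [
    ("subtype-of", "opera", "musical-work"),
    ("located-in", "Oslo", "Norway"),
    ("located-in", "Norway", "Europe"),
    ("part-of", "piston", "engine"),
    ("part-of", "engine", "submarine"),
    ("subtopic-of", "opera", "music"),
]


def broader(topic, assoc_type):
    """Follow one association type upward from topic; other types are ignored."""
    found = []
    frontier = [topic]
    while frontier:
        current = frontier.pop()
        for kind, narrow, broad in ASSOCIATIONS:
            if kind == assoc_type and narrow == current and broad not in found:
                found.append(broad)
                frontier.append(broad)
    return found
```

Asking for `broader("Oslo", "located-in")` reaches Norway and Europe, whereas `broader("Norway", "subtype-of")` is empty: Norway is located in Europe without being a kind of Europe.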
99Topic Maps and Knowledge Organization
- Keywords and controlled vocabularies
- Taxonomies, thesauri and classifications
- Indexes and glossaries
- Ontologies
100Bibliographic languages
- Work language
- Author language
- Title language
- Edition language
- Subject language
- Classification language
- Index language
- Document language
- Production language
- Carrier language
- Location language
- Svenonius, Elaine (2000). The Intellectual Foundation of Information Organization. Cambridge, MA: MIT Press (p.54)
- Work languages
- Work languages describe information entities, their intellectual (as opposed to physical) attributes, and relationships among them. (p.87)
- Document languages
- A document is a particular space-time embodiment of information; a document language describes and provides access to this embodiment. (p.107)
- Subject languages
- A subject language is used to depict what a document is about. (p.127)
101Two perspectives
- Works have tended to be conflated with documents
- So in practice there have been two kinds of language
- Document languages
- describe the work and its manifestations
- document-centric (or resource-centric), e.g.
- document metadata (Dublin Core)
- bibliographic records (MARC)
- Subject languages
- describe the subject space in which the work exists
- subject-centric, e.g.
- thesauri, taxonomies (ICD)
- classification schemes (LCSH, DDC)
- faceted classification (Colon)
102Metadata
- Data about data
- Information about documents
- e.g. author, title, publisher, date, format, keywords
- Useful for managing the content
- Especially suitable for librarians
- Somewhat useful for searching
- Especially for experts
- Less useful for end-users
- the user starts out wanting to know more about a subject
- traditional metadata, however, focuses on the document
- if aboutness is provided at all, it gets squeezed into a single field
103Keywords
- Primitive form of subject-based classification
- The keywords are used to describe the subject
- Cheap and simple: folksonomies and tagging
- But also problematic because authors
- misspell keywrods,
- use different keywords/terms/tags for the same thing, and
- use keywords that make no sense
- Secondary problem
- No way for the user to find out what keywords have been used
- A keyword is a topic name
104Controlled vocabularies
- Solution: create a list of legal keywords!
- Requires somewhere to keep the list, and a process for new terms
- Benefits
- Solves problems of misspelling and duplicates (synonyms)
- Disadvantages
- Introduces some overhead (a flat list is difficult to manage)
- Users can still search using the wrong terms
- Users (and authors) still have difficulty finding terms
- A controlled vocabulary is a well-defined set of topics with one name per topic
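Checking keywords against the list of legal terms can be sketched in a few lines. The vocabulary contents and the `check_keywords` function are hypothetical, and normalising case and surrounding whitespace before the lookup is an assumption of this example rather than part of any standard.

```python
# Hypothetical controlled vocabulary: the set of legal keywords.
VOCABULARY = {"opera", "composer", "libretto"}


def check_keywords(keywords):
    """Split keywords into (accepted, rejected) against the vocabulary."""
    accepted, rejected = [], []
    for keyword in keywords:
        term = keyword.strip().lower()
        (accepted if term in VOCABULARY else rejected).append(term)
    return accepted, rejected
```

A misspelling such as "keywrods" lands in the rejected list, which is where the process for new terms would take over.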
105Taxonomies
- Organize the keywords into a tree
- Most general at the top, more specific further down
- Common structure used by Yahoo!, etc.
- The folder metaphor
- file systems, email, favourites
- Requires relationships between terms
- Relationships state that one term is more specific than another
- Advantage: terms somewhat easier to find
- Disadvantage: the real world does not fit neatly into a hierarchy
- A taxonomy is a set of topics related through a specific type of hierarchical association
106Thesauri
- Like a taxonomy, but with some extensions
- Also better defined: there are ISO standards for thesauri
- Relationship types
- BT: Broader term; NT: Narrower term
- USE: Preferred term; UF: Non-preferred terms
- RT: Related term
- SN: Scope note
- A thesaurus is a set of topics related through particular, predefined association types
- BT/NT (hierarchical) and RT (untyped, associative)
- (Scope notes are a kind of occurrence)
- (USE and UF represent multiple names for the same concept/topic)
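The thesaurus relationship types map naturally onto a small data structure. The sketch below is illustrative only: the entries are invented, the hierarchy is assumed to have no cycles, and `preferred` and `narrower_closure` are hypothetical helpers showing how USE/UF resolve to a single topic and how NT can expand a search.

```python
# Hypothetical thesaurus entries using the relationship types listed above.
THESAURUS = {
    "opera": {"BT": ["vocal music"], "NT": ["comic opera"], "RT": ["libretto"],
              "UF": ["music drama"], "SN": "Staged dramatic work set to music."},
    "comic opera": {"BT": ["opera"], "NT": [], "RT": [], "UF": ["opera buffa"]},
    "vocal music": {"BT": [], "NT": ["opera"], "RT": [], "UF": []},
}

# USE is the inverse of UF: map each non-preferred term to its preferred term.
USE = {alias: term for term, entry in THESAURUS.items() for alias in entry["UF"]}


def preferred(term):
    """Resolve a non-preferred term to the preferred term (one topic, many names)."""
    return USE.get(term, term)


def narrower_closure(term):
    """The preferred term plus all narrower terms, for expanding a search."""
    term = preferred(term)
    result = [term]
    for child in THESAURUS.get(term, {}).get("NT", []):
        result.extend(narrower_closure(child))
    return result
```

In topic map terms, each key is a topic, BT/NT and RT are associations, SN is an occurrence, and the UF entries are additional names for the same topic.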
107Faceted classification
- Invented by S. R. Ranganathan in the 1930s
- Defines a number of facets or dimensions
- Defines a set of terms within each facet
- Sometimes these terms are arranged in a taxonomy
- Documents are classified against each facet separately
- A faceted classification is a collection of topic hierarchies
- Each hierarchy contains topics whose names are used as terms within a particular facet
- XFML: an XML interchange syntax for faceted classification inspired by Topic Maps
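Classifying each document against every facet separately means a search can combine facets freely. A minimal sketch, with invented facets ("genre", "period", "format"), invented documents and a hypothetical `select` function:

```python
# Hypothetical facets and documents; each document gets one term per facet.
DOCUMENTS = {
    "doc1": {"genre": "opera", "period": "romantic", "format": "score"},
    "doc2": {"genre": "opera", "period": "baroque", "format": "recording"},
    "doc3": {"genre": "symphony", "period": "romantic", "format": "score"},
}


def select(**wanted):
    """Return ids of documents matching every requested facet term."""
    return sorted(
        doc_id
        for doc_id, facets in DOCUMENTS.items()
        if all(facets.get(facet) == term for facet, term in wanted.items())
    )
```

For example, `select(genre="opera")` matches doc1 and doc2, and adding `period="romantic"` narrows the result to doc1. Terms are kept flat here; arranging the terms of one facet in a taxonomy would be a separate step.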
108Expressivity progression
open model
- Topic maps and RDF/OWL
- use any types, properties, and relationships you like
- Faceted classification
- multiple vocabularies, taxonomies or thesauri (one per facet)
- Thesauri
- more formal taxonomy; still no topic types; two association types
- Taxonomy
- terms arranged in a hierarchy; no topic types; single association type
- Controlled vocabulary, folksonomies
- just a list of terms; no topic types; no associations
fixed model
no model
109Document-centric approaches
- Traditional metadata is document-centric
- Provides substantial descriptive power for documents
- Allows connection into subject-based classification
- Crucial for the management of content
- However, users are most interested in the subjects
- Taxonomies, thesauri, and faceted classification are also document-centric
- These are methods for subject-based classification
- They provide hardly any descriptive power for subjects
110Subject-centric approaches
- Topic maps are subject-centric
- They provide great descriptive power for subjects
- Good as finding aids, because subjects are what users care about
- Documents can be treated as subjects
- This enables topic maps to capture metadata as well
- It also enables topic maps to stitch metadata and subject-based classification together into one seamless whole
- Topic Maps is the knowledge model par excellence
- A subject-centric knowledge model that encompasses every other kind of knowledge organization model
- Topic Maps can therefore be used to relate and combine taxonomies, indexes, thesauri, classifications, etc.