Title: XML DTDs and other Alternatives: Vocabulary Markup Language (Voc-ML) Project
1. XML DTDs and other Alternatives: Vocabulary Markup Language (Voc-ML) Project
April 11, 2002
- Joseph A. Busch
- Director, Solutions Architecture
NetLab and Friends: Semantic Web and Knowledge Organization
2. Outline
- The real Semantic Web
- Vocabulary Markup Language (Voc-ML)
- Namespace registry
- Schema
- Services definition
- Voc-ML applications
3. The problem IS search!
- Data values, NOT just data structures, are needed.
4. Soergel's SemWeb Proposal
- System of integrated access to data on concepts and terminology.
- Bring together a variety of sources that exist largely in separate worlds, including dictionaries, thesauri, classification schemes, etc.
- Federated system with multiple collaborators.
- Common interface to all concept and terminology knowledge bases on the Internet.
Dagobert Soergel. SemWeb: integrated access to distributed ontological resources. (April 1998) Last checked March 29, 2002. http://www.clis.umd.edu/faculty/soergel/soergelsemwebprop.pdf
5. The Real Semantic Web
- Namespace for uniquely identifying a semantic scheme and each concept within each scheme.
- Broad template or conceptual schema for holding all types of semantic information and specifying relationships among them.
- Definitions of services for interacting with the system.
6. Namespace: NKOS Registry
Asset metadata (the Who, Where and When): Title, Alternate, Creator, Publisher, Date, Type, Format, Identifier, Language
Subject metadata (the What and Why): Subject, Description, Application
Relational metadata (the Linkages): Relation
Use metadata (the How): Rights, Entity Types, Relationships, Info Given
http://staff.oclc.org/vizine/NKOS/Thesaurus_Registry_version3_rev.htm
7. NKOS registry example
Registry fields shown: Type, Identifier, Namespace, Title, Description, Creator, Rights, Date (2001), Format, Publisher, Subject
9. Schema: Vocabulary Markup Language (Voc-ML)
- XML schema for the Semantic Web.
- Broad template for structured representation of semantic schemes.
- Z39.19-1993 and ISO 2788
- Dublin Core metadata
- Tags and syntax for uniquely identifying each concept
- Typed relationships (hierarchical, associative, etc.)
- Host agency: Networked Knowledge Organization Systems
http://nkos.slis.kent.edu/VOCML-1.DOC
10. Voc-ML schema example
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE MetaSource SYSTEM "Voc-ML.dtd">
<MetaSource>
  <SVHeader>
    <dc:Title>UN Standard Product and Services Classification</dc:Title>
    <dc:Creator>Dun &amp; Bradstreet</dc:Creator>
    <dc:Subject>Products, Industrial</dc:Subject>
    <dc:Subject>Products, Consumer</dc:Subject>
    <UIDprefix>unspsc</UIDprefix>
  </SVHeader>
  <SVTerm UID="unspsc501921">
    <label>Snack foods</label>
    <parent UREF="unspsc5019"/>
    <child UREF="unspsc50192101"/>
    <child UREF="unspsc50192102"/>
    <child UREF="unspsc50192103"/>
    <child UREF="unspsc50192104"/>
  </SVTerm>
</MetaSource>
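As a rough illustration of how a consumer might read such a record, the sketch below parses a Voc-ML-style fragment with Python's standard `xml.etree.ElementTree`. The fragment is a hypothetical cut-down copy of the slide's example (header omitted, empty elements written as self-closing), and the `read_terms` helper is an assumption made for illustration, not part of Voc-ML.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the slide's example; the header is omitted.
VOCML = """
<MetaSource>
  <SVTerm UID="unspsc501921">
    <label>Snack foods</label>
    <parent UREF="unspsc5019"/>
    <child UREF="unspsc50192101"/>
    <child UREF="unspsc50192102"/>
  </SVTerm>
</MetaSource>
"""


def read_terms(xml_text):
    """Return {UID: {"label", "parents", "children"}} for every SVTerm."""
    root = ET.fromstring(xml_text)
    terms = {}
    for term in root.iter("SVTerm"):
        terms[term.get("UID")] = {
            "label": term.findtext("label"),
            # A term may list several parents, which is how polyhierarchy shows up.
            "parents": [p.get("UREF") for p in term.findall("parent")],
            "children": [c.get("UREF") for c in term.findall("child")],
        }
    return terms


terms = read_terms(VOCML)
print(terms["unspsc501921"])
```

Keying the result by UID rather than by label reflects the slide's point that each concept is identified by a unique ID, with the label as an attribute of it.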
11. ADL Thesaurus Protocol: XML Elements
<properties>: Overall properties of the thesaurus
<term>: A term name and its preferred status (format)
<term-description>: Full term description (format)
<extended>: Any other thesaurus format (format)
<list>: List of zero or more terms (<response>)
<hierarchy>: A hierarchy of terms (<response>)
<error>: Error code and description (<response>)
<response>: Response from a thesaurus service
http://www.alexandria.ucsb.edu/gjanee/thesaurus/specification.html
12. ADL Thesaurus Protocol: Services
- get-properties
- query? (operator, text, fuzzy, format)
  - operator: <query-operators equals="true" contains-all-words="true" contains-any-words="true" matches-regexp="false"/>
  - text: text
  - fuzzy: true | false
  - format: <term>, <term-description>, <extended>
- get-hierarchies? (starting-term, broader-levels, narrower-levels, format)
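The services above are plain HTTP requests with query-string parameters, so a client mostly needs to assemble URLs. The sketch below is one way to do that with Python's standard `urllib.parse`. The base URL is the one used in the slides' examples, the function names are hypothetical, and no request is actually sent.

```python
from urllib.parse import quote, urlencode

# Base URL as it appears in the slides' examples; assumed, and possibly defunct.
BASE = "http://nkosregistry.org/unspsc"


def _encode(params):
    # quote_via=quote encodes spaces as %20 rather than "+".
    return urlencode(params, quote_via=quote)


def query_url(operator, text, fuzzy=None, fmt="term"):
    """Build a query? request from the parameters operator, text, fuzzy, format."""
    params = {"operator": operator, "text": text}
    if fuzzy is not None:
        params["fuzzy"] = "true" if fuzzy else "false"
    params["format"] = fmt
    return f"{BASE}/query?{_encode(params)}"


def hierarchies_url(starting_term, broader_levels, narrower_levels, fmt="term"):
    """Build a get-hierarchies? request."""
    params = {
        "starting-term": starting_term,
        "broader-levels": broader_levels,
        "narrower-levels": narrower_levels,
        "format": fmt,
    }
    return f"{BASE}/get-hierarchies?{_encode(params)}"


print(query_url("contains-any-words", "snack foods"))
print(hierarchies_url("snack foods", -2, 1))
```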
13. Service definition example: get-properties
<response>
  <properties>
    <dc.name>UN Standard Product and Services Classification</dc.name>
    <dc.Creator>Dun &amp; Bradstreet</dc.Creator>
    <dc.Subject>Products, Industrial</dc.Subject>
    <dc.Subject>Products, Consumer</dc.Subject>
    <query-operators
      equals="true"
      contains-all-words="true"
      contains-any-words="true"
      matches-regexp="false"/>
    <extended-schema>http://eccma.org/unspsc.dtd</extended-schema>
  </properties>
</response>
http://nkosregistry.org/unspsc/get-properties
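A client would typically call get-properties first to learn which query operators a thesaurus supports. The sketch below reads the `query-operators` attributes from a response shaped like the one above. The sample string is a hypothetical abbreviation of that response, and `supported_operators` is an illustrative helper rather than part of the protocol.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated get-properties response modeled on the slide.
PROPERTIES_RESPONSE = """
<response>
  <properties>
    <dc.name>UN Standard Product and Services Classification</dc.name>
    <query-operators equals="true" contains-all-words="true"
                     contains-any-words="true" matches-regexp="false"/>
  </properties>
</response>
"""


def supported_operators(xml_text):
    """Return the names of the operators flagged "true" in query-operators."""
    root = ET.fromstring(xml_text)
    ops = root.find("properties/query-operators")
    if ops is None:
        return []
    return [name for name, value in ops.attrib.items() if value == "true"]


print(supported_operators(PROPERTIES_RESPONSE))
```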
14. Service definition example: query?
http://nkosregistry.org/unspsc/query?operator=contains-any-words&text=snackfoods&format=term
<response>
  <term-description>
    <term>Snack food</term>
    <scope-note>Use this category for food eaten between regular meals.</scope-note>
    <broader>
      <term>Prepared and preserved foods</term>
    </broader>
    <narrower>
      <term>Pretzels</term>
      <term>Corn chips</term>
      <term>Potato chips</term>
      <term>Popcorn</term>
    </narrower>
  </term-description>
</response>
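To show how such a response could be consumed, the sketch below turns a term description into a plain dictionary of broader and narrower terms. The sample is a hypothetical, well-formed copy of the response on this slide (closing tags added), and `parse_term_description` is an assumed helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed copy of the query response shown on the slide.
QUERY_RESPONSE = """
<response>
  <term-description>
    <term>Snack food</term>
    <scope-note>Use this category for food eaten between regular meals.</scope-note>
    <broader><term>Prepared and preserved foods</term></broader>
    <narrower>
      <term>Pretzels</term>
      <term>Corn chips</term>
      <term>Potato chips</term>
      <term>Popcorn</term>
    </narrower>
  </term-description>
</response>
"""


def parse_term_description(xml_text):
    """Extract the term, its scope note, and its broader/narrower terms."""
    desc = ET.fromstring(xml_text).find("term-description")
    if desc is None:
        raise ValueError("response contains no term-description")
    return {
        "term": desc.findtext("term"),
        "scope_note": desc.findtext("scope-note"),
        "broader": [t.text for t in desc.findall("broader/term")],
        "narrower": [t.text for t in desc.findall("narrower/term")],
    }


print(parse_term_description(QUERY_RESPONSE))
```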
15. Service definition example: get-hierarchies?
<hierarchy direction="broader" maxlevels="-2">
  <node>
    <term>Snack foods</term>
    <node>
      <term>Prepared and preserved foods</term>
      <node>
        <term>Food Beverage and Tobacco Products</term>
      </node>
    </node>
  </node>
</hierarchy>
http://nkosregistry.org/unspsc/get-hierarchies?starting-term=snack%20foods&broader-levels=-2&narrower-levels=1&format=term
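Each nested `<node>` in the hierarchy response is one level broader than its container, so the chain of broader terms can be recovered by descending through the nesting. The sketch below does this for the single-parent case shown on the slide. A term with several broader terms would need a recursive walk instead, and `broader_chain` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical copy of the hierarchy response shown on the slide.
HIERARCHY = """
<hierarchy direction="broader" maxlevels="-2">
  <node>
    <term>Snack foods</term>
    <node>
      <term>Prepared and preserved foods</term>
      <node>
        <term>Food Beverage and Tobacco Products</term>
      </node>
    </node>
  </node>
</hierarchy>
"""


def broader_chain(xml_text):
    """Return terms from the starting term up to the broadest one returned."""
    node = ET.fromstring(xml_text).find("node")
    chain = []
    while node is not None:
        chain.append(node.findtext("term"))
        # Follow only the first nested node: assumes a single broader term per level.
        node = node.find("node")
    return chain


print(" > ".join(reversed(broader_chain(HIERARCHY))))
```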
16. Application: Visual vocabulary editor
Portability: Voc-ML input/output. Utilities to convert outline, spreadsheet, directory to Voc-ML.
Manage namespace and unique IDs (not just a list of labels): Enforce namespace and ID uniqueness.
Allow polyhierarchy (membership in multiple classes): Copy and paste term to additional parent.
Allow typed equivalents: Add/edit equivalents with pre-defined types.
Easy to use: File manager style hierarchy display. Drag and drop/cut and paste terms and their children. Undo. Right-click functions.
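The requirements above (unique IDs within a namespace, polyhierarchy) suggest a data model in which terms are keyed by ID and each term holds a set of parents. The sketch below is one minimal way to model that, not a description of the actual editor. The `Vocabulary` class is hypothetical, and the second parent ID `unspsc9001` is invented purely to demonstrate a term with two parents.

```python
class Vocabulary:
    """Terms keyed by unique ID within one namespace; a term may have many parents."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.labels = {}   # UID -> label
        self.parents = {}  # UID -> set of parent UIDs

    def add_term(self, uid, label, parent=None):
        if not uid.startswith(self.prefix):
            raise ValueError(f"{uid} is outside namespace {self.prefix}")
        if uid in self.labels:
            raise ValueError(f"{uid} already exists")  # enforce ID uniqueness
        self.labels[uid] = label
        self.parents[uid] = set()
        if parent is not None:
            self.add_parent(uid, parent)

    def add_parent(self, uid, parent):
        """Attach an existing term to an additional parent (polyhierarchy)."""
        if uid not in self.labels or parent not in self.labels:
            raise KeyError("both term and parent must already exist")
        self.parents[uid].add(parent)

    def children(self, uid):
        return sorted(c for c, ps in self.parents.items() if uid in ps)


vocab = Vocabulary("unspsc")
vocab.add_term("unspsc5019", "Prepared and preserved foods")
vocab.add_term("unspsc9001", "Party supplies")  # invented ID and label
vocab.add_term("unspsc501921", "Snack foods", parent="unspsc5019")
vocab.add_parent("unspsc501921", "unspsc9001")   # same term, second parent
print(vocab.parents["unspsc501921"])
```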
17. Application: Visual vocabulary editor
18. Application: Manage product taxonomies
Organize (and reorganize) product classes for diverse purposes: Drag and drop editing. Preserve unique ID within namespace.
Allow products to have many aliases: Alternates associated with unique ID.
Allow products to exist in more than one class: Polyhierarchy (to allow multiple parents).
Map products across multiple taxonomies: Relationships across different namespaces (e.g. linked parallel hierarchies in different languages).
Generate and maintain linkages to associated documentation: Associate metadata labels as well as namespace with unique ID.
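Mapping products across taxonomies amounts to recording links between (namespace, ID) pairs. The sketch below shows one simple way to store and look up such links. The `Crosswalk` class and the `unspsc_fr` namespace are hypothetical, chosen to mirror the parallel-language example mentioned above.

```python
from collections import defaultdict


class Crosswalk:
    """Symmetric links between concepts, each identified as (namespace, id)."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        # Store the relationship in both directions so lookup works from either side.
        self._links[a].add(b)
        self._links[b].add(a)

    def equivalents(self, concept, namespace=None):
        """Return linked concepts, optionally restricted to one namespace."""
        hits = self._links.get(concept, set())
        return sorted(h for h in hits if namespace is None or h[0] == namespace)


xwalk = Crosswalk()
# "unspsc_fr" is an invented namespace standing in for a parallel French hierarchy.
xwalk.link(("unspsc", "501921"), ("unspsc_fr", "501921"))
print(xwalk.equivalents(("unspsc", "501921"), namespace="unspsc_fr"))
```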
19. Application: Manage product taxonomies
Diagram labels: Product Information; Categorization Metadata; Deployment Re-use; Application Server
20. Application: Search query intermediation
Map content to controlled vocabulary (thesaurus, etc.): Associate metadata labels.
Control exactly what is indexed (metadata, not just full text) and when: Deploy metadata direct to search engine without spidering.
Deploy controlled vocabulary (thesaurus, etc.) to search engine: Re-direct user queries to thesaurus, and return expanded query based on rules.
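Query intermediation of this kind can be pictured as a lookup followed by a rewrite. The sketch below expands a user query with equivalents and narrower terms from a small in-memory thesaurus. The narrower terms come from the snack-food example earlier in the deck, while the `snacks` equivalent, the OR-joined output syntax, and the `expand_query` function are assumptions made for illustration.

```python
# Tiny in-memory thesaurus; "snacks" is an invented equivalent term.
THESAURUS = {
    "snack foods": {
        "equivalents": ["snacks"],
        "narrower": ["pretzels", "corn chips", "potato chips", "popcorn"],
    },
}


def expand_query(query, thesaurus, include_narrower=True):
    """Return an OR-joined query of the original term plus thesaurus expansions."""
    key = query.strip().lower()
    terms = [key]
    entry = thesaurus.get(key)
    if entry:
        terms += entry["equivalents"]
        if include_narrower:  # rule: optionally widen to narrower terms
            terms += entry["narrower"]
    return " OR ".join(f'"{t}"' for t in terms)


print(expand_query("Snack foods", THESAURUS))
```

A query with no thesaurus entry passes through unchanged apart from quoting, so the search engine still receives something usable.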
21. Application: Search query intermediation
Diagram labels: Categorization Metadata; Search Results; Content; Search Engine Index; Expanded Query
22. Contact Information
Joseph A. Busch
Director, Solutions Architecture
Interwoven
803 11th Avenue
Sunnyvale, CA 94089
(408) 220-6974
jbusch_at_interwoven.com
- Visit www.interwoven.com
- Enterprise Content Management