Title: Roadmap to XML
1Roadmap to XML
Richard Marciano, Research Scientist
San Diego Supercomputer Center, marciano_at_sdsc.edu
2Outline
- 9:00 - 10:00
  - XML core
    - overview, the XML 1.0 Specification syntax, namespaces, DTDs, ...
- 10:15 - 11:15
  - XML content creation
    - tools used to create XML, ...
- 11:30 - 12:30
  - XML content retrieval
    - browsers, XSLT, ...
----------------------------------------------------------------------------
- 2:00 - 3:30
  - New XML directions
    - knowledge and XML: Topic Maps, Semantic Web, ...
- 4:00 - 5:00
  - XML for archivists
    - uses of XML for archivists, tools?, other uses?, needs?
3New XML Directions
- 2:00 p.m. - 3:30 p.m.
- 3:30 p.m. - 4:00 p.m.: BREAK
4The Semantic Web
"The Semantic Web", Scientific American, May 2001, Tim Berners-Lee
- Extension of the Web (with Knowledge and Meaning)
- Where data on the Web is defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications
- Provide a language that expresses both data and rules for reasoning about the data and that allows rules from any existing knowledge-representation system to be exported onto the Web
- Adding logic to the Web: the means to use rules to make inferences, choose courses of action and answer questions
- Important technologies
  - XML
  - RDF: with RDF, triples form webs of information about related things
  - collections of information called Ontologies
    - In philosophy, an ontology is a theory about the nature of existence
    - Here it's a document or file that formally defines the relations among terms. The most typical kind of ontology for the Web has a taxonomy and a set of inference rules.
    - The taxonomy defines classes of objects and relations among them
    - Inference rules help further manipulate the terms
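The pairing of a taxonomy with inference rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `SUBCLASS_OF` table and the helper functions are hypothetical and do not come from the slides or from any ontology language.

```python
# Hypothetical mini-ontology: a taxonomy (subclass links) plus one inference rule.
# None of these class names come from the slides.
SUBCLASS_OF = {
    "Senator": "Legislator",
    "Legislator": "Person",
}


def ancestors(cls):
    """Return every class reachable from cls by following subclass links."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


def infer_types(instance_of):
    """Inference rule: an instance of a class is also an instance of its superclasses."""
    return {name: [cls] + ancestors(cls) for name, cls in instance_of.items()}


# A single stated fact; the rule derives the remaining class memberships.
facts = {"some-senator": "Senator"}
inferred = infer_types(facts)
print(inferred)
```

The taxonomy supplies the classes and their relations; the rule derives statements that were never written down explicitly.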
5Normalized Data/Metadata Representation
- Resource Description Framework (RDF)
  - Metadata model
  - The designer can describe objects, add properties to define and describe them, and also make complicated statements about the objects (statements about relationships between resources).
  - The specification comes in two sections
    - Model and Syntax (viewed as directed, labeled graphs)
    - RDF Schemas (using an XML vocabulary)
6Resource Description Framework (RDF)
- Metadata is useful for information retrieval (esp. if no other schema info or semantics is available)
- Idea: representation-independent encoding of metadata as triples (Resource, PropertyType, Value)
  - (uri1, DC:creator, uri2), (uri2, vCard:name, smith), ...
- "Semantic Net": uri1 --DC:creator--> uri2 --vCard:name--> smith
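The triple encoding on this slide can be sketched as plain Python data. This is a minimal illustration, not an RDF library: `build_graph` and `follow` are hypothetical helpers, and the prefixed spellings `DC:creator` and `vCard:name` are an assumed reading of the slide's property names.

```python
from collections import defaultdict

# Triples in (resource, property_type, value) form, as listed on the slide.
triples = [
    ("uri1", "DC:creator", "uri2"),
    ("uri2", "vCard:name", "smith"),
]


def build_graph(triples):
    """Index the triples as resource -> {property_type: value}."""
    graph = defaultdict(dict)
    for resource, prop, value in triples:
        graph[resource][prop] = value
    return graph


def follow(graph, start, path):
    """Follow a sequence of property arcs from start; return None if an arc is missing."""
    node = start
    for prop in path:
        node = graph.get(node, {}).get(prop)
        if node is None:
            return None
    return node


graph = build_graph(triples)
creator_name = follow(graph, "uri1", ["DC:creator", "vCard:name"])
print(creator_name)
```

Because the value of one triple (`uri2`) is the resource of another, the two statements chain into a small graph, which is what makes the "semantic net" navigable.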
7Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila.
http://www.w3.org/Home/Lassila --Creator--> Ora Lassila
Figure 1: Simple node and arc diagram
8Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila.
RDF/XML
<rdf:RDF>
  <rdf:Description about="http://www.w3.org/Home/Lassila">
    <s:Creator>Ora Lassila</s:Creator>
  </rdf:Description>
</rdf:RDF>
The namespace prefix 's' refers to a specific namespace prefix chosen by the author of this RDF expression and defined in an XML namespace declaration such as xmlns:s="http://description.org/schema/"
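Reading the statement back out of RDF/XML as a triple might look like the sketch below, which uses only Python's standard `xml.etree.ElementTree`. The serialization in `RDF_XML` is an assumed rendering of the slide's statement (with a namespace-qualified `rdf:about` attribute and the example namespace URI http://description.org/schema/), and `extract_triples` is a hypothetical helper that handles only simple literal-valued properties. It is not a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed serialization of the slide's statement; the 's' namespace URI is the
# example one quoted on the slide.
RDF_XML = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:s="http://description.org/schema/">
  <rdf:Description rdf:about="http://www.w3.org/Home/Lassila">
    <s:Creator>Ora Lassila</s:Creator>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml):
    """Return (resource, property, value) triples for literal-valued properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree reports qualified names as "{namespace-uri}local-name".
            triples.append((subject, prop.tag, (prop.text or "").strip()))
    return triples


triples = extract_triples(RDF_XML)
for triple in triples:
    print(triple)
```

The property comes back with its full namespace URI rather than the `s` prefix, which shows why the prefix itself is only a local shorthand chosen by the author.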
9TOPIC MAPS: ISO/IEC 13250 (Jan. 2000)
Bridging knowledge representation and information management
- STANDARD FOR
  - describing knowledge structures
  - associating them with information resources
  - solution for organizing and navigating large information pools
- XTM SPECIFICATION
10TOPIC MAPS
- New paradigm for knowledge navigation and synthesis
- Concept of creating style sheets for knowledge-based information access and navigation
- GPS for the Web
- TMs define semantically customized views
11The TAO of Topic Maps
T is for Topic
[Diagram: example topics, including Helms, Jesse; McCain, John; North Carolina; Senate Budget; Senate Finance; School Lunch; S.1019; Relief of Edwards; SBC; SHJ; Nov 4, 1999]
Diagram labels: Topics, Topic types, Topic names
12The TAO of Topic Maps (cont.)
O is for Occurrence
Occurrences
Occurrence Roles
13The TAO of Topic Maps (cont.)
A is for Association
[Diagram: associations among the topics McCain, John; S.1078; D.C.; Helms, Jesse; North Carolina; S.43; Raleigh]
Diagram labels: Topic associations, Association types
14The TAO of Topic Maps (cont.)
Independence of topic associations and topic occurrences (information resources)
[Diagram: the topics McCain, John; S.1078; D.C.; Helms, Jesse; North Carolina; S.43; Raleigh]
Topic maps as portable semantic networks
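The TAO model (topics, associations, occurrences) and the independence of associations from information resources can be sketched as a small data structure. Everything here is an assumption made for illustration: the `Topic` and `Association` classes, the type and role names, and the pairing of a topic with the bill S.43 are hypothetical and are not taken from the XTM specification or from the collection itself.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    topic_type: str
    # Occurrences: (role, locator) pairs pointing at information resources.
    occurrences: list = field(default_factory=list)


@dataclass(frozen=True)
class Association:
    assoc_type: str
    members: tuple  # names of the topics taking part


# Hypothetical topic map; type names and the occurrence are made up for illustration.
topics = {
    "Helms, Jesse": Topic("Helms, Jesse", "senator"),
    "North Carolina": Topic("North Carolina", "state"),
    "Raleigh": Topic("Raleigh", "city"),
}
associations = [
    Association("represents", ("Helms, Jesse", "North Carolina")),
    Association("located-in", ("Raleigh", "North Carolina")),
]
topics["Helms, Jesse"].occurrences.append(("mentioned-in", "S.43"))


def related(topic_name):
    """Topics associated with topic_name, found without touching any resource."""
    out = []
    for assoc in associations:
        if topic_name in assoc.members:
            out.extend(m for m in assoc.members if m != topic_name)
    return out


def drop_resources(topic_map):
    """Copy the topics without occurrences: the portable part of the map."""
    return {name: Topic(t.name, t.topic_type) for name, t in topic_map.items()}


portable = drop_resources(topics)
print(related("North Carolina"))
```

Navigation through `related` works identically on the full map and on the stripped copy, which is the sense in which the topic and association layer is portable across resource collections.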
15References
http://www.topicmaps.org/xtm/index.html
16Senate Legislative Activities Collection (NARA)
106th Senate
Paul S. Sarbanes of Maryland (see p. 135, p. 151, etc.)
January 06, 1999 to March 31, 2000
Section I: Sponsored measures
Section II: Cosponsored measures
Section III: Sponsored measures organized by committee referral
  Senate Armed Services
  Senate Banking
  House Judiciary
Section IV: Cosponsored measures organized by committee referral
  Senate Agriculture
  House Science
Section V: Sponsored amendments
Section VI: Cosponsored amendments
Section VII: Subject index to measures and amendments

Raw Data (rtf): Senator 1, Senator 2, ..., Senator 99

S. 151
  Date Introduced: 01/19/1999
  Cosponsors: NONE
  Official title: A bill to amend the International Maritime Satellite Telecommunications Act
  Latest status: Jan 19, 1999 Read twice and referred to the Committee on Commerce
  Abstract: NONE

Subject Index
  Academic Performance: S.7, S.514, S.564
  Access to Health Care: S.6, S.1678, S.1690
  Zoning and zoning law: S.9, S.Con.Res.10, S.Res.41, S.J.Res.39
17TM Example (XTM-like): DTD 1/2
(topic assoc ) occurs) REQUIRED types CDATA IMPLIED
sortname)
sortname (PCDATA)
18DTD 2/2
locator EMPTY CDATA REQUIRED href CDATA REQUIRED
assoc types CDATA IMPLIED assocrl EMPTY CDATA REQUIRED href CDATA REQUIRED
19TM Example The XML doc itself (1/4)
Apartment houses
Apt. Houses
APARTMENTHOUSES
20TM XML Document (2/4)
Children
Child.
CHILDREN
href="S.300" / role="DiscussedIn" href="S.463" /
/ href="S.1709" / role="DiscussedIn" href="S.Res.125" /
21TM XML Document (3/4)
Welfare
Welf.
WELFARE
href="S.463" / role="DiscussedIn" href="S.1277" /
href="S.Con.Res.28" / role="DiscussedIn" href="S.Res.125" /
types="SubjectEntry"
Youth employment
Youth empl.
YOUTHEMPLOYMENT
role="DiscussedIn" href="S.463" /
22TM XML Document (4/4)
/ href="t2" / ill" href="t3" / role="DiscussedInSameBill" href="t4" /
Bills" href="t2" / ill" href="t3" /
23Topic Maps Self-Control
Extreme ML 2000, Montreal, Hans Holger Rath
- Topic Map templates
  - Logical container for the schema part of the map
  - Type/theme declarations
  - Constraints
  - Inference rules
- Association properties
  - Transitivity
  - Support inferencing capabilities
- Type hierarchies (commercial site: www.ontopia.net)
  - Super-subclassing
  - Inferencing
- Consistency checking with constraints
  - Rule-based constraints control validation process
  - Constraint patterns
24Topic Maps Self-Control (continued)
- Inference rules
  - Deduce additional knowledge
  - Inference patterns
- Examples
  - If topic1 is a sibling of topic2 and topic1 is a male, then topic1 is a brother
    - type: class-instance
    - scope: ir-schema
    - ir-topic-A-PERSONassocrl
    - male
- The TMs control their own structure and content!
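The sibling/brother example can be sketched as a rule applied to a set of facts. This is a hedged illustration only: the fact tables, the `brother-of` label and `infer_brothers` are hypothetical, and the sketch does not reproduce the inference-pattern syntax from the talk.

```python
# Hypothetical facts; the names are placeholders, not topics from the slides.
# Each pair (x, y) states "x is a sibling of y" in that direction only.
sibling_of = {("topic1", "topic2"), ("topic3", "topic1")}
instance_of = {"topic1": "male", "topic2": "male", "topic3": "female"}


def infer_brothers(sibling_of, instance_of):
    """Rule: if x is a sibling of y and x is male, then x is a brother of y."""
    derived = set()
    for x, y in sibling_of:
        if instance_of.get(x) == "male":
            derived.add((x, "brother-of", y))
    return derived


brothers = infer_brothers(sibling_of, instance_of)
print(sorted(brothers))
```

The derived `brother-of` statement was never stored in the map; it exists only because the rule and the two supporting facts are both present.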
25Model-Based Mediation
26Simplest Definitions
- Data
  - Digital object
  - Objects are streams of bits
- Information
  - Any tagged data, which is treated as an attribute.
  - Attributes may be tagged data within the digital object, or tagged data that is associated with the digital object
- Knowledge
  - Relationships between attributes
  - Relationships can be procedural/temporal, structural/spatial, logical/semantic, functional
27Types of Knowledge Relationships
- Logical / semantic
  - Digital Library cross-walks
- Temporal / procedural
  - Workflow systems
- Spatial / structural
  - GIS systems
- Functional / algorithmic
  - Scientific feature analysis
28Knowledge Based Persistent Archive
[Diagram: three levels (Knowledge, Information, Data) across three columns (Ingest Services, Management, Access Services)]
- Knowledge: Relationships Between Concepts; Knowledge Repository for Rules; Knowledge or Topic-Based Query / Browse; XTM DTD; Rules - KQL; (Topic Maps / Buckets / Model-based Access)
- Information: Attributes, Semantics; Information Repository; Attribute-based Query; XML DTD; SDLIP; (Data Handling System - SRB / FTP / HTTP)
- Data: Fields, Containers, Folders; Storage (Replicas, Persistent IDs); Feature-based Query; MCAT/HDF; Grids
29Further Information
http://www.npaci.edu/DICE