Title: The Rule Markup Initiative: KR Principles and DTD Modularization
1. The Rule Markup Initiative: KR Principles and DTD Modularization
- Harold Boley (joint work with Benjamin Grosof and Said Tabet)
- 23rd Meeting of the DFKI Scientific Advisory Board, February 19-21, 2001
2. Motivation (I)
- Rules in (and for) the Web have become a mainstream topic since
  - inference rules were
    - marked up for E-Commerce
    - identified as a Design Issue of the Semantic Web
  - transformation rules were used for document generation from a central XML repository
- Rule interchange is becoming more important in Knowledge Representation (KR), especially in
  - Intelligent Agents
  - AI shells for knowledge-based systems
3. Motivation (II)
- The Rule Markup Initiative has taken initial steps towards defining a shared Rule Markup Language (RuleML) for interoperation between Participants
- RuleML permits forward (bottom-up) and backward (top-down) rules in XML for
  - deduction
  - rewriting
  - further inferential-transformational tasks
- Current version uses backward notation, which can be applied in both directions
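To illustrate how a single rule written in backward notation can be applied in both directions, here is a minimal Python sketch. It is not part of RuleML: the tuple encoding of atoms, the function names, and the convention that uppercase arguments are variables are all assumptions made for this illustration. The sample clauses are the likes example shown later in the slides, and the sketch assumes every premise variable also occurs in the conclusion.

```python
# Atoms are tuples (rel, arg, ...); a rule is (conclusion, [premises]).
# Assumed convention: arguments starting with an uppercase letter are variables.
RULES = [
    (("likes", "mary", "wine"), []),                      # fact: empty premise
    (("likes", "john", "X"), [("likes", "X", "wine")]),   # rule
]

def is_var(term):
    return term[:1].isupper()

def match(pattern, ground, env):
    """Return env extended so that pattern equals the ground atom, else None."""
    if len(pattern) != len(ground) or pattern[0] != ground[0]:
        return None
    env = dict(env)
    for p, g in zip(pattern[1:], ground[1:]):
        if is_var(p):
            if env.setdefault(p, g) != g:
                return None
        elif p != g:
            return None
    return env

def subst(atom, env):
    """Replace bound variables in an atom by their values."""
    return (atom[0],) + tuple(env.get(t, t) for t in atom[1:])

def forward(rules):
    """Bottom-up: add conclusions until no new fact appears."""
    facts = set()
    changed = True
    while changed:
        changed = False
        for conc, prems in rules:
            envs = [{}]
            for prem in prems:
                envs = [e2 for e in envs for f in facts
                        if (e2 := match(prem, f, e)) is not None]
            for env in envs:
                fact = subst(conc, env)
                if fact not in facts:
                    facts.add(fact)
                    changed = True
    return facts

def backward(goal, rules, depth=10):
    """Top-down: prove a ground goal by reducing it to rule premises."""
    if depth == 0:          # crude bound against runaway recursion
        return False
    for conc, prems in rules:
        env = match(conc, goal, {})
        if env is not None and all(
            backward(subst(p, env), rules, depth - 1) for p in prems
        ):
            return True
    return False

print(sorted(forward(RULES)))
print(backward(("likes", "john", "mary"), RULES))
```

Both functions read the same rule list; only the control strategy differs, which is the sense in which one notation serves both directions.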
4. Initial Example: Forward-Rule Notation
Challenge: hypertext as one XHTML paragraph
<p>If you want to review rule principles, you may look at <a href="http://www.cs.brandeis.edu/...">Rule-Based Systems</a></p>
Original RuleML markup with XHTML in prem/conc (English premise and semiformal conclusion)
<rule>
  <prem>
    <p>You want to review rule principles</p>
  </prem>
  <conc>
    <p>You may look at
      <a href="http://www.cs.brandeis.edu/...">Rule-Based Systems</a>
    </p>
  </conc>
</rule>
5. Initial Example: Backward-Rule Notation
Further formalized RuleML markup (still unanalyzed English relation and individual-constant names)
<if>
  <atom>
    <rel>may look at</rel>
    <var>you</var>
    <ur label="Rule-Based Systems">http://www.cs.brandeis.edu/...</ur>
  </atom>
  <atom>
    <rel>want to review</rel>
    <var>you</var>
    <ind>rule principles</ind>
  </atom>
</if>
6. RuleML Elements of Datalog DTD: Clocksin/Mellish Sample Prolog Clauses
Fact (Unit clause): likes(mary,wine).
<if>
  <atom> <rel>likes</rel> <ind>Mary</ind> <ind>wine</ind> </atom>
  <and/>
</if>
Rule (Non-unit clause): likes(john,X) :- likes(X,wine).
<if>
  <atom> <rel>likes</rel> <ind>John</ind> <var>x</var> </atom>
  <atom> <rel>likes</rel> <var>x</var> <ind>wine</ind> </atom>
</if>
Empty 'and' ⇒ true premise ⇒ factual rule
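The correspondence between the two notations can be sketched as a small converter from an 'if' element to Prolog clause text. This is an illustrative sketch only: the function names are hypothetical, and lowercasing individuals and capitalizing variables is an assumed way of following Prolog's naming convention.

```python
import xml.etree.ElementTree as ET

def term(elem):
    """Render one argument: individuals lowercase, variables capitalized."""
    text = (elem.text or "").strip()
    if elem.tag == "var":
        return text[:1].upper() + text[1:]
    if elem.tag == "ind":
        return text.lower().replace(" ", "_")
    raise ValueError(f"unexpected argument element: {elem.tag}")

def atom(elem):
    """Render <atom><rel>r</rel> args...</atom> as r(args)."""
    rel = elem.find("rel").text.strip().replace(" ", "_")
    args = [term(child) for child in elem if child.tag != "rel"]
    return f"{rel}({','.join(args)})"

def clause(if_elem):
    """An 'if' holds a conclusion followed by a premise (an atom or an 'and')."""
    conc, prem = list(if_elem)
    body = list(prem) if prem.tag == "and" else [prem]
    head = atom(conc)
    if not body:                       # empty 'and': true premise, so a fact
        return head + "."
    return head + " :- " + ", ".join(atom(a) for a in body) + "."

fact = ET.fromstring(
    "<if><atom><rel>likes</rel><ind>Mary</ind><ind>wine</ind></atom><and/></if>"
)
rule = ET.fromstring(
    "<if><atom><rel>likes</rel><ind>John</ind><var>x</var></atom>"
    "<atom><rel>likes</rel><var>x</var><ind>wine</ind></atom></if>"
)
print(clause(fact))
print(clause(rule))
```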
7. RuleML Element of UR-Hornlog DTD: Proposed W3C-Page Authentication Rule
Tim Berners-Lee: "Any person who was some time in the last 2 months an employee of an organization which was some time in the last 2 months a W3C member may register"
<if>
  <atom> <rel>may register</rel> <var>any</var> </atom>
  <and>
    <atom> <rel>person</rel> <var>any</var> </atom>
    <atom> <rel>organization</rel> <var>org</var> </atom>
    <atom> <rel>employee in</rel> <var>any</var> <var>org</var>
      <cterm> <ctor>last</ctor>
        <cterm> <ctor>month</ctor> <ind>2</ind> </cterm>
      </cterm>
    </atom>
    <atom> <rel>member in</rel> <var>org</var> <ur label="W3C">http://www.w3.org/</ur>
      <cterm> <ctor>last</ctor>
        <cterm> <ctor>month</ctor> <ind>2</ind> </cterm>
      </cterm>
    </atom>
  </and>
</if>
8. RuleML Elements of UR-Equalog DTD: Equations for URI Expansion
URLs/URIs or URs as 1st-class citizens
uriexp(daml) = http://www.daml.org/
<if>
  <eq>
    <nano> <fun>uriexp</fun> <ind>daml</ind> </nano>
    <ur>http://www.daml.org/</ur>
  </eq>
  <and/>
</if>
uriexp(oil) = http://www.ontoknowledge.org/oil/
<if>
  <eq>
    <nano> <fun>uriexp</fun> <ind>oil</ind> </nano>
    <ur>http://www.ontoknowledge.org/oil/</ur>
  </eq>
  <and/>
</if>
Empty 'and' ⇒ true premise ⇒ unconditional equation
. . .
9. RuleML Element of URC-Bin-Data-Ground-Fact DTD: RDF Triple as Very Special Rule
RDF triple (predicate, subject, object) as atom predicate(subject, object) or rel(ur, ur|ind)
"http://www.w3.org/Home/Lassila has creator Ora Lassila."
(Creator, http://www.w3.org/Home/Lassila, Ora Lassila)
<if>
  <atom>
    <rel>Creator</rel>
    <ur>http://www.w3.org/Home/Lassila</ur>
    <ind>Ora Lassila</ind>
  </atom>
  <and/>
</if>
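The mapping from an RDF triple to a ground RuleML fact can be sketched with Python's standard ElementTree module. The function name and the obj_is_uri flag are hypothetical names introduced for this sketch; the element names (if, atom, rel, ur, ind, and) are those used on the slide.

```python
import xml.etree.ElementTree as ET

def triple_to_fact(predicate, subject_uri, obj, obj_is_uri=False):
    """Build an 'if' with one binary atom and an empty 'and' premise."""
    rule = ET.Element("if")
    atom = ET.SubElement(rule, "atom")
    ET.SubElement(atom, "rel").text = predicate
    ET.SubElement(atom, "ur").text = subject_uri            # subject is a resource
    ET.SubElement(atom, "ur" if obj_is_uri else "ind").text = obj
    ET.SubElement(rule, "and")                              # empty 'and': true premise
    return rule

fact = triple_to_fact("Creator", "http://www.w3.org/Home/Lassila", "Ora Lassila")
print(ET.tostring(fact, encoding="unicode"))
```

The object is emitted as ind for a literal and as ur for a resource, mirroring the rel(ur, ur|ind) shape.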
10. Modularization of DTDs: XHTML and KR
- Advantages
  - Leads to reusable subDTDs and DTD interoperation
  - Complex DTDs built with 'plug-and-play' technology
- (RuleML) Sublanguages determined by validations!
  - For (rulebase) export find most precise sublanguage!
- Modular DTDs still mostly used outside KR
  - First used for XHTML and described in XML Bible
  - W3C Working Draft 5 January 2000: Building XHTML™ Modules
  - W3C Candidate Recommendation 20 October 2000: Modularization of XHTML™
11. Structure of the RuleML DTD Hierarchy
- Our system of DTDs (current version 0.7) uses a modularization approach similar to XHTML in order to accommodate the various rule subcommunities
- The evolving hierarchy of RuleML DTDs forms a partial order with ruleml as the greatest element (a ruleml-rooted DAG) and many smallest elements
- Each DTD node in the hierarchy corresponds to a specific RuleML sublanguage
  - Union (join) of sublanguages reached via outgoing links to smaller or equal nodes below
  - Intersection (meet) of sublanguages via incoming links from greater or equal nodes above
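The join and meet operations on such a partial order can be sketched in Python. The link table below is a hypothetical fragment: only "ur-datalog = join(ur, datalog)" is stated on the slides, and the remaining links, like the function names, are assumptions chosen to make the example self-contained.

```python
# Hypothetical fragment of the DAG: each DTD maps to the DTDs directly below it.
BELOW = {
    "ruleml": ["ur-hornlog"],
    "ur-hornlog": ["hornlog", "ur-datalog"],
    "hornlog": ["datalog"],
    "ur-datalog": ["ur", "datalog"],
    "datalog": [],
    "ur": [],
}

def down_set(node):
    """All nodes smaller than or equal to node."""
    seen, stack = {node}, [node]
    while stack:
        for child in BELOW[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def up_set(node):
    """All nodes greater than or equal to node."""
    return {n for n in BELOW if node in down_set(n)}

def join(a, b):
    """Least common upper bound of a and b, or None if there is no unique one."""
    common = up_set(a) & up_set(b)
    least = [n for n in common if common <= up_set(n)]
    return least[0] if len(least) == 1 else None

def meet(a, b):
    """Greatest common lower bound of a and b, or None if there is no unique one."""
    common = down_set(a) & down_set(b)
    greatest = [n for n in common if common <= down_set(n)]
    return greatest[0] if len(greatest) == 1 else None

print(join("ur", "datalog"))
print(meet("ur-datalog", "hornlog"))
```

In this fragment ur and datalog have no common lower bound, so meet returns None for them; a DAG need not be a full lattice.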
12. The Module Hierarchy of RuleML DTDs
[Figure: rooted DAG of RuleML DTDs. Nodes: ruleml, ur-equalog, equalog, ur-hornlog, hornlog, ur-datalog, datalog, bin-datalog, urc-datalog, ur, urc-bin-datalog, urc-bin-data-ground-log, urc-bin-data-ground-fact]
- ur-datalog = join(ur, datalog)
- ur: URL/URI-like ur-objects
- urc-bin-data-ground-fact: RDF-like triples
Rooted DAG will be extended with branches for further sublanguages
13. DTDs: From Well-Formedness to Validity
XML principles for a document being well-formed
- Open and close all tags
- Empty tags end with />
- There is a unique root element
- Elements may not overlap
- Attribute values are quoted
- < and & are only used to start tags and entities
- Only the five predefined entity references are used
XML principle for a document being valid with respect to a DTD
- Matches the type-like constraints listed in the DTD (or, can be generated from DTD as linearized CF grammar-derivation tree)
Checked by validators such as http://www.stg.brown.edu/service/xmlvalid/
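Well-formedness, the first of the two levels above, can be checked with any XML parser. A minimal sketch using Python's standard library follows; the function name is hypothetical. It does not check validity, since xml.etree.ElementTree does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document as well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

samples = [
    "<if><atom><rel>likes</rel></atom><and/></if>",   # well-formed
    "<if><atom></if></atom>",                          # overlapping elements
    "<if/><if/>",                                      # no unique root
    "<ur label=W3C>http://www.w3.org/</ur>",           # unquoted attribute value
    "<rel>R & D</rel>",                                # bare ampersand
]
for doc in samples:
    print(is_well_formed(doc), doc)
```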
14. A Relational Language: ruleml-datalog.dtd (I)
<!-- An XML DTD for a Datalog RuleML Sublanguage -->
<!-- Last Modification: 2001-01-25 -->

<!-- ENTITY Declarations -->

<!-- in this ruleml-datalog.dtd, parameter entities set two .module switches to INCLUDE -->
<!ENTITY % datalog.module "INCLUDE">
<!ENTITY % datalog-and-hornlog.module "INCLUDE">
<!-- hence all conditional sections "<![%.module;[" . . . "]]>" activate their content -->
<!-- in a stand-alone use of the current DTD "<![%.module;[" and "]]>" are thus no-ops -->

<![%datalog-and-hornlog.module;[
<!-- a conclusion and premise will be usable within 'if' implications -->
<!-- in datalog and hornlog, conc element uses an atomic formula -->
<!-- in datalog and hornlog, prem element uses an atomic formula or an 'and' -->
<!ENTITY % conc "atom">
<!ENTITY % prem "(atom | and)">
]]>
15. A Relational Language: ruleml-datalog.dtd (II)
<!-- ELEMENT and ATTLIST Declarations -->

<!-- 'rulebase' root element uses 'if' implications as top-level rules -->
<!-- label attribute allows naming of an entire individual rulebase -->
<!-- e.g., this can help enable forward inferencing of selected rulebase(s) -->
<!-- direction attribute indicates the intended direction of rule inferencing -->
<!-- it is a preliminary design choice and has a 'neutral' default value -->
<!ELEMENT rulebase (if)*>
<!ATTLIST rulebase label CDATA #IMPLIED>
<!ATTLIST rulebase direction (forward | backward | bidirectional) "bidirectional">

<!-- 'if' implications are usable as top-level rules -->
<!-- 'if' element uses a conclusion followed by a premise -->
<!-- "<if>conc prem</if>" stands for "conc if prem", i.e., "conc is true if prem is true" -->
<!-- label attribute is a handle for the rule for various uses, including editing -->
<!ELEMENT if (%conc;, %prem;)>
<!ATTLIST if label CDATA #IMPLIED>
16. A Relational Language: ruleml-datalog.dtd (III)
<![%datalog-and-hornlog.module;[
<!-- an 'and' is usable within premises -->
<!-- 'and' uses zero or more atomic formulas -->
<!-- "<and>atom</and>" is equivalent to "atom" -->
<!-- "<and></and>" is equivalent to "true" -->
<!ELEMENT and (atom)*>
]]>

<![%datalog.module;[
<!-- atomic formulas are usable within conc's, prem's, and 'and's -->
<!-- atom element uses rel(ation) symbol followed by a sequence of -->
<!-- zero or more arguments, which may be ind(ividual)s or var(iable)s -->
<!ELEMENT atom (rel, (ind | var)*)>
]]>
17. A Relational Language: ruleml-datalog.dtd (IV)
<!-- there is one kind of fixed argument -->
<!-- individual constant, as in predicate logic -->
<!ELEMENT ind (#PCDATA)>

<!-- there is one kind of variable argument -->
<!-- logical variable, as in logic programming -->
<!ELEMENT var (#PCDATA)>

<!-- there are only fixed (first-order) relations -->
<!-- relation or predicate symbol -->
<!ELEMENT rel (#PCDATA)>
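The content models of the datalog DTD can be mirrored by a hand-written structural check, sketched here in Python because the standard library has no DTD validator. The sketch takes the models as: a rulebase holds zero or more 'if' elements; an 'if' holds a conclusion atom followed by a premise that is an atom or an 'and'; an 'and' holds zero or more atoms; an atom holds a rel followed by zero or more ind or var arguments. The function names are hypothetical, and attributes and stray text between elements are not checked.

```python
import xml.etree.ElementTree as ET

def is_leaf(elem, name):
    """ind, var and rel carry text only, so they must have no child elements."""
    return elem.tag == name and len(elem) == 0

def is_atom(elem):
    kids = list(elem)
    return (elem.tag == "atom" and len(kids) >= 1
            and is_leaf(kids[0], "rel")
            and all(is_leaf(k, "ind") or is_leaf(k, "var") for k in kids[1:]))

def is_and(elem):
    return elem.tag == "and" and all(is_atom(k) for k in elem)

def is_if(elem):
    kids = list(elem)
    return (elem.tag == "if" and len(kids) == 2
            and is_atom(kids[0])                        # conclusion
            and (is_atom(kids[1]) or is_and(kids[1])))  # premise

def is_datalog_rulebase(root):
    return root.tag == "rulebase" and all(is_if(k) for k in root)

good = ET.fromstring(
    "<rulebase>"
    "<if><atom><rel>likes</rel><ind>Mary</ind><ind>wine</ind></atom><and/></if>"
    "<if><atom><rel>likes</rel><ind>John</ind><var>x</var></atom>"
    "<atom><rel>likes</rel><var>x</var><ind>wine</ind></atom></if>"
    "</rulebase>"
)
bad = ET.fromstring(
    "<rulebase><if><atom><rel>Creator</rel>"
    "<ur>http://www.w3.org/Home/Lassila</ur><ind>Ora Lassila</ind></atom>"
    "<and/></if></rulebase>"
)
print(is_datalog_rulebase(good), is_datalog_rulebase(bad))
```

The second rulebase fails because ur arguments are not part of the plain datalog sublanguage.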
18. A URL/URI Language: ruleml-ur.dtd
<!-- An XML DTD for a 'UR' RuleML Sublanguage -->
<!-- Last Modification: 2001-01-23 -->

<!-- ENTITY Declarations -->

<!-- a Uniform Resource Identifier is currently PCDATA, but see W3C's RFC2396 -->
<!ENTITY % URI "#PCDATA">

<!-- ELEMENT and ATTLIST Declarations -->

<!-- there is an additional kind of fixed argument -->
<!-- objects (resources) use a URL/URI as their OID, as in SHOE or RDF (cf. URML) -->
<!-- however, unlike for XHTML anchors etc. URI used as content, not as attribute -->
<!-- 'label' attribute, unlike URI, need not be unique -->
<!-- if no 'label' attribute is given, browser must highlight the URI itself -->
<!ELEMENT ur (%URI;)>
<!ATTLIST ur label CDATA #IMPLIED>
19. The Join Language: ruleml-urdatalog.dtd
<!-- An XML DTD for a 'UR' Datalog RuleML Sublanguage -->
<!-- Last Modification: 2001-01-23 -->

<!-- ENTITY Declarations -->
<!ENTITY % urdatalog.module "INCLUDE">
<!ENTITY % datalog.module "IGNORE">
<!ENTITY % datalog SYSTEM "ruleml-datalog.dtd">
%datalog;
<!ENTITY % ur SYSTEM "ruleml-ur.dtd">
%ur;

<!-- ELEMENT and ATTLIST Declarations -->

<![%urdatalog.module;[
<!-- atomic formulas are usable within conc's, prem's, and 'and's -->
<!-- atom element uses rel name followed by three kinds of arguments -->
<!ELEMENT atom (rel, (ur | ind | var)*)>
]]>
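The join DTD relies on an XML rule: when a parameter entity is declared more than once, the first declaration is binding. Setting datalog.module to IGNORE before pulling in ruleml-datalog.dtd therefore switches off the imported atom declaration, so the join DTD can supply its own. The Python sketch below simulates only this switching behaviour with regular expressions. It is a simplification, not a DTD processor: the function name and the shortened DTD text are assumptions, and nested conditional sections are not handled.

```python
import re

ENTITY = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"(INCLUDE|IGNORE)"\s*>')
SECTION = re.compile(r'<!\[%([\w.-]+);\[(.*?)\]\]>', re.S)

def resolve(dtd_text, switches=None):
    """Apply module switches; switches already set win over later declarations."""
    switches = dict(switches or {})
    for name, value in ENTITY.findall(dtd_text):
        switches.setdefault(name, value)        # first declaration is binding
    def keep(match):
        name, content = match.group(1), match.group(2)
        return content if switches.get(name) == "INCLUDE" else ""
    return SECTION.sub(keep, dtd_text), switches

# Shortened stand-in for ruleml-datalog.dtd (assumed text, for illustration).
DATALOG = '''<!ENTITY % datalog.module "INCLUDE">
<![%datalog.module;[
<!ELEMENT atom (rel, (ind | var)*)>
]]>'''

standalone, _ = resolve(DATALOG)
imported, used = resolve(DATALOG, {"datalog.module": "IGNORE"})
print("(ind | var)*" in standalone)
print("(ind | var)*" in imported)
```

Used stand-alone, the datalog text keeps its atom declaration; imported with the switch preset to IGNORE, it drops it.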
20. Conclusions
- RuleML DTD 0.7, a system of 12 DTDs, is available at http://www.dfki.de/ruleml/indtd.html
- Sample files, each referring to the most specific DTD still validating them, are at http://www.dfki.de/ruleml/exa
- Further rule categories (e.g. ICs and triggers) and DTD updates will be available via main RuleML page at http://www.dfki.de/ruleml
- Distributed KR can already be based on current DTDs, using (XSLT) transformations to reach follow-up and Participants DTDs