Title: CS6999 SWT Lecture 1 Introduction to the Semantic Web
1 CS6999 SWT Lecture 1: Introduction to the Semantic Web
- Bruce Spencer
- NRC-IIT Fredericton
- Sept 12, 2002
2 National Research Council
- Research Institutes and Facilities across Canada
- 17 research institutes
- 4 innovation centres
- 3,500 employees, 1,000 guest workers
- National science facilities
- S&T information for industry and scientific community
- CISTI: Canada Institute for Scientific and Technical Information
- Network of technology advisors supporting SMEs
- IRAP: Industrial Research Assistance Program
3 Institute for Information Technology
- There are two aspects to IIT
- A mature research organization of 80 people in Ottawa
- New labs being developed in four cities in New Brunswick and Nova Scotia involving 60 new people
- The whole organization is evolving to accommodate our new distributed nature
4 NRC's plans for New Brunswick
- What?
- NRC is building an e-business research team in New Brunswick
- E-business includes e-learning, e-government, e-health.
- Using information and communication technology to help us to educate, govern and take care of ourselves, to create wealth.
- New Brunswick and Canadian companies already have strengths in all three areas
- NB's communications infrastructure and interested telco
- Bilingual workforce
5 NRC's plans for New Brunswick
- NRC will act locally, and think nationally and globally
- Will work with New Brunswick community to develop clusters in e-business
- This is also NRC's national lab in e-business
- NRC will build international links
- Where?
- Main group (40 staff) in Fredericton, at UNBF
- Satellite in Saint John (6 staff), at E-Comm Centre, UNBSJ
- Satellite in Moncton (6 staff), at U. de Moncton
6 NRC's plans for New Brunswick
- How much?
- Five year budget 2001-2006
- Fredericton 25.5M
- Saint John 4.5M
- Moncton 4.5M
- Network 3.0M
- (includes link to NBCC Miramichi)
- TOTAL 37.5M
7 Abstract
- Much of the AI community that met at IJCAI in August 2001 was discussing the "Semantic Web", a proposal by the inventor of the web, Tim Berners-Lee, and others to add meaning to terms for items found on the web, with a view to making web interactions more accurate and more easily automated. Several US and European projects are concerned with creating and using taxonomies of terms in web page design and retrieval, and are supported by W3C and DARPA. The DAML+OIL language, a joint US-European project, proposes to add Resource Description Framework (RDF) to Extensible Markup Language (XML), tagging web content with meta-tags containing links to ontologies, as well as facts and rules that describe the intended use of the content. This draws from a quarter century of work in knowledge representation and reasoning systems by the artificial intelligence community.
- In this talk I will explain the goals and achievements of the Semantic Web effort to date, point out (some of) the remaining hurdles and, assuming that they are cleared, describe what these researchers expect to emerge. Interoperation among applications that exchange machine-understandable information will allow automated processing of web resources, and this has many applications in e-commerce. I will close with a suggestion of how IIT-Fredericton's Security/Privacy, Multi-Agent and "One Web" thrusts can be aligned with these international efforts.
8 Bruce
- MMath 83, BNR 83-86, Waterloo PhD 86-90, UNB prof 90-01, NRC 01-now
- Automated reasoning
- data structures in theorem proving
- eliminate redundant searching
- smallest proofs
- deductive databases
- Java in curriculum since 1997
9 Five main points
- Tim Berners-Lee's vision
- web information should be machine understandable
- Taxonomies of words shared within web communities
- no single ontology
- RDF meta-tags link XML tags to their roles
- US and European buy-in
- Where's Canada?
- Aligns with IIT Fredericton's thrusts
- multi-agent, security, OneWeb, voice
10 Overview and Course Mindmap
- Increasing demand for formalized knowledge on the Web: AI's chance!
- XML- and RDF-based markup languages provide a 'universal' storage/interchange format for such Web-distributed knowledge representation
- Course introduces knowledge markup and resource semantics: we show how to marry AI representations (e.g., logics and frames) with XML and RDF (incl. RDF Schema)
(Mindmap figure labels: Namespaces, CSS, DTDs, XSLT, Stylesheets, DAML, Agents, Transformations, Ontobroker, XQL, XML, HornML, Rules, Queries, RuleML, XQuery, XML-QL, SHOE, RDFS, Frames, Acquisition, TopicMaps, Protégé)
11 The Semantic Web Activity of the W3C
- The Semantic Web is a vision: the idea of having
- data on the Web defined and linked in a way that
- it can be used by machines not just for display purposes,
- but for
- automation,
- integration and
- reuse of data across various applications.
- (http://www.w3.org/2001/sw/Activity)
12 What your computer sees in HTML
- <b>Joe's Computer Store
- </b>
- <br>
- 365 Yearly Drive
Presentation information
What your computer sees in XML
<location> <name>Joe's Computer Store</name> <address>365 Yearly Drive</address> </location>
Content description (ambiguous)
13 What a computer could understand
- <mail:address xmlns:mail="http://www.canadapost.ca">
- <mail:name>Joe's Computer Store</mail:name>
- <mail:street>365 Yearly Drive</mail:street>
- </mail:address>
- www.canadapost.ca could define address, name, street, ...
- Search engines could then identify mail addresses
- Consider shopbots being able to find
- price, quantity, feature, model number, supplier, serial number, acquisition date
- Assumes that namespaces will be used consistently
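As a rough illustration of how a search engine or shopbot might pick out a mail address once the tags are bound to a namespace, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The function name `extract_mail_address` is hypothetical, and treating `http://www.canadapost.ca` as a namespace that defines `address`, `name` and `street` is the slide's supposition, not an existing vocabulary.

```python
import xml.etree.ElementTree as ET

# The namespace URI is the one supposed on the slide; no such vocabulary
# is claimed to exist at this address.
MAIL_URI = "http://www.canadapost.ca"
MAIL_NS = {"mail": MAIL_URI}

DOC = """<mail:address xmlns:mail="http://www.canadapost.ca">
  <mail:name>Joe's Computer Store</mail:name>
  <mail:street>365 Yearly Drive</mail:street>
</mail:address>"""


def extract_mail_address(xml_text):
    """Return name and street if the root is a mail:address element, else None."""
    root = ET.fromstring(xml_text)
    # ElementTree expands prefixed names to "{namespace-uri}localname".
    if root.tag != "{%s}address" % MAIL_URI:
        return None
    name = root.findtext("mail:name", default="", namespaces=MAIL_NS)
    street = root.findtext("mail:street", default="", namespaces=MAIL_NS)
    return {"name": name.strip(), "street": street.strip()}


print(extract_mail_address(DOC))
```

An element called `address` in some other namespace (or in none) is not matched, which is the point of the last bullet: the scheme only works if namespaces are used consistently.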
14 Semantic Web
- Semantics = meaning
- Good idea: Dictionary
- Create a dictionary of terms
- Put it on the web
- Mark up web pages so that terms are linked to these dictionary entries
- This allows more precise matching
- Better idea: Thesaurus
- has hierarchies of terms
- shades of meaning
- Best idea: Ontology
- hierarchy of terms and logic conditions
15 Semantic Web
- An agent-enabled resource
- information in machine-readable form, creating a revolution in new applications, environments and B2B commerce
- W3C Activity launched Feb 9, 2001
- DAML: DARPA Agent Markup Language
- US Gov funding to define languages, tools
- 16 project teams
- OIL is Ontology Inference Layer
- DAML+OIL is joint DARPA-EU
- Knowledge Representation is a natural choice
16 17 18 19
- SmokedSalmon is the intersection of Smoked and Salmon
- Gravalax is the intersection of Cured and Salmon, but not Smoked
20 The Semantic Web is about having the Internet use common sense.
- A search for keywords Salmon and Cured should return pages that mention Gravalax, even if they don't mention Salmon and Cured
- A search for Salmon and Smoked will return smoked salmon, and should also return Lox, but not Gravalax
(Figure labels: Smoked Salmon, Lox)
21 (Figure labels: Smoked Salmon, Lox)
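The salmon example can be sketched in a few lines of Python. This is only an illustrative toy, not how any Semantic Web search engine is implemented: the dictionaries `DEFINITIONS` and `SUBCLASS_OF`, the sample `PAGES`, and the assumption that Lox is a subclass of SmokedSalmon (suggested by the figure labels) are all hypothetical.

```python
# Class definitions as intersections of properties, with optional exclusions.
DEFINITIONS = {
    "SmokedSalmon": {"all_of": {"Salmon", "Smoked"}, "none_of": set()},
    "Gravalax": {"all_of": {"Salmon", "Cured"}, "none_of": {"Smoked"}},
}
# Assumed from the figure: Lox is a kind of smoked salmon.
SUBCLASS_OF = {"Lox": "SmokedSalmon"}

# Hypothetical pages, keyed by an id, with their text.
PAGES = {
    "page1": "Fresh Gravalax every Friday",
    "page2": "Lox and bagels",
    "page3": "House SmokedSalmon platter",
}


def properties(term):
    """Collect implied and excluded properties, following subclass links."""
    implied, excluded = set(), set()
    while term is not None:
        definition = DEFINITIONS.get(term)
        if definition:
            implied |= definition["all_of"]
            excluded |= definition["none_of"]
        term = SUBCLASS_OF.get(term)
    return implied, excluded


def expand_query(keywords):
    """Return ontology terms whose definitions entail every keyword."""
    keywords = set(keywords)
    matches = set()
    for term in set(DEFINITIONS) | set(SUBCLASS_OF):
        implied, excluded = properties(term)
        if keywords <= implied and not (keywords & excluded):
            matches.add(term)
    return matches


def search(keywords):
    """Return ids of pages that mention any term entailed by the keywords."""
    terms = expand_query(keywords)
    return sorted(
        page for page, text in PAGES.items()
        if any(term in text.split() for term in terms)
    )


print(search({"Salmon", "Cured"}))
print(search({"Salmon", "Smoked"}))
```

None of the sample pages contains the words "Salmon", "Cured" or "Smoked" on their own; the matches come entirely from the class definitions.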
22 Tim Berners-Lee's Semantic Web
23 RDF: Resource Description Framework
- Beginning of Knowledge Representation influence on Web
- Akin to Frames, Entity/Relationship diagrams, or Object/Attribute/Value triples
24 RDF Example
- <rdf:ProductSpecs about="http://www.lemoncomputers.ca/model_2300">
- <specs:colour>yellow</specs:colour>
- <specs:size>medium</specs:size>
- </rdf:ProductSpecs>
(Figure: model_2300 has size medium and colour yellow)
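The Object/Attribute/Value reading of this example can be made concrete with a small Python sketch. It stores the two statements as plain tuples and matches them against patterns; the helper names `match` and `describe` are hypothetical, and a real application would more likely use an RDF library than hand-rolled tuples.

```python
MODEL = "http://www.lemoncomputers.ca/model_2300"

# Each statement is an (object, attribute, value) triple.
TRIPLES = [
    (MODEL, "specs:colour", "yellow"),
    (MODEL, "specs:size", "medium"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def describe(triples, subject):
    """Collect every attribute/value pair recorded for one resource."""
    return {p: o for (s, p, o) in triples if s == subject}


print(match(TRIPLES, predicate="specs:colour"))
print(describe(TRIPLES, MODEL))
```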
25 RDF Class Hierarchy
- All lemon laptops get packed in cardboard boxes
- Allows one to customize existing taxonomies
- Example: palmtop computers still get packed in boxes
(Figure: model_2300 has size medium and colour yellow)
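One way to read this slide is that a fact attached to a class is inherited by anything added beneath it. The Python sketch below illustrates that reading only; the class names, the `packedIn` property, the individual `model_p10`, and the choice to place Palmtop under LemonLaptop are all assumptions made for the example.

```python
# Assumed customization: Palmtop is added as a subclass of LemonLaptop.
SUBCLASS_OF = {"Palmtop": "LemonLaptop"}

# model_p10 is a hypothetical palmtop added for illustration.
INSTANCE_OF = {"model_2300": "LemonLaptop", "model_p10": "Palmtop"}

# Facts stated once, at the class level.
CLASS_FACTS = {"LemonLaptop": {"packedIn": "cardboard box"}}


def class_chain(cls):
    """Return the class followed by all of its ancestors."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return chain


def lookup(individual, prop):
    """Find a property value on the individual's class or nearest ancestor."""
    for cls in class_chain(INSTANCE_OF[individual]):
        facts = CLASS_FACTS.get(cls, {})
        if prop in facts:
            return facts[prop]
    return None


print(lookup("model_2300", "packedIn"))
print(lookup("model_p10", "packedIn"))
```

Because the lookup takes the nearest class that states the property, a subclass could also override the packaging without touching the original taxonomy.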
26 Tim Berners-Lee's Semantic Web
27 Ontology Web Language W3C
- Previously known as DAML+OIL
- US DARPA Agent Markup Language
- EU Ontology Interchange Layer (Language)
- Composed of a hierarchy with additional conditions
- Based on Description logic, limited expressiveness
- Reasoning procedures are well-behaved
- Just enough power
28 Identifying Resources
- URL/URI
- Uniform resource locator / identifier
- Information sources, goods and services
- financial instruments
- money, options, investments, stocks, etc.
- Where do you want to go today?
- becomes What do you want to find?
29 Ontology
- Branch of philosophy dealing with the theory of being
- Tarski's assumption
- individuals, relationships and functions
- A common vocabulary and agreed-upon meanings to describe a subject domain
- What real-world objects do my tags refer to?
- How are these objects related?
- Communication requires shared terms
- others can join in
30 Ontology Layer
- Widens interoperability and interconversion
- knowledge representation
- More meta-information
- Which attributes are transitive, symmetric
- Which relations between individuals are 1-1, 1-many, many-many
- Communities exist
- DL, OIL, SHOE (Hendler)
- New W3C working group
31 Transitive, Subrole example
- One wants to ask about modes of transportation from Sydney to Fredericton
- "connected by Acadian Lines bus" is a role in a Nova Scotia taxonomy
- "connected by SMT bus" is a role from a New Brunswick taxonomy
- Both are subroles of "connected"
- "connected" is transitive
- Note that ontologies can be combined at runtime
32 Combining Rich Ontologies
- Only these facts are explicit
- in separate ontologies
- Connected by bus
- is superset
- is symmetric and transitive
- Route from Sydney to Fredericton is inferred
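The inference can be sketched in Python as a reachability search over the combined facts. The two fact lists stand in for the separate Nova Scotia and New Brunswick ontologies; the intermediate stop (Amherst), the role identifiers, and the function names are hypothetical, chosen only to make the example run.

```python
from collections import deque

# Explicit facts, kept in separate ontologies (the stops are illustrative).
NOVA_SCOTIA = [("Sydney", "connectedByAcadianLinesBus", "Amherst")]
NEW_BRUNSWICK = [("Amherst", "connectedBySMTBus", "Fredericton")]

# Both bus roles are subroles of the general role "connected".
SUBROLE_OF = {
    "connectedByAcadianLinesBus": "connected",
    "connectedBySMTBus": "connected",
}


def connected_graph(*ontologies):
    """Merge ontologies into one undirected graph of 'connected' links."""
    graph = {}
    for facts in ontologies:
        for a, role, b in facts:
            if SUBROLE_OF.get(role) == "connected":
                # "connected" is symmetric, so record both directions.
                graph.setdefault(a, set()).add(b)
                graph.setdefault(b, set()).add(a)
    return graph


def is_connected(graph, start, goal):
    """Breadth-first search; transitivity makes any path a connection."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


combined = connected_graph(NOVA_SCOTIA, NEW_BRUNSWICK)
print(is_connected(combined, "Sydney", "Fredericton"))
```

Neither ontology alone yields the route; it appears only once the two are merged under the shared superrole, which is what combining ontologies at runtime buys.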
33 Tim Berners-Lee's Semantic Web
34 Logic Layer
- Clausal logic encoded in XML
- RuleML, IBM CommonRules
- Special cases of first-order logic
- Horn Clauses for if-then type reasoning and integrity constraints
- Standard inference rules based on Resolution
- Various implementations: SQL, KIF, SLD (Prolog), XSB
- J-DREW reasoning tools in Java.
- Modus operandi: build tractable reasoning systems
- trade away expressiveness, gain efficiency
35 Logic Architecture Example
- Contracting parties integrate e-businesses via rules
(Figure labels: Seller E-Storefront; Buyer's ShopBot; Business Rules (on each side); Contract Rules Interchange; OPS5; Prolog)
36 Negotiation via rules
- usualPrice
- price(per-unit, ?PO, 60) ←
- purchaseOrder(?PO, supplierCo, ?AnyBuyer) ∧
- shippingDate(?PO, ?D) ∧ (?D ≥ 24April2001).
- volumeDiscountPrice
- price(per-unit, ?PO, 55) ←
- purchaseOrder(?PO, supplierCo, ?AnyBuyer) ∧
- quantityOrdered(?PO, ?Q) ∧ (?Q ≥ 1000) ∧
- shippingDate(?PO, ?D) ∧ (?D ≥ 24April2001).
- overrides(volumeDiscountPrice, usualPrice).
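To show how the override resolves the conflict between the two pricing rules, here is a minimal Python sketch. It is not RuleML or any rule engine's API. The comparison operators are not legible in the slide text, so the sketch assumes both the quantity and date tests mean "at least"; the purchase-order dictionary layout and the function names are likewise hypothetical.

```python
from datetime import date

CUTOFF = date(2001, 4, 24)  # 24April2001 from the rules


def usual_price(po):
    """usualPrice rule: 60 per unit (date test assumed to be 'on or after')."""
    if po["supplier"] == "supplierCo" and po["shipping_date"] >= CUTOFF:
        return 60
    return None


def volume_discount_price(po):
    """volumeDiscountPrice rule: 55 per unit (quantity test assumed 'at least 1000')."""
    if (
        po["supplier"] == "supplierCo"
        and po["quantity"] >= 1000
        and po["shipping_date"] >= CUTOFF
    ):
        return 55
    return None


RULES = {"usualPrice": usual_price, "volumeDiscountPrice": volume_discount_price}

# (winner, loser) pairs: the volume discount overrides the usual price.
OVERRIDES = {("volumeDiscountPrice", "usualPrice")}


def price_per_unit(po):
    """Fire all rules, drop any overridden by another fired rule, return the price."""
    fired = {}
    for label, rule in RULES.items():
        result = rule(po)
        if result is not None:
            fired[label] = result
    surviving = {
        label: price
        for label, price in fired.items()
        if not any((winner, label) in OVERRIDES for winner in fired)
    }
    prices = set(surviving.values())
    return prices.pop() if len(prices) == 1 else None


big_order = {"supplier": "supplierCo", "quantity": 2000, "shipping_date": date(2001, 5, 1)}
small_order = {"supplier": "supplierCo", "quantity": 10, "shipping_date": date(2001, 5, 1)}
print(price_per_unit(big_order), price_per_unit(small_order))
```

For the large order both rules fire and would give conflicting prices; the override fact removes the usual price, leaving a single answer.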
37 Hot Research Topics
- Tools to create ontologies
- Ontolingua
- Protégé-2000 (Stanford)
- OilEd
- Tools to learn ontologies from a large corpus such as corporate data
- Merging / aligning two different ontologies from different sources on the same topic
- Searching cum reasoning tools
- SHOE
38 Eventual Goal of these Efforts
- Agents locate goods, services
- use ontologies
- unambiguous
- business rules
- expressive language but reasoning tractable
- combine from various sources
- Gives rise to need of trust, privacy and security
- e.g. semantic web project to determine eligibility of patients for a clinical trial