Title: CCSW: The Competence Center Semantic Web
1 CCSW: The Competence Center Semantic Web
- Harold Boley, DFKI GmbH
- Presentation in Course Rule Markup Languages
- Univ. Kaiserslautern, April 26th, 2002
2 General Overview
- Semantic Web: W3C Activity on machine-interpreted documents that can be used (not just for display but) for automation, integration, and reuse across applications (http://www.w3.org/2001/sw/activity)
- DFKI has long been working in Semantic Web technologies: description logics, ontologies, metadata, rule systems, agents, NL parsing, information extraction, knowledge management, etc.
- Current CCSW focus at DFKI: Robust Web-document authoring and annotation for agent-based information management with webized object representations, ontologies, and rule systems
- CCSW's Semantic Web view: Higher-level system emerging from increasingly structured subwebs, each serving the needs of a specific community
Co-Heads: Dr. Harold Boley (Kaiserslautern), Dr. Paul Buitelaar (Saarbrücken)
URL: http://ccsw.dfki.de
Services: Consulting, Studies, and Projects
3 Semantic Web and Web Services: Use Databases and Rule Systems
4 General DFKI SemWeb Areas
- Content: Ontology Development
- Manual, Semi-Automatic: Ontology Learning and Adaptation
- Specific for a Task, Organisation (IntraNet), Domain (ExtraNet)
- Applications: Intelligent and Dynamic Information Integration and Access
- Intelligent Information Integration
- Intelligent, Cooperative Agents
- Content-Based Information Access
- Cross-Lingual and Multimedia Information Access
- Company- and User-Adaptive Information Systems
- Distributed Agent-Based Organizational Memories
5 Some SemWeb Applications@DFKI (I)
- Intelligent Information Integration; Intelligent, Cooperative Agents
- SmartKOM: Combination of User Modeling and Plan Recognition to Integrate Knowledge from Multimodal Sources
- Intelligent Information Integration
- MUMIS: Ontology-Based Information Integration from Multilingual Sources
- Content-Based, Cross-Lingual Multimedia Information Access
- Combinations of Ontology-Based Information Extraction, Text Mining and Semantic Annotation for Knowledge Markup of Text or Multimedia Documents with Metadata for Content-Based, Cross-Lingual, Multimedia Information Access
- GETESS (Information Extraction, Text Mining), MuchMore (Semantic Annotation, Text Mining), MUMIS (Information Extraction, Multimedia)
6 Some SemWeb Applications@DFKI (II)
- Company- and User-Adaptive Information Systems
- Adaptive READ: Document Retrieval on the Basis of Machine Learning
- Algorithms for Automatic IR-Parameter Optimization
- Distributed Agent-Based Organizational Memories
- FRODO: Ontology Acquisition from Texts and User Interaction
- for Workflow Enactment and Information Access
7 The Semantic Web Layered Architecture
Tim Berners-Lee: "Axioms, Architecture and Aspirations", W3C all-working group plenary Meeting, 28 February 2001 (http://www.w3.org/2001/Talks/0228-tbl/slide5-0.html)
8 Present SemWeb Challenges
- Can we make W3C's original Semantic Web notion more
- precise (Semantic): content data vs. metadata semantics?
- specific (Web): some intranets vs. the Internet?
- What techniques will semantic webs use from Information Retrieval, Databases, Ontologies, (Description, Horn) Logics, W3C Markup Languages (XML, RDF, XSLT), Knowledge Management, Agents, Web Services (WSDL), ...?
- Which semweb success stories (killer apps) exist (dmoz.org, UNSPSC, eCl@ss, ECCnet)?
- How to rank candidate semweb applications for showing the semweb potentials in our own organizations and for our customers?
9 SemWeb Language Principles
- Existing (database, logic) languages can be "webized" (Tim Berners-Lee) by introducing URIs as a new kind of (constant) symbols
- The languages should be scalable to a large amount of Web-distributed content, hence should use a small, if not minimal, formalism
- A simple formalism doesn't interfere with the content
- Relational databases with SQL are a good example
- XML DTDs, the RDF model, the DAML+OIL core, and the modularized RuleML are such candidate languages (unlike, perhaps, XML Schema, the many RDF syntaxes, full DAML+OIL, or a monolithic RuleML)
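To make the "webizing" principle concrete, the following minimal Python sketch treats URIs as one more kind of constant symbol in an otherwise ordinary relational fact. The helper names (`is_uri`, `webize`), the example.org URIs, and the sample fact are illustrative assumptions, not part of the talk or of any W3C specification.

```python
from urllib.parse import urlparse


def is_uri(symbol: str) -> bool:
    """Treat a symbol as a webized constant if it is an absolute http(s) URI."""
    parts = urlparse(symbol)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def webize(fact, base):
    """Replace every local constant in a fact by a URI under the given base.

    Symbols that already are URIs are left untouched.
    """
    return tuple(s if is_uri(s) else base + s for s in fact)


# A conventional fact with local constants (hypothetical example data) ...
local_fact = ("premium", "peter_miller")

# ... and its webized counterpart, whose constants are globally unique URIs.
webized_fact = webize(local_fact, "http://example.org/ns#")
print(webized_fact)
```

The formalism itself stays as small as before: only the vocabulary of constants changes, which is what lets facts from different Web sites be combined without name clashes.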
10 SemWeb Core Issue: Metadata Ontologies (I)
- For Web-page annotation, browsers should use a top-level pane/menu for metadata (cf. Annotea)
- Metadata should be generated interactively from content data, via standardized domain ontologies (NLP tools/resources for metadata extraction and annotation)
- Search engines should show the same ontologies for navigating and searching content with high precision
- Information agents may also use the ontologies for retrieving and integrating content for users
11 SemWeb Core Issue: Metadata Ontologies (II)
- Instead of a single global ontology for metadata there will certainly be several local ontologies, which require integration, e.g. by alignment on demand or via derivation/transformation rules
- Maintenance of domain ontologies for metadata must be machine-supported, e.g. by links and/or transformations between versions (cf. MeSH)
- Metadata ontologies can describe heterogeneous Web pages in a homogeneous format
- Some ontology queries provide direct answers (fact retrieval); others provide relevant Web pages (document retrieval); yet others, both
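As a rough illustration of alignment on demand via transformation rules, and of the difference between fact retrieval and document retrieval, consider the Python sketch below. The two local ontologies ("shopA", "shopB"), their terms, the shared terms, and the page URLs are all invented for this example.

```python
# Hypothetical transformation rules: (local ontology, local term) -> shared term.
ALIGNMENT_RULES = {
    ("shopA", "Notebook"): "LaptopComputer",
    ("shopB", "Laptop"): "LaptopComputer",
    ("shopB", "Handy"): "MobilePhone",
}

# Hypothetical page metadata, each annotated with a term of one local ontology.
PAGES = [
    {"url": "http://example.org/a/1", "ontology": "shopA", "term": "Notebook"},
    {"url": "http://example.org/b/7", "ontology": "shopB", "term": "Laptop"},
    {"url": "http://example.org/b/9", "ontology": "shopB", "term": "Handy"},
]


def align(page):
    """Map a page's local term to the shared term on demand (None if no rule applies)."""
    return ALIGNMENT_RULES.get((page["ontology"], page["term"]))


def document_retrieval(shared_term):
    """Return the URLs of all pages whose aligned term matches the query."""
    return [p["url"] for p in PAGES if align(p) == shared_term]


def fact_retrieval(url):
    """Return a direct answer: the shared term describing one page, or None."""
    for p in PAGES:
        if p["url"] == url:
            return align(p)
    return None
```

Here heterogeneous pages from both shops are described in one homogeneous vocabulary only at query time; no global ontology has to be agreed on in advance.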
12 Web-Based B2C or B2B Rule Exchange
Diagram (labels only): translate to standard format (e.g., RuleML); publish rulebase 1, ..., publish rulebase m; compare, instantiate, and run rulebases
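The exchange steps named on this slide can be sketched in Python as follows, assuming the rulebases have already been translated into one shared format. The `DiscountRule` record, the seller names, and all percentages are hypothetical stand-ins for published rulebases; a real exchange would use a standard format such as RuleML rather than Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscountRule:
    """One discount rule in a shared (already translated) format."""
    customer_class: str
    product_class: str
    percent: float


# Hypothetical published rulebases 1..m of different sellers.
RULEBASES = {
    "seller1": [DiscountRule("premium", "regular", 5.0),
                DiscountRule("premium", "luxury", 7.5)],
    "seller2": [DiscountRule("premium", "regular", 6.0)],
}


def run(rulebase, customer_class, product_class):
    """Instantiate the rules with the given classes; return the best discount (0.0 if none)."""
    applicable = [r.percent for r in rulebase
                  if r.customer_class == customer_class
                  and r.product_class == product_class]
    return max(applicable, default=0.0)


def compare(rulebases, customer_class, product_class):
    """Run every published rulebase on the same request and collect the results."""
    return {name: run(rb, customer_class, product_class)
            for name, rb in rulebases.items()}
```

A buyer agent could call `compare(RULEBASES, "premium", "regular")` to see which seller's rules yield the better discount for the same request.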
13 From Natural Language to Horn Logic
14 RuleML Markup and Tree
"The discount for a customer buying a product is 5.0 percent if the customer is premium and the product is regular."
<imp>
  <_head>
    <atom>
      <_opr><rel>discount</rel></_opr>
      <var>customer</var> <var>product</var>
      <ind>5.0 percent</ind>
    </atom>
  </_head>
  <_body>
    <and>
      <atom>
        <_opr><rel>premium</rel></_opr> <var>customer</var>
      </atom>
      <atom>
        <_opr><rel>regular</rel></_opr> <var>product</var>
      </atom>
    </and>
  </_body>
</imp>
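To show how such markup can be read as a Horn rule, here is a minimal Python sketch that parses the rule above with the standard library's `xml.etree.ElementTree` and applies it to a few ground facts. The facts and the names "Peter Miller" and "Honda" are assumptions made for illustration. The variable bindings are supplied by the caller, so this is a simplification and not a full Horn-logic engine with unification.

```python
import xml.etree.ElementTree as ET

RULEML = (
    "<imp><_head><atom><_opr><rel>discount</rel></_opr>"
    "<var>customer</var><var>product</var><ind>5.0 percent</ind></atom></_head>"
    "<_body><and>"
    "<atom><_opr><rel>premium</rel></_opr><var>customer</var></atom>"
    "<atom><_opr><rel>regular</rel></_opr><var>product</var></atom>"
    "</and></_body></imp>"
)


def parse_atom(atom):
    """Return (relation name, [(kind, text), ...]) for one <atom> element."""
    rel = atom.find("_opr/rel").text
    args = [(child.tag, child.text) for child in atom if child.tag in ("var", "ind")]
    return rel, args


def parse_imp(xml_text):
    """Split an <imp> rule into its head atom and its list of body atoms."""
    imp = ET.fromstring(xml_text)
    head = parse_atom(imp.find("_head/atom"))
    body = [parse_atom(a) for a in imp.findall("_body/and/atom")]
    return head, body


def apply_rule(head, body, facts, binding):
    """Return the instantiated head if every body atom holds under the binding, else None."""
    def ground(args):
        # Variables are looked up in the binding; individual constants stay as they are.
        return tuple(binding[text] if kind == "var" else text for kind, text in args)

    if all((rel, ground(args)) in facts for rel, args in body):
        rel, args = head
        return rel, ground(args)
    return None


head, body = parse_imp(RULEML)
facts = {("premium", ("Peter Miller",)), ("regular", ("Honda",))}  # hypothetical facts
print(apply_rule(head, body, facts, {"customer": "Peter Miller", "product": "Honda"}))
```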
15 Intertranslating RuleML and RFML
"The discount for a customer buying a product is 5.0 percent if the customer is premium and the product is regular."
RuleML:
<imp>
  <_head>
    <atom>
      <_opr><rel>discount</rel></_opr>
      <var>customer</var> <var>product</var>
      <ind>5.0 percent</ind>
    </atom>
  </_head>
  <_body>
    <and>
      <atom>
        <_opr><rel>premium</rel></_opr> <var>customer</var>
      </atom>
      <atom>
        <_opr><rel>regular</rel></_opr> <var>product</var>
      </atom>
    </and>
  </_body>
</imp>
RFML:
<hn>
  <pattop>
    <con>discount</con>
    <var>customer</var> <var>product</var>
    <con>5.0 percent</con>
  </pattop>
  <callop>
    <con>premium</con> <var>customer</var>
  </callop>
  <callop>
    <con>regular</con> <var>product</var>
  </callop>
</hn>
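The RuleML-to-RFML direction of this intertranslation can be sketched in Python with `xml.etree.ElementTree`. The tag mapping used here (imp to hn, head atom to pattop, body atoms to callop, rel and ind to con) is inferred only from the single example on this slide; it is an assumption, not a complete or official RuleML/RFML translator.

```python
import xml.etree.ElementTree as ET

RULEML = (
    "<imp><_head><atom><_opr><rel>discount</rel></_opr>"
    "<var>customer</var><var>product</var><ind>5.0 percent</ind></atom></_head>"
    "<_body><and>"
    "<atom><_opr><rel>premium</rel></_opr><var>customer</var></atom>"
    "<atom><_opr><rel>regular</rel></_opr><var>product</var></atom>"
    "</and></_body></imp>"
)


def convert_atom(atom, target_tag):
    """Turn a RuleML <atom> into an RFML <pattop> or <callop> element."""
    out = ET.Element(target_tag)
    # The relation name becomes the leading <con> of the RFML operator application.
    ET.SubElement(out, "con").text = atom.find("_opr/rel").text
    for child in atom:
        if child.tag == "var":
            ET.SubElement(out, "var").text = child.text
        elif child.tag == "ind":
            # Individual constants are also written as <con> in the slide's RFML.
            ET.SubElement(out, "con").text = child.text
    return out


def ruleml_to_rfml(xml_text):
    """Translate one RuleML <imp> rule into an RFML <hn> clause (as a string)."""
    imp = ET.fromstring(xml_text)
    hn = ET.Element("hn")
    hn.append(convert_atom(imp.find("_head/atom"), "pattop"))
    for atom in imp.findall("_body/and/atom"):
        hn.append(convert_atom(atom, "callop"))
    return ET.tostring(hn, encoding="unicode")


rfml = ruleml_to_rfml(RULEML)
print(rfml)
```

Note that the mapping loses the distinction between relation names and individual constants, since both end up as `<con>`; translating back from RFML would therefore need positional information.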
16 Current Players
- USA: W3C, DARPA, NSF, Maryland, Stanford, ...
- Canada: NRC-IIT-CISTI, ...
- Europe: IST
- Netherlands: Amsterdam, Twente, ...
- UK: Manchester, Newcastle, ...
- France: INRIA, ...
- Germany: Karlsruhe, DFKI, Hannover, Hamburg, Berlin, IW-Köln, ...
- Sweden: Linköping
- Switzerland: MCM
- Japan: INTAP, Keio, CARC, Ricoh, ...
- Korea: KAIST
- Australia: Melbourne, ...
- ...
17 Major Funding
- USA: DAML, W3C Web Ontology Working Group
- Canada: NRC
- Europe: OntoWeb, Semantic Web Technologies
- Japan: METI
- ...
- Canada-Europe: ISTEC
- Japan-Europe: ?
- ...
18 SemWeb Courses
- University of Maryland
- Stanford University
- Lehigh University
- Vrije Universiteit Amsterdam
- Universität Karlsruhe
- Universität Kaiserslautern
- Universität Saarbrücken
- ...