Title: Language Technologies and the Semantic Web: An Essential Relationship.
1Language Technologies and the Semantic Web: An Essential Relationship
- Enrico Motta
- Professor of Knowledge Technologies
- Knowledge Media Institute
- The Open University
2Content of the Talk
- Update on the Semantic Web
- Beyond the hype
- What it is
- Why it is interesting
- What's its status?
- Semantic Web and AI
- Semantic Web Applications
- Key features
- Reasoning on the Semantic Web
- Key role of Language Technologies
- Conclusions
3The Semantic Web in 2 minutes
6<foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/enrico-motta/">
  <foaf:name>Enrico Motta</foaf:name>
  <foaf:firstName>Enrico</foaf:firstName>
  <foaf:surname>Motta</foaf:surname>
  <foaf:phone rdf:resource="tel:44-(0)1908-653506"/>
  <foaf:homepage rdf:resource="http://kmi.open.ac.uk/people/motta/"/>
  <foaf:workplaceHomepage rdf:resource="http://kmi.open.ac.uk/"/>
  <foaf:depiction rdf:resource="http://kmi.open.ac.uk/img/members/enrico.jpg"/>
  <foaf:topic_interest>Knowledge Technologies</foaf:topic_interest>
  <foaf:topic_interest>Semantic Web</foaf:topic_interest>
  <foaf:topic_interest>Ontologies</foaf:topic_interest>
  <foaf:topic_interest>Problem Solving Methods</foaf:topic_interest>
  <foaf:topic_interest>Knowledge Modelling</foaf:topic_interest>
  <foaf:topic_interest>Knowledge Management</foaf:topic_interest>
  <foaf:based_near>
    <geo:Point>
      <geo:lat>52.024868</geo:lat>
      <geo:long>-0.707143</geo:long>
  <contact:nearestAirport>
    <airport:name>London Luton Airport</airport:name>
    <airport:iataCode>LTN</airport:iataCode>
    <airport:location>Luton, United Kingdom</airport:location>
    <geo:lat>51.866666666667</geo:lat>
    <geo:long>-0.36666666666667</geo:long>
    <rdfs:seeAlso rdf:resource="http://www.daml.org/cgi-bin/airport?LTN"/>
  <foaf:currentProject>
    <foaf:Project>
      <foaf:name>AquaLog</foaf:name>
  </foaf:currentProject>
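To make the "web of data" reading of the FOAF profile concrete, the following is a minimal sketch of how such a description can be read as subject-property-value triples. It uses only Python's standard XML parser rather than a dedicated RDF library, and the sample document is a trimmed, well-formed subset of the profile above (the enclosing rdf:RDF element and the namespace declarations are assumptions added so that it parses). The function name person_triples is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard RDF and FOAF namespace URIs.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Assumption: a trimmed, well-formed subset of the slide's fragment,
# wrapped in rdf:RDF so that a plain XML parser accepts it.
FOAF_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/enrico-motta/">
    <foaf:name>Enrico Motta</foaf:name>
    <foaf:homepage rdf:resource="http://kmi.open.ac.uk/people/motta/"/>
    <foaf:topic_interest>Semantic Web</foaf:topic_interest>
    <foaf:topic_interest>Ontologies</foaf:topic_interest>
  </foaf:Person>
</rdf:RDF>
"""


def person_triples(xml_text):
    """Yield (subject, property, value) for each property of each foaf:Person."""
    root = ET.fromstring(xml_text)
    about = "{%s}about" % NS["rdf"]
    resource = "{%s}resource" % NS["rdf"]
    for person in root.findall("foaf:Person", NS):
        subject = person.get(about)
        for prop in person:
            # A property points either at another resource or at a literal.
            value = prop.get(resource) or (prop.text or "").strip()
            # Every property in this sample is in the FOAF namespace.
            local_name = prop.tag.split("}", 1)[1]
            yield (subject, "foaf:" + local_name, value)


triples = list(person_triples(FOAF_DOC))
for triple in triples:
    print(triple)
```

A real application would normally use an RDF toolkit, since RDF/XML allows many equivalent serializations that this simple element walk does not handle.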
7The foaf ontology
8The SW as Web of Data
9Current status of the semantic web
- 10-20 million semantic web documents
- Expressed in RDF, OWL, DAML+OIL
- 7K-10K ontologies
- These cover a variety of domains - multimedia,
computing, management, bio-medical sciences,
geography, entertainment, upper level concepts,
etc
The above figures refer to resources which are
publicly accessible on the web
10The Semantic Web today
- To a significant extent the Semantic Web is already in place and is characterized by a widespread production of formalized knowledge models (ontologies and metadata), from a variety of different groups and individuals
- The Next Knowledge Medium: "An information network with semi-automated services for the generation, distribution, and consumption of knowledge" (Stefik, 1986)
- Knowledge modelling to become a new form of literacy? (Stutt and Motta, 1997)
- Still primarily a research enterprise; however, interest is rapidly increasing in both governmental and business organizations
- Early adopters phase
- The result is slowly emerging as an unprecedented knowledge resource, which can enable a new generation of intelligent applications on the web
11Semantic Web Applications
- What can you do with the Semantic Web?
12Corporate Semantic Webs
- A corporate ontology is used to provide a homogeneous view over heterogeneous data sources
- Often tackle Enterprise Information Integration scenarios
- Hailed by Gartner as one of the key emerging strategic technology trends
- E.g., see personal information management in Garlik
14Exploiting large scale semantics
[Diagram labels: Next Generation SW Applications; Semantic Web; Semantic Web Gateway]
15Exploiting large scale semantics
[Diagram labels: Next Generation SW Applications; Semantic Web]
16NGSW Applications in the context of AI research
17Knowledge-Based Systems
"Today there has been a shift in paradigm. The fundamental problem of understanding intelligence is not the identification of a few powerful techniques, but rather the question of how to represent large amounts of knowledge in a fashion that permits their effective use" (Goldstein and Papert, 1977)
Intelligent Behaviour
18The Knowledge Acquisition Bottleneck
KA Bottleneck
Intelligent Behaviour
19SW as Enabler of Intelligent Behaviour
Both a platform for knowledge publishing and a
large scale source of knowledge
Intelligent Behaviour
20KBS vs SW Systems
                  Classic KBS     SW Systems
Provenance        Centralized     Distributed
Size              Small/Medium    Extra Huge
Repr. Schema      Homogeneous     Heterogeneous
Quality           High            Very Variable
Degree of trust   High            Very Variable
21Key Paradigm Shift
Intelligent Behaviour
Classic KBS: A function of sophisticated, logical, task-centric problem solving
SW Systems: A side-effect of being able to integrate different types of reasoning to handle size and heterogeneous quality and representation
22Next Generation SW Applications: Examples
- Case Study 1: Automatic Alignment of Thesauri in the Agricultural/Fishery Domain
23Method
- SCARLET - matching by Harvesting the SW
- Automatically select and combine multiple online
ontologies to derive a relation
[Diagram: given Concept_A (e.g., Supermarket) and Concept_B (e.g., Building), Scarlet accesses the Semantic Web and deduces a Semantic Relation ( ) between them]
24Two strategies
[Figure: Scarlet deriving relations from the Semantic Web. Panel (A), one ontology: Supermarket, Shop, PublicBuilding, Building. Panel (B), across ontologies: Cholesterol, Steroid (appearing twice), Lipid, OrganicChemical.]
Deriving relations from (A) one ontology and (B) across ontologies.
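The two strategies can be sketched as follows. This is an illustrative toy, not Scarlet's actual implementation: the three ontologies, their names, and the assignment of concepts to them are assumptions loosely based on the concept labels on the slide, and only subclass relations are considered.

```python
# Toy ontologies (hypothetical): each maps a concept to its direct superclasses.
ONTOLOGIES = {
    "onto_buildings": {
        "Supermarket": {"Shop"},
        "Shop": {"PublicBuilding"},
        "PublicBuilding": {"Building"},
    },
    "onto_food": {"Cholesterol": {"Steroid"}},
    "onto_chemistry": {"Steroid": {"Lipid"}, "Lipid": {"OrganicChemical"}},
}


def ancestors(ontology, concept):
    """Return all superclasses of `concept` reachable inside one ontology."""
    seen, stack = set(), [concept]
    while stack:
        for parent in ontology.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def within_one_ontology(a, b):
    """Strategy (A): names of ontologies that alone entail 'a is-a b'."""
    return [name for name, onto in ONTOLOGIES.items() if b in ancestors(onto, a)]


def across_ontologies(a, b):
    """Strategy (B): find c with 'a is-a c' in one ontology and 'c is-a b' in another.

    Returns (first_ontology, c, second_ontology), or None if no chain exists.
    """
    for name1, onto1 in ONTOLOGIES.items():
        for c in ancestors(onto1, a):
            for name2, onto2 in ONTOLOGIES.items():
                if name1 != name2 and b in ancestors(onto2, c):
                    return (name1, c, name2)
    return None


print(within_one_ontology("Supermarket", "Building"))
print(within_one_ontology("Cholesterol", "OrganicChemical"))
print(across_ontologies("Cholesterol", "OrganicChemical"))
```

In this toy data no single ontology relates Cholesterol to OrganicChemical, so the relation is only found by chaining through the shared concept Steroid. Matching concepts across ontologies by identical label is itself a simplification; with heterogeneous sources this step is where language technologies are needed.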
25Experiment
- Matching
- AGROVOC
- UN's Food and Agriculture Organisation (FAO) thesaurus
- 28,174 descriptor terms
- 10,028 non-descriptor terms
- NALT
- US National Agricultural Library Thesaurus
- 41,577 descriptor terms
- 24,525 non-descriptor terms
26Used Ontologies (226)
http://139.91.183.30:9090/RDF/VRP/Examples/tap.rdf
http://reliant.teknowledge.com/DAML/SUMO.daml
http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
http://reliant.teknowledge.com/DAML/Economy.daml
http://gate.ac.uk/projects/htechsight/Technologies.daml
27Evaluation 1 - Precision
- Manual assessment of 1000 mappings (15)
- Evaluators
- Researchers in the area of the Semantic Web
- 6 people split in two groups
- Results
- Comparable to best results for background
knowledge based matchers.
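Precision here is the fraction of manually assessed mappings that evaluators judged correct. A minimal sketch of that computation follows; the verdict labels and the two short verdict lists are hypothetical placeholders, not the study's data.

```python
from collections import Counter


def precision(judgments):
    """Fraction of assessed mappings judged correct."""
    counts = Counter(judgments)
    assessed = counts["correct"] + counts["incorrect"]
    if assessed == 0:
        raise ValueError("no assessed mappings")
    return counts["correct"] / assessed


# Hypothetical verdicts from two evaluator groups (not the reported results).
group_1 = ["correct", "correct", "incorrect", "correct"]
group_2 = ["correct", "incorrect", "incorrect", "correct"]

p1 = precision(group_1)
p2 = precision(group_2)
print(p1, p2)
```

Computing precision per group, as here, also exposes disagreement between groups, which a single pooled figure would hide.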
28Evaluation 2 - Error Analysis
29Other Case Studies
30Giving meaning to tags
31Example
Cluster_1: college, commerce, corporate, course, education, high, instructing, learn, learning, lms, school, student
1. http://gate.ac.uk/projects/htechsight/Employment.daml
2. http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
3. http://www.mondeca.com/owl/moses/ita.owl
4. http://www.cs.utexas.edu/users/mfkb/RKF/tree/CLib-core-office.owl
35Conclusions
36Typical misconceptions
- "The SW is a long-term vision"
- Ehm... actually it already exists
- "The SW will never work because nobody is going to annotate their web pages"
- The SW is not about annotating web pages; the SW is a web of data, most of which are generated from DBs, or from web mining software, or from applications which produce SW data as a side effect of supporting users' tasks
- "The idea of a universal ontology has failed before and will fail again. Hence the SW is doomed"
- The SW is not about a single universal ontology. Already there are around 10K ontologies and the number is growing
- SW applications may use 1, 2, 3, or even hundreds of ontologies
37SW and Language Technologies
- All the applications mentioned here combine language, web, statistical and semantic technologies
- Heterogeneity and sloppy modelling imply that language and statistical technologies are almost always needed when building NGSW apps
- In contrast with traditional KBS, intelligent behaviour is more a side-effect of integrating multiple techniques to handle scale and heterogeneity than a function of powerful deductive reasoning