Title: Semantic Web Tutorial
1 Semantic Web Tutorial
- Peter Fox (NCAR)
- ESIP Federation Summer Meeting
- July 17, 2007, Madison, WI
- WARNING - this presentation contains acronyms
2 Terminology
- Semantic Web
- An extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation, www.semanticweb.org
- Primer: http://www.ics.forth.gr/isl/swprimer/
- Semantic Grid
- Semantic services to use the resources of many computers connected by a network to solve large-scale computational/data problems
- Provenance
- origin or source from which something comes, intention for use, who/what generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility.
- Service-oriented architecture
- Provision of a capability over the internet via a remote procedure call using prescribed input, output and pre-conditions
- Ontology (n.d.). The Free On-line Dictionary of Computing. http://dictionary.reference.com/browse/ontology
- An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them.
3 Semantic Web Layers
http://www.w3.org/2003/Talks/1023-iswc-tbl/slide26-0.html, http://flickr.com/photos/pshab/291147522/
4 Terminology
- Closed World - where complete knowledge is known (encoded), AI relied on this
- Open World - where knowledge is incomplete/evolving, SW promotes this
- Languages
- OWL - Web Ontology Language (W3C)
- RDF - Resource Description Framework (W3C)
- OWL-S/SWSL - Web Services (W3C)
- WSMO/WSML - Web Services (EC/W3C)
- SWRL - Semantic Web Rule Language, RIF - Rule Interchange Format
- PML - Proof Markup Language
- Editors: Protégé, SWOOP, Medius, SWeDE
- Reasoners
- Pellet, Racer, Medius KBS, FACT, fuzzyDL, KAON2, MSPASS, QuOnto
- Query Languages
- SPARQL, XQUERY, SeRQL, OWL-QL, RDFQuery
- Other Tools for Semantic Web
- Search: SWOOGLE, swoogle.umbc.edu
- Collaboration: www.planetont.org
- Other: Jena, SeSAME/SAIL, Mulgara, Eclipse, KOWARI
- Semantic wiki: OntoWiki, SemanticMediaWiki
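The closed-world versus open-world distinction at the top of this slide can be illustrated with a minimal sketch. The fact set and the `Spectrometer` query are hypothetical, chosen only for illustration; the point is that a missing fact is "false" under a closed-world reading and "unknown" under an open-world one.

```python
from typing import Optional, Tuple

Triple = Tuple[str, str, str]

# Illustrative knowledge base (hypothetical content).
FACTS = {
    ("Interferometer", "is-a", "OpticalInstrument"),
}

def holds_closed_world(triple: Triple) -> bool:
    """Closed world: anything not recorded is treated as false."""
    return triple in FACTS

def holds_open_world(triple: Triple) -> Optional[bool]:
    """Open world: anything not recorded is unknown (None), not false."""
    return True if triple in FACTS else None

# A statement the knowledge base says nothing about.
query = ("Spectrometer", "is-a", "OpticalInstrument")
print(holds_closed_world(query))  # False
print(holds_open_world(query))    # None
```

Under the open-world reading, a later source is free to assert the missing statement without contradicting anything already known.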
5 Application Areas for SW
- Smart search
- Annotation (even simple forms), smart tagging
- Geospatial
- Implementing logic (rules), e.g. in workflows
- Data integration
- Verification ... and the list goes on
- Web services
- Web content mining with natural language parsing
- User interface development (portals)
- Semantic desktop
- Wikis - OntoWiki, SemanticMediaWiki
- Sensor Web
- Software engineering
- Explanation
6 Semantic Web Basics
- The triple: subject-predicate-object
- Interferometer is-a optical instrument
- Optical instrument has focal length
- W3C is the primary (but not sole) governing organization
- RDF
- OWL 1.0 - Web Ontology Language (OWL 1.1 on the way)
- RDF: programming environments for 14 languages, including C, C++, Python, Java, Javascript, Ruby, PHP, ... (no Cobol or Ada yet :-( )
- OWL: programming for Java (mainly); Lite, DL, Full
- Editors, tools, etc.
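As a rough illustration of the triple model on this slide, the two example statements can be held as (subject, predicate, object) tuples. This is a plain-Python sketch, not the API of any RDF library; the `Triple` type and the `objects` helper are names invented here.

```python
from typing import NamedTuple, Set

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

# The two example statements from the slide.
graph = {
    Triple("Interferometer", "is-a", "OpticalInstrument"),
    Triple("OpticalInstrument", "has", "FocalLength"),
}

def objects(graph: Set[Triple], subject: str, predicate: str) -> Set[str]:
    """Return every object linked to `subject` by `predicate`."""
    return {t.obj for t in graph
            if t.subject == subject and t.predicate == predicate}

print(objects(graph, "Interferometer", "is-a"))  # {'OpticalInstrument'}
```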
7 Ontology Spectrum
Figure: ontology spectrum, from least to most expressive:
- Catalog/ID
- Terms/glossary
- Thesauri "narrower term" relation
- Informal is-a
- Formal is-a
- Formal instance
- Frames (properties)
- Value restrictions
- Selected logical constraints (disjointness, inverse, ...)
- General logical constraints
Originally from AAAI 1999 Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness. Description in www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
8 SW ≠ ontologies on the web (!)
- Ontologies are important, but use them only when necessary, as identified by use cases
- The Semantic Web is about integrating data on the Web; ontologies (and/or rules) are tools to achieve that when necessary
- SW ontologies ≠ some big (central) ontology
- The ethos of the Semantic Web is on sharing, i.e., sharing possibly many small ontologies
- A huge, central ontology could be difficult to manage in terms of maintenance.
- Semantic Web languages such as OWL contain primitives for equivalence and disjointness of terms and meta-primitives for versioning info
- The practice
- SW applications using ontologies mix a large number of ontologies and vocabularies (FOAF, DC, and others)
- the real advantage comes from this mix; that is also how new relationships may be discovered
- One readable background article from the metadata world is available at http://www.metamodel.com/article.php?story=20030115211223271
9 Semantic Web Myths
- the Semantic Web is a reincarnation of Artificial Intelligence on the Web (closed world versus open world)
- it relies on giant, centrally controlled ontologies for "meaning" (as opposed to a democratic, bottom-up control of terms)
- one has to add metadata to all Web pages, convert all relational databases, and XML data to use the Semantic Web
- it is just an ugly application of XML
- one has to learn formal logic, knowledge representation techniques, description logic, etc., to use it
- it is, essentially, an academic project, of no interest for industry
10 Selected Technical Benefits
- Integrating Multiple Data Sources
- Semantic Drill Down / Focused Perusal
- Statements about Statements
- Inference
- Translation
- Smart (Focused) Search
- Smarter Search Configuration
- Proof and Trust
Updated material reused from "The Substance of the Web", McGuinness and Dean, Semantic Web Applications for National Security, May 2005. http://www.schafertmd.com/swans/agenda.html
11 1. Integrating Multiple Data Sources
- The Semantic Web lets us merge statements from different sources
- The RDF Graph Model allows programs to use data uniformly regardless of the source
- Figuring out where to find such data is a motivator for Semantic Web Services
Figure: one RDF graph merged from several data sources (different line/text colors represent different data sources). Node labels: Ionosphere, Terrestrial Ionosphere, magnetic, 100, km. Property labels: name, hasCoordinates, hasLowerBoundaryValue, hasLowerBoundaryUnit.
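A minimal sketch of merging statements from different sources: because every source uses the same triple shape, merging is a set union and duplicates collapse automatically. The split of statements between `source_a` and `source_b`, the `ex:` prefix, and the edge layout are assumptions; only the labels come from the slide's figure.

```python
# Two hypothetical sources describing the same resource.
source_a = {
    ("ex:Ionosphere", "name", "Terrestrial Ionosphere"),
    ("ex:Ionosphere", "hasCoordinates", "magnetic"),
}
source_b = {
    ("ex:Ionosphere", "name", "Terrestrial Ionosphere"),  # duplicate statement
    ("ex:Ionosphere", "hasLowerBoundaryValue", 100),
    ("ex:Ionosphere", "hasLowerBoundaryUnit", "km"),
}

# Merging graphs is a set union; the shared statement appears only once.
merged = source_a | source_b

def describe(graph, subject):
    """Collect property -> value for one subject (assumes single-valued properties)."""
    return {p: o for s, p, o in graph if s == subject}

print(len(merged))  # 4
print(describe(merged, "ex:Ionosphere")["hasLowerBoundaryUnit"])  # km
```

A program reading `merged` does not need to know which source contributed which statement.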
12 2. Drill Down / Focused Perusal
- The Semantic Web uses Uniform Resource Identifiers (URIs) to name things
- These can typically be resolved to get more information about the resource
- This essentially creates a web of data analogous to the web of text created by the World Wide Web
- Ontologies are represented using the same structure as content
- We can resolve class and property URIs to learn about the ontology
Figure: graph for drilling down from a resource. Node labels: NeutralTemperature, Norway, Internet, ...ISR, ...FPI, EISCAT, ...MillstoneHill. Property labels: locatedIn, measuredby, type, operatedby.
13 3. Statements about Statements
- The Semantic Web allows us to make statements about statements
- Timestamps
- Provenance / Lineage
- Authoritativeness / Probability / Uncertainty
- Security classification
- ...
- This is an unsung virtue of the Semantic Web
Figure: annotated statement. Labels: Danny's, Aurora, Red, 20031031. Property labels: hasSource, hascolor, hasDateTime.
Ontologies Workshop, APL May 26, 2006
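One way to picture statements about statements is to let a whole triple act as the subject of further triples. The sketch below is plain Python, not RDF reification syntax, and it assumes a reading of the figure in which the statement "Aurora hasColor Red" carries a source and a date; the exact edges in the original figure are not recoverable.

```python
# The base statement (assumed reading of the slide's figure).
statement = ("Aurora", "hasColor", "Red")

# Annotations whose subject is the statement itself.
annotations = {
    (statement, "hasSource", "Danny"),
    (statement, "hasDateTime", "20031031"),
}

def annotations_for(stmt):
    """Return property -> value for every annotation attached to `stmt`."""
    return {p: o for s, p, o in annotations if s == stmt}

meta = annotations_for(statement)
print(meta["hasSource"])    # Danny
print(meta["hasDateTime"])  # 20031031
```

The same pattern covers the other bullets above: a timestamp, a confidence value, or a security classification is just another annotation on the statement.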
14 4. Inference
- The formal foundations of the Semantic Web allow us to infer additional (implicit) statements that are not explicitly made
- Unambiguous semantics allow question answerers to infer that objects are the same, objects are related, objects have certain restrictions, ...
- SWRL allows us to make additional inferences beyond those provided by the ontology
Figure: instrument graph. Node labels: Interferometer, Millstone Hill, VerticalMeans. Property labels: OperatesInstrument, hasInstrument, isOperatedBy, Measures, hasOperatingMode, hasTypeofData, hasMeasuredData.
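As a sketch of inferring implicit statements, the code below derives the reverse statement whenever a property is declared to have an inverse. Treating `operatesInstrument` and `isOperatedBy` as inverses is an assumption made for illustration; the slide's figure names both properties but does not state how they are related.

```python
# Assumed ontology fragment: one property declared as the inverse of another.
INVERSE_OF = {"operatesInstrument": "isOperatedBy"}

# Explicitly asserted facts.
facts = {("MillstoneHill", "operatesInstrument", "Interferometer")}

def infer_inverses(facts, inverse_of):
    """Return the facts plus every statement implied by inverse properties."""
    # Make the inverse table symmetric so either direction can be derived.
    both = dict(inverse_of)
    both.update({v: k for k, v in inverse_of.items()})
    inferred = set(facts)
    for s, p, o in facts:
        if p in both:
            inferred.add((o, both[p], s))
    return inferred

closure = infer_inverses(facts, INVERSE_OF)
print(("Interferometer", "isOperatedBy", "MillstoneHill") in closure)  # True
```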
15 5. Translation
- While encouraging sharing, the Semantic Web allows multiple URIs to refer to the same thing
- There are multiple levels of mapping
- Classes
- Properties
- Instances
- Ontologies
- OWL supports equivalence and specialization; SWRL allows more complex mappings
Figure: two ontologies describing the same concept. Labels for ontology 1: ont1Precipitation, name "precipitation", ont1EduLevel, VOScientist. Labels for ontology 2: ont2Rain, name "precipitation", ont2EduLevel, EduVOK-12.
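A minimal sketch of class-level translation: an equivalence table rewrites terms from one ontology into the other before querying. That `ont2:Rain` is equivalent to `ont1:Precipitation` is an assumption based on the figure, where both carry the name "precipitation"; the observation identifiers are hypothetical.

```python
# Assumed equivalence mapping: foreign term -> canonical term.
EQUIVALENT = {"ont2:Rain": "ont1:Precipitation"}

def canonical(term):
    """Rewrite a term to its canonical equivalent, if one is declared."""
    return EQUIVALENT.get(term, term)

def translate(graph):
    """Apply the equivalence mapping to every position of every triple."""
    return {(canonical(s), canonical(p), canonical(o)) for s, p, o in graph}

data1 = {("obs1", "rdf:type", "ont1:Precipitation")}
data2 = {("obs2", "rdf:type", "ont2:Rain")}

merged = translate(data1 | data2)
instances = sorted(s for s, p, o in merged
                   if p == "rdf:type" and o == "ont1:Precipitation")
print(instances)  # ['obs1', 'obs2']
```

A query phrased in the first ontology's vocabulary now also finds data published with the second.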
16 6. Smart (Focused) Search
- The Semantic Web associates 1 or more classes with each object
- We can use ontologies to enhance search by
- Query expansion
- Sense disambiguation
- Type with restrictions
- ...
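Query expansion, the first technique in the list above, can be sketched as follows: a search for a class is widened to include all of its subclasses. The class hierarchy here is hypothetical and kept to single inheritance for brevity.

```python
# Hypothetical is-a hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "Interferometer": "OpticalInstrument",
    "FabryPerotInterferometer": "Interferometer",
    "OpticalInstrument": "Instrument",
}

def expand(term):
    """Return `term` together with all of its direct and indirect subclasses."""
    result = {term}
    changed = True
    while changed:  # repeat until no new subclass is found
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result

print(sorted(expand("OpticalInstrument")))
# ['FabryPerotInterferometer', 'Interferometer', 'OpticalInstrument']
```

A search for "OpticalInstrument" would then also match records tagged only with the narrower classes.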
17 7. Smarter Search / Configuration
18 GEONGRID Ontology Search and Data Integration Example
- Uses emerging web standards to enable smart web applications
- Given an upper-level domain choice
- Ecology
- Illustrate or list contained concepts/hierarchy
- VegetationCover, TreeRings, etc.
- Retrieve some specific options from web
- Maps, tree-ring data, ...
- Info: https://portal.geongrid.org:8443/gridsphere/gridsphere
21 8. Proof
- The logical foundations of the Semantic Web allow us to construct proofs that can be used to improve transparency, understanding, and trust
- Proof and Trust are on-going research areas for the Semantic Web, e.g., see PML and Inference Web
Figure: proof graph. Node labels: Critical Dataset, FlatField, Solar Physics Paper. Property labels: hasCalibration, hasPeerReview.
Critical Dataset has been calibrated with a flat field program that is published in the peer reviewed literature.
22 Inference Web
- Framework for explaining reasoning tasks by storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by multiple distributed reasoners.
- OWL-based Proof Markup Language (PML) specification as an interlingua for proof interchange
- IWExplainer for generating and presenting interactive explanations from PML proofs, providing multiple dialogues and abstraction options
- IWBrowser for displaying (distributed) PML proofs
- IWBase distributed repository of proof-related meta-data such as inference engines/rules/languages/sources
- Integrated with theorem provers, text analyzers, web services, ...
http://iw.stanford.edu
23 Inference Web Infrastructure (McGuinness, et al., 2004, http://www.ksl.stanford.edu/KSL_Abstracts/KSL-04-03.html)
- Framework for explaining question answering tasks by abstracting, storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by question answerers.
24 SW Questions & Answers
- Users can explore extracted entities and relationships, create new hypotheses, ask questions, browse answers and get explanations for answers.
Figure: screenshot of a question-answering interface showing a question, an answer, an abstracted explanation, and a context for explaining the answer.
(this graphical interface done by Battelle, supported by KSL)
25 Part 2 - starting to develop
- Ontologies
- Tools
- And more
26 Creating Ontologies - UML, CMAP
- New release of ODM/MOF
- Ontology Definition Metamodel/Meta Object Facility (OMG) for UML
- Provides standardized notation
- Available for a fee (!) from OMG
- Books likely to be available soon
- CMAP Ontology Editor (concept mapping tool from IHMC)
- Drag/drop visual development of classes, subclass (is-a) and property relationships
- Reads and writes OWL
- Formal convention (OWL/RDF tags, etc.)
- White board, text file
29 What about Earth Science?
- SWEET (Semantic Web for Earth and Environmental Terminology)
- http://sweet.jpl.nasa.gov
- based on GCMD terms
- modular using faceted and integrative concepts
- VSTO (Virtual Solar-Terrestrial Observatory)
- http://vsto.hao.ucar.edu
- captures observational data (from instruments)
- modular using domains
- MMI
- http://marinemetadata.org
- captures aspects of marine data, ocean observing systems
- partly modular, mostly by developed project
- GeoSciML
- http://www.opengis.net/GeoSciML/
- is a GML (Geography ML) application language for Geoscience
- modular, in packages
31 Developing ontologies
- Use cases and small team (7-8: 2-3 domain experts, 2 knowledge experts, 1 software engineer, 1 facilitator, 1 scribe)
- Identify classes and properties (leverage controlled vocab.)
- Start with narrower terms, generalize when needed or possible
- Data integration - often requires broader terms
- Adopt a suitable conceptual decomposition (e.g. SWEET)
- Import modules when concepts are orthogonal
- Minimal properties to start, add only when needed
- Mid-level to depth - i.e. neither top-down nor bottom-up
- Review, review, review, vet, vet, vet, publish - www.planetont.org (experiences, results, lessons learned, AND your ontologies AND discussions)
- Only code them (in RDF or OWL) when needed (CMAP, ...)
- Ontologies small and modular
32 Best practices
- Rapid prototyping using common tools
- Go Lite as much as possible, then DL, and only if you have to, Full - balancing expressibility vs. implementability
- There are also a number of core vocabularies to start with
- SKOS Core: about knowledge systems
- Dublin Core: about information resources, digital libraries, with extensions for rights, permissions, digital rights management
- FOAF: about people and their organizations
- DOAP: on the descriptions of software projects
- Ontologies/vocabularies must be shared and reused! swoogle.umbc.edu, www.planetont.org
33 Is OWL the only option?
- SKOS
- Atom
- CL - common logic
- Rabbit
- PENG and other NL forms
- Many similar options are proliferating, but
- The tools are behind
38 Editors
- Protégé (http://protege.stanford.edu)
- SWOOP (http://mindswap.org/2004/SWOOP, see also http://www.mindswap.org/downloads/)
- Altova SemanticWorks (http://www.altova.com/download/semanticworks/semantic_web_rdf_owl_editor.html)
- SWeDE (http://owl-eclipse.projects.semwebcentral.org/InstallSwede.html), goes with Eclipse
- Medius
- TopBraid Composer and other commercial tools
- CMAP Ontology Editor (COE) (http://cmap.ihmc.us/coe)
39 Triple Stores
- Jena (http://jena.sourceforge.net/)
- SeSAME/SAIL (http://www.openrdf.org/)
- KOWARI (http://www.kowari.org/) -> Mulgara (http://www.mulgara.org/)
- Redland (http://librdf.org/index.html)
- Oracle (!)
- Many others (relational, object-relational)
40 Software development tools
- Protégé, w/ plug-ins - some better than others
- SWOOP (OWL analyzer, partitioner)
- Jena (http://jena.sourceforge.net/)
- Eclipse (full integrated development environment for Java, http://www.eclipse.org/)
- Top Quadrant suite
- Sandsoft
- see Semantic Technologies 2007
41 Rules (aka Logic)
- OWL-DL and OWL-Lite are based on Description Logic
- There are things that DL cannot express (though there are things that are difficult to express with rules and easy in DL...)
- A well-known example is Horn rules (e.g., the uncle relationship): (P1 ∧ P2 ∧ ...) → C
- e.g. for any X, Y and Z: if Y is a parent of X, and Z is a brother of Y, then Z is the uncle of X
- Several attempts already to combine Semantic Web with Rules (Metalog, RuleML, SWRL, RIF, WRL, cwm, ...)
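The uncle rule above can be applied by hand in a few lines. This is a sketch of one forward-chaining step over plain tuples, not SWRL or RuleML syntax; the individuals and the property names `parentOf`, `brotherOf` and `uncleOf` are hypothetical.

```python
# Hypothetical facts: Bob is a parent of Alice, Carl is a brother of Bob.
facts = {
    ("Bob", "parentOf", "Alice"),
    ("Carl", "brotherOf", "Bob"),
}

def apply_uncle_rule(facts):
    """parentOf(Y, X) AND brotherOf(Z, Y) -> uncleOf(Z, X)."""
    derived = set()
    for y, p1, x in facts:
        if p1 != "parentOf":
            continue
        for z, p2, y2 in facts:
            # The brother's sibling must be the same Y as the parent.
            if p2 == "brotherOf" and y2 == y:
                derived.add((z, "uncleOf", x))
    return derived

print(apply_uncle_rule(facts))  # {('Carl', 'uncleOf', 'Alice')}
```

The rule joins two different properties through the shared variable Y, which is the kind of pattern the slide says Description Logic alone cannot express.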
42 Query
- Querying knowledge representations in OWL and/or RDF
- OWL-QL (for OWL) http://projects.semwebcentral.org/projects/owl-ql/
- SPARQL for RDF http://www.sparql.org/ and http://www.w3.org/TR/rdf-sparql-query/
- Now a W3C Candidate Recommendation (14 June 2007)!
- XQUERY (for XML)
- SeRQL (for SeSAME)
- RDFQuery (RDF)
- None as yet for natural language representations
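To give a feel for what an RDF query language does, the sketch below matches a list of triple patterns against a graph, binding variables written with a leading "?". It is a toy matcher in plain Python, not SPARQL and not any engine's implementation; the graph content and the property names `operates` and `measures` are hypothetical.

```python
def match(pattern, triple, bindings):
    """Try to extend `bindings` so that `pattern` matches `triple`; None on failure."""
    new = dict(bindings)
    for p, v in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return new

def query(graph, patterns):
    """Return every set of bindings that satisfies all patterns at once."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for bindings in solutions:
            for triple in graph:
                result = match(pattern, triple, bindings)
                if result is not None:
                    extended.append(result)
        solutions = extended
    return solutions

graph = {
    ("EISCAT", "operates", "ISR"),
    ("MillstoneHill", "operates", "FPI"),
    ("ISR", "measures", "NeutralTemperature"),
    ("FPI", "measures", "WindSpeed"),
}

# Which sites operate an instrument that measures neutral temperature?
rows = query(graph, [("?site", "operates", "?inst"),
                     ("?inst", "measures", "NeutralTemperature")])
print(sorted(r["?site"] for r in rows))  # ['EISCAT']
```

The shared variable `?inst` is what joins the two patterns, much as a shared variable joins triple patterns in a real query language.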
43 Reasoners (aka Inference engines)
- Pellet
- Racer (and Racer Pro)
- Medius KBS
- FACT
- fuzzyDL
- KAON2
- MSPASS
- QuOnto
- Jess (for Rules)
44 Collecting the data
- Part of the (meta)data information is present in tools ... but thrown away at output
- e.g., a business chart can be generated by a tool: it knows the structure, the classification, etc. of the chart, but, usually, this information is lost
- storing it in web data would be easy!
- SW-aware tools are around (even if you do not know it...), though more would be good
- Photoshop CS stores metadata in RDF in, say, jpg files (using XMP)
- RSS 1.0 feeds are generated by (almost) all blogging systems (a huge amount of RDF data!)
- Scraping - different tools, services, etc., come around every day
- get RDF data associated with images, for example
- service to get RDF from flickr images
- service to get RDF from XMP
- XSLT scripts to retrieve microformat data from XHTML files
- RSS scraping in use in VO projects in Japan
- scripts to convert spreadsheets to RDF
- SQL - A huge amount of data in Relational Databases
- Although tools exist, it is not feasible to convert that data into RDF
- Instead, SQL ↔ RDF bridges are being developed: a query to RDF data is transformed into SQL on-the-fly
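The bridge idea in the last bullet can be sketched with an in-memory SQLite database: the data stays relational, and a triple-style request is turned into SQL at query time. The table, its columns, the predicate names and the `triples` helper are all hypothetical; real bridges are considerably more general.

```python
import sqlite3

# Hypothetical relational data that is never converted to RDF.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instrument (id TEXT PRIMARY KEY, instrument_type TEXT, site TEXT)"
)
conn.executemany(
    "INSERT INTO instrument VALUES (?, ?, ?)",
    [("isr1", "ISR", "EISCAT"), ("fpi1", "FPI", "MillstoneHill")],
)

# Mapping from RDF-style predicates to table columns (an assumption of this sketch).
PREDICATE_TO_COLUMN = {"type": "instrument_type", "operatedBy": "site"}

def triples(conn, predicate, obj=None):
    """Answer the pattern (?s, predicate, obj) by generating SQL on the fly."""
    # Column names come only from the fixed mapping, never from user input.
    column = PREDICATE_TO_COLUMN[predicate]
    sql = f"SELECT id, {column} FROM instrument"
    params = ()
    if obj is not None:
        sql += f" WHERE {column} = ?"
        params = (obj,)
    sql += " ORDER BY id"
    return [(row[0], predicate, row[1]) for row in conn.execute(sql, params)]

print(triples(conn, "operatedBy", "EISCAT"))
# [('isr1', 'operatedBy', 'EISCAT')]
```

The caller sees triples, while the database only ever sees ordinary SQL.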
45 More Collecting
- RDFa (formerly known as RDF/A) extends XHTML by
- extending the link and meta elements to include child elements
- adding metadata to any element (a bit like the class in microformats, but via dedicated properties)
- It is very similar to microformats, but with more rigor
- it is a general framework (instead of an "agreement" on the meaning of, say, a class attribute value)
- terminologies can be mixed more easily
- GRDDL - Gleaning Resource Descriptions from Dialects of Languages
- ATOM (used with RSS)
46 Services
- Ontologies of services provide:
- What does the service provide for prospective clients? The answer to this question is given in the "profile," which is used to advertise the service. To capture this perspective, each instance of the class Service presents a ServiceProfile.
- How is it used? The answer to this question is given in the "process model." This perspective is captured by the ServiceModel class. Instances of the class Service use the property describedBy to refer to the service's ServiceModel.
- How does one interact with it? The answer to this question is given in the "grounding." A grounding provides the needed details about transport protocols. Instances of the class Service have a supports property referring to a ServiceGrounding.
47 Services, not standard
- Now 4 submissions to W3C
- OWL-S - http://www.w3.org/Submission/OWL-S
- SWSO/F/L - Semantic Web Services Ontology/Framework/Language - http://www.w3.org/Submission/SWSF/
- WSMO/X/L - Web Services Modeling Ontology/Execution/Language - http://www.w3.org/Submission/WSMX/, www.wsmo.org, www.wsmx.org
- SAWSDL - (WSDL-S)
48 ESIP to the rescue!
- Cluster activity is focusing on a services ontology for earth and space science (services), and
- A data-type ontology (needed for services and other uses)
- Projects are implementing (OWL-S) services anyway
- Annotation of services is what is needed
49 Explanation, Proof (path to Trust)
- Proof markup language (PML)
- an interlingua representation for justifications of results produced by Semantic Web services
- Not W3C, but no competition
- Implemented in InferenceWeb (http://iw.stanford.edu)
- CWM and N3 and theorem provers - not yet adapted to OWL-based languages
50 Ph.D. topics :-)
- Fuzzy logic
- look at alternatives of Description Logic based on fuzzy logic
- alternatively, extend RDF(S) with fuzzy notions
- Probabilistic statements
- have an OWL class membership with a specific probability
- combine reasoners with Bayesian networks
- Security, trust, provenance
- combining cryptographic techniques with the RDF model, sign a portion of the graph, etc.
- Ontology merging, alignment, term equivalences, versioning, development, ...
51 NASA TIWG Semantic Web Roadmap 1.0
- Improved Information Sharing
- Revolutionizing how science is done
- Acceleration of Knowledge Production
- Increased Collaboration Interdisciplinary Science
Results
Outcome
- Geospatial semantic services established
- Autonomous inference of science results
- Scientific semantic assisted services
- Geospatial semantic services proliferate
Output
- Some common vocabulary based product search and access
- Semantic agent-based integration
- Semantic agent-based searches
- Semantic geospatial search inference, access
Assisted Discovery Mediation
Capability
- Local processing data exchange
- Metadata-driven data fusion (semantic service chaining), trust
- Interoperable geospatial services (analysis as service), results explanation service
- Basic data tailoring services (data as service), verification/validation
Interoperable Information Infrastructure
- SWEET core 1.0 based on GCMD/CF
- SWEET core 2.0 based on best practices decided from community
- SWEET 3.0 with semantic callable interfaces via standard programming languages
- Reasoners able to utilize SWEET 4.0
Technology
Vocabulary
- RDF, OWL, OWL-S
- Geospatial reasoning, OWL-Time
- Numerical reasoning
- Scientific reasoning
Languages/Reasoning
Time frames: Current; Near Term (0-2 yrs); Mid Term (2-5 yrs); Long Term (5 yrs)
52 Semantic Web Roadmap (expanded capability)
Capability
- Some common vocabulary based product search and access
- Semantic geospatial search inference, access
- Semantic agent-based integration
- Semantic agent-based searches
Assisted Discovery Mediation
- Some metadata and limited provenance available
- Ontologies for data mining, visualization and analysis emerging/maturing
Assisted Knowledge Building
- Common terminology captured in ontologies, crossing domains
- Provenance/annotation with ontologies in user tools
- Domain and range properties in ontologies used in tools
- Service ontologies carry quality provenance
Verifiable Information Quality
- Verification is manual with minimal tool support
- Ontologies for information quality developed
- Services must be hardwired and service agreements established
Responsive Information Delivery
- Dynamic service discovery and mediation, and data scheduling
- Semantic markup of data latency (time lags) which adapt dynamically
- Services annotated with resource descriptions
- Interoperable geospatial services (analysis as service), results explanation service
- Metadata-driven data fusion (semantic service chaining), trust
- Basic data tailoring services (data as service), verification/validation
Interoperable Information services
- Local processing data exchange
- Limited metadata passed to analysis applications
Interactive Data Analysis
- Shared terminology for the visual properties of interface objects and graph types...
- Tag properties, non-jargon vocabulary for non-specialist use
- Semantic fields to describe tag key modal functions.
- Access mediated by agreed standard vocabularies, hard-wired connections
- Key data access services are semantically mediated
- Access mediated by common ontologies
- Mediation aided by services with domain/range properties
Seamless Data Access
Time frames: Current; Near Term (0-2 years); Mid Term (2-5 years); Long Term (5 years)
53 Semantic Web Roadmap Details
- Competing catalog schemas
- Common semantic service catalog established
- Enhanced semantic search into search engines
- Automatic knowledge discovery and mining
Discovery
- Semantic service chaining, SWSL
- Intelligent algorithm programming chaining
- Semantic framework for Web Services, WSMO
- Standard workflow language (BPEL)
Workflow
- Built into code logic and in the head of the user
- Basic semantics (DL, FOL)
- High degree of semantic understanding
- Intelligent message routing (SOL)
Inference
- SWEET Core 1.0 VSTO, MMI, others
- SWEET core 2.0 domain and math plug-in
- SWEET 3.0 science applications plug-in
Earth Science Standards
- GCMD, CF, ESML, GML, etc.
Languages
- PML
- XML, RDF
- OWL-DL, OWL-Full WSML
- OWL-S, SWRL
Time frames: Current; Near Term; Mid Term; Long Term
Layers: Proof/Trust; Syntax; Explanation/Rules; Semantics
54 Roadmap ongoing Steps
- Baseline metrics for results evaluation
- Document the Current languages/vocabularies, capabilities, and outcomes as baseline for evaluating near-term, mid-term and long-term progress in the roadmap
- Recommendations
- Actions that NASA can take to help achieve milestones (to ESTO and NASA HQ)
- Program and process steps - priorities and timeliness
- Start the infusion process
- Identify technology candidates that could be submitted to the DSWG standards process to accelerate infusion
- Identify technology champions (groups and individuals) to shepherd technologies through the process, e.g. ESIP semantic web cluster
55 Tutorial Summary
- Semantics are in use in a variety of fields
- Substantial RDF and OWL encodings of knowledge, options for representation are increasing
- Standards are in place in key areas, some not quite
- Tools are in reasonable shape, no killer-tool
- Best practices DO exist, even in Earth Sciences
- PARTNER with someone already familiar
- A little semantics goes a long way
56 How to get involved / learn more
- ESIP Semantic Web Cluster
- http://wiki.esipfed.org/index.php/Semantic_Web
- esip-semanticweb_at_rtpnet.org, subscribe at http://rtpnet.org/mailman/listinfo/esip-semanticweb
- Telecon schedule - monthly - 11am (PT), noon (MT), 1pm (CT), 2pm (ET) - 2nd Tuesday starting in January
- Meeting 7:30pm, Tuesday Jul. 17th
- Semantic Web Plenary and demos (Wed)
- NASA/DSWG/TIWG Semantic Web sub-group
- Semantic Web sub-group meets by telecon, 4th Thursday 4pm ET/1pm PT as part of TIWG (meets Jul. 18th 6pm)
- http://teambps.mywsssite.com/seeds/wg/infusion/default.aspx
- ISWC 2007, CIKM 2007, SemTech 2008, IEEE ICSC 2007, KDD 2007, AAAI/IAAI 2007