Title: Semantic Web Applications
1. Semantic Web Applications
2. Evaluation RDF Assignment
- Goal: get some experience in working with RDF (and related technologies)
- Deadline problems: 58 of the submissions in the last hour before the deadline
- Some people reported tool problems
- Most people performed quite well on the assignments
3. RDF Assignment: Common Problems
Proper Subclassing
Versus
4. Recap: Why RDF?
- Consider a typical web page
- Markup consists of
  - Rendering information (e.g., font size and color)
  - Hyperlinks to related content
- Semantic content is accessible to humans but not (easily) to computers
5. What should the schema capture?
Content Information
- Author
  - First Name: Maurits Cornelis
  - Last Name: Escher
- CreationDate: 1961
- Title: Waterfall
- Method: Lithography
- Subject
  - Type: Optical Illusion
  - Type: Abstract
  - Depicts: Waterfall
  - Depicts: Watermill
Technical Information
- Dimension
  - Width: 665px
  - Height: 850px
- Type: JPEG
- FileSize: 148 KB
Personal Information
- User
  - Name: Kees
- Rating: 9.1
- Comment: Great Optical Illusion!
Personal Information
- User
  - Name: Bart
- Rating: 7.3
- Comment: Not his Best Work
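The annotation tree above maps naturally onto subject-predicate-object triples. The following is a minimal sketch in plain Python; the `ex:` prefix, the resource names and the property names are assumptions derived from the slide labels, not an existing vocabulary.

```python
# Each triple is (subject, predicate, object).
# All "ex:" identifiers are illustrative assumptions, not a real vocabulary.
triples = {
    ("ex:waterfall.jpg", "ex:author", "ex:escher"),
    ("ex:escher", "ex:firstName", "Maurits Cornelis"),
    ("ex:escher", "ex:lastName", "Escher"),
    ("ex:waterfall.jpg", "ex:creationDate", "1961"),
    ("ex:waterfall.jpg", "ex:title", "Waterfall"),
    ("ex:waterfall.jpg", "ex:depicts", "Waterfall"),
    ("ex:waterfall.jpg", "ex:depicts", "Watermill"),
    ("ex:waterfall.jpg", "ex:fileType", "JPEG"),
}


def objects(subject, predicate):
    """Return all objects for a given subject and predicate, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


print(objects("ex:waterfall.jpg", "ex:depicts"))
```

A property that occurs twice (such as Depicts) simply becomes two triples with the same subject and predicate, so no list structure is needed.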
/ name of department
PAGE 4
24-11-2009
6. What does RDF give us?
- A mechanism for annotating data and resources
  - Single (simple) data model
  - Syntactic consistency between names (URIs)
- Data model for the Web
  - Openness, flexibility (use arbitrary properties)
  - Resource-centered (triple-based)
- Data model is easy to understand and manipulate
- RDF graphs can simply be merged (RDF merge is a monotonic operation!)
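To illustrate why merging is monotonic, graphs can be treated as sets of triples, so that a merge is a set union. This is only a sketch: the `ex:` identifiers are assumptions, and blank nodes are ignored, although a real RDF merge has to keep the blank nodes of different graphs apart.

```python
# Two independently produced graphs about the same resource.
# All "ex:" identifiers are illustrative assumptions.
content_graph = {
    ("ex:waterfall.jpg", "ex:title", "Waterfall"),
    ("ex:waterfall.jpg", "ex:creationDate", "1961"),
}
rating_graph = {
    ("ex:waterfall.jpg", "ex:rating", "9.1"),
    ("ex:waterfall.jpg", "ex:title", "Waterfall"),  # duplicate statement
}

# Merging is a set union: duplicates collapse, nothing is retracted.
merged = content_graph | rating_graph

# Monotonic: every statement of each input is still present in the result.
assert content_graph <= merged and rating_graph <= merged
print(len(merged))
```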
7. Subjects
- Examples of Semantic Web applications
- CHIP
- iFanzy
- RHCe
- Semantic Web Challenge
- Linked Data
- Legacy Applications
- Evolution
- RDFa
- Current Research Topics
8. CHIP: Rijksmuseum Amsterdam, providing semantic browsing, searching and semantic recommendations
Online: http://www.rijksmuseum.nl
Inside museum
7000 artworks
50000 artworks
9. Personalized Art Experience
Personalized Web Site
Personalized Museum Tour
Personalized Tour on a Mobile Device
10. Approach
- Making museum metadata available in RDF/OWL
- Making relevant vocabularies available in RDF/OWL
- Aligning and enriching vocabularies/metadata
- Using the resulting RDF/OWL graph for building a combined (virtual and physical) user model
- Using the above results for (semi-)automatic generation of virtual and physical museum tours
11. Artist: Rembrandt van Rijn
12. Style: Baroque
13. Location: Amsterdam
14. [Figure labels: teacher of Nicolaes Maes; teacher of Ferdinand Bol; teacher of Gerrit Dou; militia; self-portrait; style Baroque; place Amsterdam, 1625 to 1650]
15. Semantic Recommendation
16. Help Needed!
- To collect users' feedback on the effectiveness of the recommendation strategy, we invite you to participate in our online user study
- http://www.chip-project.org/demoUserStudy3
17. iFanzy: Personalized TV-guide
- http://www.nu.nl/tvgids/
  - Focus on the EPG view
  - Long list of channels
  - No personalization / adaptation / search function
- http://www.tvgids.nl/
  - Focus on program search
  - Limited personalization / adaptation
  - Hard to scale interface
18. Goal
- Personalized Web-based browser for digital television content
- Harvest program information from different sources
- Use the application on several types of machines and platforms
- Give the user control over a large source of data
19. Integration!
- Integration of program data
  - Crawling the Web to bring different sources of program information together
  - Searching for background knowledge to uncover extra connections between items
- Integration of user data
  - Looking at similar users to form different groups
  - Looking for different existing user profiles and integrating them to get a richer user model
- Integration of platform interaction
  - Different platforms show different user behavior
20. Content Integration
- RDFized IMDB dataset (multi-million triples, 13GB of data)
- Retrieved URLs of photos and trailers from the Web
- Connected to Time, GEO and TVA-genre ontologies for reasoning purposes
- Relate IMDB with EPG data by movie title in combination with director and actor names
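Relating an EPG entry to an IMDB entry by title, director and actor names could look roughly like the sketch below. The record layout (`title`, `director`, `actors`) and the threshold parameter are assumptions for illustration, and the sketch assumes both titles are already in the same language.

```python
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation for tolerant comparison."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = "".join(c if c.isalnum() else " " for c in ascii_only.lower())
    return " ".join(cleaned.split())


def same_movie(epg, imdb, min_shared_actors=1):
    """Match on title, then confirm with director and shared actor names."""
    if normalize(epg["title"]) != normalize(imdb["title"]):
        return False
    if normalize(epg["director"]) != normalize(imdb["director"]):
        return False
    epg_actors = {normalize(a) for a in epg["actors"]}
    imdb_actors = {normalize(a) for a in imdb["actors"]}
    return len(epg_actors & imdb_actors) >= min_shared_actors


epg_entry = {
    "title": "The Good, the Bad and the Ugly",
    "director": "Sergio Leone",
    "actors": ["Clint Eastwood"],
}
imdb_entry = {
    "title": "The Good, The Bad And The Ugly",
    "director": "Sergio Leone",
    "actors": ["Clint Eastwood", "Eli Wallach", "Lee Van Cleef"],
}
print(same_movie(epg_entry, imdb_entry))
```

Requiring the director and at least one actor to agree keeps remakes and unrelated films with the same title from being linked to each other.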
21. Converting TV Metadata in RDF/OWL
Input source 1
    <program title="Match of the Day">
      <channel>BBC One</channel>
      <start>2008-03-09T19:45:00Z</start>
      <duration>PT01H15M00S</duration>
      <genre>sport</genre>
    </program>
Input source 2
    <program channel="NED1">
      <source>http://foo.bar/</source>
      <title>Sportjournaal</title>
      <start>20080309184500</start>
      <end>20080309190000</end>
      <genre>sport nieuws</genre>
    </program>
Translation to TV-Anytime in RDF/OWL
    <TVA:ProgramInformation ID="crid://foo.bar/0001">
      <hasTitle>Sportjournaal</hasTitle>
      <hasGenre rdf:resource="TVAGenres:3.1.1.9"/>
    </TVA:ProgramInformation>
    <TVA:Schedule ID="TVA:Schedule_0001">
      <serviceIDRef>NED1</serviceIDRef>
      <hasProgram crid="crid://foo.bar/0001"/>
      <startTime rdf:resource="TIME:TimeDesc_0001"/>
    </TVA:Schedule>
    <TIME:TimeDescription ID="TIME:TimeDesc_0001">
      <year>2008</year>
      <month>3</month>
      <day>9</day>
      <hour>18</hour>
      <minute>45</minute>
      <second>0</second>
    </TIME:TimeDescription>
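The translation step for input source 2 can be sketched as follows, using only the Python standard library. The function name, the plain-tuple triple representation and the way identifiers are passed in are assumptions; the genre mapping is left out because it depends on the vocabulary alignment.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

SOURCE_2 = """
<program channel="NED1">
  <source>http://foo.bar/</source>
  <title>Sportjournaal</title>
  <start>20080309184500</start>
  <end>20080309190000</end>
  <genre>sport nieuws</genre>
</program>
"""


def to_triples(xml_text, crid, schedule_id, time_id):
    """Map one <program> element to (subject, predicate, object) triples."""
    program = ET.fromstring(xml_text)
    start = datetime.strptime(program.findtext("start"), "%Y%m%d%H%M%S")
    triples = [
        (crid, "hasTitle", program.findtext("title")),
        (schedule_id, "serviceIDRef", program.get("channel")),
        (schedule_id, "hasProgram", crid),
        (schedule_id, "startTime", time_id),
    ]
    # Split the timestamp into the separate fields of the time description.
    for field in ("year", "month", "day", "hour", "minute", "second"):
        triples.append((time_id, field, getattr(start, field)))
    return triples


result = to_triples(SOURCE_2, "crid://foo.bar/0001",
                    "TVA:Schedule_0001", "TIME:TimeDesc_0001")
for triple in result:
    print(triple)
```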
22. Converting Vocabularies in RDF/OWL
    <Term termID="3.1">
      <Name xml:lang="en">NON-FICTION/INFORMATION</Name>
      <Term termID="3.1.1">
        <Name xml:lang="en">News</Name>
        <Term termID="3.1.1.9">
          <Name xml:lang="en">Sport News</Name>
          <Definition xml:lang="en">News of sports</Definition>
        </Term>
      </Term>
    </Term>
    <Term termID="3.2">
      <Name xml:lang="en">SPORTS</Name>
      <Term termID="3.2.1">
        <Name xml:lang="en">Athletics</Name>
        <Term termID="3.2.1.1"> </Term>
      </Term>
    </Term>
Translation of TV-Anytime genres to RDF/OWL using SKOS
    <TVAGenres:genre ID="TVAGenres:3.1.1.9">
      <rdfs:label>Sport News</rdfs:label>
      <skos:broader rdf:resource="TVAGenres:3.1.1"/>
      <skos:related rdf:resource="TVAGenres:3.2"/>
    </TVAGenres:genre>
    <TVAGenres:genre ID="TVAGenres:3.2">
      <rdfs:label>Sport</rdfs:label>
      <skos:related rdf:resource="TVAGenres:3.1.1.9"/>
    </TVAGenres:genre>
    <TVAGenres:genre ID="TVAGenres:3.1.1">
      <rdfs:label>News</rdfs:label>
      <skos:narrower rdf:resource="TVAGenres:3.1.1.9"/>
      <skos:broader rdf:resource="TVAGenres:3.1"/>
    </TVAGenres:genre>
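The hierarchical part of this translation follows directly from the nesting of the Term elements. A minimal sketch, assuming triples are represented as plain tuples and omitting the `xml:lang` attributes for brevity; the `skos:related` links are not derivable from the nesting and would come from a separate enrichment step.

```python
import xml.etree.ElementTree as ET

TERMS = """
<Term termID="3.1">
  <Name>NON-FICTION/INFORMATION</Name>
  <Term termID="3.1.1">
    <Name>News</Name>
    <Term termID="3.1.1.9"><Name>Sport News</Name></Term>
  </Term>
</Term>
"""


def skos_triples(term, parent_id=None):
    """Walk nested <Term> elements and emit label/broader/narrower triples."""
    term_id = "TVAGenres:" + term.get("termID")
    triples = [(term_id, "rdfs:label", term.findtext("Name"))]
    if parent_id is not None:
        triples.append((term_id, "skos:broader", parent_id))
        triples.append((parent_id, "skos:narrower", term_id))
    for child in term.findall("Term"):  # direct children only
        triples.extend(skos_triples(child, term_id))
    return triples


genre_triples = skos_triples(ET.fromstring(TERMS))
for triple in genre_triples:
    print(triple)
```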
23. Aligning and Enriching Vocabularies
- Alignment of Genre vocabularies
  - XMLTV:documentaire -> TVA:Documentary
  - IMDB:Thriller -> TVA:Thriller
  - IMDB:Sci-Fi -> TVA:Science Fiction
- Semantic enrichment of Genre vocabulary
  - News -skos:narrower-> Sport News (source: original term hierarchy)
  - Sport News -skos:related-> Sport (source: partial label matches)
  - Skating -skos:related-> Ice skating (source: partial label matches)
  - American Football -skos:related-> Rugby (source: domain expert)
- Semantic enrichment of TV metadata with IMDB movie descriptions
  - Buono, il brutto, il cattivo, Il (1966) -> The Good, the Bad and the Ugly
- Alignment of date/time descriptions to Time ontology concepts to allow temporal reasoning
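One way to find candidate skos:related links through partial label matches is to check whether all words of one label also occur in another. This is a sketch under that assumption; the slides do not specify how the matching was actually done.

```python
def partial_label_matches(labels):
    """Return (a, b) pairs where every word of label a also occurs in label b."""
    pairs = []
    for a in labels:
        for b in labels:
            words_a = set(a.lower().split())
            words_b = set(b.lower().split())
            if a != b and words_a < words_b:  # proper subset of words
                pairs.append((a, b))
    return pairs


labels = ["Skating", "Ice skating", "Sport", "Sport News", "Rugby"]
print(partial_label_matches(labels))
```

Pairs such as American Football and Rugby share no words, which is why they need a domain expert rather than label matching.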
24. Semantic Graph for Recommendations
- Generating recommendations based on usage data and the RDF/OWL graph, behavior analysis
- Query expansion on search terms and UM (user model) values
  - WordNet synonyms for search terms
  - skos:narrower/related relationships
- When asking for a recommendation, empty search fields like <genres> and <terms> are filled in by user preferences
- When requested, only specific contexts are considered. Context includes
  - Time contexts, e.g. preferences in morning, evening
  - Audience, e.g. preferences for groups
  - Location, e.g. preferences can differ per location
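Query expansion over skos:narrower and skos:related links, with empty search fields filled from user preferences, could be sketched as below. The toy genre graph and the function signature are assumptions, and the WordNet synonym step is omitted.

```python
# Toy genre graph; the links are illustrative assumptions.
NARROWER = {"News": ["Sport News"], "Sport": ["Athletics"]}
RELATED = {"Sport News": ["Sport"], "Sport": ["Sport News"]}


def expand(terms, user_preferences=()):
    """Expand search terms one step along skos:narrower and skos:related.

    If no terms are given, fall back to the user's preferred genres.
    """
    seeds = list(terms) or list(user_preferences)
    expanded = []
    for term in seeds:
        candidates = [term] + NARROWER.get(term, []) + RELATED.get(term, [])
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand(["Sport"]))
print(expand([], user_preferences=["News"]))
```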
25. iFanzy online: www.iFanzy.nl
26. Personal TV-guide
27. iFanzy STB
- STB (set-top box) interface
  - Built to fit on a television screen
  - Different layout, same server
  - Works with a VOD source
28. RHCe
29. RHCe
- Regional Historic Center Eindhoven
  - Governs all historic material related to the region of Eindhoven
  - Task: govern historic material and provide citizens access to the archives (including promoting them)
- Heterogeneous datasets
  - Videos, pictures, drawings, postcards, ownership records, birth, marriage and death records, maps, aerial pictures, meeting minutes, financial records, etc.
30. CHI Browser
- RHCe portal that provides navigation and personalization over the archives to the user
- Navigation structure
  - Objects in general are connected by shared dimensions: description keywords, time and location
  - Using these facets allows both searching and browsing the collections and connecting similar objects (over these dimensions)
  - Built specialized browsing paradigms for these dimensions
33. Tagging
- What?
  - Assigning keywords (or short phrases) to resources
  - Tags can be used like text in textual documents during indexing for retrieval
- Why?
  - It might be a bit complex for end users to edit an RDF graph
  - Tagging is a simpler mechanism that users are already used to
  - No complexity: users can make up their own words
34. Matching Component
- Desire
  - Benefit from the advantages of the simple tagging mechanism, while also benefiting from the richer structure of the Semantic Web
- Purpose of the Matching Component
  - Relating tags to ontological concepts
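35. Matching Component
[Figure: tags such as "Sprot" and "Swimming" pass through the Matching Component, which returns scored concepts. Labels shown: ns:Exercise, 0.85; ns:Place, 0.75; ns:Sport, 0.9; ns:Spot, 0.8; ns:Sport, 0.05]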
36. String Matching
- Pattern matching
  - Exact or substring
  - "day" matches "Friday"
- Levenshtein
  - Minimum number of edits to transform one word into another
  - Hockie -> Hocke -> Hockey (distance 2)
- Jaro-Winkler
  - Compare number of similar characters on similar positions
  - Hockie vs Hockey (4 characters match in place, "e" matches one position off)
- Soundex
  - Use phonetics to compare the sound of words
  - Hockie (H200) vs Hockey (H200)
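Two of these measures are short enough to sketch in full. The code below is a textbook-style implementation of Levenshtein distance and the classic four-character Soundex code, written for illustration; it is not taken from the matching component itself.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


SOUNDEX_CODES = {
    letter: digit
    for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                           "4": "l", "5": "mn", "6": "r"}.items()
    for letter in letters
}


def soundex(word):
    """Classic Soundex: first letter followed by three digits."""
    word = word.lower()
    code = word[0].upper()
    last = SOUNDEX_CODES.get(word[0], "")
    for ch in word[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != last:
            code += digit
        if ch not in "hw":  # h and w do not separate equal codes
            last = digit
    return (code + "000")[:4]


print(levenshtein("Hockie", "Hockey"))
print(soundex("Hockie"), soundex("Hockey"))
```

Levenshtein tolerates typing errors, while Soundex tolerates spelling variants that sound alike; a matching component would typically combine several such scores.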
37. Semantic Broadening
- Using the structure of the ontology to expand concepts by following properties
- E.g. using rdfs:subClassOf or skos:narrower
[Figure labels: Sport, Hockey, URI_1, URI_2]
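Following hierarchy properties upward from a concept can be sketched as a breadth-first walk. The toy hierarchy, the `ex:` names and the `max_steps` parameter are assumptions for illustration.

```python
# Toy hierarchy: concept -> broader concepts (rdfs:subClassOf or skos:broader).
# The concept names are illustrative assumptions.
BROADER = {
    "ex:Hockey": ["ex:Sport"],
    "ex:Tennis": ["ex:Sport"],
    "ex:Sport": ["ex:Activity"],
}


def broaden(concept, max_steps=None):
    """Collect all concepts reachable by following broader links upward."""
    found, frontier, steps = [], [concept], 0
    while frontier and (max_steps is None or steps < max_steps):
        next_frontier = []
        for node in frontier:
            for parent in BROADER.get(node, []):
                if parent not in found:
                    found.append(parent)
                    next_frontier.append(parent)
        frontier, steps = next_frontier, steps + 1
    return found


print(broaden("ex:Hockey"))
print(broaden("ex:Hockey", max_steps=1))
```

Limiting the number of steps keeps the expansion from drifting to concepts that are too general to be useful.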
38. Semantic Broadening (2)
- Or, more complicated, by using a query
[Figure labels: Hockey, Sport, Tennis]
39. Context Disambiguation
- Context often allows determining the right concept for an ambiguous word (especially names)
- Take several input tags
- Use the context of those input tags for disambiguation
- E.g. Bill + President versus Bill + Microsoft
- Configure concept-distance
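A simple form of this disambiguation scores each candidate concept by how much its ontology neighbourhood overlaps with the concepts suggested by the other tags. The candidate table, the `ex:` names and the overlap score are assumptions; a configurable concept distance would replace the fixed one-step neighbourhoods used here.

```python
# Candidate concepts per tag, each with a few neighbouring concepts
# in the ontology. All names are illustrative assumptions.
CANDIDATES = {
    "Bill": {
        "ex:Bill_Clinton": {"ex:President", "ex:USA"},
        "ex:Bill_Gates": {"ex:Microsoft", "ex:Software"},
    },
    "President": {"ex:President": {"ex:USA", "ex:Bill_Clinton"}},
    "Microsoft": {"ex:Microsoft": {"ex:Software", "ex:Bill_Gates"}},
}


def disambiguate(tag, context_tags):
    """Pick the candidate whose neighbourhood overlaps most with the context."""
    context = set()
    for other in context_tags:
        for concept, neighbours in CANDIDATES.get(other, {}).items():
            context |= {concept} | neighbours
    scores = {
        concept: len(({concept} | neighbours) & context)
        for concept, neighbours in CANDIDATES[tag].items()
    }
    return max(scores, key=scores.get)


print(disambiguate("Bill", ["President"]))
print(disambiguate("Bill", ["Microsoft"]))
```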
40. Semantic Web Challenge
- New technologies are only viable for mass adoption if a critical mass of applications exists for them.
- The Semantic Web Challenge aims to find new innovative applications that are based on Semantic Web technology
- Two tracks
  - Open Track: for all applications that are somehow based on Semantic Web technology
  - Billion Triple Track: requires the participants to make use of the data set (a billion triples) provided by the organizers
- CHIP and iFanzy were top applications in last year's challenge
41. Semantic Web Challenge - Examples
- Paggr
  - Build widgets over annotated Web pages using SPARQL
- DBpediaMobile
- Revyu
  - Review anything on the Web
- Semaplorer
  - Interactively explore and visualize large, semantically heterogeneous, distributed semantic data sets in real time
42. Linked Data
- "The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data." (Tim Berners-Lee)
- Additional rules for Semantic Data so that we can build a Web of Data, i.e. as an addition to the existing Web of hypertext documents
- Four principles of Linked Data
  - Use URIs as names for things
  - Use HTTP URIs so that people can look up those names
  - When someone looks up a URI, provide useful information
  - Include links to other URIs, so that they can discover more things
43. Linked Data
44. Linked Data - Advantages
- Separation of concerns: in current Web pages the semantics are not separated from presentation
  - Using linked data, the same content URI can be rendered in different ways (e.g. localization, device dependency, etc.)
- Mashups of data from multiple sources
  - such as in maps, timelines, etc.
  - Define views using (SPARQL) queries
- Reuse!
  - No longer create complete domain schemas, but only the parts that are interesting for you
45. Combining Datasets
- IMDB-Cinema-Restaurant scenario
  - Find a cinema that shows a film by Guillermo del Toro for which there is a French restaurant within one kilometer
- Airline-Hotel-Car Rental scenario
  - Find the trip to South America for two weeks in July that includes a flight, a hotel room for the whole period and a rental car for the whole period, with the lowest combined price
46. Legacy Data
- A lot of useful data already exists on the Web in non-RDF form
- RDF is a flexible data model: most other data models can be converted into RDF
- Many RDF wrappers exist
  - Babel, ConverterToRdf, GRDDL, RDFizers, Triplr, etc.
  - Also for relational databases
- Many sources are becoming available
  - DBpedia: Linked Data version of Wikipedia
  - US Census: RDF version of the 2000 US census data
  - LinkedMDB: RDF version of IMDB
47. Why convert to RDF?
- Query information on Web pages
  - i.e. beyond Google keyword matches
- For example, on DBpedia you can now query for
  - All sitcoms that are set in NYC?
  - All tennis players from Moscow?
  - All films by Quentin Tarantino?
  - All German musicians that were born in Berlin in the 19th century?
  - All soccer players with shirt number 11, playing for a club having a stadium with over 40,000 seats, who were born in a country with over 10 million inhabitants?
48. Browsing Linked Data
- RDF visualization is still a research issue
  - However, specialized visualizations exist (like map and timeline visualizations)
  - A mechanism is needed to combine RDF with stylesheets for presentation purposes
- Some general-purpose browsers exist
  - Tabulator Browser
  - DISCO Hyperdata Browser
  - OpenLink RDF Browser
  - Rhodonite RDF-editor and browser
50. RDFa: Integration of RDF in Web pages
- Vision: close the chasm between the human and data Webs
- Enhancing current Web (XHTML) documents with embedded semantics
  - Explain the semantics of pieces of content (e.g. dates)
- Provides a set of attributes to carry metadata in XML tags
- One-to-one mapping with RDF
51. RDFa
- about
  - a URI specifying the resource the metadata is about
- rel and rev
  - specifying a relationship or reverse relationship with another resource
- href, src and resource
  - specifying the partner resource
- property
  - specifying a property for the content of an element
- content
  - overrides the content of the element when using the property attribute
- datatype
  - specifies the datatype of text specified for use with the property attribute
- typeof
  - specifies the RDF type(s) of the subject
52. RDFa Example
Plain XHTML:
    <div>
      <h2>The Trouble with Bob</h2>
      <h3>Alice</h3>
    </div>
With RDFa attributes:
    <div xmlns:dc="http://purl.org/dc/elements/1.1/">
      <h2 property="dc:title">The Trouble with Bob</h2>
      <h3 property="dc:creator">Alice</h3>
    </div>
Source: http://ben.adida.net/presentations/www2008-rdfa/ (27)
53. RDFa Example
Plain date:
    <div xmlns:dc="http://purl.org/dc/elements/1.1/">
      <h2 property="dc:title">The Trouble with Bob</h2>
      <h3 property="dc:creator">Alice</h3>
      <em>April 21st, 2008</em>
    </div>
With the date annotated:
    <div xmlns:dc="http://purl.org/dc/elements/1.1/">
      <h2 property="dc:title">The Trouble with Bob</h2>
      <h3 property="dc:creator">Alice</h3>
      <em property="dc:date" datatype="xsd:date" content="2008-04-21">April 21st, 2008</em>
    </div>
Source: http://ben.adida.net/presentations/www2008-rdfa/ (30)
54. RDFa Example
    <div about="/alice/posts/trouble_with_bob" ...>
      <h2 property="dc:title">The Trouble with Bob</h2>
      <h3 property="dc:creator">Alice</h3>
    </div>
    ...
    <div about="/alice/posts/jos_barbecue" ...>
      <h2 property="dc:title">Jo's Barbecue</h2>
      <h3 property="dc:creator">Eve</h3>
    </div>
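To show how such markup yields triples, the sketch below extracts (subject, property, value) tuples from the `about`, `property` and `content` attributes using Python's standard `html.parser`. It is a deliberately small illustration, not a conforming RDFa processor: the class name is an assumption, prefixes such as `dc:` are not resolved to full URIs, and `rel`, `rev`, `typeof` and `datatype` are ignored.

```python
from html.parser import HTMLParser


class RDFaExtractor(HTMLParser):
    """Collect (subject, property, value) triples from about/property/content."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subjects = [""]   # stack of current subjects; "" means the page
        self.open_tags = []    # remembers whether each open tag set a subject
        self.pending = None    # property waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        pushed = "about" in attrs
        if pushed:
            self.subjects.append(attrs["about"])
        self.open_tags.append(pushed)
        if "property" in attrs:
            if "content" in attrs:  # content overrides the element text
                self.triples.append(
                    (self.subjects[-1], attrs["property"], attrs["content"]))
            else:
                self.pending = attrs["property"]

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subjects[-1], self.pending, data.strip()))
            self.pending = None

    def handle_endtag(self, tag):
        # Assumes well-formed XHTML, so tags close in the order they opened.
        if self.open_tags and self.open_tags.pop():
            self.subjects.pop()
        self.pending = None


PAGE = """
<div about="/alice/posts/trouble_with_bob">
  <h2 property="dc:title">The Trouble with Bob</h2>
  <h3 property="dc:creator">Alice</h3>
</div>
<div about="/alice/posts/jos_barbecue">
  <h2 property="dc:title">Jo's Barbecue</h2>
  <h3 property="dc:creator">Eve</h3>
</div>
"""

parser = RDFaExtractor()
parser.feed(PAGE)
parser.close()
for triple in parser.triples:
    print(triple)
```

The `about` attribute is what keeps the two posts apart: without it, all four properties would be attached to the page itself.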
Source: http://ben.adida.net/presentations/www2008-rdfa/ (33)
55. Semantic Web Conclusion
- Machines will never understand content
- The computer doesn't truly "understand" any of
this information, but it can now manipulate the
terms much more effectively in ways that are
useful and meaningful to the human user.
56. Some Current Research Issues
- Web as a large-scale database (scalability)
- Data/service synchronization and integration (as you go)
- Planning (including service and data discovery)
- Ensuring security, protecting against semantic spam