Creating and Sharing Structured Semantic Web Contents through the Social Web

About This Presentation

Description:

Users can also define their own concept groups. Grouped islands of concepts are connected by hierarchical relations from WordNet, e.g. Hotel, guest house, ...

Slides: 66

Provided by: aman69

Transcript and Presenter's Notes

1
Creating and Sharing Structured Semantic Web Contents through the Social Web

(Main Evaluation)
Aman Shakya
Advisor: Prof. Hideaki Takeda
Sub-advisors: Assoc. Prof. Nigel Collier, Assoc. Prof. Kenro Aihara

2
Outline

Introduction
Social Semantic Web
State-of-art and Problems
Proposed approach
The StYLiD system
Concept consolidation
Concept grouping
Evaluation
Practical applications
Conclusions

3
Introduction
4
Background

Information Sharing
Information publishing
Understandable semantics
Information dissemination
Shared information
Better utilization → Increased value
Shared information put together
Valuable knowledge

5
Social Web and Web 2.0

Easy to publish, understand and use
Information sharing platform
User generated contents
Connecting people
Collaboration
Mass participation: power of the people
Wisdom of the crowds

6
Current Limitations and Needs

Data processing and automation
Unstructured data is only for humans
Interoperability
Sharing data across different applications
Integration
Combining data from different applications

7
The Semantic Web

Web of Structured Data
Machine understandable semantics
Ontologies
Represent Conceptualizations of things
Consensus and common formats
Enables
Automated processing
Interoperation and Integration
Effective search and browsing

8
Challenges

Difficult to publish on the Semantic Web
Wide variety of data to share
Long Tail of information domains (Huynh et al., 2007)
Not enough ontologies
Ontology creation is a difficult process
Goal: to enable people to easily share a wide variety of semantically structured data

9
Social Semantic Web

Social software + Semantic Web
Web 3.0

Social connectivity
Social Semantic Web
Information connectivity
- Adapted from (Decker, 2005)
10
State-of-Art Social Semantic Web
Structured content creation on the Social Semantic Web
Direct Structured Contents: Instance Data Creation, Semantic Blogging, Semantic Bookmarking, Semantic Desktop, Semantic Annotation, Ontology Instance Data Creation, Semantic Wikis, Collaborative Ontology Creation
Derived Structured Contents: Semantification of Social Data, Data Exporters, Scrapers, Semantics of Tags, Semantics from Text, Emergent Semantics
11
Collaborative Knowledge Base Creation
Knowledge base = ontology + instance data
Collaborative Knowledge Base
Users
Users
12
Collaborative Knowledge Base Creation Systems
System | Ease of use | Expressiveness | Constraints | Multiplicity | Consensus
Semantic Wikis (SMW, IkeWiki, etc.) | Complex: extended wiki syntax, some training needed | Moderate: mainly instances, concept schemas possible | Strict type constraints | No | Needed (wiki way)
Freebase (Metaweb Inc.) | Moderate: interactive but elaborate interface | Moderate: concept schemas, instances | Strict type constraints | Allowed, but concepts not related | Mostly needed (wiki way, by admin)
myOntology (Siorpaes and Hepp, 2007) | Complex: understanding of ontology needed | Moderate: concepts, relations, instances | Strict logical constraints | No | Needed (wiki way)
Ontology Maturing (Braun et al., 2007) | Fairly easy: need to build taxonomy | Low: concept hierarchy | Free tagging | No | Needed (by interaction)
Desired Solution | Easy | Moderate | Minimum | Yes | Optional
13
Problems

Complexity and learning curve
Powerful collaborative systems are difficult for ordinary people
Difficult to create perfect concept definitions and ontologies
Difficult to accommodate all requirements
Strict constraints can make the model rigid
Existence of multiple conceptualizations
Different perspectives or contexts
Difficulty of collaboration and consensus

14
Proposed Approach
15
Proposed Collaborative Knowledge Base Creation
Collaborative Knowledge Base
Users
Users
Users
16
Overview of Proposed Approach
Structured Data Collection
Concept Consolidation
Social Platform for Structured Data Authoring
Schema Alignment
Concepts
Instances
Concept Grouping
Structured Linked Data
Grouped concepts
Browsing, Searching, Services
Emerging Lightweight Ontologies
User Community
17
StYLiD

Structure Your own Linked Data
http://www.stylid.org
Social Software for
Sharing a wide variety of Structured Data
Users freely define their own concepts
Easy for ordinary people
Consolidate multiple concept schemas
Group and organize similar concepts
Popular, evolving concept definitions

18
Hotel Concept
Creating a new Concept
List of Attributes
Description
Or Reuse / Modify existing Concept
Suggested Value Range
19
Shinjuku Prince Hotel
Instance Data
Literal value
Pick value from Suggested range
Resource URI
External URI
Multiple Values
20
Concept Consolidation

Hotel 1
Name
Amenities
Capacity
Contact
Price
Access
Rating

Hotel 2
Name
Facilities
No. of rooms
Phone-number
Single room price
Double room price
Nearest station
Category
Address

Hotel 3
Name
Price
Rating
City
Country
Near-by attractions

Hotel 4
Name
Phone-number
Zip-code
Latitude
Longitude
No. of stories

same
Synonymous / different labels
Different Contexts / Perspectives
Many-to-one
Complementary
21

Hotel (Consolidated Concept)
Name
Facilities
Capacity
Contact
Single room price
Double room price
Access
Rating
Address
Zip-code
Latitude
Longitude
Near-by attractions
No. of stories

Consolidated Concept
22
Concept Consolidation

A concept consolidation is defined as a triple $\langle \bar{C}, S, A \rangle$ where
$\bar{C}$ is the consolidated concept
$S$ is the set of constituent concepts $\{C_1, C_2, \ldots, C_n\}$
$A$ is the attribute alignment between $\bar{C}$ and $S$
Based on the Global-as-View (GAV) approach for data integration (Lenzerini, 2002)
Global schema defined as views on source schemas
Consolidated concept with consolidated attributes, aligned to source concept attributes as views
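
The triple defined above can be pictured as a small data structure. The following is only a minimal sketch under assumptions: the class names `Concept` and `ConceptConsolidation`, the `view` method, and the choice to store the alignment as a nested dictionary are all hypothetical and are not taken from StYLiD. The hotel attribute names come from the Hotel 1 and Hotel 2 examples in the slides; which attributes align with which, and the value "Restaurant", are illustrative guesses.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A user-defined concept schema: a name plus a list of attribute labels."""
    name: str
    attributes: list[str]


@dataclass
class ConceptConsolidation:
    """Hypothetical container for the triple (consolidated concept, S, A)."""
    consolidated: Concept
    constituents: dict[str, Concept]  # S, keyed by constituent concept name
    # A: consolidated attribute -> {constituent name -> source attribute}
    alignment: dict[str, dict[str, str]] = field(default_factory=dict)

    def view(self, source: str, instance: dict) -> dict:
        """GAV-style view: re-express a source instance in consolidated attributes."""
        row = {}
        for target_attr, sources in self.alignment.items():
            source_attr = sources.get(source)
            if source_attr in instance:
                row[target_attr] = instance[source_attr]
        return row


# Assumed alignment between two of the hotel schemas (illustrative only).
consolidation = ConceptConsolidation(
    consolidated=Concept("Hotel", ["Name", "Facilities", "Contact"]),
    constituents={
        "Hotel 1": Concept("Hotel 1", ["Name", "Amenities", "Contact"]),
        "Hotel 2": Concept("Hotel 2", ["Name", "Facilities", "Phone-number"]),
    },
    alignment={
        "Name": {"Hotel 1": "Name", "Hotel 2": "Name"},
        "Facilities": {"Hotel 1": "Amenities", "Hotel 2": "Facilities"},
        "Contact": {"Hotel 1": "Contact", "Hotel 2": "Phone-number"},
    },
)

row = consolidation.view(
    "Hotel 2", {"Name": "Shinjuku Prince Hotel", "Facilities": "Restaurant"}
)
print(row)
```

Keeping the alignment keyed by consolidated attribute mirrors the GAV idea: each global attribute is described in terms of the sources, so adding a new constituent concept only adds entries and does not change existing ones.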

23
Concept Consolidation
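$\langle \bar{C}, S, A \rangle$, with $\bar{C}$ the consolidated concept
[Figure: attributes of the consolidated concept and of the constituent concepts in $S$, linked by "image" and "view" mappings; the alignment $A$ is the set of aligned( , ) pairs]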
24
Concept Consolidation

Consolidated view of instances
Translation of instances
From one conceptualization to another
Query Unfolding (Advantage of GAV over LAV)
Queries over the consolidated concept (in terms of consolidated attributes)
are unfolded to queries over C1, C2, ..., Cn
Using alignment A
Union of results
Translation of queries
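
Query unfolding as listed above can be sketched in a few lines. This is a hedged illustration, not the StYLiD implementation: the function names `unfold` and `run_consolidated_query`, the dictionary layout of the alignment, and the sample instances are assumptions. The attribute pairs (location/address, type/category) follow the hotel search example in the slides.

```python
def unfold(query, alignment):
    """Rewrite a query over consolidated attributes into one query per source.

    query:     {consolidated attribute: required value}
    alignment: {consolidated attribute: {source concept: source attribute}}
    """
    sources = sorted({s for mapping in alignment.values() for s in mapping})
    unfolded = {}
    for source in sources:
        try:
            unfolded[source] = {
                alignment[attr][source]: value for attr, value in query.items()
            }
        except KeyError:
            # This source has no attribute aligned to a queried attribute,
            # so it cannot answer the query and is skipped.
            continue
    return unfolded


def run_consolidated_query(query, alignment, data):
    """Evaluate the unfolded queries on each source and return the union."""
    results = []
    for source, source_query in unfold(query, alignment).items():
        for instance in data.get(source, []):
            if all(instance.get(a) == v for a, v in source_query.items()):
                results.append((source, instance))
    return results


alignment = {
    "location": {"Hotel 1": "location", "Hotel 2": "address"},
    "type": {"Hotel 1": "type", "Hotel 2": "category"},
}
# Made-up instances, for illustration only.
data = {
    "Hotel 1": [
        {"name": "A", "location": "Tokyo", "type": "luxury"},
        {"name": "B", "location": "Kyoto", "type": "luxury"},
    ],
    "Hotel 2": [{"name": "C", "address": "Tokyo", "category": "luxury"}],
}

results = run_consolidated_query(
    {"location": "Tokyo", "type": "luxury"}, alignment, data
)
print(results)
```

Because the consolidated concept is defined as a view over the sources, unfolding is a plain substitution of attribute names, which is the advantage of GAV that the slide points to.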

25
Concept Cloud
Consolidated concept
Sub-Cloud
26
Experiment on Conceptualization

Hypothesis
Multiple conceptualizations by different people
for the same thing can be consolidated
Methodology
Participants given short text passages (6 participants)
List down facts structured as an (Attribute, Value) table
All concept schemas aligned manually

Concept schema (example)
attribute | value
name | Kiyomizu
location | Kyoto
.. | ..
27
Observations
[Charts: types of alignment relations found; attribute label similarity]
28
Remarks

People can express their conceptualizations in terms of schemas
Different people have different conceptualizations
No one covers all possible attributes
Conceptualizations overlap significantly
Most parts can be aligned
Most have simple alignment relations
Multiple conceptualizations can be consolidated

29
Alignment of Concept Schemas

Attribute alignments suggested automatically
Alignment API implementation (with WordNet extension) (Euzenat, 2004)
Community-supported alignment
Human intelligence + machine intelligence
Alignments are represented and saved
Alignment ontology (Hughes and Ashpole, 2004)
Alignment API alignment specification language (Euzenat et al., 2004)
Other formats: C-OWL, SWRL, OWL axioms, XSLT, SEKT-ML and SKOS
Incremental alignment (maintained collaboratively)
A unified view
Consolidated concept with consolidated attributes
Homogeneous table of data
30
Semi-automatic Schema Alignment
Two Hotel concepts
Consolidated attributes
31
Consolidated Structured Search
Find all hotels with location Tokyo and type luxury
Search on Consolidated Concept
Hotel 1 ↔ Hotel 2: location ↔ address, type ↔ category
32
Concept Grouping

Concept Similarity
$\mathrm{ConceptSim}(C_1, C_2) = w_1\,\mathrm{NameSim}(N_1, N_2) + w_2\,\mathrm{SchemaSim}(S_1, S_2)$
NameSim
WordNet-based similarity: Lin's algorithm (1998)
Levenshtein distance
SchemaSim
Average similarity of best matching pairs of attributes
Calculate ConceptSim between all pairs of concepts
Group similar concepts above a threshold
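
The weighted combination above can be sketched as follows. Several things here are assumptions: NameSim is approximated by a normalized Levenshtein distance only (the WordNet-based Lin measure is left out, and the slides do not say how the two are combined), SchemaSim is passed in as a precomputed number, and the default weights 0.7 and 0.3 are the ones reported in the grouping evaluation.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def name_sim(n1, n2):
    """Levenshtein distance normalized to a similarity in [0, 1]."""
    n1, n2 = n1.lower(), n2.lower()
    longest = max(len(n1), len(n2))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(n1, n2) / longest


def concept_sim(name1, name2, schema_sim, w1=0.7, w2=0.3):
    """ConceptSim = w1 * NameSim + w2 * SchemaSim."""
    return w1 * name_sim(name1, name2) + w2 * schema_sim


# The schema similarity of 0.5 is an arbitrary value for illustration.
print(concept_sim("Recipe", "Recipes", schema_sim=0.5))
```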

33
Schema Similarity

Calculate NameSim for all pairs of attributes to create an $n_1 \times n_2$ matrix
$M = \mathrm{NameSim}(A_1 \times A_2)$
Find best matching pairs using the Hungarian algorithm on $M$ (Kuhn, 1955; Munkres, 1957)
Calculate matching average
$\mathrm{SchemaSim}(S_1, S_2) = \dfrac{2 \times \sum \text{(similarity of best matching pairs)}}{|A_1| + |A_2|}$
Adapted from semantic similarity between sentences (Simpson and Dao, 2005)
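
The matching average can be illustrated with a small, self-contained sketch. To avoid external dependencies, the Hungarian algorithm is replaced here by a brute-force search over all one-to-one pairings, which finds the same optimal total but is only practical for small schemas. The string similarity from `difflib` also stands in for NameSim, and all function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import permutations


def name_sim(a, b):
    """Stand-in attribute name similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_matching_total(matrix):
    """Maximum total similarity over one-to-one pairings of rows and columns.

    Brute force in place of the Hungarian algorithm: same optimum,
    factorial cost, so suitable for small matrices only.
    """
    rows, cols = len(matrix), len(matrix[0])
    if rows > cols:
        # Transpose so that every row (the shorter side) gets a partner.
        matrix = [list(col) for col in zip(*matrix)]
        rows, cols = cols, rows
    return max(
        sum(matrix[r][c] for r, c in enumerate(perm))
        for perm in permutations(range(cols), rows)
    )


def schema_sim(attrs1, attrs2):
    """SchemaSim = 2 * (sum of best matching pair similarities) / (|A1| + |A2|)."""
    if not attrs1 or not attrs2:
        return 0.0
    matrix = [[name_sim(a, b) for b in attrs2] for a in attrs1]
    return 2 * best_matching_total(matrix) / (len(attrs1) + len(attrs2))


print(schema_sim(["name", "price"], ["price", "name"]))
print(schema_sim(["name"], ["name", "city", "country"]))
```

Dividing by the combined number of attributes penalizes unmatched attributes: a one-attribute schema that matches perfectly inside a three-attribute schema still scores only 0.5.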

[Figure: schemas S1 and S2 with their attribute sets A1 and A2]
34
Visualization of Concepts Grouping
Cytoscape
35
Experiments on Freebase Data

Purpose
Evaluate automatic schema alignment
Evaluate proposed concept grouping method
Observations about user-defined concepts
Freebase: community-driven database of the world's information
User-defined Types = concept schemas
Queried out (May 20, 2008)
Cleaning
Filter out test types, stop-words, types without instances

36
Observations

After cleaning
1,412 concepts
500 users who defined concepts
People want to share a wide variety of data
People define their own concept schemas
Most people define only a few concepts (1-5)
Long tail of information types

37
Freebase Concept Consolidation

Concepts with same name, synonyms, morphological variants
57 consolidated concepts formed
Multiple versions of a concept by different users
Up to 6 versions of the same concept
Same user also defines multiple versions
Alignments suggested automatically
51 alignment relations (44 aligned attribute sets)
Human judgement
Precision: 88.24%
Recall: 67.16%

38
Concept Consolidation Example

Recipe (user1), Recipe (user2), Recipes (user3)
r1, r2, r3
Consolidated concept: Recipe
Consolidated attributes (aligned attribute sets)
r1:ingredient, r2:ingredients, r3:materials
r1:steps, r2:instructions, r3:directions
r2:tools_required
r3:taste
r3:author

(adapted from Freebase)
39
Evaluation of Concept Grouping

$\mathrm{ConceptSim}(C_1, C_2) = w_1\,\mathrm{NameSim}(N_1, N_2) + w_2\,\mathrm{SchemaSim}(S_1, S_2)$

Concept grouping with different thresholds ($w_1 = 0.7$, $w_2 = 0.3$)
Concept grouping with different weights (threshold = 0.8)
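
Grouping by threshold can be sketched as below. The slides do not state how groups are formed from the pairwise scores, so this sketch assumes the simplest reading: two concepts land in the same group whenever a chain of pairs scoring at or above the threshold connects them (connected components, computed with union-find). The function name and the similarity values in the example are made up for illustration.

```python
from itertools import combinations


def group_concepts(concepts, similarity, threshold=0.8):
    """Group concepts whose pairwise similarity reaches the threshold.

    Groups are the connected components of the "similar enough" graph;
    this grouping rule is an assumption, not taken from the slides.
    """
    parent = {c: c for c in concepts}

    def find(c):
        # Follow parent links to the group representative, compressing the path.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in combinations(concepts, 2):
        if similarity(a, b) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for c in concepts:
        groups.setdefault(find(c), []).append(c)
    return list(groups.values())


# Toy similarity scores (invented); unlisted pairs count as 0.
scores = {("hotel", "hotels"): 0.9, ("hotels", "guest house"): 0.85}


def toy_similarity(a, b):
    return scores.get((a, b), scores.get((b, a), 0.0))


groups = group_concepts(
    ["hotel", "hotels", "guest house", "recipe"], toy_similarity, threshold=0.8
)
print(groups)
```

Raising the threshold splits groups apart and lowering it merges them, which is the trade-off that varying the threshold and the weights explores.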
40
Emergence of Lightweight Ontologies

Concepts contributed by community
Concept consolidation
Concept grouping
Popularity of concepts (as in Tag clouds)
Common vocabulary for structured information sharing
Conceptual schemas (class/property)
Informal organization by similarity

41
Informal Lightweight Ontology
Source: Schaffert et al. (2005), p. 7
42
Evaluation
43
Evaluation of Usability

Hypothesis
StYLiD is more usable than Freebase (for given tasks)
Methodology
Tasks performed with StYLiD and Freebase
Task 1 - Structured data authoring
Task 2 - Concept schema creation
Tasks 3, 4 - Modifying and reusing concepts
Task 5 - Structured concepts and instances authoring
Task 6 - Searching
Observations
Questionnaires, screen logs, comments, etc.

44
Example (Task 1)
Input: Band "The Beatles"
45
Participants

Total 15 participants
Including 6 without IT background
Different backgrounds
Public policy, international relations, psychology, telecommunication, networks, hotel staff, etc.
From 10 countries
Age: 22 to 43 (avg. 28.3)
Most did not know the systems before

46
Results

System Usability Scale (SUS) (Digital Equipment Corp.)
Average scores: StYLiD 69.7, Freebase 39.3
Enhanced Semantic MediaWiki: 54.8 (Pfisterer et al., 2008)
Aggregated results from the Tasks (score 0-4)
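
For readers unfamiliar with SUS, the per-participant score behind averages like those above is computed from a ten-item questionnaire answered on a 1 to 5 scale. The sketch below follows the standard SUS scoring rule; the function name and the sample responses are hypothetical and are not data from this study.

```python
def sus_score(responses):
    """Standard SUS scoring for one participant.

    responses: ten answers on a 1-5 scale, in questionnaire order.
    Odd-numbered items contribute (answer - 1), even-numbered items
    contribute (5 - answer); the sum is scaled by 2.5 to give 0-100.
    """
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for item, answer in enumerate(responses, start=1):
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


# Hypothetical answers from one participant.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```

A system's reported SUS score is then the mean of these per-participant scores.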

47
Results for non-IT participants

6 participants
SUS scores
StYLiD (71.67), Freebase (50.42)

48
Observations

StYLiD is quite usable without any training, prior knowledge or help
Most users preferred StYLiD to Freebase
Specifying attribute value range not easy
Strict data type constraints can cause problems
Many people modify and reuse concepts
People try to input all data in minimum steps
Data entry can be made easier and quicker
Auto-complete mechanisms would be helpful

49
Comparison with some systems
Feature | StYLiD | Freebase | Semantic MediaWiki
Concept creation | UI supported | UI supported | Template markup
Instance creation | Form-based | Form-based | Extended wiki syntax, forms
Data authoring | Blogging / social bookmarking | Structured wiki | Wiki text annotation
Data import | Wrappers | Bulk import facility | Not possible
Constraints | Flexible | Strict type constraints | Strict type constraints
Multiplicity | Allowed | Partly | No
Consolidation | Schema-level | Some instances | No
Organization | Concept grouping | Bases | Categories
50
Practical Applications
51
Application Scenarios

Social Site for
Structured Information Sharing

Users
Concept Schemas
External Data Resources
Structured data
Information Sharing Social Semantic Website
Users
52
Application Scenarios

Integrated Semantic portal

IS1
Structured data
Wrapper1
IS2
Wrapper2
Wrapper3
IS3
External Data Resources
Concept Schemas
Information Sources
Integrated Semantic Portal
Users
Admin
53
Adapting to different scenarios

Variable aspects
Data and concepts acquisition
Community and motivation
Functionalities and constraints
Data quality
Ways of adaptation
Use of wrappers, etc.
Delegate functionalities/constraints
Extensible and customizable open source
Customized queries and views

54
Real practical applications

Integration of research staff directories
Osaka University and Nagoya University
Data scraped from the websites
A musical community website in Tokyo
International Exchange Center
Social data bookmarking site StYLiD.org
A document management system in AIT

55
University Directory Integration

10 alignments automatically suggested
All correct
Total 19 alignments

56
Integrated interface
57
TIEC Musical Community website
58
StYLiD.org Data Bookmarking
59
Document Management system
60
Structured Information Dissemination in
Decentralized Communities
[Figure: several SocioBiblog systems, each with publishing and aggregation components, connected over the Web through social network links and exchanging data as Extended RSS]
61
Conclusions
62
Conclusions