Title: A Semantic Web Content Model and Repository
1 A Semantic Web Content Model and Repository
- Max Völkel
- 6.9.2007, I-Semantics, Graz
2 Outline
- Motivation
- Analysis Web vs. Semantic Web
- Developing a unified Semantic Web Content Model
- In three easy steps
- Implementation
3 How to model structure and content in one model?
- Background
- Wikis, Personal Semantic Wikis, Semantic Desktop, ...
- Two motivations
- Bring the flexibility and expressivity of RDF to the end-user
- Allow RDF to model and represent content as well, not only its metadata
- Goal: Unified Model
- As usable as the web
- But how to represent semantics? Semantic queries?
- As expressive and as flexible as the semantic web
- How to represent binary data (desktop files, web resources) in RDF?
- Unified search
- Give me all papers written by author X which contain Y
4 Analysis: The Web
- Granularity: gets smaller
- Web 1.0: homepages, portals
- Web 2.0: micro-content
- Renderable representations
- Freedom of formalisation
- Less semantic HTML is less portable, but works
[Diagram: a URI, accessed via HTTP, yields a Representation with metadata (Encoding, ChangeDate, MimeType) and Content (HTML, JPG, CSS, JS, PDF, ...)]
5 Analysis: The Semantic Web
- Flexible, very expressive
- Not expressive enough
- Literals cannot be addressed
- Statements cannot be addressed (but reified)
- 10 different node types → complex for end-users
- Existing formal knowledge can be re-used
6 Requirements for a SWCM
- Content granularity
- Expressivity
- Binary Content
- Freedom of formalisation
- Human-usable
- Renderable representations
- Human-type- and memorizable names (e.g. like WikiWords)
- Inverse Relations
- Knowledge re-use
- Standard CMS features
- Access rights → addressable parts
- Versioning → addressable parts
7 Comparison
- Feature | Web | Sem. Web | Desired
- Content granularity | mid/large | small | any
- Goal: From small comments to full web pages/files
- Expressivity -
- Binary Content
- Freedom of formalisation -
- Human-usable -
- Renderable representations
- Human-type- and memorizable names (e.g. like WikiWords)
- Inverse Relations
- Knowledge re-use -
- Standard CMS features
- Access rights → addressable parts
- Versioning → addressable parts
8 Creating the SWCM, Step 1: A Human-Usable RDF
9 Step 1: A Human-Usable RDF
- Items have a URI and can have a Literal
- → Addressable Literals
[Diagram: Item has a URI and 0..1 Literal]
10 Step 1: A Human-Usable RDF
- Statements connect Items
- → Expressivity of RDF
[Diagram: Item (URI, 0..1 Literal); Statement with source, target, and relation]
11 Step 1: A Human-Usable RDF
- Addressable Statements
- → Syntactic sugar over reification
[Diagram: Item (URI, 0..1 Literal); Statement with source, target, and relation]
12 Step 1: A Human-Usable RDF
- Address Items via human-type-able name (e.g. WikiWords)
- → Human-usable naming
[Diagram: Item (URI, 0..1 Literal); NameItem; Statement with source, target, and relation]
13 Step 1: A Human-Usable RDF
- Statements: (Item, NameItem, Item)
- → Decision that relations should be human-name-able
[Diagram: Item (URI, 0..1 Literal); NameItem; Statement with source, target, and relation]
14 Step 1: A Human-Usable RDF
- Relations always have an inverse
- → Item-centric rendering is easier for tools
[Diagram: Item (URI, 0..1 Literal); NameItem; Relation with inverse; Statement with source, target, and relation]
15 Step 1: A Human-Usable RDF
- A Model contains Items
- A Model has a URI
[Diagram: Model (URI) contains 0..n Items; Item (URI, 0..1 Literal); NameItem; Relation with inverse; Statement with source, target, and relation]
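The Step 1 model can be sketched in a few small classes. This is only an illustration of the slides' concepts, not the swecr API: all class and method names below (`Item`, `NameItem`, `Relation.pair`, `Model.byName`, `statementsAbout`) are hypothetical, and URIs are plain strings for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** An Item has a URI and at most one literal (the 0..1 in the diagram). */
class Item {
    final String uri;
    final String literal; // null when the item has no literal

    Item(String uri, String literal) {
        this.uri = uri;
        this.literal = literal;
    }
}

/** A NameItem is an Item whose literal is a human-typeable name. */
class NameItem extends Item {
    NameItem(String uri, String name) {
        super(uri, name);
    }
}

/** A Relation is a named item that always has an inverse. */
class Relation extends NameItem {
    Relation inverse;

    Relation(String uri, String name) {
        super(uri, name);
    }

    /** Creates a relation together with its inverse, linked in both directions. */
    static Relation pair(String uri, String name, String inverseUri, String inverseName) {
        Relation relation = new Relation(uri, name);
        Relation inverse = new Relation(inverseUri, inverseName);
        relation.inverse = inverse;
        inverse.inverse = relation;
        return relation;
    }
}

/** A Statement is itself an Item, which is what makes it addressable. */
class Statement extends Item {
    final Item source;
    final Relation relation;
    final Item target;

    Statement(String uri, Item source, Relation relation, Item target) {
        super(uri, null);
        this.source = source;
        this.relation = relation;
        this.target = target;
    }
}

/** A Model has a URI and contains Items. */
class Model {
    final String uri;
    private final Map<String, Item> itemsByUri = new HashMap<>();
    private final Map<String, NameItem> itemsByName = new HashMap<>();

    Model(String uri) {
        this.uri = uri;
    }

    <T extends Item> T add(T item) {
        itemsByUri.put(item.uri, item);
        if (item instanceof NameItem) {
            itemsByName.put(item.literal, (NameItem) item);
        }
        return item;
    }

    /** Human-usable addressing: look an item up by its typed name. */
    Optional<NameItem> byName(String name) {
        return Optional.ofNullable(itemsByName.get(name));
    }

    /** Item-centric view: all statements with the item as source or target. */
    List<Statement> statementsAbout(Item item) {
        List<Statement> result = new ArrayList<>();
        for (Item candidate : itemsByUri.values()) {
            if (candidate instanceof Statement) {
                Statement s = (Statement) candidate;
                if (s.source == item || s.target == item) {
                    result.add(s);
                }
            }
        }
        return result;
    }
}
```

Because every relation carries its inverse, a tool rendering the target item can label the same statement from the target's point of view (`statement.relation.inverse.literal`) without a second statement being authored.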
16 Creating the SWCM, Step 2: Include Binary Content
17 Step 2: Include Binary Content
- From addressable literals to addressable representations
[Diagram: Item has a URI and 0..1 Literal]
18 Step 2: Include Binary Content
[Diagram: Item has a URI and 0..1 Representation]
19 Step 2: Include Binary Content
- Representations on the web have some built-in properties
- Metadata: MIME type, encoding, change date
- Data: the actual content itself
[Diagram: Item has a URI and 0..1 Representation]
20 Step 2: Include Binary Content
- Representations on the web have some built-in properties
- Metadata: MIME type, encoding, change date
- Data: the actual content itself
[Diagram: Item has a URI and 0..1 Representation; Representation has Encoding, ChangeDate, MimeType, and Content]
21 Step 2: Include Binary Content
- In SWCM, representations have an author
- Like in wikis, blogs, web pages, ...
- Can be anonymous
[Diagram: Item has a URI and 0..1 Representation; Representation has Encoding, ChangeDate, MimeType, and Content]
22 Step 2: Include Binary Content
- In SWCM, representations have an author
- Like in wikis, blogs, web pages, ...
- Can be anonymous
[Diagram: Item has a URI and 0..1 Representation; Representation has Encoding, ChangeDate, MimeType, Content, and an author]
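As a rough sketch of Step 2, a representation can be modelled as content bytes plus the built-in metadata listed above. The names here (`Representation`, `ContentItem`, `ofText`, the `"anonymous"` marker) are assumptions made for illustration and are not taken from swecr.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.time.Instant;

/** Content bytes plus the metadata a web representation carries. */
final class Representation {
    static final String ANONYMOUS = "anonymous"; // hypothetical marker for "no known author"

    final byte[] content;
    final String mimeType;
    final Charset encoding;   // null for non-text content
    final Instant changeDate;
    final String author;

    Representation(byte[] content, String mimeType, Charset encoding,
                   Instant changeDate, String author) {
        this.content = content.clone();
        this.mimeType = mimeType;
        this.encoding = encoding;
        this.changeDate = changeDate;
        // Representations always have an author, but it can be anonymous.
        this.author = (author == null) ? ANONYMOUS : author;
    }

    /** A plain-text literal is the special case text/plain. */
    static Representation ofText(String text, String author) {
        return new Representation(text.getBytes(StandardCharsets.UTF_8), "text/plain",
                StandardCharsets.UTF_8, Instant.now(), author);
    }

    boolean isText() {
        return mimeType.startsWith("text/");
    }

    String asText() {
        if (!isText() || encoding == null) {
            throw new IllegalStateException("Not a text representation: " + mimeType);
        }
        return new String(content, encoding);
    }
}

/** An item addressed by a URI with 0..1 representations. */
final class ContentItem {
    final String uri;
    Representation representation; // null models the "0" in 0..1

    ContentItem(String uri) {
        this.uri = uri;
    }
}
```

Treating the former literal as a `text/plain` representation means text and binary content share one code path, which is the point of moving from addressable literals to addressable representations.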
23 Creating the SWCM, Step 3: Merge Step 1 and Step 2
24 The Semantic Web Content Model
[Diagram: the merged model. Structure: Model (URI) contains 0..n Items; NameItem; Relation with inverse; Statement with source, target, and relation. Content: Item has 0..1 Representation with Encoding, ChangeDate, MimeType, Content, and author]
25 The Semantic Web Content Model
- We expect end-users to understand the circled parts
[Diagram: the merged model with some parts circled. Structure: Model (URI) contains 0..n Items; NameItem; Relation with inverse; Statement with source, target, and relation. Content: Item has 0..1 Representation with Encoding, ChangeDate, MimeType, Content, and author]
26 Implementation
27 Swecr is implemented in two layers
[Diagram: two layers, the swecr.model interface and the swecr.core interface]
28 The swecr.model API (see www.swecr.org)
[Diagram: interfaces IRepository, IModel (0..n), IItem (0..n, RDF2Go.URI), IContentItem, INameItem, IStatement (source, target), IRelation (inverse), IContent (0..1, author, ChangeDate, IMimeType), INameContent, IBinContent]
1. Content of an INameItem is unique within its IModel.
2. MIME type is always text/plain.
29 swecr.core
30 swecr.core: Some Content stored in RDF
- :FZI a swcm:NameItem , swcm:Item ;
  swcm:hasChangeDate "2007-08-24T16:07:29Z"^^xsd:dateTime ;
  swcm:hasContent "FZI Forschungszentrum Informatik" .
- :employs a swcm:NameItem , swcm:Item , swcm:Relation ;
  swcm:hasAuthor swcm:anonymous-author ;
  swcm:hasChangeDate "2007-08-24T16:07:32Z"^^xsd:dateTime ;
  swcm:hasContent "employs" ;
  swcm:hasInverse :employedBy .
- :worksFor a swcm:NameItem , swcm:Item , swcm:Relation ;
  swcm:hasAuthor swcm:anonymous-author ;
  swcm:hasChangeDate "2007-08-24T16:07:33Z"^^xsd:dateTime ;
  swcm:hasContent "works for" ;
  swcm:hasInverse :employs .
31 Implemented in two layers
- Statements are stored in two RDF models: the user model
  <urn:rnd:-1d72b0a2:11498a0d25f:-7fff> a swcm:Item , swcm:Statement ;
  swcm:hasChangeDate "2007-08-24T16:07:30Z"^^xsd:dateTime ;
  swcm:stmtRelation :employs ;
  swcm:stmtSource :FZI ;
  swcm:stmtTarget :Max .
- and the index model → query answering
  :FZI :employs :Max . :Max :worksFor :FZI . (redundant)
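The redundancy in the index model can be illustrated with a small sketch: each user-model statement is written once as a plain triple and once more in the inverse direction, so queries from either end need no reasoning step. The class `IndexModel` and its methods are hypothetical stand-ins, and triples are simple string lists rather than RDF2Go objects.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** In-memory stand-in for the index model: plain triples, stored redundantly. */
final class IndexModel {
    /** Relation name to inverse relation name (stand-in for swcm:hasInverse). */
    private final Map<String, String> inverseOf;
    private final Set<List<String>> triples = new LinkedHashSet<>();

    IndexModel(Map<String, String> inverseOf) {
        this.inverseOf = inverseOf;
    }

    /** Indexes one user-model statement as a triple plus its inverse triple. */
    void index(String source, String relation, String target) {
        triples.add(List.of(source, relation, target));
        String inverse = inverseOf.get(relation);
        if (inverse != null) {
            triples.add(List.of(target, inverse, source));
        }
    }

    /** Returns all objects reachable from the subject via the relation. */
    Set<String> objects(String subject, String relation) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> triple : triples) {
            if (triple.get(0).equals(subject) && triple.get(1).equals(relation)) {
                result.add(triple.get(2));
            }
        }
        return result;
    }

    Set<List<String>> triples() {
        return triples;
    }
}
```

For example, after `index("FZI", "employs", "Max")` with `employs` mapped to `worksFor`, the query `objects("Max", "worksFor")` finds `FZI` directly. The cost is that the index must be kept in sync whenever the user model changes.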
32 But where to store binaries?
[Diagram: swecr.model interface and swecr.core interface on top of an RDF ModelSet (user model, index model); the place for binaries is still open (?)]
33 BinStore: a simple binary store
- Intuition: the simplest web-like API that could possibly work (and allow random access)
- Data model: URI → Metadata, InputStream / OutputStream
- Simple implementation on files
- Future: Consider JCR
- getReadHandle
- InputStream readStream()
- getMimeType(), getSize()
- getWriteHandle
- writeStream( InputStream, MimeType )
- setMimeType( MimeType )
- getRandomAccessHandle
- delete( URI )
[Diagram: Binary Store API, implemented by BinStoreImpl]
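To make the handle-based API above concrete, here is a minimal in-memory sketch. It borrows the method names from the slide, but the signatures, the `String` MIME type, the default MIME type, and the class name `InMemoryBinStore` are assumptions; the real BinStoreImpl is file-based, and random-access handles are left out here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Maps a URI to metadata plus a byte stream, held in memory. */
final class InMemoryBinStore {

    private static final class Entry {
        byte[] data = new byte[0];
        String mimeType = "application/octet-stream"; // assumed default
    }

    private final Map<URI, Entry> entries = new HashMap<>();

    final class ReadHandle {
        private final Entry entry;

        private ReadHandle(Entry entry) {
            this.entry = entry;
        }

        InputStream readStream() {
            return new ByteArrayInputStream(entry.data);
        }

        String getMimeType() {
            return entry.mimeType;
        }

        long getSize() {
            return entry.data.length;
        }
    }

    final class WriteHandle {
        private final Entry entry;

        private WriteHandle(Entry entry) {
            this.entry = entry;
        }

        /** Replaces the stored bytes with the stream's content. */
        void writeStream(InputStream in, String mimeType) {
            try {
                entry.data = in.readAllBytes(); // requires Java 9 or later
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            entry.mimeType = mimeType;
        }

        void setMimeType(String mimeType) {
            entry.mimeType = mimeType;
        }
    }

    ReadHandle getReadHandle(URI uri) {
        Entry entry = entries.get(uri);
        if (entry == null) {
            throw new IllegalArgumentException("No binary stored under " + uri);
        }
        return new ReadHandle(entry);
    }

    /** Creates the entry on first use, like a PUT on the web. */
    WriteHandle getWriteHandle(URI uri) {
        return new WriteHandle(entries.computeIfAbsent(uri, u -> new Entry()));
    }

    /** Returns true if something was stored under the URI. */
    boolean delete(URI uri) {
        return entries.remove(uri) != null;
    }
}
```

Separating read and write handles mirrors HTTP GET and PUT on a resource, which keeps the store "web-like" while still leaving room for a random-access handle.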
34 Persistence in an RDF ModelSet and a Binary Store
- Full text queries need a full text index
[Diagram: swecr.model interface and swecr.core interface on top of a Binary Store and an RDF ModelSet (user model, index model)]
35 The complete swecr.core
[Diagram: swecr.model interface and swecr.core interface on top of Binary Store, RDF ModelSet, and Query Engine; components IndexingBinStore, IndexingModelSet, BinStoreImpl, TextIndexImpl, ModelSetImpl, AdapterServer, Bin2Text (Aperture); legend: existing component, in progress]
- Download from www.swecr.org
36 Example: Wiki-page
- Example: A wiki page in SWCM
- Title of wiki page → NameItem
- Content of wiki page → Item
- Relation between title and page content → Statement
- Who uses it?
- WavesWiki (part of BMBF project, http://waves.fzi.de)
- SemFS: a Semantic File System (presented at I-KNOW in 2006)
- Conceptual Data Structures (end-user personal KM tool)
- Interest from XWiki and Cognium Systems
37 Summary
- SWCM is a content management model combining the usability of the web with the expressivity and flexibility of the semantic web
[Diagram: Item; NameItem; Relation with inverse; Statement with source, target, and relation]
- Future Work
- Refactor core layer into smaller parts (services)
- Create an "RDF with binaries" API
- Unified queries (like the LuceneSAIL or LARQ)
- Crawling of external resources (indexed locally, stored remotely)
- From structured text to SWCM models (see paper)
38 Thank You.