Title: RDFStore Perl API for RDF Storage
1 RDFStore Perl API for RDF Storage
- INTAP meeting
- 14 January 2002
- JRC, Ispra (I)
- Alberto Reggiori areggiori_at_webweaving.org
- JRC/IPSC
2 Outline
- The RDF view of the Web - Semantic Web
- RDF applications and tools requirements
- RDFStore
- API
- Storage
- Querying
- Sample code and applications
- SW related applications
- Next steps
- Demonstration
3 The RDF model
- RDF (Resource Description Framework) is a foundation to structure data over the Web, using the Web itself
- RDF provides a simple and consistent mechanism to organise, describe, manage, associate, combine and navigate Web resources, so the Web is like a globally linked database
4 RDF concepts
[Diagram relating: Semantic Web, RDF/XML, N-triples, namespace, statement, class or property, applications and tools, context, RDF (schema) model, URI, resource or value, data, Web]
5 RDF applications and tools requirements
- Storage
- There is no real Semantic Web if we cannot store RDF models
- very fast and scalable
- support atomic and/or transactional operations
- deal with arbitrary sized data
- handle multiple connections concurrently
- avoiding a 'stagger' situation
- distributed and federated
- compression
- Caching
- API
- Easy and straightforward to use
- Modular and flexible
- Users need an abstraction for
- Parsing (canonical and strawman)
- Navigation
- Validation
- Managing constructs (set/get, add/remove)
- Persistence and querying
- Inference
- Must be both resource centric and model/statement centric
6 Processing models
- Model or Statement centric
- a model is seen as a set of statements (logician view)
- Add, remove, find, get, count triples primitives
- Set operations on models (union, difference, intersection)
- Stanford API, Jena and RDFStore
- Resource centric
- manipulate an RDF model in terms of resources having properties (OO view)
- Models are just containers of resources, literals and statements
- Add, remove, set/get property methods
- Jena, Redland (soon RDFStore)
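The two processing models can be contrasted with a minimal sketch. It is written in Python and is not RDFStore's Perl API: the `Model` class, its method names and the `properties` helper are hypothetical, and the sample triples are made up. A model is held as a plain set of (subject, predicate, object) tuples, so the set operations come almost for free, and the resource-centric view is derived from it.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class Model:
    """Hypothetical model-centric container: a set of statements."""

    def __init__(self, triples: Iterable[Triple] = ()) -> None:
        self.triples: Set[Triple] = set(triples)

    def add(self, s: str, p: str, o: str) -> None:
        self.triples.add((s, p, o))

    def remove(self, s: str, p: str, o: str) -> None:
        self.triples.discard((s, p, o))

    def find(self, s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> "Model":
        """Return the sub-model matching the pattern; None is a wildcard."""
        return Model(t for t in self.triples
                     if (s is None or t[0] == s)
                     and (p is None or t[1] == p)
                     and (o is None or t[2] == o))

    def count(self) -> int:
        return len(self.triples)

    # Set operations on whole models.
    def union(self, other: "Model") -> "Model":
        return Model(self.triples | other.triples)

    def intersection(self, other: "Model") -> "Model":
        return Model(self.triples & other.triples)

    def difference(self, other: "Model") -> "Model":
        return Model(self.triples - other.triples)


def properties(model: Model, resource: str) -> Dict[str, Set[str]]:
    """Resource-centric view: the properties attached to one resource."""
    view: Dict[str, Set[str]] = {}
    for _, p, o in model.find(s=resource).triples:
        view.setdefault(p, set()).add(o)
    return view


# Made-up sample data.
a = Model([("ex:doc1", "dc:title", "RDFStore"),
           ("ex:doc1", "dc:language", "en")])
b = Model([("ex:doc1", "dc:language", "en"),
           ("ex:doc2", "dc:title", "Jena")])
merged = a.union(b)
```

Here `merged.find(p="dc:title")` plays the role of a triple-matching query, while `properties(merged, "ex:doc1")` shows the same data from the resource side.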
7 RDFStore
- RDFStore is a Perl library to parse, store and manage RDF constructs
- Model centric API
- Pure Perl implementation of the Stanford Draft Java API plus some extensions
- Easy to install
- perl Makefile.PL; make; make test; make install
- Easy to use
- use RDFStore;
8 Toolkit features
- Modular interface using packages and perltie
- Expat based XML streaming SiRPAC and strawman RDF parsers
- In-memory, local and remote on-disk hashed data storage (SDBM, BerkeleyDB and DBMS)
- Triples matching and SQUISH/RDQL query
- Free-text searching of literals
- Simple contexts support
- Basic RDF Schema inference
- Free software, BSD license
- Tested on FreeBSD, Linux (some win32)
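What "basic RDF Schema inference" can mean is easiest to see with rdfs:subClassOf. The following Python sketch is an illustration only, not RDFStore code: the function name `infer_types` and the `ex:` sample data are hypothetical, and it covers just one rule (an instance of a class is also an instance of every superclass).

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(triples):
    """Return the input triples plus the rdf:type statements implied
    by following rdfs:subClassOf links transitively."""
    triples = set(triples)

    # Map each class to its direct superclasses.
    supers = {}
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            supers.setdefault(s, set()).add(o)

    def ancestors(cls):
        # Depth-first walk; the 'seen' set also guards against cycles.
        seen, stack = set(), [cls]
        while stack:
            for parent in supers.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    inferred = set(triples)
    for s, p, o in triples:
        if p == RDF_TYPE:
            for cls in ancestors(o):
                inferred.add((s, RDF_TYPE, cls))
    return inferred


# Made-up sample schema and instance.
schema = {
    ("ex:Thesaurus", SUBCLASS_OF, "ex:Vocabulary"),
    ("ex:Vocabulary", SUBCLASS_OF, "ex:Resource"),
    ("ex:etbt", RDF_TYPE, "ex:Thesaurus"),
}
closure = infer_types(schema)
```

A fuller RDFS reasoner would also handle rdfs:subPropertyOf, rdfs:domain and rdfs:range; the sketch leaves those out.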
9 Storage and indexing
- Multiple single key hash based DBs along with Perl based Object Serialization
- Transparently read/write local and remote DBs
- One RDFStore database currently consists of 7 on-disk hashed DB files
- resources, literals, namespaces, statements, references, model and index
- 237,822 triples get stored in a 31MB db (thesaurus)
- On-disk Sleepycat BerkeleyDB database with locking performs 183 read operations/second
- Remote DBMS has been tested at 2000 transactions/second (tps)
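The idea of several single-key hashed tables holding serialized objects can be sketched as follows. This is an illustration in Python under stated assumptions, not RDFStore's actual on-disk layout: the three in-memory dictionaries stand in for some of the DB files, SHA-1 digests as keys and `pickle` as the serializer are assumptions, and all function names are hypothetical. The last line works out the storage cost per triple from the figures on the slide, taking 1MB as 2^20 bytes.

```python
import hashlib
import pickle

# In-memory stand-ins for three of the on-disk hashed DB files.
tables = {"resources": {}, "literals": {}, "statements": {}}


def digest(text):
    """Fixed-size single key derived from the node's content."""
    return hashlib.sha1(text.encode("utf-8")).digest()


def put_node(table, value):
    key = digest(value)
    tables[table][key] = pickle.dumps(value)  # serialized object as value
    return key


def put_statement(subject, predicate, obj, literal=False):
    s = put_node("resources", subject)
    p = put_node("resources", predicate)
    o = put_node("literals" if literal else "resources", obj)
    # The statement key is a digest of its three component keys.
    key = hashlib.sha1(s + p + o).digest()
    tables["statements"][key] = pickle.dumps((s, p, o, literal))
    return key


def get_statement(key):
    s, p, o, literal = pickle.loads(tables["statements"][key])

    def load(table, k):
        return pickle.loads(tables[table][k])

    return (load("resources", s), load("resources", p),
            load("literals" if literal else "resources", o))


key = put_statement("ex:doc1", "dc:title", "RDFStore", literal=True)

# Back-of-envelope cost from the slide: 31MB for 237,822 triples.
bytes_per_triple = 31 * 1024 * 1024 / 237_822  # about 137 bytes
```

Because keys are content digests, storing the same statement twice overwrites the same entries instead of growing the tables.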
10 Remote DBMS storage
- Custom built, fast DBMS server with hooks into the Perl language
- optimized network routing daemon with a single thread/process per database
- TCP/IP based storage library
- Forking, non blocking IO and extensive buffering
- Original author Dirk-Willem van Gulik <dirkx_at_webweaving.org>
11 Requirements
- Required CPAN modules
- XML::Parser (which requires expat)
- Storable
- Digest
- Digest::MD5
- Digest::SHA1
- URI (which requires MIME::Base64)
- Optional
- BerkeleyDB
- DBMS (distributed with RDFStore)
12 API overview
13 Packages overview
14 API core classes
http://rdfstore.sourceforge.net/documentation/api.html
15 Sample code
Sample Perl scripts
16 Sample applications
- RDF parser
- http://rdfstoredemo.jrc.it
- RDQL/SQUISH
- http://rdfstoredemo.jrc.it/rdql
- rdf2gif (rdfviz)
- http://rdfstoredemo.jrc.it/rdf2gif.pl
- RSS viewer
- http://xml.jrc.it/wap
- Multilingual Thesaurus
- RDF Schema
- http://etbrowse.jrc.it/thesaurus/thes.rdfs
- API
- http://etbrowse.jrc.it/thesaurus/ETBT/API.html
- Display, search and indexing tool
- http://etbrowse.jrc.it/thesaurus/classifier.pl
- European Treasury Browser (ETB)
- http://etbrowse.jrc.it/
17 Sample RDQL query
SELECT ?title_value, ?title_language,
       ?subject_value, ?subject_language,
       ?description_value, ?description_language,
       ?language, ?identifier
FROM <http://etbrowse.jrc.it/xml/rdf/example10.xml>
WHERE
  (?x, <dc:title>, ?tt),
  (?tt, <rdf:value>, ?title_value),
  (?tt, <dc:language>, ?ttl),
  (?ttl, <dcq:RFC1766>, ?title_language),
  (?x, <dc:subject>, ?ss1),
  (?ss1, <etbthes:ETBT>, ?ss2),
  (?ss2, <rdf:value>, ?subject_value),
  (?ss2, <dc:language>, ?ss3),
  (?ss3, <dcq:RFC1766>, ?subject_language),
  (?x, <dc:description>, ?dd),
  (?dd, <rdf:value>, ?description_value),
  (?dd, <dc:language>, ?ddl),
  (?ddl, <dcq:RFC1766>, ?description_language),
  (?x, <dc:identifier>, ?identifier),
  (?x, <dc:language>, ?ll1),
  (?ll1, <dcq:RFC1766>, ?language)
USING
  rdf for <http://www.w3.org/1999/02/22-rdf-syntax-ns#>,
  rdfs for <http://www.w3.org/2000/01/rdf-schema#>,
  dc for <http://purl.org/dc/elements/1.1/>,
  dcq for <http://purl.org/dc/terms/>,
  dct for <http://purl.org/dc/dcmitype/>,
  etb for <http://eun.org/etb/elements/>,
  etbthes for <http://eun.org/etb/thesaurus/elements/>
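A query of this shape is a conjunction of triple patterns that share variables. One simple way to evaluate it, shown here as a Python sketch and not as RDFStore's actual query engine, is to match the patterns one after another while carrying the variable bindings forward. The function names and the four sample triples are made up; only the title part of the query is reproduced.

```python
def is_var(term):
    return term.startswith("?")


def match(pattern, triple, binding):
    """Try to extend 'binding' so that 'pattern' equals 'triple'.
    Returns the extended binding, or None when they conflict."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to something else
        elif term != value:
            return None      # constant does not match
    return new


def query(select, patterns, triples):
    """Naive nested-loop evaluation of a conjunction of patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return [tuple(b[v] for v in select) for b in bindings]


# Made-up data shaped like the title part of the query above.
triples = [
    ("ex:rec1", "dc:title", "_:t1"),
    ("_:t1", "rdf:value", "Treasury Browser"),
    ("_:t1", "dc:language", "_:l1"),
    ("_:l1", "dcq:RFC1766", "en"),
]
rows = query(
    ["?title_value", "?title_language"],
    [("?x", "dc:title", "?tt"),
     ("?tt", "rdf:value", "?title_value"),
     ("?tt", "dc:language", "?ttl"),
     ("?ttl", "dcq:RFC1766", "?title_language")],
    triples,
)
```

The nested loops cost one pass over all triples per pattern and binding; a real store would use its indexes to look up candidates instead.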
18 Existing RDFStore applications
- RDF web service API for MusicBrainz (Robert Kaye rob_at_eorbit.net)
- http://www.musicbrainz.org
- OCLC work on automatic classification (Devon Smith devon_at_taller.pscl.cwru.edu)
- Scorpion and Wordsmith crawler
- Volga PRISM standard based novels (Prasad A. Chodavarapu prasad_at_bhaavana.net)
- http://volga.bhaavana.net/cgi-bin/rdexplorer.cgi
- European Treasury Browser (ETB)
- http://etb.eun.org
- http://etbrowse.jrc.it
- DC/DCQ/DCEDU
- http://eun.org/etb/elements.dtd
- http://eun.org/etb/elements/
- http://eun.org/etb/thesaurus/elements/
- http://eun.org/etb/thesaurus.rdf
19 DCQ example
- <dcq:modified>
- <dcq:W3CDTF>
- <rdf:value>2001-05-31T14201000</rdf:value>
- </dcq:W3CDTF>
- </dcq:modified>
- $modified = $records->find(
- $records->find($node, $DCQ::modified)->elements->object, $RDF::value)->elements->object->toString;
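The nested lookup is a two-hop walk over the graph: from the record to its dcq:modified node, then from that node to its rdf:value literal. The Python sketch below illustrates the idea and is not RDFStore code. The subject `ex:record1`, the blank node label `_:date` and the helper names are hypothetical, and the date string is copied as it appears on the slide.

```python
RECORD = "ex:record1"  # hypothetical subject of the description

# Triples the XML fragment stands for: the dcq:W3CDTF element is a
# typed node that carries the literal in rdf:value.
triples = [
    (RECORD, "dcq:modified", "_:date"),
    ("_:date", "rdf:type", "dcq:W3CDTF"),
    ("_:date", "rdf:value", "2001-05-31T14201000"),
]


def objects(subject, predicate):
    """All objects of statements with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def modified(node):
    """Follow node -> dcq:modified -> rdf:value; first literal or None."""
    for date_node in objects(node, "dcq:modified"):
        for value in objects(date_node, "rdf:value"):
            return value
    return None


stamp = modified(RECORD)
```

The inner call corresponds to the inner `find` on the slide, and the outer loop to the outer one.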
20 What RDFStore does not do... yet?
- Resource centric API
- First order/predicate logic or intelligent inferencing
- Mapping and/or dumb-down
- A lot of other things...
- ...rocks :)
21 Other Perl API
http://xml.jrc.it/rdfapi_perl/
22 Demo time
23 RDFStore links
- Software project
- http://rdfstore.sourceforge.net
- Demo
- http://rdfstoredemo.jrc.it
- RDFStore vs. rdfdb
- http://groups.yahoo.com/group/rdfstore/message/14
- Documentation
- http://rdfstore.sourceforge.net/documentation/api.html
- http://rdfstore.sourceforge.net/dbms.html
- http://rdfstore.sourceforge.net/documentation/pod
- YahooGroups developer mailing list
- http://groups.yahoo.com/group/rdfstore