Title: Smeagol: A Goal Directed Learning Agent for the
Semantic Web
- Motivation
- The Semantic Web benefits from learning
- Users need personalised views of information.
- Ontologies may not provide relevant properties
for your application.
- Machine Learning techniques can help identify
this missing information!
- Machine Learning benefits from the Semantic Web
- Learning methods often require large amounts of
domain-specific background knowledge and rules.
- Previously this knowledge has been manually
acquired from domain experts and coded in the
representation language of the applications.
- Now RDF provides a standard representation
language and the Semantic Web provides a
world-wide web of knowledge backed by ontologies!
- However
- In the Semantic Web context we have to confront
issues such as scalability and data sparseness.
- Goal Directed Learning
- What types of Goals are there?
- Set-Difference
- e.g. distinguish between movies from the 80s and
90s.
- Identify-Sets
- e.g. differentiate Star Wars Trilogy from LOTR
trilogy.
- Knowledge Reformulation
- e.g. map instances from one ontology to another.
- How do learning goals arise?
- How do learning goals affect the learning
process?
- How do learning goals relate to each other?
- How are learning goals represented?
- FOAF Experiments
- We have previously conducted learning experiments
using FOAF data.
- 6.5 Million Triples
- ILP and clustering used to identify unspecified
conceptualisations, and learn descriptions of
these.
- Extremely resource-hungry! Gigabytes of memory
and days to perform experiments.
- member(A) :-
foaf_groupHomepage(A, http://www.aktors.org).
- member(A) :-
contact_nearestAirport(A, airports?ABZ).
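A learnt rule of this shape says that an individual A is a member of the concept whenever a given predicate holds between A and a given value. The sketch below only illustrates how such a single-literal rule could be checked against a list of triples; it is not Smeagol's own code, and the `covers` helper and all data values are made up for the example.

```python
# Toy (subject, predicate, object) triples; every value here is invented.
TRIPLES = [
    ("alice", "foaf_groupHomepage", "http://www.aktors.org"),
    ("bob", "foaf_groupHomepage", "http://example.org/other"),
    ("carol", "contact_nearestAirport", "ABZ"),
]


def covers(triples, predicate, value):
    """Return every subject A for which predicate(A, value) holds."""
    return {s for (s, p, o) in triples if p == predicate and o == value}


# Equivalent of: member(A) :- foaf_groupHomepage(A, 'http://www.aktors.org').
members = covers(TRIPLES, "foaf_groupHomepage", "http://www.aktors.org")
```

An ILP system searches for rule bodies like this one; the sketch covers only the final step of applying a rule that has already been found.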
We use an RDF representation, where a goal has
the following properties: GoalType, Parameters,
Contexts and Datasources.
<Smeagol:Goal>
  <type rdf:resource="smeagol:SetDifference" />
  <set1>
    <Query><rdql>SELECT ?x
      WHERE (?x, <imdb:director>,
      <http://imdb.com/name/nm0040/>)</rdql></Query>
  </set1>
  <exclude-predicates>
    <rdf:Seq>
      <rdf:li rdf:resource="imdb:director" />
    </rdf:Seq>
  </exclude-predicates>
  <datasource rdf:resource="data:IMDbMovies" />
  <datasource rdf:resource="data:IMDbPeople" />
  <context rdf:resource="imdb:Arty" />
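The four goal properties named above (GoalType, Parameters, Contexts, Datasources) could be held in memory roughly as follows. This is a minimal sketch, not Smeagol's actual data model: the `Goal` class, its field names, and the prefixed spellings such as `imdb:director` and `data:IMDbMovies` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """Hypothetical in-memory form of a learning goal."""
    goal_type: str                                   # e.g. "SetDifference"
    parameters: dict = field(default_factory=dict)   # queries, excluded predicates
    contexts: list = field(default_factory=list)
    datasources: list = field(default_factory=list)


# The example goal, transcribed by hand (prefix spellings are assumed).
goal = Goal(
    goal_type="SetDifference",
    parameters={
        "set1": "SELECT ?x WHERE (?x, <imdb:director>, "
                "<http://imdb.com/name/nm0040/>)",
        "exclude-predicates": ["imdb:director"],
    },
    contexts=["imdb:Arty"],
    datasources=["data:IMDbMovies", "data:IMDbPeople"],
)
```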
- Smeagol
- Implemented in Python
- ILP: Srinivasan's Aleph
- Clustering: Hierarchical Agglomerative Clusterer
- Rule learning: Cohen's SLIPPER
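For readers unfamiliar with hierarchical agglomerative clustering, the idea is to start with one cluster per item and repeatedly merge the two closest clusters. The following is a generic single-linkage sketch over plain numbers; the linkage, the distance measure and the `agglomerative` function are assumptions for illustration and say nothing about how Smeagol's clusterer is configured.

```python
def agglomerative(points, k):
    """Merge the two closest clusters (single linkage) until k remain."""
    k = max(k, 1)
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, index_i, index_j) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Invented release years, grouped into two clusters.
years = [1972, 1974, 1977, 1999, 2001, 2003]
groups = agglomerative(years, 2)
```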
[Figure: Smeagol diagram; labels: Data-Acquisition, Learning]
- Sub-Goals
- Pick a context
- Learn a missing predicate
- Not all datasources will include all predicates.
- Include additional resources
- Fetch ontologies, traverse graph further.
- Sub-divide learning space
- e.g. map year to decade.
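The "map year to decade" sub-goal can be pictured as deriving a coarser predicate from an existing one, so that a learner sees a few decade values instead of many distinct years. The sketch below assumes triples are plain tuples and that the predicates are called `year` and `decade`; both names are hypothetical.

```python
def decade(year):
    """Map a year to the first year of its decade, e.g. 1972 -> 1970."""
    return (year // 10) * 10


def add_decade(triples):
    """Return the triples plus a derived 'decade' triple for each 'year' triple."""
    extra = [(s, "decade", decade(int(o))) for (s, p, o) in triples if p == "year"]
    return triples + extra


# Invented example data.
movies = [("movie_a", "year", "1972"), ("movie_b", "year", "1999")]
enriched = add_decade(movies)
```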
[Figure: diagram of the learning process; labels: ILP,
Clustering, Rule-learner, Create Sub-goal, Success!, Failure]
Result: Sci-Fi movies have a special effects producer
and are directed by Stanley Kubrick or George
Lucas.
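Read as a classifier, the learnt description above is a conjunction with one disjunctive part. A minimal sketch of that reading, assuming each movie is a dictionary with hypothetical keys `director` and `special_effects_producer` (the records are invented):

```python
SCIFI_DIRECTORS = {"Stanley Kubrick", "George Lucas"}


def is_scifi(movie):
    """Apply the learnt description: has an SFX producer AND a listed director."""
    has_sfx_producer = bool(movie.get("special_effects_producer"))
    return has_sfx_producer and movie.get("director") in SCIFI_DIRECTORS


# Invented records, used only to exercise the rule.
movie_a = {"director": "George Lucas", "special_effects_producer": "someone"}
movie_b = {"director": "George Lucas"}
movie_c = {"director": "Another Director", "special_effects_producer": "someone"}
```

The rule only describes the movies in the data it was learnt from; it should not be read as a general definition of the genre.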
Moviehack: a research tool built to allow easy
experimentation with and evaluation of Smeagol.
- RDF Data
- Extracted from Internet Movie Database (IMDB).
- Homemade ontology using FOAF.
- Top 250 movies, 30k people.
- 1.5 Million Triples.
- Trivia information extracted from text
descriptions.
<imdb:Movie rdf:ID="http://imdb/title/tt0068646/">
  <dc:title>The Godfather</dc:title>
  <imdb:year>1972</imdb:year>
  <imdb:director rdf:resource="/nm00338" />
  <imdb:genre rdf:resource="imdb:Drama" />
  <imdb:cast>
    <imdb:Role>
      <foaf:name>Don Vito Corleone</foaf:name>
      <imdb:actor>
        <foaf:Person rdf:resource="">
          <foaf:name>Marlon Brando ...
- User can choose predicates to include/exclude
from learnt rules.
- RDQL is used to specify sets.
- Web-Interface written in PHP using RDF API for
PHP (RAP)
- Results presented as Prolog rules.
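Two of the steps listed above are easy to picture in code: dropping the predicates a user has excluded before learning, and printing a learnt rule in Prolog syntax. The sketch below assumes triples are plain tuples and that a rule body is a list of (predicate, value) pairs; `drop_predicates` and `to_prolog` are hypothetical helpers, not part of Moviehack or RAP.

```python
def drop_predicates(triples, excluded):
    """Remove triples whose predicate the user chose to exclude."""
    return [t for t in triples if t[1] not in excluded]


def to_prolog(head, body):
    """Render a rule as Prolog text: head(A) :- p1(A, 'v1'), p2(A, 'v2')."""
    literals = ", ".join(f"{p}(A, '{v}')" for (p, v) in body)
    return f"{head}(A) :- {literals}."


# Invented data and rule.
triples = [
    ("movie_a", "director", "George Lucas"),
    ("movie_a", "year", "1977"),
]
usable = drop_predicates(triples, excluded={"director"})
rule_text = to_prolog("scifi", [("year", "1977")])
```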
Demo Available!
Gunnar Aastrand Grimnes, Alun Preece, Pete Edwards
Computing Science Department, University of Aberdeen, UK
Contact: ggrimnes_at_csd.abdn.ac.uk