What lies beneath? Building a semantic web-ready repository for complex collections

1
What lies beneath? Building a semantic web-ready
repository for complex collections
  • Louise Corti UKDA
  • Agostina Martinez, Patrick Carmichael, CARET,
    Cambridge

IASSIST 2009
2
The Ensemble Project
  • Semantic Technologies for the Enhancement of Case
    Based Learning
  • 3-year, £1.5 million ESRC/EPSRC project:
    research, development and implementation
    (2008-2011)
  • working with teachers and students in
    undergraduate and postgraduate courses to explore
    both the nature and role of the cases around
    which learning is focused
  • and the part that emerging semantic web
    technologies can play in supporting this learning
  • a big, happy, interdisciplinary and
    multi-institutional extended family
  • website: http://www.ensemble.ac.uk

3
Pedagogy
  • examining teaching and learning in complex,
    politically or ethically contentious, and
    rapidly-evolving fields where case-based learning
    is the pedagogical approach of choice
  • how do teachers and learners design, develop,
    describe and reconstruct cases, and how do these
    processes contribute to academic and professional
    outcomes?
  • the learning technologies need to be robust yet
    flexible enough to support teachers and learners
    as they grapple with complex situations and
    develop creative solutions
  • and they need to be able to easily access, adapt
    and manage their case-based learning: a
    pedagogical challenge!

4
The settings
  • where reflective processes allow learners to
    achieve the higher levels of understanding and
    capability that characterise the expert or the
    virtuoso
  • advanced undergraduate, taught postgraduate and
    professional development courses (6 groups)
  • teachers and learners are taking part in
    case-building activities in which semantic web
    tools and digital repositories are used to
    support engagement with rich case data
  • data are differently structured and represented,
    and alternative constructions of cases are
    possible

5
Technical aims
  • repurposing, reconfiguring and enhancing existing
    repositories and other data sources
  • making it easy to translate research data held in
    a repository for integration into applications
    which use semantic or 'Web 3.0' technologies
  • federated searches
  • visualisation tools
  • collaborative working environments
  • allow end-users to engage in flexible discovery,
    aggregation, representation and visualisation of
    data using topic maps, tag clouds, timelines and
    maps
  • VLEs and wikis to share data, interpretation and
    analysis

6
One Semantic Web Vision
Tim Berners-Lee's 2001 vision of the Semantic Web:
personalisation of services through seamless
integration of web-based systems
"At the doctor's office, Lucy instructed her
Semantic Web agent through her handheld Web
browser. The agent promptly retrieved information
about Mom's prescribed treatment from the
doctor's agent, looked up several lists of
providers, and checked for the ones in-plan for
Mom's insurance within a 20-mile radius of her
home and with a rating of excellent or very good
on trusted rating services. It then began trying
to find a match between available appointment
times supplied by the agents ..."
(Berners-Lee et al., 2001)
"The general tone is not unlike that of upbeat
1950s films about the promise of futuristic
kitchens, full of labour-saving devices and
intelligent fridges"
Source: Stellman & Greene
7
Our semantic web application
  • Backend: archiving systems and tools for data
    management
  • digital repositories and libraries, with data
    and/or metadata in differing formats
  • Web services: lookups, converters, searches (i.e.
    external data providers)
  • Middleware: data aggregation and semantic data
    management
  • Triplestore: large data aggregators containing
    data, metadata, vocabularies, ontologies and sets
    of rules
  • Endpoints and APIs to allow querying the
    Triplestore
  • Frontend: presentation and visualization of data
  • Web interfaces, portals, visualization tools,
    personal information managers

8
A semantic web application
9
The technologies we are using
  • Our back-end repository: Fedora
  • open source digital repository framework
  • specifically oriented towards supporting semantic
    web applications (Fedora 3.0 represents a major
    upgrade)
  • stores digital objects and manages external
    references
  • enforces no specific collection structure and
    allows multiple metadata schemes to be used to
    describe specific resources

10
Fedora's Semantic Web potential
  • also allows in-line RDF semantic data to be
    stored in a digital object
  • these can be streamed directly to other
    applications
  • can search across the repository using exposed
    metadata AND semantic information if present
  • relationships among digital resources need to be
    defined to enable this, e.g. as DDI3 is doing

11
Data out
  • convert data to RDF/XML using an RDFizer
  • Triplify or RDF123
  • e.g. Excel to RDF, PDF to RDF and so on
  • metadata record (in RDF/XML) accompanies data
    with permanent address to dataset
  • using the Fedora Resource Index module to index
    relationships among objects (contained in the
    inline RDF datastreams - RDF/XML)
  • now available to aggregators, triplestores,
    reasoners
  • we are storing and synchronising the metadata in
    every object into a Mulgara Triplestore
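
The conversion step above can be illustrated with a small sketch. It is not RDF123 or any particular RDFizer, only a minimal stand-in that turns spreadsheet-style rows (here CSV text) into RDF/XML with Dublin Core properties. The base address, the column names and the sample row are assumptions made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)

# Hypothetical permanent-address prefix; not taken from the slides.
BASE_URI = "http://example.org/dataset/"


def rows_to_rdfxml(csv_text: str) -> str:
    """Turn each CSV row into one rdf:Description with Dublin Core properties."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for row in csv.DictReader(io.StringIO(csv_text)):
        description = ET.SubElement(
            root,
            f"{{{RDF_NS}}}Description",
            {f"{{{RDF_NS}}}about": BASE_URI + row["id"]},
        )
        # Only columns that have a Dublin Core counterpart are exported.
        for column in ("title", "creator", "date"):
            if row.get(column):
                ET.SubElement(description, f"{{{DC_NS}}}{column}").text = row[column]
    return ET.tostring(root, encoding="unicode")


# Made-up sample data standing in for an exported spreadsheet.
SAMPLE = "id,title,creator,date\n001,Interview transcript,A. Researcher,2009\n"
rdf_xml = rows_to_rdfxml(SAMPLE)
print(rdf_xml)
```

Each row becomes one described resource whose address is built from the row identifier, which mirrors the idea of a metadata record carrying a permanent address to the dataset.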

12
Triplify
a small plugin which reveals the semantic
structures encoded in relational databases by
making database content available as RDF, JSON or
Linked Data
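
The general idea of exposing relational content as triples can be sketched as follows. This is not Triplify's own code or configuration, only an illustration of the row-to-triple mapping: each row becomes a subject, each column a predicate, each cell a value. The namespace, table and data are invented for the example.

```python
import sqlite3

# Hypothetical namespace for subjects and predicates.
BASE = "http://example.org/"


def table_to_triples(conn, table, key):
    """Yield (subject, predicate, object) for every non-key cell of a table.

    The table name is interpolated directly, so it must come from trusted
    configuration, never from user input.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"
        for column, value in record.items():
            if column != key and value is not None:
                yield (subject, f"{BASE}vocab/{column}", str(value))


# In-memory database with one made-up row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE study (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.execute("INSERT INTO study VALUES (1, 'Example study', 1998)")

triples = list(table_to_triples(conn, "study", "id"))
for triple in triples:
    print(triple)
```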
13
Mulgara Semantic Triplestore
  • is a large database optimised for very rapid
    searching and pattern matching
  • It does this by rendering all data into triples:
    records of information in the form subject -
    predicate - object
  • e.g. URL - property of the resource - value of
    that property
  • can be used to describe connectedness of objects
  • a single bibliographic record is represented by
    about 10-20 triples
  • a Triplestore can contain hundreds of millions of
    triples
  • N3 format (Notation3) is a compact and readable
    alternative to RDF's XML syntax
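
A toy model may help make the triple idea concrete. The sketch below is not Mulgara and says nothing about how Mulgara is implemented; it is a deliberately tiny in-memory store, with invented record addresses and values, showing how a pattern with wildcards picks out connected objects.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TinyStore:
    """A toy triple store: a set of triples plus wildcard pattern matching."""

    def __init__(self) -> None:
        self.triples = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.add(Triple(subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return matching triples in sorted order; None acts as a wildcard."""
        return sorted(
            t
            for t in self.triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.obj == obj)
        )


store = TinyStore()
# A made-up bibliographic record spread over several triples.
store.add("http://example.org/record/1", "dc:title", "Oral history interviews")
store.add("http://example.org/record/1", "dc:creator", "A. Researcher")
store.add("http://example.org/record/1", "dc:date", "2009")
store.add("http://example.org/record/2", "dc:creator", "A. Researcher")

# Connectedness: which records share this creator?
by_creator = store.match(predicate="dc:creator", obj="A. Researcher")
print([t.subject for t in by_creator])
```

A real triplestore indexes the triples so that such patterns stay fast over hundreds of millions of entries; the linear scan here is only for illustration.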

14
SPARQL endpoints
  • emerging W3C standard for semantic data
    management, aggregation and selection, and for
    querying semantic triplestores
  • exploration of SPARQL as a basis for user
    interaction with data sets and a means of
    exposing repository content for querying, reuse
    and repurposing
  • we have implemented it as a set of predefined
    queries running across the Triplestore
  • results are formatted on the fly for the
    visualisation tools at hand
  • with SPARQL, Web applications can be
    constructed without extensive additional
    templating or scripting - 'lowering the bar'
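
One way to picture a set of predefined queries is a small table of named SPARQL templates that are filled in and sent to an endpoint. The sketch below only builds the request URL and makes no network call. The endpoint address, the query name and the Dublin Core pattern are assumptions for illustration, not the project's actual queries.

```python
from string import Template
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint address; the real one is not given in the slides.
ENDPOINT = "http://example.org/sparql"

# Named, predefined queries with placeholders for the variable parts.
PREDEFINED = {
    "titles_by_creator": Template("""\
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?item ?title
WHERE {
  ?item dc:creator "$creator" ;
        dc:title ?title .
}
ORDER BY ?title
LIMIT $limit"""),
}


def escape_literal(value: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_request_url(name: str, creator: str, limit: int = 10) -> str:
    """Fill in a predefined query and encode it as a GET request URL."""
    query = PREDEFINED[name].substitute(
        creator=escape_literal(creator), limit=int(limit)
    )
    # The SPARQL protocol passes the query text in the "query" parameter.
    return ENDPOINT + "?" + urlencode({"query": query})


url = build_request_url("titles_by_creator", "A. Researcher", limit=5)
print(url)
# Decode again to show the query text that would reach the endpoint.
print(parse_qs(urlsplit(url).query)["query"][0])
```

Keeping the queries predefined means the front end only chooses a name and supplies values, which fits the aim of building applications without extensive scripting.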

15
Fedora Configuration 3: Custom search
[Diagram. Labels: FEDORA, OAI-PMH Feed, DC, API-A GET, RELS-EXT, XLS, Mulgara Triplestore, Inline RDF, Custom Search]
16
Visualisation tools
  • Using the SIMILE toolkit, based at MIT and
    supported by W3C and Hewlett-Packard Labs
  • SIMILE tools
  • the customisable browser Longwell aggregates RDF
    content from multiple sources and presents it
    through a faceted browser
  • can then display through catalogues, maps,
    timelines, network views, e.g. using Web widgets
    such as SIMILE's Exhibit, geo representations and
    Timeline
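
Formatting results for a widget can be sketched as a regrouping step: triples that share a subject are folded into one item. The layout below, a top-level "items" list in which each item carries a "label", follows the general shape of Exhibit-style JSON, but the exact requirements should be checked against the Exhibit documentation. The property names and sample data are made up.

```python
import json
from collections import defaultdict


def triples_to_items(triples, label_predicate="title"):
    """Group (subject, predicate, object) triples into one item per subject."""
    grouped = defaultdict(dict)
    for subject, predicate, obj in triples:
        grouped[subject].setdefault(predicate, []).append(obj)

    items = []
    for subject, props in grouped.items():
        # Fall back to the subject itself when no label value is present.
        item = {"id": subject, "label": props.get(label_predicate, [subject])[0]}
        for predicate, values in props.items():
            # Single values stay scalar; repeated predicates become lists.
            item[predicate] = values[0] if len(values) == 1 else values
        items.append(item)
    return {"items": items}


# Invented sample triples using short names instead of full URIs.
sample = [
    ("rec1", "title", "Interview A"),
    ("rec1", "date", "1975"),
    ("rec1", "place", "Cambridge"),
    ("rec1", "place", "Colchester"),
    ("rec2", "title", "Interview B"),
]
exhibit_data = triples_to_items(sample)
print(json.dumps(exhibit_data, indent=2))
```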

17
18
What Kinds of Questions?
  • What is the Latin name for Aleppo Pine?
  • What does an Aleppo Pine look like?
  • How do Aleppo Pines reproduce?
  • Show me a map of their distribution
  • Is this a picture of an Aleppo Pine?
  • Tell me about Aleppo Pines
  • Show me examples of plants which frequently
    inhabit the same environment as Aleppo Pines
  • What insect life do Aleppo Pines support?
  • What do people from Aleppo call Aleppo Pines?

Source: PlantWiki
19
Geo visualisation
20
Exhibit faceted browsing
21
Interactivity and creativity
  • encourages students to experiment, construct
    their own evidence-based cases
  • appreciate new data sources, be more adventurous,
    have more fun!
  • discuss findings with fellows using social
    networking tools and so on
  • and give back newly constructed datasets

22
Summary
  • Fedora Digital Repository provides a framework to
    store large and heterogeneous data
  • not only access to the metadata descriptions but
    access to the data itself
  • data structured and defined in semantic-ready
    format
  • triplestores like Mulgara make it possible to
    aggregate and reason across different data sources
  • visualization and presentation tools
  • process semantic-ready data and present the
    information in different formats

The Ensemble Project. 2009
23
Implications for the likes of us?
  • access to generically applicable and
    well-documented tools (scripts, APIs) in an
    open-access Tools Library
  • need help implementing such tools using the
    experience of existing implementers
  • We need to know:
  • what technical skills does one need and what will
    it cost?
  • how much manual data manipulation needs to be
    done
  • how easy is it to integrate these tools into
    existing systems and platforms e.g. VREs and
    VLEs?
  • and so on