Extracting RDF Data from Unstructured Sources Based on an RDF Target Schema


1
Extracting RDF Data from Unstructured Sources
Based on an RDF Target Schema
  • Tim Chartrand
  • Research Supported By NSF

2
Motivation
  • Semantic Web: a global, machine-understandable
    knowledge base
  • WWW: lots of information/data designed for human
    consumption
  • DEG contribution: extract data from the
    human-readable web
  • Proposed solution: extract WWW data and
    structure it in the Semantic Web format (RDF)

3
Overview of Proposed Research
[Diagram with components: User, HTML Page, RDF Schema, Extraction Ontology, Extraction Engine, Relational Data, RDF Data]
4
RDF: What is it?
  • Resource Description Framework
  • Language of the Semantic Web
  • Set of subject-predicate-object triples
  • (tim.html, creator, tim), (tim.html, type,
    thesis)
  • <RDF>
  • <Description about="tim.html">
  • <Creator>Tim</Creator>
  • <Type>Thesis</Type>
  • </Description>
  • </RDF>
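
The triple model above can be sketched in a few lines of Python. This is only an illustration and not part of the presentation; the helper name `objects_of` is hypothetical.

```python
# Each RDF statement is a (subject, predicate, object) triple.
triples = [
    ("tim.html", "creator", "Tim"),
    ("tim.html", "type", "Thesis"),
]

def objects_of(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects_of(triples, "tim.html", "creator"))  # ['Tim']
```

Storing statements as plain tuples keeps the graph view visible: each tuple is one labeled arc from subject to object.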

[Graph: node tim.html with a creator arc to Tim and a type arc to Thesis]
5
RDF Schema Basics
  • Core Concepts
  • rdfs:Class: the usual concept of a class.
  • Ex. Class Person
  • rdfs:subClassOf: specifies the generalization of
    a class
  • Ex. Class Teacher is subClassOf Person
  • rdf:Property: can apply to a class and has a
    value.
  • Ex. Class Person has property Name
  • rdfs:domain: classes to which a property can
    apply.
  • Ex. Property Name has domain Person
  • rdfs:range: possible values of a property.
  • Ex. Property Name has range Literal
  • rdfs:subPropertyOf: specifies the generalization
    of a property
  • Ex. Property Nickname is subPropertyOf Name
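
A small sketch of how these core concepts interact, using the Person/Teacher and Name/Nickname examples from this slide. The dictionaries and the helper names `ancestors` and `applies_to` are assumptions made for illustration, and each class or property is assumed to have at most one parent, which RDF Schema itself does not require.

```python
# Schema statements from the slide, stored as child -> parent maps.
sub_class_of = {"Teacher": "Person"}
sub_property_of = {"Nickname": "Name"}
domain = {"Name": "Person"}
range_ = {"Name": "Literal"}

def ancestors(term, parent_of):
    """Follow generalization links upward and return all ancestors of term."""
    found = []
    while term in parent_of:
        term = parent_of[term]
        if term in found:  # guard against cyclic schemas
            break
        found.append(term)
    return found

def applies_to(prop, cls):
    """True if prop, or a property it specializes, has cls or an ancestor of cls as domain."""
    props = [prop] + ancestors(prop, sub_property_of)
    classes = [cls] + ancestors(cls, sub_class_of)
    return any(domain.get(p) in classes for p in props)

print(applies_to("Nickname", "Teacher"))  # True
```

Here Nickname applies to Teacher because Nickname specializes Name, Name has domain Person, and Teacher is a subclass of Person.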

6
Example RDF Schema
  • Full Schema
  • <rdfs:Class rdf:ID="Person">
  • </rdfs:Class>
  • <rdfs:Class rdf:ID="Funeral">
  • </rdfs:Class>
  • <rdf:Property rdf:ID="PFuneral">
  • <rdfs:domain rdf:resource="#Person"/>
  • <rdfs:range rdf:resource="#Funeral"/>
  • </rdf:Property>
  • <rdf:Property rdf:ID="Name">
  • <rdfs:domain rdf:resource="#Person"/>
  • <rdfs:range rdf:resource="&rdfs;Literal"/>
  • </rdf:Property>
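
As a rough sketch of how a program might read such a schema, the following uses Python's standard `xml.etree.ElementTree`. The enclosing `rdf:RDF` element and namespace declarations are added here so that the fragment parses, the Literal range is written as a full URI instead of an entity reference, and the function name `read_schema` is hypothetical. It is not the presentation's implementation.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# The slide's schema, wrapped in rdf:RDF with namespace declarations (assumed).
SCHEMA = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:ID="Person"/>
  <rdfs:Class rdf:ID="Funeral"/>
  <rdf:Property rdf:ID="PFuneral">
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:range rdf:resource="#Funeral"/>
  </rdf:Property>
  <rdf:Property rdf:ID="Name">
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:range rdf:resource="{RDFS}Literal"/>
  </rdf:Property>
</rdf:RDF>
"""

def read_schema(xml_text):
    """Return (class names, {property name: (domain, range)}) from RDF/XML."""
    root = ET.fromstring(xml_text)
    classes = [c.get(f"{{{RDF}}}ID") for c in root.findall(f"{{{RDFS}}}Class")]
    properties = {}
    for prop in root.findall(f"{{{RDF}}}Property"):
        name = prop.get(f"{{{RDF}}}ID")
        dom = prop.find(f"{{{RDFS}}}domain").get(f"{{{RDF}}}resource")
        rng = prop.find(f"{{{RDFS}}}range").get(f"{{{RDF}}}resource")
        # Local references look like "#Person"; drop the leading "#".
        properties[name] = (dom.lstrip("#"), rng.lstrip("#"))
    return classes, properties

classes, properties = read_schema(SCHEMA)
print(classes)
print(properties["PFuneral"])
```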

7
RDF Schema Graph
8
Extraction Ontology
  • Ontology Structure
  • Classes map to object sets
  • Properties map to binary relationship sets
    between classes
  • Literal properties map to relationship sets
    between classes and lexical data frames
  • Primary object and constraints: best guess based
    on heuristics
  • Data Frames
  • Need a data frame library
  • Match properties with data frame library
  • Specialize the property data frames

9
User Modification
  • Cardinality Constraints
  • Allow the user to edit any of the generated
    constraints
  • Keep track of changes, since they affect the database schema
  • Data Frames
  • Provide a data frame editor
  • Allow user to modify the specialized data frames
  • Usually only add key words
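
A minimal sketch of the most common edit, adding keywords to a specialized data frame. The dictionary layout, the example keywords, and the function name `add_keywords` are all hypothetical; the presentation describes an interactive editor, not this function.

```python
def add_keywords(data_frame, *keywords):
    """Return a copy of a data frame with extra context keywords, without duplicates."""
    merged = list(data_frame["keywords"])
    for kw in keywords:
        if kw not in merged:
            merged.append(kw)
    return {**data_frame, "keywords": merged}

# Hypothetical specialized data frame for a funeral-time property.
funeral_time = {"pattern": r"\d{1,2}:\d{2} ?[ap]\.m\.", "keywords": []}
edited = add_keywords(funeral_time, "funeral", "services", "funeral")
print(edited["keywords"])  # ['funeral', 'services']
```

Returning a copy leaves the generated frame untouched, so the original and the user-edited versions can be compared when tracking changes.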

10
Input Web Page
11
Relational Data
12
Extracted RDF Data
  • Full RDF
  • <obit:Person rdf:ID="1001"
  • obit:Name="Lemar K. Adamson"
  • >
  • <obit:Funeral rdf:resource="#5001" />
  • </obit:Person>
  • <obit:Funeral rdf:ID="5001"
  • obit:FuneralAddress="1540 E. Linden"
  • obit:FuneralDate=""
  • obit:FuneralTime="10:00 a.m.">
  • </obit:Funeral>
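
The step from relational rows to RDF like the above could look roughly like this. The row layout (an `id` column plus one column per property, with foreign keys naming the linked row) is an assumption, as is the function name `row_to_rdf`; the presentation does not show its relational schema.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical relational rows matching the extracted obituary data.
persons = [{"id": "1001", "Name": "Lemar K. Adamson", "Funeral": "5001"}]
funerals = [{"id": "5001", "FuneralAddress": "1540 E. Linden",
             "FuneralDate": "", "FuneralTime": "10:00 a.m."}]

def row_to_rdf(cls, row, links=()):
    """Serialize one row as an RDF/XML element in the obit namespace.

    Columns listed in links are foreign keys and become rdf:resource
    references; all other columns become literal-valued attributes.
    """
    attrs = [f"rdf:ID={quoteattr(row['id'])}"]
    attrs += [f"obit:{col}={quoteattr(val)}"
              for col, val in row.items()
              if col != "id" and col not in links]
    children = [f"  <obit:{col} rdf:resource={quoteattr('#' + row[col])} />"
                for col in links]
    lines = [f"<obit:{cls} " + " ".join(attrs) + ">"] + children + [f"</obit:{cls}>"]
    return "\n".join(lines)

print(row_to_rdf("Person", persons[0], links=("Funeral",)))
print(row_to_rdf("Funeral", funerals[0]))
```

Because every column maps mechanically to either an attribute or a resource reference, no user input is needed at this stage, which is consistent with the claim that this conversion can be automated.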

13
RDF Data Graph
14
Conclusions
  • Converting RDF Schemas to Data Extraction
    Ontologies can be done with some user
    interaction.
  • The nature and amount of user interaction
    necessary for good data extraction is a
    promising topic for further research
  • Converting relational data to RDF data can be
    done automatically