Annotation for the Semantic Web - PowerPoint PPT Presentation

Slides: 24
Provided by: tang1
Learn more at: https://tango.byu.edu

Transcript and Presenter's Notes

1
Annotation for the Semantic Web
  • Yihong Ding
  • A PhD Research Area Background Study

2
Introduction
  • Current web is designed for humans
  • Semantic web (next-generation web) is designed
    for both humans and machines
  • Semantic annotation
  • Disclose semantic meanings of web content
  • Convert current HTML web pages to
    machine-understandable semantic web pages

3
Outline
  • Historical Review
  • Current Status
  • Related Research Fields
  • Future Challenges

4
Semantic Annotation in Ancient Ages
  • No evidence of when humans started to annotate text

[Figure labels: history of semantic annotation; history of ontologies]

5
The First Dream of Modern Semantic Annotation
  • July 1945, Vannevar Bush, As We May Think, The
    Atlantic Monthly
  • Bush's dream device
  • humans could acquire information (World Wide Web)
  • humans could contribute their own ideas (Web
    Annotation)
  • from/to the community

6
Web Annotation before 1999
  • Heck et al., 1999
  • Developing better user interfaces
  • Improving storage structures
  • Increasing annotation sharability
  • Example systems
  • ComMentor, AnnotatorTM, Third Voice, CritLink,
    CoNote, and Futplex

7
Semantic Labeling before 1999
  • Dublin Core Metadata Standard (http://dublincore.org/)
  • A set of 15 elements encapsulates data: Title, Subject,
    Description, Creator, Publisher, Contributor, Date, Type,
    Format, Identifier, Source, Language, Relation, Coverage,
    Rights
  • Superimposed Information (Delcambre et al., 2001)

[Figure: a Superimposed Layer whose marks point into a Base Layer of
Information Source 1, Information Source 2, ..., Information Source n]

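As a rough illustration of how the 15 Dublin Core elements can label a page, the sketch below renders a record as HTML meta tags. The "DC." name prefix follows the common convention for embedding Dublin Core in HTML; the sample record and the helper name to_meta_tags are assumptions made for this example, not part of the presentation.

```python
from html import escape

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "subject", "description", "creator", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_meta_tags(record):
    """Render a dict of Dublin Core values as HTML <meta> tags.

    Keys outside the 15-element set are rejected; elements that
    have no value in the record are skipped.
    """
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return [
        f'<meta name="DC.{name}" content="{escape(record[name], quote=True)}">'
        for name in DC_ELEMENTS
        if name in record
    ]


# Hypothetical record describing this presentation.
sample = {
    "title": "Annotation for the Semantic Web",
    "creator": "Yihong Ding",
    "language": "en",
}

for tag in to_meta_tags(sample):
    print(tag)
```

Tags are emitted in the fixed order of DC_ELEMENTS, so the output does not depend on the order in which the record was filled in.
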
8
Status of Current Web Semantic Annotation Studies
  • Interactive annotation
  • Automatic annotation

9
Interactive Annotation Systems
  • Lets humans interact through machine interfaces
    to annotate documents
  • Problems
  • Inconsistency
  • Error-proneness
  • Lack of scalability
  • Values
  • Easy to implement
  • Suitable for small-scale tasks and experiments
  • Helpful to build corpora for evaluations

10
Interactive Annotation Systems
  • Annotea (Kahan et al., 2001)
  • W3C project
  • An open RDF infrastructure for shared web
    annotations
  • SHOE (Simple HTML Ontology Extensions) (Heflin
    et al., 2000)
  • University of Maryland, College Park
  • Manual annotator using SHOE ontologies

11
Automatic Annotation Systems
  • Common feature: use of ontologies
  • Typical approaches
  • Annotation with automatic ontology generation (1
    system)
  • Annotation with automatic information extraction
    (6 systems)

12
Annotation with Ontology Generation
  • SCORE (Semantic Content Organization and
    Retrieval Engine) (Sheth et al., 2002)
  • Voquette (since acquired by Semagix),
    University of Georgia

13
Annotation with Automatic IE
  • Ont-O-Mat (Handschuh et al., 2002)
  • University of Karlsruhe, Germany
  • MnM (Vargas-Vera et al., 2002)
  • The Open University, United Kingdom
  • Common features
  • DAML+OIL ontologies
  • Supervised adaptive learning with Lazy-NLP
    (Amilcare)
  • Annotation stored inside web pages
  • Differences
  • MnM allows multiple ontologies at one time
  • MnM also stores annotations in a knowledge base
  • Ont-O-Mat uses OntoBroker both as an annotation
    server and as a reasoning engine

14
Annotation with Automatic IE
  • KIM Platform (Kiryakov et al., 2004)
  • Ontotext Lab., Sirma Group, a Canadian-Bulgarian
    joint venture
  • SemTag (Dill et al., 2003)
  • IBM Almaden Research Center
  • Similar features
  • Each uses one specially designed upper-level
    ontology: KIM ontology vs. TAP ontology
  • Specific features
  • KIM uses an NLP tool (GATE) to extract
    information
  • KIM stores annotations in a separate file
  • SemTag uses inductive learning to extract
    information
  • SemTag annotated 264 million Web pages and
    generated approximately 434 million semantic tags

15
Annotation with Automatic IE
  • Stony Brook Annotator (Mukherjee et al., 2003)
  • Stony Brook University
  • Structural analysis of DOM tree for HTML pages
  • Drawbacks
  • Taxonomic relationships only
  • No generic labeling algorithm disclosed
  • RoadRunner Labeller (Arlotta et al., 2003)
  • Università di Roma Tre and Università della
    Basilicata
  • Automatically assigns label names based on image
    recognition
  • Drawbacks
  • Semantic meaning of labels unknown
  • Difficulty in associating labels with ontologies
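
Structural analysis of a DOM tree can be sketched in a few lines. The example below is not the Stony Brook algorithm (as noted above, no generic labeling algorithm was disclosed); it is a minimal illustration, using only Python's standard html.parser module, of one structural cue: text nodes that share the same root-to-leaf tag path are likely instances of the same concept. The class name PathCollector and the sample page are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class PathCollector(HTMLParser):
    """Group text nodes by the tag path that leads to them."""

    # Elements that never have a closing tag, so they are not pushed.
    VOID = {"br", "hr", "img", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.stack = []                   # currently open tags
        self.groups = defaultdict(list)   # tag path -> text nodes

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.groups["/".join(self.stack)].append(text)


# Hypothetical page with two repeated records.
page = """
<html><body>
<h1>Used cars</h1>
<ul>
  <li><b>Honda Civic</b> <i>$4,500</i></li>
  <li><b>Ford Focus</b> <i>$3,900</i></li>
</ul>
</body></html>
"""

collector = PathCollector()
collector.feed(page)
collector.close()
for path, texts in collector.groups.items():
    print(path, texts)
```

The grouping finds which strings belong together but says nothing about what they mean, which mirrors the drawbacks listed above: a path such as html/body/ul/li/i still has to be associated with an ontology concept such as a price.
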

16
Related Research Fields
  • Semantic Web
  • Information extraction
  • Ontology related topics
  • Conceptual modeling
  • Logic languages
  • Web services

17
Semantic Web
  • Weaving the Web (Berners-Lee, 1999): birth of the
    Semantic Web
  • The Semantic Web (Berners-Lee et al., 2001)

18
Information Extraction (Laender et al., 2002)
  • Human-guided approaches
  • Wrapper languages, Modeling-based tools
  • No annotation examples
  • Too much human involvement
  • Non-ontology-based approaches
  • HTML-aware tools: Stony Brook tool (Mukherjee et al.,
    2003), RoadRunner Labeller (Arlotta et al., 2003)
  • NLP-based tools: Ont-O-Mat (Handschuh et al., 2002),
    MnM (Vargas-Vera et al., 2002), KIM platform
    (Kiryakov et al., 2004)
  • ILP-based tools: SemTag (Dill et al., 2003)
  • Require extra alignment between extraction
    categories in wrappers and concepts in ontologies
  • Ontology-based approaches
  • Ontology-based tools (my proposal)
  • Require no alignment; resilient to web page
    layouts
  • Slow in execution time
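
The points about alignment and layout resilience can be illustrated with a toy extraction ontology in which each concept carries its own recognizer. This is a minimal sketch under stated assumptions, not the proposed system: the concept names, the regular expressions, and the annotate helper are all hypothetical.

```python
import re
from html.parser import HTMLParser

# Toy extraction ontology: concept name -> recognizer for its instances.
ONTOLOGY = {
    "Price": re.compile(r"\$\d{1,3}(?:,\d{3})*(?:\.\d{2})?"),
    "Year": re.compile(r"\b(?:19|20)\d{2}\b"),
    "Make": re.compile(r"\b(?:Honda|Ford|Toyota)\b"),
}


class TextOnly(HTMLParser):
    """Discard markup so the recognizers see only the page text."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def annotate(html):
    """Return sorted (concept, matched text) pairs found in the page text."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return sorted(
        (concept, match.group())
        for concept, pattern in ONTOLOGY.items()
        for match in pattern.finditer(text)
    )


# The same facts in two different layouts.
as_table = "<table><tr><td>1999 Honda</td><td>$4,500</td></tr></table>"
as_list = "<ul><li>Honda, 1999: <b>$4,500</b></li></ul>"

print(annotate(as_table))
print(annotate(as_table) == annotate(as_list))
```

Because every match is already tagged with an ontology concept, no separate alignment step is needed, and both layouts yield the same annotations. The cost is that every recognizer runs over every page, which is consistent with the slow execution time noted above.
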

19
Ontology Related Topics
  • Ontology languages (W3C, OWL)
  • Knowledge representation and reasoning
  • Ontology generation (Ding et al., 2002a)
  • Annotation domain specification
  • Ontology enrichment (Parekh et al., 2004)
  • Expansion of the annotation domain specification
  • Ontology population (Alani et al., 2003)
  • Annotation result output
  • Ontology mapping and merging (Ding et al.,
    2002b)
  • Large-scale annotation requires large-scale
    ontologies
  • Small-scale ontologies are less expensive to
    build
  • Ontology mapping creates the links among
    small-scale ontologies
  • Ontology merging fuses small-scale ontologies
    into a large-scale ontology
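
The difference between mapping and merging can be made concrete with a toy example. The sketch below models each small ontology as a set of (child, parent) is-a edges; the two ontologies, the concept names, and the helpers merge and ancestors are hypothetical, and discovering a real mapping is far harder than writing one by hand as done here.

```python
# Two small ontologies, each a set of (child, parent) is-a edges.
cars = {("Sedan", "Car"), ("Car", "Vehicle")}
autos = {("Truck", "Automobile"), ("Automobile", "Product")}

# Mapping: a link between equivalent concepts (autos concept -> cars concept).
# The two ontologies themselves stay separate.
mapping = {"Automobile": "Car"}


def merge(target, source, mapping):
    """Fuse `source` into `target`, renaming mapped concepts to target names."""
    return target | {
        (mapping.get(child, child), mapping.get(parent, parent))
        for child, parent in source
    }


def ancestors(edges, concept):
    """All concepts reachable from `concept` by following is-a edges upward."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for child, parent in edges:
            if child == current and parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


merged = merge(cars, autos, mapping)
print(sorted(ancestors(autos, "Truck")))   # ['Automobile', 'Product']
print(sorted(ancestors(merged, "Truck")))  # ['Car', 'Product', 'Vehicle']
```

After merging, a query about Truck also reaches Vehicle, a concept that only the other small ontology knew about; with the mapping alone, the same answer would require following the link between the two ontologies at query time.
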

20
Conceptual Modeling
  • Annotation requires knowledge modeling
  • Ontology is a type of conceptual modeling
  • ER Model (Chen, 1976)
  • The most influential conceptual model
  • Influenced the OSM model, the basis of
    data-extraction ontologies

21
Logic Languages
  • A logical foundation gives modeling languages
    reasoning and inference power
  • Examples
  • First-order logic (Smullyan, 1995)
  • Description logics (Brachman et al., 1984)

22
Web Services
  • Web services are increasingly the typical
    application in Semantic Web scenarios
  • Two ways of aligning web services with semantic
    annotation
  • Web service annotation (Brodie, 2003)
  • Semantic annotation web service

23
Summary and Future Challenges
  • Annotation for the semantic web
  • Enable machine-understandable web
  • Support semantic searching
  • Support worldwide web services
  • Still an unsolved problem
  • Main technical challenges
  • Direct ontology-driven annotation mechanism
  • Concept disambiguation
  • Automatic domain ontology generation
  • Scalability