Mining the Semantic Web: Requirements for Machine Learning

About This Presentation
Title:

Mining the Semantic Web: Requirements for Machine Learning

Description:

Mining the Semantic Web: Requirements for Machine Learning. Fabio Ciravegna, Sam Chapman. Presented by Steve Hookway, 10/20/05. What is the Semantic Web: A way to automate ...


Transcript and Presenter's Notes


1
Mining the Semantic Web: Requirements for Machine Learning
  • Fabio Ciravegna, Sam Chapman
  • Presented by
  • Steve Hookway
  • 10/20/05

2
What is the Semantic Web?
  • A way to automate reasoning with web data
  • RDF
  • A uniform way to describe resources
  • (subject, predicate, object) triples
  • Ontology
  • Hierarchical structure of data
  • Property restrictions
  • Implicit typing
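
To make the triple model concrete, here is a minimal sketch in Python. It is not taken from the presentation: the `ex:` resources are hypothetical, and treating "implicit typing" as types inferred through `rdfs:subClassOf` is one possible reading of the bullet above.

```python
# Hypothetical example data; every "ex:" name is illustrative only.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Lecturer"),
    ("ex:Lecturer", "rdfs:subClassOf", "ex:Person"),
    ("ex:alice", "ex:worksAt", "ex:ExampleUniversity"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def types_of(triples, resource):
    """Collect declared types plus the types implied by rdfs:subClassOf."""
    found = set()
    frontier = [o for _, _, o in match(triples, s=resource, p="rdf:type")]
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.add(cls)
            # Walk up the class hierarchy one step at a time.
            frontier.extend(
                o for _, _, o in match(triples, s=cls, p="rdfs:subClassOf")
            )
    return found
```

Here `types_of(TRIPLES, "ex:alice")` also yields `ex:Person`, even though no triple states that type directly.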

3
Adding Meta-Data
  • A prerequisite for the Semantic Web (SW) is
    structured knowledge
  • Manual Approach
  • Too much data
  • Trust Issues
  • Noise
  • This process needs to be automated

4
Armadillo
  • Automatically annotate web pages
  • Validity based on a number of weak techniques
  • Redundant Information
  • Rating of Sources
  • Context around a capture
  • (LP)² - Extraction of knowledge
  • Makes use of Natural Language Processing (NLP)
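
The idea of judging validity from several weak signals can be sketched as follows. This is an illustration only, not Armadillo's actual scoring: the noisy-OR combination, the per-source ratings, and the threshold are all assumptions introduced here.

```python
from math import prod


def combined_confidence(source_ratings):
    """Noisy-OR: chance that at least one source is right, assuming independence.

    Each rating is an assumed reliability in [0, 1] for one source that
    reported the candidate fact.
    """
    return 1.0 - prod(1.0 - r for r in source_ratings)


def accept(candidates, threshold=0.8):
    """Keep candidate facts whose combined confidence reaches the threshold.

    `candidates` maps a fact to the ratings of the sources reporting it.
    """
    return {
        fact
        for fact, ratings in candidates.items()
        if combined_confidence(ratings) >= threshold
    }
```

Under these assumptions, a fact repeated by three mediocre sources (rating 0.5 each) can outrank a fact reported once by a better source (rating 0.6), which is the intuition behind relying on redundant information.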

5
(LP)²
  • Induce tagging rules
  • Generalize with NLP and keep the best rules (<tag>)
  • Remove covered instances from pool
  • High Precision, Low Recall
  • Contextual Tagging
  • Recovers rules and constrains their application
    (</tag>)
  • Correction and Validation
  • Shifts tags to the correct position (within d spaces)
  • Validation
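
The "keep best rules, remove covered instances" loop is a form of sequential covering, sketched below. This is a simplified illustration and not the (LP)² implementation: the candidate rules are hand-written and hypothetical, whereas the real algorithm generalizes them from tagged examples, and `min_precision` is an assumed parameter.

```python
def induce_rules(positives, negatives, candidate_rules, min_precision=0.9):
    """Sequential covering: keep the best rule, drop what it covers, repeat."""
    remaining = set(positives)
    kept = []
    while remaining:
        best, best_hits = None, set()
        for rule in candidate_rules:
            hits = {x for x in remaining if rule(x)}
            if not hits:
                continue
            errors = sum(1 for x in negatives if rule(x))
            precision = len(hits) / (len(hits) + errors)
            if precision >= min_precision and len(hits) > len(best_hits):
                best, best_hits = rule, hits
        if best is None:
            break  # leftover positives stay uncovered: high precision, low recall
        kept.append(best)
        remaining -= best_hits
    return kept, remaining


# Hypothetical instances: (previous word, next word) around a tag position.
POSITIVES = [("by", "Steve"), ("by", "Fabio"), ("Speaker:", "Sam"), ("with", "Ann")]
NEGATIVES = [("by", "car"), ("with", "care")]


def by_capitalized(x):
    return x[0] == "by" and x[1][0].isupper()


def after_speaker_label(x):
    return x[0] == "Speaker:"


def after_with(x):
    return x[0] == "with"


RULES = [by_capitalized, after_speaker_label, after_with]
kept, uncovered = induce_rules(POSITIVES, NEGATIVES, RULES)
```

With this toy data the imprecise `after_with` rule is rejected, so `("with", "Ann")` stays uncovered. That mirrors the high-precision, low-recall result that the contextual and correction steps are meant to improve.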

6
Heterogeneity
  • Armadillo
  • Uses weak NLP
  • Uses intra-document relation recognition
  • Requirements
  • Must adapt to different document types
  • Relation Extraction

7
Bootstrapping Learning
  • Armadillo
  • Unsupervised approach: user only validates
  • User cannot drive the system towards interesting
    documents and facts
  • Requirements
  • Identify triples
  • Goal: bootstrap learning on a large scale
  • User needs a role to guide learning
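
A bootstrapping loop with a user in the validation role might look like the sketch below. It is an assumption-laden toy, not Armadillo's procedure: contexts are reduced to the single preceding word, and the `validate` callback stands in for whatever role the user is given.

```python
def bootstrap(documents, seeds, validate, rounds=2):
    """Grow a set of known names from seeds using one-word left contexts."""
    known = set(seeds)
    for _ in range(rounds):
        # 1. Learn contexts: the word that precedes any known name.
        contexts = set()
        for doc in documents:
            words = doc.split()
            for prev, word in zip(words, words[1:]):
                if word in known:
                    contexts.add(prev)
        # 2. Propose candidates: unknown words that follow a learned context.
        candidates = set()
        for doc in documents:
            words = doc.split()
            for prev, word in zip(words, words[1:]):
                if prev in contexts and word not in known:
                    candidates.add(word)
        # 3. Let the validator (standing in for the user) filter candidates.
        accepted = {c for c in candidates if validate(c)}
        if not accepted:
            break
        known |= accepted
    return known


# Hypothetical documents and seed.
DOCS = [
    "Prof. Smith teaches logic",
    "Prof. Jones teaches ethics",
    "Dr. Jones met Dr. Brown",
    "see Prof. notes",
]
result = bootstrap(DOCS, {"Smith"}, validate=lambda w: w[0].isupper())
```

In this toy run the first round learns the context "Prof." and adds Jones; the second round learns "Dr." from Jones and adds Brown, while the validator rejects "notes".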

8
Content Cleaning and Normalization
  • Armadillo
  • Noise added during unsupervised (LP)²
  • Use multiple sources of weak evidence to help
    avoid poor seeds
  • Requirements
  • Handle noisy training data

9
Conclusion
  • Semantic Web
  • Meta-Data
  • Armadillo: a tool for information extraction (IE)
  • Evidence Building and Validation
  • Extraction of knowledge (LP)²
  • A survey of requirements in mining web content
    for SW meta-data