Data Extraction From HTML Tables

About This Presentation
Slides: 13
Provided by: Cui79
Learn more at: https://www.deg.byu.edu
Tags: html | data | extraction | pair | tables

Transcript and Presenter's Notes


1
Data Extraction From HTML Tables
  • Cui Tao
  • Department of Computer Science
  • Brigham Young University

2
Information In Tables
  • Nowadays, a significant portion of the information
    on the Web is stored in tables.

3
The Ontology-Based Extraction

4
The Ontology-Based Extraction

5
Major Problems
  • In tables, values and their corresponding
    attributes are stored separately, but the ontology
    can only extract data when they appear together.
  • Sometimes the attributes in the table are values
    in the database, and the values in the table
    serve only as identifiers for those attributes.
  • Sometimes the value in one cell of the table
    may supply several attribute values in the
    database.
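
The first problem can be seen by reading a table cell by cell. The following is a minimal sketch, assuming a simple table whose first row holds the attributes; the class name `TableCells`, the function `parse_table`, and the sample car data are hypothetical and not taken from the slides.

```python
from html.parser import HTMLParser


class TableCells(HTMLParser):
    """Collect the text of each <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")      # start a new cell in the current row
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data     # accumulate text inside the cell


def parse_table(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableCells()
    parser.feed(html)
    return [[cell.strip() for cell in row] for row in parser.rows]


# Hypothetical sample: the attributes sit in one row, the values in another.
html = (
    "<table>"
    "<tr><th>Year</th><th>Make</th></tr>"
    "<tr><td>1998</td><td>Honda</td></tr>"
    "</table>"
)
rows = parse_table(html)
print(rows[0])  # attribute row
print(rows[1])  # value row
```

In the parsed result the attribute "Year" and the value "1998" end up in different rows, so a rule that expects them side by side in the text has nothing to match.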

6
Attribute-Value Pair
  • Attribute: (part of the) constant/keyword rule

7
How To Solve This Problem?
  • Put the attribute-value pair together.
  • Try both orders.
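
One way to read these two steps is sketched below in Python. It assumes the table has already been split into a header row of attributes and data rows of values; the function name `pair_lines` and the sample data are hypothetical, and the slides do not specify the actual output format.

```python
def pair_lines(headers, rows):
    """Flatten a table into text lines, pairing each value with its attribute.

    Every pair is written in both orders ("attribute value" and
    "value attribute") so that a keyword rule can match whichever
    order it expects.
    """
    lines = []
    for row in rows:
        for attribute, value in zip(headers, row):
            lines.append(f"{attribute} {value}")  # attribute first
            lines.append(f"{value} {attribute}")  # value first
    return lines


# Hypothetical sample table.
headers = ["Year", "Make", "Price"]
rows = [["1998", "Honda", "$4,500"]]
for line in pair_lines(headers, rows):
    print(line)
```

Writing both orders doubles the text, but it avoids having to know in advance whether a given rule expects the keyword before or after the constant.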

8
More General

9
Attribute Value
  • The attributes in the table are actually values
    in the database

10
How To Solve This Problem?
  • Put the attribute in the file depending on the
    Boolean value.
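
A minimal Python sketch of this step, assuming each cell holds a yes/no style marker and the column header is the value to extract. The set `TRUE_MARKERS`, the function `attributes_from_flags`, and the sample feature names are hypothetical; the slides do not say which spellings count as true.

```python
# Assumed spellings of a "true" cell; a real table may use others.
TRUE_MARKERS = {"yes", "y", "true", "x", "1"}


def attributes_from_flags(headers, row):
    """Return the attributes whose cell holds a true-like Boolean marker."""
    return [
        attribute
        for attribute, cell in zip(headers, row)
        if cell.strip().lower() in TRUE_MARKERS
    ]


# Hypothetical sample row: the headers are the values we want to keep.
headers = ["Air Conditioning", "Sunroof", "CD Player"]
row = ["Yes", "No", "Yes"]
print(attributes_from_flags(headers, row))
```

Only the attributes whose cell is true are written out, so the extraction step sees them as ordinary values.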

11
Value Multiple Information

12
More Problems