Toward Automatic Processing and Indexing of Microfilm

Provided by: drdavid59
Learn more at: https://www.deg.byu.edu

1
Toward Automatic Processing and Indexing of
Microfilm
2
Microfilm Processing
Images are scanned from ribbons of microfilm.
Each image on the microfilm ribbon is then
cropped and de-skewed.
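
The cropping step can be illustrated with a minimal sketch. The slides do not say how the crop is computed, so this assumes a grayscale frame stored as a list of rows and crops to the bounding box of the dark pixels; the function name `crop_to_content` and the threshold are hypothetical, and de-skewing is not shown.

```python
def crop_to_content(image, dark_threshold=128):
    """Return the sub-image bounded by the outermost dark pixels.

    `image` is a list of rows of grayscale values (0 = black, 255 = white).
    Assumption: the document content is darker than its surroundings.
    """
    rows = [y for y, row in enumerate(image)
            if any(p < dark_threshold for p in row)]
    cols = [x for x in range(len(image[0]))
            if any(row[x] < dark_threshold for row in image)]
    if not rows or not cols:
        return []  # blank frame: nothing to crop to
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# A tiny 4x5 frame with a white margin around three dark pixels.
frame = [
    [255, 255, 255, 255, 255],
    [255,  10,  20, 255, 255],
    [255,  30, 255, 255, 255],
    [255, 255, 255, 255, 255],
]
cropped = crop_to_content(frame)
```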
3
Microfilm Processing
Cropped and De-skewed Image
4
Image Zoning
  • Ruled lines in a document produce a
    distinctive signature.
  • The algorithm searches for these
    signatures to detect the lines that
    describe a table.
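
The slides do not define the signature that a ruled line produces. One common way to detect table rulings, shown here purely as an assumption, is a projection profile: a pixel row or column that is almost entirely dark is treated as a line. The names `find_ruling_rows`, `find_ruling_cols`, and `min_fill` are hypothetical.

```python
def find_ruling_rows(binary, min_fill=0.8):
    """Indices of pixel rows whose dark-pixel fraction is at least min_fill.

    `binary` is a list of rows where 1 = dark pixel and 0 = background.
    """
    width = len(binary[0])
    return [y for y, row in enumerate(binary) if sum(row) / width >= min_fill]


def find_ruling_cols(binary, min_fill=0.8):
    """Indices of pixel columns whose dark-pixel fraction is at least min_fill."""
    height = len(binary)
    width = len(binary[0])
    return [x for x in range(width)
            if sum(row[x] for row in binary) / height >= min_fill]


# A small grid: full dark rows 0 and 3, full dark columns 0, 3 and 5,
# plus a little text-like noise inside the cells.
page = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
]
horizontal_lines = find_ruling_rows(page)
vertical_lines = find_ruling_cols(page)
```

The rectangles between consecutive detected lines would then be the zones passed on to later steps.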

5
Image Zoning
Automatically Identifies Table Structure.
6
Optical Character Recognition
  • A neural net evaluates each zone in the
    image.
  • The neural net converts the printed
    characters in each zone into ASCII text.

7
Optical Character Recognition
8
Column-Row Recognition
  • The algorithm uses the geometry of
    each zone to identify the table's
    columns and rows.
  • The algorithm associates each column and
    row label with its values in the
    table.
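
A rough sketch of how zone geometry could yield rows and columns: zones whose left edges nearly coincide share a column, and zones whose top edges nearly coincide share a row. The `Zone` tuple, the tolerance, and the choice to treat the first row as the column labels are all assumptions made for illustration, not details taken from the slides.

```python
from collections import namedtuple

# Hypothetical zone record: top-left corner, size, and recognized text.
Zone = namedtuple("Zone", "x y w h text")


def group_positions(values, tolerance=5):
    """Map each coordinate to a group index; coordinates within
    `tolerance` of a group's first member join that group."""
    groups = []
    for v in sorted(set(values)):
        if groups and v - groups[-1][0] <= tolerance:
            groups[-1].append(v)
        else:
            groups.append([v])
    return {v: i for i, group in enumerate(groups) for v in group}


def build_grid(zones, tolerance=5):
    """Place each zone's text at (row index, column index)."""
    col_of = group_positions([z.x for z in zones], tolerance)
    row_of = group_positions([z.y for z in zones], tolerance)
    return {(row_of[z.y], col_of[z.x]): z.text for z in zones}


def values_by_column_label(grid):
    """Treat row 0 as column labels and attach each label to the values below it."""
    n_rows = max(r for r, _ in grid) + 1
    labels = {c: text for (r, c), text in grid.items() if r == 0}
    return {label: [grid.get((r, c), "") for r in range(1, n_rows)]
            for c, label in labels.items()}


zones = [
    Zone(0, 0, 100, 20, "NAME and Surname of each Person"),
    Zone(102, 1, 80, 20, "RELATION to Head of Family"),
    Zone(1, 22, 100, 20, "John Eyres"),
    Zone(101, 22, 80, 20, "Head"),
    Zone(0, 44, 100, 20, "Annie Eyres"),
    Zone(102, 43, 80, 20, "Wife"),
]
table = values_by_column_label(build_grid(zones))
```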

9
Column-Row Recognition
10
Identify Labels
  • The algorithm maps the printed text of
    each label to a standardized name.
  • The standardized names correspond to the
    fields in a database.
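
The label mapping can be sketched as a lookup table keyed on a normalized form of the printed label. The three label and field pairs come from the slides that follow; the normalization rule and the names `STANDARD_NAMES`, `normalize`, and `standard_name` are assumptions.

```python
import re

# Printed census labels (normalized) mapped to database field names.
STANDARD_NAMES = {
    "road street c and no or name of house": "Address",
    "name and surname of each person": "Full Name",
    "relation to head of family": "Relationship",
}


def normalize(label):
    """Lowercase the label and collapse punctuation and whitespace to single spaces."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", label.lower()).split())


def standard_name(label):
    """Return the standardized field name, or None for an unknown label."""
    return STANDARD_NAMES.get(normalize(label))


address_field = standard_name("ROAD, STREET, &c., and No. or NAME of HOUSE")
name_field = standard_name("NAME and Surname of each Person")
unknown_field = standard_name("Occupation")
```

Normalizing before the lookup lets small differences in case or punctuation in the recognized text still resolve to the same field.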

11
Identify Labels
ROAD, STREET, &c., and No. or NAME of HOUSE
Address
12
Identify Labels
NAME and Surname of each Person
Full Name
Address
13
Identify Labels
RELATION to Head of Family
Relationship
Address
Full Name
14
Extract Data
  • The algorithm identifies factored table
    values.
  • The algorithm stores each record in an XML
    file.
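
The slides do not define "factored" values. This sketch assumes they are values written once in the table but shared by several rows, as with the address "Collafer" in the slides that follow. It copies such a value down into each row and then writes the records as XML with Python's standard `xml.etree.ElementTree`. The element names and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def fill_factored(rows, field):
    """Copy the most recent non-empty value of `field` into rows that leave it blank."""
    current = ""
    filled = []
    for row in rows:
        current = row.get(field) or current
        filled.append({**row, field: current})
    return filled


def records_to_xml(rows):
    """Serialize each row as a <record> element with one child per field."""
    root = ET.Element("records")
    for row in rows:
        record = ET.SubElement(root, "record")
        for name, value in row.items():
            # XML tag names cannot contain spaces, so "Full Name" becomes "FullName".
            ET.SubElement(record, name.replace(" ", "")).text = value
    return ET.tostring(root, encoding="unicode")


rows = [
    {"Address": "Collafer", "Full Name": "John Eyres", "Relationship": "Head"},
    {"Address": "", "Full Name": "Annie Eyres", "Relationship": "Wife"},
    {"Address": "", "Full Name": "Lehailes Eyre", "Relationship": "Son"},
]
filled_rows = fill_factored(rows, "Address")
xml_text = records_to_xml(filled_rows)
```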

15
Extract Data

Collafer
Extracted by hand.
16
Extract Data

John Eyres
Head
Collafer
Extracted by hand.
17
Extract Data

Annie Eyres
Wife
Collafer
Extracted by hand.
18
Extract Data

Lehailes Eyre
Son
Collafer
Extracted by hand.
19
Microfilm Queries
  • A web form provides the interface to query
    the microfilm database.
  • Individuals can enter keywords (such as
    a first and last name), and the system
    locates appropriate records in the indexed
    microfilm documents.
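
A keyword query over the extracted records might look like the sketch below. The matching rule (every keyword must appear, case-insensitively, somewhere in the record) and the function name `search` are assumptions; the slides do not describe how matching is done.

```python
def search(records, *keywords):
    """Return records whose field values contain every keyword
    (case-insensitive substring match)."""
    wanted = [k.lower() for k in keywords if k]
    hits = []
    for record in records:
        haystack = " ".join(str(v) for v in record.values()).lower()
        if all(k in haystack for k in wanted):
            hits.append(record)
    return hits


records = [
    {"Address": "Collafer", "Full Name": "John Eyres", "Relationship": "Head"},
    {"Address": "Collafer", "Full Name": "Annie Eyres", "Relationship": "Wife"},
    {"Address": "Collafer", "Full Name": "Lehailes Eyre", "Relationship": "Son"},
]
john_hits = search(records, "John", "Eyre")
family_hits = search(records, "eyre")
```

Substring matching lets the query "Eyre" find the record spelled "Eyres"; a production index would likely use something more selective.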

20
Web Query
John
Eyre
21
Search Results
  • The system returns the indexed images that
    contain the results.
  • Since the database indexes both the text
    and geometry of the document, the process
    can return just the relevant regions of the
    microfilm image.
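
Because each indexed value carries its zone geometry, a result can be returned as a crop instead of a full frame. A minimal sketch, assuming each matched zone is stored as a `(left, top, right, bottom)` pixel box with exclusive right and bottom edges; the helper names are hypothetical.

```python
def union_box(boxes):
    """Smallest (left, top, right, bottom) rectangle covering every box."""
    lefts, tops, rights, bottoms = zip(*boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


def crop_region(image, box):
    """Cut the rectangle `box` out of an image stored as a list of rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# A 4x6 stand-in image whose pixel value encodes its position (row*10 + col).
image = [[r * 10 + c for c in range(6)] for r in range(4)]
name_zone = (1, 1, 3, 2)      # zone holding the matched name
relation_zone = (2, 1, 5, 3)  # zone holding a related value
region = crop_region(image, union_box([name_zone, relation_zone]))
```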

22
Search Results
23
Search Results
24
Just-In-Time Browsing
  • To make the query results display quickly,
    the system uses Just-In-Time Browsing.
  • Just-In-Time Browsing will allow people to
    browse digitized microfilm and other large
    collections of images over the Internet at
    interactive rates.

25
Just-In-Time Browsing
26
Just-In-Time Browsing