Data Capture Overview - PowerPoint PPT Presentation

Slides: 20
Provided by: UnitedN8
Learn more at: https://unstats.un.org

Transcript and Presenter's Notes


1
  • Data Capture Overview
  • United Nations Statistics Division

2
Overview of Presentation
  • Definition of data capture
  • Methods of data capture
  • Different Methods
  • Advantages and disadvantages
  • Issues to consider

3
What Is Data Capture?
  • Data capture is the system used to convert the
    information obtained in the census to a format
    that can be interpreted by a computer.
  • Source: United Nations, Principles and
    Recommendations for Population and Housing
    Censuses, Rev. 2, p. 68.

4
Data Capture Methods
  • Keyboard data entry
  • Optical mark recognition/reading (OMR)
  • Optical character recognition/intelligent
    character recognition (OCR/ICR)
  • Personal digital assistant (PDA)
  • Internet
  • Each method has advantages, disadvantages, costs
    and impacts at both the data capture stage and
    later stages
  • Combination of more than one of the above methods

5
Keyboard Data Entry
  • Response codes from census form are manually
    entered into computers
  • Sophisticated version involves computer assisted
    key entry where operator selects a response from
    options displayed on the screen
  • Use of method based on time and cost
    considerations, and feasibility to implement more
    sophisticated technology
  • Method also used to process textual responses
    into classification categories
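
The last point above, coding keyed textual responses into classification categories, can be illustrated with a minimal sketch. The code list, the residual code and the function name are assumptions made for illustration and do not come from any real classification.

```python
# Hypothetical code list: categories and codes are illustrative only.
OCCUPATION_CODES = {
    "farmer": "61",
    "teacher": "23",
    "nurse": "22",
}

# Assumed residual code for responses that need manual review.
UNCODED = "99"


def code_response(text: str) -> str:
    """Map a keyed textual response to a classification code."""
    # Normalise case and whitespace before the lookup.
    key = " ".join(text.lower().split())
    return OCCUPATION_CODES.get(key, UNCODED)
```

A response that is not in the code list receives the residual code, so that an operator can classify it later instead of the system guessing.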

6
Advantages and Disadvantages of Keyboard Data
Entry
  • Advantages
  • Method requires simple software systems and
    low-end computing hardware
  • Less costly (depending on the costs of manpower)
  • There will be a large number of PCs available for
    other uses after census
  • Disadvantages
  • Requires more staff
  • Task takes much longer time to complete than with
    automated data entry
  • Potential for errors during data entry
  • Standardization of operations is difficult as
    performance may depend on the individual operator

7
Data Capture Technologies
  • Imaging and intelligent character recognition
    offer great potential and benefits for data
    capture
  • Technology should be used to make data capture
    more effective and efficient, not for
    technology's sake
  • Awareness of long lead times and technology
    infrastructure required for successful
    implementation of intelligent character
    recognition

8
Optical Mark Recognition/Reading (OMR)
  • OMR is a form-scanning method whereby responses
    are read into a computer without a keyboard
  • OMR technology reads responses to tick-box type
    questions on specially designed paper
  • Only presence or absence of a mark is detected by
    the machine
  • The scanned responses are transformed into codes
  • Handwritten responses must be manually entered or
    coded using computer-assisted methods
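
The presence-or-absence detection described above can be sketched as follows. This is a simplified illustration, not how any particular OMR scanner works: it assumes the scanner already reports, for each tick box, the fraction of dark pixels inside it, and the 0.4 threshold is an arbitrary assumed value.

```python
def read_tick_boxes(fill_ratios, threshold=0.4):
    """Return the codes of boxes whose dark-pixel ratio meets the threshold.

    fill_ratios maps a response code to the fraction of dark pixels
    measured inside that box (0.0 = empty, 1.0 = fully filled).
    """
    return [code for code, ratio in fill_ratios.items() if ratio >= threshold]


def capture_single_choice(fill_ratios, threshold=0.4):
    """Return one code, or None when the question is blank or multiply marked."""
    marked = read_tick_boxes(fill_ratios, threshold)
    return marked[0] if len(marked) == 1 else None
```

Returning None for blank or multiply marked questions leaves those cases for manual or computer-assisted resolution rather than forcing a code.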

9
Advantages and Disadvantages of OMR
  • Advantages
  • Improved data accuracy
  • Data capture faster than keyboard data entry
  • Equipment is relatively inexpensive
  • Relatively simple to install and run
  • A well-established technology that has been used
    in many countries
  • Disadvantages
  • Restrictions as to form design
  • Restrictions on type of paper and ink
  • Precision required in printing process/cutting of
    sheets
  • Response boxes should be correctly marked with
    appropriate pen or pencil
  • Does not capture textual responses

10
Optical Character Recognition (OCR)/Intelligent
Character Recognition (ICR)
  • OCR and ICR combine scanning and character
    recognition technology to scan the whole form and
    interpret the responses
  • OCR technology recognizes machine-printed
    characters only
  • ICR technology reads both machine-printed and
    hand-written responses in specific locations of
    the page and transforms the responses into codes
  • For OCR, handwritten responses must be manually
    entered or coded using computer-assisted methods

11
Advantages of OCR/ICR
  • Form design is not as stringent as for OMR
  • Processing time can be reduced due to automated
    nature of the process
  • Allow for digital filing of questionnaires
    resulting in efficiency of storage and retrieval
    of questionnaires for future use
  • Some handwritten responses can be automatically
    coded thereby improving data quality


12
Disadvantages of OCR/ICR
  • Higher costs of equipment (sophisticated
    hardware/software required)
  • High calibre IT staff required to support the
    system
  • Handwriting on census forms must be as close as
    possible to the model handwriting to avoid
    recognition errors
  • Possibility for error during character
    substitution which would affect data quality
  • Tuning of recognition engine to accurately
    recognize characters is critical with trade-off
    between quality and cost
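
The quality/cost trade-off in tuning a recognition engine can be sketched with a confidence threshold. This is a minimal illustration under stated assumptions: it supposes the engine reports a confidence score between 0.0 and 1.0 for each field, and the class name, field names and 0.9 default are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecognisedField:
    name: str
    value: str
    confidence: float  # assumed engine score between 0.0 and 1.0


def route_fields(fields, min_confidence=0.9):
    """Split fields into those accepted automatically and those sent to keying."""
    accepted, to_key = [], []
    for field in fields:
        if field.confidence >= min_confidence:
            accepted.append(field)
        else:
            to_key.append(field)
    return accepted, to_key
```

Raising `min_confidence` sends more fields to manual keying, which costs more but reduces the risk of character substitution errors; lowering it does the opposite.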

13
Personal Digital Assistant (PDA)
  • Contents of the census form are stored onto the
    PDA so that the questions appear sequentially on
    the screen
  • Data are entered into a hand-held computer
    instead of onto a paper census form
  • Data are then electronically transmitted to an
    NSO database for further processing

14
Advantages and Disadvantages of use of the PDA
  • Advantages
  • Instant data capturing at the point of
    collection, reducing manual input errors
  • Immediate data validation, reducing
    re-verifications at later stage
  • Time effective with real time logical validation
    rules, reducing logical errors
  • Faster processing of census information leading
    to timely availability of results
  • Disadvantages
  • Setting up of process may take a long time as it
    requires extensive testing
  • Requires that enumerators have ability to use the
    device which may require administering a test
  • Requires intensive training of enumerators on use
    of device (training is more complicated)
  • Need to recharge the battery which could run out
    during enumeration
  • Possibility of equipment failure
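
The real-time logical validation listed among the advantages above could look like the following sketch. The rules, field names and age limits are hypothetical examples chosen for illustration; an actual census would define its own edit specifications.

```python
def validate_person(record):
    """Return a list of error messages; an empty list means the record passes."""
    errors = []
    age = record.get("age")
    # Range check: age must be present and plausible (assumed limits).
    if age is None or not 0 <= age <= 120:
        errors.append("age must be between 0 and 120")
    # Consistency check between age and marital status (assumed minimum age).
    elif record.get("marital_status") == "married" and age < 12:
        errors.append("married person is younger than 12")
    # Consistency check between sex and the fertility question.
    if record.get("sex") == "male" and record.get("children_born", 0) > 0:
        errors.append("children ever born recorded for a male")
    return errors
```

Running such checks on the device lets the enumerator correct an inconsistency while still with the respondent, which is what reduces re-verification at a later stage.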

15
Internet-based Data Collection
  • Use of the Internet for census data collection is
    growing
  • However, the method is always complementary to
    other more established methods
  • As with PDAs, the on-line form is not a
    downloadable version of the paper form
  • Use of this method requires a password in order
    to access and fill in the form
  • Development of the internet system for data
    collection is generally outsourced for lack of
    in-house expertise

16
Advantages/Disadvantages of use of the Internet
  • Advantages
  • Reduced resources necessary for form handling and
    data capture
  • Better opportunity to enumerate geographic areas
    and population groups that are difficult to
    reach and to enumerate
  • Automatic filtering of irrelevant questions
  • Better quality data due to in-built interactive
    verification mechanism
  • Faster availability of census results through
    simplified data entry and editing
  • Disadvantages
  • Requires that respondents have a computer with
    Internet access
  • Management of responses can be problematic, e.g.,
    ensuring that each household has responded once
    and only once
  • Requires high security system to ensure safe
    transfer of data
  • Need to build parallel processing system as not
    everyone will use the Internet
  • Requires mechanism to check for omitted and
    duplicate submissions
  • Is costly and requires a lot of resources for
    setting up and adequately testing the system
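
The check for omitted and duplicate submissions mentioned above can be sketched as follows. It assumes each household has a unique identifier (for example, one tied to its access password) and that the list of expected identifiers is known; both the identifiers and the function name are hypothetical.

```python
from collections import Counter


def check_submissions(expected_ids, submitted_ids):
    """Return (omitted, duplicated) household identifiers, each sorted."""
    counts = Counter(submitted_ids)
    # Households that were expected but never submitted a form.
    omitted = sorted(set(expected_ids) - set(counts))
    # Households that submitted more than one form.
    duplicated = sorted(hid for hid, n in counts.items() if n > 1)
    return omitted, duplicated
```

Omitted households would then be followed up through the parallel collection method, while duplicated ones need a rule for deciding which submission to keep.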

17
Issues to Consider in Choosing a Method
  • Method to use is dependent on national
    circumstances
  • Choice of method should be part of the overall
    strategic objective of the census in terms of
    timeliness, accuracy and cost
  • Choice of processing system and technology to use
    needs to be established early in the census cycle
  • Enough time is required to test and implement the
    system
  • When imaging technology is used for data capture,
    extensive testing is required well in advance of
    the census
  • Possibility to outsource when the required
    expertise is not available in-house

18
Issues to consider (cont.)
  • Extensive testing of the system is also critical
    when data collection is either by PDA or via the
    Internet
  • Design and paper quality of census form should be
    linked to method of data capture
  • When imaging technology is to be used, adequate
    training of enumerators on how to properly fill
    in the forms is crucial

19
  • Thank you