External Unstructured Data And The Data Warehouse

Provided by: susansh9

Transcript and Presenter's Notes

1
External / Unstructured Data And The Data Warehouse
  • Chapter 8
  • Kumar Neti
  • Amresh Mohanlal
  • Susan Shanlever

2
Internal Structured Data
  • Data that originates inside the corporation
    and has already been shaped into a regularly
    occurring format

3
External Data
  • Data that is of legitimate use to a corporation
    but is not generated by the corporation's own
    systems
  • It enters the corporation in an unstructured,
    unpredictable format
  • If external data is not stored in a centrally
    located place, several problems are sure to arise
  • The data warehouse is an ideal place to store
    external and unstructured data

4
  • The above figure shows external and
    unstructured data entering the data warehouse

5
  • The above figure shows that when external data
    enters the corporation in an undisciplined
    fashion:
  • - The identity of the source of the data is lost
  • - There is no coordination whatsoever in the
    orderly use of the data

6
External/Unstructured Data in the Data Warehouse
  • The problems of external and unstructured data
    are:
  • - Frequency of availability
  • - Totally undisciplined
  • - Unpredictability

7
Most Common Types of Unstructured Data
  • Image data, stored as pictures
  • Voice data, stored digitally so that it can be
    translated back into voice format
  • - The technology to capture and manipulate
    image and voice data is not nearly as mature
    as more conventional technology

8
Methods to Capture and Store Unstructured Information
  • Place it on a bulk storage medium such as
    near-line storage
  • Create two stores of unstructured data
  • - One store contains all of the unstructured
    data
  • - The other is a much smaller store containing
    only a subset

9
Meta Data and External Data
  • Meta data is an important component of the data
    warehouse
  • The above figure shows the importance and role
    of meta data

10
Meta Data and External Data contd
  • Through meta data, the manager determines much
    information about the external data
  • Scanning meta data eliminates much work because
    it filters out documents that are not relevant or
    are out of date
  • Properly built and maintained meta data is
    absolutely essential to the operation of the data
    warehouse, particularly with regard to the
    external data
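
As a rough illustration of how scanning meta data can filter out irrelevant or out-of-date documents, here is a minimal sketch. The record fields (document_id, source, subject, entry_date) and the function name scan_metadata are assumptions made for this example; the presentation does not list the actual meta data attributes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MetaRecord:
    # Hypothetical fields; the presentation does not specify them.
    document_id: str
    source: str
    subject: str
    entry_date: date


def scan_metadata(records, subject, oldest_allowed):
    """Return only records on the wanted subject that are not out of date."""
    return [
        r for r in records
        if r.subject == subject and r.entry_date >= oldest_allowed
    ]


# Made-up sample records for illustration only.
records = [
    MetaRecord("D1", "Trade journal", "interest rates", date(2001, 3, 1)),
    MetaRecord("D2", "Newspaper", "interest rates", date(1995, 6, 15)),
    MetaRecord("D3", "Trade journal", "oil prices", date(2001, 4, 2)),
]

relevant = scan_metadata(records, "interest rates", date(2000, 1, 1))
print([r.document_id for r in relevant])
```

Only the meta data is scanned here; the documents themselves are never opened, which is where the saved work comes from.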

11
Notification Data
  • The figure below shows notification data
  • It is a file created for users of the system that
    indicates the classifications of data that are
    of interest to those users
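
One possible way to picture a notification file is as a mapping from each user to the classifications that user cares about. The sketch below is an assumption for illustration only: the user names, the classifications, and the function users_to_notify are hypothetical and do not come from the presentation.

```python
# Hypothetical notification file: user -> classifications of interest.
notification_file = {
    "analyst_a": {"interest rates", "housing starts"},
    "analyst_b": {"oil prices"},
}


def users_to_notify(document_classifications, notifications):
    """Return the users whose interests overlap the document's classifications."""
    wanted = set(document_classifications)
    return sorted(
        user for user, interests in notifications.items()
        if interests & wanted
    )


# A newly arrived external document classified under two headings.
print(users_to_notify(["oil prices", "shipping"], notification_file))
```

When external data arrives, its classifications are compared with the file and only the matching users are alerted.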

12
Storing External/Unstructured Data
  • An entry is made in the meta data of the
    warehouse describing where the actual body of
    external data can be found
  • The external data is then stored elsewhere,
    wherever it is convenient, as shown in the figure
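
The idea of a meta data entry that points to the body of external data can be sketched as follows. This is a minimal sketch under stated assumptions: the class names ExternalDataEntry and WarehouseMetadata, the field names, and the sample location are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalDataEntry:
    # Hypothetical fields; only the pointer idea comes from the presentation.
    document_id: str
    description: str
    medium: str    # for example "filing cabinet", "fiche", "magnetic tape"
    location: str  # where the actual body of data can be found


class WarehouseMetadata:
    """Holds entries describing external data that is stored elsewhere."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.document_id] = entry

    def locate(self, document_id):
        entry = self._entries[document_id]
        return f"{entry.medium}: {entry.location}"


metadata = WarehouseMetadata()
metadata.register(
    ExternalDataEntry("D7", "Industry survey", "magnetic tape", "tape rack 12")
)
print(metadata.locate("D7"))
```

The warehouse holds only the small descriptive entry, while the body of the data stays on whatever medium is convenient.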

13
  • External Data may be stored in/on
  • - Filing Cabinet
  • - Fiche
  • - Magnetic Tape

14
Components of External/Unstructured Data
  • Contains many different components, some of which
    are more useful than others
  • A large amount of unstructured data can be
    efficiently stored and managed in the following
    manner
  • - To manage the data, an experienced DSS
    analyst needs to determine which are the most
    important units of data
  • - These units are then stored in an easy-to-reach
    location
  • - The remaining, less important data is placed
    in a bulk storage location
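
The steps above can be sketched as a simple partition of the data into two stores. The importance score and the threshold are assumptions standing in for the DSS analyst's judgment; the presentation does not say how importance is decided.

```python
def split_stores(units, is_important):
    """Partition units into an easy-access store and a bulk store."""
    easy_access, bulk = [], []
    for unit in units:
        (easy_access if is_important(unit) else bulk).append(unit)
    return easy_access, bulk


# Hypothetical units with a score assigned by the DSS analyst.
units = [
    {"name": "executive summary", "score": 9},
    {"name": "appendix tables", "score": 3},
    {"name": "key forecasts", "score": 8},
]

easy, bulk = split_stores(units, lambda u: u["score"] >= 7)
print([u["name"] for u in easy])
print([u["name"] for u in bulk])
```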

15
Modeling and External/Unstructured Data
Figure 8.6: There is only a faint resemblance of
external data/unstructured data to a data model.
Furthermore, nothing can be done about reshaping
external data and unstructured data.
(Figure labels: external data, unstructured data,
data model, data warehouse)
From Building the Data Warehouse, 3rd Ed. by W. H. Inmon

16
Secondary Reports
From Building the Data Warehouse, 3rd Ed. by W. H. Inmon

17
Archiving External Data
  • The useful lifetime of external data
  • Discard or archive
  • Storage

18
Comparing Internal Data to External Data
From Building the Data Warehouse, 3rd Ed. by W. H. Inmon