Data Warehouse - PowerPoint PPT Presentation

About This Presentation
Title:

Data Warehouse


Slides: 23
Provided by: cseBu
Learn more at: https://cse.buffalo.edu

Transcript and Presenter's Notes

Title: Data Warehouse


1
Data Warehouse
  • Yong Shi
  • CSE DEPARTMENT

2
Strategic delivery of information
  • The current Situation
  • The never-ending quest to access any
    information, anywhere, anytime.
  • The problem
  • Data is scattered in many types of incompatible
    structures.

3
Analytical processing requirements
  • Four levels of analytical processing
  • 1. Simple queries and reports
  • 2. The ability to do "what if" processing
  • 3. Step back and analyze what has previously
    occurred to bring about the current state of data
  • 4. Analyze what has happened in the past and
    what needs to be done in the future for a
    specific change
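
The first two levels can be illustrated with a minimal sketch. The sales records, field names, and the `revenue_by_region` helper are assumptions made up for illustration; they are not part of the original slides. The same aggregation serves as a simple report and, with a hypothetical price change applied, as a "what if" run.

```python
# Hypothetical monthly sales records; the figures are illustrative only.
sales = [
    {"region": "East", "units": 120, "unit_price": 10.0},
    {"region": "West", "units": 80, "unit_price": 12.0},
    {"region": "East", "units": 50, "unit_price": 10.0},
]


def revenue_by_region(rows, price_change=0.0):
    """Sum revenue per region; a nonzero price_change makes it a what-if run."""
    totals = {}
    for row in rows:
        # Apply the assumed relative price change (0.10 means +10%).
        price = row["unit_price"] * (1.0 + price_change)
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["units"] * price
    return totals


report = revenue_by_region(sales)                      # level 1: simple report
what_if = revenue_by_region(sales, price_change=0.10)  # level 2: what if prices rise 10%?
```

Levels 3 and 4 would additionally need historical snapshots of the data, which this sketch does not model.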

4
Information data superstore (IDSS)
  • Definition
  • The architecture needed to support the
    far-ranging requirements of the four levels of
    analysis.
  • Also called a super data warehouse
  • Data warehouses are not an end in themselves but
    merely a step on the path to the information data
    superstore

5
Why a separate environment is needed
  • The use of operational systems vs. the data warehouse
  • The data's characteristics
  • The type of access

6
A strategy for building a data warehouse
  • Need indicators
  • Action steps
  • Three-stage data warehousing processing
  • model → build → deploy
  • (understand) (establish) (implement)

7
Organizational and cultural issues
  • Cultural imperatives
  • Success criteria
  • Satisfy users' requirements
  • Make a significant contribution to the success of
    the business
  • The users accept and actively use it
  • The benefits are not exceeded by the costs
  • An adequate budget must be in place

8
Organizational and cultural issues
  • Success criteria (continued)
  • The implementation of the data warehouse must not
    cause other problems that overshadow the benefits
  • A reasonable schedule must be established

9
Organizational and cultural issues
  • End user(client)
  • Strategic architecture
  • User liaison
  • End-user support
  • Data analyst
  • Security office
  • Data administration

10
Organizational and cultural issues
  • Database administration
  • Choosing the initial data and department
  • Establishing an infrastructure
  • Training users
  • Change in the power structure

11
End Users
  • A crucial part of the project
  • Gathering requirements and managing expectations
  • Cost justification process
  • Design reviews
  • User perspective
  • User training

12
A technical architecture for DW
  • Components shown in the diagram: Design, Data Acquisition,
    Data Manager, Management, Information Directory, Data Access,
    Middleware, Data Delivery
  • Data shown in the diagram: Source Data, External Data,
    Warehouse Data

13
Data Quality
  • Why is data quality important?
  • Data quality is a critical issue
  • Poor data quality limits the ability of the end users
    to make informed decisions.
  • It has a profound effect on the image of the
    enterprise.
  • Poor data quality makes it difficult to make
    major changes in an organization.

14
Data Quality
  • What is data quality?
  • The data is accurate
  • The data is stored according to data type
  • The data has integrity
  • The data is consistent
  • The databases are well designed

15
Data Quality
  • The data is not redundant
  • The data follows business rules
  • The data corresponds to established domains
  • The data is timely
  • The data is well understood
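
Several of these criteria can be checked mechanically. The following is a minimal sketch, assuming a hypothetical customer-order record with `state`, `order_date`, `ship_date`, and `last_updated` fields; the domain, the business rule, and the 30-day freshness limit are illustrative assumptions, not taken from the slides.

```python
from datetime import date

# Assumed domain and freshness limit; a real warehouse would take these
# from its own metadata rather than hard-code them.
VALID_STATES = {"NY", "NJ", "PA"}
MAX_AGE_DAYS = 30


def check_record(record, today):
    """Return the list of quality problems found in one record."""
    problems = []
    if record["state"] not in VALID_STATES:
        problems.append("state outside established domain")
    if record["ship_date"] < record["order_date"]:
        problems.append("business rule violated: shipped before ordered")
    if (today - record["last_updated"]).days > MAX_AGE_DAYS:
        problems.append("data is not timely")
    return problems


today = date(2024, 3, 1)
good = {"state": "NY", "order_date": date(2024, 2, 1),
        "ship_date": date(2024, 2, 3), "last_updated": date(2024, 2, 20)}
bad = {"state": "ZZ", "order_date": date(2024, 2, 5),
       "ship_date": date(2024, 2, 1), "last_updated": date(2023, 12, 1)}

good_problems = check_record(good, today)
bad_problems = check_record(bad, today)
```

Here `good_problems` is empty, while `bad_problems` lists one entry per violated criterion.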

16
Data Quality
  • The data satisfies the needs of the business
  • The user is satisfied with the quality of the
    data and the information derived from that
    data
  • There are no duplicate records
  • Data anomalies
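
Duplicate records are one of the easier problems to detect automatically. The sketch below is one possible approach, assuming hypothetical customer records and treating two records as duplicates when their `name` and `zip` fields match after trimming whitespace and ignoring case; real matching rules would depend on the business.

```python
from collections import defaultdict


def find_duplicates(records, key_fields):
    """Group records whose normalized key fields match; return groups larger than one."""
    groups = defaultdict(list)
    for record in records:
        # Normalize each key field so trivial formatting differences do not hide duplicates.
        key = tuple(str(record[field]).strip().lower() for field in key_fields)
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Illustrative records only.
customers = [
    {"id": 1, "name": "Ann Lee", "zip": "14260"},
    {"id": 2, "name": " ann lee ", "zip": "14260"},
    {"id": 3, "name": "Bob Ray", "zip": "14203"},
]

duplicates = find_duplicates(customers, ["name", "zip"])
duplicate_ids = [record["id"] for record in duplicates[0]]
```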

17
Data Quality
  • Assessment of existing data quality
  • Programs that abnormally terminate with data
    exceptions
  • Clients who experience errors/anomalies
  • Clients who do not know or are confused about
    what the data actually means
  • Data that cannot be shared due to lack of
    integration

18
Data Quality
  • What data should be improved?
  • The energy should be spent on data where the
    quality improvement will bring an important
    benefit to the business.
  • We can ignore unimportant data and obsolete
    data.
  • Other criteria: improve the data that can be
    fixed and kept clean.

19
Data Quality
  • Purification process
  • Determine the importance of data quality to the
    organization
  • Identify the enterprise's most important data and
    evaluate the quality.
  • Determine users' and owners' perception of data
    quality.
  • Prioritize which data to purify.
  • Assemble and train a team to clean the data.
  • Select tools to aid in the purification process,
    etc.
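
The prioritization step could be sketched as a simple ranking. The scoring rule here (business importance multiplied by defect rate) and the data set names are assumptions chosen for illustration; the slides do not prescribe a formula. Unimportant or obsolete data is skipped, in line with the earlier advice on what to improve.

```python
# Hypothetical data sets with an assumed business importance (1 to 5) and a
# measured share of defective records; all values are illustrative.
candidates = [
    {"name": "customer", "importance": 5, "defect_rate": 0.08},
    {"name": "archive_1990", "importance": 1, "defect_rate": 0.40},
    {"name": "orders", "importance": 4, "defect_rate": 0.15},
]


def purification_priority(datasets, min_importance=2):
    """Rank data sets by importance * defect_rate, skipping unimportant ones."""
    kept = [d for d in datasets if d["importance"] >= min_importance]
    return sorted(kept, key=lambda d: d["importance"] * d["defect_rate"], reverse=True)


ranked = [d["name"] for d in purification_priority(candidates)]
```

With these sample values, `orders` ranks ahead of `customer`, and `archive_1990` is left out despite its high defect rate because its assumed importance is below the threshold.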

20
Data Quality
  • Data quality case
  • Lesson 1: If those entering the data have a stake
    in the data being incorrect, the data will be
    incorrect.
  • Lesson 2: Reports may show desired results, but
    the reports may be highly inaccurate.

21
Directory/Catalog
  • The challenge
  • Providing short-term benefit without disabling
    broader long-term information handling solutions.
  • Getting data into a warehouse is only half of
    the process.

22
Security in the data warehouse
  • Basic security concepts
  • Physical security
  • Stand-alone or shared security
  • Remote access