The Data Warehouse - PowerPoint PPT Presentation

1 / 29
About This Presentation
Title:

The Data Warehouse

Description:

Database Systems: Design, Implementation, & Management, 5th Edition, Rob ... 25-35 45. Parties with disco and dance music. Folk music concerts. Picnic outdoors ... – PowerPoint PPT presentation

Number of Views:25
Avg rating:3.0/5.0
Slides: 30
Provided by: roger262
Category:
Tags: data | warehouse

less

Transcript and Presenter's Notes

Title: The Data Warehouse


1
Chapter 13
  • The Data Warehouse
  • Database Systems Design, Implementation, and
    Management, Fifth Edition, Rob and Coronel

2
In this chapter, you will learn
  • How operational data and decision support differ
  • What a data warehouse is and how its data are
    prepared
  • What data mining is and what role it plays in
    decision support

3
The Need for Data Analysis
  • External and internal forces require tactical and
    strategic decisions
  • Search for competitive advantage
  • Business environments are dynamic
  • Decision-making cycle time is reduced
  • Different managers require different decision
    support systems (DSS)

4
Decision Support Systems
  • Decision Support
  • Is a methodology
  • Extracts information from data
  • Uses information as basis for decision making

5
Decision Support Systems
  • Decision support system (DSS)
  • Arrangement of computerized tools
  • Used to assist managerial decision
  • Extensive data massaging to produce information
  • Used at all levels in organization
  • Tailored to focus on specific areas and needs
  • Interactive
  • Provides ad hoc query tools

6
DSS Components
Figure 13.1
7
Operational vs. Decision Support Data
  • Operational data
  • Relational, normalized database
  • Optimized to support transactions
  • Real time updates
  • DSS
  • Snapshot of operational data
  • Summarized
  • Large amounts of data
  • Data analyst viewpoint
  • Timespan
  • Granularity (different levels of aggregation)
  • Dimensionality

8
The DSS Database Requirements
  • Database schema
  • Support complex (non-normalized) data
  • Extract multidimensional time slices

9
Requirements (cntd)
  • Data extraction and filtering
  • End-user analytical interface
  • Database size
  • Very large databases (VLDBs)
  • Contains redundant and duplicated data

10
Data Warehouse
  • Integrated
  • Centralized
  • Holds data retrieved from entire organization
  • Subject-Oriented
  • Optimized to give answers to diverse questions
  • Used by all functional areas
  • Time Variant
  • Flow of data through time
  • Projected data
  • Non-Volatile
  • Data never removed
  • Always growing

11
Creating a Data Warehouse
Figure 13.3
12
Data Marts
  • Single-subject data warehouse subset
  • Decision support to small group
  • Can be test for exploring potential benefits of
    Data warehouses
  • Address local or departmental problems

13
DSS Architectural Styles
  • Traditional mainframe-based OLTP
  • Managerial information system (MIS) with 3GL
  • First-generation departmental DSS
  • First-generation enterprise data warehouse using
    RDMS
  • Second-generation data warehouse using MDBMS

14
Twelve Data Warehouse Rules
  • 1. Separated from operational environment
  • 2. Data are integrated
  • 3. Contains historical data over long time
    horizon
  • 4. Snapshot data captured at given time
  • 5. Subject-oriented data
  • 6. Mainly read-only data with periodic batch
    updates from operational source, no online
    updates
  • 7. Development life cycle differs from classical
    one, data driven not process driven

15
Twelve Data Warehouse Rules (Cont.)
  • 8. Contains different levels of data detail
  • Current and old detail
  • Lightly and highly summarized
  • 9. Characterized by read-only transactions to
    large data sets
  • 10. Environment has system to trace data
    resources, transformation, and
    storage
  • 11. Metadata critical components
  • Identify and define data elements
  • Provide the source, transformation, integration,
    storage, usage, relationships, and history of
    data elements
  • 12. Contains charge-back mechanism for usage
  • Enforces optimal use of data

16
Online Analytical Processing (OLAP)
  • Advanced data analysis environment
  • Supports decision making, business modeling, and
    operations research activities
  • Characteristics of OLAP
  • Use multidimensional data analysis techniques
  • Provide advanced database support
  • Provide easy-to-use end-user interfaces
  • Support client/server architecture

17
Relational OLAP (ROLAP)
  • OLAP functionality
  • Uses relational DB query tools
  • Extensions to RDBMS
  • Multidimensional data schema support
  • Data access language and query performance
    optimized for multidimensional data
  • Support for very large databases (VLDBs)

18
Typical ROLAP Client/Server Architecture
Figure 13.10
19
Star Schema
Easy to add new dimension
20
Multidimensional OLAP (MOLAP)
  • OLAP functionality to multidimensional databases
    (MDBMS)
  • Stored data in multidimensional data cube
  • N-dimensional cubes called hypercubes
  • Cube cache memory speeds processing
  • Affected by how the database system handles
    density of data cube called sparsity

21
MOLAP Client/Server Architecture
Figure 13.11
22
ROLAP vs. MOLAP
23
Data Mining
  • Seeks to discover unknown data characteristics
  • Automatically searches data for anomalies and
    relationships
  • Data mining tools
  • Analyze data
  • Uncover problems or opportunities
  • Form computer models based on findings
  • Predict business behavior with models
  • Require minimal end-user intervention

24
Data Mining Process
Figure 13.23
25
Example
  • Church Marketing Research
  • Structural questions
  • Age
  • How in the USA
  • Education
  • Opinion questions
  • Content of Web-Site
  • Purpose of a new building

26
Example- Contd
27
Example Contd
28
(No Transcript)
29
Outcome of Research- Market Segments
Write a Comment
User Comments (0)
About PowerShow.com