Title: The Data Warehouse
1Chapter 13
- The Data Warehouse
- Database Systems Design, Implementation, and
Management, Fifth Edition, Rob and Coronel
2In this chapter, you will learn
- How operational data and decision support differ
- What a data warehouse is and how its data are
prepared - What data mining is and what role it plays in
decision support
3The Need for Data Analysis
- External and internal forces require tactical and
strategic decisions - Search for competitive advantage
- Business environments are dynamic
- Decision-making cycle time is reduced
- Different managers require different decision
support systems (DSS)
4Decision Support Systems
- Decision Support
- Is a methodology
- Extracts information from data
- Uses information as basis for decision making
5Decision Support Systems
- Decision support system (DSS)
- Arrangement of computerized tools
- Used to assist managerial decision
- Extensive data massaging to produce information
- Used at all levels in organization
- Tailored to focus on specific areas and needs
- Interactive
- Provides ad hoc query tools
6DSS Components
Figure 13.1
7Operational vs. Decision Support Data
- Operational data
- Relational, normalized database
- Optimized to support transactions
- Real time updates
- DSS
- Snapshot of operational data
- Summarized
- Large amounts of data
- Data analyst viewpoint
- Timespan
- Granularity (different levels of aggregation)
- Dimensionality
8The DSS Database Requirements
- Database schema
- Support complex (non-normalized) data
- Extract multidimensional time slices
9Requirements (cntd)
- Data extraction and filtering
- End-user analytical interface
- Database size
- Very large databases (VLDBs)
- Contains redundant and duplicated data
10Data Warehouse
- Integrated
- Centralized
- Holds data retrieved from entire organization
- Subject-Oriented
- Optimized to give answers to diverse questions
- Used by all functional areas
- Time Variant
- Flow of data through time
- Projected data
- Non-Volatile
- Data never removed
- Always growing
11Creating a Data Warehouse
Figure 13.3
12Data Marts
- Single-subject data warehouse subset
- Decision support to small group
- Can be test for exploring potential benefits of
Data warehouses - Address local or departmental problems
13DSS Architectural Styles
- Traditional mainframe-based OLTP
- Managerial information system (MIS) with 3GL
- First-generation departmental DSS
- First-generation enterprise data warehouse using
RDMS - Second-generation data warehouse using MDBMS
14Twelve Data Warehouse Rules
- 1. Separated from operational environment
- 2. Data are integrated
- 3. Contains historical data over long time
horizon - 4. Snapshot data captured at given time
- 5. Subject-oriented data
- 6. Mainly read-only data with periodic batch
updates from operational source, no online
updates - 7. Development life cycle differs from classical
one, data driven not process driven
15Twelve Data Warehouse Rules (Cont.)
- 8. Contains different levels of data detail
- Current and old detail
- Lightly and highly summarized
- 9. Characterized by read-only transactions to
large data sets - 10. Environment has system to trace data
resources, transformation, and
storage - 11. Metadata critical components
- Identify and define data elements
- Provide the source, transformation, integration,
storage, usage, relationships, and history of
data elements - 12. Contains charge-back mechanism for usage
- Enforces optimal use of data
16Online Analytical Processing (OLAP)
- Advanced data analysis environment
- Supports decision making, business modeling, and
operations research activities - Characteristics of OLAP
- Use multidimensional data analysis techniques
- Provide advanced database support
- Provide easy-to-use end-user interfaces
- Support client/server architecture
17Relational OLAP (ROLAP)
- OLAP functionality
- Uses relational DB query tools
- Extensions to RDBMS
- Multidimensional data schema support
- Data access language and query performance
optimized for multidimensional data - Support for very large databases (VLDBs)
18Typical ROLAP Client/Server Architecture
Figure 13.10
19Star Schema
Easy to add new dimension
20Multidimensional OLAP (MOLAP)
- OLAP functionality to multidimensional databases
(MDBMS) - Stored data in multidimensional data cube
- N-dimensional cubes called hypercubes
- Cube cache memory speeds processing
- Affected by how the database system handles
density of data cube called sparsity
21MOLAP Client/Server Architecture
Figure 13.11
22ROLAP vs. MOLAP
23Data Mining
- Seeks to discover unknown data characteristics
- Automatically searches data for anomalies and
relationships - Data mining tools
- Analyze data
- Uncover problems or opportunities
- Form computer models based on findings
- Predict business behavior with models
- Require minimal end-user intervention
24Data Mining Process
Figure 13.23
25Example
- Church Marketing Research
- Structural questions
- Age
- How in the USA
- Education
- Opinion questions
- Content of Web-Site
- Purpose of a new building
26Example- Contd
27Example Contd
28(No Transcript)
29Outcome of Research- Market Segments