Introduction to Data Warehousing - PowerPoint PPT Presentation

1 / 24
About This Presentation
Title:

Introduction to Data Warehousing

Description:

Introduction to Data Warehousing ... Perform advanced analytical functions The star structure Multidimensional Database Model The data is found at the intersection of ... – PowerPoint PPT presentation

Number of Views:645
Avg rating:3.0/5.0
Slides: 25
Provided by: FAdam
Category:

less

Transcript and Presenter's Notes

Title: Introduction to Data Warehousing


1
Introduction to Data Warehousing
2
From DBMS to Decision Support
  • DBMSs widely used to maintain transactional data
  • Attempts to use of these data for analysis,
    exploration, identification of trends etc. has
    led to Decision Support Systems.
  • Rapid Growth since mid 70s
  • DBMSs vendors have answered this trend by adding
    new features to existing products
  • Rarely enough

3
DBs for Decision Support
  • Trend towards Data Warehousing
  • Data Warehousing consolidation of data from
    several databases which are in turn maintained by
    individual business units along with historical
    and summary information

4
Characteristics of TPSs
  • Characteristic OLTP
  • Typical operation Update
  • Level of analytical requirements Low
  • Screens Unchanging
  • Amount of data per transaction Small
  • Data level Detailed
  • Age of data Current
  • Orientation Records

5
TPS vs Decision Support
Complex Analysis Historical informationto
analyze Data needs to be integrated Database
design Denormalized, star schema
OLTP Information to supportday-to-day
service Data stored at transactionlevel Database
design Normalized
6
MIS and Decision Support
Ad hoc access
Productionplatforms
Operational reports
Decision makers
  • MIS systems provided business data
  • Reports were developed on request
  • Reports provided little analysis capability
  • no personal ad hoc access to data

7
Analyzing Data from Operational Systems
  • Data structures are complex
  • Systems are designed for high performance and
    throughput
  • Data is not meaningfully represented
  • Data is dispersed
  • TPS systems unsuitable for intensive queries

ERP
Productionplatforms
Operational reports
8
Data Extract Processing
Extracts
Operational systems
Decision makers
  • End user computing offloaded from the operational
    environment
  • Users own data

9
Management Issues
Extracts
Operational systems
Decision makers
  • Extract explosion
  • Duplicated effort
  • Multiple technologies
  • Obsolete reports
  • No metadata

10
Data Quality Issues
  • No common time basis
  • Different calculation algorithms
  • Different levels of extraction
  • Different levels of granularity
  • Different data field names
  • Different data field meanings
  • Missing information
  • No data correction rules
  • No drill-down capability

11
From Extract to Warehouse DSS
Data warehouse
Decision makers
Internal andexternal systems
  • Controlled
  • Reliable
  • Quality information
  • Single source of data

12
Data Warehousing Architecture
External Data Sources
Visualisation
Extract Clean Transform Load Refresh
Metadata respository
Serves
OLAP
Operational Databases
Data Warehouse
Data Mining
13
Business Motivators
  • Provide superior services and products
  • Know the business
  • New products
  • Invest in customers
  • Retain customers
  • Invest in technology
  • Reinvent to face new challenges

14
Centralised data warehouse
Federated data warehouse
15
Tiered data warehouse
16
Data Warehouses Vs Data Marts
Data Warehouse
Data Mart Department Single-subject Few lt 100
GB Months
Property Scope Subjects Data Source Size
(typical) Implementation time
Data Warehouse Enterprise Multiple Many 100 GB to
gt 1 TB Months to years
17
End-user Access Tools
  • High performance is achieved by pre-planning the
    requirements for joins, summations, and periodic
    reports by end-users.
  • There are five main groups of access tools
  • Data reporting and query tools
  • Application development tools
  • Executive information system (EIS) tools
  • Online analytical processing (OLAP) tools
  • Data mining tools

18
Data Usage - 1000 questions
Need to complement RDBMS technology with a
flexible, multidimensional view of data
19
(No Transcript)
20
The Functionality of OLAP
  • Rotate and drill down
  • Create and examine calculated data
  • Determine comparative or relative differences.
  • Perform exception and trend analysis.
  • Perform advanced analytical functions

21
The star structure
22
Multidimensional Database Model
Store
Customer
Store
Time
Time
FINANCE
SALES
Product
  • The data is found at the intersection of
    dimensions.

23
Data Mining
24
Data mining functions
  • Associations
  • 85 percent of customers who buy a certain brand
    of wine also buy a certain type of pasta
  • Sequential patterns
  • 32 percent of female customers who order a red
    jacket within six months buy a gray skirt
  • Classifying
  • Frequent customers are those with incomes about
    50,000 and having two or more children
  • Clustering
  • Market segmentation
  • Predicting
  • predict the revenue value of a new customer based
    on that personal demographic variables
Write a Comment
User Comments (0)
About PowerShow.com