Title: DATA WAREHOUSING
1. DATA WAREHOUSING
2. Legacy System
- Systems that were developed in the early years of business processing
- A rich source of historical data, but the data is difficult to retrieve because of non-standard features
- This is why we need a data warehouse
3. Problems with Legacy Systems
- Accessing data from a legacy system may be difficult for several reasons:
- Developed for a different hardware or software platform
- Uses a different data model
- Uses a different DBMS
- Uses different data definitions
- Uses a different data format
- All of these make it difficult to integrate and share data
4. Data Definition Problems
- Synonyms: different field names are used to store the same data in different databases
- Homonyms: the same field name is used to store different data in different databases
- Domain integrity: the domain for the same field may differ between databases
- Business rules: may differ from one database to another
- Referential integrity: linking related records from different databases may cause problems
- Concurrency control: problems arise when multiple users access a database that was designed for a single user
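One way to handle naming conflicts during integration is a per-source mapping from local field names to shared warehouse names. The sketch below is only an illustration: the source names (`sales_db`, `billing_db`), the field names, and the helper `to_canonical` are all hypothetical. It covers both cases: two databases naming the same data differently, and two databases reusing one name (`name`) for different data.

```python
# Hypothetical per-source mappings from local field names to warehouse names.
FIELD_MAPS = {
    "sales_db": {"cust_no": "customer_id", "name": "customer_name"},
    "billing_db": {"client_id": "customer_id", "name": "product_name"},
}


def to_canonical(source, record):
    """Rename a record's fields to warehouse names; reject unmapped fields."""
    mapping = FIELD_MAPS[source]
    unknown = set(record) - set(mapping)
    if unknown:
        raise KeyError(f"unmapped fields from {source}: {sorted(unknown)}")
    return {mapping[field]: value for field, value in record.items()}


# Different local names (cust_no, client_id) end up as one warehouse field,
# while the shared local name "name" is split into two distinct fields.
sales_row = to_canonical("sales_db", {"cust_no": 17, "name": "Acme Ltd"})
billing_row = to_canonical("billing_db", {"client_id": 17, "name": "Widget"})
```

Keeping the mapping per source, rather than global, is what lets the same local name resolve to different warehouse fields.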
5. Data Warehouse Concepts
- A technique for extracting and filtering data from diverse databases and using this data to build a new database
- Stores information extracted from historical, operational and external databases
- Its primary purpose is to provide information for management decision making
6. Database vs Data Warehouse
Activity    Database                     Data warehouse
Function    Support business operation   Support decision making
Data        Process-oriented             Subject-oriented
Usage       Structured, repetitive       Unstructured, repetitive
Processing  Data entry                   End-user-initiated queries
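The contrast between the two kinds of processing can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under simplifying assumptions: the `sales` table, its columns, and the `record_sale` helper are hypothetical, and a single in-memory database plays both roles, whereas in practice the operational database and the warehouse would be separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")


def record_sale(region, year, amount):
    """Operational use: repetitive data entry, one small transaction each."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO sales VALUES (?, ?, ?)", (region, year, amount))


for row in [("North", 2022, 100.0), ("North", 2023, 150.0), ("South", 2023, 80.0)]:
    record_sale(*row)

# Decision-support use: an end-user query that summarizes by subject (region).
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```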
7. Data Warehouse Architecture
- Operational database / external database layer
- Information access layer
- Data access layer
- Metadata layer
- Process management layer
- Application messaging layer
- Physical layer
- Data staging layer
8. Data Warehouse Implementation
- Data: includes operational, historical and external data
- Extraction and transformation: extract and transform data from different tables
- Data warehouse storage: store the extracted and transformed data in different tables
- Historical data: used for forecasting purposes
- Output: reports, statistics, data analysis and presentations from the data warehouse are used to make decisions
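The flow from source data to output can be sketched in a few lines. Everything here is assumed for illustration: the two sample extracts, their field names and date formats, the `transform` helper, and the use of a plain list in place of real warehouse storage.

```python
from datetime import datetime

# Hypothetical extracts: each source stores dates in its own format.
operational = [{"id": 1, "sold_on": "2023-03-01", "amount": "120.50"}]
historical = [{"id": 2, "sold_on": "15/07/2019", "amount": "99"}]


def transform(row, date_format, source):
    """Convert one source row to a common warehouse layout."""
    sale_date = datetime.strptime(row["sold_on"], date_format).date()
    return {
        "sale_id": row["id"],
        "sale_date": sale_date.isoformat(),
        "amount": float(row["amount"]),
        "source": source,
    }


# Storage: a list stands in for the warehouse table.
warehouse = []
warehouse += [transform(r, "%Y-%m-%d", "operational") for r in operational]
warehouse += [transform(r, "%d/%m/%Y", "historical") for r in historical]

# Output: a simple yearly report built from the stored data.
total_by_year = {}
for row in warehouse:
    year = row["sale_date"][:4]
    total_by_year[year] = total_by_year.get(year, 0.0) + row["amount"]
```

Normalizing dates and amounts at load time means every later report can treat rows from all sources alike.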
9. Data Warehouse Benefits and Risks
- Benefits:
- Reduces reporting cost
- Reduces data consolidation and integration cost
- Increases efficiency and decision-making capabilities
- Risks:
- May house the wrong data
- Expensive to build and maintain
- Requires organizational changes
10. Online Analytical Processing
- Supports data modeling and multidimensional data analysis
- OLAP systems share these characteristics:
- Provide a user-friendly interface
- Use multidimensional data analysis techniques
- Provide advanced database support
- Support client/server architecture
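Multidimensional analysis means viewing the same facts along several dimensions and aggregating over the ones not currently of interest. The following is a rough sketch using plain Python rather than an OLAP product; the sample facts, the dimension names, and the helpers `roll_up` and `slice_rows` are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, units sold).
facts = [
    ("Laptop", "North", "Q1", 10),
    ("Laptop", "South", "Q1", 5),
    ("Phone", "North", "Q1", 7),
    ("Phone", "North", "Q2", 3),
]
DIMENSIONS = ("product", "region", "quarter")


def roll_up(rows, keep):
    """Sum units over every dimension that is not listed in `keep`."""
    positions = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in positions)] += row[-1]
    return dict(totals)


def slice_rows(rows, dimension, value):
    """Keep only the rows where one dimension has a fixed value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in rows if row[i] == value]


by_product = roll_up(facts, ["product"])
north_by_quarter = roll_up(slice_rows(facts, "region", "North"), ["quarter"])
```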
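11. Online Analytical Processing
- OLAP systems can be classified as:
- Relational Online Analytical Processing (ROLAP): uses an RDBMS
- Multidimensional Online Analytical Processing (MOLAP): extends OLAP functionality to multidimensional database management systems (MDBMS)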
12. Data Mining
- Data mining is a decision support tool that enables a user to directly access large amounts of data and analyze them
- Data mining is the set of activities used to find new, hidden, or unexpected patterns in data
13. Data Mining Technique
- The data mining process has four phases:
- Data preparation: the main data sets to be used are identified and cleaned
- Data analysis and classification: identify common data characteristics or patterns
- Knowledge acquisition: develop a model that resembles the target data
- Prediction: use the model to predict future behaviour and forecast business outcomes
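The four phases can be walked through on a toy example. The slides do not name a specific algorithm, so this sketch substitutes a least-squares straight line as the model; the monthly sales figures and the `predict` helper are made up for illustration.

```python
# Hypothetical history: (month number, units sold); None marks a bad reading.
raw = [(1, 10.0), (2, 12.0), (3, None), (4, 16.0), (5, 18.0)]

# Phase 1, data preparation: keep only the complete records.
data = [(x, y) for x, y in raw if y is not None]

# Phase 2, data analysis: summarize common characteristics (here, the means).
n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Phase 3, knowledge acquisition: fit a straight-line model by least squares.
covariance = sum((x - mean_x) * (y - mean_y) for x, y in data)
variance = sum((x - mean_x) ** 2 for x, _ in data)
slope = covariance / variance
intercept = mean_y - slope * mean_x


# Phase 4, prediction: use the model to forecast a future month.
def predict(month):
    return intercept + slope * month


forecast = predict(6)
```

A real data mining tool would use far richer models, but the sequence of prepare, analyze, model and predict stays the same.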
14. Data Mining Tools
- Data mining tools today have the following characteristics:
- Data preparation facilities
- Selection of data mining operations
- Product scalability and performance
- Facilities for visualization of results