Title: Data Warehousing and OLAP
Agenda
- Introduction
- Data Warehouse Theory
- System Features
- Demo
- Discussions
1. Introduction
1.1 Introduction
- A data warehouse is a subject-oriented,
integrated, time-variant, nonvolatile collection
of data in support of management decisions
1.1 Introduction (contd)
- How are organizations using data warehouses?
- Increasing customer focus, which includes the
analysis of customer buying patterns
- Repositioning products and managing product
portfolios by comparing the performance of sales
by time or region, in order to fine-tune
production strategies
- Analyzing operations and looking for sources of
profit
- Managing customer relationships, making
environmental corrections, and managing the cost
of corporate assets
1.2 Data Warehouse Characteristics
- It is a database designed for analytical tasks,
using data from multiple applications
- It supports a relatively small number of users
with relatively long interactions
- Its usage is read-intensive
- Its content is periodically updated
1.2 Data Warehouse Characteristics (contd)
- It contains current and historical data to
provide a historical perspective of information
- It contains a few large tables
- Each query frequently results in a large result
set and involves frequent full table scans and
multi-table joins
1.3 Data Warehousing
- The process of constructing and using data
warehouses
- Constructing the data warehouse: heterogeneous
data sources, data cleaning, data integration
and consolidation
- Using the data warehouse: interactive analysis,
making strategic decisions
1.4 Three-tier System Architecture
2. Data Warehouse Theory
2.1 Data Warehouse Theory
- Why not use the databases directly?
- The query-driven approach is inefficient.
- Potentially expensive for frequent queries.
- Use a data warehouse instead
- The update-driven approach is enough for making
strategic decisions.
- Keep the operational DBMS separate, for daily and
critical operations.
2.2 Data Cube
- A multidimensional, logical view of the data
- Concept hierarchy
- Multiple data granularity
- Data summarization
- Data generalization
- Drill-down on time data for Q1
- Adding a dimension: supplier
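The data cube ideas on these slides (concept hierarchy, drill-down on time, adding a supplier dimension) can be sketched in a few lines of Python. This is only an illustration: the fact rows, the month-to-quarter hierarchy, and the helper name `aggregate` are hypothetical and not taken from the system being presented.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, item, supplier, units_sold)
FACTS = [
    ("Jan", "TV", "S1", 10),
    ("Feb", "TV", "S2", 20),
    ("Mar", "PC", "S1", 30),
    ("Apr", "TV", "S1", 40),
    ("May", "PC", "S2", 50),
]

# Hypothetical concept hierarchy for the time dimension: month -> quarter
QUARTER_OF = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
              "Apr": "Q2", "May": "Q2", "Jun": "Q2"}


def aggregate(facts, key):
    """Sum units_sold over the cube cells produced by key(row)."""
    cube = defaultdict(int)
    for row in facts:
        cube[key(row)] += row[3]
    return dict(cube)


# Coarse granularity: cells are (quarter, item)
by_quarter = aggregate(FACTS, lambda r: (QUARTER_OF[r[0]], r[1]))

# Drill-down on time for Q1: cells are (month, item), Q1 months only
q1_facts = [r for r in FACTS if QUARTER_OF[r[0]] == "Q1"]
q1_by_month = aggregate(q1_facts, lambda r: (r[0], r[1]))

# Adding a dimension: cells are (quarter, item, supplier)
by_quarter_supplier = aggregate(
    FACTS, lambda r: (QUARTER_OF[r[0]], r[1], r[2]))
```

The monthly Q1 cells sum back to the Q1 quarter cells, which is what makes drill-down and roll-up two views of the same data at different granularity.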
2.3 Efficient Data Cube Computation
- The challenge: 2^N combinations for N dimensions
- Concept hierarchies and aggregations
make it more complicated!
- Materialization of the data cube
- Materialize every, none, or some?
- Algorithms for selection
- Based on size
- Based on sharing
- Based on access frequency
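To make the size of the problem concrete, the sketch below enumerates every group-by (cuboid) over N dimensions and counts how many there are once each dimension has a concept hierarchy. The dimension names and level counts are hypothetical. The count with hierarchies uses the standard product of (L_i + 1) over all dimensions, where L_i is the number of levels of dimension i and the extra 1 stands for the fully aggregated "all" level.

```python
from itertools import combinations
from math import prod


def cuboids(dimensions):
    """Return every group-by over the dimensions, including the empty one."""
    return [combo
            for k in range(len(dimensions) + 1)
            for combo in combinations(dimensions, k)]


def cuboid_count_with_hierarchies(levels_per_dimension):
    """Count cuboids when dimension i has L_i hierarchy levels.

    Each dimension contributes L_i + 1 choices: one of its levels,
    or the fully aggregated "all" level.
    """
    return prod(levels + 1 for levels in levels_per_dimension)


# Hypothetical 3-dimensional cube
dims = ("time", "item", "location")
all_cuboids = cuboids(dims)               # 2**3 group-bys

# Without hierarchies every dimension has a single level
flat_count = cuboid_count_with_hierarchies([1, 1, 1])

# Hypothetical hierarchies: time has 3 levels, item 1, location 2
hier_count = cuboid_count_with_hierarchies([3, 1, 2])
```

With only three dimensions the count already grows from 8 to 24 once hierarchies are included, which is why materializing only some cuboids is usually the practical choice.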
2.4 On-Line Analytical Processing (OLAP)
- Fast on-line processing of data cubes or
multi-dimensional databases
- OLAP operations
- Drilling
- Pivoting
- Slicing and Dicing
- Filtering, etc.
2.4 On-Line Analytical Processing (contd)
- A multidimensional, logical view of the data.
- Interactive analysis of the data (drill, pivot,
slice and dice, filter) and quick response to OLAP
queries.
- Summarization and aggregations at every dimension
intersection.
- Retrieval and display of data in 2-D or 3-D
cross-tabs, charts, and graphs, with easy
pivoting of the axes.
- Analytical modeling: deriving ratios, variance,
etc., and involving data across many dimensions.
- Forecasting, trend analysis, and statistical
analysis.
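The OLAP operations listed above can be illustrated on a tiny in-memory cube. This is a minimal sketch under stated assumptions: the cube contents, the dimension names, and the function names `slice_cube`, `dice_cube`, `roll_up`, and `pivot` are all hypothetical, and a real OLAP engine would work on precomputed aggregates rather than a Python dictionary.

```python
from collections import defaultdict

# Hypothetical cube cells: (quarter, region, item) -> sales
DIMS = ("quarter", "region", "item")
CUBE = {
    ("Q1", "North", "TV"): 100,
    ("Q1", "North", "PC"): 150,
    ("Q1", "South", "TV"): 200,
    ("Q2", "North", "TV"): 120,
    ("Q2", "South", "PC"): 180,
}


def slice_cube(cube, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return {k: v for k, v in cube.items() if k[i] == value}


def dice_cube(cube, allowed):
    """Dice: keep cells whose coordinates fall in the allowed value sets."""
    return {k: v for k, v in cube.items()
            if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())}


def roll_up(cube, keep):
    """Roll-up: aggregate away every dimension not listed in keep."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for k, v in cube.items():
        out[tuple(k[i] for i in idx)] += v
    return dict(out)


def pivot(table2d):
    """Pivot: swap the two axes of a 2-D cross-tab."""
    return {(b, a): v for (a, b), v in table2d.items()}


q1 = slice_cube(CUBE, "quarter", "Q1")
diced = dice_cube(CUBE, {"region": {"North"}, "item": {"TV"}})
by_quarter_region = roll_up(CUBE, ("quarter", "region"))
by_region_quarter = pivot(by_quarter_region)
grand_total = roll_up(CUBE, ())
```

Drilling down is the inverse of `roll_up`: it returns to a result that keeps more dimensions, or a finer level of the same dimension.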
3. System Features
3.1 Data sources supported
- ODBC-compatible DBMS
- Oracle, Microsoft SQL Server, MySQL, IBM DB2, etc.
- Files
- MS Access, MS Excel, etc.
- Text files (CSV format)
3.2 Data Cleansing
- Database schema translation
- Field selection and mapping
- Field re-naming
- Field aggregating and deriving
- Data filtering
- Data value conversion
- Data value mapping
- Data value function
- Date value conversion and decomposition
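The cleansing steps listed above can be sketched as a single per-row transformation. Everything here is an assumption made for illustration: the raw field names, the `FIELD_MAP` and `GENDER_MAP` tables, and the `cleanse` function are hypothetical and do not describe how the presented system implements these features.

```python
from datetime import datetime

# Hypothetical raw rows as they might arrive from a source system
RAW = [
    {"cust_nm": " Alice ", "gender": "F", "qty": "2", "price": "9.50",
     "order_dt": "2004-03-15", "internal_flag": "x"},
    {"cust_nm": "Bob", "gender": "M", "qty": "0", "price": "4.00",
     "order_dt": "2004-07-01", "internal_flag": "y"},
]

# Field selection, mapping, and renaming (internal_flag is not selected)
FIELD_MAP = {"cust_nm": "customer", "gender": "gender", "qty": "quantity",
             "price": "unit_price", "order_dt": "order_date"}
# Data value mapping
GENDER_MAP = {"F": "Female", "M": "Male"}


def cleanse(row):
    """Apply the cleansing steps to one raw row; return None if filtered out."""
    rec = {new: row[old] for old, new in FIELD_MAP.items()}
    rec["customer"] = rec["customer"].strip()            # data value function
    rec["gender"] = GENDER_MAP.get(rec["gender"], "Unknown")
    rec["quantity"] = int(rec["quantity"])               # data value conversion
    rec["unit_price"] = float(rec["unit_price"])
    if rec["quantity"] <= 0:                             # data filtering
        return None
    rec["amount"] = rec["quantity"] * rec["unit_price"]  # field deriving
    # Date value conversion and decomposition into year / quarter / month
    d = datetime.strptime(rec["order_date"], "%Y-%m-%d").date()
    rec["order_date"] = d
    rec["year"] = d.year
    rec["quarter"] = (d.month - 1) // 3 + 1
    rec["month"] = d.month
    return rec


clean = [r for r in map(cleanse, RAW) if r is not None]
```

Decomposing the date into year, quarter, and month at cleansing time is what later lets the time dimension carry a concept hierarchy without reparsing dates at query time.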
3.3 Building of Data Cube
- Support for multi-dimensional data
- Support for concept hierarchies
3.5 Interactive Front-end Tools
- User-defined dimensions
- User-defined dimension hierarchy
- User-defined data granularity
- Real-time graph capabilities
- Bar chart
- Pie chart
- Line chart
3.6 Other features
- Web-based OLAP GUI
- Easy to access from the Internet
- Easy to integrate with other systems
- Import / Export capability
4. Demo
5. Discussions
5.1 Roadmap
- Integration with data mining
- Major Group / Sales Analysis
- Prospects Analysis and Forecast
- Association of Customers and Sales
- Market Segment Recommendation
- Other Business Intelligence applications
- Integration with e-marketing
- 1-to-1 Personalization Recommendation
- Target marketing
- Loyalty program