Title: Data Warehousing
1Data Warehousing
2Outline
- What is Data Warehousing? (Definition)
- Why does anyone need it? (Applications)
- How is the data organized? (Star Schema)
- Implementation Issues.
3Data Warehouse Definitions
- Dyche:
  - Used for decision making; duplicates existing data
  - Combination of hardware, specialized software and data extracted from other corporate systems
- Inmon: Subject-oriented, integrated, non-volatile and time-variant collection of data in support of management decisions.
4Why Warehouse?
- Provide a single view of customers across the enterprise
- Improve turnaround time for common reports
- Monitor customer behavior
- Predict future purchases
- Improve responsiveness to business issues
5Coca Cola IBM
- IBM is helping Coca Cola with its warehouse.
- Dealing with global companies like McDonald's: support for negotiating global contracts.
6Financial Services Example Credit Life Cycle
- Product Planning
- Customer Acquisition
- Collections
- Customer Management
7Customer Acquisition
- Support for Marketing
- Market Segmentation
- Plus forecasts with:
  - Response Models
  - Risk / Bankruptcy Models
  - Profitability Models
(Diagram labels: Product Planning, Customer Acquisition)
8Customer Management
- Who gets a credit increase?
- Which delinquent customers are likely to default?
- What do you do (call, send a letter, do nothing)?
Decision support: forecast customer behavior (behavior models)
(Diagram labels: Customer Acquisition, Customer Management)
9Collections/Recovery
- What is the likelihood of recovering money from an account sent to collections?
Decision support: collections models
(Diagram labels: Collections, Customer Management)
10Other Questions
- How can we reduce attrition?
- How can we activate inactive accounts?
- How well are my current strategies performing?
- How do we detect Fraud?
11Where is the data?
- Transaction Systems
- Marketing Database
- Credit Reports
- Customer Service
12How is it Organized?
- Separate from transactional data
- Contains Historical data
- Generally aggregated to some extent
- Optimized for flexible querying of large volumes of data
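As a rough illustration of "aggregated to some extent", the sketch below rolls individual transactions up into monthly totals per customer, the kind of summary a warehouse might hold alongside or instead of raw transactional rows. The record layout, the customer IDs, and the `monthly_totals` helper are all hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical transactional records: (ISO date, customer id, amount).
transactions = [
    ("2001-01-05", "C1", 20.0),
    ("2001-01-17", "C1", 15.0),
    ("2001-02-02", "C1", 30.0),
    ("2001-01-09", "C2", 100.0),
]

def monthly_totals(rows):
    """Aggregate transaction amounts by (customer, 'YYYY-MM')."""
    totals = defaultdict(float)
    for date, customer, amount in rows:
        month = date[:7]  # keep only the year and month of the ISO date
        totals[(customer, month)] += amount
    return dict(totals)

summary = monthly_totals(transactions)
for key in sorted(summary):
    print(key, summary[key])
```

The transactional system keeps every row; the aggregated copy trades detail for smaller tables that are quicker to query over long histories.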
13Star Schema
- Fact Table plus several dimensional tables
- Un-normalized
- Less flexible than normalized tables
- Faster retrieval than normalized tables for large volumes of data
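A minimal sketch of a star schema, using Python's built-in `sqlite3` module with an in-memory database. The table names (`fact_purchase`, `dim_customer`, `dim_date`), their columns, and the sample rows are assumptions made up for illustration; they do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per customer or date.
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, segment TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

-- Fact table: one row per purchase, with a key into each dimension.
CREATE TABLE fact_purchase (
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    date_id     INTEGER REFERENCES dim_date(date_id),
    amount      REAL
);

INSERT INTO dim_customer VALUES (1, 'retail'), (2, 'corporate');
INSERT INTO dim_date VALUES (10, 2001, 1), (11, 2001, 2);
INSERT INTO fact_purchase VALUES (1, 10, 20.0), (1, 11, 30.0), (2, 10, 100.0);
""")

def total_by_segment(connection):
    """Sum purchase amounts per customer segment (fact joined to one dimension)."""
    rows = connection.execute("""
        SELECT c.segment, SUM(f.amount)
        FROM fact_purchase AS f
        JOIN dim_customer AS c ON c.customer_id = f.customer_id
        GROUP BY c.segment
    """).fetchall()
    return dict(rows)

def total_by_month(connection):
    """Sum purchase amounts per (year, month) using the date dimension."""
    rows = connection.execute("""
        SELECT d.year, d.month, SUM(f.amount)
        FROM fact_purchase AS f
        JOIN dim_date AS d ON d.date_id = f.date_id
        GROUP BY d.year, d.month
    """).fetchall()
    return {(year, month): total for year, month, total in rows}

segment_totals = total_by_segment(conn)
month_totals = total_by_month(conn)
print(segment_totals)
print(month_totals)
```

Each query joins the fact table directly to the one dimension it needs, which is the shape that gives the schema its name: the fact table in the center, dimensions around it.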
14Implementation
- Start with the Business Issues
- Project Planning/Human Resources
- Database design / data sources
- Application Development
15Business Analysis
- What is the problem?
- Who owns the problem?
- Will data help solve it?
16When can data be used to Predict?
(Chart: axes are Coupling, High to Low, and Randomness, Low to High)
Source: www.butlergroup.com
Also read the article in Wired Magazine on data mining and terrorism.