Title: Data Warehouse
1. Data Warehouse
2. Strategic delivery of information
- The current situation
- The never-ending quest to access any information, anywhere, anytime.
- The problem
- Data is scattered across many types of incompatible structures.
3. Analytical processing requirements
- Four levels of analytical processing
- 1. Simple queries and reports
- 2. The ability to do what-if processing
- 3. Step back and analyze what has previously occurred to bring about the current state of the data
- 4. Analyze what has happened in the past and what needs to be done in the future to bring about a specific change
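The first two levels above can be illustrated with a minimal Python sketch. The sales records, the field names, and the 50% price change are all hypothetical, invented here for illustration; they do not come from the original material.

```python
# Hypothetical sales records; every name and value is illustrative only.
sales = [
    {"region": "East", "units": 100, "price": 10.0},
    {"region": "West", "units": 50, "price": 10.0},
    {"region": "East", "units": 30, "price": 20.0},
]


def revenue_by_region(rows):
    """Level 1: a simple query/report that totals revenue per region."""
    totals = {}
    for row in rows:
        revenue = row["units"] * row["price"]
        totals[row["region"]] = totals.get(row["region"], 0.0) + revenue
    return totals


def what_if_price_change(rows, factor):
    """Level 2: what-if processing, rerunning the report under an assumed price change."""
    # Build adjusted copies so the original records stay untouched.
    adjusted = [{**row, "price": row["price"] * factor} for row in rows]
    return revenue_by_region(adjusted)


current = revenue_by_region(sales)
scenario = what_if_price_change(sales, 1.5)
```

Levels 3 and 4 would need historical snapshots of the data, which is one reason a warehouse keeps data over time rather than only the current state.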
4. Information data superstore (IDSS)
- Definition
- The architecture needed to support the far-ranging requirements of the four levels of analysis.
- Also called a super data warehouse
- Data warehouses are not an end in themselves but merely a step on the path to the information data superstore
5. Why a separate environment is needed
- The use of operational systems vs. the data warehouse
- The data's characteristics
- The type of access
6. A strategy for building a data warehouse
- Need indicators
- Action steps
- Three-stage data warehousing process
- model (understand) -> build (establish) -> deploy (implement)
7. Organizational and cultural issues
- Cultural imperatives
- Success criteria
- Satisfy users' requirements
- Make a significant contribution to the success of the business
- The users accept and actively use it
- The benefits are not exceeded by the costs
- An adequate budget must be in place
8. Organizational and cultural issues
- Success criteria (continued)
- The implementation of the data warehouse must not cause other problems that overshadow the benefits
- A reasonable schedule must be established
9. Organizational and cultural issues
- End user (client)
- Strategic architecture
- User liaison
- End-user support
- Data analyst
- Security office
- Data administration
10. Organizational and cultural issues
- Database administration
- Choosing the initial data and department
- Establishing an infrastructure
- Training users
- Change in the power structure
11. End Users
- A crucial part of the project
- Gathering requirements and managing expectations
- Cost justification process
- Design reviews
- User perspective
- User training
12. A technical architecture for DW
- [Diagram] Components: Design, Data Acquisition, Data Manager, Management, Information Directory, Middleware, Data Access, Data Delivery
- [Diagram] Data stores: Source Data, External Data, Warehouse Data
13. Data Quality
- Why is data quality important?
- Data quality is a critical issue
- Poor data quality limits the ability of end users to make informed decisions.
- It has a profound effect on the image of the enterprise.
- Poor data quality makes it difficult to make major changes in an organization.
14. Data Quality
- The data is accurate
- The data is stored according to data type
- The data has integrity
- The data is consistent
- The databases are well designed
15. Data Quality
- The data is not redundant
- The data follows business rules
- The data corresponds to established domains
- The data is timely
- The data is well understood
16. Data Quality
- The data satisfies the needs of the business
- The user is satisfied with the quality of the data and the information derived from that data
- There are no duplicate records
- Data anomalies
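Several of the criteria above (no duplicate records, values within established domains, data stored according to data type) can be checked mechanically. The following is a minimal Python sketch under stated assumptions: the customer records, the field names, and the `VALID_STATES` domain are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical customer records; field names and values are assumptions.
records = [
    {"id": 1, "state": "NY", "age": 34},
    {"id": 2, "state": "ZZ", "age": 51},    # "ZZ" is outside the allowed domain
    {"id": 3, "state": "CA", "age": "40"},  # age stored as text, not an integer
    {"id": 1, "state": "NY", "age": 34},    # duplicate of the first record
]

VALID_STATES = {"NY", "CA", "TX"}  # assumed domain for the "state" field


def find_duplicates(rows, key):
    """Return key values that appear more than once."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def find_domain_violations(rows, field, domain):
    """Return ids of rows whose field value is outside the established domain."""
    return [row["id"] for row in rows if row[field] not in domain]


def find_type_violations(rows, field, expected_type):
    """Return ids of rows whose field is not stored as the expected data type."""
    return [row["id"] for row in rows if not isinstance(row[field], expected_type)]


duplicates = find_duplicates(records, "id")
bad_domain = find_domain_violations(records, "state", VALID_STATES)
bad_type = find_type_violations(records, "age", int)
```

Criteria such as accuracy, timeliness, and user satisfaction cannot be verified this way; they need comparison against the real world or feedback from the users themselves.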
17. Data Quality
- Assessment of existing data quality
- Programs that abnormally terminate with data exceptions
- Clients who experience errors/anomalies
- Clients who do not know or are confused about what the data actually means
- Data that cannot be shared due to lack of integration
18. Data Quality
- What data should be improved?
- The energy should be spent on data where the quality improvement will bring an important benefit to the business.
- We can ignore unimportant data and obsolete data.
- Other criteria: improve data that can be fixed and kept clean.
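The selection rule above can be sketched as a simple filter and ranking. This is only an illustration: the data element names, the benefit scores, and the `MIN_BENEFIT` cutoff are hypothetical, and a real assessment would come from the business, not from invented numbers.

```python
# Hypothetical data elements; the benefit scores (0 to 1) are invented.
elements = [
    {"name": "customer_address", "benefit": 0.9, "fixable": True, "obsolete": False},
    {"name": "legacy_fax_number", "benefit": 0.1, "fixable": True, "obsolete": True},
    {"name": "order_total", "benefit": 0.8, "fixable": True, "obsolete": False},
    {"name": "free_text_notes", "benefit": 0.7, "fixable": False, "obsolete": False},
    {"name": "middle_initial", "benefit": 0.1, "fixable": True, "obsolete": False},
]

MIN_BENEFIT = 0.5  # assumed cutoff below which data counts as unimportant


def cleanup_candidates(items, min_benefit=MIN_BENEFIT):
    """Keep important, non-obsolete, fixable data, highest benefit first."""
    keep = [
        e for e in items
        if not e["obsolete"] and e["fixable"] and e["benefit"] >= min_benefit
    ]
    ranked = sorted(keep, key=lambda e: e["benefit"], reverse=True)
    return [e["name"] for e in ranked]


priorities = cleanup_candidates(elements)
```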
19. Data Quality
- Determine the importance of data quality to the organization
- Identify the enterprise's most important data and evaluate its quality.
- Determine users' and owners' perceptions of data quality.
- Prioritize which data to purify.
- Assemble and train a team to clean the data.
- Select tools to aid in the purification process, etc.
20. Data Quality
- Data quality case
- Lesson 1: If those entering the data have a stake in the data being incorrect, the data will be incorrect.
- Lesson 2: Reports may show desired results, but the reports may be highly inaccurate.
21. Directory/Catalog
- The challenge
- Providing short-term benefit without disabling broader long-term information handling solutions.
- Getting data into a warehouse is only half of the process.
22. Security in the data warehouse
- Basic security concepts
- Physical security
- Stand-alone or shared security
- Remote access