Title: Data Management for Decision Support
1. Data Management for Decision Support
- Session - 1
- Prof. Bharat Bhasker
2. Course Outline
- Assignments 50
- Surprises!
- HW
- Papers/ Cases (10 points)
- Final Exam 50
- What's Expected?
- Multiple books (DBMS, data warehousing, computer architecture)
- Research papers (TODS, Computing Surveys)
3. Productivity Paradox
- In 1994, US $464 billion was spent on IT, but ROI didn't measure up:
- Automation of clerical tasks
- Improving efficiency of existing processes
- Gathering and collection of data went up, but there was no significant change in how the technology was deployed to extract value
4. Potential for ROI
- Management of data offers potential ROI
- Ability to focus on business processes, perform complete financial analysis, and make decisions based on an understanding of the entire system
- Ability to organize data for both macro- and micro-level analysis
- Dynamic drill-down through reports generated via EIS
- Quick responses to ad-hoc queries for testing hypotheses
- Correlate current and historical data
- An IDC survey showed the average 3-year return reached 400% for data warehousing!
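The macro- and micro-level analysis, drill-down, and current-versus-historical comparison listed above can be illustrated with a few aggregate queries. This is a minimal sketch only: the `sales` table, its columns, and the figures in it are invented for illustration and do not come from the course material.

```python
import sqlite3

# Hypothetical sales data; table name, columns, and values are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (year INTEGER, region TEXT, product TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1996, "north", "widgets", 100.0),
        (1996, "north", "gears", 50.0),
        (1996, "south", "widgets", 70.0),
        (1997, "north", "widgets", 130.0),
        (1997, "south", "gears", 40.0),
    ],
)

# Macro level: one summary row per region.
by_region = conn.execute(
    "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Drill down into one region: the same measure broken out by product.
north_by_product = conn.execute(
    "SELECT product, SUM(revenue) FROM sales WHERE region = ? "
    "GROUP BY product ORDER BY product",
    ("north",),
).fetchall()

# Current versus historical data: total revenue per year.
by_year = conn.execute(
    "SELECT year, SUM(revenue) FROM sales GROUP BY year ORDER BY year"
).fetchall()

print(by_region)
print(north_by_product)
print(by_year)
```

Each query is an ad-hoc question against the same organized data; drilling down only adds a filter and a finer grouping column.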
5. Why Manage the Accumulated Data
- Information needs of knowledge workers, strategic planners, customers, and vendors can be met from the accumulated data.
- Business perspective:
- Quick and correct decision making using all available data
- Users are domain experts, not software pros
- Data doubles every 18 months, which affects response time and comprehension
- Business intelligence: the new emerging tool
6. Why Manage the Accumulated Data
- Technology perspective:
- The price of processing power (per MIPS) declines, while the power of microprocessors doubles every 18 months.
- The price of digital storage continues to drop
- The environment is heterogeneous in terms of hardware and software
- Legacy systems can be integrated with new applications
- Network bandwidth is increasing; cost is declining
7. Introduction: Shifting Paradigm
[Figure: Users, Networks, Computers]
8. Introduction: Shifting Paradigm
- New Enterprise: BPR-led, paperless office, fully networked, distributed, knowledge-enabled
- Open systems, client/server, distributed computers
- Automated Enterprise: back-office automation, front-office automation
- Proprietary systems on mainframes
- Legacy Enterprise: manual back office, manual front office
9. Problem: Heterogeneous Information Sources
Heterogeneities are everywhere
[Figure: Personal Databases, World Wide Web, Scientific Databases, Digital Libraries]
- Different interfaces
- Different data representations
- Duplicate and inconsistent information
10. Problem: Data Management in Large Enterprises
- Vertical fragmentation of informational systems (vertical stove pipes)
- Result of application (user)-driven development of operational systems
[Figure: stove pipes. Departments: Sales Administration, Finance, Manufacturing, ... Operational systems: Sales Planning, Suppliers, Num. Control, Stock Mngmt, Debt Mngmt, Inventory, ...]
11. Goal: Unified Access to Data
[Figure: Personal Databases, Digital Libraries, Scientific Databases]
- Collects and combines information
- Provides integrated view, uniform user interface
- Supports sharing
12. The Traditional Research Approach
- Query-driven (lazy, on-demand)
[Figure: Clients query an Integration System (with Metadata), which reaches each Source through its own Wrapper]
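The query-driven architecture can be sketched in a few lines. This is an illustration under stated assumptions, not a real integration system: the two in-memory sources, the wrapper functions, and the `Mediator` class are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List

# Hypothetical sources, each with its own data representation.
SALES_SOURCE = [{"cust": "Acme", "amt": 120.0}, {"cust": "Bolt", "amt": 80.0}]
FINANCE_SOURCE = [("ACME", 45.5), ("Core", 10.0)]  # (customer, amount) tuples


def sales_wrapper(customer: str) -> List[dict]:
    """Wrapper: query the sales source and normalise its records."""
    return [
        {"customer": r["cust"].lower(), "amount": r["amt"], "source": "sales"}
        for r in SALES_SOURCE
        if r["cust"].lower() == customer.lower()
    ]


def finance_wrapper(customer: str) -> List[dict]:
    """Wrapper: query the finance source and normalise its records."""
    return [
        {"customer": name.lower(), "amount": amt, "source": "finance"}
        for name, amt in FINANCE_SOURCE
        if name.lower() == customer.lower()
    ]


class Mediator:
    """Integration system: uses metadata to fan a query out at query time."""

    def __init__(self, metadata: Dict[str, Callable[[str], List[dict]]]):
        self.metadata = metadata  # maps source name -> wrapper

    def query(self, customer: str) -> List[dict]:
        results: List[dict] = []
        for wrapper in self.metadata.values():
            # Every client query contacts every source on demand.
            results.extend(wrapper(customer))
        return results


mediator = Mediator({"sales": sales_wrapper, "finance": finance_wrapper})
rows = mediator.query("acme")
total = sum(r["amount"] for r in rows)
print(rows, total)
```

Nothing is stored centrally: each call to `query` pays the cost of reaching and translating every source, so the answer is current but only as fast as the slowest source.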
13. Disadvantages of Query-Driven Approach
- Delay in query processing
- Slow or unavailable information sources
- Complex filtering and integration
- Inefficient and potentially expensive for frequent queries
- Competes with local processing at sources
- Hasn't caught on in industry
14. The Warehousing Approach
- Information integrated in advance
- Stored in the warehouse for direct querying and analysis
[Figure: Clients query the Data Warehouse; an Integration System (with Metadata) populates it through an Extractor/Monitor attached to each Source]
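The warehousing approach can be sketched the same way. In this minimal illustration an in-memory SQLite database stands in for the warehouse; the source lists, the `facts` table, and the function names are assumptions made for the example. The monitor's change detection is reduced to a full reload for brevity.

```python
import sqlite3

# Hypothetical operational sources (stand-ins for OLTP systems).
sales_source = [("acme", 120.0), ("bolt", 80.0)]
finance_source = [("acme", 45.5)]


def extract(source_name, rows):
    """Extractor: copy rows out of a source, tagged with their origin."""
    return [(source_name, customer, amount) for customer, amount in rows]


def load_warehouse(conn):
    """Integrate all sources in advance into a single warehouse table."""
    conn.execute("DROP TABLE IF EXISTS facts")
    conn.execute("CREATE TABLE facts (source TEXT, customer TEXT, amount REAL)")
    for name, rows in (("sales", sales_source), ("finance", finance_source)):
        conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", extract(name, rows))
    conn.commit()


def total_for(conn, customer):
    """Client query: answered from the warehouse; no source is contacted."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM facts WHERE customer = ?",
        (customer,),
    ).fetchone()
    return row[0]


warehouse = sqlite3.connect(":memory:")
load_warehouse(warehouse)
before = total_for(warehouse, "acme")

# A source changes after the load; the warehouse does not see it yet.
sales_source.append(("acme", 30.0))
stale = total_for(warehouse, "acme")

load_warehouse(warehouse)  # periodic refresh picks up the change
fresh = total_for(warehouse, "acme")
print(before, stale, fresh)
```

Queries run against the local copy, so they are fast and do not load the sources, but the result stays stale until the next refresh.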
15. Advantages of Warehousing Approach
- High query performance
- But not necessarily most current information
- Doesn't interfere with local processing at sources
- Complex queries at warehouse
- OLTP at information sources
- Information copied at warehouse
- Can modify, annotate, summarize, restructure, etc.
- Can store historical information
- Security, no auditing
- Has caught on in industry
16. Not Either-Or Decision
- Query-driven approach still better for:
- Rapidly changing information
- Rapidly changing information sources
- Truly vast amounts of data from large numbers of sources
- Clients with unpredictable needs
17. What Is a Data Warehouse? A Practitioner's Viewpoint
- "A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use it in a business context."
- -- Barry Devlin, IBM Consultant