Business Intelligence - PowerPoint PPT Presentation

1 / 104
About This Presentation
Title:

Business Intelligence

Description:

De praktijk van een consultant iha en bij Cap Gemini Ernst & Young ... Een druk op de knop... 19. Van chaos... Naar structuur. 20. Why now? Hype? Developments: ... – PowerPoint PPT presentation

Number of Views:279
Avg rating:3.0/5.0
Slides: 105
Provided by: tomafrs
Category:

less

Transcript and Presenter's Notes

Title: Business Intelligence


1
Business IntelligenceData Warehousing
  • Tom A. Fürstenberg
  • Business Intelligence Consultant
  • Cap Gemini Ernst Young

2
Leerdoelen college
  • Wat is BI DWH? (Conceptueel en Technisch)
  • Toepassing van BI DWH
  • De praktijk van een consultant iha en bij Cap
    Gemini Ernst Young ihb

3
Inhoud College
  • Performance Management
  • Business Intelligence (Performance Measurement)
  • OLAP
  • Extranets
  • Architectuur
  • Data Warehouse
  • ETL
  • Multidimensioneel Modelleren
  • CGEY Aanpak
  • Data Mining

4
Performance Management
  • Doelgericht meten en bijsturen van
    bedrijfsdoelstellingen

5
In control of a company
6
Overview
7
Besturings visie Bouwstenenvoor besturing van
organisaties
Organisatie
Wie?
Systemen
Methoden
Hoe?
Doelstellingen Prestatie indicatoren
Wat?
Strategie Missie
Waarden normen
Stake holders
Waarom ?
8
Methoden
Diverse Financiële modellen
Balanced Scorecard
INK managementmodel
9
Naar een operationeel Besturingsmodel
10
Naar een operationeel Besturingsmodel
11
Van Model naar Gedragsverandering
Multi-dimensionale Gegevensstructuur
Verantwoordelijkheden en Bevoegdheden
Operationeel Besturingsmodel
Informatievoorziening
Planning en Commitment
Beoordeling en Sturing
12
Some Typical Mgt. Questions
PRODUCT
CUSTOMER
  • How much have we sold?
  • Which product gives the best profit?
  • Which product has the largest sales
  • volume this quarter?
  • Which product best meets market
  • needs?
  • How much to produce of each product?
  • Who is the most profitable customer?
  • What is the satisfaction level?
  • Which are the best segments?
  • Which service to improve?
  • How many customers have we lost last year?
  • Who are our biggest accounts?

CHANNEL
MARKETING
  • Which retailer yields most by
  • volume and which by profit?
  • What promotions will yield most profit?
  • What effect will discounts have on
  • the turnover?
  • What are the area coverage levels?
  • How many contacted people became
  • a customer?
  • Promotions results?
  • What is the competition doing?

13
Key Performance Indicators Top 10
Source Results FIND! The Best
benchmarkstudy conducted in 1997/1998 by Ernst
Young Consulting and VU. 103 industrial companies
participated in the study.
14
En nu alleen nog even metenBusiness
Intelligence (performance measurement)
15
The Answers
The information is there, but spread everywhere!
16
De praktijk...
17
Problemen
  • (Over)belasting IT-afdeling (queries)
  • Lange doorlooptijd rapport-fabricage
  • Hoge kosten aan manuren
  • Databronnen moeilijk integreerbaar
  • Niet-gestandaardiseerde rapporten
  • Geen eenduidige definities
  • Foutgevoelig
  • Manipuleerbaar
  • Afhankelijkheid van schakels
  • Discussies over verschillen in cijfers
  • Beperkte analyse-mogelijkheden
  • Verkeerde en te late interpretaties, conclusies,
    beslissingen
  • ...

18
Een druk op de knop...
19
Van chaos...
20
Why now? Hype? Developments
Market Pull
  • Globalisation of markets
  • Individualisation of customers
  • Shorter life cycle of products
  • Information overload
  • Mergers

Technology Push
  • Faster hardware
  • Cheaper disk capacity
  • Modern OLAP-tools
  • Any access c/s, web, mobile

21
OnLine Analytical Processing
  • Gebaseerd op de syntax van management-informatie
    vragen
  • ltmeetwaardegt per ltdim1gt per ltdim2gt per ...
  • KPIs, CSIs en OIs zijn meetwaarden
  • Produkt, Regio, Klant, Tijd, etc. zijn dimensies
    (slice dice)
  • Dimensies kennen hierachiën (drill down)

22
OLAP
23
Introduction to Cubes
24
Demo
  • eFashion Case
  • BusinessObjects Demo

25
BusinessObjects Semantic Layer
26
Any Access
27
Info- Analysis-need at 3rd parties
28
e/m-Business Intelligence Extranets
29
Extranet demos
30
Business IntelligenceTheory
31
BI Definition
  • Business Intelligence is the process of
    collection, cleansing, combining, consolidation,
    analysis, interpretation and communication of all
    internal and available external data, relevant
    for the decision making process in the
    organisation

32
BI Concept
Decisions
Action
Feedback
Knowledge
Analysis
Business Value
Information
Integration
Data
Collection
33
BI Systems
Reporting Query
DSS, MIS and EIS
OLAP
Data Mining
34
The Five Functional Levels
standard reports
bunch of reports, cube
unique report or question
i.e. finding variables
i.e stat. analysis, testing a hypothesis
35
The Five Functional Levels
Number of users
80 of all users
reporting
Static/ Dynamic
analysis
Complexity of the question
querying
exploring
mining
interactief
36
Corporate Information Factory
Any Source
Any Access
Any Data
LAN/WAN
Q U E R Y M A N A G E M E N T
L O A D M A N A G E M E N T
Data Marts
Applications
External data
Data Warehouse
WWW
Operational Data Store
37
Components of the CIF
  • Data Warehouse
  • Data Mart
  • Operational Data Store
  • ETL

38
Data Warehouse
39
Definition Bill Inmon
  • Characteristics of a data warehouse
  • Subject-oriented
  • Integrated
  • Time-variant
  • Non-volatile
  • Both summary and detailed data

40
Data Warehouse
  • Contains data that can be used to meet the
    information of (part of) the organisation
  • Contains integrated data extracted from one or
    more sources
  • Mostly contains large amounts of data
  • Contains data that is clean and consistent
  • May contain aggregated data
  • Optimised for its use

41
Data Warehouse
Data Base
Data Warehouse
Actual
Historical
Internal and External
Internal
Integrated
Isolated
Transactions
Analysis
Normalised
Dimensional
Dirty
Clean and Consistent
Detailed and Summary
Detailed
42
Data Warehouse
  • Advantages
  • One point of contact
  • Time savings
  • No loss of historical data
  • OLTPs not hampered by BI activities
  • Better consistency and quality of data
  • Improvement of Business Intelligence

43
Data Warehouse
  • Disadvantages
  • Never quite up-to-date
  • Requires a lot of storage space
  • Requires a lot of communication, coordination and
    cooperation
  • Large impact on the organisation
  • A data warehouse is only the beginning

44
Data Mart
  • DW design does not optimise query performance
  • Data is not stored in an optimal fashion for any
    given department in the DW
  • Competition to get the resources required to get
    inside the DW
  • Costs for DSS computing facilities are high
    because of the large volume in DW

45
Data Mart
  • Characteristics
  • Customised for a specific department
  • Limited amount of history
  • Summarised
  • Very flexible
  • Elegant presentation
  • Processor dedicated to the department

46
Data Mart
  • Divided by
  • Business
  • Geography
  • Security
  • Political (budget)
  • Structure (data mining)

47
Data Mart
  • Three different kinds of data marts
  • Subset/summary
  • MOLAP
  • ROLAP

48
Operational Data Store
  • Characteristics
  • Subject-oriented
  • Integrated
  • Current-valued
  • Volatile
  • Detailed data

49
ETL Extraction
  • Source selection
  • Data model is starting point determine data
    elements that are needed
  • For each data element, determine available data
    sources
  • If more han 1 source available, select on
  • Quality, reliability and integrity
  • Scope of data
  • Location and availability of data
  • Location and availability of expertise

50
ETL Transformation
  • Processing
  • Aggregate records
  • Encoding structures
  • Simple reformatting
  • Mathematical conversion
  • Resequencing of data
  • Default values
  • Key conversion
  • Cleansing

51
ETL Transformation
Key transformation
Key structure A
Key structure B
Key structure A
Key structure C
Key structure A
Key structure B
New key structure
Key structure C
52
ETL Cleansing
  • Data quality is critical for
  • Marketing communications
  • Targeted marketing
  • Customer matching
  • Retail- and commercial householding
  • Combining information
  • Tracking retail sales

53
ETL Cleansing
  • Common excuses for not cleaning
  • The data in the operational systems seem to work
    just fine
  • Data can be joined most of the time
  • Cleansing will take place after population of the
    data warehouse
  • Data entry will be improved
  • The users will never agree to change their data

54
Multi DimensionalData Modeling
55
MD Modeling Contents
  • E/R Modeling (Ex.)
  • MD Modeling (Ex.)
  • Star Schema
  • Slowly Changing Dimensions (Ex.)
  • Surrogate Keys
  • Aggregation (Ex.)
  • Measures Dimensions reviewed
  • Other important MDM aspects

56
Exercise E/R Modeling
  • How could the sales transaction database of the
    eFashion retailer look like?

57
Management Questions
  • Give me the annual revenue of all my product
    lines divided over all the sales regions over the
    last 3 years
  • Give me the top 10 of most profitable products
    this year
  • Give me the top 10 of most sold products of last
    year
  • Give me the top 10 of most profitable customers
  • Compare the YTD revenue with the one in the same
    period last year and the target

58
Why not E/R Modeling?
  • End users cannot understand, remember, navigate
    an E/R model (not even with a GUI)
  • Software cannot usefully query an E/R model
  • Use of E/R modeling doesnt meet the DW purpose
    intuitive and high performance querying

59
Exercise Model the Efashion DM
  • Sales Revenue
  • Time hierarchy (Year-Quarter-Month)
  • Store hierarchy (Region, State, City, Store)
  • Product hierarchy (Line, Category, SKU)

60
eFashion Data Mart
61
DW Modeling Components
62
Using a Star Schema
63
Components of a Star Schema
EmployeeKey
TimeKey
ProductKey
TimeKey
EmployeeKey
Dimensional Keys
ProductKey
CustomerKey
ShipperKey
CustomerKey
ShipperKey
64
Exercise Slowly Changing Dimensions
  • Suppose the product categories change from time
    to time.
  • Model the Data Mart when the manager wants to see
    historical reports against
  • 1. The present categories
  • 2. The categories at the time of the sale
  • 3. Both against the present categories and the
    immediate previous categories
  • 4. The categories at any specified time

65
SCD Exercise 1
66
SCD Exercise 2
Most Recent Product Key Map
Product_key SKU_nr
67
SCD Exercise 3
Time Month_nr Month_desc Quarter
Year
Facts Month_nr Store_nr Pr
oduct_key Sales_revenue ...
Product Product_key SKU_nr SKU_desc Cate
gory Category_old Line
Geography Store_nr Store_name City State Reg
ion
68
SCD Exercise 4
Time Month_nr Month_desc Quarter
Year
Facts Month_nr Store_nr SK
U_nr Sales_revenue ...
Product SKU_nr SKU_desc Category Line Va
lid_from Valid_until
Geography Store_nr Store_name City State Reg
ion
69
Slowly Changing Dimensions
  • Type 1 Overwrite the dimension record
  • Type 2 Create new dimension record
  • Type 3 Create an old field in the dimension
    record
  • Type 4 Add a valid_from and valid_until field in
    the dimension record

Ad. Type 2 requires surrogate keys, but in
general, one should always use these because of
performance and flexibility Ad. Type 4 Kimball
only recognizes 3 types SCDs
70
Always Use Surrogate Keys
  • Allows DWH to assign new key versions for SCDs
    (type 2)
  • Higher performance with numeric keys than with
    long, alphanumeric keys

71
Exercise Aggregation
  • Suppose the manager queries frequently on product
    line level and finds the performance too low.
  • Question How to model the data mart when we want
    to add aggregated measures on product line level?

72
Exercise Aggregation
Time Month_nr Month_desc Quarter
Year
Facts Month_nr Store_nr Pr
oduct_key Sales_revenue ...
Product Product_key SKU_nr SKU_desc Cate
gory Line
Geography Store_nr Store_name City State Reg
ion
73
Exercise Measures
Add the following measures to the eFashion Data
Mart
  • Stock Quantity
  • Product Price
  • Promotion Costs (product-specific,
    store-independent)

74
Exercise Measures
Time Month_nr Month_desc Quarter
Year
Facts Month_nr Store_nr Pr
oduct_key Sales_revenue Stock_qty
Product Product_key SKU_nr SKU_desc Pric
e Category Line (Valid_from Valid_until)
Geography Store_nr Store_name City State Reg
ion
75
Measures Dimensions reviewed
The most useful measures are
  • Numeric
  • Additive

Dimensions are
  • The natural entry points of the facts
  • I.e., used for constraints and report breaks
  • Independent of each other, not hierarchically
    related

76
Other Important MDM-Aspects
  • Cardinality
  • Grain
  • Referential Integrity
  • Conformed Dimensions
  • Drill Across
  • Traps

77
How to make the CIF?
Any Source
Any Access
Any Data
LAN/WAN
Q U E R Y M A N A G E M E N T
L O A D M A N A G E M E N T
Data Marts
Applications
External data
Data Warehouse
WWW
Operational Data Store
78
CGEY BI-Approach Overview
Incremental Delivery
Strategy objectives
DW blueprint
Source data
Metamodel
Extraction, Transformation Load
Development
Definition Increments
Implementation
Awareness
Data Warehouse Architecture I
Data Warehouse Architecture II
Evolutionary Strategy
79
(No Transcript)
80
Data Mining
81
Data Mining
  • Definition
  • The process of digging intelligently into large
    volumes of data to discover and analyse
    previously unknown relationships or to validate
    hypotheses.

82
Data Mining Versus OLAP
OLAP/Query Are there some customers from large
accounts with a high decrease in international
calls?
Data Mining Are there any common characteristics
among these customers?
83
Applications
  • Risk Analysis (grant credit, investment)
  • Fraud Detection (telephone charge, bank
    withdrawals)
  • Trouble Shooting and Diagnosis
  • Process Controls (wafer fabrication)
  • Promotion Analysis
  • Bankruptcy Prediction (mortgage lending, business
    partners)
  • Customer Churn (telco)
  • CRM (next slides)

84
Maximizing Customer Value
  • Getting more prospects in
  • Turning prospects into customers
  • Selling more products to existing customers
  • Getting less customers out

85
Which ones in and which ones out?
Yield per individual customer

Yield per customer
Costs per customer
Highest
Lowest
Customer profitability
86
Example One to One Marketing
  • Treat different customers differently
  • differentiate message
  • differentiate product offer
  • differentiate channel
  • Need for usable information gt predict customer
    behavior out of databases

87
(No Transcript)
88
(No Transcript)
89
(No Transcript)
90
(No Transcript)
91
(No Transcript)
92
(No Transcript)
93
Example clickstream analysis
  • What parts of our Web site get the most visitors?
  • What parts of the Web site do we associate most
    frequently with actual sales?
  • What parts of the Web site are superfluous or
    visited infrequently?
  • Which pages on our Web site seem to be "session
    killers," where the remote user stops the session
    and leaves?
  • What is the new-visitor click profile on our
    site?
  • What is the click profile of an existing
    customer? A profitable
    customer? A complaining customer
    that all too frequently returns our
    product?
  • What is the click profile of a customer about to
    cancel our service, complain, or sue us?
  • How can we induce the customer to register with
    our site so we learn
    some useful information about that
    customer?
  • How many visits do unregistered customers
    typically make with us before they are willing to
    register? Before they buy a product or service?

94
Customized Customer Service
95
Example Contact Strategy
Good
Data mining
Customer Data
Bad
96
The customer choses the channel
Operational systems
Analysis
Integration
97
Data Sources for Data Mining
Collecting Cleansing
DATA
  • Transactions (loyalty cards)
  • Behaviour of existing customers
  • Logfiles cookies
  • Market research
  • Data suppliers
  • Public data

98
Example Affinity Grouping
  • Market Basket what items are sold together?
  • Market Basket what categories are sold with what
    items?
  • Market Basket what is not sold with certain
    items?
  • Event Correlations what other services are
    brought in the first month after signing up for a
    satellite TV subscription?

99
Data Mining Techniques
  • Decision Trees, Classification Trees, Rule
    Induction
  • Neural Nets
  • Visualisation
  • Fuzzy Logic Nearest Neighbour Memory Based
    Reasoning Case Based Reasoning
  • Proprietary Logic
  • Classical Statistics

100
Data Mining Techniques
  • ? Statistical analysis
  • ? Neural networks
  • ? Genetic algorithms
  • ? Decision trees
  • ? Intuïtion

?
?
?
Predictive Power
?
?
Simplicity
101
Critical Success Factors
  • Data availability (large amounts of a wide
    variety of data)
  • Data consistency
  • Data quality
  • Domain expertise
  • Data used/needed is allowed by privacy laws

102
Benefits
  • Improved customer relationships
  • More revenue from existing customers
  • Market segmentation
  • Differentiated products and services
  • Differentiated sales channels
  • More effective marketing programs
  • Improved fraud detection
  • Improved investments

103
Decision Tree with BusinessMiner from
BusinessObjects
Demo
104
Contact information
  • Tom A. Fürstenberg
  • Business Intelligence Consultant
  • Cap Gemini Ernst Young
  • Sector Energy, Products Transport
  • Tel 31 6 21 878 915
  • email Tom.Furstenberg_at_CapGemini.nl
Write a Comment
User Comments (0)
About PowerShow.com