Knowledge-Driven Business Intelligence Systems: Part II

1
Knowledge-Driven Business Intelligence Systems
Part II
  • Week 11
  • Dr. Jocelyn San Pedro
  • School of Information Management and Systems
  • Monash University

2
Lecture Outline
  • Data Mining Technologies
  • Neural Networks
  • Genetic Algorithms
  • Fuzzy Logic
  • Decision Trees
  • Data Visualisation

3
Learning Objectives
  • At the end of this lecture, students will:
  • Gain some understanding of data mining
    technologies (decision trees, neural networks,
    genetic algorithms, and fuzzy logic) that are
    commonly used in data mining techniques
  • Preview some visualisation tools and gain an
    understanding of how they support business
    decision making

4
Data Mining Technologies
  • 1960s: classical statistical analysis
  • Correlation, regression, chi-square,
    cross-tabulation
  • 1980s: classical statistical analysis augmented
    by a more powerful set of soft computing techniques
  • Neural networks, genetic algorithms, fuzzy logic,
    decision trees

5
Soft Computing
  • Emerging discipline that combines computational
    methods for inexact, approximate reasoning
  • Simulating the brain's way of solving problems:
    neural networks
  • Evolving solutions: genetic algorithms
  • Dealing with logical ambiguity: fuzzy logic
  • Representing the effect of each event, or decision,
    on successive events: decision trees

6
Neural Networks
  • Attempt to mirror the way the human brain works in
    recognizing patterns by developing mathematical
    structures with the ability to learn (Marakas,
    2002)
  • Attempt to learn patterns from data directly,
    by sifting data repeatedly, searching for
    relationships, automatically building models, and
    correcting the model's own mistakes over and over
    again (Dhar and Stein, 1997)
  • Good at modelling poorly understood problems for
    which sufficient data can be collected

7
Artificial Neural Nets (ANNs)
  • Simple computer programs that build models from
    data by trial and error
  • Learning from experience:
  • Present a piece of data to a neural network
  • The net predicts an output
  • The net compares its guess to the actual correct
    value (also presented to the network)
  • If the ANN's guess is right, the net does nothing
  • If the ANN's guess is wrong, the net works out how
    to adjust some internal parameters so that it can
    make a better prediction if it sees similar data
    again in future
  • Over time, the ANN begins to converge on a fairly
    accurate model of the process
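The learning-from-experience loop above can be sketched with a single artificial neuron. This is a minimal illustration only, assuming a perceptron-style update rule and a toy data set (the logical OR of two inputs); the function names, learning rate and number of passes are hypothetical choices, not taken from the lecture or from any particular tool.

```python
def predict(weights, bias, inputs):
    # The net predicts an output: 1 if the weighted sum is positive, else 0.
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(examples, epochs=20, rate=0.1):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            # Compare the guess to the actual correct value.
            error = target - predict(weights, bias, inputs)
            if error == 0:
                continue  # guess was right: the net does nothing
            # Guess was wrong: adjust the internal parameters.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Toy data (assumption): the logical OR of two binary inputs.
examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(examples)
predictions = [predict(weights, bias, inputs) for inputs, _ in examples]
print(predictions)
```

After a few passes over the data the neuron stops making mistakes on these four cases, which is the "converging on a model" behaviour described in the last bullet.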

8
Artificial Neural Nets (ANNs)
  • Network topology: the number of layers, the number
    of units in each layer, and the way in which the
    units are connected together.
  • 3 basic layers
  • The input layer receives the data
  • The internal or hidden layer processes the data.
  • The output layer relays the final result of the
    net.
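To make the three layers concrete, here is a minimal sketch of data flowing from the input layer, through one hidden layer, to the output layer. The topology (3 inputs, 2 hidden units, 1 output), the weight values and the sigmoid activation are all assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    # One row of weights per unit; each unit sums its weighted inputs.
    return [sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]


# Hypothetical topology: 3 input units -> 2 hidden units -> 1 output unit.
hidden_weights = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]


def forward(data):
    hidden = layer(data, hidden_weights, hidden_biases)   # hidden layer processes the data
    return layer(hidden, output_weights, output_biases)   # output layer relays the result


guess = forward([1.0, 0.5, -1.0])  # the input layer receives the data
print(guess)
```

Changing the topology here only means changing the shapes of the weight lists; training would then adjust the numbers inside them.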

[Figure: three-layer network showing the Input Layer (data input), the Hidden Layer (processing) and the Output Layer (guesses)]
Marakas, G.M. (2002) Decision support systems in
the 21st Century. 2nd Ed, Prentice Hall
9
Artificial Neural Nets (ANNs)
Training the ANN - adjusting neural network
weights. During training the network analyses the
data you have provided and changes weights
between network units to reflect dependencies
found in your data.
Marakas, G.M. (2002) Decision support systems in
the 21st Century. 2nd Ed, Prentice Hall
10
Artificial Neural Nets (ANNs)
  • Testing is a process of estimating quality of the
    trained neural network. During this process a
    part of data that wasn't used during training is
    presented to the trained network case by case.
    Then forecasting error is measured on each case
    and used as the estimation of network quality.
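The testing step can be sketched as follows. To keep the example short, the "trained model" here is a straight line fitted by least squares rather than a neural network, and the data values are invented; both are assumptions. The part that matters is the procedure: hold back some cases, present them one by one, and measure the forecasting error on each.

```python
def fit_line(points):
    # Stand-in for training: ordinary least-squares fit of y = slope * x + intercept.
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return slope, mean_y - slope * mean_x


# Hypothetical data: (week, sales) pairs.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8),
        (5, 10.1), (6, 12.2), (7, 13.8), (8, 16.1)]

# The last two cases are withheld from training and used only for testing.
train_cases, test_cases = data[:6], data[6:]
slope, intercept = fit_line(train_cases)

# Forecasting error is measured on each withheld case...
errors = [abs(y - (slope * x + intercept)) for x, y in test_cases]
# ...and summarised as an estimate of model quality.
mean_error = sum(errors) / len(errors)
print(errors, mean_error)
```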

Preparing the ANN in Alyuda Forecaster
www.alyuda.com
11
Artificial Neural Nets (ANNs)
  • Effective in problems of image recognition
  • Not well suited to, say, financial or serious
    medical applications:
  • Highly intricate systems that include dozens of
    neurons with a couple of hundred connections
    between them
  • Forecasting models represented by a trained
    neural network are not transparent
  • Knowledge expressed as the weights of a couple of
    hundred inter-neuron connections cannot be
    analysed and interpreted by a human.
  • Despite these difficulties, neural networks are
    actively used (with varying success) in different
    financial applications in the majority of
    developed countries.

12
ANN Applications: Alyuda Forecaster
  • Credit approval - determine the risk of granting a
    loan to an applicant
  • Classify each applicant as either LOW risk or HIGH risk
  • Guide decisions on granting or denying new loans
  • Employee retention - identify employees
    who are likely to stay with the organization
    during the next year, based on the previous year's data
  • Classify each employee's retention probability as LOW
    or HIGH
  • Identify employees who intend to leave and take
    appropriate measures to retain them.

www.alyuda.com
13
ANN Applications: Alyuda Forecaster
14
ANN Applications: Alyuda Forecaster
  • Gas consumption - forecast gas consumption by a
    power plant.
  • Sales forecasting - forecast weekly sales of a
    small restaurant chain using historical data
    over a 109-week period
  • Stock prediction - forecast the percentage change
    in the Close price for Chevron Corp 4 days in
    advance

www.alyuda.com
15
Data Mining Technologies
  • Genetic Algorithms
  • Recognise a good solution, spread some of that
    solution's features into a population of
    competing solutions, and breed good solutions
  • Powerful technique for solving various
    combinatorial or optimisation problems
  • Sample genetic algorithm online demos:
  • http://math.hws.edu/xJava/GA/

16
Genetic Algorithm
  • First, a population of possible solutions to a
    problem is developed.
  • Next, the better solutions are recombined with
    each other to form some new solutions.
  • Finally, the new solutions are used to replace the
    poorer of the original solutions and the process
    is repeated.
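The three steps above can be sketched in a few lines. This is an illustrative toy only: solutions are assumed to be lists of 0s and 1s, the fitness function (count the 1s) is a stand-in for a real objective, and the population size, number of generations and random seed are arbitrary.

```python
import random


def fitness(solution):
    # Toy objective (assumption): the more 1s, the better the solution.
    return sum(solution)


def evolve(population, rng, generations=40):
    population = list(population)
    size, length = len(population), len(population[0])
    for _ in range(generations):
        # Rank the solutions and keep the better half.
        population.sort(key=fitness, reverse=True)
        better = population[: size // 2]
        # Recombine pairs of better solutions to form new solutions.
        children = []
        while len(better) + len(children) < size:
            first, second = rng.sample(better, 2)
            cut = rng.randrange(1, length)
            children.append(first[:cut] + second[cut:])
        # The new solutions replace the poorer half; then repeat.
        population = better + children
    return max(population, key=fitness)


rng = random.Random(0)
# Step 1: develop a population of possible solutions.
start = [[rng.randint(0, 1) for _ in range(16)] for _ in range(20)]
best = evolve(start, rng)
print(fitness(best), "out of", len(best))
```

Because the better half always survives, the best solution found can never get worse from one generation to the next.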

17
Genetic Algorithm - Example
  • Selecting a fixed number of the market parameters
    that influence market performance the most
  • The names of these parameters comprise a descriptive
    set, or a set of chromosomes, determining the qualities
    of an "organism" - a solution of the problem
  • The values of the parameters determining a solution
    correspond to genes
  • A search for the optimal solution is then similar
    to the process of evolution of a population of
    organisms, where each organism is represented by
    a set of its chromosomes.
  • http://www.megaputer.com/dm/systems.php3stat_package
18
Genetic Algorithm - Example
  • The evolution of a population of
    organisms is driven by three mechanisms:
  • Selection of the strongest (survival of the
    fittest) - keeping those sets of chromosomes that
    characterise the best solutions
  • Cross-breeding - production of new organisms by
    mixing the chromosome sets of the parents
  • Mutation - accidental changes of genes in some
    organisms of the population.
  • After a number of new generations have been built
    with these mechanisms, one obtains a
    solution that cannot be improved any further.
    This solution is taken as the final one.
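The three mechanisms can be written as three small functions. In this sketch each organism is a fixed-size set of parameter names, following the market-parameter example; the parameter names, their influence scores, the mutation rate and the population size are all hypothetical values invented for illustration.

```python
import random

# Hypothetical market parameters and a made-up influence score for each.
INFLUENCE = {"price": 5, "volume": 4, "rate": 3,
             "index": 2, "sentiment": 1, "season": 0}
PARAMETERS = sorted(INFLUENCE)


def score(organism):
    return sum(INFLUENCE[name] for name in organism)


def select(population, keep):
    # Selection: only the strongest organisms survive.
    return sorted(population, key=score, reverse=True)[:keep]


def crossbreed(mother, father, rng):
    # Cross-breeding: the child draws its parameters from both parents.
    pool = sorted(mother | father)
    return frozenset(rng.sample(pool, len(mother)))


def mutate(organism, rng, rate=0.1):
    # Mutation: occasionally swap one parameter for an unused one.
    if rng.random() >= rate:
        return organism
    removed = rng.choice(sorted(organism))
    added = rng.choice(sorted(set(PARAMETERS) - organism))
    return (organism - {removed}) | {added}


rng = random.Random(1)
population = [frozenset(rng.sample(PARAMETERS, 3)) for _ in range(8)]
initial_best = max(score(o) for o in population)

for _ in range(30):
    parents = select(population, keep=4)
    children = []
    for _ in range(4):
        mother, father = rng.sample(parents, 2)
        children.append(mutate(crossbreed(mother, father, rng), rng))
    population = parents + children

best = max(population, key=score)
print(sorted(best), score(best))
```

Real applications would replace the made-up score with a measure of how well the chosen parameters explain market performance, which is exactly the part the next slide says needs a specialist.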

http://www.megaputer.com/dm/systems.php3stat_package
19
Genetic Algorithms- Weak Points
  • First, the very way the problem is formulated
    deprives one of any opportunity to estimate the
    statistical significance of the obtained solution.
  • Second, only a specialist can develop a
    criterion for chromosome selection and
    formulate the problem effectively.
  • Thus genetic algorithms should at present be
    considered more an instrument for scientific
    research than a tool for generic
    practical data analysis, for instance, in finance.

http://www.megaputer.com/dm/systems.php3stat_package
20
Fuzzy Logic
  • Our language is full of vague and imprecise
    concepts, and allows for conveyance of meaning
    through semantic approximations
  • These approximations are useful to humans, but do
    not readily lend themselves to the rule-based
    reasoning done on computers.
  • Fuzzy logic is how computers handle this
    ambiguity
  • Allows for partial or fuzzy description of rules

Marakas, G.M. (2002) Decision support systems in
the 21st Century. 2nd Ed, Prentice Hall
21
The Basics of Fuzzy Logic
  • In a crisp rule, the result is either false (0)
    or true (1) and can be stored in a binary
    fashion.
  • In a fuzzy rule, the result ranges from 0
    (absolutely false) to 1 (absolutely true), with
    stops in between.
  • For example: absolutely false, slightly false,
    slightly true, absolutely true
  • Or: slightly similar, similar, very similar
  • These operations use functions that assign a
    degree of membership in a set.
  • For example, the degree of similarity of current
    data to historical data is 0.75

Marakas, G.M. (2002) Decision support systems in
the 21st Century. 2nd Ed, Prentice Hall
22
Membership Function Example
  • The Tallness function takes a person's height
    and converts it to a numerical scale from 0 to 1.
  • Here the statement "He is tall" is absolutely
    false for heights below 5 feet and absolutely
    true for heights above 7 feet
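A membership function of this kind is easy to write down. The sketch below uses the two anchor points from the slide (5 feet and 7 feet); the straight-line ramp between them is an assumption, since the slide text does not say what shape the curve takes in between.

```python
def tallness(height_feet):
    """Degree of membership in the fuzzy set TALL, from 0 to 1."""
    if height_feet <= 5.0:
        return 0.0  # "He is tall" is absolutely false
    if height_feet >= 7.0:
        return 1.0  # "He is tall" is absolutely true
    # Assumed shape: a linear ramp between the two anchor heights.
    return (height_feet - 5.0) / 2.0


for height in (4.5, 5.5, 6.0, 7.5):
    print(height, tallness(height))
```

Under this assumed ramp, a height of 6 feet gets a tallness of 0.5: neither clearly tall nor clearly not tall.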

Marakas, G.M. (2002) Decision support systems in
the 21st Century. 2nd Ed, Prentice Hall
23
Inferencing using Fuzzy Rules
  • Example
  • Well, if you've got a high-margin,
    price-sensitive product, promoting that product
    via ads, displays, etc. is likely to have a high
    impact on sales volume. If the volume impact is
    high, it's a good candidate for allocation of
    promotion dollars.
  • But you also want to promote products more
    heavily when they're relatively new in order to
    increase market awareness and to establish market
    share

Dhar, V. and Stein, R. (1997)
24
Inferencing using Fuzzy Rules
  • One fuzzy rule: if a product is new, then a client
    should spend more money promoting it

Dhar, V. and Stein, R. (1997)
25
Inferencing using Fuzzy Rules
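[Figure: membership function for the fuzzy set NEW. Vertical axis: μ, the degree of membership in NEW (marked values 1, 0.3, 0). Horizontal axis: days since product was introduced (marked values 0, 235, 365).]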
Dhar, V. and Stein, R. (1997)
26
Inferencing using Fuzzy Rules
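Promotion expense that is 2% of sales is
absolutely LOW.
The degree of Lowness of promotion expense that
is 2.9% of sales is 0.75.
[Figure: membership functions Low, Medium and High for PROMOTION. Vertical axis: degree of membership (marked values 1, 0.75, 0). Horizontal axis: expense as a percentage of sales (marked values 0, 3, 5, 8, 15).]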
Dhar, V. and Stein, R. (1997)
27
Inferencing using Fuzzy Rules
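Price sensitivity (ratio of change in volume
per change in price)
Price sensitivity is 0.4 LOW or 0.1 MEDIUM.
Take the max value (fuzzy set union): price
sensitivity is 0.4 LOW.
[Figure: membership functions Low, Medium and High for price sensitivity. Vertical axis: degree of membership (marked values 1, 0.4, 0.1, 0). Horizontal axis: input (marked values 0 to 5).]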
Dhar, V. and Stein, R. (1997)
28
Inferencing using Fuzzy Rules
  • Other fuzzy rules
  • If product is NEW, then a client should spend
    MORE money promoting it
  • If the price sensitivity of product is LOW, then
    promotion should be LOW
  • If the price sensitivity of product is MEDIUM,
    then promotion should be MEDIUM
  • If the price sensitivity of product is HIGH, then
    promotion should be HIGH
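The price-sensitivity rules above can be sketched as a small rule table. The membership degrees (0.4 LOW, 0.1 MEDIUM) are the ones from the price-sensitivity slide; the HIGH degree of 0.0, the function name and the choice to let each rule fire with the strength of its antecedent are assumptions made for illustration. The NEW rule is left out because its output (MORE) uses a different vocabulary.

```python
# Each rule maps a price-sensitivity set to a promotion level.
RULES = [("LOW", "LOW"), ("MEDIUM", "MEDIUM"), ("HIGH", "HIGH")]


def fire_rules(sensitivity):
    # A rule fires with the membership degree of its "if" part.
    strengths = {}
    for antecedent, consequent in RULES:
        degree = sensitivity.get(antecedent, 0.0)
        # If several rules reach the same conclusion, keep the max (fuzzy union).
        strengths[consequent] = max(strengths.get(consequent, 0.0), degree)
    return strengths


# Degrees of membership for one product's price sensitivity.
sensitivity = {"LOW": 0.4, "MEDIUM": 0.1, "HIGH": 0.0}
strengths = fire_rules(sensitivity)

# Take the conclusion with the largest strength.
recommendation = max(strengths, key=strengths.get)
print(strengths, recommendation)
```

For this product the LOW rule is the strongest, so promotion should be LOW, with strength 0.4.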

Dhar, V. and Stein, R. (1997)
29
Fuzzy Systems
  • Some Advantages
  • Good at dealing with qualitative data, as well
    as object attributes
  • Offer an attractive trade-off between accuracy
    and compactness: relationships are expressed in
    terms of simple rules
  • Not computationally expensive compared to
    crisp rule-based systems

30
Fuzzy Systems
  • Some Disadvantages
  • Saturation of fuzzy sets: fuzzy sets get so full
    of inferences that the consequent fuzzy regions
    are overloaded, so the system loses the information
    provided by the fuzzy rules
  • Need domain expertise to set up fuzzy sets
  • Only provide an approximation to human reasoning

31
Notes on Decision Trees
  • CART: Classification and Regression Trees
  • Most common decision tree tool for statistical
    analysis and data mining
  • Automatically searches for and finds high
    performance classification and prediction models
  • Key elements are a set of rules for:
  • splitting each node in a tree
  • deciding when a tree is complete, and
  • assigning each terminal node to a class outcome
    (or predicted value for regression)
  • More info and software demo at
    http://www.salford-systems.com/
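The first key element, the rule for splitting a node, can be sketched as a search for the threshold that leaves the two child nodes as pure as possible. This sketch measures purity with Gini impurity, a common choice for classification trees; the data (applicant income in thousands against credit risk class) and the function names are hypothetical, and this is not the Salford Systems implementation.

```python
def gini(labels):
    # Gini impurity: 0.0 when every case in the node has the same class.
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows):
    # rows: (value, label) pairs; try a threshold between each pair of
    # neighbouring values and keep the one with the lowest impurity.
    best = None
    values = sorted({value for value, _ in rows})
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [label for value, label in rows if value <= threshold]
        right = [label for value, label in rows if value > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical data: applicant income (thousands) and credit risk class.
rows = [(20, "HIGH"), (25, "HIGH"), (30, "HIGH"),
        (45, "LOW"), (50, "LOW"), (60, "LOW")]
threshold, impurity = best_split(rows)
print(threshold, impurity)
```

A full tree repeats this search inside each child node until a stopping rule says the tree is complete, and then labels each terminal node with its majority class.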

32
Data Visualisation
  • For any kind of high dimensional data set,
    displaying predictive relationships is a
    challenge.

http://www.sapdesignguild.org/editions/edition2/info_zoom.asp
33
Human Visual Perception and Data Visualisation
  • Data visualisation is so powerful because the
    human visual cortex converts objects into
    information so quickly.
  • The next three slides show (1) usage of global
    private networks, (2) flow through natural gas
    pipelines, and (3) a risk analysis report that
    permits the user to draw an interactive yield
    curve.
  • All three use height or shading to add additional
    dimensions to the figure.

Marakas, G.M. (2002) Decision support systems in
the 21st Century. 2nd Ed, Prentice Hall
34
Global Private Network Activity
High Activity
Low Activity
Marakas, G.M. (2002) Decision support systems in
the 21st Century. 2nd Ed, Prentice Hall
35
Natural Gas Pipeline Analysis
Note: Height shows total flow through compressor
stations.
Marakas, G.M. (2002) Decision support systems in
the 21st Century. 2nd Ed, Prentice Hall
36
An Enlivened Risk Analysis Report
Marakas, G.M. (2002) Decision support systems in
the 21st Century. 2nd Ed, Prentice Hall
37
Telephone Polling Results
Note: On the live map, clicking on an area
allows the user to drill down and see results
for smaller areas.
Marakas, G.M. (2002) Decision support systems in
the 21st Century. 2nd Ed, Prentice Hall
38
References
  • Dhar, V. and Stein, R. (1997) Intelligent
    Decision Support Methods: The Science of
    Knowledge Work, Prentice Hall.
  • Dhar, V. and Stein, R. (1997) Seven Methods for
    Transforming Corporate Data into Business
    Intelligence.
  • Marakas, G.M. (2002) Decision support systems in
    the 21st Century. 2nd Ed, Prentice Hall (or
    other editions)
  • Power, D. (2002) Decision Support Systems:
    Concepts and Resources for Managers, Quorum
    Books.
  • Good online resource on fuzzy sets and operations:
    http://www.doc.ic.ac.uk/nd/surprise_96/journal/vol4/sbaa/report.fuzzysets.html

39
  • Questions?
  • Jocelyn.sanpedro_at_sims.monash.edu.au
  • School of Information Management and Systems,
    Monash University
  • T1.28, T Block, Caulfield Campus
  • 9903 2735