VISUALIZATION TECHNIQUES UTILIZING THE SENSITIVITY ANALYSIS OF MODELS

1
VISUALIZATION TECHNIQUES UTILIZING THE SENSITIVITY ANALYSIS OF MODELS
  • Ivo Kondapaneni, Pavel Kordík, Pavel Slavík
  • Department of Computer Science and Engineering,
    Faculty of Electrical Engineering,
  • Czech Technical University in Prague, Czech
    Republic
  • Presenting author: Pavel Kordík
    (kordikp_at_fel.cvut.cz)

2
Overview
  • Motivation
  • Data mining models
  • Visualization based on sensitivity analysis
  • Classification problems
  • Regression problems
  • Definition of interesting plots
  • Genetic search for 2D and 3D plots

3
Motivation
  • Data mining: extracting new, potentially useful
    information from data
  • DM models are generated automatically
  • Are models always credible?
  • Are models comprehensible?
  • How to extract information from models?

Visualization
4
Data mining models
  • Often black-box models generated from data
  • E.g. Neural networks
  • What is inside?

Diagram: input variables → data mining black-box model → output variable(s)
5
Inductive model
  • Estimates output from inputs
  • Generated automatically
  • Evolved by niching GA
  • Grows from minimal form
  • Contains hybrid units
  • Several training methods
  • Ensemble of models

6
Example: Housing data
Input variables: CRIM ZN INDUS NOX RM AGE DIS RAD TAX PTRATIO B LSTA
  • CRIM: per capita crime rate by town
  • AGE: proportion of owner-occupied units built prior to 1940
  • DIS: weighted distances to five Boston employment centres
Output variable: MEDV
  • MEDV: median value of owner-occupied homes in $1000's
7
Housing data records
Input variables: CRIM ZN INDUS NOX RM AGE DIS RAD TAX PTRATIO B LSTA
Output variable: MEDV

MEDV  CRIM     ZN  INDUS  NOX   RM     AGE   DIS     RAD  TAX  PTRATIO  B      LSTA
24    0.00632  18  2.31   53.8  6.575  65.2  4.09    1    296  15.3     396.9  4.98
21.6  0.02731  0   7.07   46.9  6.421  78.9  4.9671  2    242  17.8     396.9  9.14
8
Housing data inductive model
Input variables: CRIM ZN INDUS NOX RM AGE DIS RAD TAX PTRATIO B LSTA
Niching genetic algorithm evolves units in the first layer
Sigmoid units (errors shown: 0.13 and 0.21):
MEDV = 1/(1-exp(-5.724*CRIM + 1.126))
MEDV = 1/(1-exp(-5.861*AGE + 2.111))
Output variable: MEDV
9
Housing data inductive model
Input variables: CRIM ZN INDUS NOX RM AGE DIS RAD TAX PTRATIO B LSTA
First-layer units: sigmoid, sigmoid, linear, sigmoid (errors shown: 0.13, 0.21, 0.26, 0.24)
Niching genetic algorithm evolves units in the second layer
Polynomial unit (error 0.10):
MEDV = 0.747*(1/(1-exp(-5.724*CRIM + 1.126)))
     + 0.582*(1/(1-exp(-5.861*AGE + 2.111)))^2 + 0.016
Output variable: MEDV
10
Housing data inductive model
Input variables: CRIM ZN INDUS NOX RM AGE DIS RAD TAX PTRATIO B LSTA
Units: sigmoid, sigmoid, sigmoid, linear, polynomial, polynomial, linear, exponential
The constructed model has a very low validation error!
Error 0.08
Output variable: MEDV
11
Housing data inductive model
Input variables: CRIM ZN INDUS NOX RM AGE DIS RAD TAX PTRATIO B LSTA
MEDV = (exp((0.038
  + 3.451*(1/(1-exp(-5.724*CRIM + 1.126)))*(1/(1-exp(2.413*DIS - 2.581)))*(1/(1-exp(2.413*DIS - 2.581)))
  + 0.429*(1/(1-exp(-5.861*AGE + 2.111)))
  + 0.024*(1/(1-exp(2.413*DIS - 2.581))) + 0.036
  + 0.038 + 0.350*(1/(1-exp(-3.613*RAD - 0.088)))
  + 0.999*(0.747*(1/(1-exp(-5.724*CRIM + 1.126)))
           + 0.582*(1/(1-exp(-5.861*AGE + 2.111)))*(1/(1-exp(-5.861*AGE + 2.111))) + 0.016)
  - 0.046*(1/(1-exp(-5.724*CRIM + 1.126))) - 0.079
  + 0.002*INDUS - 0.001*LSTA + 0.150)*0.860)*13.072) - 14.874
The math equation is no longer comprehensible; we have to treat it as a black-box model!
Units: S S S L P P L E (sigmoid, linear, polynomial, exponential)
Error 0.08
Output variable: MEDV
12
Visualization based on sensitivity analysis
(Figure: model boxes labelled GAME)
13
Sensitivity analysis of inductive model of MEDV
House no. 189
House no. 164
What will happen to the value of a house when
crime in the area decreases or increases?
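
A minimal sketch of such a sensitivity sweep, assuming the model is any callable that maps a dictionary of inputs to one output. The names `sensitivity_curve` and `toy_model` and the values of `house` are hypothetical and only illustrate the idea: one input is varied while the remaining inputs stay fixed at the values of a chosen record.

```python
import math
from typing import Callable, Dict, List, Tuple

Model = Callable[[Dict[str, float]], float]


def sensitivity_curve(model: Model, record: Dict[str, float], feature: str,
                      lo: float, hi: float, steps: int = 50) -> List[Tuple[float, float]]:
    """Sweep one input from lo to hi; all other inputs keep the record's values."""
    curve = []
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        probe = dict(record)          # copy, so the original record is untouched
        probe[feature] = x
        curve.append((x, model(probe)))
    return curve


def toy_model(inputs: Dict[str, float]) -> float:
    """Hypothetical stand-in for a trained black-box model (not the model from the slides)."""
    return 30.0 - 8.0 * math.tanh(inputs["CRIM"]) + 0.5 * inputs["RM"]


house = {"CRIM": 0.2, "RM": 6.5}      # illustrative record, not real data
curve = sensitivity_curve(toy_model, house, "CRIM", 0.0, 2.0, steps=5)
for x, y in curve:
    print(f"CRIM={x:.2f} -> MEDV={y:.2f}")
```

Plotting the resulting (x, y) pairs for two different records gives two curves of the kind compared on this slide.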
14
Ensemble of inductive models
  • Random initialization
  • Developed on the same training set
  • Training affects only well-defined areas of the
    input space
  • Each model has a unique architecture, similar
    complexity and similar transfer functions
  • Similar behavior in well-defined areas
  • Different behavior in under-defined areas

(Figure: GAME models with outputs yk-1, yk, yk+1; input axis from min to max)
15
Credibility of models: artificial data set
Advantages
  • No need for the training data set
  • Success of the modeling method is considered
  • Importance of inputs is considered

Credibility: the criterion is the dispersion of
the models' responses.
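
A minimal sketch of the dispersion criterion, assuming dispersion is measured as the width of the envelope (maximum minus minimum response) across the ensemble. The slides do not say which dispersion measure is used, so this choice, the function `response_envelope` and the toy models are assumptions.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[Dict[str, float]], float]


def response_envelope(models: List[Model], record: Dict[str, float], feature: str,
                      lo: float, hi: float, steps: int = 50) -> List[Tuple[float, float, float]]:
    """For each sampled x, return (x, lowest response, highest response) over all models."""
    envelope = []
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        probe = dict(record)
        probe[feature] = x
        responses = [m(probe) for m in models]
        envelope.append((x, min(responses), max(responses)))
    return envelope


def make_toy_model(divergence: float) -> Model:
    """Hypothetical models that agree for x <= 1 (where data exist) and diverge beyond."""
    def model(inputs: Dict[str, float]) -> float:
        outside = max(0.0, inputs["x"] - 1.0)
        return 2.0 * inputs["x"] + divergence * outside ** 2
    return model


models = [make_toy_model(d) for d in (-0.5, 0.0, 0.5)]
envelope = response_envelope(models, {"x": 0.0}, "x", 0.0, 2.0, steps=5)
for x, low, high in envelope:
    print(f"x={x:.1f}  dispersion={high - low:.3f}")
```

A narrow envelope suggests that the models agree and can be trusted in that region; a wide envelope marks an under-defined area.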
16
Example: models of hot water consumption
17
Cold water consumption, increasing humidity
18
When is a plot interesting for us?
(Figure labels: xi, xistart, xisize)
19
Definition of interesting plot
  • Minimal volume of the envelope: p → min
  • Maximal sensitivity of the output to the change
    of the xi input variable: ysize → max
  • Maximal size of the area: xisize → max
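
The three criteria can be sketched for one candidate window as follows. This is an illustration under stated assumptions, not the authors' definition: p is taken as the area between the highest and lowest ensemble response (trapezoidal rule), and ysize as the range of the mean ensemble response over the window. The function `plot_criteria` and the toy models are hypothetical.

```python
from typing import Callable, Dict, List

Model = Callable[[Dict[str, float]], float]


def plot_criteria(models: List[Model], record: Dict[str, float], feature: str,
                  x_start: float, x_size: float, steps: int = 50) -> Dict[str, float]:
    """Return p (envelope area), ysize (output range) and xisize for one window."""
    widths, means = [], []
    for k in range(steps):
        probe = dict(record)
        probe[feature] = x_start + x_size * k / (steps - 1)
        ys = [m(probe) for m in models]
        widths.append(max(ys) - min(ys))      # envelope width at this x
        means.append(sum(ys) / len(ys))       # mean ensemble response at this x
    dx = x_size / (steps - 1)
    p = sum((widths[k] + widths[k + 1]) / 2.0 * dx for k in range(steps - 1))
    return {"p": p, "ysize": max(means) - min(means), "xisize": x_size}


def make_toy_model(divergence: float) -> Model:
    """Hypothetical models that agree for x <= 1 and diverge beyond."""
    def model(inputs: Dict[str, float]) -> float:
        outside = max(0.0, inputs["x"] - 1.0)
        return 2.0 * inputs["x"] + divergence * outside ** 2
    return model


models = [make_toy_model(d) for d in (-0.5, 0.0, 0.5)]
inside = plot_criteria(models, {"x": 0.0}, "x", 0.0, 1.0, steps=5)
outside = plot_criteria(models, {"x": 0.0}, "x", 1.0, 1.0, steps=5)
print(inside)
print(outside)
```

Both windows have the same xisize and ysize here, but the first has a much smaller p, so it would be preferred under these criteria.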

20
Multiobjective optimization
  • Interestingness
  • Unknown variables:
  • x1, x2, ..., xi-1, xi+1, ..., xn, xistart, xisize
  • We will use a niching genetic algorithm

Chromosome: x1 x2 ... xi-1 xi+1 ... xn
xistart xisize
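
One possible way to decode such a chromosome, assuming it is a flat list of real numbers: the fixed values of all inputs except xi, followed by xistart and xisize. The function `decode` and the example values are hypothetical.

```python
from typing import Dict, List, Tuple


def decode(chromosome: List[float], input_names: List[str],
           swept: str) -> Tuple[Dict[str, float], float, float]:
    """Split a chromosome into fixed input values, xistart and xisize."""
    others = [name for name in input_names if name != swept]
    if len(chromosome) != len(others) + 2:
        raise ValueError("chromosome length does not match the number of inputs")
    record = dict(zip(others, chromosome[:len(others)]))   # values of the non-swept inputs
    x_start, x_size = chromosome[-2], chromosome[-1]       # window on the swept input
    return record, x_start, x_size


names = ["CRIM", "RM", "AGE"]                 # a subset of inputs, for brevity
record, x_start, x_size = decode([6.5, 65.2, 0.1, 0.8], names, swept="CRIM")
print(record, x_start, x_size)
```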
22
Niching GA also locates local optima
  • Three subpopulations (niches) of individuals
    survived
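
How niches can survive is sketched below with a simple crowding scheme, in which each child competes only with the individual most similar to it. This is one of several niching methods; the slides do not state which one the authors use. The one-dimensional toy fitness function with three equal peaks is also an assumption made for illustration.

```python
import math
import random
from typing import List


def fitness(x: float) -> float:
    """Toy landscape with three equal peaks at x = 1/6, 1/2 and 5/6."""
    return math.sin(3 * math.pi * x) ** 2


def crowding_ga(pop_size: int = 30, generations: int = 100,
                sigma: float = 0.05, seed: int = 1) -> List[float]:
    rng = random.Random(seed)
    pop = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        for parent in list(pop):              # snapshot of the current population
            # Gaussian mutation, clipped to the search interval [0, 1].
            child = min(1.0, max(0.0, parent + rng.gauss(0.0, sigma)))
            # Crowding: the child may replace only its nearest individual.
            nearest = min(range(pop_size), key=lambda i: abs(pop[i] - child))
            if fitness(child) > fitness(pop[nearest]):
                pop[nearest] = child
    return pop


population = crowding_ga()
peaks = (1 / 6, 1 / 2, 5 / 6)
niche_sizes = [sum(abs(x - p) < 0.05 for x in population) for p in peaks]
print(niche_sizes)
```

Because replacement is local, individuals gathered around one peak are not displaced by individuals from another, so several subpopulations persist.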

23
Automated retrieval of plots showing interesting behavior
(Figure label: Genetic Algorithm)
A genetic algorithm with a special fitness function
is used to adjust all other inputs (dimensions)