Advanced Model Based Process Engineering Tools - PowerPoint PPT Presentation

About This Presentation
Title:

Advanced Model Based Process Engineering Tools

Description:

Department of Process Engineering. FMT. DPE. From Data to Information. Production, Database ... Fischer interclass separability criterion. Feature extraction ... – PowerPoint PPT presentation

Slides: 26
Provided by: abon4

Transcript and Presenter's Notes

1
János Abonyi and Szeifert Ferenc
CI in data mining
CI in modeling and control
Advanced Model Based Process Engineering Tools
www.fmt.vein.hu/softcomp
2
Outline
  • Goal: show that different CI tools can be
    favorably combined for data mining
  • Introduction to Data Mining (DM)
  • CI based DM algorithms
  • Example: Wine data
  • Conclusions

3
From Data to Information
Useful knowledge
Decision
Model
Models for knowledge representation
Data Mining
Data extraction
pre-processed data
Data warehouse
Production Nature
Production, Database
4
Steps of Data Mining
How can CI help?
5
Tasks of Data Mining
  • Clustering (prototypes, codebook, signatures,
    probability density estimation)
  • Summarization (incl. visualisation, feature
    extraction)
  • Regression and time-series analysis
  • Classification
  • Change and Deviation Detection
  • Dependency Modelling (belief networks)

6
Clustering
  • Detect groups of data
  • Prototypes (signatures)
  • Based on similarity measure (distance)
  • Adaptive distance measure (correlation)
  • Supervised or unsupervised
  • Hierarchical or not
  • Can be fuzzy !!!
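
As an illustration of fuzzy clustering with prototypes and a distance-based similarity measure, here is a minimal sketch of fuzzy c-means using alternating optimisation. It assumes squared Euclidean distance and a fuzziness exponent `m = 2`; the function name `fuzzy_c_means` and the toy data are hypothetical and are not taken from the slides.

```python
def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternating optimisation: update memberships, then prototypes."""
    def dist2(a, b):
        # Squared Euclidean distance (an assumed similarity measure).
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    n_clusters = len(centers)
    dim = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u[k][i] is the degree to which point k
        # belongs to cluster i; each row sums to 1.
        u = []
        for p in points:
            d = [max(dist2(p, c), 1e-12) for c in centers]
            u.append([
                1.0 / sum((d[i] / d[j]) ** (1.0 / (m - 1.0))
                          for j in range(n_clusters))
                for i in range(n_clusters)
            ])
        # Prototype update: mean of the points weighted by u^m.
        new_centers = []
        for i in range(n_clusters):
            w = [u[k][i] ** m for k in range(len(points))]
            total = sum(w)
            new_centers.append(tuple(
                sum(w[k] * points[k][j] for k in range(len(points))) / total
                for j in range(dim)
            ))
        centers = new_centers
    # The memberships returned are those computed in the last iteration.
    return centers, u


# Hypothetical toy data: two well-separated groups.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
prototypes, memberships = fuzzy_c_means(data, [(0.0, 0.0), (10.0, 10.0)])
```

Unlike hard clustering, every point keeps a graded membership in every cluster, which is what allows the overlapping regions mentioned later in the deck.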

7
Feature Extraction
  • (Nonlinear) mapping of the input space (PCA)
  • Reduction of the number of inputs
  • Useful for visualisation (SOM)
  • Non-parametric (Sammon projection) or
    Model-based (principal curves, NN, Gaussian
    mixtures)

8
Regression
  • TS (Takagi-Sugeno) fuzzy models: operating-regime-based
    modelling
  • Local linear models
  • Identification by clustering
  • Recently: mixture of Gaussians
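
To make the operating-regime idea concrete, the following sketch evaluates a one-input Takagi-Sugeno model: each rule has a validity function and a local linear model, and the output is the validity-weighted average of the local outputs. It covers inference only, not identification by clustering. The Gaussian validity functions, the function names, and the rule parameters are assumptions chosen for illustration.

```python
import math


def gaussian(x, center, width):
    """Assumed Gaussian validity (membership) function of a regime."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def ts_predict(x, rules):
    """Evaluate a one-input TS model.

    rules: list of (center, width, slope, offset) tuples, where
    y_i = slope * x + offset is the local linear model of rule i.
    """
    weights = [gaussian(x, c, w) for c, w, _, _ in rules]
    local_outputs = [a * x + b for _, _, a, b in rules]
    # Weighted average of the local models (fuzzy-mean defuzzification).
    return sum(wi * yi for wi, yi in zip(weights, local_outputs)) / sum(weights)


# Hypothetical model with two operating regimes, around x = 0 and x = 10.
RULES = [
    (0.0, 2.0, 1.0, 0.0),     # near x = 0:  y = x
    (10.0, 2.0, -1.0, 20.0),  # near x = 10: y = -x + 20
]
y_mid = ts_predict(5.0, RULES)  # both regimes contribute equally here
```

Close to a regime centre the corresponding local model dominates; between centres the output blends the two, which gives a smooth nonlinear map built from linear pieces.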

9
Classification
Which class (A or B)?
  • Labelled classes
  • Decision support systems (Rule based)
  • Identification can be based on clustering
    (Bayes' rule)
  • Can be fuzzy !!!

Figure: decision border separating classes A and B in the (x1, x2) plane.
10
DM Algorithms
  • Representation (Language to describe the
    patterns)
  • Fuzzy logic helps: it allows overlapping regions
    and aids interpretability by providing insight
    into the model
  • Model Evaluation Criteria: accuracy (prediction
    error) and interpretability (complexity)
  • Search Method (Parameter and structure search)
  • Standard linear (LS, TLS, OLS, SVD, QR)
  • Neuro-Fuzzy (back-propagation)
  • Clustering (alternating optimisation, EM)
  • Genetic Algorithm

11
Model Representation
  • Fuzzy classifier structure
  • Certainty factor

class
no. of rules
degree of firing
decision
12
Fuzzy Clustering and Classification
IF x1 is SMALL AND x2 is BIG THEN Class RED
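
A rule of this form can be evaluated as in the minimal sketch below. It assumes piecewise-linear membership functions on inputs normalised to [0, 1], the product as the AND operator, and a winner-takes-all decision (the class of the rule with the highest degree of firing). The second rule, the membership shapes, and all names are hypothetical.

```python
def small(x):
    """Assumed membership in SMALL: 1 at x <= 0, falling to 0 at x >= 1."""
    return min(1.0, max(0.0, 1.0 - x))


def big(x):
    """Assumed membership in BIG: 0 at x <= 0, rising to 1 at x >= 1."""
    return min(1.0, max(0.0, x))


# Each rule: (membership function per input, consequent class).
RULES = [
    ((small, big), "RED"),   # IF x1 is SMALL AND x2 is BIG THEN Class RED
    ((big, small), "BLUE"),  # hypothetical second rule for contrast
]


def classify(x, rules=RULES):
    """Return (class, degree of firing) of the most strongly fired rule."""
    best_class, best_degree = None, -1.0
    for antecedents, label in rules:
        degree = 1.0
        for membership, xi in zip(antecedents, x):
            degree *= membership(xi)  # product used as the AND operator
        if degree > best_degree:
            best_class, best_degree = label, degree
    return best_class, best_degree


label, firing = classify((0.1, 0.9))
```

Using `min` instead of the product for AND is an equally common choice; the decision logic stays the same.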
13
Decision Tree
  • Each class is approximated by a hyperbox based on
    a decision tree
  • Supervised learning

14
Model Evaluation Criteria
  • Accuracy
  • Modeling or classification error
  • Certainty degree
  • Local models/global models
  • Transparency and Interpretability
  • Moderate number of rules
  • Distinguishability
  • Normality
  • Coverage

15
Proposed modeling method
Feature selection, extraction Clustering, DT,
...
Supervised or unsupervised learning
introduces some error
Rule base design (MF functions)
Initialize
fit data
Estimate rule consequents (LS)
reduce complexity
Rule and feature reduction
e.g. multi-objective GA MSE redundancy
Iterate
Optimization
Fuzzy Set merging
Optimization
reduce premise
Finish
multi-objective MSE transparency
final model
16
Multi-objective optimization
  • Model performance (classification error)
  • Multi-objective function
  • λ ∈ [-1, 1] determines whether similarity is
    rewarded (λ < 0) or penalized (λ > 0).

S(A,B) > ?
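
The slide's formula did not survive, so the following is only a sketch under stated assumptions. It takes S(A,B) to be the Jaccard similarity of two fuzzy sets sampled on a common grid, and assumes an objective of the form J = (1 + lam * S_mean) * error, where `lam` is the weighting parameter in [-1, 1]. Both the objective form and the names `jaccard_similarity` and `objective` are illustrative and not confirmed by the transcript.

```python
def jaccard_similarity(mu_a, mu_b):
    """S(A,B) = |A ∩ B| / |A ∪ B| on sampled membership values.

    Intersection and union use the min and max operators (an assumption).
    """
    intersection = sum(min(a, b) for a, b in zip(mu_a, mu_b))
    union = sum(max(a, b) for a, b in zip(mu_a, mu_b))
    return intersection / union if union > 0 else 0.0


def objective(error, similarities, lam):
    """Assumed multi-objective function: J = (1 + lam * S_mean) * error.

    lam < 0 rewards similar fuzzy sets (they become easy to merge);
    lam > 0 penalizes them (the sets are pushed apart).
    """
    s_mean = sum(similarities) / len(similarities) if similarities else 0.0
    return (1.0 + lam * s_mean) * error


# Two hypothetical fuzzy sets sampled at five grid points.
A = [0.0, 0.5, 1.0, 0.5, 0.0]
B = [0.0, 0.0, 0.5, 1.0, 0.5]
s = jaccard_similarity(A, B)
rewarded = objective(0.10, [s], lam=-1.0)
penalized = objective(0.10, [s], lam=1.0)
```

With the same classification error, a negative `lam` lowers the cost of a rule base whose sets overlap heavily, and a positive `lam` raises it.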
17
Model Reduction
  • Improves interpretability capabilities
  • Orthogonal methods (SVD, OLS, QR)
  • Fuzzy set merging
  • Feature selection
  • Based on statistical properties of the clusters,
    a feature ranking is made
  • Fisher interclass separability criterion
  • Feature extraction
  • Interpretable transformation of the features
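
The feature ranking step can be sketched as follows. This version scores each feature independently by the ratio of between-class to within-class scatter, a common univariate form of the Fisher criterion. The deck derives the ranking from cluster statistics, so treat this simplified per-feature form, and the names `fisher_scores` and `rank_features`, as assumptions.

```python
def fisher_scores(X, y):
    """Per-feature ratio of between-class to within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            class_mean = sum(values) / len(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def rank_features(X, y):
    """Feature indices (0-based) sorted from most to least separable."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Hypothetical data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 5.0], [0.2, 1.0], [10.0, 4.0], [10.2, 2.0]]
y = [0, 0, 1, 1]
ranking = rank_features(X, y)
```

Keeping only the top-ranked features gives the reduced input set on which the rule base is then built.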

18
Wine data classification
  • 178 samples, 3 classes, 13 attributes

19
Visualization by SOM
20
GA-based Scheme
  • Features 7, 4, 1, 12, and 13 were selected based
    on Fisher interclass ranking.
  • Initial model contains 9 misclassifications.
  • 200 GA-iterations in loop and 400 in final
    optimization.
  • 3 additional fuzzy sets were removed; the final
    classifier contains 4 features and 9 fuzzy sets.
  • 3 misclassifications.

21
Example for a classifier
22
Clustering based result
23
Discussion
  • CI (Fuzzy, Neural, and GA) tools can be
    effectively used in Data Mining
  • For model representation
  • For search
  • For model evaluation
  • Applications
  • Fuzzy models
  • Fuzzy clustering (AO, EM)
  • Neural (Neuro-fuzzy)
  • Genetic Algorithm
  • Accuracy
  • Interpretability -> Model reduction tools
  • Process industry
  • Chemometrics

24
Conclusions
Database Technology
Other Disciplines
Data Mining
Statistics
Information Science
Machine Learning
Visualization
C.I.
www.fmt.vein.hu/softcomp
25
Acknowledgements
  • Janos Bolyai Research Fellowship of the HAS
    (CI in Process Engineering)
  • FKFP 0073/2001 (Intelligent Process Control
    Lab)

Magne Setnes
Hans Roubos