1. Learning Fuzzy Classification Rules from Data
RASC 2000

Hans Roubos (1), Magne Setnes (2), and Janos Abonyi (3)
(1) Delft University of Technology, Control Laboratory, The Netherlands, hans@ieee.org
(2) Heineken Technical Services, R&D, The Netherlands, magne@ieee.org
(3) University of Veszprem, Department of Process Engineering, Hungary, abonyij@fmt.vein.hu
2. Goal of this work

- Automatic design of fuzzy rule-based classifiers from data, with both high accuracy and high transparency.
- Show that different tools for modeling and complexity reduction can be favorably combined:
  - feature selection,
  - fuzzy model identification,
  - rule base simplification,
  - constrained genetic optimization.
3. Outline

- Transparency and accuracy issues
- Proposed modeling method
- Iterative complexity reduction
- Three modeling schemes
- Example: Wine data
- Conclusion
4. What makes a good fuzzy classifier?

- Accuracy
  - Classification error
  - Certainty degree
  - Local models / global models
- Transparency and interpretability
  - Moderate number of rules
  - Distinguishability
  - Normality
  - Coverage
5. Coding of the fuzzy classifier

- Fuzzy classifier structure: rules of the form
  R_i: If x_1 is A_{i1} and ... and x_n is A_{in} then class is c_i,   i = 1, ..., M (no. of rules).
- Degree of firing of rule i: \beta_i(x) = \prod_{j=1}^{n} \mu_{A_{ij}}(x_j).
- Decision (winner takes all): the predicted class is c_{i*} with i* = \arg\max_i \beta_i(x).
- Certainty factor (CF): the per-class degree of certainty of the classification.
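A minimal sketch of this inference scheme in Python; the triangular membership function and the product conjunction are assumptions consistent with the rest of the talk:

```python
import numpy as np

def tri_mf(x, a, b, c):
    """Triangular membership function with core b and support [a, c] (a < b < c)."""
    return np.maximum(0.0, np.minimum((x - a) / (b - a), (c - x) / (c - b)))

def classify(x, rules):
    """Winner-takes-all fuzzy classification.

    rules: list of (antecedents, class_label), where antecedents is a list
    of (a, b, c) triangle parameters, one per input feature.
    """
    betas = []
    for antecedents, _ in rules:
        # Degree of firing: product conjunction over all antecedent sets.
        beta = np.prod([tri_mf(xj, *p) for xj, p in zip(x, antecedents)])
        betas.append(beta)
    winner = int(np.argmax(betas))           # rule with the highest firing degree
    return rules[winner][1], betas[winner]   # predicted class and its firing degree
```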
6. Proposed modeling method

Flow of the method (each step feeds the next):
1. Feature selection: reduce the input space.
2. Data clustering: create the rule base (a good initial model structure).
3. Initialize: project the clusters onto the input variables (projection introduces some error).
4. Iterate: fuzzy set merging to reduce premise redundancy, followed by a GA with a multi-objective cost (MSE and redundancy).
5. Global optimization: GA with a multi-objective cost (MSE and transparency).
6. Finish: the final model.
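As a rough Python skeleton of this flow; the four callables stand in for the steps named above and are hypothetical placeholders, not a fixed API:

```python
def build_classifier(X, y, select_features, cluster_initialize,
                     merge_similar_sets, ga_optimize, n_iterations=3):
    """Skeleton of the proposed modeling loop (placeholders injected as callables)."""
    features = select_features(X, y)                 # 1. reduce the input space
    model = cluster_initialize(X[:, features], y)    # 2.-3. clusters -> projected rules
    for _ in range(n_iterations):                    # 4. iterative reduction
        model = merge_similar_sets(model)            #    rule base simplification
        model = ga_optimize(model, X, y, reward_redundancy=True)
    return ga_optimize(model, X, y, reward_redundancy=False)  # 5. final GA tuning
```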
7. Initial fuzzy model

- Data-driven initialization.
- Each class is approximated by an ellipsoid based on statistical properties of the data.
- Other methods: fixed partitioning, clustering (e.g., C-means, Gustafson-Kessel), splitting trees (e.g., LOLIMOT).
- Note that projection onto the individual inputs introduces error!
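A sketch of the class-ellipsoid initialization, assuming one rule per class with axis-aligned Gaussian sets built from the per-class mean and standard deviation (the Gaussian choice is an assumption; the optimization later works with triangular sets):

```python
import numpy as np

def init_rules(X, y):
    """One rule per class: Gaussian sets centered on the class means.

    Returns a list of (centers, widths, class_label) triples, i.e. the
    axis-aligned projection of each class ellipsoid onto the inputs.
    """
    rules = []
    for label in np.unique(y):
        Xc = X[y == label]
        rules.append((Xc.mean(axis=0), Xc.std(axis=0), label))
    return rules
```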
8. Feature selection

- Improves prediction and interpretability.
- Fisher interclass separability criterion: based on statistical properties of the labeled data (between-class and within-class scatter), a feature ranking is made (a sketch follows below).
- Fuzzy models are built iteratively by adding features one by one, according to the ranking.
- The initial model to proceed with is chosen from the performance curve.
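A minimal sketch of a Fisher-style separability ranking; this scalar per-feature variant is an assumption, as the criterion can also be computed with full scatter matrices:

```python
import numpy as np

def fisher_ranking(X, y):
    """Rank features by the ratio of between-class to within-class scatter."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    score = between / within           # higher = better class separability
    return np.argsort(score)[::-1]     # feature indices, best first
```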
9. Similarity-driven rule base simplification

- Pairs of fuzzy sets A, B are merged iteratively whenever S(A, B) > λ (default λ = 0.5); a similarity sketch follows this list.
- Sets equal to the universal set are removed.
- If only one set remains in an input domain, the feature is left out.
- If the antecedent parts of two rules become similar, the consequents are merged and the duplicate rule is removed.
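A common choice for S is the Jaccard index between discretized membership functions; a minimal sketch, where the discretization grid is an assumption:

```python
import numpy as np

def jaccard_similarity(mu_a, mu_b):
    """Similarity of two fuzzy sets sampled on a common grid:
    |A ∩ B| / |A ∪ B|, using min for intersection and max for union."""
    return np.minimum(mu_a, mu_b).sum() / np.maximum(mu_a, mu_b).sum()

# Example: two overlapping triangular sets on [0, 1]
grid = np.linspace(0.0, 1.0, 201)
tri = lambda x, a, b, c: np.maximum(0.0, np.minimum((x - a) / (b - a), (c - x) / (c - b)))
S = jaccard_similarity(tri(grid, 0.1, 0.4, 0.7), tri(grid, 0.2, 0.5, 0.8))
# Merge the pair if S exceeds the threshold (default 0.5).
```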
10. Aggregated similarity measure S

- Search for redundancy in the whole model.
- S aggregates the pairwise similarities of the fuzzy sets over the complete rule base into a single number (sketched below).
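One plausible aggregation, assuming S is the average over the input domains of the largest pairwise set similarity in each domain; the exact aggregation is not spelled out on the slide:

```python
import numpy as np
from itertools import combinations

def aggregated_similarity(domains):
    """domains: list of lists of membership vectors, one inner list per input.
    Returns the mean over inputs of the largest pairwise Jaccard similarity."""
    jac = lambda a, b: np.minimum(a, b).sum() / np.maximum(a, b).sum()
    per_input = []
    for sets in domains:
        if len(sets) < 2:
            per_input.append(0.0)  # a single set cannot be redundant
        else:
            per_input.append(max(jac(a, b) for a, b in combinations(sets, 2)))
    return float(np.mean(per_input))
```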
11. Genetic multi-objective optimization

- Classification error is the accuracy objective.
- Multi-objective function: J = (1 + λ·S) · Error.
- λ ∈ [-1, 1] determines whether similarity is rewarded (λ < 0) or penalized (λ > 0).
- (In addition, CF may also be included.)
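A sketch of this cost in Python; the product form J = (1 + λS) · Err follows the formula above, and the two callables are injected placeholders (e.g. the aggregated similarity sketched on the previous slide):

```python
def cost(model, X, y, lam, classification_error, aggregated_similarity):
    """Multi-objective GA fitness: accuracy term scaled by a redundancy term."""
    err = classification_error(model, X, y)
    S = aggregated_similarity(model)
    # lam < 0 rewards similarity (sets drift together and can then be merged);
    # lam > 0 penalizes it (sets are pushed apart for distinguishability).
    return (1.0 + lam * S) * err
```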
12. GA coding of the fuzzy model

- Triangular membership functions, each coded by its three break points (a, b, c).
- Chromosome coding: the antecedent set parameters of the whole rule base are concatenated into one real-valued vector.
- The consequent variables can also be appended if these are subject to optimization as well.
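A sketch of the chromosome packing for triangular sets; the exact layout is an assumption:

```python
import numpy as np

def encode(rules):
    """Flatten all (a, b, c) triangles of all rules into one real-valued vector."""
    return np.array([p for antecedents, _ in rules
                       for triangle in antecedents
                       for p in triangle], dtype=float)

def decode(chromosome, n_rules, n_inputs):
    """Rebuild the (n_rules, n_inputs, 3) triangle parameters from a chromosome."""
    return chromosome.reshape(n_rules, n_inputs, 3)
```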
13. Real-coded genetic algorithm

1. Load data and the initial fuzzy model (FM).
2. Make a prototype chromosome from the initial FM.
3. Calculate constraints and create the initial population P_0.
4. Repeat until the termination condition (t = T):
   (a) deal with constraints on the fuzzy sets,
   (b) evaluate P_t by simulation and obtain J_t,
   (c) select chromosomes for operation,
   (d) select chromosomes for deletion,
   (e) operate on the chromosomes,
   (f) create the new population P_{t+1}.

Loop: Initialize -> Evaluate -> Select -> Operate.
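The loop maps onto a short skeleton; the selection and variation details below (elitist ranking, Gaussian mutation, a normalized [0, 1] domain for the constraint step) are assumptions, as any real-coded scheme fits:

```python
import numpy as np

def real_coded_ga(prototype, evaluate, T=400, pop_size=40, sigma=0.05):
    """Minimal real-coded GA sketch: elitist selection, Gaussian mutation."""
    rng = np.random.default_rng(0)
    pop = prototype + sigma * rng.standard_normal((pop_size, prototype.size))
    pop[0] = prototype                           # keep the initial model itself
    for t in range(T):
        pop = np.clip(pop, 0.0, 1.0)             # (a) constraints on the fuzzy sets
        J = np.array([evaluate(c) for c in pop]) # (b) evaluate P_t, obtain J_t
        order = np.argsort(J)                    # (c)/(d) rank: best half survives
        parents = pop[order[: pop_size // 2]]
        children = parents + sigma * rng.standard_normal(parents.shape)  # (e)
        pop = np.vstack([parents, children])     # (f) new population P_{t+1}
    return pop[np.argmin([evaluate(c) for c in pop])]
```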
14. Three modeling schemes

(figure: overview of the three schemes, detailed on the following slides)
15. Wine data classification problem

- 178 samples, 3 classes, 13 attributes.
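For reproduction, the same UCI data set ships with scikit-learn; the use of scikit-learn here is an assumption, as the original work drew on the UCI archive directly:

```python
from sklearn.datasets import load_wine

X, y = load_wine(return_X_y=True)
print(X.shape, set(y))   # (178, 13) {0, 1, 2} -- three wine cultivars
```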
16. Wine data: Scheme 1

- Fisher interclass separability ranking: 8, 5, 11, 2, 3, 9, 10, 6, 7, 4, 1, 12, 13.
- Classifiers are made by adding features one by one.
- 400 GA iterations in the optimization.
- Best results with the first 5 or 7 features, giving 2 and 1 misclassifications respectively.
- CF_5 = (0.96, 0.94, 0.94) and CF_7 = (0.94, 0.99, 0.97).
- The final classifiers contain 15 and 21 fuzzy sets.
17. Wine data: Scheme 2

- Eight inputs were removed in 3 iterations: {3, 5}, {2, 4, 8, 9}, {6, 12}.
- 200 GA iterations in the loop and 400 in the final optimization.
- CF = (0.96, 0.94, 0.94), 1 misclassification.
- The final classifier contains 5 features defined by 11 fuzzy sets.
18. Wine data: Scheme 3

- Features 7, 4, 1, 12, 13 were selected based on the Fisher interclass ranking.
- The initial model gives 9 misclassifications.
- 200 GA iterations in the loop and 400 in the final optimization.
- 4 and 3 additional fuzzy sets were removed.
- CF = (0.93, 0.91, 0.91), 3 misclassifications.
- The final classifier contains 4 features and 9 fuzzy sets.
19. Classifiers obtained by schemes 1, 2 and 3

(figure: the rule bases of the three final classifiers, labeled 1-3)
20. Classifier of scheme 2

(figure: the final scheme 2 classifier in detail)
21. Discussion (1)

- This method and some variations were also successfully applied (NAFIPS'99, FUZZ-IEEE'00, TFS'00) to:
  - the Iris data and the Wisconsin breast cancer data,
  - function approximation (Sugeno's rule base),
  - the nonlinear dynamic plant of Wang and Yen,
  - dynamic models: pressure in a fermentor, a Diesel engine.
22. Discussion (2)

- All three schemes resulted in compact and transparent models with 4-5 features and 9-15 fuzzy sets.
- The Fisher interclass separability tool is not necessary to obtain the smallest models for the Wine data, but it still reduces the number of GA iterations.
- The interclass separability gives open-loop feature selection, while the similarity analysis gives closed-loop feature selection.
- In similarity-based reduction, features can be removed from single rules. This is not the case in the presented open-loop feature selection, but it is under investigation.
23. Conclusion and future work

- The proposed method provides accurate and transparent rule-based classifiers in a semi-automatic way.
- The evolutionary optimization combines naturally with the complexity reduction techniques.
- Future work: study other multi-objective criteria for MIMO system identification, controller design, and high-dimensional data-mining problems.