Industrial Applications of Neuro-Fuzzy Networks - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Industrial Applications of Neuro-Fuzzy Networks


1
Industrial Applications of Neuro-Fuzzy Networks
2
Example: Continuously Adapting Gear Shift Schedule in the VW New Beetle
3
Continuously Adapting Gear Shift Schedule: Technical Details
  • Mamdani controller with 7 rules
  • Optimized program
  • 24 byte RAM on the Digimat
  • 702 byte ROM
  • Runtime 80 ms; 12 times per second a new sport factor is assigned
  • How to generate knowledge automatically from data?

AG4
4
Learning from Examples (Observations, Databases)
  • Statistics: parameter fitting, structure identification, inference method, model selection
  • Machine Learning: computational learning (PAC learning), inductive learning, learning decision trees, concept learning, ...
  • Neural Networks: learning from data
  • Cluster Analysis: unsupervised classification

→ The learning problem is transformed into an optimization problem.
→ How can these methods be used in fuzzy systems?
5
Function Approximation with Fuzzy Rules
6
How to Derive a Fuzzy Controller Automatically
from Observed Process Data
  • Function approximation
  • Perform fuzzy cluster analysis of input-output
    data (FCM, GK, GG, ...)
  • Project clusters
  • Obtain fuzzy rules of the kind "If x is small then y is medium"

7
Fuzzy Cluster Analysis
  • Classification of a given data set X = {x1, ..., xn} ⊂ ℝ^p into c clusters.
  • Membership degree of datum xk to cluster i is uik.
  • Representation of cluster i by a prototype vi ∈ ℝ^p.
  • Formal: minimisation of the functional J(U, V) = Σ_i Σ_k (uik)^m ‖xk − vi‖²
  • under the constraints Σ_i uik = 1 for each datum xk (uik ∈ [0, 1]).

8
Simplest Algorithm: Fuzzy c-Means (FCM)
Iterative procedure (with random initialisation of the prototypes vi): alternately update the membership degrees
uik = 1 / Σ_j ( ‖xk − vi‖ / ‖xk − vj‖ )^(2/(m−1))
and the prototypes
vi = Σ_k (uik)^m xk / Σ_k (uik)^m.
FCM searches for equally sized clusters in the form of (hyper-)balls.
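A minimal NumPy sketch of the FCM alternation described above (membership update, then prototype update); the fuzzifier m, the convergence threshold, and the variable names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, eps=1e-5, seed=0):
    """Fuzzy c-means: alternate membership and prototype updates."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    V = X[rng.choice(n, size=c, replace=False)]        # random prototype initialisation
    U = np.zeros((c, n))
    for _ in range(iters):
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) + 1e-12   # c x n distances
        U_new = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)
        V = (U_new ** m @ X) / np.sum(U_new ** m, axis=1, keepdims=True)    # weighted means
        if np.abs(U_new - U).max() < eps:
            return U_new, V
        U = U_new
    return U, V
```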
9
Examples
10
Fuzzy Cluster Analysis
  • Fuzzy c-Means: simple, looks for spherical clusters of the same size, uses the Euclidean distance
  • Gustafson-Kessel: looks for hyper-ellipsoidal clusters of the same size, distance via matrices
  • Gath-Geva: looks for hyper-ellipsoidal clusters of arbitrary size, distance via matrices
  • Axis-parallel variants exist that use diagonal matrices (computationally less expensive, and less information is lost when rules are created)

11
Fuzzy Cluster Analysis with DataEngine
12
Construct Fuzzy Sets by Cluster Projection
Projecting a cluster means projecting the degrees of membership of the data onto the individual dimensions; histograms are obtained.
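A small sketch of this projection step, assuming membership degrees U from a clustering run (e.g. the FCM sketch above): the memberships of cluster i are used as weights of a histogram over one input dimension, and a convex fuzzy set can then be fitted to the resulting profile. Names and the normalisation are illustrative choices.

```python
import numpy as np

def project_cluster(X, u_i, dim, bins=20):
    """Membership-weighted histogram of cluster i over one dimension."""
    counts, edges = np.histogram(X[:, dim], bins=bins, weights=u_i)
    profile = counts / counts.max() if counts.max() > 0 else counts   # peak normalised to 1
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, profile   # a triangle or trapezoid can be fitted to this profile
```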
13
FCLUSTER: Tool for Fuzzy Cluster Analysis
14
Introduction
  • Building a fuzzy system requires
  • prior knowledge (fuzzy rules, fuzzy sets)
  • manual tuning (time-consuming and error-prone)
  • Therefore: support this process by learning
  • learning fuzzy rules (structure learning)
  • learning fuzzy sets (parameter learning)

Approaches from Neural Networks can be used
15
Learning Fuzzy Sets: Problems in Control
  • Reinforcement learning must be used to compute an error value (note: the correct output is unknown)
  • After an error has been computed, any fuzzy set learning procedure can be used
  • Example: GARIC (Berenji/Khedkar 1992): online approximation of gradient descent
  • Example: NEFCON (Nauck/Kruse 1993): online heuristic fuzzy set learning using a rule-based fuzzy error measure

16
(No Transcript)
17
Neuro-Fuzzy Systems in Data Analysis
  • Neuro-Fuzzy System
  • System of linguistic rules (fuzzy rules).
  • Not rules in a logical sense, but function
    approximation.
  • Fuzzy rule: a vague prototype / sample.
  • Neuro-Fuzzy-System
  • Adding a learning algorithm inspired by neural
    networks.
  • Feature: local adaptation of parameters.

18
Example: Prognosis of the Daily Proportional Changes of the DAX at the Frankfurt Stock Exchange (Siemens)
  • Database: time series from 1986 to 1997

19
Fuzzy Rules in Finance
  • Trend rule: IF DAX decreasing AND US- decreasing THEN DAX prediction decrease WITH high certainty
  • Turning point rule: IF DAX decreasing AND US- increasing THEN DAX prediction increase WITH low certainty
  • Delay rule: IF DAX stable AND US- decreasing THEN DAX prediction decrease WITH very high certainty
  • In general: IF x1 is m1 AND x2 is m2 THEN y = h WITH weight k

20
Classical Probabilistic Expert Opinion Pooling
Method
  • The decision maker (DM) analyzes each source (human expert, data forecasting model) in terms of (1) statistical accuracy and (2) informativeness, by asking the source to assess quantities (quantile assessment)
  • DM obtains a weight for each source
  • DM eliminates bad sources
  • DM determines the weighted sum of the source outputs
  • Determination of the return on investment

21
  • E experts, R quantiles for N quantities
  • → each expert has to assess R·N values
  • statistical accuracy
  • information score
  • weight for expert e
  • output
  • ROI (a generic sketch of the weighted pooling step follows below)
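The slide lists the ingredients of the classical pooling scheme without the formulas. The following is only a generic sketch: each source is weighted by its statistical accuracy and information score, bad sources are eliminated, and the pooled output is the weighted sum of the source outputs. The concrete quantile-based scoring formulas of the classical method are not reproduced here.

```python
import numpy as np

def pool_sources(outputs, accuracy, information, cutoff=0.0):
    """Weight sources by accuracy x information, drop bad ones, take the weighted sum."""
    w = np.asarray(accuracy, dtype=float) * np.asarray(information, dtype=float)
    w = np.where(w >= cutoff, w, 0.0)            # eliminate bad sources
    w = w / w.sum()
    return float(np.dot(w, outputs))             # pooled output
```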

22
Formal Analysis
  • Sources of information: R1 = rule set given by expert 1, R2 = rule set given by expert 2, D = data set (time series)
  • Operator schema: fuse(R1, R2) fuses two rule sets, induce(D) induces a rule set from D, revise(R, D) revises a rule set R using D

23
Formal Analysis
  • Strategies
  • fuse(fuse (R1, R2), induce(D))
  • revise(fuse(R1, R2), D)
  • fuse(revise(R1, D), revise(R2, D))
  • Technique: Neuro-Fuzzy Systems
  • Nauck, Klawonn, Kruse: Foundations of Neuro-Fuzzy Systems, Wiley, 1997
  • SENN (commercial neural network environment,
    Siemens)

24
From Rules to Neural Networks
1. Evaluation of membership degrees
2. Evaluation of rules (rule activity)
3. Accumulation of rule outputs and normalization
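A sketch of these three steps for weighted rules of the form "IF x1 is m1 AND x2 is m2 THEN y = h WITH weight k" from the previous slide. The triangular membership functions, the min operator for AND, and the normalised weighted sum are plausible but assumed choices; the rule parameters are made up for illustration.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak b."""
    return max(0.0, min((x - a) / (b - a + 1e-12), (c - x) / (c - b + 1e-12)))

# hypothetical rules: (fuzzy set for x1, fuzzy set for x2, consequent h, weight k)
rules = [
    ((-2.0, -1.0, 0.0), (-2.0, -1.0, 0.0), -0.5, 0.9),   # decreasing AND decreasing -> decrease
    ((-2.0, -1.0, 0.0), ( 0.0,  1.0, 2.0), +0.3, 0.4),   # decreasing AND increasing -> increase
]

def predict(x1, x2):
    acts = np.array([min(tri(x1, *m1), tri(x2, *m2)) * k   # 1. memberships, 2. rule activity
                     for m1, m2, h, k in rules])
    outs = np.array([h for _, _, h, _ in rules])
    s = acts.sum()                                          # 3. accumulation and normalisation
    return float(np.dot(acts, outs) / s) if s > 0 else 0.0

print(predict(-0.8, -0.6))
```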
25
Neuro-Fuzzy Architecture
26
The Semantics-Preserving Learning Algorithm
Reduction of the dimension of the weight space:
1. Membership functions of different inputs share their parameters.
2. Membership functions of the same input variable are not allowed to pass each other; they must keep their original order.
Benefits:
→ the optimized rule base can still be interpreted
→ the number of free parameters is reduced
27
Return-on-Investment Curves of the Different
Models
Validation data from March 01, 1994 until April
1997
28
A Neuro-Fuzzy System
  • is a fuzzy system trained by heuristic learning
    techniques derived from neural networks
  • can be viewed as a 3-layer neural network with
    fuzzy weights and special activation functions
  • is always interpretable as a fuzzy system
  • uses constrained learning procedures
  • is a function approximator (classifier,
    controller)

29
Learning Fuzzy Rules
  • Cluster-oriented approaches: find clusters in the data; each cluster is a rule
  • Hyperbox-oriented approaches: find clusters in the form of hyperboxes
  • Structure-oriented approaches: use predefined fuzzy sets to structure the data space and pick rules from grid cells

30
Hyperbox-Oriented Rule Learning
Search for hyperboxes in the data space. Create fuzzy rules by projecting the hyperboxes. Fuzzy rules and fuzzy sets are created at the same time. Usually very fast.
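A simplified sketch of the idea under stated assumptions (axis-parallel boxes, one pass over the data, boxes are only expanded while they stay class-pure); this is not the Fuzzy RuleNet algorithm itself. Each resulting box can be projected onto the axes to obtain one fuzzy rule.

```python
import numpy as np

def learn_hyperboxes(X, y):
    """Grow class-pure, axis-parallel hyperboxes; each box becomes one fuzzy rule."""
    covered = np.zeros(len(X), dtype=bool)
    boxes = []                                              # (lower corner, upper corner, class)
    for i in range(len(X)):
        if covered[i]:
            continue
        lo, hi, cls = X[i].copy(), X[i].copy(), y[i]
        for j in np.where(y == cls)[0]:
            new_lo, new_hi = np.minimum(lo, X[j]), np.maximum(hi, X[j])
            inside = np.all((X >= new_lo) & (X <= new_hi), axis=1)
            if not np.any(inside & (y != cls)):             # expand only if the box stays pure
                lo, hi = new_lo, new_hi
        inside = np.all((X >= lo) & (X <= hi), axis=1)
        covered |= inside & (y == cls)
        boxes.append((lo, hi, cls))
    return boxes
```

For the XOR data mentioned on the next slide, this sketch finds two boxes per class, so no information is lost when the boxes are turned into rules.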
31
Hyperbox-Oriented Rule Learning
  • Detect hyperboxes in the data; example: XOR function
  • Advantages over fuzzy cluster analysis:
  • No loss of information when hyperboxes are represented as fuzzy rules
  • Not all variables need to be used; "don't care" variables can be discovered
  • Disadvantage: each fuzzy rule uses individual fuzzy sets, i.e. the rule base is complex.

32
Structure-Oriented Rule Learning
Provide initial fuzzy sets for all variables; the data space is partitioned by a fuzzy grid. Detect all grid cells that contain data (approach by Wang/Mendel 1992). Compute the best consequents and select the best rules (extension by Nauck/Kruse 1995, NEFCLASS model).
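A compact sketch of this structure-oriented scheme under assumed representations (triangular fuzzy sets given as (a, b, c) triples, one dict of named fuzzy sets per variable). The two cycles described on the next slide (find antecedents, then determine the best consequents) are collapsed into one pass here for brevity.

```python
from collections import Counter, defaultdict

def tri(x, a, b, c):
    return max(0.0, min((x - a) / (b - a + 1e-12), (c - x) / (c - b + 1e-12)))

def learn_grid_rules(X, y, fuzzy_sets):
    """fuzzy_sets[j]: dict name -> (a, b, c) of the predefined fuzzy sets of variable j."""
    votes = defaultdict(Counter)
    for xk, yk in zip(X, y):
        # the grid cell (= rule antecedent) with the highest membership for this pattern
        cell = tuple(max(fs, key=lambda name: tri(xk[j], *fs[name]))
                     for j, fs in enumerate(fuzzy_sets))
        votes[cell][yk] += 1
    # best consequent per occupied grid cell
    return {cell: counts.most_common(1)[0][0] for cell, counts in votes.items()}
```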
33
Structure-Oriented Rule Learning
  • Simple: a rule base is available after two cycles through the training data
  • 1st cycle: discover all antecedents
  • 2nd cycle: determine the best consequents
  • Missing values can be handled
  • Numeric and symbolic attributes can be processed at the same time (mixed fuzzy rules)
  • Advantage: all rules share the same fuzzy sets
  • Disadvantage: the fuzzy sets must be given

34
Learning Fuzzy Sets
  • Gradient descent procedures: only applicable if differentiation is possible, e.g. for Sugeno-type fuzzy systems.
  • Special heuristic procedures that do not use
    gradient information.
  • The learning algorithms are based on the idea of
    backpropagation.

35
Learning Fuzzy Sets: Constraints
  • Mandatory constraints
  • Fuzzy sets must stay normal and convex
  • Fuzzy sets must not exchange their relative
    positions (they must not pass each other)
  • Fuzzy sets must always overlap
  • Optional constraints
  • Fuzzy sets must stay symmetric
  • Degrees of membership must add up to 1.0
  • The learning algorithm must enforce these
    constraints.

36
Different Neuro-Fuzzy Approaches
  • ANFIS (Jang, 1993): no rule learning, gradient descent fuzzy set learning, function approximator
  • GARIC (Berenji/Khedkar, 1992): no rule learning, gradient descent fuzzy set learning, controller
  • NEFCON (Nauck/Kruse, 1993): structure-oriented rule learning, heuristic fuzzy set learning, controller
  • FuNe (Halgamuge/Glesner, 1994): combinatorial rule learning, gradient descent fuzzy set learning, classifier
  • Fuzzy RuleNet (Tschichold-Gürman, 1995): hyperbox-oriented rule learning, no fuzzy set learning, classifier
  • NEFCLASS (Nauck/Kruse, 1995): structure-oriented rule learning, heuristic fuzzy set learning, classifier
  • Learning Fuzzy Graphs (Berthold/Huber, 1997): hyperbox-oriented rule learning, no fuzzy set learning, function approximator
  • NEFPROX (Nauck/Kruse, 1997): structure-oriented rule learning, heuristic fuzzy set learning, function approximator

37
Example: Medical Diagnosis
  • Results from patients tested for breast cancer (Wisconsin Breast Cancer Data).
  • Decision support: do the data indicate a malignant or a benign case?
  • A surgeon must be able to check the classification for plausibility.
  • We are looking for a simple and interpretable classifier → knowledge discovery.

38
Example: WBC Data Set
  • 699 cases (16 cases have missing values).
  • 2 classes: benign (458), malignant (241).
  • 9 attributes with values from {1, ..., 10} (ordinal scale, but usually interpreted as a numerical scale).
  • Experiment: x3 and x6 are interpreted as nominal attributes.
  • x3 and x6 are usually seen as important attributes.

39
Applying NEFCLASS-J
  • Tool for developing neuro-fuzzy classifiers
  • Written in Java
  • Free version available for research
  • Project started at the Neuro-Fuzzy Group of the University of Magdeburg, Germany

40
NEFCLASS Neuro-Fuzzy Classifier
41
NEFCLASS Features
  • Automatic induction of a fuzzy rule base from
    data
  • Training of several forms of fuzzy sets
  • Processing of numeric and symbolic attributes
  • Treatment of missing values (no imputation)
  • Automatic pruning strategies
  • Fusion of expert knowledge and knowledge obtained
    from data

42
Representation of Fuzzy Rules
Example with 2 rules:
R1: if x is large and y is small, then class is c1.
R2: if x is large and y is large, then class is c2.
The connections x → R1 and x → R2 are linked. The fuzzy set "large" is a shared weight, i.e. the term "large" always has the same meaning in both rules.
43
1. Training Step: Initialisation
Specify initial fuzzy partitions for all input
variables
44
2. Training Step: Rule Base
Algorithm:
for all patterns p do
    find the antecedent A such that A(p) is maximal
    if A ∉ L then add A to L
end
for all antecedents A ∈ L do
    find the best consequent C for A
    create the rule base candidate R = (A, C)
    determine the performance of R
    add R to B
end
Select a rule base from B.
Variations: fuzzy rule bases can also be created by using prior knowledge, fuzzy cluster analysis, fuzzy decision trees, genetic algorithms, ...
45
Selection of a Rule Base
  • Order rules by performance.
  • Either select the best r rules or the best r/m rules per class.
  • r is either given or determined automatically such that all patterns are covered.

46
Rule Base Induction
NEFCLASS uses a modified Wang-Mendel procedure
47
Computing the Error Signal
Fuzzy Error (jth output)
Rule Error
48
3. Training Step: Fuzzy Sets
Example: triangular membership function.
Parameter updates for an antecedent fuzzy set.
49
Training of Fuzzy Sets
Heuristics: a fuzzy set is moved away from x (or towards x) and its support is reduced (or enlarged), in order to reduce (or enlarge) the degree of membership of x.
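A sketch of this heuristic for a triangular fuzzy set (a, b, c). The step sizes and the use of a signed error (positive meaning the membership of x should be enlarged) are assumptions; the exact NEFCLASS update formulas are not reproduced here.

```python
def update_triangle(a, b, c, x, error, lr=0.1):
    """Move the fuzzy set towards x and widen it if error > 0, else away and narrower."""
    mu = max(0.0, min((x - a) / (b - a + 1e-12), (c - x) / (c - b + 1e-12)))
    shift = lr * error * (c - a) * (1.0 - mu)     # larger steps where the membership is small
    b = b + shift if x >= b else b - shift        # move the peak towards (or away from) x
    widen = lr * error * (c - a)
    a, c = a - widen, c + widen                   # enlarge (or reduce) the support
    return min(a, b), b, max(c, b)                # keep a <= b <= c
```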
50
Training of Fuzzy Sets
  • Variations
  • Adaptive learning rate
  • Online-/Batch Learning
  • Optimistic learning (n-step look-ahead)

Observing the error on a validation set
51
Constraints for Training Fuzzy Sets
  • Valid parameter values
  • Non-empty intersection of adjacent fuzzy sets
  • Keep relative positions
  • Maintain symmetry
  • Complete coverage (degrees of membership add up
    to 1 for each element)
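A sketch of a constraint step that could be applied after every update of the fuzzy sets of one variable, assuming triangular sets ordered by their peaks; only the first three constraints (valid parameters, kept relative positions, overlap) are enforced here, symmetry and complete coverage are left out.

```python
def enforce_constraints(triangles):
    """triangles: list of (a, b, c) triples of one variable, in their original order."""
    fixed = []
    for a, b, c in triangles:
        if fixed and b < fixed[-1][1]:
            b = fixed[-1][1]               # keep the relative positions of the peaks
        if fixed and a > fixed[-1][2]:
            a = fixed[-1][2]               # adjacent fuzzy sets must keep overlapping
        a, c = min(a, b), max(c, b)        # valid parameter values (a <= b <= c)
        fixed.append((a, b, c))
    return fixed
```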

52
4. Training Step: Pruning
Goal: remove variables, rules and fuzzy sets in order to improve interpretability and generalisation.
53
Pruning
Algorithm:
repeat
    select a pruning method
    repeat
        execute a pruning step
        train the fuzzy sets
        if no improvement then undo the step
    until no improvement
until no further method
Pruning methods:
1. Remove variables (use correlations, information gain, etc.)
2. Remove rules (use rule performance)
3. Remove terms (use degree of fulfilment)
4. Remove fuzzy sets (use fuzziness)
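A sketch of the pruning loop above; `model`, the list of pruning `methods` (each returning a pruned copy or None when it has nothing left to try), `train_fuzzy_sets` and `performance` are placeholders for the corresponding NEFCLASS components.

```python
def prune(model, methods, train_fuzzy_sets, performance):
    """Keep a pruning step only if retraining does not decrease the performance."""
    best = performance(model)
    for method in methods:                 # 1. variables, 2. rules, 3. terms, 4. fuzzy sets
        while True:
            candidate = method(model)      # execute one pruning step
            if candidate is None:
                break
            train_fuzzy_sets(candidate)
            score = performance(candidate)
            if score >= best:
                model, best = candidate, score   # keep the step
            else:
                break                            # undo (discard) the step, try the next method
    return model
```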
54
WBC Learning Result: Fuzzy Rules
R1: if uniformity of cell size is small and bare nuclei is fuzzy0 then benign
R2: if uniformity of cell size is large then malignant
55
WBC Learning Result: Classification Performance
Estimated performance on unseen data (cross-validation):
  • NEFCLASS-J: 95.42%
  • Discriminant analysis: 96.05%
  • C4.5: 95.10%
  • NEFCLASS-J (numeric): 94.14%
  • Multilayer perceptron: 94.82%
  • C4.5 rules: 95.40%
56
WBC Learning Result: Fuzzy Sets
57
NEFCLASS-J
58
Resources
Detlef Nauck, Frank Klawonn, Rudolf Kruse: Foundations of Neuro-Fuzzy Systems. Wiley, Chichester, 1997, ISBN 0-471-97151-0
Neuro-fuzzy software (NEFCLASS, NEFCON, NEFPROX): http://www.neuro-fuzzy.de
Beta version of NEFCLASS-J: http://www.neuro-fuzzy.de/nefclass/nefclassj
59
Conclusions
  • Neuro-Fuzzy-Systems can be useful for knowledge
    discovery.
  • Interpretability enables plausibility checks and
    improves acceptance.
  • (Neuro-)Fuzzy systems exploit tolerance for
    sub-optimal solutions.
  • Neuro-fuzzy learning algorithms must observe
    constraints in order not to jeopardise the
    semantics of the model.
  • Not an automatic model creator; the user must work with the tool.
  • Simple learning techniques support explorative
    data analysis.

60
Download NEFCLASS-J
Download the free version of NEFCLASS-J at http://fuzzy.cs.uni-magdeburg.de
61
Fuzzy Methods in Information Mining: Examples
  • Here: exploiting quantitative and qualitative information
  • Fuzzy Data Analysis (Projects with Siemens)
  • Information Fusion (EC Project)
  • Dependency Analysis (Project with
    Daimler/Chrysler)

62
Analysis of Daimler/Chrysler Database
  • Database: 18,500 passenger cars, > 100 attributes per car
  • Analysis of dependencies between special
    equipment and faults.
  • Results used as a starting point for technical
    experts looking for causes.

63
Learning Graphical Models
local models
64
The Learning Problem
65
Possibility Theory
  • A fuzzy set μ induces a possibility distribution π(x) = μ(x)
  • Axioms of a possibility measure Π: Π(∅) = 0, Π(Ω) = 1, Π(A ∪ B) = max(Π(A), Π(B))

66
General Structure of (most) Learning Algorithms
for Graphical Models
  • Use a criterion to measure the degree to which a network structure fits the data and the prior knowledge (model selection, goodness of the hypergraph)
  • Use a search algorithm to find a model that receives a high score under the criterion (optimal spanning tree, K2: greedy selection of parents, ...)

67
Measuring the Deviation from an Independent
Distribution
Probability- and information-based measures:
  • information gain (identical with mutual information)
  • information gain ratio
  • g-function (Cooper and Herskovits)
  • minimum description length
  • Gini index

Possibilistic measures:
  • expected nonspecificity
  • specificity gain
  • specificity gain ratio

(Several of these measures originated from decision tree learning.)
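A generic sketch of the search step for graphical models mentioned on the previous slide, in the spirit of K2: greedy selection of parents under a fixed variable ordering, where `score(var, parents)` stands for any of the criteria listed above. All names are placeholders.

```python
def greedy_parent_selection(variables, score, max_parents=3):
    """Greedily add the parent that improves the scoring criterion most."""
    structure = {}
    for i, var in enumerate(variables):
        parents, best = [], score(var, [])
        candidates = list(variables[:i])            # K2: only earlier variables may be parents
        while candidates and len(parents) < max_parents:
            gain, cand = max((score(var, parents + [c]), c) for c in candidates)
            if gain <= best:
                break
            parents.append(cand)
            candidates.remove(cand)
            best = gain
        structure[var] = parents
    return structure
```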
68
Data Mining Tool Clementine
69
Analysis of Daimler/Chrysler Database
[Dependency graph over the attributes: electrical roof top, air conditioning, type of engine, type of tyres, slippage control, faulty battery, faulty compressor, faulty brakes]
Fictitious example: there are significantly more faulty batteries if both air conditioning and an electrical roof top are built into the car.