Title: Producing Artificial Neural Networks using a Simple Embryogeny

Chris Bowers
School of Computer Science, University of Birmingham
cpb_at_cs.bham.ac.uk
Introduction
Artificial Neural Networks (ANNs) can be extremely powerful, and this power stems largely from their massively parallel structure. However, the structure of such networks is entirely problem dependent, and there is no strong theory for how to construct an ANN for a given problem. Because the structure of a successful ANN cannot be predetermined, there has been only limited success in producing networks that solve large, complex problems: the number of parameters describing a network grows almost exponentially as the network grows.

To overcome this scalability problem, neural networks have been integrated into evolutionary systems (EANNs), so that optimal networks for more complex problems can be searched for. This approach has produced novel architectures and networks never achieved by hand-coded methods. However, it too has had limited success at producing large-scale complex networks, since the size of the genetic search space is directly related to the number of free parameters in the ANN; for larger networks the search space becomes enormous. Most EANN implementations therefore still lack scalability when large ANNs are considered.

In recent years, an even more biologically inspired approach to EANNs has been suggested. Nature solves the scalability problem by evolving the rules by which biological neural networks are grown. These growth rules are applied at the cellular level during development, in a process known as embryogeny. The size of the genetic search space then no longer depends on the size of the neural network, but on the amount of linearity, repetition and modularity exhibited within the resultant network.

The work shown on this poster is an example of how a simple model of embryogeny can be used to grow neural networks, and it identifies some of the difficulties highlighted by this approach.
Process of mapping genotype to phenotype

The genotypes are mapped to phenotypes using a growth process. This growth process is performed on a grid which defines an environment in which, at each location, a cell can exist. To perform a growth step, the state of each cell in the grid is encoded into a binary string upon which the genotype operates; the result of this operation is the new state the cell must take. This is done in parallel for every cell in the grid. The state of a cell consists of:
- the current cell type
- the cell types of the surrounding cells
- the level of chemical in the surrounding area
- axon direction and length

This allows each cell to perform operations such as die, divide, move, differentiate and produce an axon.
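The parallel growth step described above can be sketched as follows. This is a minimal illustration, not the poster's actual implementation: the state encoding is reduced to a cell's own type plus its four neighbours' types, and the toy genotype is an arbitrary rule chosen only to show the update mechanics.

```python
# Minimal sketch of one parallel growth step. The state encoding and
# the toy genotype are illustrative assumptions.

EMPTY, NEURON = 0, 1

def encode_state(grid, x, y):
    """Encode a cell's local state as a tuple: its own type followed by
    the types of its north, south, west and east neighbours."""
    h, w = len(grid), len(grid[0])
    def at(i, j):
        return grid[i][j] if 0 <= i < h and 0 <= j < w else EMPTY
    return (grid[x][y], at(x - 1, y), at(x + 1, y), at(x, y - 1), at(x, y + 1))

def growth_step(grid, genotype):
    """Apply the genotype to every cell in parallel: all new states are
    computed from the old grid before any cell is updated."""
    return [[genotype(encode_state(grid, x, y))
             for y in range(len(grid[0]))]
            for x in range(len(grid))]

# Toy genotype: an empty cell with exactly one neuron neighbour
# becomes a neuron (a crude "divide"); everything else is unchanged.
def toy_genotype(state):
    own, *neighbours = state
    if own == EMPTY and sum(neighbours) == 1:
        return NEURON
    return own

grid = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
print(growth_step(grid, toy_genotype))
# → [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
```

Because every new state is derived from the old grid, the update order does not matter, which is what makes the step genuinely parallel.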
Discussion
The aim of this work is not to produce a model of neural cell growth for developmental biology or neuroscience, but to understand the basic forces in development that make it so powerful at expressing complex systems in comparatively simple genotypes. It is hoped that future work will show that development can be used to produce more complex systems than can currently be produced using a traditional approach, without the restrictions imposed by scalability.

Only preliminary testing has been performed to determine the capabilities of this model. A simple XOR network was found quite easily, and several solutions have been found for n-parity problems of various sizes. However, to really test the scalability of this embryogeny, the task was extended to the harder problem of finding a single genome which solves every n-parity problem in n growth steps. This has proved extremely difficult.

A number of important problems have not yet been dealt with in this work. Since a simple back-propagation method was used to determine the network weights, training performance still depends on the size of the network, so network size is still limited in this model by the computational power required to obtain optimal weights. The genotype search space can also still be quite large for more complex genotypes, and the representation used here likely produces a rather rugged search space which is difficult for evolution to navigate.
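The n-parity benchmark mentioned above is standard; as a sketch, the full training set for a given n can be enumerated directly (the helper name below is ours). XOR is simply the n = 2 case, which is why it serves as the entry point for these tests.

```python
# Enumerate the n-parity training patterns: target is 1 when the
# input contains an odd number of 1s.

from itertools import product

def parity_patterns(n):
    """All 2**n binary input patterns paired with their parity targets."""
    return [(bits, sum(bits) % 2) for bits in product((0, 1), repeat=n)]

# XOR is the n = 2 case:
print(parity_patterns(2))
# → [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
```

The set doubles in size with each increment of n, which is part of what makes "one genome for all n-parity problems" such a demanding test of scalability.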
Determining the fitness of a genotype
The fitness of a given genotype depends on the fitness of its resultant phenotype. Since the phenotype defines a grid occupied by an arrangement of cells with axons protruding from them, an interpretation algorithm is required to produce an ANN structure from this phenotype. The process consists of first determining each cell's type and then determining its connectivity with the other cells in the network. Connectivity between two cells depends on the distance from the axon head of one cell to the body of the other (figure 2). Once this is completed, the network can be evaluated, which requires a three-stage process.
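The connectivity rule above can be sketched as a simple distance test. The axon representation (a direction and a length from the cell body) and the connection threshold are assumptions made for illustration; the poster's actual geometry is shown in figure 2.

```python
# Hedged sketch of the interpretation step: cell A connects to cell B
# when A's axon head lands close enough to B's body. The axon encoding
# (angle in radians, length) and the threshold are assumptions.

import math

def axon_head(cell):
    """Position of the axon head, given the body position and an axon
    stored as (direction, length)."""
    (x, y), (theta, length) = cell["body"], cell["axon"]
    return (x + length * math.cos(theta), y + length * math.sin(theta))

def connected(a, b, threshold=1.0):
    """True when a's axon head is within `threshold` of b's body."""
    hx, hy = axon_head(a)
    bx, by = b["body"]
    return math.hypot(hx - bx, hy - by) <= threshold

a = {"body": (0.0, 0.0), "axon": (0.0, 3.0)}   # axon points east, length 3
b = {"body": (3.0, 0.5), "axon": (0.0, 0.0)}
print(connected(a, b))  # → True: head (3, 0) is within 1.0 of (3, 0.5)
```

Running this test over every ordered pair of cells yields the directed connection list from which the ANN structure is assembled.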
1. Does the network consist of the correct number of inputs and outputs?
2. Does the network connectivity result in a valid network?
3. How well does the network perform when trained on a set of training patterns, taking an average over a number of randomly initialised networks?

Genotype structure

The genotypic structure is based upon a recurrent version of the Cartesian Genetic Programming (CGP) approach originally developed for Boolean function learning. The genotype consists of a set of nodes arranged in a grid (figure 1). The number of rows in the grid depends on the number of binary outputs required from the Boolean function; the number of columns depends on the complexity of the Boolean function required. Each node defines a Boolean operation, such as NAND or NOT, and a set of binary inputs upon which the operation acts. These inputs are either direct inputs to the CGP or the outputs of other nodes. In this manner, Boolean operations can be connected together in a graph to form a Boolean function which operates on a binary string.
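A CGP-style genotype of this kind can be sketched as a list of nodes, each naming a Boolean operation and the indices of its inputs. This is a minimal feed-forward illustration of the node-graph idea; the recurrent variant used in this work is more involved, and the XOR genotype below is our own example.

```python
# Minimal sketch of a CGP-style genotype: each node is (operation,
# input indices). Indices below the number of CGP inputs refer to those
# inputs; higher indices refer to earlier nodes' outputs.

OPS = {
    "NAND": lambda a, b: 1 - (a & b),
    "NOT":  lambda a, b: 1 - a,   # second input ignored
}

def evaluate(genotype, inputs, output_index):
    """Evaluate the node graph on a binary input string and return the
    value at output_index."""
    values = list(inputs)
    for op, (i, j) in genotype:
        values.append(OPS[op](values[i], values[j]))
    return values[output_index]

# XOR(a, b) built from four NAND nodes: indices 0 and 1 are the CGP
# inputs, indices 2..5 are the nodes below in order.
xor_genotype = [
    ("NAND", (0, 1)),  # node 2
    ("NAND", (0, 2)),  # node 3
    ("NAND", (1, 2)),  # node 4
    ("NAND", (3, 4)),  # node 5: XOR output
]
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", evaluate(xor_genotype, (a, b), 5))
```

Mutating such a genotype means changing a node's operation or rewiring its input indices, which is what makes the representation convenient for evolutionary search.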
Problems with an embryogeny approach

Since an embryogeny introduces a complex mapping into the evolutionary process, it effectively results in two separate search spaces:
- a genotype search space, which consists of all possible growth rules and is entirely dependent upon the representation used for those rules;
- a phenotype search space, which consists of all possible neural networks, many of which would produce identical results and many of which are totally invalid neural networks (this space may be infinite in size).

The phenotype space is likely to be much larger than the genotype space, and there is likely to be a many-to-one mapping between the two, so the genotype space will only map to certain locations in the phenotype space. The representation used will therefore limit the number of possible ANN architectures that can be produced (figure 3). It is imperative in this case that the representation exhibit the following two characteristics:
- it should produce a fitness landscape in the genotype space which is conducive to evolutionary search, i.e. as smooth and with as few local optima as possible;
- it should map only to areas of the phenotype space that consist of valid solutions, and keep neutrality in the fitness landscape of those areas to a minimum.
Further Work
The model described here has highlighted many problem areas in the use of an embryological system, mainly the huge and rugged genotype landscape which makes evolution towards an optimal ANN extremely hard. Directions for further work are mainly concerned with improving the evolvability of the systems described here:
- Ensure the representation used has the flexibility to exhibit safe mutation and to allow developmental processes to be initiated at different times during development. In natural development these are known as canalisation and heterochrony respectively, and both seem to be extremely important in allowing efficient evolution.
- The capabilities of a genotype representation can be expanded by adding more genes to the genotype. However, this results in variable genome length, which introduces a new set of problems for evolutionary computation. Nature overcomes these problems using gene duplication.
References

X. Yao, "Evolving artificial neural networks", Proceedings of the IEEE, 87(9):1423-1447, September 1999.

J. F. Miller and P. Thomson, "Cartesian Genetic Programming", Third European Conference on Genetic Programming, Edinburgh, April 15-16, 2000. Proceedings published as Lecture Notes in Computer Science, Vol. 1802, pp. 121-132.