Title: Case Study: Genetic Algorithms
1 Case Study: Genetic Algorithms
- GAs are an area of AI research
  - used to solve search problems with potentially better performance than traditional search techniques
  - also possibly used as a form of learning
- In an AI search problem, you have a search space
  - the search space is the space of possible solutions, for instance
    - in chess, it would be all possible board configurations
    - in design, it would be all of the possible components available to build the device along with all the possible ways to configure them
  - often, AI problems require a brute-force search, but sometimes a heuristic is available to help guide the search
    - in chess, we use a function based on the worth of each piece and the strategy of moving them into certain board locations to see if a move is worth pursuing or not
    - in design, we might rate components based on their utility, cost, and weight, and rate their configurations based on how easy it is to get to a component
  - the guided search, known as heuristic search, is still intractable (for n moves, there are 2^n board configurations)
  - GAs are an attempt to get around that problem
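As a rough illustration of the chess heuristic mentioned above, the sketch below scores a position by piece worth only. The piece values are the conventional textbook ones, not values given on the slide, and `material_score` is a hypothetical name; the positional part of the heuristic (moving pieces into certain board locations) is left out.

```python
# Conventional piece values; assumed here, the slide does not list any.
PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_score(own_pieces, opponent_pieces):
    """Return our material minus the opponent's; higher is better for us."""
    own = sum(PIECE_VALUES[piece] * count for piece, count in own_pieces.items())
    opp = sum(PIECE_VALUES[piece] * count for piece, count in opponent_pieces.items())
    return own - opp

# Hypothetical position: we are up a rook but down a pawn.
ours = {"pawn": 7, "rook": 2, "queen": 1}
theirs = {"pawn": 8, "rook": 1, "queen": 1}
print(material_score(ours, theirs))  # 4
```

A move would be worth pursuing if the position it leads to scores higher than the alternatives.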
2 Search Using Randomness
- Assume we are searching for the proper values to place into a list
  - (a b c d e f), where each of these represents something meaningful
  - for instance, a = number of nails to use, b = number of boards to use, c = number of wall outlets, d = number of windows, etc.
- Assume we have a function that can take a vector (or list) and tell us approximately how good it is
  - I can try to generate all possible combinations of vectors, but that is intractable
  - now consider that many vectors are probably worthless
  - if I can identify a worthless vector, I probably wouldn't want to use any variations of that vector
  - similarly, if I find a good but not great vector, I might try variations of it
- What is a variation?
  - Some minor but random change to the original
  - (a b c d e f) becomes (a b d c e f) or (a' b c d e f), where a' is 1 greater or less than a
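The two kinds of variation described above (swapping two neighboring entries, or nudging one entry up or down by 1) might be sketched as follows. The function names and the example counts are assumptions made for illustration only.

```python
import random

def swap_neighbors(vector, rng=random):
    """Return a copy with two adjacent entries swapped, e.g. (a b c d) -> (a c b d)."""
    child = list(vector)
    i = rng.randrange(len(child) - 1)
    child[i], child[i + 1] = child[i + 1], child[i]
    return child

def nudge(vector, rng=random):
    """Return a copy with one randomly chosen entry increased or decreased by 1."""
    child = list(vector)
    i = rng.randrange(len(child))
    child[i] += rng.choice((-1, 1))
    return child

# Hypothetical counts for (a b c d e f): nails, boards, outlets, windows, and two more.
original = [40, 12, 6, 4, 2, 1]
print(swap_neighbors(original))
print(nudge(original))
```

Both functions work on a copy, so the original vector stays available for further variations.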
3 Inspired by Natural Selection
- GAs get their name because they are an attempt to mimic the evolutionary process of genetic material
- Our vector represents a chromosome
- We start with a population of chromosomes that we will breed
- The children will be made up of much the same genetic material as their parents (that is, the children's chromosomes will be very similar to the parents' chromosomes)
  - but there will be some variations
  - possible mutations: a particular gene mutates, perhaps for the better, perhaps for the worse
  - possible cross-overs: one child might inherit the better genes and another the worse genes
- Now we use natural selection
  - we pick only a subset of children, which are the fittest to survive, and use them to be parents of a new generation
  - to determine the fitness of a child, we need a fitness function
- We iterate through many generations, hopefully moving our genes towards a solution guided by our fitness function
4 General Procedure for GAs
Starting with a base population of parents, mutate them into children, rate the children with the fitness function, and select from the children those that go on to the next generation. Notice that the procedure makes the effort to copy the vectors; these copying steps can be skipped if we don't mind working with the original parents.
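The procedure described above might look roughly like the loop below. This is a minimal sketch, not the slide's own code: `evolve`, its parameters, and the toy problem (maximizing the number of 1 bits) are all assumptions chosen to keep the example small.

```python
import random

def evolve(parents, fitness, mutate, n_children, n_survivors, generations, rng=random):
    """Run the mutate -> rate -> select cycle and return the final population."""
    population = [list(p) for p in parents]  # copy so the original parents are untouched
    for _ in range(generations):
        # Each child is a mutated copy of a randomly chosen parent.
        children = [mutate(list(rng.choice(population)), rng) for _ in range(n_children)]
        children.sort(key=fitness, reverse=True)  # rate the children, best first
        population = children[:n_survivors]       # survivors parent the next generation
    return population

# Toy problem (assumed for illustration): maximize the number of 1 bits.
def count_ones(vector):
    return sum(vector)

def flip_one_bit(vector, rng):
    i = rng.randrange(len(vector))
    vector[i] = 1 - vector[i]
    return vector

rng = random.Random(0)
start = [[0] * 8 for _ in range(4)]
final = evolve(start, count_ones, flip_one_bit,
               n_children=20, n_survivors=4, generations=30, rng=rng)
print(max(final, key=count_ones))
```

Because only children survive in this sketch, the best vector found so far can be lost from one generation to the next. Dropping the two copy steps would save some work at the cost of modifying the parents in place.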
5 Requirements
- We need to be able to express our problem as a vector
  - the vector will be the set of attributes whose values we want to assign or learn
  - consider that we want to make a better chocolate chip cookie: our vector might include the amounts of flour, sugar, salt, water, and chocolate chips that we will put into our mixture
- We need a fitness function to evaluate the children
  - for a cookie, we might ask humans to eat the cookies and rate them; this of course means that our evolutionary process is greatly slowed down
- We need to know how to mutate our parents into new children
  - cross-over, point mutation, inversion
- We need to know how many children to generate
  - such decisions might dictate how quickly or slowly we reach a solution
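To make the cookie example concrete, the sketch below represents each recipe as a vector and uses the average of human taster scores as the fitness. The attribute order, units, recipes, and scores are all hypothetical values invented for illustration.

```python
from statistics import mean

# Attribute order for the recipe vector (units assumed, e.g. grams).
ATTRIBUTES = ("flour", "sugar", "salt", "water", "chocolate_chips")

def fitness(ratings):
    """Average the taster scores collected for one baked recipe."""
    return mean(ratings)

# Hypothetical recipes and hypothetical taster scores on a 1-10 scale.
recipes = {
    "A": (250, 150, 3, 60, 200),
    "B": (250, 100, 5, 80, 120),
}
taster_scores = {"A": [8, 7, 9], "B": [5, 6, 4]}

best = max(recipes, key=lambda name: fitness(taster_scores[name]))
print(best, dict(zip(ATTRIBUTES, recipes[best])))
```

Every fitness evaluation here requires baking and tasting, which is why a human-in-the-loop fitness function slows each generation so much.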
6 Vectors
- Vectors represent features (attributes)
  - they could be binary values
    - (a, b, c, d, e) where a = fever?, b = soreness?, c = achy eyes?, d = runny nose?, e = coughing?
    - we might wish to learn a concept like flu
    - start with (0, 0, 0, 0, 0) (no symptoms) and learn that this is not the flu
    - eventually we may learn that flu can be represented by the vector (1, 1, 1, ?, ?) (? means we don't care what the value is)
  - vectors might store integer values
  - vectors might contain a mixture of values, in which case we have to be careful when doing things like cross-over
    - consider an animal vector (a, b, c, d, e, f) where a = height, b = weight, c = number of legs, d = color, e = shape, and f = tail?
    - elephant = (4, 2000 lbs, 4, grey, round, y)
    - mouse = (3, 2 lbs, 4, grey, round, y)
    - dog = (1, 20 lbs, 4, varies, long, y)
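One way to read the flu vector (1, 1, 1, ?, ?) is as a pattern that a patient's symptom vector either matches or does not. The sketch below assumes `None` stands in for "?"; `matches` is a hypothetical helper name.

```python
DONT_CARE = None  # stands in for the "?" entries

def matches(pattern, vector):
    """True if every position that is not a don't-care agrees with the vector."""
    return all(p is DONT_CARE or p == v for p, v in zip(pattern, vector))

# Order: (fever, soreness, achy eyes, runny nose, coughing)
flu = (1, 1, 1, DONT_CARE, DONT_CARE)

print(matches(flu, (1, 1, 1, 0, 1)))  # True
print(matches(flu, (0, 0, 0, 0, 0)))  # False: no symptoms is not the flu
```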
7 Natural Selection Methods
- Evolutionary mechanisms are
  - Inversion - moving around features in the vector, such as reversing 3 features
  - Point mutation - changing a feature's value to another value (in binary vectors, simply complementing the bit; in multi-valued vectors, this requires random selection of a new value)
  - Crossover - using two parents and swapping portions of their two chromosomes
- The choice of which mechanism to use is made randomly, and the choice of how/where to apply it is made randomly
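The three mechanisms above, together with the random choice among them, could be sketched for binary vectors as follows. All function names are hypothetical, the crossover shown is a single-point variant (one of several ways to swap portions), and the point mutation only handles the binary case.

```python
import random

def invert(vector, rng=random, width=3):
    """Reverse a randomly placed run of `width` features."""
    child = list(vector)
    start = rng.randrange(len(child) - width + 1)
    child[start:start + width] = child[start:start + width][::-1]
    return child

def point_mutate(vector, rng=random):
    """Complement one randomly chosen bit (binary vectors only)."""
    child = list(vector)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return child

def crossover(mother, father, rng=random):
    """Swap the tails of two parents at a random cut point; returns two children."""
    cut = rng.randrange(1, len(mother))
    return mother[:cut] + father[cut:], father[:cut] + mother[cut:]

def vary(mother, father, rng=random):
    """Pick one mechanism at random and apply it at a random position."""
    mechanism = rng.choice(("inversion", "point mutation", "crossover"))
    if mechanism == "inversion":
        return [invert(mother, rng)]
    if mechanism == "point mutation":
        return [point_mutate(mother, rng)]
    return list(crossover(mother, father, rng))

mother = [1, 0, 1, 1, 0, 0]
father = [0, 1, 0, 0, 1, 1]
print(vary(mother, father))
```

For vectors with mixed value types, point mutation would need to draw a new value from that feature's own range, and inversion or crossover would need to avoid moving a value into a position of a different type.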
- Natural Selection mechanisms include
  - Fitness Ranking - use a fitness function to select the best available vector (or vectors) and use it (them)
  - Rank Method - use the fitness function but do not simply select the best; select probabilistically instead
  - Random Selection - in addition to the top vector(s), some approaches randomly select some number of vectors from the remaining, lesser ranked ones
  - Diversity - determine which vectors are the most diverse from the top ranked one(s) and select it (them)
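The first three selection mechanisms above might be sketched as follows. The function names are hypothetical, and the linear rank weighting in `rank_method` is an assumption; the slide only says that probabilities are used.

```python
import random

def fitness_ranking(population, fitness, k):
    """Keep the k highest-scoring vectors."""
    return sorted(population, key=fitness, reverse=True)[:k]

def rank_method(population, fitness, k, rng=random):
    """Draw k vectors at random, with better-ranked vectors more likely to be drawn."""
    ranked = sorted(population, key=fitness, reverse=True)
    weights = [len(ranked) - i for i in range(len(ranked))]  # assumed linear weighting
    return rng.choices(ranked, weights=weights, k=k)  # samples with replacement

def top_plus_random(population, fitness, n_top, n_random, rng=random):
    """Keep the n_top best, then add n_random picked uniformly from the rest."""
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[:n_top] + rng.sample(ranked[n_top:], n_random)

population = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(fitness_ranking(population, sum, 2))  # [[1, 1, 1], [1, 1, 0]]
print(rank_method(population, sum, 3))
print(top_plus_random(population, sum, 1, 2))
```

Since `rank_method` samples with replacement, it can return the same vector more than once, which matters if duplicate vectors are not wanted in a generation.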
8 Choices/Diversity
- How many vectors should make up the population of one generation?
  - if too low, vectors will be the same or similar to previous generations
  - if too high, computation time may be too long
- What is the mutation rate?
  - If too low, changes will occur infrequently; if too high, there is no guided search
- Is mating allowed?
- Are duplicate vectors allowed?
- Based on the idea that diversity helps promote survival, it might also be reasonable to select vectors which are most diverse from other selections
  - Select the first vector(s) for the new generation using the rank or fitness method
  - Select the remaining vector(s) by finding the one(s) most different from those already chosen
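The two-step diversity selection above could be sketched as follows. Measuring diversity with Hamming distance and adding vectors greedily are both assumptions; the slide does not say how diversity is measured. `select_diverse` is a hypothetical name.

```python
def hamming(u, v):
    """Number of positions at which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(u, v))

def select_diverse(population, fitness, n_top, n_total):
    """Pick n_top by fitness, then greedily add the vectors farthest from those chosen."""
    ranked = sorted(population, key=fitness, reverse=True)
    chosen, rest = ranked[:n_top], ranked[n_top:]
    while len(chosen) < n_total and rest:
        # Distance to the chosen set is the distance to its nearest member.
        farthest = max(rest, key=lambda v: min(hamming(v, c) for c in chosen))
        chosen.append(farthest)
        rest.remove(farthest)
    return chosen

population = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]]
print(select_diverse(population, sum, n_top=1, n_total=2))
# [[1, 1, 1, 1], [0, 0, 0, 1]]
```

In the example, the second pick is the lowest-scoring vector because it differs from the top vector in the most positions, which shows how diversity selection trades fitness for variety.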