Title: H8AAA
1CY3G2 Modern Heuristics
Lecture 9
- Dr. Gillian Walker
- Room 182
- g.c.walker_at_reading.ac.uk
2Problem of last Week
- There are five houses, each of a different colour and inhabited by women of different nationalities, with different pets, favourite drinks and cars. Moreover:
- The Englishwoman lives in the red house.
- The Spaniard owns the dog.
- The woman in the green house drinks cocoa.
- The Ukrainian drinks eggnog.
- The green house is immediately to the right of the ivory house.
- The owner of the Toyota also owns snails.
- The owner of the Ford lives in the yellow house.
3Problem of last week
- The woman in the middle house drinks milk.
- The Norwegian lives in the first house on the left.
- The woman who owns the Chevrolet lives in the house next to the house where the woman owns a fox.
- The Ford owner's house is next to the house where the horse is kept.
- The Mercedes-Benz owner drinks orange juice.
- The Japanese drives a Volkswagen.
- The Norwegian lives next to the blue house.
- The question is: who owns the zebra? Furthermore, who drinks water?
4Solution of last week.
- Establish a table representing all the information, or implied information, we were given.
- Fill in the direct information first.
- Sentence 1 states that the Englishwoman lives in the red house.
- Sentence 5 states that the green house is immediately to the right of the ivory house.
- So house 1 isn't red (the Norwegian lives there), and since house 2 is blue, houses 3 and 4 or houses 4 and 5 are ivory and green.
House    1          2          3          4          5
Colour   -          Blue       -          -          -
Drink    -          -          Milk       -          -
Country  Norwegian  -          -          -          -
Car      -          -          -          -          -
Pet      -          -          -          -          -
5Solution of last week.
- So house 1 must be yellow. Therefore the Norwegian owns the Ford and the horse is kept in house 2.
- From here there are two possibilities. The only possible sequences for the colours of houses 3, 4 and 5 are ivory, green, red or red, ivory, green.
House    1          2          3          4          5
Colour   Yellow     Blue       -          -          -
Drink    -          -          Milk       -          -
Country  Norwegian  -          -          -          -
Car      Ford       -          -          -          -
Pet      -          Horse      -          -          -
6Solution of last week.
- Consider ivory, green, red.
- We can now infer more information. The Englishwoman lives in house 5, as it is red. Cocoa is drunk in house 4 because it is green. The Ukrainian must live in house 2 because she drinks eggnog, and houses 3 and 4 already have their drinks. Orange juice goes with the Mercedes, so it cannot be drunk in house 1 (the Ford); the Englishwoman therefore owns a Mercedes and drinks orange juice. Our information is:
House    1          2          3          4          5
Colour   Yellow     Blue       Ivory      Green      Red
Drink    -          Eggnog     Milk       Cocoa      Orange
Country  Norwegian  Ukrainian  -          -          English
Car      Ford       -          -          -          Mercedes
Pet      -          Horse      -          -          -
7Solution of last week.
- Consider ivory, green, red.
- This won't lead to a solution!
- Who owns the Toyota? The Norwegian owns the Ford and the Englishwoman the Mercedes. The Japanese owns the Volkswagen. It isn't the Ukrainian, because she owns a horse, not snails, and it can't be the Spaniard, because she owns a dog.
- We have reached a contradiction, so we will try the other colour combination.
House    1          2          3          4          5
Colour   Yellow     Blue       Ivory      Green      Red
Drink    -          Eggnog     Milk       Cocoa      Orange
Country  Norwegian  Ukrainian  -          -          English
Car      Ford       -          -          -          Mercedes
Pet      -          Horse      -          -          -
8Solution of last week.
- Consider red, ivory, green.
- The Ukrainian drinks eggnog and so must live in house 2 or 4.
- If the Ukrainian lives in house 4 then the Spaniard (who owns a dog) must live in house 5 and the Japanese must live in house 2.
- Also, orange juice must be drunk in house 2, whose inhabitant drives a Mercedes. This is a contradiction because the Japanese owns a Volkswagen, so the Ukrainian must live in house 2.
House    1          2          3          4          5
Colour   Yellow     Blue       Red        Ivory      Green
Drink    -          -          Milk       -          Cocoa
Country  Norwegian  -          English    -          -
Car      Ford       -          -          -          -
Pet      -          Horse      -          -          -
9Solution of last week.
- Consider red, ivory, green.
- The owner of the Mercedes drinks orange juice and must live in house 4.
- The Japanese, who owns a Volkswagen, must live in house 5.
- The Spaniard (who owns a dog) lives in house 4.
House    1          2          3          4          5
Colour   Yellow     Blue       Red        Ivory      Green
Drink    -          Eggnog     Milk       Orange     Cocoa
Country  Norwegian  Ukrainian  English    Spaniard   Japanese
Car      Ford       -          -          Mercedes   Volkswagen
Pet      -          Horse      -          Dog        -
10Solution of last week.
- The Toyota owner also owns snails, so she must live in house 3.
- The Chevrolet owner is in house 2 and the fox is in house 1.
- So the Japanese owns the zebra, and
- the Norwegian drinks the water.
House    1          2          3          4          5
Colour   Yellow     Blue       Red        Ivory      Green
Drink    -          Eggnog     Milk       Orange     Cocoa
Country  Norwegian  Ukrainian  English    Spaniard   Japanese
Car      Ford       Chevrolet  Toyota     Mercedes   Volkswagen
Pet      Fox        Horse      Snails     Dog        -
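The hand deduction above can be cross-checked by machine. The following is a minimal sketch, not part of the original slides: it tries permutations of house positions and discards a branch as soon as a clue fails. The function names (`solve`, `right_of`, `next_to`) and the encoding (each variable holds a house index, 0 being the leftmost house) are assumptions made for illustration.

```python
from itertools import permutations

HOUSES = range(5)  # index 0 is the leftmost house, index 2 the middle one


def right_of(a, b):
    """True if house a is immediately to the right of house b."""
    return a - b == 1


def next_to(a, b):
    """True if houses a and b are neighbours."""
    return abs(a - b) == 1


def solve():
    """Return (zebra owner, water drinker) using a pruned exhaustive search."""
    nations = ("English", "Spaniard", "Ukrainian", "Norwegian", "Japanese")
    for red, green, ivory, yellow, blue in permutations(HOUSES):
        if not right_of(green, ivory):
            continue
        for english, spaniard, ukrainian, norwegian, japanese in permutations(HOUSES):
            if english != red or norwegian != 0 or not next_to(norwegian, blue):
                continue
            for cocoa, eggnog, milk, orange, water in permutations(HOUSES):
                if cocoa != green or ukrainian != eggnog or milk != 2:
                    continue
                for toyota, ford, chevrolet, mercedes, volkswagen in permutations(HOUSES):
                    if ford != yellow or mercedes != orange or japanese != volkswagen:
                        continue
                    for dog, snails, fox, horse, zebra in permutations(HOUSES):
                        if (spaniard == dog and toyota == snails
                                and next_to(chevrolet, fox)
                                and next_to(ford, horse)):
                            # Map each house index to the nationality living there.
                            positions = (english, spaniard, ukrainian, norwegian, japanese)
                            owner = dict(zip(positions, nations))
                            return owner[zebra], owner[water]
    return None
```

Calling `solve()` returns the zebra owner and the water drinker as a pair. Checking the cheap clues in the outer loops keeps the search small, which is the same idea as filling in the direct information first.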
11Summary
- In the last lecture we looked at how we could further develop evolutionary algorithms to deal with problem constraints.
- In this lecture we are going to discuss how to tune our algorithms to our problems.
- In addition we are going to look at hybridisation.
- Finally we are going to test to make sure we have the best solution.
12Tuning
- Each problem-solving technique we have looked at has had parameters. In general, the more complicated the technique, the more parameters were involved:
- Hill climbing (size of local search space)
- Tabu search (how to implement the memory structure)
- Simulated annealing (temperature parameter and how to cool)
- Evolutionary algorithms have even more parameters: population size, who breeds, how they breed, who passes on to the next generation, mutation probabilities, termination conditions, etc.
13Tuning
- How do we tune these parameters to:
- Solve a given problem in the shortest possible time?
- Solve as many problems with the one algorithm as possible?
- You have experience of this from looking at the Sudoku code.
- Trial and error is tedious and time consuming.
- There are loads of parameters to tune.
- Once you have optimised one parameter this may affect the optimum value of another parameter, and so on.
- Trying all possible combinations is practically impossible, and it is difficult to know when you have the optimum parameter combination.
14Tuning
- There are two approaches to parameter tuning:
- Tune parameters before you run the algorithm.
- Tune parameters during the running of the algorithm.
- The latter option inevitably leads you to the concept of control.
- Parameter control can take the form of:
- Deterministic parameter control (a fixed rule changes the parameters over the time of the search)
- Adaptive parameter control (use feedback from the environment to determine optimum parameters; credit based)
- Self-adaptive parameter control (run an evolutionary algorithm inside the evolutionary algorithm for each of the parameters, i.e. evolve the parameters)
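The three forms of parameter control can be contrasted with a mutation rate or step size as the controlled parameter. This is only a sketch under assumptions: the function names, the decay constant, the 1/5 success threshold and the `tau` learning rate are illustrative choices, not values given in the lecture.

```python
import math
import random


def deterministic_rate(generation, start=0.2, decay=0.95):
    """Deterministic control: the rate depends only on the generation count."""
    return start * decay ** generation


def adaptive_rate(rate, success_ratio, factor=0.85):
    """Adaptive control: feedback from the search adjusts the rate.

    Loosely follows the 1/5 success rule: widen the search when more than
    a fifth of recent mutations improved fitness, narrow it when fewer did.
    """
    if success_ratio > 0.2:
        return rate / factor
    if success_ratio < 0.2:
        return rate * factor
    return rate


def self_adaptive_mutate(individual, rng, tau=0.3):
    """Self-adaptive control: the step size is carried by the individual.

    `individual` is a (value, sigma) pair. Sigma is mutated first and the
    new sigma is then used to mutate the value, so useful step sizes are
    inherited together with useful values.
    """
    value, sigma = individual
    new_sigma = sigma * math.exp(tau * rng.gauss(0, 1))
    return value + new_sigma * rng.gauss(0, 1), new_sigma
```

For example, `self_adaptive_mutate((1.0, 0.5), random.Random(0))` returns a new (value, sigma) pair; no separate schedule or feedback rule is needed because selection acts on sigma indirectly.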
15No Free Lunch
- The No Free Lunch theorem states that:
- All algorithms that search for an extremum of a cost function perform exactly the same, according to any performance measure, when averaged over all possible cost functions (Wolpert and Macready, 96).
- No single optimum search algorithm exists for blind search.
- Optimisation of individual algorithms has to be related to the problem.
16Hybridisation
- No single algorithm can be the best approach to solve every problem.
- We need to incorporate knowledge from our problem to help us solve it, otherwise all we have is a blind search.
- One way to try to achieve a more automated algorithm is to hybridise evolutionary algorithms with other, more standard approaches to problem solving (hill climbing or greedy methods).
- Improve individual solutions with a local search and replace them in the solution population to compete with the rest.
- Seed the initial population with solutions which are found with standard techniques.
- The only restriction on hybridisation is your imagination.
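As a rough illustration of the "improve with a local search, then compete" idea, the sketch below hybridises a tiny evolutionary algorithm with hill climbing. Everything here is an assumption for demonstration: the OneMax fitness (count the 1s) stands in for a real problem, and the names `hill_climb` and `hybrid_ea` and all parameter values are hypothetical.

```python
import random


def fitness(bits):
    """OneMax: count the 1s (a stand-in for a real problem's fitness)."""
    return sum(bits)


def hill_climb(bits, rng, tries=5):
    """Local search: try a few single-bit flips, keep only improvements."""
    best = list(bits)
    for _ in range(tries):
        i = rng.randrange(len(best))
        candidate = best.copy()
        candidate[i] = 1 - candidate[i]
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


def hybrid_ea(n_bits=20, pop_size=10, generations=5, seed=0):
    """Evolutionary algorithm whose children are refined by hill climbing."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            mum, dad = rng.sample(population, 2)
            cut = rng.randrange(1, n_bits)        # one-point crossover
            child = mum[:cut] + dad[cut:]
            flip = rng.randrange(n_bits)          # single-bit mutation
            child[flip] = 1 - child[flip]
            # Hybrid step: improve the child locally before it competes.
            children.append(hill_climb(child, rng))
        # Parents and improved children compete for survival.
        population = sorted(population + children, key=fitness, reverse=True)[:pop_size]
    return population
```

Seeding would fit in the same place the random initial population is built: some of the random bit strings could be replaced by solutions from a greedy method. The `tries` parameter is itself one more thing to tune.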
17Hybrid Systems
- Evolutionary algorithms are flexible and can be extended by including diverse concepts and alternative approaches:
- Incorporate local searches to improve solutions in a population.
- Use Lamarckian and Baldwinian ideas of evolution to handle constraints.
- Incorporate control parameters so that the evolutionary algorithm can tune itself.
- Introduce memory into a proportion of the population, or the whole population, so as to better deal with time-varying environments.
- This is just the beginning.
- The most common problem with hybrid systems is that the designer gets carried away and incorporates too many concepts.
18Hybrid Enhancements
- Some hybrid enhancements are
- Incorporate memory
- Temperature
- Mating
- Attractiveness
- Subpopulations
- Gender
- These lead to the possibility of other hybrid enhancements. If we have subpopulations:
- We could run different evolutionary algorithms on each subpopulation (island).
- From time to time we could move solutions from one subpopulation to the next (migration).
19Hybrid enhancements
- These enhancements in turn lead to new problem parameters:
- How many subpopulations should there be?
- Should all the subpopulations be the same size?
- Should the same or different evolutionary algorithms be used on each subpopulation? (You could tune each algorithm's parameters specifically to the population in question.)
- What topology should you use to connect the subpopulations?
- What are the mechanisms for migration?
- How often should migration occur?
- Who should migrate? (Best, worst or random solutions?)
- What are the rules for immigration? (Do we replace solutions or increase the population?)
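One possible set of answers to these questions can be written down directly. The sketch below assumes a ring topology, the best solutions migrate, and immigrants replace the worst residents so each island keeps its size. The function name `migrate_ring` and its parameters are hypothetical, and other choices are equally valid.

```python
def migrate_ring(islands, fitness, n_migrants=1):
    """Copy each island's best solutions to the next island in a ring.

    Immigrants replace the worst solutions on the receiving island, so
    every subpopulation keeps its original size.
    """
    # Choose all emigrants first so that one migration round is simultaneous.
    emigrants = [sorted(island, key=fitness, reverse=True)[:n_migrants]
                 for island in islands]
    new_islands = []
    for i, island in enumerate(islands):
        incoming = emigrants[i - 1]  # island 0 receives from the last island
        keep = len(island) - len(incoming)
        survivors = sorted(island, key=fitness, reverse=True)[:keep]
        new_islands.append(survivors + incoming)
    return new_islands
```

How often to call such a function (the migration interval) and the value of `n_migrants` remain parameters to tune, exactly as the questions above suggest.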
20How do we know if we have a good solution?
- If we blindly accept the inputs to our algorithm, often it will blindly generate the outputs.
- Garbage In, Garbage Out.
- Evolutionary algorithms are designed to produce optimal or near-optimal solutions.
- The final population (whatever your termination criteria) will likely be a population of good solutions.
- As you have seen, each run of an evolutionary algorithm takes a different number of iterations to generate the solution, and the final population in each case will be different too.
- There are a number of statistical methods to determine the deviation of the final population, or a subset of the final population, to give us a measure of the quality of the solution.
21Standard Deviation
- Standard deviation is the most common measure of statistical dispersion.
- The standard deviation is the square root of the variance. This means it is the root mean square (RMS) deviation from the average.
- This gives a measure of dispersion that:
- Is a non-negative number.
- Has the same units as the data.
- A distinction is made between the standard deviation of a whole population and that of a random sample of the population, or a subpopulation.
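The population/sample distinction shows up directly in Python's standard `statistics` module: `pstdev` divides by N and `stdev` divides by N - 1. The fitness values below are made-up numbers used only for illustration.

```python
import statistics

# Hypothetical fitness values of a final population (illustrative data only).
final_fitness = [2, 4, 4, 4, 5, 5, 7, 9]

# Treat the values as the whole population: divide by N.
population_sd = statistics.pstdev(final_fitness)

# Treat the values as a random sample of a larger population: divide by N - 1.
sample_sd = statistics.stdev(final_fitness)

print(population_sd, sample_sd)
```

The sample figure is always the larger of the two, because dividing by N - 1 compensates for estimating the mean from the same data.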
22Student's t-test
- "Student" was the pen name of William Sealy Gosset, a statistician for the Guinness brewery in Dublin, Ireland.
- Gosset was hired as a result of an innovative policy of Claude Guinness to recruit the best graduates from Oxford and Cambridge for the application of biochemistry and statistics to Guinness's industrial processes. Gosset published the t-test in Biometrika in 1908, but was forced to use a pen name by his employer, who regarded the fact that they were using statistics as a trade secret. In fact, Gosset's identity was unknown not only to fellow statisticians but to his employer: the company insisted on the pseudonym so that it could turn a blind eye to the breach of its rules.
- Gosset invented the t-statistic to enable the quality of beer brews to be monitored in a cost-effective manner.
- Today, it is more generally applied to the confidence that can be placed in judgements made from small samples.
23Student's t-test
- The test compares the means of two treatments (or final populations of two evolutionary algorithms) even if they have different numbers of replicates.
- The t-test compares the actual difference between two means in relation to the variation in the data (expressed as the standard deviation of the difference between the means).
24Student's t-test
- SE is the standard error of the difference.
- Take the variance for each group and divide it by the number of members in that group. Add these two values and then take their square root.
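The recipe for SE translates almost word for word into code. The sketch below assumes the t value is the difference between the two group means divided by SE (the unpooled form that the recipe implies) and uses the sample variance from Python's `statistics` module. The function names and the example data are illustrative.

```python
import math
import statistics


def standard_error(a, b):
    """Variance of each group over its size, summed, then square-rooted."""
    return math.sqrt(statistics.variance(a) / len(a)
                     + statistics.variance(b) / len(b))


def t_statistic(a, b):
    """Difference between the two means in units of the standard error."""
    return (statistics.mean(a) - statistics.mean(b)) / standard_error(a, b)


# Made-up results from two algorithms (illustrative data only).
run_a = [1, 2, 3, 4, 5]
run_b = [2, 4, 6, 8, 10]
print(t_statistic(run_a, run_b))
```

The two groups need not be the same size, since each variance is divided by its own group size.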
25The Analysis of Variance (ANOVA)
- ANOVA is a family of general techniques (R. Fisher, 1920s) used to test the hypothesis that the means among two or more groups are equal, under the assumption that the sampled populations are normally distributed.
- Multiple t-tests are not useful as the number of groups for comparison grows. As the number of comparison pairs grows, the more likely we are to observe things that happen only 5% of the time.
- Thus P < 0.05 for one pair cannot be considered significant.
- ANOVA puts all the data into one number (F) and gives us one P for the null hypothesis.
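Both points can be made concrete with a few lines of code. The first function estimates how quickly the chance of at least one spurious "significant" pair grows, under the simplifying assumption that the pairwise tests are independent. The second computes the one-way ANOVA F value as the between-group mean square over the within-group mean square. The function names are hypothetical, and turning F into a P value needs the F distribution, which is not shown here.

```python
def familywise_error(n_groups, alpha=0.05):
    """Chance of at least one false positive over all pairwise tests.

    Assumes the pairwise comparisons are independent, which is a
    simplification.
    """
    pairs = n_groups * (n_groups - 1) // 2
    return 1 - (1 - alpha) ** pairs


def f_statistic(groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

With five groups there are ten pairs, and `familywise_error(5)` is already about 0.40, which is why a single F test is preferred over many t-tests.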
26Significance
- In statistical hypothesis testing, two hypotheses are stated, only one of which can be true.
- The null hypothesis is what is presumed to be true.
- The alternative hypothesis will be considered true only if the facts are strong enough.
- The statistical hypothesis testing procedure (e.g. t-test) produces a value.
- If the t value that is calculated is greater than the critical value for the chosen significance level (usually the 0.05 level), then the null hypothesis that the two groups do not differ is rejected in favour of the alternative hypothesis, which typically states that the groups do differ.
27Confusion Matrix
- The confusion matrix is a visualisation tool used in supervised learning.
- Each column of the matrix represents the instances in a predicted class, while each row represents the instances in an actual class.
- One benefit of a confusion matrix is that it is easy to see if the system is confusing two classes (i.e. commonly mislabelling one as another).
- Often used in the diagnosis of diseases.
28Confusion Matrix
- True Positive: an individual classified as positive by the test and verified by the gold standard.
- True Negative: an individual classified as negative by the test and verified by the gold standard.
- False Positive and False Negative are also used.
- Sensitivity = True Positive Decisions / All Gold Standard Positives
- Specificity = True Negative Decisions / All Gold Standard Negatives
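A small worked example may help fix the definitions. The sketch below counts the four cells of a two-class confusion matrix from paired labels and then applies the two ratios. The function names and the ten labelled cases are made up for illustration; `actual` plays the role of the gold standard.

```python
def confusion_counts(actual, predicted):
    """Return (TP, FN, FP, TN) from paired boolean labels."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1      # positive by the test, confirmed by the gold standard
        elif a:
            fn += 1      # gold standard positive, missed by the test
        elif p:
            fp += 1      # gold standard negative, flagged by the test
        else:
            tn += 1      # negative by the test, confirmed by the gold standard
    return tp, fn, fp, tn


def sensitivity(tp, fn):
    """True positive decisions over all gold standard positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative decisions over all gold standard negatives."""
    return tn / (tn + fp)


# Ten made-up cases: 4 gold standard positives, 6 gold standard negatives.
actual = [True] * 4 + [False] * 6
predicted = [True, True, True, False, True, False, False, False, False, False]
tp, fn, fp, tn = confusion_counts(actual, predicted)
# Rows are actual classes, columns are predicted classes.
matrix = [[tp, fn],
          [fp, tn]]
```

Here the matrix is [[3, 1], [1, 5]], giving a sensitivity of 3/4 and a specificity of 5/6.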
29Problem of the Week
- Day of the week of January the 1st.
- Which day of the week appears more often as the first day of a year: Saturday or Sunday?