Title: Slide 1
1. How To Conduct Good Experiments?
Ernesto Costa, DEI/CISUC, ernesto_at_dei.uc.pt, http://www.dei.uc.pt/ernesto
2. Summary
- What is the goal of this talk?
- Background
- Probabilities
- Random Variables and Probability distributions
- Inferential Statistics
- Applying the Theory
- Conclusions
3. What is the goal of this talk?
- I don't know! I have been asked to give a talk on that subject
- I do know!
- EC is (very much) an experimental discipline
- Most of our work is to compare things
- Algorithms
- Parameter settings
- What is a fair comparison?
4. What is the goal of this talk?
- Looking for EC papers
- One problem
- One run
- Several runs
- 10, 20, 30?
- Use average values
- Use average of the bests
- Use the mean
- Use the mean and the standard deviation
- Use Confidence Levels / Intervals
5. What is the goal of this talk?
- What is a good experiment?
- Identify independent and dependent variables
- Mutation rate → fitness
- Different crossover operators → fitness
- Evolution and Learning → number of survivors
- Identify the conditions of the experiment
- Initial conditions
- Number of runs
- Parameter settings
- Identify the kind of Statistics you will need
- Descriptive
- Inferential
- Non-parametric
6. Background
Probabilities
- Experiment: a procedure whose result varies and cannot be predicted ahead of time.
- Tossing a coin, rolling a die
- Sample space: the set of possible outcomes of an experiment.
- {Heads, Tails}
- {1, 2, 3, 4, 5, 6}
- Event: a subset of the sample space
- {Heads}
- {1, 3, 5}
7. Background
Probabilities
- Probability of an event
- Measures the likelihood that the event will occur
- Tossing a (fair) coin: P(outcome = heads) = 1/2
- Axioms
- $P(E) \ge 0$
- $P(S) = 1$
- For mutually exclusive events $E$ and $F$: $P(E \cup F) = P(E) + P(F)$
8. Background
Probabilities
- Example
- When rolling two dice, what is the probability that the sum of the two outcomes equals 7?
- Working methodology
- Answer: 1/6
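The two-dice answer can be checked by brute force. The sketch below assumes two fair six-sided dice, lists every ordered outcome, and counts those that sum to 7; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs of faces for two fair dice (36 outcomes).
sample_space = list(product(range(1, 7), repeat=2))

# Event: the two faces sum to 7.
event = [outcome for outcome in sample_space if sum(outcome) == 7]

# With equally likely outcomes, P(E) = |E| / |S|.
p_sum_is_7 = Fraction(len(event), len(sample_space))
print(p_sum_is_7)  # 1/6
```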
9. Probabilities
Example: A family has two children. Knowing that at least one is a boy, what is the probability that they have two boys?
Answer: 1/3
Definition: Let E and F be two events, with $p(F) > 0$. The conditional probability of E given F, $p(E \mid F)$, is defined as
$$p(E \mid F) = \frac{p(E \cap F)}{p(F)}$$
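The 1/3 answer follows from the definition p(E|F) = p(E and F) / p(F). This is a minimal sketch, assuming each child is independently a boy or a girl with probability 1/2 and reading "one is a boy" as "at least one is a boy"; the helper `prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumption: four equally likely ordered outcomes (older child, younger child).
sample_space = list(product("BG", repeat=2))

def prob(event):
    """Probability of an event given as a predicate over outcomes."""
    favourable = sum(1 for s in sample_space if event(s))
    return Fraction(favourable, len(sample_space))

def at_least_one_boy(s):
    return "B" in s

def two_boys(s):
    return s == ("B", "B")

p_f = prob(at_least_one_boy)                                      # p(F) = 3/4
p_e_and_f = prob(lambda s: two_boys(s) and at_least_one_boy(s))   # p(E and F) = 1/4
p_e_given_f = p_e_and_f / p_f
print(p_e_given_f)  # 1/3
```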
10. Probabilities
Example: A building has two lifts. One is used by 45% of the residents and the other by 55%. The first one has problems 5% of the time, while the second gets you into trouble 8% of the time. Knowing that a lift had a problem, what is the probability that it was lift number 1?
Answer: 33.8%
Theorem of Bayes
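The lift example is a direct application of Bayes' theorem, p(lift 1 | problem) = p(problem | lift 1) p(lift 1) / p(problem), where p(problem) comes from the law of total probability. The sketch below uses the percentages from the example; the variable names are illustrative.

```python
# Prior probabilities: which lift a resident uses.
p_lift1, p_lift2 = 0.45, 0.55

# Likelihoods: probability of a problem given the lift used.
p_problem_given_lift1 = 0.05
p_problem_given_lift2 = 0.08

# Law of total probability: overall chance of a problem.
p_problem = p_problem_given_lift1 * p_lift1 + p_problem_given_lift2 * p_lift2

# Bayes' theorem: posterior probability that the faulty lift was number 1.
p_lift1_given_problem = p_problem_given_lift1 * p_lift1 / p_problem
print(f"{p_lift1_given_problem:.1%}")  # 33.8%
```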
11. Random Variables and Probability Distributions
Random Variables
Definition: A random variable, X, is a function
from the sample space of an experiment to the set
of real numbers.
A RV is a function and is not random!!!
12. Random Variables and Probability Distributions
13. Random Variables and Probability Distributions
Example: Suppose you toss a coin three times. Let X(t) denote the number of heads that appear when t is the result. Then X(t) takes the following values:
X(HHH) = 3
X(HHT) = X(HTH) = X(THH) = 2
X(TTH) = X(THT) = X(HTT) = 1
X(TTT) = 0
Probability Distribution
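The probability distribution of X for the three-toss example can be tabulated by enumeration. This is a small sketch assuming a fair coin, so that all eight outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: the 8 equally likely results of three tosses, e.g. "HTH".
outcomes = ["".join(t) for t in product("HT", repeat=3)]

def X(t):
    """The random variable: number of heads in outcome t."""
    return t.count("H")

# Probability distribution of X: P(X = k) = (#outcomes with k heads) / 8.
counts = Counter(X(t) for t in outcomes)
pmf = {k: Fraction(c, len(outcomes)) for k, c in sorted(counts.items())}

for k, p in pmf.items():
    print(k, p)
```

Note that X itself is an ordinary deterministic function; the randomness lives in the outcome t.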
14. Random Variables and Probability Distributions
Types of Random Variables
- Discrete
- Probability Mass Function
- Continuous
- Probability Density Function (pdf)
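15. Random Variables and Probability Distributions
Measures of Random Variables
- Location
- Mean
- Discrete: $\mu = E[X] = \sum_x x\,p(x)$
- Continuous: $\mu = E[X] = \int x\,f(x)\,dx$
- Dispersion
- Variance
- Discrete: $\sigma^2 = \sum_x (x - \mu)^2\,p(x)$
- Continuous: $\sigma^2 = \int (x - \mu)^2\,f(x)\,dx$
- Standard Deviation
- $\sigma = \sqrt{\sigma^2}$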
16. Random Variables and Probability Distributions
Independence of Random Variables
- Two random variables, X and Y, over the same sample space S, are said to be independent iff $P(X = x, Y = y) = P(X = x)\,P(Y = y)$ for all values x and y
- Theorem of the Product
- Theorem of Sum
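17. Random Variables and Probability Distributions
Discrete Probability Distributions
- Binomial Distribution
- Domain: 0, 1, 2, ..., n
- Probability mass function: $P(X = k) = \binom{n}{k} p^k q^{n-k}$, with $q = 1 - p$
- Mean: $\mu = np$
- Variance: $\sigma^2 = npq$
Figure: binomial probability mass functions for p = 0.3 and p = 0.5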
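As a quick check of the binomial mean np and variance npq, the sketch below evaluates the probability mass function directly and sums over the domain. The values n = 10 and p = 0.3 are chosen only for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative values, not taken from the slides
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed from the definition, to compare with np and npq.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, n * p)                 # both approximately 3.0
print(variance, n * p * (1 - p))   # both approximately 2.1
```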
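18. Random Variables and Probability Distributions
Discrete Probability Distributions
- Poisson Distribution
- Approximates the Binomial Distribution for large n and small p, with $\lambda = np$
- Domain: 0, 1, 2, 3, ...
- Probability mass function: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
- Mean: $\lambda$
- Variance: $\lambda$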
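How closely the Poisson distribution with λ = np tracks the binomial can be seen numerically. The sketch below compares the two mass functions for n = 1000 and p = 0.003, values picked purely as an example of "large n, small p".

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)

n, p = 1000, 0.003   # illustrative: many trials, rare success
lam = n * p          # matching Poisson parameter, lambda = np

# Largest pointwise difference between the two mass functions for k = 0..20.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print(max_gap)
```

The gap shrinks further as n grows with np held fixed.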
19. Random Variables and Probability Distributions
Continuous Probability Distributions
- Normal (Gaussian) Distribution
- Standard Normal Distribution
20. Random Variables and Probability Distributions
Continuous Probability Distributions
- Converting a normal distribution to a standard normal distribution
- X a random variable with
- Mean $\mu$
- Standard deviation $\sigma$
- Using a translation: $X - \mu$
- Defining a new random variable: $Z = \frac{X - \mu}{\sigma}$
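Standardizing means that any normal probability can be read off the N(0,1) curve. A minimal sketch with Python's standard-library `statistics.NormalDist`; the numbers μ = 100, σ = 15 and x = 120 are made up for the example.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # illustrative population parameters
x = 120.0

# Standardize: shift by the mean, scale by the standard deviation.
z = (x - mu) / sigma

# P(X <= x) computed directly, and via the standard normal N(0, 1).
p_direct = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist().cdf(z)
print(z, p_direct, p_standard)
```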
21. Random Variables and Probability Distributions
Continuous Probability Distributions
- Student's t-Distribution
- Approximates the standard normal distribution N(0,1)
- Degrees of freedom (df): $\nu$
- Mean: 0, for $\nu > 1$
- Variance: $\nu / (\nu - 2)$, for $\nu > 2$
22. Background
Statistics
- Goal: to apply probability theory to data analysis
- How?
- Model the data (population) by means of a probability distribution
- Use a sample of the data instead of the whole population
- Estimate the population parameters ($\mu$, $\sigma$, $p$) using the corresponding sample statistics ($\bar{x}$, $s$, $\hat{p}$)
Diagram: sample statistics ($\bar{x}$, $s$, $\hat{p}$) estimate the population parameters ($\mu$, $\sigma$, $p$)
23. Background
Statistics
- Unbiased estimator
- A statistic whose mean value is equal to the population parameter being estimated
- Point Estimators
- Interval Estimators
24. Background
Sampling distribution of the sample mean and the Central Limit Theorem
- Consider a population with mean $\mu$ and standard deviation $\sigma$. Let $\bar{x}$ denote the mean of the observations in random samples of size n. Then
- $\mu_{\bar{x}} = \mu$
- $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
- When the population distribution is normal, the sampling distribution of $\bar{x}$ is also normal for any sample size n
- (Central Limit Theorem) When n is sufficiently large ($n > 30$), the sampling distribution of $\bar{x}$ is well approximated by a normal curve, even if the population distribution is not itself normal
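The properties of the sampling distribution can be observed by simulation. The sketch below draws many samples of size n = 40 from a clearly non-normal population (an exponential with mean 1 and standard deviation 1, an assumption made for this example) and checks that the sample means centre on μ with spread close to σ/√n.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)      # fixed seed so the run is repeatable
n = 40                      # sample size (larger than 30)
num_samples = 5000          # number of repeated samples

# Population: exponential with rate 1, so mu = 1 and sigma = 1 (not normal).
mu, sigma = 1.0, 1.0

sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

print(mean(sample_means), mu)                  # centre of the sampling distribution
print(stdev(sample_means), sigma / sqrt(n))    # its spread versus sigma / sqrt(n)
```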
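25. Background
Sampling distribution of the sample mean
- Unbiased estimators
- Mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
- Standard deviation: $s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
(n-1) are the degrees of freedom (df)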
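In code, the (n-1) divisor is what separates the sample standard deviation from the population one. A small sketch with made-up data; Python's `statistics.stdev` uses the n-1 divisor and `statistics.pstdev` uses n.

```python
import statistics
from math import sqrt

data = [4.1, 5.3, 3.8, 6.0, 5.2, 4.6]   # hypothetical measurements
n = len(data)

# Sample mean.
x_bar = sum(data) / n

# Sample standard deviation with n - 1 degrees of freedom.
s = sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

print(x_bar, s)
print(statistics.stdev(data))    # same value: divisor n - 1
print(statistics.pstdev(data))   # smaller: divisor n
```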
26. Background
Sampling distribution of the sample mean and the Central Limit Theorem
- Consequence
- For a large sample, or a population whose distribution is normal, $Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ has (approximately) a standard normal (Z) distribution.
27. Background
Confidence Intervals: one sample
- Estimate the mean $\mu$
- The population standard deviation, $\sigma$, is known
- The sample mean from a random sample, $\bar{x}$, is known
- The sample size is large ($n > 30$)
- The one-sample Z confidence interval is $\bar{x} \pm Z \cdot \frac{\sigma}{\sqrt{n}}$
- Example: for a 95% confidence interval, Z = 1.96.
28. Background
Confidence Intervals: one sample
- Example: we want a confidence level of 90%
- Look into a N(0,1)
- For a CL of 90%, we have to isolate an area of 5% in each of the left and right tails of the bell-shaped normal distribution.
- The confidence interval will be given by $\bar{x} \pm Z \cdot \frac{\sigma}{\sqrt{n}}$
- Looking in a table for the value of Z we obtain Z = 1.65
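Instead of a table, the Z critical value can be obtained from the inverse CDF of N(0,1). The sketch below wraps this in a hypothetical helper, `z_confidence_interval`; the sample mean, σ and n passed to it are invented for the example.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence):
    """One-sample Z interval: x_bar +/- Z * sigma / sqrt(n), sigma known."""
    # Leave (1 - confidence) / 2 in each tail of the standard normal.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

z95 = NormalDist().inv_cdf(0.975)   # about 1.96
z90 = NormalDist().inv_cdf(0.95)    # about 1.645, tabulated as 1.65

# Illustrative numbers only: x_bar = 50, sigma = 10, n = 36.
low, high = z_confidence_interval(50.0, 10.0, 36, 0.90)
print(z95, z90, (low, high))
```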
29. Background
Confidence Intervals: one sample
- What does it mean to have a confidence interval of 95%?
- That there is a probability of 95% that the true (population) mean is in the interval? NO!!
- It means that 95% of all possible samples result in an interval that includes the true mean!
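The "95% of all possible samples" reading can be illustrated by simulation: draw many samples from a population whose mean is known, build a 95% Z interval from each, and count how often the interval covers the true mean. Everything below (normal population, μ = 50, σ = 10, n = 36, 2000 trials) is an assumption made for the sketch.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

rng = random.Random(1)            # fixed seed so the run is repeatable
mu, sigma, n = 50.0, 10.0, 36     # illustrative population and sample size
trials = 2000

z = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence
margin = z * sigma / sqrt(n)      # same half-width for every sample (sigma known)

hits = 0
for _ in range(trials):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = mean(sample)
    # Does this sample's interval contain the true mean?
    if x_bar - margin <= mu <= x_bar + margin:
        hits += 1

coverage = hits / trials
print(coverage)   # expected to be close to 0.95
```

The true mean never moves; it is the interval that changes from sample to sample.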
30. Background
Confidence Intervals: one sample
- Estimate the mean $\mu$
- The population standard deviation, $\sigma$, is NOT known
- The sample mean from a random sample, $\bar{x}$, is known
- The sample size is large ($n > 30$) OR the population distribution is normal
- The one-sample t confidence interval is $\bar{x} \pm t \cdot \frac{s}{\sqrt{n}}$
- where the t critical value is based on (n-1) degrees of freedom (df).
- Example: for a 95% confidence interval and 19 df, t = 2.09.
- The Student t distribution can be used for small samples, assuming that the population distribution is approximately normal
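A sketch of the one-sample t interval for a small sample. Python's standard library has no t distribution, so the critical value 2.09 (95%, 19 df) is taken from the example above; the 20 data values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Critical value quoted above for a 95% interval with 19 degrees of freedom.
T_CRITICAL_95_DF19 = 2.09

# Hypothetical sample of n = 20 results (so df = n - 1 = 19).
sample = [12.1, 9.8, 11.4, 10.2, 10.9, 11.7, 9.5, 10.4, 12.3, 11.0,
          10.6, 9.9, 11.2, 10.8, 11.9, 10.1, 10.3, 11.5, 9.7, 10.7]

n = len(sample)
x_bar = mean(sample)
s = stdev(sample)                 # sample standard deviation (n - 1 divisor)

margin = T_CRITICAL_95_DF19 * s / sqrt(n)
interval = (x_bar - margin, x_bar + margin)
print(interval)
```

For other confidence levels or sample sizes the critical value has to be looked up again, since it depends on both.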
31. Background
Hypothesis Testing: one sample
- A hypothesis is a claim about the value of one or more population characteristics.
- A test procedure is a method for using sample data to decide between two competing claims about population characteristics ($\mu = 100$ or $\mu \ne 100$).
- Method by contradiction: we assume a particular hypothesis. Using the sample data, we try to find out if there is convincing evidence to reject this hypothesis in favor of a competing one.
32. Background
Hypothesis Testing: one sample
- The null hypothesis, H0, is a claim about a population characteristic that is initially assumed to be true.
- Ha is the alternative hypothesis or competing claim.
- Testing H0 versus Ha can lead to the conclusion that H0 must be rejected or that we fail to reject H0. In the latter case we cannot say that H0 is accepted!
33. Background
Hypothesis Testing: one sample
- Errors
- Type I error
- Rejecting H0 when H0 is true
- The probability of a Type I error, $\alpha$, is called the level of significance of the test.
- Type II error
- Failing to reject H0 when H0 is false
- The probability of a Type II error is denoted by $\beta$.
- There is a tradeoff between $\alpha$ and $\beta$: making the probability of a Type I error very small increases the probability of a Type II error.
34. Background
Hypothesis Testing: one sample
- Test statistic (Z, t): the function of the sample data on which the decision to reject or fail to reject H0 is based
- p-value (observed significance level): the probability, assuming that H0 is true, of obtaining a test statistic at least as inconsistent with H0 as what actually resulted.
- Decision about H0: compare the p-value with the chosen $\alpha$.
- Reject H0 if p-value $\le \alpha$
35. Background
Hypothesis Testing: one sample
- Hypothesis testing principles
- Identify the population parameter (mean, ...)
- State H0 and Ha
- Define the significance level $\alpha$
- Check that the assumptions for the test are reasonable (big sample, ...)
- Calculate the test statistic (Z, ...)
- Calculate the associated p-value
- State the conclusion (reject if p-value $\le \alpha$, ...)
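36. Background
Hypothesis Testing: one sample
- Example
- Population parameter: the mean, $\mu$
- H0: $\mu = 100$, Ha: $\mu \ne 100$
- Significance level: $\alpha = 0.01$
- n = 40 is large
- From the sample: $\bar{x} = 105.3$, $s = 8.4$
- $z = \frac{\bar{x} - 100}{s / \sqrt{n}} = \frac{105.3 - 100}{8.4 / \sqrt{40}} \approx 3.99$
- From the z-curve we know that the p-value $\approx 0$
- Therefore the null hypothesis, H0, is rejected at a significance level of 0.01.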
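The steps of this example can be reproduced in a few lines. The sketch uses the figures given above (hypothesized mean 100, sample mean 105.3, s = 8.4, n = 40, α = 0.01) and a two-sided p-value from the standard normal curve.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 100.0      # value of the mean under H0
x_bar = 105.3     # sample mean
s = 8.4           # sample standard deviation
n = 40            # sample size (large, so the z-curve is used)
alpha = 0.01      # significance level

# Test statistic: how many standard errors the sample mean lies from mu_0.
z = (x_bar - mu_0) / (s / sqrt(n))

# Two-sided p-value, because Ha is "mu differs from 100".
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = p_value <= alpha
print(round(z, 2), reject_h0)  # 3.99 True
```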
37. Background
Comparing Two Populations based on independent samples
- Use the sampling distribution of the difference of the sample means
- Properties
- The mean of the difference is equal to the difference of the means: $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
- The variance of the difference is equal to the sum of the individual variances. Thus, the standard deviation is $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
- The sampling distribution of the difference of the sample means can be considered approximately normal (each n large, or each sample comes from an (approximately) normal population)
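38. Background
Confidence interval for the difference of the means, $\mu_1 - \mu_2$
- Assumptions
- The two samples are independent random samples
- Sample sizes are both large ($n > 30$) OR the population distributions are (approximately) normal.
- Formulas
- $(\bar{x}_1 - \bar{x}_2) \pm t \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, with $df = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}}$ and $V_i = s_i^2 / n_i$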
39. Background
Hypothesis Test
- Same procedure, only the formulas are different!
- Z Test
- Large samples OR
- Population distributions are (at least
approximately) normal
40. Background
Hypothesis Test
- t test
- Large samples OR
- Population distributions normal AND the random
samples are independent
41. Applying the Theory
The Busy Beaver Problem
- Two algorithms
- A standard GA
- A standard GA + local learning (Baldwin Effect)
- Goal: good quality machines
- Who is better? Comparing the means!
- H0: $\mu_1 = \mu_2$ (no improvement!!!), Ha: $\mu_1 \ne \mu_2$
- Significance level: $\alpha = 0.01$
- Assuming that the population distributions are normal
- Number of (independent) runs: 30 for each case
- Use t test
42. Applying the Theory
The Busy Beaver Problem
- From the samples (good machines)
- $\bar{x}_{sga} = 0.1$
- $\bar{x}_{be} = 0.23$
- $s_{sga}^2 = 0.093$
- $s_{be}^2 = 0.185$
- From the formulas
- df = 53
- t = 1.35
- p-value $\approx 2 \times 0.1 = 0.2$
- Conclusion
- With $\alpha = 0.01$ and p-value 0.2, the null hypothesis H0 cannot be rejected
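The t statistic for the Busy Beaver comparison can be recomputed from the reported summaries. This is a sketch assuming the unequal-variance (Welch) form of the two-sample t test with 30 runs per algorithm; the helper name `welch_t_and_df` is hypothetical. From these rounded inputs the degrees of freedom come out near 52, close to the reported value, and the p-value would still have to be read from a t table.

```python
from math import sqrt

def welch_t_and_df(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic and approximate df, without pooling the variances."""
    v1, v2 = var1 / n1, var2 / n2          # variance of each sample mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Reported summaries: GA with Baldwin Effect (be) versus standard GA (sga).
t, df = welch_t_and_df(0.23, 0.185, 30, 0.1, 0.093, 30)
print(round(t, 2))  # 1.35
print(df)           # a little above 52 with these rounded inputs
```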
43. Applying the Theory
Function Optimization
- Two different GAs applied to function optimization
- A standard GA using 2-point crossover
- A modified GA using transformation
- Goal: find the minimum
The Schwefel Function
Minimum: 0
44. Applying the Theory
Function Optimization
- Who is better? Two-point crossover or transformation?
- Comparing the means of the best fit!
- H0: $\mu_1 = \mu_2$ (no improvement!!!), Ha: $\mu_1 \ne \mu_2$
- Significance level: $\alpha = 0.05$
- Assuming the population distributions are normal
- Number of (independent) runs: 30 for each case
- Use t test
45. Applying the Theory
Function Optimization
- From the samples (fitness of the best individuals)
- $\bar{x}_{sga} = 5.4838$
- $\bar{x}_{tr} = 0.0768$
- $s_{sga}^2 = 149.788$
- $s_{tr}^2 = 0.02958$
- From the formulas
- df = 29
- t = 2.42
- p-value $\approx 2 \times 0.012 = 0.024$
- Conclusion
- With $\alpha = 0.05$ and p-value 0.024, the null hypothesis H0 is rejected.
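For reference, the reported t and df can be recovered from the sample summaries, assuming the unequal-variance two-sample t statistic with n = 30 runs per algorithm (the symbols $V_{sga}$ and $V_{tr}$ are introduced here only for the arithmetic):

```latex
\[
V_{sga} = \frac{149.788}{30} \approx 4.9929, \qquad
V_{tr} = \frac{0.02958}{30} \approx 0.00099
\]
\[
t = \frac{\bar{x}_{sga} - \bar{x}_{tr}}{\sqrt{V_{sga} + V_{tr}}}
  = \frac{5.4838 - 0.0768}{\sqrt{4.9939}}
  \approx \frac{5.4070}{2.2347} \approx 2.42
\]
\[
df = \frac{(V_{sga} + V_{tr})^2}{\dfrac{V_{sga}^2}{29} + \dfrac{V_{tr}^2}{29}}
   \approx \frac{24.939}{0.8596} \approx 29
\]
```

Because the standard GA's variance dominates, the degrees of freedom collapse to roughly those of that single sample (n - 1 = 29).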
46. Conclusions
- This is a very simple presentation
- Assuming Normal distributions
- There are many others
- In many situations we cannot assume a normal distribution
- Many things left unmentioned
- More than two populations
- Analysis of Variance (ANOVA)
- Regression and Correlation
- Non-parametric methods
47. Want to know more?
- Paul Cohen, Empirical Methods for Artificial Intelligence. MIT Press, Cambridge, MA, 1995.
- James Kennedy and Russell Eberhart, Swarm Intelligence (Appendix A). Morgan Kaufmann, 2001.
- Roxy Peck, Chris Olsen and Jay Devore, Introduction to Statistics and Data Analysis. Duxbury, 2001.
- Mark Wineberg and Steffen Christensen, Using Appropriate Statistics. GECCO 2003 Tutorial.