Title: Experimental Design - The basics
1. Experimental Design - The basics
2. How to formulate hypotheses
- Where do you start?
- What is a hypothesis?
- Stating a hypothesis
- Generating predictions
- Statistical hypotheses (different!)
- Only after completing this process will you be
able to decide what data to collect
3. Hypotheses: Where do you start?
- Start by stating your research question
  - E.g. Why are male and female humans different sizes?
- Your question may easily produce more than one hypothesis; that's fine.
4. Hypothesis
- A hypothesis is a clear statement articulating a plausible candidate explanation for observations
- It should be constructed in such a way as to allow gathering of data that can be used to either refute or support the candidate explanation
5. Stating a Hypothesis
- Phrase your hypothesis as a possible answer to your research question.
  - E.g. Male and female sizes differ because males grow faster than females
6. Generating predictions
- These are the testable statements that follow logically from your hypothesis
  - E.g. Males have a faster growth rate than females
7. Statistical hypotheses
- Predictions should lead you to testable statistical hypotheses
- Note that the hypothesis of interest in statistics is the one where nothing is different (the null hypothesis)
- A clearly stated null hypothesis will generally lead you to the correct statistical test
  - E.g. There is no difference in the growth rate of males and females
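As a rough illustration of testing the null hypothesis "there is no difference in the growth rate of males and females", the sketch below uses a permutation test. The choice of test, the function name `permutation_test` and the growth-rate numbers are all assumptions made for illustration; the slides do not prescribe a particular test.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Under the null hypothesis the group labels are interchangeable, so we
    shuffle the pooled values many times and count how often the shuffled
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations

# Hypothetical growth rates (cm/year), invented for illustration only.
male_growth = [5.1, 5.6, 4.9, 5.8, 5.4, 5.7, 5.2, 5.5]
female_growth = [4.6, 4.9, 4.4, 5.0, 4.7, 4.5, 4.8, 4.3]

p_value = permutation_test(male_growth, female_growth)
print(f"p = {p_value:.4f}")
```

A small p-value means the observed difference would be unusual if the null hypothesis were true; it does not by itself say why the sexes differ.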
8. Question -> Hypothesis -> Predictions -> Statistical (Null) hypothesis
9. Pitfalls of generating predictions
- Weak tests
- Indirect measures
- Non-useful outcomes
- Your tests must satisfy the devil's advocate (e.g. reviewers or examiners)
10. Weak test
- Consider the hypothesis: Students enjoy the course in radiation training more than the workshop in experimental design
- Prediction: Students will get better grades in radiation training than in experimental design
- This is a weak test (prediction) because other explanations are equally likely AND because we have used an indirect measure (grades as a measure of enjoyment)
11. Non-useful outcomes
- These are hypotheses that may well prove
interesting if true but are uninformative if false
12. Satisfying skeptics
- Reviewers will look for logical flaws in your experiments. You do not want to finish your paper with:
  - "My results indicate that mechanism A determines apoptosis rates. Although mechanism B could also produce the same response, I believe that mechanism A is the important one."
- This will earn you a review of the form:
  - "This study provides no clear evidence to distinguish between mechanisms A and B. The authors need to redesign their study and start again. Recommendation: reject this manuscript."
13. Pilot Studies and Preliminary Data
- May be observational or mini-experiments
- Ensures sensible questions
- Can you observe the phenomenon?
- Practice and validate techniques
- Minimize training effects on data
- Recognize logistical constraints
- Standardization across observers
- Allows tuning of design and statistics
- Assessment of sample sizes (power)
- Test run of statistical analysis
14. Experimental Manipulation Vs. Natural Variation
- In Manipulation studies you change an aspect of the system and measure effects on traits of interest (majority of lab studies and agricultural studies)
- In Correlational studies you measure associations between traits of interest (often assuming one is influencing the other) (many environmental and most human studies)
- Consider the hypothesis: Long tail streamers seen in many species of birds have evolved to make males more attractive to females
15. Correlational study using Natural Variation
- In the bird tail length example we could
  - Measure the tails of males at the beginning of the breeding season
  - Observe the number of matings each male has
  - Do statistics to determine if there is a relationship between tail length and number of matings
- Results showing a relationship would support our hypothesis
- Results not showing a relationship would go against our hypothesis
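The "do statistics" step could, for instance, be a correlation coefficient between tail length and number of matings. The sketch below computes Pearson's r by hand; the helper name `pearson_r` and the measurements are hypothetical, and a rank-based measure may suit count data better in practice.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical field data, invented for illustration only:
# tail length at the start of the season and matings observed per male.
tail_length_cm = [38, 41, 44, 46, 49, 52, 55, 58]
matings = [0, 1, 1, 2, 1, 3, 2, 4]

r = pearson_r(tail_length_cm, matings)
print(f"r = {r:.2f}")
```

A positive r is consistent with the hypothesis, but on its own it only shows an association.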
16. Manipulative study
- In the bird tail length example we could have 4 groups of birds
- Results showing males with artificially long tails had more mates would support our hypothesis
- Results showing males with reduced tails had fewer mates would also support our hypothesis
- A comparison of group 1 males with the unmanipulated males acts as a control comparison
17. Arguments for correlational studies
- Often less work (but larger sample sizes usually needed)
- Deals with real levels of biological variation (manipulations may take things outside naturally occurring limits)
- Requires less handling of organisms (important if there are constraints like stress to animals or endangered species)
- Manipulative studies may produce unintended effects (e.g. flight ability in the bird example or epistatic effects in knockouts)
- Manipulation may not be possible
- May provide a baseline study for manipulative experiments
18. Arguments for manipulative studies (really, against correlational studies)
- Third variables
- Reverse causation
- These can be BIG problems if they occur
19. Third Variables
- Third variables occur when there is an apparent link between A and B but in fact there is no direct link or mechanism. Instead both A and B depend on C, the third variable.
- This means that patterns in correlational studies are just that, correlations.
- Remember, correlation does not imply causation
20. Third Variables - an example
- In the bird tail length example let's say that we do see a correlation between tail length and number of mates
- Suppose that females are actually attracted to territories, not males, but that males on better territories can grow larger tails
- The third variable here is territory quality; it drives both tail length and number of mates and produces an apparent relationship.
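The territory-quality scenario can be illustrated with a small simulation. In the sketch below, tail length and number of mates each depend only on territory quality plus independent noise. All numbers, the helper names `pearson_r` and `residuals`, and the linear model are assumptions made for illustration.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def residuals(y, x):
    """Residuals of y after a least-squares straight-line fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

rng = random.Random(1)
n_males = 2000

# Territory quality (C) drives both traits; tail length (A) and
# mating success (B) have no direct effect on each other here.
territory = [rng.gauss(0, 1) for _ in range(n_males)]
tail = [q + rng.gauss(0, 1) for q in territory]
mates = [q + rng.gauss(0, 1) for q in territory]

raw_r = pearson_r(tail, mates)
partial_r = pearson_r(residuals(tail, territory), residuals(mates, territory))

print(f"tail vs mates, ignoring territory:        r = {raw_r:.2f}")
print(f"tail vs mates, territory effect removed:  r = {partial_r:.2f}")
```

The raw correlation is clearly positive even though neither trait influences the other, and it largely disappears once territory quality is accounted for. In a real study the third variable is often unmeasured, which is why this check is not always available.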
21. Third Variables - Two famous examples
- Fisher suggested that the link between smoking and cancer was correlational, not causative, and that another factor, perhaps stress, led people both to smoke and develop cancer.
- Fewer women postgrads marry than women in the population as a whole. This relationship is presumably due to some other correlated factor (third variable)
22. Reverse causation
- This occurs when a correlation is taken to mean that A causes B when in fact B causes A
- In some cases this can be ruled out based on other data or common sense
- In the bird example it is unlikely that the number of mates for a male has any effect on tail length measured at the start of the mating season.
23. Reverse causation - a famous example
- There is a correlation between the number of storks nesting in chimneys and the number of children in a house (old data from Holland)
- Although storks bringing babies makes a nice story, the causation is likely reversed
- Larger families tend to live in larger houses with more chimneys, and hence more opportunities for storks to nest.
24. Variation, replication and sampling
- Variation among individuals
- Replication and the experimental unit
- Pseudoreplication
25. Variation among individuals
- Variation among individuals is a given for most biological systems
- In any experiment we are concerned with variation in the Response or Dependent Variable
- Variation in the response variable can be divided into
  - Variation explained by experimental factors (IV)
  - Variation not explained by experimental factors (AKA error variation, random variation, noise)
- In most studies we are interested in reducing noise and, hopefully, increasing explained variation
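One common way to make this split concrete is the sum-of-squares partition used in analysis of variance: total variation = variation explained by treatment + error variation. The sketch below is a minimal illustration for a single factor; the function name `partition_variation` and the data are hypothetical.

```python
def partition_variation(groups):
    """Split total sum of squares into explained and error components.

    groups maps each treatment name to a list of response values.
    Returns (ss_total, ss_explained, ss_error).
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)

    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_explained = 0.0  # variation of group means around the grand mean
    ss_error = 0.0      # variation of individuals around their group mean
    for values in groups.values():
        group_mean = sum(values) / len(values)
        ss_explained += len(values) * (group_mean - grand_mean) ** 2
        ss_error += sum((v - group_mean) ** 2 for v in values)
    return ss_total, ss_explained, ss_error

# Hypothetical response values, invented for illustration only.
responses = {
    "control": [10, 12, 11, 13],
    "treated": [15, 14, 16, 17],
}

ss_total, ss_explained, ss_error = partition_variation(responses)
print(ss_total, ss_explained, ss_error)  # 42.0 32.0 10.0
```

Here 32 of the 42 units of variation are associated with the treatment and 10 are noise. Reducing noise shrinks the error term without changing the treatment effect.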
26. Variation among individuals
- Single measurements from each treatment do not allow us to distinguish between noise and effect
- Make sure you have a sufficient number of individuals that experience the same manipulation
- These individuals that receive the same manipulation are called replicates
- What is the experimental unit?
27. Pseudoreplication
- This occurs when there is confusion between treatments, replicates and blocks.
- Consider an experiment comparing the effect of a toxicant on fish behaviour.
- Let's say the toxicant is prepared in a batch and drip fed into the treatment tanks (water is drip fed into the control tanks)
- Are the replicates
  - Each fish in a tank?
  - Each tank?
  - Each set of tanks on a common drip?
  - Each batch of toxicant?
- Don't expect a simple answer; the answer is in the biology, not in statistics
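If, for the sake of argument, the tank is judged to be the experimental unit, one way to avoid treating fish as replicates is to collapse each tank to a single value before analysis. The sketch below assumes that answer, which is only one of the possibilities listed above. The data layout, tank names and scores are hypothetical.

```python
from statistics import mean

# Hypothetical activity scores, invented for illustration:
# treatment -> tank -> one score per fish in that tank.
activity = {
    "toxicant": {
        "tank_1": [12.0, 11.5, 13.0, 12.5],
        "tank_2": [10.0, 9.5, 10.5, 11.0],
        "tank_3": [13.5, 14.0, 12.5, 13.0],
    },
    "control": {
        "tank_4": [15.0, 16.0, 15.5, 14.5],
        "tank_5": [17.0, 16.5, 18.0, 17.5],
        "tank_6": [14.0, 15.0, 14.5, 15.5],
    },
}

def tank_means(tanks):
    """Collapse the fish in each tank to a single value per tank."""
    return [mean(scores) for scores in tanks.values()]

for treatment, tanks in activity.items():
    n_fish = sum(len(scores) for scores in tanks.values())
    means = tank_means(tanks)
    print(f"{treatment}: {n_fish} fish measured, "
          f"but n = {len(means)} replicates (tank means: {means})")
```

Twelve fish per treatment yield only three replicates under this assumption. If all treatment tanks share one drip or one batch of toxicant, the effective number of replicates may be smaller still.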
28. Common sources of Pseudoreplication
- Shared enclosures
- Common environments
- Relatedness
- Pseudoreplicated stimulus
- Non-independence of group behaviour
- Pseudoreplicated measurements over time
- Species comparisons
- Sometimes pseudoreplication is unavoidable
29. Random sampling
- Proper random sampling means that each individual has an equal chance of being allocated to each treatment group
- The problem with non-random treatment of samples is that any bias in assignment of individuals or systematic pattern to errors may bias your results
- True random samples almost always require the use of computers or random number tables
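A computer-generated allocation can be as simple as shuffling the list of individuals and dealing them out to the treatment groups in turn. The sketch below shows one way to do this; the function name `allocate`, the identifiers and the group labels are hypothetical.

```python
import random

def allocate(unit_ids, treatments, seed=None):
    """Randomly allocate units to treatments in (near-)equal numbers.

    Shuffling first and then dealing units out in turn gives every unit
    the same chance of ending up in each treatment group.
    """
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    allocation = {treatment: [] for treatment in treatments}
    for position, unit in enumerate(shuffled):
        treatment = treatments[position % len(treatments)]
        allocation[treatment].append(unit)
    return allocation

# Hypothetical identifiers; set up the units BEFORE assigning treatments.
individuals = [f"individual_{i:02d}" for i in range(1, 21)]
groups = allocate(individuals, ["treatment", "control"], seed=42)

for treatment, members in groups.items():
    print(treatment, sorted(members))
```

Recording the seed makes the allocation reproducible, which helps when the procedure has to be documented or audited.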
30. Random assignment and treatment
- Random means not only random assignment but also random treatment
- Let's say that you are examining the effect of rhizosphere bacteria on plant growth.
- Not only should each plant have an equal opportunity of being assigned to the bacterial or non-bacterial (control) group, all other aspects of the process should be random as well.
  - Plants should be planted in equivalent compost (possibly in random order)
  - Plants should be randomly allocated to growth chambers and perhaps positions in chambers
31. Haphazard sampling
- Haphazard does not mean Random
- A haphazard sample is based on personal assignment by the experimenter in a fashion that they believe is random
- Often severely biased even if the experimenter is consciously trying to take a random sample
- Consider trying to randomly select mice from a bucket or randomly pipetting out aliquots of a cell culture
- True random samples usually involve setting up experimental units BEFORE assigning treatments
- BUT this is not always possible; use common sense (or blind assignment)
32. Self selection
- This is a real problem with survey or poll data
- The subset of a population that responds to surveys is rarely a random sample and thus may bias your results
- By all means use surveys to inform your research BUT be very suspicious of anything but general conclusions
33. Pitfalls of Random Sampling
- Make sure that the randomization procedure you use does what you intend
- Randomise the order of collecting data to guard against learning effects
- Random samples Vs. Representative samples - don't let computers do your thinking for you
34. Sample size - how many replicates
- Too few replicates can be a disaster - too many can be a crime!
- Always use educated guesswork - i.e. look at similar experiments by previous workers and determine what worked.
  - Pay attention to differences between the studies
- Formal power analysis - do if possible!!!
  - Requires that you have some guess of variation among replicates
  - Requires that you have an idea of how big a treatment effect you can expect (or require)
  - Requires that you know what statistical test you will use
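To show how the three requirements feed into a power analysis, the sketch below uses the standard normal-approximation formula for comparing two group means with equal group sizes and a two-sided test: n per group = 2 * ((z_alpha + z_power) * sd / effect)^2. The function name `n_per_group` and the defaults (alpha = 0.05, power = 0.8) are assumptions for illustration. A t-based calculation gives slightly larger numbers, and other tests need other formulas.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sd, effect, alpha=0.05, power=0.8):
    """Approximate replicates per group for a two-group comparison of means.

    sd     : guessed standard deviation among replicates
    effect : smallest difference between group means worth detecting
    alpha  : two-sided significance level
    power  : desired probability of detecting the effect if it is real
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)

# Effect as large as the among-replicate standard deviation:
print(n_per_group(sd=1.0, effect=1.0))   # 16
# Halving the expected effect roughly quadruples the requirement:
print(n_per_group(sd=1.0, effect=0.5))   # 63
```

The steep dependence on the expected effect size is the reason a guess from pilot data or earlier studies matters so much.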
35. Sample size - Resource Equation Model
- Can be used for complex studies or when variation among individuals is unknown
- Only appropriate for quantitative data
- Gives conservative estimates of sample size so more appropriate for
  - Large effect size (e.g. lab rather than clinical)
  - Testing for significant effects rather than estimating parameters
- E = N - T - B
  - N is the total number of individuals - 1
  - T is the number of treatments - 1
  - B is the number of blocks - 1
  - E is the error df and should be between 10 and 20
- In some cases E should be larger (see Festing et al.)
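The resource equation is simple enough to check by hand, but a small helper makes it easy to try different designs. The sketch below follows the definitions on this slide. The function name `error_df` and the example designs are hypothetical, and an unblocked design is treated as a single block, so B = 0.

```python
def error_df(n_individuals, n_treatments, n_blocks=1):
    """Error degrees of freedom E = N - T - B from the resource equation.

    N, T and B are each 'count minus one', as defined on the slide.
    An unblocked design counts as one block, so B = 0.
    """
    N = n_individuals - 1
    T = n_treatments - 1
    B = n_blocks - 1
    return N - T - B

def in_recommended_range(e, low=10, high=20):
    """True if E falls in the 10-20 range suggested on the slide."""
    return low <= e <= high

# Hypothetical designs, for illustration only.
e_unblocked = error_df(n_individuals=20, n_treatments=4)            # 19 - 3 - 0
e_blocked = error_df(n_individuals=24, n_treatments=4, n_blocks=3)  # 23 - 3 - 2

print(e_unblocked, in_recommended_range(e_unblocked))  # 16 True
print(e_blocked, in_recommended_range(e_blocked))      # 18 True
```

An E well below 10 suggests too few individuals; an E well above 20 suggests that animals or resources may be wasted.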
36. Sample size optimization (Festing et al.)
37. Controls
- This is the reference against which the results of an experimental manipulation can be compared
- Thus your control group should be identical to your treatment group in everything except the treatment itself
- Simple concept, common mistake
- If the predictions and statistical hypotheses have been constructed well then the control group will be obvious
- Lack of a control group makes an experiment pointless
38. Types of Controls
- Negative control - unmanipulated
- Positive control - manipulated but not treated (vehicle control, sham procedure control)
- Concurrent control - run at the same time as the treatment group
- Historic control - based on previous data (be certain that individuals are identical except for the treatment)
39. Blind Procedures
- Designed to remove the perception that unconscious bias might taint results
- Particularly useful when response variables are measured in a subjective way
- Blind Procedure - the person measuring does not know what treatment has been applied
- Double Blind - neither the subject nor the person measuring knows the treatment assigned (human studies)
40. When controls are not needed (or allowed)
- In medical or veterinary studies controls may be an ethical issue. Historical controls can be used but give careful consideration to criticisms
- When sets of treatments are being compared (e.g. effect of two drugs on rat behaviour)
41. Factorial experiments
- 2 group comparison (t-test) design
  - Treatment and control compared
- 1 factor design
  - Control and several levels of treatment compared
- 2 factor design
  - More than one treatment considered simultaneously
  - Allows estimation of both main effects AND the interaction between them
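For a 2 x 2 design, main effects and the interaction can be read directly from the four cell means. The sketch below defines the interaction as a "difference of differences" (some texts use half this value). The factor names, level labels and cell means are hypothetical.

```python
# Hypothetical cell means for a 2 x 2 factorial design, invented for
# illustration: factor A (absent/present) crossed with factor B.
cell_means = {
    ("A_absent", "B_absent"): 10.0,
    ("A_present", "B_absent"): 14.0,
    ("A_absent", "B_present"): 12.0,
    ("A_present", "B_present"): 22.0,
}

# Effect of A at each level of B.
effect_a_without_b = cell_means[("A_present", "B_absent")] - cell_means[("A_absent", "B_absent")]
effect_a_with_b = cell_means[("A_present", "B_present")] - cell_means[("A_absent", "B_present")]

# Effect of B at each level of A.
effect_b_without_a = cell_means[("A_absent", "B_present")] - cell_means[("A_absent", "B_absent")]
effect_b_with_a = cell_means[("A_present", "B_present")] - cell_means[("A_present", "B_absent")]

# Main effect: the effect of one factor averaged over the other.
main_effect_a = (effect_a_without_b + effect_a_with_b) / 2
main_effect_b = (effect_b_without_a + effect_b_with_a) / 2

# Interaction: how much the effect of A changes when B is present.
interaction = effect_a_with_b - effect_a_without_b

print(main_effect_a, main_effect_b, interaction)  # 7.0 5.0 6.0
```

A non-zero interaction means the effect of A depends on the level of B, which two separate one-factor experiments could not reveal.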
42. Main effects and interactions
[Table: main effects and interactions; row and column labels not recoverable]
43. Main effects and interactions
44. Completely randomized designs Vs. Blocking
- Completely Randomized designs are usually simple
- Completely Randomized designs assume small among-individual variation
- If among-individual variation can be attributed to a known factor then you can BLOCK by that factor, reduce error variation and increase your signal-to-noise ratio (clearer results)
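One common way to block is a randomized complete block layout, in which every treatment appears once in each block and the order is randomized separately within each block. The sketch below illustrates that layout. The function name `randomized_block_design`, the block labels (litters) and the treatments are hypothetical.

```python
import random

def randomized_block_design(blocks, treatments, seed=None):
    """Assign each treatment once per block, in random order within blocks.

    blocks maps a block label to the list of units in that block; each
    block must contain exactly as many units as there are treatments.
    """
    rng = random.Random(seed)
    design = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError(f"block {block!r} needs {len(treatments)} units")
        order = list(treatments)
        rng.shuffle(order)  # independent randomization inside each block
        design[block] = dict(zip(units, order))
    return design

# Hypothetical example: litter is the known source of variation.
litters = {
    "litter_1": ["mouse_a", "mouse_b", "mouse_c"],
    "litter_2": ["mouse_d", "mouse_e", "mouse_f"],
    "litter_3": ["mouse_g", "mouse_h", "mouse_i"],
}
treatments = ["control", "low_dose", "high_dose"]

design = randomized_block_design(litters, treatments, seed=7)
for litter, assignment in design.items():
    print(litter, assignment)
```

Because each treatment is compared within every litter, differences among litters no longer inflate the error variation.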
45. Advantages of blocking
46. Advantages of blocking
- Blocking is commonly used to remove effects of
  - Space
  - Time
  - Individual characters that can be ranked
- Continuous characters that affect among-individual variation can be used as covariates to remove effects and improve signal-to-noise ratio
47. The most common design errors
- Ad hoc designs
- Inappropriate control/treatment groups
- Sample sizes too large or too small
- Failure to use blocking
- Lab animal studies: failure to use isogenic strains when GxE is unimportant