Title: Statistical Analysis
1 Statistical Analysis Design in Research
Structure in the Experimental Material, PGRM 10
2 Blocking: the idea
- Detecting differences between treatments depends on the background noise (BN)
- BN is
- caused by inherent differences between the experimental units
- measured by the residual (error) mean square, RMS (alternatively, MSE)
- Comparing treatments on similar units would reduce background noise
- With blocks of units that differ in characteristics contributing to the response, we measure the variation due to blocks and reduce the residual variation
3 Blocking: the benefit
- Reducing background noise
- Gives more precise estimates
- Allows a reduction in replication without loss of power (the probability of detecting an effect of a specified size)
- Reduces cost!
4 Blocking and experimental material
- Examples
- A field with fertility increasing from top to bottom. With 3 treatments, group plots into BLOCKS of 3, starting at the top and continuing to the bottom. Randomise treatments within each block.
5 Block Design
What is the experimental unit?
How many replicates per treatment?
What is the block?
6 Example
- 2 drugs (A, B) to control blood pressure
- 100 subjects: randomly assign 50 each to A and B
- Valid, but is it efficient?
- If subjects are heterogeneous, there is likely to be a large variation (σ²) in the responses within each group.
- Design may not be very efficient.
7 Factors affecting BP variation
8 Blocking and experimental material
- 100 subjects are selected to compare a new drug to control BP with a Control.
Block into pairs by age and weight (believed to affect BP). In each pair one subject is selected at random to receive the new drug; the other receives Control.
Alternatively, see next slide.
9 Groups (Blocks)
10 Groups (Blocks)
11 Blocking and experimental material
- Examples
- A field with fertility increasing from top to bottom. With 3 treatments, group plots into BLOCKS of 3, starting at the top and continuing to the bottom. Randomise treatments within each block.
- 100 subjects are selected to compare a new drug to control BP with a Control. Block into pairs by age and weight (believed to affect BP). In each pair one subject is selected at random to receive the new drug; the other receives Control.
- 3 products to be compared in 15 supermarkets. All 3 are compared in each supermarket, and the supermarkets are regarded as BLOCKS.
12 Blocking and experimental material
- Examples (contd)
- A crop experiment will take 5 days to harvest. The material is blocked into 5 sets of plots, and treatments are assigned at random within each set. A BLOCK of plots is harvested each day. Here day effects, such as rain, will be allowed for in the ANOVA table, so they do not cloud the estimation of treatment effects, and residual variation is reduced.
13 Blocking factors in your work area?
14 Reasons to BLOCK
- Reduce BN (as above)
- Material is naturally blocked (e.g. identical twins), so using this as part of the design may reduce BN
- To protect against factors that may influence the experimental outcomes, and so cloud comparison of treatments
- To assess block variation itself, e.g. large day-to-day variation may indicate a process that is not well controlled.
15 Typical Randomised Block Design (RBD) Layout
4 treatments (T1 to T4), hence BLOCKS of size 4
Example of random allocation within blocks
Block
1 T3 T1 T2 T4
2 T2 T3 T1 T4
3 T1 T2 T3 T4
4 T2 T4 T1 T3
5 T4 T2 T3 T1
6 T3 T1 T4 T2
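A layout like the one above can be produced by shuffling the treatment list afresh inside each block. The following is a minimal Python sketch, not the method used to build the slide; the function name `randomised_block_layout` and the seed value are illustrative assumptions, and a different seed gives a different (equally valid) allocation.

```python
import random

def randomised_block_layout(treatments, n_blocks, seed=None):
    """Return one independently shuffled treatment order per block.

    Hypothetical helper for illustration: every block contains each
    treatment exactly once, in its own random order.
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # fresh randomisation inside every block
        layout.append(order)
    return layout

treatments = ["T1", "T2", "T3", "T4"]
layout = randomised_block_layout(treatments, n_blocks=6, seed=1)
for block, order in enumerate(layout, start=1):
    print(block, " ".join(order))
```

Randomising separately within each block is what distinguishes this design from a completely randomised one, where all 24 units would be shuffled together.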
16 ANOVA table
Each treatment occurs once in each block: t treatments, b blocks, tb experimental units
Source      DF          SS   MS   F        Pr > F
Treatments  t - 1       TSS  TMS  TMS/RMS  Small?
Blocks      b - 1       BSS  BMS  BMS/RMS  Small?
Residual    (t-1)(b-1)  RSS  RMS
Total       tb - 1
MS = SS/DF
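The sums of squares in the table are not defined on the slide. The standard definitions for a randomised block design with one observation per treatment-block cell are sketched below. The notation is an assumption introduced here: $y_{ij}$ is the response for treatment $i$ in block $j$, $\bar{y}_{i\cdot}$ and $\bar{y}_{\cdot j}$ are the treatment and block means, and $\bar{y}_{\cdot\cdot}$ is the grand mean. Note that TSS here is the treatment SS, as in the table, not the total SS.

```latex
\begin{align*}
\mathrm{TSS} &= b \sum_{i=1}^{t} \left(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\right)^2 \\
\mathrm{BSS} &= t \sum_{j=1}^{b} \left(\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot}\right)^2 \\
\mathrm{RSS} &= \sum_{i=1}^{t} \sum_{j=1}^{b}
  \left(y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot}\right)^2 \\
\sum_{i=1}^{t} \sum_{j=1}^{b} \left(y_{ij} - \bar{y}_{\cdot\cdot}\right)^2
  &= \mathrm{TSS} + \mathrm{BSS} + \mathrm{RSS}
\end{align*}
```

The degrees of freedom add up in the same way: $(t-1) + (b-1) + (t-1)(b-1) = tb - 1$.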
17 Example (PGRM pg 10-2)
- Compare the effect of washing solutions used to retard bacterial growth in food processing containers.
- Only 3 trials can be run each day, and temperature is not controlled, so day-to-day variability is expected.
- BLOCKS: day
- Treatments: 2%, 4%, 6% of active ingredient
- Randomisation: 3 containers randomly allocated to 3 treatments on each of 4 days.
- Response: bacterial count on each container each day (low score = cleaner)
18 Example (contd)
Day Solution(%) Count
1 2 13
1 4 10
1 6 5
2 2 18
2 4 20
2 6 6
3 2 18
3 4 17
3 6 7
4 2 30
4 4 31
4 6 10
The same data as csv (also shown in Excel):
Day,Solution(%),Count
1,2,13
1,4,10
1,6,5
2,2,18
2,4,20
...
- Note
- Response values in a single column
- Extra columns to identify
- BLOCK (day)
- TREATMENT (solution)
19 SAS GLM code
proc glm data=randb;
  class solution day;
  model score = solution day;
  lsmeans solution;
  lsmeans day;
  estimate '2-6' solution 1 0 -1;
  estimate 'linear ok?' solution 1 -2 1;
quit;
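The two `estimate` statements are contrasts among the solution means. Because the design is balanced (every solution appears once on every day), the least-squares means equal the plain arithmetic means, so the point estimates can be checked by hand. The Python sketch below does that using the counts from the data slide; it computes the estimates only, not their standard errors, and the variable names are illustrative assumptions.

```python
# (day, solution level) -> bacterial count, copied from the data slide
counts = {
    (1, 2): 13, (1, 4): 10, (1, 6): 5,
    (2, 2): 18, (2, 4): 20, (2, 6): 6,
    (3, 2): 18, (3, 4): 17, (3, 6): 7,
    (4, 2): 30, (4, 4): 31, (4, 6): 10,
}

solutions = [2, 4, 6]
n_days = len({day for day, _ in counts})

# Mean count per solution; equals the lsmean in this balanced layout
means = {
    s: sum(v for (_, sol), v in counts.items() if sol == s) / n_days
    for s in solutions
}

def contrast(coefficients):
    """Weighted sum of solution means, weights in the order 2, 4, 6."""
    return sum(c * means[s] for c, s in zip(coefficients, solutions))

diff_2_vs_6 = contrast([1, 0, -1])   # same weights as the '2-6' estimate
curvature = contrast([1, -2, 1])     # same weights as 'linear ok?'

print(means)
print(diff_2_vs_6, curvature)
```

The second contrast is zero when the three means lie on a straight line, so an estimate far from zero suggests the response is not linear in solution strength.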
20 GLM OUTPUT: ANOVA
425.17 (solution) + 322.92 (day) = 748.09, so the Model SS has been partitioned into TREATMENT (solution) and BLOCK (day).
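The partition of the Model SS can be reproduced directly from the twelve counts on the data slide. The sketch below is plain Python rather than the SAS used in the course, and the variable names are assumptions. It uses the usual correction-factor shortcut for each sum of squares. The unrounded sum of the two components is 748.08; the slide's 748.09 is the sum of the two rounded values.

```python
# (day, solution level, count) rows copied from the data slide
data = [
    (1, 2, 13), (1, 4, 10), (1, 6, 5),
    (2, 2, 18), (2, 4, 20), (2, 6, 6),
    (3, 2, 18), (3, 4, 17), (3, 6, 7),
    (4, 2, 30), (4, 4, 31), (4, 6, 10),
]

def group_totals(index):
    """Sum the counts for each level of the column at `index`."""
    totals = {}
    for row in data:
        totals[row[index]] = totals.get(row[index], 0) + row[2]
    return totals

day_totals = group_totals(0)   # block totals
sol_totals = group_totals(1)   # treatment totals
t, b = len(sol_totals), len(day_totals)

grand_total = sum(y for _, _, y in data)
cf = grand_total ** 2 / (t * b)          # correction factor

total_ss = sum(y * y for _, _, y in data) - cf
treat_ss = sum(v * v for v in sol_totals.values()) / b - cf
block_ss = sum(v * v for v in day_totals.values()) / t - cf
resid_ss = total_ss - treat_ss - block_ss

resid_df = (t - 1) * (b - 1)
rms = resid_ss / resid_df                # residual mean square
f_treat = (treat_ss / (t - 1)) / rms
f_block = (block_ss / (b - 1)) / rms

print(f"Solution SS {treat_ss:.2f}, Day SS {block_ss:.2f}")
print(f"Residual SS {resid_ss:.2f} on {resid_df} DF, RMS {rms:.2f}")
print(f"F solution {f_treat:.2f}, F day {f_block:.2f}")
```

Dropping the day term from the model would push the Day SS into the residual, inflating RMS and shrinking the F ratio for solution, which is the loss of precision that blocking is meant to avoid.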
21 GLM OUTPUT: means
22 ANOVA table
23 More Blocking: Latin square designs
24 Latin Square design: blocking by 2 sources of variation
- Variation in milk yield among cows is large (CV 25%)
- Variation in yield across lactation is large
- Use different treatments in sequence on each cow
- Need to allow for a standardisation period (1-2 weeks) between treatments
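In a Latin square each treatment appears exactly once in every row (here, period) and once in every column (here, cow). The Python sketch below builds one by permuting the rows and columns of a cyclic square. The treatment labels A to D and the function name are hypothetical; four treatments are assumed because the contrast in the course's SAS code has four coefficients. This simple construction does not sample uniformly from all possible squares, and it does nothing to balance carry-over between successive treatments.

```python
import random

def random_latin_square(treatments, seed=None):
    """Return an n x n Latin square as a list of rows.

    Hypothetical helper: starts from the cyclic square and randomly
    permutes its rows and columns, which preserves the Latin property.
    """
    rng = random.Random(seed)
    n = len(treatments)
    # Cyclic square: row i is the treatment list shifted left by i
    square = [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]
    rng.shuffle(square)                      # permute rows (periods)
    col_order = list(range(n))
    rng.shuffle(col_order)                   # permute columns (cows)
    return [[row[c] for c in col_order] for row in square]

treatments = ["A", "B", "C", "D"]
square = random_latin_square(treatments, seed=7)
for period, row in enumerate(square, start=1):
    print(period, " ".join(row))
```

Reading down a column gives the sequence of treatments one cow receives over the four periods.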
25 Data
Columns for period, cow and treatment codes
26 SAS GLM code
proc glm data=latinsq;
  class period cow treat;
  model yield = period cow treat;
  lsmeans treat;
  lsmeans period;
  lsmeans cow;
  estimate '1v2' treat 1 -1 0 0;
run;
27 Results
Cow and Period removed much variation
Means
28 Conclusions on Latin square design
- CV greatly reduced, to 6%
- When the effect of period is allowed for, repeated measurements within a cow are not very variable.
- Periods and cows are nuisance variables. Sometimes the row and column variables are of interest in themselves, and then the design is very efficient: it gives information on 3 factors (e.g. treatments, machines, operators).
- Useful for screening, but it is questionable whether short-term results would apply in the long term.