Biostatistics and Computer Applications

Slides: 41
Provided by: dafen
1
Biostatistics and Computer Applications
  • ANOVA of hierarchical data
  • Experimental design
  • ANOVA of common designs
  • SAS programming
  • 1/6/2003

2
Recap (Analysis of Variance)
  • Analysis of variance
  • One-way ANOVA
  • Two-way ANOVA
  • Multiple comparisons

3
Recap (Data for One-Way ANOVA)
k independent samples with n observations each, kn observations in total
4
Recap (One-way ANOVA)
5
Recap (Two-way ANOVA data)
6
Recap (two-way ANOVA)
7
Recap (multiple comparisons)
  • PLSD (LSD, t test) method.
  • Confidence interval (1-alpha)

8
Hierarchical (Nested classification) data
  • If the experimental data have l groups, each
    group has u subgroups, each subgroup has v
    sub-subgroups, ..., and each of the last-level
    sub-subgroups has n observations, we call the
    data hierarchical (nested classification) data.
  • The simplest case is two-level hierarchical data:
    l groups, each with m subgroups, each subgroup
    with n observations. The total number of
    observations is lmn.
  • Example: College -> year -> major -> student

9
Hierarchical data table (i = 1, 2, ..., l; j = 1, 2, ..., m; k = 1, 2, ..., n)
10
Linear mathematical model for hierarchical data
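The model equation itself did not survive in this transcript. As a hedged reconstruction, the usual two-level nested model is written as follows; the symbols tau (group) and delta (subgroup within group) are assumptions chosen to match the SSt, SSd and SSe labels used on the later slides.

```latex
x_{ijk} = \mu + \tau_i + \delta_{ij} + \varepsilon_{ijk},
\qquad i = 1,\dots,l;\quad j = 1,\dots,m;\quad k = 1,\dots,n
```

Here mu is the overall mean, tau_i the effect of group i, delta_ij the effect of subgroup j within group i, and epsilon_ijk the random error of observation k.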
11
Linear mathematical model for hierarchical ANOVA
12
ANOVA Total Variation Partitioning
  • Total variation: SS(Total)
  • Variation due to group: SSt
  • Variation due to subgroup within group: SSd
  • Variation due to random sampling: SSe
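The partition can be written out explicitly. This is the standard identity for balanced two-level nested data, given as a sketch in assumed dot notation (a dot marks the index averaged over); the formulas are not taken from the slide.

```latex
\underbrace{\sum_{i,j,k} (x_{ijk} - \bar{x}_{\cdot\cdot\cdot})^2}_{SS(\mathrm{Total})}
= \underbrace{mn \sum_{i} (\bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot\cdot\cdot})^2}_{SS_t}
+ \underbrace{n \sum_{i,j} (\bar{x}_{ij\cdot} - \bar{x}_{i\cdot\cdot})^2}_{SS_d}
+ \underbrace{\sum_{i,j,k} (x_{ijk} - \bar{x}_{ij\cdot})^2}_{SS_e}
```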

13
ANOVA Summary Table
Source of Variation      Degrees of Freedom   Sum of Squares   Mean Square   F
Group                    l - 1                SSt              MSt           MSt / MSd
Subgroup within group    l(m-1)               SSd              MSd           MSd / MSe
Error                    lm(n-1)              SSe              MSe
Total                    lmn - 1              SST
14
Expected mean square of hierarchical ANOVA
Source of Variation      Degrees of Freedom   Mean Square   Expected Mean Square
Group                    l - 1                MSt
Subgroup within group    l(m-1)               MSd
Error                    lm(n-1)              MSe
Total                    lmn - 1
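The expected mean squares are missing from this transcript. For reference, the textbook results for a balanced nested design with random subgroup effects are sketched here; the variance symbols are assumed notation, and if groups are fixed the sigma_t^2 term is replaced by the sum of tau_i^2 divided by (l - 1).

```latex
E(MS_t) = \sigma_e^2 + n\,\sigma_d^2 + mn\,\sigma_t^2, \qquad
E(MS_d) = \sigma_e^2 + n\,\sigma_d^2, \qquad
E(MS_e) = \sigma_e^2
```

These expectations explain the F ratios in the summary table: MSt and MSd differ only by the group term, so the group effect is tested against MSd, while the subgroup effect is tested against MSe.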
15
ANOVA Null Hypotheses
  • 1. No difference in means due to group
  • H01: μ1 = μ2 = ... = μl
  • 2. No difference in means due to subgroup within
    group
  • H02: μi1 = μi2 = ... = μim (within each group i)
  • If H02 is accepted, test H01,

16
Example of hierarchical ANOVA
Lead concentrations were measured in 4 vegetables
after the soil was treated with a pesticide.
Each vegetable was planted in 3 pots of
contaminated soil, with 5 plants per pot.
17
Result of ANOVA Table
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square   F
Plant                 3                    76.74            25.58         405.80
Pot within plant      8                    0.63             0.078         1.31
Error                 48                   2.90             0.060
Total                 59                   80.27
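The sums of squares in this table can be checked by hand. The following Python sketch (not part of the original deck) recomputes them from the concentrations listed in the deck's SAS DATALINES block; the function and key names are illustrative. It also reports two candidate F ratios for the plant effect, because the tabulated 405.80 appears to correspond to an error term that pools pot-within-plant and error (56 df) rather than to MSt / MSd; that reading is an inference from the numbers, not something the slide states.

```python
# Nested ANOVA by hand: 4 vegetables x 3 pots x 5 plants.
# Values are copied from the SAS DATALINES block in this deck.
DATA = {
    "A": [[0.7, 0.6, 0.9, 0.5, 0.6], [0.9, 0.9, 0.7, 1.1, 0.7], [0.8, 0.6, 0.9, 1.0, 0.8]],
    "B": [[1.2, 1.4, 1.6, 1.2, 1.5], [1.1, 0.9, 1.3, 1.2, 1.0], [1.5, 1.4, 0.9, 1.3, 1.6]],
    "C": [[0.6, 0.6, 0.8, 0.9, 0.7], [0.5, 0.8, 0.9, 1.0, 0.6], [0.6, 1.2, 0.8, 0.9, 1.0]],
    "D": [[4.2, 3.7, 2.9, 3.5, 3.6], [2.9, 3.5, 3.8, 3.1, 3.5], [3.6, 3.5, 4.0, 3.3, 3.7]],
}


def mean(values):
    return sum(values) / len(values)


def nested_anova(data):
    """Return sums of squares, df and F ratios for balanced nested data."""
    groups = list(data.values())
    l, m, n = len(groups), len(groups[0]), len(groups[0][0])
    grand = mean([x for pots in groups for pot in pots for x in pot])

    ss_group = ss_sub = ss_error = 0.0
    for pots in groups:
        group_mean = mean([x for pot in pots for x in pot])
        ss_group += m * n * (group_mean - grand) ** 2
        for pot in pots:
            pot_mean = mean(pot)
            ss_sub += n * (pot_mean - group_mean) ** 2
            ss_error += sum((x - pot_mean) ** 2 for x in pot)

    df_group, df_sub, df_error = l - 1, l * (m - 1), l * m * (n - 1)
    ms_group = ss_group / df_group
    ms_sub = ss_sub / df_sub
    ms_error = ss_error / df_error
    ms_pooled = (ss_sub + ss_error) / (df_sub + df_error)
    return {
        "ss_group": ss_group, "ss_sub": ss_sub, "ss_error": ss_error,
        "ss_total": ss_group + ss_sub + ss_error,
        "df": (df_group, df_sub, df_error),
        "f_sub": ms_sub / ms_error,             # pot within plant vs error
        "f_group_vs_sub": ms_group / ms_sub,    # plant vs pot within plant
        "f_group_pooled": ms_group / ms_pooled, # plant vs pooled error
    }


if __name__ == "__main__":
    for name, value in nested_anova(DATA).items():
        print(name, value)
```

Small differences from the tabulated F values come from the table using rounded mean squares.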
18
Multiple comparison
  • Tukey fixed range method
  • k = 4, df = 60 (56), q0.05 = 3.74 (table)
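As a rough illustration of how the tabulated q value is used, the sketch below computes a Tukey critical difference and lists the vegetable pairs whose means differ by more than it. Several inputs are assumptions: the vegetable means are totals from the SAS data divided by 15 plants, and the error mean square pools pot-within-plant and error from the result table (56 df), which is one reading of "df = 60 (56)". The function names are hypothetical.

```python
import math
from itertools import combinations

Q_CRIT = 3.74                     # q(0.05; k = 4, df = 60) as quoted on the slide
N_PER_GROUP = 15                  # 3 pots x 5 plants per vegetable
MS_POOLED = (0.63 + 2.90) / (8 + 48)   # assumed pooled error from the ANOVA table
MEANS = {                         # vegetable totals from the SAS data / 15
    "A": 11.7 / 15,
    "B": 19.1 / 15,
    "C": 11.9 / 15,
    "D": 52.8 / 15,
}


def tukey_hsd(q_crit, ms_error, n):
    """Smallest difference between two means declared significant."""
    return q_crit * math.sqrt(ms_error / n)


def significant_pairs(means, hsd):
    """Pairs of groups whose mean difference exceeds the critical value."""
    return [
        (a, b)
        for a, b in combinations(sorted(means), 2)
        if abs(means[a] - means[b]) > hsd
    ]


if __name__ == "__main__":
    hsd = tukey_hsd(Q_CRIT, MS_POOLED, N_PER_GROUP)
    print("critical difference:", round(hsd, 4))
    print("significant pairs:", significant_pairs(MEANS, hsd))
```

Note that the SAS program in this deck requests Tukey comparisons with E=pot(vegetable), which uses a different error term (8 df), so its critical difference will not match this sketch.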

19
SAS program
  • DATA Hierarchic;
  • INPUT vegetable $ pot @;
  • DO k = 1 TO 5;
  • INPUT concentration @;
  • OUTPUT;
  • END;
  • DATALINES;
  • A 1 0.7 0.6 0.9 0.5 0.6
  • A 2 0.9 0.9 0.7 1.1 0.7
  • A 3 0.8 0.6 0.9 1.0 0.8
  • B 1 1.2 1.4 1.6 1.2 1.5
  • B 2 1.1 0.9 1.3 1.2 1.0
  • B 3 1.5 1.4 0.9 1.3 1.6
  • C 1 0.6 0.6 0.8 0.9 0.7
  • C 2 0.5 0.8 0.9 1.0 0.6
  • C 3 0.6 1.2 0.8 0.9 1.0
  • D 1 4.2 3.7 2.9 3.5 3.6
  • D 2 2.9 3.5 3.8 3.1 3.5
  • D 3 3.6 3.5 4.0 3.3 3.7
  • ;
  • PROC ANOVA;
  • CLASS vegetable pot;
  • MODEL concentration = vegetable pot(vegetable);
  • TEST H=vegetable E=pot(vegetable);
  • MEANS vegetable / TUKEY E=pot(vegetable);
  • RUN;

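We test the vegetable effect using the mean square of pot within vegetable as the error term.
We use two INPUT statements: the first reads vegetable and pot and holds the line (trailing @), and the second, inside the DO loop, reads the five concentrations.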
20
Experimental design
  • Experimental design is a planned interference in
    the natural order of events by the researcher.
  • Why design?
  • To make inferences about what produced,
    contributed to, or caused events.
  • To gain such information without ambiguity.

21
Experimental design
  • Terminology
  • Experimental (and environmental) factor: a
    variable of specific experimental interest, for
    example fertilizer type or amount of nutrient.
  • Treatment: a particular level of an experimental
    factor, or a combination of levels of multiple
    factors. Level refers to the degree or intensity
    of a factor.
  • Random: refers to completely chance events that
    are not predictable. Randomization eliminates
    systematic influence on assignment.
  • Control: a group not exposed to the treatment.
  • Block: a category of subjects within a
    treatment group. Within a block, environmental
    factors are homogeneous.

22
Experimental design
  • Principles of experimental design
  • 1. Randomization.
  • Assign treatments to each unit (plot) randomly
    (with same probability).
  • Provides unbiased estimate of error (normal
    distribution)
  • 2. Replication.
  • Estimate random error
  • Increase the precision of the estimate
  • 3. Block control.
  • One set of experiments with similar environmental
    conditions
  • Further decrease standard error by separating
    block effect.
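Randomization combined with block control can be sketched in a few lines: each block receives every treatment once, in an independently shuffled order. The treatment labels A to E, the number of blocks, and the function name are hypothetical choices for illustration only.

```python
import random


def randomize_blocks(treatments, n_blocks, seed=None):
    """Return {block number: treatments in random plot order}.

    Every block contains each treatment exactly once, and the order is
    shuffled separately within each block.
    """
    rng = random.Random(seed)  # a seed makes the layout reproducible
    layout = {}
    for block in range(1, n_blocks + 1):
        order = list(treatments)
        rng.shuffle(order)
        layout[block] = order
    return layout


if __name__ == "__main__":
    for block, order in randomize_blocks("ABCDE", n_blocks=3, seed=2003).items():
        print("block", block, ":", " ".join(order))
```

Fixing the seed is useful for documenting the field layout; omit it to draw a fresh randomization.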

23
Experimental design
  • According to number of factors
  • Single factor experiment
  • Detect simple effect of experimental factor
  • Easy to apply and analyze.
  • Multi-factor experiment
  • 2, 3, or more factors
  • Detects both main effects and interactions
  • Lower standard error, so smaller true effects
    are easier to detect
  • Data are more difficult to analyze.

24
Experimental effects
  • Experimental effects
  • Simple effect: the change in response produced by a
    change in the level of one factor
  • Main effect: the mean of the simple effects
  • Interaction: the change in response caused by the
    joint action of experimental factors
  • Example
  • Test the effects of N and P on wheat yield, with two
    levels of N (n1, n2) and two levels of P (p1, p2).
    Yields (kg/plot) are shown in the table.
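The three kinds of effect can be made concrete for a 2 x 2 experiment. The yield table from the slide is not preserved in this transcript, so the numbers in the usage lines are hypothetical, and the interaction is computed with one common convention (half the difference between the two simple effects of N).

```python
def factorial_effects(n1p1, n2p1, n1p2, n2p2):
    """Simple effects, main effects and interaction for a 2 x 2 design.

    Arguments are mean yields for each N level / P level combination.
    """
    simple_n = (n2p1 - n1p1, n2p2 - n1p2)  # effect of N at p1 and at p2
    simple_p = (n1p2 - n1p1, n2p2 - n2p1)  # effect of P at n1 and at n2
    return {
        "simple_n": simple_n,
        "simple_p": simple_p,
        "main_n": sum(simple_n) / 2,       # mean of the simple effects of N
        "main_p": sum(simple_p) / 2,       # mean of the simple effects of P
        # Zero when the effect of N is the same at both P levels.
        "interaction": (simple_n[1] - simple_n[0]) / 2,
    }


if __name__ == "__main__":
    # Hypothetical yields (kg/plot), not the values from the slide.
    print("additive:", factorial_effects(10, 16, 18, 24))
    print("positive:", factorial_effects(10, 16, 18, 28))
    print("negative:", factorial_effects(10, 16, 18, 20))
```

With additive yields the interaction is 0; it becomes positive when the two nutrients together give more than the sum of their separate effects, and negative when they give less.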

25
Experimental effects

No interaction!
Simple effect Main effect Interaction
26
Experimental effects

Positive interaction!
Simple effect Main effect Interaction
27
Experimental effects

Negative interaction!
Simple effect Main effect Interaction
28
Experimental design
  • According to unit arrangement
  • Completely randomized experiment
  • Randomization and replication
  • One or multiple factors
  • Easy to apply and analyze
  • Randomized block experiment
  • Randomization, replication, and block control
  • One or multiple factors
  • Most commonly used
  • Latin square experiment
  • Randomization, replication, and block control on
    both rows and columns
  • One or more factors, but the number of treatments
    k is limited to 5-10
  • Split plot experiment
  • Special requirements for different factors
  • Different precision for different factors
  • Multiple factors only.

29
Completely randomized experiment
  • One factor experiment
  • This is exactly the same as one-way ANOVA.
  • Multiple factors experiment
  • Similar to randomized block experiment, just
    remove Block effect as shown next.

30
Randomized block experiment (one factor)
  • Example: We want to compare the yields of 7 barley
    varieties using a randomized block design with 3
    replicates. Plots and yields per plot are shown below.

31
Randomized block experiment (one factor)
  • DATA rbe1;
  • INPUT block $ variety $ yield;
  • DATALINES;
  • I F 20
  • I A 24
  • I E 22
  • I D 18
  • I C 21
  • I G 20
  • I B 20
  • II A 20
  • II D 16
  • II C 19
  • II F 21
  • II B 19
  • II E 20
  • II G 19
  • III F 21
  • III B 21
  • ;
  • PROC ANOVA;
  • CLASS block variety;
  • MODEL yield = variety block;
  • MEANS variety / LSD ALPHA=0.05;
  • RUN;

Here we are interested in the effect of variety,
not block. Block is used to decrease the standard
error. If block effect is not significant, it
means no big difference in environmental factors
among blocks. If block effect is significant, we
are happy we separated this effect from model
error. We do not do multiple comparisons for
block.
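To show how separating the block effect shrinks the error term, here is a hand calculation of the randomized block ANOVA in Python. It is only a sketch: it uses blocks I and II alone, because block III is incomplete in this transcript, so the results will not match a full three-block analysis. Function and key names are illustrative.

```python
# Yields from blocks I and II of the barley example (block III omitted
# because only two of its seven plots appear in the transcript).
YIELDS = {
    "I":  {"A": 24, "B": 20, "C": 21, "D": 18, "E": 22, "F": 20, "G": 20},
    "II": {"A": 20, "B": 19, "C": 19, "D": 16, "E": 20, "F": 21, "G": 19},
}


def rcbd_anova(yields):
    """Sums of squares and F for one factor in randomized complete blocks."""
    blocks = list(yields)
    varieties = sorted(yields[blocks[0]])
    b, t = len(blocks), len(varieties)

    grand_total = sum(yields[blk][v] for blk in blocks for v in varieties)
    correction = grand_total ** 2 / (b * t)

    ss_total = sum(yields[blk][v] ** 2 for blk in blocks for v in varieties) - correction
    ss_block = sum(sum(yields[blk].values()) ** 2 for blk in blocks) / t - correction
    ss_variety = sum(sum(yields[blk][v] for blk in blocks) ** 2 for v in varieties) / b - correction
    ss_error = ss_total - ss_block - ss_variety  # what is left after both effects

    df_variety, df_error = t - 1, (b - 1) * (t - 1)
    return {
        "ss_block": ss_block,
        "ss_variety": ss_variety,
        "ss_error": ss_error,
        "f_variety": (ss_variety / df_variety) / (ss_error / df_error),
    }


if __name__ == "__main__":
    for name, value in rcbd_anova(YIELDS).items():
        print(name, round(value, 4))
```

Without the block term, ss_block would be absorbed into the error sum of squares, which would lower the F ratio for variety.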
32
Randomized block experiment (two factors)
  • Example: We want to test the effects of N and P on
    plant yield, with three levels of N (0, 5, 10 kg) and
    five levels of P (0, 2, 4, 6, 8 kg), for a total of
    15 treatment combinations. Randomized block design,
    replicated twice.
  • DATA rbe2;
  • INPUT block N $ 5-6 P $ 7-8 yield;
  • DATALINES;
  • 1   A2B2 5.0
  • 1   A2B4 4.9
  • 1   A1B1 4.3
  • 1   A3B2 4.4
  • 1   A1B5 4.7
  • 1   A2B1 5.2

We use column input to read N (columns 5-6) and P (columns 7-8).
33
Randomized block experiment (two factors)
  • DATA rbe2;
  • INPUT block N $ 5-6 P $ 7-8 yield;
  • DATALINES;
  • 1   A2B2 5.0
  • 1   A2B4 4.9
  • 1   A1B1 4.3
  • 1   A3B2 4.4
  • 1   A1B5 4.7
  • 1   A2B1 5.2
  • 1   A3B4 3.4
  • 1   A1B4 4.8
  • 1   A3B5 3.7
  • 2   A2B3 3.4
  • 2   A3B1 4.7
  • 2   A3B3 3.4
  • 2   A2B2 5.2
  • 2   A1B4 4.0
  • 2   A3B5 4.2
  • ;
  • PROC ANOVA;
  • CLASS block n p;
  • MODEL yield = n p n*p block;
  • MEANS n p n*p / T;
  • RUN;

If the interaction (n*p) is not significant, then
the best combination of N and P is the best N
level together with the best P level. Otherwise, you
need to compare the n*p combinations directly.
34
Latin square experiment
  • Effect of five N treatments (0 kg, 10 kg, 15 kg,
    20 kg, 25 kg) on wheat yield, using a Latin square
    design. (Treatment codes: 1 = 0 kg, 2 = 10 kg,
    3 = 15 kg, 4 = 20 kg, 5 = 25 kg.)

35
Latin square design
  • DATA latin;
  • DO row = 1 TO 5;
  • DO column = 1 TO 5;
  • INPUT treatment yield @@;
  • OUTPUT;
  • END;
  • END;
  • DATALINES;
  • 3 10.1 1 7.9 2 9.8 5 7.1 4 9.6
  • 1 7.0 4 10.0 5 7.0 3 9.7 2 9.1
  • 5 7.6 3 9.7 4 10.0 2 9.3 1 6.8
  • 4 10.5 2 9.6 3 9.8 1 6.6 5 7.9
  • 2 8.9 5 8.9 1 8.6 4 10.6 3 10.1
  • ;
  • PROC ANOVA;
  • CLASS row column treatment;
  • MODEL yield = treatment row column;
  • MEANS treatment / T ALPHA=0.05;
  • MEANS treatment / T ALPHA=0.01;
  • RUN;

We focus on treatment effect only.
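The Latin square analysis can also be verified by hand. This Python sketch (not from the original deck) first checks that every treatment code occurs once in each row and column of the layout given in the SAS data, then partitions the total sum of squares into treatment, row, column and error parts. Names are illustrative.

```python
# (treatment code, yield) for each plot, rows top to bottom, columns left
# to right, copied from the SAS DATALINES block.
CELLS = [
    [(3, 10.1), (1, 7.9), (2, 9.8), (5, 7.1), (4, 9.6)],
    [(1, 7.0), (4, 10.0), (5, 7.0), (3, 9.7), (2, 9.1)],
    [(5, 7.6), (3, 9.7), (4, 10.0), (2, 9.3), (1, 6.8)],
    [(4, 10.5), (2, 9.6), (3, 9.8), (1, 6.6), (5, 7.9)],
    [(2, 8.9), (5, 8.9), (1, 8.6), (4, 10.6), (3, 10.1)],
]


def is_latin_square(cells):
    """True if each treatment appears once in every row and every column."""
    k = len(cells)
    codes = set(range(1, k + 1))
    rows_ok = all({t for t, _ in row} == codes for row in cells)
    cols_ok = all({cells[r][c][0] for r in range(k)} == codes for c in range(k))
    return rows_ok and cols_ok


def latin_anova(cells):
    """Sums of squares and treatment F for a k x k Latin square."""
    k = len(cells)
    total = sum(y for row in cells for _, y in row)
    correction = total ** 2 / (k * k)

    row_totals = [sum(y for _, y in row) for row in cells]
    col_totals = [sum(cells[r][c][1] for r in range(k)) for c in range(k)]
    trt_totals = {}
    for row in cells:
        for t, y in row:
            trt_totals[t] = trt_totals.get(t, 0.0) + y

    ss_total = sum(y ** 2 for row in cells for _, y in row) - correction
    ss_row = sum(v ** 2 for v in row_totals) / k - correction
    ss_col = sum(v ** 2 for v in col_totals) / k - correction
    ss_trt = sum(v ** 2 for v in trt_totals.values()) / k - correction
    ss_error = ss_total - ss_row - ss_col - ss_trt
    df_error = (k - 1) * (k - 2)
    return {
        "ss_treatment": ss_trt, "ss_row": ss_row, "ss_column": ss_col,
        "ss_error": ss_error,
        "f_treatment": (ss_trt / (k - 1)) / (ss_error / df_error),
    }


if __name__ == "__main__":
    print("valid Latin square:", is_latin_square(CELLS))
    for name, value in latin_anova(CELLS).items():
        print(name, round(value, 4))
```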
36
Split plot experiment
  • To test the effect of warming on plant growth, we
    set four levels of increased temperature (A1: 3°,
    A2: 2°, A3: 1°, and A4: 0°, the control). We also
    test the effect of clipping. Within each warming
    plot, we set 3 levels of clipping (B1: clipping
    twice, in summer and winter; B2: clipping once, in
    winter; B3: no clipping). Test the warming and
    clipping effects.

37
Split plot experiment
  • DATA splitplot;
  • INPUT block 1 warming $ 2-3 clipping $ 5-6 yield;
  • DATALINES;
  • 1A3 B2 20
  • 1A3 B1 18
  • 1A3 B3 18
  • 1A2 B3 20
  • 1A2 B1 24
  • 3A3 B3 18
  • 3A3 B2 18
  • 3A2 B3 23
  • 3A2 B2 22
  • 3A2 B1 25
  • ;
  • PROC ANOVA;
  • CLASS block warming clipping;
  • MODEL yield = block warming block*warming clipping
    warming*clipping;
  • TEST H=warming block E=block*warming;
  • MEANS warming / LSD E=block*warming;
  • MEANS clipping / LSD CLDIFF;
  • RUN;

We use different error terms for warming and
clipping, both in the F tests and in the multiple
comparisons.
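The two error terms follow from the usual split-plot model, sketched here in assumed notation (the deck does not show the model): rho is the block effect, alpha the main-plot factor (warming), eta the block-by-warming main-plot error, beta the subplot factor (clipping), and epsilon the subplot error.

```latex
y_{ijk} = \mu + \rho_k + \alpha_i + \eta_{ik} + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}

F_{\text{warming}} = \frac{MS_{\alpha}}{MS_{\eta}}, \qquad
F_{\text{clipping}} = \frac{MS_{\beta}}{MS_{\varepsilon}}, \qquad
F_{\text{warming} \times \text{clipping}} = \frac{MS_{\alpha\beta}}{MS_{\varepsilon}}
```

Warming is applied to whole plots, so it is tested against the block-by-warming mean square, which matches E=block*warming in the TEST and MEANS statements. Clipping and the interaction vary within plots and use the residual error.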
38
What if data are missing? (PROC GLM)
  • The GLM procedure uses the method of least
    squares to fit general linear models. It is a
    powerful procedure: you can perform regression,
    analysis of variance, analysis of covariance,
    multivariate analysis of variance, and partial
    correlation using PROC GLM.
  • With PROC GLM, you can relate one or several
    continuous dependent variables to one or several
    independent variables. The independent variables
    may be either classification variables, which
    divide the observations into discrete groups, or
    continuous variables.
  • For balanced data, you may use PROC ANOVA.
    For unbalanced data, you should use PROC GLM.

39
Deal with missing data
  • PROC GLM < options >;
  • CLASS variables;
  • MODEL dependents = independents < / options >;
  • TEST < H=effects > E=effect < / options >;
  • MEANS effects < / options >;
  • LSMEANS effects < / options >;
  • OUTPUT < OUT=SAS-data-set >
          keyword=names < ... keyword=names > < / option >;
  • RANDOM effects < / options >;

40
Deal with missing data
  • DATA unbalanced;
  • DO variety = 1 TO 2;
  • DO fertilizer = 1 TO 3;
  • DO block = 1 TO 3;
  • INPUT yield @@;
  • OUTPUT;
  • END;
  • END;
  • END;
  • DATALINES;
  • 7 6 8
  • . 9 10
  • 5 4 3
  • 6 6 7
  • 8 . 9
  • 7 6 5
  • ;
  • PROC GLM;
  • CLASS variety fertilizer block;
  • MODEL yield = block variety|fertilizer;
  • MEANS variety fertilizer / LSD LINES;
  • LSMEANS variety fertilizer / T;
  • RUN;
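Missing values are the reason both MEANS and LSMEANS are requested: with unequal cell counts, a raw average weights the fertilizer levels unequally. The Python sketch below illustrates the idea with the 18 values from the DATALINES block (None stands for the missing "."). It compares the raw mean per variety with the unweighted mean of the fertilizer cell means. This is a simplified stand-in that ignores the block term, so it is not the exact quantity PROC GLM reports as a least-squares mean; function names are hypothetical.

```python
# Yields in reading order: variety (1-2) > fertilizer (1-3) > block (1-3).
# None marks a missing value ("." in the SAS data).
VALUES = [7, 6, 8, None, 9, 10, 5, 4, 3,
          6, 6, 7, 8, None, 9, 7, 6, 5]


def build_cells(values, n_variety=2, n_fertilizer=3, n_block=3):
    """Group non-missing yields by (variety, fertilizer) cell."""
    cells = {}
    it = iter(values)
    for variety in range(1, n_variety + 1):
        for fertilizer in range(1, n_fertilizer + 1):
            obs = [next(it) for _ in range(n_block)]
            cells[(variety, fertilizer)] = [y for y in obs if y is not None]
    return cells


def raw_mean(cells, variety):
    """Average of all available observations for one variety."""
    obs = [y for (v, _), ys in cells.items() if v == variety for y in ys]
    return sum(obs) / len(obs)


def mean_of_cell_means(cells, variety):
    """Average the fertilizer cell means so each level counts equally."""
    means = [sum(ys) / len(ys) for (v, _), ys in cells.items() if v == variety]
    return sum(means) / len(means)


if __name__ == "__main__":
    cells = build_cells(VALUES)
    for variety in (1, 2):
        print("variety", variety,
              "raw mean:", round(raw_mean(cells, variety), 4),
              "mean of cell means:", round(mean_of_cell_means(cells, variety), 4))
```

For both varieties the missing value falls in the highest-yielding fertilizer level, so the raw mean understates the variety mean relative to the equally weighted version.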