IE341: Introduction to Design of Experiments - PowerPoint PPT Presentation

Slides: 94
Provided by: ie1Ka

1
IE341 Introduction to Design of Experiments
2
  • Last term we talked about testing the
    difference between two independent means. For
    means from a normal population, the test
    statistic is

3
  • We also covered the case where the two means
    are not independent, and what we must do to
    account for the fact that they are dependent.

4
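  • And finally, we talked about the difference
    between two variances, where we used the F ratio.
    The F distribution is the distribution of a ratio
    of two independent chi-square variables, each
    divided by its degrees of freedom. So if
    $\chi_1^2$ and $\chi_2^2$ possess independent
    chi-square distributions with $\nu_1$ and
    $\nu_2$ df, respectively, then
    $F = \frac{\chi_1^2 / \nu_1}{\chi_2^2 / \nu_2}$
  • has the F distribution with $\nu_1$ and $\nu_2$ df.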
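
The definition above can be checked with a small simulation. This is only a sketch: the degrees of freedom (4 and 20), the seed, and the number of draws are arbitrary choices for illustration, not values from the slides. It relies on the standard result that the mean of an F variable is $\nu_2/(\nu_2 - 2)$ when $\nu_2 > 2$.

```python
import random

def chi_square(df, rng):
    """Draw one chi-square variate as a sum of df squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_ratio(df1, df2, rng):
    """Ratio of two independent chi-squares, each divided by its df."""
    return (chi_square(df1, rng) / df1) / (chi_square(df2, rng) / df2)

rng = random.Random(341)      # fixed seed so the run is repeatable
df1, df2 = 4, 20              # illustrative df, not taken from the slides
draws = [f_ratio(df1, df2, rng) for _ in range(20000)]

simulated_mean = sum(draws) / len(draws)
theoretical_mean = df2 / (df2 - 2)   # mean of F(df1, df2), valid for df2 > 2
print(round(simulated_mean, 3), round(theoretical_mean, 3))
```

The two printed numbers should agree to within simulation error.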

5
  • All of this is valuable if we are testing
    only two means. But what if we want to test to
    see if there is a difference among three means,
    or four, or ten?
  • What if we want to know whether fertilizer
    A or fertilizer B or fertilizer C is best? In
    this case, fertilizer is called a factor, which
    is the condition under test.
  • A, B, C, the three types of fertilizer
    under test, are called levels of the factor
    fertilizer.
  • Or what if we want to know if treatment A
    or treatment B or treatment C or treatment D is
    best? In this case, treatment is called a
    factor.
  • A, B, C, D, the four types of treatment under
    test, are called levels of the factor treatment.
  • It should be noted that the factor may be
    quantitative or qualitative.

6
  • Enter the analysis of variance!
  • ANOVA, as it is usually called, is a way to
    test the differences between means in such
    situations.
  • Previously, we tested single-factor
    experiments with only two treatment levels.
    These experiments are called single-factor
    because there is only one factor under test.
    Single-factor experiments are more commonly
    called one-way experiments.
  • Now we move to single-factor experiments with
    more than two treatment levels.

7
  • Let's start with some notation.
  • $Y_{ij}$ = the ith observation in the jth level
  • $N$ = total number of experimental
    observations
  • $\bar{Y}$ = the grand mean of all N
    experimental observations
  • $\bar{Y}_j$ = the mean of the observations
    in the jth level
  • $n_j$ = number of observations in the jth
    level; the $n_j$ are called replicates.
  • Replication of the design refers to using
    more than one experimental unit for each level.

8
  • Designs are more powerful if they are
    balanced, that is, if every level has the same
    number of replicates, but balance is not always
    possible.
  • Suppose you are doing an experiment and the
    equipment breaks down on one of the tests. Now,
    not by design but by circumstance, you have
    unequal numbers of replicates for the levels.
  • All the formulas use $n_j$, the number
    of replicates in treatment j, rather than a
    common n, so unequal replication causes no
    problem.

9
  • Notation continued
  • $\tau_j$ = the effect of the jth level
  • $L$ = number of treatment levels
  • $\varepsilon_{ij}$ = the error associated with the ith
    observation in the jth level,
    assumed to be independent normally distributed
    random variables with mean 0 and variance
    $\sigma^2$, which is constant for all levels of the
    factor.

10
  • For all experiments, randomization is
    critical. So to draw any conclusions from the
    experiment, we must require that the treatments
    be applied in random order.
  • We must also assign the experimental units to
    the treatments randomly.
  • If all this randomization occurs, the design
    is called a completely randomized design.
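
As a rough sketch of what a completely randomized run order looks like in practice, the snippet below shuffles a list of runs. The level names A to D, the 3 replicates, and the seed are hypothetical values chosen only for illustration.

```python
import random

def completely_randomized_order(levels, replicates, seed=None):
    """Return a random run order in which each level appears `replicates` times."""
    runs = [level for level in levels for _ in range(replicates)]
    rng = random.Random(seed)
    rng.shuffle(runs)          # every ordering of the runs is equally likely
    return runs

# Hypothetical example: 4 levels, 3 replicates each, arbitrary seed.
run_order = completely_randomized_order(["A", "B", "C", "D"], 3, seed=1)
for run_number, level in enumerate(run_order, start=1):
    print(run_number, level)
```

The same shuffle can be applied to a list of experimental units to assign them to treatments at random.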

11
  • ANOVA begins with a linear statistical model
    $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$

12
  • This model is for a one-way or single-factor
    ANOVA. The goal of the model is to test
    hypotheses about the treatment effects and to
    estimate them.
  • If the treatments have been selected by the
    experimenter, the model is called a fixed-effects
    model. In this case, the conclusions will apply
    only to the treatments under consideration.

13
  • Another type of model is the random effects
    model or components of variance model.
  • In this situation, the treatments used are a
    random sample from a large population of
    treatments. Here the $\tau_j$ are random variables and
    we are interested in their variability, not in
    the differences among the means being tested.

14
  • First, we will talk about fixed effects,
    completely randomized, balanced models.
  • In the model we showed earlier, the $\tau_j$ are
    defined as deviations from the grand mean, so
    $\sum_{j=1}^{L} \tau_j = 0$
  • It follows that the mean of the jth treatment
    is $\mu_j = \mu + \tau_j$

15
  • Now the hypothesis under test is
  • $H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_L$
  • $H_a: \mu_j \neq \mu_k$ for at least one j,k
    pair
  • The test procedure is ANOVA, which is a
    decomposition of the total sum of squares into
    its component parts according to the model.

16
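  • The total SS is
    $SS_{total} = \sum_{j=1}^{L} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y})^2$,
    where $\bar{Y}$ is the grand mean,
  • and ANOVA is about dividing it into its
    component parts.
  • $SS_{treatments}$ = variability of the differences
    among the L levels
  • $SS_{error}$ = pooled variability of the random
    error within levels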

17
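  • This is easy to see because, writing $\bar{Y}_j$
    for the mean of level j,
    $\sum_{j}\sum_{i} (Y_{ij} - \bar{Y})^2 = \sum_{j}\sum_{i} [(\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j)]^2$
    $= \sum_{j} n_j (\bar{Y}_j - \bar{Y})^2 + \sum_{j}\sum_{i} (Y_{ij} - \bar{Y}_j)^2 + 2 \sum_{j}\sum_{i} (\bar{Y}_j - \bar{Y})(Y_{ij} - \bar{Y}_j)$
  • But the cross-product term vanishes because
    $\sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j) = 0$ for every level j.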

18
  • So $SS_{total} = SS_{treatments} + SS_{error}$
  • Most of the time, this is called
  • $SS_{total} = SS_{between} + SS_{within}$
  • Each of these terms becomes an MS (mean
    square) term when divided by the appropriate df.

19
  • The df for $SS_{error}$ is $N - L$ because
    $\sum_{j=1}^{L} (n_j - 1) = N - L$
  • and the df for $SS_{between}$ is $L - 1$ because
    there are L levels.

20
  • Now the expected values of each of these terms
    are
  • $E(MS_{error}) = \sigma^2$
  • $E(MS_{treatments}) = \sigma^2 + \frac{\sum_{j=1}^{L} n_j \tau_j^2}{L - 1}$

21
  • Now if there are no differences among the
    treatment means, then $\tau_j = 0$ for all j.
  • So we can test for differences with our old
    friend F,
    $F = \frac{MS_{treatments}}{MS_{error}}$
  • with $L - 1$ and $N - L$ df.
  • Under $H_0$, both numerator and denominator are
    estimates of $\sigma^2$, so F should be close to 1
    and the result will usually not be significant.
  • Under $H_a$, the result should be significant
    because the numerator is estimating the treatment
    effects as well as $\sigma^2$.

22
  • The results of an ANOVA are presented in an
    ANOVA table. For this one-way, fixed-effects,
    balanced model

Source   SS          df    MS          p
Model    SSbetween   L-1   MSbetween   p
Error    SSwithin    N-L   MSwithin
Total    SStotal     N-1

23
  • Let's look at a simple example.
  • A product engineer is investigating the
    tensile strength of a synthetic fiber to make
    men's shirts. He knows from prior experience
    that the strength is affected by the weight
    percent of cotton in the material. He also knows
    that the percent should range between 10 and
    40 so that the shirts can receive permanent
    press treatment.

24
  • The engineer decides to test 5 levels of cotton
    weight percent:
  • 15, 20, 25, 30, 35
  • and to have 5 replicates in this design.
  • His data are

Cotton weight %   Observations          Mean
15                 7   7  15  11   9     9.8
20                12  17  12  18  18    15.4
25                14  18  18  19  19    17.6
30                19  25  22  19  23    21.6
35                 7  10  11  15  11    10.8
Grand mean                              15.04
25
  • In this tensile strength example, the
    ANOVA table is

Source   SS       df   MS       p
Model    475.76    4   118.94   <0.01
Error    161.20   20     8.06
Total    636.96   24

  • In this case, we would reject $H_0$ and declare
    that there is an effect of the cotton weight
    percent.
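
The sums of squares in this table can be reproduced directly from the raw tensile strength data. The sketch below uses only the standard library; the function name `one_way_anova` and the dictionary layout are illustrative choices, and the p-value is not computed because that needs an F distribution routine.

```python
# Tensile strength data from the slides: cotton weight % -> 5 replicates
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}

def one_way_anova(groups):
    """Return SS, df, MS and F for a one-way fixed-effects ANOVA."""
    all_obs = [y for ys in groups.values() for y in ys]
    n_total, n_levels = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n_total

    # Between-level SS: n_j times squared deviation of each level mean
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    # Within-level SS: squared deviations from each observation's own level mean
    ss_within = sum(
        (y - sum(ys) / len(ys)) ** 2 for ys in groups.values() for y in ys
    )
    df_between, df_within = n_levels - 1, n_total - n_levels
    ms_between, ms_within = ss_between / df_between, ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

table = one_way_anova(data)
for name, value in table.items():
    print(f"{name:12s} {value:8.2f}")
```

Because the function works with each level's own count, it also handles unbalanced data.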
26
  • We can estimate the treatment parameters by
    subtracting the grand mean from the treatment
    means. In this example,
  • $\hat{\tau}_1 = 9.80 - 15.04 = -5.24$
  • $\hat{\tau}_2 = 15.40 - 15.04 = 0.36$
  • $\hat{\tau}_3 = 17.60 - 15.04 = 2.56$
  • $\hat{\tau}_4 = 21.60 - 15.04 = 6.56$
  • $\hat{\tau}_5 = 10.80 - 15.04 = -4.24$
  • Clearly, treatment 4 is the best because it
    provides the greatest tensile strength.
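
The estimates above are simple enough to compute in a few lines. This sketch starts from the level means given in the slides; averaging the level means to get the grand mean is valid here only because the design is balanced.

```python
# Level means from the tensile strength example (cotton weight % -> mean)
level_means = {15: 9.8, 20: 15.4, 25: 17.6, 30: 21.6, 35: 10.8}

# With equal replication, the grand mean is the average of the level means.
grand_mean = sum(level_means.values()) / len(level_means)
effects = {level: mean - grand_mean for level, mean in level_means.items()}

for level, effect in effects.items():
    print(f"{level}%: {effect:+.2f}")

# The estimated effects sum to zero, matching the constraint on the tau_j.
effects_sum = sum(effects.values())
best_level = max(effects, key=effects.get)
print("best level:", best_level)
```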

27
  • Now you could have computed these values from
    the raw data yourself instead of doing the ANOVA.
    You would get the same results, but you wouldn't
    know if treatment 4 was significantly better.
  • But if you did a scatter diagram of the
    original data, you would see that treatment 4 was
    best, with no analysis whatsoever.
  • In fact, you should always look at the
    original data to see if the results do make
    sense. A scatter diagram of the raw data usually
    tells as much as any analysis can.

29
  • How do you test the adequacy of the model?
  • The model rests on certain assumptions that must
    hold for the ANOVA to be useful, most
    importantly that the errors are distributed
    normally and independently.
  • The estimated error for each observation,
    called the residual, is
    $e_{ij} = Y_{ij} - \bar{Y}_j$,
    the observation minus the mean of its level.
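
For the tensile strength data, the residuals can be computed as each observation minus its level mean. This is a minimal sketch; the variable names are illustrative.

```python
# Tensile strength data from the slides: cotton weight % -> 5 replicates
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}

# Residual = observation minus the mean of its own level
residuals = {
    level: [y - sum(ys) / len(ys) for y in ys] for level, ys in data.items()
}

for level, res in residuals.items():
    print(level, [round(r, 1) for r in res])

# The squared residuals add up to the error sum of squares.
ss_error = sum(r ** 2 for res in residuals.values() for r in res)
print("SS error:", round(ss_error, 2))
```

Within each level the residuals sum to zero, which is why the error SS has only N - L degrees of freedom.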

30
  • A residual check is very important for testing
    for nonconstant variance. The residuals should
    be structureless, that is, they should have no
    pattern whatsoever. In this case, they show
    none.

31
  • These residuals show no extreme differences in
    variation because they all have about the same
    spread.
  • They also do not show the presence of any
    outlier. An outlier is a residual value that is
    very much larger than any of the others. The
    presence of an outlier can seriously jeopardize
    the ANOVA, so if one is found, its cause should
    be carefully investigated.

32
  • A histogram of residuals shows the
    distribution is slightly skewed. Small
    departures from symmetry are of less concern than
    heavy tails.

33
  • Another check is for normality. If we do a
    normal probability plot of the residuals, we can
    see whether normality holds.

34
  • A normal probability plot is made with
    ascending ordered residuals on the x-axis and
    their cumulative probability points, $100(k - 0.5)/n$,
    on the y-axis, where k is the order of the residual and
    n is the number of residuals. There is no evidence of
    an outlier here.
  • The previous slide is not exactly a normal
    probability plot because the y-axis is not
    scaled properly. But it does give a pretty good
    suggestion of linearity.
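
The plotting positions described above can be generated as follows. This sketch uses the residuals of the tensile strength data and, as one way to scale the y-axis properly, converts each cumulative probability to a standard normal quantile with `statistics.NormalDist` from the Python standard library. The function name is an illustrative choice.

```python
from statistics import NormalDist

# Tensile strength data from the slides: cotton weight % -> 5 replicates
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}
residuals = [y - sum(ys) / len(ys) for ys in data.values() for y in ys]

def normal_plot_points(values):
    """Pair each ordered value with 100(k - 0.5)/n and its normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    standard_normal = NormalDist()
    points = []
    for k, value in enumerate(ordered, start=1):
        percent = 100 * (k - 0.5) / n                   # cumulative probability point
        z = standard_normal.inv_cdf(percent / 100)      # normal-scaled y value
        points.append((value, percent, z))
    return points

points = normal_plot_points(residuals)
for value, percent, z in points:
    print(f"{value:6.1f} {percent:6.1f} {z:7.3f}")
```

Plotting the residual against z should give roughly a straight line if the normality assumption holds.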

35
  • A plot of residuals vs run order is useful to
    detect correlation between the residuals, a
    violation of the independence assumption.
  • Runs of positive or of negative residuals
    indicate correlation. None is observed here.

36
  • One of the goals of the analysis is to
    estimate the level means. If the results of the
    ANOVA show that the factor is significant, we
    know that at least one of the means stands out
    from the rest. But which one or ones?
  • The procedures for making these mean
    comparisons are called multiple comparison
    methods. These methods use linear combinations
    called contrasts.

37
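  • A contrast is a particular linear combination
    of level means, such as
    $\mu_4 - \mu_5$
    to test the difference between level 4 and level
    5.
  • Or if one wished to test the average of levels
    1 and 3 vs the average of levels 4 and 5, one
    would use
    $\mu_1 + \mu_3 - \mu_4 - \mu_5$.
  • In general, $C = \sum_{j=1}^{L} c_j \mu_j$, where
    $\sum_{j=1}^{L} c_j = 0$.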

38
  • An important case of contrasts is called
    orthogonal contrasts. Two contrasts in a balanced
    design with coefficients $c_j$ and $d_j$ are orthogonal if
    $\sum_{j=1}^{L} c_j d_j = 0$

39
  • There are many ways to choose the orthogonal
    contrast coefficients for a set of levels. For
    example, if level 1 is a control and levels 2 and
    3 are two real treatments, a logical choice is to
    compare the average of the two treatments with
    the control,
    $C_1 = -2\mu_1 + \mu_2 + \mu_3$
  • and then the two treatments against one
    another,
    $C_2 = \mu_2 - \mu_3$
  • These two contrasts are orthogonal because
    $(-2)(0) + (1)(1) + (1)(-1) = 0$
40
  • Only L-1 orthogonal contrasts may be chosen
    because the L levels have only L-1 df. So for
    only three levels, the contrasts chosen exhaust
    those available for this experiment.
  • Contrasts must be chosen before seeing the data
    so that experimenters aren't tempted to contrast
    the levels with the greatest differences.

41
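  • For the tensile strength experiment with 5
    levels and thus 4 df, the 4 contrasts, computed
    from the level totals (5 times each level mean), are
  • $C_1 = 0(5)(9.8) + 0(5)(15.4) + 0(5)(17.6) - 1(5)(21.6) + 1(5)(10.8) = -54$
  • $C_2 = 1(5)(9.8) + 0(5)(15.4) + 1(5)(17.6) - 1(5)(21.6) - 1(5)(10.8) = -25$
  • $C_3 = 1(5)(9.8) + 0(5)(15.4) - 1(5)(17.6) + 0(5)(21.6) + 0(5)(10.8) = -39$
  • $C_4 = -1(5)(9.8) + 4(5)(15.4) - 1(5)(17.6) - 1(5)(21.6) - 1(5)(10.8) = 9$
  • These 4 contrasts completely partition the
    $SS_{treatments}$. Then the SS for each contrast is
    formed as
    $SS_C = \frac{C^2}{n \sum_{j=1}^{L} c_j^2}$,
    where n = 5 is the number of replicates per level.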

42
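  • So for the 4 contrasts we have
    $SS_{C_1} = (-54)^2 / (5 \cdot 2) = 291.60$
    $SS_{C_2} = (-25)^2 / (5 \cdot 4) = 31.25$
    $SS_{C_3} = (-39)^2 / (5 \cdot 2) = 152.10$
    $SS_{C_4} = 9^2 / (5 \cdot 20) = 0.81$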

43
  • Now the revised ANOVA table is

Source    SS       df   MS       p
Weight    475.76    4   118.94   <0.001
  C1      291.60    1   291.60   <0.001
  C2       31.25    1    31.25   <0.06
  C3      152.10    1   152.10   <0.001
  C4        0.81    1     0.81   <0.76
Error     161.20   20     8.06
Total     636.96   24
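
The contrast rows of this table can be reproduced from the level means. The sketch below assumes the usual balanced-design formula, SS = C² / (n Σ c²), with C built from level totals; the coefficient sets are read off the contrast computations in the slides, and MS error (8.06) is taken from the ANOVA table.

```python
level_means = [9.8, 15.4, 17.6, 21.6, 10.8]   # cotton levels 15, 20, 25, 30, 35 %
n = 5                                          # replicates per level
ms_error = 8.06                                # from the ANOVA table, 20 df

contrasts = {
    "C1": [0, 0, 0, -1, 1],
    "C2": [1, 0, 1, -1, -1],
    "C3": [1, 0, -1, 0, 0],
    "C4": [-1, 4, -1, -1, -1],
}

def contrast_ss(coeffs, means, n):
    """Return (C, SS_C), with C built from the level totals n * mean."""
    c_value = sum(c * n * m for c, m in zip(coeffs, means))
    return c_value, c_value ** 2 / (n * sum(c * c for c in coeffs))

results = {name: contrast_ss(c, level_means, n) for name, c in contrasts.items()}
for name, (c_value, ss) in results.items():
    # Each contrast has 1 df, so its MS equals its SS.
    print(f"{name}: C = {c_value:6.1f}  SS = {ss:7.2f}  F = {ss / ms_error:6.2f}")

total_ss = sum(ss for _, ss in results.values())
print("sum of contrast SS:", round(total_ss, 2))
```

The four SS values add up to the treatment SS, which is what "completely partition" means here.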

44
  • So contrast 1 (level 5 vs level 4) and contrast
    3 (level 1 vs level 3) are significant.
  • Although the orthogonal contrast approach is
    widely used, the experimenter may not know in
    advance which levels to test or they may be
    interested in more than L-1 comparisons. A
    number of other methods are available for such
    testing.

45
  • These methods include
  • Scheffé's Method
  • Least Significant Difference Method
  • Duncan's Multiple Range Test
  • Newman-Keuls test
  • There is some disagreement about which is the
    best method, but it is best if all are applied
    only after there is significance in the overall F
    test.

46
  • Now let's look at the random effects model.
  • Suppose there is a factor of interest with an
    extremely large number of levels. If the
    experimenter selects L of these levels at random,
    we have a random effects model or a components of
    variance model.

47
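  • The linear statistical model is
    $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$
  • as before, except that both $\tau_j$ and
    $\varepsilon_{ij}$ are random variables instead
    of simply $\varepsilon_{ij}$.
  • Because $\tau_j$ and $\varepsilon_{ij}$ are independent, the variance
    of any observation is
    $Var(Y_{ij}) = \sigma_\tau^2 + \sigma^2$
  • These two variances are called variance
    components, hence the name of the model.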

48
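  • The requirements of this model are that the
    $\varepsilon_{ij}$ are NID(0, $\sigma^2$), as before, and that the
    $\tau_j$ are NID(0, $\sigma_\tau^2$) and that $\tau_j$ and
    $\varepsilon_{ij}$ are independent. The normality
    assumption is needed for the F test but is not
    required for estimating the variance components.
  • As before,
  • $SS_{total} = SS_{treatments} + SS_{error}$
  • And $E(MS_{error}) = \sigma^2$.
  • But now $E(MS_{treatments}) = \sigma^2 + n\sigma_\tau^2$
  • So the estimate of $\sigma_\tau^2$ is
    $\hat{\sigma}_\tau^2 = \frac{MS_{treatments} - MS_{error}}{n}$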

49
  • The computations and the ANOVA table are the
    same as before, but the conclusions are quite
    different.
  • Let's look at an example.
  • A textile company uses a large number of
    looms. The process engineer suspects that the
    looms are of different strength, and selects 4
    looms at random to investigate this.

50
  • The results of the experiment are shown in the
    table below.

Loom   Observations      Mean
1      98  97  99  96    97.5
2      91  90  93  92    91.5
3      96  95  97  95    95.75
4      95  96  99  98    97.0
Grand mean               95.44

  • The ANOVA table is

Source   SS       df   MS      p
Looms     89.19    3   29.73   <0.001
Error     22.75   12    1.90
Total    111.94   15
51
  • In this case, the estimates of the variances
    are
  • $\hat{\sigma}^2 = 1.90$
  • $\hat{\sigma}_\tau^2 = (29.73 - 1.90)/4 = 6.96$
  • Thus most of the variability in the
    observations (6.96 of an estimated total of
    8.86) is due to variability in loom
    strength. If you can isolate the causes of this
    variability and eliminate them, you can reduce
    the variability of the output and increase its
    quality.
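
The variance component estimates can be reproduced from the loom data. This sketch follows the balanced-design formulas, with the within-loom variance estimated by MS error and the between-loom variance by (MS treatments - MS error) / n; the variable names are illustrative.

```python
# Loom strength data from the slides: loom -> 4 observations
looms = {
    1: [98, 97, 99, 96],
    2: [91, 90, 93, 92],
    3: [96, 95, 97, 95],
    4: [95, 96, 99, 98],
}
n = 4                      # observations per loom (balanced)
L = len(looms)             # number of looms sampled

all_obs = [y for ys in looms.values() for y in ys]
grand_mean = sum(all_obs) / len(all_obs)
loom_means = {loom: sum(ys) / n for loom, ys in looms.items()}

ms_treatments = n * sum((m - grand_mean) ** 2 for m in loom_means.values()) / (L - 1)
ms_error = sum(
    (y - loom_means[loom]) ** 2 for loom, ys in looms.items() for y in ys
) / (L * (n - 1))

var_error = ms_error                            # estimate of sigma^2
var_looms = (ms_treatments - ms_error) / n      # estimate of sigma_tau^2
share_looms = var_looms / (var_looms + var_error)

print(round(var_error, 2), round(var_looms, 2), round(share_looms, 2))
```

The last number is the estimated share of the total variance attributable to differences between looms.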

52
  • When we studied the differences between two
    treatment means, we considered repeated measures
    on the same individual experimental unit.
  • With three or more treatments, we can still do
    this. The result is a repeated measures design.

53
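  • Consider a repeated measures ANOVA partitioning
    the $SS_{total}$. With n subjects (i) each measured
    under all L treatments (j), and $\bar{Y}_{i\cdot}$
    the mean for subject i,
    $\sum_{i=1}^{n}\sum_{j=1}^{L} (Y_{ij} - \bar{Y})^2 = L \sum_{i=1}^{n} (\bar{Y}_{i\cdot} - \bar{Y})^2 + \sum_{i=1}^{n}\sum_{j=1}^{L} (Y_{ij} - \bar{Y}_{i\cdot})^2$
  • This is the same as
  • $SS_{total} = SS_{between\ subjects} + SS_{within\ subjects}$
  • The within-subjects SS may be further
    partitioned into $SS_{treatment} + SS_{error}$.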

54
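  • The within-subjects SS partitions as
    $\sum_{i}\sum_{j} (Y_{ij} - \bar{Y}_{i\cdot})^2 = n \sum_{j} (\bar{Y}_{\cdot j} - \bar{Y})^2 + \sum_{i}\sum_{j} (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y})^2$,
    where $\bar{Y}_{i\cdot}$ is the mean for subject i and
    $\bar{Y}_{\cdot j}$ is the mean for treatment j.
  • In this case, the first term on the RHS is the
    differences between treatment effects and the
    second term on the RHS is the random error.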

55
  • Now the ANOVA table looks like this.

Source              SS   df           MS   p
Between subjects         n-1
Within subjects          n(L-1)
  Treatments             L-1
  Error                  (L-1)(n-1)
Total                    Ln-1

56
  • The test for treatment effects is the usual
    $F = \frac{MS_{treatments}}{MS_{error}}$
  • but now it is done entirely within subjects.
  • This design is really a randomized complete
    block design with subjects considered to be the
    blocks.

57
  • Now what is a randomized complete block
    design?
  • Blocking is a way to eliminate the effect of a
    nuisance factor on the comparisons of interest.
    Blocking can be used only if the nuisance factor
    is known and controllable.

58
  • Let's use an illustration. Suppose we want to
    test the effect of four different tips on the
    readings from a hardness testing machine.
  • The tip is pressed into a metal test coupon,
    and from the depth of the depression, the
    hardness of the coupon can be measured.

59
  • The only factor is tip type and it has four
    levels. If 4 replications are desired for each
    tip, a completely randomized design would seem to
    be appropriate.
  • This would require assigning each of the 4 x 4
    = 16 runs randomly to 16 different coupons.
  • The only problem is that the coupons need to
    be all of the same hardness, and if they are not,
    then the differences in coupon hardness will
    contribute to the variability observed.
  • Blocking is the way to deal with this problem.

60
  • In the block design, only 4 coupons are used
    and each tip is tested on each of the 4 coupons.
    So the blocking factor is the coupon, with 4
    levels.
  • In this setup, the block forms a homogeneous
    unit on which to test the tips.
  • This strategy improves the accuracy of the tip
    comparison by eliminating variability due to
    coupons.

61
  • Because all 4 tips are tested on each coupon,
    the design is a complete block design. The data
    from this design are shown below.

            Test coupon
Tip type    1      2      3      4
1           9.3    9.4    9.6   10.0
2           9.4    9.3    9.8    9.9
3           9.2    9.4    9.5    9.7
4           9.7    9.6   10.0   10.2
62
  • Now we analyze these data the same way we did
    for the repeated measures design. The model is
    $Y_{jk} = \mu + \tau_j + \beta_k + \varepsilon_{jk}$
  • where $\beta_k$ is the effect of the kth block and the
    rest of the terms are those we already know.

63
  • Since the block effects are deviations from the
    grand mean,
    $\sum_{k=1}^{B} \beta_k = 0$
  • just as
    $\sum_{j=1}^{L} \tau_j = 0$

64
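  • We can express the total SS as
    $\sum_{j=1}^{L}\sum_{k=1}^{B} (Y_{jk} - \bar{Y})^2 = B \sum_{j=1}^{L} (\bar{Y}_{j\cdot} - \bar{Y})^2 + L \sum_{k=1}^{B} (\bar{Y}_{\cdot k} - \bar{Y})^2 + \sum_{j=1}^{L}\sum_{k=1}^{B} (Y_{jk} - \bar{Y}_{j\cdot} - \bar{Y}_{\cdot k} + \bar{Y})^2$
  • which is equivalent to
  • $SS_{total} = SS_{treatments} + SS_{blocks} + SS_{error}$
  • with df
  • $N - 1 = (L - 1) + (B - 1) + (L - 1)(B - 1)$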

65
  • The test for equality of treatment means,
    $H_0: \mu_1 = \mu_2 = \dots = \mu_L$,
  • is
    $F = \frac{MS_{treatments}}{MS_{error}}$
  • and the ANOVA table is

Source       SS             df           MS             p
Treatments   SStreatments   L-1          MStreatments
Blocks       SSblocks       B-1          MSblocks
Error        SSerror        (L-1)(B-1)   MSerror
Total        SStotal        N-1

66
  • For the hardness experiment, the ANOVA table,
    computed on the coded readings 10 x (reading - 9.5),
    is

Source     SS       df   MS      p
Tip type    38.50    3   12.83   0.0009
Coupons     82.50    3   27.50
Error        8.00    9    0.89
Total      129.00   15

  • As is obvious, this is the same analysis as the
    repeated measures design.
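
The sums of squares in this table can be reproduced from the hardness readings. One assumption is made explicit here: the table's values match the data only after coding each reading as 10 x (reading - 9.5), so the sketch applies that coding first. Coding changes the SS by a constant factor but leaves the F ratio unchanged. The function name `rcbd_anova` is an illustrative choice.

```python
# Rows = tip types 1-4, columns = test coupons 1-4 (hardness readings)
readings = [
    [9.3, 9.4, 9.6, 10.0],
    [9.4, 9.3, 9.8, 9.9],
    [9.2, 9.4, 9.5, 9.7],
    [9.7, 9.6, 10.0, 10.2],
]

# Assumed coding: 10 * (reading - 9.5), rounded to remove float noise.
coded = [[round(10 * (y - 9.5)) for y in row] for row in readings]

def rcbd_anova(y):
    """Sums of squares for a randomized complete block design.

    Rows of y are treatments, columns are blocks.
    """
    n_treat, n_block = len(y), len(y[0])
    grand = sum(map(sum, y)) / (n_treat * n_block)
    treat_means = [sum(row) / n_block for row in y]
    block_means = [sum(row[k] for row in y) / n_treat for k in range(n_block)]

    ss_treat = n_block * sum((m - grand) ** 2 for m in treat_means)
    ss_block = n_treat * sum((m - grand) ** 2 for m in block_means)
    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    ss_error = ss_total - ss_treat - ss_block

    df_error = (n_treat - 1) * (n_block - 1)
    f_treat = (ss_treat / (n_treat - 1)) / (ss_error / df_error)
    return ss_treat, ss_block, ss_error, ss_total, f_treat

ss_treat, ss_block, ss_error, ss_total, f_treat = rcbd_anova(coded)
print(ss_treat, ss_block, ss_error, ss_total, round(f_treat, 2))
```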

67
  • Now let's consider the Latin Square design.
    We'll introduce it with an example.
  • The object of study is the effect of 5 different
    formulations of a rocket propellant on the burning
    rate in aircraft escape systems. Each formulation comes
    from a batch of raw material large enough for
    only 5 formulations. Moreover, the formulations
    are prepared by 5 different operators, who differ
    in skill and experience.

68
  • The way to test in this situation is with a
    5x5 Latin Square, which allows for double
    blocking and therefore the removal of two
    nuisance factors. The Latin Square for this
    example is

                          Operators
Batches of raw material   1   2   3   4   5
1                         A   B   C   D   E
2                         B   C   D   E   A
3                         C   D   E   A   B
4                         D   E   A   B   C
5                         E   A   B   C   D
69
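  • Note that each row and each column has all 5
    letters, so each letter occurs exactly once in
    each row and column.
  • The statistical model for a Latin Square is
    $Y_{jkl} = \mu + \tau_j + \rho_k + \gamma_l + \varepsilon_{jkl}$
  • where $Y_{jkl}$ is the observation on the jth treatment in
    the kth row and the lth column, and $\rho_k$ and
    $\gamma_l$ denote the row and column effects.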
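
The defining property of a Latin Square is easy to check in code. The sketch below builds the square shown in the example by shifting the letters one place per row, then verifies that every letter occurs exactly once in each row and column. The function names are illustrative.

```python
def cyclic_latin_square(symbols):
    """Build a Latin square by shifting the symbol list one place per row."""
    n = len(symbols)
    return [[symbols[(r + c) % n] for c in range(n)] for r in range(n)]

def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok

square = cyclic_latin_square(list("ABCDE"))
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```

The rows of `square` are batches and the columns are operators, matching the layout of the table.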

70
  • Again we have
  • $SS_{total} = SS_{rows} + SS_{columns} + SS_{treatments} + SS_{error}$
  • with df
  • $N - 1 = (R - 1) + (C - 1) + (L - 1) + (R - 2)(C - 1)$
  • The ANOVA table for propellant data is

Source             SS       df   MS      p
Formulations       330.00    4   82.50   0.0025
Material batches    68.00    4   17.00
Operators          150.00    4   37.50   0.04
Error              128.00   12   10.67
Total              676.00   24

71
  • So both the formulations and the operators
    were significantly different. The batches of raw
    material were not, but it still is a good idea to
    block on them because they often are different.
  • This design was not replicated, and Latin
    Squares often are not, but it is possible to put
    n replicates in each cell.

72
  • Now if you superimposed one Latin Square on
    another Latin Square of the same size, chosen so
    that each letter of one square appears exactly
    once with each letter of the other, you would
    get a Graeco-Latin Square. In one Latin Square,
    the treatments are designated by Latin letters.
    In the other Latin Square, the treatments are
    designated by Greek letters.
  • Hence the name Graeco-Latin Square.

73
  • A 5x5 Graeco-Latin Square is

                          Operators
Batches of raw material   1    2    3    4    5
1                         Aα   Bγ   Cε   Dβ   Eδ
2                         Bβ   Cδ   Dα   Eγ   Aε
3                         Cγ   Dε   Eβ   Aδ   Bα
4                         Dδ   Eα   Aγ   Bε   Cβ
5                         Eε   Aβ   Bδ   Cα   Dγ

  • Note that the five Greek treatments appear
    exactly once in each row and column, just as the
    Latin treatments did.
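
A Graeco-Latin Square works only when the two superimposed squares are orthogonal, meaning every Latin-Greek pair occurs exactly once. The sketch below checks this for the 5x5 square above. To keep the code ASCII-only, the Greek letters are written with stand-in codes (a = alpha, b = beta, g = gamma, d = delta, e = epsilon), which is a notational choice made here for illustration.

```python
# Latin letters of the 5x5 square, one string per row (batch)
latin = ["ABCDE", "BCDEA", "CDEAB", "DEABC", "EABCD"]

# Greek letters of the same square, using ASCII stand-ins:
# a = alpha, b = beta, g = gamma, d = delta, e = epsilon
greek = ["agebd", "bdage", "gebda", "dageb", "ebdag"]

def is_orthogonal_pair(first, second):
    """True if every (first, second) symbol pair occurs exactly once."""
    n = len(first)
    pairs = {(first[r][c], second[r][c]) for r in range(n) for c in range(n)}
    return len(pairs) == n * n

print(is_orthogonal_pair(latin, greek))   # the square from the slides
print(is_orthogonal_pair(latin, latin))   # a square is never orthogonal to itself
```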
74
  • If Test Assemblies had been added as an
    additional factor to the original propellant
    experiment, the ANOVA table for propellant data
    would be

Source             SS       df   MS      p
Formulations       330.00    4   82.50   0.0033
Material batches    68.00    4   17.00
Operators          150.00    4   37.50   0.0329
Test Assemblies     62.00    4   15.50
Error               66.00    8    8.25
Total              676.00   24

  • The test assemblies turned out to be
    nonsignificant.

75
  • Note that the ANOVA tables for the Latin Square
    and the Graeco-Latin Square designs are
    identical, except for the error term and the
    p-values that depend on it.
  • The SS(error) for the Latin Square design
    (128.00) was decomposed into Test Assemblies
    (62.00) and error (66.00) in the Graeco-Latin
    Square. This is a good example of how the error
    term is really a residual. Whatever isn't
    controlled falls into error.

76
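  • Before we leave one-way designs, we should look
    at the regression approach to ANOVA. The model
    is
    $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$
  • Using the method of least squares, we rewrite
    this as
    $SSE = \sum_{j=1}^{L}\sum_{i=1}^{n} \varepsilon_{ij}^2 = \sum_{j=1}^{L}\sum_{i=1}^{n} (Y_{ij} - \mu - \tau_j)^2$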

77
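  • Now to find the LS estimates of $\mu$ and $\tau_j$,
    we minimize SSE by setting
    $\partial SSE / \partial \mu = 0$ and $\partial SSE / \partial \tau_j = 0$
  • When we do this differentiation with respect to
    $\mu$ and $\tau_j$, and equate to 0, we obtain
    $-2 \sum_{j=1}^{L}\sum_{i=1}^{n} (Y_{ij} - \hat{\mu} - \hat{\tau}_j) = 0$
    $-2 \sum_{i=1}^{n} (Y_{ij} - \hat{\mu} - \hat{\tau}_j) = 0$
  • for all j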

78
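  • After simplification, these reduce to
    $N\hat{\mu} + n\hat{\tau}_1 + n\hat{\tau}_2 + \dots + n\hat{\tau}_L = Y_{\cdot\cdot}$
    $n\hat{\mu} + n\hat{\tau}_j = Y_{\cdot j}$ for j = 1, ..., L
  • In these equations, $Y_{\cdot\cdot} = \sum_{j}\sum_{i} Y_{ij}$ is the
    grand total and $Y_{\cdot j} = \sum_{i} Y_{ij}$ is the total
    for level j.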

79
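  • These L + 1 equations are called the least
    squares normal equations.
  • If we add the constraint
    $\sum_{j=1}^{L} \hat{\tau}_j = 0$
  • we get a unique solution to these normal
    equations: $\hat{\mu} = \bar{Y}$ and
    $\hat{\tau}_j = \bar{Y}_j - \bar{Y}$.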

80
  • It is important to see that ANOVA designs are
    simply regression models. If we have a one-way
    design with 3 levels, the regression model is
    $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i$
  • where $X_{i1}$ = 1 if observation i is from level 1,
  • 0 otherwise
  • and $X_{i2}$ = 1 if observation i is from level 2,
  • 0 otherwise
  • Although the treatment levels may be
    qualitative, they are coded as dummy
    variables.

81
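  • For an observation from level 1,
    $X_{i1} = 1$ and $X_{i2} = 0$, so the model becomes
    $Y_i = \beta_0 + \beta_1 + \varepsilon_i$
  • so
    $\mu_1 = \beta_0 + \beta_1$
  • Similarly, if the observations are from level
    2,
    $Y_i = \beta_0 + \beta_2 + \varepsilon_i$
  • so
    $\mu_2 = \beta_0 + \beta_2$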

82
  • Finally, consider observations from level 3,
    for which $X_{i1} = X_{i2} = 0$. Then the regression
    model becomes
    $Y_i = \beta_0 + \varepsilon_i$
  • so
    $\mu_3 = \beta_0$
  • Thus in the regression model formulation of
    the one-way ANOVA, the regression coefficients
    describe comparisons of the first two level means
    with the third.

83
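  • So
    $\beta_0 = \mu_3$, $\beta_1 = \mu_1 - \mu_3$, $\beta_2 = \mu_2 - \mu_3$
  • Thus, testing $\beta_1 = \beta_2 = 0$ provides a test of
    the equality of the three means.
  • In general, for L levels, the regression model
    will have L-1 variables
    $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_{L-1} X_{i,L-1} + \varepsilon_i$
  • and
    $\beta_0 = \mu_L$, $\beta_j = \mu_j - \mu_L$ for j = 1, ..., L-1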
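
The link between dummy-variable regression and level means can be checked numerically. The sketch below fits the three-level model by solving the normal equations with plain Python. Two things are assumptions made for illustration: the data are three of the tensile strength levels standing in for levels 1 to 3, and the small Gauss-Jordan solver is a hypothetical helper, not part of the slides.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_least_squares(x_rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Illustration only: three tensile-strength levels stand in for levels 1-3.
levels = {1: [7, 7, 15, 11, 9], 2: [12, 17, 12, 18, 18], 3: [14, 18, 18, 19, 19]}

x_rows, y = [], []
for level, obs in levels.items():
    for value in obs:
        # Columns: intercept, X1 (level 1 indicator), X2 (level 2 indicator)
        x_rows.append([1.0, 1.0 if level == 1 else 0.0, 1.0 if level == 2 else 0.0])
        y.append(float(value))

b0, b1, b2 = fit_least_squares(x_rows, y)
means = {level: sum(obs) / len(obs) for level, obs in levels.items()}
print(round(b0, 2), round(b1, 2), round(b2, 2))
print(means[3], round(means[1] - means[3], 2), round(means[2] - means[3], 2))
```

The fitted intercept should equal the level 3 mean, and the two slopes should equal the differences of the level 1 and level 2 means from it.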

84
  • Now what if you have two factors under test?
    Or three?
  • Here the answer is the factorial design. A
    factorial design crosses all factors. Let's take
    a two-way design. If there are L levels of
    factor A and M levels of factor B, then all LM
    treatment combinations appear in the experiment.
  • Most commonly, L = M = 2.

85
  • In a two-way design, with two levels of each
    factor, we have the four treatment combinations
    shown in the table below.
  • We can have as many replicates as we want in
    this design. With n replicates, there are n
    observations in each cell of the design.

Factor A          Factor B          Response
-1 (low level)    -1 (low level)    20
 1 (high level)   -1 (low level)    50
-1 (low level)     1 (high level)   40
 1 (high level)    1 (high level)   12
86
  • $SS_{total} = SS_A + SS_B + SS_{AB} + SS_{error}$
  • This decomposition should be familiar by now
    except for SSAB. What is this term? Its
    official name is interaction.
  • This is the magic of factorial designs. We
    find out about not only the effect of factor A
    and the effect of factor B, but the effect of the
    two factors in combination.

87
  • Now let's look at the main effects of the
    factors graphically.

88
  • Now let's look at the interaction effect. This
    is the effect of factors A and B in combination,
    and is often the most important effect.

89
  • Interaction of factors is the key to the East,
    as we say in the West.
  • Suppose you wanted the factor levels that give
    the lowest possible response. If you picked by
    main effects, you would pick A low and B high.
  • But look at the interaction plot and it will
    tell you to pick A high and B high.
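
The argument above can be checked numerically from the four responses in the 2x2 table (20, 50, 40, 12). The sketch uses the usual definition of an effect in a two-level design, the average response at the high setting minus the average at the low setting; the helper name `effect` is an illustrative choice.

```python
# (A, B) coded levels -> response, from the 2x2 table in the slides
responses = {(-1, -1): 20, (1, -1): 50, (-1, 1): 40, (1, 1): 12}

def effect(sign):
    """Average response where sign(a, b) is +1 minus the average where it is -1."""
    plus = [y for (a, b), y in responses.items() if sign(a, b) > 0]
    minus = [y for (a, b), y in responses.items() if sign(a, b) < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)

effect_a = effect(lambda a, b: a)        # main effect of A
effect_b = effect(lambda a, b: b)        # main effect of B
effect_ab = effect(lambda a, b: a * b)   # A x B interaction

# Picking levels by main effects alone: go low if the effect is positive.
main_effects_pick = (-1 if effect_a > 0 else 1, -1 if effect_b > 0 else 1)

# Picking by looking at every cell, which is what the interaction plot shows.
lowest_cell = min(responses, key=responses.get)

print(effect_a, effect_b, effect_ab)
print(main_effects_pick, responses[main_effects_pick])
print(lowest_cell, responses[lowest_cell])
```

The interaction effect is far larger in magnitude than either main effect, which is why the main-effects choice (A low, B high) misses the best cell (A high, B high).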

90
  • This is why, if the interaction term is
    significant, you never interpret main effects.
    They are meaningless in the presence of
    interaction.
  • And it is because factorial designs provide
    interactions that they are so popular and so
    successful.

91
  • Now what if the interaction term is not
    significant? What if the results instead were

92
  • and the interaction is
  • The clearest indication of no interaction is
    the parallel lines.

93
  • So this time, if you wanted the lowest
    response, you would pick A low and B low and that
    would be correct.