Title: Inferences About Process Quality

1
Chapter 3
  • Inferences About Process Quality

2
3-1. Statistics and Sampling Distributions
  • Statistical methods are used to make decisions
    about a process
  • Is the process out of control?
  • Is the process average you were given the true
    value?
  • What is the true process variability?

3
3-1. Statistics and Sampling Distributions
  • Statistics are quantities calculated from a
    random sample taken from a population of
    interest.
  • The probability distribution of a statistic is
    called a sampling distribution.

4
3-1.1 Sampling from a Normal Distribution
  • Let X represent a measurement taken from a normal
    distribution, X ~ N(μ, σ²).
  • Select a sample of size n, at random, and
    calculate the sample mean, xbar = (1/n)Σxi.
  • Then xbar ~ N(μ, σ²/n).

5
3-1.1 Sampling from a Normal Distribution
  • Probability example
  • The life of an automotive battery is normally
    distributed with mean 900 days and standard
    deviation 35 days. What is the probability that
    a random sample of 25 batteries will have an
    average life of more than 910 days?

6
Example
  • Z = (910 − 900)/(35/√25) = 1.429
  • P(xbar > 910) = 1 − Φ(1.429) = 1 − 0.9235 = 0.0765
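
A minimal Python/SciPy sketch of this calculation (scipy is an
assumption of the sketch, not part of the original slides):

# Slide 6: P(xbar > 910) for n = 25, mu = 900, sigma = 35,
# using the sampling distribution of the mean.
from scipy.stats import norm

mu, sigma, n = 900.0, 35.0, 25
se = sigma / n**0.5          # standard error of the mean = 7
z = (910 - mu) / se          # 1.429
p = norm.sf(z)               # upper-tail probability, about 0.0766
print(f"Z = {z:.3f}, P(xbar > 910) = {p:.4f}")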

7
3-1.1 Sampling from a Normal Distribution
  • Chi-square (χ²) Distribution
  • If x1, x2, ..., xn are normally and independently
    distributed random variables with mean zero and
    variance one, then the random variable
    y = x1² + x2² + ... + xn²
  • is distributed as chi-square with n degrees of
    freedom.

8
3-1.1 Sampling from a Normal Distribution
  • Chi-square (χ²) Distribution
  • Furthermore, the sampling distribution of
    (n − 1)S²/σ²
  • is chi-square with n − 1 degrees of freedom
    when sampling from a normal population.

9
3-1.1 Sampling from a Normal Distribution
  • Chi-square (χ²) distribution for various degrees
    of freedom.
  • (Figure: χ² density curves for 5, 10, and 20
    degrees of freedom.)
10
3-1.1 Sampling from a Normal Distribution
  • t-distribution
  • If x is a standard normal random variable and if
    y is a chi-square random variable with k degrees
    of freedom, then
    t = x/√(y/k)
  • is distributed as t with k degrees of freedom.

11
3-1.1 Sampling from a Normal Distribution
  • F-distribution
  • If w and y are two independent chi-square random
    variables with u and v degrees of freedom,
    respectively, then
    F = (w/u)/(y/v)
  • is distributed as F with u numerator degrees
    of freedom and v denominator degrees of freedom.

12
3-1.2 Sampling from a Bernoulli
Distribution
  • A random variable, x, with probability function
    p(x) = p^x (1 − p)^(1−x), x = 0, 1
  • is called a Bernoulli random variable.
  • The sum of a sample from a Bernoulli process has
    a binomial distribution with parameters n and p.

13
3-1.2 Sampling from a Bernoulli
Distribution
  • x1, x2, ..., xn taken from a Bernoulli process
  • The sample mean is a discrete random variable
    given by xbar = (x1 + x2 + ... + xn)/n
  • The mean and variance of xbar are
    E(xbar) = p and Var(xbar) = p(1 − p)/n

14
3-1.3 Sampling from a Poisson
Distribution
  • Consider a random sample of size n: x1, x2, ...,
    xn, taken from a Poisson process with parameter λ
  • The sum, x = x1 + x2 + ... + xn, is also Poisson
    with parameter nλ.
  • The sample mean is a discrete random variable
    given by xbar = x/n
  • The mean and variance of xbar are
    E(xbar) = λ and Var(xbar) = λ/n

15
3-2. Point Estimation of Process Parameters
  • Parameters are values representing the
    population. (Ex.) μ and σ² are the population
    mean and variance, respectively.
  • Parameters in reality are often unknown and must
    be estimated.
  • Statistics are estimates of parameters.
  • (Ex.) xbar and S² are the sample mean and
    sample variance, respectively.

16
3-2. Point Estimation of Process Parameters
  • Two properties of good point estimators
  • The point estimator should be unbiased:
    E(θ̂) = θ
  • The point estimator should have minimum variance.

17
Unbiased estimators
  • The sample mean (xbar) and variance (S²) are
    unbiased estimators of the population mean (μ)
    and variance (σ²).
  • The sample standard deviation (S) is not an
    unbiased estimator of the standard deviation (σ):
    E(S) = c4σ
  • So, σ̂ = S/c4 (see the simulation sketch below)
  • App. Table VI gives values of c4 for 2 ≤ n ≤ 25
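
A short simulation sketch of this bias (a hypothetical check, not
from the slides; it assumes numpy and uses c4 = 0.9400 for n = 5
from App. Table VI):

# Slide 17: for n = 5, E(S) = c4*sigma, so S/c4 is unbiased for sigma.
import numpy as np

rng = np.random.default_rng(1)
sigma, n, reps = 2.0, 5, 200_000
samples = rng.normal(0.0, sigma, size=(reps, n))
s = samples.std(axis=1, ddof=1)   # sample standard deviations
c4 = 0.9400                       # App. Table VI, n = 5
print(s.mean() / sigma)           # close to c4 = 0.9400
print((s / c4).mean())            # close to sigma = 2.0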

18
Range Method
  • The range of a sample is often used in QC:
    R = xmax − xmin
  • The relative range is given by
    W = R/σ
  • W has been well studied, and
    E(W) = d2

19
Range method
  • Since W = R/σ,
  • then σW = R,
  • and σ = R/W,
  • and σ̂ = R/d2
  • Values of d2 for 2 ≤ n ≤ 25 are in App. Table VI
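
A companion sketch of the range method (again hypothetical, assuming
numpy and d2 = 2.326 for n = 5 from App. Table VI):

# Slides 18-19: estimate sigma as Rbar/d2 from subgroups of size 5.
import numpy as np

rng = np.random.default_rng(2)
sigma, n, m = 3.0, 5, 100_000            # m subgroups of size n
x = rng.normal(10.0, sigma, size=(m, n))
r = x.max(axis=1) - x.min(axis=1)        # subgroup ranges
d2 = 2.326                               # E(W) = E(R/sigma) for n = 5
print(r.mean() / d2)                     # sigma-hat, close to 3.0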

20
Range method
  • Using S is better than using the range method
  • But, for small sample sizes (n ≤ 10) the range
    method is acceptable
  • Oftentimes, sample sizes are n = 5 or n = 6
  • The relative efficiency of the range method is
    shown on the next slide

21
Range method
22
3-3. Statistical Inference for a Single
Sample
  • Two categories of statistical inference
  • Parameter Estimation
  • Hypothesis Testing

23
3-3. Statistical Inference for a Single
Sample
  • A statistical hypothesis is a statement about the
    values of the parameters of a probability
    distribution.

24
3-3. Statistical Inference for a Single
Sample
  • Steps in Hypothesis Testing
  • Identify the parameter of interest
  • State the null hypothesis, H0, and the
    alternative hypothesis, H1
  • Choose a significance level
  • State the appropriate test statistic
  • State the rejection region
  • Compare the value of the test statistic to the
    rejection region. Can the null hypothesis be
    rejected?

25
3-3. Statistical Inference for a Single
Sample
  • Example An automobile manufacturer claims a
    particular automobile averages 35 mpg (highway).
  • Suppose we are interested in testing this claim.
    We will sample 25 of these particular autos and
    under identical conditions calculate the average
    mpg for this sample.
  • Before actually collecting the data, we decide
    that if we get a sample average less than 33 mpg
    or more than 37 mpg, we will reject the maker's
    claim. (Critical Values)

26
3-3. Statistical Inference for a Single
Sample
  • Example (continued)
  • H0: μ = 35
  • H1: μ ≠ 35
  • From the sample of 25 cars, the average mpg was
    found to be 31.5. What is your conclusion?

27
3-3. Statistical Inference for a Single
Sample
  • Choice of Critical Values
  • How are the critical values chosen?
  • Wouldn't it be easier to decide how much room
    for error you will allow instead of finding the
    exact critical values for every problem you
    encounter?
  • OR
  • Wouldn't it be easier to set the size of the
    rejection region, rather than setting the
    critical values for every problem?

28
3-3. Statistical Inference for a Single
Sample
  • Significance Level
  • The level of significance, α, determines the size
    of the rejection region.
  • The level of significance is a probability. It is
    also known as the probability of a Type I error
    (want this to be small).
  • Type I error - rejecting the null hypothesis when
    it is true.
  • How small? Usually α = 0.01, 0.05, or 0.10.

29
3-3. Statistical Inference for a Single
Sample
  • Types of Error
  • Type I error - rejecting the null hypothesis when
    it is true.
  • Pr(Type I error) = α. Sometimes called the
    producer's risk.
  • Type II error - not rejecting the null hypothesis
    when it is false.
  • Pr(Type II error) = β. Sometimes called the
    consumer's risk.

30
3-3. Statistical Inference for a Single
Sample
  • Power of a Test
  • The power of a test of hypothesis is given by
    1 − β.
  • That is, 1 − β is the probability of correctly
    rejecting the null hypothesis, or the probability
    of rejecting the null hypothesis when the
    alternative is true.

31
3-3.1 Inference on the Mean of a
Population, Variance Known
  • Hypothesis Testing
  • Hypotheses: H0: μ = μ0    H1: μ ≠ μ0
  • Test Statistic: Z0 = (xbar − μ0)/(σ/√n)
  • Significance Level: α
  • Rejection Region: Z0 > Zα/2 or Z0 < −Zα/2
  • If Z0 falls into either of the two regions above,
    reject H0.

32
3-3.1 Inference on the Mean of a
Population, Variance Known
  • Example 3-1
  • Hypotheses: H0: μ = 175    H1: μ > 175
  • Test Statistic: Z0 = (182 − 175)/(10/√25) = 3.50
  • Significance Level: α = 0.05
  • Rejection Region: Z0 > Z0.05 = 1.645
  • Since 3.50 > 1.645, reject H0 and conclude that
    the lot mean bursting strength exceeds 175 psi.
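
A sketch of this test in Python (the summary values xbar = 182,
sigma = 10, n = 25 are reconstructed from slides 32 and 35, not
stated explicitly here):

# Example 3-1: one-sided Z-test of H0: mu = 175 vs H1: mu > 175.
from scipy.stats import norm

xbar, mu0, sigma, n, alpha = 182.0, 175.0, 10.0, 25, 0.05
z0 = (xbar - mu0) / (sigma / n**0.5)   # 3.50
zcrit = norm.ppf(1 - alpha)            # 1.645
print(z0, zcrit, z0 > zcrit)           # True -> reject H0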

33
3-3.1 Inference on the Mean of a
Population, Variance Known
  • Confidence Intervals
  • A general 100(1 − α)% two-sided confidence
    interval on the true population mean, μ, is
    L ≤ μ ≤ U, where P(L ≤ μ ≤ U) = 1 − α.
  • 100(1 − α)% one-sided confidence intervals are
  • Upper: μ ≤ U        Lower: L ≤ μ

34
3-3.1 Inference on the Mean of a
Population, Variance Known
  • Confidence Interval on the Mean with Variance
    Known
  • Two-Sided:
    xbar − Zα/2 σ/√n ≤ μ ≤ xbar + Zα/2 σ/√n
  • See the text for one-sided confidence intervals.

35
3-3.1 Inference on the Mean of a
Population, Variance Known
  • Example 3-2
  • Reconsider Example 3-1. Suppose a 95% two-sided
    confidence interval is specified. Using Equation
    (3-28) we compute
    182 − 1.96(10/√25) ≤ μ ≤ 182 + 1.96(10/√25)
    178.08 ≤ μ ≤ 185.92
  • Our estimate of the mean bursting strength is 182
    psi ± 3.92 psi with 95% confidence.
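
The same interval in a short sketch (same reconstructed summary
values as above; scipy assumed):

# Example 3-2: 95% two-sided CI on the mean, variance known.
from scipy.stats import norm

xbar, sigma, n, alpha = 182.0, 10.0, 25, 0.05
half = norm.ppf(1 - alpha / 2) * sigma / n**0.5   # 1.96 * 2 = 3.92
print(xbar - half, xbar + half)                   # (178.08, 185.92)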

36
3-3.2 The Use of P-Values in
Hypothesis Testing
  • If it is not enough to know whether your test
    statistic, Z0, falls into a rejection region, a
    measure of just how significant the test
    statistic is can be computed - the P-value.
  • P-values are probabilities associated with the
    test statistic, Z0.

37
3-3.2 The Use of P-Values in Hypothesis
Testing
  • Definition
  • The P-value is the smallest level of significance
    that would lead to rejection of the null
    hypothesis H0.

38
3-3.2 The Use of P-Values in Hypothesis
Testing
  • Example
  • Reconsider Example 3-1. The test statistic was
    calculated to be Z0 = 3.50 for a right-tailed
    hypothesis test. The P-value for this problem is
    then
  • P = 1 − Φ(3.50) = 0.00023
  • Thus, H0: μ = 175 would be rejected at any level
    of significance α ≥ P = 0.00023.

39
3-3.3 Inference on the Mean of a
Population, Variance Unknown
  • Hypothesis Testing
  • Hypotheses: H0: μ = μ0    H1: μ ≠ μ0
  • Test Statistic: t0 = (xbar − μ0)/(S/√n)
  • Significance Level: α
  • Rejection Region: reject H0 if |t0| > tα/2,n−1

40
3-3.3 Inference on the Mean of a
Population, Variance Unknown
  • Confidence Interval on the Mean with Variance
    Unknown
  • Two-Sided:
    xbar − tα/2,n−1 S/√n ≤ μ ≤ xbar + tα/2,n−1 S/√n
  • See the text for the one-sided confidence
    intervals.

41
3-3.3 Inference on the Mean of a
Population, Variance Unknown
  • Computer Output

42
3-3.4 Inference on the Variance of a
Normal Distribution
  • Hypothesis Testing
  • Hypotheses: H0: σ² = σ0²    H1: σ² ≠ σ0²
  • Test Statistic: χ0² = (n − 1)S²/σ0²
  • Significance Level: α
  • Rejection Region: χ0² > χ²α/2,n−1 or
    χ0² < χ²1−α/2,n−1

43
3-3.4 Inference on the Variance of a
Normal Distribution
  • Confidence Interval on the Variance
  • Two-Sided:
    (n − 1)S²/χ²α/2,n−1 ≤ σ² ≤ (n − 1)S²/χ²1−α/2,n−1
  • See the text for the one-sided confidence
    intervals.

44
Example
  • Compute a 95% two-sided CI on the data in Table
    3-1.
  • S² = 2.76, n = 16
  • Find χ²0.025,15 = 27.49 and χ²0.975,15 = 6.27
  • 15(2.76)/27.49 ≤ σ² ≤ 15(2.76)/6.27
  • 1.51 ≤ σ² ≤ 6.60
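
A sketch of this interval with SciPy (n = 16 is inferred from the 15
degrees of freedom; SciPy's percentiles differ slightly from the
rounded table values):

# Slide 44: chi-square CI on sigma^2 with S^2 = 2.76, df = 15.
from scipy.stats import chi2

s2, df, alpha = 2.76, 15, 0.05
lower = df * s2 / chi2.ppf(1 - alpha / 2, df)   # 15(2.76)/27.49
upper = df * s2 / chi2.ppf(alpha / 2, df)       # 15(2.76)/6.26
print(lower, upper)                             # about (1.51, 6.61)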

45
3-3.5 Inference on a Population
Proportion
  • Hypothesis Testing
  • Hypotheses: H0: p = p0    H1: p ≠ p0
  • Test Statistic (with continuity correction):
    Z0 = ((x − 0.5) − np0)/√(np0(1 − p0)) if x > np0
    Z0 = ((x + 0.5) − np0)/√(np0(1 − p0)) if x < np0
  • Significance Level: α
  • Rejection Region: |Z0| > Zα/2

46
3-3.5 Inference on a Population
Proportion
  • Confidence Interval on the Population Proportion
  • Two-Sided:
    phat − Zα/2 √(phat(1 − phat)/n) ≤ p ≤
    phat + Zα/2 √(phat(1 − phat)/n)
  • See the text for the one-sided confidence
    intervals.

47
Example
  • A foundry produces castings used in the
    automotive industry. We wish to test the
    hypothesis that the fraction nonconforming from
    this process is 10%. In a random sample of 250
    castings, 41 were found to be nonconforming.
  • H0: p = 0.10
  • H1: p ≠ 0.10

48
Example, cont.
  • np0 = 250(0.10) = 25
  • So, x = 41 > np0
  • Use Z0 = ((x − 0.5) − np0)/√(np0(1 − p0))
  • Z0 = (41 − 0.5 − 25)/√(250(0.1)(0.9)) = 3.27
  • At α = 0.05, Z0.025 = 1.96
  • Since 3.27 > 1.96, reject H0
  • P = 2(1 − 0.99946) = 2(0.00054) = 0.00108
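
A sketch reproducing this test (values from the slide; scipy assumed):

# Slides 47-48: proportion test with continuity correction.
from scipy.stats import norm

x, n, p0 = 41, 250, 0.10
np0 = n * p0                          # 25
se = (n * p0 * (1 - p0)) ** 0.5       # sqrt(22.5)
z0 = ((x - 0.5) - np0) / se           # 3.27, using the x > n*p0 case
p_value = 2 * norm.sf(z0)             # two-sided, about 0.0011
print(z0, p_value)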

49
3-3.6 The Probability of Type II Error
  • Calculation of P(Type II Error)
  • Assume the test of interest is H0: μ = μ0 vs
    H1: μ ≠ μ0, and the true mean is μ1 = μ0 + δ.
  • P(Type II Error) is found to be
    β = Φ(Zα/2 − δ√n/σ) − Φ(−Zα/2 − δ√n/σ)
  • The power of the test is then 1 − β.

50
Explanation
  • The test statistic is
    Z0 = (xbar − μ0)/(σ/√n) ~ N(0, 1)
  • Assume H0 is false, and find the distribution of
    Z0.
  • Suppose that the mean is really μ1 = μ0 + δ,
    where δ > 0.
  • Under this assumption, Z0 ~ N(δ√n/σ, 1).

51
Explanation, cont.
  • Now, take a look at Fig. 3-6.
  • We are trying to calculate the shaded area.
  • We must calculate
    Φ(Zα/2 − δ√n/σ) − Φ(−Zα/2 − δ√n/σ)
  • So, we use the equation shown previously.

52
Example
  • Mean contents of coffee cans
  • Given that σ = 0.1 oz and n = 9 (so √n = 3)
  • H0: μ = 16.0
  • H1: μ ≠ 16.0
  • Suppose we want to find β if μ = 16.1 oz.

53
Example, cont.
  • Now, δ = 16.1 − 16.0 = 0.1
  • Z0.025 = 1.96
  • Then β = Φ(1.96 − (0.1)(3)/0.1) −
    Φ(−1.96 − (0.1)(3)/0.1)
  • Or, β = Φ(−1.04) − Φ(−4.96)
  • β = 0.1492
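
A sketch of the beta calculation (sigma = 0.1, delta = 0.1, and
n = 9 as reconstructed above; scipy assumed):

# Slides 52-53: beta = Phi(z - shift) - Phi(-z - shift).
from scipy.stats import norm

sigma, delta, n, alpha = 0.1, 0.1, 9, 0.05
shift = delta * n**0.5 / sigma        # 3.0
z = norm.ppf(1 - alpha / 2)           # 1.96
beta = norm.cdf(z - shift) - norm.cdf(-z - shift)
print(beta)                           # about 0.1492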

54
3-3.6 The Probability of Type II Error
  • Operating Characteristic (OC) Curves
  • An operating characteristic (OC) curve is a graph
    representing the relationship between β, the
    difference to detect d = δ/σ, α, and n.
  • OC curves are useful in determining how large a
    sample is required to detect a specified
    difference with a particular probability.

55
3-3.6 The Probability of Type II Error
  • Operating Characteristic (OC) Curves
  • (Figure: OC curves, β versus d = δ/σ for various
    n. Previous example: d = 0.1/0.1 = 1, β = 0.1492.)
56
3-3.7 Probability Plotting
  • Probability plotting is a graphical method for
    determining whether sample data conform to a
    hypothesized distribution based on a subjective
    visual examination of the data.
  • Probability plotting uses special graph paper
    known as probability paper. Probability paper is
    available for the normal, lognormal, and Weibull
    distributions, among others.
  • Can also use the computer.

57
3-3.7 Probability Plotting
  • Example 3-8

58
Tensile strength data.
Note: μ and σ can be estimated from the
probability plot as shown.
59
3-4. Statistical Inference for Two Samples
  • The previous section presented hypothesis testing
    and confidence intervals for a single population
    parameter.
  • The results are extended to the case of two
    independent populations.
  • Statistical inference on the difference in
    population means, μ1 − μ2.

60
3-4.1 Inference For a Difference in
Means, Variances Known
  • Assumptions
  • X11, X12, ..., X1n1 is a random sample from
    population 1.
  • X21, X22, ..., X2n2 is a random sample from
    population 2.
  • The two populations represented by X1 and X2 are
    independent.
  • Both populations are normal, or if they are not
    normal, the conditions of the central limit
    theorem apply.

61
3-4.1 Inference For a Difference in
Means, Variances Known
  • The point estimator for μ1 − μ2 is xbar1 − xbar2,
  • where E(xbar1 − xbar2) = μ1 − μ2 and
    Var(xbar1 − xbar2) = σ1²/n1 + σ2²/n2.

62
3-4.1 Inference For a Difference in
Means, Variances Known
  • Hypothesis Tests for a Difference in Means,
    Variances Known
  • Null Hypothesis: H0: μ1 − μ2 = Δ0
  • Test Statistic:
    Z0 = (xbar1 − xbar2 − Δ0)/√(σ1²/n1 + σ2²/n2)

63
3-4.1 Inference For a Difference in
Means, Variances Known
  • Hypothesis Tests for a Difference in Means,
    Variances Known
  • Alternative Hypotheses       Rejection Criterion
  • H1: μ1 − μ2 ≠ Δ0             |Z0| > Zα/2
  • H1: μ1 − μ2 > Δ0             Z0 > Zα
  • H1: μ1 − μ2 < Δ0             Z0 < −Zα

64
3-4.1 Inference For a Difference in
Means, Variances Known
  • Confidence Interval on a Difference in Means,
    Variances Known
  • The 100(1 − α)% confidence interval on the
    difference in means is given by
    (xbar1 − xbar2) ± Zα/2 √(σ1²/n1 + σ2²/n2)

65
Example 3-9
  • Drying time of two paint formulations is being
    studied
  • 10 samples of each paint
  • xbar1 = 121 and xbar2 = 112
  • Standard deviation, σ = 8, unaffected by paint
    formulation
  • H0: μ1 − μ2 = 0
  • H1: μ1 − μ2 > 0

66
Example, cont.
  • Z0 = (121 − 112)/√(64/10 + 64/10) = 2.52 >
    Z0.05 = 1.645
  • Therefore, reject H0
  • P-value = 1 − Φ(2.52) = 0.0059
  • H0 would be rejected at any significance level
    α ≥ 0.0059
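
A sketch of Example 3-9's test statistic and P-value (scipy assumed):

# Example 3-9: two-sample Z-test, sigma1 = sigma2 = 8, n1 = n2 = 10.
from scipy.stats import norm

x1, x2, sigma, n = 121.0, 112.0, 8.0, 10
z0 = (x1 - x2) / (sigma**2 / n + sigma**2 / n) ** 0.5   # 2.52
p = norm.sf(z0)                                         # about 0.0059
print(z0, p)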

67
Confidence Interval
  • On a difference in means, variances known
  • See Eq. 3-49

68
3-4.2 Inference For a Difference in
Means, Variances Unknown
  • Hypothesis Tests for a Difference in Means,
    Variances Unknown
  • Case I: σ1² = σ2² = σ²
  • The point estimator for μ1 − μ2 is xbar1 − xbar2,
  • where Var(xbar1 − xbar2) = σ²(1/n1 + 1/n2).

69
3-4.2 Inference For a Difference in
Means, Variances Unknown
  • Hypothesis Tests for a Difference in Means,
  • Case I
  • The pooled estimate of σ², denoted by Sp², is
    defined by
    Sp² = ((n1 − 1)S1² + (n2 − 1)S2²)/(n1 + n2 − 2)

70
3-4.2 Inference For a Difference in
Means, Variances Unknown
  • Hypothesis Tests for a Difference in Means,
  • Case I
  • Null Hypothesis: H0: μ1 − μ2 = Δ0
  • Test Statistic:
    t0 = (xbar1 − xbar2 − Δ0)/(Sp √(1/n1 + 1/n2))

71
3-4.2 Inference For a Difference in
Means, Variances Unknown
  • Hypothesis Tests for a Difference in Means,
    Variances Unknown
  • Alternative Hypotheses       Rejection Criterion
  • H1: μ1 − μ2 ≠ Δ0             |t0| > tα/2,n1+n2−2
  • H1: μ1 − μ2 > Δ0             t0 > tα,n1+n2−2
  • H1: μ1 − μ2 < Δ0             t0 < −tα,n1+n2−2

72
Example 3-10
  • Discuss this example on pgs. 120-121
  • The variances are assumed to be equal, so that
    they are pooled
  • t.025,14 2.145

73
3-4.2 Inference For a Difference in
Means, Variances Unknown
  • Hypothesis Tests for a Difference in Means,
  • Case II: σ1² ≠ σ2²
  • Null Hypothesis: H0: μ1 − μ2 = Δ0
  • Test Statistic:
    t0* = (xbar1 − xbar2 − Δ0)/√(S1²/n1 + S2²/n2)

74
3-4.2 Inference For a Difference in
Means, Variances Unknown
  • Hypothesis Tests for a Difference in Means,
  • Case II
  • The degrees of freedom for t0* are given by the
    Satterthwaite approximation:
    ν = (S1²/n1 + S2²/n2)² /
        [(S1²/n1)²/(n1 − 1) + (S2²/n2)²/(n2 − 1)]

75
3-4.2 Inference For a Difference in
Means, Variances Unknown
  • Confidence Intervals on a Difference in Means,
    Case I
  • The 100(1 − α)% confidence interval on the
    difference in means is given by
    (xbar1 − xbar2) ± tα/2,n1+n2−2 Sp √(1/n1 + 1/n2)

76
3-4.2 Inference For a Difference in
Means, Variances Unknown
  • Confidence Intervals on a Difference in Means,
    Case II
  • The 100(1 − α)% confidence interval on the
    difference in means is given by
    (xbar1 − xbar2) ± tα/2,ν √(S1²/n1 + S2²/n2)

77
Example 3-11
  • Discuss this example beginning on pg. 122
  • The variances are assumed to be equal, so they
    are pooled
  • t0.025,23 = 2.069
  • Note that the 95% CI contains zero, so we
    cannot conclude that there is a difference in the
    means

78
Minitab solution
Two-sample T for Catalyst 1 vs Catalyst 2

             N    Mean   StDev   SE Mean
Catalyst 1   8   92.26    2.39      0.84
Catalyst 2   8   92.73    2.99       1.1

Difference = mu Catalyst 1 - mu Catalyst 2
Estimate for difference: -0.47
95% CI for difference: (-3.37, 2.42)
T-Test of difference = 0 (vs not =): T-Value = -0.35
P-Value = 0.731  DF = 14
Both use Pooled StDev = 2.70

Note: 0 is included in the CI.
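
The printed output can be checked from the summary statistics alone;
a sketch using scipy's ttest_ind_from_stats, which accepts means,
standard deviations, and sample sizes:

# Slide 78: pooled two-sample t-test from summary statistics.
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(92.26, 2.39, 8,
                            92.73, 2.99, 8, equal_var=True)
print(t, p)   # about -0.35 and 0.73, matching the Minitab output
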
79
Minitab solution
No obvious differences
80
Minitab solution
Normality and equal variances are reasonable
81
3-4.2 Paired Data
  • Observations in an experiment are often paired to
    prevent extraneous factors from inflating the
    estimate of the variance.
  • A difference is obtained on each pair of
    observations: dj = x1j − x2j, where j = 1, 2, ...,
    n.
  • Test the hypothesis that the mean of the
    difference, μd, is zero.

82
3-4.2 Paired Data
  • The differences, dj, represent the new set of
    data, with summary statistics
    dbar = (1/n)Σdj and Sd² = Σ(dj − dbar)²/(n − 1).

83
3-4.2 Paired Data
  • Hypothesis Testing
  • Hypotheses: H0: μd = 0    H1: μd ≠ 0
  • Test Statistic: t0 = dbar/(Sd/√n)
  • Significance Level: α
  • Rejection Region: |t0| > tα/2,n−1

84
Example 3-12
  • dbar = −1.38
  • t0 = −1.46
  • t0.025,7 = 2.365
  • Conclusion: No strong evidence to indicate that
    the two machines differ in their tensile strength
  • P-value = 0.1877

85
Solution using Minitab
Paired T for Mach 1 - Mach 2

              N     Mean   StDev   SE Mean
Mach 1        8    69.13    5.96      2.11
Mach 2        8    70.50    6.07      2.15
Difference    8   -1.375   2.669     0.944

95% CI for mean difference: (-3.608, 0.858)
T-Test of mean difference = 0 (vs not = 0):
T-Value = -1.46  P-Value = 0.188
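
A sketch checking this paired test from the printed summary of the
differences (dbar = -1.375, Sd = 2.669, n = 8; scipy assumed):

# Slide 85: paired t-test from summary statistics.
from scipy.stats import t as t_dist

dbar, sd, n = -1.375, 2.669, 8
t0 = dbar / (sd / n**0.5)             # about -1.46
p = 2 * t_dist.sf(abs(t0), n - 1)     # about 0.188
print(t0, p)
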
86
3-4.3 Inferences on the Variances of Two
Normal Distributions
  • Hypothesis Testing
  • Consider testing the hypothesis that the
    variances of two independent normal distributions
    are equal.
  • Assume random samples of sizes n1 and n2 are
    taken from populations 1 and 2, respectively

87
3-4.3 Inferences on the Variances of Two
Normal Distributions
  • Hypothesis Testing
  • Hypotheses: H0: σ1² = σ2²    H1: σ1² ≠ σ2²
  • Test Statistic: F0 = S1²/S2²
  • Significance Level: α
  • Rejection Region: F0 > Fα/2,n1−1,n2−1 or
    F0 < F1−α/2,n1−1,n2−1

88
3-4.3 Inferences on the Variances of Two
Normal Distributions
  • Alternative        Test           Rejection
  • Hypothesis         Statistic      Region
  • H1: σ1² > σ2²      F0 = S1²/S2²   F0 > Fα,n1−1,n2−1
  • H1: σ1² < σ2²      F0 = S2²/S1²   F0 > Fα,n2−1,n1−1

89
3-4.3 Inferences on the Variances of Two
Normal Distributions
  • Confidence Intervals on the Ratio of the
    Variances of Two Normal Distributions
  • The 100(1 − α)% two-sided confidence interval
    on the ratio of variances is given by
    (S1²/S2²)F1−α/2,n2−1,n1−1 ≤ σ1²/σ2² ≤
    (S1²/S2²)Fα/2,n2−1,n1−1

90
3-4.4 Inference on Two Population
Proportions
  • Large-Sample Hypothesis Testing
  • Hypotheses: H0: p1 = p2    H1: p1 ≠ p2
  • Test Statistic:
    Z0 = (phat1 − phat2)/√(phat(1 − phat)(1/n1 + 1/n2)),
    where phat = (x1 + x2)/(n1 + n2)
  • Significance Level: α
  • Rejection Region: |Z0| > Zα/2

91
3-4.4 Inference on Two Population
Proportions
  • Alternative Hypothesis       Rejection Region
  • H1: p1 ≠ p2                  |Z0| > Zα/2
  • H1: p1 > p2                  Z0 > Zα
  • H1: p1 < p2                  Z0 < −Zα

92
3-4.4 Inference on Two Population
Proportions
  • Confidence Interval on the Difference in Two
    Population Proportions
  • Two-Sided:
    (phat1 − phat2) ± Zα/2 √(phat1(1 − phat1)/n1 +
    phat2(1 − phat2)/n2)
  • See the text for the one-sided confidence
    intervals.

93
3-5. What If We Have More Than Two
Populations?
  • Example
  • Investigating the effect of one factor (with
    several levels) on some response. See Table 3-5.

    Hardwood            Observations
    Concentration (%)   1   2   3   4   5   6   Totals     Avg
     5                  7   8  15  11   9  10      60    10.00
    10                 12  17  13  18  19  15      94    15.67
    15                 14  18  19  17  16  18     102    17.00
    20                 19  25  22  23  18  20     127    21.17
                                       Overall    383    15.96

94
3-5. What If We Have More Than Two
Populations?
  • Analysis of Variance
  • Always a good practice to compare the levels of
    the factor using graphical methods such as
    boxplots.
  • Comparative boxplots show the variability of the
    observations within a factor level and the
    variability between factor levels.

95
3-5. What If We Have More Than Two
Populations?
  • Figure 3-14 (a)

96
3-5. What If We Have More Than Two
Populations?
  • The observations yij can be modeled by
    yij = μ + τi + εij,  i = 1, ..., a;  j = 1, ..., n
  • a = number of factor levels
  • n = number of replicates of observations per
    treatment (factor) level

97
3-5. What If We Have More Than Two
Populations?
  • The hypotheses being tested are
  • H0: τ1 = τ2 = ... = τa = 0
  • H1: τi ≠ 0 for at least one i
  • Total variability can be measured by the total
    corrected sum of squares
    SST = ΣΣ(yij − ybar..)²

98
3-5. What If We Have More Than Two
Populations?
  • The sum of squares identity is
    ΣΣ(yij − ybar..)² = nΣ(ybari. − ybar..)² +
    ΣΣ(yij − ybari.)²
  • Notationally, this is often written as
  • SST = SSTreatments + SSE

99
3-5. What If We Have More Than Two
Populations?
  • The expected value of the treatment mean square
    is
    E(SSTreatments/(a − 1)) = σ² + nΣτi²/(a − 1)
  • If the null hypothesis is true, then
    E(SSTreatments/(a − 1)) = σ²

100
3-5. What If We Have More Than Two
Populations?
  • The error mean square is
    MSE = SSE/(a(n − 1))
  • If the null hypothesis is true, the ratio
    F0 = (SSTreatments/(a − 1))/MSE
  • has an F-distribution with a − 1 and a(n − 1)
    degrees of freedom.

101
3-5. What If We Have More Than Two
Populations?
  • The following formulas can be used to calculate
    the sums of squares.
  • Total Sum of Squares:
    SST = ΣΣ yij² − y..²/(an)
  • Sum of Squares for the Treatments:
    SSTreatments = (1/n)Σ yi.² − y..²/(an)
  • Sum of Squares for Error:
    SSE = SST − SSTreatments

102
3-5. What If We Have More Than Two
Populations?
  • Analysis of Variance Table 3-7

103
Example 3-13
  • Four different hardwood concentrations
    (treatments)
  • Six observations of each
  • Completely randomized design
  • Hypotheses
  • H0: τ1 = τ2 = τ3 = τ4 = 0
  • H1: τi ≠ 0 for at least one i

104
Example 3-13
  • SST = 7² + 8² + ... + 20² − (383)²/24 = 512.96
  • SSTreatments = (60² + 94² + 102² + 127²)/6 −
    (383)²/24 = 382.79
  • SSE = SST − SSTreatments = 512.96 − 382.79 =
    130.17
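
A sketch reproducing this ANOVA from the Table 3-5 data (scipy's
f_oneway performs the one-way ANOVA F-test):

# Example 3-13: one-way ANOVA on the hardwood-concentration data.
from scipy.stats import f_oneway

conc5  = [7, 8, 15, 11, 9, 10]
conc10 = [12, 17, 13, 18, 19, 15]
conc15 = [14, 18, 19, 17, 16, 18]
conc20 = [19, 25, 22, 23, 18, 20]
f0, p = f_oneway(conc5, conc10, conc15, conc20)
print(f0, p)   # F0 about 19.6, P-value well below 0.01 -> reject H0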

105
Example 3-13, cont.
106
Example 3-13, cont.
  • F0 = (382.79/3)/(130.17/20) = 19.6
  • Since F0 > F0.05,3,20 = 3.10, reject H0
    (H0: no difference among concentrations)
  • Since F0 > F0.01,3,20 = 4.94, reject H0 at the
    0.01 level as well

107
3-5. What If We Have More Than Two
Populations?
  • Analysis of Variance Table 3-8

108
3-5. What If We Have More Than Two
Populations?
  • Residual Analysis
  • Assumptions: model errors are normally and
    independently distributed with equal variance.
  • Check the assumptions by looking at residual
    plots.

109
3-5. What If We Have More Than Two
Populations?
  • Residual Analysis
  • The residual is given by eij = yij − ybari.
  • ybar1. = 10
  • e11 = 7 − 10 = −3, and so on

110
3-5. What If We Have More Than Two
Populations?
  • Residual Analysis
  • Plot of residuals versus factor levels

111
3-5. What If We Have More Than Two
Populations?
  • Residual Analysis
  • Normal probability plot of residuals

112
Exercises
  • Work as many odd-numbered exercises as necessary
    to make sure that you understand this chapter

113
End