Title: Ch7 Inference concerning means II
1Ch7 Inference concerning means II
- Dr. Deshi Ye
- yedeshi_at_zju.edu.cn
2Review
- Point estimation: calculate the estimated standard error to accompany the point estimate of a population mean.
- Interval estimation
- Whatever the population, when the sample size is large, calculate the $100(1-\alpha)\%$ confidence interval for the mean: $\bar{x} \pm z_{\alpha/2}\, s/\sqrt{n}$
- When the population is normal, calculate the $100(1-\alpha)\%$ confidence interval for the mean: $\bar{x} \pm t_{\alpha/2}\, s/\sqrt{n}$
- where $t_{\alpha/2}$ is obtained from the t-distribution with $n-1$ degrees of freedom.
3Review con.
- Test of hypothesis
- Five steps in total. Formulate the assertion that the experiment seeks to confirm as the alternative hypothesis.
- P-value calculation: the smallest fixed level at which the null hypothesis can be rejected.
4Outline
- The relation between tests and confidence intervals
- Operating characteristic curves
- Inference concerning two means
- Design issues: randomization and pairing
57.6 The relation between tests and confidence
intervals
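- Tests for two-sided alternatives and confidence intervals.
- Consider the $100(1-\alpha)\%$ confidence interval for the mean $\mu$ given on p. 233:
$\bar{x} - t_{\alpha/2}\, s/\sqrt{n} < \mu < \bar{x} + t_{\alpha/2}\, s/\sqrt{n}$
A level $\alpha$ test of the null hypothesis $H_0: \mu = \mu_0$
versus $H_1: \mu \neq \mu_0$
Test critical region: $\left| \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} \right| \ge t_{\alpha/2}$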
6Relations
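The acceptance region $\left| \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} \right| < t_{\alpha/2}$ can also be expressed as
$\bar{x} - t_{\alpha/2}\, s/\sqrt{n} < \mu_0 < \bar{x} + t_{\alpha/2}\, s/\sqrt{n}$
Relation: the confidence interval gives the interval of plausible values for $\mu$.
So, if $\mu_0$ is contained in this interval, then the null hypothesis $H_0: \mu = \mu_0$ cannot be rejected.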
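The relation between the two-sided test and the confidence interval can be checked numerically. The sketch below is only an illustration: the summary values and the candidate values of $\mu_0$ are hypothetical, and the critical value 3.25 (the tabulated $t_{0.005}$ for 9 degrees of freedom) is passed in by hand rather than computed.

```python
from math import sqrt

def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic for H0: mu = mu0."""
    return (xbar - mu0) / (s / sqrt(n))

def confidence_interval(xbar, s, n, t_crit):
    """Two-sided interval xbar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

def rejects(xbar, mu0, s, n, t_crit):
    """Two-sided test: reject H0 when |t| >= t_crit."""
    return abs(t_statistic(xbar, mu0, s, n)) >= t_crit

# Hypothetical summary data, chosen only for illustration.
xbar, s, n = 5.2, 4.08, 10
t_crit = 3.25  # tabulated t_{0.005} with 9 degrees of freedom (alpha = 0.01)

low, high = confidence_interval(xbar, s, n, t_crit)
for mu0 in (0.0, 1.0, 5.0, 9.0, 10.0):
    inside = low < mu0 < high
    # The test rejects exactly when mu0 falls outside the interval.
    assert rejects(xbar, mu0, s, n, t_crit) == (not inside)
    print(mu0, "inside CI" if inside else "outside CI")
```

Every candidate $\mu_0$ inside the interval is not rejected and every candidate outside it is rejected, which is the statement on the slide.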
7Calculating Type II Error Probabilities
8Power of Test
- 1. Probability of Rejecting a False H0
- Correct Decision
- 2. Designated $1 - \beta$
- 3. Used in Determining Test Adequacy
- 4. Affected by
- True Value of Population Parameter
- Significance Level $\alpha$
- Standard Deviation $\sigma$ and Sample Size $n$
9Finding Power Step 1
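Hypothesis: $H_0: \mu_0 \ge 368$, $H_1: \mu_0 < 368$
Draw: the sampling distribution of $\bar{X}$ under $H_0$, centered at $\mu_0 = 368$, with the Reject region of area $\alpha = .05$ in the lower tail and the Do Not Reject region above it.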
10Finding Power Steps 2 3
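$\sigma/\sqrt{n} = 15/\sqrt{25}$
Hypothesis: $H_0: \mu_0 \ge 368$, $H_1: \mu_0 < 368$
Draw: the sampling distribution of $\bar{X}$ under $H_0$, centered at $\mu_0 = 368$, with the Reject region of area $\alpha = .05$ in the lower tail and the Do Not Reject region above it.
True Situation: $\mu_1 = 360$
Draw: the sampling distribution of $\bar{X}$ centered at $\mu_1 = 360$.
Specify: the areas $\beta$ and $1-\beta$ under this second distribution.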
11Finding Power Step 4
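$\sigma/\sqrt{n} = 15/\sqrt{25}$
Hypothesis: $H_0: \mu_0 \ge 368$, $H_1: \mu_0 < 368$
Draw: the sampling distribution of $\bar{X}$ under $H_0$, centered at $\mu_0 = 368$, with the Reject region of area $\alpha = .05$ in the lower tail and the Do Not Reject region above it.
True Situation: $\mu_1 = 360$
Draw: the sampling distribution of $\bar{X}$ centered at $\mu_1 = 360$.
Specify: the critical value on the $\bar{X}$ axis, $368 - 1.645 \cdot 15/\sqrt{25} = 363.065$, which separates the Reject and Do Not Reject regions.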
12Finding PowerStep 5
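$\sigma/\sqrt{n} = 15/\sqrt{25}$
Hypothesis: $H_0: \mu_0 \ge 368$, $H_1: \mu_0 < 368$
Draw: the sampling distribution of $\bar{X}$ under $H_0$, centered at $\mu_0 = 368$, with the Reject region of area $\alpha = .05$ in the lower tail and the Do Not Reject region above it.
True Situation: $\mu_1 = 360$
Draw: the sampling distribution of $\bar{X}$ centered at $\mu_1 = 360$, with the critical value 363.065 marked.
Specify (Z Table): $\beta = .154$ is the area above 363.065 and $1-\beta = .846$ is the area below it.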
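The five steps above can be sketched in a few lines. This is a minimal illustration using only the numbers on the slides ($\mu_0 = 368$, $\mu_1 = 360$, $\sigma = 15$, $n = 25$, $\alpha = .05$); the function name `lower_tail_power` is an assumption introduced here, and the normal distribution comes from Python's standard `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_power(mu0, mu1, sigma, n, alpha):
    """Lower-tailed z test of H0: mu >= mu0, evaluated when the true mean is mu1.

    Returns (cutoff, beta, power).
    """
    se = sigma / sqrt(n)                      # standard error of the sample mean
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    cutoff = mu0 - z_alpha * se               # reject H0 when the sample mean is below this
    power = NormalDist(mu1, se).cdf(cutoff)   # P(reject H0 | mu = mu1)
    return cutoff, 1 - power, power

cutoff, beta, power = lower_tail_power(mu0=368, mu1=360, sigma=15, n=25, alpha=0.05)
print(round(cutoff, 3), round(beta, 3), round(power, 3))
```

The computed values can differ from the slide's .154 and .846 in the third decimal place, because the Z table lookup rounds the standardized value to two decimals.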
13Power Curves
Figure: three power curves, one per form of the null hypothesis ($H_0: \mu \ge \mu_0$, $H_0: \mu \le \mu_0$, $H_0: \mu = \mu_0$). Vertical axis: Power. Horizontal axis: Possible True Values for $\mu_1$. $\mu_0 = 368$ in the example.
147.7 Operating Characteristic Curves
- How about Type II errors?
- Review the example in Section 7.4 (p. 238)
- We investigate the probability of not rejecting (accepting) the null hypothesis under a range of values for $\mu$.
$L(\mu)$ = probability of accepting the null hypothesis when $\mu$ prevails
15If $\mu$ equals a value where the null hypothesis is true, then $1 - L(\mu)$ is the probability of the Type I error.
When $\mu$ has a value where the alternative hypothesis is true, then $L(\mu)$ is the probability of a Type II error.
Example P238
If the prevailing population mean is
16Calculation
- Students are asked to calculate the table in P253
17OC curve
- The graph of $L(\mu)$ for various values of $\mu$ is called the operating characteristic curve, or simply the OC curve.
Figure: OC curve, with the ranges of $\mu$ labeled Null Hypothesis and Alternative Hypothesis.
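An OC curve can be tabulated directly from its definition. The numbers of the Section 7.4 example are not reproduced on these slides, so the sketch below reuses the lower-tailed test from the power slides ($\mu_0 = 368$, $\sigma = 15$, $n = 25$, $\alpha = .05$) as an assumed stand-in; `oc_value` is a name introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

def oc_value(mu, mu0=368.0, sigma=15.0, n=25, alpha=0.05):
    """L(mu): probability of accepting H0: mu >= mu0 (lower-tailed z test) when mu prevails."""
    se = sigma / sqrt(n)
    cutoff = mu0 - NormalDist().inv_cdf(1 - alpha) * se
    # H0 is accepted when the sample mean is at or above the cutoff.
    return 1 - NormalDist(mu, se).cdf(cutoff)

for mu in (356, 360, 364, 368, 372):
    print(mu, round(oc_value(mu), 3))
```

At $\mu = \mu_0$ the value is $1 - \alpha$, so $1 - L(\mu_0)$ is the Type I error probability; for $\mu$ below $\mu_0$ the value $L(\mu)$ is the Type II error probability, and $1 - L(\mu)$ is the power.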
18Type II error
- From Table 8 at the end of the textbook, the probabilities of Type II errors can be determined directly for the corresponding value of $d$.
Quantity needed for use of Table 8: $d = |\mu - \mu_0| / \sigma$
19Thinking Challenge
How Would You Try to Answer These Questions?
- Who Gets Higher Grades: Males or Females?
- Which Programs Are Faster to Learn: Windows or DOS?
207.8 Inference concerning two means
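- In many statistical problems, we are faced with decisions about the relative size of the means of two or more populations.
- Tests concerning the difference between two means
- Consider two populations having the means $\mu_1$ and $\mu_2$ and the variances $\sigma_1^2$ and $\sigma_2^2$,
- and we want to test the null hypothesis $\mu_1 - \mu_2 = \delta$, where $\delta$ is a specified constant,
on the basis of independent random samples of size $n_1$ and $n_2$.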
21Two Population Tests
22Testing Two Means
- Independent Sampling and Paired Difference Experiments
23Two Population Tests
28Independent & Related Populations
Independent
- 1. Different Data Sources
- Unrelated
- Independent
- 2. Use Difference Between the 2 Sample Means
- $\bar{X}_1 - \bar{X}_2$
Related
- 1. Same Data Source
- Paired or Matched
- Repeated Measures (Before/After)
- 2. Use Difference Between Each Pair of Observations
- $D_i = X_{1i} - X_{2i}$
29Two Independent Populations Examples
- 1. An economist wishes to determine whether there is a difference in mean family income for households in 2 socioeconomic groups.
- 2. An admissions officer of a small liberal arts college wants to compare the mean SAT scores of applicants educated in rural high schools and in urban high schools.
30Two Related Populations Examples
- 1. Nike wants to see if there is a difference in durability of 2 sole materials. One type is placed on one shoe, the other type on the other shoe of the same pair.
- 2. An analyst for Educational Testing Service wants to compare the mean GMAT scores of students before and after taking a GMAT review course.
31Thinking Challenge
Are They Independent or Paired?
- 1. Miles per gallon ratings of cars before and after mounting radial tires
- 2. The life expectancy of light bulbs made in 2 different factories
- 3. Difference in hardness between 2 metals: one contains an alloy, one doesn't
- 4. Tread life of two different motorcycle tires: one on the front, the other on the back
32Testing 2 Independent Means
33Two Population Tests
34Test
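- The test will depend on the difference between the sample means, $\bar{X}_1 - \bar{X}_2$, and if both samples come from normal populations with known variances, it can be based on the statistic
$Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - \delta}{\sigma_{\bar{X}_1 - \bar{X}_2}}$
where $\delta$ is the hypothesized difference $\mu_1 - \mu_2$ and $\sigma_{\bar{X}_1 - \bar{X}_2}$ is the standard deviation of the difference between the sample means.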
35Theorem
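- If the distributions of two independent random variables have the means $\mu_1$ and $\mu_2$ and the variances $\sigma_1^2$ and $\sigma_2^2$, then the distribution of their sum (or difference) has the mean $\mu_1 + \mu_2$ (or $\mu_1 - \mu_2$) and the variance $\sigma_1^2 + \sigma_2^2$.
For two independent samples of size $n_1$ and $n_2$: $\sigma^2_{\bar{X}_1 - \bar{X}_2} = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$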
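36Statistic for test concerning difference between two means
$Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - \delta}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
is a random variable having the standard normal distribution.
For large samples, the sample variances replace the unknown population variances:
$Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - \delta}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$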
37Criterion Region for testing
38EX.
- To test the claim that the resistance of electric wire can be reduced by more than 0.05 ohm by alloying, 32 values obtained for standard wire yielded a sample mean $\bar{x}_1$ and standard deviation $s_1$ (in ohms), and 32 values obtained for alloyed wire yielded a sample mean $\bar{x}_2$ and standard deviation $s_2$ (in ohms).
- Question: At the 0.05 level of significance, does this support the claim?
39Solution
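1. Null hypothesis: $\mu_1 - \mu_2 = 0.05$. Alternative hypothesis: $\mu_1 - \mu_2 > 0.05$
2. Level of significance: 0.05
3. Criterion: reject the null hypothesis if $Z > 1.645$
4. Calculation: $z \approx 2.65$, the value whose standard normal probability 0.996 is used in step 6
5. The null hypothesis must be rejected.
6. P-value: $1 - 0.996 = 0.004 <$ level of significance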
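The calculation in step 4 can be sketched as follows. The sample means and standard deviations did not survive on the example slide, so the four values passed in below are assumed inputs, chosen only so that the statistic lands near the $z \approx 2.65$ implied by step 6. The sample sizes (32 and 32), $\delta = 0.05$ and the 0.05 level are from the slides; `two_sample_z` is a name introduced here.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(xbar1, s1, n1, xbar2, s2, n2, delta):
    """Large-sample Z for H0: mu1 - mu2 = delta, with s1 and s2 standing in for sigma1 and sigma2."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar1 - xbar2 - delta) / se

# ASSUMED summary statistics (not given on the slide), in ohms.
z = two_sample_z(xbar1=0.136, s1=0.004, n1=32,
                 xbar2=0.083, s2=0.005, n2=32, delta=0.05)

p_value = 1 - NormalDist().cdf(z)          # upper-tailed alternative mu1 - mu2 > delta
reject = z > NormalDist().inv_cdf(0.95)    # critical value 1.645 at the 0.05 level
print(round(z, 2), round(p_value, 4), reject)
```

With these assumed inputs the statistic exceeds 1.645 and the P-value is well below 0.05, matching steps 5 and 6.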
40Critical values
41Type II errors
- Type II error probabilities are used to judge the strength of support for the null hypothesis when it is not rejected.
- Check them from Table 8 at the end of the textbook.
- The case where the sizes of the two samples are not equal is handled separately.
42Small sample size
43Criterion Region for testing (Statistic for
small sample )
44EX
The following random samples are measurements of the heat-producing capacity of specimens of coal from two mines.
- Mine 1: 8260 8130 8350 8070 8340
- Mine 2: 7950 7890 7900 8140 7920 7840
Question: use the 0.01 level of significance to test whether the difference between the means of these two samples is significant.
45Solution
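1. Null hypothesis: $\mu_1 - \mu_2 = 0$. Alternative hypothesis: $\mu_1 - \mu_2 \neq 0$
2. Level of significance: 0.01
3. Criterion: reject the null hypothesis if $t > 3.25$ or $t < -3.25$ ($t_{0.005}$ with $5 + 6 - 2 = 9$ degrees of freedom)
4. Calculation: with the pooled variance estimate, $t = 4.19$
5. The null hypothesis must be rejected.
6. P-value: 0.004 (as reported in the Minitab output) $<$ level of significance 0.01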
46Calculate it in Minitab
47Output
Two-sample T for Mine 1 vs Mine 2

          N   Mean  StDev  SE Mean
Mine 1    5   8230    125       56
Mine 2    6   7940    104       43

Difference = mu (Mine 1) - mu (Mine 2)
Estimate for difference: 290.000
99% CI for difference: (43.293, 536.707)
T-Test of difference = 0 (vs not =): T-Value = 4.11  P-Value = 0.004  DF = 7
48- SE Mean (standard error of the mean) is calculated by dividing the standard deviation by the square root of n.
- StDev: standard deviation.
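The coal example can be recomputed from the raw data with the standard library alone. The sketch below calculates both the pooled-variance t statistic (9 degrees of freedom, matching the 3.25 criterion in the solution) and the unpooled statistic with Satterthwaite degrees of freedom. That the Minitab output corresponds to the unpooled version is an inference from the reported T-Value and DF, not something the slides state.

```python
from math import sqrt
from statistics import mean, variance

mine1 = [8260, 8130, 8350, 8070, 8340]
mine2 = [7950, 7890, 7900, 8140, 7920, 7840]

n1, n2 = len(mine1), len(mine2)
diff = mean(mine1) - mean(mine2)
v1, v2 = variance(mine1), variance(mine2)   # sample variances (n - 1 divisor)

# Pooled-variance t statistic with n1 + n2 - 2 degrees of freedom
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_pooled = diff / sqrt(sp2 * (1 / n1 + 1 / n2))

# Unpooled t statistic with Satterthwaite's approximate degrees of freedom
se2 = v1 / n1 + v2 / n2
t_unpooled = diff / sqrt(se2)
df_unpooled = se2**2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

print("pooled:  ", round(t_pooled, 2), "df =", n1 + n2 - 2)
print("unpooled:", round(t_unpooled, 2), "df =", int(df_unpooled))
```

Both statistics exceed 3.25 in absolute value, so the conclusion is the same under either procedure.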
49Confidence interval
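- $100(1-\alpha)\%$ confidence interval for $\delta = \mu_1 - \mu_2$:
$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2} \sqrt{\dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}} \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$
where $t_{\alpha/2}$ is based on $n_1 + n_2 - 2$ degrees of freedom.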
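As a hedged illustration of this interval, the sketch below applies it to the coal data from the earlier example. The critical value 3.25 ($t_{0.005}$, 9 degrees of freedom) is taken from that example's criterion and supplied by hand; `pooled_ci` is a name introduced here.

```python
from math import sqrt
from statistics import mean, variance

def pooled_ci(x, y, t_crit):
    """Interval mean(x) - mean(y) +/- t_crit * s_p * sqrt(1/n1 + 1/n2).

    t_crit must be looked up by the caller for n1 + n2 - 2 degrees of freedom.
    """
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    half_width = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    d = mean(x) - mean(y)
    return d - half_width, d + half_width

mine1 = [8260, 8130, 8350, 8070, 8340]
mine2 = [7950, 7890, 7900, 8140, 7920, 7840]

low, high = pooled_ci(mine1, mine2, t_crit=3.25)   # 99% interval
print(round(low, 1), round(high, 1))
```

The pooled interval is somewhat narrower than the 99% interval in the Minitab output, which appears to use the unpooled standard error with 7 degrees of freedom. Both intervals exclude 0, in agreement with rejecting the null hypothesis at the 0.01 level.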
50CI for large sample
51Matched pairs comparisons
- Question: Are the samples independent in the application of the two-sample t test?
- For instance, the test cannot be used when we deal with before-and-after data, where the data are naturally paired.
- EX: A manufacturer is concerned about the loss of weight of ceramic parts during a baking step. Let the pair of random variables $(X_i, Y_i)$ denote the weight before and weight after baking for the i-th specimen.
52Statistical analysis
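- Consider the differences $D_i = X_i - Y_i$ between the two responses in each pair.
- This collection of differences is treated as a random sample of size n from a population having mean $\mu_D$.
$\mu_D = 0$ indicates the means of the two responses are the same.
Null hypothesis: $\mu_D = \mu_{D,0}$, tested with the one-sample statistic $t = \dfrac{\bar{D} - \mu_{D,0}}{S_D/\sqrt{n}}$ with $n-1$ degrees of freedom.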
53EX
- The following are the average weekly losses of worker-hours due to accidents in 10 industrial plants before and after a certain safety program was put into operation:
- Before 45 73 46 124 33 57 83 34 26 17
- After 36 60 44 119 35 51 77 29 24 11
- Question: Use the 0.05 level of significance to test whether the safety program is effective.
54Solution
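1. Null hypothesis: $\mu_D = 0$. Alternative hypothesis: $\mu_D > 0$
2. Level of significance: 0.05
3. Criterion: reject the null hypothesis if $t > 1.833$ ($t_{0.05}$ with 9 degrees of freedom)
4. Calculation: $t = \dfrac{5.2 - 0}{4.08/\sqrt{10}} = 4.03$
5. The null hypothesis must be rejected at level 0.05.
6. P-value: $1 - 0.9985 = 0.0015 <$ level of significance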
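The paired calculation can be reproduced from the before and after data given in the example. This is a minimal sketch using only the standard library; the critical value 1.833 is the one stated in the criterion and is supplied by hand.

```python
from math import sqrt
from statistics import mean, stdev

before = [45, 73, 46, 124, 33, 57, 83, 34, 26, 17]
after = [36, 60, 44, 119, 35, 51, 77, 29, 24, 11]

# Work with the within-plant differences, not the two samples separately.
d = [b - a for b, a in zip(before, after)]
n = len(d)
d_bar, s_d = mean(d), stdev(d)

t = d_bar / (s_d / sqrt(n))      # one-sample t for H0: mu_D = 0
reject = t > 1.833               # t_{0.05} with n - 1 = 9 degrees of freedom
print(d_bar, round(s_d, 2), round(t, 2), reject)
```

Treating the two columns as independent samples would ignore the large plant-to-plant variation that pairing removes, which is why the test is run on the differences.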
55Confidence interval
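- A 90% confidence interval for the mean of a paired difference.
- Solution: since the n = 10 differences have the mean 5.2 and standard deviation 4.08, and $t_{0.05} = 1.833$ for 9 degrees of freedom, the interval is $5.2 \pm 1.833 \cdot 4.08/\sqrt{10}$, or $2.84 < \mu_D < 7.56$.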
567.9 Design issues Randomization and Pairing
Randomization of treatments prevents uncontrolled sources of variation from exerting a systematic influence on the response.
Pairing according to some variable(s) thought to influence the response will remove the effect of that variable from the analysis.
Randomizing the assignment of treatments within a pair helps prevent any other uncontrolled variables from influencing the responses in a systematic manner.
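Randomizing within pairs amounts to one fair coin flip per pair. The sketch below is one possible way to do it. The unit labels (modeled on the shoe example earlier in the deck), the treatment names "A" and "B", and the function `randomize_within_pairs` are all hypothetical.

```python
import random

def randomize_within_pairs(pairs, treatments=("A", "B"), seed=None):
    """For each pair of units, randomly decide which unit receives which treatment."""
    rng = random.Random(seed)    # a seed makes the assignment reproducible
    assignment = []
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)       # equivalent to a fair coin flip for this pair
        assignment.append({first: order[0], second: order[1]})
    return assignment

# Hypothetical experimental units: the two shoes of each pair.
pairs = [(f"left shoe {i}", f"right shoe {i}") for i in range(1, 6)]
result = randomize_within_pairs(pairs, seed=7)
for row in result:
    print(row)
```

Every pair receives both treatments, so the pairing is preserved, while the side that gets each treatment is left to chance.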
57Data Collection and Analysis
- Practical project.
- 1. Data
- Your goal is to see how the American and National
leagues compare on these variables.