Title: Further Inference
1. Chapter 7
2. Inference when the population standard deviation is unknown
There is one problem: we do not usually know σ, so we cannot calculate σ/√n, the standard deviation of x̄. We could use the sample standard deviation, s, however.
3. The t-distribution
- Just as z = (x̄ − μ)/(σ/√n) has a standard normal (Z) distribution, either when X is normal or n is large,
- so does t = (x̄ − μ)/(s/√n) follow a t-distribution with n − 1 degrees of freedom.
4. The t-distribution
- The t-distribution looks almost like the standard normal distribution in that it is symmetric about zero.
- However, the tails of the t-distribution are fatter than those of the standard normal.
- This is to take into account the use of the sample standard deviation (s) instead of the population standard deviation (σ).
7. The t-distribution
- Table values used to construct confidence intervals using the t-distribution will be different from those for the standard normal.
- The shape of the t-distribution depends on the degrees of freedom (df). When used to compute confidence intervals for the mean (μ), df = n − 1.
- When the df are very large, the t-distribution is close to the standard normal distribution (z-distribution).
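The fatter tails and the convergence to the standard normal can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `t_pdf`, the evaluation point x = 3, and the df values are illustrative choices, not part of the slides.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    # Work on the log scale so that large df does not overflow the gamma function.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

z_pdf = NormalDist().pdf  # standard normal density

# Compare the height of each curve far out in the tail (x = 3).
for df in (2, 19, 100, 1000):
    print(f"df = {df:4d}: t density at 3 = {t_pdf(3, df):.5f}")
print(f"standard normal density at 3 = {z_pdf(3):.5f}")
```

The t density at 3 shrinks toward the normal density as df grows, which is the "fatter tails" behaviour described above.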
8. 95% C.I. for μ using the t-distribution
Suppose you have collected a sample of 20 observations from a normal distribution; your sample mean is 5.5 and your sample standard deviation is 1.7.
df = 19, t = 2.093
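A minimal sketch of the interval computation for this example, x̄ ± t · s/√n. The function name `t_confidence_interval` is hypothetical; the critical value 2.093 is the table value quoted above and is passed in rather than computed.

```python
import math

def t_confidence_interval(xbar, s, n, t_star):
    """C.I. for the mean: xbar +/- t_star * s / sqrt(n), with t_star from a t-table (df = n - 1)."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Values from the example: n = 20, sample mean 5.5, sample std. dev. 1.7, t = 2.093 (df = 19).
low, high = t_confidence_interval(xbar=5.5, s=1.7, n=20, t_star=2.093)
print(f"95% C.I. for mu: ({low:.3f}, {high:.3f})")
```

With these inputs the interval is roughly 4.70 to 6.30.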
9. Group Work: 90% C.I. for μ
Suppose you have collected a sample of 25 observations from a normal distribution; your sample mean is 8.0 and your sample standard deviation is 2.1.
df = 24, t = ____
10. The t-procedure for C.I.s is robust. For very small samples (n < 15), check for normality with skewness & kurtosis measures (both must be within ±1) and for the absence of outliers. For 15 < n < 40, the t-procedure is O.K. as long as skewness & kurtosis are within ±3 and there are no outliers. For n > 40, the t-procedure is O.K.
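The rule of thumb can be written as a small check. This is a sketch only: the function name `t_procedure_ok` is hypothetical, and because the rule does not say what happens at exactly n = 15 or n = 40, the code assumes n = 15 falls under the middle rule and n = 40 under the large-sample rule.

```python
def t_procedure_ok(n, skewness, kurtosis, has_outliers):
    """Apply the sample-size rule of thumb for using the t-procedure."""
    if n >= 40:                       # assumption: n = 40 counts as large
        return True
    limit = 1.0 if n < 15 else 3.0    # assumption: n = 15 uses the +/-3 limit
    within = abs(skewness) <= limit and abs(kurtosis) <= limit
    return within and not has_outliers

print(t_procedure_ok(n=12, skewness=0.4, kurtosis=-0.8, has_outliers=False))  # True
print(t_procedure_ok(n=12, skewness=1.6, kurtosis=0.2, has_outliers=False))   # False
print(t_procedure_ok(n=25, skewness=1.6, kurtosis=0.2, has_outliers=False))   # True
```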
12. Hypothesis Test Example 1
- What are we given? n = 400, s = 15, x̄ = 23, α = .05
- Step 1, establish hypotheses
- H0: μ = 20 vs. Ha: μ ≠ 20
- Step 2, set significance level: α = .05 (given)
- Step 3, compute the test statistic
- t = (23 − 20)/0.75 = 4.0, where 0.75 = 15/√400 is the standard error of x̄
- Step 4, estimate the p-value. Using df = 100 (a conservative table row; the actual df is n − 1 = 399), the t-table gives P(T > 3.390) = .0005. So P(T > 4.0) < .0005. Since the test is two-tailed, p-value < 2 × .0005, i.e. < .001
- Step 5, decision: reject H0 since p-value (< .001) < α = .05
- Step 6, conclusion within context: none given
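13. Since we have a two-tailed test, p-value = 2 × P(T > t). 2 × .0005 = .001, so p-value < .001 < α (.05). Reject H0 since p-value < α.
[Figure: upper tail of the t-distribution. The area to the right of 3.390 is .0005; the area to the right of t = 4.0 (½ p-value) is < .0005]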
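The arithmetic of Example 1 can be reproduced as follows. This is a minimal sketch: the function name `one_sample_t` is hypothetical, and the p-value is only bounded using the table entry quoted above (P(T > 3.390) = .0005 at df = 100), not computed exactly.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """Return (standard error, t statistic) for a one-sample t-test."""
    se = s / math.sqrt(n)
    return se, (xbar - mu0) / se

se, t = one_sample_t(xbar=23, mu0=20, s=15, n=400)

# Table entry from the example: P(T > 3.390) = .0005 at df = 100.
table_t, table_tail = 3.390, 0.0005

# Two-tailed test: if |t| exceeds the table value, the p-value is below 2 * tail area.
p_upper_bound = 2 * table_tail if abs(t) > table_t else None

alpha = 0.05
reject = p_upper_bound is not None and p_upper_bound < alpha
print(f"SE = {se:.2f}, t = {t:.1f}, p-value < {p_upper_bound}, reject H0: {reject}")
```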
14. Macro Output for Example 1
15. Hypothesis Test Example 2
- What are we given? n = 30, s = 15, x̄ = 21, α = .05
- Step 1, establish hypotheses (given)
- H0: μ = 20 vs. Ha: μ > 20
- Step 2, set significance level: α = .05 (given)
- Step 3, compute the test statistic
- t = (21 − 20)/2.74 = 0.365
- Step 4, estimate the p-value. Using df = 29, the t-table gives P(T > 0.683) = .25. So P(T > 0.365) > .25. Since the test is one-tailed, this is the p-value (> .25)
- Step 5, decision: fail to reject H0 since p-value (> .25) > α = .05
- Step 6, conclusion within context: none given
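16. SE(x̄) = 15/√30 = 2.74
Using df = 29, the t-table gives P(T > 0.683) = .25, so p-value = P(T > t) > .25
[Figure: t-distribution with df = 29. The area to the right of 0.683 is 0.25; t = 0.365 lies to the left of 0.683]
17. Macro Output for Example 2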
18. Hypothesis Test Example 3
- What are we given? n = 30, s = 15, x̄ = 18.7, α = .05
- Step 1, establish hypotheses (given)
- H0: μ = 20 vs. Ha: μ < 20
- Step 2, set significance level: α = .05 (given)
- Step 3, compute the test statistic
- t = (18.7 − 20)/2.74 = −0.474
- Step 4, estimate the p-value. Using df = 29, the t-table gives P(T < −0.683) = .25. So P(T < −0.474) > .25. Since the test is one-tailed, this is the p-value (> .25)
- Step 5, decision: fail to reject H0 since p-value (> .25) > α = .05
- Step 6, conclusion within context: none given
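19. Using df = 29, the t-table gives P(T < −0.683) = .25, so p-value = P(T < t) > .25
[Figure: t-distribution with df = 29. The area to the left of −0.683 is .25; t = −0.474 lies to the right of −0.683]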
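The table only bounds the one-tailed p-values in Examples 2 and 3 (> .25). As a rough check, the tail areas can be approximated numerically. This is a sketch under stated assumptions: the helpers `t_pdf` and `t_upper_tail` are hypothetical names, and the tail area is obtained by Simpson's rule rather than from a table or a statistics package. Because the standard error is not rounded to 2.74 here, the t values differ slightly from the hand calculations.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) by Simpson's rule on [0, |t|], using symmetry about zero."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3        # P(T > |t|)
    return upper if t >= 0 else 1 - upper

se = 15 / math.sqrt(30)                # standard error, about 2.74
t2 = (21 - 20) / se                    # Example 2, Ha: mu > 20
t3 = (18.7 - 20) / se                  # Example 3, Ha: mu < 20

p2 = t_upper_tail(t2, df=29)           # P(T > t2)
p3 = 1 - t_upper_tail(t3, df=29)       # P(T < t3)
print(f"Example 2: t = {t2:.3f}, p-value = {p2:.3f}")
print(f"Example 3: t = {t3:.3f}, p-value = {p3:.3f}")
```

Both p-values come out above .25, in line with the table-based conclusion to fail to reject H0.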
22. Confidence Intervals and Hypothesis Tests for Comparing Two Means
23. Difference between Means
- Do female students perform better than male students on average?
- We can answer this by drawing random samples of female students and male students and looking to see how far apart the sample means are.
24. Likely scenario when female and male students form a single population with one mean
These two sample means probably ARE NOT very far apart.
[Figure: one population, mean μ1]
25. If they form two populations with two means?
However, if the true means are different, we are more likely to get two sample means which are far apart.
[Figure: two populations, means μ1 and μ2]
27. Difference between Means
- If the sample means are far apart (statistically), then we conclude that the population means are different.
- We will first calculate the difference between the sample means and then compute a t-statistic (like a z-score) to measure the distance apart in terms of standard deviations (called the standard error here).
28. To test the hypothesis H0: μ1 = μ2 when σ1 and σ2 are unknown, we use the equivalent two-sample t-statistic
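A sketch of the two-sample t-statistic in its general form, assuming the unpooled standard error in which the sample standard deviations s1 and s2 replace σ1 and σ2. The sample sizes n1 and n2 are symbols introduced here for illustration.

```latex
t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}
         {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```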
29. To test the hypothesis H0: μ1 = μ2 using the two-sample t-statistic, we recognize that under H0, μ1 − μ2 = 0. So our test statistic becomes
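Under H0 the (μ1 − μ2) term drops out, leaving the difference in sample means divided by its standard error. The following minimal sketch assumes the unpooled standard error √(s1²/n1 + s2²/n2); the function name `two_sample_t` and all input numbers are hypothetical and are not taken from the course data.

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Two-sample t statistic under H0: mu1 - mu2 = 0 (unpooled standard error assumed)."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar1 - xbar2) / se, se

# Hypothetical summary statistics, chosen only to illustrate the arithmetic.
t, se = two_sample_t(xbar1=10.0, s1=3.0, n1=36, xbar2=8.0, s2=4.0, n2=64)
print(f"standard error = {se:.4f}, t = {t:.4f}")
```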
33. Healthy Companies vs. Failed Companies
To test the hypothesis "Is the mean current ratio for healthy companies greater than the mean current ratio for failed companies?" we need H0: μ1 = μ2 vs. Ha: μ1 > μ2, which is the same as writing H0: μ1 − μ2 = 0 vs. Ha: μ1 − μ2 > 0.
34. Excel Output
Back-up hand calculations
35. Macro Output
36. Healthy Companies vs. Failed Companies
Suppose we choose α = .01. Then we compare our p-value against α. The decision rule is: reject H0 (or conclude Ha) if p-value < α. Since p-value < .0005, which is < α = .01, we reject H0 (conclude Ha). The test is highly significant since the p-value is so small. We conclude that there is overwhelming evidence that the mean current ratio among healthy companies is greater than that among failed companies.
37. Group Work
Test the hypothesis, at the 5% significance level, that the mean salaries for males and females are different.
38. Excel Output for ex. 7.97
39. Confidence Intervals for the Difference Between Two Means
40. Difference between Means
- If the sample means are far apart (statistically), then we conclude that the population means are different.
- We will first calculate the difference between the sample means and then construct a confidence interval around that difference.
41. Difference between Means
- What value will we look for (in the confidence interval) to see whether the means are the same?
- Zero
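A minimal sketch of the interval for μ1 − μ2 and the check for zero. Several things here are assumptions: the function name `diff_confidence_interval`, the unpooled standard error, the choice of df as the smaller sample size minus one, and every input number. The critical value 2.093 is the 95% table value for df = 19.

```python
import math

def diff_confidence_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """C.I. for mu1 - mu2: (xbar1 - xbar2) +/- t_star * sqrt(s1^2/n1 + s2^2/n2)."""
    diff = xbar1 - xbar2
    margin = t_star * math.sqrt(s1**2 / n1 + s2**2 / n2)
    return diff - margin, diff + margin

# Hypothetical summary statistics; df taken as min(n1, n2) - 1 = 19 (an assumption).
low, high = diff_confidence_interval(10.0, 3.0, 20, 8.0, 4.0, 25, t_star=2.093)
contains_zero = low <= 0 <= high
print(f"95% C.I. for mu1 - mu2: ({low:.3f}, {high:.3f}); contains zero: {contains_zero}")
```

If the interval contains zero, as it does for these made-up numbers, the data do not rule out equal population means.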
45Excel Descriptive Statistics and hand
calculations
46Output from Macro
47Group Work
Compute a 95 C.I. for the difference mm-mf
48Sample statistics for ex. 7.97
49Macro Output for ex. 7.97