Review

About This Presentation

Title:

Review

Description:

Assume donations for the two populations are normally distributed. ... in charge of the production of car seats are concerned about the compliance of ...

Slides: 75

Provided by: businessF

Transcript and Presenter's Notes

1
Review 2

Chapter 9
Chapter 10
Chapter 11 and 12

2
Chapter 9 Sampling Distributions

A statistic is a random variable describing a
characteristic of a random sample.
Sample mean
Sample variance
We use statistic values in inferential statistics
(making inferences about population characteristics
from sample characteristics).
Statistics have distributions of their own.

3
Chapter 9 The Central Limit Theorem

The distribution of the sample mean is normal if
the parent distribution is normal.
The distribution of the sample mean approaches
the normal distribution for sufficiently large
samples (n ≥ 30), even if the parent
distribution is not normal.
The parameters of the sampling distribution of the
mean are
Mean: μx̄ = μ
Standard deviation: σx̄ = σ/√n
(Assumption: The population is sufficiently
large. No correction is needed in the
calculation of the variance.)

4
Chapter 9 The Central Limit Theorem

Problem 1 (Using Excel) Given a normal
population whose mean is 50 and whose standard
deviation is 5,
Question 1 Find the probability that a random
sample of 4 has a mean between 49 and 52
Answer: P(49 < x̄ < 52) = P(−.4 < Z < .8)
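The Excel calculation for Question 1 can also be sketched in Python. This is a minimal illustration using only the standard library's statistics.NormalDist; the variable names are illustrative and not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

# Problem 1: normal population with mean 50 and standard deviation 5
mu, sigma, n = 50, 5, 4

# Sampling distribution of the mean: Normal(mu, sigma / sqrt(n))
sampling_dist = NormalDist(mu, sigma / sqrt(n))

# P(49 < sample mean < 52), equivalent to P(-.4 < Z < .8)
p = sampling_dist.cdf(52) - sampling_dist.cdf(49)
print(round(p, 4))
```

Changing n to 16 gives the setup for Question 2 with the same code.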
5
Chapter 9 The Central Limit Theorem
Normal table

Problem 1 (Using the table) Given a normal
population whose mean is 50 and whose standard
deviation is 5,
Question 1 Find the probability that a random
sample of 4 has a mean between 49 and 52
Answer: P(49 < x̄ < 52) = P(−.4 < Z < .8)
= .1554 + .2881 = .4435
6
Chapter 9 The Central Limit Theorem
Normal table

Problem 1
Question 2 Find the probability that a random
sample of 16 has a mean between 49 and 52.
Answer

7
Chapter 9 The Central Limit Theorem
Normal table

Problem 2 The amount of time per day spent by
adults watching TV is normally distributed with
μ = 6 and σ = 1.5 hours.
Question 1 What is the probability that a
randomly selected adult watches TV for more
than 7 hours a day?
Answer

Question 2 What is the probability that 5 adults
watch TV on the average 7 or more hours?
Answer

8
Chapter 9 The Central Limit Theorem
Normal table

Problem 2
Question 3 What is the probability that the
total time of watching TV of the five adults will
not exceed 28 hours?
Answer
Question 4 What total TV watching time is
exceeded by only 3% of the population for samples
of 5 adults?

Comments: 1. Excel returns X for a given left-hand
tail probability. 2. .670822 = 1.5/5^.5
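The four questions of Problem 2 can be worked through in a short Python sketch. It assumes the standard library's NormalDist as a stand-in for the Excel functions mentioned in the comments; all variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Problem 2: daily TV time is Normal(mu = 6, sigma = 1.5) hours
mu, sigma, n = 6, 1.5, 5

single = NormalDist(mu, sigma)               # one adult
mean_of_5 = NormalDist(mu, sigma / sqrt(n))  # mean of 5 adults

# Question 1: one adult watches more than 7 hours
p_q1 = 1 - single.cdf(7)

# Question 2: the mean of 5 adults is 7 hours or more
p_q2 = 1 - mean_of_5.cdf(7)

# Question 3: total of 5 adults at most 28 hours, i.e. mean at most 28/5
p_q3 = mean_of_5.cdf(28 / n)

# Question 4: total exceeded by only 3% of samples.
# inv_cdf takes a left-hand tail probability, like the Excel inverse function.
total_q4 = n * mean_of_5.inv_cdf(1 - 0.03)

print(round(p_q1, 4), round(p_q2, 4), round(p_q3, 4), round(total_q4, 2))
```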
9
Chapter 9 The Central Limit Theorem
Normal table

Problem 3
Assume that the mean monthly rent paid by students
in a particular town is 350 with a standard
deviation of 40. A random sample of 100 students
who rented apartments was taken.
Question 1 What is the probability that the
sample mean of the monthly rent exceeds 355?

10
Chapter 9 The Central Limit Theorem
Normal table

Problem 3 - continued
Question 2 What is the probability that the
total revenue from renting 10 randomly selected
apartments falls between 3300 and 3700 dollars?

11
Chapter 9 The Central Limit Theorem
Normal table

Problem 3 - continued
Question 3 Let's assume the population mean was
unknown, but the standard deviation was known to
be 40. A sample of 100 rentals was selected in
order to estimate the mean monthly rent paid by
the whole student population. What is the
probability that the sample mean differs from the
actual mean by more than 5? How about more than
10?
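Question 3 does not need the unknown mean, because only the distance between the sample mean and the true mean matters. A minimal Python sketch of that reasoning follows; the helper name prob_error_exceeds is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Problem 3, Question 3: sigma = 40 is known, n = 100, mu is unknown
sigma, n = 40, 100
std_error = sigma / sqrt(n)  # standard deviation of the sample mean
z = NormalDist()             # standard normal

def prob_error_exceeds(margin):
    """P(|sample mean - mu| > margin), a two-tail probability."""
    return 2 * (1 - z.cdf(margin / std_error))

p_more_than_5 = prob_error_exceeds(5)
p_more_than_10 = prob_error_exceeds(10)
print(round(p_more_than_5, 4), round(p_more_than_10, 4))
```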

12
Chapter 9 The Central Limit Theorem

Problem 3
continued

13
Chapter 9 Sampling distribution of the sample
proportion

In a sample of size n, if np > 5 and n(1 − p) > 5,
then the sample proportion p̂ = x/n is
approximately normally distributed with the
following parameters:
Mean: p
Standard deviation: √(p(1 − p)/n)

(Assumption: The population is sufficiently
large. No correction is needed in the
calculation of the variance.)
14
Sampling distribution of the sample proportion

Problem 4
A commercial for a household appliance
manufacturer claims that less than 5% of all of
its products require a service call in the first
year.
A survey of 400 households that recently
purchased the manufacturer's products was conducted
to check the claim.

15
Sampling distribution of the sample proportion
Normal table

Problem 4 - Continued Assuming the
manufacturer is right, what is the probability
that more than 10% of the surveyed households
require a service call within the first year?

If indeed 10% of the sampled households reported
a call for service within the first year, what
does it tell you about the manufacturer's
claim?
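Problem 4 can be sketched in Python with the normal approximation to the sample proportion. The sketch assumes the boundary value p = .05 stands for "the manufacturer is right"; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Problem 4: claimed service-call rate (boundary value) and sample size
p, n = 0.05, 400

# Conditions for the normal approximation: np > 5 and n(1 - p) > 5
approximation_ok = n * p > 5 and n * (1 - p) > 5

# Standard deviation of the sample proportion
std_error = sqrt(p * (1 - p) / n)

# P(sample proportion > .10) when the true proportion is .05
z_value = (0.10 - p) / std_error
p_tail = 1 - NormalDist().cdf(z_value)
print(approximation_ok, round(z_value, 2), p_tail)
```

A tail probability this small suggests that a sample proportion of 10% would be very unlikely if the claim were true.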
16
Sampling Distribution of the Difference Between
two Means

If two independent variables are normally
distributed with means and variances μ1, σ1²,
and μ2, σ2² respectively, then x̄1 − x̄2 is also
normally distributed with
Mean: μ1 − μ2
Variance: σ1²/n1 + σ2²/n2

17
Sampling Distribution of the Difference Between
two Means

When at least one of the populations is not
normally distributed but the sample sizes are
both at least 30, x̄1 − x̄2 is approximately
normally distributed, with a mean and a variance
as indicated above.

18
Sampling Distribution of the Difference Between
two Means

Example A national TV telethon committee is
interested in determining whether donations made
by males are on the average larger than those
made by females by 4. Two samples of 25 males
and 25 females were selected, and the donations
made were recorded. If the standard deviations of the
male and female populations are 2.4 and 1.8
respectively, what is the probability that the sample
mean of the male donations exceeds the sample
mean of the female donations by at least 5?
Assume donations for the two populations are
normally distributed.

19
Sampling Distribution of the Difference Between
two Means

Solution

For males: σ1 = 2.4, n1 = 25. For females: σ2 = 1.8, n2 = 25.
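The telethon example can be sketched in Python. The calculation assumes that the true difference between the population means is 4, which the example implies but does not state outright; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Telethon example: population standard deviations and sample sizes
sigma_m, n_m = 2.4, 25   # males
sigma_f, n_f = 1.8, 25   # females
assumed_diff = 4         # ASSUMPTION: true mean difference (males - females)

# Standard deviation of the difference between the two sample means
std_error = sqrt(sigma_m**2 / n_m + sigma_f**2 / n_f)

# P(difference of sample means >= 5)
diff_dist = NormalDist(assumed_diff, std_error)
p_at_least_5 = 1 - diff_dist.cdf(5)
print(round(std_error, 2), round(p_at_least_5, 4))
```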
20
Chapter 10 Introduction to Estimation

A population parameter can be estimated by a
point estimator and by an interval estimator.
A confidence interval with a 1 − α confidence level
is an interval estimator that covers the
estimated parameter (1 − α)100% of the time.
Confidence intervals are constructed using
sampling distributions.

21
Confidence interval of the mean Known Variance

We use the central limit theorem to build the
following confidence interval:
x̄ ± Zα/2 σ/√n

22
Confidence interval of the mean Known Variance

Problem 5 How many classes do university students
miss each semester? A survey of 100 students was
conducted. (See Data next)
Assuming the standard deviation of the number of
classes missed is 2.2, estimate the mean number
of classes missed per student. Use a 99% confidence
level.

23
Confidence interval of the mean Known Variance
Data

Solution: x̄ ± Zα/2 σ/√n = 10.21 ± 2.575(2.2/√100)
= 10.21 ± .57

1 − α = .99, α = .01, α/2 = .005, Zα/2 = Z.005 = 2.575
LCL = 9.64, UCL = 10.78
You can use Data Analysis Plus > Z-Estimate Mean
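As an alternative to the Excel add-in, the interval for Problem 5 can be sketched in Python. The sample mean 10.21 is taken from the solution above; the function name z_interval is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 2.575 for 99%
    half_width = z_crit * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Problem 5: sample mean 10.21, sigma 2.2, n = 100, 99% confidence
lcl, ucl = z_interval(10.21, 2.2, 100, 0.99)
print(round(lcl, 2), round(ucl, 2))
```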
24
Confidence interval of the mean Known Variance
Data

Solution (using Data Analysis Plus)
Shade the data set (you may include the title
label)
Select Data Analysis Plus, then Z-Estimate
Mean
Type in the sigma (2.2), check Labels (if
appropriate), type in alpha (.01), click OK.

25
Selecting the sample size

The shorter the confidence interval, the more
accurate the estimate.
We can, therefore, limit the width of the
interval to 2W, and get W = Zα/2 σ/√n.
From here we have n = (Zα/2 σ/W)².

W is called the margin of error, or the bound on the
error of estimation.
26
Selecting the sample size

Problem 6 An operations manager wants to estimate
the average amount of time needed by a worker to
assemble a new electronic component.
Sigma is known to be 6 minutes.
The required estimate accuracy is within 20
seconds.
The confidence level is 90%; 95%.
Find the sample size.

27
Selecting the sample size

Solution
σ = 6 min; W = 20 sec = 1/3 min
1 − α = .90, Zα/2 = Z.05 = 1.645
1 − α = .95, Zα/2 = Z.025 = 1.96
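The sample sizes for Problem 6 follow from n = (Zα/2 σ/W)², rounded up to the next whole observation. A minimal Python sketch, with the hypothetical helper name required_sample_size:

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest n for which the margin of error is at most `margin`."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z_crit * sigma / margin) ** 2)

# Problem 6: sigma = 6 minutes, W = 20 seconds = 1/3 minute
n_90 = required_sample_size(6, 1 / 3, 0.90)
n_95 = required_sample_size(6, 1 / 3, 0.95)
print(n_90, n_95)
```

Note that sigma and the margin must be expressed in the same unit (minutes here) before the formula is applied.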

28
Chapter 11 Hypotheses tests

In hypothesis tests we hypothesize on a value of
a population parameter, and test to see if there
is sufficient evidence to support our belief.
The structure of a hypothesis test:
Formulate two hypotheses.
H0: The null hypothesis, the one we try to reject in favor of H1.
H1: The alternative hypothesis, the one we try to
prove.
Define a significance level α.

29
Hypotheses tests

The significance level is the probability of
erroneously rejecting the null hypothesis.
α = P(reject H0 when H0 is true)
Sample from the population and calculate a
statistic that provides an indication whether or
not the parameter value under H1 is more likely
to be true.
We shall test the population mean assuming the
standard deviation is known.

30
Hypotheses tests of the Mean Known Variance

Problem 7 A machine is set so that the average
diameter of ball bearings it produces is .50
inch. In a sample of 100 ball bearings the mean
diameter was .51 inch. Assuming the standard
deviation is .05 inch, can we conclude at the 5%
significance level that the mean diameter is not
.50 inch?

31
Hypotheses tests of the Mean Known Variance

Solution: The population studied is the
ball-bearing diameters.
We hypothesize on the population mean.
A good point estimator for the population mean is
the sample mean.
We use the distribution of the sample mean to
build a sample statistic to test whether μ = .50
inch.

32
Hypotheses tests of the Mean Known Variance

Solution (A Two Tail rejection region)
Define the hypotheses:
H0: μ = .50
H1: μ ≠ .50

α = .05: the probability of committing a type I error
33
Hypotheses tests of the Mean Known Variance
Solution - A Two Tail rejection region
Critical Z
Z.025 = 1.96 (obtained from the Z-table). Build a
rejection region: Zsample > Zα/2, or
Zsample < −Zα/2, that is, Zsample > 1.96 or
Zsample < −1.96.
Calculate the value of the sample Z statistic
and compare it to the critical value:
Zsample = (.51 − .50)/(.05/√100) = 2
Since 2 > 1.96, there is sufficient evidence to
reject H0 in favor of H1 at the 5% significance
level.
34
Hypotheses tests of the Mean Known Variance
Solution - A Two Tail rejection region

We can perform the test in terms of the mean
value.
Let us find the critical mean values for
rejection:
x̄L2 = μ0 + Z.025 σ/√n = .50 + 1.96(.05)/(100)^1/2
= .5098
x̄L1 = μ0 − Z.025 σ/√n = .50 − 1.96(.05)/(100)^1/2
= .4902

Since .51 > .5098, there is sufficient evidence to
reject the null hypothesis at the 5% significance
level.
35
Hypotheses tests of the Mean Known Variance

Calculate the p-value of this test
Solution: p-value = P(Z > Zsample) + P(Z <
−Zsample) = P(Z > 2) + P(Z < −2) = 2P(Z > 2)
= 2(1 − .9772) = .0456
Since .0456 < .05, H0 is rejected.
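The two-tail test of Problem 7 can be sketched in Python, covering both the rejection region approach and the p-value approach. The variable names are illustrative; the exact p-value differs from the table-based .0456 only by rounding.

```python
from math import sqrt
from statistics import NormalDist

# Problem 7: H0: mu = .50 against H1: mu != .50
mu0, sample_mean, sigma, n, alpha = 0.50, 0.51, 0.05, 100, 0.05
z = NormalDist()

# Standardized test statistic
z_sample = (sample_mean - mu0) / (sigma / sqrt(n))

# Rejection region approach: reject if |Z| exceeds the critical value
z_crit = z.inv_cdf(1 - alpha / 2)  # about 1.96
reject_by_region = abs(z_sample) > z_crit

# p-value approach: reject if the two-tail p-value is below alpha
p_value = 2 * (1 - z.cdf(abs(z_sample)))
reject_by_p_value = p_value < alpha

print(round(z_sample, 2), round(p_value, 4), reject_by_region, reject_by_p_value)
```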

36
Hypotheses tests of the Mean Known Variance

Problem 8
The average annual return on investment for
American banks was found to be 10.2% with a
standard deviation of 0.8%.
It is believed that banks that exercise
comprehensive planning do better.
A sample of 26 banks that exercise comprehensive
training provided the following result: Mean
return 10.5%
Can we infer that the belief about bank
performance is supported at the 10% significance
level by this sample result?

37
Hypotheses tests of the Mean Known Variance
Data

Solution (A right-hand tail rejection
region) The population tested is the annual rate
of return.
H0: μ = 10.2
H1: μ > 10.2
Let us perform the test with the standardized
rejection region approach: Zsample > Z.10
(right-hand tail rejection region), Z.10 = 1.28.
Reject H0 if Zsample > 1.28.

38
Hypotheses tests of the Mean Known Variance

Conclusion
At the 10% significance level there is sufficient
evidence in the data to reject H0 in favor of H1,
since the sample statistic falls inside the
rejection region.
Interpretation
If we are willing to accept a 10% chance of making
the wrong conclusion, we can conclude that banks
conducting comprehensive training perform better
than banks that do not.

39
Hypotheses tests of the Mean Known Variance
Data

Let us perform the test with the p-value method:
P(x̄ > 10.5 given that μ = 10.2) = P(Z > (10.5 −
10.2)/(.8/(26)^1/2)) = P(Z > 1.91) = .5 − .4719
= .0281
Since .0281 < .10 we reject the null hypothesis
at the 10% significance level.

40
Hypotheses tests of the Mean Known Variance

Note the equivalence between the standardized
method (the rejection region method) and the
p-value method.
P(Z > Z.10) = .10; Z.10 = 1.28

The statement "the p-value is smaller than alpha" is
equivalent to the statement "the test statistic
falls in the rejection region":
Zsample = 1.91 > Z.10 = 1.28
41
Hypotheses tests of the Mean Known Variance

Problem 9
In the midst of labor-management negotiations,
the president of a company argues that the
company's blue-collar workers, who are paid an
average of 30K a year, are well-paid because the
mean annual pay for blue-collar workers in the
country is less than 30K.
This figure is disputed by the union. To test the
president's belief an arbitrator draws a random
sample of 350 blue-collar workers from across the
country and records their incomes (see file
Salaries).
If the arbitrator assumes that income is normally
distributed with a standard deviation of 8,000,
can it be inferred at the 5% significance level that
the company's president is correct?

42
Hypotheses tests of the Mean Known Variance
Data

Solution (A left-hand tail rejection region) The
population tested is the annual salary.
H0: μ = 30K; H1: μ < 30K
Left-hand tail rejection region: Z < −Z.05, or Z <
−1.645.
Zsample = (29,119.5 − 30,000)/(8,000/350^.5) =
−2.059
Since −2.059 < −1.645 there is sufficient
evidence to infer that on the average blue-collar
workers' income is lower than 30K at the 5%
significance level.

43
Hypotheses tests of the Mean Known Variance

Calculate the p-value of this test
Solution: p-value = P(Z < Zsample) = P(Z < −2.059)

44
Type II Error

Problem 7a Calculate β for the two-tail
hypothesis test performed in Problem 7, when the
actual mean diameter is .515 inch.
Solution
The rejection region in terms of the critical
values of the sample mean was found before: x̄L1 =
.4902, x̄L2 = .5098.
β = P(Do not reject H0 when H1 is true)
= P(.4902 < x̄ < .5098 when μ = .515)
= P((.4902 − .515)/(.05/(100)^.5) < Z <
(.5098 − .515)/(.05/(100)^.5)) = P(−4.96 < Z < −1.04)
= P(1.04 < Z < 4.96)
≈ 1 − .8508 = .1492
This large probability may be reduced by taking
larger samples.

H0: μ = .500; H1: μ = .515
P(Z < 4.96) − P(Z < 1.04) ≈ 1 − P(Z < 1.04)
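The type II error calculation can be sketched in Python. The sketch rebuilds the critical sample means from the Problem 7 values (μ0 = .50, σ = .05, n = 100, critical Z of 1.96) and then evaluates them under the alternative mean .515; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Problem 7 setup and the alternative mean from Problem 7a
mu0, sigma, n = 0.50, 0.05, 100
z_crit = 1.96       # critical value for the two-tail test at 5%
true_mean = 0.515   # actual mean under H1

std_error = sigma / sqrt(n)

# Critical sample means: H0 is not rejected between these two values
lower = mu0 - z_crit * std_error
upper = mu0 + z_crit * std_error

# beta = P(lower < sample mean < upper) when the mean is really .515
actual_dist = NormalDist(true_mean, std_error)
beta = actual_dist.cdf(upper) - actual_dist.cdf(lower)
print(round(lower, 4), round(upper, 4), round(beta, 4))
```

Rerunning the sketch with a larger n shows how the type II error probability shrinks as the sample grows.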
45
Ch 12 Inference when the Variance is Unknown

Generally, the variance may be unknown.
In this case we change the test statistic from
Z to t, when testing the population mean.
To test the population proportion we'll use the
normal distribution (under certain conditions).

46
Testing the mean unknown variance

Replace the statistic Z with t = (x̄ − μ)/(s/√n),
with n − 1 degrees of freedom.
The original distribution must be normal (or at
least mound shaped).

47
Testing the mean unknown variance

Problem 10
A federal agency inspects packages to determine
if the contents are at least as large as
advertised.
A random sample of (i) 5, (ii) 50 containers whose
packaging states that the weight was 8.04 ounces
was drawn. (Data is provided later.)
From the sample results:
Can we conclude that the average weight does not
meet the weight stated? (Use α = .05.)
Estimate the mean weight of all containers with
99% confidence.
What assumption must be met?

48
Testing the mean unknown variance

Solution
We hypothesize on the mean weight.
H0: μ = 8.04
H1: μ < 8.04
(i) n = 5. For small samples let us solve
manually. Assume the sample was 8.07, 8.03,
7.99, 7.95, 7.94.
The rejection region: t < −tα, n−1 = −t.05, 5−1 =
−2.132. The tsample = ?
Mean = (8.07 + ... + 7.94)/5 = 7.996
Std. dev. = {[(8.07 − 7.996)² + ... + (7.94 − 7.996)²]/4}^1/2
= 0.0546
49
Testing the mean unknown variance

The tsample is calculated as follows:
tsample = (x̄ − μ)/(s/√n) = (7.996 − 8.04)/(0.0546/√5) = −1.80
Since −1.80 > −2.132 the sample statistic does
not fall in the rejection region. There is
insufficient evidence to conclude that the mean
weight is smaller than 8.04, at the 5% significance
level.
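The small-sample calculation can be checked with a short Python sketch using the five assumed observations and the hypothesized mean 8.04 from H0. The critical value −2.132 is copied from the slide rather than computed, since the standard library has no t distribution; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Problem 10 (i): the five assumed observations
weights = [8.07, 8.03, 7.99, 7.95, 7.94]
mu0 = 8.04            # hypothesized mean under H0
t_critical = -2.132   # -t(.05, 4), taken from the slide

n = len(weights)
x_bar = mean(weights)
s = stdev(weights)    # sample standard deviation (divides by n - 1)

# t statistic for a test of the mean with unknown variance
t_sample = (x_bar - mu0) / (s / sqrt(n))

# Left-hand tail test: reject H0 only if t falls below the critical value
reject_h0 = t_sample < t_critical
print(round(x_bar, 3), round(s, 4), round(t_sample, 2), reject_h0)
```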
50
Testing the mean unknown variance

(ii) n = 50. To calculate the sample statistics we
use Excel, Descriptive Statistics from the
Tools > Data Analysis menu. From the sample we
obtain: Mean = 8.02, Std. Dev. = .04
The confidence interval is calculated by
x̄ ± tα/2, n−1 s/√n = 8.02 ± 2.678(.04/√50)
= 8.02 ± .015

LCL = 8.005, UCL = 8.035
51
Testing the mean unknown variance
Data

Comments
Check whether it appears that the distribution is
normal

52
Using Excel
Data

To obtain an exact value for t use the TINV
function:
TINV(0.01,49) = 2.6799535 (the exact value)
.01 is the two-tail probability (.005 × 2); 49 is the
degrees of freedom.
53
Testing the mean unknown variance

Problem 11
Engineers in charge of the production of car
seats are concerned about the compliance of the
springs used with design specifications.
Springs are designed to be 500mm long.
Springs too long or too short must be reworked.
A standard deviation of 2mm in spring length
will result in an acceptable number of reworked
springs.
A sample of 100 springs was taken and measured.

54
Testing the mean unknown variance
Data

Problem continued
Can we infer at the 10% significance level that the
mean spring length is not 500mm?

Solution
H0: μ = 500
H1: μ ≠ 500
Since the standard deviation is unknown, we need to
run a t-test, assuming the
spring length is normally distributed.
Rejection region: t < −tα/2 or t > tα/2 with d.f. =
99
t < −1.6604 or t > 1.6604
55
Inference about a population proportion

The test and the confidence interval are based on
the approximate normal distribution of the
sample proportion, if np > 5 and n(1 − p) > 5.
For the confidence interval of p we have
p̂ ± Zα/2 √(p̂(1 − p̂)/n), where p̂ = x/n.
For the hypothesis test, we use a Z test:
Z = (p̂ − p)/√(p(1 − p)/n).

56
Inference about a population proportion

Problem 12 (problem 11 continued). The engineers
were interested in the percentage of springs that
are the correct length. They marked each spring
in the sample as
Correct 1
Too long 2
Too short 3

Can we infer that less than 90% of the springs
are the correct length, at the 10% significance
level?
57
Inference about a population proportion
Data

Problem 12 - Solution
H0: p = .9; H1: p < .9
Rejection region: Z < −Zα, or Z < −1.28

Conclusion: Since −1.33 < −1.28 we can infer
that less than 90% of the springs do not need
reworking.
58
Inference about a population proportion
Data

Problem 12 solution continued
Let us estimate the proportion of good springs at
the 99% confidence level.

59
Inference about a population proportion

Problem 12 solution continued
Find the sample size if the proportion of good
springs is to be estimated to within .035.
Consider the given sample an initial sample.

60
Inference about a population proportion

Problem 13
A consumer protection group runs a survey of 400
dentists to check a claim that more than 4 out of
5 dentists recommend ingredients included in a
certain toothpaste.
The survey results are as follows: 71 No, 329
Yes.
At the 5% significance level, can the consumer group
infer that the claim is true?

61
Inference about a population proportion

Problem 13 - Solution
The two hypotheses are
H0: p = .8
H1: p > .8
The rejection region: Z > Zα, with Z.05 = 1.645
Zsample = (329/400 − .8)/√(.8(.2)/400) = 1.125
Conclusion: Since 1.125 < 1.645 the consumer
group cannot confirm the claim at the 5% significance
level.
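The proportion test of Problem 13 can be sketched in Python from the survey counts. The variable names are illustrative and the critical value is computed with the standard library rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

# Problem 13: 329 of 400 dentists answered "Yes"; H0: p = .8, H1: p > .8
yes, n, p0, alpha = 329, 400, 0.8, 0.05
z = NormalDist()

p_hat = yes / n

# The standard error uses the hypothesized proportion p0, not p_hat
std_error = sqrt(p0 * (1 - p0) / n)
z_sample = (p_hat - p0) / std_error

# Right-hand tail rejection region
z_crit = z.inv_cdf(1 - alpha)   # about 1.645
reject_h0 = z_sample > z_crit
p_value = 1 - z.cdf(z_sample)

print(round(p_hat, 4), round(z_sample, 3), round(p_value, 4), reject_h0)
```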
62
Summary Example

An automotive expert claims that the large number
of self-serve gas stations has resulted in poor
automobile maintenance, and that the average tire
pressure is more than 4.5 psi below its
manufacturer specifications.
A random sample of 50 tires revealed the results
stored in the file TirePressure.
Assume the tire pressure is normally distributed
with σ = 1.5 psi, and answer the following
questions.

63
Summary Example
Tire Pressure

At the 10% significance level can we infer that the
expert is correct? What is the p-value?

Solution
The hypotheses: H0: μ = 4.5; H1: μ > 4.5. The
rejection region: Z > Z.10, or Z > 1.28. From the
data we have mean = 5.04, so Z = (5.04 −
4.5)/(1.5/50^.5) = 2.545.
Since 2.545 > 1.28, there is sufficient evidence
to infer that the expert is correct.

The p-value = P(Sample mean > 5.04 when μ =
4.5) = P(Z > 2.545) = 1 − .9945 = .0055
64
Summary Example

Find the probability of making a type II error
when the actual tire under-inflation is 5 psi on
the average.
Solution: The rejection region in terms of the
sample mean is found first: ZL = 1.28 = (x̄L −
4.5)/(1.5/50^.5), so x̄L = 4.5 + 1.28(1.5/50^.5) =
4.77. So, the rejection region is: sample mean
> 4.77. β = P(accept H0 when H1 is true) =
P(sample mean does not fall in the RR, when μ =
5) = P(x̄ < 4.77 when μ = 5) = P(Z <
(4.77 − 5)/(1.5/50^.5)) = P(Z < −1.08)
From Excel: NORMSDIST(−1.077) = .1407

65
Inference about the population Variance

The following statistic is χ² (chi-squared)
distributed with n − 1 degrees of freedom:
χ² = (n − 1)s²/σ²
We use this relationship to test and estimate the
variance.

66
Inference about the population Variance

The Hypotheses tested are
The rejection region is

67
Testing the Variance

Problem 15
Engineers in charge of the production of car
seats are concerned about the compliance of the
springs used with design specifications.
Springs are designed to be 500mm long.
Springs too long or too short must be reworked.
A standard deviation of 2mm in spring length
will result in an acceptable number of reworked
springs.
A sample of 100 springs was taken and measured.

68
Testing the Variance
Data

Problem 15 - continued Can we infer at the 10%
significance level that the number of springs
requiring reworking is unacceptably large?

H0: σ² = 4; H1: σ² > 4
The number of springs requiring reworking depends
on the standard deviation, or the variance.
Rejection region: χ²Sample > χ²α (d.f. = 99), that is,
χ²Sample > 117.4069
69
Testing the Variance

Problem 15 - conclusion Since 161.25 > 117.4069,
we can infer at the 10% significance level that the
standard deviation is greater than 2, thus the
number of springs that require reworking is
unacceptably large.

70
Testing the Variance

Problem 16
A random sample of 100 observations was taken
from a normal population. The sample variance
was 29.76.
Can we infer at the 2.5% significance level that the
population variance does not exceed 30?
Estimate the population variance with 90%
confidence.

71
Testing the Variance

Problem 16 Solution
H0: σ² = 30
H1: σ² < 30
χ² = (n − 1)s²/σ² = 99(29.76)/30 = 98.21

Rejection region: χ² < χ²1−α, n−1, that is, χ² < 73.36
72
Testing the Variance

Problem 16 - conclusion Since 98.208 > 73.36 we
conclude that there is insufficient evidence at the
2.5% significance level to infer that the
variance is smaller than 30.
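The test statistic for Problem 16 can be reproduced with a short Python sketch. The critical value 73.36 is copied from the slide (it comes from a chi-squared table or Excel), since the standard library has no chi-squared distribution; the variable names are illustrative.

```python
# Problem 16: H0: variance = 30 against H1: variance < 30
n = 100
sample_variance = 29.76
hypothesized_variance = 30
chi2_critical = 73.36   # left-tail critical value, d.f. = 99, from the slide

# Chi-squared statistic with n - 1 degrees of freedom
chi2_sample = (n - 1) * sample_variance / hypothesized_variance

# Left-hand tail test: reject H0 only if the statistic is below the critical value
reject_h0 = chi2_sample < chi2_critical
print(round(chi2_sample, 3), reject_h0)
```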

73
Using Excel

We can get an exact value of the probability
P(χ²d.f. > χ²) = ? for a given χ² and known d.f.,
and then determine the p-value.
Use the CHIDIST function, CHIDIST(χ², d.f.). For example,
CHIDIST(98.208,99) = .50359
That is, P(χ²99 > 98.208) = .50359
In our example we had a left-hand tail rejection
region, and therefore the p-value is P(χ²99 <
98.208) = 1 − .50359 = .49641 > .025
74
Using Excel

We can get the exact χ² value for which
P(χ²d.f. > χ²) = α, for any given probability α
and known d.f., then define the rejection region.
Use the CHIINV function, CHIINV(α, d.f.). For example,
CHIINV(.975,99) = 73.36
That is, P(χ²99 > ?) = .975; χ² = 73.36. The
rejection region is χ² < 73.36.
