Review - PowerPoint PPT Presentation

Provided by: businessF

Transcript and Presenter's Notes

Title: Review


1
Review 2
  • Chapter 9
  • Chapter 10
  • Chapter 11 and 12

2
Chapter 9 Sampling Distributions
  • A statistic is a random variable describing a
    characteristic of a random sample.
  • Sample mean
  • Sample variance
  • We use statistic values in inferential statistics
    (make inference about population characteristics
    from sample characteristics).
  • Statistics have distributions of their own.

3
Chapter 9 The Central Limit Theorem
  • The distribution of the sample mean is normal if
    the parent distribution is normal.
  • The distribution of the sample mean approaches
    the normal distribution for sufficiently large
    samples (n ≥ 30), even if the parent
    distribution is not normal.
  • The parameters of the sampling distribution of
    the mean are
  • Mean: $\mu_{\bar{x}} = \mu$
  • Standard deviation: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
  • (Assumption: The population is sufficiently
    large. No correction is needed in the
    calculation of the variance.)

4
Chapter 9 The Central Limit Theorem
  • Problem 1 (Using Excel): Given a normal
    population whose mean is 50 and whose standard
    deviation is 5,
  • Question 1: Find the probability that a random
    sample of 4 has a mean between 49 and 52.
  • Answer: $\sigma_{\bar{x}} = 5/\sqrt{4} = 2.5$, so
    $P(49 < \bar{x} < 52) = P(-.4 < Z < .8) \approx .4435$
5
Chapter 9 The Central Limit Theorem
  • Problem 1 (Using the table): Given a normal
    population whose mean is 50 and whose standard
    deviation is 5,
  • Question 1: Find the probability that a random
    sample of 4 has a mean between 49 and 52.
  • Answer: $P(49 < \bar{x} < 52) = P(-.4 < Z < .8) = .1554 + .2881 = .4435$
6
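Chapter 9 The Central Limit Theorem
  • Problem 1
  • Question 2: Find the probability that a random
    sample of 16 has a mean between 49 and 52.
  • Answer: $\sigma_{\bar{x}} = 5/\sqrt{16} = 1.25$, so
    $P(49 < \bar{x} < 52) = P(-.8 < Z < 1.6) = .2881 + .4452 = .7333$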
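
Both questions in Problem 1 can be checked numerically. The sketch below assumes Python's standard library (`statistics.NormalDist`) in place of Excel or the normal table; the helper name `prob_mean_between` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_between(mu, sigma, n, low, high):
    """P(low < sample mean < high) when the sample mean is N(mu, sigma/sqrt(n))."""
    xbar = NormalDist(mu, sigma / sqrt(n))
    return xbar.cdf(high) - xbar.cdf(low)


# Problem 1: population mean 50, standard deviation 5
p_n4 = prob_mean_between(50, 5, 4, 49, 52)    # Question 1, Z from -0.4 to 0.8
p_n16 = prob_mean_between(50, 5, 16, 49, 52)  # Question 2, Z from -0.8 to 1.6
print(round(p_n4, 4), round(p_n16, 4))
```

The larger sample gives a larger probability because the standard deviation of the sample mean shrinks from 2.5 to 1.25.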

7
Chapter 9 The Central Limit Theorem
  • Problem 2: The amount of time per day spent by
    adults watching TV is normally distributed with
    μ = 6 and σ = 1.5 hours.
  • Question 1: What is the probability that a
    randomly selected adult watches TV for more
    than 7 hours a day?
  • Answer: $P(X > 7) = P(Z > (7 - 6)/1.5) = P(Z > .67) \approx .25$
  • Question 2: What is the probability that 5 adults
    watch TV on the average 7 or more hours?
  • Answer: $P(\bar{x} \ge 7) = P(Z \ge (7 - 6)/(1.5/\sqrt{5})) = P(Z \ge 1.49) \approx .068$

8
Chapter 9 The Central Limit Theorem
  • Problem 2
  • Question 3: What is the probability that the
    total time of watching TV of the five adults will
    not exceed 28 hours?
  • Answer: $P(\bar{x} \le 28/5) = P(Z \le (5.6 - 6)/.670822) = P(Z \le -.60) \approx .27$
  • Question 4: What total TV watching time is
    exceeded by only 3% of the population for samples
    of 5 adults?

Comments: 1. Excel returns X for a given left-hand
tail probability. 2. $.670822 = 1.5/\sqrt{5}$
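
The four questions of Problem 2 can be worked through in a few lines. This is a sketch that assumes Python's standard library rather than Excel; `inv_cdf` plays the role described in comment 1 (it returns X for a given left-hand tail probability), and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, N = 6.0, 1.5, 5
single = NormalDist(MU, SIGMA)           # TV time of one adult
mean5 = NormalDist(MU, SIGMA / sqrt(N))  # mean of 5 adults, sd = .670822

p_q1 = 1 - single.cdf(7)            # Q1: one adult watches more than 7 hours
p_q2 = 1 - mean5.cdf(7)             # Q2: mean of 5 adults is 7 hours or more
p_q3 = mean5.cdf(28 / N)            # Q3: total <= 28 hours, i.e. mean <= 5.6
total_q4 = N * mean5.inv_cdf(0.97)  # Q4: total exceeded by only 3% of samples

print(round(p_q1, 4), round(p_q2, 4), round(p_q3, 4), round(total_q4, 2))
```

Question 4 asks for a right-tail probability of 3%, so the left-hand tail probability passed to `inv_cdf` is .97.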
9
Chapter 9 The Central Limit Theorem
  • Problem 3
  • Assume that the mean monthly rent paid by students
    in a particular town is 350 dollars with a standard
    deviation of 40 dollars. A random sample of 100 students
    who rented apartments was taken.
  • Question 1: What is the probability that the
    sample mean of the monthly rent exceeds 355?

10
Chapter 9 The Central Limit Theorem
  • Problem 3 - continued
  • Question 2: What is the probability that the
    total revenue from renting 10 randomly selected
    apartments falls between 3300 and 3700 dollars?

11
Chapter 9 The Central Limit Theorem
  • Problem 3 - continued
  • Question 3: Let's assume the population mean was
    unknown, but the standard deviation was known to
    be 40. A sample of 100 rentals was selected in
    order to estimate the mean monthly rent paid by
    the whole student population. What is the
    probability that the sample mean differs from the
    actual mean by more than 5? How about more than
    10?
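
A possible numerical treatment of Problem 3 is sketched below with Python's standard library. For Question 2 it assumes the sample mean of only 10 rents is still close to normal, which the problem does not state; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 350.0, 40.0
Z = NormalDist()  # standard normal

# Question 1: n = 100, P(sample mean > 355)
se100 = SIGMA / sqrt(100)
p_q1 = 1 - Z.cdf((355 - MU) / se100)

# Question 2: n = 10, total between 3300 and 3700 means mean between 330 and 370
# (assumes the mean of 10 rents is approximately normal)
se10 = SIGMA / sqrt(10)
p_q2 = Z.cdf((370 - MU) / se10) - Z.cdf((330 - MU) / se10)

# Question 3: P(|sample mean - mu| > d) = 2 * P(Z > d / se), whatever mu is
p_q3_5 = 2 * (1 - Z.cdf(5 / se100))
p_q3_10 = 2 * (1 - Z.cdf(10 / se100))

print(round(p_q1, 4), round(p_q2, 4), round(p_q3_5, 4), round(p_q3_10, 4))
```

Question 3 does not need the value of the population mean, because only the distance between the sample mean and the actual mean enters the calculation.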

12
Chapter 9 The Central Limit Theorem
  • Problem 3 - continued

13
Chapter 9 Sampling distribution of the sample
proportion
  • In a sample of size n, if np > 5 and n(1 - p) > 5,
    then the sample proportion $\hat{p} = x/n$ is
    approximately normally distributed with the
    following parameters:
  • Mean: $\mu_{\hat{p}} = p$
  • Standard deviation: $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$

(Assumption: The population is sufficiently
large. No correction is needed in the
calculation of the variance.)
14
Sampling distribution of the sample proportion
  • Problem 4
  • A commercial by a household appliance
    manufacturer claims that less than 5% of all of
    its products require a service call in the first
    year.
  • A survey of 400 households that recently
    purchased the manufacturer's products was conducted
    to check the claim.

15
Sampling distribution of the sample proportion
  • Problem 4 - Continued: Assuming the
    manufacturer is right, what is the probability
    that more than 10% of the surveyed households
    require a service call within the first year?

If indeed 10% of the sampled households reported
a call for service within the first year, what
does it tell you about the manufacturer's
claim?
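
Problem 4 can be checked with the normal approximation for the sample proportion. The sketch below assumes the boundary value p = .05 for the manufacturer's claim and uses Python's standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.05, 400  # claimed proportion (boundary value) and sample size

# Conditions for the normal approximation: np > 5 and n(1 - p) > 5
approximation_ok = n * p > 5 and n * (1 - p) > 5

se = sqrt(p * (1 - p) / n)      # standard deviation of the sample proportion
z = (0.10 - p) / se             # standardized value of a 10% sample proportion
prob = 1 - NormalDist().cdf(z)  # P(sample proportion > .10)

print(approximation_ok, round(z, 2), prob)
```

A sample proportion of 10% lies more than four standard deviations above .05, so observing it would cast serious doubt on the claim.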
16
Sampling Distribution of the Difference Between
two Means
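  • If two independent variables are normally
    distributed with means and variances $\mu_1, \sigma_1^2$
    and $\mu_2, \sigma_2^2$ respectively, then $\bar{x}_1 - \bar{x}_2$ is also
    normally distributed with
  • Mean: $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
  • Variance: $\sigma^2_{\bar{x}_1 - \bar{x}_2} = \sigma_1^2/n_1 + \sigma_2^2/n_2$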

17
Sampling Distribution of the Difference Between
two Means
  • When at least one of the populations is not
    normally distributed but the sample sizes are
    both at least 30, $\bar{x}_1 - \bar{x}_2$ is approximately
    normally distributed, with a mean and a variance
    as indicated above.

18
Sampling Distribution of the Difference Between
two Means
  • Example: A national TV telethon committee is
    interested in determining whether donations made
    by males are on the average larger than those
    made by females by 4. Two samples of 25 males
    and 25 females were selected, and the donations
    made were recorded. If the standard deviations of the
    male and female populations are 2.4 and 1.8
    respectively, what is the probability that the sample
    mean of the male donations exceeds the sample
    mean of the female donations by at least 5?
    Assume donations for the two populations are
    normally distributed.

19
Sampling Distribution of the Difference Between
two Means
  • Solution

For males: n1 = 25, σ1 = 2.4. For females: n2 = 25, σ2 = 1.8.
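
The telethon example can be worked numerically as sketched below. The calculation assumes the population means really do differ by 4, as the committee supposes; that value and the variable names are assumptions for illustration, and Python's standard library replaces the normal table.

```python
from math import sqrt
from statistics import NormalDist

diff_mean = 4.0           # assumed mu1 - mu2 (males minus females)
sigma1, n1 = 2.4, 25      # males
sigma2, n2 = 1.8, 25      # females

# Standard deviation of the difference between the two sample means
se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

z = (5 - diff_mean) / se        # standardized value of a difference of 5
prob = 1 - NormalDist().cdf(z)  # P(sample mean difference >= 5)

print(round(se, 3), round(z, 3), round(prob, 4))
```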
20
Chapter 10 Introduction to Estimation
  • A population parameter can be estimated by a
    point estimator and by an interval estimator.
  • A confidence interval with a 1 - α confidence level
    is an interval estimator that covers the
    estimated parameter (1 - α)100% of the time.
  • Confidence intervals are constructed using
    sampling distributions.

21
Confidence interval of the mean Known Variance
  • We use the central limit theorem to build the
    following confidence interval:
    $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$

22
Confidence interval of the mean Known Variance
  • Problem 5: How many classes do university students
    miss each semester? A survey of 100 students was
    conducted. (See Data next)
  • Assuming the standard deviation of the number of
    classes missed is 2.2, estimate the mean number
    of classes missed per student. Use a 99% confidence
    level.

23
Confidence interval of the mean Known Variance
  • Solution: $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n} = 10.21 \pm 2.575(2.2/\sqrt{100}) = 10.21 \pm .57$

1 - α = .99, α = .01, α/2 = .005, $z_{\alpha/2} = z_{.005} = 2.575$
LCL = 9.64, UCL = 10.78
You can use Data Analysis Plus > Z-Estimate Mean
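
The interval can also be reproduced without Excel. The sketch below assumes Python's standard library and takes the sample mean 10.21 from the solution above; the function name `z_interval` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 2.576 for 99%
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


lcl, ucl = z_interval(xbar=10.21, sigma=2.2, n=100, confidence=0.99)
print(round(lcl, 2), round(ucl, 2))
```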
24
Confidence interval of the mean Known Variance
  • Solution (using Data Analysis Plus)
  • Shade the data set (you may include the title
    label)
  • Select Data Analysis Plus, then Z-Estimate
    Mean
  • Type in the sigma (2.2), check Labels (if
    appropriate), type in alpha (.01), click OK.

25
Selecting the sample size
  • The shorter the confidence interval, the more
    accurate the estimate.
  • We can, therefore, limit the width of the
    interval to 2W, and get
    $W = z_{\alpha/2}\,\sigma/\sqrt{n}$
  • From here we have
    $n = \left(z_{\alpha/2}\,\sigma/W\right)^2$

W is called the margin of error, or the bound on the
error of estimation.
26
Selecting the sample size
  • Problem 6: An operations manager wants to estimate
    the average amount of time needed by a worker to
    assemble a new electronic component.
  • Sigma is known to be 6 minutes.
  • The required estimate accuracy is within 20
    seconds.
  • The confidence level is 90%, 95%.
  • Find the sample size.

27
Selecting the sample size
  • Solution
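  • σ = 6 min, W = 20 sec = 1/3 min
  • 1 - α = .90, $z_{\alpha/2} = z_{.05} = 1.645$: $n = [1.645(6)/(1/3)]^2 = 876.75$, rounded up to 877
  • 1 - α = .95, $z_{\alpha/2} = z_{.025} = 1.96$: $n = [1.96(6)/(1/3)]^2 = 1244.68$, rounded up to 1245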
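
The sample size calculation for Problem 6 is sketched below, assuming Python's standard library; the function name `sample_size` is hypothetical. The result is rounded up because a fractional observation is not possible.

```python
from math import ceil
from statistics import NormalDist


def sample_size(sigma, w, confidence):
    """Smallest n for which the margin of error is at most w (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return ceil((z * sigma / w) ** 2)


# sigma = 6 minutes, W = 20 seconds = 1/3 minute
n90 = sample_size(6, 1 / 3, 0.90)
n95 = sample_size(6, 1 / 3, 0.95)
print(n90, n95)
```

Raising the confidence level from 90% to 95% at the same margin of error requires a noticeably larger sample.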

28
Chapter 11 Hypotheses tests
  • In hypothesis tests we hypothesize on a value of
    a population parameter, and test to see if there
    is sufficient evidence to support our belief.
  • The structure of a hypothesis test
  • Formulate two hypotheses.
  • H0: The null hypothesis, the one we try to reject in favor of H1.
  • H1: The alternative hypothesis, the one we try to
    prove.
  • Define a significance level α.

29
Hypotheses tests
  • The significance level is the probability of
    erroneously rejecting the null hypothesis.
  • α = P(reject H0 when H0 is true)
  • Sample from the population and calculate a
    statistic that provides an indication whether or
    not the parameter value under H1 is more likely
    to be true.
  • We shall test the population mean assuming the
    standard deviation is known.

30
Hypotheses tests of the Mean Known Variance
  • Problem 7: A machine is set so that the average
    diameter of ball bearings it produces is .50
    inch. In a sample of 100 ball bearings the mean
    diameter was .51 inch. Assuming the standard
    deviation is .05 inch, can we conclude at the 5%
    significance level that the mean diameter is not
    .50 inch?

31
Hypotheses tests of the Mean Known Variance
  • Solution: The population studied is the
    ball-bearing diameters.
  • We hypothesize on the population mean.
  • A good point estimator for the population mean is
    the sample mean.
  • We use the distribution of the sample mean to
    build a sample statistic to test whether μ = .50
    inch.

32
Hypotheses tests of the Mean Known Variance
  • Solution (A Two Tail rejection region)
  • Define the hypotheses
  • H0: μ = .50
  • H1: μ ≠ .50

α = .05 is the probability of committing a Type I error.
33
Hypotheses tests of the Mean Known Variance
Solution - A Two Tail rejection region
Critical Z: $z_{.025} = 1.96$ (obtained from the Z-table).
Build a rejection region: $Z_{sample} > z_{\alpha/2}$, or $Z_{sample} < -z_{\alpha/2}$,
that is, $Z_{sample} > 1.96$ or $Z_{sample} < -1.96$.
Calculate the value of the sample Z statistic
and compare it to the critical value:
$Z_{sample} = (.51 - .50)/(.05/\sqrt{100}) = 2$
Since 2 > 1.96, there is sufficient evidence to
reject H0 in favor of H1 at the 5% significance
level.
34
Hypotheses tests of the Mean Known Variance
Solution - A Two Tail rejection region
  • We can perform the test in terms of the mean
    value.
  • Let us find the critical mean values for
    rejection
  • $\bar{x}_{L2} = \mu_0 + z_{.025}\,\sigma/\sqrt{n} = .50 + 1.96(.05)/\sqrt{100} = .5098$
  • $\bar{x}_{L1} = \mu_0 - z_{.025}\,\sigma/\sqrt{n} = .50 - 1.96(.05)/\sqrt{100} = .4902$

Since .51 > .5098, there is sufficient evidence to
reject the null hypothesis at the 5% significance
level.
35
Hypotheses tests of the Mean Known Variance
  • Calculate the p-value of this test
  • Solution: p-value = $P(Z > Z_{sample}) + P(Z < -Z_{sample})$
    = P(Z > 2) + P(Z < -2) = 2P(Z > 2)
    = 2(1 - .9772) = .0456
  • Since .0456 < .05, H0 is rejected.
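
The two-tail test of Problem 7 can be reproduced as follows. This is a sketch using Python's standard library; the function name `z_test` and its `tail` argument are made up for illustration. The exact p-value differs slightly from .0456 because the table value .9772 is rounded.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, tail="two"):
    """Return (z, p_value) for a test of the mean with known sigma.

    tail is "two", "right" or "left" and selects the alternative hypothesis.
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf
    if tail == "two":
        p_value = 2 * (1 - cdf(abs(z)))
    elif tail == "right":
        p_value = 1 - cdf(z)
    else:
        p_value = cdf(z)
    return z, p_value


# Problem 7: H0: mu = .50 against H1: mu != .50, alpha = .05
z, p_value = z_test(xbar=0.51, mu0=0.50, sigma=0.05, n=100, tail="two")
reject_h0 = p_value < 0.05
print(round(z, 2), round(p_value, 4), reject_h0)
```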

36
Hypotheses tests of the Mean Known Variance
  • Problem 8
  • The average annual return on investment for
    American banks was found to be 10.2% with a
    standard deviation of 0.8%.
  • It is believed that banks that exercise
    comprehensive planning do better.
  • A sample of 26 banks that exercise comprehensive
    planning provided the following result: mean
    return 10.5%
  • Can we infer that the belief about bank
    performance is supported at the 10% significance
    level by this sample result?

37
Hypotheses tests of the Mean Known Variance
  • Solution (A right-hand tail rejection
    region): The population tested is the annual rate
    of return.
  • H0: μ = 10.2
  • H1: μ > 10.2
  • Let us perform the test with the standardized
    rejection region approach: $Z_{sample} > z_{.10}$
    (right-hand tail rejection region), $z_{.10} = 1.28$.
    Reject H0 if $Z_{sample} > 1.28$.

38
Hypotheses tests of the Mean Known Variance
  • Conclusion
  • At the 10% significance level there is sufficient
    evidence in the data to reject H0 in favor of H1,
    since the sample statistic falls inside the
    rejection region.
  • Interpretation
  • If we are willing to accept a 10% chance of making
    the wrong conclusion, we can conclude that banks
    conducting comprehensive planning perform better
    than banks that do not.

39
Hypotheses tests of the Mean Known Variance
  • Let us perform the test with the p-value method
  • $P(\bar{x} > 10.5 \mid \mu = 10.2) = P(Z > (10.5 - 10.2)/(.8/\sqrt{26})) = P(Z > 1.91) = .5 - .4719 = .0281$
  • Since .0281 < .10 we reject the null hypothesis
    at the 10% significance level.

40
Hypotheses tests of the Mean Known Variance
  • Note the equivalence between the standardized
    method (the rejection region method) and the
    p-value method.
  • $P(Z > z_{.10}) = .10$, $z_{.10} = 1.28$

The statement "the p-value is smaller than alpha" is
equivalent to the statement "the test statistic
falls in the rejection region": here $Z_{sample} = 1.91 > z_{.10} = 1.28$.
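
The equivalence of the two methods can be seen numerically for Problem 8. The sketch below uses Python's standard library with the figures from the slides; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.10

# Problem 8: sample mean 10.5, hypothesized mean 10.2, sigma 0.8, n = 26
z_sample = (10.5 - 10.2) / (0.8 / sqrt(26))

# Rejection region method: compare the statistic with the critical value
z_crit = std_normal.inv_cdf(1 - alpha)  # about 1.28
in_rejection_region = z_sample > z_crit

# p-value method: compare the right-tail probability with alpha
p_value = 1 - std_normal.cdf(z_sample)
p_below_alpha = p_value < alpha

print(round(z_sample, 2), round(z_crit, 2), round(p_value, 4))
print(in_rejection_region, p_below_alpha)
```

Both comparisons always agree, since the p-value falls below α exactly when the statistic lies beyond the critical value.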
41
Hypotheses tests of the Mean Known Variance
  • Problem 9
  • In the midst of labor-management negotiations,
    the president of a company argues that the
    company's blue-collar workers, who are paid an
    average of 30K a year, are well paid because the
    mean annual pay for blue-collar workers in the
    country is less than 30K.
  • This figure is disputed by the union. To test the
    president's belief, an arbitrator draws a random
    sample of 350 blue-collar workers from across the
    country and records their income (see file
    Salaries).
  • If the arbitrator assumes that income is normally
    distributed with a standard deviation of 8,000,
    can it be inferred at the 5% significance level that
    the company's president is correct?

42
Hypotheses tests of the Mean Known Variance
  • Solution (A left-hand tail rejection region): The
    population tested is the annual salary.
  • H0: μ = 30K, H1: μ < 30K
  • Left-hand tail rejection region: $Z < -z_{.05}$, or Z < -1.645.
    $Z_{sample} = (29{,}119.5 - 30{,}000)/(8{,}000/\sqrt{350}) = -2.059$.
    Since -2.059 < -1.645, there is sufficient
    evidence to infer that on the average blue-collar
    workers' income is lower than 30K at the 5%
    significance level.

43
Hypotheses tests of the Mean Known Variance
  • Calculate the p-value of this test
  • Solution: p-value = $P(Z < Z_{sample})$ = P(Z < -2.059) ≈ .02

44
Type II Error
  • Problem 7a: Calculate β for the two-tail
    hypothesis test performed in Problem 7, when the
    actual mean diameter is .515 inch.
  • Solution
  • The rejection region in terms of the critical
    values of the sample mean was found before:
    $\bar{x}_{L1} = .4902$, $\bar{x}_{L2} = .5098$.
  • β = P(Do not reject H0 when H1 is true)
    = $P(.4902 < \bar{x} < .5098 \mid \mu = .515)$
    = $P((.4902 - .515)/(.05/\sqrt{100}) < Z < (.5098 - .515)/(.05/\sqrt{100}))$
    = P(-4.96 < Z < -1.04) = P(1.04 < Z < 4.96)
  • = P(Z < 4.96) - P(Z < 1.04) ≈ 1 - P(Z < 1.04) = 1 - .8508 = .1492
  • This large probability may be reduced by taking
    larger samples.

H0: μ = .500, H1: μ = .515
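
The Type II error probability for Problem 7a can be computed directly from the definition. The sketch below assumes Python's standard library and the table value 1.96 for the critical Z; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_actual = 0.50, 0.515
sigma, n = 0.05, 100
z_crit = 1.96                # z_{.025} from the Z-table

se = sigma / sqrt(n)
lower = mu0 - z_crit * se    # lower critical sample mean
upper = mu0 + z_crit * se    # upper critical sample mean

# beta = P(lower < sample mean < upper) when the true mean is mu_actual
actual = NormalDist(mu_actual, se)
beta = actual.cdf(upper) - actual.cdf(lower)

print(round(lower, 4), round(upper, 4), round(beta, 4))
```

Increasing n shrinks `se`, which moves both critical values toward .50 and lowers β for the same actual mean.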
45
Ch 12 Inference when the Variance is Unknown
  • Generally, the variance may be unknown
  • In this case we change the test statistic from
    Z to t, when testing the population mean.
  • To test the population proportion we'll use the
    normal distribution (under certain conditions).

46
Testing the mean unknown variance
  • Replace the statistic Z with t:
    $t = (\bar{x} - \mu)/(s/\sqrt{n})$, with n - 1 degrees of freedom
  • The original distribution must be normal (or at
    least mound shaped).

47
Testing the mean unknown variance
  • Problem 10
  • A federal agency inspects packages to determine
    if the contents are at least as large as
    advertised.
  • A random sample of (i) 5, (ii) 50 containers whose
    packaging states that the weight is 8.04 ounces
    was drawn (data is provided later).
  • From the sample results:
  • Can we conclude that the average weight does not
    meet the weight stated? (Use α = .05.)
  • Estimate the mean weight of all containers with
    99% confidence.
  • What assumption must be met?

48
Testing the mean unknown variance
  • Solution
  • We hypothesize on the mean weight.
  • H0: μ = 8.04
  • H1: μ < 8.04
  • (i) n = 5. For small samples let us solve
    manually. Assume the sample was 8.07, 8.03,
    7.99, 7.95, 7.94.
  • The rejection region: $t < -t_{\alpha,\,n-1} = -t_{.05,\,5-1} = -2.132$.
    The $t_{sample}$ = ?
  • Mean = (8.07 + ... + 7.94)/5 = 7.996
    Std. Dev. = $\{[(8.07 - 7.996)^2 + \dots + (7.94 - 7.996)^2]/4\}^{1/2} = 0.0546$
49
Testing the mean unknown variance
  • The $t_{sample}$ is calculated as follows:
    $t_{sample} = (\bar{x} - \mu_0)/(s/\sqrt{n}) = (7.996 - 8.04)/(.0546/\sqrt{5}) = -1.80$
  • Since -1.80 > -2.132 the sample statistic does
    not fall in the rejection region. There is
    insufficient evidence to conclude that the mean
    weight is smaller than 8.04, at the 5% significance
    level.
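
The small-sample t statistic can be recomputed from the five weights. The sketch below assumes Python's standard library, the hypothesized mean 8.04 from H0, and the table value 2.132 quoted on the slide for the critical t (the standard library has no t distribution); the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

weights = [8.07, 8.03, 7.99, 7.95, 7.94]
mu0 = 8.04        # hypothesized mean weight under H0
t_crit = 2.132    # t_{.05, 4}, taken from the t-table as on the slide

xbar = mean(weights)
s = stdev(weights)  # sample standard deviation (divisor n - 1)
t_sample = (xbar - mu0) / (s / sqrt(len(weights)))

# Left-hand tail test: reject H0 only if t_sample < -t_crit
reject_h0 = t_sample < -t_crit
print(round(xbar, 3), round(s, 4), round(t_sample, 2), reject_h0)
```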
50
Testing the mean unknown variance
  • (ii) n = 50. To calculate the sample statistics we
    use Excel, Descriptive Statistics from the
    Tools > Data Analysis menu. From the sample we
    obtain: Mean = 8.02, Std. Dev. = .04
  • The confidence interval is calculated by
  • $\bar{x} \pm t_{\alpha/2,\,n-1}\,s/\sqrt{n} = 8.02 \pm 2.678(.04/\sqrt{50}) = 8.02 \pm .015$

LCL = 8.005, UCL = 8.035
51
Testing the mean unknown variance
  • Comments
  • Check whether it appears that the distribution is
    normal

52
Using Excel
  • To obtain an exact value for t use the TINV
    function: TINV(two-tail probability, degrees of freedom)
  • The exact value: TINV(0.01,49) = 2.6799535

.01 is the two-tail probability (.005 × 2); 49 is the degrees of freedom.
53
Testing the mean unknown variance
  • Problem 11
  • Engineers in charge of the production of car
    seats are concerned about the compliance of the
    springs used with design specifications.
  • Springs are designed to be 500mm long.
  • Springs too long or too short must be reworked.
  • A standard deviation of 2mm in spring length
    will result in an acceptable number of reworked
    springs.
  • A sample of 100 springs was taken and measured.

54
Testing the mean unknown variance
  • Problem 11 - continued
  • Can we infer at the 10% significance level that the
    mean spring length is not 500mm?

Solution: H0: μ = 500, H1: μ ≠ 500
Since the standard deviation is unknown, we need to
run a t-test, assuming the
spring length is normally distributed.
Rejection region: $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$ with d.f. = 99:
t < -1.6604 or t > 1.6604
55
Inference about a population proportion
  • The test and the confidence interval are based on
    the approximated normal distribution of the
    sample proportion, if np > 5 and n(1 - p) > 5.
  • For the confidence interval of p we have
    $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
  • where $\hat{p} = x/n$
  • For the hypotheses test, we use a Z test:
    $Z = (\hat{p} - p)/\sqrt{p(1-p)/n}$

56
Inference about a population proportion
  • Problem 12 (Problem 11 continued). The engineers
    were interested in the percentage of springs that
    are the correct length. They marked each spring
    in the sample as
  • Correct: 1
  • Too long: 2
  • Too short: 3

Can we infer that less than 90% of the springs
are the correct length, at the 10% significance
level?
57
Inference about a population proportion
  • Problem 12 - Solution
  • H0: p = .9, H1: p < .9
  • Rejection region: $Z < -z_{\alpha}$, or Z < -1.28

Conclusion: Since -1.33 < -1.28 we can infer
that less than 90% of the springs do not need
reworking.
58
Inference about a population proportion
  • Problem 12 solution continued
  • Let us estimate the proportion of good springs at
    the 99% confidence level.

59
Inference about a population proportion
  • Problem 12 solution continued
  • Find the sample size if the proportion of good
    springs is to be estimated to within .035.
    Consider the given sample an initial sample.

60
Inference about a population proportion
  • Problem 13
  • A consumer protection group runs a survey of 400
    dentists to check a claim that more than 4 out of
    5 dentists recommend ingredients included in a
    certain toothpaste.
  • The survey results are as follows: 71 No, 329
    Yes
  • At the 5% significance level, can the consumer group
    infer that the claim is true?

61
Inference about a population proportion
  • Problem 13 - Solution
  • The two hypotheses are
  • H0: p = .8
  • H1: p > .8
  • The rejection region: $Z > z_{\alpha}$, with $z_{.05} = 1.645$
  • $Z_{sample} = (329/400 - .8)/\sqrt{.8(.2)/400} = 1.125$
  • Conclusion: Since 1.125 < 1.645 the consumer
    group cannot confirm the claim at the 5% significance
    level.
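
The Z test for the dentists' survey can be reproduced as sketched below, assuming Python's standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

yes, n = 329, 400  # survey results: 329 Yes out of 400 dentists
p0 = 0.8           # proportion under H0 (4 out of 5)
alpha = 0.05

p_hat = yes / n
z_sample = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

z_crit = NormalDist().inv_cdf(1 - alpha)  # about 1.645
p_value = 1 - NormalDist().cdf(z_sample)  # right-tail p-value
reject_h0 = z_sample > z_crit

print(round(p_hat, 4), round(z_sample, 3), round(p_value, 4), reject_h0)
```

The sample proportion is above .8, but not by enough to rule out sampling variation at the 5% level.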
62
Summary Example
  • An automotive expert claims that the large number
    of self-serve gas stations has resulted in poor
    automobile maintenance, and that the average tire
    pressure is more than 4.5 psi below its
    manufacturer specifications.
  • A random sample of 50 tires revealed the results
    stored in the file TirePressure.
  • Assume the tire pressure is normally distributed
    with σ = 1.5 psi, and answer the following
    questions:

63
Summary Example
  • At the 10% significance level can we infer that the
    expert is correct? What is the p-value?
  • Solution
  • The hypotheses: H0: μ = 4.5, H1: μ > 4.5. The
    rejection region: $Z > z_{.10}$, or Z > 1.28. From the
    data we have mean = 5.04, so
    $Z = (5.04 - 4.5)/(1.5/\sqrt{50}) = 2.545$
  • Since 2.545 > 1.28, there is sufficient evidence
    to infer that the expert is correct.

The p-value = P(Sample mean > 5.04 when μ = 4.5)
= P(Z > 2.545) = 1 - .9945 = .0055
64
Summary Example
  • Find the probability of making a Type II error
    when the actual tire under-inflation is 5 psi on
    the average.
  • Solution: The rejection region in terms of the
    sample mean is found first:
    $z_L = 1.28 = (\bar{x}_L - 4.5)/(1.5/\sqrt{50})$, so
    $\bar{x}_L = 4.5 + 1.28(1.5/\sqrt{50}) = 4.77$.
    So, the rejection region is: sample mean > 4.77.
    β = P(accept H0 when H1 is true)
    = P(sample mean does not fall in the RR when μ = 5)
    = $P(\bar{x} < 4.77 \mid \mu = 5) = P(Z < (4.77 - 5)/(1.5/\sqrt{50})) = P(Z < -1.08)$
  • From Excel: NORMSDIST(-1.077) = .1407
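
The same Type II error calculation can be sketched without Excel, assuming Python's standard library and the table value 1.28 for the critical Z; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_actual = 4.5, 5.0
sigma, n = 1.5, 50
z_crit = 1.28                # z_{.10} from the Z-table

se = sigma / sqrt(n)
x_crit = mu0 + z_crit * se   # reject H0 when the sample mean exceeds this

# beta = P(sample mean < x_crit) when the true mean is mu_actual
beta = NormalDist(mu_actual, se).cdf(x_crit)

print(round(x_crit, 2), round(beta, 4))
```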

65
Inference about the population Variance
  • The following statistic is χ² (chi-squared)
    distributed with n - 1 degrees of freedom:
    $\chi^2 = (n-1)s^2/\sigma^2$
  • We use this relationship to test and estimate the
    variance.

66
Inference about the population Variance
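  • The hypotheses tested are
    H0: σ² = σ0², against H1: σ² > σ0², H1: σ² < σ0², or H1: σ² ≠ σ0²
  • The rejection region is, respectively,
    $\chi^2 > \chi^2_{\alpha,\,n-1}$; $\chi^2 < \chi^2_{1-\alpha,\,n-1}$; $\chi^2 < \chi^2_{1-\alpha/2,\,n-1}$ or $\chi^2 > \chi^2_{\alpha/2,\,n-1}$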

67
Testing the Variance
  • Problem 15
  • Engineers in charge of the production of car
    seats are concerned about the compliance of the
    springs used with design specifications.
  • Springs are designed to be 500mm long.
  • Springs too long or too short must be reworked.
  • A standard deviation of 2mm in spring length
    will result in an acceptable number of reworked
    springs.
  • A sample of 100 springs was taken and measured.

68
Testing the Variance
  • Problem 15 - continued: Can we infer at the 10%
    significance level that the number of springs
    requiring reworking is unacceptably large?

The number of springs requiring reworking depends
on the standard deviation, or the variance.
H0: σ² = 4, H1: σ² > 4
Rejection region: $\chi^2_{sample} > \chi^2_{\alpha,\,d.f.}$ with d.f. = 99:
$\chi^2_{sample} > 117.4069$
69
Testing the Variance
  • Problem 15 - conclusion: Since 161.25 > 117.4069,
    we can infer at the 10% significance level that the
    standard deviation is greater than 2, thus the
    number of springs that require reworking is
    unacceptably large.

70
Testing the Variance
  • Problem 16
  • A random sample of 100 observations was taken
    from a normal population. The sample variance
    was 29.76.
  • Can we infer at the 2.5% significance level that the
    population variance does NOT exceed 30?
  • Estimate the population variance with 90%
    confidence.

71
Testing the Variance
  • Problem 16 Solution
  • H0: σ² = 30
  • H1: σ² < 30
  • $\chi^2 = (n-1)s^2/\sigma^2 = 99(29.76)/30 = 98.21$

Rejection region: $\chi^2 < \chi^2_{1-\alpha,\,n-1}$, or χ² < 73.36
72
Testing the Variance
  • Problem 16 - conclusion: Since 98.208 > 73.36 we
    conclude that there is insufficient evidence at
    the 2.5% significance level to infer that the
    variance is smaller than 30.
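
The test statistic for Problem 16 is simple to recompute. The sketch below uses plain Python and takes the critical value 73.36 from the slides, since the standard library has no chi-squared distribution; the variable names are illustrative.

```python
n = 100
sample_variance = 29.76
sigma0_sq = 30.0     # variance under H0
chi2_crit = 73.36    # critical value quoted on the slides for d.f. = 99

# Test statistic: (n - 1) s^2 / sigma0^2, with n - 1 degrees of freedom
chi2_sample = (n - 1) * sample_variance / sigma0_sq

# Left-hand tail test: reject H0 only if the statistic is below the critical value
reject_h0 = chi2_sample < chi2_crit
print(round(chi2_sample, 3), reject_h0)
```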

73
Using Excel
  • We can get an exact value of the probability
    $P(\chi^2_{d.f.} > \chi^2)$ for a given χ² and known d.f.,
    and then determine the p-value.
  • Use the CHIDIST function, CHIDIST(χ², d.f.). For example,
    CHIDIST(98.208,99) = .50359
  • That is, $P(\chi^2_{99} > 98.208) = .50359$
  • In our example we had a left-hand tail rejection
    region, and therefore the p-value is
    $P(\chi^2_{99} < 98.208) = 1 - .50359 = .49641 > .025$
74
Using Excel
  • We can get the exact χ² value for which
    $P(\chi^2_{d.f.} > \chi^2) = \alpha$, for any given probability α
    and known d.f., then define the rejection region.
  • Use the CHIINV function, CHIINV(α, d.f.). For example,
    CHIINV(.975,99) = 73.36
  • That is, $P(\chi^2_{99} > 73.36) = .975$. The
    rejection region is χ² < 73.36.