Economics 173 Business Statistics - PowerPoint PPT Presentation

About This Presentation
Title:

Economics 173 Business Statistics

Slides: 64
Provided by: sba461
Learn more at: http://www.econ.uiuc.edu
Transcript and Presenter's Notes

Title: Economics 173 Business Statistics


1
Economics 173 Business Statistics
  • Lectures 3 & 4
  • Summer, 2001
  • Professor J. Petry

2
Introduction to Estimation
  • Chapter 9

3
9.1 Introduction
  • Statistical inference is the process by which we
    acquire information about populations from
    samples.
  • There are two procedures for making inferences
  • Estimation.
  • Hypotheses testing.

4
9.2 Concepts of Estimation
  • The objective of estimation is to determine the
    value of a population parameter on the basis of a
    sample statistic.
  • There are two types of estimators
  • Point Estimator
  • Interval estimator

5
Point Estimator
  • A point estimator draws inference about a
    population by estimating the value of an unknown
    parameter using a single value or a point.

6
  • Point Estimator

[Figure: the population distribution with its unknown parameter, and the sample distribution with the point estimator used to estimate it.]
7
Interval Estimator
  • An interval estimator draws inferences about a
    population by estimating the value of an unknown
    parameter using an interval.
  • The interval estimator is affected by the sample
    size.

[Figure: interval estimator.]
8
9.3 Estimating the Population Mean when the
Population Standard Deviation is Known
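  • How is an interval estimator produced from a
    sampling distribution?
  • To estimate μ, a sample of size n is drawn from
    the population, and its mean $\bar{x}$ is calculated.
  • Under certain conditions, $\bar{x}$ is normally
    distributed (or approximately normally
    distributed), thus
    $Z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
    follows the standard normal distribution (or approximately so).

9
  • We know that
    $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$
  • This leads to the relationship
    $P\left(\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$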

10
  • Confidence level: 1 - α
  • Lower confidence limit (LCL): $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n}$
  • Upper confidence limit (UCL): $\bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$
11
  • Confidence intervals are correct most, but
    not all, of the time.

[Figure: 100 simulated confidence intervals, each drawn from its LCL to its UCL, around the real expected value of 100.]
Not all the confidence intervals cover the real
expected value of 100.
The selected confidence level is 90%, and 10 out
of 100 intervals do not cover the real μ.
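The claim that roughly 90 out of 100 intervals cover μ can be checked with a small simulation. This is only a sketch: the population mean of 100 comes from the slide, while the normal population, the standard deviation of 10, the sample size of 25, the number of trials and the seed are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=100.0, sigma=10.0, n=25, level=0.90, trials=2000, seed=0):
    """Fraction of simulated intervals [LCL, UCL] that contain mu."""
    rng = random.Random(seed)                      # fixed seed (assumed) for repeatability
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    half_width = z * sigma / n ** 0.5              # sigma is treated as known
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials

print(f"share of intervals covering mu: {coverage():.3f}")
```

The printed share should land close to the chosen confidence level, not exactly on it, because LCL and UCL change from sample to sample.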
12
  • Four commonly used confidence levels

The mean values obtained in repeated draws of
samples of size 100 result in interval
estimators of the form [sample mean - .28,
sample mean + .28], 90% of which cover the real
mean of the distribution.
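The critical values z_{α/2} for commonly used confidence levels can be computed instead of read from a table. The transcript does not list the four levels, so the sketch below assumes they are 90%, 95%, 98% and 99%; `z_half_alpha` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_half_alpha(level):
    """z_{alpha/2} for a two-sided interval with confidence level 1 - alpha."""
    alpha = 1 - level
    return NormalDist().inv_cdf(1 - alpha / 2)

# Assumed set of confidence levels (not listed in the transcript).
for level in (0.90, 0.95, 0.98, 0.99):
    alpha = 1 - level
    print(f"1-alpha={level:.2f}  alpha={alpha:.2f}  "
          f"alpha/2={alpha / 2:.3f}  z={z_half_alpha(level):.3f}")
```

The values agree with the ones used later in the slides (1.645, 1.96 and 2.575) up to table rounding.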
13
  • Recalculate the confidence interval for a 95%
    confidence level.
  • Solution
  • The width of the 90% confidence interval is
    2(.28) = .56
  • The width of the 95% confidence interval is
    2(.34) = .68
  • Because the 95% confidence interval is wider,
    it is more likely to include the value of μ.

14
  • Example 9.1
  • The number and the types of television programs
    and commercials targeted at children are affected
    by the amount of time children watch TV.
  • A survey was conducted among 100 North American
    children, in which they were asked to record the
    number of hours they watched TV per week.
  • The population standard deviation of TV watching
    time was known to be σ = 8.0.
  • Estimate the mean watching time with a 95%
    confidence level.

15
  • Solution
  • The parameter to be estimated is μ, the mean time
    of TV watching per week per child (of all American
    children).
  • We need to compute the interval estimator for μ.
  • From the data provided in file XM09-01, the
    sample mean $\bar{x}$ is calculated.

Since 1 - α = .95, α = .05. Thus α/2 = .025 and
$z_{.025} = 1.96$.
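A minimal sketch of the interval estimator for Example 9.1, using σ = 8.0, n = 100 and the 95% level from the slides. The sample mean from file XM09-01 is not reproduced in the transcript, so `xbar = 25.0` below is a purely hypothetical placeholder; only the half-width is determined by the slide data.

```python
from math import sqrt
from statistics import NormalDist

def interval_estimate(xbar, sigma, n, level):
    """Return (LCL, UCL) for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

xbar = 25.0  # hypothetical placeholder, NOT the value from file XM09-01
lcl, ucl = interval_estimate(xbar, sigma=8.0, n=100, level=0.95)
print(f"half-width = {(ucl - lcl) / 2:.3f} hours")
print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")
```

Whatever the actual sample mean is, the interval extends about 1.57 hours on either side of it.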
16
  • Interpreting the interval estimate
  • It is wrong to state that the interval
    estimator is an interval for which there is a 1 - α
    chance that the population mean lies between the
    LCL and the UCL.
  • This is so because μ is a parameter, not a
    random variable.

17
  • LCL, UCL and the sample mean are the random
    variables; μ is a parameter, NOT a random
    variable.
  • Thus, it is correct to state that there is a 1 - α
    chance that LCL will be less than μ and UCL will
    be greater than μ.

18
  • Example 9.2
  • To lower inventory costs, the Doll Computer
    company wants to employ an inventory model.
  • Lead time demand is normally distributed with
    standard deviation of 50 computers.
  • It is required to know the mean in order to
    calculate optimum inventory levels.
  • Estimate the mean demand during lead time with
    95% confidence.

19
  • Solution
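  • The parameter to be estimated is μ. The interval
    estimator is $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
  • Demand during 60 lead times is recorded: 514, 525,
    ..., 476.
  • The sample mean $\bar{x}$ is calculated
  • The 95% confidence interval is
    $\bar{x} \pm 1.96 \cdot 50/\sqrt{60} = \bar{x} \pm 12.65$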

20
Information and the Width of the Interval
  • A wide interval estimator provides little
    information.

Where is μ?
[Figure: a wide interval that leaves the location of μ uncertain, followed by a much narrower one.]
Ahaaa!
Here is a much smaller interval. If the
confidence level remains unchanged, the smaller
interval provides more meaningful information.
21
  • The width of the interval estimate is a function
    of
  • the population standard deviation
  • the confidence level
  • the sample size.

22
Suppose the standard deviation has increased by
50%, with the confidence level held at 90%.
To maintain a certain level of confidence, changing
to a larger standard deviation requires a
wider confidence interval.
23
Let us increase the confidence level from 90%
to 95%.
Increasing the confidence level produces a wider
interval.
There is an inverse relationship between the
width of the interval and the sample size.
Increasing the sample size decreases the width
of the interval estimate while the confidence
level can remain unchanged.
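The three effects on the interval width can be seen from the formula width = 2 z_{α/2} σ/√n. The sketch below uses an assumed baseline (σ = 10, n = 100, 90% confidence) that does not come from the slides; only the direction and size of the changes matter.

```python
from math import sqrt
from statistics import NormalDist

def width(sigma, n, level):
    """Full width of the interval estimate: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return 2 * z * sigma / sqrt(n)

base = width(sigma=10, n=100, level=0.90)         # assumed baseline values
print(f"baseline:            {base:.3f}")
print(f"sigma up by 50%:     {width(15, 100, 0.90):.3f}")  # 1.5 times wider
print(f"level 90% -> 95%:    {width(10, 100, 0.95):.3f}")  # wider
print(f"n 100 -> 400:        {width(10, 400, 0.90):.3f}")  # half as wide
```

Because the width shrinks with √n, the sample size has to be quadrupled to halve the interval.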
24
9.4 Selecting the Sample size
  • We can control the width of the interval estimate
    by changing the sample size.
  • Thus, we determine the interval width first, and
    derive the required sample size.
  • The phrase "estimate the mean to within W units"
    translates to an interval estimate of the form
    $\bar{x} \pm W$

25
  • The required sample size to estimate the mean is
    $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{W}\right)^2$
  • Example 9.3
  • To estimate the amount of lumber that can be
    harvested in a tract of land, the mean diameter
    of trees in the tract must be estimated to within
    one inch with 99% confidence.
  • What sample size should be taken? (Assume
    diameters are normally distributed with σ = 6
    inches.)

26
  • Solution
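  • The estimate accuracy is ±1 inch. That is, W =
    1.
  • The confidence level 99% leads to α = .01, thus
    $z_{\alpha/2} = z_{.005} = 2.575$.
  • We compute
    $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{W}\right)^2 = \left(\dfrac{2.575 \cdot 6}{1}\right)^2 = 238.7$,
    which is rounded up to 239 trees.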
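The sample size calculation for Example 9.3 can be sketched as follows, with σ = 6, W = 1 and a 99% level taken from the slides. The function name `required_sample_size` is hypothetical, and the result is rounded up because a fractional observation cannot be sampled.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, w, level):
    """Smallest n for which the interval half-width is at most w."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    return ceil((z * sigma / w) ** 2)

n = required_sample_size(sigma=6, w=1, level=0.99)
print(n)
```

Asking for twice the accuracy (W = 0.5) roughly quadruples the required sample size, since W enters the formula squared.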

27
Introduction to Hypothesis Testing
  • Chapter 10

28
10.1 Introduction
  • The purpose of hypothesis testing is to determine
    whether there is enough statistical evidence in
    favor of a certain belief about a parameter.
  • Examples
  • Is there statistical evidence in a random sample
    of potential customers that supports the
    hypothesis that more than p% of the potential
    customers will purchase a new product?
  • Is a new drug effective in curing a certain
    disease? A sample of patients is randomly
    selected. Half of them are given the drug while
    the other half are given a placebo. The improvement
    in the patients' conditions is then measured and
    compared.

29
10.2 Concept of hypothesis testing
  • The critical concepts of hypothesis testing.
  • There are two hypotheses (about a population
    parameter(s))
  • H0 - the null hypothesis, for example
    μ = 5
  • H1 - the alternative hypothesis, for example μ > 5
    (this is what you want to prove)
  • Assume the null hypothesis is true.
  • Build a statistic related to the parameter
    hypothesized.
  • Pose the question: How probable is it to obtain a
    statistic value at least as extreme as the one
    observed from the sample?

μ = 5
30
  • Continued
  • Make one of the following two decisions (based on
    the test)
  • Reject the null hypothesis in favor of the
    alternative hypothesis.
  • Do not reject the null hypothesis in favor of the
    alternative hypothesis.
  • Two types of errors are possible when making the
    decision whether to reject H0
  • Type I error - reject H0 when it is true.
  • Type II error - do not reject H0 when it is false.

31
10.3 Testing the Population Mean When the
Population Standard Deviation is Known
  • Example 10.1
  • A new billing system for a department store will
    be cost-effective only if the mean monthly
    account is more than 170.
  • A sample of 400 monthly accounts has a mean of
    178.
  • If the accounts are approximately normally
    distributed with σ = 65, can we conclude that
    the new system will be cost-effective?

32
  • Solution
  • The population of interest is the credit accounts
    at the store.
  • We want to show that the mean account for all
    customers is greater than 170.

H1: μ > 170
  • The null hypothesis must specify a single value
    of the parameter μ

H0: μ = 170
33
  • Is a sample mean of 178 sufficiently greater
    than 170 to infer that the population mean is
    greater than 170?

34
The rejection region method
The rejection region is a range of values such
that if the test statistic falls into that range,
the null hypothesis is rejected in favor of the
alternative hypothesis.
35
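The rejection region is $\bar{x} > \bar{x}_L$, where $\bar{x}_L$ is the critical value.
Do not reject the null hypothesis if $\bar{x} \le \bar{x}_L$.
Reject the null hypothesis if $\bar{x} > \bar{x}_L$.
36
The rejection region is $\bar{x} > \bar{x}_L$.
α = P(commit a type I error) = P(reject H0 given
that H0 is true) = $P(\bar{x} > \bar{x}_L \mid \mu = 170)$

37
The rejection region is $\bar{x} > \bar{x}_L$.
With α = 0.05, $z_{.05} = 1.645$ and
$\bar{x}_L = 170 + 1.645 \cdot 65/\sqrt{400} = 175.34$
38
The rejection region is $\bar{x} > 175.34$.
Conclusion: Since the sample mean (178) is greater
than the critical value of 175.34, there is
sufficient evidence in the sample to reject H0 in
favor of H1 at the 5% significance level.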
39
The standardized test statistic
  • Instead of using the statistic $\bar{x}$, we can use
    the standardized value z:
    $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
  • Then, the rejection region becomes
    $z > z_{\alpha}$ (one-tail test)
40
  • Example 10.1 - continued
  • We redo this example using the standardized test
    statistic.
  • H0: μ = 170
  • H1: μ > 170
  • Test statistic:
    $z = \dfrac{178 - 170}{65 / \sqrt{400}} = 2.46$
  • Rejection region: $z > z_{.05} = 1.645$.
  • Conclusion: Since 2.46 > 1.645, reject the null
    hypothesis in favor of the alternative
    hypothesis.
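Both versions of the test in Example 10.1 can be reproduced in a few lines, as a sketch using only the numbers given in the slides (μ0 = 170, σ = 65, n = 400, sample mean 178, α = .05). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar, alpha = 170, 65, 400, 178, 0.05

se = sigma / sqrt(n)                        # standard error of the mean: 3.25
z_alpha = NormalDist().inv_cdf(1 - alpha)   # one-tail critical value, about 1.645

# Rejection region stated for the sample mean itself.
xbar_crit = mu0 + z_alpha * se
# Rejection region stated for the standardized statistic.
z = (xbar - mu0) / se

print(f"critical sample mean = {xbar_crit:.2f}")   # the slides report 175.34
print(f"z = {z:.2f}, z_alpha = {z_alpha:.3f}")
print("reject H0" if z > z_alpha else "do not reject H0")
```

The two forms always agree, because comparing the sample mean with the critical sample mean is the same comparison as z with z_α after standardizing.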

41
P-value method
  • The p - value provides information about the
    amount of statistical evidence that supports the
    alternative hypothesis.

42
The probability of observing a test statistic at
least as extreme as 178, given that the null
hypothesis is true, is the p-value:
$P(\bar{x} > 178 \mid \mu = 170) = P(z > 2.46) = .0069$
43
  • Interpreting the p-value
  • Because the probability that the sample mean will
    assume a value of more than 178 when μ = 170 is
    so small (.0069), there are reasons to believe
    that μ > 170.

We can conclude that the smaller the p-value
the more statistical evidence exists to support
the alternative hypothesis.
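The p-value of .0069 for Example 10.1 can be recomputed from the slide data. This is a sketch for a right-tail test only; `right_tail_p_value` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def right_tail_p_value(xbar, mu0, sigma, n):
    """P(sample mean >= xbar) when H0: mu = mu0 is true and sigma is known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)

p = right_tail_p_value(xbar=178, mu0=170, sigma=65, n=400)
print(f"p-value = {p:.4f}")
```

A sample mean closer to 170 gives a larger p-value, that is, less evidence for the alternative hypothesis.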
44
  • Describing the p-value
  • If the p-value is less than 1%, there is
    overwhelming evidence that supports the
    alternative hypothesis.
  • If the p-value is between 1% and 5%, there is
    strong evidence that supports the alternative
    hypothesis.
  • If the p-value is between 5% and 10%, there is
    weak evidence that supports the alternative
    hypothesis.
  • If the p-value exceeds 10%, there is no evidence
    that supports the alternative hypothesis.

45
  • The p-value and rejection region methods
  • The p-value can be used when making decisions
    based on rejection region methods as follows
  • Define the hypotheses to test, and the required
    significance level α.
  • Perform the sampling procedure, calculate the
    test statistic and the p-value associated with
    it.
  • Compare the p-value to α. Reject the null
    hypothesis only if p < α; otherwise, do not reject
    the null hypothesis.

46
Conclusions of a test of Hypothesis
  • If we reject the null hypothesis, we conclude
    that there is enough evidence to infer that the
    alternative hypothesis is true.
  • If we do not reject the null hypothesis, we
    conclude that there is not enough statistical
    evidence to infer that the alternative
    hypothesis is true.

The alternative hypothesis is the more
important one. It represents what we are
investigating.
47
  • Example 10.2
  • A government inspector samples 25 bottles of
    catsup labeled "Net weight 16 ounces," and
    records their weights.
  • From previous experience it is known that the
    weights are normally distributed with a standard
    deviation of 0.4 ounces.
  • Can the inspector conclude that the product
    label is unacceptable?

48
  • Solution
  • We need to draw a conclusion about the mean
    weights of all the catsup bottles.
  • We investigate whether the mean weight is less
    than 16 ounces (bottle label is unacceptable).

H0: μ = 16
H1: μ < 16
  • Select a significance level
  • α = 0.05
  • Define the rejection region
  • $z < -z_{\alpha} = -1.645$

One-tail test
49
A sample mean far below 16 should be a rare
event if μ = 16.
Rejecting H0 when μ = 16 is a mistake; we want this
mistake to happen not more than 5% of the time.
$-z_{\alpha} = -1.645$
50
Since the value of the test statistic (z = -1.25) does not
fall in the rejection region, we do not reject
the null hypothesis in favor of the alternative
hypothesis.
There is insufficient evidence to infer that the
mean is less than 16 ounces.
The p-value = P(Z < -1.25) = .1056 > .05
51
  • Example 10.3
  • The amount of time required to complete a
    critical part of a production process on an
    assembly line is normally distributed. The mean
    was believed to be 130 seconds.
  • To test if this belief is correct, a sample of
    100 randomly selected assemblies was drawn, and
    the processing time recorded. The sample mean was
    126.8 seconds.
  • If the process time is really normal with a
    standard deviation of 15 seconds, can we conclude
    that the belief regarding the mean is incorrect?

52
  • Solution
  • Is the mean different from 130?

H1: μ ≠ 130
H0: μ = 130
  • Define the rejection region
  • $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

53
A sample mean far below 130 or far above 130
should be a rare event if μ = 130.
Rejecting H0 when μ = 130 is a mistake; we want this
mistake to happen not more than 5% of the time.
54
Since the value of the test statistic falls in
the rejection region, we reject the null
hypothesis in favor of the alternative
hypothesis.
There is sufficient evidence to infer that the
mean is not 130.
The p-value = P(Z < -2.13) + P(Z > 2.13) =
2(.0166) = .0332 < .05
[Figure: standard normal curve with a rejection region of area α/2 = 0.025 in each tail; the test statistic z = -2.13 falls in the lower one.]
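The two-tail test of Example 10.3 can be sketched as below, using the slide data (sample mean 126.8, μ0 = 130, σ = 15, n = 100) and the 5% significance level. The helper name `two_tail_test` is hypothetical. The code keeps z unrounded, so its p-value differs slightly from the slide's .0332, which is based on z rounded to -2.13.

```python
from math import sqrt
from statistics import NormalDist

def two_tail_test(xbar, mu0, sigma, n, alpha=0.05):
    """Return (z, p_value, reject) for H0: mu = mu0 against H1: mu != mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    p_value = 2 * NormalDist().cdf(-abs(z))        # both tails
    return z, p_value, abs(z) > z_crit

z, p, reject = two_tail_test(xbar=126.8, mu0=130, sigma=15, n=100)
print(f"z = {z:.2f}, p-value = {p:.4f}, reject H0: {reject}")
```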
55
Testing hypotheses and interval estimators
  • Interval estimators can be used to test
    hypotheses.
  • Calculate the 1 - α confidence level interval
    estimator, then
  • if the hypothesized parameter value falls within
    the interval, do not reject the null hypothesis,
    while
  • if the hypothesized parameter value falls outside
    the interval, conclude that the null hypothesis
    can be rejected (μ is not equal to the
    hypothesized value).
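Applied to Example 10.3 (sample mean 126.8, σ = 15, n = 100, hypothesized mean 130), the interval approach can be sketched like this. The 95% level is chosen to match the 5% significance level of that example.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, sigma, n, alpha = 126.8, 130, 15, 100, 0.05

z = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}, about 1.96
half_width = z * sigma / sqrt(n)
lcl, ucl = xbar - half_width, xbar + half_width

# Reject H0 when the hypothesized value lies outside [LCL, UCL].
reject = not (lcl <= mu0 <= ucl)
print(f"95% interval: [{lcl:.2f}, {ucl:.2f}]")
print(f"hypothesized mean {mu0} outside the interval: {reject}")
```

The hypothesized value 130 lies just above the upper limit, which matches the decision reached with the two-tail z test.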

56
  • Drawbacks
  • Two-tail interval estimators may not provide the
    right answer to the question posed in one-tail
    hypothesis tests.
  • The interval estimator does not yield a p-value.

There are cases where only tests produce the
information needed to make decisions.
57
Calculating the Probability of a Type II Error
  • To properly interpret the results of a test of
    hypothesis, we need to
  • specify an appropriate significance level or
    judge the p-value of a test
  • understand the relationship between Type I and
    Type II errors.
  • How do we compute a type II error?

58
  • Calculation of a type II error requires that
  • the rejection region be expressed directly, in
    terms of the parameter hypothesized (not
    standardized).
  • the alternative value (under H1) be specified.

H0: μ = μ0, H1: μ = μ1 (μ0 is not equal to μ1)
59
  • Let us revisit example 10.1
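  • The rejection region was $\bar{x} > 175.34$ with
    α = .05.
  • A type II error occurs when a false H0 is not
    rejected.

[Figure: the sampling distribution of the mean under H0 (μ0 = 170) and under the alternative (μ1 = 180). H0 is not rejected when $\bar{x} < 175.34$, but H0 is false; the area under the μ1 = 180 curve below 175.34 is β.]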
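The probability of a type II error in this setting is the chance that the sample mean falls below the critical value 175.34 when the true mean is μ1 = 180. A minimal sketch with the slide values (σ = 65, n = 400):

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma, n = 180, 65, 400
xbar_crit = 175.34                     # critical value from the rejection region

# Sampling distribution of the mean when the alternative mu = mu1 holds.
sampling_dist = NormalDist(mu=mu1, sigma=sigma / sqrt(n))

# beta = P(do not reject H0 | mu = mu1) = P(sample mean < critical value).
beta = sampling_dist.cdf(xbar_crit)
print(f"beta = {beta:.4f}")
```

With a table and z rounded to two decimals the same calculation gives a value of about .076.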
60
  • Effects on β of changing α
  • Decreasing the significance level α increases
    the value of β, and vice versa.

If α1 > α2, then β1 < β2.
61
  • Judging the test
  • A hypothesis test is effectively defined by the
    significance level α and by the sample size
    n.
  • If the probability of a type II error β is judged
    to be too large, we can reduce it by
  • increasing α, and/or
  • increasing the sample size.

62
  • In example 10.1, suppose n increases from 400 to
    1000.
  • With α held fixed, the probability of a type II
    error falls from β1 to β2, where β1 > β2.
  • As a result β decreases.

63
  • In summary,
  • By increasing the sample size, we reduce the
    probability of type II error.
  • Hence, we shall accept the null hypothesis when
    it is false less frequently.
  • Power of a test
  • The power of a test is defined as 1 - β.
  • It represents the probability of rejecting the null
    hypothesis when it is false.
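The effects of α and n on β, and therefore on the power 1 - β, can be illustrated with the numbers of Example 10.1 (μ0 = 170, μ1 = 180, σ = 65). This is a sketch for the right-tail test only; the function name `type_ii_error` and the extra significance level α = .01 are assumptions added for comparison.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha):
    """beta for the test H0: mu = mu0 vs H1: mu > mu0, evaluated at mu = mu1."""
    se = sigma / sqrt(n)
    xbar_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se  # rejection boundary
    return NormalDist(mu1, se).cdf(xbar_crit)               # P(no rejection | mu1)

for alpha, n in [(0.05, 400), (0.01, 400), (0.05, 1000)]:
    beta = type_ii_error(170, 180, 65, n, alpha)
    print(f"alpha={alpha:.2f}  n={n:4d}  beta={beta:.4f}  power={1 - beta:.4f}")
```

Lowering α from .05 to .01 raises β, while raising n from 400 to 1000 at the same α lowers it sharply.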