Business Statistics for Managerial Decision - PowerPoint PPT Presentation

Slides: 34
Provided by: wiu
Learn more at: http://faculty.wiu.edu

Transcript and Presenter's Notes

1
Business Statistics for Managerial Decision
  • Comparing two Population Means

2
Comparing Two Means
  • How do small businesses that fail differ from
    those that succeed?
  • Business school researchers compare two samples
    of firms started in 2000, one sample of failed
    businesses and one of firms that are still going
    after two years.
  • This study compares two random samples, one from
    each of two different populations.

3
Two-Sample problems
  • The goal of inference is to compare the responses
    in two groups
  • Each group is considered to be a sample from a
    distinct population.
  • The responses in each group are independent of
    those in the other group.

4
Two-Sample problems
  • Notation
  • We have two independent samples, from two
    distinct populations (such as failed businesses
    and successful businesses).
  • We measure the same variable (such as initial
    capital) in both samples
  • We call the variable x1 in the first population
    and x2 in the second population.
    Population   Variable   Mean      Standard deviation
    1            $x_1$      $\mu_1$   $\sigma_1$
    2            $x_2$      $\mu_2$   $\sigma_2$

5
Two-Sample problems
  • We want to compare the two population means,
    either by giving a confidence interval for
    $\mu_1 - \mu_2$ or by testing the hypothesis of no
    difference, $H_0: \mu_1 = \mu_2$.
  • We base inference on two independent SRSs, one
    from each population.
    Population   Sample size   Sample mean   Sample standard deviation
    1            $n_1$         $\bar{x}_1$   $s_1$
    2            $n_2$         $\bar{x}_2$   $s_2$

6
The Two-Sample z Statistic
  • The natural estimator of the difference
    $\mu_1 - \mu_2$ is the difference between the
    sample means, $\bar{x}_1 - \bar{x}_2$.
  • To base inference on this statistic we need to
    know its sampling distribution.
  • The mean of the difference $\bar{x}_1 - \bar{x}_2$
    is the difference of the means, $\mu_1 - \mu_2$.
  • Because the samples are independent, their sample
    means $\bar{x}_1$ and $\bar{x}_2$ are independent.
  • The variance of the difference
    $\bar{x}_1 - \bar{x}_2$ is the sum of their
    variances, which is
    $$\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

7
The Two-Sample z Statistic
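  • Suppose that $\bar{x}_1$ is the mean of an SRS of
    size $n_1$ drawn from a $N(\mu_1, \sigma_1)$
    population and that $\bar{x}_2$ is the mean of an
    independent SRS of size $n_2$ drawn from a
    $N(\mu_2, \sigma_2)$ population. Then the
    two-sample z statistic
    $$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
    has the standard Normal $N(0, 1)$ sampling
    distribution.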
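
A minimal sketch of the two-sample z statistic in Python, assuming the population standard deviations are known. The function name `two_sample_z`, the parameter `diff0` (the hypothesized value of the difference of the population means), and the input numbers are all hypothetical and chosen only to exercise the formula.

```python
import math
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, diff0=0.0):
    """Two-sample z statistic for H0: mu1 - mu2 = diff0 (sigmas known)."""
    # Variances of independent sample means add, so the standard
    # deviation of xbar1 - xbar2 is the square root of their sum.
    sd_diff = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((xbar1 - xbar2) - diff0) / sd_diff


# Hypothetical inputs (not taken from the presentation).
z = two_sample_z(xbar1=52.0, xbar2=50.0, sigma1=3.0, sigma2=4.0, n1=9, n2=16)

# Upper-tail probability under the standard Normal distribution.
p_upper = 1.0 - NormalDist().cdf(z)
print(f"z = {z:.3f}, upper-tail P = {p_upper:.4f}")
```

With these inputs each sample mean has variance 1, so the difference has standard deviation equal to the square root of 2.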

8
The Two-Sample t Procedures
  • In practice, the two population standard
    deviations $\sigma_1$ and $\sigma_2$ are not known.
  • We estimate them by the sample standard deviations
    $s_1$ and $s_2$ from our two samples.
  • The two-sample t statistic is
    $$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
  • This statistic does not have a t distribution.
  • We can approximate the distribution of the
    two-sample t statistic by using the t(k)
    distribution with an approximation for the
    degrees of freedom k.

9
The Two-Sample t Procedures
  • We use the approximation to find approximate
    values of $t^*$ for confidence intervals and to
    find approximate P-values for significance tests.
  • This can be done in two ways:
  • Use the Satterthwaite approximation to calculate
    a value of k from the data. In general, the
    resulting k will not be a whole number.
  • Use degrees of freedom k equal to the smaller of
    $n_1 - 1$ and $n_2 - 1$.
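
The two choices of k can be compared with a short Python sketch. The Satterthwaite formula is not written out on the slides; the version coded here is the standard one, k = (v1 + v2)^2 / (v1^2/(n1 - 1) + v2^2/(n2 - 1)) with v1 = s1^2/n1 and v2 = s2^2/n2. The function names are hypothetical, and the inputs are the summary statistics of the reading example used later in this presentation.

```python
def satterthwaite_df(s1, s2, n1, n2):
    """Approximate degrees of freedom from the sample data (Satterthwaite)."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def conservative_df(n1, n2):
    """Simpler choice: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


# Summary statistics from the reading example (treatment, control).
k_approx = satterthwaite_df(s1=11.01, s2=17.15, n1=21, n2=23)
k_simple = conservative_df(n1=21, n2=23)
print(f"Satterthwaite k = {k_approx:.2f}, conservative k = {k_simple}")
```

The Satterthwaite value always falls between the smaller of n1 - 1 and n2 - 1 and n1 + n2 - 2, so the second option gives the more cautious (wider) procedure.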

10
The Two-Sample t Significance Test
  • Draw an SRS of size $n_1$ from a Normal population
    with unknown mean $\mu_1$ and an independent SRS
    of size $n_2$ from another Normal population with
    unknown mean $\mu_2$. To test the hypothesis
    $H_0: \mu_1 - \mu_2 = 0$, compute the two-sample
    t statistic
    $$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
  • and use P-values or critical values for the t(k)
    distribution, where the degrees of freedom k are
    the smaller of $n_1 - 1$ and $n_2 - 1$.

11
Example: Is our product effective?
  • A company that sells educational materials
    reports statistical studies to convince customers
    that its materials improve learning. One new
    product supplies directed reading activities
    for classroom use. These activities should
    improve the reading ability of elementary school
    pupils.

12
Example: Is our product effective?
  • A consultant arranges for a third-grade class of
    21 students to take part in these activities for
    an eight-week period. A control classroom of 23
    third-graders follows the same curriculum without
    the activities. At the end of the eight weeks,
    all students are given a Degree of Reading Power
    (DRP) test, which measures the aspects of reading
    ability that the treatment is designed to
    improve. The data appear in Table 7.3.

13
Example: Is our product effective?

14
Example: Is our product effective?
  • A back-to-back stemplot suggests that there is a
    mild outlier in the control group but no
    deviation from Normality serious enough to forbid
    use of the t procedures.

15
Example: Is our product effective?
  • The summary statistics are
    Group       n    $\bar{x}$   s
    Treatment   21   51.48       11.01
    Control     23   41.52       17.15
  • We hope to show that the treatment (group 1) is
    better than the control (group 2), therefore the
    hypotheses are
  • $H_0: \mu_1 = \mu_2$
  • $H_a: \mu_1 > \mu_2$

16
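Example: Is our product effective?
  • The two-sample t statistic is
    $$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{51.48 - 41.52}{\sqrt{\dfrac{11.01^2}{21} + \dfrac{17.15^2}{23}}} = \frac{9.96}{4.31} = 2.31$$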

17
Example: Is our product effective?
  • The P-value for the one-sided test is
    $P(T \ge 2.31)$.
  • The degrees of freedom k are equal to the smaller
    of $n_1 - 1 = 21 - 1 = 20$ and
    $n_2 - 1 = 23 - 1 = 22$. Comparing $t = 2.31$ with
    the entries in the t table for 20 degrees of
    freedom, we see that P lies between 0.01 and 0.02.
  • Conclusion
  • The data strongly suggest that directed reading
    activity improves the DRP score.
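
The test above can be reproduced from the summary statistics alone. The sketch below uses only the Python standard library; the helper names `t_density` and `t_upper_tail` are hypothetical, and the tail area is obtained by numerically integrating the t(k) density with Simpson's rule rather than by reading a table, so its last digits are approximate.

```python
import math


def t_density(x, k):
    """Density of the t distribution with k degrees of freedom."""
    c = math.gamma((k + 1) / 2) / (math.sqrt(k * math.pi) * math.gamma(k / 2))
    return c * (1 + x * x / k) ** (-(k + 1) / 2)


def t_upper_tail(t, k, steps=2000):
    """P(T >= t) for t >= 0: 0.5 minus the area on [0, t] (Simpson's rule)."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, k) + t_density(t, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, k)
    return 0.5 - total * h / 3


# Summary statistics: group 1 = treatment, group 2 = control.
n1, xbar1, s1 = 21, 51.48, 11.01
n2, xbar2, s2 = 23, 41.52, 17.15

se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # standard error of xbar1 - xbar2
t_stat = (xbar1 - xbar2) / se
k = min(n1 - 1, n2 - 1)                  # conservative degrees of freedom
p_value = t_upper_tail(t_stat, k)        # one-sided, Ha: mu1 > mu2

print(f"t = {t_stat:.2f}, k = {k}, one-sided P = {p_value:.4f}")
```

The computed P-value should fall in the same range that the t table gives for 20 degrees of freedom.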

18
The Two-Sample t Confidence Interval
  • The same ideas that we used for the two-sample t
    significance test also give us the two-sample
    t confidence interval.
  • Draw an SRS of size $n_1$ from a Normal population
    with unknown mean $\mu_1$ and an independent SRS
    of size $n_2$ from another Normal population with
    unknown mean $\mu_2$. The confidence interval for
    $\mu_1 - \mu_2$ is given by
    $$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
  • Here $t^*$ is the value for the t(k) density curve
    with area C between $-t^*$ and $t^*$. The value of
    the degrees of freedom k is approximated by
    software, or we use the smaller of $n_1 - 1$ and
    $n_2 - 1$.

19
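Example: How much improvement?
  • We will find a 95% confidence interval for the
    mean improvement in the entire population of
    third-graders. The interval is
    $$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (51.48 - 41.52) \pm 2.086 \times 4.31 = 9.96 \pm 8.99 = (0.97, 18.95)$$
  • Using the t(20) distribution, the t table gives
    $t^* = 2.086$.
  • We estimate the mean improvement in DRP scores to
    be about 10 points, but with a margin of error of
    almost 9 points.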
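
A minimal Python sketch of the interval arithmetic, assuming the critical value 2.086 from the t(20) table quoted above. The variable names are hypothetical.

```python
import math

# Summary statistics: group 1 = treatment, group 2 = control.
n1, xbar1, s1 = 21, 51.48, 11.01
n2, xbar2, s2 = 23, 41.52, 17.15

t_star = 2.086  # t(20) critical value for 95% confidence, from the t table

diff = xbar1 - xbar2                                   # estimated improvement
margin = t_star * math.sqrt(s1**2 / n1 + s2**2 / n2)   # margin of error
lower, upper = diff - margin, diff + margin

print(f"difference = {diff:.2f}, margin of error = {margin:.2f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

The interval is wide relative to the estimate because both samples are small and the control group is quite variable.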

20
The Pooled Two-sample t Procedures
  • There is one situation in which a t statistic for
    comparing two means has exactly a t distribution.
  • Suppose that the two Normal population
    distributions have the same standard deviation.
  • Call the common standard deviation $\sigma$. Both
    sample variances $s_1^2$ and $s_2^2$ estimate
    $\sigma^2$.
  • The best way to combine these two estimates is to
    average them with weights equal to their degrees
    of freedom.
  • The resulting estimate of $\sigma^2$ is
    $$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

21
The Pooled Two-sample t Procedures
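  • $s_p^2$ is called the pooled estimator of
    $\sigma^2$.
  • When both populations have variance $\sigma^2$,
    the addition rule for variances says that
    $\bar{x}_1 - \bar{x}_2$ has variance equal to the
    sum of the individual variances,
    $$\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2} = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$
  • Now we can substitute $s_p^2$ for $\sigma^2$ in
    the test statistic, and the resulting t statistic
    has a t distribution.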
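
The pooled estimate is a weighted average of the two sample variances, with weights equal to their degrees of freedom. A small Python sketch follows; the function names and the input numbers are hypothetical and are there only to show how the weighting behaves.

```python
import math


def pooled_variance(s1, s2, n1, n2):
    """Average of s1^2 and s2^2, weighted by degrees of freedom."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def pooled_standard_error(sp2, n1, n2):
    """Estimated standard deviation of xbar1 - xbar2 under equal variances."""
    return math.sqrt(sp2 * (1 / n1 + 1 / n2))


# Hypothetical inputs: the larger sample gets the larger weight.
sp2_unequal = pooled_variance(s1=2.0, s2=4.0, n1=11, n2=21)

# With equal sample sizes the pooled variance is the plain average.
sp2_equal = pooled_variance(s1=2.0, s2=4.0, n1=10, n2=10)

print(sp2_unequal, sp2_equal, pooled_standard_error(sp2_equal, 10, 10))
```

The pooled variance always lies between the two sample variances, closer to the one from the sample with more degrees of freedom.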

22
The Pooled Two-sample t Procedures
  • Draw an SRS of size $n_1$ from a Normal population
    with unknown mean $\mu_1$ and an independent SRS
    of size $n_2$ from another Normal population with
    unknown mean $\mu_2$. Suppose that the two
    populations have the same unknown standard
    deviation. A level C confidence interval for
    $\mu_1 - \mu_2$ is
    $$(\bar{x}_1 - \bar{x}_2) \pm t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
  • Here $t^*$ is the value for the
    $t(n_1 + n_2 - 2)$ density curve with area C
    between $-t^*$ and $t^*$.

23
The Pooled Two-sample t Procedures
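  • To test the hypothesis $H_0: \mu_1 = \mu_2$,
    compute the pooled two-sample t statistic
    $$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
  • and use P-values from the $t(n_1 + n_2 - 2)$
    distribution.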

24
Healthy Companies versus Failed Companies
  • In what ways are companies that fail different
    from those that continue to do business?
  • To answer this question, one study compared
    various characteristics of 68 healthy and 33
    failed firms.
  • One of the variables was the ratio of current
    assets to current liabilities.
  • The data appear in Table 7.4.

25
Healthy Companies versus Failed Companies

26
Healthy Companies versus Failed Companies
  • First, let's look at the data.
  • Histograms for the two groups of firms are given,
    each superimposed with a Normal curve whose mean
    and standard deviation equal the sample values.
  • The distribution for the healthy firms looks more
    Normal than the distribution for the failed firms.

27
Healthy Companies versus Failed Companies
  • The back-to-back stemplot confirms our findings
    from the previous plots: there are no outliers
    or strong departures from Normality that would
    prevent us from using the t procedures for these
    data.

28
Example: Do mean asset/liability ratios differ?
  • Take group 1 to be the firms that were healthy
    and group 2 to be those that failed. The question
    of interest is whether or not the mean ratio of
    current assets to current liabilities is
    different for the two groups. We therefore test
  • $H_0: \mu_1 = \mu_2$
  • $H_a: \mu_1 \ne \mu_2$

29
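Example: Do mean asset/liability ratios differ?
  • Here are the summary statistics
    Group   Firm      n    $\bar{x}$   s
    1       Healthy   68   1.7256      0.6393
    2       Failed    33   0.8236      0.4811
  • The sample standard deviations are fairly
    close. We are willing to assume equal population
    standard deviations. The pooled sample variance
    is
    $$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(67)(0.6393)^2 + (32)(0.4811)^2}{68 + 33 - 2} = 0.3514$$
  • and
    $$s_p = \sqrt{0.3514} = 0.5928$$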

30
Example: Do mean asset/liability ratios differ?
  • The pooled two-sample t statistic is
    $$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{1.7256 - 0.8236}{0.5928 \sqrt{\dfrac{1}{68} + \dfrac{1}{33}}} = 7.17$$
  • The P-value is $P = 2P(T \ge 7.17)$,
  • where T has the t(99) distribution. The t table
    has entries for 100 degrees of freedom, so we
    use the entries for 100. Our calculated value of
    t is larger than the t-value corresponding to the
    p = 0.0005 entry in the table. Doubling 0.0005, we
    conclude that the two-sided P-value is less than
    0.001.
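
The pooled test can be recomputed from the summary statistics for the healthy and failed firms. In this Python sketch the variable names are hypothetical, and the critical value 3.390 is the standard t-table entry for 100 degrees of freedom and upper-tail probability 0.0005; that number is not quoted on the slides and is included here as an assumption taken from a standard table.

```python
import math

# Summary statistics: group 1 = healthy firms, group 2 = failed firms.
n1, xbar1, s1 = 68, 1.7256, 0.6393
n2, xbar2, s2 = 33, 0.8236, 0.4811

# Pooled variance: sample variances weighted by degrees of freedom.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
sp = math.sqrt(sp2)

# Pooled two-sample t statistic with n1 + n2 - 2 degrees of freedom.
t_stat = (xbar1 - xbar2) / (sp * math.sqrt(1 / n1 + 1 / n2))
df = n1 + n2 - 2

# Assumed table value: t(100) upper-tail probability 0.0005.
t_crit_0005 = 3.390
two_sided_p_below_001 = t_stat > t_crit_0005

print(f"sp^2 = {sp2:.4f}, sp = {sp:.4f}, t = {t_stat:.2f}, df = {df}")
print("two-sided P < 0.001:", two_sided_p_below_001)
```

Because the statistic is far beyond the last entry in the table, the exact P-value matters little; the table bound is enough for the conclusion.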

31
Example: How different are mean asset/liability ratios?
  • A P-value is rarely a complete summary of a
    statistical analysis. To make a judgment
    regarding the size of the difference between the
    two groups of firms, we need a confidence
    interval.
  • The difference in mean current assets to current
    liabilities ratios for healthy versus failed
    firms is
    $$\bar{x}_1 - \bar{x}_2 = 1.7256 - 0.8236 = 0.9020$$

32
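Example: How different are mean asset/liability ratios?
  • For a 95% margin of error we will use the
    critical value $t^* = 1.984$ from the t(100)
    distribution. The margin of error is
    $$t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (1.984)(0.5928)\sqrt{\frac{1}{68} + \frac{1}{33}} = 0.25$$
  • This gives the following 95% confidence
    interval:
    $$0.90 \pm 0.25 = (0.65, 1.15)$$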

33
Example: How different are mean asset/liability ratios?
  • We report that the successful firms have current
    assets to current liabilities ratios that average
    0.90 higher than failed firms, with margin of
    error 0.25 for 95% confidence.
  • Alternatively, we are 95% confident that the
    difference is between 0.65 and 1.15.
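
As a check on the interval arithmetic, the pooled 95% confidence interval can be recomputed directly from the summary statistics of the two groups (healthy: n = 68, mean 1.7256, s = 0.6393; failed: n = 33, mean 0.8236, s = 0.4811). This is a minimal Python sketch with hypothetical variable names, assuming the t(100) critical value 1.984 quoted in the presentation.

```python
import math

# Summary statistics: group 1 = healthy firms, group 2 = failed firms.
n1, xbar1, s1 = 68, 1.7256, 0.6393
n2, xbar2, s2 = 33, 0.8236, 0.4811

t_star = 1.984  # t(100) critical value for 95% confidence

# Pooled standard deviation under the equal-variance assumption.
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

diff = xbar1 - xbar2                                  # difference in sample means
margin = t_star * sp * math.sqrt(1 / n1 + 1 / n2)     # margin of error
lower, upper = diff - margin, diff + margin

print(f"difference = {diff:.2f}, margin of error = {margin:.2f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

The center of the interval is the difference of the two sample means, and the margin of error is added to and subtracted from that center.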