Lecture 6 Hypothesis Tests for Proportions

1
Lecture 6: Hypothesis Tests for Proportions & Two Populations
Slides available from the Statistics & SPSS page of www.gpryce.com
  • Social Science Statistics Module I
  • Gwilym Pryce

2
Notices
  • Register
  • Class Reps and Staff Student committee.

3
Aims & Objectives
  • Aim
    • the aim of this lecture is to continue our introduction to the
      method of hypothesis testing and to demonstrate a number of
      applications
  • Objectives
    • by the end of this lecture students should be able to carry out
      hypothesis tests on
      • two population means
      • one population proportion
      • two population proportions

4
Plan
  • 1. Review of Significance
  • 2. Review of one sample tests on the mean
  • 3. Hypothesis tests about two population means
    • Homogeneous variances
    • Heterogeneous variances
  • 4. Deciding on whether variances are equal
  • 5. Hypothesis tests about proportions
    • One population
    • Two populations

5
Macro commands

6
Review of Significance
  • P = significance level = the chance of observing a sample mean at
    least as far from the hypothesised value as ours, given that our
    assumption about the population (denoted by H0) is true.
  • So if we find that this probability is small, it might lead us to
    question our assumption about the population mean.
  • I.e. if our sample mean is a long way from our assumed population
    mean then it is
    • either a freak sample
    • or our assumption about the population mean is wrong.

7
  • If we conclude that it is our assumption about μ that is wrong and
    reject H0, then we have to bear in mind that there is a chance
    that H0 was in fact true.
  • In other words
    • when we reject at P = 0.05, we are using a rule that would
      reject H0 in one out of every twenty samples even if H0 were in
      fact true.

8
2. Review of one sample tests on the mean
  • We introduced a common framework for hypothesis
    testing

4 Steps of Hypothesis Testing
  • Step (1): state H0 and H1
  • Step (2): state α and the formula
  • Step (3): state the decision rule
  • Step (4): compute P and decide

9
We also looked at 2 specific tests
  • Large sample sig. test on one mean
    • Formula
    • Macro syntax
      • H_L1M n(?) x_bar(?) m(?) s(?).
  • Small sample sig. test on one mean
    • Formula
    • Macro syntax
      • H_S1M n(?) x_bar(?) m(?) s(?).
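The formula images for these two tests did not survive in this transcript. As a hedged reminder, the standard textbook forms that match the macro arguments n, x_bar, m (the hypothesised mean μ) and s are sketched below. That these are exactly what the slides showed is an assumption.

```latex
% Large sample: compare z with the standard normal distribution
z = \frac{\bar{x} - \mu}{s / \sqrt{n}}

% Small sample: compare t with the t distribution on n - 1 degrees of freedom
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad df = n - 1
```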

10
3. Hypothesis tests about two population means
  • In SPSS this is called the Independent-Samples T Test
    • go to Analyze, Compare Means...
  • Two different formulas for computing t
    • Equal variances (the test statistic has an exact t-distribution)
    • Unequal variances (the test statistic does not have an exact
      t-distribution)
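The two formulas themselves were images and are missing from this transcript. The usual textbook versions are sketched below as an assumption about what was shown. The unequal variances form does reproduce the t value of 2.644 quoted later for the Fun Phonics example, but the slides do not say which degrees of freedom the macro uses for it.

```latex
% Equal variances (pooled), df = n_1 + n_2 - 2
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}

% Unequal variances
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
```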
11
Example where variances are different
  • As part of your PhD, you want to test whether the new Fun Phonics
    (FP) reading method is better than the Letterland method. You
    examine the reading power of 6-year-old children from two similar
    schools.
  • The first used the FP method and you found that this produced an
    average reading proficiency score of 53.7 (based on a sample of
    22 children; s.d. 11.5).
  • The second school used the Letterland method and you found that
    this produced an average reading proficiency score of 42.51
    (sample of 24 children; s.d. 16.9).
  • Test whether the FP method produces higher results at the 1%
    significance level.

12
  • Use the 4 steps and the following formula to test whether the FP
    method produces higher results at the 1% significance level.
  • Can you use the canned SPSS procedure to do this problem?

4 Steps of Hypothesis Testing
  • Step (1): state H0 and H1
  • Step (2): state α and the formula
  • Step (3): state the decision rule
  • Step (4): compute P and decide

13
  • (1) H0: μFP = μL (means are equal)
        H1: μFP > μL (upper tail test)
  • (2) α = 0.01 (implies a critical t value of 2.528)
  • (3) Reject H0 iff P < α, i.e. if P < 0.01
  • (4) P = Prob(t > 2.644) = 0.0076, so reject H0
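The t value in step (4) can be checked by hand. Below is a minimal Python sketch, assuming the standard unequal variances formula t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2). The function name is hypothetical and is not part of the course macros. Turning t into a P value additionally needs a t distribution with degrees of freedom that the slides do not show, so that step is left out.

```python
import math


def unequal_variance_t(n1, n2, x_bar1, x_bar2, s1, s2):
    """Two-sample t statistic without pooling the variances.

    Hypothetical helper for illustration; arguments mirror the macro's
    n1, n2, x_bar1, x_bar2, s1, s2.
    """
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (x_bar1 - x_bar2) / standard_error


# Fun Phonics (group 1) versus Letterland (group 2)
t = unequal_variance_t(22, 24, 53.7, 42.51, 11.5, 16.9)
print(round(t, 3))  # 2.644
```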

14
Doing the calculation in SPSS
  • You cannot use the canned SPSS procedure unless
    you have the original data.
  • But you can use the following macro commands
    • Homogeneous variances
      • H_S2Mp n1(?) n2(?) x_bar1(?) x_bar2(?) s1(?) s2(?).
    • Heterogeneous variances
      • H_S2Md n1(?) n2(?) x_bar1(?) x_bar2(?) s1(?) s2(?).

15
For the Letterland/FP example we would use the different variances
syntax
H_S2Md n1(22) n2(24) x_bar1(53.7) x_bar2(42.51) s1(11.5) s2(16.9).
  • The upper tail sig. = 0.007588
  • I.e. less than a 1% chance of false rejection, therefore reject H0
    of equal means in favour of the alternative hypothesis that Fun
    Phonics results in higher reading scores on average than
    Letterland.

16
4. How do we decide on whether the variances are
similar?
  • Where variances are hugely different or exactly
    the same, the decision is simple.
  • When there is any ambiguity, we can use one of two tests to help us
    • Simple Ratio of Variances Test
    • Levene's Test

17
Simple Ratio of Variances test
  • If we take the ratio of the variances of samples from two
    independent normal populations with equal variances, we find that
    that ratio has an F distribution in repeated samples
    • $F = s_1^2 / s_2^2$
  • where the numerator degrees of freedom are calculated as n1 - 1
    and the denominator degrees of freedom as n2 - 1.
  • NB Because the critical values for the F distribution are usually
    tabulated only for the upper tail, if the F value you have
    calculated is less than one, you need to invert it
    • i.e. swap round the numerator and denominator (and their degrees
      of freedom).

18
  • This is the formula behind the following command
    • H8_S2VF n1(?) n2(?) s1(?) s2(?).
  • E.g. for the Letterland/FP example
    • H8_S2VF n1(22) n2(24) s1(11.5) s2(16.9).
  • Which tells us that there is less than a 5% chance of false
    rejection if we reject the null of equal variances. So reject the
    null
    • i.e. we conclude that the population variances are different.
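The F value behind this result can be reproduced from the summary statistics. The sketch below is an illustration only: `variance_ratio` is a hypothetical helper, not the H8_S2VF macro, and it assumes the "larger variance on top" convention described on the previous slide. Looking up the significance still needs F tables or software.

```python
def variance_ratio(n1, n2, s1, s2):
    """Return (F, numerator_df, denominator_df), larger variance on top.

    Hypothetical helper for illustration; s1 and s2 are sample standard
    deviations, so they are squared to obtain variances.
    """
    var1, var2 = s1 ** 2, s2 ** 2
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    # F would be below one, so swap numerator and denominator
    return var2 / var1, n2 - 1, n1 - 1


# Letterland/FP example: s1 = 11.5 (n1 = 22), s2 = 16.9 (n2 = 24)
f_value, df_num, df_den = variance_ratio(22, 24, 11.5, 16.9)
print(round(f_value, 2), df_num, df_den)  # 2.16 23 21
```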

19
Levene's test
  • If we have the original data (rather than just the summary
    statistics) we can use Levene's test, which is a canned routine in
    SPSS.
  • Levene's test is more sophisticated and more robust than the
    simple ratio of variances test
  • If P (i.e. sig.) from Levene's test is small, reject the H0 of
    equal variances and use the unequal variances t-formula.
  • If P from Levene's test is large, do not reject the H0 of equal
    variances and use the equal variances t-formula to compute the
    test statistic.

20
[Figure: SPSS output from a test of equal purchase prices between
Cumberland and Durham (Nationwide)]

21
Two tails from one
  • Along with the Levene's test results, SPSS automatically supplies
    t-test results for both the equal and different variances
    formulas.
  • One problem with the SPSS t-test, however, is that it only gives
    the 2-tail sig., but you can work out the one-tail sigs as follows
    • The two-tailed significance is twice the smaller of the two
      one-tailed significances
    • 2-tailed sig. = 2 × min(lower tail sig., upper tail sig.)
  • But it can be a bit confusing working out which one-tail
    significance level is the one you want (see notes).
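One way to keep the two tails straight is to let the sign of t decide which tail is the small one. The sketch below assumes a symmetric test statistic (true of t and z). The function name and the example numbers are hypothetical and are not taken from the SPSS output.

```python
def one_tail_sigs(t_value, two_tail_sig):
    """Split a 2-tailed sig. into (lower_tail_sig, upper_tail_sig).

    Hypothetical helper: for a symmetric statistic the smaller tail is
    half the 2-tailed value and lies on the same side as the sign of t.
    """
    half = two_tail_sig / 2
    if t_value >= 0:
        return 1 - half, half  # positive t: the upper tail is the small one
    return half, 1 - half      # negative t: the lower tail is the small one


# Made-up illustration: t = 2.0 reported with a 2-tailed sig. of 0.05
lower, upper = one_tail_sigs(2.0, 0.05)
```

With these made-up numbers the upper tail sig. is 0.025, and 2 × min(lower, upper) recovers the 2-tailed value of 0.05.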

22
Testing for 2 means summary
  • If you've got the original data,
    • First do Levene's test in SPSS
      • Analyze, Compare Means, Independent Samples
    • Then do the appropriate macro t-test to avoid confusion.
      • H_S2Mp for equal variances or H_S2Md for different variances
  • If you don't have the original data,
    • First do the ratio of variances test
      • H8_S2VF
    • Then do the appropriate macro t-test
      • H_S2Mp for equal variances or H_S2Md for different variances

23
5.1 Hypothesis tests on proportions: 1 population
(large samples only)
  • So far we have looked at
    • how to make inferences about the population mean from our
      sample mean.
  • But sometimes the variable of interest is categorical
    • a household has or does not have insurance
    • a person is homosexual or not homosexual
    • a person has AIDS or does not have AIDS

24
  • In such cases, what we are interested in is the proportion of
    cases that fall into a particular category
    • the proportion of households with insurance
    • the proportion of people who are homosexual
    • the proportion of people with AIDS

25
  • Calculating the sample proportion
    • p = x / n
  • where
    • x = number of cases with the attribute of interest
      • e.g. the number of households with insurance
    • n = sample size

26
CLT and Proportions
  • Q/ Does the Central Limit Theorem apply to sample proportions?
  • A/ Yes.
    • In large samples, proportions from repeated random samples will
      be approximately normally distributed around the population
      proportion π.
    • We can then translate any sample proportion onto the standard
      normal curve by calculating its z score
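The z score formula was an image and is missing from this transcript. The standard large sample form is sketched below, writing p for the sample proportion, n for the sample size and π0 for the population proportion assumed under H0 (this notation is an assumption). It gives z = -3.00 for the plague example that follows, which matches the value used there.

```latex
z = \frac{p - \pi_0}{\sqrt{\pi_0 (1 - \pi_0) / n}}
```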

27
Example
  • E.g. 1 As a historian, you want to find the proportion of citizens
    in medieval Scotland that died of the plague. From a sample of
    400 parish records, you find that 22 people died of the plague.
    The assumption in the literature has been that 10% of the
    population had died. Test whether this assumption is valid using
    both 2-tailed and 1-tailed tests.

28
Summary of data: n = 400, x = 22, π0 = 0.1
  • (1) H0: π = 10%
        H1: π ≠ 10% (2-tailed test)
  • (2) α = 0.02, for example.

29
  • (3) Reject H0 iff P < α, i.e. if P < 0.02
    • (this will happen if zc < -2.33 or if zc > 2.33, where 2.33 is
      the z value associated with α = 0.02 in a 2-tailed test. Since
      zc = -3.00, we know we can safely reject H0).
  • (4) Calculate z
    • $z = (p - \pi_0) / \sqrt{\pi_0 (1 - \pi_0) / n} = (0.055 - 0.1) / 0.015 = -3.00$
    • P = 2 × Prob(z < -3.00) = 2 × 0.0013 = 0.0026
    • since P < 0.02 (i.e. less than a one in 50 chance of type I
      error) we can reject H0.
  • In fact, the chances of incorrect rejection of H0 are less than
    one in 370.
  • I.e. the chances of observing p (our sample proportion) assuming
    H0 (π = 10%) to be true are so small that we are forced to
    question this assumption about π

30
One tailed test
  • (1) H0: π = 10%
        H1: π < 10% (lower tail test)
  • (2) α = 0.02
  • (3) Reject H0 iff P < α, i.e. if P < 0.02
  • (4) Lower tail sig. = P = Prob(z < -3.00) = 0.001350
    • since P < 0.02 we can reject H0, knowing that the chances of
      incorrect rejection of H0 are less than one in 740
    • our cut-off rule for rejecting H0 was no more than a one in 50
      chance
    • one in 740 is a lot less than one in 50, so we can reject H0
      with confidence.

31
  • The macro syntax for one proportion tests is as follows
    • H6_L1P n(400) x(22) pi(0.1).
  • Which comes to the same result.
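For readers without the macro, the same numbers can be reproduced with a short Python sketch. It assumes the usual large sample z test with the standard error computed from the hypothesised proportion. `normal_cdf` and `one_proportion_z` are hypothetical helper names, not part of the course macros.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability Prob(Z < z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def one_proportion_z(n, x, pi0):
    """Return (z, lower_tail_sig, two_tail_sig) for H0: pi = pi0."""
    p = x / n                                # sample proportion
    se = math.sqrt(pi0 * (1 - pi0) / n)      # standard error under H0
    z = (p - pi0) / se
    lower = normal_cdf(z)
    two_tail = 2 * min(lower, 1 - lower)
    return z, lower, two_tail


# Plague example: n = 400 records, x = 22 deaths, H0: pi = 0.1
z, lower_sig, two_tail_sig = one_proportion_z(400, 22, 0.1)
print(round(z, 2), round(lower_sig, 5), round(two_tail_sig, 4))
# -3.0 0.00135 0.0027
```

The 2-tailed value prints as 0.0027 rather than the 0.0026 worked out by hand, because the hand calculation doubles the table value 0.0013 instead of 0.00135.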

32
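5.2 Hypothesis tests about two population proportions
  • To test the hypothesis that the population proportions are equal
    • H0: π1 = π2
  • compute the z statistic
    • $z = (p_1 - p_2) / SED_p$
  • where SEDp is the pooled standard error
    • $SED_p = \sqrt{p (1 - p) (1/n_1 + 1/n_2)}$
  • and p is the pooled sample proportion
    • $p = (x_1 + x_2) / (n_1 + n_2)$

33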
Example
  • Two surveys of mortgage payment protection insurance (MPPI) are
    carried out, one on single parents with 1 child and one on single
    parents with 3 children. Amongst the first group, 67 out of a
    sample of 300 were found to have taken out MPPI, compared with 15
    out of a sample of 101 in the second group. Is take-up
    significantly lower amongst the households (HHs) with three
    children?
  • p1 = 67/300 = 0.2233
  • p2 = 15/101 = 0.1485
  • p = (67 + 15)/(300 + 101) = 0.2045

34
  • (1) H0: π1 = π2
        H1: π1 > π2
  • (2) α = 0.01 (critical z = 2.33)
  • (3) Reject H0 if P < 0.01
  • (4) P = 0.053.
  • Take-up is not significantly lower amongst HHs with 3 children at
    the 1% sig. level or even at the 5% significance level.
  • I.e. we cannot say that the difference in proportions is anything
    more than the effect of sampling variation.
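The P value in step (4) can be reproduced with a short Python sketch. It assumes the pooled standard error form of the two proportion z test, which does return the 0.053 quoted above. The function names are hypothetical and are not part of the course macros.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability Prob(Z < z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def two_proportion_z(n1, n2, x1, x2):
    """Return (z, upper_tail_sig) for H0: pi1 = pi2 against H1: pi1 > pi2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)           # pooled sample proportion
    sed = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / sed
    return z, 1 - normal_cdf(z)


# MPPI example: 67 of 300 (one child) versus 15 of 101 (three children)
z, upper_sig = two_proportion_z(300, 101, 67, 15)
print(round(z, 2), round(upper_sig, 3))  # 1.61 0.053
```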

35
  • H7_L2P n1(300) n2(101) x1(67) x2(15).

36
Summary
  • 1. Review of Significance
  • 2. Review of one sample tests on the mean
  • 3. Hypothesis tests about two population means
    • Homogeneous variances
    • Heterogeneous variances
  • 4. Deciding on whether variances are equal
  • 5. Hypothesis tests about proportions
    • One population
    • Two populations