Comparing Two Groups Means or Proportions - PowerPoint PPT Presentation

About This Presentation
Title:

Comparing Two Groups Means or Proportions

Description:

t curves are symmetric and bell-shaped like the normal distribution. ... For example, does sex affect income? Women's mean = Men's Mean ? ... – PowerPoint PPT presentation

Slides: 23
Provided by: JamesD171

Transcript and Presenter's Notes

Title: Comparing Two Groups Means or Proportions


1
Comparing Two Groups Means or Proportions
  • Independent Samples t-tests

2
Review
  • Confidence Interval for a Mean
  • Slap a sampling distribution over a sample mean
    to determine a range in which the population mean
    has a particular probability of being, such as a
    95% CI.
  • If our sample is one of the middle 95%, we know
    that the mean of the population is within the CI.

Significance Test for a Mean: Slap a sampling
distribution over a guess of the population mean
to determine if the sample has a very low
probability of having come from a population
where the guess is true, such as an α-level of .05. If
our sample mean is in the outer 5%, we know to
reject the guess: our sample has a low chance of
having come from a population with the mean we
guessed.
[Figure: two sampling distributions, each with 2.5% in each tail beyond -1.96z and 1.96z.
One is centered on Y-bar: 95% CI = Y-bar +/- 1.96 (s.e.).
The other is centered on the guess µo: z or t = (Y-bar - µo) / s.e.]
Sampling distribution: the way statistics for
samples of a certain size would stack up, or be
distributed, after all possible samples are
collected.
3
Review
  • Let's collect some data on educational
    aspirations and produce a 95% confidence interval
    to tell us where the population parameter likely
    falls, and then let's do a test of significance
    where we guess that average aspiration will be 16
    years.
  • I collected a sample of 625 kids who reported
    their educational aspirations, where 12 = high
    school, 16 = 4 years of college, and so
    forth. The average for the sample was 15 years
    with a standard deviation of 2 years.
  • 95% confidence interval: 95% CI = Sample Mean +/-
    z * s.e.
  • Find the standard error of the sampling
    distribution:
  • s.e. = s / √n = 2/√625 = 2/25 = 0.08
  • Build the width of the interval. 95% corresponds
    with a z of +/- 1.96.
  • +/- z * s.e. = 1.96 * 0.08 = 0.157
  • Insert the mean to build the interval:
  • 95% CI = Sample Mean +/- z * s.e. = 15 +/-
    0.157
  • The interval: 14.84 to 15.16
  • We are 95% confident that the population mean
    falls between these values. (What does this say
    about my guess???)
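
The interval arithmetic above can be sketched in a few lines of Python. This is only an illustration using the numbers from the slide; the function name `mean_confidence_interval` is made up for this sketch, and it assumes a large-sample z interval rather than t.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, confidence=0.95):
    """Return (low, high) for a large-sample z interval around a sample mean."""
    se = sd / math.sqrt(n)                          # standard error: s / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * se                                 # half-width of the interval
    return mean - margin, mean + margin

# Numbers from the slide: n = 625, mean = 15 years, s.d. = 2 years.
low, high = mean_confidence_interval(mean=15, sd=2, n=625)
print(f"95% CI: {low:.2f} to {high:.2f}")  # prints "95% CI: 14.84 to 15.16"
```

Since 16 lies outside this interval, the guess of 16 years already looks doubtful before any significance test is run.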

4
Review
  • Let's collect some data on educational
    aspirations and produce a 95% confidence interval
    to tell us where the population parameter likely
    falls, and then let's do a test of significance
    where we guess that average aspiration will be 16
    years.
  • I collected a sample of 625 kids who reported
    their educational aspirations, where 12 = high
    school, 16 = 4 years of college, and so
    forth. The average for the sample was 15 years
    with a standard deviation of 2 years.
  • Significance Test: z or t = (Y-bar - µo) / s.e.
  • Decide α-level (α = .05) and nature of test
    (two-tailed)
  • Set critical z or t (+/- 1.96)
  • Make guess or null hypothesis:
  • Ho: µ = 16
  • Ha: µ ≠ 16
  • Collect and analyze data
  • Calculate z or t: z/t = (Y-bar - µo) / s.e.
    (s.e. = s/√n = 2/√625 = 2/25 = .08)
  • z/t = (15 - 16)/.08 = -1/.08 = -12.5
  • Make a decision about the null hypothesis (reject
    the null: -12.5 < -1.96)
  • Find the P-value (look up 12.5 in z or t table):
    P < .0001
  • It is extremely unlikely that our sample came
    from a population where the mean is 16.
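
As a rough check on the steps above, here is a small Python sketch of the same test. The function name `one_sample_z_test` is hypothetical, and the sketch assumes the normal (z) approximation, which is reasonable with n = 625. It uses `math.erfc` for the two-tailed P-value so that a very small tail area does not round to zero.

```python
import math

def one_sample_z_test(sample_mean, null_mean, sd, n):
    """Return (z, two-tailed P-value) for a large-sample test of a mean."""
    se = sd / math.sqrt(n)                # standard error: s / sqrt(n)
    z = (sample_mean - null_mean) / se    # test statistic
    # Two-tailed P-value: 2 * P(Z > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Numbers from the slide: Y-bar = 15, guess = 16, s = 2, n = 625.
z, p = one_sample_z_test(sample_mean=15, null_mean=16, sd=2, n=625)
print(f"z = {z:.1f}")
print("reject Ho" if abs(z) > 1.96 else "fail to reject Ho")
print("P < .0001" if p < 0.0001 else f"P = {p:.4f}")
```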

5
Other Probability Distributions
  • A note: not all theoretical probability
    distributions are Normal. One example of many is
    the binomial distribution.
  • The binomial distribution gives the discrete
    probability distribution of obtaining exactly n
    successes out of N trials, where each trial
    succeeds with a known probability and fails
    with the complementary probability.
  • The binomial distribution has a formula and
    changes shape with each probability of success
    and number of trials.
  • However, in this class the normal probability
    distribution is the most useful!

[Figure: a binomial distribution, used with proportions; horizontal axis: number of successes, 0 to 12]
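
To make the binomial formula concrete, here is a minimal Python sketch. The 12 trials match the 0 to 12 axis of the figure; the success probability of 0.5 is an assumption chosen for illustration, not a value from the slides, and `binomial_pmf` is a name made up here.

```python
from math import comb

def binomial_pmf(k, n_trials, p_success):
    """Probability of exactly k successes in n_trials independent trials."""
    return comb(n_trials, k) * p_success**k * (1 - p_success)**(n_trials - k)

N_TRIALS = 12      # matches the 0..12 axis in the figure
P_SUCCESS = 0.5    # assumed for illustration only

probs = [binomial_pmf(k, N_TRIALS, P_SUCCESS) for k in range(N_TRIALS + 1)]
for k, prob in enumerate(probs):
    print(f"{k:>2} successes: {prob:.4f} {'#' * round(prob * 100)}")
```

Changing `P_SUCCESS` or `N_TRIALS` changes the shape of the printed bars, which is the point the slide makes about the distribution changing with each probability of success and number of trials.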
6
t
  • We use t instead of z to be more accurate
  • t curves are symmetric and bell-shaped like the
    normal distribution. However, the spread is more
    than that of the standard normal distribution: the
    tails are fatter.

Tea Tests?
[Figure: t curves for df = 1, 2, 3, and so on, approaching normal as df
exceeds 120.]
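
One way to see the fatter tails is to compare the height of the t curve with the height of the normal curve far from the center. The sketch below writes out the standard t density by hand using only the Python standard library; the comparison point of 3.0 and the list of df values are arbitrary choices for illustration.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with `df` degrees of freedom at x."""
    # Work on the log scale so a large df does not overflow the gamma function.
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

X = 3.0  # a point far out in the tail, chosen for illustration
normal_height = NormalDist().pdf(X)

for df in (1, 2, 3, 30, 120):
    ratio = t_pdf(X, df) / normal_height
    print(f"df={df:>3}: t density at {X} is {ratio:.2f} times the normal density")
```

The ratio shrinks toward 1 as df grows, which is the sense in which t approaches the normal curve.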
7
t
  • The reason for using t is that we
    use the sample standard deviation (s) rather than
    the population standard deviation (σ) to calculate
    standard error. Since s, the sample standard deviation,
    will vary from sample to sample, the variability
    in the sampling distribution ought to be greater
    than in the normal curve. t has a larger spread,
    more accurately reflecting the likelihood of
    extreme samples, especially when sample size is
    small.
  • The larger the degrees of freedom (n - 1 when
    estimating the mean), the closer the t curve is
    to the normal curve. This reflects the fact that
    the standard deviation s approaches σ for large
    sample size n.
  • Even though z-scores based on the normal curve
    will work for larger samples (n > 120), SPSS uses
    t for all tests because it works for small
    samples and large samples alike.
  • (df = the number of scores that are free to vary
    when calculating a statistic . . . n - ?)

Tea Tests?
8
Comparing Two Groups
  • We're going to move forward to more sophisticated
    statistics, building on what we have learned
    about confidence intervals and significance
    tests.
  • Sociologists look for relationships between
    concepts in the social world.
  • For example:
  • Does one's sex affect income?
  • Focus on the relationship between the concepts
    Sex and Income
  • Does one's race affect educational attainment?
  • Focus on the relationship between the concepts
    Race and Educational Attainment

I love sophisticated statistics!
9
Comparing Two Groups
  • In this section of the course, you will learn
    ways to infer from a sample whether two concepts
    are related in a population.
  • Independent variable (X): That which causes
    another variable to change when it changes.
  • Dependent variable (Y): That which changes in
    response to change in another variable.
  • X → Y
  • (X = Sex or Race) → (Y = Income or
    Education)
  • The statistical technique you use will depend on
    the level of measurement of your independent and
    dependent variables: the statistical test must
    match the variables!
  • Levels of Measurement: Nominal, Ordinal,
    Interval-Ratio

10
Comparing Two Groups
  • The test you choose depends on level of
    measurement:
  • Independent | Dependent | Statistical Test
  • Dichotomous | Interval-ratio or Dichotomous | Independent Samples t-test
  • Nominal, Ordinal, or Dichotomous | Nominal, Ordinal, or Dichotomous | Cross Tabs
  • Nominal, Ordinal, or Dichotomous | Interval-ratio or Dichotomous | ANOVA
  • Interval-ratio or Dichotomous | Interval-ratio | Correlation and OLS Regression

11
Comparing Two Groups
  • Independent | Dependent | Statistical Test
  • Dichotomous | Interval-ratio or Dichotomous | Independent Samples t-test
  • An independent samples t-test is concerned with
    whether a mean or proportion is equal between two
    groups. For example, does sex affect income?

[Figure: income distributions for women and for men, each with its own population mean µ. Is the women's mean equal to the men's mean?]
12
Comparing Two Groups
  • Independent Samples t-tests
  • Earlier, our focus was on the mean. We used the
    mean of the sample (statistic) to infer a range
    for what our population mean (parameter) might be
    (confidence interval) or whether it was like some
    guess or not (significance test).
  • Now, our focus is on the difference in the mean
    for two groups. We will use the difference of
    the sample means (statistic) to infer a range for
    what our population difference in means
    (parameter) might be (confidence interval) or
    whether it is like some guess (significance test).

13
Comparing Two Groups
  • The difference will be calculated as follows:
  • D-bar = Y-bar2 - Y-bar1
  • For example:
  • Average Difference in Income by Sex =
  • Male Average Income - Female Average Income
  • (What would it mean if men's income minus women's
    income equaled zero?)

14
Comparing Two Groups
  • Like the mean, if one were to take random sample
    after random sample from two groups (with normal
    population distributions) and calculate and record
    the difference between groups each time, one
    would see the formation of a Sampling
    Distribution for D-bar that was normal and
    centered on the two populations' difference.

[Figure: Sampling Distribution of D-bar, the average difference between two groups' samples; horizontal axis in z units from -3 to 3, with the middle 95% range marked]
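
The idea of taking sample after sample can be imitated with a short simulation. Everything numeric below is hypothetical and chosen only for illustration: two normal populations with means 2.9 and 3.1, a common standard deviation of 0.5, samples of 50 from each, and 5,000 repetitions. The random seed is fixed so the run is repeatable.

```python
import random
import statistics

random.seed(42)  # fixed seed so the simulation is repeatable

# Hypothetical populations, for illustration only.
MU1, MU2, SIGMA, N, REPS = 2.9, 3.1, 0.5, 50, 5000

def sample_mean(mu, sigma, n):
    """Mean of one random sample of size n from a normal population."""
    return statistics.fmean(random.gauss(mu, sigma) for _ in range(n))

# Record D-bar = Y-bar2 - Y-bar1 for each pair of samples.
d_bars = [sample_mean(MU2, SIGMA, N) - sample_mean(MU1, SIGMA, N)
          for _ in range(REPS)]

center = statistics.fmean(d_bars)   # should land near MU2 - MU1
spread = statistics.stdev(d_bars)   # should land near the standard error of D-bar
print(f"center of D-bar values: {center:.3f} (population difference {MU2 - MU1:.3f})")
print(f"s.d. of D-bar values:   {spread:.3f}")
```

The standard deviation of the recorded D-bar values is what the slides call the standard error of D-bar.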
15
Comparing Two Groups
  • So the rules and techniques we learned for means
    apply to the differences in group means.
  • One creates sampling distributions to create
    confidence intervals and do significance tests in
    the same ways.
  • However, the standard error of D-bar has to be
    calculated slightly differently.
  • For Means:
  • s.e. (s.d. of the sampling distribution) =
    √( (s1)^2/n1 + (s2)^2/n2 )
  • (not assuming equal variances)
  • For Proportions:
  • s.e. = √( π-hat1 (1 - π-hat1)/n1 + π-hat2 (1 - π-hat2)/n2 )

df less than n1 + n2 - 2
df = n1 + n2 - 2
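
The two standard error formulas on this slide can be written as small Python helpers. This is a sketch; the function names are made up here, and the inputs in the example calls are hypothetical values chosen only to show the calls.

```python
import math

def se_diff_means(s1, n1, s2, n2):
    """Standard error of Y-bar2 - Y-bar1, not assuming equal variances."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of the difference between two sample proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical inputs, for illustration only.
print(f"means:       {se_diff_means(s1=10.0, n1=100, s2=12.0, n2=144):.3f}")
print(f"proportions: {se_diff_proportions(p1=0.40, n1=200, p2=0.50, n2=200):.4f}")
```

Both helpers add the two groups' sampling variances before taking the square root, which is why the standard error of a difference is larger than either group's standard error on its own.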
16
Comparing Two Groups
  • When variances are assumed to be equal, and
    sample sizes differ, we use the pooled estimate
    of variance for the standard error.

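Estimated standard error, pooled: start with a
pooled variance,
pooled variance = [ (n1 - 1)(s1)^2 + (n2 - 1)(s2)^2 ] / (n1 + n2 - 2).
Then s.e. = √( pooled variance * (1/n1 + 1/n2) ),
with df = n1 + n2 - 2.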
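
A pooled standard error can be sketched as follows. The code uses the standard textbook pooled-variance formula, which is assumed here to be what the slide showed; the function name `pooled_se` and the sample values in the example call are hypothetical.

```python
import math

def pooled_se(s1, n1, s2, n2):
    """Return (standard error, df) for a difference in means, equal variances assumed."""
    df = n1 + n2 - 2
    # Weight each sample variance by its own degrees of freedom.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return se, df

# Hypothetical samples of different sizes, for illustration only.
se, df = pooled_se(s1=0.5, n1=30, s2=0.4, n2=70)
print(f"pooled s.e. = {se:.4f}, df = {df}")
```

When the two samples are the same size, the pooled and unpooled standard errors coincide, so pooling only changes the answer when sample sizes differ.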
17
Comparing Two Groups
  • Calculating a Confidence Interval for the
    Difference between Two Groups' Means
  • By slapping the sampling distribution for the
    difference over our sample's difference between
    groups, D-bar, we can find the values between
    which the population difference is likely to be.
  • 95% C.I. = D-bar +/- 1.96 (s.e.)
  • = (Y-bar2 - Y-bar1) +/- 1.96 (s.e.)
  • Or = (π-hat2 - π-hat1) +/- 1.96 (s.e.)
  • 99% C.I. = D-bar +/- 2.58 (s.e.)
  • = (Y-bar2 - Y-bar1) +/- 2.58 (s.e.)
  • Or = (π-hat2 - π-hat1) +/- 2.58 (s.e.)
  • Remember: When sample sizes are small, t ≠ z, and
    +/- 1.96 may not be appropriate.

18
Comparing Two Groups
  • EXAMPLE
  • We want to know, with 95% confidence, what the
    likely difference is between male and female GPAs
    in a population of college students.
  • Sample: 50 men, average GPA = 2.9, s.d. = 0.5
  • 50 women, average GPA = 3.1, s.d. = 0.4
  • 95% C.I. = (Y-bar2 - Y-bar1) +/- 1.96 * s.e.
  • Find the standard error of the sampling
    distribution:
  • s.e. = √( (.5)^2/50 + (.4)^2/50 ) ≈ √(.005 + .003)
    = √.008 ≈ 0.089
  • Build the width of the interval. 95% corresponds
    with a z or t of +/- 1.96.
  • +/- z * s.e. = +/- 1.96 * 0.089 = +/- 0.174
  • Insert the mean difference to build the interval:
  • 95% C.I. = (Y-bar2 - Y-bar1) +/- 1.96 * s.e. =
    3.1 - 2.9 +/- 0.174 = 0.2 +/- 0.174
  • The interval: 0.026 to 0.374
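
Here is the same GPA interval computed in Python without rounding the intermediate terms. The slide rounds the two variance terms to .005 and .003 before adding them, so it reports a standard error of 0.089; carried at full precision the standard error is closer to 0.091 and the interval is slightly wider (roughly 0.02 to 0.38). The variable names are made up for this sketch.

```python
import math

# Sample values from the slide.
men_mean, men_sd, men_n = 2.9, 0.5, 50
women_mean, women_sd, women_n = 3.1, 0.4, 50

d_bar = women_mean - men_mean                                # Y-bar2 - Y-bar1
se = math.sqrt(men_sd**2 / men_n + women_sd**2 / women_n)    # unrounded s.e.
margin = 1.96 * se                                           # half-width for 95%
low, high = d_bar - margin, d_bar + margin

print(f"s.e. = {se:.4f}")
print(f"95% C.I.: {low:.3f} to {high:.3f}")
print("interval excludes 0" if low > 0 or high < 0 else "interval includes 0")
```

Either way the interval stays above zero, so the conclusion drawn from the slide's numbers does not change.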

19
Comparing Two Groups
  • Conducting a Test of Significance for the
    Difference between Two Groups' Means
  • By slapping the sampling distribution for the
    difference over a guess of the difference between
    groups, Ho, we can find out whether our sample
    could have been drawn from a population where the
    difference is equal to our guess.
  • Two-tailed significance test for α-level = .05
  • Critical z or t = +/- 1.96
  • To find if there is a difference in the
    population:
  • Ho: µ2 - µ1 = 0
  • Ha: µ2 - µ1 ≠ 0
  • Collect Data
  • Calculate z or t: z or t =
    [ (Y-bar2 - Y-bar1) - (µ2 - µ1) ] / s.e.
  • Make decision about the null hypothesis (reject
    or fail to reject)
  • Report P-value

20
Comparing Two Groups
  • EXAMPLE
  • We want to know whether there is a difference in
    male and female GPAs in a population of college
    students.
  • Two-tailed significance test for α-level = .05
  • Critical z or t = +/- 1.96
  • To find if there is a difference in the
    population:
  • Ho: µ2 - µ1 = 0
  • Ha: µ2 - µ1 ≠ 0
  • Collect Data
  • Sample: 50 men, average GPA = 2.9, s.d. = 0.5
  • 50 women, average GPA = 3.1, s.d. = 0.4
  • s.e. = √( (.5)^2/50 + (.4)^2/50 ) ≈ √(.005 + .003)
    = √.008 ≈ 0.089
  • Calculate z or t: z or t = [ (3.1 - 2.9) - 0 ] / 0.089
    = 0.2 / 0.089 = 2.25
  • Make decision about the null hypothesis: Reject
    the null. There is enough difference between
    groups in our sample to say that there is a
    difference in the population. 2.25 > 1.96
  • Find P-value: p (or sig.) = .0122 * 2 (table
    gives one-tail only) = .0244
  • If there were no difference between men and women
    in the population, there would be only a 2.4%
    chance of drawing a sample with a difference at
    least this large. That chance is low enough to
    reject the null, for sure!
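
The GPA test can be sketched in Python as well. The function name `two_sample_z_test` is hypothetical, and the sketch assumes the normal (z) approximation with the unpooled standard error. Because it does not round the variance terms the way the slide does, it gives a z of about 2.21 and a two-tailed P-value of about .027 rather than 2.25 and .0244; the decision to reject the null is the same.

```python
import math

def two_sample_z_test(mean1, s1, n1, mean2, s2, n2, null_diff=0.0):
    """Return (z, two-tailed P-value) for Ho: mu2 - mu1 = null_diff."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)       # unpooled standard error
    z = ((mean2 - mean1) - null_diff) / se        # test statistic
    p_value = math.erfc(abs(z) / math.sqrt(2))    # 2 * P(Z > |z|)
    return z, p_value

# Group 1 = men, group 2 = women, using the sample values from the slide.
z, p = two_sample_z_test(mean1=2.9, s1=0.5, n1=50, mean2=3.1, s2=0.4, n2=50)
print(f"z = {z:.2f}, two-tailed P = {p:.4f}")
print("reject Ho" if abs(z) > 1.96 else "fail to reject Ho")
```

Computing the two-tailed area directly avoids the doubling step needed when a table gives only one tail.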

21
Comparing Two Groups
  • The steps outlined above for
  • Confidence intervals
  • And
  • Significance tests
  • for differences in means are the same ones you
    would use for differences in proportions.
  • Just note the difference in calculation of the
    standard error for the difference.
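
As an illustration of the same steps applied to proportions, here is a small Python sketch. The data are hypothetical (40% of 200 people in one group and 50% of 200 in the other), the function name is made up, and the sketch uses the proportions standard error given earlier for both the interval and the test, which is an assumption; some textbooks use a pooled proportion for the test instead.

```python
import math

def proportion_diff_summary(p1, n1, p2, n2, z_crit=1.96):
    """Return (difference, s.e., confidence interval, z, two-tailed P-value)."""
    diff = p2 - p1
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (diff - z_crit * se, diff + z_crit * se)   # 95% interval by default
    z = diff / se                                   # test of Ho: no difference
    p_value = math.erfc(abs(z) / math.sqrt(2))      # 2 * P(Z > |z|)
    return diff, se, ci, z, p_value

# Hypothetical sample proportions, for illustration only.
diff, se, ci, z, p_value = proportion_diff_summary(p1=0.40, n1=200, p2=0.50, n2=200)
print(f"difference = {diff:.2f}, s.e. = {se:.4f}")
print(f"95% C.I.: {ci[0]:.3f} to {ci[1]:.3f}")
print(f"z = {z:.2f}, two-tailed P = {p_value:.4f}")
```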

22
Comparing Two Groups
  • Now let's do an example with SPSS, using the
    General Social Survey.