Chapter 9 Comparing Two Groups - PowerPoint PPT Presentation

1
Chapter 9: Comparing Two Groups
  • Learn:
  • How to Compare Two Groups On a Categorical or
    Quantitative Outcome Using Confidence Intervals
    and Significance Tests

2
Bivariate Analyses
  • The outcome variable is the response variable
  • The binary variable that specifies the groups is
    the explanatory variable

3
Bivariate Analyses
  • Statistical methods analyze how the outcome on
    the response variable depends on or is explained
    by the value of the explanatory variable

4
Independent Samples
  • The observations in one sample are independent of
    those in the other sample
  • Example: Randomized experiments that randomly
    allocate subjects to two treatments
  • Example: An observational study that separates
    subjects into groups according to their value for
    an explanatory variable

5
Dependent Samples
  • Data are matched pairs: each subject in one
    sample is matched with a subject in the other
    sample
  • Example: A set of married couples, the men being
    in one sample and the women in the other
  • Example: Each subject is observed at two times,
    so the two samples have the same people

6
Section 9.1
  • Categorical Response: How Can We Compare Two
    Proportions?

7
Categorical Response Variable
  • Inferences compare groups in terms of their
    population proportions in a particular category
  • We can compare the groups by the difference in
    their population proportions
  • (p1 - p2)

8
Example: Aspirin, the Wonder Drug
  • Recent Titles of Newspaper Articles
  • Aspirin cuts deaths after heart attack
  • Aspirin could lower risk of ovarian cancer
  • New study finds a daily aspirin lowers the risk
    of colon cancer
  • Aspirin may lower the risk of Hodgkin's

9
Example: Aspirin, the Wonder Drug
  • The Physicians' Health Study Research Group at
    Harvard Medical School
  • Five-year randomized study
  • Does regular aspirin intake reduce deaths from
    heart disease?

10
Example: Aspirin, the Wonder Drug
  • Experiment
  • Subjects were 22,071 male physicians
  • Every other day, study participants took either
    an aspirin or a placebo
  • The physicians were randomly assigned to the
    aspirin or to the placebo group
  • The study was double-blind: the physicians did
    not know which pill they were taking, nor did
    those who evaluated the results

11
Example: Aspirin, the Wonder Drug
  • Results displayed in a contingency table

12
Example: Aspirin, the Wonder Drug
  • What is the response variable?
  • What are the groups to compare?

13
Example: Aspirin, the Wonder Drug
  • The response variable is whether the subject had
    a heart attack, with categories yes or no
  • The groups to compare are:
  • Group 1: Physicians who took a placebo
  • Group 2: Physicians who took aspirin

14
Example: Aspirin, the Wonder Drug
  • Estimate the difference between the two
    population parameters of interest

15
Example: Aspirin, the Wonder Drug
  • p1 = the proportion of the population who would
    have a heart attack if they participated in this
    experiment and took the placebo
  • p2 = the proportion of the population who would
    have a heart attack if they participated in this
    experiment and took the aspirin

16
Example: Aspirin, the Wonder Drug
  • Sample Statistics

17
Example: Aspirin, the Wonder Drug
  • To make an inference about the difference of
    population proportions, (p1 - p2), we need to
    learn about the variability of the sampling
    distribution of $(\hat{p}_1 - \hat{p}_2)$

18
Standard Error for Comparing Two Proportions
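  • The difference, $(\hat{p}_1 - \hat{p}_2)$, is obtained from
    sample data
  • It will vary from sample to sample
  • This variation is measured by the standard error
    of the sampling distribution of $(\hat{p}_1 - \hat{p}_2)$:
    $se = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$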

19
Confidence Interval for the Difference between
Two Population Proportions
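  • $(\hat{p}_1 - \hat{p}_2) \pm z\,(se)$, where
    $se = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
  • The z-score depends on the confidence level
    (z = 1.96 for 95% confidence)
  • This method requires:
  • Independent random samples for the two groups
  • Large enough sample sizes so that there are at
    least 10 successes and at least 10 failures
    in each group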

20
Confidence Interval Comparing Heart Attack Rates
for Aspirin and Placebo
  • 95% CI for (p1 - p2): (0.005, 0.011)
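
The arithmetic behind this interval can be sketched in Python. The heart attack counts used here (189 of 11,034 on placebo, 104 of 11,037 on aspirin) do not appear in this transcript; they are the figures commonly reported for the Physicians' Health Study and should be read as an assumption. The function name `diff_prop_ci` is purely illustrative.

```python
from math import sqrt
from statistics import NormalDist


def diff_prop_ci(x1, n1, x2, n2, level=0.95):
    """Large-sample CI for (p1 - p2) from two independent samples.

    x1, x2 are the counts of "successes"; n1, n2 are the sample sizes.
    """
    # Sample-size guideline: at least 10 successes and 10 failures per group.
    for x, n in ((x1, n1), (x2, n2)):
        if x < 10 or n - x < 10:
            raise ValueError("need at least 10 successes and 10 failures per group")
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


# Assumed counts (not in the transcript): group 1 = placebo, group 2 = aspirin.
lower, upper = diff_prop_ci(189, 11034, 104, 11037)
print(f"95% CI for (p1 - p2): ({lower:.3f}, {upper:.3f})")
```

With these assumed counts the endpoints round to the interval (0.005, 0.011) quoted in the slides.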

21
Confidence Interval Comparing Heart Attack Rates
for Aspirin and Placebo
  • Since both endpoints of the confidence interval
    (0.005, 0.011) for (p1 - p2) are positive, we
    infer that (p1 - p2) is positive
  • Conclusion: The population proportion of heart
    attacks is larger when subjects take the placebo
    than when they take aspirin

22
Confidence Interval Comparing Heart Attack Rates
for Aspirin and Placebo
  • The estimated population difference, between
    0.005 and 0.011, is small
  • Even though it is a small difference, it may be
    important in public health terms
  • For example, a decrease of 0.01 over a 5-year
    period in the proportion of people suffering
    heart attacks would mean 2 million fewer people
    having heart attacks in a population of 200
    million

23
Confidence Interval Comparing Heart Attack Rates
for Aspirin and Placebo
  • The study used male doctors in the U.S.
  • The inference applies to the U.S. population of
    male doctors
  • Before concluding that aspirin benefits a larger
    population, we'd want to see results of studies
    with more diverse groups

24
Interpreting a Confidence Interval for a
Difference of Proportions
  • Check whether 0 falls in the CI
  • If so, it is plausible that the population
    proportions are equal
  • If all values in the CI for (p1 - p2) are
    positive, you can infer that (p1 - p2) > 0
  • If all values in the CI for (p1 - p2) are
    negative, you can infer that (p1 - p2) < 0
  • Which group is labeled 1 and which is labeled
    2 is arbitrary
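
These interpretation rules are mechanical enough to write down as a small function. The sketch below is only an illustration; the name `interpret_diff_ci` and the wording of the returned strings are assumptions, not part of the slides.

```python
def interpret_diff_ci(lower, upper):
    """Say what a confidence interval for (p1 - p2) suggests."""
    if lower > upper:
        raise ValueError("lower endpoint must not exceed upper endpoint")
    if lower > 0:
        return "p1 > p2"               # every plausible value is positive
    if upper < 0:
        return "p1 < p2"               # every plausible value is negative
    return "p1 = p2 is plausible"      # 0 falls in the interval


aspirin = interpret_diff_ci(0.005, 0.011)     # interval from the aspirin example
swapped = interpret_diff_ci(-0.011, -0.005)   # same interval with group labels swapped
straddle = interpret_diff_ci(-0.02, 0.03)     # hypothetical interval containing 0
print(aspirin, swapped, straddle, sep=" | ")
```

Swapping the group labels flips the sign of both endpoints but not the substantive conclusion, which is why the labeling is arbitrary.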

25
Interpreting a Confidence Interval for a
Difference of Proportions
  • The magnitude of values in the confidence
    interval tells you how large any true difference
    is
  • If all values in the confidence interval are near
    0, the true difference may be relatively small in
    practical terms

26
Significance Tests Comparing Population
Proportions
  • 1. Assumptions
  • Categorical response variable for two groups
  • Independent random samples

27
Significance Tests Comparing Population
Proportions
  • Assumptions (continued)
  • Significance tests comparing proportions use the
    sample size guideline from confidence intervals:
    each sample should have at least about 10
    successes and 10 failures
  • Two-sided tests are robust against violations of
    this condition
  • For two-sided tests, at least 5 successes and 5
    failures in each sample is adequate

28
Significance Tests Comparing Population
Proportions
  • 2. Hypotheses
  • The null hypothesis is the hypothesis of no
    difference or no effect
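  • H0: (p1 - p2) = 0
  • Under the presumption that p1 = p2, we create a
    pooled estimate of the common value of p1 and p2
  • This pooled estimate is the proportion of
    successes in the two samples combined:
    $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$, where $x_1$ and $x_2$
    are the numbers of successes in the two samples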

29
Significance Tests Comparing Population
Proportions
  • 2. Hypotheses (continued)
  • Ha: (p1 - p2) ≠ 0 (two-sided test)
  • Ha: (p1 - p2) < 0 (one-sided test)
  • Ha: (p1 - p2) > 0 (one-sided test)

30
Significance Tests Comparing Population
Proportions
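  • 3. The test statistic is
    $z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{se_0}$, where
    $se_0 = \sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
    and $\hat{p}$ is the pooled estimate of the common proportion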

31
Significance Tests Comparing Population
Proportions
  • 4. P-value: Probability obtained from the
    standard normal table
  • 5. Conclusion: Smaller P-values give stronger
    evidence against H0 and in support of Ha
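
The five steps can be sketched as one short Python function. It uses the pooled estimate for the standard error under H0 and the standard normal distribution for the P-value. The function name `two_prop_z_test` and the `alternative` labels are illustrative assumptions, and the counts in the usage lines are the commonly reported Physicians' Health Study figures, which are not in this transcript.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """z test of H0: p1 = p2 for independent samples; returns (z, p_value)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)              # pooled estimate under H0
    se0 = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se0
    std_normal = NormalDist()
    if alternative == "two-sided":                # Ha: p1 - p2 != 0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif alternative == "greater":                # Ha: p1 - p2 > 0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":                   # Ha: p1 - p2 < 0
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p_value


# Assumed counts (not in the transcript): placebo 189/11034, aspirin 104/11037.
z, p_value = two_prop_z_test(189, 11034, 104, 11037)
print(f"z = {z:.2f}, two-sided P-value = {p_value:.1e}")
```

A z statistic this far from 0 gives a very small P-value, which agrees with the earlier confidence interval excluding 0.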

32
Example: Is TV Watching Associated with
Aggressive Behavior?
  • Various studies have examined a link between TV
    violence and aggressive behavior by those who
    watch a lot of TV
  • A study sampled 707 families in two counties in
    New York state and made follow-up observations
    over 17 years
  • The data show levels of TV watching along with
    incidents of aggressive acts

33
Example: Is TV Watching Associated with
Aggressive Behavior?

34
Example: Is TV Watching Associated with
Aggressive Behavior?
  • Test the Hypotheses
  • H0: (p1 - p2) = 0
  • Ha: (p1 - p2) ≠ 0
  • Using a significance level of 0.05
  • Group 1: less than 1 hr. of TV per day
  • Group 2: at least 1 hr. of TV per day

35
Example: Is TV Watching Associated with
Aggressive Behavior?

36
Example: Is TV Watching Associated with
Aggressive Behavior?
  • Conclusion: Since the P-value is less than 0.05,
    we reject H0
  • We conclude that the population proportions of
    aggressive acts differ for the two groups
  • The sample values suggest that the population
    proportion is higher for the higher level of TV
    watching

37
Section 9.2
  • Quantitative Response: How Can We Compare Two
    Means?

38
Comparing Means
  • We can compare two groups on a quantitative
    response variable by comparing their means

39
Example: Teenagers Hooked on Nicotine
  • A 30-month study
  • Evaluated the degree of addiction that teenagers
    form to nicotine
  • 332 students who had used nicotine were evaluated
  • The response variable was constructed using a
    questionnaire called the Hooked on Nicotine
    Checklist (HONC)

40
Example: Teenagers Hooked on Nicotine
  • The HONC score is the total number of questions
    to which a student answered yes during the
    study
  • The higher the score, the more hooked on nicotine
    a student is judged to be

41
Example: Teenagers Hooked on Nicotine
  • The study considered explanatory variables, such
    as gender, that might be associated with the HONC
    score

42
Example: Teenagers Hooked on Nicotine
  • How can we compare the sample HONC scores for
    females and males?
  • We estimate (µ1 - µ2) by $(\bar{x}_1 - \bar{x}_2)$
  • 2.8 - 1.6 = 1.2
  • On average, females answered yes to about one
    more question on the HONC scale than males did

43
Example: Teenagers Hooked on Nicotine
  • To make an inference about the difference between
    population means, (µ1 - µ2), we need to learn
    about the variability of the sampling
    distribution of $(\bar{x}_1 - \bar{x}_2)$

44
Standard Error for Comparing Two Means
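  • The difference, $(\bar{x}_1 - \bar{x}_2)$, is obtained from
    sample data. It will vary from sample to sample.
  • This variation is measured by the standard error
    of the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$:
    $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$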

45
Confidence Interval for the Difference between
Two Population Means
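  • A 95% CI: $(\bar{x}_1 - \bar{x}_2) \pm t_{.025}\,(se)$, where
    $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
  • Software provides the t-score with right-tail
    probability of 0.025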

46
Confidence Interval for the Difference between
Two Population Means
  • This method assumes:
  • Independent random samples from the two groups
  • An approximately normal population distribution
    for each group
  • This is mainly important for small sample sizes,
    and even then the method is robust to violations
    of this assumption

47
Example: Nicotine - How Much More Addicted Are
Smokers than Ex-Smokers?
  • Data as summarized by HONC scores for the two
    groups
  • Smokers: $\bar{x}_1$ = 5.9, s1 = 3.3, n1 = 75
  • Ex-smokers: $\bar{x}_2$ = 1.0, s2 = 2.3, n2 = 257

48
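Example: Nicotine - How Much More Addicted Are
Smokers than Ex-Smokers?
  • Were the sample data for the two groups
    approximately normal?
  • Most likely not for Group 2, based on the sample
    statistics ($\bar{x}_2$ = 1.0, s2 = 2.3): HONC scores
    cannot be negative, yet 0 is less than one
    standard deviation below the mean, which
    suggests a right-skewed distribution
  • Since the sample sizes are large, this lack of
    normality is not a problem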

49
Example: Nicotine - How Much More Addicted Are
Smokers than Ex-Smokers?
  • 95% CI for (µ1 - µ2): (4.1, 5.7)
  • We can infer that the population mean for the
    smokers is between 4.1 and 5.7 higher than
    for the ex-smokers
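
The interval can be checked from the summary statistics given for the two groups. The sketch below makes one simplifying assumption: it uses the standard normal quantile (about 1.96) in place of the t-score that software would supply, which changes little for samples this large.

```python
from math import sqrt
from statistics import NormalDist

# HONC summary statistics from the slides.
x1_bar, s1, n1 = 5.9, 3.3, 75     # smokers
x2_bar, s2, n2 = 1.0, 2.3, 257    # ex-smokers

# Standard error of the difference between the two sample means.
se = sqrt(s1**2 / n1 + s2**2 / n2)

# Assumption: normal quantile used as a stand-in for the t-score.
critical = NormalDist().inv_cdf(0.975)

diff = x1_bar - x2_bar
lower, upper = diff - critical * se, diff + critical * se
print(f"se = {se:.3f}, 95% CI for (mu1 - mu2): ({lower:.1f}, {upper:.1f})")
```

Rounded to one decimal place, the endpoints match the interval of 4.1 to 5.7 stated above.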

50
How Can We Interpret a Confidence Interval for a
Difference of Means?
  • Check whether 0 falls in the interval
  • When it does, 0 is a plausible value for (µ1 -
    µ2), meaning that it is possible that µ1 = µ2
  • A confidence interval for (µ1 - µ2) that contains
    only positive numbers suggests that (µ1 - µ2) is
    positive
  • We then infer that µ1 is larger than µ2
51
How Can We Interpret a Confidence Interval for a
Difference of Means?
  • A confidence interval for (µ1 - µ2) that contains
    only negative numbers suggests that (µ1 - µ2) is
    negative
  • We then infer that µ1 is smaller than µ2
  • Which group is labeled 1 and which is labeled
    2 is arbitrary

52
Significance Tests Comparing Population Means
  • 1. Assumptions
  • Quantitative response variable for two groups
  • Independent random samples

53
Significance Tests Comparing Population Means
  • Assumptions (continued)
  • Approximately normal population distributions for
    each group
  • This is mainly important for small sample sizes,
    and even then the two-sided test is robust to
    violations of this assumption

54
Significance Tests Comparing Population Means
  • 2. Hypotheses
  • The null hypothesis is the hypothesis of no
    difference or no effect
  • H0: (µ1 - µ2) = 0

55
Significance Tests Comparing Population Means
  • 2. Hypotheses (continued)
  • The alternative hypothesis
  • Ha: (µ1 - µ2) ≠ 0 (two-sided test)
  • Ha: (µ1 - µ2) < 0 (one-sided test)
  • Ha: (µ1 - µ2) > 0 (one-sided test)

56
Significance Tests Comparing Population Means
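  • 3. The test statistic is
    $t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{se}$, where
    $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$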

57
Significance Tests Comparing Population Means
  • 4. P-value: Probability obtained from the
    t distribution
  • 5. Conclusion: Smaller P-values give stronger
    evidence against H0 and in support of Ha
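
The test statistic step can be sketched from summary statistics alone. The slides do not say which degrees-of-freedom formula they rely on; the Welch-Satterthwaite approximation below is the one software commonly uses when the standard deviations are not assumed equal, so treat it as an assumption. The function name `two_sample_t` is illustrative, and the smokers and ex-smokers figures are reused only as a worked input.

```python
from math import sqrt


def two_sample_t(x1_bar, s1, n1, x2_bar, s2, n2):
    """t statistic and approximate df for H0: mu1 = mu2 (independent samples)."""
    v1, v2 = s1**2 / n1, s2**2 / n2        # squared standard error of each mean
    se = sqrt(v1 + v2)
    t = (x1_bar - x2_bar) / se
    # Welch-Satterthwaite approximation to the degrees of freedom (assumed).
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# HONC summary statistics from the smokers vs. ex-smokers example.
t_stat, df = two_sample_t(5.9, 3.3, 75, 1.0, 2.3, 257)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

The P-value is then a tail probability for this t statistic at the computed df, which in practice comes from software or a t table.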

58
Example: Does Cell Phone Use While Driving
Impair Reaction Times?
  • Experiment
  • 64 college students
  • 32 were randomly assigned to the cell phone group
  • 32 to the control group

59
Example: Does Cell Phone Use While Driving
Impair Reaction Times?
  • Experiment (continued)
  • Students used a machine that simulated driving
    situations
  • At irregular periods a target flashed red or
    green
  • Participants were instructed to press a brake
    button as soon as possible when they detected a
    red light

60
Example: Does Cell Phone Use While Driving
Impair Reaction Times?
  • For each subject, the experiment analyzed their
    mean response time over all the trials
  • Averaged over all trials and subjects, the mean
    response time for the cell-phone group was 585.2
    milliseconds
  • The mean response time for the control group was
    533.7 milliseconds

61
Example: Does Cell Phone Use While Driving
Impair Reaction Times?
  • Data

62
Example: Does Cell Phone Use While Driving
Impair Reaction Times?
  • Test the hypotheses
  • H0: (µ1 - µ2) = 0
  • vs.
  • Ha: (µ1 - µ2) ≠ 0
  • using a significance level of 0.05

63
Example: Does Cell Phone Use While Driving
Impair Reaction Times?

64
Example: Does Cell Phone Use While Driving
Impair Reaction Times?
  • Conclusion
  • The P-value is less than 0.05, so we can reject
    H0
  • There is enough evidence to conclude that the
    population mean response times differ between the
    cell phone and control groups
  • The sample means suggest that the population mean
    is higher for the cell phone group

65
Example: Does Cell Phone Use While Driving
Impair Reaction Times?
  • What do the box plots tell us?
  • There is an extreme outlier for the cell phone
    group
  • It is a good idea to make sure the results of the
    analysis aren't affected too strongly by that
    single observation
  • Delete the extreme outlier and redo the analysis
  • In this example, the t-statistic changes only
    slightly

66
Example: Does Cell Phone Use While Driving
Impair Reaction Times?
  • Insight
  • In practice, you should not delete outliers from
    a data set without sufficient cause (for example,
    if it seems the observation was incorrectly
    recorded)
  • It is, however, a good idea to check for
    sensitivity of an analysis to an outlier
  • If the results change much, it means that the
    inference including the outlier is on shaky ground
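
A sensitivity check of this kind amounts to running the same analysis twice, once with and once without the suspect observation. The sketch below uses small invented samples, NOT the study's data, and the function name `t_from_samples` is an assumption for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def t_from_samples(sample1, sample2):
    """t statistic for H0: mu1 = mu2, not assuming equal standard deviations."""
    se = sqrt(stdev(sample1) ** 2 / len(sample1)
              + stdev(sample2) ** 2 / len(sample2))
    return (mean(sample1) - mean(sample2)) / se


# Hypothetical reaction times in milliseconds (invented for illustration).
cell_phone = [540, 565, 580, 590, 600, 610, 625, 960]   # 960 is an extreme value
control = [500, 515, 525, 530, 540, 545, 555, 570]

t_all = t_from_samples(cell_phone, control)

# Drop the largest cell phone observation and redo the analysis.
trimmed = sorted(cell_phone)[:-1]
t_trimmed = t_from_samples(trimmed, control)

print(f"t with outlier: {t_all:.2f}, t without outlier: {t_trimmed:.2f}")
```

In this invented data set the statistic shifts noticeably once the extreme value is dropped, which is the situation the slide warns about; in the study itself the slides report only a slight change.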

67
How much more time do women spend on housework than men? Data are hours per week.

Gender   Sample Size   Mean   St. Dev.
Women    6764          32.6   18.2
Men      4252          18.1   12.9

  • What is a point estimate of µ1 - µ2?
  • 18.2 - 12.9
  • 32.6 - 18.1
  • 6764 - 4252
  • 32.6/18.2 - 18.1/12.9

68
How much more time do women spend on housework than men? Data are hours per week.

Gender   Sample Size   Mean   St. Dev.
Women    6764          32.6   18.2
Men      4252          18.1   12.9

  • What is the standard error for comparing the
    means?
  • 5.3
  • .076
  • .297
  • .088
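
Both the point estimate and the standard error asked about here follow directly from the table. A minimal Python check, using only the numbers shown (the variable names are illustrative):

```python
from math import sqrt

# Summary statistics from the table (hours per week).
n_women, mean_women, sd_women = 6764, 32.6, 18.2
n_men, mean_men, sd_men = 4252, 18.1, 12.9

# Point estimate of (mu1 - mu2): difference between the sample means.
point_estimate = mean_women - mean_men

# Standard error for comparing the means: sqrt(s1^2/n1 + s2^2/n2).
se = sqrt(sd_women**2 / n_women + sd_men**2 / n_men)

print(f"point estimate = {point_estimate:.1f} hours, se = {se:.3f}")
```

The standard error is far smaller than either sample standard deviation because each squared standard deviation is divided by a sample size in the thousands.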

69
How much more time do women spend on housework than men? Data are hours per week.

Gender   Sample Size   Mean   St. Dev.
Women    6764          32.6   18.2
Men      4252          18.1   12.9

  • What factor causes the standard error to be
    small compared to the sample standard deviations
    for the two groups?
  • sample means
  • sample standard deviations
  • sample sizes
  • genders

70
Section 9.3
  • Other Ways of Comparing Means and Comparing
    Proportions

71
Alternative Method for Comparing Means
  • An alternative t method can be used when, under
    the null hypothesis, it is reasonable to expect
    the variability as well as the mean to be the
    same
  • This method requires the assumption that the
    population standard deviations be equal

72
The Pooled Standard Deviation
  • This alternative method estimates the common
    value σ of σ1 and σ2 by
    $s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

73
Comparing Population Means, Assuming Equal
Population Standard Deviations
  • Using the pooled standard deviation estimate s, a
    95% CI for (µ1 - µ2) is
    $(\bar{x}_1 - \bar{x}_2) \pm t_{.025}\, s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
  • This method has df = n1 + n2 - 2

74
Comparing Population Means, Assuming Equal
Population Standard Deviations
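  • The test statistic for H0: µ1 = µ2 is
    $t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$,
    where s is the pooled standard deviation estimate
  • This method has df = n1 + n2 - 2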

75
Comparing Population Means, Assuming Equal
Population Standard Deviations
  • These methods assume:
  • Independent random samples from the two groups
  • An approximately normal population distribution
    for each group
  • This is mainly important for small sample sizes,
    and even then, the CI and the two-sided test are
    usually robust to violations of this assumption
  • σ1 = σ2
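
As a rough sketch of the equal-standard-deviation method, the function below computes the pooled standard deviation, the t statistic, and df = n1 + n2 - 2 from summary statistics. The function name `pooled_t` and the input numbers are hypothetical, invented for illustration only.

```python
from math import sqrt


def pooled_t(x1_bar, s1, n1, x2_bar, s2, n2):
    """Pooled standard deviation, t statistic, and df, assuming sigma1 = sigma2."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances, weights (n - 1).
    s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    se = s_pooled * sqrt(1 / n1 + 1 / n2)
    t = (x1_bar - x2_bar) / se
    return s_pooled, t, df


# Hypothetical summary statistics (not from the slides).
s_pooled, t_stat, df = pooled_t(20.0, 4.0, 12, 16.0, 5.0, 15)
print(f"pooled s = {s_pooled:.3f}, t = {t_stat:.2f}, df = {df}")
```

The pooled value always falls between the two sample standard deviations, closer to the one from the larger sample.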