Describing Relationships: Regression, Prediction, and Causation - PowerPoint PPT Presentation

Provided by: jamesemays
Transcript and Presenter's Notes

1
Chapter 15
  • Describing Relationships: Regression,
    Prediction, and Causation

2
Thought Question 1
Suppose you were to make a scatterplot of (adult)
sons' heights versus fathers' heights, by
collecting data on both from several of your male
friends. You would now like to predict how tall
your nephew will be when he grows up, on the
basis of his father's height. Could you use your
scatterplot to help you make this prediction?
Explain.
3
Thought Question 2
A strong positive correlation has been found in a
certain city in the northeastern United States
between weekly sales of hot chocolate and weekly
sales of facial tissues. Would you interpret
that to mean that hot chocolate causes people to
need facial tissues? Explain.
4
Thought Question 3
Researchers have shown that there is a positive
correlation between the average fat intake and
the breast cancer rate across countries. In
other words, countries with higher fat intake
tend to have higher breast cancer rates. Does
this correlation provide evidence that dietary
fat is a contributing cause of breast cancer?
Explain.
5
Thought Question 4
If you were to draw a scatterplot of number of
women in the work force versus number of Christmas
trees sold in the United States for each year
between 1930 and the present, you would find a
very strong positive correlation. Why do you
think this would be true? Does one cause the
other?
6
Linear Regression
  • Objective: To quantify the linear relationship
    between an explanatory variable and a response
    variable. We can then predict the average
    response for all subjects with a given value of
    the explanatory variable.
  • Regression equation: y = a + bx
  • x is the value of the explanatory variable
  • y is the average value of the response variable
  • note that a and b are just the intercept and
    slope of a straight line
  • note that r and b are not the same thing, but
    their signs will agree

Plot
7
Least Squares
  • Used to determine the best line
  • We want the line to be as close as possible to
    the data points in the vertical (y) direction
    (since that is what we are trying to predict)
  • Least Squares: use the line that minimizes the
    sum of the squares of the vertical distances of
    the data points from the line
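
The least-squares idea above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's own code: the data values and the function names `least_squares` and `sum_squared_residuals` are made up for the example. It uses the standard closed-form solution for the slope b and intercept a of y = a + bx.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line y = a + b*x that minimizes the
    sum of squared vertical distances from the points to the line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical distances of the points from y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)

# Any other line should do no better on the sum of squares.
nudged = sum_squared_residuals(xs, ys, a + 0.1, b - 0.05)
print(f"a = {a:.2f}, b = {b:.2f}, SSR = {best:.2f}, nudged SSR = {nudged:.2f}")
```

Nudging the intercept or slope away from the fitted values makes the sum of squares larger, which is what "best line" means here.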

8
Prediction via Regression Line: Husband and Wife
Ages
Hand et al., A Handbook of Small Data Sets,
London: Chapman and Hall
  • The regression equation is y = 3.6 + 0.97x
  • y is the average age of all husbands who have
    wives of age x
  • For all women aged 30, we predict the average
    husband age to be 32.7 years
  • 3.6 + (0.97)(30) = 32.7 years
  • Suppose we know that an individual wife's age is
    30. What would we predict her husband's age to
    be?
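
As a small sketch of the arithmetic on this slide, the regression equation can be written as a function. The function name `predict_husband_age` is hypothetical; the coefficients 3.6 and 0.97 are the ones given above.

```python
def predict_husband_age(wife_age, a=3.6, b=0.97):
    """Predicted average husband age for wives of the given age,
    using the slide's regression equation y = a + b*x."""
    return a + b * wife_age


prediction = predict_husband_age(30)
print(round(prediction, 1))  # 32.7
```

The same line gives the same point prediction for an individual wife aged 30. The difference is in how much to trust it: the line describes an average, and individual husbands will typically vary around that average.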

9
Coefficient of Determination (R^2)
  • Measures usefulness of regression prediction
  • R^2 (or r^2, the square of the correlation)
    measures how much variation in the values of the
    response variable (y) is explained by the
    regression line
  • r = ±1, R^2 = 1: regression line explains all (100%)
    of the variation in y
  • r = ±0.7, R^2 = 0.49: regression line explains almost
    half (50%) of the variation in y
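
A short sketch can make the "variation explained" reading concrete. Assuming a least-squares line and made-up data (the values and function names below are illustrative, not from the slides), the square of the correlation r equals one minus the ratio of leftover variation to total variation in y.

```python
import math


def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def explained_fraction(xs, ys):
    """1 - (leftover variation) / (total variation) for the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    leftover = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - leftover / total


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
r_squared = r ** 2
explained = explained_fraction(xs, ys)
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}, explained = {explained:.3f}")
```

Both routes give the same number for this data, which is why R^2 can be read directly as the fraction of variation in y explained by the line.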

10
A Caution: Beware of Extrapolation
  • Sarah's height was plotted against her age
  • Can you predict her height at age 42 months?
  • Can you predict her height at age 30 years (360
    months)?

11
A Caution: Beware of Extrapolation
  • Regression line: y = 71.95 + 0.383x
  • height at age 42 months? y = 88 cm
  • height at age 30 years? y = 209.8 cm
  • She is predicted to be 6' 10.5" at age 30.
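
The extrapolation problem can be reproduced with a small sketch. It assumes x is age in months and y is height in centimeters, which is consistent with the feet-and-inches figure on the slide but is not stated there explicitly. The function names are hypothetical.

```python
def predicted_height_cm(age_months, a=71.95, b=0.383):
    """Height predicted by the slide's line y = a + b*x.
    Assumes x is age in months and y is height in centimeters."""
    return a + b * age_months


def cm_to_feet_inches(cm):
    """Convert centimeters to (whole feet, remaining inches)."""
    total_inches = cm / 2.54
    feet = int(total_inches // 12)
    inches = total_inches - 12 * feet
    return feet, inches


height_42_months = predicted_height_cm(42)   # age near the plotted data
height_30_years = predicted_height_cm(360)   # far beyond the plotted data
feet, inches = cm_to_feet_inches(height_30_years)

print(f"42 months: {height_42_months:.1f} cm")
print(f"30 years:  {height_30_years:.1f} cm, about {feet} ft {inches:.1f} in")
```

The line is only a description of growth over the ages that were actually observed. Applying it at 360 months assumes the same steady growth continues for decades, which is why the second prediction is implausible.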

12
Correlation Does Not Imply Causation
  • Even very strong correlations may not correspond
    to a real causal relationship.

13
Evidence of Causation
  • A properly conducted experiment establishes the
    connection
  • Other considerations
  • A reasonable explanation for a cause and effect
    exists
  • The connection happens in repeated trials
  • The connection happens under varying conditions
  • Potential confounding factors are ruled out
  • Alleged cause precedes the effect in time

14
Reasons Two Variables May Be Related (Correlated)
  • Explanatory variable causes change in response
    variable
  • Response variable causes change in explanatory
    variable
  • Explanatory variable may be a contributing cause,
    but is not the sole cause of changes in the
    response variable
  • Confounding variables may exist
  • Both variables may result from a common cause
  • such as, both variables changing over time
  • The correlation may be merely a coincidence

15
Explanatory causes Response
  • Explanatory: pollen count from grasses
  • Response: percentage of people suffering from
    allergy symptoms
  • Explanatory: amount of food eaten
  • Response: hunger level

16
Response causes Explanatory
  • Explanatory: Divorce among men
  • Response: Percent abusing alcohol
  • Conclusion was that getting divorced caused
    alcohol abuse in men.
  • Could it be that alcohol abuse
    caused divorce?

17
Explanatory is not Sole Contributor
  • Explanatory: Possession of gun in home
  • Response: Occurrence of a homicide
  • tendency toward violence may be
    another contributor

18
Confounding Variables
Case Study: Meditation vs. Aging
  • Explanatory: Meditation
  • Response: Aging (measurable aging factor)
  • general concern for one's well-being
    may be confounded with the
    decision to try meditation

19
Common Response (both variables change due to
common cause)
  • Explanatory: Divorce among men
  • Response: Percent abusing alcohol
  • Both may result from an unhappy
    marriage.

20
Both Variables are Changing Over Time
  • Both divorces and suicides have increased
    dramatically since 1900.
  • Are divorces causing suicides?
  • Are suicides causing divorces???
  • The population has increased dramatically since
    1900 (causing both to increase).
  • Better to investigate: Has the rate of divorce
    or the rate of suicide changed over time?
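
The difference between counts and rates can be shown with a tiny sketch. The numbers below are hypothetical, chosen only to make the arithmetic easy; they are not real divorce or suicide statistics, and the function name `rate_per_100k` is made up for the example.

```python
def rate_per_100k(count, population):
    """Number of events per 100,000 people."""
    return 100_000 * count / population


# Hypothetical numbers, chosen only for illustration (not real statistics).
early = {"population": 1_000_000, "events": 200}
late = {"population": 4_000_000, "events": 800}

# The raw count grows fourfold...
count_ratio = late["events"] / early["events"]

# ...but the rate per 100,000 people does not change at all.
rate_early = rate_per_100k(early["events"], early["population"])
rate_late = rate_per_100k(late["events"], late["population"])

print(f"count ratio: {count_ratio}, rates: {rate_early} vs {rate_late}")
```

Two raw counts that both grow with the population will be strongly correlated with each other even when neither rate has moved, which is why rates are the better quantity to compare.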

21
The Relationship May Be Just a Coincidence
  • We will see some strong correlations (or
    apparent associations) just by chance, even when
    the variables are not related in the population
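
A quick simulation illustrates the point. This is a sketch under stated assumptions: every pair of variables is drawn independently from a standard normal distribution, so no pair is truly related, and the sample size and number of pairs are arbitrary choices. The function names are hypothetical.

```python
import math
import random


def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def strongest_chance_correlation(n_points, n_pairs, seed=0):
    """Largest |r| found among n_pairs pairs of unrelated random variables,
    each observed n_points times."""
    rng = random.Random(seed)
    strongest = 0.0
    for _ in range(n_pairs):
        xs = [rng.gauss(0, 1) for _ in range(n_points)]
        ys = [rng.gauss(0, 1) for _ in range(n_points)]
        strongest = max(strongest, abs(correlation(xs, ys)))
    return strongest


# 1000 unrelated pairs, each with only 5 observations.
strongest = strongest_chance_correlation(n_points=5, n_pairs=1000)
print(f"strongest correlation found by chance: {strongest:.3f}")
```

With small samples and many pairs to choose from, at least one pair will look strongly correlated even though every pair was generated independently.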

22
Coincidence (?)
Case Study: Vaccines and Brain Damage
  • A required whooping cough vaccine was blamed for
    seizures that caused brain damage
  • led to reduced production of vaccine (due to
    lawsuits)
  • Study of 38,000 children found no evidence for
    the accusations (reported in New York Times)
  • people confused association with
    cause-and-effect
  • virtually every kid received the vaccine; it was
    inevitable that, by chance, brain damage caused
    by other factors would occasionally occur in a
    recently vaccinated child

23
Case Study:
Social Relationships and Health
House, J., Landis, K., and Umberson, D. "Social
Relationships and Health," Science, Vol. 241
(1988), pp. 540-545.
  • Does lack of social relationships cause people to
    become ill?
  • Or, are unhealthy people less likely to establish
    and maintain social relationships?
  • Or, is there some other factor that predisposes
    people both to have lower social activity and
    become ill?

24
Key Concepts
  • Least Squares Regression Equation
  • R^2
  • Correlation does not imply causation
  • Confirming causation
  • Reasons variables may be correlated