Extension - PowerPoint PPT Presentation

1 / 11
About This Presentation
Title:

Extension

Description:

The General Linear Model with Categorical Predictors Comparison to the t-test Furthermore, the previous gives the same results we would have gotten via a t-test, to ... – PowerPoint PPT presentation

Number of Views:27
Avg rating:3.0/5.0
Slides: 12
Provided by: Mike2163
Category:

less

Transcript and Presenter's Notes

Title: Extension


1
Extension
  • The General Linear Model with Categorical
    Predictors

2
Extension
  • Regression can actually handle different types of
    predictors, and in the social sciences we are
    often interested in differences between groups
  • For now we will concern ourselves with the two
    independent groups case
  • E.g. gender, republican vs. democrat etc.

3
Dummy coding
  • There are different ways to code categorical data
    for regression, and in general, to represent a
    categorical variable you need k-1 coded
    variables1
  • k number of categories/groups
  • Dummy coding involves using zeros and ones to
    identify group membership, and since we only have
    two groups, one group will be zero (the reference
    group) and the other 1

4
Dummy coding
  • Example
  • The thing to note at this point is that we have a
    simple bivariate correlation/simple regression
    setting
  • The correlation between group and the DV is .76
  • This is sometimes referred to as the point
    biserial correlation (rpb) because of the
    categorical variable
  • However, dont be fooled, it is calculated
    exactly the same way as the Pearson before i.e.
    you treat that 0,1 grouping variable like any
    other in calculating the correlation coefficient
  • However, the sign is arbitrary since either group
    could have been a one or zero, and so that needs
    to be noted

Group Outcome 0 3 0 5 0 7 0 2 0 3 1 6 1 7
1 7 1 8 1 9
5
Example
  • Graphical display
  • The R-square is .762 .577
  • The regression equation is

6
Example
  • Look closely at the descriptive output compared
    to the coefficients.
  • What do you see?

7
The constant
  • Note again our regression equation
  • Recall the definition for the slope and constant
  • First the constant, what does when X O mean
    here in this setting?
  • It means when we are in the O group
  • What is that predicted value?
  • Ypred 4 3.4(0) 4
  • That is the groups mean
  • The constant here is thus the reference groups
    mean

8
The coefficient
  • Now think about the slope
  • What does a 1 unit change in X mean in this
    setting?
  • It means we go from one group to the other
  • Based on that coefficient, what does the slope
    represent in this case (i.e. can you derive that
    coefficient from the descriptive stats in some
    way?)
  • The coefficient is the difference between means

9
The regression line
  • The regression line covers the values represented
  • i.e. 0, 1, for the two groups
  • It passes through each of their means
  • Using least squares regression the regression
    line always passes through the mean of X and Y,
    though the mean of X here is nonsensical
  • The constant (if we are using dummy coding) is
    the mean for the zero (reference) group
  • The coefficient is the difference between means

10
Comparison to the t-test
  • Furthermore, the previous gives the same results
    we would have gotten via a t-test, to which we
    are about to turn,
  • However, you now can see it is not a distinct
    procedure from regression with a linear model of
    some outcome predicted by a grouping variable.
  • Two Sample t-test
  • data Outcome by Group
  • t 3.3024, df 8, p-value 0.01082
  • 95 percent confidence interval
  • 5.774177 1.025823

11
Summary
  • Understanding the basics regarding the general
    linear model can go a long way toward ones
    ability to understand any analysis
  • It not only specifically holds here but is
    utilized in more complex univariate and
    multivariate analyses, and even in some nonlinear
    situations (e.g. logistic regression), we use
    generalized linear models
  • Y Xb e
  • For properly specified models, linear models
    provide reasonable fits and an intuitive
    understanding relative to more complex approaches.
Write a Comment
User Comments (0)
About PowerShow.com