I271B Quantitative Methods

Transcript and Presenter's Notes

1
Regression and Diagnostics
  • I271B Quantitative Methods

2
Regression versus Correlation
  • Correlation makes no assumption about whether one
    variable is dependent on the other; it is only a
    measure of general association.
  • Regression attempts to describe the dependence of
    a single dependent variable on one or more
    explanatory variables. It assumes a one-way
    causal link from X to Y.
  • Thus, correlation measures the strength of a
    relationship (from -1 to 1), while regression
    describes the exact nature of that relationship
    (e.g., the specific slope, which is the change in
    Y given a unit change in X); see the sketch below.
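
As a concrete illustration of the distinction, here is a minimal NumPy sketch on simulated data: the correlation r is a symmetric, unitless measure, while the regression slope is the specific change in Y per unit change in X.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(size=100)   # true slope of 2, plus noise

# Correlation: a symmetric measure of association in [-1, 1]
r = np.corrcoef(x, y)[0, 1]

# Regression: the specific slope (change in Y per unit change in X)
# and intercept of the fitted line
b1, b0 = np.polyfit(x, y, deg=1)

print(f"correlation r = {r:.2f}")
print(f"slope b1 = {b1:.2f}, intercept b0 = {b0:.2f}")
```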

3
Basic Linear Model
4
Basic Linear Function
5
Slope
But...what happens if B is negative?
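
The figures for these three slides are not reproduced in the transcript. As a stand-in, here is a minimal sketch of the basic linear function and of what a negative B implies (the values are illustrative only):

```python
import numpy as np

def linear(x, b0, b1):
    """Basic linear function: Y = b0 + b1 * X."""
    return b0 + b1 * x

x = np.linspace(0, 10, 5)
print(linear(x, b0=1.0, b1=0.5))    # positive slope: Y rises as X rises
print(linear(x, b0=1.0, b1=-0.5))   # negative slope: Y falls as X rises
```
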
6
Statistical Inference Using Least Squares
  • We obtain a sample statistic, b, which estimates
    the population parameter.
  • We also have the standard error for b.
  • Uses the standard t-distribution with n-2 degrees
    of freedom for hypothesis testing.
  • Yi = b0 + b1xi + ei
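
A minimal sketch of this inference done by hand, using the standard simple-regression formulas (the data are simulated, and SciPy is assumed available for the t-distribution):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 50
x = rng.uniform(0, 10, n)
y = 3.0 + 1.5 * x + rng.normal(scale=2.0, size=n)

# Sample statistics b0, b1 estimating the population parameters
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Standard error of b1 from the residual variance
resid = y - (b0 + b1 * x)
s2 = np.sum(resid ** 2) / (n - 2)
se_b1 = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))

# t-test of H0: slope = 0, with n - 2 degrees of freedom
t = b1 / se_b1
p = 2 * stats.t.sf(abs(t), df=n - 2)
print(f"b1 = {b1:.3f}, SE = {se_b1:.3f}, t = {t:.2f}, p = {p:.4f}")
```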

7
Why Least Squares?
  • For any Y and X, there is one and only one line
    of best fit. The least squares regression
    equation minimizes the total squared error
    between our observed values of Y and our
    predicted values of Y (often called y-hat); see
    the sketch below.
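
One way to see this uniqueness claim: compute the sum of squared errors at the least-squares line and at nearby lines; every perturbation of the slope makes it larger. A minimal sketch with simulated data:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 40)
y = 1.0 + 2.0 * x + rng.normal(size=40)

b1, b0 = np.polyfit(x, y, deg=1)   # least-squares slope and intercept

def sse(slope, intercept):
    """Sum of squared errors between observed Y and y-hat."""
    return np.sum((y - (intercept + slope * x)) ** 2)

print(f"SSE at the OLS line:        {sse(b1, b0):.2f}")
print(f"SSE with slope nudged up:   {sse(b1 + 0.2, b0):.2f}")  # larger
print(f"SSE with slope nudged down: {sse(b1 - 0.2, b0):.2f}")  # larger
```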

8
Data points and Regression
  • http://www.math.csusb.edu/faculty/stanton/m262/regress/regress.html

9
Multivariate Regression
  • Control Variables
  • Alternate Predictor Variables
  • Nested Models

10
Nested Models
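
The slide itself carries no text, but nested models are typically compared with an F-test of the restricted model against the full model. A minimal statsmodels sketch (the data are simulated and the variable names are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.5 * x2 + rng.normal(size=n)

# Restricted model (y ~ x1) is nested in the full model (y ~ x1 + x2)
restricted = sm.OLS(y, sm.add_constant(x1)).fit()
full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

# F-test: does adding x2 significantly improve the fit?
f_stat, p_value, df_diff = full.compare_f_test(restricted)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}, df diff = {df_diff}")
```
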
11
Regression Diagnostics
12
Lab 4
  • Stating Hypotheses
  • Interpreting Hypotheses
  • Terminology
  • Appropriate statistics and conventions
  • Effect Size (revisited)
  • Cohen's d and the .2, .5, .8 interpretation
    values (a sketch follows this list)
  • See also http://web.uccs.edu/lbecker/Psy590/es.htm
    for a very nice lecture and discussion of the
    different types of effect size calculations
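
For reference, Cohen's d for two independent groups is the mean difference scaled by the pooled standard deviation. A minimal sketch (the pooled-variance formula is the standard one, not taken from the slides):

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d: mean difference / pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * np.var(a, ddof=1) +
                  (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2)
    return (np.mean(a) - np.mean(b)) / np.sqrt(pooled_var)

rng = np.random.default_rng(4)
group1 = rng.normal(loc=0.5, size=60)
group2 = rng.normal(loc=0.0, size=60)
print(f"d = {cohens_d(group1, group2):.2f}")  # ~0.5 is a 'medium' effect
```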

13
Multicollinearity
  • Occurs when an IV is very highly correlated with
    one or more other IVs
  • Caused by many things (including variables
    computed from other variables in the same
    equation, using different operationalizations of
    the same concept, etc.)
  • Consequences
  • For OLS regression, it does not violate
    assumptions, but
  • Standard Errors will be much, much larger than
    normal when there is multicollinearity
    (confidence intervals become wider, t-statistics
    become smaller)
  • We often use VIF (variance inflation factor)
    scores to detect multicollinearity (a sketch
    follows this list)
  • Generally, a VIF of 5-10 signals a potential
    problem, and higher values are considered
    clearly problematic
  • Solving the problem
  • Typically, regressing each IV on the other IVs
    is a way to find the problem variable(s).
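
A minimal sketch of computing VIF scores with statsmodels, on simulated data where x2 is deliberately built from x1 so its VIF should come out high:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(5)
n = 200
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + rng.normal(scale=0.2, size=n)  # nearly a copy of x1
x3 = rng.normal(size=n)

# VIF for each IV, computed against the full design matrix
X = sm.add_constant(np.column_stack([x1, x2, x3]))
for i, name in enumerate(["x1", "x2", "x3"], start=1):
    print(f"VIF({name}) = {variance_inflation_factor(X, i):.1f}")
```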

14
Heteroskedasticity
  • OLS regression assumes that the variance of the
    error term is constant. If the error does not
    have a constant variance, then it is
    heteroskedastic.
  • Where it comes from
  • The error variance may genuinely change as an IV
    increases
  • Measurement error
  • An underspecified model

15
Heteroskedasticity (continued)
  • Consequences
  • We still get unbiased parameter estimates, but
    our line may not be the best fit.
  • Why? Because OLS gives more weight to the
    cases that might actually have the most error
    from the predicted line.
  • Detecting it
  • We have to look at the residuals (the differences
    between the observed responses and the predicted
    responses)
  • First, use a residual-versus-fitted-values plot
    (in Stata, rvfplot) or a residual-versus-predictor
    plot, which plots the residuals against one of
    the independent variables.
  • We should see an even band across the 0 point
    (the line), indicating that our error variance is
    roughly constant.
  • If we are still concerned, we can run a test such
    as the Breusch-Pagan/Cook-Weisberg test for
    heteroskedasticity. It tests the null hypothesis
    that the error variances are all EQUAL against
    the alternative hypothesis that there is some
    difference. Thus, if the test is significant, we
    reject the null hypothesis and conclude we have a
    heteroskedasticity problem (a sketch follows this
    list).
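
A rough Python analogue of Stata's rvfplot, plus the Breusch-Pagan test as implemented in statsmodels. The data are simulated so that the error variance grows with x, i.e., deliberately heteroskedastic:

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(6)
n = 200
x = rng.uniform(1, 10, n)
y = 2.0 + 1.0 * x + rng.normal(scale=0.5 * x, size=n)  # spread grows with x

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

# Residual-versus-fitted plot: an even band around 0
# would suggest constant error variance.
plt.scatter(fit.fittedvalues, fit.resid, s=10)
plt.axhline(0, color="red")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()

# Breusch-Pagan test: H0 is that the error variances are all equal.
lm_stat, lm_p, f_stat, f_p = het_breuschpagan(fit.resid, X)
print(f"Breusch-Pagan LM p-value = {lm_p:.4f}")  # small p => heteroskedasticity
```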