B AD 6243: Applied Univariate Statistics - PowerPoint PPT Presentation

1
B AD 6243 Applied Univariate Statistics
  • Multiple Regression
  • Professor Laku Chidambaram
  • Price College of Business
  • University of Oklahoma

2
Basics of Multiple Regression
  • Multiple regression examines the relationship
    between one interval/ratio level variable and two
    or more interval/ratio (or dichotomous) variables
  • As in simple regression, the dependent (or
    criterion) variable is y and the other variables
    are the independent (or predictor) variables xi
  • The intent of the regression model is to find the
    linear combination of xs that best correlates
    with y
  • The model is expressed as
    $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon_i$
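
A minimal sketch of estimating the coefficients of the model on this slide by ordinary least squares, assuming NumPy is available. The data values and the names `X1`, `X2` and `b` are hypothetical; Y is built without noise so that the true coefficients are known.

```python
import numpy as np

# Hypothetical data generated from Y = 1 + 2*X1 + 3*X2 (no noise, so the fit is exact)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
Y = 1.0 + 2.0 * X1 + 3.0 * X2

# Design matrix: a column of ones for the intercept, then one column per predictor
X = np.column_stack([np.ones_like(X1), X1, X2])

# Least-squares estimates of (b0, b1, b2)
b, *_ = np.linalg.lstsq(X, Y, rcond=None)

Y_hat = X @ b          # predicted values of Y
residuals = Y - Y_hat  # estimates of the error terms
```

With real data the residuals would not be zero, and `b` would only approximate the underlying coefficients.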

3
A Graphical Representation
Objective: To graphically represent the equation
$Y = \beta_0 + \beta_1\,\mathrm{Exp\_X1} + \beta_2\,\mathrm{RlExp\_X2} + \varepsilon_i$
4
Selecting Predictors
  • Rely on theory to inform selection
  • Examine correlation matrix to determine strength
    of relationships with Y
  • Use variables based on your knowledge
  • Let the computer decide based on data set
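
One way to examine the correlation matrix is sketched below, assuming NumPy. The data and the column names are hypothetical; the point is only to show how the correlation of each candidate predictor with Y can be read off the first row of the matrix.

```python
import numpy as np

# Hypothetical data: rows are cases, columns are Y, X1, X2
data = np.array([
    [10.0, 1.0, 5.0],
    [12.0, 2.0, 3.0],
    [15.0, 3.0, 6.0],
    [19.0, 4.0, 2.0],
    [20.0, 5.0, 4.0],
])
names = ["Y", "X1", "X2"]

# rowvar=False treats each column as a variable
corr = np.corrcoef(data, rowvar=False)

# Correlation of each candidate predictor with Y (first row, skipping Y itself)
r_with_y = dict(zip(names[1:], corr[0, 1:]))

# Predictor with the largest absolute correlation with Y
strongest = max(r_with_y, key=lambda name: abs(r_with_y[name]))
```

The off-diagonal entries among the predictors themselves are also worth a look, since high values there hint at multicollinearity.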

5
Selecting Method of Inclusion
  • Enter
  • Enter Block
  • Stepwise
      • Forward selection
      • Backward elimination
      • Stepwise
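
A rough sketch of forward selection, assuming NumPy. Statistical packages typically decide entry with significance tests; the R-squared gain threshold `min_gain`, the helper names and the simulated data here are simplifying assumptions made for illustration only.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of an OLS fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    ss_res = np.sum((y - A @ b) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def forward_select(X, y, min_gain=0.01):
    """Add predictors one at a time while R-squared improves by at least min_gain."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        # Score each candidate by the R-squared of the model that includes it
        scores = {j: r_squared(X[:, selected + [j]], y) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_r2 < min_gain:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best_r2 = scores[j_best]
    return selected, best_r2

# Simulated data: columns 0 and 1 drive y, column 2 is irrelevant
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=50)

selected, r2 = forward_select(X, y)
```

Backward elimination would run the same loop in reverse, starting from the full model and dropping the predictor whose removal costs the least.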

6
What to Look For?
  • b-values vs. standardized beta weights (β)
  • R represents the correlation between observed values
    and predicted values of Y
  • R-squared represents the proportion of variance in Y
    shared with all the predictors combined
  • Adjusted R-squared
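
The quantities on this slide can be computed as in the sketch below, assuming NumPy and hypothetical data. The adjusted R-squared shown is the common formula 1 - (1 - R²)(n - 1)/(n - k - 1), which may differ from what a given package reports.

```python
import numpy as np

# Hypothetical data: 6 cases, 2 predictors
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0],
              [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]])
y = np.array([3.0, 4.0, 8.0, 8.0, 13.0, 12.0])
n, k = X.shape

A = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(A, y, rcond=None)  # b[0] is the intercept, b[1:] the b-values
y_hat = A @ b

# R: correlation between observed and predicted values of Y
R = np.corrcoef(y, y_hat)[0, 1]

# R-squared: proportion of variance in Y accounted for by the predictors
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Adjusted R-squared penalizes the number of predictors k
adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Standardized beta = b * (SD of predictor / SD of outcome)
beta = b[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)
```

Because the beta weights are in standard-deviation units, they can be compared across predictors measured on different scales, which the raw b-values cannot.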

7
First Order Assumptions
  • Continuous variables (also see next slide)
  • Linear relationships between Y and Xs
  • Sufficient variance in values of predictors
  • Predictors uncorrelated with external variables

8
Including Categorical Variables
  • Dichotomous variables, e.g., Gender
      • Coded as 0 or 1
  • Dummy variables, e.g., Political affiliation
      • Create d - 1 dummy variables, where d is the
        number of categories
      • So, with four categories, you need three dummy
        variables

Variable/Category   D1   D2   D3
Democrat             1    0    0
Republican           0    1    0
Libertarian          0    0    1
Other                0    0    0
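
The coding scheme in the table above can be sketched in a few lines of Python. The function name `dummy_code` and the sample values are hypothetical; "Other" is treated as the reference category, as in the table.

```python
# Categories from the table above; "Other" is the reference category (all zeros)
categories = ["Democrat", "Republican", "Libertarian", "Other"]
reference = "Other"

def dummy_code(value):
    """Return the d - 1 dummy values (D1, D2, D3) for one observation."""
    return [1 if value == c else 0 for c in categories if c != reference]

# Hypothetical sample of four respondents
sample = ["Republican", "Other", "Democrat", "Libertarian"]
coded = [dummy_code(v) for v in sample]
```

Each coefficient on a dummy variable is then interpreted relative to the reference category.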
9
Second Order Assumptions
  • Independence of independent variables (no multicollinearity)
  • Equality of variance (homoskedasticity)
  • Normal distribution of error terms
  • Independence of observations (no autocorrelation)

10
Violations of Assumptions
PROBLEM             DEFINITION                                    DETECTION
Multicollinearity   Predictor variables are highly correlated     High inter-correlations; examine VIFs and tolerances
Heteroskedasticity  Error terms do not have a constant variance   Scatter plot of residuals; split file to examine variances
Outliers            Error terms not normally distributed          Cook's distance; Mahalanobis distance; residual plots
Autocorrelation     Residuals are correlated                      Durbin-Watson statistic (near 2: no autocorrelation; if < 2, positive correlation; if > 2, negative correlation)
11
Multicollinearity
  • High correlations among predictors
  • Can result in
      • Lower value of R
      • Difficulty of judging relative importance of
        predictors
      • Increased instability of the model
  • Possible solutions
      • Examine correlation matrices, VIFs and tolerances
        to judge if predictor(s) need to be dropped
      • Rely on computer-assisted means
      • Other options
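
A sketch of how VIFs and tolerances can be computed, assuming NumPy. The function name `vif` and the simulated predictors are hypothetical; the third predictor is deliberately built as a near copy of the first so that the problem shows up.

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the others."""
    n, k = X.shape
    out = []
    for j in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ coef
        ss_tot = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        r2_j = 1.0 - (resid @ resid) / ss_tot
        out.append(1.0 / (1.0 - r2_j))
    return np.array(out)

# Simulated predictors: x3 is almost a copy of x1, x2 is unrelated
rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
x3 = x1 + 0.1 * rng.normal(size=100)
X = np.column_stack([x1, x2, x3])

vifs = vif(X)
tolerances = 1.0 / vifs  # tolerance is the reciprocal of VIF
```

A frequently quoted rule of thumb treats a VIF above 10 (tolerance below 0.1) as a warning sign, though cutoffs vary by source.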

12
Heteroskedasticity
  • Systematic increase or decrease in variance
  • Can result in
      • Confidence intervals being too wide or narrow
      • Unstable estimates
  • Possible solutions
      • Transform data
      • Other options
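
An informal version of the "split file to examine variances" check is sketched below, assuming NumPy. The data are hypothetical and constructed so that the error spread grows with x; this is a rough diagnostic, not a formal test.

```python
import numpy as np

# Hypothetical data in which the error spread grows with x
x = np.arange(1.0, 21.0)
signs = np.where(np.arange(20) % 2 == 0, 1.0, -1.0)
y = 2.0 + 3.0 * x + 0.5 * x * signs

# Fit the regression and compute residuals
A = np.column_stack([np.ones_like(x), x])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ b
resid = y - fitted

# Split the cases at the median fitted value and compare residual variances
cut = np.median(fitted)
low = resid[fitted <= cut]
high = resid[fitted > cut]
variance_ratio = high.var(ddof=1) / low.var(ddof=1)
```

A ratio far from 1 suggests that the error variance is not constant; a scatter plot of `resid` against `fitted` would show the same pattern as a funnel shape.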

13
Outliers
  • Undue influence of extreme values
  • Can result in
      • Incorrect estimates and inaccurate confidence
        intervals
  • Possible solutions
      • Identify and eliminate value(s), but
      • Transform data
      • Other options
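
Cook's distance, one of the detection tools listed earlier, can be sketched as follows, assuming NumPy. The function name `cooks_distance` and the data are hypothetical; the last case is deliberately moved far off the line followed by the others.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each case in an OLS fit of y on X (intercept added here)."""
    A = np.column_stack([np.ones(len(y)), X])
    n, p = A.shape
    H = A @ np.linalg.pinv(A)   # hat matrix; its diagonal holds the leverages
    h = np.diag(H)
    e = y - H @ y               # residuals
    s2 = (e @ e) / (n - p)      # residual mean square
    return (e ** 2 / (p * s2)) * h / (1.0 - h) ** 2

# Hypothetical data: y = 2x for every case except the last one
x = np.arange(1.0, 11.0)
y = 2.0 * x
y[-1] = 40.0

d = cooks_distance(x, y)
most_influential = int(np.argmax(d))
```

A common rule of thumb flags cases with a Cook's distance above 1 for closer inspection before any decision to drop them.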

14
Autocorrelation
  • Observations are not independent (typically,
    observations over time)
  • Can result in
      • Lower standard error of estimate
      • Lower standardized beta values
  • Possible solutions
      • Search for key missing variables
      • Cochrane-Orcutt Procedure
      • Other options
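
The Durbin-Watson statistic used to detect autocorrelation can be sketched in plain Python. The function name and both residual series are hypothetical; the residuals are assumed to be in time order.

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residuals, in time order
positively_correlated = [1.0, 1.2, 0.9, 1.1, -1.0, -1.2, -0.9, -1.1]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_pos = durbin_watson(positively_correlated)  # neighbors are similar, so DW falls below 2
dw_neg = durbin_watson(alternating)            # neighbors flip sign, so DW rises above 2
```

Values near 2 suggest no first-order autocorrelation; the statistic ranges from 0 to 4.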

15
Results of Analysis
16
Results of Analysis (contd.)
17
A Graphical Representation