Title: Autocorrelation in Regression Analysis
1Autocorrelation in Regression Analysis
- What is Autocorrelation?
- What causes Autocorrelation?
- Tests for Autocorrelation
- Examples
- Durbin-Watson Tests
- Modeling Autoregressive Relationships
2What is Autocorrelation?
- Correlation between values of the same variable across observations
- Violation of the assumption that the errors of different observations are uncorrelated: $E(e_i e_j) = 0$
- where $i \neq j$
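- In the presence of autocorrelation, the function of Y can be expressed as $Y_t = a + bX_t + e_t$
- the function of the error (first-order case) is $e_t = \rho e_{t-1} + u_t$
- where $u_t$ is a random error that is uncorrelated across observations
- and $\rho$ is defined as the correlation between $e_t$ and $e_{t-1}$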
3What is Autocorrelation?
4Where do we find Autocorrelation?
- Autocorrelation is most often a problem in time series or geographic data
- It reflects changes in data that are a function of proximity in time or space
- Examples
- Energy market price shocks
- Transitions depend on prior states
- Economic consequences of LULUs (locally unwanted land uses)
- Distance from hazard influences magnitude of price effect
5Federal Budget Example
- Incrementalists argue that the federal budget shifts only incrementally from the prior year's budget.
- Partial Effects
- Calculating partial effects interpretation
- Variable selection and model building
- Risks in model building
6Two types of Autocorrelation
- Positive autocorrelation
- This is what we normally find. If the
autocorrelation is positive, then we expect the
sign of the residual at t to be the same as at
t-1.
7Negative Autocorrelation
- We find that the sign of the residual at t is the opposite of that at t-1
- Example: a drunken amble
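The two cases can be illustrated with simulated residuals. This is a minimal sketch, assuming a first-order process e_t = rho * e_{t-1} + u_t with standard normal shocks; the function names, the seed, and the rho values of 0.8 and -0.8 are illustrative choices, not taken from the lecture.

```python
import random

def simulate_ar1(rho, n, seed=0):
    """Simulate e_t = rho * e_{t-1} + u_t with standard normal u_t (assumed form)."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        e.append(rho * e[-1] + rng.gauss(0.0, 1.0))
    return e

def share_same_sign(e):
    """Fraction of adjacent pairs (e_{t-1}, e_t) that share a sign."""
    same = sum(1 for prev, cur in zip(e, e[1:]) if prev * cur > 0)
    return same / (len(e) - 1)

positive = simulate_ar1(0.8, 2000)    # hypothetical positive autocorrelation
negative = simulate_ar1(-0.8, 2000)   # hypothetical negative autocorrelation
print("rho =  0.8, share of sign repeats:", round(share_same_sign(positive), 2))
print("rho = -0.8, share of sign repeats:", round(share_same_sign(negative), 2))
```

With positive rho most residuals keep the sign of their predecessor; with negative rho most of them flip.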
8What causes autocorrelation?
- Misspecification
- Data Manipulation
- Before receipt
- After receipt
- Event Inertia
- Spatial ordering
9Checking for Autocorrelation
- Test: Durbin-Watson statistic
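The Durbin-Watson statistic is the sum of squared changes in successive residuals divided by the sum of squared residuals. The sketch below computes it directly; the two residual series are made-up illustrations, not data from the lecture.

```python
def durbin_watson(residuals):
    """d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    num = sum((cur - prev) ** 2 for prev, cur in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residuals: one series drifts slowly, the other flips sign each period.
trending = [1.0, 0.8, 0.9, 0.5, 0.2, -0.1, -0.4, -0.6, -0.9, -1.1]
alternating = [1.0, -0.9, 1.1, -1.0, 0.8, -1.2, 0.9, -0.8, 1.0, -1.1]
print("trending:   ", round(durbin_watson(trending), 2))
print("alternating:", round(durbin_watson(alternating), 2))
```

The statistic ranges from 0 to 4. Values near 2 suggest no first-order autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation.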
10Consider the following regression
From the Statistics option in SPSS
11Find the D-upper and D-lower
- Check a Durbin-Watson table for the values of d-upper and d-lower.
- In Hamilton that's on pp. 355-356
- For n = 20 and k = 2, at α = .05, the values are
- Lower = 1.20
- Upper = 1.41
- Because our value falls between zero and d-lower, we have positive autocorrelation
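The table lookup can be written as a decision rule. This is a sketch of the usual five-region Durbin-Watson rule, using the bounds quoted above (1.20 and 1.41); the function name and the d values passed to it are hypothetical.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic against tabulated bounds."""
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "no evidence of first-order autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"

# Hypothetical d values checked against the bounds for n = 20, k = 2, alpha = .05.
for d in (0.9, 1.3, 2.0, 2.7, 3.1):
    print(d, "->", dw_decision(d, d_lower=1.20, d_upper=1.41))
```

The upper half of the scale mirrors the lower half: the cutoffs for negative autocorrelation are 4 minus d-upper and 4 minus d-lower.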
12The Runs Test
- An alternative to the D-W test is a formalized examination of the signs of the residuals. We would expect that the signs of the residuals will be random in the absence of autocorrelation.
- The first step is to estimate the model and predict the residuals.
- Next, order the signs of the residuals against time (or spatial ordering in the case of cross-sectional data) and see if there are excessive runs of positives or negatives. Alternatively, you can graph the residuals and look for the same trends.
13Runs test continued
The final step is to use the expected mean and standard deviation of the number of runs in a standard t-test
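The steps of the runs test can be sketched as follows. The mean and variance used here are the standard Wald-Wolfowitz formulas for the number of runs, which are assumed to be the ones intended on this slide; the function name and the residuals are hypothetical.

```python
import math

def runs_test(residuals):
    """Return (observed runs, expected runs, standard deviation, test statistic).

    Assumes both positive and negative residuals occur; exact zeros are dropped.
    """
    signs = [e > 0 for e in residuals if e != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    # A new run starts every time the sign changes.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean = 2.0 * n_pos * n_neg / n + 1.0
    var = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n ** 2 * (n - 1.0))
    sd = math.sqrt(var)
    return runs, mean, sd, (runs - mean) / sd

# Hypothetical residuals ordered in time: six positives followed by six negatives.
resid = [0.5, 1.2, 0.3, 0.8, 0.1, 0.9, -0.4, -1.1, -0.2, -0.7, -0.3, -0.6]
runs, mean, sd, stat = runs_test(resid)
print("runs:", runs, "expected:", round(mean, 2), "sd:", round(sd, 2), "statistic:", round(stat, 2))
```

Far fewer runs than expected (a large negative statistic) points to positive autocorrelation; far more runs than expected points to negative autocorrelation.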
14More on The D-W
- D-W is not appropriate for auto-regressive (AR) models, where a lagged value of the dependent variable appears on the right-hand side
- In this case, we use the Durbin alternative test
- For AR models, we need to explicitly estimate the correlation between Yi and Yi-1 as a model parameter
- Techniques
- AR1 models (closest to regression; 1st order only)
- ARIMA (any order)
15Dealing with Autocorrelation
- There are several approaches to resolving problems of autocorrelation.
- Lagged dependent variables
- Differencing the dependent variable
- GLS
- ARIMA
16Lagged dependent variables
- The most common solution
- Simply create a new variable that equals Y at t-1, and use it as a RHS variable
- This correction should be based on a theoretical belief for the specification
- Can, at times, cause more problems than it solves
- Also costs a degree of freedom (lost observation)
- There are several advanced techniques for dealing with this as well
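Building the lagged variable and adding it to the right-hand side can be sketched in plain Python. The data below are hypothetical and deliberately noise-free (Y is generated as 1 + 0.5 X + 0.6 times its own lag) so that the recovered coefficients are easy to check; the helper names `ols` and `add_lagged_y` are illustrative and do not come from the lecture or from SPSS.

```python
import random

def ols(columns, y):
    """Return OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                factor = a[r][c] / a[c][c]
                a[r] = [v - factor * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

def add_lagged_y(x, y):
    """Pair each y_t with x_t and y_{t-1}; the first observation is lost."""
    return x[1:], y[:-1], y[1:]

# Hypothetical noise-free series.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(50)]
y = [0.0]
for t in range(1, 50):
    y.append(1.0 + 0.5 * x[t] + 0.6 * y[t - 1])

x_t, y_lag, y_t = add_lagged_y(x, y)
intercept, b_x, b_lag = ols([x_t, y_lag], y_t)
print(len(y), "observations become", len(y_t))
print("intercept:", round(intercept, 3), "X:", round(b_x, 3), "lagged Y:", round(b_lag, 3))
```

The shorter series after lagging is the lost observation mentioned above.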
17Differencing
- Differencing is simply the act of subtracting the previous observation value from the current observation.
- This process is effective; however, it is an EXPENSIVE correction
- This technique throws away long-term trends
- Assumes that Rho = 1 exactly
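A short sketch of first differencing, with a made-up series. The `quasi_difference` helper is an illustrative addition that subtracts rho times the previous value; setting rho to 1 reproduces plain differencing, which is the sense in which differencing assumes Rho is exactly 1.

```python
def first_difference(series):
    """Subtract the previous observation from the current one."""
    return [cur - prev for prev, cur in zip(series, series[1:])]

def quasi_difference(series, rho):
    """Subtract rho times the previous observation (rho = 1 gives first_difference)."""
    return [cur - rho * prev for prev, cur in zip(series, series[1:])]

y = [10.0, 12.0, 15.0, 19.0, 24.0]          # hypothetical series
trend = [3.0 + 2.0 * t for t in range(6)]   # hypothetical pure linear trend

print(first_difference(y))        # [2.0, 3.0, 4.0, 5.0]
print(first_difference(trend))    # [2.0, 2.0, 2.0, 2.0, 2.0]
print(quasi_difference(y, 1.0) == first_difference(y))  # True
```

The differenced trend is a constant: the level and long-run direction of the original series are gone, and one observation is lost.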
18GLS and ARIMA
- GLS approaches use maximum likelihood to estimate Rho and correct the model
- These are good corrections, and can be replicated in OLS
- ARIMA is an acronym for Autoregressive Integrated Moving Average
- This process is a univariate filter used to cleanse variables of a variety of pathologies before analysis
19Corrections based on Rho
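- There are several ways to estimate rho, the simplest being to calculate it from the residuals $e_t$ of the original regression: $\hat{\rho} = \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=2}^{n} e_{t-1}^2}$
- We then estimate the regression by transforming the variables so that $Y_t^* = Y_t - \hat{\rho} Y_{t-1}$ and $X_t^* = X_t - \hat{\rho} X_{t-1}$
- This gives the regression $Y_t^* = a(1 - \hat{\rho}) + bX_t^* + u_t$, where a and b are the intercept and slope of the original regression and $u_t$ is the remaining error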
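The rho-based correction can be sketched end to end: fit OLS, estimate rho from the residuals, transform Y and X, and refit. This is a single pass that resembles the Cochrane-Orcutt procedure, offered as an illustration rather than as the exact routine used in the lecture. The simulated data (true intercept 2, slope 3, rho 0.6), the seed, and all function names are assumptions.

```python
import random

def simple_ols(x, y):
    """Intercept and slope of a one-regressor OLS fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def estimate_rho(resid):
    """rho-hat = sum(e_t * e_{t-1}) / sum(e_{t-1}^2), for t = 2..n."""
    num = sum(cur * prev for prev, cur in zip(resid, resid[1:]))
    den = sum(prev ** 2 for prev in resid[:-1])
    return num / den

def rho_corrected_fit(x, y):
    """One pass: OLS residuals -> rho-hat -> transformed regression."""
    a0, b0 = simple_ols(x, y)
    resid = [yi - a0 - b0 * xi for xi, yi in zip(x, y)]
    rho = estimate_rho(resid)
    x_star = [cur - rho * prev for prev, cur in zip(x, x[1:])]
    y_star = [cur - rho * prev for prev, cur in zip(y, y[1:])]
    a_star, b = simple_ols(x_star, y_star)
    # The transformed intercept equals a * (1 - rho), so divide to recover a.
    return rho, a_star / (1.0 - rho), b

# Hypothetical data with first-order autocorrelated errors.
rng = random.Random(42)
n = 500
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
e = [rng.gauss(0.0, 1.0)]
for _ in range(n - 1):
    e.append(0.6 * e[-1] + rng.gauss(0.0, 1.0))
y = [2.0 + 3.0 * xi + ei for xi, ei in zip(x, e)]

rho, a, b = rho_corrected_fit(x, y)
print("rho-hat:", round(rho, 3), "intercept:", round(a, 3), "slope:", round(b, 3))
```

In practice the two steps are often repeated until rho stops changing; the single pass here keeps the sketch short.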
20Estimating the relationship between X and Y
- First, we can estimate the lagged dependent
variable model.
21Now the regression correcting for Rho
- We can estimate Rho by calculating it.
- ρ = .587
22Final thoughts
- Each correction has a best application.
- If we wanted to evaluate a mean shift (dummy variable only model), calculating rho will not be a good choice. Then we would want to use the lagged dependent variable
- Also, where we want to test the effect of inertia, it is probably better to use the lag
- In small-N samples, calculating rho tends to be more accurate
23Homework
- Using the data that accompany this lecture, estimate the effect of X on Y.
- Run the regular regression, a lagged dependent variable model, and calculate rho.
- Next, test the effect of dummy variable X2 on the series Y2.
- Run a regular regression, then run a regression with a lagged dependent variable.
- Write a brief description of what problems neglecting the effect of time in the second model might cause for a decision-maker
24Break: Coming up
- Review for Exam
- Exam Posting
- Available on Wednesday Morning, 10am