Title: Autocorrelation
Chapter 6
What is in this Chapter?
- How do we detect this problem?
- What are the consequences?
- What are the solutions?
- Regarding the problem of detection, we start with the Durbin-Watson (DW) statistic and discuss its several limitations and extensions. We discuss Durbin's h-test for models with lagged dependent variables, and tests for higher-order serial correlation.
- We discuss (in Section 6.5) the consequences of serially correlated errors for the OLS estimators.
- The solutions to the problem of serial correlation are discussed in Section 6.3 (estimation in levels versus first differences), Section 6.9 (strategies when the DW test statistic is significant), and Section 6.10 (trends and random walks).
- This chapter is very important, and its ideas have to be understood thoroughly.
6.1 Introduction
- The order of autocorrelation
- In the following sections we discuss how to
- 1. Test for the presence of serial correlation.
- 2. Estimate the regression equation when the errors are serially correlated.
6.2 Durbin-Watson Test
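The equation slides for this section are standard material: the DW statistic is d = Σ(et − et−1)² / Σ et², which is approximately 2(1 − ρ̂), so values near 2 suggest no first-order autocorrelation. A minimal Python sketch (the residual vectors are illustrative, not from the text):

```python
def durbin_watson(resid):
    """DW statistic: d = sum_{t=2}^n (e_t - e_{t-1})^2 / sum_t e_t^2.
    d near 2 suggests no first-order autocorrelation; d near 0
    suggests positive, and d near 4 negative, autocorrelation."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

# Smooth (positively autocorrelated) residuals give a low d;
# alternating residuals give a d near 4.
print(durbin_watson([1, 1, 1, -1, -1, -1]))   # low (= 4/6)
print(durbin_watson([1, -1, 1, -1, 1, -1]))   # high (= 20/6)
```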
6.3 Estimation in Levels Versus First Differences
- Simple solutions to the serial correlation problem: first differences.
- If the DW test rejects the hypothesis of zero serial correlation, what is the next step?
- In such cases one estimates the regression after transforming all the variables by ρ-differencing (quasi-first differencing) or first differencing.
- When comparing equations in levels and in first differences, one cannot compare the R2 values because the explained variables are different.
- One can compare the residual sums of squares, but only after making a rough adjustment. (Please refer to p. 231.)
- Since we have comparable residual sums of squares (RSS), we can get the comparable R2 as well, using the relationship RSS = Syy(1 - R2).
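Given an RSS that has been adjusted to a comparable metric, the relationship can be inverted directly. A trivial Python sketch (the numbers are hypothetical, purely for illustration):

```python
def comparable_r2(rss, syy):
    """Invert RSS = Syy * (1 - R^2) to recover a comparable R^2."""
    return 1.0 - rss / syy

# Hypothetical values: an adjusted RSS of 40 against Syy = 100
# for the common dependent variable.
print(comparable_r2(40.0, 100.0))  # 0.6
```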
- Usually, with time-series data, one gets high R2 values if the regressions are estimated with the levels yt and xt, but low R2 values if the regressions are estimated in first differences (yt - yt-1) and (xt - xt-1).
- Since a high R2 is usually considered proof of a strong relationship between the variables under investigation, there is a strong tendency to estimate the equations in levels rather than in first differences.
- This is sometimes called the "R2 syndrome."
- However, if the DW statistic is very low, it often implies a misspecified equation, no matter what the value of the R2 is.
- In such cases one should estimate the regression equation in first differences; if the R2 is then low, this merely indicates that the variables y and x are not related to each other.
- Granger and Newbold present some examples with artificially generated data where y, x, and the error u are each generated independently, so that there is no relationship between y and x.
- But the correlations between yt and yt-1, xt and xt-1, and ut and ut-1 are very high.
- Although there is no relationship between y and x, the regression of y on x gives a high R2 but a low DW statistic.
- When the regression is run in first differences, the R2 is close to zero and the DW statistic is close to 2, demonstrating that there is indeed no relationship between y and x and that the R2 obtained earlier is spurious.
- Thus regressions in first differences might often reveal the true nature of the relationship between y and x.
- Further discussion of this problem is in Sections 6.10 and 14.7.
Homework
- Find the data
  - Y is the Taiwan stock index
  - X is the U.S. stock index
- Run two equations
  - The equation in levels (log-based prices)
  - The equation in first differences
- Compare the two equations on
  - The beta estimate and its significance
  - The R2
  - The value of the DW statistic
- Q: Adopt the equation in levels or in first differences?
- For instance, suppose that we have quarterly data; then it is possible that the errors in any quarter this year are most highly correlated with the errors in the corresponding quarter last year, rather than with the errors in the preceding quarter.
- That is, ut could be uncorrelated with ut-1 but highly correlated with ut-4.
- If this is the case, the DW statistic will fail to detect it.
- What we should be using is a modified statistic, defined with the lag-4 difference: d4 = Σ(et - et-4)^2 / Σ et^2.
6.4 Estimation Procedures with Autocorrelated Errors
- GLS (generalized least squares)
- In actual practice ρ is not known.
- There are two types of procedures for estimating ρ:
- 1. Iterative procedures.
- 2. Grid-search procedures.
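The iterative idea can be sketched as a Cochrane-Orcutt-style loop: quasi-difference with the current ρ estimate, run OLS on the transformed data, then re-estimate ρ from the residuals of the levels equation. The Python below is a sketch with toy data (function names and the data-generating choices are illustrative, not the textbook's example):

```python
def ols(y, x):
    """Simple OLS of y on x with an intercept; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
        / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def cochrane_orcutt(y, x, iters=25):
    """Iterate: quasi-difference with the current rho, run OLS on the
    transformed data, then re-estimate rho from the implied residuals
    of the original (levels) equation."""
    rho = 0.0
    for _ in range(iters):
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        a_star, b = ols(ys, xs)          # intercept here is alpha*(1 - rho)
        alpha = a_star / (1.0 - rho)
        e = [y[t] - alpha - b * x[t] for t in range(len(y))]
        rho = sum(e[t] * e[t - 1] for t in range(1, len(e))) \
              / sum(et ** 2 for et in e[:-1])
    return alpha, b, rho

# Toy data: y = 1 + 2x + u with a small, serially dependent error.
x = [float(t) for t in range(50)]
u = [0.0]
for t in range(1, 50):
    u.append(0.5 * u[-1] + (0.1 if t % 2 == 0 else -0.1))
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]
alpha, b, rho = cochrane_orcutt(y, x)
print(round(b, 2))  # slope close to the true value 2
```

A grid-search (Hildreth-Lu) variant instead evaluates the transformed regression's RSS over a grid of ρ values (e.g., steps of 0.01) and keeps the minimizer.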
Homework
- Redo the example in the textbook (see Table 3.11 for the data)
- OLS
- C-O (Cochrane-Orcutt) procedure
- H-L (Hildreth-Lu) procedure with an interval of 0.01
- Compare the R2 (note: calculate the comparable R2 from the levels equation)
6.5 Effect of AR(1) Errors on OLS Estimates
- In Section 6.4 we described different procedures for the estimation of regression models with AR(1) errors.
- We will now answer two questions that might arise with the use of these procedures:
- 1. What do we gain from using these procedures?
- 2. When should we not use these procedures?
- First, in the case we are considering (i.e., the case where the explanatory variable xt is independent of the error ut), the OLS estimates are unbiased.
- However, they will not be efficient.
- Further, the tests of significance we apply, which will be based on the wrong covariance matrix, will be wrong.
- In the case where the explanatory variables include lagged dependent variables, we will have some further problems, which we discuss in Section 6.7.
- For the present, let us consider the simple regression model.
An Alternative Method to Prove the Above Characteristics
- Use the simulation method shown in Chapter 5
- Write your program in GAUSS
- Take the program from Chapter 5 and make some modifications to it
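A sketch of such a simulation in Python rather than GAUSS (the design mirrors the GAUSS snippets that follow: x independent of u, true slope beta = 2; the function name and all settings are illustrative):

```python
import random

def mean_ols_slope(rho, n=100, reps=300, beta=2.0, seed=0):
    """Monte Carlo sketch: average OLS slope when the error follows
    an AR(1) process with parameter rho and x is independent of u.
    The average stays near beta for any rho (unbiasedness); only the
    spread of the individual estimates grows with rho."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        u, x = [rng.gauss(0, 1)], []
        for t in range(n):
            x.append(rng.gauss(0, 1))
            if t > 0:
                u.append(rho * u[-1] + rng.gauss(0, 1))
        y = [beta * xi + ui for xi, ui in zip(x, u)]
        mx, my = sum(x) / n, sum(y) / n
        b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
            / sum((xi - mx) ** 2 for xi in x)
        slopes.append(b)
    return sum(slopes) / reps

print(mean_ols_slope(0.0))  # close to 2
print(mean_ols_slope(0.9))  # still close to 2, but individual
                            # estimates are much noisier
```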
- Thus the consequences of autocorrelated errors are:
- 1. The least squares estimators are unbiased but not efficient. Sometimes they are considerably less efficient than procedures that take account of the autocorrelation.
- 2. The sampling variances are biased and sometimes likely to be seriously understated. Thus R2, as well as the t and F statistics, tend to be exaggerated.
- 2. The discussion above assumes that the true errors are first-order autoregressive. If they have a more complicated structure (e.g., second-order autoregressive), it might be thought that it would still be better to proceed on the assumption that the errors are first-order autoregressive rather than ignore the problem completely and use the OLS method.
- Engle shows that this is not necessarily true (i.e., sometimes one can be worse off making the assumption of first-order autocorrelation than ignoring the problem completely).
6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables
- In previous sections we considered explanatory variables that were uncorrelated with the error term.
- This will not be the case if we have lagged dependent variables among the explanatory variables and serially correlated errors.
- There are several situations under which we would be considering lagged dependent variables as explanatory variables.
- These could arise through expectations, adjustment lags, and so on.
- The various situations and models are explained in Chapter 10. For the present we will not be concerned with how the models arise; we will merely study the problem of testing for autocorrelation in these models.
- Let us consider a simple model.
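Durbin's h statistic for such models has the standard form h = ρ̂ √(n / (1 − n·V̂)), where V̂ is the estimated variance of the coefficient on the lagged dependent variable; h is compared with the standard normal distribution, and is undefined when n·V̂ ≥ 1 (Durbin's second test is then used instead). A sketch in Python (ρ̂ is computed directly from supplied residuals; the caller supplies V̂, and the numbers below are purely illustrative):

```python
import math

def durbin_h(resid, var_lag_coef):
    """Durbin's h: rho_hat * sqrt(n / (1 - n * V)), where V is the
    estimated variance of the lagged-dependent-variable coefficient.
    Returns None when n * V >= 1 (h is then undefined)."""
    n = len(resid)
    rho_hat = sum(resid[t] * resid[t - 1] for t in range(1, n)) \
              / sum(e ** 2 for e in resid)
    if n * var_lag_coef >= 1:
        return None
    return rho_hat * math.sqrt(n / (1 - n * var_lag_coef))

# Illustrative numbers only: four residuals, V = 0.1.
print(durbin_h([1, 1, -1, -1], 0.1))
```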
new;
format /m1 /rd 9,3;
beta = 2;
T = 30;        @ sample number @
u = Rndn(T,1);
x = Rndn(T,1) + 0*u;    @ x independent of u @
y = beta*x + u;
@ OLS @
Beta_OLS = olsqr(y,x);
print " OLS beta estimate ";
Beta_OLS;
new;
format /m1 /rd 9,3;
beta = 2;
T = 50000;     @ sample number @
u = Rndn(T,1);
x = Rndn(T,1) + 0*u;    @ x independent of u @
y = beta*x + u;
@ OLS @
Beta_OLS = olsqr(y,x);
print " OLS beta estimate ";
Beta_OLS;
new;
format /m1 /rd 9,3;
beta = 2;
T = 50000;     @ sample number @
u = Rndn(T,1);
x = Rndn(T,1) + 0.5*u;  @ x correlated with u @
y = beta*x + u;
@ OLS @
Beta_OLS = olsqr(y,x);
print " OLS beta estimate ";
Beta_OLS;
6.8 A General Test for Higher-Order Serial Correlation: The LM Test
- The h-test we have discussed is, like the Durbin-Watson test, a test for first-order autoregression.
- Breusch and Godfrey discuss some general tests that are easy to apply and are valid for very general hypotheses about the serial correlation in the errors.
- These tests are derived from a general principle called the Lagrange multiplier (LM) principle.
- A discussion of this principle is beyond the scope of this book. For the present we will explain what the test is.
- The test is similar to Durbin's second test that we have discussed.
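The mechanics of a common form of the test can be sketched: regress the OLS residuals e_t on the original regressor(s) plus p lagged residuals, and refer n·R² of that auxiliary regression to χ²(p). Everything below (the helper names, the single-regressor setup) is illustrative, not from the text:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(n):
            if r != i and M[i][i] != 0:
                f = M[r][i] / M[i][i]
                M[r] = [a - f * c for a, c in zip(M[r], M[i])]
    return [M[i][n] / M[i][i] for i in range(n)]

def bg_lm_stat(resid, x, p=1):
    """Breusch-Godfrey-style LM statistic: n * R^2 from the auxiliary
    regression of e_t on (1, x_t, e_{t-1}, ..., e_{t-p})."""
    rows, y = [], []
    for t in range(p, len(resid)):
        rows.append([1.0, x[t]] + [resid[t - j] for j in range(1, p + 1)])
        y.append(resid[t])
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    fit = [sum(bi * ri for bi, ri in zip(beta, r)) for r in rows]
    my = sum(y) / len(y)
    sst = sum((yi - my) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fit))
    return len(y) * (1.0 - sse / sst) if sst > 0 else 0.0

# Perfectly alternating residuals: the auxiliary regression fits
# exactly (coefficient -1 on e_{t-1}), so the statistic reaches its
# maximum, the number of usable observations (9 here).
e = [(-1.0) ** t for t in range(10)]
x = [float(t) for t in range(10)]
print(bg_lm_stat(e, x, p=1))
```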
6.9 Strategies When the DW Test Statistic is Significant
- The DW test is designed as a test of the hypothesis ρ = 0 when the errors follow a first-order autoregressive process.
- However, the test has been found to be robust against other alternatives, such as AR(2), MA(1), ARMA(1,1), and so on.
- Further, and more disturbingly, it catches specification errors like omitted variables that are themselves autocorrelated, and misspecified dynamics (a term that we will explain). Thus the strategy to adopt if the DW test statistic is significant is not clear. We discuss three different strategies.
- 1. Assume that the significant DW statistic is an indication of serial correlation but may not be due to AR(1) errors.
- 2. Test whether the serial correlation is due to omitted variables.
- 3. Test whether the serial correlation is due to misspecified dynamics.
- Serial correlation due to misspecified dynamics
6.10 Trends and Random Walks
- Both models exhibit a linear trend, but the appropriate method of eliminating the trend differs.
- To test the hypothesis that a time series belongs to the TSP (trend-stationary process) class against the alternative that it belongs to the DSP (difference-stationary process) class, Nelson and Plosser use a test developed by Dickey and Fuller.
Three Types of RW
- RW without drift: Yt = Yt-1 + ut
- RW with drift: Yt = alpha + Yt-1 + ut
- RW with drift and time trend: Yt = alpha + beta*t + Yt-1 + ut
- ut ~ iid(0, sigma^2)
RW or Unit Root Tests by EViews
- Additional slides
- Augmented D-F tests
- Yt = a1*Yt-1 + ut
- Yt - Yt-1 = (a1 - 1)*Yt-1 + ut
- ΔYt = (a1 - 1)*Yt-1 + ut
- ΔYt = γ*Yt-1 + ut
- H0: a1 = 1 is equivalent to H0: γ = 0
- Augmented form: ΔYt = γ*Yt-1 + Σ δi*ΔYt-i + ut
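A sketch of the simplest (no-constant, no-augmentation) version of this test regression in Python: regress ΔYt on Yt-1 and look at the t-ratio on γ, remembering that under H0 the ratio follows the Dickey-Fuller distribution, not the usual t. The simulated series below are illustrative:

```python
import math, random

def df_stat(y):
    """No-constant Dickey-Fuller regression: dY_t = gamma * Y_{t-1} + u_t.
    Returns (gamma_hat, t_ratio). Under H0: gamma = 0 the ratio must be
    compared with Dickey-Fuller critical values, not normal ones."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    ylag = y[:-1]
    gamma = sum(d * l for d, l in zip(dy, ylag)) / sum(l * l for l in ylag)
    resid = [d - gamma * l for d, l in zip(dy, ylag)]
    s2 = sum(e * e for e in resid) / (len(dy) - 1)
    se = math.sqrt(s2 / sum(l * l for l in ylag))
    return gamma, gamma / se

rng = random.Random(0)
rw, ar = [0.0], [0.0]
for _ in range(999):
    rw.append(rw[-1] + rng.gauss(0, 1))          # unit root: a1 = 1
    ar.append(0.5 * ar[-1] + rng.gauss(0, 1))    # stationary AR(1)
g_rw, t_rw = df_stat(rw)
g_ar, t_ar = df_stat(ar)
# gamma is near 0 for the random walk and strongly negative
# (near 0.5 - 1 = -0.5) for the stationary series.
```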
- As an illustration, consider the example given by Dickey and Fuller. For the logarithm of the quarterly Federal Reserve Board Production Index, 1950-I through 1977-IV, they assume that the time series is adequately represented by the model.
- 6. Regression of one random walk on another, with time included for trend, is strongly subject to the spurious regression phenomenon. That is, the conventional t-test will tend to indicate a relationship between the variables when none is present.
- The main conclusion is that using a regression on time has serious consequences when, in fact, the time series is of the DSP type, and hence differencing is the appropriate procedure for trend elimination.
- Plosser and Schwert also argue that with most economic time series it is always best to work with differenced data rather than data in levels.
- The reason is that if the data series are indeed of the DSP type, the errors in the levels equation will have variances increasing over time.
- Under these circumstances many of the properties of least squares estimators, as well as tests of significance, are invalid.
- On the other hand, suppose that the levels equation is correctly specified. Then all differencing will do is produce a moving average error; at worst, ignoring it will give inefficient estimates.
- For instance, suppose that we have the model.
- Differencing and Long-Run Effects: The Concept of Cointegration
- One drawback of the procedure of differencing is that it results in a loss of valuable "long-run information" in the data.
- Recently, the concept of cointegrated series has been suggested as one solution to this problem. First, we need to define the term "cointegration."
- Although we do not need the assumptions of normality and independence, we will define the terms under these assumptions.
- Yt ~ I(1)
- Yt is a random walk
- ΔYt is white noise, or iid
- No one can predict the future price change
- The market is efficient
- The impact of a previous shock on the price will remain and will not approach zero
Cointegration
- Run the VECM (vector error correction model) in EViews
- Additional slides
Lead-Lag Relation Obtained with the VECM Model
- If beta_A is significant and beta_U is insignificant:
- the price adjustment mainly depends on ADR markets
- ADR prices converge to UND prices
- UND prices lead ADR prices in the price discovery process
- UND prices provide an information advantage
- If beta_U is significant and beta_A is insignificant:
- the price adjustment mainly depends on UND markets
- UND prices converge to ADR prices
- ADR prices lead UND prices in the price discovery process
- ADR prices provide an information advantage
- If both beta_U and beta_A are significant:
- this suggests a bidirectional error correction
- the equilibrium prices lie within the ADR and UND prices
- both ADR and UND prices converge to the equilibrium prices
- If both beta_U and beta_A are significant, but beta_U is greater than beta_A in absolute value:
- the finding denotes that it is the UND price that makes the greater adjustment in order to reestablish the equilibrium
- that is, most of the price discovery takes place in the ADR market
Homework
- Find the spot and futures prices
- Daily data covering at least 5 years
- Run the cointegration test
- Run the VECM
- Lead-lag relationship
6.11 ARCH Models and Serial Correlation
- We saw in Section 6.9 that a significant DW statistic can arise through a number of misspecifications.
- We will now discuss one other source: the ARCH model suggested by Engle, which has in recent years been found useful in the analysis of speculative prices.
- ARCH stands for "autoregressive conditional heteroskedasticity."
- The high level of persistence in GARCH models: the sum of the two GARCH parameter estimates approximates unity in most cases.
- Li and Lin (2003): this finding provides some support for the notion that GARCH models are handicapped by the inability to account for structural changes during the estimation period and thus suffer from a high-persistence problem in variance settings.
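The persistence being discussed can be seen in the GARCH(1,1) variance recursion itself. A Python sketch with illustrative parameter values (here alpha + beta = 0.95, so a variance shock has a half-life of about 13.5 periods, since ln 0.5 / ln 0.95 ≈ 13.5):

```python
import random

def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.85, seed=0):
    """GARCH(1,1): h_t = omega + alpha * u_{t-1}^2 + beta * h_{t-1},
    u_t = sqrt(h_t) * z_t with z_t ~ iid N(0, 1).
    alpha + beta close to 1 is the 'high persistence' at issue:
    shocks to the conditional variance then decay very slowly."""
    rng = random.Random(seed)
    h = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    u, hs = [], []
    for _ in range(n):
        hs.append(h)
        ut = h ** 0.5 * rng.gauss(0, 1)
        u.append(ut)
        h = omega + alpha * ut ** 2 + beta * h
    return u, hs
```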
- Find the stock returns
- Daily data covering at least 5 years
- Run the GARCH(1,1) model
- Check the sum of the two GARCH parameter estimates
- Parameter estimates
- Graph the time-varying variance estimates
Could We Identify a RW? Low Test Power of the DF Test
- The power of the test?
- H0 is not true, but we accept H0
- The data series is I(0), but we conclude it is I(1)
Several Key Problems for Unit Root Tests
- Low test power
- Structural change problem
- Size distortion
- RW, or non-stationary, or I(1): Yt = Yt-1 + ut
- Stationary process, or I(0):
- Yt = 0.99*Yt-1 + ut, T = 1,000
- Yt = 0.98*Yt-1 + ut, T = 50 or 1,000
Spurious Regression
- RW 1: Yt = 0.05 + Yt-1 + ut
- RW 2: Xt = 0.03 + Xt-1 + vt
new;
format /m1 /rd 9,3;
@ Data Generation Process @
Y = zeros(1000,1);  u = 2*Rndn(1000,1);
X = zeros(1000,1);  v = 1*Rndn(1000,1);
i = 2;
do until i > 1000;
  Y[i,1] = 0.05 + 1*Y[i-1,1] + u[i,1];
  X[i,1] = 0.03 + 1*X[i-1,1] + v[i,1];
  i = i + 1;
endo;
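The same experiment can be sketched in Python with the regression statistics computed as well (mirroring the GAUSS DGP above; with two unrelated drifting random walks, the levels regression typically shows a seemingly significant slope and a DW statistic far below 2):

```python
import random

def spurious_demo(n=1000, seed=0):
    """Generate two independent random walks with drift and regress
    one on the other; returns (R^2, DW) of the levels regression."""
    rng = random.Random(seed)
    y, x = [0.0], [0.0]
    for _ in range(1, n):
        y.append(0.05 + y[-1] + 2.0 * rng.gauss(0, 1))
        x.append(0.03 + x[-1] + 1.0 * rng.gauss(0, 1))
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
        / sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    e = [yi - a - b * xi for xi, yi in zip(x, y)]
    dw = sum((e[t] - e[t - 1]) ** 2 for t in range(1, n)) \
         / sum(ei ** 2 for ei in e)
    r2 = 1.0 - sum(ei ** 2 for ei in e) / sum((yi - my) ** 2 for yi in y)
    return r2, dw

r2, dw = spurious_demo()
# The DW statistic comes out far below 2: the residuals are themselves
# close to a random walk, flagging the levels regression as spurious.
```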