
Transcript and Presenter's Notes

Title: Univariate Time series - 2


1
Univariate Time series - 2
  • Methods of Economic Investigation
  • Lecture 19

2
Last Time
  • Concepts that are useful
  • Stationarity
  • Ergodicity
  • Ergodic Theorem
  • Autocovariance Generating Function
  • Lag Operators
  • Time Series Processes
  • AR(p)
  • MA(q)

3
Today's Class
  • Building up to estimation
  • Wold Decomposition
  • Estimating with exogenous, serially correlated
    errors
  • Testing for Lag Length

4
Refresher
  • Stationarity: some persistence, but not too much
  • Ergodicity: persistence dies out entirely over
    some finite period of time
  • Square summability (assumption for MA process):
    the parameters θ_j satisfy Σ_{j=0}^∞ θ_j² < ∞
  • Invertibility (assumption for AR process): the
    lag polynomial φ(L) = Π_j (1 − λ_j L) has roots
    λ_j with |λ_j| < 1, i.e. the roots of φ(z) = 0
    lie outside the unit circle (both conditions are
    checked numerically in the sketch below)
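The two technical conditions are easy to check numerically. A minimal sketch, assuming NumPy, with made-up coefficients that are not from the lecture:

    import numpy as np

    # Square summability: the MA weights theta_j must satisfy sum(theta_j^2) < inf.
    # Geometrically decaying weights theta_j = 0.9**j (illustrative) converge.
    theta = 0.9 ** np.arange(1000)
    print("sum of squared MA weights:", np.sum(theta ** 2))  # ~5.26 = 1/(1 - 0.81)

    # Invertibility of a(L) = 1 - 0.5L - 0.3L^2 (illustrative coefficients):
    # all roots of a(z) = 0 must lie outside the unit circle.
    roots = np.roots([-0.3, -0.5, 1.0])  # coefficients of -0.3z^2 - 0.5z + 1
    print("roots:", roots)
    print("invertible:", np.all(np.abs(roots) > 1))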

5
ARMA process
  • In general we can have a process with both AR
    and MA components
  • A general ARMA(p, q) process in our lag function
    notation looks like a(L)x_t = b(L)ε_t
  • For example, we may have an ARMA(2, 1)
    (simulated in the sketch below)
  • x_t = φ_1 x_{t-1} + φ_2 x_{t-2} + ε_t + θ_1 ε_{t-1}
  • (1 − φ_1 L − φ_2 L²) x_t = (1 + θ_1 L) ε_t
  • If the process is invertible then we can rewrite
    this as x_t = a(L)⁻¹ b(L) ε_t
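As a concrete illustration, a minimal sketch, assuming NumPy, that simulates the ARMA(2, 1) written above; the parameter values are illustrative, not from the lecture:

    import numpy as np

    rng = np.random.default_rng(0)
    phi1, phi2, theta1 = 0.5, 0.3, 0.4  # illustrative, stationary values
    T = 500
    eps = rng.standard_normal(T)        # white-noise shocks epsilon_t
    x = np.zeros(T)
    for t in range(2, T):
        # x_t = phi1*x_{t-1} + phi2*x_{t-2} + eps_t + theta1*eps_{t-1}
        x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + eps[t] + theta1 * eps[t - 1]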

6
Why Focus on ARMA processes
  • They define the range of ARMA processes
    (invertible AR lag polynomials, square summable
    MA lag polynomials) for which we can rely on
    convergence theorems
  • Any time series that is covariance stationary
    has a linear ARMA representation

7
Information Sets
  • At time t−n:
  • Everything for time t−n and before is known
  • Everything at time t is unknown
  • Information set: Ω_{t-n}
  • Define E_{t-n}(ε_t) = E[ε_t | Ω_{t-n}]
  • Distinct from E(ε_t), because we know previous
    values of ε_s up until t−n
  • For example, suppose n = 1 and ε_t = ρ ε_{t-1} + ν_t
  • E(ε_t) = 0 for all t, so it is a mean zero process
  • E_{t-1}(ε_t) = ρ ε_{t-1} (illustrated in the
    sketch below)
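The gap between the unconditional and conditional expectations shows up clearly in simulation. A minimal sketch, assuming NumPy and an illustrative ρ: the sample mean of ε_t is near zero, while the regression of ε_t on ε_{t-1} recovers ρ, the conditional forecast slope.

    import numpy as np

    rng = np.random.default_rng(1)
    rho, T = 0.8, 100_000              # illustrative persistence and sample size
    nu = rng.standard_normal(T)
    eps = np.zeros(T)
    for t in range(1, T):
        eps[t] = rho * eps[t - 1] + nu[t]   # eps_t = rho*eps_{t-1} + nu_t

    print("E(eps_t)      ~", eps.mean())                            # near 0
    print("slope on lag  ~", np.polyfit(eps[:-1], eps[1:], 1)[0])   # near rho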

8
Recalling the CEF
  • Define the linear conditional expectation
    function CEF(a | b), which is the linear
    projection, i.e. the fitted values of a
    regression of a on b: â = βb (see the sketch
    below)
  • This is distinct from the general expectations
    operator in that it is imposing a linear form of
    the conditional expectation function.
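In code, the linear CEF is literally the fitted values of an OLS regression. A minimal sketch assuming NumPy; the helper name linear_cef is ours, purely illustrative:

    import numpy as np

    def linear_cef(a, b):
        # Linear CEF(a | b): fitted values from regressing a on b
        B = np.column_stack([np.ones_like(b), b])     # include a constant
        beta, *_ = np.linalg.lstsq(B, a, rcond=None)  # OLS coefficients
        return B @ beta                               # linear projection of a on b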

9
Wold Decomposition Theorem - 1
  • Formally the Wold Decomposition Theorem says
    that
  • Any mean zero weakly stationary process x_t can
    be represented in the form
    x_t = Σ_{j=0}^∞ θ_j ε_{t-j} + η_t, with θ_0 = 1
  • This comes with some properties for each term

10
Wold Decomposition Theorem - 2
  • Where
  • ε_t = x_t − CEF(x_t | x_{t-1}, x_{t-2}, . . . , x_0)
  • Properties of ε_t:
  • CEF(ε_t | x_{t-1}, x_{t-2}, . . . , x_0) = 0,
    E(ε_t x_{t-j}) = 0, E(ε_t) = 0,
  • E(ε_t²) = σ² for all t, and E(ε_t ε_s) = 0 for
    all t ≠ s
  • The MA polynomial is invertible
  • The parameters θ_j are square summable:
    Σ_j θ_j² < ∞
  • θ_j and ε_s are unique
  • η_t is linearly deterministic,
  • i.e. η_t = CEF(η_t | x_{t-1}, . . .)
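One practical way to approximate the Wold MA(∞) weights is to fit a long autoregression by OLS and read off its impulse response. A minimal sketch assuming NumPy; the function name and the defaults for p and horizon are ours, purely illustrative:

    import numpy as np

    def wold_ma_weights(x, p=20, horizon=10):
        # Approximate the Wold theta_j by fitting AR(p) via OLS and
        # computing its impulse response (theta_0 = 1 by construction).
        x = np.asarray(x, dtype=float)
        x = x - x.mean()              # the theorem is for mean zero processes
        T = len(x)
        # OLS of x_t on (x_{t-1}, ..., x_{t-p})
        X = np.column_stack([x[p - j - 1:T - j - 1] for j in range(p)])
        phi, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
        # Recursion: theta_j = sum_{k=1}^{min(j,p)} phi_k * theta_{j-k}
        theta = np.zeros(horizon)
        theta[0] = 1.0
        for j in range(1, horizon):
            theta[j] = sum(phi[k] * theta[j - 1 - k] for k in range(min(j, p)))
        return theta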

11
A note on the Wold Decomposition
  • Most of these properties come directly from our
    assumption that the process is weakly stationary
  • While the theorem is stated for a mean zero
    process, remember we can de-mean our data, so
    most processes can be represented in this format

12
Uses of Wold Form
  • This theorem is extremely useful because it
    returns time-series processes back to our
    standard OLS model
  • Notice that we've relaxed some of the conditions
    for the Gauss-Markov theorem to hold
  • The Wold MA(∞) representation is unique:
  • if two time series have the same Wold
    representation, then they are the same time
    series
  • This is true only up to second moments in linear
    forecasting

13
Emphasis on Linearity
  • Although CEF(ε_t | x_{t-j}) = 0, we can have
    E(ε_t | x_{t-j}) ≠ 0 with nonlinear projections
  • If the true x_t is not generated by linear
    combinations of past x_t plus a shock, then the
    Wold shocks (ε_s) will be different from the true
    shocks
  • The uniqueness result only states that the Wold
    representation is the unique linear
    representation where the shocks are linear
    forecast errors.

14
Estimating with Serially Correlated Errors
  • Suppose that we have Y_t = βX_t + ε_t
  • E[ε_t | x_t] = 0, E[ε_t² | x_t] = σ²
  • E[ε_t ε_{t-k}] = γ_k for k ≠ 0, and so define
    E[εε′] = σ²Γ
  • We could consistently estimate β, but our
    standard errors would be incorrect, making it
    difficult to do inference
  • This is just like the heteroskedasticity problem
    we have already seen with random effects
  • Use feasible GLS: estimate the weights, then
    re-estimate by OLS to obtain efficient standard
    errors (see the sketch below)
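A minimal sketch of the feasible-GLS idea for the AR(1)-error case, assuming NumPy. This is the Cochrane-Orcutt variant: estimate β by OLS, estimate ρ from the residuals, quasi-difference, and re-run OLS. The function name and the AR(1) restriction are our illustrative choices:

    import numpy as np

    def fgls_ar1(y, x):
        # Step 1: OLS to get residuals
        X = np.column_stack([np.ones_like(x), x])
        b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
        e = y - X @ b_ols
        # Step 2: estimate the serial correlation rho from e_t on e_{t-1}
        rho = (e[:-1] @ e[1:]) / (e[:-1] @ e[:-1])
        # Step 3: quasi-difference to whiten the errors, then OLS again
        y_star = y[1:] - rho * y[:-1]
        X_star = X[1:] - rho * X[:-1]
        b_fgls, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        return b_fgls, rho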

15
Endogenous Lagged Regressors
  • May be the case that either the dependent
    variable or the regressor should enter the
    estimating equation in lag values too
  • Suppose we were estimating
  • Y_t = β_0 X_t + β_1 X_{t-1} + . . . + β_k X_{t-k} + ε_t
  • We think that these X's are correlated with Y up
    to some lag length k
  • We think these X's are correlated with each other
    (e.g. some underlying AR process)
  • but we're not sure how many lags to include

16
Naive Test
  • Include lags going very far back, r >> k
  • Test the longest lag coefficient, β_r = 0, and
    see if it is significant. If not, drop it and
    keep going. (A sketch follows this list.)
  • Problems:
  • Practically, the longer lags you take, the more
    data you make unusable, because it doesn't have
    enough time periods to construct the lags
  • It doesn't allow us to include lag t−6 but
    exclude lag t−3
  • The theoretical issue is that we will reject the
    null 5 percent of the time even if it's true (or
    whatever the significance level of the test is)
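A minimal sketch of this naive general-to-specific procedure, assuming NumPy; the starting length r and the 1.96 critical value are illustrative, and the routine inherits exactly the problems listed above:

    import numpy as np

    def naive_lag_test(y, x, r, crit=1.96):
        # Start with r lags; t-test the longest lag, drop it if
        # insignificant, and repeat until a lag survives.
        T = len(y)
        for j in range(r, 0, -1):
            # Common sample: drop the first r observations in every fit
            X = np.column_stack([np.ones(T - r)] +
                                [x[r - k:T - k] for k in range(j + 1)])
            b, *_ = np.linalg.lstsq(X, y[r:], rcond=None)
            e = y[r:] - X @ b
            s2 = e @ e / (len(e) - X.shape[1])
            se = np.sqrt(s2 * np.linalg.inv(X.T @ X).diagonal())
            if abs(b[-1] / se[-1]) > crit:  # longest remaining lag significant
                return j
        return 0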

17
More sophisticated testing
  • We can be a bit more sophisticated by comparing
    restricted and unrestricted models
  • Define pmax as some long lag length greater than
    the expected relevant lag length
  • In general we do not test our pmax, but as
    before, as p → pmax the usable sample size
    decreases
  • Define e_j as the residual from
    Y_t = β_0 X_t + β_1 X_{t-1} + . . . + β_j X_{t-j},
    and let N be the sample size
  • We therefore could imagine choosing j to minimize
    a penalized sum of squared residuals of the form
    log(e_j′e_j / N) + j · c(N) / N

18
Cost Functions
  • Intuition: c( · ) is a penalty for adding
    additional parameters
  • We thus try to pick the best specification, using
    that cost function to penalize inclusion of extra
    but irrelevant lags (see the sketch after this
    list)
  • Akaike (AIC): c(N) = 2
  • The AIC criterion is not well-founded in theory
    and will be biased in finite samples
  • The bias will tend to overstate the true lag
    length
  • Bayesian (BIC): c(N) = log(N)
  • The BIC will converge to the true p
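Putting the last two slides together, a minimal sketch of penalized lag selection, assuming NumPy; the data, pmax, and helper name are illustrative:

    import numpy as np

    def select_lag(y, x, pmax, criterion="bic"):
        # Pick j minimizing log(SSR_j/N) + (j+1)*c(N)/N on a common sample.
        T = len(y)
        N = T - pmax                                   # common usable sample
        c = np.log(N) if criterion == "bic" else 2.0   # BIC vs AIC penalty
        best_j, best_ic = 0, np.inf
        for j in range(pmax + 1):
            X = np.column_stack([np.ones(N)] +
                                [x[pmax - k:T - k] for k in range(j + 1)])
            b, *_ = np.linalg.lstsq(X, y[pmax:], rcond=None)
            ssr = np.sum((y[pmax:] - X @ b) ** 2)
            ic = np.log(ssr / N) + (j + 1) * c / N
            if ic < best_ic:
                best_j, best_ic = j, ic
        return best_j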

19
Return to Likelihood Ratio Tests
  • The minimization problem is just a likelihood
    ratio test
  • To see this, compare lag length j to lag length
    k > j. Scaling the difference in criteria by N,
    we can write
    N · log(e_j′e_j / e_k′e_k) + (j − k) · c(N)
  • The first term is the LR test statistic; the
    second is a constant (for given j and k) whose
    weight in the criterion, c(N)/N, is declining
    in N
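A minimal sketch of this LR comparison, assuming NumPy and SciPy; the helper takes the two sums of squared residuals, computed on a common sample, for the restricted (j-lag) and unrestricted (k-lag) models:

    import numpy as np
    from scipy import stats

    def lr_lag_test(ssr_j, ssr_k, N, df):
        # LR test of j lags (restricted) vs k > j lags (unrestricted), df = k - j
        lr = N * np.log(ssr_j / ssr_k)  # N * log-ratio of residual sums of squares
        pval = stats.chi2.sf(lr, df)    # tail probability of chi-squared(df)
        return lr, pval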
20
Next Time
  • Multivariate Time Series
  • Testing for Unit Roots
  • Cointegration
  • Returning to Causal Effects
  • Impulse Response Functions
  • Forecasting