Ka-fu Wong University of Hong Kong

1
Ka-fu Wong, University of Hong Kong
Modeling and Forecasting Trends
2
Background

The unobserved components approach to modeling
and forecasting economic time series assumes that
the typical economic time series, yt, is made up
of the sum of three independent components
a time trend component
a seasonal component
an irregular or cyclical component.
$y_t$ = time trend + seasonal + cyclical = $T_t + S_t + C_t$
The time trend refers to the long-run average
behavior of the series.
The seasonal refers to the annual predictable
cyclical behavior of the series associated with
weather patterns, holiday patterns, etc.
The cyclical component refers to the remainder of
the series after the trend and seasonal have been
accounted for.

3
Background

The assumption that these components are
determined independently means that each
component is determined and influenced by its own
set of forces and, consequently, each component
can be studied separately.
The approach is called an unobserved components
approach because we do not directly observe each
of the three components; we only get to observe
their sum. Our job will be to model and estimate
the various components and use these estimates as
the basis for forecasting the components and
their sum.

4
Background

Whether the assumption underlying the unobserved
components approach, that the trend, seasonal,
and cyclical components are determined
independently, is plausible or not is debatable
and is, in fact, an issue of some controversy
among economists.
For example, many macroeconomists argue that
economic growth (trend) and the business cycle
(cyclical) are determined by a common set of
forces.

5
U.S. Female Labor Force Participation Rate
6
U.S. Male Labor Force Participation Rate
7
Hong Kong labor force participation rates (male
and female)
8
China's per Capita Real GDP
9
Modeling the Trend

If we look at China's per capita real GDP time
series or any one of your time series, the first
thing that stands out is the obvious tendency
of the series to grow (or, in some cases, to
fall) over time.
That is, it is immediately apparent from the time
series plot that the average change in the series
is positive (or, in some cases, negative). This
tendency is the series' trend.
The simplest model of the time trend is the
linear trend model
$T_t = \beta_0 + \beta_1 t, \quad t = 1, \ldots, T$

10
Modeling the Trend

The simplest model of the time trend is the
linear trend model
$T_t = \beta_0 + \beta_1 t, \quad t = 1, \ldots, T$
The trend component is a straight line with
intercept $\beta_0$ and slope $\beta_1$. And, $T_1 = \beta_0 + \beta_1$,
$T_2 = \beta_0 + 2\beta_1$, ..., $T_T = \beta_0 + T\beta_1$.
Note that $\beta_1 = dT_t/dt$ and $\beta_1 = T_t - T_{t-1}$. So,
$\beta_1 > 0$ if y has a positive trend and
$\beta_1 < 0$ if y has a negative trend.
The intercept, as is often the case in
econometric models, does not have a meaningful
interpretation and its sign can be positive or
negative, regardless of the trend's sign.
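
As a quick numerical check of the slope interpretation, the sketch below evaluates a linear trend and confirms that each one-period change equals the slope. The parameter values and the helper name `linear_trend` are illustrative assumptions, not taken from the slides.

```python
def linear_trend(t, beta0, beta1):
    """Linear trend T_t = beta0 + beta1 * t."""
    return beta0 + beta1 * t

# Illustrative (made-up) parameters: positive intercept, negative slope.
beta0, beta1 = 10.0, -0.5
trend = [linear_trend(t, beta0, beta1) for t in range(1, 9)]

# One-period changes T_t - T_{t-1}; every entry should equal beta1.
diffs = [b - a for a, b in zip(trend, trend[1:])]
print(diffs)
```

Here the intercept is positive while the trend is downward, which illustrates that the sign of the intercept says nothing about the sign of the trend.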

11
Graphical view of linear trend
A downward trend
An upward trend
12
Polynomial trend model

In some cases, a linear trend is inadequate to
capture the trend of a time series. A natural
generalization of the linear trend model is the
polynomial trend model
$T_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \cdots + \beta_p t^p$, where p is a
positive integer.
Note that the linear trend model is a special
case of the polynomial trend model (p = 1).
For economic time series we almost never require
p > 2. That is, if the linear trend model is not
adequate, the quadratic trend model will usually
work:
$T_t = \beta_0 + \beta_1 t + \beta_2 t^2$
In the quadratic model, $dT_t/dt = \beta_1 + 2\beta_2 t$.

13
Graphical view of Quadratic Trends
14
Graphical view of Quadratic Trends
15
The Log Linear Trend Model

Another alternative to the linear trend model is
the log linear trend model, which is also called
the exponential trend model
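$T_t = \beta_0 \exp(\beta_1 t)$
or, taking natural logs on both sides,
$\log(T_t) = \log(\beta_0) + \beta_1 t$
so that the log of the trend component is
linear.
Note that for the log linear trend model
$\beta_1 = \log(T_t) - \log(T_{t-1}) \approx$ percentage change in T (expressed as a fraction)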

16
Graphical view of exponential trends
17
Graphical view of exponential trends
18
Which trend model to use?

Knowing the differences among these models can
help us decide whether the linear, quadratic or
log linear trend model is more appropriate for
our data.
In the linear trend model the change in T is
constant over time.
In the quadratic trend model the change in T has
a linear trend.
In the log linear trend model the growth rate
of T is constant over time.
However, in practice, it is not always obvious from
simply looking at the time series plot which form
the trend model should take: linear, log linear,
quadratic, or something else?
Practice and experience are the most helpful.
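
The three distinguishing properties can be checked numerically. The sketch below uses made-up parameter values and a hypothetical helper `diff` to compute one-period changes for each trend shape; it is an illustration, not a model selection procedure.

```python
import math

ts = list(range(1, 11))  # t = 1, ..., 10 (illustrative sample length)

# Made-up parameter values for each trend shape.
linear = [2.0 + 0.5 * t for t in ts]
quadratic = [2.0 + 0.5 * t + 0.1 * t ** 2 for t in ts]
exponential = [2.0 * math.exp(0.05 * t) for t in ts]

def diff(x):
    """One-period changes x_t - x_{t-1}."""
    return [b - a for a, b in zip(x, x[1:])]

d_linear = diff(linear)              # constant change (0.5)
d_quadratic = diff(quadratic)        # change is itself linear in t
dd_quadratic = diff(d_quadratic)     # so its change is constant (2 * 0.1)
g_exponential = diff([math.log(v) for v in exponential])  # constant log change (0.05)

print(d_linear[:3], d_quadratic[:3], g_exponential[:3])
```

In practice the same differences can be computed on the observed series (or its log) and plotted, as a rough aid to choosing among the three shapes.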

19
All Deterministic Trend Models

Note that in all of these models, the trend is
deterministic, i.e., perfectly forecastable. For
instance, in the linear trend model, the forecast
of $T_{T+h}$ made at time T is
$\beta_0 + \beta_1 (T + h) = T_{T+h}$
(Later in the course we will talk about
stochastic trend models, in which the trend of
the series is not perfectly forecastable.)
However, even if we correctly specify the shape
of the trend (linear, quadratic, exponential, ...),
the parameters of the trend model are unknown.
So, in practice, we will have to estimate these
parameters, which will introduce errors (called
sampling or estimation error) into our trend
forecasts.

20
Estimating the Trend Model

Our assumption at this point is that our time
series, yt, can be modeled as
$y_t = T_t(\theta) + \varepsilon_t$
where
$T_t$ is one of the trend models we discussed
earlier,
$\theta$ is the set of parameters, e.g., $\theta = (\beta_0, \beta_1)$ in a
linear trend model,
$\varepsilon_t$ denotes the other factors (i.e., the seasonal
and cyclical components) that determine $y_t$.
Since $\theta$ is unknown, it is natural to estimate the
trend model via the least squares approach:
$\hat{\theta} = \arg\min_{\theta} \sum_{t=1}^{T} (y_t - T_t(\theta))^2$
The objective function is a quadratic loss, and $\hat{\theta}$ is the
choice of $\theta$ that minimizes it.
21
Estimating the Trend Model via the Least Squares
approach

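For linear trend model
$\min_{\beta_0, \beta_1} \sum_{t=1}^{T} (y_t - \beta_0 - \beta_1 t)^2$
can use OLS

For quadratic trend model
$\min_{\beta_0, \beta_1, \beta_2} \sum_{t=1}^{T} (y_t - \beta_0 - \beta_1 t - \beta_2 t^2)^2$
can use OLS

For exponential trend model
$\min_{\beta_0, \beta_1} \sum_{t=1}^{T} (y_t - \beta_0 \exp(\beta_1 t))^2$
Nonlinear, has to be estimated numerically.
or
$\min_{\beta_0, \beta_1} \sum_{t=1}^{T} (\log(y_t) - \log(\beta_0) - \beta_1 t)^2$
can use OLS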
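
A minimal sketch of the two OLS cases, assuming a simple closed-form least squares fit of y on a constant and t. The helper `ols_line` and the synthetic data are assumptions for illustration; any regression package would do the same job.

```python
import math

def ols_line(x, y):
    """Return (intercept, slope) minimizing sum of (y - a - b*x)^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

t = list(range(1, 21))

# Linear trend plus a made-up alternating disturbance of +/- 0.5.
y_lin = [3.0 + 0.8 * ti + (0.5 if ti % 2 else -0.5) for ti in t]
b0_hat, b1_hat = ols_line(t, y_lin)

# Exponential trend: regress log(y) on t, then recover beta0 = exp(intercept).
y_exp = [2.0 * math.exp(0.1 * ti) for ti in t]
a_hat, g_hat = ols_line(t, [math.log(v) for v in y_exp])
beta0_hat, beta1_hat = math.exp(a_hat), g_hat

print(round(b0_hat, 3), round(b1_hat, 3), round(beta0_hat, 3), round(beta1_hat, 3))
```

Note that the log regression requires $y_t > 0$ and minimizes squared errors in logs, so its estimates generally differ from those of the nonlinear least squares problem in levels.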
22
Property of the Ordinary Least Squares Estimators

Under the assumptions of the unobserved
components model, the OLS estimator of the linear
and quadratic trend models is
unbiased,
consistent, and
asymptotically efficient.
Standard regression procedures can be applied to
test hypotheses about the $\theta$'s and construct
interval estimates. This is true even though the
regression errors will generally be serially
correlated and heteroskedastic.

23
Forecasting the Trend

Once we have specified a trend model our forecast
of the h-step ahead trend component of y will
simply be
$T_{T+h}(\theta)$
When $\theta$ is unknown, we can estimate it as
discussed earlier and substitute the estimate
into the function above.

24
Forecasting the Trend
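We would like to forecast $y_{T+h}$ based on all
information available at time T.

Assume that the trend is linear:
$y_{T+h} = \beta_0 + \beta_1 \mathrm{TIME}_{T+h} + \varepsilon_{T+h}$
If we know the true parameters, the part
$\beta_0 + \beta_1 \mathrm{TIME}_{T+h}$ can be forecasted perfectly.
Can we forecast $\varepsilon_{T+h}$? Sometimes YES. Sometimes
NO.
NO when $\varepsilon_t$ is known to be an independent
zero-mean random noise.
If $\varepsilon_t$ is an i.i.d. sequence with zero mean then
$E(\varepsilon_{T+h} \mid \text{information available at time } T) = E(\varepsilon_{T+h}) = E(\varepsilon_t) = 0$.
The first equality uses independence, the second
identical distribution, and the third the zero mean.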
25
Forecasting the Trend
Assume et is known to be an independent zero-mean
random noise.
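Forecast when parameters are known
$y_{T+h,T} = \beta_0 + \beta_1 \mathrm{TIME}_{T+h}$
Emphasize that the forecast is made at time T,
utilizing all information that is available at
time T (usually all past information).
Forecast error
$e_{T+h,T} = y_{T+h} - y_{T+h,T} = \varepsilon_{T+h}$
Fundamental uncertainty! Unavoidable!
Forecast when parameters are unknown
Substitute in the estimates from the OLS
regression:
$\hat{y}_{T+h,T} = \hat{\beta}_0 + \hat{\beta}_1 \mathrm{TIME}_{T+h}$
Forecast error
$\hat{e}_{T+h,T} = y_{T+h} - \hat{y}_{T+h,T} = (\beta_0 - \hat{\beta}_0) + (\beta_1 - \hat{\beta}_1) \mathrm{TIME}_{T+h} + \varepsilon_{T+h}$
The first two terms are due to parameter uncertainty (which increases with h).
Note: $\mathrm{TIME}_{T+h} = T + h$.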
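
The sketch below contrasts the two forecasts for a linear trend. The true parameters, the disturbance sequence, and the helper names `fit_linear_trend` and `trend_forecast` are all made up for illustration; the gap between the two forecasts is the part of the forecast error that comes from estimating the parameters.

```python
def fit_linear_trend(y):
    """OLS of y_t on a constant and t = 1..T; returns (b0_hat, b1_hat)."""
    T = len(y)
    t_bar = (T + 1) / 2
    y_bar = sum(y) / T
    sxy = sum((t - t_bar) * (y_t - y_bar) for t, y_t in enumerate(y, start=1))
    sxx = sum((t - t_bar) ** 2 for t in range(1, T + 1))
    b1 = sxy / sxx
    return y_bar - b1 * t_bar, b1

def trend_forecast(b0, b1, T, h):
    """h-step-ahead trend forecast made at time T, using TIME_{T+h} = T + h."""
    return b0 + b1 * (T + h)

# Hypothetical true parameters and a made-up disturbance sequence.
beta0, beta1 = 1.0, 2.0
eps = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1]
y = [beta0 + beta1 * t + e for t, e in enumerate(eps, start=1)]
T = len(y)

b0_hat, b1_hat = fit_linear_trend(y)
for h in (1, 5, 10):
    known = trend_forecast(beta0, beta1, T, h)        # parameters known
    estimated = trend_forecast(b0_hat, b1_hat, T, h)  # parameters estimated
    print(h, known, round(estimated, 3), round(known - estimated, 3))
```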
26
Density forecast

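Suppose we have no parameter uncertainty. We have
$y_{T+h} = \beta_0 + \beta_1 \mathrm{TIME}_{T+h} + \varepsilon_{T+h}$
and
$y_{T+h,T} = \beta_0 + \beta_1 \mathrm{TIME}_{T+h}$
The forecast error is
$e_{T+h,T} = y_{T+h} - y_{T+h,T} = \varepsilon_{T+h}$
Then the distribution of the forecast error will
simply be the distribution of $\varepsilon_{T+h}$. That is, for
any real number c,
$\mathrm{Prob}(e_{T+h,T} \le c) = \mathrm{Prob}(\varepsilon_{T+h} \le c)$
$E(e_{T+h,T}) = 0$, $\mathrm{Var}(e_{T+h,T}) = \sigma^2$, where $\sigma^2 = \mathrm{var}(\varepsilon_t)$.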
27
Density forecast

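Further assume the $\varepsilon$'s are i.i.d. $N(0, \sigma^2)$, while
continuing to ignore parameter uncertainty. Then
the density forecast will be that
$y_{T+h} \mid \text{information available at time } T \sim N(y_{T+h,T}, \sigma^2)$
Note that this density forecast depends on the
unknown parameter $\sigma^2$. To make the density
forecast operational, we can replace $\sigma^2$ with an
unbiased and consistent estimator, which for the linear trend model is
$\hat{\sigma}^2 = \frac{1}{T-2} \sum_{t=1}^{T} \hat{\varepsilon}_t^2$, where the $\hat{\varepsilon}_t$ are the OLS residuals.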

28
Density forecast

Now consider the case with parameter uncertainty.

Under usual assumptions, the forecast error due
to parameter uncertainty is asymptotically normal.

Thus, $e_{T+h,T}$ will be asymptotically normal.

The unknown variance may be estimated as

Because we assume a linear trend.
30
Density forecast
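Then we act as though $y_{T+h}$ is distributed as
$N(\hat{y}_{T+h,T}, \hat{\sigma}_{T+h,T}^2)$
or, equivalently,
$(y_{T+h} - \hat{y}_{T+h,T}) / \hat{\sigma}_{T+h,T} \sim N(0, 1)$
So, for example,
$\mathrm{Prob}(y_{T+h} \le c) = \mathrm{Prob}(Z \le (c - \hat{y}_{T+h,T}) / \hat{\sigma}_{T+h,T})$

where Z is an N(0,1) random variable, $\hat{y}_{T+h,T}$ is the point forecast, and
$\hat{\sigma}_{T+h,T}$ is the estimated standard deviation of the forecast error.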
31
Density forecast

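Further, we can construct interval forecasts of
$y_{T+h}$ according to
$\hat{y}_{T+h,T} \pm Z_{1-\alpha/2} \, \hat{\sigma}_{T+h,T}$
which is a $(1-\alpha) \times 100\%$ forecast interval for $y_{T+h}$,
where $Z_{1-\alpha/2}$ is the $(1-\alpha/2) \times 100$th percentile
of the N(0,1) distribution and $\hat{\sigma}_{T+h,T}$ is the estimated
standard deviation of the forecast error.

For example, if $\alpha = .05$ then we obtain a
95-percent forecast interval for $y_{T+h}$,
$\hat{y}_{T+h,T} \pm 1.96 \, \hat{\sigma}_{T+h,T}$
since 1.96 is the 97.5th percentile of the N(0,1).

Recall the interpretation of this kind of
interval: 95% of the time, this procedure will
produce an interval that will turn out to include
the actual value of $y_{T+h}$.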
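
The interval and probability calculations can be sketched with the Python standard library's `statistics.NormalDist`. The point forecast and standard deviation below are made-up numbers, and the function names are assumptions for illustration.

```python
from statistics import NormalDist

def interval_forecast(point, sd, alpha=0.05):
    """(1 - alpha) forecast interval under the normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return point - z * sd, point + z * sd

def prob_below(c, point, sd):
    """Prob(y <= c) when y is treated as N(point, sd^2)."""
    return NormalDist(mu=point, sigma=sd).cdf(c)

# Made-up point forecast and forecast-error standard deviation.
lower, upper = interval_forecast(point=100.0, sd=2.0)
print(round(lower, 2), round(upper, 2))
print(round(prob_below(98.0, point=100.0, sd=2.0), 4))
```

The standard deviation passed in should already reflect whichever sources of uncertainty are being accounted for; the code only applies the normal quantile.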

32
Selecting Forecasting Models: R-square as a
criterion

Consider the mean squared error (MSE)
$MSE = \frac{1}{T} \sum_{t=1}^{T} e_t^2$
where T is the sample size and $e_t = y_t - \hat{y}_t$ is the in-sample residual.

Note that the model with the smallest MSE is also the
model with the smallest sum of squared residuals,
because scaling the sum of squared residuals by
a constant (1/T) will not change the ranking.

33
Selecting Forecasting Models: R-square as a
criterion

Consider the R-square ($R^2$)
$R^2 = 1 - \frac{\sum_{t=1}^{T} e_t^2}{\sum_{t=1}^{T} (y_t - \bar{y})^2}$
The denominator depends only on the data, not on the model.

Thus, the model with the largest R-square is also
the model with the smallest MSE, and also the
model with the smallest sum of squared residuals,
because scaling the sum of squared residuals by a
model-independent quantity will not change the
ranking.

34
Selecting Forecasting Models: R-square as a
criterion

The R-square ($R^2$) may be a good measure of
in-sample fit but a bad measure of out-of-sample
fit.

If we add a regressor to the model, we will
always obtain an R-square no smaller than the
one with fewer regressors. That is, a polynomial
trend model with a larger p will almost always
result in a smaller MSE and hence a larger
R-square.

In fact, give me a time series and specify an $R^2$,
and, subject to data availability, I can almost always
produce a trend model that will attain the
specified $R^2$.
This effect is called in-sample overfitting or
data mining.
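
The in-sample overfitting effect can be seen by fitting polynomial trends of increasing degree to the same series. The sketch below is a plain-Python least squares fit via the normal equations; the data are made up, and `polyfit_r2` is a hypothetical helper written for this illustration.

```python
def polyfit_r2(y, p):
    """Fit a degree-p polynomial trend by least squares and return R^2."""
    T = len(y)
    # Rescale time to (0, 1] to keep the normal equations well conditioned.
    x = [(t + 1) / T for t in range(T)]
    k = p + 1
    X = [[xi ** j for j in range(k)] for xi in x]
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(T)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(T))] for r in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution for the coefficient vector b.
    b = [0.0] * k
    for r in reversed(range(k)):
        b[r] = (A[r][k] - sum(A[r][c] * b[c] for c in range(r + 1, k))) / A[r][r]
    fitted = [sum(bj * xi ** j for j, bj in enumerate(b)) for xi in x]
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    y_bar = sum(y) / T
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ssr / sst

# Made-up series that is roughly linear in t.
y = [2.1, 2.9, 4.2, 4.8, 6.3, 6.9, 8.4, 8.8, 10.1, 11.2, 11.9, 13.3]
r2 = [polyfit_r2(y, p) for p in range(1, 5)]
print([round(v, 5) for v in r2])
```

Each extra power of t can only lower the sum of squared residuals, so the printed values do not decrease as p grows, even though the underlying series here is close to linear.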

35
Selecting Forecasting Models: R-square as a
criterion

In short, the MSE is a biased estimator of the
out-of-sample h-step-ahead prediction error
variance, because the forecast error consists of two
parts:
Fundamental uncertainty (unavoidable even if we
know the parameters)
Parameter uncertainty (increases with the number
of parameters in the model)

To reduce the bias associated with MSE and
R-square, we need to penalize for the number of
parameters included in the model (or the degrees
of freedom).

36
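Selecting Forecasting Models: Adjusted R-square as
a criterion

Adjusted R-square
$\bar{R}^2 = 1 - \frac{\frac{1}{T-k} \sum_{t=1}^{T} e_t^2}{\frac{1}{T-1} \sum_{t=1}^{T} (y_t - \bar{y})^2} = 1 - \frac{s^2}{\frac{1}{T-1} \sum_{t=1}^{T} (y_t - \bar{y})^2}$
where k is the number of parameters (so T - k is the degrees of freedom)
and $s^2 = \frac{1}{T-k} \sum_{t=1}^{T} e_t^2$.

Maximizing adjusted R-square is like minimizing
$s^2$.

For a given sum of squared residuals, $s^2$ increases with the number of parameters.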
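
To see where the penalty comes from, it can help to write $s^2$ in terms of the MSE. The short derivation below assumes the usual definitions, with k the number of estimated parameters and $e_t$ the in-sample residuals.

```latex
s^2 = \frac{1}{T-k} \sum_{t=1}^{T} e_t^2
    = \frac{T}{T-k} \cdot \frac{1}{T} \sum_{t=1}^{T} e_t^2
    = \frac{T}{T-k} \, \mathrm{MSE}
```

The factor $T/(T-k)$ exceeds 1 and rises with k, so an extra parameter lowers $s^2$ only if it reduces the MSE by enough to offset the larger factor.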
37
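Selecting Forecasting Models: Criteria that
penalize the number of model parameters

Akaike information criterion (AIC)
$AIC = \exp(2k/T) \cdot \frac{1}{T} \sum_{t=1}^{T} e_t^2$

Schwarz information criterion (SIC)
$SIC = T^{k/T} \cdot \frac{1}{T} \sum_{t=1}^{T} e_t^2$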

38
The variation of criteria with k/T
39
Use the consistent model selection criteria

A model selection criterion is consistent if the
following conditions are met
When the true model, i.e., the data-generating
process (DGP), is among the models considered,
the probability of selecting the true DGP
approaches 1 as the sample size gets large.
When the true model is not among those
considered, so that it is impossible to select
the true DGP, the probability of selecting the
best approximation to the true DGP approaches 1
as the sample size gets large.
SIC is consistent but AIC is not.

40
Use the asymptotically efficient model selection
criteria

An asymptotically efficient model selection
criterion chooses a sequence of models, as the
sample size gets large, whose 1-step-ahead
forecast error variances approach the one that
would be obtained using the true model with known
parameters at a rate at least as fast as that of
any other model selection criterion.
AIC is asymptotically efficient but SIC is not.

41
AIC or SIC

Usually AIC and SIC suggest the same model.
When AIC and SIC suggest different models, we
usually choose the model selected by SIC because
the SIC often suggests a more parsimonious model
(i.e., a model with fewer parameters).

42
AIC and SIC reported across software packages

$\ln(AIC) = \ln(MSE) + 2k/T$
$\ln(SIC) = \ln(MSE) + k \ln(T)/T$
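
A minimal sketch of both criteria, assuming the level forms implied by the log expressions above, namely AIC = exp(2k/T) times MSE and SIC = T^(k/T) times MSE. The residuals are made up, and the function names are illustrative.

```python
import math

def mse(residuals):
    """Mean squared in-sample residual."""
    return sum(e * e for e in residuals) / len(residuals)

def aic(residuals, k):
    """AIC in levels: exp(2k/T) * MSE, with k estimated parameters."""
    T = len(residuals)
    return math.exp(2 * k / T) * mse(residuals)

def sic(residuals, k):
    """SIC in levels: T^(k/T) * MSE, with k estimated parameters."""
    T = len(residuals)
    return T ** (k / T) * mse(residuals)

# Made-up residuals from a hypothetical linear trend fit (k = 2).
resid = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2, 0.4, -0.1, -0.5]
k, T = 2, len(resid)

print(round(math.log(aic(resid, k)), 4), round(math.log(sic(resid, k)), 4))
```

Because ln(T) exceeds 2 once T is at least 8, the SIC penalty per parameter is the heavier one in most samples, which is why SIC tends to pick the more parsimonious model. Software packages differ in which form they report, so only compare values computed the same way.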

43
Out-of-sample fitting

The AIC and SIC are in-sample fit criteria,
although they account for the costs of
overfitting through the inclusion of a penalty
term.
What we are really interested in is the question:
Having fit the model over the sample period, how
well does it forecast outside of that sample?
The in-sample fit criteria that we discussed do
not directly answer this question.

44
Out-of-sample fitting

Suppose we have a data sample $y_1, \ldots, y_T$.
Break it up into two parts (where $n \ll T$):
$y_1, \ldots, y_{T-n}$ (first T-n observations)
$y_{T-n+1}, \ldots, y_T$ (last n observations)

[Timeline figure: observations 1 to T-n are used to estimate the model; observations T-n+1 to T (n observations) are saved for checking the out-of-sample fit.]
45
Out-of-sample fitting

Break it up into two parts (where $n \ll T$):
$y_1, \ldots, y_{T-n}$ (first T-n observations)
$y_{T-n+1}, \ldots, y_T$ (last n observations)
Fit the shortened sample, $y_1, \ldots, y_{T-n}$, to various
trend models that seem like plausible choices
based on time series plots and in-sample fit
criteria (linear, quadratic, log linear, the one selected
by AIC/SIC, ...).
For each estimated trend model, forecast
$y_{T-n+1}, \ldots, y_T$ and compute the forecast errors
$e_1, \ldots, e_n$.
Compare the errors across the various models using
time series plots (of the forecasts and actual
values of $y_{T-n+1}, \ldots, y_T$, and of the forecast errors)
tables of the forecasts, actuals, and errors
mean squared prediction errors (MSPE)
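
The holdout comparison can be sketched as follows for two candidate models, a linear trend and a log linear trend estimated by OLS on logs. The data are synthetic (an exact exponential trend), and `fit_line` and `holdout_mspe` are hypothetical helpers written for this illustration.

```python
import math

def fit_line(z):
    """OLS of z_t on a constant and t = 1..len(z); returns (intercept, slope)."""
    m = len(z)
    t_bar = (m + 1) / 2
    z_bar = sum(z) / m
    sxy = sum((t - t_bar) * (v - z_bar) for t, v in enumerate(z, start=1))
    sxx = sum((t - t_bar) ** 2 for t in range(1, m + 1))
    slope = sxy / sxx
    return z_bar - slope * t_bar, slope

def holdout_mspe(y, n, log_linear=False):
    """Fit on the first T-n observations, forecast the last n, return the MSPE."""
    train, test = y[:-n], y[-n:]
    z = [math.log(v) for v in train] if log_linear else train
    b0, b1 = fit_line(z)
    m = len(train)
    forecasts = []
    for h in range(1, n + 1):
        f = b0 + b1 * (m + h)
        # For the log linear model, map the log forecast back to levels.
        forecasts.append(math.exp(f) if log_linear else f)
    errors = [actual - f for actual, f in zip(test, forecasts)]
    return sum(e * e for e in errors) / n

# Synthetic series with an exact exponential trend (T = 30, n = 5).
y = [5.0 * math.exp(0.1 * t) for t in range(1, 31)]
mspe_linear = holdout_mspe(y, n=5)
mspe_loglinear = holdout_mspe(y, n=5, log_linear=True)
print(mspe_linear, mspe_loglinear)
```

With real data neither model will fit exactly, and the ranking can depend on the choice of n, so it is worth looking at the plots and tables of forecast errors as well as the MSPE.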

46
Out-of-sample fitting

The advantage of this approach is that we are
actually comparing the trend models in terms of
their out-of-sample forecasting performance.
A disadvantage is that the comparison is based on
models fit over T-n observations rather than the
T observations we have available. (Note that if
you do use this approach and, for example, settle
on the quadratic model, then when you proceed to
construct your forecasts for T+1, ..., you should
use the quadratic model fit to the full T
observations in your sample.)
Will the fact that, for example, the quadratic
trend model outperformed other models in
forecasting out of sample based on the short
sample mean that it will perform best in
forecasting beyond the full sample? No.

47
End
