Title: Chapter 4: Simple or Bivariate Regression
1. Chapter 4: Simple or Bivariate Regression
- Terms
- Dependent variable (LHS)
- the series we are trying to estimate
- Independent variable (RHS)
- the data we are using to estimate the LHS
2. The line and the regression line
- Y = f(X): there is assumed to be a relationship between X and Y.
- Y = mX + b
- Because the line we are looking for is an estimate of the population, and not every observation falls on the estimate of the line, we have error (e).
- Y = b0 + b1X1 + e
3. What is b?
- b0 represents the intercept term.
- b1 represents the slope of the estimated regression line.
- This term (b1) can be interpreted as the rate of change in Y per unit change in X, just like a simple line equation.
4. Population vs. Sample
- Population (we don't often have this data) vs. Sample (we usually have this)
- e (a.k.a. error, or the residuals) = Y - Y(hat)
5. Residuals, another way
- Residuals can also be constructed by solving for e in the regression equation.
- e = Y - (b0 + b1X)
6. The goal of Ordinary Least-Squares Regression (the type we are going to use)
- Minimize the sum of squared residuals.
- We could calculate the regression line and the residuals by hand... but we ain't gonna. (A sketch of the arithmetic follows.)
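Here is a minimal sketch of that by-hand calculation in Python; the data are made up purely for illustration:

```python
# OLS "by hand": pick b0 and b1 to minimize the sum of squared residuals.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # made-up data
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form solution for the bivariate model Y = b0 + b1*X + e
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

residuals = Y - (b0 + b1 * X)        # e = Y - (b0 + b1*X), as above
sse = np.sum(residuals ** 2)         # the quantity OLS minimizes

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, SSE = {sse:.3f}")
```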
7. First step, ALWAYS: look at your data
- Plot it against time, or
- Plot it against your dependent variable.
- Why? ...because dissimilar data can potentially generate very similar summary statistics; pictures help discern the differences.
8. Dissimilar data with similar stats
- The Xs have the same mean and st. dev.
- The Ys have the same mean and st. dev.
- From this we might conclude that the data sets are identical, but we'd be wrong.
9. What do they look like?
Although they result in the same OLS regression, they are very different.
10. Forecasting a Simple Linear Trend
- Disposable Personal Income (DPI)
- It's sometimes reasonable to make a forecast on the basis of just a linear trend, where Y is assumed to be a function of time (T).
- The regression looks like the following: Y(hat) = b0 + b1(T)
- Where Y(hat) is the series you want to estimate. In this case, it's DPI.
11. DPI
12. To forecast with simple OLS in ForecastX
- You need to construct an index of T.
- For this data set, there are 144 months; the index goes from 1 to 144.
(Figure: the time index T alongside the data.)
13. Forecast of DPI
14. Some of the Output
15. To forecast, we just need the index for the month (T)
- Jan 1993: DPI_1 = 4588.58 + 27.93(1) = 4616.51
- Feb 1993: DPI_2 = 4588.58 + 27.93(2) = 4644.44
- ...
- Dec 2004: DPI_144 = 4588.58 + 27.93(144) = 8610.50
- Jan 2005: DPI_145 = 4588.58 + 27.93(145) = 8638.43
- And so on. (The arithmetic is sketched below.)
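A sketch of that arithmetic, using the estimated coefficients reported above (b0 = 4588.58, b1 = 27.93):

```python
# The fitted trend line: DPI_hat = b0 + b1 * T, where T = 1..144 covers
# Jan 1993 - Dec 2004 and T > 144 extends the forecast.
b0, b1 = 4588.58, 27.93

def dpi_forecast(t: int) -> float:
    return b0 + b1 * t

print(dpi_forecast(1))    # Jan 1993 -> 4616.51
print(dpi_forecast(144))  # Dec 2004 -> 8610.50
print(dpi_forecast(145))  # one step beyond the sample -> 8638.43
```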
16. Output
Hypothesis test for slope = 0 and intercept = 0. What does it say?
17. Do we reject that the slope and intercept are each equal to 0?!
(Figure: a t-distribution centered at 0, with "Reject H0" regions beyond the critical values of -2.045 and +2.045 and "Do Not Reject H0" between them; the computed t-stats of 138.95 and 297.80 fall far into the rejection region.)
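A minimal sketch of the decision rule in the figure, assuming the slide's t-stats (138.95 and 297.80) and the df = 29 implied by the 2.045 cutoff:

```python
# Reject H0 (coefficient = 0) when |t| exceeds the two-tailed critical value.
from scipy.stats import t as t_dist

t_crit = t_dist.ppf(0.975, df=29)        # ~2.045, the cutoff in the figure
for t_val in (138.95, 297.80):           # t-stats reported on the slide
    decision = "reject H0" if abs(t_val) > t_crit else "do not reject H0"
    print(f"t = {t_val} vs +/-{t_crit:.3f} -> {decision}")
```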
18. Just to note
- In the previous model, the only thing we are using to predict DPI is the progression of time.
- There are many more things that have the potential of increasing or decreasing DPI.
- We don't account for anything else... yet.
19. The benefits of regression
- The true benefit of regression models is their ability to examine cause and effect.
- In trend models (everything we've seen until now), we are depending on observed patterns of past values to predict future values.
- In a causal model, we are hypothesizing a relationship between the dependent variable (the variable we are interested in predicting) and one or more independent variables (the data we use to predict).
20. Back to Jewelry
- There are many things that might influence the total monthly sales of jewelry, things like:
- - Weddings
- - Anniversaries
- - Advertising expenditures, and
- - DPI
- Since this is bivariate regression, for now we will focus on DPI as the sole independent variable used to predict jewelry sales.
21. Let's look at the jewelry sales data plotted against DPI
(Figure: scatter of jewelry sales vs. DPI, with the December observations labeled "Christmas" sitting well above the other months.)
The big differences in sales during the December months will make it hard to estimate with a bivariate regression. We will use both the unadjusted and the seasonally adjusted series to see the difference in model accuracy.
22. Jewelry Example
- Our dependent (Y) variable is monthly jewelry sales:
- unadjusted in the first example
- seasonally adjusted in the second example
- Our only independent variable (X) is DPI, so the models we are going to estimate are:
- JS = b0 + b1(DPI) + e
- SAJS = b0 + b1(DPI) + e
- (A sketch of estimating the first of these follows.)
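A minimal sketch of estimating the model in Python with statsmodels, standing in for the ForecastX steps the slides use; the file and column names are hypothetical, so substitute your own jewelry-sales and DPI series:

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("jewelry.csv")      # hypothetical file with JS and DPI columns
X = sm.add_constant(df["DPI"])       # adds the intercept term b0
model = sm.OLS(df["JS"], X).fit()    # estimates JS = b0 + b1*DPI + e

print(model.params)                  # b0 and b1
print(model.rsquared)                # fraction of variance explained
```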
23. Data for unadjusted Jewelry Sales
24. Output for unadjusted JS
25. Output for unadjusted JS
26. Data for adjusted Jewelry Sales
27. Output with adjusted JS
28. Output with adjusted JS
29. Things to consider with ANY regression
- Do the signs on the b's make sense?
- Your expectation should have SOME logical basis.
- If the sign is not what is expected, your regression may be:
- Underspecified: move on to multiple regression.
- Misspecified: consider other RHS variables that might provide a better measure.
30. Consider the Jewelry Example
- Do we get the right sign? I.e., what's the relationship between DPI and sales?
- What is a normal good?
- What kind of good is jewelry, normal or inferior?
- What would be the expected sign if we were looking at a good we thought was an inferior good?
31. Things to consider with ANY regression
- If you DO get the expected signs, are the effects statistically significant?
- Do the t-stats indicate a strong relationship?
- Can you reject the null that the relationship (slope) is 0?
32. Things to consider with ANY regression
- Are the effects economically significant?
- Even with statistically significant results, a very small slope indicates that a very large change in the RHS variable is necessary to get any change in the LHS.
- There is no hard and fast rule here. It requires judgment.
33. Consider the Jewelry Example
- In the jewelry example, it takes a 250 million (or 0.25 billion) dollar increase in DPI to increase (adjusted) jewelry sales by 1 million. Is this a lot or a little slope?
- Let's think of it a little differently:
- This would be roughly a $1 increase in (adjusted) jewelry sales with a $250 increase in personal disposable income.
- Does this pass the sniff test?
34. Things to consider with ANY regression
- Does the regression explain much?
- In linear regressions, the fraction of the variance in the dependent variable explained by the independent variable is measured by the R-squared (a.k.a. the coefficient of determination).
- Trend R-sq = .9933
- Causal (w/ season) R-sq = .0845
- Causal (w/o season) R-sq = .8641
- Although the causal model explains less of the variance, we now have some evidence that sales are related to DPI.
35. Another thing to consider about the first model with seasonality in it
- The first model was poorly specified when we were using the series with seasonality in it.
- The de-seasonalized data provide a better fit in the simple regression. Why?
- Well, income is obviously related to sales, but so is the month of the year (e.g., Dec.), so we need to adjust or account for it:
- Adjust for seasonality (use a more appropriate RHS var), or
- Account for it in the model (move to multi-variable regression and include the season in the regression; to be covered next chapter).
36. Question
- Why would we want to forecast jewelry sales based on a series like DPI?
- DPI is very close to a linear trend; we have a good idea what it might look like several periods from now.
37. Other examples of simple regression models: cross section (all in the same time)
- Car mileage as a function of engine size
- What do we expect this relationship to be on average?
- Body weight as a function of height
- What do we expect this relationship to be on average?
- Income as a function of educational attainment
- What do we expect this relationship to be on average?
38. Assumptions of the OLS regression
- One assumption of the OLS model is that the error terms DON'T have any regular patterns. First off, this means:
- Errors are independently distributed
- And, they are normally distributed
- They have a mean of 0
- They have a constant variance
39. Errors are independently distributed
- Errors might not be independently distributed if we have serial correlation (or autocorrelation).
- Serial correlation occurs when one period's error is related to another period's error.
- You can have both positive and negative serial correlation.
40. Negative Serial Correlation
Negative serial correlation occurs when positive errors are followed by negative errors (or vice versa).
(Figure: Y vs. X scatter with residuals alternating sign around the regression line.)
41. Positive Serial Correlation
Positive serial correlation occurs when positive errors tend to be followed by positive errors.
(Figure: Y vs. X scatter with runs of same-signed residuals around the regression line.)
42. What does Serial Correlation Cause?
- The estimates for b are unbiased, but the errors are underestimated; this means our t-stats are overstated.
- If our t-stats are overstated, then it's possible we THINK we have a significant effect for b when we really don't.
- Additionally, the R-squared and F-stat are both unreliable.
43. Durbin-Watson Statistic
- The Durbin-Watson statistic is used to test for the existence of serial correlation.
- DW = Σ(e_t - e_(t-1))² / Σ e_t², i.e., the sum of squared changes in the residuals divided by the sum of squared errors.
- The Durbin-Watson statistic ranges from 0 to 4.
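A sketch of that computation from a vector of residuals (statsmodels also ships this as statsmodels.stats.stattools.durbin_watson):

```python
import numpy as np

def durbin_watson(e: np.ndarray) -> float:
    # DW = sum of squared changes in the residuals / sum of squared errors
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

e = np.array([0.5, 0.6, 0.4, -0.2, -0.5, -0.3, 0.1, 0.4])  # made-up residuals
print(durbin_watson(e))   # values near 2 suggest no serial correlation
```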
44. Evaluation of the DW Statistic
- The rule of thumb: if it's near 2 (i.e., from 1.5 to 2.5), there is no evidence of serial correlation present.
- For a more precise evaluation you have to calculate and compare 5 inequalities and determine which of the 5 is true.
45. # of RHS vars
(Table: lower and upper critical values of the DW statistic, DWL and DWU, by number of observations and number of RHS variables.)
46. Evaluation of the DW Statistic
- Evaluate the following and choose the TRUE region:
- A: 4 >= DW > (4 - DWL) -> negative serial correlation
- B: (4 - DWL) >= DW > (4 - DWU) -> indeterminate
- C: (4 - DWU) >= DW > DWU -> no observed serial correlation
- D: DWU >= DW > DWL -> indeterminate
- E: DWL >= DW > 0 -> positive serial correlation
47. For Example
- Suppose we get a DW of 0.21 with 36 observations.
- From the table: DWL = 1.41, DWU = 1.52.
- The rest is just filling in and evaluating:
- A: 4 >= 0.21 > (4 - 1.41)? False
- B: (4 - 1.41) >= 0.21 > (4 - 1.52)? False
- C: (4 - 1.52) >= 0.21 > 1.52? False
- D: 1.52 >= 0.21 > 1.41? False
- E: 1.41 >= 0.21 > 0? True -> positive serial correlation
- (This evaluation is sketched in code below.)
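The same evaluation as a sketch in Python, walking regions A through E from the previous slide:

```python
# DW = 0.21 with n = 36 and one RHS variable: DW_L = 1.41, DW_U = 1.52.
dw, dwl, dwu = 0.21, 1.41, 1.52

if 4 >= dw > 4 - dwl:
    print("A: negative serial correlation")
elif 4 - dwl >= dw > 4 - dwu:
    print("B: indeterminate")
elif 4 - dwu >= dw > dwu:
    print("C: no observed serial correlation")
elif dwu >= dw > dwl:
    print("D: indeterminate")
else:
    print("E: positive serial correlation")   # fires here, since 1.41 >= 0.21 > 0
```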
48. Durbin-Watson Statistic
49. Errors are Normally Distributed
- Each observation's error is normally distributed around the estimated regression line.
(Figure: points scattered around the OLS regression line; errors can be +/-, but they are grouped around the regression line.)
50. When might errors be distributed some other way?
- One example would be a dependent variable that's 0/1 or similar (discrete and/or limited):
- Employed/Unemployed
- Full-time/Part-time
- 1 if above a certain value, 0 if not.
51. Errors have a mean of 0
- A + error is just as likely as a - error, and they balance out.
(Figure: OLS regression line with errors scattered evenly above and below it.)
52. Variance (or st. dev.) of errors is constant across values of the RHS variable
(Figure: OLS regression line with an even band of errors across X.)
53. What would it look like if variance wasn't constant?
- Here is one specific type of non-constant variance: the mean is still 0, but the errors get larger as X gets larger.
- This is referred to as heteroscedasticity. Yes, you heard right: heteroscedasticity. And it's bad for inference.
(Figure: OLS regression line with errors fanning out as X grows.)
54. (Figure: from another angle, errors can be + or -, but they should be stable over time or across X.)
55. (Figure: well-behaved residuals plotted against X.)
56. Heteroscedastic residuals
(Figure: residuals vs. X, fanning outward.)
57. Heteroscedasticity
- Can cause the estimated st. errors (those reported by the statistical package) to be smaller than the actual st. errors.
- This messes up the estimated t-stats: they are reported as larger than they actually are. Based on the estimated t-stats, we might reject the null when we really shouldn't. (A quick visual check is sketched below.)
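A minimal sketch of that visual check, plotting residuals against X; the residuals here are simulated with a spread that grows with X, so the tell-tale fan shape appears:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(1, 10, 100)
e = rng.normal(0, 0.2 * X)        # simulated errors whose spread grows with X

plt.scatter(X, e, s=10)
plt.axhline(0, color="black", linewidth=1)
plt.xlabel("X")
plt.ylabel("residual")
plt.title("Fan shape = heteroscedastic residuals")
plt.show()
```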
58. Common causes of Heteroscedasticity
- Personal hygiene aside, there are several potential sources of this problem:
- Model misspecification
- Omitting an important variable
- Improper functional form (there may be non-linearity in the relationship between X and Y)
59. Data problems and fixes for the bivariate model
- Trends: no problem.
- Adapting the bivariate model to forecast seasonal data:
- You might think the bivariate model is too simple to handle seasonality. Well, it's simple, but with a trick or two you can extend its capabilities quite a bit.
60. Forecasting SA Total Houses Sold: two bivariate regressions (linear trend and DPI)
- What are the causal factors for house purchases?
- Income
- Time trend (inflation)
- Employment (rate)
- Interest rates
- Consumer confidence
- Price of housing
- Price of rental units (substitutes)
- Price of insurance (complements)
- Property taxes (other costs)
- We will focus on the first two: income and the time trend.
61. Steps involved in using the bivariate model for seasonal data
- Compute a set of seasonal indices
- De-seasonalize the data
- Do your forecast
- Re-seasonalize the data
- Calculate your measures of accuracy
(A sketch of these steps follows.)
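A compact sketch of steps 2-4 under some assumptions: `y` is a monthly pandas Series with a DatetimeIndex, `idx` is a 12-element numpy array of multiplicative seasonal indices (step 1, obtained as sketched two slides below), and the forecast is a simple linear trend. Step 5 would then compare the returned values with actuals:

```python
import numpy as np
import pandas as pd

def reseasonalized_trend_forecast(y: pd.Series, idx: np.ndarray, horizon: int):
    months = y.index.month - 1                # 0..11 for each observation
    y_sa = y / idx[months]                    # step 2: de-seasonalize

    t = np.arange(1, len(y_sa) + 1)           # step 3: fit a linear trend
    b1, b0 = np.polyfit(t, y_sa.values, 1)    # on the adjusted series
    t_new = np.arange(len(y) + 1, len(y) + horizon + 1)
    sa_fcst = b0 + b1 * t_new

    future = pd.date_range(y.index[-1], periods=horizon + 1, freq="MS")[1:]
    # step 4: re-seasonalize (step 5 compares these with actuals)
    return pd.Series(sa_fcst * idx[future.month - 1], index=future)
```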
62. We start with... Unadjusted Housing Sales
63. Getting your indices and de-seasonalizing the data
- What we need to do is decompose the series to get the seasonal index for each month. We could use the Winters model to estimate the seasonal indices, but instead we use the Decomposition model.
- We estimate the Decomposition model and choose the multiplicative option to get the indices (in multiplicative form).
- (We will cover this model later, but for now... well, just think of it as magic!!! A rough sketch of the idea follows.)
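For the curious, here is a rough sketch of the idea behind multiplicative decomposition, using a 12-month centered moving average as the trend-cycle; this approximates, rather than reproduces, what ForecastX's Decomposition model does:

```python
import pandas as pd

def seasonal_indices(y: pd.Series) -> pd.Series:
    """Multiplicative seasonal indices for a monthly series with a DatetimeIndex."""
    trend = y.rolling(window=12, center=True).mean()   # trend-cycle estimate
    ratios = y / trend                                 # seasonal ratios
    idx = ratios.groupby(y.index.month).mean()         # average by calendar month
    return idx * 12 / idx.sum()                        # normalize to average 1.0
```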
64. Getting the Seasonal Indices
The indices are repeated down the column next to each month.
65. Applying the Index to the Data
(Figure: the unadjusted series next to the adjusted series.)
66. Seasonally Adjusted Total Housing Sales (SATHS)
There is still a trend, but we aren't worried about that right now.
67. Using the Adjusted Sales Data (SATHS)
- Let's now forecast adjusted housing sales as a function of time (a 12-month forecast).
- The equation we are estimating is:
- SATHS = b0 + b1(Time) + e
- What do we expect for the sign of b1?
68. Data
- There are two ways to approach this in ForecastX:
- Use the Linear Regression model without a time index, or
- Use Multiple Regression with both the time index and the year and month variables.
- Both provide essentially the same results.
- (Table: Seasonally Adjusted Total Houses Sold and Time.)
69. Forecast Using T
70. Regression Results
Re-seasonalize this!!!
71. Re-Seasonalizing
(Figure: actual vs. forecast values after re-seasonalizing.)
72. The Trend Forecast
The thing to note here is that the simple linear model is now capturing some of the seasonal fluctuations... WOW!!!
73. What have we done?!
- Really, we have simply used a little math to incorporate the estimated seasonal variation into the bivariate forecast, without actually estimating it that way.
74. Now, let's do the same thing with DPI as the RHS variable instead of T
- The same steps are involved here.
- We've already obtained the seasonal indices and computed the de-seasonalized data.
- All we need to do is make the forecast, re-seasonalize the data, and calculate our measures of accuracy.
75. Data
- (Table: Seasonally Adjusted Total Houses Sold and DPI.)
76. Re-Seasonalizing
77. Forecast Using DPI
78. What can we say about the bivariate model and seasonality?
- It's really easy to forecast when there is a trend; that's just the slope b.
- Although there are a few steps involved, it's not terribly difficult to forecast a series that has seasonality.
- So, we can (substantially) do what the Winters model can, with the added benefit of being able to say why something is happening.
- We also conserve degrees of freedom.
79. Other problems
- Serial or Autocorrelation
- Remember, autocorrelation occurs when adjacent observations are correlated; this causes our estimated standard errors to be too small and our t-stats to be too big, messing up inference.
80. Causes
- Long-term cycles or trends
- Inflation
- Population growth
- I.e., any outside force that affects both series
- Misspecification
- Leaving out an important variable (see above)
- Failing to include the correct functional form of the RHS, i.e., a non-linear term (which may necessitate the move to multivariate regression)
81. Considerations
- We generally aren't concerned with AC or SC if we are just estimating the time trend (y = f(t)) in OLS.
- It's mainly when we want to figure out the causal relationship between the RHS and the LHS that we need to worry about SC or AC.
82. Using what we know
- We have looked at the DW statistic, how it's calculated, and what it measures.
- Let's look at a bivariate forecast that has a couple of problems and see if we can use some of the tools we currently have to identify the problems and fix them, if possible.
83. Example of Autocorrelation in Action: Background
- Remember, in macroeconomics, a guy named Keynes made a couple of observations about aggregate consumption:
- What is not consumed out of current income is saved, and
- Current consumption depends on current income in a less-than-proportionate manner.
84. Keynesian Theory of Consumption
- Keynes' theories placed emphasis on a parameter called the Marginal Propensity to Consume (MPC), which is the slope of the aggregate consumption function and a key factor in determining the "multiplier effect" of tax and spending policies.
85. What is MPC in everyday terms?
- MPC can be thought of as the share of each additional dollar of income that's spent on consumption.
- The multiplier effect is the economic stimulus effect that comes from the spending and re-spending of that portion of the dollar.
- Good taxing and spending policies take into account the MPC: a higher MPC means a larger multiplier effect.
86. MPC in action
- From a fiscal policy standpoint, we want to:
- inject into activities that have a high multiplier effect, or
- provide lower taxes or higher subsidies to people who spend all their income (i.e., MPC = 1).
- Before you say "hey, I don't like this idea"... I am talking about YOU!!!
87. OK now, why do we care?
- Knowledge of the MPC can give us some idea how to stimulate the economy in a recession, or put the brakes on an overheated economy, with government policies.
- For example, consider the last recession. Think of the policies that were used by the feds:
- Income tax rates were reduced
- Tax rebate checks (based on dependents)
- Other federal expenditures increased (?)
88. What were the intended effects of these policies?
- Anything aimed at increasing disposable income is expected to increase C.
- C + I + G + (X - M) = GNP
- If the relationship holds, then increasing C has what effect on GNP?
- As C gets larger, GNP is expected to grow.
89. GNP and Consumption
90. The Regression (with issues)
- We can obtain an estimate of the MPC by applying OLS to the following aggregate consumption function:
- GC = b0 + b1(GNP) + e
- Where the slope, b1, is the estimate of the MPC.
- The Data (a sketch of this estimation follows)
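A minimal sketch of this regression with statsmodels, on made-up trending data with deliberately autocorrelated errors (so the serial-correlation problem discussed next actually shows up); substitute the real GC and GNP series:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
n = 60
gnp = pd.Series(1000.0 + 50 * np.arange(n) + rng.normal(0, 20, n), name="GNP")

e = np.zeros(n)                       # AR(1) errors: today's error depends
for t in range(1, n):                 # on yesterday's
    e[t] = 0.8 * e[t - 1] + rng.normal(0, 10)
gc = 100 + 0.9 * gnp + e              # "true" MPC of 0.9 in this fake world

fit = sm.OLS(gc, sm.add_constant(gnp)).fit()
print(fit.params["GNP"])              # b1: the MPC estimate
print(durbin_watson(fit.resid))       # well below 2 -> positive serial correlation
```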
91. Forecast of Consumption
92. Output: everything looks good, but...
- These look pretty good! ...maybe too good.
- The DW indicates serial correlation.
93. Things we need to keep in mind
- Both variables are probably non-stationary. In fact, that can be shown using ForecastX's Analyze button and creating the correlograms for both series (see Chpt. 2, p. 83).
- And, therefore, they may have a common trend. In other words, the regression may be plagued by serial correlation.
- Non-stationarity is not such a big deal in the time-trend models, because we aren't trying to establish a causal relationship and we have models that can deal with it (linear).
94. In OLS
- In OLS models, non-stationarity IS a problem in forecasting, because we ARE trying to establish a relationship, and
- If there is a common trend in the LHS and RHS, we erroneously attribute the trend's influence to the RHS variable.
95. Detecting Serial Correlation and Autocorrelation Graphically: What is the ACF?
- We just learned what the DW does, but there are also graphical ways to spot SC/AC.
- The ACF measures the autocorrelation of a series with its observations from each previous period (lag).
- If a time series is stationary, the autocorrelation should diminish towards 0 quickly as we go back in time.
- Note: if a time series has seasonality, the autocorrelation is usually highest between like seasons in different years.
- (A sketch of computing the ACF follows.)
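A sketch of inspecting the ACF with statsmodels, standing in for ForecastX's Analyze button; the series here is a made-up non-stationary one:

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(1)
trending = np.cumsum(rng.normal(1, 1, 120))   # made-up non-stationary series

print(acf(trending, nlags=12))                # decays very slowly: non-stationary
print(acf(np.diff(trending), nlags=12))       # dies out quickly after differencing
```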
97. ACF for both Series
98. Be suspicious of results that look too good
- The forecaster should be suspicious of the results because, in a sense, they are too good.
- Both the ACF and the DW stat (0.16) indicate that in the original two series we likely have strong positive serial correlation.
99. A Potential Method for Fixing a Non-Stationary Series
- What method can we use to potentially fix a non-stationary series? Think back...
- Right now, we only know about first-differencing or de-trending, so let's use that.
- Just to note: the Holt and Winters models allow for trend, but not for RHS variables, so we can't use them directly to find the MPC.
100. Spurious Regression: First Differences
- When significant autocorrelation is present, spurious regression may arise, in that our results appear to be highly accurate when in fact they are not, since the OLS estimator of the regression error variance is biased downward.
- To investigate this possibility, we will re-estimate our consumption function using first differences of the original data.
- This transformation is designed to eliminate any common linear trend in the data. (A sketch follows.)
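Continuing the earlier sketch: re-estimating on first differences, with the same made-up data-generating process (np.diff produces the differenced series):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)                # same made-up process as before
n = 60
gnp = 1000.0 + 50 * np.arange(n) + rng.normal(0, 20, n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.8 * e[t - 1] + rng.normal(0, 10)
gc = 100 + 0.9 * gnp + e

d_gnp, d_gc = np.diff(gnp), np.diff(gc)       # first differences remove the trend
fit_d = sm.OLS(d_gc, sm.add_constant(d_gnp)).fit()
print(fit_d.params)                           # slope = the MPC estimate
print(durbin_watson(fit_d.resid))             # should sit much closer to 2
```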
101. ACF for first-differenced data
Differencing didn't completely take care of it in this series... but we use it anyway.
102. Scatter Plot of First Differences
There is still a positive relationship in the differenced data, but it has more error and it's weaker.
103. Back to the Data
- Let's now estimate the differenced model.
- The Data
104. Effects of Serial Correlation on Regression Results and Inference
These are the more accurate results.
105. Summary: Serial and Autocorrelation
- Serial and/or autocorrelation can have a serious effect on your estimates and your inference.
- We can use both the DW and the correlation coefficients (correlograms) to detect serial or autocorrelation.
- First differencing can often help in purging SC or AC from the data.
106. Ungraded Homework for next time