Title: Chapter 4: Simple or Bivariate Regression
1. Chapter 4: Simple or Bivariate Regression
- Terms
- Dependent variable (LHS)
- the series we are trying to estimate
- Independent variable (RHS)
- the data we are using to estimate the LHS
2. The line and the regression line
- Y = f(X): there is assumed to be a relationship between X and Y.
- Y = mX + b
- Because the line we are looking for is an estimate of the population, and not every observation falls on the estimate of the line, we have error (e).
- Y = b0 + b1X1 + e
3. What is b?
- b0 represents the intercept term.
- b1 represents the slope of the estimated regression line.
- This term (b1) can be interpreted as the rate of change in Y per unit change in X, just like a simple line equation.
4. Population vs. Sample
- Population (we don't often have this data) vs. Sample (we usually have this)
- e (a.k.a. error, or the residuals) = Y - Y(hat)
5. Residuals, another way
- Residuals can also be constructed by solving for e in the regression equation.
- e = Y - (b0 + b1X)
6. The goal of Ordinary Least-Squares Regression (the type we are going to use)
- Minimize the sum of squared residuals.
- We could calculate the regression line and the residuals by hand... but we ain't gonna. (A sketch of the arithmetic follows.)
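Here is a minimal sketch of that by-hand calculation in Python; the data are made up purely for illustration:

```python
# OLS "by hand": pick b0 and b1 to minimize the sum of squared residuals.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # made-up data
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form solution for the bivariate model Y = b0 + b1*X + e
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

residuals = Y - (b0 + b1 * X)        # e = Y - (b0 + b1*X), as above
sse = np.sum(residuals ** 2)         # the quantity OLS minimizes

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, SSE = {sse:.3f}")
```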
7. First step, ALWAYS: look at your data
- Plot it against time, or
- Plot it against your dependent variable.
- Why? ...because dissimilar data can potentially generate very similar summary statistics; pictures help discern the differences.
8. Dissimilar data with similar stats
- The Xs have the same mean and st. dev.
- The Ys have the same mean and st. dev.
- From this we might conclude that the data sets are identical, but we'd be wrong.
9. What do they look like?
Although they result in the same OLS regression, they are very different.
10. Forecasting a Simple Linear Trend
- Disposable Personal Income (DPI)
- It's sometimes reasonable to make a forecast on the basis of just a linear trend, where Y is assumed to be a function of time (T).
- The regression looks like the following: Y(hat) = b0 + b1(T)
- Where Y(hat) is the series you want to estimate. In this case, it's DPI.
11. DPI
12. To forecast with simple OLS in ForecastX
- You need to construct an index of T.
- For this data set, there are 144 months; the index goes from 1 to 144.
(Figure: the time index T alongside the data.)
13. Forecast of DPI
14. Some of the Output
15. To forecast, we just need the index for the month (T)
- Jan 1993: DPI_1 = 4588.58 + 27.93(1) = 4616.51
- Feb 1993: DPI_2 = 4588.58 + 27.93(2) = 4644.44
- ...
- Dec 2004: DPI_144 = 4588.58 + 27.93(144) = 8610.50
- Jan 2005: DPI_145 = 4588.58 + 27.93(145) = 8638.43
- And so on. (The arithmetic is sketched below.)
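A sketch of that arithmetic, using the estimated coefficients reported above (b0 = 4588.58, b1 = 27.93):

```python
# The fitted trend line: DPI_hat = b0 + b1 * T, where T = 1..144 covers
# Jan 1993 - Dec 2004 and T > 144 extends the forecast.
b0, b1 = 4588.58, 27.93

def dpi_forecast(t: int) -> float:
    return b0 + b1 * t

print(dpi_forecast(1))    # Jan 1993 -> 4616.51
print(dpi_forecast(144))  # Dec 2004 -> 8610.50
print(dpi_forecast(145))  # one step beyond the sample -> 8638.43
```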
16. Output
Hypothesis test for slope = 0 and intercept = 0. What does it say?
17. Do we reject that the slope and intercept are each equal to 0?!
(Figure: a t-distribution centered at 0, with "Reject H0" regions beyond the critical values of -2.045 and +2.045 and "Do Not Reject H0" between them; the computed t-stats of 138.95 and 297.80 fall far into the rejection region.)
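A minimal sketch of the decision rule in the figure, assuming the slide's t-stats (138.95 and 297.80) and the df = 29 implied by the 2.045 cutoff:

```python
# Reject H0 (coefficient = 0) when |t| exceeds the two-tailed critical value.
from scipy.stats import t as t_dist

t_crit = t_dist.ppf(0.975, df=29)        # ~2.045, the cutoff in the figure
for t_val in (138.95, 297.80):           # t-stats reported on the slide
    decision = "reject H0" if abs(t_val) > t_crit else "do not reject H0"
    print(f"t = {t_val} vs +/-{t_crit:.3f} -> {decision}")
```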
18. Just to note
- In the previous model, the only thing we are using to predict DPI is the progression of time.
- There are many more things that have the potential of increasing or decreasing DPI.
- We don't account for anything else... yet.
19. The benefits of regression
- The true benefit of regression models is their ability to examine cause and effect.
- In trend models (everything we've seen until now), we are depending on observed patterns of past values to predict future values.
- In a causal model, we are hypothesizing a relationship between the dependent variable (the variable we are interested in predicting) and one or more independent variables (the data we use to predict).
20. Back to Jewelry
- There are many things that might influence the total monthly sales of jewelry, things like:
- - Weddings
- - Anniversaries
- - Advertising expenditures, and
- - DPI
- Since this is bivariate regression, for now we will focus on DPI as the sole independent variable used to predict jewelry sales.
21. Let's look at the jewelry sales data plotted against DPI
(Figure: scatter of jewelry sales vs. DPI, with the December observations labeled "Christmas" sitting well above the other months.)
The big differences in sales during the December months will make it hard to estimate with a bivariate regression. We will use both the unadjusted and the seasonally adjusted series to see the difference in model accuracy.
22. Jewelry Example
- Our dependent (Y) variable is monthly jewelry sales:
- unadjusted in the first example
- seasonally adjusted in the second example
- Our only independent variable (X) is DPI, so the models we are going to estimate are:
- JS = b0 + b1(DPI) + e
- SAJS = b0 + b1(DPI) + e
- (A sketch of estimating the first of these follows.)
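A minimal sketch of estimating the model in Python with statsmodels, standing in for the ForecastX steps the slides use; the file and column names are hypothetical, so substitute your own jewelry-sales and DPI series:

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("jewelry.csv")      # hypothetical file with JS and DPI columns
X = sm.add_constant(df["DPI"])       # adds the intercept term b0
model = sm.OLS(df["JS"], X).fit()    # estimates JS = b0 + b1*DPI + e

print(model.params)                  # b0 and b1
print(model.rsquared)                # fraction of variance explained
```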
23. Data for unadjusted Jewelry Sales
24. Output for unadjusted JS
25. Output for unadjusted JS
26. Data for adjusted Jewelry Sales
27. Output with adjusted JS
28. Output with adjusted JS
29. Things to consider with ANY regression
- Do the signs on the b's make sense?
- Your expectation should have SOME logical basis.
- If the sign is not what is expected, your regression may be:
- Underspecified: move on to multiple regression.
- Misspecified: consider other RHS variables that might provide a better measure.
30. Consider the Jewelry Example
- Do we get the right sign? I.e., what's the relationship between DPI and sales?
- What is a normal good?
- What kind of good is jewelry, normal or inferior?
- What would be the expected sign if we were looking at a good we thought was an inferior good?
31. Things to consider with ANY regression
- If you DO get the expected signs, are the effects statistically significant?
- Do the t-stats indicate a strong relationship?
- Can you reject the null that the relationship (slope) is 0?
32. Things to consider with ANY regression
- Are the effects economically significant?
- Even with statistically significant results, a very small slope indicates that a very large change in the RHS variable is necessary to get any change in the LHS.
- There is no hard and fast rule here. It requires judgment.
33. Consider the Jewelry Example
- In the jewelry example, it takes a 250 million (or 0.25 billion) dollar increase in DPI to increase (adjusted) jewelry sales by 1 million. Is this a lot or a little slope?
- Let's think of it a little differently:
- This would be roughly a $1 increase in (adjusted) jewelry sales with a $250 increase in personal disposable income.
- Does this pass the sniff test?
34. Things to consider with ANY regression
- Does the regression explain much?
- In linear regressions, the fraction of the variance in the dependent variable explained by the independent variable is measured by the R-squared (a.k.a. the coefficient of determination).
- Trend R-sq = .9933
- Causal (w/ season) R-sq = .0845
- Causal (w/o season) R-sq = .8641
- Although the causal model explains less of the variance, we now have some evidence that sales are related to DPI.
35. Another thing to consider about the first model with seasonality in it
- The first model was poorly specified when we were using the series with seasonality in it.
- The de-seasonalized data provide a better fit in the simple regression. Why?
- Well, income is obviously related to sales, but so is the month of the year (e.g., Dec.), so we need to adjust or account for it:
- Adjust for seasonality (use a more appropriate RHS var), or
- Account for it in the model (move to multi-variable regression and include the season in the regression; to be covered next chapter).
36. Question
- Why would we want to forecast jewelry sales based on a series like DPI?
- DPI is very close to a linear trend; we have a good idea what it might look like several periods from now.
37. Other examples of simple regression models: cross section (all in the same time)
- Car mileage as a function of engine size
- What do we expect this relationship to be on average?
- Body weight as a function of height
- What do we expect this relationship to be on average?
- Income as a function of educational attainment
- What do we expect this relationship to be on average?
38. Assumptions of the OLS regression
- One assumption of the OLS model is that the error terms DON'T have any regular patterns. First off, this means:
- Errors are independently distributed
- And, they are normally distributed
- They have a mean of 0
- They have a constant variance
39. Errors are independently distributed
- Errors might not be independently distributed if we have serial correlation (or autocorrelation).
- Serial correlation occurs when one period's error is related to another period's error.
- You can have both positive and negative serial correlation.
40. Negative Serial Correlation
Negative serial correlation occurs when positive errors are followed by negative errors (or vice versa).
(Figure: Y vs. X scatter with residuals alternating sign around the regression line.)
41. Positive Serial Correlation
Positive serial correlation occurs when positive errors tend to be followed by positive errors.
(Figure: Y vs. X scatter with runs of same-signed residuals around the regression line.)
42. What does Serial Correlation Cause?
- The estimates for b are unbiased, but the errors are underestimated; this means our t-stats are overstated.
- If our t-stats are overstated, then it's possible we THINK we have a significant effect for b when we really don't.
- Additionally, the R-squared and F-stat are both unreliable.
43. Durbin-Watson Statistic
- The Durbin-Watson statistic is used to test for the existence of serial correlation.
- DW = Σ(e_t - e_(t-1))² / Σ e_t², i.e., the sum of squared changes in the residuals divided by the sum of squared errors.
- The Durbin-Watson statistic ranges from 0 to 4.
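A sketch of that computation from a vector of residuals (statsmodels also ships this as statsmodels.stats.stattools.durbin_watson):

```python
import numpy as np

def durbin_watson(e: np.ndarray) -> float:
    # DW = sum of squared changes in the residuals / sum of squared errors
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

e = np.array([0.5, 0.6, 0.4, -0.2, -0.5, -0.3, 0.1, 0.4])  # made-up residuals
print(durbin_watson(e))   # values near 2 suggest no serial correlation
```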
44. Evaluation of the DW Statistic
- The rule of thumb: if it's near 2 (i.e., from 1.5 to 2.5), there is no evidence of serial correlation present.
- For a more precise evaluation you have to calculate and compare 5 inequalities and determine which of the 5 is true.
45. # of RHS vars
(Table: lower and upper critical values of the DW statistic, DWL and DWU, by number of observations and number of RHS variables.)
46. Evaluation of the DW Statistic
- Evaluate the following and choose the TRUE region:
- A: 4 >= DW > (4 - DWL) -> negative serial correlation
- B: (4 - DWL) >= DW > (4 - DWU) -> indeterminate
- C: (4 - DWU) >= DW > DWU -> no observed serial correlation
- D: DWU >= DW > DWL -> indeterminate
- E: DWL >= DW > 0 -> positive serial correlation
47. For Example
- Suppose we get a DW of 0.21 with 36 observations.
- From the table: DWL = 1.41, DWU = 1.52.
- The rest is just filling in and evaluating:
- A: 4 >= 0.21 > (4 - 1.41)? False
- B: (4 - 1.41) >= 0.21 > (4 - 1.52)? False
- C: (4 - 1.52) >= 0.21 > 1.52? False
- D: 1.52 >= 0.21 > 1.41? False
- E: 1.41 >= 0.21 > 0? True -> positive serial correlation
- (This evaluation is sketched in code below.)
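The same evaluation as a sketch in Python, walking regions A through E from the previous slide:

```python
# DW = 0.21 with n = 36 and one RHS variable: DW_L = 1.41, DW_U = 1.52.
dw, dwl, dwu = 0.21, 1.41, 1.52

if 4 >= dw > 4 - dwl:
    print("A: negative serial correlation")
elif 4 - dwl >= dw > 4 - dwu:
    print("B: indeterminate")
elif 4 - dwu >= dw > dwu:
    print("C: no observed serial correlation")
elif dwu >= dw > dwl:
    print("D: indeterminate")
else:
    print("E: positive serial correlation")   # fires here, since 1.41 >= 0.21 > 0
```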
48. Durbin-Watson Statistic
49. Errors are Normally Distributed
- Each observation's error is normally distributed around the estimated regression line.
(Figure: points scattered around the OLS regression line; errors can be +/-, but they are grouped around the regression line.)
50. When might errors be distributed some other way?
- One example would be a dependent variable that's 0/1 or similar (discrete and/or limited):
- Employed/Unemployed
- Full-time/Part-time
- 1 if above a certain value, 0 if not.
51. Errors have a mean of 0
- A + error is just as likely as a - error, and they balance out.
(Figure: OLS regression line with errors scattered evenly above and below it.)
52. Variance (or st. dev.) of errors is constant across values of the RHS variable
(Figure: OLS regression line with an even band of errors across X.)
53. What would it look like if variance wasn't constant?
- Here is one specific type of non-constant variance: the mean is still 0, but the errors get larger as X gets larger.
- This is referred to as heteroscedasticity. Yes, you heard right: heteroscedasticity. And it's bad for inference.
(Figure: OLS regression line with errors fanning out as X grows.)
54. (Figure: from another angle, errors can be + or -, but they should be stable over time or across X.)
55. (Figure: well-behaved residuals plotted against X.)
56. Heteroscedastic residuals
(Figure: residuals vs. X, fanning outward.)
57. Heteroscedasticity
- Can cause the estimated st. errors (those reported by the statistical package) to be smaller than the actual st. errors.
- This messes up the estimated t-stats: they are reported as larger than they actually are. Based on the estimated t-stats, we might reject the null when we really shouldn't. (A quick visual check is sketched below.)
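A minimal sketch of that visual check, plotting residuals against X; the residuals here are simulated with a spread that grows with X, so the tell-tale fan shape appears:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(1, 10, 100)
e = rng.normal(0, 0.2 * X)        # simulated errors whose spread grows with X

plt.scatter(X, e, s=10)
plt.axhline(0, color="black", linewidth=1)
plt.xlabel("X")
plt.ylabel("residual")
plt.title("Fan shape = heteroscedastic residuals")
plt.show()
```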
58. Common causes of Heteroscedasticity
- Personal hygiene aside, there are several potential sources of this problem:
- Model misspecification
- Omitting an important variable
- Improper functional form (there may be non-linearity in the relationship between X and Y)
59. Data problems and fixes for the bivariate model
- Trends: no problem.
- Adapting the bivariate model to forecast seasonal data:
- You might think the bivariate model is too simple to handle seasonality. Well, it's simple, but with a trick or two you can extend its capabilities quite a bit.
60. Forecasting SA Total Houses Sold: two bivariate regressions (linear trend and DPI)
- What are the causal factors for house purchases?
- Income
- Time trend (inflation)
- Employment (rate)
- Interest rates
- Consumer confidence
- Price of housing
- Price of rental units (substitutes)
- Price of insurance (complements)
- Property taxes (other costs)
- We will focus on the first two: income and the time trend.
61. Steps involved in using the bivariate model for seasonal data
- Compute a set of seasonal indices
- De-seasonalize the data
- Do your forecast
- Re-seasonalize the data
- Calculate your measures of accuracy
(A sketch of these steps follows.)
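A compact sketch of steps 2-4 under some assumptions: `y` is a monthly pandas Series with a DatetimeIndex, `idx` is a 12-element numpy array of multiplicative seasonal indices (step 1, obtained as sketched two slides below), and the forecast is a simple linear trend. Step 5 would then compare the returned values with actuals:

```python
import numpy as np
import pandas as pd

def reseasonalized_trend_forecast(y: pd.Series, idx: np.ndarray, horizon: int):
    months = y.index.month - 1                # 0..11 for each observation
    y_sa = y / idx[months]                    # step 2: de-seasonalize

    t = np.arange(1, len(y_sa) + 1)           # step 3: fit a linear trend
    b1, b0 = np.polyfit(t, y_sa.values, 1)    # on the adjusted series
    t_new = np.arange(len(y) + 1, len(y) + horizon + 1)
    sa_fcst = b0 + b1 * t_new

    future = pd.date_range(y.index[-1], periods=horizon + 1, freq="MS")[1:]
    # step 4: re-seasonalize (step 5 compares these with actuals)
    return pd.Series(sa_fcst * idx[future.month - 1], index=future)
```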
62. We start with... Unadjusted Housing Sales
63. Getting your indices and de-seasonalizing the data
- What we need to do is decompose the series to get the seasonal index for each month. We could use the Winters model to estimate the seasonal indices, but instead we use the Decomposition model.
- We estimate the Decomposition model and choose the multiplicative option to get the indices (in multiplicative form).
- (We will cover this model later, but for now... well, just think of it as magic!!! A rough sketch of the idea follows.)
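For the curious, here is a rough sketch of the idea behind multiplicative decomposition, using a 12-month centered moving average as the trend-cycle; this approximates, rather than reproduces, what ForecastX's Decomposition model does:

```python
import pandas as pd

def seasonal_indices(y: pd.Series) -> pd.Series:
    """Multiplicative seasonal indices for a monthly series with a DatetimeIndex."""
    trend = y.rolling(window=12, center=True).mean()   # trend-cycle estimate
    ratios = y / trend                                 # seasonal ratios
    idx = ratios.groupby(y.index.month).mean()         # average by calendar month
    return idx * 12 / idx.sum()                        # normalize to average 1.0
```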
64. Getting the Seasonal Indices
The indices are repeated down the column next to each month.
65. Applying the Index to the Data
(Figure: the unadjusted series next to the adjusted series.)
66. Seasonally Adjusted Total Housing Sales (SATHS)
There is still a trend, but we aren't worried about that right now.
67. Using the Adjusted Sales Data (SATHS)
- Let's now forecast adjusted housing sales as a function of time (a 12-month forecast).
- The equation we are estimating is:
- SATHS = b0 + b1(Time) + e
- What do we expect for the sign of b1?
68. Data
- There are two ways to approach this in ForecastX:
- Use the Linear Regression model without a time index, or
- Use Multiple Regression with both the time index and the year and month variables.
- Both provide essentially the same results.
- (Table: Seasonally Adjusted Total Houses Sold and Time.)
69. Forecast Using T
70. Regression Results
Re-seasonalize this!!!
71. Re-Seasonalizing
(Figure: actual vs. forecast values after re-seasonalizing.)
72. The Trend Forecast
The thing to note here is that the simple linear model is now capturing some of the seasonal fluctuations... WOW!!!
73. What have we done?!
- Really, we have simply used a little math to incorporate the estimated seasonal variation into the bivariate forecast, without actually estimating it that way.
74. Now, let's do the same thing with DPI as the RHS variable instead of T
- The same steps are involved here.
- We've already obtained the seasonal indices and computed the de-seasonalized data.
- All we need to do is make the forecast, re-seasonalize the data, and calculate our measures of accuracy.
75. Data
- (Table: Seasonally Adjusted Total Houses Sold and DPI.)
76. Re-Seasonalizing
77. Forecast Using DPI
78. What can we say about the bivariate model and seasonality?
- It's really easy to forecast when there is a trend; that's just the slope b.
- Although there are a few steps involved, it's not terribly difficult to forecast a series that has seasonality.
- So, we can (substantially) do what the Winters model can, with the added benefit of being able to say why something is happening.
- We also conserve degrees of freedom.
79. Other problems
- Serial or Autocorrelation
- Remember, autocorrelation occurs when adjacent observations are correlated; this causes our estimated standard errors to be too small and our t-stats to be too big, messing up inference.
80. Causes
- Long-term cycles or trends
- Inflation
- Population growth
- I.e., any outside force that affects both series
- Misspecification
- Leaving out an important variable (see above)
- Failing to include the correct functional form of the RHS, i.e., a non-linear term (which may necessitate the move to multivariate regression)
81. Considerations
- We generally aren't concerned with AC or SC if we are just estimating the time trend (y = f(t)) in OLS.
- It's mainly when we want to figure out the causal relationship between the RHS and the LHS that we need to worry about SC or AC.
82. Using what we know
- We have looked at the DW statistic, how it's calculated, and what it measures.
- Let's look at a bivariate forecast that has a couple of problems and see if we can use some of the tools we currently have to identify the problems and fix them, if possible.
83. Example of Autocorrelation in Action: Background
- Remember, in macroeconomics, a guy named Keynes made a couple of observations about aggregate consumption:
- What is not consumed out of current income is saved, and
- Current consumption depends on current income in a less-than-proportionate manner.
84. Keynesian Theory of Consumption
- Keynes' theories placed emphasis on a parameter called the Marginal Propensity to Consume (MPC), which is the slope of the aggregate consumption function and a key factor in determining the "multiplier effect" of tax and spending policies.
85. What is MPC in everyday terms?
- MPC can be thought of as the share of each additional dollar of income that's spent on consumption.
- The multiplier effect is the economic stimulus effect that comes from the spending and re-spending of that portion of the dollar.
- Good taxing and spending policies take into account the MPC: a higher MPC means a larger multiplier effect.
86. MPC in action
- From a fiscal policy standpoint, we want to:
- inject into activities that have a high multiplier effect, or
- provide lower taxes or higher subsidies to people who spend all their income (i.e., MPC = 1).
- Before you say "hey, I don't like this idea"... I am talking about YOU!!!
87. OK now, why do we care?
- Knowledge of the MPC can give us some idea how to stimulate the economy in a recession, or put the brakes on an overheated economy, with government policies.
- For example, consider the last recession. Think of the policies that were used by the feds:
- Income tax rates were reduced
- Tax rebate checks (based on dependents)
- Other federal expenditures increased (?)
88. What were the intended effects of these policies?
- Anything aimed at increasing disposable income is expected to increase C.
- C + I + G + (X - M) = GNP
- If the relationship holds, then increasing C has what effect on GNP?
- As C gets larger, GNP is expected to grow.
89. GNP and Consumption
90. The Regression (with issues)
- We can obtain an estimate of the MPC by applying OLS to the following aggregate consumption function:
- GC = b0 + b1(GNP) + e
- Where the slope, b1, is the estimate of the MPC.
- The Data (a sketch of this estimation follows)
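A minimal sketch of this regression with statsmodels, on made-up trending data with deliberately autocorrelated errors (so the serial-correlation problem discussed next actually shows up); substitute the real GC and GNP series:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
n = 60
gnp = pd.Series(1000.0 + 50 * np.arange(n) + rng.normal(0, 20, n), name="GNP")

e = np.zeros(n)                       # AR(1) errors: today's error depends
for t in range(1, n):                 # on yesterday's
    e[t] = 0.8 * e[t - 1] + rng.normal(0, 10)
gc = 100 + 0.9 * gnp + e              # "true" MPC of 0.9 in this fake world

fit = sm.OLS(gc, sm.add_constant(gnp)).fit()
print(fit.params["GNP"])              # b1: the MPC estimate
print(durbin_watson(fit.resid))       # well below 2 -> positive serial correlation
```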
91. Forecast of Consumption
92. Output: everything looks good, but...
- These look pretty good! ...maybe too good.
- The DW indicates serial correlation.
93. Things we need to keep in mind
- Both variables are probably non-stationary. In fact, that can be shown using ForecastX's Analyze button and creating the correlograms for both series (see Chpt. 2, p. 83).
- And, therefore, they may have a common trend. In other words, the regression may be plagued by serial correlation.
- Non-stationarity is not such a big deal in the time-trend models, because we aren't trying to establish a causal relationship and we have models that can deal with it (linear).
94. In OLS
- In OLS models, non-stationarity IS a problem in forecasting, because we ARE trying to establish a relationship, and
- If there is a common trend in the LHS and RHS, we erroneously attribute the trend's influence to the RHS variable.
95. Detecting Serial Correlation and Autocorrelation Graphically: What is the ACF?
- We just learned what the DW does, but there are also graphical ways to spot SC/AC.
- The ACF measures the autocorrelation of a series with its observations from each previous period (lag).
- If a time series is stationary, the autocorrelation should diminish towards 0 quickly as we go back in time.
- Note: if a time series has seasonality, the autocorrelation is usually highest between like seasons in different years.
- (A sketch of computing the ACF follows.)
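A sketch of inspecting the ACF with statsmodels, standing in for ForecastX's Analyze button; the series here is a made-up non-stationary one:

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(1)
trending = np.cumsum(rng.normal(1, 1, 120))   # made-up non-stationary series

print(acf(trending, nlags=12))                # decays very slowly: non-stationary
print(acf(np.diff(trending), nlags=12))       # dies out quickly after differencing
```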
97. ACF for both Series
98. Be suspicious of results that look too good
- The forecaster should be suspicious of the results because, in a sense, they are too good.
- Both the ACF and the DW stat (0.16) indicate that in the original two series we likely have strong positive serial correlation.
99. A Potential Method for Fixing a Non-Stationary Series
- What method can we use to potentially fix a non-stationary series? Think back...
- Right now, we only know about first-differencing or de-trending, so let's use that.
- Just to note: the Holt and Winters models allow for trend, but not for RHS variables, so we can't use them directly to find the MPC.
100. Spurious Regression: First Differences
- When significant autocorrelation is present, spurious regression may arise, in that our results appear to be highly accurate when in fact they are not, since the OLS estimator of the regression error variance is biased downward.
- To investigate this possibility, we will re-estimate our consumption function using first differences of the original data.
- This transformation is designed to eliminate any common linear trend in the data. (A sketch follows.)
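Continuing the earlier sketch: re-estimating on first differences, with the same made-up data-generating process (np.diff produces the differenced series):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)                # same made-up process as before
n = 60
gnp = 1000.0 + 50 * np.arange(n) + rng.normal(0, 20, n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.8 * e[t - 1] + rng.normal(0, 10)
gc = 100 + 0.9 * gnp + e

d_gnp, d_gc = np.diff(gnp), np.diff(gc)       # first differences remove the trend
fit_d = sm.OLS(d_gc, sm.add_constant(d_gnp)).fit()
print(fit_d.params)                           # slope = the MPC estimate
print(durbin_watson(fit_d.resid))             # should sit much closer to 2
```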
101. ACF for first-differenced data
Differencing didn't completely take care of it in this series... but we use it anyway.
102. Scatter Plot of First Differences
There is still a positive relationship in the differenced data, but it has more error and it's weaker.
103. Back to the Data
- Let's now estimate the differenced model.
- The Data
104. Effects of Serial Correlation on Regression Results and Inference
These are the more accurate results.
105. Summary: Serial and Autocorrelation
- Serial and/or autocorrelation can have a serious effect on your estimates and your inference.
- We can use both the DW and the correlation coefficients (correlograms) to detect serial or autocorrelation.
- First differencing can often help in purging SC or AC from the data.
106. Ungraded Homework for next time