Chapter 4: Simple or Bivariate Regression - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Chapter 4: Simple or Bivariate Regression

1
Chapter 4 Simple or Bivariate Regression
  • Terms
  • Dependent variable (LHS)
  • the series we are trying to estimate
  • Independent variable (RHS)
  • the data we are using to estimate the LHS

2
The line and the regression line
  • Y = f(X): there is assumed to be a relationship
    between X and Y.
  • Y = mX + b
  • Because the line we are looking for is an
    estimate of the population, and not every
    observation falls on the estimate of the line,
    we have error (e).
  • Y = b0 + b1X1 + e

3
What is b?
  • b0 represents the intercept term.
  • b1 represents the slope of the estimated
    regression line.
  • This term (b1) can be interpreted as the rate of
    change in Y per unit change in X, just like a
    simple line equation.

4
Population vs Sample
  • Population: Y = b0 + b1X1 + e (we don't often have this data)
  • Sample: Y(hat) = b0 + b1X1 (we usually have this)

Y - Y(hat) = e (a.k.a. error, or the residuals)
5
Residuals another way
  • Residuals can also be constructed by solving for
    e in the regression equation.
  • e = Y - (b0 + b1X)

6
The goal of Ordinary Least-Squares Regression
(the type we are going to use)
  • Minimize the sum of squared residuals.
  • We could calculate the regression line and the
    residuals by hand... but we ain't gonna.
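
That said, the closed-form arithmetic is simple enough to sketch. Below is a minimal Python/numpy illustration (not from the slides, and not ForecastX; the data is made up) of the estimates that minimizing the sum of squared residuals produces:

```python
import numpy as np

# Made-up toy data: X is the independent (RHS) variable, Y the dependent (LHS)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

# Closed-form OLS solution: b1 = cov(X, Y) / var(X), b0 = mean(Y) - b1 * mean(X)
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

# The residuals, and the quantity OLS minimizes
e = Y - (b0 + b1 * X)
sse = np.sum(e ** 2)  # sum of squared residuals

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, SSE = {sse:.4f}")
```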

7
First step, ALWAYS: look at your data
  • Plot it against time, or
  • Plot it against your dependent variable.
  • Why? Because dissimilar data can potentially
    generate very similar summary statistics;
    pictures help discern the differences.

8
Dissimilar data with similar stats
The X's have the same mean and St. Dev.
The Y's have the same mean and St. Dev.
From this we might conclude that the data sets
are identical, but we'd be wrong.
9
What do they look like?
Although they result in the same OLS regression
line, they are very different.
10
Forecasting Simple Linear Trend
  • Disposable Personal Income (DPI)
  • It's sometimes reasonable to make a forecast on
    the basis of just a linear trend, where Y is just
    assumed to be a function of (T), or time.
  • The regression looks like the following:
  • Y(hat) = b0 + b1(T)
  • Where Y(hat) is the series you want to estimate.
    In this case, it's DPI.

11
DPI
12
To forecast with Simple OLS in ForecastX
  • You need to construct an index of T

For this data set, there are 144 months; the index
T goes from 1 to 144.
(Screenshot: the data alongside the time index T.)
13
Forecast of DPI
14
Some of the Output
(Screenshots: the regression output, presented two ways.)
15
To forecast, we just need the index for the month
(T)
  • Jan 1993: DPI_1 = 4588.58 + 27.93(1) = 4616.51
  • Feb 1993: DPI_2 = 4588.58 + 27.93(2) = 4644.44
  • ...
  • Dec 2004: DPI_144 = 4588.58 + 27.93(144) = 8610.50
  • Jan 2005: DPI_145 = 4588.58 + 27.93(145) = 8638.43
  • And so on...
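
A quick sketch of this plug-in arithmetic in Python, using the slide's estimated coefficients:

```python
# Trend model from the output above: DPI_t = 4588.58 + 27.93 * t
def dpi_forecast(t):
    return 4588.58 + 27.93 * t

print(dpi_forecast(1))    # ~4616.51 (Jan 1993)
print(dpi_forecast(144))  # ~8610.50 (Dec 2004)
```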

16
Output
Hypothesis test for slope = 0 and intercept = 0.
What does it say?
17
Do we reject that the slope and intercept are
each equal to 0?!
(Figure: t-distribution with rejection regions beyond the critical values of -2.045 and +2.045. The computed t-stats, 138.95 and 297.80, fall far into the rejection region, so we reject H0 for both the intercept and the slope.)
18
Just to note
  • In the previous model, the only thing we are
    using to predict DPI is the progression of time.
  • There are many more things that have the
    potential of increasing or decreasing DPI.
  • We don't account for anything else... yet.

19
The benefits of regression
  • The true benefit of regression models is in their
    ability to examine cause and effect.
  • In trend models (everything we've seen until
    now), we are depending on observed patterns of
    past values to predict future values.
  • In a Causal model, we are hypothesizing a
    relationship between the dependent variable (the
    variable we are interested in predicting) and one
    or more independent variables (the data we use to
    predict).

20
Back to Jewelry
  • There are many things that might influence the
    total monthly sales of jewelry, things like:
  • - Weddings
  • - Anniversaries
  • - Advertising expenditures, and
  • - DPI
  • Since this is bivariate regression, for now we
    will focus on DPI as the sole independent
    variable used to predict jewelry sales.

21
Let's look at the jewelry sales data plotted
against DPI
(Scatter plot: the December (Christmas) observations sit well apart from the other months.)
The big differences in sales during the Dec.
months will make it hard to estimate with a
bivariate regression. We will use both the
unadjusted and the seasonally adjusted series to
see the difference in model accuracy.
22
Jewelry Example
  • Our dependent (Y) variable is monthly jewelry
    sales
  • unadjusted in the first example
  • seasonally adjusted in the second example
  • Our only independent variable (X) is DPI, so
  • the models we are going to estimate are
  • JS = b0 + b1(DPI) + e
  • SAJS = b0 + b1(DPI) + e

23
Data for unadjusted Jewelry Sales
24
Output for unadjusted JS
25
Output for unadjusted JS
26
Data for adjusted Jewelry Sales
27
Output with adjusted JS
28
Output with adjusted JS
29
Things to consider with ANY regression
  • Do the signs on the b's make sense?
  • Your expectation should have SOME logical basis.
  • If the sign is not what is expected, your
    regression may be:
  • Underspecified: move on to multiple regression.
  • Misspecified: consider other RHS variables that
    might provide a better measure.

30
Consider the Jewelry Example
  • Do we get the right sign? I.e., what's the
    relationship between DPI and sales?
  • What is a normal good?
  • What kind of good is jewelry, normal or inferior?
  • What would be the expected sign if we were
    looking at a good we thought was an inferior good?

31
Things to consider with ANY regression
  • If you DO get the expected signs, are the effects
    statistically significant?
  • Do the t-stats indicate a strong relationship?
  • Can you reject the null that the relationship
    (slope) is 0?

32
Things to consider with ANY regression
  • Are the effects economically significant?
  • Even with statistically significant results, a
    very small slope means a very large change in
    the RHS variable is necessary to get any change
    in the LHS.
  • There is no hard-and-fast rule here. It requires
    judgment.

33
Consider the Jewelry Example
  • In the jewelry example, it takes a $250 million
    (or .25 billion) dollar increase in DPI to
    increase (adjusted) jewelry sales by $1 million.
    Is this a lot or a little slope?
  • Let's think of it a little differently:
  • This would be roughly a $1 increase in
    (adjusted) jewelry sales with a $250 increase in
    personal disposable income.
  • Does this pass the sniff test?

34
Things to consider with ANY regression
  • Does the regression explain much?
  • In linear regressions, the fraction of the
    variance in the dependent variable explained
    by the independent variable is measured by the
    R-squared (a.k.a. the Coefficient of Determination).
  • Trend: R-sq = .9933
  • Causal (w/ season): R-sq = .0845
  • Causal (w/o season): R-sq = .8641
  • Although the causal model explains less of the
    variance, we now have some evidence that sales
    are related to DPI.
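
For reference, R-squared is easy to compute from the residuals. A minimal numpy sketch (an illustration, not the slides' ForecastX output):

```python
import numpy as np

def r_squared(y, y_hat):
    """Fraction of the variance in y explained by the fitted values y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)      # unexplained (residual) variation
    ss_tot = np.sum((y - y.mean()) ** 2)   # total variation in y
    return 1.0 - ss_res / ss_tot
```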

35
Another thing to consider about the first model
w/seasonality in it
  • The first model was poorly specified when we were
    using the series with seasonality in it.
  • The de-seasonalized data provides a better fit in
    the simple regression.
  • Why?
  • Well, income is obviously related to sales, but
    so is the month of the year (e.g., Dec.), so we
    need to adjust or account for it:
  • Adjust for seasonality (use a more appropriate
    RHS var), or
  • Account for it in the model (move to multi-var
    and include the season in the regression; to be
    covered next chapter).

36
Question
  • Why would we want to forecast Jewelry sales based
    on a series like DPI?
  • DPI is very close to a linear trend, so we have a
    good idea what it might look like several
    periods from now.

37
Other examples of simple regression models: Cross
section (all at the same time)
  • Car mileage as a function of engine size
  • What do we expect this relationship to be on
    average?
  • Body weight as a function of height
  • What do we expect this relationship to be on
    average?
  • Income as a function of educational attainment
  • What do we expect this relationship to be on
    average?

38
Assumptions of the OLS regression
  • One assumption of the OLS model is that the error
    terms DON'T have any regular patterns.
    Specifically, this means:
  • Errors are independently distributed
  • And, they are normally distributed
  • They have a mean of 0
  • They have a constant variance

39
Errors are independently distributed
  • Errors might not be independently distributed if
    we have Serial Correlation (or Autocorrelation)
  • Serial correlation occurs when one period's error
    is related to another period's error
  • You can have both positive and negative serial
    correlation

40
Negative Serial Correlation
Negative serial correlation occurs when positive
errors are followed by negative errors (or vice
versa).
(Figure: residuals alternating in sign around the regression line, plotted as Y against X.)
41
Positive Serial Correlation
Positive serial correlation occurs when positive
errors tend to be followed by positive errors.
(Figure: runs of same-signed residuals around the regression line, plotted as Y against X.)
42
What does Serial Correlation Cause?
  • The estimates for b are unbiased, but the standard
    errors are underestimated; this means our t-stats
    are overstated.
  • If our t-stats are overstated, then it's possible
    we THINK we have a significant effect for b, when
    we really don't.
  • Additionally, the R-squared and F-stat are both
    unreliable.

43
Durbin-Watson Statistic
  • The Durbin-Watson Statistic is used to test for
    the existence of serial correlation.

DW = Σ(e_t - e_(t-1))² / Σ e_t²
(The numerator sums over t = 2..n; the denominator is the sum of squared errors.)
The Durbin-Watson Statistic ranges from 0 to 4.
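
A minimal Python sketch of that formula (an illustration, not part of the slides):

```python
import numpy as np

def durbin_watson(e):
    """DW = sum_{t=2..n} (e_t - e_(t-1))^2 / sum_{t=1..n} e_t^2."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Independent errors give DW near 2; positively correlated errors push it
# toward 0; alternating (negatively correlated) errors push it toward 4.
print(durbin_watson([1.0, -1.0, 1.0, -1.0, 1.0, -1.0]))  # -> 3.33, near 4
```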
44
Evaluation of the DW Statistic
  • The rule of thumb: if it's near 2 (i.e., from
    1.5 to 2.5), there is no evidence of serial
    correlation present.
  • For more precise evaluation you have to calculate
    and compare 5 inequalities and determine which of
    the 5 is true.

45
(Table: lower and upper critical values of DW, DWL and DWU, by number of observations and number of RHS variables.)
46
Evaluation of the DW Statistic
  • Evaluate each inequality and choose the true region:
  • A: 4 ≥ DW > (4 - DWL) → negative serial correlation
  • B: (4 - DWL) ≥ DW ≥ (4 - DWU) → indeterminate
  • C: (4 - DWU) > DW > DWU → no observed serial correlation
  • D: DWU ≥ DW ≥ DWL → indeterminate
  • E: DWL ≥ DW > 0 → positive serial correlation
47
For Example
  • Suppose we get a DW of 0.21 with 36 observations.
  • From the table: DWL = 1.41, DWU = 1.52.
  • The rest is just filling in and evaluating:

A: 4 ≥ 0.21 > (4 - 1.41) = 2.59   False
B: 2.59 ≥ 0.21 ≥ (4 - 1.52) = 2.48   False
C: 2.48 > 0.21 > 1.52   False
D: 1.52 ≥ 0.21 ≥ 1.41   False
E: 1.41 ≥ 0.21 > 0   True → positive serial correlation
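
The same five-region check is easy to mechanize. A small Python sketch (illustrative only; dw_region is a made-up helper name) that reproduces the slide's example:

```python
def dw_region(dw, dw_l, dw_u):
    """Classify a Durbin-Watson statistic using table values DWL and DWU."""
    if 4 >= dw > 4 - dw_l:
        return "A: negative serial correlation"
    if 4 - dw_l >= dw >= 4 - dw_u:
        return "B: indeterminate"
    if 4 - dw_u > dw > dw_u:
        return "C: no observed serial correlation"
    if dw_u >= dw >= dw_l:
        return "D: indeterminate"
    return "E: positive serial correlation"  # dw_l >= dw > 0

# The slide's example: DW = 0.21 with n = 36, DWL = 1.41, DWU = 1.52
print(dw_region(0.21, 1.41, 1.52))  # -> E: positive serial correlation
```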
48
Durbin-Watson Statistic
49
Errors are Normally Distributed
Each observation's error is normally distributed
around the estimated regression line.
(Figure: errors scattered around the OLS regression line; errors can be +/-, but they are grouped around the line.)
50
When might errors be distributed some other way???
  • One example would be a dependent variable that's
    0/1 or similar (discrete and/or limited).
  • Employed/Unemployed
  • Full-time/Part-time
  • 1 if above a certain value, 0 if not.

51
Errors have a mean of 0
A + error is just as likely as a - error, and they
balance out.
(Figure: positive and negative errors around the OLS regression line.)
52
Variance (or st. dev.) of errors is constant
across values of the RHS variable
(Figure: constant error spread around the OLS regression line across values of X.)
53
What would it look like if variance wasn't
constant?
Here is one specific type of non-constant variance:
the mean is still 0, but the errors get larger as X
gets larger.
(Figure: residuals fanning out around the OLS regression line as X increases.)
This is referred to as heteroscedasticity. Yes,
you heard right: heteroscedasticity. And it's bad
for inference.
54

Looking at it from another angle, errors can be
+ or -, but they should be stable over time or
across X.
55
(Figure: residuals plotted against X.)
56
Heteroscedastic residuals
(Figure: residual spread growing with X.)
57
Heteroscedasticity
  • Can cause the estimated St. Errors (those reported
    by the statistical package) to be smaller than
    the actual St. Errors.
  • This messes up the estimated t-stats: they are
    reported as larger than they actually are. Based
    on the estimated t-stats, we might reject the
    null when we really shouldn't.
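
A tiny simulated illustration (not from the slides) of what heteroscedastic errors look like, with the error spread growing in X:

```python
import numpy as np

# Simulated heteroscedastic errors: the error's st. dev. grows with X
rng = np.random.default_rng(1)
x = np.linspace(1.0, 100.0, 200)
e = rng.normal(0, 0.05 * x)            # spread proportional to x
y = 5.0 + 0.8 * x + e

# Quick check: residual spread in the upper half of X dwarfs the lower half
print(e[:100].std(), e[100:].std())
```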

58
Common causes of Heteroscedasticity
  • Personal hygiene aside, there are several
    potential sources of this problem.
  • Model misspecification
  • Omitting an important variable
  • Improper functional form (there may be
    non-linearity in the relationship between X and Y)

59
Data problems and fixes for the bivariate model
  • Trends: no problem.
  • Adapting the bivariate model to forecast seasonal
    data.
  • You might think the bivariate model is too simple
    to handle seasonality; well, it's simple, but with
    a trick or two, you can extend its capabilities
    quite a bit.

60
Forecasting SA Total Houses Sold: two Bivariate
Regressions (linear trend and DPI)
  • What are the causal factors for house
    purchases?
  • Income
  • Time trend (Inflation)
  • Employment (rate)
  • Interest rates
  • Consumer confidence
  • Price of housing
  • Price of rental units (substitutes)
  • Price of insurance (complements)
  • Property taxes (other costs)

We will focus on these: the time trend and DPI.
61
Steps involved in using the bivariate model for
seasonal data
  • Compute a set of seasonal indices
  • De-seasonalize the data
  • Do your forecast
  • Re-seasonalize the data
  • Calculate your measures of accuracy (a code
    sketch of these five steps follows below)
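
Here is a hedged numpy sketch of those five steps. The seasonal indices below come from a standard ratio-to-centered-moving-average calculation, a common stand-in for (but not necessarily identical to) ForecastX's multiplicative Decomposition model; `sales` is a hypothetical monthly series starting in January:

```python
import numpy as np

def seasonal_indices(y, period=12):
    """Multiplicative seasonal indices via ratio-to-centered-moving-average
    (a stand-in for ForecastX's Decomposition model, not its exact algorithm)."""
    y = np.asarray(y, dtype=float)
    # 2x12 centered moving average estimates the trend-cycle component
    kernel = np.r_[0.5, np.ones(period - 1), 0.5] / period
    cma = np.convolve(y, kernel, mode="valid")            # length n - period
    ratios = y[period // 2 : period // 2 + len(cma)] / cma
    # Average the ratios by position, then normalize so the indices average to 1
    idx = np.array([ratios[m::period].mean() for m in range(period)])
    idx *= period / idx.sum()
    # Shift so idx[0] lines up with the series' first calendar month
    return np.roll(idx, period // 2)

# Hypothetical usage on a monthly series `sales` that starts in January:
# si = seasonal_indices(sales)               # 1. compute seasonal indices
# sa = sales / np.resize(si, len(sales))     # 2. de-seasonalize
# 3. forecast `sa` (e.g., a trend regression or a regression on DPI)
# 4. re-seasonalize: multiply each forecast by its month's index
# 5. compute accuracy measures against the actual, unadjusted data
```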

62
We start with: Un-Adjusted Housing Sales
63
Getting your indices and de-seasonalizing the
data
  • What we need to do is decompose the series to
    get the seasonal index for each month. We could
    use the Winters model to estimate the
    seasonality index, but instead we use the
    Decomposition Model.
  • We estimate the Decomposition Model and choose
    the multiplicative option to get the index (in
    multiplicative form).
  • (We will cover this model later, but for now...
    well, just think of it as magic!!!)

64
Getting the Seasonal Indices
They are repeated down the column next to each
month.
65
Applying the Index to the Data
(Screenshot: the unadjusted and adjusted sales columns side by side.)
66
Seasonally Adjusted Total Housing Sales (SATHS)
There is still a trend, but we aren't worried
about that right now.
67
Using the Adjusted Sales Data (SATHS)
  • Let's now forecast adjusted housing sales as a
    function of time (a 12 month forecast).
  • The equation we are estimating is:
  • SATHS = b0 + b1(Time) + e
  • What do we expect for the sign of b1?

68
Data
  • There are two ways to approach this in ForecastX
  • Use the Linear Regression model without a time
    index, or
  • Use Multiple Regression with both the time
    index and the year and month variable.
  • Both provide essentially the same results.
  • Seasonally Adjusted Total Houses Sold and Time

69
Forecast Using T
70
Regression Results
Re-seasonalize This!!!
71
Re-Seasonalizing
(Chart: actual series and re-seasonalized forecast.)
72
The Trend Forecast
The thing to note here is that the simple linear
model is capturing some of the seasonal
fluctuations... WOW!!!
73
What have we done?!
  • Really, we have simply used a little math to
    incorporate the estimated seasonal variation into
    the bivariate forecast, without actually
    estimating it that way.

74
Now, let's do the same thing with DPI as the RHS
variable instead of T
  • The same steps are involved here.
  • We've already obtained the seasonal indices and
    computed the de-seasonalized data.
  • All we need to do is make the forecast,
    re-seasonalize the data, and calculate our
    measures of accuracy.

75
Data
  • Seasonally Adjusted Total Houses Sold and DPI

76
Re-Seasonalizing
77
Forecast Using DPI
78
What can we say about the bivariate model and
seasonality?
  • It's really easy to forecast when there is a
    trend: that's just the slope b.
  • Although there are a few steps involved, it's not
    terribly difficult to forecast a series that has
    seasonality.
  • So, we can (substantially) do what the Winters
    model can, with the added benefit of being able
    to say why something is happening.
  • We also conserve degrees of freedom.

79
Other problems
  • Serial or Autocorrelation
  • Remember, autocorrelation occurs when adjacent
    observations are correlated; this causes our
    estimated standard errors to be too small, and
    our t-stats to be too big, messing up inference.

80
Causes
  • Long-term cycles or trends
  • Inflation
  • population growth
  • i.e., any outside force that affects both series
  • Misspecification
  • Leaving out an important variable (see above)
  • Failing to include the correct functional form of
    the RHS, i.e., a non-linear term (which may
    necessitate the move to multivariate regression)

81
Considerations
  • We generally aren't concerned with AC or SC if we
    are just estimating the time trend (y = f(t)) in
    OLS.
  • It's mainly when we want to figure out the causal
    relationship between the RHS and the LHS that we
    need to worry about SC or AC.

82
Using what we know
  • We have looked at the DW statistic, how it's
    calculated, and what it measures.
  • Let's look at a bivariate forecast that has a
    couple of problems and see if we can use some of
    the tools we currently have to identify the
    problems and fix them, if possible.

83
Example of Autocorrelation in Action Background
  • Remember, in Macroeconomics a guy named Keynes
    made a couple of observations about aggregate
    consumption:
  • What is not consumed out of current income is
    saved, and
  • Current consumption depends on current income in
    a less than proportionate manner.

84
Keynesian Theory of Consumption
  • Keynes's theories placed emphasis on a parameter
    called the
  • Marginal Propensity to Consume (MPC),
  • which is the slope of the aggregate consumption
    function and a key factor in determining the
    "multiplier effect" of tax and spending policies.

85
What is MPC in everyday terms?
  • MPC can be thought of as the share of each
    additional dollar of income that's spent on
    consumption.
  • The multiplier effect is the economic stimulus
    effect that comes from the spending and
    re-spending of that portion of the dollar.
  • Good taxing and spending policies take into
    account the MPC: a higher MPC means a larger
    multiplier effect.

86
MPC in action
  • From a fiscal policy standpoint we want to:
  • inject money into activities that have a high
    multiplier effect, or
  • provide lower taxes or higher subsidies to people
    who spend all their income (i.e., MPC ≈ 1).
  • Before you say "hey, I don't like this idea"...
    I am talking about YOU!!!

87
OK now, why do we care?
  • Knowledge of the MPC can give us some idea how to
    stimulate the economy in recession or put the
    brakes on an overheated economy with government
    policies.
  • For example, consider the last recession. Think
    of the policies that were used by the feds:
  • Income tax rates were reduced
  • Tax rebate checks (based on dependents)
  • Other federal expenditures increased (?)

88
What were the intended effects of these policies?
  • Anything aimed at increasing disposable income is
    expected to increase C.
  • C + I + G + (X - M) = GNP
  • If the relationship holds, then increasing C has
    what effect on GNP?
  • As C gets larger, GNP is expected to grow.

89
GNP and Consumption
90
The Regression (with issues)
  • We can obtain an estimate of the MPC by applying
    OLS to the following aggregate consumption
    function:
  • GC = b0 + b1(GNP) + e
  • Where the slope, b1, is the estimate of the MPC.
  • The Data
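
For readers without ForecastX, here is a hedged sketch of the same regression in Python with statsmodels. The data below is synthetic (two trending series with deliberately autocorrelated errors), not the slides' actual GC/GNP series:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Synthetic stand-in data: both series trend upward, and the regression
# errors are made AR(1) so the example exhibits positive serial correlation
rng = np.random.default_rng(0)
n = 60
gnp = 5000.0 + 50.0 * np.arange(n) + rng.normal(0, 30, n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.9 * e[t - 1] + rng.normal(0, 15)   # autocorrelated errors
consumption = 300.0 + 0.65 * gnp + e            # "true" MPC here is 0.65

X = sm.add_constant(gnp)             # adds the intercept column (b0)
fit = sm.OLS(consumption, X).fit()

print(fit.params)                    # b0 and b1; b1 is the MPC estimate
print(fit.tvalues)                   # inflated when errors are serially correlated
print(durbin_watson(fit.resid))      # well below 2 -> positive serial correlation
```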

91
Forecast of Consumption
92
Output: Everything looks good, but...
These look pretty good! ...maybe too good.
DW indicates Serial Correlation
93
Things we need to keep in mind
  • Both variables are probably non-stationary. In
    fact, that can be shown using ForecastX's
    Analyze button and creating the correlograms
    for both series (see Chpt 2, p83).
  • And, therefore, they may have a common trend. In
    other words, the regression may be plagued by
    serial correlation.
  • Non-stationarity is not such a big deal in the
    time-trend models, because we aren't trying to
    establish a causal relationship and we have
    models that can deal with it (linear).

94
In OLS
  • In OLS models non-stationarity IS a problem in
    forecasting, because we ARE trying to establish a
    relationship, and
  • if there is a common trend in the LHS and RHS,
    we erroneously attribute the trend's influence to
    the RHS variable.

95
Detecting Serial Correlation and Autocorrelation
Graphically: What is the ACF?
  • We just learned what the DW does, but there are
    graphical ways we can use to spot SC/AC.
  • The ACF measures the autocorrelation with
    observations from each previous period (each lag).
  • If a time series is stationary, the
    autocorrelation should diminish towards 0
    quickly as we go back in time.
  • Note: If a time series has seasonality, the
    autocorrelation is usually highest between like
    seasons in different years.
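
A minimal sketch of how the sample ACF is computed (illustrative; acf is a made-up helper, not a ForecastX function):

```python
import numpy as np

def acf(y, max_lag=24):
    """Sample autocorrelations of y at lags 1..max_lag."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = np.sum(d ** 2)
    return np.array([np.sum(d[k:] * d[:-k]) / denom
                     for k in range(1, max_lag + 1)])

# For a stationary series these die out quickly; a trending series like
# DPI or GNP shows autocorrelations near 1 for many lags.
```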

96
(No Transcript)
97
ACF for both Series
98
Be suspicious of results that look too good
  • The forecaster should be suspicious of the
    results because, in a sense, they are too good.
  • Both the ACF and the DW stat (0.16) indicate that
    in the original two series we likely have strong
    positive serial correlation.

99
A Potential Method for Fixing a Non-Stationary
Series
  • What method can we use to potentially fix a
    non-stationary series? Think back...
  • Right now, we only know about first-differencing
    or de-trending, so let's use that.
  • Just to note: the Holt and Winters models allow
    for trend, but not for RHS variables, so we can't
    use these directly to find the MPC.

100
Spurious Regression First Differences
  • When significant autocorrelation is present,
    spurious regression may arise in that our results
    appear to be highly accurate when in fact they
    are not, since the OLS estimator of the
    regression error variance is biased downward.
  • To investigate this possibility, we will
    re-estimate our consumption function using first
    differences of the original data.
  • This transformation is designed to eliminate any
    common linear trend in the data.
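
Continuing the synthetic Python example from above, first-differencing and re-estimating looks like this (a sketch, not the slides' actual numbers):

```python
import numpy as np

# First differences remove a common linear trend: dY_t = Y_t - Y_(t-1)
d_c = np.diff(consumption)   # period-to-period change in consumption
d_g = np.diff(gnp)           # period-to-period change in GNP

# Re-estimate the slope (the MPC) on the differenced data (closed-form OLS)
b1 = np.sum((d_g - d_g.mean()) * (d_c - d_c.mean())) / np.sum((d_g - d_g.mean()) ** 2)
print(b1)
```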

101
ACF for first-differenced data
It didn't completely take care of it in this
series... but we use it anyway.
102
Scatter Plot of First Differences
There is still a positive relationship in the
differenced data, but it has more error and it's
weaker.
103
Back to the Data
  • Let's now estimate the differenced model.
  • The Data

104
Effects of Serial Correlation on Regression
Results and Inference
These are the more accurate results
105
Summary Serial and Autocorrelation
  • Serial and/or Autocorrelation can have a serious
    effect on your estimates and your inference.
  • We can use both the DW and Correlation
    Coefficients (Correlograms) to detect serial or
    autocorrelation.
  • 1st differencing can often help in purging SC or
    AC from the data.

106
Ungraded Homework for next time
  • Chpt 4: problems 7 and 8