Title: Regression Analysis
1. Regression Analysis
- Regression analysis: a set of statistical techniques that quantify the dependence of a given economic variable on one or more other variables.
- Most common technique: ordinary least squares (OLS) regression.
- STEPS
  - Step 1: Collect data on the variables in question
  - Step 2: Specify the form of the equation relating the variables
  - Step 3: Estimate the equation coefficients
  - Step 4: Evaluate the accuracy of the equation
  - Step 5: Interpret the results in an economic context
2. Regression Analysis
- Cross-sectional data
  - Observations at the same time period in different areas, regions, or markets
  - Example: grade in ECON 3125 as a function of GPA, time studying, attendance, age, junior vs. senior, major, etc.
- Time-series data
  - Observations in the same area or market over different time periods
  - Example: economic growth as a function of income, unemployment rates, capital (K) stock, population, education levels.
3. Plotting the observations suggests a negative relationship between price and quantity (which makes sense). But where do we draw the demand curve?
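4. OLS chooses the estimated equation that minimizes the differences between the actual values of the observations and the values the equation predicts. SSE (sum of squared errors) squares each difference, so positive and negative differences both count and cannot cancel each other out. The best-fitting line is the one with the minimum SSE.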
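As a rough illustration of how SSE ranks candidate lines, here is a minimal Python sketch. The price and quantity observations and the two candidate lines are made-up values, not data from the course, and `sse` is a hypothetical helper name.

```python
# Hypothetical (price, quantity) observations, invented for illustration.
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
quantities = [9.0, 8.5, 6.0, 4.5, 2.0]

def sse(a, b, xs, ys):
    """Sum of squared errors for the candidate line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Two candidate demand lines. Squaring keeps positive and negative
# misses from cancelling, so a smaller SSE means a closer fit.
sse_flat = sse(8.0, -0.5, prices, quantities)
sse_steep = sse(11.0, -1.8, prices, quantities)
print(sse_flat, sse_steep)
```

The steeper line has the smaller SSE here, so of these two candidates it is the better fit. OLS goes one step further and finds the line with the smallest SSE of all.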
5. Regression Analysis
Step 2: Specify the form of the equation relating the variables
Translate the points in the scatter plot into the form of a linear demand equation: $Q = a + bP$. We are looking for values of a (the constant, or intercept) and b (the slope; a negative sign is expected).
Left-hand variable: dependent (the one being explained).
Right-hand variables: independent, or explanatory (the ones doing the explaining).
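For a single explanatory variable, the least-squares values of a and b have a simple closed form. The sketch below assumes the textbook formulas (slope = covariance of P and Q divided by variance of P, intercept from the means). The data are hypothetical and `fit_line` is an illustrative name, not something from the course software.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing the SSE of y = a + b*x (simple OLS)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept
    return a, b

prices = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical data
quantities = [9.0, 8.5, 6.0, 4.5, 2.0]  # hypothetical data
a, b = fit_line(prices, quantities)
print(f"Q = {a:.2f} + ({b:.2f})P")
```

With these made-up numbers the slope comes out negative, as expected for a demand curve.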
6. Regression Analysis
Step 2: Specify the form of the equation relating the variables
Multiple regression: $Q = a + bP + b_1 Y + b_2\,\text{Pop} + b_3 P_2 + b_4\,\text{Age}$
Price is not the only factor that affects Q, so a more precise demand equation must include other explanatory variables. Least squares chooses the coefficients that minimize the unexplained portion of the variation in Q around its average.
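With more than one explanatory variable, the same least-squares idea leads to a small system of linear equations (the normal equations). The following is a minimal pure-Python sketch under stated assumptions: the data are generated from a made-up equation $Q = 50 - 2P + 3Y$ with no noise, and `solve` and `ols` are hypothetical helper names. In practice regression software does this step.

```python
def solve(A, y):
    """Solve A x = y by Gauss-Jordan elimination with partial pivoting.
    Assumes A is square and non-singular."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(rows, ys):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept a
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, ys)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical observations generated exactly from Q = 50 - 2P + 3Y,
# so the fit should recover those coefficients (no noise added).
observations = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5), (6, 4)]  # (P, Y)
quantities = [50 - 2 * p + 3 * y for p, y in observations]
a, b, b1 = ols(observations, quantities)
print(round(a, 4), round(b, 4), round(b1, 4))
```

Because the made-up data contain no noise, the estimates match the generating equation. With real data the coefficients are only estimates, which is why Step 4 evaluates their accuracy.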
7. Regression Analysis
- Step 3: Estimate the equation coefficients (regression software)
  - $Q = 28.84 - 2.12P + 3.09Y + 1.03P_2$
- Example: the data on pg 151 yield this regression equation. Holding the other variables constant:
  - Each 1-unit increase in price will decrease Q by 2.12 units
  - Each 1-unit increase in income will increase Q by 3.09 units (normal or inferior good?)
  - Each 1-unit increase in the competitor's price will increase Q by 1.03 units (substitute or complement?)
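The coefficient interpretations above can be checked by plugging numbers into the estimated equation. In this short sketch the coefficients come from the slide, but the input values for price, income, and competitor's price are hypothetical, chosen only to show the arithmetic. `predicted_quantity` is an illustrative name.

```python
def predicted_quantity(price, income, competitor_price):
    """Estimated equation from the slide: Q = 28.84 - 2.12P + 3.09Y + 1.03P2."""
    return 28.84 - 2.12 * price + 3.09 * income + 1.03 * competitor_price

# Hypothetical input values, chosen only to illustrate the arithmetic.
base = predicted_quantity(10.0, 20.0, 10.0)
higher_price = predicted_quantity(11.0, 20.0, 10.0)

# Raising price by one unit, with income and the competitor's price
# held fixed, changes predicted Q by the price coefficient.
print(round(higher_price - base, 2))
```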
8. Step 4: Evaluate the accuracy of the equation
9. N = number of lines of data (observations) in the dataset. Generally, the larger the dataset, the better the results.
10. Degrees of freedom: $N - K$, where K = number of estimated coefficients. This is the number of values in the data that remain free to vary once the coefficients have been estimated.
11. For example, imagine you have four numbers (a, b, c and d) that must add up to a total of m. You are free to choose the first three numbers at random, but the fourth must be chosen so that it makes the total equal to m. Thus you have three degrees of freedom.
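Both ideas, the $N - K$ count and the four-number example, reduce to simple arithmetic. The sketch below uses hypothetical values (16 observations, 4 coefficients) and assumes K counts every estimated coefficient, including the intercept. The function names are illustrative.

```python
def degrees_of_freedom(n_obs, k_coeffs):
    """N - K: observations minus estimated coefficients.
    Assumption: K includes the intercept."""
    return n_obs - k_coeffs

def fourth_number(a, b, c, m):
    """Once a, b and c are chosen freely, d is forced by a + b + c + d = m."""
    return m - (a + b + c)

# Hypothetical example: 16 observations and 4 estimated coefficients.
print(degrees_of_freedom(16, 4))

# Three numbers chosen freely; the fourth is determined by the total m = 20.
print(fourth_number(2, 5, 1, 20))
```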
12. Explanatory variable names
13. Coefficient estimates for the equation.
14. $R^2$: also called goodness of fit, or the coefficient of determination
15. Proportion of the variation in Q explained by the regression. A perfect fit would yield $R^2 = 1$. If the equation explains nothing, $R^2 = 0$.
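16. The value of $R^2$ is sensitive to K: adding more independent variables typically increases $R^2$, even if they have no explanatory power.
17. Adjusted $R^2$: regular $R^2$ adjusted for degrees of freedom. Corrects for the sensitivity to K, so it is a better measure of goodness of fit when comparing equations with different numbers of variables. With more than one estimated coefficient it is less than $R^2$ (unless the fit is perfect).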
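A short sketch of how these two measures can be computed. It assumes the standard formulas $R^2 = 1 - SSE/SST$ and adjusted $R^2 = 1 - (1 - R^2)(N - 1)/(N - K)$, with K counting the intercept. The actual and predicted values are hypothetical, and the function names are illustrative.

```python
def r_squared(actual, predicted):
    """Share of the variation in the dependent variable explained by the fit."""
    mean = sum(actual) / len(actual)
    sse = sum((y - p) ** 2 for y, p in zip(actual, predicted))  # unexplained
    sst = sum((y - mean) ** 2 for y in actual)                  # total
    return 1.0 - sse / sst

def adjusted_r_squared(r2, n_obs, k_coeffs):
    """R-squared adjusted for degrees of freedom.
    Assumption: K counts every estimated coefficient, intercept included."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - k_coeffs)

# Hypothetical actual quantities and the values predicted by a fitted line.
actual = [9.0, 8.5, 6.0, 4.5, 2.0]
predicted = [9.6, 7.8, 6.0, 4.2, 2.4]

r2 = r_squared(actual, predicted)
adj_r2 = adjusted_r_squared(r2, n_obs=5, k_coeffs=2)
print(round(r2, 4), round(adj_r2, 4))
```

The adjusted value comes out below the regular one because some degrees of freedom were used up estimating the coefficients.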
18. F-statistic: tests the overall statistical significance of the equation, that is, the group of explanatory variables taken together rather than each variable individually. Must be compared to the critical value in an F table (the higher the F-stat, the better).
19. If the F-stat is greater than the critical value, we can reject the hypothesis that the coefficients are all zero at the specified confidence level (here, 95%) and say the equation has explanatory power.
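One common way to obtain the F-statistic is from $R^2$, N, and K. The sketch below assumes the standard relationship $F = \frac{R^2/(K-1)}{(1-R^2)/(N-K)}$, with K counting the intercept. The inputs ($R^2 = 0.78$, N = 16, K = 4) are hypothetical, and the 95% critical value of 3.49 for (3, 12) degrees of freedom is read from a standard F table.

```python
def f_statistic(r2, n_obs, k_coeffs):
    """F-stat from R-squared: explained variation per explanatory variable
    divided by unexplained variation per degree of freedom."""
    explained = r2 / (k_coeffs - 1)
    unexplained = (1.0 - r2) / (n_obs - k_coeffs)
    return explained / unexplained

# Hypothetical regression summary values.
f_stat = f_statistic(0.78, 16, 4)

# 95% critical value for (3, 12) degrees of freedom, from an F table.
critical_value = 3.49

# True means we reject the hypothesis that all slope coefficients are zero.
print(f_stat > critical_value)
```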
21. Standard Error (Standard Deviation) of the Coefficient
There is roughly a 95% chance that the true coefficient lies within two standard errors of the estimated coefficient.
Example: our price coefficient estimate is -2.12, with a standard error of 0.34. Two times the standard error is 0.68. There is roughly a 95% chance that the true coefficient lies in the range of -2.12 plus or minus 0.68, or between -2.80 and -1.44.
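The interval arithmetic in this example can be written as a few lines of Python. The estimate and standard error are the ones on the slide; `two_se_interval` is an illustrative name, and the "two standard errors" rule is the slide's approximation rather than an exact critical value.

```python
def two_se_interval(estimate, std_error):
    """Approximate 95% range: estimate plus or minus two standard errors."""
    margin = 2 * std_error
    return estimate - margin, estimate + margin

# Price coefficient and standard error from the slide.
low, high = two_se_interval(-2.12, 0.34)
print(f"{low:.2f} to {high:.2f}")
```

The whole range lies below zero, which is consistent with a negative effect of price on quantity.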
22. Standard error: the standard deviation of the estimated coefficient. The lower the standard error, the more precise the estimate.
23. t-statistic: the coefficient estimate divided by its standard error. Tells us how many standard errors the estimate is from zero. Compared to the critical value in a t table.
24. The t-stat tells us whether the estimated coefficient is statistically significant, that is, statistically different from zero.
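As a worked example, the t-stat for the price coefficient can be computed from the estimate (-2.12) and standard error (0.34) given earlier. The critical value of 2.0 used below is a rule-of-thumb assumption for the 95% level. The exact value depends on the degrees of freedom and should be read from a t table.

```python
def t_statistic(estimate, std_error):
    """How many standard errors the estimate lies from zero."""
    return estimate / std_error

# Price coefficient and standard error from the earlier slide.
t_stat = t_statistic(-2.12, 0.34)

# Rule-of-thumb 95% cutoff (assumption); the exact value depends on N - K.
critical_value = 2.0

# Compare the absolute value, since the sign only shows the direction.
significant = abs(t_stat) > critical_value
print(round(t_stat, 2), significant)
```

The estimate is more than six standard errors from zero, so under this cutoff the price coefficient is statistically significant.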