Basic Econometrics - PowerPoint PPT Presentation

Slides: 157
Provided by: vern2

1
Basic Econometrics
  • Course Leader
  • Prof. Dr.Sc VuThieu

2
Basic Econometrics
  • Introduction
  • What is Econometrics?

3
Introduction What is Econometrics?
  • Definition 1 Economic Measurement
  • Definition 2 Application of the mathematical
    statistics to economic data in order to lend
    empirical support to the economic mathematical
    models and obtain numerical results (Gerhard
    Tintner, 1968)

4
Introduction What is Econometrics?
  • Definition 3 The quantitative analysis of
    actual economic phenomena based on concurrent
    development of theory and observation, related by
    appropriate methods of inference
  • (P.A.Samuelson, T.C.Koopmans and J.R.N.Stone,
    1954)

5
Introduction What is Econometrics?
  • Definition 4 The social science
  • which applies economics, mathematics and
    statistical inference to the analysis of economic
    phenomena (By Arthur S. Goldberger, 1964)
  • Definition 5 The empirical determination of
    economic laws (By H. Theil, 1971)

6
Introduction What is Econometrics?
  • Definition 6 A conjunction of economic theory
    and actual measurements, using the theory and
    technique of statistical inference as a bridge
    pier (By T.Haavelmo, 1944)
  • And the others

7
[Diagram: Econometrics shown in relation to Economic Theory, Mathematical Economics, Economic Statistics, and Mathematical Statistics]
8
Introduction Why a separate discipline?
  • Economic theory makes statements that are mostly
    qualitative in nature, while econometrics gives
    empirical content to most economic theory
  • Mathematical economics expresses economic
    theory in mathematical form without empirical
    verification of the theory, while econometrics is
    mainly interested in the latter

9
Introduction Why a separate discipline?
  • Economic statistics is mainly concerned with
    collecting, processing and presenting economic
    data. It is not concerned with using the
    collected data to test economic theories
  • Mathematical statistics provides many of the tools
    for economic studies, but econometrics supplements
    them with many special methods of
    quantitative analysis based on economic data

11
Introduction Methodology of Econometrics
  • (1) Statement of theory or hypothesis
  • Keynes stated that consumption increases as income
    increases, but not by as much as the increase in
    income. That is, the marginal propensity
    to consume (MPC) for a unit change in income is
    greater than zero but less than unity

12
Introduction Methodology of Econometrics
  • (2) Specification of the mathematical model of
    the theory
  • $Y = \beta_1 + \beta_2 X, \quad 0 < \beta_2 < 1$
  • Y = consumption expenditure
  • X = income
  • $\beta_1$ and $\beta_2$ are parameters: $\beta_1$ is the
    intercept and $\beta_2$ is the slope coefficient

13
Introduction Methodology of Econometrics
  • (3) Specification of the econometric model of the
    theory
  • $Y = \beta_1 + \beta_2 X + u, \quad 0 < \beta_2 < 1$
  • Y = consumption expenditure
  • X = income
  • $\beta_1$ and $\beta_2$ are parameters: $\beta_1$ is the intercept and
    $\beta_2$ is the slope coefficient. $u$ is the disturbance term or
    error term. It is a random or stochastic variable

14
Introduction Methodology of Econometrics
  • (4) Obtaining Data
  • (See Table 1.1, page 6)
  • Y = personal consumption expenditure
  • X = Gross Domestic Product
  • Both in billions of US dollars

15
Introduction Methodology of Econometrics
  • (4) Obtaining Data

16
Introduction Methodology of Econometrics
  • (5) Estimating the Econometric Model
  • $\hat{Y} = -231.8 + 0.7194X$ (1.3.3)
  • The MPC was about 0.72: over the sample period,
    an increase of 1 USD in real income led, on
    average, to an increase of about 72 cents in
    real consumption expenditure
  • Note: A hat symbol (^) above a variable
    signifies an estimator of the relevant
    population value

17
Introduction Methodology of Econometrics
  • (6) Hypothesis Testing
  • Do the estimates accord with the
    expectations of the theory that is being
    tested? Is MPC < 1 statistically? If so,
    it may support Keynes' theory.
  • Confirmation or refutation of
    economic theories based on
    sample evidence is the object of statistical
    inference (hypothesis testing)

18
Introduction Methodology of Econometrics
  • (7) Forecasting or Prediction
  • Given future value(s) of X, what are the
    future value(s) of Y?
  • If GDP = 6000 billion in 1994, what is the forecast
    consumption expenditure?
  • $\hat{Y} = -231.8 + 0.7194(6000) = 4084.6$
  • Income multiplier: $M = 1/(1 - \mathrm{MPC}) = 3.57$. A
    decrease (increase) of 1 USD in investment will
    eventually lead to a 3.57 USD decrease (increase) in
    income

19
Introduction Methodology of Econometrics
  • (8) Using the model for control or
    policy purposes
  • $Y = 4000 = -231.8 + 0.7194X \Rightarrow X \approx 5882$
  • With MPC = 0.72, an income of 5882 billion
    will produce an expenditure of 4000
    billion. Through fiscal and monetary policy, the
    government can manipulate the
    control variable X to get the desired
    level of the target variable Y
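
The arithmetic of steps (7) and (8) can be checked with a few lines of Python. This is a minimal sketch that takes the estimated coefficients from equation (1.3.3) as given; the function names are illustrative and do not come from the slides.

```python
# Estimated consumption function from (1.3.3): Y_hat = -231.8 + 0.7194 * X
INTERCEPT = -231.8
SLOPE = 0.7194  # estimated marginal propensity to consume (MPC)


def forecast_consumption(gdp):
    """Point forecast of consumption expenditure for a given GDP (billions of USD)."""
    return INTERCEPT + SLOPE * gdp


def income_multiplier(mpc):
    """Income multiplier M = 1 / (1 - MPC)."""
    return 1.0 / (1.0 - mpc)


def required_income(target_consumption):
    """Invert the fitted line: the X that produces a target level of Y."""
    return (target_consumption - INTERCEPT) / SLOPE


forecast = forecast_consumption(6000)   # step (7): forecast for GDP = 6000
multiplier = income_multiplier(0.72)    # the slides round the MPC to 0.72
needed_gdp = required_income(4000)      # step (8): income needed for Y = 4000

print(round(forecast, 1), round(multiplier, 2), round(needed_gdp))
```

The three printed values should agree with the 4084.6, 3.57 and 5882 quoted on the slides, up to rounding.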

20
Introduction Methodology of Econometrics
  • Figure 1.4 Anatomy of economic modelling
  • 1) Economic Theory
  • 2) Mathematical Model of Theory
  • 3) Econometric Model of Theory
  • 4) Data
  • 5) Estimation of Econometric Model
  • 6) Hypothesis Testing
  • 7) Forecasting or Prediction
  • 8) Using the Model for control or policy purposes

21
[Flowchart of the steps: Economic Theory, Mathematical Model, Econometric Model, Data Collection, Estimation, Hypothesis Testing, then Forecasting and Application in control or policy studies]
22
Basic Econometrics
  • Chapter 1
  • THE NATURE OF REGRESSION ANALYSIS

23
1-1. Historical origin of the term Regression
  • The term REGRESSION was introduced by Francis
    Galton
  • Tendency for tall parents to have tall children
    and for short parents to have short children, but
    the average height of children born from parents
    of a given height tended to move (or regress)
    toward the average height in the population as a
    whole (F. Galton, Family Likeness in Stature)

24
1-1. Historical origin of the term Regression
  • Galton's law was confirmed by Karl Pearson: the
    average height of sons of a group of tall fathers
    was less than their fathers' height, and the average height
    of sons of a group of short fathers was greater than their
    fathers' height, thus "regressing" tall and
    short sons alike toward the average height of all
    men (K. Pearson and A. Lee, On the law of
    Inheritance)
  • In Galton's words, this was "regression to
    mediocrity"

25
1-2. Modern Interpretation of Regression Analysis
  • The modern interpretation of regression:
    regression analysis is concerned with the study
    of the dependence of one variable (the dependent
    variable) on one or more other variables (the
    explanatory variables), with a view to estimating
    and/or predicting the (population) mean or
    average value of the former in terms of the known
    or fixed (in repeated sampling) values of the
    latter.
  • Examples (pages 16-19)

26
Dependent Variable Y and Explanatory Variable(s) X
  • 1. Y = son's height; X = father's height
  • 2. Y = height of boys; X = age of boys
  • 3. Y = personal consumption expenditure;
    X = personal disposable income
  • 4. Y = demand; X = price
  • 5. Y = rate of change of wages;
    X = unemployment rate
  • 6. Y = money/income; X = inflation rate
  • 7. Y = change in demand; X = change in the
    advertising budget
  • 8. Y = crop yield; Xs = temperature, rainfall,
    sunshine, fertilizer

27
1-3. Statistical vs.Deterministic Relationships
  • In regression analysis we are concerned with
    STATISTICAL DEPENDENCE among variables (not
    functional or deterministic dependence). We essentially deal
    with RANDOM or STOCHASTIC variables, that is, variables
    that have probability distributions

28
1-4. Regression vs. Causation
  • Regression does not necessarily imply
    causation. A statistical relationship cannot
    logically imply causation. "A statistical
    relationship, however strong and however
    suggestive, can never establish causal
    connection: our ideas of causation must come from
    outside statistics, ultimately from some theory
    or other" (M.G. Kendall and A. Stuart, The
    Advanced Theory of Statistics)

29
1-5. Regression vs. Correlation
  • Correlation analysis: the primary objective is to
    measure the strength or degree of linear
    association between two variables (both are
    assumed to be random)
  • Regression analysis: we try to estimate or
    predict the average value of one variable
    (dependent, and assumed to be stochastic) on the
    basis of the fixed values of other variables
    (independent, and non-stochastic)

30
1-6. Terminology and Notation
  • Terms used interchangeably for Y: Dependent
    variable / Explained variable / Predictand /
    Regressand / Response / Endogenous
  • Terms used interchangeably for X: Explanatory
    variable(s) / Independent variable(s) /
    Predictor(s) / Regressor(s) / Stimulus or control
    variable(s) / Exogenous

31
1-7. The Nature and Sources of Data for
Econometric Analysis
  • 1) Types of Data
  • Time series data
  • Cross-sectional data
  • Pooled data
  • 2) The Sources of Data
  • 3) The Accuracy of Data

32
1-8. Summary and Conclusions
  • 1) The key idea behind regression analysis is
    the statistical dependence of one variable on one
    or more other variable(s)
  • 2) The objective of regression analysis is to
    estimate and/or predict the mean or average value
    of the dependent variable on the basis of known (or
    fixed) values of the explanatory variable(s)

33
1-8. Summary and Conclusions
  • 3) The success of regression depends on the
    availability of appropriate data
  • 4) The researcher should clearly state the
    sources of the data used in the analysis, their
    definitions, their methods of collection, any
    gaps or omissions and any revisions in the data

34
Basic Econometrics
  • Chapter 2
  • TWO-VARIABLE REGRESSION ANALYSIS: Some Basic
    Ideas

35
2-1. A Hypothetical Example
  • Total population: 60 families
  • Y = weekly family consumption expenditure
  • X = weekly disposable family income
  • The 60 families were divided into 10 groups of
    approximately the same income level
    (80, 100, 120, 140, 160, 180, 200, 220, 240,
    260)

36
2-1. A Hypothetical Example
  • Table 2-1 gives the conditional distribution
    of Y for the given values of X
  • Table 2-2 gives the conditional probabilities of
    Y: $p(Y \mid X)$
  • Conditional mean
    (or expectation): $E(Y \mid X = X_i)$

37
Table 2-2: Weekly family income X ($) and
consumption Y ($)
38
2-1. A Hypothetical Example
  • Figure 2-1 shows the population regression line
    (curve). It is the regression of Y on X
  • The population regression curve is the
    locus of the conditional means or expectations
    of the dependent variable
    for the fixed values of the explanatory variable
    X (Fig. 2-2)

39
2-2. The concepts of population
regression function (PRF)
  • $E(Y \mid X = X_i) = f(X_i)$ is the Population Regression
    Function (PRF) or
    Population Regression (PR)
  • In the case of a linear function we have the linear
    population regression function (or equation or
    model)
  • $E(Y \mid X = X_i) = f(X_i) = \beta_1 + \beta_2 X_i$

40
2-2. The concepts of population
regression function (PRF)
  • $E(Y \mid X = X_i) = f(X_i) = \beta_1 + \beta_2 X_i$
  • $\beta_1$ and $\beta_2$ are regression coefficients: $\beta_1$ is the
    intercept and $\beta_2$ is the slope coefficient
  • Linearity in the Variables
  • Linearity in the Parameters

41
2-4. Stochastic Specification of PRF
  • $u_i = Y_i - E(Y \mid X = X_i)$ or $Y_i = E(Y \mid X = X_i) + u_i$
  • $u_i$ is the stochastic disturbance or stochastic error
    term. It is the nonsystematic component
  • The component $E(Y \mid X = X_i)$ is systematic or
    deterministic. It is the mean consumption
    expenditure of all the families with the same
    level of income
  • The assumption that the regression line passes
    through the conditional means of Y implies that
    $E(u_i \mid X_i) = 0$

42
2-5. The Significance of the Stochastic
Disturbance Term
  • The stochastic disturbance term $u_i$ is a surrogate
    for all variables that are omitted from the model
    but that collectively affect Y
  • There are many reasons for not including such
    variables in the model, as follows

43
2-5. The Significance of the Stochastic
Disturbance Term
  • Why not include as many variables as possible in the
    model? (That is, the reasons for using $u_i$)
  • Vagueness of theory
  • Unavailability of Data
  • Core Variables vs. Peripheral Variables
  • Intrinsic randomness in human behavior
  • Poor proxy variables
  • Principle of parsimony
  • Wrong functional form

44
2-6. The Sample Regression Function (SRF)
  • Table 2-4: A random sample from the
    population

      Y      X
    ------------
      70     80
      65    100
      90    120
      95    140
     110    160
     115    180
     120    200
     140    220
     155    240
     150    260
    ------------

  • Table 2-5: Another random sample from the
    population

      Y      X
    ------------
      55     80
      88    100
      90    120
      80    140
     118    160
     120    180
     145    200
     135    220
     145    240
     175    260
    ------------

45
[Figure: sample regression lines SRF1 and SRF2; vertical axis Weekly Consumption Expenditure (Y), horizontal axis Weekly Income (X)]
46
2-6. The Sample Regression Function (SRF)
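  • Fig. 2-3: SRF1 and SRF2
  • $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$ (2.6.1)
  • $\hat{Y}_i$ = estimator of $E(Y \mid X_i)$
  • $\hat{\beta}_1$ = estimator of $\beta_1$
  • $\hat{\beta}_2$ = estimator of $\beta_2$
  • Estimate: a particular numerical value obtained
    by the estimator in an application
  • SRF in stochastic form: $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$
  • or $Y_i = \hat{Y}_i + \hat{u}_i$ (2.6.3)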
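
To make the idea of two different sample regression functions concrete, the sketch below fits a straight line to each of the two samples in Tables 2-4 and 2-5. It uses the least-squares formulas that Chapter 3 of the deck introduces; the helper name `fit_line` is an assumption made for illustration.

```python
# Two random samples drawn from the same population (Tables 2-4 and 2-5)
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y1 = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]   # Table 2-4
Y2 = [55, 88, 90, 80, 118, 120, 145, 135, 145, 175]   # Table 2-5


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b1_srf1, b2_srf1 = fit_line(X, Y1)   # SRF1
b1_srf2, b2_srf2 = fit_line(X, Y2)   # SRF2

print(f"SRF1: Y_hat = {b1_srf1:.4f} + {b2_srf1:.4f} X")
print(f"SRF2: Y_hat = {b1_srf2:.4f} + {b2_srf2:.4f} X")
```

The two fitted lines differ even though both samples come from the same population, which is why each SRF is only an estimate of the single underlying PRF.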

47
2-6. The Sample Regression Function
(SRF)
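  • The primary objective in regression analysis is to
    estimate the PRF $Y_i = \beta_1 + \beta_2 X_i + u_i$ on the basis
    of the SRF $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + e_i$ (where $e_i$ is
    the residual), and to construct the SRF so that $\hat{\beta}_1$ is
    as close as possible to $\beta_1$ and $\hat{\beta}_2$ is as close
    as possible to $\beta_2$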

48
2-6. The Sample Regression Function (SRF)
  • Population Regression Function PRF
  • Linearity in the parameters
  • Stochastic PRF
  • Stochastic Disturbance Term ui plays a critical
    role in estimating the PRF
  • Sample of observations from population
  • Stochastic Sample Regression Function SRF used to
    estimate the PRF

49
2-7. Summary and Conclusions
  • The key concept underlying regression analysis is
    the concept of the population regression function
    (PRF).
  • This book deals with linear PRFs, that is, PRFs linear in the
    unknown parameters. They may or may not be linear in
    the variables.

50
2-7. Summary and Conclusions
  • For empirical purposes, it is the stochastic PRF
    that matters. The stochastic disturbance term ui
    plays a critical role in estimating the PRF.
  • The PRF is an idealized concept, since in
    practice one rarely has access to the entire
    population of interest. Generally, one has a
    sample of observations from the population and uses
    the stochastic sample regression function (SRF) to
    estimate the PRF.

51
Basic Econometrics
  • Chapter 3
  • TWO-VARIABLE REGRESSION MODEL
  • The problem of Estimation

52
3-1. The method of ordinary least squares (OLS)
  • Least-squares criterion
  • Minimize $\sum \hat{u}_i^2 = \sum (Y_i - \hat{Y}_i)^2
    = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2$
    (3.1.2)
  • Form the normal equations and solve them for $\hat{\beta}_1$ and $\hat{\beta}_2$,
    the least-squares estimators. See (3.1.6) and (3.1.7)
  • Numerical and statistical properties of OLS are
    as follows
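
The slide cites the normal equations and their solution without writing them out. The following is the standard derivation for the two-variable model, given here as a sketch; the notation $x_i = X_i - \bar{X}$, $y_i = Y_i - \bar{Y}$ for deviations from sample means is assumed, and the equation numbers of the textbook are not reproduced.

```latex
% First-order conditions for minimizing the residual sum of squares
\frac{\partial \sum \hat{u}_i^2}{\partial \hat{\beta}_1}
  = -2 \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) = 0,
\qquad
\frac{\partial \sum \hat{u}_i^2}{\partial \hat{\beta}_2}
  = -2 \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) X_i = 0

% Normal equations
\sum Y_i = n \hat{\beta}_1 + \hat{\beta}_2 \sum X_i,
\qquad
\sum X_i Y_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2

% Least-squares estimators
\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2},
\qquad
\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}
```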

53
3-1. The method of ordinary least squares (OLS)
  • OLS estimators are expressed solely in terms of
    observable quantities. They are point estimators
  • The sample regression line passes through the sample
    means of X and Y
  • The mean value of the estimated Y is equal to
    the mean value of the actual Y: $\bar{\hat{Y}} = \bar{Y}$
  • The mean value of the residuals $\hat{u}_i$ is zero:
    $\sum \hat{u}_i / n = 0$
  • The residuals $\hat{u}_i$ are uncorrelated with the predicted $\hat{Y}_i$ and
    with $X_i$, that is, $\sum \hat{u}_i \hat{Y}_i = 0$ and $\sum \hat{u}_i X_i = 0$
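
These numerical properties can be verified directly on a small data set. The sketch below uses the sample from Table 2-4 and the closed-form OLS estimators; all variable names are illustrative.

```python
# Sample from Table 2-4
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)

x_bar, y_bar = sum(X) / n, sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
b2 = sxy / sxx                 # slope estimate
b1 = y_bar - b2 * x_bar        # intercept estimate

fitted = [b1 + b2 * x for x in X]
residuals = [y - f for y, f in zip(Y, fitted)]

line_at_mean = b1 + b2 * x_bar                       # line passes through (x_bar, y_bar)
mean_fitted = sum(fitted) / n                        # mean of fitted values equals y_bar
sum_resid = sum(residuals)                           # residuals sum to zero
sum_resid_x = sum(u * x for u, x in zip(residuals, X))            # orthogonal to X
sum_resid_fitted = sum(u * f for u, f in zip(residuals, fitted))  # orthogonal to fitted Y

print(line_at_mean, mean_fitted, y_bar)
print(sum_resid, sum_resid_x, sum_resid_fitted)
```

The last three sums are zero only up to floating-point rounding, so expect values on the order of 1e-12 rather than exact zeros.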

54
3-2. The assumptions underlying the method of
least squares
  • Ass 1: Linear regression model
    (linear in the parameters)
  • Ass 2: X values are fixed in repeated
    sampling
  • Ass 3: Zero mean value of $u_i$: $E(u_i \mid X_i) = 0$
  • Ass 4: Homoscedasticity or equal
    variance of $u_i$: $\mathrm{Var}(u_i \mid X_i) = \sigma^2$
    (vs. heteroscedasticity)
  • Ass 5: No autocorrelation between the
    disturbances: $\mathrm{Cov}(u_i, u_j \mid X_i, X_j) = 0$
    for $i \neq j$ (vs. positive or negative correlation)

55
3-2. The assumptions underlying the method of
least squares
  • Ass 6: Zero covariance between $u_i$ and $X_i$:
    $\mathrm{Cov}(u_i, X_i) = E(u_i X_i) = 0$
  • Ass 7: The number of observations n must be
    greater than the number of parameters
    to be estimated
  • Ass 8: Variability in X values. They must
    not all be the same
  • Ass 9: The regression model is correctly
    specified
  • Ass 10: There is no perfect multicollinearity
    between the Xs

56
3-3. Precision or standard errors of
least-squares estimates
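  • In statistics the precision of an
    estimate is measured by its standard
    error (se)
  • $\mathrm{var}(\hat{\beta}_2) = \sigma^2 / \sum x_i^2$ (3.3.1)
  • $\mathrm{se}(\hat{\beta}_2) = \sqrt{\mathrm{var}(\hat{\beta}_2)}$ (3.3.2)
  • $\mathrm{var}(\hat{\beta}_1) = \sigma^2 \sum X_i^2 / (n \sum x_i^2)$ (3.3.3)
  • $\mathrm{se}(\hat{\beta}_1) = \sqrt{\mathrm{var}(\hat{\beta}_1)}$ (3.3.4)
  • $\hat{\sigma}^2 = \sum \hat{u}_i^2 / (n - 2)$ (3.3.5)
  • $\hat{\sigma} = \sqrt{\hat{\sigma}^2}$ is the standard error of the
    estimate
  • Here $x_i = X_i - \bar{X}$ denotes the deviation of $X_i$ from its sample mean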
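
As a worked instance of formulas (3.3.1) to (3.3.5), the sketch below computes the variance estimate and the two standard errors for the Table 2-4 sample. The unknown variance of the disturbances is replaced by its estimator from (3.3.5); variable names are illustrative.

```python
import math

# Sample from Table 2-4
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)

x_bar, y_bar = sum(X) / n, sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)             # sum of squared deviations of X
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
b2 = sxy / sxx
b1 = y_bar - b2 * x_bar

rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))   # residual sum of squares
sigma2_hat = rss / (n - 2)                                # (3.3.5)

var_b2 = sigma2_hat / sxx                                 # (3.3.1)
var_b1 = sigma2_hat * sum(x ** 2 for x in X) / (n * sxx)  # (3.3.3)
se_b2 = math.sqrt(var_b2)                                 # (3.3.2)
se_b1 = math.sqrt(var_b1)                                 # (3.3.4)

print(round(sigma2_hat, 4), round(se_b1, 4), round(se_b2, 4))
```

The standard errors should come out close to the 6.4138 and 0.0357 that the deck reports for this regression in section 5-11.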

57
3-3. Precision or standard errors of
least-squares estimates
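  • Features of the variance
  • $\mathrm{var}(\hat{\beta}_2)$ is proportional to $\sigma^2$ and inversely
    proportional to $\sum x_i^2$
  • $\mathrm{var}(\hat{\beta}_1)$ is proportional to $\sigma^2$ and $\sum X_i^2$ but
    inversely proportional to $\sum x_i^2$ and the sample
    size n.
  • $\mathrm{cov}(\hat{\beta}_1, \hat{\beta}_2) = -\bar{X}\,\mathrm{var}(\hat{\beta}_2)$ shows the
    dependence between $\hat{\beta}_1$ and $\hat{\beta}_2$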

58
3-4. Properties of least-squares estimators The
Gauss-Markov Theorem
  • An OLS estimator is said to be BLUE (best linear
    unbiased estimator) if
  • It is linear, that is, a linear function of a
    random variable, such as the dependent variable Y
    in the regression model
  • It is unbiased, that is, its average or
    expected value, $E(\hat{\beta}_2)$, is equal to the true
    value $\beta_2$
  • It has minimum variance in the class of all
    such linear unbiased estimators
  • An unbiased estimator with the least variance is
    known as an efficient estimator

59
3-4. Properties of least-squares estimators The
Gauss-Markov Theorem
  • Gauss- Markov Theorem
  • Given the assumptions of the classical linear
    regression model, the least-squares estimators,
    in the class of unbiased linear estimators, have
    minimum variance, that is, they are BLUE

60
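3-5. The coefficient of determination $r^2$: A
measure of Goodness of fit
  • $Y_i = \hat{Y}_i + \hat{u}_i$ or
  • $Y_i - \bar{Y} = \hat{Y}_i - \bar{Y} + \hat{u}_i$ or
  • $y_i = \hat{y}_i + \hat{u}_i$ (Note: lowercase letters denote
    deviations from sample means, and $\bar{\hat{Y}} = \bar{Y}$)
  • Squaring both sides and summing gives
  • $\sum y_i^2 = \hat{\beta}_2^2 \sum x_i^2 + \sum \hat{u}_i^2$ or
  • TSS = ESS + RSS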

61
3-5. The coefficient of determination $r^2$: A
measure of Goodness of fit
  • TSS = $\sum y_i^2$: Total Sum of Squares
  • ESS = $\sum \hat{y}_i^2 = \hat{\beta}_2^2 \sum x_i^2$:
    Explained Sum of Squares
  • RSS = $\sum \hat{u}_i^2$: Residual Sum of
    Squares
  • $1 = \dfrac{ESS}{TSS} + \dfrac{RSS}{TSS}$, so
  • $1 = r^2 + \dfrac{RSS}{TSS}$ or
    $r^2 = 1 - \dfrac{RSS}{TSS}$

62
3-5. The coefficient of determination $r^2$: A
measure of Goodness of fit
  • $r^2 = ESS/TSS$
    is the coefficient of determination. It measures the
    proportion or percentage of the total variation
    in Y explained by the regression
    model
  • $0 \le r^2 \le 1$
  • $r = \pm\sqrt{r^2}$ is the sample correlation coefficient
  • Some properties of r
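
The decomposition TSS = ESS + RSS and the resulting $r^2$ can be computed for the Table 2-4 sample as follows. This is a sketch with illustrative variable names; the sign of r is taken from the sign of the estimated slope.

```python
import math

# Sample from Table 2-4
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)

x_bar, y_bar = sum(X) / n, sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
b2 = sxy / sxx
b1 = y_bar - b2 * x_bar

tss = sum((y - y_bar) ** 2 for y in Y)                     # total sum of squares
ess = b2 ** 2 * sxx                                        # explained sum of squares
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))    # residual sum of squares

r2 = ess / tss
r = math.copysign(math.sqrt(r2), b2)    # r carries the sign of the slope

print(round(tss, 2), round(ess, 2), round(rss, 2), round(r2, 4), round(r, 4))
```

The value of $r^2$ obtained this way should match the 0.9621 reported for this regression in section 5-11.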

63
3-5. The coefficient of determination $r^2$: A
measure of Goodness of fit
  • 3-6. A numerical Example (pages 80-83)
  • 3-7. Illustrative Examples (pages 83-85)
  • 3-8. Coffee demand Function
  • 3-9. Monte Carlo Experiments (page 85)
  • 3-10. Summary and conclusions (pages 86-87)

64
Basic Econometrics
  • Chapter 4
  • THE NORMALITY ASSUMPTION
  • Classical Normal Linear
  • Regression Model
  • (CNLRM)

65
4-2. The normality assumption
  • The CNLRM assumes that each $u_i$ is distributed
    normally, $u_i \sim N(0, \sigma^2)$, with
  • Mean: $E(u_i) = 0$ (Ass 3)
  • Variance: $E(u_i^2) = \sigma^2$ (Ass 4)
  • $\mathrm{Cov}(u_i, u_j) = E(u_i u_j) = 0$
    for $i \neq j$ (Ass 5)
  • Note: For two normally distributed variables,
    zero covariance or correlation means independence,
    so $u_i$ and $u_j$ are not only uncorrelated
    but also independently distributed. Therefore
    $u_i \sim NID(0, \sigma^2)$, where NID stands for Normally and
    Independently Distributed

66
4-2. The normality assumption
  • Why the normality assumption?
  • With a few exceptions, the distribution of the sum of
    a large number of independent and identically
    distributed random variables tends to a normal
    distribution as the number of such variables
    increases indefinitely
  • Even if the number of variables is not very large or
    they are not strictly independent, their sum may
    still be normally distributed

67
4-2. The normality assumption
  • Why the normality assumption?
  • Under the normality assumption for $u_i$, the OLS
    estimators $\hat{\beta}_1$ and $\hat{\beta}_2$ are also normally
    distributed
  • The normal distribution is a comparatively simple
    distribution involving only two parameters (mean
    and variance)

68
4-3. Properties of OLS estimators under the
normality assumption
  • With the normality assumption the OLS estimators
    $\hat{\beta}_1$, $\hat{\beta}_2$ and $\hat{\sigma}^2$ have the following properties
  • 1. They are unbiased
  • 2. They have minimum variance. Combining 1 and
    2, they are efficient estimators
  • 3. Consistency, that is, as the sample size
    increases indefinitely, the estimators converge
    to their true population values

69
4-3. Properties of OLS estimators under the
normality assumption
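  • 4. $\hat{\beta}_1$ is normally distributed:
    $\hat{\beta}_1 \sim N(\beta_1, \sigma^2_{\hat{\beta}_1})$,
    and $Z = (\hat{\beta}_1 - \beta_1)/\sigma_{\hat{\beta}_1} \sim N(0,1)$
  • 5. $\hat{\beta}_2$ is normally distributed:
    $\hat{\beta}_2 \sim N(\beta_2, \sigma^2_{\hat{\beta}_2})$,
    and $Z = (\hat{\beta}_2 - \beta_2)/\sigma_{\hat{\beta}_2} \sim N(0,1)$
  • 6. $(n-2)\hat{\sigma}^2/\sigma^2$ is distributed as
    $\chi^2(n-2)$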

70
4-3. Properties of OLS estimators under the
normality assumption
  • 7. $\hat{\beta}_1$ and $\hat{\beta}_2$ are distributed independently of
    $\hat{\sigma}^2$. They have minimum variance in the entire
    class of unbiased estimators, whether linear or
    not. They are best unbiased estimators (BUE)
  • 8. If $u_i \sim N(0, \sigma^2)$ then
    $Y_i \sim N[E(Y_i), \mathrm{Var}(Y_i)] = N[\beta_1 + \beta_2 X_i, \sigma^2]$

71
Some last points of chapter 4
  • 4-4. The method of Maximum likelihood (ML)
  • ML is a point estimation method with some
    stronger theoretical properties than OLS
    (Appendix 4.A on pages 110-114)
  • The estimators of the coefficients $\beta$ by OLS and ML
    are
    identical. They are true estimators of the $\beta$s
  • ML estimator of $\sigma^2$: $\sum \hat{u}_i^2 / n$ (a biased
    estimator)
  • OLS estimator of $\sigma^2$: $\sum \hat{u}_i^2 / (n-2)$ (an unbiased
    estimator)
  • As the sample size n gets larger the two
    estimators tend to be equal
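
The difference between the two variance estimators is just the divisor, which the following sketch makes explicit for the Table 2-4 sample. The function names are assumptions for illustration; the ratio of the ML to the OLS estimator is (n - 2)/n, which approaches 1 as n grows.

```python
# Sample from Table 2-4
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)

x_bar, y_bar = sum(X) / n, sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sxx
b1 = y_bar - b2 * x_bar
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))


def sigma2_ml(rss, n):
    """ML estimator of the disturbance variance (biased in small samples)."""
    return rss / n


def sigma2_ols(rss, n):
    """OLS estimator of the disturbance variance (unbiased)."""
    return rss / (n - 2)


ml, ols = sigma2_ml(rss, n), sigma2_ols(rss, n)
ratios = {size: (size - 2) / size for size in (10, 100, 1000)}  # ML / OLS

print(round(ml, 4), round(ols, 4), ratios)
```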

72
Some last points of chapter 4
  • 4-5. Probability distributions related
    to the Normal Distribution: The t, $\chi^2$,
    and F distributions
  • See section (4.5) on pages 107-108
    with 8 theorems and Appendix A, on
    pages 755-776
  • 4-6. Summary and Conclusions
  • See 10 conclusions on pages 109-110

73
Basic Econometrics
  • Chapter 5
  • TWO-VARIABLE REGRESSION
  • Interval Estimation
  • and Hypothesis Testing

74
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-1. Statistical Prerequisites
  • See Appendix A for key concepts such as
    probability, probability distributions, Type I
    Error, Type II Error, level of significance, power
    of a statistical test, and confidence interval

75
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
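  • 5-2. Interval estimation: Some basic Ideas
  • How close is, say, $\hat{\beta}_2$ to $\beta_2$?
  • $\Pr(\hat{\beta}_2 - \delta \le \beta_2 \le \hat{\beta}_2 + \delta) = 1 - \alpha$
    (5.2.1)
  • The random interval $\hat{\beta}_2 - \delta \le \beta_2 \le \hat{\beta}_2 + \delta$,
    if it exists, is known as a confidence interval
  • $\hat{\beta}_2 - \delta$ is the lower confidence limit
  • $\hat{\beta}_2 + \delta$ is the upper confidence limit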

76
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-2. Interval estimation: Some basic Ideas
  • $(1 - \alpha)$ is the confidence coefficient;
    $\alpha$, with $0 < \alpha < 1$, is the significance level
  • Equation (5.2.1) does not mean that the probability of $\beta_2$
    lying between the given limits is $(1 - \alpha)$, but that
    the probability of constructing an interval that contains
    $\beta_2$ is $(1 - \alpha)$
  • $(\hat{\beta}_2 - \delta, \hat{\beta}_2 + \delta)$ is a random interval

77
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-2. Interval estimation: Some basic Ideas
  • In repeated sampling, the intervals will enclose,
    in $100(1 - \alpha)\%$ of the cases, the true value of
    the parameter
  • For a specific sample, one cannot say that the
    probability is $(1 - \alpha)$ that a given fixed
    interval includes the true $\beta_2$
  • If the sampling or probability distributions of
    the estimators are known, one can make confidence
    interval statements like (5.2.1)

78
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
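  • 5-3. Confidence Intervals for Regression
    Coefficients
  • $Z = (\hat{\beta}_2 - \beta_2)/\mathrm{se}(\hat{\beta}_2)
    = (\hat{\beta}_2 - \beta_2)\sqrt{\sum x_i^2}/\sigma \sim N(0,1)$
    (5.3.1)
  • We do not know $\sigma$ and have to use $\hat{\sigma}$ instead,
    so
  • $t = (\hat{\beta}_2 - \beta_2)/\mathrm{se}(\hat{\beta}_2)
    = (\hat{\beta}_2 - \beta_2)\sqrt{\sum x_i^2}/\hat{\sigma} \sim t(n-2)$
    (5.3.2)
  • This gives an interval for $\beta_2$:
  • $\Pr[-t_{\alpha/2} \le t \le t_{\alpha/2}] = 1 - \alpha$
    (5.3.3)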

79
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
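  • 5-3. Confidence Intervals for Regression
    Coefficients
  • Or, the confidence interval for $\beta_2$ is
  • $\Pr[\hat{\beta}_2 - t_{\alpha/2}\,\mathrm{se}(\hat{\beta}_2) \le \beta_2 \le \hat{\beta}_2 + t_{\alpha/2}\,\mathrm{se}(\hat{\beta}_2)]
    = 1 - \alpha$
    (5.3.5)
  • Confidence interval for $\beta_1$:
  • $\Pr[\hat{\beta}_1 - t_{\alpha/2}\,\mathrm{se}(\hat{\beta}_1) \le \beta_1 \le \hat{\beta}_1 + t_{\alpha/2}\,\mathrm{se}(\hat{\beta}_1)]
    = 1 - \alpha$
    (5.3.7)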
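
A 95% confidence interval for the slope can be computed from (5.3.5) as sketched below for the Table 2-4 sample. The critical value 2.306 is the standard tabulated two-sided 5% point of the t distribution with 8 degrees of freedom; it is hard-coded here as an assumption so that the example needs no statistics library.

```python
import math

# Sample from Table 2-4
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)

x_bar, y_bar = sum(X) / n, sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sxx
b1 = y_bar - b2 * x_bar
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
se_b2 = math.sqrt(rss / (n - 2) / sxx)

T_CRIT = 2.306   # tabulated t value for alpha/2 = 0.025 and n - 2 = 8 df

ci_lower = b2 - T_CRIT * se_b2
ci_upper = b2 + T_CRIT * se_b2

print(f"95% CI for the slope: [{ci_lower:.4f}, {ci_upper:.4f}]")
```

In repeated sampling, intervals built this way would contain the true slope in about 95 out of 100 cases; no such probability statement applies to this one realized interval.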

80
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-4. Confidence Intervals for $\sigma^2$
  • $\Pr[(n-2)\hat{\sigma}^2/\chi^2_{\alpha/2} \le \sigma^2 \le (n-2)\hat{\sigma}^2/\chi^2_{1-\alpha/2}]
    = 1 - \alpha$
    (5.4.3)
  • The interpretation of this interval is: if we
    establish $100(1 - \alpha)\%$ confidence limits on $\sigma^2$ and if
    we maintain a priori that these limits will
    include the true $\sigma^2$, we shall be right in the long
    run $100(1 - \alpha)$ percent of the time
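
Interval (5.4.3) can be evaluated for the Table 2-4 sample as follows. The two chi-square critical values for 8 degrees of freedom are standard tabulated values, hard-coded here as assumptions; note that the larger critical value produces the lower limit.

```python
# Sample from Table 2-4
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)

x_bar, y_bar = sum(X) / n, sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sxx
b1 = y_bar - b2 * x_bar
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
sigma2_hat = rss / (n - 2)

# Tabulated chi-square values with 8 df for a 95% interval
CHI2_UPPER_TAIL = 17.5346   # chi-square value with 2.5% of the area to its right
CHI2_LOWER_TAIL = 2.1797    # chi-square value with 97.5% of the area to its right

var_lower = (n - 2) * sigma2_hat / CHI2_UPPER_TAIL
var_upper = (n - 2) * sigma2_hat / CHI2_LOWER_TAIL

print(f"95% CI for sigma^2: [{var_lower:.4f}, {var_upper:.4f}]")
```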

81
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-5. Hypothesis Testing: General Comments
  • The stated hypothesis is known as the
    null hypothesis $H_0$
  • $H_0$ is tested against an alternative
    hypothesis $H_1$
  • 5-6. Hypothesis Testing: The confidence interval
    approach
  • One-sided or one-tail test
  • $H_0: \beta_2 \le \beta_2^*$ versus $H_1: \beta_2 > \beta_2^*$,
    where $\beta_2^*$ is the hypothesized value

82
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
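  • Two-sided or two-tail test
  • $H_0: \beta_2 = \beta_2^*$ versus $H_1: \beta_2 \neq \beta_2^*$,
    where $\beta_2^*$ is the hypothesized value
  • $\hat{\beta}_2 - t_{\alpha/2}\,\mathrm{se}(\hat{\beta}_2) \le \beta_2 \le \hat{\beta}_2 + t_{\alpha/2}\,\mathrm{se}(\hat{\beta}_2)$:
    values of $\beta_2$ lying in this interval
    are plausible under $H_0$ with $100(1 - \alpha)\%$
    confidence.
  • If the hypothesized $\beta_2$ lies in this interval we do not reject $H_0$
    (the finding is statistically insignificant)
  • If the hypothesized $\beta_2$ falls outside this interval, we reject $H_0$
    (the finding is statistically significant)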

83
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
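  • 5-7. Hypothesis Testing:
    The test of significance approach
  • A test of significance is a procedure by which
    sample results are used to verify the truth or
    falsity of a null hypothesis
  • Testing the significance of a regression
    coefficient: the t-test
  • $\Pr[\beta_2^* - t_{\alpha/2}\,\mathrm{se}(\hat{\beta}_2) \le \hat{\beta}_2 \le \beta_2^* + t_{\alpha/2}\,\mathrm{se}(\hat{\beta}_2)] = 1 - \alpha$,
    where $\beta_2^*$ is the value of $\beta_2$ under $H_0$
    (5.7.2)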
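
A two-sided t-test of a hypothesized slope can be sketched as follows for the Table 2-4 sample. The null value 0.3 is a hypothetical choice made only for illustration, and the critical value 2.306 is the tabulated two-sided 5% point of the t distribution with 8 degrees of freedom.

```python
import math

# Sample from Table 2-4
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)

x_bar, y_bar = sum(X) / n, sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sxx
b1 = y_bar - b2 * x_bar
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
se_b2 = math.sqrt(rss / (n - 2) / sxx)

BETA2_NULL = 0.3   # hypothetical value of the slope under H0
T_CRIT = 2.306     # tabulated t value for alpha/2 = 0.025 and 8 df

t_stat = (b2 - BETA2_NULL) / se_b2
reject_h0 = abs(t_stat) > T_CRIT   # two-sided decision rule

print(round(t_stat, 4), reject_h0)
```

The decision agrees with the confidence-interval approach: 0.3 lies outside the 95% interval for the slope, so the null hypothesis is rejected either way.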

84
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-7. Hypothesis Testing The test of
    significance approach
  • Table 5-1 Decision Rule for t-test of
    significance

85
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-7. Hypothesis Testing The test of
    significance approach
  • Testing the significance of $\sigma^2$: the $\chi^2$ test
  • Under the normality assumption we have
  • $\chi^2 = (n-2)\,\hat{\sigma}^2/\sigma^2 \sim \chi^2(n-2)$ (5.4.1)
  • From (5.4.2) and (5.4.3) on page 520 it follows that

86
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-7. Hypothesis Testing The test of
    significance approach
  • Table 5-2: A summary of the $\chi^2$ test

87
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-8. Hypothesis Testing
  • Some practical aspects
  • 1) The meaning of Accepting or Rejecting a
    Hypothesis
  • 2) The Null Hypothesis and the Rule of
  • Thumb
  • 3) Forming the Null and Alternative
  • Hypotheses
  • 4) Choosing $\alpha$, the Level of Significance

88
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-8. Hypothesis Testing
  • Some practical aspects
  • 5) The Exact Level of Significance
  • The p-Value See page 132
  • 6) Statistical Significance versus
  • Practical Significance
  • 7) The Choice between Confidence-
  • Interval and Test-of-Significance
  • Approaches to Hypothesis Testing
  • Warning Read carefully pages 117-134

89
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-9. Regression Analysis and Analysis
    of Variance
  • TSS = ESS + RSS
  • F = (MSS of ESS)/(MSS of RSS)
    $= \hat{\beta}_2^2 \sum x_i^2 / \hat{\sigma}^2$ (5.9.1)
  • If the $u_i$ are normally distributed and $H_0: \beta_2 = 0$ holds, then F
    follows the F distribution with 1 and n-2 degrees
    of freedom

90
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-9. Regression Analysis and Analysis of Variance
  • F provides a test statistic for the null
    hypothesis that the true $\beta_2$ is zero: compare this F
    ratio with the critical F value obtained from F tables
    at the chosen level of significance, or obtain
    the p-value of the computed F statistic, to make a
    decision
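
The F ratio in (5.9.1) can be computed for the Table 2-4 sample as sketched below. In the two-variable model the F statistic equals the square of the t statistic for the null hypothesis of a zero slope, which the code also checks. The critical value 5.32 is the tabulated 5% point of F with 1 and 8 degrees of freedom, hard-coded as an assumption.

```python
import math

# Sample from Table 2-4
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)

x_bar, y_bar = sum(X) / n, sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sxx
b1 = y_bar - b2 * x_bar

ess = b2 ** 2 * sxx                                       # explained SS, 1 df
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))   # residual SS, n - 2 df

f_stat = (ess / 1) / (rss / (n - 2))                      # (5.9.1)
t_stat = b2 / math.sqrt(rss / (n - 2) / sxx)              # t for H0: slope = 0

F_CRIT = 5.32   # tabulated F(1, 8) value at the 5% level
reject_h0 = f_stat > F_CRIT

print(round(f_stat, 2), round(t_stat ** 2, 2), reject_h0)
```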

91
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-9. Regression Analysis and Analysis of Variance
  • Table 5-3. ANOVA for two-variable regression model

92
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-10. Application of Regression
    Analysis: Problem of Prediction
  • From the data of Table 3-2, we obtained the sample
    regression (3.6.2)
  • $\hat{Y}_i = 24.4545 + 0.5091 X_i$, where
  • $\hat{Y}_i$ is the estimator of the true $E(Y_i)$
  • There are two kinds of prediction, as
    follows

93
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-10. Application of Regression
  • Analysis Problem of Prediction
  • Mean prediction Prediction of the conditional
    mean value of Y corresponding to a chosen X, say
    X0, that is the point on the population
    regression line itself (see pages 137-138 for
    details)
  • Individual prediction Prediction of an
    individual Y value corresponding to X0 (see pages
    138-139 for details)
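
The two kinds of prediction share the same point prediction but have different variances. The sketch below uses the standard textbook variance formulas for the two-variable model, which the slides only reference by page number, so treat them as an assumption here: the variance of the mean prediction is $\hat{\sigma}^2[1/n + (X_0 - \bar{X})^2/\sum x_i^2]$ and the variance of the individual prediction adds 1 inside the bracket. The data are the Table 2-4 sample, which reproduces the coefficients quoted above, and $X_0 = 100$ is a hypothetical choice.

```python
import math

# Sample from Table 2-4 (gives the fitted line 24.4545 + 0.5091 X)
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)

x_bar, y_bar = sum(X) / n, sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sxx
b1 = y_bar - b2 * x_bar
sigma2_hat = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y)) / (n - 2)

X0 = 100                      # hypothetical income level chosen for illustration
y0_hat = b1 + b2 * X0         # point prediction (same for both kinds)

leverage = 1 / n + (X0 - x_bar) ** 2 / sxx
se_mean = math.sqrt(sigma2_hat * leverage)              # mean prediction
se_individual = math.sqrt(sigma2_hat * (1 + leverage))  # individual prediction

print(round(y0_hat, 4), round(se_mean, 4), round(se_individual, 4))
```

The individual prediction is always less precise than the mean prediction, because it must also account for the disturbance of the single new observation.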

94
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-11. Reporting the results of
    regression analysis
  • An illustration
  • $\hat{Y}_i = 24.4545 + 0.5091 X_i$ (5.1.1)
  • se = (6.4138) (0.0357), $r^2 = 0.9621$
  • t = (3.8128) (14.2405), df = 8
  • p = (0.002517) (0.000000289), $F_{1,8} = 202.87$

95
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-12. Evaluating the results of regression
    analysis
  • Normality test: the Chi-Square ($\chi^2$) goodness of
    fit test
  • $\chi^2_{N-1-k} = \sum (O_i - E_i)^2 / E_i$
    (5.12.1)
  • $O_i$ is the observed number of residuals ($\hat{u}_i$) in interval i
  • $E_i$ is the expected number of residuals in interval i
  • N is the number of classes or groups; k is the number of
    parameters to be estimated. If the p-value of
    obtaining $\chi^2_{N-1-k}$ is high (or $\chi^2_{N-1-k}$ is small),
    the normality hypothesis cannot be rejected

96
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-12. Evaluating the results of regression
    analysis
  • Normality test: the Chi-Square ($\chi^2$) goodness of
    fit test
  • $H_0$: $u_i$ is normally distributed
  • $H_1$: $u_i$ is not normally distributed
  • Calculated $\chi^2_{N-1-k} = \sum (O_i - E_i)^2 / E_i$
    (5.12.1)
  • Decision rule:
  • If calculated $\chi^2_{N-1-k}$ > critical $\chi^2_{N-1-k}$, then $H_0$ can
    be rejected

97
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-12. Evaluating the results of regression
    analysis
  • The Jarque-Bera (JB) test of normality
  • This test first computes the skewness (S)
    and kurtosis (K) and uses the following
    statistic
  • $JB = n\,[S^2/6 + (K-3)^2/24]$ (5.12.2)
  • Mean: $\bar{x} = \sum x_i / n$; $SD^2 = \sum (x_i - \bar{x})^2 / (n-1)$
  • $S = m_3 / m_2^{3/2}$; $K = m_4 / m_2^2$; $m_k = \sum (x_i - \bar{x})^k / n$

98
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-12. (Continued)
  • Under the null hypothesis H0 that the residuals
    are normally distributed, Jarque and Bera show
    that in large samples (asymptotically) the JB
    statistic given in (5.12.2) follows the
    chi-square distribution with 2 df. If the p-value
    of the computed chi-square statistic in an
    application is sufficiently low, one can reject
    the hypothesis that the residuals are normally
    distributed. But if the p-value is reasonably
    high, one does not reject the normality
    assumption.
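
The JB computation can be sketched in plain Python as follows. The function names are illustrative, not from the slides. The p-value uses the fact that a chi-square variable with 2 df has survival function exp(-x/2); it is an asymptotic approximation, as the slide notes.

```python
import math


def central_moment(values, k):
    """k-th central moment m_k = sum((x - mean)^k) / n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def jarque_bera(residuals):
    """Return (S, K, JB, p_value) for a list of residuals."""
    n = len(residuals)
    m2 = central_moment(residuals, 2)
    m3 = central_moment(residuals, 3)
    m4 = central_moment(residuals, 4)
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n * (skewness ** 2 / 6 + (kurtosis - 3) ** 2 / 24)
    # For a chi-square variable with 2 df, P(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2)
    return skewness, kurtosis, jb, p_value
```

A low p-value would lead to rejecting normality; a reasonably high one would not.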

99
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-13. Summary and Conclusions
  • 1. Estimation and hypothesis testing
    constitute the two main branches of classical
    statistics
  • 2. Hypothesis testing answers this question:
    Is a given finding compatible with a stated
    hypothesis or not?
  • 3. There are two mutually complementary
    approaches to answering the preceding question:
    the confidence interval and the test of
    significance.

100
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-13. Summary and Conclusions
  • 4. In the confidence-interval approach, the
    interval has a specified probability of
    including within its limits the true value of
    the unknown parameter. If the null-hypothesized
    value lies in the confidence interval, H0 is not
    rejected, whereas if it lies outside this
    interval, H0 can be rejected

101
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-13. Summary and Conclusions
  • 5. The significance-test procedure develops a
    test statistic that follows a well-defined
    probability distribution (such as the normal, t,
    F, or chi-square). Once a test statistic is
    computed, its p-value can be easily obtained.
  • The p-value: the p-value of a test is the lowest
    significance level at which we would reject H0.
    It gives the exact probability, under H0, of
    obtaining a test statistic at least as extreme
    as the estimated one. If the p-value is small,
    one can reject H0, but if it is large one may
    not reject H0.
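
As a small illustration of the p-value idea, the sketch below assumes a test statistic that follows the standard normal distribution under H0 (one of the distributions listed above) and a two-sided alternative. The function names are hypothetical; t, F, and chi-square p-values would need a statistics library and are not shown.

```python
import math


def two_sided_p_value_normal(z):
    """P(|Z| >= |z|) for a standard normal Z, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))


def reject_h0(p_value, alpha):
    """Reject H0 when the p-value falls below the chosen significance level."""
    return p_value < alpha


p = two_sided_p_value_normal(2.5)
print(p, reject_h0(p, 0.05))
```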

102
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-13. Summary and Conclusions
  • 6. A Type I error is the error of rejecting a
    true hypothesis. A Type II error is the error of
    accepting a false hypothesis. In practice, one
    should be careful about fixing the level of
    significance $\alpha$, the probability of
    committing a Type I error, at arbitrary values
    such as 1, 5, or 10 percent. It is better to
    quote the p-value of the test statistic.

103
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-13. Summary and Conclusions
  • 7. This chapter introduced the normality test to
    find out whether $u_i$ follows the normal
    distribution. Since in small samples the t,
    F, and chi-square tests require the normality
    assumption, it is important that this assumption
    be checked formally.

104
Chapter 5 TWO-VARIABLE REGRESSION Interval
Estimation and Hypothesis Testing
  • 5-13. Summary and Conclusions (ended)
  • 8. If the model is deemed practically adequate,
    it may be used for forecasting purposes. But
    forecasts should not go too far outside the
    sample range of the regressor values. Otherwise,
    forecasting errors can increase dramatically.

105
Basic Econometrics
  • Chapter 6
  • EXTENSIONS OF THE
  • TWO-VARIABLE LINEAR
  • REGRESSION MODEL

106
Chapter 6 EXTENSIONS OF THE TWO-VARIABLE LINEAR
REGRESSION MODELS
  • 6-1. Regression through the origin
  • The SRF form of the regression:
  • $Y_i = \hat{\beta}_2 X_i + \hat{u}_i$ (6.1.5)
  • Comparison of two types of regression:
    the regression-through-the-origin model and
    the regression with an intercept

107
Chapter 6 EXTENSIONS OF THE TWO-VARIABLE LINEAR
REGRESSION MODELS
  • 6-1. Regression through the origin
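  • Comparison of two types of regression
    (O = through the origin, I = with intercept;
    lowercase $x_i$, $y_i$ denote deviations from
    the sample means):
  • $\hat{\beta}_2 = \sum X_i Y_i / \sum X_i^2$ (6.1.6) O
  • $\hat{\beta}_2 = \sum x_i y_i / \sum x_i^2$ (3.1.6) I
  • $var(\hat{\beta}_2) = \sigma^2 / \sum X_i^2$ (6.1.7) O
  • $var(\hat{\beta}_2) = \sigma^2 / \sum x_i^2$ (3.3.1) I
  • $\hat{\sigma}^2 = \sum \hat{u}_i^2 / (n-1)$ (6.1.8) O
  • $\hat{\sigma}^2 = \sum \hat{u}_i^2 / (n-2)$ (3.3.5) I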
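
The two slope formulas and the two residual-variance formulas can be compared side by side. This is a sketch with illustrative function names; the only difference between the estimators is whether raw values or deviations from the means are used, and whether the residual sum of squares is divided by n-1 or n-2.

```python
def slope_through_origin(x, y):
    """(6.1.6): sum(X*Y) / sum(X^2), using raw values."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)


def slope_with_intercept(x, y):
    """(3.1.6): sum(x*y) / sum(x^2), using deviations from the sample means."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def residual_variance(residuals, df):
    """Sum of squared residuals divided by df (n-1 for O, n-2 for I)."""
    return sum(u ** 2 for u in residuals) / df


# Data with a true intercept: the through-origin slope is pulled away from 2.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
print(slope_through_origin(x, y), slope_with_intercept(x, y))
```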

108
Chapter 6 EXTENSIONS OF THE TWO-VARIABLE LINEAR
REGRESSION MODELS
  • 6-1. Regression through the origin
  • r2 for the regression-through-the-origin model:
  • Raw $r^2 = (\sum X_i Y_i)^2 / (\sum X_i^2 \sum Y_i^2)$
    (6.1.9)
  • Note: Without a very strong a priori expectation,
    one is well advised to stick to the conventional,
    intercept-present model. If the intercept turns
    out to be statistically equal to zero, for
    practical purposes we have a regression through
    the origin. If in fact there is an intercept in
    the model but we insist on fitting a regression
    through the origin, we would be committing a
    specification error
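
A minimal sketch of the raw r2 in (6.1.9), with an illustrative function name. Note that it uses raw sums of squares and cross products, so it is not directly comparable with the conventional r2.

```python
def raw_r_squared(x, y):
    """Raw r^2 = (sum X*Y)^2 / (sum X^2 * sum Y^2), as in (6.1.9)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi ** 2 for xi in x)
    syy = sum(yi ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


# Y exactly proportional to X: the through-origin fit is perfect.
print(raw_r_squared([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```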

109
Chapter 6 EXTENSIONS OF THE TWO-VARIABLE LINEAR
REGRESSION MODELS
  • 6-1. Regression through the origin
  • Illustrative Examples
  • 1) Capital Asset Pricing Model - CAPM (page 156)
  • 2) Market Model (page 157)
  • 3) The Characteristic Line of Portfolio Theory
    (page 159)

110
Chapter 6 EXTENSIONS OF THE TWO-VARIABLE LINEAR
REGRESSION MODELS
  • 6-2. Scaling and units of measurement
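  • Let $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$ (6.2.1)
  • Define $Y_i^* = w_1 Y_i$ and $X_i^* = w_2 X_i$; then
  • $\hat{\beta}_2^* = (w_1/w_2) \hat{\beta}_2$ (6.2.15)
  • $\hat{\beta}_1^* = w_1 \hat{\beta}_1$ (6.2.16)
  • $\hat{\sigma}^{*2} = w_1^2 \hat{\sigma}^2$ (6.2.17)
  • $var(\hat{\beta}_1^*) = w_1^2 \, var(\hat{\beta}_1)$ (6.2.18)
  • $var(\hat{\beta}_2^*) = (w_1/w_2)^2 \, var(\hat{\beta}_2)$ (6.2.19)
  • $r^2_{xy} = r^2_{x^* y^*}$ (6.2.20)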

111
Chapter 6 EXTENSIONS OF THE TWO-VARIABLE LINEAR
REGRESSION MODELS
  • 6-2. Scaling and units of measurement
  • From the results based on one scale of
    measurement, one can derive the results based on
    another scale of measurement.
  • If w1 = w2, the slope coefficient and its
    standard error are unchanged, while the
    intercept and its standard error are both
    multiplied by w1.
  • If w2 = 1 and the scale of Y is changed by w1,
    then all coefficients and standard errors are
    multiplied by w1.
  • If w1 = 1 and the scale of X is changed by w2,
    then only the slope coefficient and its standard
    error are multiplied by 1/w2.
  • The transformation from the (Y, X) to the
    (Y*, X*) scale does not affect the properties of
    the OLS estimators.
  • A numerical example (pages 161, 163-165)
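
The scaling results for the coefficients can be checked numerically. The data and the scale factors below are made up for illustration (they are not the textbook's numerical example), and `ols` is a hypothetical helper for the two-variable model.

```python
def ols(x, y):
    """Return (intercept, slope) of the two-variable OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data and scale factors (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
w1, w2 = 1000.0, 100.0

b1, b2 = ols(x, y)
b1_star, b2_star = ols([w2 * xi for xi in x], [w1 * yi for yi in y])

# Compare the rescaled estimates with w1*b1 and (w1/w2)*b2.
print(b1_star, w1 * b1)
print(b2_star, (w1 / w2) * b2)
```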

112
6-3. Functional form of regression model
  • The log-linear model
  • Semi-log model
  • Reciprocal model

113
6-4. How to measure elasticity
  • The log-linear model
  • Exponential regression model:
  • $Y_i = \beta_1 X_i^{\beta_2} e^{u_i}$ (6.4.1)
  • Taking logs to the base e of both sides:
  • $\ln Y_i = \ln \beta_1 + \beta_2 \ln X_i + u_i$;
    setting $\ln \beta_1 = \alpha$ gives
  • $\ln Y_i = \alpha + \beta_2 \ln X_i + u_i$ (6.4.3)
  • (the log-log, double-log, or log-linear
    model)
  • This can be estimated by OLS by letting
  • $Y_i^* = \alpha + \beta_2 X_i^* + u_i$, where
    $Y_i^* = \ln Y_i$, $X_i^* = \ln X_i$
  • $\beta_2$ measures the ELASTICITY of Y with
    respect to X, that is, the percentage change in
    Y for a given (small) percentage change in X.

114
6-4. How to measure elasticity
  • The log-linear model
  • The elasticity E of a variable Y with
    respect to a variable X is defined as
  • E = (% change in Y) / (% change in X)
  • $= [(\Delta Y / Y) \times 100] / [(\Delta X / X) \times 100]$
  • $= (\Delta Y / \Delta X) \times (X / Y)$ = slope × (X/Y)
  • An illustrative example: the coffee
    demand function (pages 167-168)
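
A sketch of estimating an elasticity with the log-log model: regress ln Y on ln X and read off the slope. The data are generated from Y = 3 X^0.5 without an error term purely for illustration (they are not the coffee demand data), so the fitted slope should recover 0.5. `ols` is a hypothetical helper.

```python
import math


def ols(x, y):
    """Return (intercept, slope) of the two-variable OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Illustrative data: Y = 3 * X^0.5, so the elasticity is 0.5 by construction.
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [3.0 * xi ** 0.5 for xi in x]

alpha, elasticity = ols([math.log(xi) for xi in x], [math.log(yi) for yi in y])
print(alpha, elasticity)  # alpha estimates ln(beta_1)
```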

115
6-5. Semi-log models: Log-lin and Lin-log
Models
  • How to measure the growth rate: the log-lin model
  • $Y_t = Y_0 (1+r)^t$ (6.5.1)
  • $\ln Y_t = \ln Y_0 + t \ln(1+r)$ (6.5.2)
  • $\ln Y_t = \beta_1 + \beta_2 t$, called the
    constant growth model (6.5.5)
  • where $\beta_1 = \ln Y_0$ and $\beta_2 = \ln(1+r)$
  • $\ln Y_t = \beta_1 + \beta_2 t + u_t$ (6.5.6)
  • This is a semi-log, or log-lin, model. The slope
    coefficient measures the constant proportional or
    relative change in Y for a given absolute change
    in the value of the regressor (t)
  • $\beta_2$ = (relative change in regressand) /
    (absolute change in regressor) (6.5.7)

116
6-5. Semi-log models: Log-lin and Lin-log
Models
  • Instantaneous vs. compound rate of growth
  • $\beta_2$ is the instantaneous rate of growth
  • antilog($\beta_2$) - 1 is the compound rate of
    growth
  • The linear trend model:
  • $Y_t = \beta_1 + \beta_2 t + u_t$ (6.5.9)
  • If $\beta_2 > 0$, there is an upward trend in Y
  • If $\beta_2 < 0$, there is a downward trend in Y
  • Note: (i) One cannot compare the r2 values of
    models (6.5.5) and (6.5.9) because the
    regressands in the two models are different.
    (ii) Such models may be appropriate only if the
    time series is stationary.
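
The difference between the instantaneous and the compound growth rate can be seen in a small sketch. The series is artificial, growing at exactly 5 percent per period with no error term, so the log-lin slope should equal ln(1.05) and the compound rate should come back as 0.05. `ols` is a hypothetical helper.

```python
import math


def ols(x, y):
    """Return (intercept, slope) of the two-variable OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Artificial series: Y_t = 100 * (1.05)^t for t = 0, ..., 9.
t = [float(i) for i in range(10)]
y = [100.0 * 1.05 ** ti for ti in t]

b1, b2 = ols(t, [math.log(v) for v in y])
instantaneous_rate = b2              # slope of the log-lin model
compound_rate = math.exp(b2) - 1.0   # antilog(b2) - 1
print(instantaneous_rate, compound_rate)
```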

117
6-5. Semi-log models: Log-lin and Lin-log
Models
  • The lin-log model:
  • $Y_i = \beta_1 + \beta_2 \ln X_i + u_i$ (6.5.11)
  • $\beta_2$ = (change in Y) / (change in ln X)
    = (change in Y) / (relative change in X)
    = $\Delta Y / (\Delta X / X)$ (6.5.12)
  • or $\Delta Y = \beta_2 (\Delta X / X)$ (6.5.13)
  • That is, the absolute change in Y equals
    $\beta_2$ times the relative change in X.
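
A short sketch of (6.5.13) with a hypothetical coefficient. It compares the approximation, beta2 times the relative change in X, with the exact change implied by the lin-log model; the two are close only for small relative changes.

```python
import math


def approx_change_in_y(beta2, x, dx):
    """(6.5.13): change in Y is roughly beta2 times the relative change in X."""
    return beta2 * (dx / x)


def exact_change_in_y(beta2, x, dx):
    """Exact change implied by Y = beta1 + beta2 * ln(X)."""
    return beta2 * (math.log(x + dx) - math.log(x))


# Hypothetical slope of 500 and a 1 percent increase in X.
print(approx_change_in_y(500.0, 100.0, 1.0), exact_change_in_y(500.0, 100.0, 1.0))
```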

118
6-6. Reciprocal Models
  • The reciprocal model:
  • $Y_i = \beta_1 + \beta_2 (1/X_i) + u_i$ (6.5.14)
  • As X increases indefinitely, the term
    $\beta_2 (1/X_i)$ approaches zero and $Y_i$
    approaches the limiting or asymptotic value
    $\beta_1$ (see Figure 6.5 on page 174)
  • An illustrative example: the Phillips curve for
    the United Kingdom, 1950-1966
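
The asymptote of the reciprocal model can be seen by evaluating its systematic part for increasingly large X. The coefficients below are hypothetical, not estimates from the Phillips curve example.

```python
def reciprocal_mean(beta1, beta2, x):
    """Systematic part of the reciprocal model: beta1 + beta2 * (1 / X)."""
    return beta1 + beta2 * (1.0 / x)


# Hypothetical coefficients: as X grows, the value approaches beta1 = 2.
for x in [1.0, 10.0, 100.0, 1000.0]:
    print(x, reciprocal_mean(2.0, 10.0, x))
```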

119
6-7. Summary of Functional Forms
  • Table 6.5 (page 178)

120
6-7. Summary of Functional Forms
  • Note: * indicates that the elasticity
    coefficient is variable, depending on the value
    taken by X or Y or both. When no X and Y values
    are specified, in practice these elasticities
    are very often measured at the mean values E(X)
    and E(Y).
  • ---------------------------------------------
    --
  • 6-8. A note on the stochastic error term
  • 6-9. Summary and conclusions
  • (pages 179-180)

121
Basic Econometrics
  • Chapter 7
  • MULTIPLE REGRESSION ANALYSIS
  • The Problem of Estimation

122
7-1. The Three-Variable Model: Notation and
Assumptions
  • $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$
    (7.1.1)
  • $\beta_2$, $\beta_3$ are partial regression
    coefficients
  • With the following assumptions:
  • Zero mean value of $u_i$:
    $E(u_i | X_{2i}, X_{3i}) = 0$ for each i (7.1.2)
  • No serial correlation: $cov(u_i, u_j) = 0$,
    $i \neq j$ (7.1.3)
  • Homoscedasticity: $var(u_i) = \sigma^2$
    (7.1.4)
  • Zero covariance between $u_i$ and each X
    variable: $cov(u_i, X_{2i}) = cov(u_i, X_{3i}) = 0$
    (7.1.5)
  • No specification bias, that is, the model is
    correctly specified (7.1.6)
  • No exact collinearity between the X variables
    (7.1.7)
    (no multicollinearity in the case of more
    explanatory variables; if an exact linear
    relationship exists, the X variables are said
    to be linearly dependent)
  • The model is linear in the parameters

123
7-2. Interpretation of Multiple Regression
  • $E(Y_i | X_{2i}, X_{3i}) = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i}$
    (7.2.1)
  • (7.2.1) gives the conditional mean or expected
    value of Y, conditional upon the given or fixed
    values of X2 and X3

124
7-3. The meaning of partial regression
coefficients
  • $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_s X_{si} + u_i$
  • $\beta_k$ measures the change in the mean value
    of Y per unit change in $X_k$, holding the
    remaining explanatory variables constant. It
    gives the direct effect of a unit change in
    $X_k$ on $E(Y_i)$, net of $X_j$ ($j \neq k$)
  • How to control for the true effect of a unit
    change in $X_k$ on Y? (read pages 195-197)

125
7-4. OLS and ML estimation of the partial
regression coefficients
  • This section (pages 197-201) provides:
  • 1. The OLS estimators in the case of the
    three-variable regression
  • $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$
  • 2. The variances and standard errors of the OLS
    estimators
  • 3. Eight properties of the OLS estimators
    (pp. 199-201)
  • 4. An understanding of the ML estimators

126
7-5. The multiple coefficient of determination R2
and the multiple coefficient of correlation R
  • This section provides
  • 1. The definition of R2 in the context of
    multiple regression, analogous to r2 in the case
    of two-variable regression
  • 2. $R = \sqrt{R^2}$ is the coefficient of
    multiple correlation; it measures the degree of
    association between Y and all the explanatory
    variables jointly
  • 3. The variance of a partial regression
    coefficient:
  • $var(\hat{\beta}_k) = \frac{\sigma^2}{\sum x_k^2} \cdot \frac{1}{1 - R_k^2}$
    (7.5.6)
  • where $\hat{\beta}_k$ is the partial regression
    coefficient of regressor $X_k$ and $R_k^2$ is
    the R2 in the regression of $X_k$ on the
    remaining regressors
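
Formula (7.5.6) shows how collinearity among regressors inflates the variance of a coefficient. The sketch below uses hypothetical inputs and an illustrative function name; it evaluates the formula for increasing values of R2k.

```python
def var_beta_k(sigma2, sum_xk_sq, r2_k):
    """(7.5.6): sigma^2 / sum(x_k^2) * 1 / (1 - R_k^2)."""
    if not 0.0 <= r2_k < 1.0:
        raise ValueError("r2_k must lie in [0, 1)")
    return sigma2 / sum_xk_sq * (1.0 / (1.0 - r2_k))


# Hypothetical sigma^2 = 4 and sum of squared deviations = 100.
for r2_k in [0.0, 0.5, 0.9, 0.99]:
    print(r2_k, var_beta_k(4.0, 100.0, r2_k))
```

With R2k = 0 the expression reduces to the two-variable formula; as R2k approaches 1 the variance grows without bound.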

127
7-6. Example 7.1: The expectations-augmented
Phillips Curve for the US (1970-1982)
  • This section provides an illustration for the
    ideas introduced in the chapter
  • Regression Model (7.6.1)
  • Data set is in Table 7.1

128
7-7. Simple regression in the context of multiple
regression: Introduction to specification bias
  • This section explains simple regression in the
    context of multiple regression. Fitting a simple
    regression when the true model is a multiple
    regression causes specification bias, which
    will be discussed in Chapter 13

129
7-8. R2 and the Adjusted-R2
  • R2 is a non-decreasing function of the number of
    explanatory variables. An additional X variable
    will not decrease R2
  • $R^2 = ESS/TSS = 1 - RSS/TSS = 1 - \sum \hat{u}_i^2 / \sum y_i^2$
    (7.8.1)
  • This can tempt one in the wrong direction,
    toward adding irrelevant variables to the
    regression, and motivates an adjusted R2
    ($\bar{R}^2$) that takes account of the degrees
    of freedom
  • $\bar{R}^2 = 1 - [\sum \hat{u}_i^2 / (n-k)] / [\sum y_i^2 / (n-1)]$,
    or (7.8.2)
  • $\bar{R}^2 = 1 - \hat{\sigma}^2 / S_Y^2$
    ($S_Y^2$ is the sample variance of Y)
  • k = number of parameters including the
    intercept term
  • Substituting (7.8.1) into (7.8.2), we get
  • $\bar{R}^2 = 1 - (1 - R^2)(n-1)/(n-k)$
    (7.8.4)
  • For k > 1, $\bar{R}^2 < R^2$; thus when the
    number of X variables increases, $\bar{R}^2$
    increases less than R2, and $\bar{R}^2$ can be
    negative
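
A minimal sketch of (7.8.4) with an illustrative function name. The first call reuses r2 = 0.9621 with n = 10 and k = 2 from the two-variable illustration in Chapter 5; the second uses hypothetical values to show that the adjusted R2 can be negative.

```python
def adjusted_r_squared(r2, n, k):
    """(7.8.4): 1 - (1 - R^2) * (n - 1) / (n - k), with k counting the intercept."""
    if n <= k:
        raise ValueError("n must exceed the number of parameters k")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


print(adjusted_r_squared(0.9621, 10, 2))
print(adjusted_r_squared(0.05, 10, 4))  # a low R^2 with several regressors
```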

131
7-8. R2 and the Adjusted-R2
  • Comparing two R2 values:
  • To compare, the sample size n and the dependent
    variable must be the same
  • Example 7-2: Coffee Demand Function Revisited
    (page 210)
  • The "game" of maximizing adjusted R2: Choosing
    the model that gives the highest $\bar{R}^2$ may
    be dangerous, for in regression analysis our
    objective is not to obtain a high $\bar{R}^2$ as
    such but to obtain dependable estimates of the
    true population regression coefficients and to
    draw statistical inferences about them
  • One should be more concerned about the logical
    or theoretical relevance of the explanatory
    variables to the dependent variable and their
    statistical significance

132
7-9. Partial Correlation Coefficients
  • This section provides
  • 1. Explanation of simple and partial correlation
    coefficients
  • 2. Interpretation of simple and partial
    correlation coefficients
  • (pages 211-214)

133
7-10. Example 7.3: The Cobb-Douglas Production
Function: More on functional form
  • $Y_i = \beta_1 X_{2i}^{\beta_2} X_{3i}^{\beta_3} e^{u_i}$
    (7.10.1)
  • By log-transforming this model:
  • $\ln Y_i = \ln \beta_1 + \beta_2 \ln X_{2i} + \beta_3 \ln X_{3i} + u_i$
    $= \beta_0 + \beta_2 \ln X_{2i} + \beta_3 \ln X_{3i} + u_i$
    (7.10.2)
  • The data set is in Table 7.3
  • The results are reported on page 216

134
7-11 Polynomial Regression Models
  • $Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \dots + \beta_k X_i^k + u_i$
    (7.11.3)
  • Example 7.4: Estimating the Total Cost Function
  • The data set is in Table 7.4
  • The empirical results are on page 221
  • --------------------------------------------------
    ------------
  • 7-12. Summary and Conclusions
  • (page 221)

135
Basic Econometrics
  • Chapter 8
  • MULTIPLE REGRESSION ANALYSIS
  • The Problem of Inference

136
Chapter 8 MULTIPLE REGRESSION ANALYSIS The
Problem of Inference
  • 8-3. Hypothesis testing in multiple regression
  • Testing hypotheses about an individual partial
    regression coefficient
  • Testing the overall significance of the estimated
    multiple regression model, that is, finding out
    if all the partial slope coefficients are
    simultaneously equal to zero
  • Testing that two or more coefficients are equal
    to one another
  • Testing that the partial regression coefficients
    satisfy certain restrictions
  • Testing the stability of the estimated regression
    model over time or in different cross-sectional
    units
  • Testing the functional form of regression models

137
Chapter 8 MULTIPLE REGRESSION ANALYSIS The
Problem of Inference
  • 8-4. Hypothesis testing about individual partial
    regression coefficients
  • With the assumption that $u_i \sim N(0, \sigma^2)$,
    we can use the t-test to test a hypothesis about
    any individual partial regression coefficient.
  • H0: $\beta_2 = 0$
  • H1: $\beta_2 \neq 0$
  • If the computed |t| value > critical t value at
    the chosen level of significance, we may reject
    the null hypothesis; otherwise, we may not
    reject it

138
Chapter 8 MULTIPLE REGRESSION ANALYSIS The
Problem of Inference
  • 8-5. Testing the overall significance of a
    multiple regression: the F-test
  • For $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki} + u_i$
  • To test the hypothesis H0:
    $\beta_2 = \beta_3 = \dots = \beta_k = 0$
    (all slope coefficients are simultaneously zero)
    versus H1: not all slope coefficients are
    simultaneously zero, compute
  • $F = (ESS/df)/(RSS/df) = [ESS/(k-1)] / [RSS/(n-k)]$
    (8.5.7) (k = total number of parameters to be
    estimated, including the intercept)
  • If $F > F_\alpha(k-1, n-k)$, the critical F
    value, reject H0
  • Otherwise, do not reject it

139
Chapter 8 MULTIPLE REGRESSION ANALYSIS The
Problem of Inference
  • 8-5. Testing the overall significance of a
    multiple regression
  • Alternatively, if the p-value of F obtained from
    (8.5.7) is sufficiently low, one can reject H0
  • An important relationship between R2 and F:
  • $F = [ESS/(k-1)] / [RSS/(n-k)]$, or
  • $F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}$ (8.5.1)
  • (see the proof on page 249)
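
The link between R2 and F can be checked numerically. In the sketch below (function names are illustrative) the same F value is computed from the sums of squares and from R2 alone, using hypothetical values TSS = 100, ESS = 80, RSS = 20, n = 25 and k = 5.

```python
def f_from_sums_of_squares(ess, rss, n, k):
    """F = [ESS / (k - 1)] / [RSS / (n - k)], with k counting the intercept."""
    return (ess / (k - 1)) / (rss / (n - k))


def f_from_r_squared(r2, n, k):
    """F = [R^2 / (k - 1)] / [(1 - R^2) / (n - k)]."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


# Hypothetical values: R^2 = ESS / TSS = 0.8.
print(f_from_sums_of_squares(80.0, 20.0, 25, 5))
print(f_from_r_squared(0.8, 25, 5))
```

Both forms agree up to floating-point rounding because dividing ESS and RSS by TSS leaves the ratio unchanged.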

140
Chapter 8 MULTIPLE REGRESSION ANALYSIS The
Problem of Inference
  • 8-5. Testing the overall significance of a
    multiple regression in terms of R2
  • For $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki} + u_i$
  • To test the hypothesis H0:
    $\beta_2 = \beta_3 = \dots = \beta_k = 0$
    (all slope coefficients are simultaneously
    zero) versus H1: not all slope coefficients
    are simultaneously zero, compute
  • $F = [R^2/(k-1)] / [(1-R^2)/(n-k)]$ (8.5.13)
    (k = total number of parameters to be estimated,
    including the intercept)
  • If $F > F_\alpha(k-1, n-k)$, the critical F
    value, reject H0

141
Chapter 8 MULTIPLE REGRESSION ANALYSIS The
Problem of Inference
  • 8-5. Testing the overall significance of a
    multiple regression
  • Alternatively, if the p-value of F obtained from
    (8.5.13) is sufficiently low, one can reject H0
  • The incremental or marginal contribution of
    an explanatory variable:
  • Let $\beta X$ be the new (additional) term on
    the right-hand side of a regression. Under the
    usual assumption of the normality of $u_i$ and
    H0: $\beta = 0$, it can be shown that the
    following F ratio follows the F distribution
    with the respective degrees of freedom

142
Chapter 8 MULTIPLE REGRESSION ANALYSIS The
Problem of Inference
  • 8-5. Testing the overall significance of a
    multiple regression
  • $F_{com} = \frac{(R^2_{new} - R^2_{old}) / Df_1}{(1 - R^2_{new}) / Df_2}$
    (8.5.18)
  • where Df1 = number of new regressors
  • Df2 = n - number of parameters in the
    new model
  • $R^2_{new}$ is the coefficient of
    determination of the new regression (after
    adding $\beta X$)
  • $R^2_{old}$ is the coefficient of
    determination of the old regression (before
    adding $\beta X$).

143
Chapter 8 MULTIPLE REGRESSION ANALYSIS The
Problem of Inference
  • 8-5. Testing the overall significance of a
    multiple regression
  • Decision rule:
  • If $F_{com} > F_\alpha(Df_1, Df_2)$, one can
    reject H0 that $\beta = 0$ and conclude that the
    addition of X to the model significantly
    increases ESS and hence the R2 value
  • When to add a new variable? If |t| of the
    coefficient of X > 1 (or $F = t^2$ of that
    variable exceeds 1), the adjusted R2 will
    increase
  • When to add a group of variables? If adding a
    group of variables to the model gives an F value
    greater than 1, the adjusted R2 will increase
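
The incremental F ratio in (8.5.18) can be sketched as follows. The function name and the input values are hypothetical; the critical value for the decision rule is assumed to come from an F table with (Df1, Df2) degrees of freedom.

```python
def incremental_f(r2_new, r2_old, num_new_regressors, n, num_params_new):
    """(8.5.18): [(R2_new - R2_old) / Df1] / [(1 - R2_new) / Df2]."""
    df1 = num_new_regressors
    df2 = n - num_params_new
    return ((r2_new - r2_old) / df1) / ((1.0 - r2_new) / df2)


# Hypothetical case: one regressor added, R^2 rises from 0.8 to 0.9,
# n = 24 observations and 4 parameters in the new model (Df2 = 20).
print(incremental_f(0.9, 0.8, 1, 24, 4))
```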

144
Chapter 8 MULTIPLE REGRESSION ANALYSIS The
Problem of Inference
  • 8-6. Testing the equality of two regression
    coefficients
  • $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \beta_4 X_{4i} + u_i$
    (8.6.1)
  • Test the hypotheses
  • H0: $\beta_3 = \beta_4$ or $\beta_3 - \beta_4 = 0$
    (8.6.2)
  • H1: $\beta_3 \neq \beta_4$ or $\beta_3 - \beta_4 \neq 0$
  • Under the classical assumptions it can be shown
    that
  • $t = [(\hat{\beta}_3 - \hat{\beta}_4) - (\beta_3 - \beta_4)] / se(\hat{\beta}_3 - \hat{\beta}_4)$
  • follows the t distribution with (n-4) df because
    (8.6.1) is a four-variable model or, more
    generally, with (n-k) df, where k is the total
    number of parameters estimated, including the
    intercept term.
  • $se(\hat{\beta}_3 - \hat{\beta}_4) = \sqrt{var(\hat{\beta}_3) + var(\hat{\beta}_4) - 2 cov(\hat{\beta}_3, \hat{\beta}_4)}$
    (8.6.4)
  • (see appendix)
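
A minimal sketch of the test statistic under H0: b3 = b4, using the standard error in (8.6.4). The estimates, variances, and covariance below are hypothetical; in practice they would come from the estimated variance-covariance matrix of the regression.

```python
import math


def t_equal_coefficients(b3, b4, var_b3, var_b4, cov_b3_b4):
    """t = (b3 - b4) / sqrt(var(b3) + var(b4) - 2 cov(b3, b4)) under H0: b3 = b4."""
    se_diff = math.sqrt(var_b3 + var_b4 - 2.0 * cov_b3_b4)
    return (b3 - b4) / se_diff


# Hypothetical estimates (illustration only).
t_value = t_equal_coefficients(0.5, 0.2, 0.01, 0.02, 0.003)
print(t_value)
```

The computed value would then be compared with the critical t value for n-k degrees of freedom.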

145
Chapter 8 MULTIPLE REGRESSION ANALYSIS The
Problem of Inference
  • $t = (\hat{\beta}_3 - \hat{\beta}_4) / \sqrt{var(\hat{\beta}_3) + var(\hat{\beta}_4) - 2 cov(\hat{\beta}_3, \hat{\beta}_4)}$
    (8.6.5)
  • Steps for testing:
  • 1. Estimate $\hat{\beta}_3$ and $\hat{\beta}_4$
  • 2. Compute $se(\hat{\beta}_3 - \hat{\beta}_4)$
    through (8.6.4)
  • 3. Obtain the t ratio from (8.6.5) with H0:
    $\beta_3 = \beta_4$
  • 4. If |t-computed| > t-critical at the designated
    level of significance for the given df, then
    reject H0