1
Forecasting Theory
  • J. M. Akinpelu

2
What is Forecasting?
  • Forecast - to calculate or predict some future
    event or condition, usually as a result of
    rational study or analysis of pertinent data
  • Webster's Dictionary

3
What is Forecasting?
  • Forecasting methods
  • Qualitative
  • intuitive, educated guesses that may or may not
    depend on past data
  • Quantitative
  • based on mathematical or statistical models
  • The goal of forecasting is to reduce forecast
    error.

4
What is Forecasting?
  • We will consider two types of forecasts based on
    mathematical models
  • Regression forecasting
  • Single-variable (time series) forecasting

5
What is Forecasting?
  • Regression forecasting
  • We use the relationship between the variable of
    interest and the other variables that explain its
    variation to make predictions.
  • The explanatory variables are non-stochastic.
  • The explanatory variables are independent; the
    variable of interest is dependent.

6
What is Forecasting?
  • Regression forecasting
  • Height is the independent variable
  • Weight is the dependent variable

7
What is Forecasting?
  • Single-variable (time series) forecasting
  • We use past history of the variable of interest
    to predict the future.
  • Predictions exploit correlations between past
    history and the future.
  • Past history is stochastic.

8
What is Forecasting?
  • Single-variable (time series) forecasting

9
Normal Distribution
  • A continuous random variable X is normally
    distributed if its density function is given by
  • f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)),
    −∞ < x < ∞.
  • In this case
  • E[X] = μ
  • var(X) = σ².

10
Normal Density Function
11
Maximum Likelihood Estimation
  • Suppose that Y1, …, Yn are continuous random
    variables with respective densities fi(y; θ) that
    depend on some common parameter θ (which can be
    vector-valued). Assume that
  • θ is unknown
  • we observe y1, …, yn.
  • We want to estimate the value of θ associated
    with Y1, …, Yn. Intuitively, we want to find the
    value of θ that is most likely to give rise to
    the data sample y1, …, yn.

12
Maximum Likelihood Estimation
Example: Consider the data sample y1, …, y20
below. Assume that all the densities are the
same, and that the unknown parameter is the mean.
Which of the two distributions most likely
produced the data sample below?
13
Maximum Likelihood Estimation
  • Assume that the observations are independent. We
    define the likelihood function
  • L(θ; y1, …, yn) = f1(y1; θ) f2(y2; θ) ⋯ fn(yn; θ).
  • In maximum likelihood estimation, we choose the
    value of θ that maximizes the likelihood
    function.

14
Maximum Likelihood Estimation
  • Furthermore, since the logarithm is a monotone
    increasing function, the value of θ that
    maximizes the likelihood function also maximizes
    the log-likelihood function
  • ln L(θ; y1, …, yn) = Σi ln fi(yi; θ).
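  • Below is a minimal Python sketch of this idea, using a made-up normal
    sample and a variance treated as known; the grid maximizer of the
    log-likelihood coincides with the sample mean.

    import numpy as np

    rng = np.random.default_rng(0)
    y = rng.normal(loc=5.0, scale=2.0, size=20)   # hypothetical sample y1, ..., y20
    sigma2 = 4.0                                  # variance treated as known

    def log_likelihood(mu, y, sigma2):
        # Log-likelihood of an i.i.d. normal sample with known variance.
        n = len(y)
        return -0.5 * n * np.log(2 * np.pi * sigma2) - np.sum((y - mu) ** 2) / (2 * sigma2)

    candidates = np.linspace(0.0, 10.0, 1001)     # grid of candidate means
    values = [log_likelihood(mu, y, sigma2) for mu in candidates]
    mu_hat = candidates[int(np.argmax(values))]
    print(mu_hat, y.mean())                       # the maximizer agrees with the sample mean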

15
Maximum Likelihood Estimation and Least Squares
Estimation
  • Now assume that Yi is normally distributed with
    mean μi(θ), where θ is unknown. Assume also that
    all of the densities have a common known variance
    σ². Then the log-likelihood function becomes
  • L(θ; y1, …, yn) = −(n/2) ln(2πσ²)
    − (1/(2σ²)) Σi (yi − μi(θ))².

16
Maximum Likelihood Estimation and Least Squares
Estimation
  • Hence maximizing L(θ; y1, …, yn) is equivalent
    to minimizing the sum of squared deviations
  • S(θ) = Σi (yi − μi(θ))².
  • The value of θ that minimizes S(θ) is called the
    least squares estimate of θ.

17
Regression Forecasting
  • We suppose that Y is a variable of interest, and
    X1, …, Xp are explanatory or predictor variables
    such that
  • Y = h(X1, …, Xp; β).
  • h is the mathematical model that determines the
    relationship between the variable of interest and
    the explanatory variables
  • β = (β0, …, βm)′ are the model parameters.

18
Regression Forecasting
  • Further assume that
  • we know h (i.e., the model is known), but we do
    not know β
  • we have noisy measurements of the variable of
    interest, Y
  • yi = h(xi1, …, xip; β) + ei.

19
Regression Forecasting
  • the random noise terms ei satisfy
  • E[ei] = 0 for all i.
  • var(ei) = σ², a constant that does not depend on
    i.
  • The ei's are uncorrelated.
  • The ei's are each normally distributed (which,
    combined with being uncorrelated, implies that
    they are independent), i.e.,
  • ei ~ N(0, σ²).

20
Regression Forecasting
  • Note that since
  • yi = h(xi1, …, xip; β) + ei
  • and
  • ei ~ N(0, σ²)
  • then
  • yi ~ N(h(xi1, …, xip; β), σ²).

21
Regression Forecasting
  • For any values of the explanatory variables x1,
    …, xp, if β is known, we can predict y as
  • y = h(x1, …, xp; β).
  • Since β is unknown, we use least squares
    estimation to estimate β, which we denote by β̂.
    In this case, we forecast y as ŷ = h(x1, …, xp; β̂).

22
Regression Forecasting
Example
What are the best values for β0 and β1?
23
Regression Forecasting
Residuals are the differences between the
observed values and the predicted values. We
define the residual for the ith observation
as ei = yi − ŷi. A good set of parameters is
one for which the residuals are small.
24
Regression Forecasting
  • More specifically, if ei = yi − h(xi1, …, xip; β),
  • then we choose β̂ to minimize the sum of squared
    residuals S(β) = Σi ei².

25
Regression Forecasting
  • Examples of Regression Models

26
Constant Mean Regression
  • Suppose that the yi's are a constant value plus
    noise
  • yi = β0 + ei,
  • i.e., β = β0. Hence
  • yi ~ N(β0, σ²).
  • We want to determine the value of β0 that
    minimizes
  • S(β0) = Σi (yi − β0)².

27
Constant Mean Regression
  • Taking the derivative of S(β0) gives
  • dS/dβ0 = −2 Σi (yi − β0).
  • Finally, setting this equal to zero leads to
  • β̂0 = (1/n) Σi yi = ȳ.
  • Hence the sample mean is the least squares
    estimator for β0.
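  • A small Python check, with made-up observations, that the sample mean
    minimizes S(β0) over a grid of candidate values:

    import numpy as np

    y = np.array([10.2, 9.7, 10.5, 9.9, 10.1])           # illustrative observations
    b0_grid = np.linspace(9.0, 11.0, 2001)               # candidate values of beta_0
    S = [np.sum((y - b0) ** 2) for b0 in b0_grid]        # sum of squared deviations
    print(b0_grid[int(np.argmin(S))], y.mean())          # both are approximately 10.08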

28
Constant Mean Regression
  • Example: yi = β0 + ei

29
Simple Linear Regression
  • Consider the model
  • yi = β0 + β1 xi + ei,
  • i.e., β = (β0, β1)′. Hence
  • yi ~ N(β0 + β1 xi, σ²).
  • We want to determine the values of β0 and β1 that
    minimize
  • S(β0, β1) = Σi (yi − β0 − β1 xi)².

30
Simple Linear Regression
  • Setting the first partial derivatives equal to
    zero gives the normal equations
  • Σi (yi − β0 − β1 xi) = 0
  • Σi xi (yi − β0 − β1 xi) = 0.

31
Simple Linear Regression
  • Solving for β0 and β1 leads to the least squares
    estimates
  • β̂1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²
  • β̂0 = ȳ − β̂1 x̄.
  • (This is left as a homework exercise.)
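  • A minimal Python sketch, with illustrative x and y values, that
    evaluates these closed-form estimates directly:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])              # illustrative predictor values
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])              # illustrative responses
    x_bar, y_bar = x.mean(), y.mean()
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b0 = y_bar - b1 * x_bar
    print(b0, b1)                                        # fitted line: y_hat = b0 + b1 * x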

32
Simple Linear Regression
  • Define e = (e1, …, en)′, where ei = yi − β̂0 − β̂1 xi.
  • The normal equations
  • imply that Σi ei = 0 and Σi xi ei = 0.

33
Simple Linear Regression
  • Example

34
Simple Linear Regression
  • Example continued

35
Simple Linear Regression
  • Example (continued)
  • Regression equation

36
General Linear Regression
  • Consider the linear regression model
  • yi = β0 + β1 xi1 + … + βp xip + ei,
  • or
  • yi = xi′β + ei,
  • where xi = (1, xi1, …, xip)′ and β = (β0, …, βp)′.

37
General Linear Regression
  • Suppose that we have n observations yi. We
    introduce matrix notation and define
  • y = (y1, …, yn)′, e = (e1, …, en)′,
  • and X, the matrix whose ith row is
    xi′ = (1, xi1, …, xip).
  • Note that y is n × 1, e is n × 1, and X is
    n × (p + 1).

38
General Linear Regression
  • Then we can write the regression model as
  • y = Xβ + e.
  • Note that y has a mean vector and covariance
    matrix given by
  • E[y] = Xβ and var(y) = σ²I,
  • where I is the n × n identity matrix.

39
General Linear Regression
  • Note that by var(y), we mean the n × n matrix
    whose (i, j) entry is cov(yi, yj), so its diagonal
    entries are var(y1), …, var(yn).
  • Note that this is a symmetric matrix.

40
General Linear Regression
  • We assume that the matrix X, which is called the
    design matrix, is of full rank. This means that
    the columns of the X matrix are linearly
    independent.
  • A violation of this assumption would indicate
    that some of the independent variables are
    redundant, since at least one of the variables
    would contain the same information as a linear
    combination of the others.

41
General Linear Regression
  • In matrix notation, the least squares criterion
    can be expressed as minimizing
  • S(β) = (y − Xβ)′(y − Xβ).
  • The least squares estimator is given by
  • β̂ = (X′X)⁻¹X′y.
  • (Proof is omitted.)

42
General Linear Regression
  • It follows that the prediction for Y is given by
  • ŷ = Xβ̂ = X(X′X)⁻¹X′y.
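  • A short Python sketch of this matrix computation; the data are
    illustrative, and np.linalg.solve is used rather than forming (X′X)⁻¹
    explicitly:

    import numpy as np

    x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])             # illustrative predictor
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])              # illustrative responses
    X = np.column_stack([np.ones_like(x1), x1])          # design matrix with intercept column
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)         # least squares estimate
    y_hat = X @ beta_hat                                 # predicted values
    residuals = y - y_hat
    print(beta_hat)
    print(X.T @ residuals)                               # X'e is zero up to rounding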

43
General Linear Regression
  • Example Simple Linear Regression

44
General Linear Regression
  • Example Simple Linear Regression (p = 1)

45
General Linear Regression
  • Example Simple Linear Regression (p = 1)

46
Properties of Least Squares Estimators
  • The least squares estimator of β is unbiased:
    E[β̂] = β.
  • The (p + 1) × (p + 1) covariance matrix of the
    least squares estimator is given by
  • var(β̂) = σ²(X′X)⁻¹.
  • If the errors are normally distributed, then
  • β̂ ~ N(β, σ²(X′X)⁻¹).
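  • A brief Python continuation of the sketch above that estimates this
    covariance matrix, replacing the unknown σ² with the residual-based
    estimate s²:

    import numpy as np

    x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
    X = np.column_stack([np.ones_like(x1), x1])
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_hat
    n, k = X.shape                                       # k = p + 1 parameters
    s2 = np.sum(resid ** 2) / (n - k)                    # estimate of sigma^2
    cov_beta = s2 * np.linalg.inv(X.T @ X)               # estimated covariance of beta_hat
    print(cov_beta)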

47
Properties of Least Squares Estimators
  • It follows from the derivation of the least
    squares estimate that the residuals satisfy
  • X′e = 0.
  • In particular, since the first column of X is all
    ones, Σi ei = 0.

48
Testing the Regression Model
  • How well does the regression line describe the
    relationship between the independent and
    dependent variables?

49
Testing the Regression Model
(Figure: explained deviation, ŷi − ȳ, and unexplained
deviation, yi − ŷi, of an observation about the mean.)
50
Testing the Regression Model
  • Let's analyze these variations:
  • yi − ȳ = (ŷi − ȳ) + (yi − ŷi), so
  • Σi (yi − ȳ)² = Σi (ŷi − ȳ)² + Σi (yi − ŷi)²
    + 2 Σi (ŷi − ȳ)(yi − ŷi).
  • But the cross-product term vanishes, since
    X′e = 0.

51
Testing the Regression Model
  • Hence SSTO = SSR + SSE, where
  • SSTO = Σi (yi − ȳ)² is the total sum of squares,
  • SSR = Σi (ŷi − ȳ)² is the sum of squares due to
    regression, and
  • SSE = Σi (yi − ŷi)² is the sum of squares due to
    error.

52
Coefficient of Determination
  • The coefficient of determination R² is a measure
    of how well the model is doing in explaining the
    variation of the observations around their mean:
  • R² = SSR/SSTO = 1 − SSE/SSTO.
  • A large R² (near 1) indicates that a large
    portion of the variation is explained by the
    model.
  • A small value of R² (near 0) indicates that only
    a small fraction of the variation is explained by
    the model.
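  • A short Python sketch, continuing the illustrative fit used earlier,
    that computes the sums of squares and R²:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    y_hat = b0 + b1 * x
    ssto = np.sum((y - y.mean()) ** 2)                   # total sum of squares
    ssr = np.sum((y_hat - y.mean()) ** 2)                # explained by the regression
    sse = np.sum((y - y_hat) ** 2)                       # error sum of squares
    print(ssto, ssr + sse)                               # equal, up to rounding
    print(ssr / ssto, 1 - sse / ssto)                    # R^2 computed two equivalent ways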

53
Correlation Coefficient
  • The correlation coefficient R is the square root
    of the coefficient of determination. For simple
    linear regression, it can also be expressed as
  • R = Σi (xi − x̄)(yi − ȳ) /
    √(Σi (xi − x̄)² Σi (yi − ȳ)²).
  • It varies between -1 and 1, and quantifies the
    strength of the association between the
    independent and dependent variables. A value of R
    close to 1 indicates a strong positive
    correlation; a value close to -1 indicates a
    strong negative correlation. A value close to
    zero indicates weak or no correlation.

54
Correlation
(Figure: scatter plots illustrating positive
correlation, negative correlation, and no
correlation.)
55
Testing the Regression Model
  • Example Simple Linear Regression
  • Regression equation

56
Estimating the Variance
  • So far we have assumed that we know the variance
    σ². But in general this value will be unknown.
    We can estimate σ² from the sample data by
  • s² = SSE / (n − p − 1) = Σi (yi − ŷi)² / (n − p − 1).

57
Confidence Interval for Regression Line
  • Simple Linear Regression: Suppose x0 is a
    specified value of the independent variable. A
    100(1-α)% confidence interval for the value of
    the mean of the dependent variable y0 at x0 is
    given by
  • ŷ0 ± t(α/2, n−2) s √(1/n + (x0 − x̄)²/Σi (xi − x̄)²),
  • where ŷ0 = β̂0 + β̂1 x0 and s = √(SSE/(n − 2)).

58
Prediction Interval for an Observation
  • Simple Linear Regression: A 100(1-α)% prediction
    interval for an observation y0 associated with x0
    is given by
  • ŷ0 ± t(α/2, n−2) s √(1 + 1/n + (x0 − x̄)²/Σi (xi − x̄)²).
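  • A Python sketch of both intervals at a chosen x0, reusing the same
    illustrative data and scipy.stats.t for the t quantile:

    import numpy as np
    from scipy.stats import t

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
    n = len(x)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    s = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))   # estimate of sigma

    x0 = 3.5                                             # chosen predictor value
    y0_hat = b0 + b1 * x0
    leverage = 1 / n + (x0 - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2)
    t_crit = t.ppf(0.975, df=n - 2)                      # 95% two-sided quantile
    ci = (y0_hat - t_crit * s * np.sqrt(leverage),
          y0_hat + t_crit * s * np.sqrt(leverage))       # confidence interval for the mean
    pi = (y0_hat - t_crit * s * np.sqrt(1 + leverage),
          y0_hat + t_crit * s * np.sqrt(1 + leverage))   # prediction interval for an observation
    print(ci)
    print(pi)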

59
What is Forecasting (Revisited)?
  • Statistical forecasting is not predicting
  • a value
  • Statistical forecasting is predicting
  • the expected value
  • variability about the expected value

60
Homework
  • Complete the proof for the result that for the
    simple linear regression model,
  • β̂1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)² and
    β̂0 = ȳ − β̂1 x̄.
  • Prove that if Y is a random variable with finite
    expected value, then the constant c that
    minimizes E[(Y − c)²] is c = E[Y].

61
Homework
  • Suppose that the following data represent the
    total costs and the number of units produced by a
    company.
  • Graph the relationship between X and Y.
  • Determine the simple linear regression line
    relating Y to X.
  • Predict the costs for producing 10 units. Give a
    95% confidence interval for the costs, and for
    the expected value (mean) of the costs associated
    with 10 units.
  • Compute the SSTO, SSR, SSE, R and R². Interpret
    the value of R².

62
Homework
  • Consider the fuel consumption data on the next
    slide, and the following model which relates fuel
    consumption (Y) to the average hourly temperature
    (X1) and the chill index (X2)
  • Plot Y versus X1 and Y versus X2.
  • Determine the least squares estimates for the
    model parameters.
  • Predict the fuel consumption when the temperature
    is 35 and the chill index is 10.
  • Compute the SSTO, SSR, SSE and R². Interpret the
    value of R².

63
Data for Problem 4
64
References
  • Bovas Abraham, Johannes Ledolter, Statistical
    Methods for Forecasting, Wiley Series in
    Probability and Mathematical Statistics, 1983.
  • Stanton A. Glantz, Bryan K. Slinker, Primer of
    Applied Regression and Analysis of Variance,
    Second Edition, McGraw-Hill, Inc., 2001.
  • Spyros Makridakis, Steven C. Wheelwright, Rob J.
    Hyndman, Forecasting: Methods and Applications,
    Third Edition, John Wiley & Sons, Inc., 1998.