Regression Analysis: Estimating Relationships - PowerPoint PPT Presentation

1 / 45
About This Presentation
Title:

Regression Analysis: Estimating Relationships

Description:

This is estimated for the analysis by using the Relative Demand from the previous quarter ... Significant correlation between Avg Price, Avg Advertising and Quarter ... – PowerPoint PPT presentation

Number of Views:109
Avg rating:3.0/5.0
Slides: 46
Provided by: aloksri
Category:

less

Transcript and Presenter's Notes

Title: Regression Analysis: Estimating Relationships


1
Regression Analysis and Modeling for Decision
Support
2
Regression Analysis Estimating Models for
Decision Support
Regression Analysis is a study of relationship
between a set of independent variables and the
dependent variable. Independent variables are
characteristics that can be measured directly
(example the area of a house). These variables
are also caled predictor variables (used to
predict the dependent variable) or explanatory
variables (used to explain the behavior of the
dependent variable). Dependent variable is a
characteristic whose value depends on the values
of independent variables.
3
Purpose of Regression Analysis
Future/Unknown
Past / Experience / Known
Now
time
ExplanationUse regression analysis to develop a
mathematical model to explain the variance in the
dependent variable based on values of independent
variables.
Prediction If the regression model adequately
explains the dependent variable, use the model to
predict values of the dependent variable.
Explain Selling Price of a house (dependent)
based on its characteristics (independents). If
the model is valid, use it for prediction.
Develop Regression Model using known data
(sample) Selling Price 40,000 100(Sq.ft)
20,000(Baths) If the above model is reliable and
valid, Use this model to predict the Selling
Price of any house based on its area (Sq.ft.) and
the number of bathrooms (Baths) The constant
term (40,000) is the fixed price of the house.
This is not dependent on the values of the
variables considered. Can be interpreted as the
price of the lot and transaction costs. The
coefficient of Sq.ft. (100) is the change in
Selling Price for an additional Square Foot. Can
be interpreted as Price per Sq.Foot.
4
Procedure for Building Regression Models
Define/Clarify Purpose. Identify, graph and
describe the measurement of the dependent
variable. Specific y
Identify possible independent variables
(predictors should make sense). Describe each
X. Use scatter plots and correlations for y with
each x.
Estimate Regression Coefficients (using least
squares method).
Test to see if all coefficients are significant
(reliability). Establish validity (are
relationships as expected, do predictions match
actuals?).
Implement the model in Decision Support System.
Incorporate error in predictions. Outline
limitations/constraints of the model.
Compare predictions with actual values.
Modify/Refine/Expand model if necessary.
5
Selecting Independent Variables Scatter Plots
Scatter Plots are used to visualize the
relationship between any two variables. For
regression analysis, we are looking for strong
linear relationship between the independent and
dependent variable.
Y-Intercept (Constant) Value of the dependent
variable irrespective of the value(s) of the
independent variable(s).
X-Coefficient (Slope) Change in dependent
variable per unit change in independent variable.
Overhead 3996 43 M_Hrs 883 Runs
R-Squared Proportion of variance in dependent
variable explained by independent variable(s).
6
Selecting Independent Variables Correlation
Analysis
Correlation Coefficients are used to measure the
linear relationship between any two variables.
For regression analysis, we are looking for
strong linear relationship between the
independent and dependent variable, and low
correlations among independent variables .
Correlation of MachHrs with Overhead (should be
high)
Correlation of MachHrs with ProdRuns (should be
low)
Correlation of ProdRuns with Overhead (should
be high)
Multicollinearity exists when two independent
variables are highly correlated (redundancy).
7
Simple Linear Regression
Linear regression function One dependent and one
independent variables Mathematical form Y
b0 b1X e b0 and b1 are parameters (unknown
constants) and their values are estimated from a
known sample of X and corresponding Y.
Estimated Model Y-Pred b0 b1X

Y-actual
e
Y-pred
b0 and b1 are estimates (based on a sample) of
B0 and B1 which are parameters (based on
population) Estimation of b0 and b1
(coefficients) is done by the Least Squares
Method. This method selects the line that has
the smallest squared error
B1 slope
B0 y -intercept
X
8
Example of Simple Linear Regression Defining
Objective(s)
Define Objectives
  • Pharmex is a chain of drugstores that operates
    around the country.
  • To see how effective their advertising and other
    promotional activities are, the company has
    collected data from 50 randomly selected
    metropolitan regions.
  • In each region it has compared its own
    promotional expenditures and sales to those of
    the leading competitor in the region over the
    past year.
  • So, Pharmexs objective is to model the
    relationship between Promotion expenditures and
    Sales
  • Since Pharmex is interested in improving its
    sales, relative to its largest competitor, the
    dependent (outcome) variable for this situation
    is
  • Sales Pharmexs sales as a percentage of those
    of the leading competitor. This is the dependent
    (or predicted) variable.

9
Example of SLR Select Independent Variable
Variable Selection
The company expects that there is a positive
relationship between the Relative measures of
Sales and Promotion Expenditures, so that regions
with relatively more expenditures have relatively
more sales.
Promote Pharmexs promotional expenditures as a
percentage of those of the leading competitor.
This is the independent variable (or predictor
variable), one which can be controlled by
Pharmex.
  • Selection Criteria
  • Based in Common Sense and Experience
  • Scatter Plots and Correlations

Description of Variables List each variable, how
measured, and expected relationship with
dependent variable. In this section report
results of Correlation Analyses, Scatter Plots,
etc.
10
Example of SLR Collect and Organize Data
Data Collection
11
Example of SLR Estimate Coefficients
Estimate Model
Regression Procedure in Excel
R-Square 45 of the variance in Sales is
explained by Promote (model)
Estimated Coefficients
Y-intercept (b0) 25.12 Slope (b1) 0.762
Sales-predicted
25.12 0.762 Promote
P-Value Indicates the probability of making a
Type I error (the possibility that the
coefficient is 0, that is there is no
relationship). If this value is greater than .05
do not use the variable as a predictor.
12
Example of SLR Testing the Model
  • Reliability and Validity
  • Does the model make intuitive sense? Is the model
    easy to understand and interpret?
  • Are all coefficients statistically significant?
    (p-values less than .05)
  • Are the signs associated with the coefficients as
    expected?
  • Does the model predict values that are reasonably
    close to the actual values?
  • Is the model sufficiently sound? (High R-square,
    low standard error, etc.)

13
Example of SLR Implementing and Using the Model
Develop a Spreadsheet Model (Decision Support
System)
Estimated
Decision Variable
Forecast (regression formula)
What-if Pharmex spent 160K on promotions?
(Sensitivity analysis) What will Pharmex have to
do to achieve 20 sales more than its competitor?
(goal seeking) What will happen to Pharmexs
sales if its Competitors promotion can be any
value between 130K and 140K? (Monte-Carlo
Simulation)
14
  • Estimating Demand for a Product

15
Conceptual Structure of Demand Model
Firm Demand Total Industry Demand Market
Share Define Market Share Firm Demand / Total
Industry Demand Market Share Firm
Demand / (Avg. Demand Number of Firms)
Market Share Relative Demand / Number of
Firms Firm Demand Total Industry Demand
(Relative Demand / N)
Macro-economic Influences Seasonal
Patterns Stage of Life Cycle
Exogenous Demand
Industry Activity Pricing, Promotion,
Quality
Endogenous Demand
FD
Competitive Profile Relative Pricing,
Promotion, Quality, and Loyalty
Relative Demand
16
Total Industry Demand
17
Relative Demand(Measure of Market Share)
RD is firm specific and a measure of market
share, the predictor variables should also be
relative to industry averages. For example,
relative price of the firm is PREL Firms
Price / Industry Avg. Price
18
Estimating Demand
  • Situation Overview

19
  • Objectives of Analysis

20
Objectives
  • Obtaining reliable forecasts for the Firm Demand
  • Build a more dependable model based on regression
    analysis
  • Make operations more efficient based on more
    reliable forecasts
  • Monitor patterns in the overall demand for the
    industry

21
  • Defining the Data

22
Descriptions of the Variables
  • FD Firm Demand
  • Demand for our Firms product
  • TID Total Industry Demand
  • Includes demand for all competitors in the market
  • MS Market Share
  • Firms percentage share of the market
  • N Number of Competitors
  • AFD Average Demand
  • Average Demand for all competitors in the market
  • Avg. Demand Total Industry Demand / N
  • RD - Relative Demand
  • The firms demand relative to the average demand
    of the market
  • Relative Demand Firm Demand / Avg. Demand

23
Descriptions of the Variables (continued)
  • Firm Demand Total Industry Demand Market
    Share
  • Market Share Firm Demand / Total Industry
    Demand
  • Market Share Firm Demand / (Avg. Demand N)
  • Market Share Relative Demand / N
  • Hence
  • Firm Demand Total Industry Demand (Relative
    Demand / N)

24
Predictor Variables for Total Industry Demand
  • Total Industry Demand is a factor of individual
    company demand
  • Demand for each company depends on macro-economic
    influences and overall industry trends
  • Average Price Indicates industry trends in
    pricing, and is reasonable to use instead of
    individual competitor prices
  • Average Advertising Indicates industry
    expenditures in promotions, marketing
  • In this case, also factors in RD

25
Predictor Variables for Relative Demand
  • Relative Demand is a measure of the firms market
    share
  • Hence, the predictor variables should also be
    relative to Industry Average
  • PREL Relative Price
  • PREL Firm Price / Average Price
  • AREL Relative Advertising
  • AREL Firm Advertising / Average Advertising
  • RD1 Relative Loyalty
  • This is estimated for the analysis by using the
    Relative Demand from the previous quarter

26
  • Description of Variables

27
TID Summary Measures
28
TID Describing Avg. Price with Graphs
  • Values seem to be trending down over time
  • Relatively stable - In the 375 range for the 19
    quarters observed
  • Median is to the right of the Mean in the
    Box-Plot
  • Indicates negative skewness

29
TID Describing Avg. Advertising with Graphs
  • Some indications of seasonality from the
    time-series plot
  • Values change significantly with time
  • Median is almost the same as the mean
  • Slight negative skewness exists

30
TID Describing Total Industry Demand with Graphs
  • Time series plot indicates a steadily increasing
    demand for the product
  • Short term increases are steep
  • Mean is to the right of the Median
  • Indicates positive skewness

31
TID Correlations Matrix
  • Decent correlation between Quarter and TID
  • Consider Quarter for regression analysis
  • High correlation between Avg Price and TID
  • Good candidate for regression analysis
  • High correlation exists between Avg Advertising
    and TID
  • Good candidate for regression analysis
  • Significant correlation between Avg Price, Avg
    Advertising and Quarter
  • Potential candidates for variable exclusion
    during regression analysis

32
RD Scatter Plot Arel vs RD
  • Trendline indicates a positive relationship
  • A correlation factor of .378 indicating less
    correlation between the two variables
  • Arel is a potential candidate to be discarded

33
RD Scatter Plot Prel vs RD
  • Trendline indicates a negative relationship
  • Has a correlation factor of -0.67 indicating a
    fair amount of correlation between the two
    variables

34
RD Scatter Plot RD1 vs RD
  • Trendline indicates a positive relationship
  • Has a correlation factor of 0.711indicating a
    fair amount of correlation between the two
    variables

35
RD Correlations Matrix
  • RD has a reasonably high correlation to Prel and
    RD1
  • RD has a relatively lower correlation to Arel
  • The variable could be discarded during regression
    analysis
  • The correlations between Prel, Arel, and RD1 are
    low enough to indicate that there will not be
    problems related to multicollinearity

36
  • Analysis and Modeling

37
TID Model Regression Analysis (1)
  • Dependent Variable
  • Total Industry Demand
  • Independent Variable
  • Quarter
  • Resulting Equation
  • TID 14218.0702 (645.9825 Quarter)
  • The R2 for the resulting equation indicates that
    the variable Quarter explains 48 of TID, and
    hence the unexplained value is 52
  • This makes for a very poor model despite the
    p-value being very low which implies that the
    probability of a Type 1 error is minimal

38
TID Model Regression Analysis (2)
  • Dependent Variable
  • Total Industry Demand
  • Independent Variables
  • Quarter, Avg. Price, Avg. Advertising
  • Resulting Equation
  • TID 130249 (132.228 Quarter) (-358.613
    Avg Price) (0.263 Avg. Advertising)
  • The R2 for the resulting equation indicates that
    the model explains 90.69 of TID, and hence the
    unexplained value is about 9, this generally
    signifies a good model
  • p-value is within tolerance levels for all
    variables, except Quarter which has a p-value of
    0.21 and includes the 0 value in its range. Hence
    Quarter must be discarded

39
TID Model Regression Analysis (3)
  • Dependent Variable
  • Total Industry Demand
  • Independent Variables
  • Avg. Price, Avg. Advertising
  • Resulting Equation
  • TID 164336.17 (-445.168 Avg Price) (0.262
    Avg. Advertising)
  • The R2 for the resulting equation indicates that
    the model explains 89.67 of TID, and hence the
    unexplained value is about 10.5
  • p-value is within tolerance levels for all
    variables, and none of the variables include the
    0 value in its range
  • This model is good, and is the model that will be
    used for TID

40
RD Model Regression Analysis (3)
  • Dependent Variable
  • Relative Demand
  • Independent Variables
  • Relative Pricing (Prel), Relative Advertising
    (Arel), Loyalty (RD1)
  • Resulting Equation
  • RD 16.13 (-16.445 Prel) (0.779 Arel)
    (0.533 RD1)
  • The R2 for the resulting equation indicates that
    the model explains 95.75 of RD, and hence the
    unexplained value is about 4.25
  • p-value is within tolerance levels for all
    variables, and none of the variables include the
    0 value in its range
  • This model is very good, and is the model that
    will be used for RD

41
  • Verifying the Model

42
Verification of TID Model
43
Verification of RD Model
44
  • Using the Model

45
DSS Forecasting Firm Demand
  • The coefficients from the TID model and the RD
    model are entered into this model
  • Based on estimated values for the various items
    on the left, the Firm Demand is calculated by
    this model
Write a Comment
User Comments (0)
About PowerShow.com