MATH 1107 Elementary Statistics - PowerPoint PPT Presentation

About This Presentation
Title:

MATH 1107 Elementary Statistics


Slides: 15
Provided by: MauriceLe70

Transcript and Presenter's Notes

Title: MATH 1107 Elementary Statistics


1
MATH 1107 Elementary Statistics
  • Lecture 7
  • Regression Analysis

2
MATH 1107 Regression Analysis
  • Without question, Regression Analysis is the most
    heavily used tool in Statistical Modeling.
  • This is true because it enables you to predict or
    explain a dependent variable based upon one or
    more independent variables.
  • Regression Analysis is used in almost every
    industry.

3
MATH 1107 Regression Analysis
  • For example:
  • If you were a sports agent, how would you
    propose a reasonable contract salary for your
    client?
  • If you are interested in selling your house, how
    can you determine an appropriate market price?
  • If you are the head of the admissions department
    in a University, how do you decide who gets
    accepted?
  • If you are an investment banker, how do you
    decide which funds to hold in your portfolio?

4
MATH 1107 Regression Analysis
All of the underlined variables would be the
dependent variables. What would be the
associated independent variables that we might
use to predict or explain these dependent
variables?
5
MATH 1107 Regression Analysis
The first step in predicting or explaining a
dependent variable using an independent
variable is evaluating the correlation of the
two variables using a scatterplot. Let's return
to Median Household Income and Death Rate.
Although many independent variables can be used
in regression analysis, in these notes we will
be using only one.
6
MATH 1107 Regression Analysis
7
MATH 1107 Regression Analysis
8
MATH 1107 Regression Analysis
  • Using the CORREL(array1, array2) function in
    EXCEL, we can determine that the correlation
    between Median Income and Death Rate is -0.61.
  • This indicates three things:
  • The relationship is fairly strong: the value of
    -0.61 is closer to -1 than it is to 0.
  • The direction is negative/inverse, meaning that
    as one variable goes up, the other tends to go down.
  • The R² value of a predictive regression equation
    using these two variables is 0.37 (the square of -0.61).
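
The correlation that CORREL reports is the Pearson correlation coefficient, and R² for a one-variable regression is its square. A minimal sketch of both calculations follows. The function name `pearson_r` and the small `incomes` / `death_rates` lists are hypothetical and are NOT the lecture's dataset; only the value -0.61 comes from the slide.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative values, NOT the data behind the slides
incomes = [20000, 25000, 30000, 35000, 40000]
death_rates = [9.5, 8.0, 8.4, 6.9, 6.1]

r = pearson_r(incomes, death_rates)
r_squared = r ** 2

# The correlation quoted on the slide, squared and rounded to two decimals
slide_r = -0.61
slide_r_squared = round(slide_r ** 2, 2)

print(f"illustrative r = {r:.2f}, R^2 = {r_squared:.2f}")
print(f"slide r = {slide_r}, R^2 = {slide_r_squared}")
```

Squaring -0.61 gives 0.3721, which rounds to the 0.37 quoted above.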

9
MATH 1107 Regression Analysis
  • Since the correlation is fairly strong, we can use
    these two variables to create a linear model.
  • It will have an equation in the form y = mx + b
  • It will be the best fit of the data
  • It will minimize the sum of the squared distances
    between the actual data points and the predicted points
    (each such distance is called a residual)
  • It will enable us to predict the death rates in
    other states that were NOT included in the
    original dataset.
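
As a rough illustration of what "best fit" means here, the sketch below fits y = mx + b by ordinary least squares, which is the usual method behind a linear trendline. The helper names `fit_line` and `sum_squared_residuals` and the data values are hypothetical and are not taken from the lecture's dataset.

```python
def fit_line(xs, ys):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

def sum_squared_residuals(xs, ys, m, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical illustrative values, NOT the data behind the slides
incomes = [20000, 25000, 30000, 35000, 40000]
death_rates = [9.5, 8.0, 8.4, 6.9, 6.1]

m, b = fit_line(incomes, death_rates)
best = sum_squared_residuals(incomes, death_rates, m, b)

# Any other line, such as one with a slightly different slope, fits worse
worse = sum_squared_residuals(incomes, death_rates, m * 1.1, b)

print(f"y = {m:.6f}x + {b:.3f}")
print(f"squared error: fitted line {best:.4f}, perturbed line {worse:.4f}")
```

Comparing the two squared-error totals shows the defining property of the fitted line: no other choice of m and b gives a smaller total.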

10
MATH 1107 Regression Analysis
From this analysis, the best fit line is:
y = -0.0002x + 13.255
This equation was provided by EXCEL (tick the "Display
Equation on Chart" option under the "Add
Trendline" function). A better way to represent
this equation is:
State Death Rate = (-0.0002 × Median State Income) + 13.255
11
MATH 1107 Regression Analysis
Let's interpret these values directly. -0.0002 is
the slope of the line. It can be translated
directly to mean: for every one dollar of
additional median income, the predicted death rate
decreases by 0.0002. The slope tells you how
the dependent variable changes with a one-unit
change in the independent variable.
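
Because one dollar is a very small unit, it can help to scale the slope to a larger income difference. The short sketch below does this arithmetic; the function name `change_in_death_rate` is a hypothetical helper, and the slope is the rounded value shown on the chart.

```python
SLOPE = -0.0002  # rounded slope from the fitted equation

def change_in_death_rate(income_change):
    """Predicted change in death rate for a given change in median income."""
    return SLOPE * income_change

per_dollar = change_in_death_rate(1)
per_ten_thousand = change_in_death_rate(10_000)

print(f"per $1: {per_dollar:.4f}")
print(f"per $10,000: {per_ten_thousand:.1f}")
```

Under this model, a state with $10,000 more median income is predicted to have a death rate about 2 lower.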
12
MATH 1107 Regression Analysis
Let's interpret these values directly. 13.255 is
the Y-intercept. Algebraically, this is the
point at which the line crosses the y-axis,
where the x-value is 0. Since it is not
reasonable to have a state with 0 Median Income,
it is not really interpreted directly.
13
MATH 1107 Regression Analysis
Now, using the model we developed, predict the
death rates for the states below:
STATE            MEDIAN INCOME
Virginia         38,223
Washington       34,064
West Virginia    20,301
Wisconsin        33,415
Wyoming          30,379
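
One way to carry out these predictions is to plug each income into the fitted equation y = -0.0002x + 13.255. The sketch below does so; the function name `predict_death_rate` is a hypothetical helper, and because the slope is rounded to four decimal places, the results should be treated as approximate.

```python
SLOPE = -0.0002     # slope from the fitted equation
INTERCEPT = 13.255  # y-intercept from the fitted equation

def predict_death_rate(median_income):
    """Predicted state death rate for a given median income."""
    return SLOPE * median_income + INTERCEPT

# Median incomes from the table in the lecture
state_incomes = {
    "Virginia": 38223,
    "Washington": 34064,
    "West Virginia": 20301,
    "Wisconsin": 33415,
    "Wyoming": 30379,
}

predictions = {
    state: predict_death_rate(income)
    for state, income in state_incomes.items()
}

for state, rate in predictions.items():
    print(f"{state}: {rate:.2f}")
```

For example, Virginia works out to (-0.0002 × 38,223) + 13.255, or about 5.61.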
14
MATH 1107 Regression Analysis
Now, let's determine our residuals, or how far
off we were for each prediction.
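
A residual is the actual value minus the predicted value. The transcript does not include the actual death rates for these states, so the sketch below uses a placeholder observed value purely to show the calculation; `hypothetical_actual` is an assumption and not real data.

```python
def residual(actual, predicted):
    """Residual: how far the observed value is from the model's prediction."""
    return actual - predicted

# Prediction for Virginia from the fitted equation y = -0.0002x + 13.255
predicted_virginia = -0.0002 * 38223 + 13.255

# Placeholder observed value for illustration only, NOT from the lecture
hypothetical_actual = 6.0

virginia_residual = residual(hypothetical_actual, predicted_virginia)
print(f"residual = {virginia_residual:.4f}")
```

A positive residual means the model predicted too low a value; a negative residual means it predicted too high.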