Testing Relationships between Variables - PowerPoint PPT Presentation

Slides: 19
Provided by: shanej8
1
Testing Relationships between Variables
Statistics 111 - Lecture 17
2
Administrative Notes
  • Homework 5 due tomorrow
  • Lecture on Wednesday will be a review of the entire
    course

3
Final Exam
  • Thursday from 10:40-12:10
  • It'll be right here in this room
  • Calculators are definitely needed!
  • Single 8.5 x 11 cheat sheet (two-sided) allowed
  • I've put a sample final on the course website

4
Outline
  • Review of Regression coefficients
  • Hypothesis Tests
  • Confidence Intervals
  • Examples

5
Two Continuous Variables
  • Visually summarize the relationship between two
    continuous variables with a scatterplot
  • Numerically, we focus on the best fit line
    (regression)

Education and Mortality: Mortality = 1353.16 - 37.62 × Education
Draft Order and Birthday: Draft Order = 224.9 - 0.226 × Birthday
6
Best values for Regression Parameters
  • The best fit line has these values for the
    regression coefficients
  • Also can estimate the average squared residual

Best estimate of slope β
Best estimate of intercept α
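
The slide's formulas did not survive extraction, so the following is only a minimal sketch of the standard least-squares estimates: slope b, intercept a, and the average squared residual. It assumes the usual convention of dividing the sum of squared residuals by n - 2; the function name `fit_line` and the data are made up for illustration and are not from the lecture.

```python
def fit_line(xs, ys):
    """Least-squares intercept a, slope b, and residual variance s2 = SSE / (n - 2)."""
    n = len(xs)
    if n < 3 or n != len(ys):
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of X, and sum of cross-products of X and Y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx              # best estimate of the slope
    a = y_bar - b * x_bar      # best estimate of the intercept
    # Residual = observed Y minus predicted Y.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)  # assumed n - 2 divisor
    return a, b, s2


# Made-up illustration data (not from the lecture).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, s2 = fit_line(xs, ys)
print(f"intercept a = {a:.3f}, slope b = {b:.3f}, s^2 = {s2:.4f}")
```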
7
Significance of Regression Line
  • Does the regression line show a significant
    linear relationship between the two variables?
  • If there is not a linear relationship, then we
    would expect zero correlation (r = 0)
  • So the slope b should also be zero
  • Therefore, our test for a significant
    relationship will focus on testing whether our
    slope β is significantly different from zero
  • H0: β = 0 versus Ha: β ≠ 0
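
One way to see why zero correlation goes with zero slope: the least-squares slope can be written in terms of the correlation r and the sample standard deviations of X and Y. The symbols s_x and s_y are notation introduced here for illustration; they do not appear in the slides.

```latex
b = r \,\frac{s_y}{s_x}, \qquad s_x > 0,\; s_y > 0
\quad\Longrightarrow\quad
b = 0 \iff r = 0 .
```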

8
Linear Regression
  • The best fit line is called the Simple Linear
    Regression Model
  • Coefficients: α is the intercept and β is the slope
  • Other common notation: β0 for intercept, β1 for
    slope
  • Our Y variable is a linear function of the X
    variable but we allow for error (ei) in each
    prediction
  • We approximate the error by using the residual

Observed Yi
Predicted Yi = α + βXi
9
Test Statistic for Slope
  • Our test statistic for the slope is similar in
    form to all the test statistics we have seen so
    far
  • The standard error of the slope SE(b) has a
    complicated formula that requires some matrix
    algebra to calculate
  • We will not be doing this calculation manually
    because the JMP software does this calculation
    for us!
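
The test statistic formula was an image on the slide, so this is a hedged sketch of the usual form, T = b / SE(b), rather than a copy of the lecture's notation. For a single predictor the standard error reduces to s / sqrt(Sxx), where s is the residual standard deviation and Sxx is the sum of squared deviations of X. The function name and the data are hypothetical.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (b, SE(b), T) for the least-squares slope, where T = b / SE(b)."""
    n = len(xs)
    if n < 3 or n != len(ys):
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))   # residual standard deviation
    se_b = s / math.sqrt(sxx)      # standard error of the slope
    return b, se_b, b / se_b       # T is undefined when the fit is perfect (se_b == 0)


# Made-up illustration data (not from the lecture).
b, se_b, t = slope_t_statistic([1.0, 2.0, 3.0, 4.0, 5.0],
                               [2.1, 3.9, 6.2, 7.8, 10.1])
print(f"b = {b:.3f}, SE(b) = {se_b:.4f}, T = {t:.2f}")

# From summary numbers alone: the slope (-37.62) and standard error (8.307)
# quoted elsewhere in the lecture for the education and mortality data.
t_summary = -37.62 / 8.307
print(f"T = {t_summary:.2f}")
```

The second T value is recomputed from those two quoted numbers; it is not read off the JMP output, which did not survive in the transcript.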

10
Example: Education and Mortality
11
Confidence Intervals for Coefficients
  • JMP output also gives the information needed to
    make confidence intervals for slope and intercept
  • 100·C% confidence interval for slope β:
  • b ± t(n-2) × SE(b)
  • The multiple t comes from a t distribution with
    n-2 degrees of freedom
  • 100·C% confidence interval for intercept α:
  • a ± t(n-2) × SE(a)
  • Usually, we are less interested in intercept α
    but it might be needed in some situations

12
Confidence Intervals for Example
  • We have n = 60, so our multiple t comes from a t
    distribution with d.f. = 58. For a 95% C.I.,
    t = 2.00
  • 95% confidence interval for slope β:
  • -37.6 ± 2.0 × 8.307 = (-54.2, -21.0)
  • Note that this interval does not contain zero!
  • 95% confidence interval for intercept α:
  • 1353 ± 2.0 × 91.42 = (1170, 1536)
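
The interval arithmetic for this example can be checked with a short sketch. The helper name `t_interval` is hypothetical; the estimates, standard errors, and the multiple t = 2.00 are the values quoted on the slide.

```python
def t_interval(estimate, std_error, t_multiple):
    """Return (low, high) for estimate ± t_multiple × std_error."""
    margin = t_multiple * std_error
    return estimate - margin, estimate + margin


# Values quoted on the slide for the education and mortality example (n = 60).
slope_ci = t_interval(-37.6, 8.307, 2.00)
intercept_ci = t_interval(1353, 91.42, 2.00)

print(f"slope:     ({slope_ci[0]:.1f}, {slope_ci[1]:.1f})")
print(f"intercept: ({intercept_ci[0]:.0f}, {intercept_ci[1]:.0f})")
```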

13
Another Example: Draft Lottery
  • Is the negative linear association we see between
    birthday and draft order statistically
    significant?

[JMP output: p-value for the slope]
14
Another Example: Draft Lottery
  • p-value < 0.0001, so we reject the null hypothesis
    and conclude that there is a statistically
    significant linear relationship between birthday
    and draft order
  • Statistical evidence that the randomization was
    not done properly!
  • 95% confidence interval for slope β:
  • -0.23 ± 1.98 × 0.05 = (-0.33, -0.13)
  • Multiple t = 1.98 from t distribution with
    n-2 = 363 d.f.
  • Confidence interval does not contain zero, which
    we expected from our hypothesis test
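
The agreement between the interval and the test is not a coincidence: a two-sided test of H0: β = 0 rejects exactly when the matching confidence interval excludes zero, provided both use the same multiple t. A minimal sketch of that check, using the draft lottery numbers quoted above; both function names are hypothetical.

```python
def rejects_zero_slope(b, se_b, t_multiple):
    """Two-sided test of H0: slope = 0, rejecting when |b / SE(b)| exceeds t."""
    return abs(b / se_b) > t_multiple


def interval_excludes_zero(b, se_b, t_multiple):
    """True when the interval b ± t × SE(b) lies entirely on one side of zero."""
    low = b - t_multiple * se_b
    high = b + t_multiple * se_b
    return low > 0 or high < 0


# Draft lottery values from the slide: b = -0.23, SE(b) = 0.05, t = 1.98.
print(rejects_zero_slope(-0.23, 0.05, 1.98))
print(interval_excludes_zero(-0.23, 0.05, 1.98))
```

Both checks give the same answer for any slope and standard error, because |b| > t × SE(b) is the same condition as zero lying outside b ± t × SE(b).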

15
Education Example
  • Dataset of 78 seventh-graders: relationship
    between IQ and GPA
  • Clear positive association between IQ and grade
    point average

16
Education Example
  • Is the positive linear association we see between
    GPA and IQ statistically significant?

[JMP output: p-value for the slope]
17
Education Example
  • p-value < 0.0001, so we reject the null hypothesis
    and conclude that there is a statistically
    significant positive relationship between IQ and
    GPA
  • 95% confidence interval for slope β:
  • 0.101 ± 1.99 × 0.014 = (0.073, 0.129)
  • Multiple t = 1.99 from t distribution with
    n-2 = 76 d.f.
  • Confidence interval does not contain zero, which
    we expected from our hypothesis test

18
Next Class - Lecture 18
  • Review of course material