Regression Lecture 8 - PowerPoint PPT Presentation

Slides: 72
Provided by: DanWr7

Transcript and Presenter's Notes

1
Regression Lecture 8
2
Aims for Today - Regression
  • Drawing lines on scatterplots
  • The regression line: Predicting values
  • Correlation
  • Rank-based correlation
  • Break/Handout
  • Examples by Dan
  • Chile and maybe being hit by a car
  • How tos

3
(No Transcript)
4
Scatter Plot
  • Plotting 2 continuous-ish variables
  • Exploring their association
  • One of the most used and most useful techniques
    in science.

5
(No Transcript)
6
Several ways to make one in SPSS.
7
Default shows what appears to be a negative
relationship, but the graphs can be improved.
8
(No Transcript)
9
(No Transcript)
10
(No Transcript)
11
(No Transcript)
12
(No Transcript)
13
(No Transcript)
14
Graphing 3 Variables (London et al., 2007)
  • 4- to 9-year olds
  • 2 week recall
  • 10 month recall

15
Can you see the 8s?
16
(No Transcript)
17
(No Transcript)
18
Is this the right approach?
  • Fitting a straight line (a linear relationship)

19
Finding the Regression Line
  • Very general procedure (easily expanded)
  • Simple linear regression
  • Easiest way is just to draw a straight line
    yourself
  • A more formal method has some value
  • and finding the $\beta_0$ and $\beta_1$ which minimize $\sum e_i^2$
  • Least Squares is also used in t test and mean
  • (least absolute value is used for the median)
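The link between least squares and the mean, and between least absolute value and the median, can be checked numerically. This is a minimal sketch on made-up numbers (the vector `y` and the helper names are assumptions, not from the lecture):

```r
# Hypothetical toy data, made up for illustration only.
y <- c(2, 3, 5, 8, 40)

# Sum of squared and sum of absolute deviations from a candidate centre m.
sum_sq  <- function(m) sum((y - m)^2)
sum_abs <- function(m) sum(abs(y - m))

# optimize() searches the interval numerically for the minimizing value.
ls_centre  <- optimize(sum_sq,  interval = range(y))$minimum
lad_centre <- optimize(sum_abs, interval = range(y))$minimum

c(least_squares = ls_centre, mean = mean(y))
c(least_absolute = lad_centre, median = median(y))
```

The least squares centre should land on the mean and the least absolute centre on the median, up to the numerical tolerance of `optimize()`.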

20
(No Transcript)
21
  • Minimizing the squared residuals: $\min \sum e_i^2$
  • Is least squares regression better than eyeballing it?
  • Are there better formal methods?

22
Do you need to know the equations for $\beta_0$ and $\beta_1$? Not really.
Would they be worth seeing once? Probably.
(Just look at them, don't write them down.)
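The slide's own equations did not survive in the transcript. For reference, the standard least squares solutions for simple linear regression, writing $\bar{x}$ and $\bar{y}$ for the sample means, are:

```latex
\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```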
23
Regressions sometimes used to predict values
(data based on Tytherleigh, 2002)
24
Running a regression in R
lm is for Linear Model
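The slide's screenshot is not in the transcript, so here is a minimal sketch of an `lm` call on simulated data. The variable names and the true intercept and slope are assumptions chosen for illustration:

```r
# Simulated data: the true intercept (2) and slope (0.5) are assumptions
# chosen for illustration, not values from the lecture.
set.seed(8)
x <- runif(50, min = 0, max = 10)
y <- 2 + 0.5 * x + rnorm(50, sd = 1)

fit <- lm(y ~ x)             # y regressed on x
coef(fit)                    # estimated intercept and slope
summary(fit)$r.squared       # proportion of variance in y accounted for
summary(fit)$adj.r.squared   # adjusted version

# Predicted y for new x values
predict(fit, newdata = data.frame(x = c(2, 5, 8)))

# Scatterplot with the fitted line
plot(x, y)
abline(fit)
```

With a single predictor, `summary(fit)$r.squared` equals `cor(x, y)^2`.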
25
(No Transcript)
26
$r^2$ or adjusted $r^2$, and r or R
27
(No Transcript)
28
Assessing the Fit: The Correlation
29
Equation bit
  • Top part determines whether positive or negative.
    If xi and yi are same side as their means,
    positive, otherwise negative.
  • If as one goes up, the other goes up, positive.
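The equation on this slide was not transcribed. The usual definition of Pearson's r, whose numerator behaves as the "top part" described above, is:

```latex
r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}
```

Each term in the numerator is positive when $x_i$ and $y_i$ fall on the same side of their means and negative otherwise; the denominator is always positive.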

30
Correlation: Strength of the linear relationship
  • Can get to it in several ways.
  • The correlation squared is the proportion of
    shared variance.
  • The correlation can range only from -1 to 1.
  • Does a correlation between x and y mean x caused
    y?
  • Does a correlation between x and y mean that
    there is some causal relationship in the network
    of hypotheses that include x and y?
  • Are the most parsimonious ones x -> y and y -> x?

31
Significance Testing
  • H0: ρ (rho) = 0
  • Almost always use two tailed tests
  • You must know the sample size
  • r = 0.1 is significant with n = 500 at 5%
  • r = 0.4 is not significant with n = 20 at 5%
  • (Cohen's sizes: .1 small, .3 medium, .5 large)
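The two sample-size examples can be checked with a few lines of R. This sketch uses the standard t test for a correlation; the function name `r_p_value` is made up for illustration:

```r
# Two-tailed p-value for H0: rho = 0 from a sample correlation r and
# sample size n, using t = r * sqrt(n - 2) / sqrt(1 - r^2) with df = n - 2.
r_p_value <- function(r, n) {
  t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
  2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)
}

r_p_value(0.1, 500)   # below .05
r_p_value(0.4, 20)    # above .05
```

With raw data rather than a summary r, `cor.test(x, y)` reports the same test.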

32
Significance Testing and Confidence Intervals: The
equations
with df = n - 2, and df = 1, n - 2
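The equations themselves are missing from the transcript. The standard test statistics that carry these degrees of freedom are, as an assumption about what the slide showed:

```latex
t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}} \quad (df = n - 2),
\qquad
F = \frac{r^2 (n - 2)}{1 - r^2} = t^2 \quad (df = 1,\; n - 2)
```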
33
Making Confidence Intervals
  • Several programs on web. http://glass.ed.asu.edu/stats/analysis/rci.html
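The interval can also be computed directly. This is a sketch of the common asymptotic approach based on Fisher's z transformation, which may or may not be what the linked program uses; the function name `r_ci` is hypothetical:

```r
# Approximate confidence interval for a correlation via Fisher's z.
r_ci <- function(r, n, level = 0.95) {
  z    <- atanh(r)                      # Fisher's z transformation
  se   <- 1 / sqrt(n - 3)               # approximate standard error of z
  crit <- qnorm(1 - (1 - level) / 2)    # normal critical value
  tanh(z + c(-1, 1) * crit * se)        # back-transform to the r scale
}

r_ci(0.4, 20)
```

For r = 0.4 with n = 20 the interval includes 0, in line with that correlation not being significant at 5%.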

34
(No Transcript)
35
Notice the Normal and Basic Bootstraps give
impossible upper bounds.
BCa very similar to asymptotic methods
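A rough base-R sketch of why this can happen, on simulated data (the data, the seed, and the simplified normal interval without bias correction are all assumptions for illustration). The normal and basic intervals are built by adding and subtracting around r, so nothing keeps them inside -1 to 1 when r is close to 1, whereas the percentile interval uses the bootstrap correlations themselves:

```r
# Simulated data with a strong correlation (an assumption for illustration).
set.seed(35)
n <- 15
x <- rnorm(n)
y <- x + rnorm(n, sd = 0.3)
r <- cor(x, y)

# Resample (x, y) pairs with replacement and recompute r each time.
boot_r <- replicate(2000, {
  i <- sample(n, replace = TRUE)
  cor(x[i], y[i])
})

# Simplified normal interval (no bias correction), basic, and percentile.
normal_ci     <- r + c(-1, 1) * qnorm(0.975) * sd(boot_r)
basic_ci      <- 2 * r - quantile(boot_r, c(0.975, 0.025), names = FALSE)
percentile_ci <- quantile(boot_r, c(0.025, 0.975), names = FALSE)

rbind(normal = normal_ci, basic = basic_ci, percentile = percentile_ci)
```

BCa intervals are not computed here; `boot.ci(..., type = "bca")` in the boot package provides them.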
36
(No Transcript)
37
(No Transcript)
38
(No Transcript)
39
(No Transcript)
40
r = .64; w/o outlier r = .92; w/o influential
point r = .38
41
Assumptions for Significance
  • Random sampling
  • It must make sense to talk about the response
    variable (the DV) as being continuous.
  • No weird patterns (or non-linearity in general) in
    the residuals. The variance of the residuals should be
    homoscedastic (i.e., not varying with other variables,
    which would be heteroscedastic).
  • Examination of outliers

42
What to do if assumptions are not met (to get the data,
install and load mrt, then data(crime) and attach)
43
Rank-based Correlation
  • Spearman's rho
  • Rank the data and use Pearson's stuff for ties.
  • r = .94 and Spearman's rS = .78.
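A small sketch of the "rank the data, then use Pearson's" idea, on hypothetical numbers that include one extreme point and a tie:

```r
# Hypothetical data with one extreme point and a tie in x.
x <- c(1, 2, 2, 4, 5, 6, 30)
y <- c(2, 1, 3, 5, 4, 7, 90)

pearson  <- cor(x, y)
spearman <- cor(x, y, method = "spearman")

# Spearman's rho is Pearson's r computed on the ranks;
# rank() gives tied values their average rank by default.
by_hand <- cor(rank(x), rank(y))

c(pearson = pearson, spearman = spearman, by_hand = by_hand)
```

The `spearman` and `by_hand` values agree, which is all that changing the method does.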

44
In SPSS and R just tick a box or change the method
Doesn't print confidence interval
45
  • Same correlation estimate.
  • But the CI really does meet appropriate
    assumptions.

46
(No Transcript)
47
Break Time
  • Short break, we have a lot to get through
    afterwards.
  • In 4 groups
  • Look at the handout that I am about to give you.
    Discuss how you would report your findings in a
    scientific journal versus People magazine. Are
    there any other statistics you would want to do?
  • Talk about what you wrote for: Suppose an
    undergraduate said "Since it is for looking at
    differences among means, why is it called an
    Analysis of Variance?"

48
Some Examples
  • Chile Heat: To discuss re-expression and what to
    do with outliers.
  • Automobile Accidents: To discuss using theory to
    guide your statistics.

49
Are smaller chiles hotter?
  • How to measure length and heat.
  • Length skewed

50
Testing Normality
51
par(mfrow = c(1, 2))
qqnorm(LENGTH)
qqline(LENGTH)
qqnorm(log(LENGTH * 2.54))
qqline(log(LENGTH * 2.54))
par(mfrow = c(1, 1))
52
Measuring Heat: Scoville units or the number of
chiles?
53
(No Transcript)
54
(No Transcript)
55
Command Summary
  • r1 <- lm(HEAT ~ LENGTH)
  • r2 <- lm(HEAT[LENGTH < 30] ~ LENGTH[LENGTH < 30])
  • r3 <- lm(HEAT ~ log(LENGTH * 2.54))

56
(No Transcript)
57
(No Transcript)
58
plot(r1): Nu Mex is hotter than predicted for
its length
59
What to do with
  • Genetically engineered.
  • Depends on the population and purpose.

60
What is a "linear model"
$Y = \beta X + e$
Don't worry if you dislike matrix notation
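For anyone who does like matrix notation, here is a short sketch on simulated data (an illustrative assumption) showing that the least squares coefficients from `lm` are the solution of the normal equations, written with a design matrix `X` whose first column is all ones:

```r
# Simulated data (illustrative assumption).
set.seed(60)
x <- rnorm(30)
y <- 1 + 2 * x + rnorm(30)

X <- cbind(1, x)                            # design matrix: intercept column and x
beta_hat <- solve(t(X) %*% X, t(X) %*% y)   # solves (X'X) b = X'y

cbind(matrix_solution = drop(beta_hat), lm = coef(lm(y ~ x)))
```

The two columns match; `lm` itself uses a QR decomposition rather than solving the normal equations directly.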
61
(No Transcript)
62
Vehicle-Pedestrian Accidents
  • What is the relationship between the impact
    velocity of a vehicle and the throw of a
    pedestrian?
  • A lot is known about how a body should move when
    hit by a car at a certain velocity.
  • Good reason to suggest $\text{throw}_i = k v_i^2 + e_i$
  • Dan will glance around to see if anyone
    looks interested in "why" this equation makes
    sense, and may skip the next two slides.

63
Why Theoretical Sense?
  • On impact, the body takes on the horizontal velocity of
    the car, v, at an angle $\theta$ above the horizontal.
  • Vertical velocity: $v_y = v \sin\theta$
  • Horizontal velocity: $v_x = v \cos\theta$
  • Time in air, t, is related only to $v_y$: $t = 2 v_y / g$,
    where g is the constant for gravity on
    Earth, about 10 m/s^2.

64
  • Without friction, $v_x$ is constant and thus throw
    should be $\text{throw} = v_x t = 2 v^2 \sin\theta \cos\theta / g$
  • and if $\theta$ is the same for all cars, $\text{throw} = v^2 k$,
    where k is a constant.
  • Thus, $\text{throw}_i = k v_i^2 + e_i$
  • Simpler: Only 1 unknown (k) to solve for AND it
    has some empirical meaning

65
Wood, Simms & Walsh (2005)
66
Otte's work with crash test dummies
67
  • reglin <- lm(distance ~ speed)
  • regpoly <- lm(distance ~ speed + spsq)
  • regmodel <- lm(distance ~ spsq - 1)

This can be done in SPSS too. Tick
no intercept/constant.
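A runnable sketch of the three models on simulated throws. The data, the value k = 0.05, and the noise level are made up for illustration and are not the crash data from the slides:

```r
# Simulated throws: k = 0.05 and the noise level are made-up values.
set.seed(67)
speed    <- runif(40, min = 5, max = 25)
spsq     <- speed^2
distance <- 0.05 * spsq + rnorm(40, sd = 1)

reglin   <- lm(distance ~ speed)          # straight line
regpoly  <- lm(distance ~ speed + spsq)   # quadratic with intercept
regmodel <- lm(distance ~ spsq - 1)       # theory-driven: only k, no intercept

coef(regmodel)                            # estimate of k

# Number of estimated coefficients in each model
sapply(list(reglin, regpoly, regmodel), function(m) length(coef(m)))
```

The `- 1` in the formula drops the intercept, so `regmodel` estimates the single constant k.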
68
(No Transcript)
69
Summary
70
This week's journal
  • Try help(par)
  • Write an equation in Word
  • Access these data from the web: fishstock in .dat
    (use read.table) or .sav (use SPSS or read.spss)
  • Variables are ocean (how much winter low
    temperature is above freezing in Celsius) and
    fishstock (> 2cm, in thousands per cubic kilometer).
  • What are the correlation and the regression
    equation?
  • Write a sentence about the results.

71
(No Transcript)