Title: Top 10 Things to Remember about Summarizing Bivariate Data
10. Always make a picture
- (scatterplot of data, residual plot)
9. Identify the explanatory and response
variables.
- (either in your equation or define the variables
separately)
True or False: Pearson's correlation coefficient,
r, does not depend on the units of measurement of
the two variables.
True or False: The value of Pearson's correlation
coefficient, r, is always between 0 and 1.
8. If the goal is to describe the strength of the
relationship, report the correlation coefficient.
(Describe strength, direction, form, and unusual
features, always in context.)
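As a rough illustration of the numerical summary named above, the sketch below computes Pearson's r from its definition. The data set (hours studied and quiz scores) and the function name `pearson_r` are made up for this example; they are not from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied (explanatory) and quiz score (response)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 79]

r = pearson_r(hours, scores)
print(round(r, 3))  # sign gives direction, magnitude gives strength
```

The number alone is not a description: form and unusual features still have to be read from the scatterplot and stated in context.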
7. If the goal is to predict, report the LSRL
(least-squares regression line) and the
coefficient of determination.
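A minimal sketch of what reporting the LSRL and the coefficient of determination could look like, assuming the usual least-squares formulas. The data and the helper names `least_squares_line` and `r_squared` are hypothetical, chosen only for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line: predicted y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # y-intercept
    return a, b

def r_squared(xs, ys, a, b):
    """Coefficient of determination: 1 - SSE/SST."""
    mean_y = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical data: hours studied (x) and quiz score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 79]

a, b = least_squares_line(hours, scores)
r2 = r_squared(hours, scores, a, b)
print(f"predicted score = {a:.1f} + {b:.1f} * hours, r^2 = {r2:.3f}")
```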
True or False: The slope of the least squares
line is the average amount by which y increases
as x increases by one unit.
True or False: The slopes of the LSRL for
predicting y from x, and the LSRL for predicting
x from y, are equal.
6. Explain what the slope b and y-intercept a mean
in CONTEXT in the regression line
predicted y = a + bx.
5. Beware of extrapolation (do not assume that a
linear model is valid over a wider range of x
values than the data cover).
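One way to make the extrapolation warning concrete is to flag any prediction whose x value falls outside the range used to fit the line. The sketch below assumes a line that has already been fitted; the coefficients, the observed range, and the function name `predict` are hypothetical.

```python
def predict(a, b, x, x_min, x_max):
    """Return (prediction, is_extrapolation) for predicted y = a + b*x.

    x_min and x_max are the smallest and largest x values in the data
    that were used to fit the line.
    """
    y_hat = a + b * x
    is_extrapolation = not (x_min <= x <= x_max)
    return y_hat, is_extrapolation

# Hypothetical fitted line and observed x range (illustration only)
a, b = 45.2, 6.4
x_min, x_max = 1, 5

inside = predict(a, b, 3, x_min, x_max)    # x within the observed range
outside = predict(a, b, 12, x_min, x_max)  # x well beyond the observed range

for y_hat, flagged in (inside, outside):
    note = "extrapolation, treat with caution" if flagged else "within data range"
    print(f"{y_hat:.1f} ({note})")
```

The flag does not say the prediction is wrong, only that the data give no evidence the linear pattern continues that far.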
True or False: The LSRL passes through the point
True or False: The coefficient of determination
is equal to the positive square root of Pearson's
r.
4. A correlation coefficient of 0 does not
necessarily imply that there is no relationship
between two variables (could be strong but not
linear).
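A small made-up example shows how this can happen: below, y is completely determined by x (y = x squared), yet Pearson's r is 0 because the relationship is not linear. The data and the function name `pearson_r` are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: x values symmetric about 0, y is exactly x squared
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # 0.0: no linear association, despite a perfect curved relationship
```

A scatterplot of these points would show the curve immediately, which is why the picture comes before the numerical summary.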
3. Watch out for influential observations. (An
influential point pulls the LSRL toward itself, so
its residual may be close to 0 in the residual plot.)
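The effect can be seen by fitting the line with and without a suspect point. In the sketch below, the data, the far-out point at x = 20, and the helper name `fit_line` are all made up for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line: predicted y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data with a clear upward trend
xs = [1, 2, 3, 4, 5]
ys = [2.1, 2.9, 4.2, 4.8, 6.0]
a0, b0 = fit_line(xs, ys)

# Add one point far to the right that does not follow the trend
x_far, y_far = 20, 5.0
a1, b1 = fit_line(xs + [x_far], ys + [y_far])

# Residual of the added point about the refitted line (small) versus
# its distance from the line fitted without it (large)
residual_far = y_far - (a1 + b1 * x_far)
residual_vs_old_line = y_far - (a0 + b0 * x_far)

print(f"slope without point: {b0:.2f}, with point: {b1:.2f}")
print(f"residual about new line: {residual_far:.2f}")
print(f"distance from old line: {residual_vs_old_line:.2f}")
```

Because the refitted line has been dragged toward the added point, a residual plot alone can make that point look unremarkable; comparing the two fits is what exposes its influence.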
True or False: If r = 1, the standard deviation
of y is equal to the standard deviation of the
residuals.
True or False: The standard deviation about the
LSRL is roughly the typical amount by which an
observation deviates from the least squares line.
True or False: A transformation (or re-expression)
of a variable is accomplished by substituting a
function of the variable in place of the variable
for further analysis.
True or False: The higher the value of the
coefficient of determination, the greater the
evidence for a causal relationship between x and y.
2. Correlation does not imply causation. A
strong correlation implies only that the two
variables tend to vary together in a predictable
way.
And the #1 thing to remember:
1. Only use QUANTITATIVE data when comparing
bivariate data.
Plot your data. (scatterplot)
Interpret what you see. (direction, form,
strength, outliers)
Numerical summary?
Mathematical model? (Regression line)
How well does it fit? (Residuals and r^2)
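The last step of this checklist can be sketched as follows: fit the line, compute the residuals (observed minus predicted), and summarize their typical size. The data are hypothetical, and the divisor n - 2 for the standard deviation about the line is the usual convention, assumed here rather than taken from the slides.

```python
import math

# Hypothetical data: hours studied (x) and quiz score (y)
xs = [1, 2, 3, 4, 5]
ys = [52, 60, 61, 70, 79]

# Fit the least-squares line: predicted y = a + b*x
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
b = sxy / sxx
a = mean_y - b * mean_x

# Residual = observed y minus predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Standard deviation about the line, using the conventional n - 2 divisor
sse = sum(e ** 2 for e in residuals)
s_e = math.sqrt(sse / (n - 2))

for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.1f}")
print(f"s_e = {s_e:.2f}")
```

Plotting these residuals against x (or against the fitted values) and looking for a pattern is the graphical half of the same check.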
Given this residual plot, which of the following
is not a correct conclusion?
a) The pattern in the residuals indicates the
regression line does not fit the data well.
b) Point A is a candidate as an outlier.
c) Point A is a candidate as an influential
point.
d) The relationship between the variables is
positive.
e) All of these are correct.
(Residual plot: Residuals versus Fitted values)
2) Which of the following residual plots
indicates a reasonable fit to a given set of
data?
a) b) c) d) (residual plots, figures not reproduced)
e) None of these indicates a reasonable fit.
3) Which of the following is a correct conclusion
based on the residual plot displayed?
a) The line overestimates the data.
b) The line underestimates the data.
c) It is not appropriate to fit a line to these
data since there is clearly no correlation
between the variables.
d) The data is not related.
e) None of these choices is correct.
5) You are given the regression equation
temperature = 30.4 - 0.72(distance), where
temperature is the temperature displayed on a
sensor in °C and distance is the distance in
centimeters from the sensor to a heat source.
Which of the following is not a reasonable
conclusion?
a) The temperature of the heat source is
approximately 30.4 °C.
b) The temperature decreases approximately 0.72 °C
for each centimeter the sensor is moved away from
the heat source.
c) We can predict that the sensor displays a
temperature of 21.76 °C when the sensor is 12
centimeters away from the heat source.
d) The correlation coefficient between
temperature and distance indicates a negative
relationship.
e) All of these are reasonable.
7) If the correlation coefficient of a bivariate
set of data (x, y) is r, then which of the
following is true?
a) The variables x and y are linearly related.
b) The correlation coefficient of the set (y, x)
is also r.
c) The correlation coefficient of the set
(x, ay) is a·r.
d) The correlation coefficient of the set
(ax, ay) is a·r.
e) None of these is true.