Lake Eutrophication and a Golf Course

Provided by: brucek64
Transcript and Presenter's Notes

1
Lake Eutrophication and a Golf Course
  • Chlorophyll-a (C) is a widely used indicator
    of eutrophication
  • Nitrogen (N) is associated with eutrophication
  • Q: Golf course development. Nitrogen is expected to
    increase. By how much will C increase or decrease in the
    local lake?
  • Let's look at data from other lakes and fit a
    linear relation between C and N
  • The slope of the relationship gives the expected
    effect on C of a unit increase in N

2
Ordinary Least Squares (OLS) Regression
  • Estimators have many properties.
  • The constant 6 is an estimator, but not a very good one.
  • Two main properties we care about:
  • Unbiased: the expected difference between the estimator
    and the quantity it is estimating is 0.
  • Efficient: small variance (uncertainty)
  • 6 is biased, but has a very small variance
    (zero).
  • Also called the Classical Linear Regression Model
    (CLRM)
  • Find the intercept and slope parameters such that
    the sum of squared residuals is as small as
    possible
  • OLS is an estimator for the parameters of the
    model

Given that certain assumptions are satisfied, the OLS
estimator is unbiased and has the minimum variance of
all unbiased estimators.
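As a minimal sketch of what "smallest sum of squared residuals" means with one predictor, the slope and intercept have a closed form that can be checked against lm(). The data below are simulated, and the names x, y, b0, b1 and ssr are illustrative assumptions, not part of the lake data set.

```r
# Simulated data (hypothetical; not the lake data)
set.seed(1)
x <- runif(50, 0, 30)
y <- 5 + 2 * x + rnorm(50, sd = 3)

# Closed-form OLS estimates for a one-predictor model
b1 <- cov(x, y) / var(x)        # slope
b0 <- mean(y) - b1 * mean(x)    # intercept

# The same estimates from lm()
fit <- lm(y ~ x)
print(c(b0 = b0, b1 = b1))
print(coef(fit))

# Sum of squared residuals at the OLS solution and at a perturbed slope
ssr <- function(a, b) sum((y - a - b * x)^2)
ssr_ols   <- ssr(b0, b1)
ssr_other <- ssr(b0, b1 + 0.1)
```

Any other intercept and slope pair, such as the perturbed one here, gives a larger sum of squared residuals than the OLS pair.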
3
> chlor <- read.csv("Chlorophyll.csv")
> c1.lm <- lm(Chlorophyll.a ~ Nitrogen, data = chlor)
> summary(c1.lm)

Call:
lm(formula = Chlorophyll.a ~ Nitrogen, data = chlor)

Residuals:
   Min     1Q Median     3Q    Max
-58.73 -34.13 -10.73  30.97  92.77

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  110.337     23.997   4.598 0.000127 ***
Nitrogen      -4.300      1.596  -2.694 0.012946 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 44.55 on 23 degrees of freedom
Multiple R-Squared: 0.2399,  Adjusted R-squared: 0.2068
F-statistic: 7.259 on 1 and 23 DF,  p-value: 0.01295
4
But there's a problem...
5
(No Transcript)
6
Call:
lm(formula = Chlorophyll.a ~ Phosphorus, data = chlor)

Residuals:
    Min      1Q  Median      3Q     Max
-36.148 -13.901  -5.022   5.254  61.037

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.34093    6.72380   1.687    0.105
Phosphorus   0.30241    0.03512   8.610 1.19e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 24.86 on 23 degrees of freedom
Multiple R-Squared: 0.7632,  Adjusted R-squared: 0.7529
F-statistic: 74.13 on 1 and 23 DF,  p-value: 1.189e-08
7
Call:
lm(formula = Chlorophyll.a ~ Phosphorus + Nitrogen, data = chlor)

Residuals:
    Min      1Q  Median      3Q     Max
-37.008 -14.115  -7.214   7.675  61.875

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -9.38601   21.32504  -0.440    0.664
Phosphorus   0.33321    0.04622   7.210 3.17e-07 ***
Nitrogen     1.20043    1.17221   1.024    0.317
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 24.84 on 22 degrees of freedom
Multiple R-Squared: 0.774,  Adjusted R-squared: 0.7534
F-statistic: 37.67 on 2 and 22 DF,  p-value: 7.867e-08
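The Nitrogen coefficient changes sign between the nitrogen-only fit (-4.300) and the fit that also includes Phosphorus (1.20043). A rough sketch of how that can happen, using simulated data: the variables P, N, chl and all the numbers below are hypothetical, and the negative correlation between N and P is an assumption made for illustration, not something read from the lake data.

```r
# Simulated lakes (hypothetical values; not the Chlorophyll.csv data)
set.seed(42)
n   <- 5000
P   <- runif(n, 0, 400)                        # phosphorus
N   <- 20 - 0.05 * P + rnorm(n, sd = 2)        # assumed: N falls as P rises
chl <- 0.3 * P + 1.2 * N + rnorm(n, sd = 25)   # true effect of N is +1.2

short <- lm(chl ~ N)        # omits P
long  <- lm(chl ~ P + N)    # includes P

slope_short <- unname(coef(short)["N"])
slope_long  <- unname(coef(long)["N"])
print(c(short = slope_short, long = slope_long))
```

With P left out, the N coefficient absorbs part of the effect of P and comes out negative; with P included, it moves back toward the value used to generate the data.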
8
(No Transcript)
9
Back to the golf course
  • Use data to estimate parameter values that give the
    best fit: b0 = -9.4, b1 = 0.3, b2 = 1.2
  • Answer: a one-unit increase in N results in
    about a 1.2-unit increase in C.
  • Importance: omitting phosphorus from the model
    introduced significant bias!
  • But there's a lot of uncertainty in the estimate
    of the effect of N
  • The 95% CI ranges from about -1.2 to about 3.6
  • Question: does nitrogen have any effect on
    chlorophyll-a in these lakes?
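The 95% interval for the nitrogen effect can be reproduced from the reported estimate and standard error of the two-variable fit (1.20043 and 1.17221, with 22 residual degrees of freedom). This is a sketch with illustrative variable names; with the fitted model in hand, confint(c3.lm) should give the same interval.

```r
# Values copied from the Phosphorus + Nitrogen regression output
b2  <- 1.20043   # Nitrogen estimate
se2 <- 1.17221   # its standard error
df  <- 22        # residual degrees of freedom

# Two-sided 95% interval: estimate +/- t critical value * standard error
t_crit <- qt(0.975, df)
ci <- b2 + c(-1, 1) * t_crit * se2
print(round(ci, 2))
```

The interval includes zero, which is why the slide goes on to ask whether nitrogen has any effect at all.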

10
Does nitrogen have an effect?
  • In multiple regression, you can't tell just from
    looking at the p-values of the individual
    coefficients
  • If two independent variables are collinear
    (correlated), then the p-values will be inflated
    or deflated
  • Instead, look at the effect of removing each
    variable, one at a time, from the model
  • This uses an F-test of the null hypothesis that the
    increased goodness of fit from that variable is
    just due to chance

> Anova(c3.lm)
Anova Table (Type II tests)

Response: Chlorophyll.a
           Sum Sq Df F value    Pr(>F)
Phosphorus  32070  1 51.9830 3.171e-07 ***
Nitrogen      647  1  1.0487    0.3169
Residuals   13572 22
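The F value for Nitrogen can be rebuilt from the sums of squares in the table: the sum of squares attributed to the term, divided by its degrees of freedom, over the residual mean square. A minimal sketch using the rounded values as printed (variable names are illustrative):

```r
# Values copied from the Anova table for the Phosphorus + Nitrogen model
ss_nitrogen <- 647     # sum of squares for Nitrogen (1 df)
rss         <- 13572   # residual sum of squares
df_resid    <- 22      # residual degrees of freedom

# F statistic and its upper-tail p-value
f_stat <- (ss_nitrogen / 1) / (rss / df_resid)
p_val  <- pf(f_stat, 1, df_resid, lower.tail = FALSE)
print(c(F = f_stat, p = p_val))
```

Small differences from the printed 1.0487 and 0.3169 come from the sums of squares being rounded in the table.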

11
Call:
lm(formula = Chlorophyll.a ~ Phosphorus * Nitrogen, data = chlor)

Residuals:
    Min      1Q  Median      3Q     Max
-22.193 -11.292  -3.648   4.538  47.546

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         -4.883608  15.889700  -0.307 0.761609
Phosphorus           0.161319   0.052467   3.075 0.005748 **
Nitrogen             0.241565   0.899195   0.269 0.790824
Phosphorus:Nitrogen  0.024162   0.005573   4.335 0.000291 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 18.47 on 21 degrees of freedom
Multiple R-Squared: 0.8807,  Adjusted R-squared: 0.8637
F-statistic: 51.69 on 3 and 21 DF,  p-value: 7.192e-10

Anova Table (Type II tests)

Response: Chlorophyll.a
                    Sum Sq Df F value    Pr(>F)
Phosphorus           32070  1  94.031 3.312e-09 ***
Nitrogen               647  1   1.897 0.1829167
Phosphorus:Nitrogen   6410  1  18.795 0.0002914 ***
Residuals             7162 21
12
Call:
lm(formula = Chlorophyll.a ~ Phosphorus + Phosphorus:Nitrogen,
    data = chlor)

Residuals:
    Min      1Q  Median      3Q     Max
-23.415 -11.417  -3.248   3.648  47.170

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         -0.896421   5.553804  -0.161 0.873246
Phosphorus           0.152876   0.041115   3.718 0.001196 **
Phosphorus:Nitrogen  0.024530   0.005287   4.640 0.000126 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 18.07 on 22 degrees of freedom
Multiple R-Squared: 0.8803,  Adjusted R-squared: 0.8694
F-statistic: 80.91 on 2 and 22 DF,  p-value: 7.219e-11

Anova Table (Type II tests)

Response: Chlorophyll.a
                    Sum Sq Df F value    Pr(>F)
Phosphorus           45828  1 140.287 5.107e-11 ***
Phosphorus:Nitrogen   7033  1  21.528 0.0001264 ***
Residuals             7187 22
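In this last model nitrogen enters only through the Phosphorus:Nitrogen term, so the implied effect of one more unit of N on C is the interaction coefficient times the lake's phosphorus level. A small sketch of that arithmetic follows; the phosphorus levels are hypothetical values chosen for illustration, and the function name effect_of_n is an assumption.

```r
# Interaction coefficient copied from the Phosphorus + Phosphorus:Nitrogen fit
b_pn <- 0.024530

# Change in predicted C per one-unit increase in N, at phosphorus level p
effect_of_n <- function(p) b_pn * p

# Hypothetical phosphorus levels (not taken from the data)
p_levels <- c(0, 50, 100, 200)
effects  <- effect_of_n(p_levels)
print(data.frame(Phosphorus = p_levels, dC_per_unit_N = effects))
```

Under this model the answer to the golf course question depends on how much phosphorus the local lake has: the more phosphorus, the larger the expected increase in C per unit of added N.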
13
R code, part 1
# Read in the data
chlor <- read.csv("Chlorophyll.csv")

# Perform OLS regression of C on N, and look at the results
c1.lm <- lm(Chlorophyll.a ~ Nitrogen, data = chlor)
summary(c1.lm)

# Plot the data. Note that when both variables are in the data frame,
# you can use the formula notation
plot(Chlorophyll.a ~ Nitrogen, data = chlor)

# Add the OLS regression line to the plot
abline(c1.lm)

# Look at scatterplots of all the variables (the first column is just the
# lake ID number, so we don't include it). This is in the car library.
library(car)
scatterplotMatrix(chlor[, 2:4])  # named scatterplot.matrix() in older versions of car

# OLS regression of C on P
c2.lm <- lm(Chlorophyll.a ~ Phosphorus, data = chlor)
summary(c2.lm)

# OLS regression of C on N and P
c3.lm <- lm(Chlorophyll.a ~ Phosphorus + Nitrogen, data = chlor)

14
R code, part 2
# Plot the actual values vs. the fitted values
plot(fitted(c3.lm), chlor$Chlorophyll.a)

# Show the line of equality
abline(0, 1)

# Look at the significance of individual terms in the previous regression.
# Notice the capital A. This is in the car library.
Anova(c3.lm)

# OLS regression of C on N, P, and their interaction
c4.lm <- lm(Chlorophyll.a ~ Phosphorus * Nitrogen, data = chlor)
summary(c4.lm)
Anova(c4.lm)

# Drop the N term from the previous regression
c5.lm <- lm(Chlorophyll.a ~ Phosphorus + Phosphorus:Nitrogen, data = chlor)
summary(c5.lm)
Anova(c5.lm)