Title: Spatial statistics and Generalized Least Squares Regression
1Spatial statistics and Generalized Least Squares
Regression
2pH and NO3 in Norwegian lakes
3Call lm(formula pH.1981 NO3.1981, data
lake) Residuals Min 1Q Median
3Q Max -0.888624 -0.410622 -0.007402
0.386811 1.237405 Coefficients
Estimate Std. Error t value Pr(gtt)
(Intercept) 5.7130797 0.1218255 46.896 lt
2e-16 NO3.1981 -0.0032446 0.0009186
-3.532 0.00106 --- Signif. codes 0 ''
0.001 '' 0.01 '' 0.05 '.' 0.1 ' ' 1 Residual
standard error 0.5103 on 40 degrees of
freedom Multiple R-Squared 0.2377, Adjusted
R-squared 0.2187 F-statistic 12.47 on 1 and 40
DF, p-value 0.001056
4pH of Norwegian Lakes
5Questions about Norwegian lake data
- Is there spatial autocorrelation in pH values?
- How can we interpolate and smooth those values?
- Is there a relationship between pH and NO3,
taking into account spatial autocorrelations in
both variables?
6Semivariogram
- Semivariance is half the average squared
difference between pairs of lakes a certain
distance apart - Measures variance among sites as a function of
distance - Also called empirical variogram
7Theoretical variogram
8Theoretical variogram
Sill
Range
Nugget
9Theoretical variogram
Summary of the parameter estimation --------------
--------------------- Estimation method WLS
(weighted least squares) Parameters of the
spatial component correlation function
gaussian (estimated) variance parameter
sigmasq (partial sill) 0.3175
(estimated) cor. fct. parameter phi (range
parameter) 0.7817 Parameter of the error
component (estimated) nugget
0.0773 Minimised weighted sum of squares
4.2704 Call variofit(vario pH.vg,
ini.cov.pars c(0.4, 1), cov.model "gaussian")
10(No Transcript)
11Kriging
- Minimum mean-squared-error method of spatial
prediction - Interpolates and smoothes
- Named after South African mining engineer D.G.
Krige - Uses a theoretical variogram
- Have to fit (and choose!) the theoretical
variogram first - Produces a smooth surface that fits the data
12pH
13Are the residuals spatially correlated?
14Variogram of residuals
15(No Transcript)
16OLS residuals
17Generalized least squares fit by REML Model
pH.1981 NO3.1981 Data NULL AIC
BIC logLik 84.07367 90.82919
-38.03684 Correlation Structure Spherical
spatial correlation Formula Long Lat
Parameter estimate(s) range 0.1000251
Coefficients Value Std.Error
t-value p-value (Intercept) 5.713080 0.12182546
46.89561 0.0000 NO3.1981 -0.003245 0.00091864
-3.53193 0.0011 Correlation
(Intr) NO3.1981 -0.763 Standardized residuals
Min Q1 Med Q3
Max -1.74134029 -0.80465113 -0.01450431
0.75799186 2.42480698 Residual standard error
0.5103106 Degrees of freedom 42 total 40
residual
18Generalized least squares fit by REML Model
pH.1981 NO3.1981 Data NULL AIC
BIC logLik 82.99625 89.75176
-37.49812 Correlation Structure Gaussian
spatial correlation Formula Long Lat
Parameter estimate(s) range 0.2091937
Coefficients Value Std.Error
t-value p-value (Intercept) 5.723516 0.12950850
44.19413 0.0000 NO3.1981 -0.003095 0.00092111
-3.36036 0.0017 Correlation
(Intr) NO3.1981 -0.742 Standardized residuals
Min Q1 Med Q3
Max -1.75288797 -0.82890218 -0.07613712
0.71610156 2.24719899 Residual standard error
0.5214192 Degrees of freedom 42 total 40
residual
19R code, part 1
- Use R Commander to read in the data. Call the
dataset lake - Make the variables available directly
- attach(lake)
- Plot the pH in space
- plot(Long, Lat, cex(pH.1981-3), xlab"Longitude
(degrees E)", ylab"Latititude (degrees N)") - Semivariogram
- Set the limits for the distance to not exceed
half the range (sample size gets too small with
large distances) - uvec is the list of distance bins to evaluate
the semivariogram - max.dist lt- sqrt((max(Lat)-min(Lat))2
(max(Long)-min(Long))2) - uvec lt- seq(0,max.dist/2,length20)
- Load the required library and fix a tiny bug in
the package - library(geoR)
- print.summary.variomodel lt- function(...)print.su
mmary.variofit(...) - Calculate and plot the semivariogram
- Long and Lat are the locations of the points,
and pH.1981 is the data - pH.vg lt- variog(coordscbind(Long,Lat),
datapH.1981, uvecuvec)
20R code, part 2
- Plot a variety of theoretical variograms
- The par statement sets up the graphics window
to hold multiple graphs - par(mfrowc(2,2))
- Gaussian (the sum of sqares is from the
summary) - pH.vls lt- variofit(pH.vg, c(0.4,1),
cov.model"gaussian") - summary(pH.vls)
- plot(pH.vg,main"Gaussian SS4.27")
- lines(pH.vls)
- Exponential
- pH.vls lt- variofit(pH.vg, c(0.4,1),
cov.model"exponential") - summary(pH.vls)
- plot(pH.vg,main"Exponential SS4.57")
- lines(pH.vls)
- Spherical
- pH.vls lt- variofit(pH.vg, c(0.4,1),
cov.model"spherical") - summary(pH.vls)
21R code, part 3
- Krige the pH data
- Re-fit the spherical variogram, as it was best
- pH.vls lt- variofit(pH.vg, c(0.4,1),
cov.model"spherical") - Create a grid of points to predict the model
- loci lt- expand.grid(seq(min(Long),max(Long),length
51), seq(min(Lat),max(Lat),length51)) - Run the krige
- pH.k lt- krige.conv(coordscbind(Long,Lat),
datapH.1981, locationloci, krigekrige.control(o
bj.modelpH.vls)) - Plot the output with a colormap and a contour
plot - par(mfrowc(1,1))
- image(pH.k, xlab"Longitude (east)",
ylab"Latitude (north)") - contour(pH.k,addT)
- Add the locations of the lakes
- points(Long,Lat , cex(pH.1981-3), pch10,
col"blue") - Use R commander to do an OLS regression
(ph.1981NO3.1981) and call it NO3.lm
22R code, part 4
- Look at the various theoretical variograms to
see which fits best - resid.vls lt- variofit(resid.vg, c(0.4,1),
cov.model"gaussian") - summary(resid.vls)
- resid.vls lt- variofit(resid.vg, c(0.4,1),
cov.model"exponential") - summary(resid.vls)
- resid.vls lt- variofit(resid.vg, c(0.4,1),
cov.model"spherical") - summary(resid.vls)
- resid.vls lt- variofit(resid.vg, c(0.4,1),
cov.model"power") - summary(resid.vls)
- Spherical fits best refit that and add to plot
- resid.vls lt- variofit(resid.vg, c(0.4,1),
cov.model"gaussian") - lines(resid.vls)
- Do a krige of the residuals, and plot it
- resid.k lt- krige.conv(coordscbind(Long,Lat),
dataresid(NO3.lm), locationloci,
krigekrige.control(obj.modelresid.vls)) - image(resid.k, xlab"Longitude (east)",
ylab"Latitude (north)") - contour(resid.k,addT)
- points(Long,Lat , cex(fitted(NO3.lm)-3), pch10,
col"blue")