Title: Multiple and complex regression
1. Multiple and complex regression
2. Extensions of simple linear regression
- Multiple regression models: predictor variables are continuous
- Analysis of variance: predictor variables are categorical (grouping variables)
- But general linear models can include both continuous and categorical predictors
4. Relative abundance of C3 and C4 plants
- Paruelo & Lauenroth (1996)
- Geographic distribution and the effects of climate variables on the relative abundance of a number of plant functional types (PFTs): shrubs, forbs, succulents, C3 grasses and C4 grasses.
6. Data
73 sites across temperate central North America
Response variable
- Relative abundance of PFTs (based on cover, biomass, and primary production) for each site
Predictor variables
- Longitude
- Latitude
- Mean annual temperature
- Mean annual precipitation
- Winter precipitation (DJFMAP)
- Summer precipitation (JJAMAP)
- Biome (grassland, shrubland)
7. Relative abundance was transformed, ln(dat + 1), because it was positively skewed
8. Collinearity
- Causes computational problems because it makes the determinant of the matrix of X-variables close to zero, and matrix inversion basically involves dividing by the determinant (very sensitive to small differences in the numbers)
- Standard errors of the estimated regression slopes are inflated
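Both effects can be seen in a small simulation. The sketch below uses synthetic data (not the Paruelo & Lauenroth data) and NumPy; the function name `ols_standard_errors` and all variable names are illustrative assumptions. It compares a pair of unrelated predictors with a nearly collinear pair, reporting the determinant of the predictor correlation matrix and the standard error of one slope.

```python
import numpy as np

def ols_standard_errors(predictors, y):
    """Return OLS standard errors (intercept first) for y regressed on predictors."""
    n = len(y)
    X = np.column_stack([np.ones(n), predictors])
    xtx_inv = np.linalg.inv(X.T @ X)
    b = xtx_inv @ X.T @ y
    residuals = y - X @ b
    sigma2 = residuals @ residuals / (n - X.shape[1])  # residual mean square
    return np.sqrt(sigma2 * np.diag(xtx_inv))

# Synthetic data: 73 observations, matching only the sample size of the study.
rng = np.random.default_rng(0)
n = 73
x1 = rng.normal(size=n)
other = rng.normal(size=n)
error = rng.normal(size=n)

x2_independent = other             # unrelated to x1
x2_collinear = x1 + 0.05 * other   # almost a copy of x1

results = {}
for label, x2 in [("independent", x2_independent), ("collinear", x2_collinear)]:
    y = 1.0 + 2.0 * x1 + 2.0 * x2 + error
    det = np.linalg.det(np.corrcoef(x1, x2))  # equals 1 - r^2 for two predictors
    se = ols_standard_errors(np.column_stack([x1, x2]), y)
    results[label] = (det, se[1])
    print(f"{label:12s} det(R) = {det:.4f}  SE(slope of x1) = {se[1]:.3f}")
```

The determinant is taken on the correlation matrix rather than on X'X itself so that it does not depend on the units of the predictors; this is a choice made for the illustration.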
9. Detecting collinearity
- Check tolerance values
- Plot the variables
- Examine a matrix of correlation coefficients between predictor variables
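The tolerance and correlation checks can be sketched as follows, assuming the usual definition of tolerance as 1 - r2 from regressing each predictor on all the others (low values signal collinearity). The data are synthetic and the helper name `tolerance` is hypothetical.

```python
import numpy as np

def tolerance(X):
    """Tolerance of each column of X: 1 - r^2 from regressing it on the other columns."""
    n, k = X.shape
    tol = np.empty(k)
    for j in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        residuals = X[:, j] - others @ coef
        centered = X[:, j] - X[:, j].mean()
        tol[j] = (residuals @ residuals) / (centered @ centered)  # SS_res / SS_tot
    return tol

# Synthetic predictors: the first two are nearly collinear, the third is unrelated.
rng = np.random.default_rng(1)
n = 73
a = rng.normal(size=n)
b = a + 0.1 * rng.normal(size=n)
c = rng.normal(size=n)
X = np.column_stack([a, b, c])

tol = tolerance(X)
corr = np.corrcoef(X, rowvar=False)  # correlation matrix between predictors

print("tolerance:", np.round(tol, 3))
print("correlations:\n", np.round(corr, 3))
```

In this setup the two related predictors should show tolerances near zero and a pairwise correlation near one, while the unrelated predictor keeps a tolerance near one.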
10. Dealing with collinearity
- Omit predictor variables if they are highly
correlated with other predictor variables that
remain in the model
11. Correlations
13. $\ln(C3) = \beta_0 + \beta_1(\text{lat}) + \beta_2(\text{long}) + \beta_3(\text{lat} \times \text{long})$
After centering both lat and long
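A brief sketch of why centering helps with an interaction term: on the raw scale the product lat x long is strongly correlated with lat itself, whereas the product of centered variables usually is not, and the fitted values are unchanged. The coordinate ranges, coefficients and response below are made-up values for illustration only, not the study data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 73
lat = rng.uniform(30, 50, size=n)    # hypothetical latitude range
lon = rng.uniform(95, 120, size=n)   # hypothetical longitude range
ln_c3 = -2.0 + 0.05 * lat - 0.01 * lon + rng.normal(scale=0.3, size=n)  # synthetic response

def fit_interaction(lat_values, lon_values, y):
    """Fit y = b0 + b1*lat + b2*long + b3*(lat x long) by least squares."""
    X = np.column_stack([np.ones(len(y)), lat_values, lon_values,
                         lat_values * lon_values])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, X @ b

lat_c = lat - lat.mean()
lon_c = lon - lon.mean()

# Correlation between a main effect and the interaction term, before and after centering.
r_raw = np.corrcoef(lat, lat * lon)[0, 1]
r_centered = np.corrcoef(lat_c, lat_c * lon_c)[0, 1]

b_raw, fitted_raw = fit_interaction(lat, lon, ln_c3)
b_centered, fitted_centered = fit_interaction(lat_c, lon_c, ln_c3)

print(f"corr(lat, lat x long): raw = {r_raw:.3f}, centered = {r_centered:.3f}")
print("same fitted values:", np.allclose(fitted_raw, fitted_centered, atol=1e-6))
```

Centering is a reparameterization: the interaction coefficient and the fitted values stay the same, while the intercept and main-effect slopes change meaning (they now describe effects at the mean of the other predictor).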
14. Analysis of variance
Source of variation    SS                           df           MS
Regression             $\sum(\hat{y}-\bar{y})^2$    $p$          $\sum(\hat{y}-\bar{y})^2 / p$
Residual               $\sum(y-\hat{y})^2$          $n-p-1$      $\sum(y-\hat{y})^2 / (n-p-1)$
Total                  $\sum(y-\bar{y})^2$          $n-1$
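The table can be computed directly from a fitted model. The following is a minimal sketch on synthetic data, with `regression_anova` as an illustrative helper name; it returns (SS, df, MS) for each source of variation, with n observations and p predictors.

```python
import numpy as np

def regression_anova(X, y):
    """Return (SS, df, MS) for the regression, residual and total rows."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])   # add the intercept column
    b, *_ = np.linalg.lstsq(design, y, rcond=None)
    y_hat = design @ b
    y_bar = y.mean()
    ss_reg = float(np.sum((y_hat - y_bar) ** 2))
    ss_res = float(np.sum((y - y_hat) ** 2))
    ss_tot = float(np.sum((y - y_bar) ** 2))
    return {
        "regression": (ss_reg, p, ss_reg / p),
        "residual": (ss_res, n - p - 1, ss_res / (n - p - 1)),
        "total": (ss_tot, n - 1, None),
    }

# Synthetic example with n = 73 observations and p = 2 predictors.
rng = np.random.default_rng(3)
X = rng.normal(size=(73, 2))
y = 0.5 + 1.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.8, size=73)

table = regression_anova(X, y)
f_ratio = table["regression"][2] / table["residual"][2]  # MS regression / MS residual
for source, (ss, df, ms) in table.items():
    print(source, round(ss, 3), df, None if ms is None else round(ms, 3))
print("F =", round(f_ratio, 2))
```

The regression and residual sums of squares add up to the total, and their degrees of freedom add up to n - 1.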
15. Matrix algebra approach to OLS estimation of multiple regression models
- $Y = X\beta + \varepsilon$
- $X'Xb = X'Y$
- $b = (X'X)^{-1}(X'Y)$
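The normal equations above translate almost line for line into NumPy. This is a sketch on synthetic data; `ols_normal_equations` is an illustrative name, and the system X'Xb = X'Y is solved with `np.linalg.solve` instead of forming the inverse explicitly, which gives the same b with better numerical behaviour.

```python
import numpy as np

def ols_normal_equations(X, y):
    """Solve X'X b = X'y for b; X must already contain the intercept column."""
    xtx = X.T @ X
    xty = X.T @ y
    return np.linalg.solve(xtx, xty)

rng = np.random.default_rng(4)
n = 73
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 predictors
beta_true = np.array([1.0, 2.0, -3.0])                      # made-up coefficients

# Without error the estimate recovers the coefficients exactly (up to rounding).
b_exact = ols_normal_equations(X, X @ beta_true)

# With error it agrees with NumPy's general least-squares routine.
y = X @ beta_true + rng.normal(scale=0.5, size=n)
b = ols_normal_equations(X, y)
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print("b (normal equations):", np.round(b, 4))
print("b (lstsq):           ", np.round(b_lstsq, 4))
```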
16. Criteria for best fitting in multiple regression with p predictors
Criterion                             Formula
r2
Adjusted r2
Akaike Information Criterion (AIC)
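The three criteria can be written as short functions. The definitions below are the standard textbook forms and are an assumption here, not a transcription of the slide; the AIC is the residual-sum-of-squares form, n ln(RSS/n) + 2(p + 1), which is what R's `extractAIC` (used by `step`) computes for linear models. It uses only the Python standard library.

```python
import math

def r_squared(ss_residual, ss_total):
    """Proportion of the total variation explained by the model."""
    return 1.0 - ss_residual / ss_total

def adjusted_r_squared(r2, n, p):
    """r2 penalized for the number of predictors p, given n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def aic(ss_residual, n, p):
    """AIC from the residual sum of squares; p + 1 parameters including the intercept."""
    return n * math.log(ss_residual / n) + 2 * (p + 1)

n_sites = 73

# Inputs taken from the model-selection table (slide 17) and the step output (slides 21 and 22).
adj_three_predictors = adjusted_r_squared(0.54, n_sites, 3)
aic_forward = aic(2.7759, n_sites, 4)    # SQRT_C3 ~ LAT + MAP + JJAMAP + DJFMAP
aic_backward = aic(2.8279, n_sites, 3)   # SQRT_C3 ~ LAT + JJAMAP + DJFMAP

print(round(adj_three_predictors, 2))
print(round(aic_forward, 2))
print(round(aic_backward, 2))
```

With n = 73, these definitions reproduce the adjusted r2 of 0.52 reported for the three-predictor model and the AIC values of -228.67 and -229.32 shown in the forward and backward selection output, which supports the assumed formulas.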
17. Hierarchical partitioning and model selection
No. of predictors   Model                      r2       Adj r2   P        AIC (R)
1                   Long                       0.0006   -0.013   0.84     30.15
1                   Lat                        0.47     0.46     <0.001   -16.16
2                   Long + Lat                 0.48     0.46     <0.001   -15.25
3                   Long + Lat + Long x Lat    0.54     0.52     <0.001   -22.55
18. [Figure: C3 plotted against Longitude and Latitude. Model: Lat, Long; R2 = 0.48]
20. [Figure: panels for 45° Lat and 35° Lat. Model: Lat, Long]
21. The final forward model selection is

Step:  AIC=-228.67
SQRT_C3 ~ LAT + MAP + JJAMAP + DJFMAP

         Df  Sum of Sq     RSS      AIC
<none>                  2.7759  -228.67
+ LONG    1  0.0209705  2.7549  -227.23
+ MAT     1  0.0001829  2.7757  -226.68

Call:
lm(formula = SQRT_C3 ~ LAT + MAP + JJAMAP + DJFMAP)

Coefficients:
(Intercept)         LAT         MAP      JJAMAP      DJFMAP
 -0.7892663   0.0391180   0.0001538  -0.8573419  -0.7503936
22. The final backward selection model is

Step:  AIC=-229.32
SQRT_C3 ~ LAT + JJAMAP + DJFMAP

          Df  Sum of Sq     RSS      AIC
<none>                   2.8279  -229.32
- DJFMAP   1    0.26190  3.0898  -224.85
- JJAMAP   1    0.31489  3.1428  -223.61
- LAT      1    2.82772  5.6556  -180.72

Call:
lm(formula = SQRT_C3 ~ LAT + JJAMAP + DJFMAP)

Coefficients:
(Intercept)      LAT   JJAMAP   DJFMAP
   -0.53148  0.03748 -1.02823 -1.05164
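The logic of forward selection by AIC can be sketched in a few lines. This is a simplified illustration on synthetic data, not a reimplementation of R's `step`: it only adds terms, stops when no addition lowers the AIC, and assumes the AIC form n ln(RSS/n) + 2(p + 1). The predictor names x0 to x3 and the helper names are hypothetical.

```python
import math
import numpy as np

def rss(columns, y):
    """Residual sum of squares for an intercept plus the given predictor columns."""
    X = np.column_stack([np.ones(len(y))] + columns)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ b
    return float(residuals @ residuals)

def aic(rss_value, n, p):
    return n * math.log(rss_value / n) + 2 * (p + 1)

def forward_select(candidates, y):
    """Add the predictor that lowers AIC most; stop when no addition lowers it."""
    n = len(y)
    selected = []
    best_aic = aic(rss([], y), n, 0)   # intercept-only model
    while True:
        trials = []
        for name in candidates:
            if name in selected:
                continue
            columns = [candidates[k] for k in selected + [name]]
            trials.append((aic(rss(columns, y), n, len(columns)), name))
        if not trials:
            break
        trial_aic, name = min(trials)
        if trial_aic >= best_aic:
            break
        selected.append(name)
        best_aic = trial_aic
    return selected, best_aic

# Synthetic data: only x0 and x2 affect the response.
rng = np.random.default_rng(5)
n = 73
candidates = {f"x{i}": rng.normal(size=n) for i in range(4)}
y = 1.0 + 2.0 * candidates["x0"] - 1.5 * candidates["x2"] + rng.normal(scale=0.5, size=n)

null_aic = aic(rss([], y), n, 0)
selected, final_aic = forward_select(candidates, y)
print("selected:", selected, "AIC:", round(final_aic, 2))
```

Backward selection follows the same pattern in reverse: start from the full model and drop the term whose removal lowers the AIC most. As the two slides show, the two directions need not end at the same model.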