Title: N. Scott Urquhart
1 Using the Maryland Biological Stream Survey Data to Test Spatial Statistical Models
- N. Scott Urquhart
- Joint work with
- Erin P. Peterson, Andrew A. Merton, David M. Theobald, and Jennifer A. Hoeting
- All of Colorado State University, Fort Collins, CO 80523-1877
2 FUNDING ACKNOWLEDGEMENT
The work reported here today was developed under
the STAR Research Assistance Agreement CR-829095
awarded by the U.S. Environmental Protection
Agency (EPA) to Colorado State University. This
presentation has not been formally reviewed by
EPA. The views expressed here are solely those
of the presenter and STARMAP, the Program he
represents. EPA does not endorse any products or
commercial services mentioned in this
presentation.
3
- Expected Results
- A geostatistical model
- Predict a specific reach-scale condition at points that were not sampled
- Provide a better understanding of the relationship between the landscape and reach-scale conditions
- Give insight into potential sources of water quality degradation
- Develop landscape indicators
- Crucial for the rapid and cost-efficient monitoring of large areas
- Better understanding of spatial autocorrelation in stream networks
- What is the distance within which it occurs?
- How does that differ between chemical variables?
- Products
- Map of the study area
- Shows the likelihood of water quality impairment for each stream segment
- Based on water quality standards or relative condition (low, medium, high)
- Future sampling efforts can be concentrated in areas with a higher probability of impairment
- Methodology
- Illustrates how States and Tribes can complete spatial analysis using GIS data and field data
- GIS tools will be available
4 OUR PATH TODAY
- What are Spatial Statistical Models?
- Measuring Distance in Space
- The Maryland Biological Stream Survey
- Outstanding data set to compare models
- A Few Results
- Work in Progress
5 GATHERING SOME INSIGHTS
- Raise your hand if you
- Had a statistics course, even in the distant past
- Remember doing a t-test
- Did a simple linear regression (fitted a line)
- Did a multiple regression
- Examined model failures
- Did analyses accommodating correlated errors
- Have used spatial statistics, e.g., kriging
6 STATISTICS AND PREDICTION
- OBJECTIVE: Measure relevant responses,
- Like dissolved organic carbon (DOC), and
- Related variables at suitable sites, then
- Develop a formula to predict DOC at
- Unvisited sites
- Why?
- Clean Water Act (CWA) Section 303(d)
- Requires states to identify impacted waters and plan to eliminate the impact
- What state has the resources to evaluate every water? Predict, instead.
7 PREDICTIVE VARIABLES
- Predict DOC from measures such as
- Area above the stream evaluation point
- Barren
- High Intensity Urban
- Woody Wetland ()
- Conifer or Evergreen Forest Type ()
- Mixed Forest Type ()
- Low Intensity Urban ()
- To accommodate year differences
- 1996 1997 ()
8 GIS TOOLS
- These variables require
- Efficient delineation of the watershed above any point
- STARMAP has developed such software
- It is available
- Documented in a poster
9 PREDICTIVE MODELS
- Classical regression model would be
- BUT: "Everything is related to everything else, but near things are more related than distant things" (Tobler, 1970).
- Thus the assumption of uncorrelated errors in the model above is indefensible in many cases
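A common way to write the classical regression model referred to on this slide is sketched below. The symbols are assumptions for illustration and are not taken from the original slide: y is the vector of responses (such as DOC), X the matrix of covariates, \beta the regression coefficients, \varepsilon the errors, and \sigma^2 the error variance.

```latex
y = X\beta + \varepsilon, \qquad
\mathrm{E}(\varepsilon) = 0, \qquad
\mathrm{Var}(\varepsilon) = \sigma^{2} I
```

The identity matrix I is where the uncorrelated-errors assumption enters. A spatial model would replace \sigma^{2} I with a covariance matrix whose entries depend on the distance between sites.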
10 SO WHAT IS SPATIAL STATISTICS?
- Spatial statistics is a set of techniques which
- Allow correlated data
- Index the amount of correlation by the distance between the points
- Incorporate this correlation into predictions
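One way to picture "indexing correlation by distance" is a covariance function that decays as sites get farther apart. The sketch below uses an exponential form with nugget, partial sill, and range parameters. The exponential form, the function names, and the parameter values are all assumptions chosen for illustration, not necessarily the function used in this work.

```python
import math

def exponential_covariance(h, nugget, partial_sill, range_):
    """Covariance between two sites separated by distance h (same units as range_)."""
    if h < 0:
        raise ValueError("distance must be non-negative")
    if h == 0:
        # At zero separation the covariance is the total variance (the sill).
        return nugget + partial_sill
    # The covariance decays smoothly toward zero as distance grows.
    return partial_sill * math.exp(-h / range_)

def correlation(h, nugget, partial_sill, range_):
    """Correlation implied by the covariance model at distance h."""
    sill = nugget + partial_sill
    return exponential_covariance(h, nugget, partial_sill, range_) / sill

if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the decay.
    for h_km in (0, 5, 20, 60):
        rho = correlation(h_km, nugget=0.2, partial_sill=0.8, range_=20.0)
        print(f"{h_km:>3} km apart -> correlation {rho:.3f}")
```

The nugget produces a drop in correlation between "the same site" and "a site an arbitrarily short distance away", which is one common way to represent measurement error or very fine-scale variation.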
11 SO WHAT IS SPATIAL STATISTICS II?
12 WHAT ARE SPATIAL STATISTICAL MODELS?
13 MEASURING DISTANCE IN SPACE
14 The Maryland Biological Stream Survey
- Outstanding data set to compare models
15 A FEW RESULTS
16 WORK IN PROGRESS
18
- The Clean Water Act (CWA) of 1972 requires
- States, tribes, and territories to identify water quality (WQ) impaired stream segments
- Create a priority ranking of those segments
- Calculate the Total Maximum Daily Load (TMDL) for each impaired segment based upon chemical and physical WQ standards
- A biennial inventory characterizing regional WQ
- The Problem
- It is impossible to physically sample every stream within a large area
- Too many stream segments
- Limited personnel
- Cost associated with sampling
- Probability-based inferences used to generate regional estimates of WQ
- In miles by stream order
- Does not indicate where WQ impaired segments are located
- A rapid and cost-efficient method needed to locate potentially impaired stream segments throughout large areas
- Our Approach
- Develop a geostatistical model based on coarse-scale geographical information system (GIS) data
- Make predictions for every stream segment throughout a large area
19
- Dissolved Organic Carbon (DOC) Example
- Fit a geostatistical model to DOC data and coarse-scale watershed characteristics
- Maryland Biological Stream Survey data, 1996
- 7 interbasins, 343 DOC survey sites
- GIS data
20
- Methods
- Pre-process GIS data
- Snap survey sites to streams
- Calculate watershed attributes using the Functional Linkage of Watersheds and Streams (FLoWS) tools (Theobald et al., 2005; Peterson et al., in review)
- Calculate distance matrices for model selection
- R statistical software
- x,y coordinates for observed survey sites
- Test all possible linear models using the 10 covariates
- 1024 models (2^10 = 1024)
- Distance measure: straight-line distance (aka Euclidean)
- Autocorrelation function: Mariah
- Estimate autocorrelation parameters: nugget, sill, and range
- Profile log-likelihood function
- Model Selection
- Spatial corrected Akaike Information Criterion (AICC)
- (Hoeting et al., in press)
- Mean square prediction error (MSPE)
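Two of the steps above lend themselves to a short sketch: enumerating every subset of 10 covariates, and building a straight-line distance matrix from x,y coordinates. The covariate names `x1` to `x10` and the site coordinates are placeholders, since the full covariate list is not given here. The original analysis was done in R; Python is used below only for illustration.

```python
import math
from itertools import combinations

# Placeholder names: the full list of 10 covariates is not given in the slides.
COVARIATES = [f"x{i}" for i in range(1, 11)]

def all_subsets(names):
    """Every subset of the covariates, from the intercept-only model to the full model."""
    subsets = []
    for size in range(len(names) + 1):
        subsets.extend(combinations(names, size))
    return subsets

def euclidean_distance_matrix(coords):
    """Symmetric matrix of straight-line distances between (x, y) site coordinates."""
    return [[math.hypot(xa - xb, ya - yb) for (xb, yb) in coords]
            for (xa, ya) in coords]

if __name__ == "__main__":
    models = all_subsets(COVARIATES)
    print(len(models))  # number of candidate models, 2**10

    sites = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]  # made-up coordinates
    for row in euclidean_distance_matrix(sites):
        print(row)
```

Each subset would then be fitted with the chosen autocorrelation function and scored, for example by AICC or MSPE, which this sketch does not attempt.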
21
- Model Results
- Range of spatial autocorrelation: 21.09 kilometers
- Significant watershed attributes: WATER, EMERGWET, WOODYWET, FELPERC, and MIN TEMP
- Model fit
- Leave-one-out cross-validation method and universal kriging
- Overall MSPE = 0.93, R^2 = 0.72
- One strongly influential site
- R^2 without the influential site = 0.66
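The leave-one-out idea behind the MSPE figure can be sketched as follows: each site is predicted from all the other sites, and the squared prediction errors are averaged. In this sketch the predictor is simply the mean of the remaining sites, a stand-in that is much cruder than universal kriging; the function names and DOC values are made up for illustration.

```python
def loocv_mspe(y, predict_without):
    """Mean square prediction error from leave-one-out cross validation.

    predict_without(i) must return a prediction for site i that was
    made without using the observation at site i.
    """
    errors = [y[i] - predict_without(i) for i in range(len(y))]
    return sum(e * e for e in errors) / len(errors)

def mean_of_others(y):
    """Stand-in predictor: the average of all other sites (NOT kriging)."""
    total = sum(y)
    n = len(y)
    return lambda i: (total - y[i]) / (n - 1)

if __name__ == "__main__":
    doc = [2.1, 3.4, 2.8, 9.7, 3.0]  # made-up DOC values in mg/l
    print(loocv_mspe(doc, mean_of_others(doc)))

    # Dropping one unusual site shows how a single value can move the summary.
    trimmed = [v for v in doc if v != 9.7]
    print(loocv_mspe(trimmed, mean_of_others(trimmed)))
```

Swapping `mean_of_others` for a kriging predictor refitted without site i would give the kind of cross-validation summary reported on this slide.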
22
- East-West trend in model fit
- Conservative model fit tends to underestimate DOC
- 35 MSPE values > 1.5
- These sites have similar covariate values to nearby sites, but considerably different DOC values than nearby sites
23
- Model Predictions
- Create prediction sites
- 1st, 2nd, and 3rd order non-tidal stream segments
- 3083 prediction sites: downstream node of each GIS stream segment
- Downstream node ensures that the entire segment is located in the same watershed
- More than one prediction location at stream confluences
- Covariates for prediction sites represent the conditions upstream from the segment, not the stream confluence
- Calculate distance matrices for model predictions
- Include observed survey sites and prediction sites
- Generate predictions and prediction variances
- Assign values back to stream segments in GIS
- Universal kriging algorithm
- Prediction statistics
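A minimal sketch of a universal kriging prediction and its variance at one new site is given below. It makes several assumptions that are not from the slides: an exponential covariance with no nugget (not the Mariah function), known covariance parameters, a single covariate plus intercept, and made-up coordinates and DOC values. The small `solve` helper is included only to keep the example free of external libraries.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]  # augmented copy, inputs untouched
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def exp_cov(h, sill, range_km):
    """Assumed exponential covariance with no nugget."""
    return sill * math.exp(-h / range_km)

def universal_kriging(coords, X, y, coord0, x0, sill=1.0, range_km=20.0):
    """Return (prediction, prediction variance) at coord0 with covariates x0."""
    n, p = len(y), len(x0)
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    sigma = [[exp_cov(dist(coords[i], coords[j]), sill, range_km)
              for j in range(n)] for i in range(n)]
    c0 = [exp_cov(dist(coords[i], coord0), sill, range_km) for i in range(n)]
    si_y = solve(sigma, y)    # Sigma^-1 y
    si_c0 = solve(sigma, c0)  # Sigma^-1 c0
    si_X = [solve(sigma, [X[i][j] for i in range(n)]) for j in range(p)]
    # Generalized least squares estimate of the regression coefficients.
    A = [[sum(X[i][j] * si_X[k][i] for i in range(n)) for k in range(p)]
         for j in range(p)]
    b = [sum(X[i][j] * si_y[i] for i in range(n)) for j in range(p)]
    beta = solve(A, b)
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    # Trend at the new site plus a distance-weighted correction from residuals.
    pred = (sum(x0[j] * beta[j] for j in range(p))
            + sum(si_c0[i] * resid[i] for i in range(n)))
    d = [x0[j] - sum(X[i][j] * si_c0[i] for i in range(n)) for j in range(p)]
    var = (sill - sum(c0[i] * si_c0[i] for i in range(n))
           + sum(dj * aj for dj, aj in zip(d, solve(A, d))))
    return pred, var

if __name__ == "__main__":
    coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (25.0, 25.0)]  # km, made up
    X = [[1.0, 2.0], [1.0, 5.0], [1.0, 3.0], [1.0, 8.0]]  # intercept + one covariate
    y = [3.1, 4.0, 3.4, 6.2]  # made-up DOC values in mg/l
    print(universal_kriging(coords, X, y, coord0=(5.0, 5.0), x0=[1.0, 4.0]))
```

With no nugget, predicting at an already observed site returns the observed value with zero variance, which is a convenient check on the algebra. In the workflow on this slide the same calculation would be repeated for each prediction site and the results joined back to the stream segments.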
24
- 18 prediction values > 15.9 mg/l
- Also possessed the 18 largest prediction variances
- Located in watersheds with large WATER, EMERGWET, or WOODYWET values
- Large covariate values are not represented in the observed covariate data
- Represent 5973.03 kilometers of stream
25
- Products
- Geostatistical model used to predict segment-scale WQ conditions at unobserved locations
- Map of the study area that shows the likelihood of WQ impairment for each segment
- Can be tied to threshold values or WQ standards
- Technical and Regulatory Services Administration within the Maryland Department of the Environment
- Modifying the USGS NHD to include
- watershed impairments and stream-use designations by NHD segment
- Frank Siano, personal communication
- A methodology that illustrates how agencies can accomplish spatial analysis using GIS data, MBSS data, and geostatistics
- The Advantages
- Additional sampling is not necessary
- Complements existing methodologies
- Derive a regional estimate of stream condition in two ways
- Probability-based inferences about stream miles by stream order
- Sum prediction values in miles by stream order
- Identify potentially WQ impaired stream segments
- Methodology can be used for regulated constituents as well
- Nitrate, acid neutralizing capacity, pH, and conductivity can be accurately predicted using geostatistical models (Peterson et al., in review2)
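The second route to a regional estimate, summing segment predictions in miles by stream order, might look like the sketch below. The field names (`order`, `length_miles`, `predicted`), the threshold, and the segment values are hypothetical and stand in for whatever the GIS attribute table actually holds.

```python
from collections import defaultdict

def length_by_order(segments, threshold):
    """Total length of segments whose predicted value exceeds threshold, by stream order."""
    totals = defaultdict(float)
    for seg in segments:
        if seg["predicted"] > threshold:
            totals[seg["order"]] += seg["length_miles"]
    return dict(totals)

if __name__ == "__main__":
    # Made-up segments: stream order, length in miles, and predicted DOC (mg/l).
    segments = [
        {"order": 1, "length_miles": 2.0, "predicted": 4.2},
        {"order": 1, "length_miles": 1.5, "predicted": 9.1},
        {"order": 2, "length_miles": 3.0, "predicted": 8.4},
        {"order": 3, "length_miles": 5.0, "predicted": 2.0},
    ]
    print(length_by_order(segments, threshold=8.0))
```

Because the totals come from individual segments, the same table also identifies which segments exceed the threshold, which the probability-based estimate alone does not provide.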
26 References
Hoeting J.A., Davis R.A., Merton A.A., Thompson S.E. (in press) Model Selection for Geostatistical Models. Ecological Applications. http://www.stat.colostate.edu/~jah/papers/index.html
Peterson E.E., Theobald D.M., Ver Hoef J.M. (in review1) Support for geostatistical modeling on stream networks: Developing valid covariance matrices based on hydrologic distance and stream flow. Freshwater Biology.
Peterson E.E., Merton A.A., Theobald D.M., Urquhart N.S. (in review2) Patterns of Spatial Autocorrelation in Stream Water Chemistry. Environmental Monitoring.
Theobald D.M., Norman J., Peterson E.E., Ferraz S. (2005) Functional Linkage of Watersheds and Streams (FLoWS): Network-based ArcGIS tools to analyze freshwater ecosystems. Proceedings of the ESRI User Conference 2005. July 26, 2005, San Diego, CA, USA.
Acknowledgements: The work reported here was
developed under STAR Research Assistance
Agreement CR-829095 awarded by the U.S.
Environmental Protection Agency to the Space Time
Aquatic Resource Modeling and Analysis Program
(STARMAP) at Colorado State University. This
poster has not been formally reviewed by the EPA.
The views expressed here are solely those of the
authors. The EPA does not endorse any products or
commercial services presented in this poster.