Erin Peterson - PowerPoint PPT Presentation


Title: Erin Peterson


1
Predicting Water Quality Impaired Stream Segments
using Landscape-scale Data and a Regional
Geostatistical Model
  • Erin Peterson
  • Geosciences Department
  • Colorado State University, Fort Collins, Colorado

2
Space-Time Aquatic Resources Modeling and
Analysis Program
The work reported here was developed under STAR
Research Assistance Agreement CR-829095 awarded
by the U.S. Environmental Protection Agency (EPA)
to Colorado State University. This presentation
has not been formally reviewed by EPA. EPA does
not endorse any products or commercial services
mentioned in this presentation.
3
Overview
  • Introduction
  • Background
  • Patterns of spatial autocorrelation in stream
    water chemistry
  • Predicting water quality impaired stream segments
    using landscape-scale data and a regional
    geostatistical model: A case study in Maryland

4
The Clean Water Act (CWA) 1972
  • Section 303(d)
  • Requires states and tribes to ID water quality
    impaired stream segments
  • Section 305(b)
  • Create a biennial water quality inventory
  • Characterizes regional water quality
  • Based on attainment of designated-use standards
    assigned to individual stream segments

5
Probability-based Random Survey Designs
  • Used to meet section 305(b) requirements
  • Derive a regional estimate of stream condition
  • Assign a weight based on stream order
  • Provides representative sample of streams by
    order
  • Statistical inference about population of
    streams, within stream order, over large area
  • Reported in stream miles based on inference of
    attainment
  • Disadvantages
  • Does not take watershed influence into account
  • Does not ID spatial location of impaired stream
    segments
  • Fails to meet requirements of CWA Section 303(d)

6
Purpose
Develop a geostatistical methodology based on
coarse-scale GIS data and field surveys that can
be used to predict water quality characteristics
about stream segments found throughout a large
geographic area (e.g., state)
7
(No Transcript)
8
Geostatistical Modeling
  • a.k.a. Kriging
  • Interpolation method
  • Allows spatial autocorrelation in error term
  • More accurate predictions
  • Fit an autocovariance function to data
  • Describes relationship between observations based
    on separation distance
  • 3 Autocovariance Parameters
  • Nugget: variation between sites as separation
    distance approaches zero
  • Sill: delineated where semivariance asymptotes
  • Range: distance within which spatial
    autocorrelation occurs
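
The three autocovariance parameters can be illustrated with a small sketch. It assumes an exponential semivariogram (the form fitted later in the talk); the function name and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def exponential_semivariogram(h, nugget, partial_sill, range_param):
    """Semivariance at separation distance h under an exponential model.

    sill = nugget + partial_sill is the value the curve approaches
    asymptotically; range_param controls how quickly it gets there.
    """
    if h == 0:
        return 0.0  # semivariance is zero at exactly zero separation
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))

# Hypothetical parameter values, not estimates from the MBSS data.
nugget, partial_sill, range_param = 0.2, 0.8, 10.0
for h in (0.001, 5.0, 10.0, 30.0, 100.0):
    gamma = exponential_semivariogram(h, nugget, partial_sill, range_param)
    print(f"h = {h:7.3f} km  semivariance = {gamma:.3f}")
```

Just above zero distance the curve starts near the nugget, and at large distances it flattens toward the sill (nugget plus partial sill).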

9
Distance Measures and Spatial Relationships
Distances and relationships are represented
differently depending on the distance measure.
Straight-line distance (SLD): geostatistical
models are typically based on SLD
10
Distance Measures and Spatial Relationships
Symmetric hydrologic distance (SHD): hydrologic
connectivity; fish movement
11
Distance Measures and Spatial Relationships
Asymmetric hydrologic distance (AHD): longitudinal
transport of material
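
A minimal sketch of how the three distance measures differ on a toy network. The network, coordinates, and function names are hypothetical; sites are placed at nodes for simplicity, and the asymmetric distance is assumed to be defined only when water flows from the first site to the second.

```python
import math

# Hypothetical toy network: site -> (next site downstream, stream km to it).
DOWNSTREAM = {
    "A": ("C", 4.0),
    "B": ("C", 3.0),
    "C": ("outlet", 5.0),
    "outlet": (None, 0.0),
}
# Hypothetical planar coordinates (km), used only for straight-line distance.
COORDS = {"A": (0.0, 6.0), "B": (4.0, 6.0), "C": (2.0, 3.0), "outlet": (2.0, 0.0)}

def path_to_outlet(site):
    """Map every site on the downstream path to its stream distance from `site`."""
    dist, path = 0.0, {}
    while site is not None:
        path[site] = dist
        next_site, length = DOWNSTREAM[site]
        dist += length
        site = next_site
    return path

def sld(a, b):
    """Straight-line distance between two sites."""
    (x1, y1), (x2, y2) = COORDS[a], COORDS[b]
    return math.hypot(x2 - x1, y2 - y1)

def shd(a, b):
    """Symmetric hydrologic distance: along the network, ignoring flow direction."""
    pa, pb = path_to_outlet(a), path_to_outlet(b)
    return min(pa[node] + pb[node] for node in pa if node in pb)

def ahd(upstream, downstream):
    """Asymmetric hydrologic distance: None unless flow connects the two sites."""
    return path_to_outlet(upstream).get(downstream)

print(sld("A", "B"), shd("A", "B"), ahd("A", "B"), ahd("A", "C"))
```

Sites A and B sit on different tributaries, so they have a straight-line and a symmetric hydrologic distance but no asymmetric one.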
12
Distance Measures and Spatial Relationships
  • Challenge
  • Spatial autocovariance models developed for SLD
    may not be valid for hydrologic distances
  • The resulting covariance matrix may not be positive definite
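
The challenge can be reproduced with a small numerical sketch. It assumes a Gaussian covariance function with a hypothetical range parameter and a hypothetical layout of four sites around one confluence. Positive definiteness is tested by attempting a Cholesky factorization. This is an illustration only, not the analysis from the talk.

```python
import math

def is_positive_definite(matrix):
    """Attempt a Cholesky factorization; it succeeds only for positive definite input."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - s
                if pivot <= 1e-12:
                    return False  # non-positive pivot: not positive definite
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return True

def gaussian_cov(h, range_param=2.0):
    """Gaussian covariance with unit variance (range_param is hypothetical)."""
    return math.exp(-(h / range_param) ** 2)

def cov_matrix(distances):
    return [[gaussian_cov(h) for h in row] for row in distances]

# Site 0 sits at a confluence; sites 1-3 sit 1 km away, one on each of the
# three stream branches that meet there (hypothetical layout).
stream_dist = [
    [0, 1, 1, 1],
    [1, 0, 2, 2],
    [1, 2, 0, 2],
    [1, 2, 2, 0],
]
# The same sites measured in a straight line, with branches 120 degrees apart.
s3 = math.sqrt(3)
straight_dist = [
    [0, 1, 1, 1],
    [1, 0, s3, s3],
    [1, s3, 0, s3],
    [1, s3, s3, 0],
]

print(is_positive_definite(cov_matrix(straight_dist)))
print(is_positive_definite(cov_matrix(stream_dist)))
```

Under these assumptions the first check prints True and the second prints False: the same covariance function that is valid for straight-line distance yields an invalid covariance matrix once stream distances are substituted.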

13
Asymmetric Autocovariance Models for Stream
Networks
  • Weighted asymmetric hydrologic distance (WAHD)
  • Developed by Jay Ver Hoef, National Marine Mammal
    Laboratory, Seattle
  • Moving average models
  • Incorporate flow volume, flow direction, and use
    hydrologic distance
  • Positive definite covariance matrices

Ver Hoef, J.M., Peterson, E.E., and Theobald,
D.M., Spatial Statistical Models that Use Flow
and Stream Distance, Environmental and Ecological
Statistics. In Press.
14
Patterns of Spatial Autocorrelation in Stream
Water Chemistry
15
Objectives
  • Evaluate 8 chemical response variables
  • pH measured in the lab (PHLAB)
  • Conductivity (COND) measured in the lab, µmho/cm
  • Dissolved oxygen (DO) mg/l
  • Dissolved organic carbon (DOC) mg/l
  • Nitrate-nitrogen (NO3) mg/l
  • Sulfate (SO4) mg/l
  • Acid neutralizing capacity (ANC) µeq/l
  • Temperature (TEMP) °C
  • Determine which distance measure is most
    appropriate
  • SLD
  • SHD
  • WAHD
  • More than one?

16
Dataset
  • Maryland Biological Stream Survey (MBSS) Data
  • Maryland Department of Natural Resources
  • 1995, 1996, 1997
  • Stratified probability-based random survey design
  • 881 sites in 17 interbasins

17
(No Transcript)
18
Spatial Distribution of MBSS Data
19
GIS Tools
Automated tools needed to extract data about
hydrologic relationships between survey sites did
not exist, so Visual Basic for Applications
(VBA) programs were written to:
  • Calculate watershed covariates for each stream
    segment
  • Functional Linkage of Watersheds and Streams
    (FLoWS)
  • Calculate separation distances between sites
  • SLD, SHD, Asymmetric hydrologic distance (AHD)
  • Calculate the spatial weights for the WAHD
  • Convert GIS data to a format compatible with
    statistics software
  • FLoWS tools will be available on the STARMAP
    website
  • http://nrel.colostate.edu/projects/starmap

20
Spatial Weights for WAHD
  • Proportional influence (PI): influence of each
    neighboring survey site on a downstream survey
    site
  • Weighted by catchment area: surrogate for flow
    volume
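
One plausible reading of the proportional influence calculation is sketched below. It assumes that each segment's share at a confluence is its catchment area divided by the total area of the segments meeting there, and that the influence between two flow-connected locations is the product of those shares along the flow path. The network, areas, and function names are hypothetical.

```python
# Hypothetical toy network: segment -> downstream segment (None at the outlet).
DOWNSTREAM = {"s1": "s3", "s2": "s3", "s3": "s5", "s4": "s5", "s5": None}
# Made-up catchment areas in square km, used as a surrogate for flow volume.
AREA = {"s1": 30.0, "s2": 10.0, "s3": 45.0, "s4": 15.0, "s5": 70.0}

def segment_pi(segment):
    """Share of catchment area this segment contributes at its downstream confluence."""
    target = DOWNSTREAM[segment]
    if target is None:
        return 1.0  # the outlet segment has no confluence below it
    incoming = [s for s, d in DOWNSTREAM.items() if d == target]
    return AREA[segment] / sum(AREA[s] for s in incoming)

def proportional_influence(upstream, downstream):
    """Product of segment shares along the flow path; 0.0 if not flow-connected."""
    pi, segment = 1.0, upstream
    while segment is not None and segment != downstream:
        pi *= segment_pi(segment)
        segment = DOWNSTREAM[segment]
    return pi if segment == downstream else 0.0

print(proportional_influence("s1", "s3"))
print(proportional_influence("s1", "s5"))
print(proportional_influence("s1", "s4"))
```

Segments on different tributaries (here s1 and s4) receive zero influence on one another, which is what makes the resulting weights matrix flow dependent.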

21
Spatial Weights for WAHD
[Figure: survey sites on a stream segment]
22
Spatial Weights for WAHD
[Figure: stream network diagram with elements labeled A through H]
23
Data for Geostatistical Modeling
  • Distance matrices
  • SLD, SHD, AHD
  • Spatial weights matrix
  • Contains flow dependent weights for WAHD
  • Watershed covariates
  • Lumped watershed covariates
  • Mean elevation, Urban
  • Observations
  • MBSS survey sites

24
Geostatistical Modeling Methods
  • Validation Set
  • Unique for each chemical response variable
  • 100 sites
  • Initial Covariate Selection
  • Reduce covariates to 5
  • Model Development
  • Restricted model space to all possible linear
    models
  • Model set: 32 models (2^5 models)
  • One model set each for the general linear model
    (GLM), SLD, SHD, and WAHD models
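
The size of the model set follows from enumerating every subset of the five retained covariates. A short sketch, with placeholder covariate names because the slide does not list the five that were kept:

```python
from itertools import combinations

# Placeholder names; the five retained covariates differ by response variable.
covariates = ["x1", "x2", "x3", "x4", "x5"]

# Every subset of the covariates defines one candidate linear model,
# including the empty subset (intercept-only model).
model_set = [
    subset
    for size in range(len(covariates) + 1)
    for subset in combinations(covariates, size)
]

print(len(model_set))  # 2**5 = 32
```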

25
Geostatistical Modeling Methods
  • Geostatistical model parameter estimation
  • Maximize the profile log-likelihood function
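
The expression shown on the slide is not part of this transcript. For reference, a textbook form of the Gaussian profile log-likelihood, in which the regression coefficients are profiled out, is given below. The symbols are assumptions: z is the vector of n observations, X the design matrix, and Σ(θ) the covariance matrix implied by the autocovariance parameters θ. The author's exact parameterization may differ.

```latex
\ell_p(\theta) = -\frac{n}{2}\log(2\pi)
  - \frac{1}{2}\log\lvert\Sigma(\theta)\rvert
  - \frac{1}{2}\,\bigl(z - X\hat{\beta}_\theta\bigr)^{\top}\Sigma(\theta)^{-1}\bigl(z - X\hat{\beta}_\theta\bigr),
\qquad
\hat{\beta}_\theta = \bigl(X^{\top}\Sigma(\theta)^{-1}X\bigr)^{-1}X^{\top}\Sigma(\theta)^{-1}z
```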

26
Geostatistical Modeling Methods
Fit exponential autocorrelation function
  • Model selection within model set
  • GLM: Akaike Information Corrected Criterion
    (AICC)
  • Geostatistical models: Spatial AICC (Hoeting et
    al., in press)

In the spatial AICC, n is the number of observations, p-1 is the
number of covariates, and k is the number of
autocorrelation parameters.
http://www.stat.colostate.edu/jah/papers/spavarsel.pdf
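
The formula itself is missing from this transcript. A form consistent with the definitions of n, p, and k given here, and with how the corrected AIC is commonly written for geostatistical models, is sketched below. The log-likelihood symbol ℓ and the hat notation are assumptions, and the exact expression should be checked against the cited paper.

```latex
\mathrm{AICC} = -2\,\ell(\hat{\theta}, \hat{\beta}) + \frac{2n\,(p + k + 1)}{n - p - k - 2}
```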
27
Geostatistical Modeling Methods
  • Model selection between model types
  • 100 predictions: universal kriging algorithm
  • Mean square prediction error (MSPE)
  • Cannot use AICC to compare models based on
    different distance measures
  • Model comparison: r² for observed vs. predicted
    values
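
The two comparison statistics can be sketched as follows. The r² here is assumed to be the squared Pearson correlation between observed and predicted values at the validation sites; the slide does not state which definition was used, and the data are made up.

```python
def mspe(observed, predicted):
    """Mean square prediction error over the validation sites."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return s_op ** 2 / (s_oo * s_pp)

# Made-up values standing in for a validation set.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
print(mspe(observed, predicted), r_squared(observed, predicted))
```

Both statistics depend only on predictions at held-out sites, which is why they can compare models built on different distance measures where AICC cannot.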

28
Results
  • Summary statistics for distance measures
  • Spatial neighborhood differs
  • Affects number of neighboring sites
  • Affects median, mean, and maximum separation
    distance

29
Results
Mean range values: SLD 28.2 km, SHD 88.03 km,
WAHD 57.8 km
  • Range of spatial autocorrelation differs
  • Shortest for SLD
  • TEMP shortest range values
  • DO largest range values

30
Results
  • Distance Measures
  • GLM always has less predictive ability
  • More than one distance measure usually performed
    well
  • SLD, SHD, and WAHD: PHLAB, DOC
  • SLD and SHD: ANC, DO, NO3
  • WAHD and SHD: COND, TEMP
  • SLD: SO4

31
Results
Predictive ability of models (r²)
  • Strong: ANC, COND, DOC, NO3, PHLAB
  • Weak: DO, TEMP, SO4
32
Discussion
Distance measure influences how spatial
relationships are represented in a stream network
  • Sites' relative influence on other sites
  • Dictates form and size of spatial neighborhood
  • Important because
  • Impacts accuracy of the geostatistical model
    predictions

33
(No Transcript)
34
Discussion
  • Probability-based random survey design negatively
    affected WAHD
  • Maximize spatial independence of sites
  • Does not represent spatial relationships in
    networks
  • Validation sites randomly selected

35
Discussion
WAHD models explained more variability as
neighboring sites increased
  • Not when neighbors had
  • Similar watershed conditions
  • Significantly different chemical response values

36
Discussion
  • GLM predictions improved as number of neighbors
    increased
  • Clusters of sites in space have similar watershed
    conditions
  • Statistical regression pulled towards the cluster
  • GLM contained hidden spatial information
  • Explained additional variability in data with more
    neighbors

37
Predictive Ability of Geostatistical Models (r²)
38
Conclusions
  • Spatial autocorrelation exists in stream
    chemistry data at a relatively coarse scale
  • Geostatistical models improve the accuracy of
    water chemistry predictions
  • Patterns of spatial autocorrelation differ
    between chemical response variables
  • Ecological processes acting at different spatial
    scales
  • SLD is the most suitable distance measure at
    regional scale at this time
  • Unsuitable survey designs
  • SHD GIS processing time is prohibitive

39
Conclusions
  • Results are scale specific
  • Spatial patterns change with survey scale
  • Other patterns may emerge at shorter separation
    distances
  • Further research is needed at finer scales
  • Watershed or small stream network
  • Need new survey designs for stream networks
  • Capture both coarse and fine scale variation
  • Ensure that hydrologic neighborhoods are
    represented

40
Predicting Water Quality Impaired Stream Segments
using Landscape-scale Data and a Regional
Geostatistical Model: A Case Study in Maryland
41
Objective
Demonstrate how a geostatistical methodology can
be used to meet the requirements of the Clean
Water Act
  • Predict regional water quality conditions
  • ID the spatial location of potentially impaired
    stream segments

42
(No Transcript)
43
Methods
Potential covariates
44
Methods
Potential covariates after initial model
selection (10)
45
Methods
  • Fit geostatistical models
  • Two distance measures: SLD and WAHD
  • Restricted model space to all possible linear
    models
  • 1024 models per set (2^10 models)
  • Parameter Estimation
  • Maximized the profile log-likelihood function

46
Methods
47
Results
  • SLD models performed better than WAHD
  • Exception: spherical model
  • Best models
  • SLD: exponential, Mariah, and rational quadratic
    models
  • r² for SLD model predictions
  • Almost identical
  • Further analysis restricted to SLD Mariah model

48
Results
  • Covariates for SLD Mariah model: WATER, EMERGWET,
    WOODYWET, FELPERC, MINTEMP
  • Positive relationship with DOC: WATER, EMERGWET,
    WOODYWET, MINTEMP
  • Negative relationship with DOC: FELPERC

49
Cross-validation intervals for Mariah model
regression coefficients
  • Cross-validation interval: 95% of regression
    coefficients produced by leave-one-out
    cross-validation procedure
  • Narrow intervals
  • Few extreme regression coefficient values
  • Not produced by common sites
  • Covariate values for the site are represented in
    observed data
  • Not clustered in space
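
The idea of a cross-validation interval can be sketched with a deliberately simplified stand-in. The code below swaps the geostatistical model for ordinary least squares with a single covariate, refits after leaving out each site in turn, and reports the central 95% of the resulting slopes. The data, the function names, and the way the interval is trimmed are all assumptions for illustration.

```python
def ols_slope(xs, ys):
    """Slope of a simple least-squares line (stand-in for the real model fit)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx

def loo_slopes(xs, ys):
    """Refit the model once per site, each time leaving that site out."""
    return [
        ols_slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i in range(len(xs))
    ]

def central_interval(values, coverage=0.95):
    """Trim an equal number of values from each tail so about `coverage` remain."""
    ordered = sorted(values)
    drop = int(len(ordered) * (1.0 - coverage) / 2.0)
    return ordered[drop], ordered[len(ordered) - 1 - drop]

# Made-up data: a linear trend with a small alternating disturbance.
xs = [float(i) for i in range(40)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]

low, high = central_interval(loo_slopes(xs, ys))
print(low, high)
```

A narrow interval indicates that no single site dominates the fitted coefficient; a site whose removal produces a slope outside the interval is a candidate influential site.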

50
r² Observed vs. Predicted Values
1 influential site: r² without site = 0.66
n = 312 sites: r² = 0.72
51
Model Fit
52
Discussion
  • SLD models more accurate than WAHD models
  • Landscape-scale covariates were not restricted to
    watershed boundaries
  • Geology type
  • Temperature
  • Wetlands and water

53
Discussion
  • Regression Coefficients
  • Narrow cross-validation intervals
  • Spatial location of the sites not as important as
    watershed characteristics
  • Extreme regression coefficient values
  • Not produced by common sites
  • Not clustered in space
  • Local-scale factor may have affected stream DOC
  • Point source of organic waste

54
Spatial Patterns in Model Fit
  • North and east of Chesapeake Bay: large squared
    prediction error (SPE) values
  • Naturally acidic blackwater streams with elevated
    DOC
  • Not well represented in observed dataset
  • 2 blackwater sites
  • Geostatistical model unable to account for
    natural variability
  • Large squared prediction errors
  • Large prediction variances

55
Spatial Patterns in Model Fit
  • West of Chesapeake Bay - low SPE values
  • Due to statistical and spatial distribution of
    observed data
  • Regression equation fit to the mean in the data
  • Most observed sites had low DOC values
  • Less variation in western and central Maryland
  • Neighboring sites tend to be similar
  • Separation distances shorter in the west
  • Shorter separation distances give stronger covariances

56
Model Performance
Unable to account for abrupt differences in DOC
values between neighboring sites with similar
watershed conditions
  • What caused abrupt differences?
  • Point sources of organic pollution
  • Not represented in the model
  • Non-point sources of pollution
  • Lumped watershed attributes are non-spatial
  • Differences due to spatial location of landuse
    are not represented
  • Challenging to represent ecological processes
    using coarse-scale lumped attributes
  • i.e. Flow path of water

57
Generate Model Predictions
  • Prediction sites
  • Study area
  • 1st, 2nd, and 3rd order non-tidal streams
  • 3083 segments, 5973 stream km
  • ID downstream node of each segment
  • Create prediction site
  • More than one site at each confluence
  • Generate predictions and prediction variances
  • SLD Mariah model
  • Universal kriging algorithm
  • Assigned predictions and prediction variances
    back to stream segments in GIS

58
(No Transcript)
59
Weak Model Fit
60
Strong Model Fit
61
Water Quality Attainment by Stream Kilometers
  • Threshold values for DOC
  • Set by Maryland Department of Natural Resources
  • High DOC values may indicate biological or
    ecological stress

62
Implications for Water Quality Monitoring
  • Tradeoff between cost-efficiency and model
    accuracy
  • Western Maryland
  • Can be described using a single geostatistical
    model
  • Eastern and northeastern Maryland
  • Accept poor model fit
  • Collect additional survey data for regional
    geostatistical model
  • Develop a separate geostatistical model for
    eastern Maryland

63
Implications for Water Quality Monitoring
  • Apply this methodology to other regulated
    constituents
  • Technical and Regulatory Services Administration
    within the MDE is modifying the NHD
  • Include water quality standards and stream-use
    designations by NHD segment
  • Use water quality standards instead of thresholds
  • Categorize predictions into potentially impaired
    or unimpaired status
  • Report on attainment in stream miles/kilometers
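
The last two steps, categorizing predictions and reporting attainment by stream length, can be sketched as below. The segment records and the threshold are placeholders; the actual DOC thresholds set by the Maryland Department of Natural Resources are not given in this transcript.

```python
# Hypothetical records: (segment id, predicted DOC in mg/l, segment length in km).
segments = [
    ("seg-1", 2.1, 1.8),
    ("seg-2", 9.4, 2.5),
    ("seg-3", 4.0, 0.7),
    ("seg-4", 12.3, 3.0),
]
DOC_THRESHOLD = 8.0  # placeholder value, not the Maryland DNR threshold

def attainment_by_km(segments, threshold):
    """Sum stream kilometers by predicted status relative to the threshold."""
    summary = {"unimpaired": 0.0, "potentially impaired": 0.0}
    for _segment_id, doc, length_km in segments:
        status = "potentially impaired" if doc > threshold else "unimpaired"
        summary[status] += length_km
    return summary

for status, km in attainment_by_km(segments, DOC_THRESHOLD).items():
    print(f"{status}: {km:.1f} km")
```

Swapping the single threshold for per-segment water quality standards would only require looking up the standard by segment id inside the loop.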

64
Conclusions
  • Geostatistical models generated more accurate DOC
    predictions than previous non-spatial models
    based on coarse-scale landscape data
  • SLD is more appropriate than WAHD for regional
    geostatistical modeling of DOC at this time
  • Adds value to existing water quality monitoring
    efforts
  • Used to comply with the CWA more easily
  • Additional field sampling is not necessary
  • Inferences about regional stream condition can be
    generated
  • It can be used to identify the spatial location
    of potentially impaired stream segments

65
Conclusions
  • Model predictions and prediction variances
  • Allow additional field efforts to be concentrated
    in
  • Areas with large amounts of uncertainty
  • Areas with a greater potential for water quality
    impairment
  • Model results can be displayed visually
  • Allows professionals to communicate results to a
    wide variety of audiences

66
Thank You!
  • Advisors: Dave Theobald and Melinda Laituri
  • Committee Members: Will Clements and Brian Bledsoe
  • Collaborators: N. Scott Urquhart, Jay M. Ver Hoef,
    and Andrew A. Merton
  • Team Theobald: Grant Wilcox, John Norman, Nate
    Peterson, and Melissa Sherburne
  • Dennis Ojima and Keith Paustian
  • Family and friends
  • My husband, Nate
67
Questions?