Erin Peterson - PowerPoint PPT Presentation

1
Predicting Water Quality Impaired Stream Segments
using Landscape-scale Data and a Regional
Geostatistical Model
  • Erin Peterson
  • Environmental Risk Technologies
  • CSIRO Mathematical and Information Sciences
  • St Lucia, Queensland

2
Space-Time Aquatic Resources Modeling and
Analysis Program
The work reported here was developed under STAR
Research Assistance Agreement CR-829095 awarded
by the U.S. Environmental Protection Agency (EPA)
to Colorado State University. This presentation
has not been formally reviewed by EPA. EPA does
not endorse any products or commercial services
mentioned in this presentation.
3
Collaborators
  • Dr. David M. Theobald, Natural Resource Ecology
    Lab, Department of Recreation and Tourism,
    Colorado State University, USA
  • Dr. N. Scott Urquhart, Department of Statistics,
    Colorado State University, USA
  • Dr. Jay M. Ver Hoef, National Marine Mammal
    Laboratory, Seattle, USA
  • Andrew A. Merton, Department of Statistics,
    Colorado State University, USA
4
Overview
  • Introduction
  • Background
  • Patterns of spatial autocorrelation in stream
    water chemistry
  • Predicting water quality impaired stream segments
    using landscape-scale data and a regional
    geostatistical model: a case study in Maryland,
    USA

5
Water Quality Monitoring Goals
  • Create a regional water quality assessment
  • Ecosystem Health Monitoring Program
  • Identify water quality impaired stream segments

6
Probability-based Random Survey Designs
  • Advantages
  • Statistical inference about population of streams
    over large area
  • Reported in stream kilometers
  • Disadvantages
  • Does not take watershed influence into account
  • Does not identify spatial location of impaired
    stream segments

7
Purpose
Develop a geostatistical methodology based on
coarse-scale GIS data and field surveys that can
be used to predict water quality characteristics
about stream segments found throughout a large
geographic area (e.g., state)
9
Geostatistical Modeling
  • Fit an autocovariance function to data
  • Describes relationship between observations based
    on separation distance

Distances and relationships are represented
differently depending on the distance measure
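As a minimal sketch of what an autocovariance function does, the snippet below builds a covariance matrix from a matrix of separation distances using an exponential model. The parameter names (`partial_sill`, `range_param`, `nugget`), the parameterization, and the distances are assumptions chosen for illustration and are not taken from the presentation.

```python
import numpy as np

def exponential_covariance(dist, partial_sill, range_param, nugget=0.0):
    """Covariance that decays exponentially with separation distance."""
    dist = np.asarray(dist, dtype=float)
    cov = partial_sill * np.exp(-dist / range_param)
    # The nugget adds to the variance only where the distance is exactly zero.
    return cov + nugget * (dist == 0.0)

# Pairwise distances (km) between three hypothetical sites.
d = np.array([[0.0, 5.0, 20.0],
              [5.0, 0.0, 15.0],
              [20.0, 15.0, 0.0]])
sigma = exponential_covariance(d, partial_sill=2.0, range_param=10.0, nugget=0.5)
```

Sites that are closer together receive a larger covariance, so swapping in a different distance matrix changes which sites count as near neighbors.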
10
Distance Measures and Spatial Relationships
Straight-line Distance (SLD)
  • Geostatistical models typically based on SLD
11
Distance Measures and Spatial Relationships
Symmetric Hydrologic Distance (SHD)
  • Hydrologic connectivity: fish movement
12
Distance Measures and Spatial Relationships
Asymmetric Hydrologic Distance
  • Longitudinal transport of material
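The two hydrologic distances can be illustrated on a toy network. This is a sketch under simplifying assumptions: sites sit at network nodes, the `DOWNSTREAM` table and its distances are invented, and the function names are hypothetical rather than part of the FLoWS tools.

```python
# Each site maps to (next site downstream, in-stream distance to it in km).
# The outlet has no downstream neighbour. All values are made up.
DOWNSTREAM = {
    "A": ("C", 4.0),
    "B": ("C", 3.0),
    "C": ("D", 5.0),
    "D": (None, 0.0),
}

def path_to_outlet(site):
    """Return {site_on_path: distance_from_start}, following flow downstream."""
    dist, path = 0.0, {}
    while site is not None:
        path[site] = dist
        nxt, step = DOWNSTREAM[site]
        dist += step
        site = nxt
    return path

def symmetric_hydrologic_distance(a, b):
    """Shortest in-stream distance between two sites, ignoring flow direction."""
    pa, pb = path_to_outlet(a), path_to_outlet(b)
    # The closest shared site downstream of both is where the two paths meet.
    return min(pa[s] + pb[s] for s in pa if s in pb)

def asymmetric_hydrologic_distance(upstream, downstream):
    """Distance from `upstream` down to `downstream`; None if not flow-connected."""
    return path_to_outlet(upstream).get(downstream)
```

Here sites A and B sit on different tributaries: their symmetric distance is 7 km through the confluence at C, but no water flows from one to the other, so the asymmetric distance is undefined.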
13
Distance Measures and Spatial Relationships
  • Challenge
  • Spatial autocovariance models developed for SLD
    may not be valid for hydrologic distances
  • Covariance matrix is not positive definite
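A small numerical check shows how this can happen. The example is an assumption made for illustration: four sites around a single confluence, a Gaussian and an exponential model, and a range parameter of 2 km. In-stream distances like these cannot occur as straight-line distances, which is why a model that is valid for SLD can fail here.

```python
import numpy as np

# Four hypothetical sites: one at a confluence (site 0) and one 1 km along
# each of the three stream branches that meet there (sites 1 to 3).
# In-stream distance: 1 km to the confluence, 2 km between branch sites.
d = np.array([[0, 1, 1, 1],
              [1, 0, 2, 2],
              [1, 2, 0, 2],
              [1, 2, 2, 0]], dtype=float)

def min_eigenvalue(cov):
    """Smallest eigenvalue of a symmetric matrix; positive definite needs > 0."""
    return float(np.linalg.eigvalsh(cov).min())

range_km = 2.0  # assumed range parameter
gaussian = np.exp(-(d / range_km) ** 2)
exponential = np.exp(-d / range_km)

print("Gaussian   :", min_eigenvalue(gaussian))     # negative: not positive definite
print("Exponential:", min_eigenvalue(exponential))  # positive: valid for this layout
```

The Gaussian model produces a negative eigenvalue for this layout, so the matrix could not be used as a covariance matrix, while the exponential model remains valid in this particular case.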

14
Asymmetric Autocovariance Models for Stream
Networks
  • Weighted asymmetric hydrologic distance (WAHD)
  • Developed by Jay Ver Hoef
  • Moving average models
  • Incorporate flow volume, flow direction, and use
    hydrologic distance
  • Positive definite covariance matrices

Ver Hoef, J.M., Peterson, E.E., and Theobald,
D.M., Spatial Statistical Models that Use Flow
and Stream Distance, Environmental and Ecological
Statistics. In Press.
15
Patterns of Spatial Autocorrelation in Stream
Water Chemistry
16
Objectives
  • Evaluate 8 chemical response variables
  • pH measured in the lab (PHLAB)
  • Conductivity measured in the lab (COND), µmho/cm
  • Dissolved oxygen (DO), mg/l
  • Dissolved organic carbon (DOC), mg/l
  • Nitrate-nitrogen (NO3), mg/l
  • Sulfate (SO4), mg/l
  • Acid neutralizing capacity (ANC), µeq/l
  • Temperature (TEMP), °C
  • Determine which distance measure is most
    appropriate
  • SLD
  • SHD
  • WAHD
  • More than one?

17
Dataset
  • Maryland Biological Stream Survey (MBSS) Data
  • Maryland Department of Natural Resources
  • Maryland, USA
  • 1995, 1996, 1997
  • Stratified probability-based random survey design
  • 881 sites in 17 interbasins

18
Maryland, USA
Baltimore
Annapolis
Washington D.C.
Chesapeake Bay
19
Spatial Distribution of MBSS Data
20
GIS Tools
Automated tools needed to extract data about
hydrologic relationships between survey sites did
not exist! Wrote Visual Basic for Applications
(VBA) programs to
  • Calculate watershed covariates for each stream
    segment
  • Functional Linkage of Watersheds and Streams
    (FLoWS)
  • Calculate separation distances between sites
  • SLD, SHD, Asymmetric hydrologic distance (AHD)
  • Calculate the spatial weights for the WAHD
  • Convert GIS data to a format compatible with
    statistics software
  • FLoWS tools will be available on the STARMAP
    website
  • http://nrel.colostate.edu/projects/starmap

21
Spatial Weights for WAHD
  • Proportional influence (PI): influence of each
    neighboring survey site on a downstream survey
    site
  • Weighted by catchment area: surrogate for flow
    volume
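One way such weights could be computed is sketched below. The details are assumptions, not something the slide states: each segment's PI is taken as its share of the catchment area arriving at the confluence below it, the PI between two flow-connected locations is the product of segment PIs along the path, and the spatial weight is the square root of that product. The `SEGMENTS` table is invented.

```python
import math

# Hypothetical network: segment -> (downstream segment, catchment area in km^2).
SEGMENTS = {
    "s1": ("s3", 30.0),
    "s2": ("s3", 10.0),
    "s3": ("s5", 40.0),
    "s4": ("s5", 60.0),
    "s5": (None, 100.0),
}

def segment_pi(seg):
    """Share of the catchment area arriving at the confluence below `seg`."""
    down, area = SEGMENTS[seg]
    if down is None:
        return 1.0
    incoming = sum(a for d, a in SEGMENTS.values() if d == down)
    return area / incoming

def proportional_influence(upstream, downstream):
    """Product of segment PIs from `upstream` down to, not including, `downstream`."""
    pi, seg = 1.0, upstream
    while seg is not None and seg != downstream:
        pi *= segment_pi(seg)
        seg = SEGMENTS[seg][0]
    return pi if seg == downstream else 0.0  # 0 when not flow-connected

weight = math.sqrt(proportional_influence("s1", "s5"))
```

In this toy network s1 contributes 75% of the area at its confluence and s3 contributes 40% at the next one, so the PI of s1 on s5 is 0.3.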

22
Spatial Weights for WAHD
[Figure: survey sites and stream segments]
23
Spatial Weights for WAHD
[Figure: network diagram with labels A through H]
24
Data for Geostatistical Modeling
  • Distance matrices
  • SLD, SHD, AHD
  • Spatial weights matrix
  • Contains flow dependent weights for WAHD
  • Watershed covariates
  • Lumped watershed covariates
  • Mean elevation, Urban
  • Observations
  • MBSS survey sites

25
Geostatistical Modeling Methods
  • Validation Set
  • Unique for each chemical response variable
  • Initial Covariate Selection
  • 5 covariates
  • Model Development
  • Restricted model space to all possible linear
    models
  • 4 model sets

27
Geostatistical Modeling Methods
  • Covariance matrix for SLD and SHD models
  • Fit exponential autocorrelation function

28
Geostatistical Modeling Methods
  • Model selection within model set
  • GLM: corrected Akaike Information Criterion
    (AICC)
  • Geostatistical models: spatial AICC (Hoeting et
    al., in press)
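The equation shown on this slide did not survive in the transcript. The form commonly given for the spatial AICC, written with the same symbols and with $\ell(\hat{\theta})$ assumed to denote the maximized log-likelihood, is:

```latex
\mathrm{AICC} = -2\,\ell(\hat{\theta}) + \frac{2n\,(p + k + 1)}{n - p - k - 2}
```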

where n is the number of observations, p-1 is the
number of covariates, and k is the number of
autocorrelation parameters.
http://www.stat.colostate.edu/jah/papers/spavarsel.pdf
  • Model selection between model types
  • 100 predictions: universal kriging algorithm
  • Mean square prediction error (MSPE)
  • Cannot use AICC to compare models based on
    different distance measures
  • Model comparison: r² for observed vs. predicted
    values
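The two comparison statistics can be computed as in the sketch below. It assumes r² means the squared Pearson correlation between observed and predicted values; the slide does not say how r² was calculated.

```python
import numpy as np

def mspe(observed, predicted):
    """Mean square prediction error over a set of validation sites."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((observed - predicted) ** 2))

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(observed, predicted)[0, 1] ** 2)
```

Unlike AICC, both statistics depend only on the predictions, so they can be compared across models built on different distance measures.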

29
Results
  • Summary statistics for distance measures
  • Spatial neighborhood differs
  • Affects number of neighboring sites
  • Affects median, mean, and maximum separation
    distance

30
Results
Mean range values: SLD 28.2 km, SHD 88.03 km,
WAHD 57.8 km
  • Range of spatial autocorrelation differs
  • Shortest for SLD
  • TEMP shortest range values
  • DO largest range values

31
Results
  • Distance Measures
  • GLM always has less predictive ability
  • More than one distance measure usually performed
    well
  • SLD, SHD, WAHD: PHLAB, DOC
  • SLD and SHD: ANC, DO, NO3
  • WAHD and SHD: COND, TEMP
  • SLD: SO4

32
Results
Predictive ability of models (r²)
  • Strong: ANC, COND, DOC, NO3, PHLAB
  • Weak: DO, TEMP, SO4
33
Discussion
Distance measure influences how spatial
relationships are represented in a stream network
  • Site's relative influence on other sites
  • Dictates form and size of spatial neighborhood
  • Important because
  • Impacts accuracy of the geostatistical model
    predictions

35
Discussion
  • Probability-based random survey design negatively
    affected WAHD
  • Maximize spatial independence of sites
  • Does not represent spatial relationships in
    networks
  • Validation sites randomly selected

36
Discussion
WAHD models explained more variability as
neighboring sites increased
  • Not when neighbors had
  • Similar watershed conditions
  • Significantly different chemical response values

37
Discussion
  • GLM predictions improved as number of neighbors
    increased
  • Clusters of sites in space have similar watershed
    conditions
  • Statistical regression pulled towards the cluster
  • GLM contained hidden spatial information
  • Explained additional variability in data with more
    neighbors

38
Predictive Ability of Geostatistical Models (r²)
39
Conclusions
  • Spatial autocorrelation exists in stream
    chemistry data at a relatively coarse scale
  • Geostatistical models improve the accuracy of
    water chemistry predictions
  • Patterns of spatial autocorrelation differ
    between chemical response variables
  • Ecological processes acting at different spatial
    scales
  • SLD is the most suitable distance measure at
    regional scale at this time
  • Unsuitable survey designs
  • SHD GIS processing time is prohibitive

40
Conclusions
  • Results are scale specific
  • Spatial patterns change with survey scale
  • Other patterns may emerge at shorter separation
    distances
  • Further research is needed at finer scales
  • Watershed or small stream network
  • New survey designs for stream networks
  • Capture both coarse and fine scale variation
  • Ensure that hydrologic neighborhoods are
    represented

41
Predicting Water Quality Impaired Stream Segments
using Landscape-scale Data and a Regional
Geostatistical Model: A Case Study in Maryland
42
Objective
Demonstrate how a geostatistical methodology can
be used to complement regional water quality
monitoring efforts
  • Predict regional water quality conditions
  • Identify the spatial location of potentially
    impaired stream segments

44
Methods
Potential covariates
45
Methods
Potential covariates after initial model
selection (10)
46
Methods
  • Fit geostatistical models
  • Two distance measures: SLD and WAHD
  • Restricted model space to all possible linear
    models
  • 1024 models per set
  • 9 model sets
  • Parameter Estimation
  • Maximized profile log-likelihood function
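The figure of 1024 models per set is consistent with fitting every subset of the 10 candidate covariates, since 2^10 = 1024. The sketch below enumerates such a model space. The covariate names are placeholders, because the transcript does not list all ten.

```python
from itertools import combinations

# Placeholder names; the transcript does not list all ten covariates.
covariates = [f"x{i}" for i in range(1, 11)]

# Every subset of the covariates, from the empty set up to all ten.
models = [subset
          for size in range(len(covariates) + 1)
          for subset in combinations(covariates, size)]
print(len(models))  # 1024 = 2**10, including the intercept-only model
```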

47
Methods
48
Results
  • SLD models performed better than WAHD
  • Exception: spherical model
  • Best models
  • SLD: exponential, Mariah, and rational quadratic
    models
  • r² for SLD model predictions
  • Almost identical
  • Further analysis restricted to SLD Mariah model

49
Results
  • Covariates for SLD Mariah model
  • WATER, EMERGWET, WOODYWET, FELPERC, MINTEMP
  • Positive relationship with DOC: WATER, EMERGWET,
    WOODYWET, MINTEMP
  • Negative relationship with DOC: FELPERC

50
Cross-validation intervals for Mariah model
regression coefficients
  • Cross-validation interval: 95% of regression
    coefficients produced by leave-one-out
    cross-validation procedure
  • Narrow intervals
  • Few extreme regression coefficient values
  • Not produced by common sites
  • Covariate values for the site are represented in
    observed data
  • Not clustered in space
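The idea of a cross-validation interval can be sketched as follows. This is an illustration only: it refits an ordinary least squares model rather than the geostatistical model used in the study, the data are simulated, and taking the central 95% of the leave-one-out coefficients by percentile is an assumed reading of the slide.

```python
import numpy as np

def loo_coefficient_intervals(X, y, level=0.95):
    """Refit OLS with each site left out; return percentile bounds per coefficient."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    n = len(y)
    coefs = np.empty((n, X.shape[1]))
    for i in range(n):
        keep = np.arange(n) != i  # drop site i, keep the rest
        coefs[i], *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    tail = (1.0 - level) / 2.0 * 100.0
    return np.percentile(coefs, [tail, 100.0 - tail], axis=0)

# Simulated data: intercept plus one covariate with a true slope of 2.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
X = np.column_stack([np.ones(50), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=50)
lower, upper = loo_coefficient_intervals(X, y)
```

A narrow interval means that no single site changes the fitted coefficient much when it is left out.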

51
r² Observed vs. Predicted Values
  • 1 influential site: r² without site = 0.66
  • n = 312 sites: r² = 0.72
52
Model Fit
53
Discussion
  • SLD models more accurate than WAHD models
  • Landscape-scale covariates were not restricted to
    watershed boundaries
  • Geology type
  • Temperature
  • Wetlands and water

54
Discussion
  • Regression Coefficients
  • Narrow cross-validation intervals
  • Spatial location of the sites not as important as
    watershed characteristics
  • Extreme regression coefficient values
  • Not produced by common sites
  • Not clustered in space
  • Local-scale factor may have affected stream DOC
  • Point source of organic waste

55
Spatial Patterns in Model Fit
  • North and east of Chesapeake Bay - large squared
    prediction error (SPE) values
  • Naturally acidic blackwater streams with elevated
    DOC
  • Not well represented in observed dataset
  • 2 blackwater sites
  • Geostatistical model unable to account for
    natural variability
  • Large square prediction errors
  • Large prediction variances

56
Spatial Patterns in Model Fit
  • West of Chesapeake Bay - low SPE values
  • Due to statistical and spatial distribution of
    observed data
  • Regression equation fit to the mean in the data
  • Most observed sites have low DOC values
  • Less variation in western and central Maryland
  • Neighboring sites tend to be similar
  • Separation distances shorter in the west
  • Short separation distances mean stronger covariances

57
Model Performance
Unable to account for abrupt differences in DOC
values between neighboring sites with similar
watershed conditions
  • What caused abrupt differences?
  • Point sources of organic pollution
  • Not represented in the model
  • Non-point sources of pollution
  • Lumped watershed attributes are non-spatial
  • Differences due to spatial location of landuse
    are not represented
  • Challenging to represent ecological processes
    using coarse-scale lumped attributes
  • e.g., flow path of water

58
Generate Model Predictions
  • Prediction sites
  • Study area
  • 1st, 2nd, and 3rd order non-tidal streams
  • 3083 segments, 5973 stream km
  • ID downstream node of each segment
  • Create prediction site
  • More than one site at each confluence
  • Generate predictions and prediction variances
  • SLD Mariah model
  • Universal kriging algorithm
  • Assigned predictions and prediction variances
    back to stream segments in GIS
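The prediction step can be sketched with the textbook universal kriging equations. This is not the code used in the study: it assumes the covariance matrix has already been built from a fitted autocovariance model, and the coordinates, covariate values, and exponential covariance with a 5 km range in the usage lines are invented.

```python
import numpy as np

def universal_kriging(X, z, cov, x0, c0, var0):
    """Predict at one site.

    X: (n, p) design matrix, z: (n,) observations, cov: (n, n) covariance
    among observed sites, x0: (p,) covariates at the prediction site,
    c0: (n,) covariance between observed sites and the prediction site,
    var0: variance at the prediction site.
    """
    cov_inv_X = np.linalg.solve(cov, X)
    cov_inv_z = np.linalg.solve(cov, z)
    cov_inv_c0 = np.linalg.solve(cov, c0)
    XtCX = X.T @ cov_inv_X
    beta = np.linalg.solve(XtCX, X.T @ cov_inv_z)  # generalized least squares
    prediction = x0 @ beta + cov_inv_c0 @ (z - X @ beta)
    d = x0 - X.T @ cov_inv_c0
    variance = var0 - c0 @ cov_inv_c0 + d @ np.linalg.solve(XtCX, d)
    return float(prediction), float(variance)

# Four hypothetical sites along a line, one covariate plus an intercept.
coords = np.array([0.0, 3.0, 7.0, 12.0])
dist = np.abs(coords[:, None] - coords[None, :])
cov = np.exp(-dist / 5.0)
X = np.column_stack([np.ones(4), [0.2, 0.5, 0.1, 0.9]])
z = np.array([2.1, 3.0, 1.8, 4.2])

# Predicting at the first observed site (no nugget) returns its observed value.
pred, var = universal_kriging(X, z, cov, X[0], cov[:, 0], cov[0, 0])
```

The prediction variance grows with distance from observed sites, which is what makes it useful for flagging uncertain stream segments.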

60
Weak Model Fit
61
Strong Model Fit
62
Water Quality Attainment by Stream Kilometers
  • Threshold values for DOC
  • Set by Maryland Department of Natural Resources
  • High DOC values may indicate biological or
    ecological stress
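Reporting attainment by stream kilometers amounts to classifying each segment's prediction against a threshold and summing segment lengths. In the sketch below the segment lengths, the DOC predictions, and the threshold of 8.0 mg/l are all made up for illustration; the transcript does not give the threshold values.

```python
def attainment_by_km(segments, threshold):
    """Sum stream length by predicted status.

    segments: iterable of (length_km, predicted_value) pairs.
    """
    totals = {"potentially impaired": 0.0, "unimpaired": 0.0}
    for length_km, predicted in segments:
        status = "potentially impaired" if predicted > threshold else "unimpaired"
        totals[status] += length_km
    return totals

# Made-up segment lengths (km) and DOC predictions (mg/l); threshold is hypothetical.
segments = [(2.0, 3.1), (1.5, 9.4), (4.0, 6.0), (0.5, 12.2)]
totals = attainment_by_km(segments, threshold=8.0)
```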

63
Implications for Water Quality Monitoring
  • Tradeoff between cost-efficiency and model
    accuracy
  • Western Maryland
  • Can be described using a single geostatistical
    model
  • Eastern and northeastern Maryland
  • Accept poor model fit
  • Collect additional survey data
  • Develop a separate geostatistical model for
    eastern Maryland

64
Implications for Water Quality Monitoring
  • Apply this methodology to other regulated indices
  • e.g. conductivity and pH
  • Categorize predictions into potentially impaired
    or unimpaired status
  • Report on attainment in stream miles/kilometers

65
Conclusions
  • Geostatistical models generated more accurate DOC
    predictions than previous non-spatial models
    based on coarse-scale landscape data
  • SLD is more appropriate than WAHD for regional
    geostatistical modeling of DOC at this time
  • Probability-based random survey designs
  • Maryland, USA
  • Adds value to existing water quality monitoring
    efforts
  • Used to evaluate/report regional water quality
    conditions
  • Additional field sampling is not necessary
  • Generate inferences about regional stream
    condition
  • ID spatial location of potentially impaired
    stream segments

66
Conclusions
  • Model predictions and prediction variances
  • Additional field efforts concentrated in
  • Areas with large amounts of uncertainty
  • Areas with a greater potential for water quality
    impairment
  • Model results displayed visually
  • Communicate results to a variety of audiences

67
Questions?