Statistics in WR: Session 20

Statistics in WR: Session 20. Introduction to Spatial Statistics. Ernest To. Results of the interpolation can be presented as a 3D volume in space and time.

Slides: 56

Provided by: ceUtexas9

Learn more at: https://www.caee.utexas.edu

Transcript and Presenter's Notes

1
Statistics in WR: Session 20

Introduction to Spatial Statistics
Ernest To

2
Outline

Basics of spatial statistics
Kriging
Application of spatial-temporal statistics
(Gravity currents in CCBay)

3
Basics
4
Consider the following scenario

Two river stations, A and B, measure dissolved
oxygen (DO).
At station A:
mean DO: $\mu_A = 5$ mg/L
std. dev.: $\sigma_A = 2$ mg/L
At station B:
mean DO: $\mu_B = 5$ mg/L
std. dev.: $\sigma_B = 2$ mg/L
Correlation between measurements at stations A
and B: $\rho_{AB} = 0.5$.

5
New data!

We collected a DO measurement of 2 mg/L at
Station A.
What are the updated mean ($\mu_{B|X_A}$) and standard
deviation ($\sigma_{B|X_A}$) at Station B?
(Assume that the DO distributions are normal.)

Station A: $\mu_A = 5$ mg/L, $\sigma_A = 2$ mg/L, new sample $x_A = 2$ mg/L
Station B: $\mu_B = 5$ mg/L, $\sigma_B = 2$ mg/L, $\mu_{B|X_A} = ?$, $\sigma_{B|X_A} = ?$
6
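Let's sketch out the distributions

Distributions at A and B (assume normal)
Joint distribution at A and B

[Figure: marginal pdfs $f(x_A)$ and $f(x_B)$, with $\mu_A = 5$ mg/L, $\sigma_A = 2$ mg/L and $\mu_B = 5$ mg/L, $\sigma_B = 2$ mg/L, and the joint pdf $f(x_A, x_B)$ over the $X_A$ and $X_B$ axes]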
7
Marginal and joint distributions
8
How does $\rho_{AB}$ affect the shape of the joint
distribution?
[Figure: scatter plots of $X_A$ vs $X_B$ and the joint distribution $f(x_A, x_B)$ of $X_B$ and $X_A$, for $\rho_{AB} = 0.5$, $0.99$, $0$, and $-0.99$]
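
The effect of the correlation on the scatter can be reproduced numerically. The sketch below is an illustration only, not part of the original slides: it draws correlated pairs from a bivariate normal distribution using the station means and standard deviations given earlier (5 mg/L and 2 mg/L). The function names `sample_pairs` and `sample_correlation`, the sample size, and the seed are assumptions chosen for the example.

```python
import math
import random


def sample_pairs(mu_a, sigma_a, mu_b, sigma_b, rho, n, seed=0):
    """Draw n (x_a, x_b) pairs from a bivariate normal with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x_a = mu_a + sigma_a * z1
        # Mixing z1 and z2 this way gives corr(x_a, x_b) = rho.
        x_b = mu_b + sigma_b * (rho * z1 + math.sqrt(1.0 - rho**2) * z2)
        pairs.append((x_a, x_b))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)


# The four correlations shown on the slide.
for rho in (0.5, 0.99, 0.0, -0.99):
    pairs = sample_pairs(5.0, 2.0, 5.0, 2.0, rho, n=20000)
    print(rho, round(sample_correlation(pairs), 2))
```

Plotting the pairs would show a round cloud at a correlation of 0 and a cloud that collapses toward a line as the correlation approaches 0.99 or -0.99.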
9
Bayesian conditioning
PRIOR STAGE: The prior pdf is the joint distribution of $X_A$ and $X_B$.
CONDITIONALIZATION STAGE: Observed data ($x_A = 2$ mg/L) is used to
update the distribution.
POSTERIOR STAGE: A conditional pdf for $X_B$ is
generated.
10
Conditional pdf
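If the prior pdf is binormal, the conditional pdf of $X_B | X_A$
is also normal, with
Mean: $\mu_{B|X_A} = \mu_B + \rho_{AB} \frac{\sigma_B}{\sigma_A} (x_A - \mu_A)$
Variance: $\sigma^2_{B|X_A} = \sigma_B^2 (1 - \rho_{AB}^2)$
(The variance is independent of $X_A$ or $X_B$:
homoscedasticity.)
The expected value of the conditional pdf is a linear
function of the conditioning data.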
11
Back to the problem

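Updated mean and std. dev. at Station B
Mean: $\mu_{B|X_A} = 5 + 0.5 \cdot \frac{2}{2} (2 - 5) = 3.5$ mg/L
Std. dev.: $\sigma_{B|X_A} = 2 \sqrt{1 - 0.5^2} \approx 1.7$ mg/L

Station A: $\mu_A = 5$ mg/L, $\sigma_A = 2$ mg/L, new sample $x_A = 2$ mg/L
Station B: $\mu_B = 5$ mg/L, $\sigma_B = 2$ mg/L, $\mu_{B|X_A} = 3.5$ mg/L, $\sigma_{B|X_A} = 1.7$ mg/L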
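
The two-station update can be checked with a few lines of code. This is a minimal sketch assuming the standard conditional mean and variance of a bivariate normal distribution; the function name `condition_bivariate_normal` is made up for the example.

```python
import math


def condition_bivariate_normal(mu_a, sigma_a, mu_b, sigma_b, rho, x_a):
    """Mean and std. dev. of X_B given an observation x_a of X_A."""
    mean = mu_b + rho * (sigma_b / sigma_a) * (x_a - mu_a)
    std = sigma_b * math.sqrt(1.0 - rho**2)
    return mean, std


# Values from the slide: both stations at 5 +/- 2 mg/L, rho = 0.5,
# and a new sample of 2 mg/L at Station A.
mean_b, std_b = condition_bivariate_normal(5.0, 2.0, 5.0, 2.0, 0.5, 2.0)
print(f"{mean_b:.2f} mg/L, {std_b:.2f} mg/L")  # 3.50 mg/L, 1.73 mg/L
```

Note that the updated standard deviation does not depend on the observed value, only on the correlation.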
12
Can we do the same for any two points on the
river?

Yes, we can, but under the following conditions:
Normality
2nd-order stationarity
Mean does not change with location
Variance does not change with location
We know the mean and variance.
We have a function that determines the correlation
between two locations.

[Figure: river with points A and B; $\mu = 5$ mg/L and $\sigma = 2$ mg/L at every location]
13
Modeling correlation

In spatial statistics, correlation is modeled as
a function of the separation distance between two
points:
$\rho = f(h)$
where $h$ is the separation distance (aka lag).

Most of the time, correlation decreases with
distance. (Things that are closer together tend
to be more correlated with each other.)
14
Estimating correlation model from data

Imagine the case where we have a smattering of
data along an axis.
Any given pair of data points, i and j, will have
two properties:
1. The semivariance $\gamma_{ij} = 0.5 (Z_i - Z_j)^2$
2. The separation distance $h_{ij}$

15
Estimating correlation model from data

We can plot the semivariance, $\gamma$, of all possible
pairs against the lag, $h$. This gives us a
variogram.
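
The pairwise computation can be sketched as follows. The data values are hypothetical, and the binning step (averaging the semivariances of pairs with similar lags) is a common practice added here as an assumption; the slides only describe plotting all pairs.

```python
from itertools import combinations


def semivariogram_cloud(positions, values):
    """Return (lag, semivariance) for every pair of data points."""
    cloud = []
    for i, j in combinations(range(len(values)), 2):
        lag = abs(positions[i] - positions[j])
        gamma = 0.5 * (values[i] - values[j]) ** 2
        cloud.append((lag, gamma))
    return cloud


def binned_semivariogram(cloud, bin_width):
    """Average the semivariance within lag bins of equal width."""
    sums = {}
    for lag, gamma in cloud:
        k = int(lag // bin_width)
        total, count = sums.get(k, (0.0, 0))
        sums[k] = (total + gamma, count + 1)
    # Report each bin at its center lag.
    return [((k + 0.5) * bin_width, total / count)
            for k, (total, count) in sorted(sums.items())]


# Hypothetical measurements along an axis (not from the slides).
positions = [0.0, 1.0, 2.0, 4.0]
values = [5.0, 4.0, 6.0, 3.0]

cloud = semivariogram_cloud(positions, values)
for center, gamma in binned_semivariogram(cloud, bin_width=2.0):
    print(center, round(gamma, 3))
```

With n data points there are n(n-1)/2 pairs, so the cloud grows quickly; binning keeps the plot readable before a model curve is fitted.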

16
Estimating correlation model from data

We can fit a curve through the semivariogram to
model the semivariance as a function of the lag.
This is the variogram model.

17
Estimating correlation model from data

[Figure: the fitted variogram model, annotated with the sill and the range]
18
Estimating correlation model from data

Assuming that mean and variance do not change
with location (assumption of stationarity), the
variogram model is related to the
covariance model by the equation
$C(h) = \sigma^2 - \gamma(h)$
where $\sigma^2$ is the variance.
Estimating correlation model from data

Assuming that variance does not change with
location (assumption of stationarity), the
correlation model is related to the
covariance model by the equation
$\rho(h) = C(h) / \sigma^2$
[Figure: plot of $\rho(h)$, with axis ticks from 0.2 to 1]
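
The chain from variogram to covariance to correlation can be sketched in code. The bounded linear variogram below is a hypothetical stand-in chosen only because it is easy to check by hand; it is not the model used in the slides, and the range of 8 is an assumed value.

```python
def bounded_linear_variogram(h, sill, range_):
    """Hypothetical model: rises linearly from 0 to the sill at the range."""
    return sill * min(abs(h) / range_, 1.0)


def covariance(h, variogram, variance):
    """C(h) = variance - gamma(h), valid under stationarity."""
    return variance - variogram(h)


def correlation(h, variogram, variance):
    """rho(h) = C(h) / variance."""
    return covariance(h, variogram, variance) / variance


variance = 4.0  # sigma = 2 mg/L, so sigma^2 = 4 (mg/L)^2


def model(h):
    # Under stationarity the sill equals the variance.
    return bounded_linear_variogram(h, sill=variance, range_=8.0)


for h in (0.0, 2.0, 4.0, 8.0):
    print(h, covariance(h, model, variance), correlation(h, model, variance))
```

At zero lag the correlation is 1, and beyond the range it drops to 0, which mirrors the variogram climbing from 0 to the sill.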
20
How does the correlation model affect the
estimation?
[Figure: for $\rho_{AB} = 0$, $0.5$, and $0.99$, scatter plots of $X_A$ vs $X_B$, the joint distribution $f(x_A, x_B)$ of $X_A$ and $X_B$, and the conditional distribution of $X_B | X_A$; an arrow is labeled "Increasing h"]
21
Kriging
22
Multivariable case

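What if we have more than one location that
provides conditioning data?
(Assume distributions are STILL normal at all
locations.)
At stations A1, A2, A3, A4:
$\mu_{A1} = \mu_{A2} = \mu_{A3} = \mu_{A4} = 5$ mg/L
$\sigma_{A1} = \sigma_{A2} = \sigma_{A3} = \sigma_{A4} = 2$ mg/L
At station B:
mean DO: $\mu_B = 5$ mg/L
std. dev.: $\sigma_B = 2$ mg/L
Correlation model: $\rho = f(h) = 0.0125 h^2 - 0.225 h + 1$

[Figure: stations A1, A2, A3, A4, and B along the river]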
23
Modeling correlation
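$\rho = f(h) = 0.0125 h^2 - 0.225 h + 1$
[Figure: stations along the river; distance along river (in hundred meters), with a spacing of 2 between adjacent stations]
From the correlation model:
$\rho_{A1B} = 0.0$, $\rho_{A2B} = 0.1$, $\rho_{A3B} = 0.3$, $\rho_{A4B} = 0.6$
$\rho_{A1A2} = 0.6$, $\rho_{A1A3} = 0.3$, $\rho_{A1A4} = 0.1$, $\rho_{A2A3} = 0.6$, $\rho_{A2A4} = 0.3$, $\rho_{A3A4} = 0.6$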
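
The listed correlations can be reproduced from the correlation model. The sketch below assumes the stations sit at 0, 2, 4, 6, and 8 (in hundred meters) along the river, which is inferred from the 2-unit spacing in the figure and is consistent with the values above; the dictionary `positions` and the helper `pair_correlation` are illustrative names.

```python
def rho(h):
    """Correlation model from the slide; h is in hundred meters."""
    return 0.0125 * h**2 - 0.225 * h + 1.0


# Assumed station positions along the river (hundred meters).
positions = {"A1": 0.0, "A2": 2.0, "A3": 4.0, "A4": 6.0, "B": 8.0}


def pair_correlation(first, second):
    """Correlation between two stations from their separation distance."""
    return rho(abs(positions[first] - positions[second]))


stations = list(positions)
for i, first in enumerate(stations):
    for second in stations[i + 1:]:
        print(first, second, round(pair_correlation(first, second), 3))
```

Because the model depends only on the lag, every pair of adjacent stations gets the same correlation.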
24
Dealing with multiple variables

Divide locations into two groups:
The vector, $\mathbf{X}_A$, representing the set of random
variables at the locations contributing the
conditioning data.
The variable, $X_B$, representing the random
variable at the point of estimation.

[Figure: stations A1, A2, A3, A4 (conditioning data) and B (point of estimation)]
25
Concept
1. If the individual distributions are normal, the joint
pdf is taken to be multi-normal.
2. Group the variables into two: one for the points
with data, one for the point of estimation.
This gives the prior pdf.
3. Intersect the pdf with the conditioning data to get
the conditional pdf.
26
Dealing with multiple variables

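The updated mean and variance of the distribution
at Station B are given by
Mean: $\mu_{B|\mathbf{X}_A} = \mu_B + \Sigma_{BA} \Sigma_{AA}^{-1} (\mathbf{x}_A - \boldsymbol{\mu}_A)$
Variance: $\sigma^2_{B|\mathbf{X}_A} = \sigma_B^2 - \Sigma_{BA} \Sigma_{AA}^{-1} \Sigma_{AB}$
where
$\mathbf{x}_A$ is the vector of observed values at A1 to A4 and $\boldsymbol{\mu}_A$ is the vector of their means,
$\Sigma_{AA}$ is the covariance matrix among the data locations,
$\Sigma_{AB}$ is the vector of covariances between the data locations and B, and $\Sigma_{BA} = \Sigma_{AB}^T$.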
27
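Equations in the multivariable case are more
general
Recall the two-variable case:
$\mu_{B|X_A} = \mu_B + \rho_{AB} \frac{\sigma_B}{\sigma_A} (x_A - \mu_A)$
$\sigma^2_{B|X_A} = \sigma_B^2 (1 - \rho_{AB}^2)$

The multivariable case:
$\mu_{B|\mathbf{X}_A} = \mu_B + \Sigma_{BA} \Sigma_{AA}^{-1} (\mathbf{x}_A - \boldsymbol{\mu}_A)$
$\sigma^2_{B|\mathbf{X}_A} = \sigma_B^2 - \Sigma_{BA} \Sigma_{AA}^{-1} \Sigma_{AB}$
It takes into account:
Correlation between the data locations and the estimated
location (the covariance vector $\Sigma_{AB}$, with $\Sigma_{BA} = \Sigma_{AB}^T$).
Correlation among the data locations (the covariance matrix $\Sigma_{AA}$).
This is the most fundamental form of kriging,
i.e. Simple Kriging.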
28
Plug and Chug

Recall that $\mathrm{Cov}(A, B) = \rho_{AB} \sigma_A \sigma_B$
Compute data-to-data correlation

29
Plug and Chug

Compute data to estimation point correlation

30
Plug and Chug
weights
Note The weights attributed to each station are
determined by the prior (joint distribution)
among them.
31
Plug and Chug
Weights: $\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_n$
32
Plug and Chug
33
Plug and Chug
34
Results from Simple Kriging

The updated mean and standard deviation of the
distribution at Station B are
Mean
Standard deviation

A1
A2
A3
A4
B
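
The plug-and-chug steps can be sketched end to end. This is a minimal illustration of simple kriging under stated assumptions, not the author's worked numbers: the station positions (0, 2, 4, 6, 8 in hundred meters) are inferred from the 2-unit spacing, the observed DO values in `observations` are hypothetical because the transcript does not give them, and `solve` is a small helper written for this example.

```python
def rho(h):
    """Correlation model from the slides; h is in hundred meters."""
    return 0.0125 * h**2 - 0.225 * h + 1.0


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: move the largest entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * p for a, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


mu, sigma = 5.0, 2.0                 # known mean and std. dev. (mg/L)
data_x = [0.0, 2.0, 4.0, 6.0]        # assumed positions of A1..A4
target_x = 8.0                       # assumed position of B
observations = [4.0, 4.5, 3.0, 2.0]  # hypothetical DO values (mg/L)

# Cov(A, B) = rho_AB * sigma_A * sigma_B, with equal sigmas everywhere.
cov_data = [[sigma**2 * rho(abs(a - b)) for b in data_x] for a in data_x]
cov_target = [sigma**2 * rho(abs(a - target_x)) for a in data_x]

# Kriging weights: solve (data-to-data covariance) * w = (data-to-target).
weights = solve(cov_data, cov_target)

kriged_mean = mu + sum(w * (x - mu) for w, x in zip(weights, observations))
kriging_variance = sigma**2 - sum(w * c for w, c in zip(weights, cov_target))

print([round(w, 4) for w in weights])
print(round(kriged_mean, 3), round(kriging_variance**0.5, 3))
```

The weights and the kriging variance depend only on the station geometry and the correlation model; only the updated mean depends on the observed values.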
35
Other forms of kriging

Ordinary kriging (OK)
Does not require mean to be known
Assumes that mean is constant and is somewhere in
the range of the conditioning data
Universal kriging (UK)
Does not require mean to be known nor require it
to be constant
User specifies a model for the trend in mean. UK
will then fit the model to the data.
Indicator kriging (IK)
Handles binary variables (0 or 1)
Has the ability to take care of non-normality in data
through iterative application.
Co-kriging (CK)
Takes into account a related secondary variable
to help estimate the primary variable.

36
Extension to 2D, 3D

The lag can be represented by the Euclidean
distance between 2 points,
so a covariance model of the form $C = f(h)$
can still be used.
Variables may be more correlated in one direction
than the other (anisotropy).
A linear transformation can be performed to
transform the distances so the correlation
distance is the same in all directions (isotropy).
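
One way to picture the linear transformation is a rotation into the principal directions followed by a stretch of the short-range axis. The sketch below assumes geometric anisotropy in 2D; the function name `isotropic_lag`, the angle convention, and the two ranges are hypothetical values for illustration.

```python
import math


def isotropic_lag(p, q, angle_deg, range_major, range_minor):
    """Lag between 2D points p and q after removing geometric anisotropy.

    angle_deg is the direction of the major (long-range) axis, measured
    counterclockwise from the x axis.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    theta = math.radians(angle_deg)
    # Rotate the separation vector into the major/minor axes.
    along = dx * math.cos(theta) + dy * math.sin(theta)
    across = -dx * math.sin(theta) + dy * math.cos(theta)
    # Stretch the minor axis so both directions share the major range.
    across *= range_major / range_minor
    return math.hypot(along, across)


# Hypothetical ranges: 6000 m along x, 2000 m along y.
print(isotropic_lag((0.0, 0.0), (6000.0, 0.0), 0.0, 6000.0, 2000.0))
print(isotropic_lag((0.0, 0.0), (0.0, 2000.0), 0.0, 6000.0, 2000.0))
```

Both pairs are one range apart in their own direction, so they end up with the same transformed lag and a single isotropic model $C = f(h)$ can be applied.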

37
Extension to space-time

For space and time, there is no standard
space-time metric.
Treating time as just another axis in a single Euclidean
distance is not always correct because the temporal and
spatial axes are not always orthogonal to each
other.
Processes that happen in time usually have some
dependency on processes that happen in space.
(They are not independent.)
A separate temporal lag term is usually used, so
the covariance function depends on both the spatial
lag and the temporal lag.

38
Application (Gravity currents in Corpus Christi
Bay)
39
Sensors in Corpus Christi Bay
[Map: aerial photo from Google Earth of Corpus Christi Bay, Oso Bay, Laguna Madre, and the Gulf of Mexico, showing TCOON stations, TCEQ stations, HRI stations, USGS gages, and SERF stations]
42
Selecting a study area
[Map of the study area around Oso Bay, West Laguna Madre, and East Laguna Madre, with depressions, ridges, and a channel marked. Legend: -5.0, -4.5, -4.0, -3.5, -2.5, -2.0, -1.5, and -1.0 m above Mean High Water Level]
43
Downstream of East Laguna Madre
Water quality data: July 12 and 18, 2006
(at birth and demise of gravity current)
Paul Montagna
Texas A&M University, Corpus Christi

Plume tracking survey: July 14 to 17, 2006
(while gravity current was on the move)
Ben Hodges
University of Texas at Austin

44
Synthesis of data
45
Data Preparation
1. Salinity data from HRI are acquired using
HydroGet (a GIS web service client) and combined
with plume tracking data.
2. Data locations are projected onto a reference
line following the general direction of flow.

Space-time kriging is performed in 3 dimensions:
X: Longitudinal measure
(meters from origin point)
Y: Time
(days since 7/12/2006)
Z: Elevation
(meters from water surface)

[Figure: reference line, with origin at x = 0 m]
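
The projection step can be sketched as a dot product with the unit vector of the reference line. This is an illustration under assumptions: the reference line is taken to be straight, coordinates are planar and in meters, and the function name `longitudinal_measure` and the sample coordinates are hypothetical.

```python
import math


def longitudinal_measure(point, origin, direction_point):
    """Distance along the reference line from the origin to the projection
    of `point`. The line runs from `origin` toward `direction_point`."""
    dx = direction_point[0] - origin[0]
    dy = direction_point[1] - origin[1]
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length  # unit vector along the line
    return (point[0] - origin[0]) * ux + (point[1] - origin[1]) * uy


# Hypothetical sample 4 m off a line that runs along the x axis.
print(longitudinal_measure((3.0, 4.0), (0.0, 0.0), (10.0, 0.0)))  # 3.0
```

The offset perpendicular to the line is discarded, which is what reduces the horizontal position to the single coordinate X.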
46
Variogram along direction of flow
where $h$ = lag distance along direction of
flow, $C_0$ = nugget = 2 psu$^2$, $C_1$ = sill = 3.6 psu$^2$, $a$ =
range = 6000 m (Gaussian variogram model)
47
Variogram along direction of flow
[Figure: the same variogram, annotated with the sill, nugget, and range]
48
Variogram along depth
where $h$ = lag distance along depth,
$C_0$ = nugget = 0 psu$^2$, $C_1$ = sill = 3.6 psu$^2$, $a$ =
range = 1.7 m (Gaussian variogram model)
49
Variogram along time axis
where $h$ = lag along the time axis,
$C_0$ = nugget = 0 psu$^2$, $C_1$ = sill = 3 psu$^2$, $a$ =
range = 1 day (Spherical variogram model)
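
The three fitted models can be evaluated with the parameters listed above. The sketch assumes common textbook forms, since the transcript does not preserve the equations: the Gaussian model is written with the practical-range convention (a factor of 3 in the exponent), and $C_1$ is treated as the contribution added on top of the nugget. Both are assumptions, and other conventions exist.

```python
import math


def gaussian_variogram(h, nugget, sill, range_):
    """Gaussian model, practical-range convention (assumed)."""
    if h == 0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-3.0 * (h / range_) ** 2))


def spherical_variogram(h, nugget, sill, range_):
    """Spherical model; reaches nugget + sill exactly at the range."""
    if h == 0:
        return 0.0
    if h >= range_:
        return nugget + sill
    r = h / range_
    return nugget + sill * (1.5 * r - 0.5 * r**3)


# Parameters from the slides (psu^2; lags in m, m, and days).
print(gaussian_variogram(3000.0, nugget=2.0, sill=3.6, range_=6000.0))  # along flow
print(gaussian_variogram(1.0, nugget=0.0, sill=3.6, range_=1.7))        # along depth
print(spherical_variogram(0.5, nugget=0.0, sill=3.0, range_=1.0))       # along time
```

The very different ranges (6000 m, 1.7 m, 1 day) show why each axis needs its own variogram before the three dimensions are combined.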
50
Interpolation results
[Figure: longitudinal profiles on 7/12/2006 18:00 and 7/13/2006 18:00, drawn on x, y, z axes with a north arrow]
LEGEND: 37 to 40 psu, 40 to 42 psu, 42 to 43 psu, 42 to 44 psu, 44 to 46 psu
51
Longitudinal Profiles
52
Bottom salinities
53
Cross validation

A common method to evaluate variogram models,
aka the fictitious point method (Delhomme, 1978).
Remove one data point at a time from the data set and
then use the remaining n-1 points to estimate
the removed point.
The estimated and actual values are then compared
with each other.
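
The leave-one-out loop can be sketched as follows. To keep the example short, the estimator is inverse-distance weighting, a deliberately simpler stand-in for kriging; in practice the `predict` argument would be the kriging estimator built from the variogram model being evaluated. The function names and the data are hypothetical.

```python
import math


def idw_predict(xs, zs, x0, power=2.0):
    """Inverse-distance-weighted estimate at x0 (stand-in for kriging)."""
    weights = [1.0 / abs(x - x0) ** power for x in xs]
    return sum(w * z for w, z in zip(weights, zs)) / sum(weights)


def leave_one_out(xs, zs, predict):
    """Estimate each point from the remaining n-1 points; return the errors."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_z = zs[:i] + zs[i + 1:]
        errors.append(predict(train_x, train_z, xs[i]) - zs[i])
    return errors


def rmse(errors):
    """Root-mean-square error of the cross-validation residuals."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical data at distinct locations.
xs = [0.0, 1.0, 2.0, 3.0]
zs = [1.0, 2.0, 3.0, 4.0]
errors = leave_one_out(xs, zs, idw_predict)
print([round(e, 3) for e in errors], round(rmse(errors), 3))
```

Comparing the RMSE across candidate variogram models gives a simple basis for choosing among them.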

54
Conclusions