Statistics in WR: Session 20 - PowerPoint PPT Presentation

About This Presentation
Title:

Statistics in WR: Session 20

Description:

Statistics in WR: Session 20. Introduction to Spatial Statistics, by Ernest To. Results of the interpolation can be presented as a 3D volume in space and time.

Slides: 56
Provided by: ceUtexas9

Transcript and Presenter's Notes

Title: Statistics in WR: Session 20


1
Statistics in WR Session 20
  • Introduction to Spatial Statistics
  • Ernest To

2
Outline
  • Basics of spatial statistics
  • Kriging
  • Application of spatial-temporal statistics
    (Gravity currents in CCBay)

3
Basics
4
Consider the following scenario
  • Two river stations, A and B, measure dissolved
    oxygen (DO).
  • At station A
  • mean DO µA = 5 mg/L
  • std dev at Station A σA = 2 mg/L
  • At station B
  • mean DO µB = 5 mg/L
  • std dev at Station B σB = 2 mg/L
  • Correlation between measurements at stations A
    and B: ρAB = 0.5.

(Figure: stations A and B along the river)
5
New data!
  • We collected a DO measurement of 2 mg/L at
    Station A.
  • What is the updated mean (µB|XA) and standard
    deviation (σB|XA) at Station B?
  • (assume that the DO distributions are normal)

Station A: µA = 5 mg/L, σA = 2 mg/L, new sample xA = 2 mg/L
Station B: µB = 5 mg/L, σB = 2 mg/L, µB|XA = ?, σB|XA = ?
6
Let's sketch out the distributions
  • Distributions at A and B (assume normal)
  • Joint distribution at A and B

(Figure: marginal pdfs f(xA) and f(xB), with µA = 5 mg/L, σA = 2 mg/L and µB = 5 mg/L, σB = 2 mg/L, and the joint pdf f(xA,xB) over the XA-XB plane)
7
Marginal and joint distributions
8
How does ρAB affect the shape of the joint
distribution?
(Figure: scatter plots of XA vs XB and the joint distribution f(xA,xB) of XB and XA, for ρAB = 0.5, 0.99, 0 and -0.99)
9
Bayesian conditioning
  • PRIOR STAGE: Prior pdf (joint distribution of XA and XB).
  • CONDITIONALIZATION STAGE: Observed data (xA = 2 mg/L) is used to
    update the distribution.
  • POSTERIOR STAGE: A conditional pdf for XB is
    generated.
10
Conditional pdf
If the prior pdf is binormal, the conditional pdf of XB given XA = xA
is also normal, with
  • Mean: µB|XA = µB + ρAB (σB/σA)(xA - µA)
  • Variance: σ²B|XA = σB² (1 - ρAB²)
(The variance does not depend on the observed value of XA:
homoscedasticity.)
The expected value of the conditional pdf is a linear
function of the conditioning data.
11
Back to the problem
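  • Updated mean and std. dev at Station B
  • Mean: µB|XA = µB + ρAB (σB/σA)(xA - µA) = 5 + 0.5 (2/2)(2 - 5) = 3.5 mg/L
  • Std. dev: σB|XA = σB √(1 - ρAB²) = 2 √(1 - 0.5²) ≈ 1.7 mg/L

Station A: µA = 5 mg/L, σA = 2 mg/L, new sample xA = 2 mg/L
Station B: µB = 5 mg/L, σB = 2 mg/L, µB|XA = 3.5 mg/L, σB|XA ≈ 1.7 mg/L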
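The update on this slide can be checked numerically. The sketch below assumes the standard conditional mean and standard deviation of a bivariate normal distribution; the function name `conditional_normal` is made up for illustration and is not from the presentation.

```python
import math

def conditional_normal(mu_b, sigma_b, mu_a, sigma_a, rho_ab, x_a):
    """Mean and std dev of X_B given X_A = x_a for a bivariate normal prior."""
    mean = mu_b + rho_ab * (sigma_b / sigma_a) * (x_a - mu_a)
    std = sigma_b * math.sqrt(1.0 - rho_ab ** 2)
    return mean, std

# Values from the two-station scenario: both stations have mean 5 and
# std dev 2 mg/L, correlation 0.5, and the new sample at A is 2 mg/L.
mean_b, std_b = conditional_normal(mu_b=5.0, sigma_b=2.0,
                                   mu_a=5.0, sigma_a=2.0,
                                   rho_ab=0.5, x_a=2.0)
print(f"updated mean = {mean_b:.1f} mg/L, updated std dev = {std_b:.1f} mg/L")
```

The low reading at A pulls the estimate at B below its prior mean, and the updated standard deviation is smaller than the prior 2 mg/L no matter which value was observed.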
12
Can we do the same for any two points on the
river?
  • Yes, we can.
  • But only under the following conditions:
  • Normality
  • 2nd order stationarity
  • Mean does not change with location
  • Variance does not change with location
  • Know the mean and variance.
  • Have a function that determines the correlation
    between two locations

(Figure: two arbitrary points A and B on the river, each with µ = 5 mg/L and σ = 2 mg/L)
13
Modeling correlation
  • In spatial statistics, correlation is modeled as
    a function of the separation distance between two
    points: ρ = f(h),
  • where h = separation distance (aka lag).

Most of the time, correlation decreases with
distance. (Things that are closer together tend
to be more correlated with each other).
14
Estimating correlation model from data
  • Imagine the case where we have a smattering of
    data along an axis.
  • Any given pair of data points, i and j, will have
    two properties:
  • 1. The semivariance, γij = 0.5 (Zi - Zj)²
  • 2. The separation distance, hij

15
Estimating correlation model from data
  • We can plot the semivariance, γ, of all possible
    pairs against the lag, h. This gives us a
    variogram.
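A minimal sketch of building those (lag, semivariance) pairs for data along one axis. The positions and values are hypothetical, and the binning step is a common extra that the slide does not mention; it is included here only as an assumption about how the cloud is usually summarized before fitting a model.

```python
from itertools import combinations

def variogram_cloud(positions, values):
    """Return (lag, semivariance) for every pair of points along one axis."""
    cloud = []
    for (xi, zi), (xj, zj) in combinations(zip(positions, values), 2):
        lag = abs(xi - xj)
        semivariance = 0.5 * (zi - zj) ** 2
        cloud.append((lag, semivariance))
    return cloud

def binned_variogram(cloud, bin_width):
    """Average the semivariances of pairs whose lags fall in the same bin."""
    sums = {}
    for lag, gamma in cloud:
        k = int(lag // bin_width)
        total, count = sums.get(k, (0.0, 0))
        sums[k] = (total + gamma, count + 1)
    # Report each bin by its center lag and mean semivariance.
    return [((k + 0.5) * bin_width, total / count)
            for k, (total, count) in sorted(sums.items())]

# Hypothetical measurements along an axis (not from the presentation)
x = [0.0, 1.0, 3.0, 4.0]
z = [5.0, 4.0, 6.0, 3.0]
cloud = variogram_cloud(x, z)
binned = binned_variogram(cloud, bin_width=2.0)
for lag, gamma in binned:
    print(f"lag ~ {lag:.1f}: mean semivariance = {gamma:.2f}")
```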

16
Estimating correlation model from data
  • We can fit a curve through the semivariogram to
    model the semivariance as a function of the lag.
    This is the variogram model.

17
Estimating correlation model from data
(Figure: the fitted variogram model, with its sill and range marked)
18
Estimating correlation model from data
  • Assuming that mean and variance do not change
    with location (assumption of stationarity), the
    variogram model is related to the
    covariance model by the equation
    γ(h) = σ² - C(h),
    where σ² is the variance.
19
Estimating correlation model from data
  • Assuming that variance does not change with
    location (assumption of stationarity), the
    correlation model is related to the
    covariance model by the equation
    ρ(h) = C(h) / σ²

(Figure: plot of the correlation model ρ(h))
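The two conversions on these slides can be sketched as small helpers, assuming the usual second-order-stationary relations C(h) = σ² - γ(h) and ρ(h) = C(h) / σ². The linear-to-sill variogram used to exercise them is purely hypothetical.

```python
def covariance_from_variogram(gamma, variance):
    """C(h) = variance - gamma(h), valid under second-order stationarity."""
    return lambda h: variance - gamma(h)

def correlation_from_covariance(cov, variance):
    """rho(h) = C(h) / variance."""
    return lambda h: cov(h) / variance

SIGMA = 2.0            # std dev from the station example, mg/L
VARIANCE = SIGMA ** 2  # (mg/L)^2

def gamma(h):
    """Hypothetical variogram: rises linearly to the sill at h = 10."""
    return VARIANCE * min(h / 10.0, 1.0)

cov = covariance_from_variogram(gamma, VARIANCE)
rho = correlation_from_covariance(cov, VARIANCE)
for h in (0.0, 5.0, 10.0, 20.0):
    print(f"h = {h:4.1f}: gamma = {gamma(h):.1f}, C = {cov(h):.1f}, rho = {rho(h):.2f}")
```

At zero lag the covariance equals the variance and the correlation is 1; once the variogram reaches its sill the covariance and correlation are both 0.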
20
How does the correlation model affect the
estimation?
(Figure: panels for ρAB = 0, 0.5 and 0.99 showing scatter plots of XA vs XB, the joint distribution f(xA,xB) of XA and XB, and the conditional distribution of XB|XA; an arrow is labeled "Increasing h")
21
Kriging
22
Multivariable case
  • What if more than one location provides
    conditioning data?
  • (Assume distributions are STILL normal at all
    locations).
  • At stations A1, A2, A3, A4
  • µA1 = µA2 = µA3 = µA4 = 5 mg/L
  • σA1 = σA2 = σA3 = σA4 = 2 mg/L
  • At station B
  • mean DO µB = 5 mg/L
  • std dev at Station B σB = 2 mg/L
  • ρ = f(h) = 0.0125h² - 0.225h + 1

(Figure: stations A1, A2, A3, A4 and B along the river)
23
Modeling correlation
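ρ = f(h) = 0.0125h² - 0.225h + 1
(Figure: correlation model plotted against distance along river, in hundred meters; neighboring stations are 2 units apart)
From the correlation model:
  • ρA1B = 0.0, ρA2B = 0.1, ρA3B = 0.3, ρA4B = 0.6
  • ρA1A2 = 0.6, ρA1A3 = 0.3, ρA1A4 = 0.1, ρA2A3 = 0.6, ρA2A4 = 0.3, ρA3A4 = 0.6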
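The correlation values listed above can be reproduced from the quadratic model. The sketch assumes the stations sit 2 units (200 m) apart in the order A1, A2, A3, A4, B, with A1 placed at the origin; that placement is an assumption for illustration.

```python
from itertools import combinations

def correlation(h):
    """Correlation model from the slide; h is in hundreds of meters."""
    return 0.0125 * h ** 2 - 0.225 * h + 1.0

# Assumed positions along the river (hundreds of meters), 2 units apart
POSITIONS = {"A1": 0.0, "A2": 2.0, "A3": 4.0, "A4": 6.0, "B": 8.0}

def pairwise_correlations(positions):
    """Correlation for every pair of stations, keyed by (name_1, name_2)."""
    return {
        (p, q): correlation(abs(positions[p] - positions[q]))
        for p, q in combinations(positions, 2)
    }

rho = pairwise_correlations(POSITIONS)
for (p, q), value in rho.items():
    print(f"rho_{p}{q} = {abs(value):.1f}")
```

The quadratic is only meaningful over the lags used here: it reaches its minimum at h = 9 and rises again beyond that, so it should not be extrapolated to longer separations.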
24
Dealing with multiple variables
  • Divide locations into two groups
  • The vector XA = (XA1, XA2, XA3, XA4), representing the set of random
    variables at the locations contributing the
    conditioning data.
  • The variable XB, representing the random
    variable at the point of estimation.

(Figure: stations A1, A2, A3, A4 and B along the river)
25
Concept
1. If the individual distributions are normal, the joint
pdf is taken to be multi-normal (the prior pdf).
2. Group the variables into two sets: one for the points
with data, one for the point of estimation.
3. Intersect the prior pdf with the conditioning data to get
the conditional pdf.
26
Dealing with multiple variables
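  • The updated mean and variance of the distribution
    at Station B are given by
  • Mean: µB|A = µB + CBA CAA⁻¹ (xA - µA)
  • Variance: σ²B|A = σB² - CBA CAA⁻¹ CAB
  • Where CAA is the covariance matrix among the data locations,
    CAB is the vector of covariances between the data locations
    and B (CBA is its transpose), xA is the vector of observed
    values and µA is the vector of means at the data locations.

(Figure: stations A1, A2, A3, A4 and B along the river)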
27
Equations in the multivariable case are more
general
  • Recall the two-variable case: µB|XA = µB + ρAB (σB/σA)(xA - µA), σ²B|XA = σB² (1 - ρAB²)
  • The multivariable case takes into account
  • correlation between the data locations and the estimated
    location (the covariance vector CAB),
  • correlation among the data locations (the covariance matrix CAA).
  • This is the most fundamental form of kriging,
    i.e. Simple Kriging.
28
Plug and Chug
  • Recall that Cov(A,B) = ρAB σA σB
  • Compute data-to-data correlation

29
Plug and Chug
  • Compute data to estimation point correlation

30
Plug and Chug
(Figure: kriging equation with the weights term marked)
Note: The weights attributed to each station are
determined by the prior (joint distribution)
among them.
31
Plug and Chug
Weights: λ1, λ2, λ3, ..., λn
32
Plug and Chug
33
Plug and Chug
34
Results from Simple Kriging
  • The updated mean and standard deviation of the
    distribution at Station B are
  • Mean
  • Standard deviation

(Figure: stations A1, A2, A3, A4 and B along the river)
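The plug-and-chug steps can be sketched in plain Python. This assumes the standard simple kriging equations (weights λ = CAA⁻¹ CAB, mean µB + Σ λi (xi - µ), variance σB² - Σ λi CAB,i) and uses the correlations and the 5 mg/L mean and 2 mg/L std dev from the earlier slides. The observed DO values at A1 to A4 did not survive in this transcript, so the `observations` list is hypothetical, and `solve` is a small helper written for this example.

```python
import math

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x

MU, SIGMA = 5.0, 2.0  # mg/L, same at every station
RHO_DATA = [          # correlations among A1..A4
    [1.0, 0.6, 0.3, 0.1],
    [0.6, 1.0, 0.6, 0.3],
    [0.3, 0.6, 1.0, 0.6],
    [0.1, 0.3, 0.6, 1.0],
]
RHO_TARGET = [0.0, 0.1, 0.3, 0.6]  # correlations of A1..A4 with B

def simple_kriging(observations, mu=MU, sigma=SIGMA):
    """Return (weights, updated mean, updated std dev) at station B."""
    cov_data = [[r * sigma ** 2 for r in row] for row in RHO_DATA]
    cov_target = [r * sigma ** 2 for r in RHO_TARGET]
    weights = solve(cov_data, cov_target)
    mean = mu + sum(w * (x - mu) for w, x in zip(weights, observations))
    variance = sigma ** 2 - sum(w * c for w, c in zip(weights, cov_target))
    return weights, mean, math.sqrt(variance)

observations = [3.0, 4.0, 4.5, 6.0]  # hypothetical DO readings, mg/L
weights, mean_b, std_b = simple_kriging(observations)
print("weights:", [round(w, 4) for w in weights])
print(f"updated mean = {mean_b:.2f} mg/L, updated std dev = {std_b:.2f} mg/L")
```

In this sketch the nearest station, A4, receives by far the largest weight. The updated standard deviation comes out near 1.59 mg/L whatever the observations are, because the simple kriging variance depends only on the covariances.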
35
Other forms of kriging
  • Ordinary kriging (OK)
  • Does not require mean to be known
  • Assumes that mean is constant and is somewhere in
    the range of the conditioning data
  • Universal kriging (UK)
  • Does not require mean to be known nor require it
    to be constant
  • User specifies a model for the trend in mean. UK
    will then fit the model to the data.
  • Indicator kriging (IK)
  • handles binary variables (0 or 1)
  • has ability to take care of non-normality in data
    through iterative application.
  • Co-kriging (CK)
  • takes into account a related secondary variable
    to help estimate the primary variable.

36
Extension to 2D, 3D
  • The lag can be represented by the Euclidean
    distance between 2 points.
  • So the covariance model of the form C = f(h)
    can still be used.
  • Variables may be more correlated in one direction
    than the other (anisotropy).
  • A linear transformation can be performed to
    transform the distances so the correlation
    distance is the same in all directions (isotropy).
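One simple version of that linear transformation is to rescale each axis by its own correlation range before taking the Euclidean distance. The sketch assumes the directions of strongest and weakest correlation line up with the coordinate axes (no rotation), which is a simplification; the ranges are hypothetical.

```python
import math

def isotropic_lag(p, q, range_x, range_y):
    """Lag between 2-D points after dividing each axis by its correlation range.

    The result is dimensionless: a value of 1 means "one range apart"
    in whichever direction the two points are separated.
    """
    dx = (p[0] - q[0]) / range_x
    dy = (p[1] - q[1]) / range_y
    return math.hypot(dx, dy)

# Hypothetical ranges: correlation persists 10 times farther along x than y
RANGE_X, RANGE_Y = 100.0, 10.0
along_x = isotropic_lag((0.0, 0.0), (100.0, 0.0), RANGE_X, RANGE_Y)
along_y = isotropic_lag((0.0, 0.0), (0.0, 10.0), RANGE_X, RANGE_Y)
print(along_x, along_y)
```

After rescaling, 100 m along x and 10 m along y count as the same lag, so a single isotropic model C = f(h) can be applied to the transformed distances.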

37
Extension to space-time
  • For space and time, there is no standard
    space-time metric.
  • Folding the time lag into a single Euclidean
    distance together with the spatial lags
    is not always correct because the temporal and
    spatial axes are not always orthogonal to each
    other.
  • Processes that happen in time usually have some
    dependency on processes that happen in space.
  • (They are not independent).
  • A separate temporal lag term is usually used.
  • The covariance function takes on the form
    C = f(h, τ), with h the spatial lag and τ the temporal lag.

38
Application (Gravity currents in Corpus Christi
Bay)
39
Sensors in Corpus Christi Bay
(Figure: aerial photo from Google Earth of Corpus Christi Bay, Oso Bay, Laguna Madre and the Gulf of Mexico, showing TCOON, TCEQ, HRI and SERF stations and USGS gages)
40
(No Transcript)
41
(No Transcript)
42
Selecting a study area
(Figure: bathymetry map of Oso Bay, West Laguna Madre and East Laguna Madre, with depressions, ridges and a channel marked; legend from -5.0 m to -1.0 m above Mean High Water Level)
43
Downstream of East Laguna Madre
  • Water quality data
  • July 12 and 18, 2006.
  • (At birth and demise of gravity current)
  • Paul Montagna
  • Texas A&M University, Corpus Christi
  • Plume tracking survey
  • July 14 to 17, 2006.
  • (While gravity current was on the move)
  • Ben Hodges
  • University of Texas at Austin

44
Synthesis of data
45
Data Preparation
1. Salinity data from HRI are acquired using
HydroGet (a GIS web service client) and combined
with plume tracking data.
2. Data locations are projected onto a reference
line following the general direction of flow.
  • Space-time kriging is performed in 3 dimensions
  • X = Longitudinal measure
  • (meters from origin point)
  • Y = Time
  • (days since 7/12/2006)
  • Z = Elevation
  • (meters from water surface)

(Figure: reference line with its origin at x = 0 m)
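A rough sketch of step 2 and the three kriging coordinates. It assumes a straight reference line given by an origin and a direction vector (the line on the slide may well be curved), midnight on 7/12/2006 as the time origin, and elevation counted as negative below the water surface. All function names and the sample point are hypothetical.

```python
import math
from datetime import datetime

T0 = datetime(2006, 7, 12)  # assumed time origin: midnight on 7/12/2006

def longitudinal_measure(point, origin, direction):
    """Meters from the origin to the point's orthogonal projection on the line."""
    ux, uy = direction
    norm = math.hypot(ux, uy)
    return ((point[0] - origin[0]) * ux + (point[1] - origin[1]) * uy) / norm

def days_since_start(timestamp):
    """Fractional days elapsed since T0."""
    return (timestamp - T0).total_seconds() / 86400.0

def kriging_coordinates(point, timestamp, depth, origin, direction):
    """Return (X, Y, Z) = (longitudinal measure, time in days, elevation)."""
    return (longitudinal_measure(point, origin, direction),
            days_since_start(timestamp),
            -depth)  # assumed sign convention: negative below the surface

# Hypothetical sample: 300 m east and 400 m north of the origin, 1.5 m deep
coords = kriging_coordinates((300.0, 400.0), datetime(2006, 7, 13, 18, 0),
                             1.5, origin=(0.0, 0.0), direction=(3.0, 4.0))
print(coords)
```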
46
Variogram along direction of flow
(Gaussian variogram model) where
  • h = lag distance along direction of flow
  • C0 = nugget = 2 psu²
  • C1 = sill = 3.6 psu²
  • a = range = 6000 m
47
Variogram along direction of flow
(Figure: the same variogram along direction of flow, with the sill, nugget and range marked)
48
Variogram along depth
(Gaussian variogram model) where
  • h = lag distance along depth
  • C0 = nugget = 0 psu²
  • C1 = sill = 3.6 psu²
  • a = range = 1.7 m
49
Variogram along time axis
(Spherical variogram model) where
  • h = lag along the time axis
  • C0 = nugget = 0 psu²
  • C1 = sill = 3 psu²
  • a = range = 1 day
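The variogram equations themselves did not survive in this transcript, so the sketch below uses common textbook forms as an assumption: a Gaussian model written so that it reaches about 95% of its plateau at h = a, and the standard spherical model. It also assumes that C1 is the contribution added on top of the nugget C0, which the slides do not state explicitly.

```python
import math

def gaussian_variogram(h, nugget, sill, a):
    """Assumed form: C0 + C1 * (1 - exp(-3 h^2 / a^2)), with gamma(0) = 0."""
    if h == 0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-3.0 * (h / a) ** 2))

def spherical_variogram(h, nugget, sill, a):
    """Standard spherical form: C0 + C1 * (1.5 h/a - 0.5 (h/a)^3) up to h = a."""
    if h == 0:
        return 0.0
    if h >= a:
        return nugget + sill
    r = h / a
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

# Parameters as listed on the slides: (model, C0, C1, a)
MODELS = {
    "along flow (m)": (gaussian_variogram, 2.0, 3.6, 6000.0),
    "depth (m)": (gaussian_variogram, 0.0, 3.6, 1.7),
    "time (days)": (spherical_variogram, 0.0, 3.0, 1.0),
}

for name, (model, c0, c1, a) in MODELS.items():
    half, full = model(0.5 * a, c0, c1, a), model(a, c0, c1, a)
    print(f"{name}: gamma(a/2) = {half:.2f} psu^2, gamma(a) = {full:.2f} psu^2")
```

Other conventions for the Gaussian model drop the factor of 3, in which case the same a describes a slower rise, so the parameters should be read against whichever form the fitting software uses.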
50
Interpolation results
(Figure: interpolated salinity shown as longitudinal profiles on 7/12/2006 18:00 and 7/13/2006 18:00; axes x, y, z with a north arrow; legend classes from 37 to 46 psu)
51
Longitudinal Profiles
52
Bottom salinities
53
Cross validation
  • A common method to evaluate variogram models.
  • Also known as the fictitious point method (Delhomme, 1978).
  • Remove one data point at a time from the data set and
    then use the remaining n-1 points to estimate
    the removed point.
  • The estimated and actual values are then compared
    with each other.
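The leave-one-out procedure can be sketched in one dimension with simple kriging as the estimator. The data, the known mean of 5.0 and the two exponential correlation models being compared are all hypothetical, and `solve` is a small helper written for this example rather than a library routine.

```python
import math

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x

def simple_kriging_1d(xs, zs, x0, mean, correlation):
    """Simple kriging estimate at x0 from data (xs, zs) with a known mean."""
    r_data = [[correlation(abs(xi - xj)) for xj in xs] for xi in xs]
    r_target = [correlation(abs(xi - x0)) for xi in xs]
    weights = solve(r_data, r_target)
    return mean + sum(w * (z - mean) for w, z in zip(weights, zs))

def leave_one_out(xs, zs, mean, correlation):
    """Re-estimate each point from the other n-1; return (actual, estimate) pairs."""
    results = []
    for i in range(len(xs)):
        xs_rest = xs[:i] + xs[i + 1:]
        zs_rest = zs[:i] + zs[i + 1:]
        estimate = simple_kriging_1d(xs_rest, zs_rest, xs[i], mean, correlation)
        results.append((zs[i], estimate))
    return results

def rmse(results):
    """Root-mean-square difference between actual and estimated values."""
    return math.sqrt(sum((a - e) ** 2 for a, e in results) / len(results))

def short_range(h):
    return math.exp(-3.0 * h / 2.0)  # hypothetical model, range 2

def long_range(h):
    return math.exp(-3.0 * h / 8.0)  # hypothetical model, range 8

xs = [0.0, 1.0, 2.5, 4.0, 6.0]  # hypothetical locations
zs = [4.2, 4.8, 5.5, 5.1, 4.0]  # hypothetical measurements
for name, model in (("short range", short_range), ("long range", long_range)):
    print(name, "RMSE =", round(rmse(leave_one_out(xs, zs, 5.0, model)), 3))
```

Comparing the error summary across candidate models is one way to choose between them; a lower RMSE here only ranks the candidates on this data set and says nothing about models that were not tried.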

54
Conclusions
  • We've covered
  • Basics of spatial statistics
  • Kriging
  • Application of spatial-temporal statistics
    (Gravity currents in CCBay)
  • Spatial statistics is fun!

55
Geostatistical tools
  • ArcGIS Geostatistical Analyst
  • Easiest to use
  • GSLIB
  • Library of Fortran programs
  • De Cesare's version of GSLIB
  • Modification of GSLIB to do space-time kriging
  • BMELIB
  • Library of MATLAB programs