Title: EXPLORING SPATIOTEMPORAL VARIABILITY BY EIGENDECOMPOSITION TECHNIQUES
Luigi Ippoliti, Lara Fontanella
Department of Quantitative Methods and Economic Theory
University G. d'Annunzio of Chieti-Pescara
Slide 2: Outline of the talk
- 1. Identification of oscillation spatial patterns by means of multivariate techniques
  - principal component analysis (PCA)
  - canonical correlation analysis (CCA)
  - partial least squares (PLS)
  - redundancy analysis (RA)
- 2. A unified approach in the framework of the Generalized Eigenvalue Decomposition
- 3. Dynamic models for prediction purposes
- 4. Conclusions
Slide 3: Expansion of a spatial process
- Spatio-temporal process X(s,t), s ∈ D ⊂ ℝ², t ∈ ℤ.
- For a given time t = t₀, assume that X(s,t₀) is a zero-mean, second-order spatial stochastic process with covariance function Q(s,s').
- X(s,t₀) can be expanded in a set of deterministic functions φₖ, k ∈ ℕ, which form a complete orthonormal system in the space L²(D) of square-integrable functions on the domain D:
  X(s,t) = Σₖ aₖ(t) φₖ(s),    (1)
where
  aₖ(t) = ∫_D X(s,t) φₖ(s) ds    (2)
are the time-dependent expansion coefficients. This result is generally related to the probabilistic corollary of Mercer's theorem known as the Karhunen-Loève expansion.
In this talk, we consider an extension of the preceding expansion and discuss the case in which two kernels, Q₁(s,s') and Q₂(s,s'), are defined.
Slide 4: Estimating spatial patterns using simultaneous diagonalization
- Let us assume that:
  - Q₁(s,s') and Q₂(s,s') are real, symmetric, square-integrable functions
  - Q₁ and Q₂ are the integral operators with kernels Q₁(s,s') and Q₂(s,s')
  - Q₁ is positive definite
  - Q₂ is nonnegative definite
- The operator Q is densely defined, bounded, and its extension to the whole of L²(D) has eigenfunctions which span L²(D).
- If λₖ and νₖ, k ∈ ℕ, are the eigenvalues and the orthonormalized eigenfunctions of Q, we have the following simultaneous diagonalization of the two kernels (Kadota 1976):
  ∫_D ∫_D νⱼ(s) Q₁(s,s') νₖ(s') ds ds' = λₖ δⱼₖ,
  ∫_D ∫_D νⱼ(s) Q₂(s,s') νₖ(s') ds ds' = δⱼₖ,    (3)
where δⱼₖ denotes the Kronecker delta.
Slide 5: Finite approximation
- Given the observed space-time field x(sᵢ,t), i = 1,…,n, t = 1,…,T, a finite approximation of equations (1), (2) and (3) is required.
- At each time t, the observed spatial series x(t) is expanded in terms of a set of n column vectors called patterns:
  x(t) = W a(t),    (4)
where W is the matrix of the patterns and a(t) is the vector of the sample expansion coefficients, obtained as a weighted linear combination of the data:
  a(t) = W⁻¹ x(t).    (5)
The simultaneous diagonalization of the two kernels can be approximated as
  Q₁ W = Q₂ W Λ,    (6)
where Q₁ and Q₂ are the matrices of the kernel functions evaluated at the spatial sampling points. Equation (6) constitutes a Generalized Eigenvalue Decomposition (GED).
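As a minimal numerical sketch of equation (6), the symmetric generalized eigenvalue problem Q₁W = Q₂WΛ can be solved with SciPy's `eigh`; the two kernels below are random positive definite matrices, used purely for illustration (the names `Q1`, `Q2`, `W`, `lam` mirror the slide's notation):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n = 15                                  # e.g. 15 monitoring stations

# Illustrative symmetric positive definite kernels (not real data)
A = rng.standard_normal((n, n))
Q1 = A @ A.T + n * np.eye(n)
B = rng.standard_normal((n, n))
Q2 = B @ B.T + n * np.eye(n)

# eigh solves the symmetric-definite problem Q1 w = lambda Q2 w,
# normalizing the eigenvectors so that W' Q2 W = I
lam, W = eigh(Q1, Q2)

# W simultaneously diagonalizes the two kernels:
# W' Q1 W = diag(lam) and W' Q2 W = I
assert np.allclose(W.T @ Q1 @ W, np.diag(lam), atol=1e-8)
assert np.allclose(W.T @ Q2 @ W, np.eye(n), atol=1e-8)
```

This is the discrete counterpart of the simultaneous diagonalization of slide 4: the columns of W play the role of the (non-orthogonal, Q₂-orthonormal) sample patterns.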
Slide 6: Principal Component Analysis
- Criterion for the selection of the spatial patterns: preserve as much variance as possible for a given dimensionality of the model.
- PCA projects the data onto the subspace of maximum data variation.
- The spectral decomposition of the symmetric covariance matrix C is a special case of the generalized eigenvalue decomposition (6) where
  Q₁ = C,  Q₂ = I.
In this case, the k-th column of matrix W represents the k-th empirical orthogonal function (EOF), which is associated with the corresponding expansion coefficient aₖ, known as the principal component. Sorting the EOFs by the magnitude of their eigenvalues, the process can be reconstructed via equation (4) in a reduced space once a truncation level K of the expansion is chosen. PCA gives a data-dependent set of basis vectors that is optimal in the statistical mean-square sense.
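A sketch of PCA as the GED special case Q₁ = C, Q₂ = I, with a truncated reconstruction. The field is synthetic random data; only the dimensions T = 348 and n = 15 echo the application, not the Lombardy series:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
T, n = 348, 15                    # T times, n stations (illustrative)
X = rng.standard_normal((T, n))
X = X - X.mean(axis=0)            # zero-mean field

C = X.T @ X / (T - 1)             # sample spatial covariance matrix

# PCA = GED with Q1 = C and Q2 = I
lam, W = eigh(C, np.eye(n))
order = np.argsort(lam)[::-1]     # sort EOFs by decreasing eigenvalue
lam, W = lam[order], W[:, order]

A = X @ W                         # expansion coefficients (principal components)
K = 3
X_hat = A[:, :K] @ W[:, :K].T     # reduced-space reconstruction with K EOFs
```

With all n EOFs retained the reconstruction is exact, since the EOFs form an orthonormal basis.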
Slide 7: Partial Least Squares
- Partial least squares identifies pairs of spatial patterns and time coefficients which account for a fraction of the covariance between two processes analyzed jointly.
- PLS projects the data onto the subspace of maximum data covariation.
- Given two data matrices, X and Y, of zero-mean spatio-temporal series x(sᵢ,t) and y(sᵢ,t), with between-sets spatial covariance matrix Cxy, the goal is to find the two directions of maximal data covariation, i.e. the directions wxₖ and wyₖ such that the linear combinations axₖ = Xwxₖ and ayₖ = Ywyₖ have maximum covariance.
- The spatial patterns Wx and Wy can be obtained by the generalized eigenvalue decomposition (6) where
  Q₁ = [ 0    Cxy
         Cyx  0  ],   Q₂ = I.
The eigenvalue λₖ represents the covariance between axₖ and ayₖ.
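A common way to cast PLS as a GED is to take Q₁ as the block matrix [0, Cxy; Cyx, 0] and Q₂ = I; a sketch under that assumption, on synthetic data, checks the link with the singular values of Cxy:

```python
import numpy as np
from scipy.linalg import eigh, svd

rng = np.random.default_rng(2)
T, n = 348, 15
X = rng.standard_normal((T, n)); X -= X.mean(axis=0)
Y = rng.standard_normal((T, n)); Y -= Y.mean(axis=0)
Cxy = X.T @ Y / (T - 1)           # between-sets spatial covariance

# PLS as GED: Q1 = [[0, Cxy], [Cyx, 0]], Q2 = I (implicit)
Q1 = np.block([[np.zeros((n, n)), Cxy],
               [Cxy.T, np.zeros((n, n))]])
lam, W = eigh(Q1)
order = np.argsort(lam)[::-1]
lam, W = lam[order], W[:, order]

wx, wy = W[:n, 0], W[n:, 0]       # leading pair of spatial patterns
# The positive eigenvalues equal the singular values of Cxy, i.e. the
# covariances captured by each pattern pair (up to the unit-norm scaling
# of the stacked eigenvector, which splits its mass evenly over wx, wy).
```

Equivalently, the leading PLS pattern pair is the leading left/right singular-vector pair of Cxy.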
Slide 8: Canonical Correlation Analysis
- Canonical correlation analysis identifies pairs of spatial patterns and time coefficients which account for a fraction of the correlation between two processes analyzed jointly.
- CCA projects the data onto the subspace of maximum data correlation.
- Given two data matrices, X and Y, of zero-mean spatio-temporal series x(sᵢ,t) and y(sᵢ,t), with between-sets spatial covariance matrix Cxy and spatial covariance matrices Cxx and Cyy, the goal is to find the two directions of maximal data correlation, i.e. the directions wxₖ and wyₖ such that the linear combinations axₖ = Xwxₖ and ayₖ = Ywyₖ have maximum correlation.
- The spatial patterns Wx and Wy can be obtained by the generalized eigenvalue decomposition (6) where
  Q₁ = [ 0    Cxy
         Cyx  0  ],   Q₂ = [ Cxx  0
                             0    Cyy ].
The eigenvalue λₖ represents the correlation coefficient between axₖ and ayₖ.
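Assuming the standard GED formulation of CCA, with Q₁ = [0, Cxy; Cyx, 0] and Q₂ = diag(Cxx, Cyy), a sketch on synthetic data verifies that the leading eigenvalue equals the correlation between the leading canonical variates:

```python
import numpy as np
from scipy.linalg import eigh, block_diag

rng = np.random.default_rng(3)
T, n = 348, 15
X = rng.standard_normal((T, n)); X -= X.mean(axis=0)
Y = rng.standard_normal((T, n)); Y -= Y.mean(axis=0)
Cxx = X.T @ X / (T - 1)
Cyy = Y.T @ Y / (T - 1)
Cxy = X.T @ Y / (T - 1)

# CCA as GED: Q1 = [[0, Cxy], [Cyx, 0]], Q2 = diag(Cxx, Cyy)
Q1 = np.block([[np.zeros((n, n)), Cxy],
               [Cxy.T, np.zeros((n, n))]])
Q2 = block_diag(Cxx, Cyy)
lam, W = eigh(Q1, Q2)
order = np.argsort(lam)[::-1]
lam, W = lam[order], W[:, order]

wx, wy = W[:n, 0], W[n:, 0]
ax, ay = X @ wx, Y @ wy            # leading pair of canonical variates
rho = np.corrcoef(ax, ay)[0, 1]    # equals lam[0], the canonical correlation
```

Here T = 348 > 2n guarantees that the sample Cxx and Cyy are positive definite, as `eigh` requires for Q₂.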
Slide 9: Redundancy Analysis
- Given two processes, if the aim is to predict one process as well as possible in the least-square-error sense, the spatial patterns must be chosen so that this error measure is minimized or, equivalently, the redundancy index is maximized.
- RA projects the data onto the subspace of minimum prediction error.
- Given two data matrices, X and Y, of zero-mean spatio-temporal series x(sᵢ,t) and y(sᵢ,t), with between-sets spatial covariance matrix Cxy and spatial covariance matrix Cxx of X, the spatial patterns Wx and Wy can be obtained by the generalized eigenvalue decomposition (6) where
  Q₁ = [ 0    Cxy
         Cyx  0  ],   Q₂ = [ Cxx  0
                             0    I  ].
The eigenvalue λₖ represents the regression coefficient of ayₖ on axₖ.
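Assuming the standard GED formulation of RA, with Q₁ = [0, Cxy; Cyx, 0] and Q₂ = diag(Cxx, I), a sketch on synthetic data checks that the leading eigenvalue is the regression coefficient of ay₁ on ax₁:

```python
import numpy as np
from scipy.linalg import eigh, block_diag

rng = np.random.default_rng(4)
T, n = 348, 15
X = rng.standard_normal((T, n)); X -= X.mean(axis=0)
Y = rng.standard_normal((T, n)); Y -= Y.mean(axis=0)
Cxx = X.T @ X / (T - 1)
Cxy = X.T @ Y / (T - 1)

# RA as GED: Q1 = [[0, Cxy], [Cyx, 0]], Q2 = diag(Cxx, I)
Q1 = np.block([[np.zeros((n, n)), Cxy],
               [Cxy.T, np.zeros((n, n))]])
Q2 = block_diag(Cxx, np.eye(n))
lam, W = eigh(Q1, Q2)
order = np.argsort(lam)[::-1]
lam, W = lam[order], W[:, order]

wx, wy = W[:n, 0], W[n:, 0]
ax, ay = X @ wx, Y @ wy
beta = (ax @ ay) / (ax @ ax)       # OLS slope of ay on ax; equals lam[0]
```

Whitening only the predictor side (Cxx) and leaving the Y side unwhitened is precisely what distinguishes RA from CCA in this unified framework.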
Slide 10: State-Space Models for Temporal Predictions
- Several prediction models can be represented in a unified formulation using the state-space representation:
  State equation:        v(t) = F v(t-1) + u(t)
  Measurement equation:  x(t) = H v(t) + e(t)
- where v(t) is the state vector, F is the transition matrix, H is the measurement matrix, and u(t) and e(t) are zero-mean Gaussian error terms.
- State-Space Model for a Single Field
- One possibility is to assume the measurement matrix H known and equal to Wx(K). In this case, the state equation models the temporal evolution of the expansion coefficients ax(K)(t) (Mardia et al. 1998; Wikle and Cressie 1999).
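A compact Kalman-filter sketch for the state-space pair above. Here `H` stands in for the pattern matrix Wx(K), while `F` and the noise covariances are illustrative placeholders (a simple AR(1)-type transition), not values fitted by the authors:

```python
import numpy as np

def kalman_filter(x, F, H, Qu, Re, v0, P0):
    """Kalman filter for v(t) = F v(t-1) + u(t), x(t) = H v(t) + e(t)."""
    v, P = v0, P0
    fitted = []
    for xt in x:
        # predict step
        v = F @ v
        P = F @ P @ F.T + Qu
        # update step
        S = H @ P @ H.T + Re                 # innovation covariance
        Kg = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        v = v + Kg @ (xt - H @ v)
        P = (np.eye(len(v)) - Kg @ H) @ P
        fitted.append(H @ v)                 # filtered field value
    return np.array(fitted)

# toy run: K = 2 expansion coefficients, n = 15 stations
rng = np.random.default_rng(5)
K, n, T = 2, 15, 50
H = rng.standard_normal((n, K))      # stand-in for the estimated Wx(K)
F = 0.9 * np.eye(K)                  # AR(1)-type coefficient dynamics
x = rng.standard_normal((T, n))
fitted = kalman_filter(x, F, H, 0.1 * np.eye(K), np.eye(n),
                       np.zeros(K), np.eye(K))
```

The point of the dimension reduction is visible in the shapes: the state has only K = 2 components, even though each observation vector has n = 15.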
Slide 11: State-Space Model for Coupled Fields
- Symmetric case: both variables are equally important, so that neither can be considered as a predictor for the other.
- The state and measurement equations could be defined by stacking the two fields:
  State equation:        v(t) = F v(t-1) + u(t),  with v(t) = [ax(K)(t); ay(K)(t)]
  Measurement equation:  [x(t); y(t)] = diag(Wx(K), Wy(K)) v(t) + e(t)
Slide 12: State-Space Model for Coupled Fields
- Asymmetric case: it is explicitly recognized that one variable can be fruitfully used to predict the other. In this case, the state-space model can be derived from the following regression model where, assuming Y is the process to be predicted, we have
  ay(K)(t) = B ax(K)(t) + ε(t).
Both for RA and CCA, B is a diagonal matrix. Thus, in this case, since the expansion coefficients are zero-mean random variables, the state and measurement equations could be defined as follows:
  State equation:        ax(K)(t) = F ax(K)(t-1) + u(t)
  Measurement equation:  y(t) = Wy(K) B ax(K)(t) + e(t)
Slide 13: Conclusions
- Oscillation patterns can be useful in exploring the spatial variability of the fields.
- Parsimonious dynamic models can be defined by using only a few spatial patterns.
- State-space models can be used to model the temporal evolution of the spatial patterns. An alternative to state-space models could be parsimonious VAR models in the expansion-coefficient domain.
- Different approaches can be used to estimate spatial covariance and cross-covariance structures (e.g. MOM, geostatistical approaches).
- PLS, CCA and RA can also be used when only one variable is observed, to take into account the temporal dependence.
- Further work is needed…
Slide 14: Intercomparison of methods on space-time series of PM10 and NOx
- DATA: daily mean concentrations of PM10 and NOx observed at 15 monitoring stations of the Lombardy monitoring network (Italy). The data range from 1st January to 13th December 2004 and constitute 348 temporal observations.
Slide 15: Comparative measures of explained covariance and goodness of reconstruction
- The cumulative squared covariance fraction (CSCF_K) indicates how successful each method has been in explaining the observed covariance matrix using a reduced number K of spatial patterns to reconstruct the data matrices X and Y,
where the Frobenius norm of a matrix C is given by
  ||C||_F = ( Σᵢ Σⱼ cᵢⱼ² )^(1/2).
- The cumulative squared covariance fraction has been applied to:
  - the NOx-PM10 between-set covariance matrix Cxy
  - the NOx covariance matrix Cxx
  - the PM10 covariance matrix Cyy
- The root mean squared error (RMSE_K) indicates how successful each method has been in reconstructing the observed data matrix using a reduced number K of spatial patterns to reconstruct the data matrices X and Y.
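The two comparative measures can be sketched as follows, assuming the common definition of CSCF_K via singular values, CSCF_K = Σ_{k≤K} σₖ² / ||C||²_F (the slide's own formula is not shown in the text, so this exact form is an assumption):

```python
import numpy as np

def frobenius(C):
    """Frobenius norm: square root of the sum of squared entries."""
    return np.sqrt(np.sum(C ** 2))

def cscf(C, sing_vals, K):
    """Cumulative squared covariance fraction: share of ||C||_F^2
    captured by the K leading singular values of C."""
    return np.sum(sing_vals[:K] ** 2) / frobenius(C) ** 2

def rmse(X, X_hat):
    """Root mean squared error of a rank-K reconstruction X_hat of X."""
    return np.sqrt(np.mean((X - X_hat) ** 2))

# illustrative use on a random stand-in for a covariance matrix
rng = np.random.default_rng(6)
C = rng.standard_normal((15, 15))
s = np.linalg.svd(C, compute_uv=False)
print(cscf(C, s, 3))   # fraction captured by 3 pattern pairs
```

Since Σₖ σₖ² = ||C||²_F, the CSCF increases monotonically in K and reaches 1 at full rank.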
Slide 16: Cumulative squared covariance fraction (CSCF_K)
Slide 17: Root mean squared error (RMSE_K)
Slide 18: PCA for PM10: correlation map for PM10 and NOx with ay1; first spatial pattern for PM10, wy1
Slide 19: First spatial pattern for PM10, wy1
Slide 20: First spatial pattern for NOx, wx1
Slide 21: Second spatial pattern for PM10, wy2
Slide 22: Second spatial pattern for NOx, wx2