Title: Advanced Techniques in Data Assimilation
Slide 1: Advanced Techniques in Data Assimilation
Richard Ménard, Meteorological Service of Canada
Department of Atmospheric and Oceanic Sciences, McGill University
GCC Summer School, Banff, May 22-28, 2004
Slide 2: Outline
1. Comparing observations with models
2. How do we get the error statistics
3. Covariance modelling
4. Fitting covariance models with innovation data
5. Analysis
6. When and why do we need advanced data assimilation
7. Analogy between numerical modelling and data assimilation
8. Kalman filtering
9. Computational simplification for the Kalman filter
10. Estimation of Q
11. Estimation of model error bias
12. Some thoughts about distinguishing model error bias from observation error bias
13. Some remarks and future directions
Slide 3: Some useful references
- Daley, R., 1991: Atmospheric Data Analysis. Cambridge University Press. (Very good textbook, but somewhat outdated.)
- Daley, R., and E. Barker, 2000: NAVDAS Source Book 2000. NRL Publication NRL/PU/7530-00-418, Naval Research Laboratory, Monterey, CA, 151 pp. (Technical description of an operational 3D-Var scheme with MPI implementation.)
- Ghil, M., et al. (eds.), 1997: Special Issue of the Second International Symposium on Assimilation of Observations in Meteorology and Oceanography. J. Meteorol. Soc. Japan, vol. 75. (Contains a number of review papers.)
- Kasibhatla, P., et al. (eds.): Inverse Methods in Global Biogeochemical Cycles. (Comes with a CD-ROM with Matlab exercises.)
- Course notes of Saroja Polavarapu, University of Toronto
- Course notes of Ricardo Todling, Introduction to Data Assimilation
Slide 4: New books
- Kalnay, E., 2003: Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press.
- Bennett, A., 2002: Inverse Modeling of the Ocean and Atmosphere. Cambridge University Press. (Computer codes can be obtained from an FTP site.)
Slide 5: 1. Comparing observations with models
In trying to validate a model formulation using observations, you have two choices:
1. Compare climate model runs, for which the sensitivity to the initial condition has disappeared. Then we compare the observation climatology with the model climatology.
   Problem: Even if the first and second moments (in practice only the variance) of the model and observations agree, this does not mean that we have instantaneous agreement between observation and model.
2. Attack the problem of specifying the initial condition using data assimilation.
   Problem: Even if an assimilation run sticks close to the truth, how can we disentangle whether the differences between the forecast and the analysis are due to initial conditions or to modelling error?
Slide 6: 2. How do we get the error statistics
2.1 The innovation method
The innovation d = y^o - H x^f has covariance E[d d^T] = H P^f H^T + R.
- Observation errors are spatially uncorrelated.
- Forecast error is the spatially correlated part of the innovation covariance.
Slide 7: Observation and forecast error variances
[Figures: observation and forecast error variances estimated from innovations; Mitchell et al. (1990)]
Slide 8:
If covariances are homogeneous, variances are independent of space. Covariances are not homogeneous.
If correlations are homogeneous, correlation lengths are independent of location. Correlations are not homogeneous.
Slide 9: Correlations are not isotropic
[Figures: Gustafsson (1981); Daley (1991)]
Slide 10: Innovation method: advantages and disadvantages
Advantages:
- Estimates an observation error that includes representativeness error.
- Provides reasonably accurate error statistics at observation locations and for the observed variables.
Disadvantages:
- The error statistics are incomplete for constructing Pf, because information is lacking at all grid points and for all variables.
In practice: To complete Pf (on the model grid) we fit the forecast error information obtained from innovations to a forecast error correlation model. A good fitting procedure for this is the maximum likelihood method, which gives unbiased estimates of the covariance parameters. The drawback is that most correlation models are homogeneous and isotropic; inhomogeneous correlation models are hard to come by, and the few tricks that have been used to construct them do not provide significantly superior analyses.
Slide 11: 2.2 The lagged-forecast method (NMC method)
[Timeline: forecasts launched at -48 h and -24 h, both valid at 0 h]
- Compares 24-h and 48-h forecasts valid at the same time
- Provides global, multivariate correlations with full vertical and spectral resolution
- Not used for variances
- Assumes forecast differences approximate forecast error
Why 24 h and 48 h?
- Starting from a 24-h forecast avoids spin-up problems
- The 24-h period is short enough to claim similarity with the 0-6 h forecast error; the final difference is scaled by an empirical factor
- The 24-h period is long enough that the forecasts are dissimilar despite the lack of data to update the starting analysis
- 0-6 h forecast differences reflect assumptions made in the OI background error covariances
Slide 12: Properties (Bouttier 1994)
- For linear H, and taking the difference between analyses (0-hour forecasts) and 6-h forecasts, the lagged-forecast method measures the difference between the forecast and analysis error covariances.
- The NMC method breaks down if there are no data between the launches of the two forecasts: with no data the lagged-forecast variance is zero!
- For observations at every grid point, and if observation and background error variances are equal, the lagged-forecast method produces estimates that are half of the true value.
Slide 13:
- The lagged-forecast error correlation remains close to the forecast error correlation for any observation-to-forecast error variance ratio, as long as the observations are dense.
Advantages:
- Provides global error statistics on all variables for Pf.
- Estimates of forecast error can be given in spectral space.
Disadvantages:
- Provides no information on observation error statistics.
- Only the information on correlations is reliable, and only provided the observation network is dense.
Slide 14: In practice
We combine both methods. The innovation method is used to estimate the observation error variance and the forecast error variance for the observed variables. The lagged-forecast method is used to estimate the error correlations for all variables on the global domain, and the error variances for the unobserved variables. However, since the observation density can alter the estimated error correlation, the lagged-forecast data are used to fit a homogeneous isotropic correlation model, which can be compactly represented in spectral space.
Slide 15: 3. Covariance modelling
Positive definite matrix (Horn and Johnson 1985, Chap. 7)
A real n x n symmetric matrix A is positive definite if c^T A c > 0 for any nonzero vector c. A is said to be positive semi-definite if c^T A c >= 0.
Properties:
- The sum of any positive definite matrices of the same size is also positive definite.
- Each eigenvalue of a positive definite matrix is a positive real number.
- The trace and determinant are positive real numbers.
Covariance matrix
The covariance matrix P of a random vector X = (X1, X2, ..., Xn)^T is the matrix P = (Pij) in which Pij = E[(Xi - E[Xi])(Xj - E[Xj])], where E is the mathematical expectation.
Slide 16: Property: a covariance matrix is positive semi-definite
Remarks:
1. It is often necessary in data assimilation to invert the covariance matrices, and thus we need to have positive definite covariances.
2. The positive definite property is a global property of a matrix, and it is not trivial to obtain.
Slide 17: Examples
a) Truncated parabola: the correlation matrix obtained for n = 4 and its eigenvalues. [Equations and eigenvalues not transcribed]
Slide 18: b) Triangle: the correlation matrix obtained for n = 4 and its eigenvalues. [Equations and eigenvalues not transcribed]
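The eigenvalue test of slides 17-18 can be reproduced numerically. Below is a minimal sketch (the grid spacing and cutoff length L are assumptions, not values taken from the slides): each candidate function is evaluated on a 4-point grid and the eigenvalues of the resulting matrix are inspected to see whether it is positive semi-definite.

```python
import numpy as np

def build_matrix(corr_fn, n=4, L=2.0, spacing=1.0):
    """Evaluate a candidate correlation function on an n-point 1D grid."""
    x = spacing * np.arange(n)
    r = np.abs(x[:, None] - x[None, :])          # pairwise distances
    return corr_fn(r, L)

def truncated_parabola(r, L):
    return np.maximum(1.0 - (r / L) ** 2, 0.0)

def triangle(r, L):
    return np.maximum(1.0 - r / L, 0.0)

for name, fn in [("truncated parabola", truncated_parabola), ("triangle", triangle)]:
    P = build_matrix(fn)
    eigvals = np.linalg.eigvalsh(P)              # symmetric matrix: real eigenvalues
    print(f"{name}: eigenvalues = {np.round(eigvals, 3)}, "
          f"all non-negative: {bool(np.all(eigvals >= -1e-12))}")
```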
Slide 19: Covariance functions (Gaspari and Cohn 1999)
Definition 1: A function P(r,r') is a covariance function of a random field X if P(r,r') = E[(X(r) - E[X(r)])(X(r') - E[X(r')])].
Definition 2: A covariance function P(r,r') is a function that defines positive semi-definite matrices when evaluated on any grid. That is, letting ri and rj be any two grid points, the matrix P whose elements are Pij = P(ri,rj) defines a covariance matrix when P is a covariance function.
The equivalence between Definitions 1 and 2 can be found in Wahba (1990, p. 1-2). The covariance function is also known as the reproducing kernel.
Remark: Suppose a covariance function is defined in a 3D space. Restricting the value of r to remain on a manifold (e.g. the surface of a unit sphere) will also define a covariance function, and a covariance matrix (e.g. a covariance matrix on the surface of a sphere).
Slide 20: Correlation function
A correlation function C(r,r') is a covariance function P(r,r') normalized by the standard deviations at the points r and r': C(r,r') = P(r,r') / (sigma(r) sigma(r')).
Homogeneous and isotropic correlation function
If a correlation function is invariant under all translations and all orthogonal transformations, then the correlation function becomes a function only of the distance between the two points, C(r,r') = C(|r - r'|).
Smoothness properties
- The continuity at the origin determines the continuity allowed on the rest of the domain. For example, if the first derivative is discontinuous at the origin, then a first-derivative discontinuity is allowed elsewhere (see the example with the triangle function).
Spectral representation
- On a unit circle: C(theta) = sum_m a_m cos(m*theta), where theta is the angle between the two position vectors and all the Fourier coefficients a_m are nonnegative.
- On a unit sphere: C(theta) = sum_m b_m P_m(cos theta), where P_m are the Legendre polynomials and all the Legendre coefficients b_m are nonnegative.
Slide 21: Examples of correlation functions
1. Spatially uncorrelated model (black): C(r) = 1 for r = 0, C(r) = 0 otherwise.
2. First-order auto-regressive model (FOAR) (blue): C(r) = exp(-|r|/L).
3. Second-order auto-regressive model (SOAR) (cyan): C(r) = (1 + |r|/L) exp(-|r|/L).
4. Gaussian model (red): C(r) = exp(-r^2 / (2 L^2)),
where L is the correlation length scale.
Slide 22: Exercise: construct a correlation matrix on a one-dimensional periodic domain
Consider a 2D Gaussian model on a plane, C(r1, r2) = exp(-|r1 - r2|^2 / (2 L^2)).
Along the circle of radius a, the (chordal) distance between two points with angular separation theta is |r1 - r2| = 2 a sin(theta/2).
Now define a coordinate x = a*theta along the circle; then we get the periodic Gaussian correlation
C(x1, x2) = exp( -(2 a^2 / L^2) sin^2( (x1 - x2) / (2a) ) ).
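A short numerical check of this exercise may help. The sketch below (grid size, radius a and length scale L are illustrative choices) builds the periodic correlation matrix from the chordal distance and verifies that its smallest eigenvalue is non-negative.

```python
import numpy as np

def periodic_gaussian_correlation(n=64, a=1.0, L=0.3):
    """Correlation matrix on a 1D periodic domain: a 2D Gaussian correlation
    restricted to a circle of radius a, using the chordal distance 2*a*sin(theta/2)."""
    x = np.linspace(0.0, 2.0 * np.pi * a, n, endpoint=False)   # coordinate along the circle
    dx = x[:, None] - x[None, :]
    chord2 = (2.0 * a * np.sin(dx / (2.0 * a))) ** 2           # squared chordal distance
    return np.exp(-chord2 / (2.0 * L ** 2))

C = periodic_gaussian_correlation()
print("min eigenvalue:", np.linalg.eigvalsh(C).min())          # expected >= 0 up to round-off
```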
Slide 23: Ways to define new covariances by transformation
[Equations not transcribed]
Slide 25: Examples in Gaspari and Cohn (1999)
The Hadamard product of an arbitrary covariance function with a compactly supported function is very useful in ensemble Kalman filtering to create compactly supported covariances.
Slide 26:
When formulated this way, state-dependent errors can be introduced in estimation theory for data assimilation.
E.g., to create a relative error formulation for observation error, the relative error is scaled by the forecast interpolated to the observation location (not by the observed value!).
Slide 27: 4. Fitting covariance models with innovation data: application of the maximum likelihood method
The method of maximum likelihood applies when we estimate a deterministic variable (no error statistics) from noisy measurements.
The method works as follows. Suppose that the innovation (O-F) covariance depends on a number of parameters alpha, such as the error variances, the error correlation length scales, etc. Assuming that the actual innovations are uncorrelated in time and Gaussian distributed, the conditional probability of the innovations for a given value of the covariance parameters is
p(nu_1, ..., nu_n | alpha) = prod_k (2 pi)^(-p/2) |S_k(alpha)|^(-1/2) exp( -(1/2) nu_k^T S_k(alpha)^(-1) nu_k ),
where S_k(alpha) = H P^f(alpha) H^T + R(alpha) is the innovation covariance.
Slide 28: The value of alpha which maximizes the probability p(nu_n, ..., nu_1 | alpha) is the maximum likelihood estimate.
- In practice, we work with the negative of the log of p(nu_n, ..., nu_1 | alpha), called the log-likelihood function L(alpha), and search for its minimum.
- The curvature of the log-likelihood function is a measure of the estimation accuracy. A pronounced curvature indicates a reliable estimate, and vice versa.
Slide 29: Example: Doppler imager HRDI
Coincident measurements in two channels: B band and gamma band.
[Figure not transcribed]
Slide 30: 5. Analysis
Definition: Best estimate (e.g. minimum error variance, maximum a posteriori) of an atmospheric model state using observations at a given time.
Estimators
Best Linear Unbiased Estimator (BLUE):
- Assumes that the estimate is a linear combination of observations and model state.
- Assumes that the observation error is uncorrelated with the model state error.
- Doesn't assume any knowledge of the probability density functions (e.g. Gaussian).
Using the linear form x^a = x^f + K (y^o - H x^f), and minimizing the analysis error variance Tr(P^a), we get
K = P^f H^T (H P^f H^T + R)^(-1).
Slide 31: Bayesian estimators: MV and MAP
Analytical expressions for p(x|y) can be found for Gaussian probabilities, lognormal probabilities, alpha-stable distributions (generalized Gaussians), and Poisson distributions. Analytical expressions may not be obtained with a mixture of p.d.f.'s.
- MAP estimator: the value of x that maximizes p(x|y). This estimator is the one used in variational data assimilation schemes, such as 3D-Var and 4D-Var.
- MV estimator: minimizes Tr(Pa), i.e. the conditional mean E[x|y]. This estimator is the one used in Kalman filtering, OI, and ensemble Kalman filtering.
Slide 32: Properties of the analysis
1. Reduces the error variance.
2. Interpolates the observation information to the model grid.
3. Filters out the small-scale noise.
Example 1: One observation
Let Pf be the periodic Gaussian correlation model of the previous example.
[Figures: analysis error variance and correlation along the periodic domain]
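To make the single-observation example concrete, here is a minimal sketch (the forecast and observation error variances are illustrative assumptions) that uses the periodic Gaussian correlation of the earlier exercise as Pf, assimilates one observation, and shows how the analysis error variance is reduced near the observation while remaining close to the forecast value far away.

```python
import numpy as np

def periodic_gaussian_correlation(n=64, a=1.0, L=0.3):
    x = np.linspace(0.0, 2.0 * np.pi * a, n, endpoint=False)
    dx = x[:, None] - x[None, :]
    return np.exp(-(2.0 * a * np.sin(dx / (2.0 * a))) ** 2 / (2.0 * L ** 2))

n = 64
Pf = 1.0 * periodic_gaussian_correlation(n)            # forecast error covariance (unit variance)
H = np.zeros((1, n)); H[0, 0] = 1.0                    # one observation at grid point 0
R = np.array([[0.25]])                                 # observation error variance

K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)         # BLUE / Kalman gain
Pa = (np.eye(n) - K @ H) @ Pf                          # analysis error covariance

print("analysis error variance at the obs point:", round(Pa[0, 0], 3))
print("analysis error variance far from the obs:", round(Pa[n // 2, n // 2], 3))
```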
Slide 33:
[Figures: analysis error variance when varying the correlation length scale]
Two observations: the reduction of error at one point can help reduce the error further at neighbouring points. This is what happens in 3D-Var vs. 1D-Var or retrievals.
Slide 34: Analysis of positive quantities
Even if the forecast error correlation is everywhere positive, and the observed and forecast quantities are positive, there is a possibility that the analysis value (away from the observation location) be negative.
Example: Consider two grid points located at x1 and x2, and a single observation y^o coinciding with grid point x1. The forecast error covariance between the two grid points can be written as P12 = rho sigma1 sigma2, where sigma1 and sigma2 are the error standard deviations at grid points 1 and 2, and rho is the error correlation, assumed to be positive. The analysis at grid point 2 is then x2^a = x2^f + rho sigma1 sigma2 / (sigma1^2 + sigma_o^2) * (y^o - x1^f).
If the forecast value at grid point 2 is close to zero, then if the innovation (y^o - x1^f) is negative, the analysis at grid point 2 can be negative.
Slide 35: Is using log x (the logarithm of the mixing ratio or concentration) in the analysis equation a reasonable solution?
The analysis in logs will produce a best estimate of the logarithm of the concentration, but we would have exp(E[log x]) = E[x] only if x were not a random variable (or if its variance were very small). If w is Gaussian distributed, then x = exp(w) is lognormally distributed.
Slide 36: Letting w = log x, the analysis equation is then of the form
w^a = w^f + B^f H^T (H B^f H^T + B^o)^(-1) (w^o + b - H w^f),
where the B's are the error covariances defined in log space, and b is an observational correction. b has a different form depending on whether we consider that it is the measurement in physical space that is unbiased, or that it is the measurement in log space that is unbiased.
When we consider that it is the measurement of the physical concentration variable that is unbiased, Cohn (1997) gives the explicit form of b. [Formula not transcribed]
Transformation of mean and covariance between log and physical quantities: for w ~ N(mu, sigma^2) and x = exp(w), E[x] = exp(mu + sigma^2/2) and var(x) = (exp(sigma^2) - 1) exp(2 mu + sigma^2).
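The point that the exponential of the best estimate in log space is not the best estimate of the physical variable can be checked numerically; below is a minimal sketch (mu and sigma are arbitrary illustration values).

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 0.0, 1.0
w = rng.normal(mu, sigma, size=200_000)        # w = log x is Gaussian
x = np.exp(w)                                  # x is lognormal

print("exp(mean of log x):", round(np.exp(w.mean()), 3))   # ~ exp(mu) = 1.0
print("mean of x:         ", round(x.mean(), 3))           # ~ exp(mu + sigma^2/2) ~ 1.65
```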
Slide 37: When the observations are retrievals
Retrievals are usually 1D (vertical) analyses, using the radiance as the observation and a prior x^p with a given error covariance P^p.
[Figure: averaging kernel A for the limb sounder HRDI]
So in a data assimilation scheme the observation operator and observation error covariance involve the averaging kernel A.
Problem! A is usually rank deficient, so unless R is full rank, the innovation covariance cannot be inverted.
Solution: Use an SVD and use the singular vectors as observations (Joiner and da Silva 1998).
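The rank-deficiency argument can be illustrated schematically. The sketch below is not the exact Joiner and da Silva (1998) formulation, only an assumed simplification in which the retrieval behaves like y = A x + error: the SVD of a rank-deficient averaging kernel A is used to project the retrieval onto its leading left singular vectors, giving a transformed observation operator and a full-rank transformed observation error covariance.

```python
import numpy as np

rng = np.random.default_rng(2)
n_levels, true_rank = 30, 6
# A rank-deficient averaging kernel, built here purely for illustration
A = rng.normal(size=(n_levels, true_rank)) @ rng.normal(size=(true_rank, n_levels))

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10 * s[0]))              # numerical rank of the averaging kernel
Ur = U[:, :r]                                  # leading left singular vectors

R = 0.1 * np.eye(n_levels)                     # illustrative retrieval error covariance
H_transformed = Ur.T @ A                       # observation operator for z = Ur^T y
R_transformed = Ur.T @ R @ Ur                  # r x r, full rank, hence invertible

print("rank of A:", r, "out of", n_levels)
print("transformed obs error covariance shape:", R_transformed.shape)
```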
Slide 38: 6. When and why do we need advanced data assimilation
- When the observation network is non-stationary, e.g. LEO (low Earth orbit) satellites.
[Figure: MLS assimilation of ozone using the sequential method]
Slide 39:
- When the data coverage is sparse: to increase information content with dynamically evolving error covariances.
[Figure: impact of HALOE observations (30 profiles per day and a 1/2-hour model time step)]
Slide 40:
- To infer model parameter values.
[Figure: estimates of photolysis rates during an assimilation experiment with simulated data: randomly disturbed photolysis rates (solid, thick), two-sigma bounds due to specified uncertainties (dashed), two-sigma bounds after assimilation of simulated measurements (solid)]
Slide 41:
- To infer unobserved variables. Here wind increments are developed from ozone observations in a 4D-Var scheme, although there is no error correlation between ozone and winds in the background error covariance.
Slide 42: 7. Data assimilation / numerical modelling analogy
- A data assimilation scheme can be unstable:
  - in data-sparse areas and with wrong error covariances.
- The numerical properties of the tangent linear model and adjoint matter:
  - e.g. with an advection scheme that is limited by the Courant number, the adjoint model is described by the flux scheme with a Lipschitz condition (creates numerical noise at the poles with a grid-point model).
- Parametrization of the error covariances:
  - Because of the size of the error covariances (e.g. 10^7 x 10^7) they cannot be stored, so they need to be modelled.
Slides 43-45:
- Initial forecast error covariance derived from zonal monthly climatology
- Retrieval error used as observation error
- No model error (assume perfect model)
[The same configuration is shown on three successive slides with different result figures]
Slide 46: 8. Kalman filtering
In a Kalman filter, the error covariance is dynamically evolved between the analyses, so that both the state vector and the error covariance are updated by the dynamics and by the observations.
Prediction:
x^f(t_n) = M x^a(t_{n-1}),  P^f(t_n) = M P^a(t_{n-1}) M^T + Q.
Analysis (when observations are available at t_n):
K = P^f H^T (H P^f H^T + R)^(-1),  x^a = x^f + K (y^o - H x^f),  P^a = (I - K H) P^f.
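The prediction-analysis cycle above can be written compactly in code. The following is a minimal sketch for a linear model: a simple circular shift stands in for advection, and the grid size, observation network, Q and R are illustrative assumptions only.

```python
import numpy as np

def kalman_filter_step(xf, Pf, y, H, R, M, Q):
    """One Kalman filter cycle: analysis at the current time, then prediction."""
    # Analysis
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)     # Kalman gain
    xa = xf + K @ (y - H @ xf)
    Pa = (np.eye(len(xf)) - K @ H) @ Pf
    # Prediction to the next analysis time
    return xa, Pa, M @ xa, M @ Pa @ M.T + Q

n = 40
M = np.roll(np.eye(n), 1, axis=0)            # shift operator = linear advection model
Q = 0.01 * np.eye(n)                         # assumed model error covariance
H = np.eye(n)[::8]                           # observe every 8th grid point
R = 0.25 * np.eye(H.shape[0])

rng = np.random.default_rng(1)
truth = np.sin(2 * np.pi * np.arange(n) / n)
xf, Pf = np.zeros(n), np.eye(n)              # initial forecast and error covariance

for step in range(20):
    y = H @ truth + rng.normal(scale=0.5, size=H.shape[0])
    xa, Pa, xf, Pf = kalman_filter_step(xf, Pf, y, H, R, M, Q)
    truth = M @ truth                        # evolve the "true" state
print("final mean analysis error variance:", np.trace(Pa) / n)
```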
Slide 47: Definition
The Kalman filter produces the best estimate of the atmospheric state given all current and past observations, and yet the algorithm is sequential in time, in the form of a predictor-corrector scheme.
From a Bayesian point of view, the Kalman filter constructs an estimate based on p(x_n | y_n, y_{n-1}, ..., y_1).
The time-sequential property of a Kalman filter is not easy to show.
Example: Estimating a scalar constant using no prior, and assuming white-noise observation error with constant error variance. The MV estimate can be shown to be the simple time average
x_hat_n = (1/n) sum_{k=1}^{n} y_k,
which can be re-written as a sequential scheme,
x_hat_n = x_hat_{n-1} + (1/n) (y_n - x_hat_{n-1}).
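A two-line numerical check of this equivalence (the true value 3.0 and the noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
y = 3.0 + rng.normal(scale=1.0, size=500)    # noisy measurements of a constant

x_hat = 0.0
for n, yn in enumerate(y, start=1):
    x_hat += (yn - x_hat) / n                # sequential (predictor-corrector) form

print(x_hat, y.mean())                       # the two estimates agree
```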
Slide 48: Prediction of errors
x^f(t_n) = M[x^a(t_{n-1})], where M is a model of the atmosphere which expresses our theoretical understanding, based on the physical laws of dynamics, radiation, etc.
Use the same model to represent the evolution of the true state x^t:
x^t(t_n) = M[x^t(t_{n-1})] + eps^m, where eps^m is called the model error (or modelling error).
Linearizing the model about the analysis state, i.e. M[x^t] ~ M[x^a] + M (x^t - x^a), gives the forecast error covariance prediction P^f(t_n) = M P^a(t_{n-1}) M^T + Q.
It is usually assumed that the model error is white, or equivalently that the forecast error can be represented as a first-order Markov process, in which case we can show that the covariance prediction equation above holds with Q = E[eps^m (eps^m)^T].
Slide 49: Results from the assimilation of CH4 observations from UARS (Ménard and Chang 2000)
[Figure: error variance vs. time for the Kalman filter and for OI and other sub-optimal schemes; curves shown for the forecast error with no assimilation, the observation error, the analysis error with no assimilation, OI, and the KF analysis error with evolving variance]
The accumulation of information over time is limited by model error, so that in practice we need to integrate a Kalman filter only over some finite time to obtain the optimal solution.
Slide 50: 8.2 Asymptotic stability and observability
- We say that we have observability when it is possible for a data assimilation system with a perfect model and perfect observations to determine a unique initial state from a finite time sequence of observations.
Example: one-dimensional advection with speed U over a periodic domain of length L. Continuous observations at a single point x0 over a time T = L/U determine uniquely the initial condition phi_0(x) = phi(x,0).
Theorem: If an assimilation system is observable and the model is imperfect (Q nonzero), then the sequence of forecast error covariances converges geometrically to an asymptotic covariance which is independent of the initial error covariance.
Slide 51: Information about error covariances is still needed!
Slide 52: 9. Computational simplification for the Kalman filter
Ensemble Kalman filtering (see the sketch after this list):
- Monte Carlo approach. Storage of the error covariance is never needed, except to compute the Kalman gain, which requires the storage of an n x p forecast error covariance matrix.
- Because of the spatial correlation of forecast errors, the sample size of the ensemble can be significantly reduced, and it is found that a sample size of 100-1000 gives reasonable results.
Reduced-rank formulation:
- By using a singular value decomposition of either the model matrix or the forecast error covariance, only a number of singular modes is needed to evolve the error covariances if the dynamics is unstable. Typically of the order of 100 modes may be needed.
- This formulation is not efficient for stable dynamics problems.
Variance evolving scheme for chemical constituents (discussed on the following slides).
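Below is a minimal sketch of the stochastic (perturbed-observation) ensemble Kalman filter analysis step referred to in the first item above; it forms Pf H^T directly from the ensemble perturbations, so the full n x n covariance is never stored. The state size, observation network and error variances are assumptions for illustration.

```python
import numpy as np

def enkf_analysis(ensemble, y, H, R, rng):
    """Stochastic (perturbed-observations) EnKF analysis step.
    ensemble : (n_members, n_state) forecast states
    y, H, R  : observations, linear observation operator, obs error covariance
    """
    n_members = ensemble.shape[0]
    Xf = ensemble - ensemble.mean(axis=0)              # forecast perturbations
    Pf_HT = Xf.T @ (Xf @ H.T) / (n_members - 1)        # Pf H^T without forming Pf
    K = Pf_HT @ np.linalg.inv(H @ Pf_HT + R)           # Kalman gain
    # Perturbed observations give the correct analysis spread
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), R, size=n_members)
    return ensemble + (y_pert - ensemble @ H.T) @ K.T

rng = np.random.default_rng(2)
n_state, n_members = 40, 100
ensemble = rng.normal(size=(n_members, n_state))       # illustrative forecast ensemble
H = np.eye(n_state)[::8]
R = 0.25 * np.eye(H.shape[0])
y = np.zeros(H.shape[0])

analysis = enkf_analysis(ensemble, y, H, R, rng)
print("forecast spread:", ensemble.std(axis=0).mean(),
      "analysis spread:", analysis.std(axis=0).mean())
```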
Slides 53-57: Evolution of the forecast-error covariance for the advection equation
Daley, R., 1992: Forecast-error Statistics for Homogeneous Observation Networks. Mon. Wea. Rev., 120, 627-643.
- Kalman filter formulation
- M is a linear advection model
[Animation over five slides: the forecast error covariance matrix (elements i, j = 1, ..., N) at successive times; the final frame highlights the advection of the error variance]
Slide 58: Theory
Mass conservation: dn/dt + n div(v) = 0, where n is the number density (molecules m^-3).
The mixing ratio mu is a conserved quantity: d(mu)/dt = 0 following the flow.
Lagrangian description: trajectories dX/dt = v(X, t); conservation mu(X(t), t) = mu(X(0), 0).
Slide 59: Evolution of errors
The mixing ratio error eps is also conserved along trajectories.
Evolution of the error covariance: the covariance of mixing ratio error between a pair of points X1 and X2 is P(X1, X2, t) = E[eps(X1, t) eps(X2, t)].
Multiplying the conservation equation for eps(X1, t) by eps(X2, t), and likewise for the eps1 d(eps2)/dt term, and taking the expectation gives dP(X1, X2, t)/dt = 0 following the pair of trajectories:
the error covariance is conserved (without model error).
Slide 60: Forecast error variance equation
The equation dP(X1, X2, t)/dt = 0 has two characteristic equations, dX1/dt = v(X1, t) and dX2/dt = v(X2, t), whose solutions are the trajectories X1(t) and X2(t).
Since no trajectories will cross each other, if initially X1 and X2 coincide, then the characteristics will be coincident at all times. It follows that the error variance sigma^2(x, t) = P(x, x, t) obeys the following variance evolution equation:
d(sigma^2)/dt = 0 following the flow, i.e. the error variance is simply advected.
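A trivial numerical illustration of this variance evolution equation (constant wind and integer Courant number, so the advection reduces to an exact grid shift; the grid and wind are arbitrary choices):

```python
import numpy as np

# Error variance advected by a constant wind on a periodic 1D grid.
n_steps, shift = 20, 1
var = np.exp(-0.5 * ((np.arange(100) - 30) / 5.0) ** 2)   # initial variance "bump"

for _ in range(n_steps):
    var = np.roll(var, shift)          # sigma^2(x, t+dt) = sigma^2(x - U dt, t)

print("peak location moved from 30 to", int(np.argmax(var)))
print("peak value unchanged:", var.max())
```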
Slide 61: Analysis error variance: approximation by sequential processing (Dee 2002)
- Although the KF scheme is sequential in time, the state estimate is optimal, accounting for all observations since the beginning of the integration.
- Observations can be processed one at a time (sequential processing of observations).
Updating the whole Pa is costly. Dee (2002) suggests updating only the error variance (keeping the error correlation fixed) in the sequential processing of the observations.
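A minimal sketch of this variance-only sequential update (the Gaussian correlation, the 1D grid and the observation error variance are illustrative assumptions, not the configuration used by Dee 2002): each observation is processed one at a time, and only the error variances are updated while the correlation matrix is kept fixed.

```python
import numpy as np

def variance_only_update(var, corr, obs_idx, obs_err_var):
    """Sequential processing of observations, updating only the error variances
    while keeping the error correlation matrix fixed."""
    var = var.copy()
    for k, r in zip(obs_idx, obs_err_var):
        s = var[k] + r                                   # innovation variance at the obs point
        # Pf_ik = corr[i,k]*sqrt(var_i*var_k); Pa_ii = var_i - Pf_ik^2 / s
        var = var - (corr[:, k] ** 2) * var * var[k] / s
    return var

x = np.linspace(0.0, 10.0, 101)
corr = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * 1.0 ** 2))   # fixed Gaussian correlation
var_f = np.ones_like(x)
var_a = variance_only_update(var_f, corr,
                             obs_idx=range(0, 101, 10), obs_err_var=[0.25] * 11)
print("min / max analysis error variance:", var_a.min(), var_a.max())
```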
Slide 62: Analysis error estimation: an illustration
Observing 10 grid point values on a 1D grid. Approximation errors arise because correlations are not updated.
Slide 63:
2D Kalman filter code (parallel MPI code, Eulerian and Lagrangian formulations):
ftp://dao.gsfc.nasa.gov/pub/papers/lyster/lagn_euln_hpcc_r_1_0_0.tar.gz
3D sequential method: http://acd.ucar.edu/boris/research.htm
- The variance evolving scheme (also known as the sequential scheme) is the most effective method to simplify the Kalman filter.
- This scheme is limited to hyperbolic problems, but not to univariate problems. For instance, we can address the problem of assimilating several chemically reacting species using the same approach.
- Remark: Computational simplification for parabolic problems is currently being investigated.
Slide 64: 10. Estimation of Q for the Kalman filter
Chi-square is easily obtained in:
- 3D-Var: chi^2 = 2 J_min
- PSAS: chi^2 = nu^T (H P^f H^T + R)^(-1) nu, available from the PSAS solution vector
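As a hedged illustration of how the chi-square diagnostic can be used to check (and hence tune) the specified error covariances, the sketch below compares the mean chi-square of simulated innovations against the number of observations; the inflation factor and array sizes are arbitrary.

```python
import numpy as np

def mean_chi2(innovations, S):
    """Mean chi-square of the innovations; close to the number of observations p
    when the specified innovation covariance S = H Pf H^T + R is consistent."""
    Sinv = np.linalg.inv(S)
    return np.einsum('ti,ij,tj->t', innovations, Sinv, innovations).mean()

rng = np.random.default_rng(6)
p = 10
S_assumed = np.eye(p)                        # innovation covariance assumed by the assimilation
innov = rng.multivariate_normal(np.zeros(p), 1.5 * S_assumed, size=1000)

print("mean chi-square:", round(mean_chi2(innov, S_assumed), 2), "vs p =", p)
# A mean chi-square larger than p suggests the forecast error (and hence Q) is
# underestimated; smaller than p suggests it is overestimated.
```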
Slide 65: The lagged-forecast method revisited
Slide 66: Taking the average over a day shows that the observation contribution balances the model error contribution.
In this context, the lagged-forecast method actually provides information about the model error.
Slide 67: 11. Estimation of model error bias
The transport of a chemical tracer is governed by dc/dt = -L(c) + S, where L represents the advection, diffusion, and chemical lifetime, and S is the source.
Slide 68: Measurement and analysis
- Let y^o be an observation, or set of observations, at time t_n, and H the interpolation (or observation) operator: y^o_n = H c^t_n + eps^o, where the superscript t denotes the true value (or value we are estimating), and eps^o is the observation error.
- There are two parameters that we are estimating on the basis of observations: the initial state C_0 and the source S.
- Let's consider a state-augmented variable z = (c, S)^T.
- Suppose we have only observations of the concentration. The measurement equation becomes y^o_n = H~ z^t_n + eps^o, where H~ acts only on the concentration part of z,
- and the state equation becomes z_{n+1} = M~ z_n, where M~ propagates the concentration (with its source term) and keeps the source constant in time.
Slide 69: Analysis
- Even if initially there is no correlation between the state error and the source error, the effect of transport creates this coupling, so the forecast error covariance of the augmented state has the block form with elements sigma_cc, sigma_cs and sigma_ss.
- Let's calculate the analysis error covariance from Pa = (I - K H~) Pf.
- For a single concentration observation with error variance r, the gain for the state estimate is K_c = sigma_cc / (sigma_cc + r), and the gain for the source estimate is K_s = sigma_cs / (sigma_cc + r).
- It is from the correlation between the state error and the source error that observational information can be projected onto the source.
Slide 71:
- We note that the measurement of the state is equally efficient for reducing the <cc> and <cs> error covariances.
We thus observe the following:
1. When the state-source error is uncorrelated, i.e. sigma_cs = 0, then there is no variance reduction of the source error.
Slide 72: Summary
The system we are studying has two unknowns, the (initial) state and the source. The effect of observing is encapsulated in the state-state, state-source, and source-source error covariances. In particular, we note that we can't gain any information about the source if there is no cross state-source error correlation. Ironically, the effect of observing is to reduce the cross state-source error correlation. Finally, we note that the state error variance is reduced more by observations than the source error variance is.
Then, after a bit of algebra, we find that the analysis state-source error correlation is smaller in magnitude than the forecast one; that is, the effect of observing the state reduces the cross state-source error correlation.
Slide 73: Forecast
- Let's continue with the scalar model, and parametrize the model evolution by the simple equation dc/dt = -L c + S, where L is a constant that represents chemical lifetime and diffusion effects.
- Also, let's consider a constant source.
- In order to compute the evolution of error, a model of the truth is needed. To simplify, let's assume that the main uncertainty arises from the modelling of the source dynamics rather than from the modelling of the transport, dc^t/dt = -L c^t + S^t, where the superscript t denotes the true value.
- The error thus follows the same equation, written for the departures from the truth (denoted by tildes).
Slide 76: Numerical experiment
[Figures: no assimilation vs. with assimilation, R Q100]
Slide 77: Time correlation of errors: a way to increase sensitivity to the sources
- In synthetic inversions of the source, long time-integration periods are used. Let's examine how longer time integrations may carry more information about the source. For this we will examine the time correlations.
- Using the formal solutions of the error equations, we get the time correlations. [Expressions not transcribed]
Slide 78: Numerical experiment (same conditions as before)
[Figure: red, gain for the source; blue, gain for the state]
Slide 79: 12. Some thoughts about distinguishing model error bias from observation error bias
- Model error bias is a growing (time-evolving) quantity, in contrast with observation error bias, which is nearly constant in time.
- From a strict point of view of estimation theory, there is a fundamental difference between observation error estimation and model error estimation: in model error estimation, the model bias is an unobserved variable, whereas in observation bias estimation, the observation bias is an observed variable.
Slide 80: 13. Some remarks and future directions
- Data assimilation is part of nearly all geophysical and planetary science investigations.
- Especially with advanced data assimilation, the problem becomes that of estimating the model error and observation error, thus touching upon the scientific question of distinguishing model error from observation error.
- It is through lagged-forecast error covariances (e.g. 4D-Var) that we can get a lot of sensitivity to model error.
- In order to make progress in data assimilation, there is a need to develop systematic ways to determine all the error statistics.
Slide 81: Useful formulas
Differentiation with inner products: d(a^T x)/dx = a;  d(x^T A x)/dx = (A + A^T) x.
Differentiation with traces of matrices: d tr(A B)/dA = B^T;  d tr(A B A^T)/dA = A (B + B^T).
Slide 82: Sherman-Morrison-Woodbury formula (matrix inversion lemma)
(A + U C V)^(-1) = A^(-1) - A^(-1) U (C^(-1) + V A^(-1) U)^(-1) V A^(-1).