OPAVAR : The Basics Based on version OPA8'2_VAR2'0 tagged on 24 June 2005

About This Presentation

Title:

OPAVAR : The Basics Based on version OPA8'2_VAR2'0 tagged on 24 June 2005

Description:

Constructing the control vector (balance, smoothing, ... advection scheme used everywhere (in the nonlinear model upstream advection is ... – PowerPoint PPT presentation

Number of Views:24

Avg rating:3.0/5.0

Slides: 40

Provided by: cerf

Category:

more less

Transcript and Presenter's Notes

Title: OPAVAR : The Basics Based on version OPA8'2_VAR2'0 tagged on 24 June 2005

1
OPAVAR The BasicsBased on version
OPA8.2_VAR2.0(tagged on 24 June 2005)
Anthony Weaver CERFACS,Toulouse
2
Outline

General formulation of OPAVAR
Constructing the control vector (balance,
smoothing,)
TL/AD model operator, and observation operators
Temporal filtering
Minimization and preconditioning
Observation processing, QC and error
specification
Assimilation diagnostics

3
General formulation of OPAVAR

Minimize
where
background term
observation term
subject to the constraint
We construct the control vector so that
Improves the conditioning of the minimization
Allows for nonlinear balance and smoothness
constraints
(they are embedded in )

4
Constructing the control vector

The control vector is assumed to be related to
the model state vector via a transformation of
the form
removes dynamical and physical
correlations between variables
normalizes the uncorrelated variables
roughens the normalized uncorrelated
variables
The above transformation is need to evaluate
We also need the inverse to evaluate the
observation term
smoothes each of the normalized
uncorrelated 3D variables
rescales the smoothed variables
provides multivariate coupling between
variables

5
Minimization

We use an incremental algorithm
Minimize
where

The background error covariances in model
space are
flow dependent (they depend on the background
state)
adaptive (they are updated on outer iterations)

6
Simplifications

Through successive linearizations, we can write
where since
Also, since then in the background
term we can write
So, we can iterate the minimization algorithm
without the need to perform either the nonlinear
transformation or its inverse
Only the linearized transformation (and its
adjoint) are required
We need to keep track of the increments

The linearized transformation (G) can be
summarized as

3D-Var (FGAT)
4D-Var
8
Linearized balance operator

transforms the uncorrelated increments
includes
Linear T-S constraint (reference state-dependent)
Linear equation of state (reference
state-dependent)
Dynamic height relation
Hydrostatic relation
Geostrophy (f-plane ß-plane)
A nonlinear T-S constraint has not been
implemented and is not needed with the
approximation described earlier
Implications of this approximation should be
explored

9
Smoothing

is based on a diffusion operator
can be interpreted as a correlation matrix
(providing the diagonal elements are made equal
to one)
3D correlations are modelled as the product of a
2D (horizontal) and 1D (vertical) diffusion
operator
Correlations implied by the diffusion operator
are effectively Gaussian
Correlations are weakly anisotropic
Horizontal correlations are made anisotropic near
the equator (elsewhere isotropic)
Vertical correlations are made a function of the
model vertical resolution

10
Normalization

is a diagonal matrix of
normalization factors
normalizes the smoothed fields so that the
diagonal elements of are
equal to one
Normalization factors can be computed exactly but
the algorithm is very expensive (namelist)
Can be computed approximately using a more
efficient algorithm based on randomization
(namelist)
Needs to be done only once if is
state-independent (currently the case)
multiplies the smoothed fields by estimates
of the background error standard deviations for
the uncorrelated variables. Currently,
std. devs. are dependent on the background
vertical T gradient
std. devs. are specified as an analytical
function of depth
std. devs. are defined as an
analytical function of latitude
(amplitudes controlled by namelist parameters)

11
Some comments on the transformation U

The salinity increments can be strongly
anisotropic since they depend on local vertical
derivatives of the background T and S states in
the balance operator

3D-Var, single T observation at 100m
12
Some comments on the transformation U

The salinity increments can be strongly
anisotropic since they depend on local vertical
derivatives of the background T and S states in
the balance operator

3D-Var, single T observation at 100m
After horizontal smoothing of the salinity
balance coefficient
13
Some comments on the transformation U

U determines how information about unobserved
variables is extracted from directly observed
quantities

3D-Var, single SSH observation at the equator
14
Some comments on the transformation U

The balance operator is important in 4D-Var as
well as 3D-Var

4D-Var, single T observation at 100m on day 10
Balance results in a more effective projection of
the observation onto large-scale equatorial wave
modes
15
Generalized observation operator cont.

Simplified tangent-linear model and its adjoint
(M and MT)
TL/AD routines exist for the dynamical core
(standard ORCA2 options) of OPA8.2
Test routines available to verify numerical
accuracy of TL/AD routines (namelist)
Simplifications have been introduced in the
following
Perturbations to vertical diffusion coefficients
ignored
Perturbations to isopycnal slopes ignored
Centred advection scheme used everywhere (in the
nonlinear model upstream advection is used in
specific areas such as river mouths or straits)
Elliptic solver in TL/AD is based on Red-Black
SOR (CG solver is used in the nonlinear model)
Nonlinear trajectory stored once per day and
defined at intermediate times through linear
interpolation

16
Generalized observation operator cont.

Simplified tangent-linear model and its adjoint
(M and MT)
TL/AD routines exist for the dynamical core of
standard options for OPA8.2
Simplifications have been introduced in the
following
Perturbations to vertical diffusion coefficients
ignored
Perturbations to isopycnal slopes ignored
Centred advection scheme used everywhere (in the
nonlinear model upstream advection is used in
specific areas such as river mouths or straits)
Elliptic solver in TL/AD is based on Red-Black
SOR (CG solver is used in the nonlinear model)
Nonlinear trajectory stored once per day and
defined at intermediate times through linear
interpolation

17
Generalized observation operator cont.

Simplified tangent-linear model and its adjoint
(M and MT) cont.
Test routines available to verify numerical
accuracy of TL/AD routines
ASCII output for numerical TL and AD tests
Vairmer format for 3D output of TL test
Some Open-MP directives have been included in
costly routines
Trajectory I/O has been reduced (and code
performance enhanced) by storing two nonlinear
states in core memory

18
Generalized observation operator cont.

Observation operators (H)
Interfaces have been developed to allow for the
assimilation of in situ T and S profiles, SST and
SSH
Observation times are approximated by the closest
model time step, except for buoy data which are
assimilated as a daily averaged quantity
These data types require spatial interpolation
(2D and 3D) from model grid to measurement
location
Vertical interpolation using linear or cubic
spline (namelist)
Several methods available for horizontal
interpolation (namelist)
For a quadrilateral grid, bilinear remapping
method most accurate (SCRIPS)

19
Generalized observation operator cont.

Observation operators (H) cont.
A search algorithm has been developed to
determine the corner points of the quadrilateral
cell for interpolation
This is the most complicated and costly part of
the interpolation
Done only once (before assimilation) and
interpolation indices are stored in arrays
The efficiency of the algorithm is improved by
making an initial search around the coords of the
last observation point
If the initial search fails then the search is
performed over the entire domain using a
bisection method
Some points on the model grid have to be
treated specifically (beware of differences
between ORCA2 and ORCA0.5)

20
Temporal filtering

Two procedures can be used in OPAVAR to filter
out high frequency noise from the
analysis/forecast
Incremental Analysis Updating (IAU) (3D-Var)
(namelist)
Jc-term based on a digital filter (4D-Var)
(namelist)
IAU
Forcing term with constant weights
Can be used with 3D-Var and, in principle, with
4D-Var (but probably not useful)
Leads to time-continuous analyses ?
Can produce excessive damping of low frequency
waves ?
Can degrade fit to data achieved in the analysis
(its a post-processing tool) ?
Jc-term
Additional term in the cost function which
penalizes the energy of the difference between
and a time-(digital-) filtered
estimate of
Can be used with 4D-Var only
Temporal filtering achieved during the analysis
step ?
Negligible additional cost ?
Weighting parameter not easy to tune ?
The need for IAU / Jc-term should be constantly
re-evaluated as improvements in other aspects of
the assimilation system are made (balance,
correlation operators, model error terms,)

21
Minimization and preconditioning

Minimization of the non-quadratic function is
achieved by minimizing a sequence of quadratic
functions (outer/inner loop configuration)
Two gradient descent algorithms available with
OPAVAR for the inner loop minimization
M1QN3 (Gilbert and Lemarechal 1989), with exact
line searches
CONGRAD (Fisher 1998), with some modifications
For quadratic minimization problems, QN and CG
are equivalent within the limit of exact
arithmetic, but differ in practice due to
round-off error
CONGRAD is generally better than M1QN3
Based on a three-term Lanczos recursion
Lanczos vectors are reorthogonalized on each
iteration to counteract the effects of round-off
error
Reorthogonalization can increase memory
requirements significantly
Vectors are stored out-of-core to reduce memory
(cost due to increased I/O is insignificant)

22
Minimization and preconditioning

An inner loop minimization on a given outer
iteration is preconditioned using Hessian
information accumulated from the inner loop
minimization of the previous outer iteration
M1QN3 uses a preconditioner based on the BFGS
formula (warm start)
CONGRAD uses a (spectral) preconditioner
developed from the leading eigenvectors of the
Hessian (a by-product of the CG algorithm)
Alternative preconditioners currently being
explored at CERFACS (Gratton, Tshimanga)
Cost of the minimization algorithm is small
compared to the cost of computing ,
and

23
Minimization in 3D-Var and 4D-Var with a 10-day
window (Global in situ temperature data
assimilation)
1 outer iteration
2 outer iterations
IAU Incremental Analysis Updating (Bloom et
al., 1996, MWR)
24
Proposal for a cost effective minimization
strategy using a hybrid 3D-Var/4D-Var

4D-Var cost function for the outer loop
Combined incremental 3D-Var and 4D-Var for the
inner loops
Incremental 3D-Var with IAU for the first (few)
inner loop(s)
Incremental 4D-Var with Jc-term for the final
(few) inner loop(s) using 3D-Var solution as the
first-guess for minimization
Can adjust number of inner/outer iterations,
number of 3D-Var/4D-Var loops according to
computational constraints and minimization
performance
A promising strategy to be explored.

25
Observation processing, QC and error specification

Some basic online screening and QC is available
Observations are automatically flagged if
Outside the time window
Located at a model land point
Grid search fails to find their coords. for
interpolation
Located at the start of an assimilation cycle
(and cycle is gt 1)
Duplicated (same type, space and time coords. as
another obs.)
Observations are optionally flagged if
Located within a closed sea or semi-enclosed sea
(namelist)
Located poleward of a specified cutoff latitude
(namelist)
Located below a specified cutoff depth (namelist)
T profile measurement located in the first model
level (namelist)
Vertically thinned (namelist)
Fail a background check (namelist)
Routines are available for plotting horizontal
maps of accepted/rejected data

26
Automatic QC in OPAVAR
27
Observation processing, QC and error
specification cont.

Observation sorting (boring and messy!)
Flagged observations are removed from the
observation vector
Remaining observations are batched according to
time step in the assimilation window
The algorithm that does this is efficient but can
split in situ profiles into non-contiguous
segments in the vector
Profiles must be reordered for vertical thinning
and background check
Separated segments are first appended to one
another
Appended segments are then sorted by depth using
a Heapsort algorithm

28
Observation processing, QC and error
specification cont.

Vertical thinning
Densely sampled profiles can (should?) be thinned
before assimilation to remove redundant
observations
Thinning will
reduce vertical error correlations in the
observations (i.e., make R more diagonal)
reduce unnecessary computational overhead
ENACT data are already thinned (max 150 levels)
but can still leave several 10s of measurements
within a single model (ORCA2) layer
Thinning is performed evenly in a given model
layer according to a maximum allowed value /
layer (namelist)

29
Observation processing, QC and error
specification cont.

Background check
Designed to flag bad data that have survived an
initial QC or good data that are incompatible
with the model
For Gaussian, unbiased errors, an innovation di
is unlikely if
where is a
tolerance factor and
is not available explicitly but can be
estimated using randomization (namelist)
Or for simplicity we can set in
the above check (namelist)
When a observation fails this check, the entire
profile is flagged
The background check has not been used for most
experiments with the ENACT data-set (should be
used for real-time applications!)
Sensitive to the choice of (namelist)

30
Observation processing, QC and error
specification cont.

Observation errors
All observation errors are assumed to be
uncorrelated (R is diagonal)
is specified per instrument (XBT, CTD, )
(namelist)
Different parameterizations are available to
account for representativeness error (namelist)
for profiles made an analytical function of
depth
for T profiles made proportional to
for T
made a function of the innovation vector
made a function of the distance to
coastlines
Standard options

31
Assimilation diagnostics

Some standard diagnostics for variational
assimilation are stored in ASCII files
Value of the incremental (non-incremental) J and
its components (Jb, Jo, Jc) vs. inner (outer)
iteration
Value of the gradient and increment vs. inner
iteration
Other minimization details

32
Sample output in coststats.diagnostic1st outer
iteration before the inner loop

Cycle number
1
Number of outer iterations 1
Number of inner iterations 0
Number of background variables 1701995
Number of in situ T obs
46264
Number of in situ S obs
11048
Number of SSH obs
0
Number of SST obs
0
Total number of real obs
57312
Total number of pseudo real obss
1742556
Value of J_b
0.000000000000000000E00
Value of J_c
0.000000000000000000E00
Value of nonincremental J_o (in situ T)
91585.4573984951712
Value of nonincremental J_o (in situ S)
22800.6436087972543
Value of nonincremental J_o (SSH)
0.000000000000000000E00
Value of nonincremental J_o (SST)
0.000000000000000000E00
Value of nonincremental J_o
114386.101007292425
Value of nonincremental J
114386.101007292425

33
Sample output in coststats.diagnostic 1st outer
iteration after 1st inner loop

Cycle number
1
Number of outer iterations 1
Number of inner iterations 10
Number of background variables
1701995
Number of in situ T obs
46264
Number of in situ S obs
11048
Number of SSH obs 0
Number of SST obs 0
Total number of real obs
57312
Total number of pseudo real obs
1742556
Value of J_b
11412.8692121270215
Value of J_c
110.309572196781076
Value of incremental J_o (in situ T)
38781.8255538293815
Value of incremental J_o (in situ S)
9407.66244325712796
Value of incremental J_o (SSH)
0.000000000000000000E00
Value of incremental J_o (SST)
0.000000000000000000E00
Value of incremental J_o
48189.4879970865077
Value of incremental J
59712.6667814103130

34
Sample output in coststats.diagnostic2nd outer
iteration before the inner loop

Cycle number 1
Number of outer iterations 2
Number of inner iterations
10
Number of background variables
1701995
Number of in situ T obs
46264
Number of in situ S obs
11048
Number of SSH obs
0
Number of SST obs
0
Total number of real obs
57312
Total number of pseudo real obs
1742556
Value of J_b
2853.21730303175536
Value of J_c
110.931994244594165
Value of nonincremental J_o (in situ
T) 39464.1026831233976
Value of nonincremental J_o (in situ
S) 10672.5246749353282
Value of nonincremental J_o (SSH)
0.000000000000000000E00
Value of nonincremental J_o (SST)
0.000000000000000000E00
Value of nonincremental J_o
50136.6273580587294
Value of nonincremental J
53100.7766553350812

35
Assimilation diagnostics

Innovations and residuals
Model space increments (analysis-minus-background)
Stored in a NetCDF restart file
Observation space increments (OmB, OmA, OmAinc)
For ENACT in situ data, stored in a copy of the
original data file
and (at obs point) are also stored
Post-processing scripts/routines available for
computing statistics as a function of region,
time and observation type
Nothing yet available for altimeter data

36
Box regions for ENACT diagnostics
37
3D-Var and Control T and S residual statistics
for 1987-1994
Obs. Control ENACT_v1.4 data
Obs. - Analysis
3D-Var, T S ENACT_v1.4 data
Obs. - Analysis
3D-Var, T S EN2_v1a data
38
3D-Var (T S EN2_v1a data) and Control T and S
residual and innovation statistics for 1987-1994
Obs.-Control
Obs.-Anal.
(Before IAU)
Obs.-Anal.
(After IAU)
39
Assimilation diagnostics

Diagnosing implied background error variances in
model and observation space (namelist)
Compute (approximately) the diagonal of B, MBMT,
HBHT and HMBMTHT using randomization (namelist)
This computation is also needed online for the
background check
Diagnosing implied background error covariances
(namelist)
Compute columns of B
Single observation (T, S, SST, SSH) or profile
(T, S) experiments using prescribed innovations
(namelist)
Some redundancy with previous diagnostic
Some output still in Vairmer format, other in
NetCDF