Title: OPAVAR : The Basics Based on version OPA8'2_VAR2'0 tagged on 24 June 2005
1OPAVAR The BasicsBased on version
OPA8.2_VAR2.0(tagged on 24 June 2005)
Anthony Weaver CERFACS,Toulouse
2Outline
- General formulation of OPAVAR
- Constructing the control vector (balance,
smoothing,) - TL/AD model operator, and observation operators
- Temporal filtering
- Minimization and preconditioning
- Observation processing, QC and error
specification - Assimilation diagnostics
3General formulation of OPAVAR
- Minimize
where - background term
- observation term
- subject to the constraint
- We construct the control vector so that
- Improves the conditioning of the minimization
- Allows for nonlinear balance and smoothness
constraints - (they are embedded in )
-
4Constructing the control vector
- The control vector is assumed to be related to
the model state vector via a transformation of
the form - removes dynamical and physical
correlations between variables - normalizes the uncorrelated variables
- roughens the normalized uncorrelated
variables - The above transformation is need to evaluate
- We also need the inverse to evaluate the
observation term - smoothes each of the normalized
uncorrelated 3D variables - rescales the smoothed variables
- provides multivariate coupling between
variables
5Minimization
- We use an incremental algorithm
- Minimize
-
-
- where
-
-
- The background error covariances in model
space are - flow dependent (they depend on the background
state) - adaptive (they are updated on outer iterations)
6Simplifications
- Through successive linearizations, we can write
- where since
- Also, since then in the background
term we can write - So, we can iterate the minimization algorithm
without the need to perform either the nonlinear
transformation or its inverse
- Only the linearized transformation (and its
adjoint) are required - We need to keep track of the increments
7- The linearized transformation (G) can be
summarized as
3D-Var (FGAT)
4D-Var
8Linearized balance operator
- transforms the uncorrelated increments
- includes
- Linear T-S constraint (reference state-dependent)
- Linear equation of state (reference
state-dependent) - Dynamic height relation
- Hydrostatic relation
- Geostrophy (f-plane ß-plane)
- A nonlinear T-S constraint has not been
implemented and is not needed with the
approximation described earlier - Implications of this approximation should be
explored
9Smoothing
- is based on a diffusion operator
- can be interpreted as a correlation matrix
(providing the diagonal elements are made equal
to one) - 3D correlations are modelled as the product of a
2D (horizontal) and 1D (vertical) diffusion
operator - Correlations implied by the diffusion operator
are effectively Gaussian - Correlations are weakly anisotropic
- Horizontal correlations are made anisotropic near
the equator (elsewhere isotropic) - Vertical correlations are made a function of the
model vertical resolution
10Normalization
- is a diagonal matrix of
normalization factors - normalizes the smoothed fields so that the
diagonal elements of are
equal to one - Normalization factors can be computed exactly but
the algorithm is very expensive (namelist) - Can be computed approximately using a more
efficient algorithm based on randomization
(namelist) - Needs to be done only once if is
state-independent (currently the case) - multiplies the smoothed fields by estimates
of the background error standard deviations for
the uncorrelated variables. Currently, - std. devs. are dependent on the background
vertical T gradient - std. devs. are specified as an analytical
function of depth - std. devs. are defined as an
analytical function of latitude - (amplitudes controlled by namelist parameters)
11Some comments on the transformation U
- The salinity increments can be strongly
anisotropic since they depend on local vertical
derivatives of the background T and S states in
the balance operator
3D-Var, single T observation at 100m
12Some comments on the transformation U
- The salinity increments can be strongly
anisotropic since they depend on local vertical
derivatives of the background T and S states in
the balance operator
3D-Var, single T observation at 100m
After horizontal smoothing of the salinity
balance coefficient
13Some comments on the transformation U
- U determines how information about unobserved
variables is extracted from directly observed
quantities
3D-Var, single SSH observation at the equator
14Some comments on the transformation U
- The balance operator is important in 4D-Var as
well as 3D-Var
4D-Var, single T observation at 100m on day 10
Balance results in a more effective projection of
the observation onto large-scale equatorial wave
modes
15Generalized observation operator cont.
- Simplified tangent-linear model and its adjoint
(M and MT) - TL/AD routines exist for the dynamical core
(standard ORCA2 options) of OPA8.2 - Test routines available to verify numerical
accuracy of TL/AD routines (namelist) - Simplifications have been introduced in the
following - Perturbations to vertical diffusion coefficients
ignored - Perturbations to isopycnal slopes ignored
- Centred advection scheme used everywhere (in the
nonlinear model upstream advection is used in
specific areas such as river mouths or straits) - Elliptic solver in TL/AD is based on Red-Black
SOR (CG solver is used in the nonlinear model) - Nonlinear trajectory stored once per day and
defined at intermediate times through linear
interpolation
16Generalized observation operator cont.
- Simplified tangent-linear model and its adjoint
(M and MT) - TL/AD routines exist for the dynamical core of
standard options for OPA8.2 - Simplifications have been introduced in the
following - Perturbations to vertical diffusion coefficients
ignored - Perturbations to isopycnal slopes ignored
- Centred advection scheme used everywhere (in the
nonlinear model upstream advection is used in
specific areas such as river mouths or straits) - Elliptic solver in TL/AD is based on Red-Black
SOR (CG solver is used in the nonlinear model) - Nonlinear trajectory stored once per day and
defined at intermediate times through linear
interpolation
17Generalized observation operator cont.
- Simplified tangent-linear model and its adjoint
(M and MT) cont. - Test routines available to verify numerical
accuracy of TL/AD routines - ASCII output for numerical TL and AD tests
- Vairmer format for 3D output of TL test
- Some Open-MP directives have been included in
costly routines - Trajectory I/O has been reduced (and code
performance enhanced) by storing two nonlinear
states in core memory
18Generalized observation operator cont.
- Observation operators (H)
- Interfaces have been developed to allow for the
assimilation of in situ T and S profiles, SST and
SSH - Observation times are approximated by the closest
model time step, except for buoy data which are
assimilated as a daily averaged quantity - These data types require spatial interpolation
(2D and 3D) from model grid to measurement
location - Vertical interpolation using linear or cubic
spline (namelist) - Several methods available for horizontal
interpolation (namelist) - For a quadrilateral grid, bilinear remapping
method most accurate (SCRIPS)
19Generalized observation operator cont.
- Observation operators (H) cont.
- A search algorithm has been developed to
determine the corner points of the quadrilateral
cell for interpolation - This is the most complicated and costly part of
the interpolation - Done only once (before assimilation) and
interpolation indices are stored in arrays - The efficiency of the algorithm is improved by
making an initial search around the coords of the
last observation point - If the initial search fails then the search is
performed over the entire domain using a
bisection method - Some points on the model grid have to be
treated specifically (beware of differences
between ORCA2 and ORCA0.5)
20Temporal filtering
- Two procedures can be used in OPAVAR to filter
out high frequency noise from the
analysis/forecast - Incremental Analysis Updating (IAU) (3D-Var)
(namelist) - Jc-term based on a digital filter (4D-Var)
(namelist) - IAU
- Forcing term with constant weights
- Can be used with 3D-Var and, in principle, with
4D-Var (but probably not useful) - Leads to time-continuous analyses ?
- Can produce excessive damping of low frequency
waves ? - Can degrade fit to data achieved in the analysis
(its a post-processing tool) ? - Jc-term
- Additional term in the cost function which
penalizes the energy of the difference between
and a time-(digital-) filtered
estimate of - Can be used with 4D-Var only
- Temporal filtering achieved during the analysis
step ? - Negligible additional cost ?
- Weighting parameter not easy to tune ?
- The need for IAU / Jc-term should be constantly
re-evaluated as improvements in other aspects of
the assimilation system are made (balance,
correlation operators, model error terms,)
21Minimization and preconditioning
- Minimization of the non-quadratic function is
achieved by minimizing a sequence of quadratic
functions (outer/inner loop configuration) - Two gradient descent algorithms available with
OPAVAR for the inner loop minimization - M1QN3 (Gilbert and Lemarechal 1989), with exact
line searches - CONGRAD (Fisher 1998), with some modifications
- For quadratic minimization problems, QN and CG
are equivalent within the limit of exact
arithmetic, but differ in practice due to
round-off error - CONGRAD is generally better than M1QN3
- Based on a three-term Lanczos recursion
- Lanczos vectors are reorthogonalized on each
iteration to counteract the effects of round-off
error - Reorthogonalization can increase memory
requirements significantly - Vectors are stored out-of-core to reduce memory
(cost due to increased I/O is insignificant)
22Minimization and preconditioning
- An inner loop minimization on a given outer
iteration is preconditioned using Hessian
information accumulated from the inner loop
minimization of the previous outer iteration - M1QN3 uses a preconditioner based on the BFGS
formula (warm start) - CONGRAD uses a (spectral) preconditioner
developed from the leading eigenvectors of the
Hessian (a by-product of the CG algorithm) - Alternative preconditioners currently being
explored at CERFACS (Gratton, Tshimanga) - Cost of the minimization algorithm is small
compared to the cost of computing ,
and
23Minimization in 3D-Var and 4D-Var with a 10-day
window (Global in situ temperature data
assimilation)
1 outer iteration
2 outer iterations
IAU Incremental Analysis Updating (Bloom et
al., 1996, MWR)
24Proposal for a cost effective minimization
strategy using a hybrid 3D-Var/4D-Var
- 4D-Var cost function for the outer loop
- Combined incremental 3D-Var and 4D-Var for the
inner loops - Incremental 3D-Var with IAU for the first (few)
inner loop(s) - Incremental 4D-Var with Jc-term for the final
(few) inner loop(s) using 3D-Var solution as the
first-guess for minimization - Can adjust number of inner/outer iterations,
number of 3D-Var/4D-Var loops according to
computational constraints and minimization
performance - A promising strategy to be explored.
25Observation processing, QC and error specification
- Some basic online screening and QC is available
- Observations are automatically flagged if
- Outside the time window
- Located at a model land point
- Grid search fails to find their coords. for
interpolation - Located at the start of an assimilation cycle
(and cycle is gt 1) - Duplicated (same type, space and time coords. as
another obs.) - Observations are optionally flagged if
- Located within a closed sea or semi-enclosed sea
(namelist) - Located poleward of a specified cutoff latitude
(namelist) - Located below a specified cutoff depth (namelist)
- T profile measurement located in the first model
level (namelist) - Vertically thinned (namelist)
- Fail a background check (namelist)
- Routines are available for plotting horizontal
maps of accepted/rejected data
26Automatic QC in OPAVAR
27Observation processing, QC and error
specification cont.
- Observation sorting (boring and messy!)
- Flagged observations are removed from the
observation vector - Remaining observations are batched according to
time step in the assimilation window - The algorithm that does this is efficient but can
split in situ profiles into non-contiguous
segments in the vector - Profiles must be reordered for vertical thinning
and background check - Separated segments are first appended to one
another - Appended segments are then sorted by depth using
a Heapsort algorithm
28Observation processing, QC and error
specification cont.
- Vertical thinning
- Densely sampled profiles can (should?) be thinned
before assimilation to remove redundant
observations - Thinning will
- reduce vertical error correlations in the
observations (i.e., make R more diagonal) - reduce unnecessary computational overhead
- ENACT data are already thinned (max 150 levels)
but can still leave several 10s of measurements
within a single model (ORCA2) layer - Thinning is performed evenly in a given model
layer according to a maximum allowed value /
layer (namelist)
29Observation processing, QC and error
specification cont.
- Background check
- Designed to flag bad data that have survived an
initial QC or good data that are incompatible
with the model - For Gaussian, unbiased errors, an innovation di
is unlikely if - where is a
tolerance factor and -
- is not available explicitly but can be
estimated using randomization (namelist) - Or for simplicity we can set in
the above check (namelist) - When a observation fails this check, the entire
profile is flagged - The background check has not been used for most
experiments with the ENACT data-set (should be
used for real-time applications!) - Sensitive to the choice of (namelist)
30Observation processing, QC and error
specification cont.
- Observation errors
- All observation errors are assumed to be
uncorrelated (R is diagonal) - is specified per instrument (XBT, CTD, )
(namelist) - Different parameterizations are available to
account for representativeness error (namelist) - for profiles made an analytical function of
depth - for T profiles made proportional to
for T - made a function of the innovation vector
- made a function of the distance to
coastlines - Standard options
31Assimilation diagnostics
- Some standard diagnostics for variational
assimilation are stored in ASCII files - Value of the incremental (non-incremental) J and
its components (Jb, Jo, Jc) vs. inner (outer)
iteration - Value of the gradient and increment vs. inner
iteration - Other minimization details
32Sample output in coststats.diagnostic1st outer
iteration before the inner loop
- Cycle number
1 - Number of outer iterations 1
- Number of inner iterations 0
- Number of background variables 1701995
- Number of in situ T obs
46264 - Number of in situ S obs
11048 - Number of SSH obs
0 - Number of SST obs
0 - Total number of real obs
57312 - Total number of pseudo real obss
1742556 -
- Value of J_b
0.000000000000000000E00 - Value of J_c
0.000000000000000000E00 - Value of nonincremental J_o (in situ T)
91585.4573984951712 - Value of nonincremental J_o (in situ S)
22800.6436087972543 - Value of nonincremental J_o (SSH)
0.000000000000000000E00 - Value of nonincremental J_o (SST)
0.000000000000000000E00 - Value of nonincremental J_o
114386.101007292425 - Value of nonincremental J
114386.101007292425
33Sample output in coststats.diagnostic 1st outer
iteration after 1st inner loop
- Cycle number
1 - Number of outer iterations 1
- Number of inner iterations 10
- Number of background variables
1701995 - Number of in situ T obs
46264 - Number of in situ S obs
11048 - Number of SSH obs 0
- Number of SST obs 0
- Total number of real obs
57312 - Total number of pseudo real obs
1742556 -
- Value of J_b
11412.8692121270215 - Value of J_c
110.309572196781076 - Value of incremental J_o (in situ T)
38781.8255538293815 - Value of incremental J_o (in situ S)
9407.66244325712796 - Value of incremental J_o (SSH)
0.000000000000000000E00 - Value of incremental J_o (SST)
0.000000000000000000E00 - Value of incremental J_o
48189.4879970865077 - Value of incremental J
59712.6667814103130
34Sample output in coststats.diagnostic2nd outer
iteration before the inner loop
- Cycle number 1
- Number of outer iterations 2
- Number of inner iterations
10 - Number of background variables
1701995 - Number of in situ T obs
46264 - Number of in situ S obs
11048 - Number of SSH obs
0 - Number of SST obs
0 - Total number of real obs
57312 - Total number of pseudo real obs
1742556 -
- Value of J_b
2853.21730303175536 - Value of J_c
110.931994244594165 - Value of nonincremental J_o (in situ
T) 39464.1026831233976 - Value of nonincremental J_o (in situ
S) 10672.5246749353282 - Value of nonincremental J_o (SSH)
0.000000000000000000E00 - Value of nonincremental J_o (SST)
0.000000000000000000E00 - Value of nonincremental J_o
50136.6273580587294 - Value of nonincremental J
53100.7766553350812
35Assimilation diagnostics
- Innovations and residuals
- Model space increments (analysis-minus-background)
- Stored in a NetCDF restart file
- Observation space increments (OmB, OmA, OmAinc)
- For ENACT in situ data, stored in a copy of the
original data file - and (at obs point) are also stored
- Post-processing scripts/routines available for
computing statistics as a function of region,
time and observation type - Nothing yet available for altimeter data
36Box regions for ENACT diagnostics
373D-Var and Control T and S residual statistics
for 1987-1994
Obs. Control ENACT_v1.4 data
Obs. - Analysis
3D-Var, T S ENACT_v1.4 data
Obs. - Analysis
3D-Var, T S EN2_v1a data
383D-Var (T S EN2_v1a data) and Control T and S
residual and innovation statistics for 1987-1994
Obs.-Control
Obs.-Anal.
(Before IAU)
Obs.-Anal.
(After IAU)
39Assimilation diagnostics
- Diagnosing implied background error variances in
model and observation space (namelist) - Compute (approximately) the diagonal of B, MBMT,
HBHT and HMBMTHT using randomization (namelist) - This computation is also needed online for the
background check - Diagnosing implied background error covariances
(namelist) - Compute columns of B
- Single observation (T, S, SST, SSH) or profile
(T, S) experiments using prescribed innovations
(namelist) - Some redundancy with previous diagnostic
- Some output still in Vairmer format, other in
NetCDF