Title: What is Data Assimilation? A Tutorial
1 What is Data Assimilation? A Tutorial
Andrew S. Jones
With lots of help from Steven Fletcher, Laura Fowler, Tarendra Lakhankar, Scott Longmore, Manajit Sengupta, Tom Vonder Haar, Dusanka Zupanski, and Milija Zupanski
2 Data Assimilation
- Outline
- Why Do Data Assimilation?
- Who and What
- Important Concepts
- Definitions
- Brief History
- Common System Issues / Challenges
3 The Purpose of Data Assimilation
- Why do data assimilation?
4 The Purpose of Data Assimilation
- Why do data assimilation? (Answer: Common Sense)
5 The Purpose of Data Assimilation
- Why do data assimilation? (Answer: Common Sense)
- MYTH: It's just an engineering tool
6 The Purpose of Data Assimilation
- Why do data assimilation? (Answer: Common Sense)
- MYTH: It's just an engineering tool
- If Truth matters, it's our most important science tool
7 The Purpose of Data Assimilation
- Why do data assimilation?
- I want better model initial conditions for better model forecasts
8 The Purpose of Data Assimilation
- Why do data assimilation?
- I want better model initial conditions for better model forecasts
- I want better calibration and validation (cal/val)
9 The Purpose of Data Assimilation
- Why do data assimilation?
- I want better model initial conditions for better model forecasts
- I want better calibration and validation (cal/val)
- I want better acquisition guidance
10 The Purpose of Data Assimilation
- Why do data assimilation?
- I want better model initial conditions for better model forecasts
- I want better calibration and validation (cal/val)
- I want better acquisition guidance
- I want better scientific understanding of
  - Model errors (and their probability distributions)
  - Data errors (and their probability distributions)
  - Combined Model/Data correlations
  - DA methodologies (minimization, computational optimizations, representation methods, various method approximations)
  - Physical process interactions
11 The Purpose of Data Assimilation
- Why do data assimilation?
- I want better model initial conditions for better model forecasts
- I want better calibration and validation (cal/val)
- I want better acquisition guidance
- I want better scientific understanding of
  - Model errors (and their probability distributions)
  - Data errors (and their probability distributions)
  - Combined Model/Data correlations
  - DA methodologies (minimization, computational optimizations, representation methods, various method approximations)
  - Physical process interactions (i.e., sensitivities and feedbacks)
- Leads toward better future models
12 The Purpose of Data Assimilation
- Why do data assimilation?
- I want better model initial conditions for better model forecasts
- I want better calibration and validation (cal/val)
- I want better acquisition guidance
- I want better scientific understanding of
  - Model errors (and their probability distributions)
  - Data errors (and their probability distributions)
  - Combined Model/Data correlations
  - DA methodologies (minimization, computational optimizations, representation methods, various method approximations)
  - Physical process interactions (i.e., sensitivities and feedbacks)
- Leads toward better future models
- VIRTUOUS CYCLE
13 The Data Assimilation Community
- Who is involved in data assimilation?
- NWP Data Assimilation Experts
- NWP Modelers
- Application and Observation Specialists
- Cloud Physicists / PBL Experts / NWP Parameterization Specialists
- Physical Scientists (Physical Algorithm Specialists)
- Radiative Transfer Specialists
- Applied Mathematicians / Control Theory Experts
- Computer Scientists
- Science Program Management (NWP and Science Disciplines)
- Forecasters
- Users and Customers
14 The Data Assimilation Community
- What skills are needed by each involved group?
- NWP Data Assimilation Experts (DA system methodology)
- NWP Modelers (Model Physics / DA system)
- Application and Observation Specialists (Instrument capabilities)
- Physical Scientists (Instrument Physics / DA system)
- Radiative Transfer Specialists (Instrument config. specifications)
- Applied Mathematicians (Control theory methodology)
- Computer Scientists (DA system OPS time requirements)
- Science Program Management (Everything + Good People)
- Forecasters (Everything + OPS time reqs. + Easy/fast access)
- Users and Customers (Could be a wide variety of responses), e.g., NWS / Army / USAF / Navy / NASA / NSF / DOE / ECMWF
15 The Data Assimilation Community
- Are you part of this community?
16 The Data Assimilation Community
- Are you part of this community?
- Yes, you just may not know it yet.
17 The Data Assimilation Community
- Are you part of this community?
- Yes, you just may not know it yet.
- Who knows all about data assimilation?
18 The Data Assimilation Community
- Are you part of this community?
- Yes, you just may not know it yet.
- Who knows all about data assimilation?
- No one knows it all; it takes many experts
19 The Data Assimilation Community
- Are you part of this community?
- Yes, you just may not know it yet.
- Who knows all about data assimilation?
- No one knows it all; it takes many experts
- How large are these systems?
20 The Data Assimilation Community
- Are you part of this community?
- Yes, you just may not know it yet.
- Who knows all about data assimilation?
- No one knows it all; it takes many experts
- How large are these systems?
- Typically, the DA systems are medium-sized projects using software industry standards
  - Medium multi-year coding effort by several individuals (e.g., RAMDAS is 230K lines of code, 3500 pages of code)
  - Satellite processing systems tend to be larger still
  - Our CIRA Mesoscale 4DVAR system was built over 7-8 years with heritage from the ETA 4DVAR system
21 The Building Blocks of Data Assimilation
[Diagram components: NWP Model, NWP Adjoint, Observation Model, Observation Model Adjoint, Observations, Minimization]
Control variables are the initial model state variables that are optimized using the new data information as a guide. They can also include boundary condition information, model parameters for tuning, etc.
22 What Are We Minimizing?
Minimize the discrepancy between the model and the observation data over time.
The cost function, J, is the link between the observational data and the model variables.
Observations are either assumed unbiased, or are debiased by some adjustment method.
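The cost function itself appeared as an equation on the original slide; for reference, a standard (Gaussian) textbook form, written with the symbols defined on the later slides (x_0, x_b, B, y_i, M_{0,i}, h, R_i), is:

```latex
J(x_0) = \frac{1}{2}\,(x_0 - x_b)^{\mathrm{T}} B^{-1} (x_0 - x_b)
       + \frac{1}{2}\sum_{i} \bigl[y_i - h\bigl(M_{0,i}(x_0)\bigr)\bigr]^{\mathrm{T}}
         R_i^{-1} \bigl[y_i - h\bigl(M_{0,i}(x_0)\bigr)\bigr]
```

The first term penalizes departure from the background; the second penalizes misfit to the observations over the assimilation window.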
23 Bayes' Theorem
The maximum conditional probability is given by
  P(x \mid y) \propto P(y \mid x)\, P(x)
Assuming Gaussian distributions:
  P(y \mid x) \propto \exp\!\left\{ -\tfrac{1}{2}\, [y - H(x)]^{\mathrm{T}} R^{-1} [y - H(x)] \right\}
  P(x) \propto \exp\!\left\{ -\tfrac{1}{2}\, (x - x_b)^{\mathrm{T}} B^{-1} (x - x_b) \right\}
e.g., 3DVAR; Lorenc (1986)
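To make the Bayes-to-variational link concrete, here is a minimal scalar sketch (hypothetical numbers, with H taken as the identity): the mode of the Gaussian posterior, found by brute force on a grid, coincides with the analytic minimum-variance analysis.

```python
import numpy as np

# Minimal scalar illustration (hypothetical numbers): for Gaussian background and
# observation errors, the Bayesian posterior mode equals the variational analysis.
x_b, sigma_b = 280.0, 2.0      # background state and its error std. dev.
y, sigma_o = 283.0, 1.0        # observation and its error std. dev. (H = identity)

# Evaluate the (unnormalized) log-posterior log P(x|y) ~ log P(y|x) + log P(x) on a grid.
x = np.linspace(270.0, 290.0, 2001)
log_post = -0.5 * ((y - x) / sigma_o) ** 2 - 0.5 * ((x - x_b) / sigma_b) ** 2
x_map = x[np.argmax(log_post)]               # posterior mode (maximum likelihood)

# Analytic minimum-variance analysis for the same problem.
w = sigma_b**2 / (sigma_b**2 + sigma_o**2)   # Kalman/OI weight
x_a = x_b + w * (y - x_b)

print(f"posterior mode = {x_map:.3f}, analytic analysis = {x_a:.3f}")
```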
24 What Do We Trust for Truth?
Minimize the discrepancy between the model and the observation data over time.
Model background or observations?
25 What Do We Trust for Truth?
Minimize the discrepancy between the model and the observation data over time.
Model background or observations? Trust = weightings, just like your financial credit score!
26 Who are the Candidates for Truth?
Minimize the discrepancy between the model and the observation data over time.
Candidate 1: the Background Term
- x_0 is the model state vector at the initial time t_0; this is also the control variable, the object of the minimization process
- x_b is the model background state vector
- B is the background error covariance of the forecast and model errors
27 Who are the Candidates for Truth?
Minimize the discrepancy between the model and the observation data over time.
Candidate 2: the Observational Term
- y is the observational vector, e.g., the satellite input data (typically radiances), salinity, sounding profiles
- M_{0,i}(x_0) is the model state at the observation time i
- h is the observational operator, for example the forward radiative transfer model
- R is the observational error covariance matrix that specifies the instrumental noise and data representation errors (currently assumed to be diagonal)
28 What Do We Trust for Truth?
Minimize the discrepancy between the model and the observation data over time.
- Candidate 1: the Background Term
- The default condition for the assimilation when
  - data are not available, or
  - the available data have no significant sensitivity to the model state, or
  - the available data are inaccurate
29 Model Error Impacts our Trust
Minimize the discrepancy between the model and the observation data over time.
Candidate 1: the Background Term
- Model error issues are important
- Model error varies as a function of the model time
- Model error grows with time
- Therefore the background term should be trusted more at the initial stages of the model run and trusted less at the end of the model run
30 How to Adjust for Model Error?
Minimize the discrepancy between the model and the observation data over time.
- Candidate 1: the Background Term
- Add a model error term to the cost function so that the weight at that specific model step is appropriately weighted (an illustrative form is shown after this list), or
- Use other possible adjustments in the methodology, i.e., make an assumption about the model error impacts
- If model error adjustments or controls are used, the DA system is said to be weakly constrained
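For illustration, a common textbook form of such a weakly constrained cost function is sketched below; the model error terms eta_i and their covariance Q_i are generic symbols introduced here, not taken from the original slide:

```latex
J(x_0, \eta) = \frac{1}{2}(x_0 - x_b)^{\mathrm{T}} B^{-1} (x_0 - x_b)
             + \frac{1}{2}\sum_i \bigl[y_i - h(x_i)\bigr]^{\mathrm{T}} R_i^{-1} \bigl[y_i - h(x_i)\bigr]
             + \frac{1}{2}\sum_i \eta_i^{\mathrm{T}} Q_i^{-1} \eta_i,
\qquad x_i = M_{i-1,i}(x_{i-1}) + \eta_i
```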
31 What About Model Error Errors?
Minimize the discrepancy between the model and the observation data over time.
- Candidate 1: the Background Term
- Model error adjustments to the weighting can be wrong
- In particular, most assume some type of linearity
- Non-linear physical processes may break these assumptions and be more complexly interrelated
- A data assimilation system with no model error control is said to be strongly constrained (perfect model assumption)
32 What About Other DA Errors?
- Overlooked issues?
- Data debiasing relative to the DA system reference: it is not the Truth, however it is self-consistent
- DA methodology errors?
  - Assumptions: linearization, Gaussianity, model errors
  - Representation errors (space and time)
  - Poorly known background error covariances
  - Imperfect observational operators
  - Overly aggressive data quality control
  - Historical emphasis on dynamical impact vs. physical impact (synoptic vs. mesoscale?)
33 DA Theory is Still Maturing
- The Future: Lognormal DA (Fletcher and Zupanski, 2006, 2007)
- Gaussian systems typically force lognormal variables to become Gaussian, introducing an avoidable data assimilation system bias
- Lognormal variables: clouds, precipitation, water vapor, emissivities, and many other hydrologic fields
- For a lognormal distribution, the mode ≠ the mean (see the short sketch below)
- Many important variables are lognormally distributed, but Gaussian data assimilation system variables are Gaussian
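A minimal sketch (illustrative numbers only, not from the slides) of the mode-versus-mean point: for a lognormal variable, the sample mean that a Gaussian fit would latch onto sits well above the mode of the distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical lognormal variable (e.g., a cloud-water-like quantity).
mu, sigma = 0.0, 1.0                    # parameters of ln(x)
x = rng.lognormal(mean=mu, sigma=sigma, size=200_000)

analytic_mode = np.exp(mu - sigma**2)    # where the density peaks
analytic_mean = np.exp(mu + sigma**2/2)  # what a Gaussian fit would target

print(f"sample mean   = {x.mean():.3f} (analytic {analytic_mean:.3f})")
print(f"sample median = {np.median(x):.3f}")
print(f"analytic mode = {analytic_mode:.3f}  -> mode != mean for lognormal fields")
```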
34 What Do We Trust for Truth?
Minimize the discrepancy between the model and the observation data over time.
- Candidate 2: the Observational Term
- The non-default condition for the assimilation when
  - data are available, and
  - data are sensitive to the model state, and
  - data are precise (not necessarily accurate), and
  - data are not thrown away by DA quality control methods
35 What Truth Do We Have?
Minimize the discrepancy between the model and the observation data over time.
[Diagram: a spectrum from DATA CENTRIC to MODEL CENTRIC, with TRUTH lying somewhere between the two.]
36 DA Theory is Still Maturing
- A Brief History of DA
- Hand interpolation
- Local polynomial interpolation schemes (e.g., Cressman)
- Use of a first guess, i.e., a background
- Use of an analysis cycle to regenerate a new first guess
- Empirical schemes, e.g., nudging
- Least squares methods
- Variational DA (VAR)
- Sequential DA (KF)
- Monte Carlo approximation to sequential DA (EnsKF)
37 Variational Techniques
Major flavors: 1DVAR (Z), 3DVAR (X,Y,Z), 4DVAR (X,Y,Z,T)
Lorenc (1986) and others
Became the operational scheme from the early 1990s to the present day
Finds the maximum likelihood solution (if Gaussian, etc.); actually it is a minimum variance method
Comes from setting the gradient of the cost function equal to zero
The control variable is x_a
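For reference, the gradient condition in a generic 3DVAR notation (textbook form, using the symbols defined earlier; H is the linearized observation operator):

```latex
\nabla J(x) = B^{-1}(x - x_b) - H^{\mathrm{T}} R^{-1} \bigl[y - h(x)\bigr] = 0
\quad \text{at the analysis } x = x_a
```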
38 Sequential Techniques
Kalman (1960) and many others
These techniques can evolve the forecast error covariance fields; similar in concept to OI
- B is no longer static: B → P^f, the forecast error covariance
- P^a(t_i) is estimated at future times using the model
- K is the Kalman gain
- In the Extended KF, P^a is found by linearizing the model about the nonlinear trajectory of the model between t_{i-1} and t_i
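A compact sketch of one Kalman filter cycle on a hypothetical two-variable linear system (all matrices below are invented for illustration): the forecast step propagates the error covariance with the model, and the update step forms the gain K that blends background and observation.

```python
import numpy as np

# Hypothetical linear model M and observation operator H (illustration only).
M = np.array([[1.0, 0.1],
              [0.0, 1.0]])            # simple linear "forecast" model
H = np.array([[1.0, 0.0]])            # we observe only the first state variable
Q = 0.01 * np.eye(2)                  # model error covariance
R = np.array([[0.5]])                 # observation error covariance

x_a = np.array([1.0, 0.0])            # previous analysis
P_a = 0.2 * np.eye(2)                 # previous analysis error covariance

# Forecast step: propagate the state and the error covariance.
x_f = M @ x_a
P_f = M @ P_a @ M.T + Q

# Update step: Kalman gain and analysis.
y = np.array([1.3])                   # incoming observation
S = H @ P_f @ H.T + R                 # innovation covariance
K = P_f @ H.T @ np.linalg.inv(S)      # Kalman gain
x_a_new = x_f + K @ (y - H @ x_f)
P_a_new = (np.eye(2) - K @ H) @ P_f

print("analysis:", x_a_new)
```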
39 Sequential Techniques
Ensembles can be used in KF-based sequential DA systems.
Ensembles are used to estimate P^f through Gaussian sampling theory.
- f denotes a particular forecast instance; l denotes the reference state forecast
- P^f is estimated at future times using the model
- K model runs are required (Q: How to populate the seed perturbations?)
- Sampling allows for use of approximate solutions
- Eliminates the need to linearize the model (as in the Extended KF)
- No tangent linear or adjoint models are needed
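A short sketch (synthetic ensemble, illustrative sizes) of how P^f is estimated from ensemble forecast perturbations about a reference state:

```python
import numpy as np

rng = np.random.default_rng(1)

n_state, n_ens = 3, 50                          # small state, modest ensemble (illustrative)
reference_like_mean = np.array([280.0, 0.005, 10.0])

# Pretend these are K ensemble forecasts valid at the same time.
ens_forecasts = reference_like_mean + rng.normal(scale=[1.0, 0.001, 2.0],
                                                 size=(n_ens, n_state))

x_ref = ens_forecasts.mean(axis=0)              # reference state (here, the ensemble mean)
perts = ens_forecasts - x_ref                   # forecast perturbations

# Sample estimate of the forecast error covariance P^f.
P_f = perts.T @ perts / (n_ens - 1)
print("estimated P^f:\n", P_f)
```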
40 Sequential Techniques
- Notes on EnsKF-based sequential DA systems
  - EnsKFs are an approximation
  - The underlying theory is the KF
  - Assumes Gaussian distributions
  - Many ensemble samples are required
  - Can significantly improve P^f
  - Where does H fit in? Is it fully resolved?
  - What about the Filter aspects?
- Future Directions
  - Research using hybrid EnsKF-Var techniques
41 Sequential Techniques
Zupanski (2005): the Maximum Likelihood Ensemble Filter (MLEF)
- A structure-function version of ensemble-based DA (Note: does not use sampling theory, and is more similar to a variational DA scheme using principal component analysis (PCA))
- N_E is the number of ensemble members; S is the state-space dimension
- Each ensemble member is carefully selected to represent the degrees of freedom of the system
- A square-root filter is built in to the algorithm assumptions
42 Where is M in all of this?
- 3DDA techniques have no explicit model time tendency information (no M used); it is all done implicitly with cycling techniques, typically focusing only on the P^f term
- 4DDA uses M explicitly, via the model sensitivities, L, and model adjoints, L^T, as a function of time
- Kalman smoothers (e.g., also 4DEnsKS) would likewise also need to estimate L and L^T
43 4DVAR Revisited (for an example see Poster NPOESS P1.16 by Jones et al.)
- Automatically propagates P^f within the cycle; however, it cannot save the result for the next analysis cycle (the memory of the B information becomes lost in the next cycle) (Thépaut et al., 1993)
- L^T is the adjoint, which is integrated from t_i back to t_0
- Adjoints are NOT the model running in reverse, but merely the model sensitivities being integrated in reverse order, thus all adjoints appear to function backwards. Think of it as accumulating the impacts back toward the initial control variables.
44 Minimization Process
The Jacobian (gradient) of the cost function is used in the minimization procedure.
The minimum is at ∂J/∂x = 0.
Issues: Is it a global minimum? Are we converging rapidly or slowly?
[Figure: the cost function J plotted against x, with its minimum near the TRUTH.]
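A toy sketch of the minimization process on a hypothetical two-variable problem: plain gradient descent on a 3DVAR-style quadratic cost, stopping once ∂J/∂x is essentially zero. Operational systems use conjugate-gradient or quasi-Newton minimizers, but the convergence questions raised above are the same.

```python
import numpy as np

# Hypothetical 2-variable problem (illustration only).
x_b = np.array([1.0, 2.0])                    # background
B   = np.array([[1.0, 0.3], [0.3, 1.0]])      # background error covariance
H   = np.array([[1.0, 0.0]])                  # observe the first variable only
R   = np.array([[0.25]])
y   = np.array([1.8])

B_inv, R_inv = np.linalg.inv(B), np.linalg.inv(R)

def grad_J(x):
    """Gradient of J(x) = 1/2 (x-xb)^T B^-1 (x-xb) + 1/2 (y-Hx)^T R^-1 (y-Hx)."""
    return B_inv @ (x - x_b) - H.T @ R_inv @ (y - H @ x)

x = x_b.copy()                                # first guess = background
for it in range(500):
    g = grad_J(x)
    if np.linalg.norm(g) < 1e-8:              # converged: gradient ~ 0
        break
    x = x - 0.1 * g                           # fixed-step gradient descent

print(f"converged in {it} iterations, analysis = {x}")
```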
45 Ensembles: Flow-dependent Forecast Error Covariance and the Spread of Information from Observations
(From M. Zupanski)
[Figure: grid-point values at times t0, t1, t2, with isotropic correlations drawn around obs1 and obs2.]
Geographically distant observations can bring more information than close-by observations, if they are in a dynamically significant region.
46 Preconditioning the Space
(From M. Zupanski)
Preconditioners transform the variable space (x → a transformed control variable) so that fewer iterations are required while minimizing the cost function.
Result: faster convergence.
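One common control-variable transform (a generic example; not necessarily the transform used on the original slide) preconditions with the square root of B, which turns the background term into a simple identity-weighted quadratic:

```latex
x - x_b = B^{1/2} v
\quad\Longrightarrow\quad
J(v) = \frac{1}{2}\, v^{\mathrm{T}} v
     + \frac{1}{2}\,\bigl[y - h(x_b + B^{1/2} v)\bigr]^{\mathrm{T}} R^{-1} \bigl[y - h(x_b + B^{1/2} v)\bigr]
```

With this change of variables the Hessian of the background term is the identity, so the minimization is much better conditioned.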
47 Incremental VAR
Courtier et al. (1994); the most common 4D framework in operational use
- The incremental form performs a linear minimization within a lower-dimensional space (the inner-loop minimization)
- The outer-loop minimization is at the full model resolution (non-linear physics are added back in at this stage)
- Benefits: smooths the cost function and assures better minimization behaviors; reduces the need for explicit preconditioning
- Issues: extra linearizations occur; it is an approximate form of VAR DA
48 Types of DA Solution Spaces
- Model space (x)
- Physical space (y)
- Ensemble sub-space, e.g., the Maximum Likelihood Ensemble Filter (MLEF)
- Types of Ensemble Kalman Filters
  - Perturbed observations (or stochastic)
  - Square root filters (i.e., analysis perturbations are obtained from the square root of the Kalman Filter analysis covariance)
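A compact sketch (synthetic numbers, small dimensions) of the perturbed-observation (stochastic) EnKF analysis step listed above: each member is updated with its own noisy copy of the observation, using the ensemble-estimated P^f in the gain.

```python
import numpy as np

rng = np.random.default_rng(2)
n_state, n_ens = 4, 40                         # illustrative sizes only

# Synthetic forecast ensemble (stand-in for real model forecasts).
x_true = np.array([1.0, 2.0, 3.0, 4.0])
X_f = x_true + rng.normal(scale=0.5, size=(n_ens, n_state))

H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])           # observe variables 1 and 3
R = 0.1 * np.eye(2)
y = H @ x_true + rng.multivariate_normal(np.zeros(2), R)

# Ensemble-estimated forecast error covariance and Kalman gain.
x_mean = X_f.mean(axis=0)
A = X_f - x_mean
P_f = A.T @ A / (n_ens - 1)
K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)

# Perturbed-observation update: each member sees its own noisy observation.
X_a = np.empty_like(X_f)
for k in range(n_ens):
    y_k = y + rng.multivariate_normal(np.zeros(2), R)
    X_a[k] = X_f[k] + K @ (y_k - H @ X_f[k])

print("analysis mean:", X_a.mean(axis=0))
```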
49 How are Data used in Time?
[Timeline diagram: observations arrive within the assimilation time window; they are compared to the cloud resolving model through the observation model, and the forecast extends beyond the window.]
50 A Smoother Uses All Data Available in the Assimilation Window (a Simultaneous Solution)
[Timeline diagram: all observations in the assimilation time window are used at once; the observation model links them to the cloud resolving model, and the forecast follows the window.]
51 A Filter Sequentially Assimilates Data as it Becomes Available in each Cycle
[Timeline diagram: observations are assimilated one cycle at a time as they become available within the assimilation time window; the observation model links them to the cloud resolving model.]
54 A Filter Sequentially Assimilates Data as it Becomes Available in each Cycle
[Timeline diagram: the same sequential cycling, now with the forecast extending beyond the assimilation window.]
- Cycle Physics Barriers
- What Can Overcome the Barrier?
  - Linear Physics Processes, and
  - Propagated Forecast Error Covariances
55 Data Assimilation
- Conclusions
- A broad, dynamic, evolving, foundational science field!
- Flexible unified frameworks, standards, and funding will improve training and education
- Continued need for advanced DA systems for research purposes (non-OPS)
- Can share OPS framework components, e.g., the JCSDA CRTM
For more information:
- Great NWP DA review paper (by Mike Navon): http://people.scs.fsu.edu/navon/pubs/JCP1229.pdf
- ECMWF DA training materials: http://www.ecmwf.int/newsevents/training/rcourse_notes/
- JCSDA DA workshop: http://www.weatherchaos.umd.edu/workshop/
Thanks! (jones@cira.colostate.edu)
56 Backup Slides
- Or: Why Observationalists and Modelers See Things Differently
- No offense meant for either side
57 What Approach Should We Use?
[Diagram: a spectrum from DATA CENTRIC to MODEL CENTRIC, with TRUTH between them.]
60 MODEL CENTRIC FOCUS
- Focus on B: background error improvements are needed
- x_b: associated background states and cycling are more heavily emphasized
- DA method selection tends toward sequential estimators, filters, and improved representation of the forecast model error covariances
- E.g., Ensemble Kalman Filters and other ensemble filter systems
61 What Approach Should We Use?
This is not to say that all model-centric improvements are bad.
[Diagram: a spectrum from DATA CENTRIC to MODEL CENTRIC, with TRUTH between them.]
64 DATA CENTRIC FOCUS
- Focus on h: observational operator improvements are needed
- M_{0,i}(x_0): model state capabilities and independent experimental validation are more heavily emphasized
- DA method selection tends toward smoothers (less focus on model cycling), with more emphasis on data quantity, improvements in the data operator, and understanding of data representation errors
- e.g., 4DVAR systems
65 DUAL-CENTRIC FOCUS
Best of both worlds?
Solution: ensemble-based forecast covariance estimates combined with a 4DVAR smoother for research and a 4DVAR filter for operations?
Several frameworks to combine the two approaches are in various stages of development now.
66 What Have We Learned?
- Your research objective is CRITICAL to making the right choices
- Operational choices may supersede good research objectives
- Computational speed is always critical for operational purposes
- Accuracy is critical for research purposes
[Diagram: a spectrum from DATA CENTRIC to MODEL CENTRIC, with TRUTH between them.]
67 What About Model Error Errors?
"I just can't run like I used to."
A strongly constrained system? Can data over-constrain?
[Cartoon: "Little Data People"]
68 What About Model Error Errors?
"Well, no one's perfect."
A strongly constrained system? Can data over-constrain?
69 Optimal Interpolation (OI)
Eliassen (1954), Bengtsson et al. (1981), Gandin (1963)
Became the operational scheme from the early 1980s to the early 1990s
OI merely means finding the optimal weights, W; a better name would have been "statistical interpolation"
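A minimal sketch (hypothetical two-gridpoint background, one observation) of what "finding the optimal weights W" means; W here takes the standard minimum-variance form B Hᵀ (H B Hᵀ + R)⁻¹:

```python
import numpy as np

# Hypothetical two-gridpoint background and a single observation at gridpoint 1.
x_b = np.array([15.0, 17.0])                   # background values
B = np.array([[1.0, 0.6],
              [0.6, 1.0]])                     # correlated background errors
H = np.array([[1.0, 0.0]])                     # observation operator
R = np.array([[0.25]])                         # observation error covariance
y = np.array([16.0])                           # the observation

# Optimal (minimum-variance) weights and the OI analysis.
W = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
x_a = x_b + (W @ (y - H @ x_b))

print("weights W:\n", W)                       # gridpoint 2 is also corrected
print("analysis:", x_a)
```

Note that the background error correlation spreads the correction to the unobserved gridpoint, which is exactly the "spread of information" behavior discussed earlier.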
70 What is a Hessian?
A rank-2 square matrix containing the partial derivatives of the Jacobian:
  G(f)_{ij}(x) = D_i D_j f(x)
The Hessian is used in some minimization methods, e.g., quasi-Newton.
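A small sketch (toy quadratic cost, central finite differences; everything here is invented for illustration) of what the Hessian looks like in practice. Quasi-Newton methods such as BFGS build up an approximation to this matrix rather than forming it explicitly.

```python
import numpy as np

def J(x):
    """Toy quadratic cost function (illustration only)."""
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    return 0.5 * x @ A @ x

def hessian_fd(f, x, eps=1e-5):
    """Central finite-difference estimate of the Hessian G_ij = D_i D_j f(x)."""
    n = x.size
    G = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i] * eps, np.eye(n)[j] * eps
            G[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * eps**2)
    return G

print(hessian_fd(J, np.array([0.3, -0.7])))    # recovers approximately [[3, 1], [1, 2]]
```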
71 The Role of the Adjoint, etc.
Adjoints are used in the cost function minimization procedure.
But first: Tangent Linear Models are used to approximate the non-linear model behaviors:
  L\,\delta x \approx \bigl[M(x_2) - M(x_1)\bigr] / \lambda, \qquad x_2 = x_1 + \lambda\,\delta x
where
- L is the linear operator of the perturbation model
- M is the non-linear forward model
- \lambda is the perturbation scaling factor
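A small sketch (toy non-linear model, invented for illustration) of the finite-difference check implied by the relation above: the tangent linear operator applied to a perturbation should converge to the scaled difference of two non-linear runs as the scaling factor shrinks.

```python
import numpy as np

def M(x):
    """Toy non-linear 'forward model' (illustration only)."""
    return np.array([x[0] * x[1], np.sin(x[0]) + x[1] ** 2])

def L(x, dx):
    """Tangent linear model of M, linearized about x, applied to a perturbation dx."""
    J = np.array([[x[1],          x[0]],
                  [np.cos(x[0]),  2.0 * x[1]]])
    return J @ dx

x1 = np.array([0.5, 1.2])
dx = np.array([0.01, -0.02])

for lam in (1.0, 0.1, 0.01):
    x2 = x1 + lam * dx
    fd = (M(x2) - M(x1)) / lam          # scaled non-linear difference
    tl = L(x1, dx)                      # tangent linear prediction
    print(f"lambda={lam:5.2f}  max diff = {np.max(np.abs(fd - tl)):.2e}")
```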
72 Useful Properties of the Adjoint
  \langle Lx, Lx \rangle \equiv \langle L^{\mathrm{T}} L x, x \rangle
L^T is the adjoint operator of the perturbation model.
Typically the adjoint and the tangent linear operator can be automatically created using automated compilers. For a code statement y = \phi(x_1, \ldots, x_n, y), the corresponding adjoint statements are
  \hat{x}_i = \hat{x}_i + \frac{\partial \phi}{\partial x_i}\,\hat{y}, \qquad \hat{y} = \frac{\partial \phi}{\partial y}\,\hat{y}
where \hat{x}_i and \hat{y} are the adjoint variables.
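A short sketch of the adjoint identity above as it is commonly checked in DA codes (the "dot-product test"); the operator here is an explicit random matrix purely for illustration, so its adjoint is simply the transpose.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy tangent linear operator L and its adjoint L^T (an explicit matrix here;
# real adjoint models are hand- or automatically-derived code, not stored matrices).
L = rng.normal(size=(5, 4))
x = rng.normal(size=4)

Lx = L @ x
lhs = Lx @ Lx              # <Lx, Lx>
rhs = (L.T @ Lx) @ x       # <L^T L x, x>

print(f"<Lx, Lx>     = {lhs:.12f}")
print(f"<L^T Lx, x>  = {rhs:.12f}")   # identical up to round-off
```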
73 Useful Properties of the Adjoint
  \langle Lx, Lx \rangle \equiv \langle L^{\mathrm{T}} L x, x \rangle \quad (L^T is the adjoint operator of the perturbation model)
Typically the adjoint and the tangent linear operator can be automatically created using automated compilers.
Of course, automated methods fail for complex variable types (see Jones et al., 2004). E.g., how can the compiler know when the variable is complex, when codes are decomposed into real and imaginary parts as common practice? (It can't.)