Title: 18 hrs/100 mph/-93 mb
1 Verification of deterministic tropical cyclone intensity forecasts: Moving beyond mean absolute error
Jonathan Moskaitis, Massachusetts Institute of Technology
28th AMS Conference on Hurricanes and Tropical Meteorology, 1 May 2008
Wilma, 2005: A particularly challenging intensity forecast
18 hrs/100 mph/-93 mb
2 Verification: An integral part of the forecast process
(1) Immediate role: Quantify the quality of the relationship between a sample of forecasts and the corresponding sample of observations (forecast quality)
(2) Broader role: Part of a feedback loop of forecast system development
Process of making forecasts better
[Diagram: feedback loop linking Model, Forecast set, and Verification]
5 Verification: An integral part of the forecast process
(2) Broader role: Part of a feedback loop of forecast system development
The model is driven to produce forecast samples that optimize a particular verification measure.
[Diagram: feedback loop linking Model, Forecast set, and Verification]
For tropical cyclone intensity prediction systems, this particular verification measure is mean absolute error (MAE).
6 Questions to address
(1) Immediate role: Quantify the quality of the relationship between a sample of forecasts and the corresponding sample of observations (forecast quality)
MAE is not a comprehensive measure of forecast quality. If it alone is used to quantify TC intensity forecast quality, what insights about TC intensity forecast quality are missed?
(2) Broader role: Part of a feedback loop of forecast system development
What are the consequences of demanding that TC intensity forecast systems minimize the MAE of their predictions?
Both questions can be addressed through a comprehensive evaluation of the quality of operational tropical cyclone intensity forecasts, using distributions-oriented techniques (Murphy and Winkler 1987; Murphy et al. 1989; Brooks and Doswell III 1996).
8 Mean absolute error | Error Distribution | Joint Distribution
Verification data sample: Atlantic basin, 2001-2005
Mean absolute error: the mean of |forecast - observation| over the verification data sample
[Figure: Mean absolute error (kt) vs. lead time (h) for the four forecast systems below]
DSHP: Decay-SHIPS statistical-dynamical model
GFDL: GFDL/URI coupled hurricane-ocean model
OFCL: National Hurricane Center official forecast
SHF5: 5-day SHIFOR statistical model
9 Mean absolute error | Error Distribution | Joint Distribution
Verification data sample: Atlantic basin, 2001-2005
Focus on one forecast system at one lead time: OFCL, 36 h
MAE = 12.0 kt
[Figure: Mean absolute error (kt) vs. lead time (h) for DSHP, GFDL, OFCL, and SHF5]
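As a minimal sketch of the measure discussed on this slide, MAE can be computed from paired forecasts and observations as below. The function name `mean_absolute_error` and the intensity values are hypothetical and purely illustrative; they are not the talk's verification data.

```python
def mean_absolute_error(forecasts, observations):
    """Return the mean of |forecast - observation| over paired samples."""
    if len(forecasts) != len(observations):
        raise ValueError("forecasts and observations must be paired")
    if not forecasts:
        raise ValueError("verification data sample is empty")
    total = sum(abs(f - x) for f, x in zip(forecasts, observations))
    return total / len(forecasts)


# Hypothetical intensity pairs in kt (illustrative only).
forecast_kt = [45, 60, 85, 100, 120]
observed_kt = [50, 55, 100, 90, 140]

mae = mean_absolute_error(forecast_kt, observed_kt)
print(f"MAE = {mae:.1f} kt")
```

Note that the single number returned says nothing about the sign of the errors or about which intensities were forecast poorly, which is the limitation the following slides explore.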
10 Mean absolute error | Error Distribution | Joint Distribution
OFCL 36 h absolute error distribution
Relative frequency distribution of absolute error, |forecast - observation|
- Conflates errors of the same magnitude but opposite sign
OFCL 36 h error distribution
Relative frequency distribution of error, forecast - observation
- Still ambiguous: e.g. (forecast, observation) pairs of (30,50) and (130,150) are both -20 kt errors
[Figures: relative frequency vs. |forecast - observation| and relative frequency vs. forecast - observation]
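The two ambiguities noted above can be illustrated with a short sketch. The (forecast, observation) pairs reuse the slide's (30,50) and (130,150) example plus two hypothetical mirror-image pairs; the helper `relative_frequency` is an illustrative name, not part of any verification package.

```python
from collections import Counter


def relative_frequency(values):
    """Map each distinct value to its relative frequency in the sample."""
    counts = Counter(values)
    return {value: count / len(values) for value, count in counts.items()}


# (forecast, observation) pairs in kt; the last two are hypothetical.
pairs = [(30, 50), (130, 150), (70, 50), (150, 130)]

errors = [f - x for f, x in pairs]         # signed error: forecast - observation
abs_errors = [abs(e) for e in errors]      # absolute error

error_dist = relative_frequency(errors)
abs_error_dist = relative_frequency(abs_errors)

print(error_dist)      # signed errors separate under- and over-forecasts
print(abs_error_dist)  # absolute errors collapse them into one bin
```

The first two pairs land in the same -20 kt bin even though one describes a tropical storm and the other a major hurricane, which is the information the joint distribution retains.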
11 Mean absolute error | Error Distribution | Joint Distribution
OFCL: Joint distribution of 36 h forecasts and observations
Joint relative frequency distribution of forecasts and observations
- Expresses all information in the verification data sample (VDS)
- Fundamental instrument of distributions-oriented verification
[Figure: joint distribution with Forecast and Observation axes. A dot is drawn for each (f,x) pair that occurs in the VDS, colored according to relative frequency]
12 Mean absolute error | Error Distribution | Joint Distribution
OFCL: Joint distribution of 36 h forecasts and observations
[Figure: the same joint distribution, annotated "Wilma 2005". Any point on the red lines has 20 kt absolute error (AE)]
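A rough sketch of how a joint relative frequency distribution might be tabulated from a verification data sample follows. The sample values and the name `joint_relative_frequency` are assumptions made for illustration. The sketch also shows that MAE can be recovered from the joint distribution, and picks out the pairs that would sit on the 20 kt absolute error lines.

```python
from collections import Counter


def joint_relative_frequency(forecasts, observations):
    """Relative frequency of each (f, x) pair in the verification sample."""
    pairs = list(zip(forecasts, observations))
    counts = Counter(pairs)
    return {pair: count / len(pairs) for pair, count in counts.items()}


# Hypothetical 36 h forecasts and verifying observations, in kt.
forecasts = [50, 50, 65, 65, 65, 100, 100, 120]
observations = [45, 45, 65, 85, 65, 100, 120, 100]

joint = joint_relative_frequency(forecasts, observations)

# MAE is a weighted sum over the joint distribution, so the joint
# distribution contains everything MAE does (but not the reverse).
mae_from_joint = sum(p * abs(f - x) for (f, x), p in joint.items())

# Pairs lying on the lines |f - x| = 20 kt.
on_20kt_lines = sorted(pair for pair in joint if abs(pair[0] - pair[1]) == 20)

print(mae_from_joint, on_20kt_lines)
```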
13-18 Evolution of the joint distributions in lead time
[Figures: joint distributions of forecasts and observations for SHF5, DSHP, OFCL, and GFDL, one slide per lead time: 0 h, 24 h, 48 h, 72 h, 96 h, and 120 h]
19 Evolution of the joint distributions in lead time
- Initially, the distribution widens about a diagonal major axis
- Then the major axis starts rotating into the vertical
- The distribution contracts about a nearly vertical major axis
[Figure: SHF5 model at 24, 72, and 120 h]
20 Evolution of the joint distributions in lead time
Increasing conditional bias with lead time
[Figure: SHF5 model at 24, 72, and 120 h]
21 Evolution of the joint distributions in lead time
Why does the conditional bias increase with lead time?
- The range of forecasted values decreases with lead time, but the range of observations does not
- This must be the case if forecast samples are to minimize mean absolute error
[Figure: SHF5 model at 24, 72, and 120 h]
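One possible way to put a number on the conditional bias described above is to stratify the mean error by observed intensity. This is a sketch under the assumption that conditioning on the observation is the intended sense here; the sample values, bin edges, and the name `mean_error_by_observation` are all hypothetical.

```python
from statistics import mean


def mean_error_by_observation(forecasts, observations, bin_edges):
    """Mean (forecast - observation) within each observed-intensity bin."""
    result = {}
    for low, high in zip(bin_edges[:-1], bin_edges[1:]):
        errors = [f - x for f, x in zip(forecasts, observations) if low <= x < high]
        if errors:  # skip bins with no observations
            result[(low, high)] = mean(errors)
    return result


# Hypothetical long-lead sample (kt): forecasts span 55-95 kt while
# observations span 30-130 kt.
observations = [30, 40, 60, 80, 110, 130]
forecasts = [55, 60, 70, 80, 90, 95]

overall_bias = mean(f - x for f, x in zip(forecasts, observations))
conditional_bias = mean_error_by_observation(forecasts, observations, [0, 50, 100, 200])

print(overall_bias)       # unconditional bias is zero in this sample
print(conditional_bias)   # weak storms over-forecast, strong storms under-forecast
```

In this made-up sample the overall bias vanishes, yet weak storms are over-forecast and intense storms under-forecast, the pattern a near-vertical joint distribution implies.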
23 Consequences of MAE minimization
Think in terms of probabilistic prediction
- Forecast uncertainty increases with lead time, saturating such that the forecast probability distribution is the same as the climatological distribution
[Figure: forecast probability distribution vs. lead time, from the initial probability distribution to the climatological probability distribution]
24 Consequences of MAE minimization
- The median of the forecast probability distribution is the AE-minimizing deterministic forecast
[Figure: AE-minimizing deterministic forecast trajectory vs. lead time, from the initial probability distribution to the climatological probability distribution]
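The statement that the median minimizes absolute error can be checked numerically with a small sketch. The outcomes below are a hypothetical, equally weighted stand-in for a skewed forecast probability distribution, and `expected_absolute_error` is an illustrative name.

```python
from statistics import median


def expected_absolute_error(point_forecast, outcomes):
    """Average |point_forecast - x| over equally likely outcomes x."""
    return sum(abs(point_forecast - x) for x in outcomes) / len(outcomes)


# Hypothetical equally likely intensities (kt), skewed toward strong storms.
outcomes = [40, 45, 50, 55, 60, 80, 120]

# Brute-force search over candidate deterministic forecasts, every 5 kt.
candidates = range(30, 131, 5)
best = min(candidates, key=lambda c: expected_absolute_error(c, outcomes))

sample_median = median(outcomes)
sample_mean = sum(outcomes) / len(outcomes)

print(best, sample_median)
print(expected_absolute_error(sample_median, outcomes),
      expected_absolute_error(sample_mean, outcomes))
```

The brute-force search lands on the sample median rather than the mean, so a forecast system tuned for MAE is pushed toward the median of its predictive distribution.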
25 Consequences of MAE minimization
The set of MAE-minimizing forecast trajectories converges toward the median of the climatological distribution.
[Figure: AE-minimizing deterministic forecast trajectories vs. lead time, starting from several initial probability distributions]
26 Consequences of MAE minimization
This convergence toward the climatological median is why the range of forecasted values decreases with lead time.
[Figure: SHF5 joint distributions at 24 h and 120 h]
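The argument above can be illustrated with a toy model. The sketch assumes, purely for illustration, that the forecast distribution is Gaussian and relaxes exponentially toward a Gaussian climatology, so its median equals its mean. The climatological median, spread, and relaxation time scale are invented numbers, not values from the talk.

```python
import math

# Hypothetical climatology and relaxation time scale (assumptions).
CLIM_MEDIAN_KT = 60.0
CLIM_SPREAD_KT = 25.0
TAU_H = 72.0


def forecast_median(initial_kt, lead_h):
    """Median of the toy forecast distribution: decays toward climatology."""
    decay = math.exp(-lead_h / TAU_H)
    return CLIM_MEDIAN_KT + (initial_kt - CLIM_MEDIAN_KT) * decay


def forecast_spread(lead_h):
    """Spread of the toy forecast distribution: grows, then saturates."""
    return CLIM_SPREAD_KT * math.sqrt(1.0 - math.exp(-2.0 * lead_h / TAU_H))


initial_intensities = [30.0, 60.0, 90.0, 120.0]
lead_times = [0, 24, 72, 120]

# Range of the AE-minimizing (median) forecasts at each lead time.
forecast_range = {}
for lead in lead_times:
    medians = [forecast_median(x0, lead) for x0 in initial_intensities]
    forecast_range[lead] = max(medians) - min(medians)

for lead in lead_times:
    print(lead, round(forecast_range[lead], 1), round(forecast_spread(lead), 1))
```

In this toy setting the spread of the forecast distribution grows toward the climatological spread while the range of median forecasts shrinks, which mirrors the narrowing of forecast values seen in the SHF5 joint distributions.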
27 Summary and Conclusions
(1) Immediate role: Quantify the quality of the relationship between a sample of forecasts and the corresponding sample of observations (forecast quality)
MAE is not a comprehensive measure of forecast quality. If it alone is used to quantify TC intensity forecast quality, what insights about TC intensity forecast quality are missed?
28 Summary and Conclusions
[Figure: OFCL joint distribution of forecasts and observations, 36 h; MAE = 12.0 kt]
Analysis of joint distributions showed increasing conditional bias with lead time.
29 Summary and Conclusions
(2) Broader role: Part of a feedback loop of forecast system development
What are the consequences of demanding that TC intensity forecast systems minimize the MAE of their predictions?
30 Summary and Conclusions
Forecasts converge toward the median of the climatological distribution with lead time, resulting in conditional bias.
31 Summary and Conclusions
MAE verification is not inherently bad; it is just a limited means of quantifying forecast quality.
For further details, see my extended abstract, or:
Moskaitis, J. R., 2008: A case study of deterministic forecast verification: Tropical cyclone intensity. Weather and Forecasting, in review. (available at web.mit.edu/jonmosk/www)