Title: Bias Correction Methods Adjusting Moments
1Bias Correction Methods Adjusting Moments
- Bo Cui, Zoltan Toth, Yuejian Zhu, Dingchen Hou, and Richard Wobus
- Environmental Modeling Center, NCEP/NWS
- SAIC at Environmental Modeling Center, NCEP/NWS
2Acknowledgements
Zoltan Toth, Yuejian Zhu, Dingchen Hou, Richard Wobus
3Outline
- Tasks and Goals
- Bias-Correction Algorithm Adjusting Moments
- Experimental Design
- Ensemble Forecast Verification
- Future Plans
4Ensemble Postprocessing
- NWP models and ensemble formation are imperfect
- Deficiencies due to various problems in NWP models
- Systematic errors in analysis, both observation-induced and model-related
- Ensemble formation
- Inappropriate initial spread
- Lack of representation of model-related uncertainty
- Limited ensemble size
- Known model/ensemble problems are addressed at their sources, but no perfect solution exists
- Systematic errors remain and cause biases in the 1st and 2nd moments of the ensemble distribution
5Tasks and Goals
- Tasks
- Develop and implement a statistical post-processing scheme to reduce the biases in ensemble forecasts (height, temperature and other variables)
- Correct both the 1st and 2nd moments of the ensemble
- Goals
- Bias-corrected forecasts will have reduced or no bias with respect to the verifying analysis fields, given on the model grid
6Moment Adjustment
FIRST MOMENT: B = difference between ensemble mean forecast and verifying analysis
SECOND MOMENT: R = ratio between RMS error of ensemble mean and ensemble spread
1st moment adjustment: Ensemble mean - B
2nd moment adjustment: (Ensemble mean - B) + (Ensemble forecast - Ensemble mean) × R
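The two adjustments can be illustrated at a single grid point and lead time. This is a minimal sketch, not the operational code: it assumes B is the forecast mean minus the analysis and R is the RMS error divided by the spread, so that B is subtracted and the member deviations are multiplied by R. The function name `adjust_moments` and its arguments are hypothetical.

```python
from statistics import fmean


def adjust_moments(members, bias, ratio=1.0):
    """Adjust the 1st and 2nd moments of an ensemble at one grid point.

    members: member forecasts at one grid point and lead time
    bias:    B, assumed to be (ensemble mean forecast - verifying analysis)
    ratio:   R, assumed to be (RMS error of ensemble mean / ensemble spread)
    """
    mean = fmean(members)
    corrected_mean = mean - bias  # 1st moment: shift the mean by -B
    # 2nd moment: rescale each member's deviation from the mean by R
    return [corrected_mean + (m - mean) * ratio for m in members]


# Illustrative numbers only: two members, B = 0.5, R = 2.
print(adjust_moments([1.0, 3.0], bias=0.5, ratio=2.0))
```

With `ratio=1.0` only the first moment is corrected, which matches the scope of the experiments described on the later slides.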
7Implementation Facts
- Bias assessment carried out separately at each
- forecast lead time
- individual grid point
- ensemble mean, GFS and ensemble control forecasts
- Bias correction tests applied on
- all ensemble member forecasts
- for 00Z initial cycle only
- 2.5x2.5 lat/lon resolution
- 500 mb height, 850 mb temperature
8Alternatives or Refinements of Bias-Correction Algorithm
- Adaptive methods
- Consider most recent past data with decaying averaging
- Use data from surrounding grid-points (with a Gaussian weighting function)
- Use large (climatological) sample data if available and forecast system is stable
- Adjust temporal/spatial sampling domain to optimize performance
- Construct cumulative frequency distribution to match that of observed, QPF calibration (Yuejian Zhu)
- Regime dependent method (Jun Du)
- Use correlation coefficients between circulation field today vs. that in recent past to determine weights given to data in estimating bias
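As an illustration of the surrounding-grid-point refinement, the sketch below averages bias estimates from neighbouring points with Gaussian weights. The slides do not give the weighting formula, so the form exp(-d²/(2σ²)), the distance units and the names `gaussian_weighted_bias` and `sigma` are all assumptions made for this example.

```python
import math


def gaussian_weighted_bias(biases, distances, sigma):
    """Weighted mean of neighbouring bias estimates.

    biases:    bias estimates at the target point and its neighbours
    distances: distance of each point from the target (hypothetical units)
    sigma:     width of the Gaussian weighting function (an assumed tuning knob)
    """
    weights = [math.exp(-0.5 * (d / sigma) ** 2) for d in distances]
    total = sum(weights)
    # Normalise so that the weights sum to one
    return sum(w * b for w, b in zip(weights, biases)) / total


# Illustrative numbers: the target point (distance 0) and two neighbours.
print(gaussian_weighted_bias([1.0, 2.0, 4.0], [0.0, 1.0, 2.0], sigma=1.0))
```

A small `sigma` keeps the estimate local; a large one approaches a plain average over the neighbourhood.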
9Experimental Design
Implementation of decaying averaging for 1st moment bias
[Timeline figure: T0 - 46 days, T0 - 16 days, T0]
Decaying-average mean error = (1 - w) × prior time mean error (t.m.e.) + w × (f - a)
a) Prior estimate (start-up procedure): choose T0 as the current date (00Z) and calculate the time mean errors between T0 - 46 and T0 - 16 days.
b) Update: the prior estimate of the average state is multiplied by a factor 1 - w (< 1). Then the most recent verification error (f - a) is added to the decaying average for each lead time with a weight of w.
c) Cycling: repeat step (b) every day.
Three experiments, with w of 1%, 2% and 10%.
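Steps (a) to (c) can be sketched for one grid point and one lead time as follows. This is a minimal illustration under stated assumptions: the weight is given as a fraction (for example 0.02, reading the "2" experiment as 2%), the start-up prior is a plain mean of the errors in the start-up window, and the function names are hypothetical.

```python
from statistics import fmean


def startup_prior(past_errors):
    """Step (a): time mean of the errors (f - a) over the start-up window."""
    return fmean(past_errors)


def update_decaying_average(prior, forecast, analysis, w):
    """Step (b): down-weight the prior by (1 - w), add the newest error with weight w."""
    return (1.0 - w) * prior + w * (forecast - analysis)


def run_cycle(prior, daily_pairs, w):
    """Step (c): repeat the update once per day over (forecast, analysis) pairs."""
    bias = prior
    for forecast, analysis in daily_pairs:
        bias = update_decaying_average(bias, forecast, analysis, w)
    return bias


# Illustrative numbers only.
prior = startup_prior([1.0, 2.0, 3.0])
print(run_cycle(prior, [(5.0, 3.0), (4.0, 3.5)], w=0.02))
```

The bias estimate at each lead time is a single number per grid point, so only the previous day's estimate has to be stored between cycles.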
10Experimental Design
Centered running mean error test for 1st moment bias
[Timeline figure: T0 - 15 days, T0, T0 + 15 days]
- Define the +/- 15 day time average as the bias. Use this bias estimate (with dependent data) as the optimal benchmark.
- Implementation
- Four experiments: optimal test and three decaying averaging experiments (1%, 2% and 10% weight)
- 8-month period for these experiments (Spring and Summer 2004)
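The centered running mean used as the benchmark can be sketched for a daily error series at one grid point and lead time. Two details are assumptions, since the slides do not state them: the window includes day T0 itself (31 values for a 15-day half window), and days without a full window are left undefined. The function name is hypothetical.

```python
def centered_running_mean(errors, half_window=15):
    """Mean of errors[t - half_window : t + half_window + 1] for each day t.

    Returns None for days too close to either end of the series to have
    a full window.
    """
    n = len(errors)
    result = []
    for t in range(n):
        if t < half_window or t + half_window >= n:
            result.append(None)  # incomplete window
        else:
            window = errors[t - half_window : t + half_window + 1]
            result.append(sum(window) / len(window))
    return result


# Illustrative series with a half window of 1 day.
print(centered_running_mean([0.0, 1.0, 2.0, 3.0, 4.0], half_window=1))
```

Because the window reaches into the future relative to T0, this estimate cannot be produced in real time, which is why it serves only as a benchmark.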
11Temporal Cross Section 500 mb Height Time Mean Error (40°N, 95°W, Jan. to Aug. 2004)
[Figure: four panels (OPT, W1, W2, W10); dates marked on the panels: May 22, Jun. 11, Jun. 22]
12Temporal Cross Section 850 mb Temp. Time Mean Error (40°N, 95°W, Jan. to Aug. 2004)
[Figure: four panels (OPT, W1, W2, W10); dates marked on the panels: May 1, May 10, Jun. 2]
13Ensemble Forecasts Verification
- Verification of ensemble mean
- 500 mb height and 850 mb temperature
- Verification domains
- NH, SH and Tropics
- Verification data set
- GFS final analysis
- Verification scores
- AC: pattern anomaly correlation coefficient
- RMS: root mean square error of ensemble mean
- ROC: relative operating characteristics
- RPSS: ranked probability skill score
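Two of the scores above, RMS error and pattern anomaly correlation, can be sketched for an ensemble-mean field flattened to a list of grid-point values. The slides do not give the exact formulas used, so this example assumes the uncentered form of the anomaly correlation and no latitude (area) weighting; the climatology argument and function names are hypothetical.

```python
import math


def rms_error(forecast, analysis):
    """Root mean square difference between forecast and analysis values."""
    n = len(forecast)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, analysis)) / n)


def anomaly_correlation(forecast, analysis, climatology):
    """Uncentered correlation between forecast and analysis anomalies."""
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    a_anom = [a - c for a, c in zip(analysis, climatology)]
    numerator = sum(x * y for x, y in zip(f_anom, a_anom))
    denominator = math.sqrt(sum(x * x for x in f_anom) * sum(y * y for y in a_anom))
    return numerator / denominator


# Illustrative values at three grid points.
clim = [5500.0, 5600.0, 5700.0]
fcst = [5510.0, 5590.0, 5720.0]
anal = [5505.0, 5595.0, 5710.0]
print(rms_error(fcst, anal), anomaly_correlation(fcst, anal, clim))
```

On a latitude-longitude grid a real verification would usually weight each point by the cosine of its latitude before summing; that step is left out here for brevity.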
14 AC and RMS 500 mb Height, Summer 2004
[Figure: AC and RMS panels; 3 bias-corrected ensembles with decaying average]
- RMS error slightly reduced for first several days
- AC scores slightly improved for week 1
15ROC 500 mb Height, Summer 2004
[Figure: ROC panels for NH, SH and Tropics (TR)]
- The 2% weight experiment improves performance over NH, and slightly over SH, up to week 2
- The 10% weight experiment improves performance over the Tropics
16ROC 500 mb Height, Spring 2004
[Figure: ROC panels for NH, SH and Tropics (TR)]
- NH and SH ROC improved with some weights for most lead times
- Tropics ROC improved at all lead times, indicating that bias is much reduced for sub-regions. The 10% weight experiment performs better
17RPSS 500 mb Height, Summer 2004
[Figure: RPSS panels for NH, SH and Tropics (TR)]
- The 2% weight experiment improves performance over NH, and slightly over SH as well
- The 10% weight experiment improves performance over the Tropics, especially for week 2
18Preliminary Results
- In general, the time mean errors of 500 mb height increase with forecast lead time. The growth of the 500 mb height time mean error with forecast lead time is nearly linear in some cases.
- What determines linearity?
- The time mean error difference between the 1% and 2% weight experiments is small. The 10% weight experiment has higher-frequency details compared to the 1% and 2% experiments (better for short range?).
- The centered running mean error test (OPT) shows potential for significant improvement in the forecast of both 500 mb height and 850 mb temperature in terms of all verification scores, compared to the raw ensembles.
19Preliminary Results
- For days 1 through 6, the AC scores for the raw ensemble and the three bias-corrected ensembles with decaying averaging are relatively close to each other on average. With some weights, AC and RMS performance can be improved.
- The 2% ensemble shows large improvements in ROC, RPSS and BSS scores over the Northern and Southern Hemispheres. The improvement of these scores in summer is more significant than in spring. On the other hand, the choice of 10% weight works better for the Tropics compared to 1% and 2%. Use different weights for the Tropics?
- The decaying averaging approach to improving NCEP's global ensemble forecast system seems promising. Problems remain with estimating bias for longer lead times with a short sample.
20Future Plans
- Test 1st moment bias-correction algorithm on a longer period (four seasons, 5 years) for tuning.
- Start research on the 2nd moment calibration.
- Test refinements of the bias-correction algorithm listed before.
- Run 4 cycles per day, adding 06Z, 12Z and 18Z forecasts, to provide more timely information and increase sample size. Use data with 1x1 lat/lon resolution.
- Add new ensemble forecast variables such as 2m temperature, U, V, cumulative frequency distribution for forecast QPF.
- Consider other methods and/or use of a larger sample, especially for longer lead times.
22Refinements of Bias-Correction Algorithm
- Details
- Decaying averaging
- Use recent verification statistics in the calibration process, accumulated in a decaying averaging sense
- Achieved by using a recursive averaging procedure (Kalman Filtering)
[Figure: labels 6.6, 3.3 and 1.6]
Toth, Z., and Y. Zhu, 2001
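A recursive average of the form new = (1 - w) × old + w × error implies that the error from k days ago carries a weight of w(1 - w)^k, which is what "accumulated in a decaying averaging sense" amounts to. The sketch below checks that equivalence numerically; the function names are hypothetical and the numbers are illustrative, not taken from the slides.

```python
def recursive_average(prior, errors, w):
    """Apply the recursive update once per day, oldest error first."""
    bias = prior
    for error in errors:
        bias = (1.0 - w) * bias + w * error
    return bias


def implied_weights(w, n_days):
    """Weight on the error from k days ago, for k = 0 .. n_days - 1."""
    return [w * (1.0 - w) ** k for k in range(n_days)]


def explicit_average(prior, errors, w):
    """Same result written as an explicit weighted sum over past errors."""
    n = len(errors)
    weights = implied_weights(w, n)
    weighted = sum(wk * e for wk, e in zip(weights, reversed(errors)))
    # The start-up prior retains the residual weight (1 - w)^n
    return (1.0 - w) ** n * prior + weighted


errors = [2.0, 4.0, 1.0]
print(recursive_average(1.0, errors, 0.1), explicit_average(1.0, errors, 0.1))
```

A larger w makes the weights fall off faster, so the estimate follows recent errors more closely and retains less of the older sample.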
23 Centered Running Mean Error, Summer 2004
[Figure: latitudinal cross sections (95°W) and longitudinal cross sections (40°N) of z500 and T850]