Title: Huug van den Dool / Dave Unger
1 - Huug van den Dool / Dave Unger
- Consolidation of Multi-Method Seasonal Forecasts at CPC - Part I
2 - LAST YEAR / THIS YEAR
- Last Year: Ridge Regression as a Consolidation Method, to yield, potentially, non-equal weights.
- This Year: see the new posters on Consolidation by
  - Malaquias Pena (methodological twists and SST application), and
  - Peitao Peng (application to US T & P)
- Last Year: Conversion to pdf as per the Kernel method (Dave Unger).
- This Year: the time series approach is next.
3 - Does the NCEP CFS add to the skill of the European DEMETER-3 to produce a viable International Multi-Model Ensemble (IMME)?
Much depends on which question we ask.
Input by Suranjana Saha and Ake Johansson is acknowledged.
4 - DATA and DEFINITIONS USED
- DEMETER-3 (DEM3): ECMWF, METFR, UKMO
- CFS
- IMME = DEM3 + CFS
- 1981 - 2001
- 4 initial-condition months: Feb, May, Aug and Nov
- Leads 1-5
- Monthly means
5 - DATA / Definitions USED (cont.)
- Deterministic: Anomaly Correlation
- Probabilistic: Brier Score (BS) and Rank Probability Score (RPS)
- Ensemble Mean and PDF
- T2m and Prate
- Europe and United States
- Verification data:
  - T2m: Fan and Van den Dool
  - Prate: CMAP
6 - BRIER SCORE FOR 3-CLASS SYSTEM
1. Calculate tercile boundaries from observations 1981-2001 (1982-2002 for longer leads) at each gridpoint.
2. Assign departures from the model's own climatology (based on 21 years, all members) to one of the three classes Below (B), Normal (N) and Above (A), and find the fraction of forecasts (F) among all participating ensemble members for these classes, denoted by FB, FN and FA respectively, such that FB + FN + FA = 1.
3. Denoting observations as O, we calculate a Brier Score (BS) as BS = [(FB - OB)^2 + (FN - ON)^2 + (FA - OA)^2] / 3, aggregated over all years and all grid points. For example, when the observation is in the B class, we have (1, 0, 0) for (OB, ON, OA), etc.
4. BS for a random deterministic prediction = 0.444; BS for always forecasting climatology (1/3, 1/3, 1/3) = 0.222.
5. RPS: the same as the Brier Score, but for the cumulative distribution (no skill = 0.148). A sketch of both computations follows below.
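As a concrete reference for steps 2-5, here is a minimal Python sketch of the 3-class BS and RPS; the array layout (cases = years x grid points stacked along one axis) is an assumption, not the operational code.

    import numpy as np

    def brier_score_3class(fracs, obs_class):
        # fracs: (n_cases, 3) forecast fractions (FB, FN, FA), each row summing to 1
        # obs_class: (n_cases,) integers 0, 1 or 2 for the observed class (B, N, A)
        fracs = np.asarray(fracs, dtype=float)
        obs_class = np.asarray(obs_class, dtype=int)
        obs = np.zeros_like(fracs)
        obs[np.arange(obs_class.size), obs_class] = 1.0   # e.g. (1,0,0) when obs is Below
        return np.mean(np.sum((fracs - obs) ** 2, axis=1) / 3.0)

    def rps_3class(fracs, obs_class):
        # Same as the Brier Score, but on cumulative distributions
        fracs = np.asarray(fracs, dtype=float)
        obs_class = np.asarray(obs_class, dtype=int)
        obs = np.zeros_like(fracs)
        obs[np.arange(obs_class.size), obs_class] = 1.0
        cf, co = np.cumsum(fracs, axis=1), np.cumsum(obs, axis=1)
        return np.mean(np.sum((cf - co) ** 2, axis=1) / 3.0)

    # Always-climatology forecasts reproduce the no-skill values quoted above
    clim = np.tile([1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0], (3000, 1))
    obs = np.random.randint(0, 3, size=3000)
    print(brier_score_3class(clim, obs))   # ~0.222
    print(rps_3class(clim, obs))           # ~0.148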
7 - Number of times IMME improves upon DEM-3, out of 20 cases (4 ICs x 5 leads)

  Region                EUROPE   EUROPE   USA    USA
  Variable              T2m      Prate    T2m    Prate
  Anomaly Correlation    9       14       14     14
  Brier Score           16       18.5     19     20
  RPS                   14       15       19.5   20

The bottom line: NO consolidation, equal weights, NO cross-validation.
8 - Anom. Corr., US T2m, February 1982-2002, lead 3, November start

  Method      CFS alone   IMME   CON4
  No CV       29          33     38
  CV-3R       --          --     --

  Aspect to be CV-ed: systematic error in mean (CFS alone, IMME); systematic error in mean and weights (CON4)
9 - Anom. Corr., US T2m, February 1982-2002, lead 3, November start

  Method      CFS alone   IMME   CON4
  No CV       29          33     38
  CV-3R       20          18     21

  Aspect to be CV-ed: systematic error in mean (CFS alone, IMME); systematic error in mean and weights (CON4)
10 - Cross Validation (CV)
- Why do we need CV?
- Which aspects are CV-ed? a) the systematic error correction (i) the mean and ii) the standard deviation), and b) the weights generated by Consolidation.
- How? CV-1, CV-3, CV-3R (see the schematic below).
- Don't use CV-1! CV-1 malfunctions for systematic error correction in combination with internal climatology, and suffers from degeneracy when the weights generated by Consolidation are to be CV-ed.
- Define internal and external climatology.
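The slides do not spell out how the two extra withheld years in CV-3R are chosen, so the following is only a schematic of a leave-3-out loop; the random choice of the extra years, the placeholder fit_weights routine and the single grid point are all assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def leave_3_out(fcst, obs, fit_weights):
        # fcst: (n_years, n_models) forecasts at one grid point; obs: (n_years,)
        # fit_weights: any routine that returns consolidation weights from training data
        fcst, obs = np.asarray(fcst, dtype=float), np.asarray(obs, dtype=float)
        n_years = obs.size
        out = np.empty(n_years)
        for y in range(n_years):
            extra = rng.choice([k for k in range(n_years) if k != y], 2, replace=False)
            train = np.setdiff1d(np.arange(n_years), np.r_[y, extra])
            # systematic error correction (the mean) estimated on training years only
            bias = fcst[train].mean(axis=0) - obs[train].mean()
            w = fit_weights(fcst[train] - bias, obs[train])
            out[y] = (fcst[y] - bias) @ w      # forecast for the withheld target year
        return out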
11 - Last year (wrt OIv2 1971-2000 climatology)

12 - Last year (wrt OIv2 1971-2000 climatology)
13 - Consolidation weights, Nov start, lead 3 - one row per CV pass (the target year withheld plus 2 others)

  CV pass                          w1     w2     w3     w4
  1981 left out (and 2 others)    .35    .04    .27    .35
  1982 left out (and 2 others)    .36    .00    .29    .35
                                  .29    .02    .33    .36
                                  .40    .04    .31    .36
                                  .38   -.01    .28    .34
                                  .37    .00    .27    .33
                                  .35   -.01    .25    .40
                                  .31    .01    .29    .43
                                  .28    .01    .31    .39
                                  .38    .02    .30    .32
                                  .36    .00    .39    .32
                                  .31    .03    .29    .39
                                  .45   -.01    .23    .35
                                  .37    .00    .31    .41
                                  .35    .03    .28    .40
                                  .33    .02    .36    .35
                                  .42    .01    .33    .31
                                  .33    .04    .31    .42
                                  .33    .00    .29    .35
                                  .40    .02    .31    .35
  2001 left out (and 2 others)    .33    .00    .24    .38
The Feb forecast has high co-linearity among the models. Model 2 has high +ve weights under unconstrained regression.
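For context, a generic sketch of how ridge regression stabilizes the weights under such co-linearity; the ridge parameter lam and its scaling are illustrative placeholders, not the formulation actually used at CPC.

    import numpy as np

    def ridge_consolidation_weights(F, obs, lam=0.5):
        # F: (n_years, n_models) model anomalies; obs: (n_years,) observed anomalies
        # lam = 0 reduces to unconstrained least squares, which can give wild weights
        # for co-linear models; lam > 0 shrinks and stabilizes them.
        F, obs = np.asarray(F, dtype=float), np.asarray(obs, dtype=float)
        n_models = F.shape[1]
        G = F.T @ F
        A = G + lam * (np.trace(G) / n_models) * np.eye(n_models)
        return np.linalg.solve(A, F.T @ obs)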
14 - Overriding conclusion
- With only 20 years of hindcasts it is hard for any consolidation to be much better than an equal-weight MME. (Give us 5000 years.)
- Pooling data helps stabilize the weights and it increases skill, but is it enough?
- 20 years is a problem even for CV-ed systematic error correction.
15 - Further points of study
- The nature of climatology (the control in verification): external, internal, fixed
- Cross Validation method not settled
- The many details of Consolidation as per Ridge Regression
- Conversion to pdf can be done in very many different ways (including 3-class BS minimization, logistic regression, the count method, Kernels)
16 - Forecast Consolidation at CPC - Part 2: Ensembles to Probabilities
- David Unger / Huug van den Dool
Acknowledgements: Dan Collins, Malaquias Pena, Peitao Peng
17 - Objectives
- Produce a single probabilistic forecast from many tools
  - Single-value estimates
  - Ensemble sets
- Utilize individual ensemble members
  - Assume individual forecasts represent possible realizations
  - We want more than just the ensemble mean
- Provide standardized probabilistic output
  - More than just a 3-class forecast
18 - Kernel Characteristics

19 - Kernel Characteristics - Unequal Weighting
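A generic sketch of the kernel idea behind these two slides: each ensemble member contributes one Gaussian kernel, and (possibly unequal) member weights set each kernel's area. The Gaussian shape and the width sigma_k are assumptions here.

    import numpy as np

    def kernel_pdf(members, x, weights=None, sigma_k=1.0):
        # members: ensemble member forecasts; x: grid on which the pdf is evaluated
        members = np.asarray(members, dtype=float)
        x = np.asarray(x, dtype=float)
        if weights is None:                        # equal weighting by default
            weights = np.full(members.size, 1.0 / members.size)
        k = np.exp(-0.5 * ((x[:, None] - members[None, :]) / sigma_k) ** 2)
        k /= sigma_k * np.sqrt(2.0 * np.pi)        # each kernel integrates to 1
        return k @ weights                         # weighted sum of kernels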
20 - Ensemble Regression
- A regression model designed for the kernel smoothing methodology
  - Each member is equally likely to occur
  - Each has the same conditional error distribution in the event it is closest to the truth.
- F = forecast; sF = forecast standard deviation
- Obs = observations; sObs = standard deviation of the observations
- R = correlation between individual ensemble members and the observations
- Rm = correlation between the ensemble mean and the observations
- a1, a0 = regression coefficients
- F' = a0 + a1 F, applied to each member (see the sketch below)
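The slide lists the ingredients (F, sF, Obs, sObs, R, Rm) but not the closed-form expressions for a0 and a1. The sketch below therefore estimates one regression by ordinary least squares on the pooled historical members and applies it identically to every current member; treat it as a stand-in for the actual ensemble-regression derivation.

    import numpy as np

    def ensemble_regression(members_hist, obs_hist, members_now):
        # members_hist: (n_years, n_members) past forecasts; obs_hist: (n_years,)
        # members_now: current ensemble, each member adjusted with F' = a0 + a1 * F
        members_hist = np.asarray(members_hist, dtype=float)
        F = members_hist.ravel()
        O = np.repeat(np.asarray(obs_hist, dtype=float), members_hist.shape[1])
        a1, a0 = np.polyfit(F, O, 1)               # one regression for all members
        return a0 + a1 * np.asarray(members_now, dtype=float)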
21 - Time series estimation
- Moving Average: let X11 be the 10-year running mean known in year 11, with N = 10.
  - X11 = (1/N)(x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10)
  - X12 = X11 + (1/N)(x11 - x1)
  - In general, X(Y+1) = X(Y) + (1/N)(x(Y) - x(Y-10))
- Exponential Moving Average (EMA), with a = 1/N:
  - X12 = X11 + a (x11 - X11)
  - In general, X(Y+1) = (1 - a) X(Y) + a x(Y)
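Both updates are one-liners; the EMA form is the one the adaptive scheme on the next slide relies on (a = 1/N as on the slide).

    def running_mean_update(X_prev, x_new, x_oldest, N=10):
        # 10-year running mean: add the newest year, drop the year leaving the window
        return X_prev + (x_new - x_oldest) / N

    def ema_update(X_prev, x_new, N=10):
        # Exponential moving average, a = 1/N:  X_next = (1 - a) * X_prev + a * x_new
        a = 1.0 / N
        return X_prev + a * (x_new - X_prev)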
22 - Adaptive Ensemble Regression
EMA estimates of (see the sketch below):
- F
- F^2
- Obs
- Obs^2
- F * Obs
- Fm^2
- (F - Fm)^2
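A hedged sketch of how regression coefficients can be recovered from these EMA moments, using a standard moment-based regression on a single forecast value; the operational ensemble-regression formula, and the role of Fm^2 and (F - Fm)^2 for the ensemble spread, are not reproduced here.

    class AdaptiveRegression:
        def __init__(self, N=10, first_guess=None):
            self.a = 1.0 / N                       # EMA coefficient
            self.m = dict(first_guess) if first_guess else None

        def update(self, F, obs):
            # EMA-update the moments needed for a simple regression
            new = {"F": F, "F2": F * F, "O": obs, "O2": obs * obs, "FO": F * obs}
            if self.m is None:
                self.m = new
            else:
                self.m = {k: v + self.a * (new[k] - v) for k, v in self.m.items()}

        def coefficients(self):
            # a1 = cov(F, Obs) / var(F);  a0 = mean(Obs) - a1 * mean(F)
            var_f = self.m["F2"] - self.m["F"] ** 2
            a1 = (self.m["FO"] - self.m["F"] * self.m["O"]) / var_f
            return self.m["O"] - a1 * self.m["F"], a1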
23 - Trends
- Adaptive Regression learns recent bias, and is very good at compensating for it.
- Most statistical tools also learn bias, and adapt to compensate.
- Steps need to be taken to prevent doubling the bias corrections.
24 - Trends (Continued)
- Step 1. Detrend all models and Obs.
  - F' = F - F10;  Obs' = Obs - Obs10
  - F10, Obs10 = the EMA approximating a 10-year mean
- Step 2. Ensemble Regression.
  - The final forecast set, F', are anomalies.
- Step 3. Restore the forecast (see the sketch below).
  - A) F = F' + F10  (We believe the OCN trend estimate.)
  - B) F = F' + C30, where C30 = the 30-year (1971-2000) climatology  (We have no trust in OCN.)
  - C) F = F' + C30 + R_OCN (F10 - C30), where R_OCN = Correlation(F10, Obs)  (Trust but verify.)
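The three restore options translate directly into code; nothing below goes beyond the formulas on the slide.

    def restore_forecast(F_anom, F10, C30, R_ocn, option="C"):
        # F_anom: regression-calibrated anomaly forecast (the Step 2 output)
        if option == "A":        # we believe the OCN trend estimate
            return F_anom + F10
        if option == "B":        # no trust in OCN: use the 30-year climatology
            return F_anom + C30
        # C) trust but verify: damp the recent-mean departure from climatology by R_OCN
        return F_anom + C30 + R_ocn * (F10 - C30)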
25 - Weighting
- The chance of an individual ensemble member being the best increases with the skill of the model.
- The kernel distribution represents the expected error distribution of a correct forecast.
26 - Final Forecast
- Consolidated probabilities are the area under the PDF within each of the three categories (Below, Near, Above median); see the sketch below.
- Call for ABOVE when P(Above) > 36% and P(Below) < 33.3%.
- Call for BELOW when P(Below) > 36% and P(Above) < 33.3%.
- White area = Equal Chances (we don't as yet trust our Near Normal percentages).
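A minimal sketch of turning the consolidated pdf into the categorical call described above; the numerical integration on a grid x and the category boundaries t_below / t_above are assumptions about the setup.

    import numpy as np

    def categorical_call(x, pdf, t_below, t_above):
        # x, pdf: grid and consolidated pdf values; t_below, t_above: class boundaries
        x, pdf = np.asarray(x, dtype=float), np.asarray(pdf, dtype=float)
        p_below = np.trapz(pdf[x <= t_below], x[x <= t_below])
        p_above = np.trapz(pdf[x >= t_above], x[x >= t_above])
        if p_above > 0.36 and p_below < 1.0 / 3.0:
            return "ABOVE"
        if p_below > 0.36 and p_above < 1.0 / 3.0:
            return "BELOW"
        return "EQUAL CHANCES"   # the white area on the map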
27 - Performance
- Tools: 1995 - 2005
- CFS: 15-member hindcast
  - All members weighted equally, with combined area equal to the tool weighting
- CCA: single-valued forecast; hindcasts from cross validation
- SMLR: single-valued forecast; hindcasts from retroactive real-time validation
- OCN incorporated with the EMA rather than a 10-year boxcar average.
28 - Performance (Continued)
- First-guess EMA parameters provided by CCA and SMLR statistics, 1956-1980.
- CFS spinup: 1981-1994
- Validation period: all initial times, 1-month lead, Jan 1995 - Dec 2005 (11 years)
29 - Performance (Continued)
- Official Forecast: hand-drawn, probabilities in 3 classes. PoE obtained from a Normal distribution, with the standard deviation based on tool skills.
- CCA+SMLR: a consolidation of the two statistical forecasts, equally weighted.
- CFS: a consolidation of the 15 CFS ensemble members, equally weighted.
- CFS+CCA+SMLR Wts: a consolidation of CCA, CFS, and SMLR, weighted by R/(1-R) for each of the three tools; the 15 CFS members are each given 1/15th of the CFS weight (see the sketch below). Also known as "All".
- All Equal Wts: CCA, SMLR and the 15 CFS members combined are given equal weights.
- All No Trend: anomalies applied to the 30-year mean.
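For the skill-weighted consolidation, the R/(1-R) rule and the 1/15th split of the CFS weight can be sketched as below; normalizing the weights to sum to one, and the example correlation values, are assumptions.

    def tool_weights(r_by_tool, n_cfs_members=15):
        # r_by_tool: correlation per tool, e.g. {"CCA": 0.3, "SMLR": 0.25, "CFS": 0.4}
        raw = {name: r / (1.0 - r) for name, r in r_by_tool.items()}
        total = sum(raw.values())
        weights = {name: w / total for name, w in raw.items()}
        per_cfs_member = weights["CFS"] / n_cfs_members   # each member gets 1/15th
        return weights, per_cfs_member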
30 - Performance

  Method               HSS    CRPSS   RPSS-3   Bias (C)   Cover (%)
  CCA+SMLR             .046   .076    .191     -.147      63
  CFS                  .067   .076    .162     -.334      59
  CFS+CCA+SMLR, Wts.   .063   .100    .215     -.268      73
  All No Trend         .005   -.002   .058     -.876      47
  All Equal Wts.       .074   .100    .199     -.203      62
  Official             .023   .040    .098     -.858      38
31 - CRPS Skill Scores, Temperature, 1995 - 2005
[Figure: maps of CRPS skill scores for the All, CCA+SMLR, CFS and Official forecasts; 1-month lead, all initial times; skill shading High / Moderate / Low / None at thresholds .10 / .05 / .01]
32 - CRPS Skill Scores / Cover, Temperature, 1995 - 2005
[Figure: CRPSS and percent cover with and without trends; 1-month lead, all initial times; skill shading High / Moderate / Low / None at thresholds .10 / .05 / .01]
33 - The Winter Forecast
[Maps: Skill Weighting vs. Equal Weighting]
34 - Conclusions
- Weighting is not critical, within reason.
- Consolidation outperforms the component models.
- Getting the trends right is essential.
- The CFS + Trend consolidation provides an accurate continuous distribution.