Title: Principal Component Analysis (PCA)
1 Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA)
2 First 6 SSTa EOFs
http://www.bom.gov.au/bmrc/clfor/cfstaff/wld/RESREP65/rr65.htm#PCA_SST
3 Second 6 SSTa EOFs
http://www.bom.gov.au/bmrc/clfor/cfstaff/wld/RESREP65/rr65.htm#PCA_SST
4 Linear Discriminant Analysis
- Deals with classification of multivariate data
- In the forecast example, LDA is used to assign forecast probabilities to tercile groups.
- Maximizes the ratio of between-class variance to within-class variance in a data set, which gives maximal separability of the classes.
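As a rough illustration of the variance-ratio criterion in the last bullet, the sketch below computes between-class over within-class variance for a single predictor. The function name `variance_ratio` and the sample values are hypothetical and not taken from the slides.

```python
import numpy as np

def variance_ratio(values, labels):
    """Between-class sum of squares divided by within-class sum of squares."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    grand_mean = values.mean()
    between = 0.0
    within = 0.0
    for c in np.unique(labels):
        group = values[labels == c]
        # Spread of the class mean around the overall mean, weighted by class size.
        between += group.size * (group.mean() - grand_mean) ** 2
        # Spread of the class members around their own class mean.
        within += ((group - group.mean()) ** 2).sum()
    return between / within

# Hypothetical predictor values for two classes (illustrative only).
labels = np.array([0, 0, 0, 1, 1, 1])
well_separated = np.array([1.0, 1.1, 0.9, 5.0, 5.1, 4.9])
overlapping = np.array([1.0, 3.0, 5.0, 2.0, 4.0, 6.0])

ratio_separated = variance_ratio(well_separated, labels)
ratio_overlapping = variance_ratio(overlapping, labels)
```

A predictor whose classes sit far apart relative to their internal spread gives a much larger ratio, which is the quantity LDA seeks to maximise.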
Forecast for Jun-Aug 2004, predictor period Mar-May
Threshold, 33rd percentile: 36.9 mm
Threshold, 67th percentile: 83.2 mm
5 Contingency Tables
(Mason 25/03/2003)
6 Contingency Tables
(Mason 25/03/2003)
7 Linear Discriminant Analysis
The data set can be transformed and test vectors classified by two approaches:
- Class-dependent transformation: maximises the ratio of between-class variance to within-class variance.
- Class-independent transformation: maximises the ratio of overall variance to within-class variance. Each class is considered as a separate class against all others.
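The two transformations can be sketched as follows. This is a minimal sketch under one common formulation, which is an assumption here since the slide gives no formulas: the class-independent direction is the leading eigenvector of inv(S_w) S_b, and each class-dependent direction is the leading eigenvector of inv(cov_j) S_b, where S_b is the between-class scatter, S_w the prior-weighted within-class scatter, and cov_j the covariance of class j. The function name `lda_directions` and the data are hypothetical.

```python
import numpy as np

def lda_directions(X, y):
    """Return (class-independent direction, {class: class-dependent direction})."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    n_features = X.shape[1]
    S_b = np.zeros((n_features, n_features))  # between-class scatter
    S_w = np.zeros((n_features, n_features))  # within-class scatter
    covs = {}
    for c in classes:
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - overall_mean).reshape(-1, 1)
        S_b += diff @ diff.T
        covs[c] = np.cov(Xc, rowvar=False)
        S_w += (len(Xc) / len(X)) * covs[c]   # weight by class prior

    def leading_eigenvector(M):
        vals, vecs = np.linalg.eig(M)
        v = np.real(vecs[:, np.argmax(np.real(vals))])
        return v / np.linalg.norm(v)

    independent = leading_eigenvector(np.linalg.solve(S_w, S_b))
    dependent = {c: leading_eigenvector(np.linalg.solve(covs[c], S_b))
                 for c in classes}
    return independent, dependent

# Hypothetical two-class data separated along the first axis only.
X = np.array([[0, 0], [1, 1], [0, 1], [1, 0],
              [6, 0], [7, 1], [6, 1], [7, 0]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
independent, dependent = lda_directions(X, y)
```

With these points both approaches pick the first axis (up to sign), because that is the only direction along which the class means differ.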
8 Discriminant Analysis
Probabilistic, discrete (two or more predictors)
(Mason 25/03/2003)
9 Linear Discriminant Analysis
(Mason 25/03/2003)
10 LEPS Scores: Linear Error in Probability Space
Q: How do we know if a probabilistic forecast was correct?
A: A probabilistic forecast can never be wrong!
LEPS:
- is a measure of forecast skill
- has a simple numerical form
- measures the accuracy of a set of forecasts compared to climatology
11 Another way of looking at it: Cumulative Probability Distribution
http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/LEPS.html
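To make the cumulative-probability view concrete, here is a minimal sketch. It assumes the error is measured as the absolute difference between the cumulative probabilities of the forecast and observed values, read from the empirical distribution of past values. The rainfall history and the function names are hypothetical.

```python
import numpy as np

def cumulative_probability(history, value):
    """Empirical CDF: fraction of historical values at or below `value`."""
    history = np.asarray(history, dtype=float)
    return float(np.mean(history <= value))

def linear_error_in_probability_space(history, forecast, observed):
    """Absolute difference between the forecast and observed CDF positions."""
    p_f = cumulative_probability(history, forecast)
    p_o = cumulative_probability(history, observed)
    return abs(p_f - p_o)

# Hypothetical seasonal rainfall totals in mm (illustrative only).
history = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
error = linear_error_in_probability_space(history, forecast=35, observed=75)
```

Measuring the error in probability space rather than in millimetres means a 40 mm miss counts for more where the distribution is dense than where it is sparse.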
12 Rainfall history showing Terciles
Consider this frequency distribution of rainfall at Emerald.
[Figure: frequency distribution of rainfall at Emerald, divided into terciles from Low Rainfall to High Rainfall]
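A short sketch of how tercile boundaries might be derived from a rainfall history and used to label each year. The rainfall values below are hypothetical and are not the Emerald record; only the 36.9 mm and 83.2 mm thresholds in the last line come from the forecast example in these slides.

```python
import numpy as np

# Hypothetical seasonal rainfall totals in mm (illustrative only).
rainfall = np.array([12.0, 25.0, 31.0, 40.0, 47.0, 55.0,
                     63.0, 78.0, 90.0, 110.0, 130.0, 150.0])

# Tercile boundaries taken as the 33rd and 67th percentiles of the history.
lower, upper = np.percentile(rainfall, [33, 67])

def tercile(value, lower, upper):
    """Return 1 (low), 2 (middle) or 3 (high) for a rainfall total."""
    if value <= lower:
        return 1
    if value <= upper:
        return 2
    return 3

categories = [tercile(v, lower, upper) for v in rainfall]

# Using the thresholds quoted in the forecast example instead:
example_category = tercile(90.0, 36.9, 83.2)
```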
13 How are probabilistic forecasts presented?
One way: as the probability of the event occurring in each tercile.
Forecast for Jun-Aug 2004, predictor period Mar-May
Threshold, 33rd percentile: 36.9 mm
Threshold, 67th percentile: 83.2 mm
Very likely to be a wet year!
14 How do we know if a forecast is good?
- Perform Hindcasts
- Keep tally of performance
- Penalise bad forecasts!
- Reward good forecasts!
15 Penalty weighting: Non-LEPS tercile category weights
Do these weights bias the score towards forecasting one way or another?
16 Introduce LEPS tercile category weights
The weights are defined so that forecasts of climatology, perpetual forecasts of one category, and random guessing all have an expected score of zero.
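The zero-expected-score property can be checked numerically. The sketch below assumes the commonly published three-category LEPS weights, rounded to two decimals; the slide's own table is not reproduced in this transcript, so treat the matrix `W` as an assumption. The helper `expected_score` is hypothetical.

```python
import numpy as np

# Assumed LEPS tercile weights (rows: observed tercile, columns: forecast tercile).
W = np.array([[ 0.89, -0.11, -0.78],
              [-0.11,  0.22, -0.11],
              [-0.78, -0.11,  0.89]])

def expected_score(forecast_probs, observed_probs=(1/3, 1/3, 1/3)):
    """Expected score when observations fall in each tercile with `observed_probs`."""
    return float(np.asarray(observed_probs) @ W @ np.asarray(forecast_probs))

# Climatology forecast: equal probability in every tercile.
climatology = expected_score([1/3, 1/3, 1/3])

# Perpetual forecast of a single category, observations equally likely in each tercile.
perpetual = [expected_score(np.eye(3)[k]) for k in range(3)]

# Random guessing of one category is the average of the perpetual forecasts.
random_guess = float(np.mean(perpetual))
```

All three strategies score zero on average because every row and every column of the assumed matrix sums to zero.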
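17 Magic Numbers?
How are LEPS numbers calculated for terciles?
[Figure: grid of forecast cumulative probability (pf = 0, 0.33, 0.67, 1) against observed cumulative probability (po = 0, 0.33, 0.67, 1.00)]
Calculate LEPS at the corners of each grid cell and then average.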
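One way to arrive at tercile weights is sketched below. Two assumptions are made that the slide does not state: the continuous score takes the revised LEPS form 3(1 - |pf - po| + pf^2 - pf + po^2 - po) - 1, and each cell is averaged over a fine grid of interior points rather than over its four corners only. The function names are hypothetical.

```python
import numpy as np

def leps_continuous(p_f, p_o):
    """Assumed continuous LEPS score for forecast and observed cumulative probabilities."""
    return 3.0 * (1.0 - np.abs(p_f - p_o) + p_f**2 - p_f + p_o**2 - p_o) - 1.0

def category_weights(n_categories=3, points_per_cell=200):
    """Average the continuous score over each (observed, forecast) category cell."""
    edges = np.linspace(0.0, 1.0, n_categories + 1)
    weights = np.zeros((n_categories, n_categories))
    # Midpoints of a fine grid on [0, 1], rescaled into each category interval.
    offsets = (np.arange(points_per_cell) + 0.5) / points_per_cell
    for i in range(n_categories):        # observed category
        p_o = edges[i] + offsets * (edges[i + 1] - edges[i])
        for j in range(n_categories):    # forecast category
            p_f = edges[j] + offsets * (edges[j + 1] - edges[j])
            weights[i, j] = leps_continuous(p_f[None, :], p_o[:, None]).mean()
    return weights

weights = category_weights()
```

Under these assumptions the middle-tercile weight rounds to 0.22, which agrees with the value used in the percentage LEPS example in these slides.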
18 Example Calculation for Terciles
[Table: forecast probabilities, and the score if the observed year falls in each tercile]
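A worked calculation might look like the sketch below. It assumes the score is the forecast-probability-weighted sum of the weights in the row for the observed tercile, and it uses the commonly published LEPS tercile weights rounded to two decimals. Both the matrix `W` and the forecast probabilities are assumptions for illustration, since the slide's table is not reproduced here.

```python
# Assumed LEPS tercile weights (rows: observed tercile, columns: forecast tercile).
W = [[ 0.89, -0.11, -0.78],
     [-0.11,  0.22, -0.11],
     [-0.78, -0.11,  0.89]]

def leps_score(forecast_probs, observed_tercile):
    """Probability-weighted sum of the weights for the observed tercile (1, 2 or 3)."""
    row = W[observed_tercile - 1]
    return sum(p * w for p, w in zip(forecast_probs, row))

# Hypothetical forecast: 10% dry, 30% middle, 60% wet.
forecast = (0.1, 0.3, 0.6)

score_if_wet = leps_score(forecast, observed_tercile=3)  # forecast leaned the right way
score_if_dry = leps_score(forecast, observed_tercile=1)  # forecast leaned the wrong way
```

The same forecast is rewarded when the wet tercile is observed and penalised when the dry tercile is observed.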
19 Percentage LEPS score
To convert to a percentage, divide by the worst-case or best-case scenario:
- if the LEPS score was +ve, divide by the highest category weight.
- if the LEPS score was -ve, divide by the lowest category weight.
e.g. if the LEPS score was 0.18 (good forecasting) and the observed value was in tercile 2, then
LEPS% = 0.18 / 0.22 x 100 = 81.8
where 0.22 is the highest weighting in the table for tercile 2.
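The conversion can be sketched as a small function. The weights matrix is an assumption (the commonly published LEPS tercile weights, rounded to two decimals), chosen because its tercile 2 maximum matches the 0.22 used above. The sketch also assumes that a negative score is divided by the magnitude of the lowest weight, so that the percentage stays negative.

```python
# Assumed LEPS tercile weights (rows: observed tercile, columns: forecast tercile).
W = [[ 0.89, -0.11, -0.78],
     [-0.11,  0.22, -0.11],
     [-0.78, -0.11,  0.89]]

def percent_leps(score, observed_tercile):
    """Scale a LEPS score by the best or worst achievable weight for the observed tercile."""
    row = W[observed_tercile - 1]
    if score >= 0:
        denominator = max(row)        # best case: highest category weight
    else:
        denominator = abs(min(row))   # worst case: magnitude of lowest weight
    return 100.0 * score / denominator

good = percent_leps(0.18, observed_tercile=2)   # the example from the slide
poor = percent_leps(-0.05, observed_tercile=2)  # hypothetical negative score
```

Rounded to one decimal, `good` reproduces the 81.8 quoted in the example.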
20 Forecast performance via LEPS Score
Often expressed as percentage LEPS, on a scale from -100 to 100:
- below 0: Worse than Climatology
- 0: As good as Climatology
- above 0: Better than Climatology
e.g. LEPS = 42: Good Forecasting; LEPS = -3: Poor Forecasting