Title: Introduction to Probability and Probabilistic Forecasting
1 Introduction to Probability and Probabilistic Forecasting
- Simon Mason
- International Research Institute for Climate
Prediction - The Earth Institute of Columbia University
- AMS Short Course on Probabilistic Forecasting
- San Diego, CA, January 9, 2005
Linking Science to Society
2 Questions about the Future
We make forecasts to answer questions about the future:
- Will it rain in San Diego next weekend?
- Will it snow in San Diego next weekend?
- Will the Red Sox win the 2005 World Championship?
- Will Dick Cheney die a pauper?
- Will this surfer live to 50?
3 Probability
For most situations the future is uncertain. In
cases where the answer to a question about the
future is uncertain, we tend to use probabilities
to express this uncertainty in the outcome.
4 Probability and Events
We refer to a specific outcome, or a specific combination of outcomes, as an event, and refer to the probability of this event. What do these terms mean?
- event: a predefined outcome that forms the subject of a forecast (an outcome of interest).
Examples:
- rain in San Diego this afternoon
- tornado touch down anywhere in Iowa tomorrow
- average LAX January temperature of below 10°C
- NIÑO3 anomaly of more than 2°C by October
- global warming of more than 1°C by 2050.
5 Probability and Events
Events can be:
- elementary (hot) or
- compound (hot and dry; rain two days in a row).
Events either occur or do not occur: there are only these two possible outcomes (but there may be some uncertainty as to whether the outcome has occurred). If the event does not occur, its complement occurs.
Events need to be well-defined to avoid ambiguity. (What does "unfaithful" mean? What does "in San Diego" mean?)
6 Notation
An elementary event is often denoted by the letter E, and its complement by Ē. To distinguish an elementary event from a second elementary event, subscripts may be used:
- first elementary event: E1
- second elementary event: E2
A compound event occurs when the first AND the second elementary event occur (or, more generally, when all elementary events occur): E1 ∩ E2.
7 Probability and Events
Uncertainty is often expressed using phrases such as "it is likely", "the chances are", etc. Different degrees of uncertainty can be indicated: "possibly" indicates higher uncertainty than "probably". Compare:
- will it rain in San Diego next weekend?
- will it snow in San Diego next weekend?
- probability: a quantitative measure of the uncertainty in the event.
8 Probability and Uncertainty
Probabilities are used where there is uncertainty. Apart from ambiguity, there are two sources of uncertainty:
- our understanding is limited
- there is some inherent randomness in the outcome.
- We do not know for certain what will happen.
- We cannot know for certain what will happen.
9 Probability and Uncertainty
But how can we quantify uncertainty?
- When the probability is 1, the event will definitely occur. It is impossible for the event not to happen.
- When the probability is 0, the event will definitely not occur. It is impossible for the event to happen.
- When the probability is between 0 and 1, the event may or may not happen.
10 Probability
- When the probability is close to 1, the event is more likely to occur than not to occur.
- When the probability is close to 0, the event is more likely not to occur than to occur.
- When the probability is 0.5, the event is just as likely to happen as not to happen.
11 Odds
- When the probability of an event, E, is 0.75, the probability that the event will not happen (the complement of the event, Ē) is 1 - 0.75 = 0.25.
- When the probability is 0.75, the event is three times as likely to happen as not to happen: the odds are 0.75 / 0.25 = 3, or 3 to 1.
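The conversion between probability and odds can be sketched in a few lines of Python. The function names are hypothetical and chosen only for illustration; the arithmetic is the ratio of the probability of the event to the probability of its complement.

```python
def probability_to_odds(p: float) -> float:
    """Odds in favour of an event: P(event) / P(complement)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("probability must satisfy 0 <= p < 1 for finite odds")
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Inverse conversion: odds of 3 (3 to 1) correspond to p = 0.75."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


# The slide's example: a probability of 0.75 gives odds of 3 to 1.
odds_for_rain = probability_to_odds(0.75)
print(f"odds = {odds_for_rain:.0f} to 1")
```

Odds have no upper limit, so they are sometimes easier to compare than probabilities that are very close to 1.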
12 Probability
- But how do we determine how likely the event is compared to its complement?
- How do we obtain / calculate probabilities?
13 Interpretations of Probability I
How do we obtain / calculate probabilities?
- What is the probability that it will rain in San Diego (at Lindbergh Field) on January 31, 2005?
- How often has it rained on the same day in previous years (1927-2003)? Climatology.
14 Interpretations of Probability I
- What is the probability that it will rain in San Diego (at Lindbergh Field) on January 31, 2005?
15 Probability as Relative Frequency
What is the probability that it will rain in San Diego on January 31, 2005?
- Look for similar / identical situations.
- Repeat the experiment many times: only unimportant things are allowed to change.
Note that there may be sampling errors in the relative frequencies: uncertainty about the uncertainty! (The distribution of these sampling errors can be obtained using the binomial distribution.)
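A minimal sketch of how a relative frequency and its sampling error might be computed, using only the Python standard library. The count of 17 wet days in 77 years is a hypothetical value chosen for illustration, not data from the slides; the binomial distribution then describes how much the count would vary from one 77-year sample to another if the true probability equalled the estimate.

```python
from math import comb, sqrt

# Hypothetical counts, for illustration only: 17 wet days in 77 years.
WET_DAYS = 17
YEARS = 77


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k events in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Relative frequency used as the probability estimate.
p_hat = WET_DAYS / YEARS

# Standard error of a binomial proportion.
std_error = sqrt(p_hat * (1.0 - p_hat) / YEARS)

# Sampling distribution of the wet-day count if the true probability were p_hat.
count_distribution = [binomial_pmf(k, YEARS, p_hat) for k in range(YEARS + 1)]

# Chance that a 77-year sample gives a relative frequency within 0.05 of p_hat.
within_tolerance = sum(
    prob
    for k, prob in enumerate(count_distribution)
    if abs(k / YEARS - p_hat) <= 0.05
)

print(f"relative frequency = {p_hat:.3f}, standard error = {std_error:.3f}")
print(f"P(|sample frequency - p| <= 0.05) = {within_tolerance:.3f}")
```

The sketch assumes the years are independent and that the probability does not change from year to year, which is the "only unimportant things are allowed to change" condition above.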
16 Interpretations of Probability II
How do we obtain / calculate probabilities?
- What is the probability that it will rain in San Diego (at Lindbergh Field) tomorrow?
- How often has it rained on the same day in previous years with similar atmospheric conditions? Only unimportant things are allowed to change.
- This experiment has no precedent: today's initial conditions are important, and they are unique.
17 Probability as Subjective Belief
The probability that it will rain in San Diego is
best estimated by conditioning upon the current
atmospheric state, a set of conditions that are
unique.
- Make a forecast based on the physics of the
atmosphere, and expert knowledge / experience.
- Produce an ensemble of forecasts based on
sampling of known uncertainties in the physics of
the atmosphere and / or in the initial conditions
(Bright).
- The probability now represents the degree to
which we believe that it will rain in San Diego
tomorrow.
18 Interpretations of Probability
So two interpretations of probability are:
- relative frequency interpretation: how often the event has occurred in similar situations in the past;
- subjective interpretation: how confident we are the event will occur this time.
But all probabilities could be defined as subjective because of the subjectivity in defining which situations are similar.
19 Probability as Relative Frequency
The 77-year climatology does not provide a good
estimate of the probability that it will rain in
San Diego tomorrow because there are some
important differences between the 77 instances
of January 10 and January 10, 2005. Sometimes we
can improve upon climatological forecasts because
of access to important information. But how do
we know whether the information is important?
Is the probability of the event different when
these conditions are present compared to when
they are not?
20 Conditional Probabilities
The relative frequency of rainfall on January 10 could be obtained by considering only those January 10s on which January 9 rainfall occurrence was the same as on January 9, 2005.
- If January 9 is wet: P(January 10 is wet)?
- If January 9 is dry: P(January 10 is wet)?
This conditional probability is different from the probability of a compound event, P(E1 ∩ E2), because we know that E2 has (or has not) occurred already.
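The difference between the conditional and the compound probability can be illustrated with a small contingency table. The counts below are hypothetical: they are not the Lindbergh Field record, and were chosen only so that the results round to the values of 0.22 and 0.54 quoted in the Updating Probabilities slide.

```python
# Hypothetical counts of (January 9 state, January 10 state) over 77 years.
COUNTS = {
    ("wet", "wet"): 7,
    ("wet", "dry"): 6,
    ("dry", "wet"): 10,
    ("dry", "dry"): 54,
}


def conditional_probability_wet(counts: dict, jan9_state: str) -> float:
    """P(January 10 is wet | January 9 state): only matching years are used."""
    wet = counts[(jan9_state, "wet")]
    matching_years = wet + counts[(jan9_state, "dry")]
    return wet / matching_years


total_years = sum(COUNTS.values())

# Unconditional (climatological) probability of a wet January 10.
p_climatology = (COUNTS[("wet", "wet")] + COUNTS[("dry", "wet")]) / total_years

# Compound event: both days wet, counted over the full sample space.
p_compound = COUNTS[("wet", "wet")] / total_years

# Conditional probabilities: the sample space is reduced to the matching years.
p_wet_given_wet = conditional_probability_wet(COUNTS, "wet")
p_wet_given_dry = conditional_probability_wet(COUNTS, "dry")

print(f"P(wet Jan 10)              = {p_climatology:.2f}")
print(f"P(wet Jan 9 and wet Jan 10) = {p_compound:.2f}")
print(f"P(wet Jan 10 | wet Jan 9)  = {p_wet_given_wet:.2f}")
print(f"P(wet Jan 10 | dry Jan 9)  = {p_wet_given_dry:.2f}")
```

Note that the conditional probability divides by the 13 years with a wet January 9, while the compound probability divides by all 77 years.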
22 Conditional Probabilities
[Figure: Venn diagram showing compound event]
For conditional probabilities, the outcome of Jan 9 is known already, and so the sample space is reduced.
24 Conditional Probabilities
What is the probability that it will rain in San Diego (at Lindbergh Field) on January 10, 2005, given that it has (or has not) rained on January 9, 2005?
25 Conditional Probabilities
What is the probability that it will rain in San Diego (at Lindbergh Field) on January 10, 2005, given that it has rained on January 9, 2005?
26 Conditional Probabilities
What is the probability that it will rain in San Diego (at Lindbergh Field) on January 10, 2005, given that it has not rained on January 9, 2005?
27 Updating Probabilities
Based on the occurrence of January 9 rainfall, the probability of rainfall on January 10 has been updated from 0.22 to 0.54. What if we now obtain a model forecast, E3, that states it will rain tomorrow? All we know about the model is that it has given a correct forecast 90% of the time over the last few days. How can we update our probability for rain tomorrow?
28 Bayes' Theorem
What is the probability that it will rain in San Diego (at Lindbergh Field) on January 10, 2005, given that it has rained on January 9, 2005 AND that the model forecasts rain? (To simplify, conditions on E2 are dropped.)
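The question above asks for the probability of rain on January 10 (E1) given a model forecast of rain (E3). A standard statement of Bayes' theorem for this two-outcome case is sketched below; the overbar notation for the complement is an assumption, and the conditioning on E2 is left implicit as noted.

```latex
P(E_1 \mid E_3) =
  \frac{P(E_3 \mid E_1)\, P(E_1)}
       {P(E_3 \mid E_1)\, P(E_1) + P(E_3 \mid \bar{E}_1)\, P(\bar{E}_1)}
```

Here P(E1) and P(Ē1) are the prior probabilities of rain and no rain, and P(E3 | E1) and P(E3 | Ē1) are the probabilities that rain is forecast when it does and does not rain.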
33 Bayes' Theorem
All terms on the right are unknown.
The priors, at least, are known.
34 Bayes' Theorem
P(E3 | E1) and P(E3 | Ē1) are likelihoods: they tell us how likely it is that rain was forecasted, assuming that it will / will not rain, respectively. Or: how often are rain days successfully forecasted / dry days unsuccessfully forecasted?
35 Bayes' Theorem
We do not have exact values for the likelihoods on the right side of the equation, but if we assume that the model has no bias, given that it has been correct 90% of the time, we can infer that 90% of the rain days have been forecasted.
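A minimal sketch of the update in Python, using the prior of 0.54 from the Updating Probabilities slide and the inferred likelihood of 0.90 for rain days. The value of 0.10 for the probability that rain is forecast on a dry day is an additional assumption made here for illustration; the function name is hypothetical.

```python
def bayes_update(prior: float,
                 likelihood_if_event: float,
                 likelihood_if_complement: float) -> float:
    """Posterior probability of an event after one piece of new information."""
    numerator = likelihood_if_event * prior
    evidence = numerator + likelihood_if_complement * (1.0 - prior)
    return numerator / evidence


PRIOR_RAIN = 0.54              # probability of rain after conditioning on January 9
P_FORECAST_GIVEN_RAIN = 0.90   # inferred: 90% of rain days were forecasted
P_FORECAST_GIVEN_DRY = 0.10    # assumption: rain forecasted on 10% of dry days

posterior_rain = bayes_update(
    PRIOR_RAIN, P_FORECAST_GIVEN_RAIN, P_FORECAST_GIVEN_DRY
)
print(f"P(rain | model forecasts rain) = {posterior_rain:.2f}")  # about 0.91
```

Under these assumptions the forecast raises the probability of rain from 0.54 to about 0.91. A less skilful model (likelihoods closer to each other) would move the probability less.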
37 Bayes' Theorem
Bayes' theorem allows us to update probabilities (posterior probabilities). The prior probabilities are simply the best estimate of the probabilities before considering the new information. They may already have been previously updated. The likelihoods indicate how likely the new information is, assuming a specific outcome. For example: how likely is it that the forecast would be for wet conditions, assuming that it is going to be wet / dry? (I.e., the hit and false alarm rates of the ROC.)
38 Conditional Probabilities
A problem with conditional probabilities is that the sample space is reduced, and so the errors in estimating the relative frequencies increase. These errors increase as the number of conditions is increased, and it is easy to reach the extreme case of having no previous cases with only unimportant differences from which to calculate the relative frequencies (number of possible states: 2^n for n binary conditions). In numerical weather prediction the infinite dimensions of the current atmospheric state are important, and so the current initial conditions are unique.
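The shrinking sample can be made concrete with a short sketch. It assumes a 77-year record and n binary conditions (each either present or absent), and computes the average number of past cases per combination of conditions; the function names are hypothetical.

```python
YEARS = 77  # length of the climatological record used in the earlier slides


def possible_states(n_conditions: int) -> int:
    """Number of distinct combinations of n binary conditions."""
    return 2 ** n_conditions


def average_cases_per_state(sample_size: int, n_conditions: int) -> float:
    """Average number of past cases available for each combination."""
    return sample_size / possible_states(n_conditions)


for n in range(0, 9):
    cases = average_cases_per_state(YEARS, n)
    print(f"{n} conditions: {possible_states(n):4d} states, "
          f"{cases:6.2f} cases per state on average")
```

With seven binary conditions there are already more possible states (128) than years in the record, so some states must have no previous cases at all.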
39 Conditional Probabilities
What if the probability of rainfall tomorrow depends on how much rainfall there is today rather than just its occurrence? In this case the outcome of the current event depends not on another event measured on a binary scale, but on a continuous variable. (Jolliffe: some statistical models for calculating probabilities of events that are functions of continuous variables.)
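One commonly used model of this kind is logistic regression, shown here only as an illustrative sketch; the slides do not say which models are meant. The intercept and slope below are hypothetical values, not fitted to any data.

```python
from math import exp


def logistic_probability(rain_today_mm: float,
                         intercept: float,
                         slope: float) -> float:
    """P(rain tomorrow) as a smooth function of today's rainfall amount."""
    linear_predictor = intercept + slope * rain_today_mm
    return 1.0 / (1.0 + exp(-linear_predictor))


# Hypothetical coefficients, for illustration only.
INTERCEPT = -1.7
SLOPE = 0.15

for amount_mm in (0.0, 5.0, 10.0, 20.0):
    p = logistic_probability(amount_mm, INTERCEPT, SLOPE)
    print(f"{amount_mm:5.1f} mm today -> P(rain tomorrow) = {p:.2f}")
```

The logistic form keeps the result between 0 and 1 for any rainfall amount, which a straight-line fit to the relative frequencies would not guarantee.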
40 Conditional Probabilities
Similarly, many forecast verification procedures are based on conditional probabilities:
- reliability: given a forecast of 90% chance of rain, how often does rain occur? P(E | F = f)
- NB: notice that this involves a subjective interpretation of probability: we are verifying forecasts with similar levels of confidence, not forecasts with similar boundary / initial conditions.
41 Conditional Probabilities
- resolution: can we expect a different outcome given a different forecast?
Note: reliability and resolution are often confused.
Resolution: is P(E) conditional upon the forecast? Reliability: if F = f, does P(E | F = f) = f?
42 [Figure: diagram annotated with REL (BC)^2, REL (EC)^2, RES (AC)^2]
Note: the y-axis gives the conditional probability of the event given the forecast.
43 Reliability and Resolution
RELIABILITY: Are the forecast probabilities correct? Do the forecast probabilities reflect an appropriate level of confidence?
RESOLUTION: Does the outcome depend on the forecast? Do different forecast probabilities imply actual differences in the probability of an event?
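The two questions can be turned into numbers by grouping past forecasts by their stated probability and comparing each group's observed relative frequency with the forecast value and with the overall frequency. The sketch below uses the reliability and resolution terms of the standard Brier score decomposition, which is assumed here to be what the slides intend; the forecasts and outcomes are hypothetical.

```python
from collections import defaultdict


def reliability_resolution(forecasts, outcomes):
    """Return P(E | F = f) per forecast value, plus reliability and resolution.

    forecasts: forecast probabilities (a small set of discrete values)
    outcomes:  1 if the event occurred, 0 otherwise
    """
    n_total = len(forecasts)
    base_rate = sum(outcomes) / n_total

    # Group the outcomes by the forecast probability that was issued.
    groups = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[f].append(o)

    # Conditional relative frequency of the event given each forecast value.
    conditional = {f: sum(obs) / len(obs) for f, obs in groups.items()}

    # Reliability: does P(E | F = f) equal f? (0 is perfect)
    reliability = sum(
        len(obs) * (f - conditional[f]) ** 2 for f, obs in groups.items()
    ) / n_total

    # Resolution: does P(E | F = f) differ from the overall frequency?
    resolution = sum(
        len(obs) * (conditional[f] - base_rate) ** 2 for f, obs in groups.items()
    ) / n_total

    return conditional, reliability, resolution


# Hypothetical record: ten forecasts of 0.1 (one wet) and ten of 0.9 (nine wet).
FORECASTS = [0.1] * 10 + [0.9] * 10
OUTCOMES = [1] + [0] * 9 + [1] * 9 + [0]

conditional, reliability, resolution = reliability_resolution(FORECASTS, OUTCOMES)
print(conditional)
print(f"reliability = {reliability:.3f}, resolution = {resolution:.3f}")
```

In this hypothetical record the forecasts are perfectly reliable (rain follows 10% of the 0.1 forecasts and 90% of the 0.9 forecasts) and also have good resolution, because the outcome frequency differs strongly between the two forecast values. A forecaster who always issued the climatological probability would be reliable but have zero resolution.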