Title: Probability Distributions continued...
Probability Distributions continued...
Hypergeometric Distribution
- Sampling without replacement - consider a batch of 20 microwave modules, of which 3 are defective.
- We sample and test one module first, and find it is defective.
- at this time, there was a 3/20 chance of obtaining a defect
- We sample a second time, without replacing the module
- this time, there is a 2/19 chance of obtaining a defect
- the outcome of the second trial is no longer independent of the first trial
- probability of success/failure changes with each trial
Hypergeometric Distribution
- Modeling this situation
- back away from looking at probability for each trial, and return to a counting approach
- suppose we have a total of N objects in the batch, of which d are defective
- we take samples of n objects, and we want to know the probability of x of them being defective
- there are C(N, n) ways of taking the sample of n objects
- within the sample, there are C(N-d, n-x) ways of choosing the n-x non-defective objects
- within the sample, there are C(d, x) ways of choosing the x defective objects
Hypergeometric Distribution
- the total number of ways of obtaining a sample with x defective objects is C(N-d, n-x) C(d, x)
- using the counting approach, the probability of obtaining x defects is n(E)/n(S), where n(E) is the number of outcomes in the event "obtain x defects"
- hypergeometric probability function

  p_X(x) = \frac{\binom{d}{x} \binom{N-d}{n-x}}{\binom{N}{n}}
Hypergeometric Distribution
- Example
- given a batch of 200 dashboard components, of which 10% (i.e., 20) are typically defective
- we take a sample of 10 components and test without replacement
- what is the probability of 3 defective components?
- probability of 0 defects is 0.34

  p_X(3) = \frac{\binom{20}{3} \binom{200-20}{10-3}}{\binom{200}{10}} \approx 0.055
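The slide's numbers can be checked with a short Python sketch of the hypergeometric probability function (stdlib only; the function name is ours, not from the slides):

```python
from math import comb

def hypergeom_pmf(x, N, d, n):
    """P(x defectives in a sample of n, drawn without replacement
    from N objects of which d are defective)."""
    return comb(d, x) * comb(N - d, n - x) / comb(N, n)

# Slide's example: N = 200, d = 20 defective (10%), sample of n = 10
p3 = hypergeom_pmf(3, N=200, d=20, n=10)
p0 = hypergeom_pmf(0, N=200, d=20, n=10)
print(round(p3, 3))  # -> 0.055
print(round(p0, 2))  # -> 0.34
```

Summing the pmf over x = 0..10 gives 1, which is a useful sanity check on the counting argument.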
Poisson Distribution
- used when considering discrete occurrences in a continuous interval
- e.g., number of auto accidents in a 100 km stretch of road
- e.g., number of breakages in a length of yarn
- obtained via a Binomial distribution argument, in which the number of trials is very large
- key assumption - independence in the interval (think of infinitely many trials) - occurrences are statistically independent
Poisson Distribution
- link to Binomial Distribution
- consider a continuous interval divided into sub-intervals
- each sub-interval can be considered as a trial
- assumption of independence is used here
- take the limit as the number of sub-intervals goes to infinity, and obtain the Poisson distribution
Poisson Distribution
- probability function - probability of k occurrences in the interval
- \lambda - average number of occurrences in the interval (parameter)

  P(k) = \frac{\lambda^k e^{-\lambda}}{k!}

- we can also fix the distribution in terms of \alpha - the average number of occurrences per unit interval - we then have \lambda = \alpha t, where t is the length of the interval

  P(k) = \frac{(\alpha t)^k e^{-\alpha t}}{k!}
Poisson Distribution
- Mean
- we identified \lambda as the average number of occurrences in the interval

  \mu = E[X] = \lambda

- Variance

  \sigma^2 = E[(X - \mu)^2] = \lambda

Note - the average number of occurrences in the interval, \lambda, can be estimated from observations
Poisson Distribution
- Additional Notes
- Poisson distribution can be used to approximate the Binomial distribution, when the number of independent trials is very large, and p is very small (i.e., np is relatively constant) - use \lambda = np
- why is the approximation necessary? - if the number of trials n is 100, we will have a term 100! in the Binomial probability function - difficult to compute on a calculator
- e.g., for n > 20, p < 0.05 - approximation is good
- e.g., for n > 100, p < 0.01 - approximation is very good
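A quick numerical comparison (a Python sketch, not from the slides) illustrates the "very good" regime, n = 100, p = 0.01, so \lambda = np = 1:

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k occurrences, parameter lam."""
    return lam**k * exp(-lam) / factorial(k)

# n = 100, p = 0.01  =>  lambda = n*p = 1
for k in range(4):
    print(k, round(binom_pmf(k, 100, 0.01), 4), round(poisson_pmf(k, 1.0), 4))
```

The two columns agree to roughly two decimal places, and the Poisson form avoids evaluating 100! directly.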
Poisson Distribution - Example
- Consider a 100 km section of the 401, in which the discrete occurrence is an accident. The average number of accidents in the 100 km stretch (monthly) is 15.
- What is the probability of
- a) 0 accidents occurring
- b) 10 accidents occurring
- c) 15 accidents occurring
- in this stretch?
Poisson Distribution - Example
- Note
- discrete occurrences (accidents) in continuous interval (distance) - \lambda = 15
- a) soln - P(0) = \frac{15^0 e^{-15}}{0!} = e^{-15} \approx 3.06 \times 10^{-7} - virtually no chance of no accidents occurring
- b) soln - P(10) = \frac{15^{10} e^{-15}}{10!} \approx 0.05
- c) soln - P(15) = \frac{15^{15} e^{-15}}{15!} \approx 0.10
Continuous Random Variables
Continuous Random Variables
- take values on the real line
- e.g., temperature, pressure, composition, density
- Can we define a probability function for a continuous random variable?
- First pass - follow the discrete case
- have a function p_X(x) that assigns a probability that X = x
- problem - we have infinitely many values of x - we can't assign small enough probabilities so that they sum to 1 over the entire sample space
- effectively, P(X = x) = 0 - probability of a single value is 0
- (hand-waving - think of the counting approach to computing probabilities)
This doesn't work!
Probability Density Function
- Instead, consider a probability density function f_X(x)
- Interpretation
- f_X(x) gives us the probability that the values lie in an infinitesimally small neighbourhood around x - intuitive, not strictly rigorous
- f_X(x) represents frequency of occurrence, and can be considered as the continuous histogram
- restrictions on f_X(x) follow from the restrictions on probability

  f_X(x) \ge 0, \qquad \int_{-\infty}^{\infty} f_X(x) \, dx = 1

i.e., P(S) = 1 - something must happen
Probability Density Function
- Example - Normal probability density function
- the familiar bell-shaped curve

  f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}
Cumulative Distribution Function
- What is P(X < t)?
- (t is some number of interest)
- e.g., P(Temperature < 350)
- the event of interest is those values of temperature less than 350 C - sum of probabilities of outcomes in this event
- sum becomes integral in the continuous case
Cumulative Distribution Function (also known as cumulative density function)

  F_X(t) = \int_{-\infty}^{t} f_X(x) \, dx
Expected Value
- We can also define the expected value operation in a manner analogous to the discrete case
- weighting is performed by the probability density function
- summation is replaced by an integral because we're dealing with a continuum of values

  E[X] = \int_{-\infty}^{\infty} x f_X(x) \, dx

The mean is \mu = E[X], as in the discrete case.
Variance
- is defined using the expected value
- Standard deviation is the square root of variance.

  \sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x) \, dx
Expected Values
- can be taken of a general function of a random variable

  E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x) \, dx

- Note - the expected value is a linear operation (as in the discrete case)
- additivity -- E(X_1 + X_2) = E(X_1) + E(X_2)
- scaling -- E(kX) = k E(X)
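The integral definition of an expected value can be checked numerically. The sketch below (our illustration, not from the slides) evaluates E[g(X)] for a Normal density with a simple midpoint rule:

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Normal probability density function."""
    return exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def expect(g, pdf, lo, hi, n=100000):
    """E[g(X)] = integral of g(x) f(x) dx, midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) * pdf(lo + (i + 0.5) * h)
               for i in range(n)) * h

mu, sigma = 2.0, 0.5
f = lambda x: normal_pdf(x, mu, sigma)
mean = expect(lambda x: x, f, mu - 8 * sigma, mu + 8 * sigma)
var = expect(lambda x: (x - mean)**2, f, mu - 8 * sigma, mu + 8 * sigma)
print(mean, var)  # -> approximately 2.0 and 0.25
```

Weighting g(x) = x by the density recovers the mean; g(x) = (x - mu)^2 recovers the variance, exactly as the slides define them.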
Building a library of continuous distributions
- We will consider -
- uniform distribution
- exponential distribution
- Normal distribution
- and later, as needed -
- Student's t-distribution
- Chi-squared distribution
- F-distribution
These are needed for statistical inference - decision-making in the presence of uncertainty.
Uniform Distribution
- We have values that occur in an interval
- e.g., composition between 0 and 2 g/L, and the probability is equal (uniform) across the interval
We have a rectangular histogram - values occur with equal frequency over the range.
[Figure: rectangular density f_X(x), constant over the interval from a to b]
Uniform Distribution
- What is the probability density function?
- constant
- what is the height?
- Area under the curve must equal 1
- Area is (b-a) x height
- height = 1/(b-a)

  f_X(x) = \begin{cases} \frac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{elsewhere} \end{cases}
Uniform Distribution
- Mean -

  E[X] = \int_{-\infty}^{\infty} x f_X(x) \, dx = \frac{1}{b-a} \int_{a}^{b} x \, dx = \frac{a+b}{2}

- matches intuition
- Variance -

  \sigma^2 = E[(X - \mu)^2] = \frac{(b-a)^2}{12}

- variance grows as width of interval for uniform distribution grows
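A simulation check of these two formulas (a Python sketch using the slide's 0 to 2 g/L example; the sample size is arbitrary):

```python
import random

random.seed(0)
a, b = 0.0, 2.0  # e.g., composition between 0 and 2 g/L
samples = [random.uniform(a, b) for _ in range(200000)]

mean = sum(samples) / len(samples)
var = sum((x - mean)**2 for x in samples) / len(samples)

print(mean)  # -> close to (a + b) / 2 = 1.0
print(var)   # -> close to (b - a)**2 / 12 = 1/3
```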
Uniform Distribution
- What might we model with such a distribution?
- Example - instrument readout to the nearest integer - pressure gauge
- if we are provided only with the nearest integer, the true pressure could be 0.5 below reading, or 0.5 above
- in absence of any additional information, we assume that values are distributed uniformly between these two limits
- additional example - numerical roundoff in computations
Normal Distribution
- arguably one of the most important distributions
- probability density function - parameterized by the mean and variance - written in a specific form

  f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}
Normal Distribution
- is symmetric
- centre is at the mean
- variance - standard deviation - measure of width (dispersion) of the distribution
- Cumulative distribution function
- integral has no analytical (closed-form) solution
- must rely on tables or numerical computation

  F_X(t) = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx
Standard Normal Distribution
- Problem
- cumulative distributions must be computed numerically - summarize in table form
- can't have a table for each possible value of mean, standard deviation
- Solution
- consider a new random variable

  Z = \frac{X - \mu_X}{\sigma_X}

- where X is normally distributed with mean \mu_X and standard deviation \sigma_X
Standard Normal Distribution
- mean of Z

  E[Z] = E\left[\frac{X - \mu_X}{\sigma_X}\right] = \frac{E[X] - \mu_X}{\sigma_X} = 0

- variance of Z

  E[Z^2] = E\left[\left(\frac{X - \mu_X}{\sigma_X}\right)^2\right] = \frac{E[(X - \mu_X)^2]}{\sigma_X^2} = \frac{\sigma_X^2}{\sigma_X^2} = 1

- Standard Normal Distribution
- scaling and centering to produce zero mean, unit variance
Standard Normal Distribution
- Values are available from tables - cumulative distribution values
Using the Standard Normal Tables
- What is P(Z lt 1.96)?
- What is P(Z lt -1.96)?
- What is P(-1.96 lt Z lt 1.96)?
Interpretation
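The three table lookups can also be done in stdlib Python via the error function, since the standard Normal CDF is Phi(z) = (1 + erf(z / sqrt(2))) / 2:

```python
from math import erf, sqrt

def phi(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

print(round(phi(1.96), 3))               # P(Z < 1.96)          -> 0.975
print(round(phi(-1.96), 3))              # P(Z < -1.96)         -> 0.025
print(round(phi(1.96) - phi(-1.96), 2))  # P(-1.96 < Z < 1.96)  -> 0.95
```

Interpretation: 95% of the probability lies within about two standard deviations of the mean.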
Central Limit Theorem
- why the Normal distribution is so important
- given N independent random variables, each having the same distribution with mean \mu and variance \sigma^2, then
- the sum of the N random variables follows a Normal distribution, AND
- for \bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i, we have

  Z = \lim_{N \to \infty} \frac{\bar{X} - \mu}{\sigma / \sqrt{N}}

where Z is the standard Normal distribution
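A small simulation (our illustration; N and the replication count are arbitrary) shows the standardized mean of uniform random variables behaving like a standard Normal:

```python
import random
from math import sqrt

random.seed(1)
N = 50
mu, var = 0.5, 1 / 12  # mean and variance of a single Uniform(0, 1) trial

# Standardize the sample mean of N uniforms, repeated many times
zs = []
for _ in range(20000):
    xbar = sum(random.random() for _ in range(N)) / N
    zs.append((xbar - mu) / sqrt(var / N))

# Fraction of standardized means inside +/-1.96 approaches the
# standard Normal value of 0.95
inside = sum(1 for z in zs if -1.96 < z < 1.96) / len(zs)
print(inside)  # -> close to 0.95
```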
Central Limit Theorem - Consequences
- in many instances, the Normal distribution provides a reasonable approximation for quantities that are a sum of independent random variables
- e.g., Normal approximation to Binomial
- e.g., Normal approximation to Poisson
- many quantities measured physically tend to a Normal distribution
Failures in Time
- we have an important pump on a recirculation line
- the packing fails on average 0.6 times/year
- what is the probability of the pump packing failing before 1 year?
- What is the probability that the time to failure is less than 1 year?
Exponential Distribution
- Events occur in time at an average rate \lambda per unit time
- What is the probability that the time to the event occurring is less than a given time t?
- Approach -
- similar to Poisson problem - think in terms of small time increments - independent trials
- P(event occurs before a given time) = 1 - P(event doesn't occur in given time)
Exponential Distribution
- event doesn't occur in a given time - 0 occurrences
- Poisson - with occurrence rate of \lambda t in interval t

  P(0 \text{ occurrences}) = \frac{(\lambda t)^0 e^{-\lambda t}}{0!} = e^{-\lambda t}

- P(event occurs before this time)

  1 - e^{-\lambda t}
Exponential Distribution
- Denote X as time to occurrence.
- Cumulative distribution function

  F_X(t) = P(X \le t) = 1 - e^{-\lambda t}

- Density function -

  f_X(t) = \lambda e^{-\lambda t}

- distributions are parameterized by \lambda - average number of occurrences per unit time
- can also parameterize in terms of \theta - mean time to failure - then \theta = 1/\lambda
Pump Failure Problem
- packing fails on average 0.6 times / year
- P(pump fails within year)

  P(X \le 1) = 1 - e^{-0.6(1)} \approx 0.45

- 45% chance of failure within a year
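The same calculation as a two-line Python check:

```python
from math import exp

def expon_cdf(t, lam):
    """P(time to occurrence <= t) for an exponential distribution."""
    return 1 - exp(-lam * t)

lam = 0.6  # packing failures per year
p = expon_cdf(1.0, lam)
print(round(p, 2))  # -> 0.45, i.e., ~45% chance of failure within a year
```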
Exponential Distribution - Notes
- the exponential random variable is a continuous random variable - time to occurrence takes on a continuum of values
- development assumes failure rate is constant
- development assumes that failures are independent, and that each time increment is an independent trial - cf. Poisson distribution
- mean and variance

  \mu = E[X] = \frac{1}{\lambda}, \qquad \sigma^2 = E[(X - \mu)^2] = \frac{1}{\lambda^2}
Exponential Distribution
- Problem Variations -
- given mean time to failure, determine probability that time to failure is less than a given value
- given fraction of components failing in a specified time, what is probability that time to failure is less than a given value?
- what is probability that a component lasts at least a given time?
Exponential Distribution
- Memoryless Property
- given that a component has operated for 100 hours, what is the probability that it operates for at least 200 hours before failing, i.e., P(X > 200 | X > 100)?
- consider A = {X > 100}, B = {X > 200}
- A \cap B = B
- recall conditional probability

  P(B \mid A) = \frac{P(A \cap B)}{P(A)}

- for our events, we have

  P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(B)}{P(A)}
Exponential Distribution
- Memoryless property
- probability of individual events

  P(X > 100) = 1 - P(X \le 100) = 1 - (1 - e^{-100\lambda}) = e^{-100\lambda}

  P(X > 200) = e^{-200\lambda}

- for our events,

  P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(B)}{P(A)} = \frac{e^{-200\lambda}}{e^{-100\lambda}} = e^{-100\lambda} = P(X > 100)
Exponential Distribution
- Memoryless property
- interpretation - probability of component lasting for another 100 hours given that it has functioned for 100 hours is simply probability of it lasting 100 hours
- prior history, in form of conditional probability, has no influence on probability of failure
- consequence of form of distribution which results in part from assumption of independence of time slices
- note that A and B are NOT independent
- general result for exponential random variable

  P(X > a + b \mid X > a) = P(X > b)
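The memoryless property can be seen directly in simulation (our sketch; the failure rate of 0.01 per hour is a hypothetical value chosen for illustration):

```python
import random
from math import exp

random.seed(2)
lam = 0.01  # hypothetical failure rate per hour
times = [random.expovariate(lam) for _ in range(200000)]

# P(X > 200 | X > 100): restrict to components that survived 100 hours
survived_100 = [t for t in times if t > 100]
p_cond = sum(1 for t in survived_100 if t > 200) / len(survived_100)

# Unconditional P(X > 100)
p_100 = sum(1 for t in times if t > 100) / len(times)

# Both estimates approach e^{-100*lam} = e^{-1} ~ 0.368
print(p_cond, p_100)
```

The two estimated probabilities agree, exactly as P(X > a + b | X > a) = P(X > b) predicts with a = b = 100.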