Normal Dist1

1
Continuous Probability Distributions: The Normal
Distribution
2
Towards the Meaning of Continuous Probability
Distribution Functions
When we introduced probabilities, we spoke of
discrete events.
S = collection of all possible sample points e_i
0 ≤ P(e_i) ≤ 1 ⇒ Probability
of any event is between zero and
one
Σ P(e_i) = 1 ⇒ Probabilities of all elementary
events sum to 1 (something happens)
3
In particular, for the binomial distribution:

For the random variable X,
x stands for a particular value.

0 ≤ Pr(X = x) ≤ 1
The probability that the random variable X takes
the value x is between 0 and 1, inclusive.

Σ Pr(X = x) = 1
The sum of the probabilities over all possible
values of x is 1.
4
A continuous variable has infinitely many
possible values.
With infinitely many possible
values, the probability of observing any one
particular value is essentially zero: Pr(X = x) =
0, e.g., for x = 1.0 vs 1.02 vs
1.0195 vs 1.01947, ...
Pr(X = x) is therefore uninformative for a continuous random
variable. Instead, we consider a range of
values for X: Pr(a ≤ X ≤ b). We can make this range
quite broad or very narrow.
5
Comparing Probability Distributions for Discrete
vs Continuous Random Variables
We need new
notation to describe probability distributions
for continuous variables.
Discrete: List all possible sample points, e.g., S = {e_i,
i = 1 to k}.
Continuous: State the range of possible values of X, e.g.,
−∞ < X < ∞.
Note: ∞ is the symbol for infinity.
6

For a continuous Random Variable, X,
P(X = x) = 0
Instead, we compute the probability of X within
some interval:
Pr(a ≤ X ≤ b) = ∫ f_X(x) dx, integrated from x = a to x = b
This function, f_X(x), is the probability density function
of X.
Don't worry if you don't know or have forgotten
calculus; I won't be asking you to work with this
notation.
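As a rough numerical illustration of "probability is area under the density", the sketch below approximates Pr(a ≤ X ≤ b) by adding up thin rectangles. It assumes Python's standard-library `statistics.NormalDist` as a stand-in density, and `interval_probability` is a hypothetical helper name; neither comes from the slides.

```python
from statistics import NormalDist


def interval_probability(dist, a, b, steps=10_000):
    """Approximate Pr(a <= X <= b) as the area under dist.pdf (midpoint rule)."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width


X = NormalDist(mu=0.0, sigma=1.0)  # illustrative density, chosen for this sketch

# Area under the density between -1 and 1, compared with the built-in cdf
area = interval_probability(X, -1.0, 1.0)
exact = X.cdf(1.0) - X.cdf(-1.0)
print(round(area, 4), round(exact, 4))

# Shrinking the interval around x = 1 drives the probability toward zero
for half_width in (0.1, 0.01, 0.001):
    print(half_width, interval_probability(X, 1.0 - half_width, 1.0 + half_width))
```

The last loop shows why Pr(X = x) is taken to be 0: the narrower the interval, the smaller the area.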
7

Much of statistical inference is based upon a
particular choice of a probability density
function, f_X(x):
the Normal distribution.
This function is a mathematical model describing
one particular pattern of variation of values.
It is appropriate for continuous variables only.

Practically speaking, the normal distribution
function is appropriate for
Many phenomena that occur naturally.
Special cases of other phenomena, e.g.,
averages of phenomena that individually are not
normally distributed.
For example, the sampling distribution of means
may follow a normal distribution even when the
underlying data do not.

9
The Normal Probability Density Function

Features to note
The range of X is −∞ to ∞
π is the mathematical constant 3.14159...
e is the mathematical constant 2.71828...

10
The Normal Probability Density Function

Features to note
μ is the mean of the distribution
σ is the standard deviation of the distribution
σ² is the variance
(x − μ)², the squared deviation from the mean,
appears in the function
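The density itself was on the slide image and did not survive in this transcript. The standard form of the normal density, which matches every feature listed on these two slides, is:

```latex
f_X(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}
         \exp\!\left( -\frac{(x-\mu)^{2}}{2\sigma^{2}} \right),
\qquad -\infty < x < \infty
```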

11
Notation: X ~ N(μ, σ²)
We say "X follows a
Normal Distribution with mean μ and variance σ²"
or "X is Normally distributed with mean μ and
variance σ²".
12
A Picture of the Normal Distribution
[Figure: the infamous bell-shaped curve, plotted against x]
13

There are infinitely many normal distributions,
each determined by different values of μ and σ².
The Shape of the Normal Distribution is
characteristically:
Smooth
Defined everywhere on the real axis
Bell-shaped
Symmetric about the mean μ (it is defined in
terms of deviations about the mean)

14
The area under the curve represents probability,
and the total area under the curve = 1
15
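[Figure: normal curve with the area to the left of x shaded, labeled Pr[X < x]; axis marks at −∞, μ, and x]
The area under the curve up to the value x is
often represented by the notation Pr[X < x].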
16
A Feeling for the Shape of the Normal
Distribution
μ locates the center, and
σ measures the spread
17

If μ alone is changed by adding a constant c,
the entire curve is shifted in location,
but the shape remains the same.

If σ alone is changed by multiplying by a
constant c,
the shape of the bell is changed:
a larger variance implies a wider spread (a
flatter curve), but the area under the curve is
always 1.
19
Picturing the Normal Probability Density

As the variance, σ², increases:
Bell flattens (gets wider)
Values close to the mean are less likely
Values farther from the mean are more likely
As the variance decreases:
Bell narrows
Most values are close to the mean
Values close to the mean are more likely

20
A Very Handy Rough Rule of Thumb
If X follows a Normal Distribution, then 68% of the values
of X are in the interval μ ± σ.
[Figure: normal curve with the central 68% shaded between μ − σ and μ + σ]
21
If X follows a Normal Distribution, then
95% of the values of X are in the interval μ ± 1.96σ
99% of the values of X are in the interval μ ± 2.576σ
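These rule-of-thumb percentages can be checked numerically. The sketch below assumes Python's standard-library `statistics.NormalDist`; `central_area` is a hypothetical helper name introduced only for this illustration.

```python
from statistics import NormalDist


def central_area(k):
    """Probability that a normal value lies within k standard deviations of its mean."""
    Z = NormalDist()  # standard normal; the answer is the same for any mean and sd
    return Z.cdf(k) - Z.cdf(-k)


for k in (1.0, 1.96, 2.576):
    print(f"within {k} standard deviations: {central_area(k):.4f}")
```

Because the result depends only on the number of standard deviations, the same three percentages apply to every normal distribution.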
22

Why is the Normal Distribution So Important?
There are two types of data that follow a normal
distribution
A number of naturally occurring phenomena
For example
heights of men (or women)
total blood cholesterol of adults
Special functions of some non-normally
distributed phenomena, in particular sums and
averages
The sampling distribution of sample means tends
to be Normal.

23
Research often focuses on sample means.
Example: Blood pressure can vary with time of day,
stress, food, illness, etc. One reading may not
be a good representation of what is typical.
The distribution of a single reading of blood
pressure for an individual tends to be skewed,
with a few high values.
24
To have a better gauge of an individual's BP, we
might use the average of 5 readings.
The sampling distribution of the mean of 5 readings for
an individual tends to be Normal, even when
the original distribution is not.
25

A Feeling for the Central Limit Theorem.
Roll a pair of dice.
On each roll, note the total of the two dice
faces.
This total can range from 2 to 12.
The most likely total is 7. (Why?)
How often do the other totals arise?

[Figure: histogram of dice totals for n = 100 trials of
rolling a pair of dice]
26
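[Figure: histogram of dice totals for n = 1000 trials of
rolling a pair of dice]
As the number of trials n increases, the histogram
of the sum of the 2 dice settles into a symmetric shape,
peaked at 7, that is already roughly bell-like. Sums of
more dice look more and more normal.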
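One way to answer the "Why?" for the total of 7 is to list all 36 equally likely outcomes of two dice and count how many give each total. The sketch below does this with Python's standard library; the variable names are illustrative and not from the slides.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 equally likely (die 1, die 2) outcomes give each total
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    ways = counts[total]
    print(f"{total:2d}  {ways}/36  " + "#" * ways)
```

Seven can be made in more ways than any other total, and the counts fall off symmetrically on either side, which is the shape a histogram of many rolls approaches.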
27

A Statement of the Central Limit Theorem
For any population with
mean μ and finite variance σ²,
the sampling distribution of means, x̄,
from samples of size n from this population,
will be approximately normally distributed
with mean μ,
and variance σ²/n,
for n large.
That is, for n large, and X ~ ?(μ, σ²), where ? is any distribution,
then X̄_n ~ N(μ, σ²/n), approximately.

This is the main reason for our interest in the
normal distribution
regardless of the underlying distribution
if we take a large enough sample
we can make probability statements about means
from such samples
based upon the normal distribution.
This is true, even when the underlying
distribution is discrete.

29
Example: The Central Limit Theorem Works even for
VERY non-normal data
A population has only 3
outcomes in it: 1, 2, 9, each with P(X = x) = 1/3
[Figure: bar chart of P(X = x) = 1/3 at x = 1, 2, 9]

Sum of 1, 2, 9 = 12
Mean of 1, 2, 9: μ = 4
Standard deviation of 1, 2, 9: σ = 3.6

30
Experiment: Take a sample of size n with
replacement. Compute the sum of all n values. Repeat.
Look at the Sampling Distribution of Sums.
[Figure: histograms of the sampling distribution of sums for n = 25, n = 50, n = 100]
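The experiment can also be done exactly rather than by repeated sampling. The sketch below builds the exact distribution of the sum of n draws from {1, 2, 9} and compares one probability with the normal approximation. It assumes Python's standard library; `sum_distribution` is a hypothetical helper, and the 0.5 continuity correction is a choice made for this sketch, not something stated on the slides.

```python
from math import sqrt
from statistics import NormalDist

values = (1, 2, 9)  # the three equally likely outcomes
mu = sum(values) / len(values)  # population mean
sigma = sqrt(sum((v - mu) ** 2 for v in values) / len(values))  # population sd


def sum_distribution(n):
    """Exact distribution of the sum of n draws with replacement."""
    dist = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for total, p in dist.items():
            for v in values:
                nxt[total + v] = nxt.get(total + v, 0.0) + p / len(values)
        dist = nxt
    return dist


for n in (25, 50, 100):
    dist = sum_distribution(n)
    approx = NormalDist(mu * n, sigma * sqrt(n))  # normal with matching mean and sd
    cutoff = mu * n
    exact_p = sum(p for total, p in dist.items() if total <= cutoff)
    # 0.5 is a continuity correction because the sums are whole numbers
    print(n, round(exact_p, 3), round(approx.cdf(cutoff + 0.5), 3))
```

Printing the two columns side by side for growing n gives a feel for how closely the normal curve tracks the exact distribution of sums.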
31

To compute probabilities for a normal
distribution.
Recall that we are looking at intervals of values
of the random variable, X.
The probability that X has a value in the
interval between a and b is the area under the
curve corresponding to that interval

Note: since Pr(X = a), or the probability of any exact value, is zero,
this can be written as Pr(a ≤ X ≤ b) or Pr(a < X < b).
[Figure: normal curve with the area between a and b shaded]
32

The symmetry of the normal distribution can also
help in computing probabilities.
The normal distribution is symmetric about the
mean µ.
This tells us that the probability of a value
less than the mean is .5 or 50%,
and the probability of a value greater than the
mean is also .5 or 50%.

[Figure: normal curve split at the mean, with area 0.5 on each side]
33
The Standard Normal Distribution
The standard normal distribution is just one of
infinitely many possible normal distributions.
It has mean μ = 0 and variance σ² = 1.
[Figure: standard normal curve with μ = 0 and σ = 1]
By convention we let the letter Z represent a
random variable that is distributed Normally with
μ = 0 and σ² = 1: Z ~ N(0,1)

34

The standard normal distribution is important for
several reasons
Probabilities of Z within any interval have been
computed and tabulated.
It is possible to look up Pr(a ≤ Z ≤ b) for any
values of a and b in such tables.
Any other normal distribution can be transformed
to a standard normal for computing probabilities.
Distances from the mean are equivalent to number
of standard deviations from the mean.
This last is perhaps of greatest interest to us,
now that software does much of the transformation
and computation for us.

Table 3 in the Appendix of Rosner gives areas
under the normal curve, in 4 different ways.
Column A gives values between −∞ and z, where z
is a particular value of the standard normal
distribution. (Note: Rosner uses X rather than
Z.)
That is, column A gives values for Pr(−∞ ≤
Z ≤ z) = Pr(Z ≤ z). z is also known as a standard
normal deviate.

[Figure: standard normal curve with the area from −∞ to z shaded, labeled Pr[Z < z]]
36

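Table 3 in the Appendix of Rosner
Column B gives values between z and ∞: Pr(z ≤ Z
≤ ∞) = Pr(z ≤ Z) = Pr(Z ≥ z)
Column C gives values between 0 and z:
Pr(0 ≤ Z ≤ z)
Column D gives values between −z and z: Pr(−z ≤
Z ≤ z)

[Figure: three standard normal curves shading the area above z (Column B), between 0 and z (Column C), and between −z and z (Column D)]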
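The four columns are all derived from the same cumulative probability, so they can be reproduced from one function call. The sketch below assumes Python's standard-library `statistics.NormalDist`; `table_columns` is a hypothetical helper written for this illustration and is not part of Rosner or Minitab.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and standard deviation 1


def table_columns(z):
    """Return the Column A, B, C, D areas described above for a value z >= 0."""
    a = Z.cdf(z)   # Column A: Pr(Z <= z)
    b = 1.0 - a    # Column B: Pr(Z >= z)
    c = a - 0.5    # Column C: Pr(0 <= Z <= z)
    d = 2.0 * c    # Column D: Pr(-z <= Z <= z)
    return a, b, c, d


for z in (0.0, 0.01, 0.30, 0.31):
    print(z, " ".join(f"{area:.4f}" for area in table_columns(z)))
```

Columns B, C, and D follow from Column A by the total-area and symmetry properties, which is why a single cumulative probability is enough.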
37

A probability calculation for any random
variable, X ~ Normal(μ, σ²), can be re-expressed as
an equivalent probability calculation for a
standard Normal(0,1). This is nice because:
We have tables for probabilities of the Normal
(0,1) distribution.
We can interpret probabilities in terms of the number of
standard deviations from the mean.
Of course, we can also use computer programs to
compute probabilities for any Normal Distribution;
the program does the translation for us.

38
The Normal (0,1) or Standard Normal
Table
Positive values of z are read from the
first column (under x in Rosner).
The shaded area, which is the probability of Z ≤
z, is shown under Col A of the table: Pr(Z <
0.31) = .6217

z      A      B      C      D
0.0    .5000  .5000  .0     .0
0.01   .5040  .4960  .0040  .0080
0.30   .6179  .3821  .1179  .2358
0.31   .6217  .3783  .1217  .2434

A check that this makes sense: any positive value
of z is above the mean, and should have a
probability > .5
[Figure: standard normal curve with the area to the left of z = 0.31 shaded, labeled Pr[Z < 0.31]]
39

Note that only positive values of z are
tabulated.
We can take advantage of a few important features
of the standard normal, to compute probabilities
for values of z less than zero
Symmetry ⇒ Pr(Z ≤ −z) = Pr(Z ≥ z)
Zero is the median ⇒ Pr(Z ≤ 0) = Pr(Z ≥ 0) =
.50
Total area is 1 ⇒ Pr(Z ≤ z) + Pr(Z ≥ z) = 1

40
For example, we cannot read Pr(Z < −0.31)
directly from the tables. We can, however, use the
property of symmetry.
Pr(Z > 0.31) = .3783: we can read this probability from Col B.
Pr(Z < −0.31) = .3783: use the property of symmetry to get this.

[Figure: two standard normal curves, one with the upper tail above z = 0.31 shaded and one with the lower tail below z = −0.31 shaded]
41
[Figure: standard normal curve with axis marks at −z, 0, and z]
42
Example Word Problem
What is the probability of a
value of Z more than 1 standard deviation below
the mean?
Solution: Since μ = 0 and σ = 1, 1
standard deviation below the mean is z = μ − (1 × σ) = 0 − 1 = −1
Pr(Z < −1) = 0.1587
[Figure: standard normal curve with the area below z = −1 shaded]
The probability of observing a value more than 1
standard deviation below the mean is .1587, or
just under 16%.
43
Example: What is the probability Z is between
−1.5 and 1.5? We can read this from Column D of
the Table in Rosner: Pr[−1.50 ≤ Z ≤ 1.50] from
the table = 0.8664
Example: What is the
probability of Z more than 1.5 standard
deviations from the mean in either
direction? Since probabilities sum to 1,
Pr[Z ≤ −1.50 or 1.50 ≤ Z] = 1 − 0.8664 = 0.1336
By symmetry, half of this, or 0.0668, lies at either
end.
[Figure: standard normal curve with tails of area .0668 shaded below −1.50 and above 1.50]
44
Exercise
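Find the area under the standard normal curve
between Z = 1 and Z = 2.
Solution.
It helps to draw pictures!
[Figure: three standard normal curves showing the area between 1 and 2 as the area below 2 minus the area below 1]
Pr(1 < Z < 2) = Pr(Z < 2) −
Pr(Z < 1) = 0.9772 −
0.8413 = 0.1359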
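The table lookups in the last three slides can be cross-checked in software. The sketch below assumes Python's standard-library `statistics.NormalDist` in place of Table 3 or Minitab; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and standard deviation 1

# More than 1 standard deviation below the mean
below_minus_1 = Z.cdf(-1.0)

# Within 1.5 standard deviations of the mean, and the two tails outside it
within_1_5 = Z.cdf(1.5) - Z.cdf(-1.5)
outside_1_5 = 1.0 - within_1_5

# Area between 1 and 2: everything below 2 minus everything below 1
between_1_and_2 = Z.cdf(2.0) - Z.cdf(1.0)

print(f"Pr(Z < -1)          = {below_minus_1:.4f}")
print(f"Pr(-1.5 <= Z <= 1.5) = {within_1_5:.4f}")
print(f"Pr(|Z| >= 1.5)       = {outside_1_5:.4f}")
print(f"Pr(1 < Z < 2)        = {between_1_and_2:.4f}")
```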
45

Notes on using Standard Normal Tables
These come in a variety of formats. The examples
given here are for the version seen in Rosner,
Table 3 in the Appendix.
Look at the accompanying picture of the
distribution to be clear what probability is
listed in the body of the table.
Draw a sketch (paper and pencil) when computing
probabilities; it always helps you keep track of
what you are doing.
Minitab provides the same probabilities as Column
A, Pr(X < x), when Cumulative Probability is
selected.

46
Using Minitab: Calc → Probability Distributions
→ Normal
Select Cumulative Probability for Pr(Z < z) or Pr(X < x)
Enter value of z (or x)
47
Finding Percentiles of the Normal
Distribution
Example: What is the 75th
percentile of N(0,1)?
Solution: Again, it helps to draw a picture!
[Figure: standard normal curve with area 0.75 shaded to the left of z.75]
We want the area under the curve to be 75%.
The value of z we want is the value below which
75% of values are found. That is, find z.75 so
that Pr(Z < z.75) = .75
48
Use the Inverse Cumulative Option in Minitab
Input desired percentile
Inverse Cumulative Distribution Function
Normal with mean = 0 and standard deviation = 1.00000
P( X < x)        x
  0.7500    0.6745
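For readers without Minitab, the same inverse cumulative lookup can be sketched with Python's standard-library `statistics.NormalDist`. This is a substitute tool assumed for illustration, not the one used on the slide.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and standard deviation 1

# Inverse cumulative: the z below which 75% of values fall
z_75 = Z.inv_cdf(0.75)
print(f"z.75 = {z_75:.4f}")

# Check by going forward again: the cumulative probability at z.75
print(f"Pr(Z < z.75) = {Z.cdf(z_75):.4f}")
```

The cumulative and inverse cumulative functions undo one another, which is the relationship between the table body and its z column.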
49
Standardizing a Normal Random Variate: From
N(μ, σ²) to N(0,1)
We can transform any Normal distribution to a
standard normal by means of a simple
transformation:
Z = (X − μ)/σ
50
Standardizing a Normal Random Variate: From
N(μ, σ²) to N(0,1)
Adding a constant: For X ~ N(μ, σ²) ⇒ (X + b) ~ N(?, ?)
The mean is shifted over b units, but the
variance or spread of the data is unchanged by
adding a constant: (X + b) ~ N(μ + b, σ²)
51
Multiplying by a constant: For X ~ N(μ, σ²) ⇒ (aX) ~
N(?, ?)
[Figure: rescaled normal curve with axis labels aμ and aσ]
The mean is adjusted to a times the original
mean, and the variance to a² times the original
variance; this is a shift in scale: (aX) ~
N(aμ, a²σ²)
52
Adding a constant, multiplying by a constant: For
X ~ N(μ, σ²) ⇒ (aX + b) ~ N(?, ?)
Both adjustments are made. The mean is adjusted
to a times the original mean plus b, and the
variance to a² times the original
variance: (aX + b) ~ N(aμ + b, a²σ²)
53
Now, let a = 1/σ and b = −μ/σ. Then for
X ~ N(μ, σ²) ⇒ Z = aX + b = (X − μ)/σ ~ N(?, ?), or Z ~ N(0,1)
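Spelling out the substitution on this slide, using the rule from the previous slide that aX + b has mean aμ + b and variance a²σ²:

```latex
\begin{aligned}
Z &= aX + b = \frac{1}{\sigma}X - \frac{\mu}{\sigma} = \frac{X - \mu}{\sigma} \\
\mathrm{E}[Z] &= a\mu + b = \frac{\mu}{\sigma} - \frac{\mu}{\sigma} = 0 \\
\mathrm{Var}(Z) &= a^{2}\sigma^{2} = \frac{\sigma^{2}}{\sigma^{2}} = 1
\end{aligned}
```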
54
Z = (X − μ)/σ

We have transformed the original scale
to units measured in multiples of standard
deviations,
centered around zero.
A value of z = −1 means the value of x is 1
standard deviation below the mean.
A value of z = 2.5 means the value of x is 2.5
standard deviations above the mean.

55
This transformation is also important, because if
we want to know Pr(a ≤ X ≤ b), then we can
convert it to an equivalent calculation:
Pr(a ≤ X ≤ b) = Pr((a − μ)/σ ≤ Z ≤ (b − μ)/σ)
56
Word Problem
The profit from the Massachusetts state lottery
on any given week is distributed Normally with
mean 10.0 million dollars and variance 6.25 (in
squared millions of dollars). What is the probability that this week's
profit is between 8 and 10.5 million? Let X =
weekly profit in millions. Then X ~ N(μ, σ²),
where μ = 10 and σ² = 6.25 (⇒ σ = 2.5). What is
Pr(8 ≤ X ≤ 10.5)?
57
What is Pr(8 ≤ X ≤ 10.5)? Translate to Standard
Normal:
Pr(8 ≤ X ≤ 10.5) = Pr((8 − 10)/2.5 ≤ Z ≤ (10.5 − 10)/2.5) = Pr(−.8 ≤ Z ≤ .2)
58
[Figure: standard normal curve with the area between −.8 and .2 shaded]
Pr(−.8 ≤ Z ≤ .2) = Pr(Z < 0.2) − Pr(Z < −.8)

Read from Table 3 or use Minitab or other program:
= 0.5793 − 0.2119

= 0.3674
The probability of a weekly profit between 8 and
10.5 million dollars is 36.74%.
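The lottery calculation can be reproduced in a few lines, both by standardizing first and by letting the program do the translation. The sketch assumes Python's standard-library `statistics.NormalDist` as the "other program"; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, variance = 10.0, 6.25  # weekly profit in millions of dollars
sigma = sqrt(variance)     # standard deviation, 2.5

# Route 1: standardize the endpoints, then use the standard normal
z_low = (8.0 - mu) / sigma
z_high = (10.5 - mu) / sigma
Z = NormalDist()
p_standardized = Z.cdf(z_high) - Z.cdf(z_low)

# Route 2: let the program handle the translation from X to Z
X = NormalDist(mu, sigma)
p_direct = X.cdf(10.5) - X.cdf(8.0)

print(z_low, z_high)
print(f"{p_standardized:.4f} {p_direct:.4f}")
```

Both routes give the same probability, which is the point of the standardizing transformation.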
59

Application of the Central Limit Theorem
Means of samples of size n
from a population with
mean μ and variance σ²
approximately follow a normal distribution
with mean μ and variance σ²/n, for n large.
That is, for X ~ ?(μ, σ²), where ? is any distribution,
for n large,
X̄ ~ N(μ, σ²/n)

60
Example: Consider a population of families with
μ = 3.4 children per family and σ² = 4.37. What
percentage of samples of size n = 4 families will
have means greater than 5 children per
family? Sample means from samples with n = 4
follow a normal distribution with μ_x̄ = 3.4 and
σ_x̄² = σ²/n = 4.37/4 = 1.09. Then σ_x̄ =
1.045. We want Pr(X̄ > 5), where X̄ ~ N(3.4, 1.09)
61
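Pr(X̄ > 5) = Pr(Z > (5 − 3.4)/1.045) = Pr(Z > 1.53) = 0.06
[Figure: standard normal curve with the area above z = 1.53 shaded]
The probability of observing a sample with a mean
of 5 children per family or larger, when n = 4, is
about 6%.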
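The steps of this example can be sketched as follows, taking the slide's normal approximation for the sample mean at face value (n = 4 is small, so the approximation is rough). Python's standard-library `statistics.NormalDist` is assumed; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, variance, n = 3.4, 4.37, 4  # children per family, population variance, sample size

se = sqrt(variance / n)          # standard deviation of the sample mean
z = (5.0 - mu) / se              # standardize the cutoff of 5 children
p = 1.0 - NormalDist().cdf(z)    # upper-tail area above z

print(f"se = {se:.3f}, z = {z:.2f}, Pr(mean > 5) = {p:.3f}")
```

Note that the variance is divided by n before the square root is taken, so the sample mean varies less than a single family does.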
62

So far we have gone from
X ~ N(μ, σ²) ⇒ Z ~ N(0,1)
We may be interested in the reverse:
Z ~ N(0,1) ⇒ X ~ N(μ, σ²)

63
Example: The distribution of IQ scores is normal
with a mean of 100 and a standard deviation of
15. What is the 95th percentile of this
distribution?
Step 1: Find the 95th percentile
of the standard normal. Use Minitab, or another
program, to compute it:
Inverse Cumulative Distribution Function
Normal with mean = 0 and standard deviation = 1.00000
P( X < x)        x
  0.9500    1.6449
or z.95 = 1.645
64
Step 2: We know X ~ N(100, 15²), and z.95 =
1.645
x.95 = σ z.95 + μ = (15)(1.645) + 100 =
124.7
The 95th percentile of the IQ
distribution is 124.7
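The two steps above can be sketched in code, along with the one-step shortcut a program allows. Python's standard-library `statistics.NormalDist` is assumed here as the "other program"; the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100.0, sigma=15.0)

# Step 1: 95th percentile of the standard normal
z_95 = NormalDist().inv_cdf(0.95)

# Step 2: convert back to the IQ scale, x = sigma * z + mu
x_95_by_hand = iq.stdev * z_95 + iq.mean

# Shortcut: ask for the percentile of the IQ distribution directly
x_95_direct = iq.inv_cdf(0.95)

print(f"z.95 = {z_95:.4f}")
print(f"x.95 = {x_95_by_hand:.1f} (two steps), {x_95_direct:.1f} (direct)")
```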
65
Another Example: Taking samples of size n = 4 from
the population of families with μ = 3.4 children
per family and σ² = 4.37, what is the middle 50%
of the sampling distribution? That is,
find a and b so that Pr(a ≤ X̄ ≤ b) = .50
a is the 25th percentile of the sampling distribution of X̄
b is the 75th percentile of the sampling distribution of X̄
[Figure: normal curve with the middle 50% between a and b and 25% in each tail]
66
Use Minitab to find the 25th and 75th percentiles of the
standard normal:
Inverse Cumulative Distribution Function
P( X < x)         x
  0.2500    -0.6745
  0.7500     0.6745
For X̄ ~ N(μ, σ²/n), where μ = 3.4 and σ²/n = 1.09, convert z
back to x̄: x̄ = z σ_x̄ + μ
x̄.75 = .675 (1.045) + 3.4 = 4.11
x̄.25 = −.675 (1.045) + 3.4 = 2.69
⇒ Pr(2.69 < X̄ < 4.11) = .50
50% of samples of size 4 from this population will have
mean family size between 2.69 and 4.11 children
per family.
67

Recap: Introduction to the Normal
Distribution
For continuous variables, we speak of a
probability density function
We calculate the probabilities of intervals of
values, not individual values
The normal distribution is a good description of
many naturally occurring phenomena
the average of non-normal phenomena
This last is particularly important since much
statistical inference is based on the behavior of
averages.

While there are infinitely many normal
distributions, each determined by μ and σ²,
they can all be standardized by using the
transformation Z = (X − μ)/σ.
We use the standardized form to compute
probabilities for any normal distribution.
In the standardized form, distance from the mean
is in units of standard deviation.
