Title: Univariate Statistics 2: the normal curve, z-scores, and relative frequencies
1. Univariate Statistics 2: the normal curve,
z-scores, and relative frequencies
2. Some Important Characteristics of the Normal Curve
- The normal curve is a symmetrical distribution of
scores, with an equal number of scores above and
below the midpoint of the abscissa (the horizontal
axis of the curve).
- Since the distribution of scores is symmetrical,
the mean, median, and mode are all at the same
point on the abscissa. In other words, the mean =
the median = the mode.
- If we divide the distribution up into standard
deviation units, a known proportion of scores
lies within each portion of the curve.
- Tables exist so that we can find the proportion
of scores above and below any point on the curve,
expressed in standard deviation units. Scores
expressed in standard deviation units, as we will
see shortly, are referred to as z-scores.
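The claim that a known proportion of scores lies within each standard deviation band can be checked numerically. The following is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist`; the helper name `proportion_within` is hypothetical and is not part of the original slides.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of scores lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Symmetry: half of the scores lie below the midpoint.
print(f"below the mean: {std_normal.cdf(0):.4f}")

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {proportion_within(k):.4f}")
```

Under these assumptions the three bands come out at roughly 68%, 95%, and 99.7% of the scores, which is the kind of fixed proportion that printed tables record.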
3. The Standard Normal Distribution
Mean = 0; standard deviation = 1
Total area under the curve = 100%
4. Working out relative frequencies with a standard
normal distribution
- Because the area under a histogram is equal to
100%
- And because we are familiar with the shape of a
standard normal distribution
- We can use the standard normal distribution to
work out the relative frequencies of different
events.
5. Question: What is the relative frequency of
observations below 1.18?
6. Question: What is the relative frequency of
observations below 1.18?
That is, find the relative frequency of the event
Z < 1.18. (Here small z is 1.18.)
Step 1: Sketch the curve. Identify, on the
measurement (horizontal/X) axis, the indicated
range of values.
The event Z < 1.18 is shaded in green. Events and
possibilities are one and the same.
7. Question: What is the relative frequency of
observations below 1.18?
Step 2: The relative frequency of the event is
equal to the area under the curve over the range
of values that describes the event.
The blue area is the relative frequency of the
event Z < 1.18. This area appears to be
approximately 85-90% of the total. A good sketch
will help you verify your answer.
8. Question: What is the relative frequency of
observations below 1.18?
- Step 3
- Look at the table that describes the area under
the standard normal curve. Some tables list
precise values. If so, you can look up 1.18.
Otherwise, since this is nearly 1.2, you can look
that up for an approximation.
- Corresponding to a measurement value of z = 1.18
is an area of 0.8810.
- This is exactly the answer to the question!
Notice that it agrees with the picture as well as
the original "guess." For any value z, the table
supplies the area under the curve over the region
to the left of z.
- Again, area = relative frequency.
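The table lookup in Step 3 can be reproduced in code. This is a minimal sketch, assuming Python 3.8+ and the standard-library `statistics.NormalDist`, whose `cdf` method returns the area to the left of a value; the variable names are illustrative only.

```python
from statistics import NormalDist

# The standard normal curve (mean 0, standard deviation 1).
std_normal = NormalDist()

z = 1.18
# cdf(z) is the area under the curve to the left of z,
# i.e. the relative frequency of the event Z < z.
area_below = std_normal.cdf(z)

print(f"relative frequency of Z < {z}: {area_below:.4f}")
```

The computed area should agree with the tabled value to four decimal places, and it falls inside the 85-90% range estimated from the sketch.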
9. A note on the layout of tables
Tables showing z-scores are laid out in different
ways. However, they all tell you the same thing. A
common layout, which differs from the one in Field,
is shown below. If we go across the row for 1.1 to
the column for 0.08, we find the entry for
z = 1.18. It says 0.8810. This is what Field calls
the Larger Portion. This table does not give the
Smaller Portion. But since we know that the two
portions sum to 1.0 (or 100%), we can work it out
as 1 - 0.8810 = 0.1190 (which is what Field lists
as the Smaller Portion).
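To make the row-and-column layout concrete, the sketch below rebuilds one row of such a table and derives both portions for z = 1.18. It assumes Python 3.8+ and `statistics.NormalDist`; the function name `table_row` is hypothetical, and the exact rounding used in any particular printed table is an assumption.

```python
from statistics import NormalDist

std_normal = NormalDist()

def table_row(row_z):
    """Areas to the left of row_z + 0.00, 0.01, ..., 0.09, rounded to 4 places."""
    return [
        round(std_normal.cdf(round(row_z + col / 100, 2)), 4)
        for col in range(10)
    ]

row = table_row(1.1)          # the row labelled 1.1
larger_portion = row[8]       # the column labelled 0.08, i.e. z = 1.18
smaller_portion = round(1 - larger_portion, 4)

print("row 1.1:", row)
print("larger portion:", larger_portion)
print("smaller portion:", smaller_portion)
```

Because the two portions always sum to 1, a table that lists only one of them still carries the full information.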
10. Question: What is the relative frequency of
observations below 1.18?
Answer: 0.8810 or 88.10%. For a standard normal
variable, the relative frequency of observations
falling below 1.18 is 0.8810. (Also, for any normal
distribution, 0.8810 or 88.10% of the observations
fall below the point 1.18 standard deviations
above the mean.)
11. Question: What is the relative frequency of
observations below -0.63?
12. Question: What is the relative frequency of
observations below -0.63?
- Identify the range of values described by "below
-0.63" (shaded green).
- Identify the area you need to find (shaded blue).
- Look up the appropriate area in your table. (In
some tables you must be careful to choose the
"negative" portion of your table: look up -0.63.)
- That area is 0.2643. For a standard normal
variable, the relative frequency of observations
falling below -0.63 is 0.2643. (Also, for any
normal distribution, 0.2643 or 26.43% of the
observations fall below the point 0.63 standard
deviations below the mean. Below, because -0.63 is
negative.)
Answer: 0.2643 or 26.43%.
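For tables that list only positive z values, the symmetry of the curve gives the same answer: the area below -0.63 equals the area above +0.63. A minimal sketch of both routes, assuming Python 3.8+ and `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

std_normal = NormalDist()

# Direct lookup in the "negative" part of the table.
below_negative = std_normal.cdf(-0.63)

# Same area obtained by symmetry: the smaller portion beyond +0.63.
above_positive = 1 - std_normal.cdf(0.63)

print(f"area below -0.63: {below_negative:.4f}")
print(f"area above +0.63: {above_positive:.4f}")
```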
13. However... a problem
- So far we've been working with a standard normal
distribution.
- However, this is not really that useful, since
most populations have neither a mean of 0 nor a
standard deviation of 1.
- In the real world, statistics needs to be able to
work out the likelihood of other types of events,
which have a wide range of values.
- For instance: how likely is it to get a mark of
at least 75 in an exam, if the mean is 60 and the
standard deviation is 10?
14. z-scores
- In order to address these types of problems we
use z-scores.
- z-scores are calculated by subtracting the mean
from a value and dividing the result by the
standard deviation.
- $z = \frac{X - \text{mean}}{s}$
- z-scores will always have a mean of 0 and a
standard deviation of 1.
- We can quickly see that this is true of the
mean, since when X = mean the numerator equals 0,
and therefore z must equal 0.
- It may be a little less clear that it is true of
the standard deviation.
- However, consider the instance when X is one
standard deviation bigger than the mean
(i.e. X = mean + s):
- $z = \frac{(\text{mean} + s) - \text{mean}}{s} = \frac{s}{s} = 1$
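The property that z-scores have a mean of 0 and a standard deviation of 1 can be checked on any set of values. The sketch below is illustrative only: the `marks` list is made-up data, the function name `z_scores` is hypothetical, and it assumes the sample standard deviation (`statistics.stdev`) is the `s` intended on the slide.

```python
from statistics import mean, stdev

def z_scores(values):
    """Convert raw values to z-scores: subtract the mean, divide by s."""
    m = mean(values)
    s = stdev(values)  # assumption: s is the sample standard deviation
    return [(x - m) / s for x in values]

marks = [45, 52, 60, 61, 68, 74]  # hypothetical exam marks
zs = z_scores(marks)

print("z-scores:", [round(z, 2) for z in zs])
print("mean of z-scores:", round(mean(zs), 10))
print("SD of z-scores:", round(stdev(zs), 10))
```

The same result holds if the population standard deviation is used instead, as long as the same definition is applied when standardising and when checking.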
15. Example: conversion to z-scores
- So, returning to the example of the test marks,
where the question was the likelihood of getting
a mark of 75 or better if mean = 60 and s = 10. We
can calculate the z-score for a mark of 75 as
follows:
- $z = \frac{75 - 60}{10} = 1.5$
- Once we know the z-score for a mark, we can work
out the relative frequency with which people will
score that amount or better in exactly the same
way as we did when working with the standard
normal distribution.
16. Example continued
The likelihood of a mark of 75 or over is the
same as the likelihood of a z-score of 1.5 or over.
It is shown by the red part of the distribution.
If we look up 1.5, we find the area beyond it (the
smaller portion) is 0.0668. Therefore there is a
probability of about 6.7% of getting a mark of 75
or better.
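The exam example can be worked through in code, both via the z-score and directly on the distribution of marks. This is a minimal sketch, assuming Python 3.8+ and `statistics.NormalDist`, and assuming the marks are normally distributed with mean 60 and standard deviation 10 as the example states; variable names are illustrative.

```python
from statistics import NormalDist

mean_mark, sd_mark = 60, 10
mark = 75

# Route 1: convert to a z-score, then use the standard normal curve.
z = (mark - mean_mark) / sd_mark
p_via_z = 1 - NormalDist().cdf(z)  # area to the right of z

# Route 2: ask the same question of the distribution of marks itself.
p_direct = 1 - NormalDist(mu=mean_mark, sigma=sd_mark).cdf(mark)

print(f"z-score for a mark of {mark}: {z}")
print(f"P(mark >= {mark}) via z-score: {p_via_z:.4f}")
print(f"P(mark >= {mark}) directly:    {p_direct:.4f}")
```

Both routes give the same area, which is the point of standardising: one table serves every normal distribution.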
17. You can also find the score that corresponds with
a certain percentage
- For example, you want to find what mark you would
need to be in the top 20%.
- Look at the tables in reverse, finding what
z-score corresponds to a p (smaller portion) of
.20: approximately z = .84.
- Then work out the mark that corresponds to
z = .84:
- $X = \text{mean} + (0.84 \times \text{standard deviation})$
- $X = 60 + (0.84 \times 10) = 68.4$
- Therefore you would need a mark of 68.4 or higher
to be in the top 20%.
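Reading the table in reverse corresponds to the inverse of the cumulative area function. The sketch below assumes Python 3.8+ and `statistics.NormalDist`, whose `inv_cdf` method returns the value below which a given proportion of the area lies; the mean of 60 and standard deviation of 10 are taken from the running example.

```python
from statistics import NormalDist

mean_mark, sd_mark = 60, 10
top_share = 0.20

# The top 20% begins where 80% of the area lies to the left.
z_cutoff = NormalDist().inv_cdf(1 - top_share)

# Convert the z-score back to a mark: X = mean + z * standard deviation.
mark_cutoff = mean_mark + z_cutoff * sd_mark

print(f"z-score cutting off the top 20%: {z_cutoff:.4f}")
print(f"mark needed: {mark_cutoff:.1f}")
```

The computed z-score carries more decimal places than a printed table, so the resulting mark can differ from the hand calculation in the second decimal place.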
18. Example: You want to find the scores people get
in the middle 30%.
- This would correspond with scores ranging from
15% below the mean to 15% above it.
- Some tables give values between two points; the
one in Field does not.
- So we work out that the small area to the right
of b (the upper boundary) is equal to
(50% - 15%) = 35%.
- We then look at the tables in reverse, and find
that a smaller portion of .35 occurs at a z-score
of approximately .385.
- Likewise, the z-score 15% below the mean (at a,
the lower boundary) is -.385.
- We can then convert these to marks:
- $60 + (-0.385 \times 10) = 56.15$ and $60 + (0.385 \times 10) = 63.85$
- Therefore the middle 30% of marks fell between
56.15 and 63.85.
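The middle-30% calculation generalises to any central proportion. The following is a minimal sketch under the same assumptions as the example (normally distributed marks, mean 60, standard deviation 10), using Python 3.8+ and `statistics.NormalDist`; the function name `central_interval` is hypothetical.

```python
from statistics import NormalDist

def central_interval(mean, sd, proportion):
    """Lower and upper scores that enclose the middle `proportion` of a normal curve."""
    tail = (1 - proportion) / 2             # area left over in each tail
    z = NormalDist().inv_cdf(1 - tail)      # z-score with `tail` area to its right
    return mean - z * sd, mean + z * sd

low, high = central_interval(60, 10, 0.30)
print(f"middle 30% of marks: {low:.2f} to {high:.2f}")
```

Here each tail holds 35% of the area, so the z-score is the one with 65% of the area to its left, and the interval is symmetric about the mean.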