Lecture 7 THE NORMAL AND STANDARD NORMAL DISTRIBUTIONS - PowerPoint PPT Presentation

Slides: 61
Provided by: colin111
Transcript and Presenter's Notes

1
Lecture 7 THE NORMAL AND STANDARD NORMAL
DISTRIBUTIONS
2
Populations and samples
  • When we gather data, the POPULATION is the
    reference set containing ALL POSSIBLE
    OBSERVATIONS (ALL scores, ALL reaction times,
    ALL IQs).
  • Our own data are usually a selection or SAMPLE
    from the population.
  • In statistics, our data are assumed to be samples
    from known THEORETICAL populations.

3
Distribution of 4000 IQs
4
A sample from a population
  • This is a picture of the distribution of 4000
    IQs.
  • The histogram is symmetrical and bell-shaped.
  • That's because we have sampled from a NORMAL
    DISTRIBUTION.
  • The normal distribution is the most important
    theoretical population in statistics.

5
Large samples
  • From the Laws of Large Numbers, we can expect the
    values of sample statistics to be close to those
    of the corresponding population parameters.
  • The mean of this large sample is 99.9 and the SD
    is 14.96. These values are quite close to the
    population mean of 100 and the population SD of
    15.
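The point about large samples can be checked with a small simulation. This is only a sketch: it assumes Python's standard library in place of the lecture's SPSS data file, and the seed and variable names are arbitrary choices, so the exact values will differ from the 99.9 and 14.96 quoted above.

```python
import random
import statistics

random.seed(7)  # arbitrary seed so that the run is repeatable

MU, SIGMA = 100, 15   # population parameters quoted in the slides
N = 4000              # sample size quoted in the slides

# Draw a simulated sample of N IQs from a normal population.
sample = [random.gauss(MU, SIGMA) for _ in range(N)]

sample_mean = statistics.mean(sample)   # the sample mean
sample_sd = statistics.stdev(sample)    # the sample SD (n - 1 denominator)

print(f"sample mean = {sample_mean:.2f}, sample SD = {sample_sd:.2f}")
```

With 4000 values, the sample mean and SD should land close to 100 and 15, though never exactly on them.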

6
It makes sense to say that, if IQ is
normally distributed with a mean of 100 and an SD
of 15, we have sampled 4000 values from a normal
distribution.
7
Population means distribution
  • In these lectures, I shall use the terms
    population and distribution interchangeably.

8
Statistics versus parameters
  • STATISTICS are characteristics of SAMPLES;
    PARAMETERS are characteristics of POPULATIONS.
  • A normal population has two parameters:
  • the mean
  • the standard deviation.
  • The IQ population is (approximately) a normal
    distribution with a mean of 100 and an SD of 15.

9
A notational convention
  • Letters from the Roman alphabet, such as M and s
    (for the mean and standard deviation,
    respectively), are used to denote the values of
    the STATISTICS of samples.
  • Greek letters (µ, σ) are used to denote the
    values of the corresponding population
    characteristics or PARAMETERS.

10
The IQ example
  • In this particular sample of 4000 IQs, M = 99.9
    and s = 14.96.
  • In the population, µ = 100 and σ = 15.

11
The caffeine experiment
  • In the caffeine experiment, we are sampling from
    TWO populations:
  • the population of scores under the Placebo
    condition with mean µ1 and standard deviation σ1
  • the population of scores under the Caffeine
    condition with mean µ2 and standard deviation σ2.

12
Specifying a normal distribution
  • Suppose that a variable X has a normal
    distribution with mean µ and standard deviation
    σ.
  • We write this as X ~ N(µ, σ).

13
There are many normal distributions
  • There are an infinite number of normal
    distributions, each specified by particular
    values for µ and σ.
  • The IQ is approximately distributed as N(100,
    15).
  • The heights of men are approximately distributed
    as N(69, 2.6).

14
IQs of at least 130
  • Suppose IQ has a normal distribution, with a mean
    of 100 and a standard deviation of 15.
  • What proportion of people have IQs of at least
    130?

15
At least 130?
  • If a variable is normally distributed, 95% of
    values lie within 1.96 standard deviations (2
    approx.) on EITHER side of the mean.
  • So only 2½% (0.025) of values lie beyond about
    2 SDs above the mean.

[Figure: normal curve with 0.95 (95%) of the area between mean - 1.96 SD and mean + 1.96 SD, and 0.025 (2½%) in each tail.]
16
At least 130?
  • An IQ of 130 is 2 standard deviations above the
    mean.
  • So only about 2½% (0.025) of IQs lie beyond 130.

[Figure: normal curve with 0.95 (95%) of the area between mean - 1.96 SD and mean + 1.96 SD, and 0.025 (2½%) in each tail.]
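The 95% rule can be checked numerically. The sketch below assumes Python's `statistics.NormalDist` as a stand-in for printed tables; it is not the procedure used in the lecture. It also shows why the slides say "2 approx.": the tail beyond 1.96 SDs is 0.025, while the tail beyond exactly 2 SDs is slightly smaller.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution: mean 0, SD 1

# Area within 1.96 SDs on either side of the mean.
within_196 = z.cdf(1.96) - z.cdf(-1.96)

# Upper-tail areas beyond 1.96 SDs and beyond exactly 2 SDs.
tail_196 = 1 - z.cdf(1.96)
tail_2 = 1 - z.cdf(2)

print(f"within 1.96 SDs: {within_196:.4f}")
print(f"beyond 1.96 SDs: {tail_196:.4f}")
print(f"beyond 2 SDs:    {tail_2:.4f}")
```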
17
Probability
  • A PROBABILITY is a measure of likelihood ranging
    from 0 (an impossibility) to 1 (a certainty).
  • In the POPULATION, relative frequencies are
    probabilities.

18
Relative frequency as an area
Relative frequency of heights between 65 inches
and 70 inches.
19
Probability
  • In the POPULATION, relative frequencies are
    PROBABILITIES.
  • The area under the normal curve of height between
    65 inches and 70 inches is the PROBABILITY of a
    height in that range.

20
Probability as an area
Probability of a height between 65 inches and 70
inches.
21
IQ and probability
  • If IQ is indeed normally distributed with a mean
    of 100 and an SD of 15, about 2.5% of values in
    the population are greater than 130.
  • The PROBABILITY of an IQ of at least 130 is
    0.025.

[Figure: normal curve of IQ centred on 100. Probability of an IQ greater than 130 = 0.025; central area 0.95; lower tail 0.025.]
22
Notation
  • Intelligence is assumed to have a CONTINUOUS
    DISTRIBUTION: there are an infinite number of
    values between any two points.
  • So the probability of any one value is zero.
  • Consequently Pr(IQ ≥ 130) = Pr(IQ > 130) and
    Pr(IQ ≤ 130) = Pr(IQ < 130).

23
Probability density
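  • Associated with each value x of IQ is a
    PROBABILITY DENSITY, which can be thought of as
    a measure of the probability of a value IN THE
    NEIGHBOURHOOD of x.
  • The height of the normal curve above the value x
    is the probability density of x. The density is
    not itself a probability: the probability of a
    value in a narrow interval around x is
    approximately the density multiplied by the
    width of the interval.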
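One way to see what "in the neighbourhood of x" means is to compare the density at x, multiplied by the width of a narrow interval, with the exact area over that interval. This is a sketch using Python's `statistics.NormalDist`; the value x = 115 and the width of 1 IQ point are arbitrary illustrative choices.

```python
from statistics import NormalDist

iq = NormalDist(100, 15)  # IQ population assumed in the slides

x = 115        # illustrative value of IQ
width = 1.0    # narrow interval centred on x

density = iq.pdf(x)                # height of the curve above x
approx_prob = density * width      # rectangle: height times width
exact_prob = iq.cdf(x + width / 2) - iq.cdf(x - width / 2)  # area under the curve

print(f"density at {x}: {density:.5f}")
print(f"approximate probability: {approx_prob:.5f}")
print(f"exact probability:       {exact_prob:.5f}")
```

The two probabilities agree closely because the curve is nearly flat over so narrow an interval.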

24
Probability distribution
  • The normal distribution is an example of a
    PROBABILITY DISTRIBUTION.
  • It is so-called because we can use it to obtain
    the probability of values of the variable within
    a specified range.
  • There are several important probability
    distributions in statistics, and they are all
    used for this purpose.

25
The standard normal variable z
  • To find out how many standard deviations an IQ of
    130 is above the mean, we have to SUBTRACT the
    mean and DIVIDE by the value of the standard
    deviation, i.e., by 15.
  • If X is the original variable (X is IQ in this
    example), we have transformed X into another
    variable z, which is known as the STANDARD NORMAL
    VARIABLE.

26
The standard normal variable z
  • If X is a normal variable, that is,
  • X ~ N(µ, σ),
  • z will also be normally distributed.
  • z is known as the STANDARD NORMAL VARIABLE.

27
Standardisation
  • Strictly speaking, z is defined in relation to
    the theoretical population mean µ.
  • However, any set of scores X can be STANDARDISED
    by subtracting the sample mean from each score
    and dividing by the sample standard deviation.
  • We shall investigate the effects of standardising
    the 4000 IQ scores in our large sample.

28
Sample distribution of 4000 IQs
29
The distribution of X
  • This is the sample distribution of X, which is
    centred on 99.9 and has a standard deviation of
    14.96 IQ points.

30
The Compute Variable procedure
31
Transforming IQ to z
32
Distribution of z
  • The distribution of z is also normal, but it is
    centred on zero and has a standard deviation of
    1.

33
Scientific notation
  • -1.4016E-4 means
  • -1.4016 × 10^-4 = -.00014016, which is
    zero, within rounding error.

34
The statistics of z
  • Just use the Descriptives procedure to find the
    mean and standard deviation of z.
  • The mean is 0.
  • The standard deviation is 1.

35
Effects of standardisation
  • Standardising a set of scores (or a population
    of scores) has two effects:
  • The mean becomes zero.
  • The standard deviation becomes 1.
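The two effects of standardisation can be reproduced outside SPSS. The sketch below assumes a simulated sample in place of the lecture's data file of 4000 IQs, and uses the sample mean and sample SD as the slides describe; the seed and names are arbitrary.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so that the run is repeatable

# Simulated stand-in for the data file of 4000 IQs.
scores = [random.gauss(100, 15) for _ in range(4000)]

m = statistics.mean(scores)    # sample mean
s = statistics.stdev(scores)   # sample standard deviation

# Standardise: subtract the sample mean, divide by the sample SD.
z_scores = [(x - m) / s for x in scores]

z_mean = statistics.mean(z_scores)
z_sd = statistics.stdev(z_scores)

print(f"mean of z = {z_mean:.3e}, SD of z = {z_sd:.6f}")
```

The mean of z typically prints as a tiny number in scientific notation rather than as an exact 0, which is the rounding error described on the scientific notation slide.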

36
The standard normal distribution
  • In the notation I introduced earlier, we can
    represent the standard normal distribution as
    z ~ N(0, 1).

37
Distribution of z
  • Standardising a set of scores does NOT make them
    normally distributed.
  • If there's a tail to the right (+ve skew) before
    transforming X to z, there will be one after the
    transformation.
  • Nevertheless, whatever the shape of the original
    distribution, the mean of the standardised
    scores will be zero and the standard deviation
    will be 1.
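A short simulation illustrates that standardising leaves the shape of a distribution unchanged. This is a sketch under stated assumptions: the skewed data are drawn from an exponential distribution purely for illustration, and `skewness` is a hypothetical helper (the mean cubed z-score), not a statistic reported in the lecture.

```python
import random
import statistics

def skewness(values):
    """Mean cubed z-score: positive when there is a tail to the right."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])

random.seed(3)  # arbitrary seed so that the run is repeatable

# Positively skewed scores (illustrative, not IQ data).
raw = [random.expovariate(1.0) for _ in range(2000)]

m = statistics.mean(raw)
s = statistics.stdev(raw)
z_scores = [(x - m) / s for x in raw]

skew_before = skewness(raw)
skew_after = skewness(z_scores)

print(f"skew before = {skew_before:.3f}, skew after = {skew_after:.3f}")
print(f"mean of z = {statistics.mean(z_scores):.3e}, "
      f"SD of z = {statistics.stdev(z_scores):.3f}")
```

The skew is the same before and after, while the mean and SD of z are 0 and 1.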

38
Deviations sum to zero
[Figure labels: zero deviations, -ve deviations, +ve deviations.]
The mean is the centre of gravity, or balance
point. The deviations are the distances of the
points from the balance point. They must sum to
zero: the positives and negatives must cancel
each other out.
39
The mean of z
  • The numerator of z, (X - mean), is a DEVIATION
    SCORE.
  • Since deviations about the mean sum to zero, the
    mean of the distribution of z is also zero.
  • So the bell-shaped STANDARD NORMAL DISTRIBUTION
    is centered on zero.

40
The standard deviation of z
41
Using z
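[Figure: normal curve with two aligned axes. The X (IQ) axis is marked at 70, 100 and 130 (approximately 100 + 1.96 SD); the z axis is marked at -1.96, 0 and 1.96. The central area of 0.95 is the probability that X (IQ) lies between 70 and 130 and also the probability that z lies between -1.96 and 1.96. The upper tail is the probability that X (IQ) is at least 130 and also the probability that z is at least 1.96.]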
42
Referring questions from X to z
  • What is the probability of an IQ of at least 130?
  • This is to ask about the probability that X is at
    least 130, where X ~ N(100, 15).
  • Transform X to z: z = (130 - 100)/15 = 2.
  • We know that the probability of z greater than 2
    (strictly, 1.96) is .025.
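Referring a question from X to z can be sketched in a few lines. The code assumes Python's `statistics.NormalDist` rather than tables or SPSS, and `to_z` is a hypothetical helper name. It asks the same question twice, once on the IQ scale and once on the z scale, to show that the two answers agree.

```python
from statistics import NormalDist

MU, SIGMA = 100, 15  # IQ population assumed in the slides

def to_z(x, mu, sigma):
    """Subtract the mean and divide by the standard deviation."""
    return (x - mu) / sigma

z = to_z(130, MU, SIGMA)

# The same upper-tail probability, asked on the z scale and on the IQ scale.
p_via_z = 1 - NormalDist().cdf(z)
p_direct = 1 - NormalDist(MU, SIGMA).cdf(130)

print(f"z = {z}")
print(f"Pr(z >= {z}) = {p_via_z:.4f}")
print(f"Pr(IQ >= 130) = {p_direct:.4f}")
```

The probability comes out slightly below .025 because z here is exactly 2 rather than 1.96.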

43
Between 100 and 130?
  • Convert these values to values of z.
  • If X = 100, z = 0.
  • If X = 130, z = 2.
  • Pr(z between 0 and 2) = 0.475.

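[Figure: normal curve with 0.95 (95%) of the area between µ - 1.96 SD and µ + 1.96 SD and 0.025 (2½%) in each tail. The X axis is marked at µ - 1.96 SD, µ and µ + 1.96 SD; the z axis is marked at -1.96, 0 and 1.96.]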
44
Finding the probability of a range of values of
X
  • In the problems we have considered, the value of
    z has always been around 2 (about 1.96), so that
    we can find the probability from memory.
  • Suppose z = 1, 0.5, or any value other than
    1.96?
  • Just standardise the value of X by converting it
    to z: z = (X - mean)/SD.
  • There are tables in standard statistics
    textbooks which give the probability of any
    specified range of z. You can also use SPSS to
    find such probabilities.
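For values of z that are not near 1.96, software can take the place of the printed tables. The sketch below assumes Python's `statistics.NormalDist` as one such substitute; `prob_between` and `prob_above` are hypothetical helper names.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1

def prob_between(z_low, z_high):
    """Probability that z lies between z_low and z_high."""
    return STD_NORMAL.cdf(z_high) - STD_NORMAL.cdf(z_low)

def prob_above(z):
    """Probability that z is at least the given value."""
    return 1 - STD_NORMAL.cdf(z)

p_0_to_2 = prob_between(0, 2)   # IQ between 100 and 130
p_above_1 = prob_above(1)       # e.g. IQ of at least 115
p_above_half = prob_above(0.5)  # e.g. IQ of at least 107.5

print(f"Pr(0 < z < 2) = {p_0_to_2:.4f}")
print(f"Pr(z > 1)     = {p_above_1:.4f}")
print(f"Pr(z > 0.5)   = {p_above_half:.4f}")
```

The first result is a little above the 0.475 quoted from memory, because that figure belongs to z = 1.96 rather than z = 2.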

45
The standard normal distribution
  • There are countless normal distributions.
  • But there is only ONE standard normal
    distribution, to which any of the others can be
    transformed by z = (X - mean)/SD.
  • So only the probabilities of ranges of values of
    z need to be tabled.
  • It would not be feasible to table the
    probabilities for ALL possible normal
    distributions.

46
To sum up
  • If we know the DISTRIBUTION of some variable, we
    can assign a probability of obtaining a value
    within a specified range.
  • We can visualise the probability of such a value
    as the area under the curve of the distribution.
  • If the distribution is normal, we can translate
    probability questions in the original units of
    measurement into questions about ranges of z,
    which, provided X is normally distributed, has
    the STANDARD NORMAL DISTRIBUTION.

47
Percentiles
  • A PERCENTILE is the VALUE or SCORE below which a
    specified percentage or proportion of the
    distribution lies.
  • The 30th percentile is the value below which 30%
    of the distribution lies.
  • The 70th percentile is the value below which 70%
    of scores lie.

48
The 30th and 70th percentiles
[Figure: the 30th percentile, with 0.30 of the area below it and (0.70) above; the 70th percentile, with 0.70 of the area below it and (0.30) above.]
49
Cumulative probability
  • The CUMULATIVE PROBABILITY of a particular value
    is the probability of a value LESS THAN OR EQUAL
    TO the value.
  • The cumulative probability of a value at the 30th
    percentile is 0.30. The cumulative probability
    of a value at the 70th percentile is 0.70.

[Figure: the cumulative probability of a score at the 30th percentile is 0.30; the cumulative probability of a score at the 70th percentile is 0.70.]
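Percentiles and cumulative probabilities are two directions of the same relationship, which a short sketch can make concrete. It assumes the IQ distribution N(100, 15) used earlier in the lecture as the example and Python's `statistics.NormalDist` as the tool; neither is prescribed by these slides.

```python
from statistics import NormalDist

iq = NormalDist(100, 15)  # illustrative distribution taken from earlier slides

# From a proportion to a score: the percentile.
p30 = iq.inv_cdf(0.30)
p70 = iq.inv_cdf(0.70)

# From a score back to a proportion: the cumulative probability.
cum_p30 = iq.cdf(p30)
cum_p70 = iq.cdf(p70)

print(f"30th percentile = {p30:.2f}, cumulative probability = {cum_p30:.2f}")
print(f"70th percentile = {p70:.2f}, cumulative probability = {cum_p70:.2f}")
```

Because the normal distribution is symmetrical, the two percentiles lie at equal distances on either side of the mean.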
50
Using cumulative probabilities
  • Given that height is normally distributed, with
    a mean of 69 inches and an SD of 2.6 inches, what
    is the probability of a man having a height
    between 65 and 70 inches?

51
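[Figure: CumProb(65) is the area below 65; CumProb(70) is the area below 70; the probability of a height between 65 and 70 is the area between 65 and 70, that is, CumProb(70) - CumProb(65).]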
52
The cumulative distribution function
53
Cumulative probability of 65
54
Cumulative probability of 70
  • Name the new target variable and insert the value
    70.
  • Each cumulative probability will appear in a
    column whose length is the number of rows in the
    existing data set.

55
The cumulative probabilities
  • There must be some data in the Editor already.
  • SPSS will create the new Target variables you
    have specified and will enter the cumulative
    probabilities.

56
[Figure: CumProb(65) = 0.06; CumProb(70) = 0.65; Pr(height between 65 and 70) = 0.65 - 0.06 = 0.59.]
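The subtraction of cumulative probabilities can be reproduced without SPSS. This sketch assumes Python's `statistics.NormalDist` in place of the SPSS cumulative distribution function; the variable names are illustrative.

```python
from statistics import NormalDist

height = NormalDist(69, 2.6)  # men's heights in inches, as given in the slides

cum_65 = height.cdf(65)  # probability of a height of 65 inches or less
cum_70 = height.cdf(70)  # probability of a height of 70 inches or less

# The probability of a height in the range is the difference of the two.
p_65_to_70 = cum_70 - cum_65

print(f"CumProb(65) = {cum_65:.2f}")
print(f"CumProb(70) = {cum_70:.2f}")
print(f"Pr(65 < height < 70) = {p_65_to_70:.2f}")
```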
57
Multiple-choice example
58
Second example
59
SPSS exercise 1
  • Open the SPSS data file containing 4000 IQs.
  • Use the Compute procedure (in the Transform menu)
    to standardise the scores.
  • Use Descriptives to obtain the mean and standard
    deviation of z.

60
SPSS exercise 2
  • Assuming that height is normally distributed with
    a mean of 69 inches and an SD of 2.6 inches, what
    is the probability of a man having a height
    between 74.2 inches and 76.8 inches?
  • Solve by using the CDF to find the cumulative
    probabilities directly and subtracting.
  • Transform the heights to z and compare the
    cumulative probabilities you obtain with those
    you obtained using the first approach.