Using SPSS to calculate summary statistics

Slides: 30
Provided by: MarkB9

1
Using SPSS to calculate summary statistics
2
SPSS - summary statistics
  • In the data view
  • Enter the scores from problem 4, page 66
  • Analyze > Descriptive Statistics > Frequencies
  • Select VAR00001 for analysis
  • Click Statistics
  • Choose the measures of central tendency you want
  • Continue
  • OK
  • Use SPSS to check your work on problem 6 (except
    for SS and mean deviation).
  • Do you get the same answers?

3
Chapter 3B
More on summary statistics
4
Other ways to look at the mean and the variance
  • Weighted average (mean)
  • Efficient formula for the variance

5
The weighted mean
  • Suppose that one had repeating scores in a sample
    to be averaged
  • Example: a student's grades in a 4.0 grading
    system
  • 2,3,3,3,3,3,4,4
  • One could find the mean directly, but this would
    be somewhat redundant
  • Instead, group like scores together and weight
    each score by its frequency
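
The grouping step can be sketched in Python using the grades listed on this slide. This is only an illustration; the variable names are made up for the example. It shows that the direct mean and the frequency-weighted mean agree.

```python
from collections import Counter

grades = [2, 3, 3, 3, 3, 3, 4, 4]

# Direct mean: add every score, then divide by the number of scores.
direct_mean = sum(grades) / len(grades)

# Weighted mean: group like scores and weight each by its frequency.
freq = Counter(grades)  # 2 occurs once, 3 five times, 4 twice
weighted_mean = sum(score * f for score, f in freq.items()) / sum(freq.values())

print(direct_mean, weighted_mean)  # 3.125 3.125
```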

6
The weighted mean
  • An even better motivation for using the weighted
    mean is that one might want to combine means from
    different studies
  • We can't just average the means
  • The means from larger samples must have a greater
    weight
  • How much weight?
  • Answer: the sample size
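  • Suppose we have 3 studies with means $\bar{X}_1, \bar{X}_2, \bar{X}_3$
  • And sample sizes $n_1, n_2, n_3$
  • The resulting combined mean is then
    $\bar{X} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3}{n_1 + n_2 + n_3}$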
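
A minimal sketch of combining study means, weighting each by its sample size. The three means and sample sizes below are hypothetical values chosen for illustration (the numbers from the original slide are not available), and `combined_mean` is a made-up helper name.

```python
def combined_mean(means, sizes):
    """Weight each study mean by its sample size."""
    total_n = sum(sizes)
    return sum(m * n for m, n in zip(means, sizes)) / total_n

# Hypothetical values, not the ones from the original slide.
means = [10.0, 12.0, 20.0]
sizes = [10, 20, 70]

print(combined_mean(means, sizes))  # 17.4
print(sum(means) / len(means))      # 14.0, the naive average of the means
```

The naive average ignores that the third study contributes most of the data, which is why the two results differ.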

7
Computationally efficient formula for the
variance
  • Not absolutely necessary
  • But, saves time in hand calculations
  • Is used by the author
  • Most of the work is in computing SS.
  • We will focus there.

8
Deriving the shortcut SS
  • Sum of squares is the numerator of the variance.
  • We will often have reason to handle it
    separately.

9
Deriving the shortcut SS
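  • Start with the definitional formula
    $SS = \sum (X - \bar{X})^2$
  • Expand the square:
    $SS = \sum X^2 - 2\bar{X}\sum X + N\bar{X}^2$
  • Substitute $\bar{X} = \frac{\sum X}{N}$:
    $SS = \sum X^2 - \frac{2(\sum X)^2}{N} + \frac{(\sum X)^2}{N}$
  • End with
    $SS = \sum X^2 - \frac{(\sum X)^2}{N}$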
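
As a quick check, the definitional sum of squares and the usual computational shortcut (sum of squared scores minus the squared sum divided by N) can be compared in Python. This sketch assumes the shortcut on the slide is that standard formula; the function names are hypothetical, and the scores are the grades from the weighted-mean slide.

```python
def ss_definitional(scores):
    """SS as the sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def ss_computational(scores):
    """SS via the shortcut: sum of X^2 minus (sum of X)^2 / N."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n

scores = [2, 3, 3, 3, 3, 3, 4, 4]
print(ss_definitional(scores), ss_computational(scores))  # 2.875 2.875
```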

10
Using the shortcut SS to find the variance and
standard deviation
  • Recall that the variance is SS/N for the
    population
  • And the unbiased variance is SS/(N-1) for the
    sample
  • And the standard deviation is just the square
    root of the variance
  • When calculating for the sample, we will almost
    always be using the unbiased expression
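
A short sketch of going from SS to the variance and standard deviation, cross-checked against Python's standard `statistics` module. The scores are reused from the weighted-mean slide purely as an illustration.

```python
import math
import statistics

scores = [2, 3, 3, 3, 3, 3, 4, 4]
n = len(scores)

# Shortcut sum of squares: sum of X^2 minus (sum of X)^2 / N.
ss = sum(x * x for x in scores) - sum(scores) ** 2 / n

pop_var = ss / n            # population variance, SS/N
sample_var = ss / (n - 1)   # unbiased sample variance, SS/(N-1)
pop_sd = math.sqrt(pop_var)
sample_sd = math.sqrt(sample_var)

# The library functions use the same two denominators.
print(math.isclose(pop_var, statistics.pvariance(scores)))    # True
print(math.isclose(sample_var, statistics.variance(scores)))  # True
print(math.isclose(sample_sd, statistics.stdev(scores)))      # True
```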

11
Properties of the mean: develop your intuition
about the mean
  • If a constant C is added to every score in a
    sample, the new mean is C + the original mean
  • If every score is multiplied by a constant C,
    then the new mean will be C × the original mean
  • This can be deduced by the properties of
    summation
  • The sum of the deviations from the mean will
    always be zero
  • The sum of the squared deviations from the mean
    (SS) will be less than the sum of the squared
    deviations around any other point in the
    distribution
  • This is called the least-squares property
  • Important when fitting a straight line to a cloud
    of points
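
These properties can be checked numerically. The sketch below uses the grades from the weighted-mean slide and an arbitrary constant; `ss_about` is a hypothetical helper that computes the sum of squared deviations around any point.

```python
scores = [2, 3, 3, 3, 3, 3, 4, 4]
c = 10  # arbitrary constant for illustration

def mean(xs):
    return sum(xs) / len(xs)

m = mean(scores)                         # 3.125
shifted = mean([x + c for x in scores])  # 13.125, which is c + m
scaled = mean([x * c for x in scores])   # 31.25, which is c * m

# Deviations from the mean always sum to zero.
deviation_sum = sum(x - m for x in scores)  # 0.0

def ss_about(point):
    """Sum of squared deviations around an arbitrary point."""
    return sum((x - point) ** 2 for x in scores)

# Least-squares property: SS is smallest when taken around the mean.
print(ss_about(m) < ss_about(m - 0.5))  # True
print(ss_about(m) < ss_about(m + 0.5))  # True
```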

12
Properties of the standard deviation: develop your
intuition about standard deviation
  • If a constant C is added to every score in a
    sample, the new standard deviation is the same as
    the original standard deviation
  • If every score is multiplied by a constant C,
    then the new standard deviation will be |C| × the
    original standard deviation
  • This can be deduced by the properties of
    summation
  • The standard deviation from the mean will be
    smaller than the standard deviation from any
    other point in the distribution
  • Follows from the related property of the mean
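
A minimal numerical check of these properties, using the population standard deviation from Python's `statistics` module. The scores and the constant are illustrative, and `rms_deviation` is a hypothetical helper measuring spread around an arbitrary point.

```python
import math
import statistics

scores = [2, 3, 3, 3, 3, 3, 4, 4]
c = 10  # arbitrary constant for illustration

sd = statistics.pstdev(scores)
sd_shifted = statistics.pstdev([x + c for x in scores])   # unchanged
sd_scaled = statistics.pstdev([x * c for x in scores])    # c times larger
sd_negated = statistics.pstdev([x * -c for x in scores])  # also c times larger

def rms_deviation(point):
    """Root-mean-square deviation of the scores from an arbitrary point."""
    return math.sqrt(sum((x - point) ** 2 for x in scores) / len(scores))

m = statistics.mean(scores)

print(math.isclose(sd_shifted, sd))       # True
print(math.isclose(sd_scaled, c * sd))    # True
print(math.isclose(sd_negated, c * sd))   # True
print(rms_deviation(m) < rms_deviation(m + 1))  # True
```

Multiplying by a negative constant flips the scores but cannot make the spread negative, which is why the scaling factor is the absolute value of the constant.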

13
Exercises
  • Page 76: 1-4, 5 a and b, 6-10

14
Chapter 4
  • Standardized scores and the normal distribution

15
Evaluating a single score within a distribution
  • The numerical distance between a score and the
    mean may not be very meaningful
  • However, if that distance can be expressed in
    units of standard deviation, then we have a much
    better understanding of the relationship between
    the score and the rest of the data

16
Z scores
  • Such a scaled distance is called the z score
  • We can replace our X scores with the z scores
  • Computed like so: $z = \frac{X - \mu}{\sigma}$
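
A minimal sketch of the conversion, assuming the usual definition of a z score as the distance from the mean divided by the population standard deviation. The scores are hypothetical and `z_scores` is a made-up helper name.

```python
import statistics

def z_scores(scores):
    """Express each score as a distance from the mean in standard deviation units."""
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)  # population standard deviation
    return [(x - mu) / sigma for x in scores]

# Hypothetical scores with mean 5 and standard deviation 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(scores))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```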

17
Properties of the z scores
  • Mean of the z scores is 0.
  • From the properties of the mean
  • Subtracting a constant from each score shifts the
    mean by the constant
  • Multiplying each score by a constant multiplies
    the mean by that constant.

18
This gives us a simplified formula for the
standard deviation
19
The standard deviation of a population of z
scores is always 1!
  • From properties of standard deviation
  • Adding a constant to each score does not change
    the standard deviation
  • Multiplying each score by a constant multiplies
    the standard deviation by that constant.
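
Both results can be written out in a few lines. The derivation below is a sketch that assumes population notation ($\mu$, $\sigma$, $N$) and the definition $z_i = (X_i - \mu)/\sigma$; the slides may use different symbols.

```latex
\begin{align*}
\bar{z} &= \frac{1}{N}\sum_{i=1}^{N} \frac{X_i - \mu}{\sigma}
         = \frac{1}{\sigma}\left(\frac{\sum X_i}{N} - \mu\right)
         = \frac{1}{\sigma}(\mu - \mu) = 0 \\[4pt]
\sigma_z^2 &= \frac{1}{N}\sum_{i=1}^{N} (z_i - 0)^2
            = \frac{1}{N}\sum_{i=1}^{N} \frac{(X_i - \mu)^2}{\sigma^2}
            = \frac{1}{\sigma^2}\cdot\frac{SS}{N}
            = \frac{\sigma^2}{\sigma^2} = 1
\end{align*}
```

Because the mean of the z scores is zero, their variance reduces to $\sum z_i^2 / N$, and the standard deviation is the square root of that, which is 1.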

20
Normal distributions, standard normal
distributions and z scores revisited
  • What means are zero?
  • Only the mean of a standardized distribution is
    guaranteed to be zero
  • Normal distributions can have non-zero means
  • What is the purpose of a z score?
  • To permit the score to be inserted into a
    standard normal distribution for comparison
  • Why do we want to insert our score into a
    standard normal distribution?
  • If we don't use the standard normal distribution,
    we need lots of normal distribution tables (an
    infinite number, actually)
  • What would happen if we had a distribution of z
    scores?
  • If the original scores were normally distributed,
    it would be the standard normal distribution

21
Limitations of z scores
  • By translating the set of scores by the mean
  • And scaling by the standard deviation
  • We can compare two different distributions
  • But not if they are skewed differently

22
The normal distribution
  • Comparing scores in distributions from the same
    family overcomes this problem
  • The normal distribution is such a family of
    distributions
  • Why is it a family of distributions?

23
The standard normal distribution
  • Generic parameters
  • Mean set to zero
  • Standard deviation set to 1

24
Probability that a score falls between any two
values
  • Recall that a probability distribution represents
    the relative probability of various scores
  • And that the total area under a probability
    distribution is 1
  • Hence the probability that a score falls in any
    interval is the area under the corresponding
    part of the curve
  • The table of the standard normal distribution
    contains the cumulative area under the curve,
    from the mean outward
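
In place of a printed table, the same areas can be sketched with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). The helper names `prob_between` and `area_mean_to_z` are hypothetical; the second mimics a table that reports area from the mean outward.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def prob_between(a, b):
    """Area under the standard normal curve between z = a and z = b."""
    return std_normal.cdf(b) - std_normal.cdf(a)

def area_mean_to_z(z):
    """Area from the mean (z = 0) out to z, as in a mean-outward table."""
    return std_normal.cdf(abs(z)) - 0.5

print(round(prob_between(-1, 1), 4))   # 0.6827
print(round(area_mean_to_z(1.96), 4))  # 0.475
```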

25
Distribution of sample means
  • What if we sample means rather than individuals?
  • Take a sample
  • Find the mean
  • Use that as a score
  • Form a distribution of such scores
  • This is called a Distribution of sample means

26
Properties of the distribution of sample means
  • If the underlying distribution is normal, the
    sampling distribution will be normal
  • The mean of the sampling distribution will tend
    to be the same as the mean of the population (as
    the number of means approaches infinity)
  • Groups vary less than individuals
  • Therefore the standard deviation of the sampling
    distribution is less than the standard deviation
    of the population
  • This value is called the standard error
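
The procedure can be simulated. The sketch below assumes a normal population with a hypothetical mean of 100 and standard deviation of 15, draws many samples of size 25, and treats each sample mean as a score. All constants are illustrative.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 100, 15   # hypothetical population parameters
SAMPLE_SIZE = 25
NUM_SAMPLES = 2000

# Take a sample, find its mean, use that as a score; repeat many times.
sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.mean(sample))

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.pstdev(sample_means)  # the standard error, estimated

print(mean_of_means)  # close to the population mean of 100
print(sd_of_means)    # close to 15 / sqrt(25) = 3, well below 15
```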

27
Why do we care about the distribution of sample
means?
  • Because when we take the mean of a sample, we
    want to know how good an estimate it is of the
    population mean
  • If the means vary a lot from sample to sample,
    then the estimate is not very good
  • If the means vary little, then the estimate is
    good
  • We may never actually do repeated sampling; we
    just wanted to arrive at the standard error
    equation, $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$,
    which tells us how the sample size improves our
    estimate of the mean
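
Assuming the equation referred to here is the usual standard error of the mean, the population standard deviation divided by the square root of the sample size, its effect can be sketched as follows. The value of 15 for the population standard deviation is hypothetical.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means."""
    return sigma / math.sqrt(n)

sigma = 15  # hypothetical population standard deviation
for n in (4, 16, 64, 256):
    print(n, standard_error(sigma, n))
# 4 7.5
# 16 3.75
# 64 1.875
# 256 0.9375
```

Each fourfold increase in sample size only halves the standard error, so precision improves with the square root of the sample size rather than with the sample size itself.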

28
Exercises
  • Page 103
  • 1,7,8,10,12
