Using SPSS to calculate summary statistics

Provided by: MarkB9

Transcript and Presenter's Notes

1
Using SPSS to calculate summary statistics
2
SPSS - summary statistics

In the data view
Enter the scores from problem 4, page 66
Analyze > Descriptive Statistics > Frequencies
Select VAR00001 for analysis
Click Statistics
Choose the measures of central tendency you want
Continue
OK
Use SPSS to check your work on problem 6 (except
for SS and mean deviation).
Do you get the same answers?

3
Chapter 3B
More on summary statistics
4
Other ways to look at the mean and the variance

Weighted average (mean)
Efficient formula for the variance

5
The weighted mean

Suppose that one had repeating scores in a sample
to be averaged
Example - a student's grades in a 4.0 grading
system
2,3,3,3,3,3,4,4
One could find the mean directly, but this would
be somewhat redundant
Instead, group like scores together and weight
each by the frequency of each score
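
As a minimal sketch of the grouping idea, the snippet below computes the mean of the grades listed above both directly and as a frequency-weighted mean. The variable names are illustrative and not part of the original slides.

```python
from collections import Counter

grades = [2, 3, 3, 3, 3, 3, 4, 4]  # the grades from the slide

# Direct mean: add every score, then divide by the number of scores.
direct_mean = sum(grades) / len(grades)

# Weighted mean: group like scores and weight each by its frequency.
freq = Counter(grades)  # 2 occurs once, 3 occurs five times, 4 occurs twice
weighted_mean = sum(score * f for score, f in freq.items()) / sum(freq.values())

print(direct_mean, weighted_mean)  # 3.125 3.125
```

Both routes give the same answer; the weighted form just does less arithmetic when scores repeat.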

6
The weighted mean

An even better motivation for using the weighted
mean is that one might want to combine means from
different studies
We can't just average the means
The means from larger samples must have a greater
weight
How much weight?
Answer: the sample size
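Suppose we have 3 studies with means $\bar{X}_1, \bar{X}_2, \bar{X}_3$
And sample sizes $N_1, N_2, N_3$
The resulting combined mean is then $\bar{X}_{combined} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2 + N_3\bar{X}_3}{N_1 + N_2 + N_3}$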
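
The numbers used on the original slide did not survive in this transcript, so the sketch below uses hypothetical means and sample sizes purely for illustration. It weights each study mean by its sample size and compares the result with the naive average of the means.

```python
# Hypothetical study results (illustrative values, NOT from the slides).
means = [10.0, 12.0, 15.0]
sizes = [20, 30, 50]

# Naive approach: treat every study as equally important.
naive_average = sum(means) / len(means)

# Weighted approach: each mean counts in proportion to its sample size.
combined_mean = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

print(naive_average, combined_mean)
```

With these made-up values the largest study has the highest mean, so the combined mean is pulled above the naive average.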

7
Computationally efficient formula for the
variance

Not absolutely necessary
But, saves time in hand calculations
Is used by the author
Most of the work is in computing SS.
We will focus there.

8
Deriving the shortcut SS

Sum of squares is the numerator of the variance.
We will often have reason to handle it
separately.

9
Deriving the shortcut SS

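Start with
$SS = \sum (X - \bar{X})^2$
$= \sum (X^2 - 2X\bar{X} + \bar{X}^2)$
$= \sum X^2 - 2\bar{X}\sum X + N\bar{X}^2$
Substitute $\bar{X} = \sum X / N$
$= \sum X^2 - 2\frac{(\sum X)^2}{N} + \frac{(\sum X)^2}{N}$
End with
$SS = \sum X^2 - \frac{(\sum X)^2}{N}$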

10
Using the shortcut SS to find the variance and
standard deviation

Recall that the variance is SS/N for the
population
And the unbiased variance is SS/(N-1) for the
sample
And the standard deviation is just the square
root of the variance
When calculating for the sample, we will almost
always be using the unbiased expression
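
The sketch below checks the shortcut against the definition and then forms the variance and standard deviation. It assumes the shortcut takes the usual computational form SS = ΣX² − (ΣX)²/N, since the slide's own equation is missing from this transcript. The data are the grades from the weighted-mean slide, reused only for illustration.

```python
import math

scores = [2, 3, 3, 3, 3, 3, 4, 4]  # reused example data
n = len(scores)
mean = sum(scores) / n

# Definitional SS: sum of squared deviations from the mean.
ss_definitional = sum((x - mean) ** 2 for x in scores)

# Shortcut SS (assumed form): sum of squares minus (sum)^2 / N.
ss_shortcut = sum(x * x for x in scores) - sum(scores) ** 2 / n

population_variance = ss_shortcut / n        # SS / N
sample_variance = ss_shortcut / (n - 1)      # unbiased: SS / (N - 1)
sample_sd = math.sqrt(sample_variance)       # square root of the variance

print(ss_definitional, ss_shortcut)  # 2.875 2.875
```

The shortcut needs only ΣX and ΣX², which is why it saves time in hand calculations.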

11
Properties of the mean: Develop your intuition
about the mean

If a constant C is added to every score in a
sample, the new mean is C + the original mean
If every score is multiplied by a constant C,
then the new mean will be C × the original mean
This can be deduced from the properties of
summation
The sum of the deviations from the mean will
always be zero
The sum of the squared deviations from the mean
(SS) will be less than the sum of the squared
deviations around any other point in the
distribution
This is called the least-squares property
Important when fitting a straight line to a cloud
of points
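
These properties can be checked numerically. The following is a small sketch on example data; the helper names and the constant C = 10 are arbitrary choices for illustration, not from the slides.

```python
scores = [2, 3, 3, 3, 3, 3, 4, 4]  # example data
C = 10                              # arbitrary constant

def mean(xs):
    return sum(xs) / len(xs)

def ss_around(xs, point):
    """Sum of squared deviations of xs around an arbitrary point."""
    return sum((x - point) ** 2 for x in xs)

m = mean(scores)
mean_after_adding = mean([x + C for x in scores])       # equals m + C
mean_after_scaling = mean([x * C for x in scores])      # equals m * C
sum_of_deviations = sum(x - m for x in scores)          # equals 0

# Least-squares property: SS around the mean is smaller than around other points.
ss_at_mean = ss_around(scores, m)
ss_elsewhere = [ss_around(scores, p) for p in (2.5, 3.0, 3.5)]

print(m, mean_after_adding, mean_after_scaling, sum_of_deviations)
print(ss_at_mean, ss_elsewhere)
```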

12
Properties of the standard deviation: Develop your
intuition about standard deviation

If a constant C is added to every score in a
sample, the new standard deviation is the same as
the original standard deviation
If every score is multiplied by a constant C,
then the new standard deviation will be C × the
original standard deviation (use the absolute
value of C if C is negative)
This can be deduced from the properties of
summation
The standard deviation computed around the mean
will be smaller than the same quantity computed
around any other point in the distribution
Follows from the related property of the mean
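
A quick numerical check of the first two properties, as a sketch on example data using Python's standard `statistics` module. The constant C = 10 is an arbitrary choice for illustration.

```python
import math
import statistics

scores = [2, 3, 3, 3, 3, 3, 4, 4]  # example data
C = 10                              # arbitrary constant

sd = statistics.pstdev(scores)  # population standard deviation

sd_after_adding = statistics.pstdev([x + C for x in scores])     # unchanged
sd_after_scaling = statistics.pstdev([x * C for x in scores])    # C times larger
sd_after_negating = statistics.pstdev([x * -C for x in scores])  # |C| times larger

print(sd, sd_after_adding, sd_after_scaling, sd_after_negating)
```

Adding a constant slides the whole distribution without changing its spread, while multiplying stretches the spread by the size of the constant.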

13
Exercises

Page 76: 1-4, 5a, 5b, 6-10

14
Chapter 4

Standardized scores and the normal distribution

15
Evaluating a single score within a distribution

The numerical distance between a score and the
mean may not be very meaningful
However, if that distance can be expressed in
units of standard deviation, then we have a much
better understanding of the relationship between
the score and the rest of the data

16
Z scores

Such a scaled distance is called the z score
We can replace our X scores with the z scores
Computed like so: $z = \frac{X - \mu}{\sigma}$
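
As a sketch, assuming the usual definition of a z score (the distance from the mean divided by the standard deviation), a single score can be converted like this. The function name `z_score` and the data are illustrative.

```python
import statistics

def z_score(x, mu, sigma):
    """Distance of x from the mean, in units of standard deviation."""
    return (x - mu) / sigma

scores = [2, 3, 3, 3, 3, 3, 4, 4]   # example data
mu = statistics.mean(scores)        # 3.125
sigma = statistics.pstdev(scores)   # population standard deviation

z_of_4 = z_score(4, mu, sigma)      # positive: 4 lies above the mean
z_of_2 = z_score(2, mu, sigma)      # negative: 2 lies below the mean

print(z_of_4, z_of_2)
```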

17
Properties of the z scores

Mean of the z scores is 0.
From the properties of the mean
Subtracting a constant from each score shifts the
mean by the constant
Multiplying each score by a constant multiplies
the mean by that constant.

18
Because the mean of the z scores is 0, this gives
us a simplified formula for their standard
deviation: $\sigma_z = \sqrt{\sum z^2 / N}$
19
The standard deviation of a population of z
scores is always 1!

From properties of standard deviation
Adding a constant to each score does not change
the standard deviation
Multiplying each score by a constant multiplies
the standard deviation by that constant.
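
Both claims about z scores (mean 0, standard deviation 1) can be verified on any data set. The sketch below uses example data and assumes the z scores are computed with the population standard deviation.

```python
import math
import statistics

scores = [2, 3, 3, 3, 3, 3, 4, 4]  # example data
mu = statistics.mean(scores)
sigma = statistics.pstdev(scores)

# Subtract the mean (shifts the mean to 0), then divide by sigma (scales the SD to 1).
z_scores = [(x - mu) / sigma for x in scores]

mean_z = statistics.fmean(z_scores)
sd_z = statistics.pstdev(z_scores)

# With a mean of 0, the standard deviation reduces to sqrt(sum(z^2) / N).
sd_z_simplified = math.sqrt(sum(z * z for z in z_scores) / len(z_scores))

print(mean_z, sd_z, sd_z_simplified)
```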

20
Normal distributions, standard normal
distributions and z scores revisited

What means are zero?
Only the mean of a standardized distribution is
guaranteed to be zero
Normal distributions can have non-zero means
What is the purpose of a z score?
To permit the score to be inserted into a
standard normal distribution for comparison
Why do we want to insert our score into a
standard normal distribution?
If we don't use the standard normal distribution,
we need lots of normal distribution tables (an
infinite number, actually)
What would happen if we had a distribution of z
scores?
It would be the standard normal distribution
(provided the original scores were normally
distributed)

21
Limitations of z scores

By translating the set of scores by the mean
And scaling by the standard deviation
We can compare two different distributions
But not if they are skewed differently

22
The normal distribution

Comparing scores in distributions from the same
family overcomes this problem
The normal distribution is such a family of
distributions
Why is it a family of distributions?

23
The standard normal distribution

Generic parameters
Mean set to zero
Standard deviation set to 1

24
Probability that a score falls between any two
values

Recall that a probability distribution represents
the relative probability of various scores
And that the total area under a probability
distribution is 1
Hence the probability that a score falls in any
interval is the area under the corresponding
part of the curve
The table of the standard normal distribution
contains the cumulative area under the curve,
from the mean outward
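
In place of a printed table, the same areas can be looked up in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+); the helper names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # the standard normal distribution

def area_from_mean(z):
    """Area under the curve between the mean (z = 0) and z, as in a z table."""
    return abs(std_normal.cdf(z) - 0.5)

def prob_between(z_low, z_high):
    """Probability that a standard normal score falls between two z values."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(area_from_mean(1.0), 4))      # about 0.3413
print(round(prob_between(-1.0, 1.0), 4))  # about 0.6827
```

A raw score must first be converted to a z score before using these helpers, just as with the printed table.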

25
Distribution of sample means

What if we sample means rather than individuals?
Take a sample
Find the mean
Use that as a score
Form a distribution of such scores
This is called a distribution of sample means

26
Properties of the distribution of sample means

If the underlying distribution is normal, the
sampling distribution will be normal
The mean of the sampling distribution will tend
to be the same as the mean of the population (as
the number of means approaches infinity)
Groups vary less than individuals
Therefore the standard deviation of the sampling
distribution is less than the standard deviation
of the population
This value is called the standard error:
$\sigma_{\bar{X}} = \sigma / \sqrt{N}$, where N is the
size of each sample
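
A small simulation can illustrate these properties. The population parameters, sample size, and number of samples below are hypothetical values chosen for the demonstration, and the comparison assumes the standard formula for the standard error, σ/√N.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 100.0, 15.0   # hypothetical normal population
N = 25                    # size of each sample
NUM_SAMPLES = 2000        # how many sample means to collect

# Take a sample, find its mean, and use that mean as a score.
sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)   # close to MU
sd_of_means = statistics.pstdev(sample_means)    # close to SIGMA / sqrt(N)
expected_se = SIGMA / math.sqrt(N)               # 3.0

print(mean_of_means, sd_of_means, expected_se)
```

The spread of the sample means comes out far smaller than the spread of individual scores, which is the sense in which groups vary less than individuals.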

27
Why do we care about the distribution of sample
means?

Because when we take the mean of a sample, we
want to know how good an estimate it is of the
population mean
If the means vary a lot from sample to sample,
then the estimate is not very good
If the means vary little, then the estimate is
good
We may never actually do repeated sampling; we
just wanted to come up with this equation(!),
which tells us how the sample size improves our
estimation of the mean
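
To see how sample size improves the estimate, the sketch below evaluates the standard error for several sample sizes. It assumes the equation referred to is the standard one, σ/√N, and uses a hypothetical population standard deviation of 15.

```python
import math

SIGMA = 15.0  # hypothetical population standard deviation

def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means for samples of size n."""
    return sigma / math.sqrt(n)

for n in (4, 16, 64):
    print(n, standard_error(SIGMA, n))  # 7.5, 3.75, 1.875
```

Quadrupling the sample size only halves the standard error, so precision gets steadily more expensive.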

28
Exercises