Descriptive Statistics - PowerPoint PPT Presentation

Provided by: ksuwebKen

Transcript and Presenter's Notes

Title: Descriptive Statistics

1
Descriptive Statistics

Measures of Central Tendency

2
Why? What? And How

Remember, data reduction is key
Are the scores generally high or generally low?
Where the center of the distribution tends to be
located
Three measures of central tendency
Mode
Median
Mean
Which one you report is related to the scale of
measurement and the shape of the distribution

3
Mode

The most frequently occurring score
Look at the simple frequency of each score
Unimodal or bimodal
Report the mode when using a nominal scale: it is
the most frequently occurring category
If you have a rectangular distribution, do not
report the mode
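
As a minimal sketch of the points above, the mode can be found by counting the simple frequency of each score. The helper name `modes` and the three score lists are hypothetical, made up only for illustration.

```python
from collections import Counter

def modes(scores):
    """Return every score tied for the highest frequency, in sorted order."""
    counts = Counter(scores)          # simple frequency of each score
    top = max(counts.values())        # highest frequency observed
    return sorted(s for s, c in counts.items() if c == top)

unimodal = [2, 3, 3, 4, 3, 5]         # hypothetical scores
bimodal = [1, 2, 2, 3, 4, 4]
rectangular = [1, 2, 3, 4]            # every score occurs equally often

print(modes(unimodal))      # [3]
print(modes(bimodal))       # [2, 4]
print(modes(rectangular))   # [1, 2, 3, 4]: every score ties, so no mode is worth reporting
```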

4
Median

Score at the 50th percentile (Mdn)
In a normal distribution the Mdn is the same as
the mode (and the mean)
Arrange the scores from lowest to highest. With an
odd number of scores, the Mdn is the one in the
middle; with an even number of scores, average the
two scores in the middle
Used with an ordinal scale and when the
distribution is skewed
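
The odd/even rule above can be sketched in a few lines. The function name `median` and the example scores are assumptions chosen for illustration, not part of the slides.

```python
def median(scores):
    """Middle score if N is odd, average of the two middle scores if N is even."""
    ordered = sorted(scores)          # arrange from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))       # 5
print(median([7, 1, 5, 9]))    # 6.0, the average of 5 and 7
```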

5
Mean

Score at the exact mathematical center of
distribution (average)
M = ΣX / N
Used with interval and ratio scales, and when the
distribution is symmetrical and unimodal
Not accurate when the distribution is skewed
because it is pulled towards the tail
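
A small sketch of why the mean misleads for a skewed distribution. The two score lists are hypothetical; the second differs from the first only by one extreme score in the upper tail.

```python
from statistics import median

def mean(scores):
    """M = sum of X divided by N."""
    return sum(scores) / len(scores)

symmetric = [3, 4, 5, 5, 5, 6, 7]     # hypothetical, symmetrical and unimodal
skewed = [3, 4, 5, 5, 5, 6, 42]       # same scores, but one extreme high score

print(mean(symmetric), median(symmetric))   # 5.0 5
print(mean(skewed), median(skewed))         # 10.0 5
```

The median stays at 5 while the mean is pulled up to 10, toward the tail.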

6
Deviations around the Mean

The score minus the mean: X - M
Include the plus or minus sign
The sum of the deviations around the mean always
equals zero: Σ(X - M) = 0
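
A quick check of this property, using hypothetical scores picked so that the arithmetic is exact:

```python
scores = [2, 4, 6, 8, 10]                      # hypothetical scores
mean = sum(scores) / len(scores)               # 6.0

# Each deviation keeps its plus or minus sign.
deviations = [x - mean for x in scores]

print(deviations)        # [-4.0, -2.0, 0.0, 2.0, 4.0]
print(sum(deviations))   # 0.0
```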

7
Uses of the Mean

Describes scores
The deviation from the mean gives us the error of
our estimate of a score, with total error equal to
zero
Predict scores
Describe a score's location
Describe the population mean (μ), which is a
parameter
Typically we estimate μ

8
Summarizing Results

Used in all research methods including
observational, survey, correlational, and
experimental
Compute the mean of the dependent variable for
each of the conditions or levels of the
independent variable
The mean dependent score changes as a function of
changes in the IV
Graphing the results using line or bar graphs

9
Measures of Variability

Extent to which the scores differ from each other
or how spread out the scores are
Tells us how accurately the measure of central
tendency describes the distribution
Shape of the distribution

10
Why do we care about variability?
Where would you rather vacation: Gulfside
Bungalows, where the mean temperature is 70
degrees, or Kalahari Condos, where the mean
temperature is 70 degrees?
Gulfside temperature range: day 72, night 68
Kalahari temperature range: day 110, night 30
Also variability in terms of the range of
temperature at each of these places over the years
that temperature has been documented
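
The vacation example can be checked directly. The sketch below uses only the day and night temperatures given on the slide; the variable names are illustrative.

```python
gulfside = [72, 68]     # day, night
kalahari = [110, 30]    # day, night

def mean(temps):
    return sum(temps) / len(temps)

def value_range(temps):
    """Maximum difference between the highest and lowest value."""
    return max(temps) - min(temps)

print(mean(gulfside), value_range(gulfside))   # 70.0 4
print(mean(kalahari), value_range(kalahari))   # 70.0 80
```

Both places have the same mean, so only the measure of variability tells them apart.
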
11
Range

Can report the lowest and highest value
Or report the maximum difference between the
lowest and highest
Semi-interquartile range (used with the median):
one half the distance between the scores at the
25th and 75th percentiles
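
A minimal sketch of both measures on hypothetical scores. Textbooks and software define percentiles in slightly different ways; this example assumes the "inclusive" method of Python's `statistics.quantiles`, which may not match the procedure in the course text.

```python
from statistics import quantiles

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]    # hypothetical scores

# Range: maximum difference between the lowest and highest score.
score_range = max(scores) - min(scores)

# Scores at the 25th, 50th, and 75th percentiles (inclusive method assumed).
q1, mdn, q3 = quantiles(scores, n=4, method="inclusive")

# Semi-interquartile range: half the distance between Q1 and Q3.
semi_iqr = (q3 - q1) / 2

print(score_range)   # 8
print(q1, q3)        # 3.0 7.0
print(semi_iqr)      # 2.0
```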

12
Variance and Standard Deviation

Definitional and computational formulas (remember
order of operations)
Again, most psychological research uses interval
and ratio scales of measurement and assumes a
normal distribution
Goal is to assess the average or typical amount
the scores differ from the mean
Biased estimates of the population variance

13
Sample Variance

Uses the deviation from the mean
Remember, the sum of the deviations always equals
zero, so you have to square each of the
deviations
S²X = sum of squared deviations divided by the
number of scores, Σ(X - M)² / N (pp. 107 and 108)
Provides information about the relative
variability

14
Some Limits

It isn't the average deviation
Interpretation doesn't make sense because
the number is too large
and it is a squared value

15
So, Standard Deviation

Take the square root of the variance
pp. 109 and 110
Symbol: SX
Uses the same units of measurement as the raw
scores
How much scores deviate below and above the mean
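
The sample variance and standard deviation described on the last three slides can be sketched as follows. The scores are hypothetical, and the divisor is N (the descriptive sample formulas), not N - 1.

```python
import math

scores = [2, 4, 6, 8, 10]                          # hypothetical scores
n = len(scores)
mean = sum(scores) / n                             # 6.0

# Square each deviation so the positive and negative ones do not cancel.
squared_deviations = [(x - mean) ** 2 for x in scores]

sample_variance = sum(squared_deviations) / n      # in squared units
sample_sd = math.sqrt(sample_variance)             # back in raw-score units

print(sample_variance)        # 8.0
print(round(sample_sd, 3))    # 2.828
```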

16
The standard deviation
What is a standard deviation (in English)?
The mean of the deviations from the mean (sort of)
What is each symbol?
σ (lowercase sigma) is the population standard
deviation
S is the sample standard deviation
ŝ (s-hat) is the sample estimate of σ
17
The deviation (definitional) formula for the
population standard deviation

The larger the standard deviation the more
variability there is in the scores
The standard deviation is somewhat less
sensitive to extreme outliers than the range (as
N increases)

18
The deviation (definitional) formula for the
sample standard deviation
What's the difference between this formula and
the population standard deviation?
In the first case, all the Xs represent the
entire population. In the second case, the Xs
represent a sample.
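
The formula images for the two definitional formulas did not survive in this transcript. Assuming the deck follows the common textbook convention in which both formulas divide by N and differ only in whether the scores are a population (with mean μ) or a sample (with mean M), they would read:

```latex
% Population standard deviation: all X are the entire population
\sigma_X = \sqrt{\frac{\sum (X - \mu)^2}{N}}

% Sample standard deviation: the X are a sample
S_X = \sqrt{\frac{\sum (X - M)^2}{N}}
```
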
19
Standard Deviation Example
26.8
106.8

20
Calculating S using the raw-score formula
To calculate ΣX² you square all the scores first
and then sum them
To calculate (ΣX)² you sum all the scores first
and then square them
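
A sketch of the raw-score route on hypothetical scores, assuming the usual raw-score formula S = sqrt((ΣX² - (ΣX)²/N) / N). It is checked against the definitional formula to show that the two agree.

```python
import math

scores = [2, 4, 6, 8, 10]                     # hypothetical scores
n = len(scores)

sum_x = sum(scores)                           # sum first ...
sum_x_squared = sum(x ** 2 for x in scores)   # square each score first, then sum
squared_sum_x = sum_x ** 2                    # ... then square the sum

# Raw-score formula for the sample standard deviation (divisor N).
s_raw = math.sqrt((sum_x_squared - squared_sum_x / n) / n)

# Definitional formula, for comparison.
mean = sum_x / n
s_def = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)

print(sum_x_squared, squared_sum_x)   # 220 900
print(math.isclose(s_raw, s_def))     # True
```
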
21
The raw-score formula example
ΣX = 134
ΣX² = 3698
22
Estimating the population standard deviation
from a sample
S, the sample standard deviation, is usually a
little smaller than the population standard
deviation. Why?
The sample mean minimizes the sum of squared
deviations (SS). Therefore, if the sample mean
differs at all from the population mean, then the
SS from the sample will be an underestimate of the
SS from the population
Therefore, statisticians alter the formula of the
sample standard deviation by subtracting 1 from N
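
The underestimate can be seen exactly on a toy case. The sketch below assumes a hypothetical population of three scores and lists every possible sample of size 2 drawn with replacement. It works with variances rather than standard deviations, because the N - 1 correction removes the bias exactly only for the variance.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]                    # hypothetical tiny population

def variance(scores, divisor):
    """Sum of squared deviations from the scores' own mean, over `divisor`."""
    mean = Fraction(sum(scores), len(scores))
    return sum((x - mean) ** 2 for x in scores) / divisor

population_variance = variance(population, len(population))

samples = list(product(population, repeat=2))     # all 9 samples of size 2
avg_dividing_by_n = sum(variance(s, 2) for s in samples) / len(samples)
avg_dividing_by_n_minus_1 = sum(variance(s, 1) for s in samples) / len(samples)

print(population_variance)          # 2/3
print(avg_dividing_by_n)            # 1/3, too small on average
print(avg_dividing_by_n_minus_1)    # 2/3, matches the population
```
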
23
Population Variance and Standard Deviation

When we have data from the entire population we
use μ to compute σX using the same formulas (p.
115)
We usually need to estimate
Variance and standard deviations of the sample
are biased estimates of the population
Limited in terms of how freely the scores can vary
Not all of the deviations in the sample are free
to be random

24
Estimates of Population Variability

pp. 117, 118, and 119
Symbols: s²X and sX (s-hat) are estimates
Correction factor: N - 1
Not all of the deviations in the sample are free
to be random
Degrees of freedom: df = N - 1
With M = 6 and four of the five scores being 1, 5,
7, and 9, the only possibility is for the fifth
score to be 8
More accurate estimate of population variability
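
The degrees-of-freedom example above can be worked through in code. This sketch assumes N = 5 (four known scores plus the one that is forced), which is what the slide's numbers imply.

```python
known_scores = [1, 5, 7, 9]
mean = 6
n = 5                                        # assumed: four known scores plus one more

# Once the mean is fixed, the last score is not free to vary.
last_score = n * mean - sum(known_scores)
scores = known_scores + [last_score]

ss = sum((x - mean) ** 2 for x in scores)    # sum of squared deviations
biased_variance = ss / n                     # divides by N
estimated_variance = ss / (n - 1)            # divides by df = N - 1

print(last_score)            # 8
print(biased_variance)       # 8.0
print(estimated_variance)    # 10.0
```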

25
Formulas for s-hat (estimate)
Definitional formula
Raw-score formula
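
The two formula images are missing from this transcript. Assuming the standard N - 1 versions that the correction factor above implies, and writing s_X for s-hat, they would be:

```latex
% Definitional formula for the estimated population standard deviation
s_X = \sqrt{\frac{\sum (X - M)^2}{N - 1}}

% Raw-score formula for the same estimate
s_X = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N - 1}}
```
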
26
The Estimate of the Variance
Remember what the variance is...
The standard deviation squared, or the number
that you took the square root of to get the
standard deviation
The variance is not a very useful descriptive
statistic, but it is a very important value you
will use in other techniques (e.g., the analysis
of variance or ANOVA)
27
Sum up

Assuming a normal distribution
Sample mean is a good estimate of population mean
The estimate of the population variance and
standard deviation tells us how spread out the
scores are
About 68% of the scores are within -1 sX and +1 sX
of the mean
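
The 68% figure can be checked against the standard normal curve. This is a sketch using `NormalDist` from Python's standard library (Python 3.8 or later), and it holds only under the normal-distribution assumption stated above.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores between 1 standard deviation below and above the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(round(within_one_sd, 4))   # 0.6827
```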

28
Application to Normal Distribution

Knowing the standard deviation you can describe
your sample more accurately
Look at the inflection points of the distribution

29
(No Transcript)
30
(No Transcript)
31
Transformations

Adding or subtracting just shifts the
distribution, without changing the variation
(variance)
Multiplying or dividing by a constant changes the
variability: the standard deviation is multiplied
or divided by the absolute value of that constant
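
Both kinds of transformation can be checked on hypothetical scores. The constants 10 and 3 are arbitrary choices for illustration.

```python
import math

def sd(scores):
    """Standard deviation with divisor N."""
    mean = sum(scores) / len(scores)
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / len(scores))

scores = [2, 4, 6, 8, 10]                  # hypothetical scores
shifted = [x + 10 for x in scores]         # add a constant
scaled = [x * 3 for x in scores]           # multiply by a constant

print(math.isclose(sd(shifted), sd(scores)))       # True: spread unchanged
print(math.isclose(sd(scaled), 3 * sd(scores)))    # True: SD is 3 times larger
```

The variance of the scaled scores is 9 times larger, because the variance is the squared standard deviation.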

32
Variance is Error in Predictions

The larger the variability, the larger the
differences between the mean and the scores, so
the larger the error when we use the mean to
predict the scores
Error, or error variance: the average error
between the predicted mean score and the actual
raw scores
Same for the estimate of the population
variance
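
A sketch of the idea that the variance is the average squared error made when every score is predicted by the mean. The scores are hypothetical, and the helper name `average_squared_error` is made up for this example.

```python
def average_squared_error(scores, prediction):
    """Average squared difference between one predicted value and each score."""
    return sum((x - prediction) ** 2 for x in scores) / len(scores)

scores = [2, 4, 6, 8, 10]                  # hypothetical scores
mean = sum(scores) / len(scores)           # 6.0

error_using_mean = average_squared_error(scores, mean)
error_using_other = average_squared_error(scores, 7)   # any other guess

print(error_using_mean)     # 8.0, which is the variance of these scores
print(error_using_other)    # 9.0, larger than the error when predicting the mean
```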

33
Summarizing Research Using Variability

Remember, the standard deviation is most often
the measure of variability reported
The more consistent the scores are (i.e., the
smaller the variance), the stronger the
relationship

34
Proportion of Variance Accounted For

Objective approach: compute the proportion of
variance accounted for
Can compute the overall mean and standard
deviation, not taking into consideration the
relationship with the levels of the IV
It is the largest error we would accept
When we look at the relationship, we compute the
variance for each condition and average them

35
Computation

Subtract the average error within the conditions
from the error of the total sample
Divide that difference by the error from the
total sample
Gives proportion of error accounted for by the
levels of the IV
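
The computation can be sketched on made-up data. The two conditions and their scores are hypothetical, and the simple average of the condition variances assumes equal numbers of scores in each condition.

```python
def variance(scores):
    """Variance with divisor N, used here as the measure of error."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)

# Hypothetical scores on the dependent variable at two levels of the IV.
conditions = {"low": [2, 3, 4], "high": [6, 7, 8]}

all_scores = [x for scores in conditions.values() for x in scores]
total_error = variance(all_scores)          # error when the IV is ignored
average_error = sum(variance(s) for s in conditions.values()) / len(conditions)

proportion_accounted_for = (total_error - average_error) / total_error

print(round(proportion_accounted_for, 3))   # 0.857
```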

36
Thus...

Proportional improvement in predictions by using
a relationship
The stronger and more consistent the
relationship, the greater proportion of variance
we can account for
