Title: Becoming Acquainted With Statistical Concepts
Chapter 6: Becoming Acquainted With Statistical Concepts
Why We Need Statistics
- Statistics is an objective way of interpreting a collection of observations.
- Descriptions, e.g., the mean score: the average score of a group of observed scores
- Mean score = sum of scores / number of scores, i.e., M = ΣX / N
- Associations, e.g., correlation (the Pearson product-moment coefficient of correlation, also called the Pearson r, interclass correlation, or simple correlation)
Example of Correlation
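A minimal sketch of the Pearson r computation, using hypothetical paired scores (the variable names and data are illustrative, not from the text):

```python
# Pearson product-moment correlation (Pearson r) from first principles.
# The strength/sprint data below are hypothetical example values.

def pearson_r(x, y):
    """r = sum((X - Mx)(Y - My)) / sqrt(sum((X - Mx)^2) * sum((Y - My)^2))"""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sum((xi - mx) ** 2 for xi in x) ** 0.5
    sy = sum((yi - my) ** 2 for yi in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical example: as strength scores rise, sprint times fall,
# so r comes out strongly negative.
strength = [10, 12, 14, 16, 18]
sprint = [5.9, 5.5, 5.2, 4.9, 4.6]
print(round(pearson_r(strength, sprint), 3))
```

An r near +1 or -1 indicates a strong linear association; a value near 0 indicates little linear association.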
Why We Need Statistics
- Differences among groups, e.g., dependent and independent t tests, ANOVA
- Differences between groups are either significant or non-significant
- Describing the data and inferring from the data are not the same. Statistics describe the sample and the impact of the independent variable; we then infer the results from the sample to the population of interest.
Ways to Select a Sample
- Random sampling: using tables of random numbers; random sampling allows for inference to the population
- Stratified random sampling: the population is divided by some characteristic before random selection occurs
- Allows sampling based upon the presence of a particular characteristic that naturally exists within the sampled population
- Results will then reflect the population sampled
Ways to Select a Sample
- Systematic sampling: logical assignment of sampling that can be used for very large populations (e.g., every 100th person listed in a phone directory)
- Random assignment: how groups are formed within the sample. This allows the researcher to normalize inter-subject variability between the experimental groups. It is the process of randomizing subjects within the study after the experimental design is created.
Ways to Select a Sample
- Justifying post hoc (after-the-fact) explanations: does your sample reflect the inferred population if your sample is not randomized?
- How good does the sample have to be? Can you truly have a random sample?
- Findings should be plausible in other participants, treatments, and situations, depending on their similarity to the study characteristics. Good enough for our purposes!
Measures of Central Tendency and Variability
- Central tendency: using a number to best represent a group of numbers
- Mean (average): M = ΣX / N
- The mean score is a good measure of central tendency when there is not a small number of scores with a large range (e.g., 1, 3, 2, 4, 2, 56)
- Median (midpoint): the middle score when the scores are placed in rank order
- The median is a better choice of central tendency with a larger range and few scores
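The slide's own example (1, 3, 2, 4, 2, 56) can be worked through to show why the median is the better summary when one extreme score is present; a sketch:

```python
# With one extreme score (56), the mean is pulled upward while the
# median stays representative. Scores are the slide's example.

def mean(scores):
    return sum(scores) / len(scores)

def median(scores):
    s = sorted(scores)
    n = len(s)
    mid = n // 2
    # average the two middle scores when n is even
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2

scores = [1, 3, 2, 4, 2, 56]
print(mean(scores))    # 68 / 6, about 11.33, inflated by the outlier
print(median(scores))  # 2.5, a better summary of this group
```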
Measures of Central Tendency and Variability
- Mode (most frequent): the most frequently occurring score. The mode is a good measure with repeated scores, few scores, and a wide range of scores. One may also have bi- or multi-modal distributions of scores.
Measures of Central Tendency and Variability
- Variability (variance): how the scores vary in the group of scores
- Low variation (n = 5): 3, 4, 3, 5, 4
- High variation (n = 5): 3, 9, 1, 8, 10
Standardizing the Variation (see Table 6.1, p. 104, text)
- First, decide which score best represents the group of scores: calculate the mean score
- Then determine how each individual score (X) deviates from the mean score (M) by subtracting the mean score from the individual score: (X - M)
- The sum of the deviations, Σ(X - M), should equal zero if the mean score was the best representative score in the distribution of scores
- Squaring the deviation scores quantifies the variation of each score from the mean score
Standardizing the Variation (see Table 6.1, p. 104, text)
- Summing the squared deviation scores quantifies all of the variation in the group of scores
- The squared deviation scores are averaged over all scores in the distribution with 1 degree of freedom (N - 1: one score is held constant as the point of comparison while the others are free to vary)
- The average of the squared deviations (the variance) is then un-squared by taking the square root, leaving the standard deviation
- Thus the deviation scores have been standardized by the total variance in the group of scores
- If there is a large variance, there will be a large standard deviation.
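The steps above can be sketched in code, using the "high variation" scores from the earlier slide:

```python
# Standard-deviation steps: deviations from the mean, squared deviations,
# their N-1 average (variance), then the square root.

scores = [3, 9, 1, 8, 10]                 # "high variation" slide example
n = len(scores)
m = sum(scores) / n                       # mean score M = 6.2
deviations = [x - m for x in scores]      # X - M for each score
print(abs(sum(deviations)) < 1e-9)        # deviations sum to (essentially) zero
squared = [d ** 2 for d in deviations]    # (X - M)^2
variance = sum(squared) / (n - 1)         # average with 1 degree of freedom
sd = variance ** 0.5                      # un-square: the standard deviation
print(round(variance, 3), round(sd, 3))
```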
Standard Deviation
- The larger the standard deviation, the greater the variability in the group of scores
- The square of the standard deviation is equal to the total variance of the group of scores.
Range of Scores
- Reporting the range of scores is typical when using the median score instead of the mean score.
- A large range of scores with a large variance would indicate that the median would be a better indicator of central tendency than the mean score.
Confidence Intervals
- A confidence interval provides an expected upper and lower limit for a statistic at a specified probability level
- A confidence interval provides a band within which the estimate of the population mean is likely to fall, instead of a single point
- This presupposes sampling error. (The value I measured is likely to be represented in the population within this range, with this degree of confidence, e.g., 95% or 99% probability.)
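A minimal sketch of a 95% confidence interval for a mean, using the normal-approximation critical value z = 1.96 (for small samples a t critical value would be more appropriate); the exam scores are hypothetical:

```python
# Approximate 95% confidence interval for a population mean, using the
# normal-approximation critical value z = 1.96. Sample data are hypothetical.

def confidence_interval(scores, z=1.96):
    n = len(scores)
    m = sum(scores) / n
    sd = (sum((x - m) ** 2 for x in scores) / (n - 1)) ** 0.5
    se = sd / n ** 0.5               # standard error of the mean
    return (m - z * se, m + z * se)  # band around the sample mean

# Hypothetical sample of exam scores (mean = 80.8)
sample = [82, 75, 90, 68, 88, 79, 85, 73, 91, 77]
low, high = confidence_interval(sample)
print(round(low, 1), round(high, 1))
```

The interval reads: with 95% confidence, the population mean is expected to fall between `low` and `high`.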
Standard Error
- The standard error represents the variability of the sampling distribution
- Thus, the standard error is the average variation one would expect between the collected sample and the population. One could take multiple samples and calculate different mean scores; the standard deviation of all of these mean scores would be the standard error.
Standard Error
- Thus, the standard error is the best representation of the variability of the sample score when inferring the sample score to the population (or the variance in the population)
- The standard error is calculated by dividing the sample standard deviation by the square root of the sample size
- Thus one can reduce the standard error by decreasing sample variability and/or increasing the number of subjects
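The formula just given (standard deviation divided by the square root of the sample size) can be sketched directly, along with the effect of adding subjects:

```python
# Standard error = (sample SD) / sqrt(n). Quadrupling the number of
# subjects while keeping variability the same roughly halves it.

def standard_error(scores):
    n = len(scores)
    m = sum(scores) / n
    sd = (sum((x - m) ** 2 for x in scores) / (n - 1)) ** 0.5
    return sd / n ** 0.5

small = [3, 9, 1, 8, 10]      # n = 5, the slide's high-variation scores
larger = small * 4            # same variability, four times the subjects
print(round(standard_error(small), 3))
print(round(standard_error(larger), 3))   # roughly half the standard error
```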
Frequency Distributions
- A frequency distribution assists the reader in visualizing the number of scores within a range of scores. It is a popular method in descriptive statistics used to develop histograms.
- Example: How many students scored 90-100 on the exam, 80-89, etc.?
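The exam-score example can be sketched as a tally into the ranges named on the slide (the scores themselves are hypothetical):

```python
# Frequency distribution: tally hypothetical exam scores into the ranges
# from the slide (90-100, 80-89, ...), the counts a histogram would show.

from collections import Counter

def bin_label(score):
    if score >= 90:
        return "90-100"
    low = (score // 10) * 10
    return f"{low}-{low + 9}"

exam_scores = [95, 82, 74, 88, 91, 67, 78, 85, 90, 73, 88, 99]
freq = Counter(bin_label(s) for s in exam_scores)
for label in ["90-100", "80-89", "70-79", "60-69"]:
    print(label, freq[label])
```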
Stem-and-Leaf Displays
- A major drawback of a frequency distribution is the loss of information: the reader does not know the individual scores within a group of scores. (See Figure 6.2, p. 106, text.)
- The stem-and-leaf display organizes raw scores so that the intervals are shown on the left and the scores are lined up horizontally to the right, from low to high. This is a helpful method when viewing the individual scores is of importance.
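A sketch of such a display, with the tens digit as the interval (stem) on the left and each individual ones digit (leaf) lined up low to high on the right, so no individual score is lost:

```python
# Stem-and-leaf display: interval (tens digit) on the left, individual
# ones digits to the right, sorted low to high. Scores are hypothetical.

def stem_and_leaf(scores):
    lines = []
    stems = sorted({s // 10 for s in scores})
    for stem in stems:
        leaves = sorted(s % 10 for s in scores if s // 10 == stem)
        lines.append(f"{stem} | " + " ".join(str(leaf) for leaf in leaves))
    return lines

scores = [67, 73, 74, 78, 82, 85, 88, 88, 90, 91, 95, 99]
for line in stem_and_leaf(scores):
    print(line)
```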
Categories of Statistical Tests
- There are two categories: parametric and nonparametric
- Parametric statistical test assumptions:
- Normal distribution: the sample represents the population on the variable of interest
- Equal variances: the sample variance is equal to the variance found in the population on the variable of interest
- Independent observations
Categories of Statistical Tests
- Nonparametric (distribution-free): the previous assumptions of parametric statistics need not be met, e.g., the distribution is not normal
- Normal curve: the mean, median, and mode are all at the same point at the center of the distribution
- Parametric statistics rely on the data following the pattern of a normal curve. Determining whether scores come from different distributions (curves) is the function of parametric statistical tests.
Normal Curve
Normal Distribution (Curve)
- For this reason, parametric statistics can increase the power of the research model (i.e., the chance of rejecting the null hypothesis when the null is actually false)
- The assumptions of parametric statistics can be tested by estimating the degree of skewness or kurtosis.
Skewness
- Skewness describes the direction of the hump or apex of the curve and the nature of the tails of the curve. A rightward skew of the TAIL of the curve is a positive skew, and a leftward skew is negative.
Skewness
(Figure: a negatively skewed curve and a positively skewed curve)
Kurtosis
- Kurtosis is a description of the vertical characteristic of the curve showing the data distribution. The curve may be peaked or flattened and does not represent a normal distribution.
Kurtosis
(Figures: platykurtic and leptokurtic curves; a mesokurtic curve has a kurtosis of 0)
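Skewness and kurtosis can be estimated from sample moments; a sketch using the common moment-based formulas (the data are hypothetical), where a normal, mesokurtic curve scores near 0 on both:

```python
# Moment-based estimates: positive skewness means a rightward tail;
# positive excess kurtosis means a leptokurtic (peaked) curve, negative
# means platykurtic (flattened), and 0 means mesokurtic (normal).

def skewness(scores):
    n = len(scores)
    m = sum(scores) / n
    m2 = sum((x - m) ** 2 for x in scores) / n   # second moment
    m3 = sum((x - m) ** 3 for x in scores) / n   # third moment
    return m3 / m2 ** 1.5

def excess_kurtosis(scores):
    n = len(scores)
    m = sum(scores) / n
    m2 = sum((x - m) ** 2 for x in scores) / n
    m4 = sum((x - m) ** 4 for x in scores) / n   # fourth moment
    return m4 / m2 ** 2 - 3                      # 0 for a normal curve

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 12]   # one extreme high score: positive skew
print(round(skewness(symmetric), 3))
print(skewness(right_tailed) > 0)
```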
Statistics
- What statistical techniques tell us about the data:
- Reliability (significance) of an effect: if the research is repeated again under like conditions, will the independent variable cause a significant change in the dependent variable scores?
- Strength of the relationship (meaningfulness): how powerful was the introduction of the independent variable on the outcome of the dependent variable scores? What is the effect size, or magnitude of the relationship?
Parametric Statistics - Overview
- Different statistical techniques:
- Relationships between groups: correlation
- Cause and effect: correlation is no proof of causation.
- Differences between groups: t tests and ANOVA
- Differences can be significant but not meaningful