Title: Statistical Reasoning
2. Statistical Reasoning
- Descriptive statistics are used to organize and summarize data in a meaningful way.
- Frequency distributions: where are the majority of the scores?
- Used to organize raw scores, or data, so that the information makes sense at a glance.
- They take scores and arrange them in order of magnitude, along with the number of times each score occurs.
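A minimal sketch of building a frequency distribution in Python. The quiz scores are hypothetical and are not taken from the slides; the point is only to show raw scores being arranged in order of magnitude with a count for each.

```python
from collections import Counter

# Hypothetical raw quiz scores (an assumption, not data from the slides).
raw_scores = [7, 9, 8, 7, 10, 8, 7, 6, 9, 7]

# Count how many times each score occurs.
counts = Counter(raw_scores)

# Arrange the scores in order of magnitude, paired with their frequencies.
frequency_distribution = sorted(counts.items())

for score, frequency in frequency_distribution:
    print(f"score {score:>2}: occurs {frequency} time(s)")
```

Reading down the printed list shows at a glance where most of the scores fall.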
3. [Figure: score distributions for Multiple Choice, Composite, and Essay. Labels: Mean = 9.3, SD = 2.3; Mean = 10.2, SD = 2.0; Mean = 34.3, SD = 4.2]
4. Histograms and Frequency Polygons
- These are ways of showing your frequency distribution data.
- Histogram: graphically represents a frequency distribution by making a bar chart using vertical bars that touch.
- When you have a continuous scale (for example, scores on a test go from 0-100, continuously getting larger), the bars touch, because you have to have a class for each score to fall into, and you can't have any gaps.
- Different from a bar graph, which is used when you have non-continuous classes (for example, which candidate do you support, Obama or McCain? You'd have a bar for each, with gaps in between, because you can't fall between two candidates; you have to pick one).
- Frequency polygon: graphically represents a frequency distribution by marking each score category along a graph's horizontal axis and connecting the marks with straight lines (line graph).
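The idea that every score on a continuous scale must fall into some class can be sketched as follows. The scores and the class width of 10 are assumptions chosen for illustration, and the text bars stand in for a real chart.

```python
from collections import Counter

# Hypothetical test scores on a continuous 0-100 scale (not from the slides).
scores = [55, 62, 67, 71, 74, 78, 78, 81, 85, 88, 93, 100]
CLASS_WIDTH = 10  # assumed class size


def class_start(score, width=CLASS_WIDTH):
    """Return the lower bound of the class a score falls into.

    A score of exactly 100 is placed in the top class (90-100).
    """
    return min(score // width * width, 100 - width)


histogram = Counter(class_start(s) for s in scores)

# Print every class from 0 to 100, including empty ones, so there are no gaps.
for start in range(0, 100, CLASS_WIDTH):
    print(f"{start:>3}-{start + CLASS_WIDTH:<3} {'#' * histogram[start]}")
```

Empty classes are still printed, which is the text equivalent of bars that touch with no gaps between them.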
5. Histogram: uses touching vertical bars to show data
6. Frequency Polygon: uses a line graph to show data
7. Skewed Curves
- Skewed distribution: when more scores pile up on one side of the distribution than the other. Positively skewed means more people have low scores. Negatively skewed means more people have high scores.
- Positive and negative refer to the direction of the tail of the curve; they do not mean good or bad.
A positive skew has a tail that goes to the right. A negative skew has a tail that goes to the left.
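One way to check the direction of a skew numerically is a moment-based skewness value, which is positive when the tail goes right and negative when it goes left. This is a sketch under assumptions: the slides do not name a formula, the `skewness` helper is illustrative, and both score sets are hypothetical.

```python
import statistics


def skewness(scores):
    """Average cubed deviation from the mean, divided by the SD cubed."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return sum((x - mean) ** 3 for x in scores) / (len(scores) * sd ** 3)


# Hypothetical score sets (not from the slides).
mostly_low = [1, 2, 2, 2, 3, 3, 4, 10]   # scores pile up low, tail to the right
mostly_high = [1, 7, 8, 8, 9, 9, 9, 10]  # scores pile up high, tail to the left

print("mostly low scores: ", "positive" if skewness(mostly_low) > 0 else "negative", "skew")
print("mostly high scores:", "positive" if skewness(mostly_high) > 0 else "negative", "skew")
```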
8. Measures of Central Tendency
- A single number that gives us information about the center of a frequency distribution.
Measures of central tendency: 3 types
Example scores: 4, 4, 3, 4, 5
- 1. Mode = most common score = 4
- (Reports what there is more of. Used with data that have no connection to each other, such as categories: you can't average men and women.)
- 2. Mean = arithmetic average = 20/5 = 4
- (Has the most statistical value but is susceptible to the effects of extreme scores.)
- 3. Median = middle score = 4
- (Half the scores are higher, half are lower. Used when there are extreme scores.)
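The three measures for the scores 4, 4, 3, 4, 5 can be checked with Python's standard `statistics` module. This is only a quick verification sketch; the variable names are illustrative.

```python
import statistics

scores = [4, 4, 3, 4, 5]  # the five scores used on the slide

mode = statistics.mode(scores)      # most common score
mean = statistics.mean(scores)      # arithmetic average: 20 / 5
median = statistics.median(scores)  # middle score once sorted: 3, 4, 4, 4, 5

print("mode:", mode, "mean:", mean, "median:", median)
```

All three measures agree here because the scores are balanced around 4 and contain no extreme values.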
9. Central Tendency
An extremely high or low price/score can skew the mean. Sometimes the median is better at showing you the central tendency.
1968 TOPPS Baseball Cards
- Nolan Ryan 1500
- Billy Williams 8
- Luis Aparicio 5
- Harmon Killebrew 5
- Orlando Cepeda 3.50
- Maury Wills 3.50
- Jim Bunning 3
- Tony Conigliaro 3
- Tony Oliva 3
- Lou Piniella 3
- Mickey Lolich 2.50
- Elston Howard 2.25
- Jim Bouton 2
- Rocky Colavito 2
- Boog Powell 2
- Luis Tiant 2
- Tim McCarver 1.75
- Tug McGraw 1.75
- Joe Torre 1.5
- Rusty Staub 1.25
- Curt Flood 1
With Ryan: Median = 2.50, Mean = 74.14
Without Ryan: Median = 2.38, Mean = 2.85
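The effect of the one extreme card price can be reproduced with a short sketch. The prices are copied from the list above; the variable names are illustrative.

```python
import statistics

# Card prices from the slide, with Nolan Ryan's card listed first.
prices = [1500, 8, 5, 5, 3.50, 3.50, 3, 3, 3, 3, 2.50, 2.25,
          2, 2, 2, 2, 1.75, 1.75, 1.5, 1.25, 1]

with_ryan = prices
without_ryan = prices[1:]  # drop the single extreme price

for label, data in (("With Ryan", with_ryan), ("Without Ryan", without_ryan)):
    median = statistics.median(data)
    mean = statistics.mean(data)
    print(f"{label}: median = {median:.2f}, mean = {mean:.2f}")
```

Removing one card barely moves the median but changes the mean by more than seventy, which is why the median is the safer summary when an extreme value is present.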
10. Does the mean accurately portray the central tendency of incomes? No!
What measure of central tendency would more accurately show income distribution? The median: the majority of the incomes cluster around that number.
11. Measures of Variability
- Gives us a single number that presents us with information about how spread out scores are in a frequency distribution. (See example of why this is important.)
- Range: the difference between the highest and lowest score.
- Take the highest score and subtract the lowest score from it. (Can be skewed by an extreme score.)
- Standard deviation: how spread out is your data?
- The larger this number is, the more spread out the scores are from the mean.
- The smaller this number is, the more consistently the scores sit close to the mean.
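A small sketch of why variability matters: two groups can share the same mean and still look very different. Both classes below are hypothetical, and `score_range` is an illustrative helper name.

```python
import statistics

# Two hypothetical classes with the same mean score (not from the slides).
class_a = [68, 69, 70, 71, 72]
class_b = [40, 55, 70, 85, 100]


def score_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


for name, scores in (("Class A", class_a), ("Class B", class_b)):
    print(
        f"{name}: mean = {statistics.mean(scores)}, "
        f"range = {score_range(scores)}, "
        f"SD = {statistics.pstdev(scores):.2f}"
    )
```

The mean alone would suggest the two classes performed the same; the range and standard deviation show that Class B's scores are far more spread out.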
12. Calculating Standard Deviation: How spread out (consistent) is your data?
- 1. Calculate the mean.
- 2. Take each score and subtract the mean from it.
- 3. Square the new scores to make them positive.
- 4. Average the squared scores. This average is the variance.
- 5. Take the square root of the variance to get back to your original unit of measurement.
- 6. The smaller the number, the more closely packed the data. The larger the number, the more spread out it is.
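13. Standard Deviation
Punt Distance (yds)   Deviation from Mean   Deviation Squared
36                    36 - 40 = -4          16
38                    38 - 40 = -2          4
41                    41 - 40 = +1          1
45                    45 - 40 = +5          25
Deviation squared: each deviation multiplied by itself. The squared deviations are then added together: 16 + 4 + 1 + 25 = 46.
Mean = 160/4 = 40 yds
Variance = 46/4 = 11.5
Standard Deviation = square root of 11.5, about 3.4 yds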
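The punt example can be reproduced step by step in a short sketch. It divides by the number of scores, as the slide does; note that Python's `statistics.stdev` divides by one less than the number of scores, so it would give a slightly larger value.

```python
import math

punts = [36, 38, 41, 45]  # punt distances in yards, from the slide

mean = sum(punts) / len(punts)           # calculate the mean: 160 / 4
deviations = [x - mean for x in punts]   # subtract the mean from each score
squared = [d ** 2 for d in deviations]   # square each deviation
variance = sum(squared) / len(squared)   # average the squared deviations: 46 / 4
std_dev = math.sqrt(variance)            # square root returns the result to yards

print("mean:", mean)
print("deviations:", deviations)
print("squared deviations:", squared)
print("variance:", variance)
print("standard deviation:", round(std_dev, 2))
```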
14. Standard Normal Distribution Curve
- Characteristics of the normal curve:
- Bell-shaped curve where the mean, median, and mode are all the same and fall exactly in the middle.
15. [Figure: score distributions for Multiple Choice, Composite, and Essay. Labels: Mean = 9.3, SD = 2.3; Mean = 10.2, SD = 2.0; Mean = 34.3, SD = 4.2]
Are these scores consistent? Is there a skew?
16. Skewed Curves
Skewed distribution: when more scores pile up on one side of the distribution than the other. Positively skewed means more people have low scores. Negatively skewed means more people have high scores.
A positive skew has a tail that goes to the right. A negative skew has a tail that goes to the left.
17. Z-Scores
A number expressed in standard deviation units that shows an individual score's deviation from the mean. Basically, it shows how you did compared to everyone else. A positive Z-score means you are above the mean; a negative Z-score means you are below the mean.
Z-score = (your score - the average score) / standard deviation
Which class did you perform better in compared to your classmates?
Z-score in Biology: 168 - 160 = 8; 8 / 4 = +2
Z-score in Psych: 44 - 38 = 6; 6 / 2 = +3
You performed better in Psych compared to your classmates.
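The two class comparisons can be expressed as one small function. The numbers are the ones used in the Biology and Psych example; the function and variable names are illustrative.

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, measured in standard deviations."""
    return (score - mean) / sd


biology = z_score(168, mean=160, sd=4)
psych = z_score(44, mean=38, sd=2)

print("Biology z-score:", biology)
print("Psych z-score:", psych)
print("Better relative performance:", "Psych" if psych > biology else "Biology")
```

Although the raw Biology score is much higher, the Psych score sits further above its class average once each is measured in standard deviation units.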
18. Correlation
- Correlation shows the relationship between two variables.
- The closer the correlation is to +1 or -1, the stronger the relationship between the two variables.
- This enables us to predict. However, correlation does not mean causation.
19. Positive Correlation
- As the value of one variable increases (or decreases), so does the value of the other variable.
- When A goes up, B goes up, or
- When A goes down, B goes down.
- A perfect positive correlation is +1.0.
- The closer the correlation is to +1.0, the stronger the relationship.
20. Negative Correlation
- As the value of one variable increases, the value of the other variable decreases.
- When A goes up, B goes down, or
- When A goes down, B goes up.
- A perfect negative correlation is -1.0.
- The closer the correlation is to -1.0, the stronger the relationship.
21. Zero Correlation
- There is no relationship whatsoever between the
two variables.
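The slides do not say how a correlation is calculated. A common choice is the Pearson correlation coefficient, sketched here under that assumption. All four data lists are hypothetical and were picked so the results come out to exactly +1, -1, and 0.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariation = sum(a * b for a, b in zip(dx, dy))
    return covariation / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical data (not from the slides).
a = [1, 2, 3, 4, 5]
rises_with_a = [2, 4, 6, 8, 10]   # when A goes up, B goes up
falls_with_a = [10, 8, 6, 4, 2]   # when A goes up, B goes down
unrelated = [2, 5, 3, 1, 4]       # no consistent pattern with A

print("positive:", pearson_r(a, rises_with_a))
print("negative:", pearson_r(a, falls_with_a))
print("zero:    ", pearson_r(a, unrelated))
```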
22. Let's Review
23. Inferential Statistics
- Techniques that allow a researcher to determine whether a study's outcome is more than just chance events.
- Usually you would use inferential statistics to try to predict things about a population based on a sample.
- For example, we surveyed 50 staff members in the district about their level of education and are trying to use that to predict the average level of education for all staff in the district.
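The staff survey idea can be simulated with made-up data. Everything here is an assumption for illustration: the population of 2,000 staff, the years-of-education values, and the fixed random seed that makes the run repeatable.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch gives the same result each run

# Hypothetical population: years of education for 2,000 district staff.
population = [rng.choice([12, 14, 16, 18, 20]) for _ in range(2000)]

# Survey a random sample of 50 staff members.
sample = rng.sample(population, 50)

sample_mean = statistics.mean(sample)          # what the researcher can measure
population_mean = statistics.mean(population)  # what the researcher wants to know

print(f"sample mean: {sample_mean:.2f}")
print(f"population mean: {population_mean:.2f}")
```

In a real study the population mean is unknown; it is computed here only to show how close a random sample of 50 tends to land.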
24. Statistical Significance
- p-value: the likelihood that a result at least this strong would occur by chance alone. In other words, are the results statistically significant? If the answer is yes, they can be generalized to a larger population.
- A large p-value is bad for a researcher. They want this number to be as small as possible to show that any change in their experiment was caused by the independent variable and not some outside force.
- Results are considered statistically significant if the probability of obtaining them by chance alone is less than .05, or 5% (p < .05).
- This means the researcher must be 95% certain their results are not caused by chance.
- Replication of the experiment helps confirm whether the result holds up.
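A toy illustration of the .05 cutoff, using coin flips rather than a real experiment. The scenario (16 heads in 20 flips) and the `chance_probability` helper are assumptions made for this sketch; it simply counts how often a fair coin would give a result at least that lopsided.

```python
from math import comb


def chance_probability(heads, flips):
    """Probability of getting at least `heads` heads in `flips` fair coin flips."""
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2 ** flips


# Hypothetical result: 16 heads in 20 flips of a coin assumed to be fair.
p = chance_probability(16, 20)
significant = p < 0.05

print(f"p = {p:.4f}")
print("statistically significant" if significant else "could easily be chance")
```

A milder result such as 12 heads in 20 flips gives a probability well above .05, so it would not count as statistically significant.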
25. Does the sample represent the population?
- Non-biased sample: good
- Low variability: good
- Larger samples: good
- A population is a complete set of something.
- A sample is a subset of a population.