Title: Action Research Review
1Action Research Review
2Why do we do this?
- Measurements are needed to understand a system and predict its future behavior
- Statistical techniques provide a commonly accepted means of analyzing measurements
- Statistics is based on recognizing that measurements tend to fall over a range of values, not just one precise number
3Types of Research
- Historical (what happened?)
- Descriptive (what is happening?)
- Developmental (over time)
- Case and Field (study an organization)
- Correlational (does A affect B?)
- Causal Comparative (what caused it)
- True Experimental (single / double blind)
- Quasi-Experimental
- Action Research
4Data Analysis
- Raw data, such as one survey result
- Refined data, such as the distribution of ages of Philadelphia residents
- Derived data, such as comparing the age distribution of Philadelphia residents to that of the country
5Population vs. Sample
- Often the subject of interest (the population) is so big that it isn't feasible to measure it all
- Then a sample of measurements can be made, and we want to relate the sample measurement to the population
6Sampling
- Sampling can be done using probabilistic techniques (e.g. various random samples)
  - Simple or stratified random,
  - Cluster (geographic), or
  - Systematic (every Nth) samples
- Or using non-probabilistic methods (whoever's convenient, specific groups, or experts)
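A minimal sketch of three of the probabilistic techniques named on this slide, using only Python's standard library. The function names and the population of ID numbers are hypothetical, chosen just for illustration.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n items without replacement; every item is equally likely."""
    return random.Random(seed).sample(population, n)

def stratified_sample(strata, n_per_stratum, seed=0):
    """Draw a simple random sample of fixed size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def systematic_sample(population, step, start=0):
    """Take every step-th item, beginning at index start."""
    return population[start::step]

people = list(range(1, 101))  # hypothetical population of 100 ID numbers
print(simple_random_sample(people, 5))
print(systematic_sample(people, 25))  # every 25th person
```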
7Customer Satisfaction Surveys
- A special case of sampling, customer satisfaction surveys are often done using
  - In-person interview
  - Telephone interview
  - Questionnaire by mail
- Sample sizes are based on the allowable error,
population size, and the result obtained
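The slide does not say which sample-size formula it has in mind. One common choice, assumed here, is Cochran's formula for a proportion with a finite population correction. In this sketch `p` stands in for "the result obtained" (the expected proportion), and `z = 1.96` assumes a 95% confidence level.

```python
import math

def survey_sample_size(margin_of_error, population=None, p=0.5, z=1.96):
    """Sample size for estimating a proportion (Cochran's formula, an assumption).

    margin_of_error: allowable error, e.g. 0.05 for +/- 5 percentage points
    population: population size, or None to treat it as very large
    p: expected proportion; 0.5 is the most conservative choice
    """
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    if population is not None:
        # Finite population correction shrinks the required sample.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

print(survey_sample_size(0.05))          # very large population
print(survey_sample_size(0.05, 10_000))  # population of 10,000 customers
```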
8Measurement Scales
- Measurements can use four major types of scales; the types of analysis possible depend strongly on the type of measurements used
  - Nominal (named buckets, without sequence)
  - Ordinal (ordered buckets)
  - Interval (intervals mean something; can add and subtract)
  - Ratio (you can form ratios; can add, subtract, multiply, and divide)
9Discrete versus Continuous
- Discrete (nonparametric) measurements use nominal or ordinal scales; only specific values are allowed
  - Car make = Chevy, or cost = High
- Continuous (parametric) measurements use interval or ratio scales, and generally have integer or real number values
  - Temperature = 98.6 deg F, Height = 172.1 cm
10Descriptive Statistics
- Many common statistics can describe the central tendency and spread of a set of measurements
- Average (arithmetic mean)
- Minimum, Maximum, Range
- Median (middle value)
- Mode (most common value)
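As an illustration, the statistics listed on this slide can be computed with Python's built-in `statistics` module. The data values are made up for the example.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical measurements

summary = {
    "mean": statistics.mean(data),      # arithmetic average
    "minimum": min(data),
    "maximum": max(data),
    "range": max(data) - min(data),
    "median": statistics.median(data),  # middle value of the sorted data
    "mode": statistics.mode(data),      # most common value
}
for name, value in summary.items():
    print(f"{name}: {value}")
```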
11Normal Distribution
- Many measurements can be described by a normal distribution, which is summarized by an average value and a standard deviation, $\sigma$ or $s$
- We can predict how likely any range of values is to occur for a normal distribution (how often is X between 5 and 8?)
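The "how often is X between 5 and 8?" question can be sketched with `statistics.NormalDist`. The slide gives no mean or standard deviation, so the values 6 and 2 below are assumptions made only so the example runs.

```python
from statistics import NormalDist

# Assumed parameters: mean 6, standard deviation 2 (not given on the slide).
dist = NormalDist(mu=6.0, sigma=2.0)

def probability_between(dist, low, high):
    """P(low < X < high) is the difference of the cumulative probabilities."""
    return dist.cdf(high) - dist.cdf(low)

p = probability_between(dist, 5, 8)
print(f"P(5 < X < 8) = {p:.4f}")
```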
12Z Score
- Z scores measure how far from the mean a single measurement is: $z = (X_i - \mu) / \sigma$
  - Same formula used for finding t too
- Does not only apply to a normal distribution, but if it does, then we can predict the probability of that value or higher/lower occurring
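A short sketch of the z score and the "that value or higher/lower" probabilities, assuming the measurement really does come from a normal distribution. The measurement, mean, and standard deviation are illustrative values.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

z = z_score(8.0, mean=6.0, sd=2.0)    # hypothetical measurement and parameters
p_lower = NormalDist().cdf(z)         # probability of this value or lower
p_higher = 1 - p_lower                # probability of this value or higher
print(f"z = {z:.2f}, P(lower) = {p_lower:.4f}, P(higher) = {p_higher:.4f}")
```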
13Standard Error
- A sample of N measurements will have a standard error $SE_x = s / \sqrt{N}$
- The standard error allows us to define the confidence interval, CI: $CI = \text{mean} \pm \text{crit} \times SE_x$, where crit is the critical z score for a large sample, or the critical t score for a small sample
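The two formulas on this slide can be sketched as follows. The sample is made up, and the critical value 2.306 is the two-tailed 95% critical t for df = 8 taken from a standard t table, since the Python standard library has no t distribution.

```python
import math
import statistics

def confidence_interval(sample, crit):
    """Return (low, high, standard_error) for the mean of a sample."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # SE = s / sqrt(N)
    return mean - crit * se, mean + crit * se, se

sample = [4, 6, 5, 7, 8, 6, 5, 7, 6]  # hypothetical small sample, N = 9
T_CRIT_DF8_95 = 2.306                 # table value, two-tailed, df = N - 1 = 8
low, high, se = confidence_interval(sample, T_CRIT_DF8_95)
print(f"SE = {se:.3f}, 95% CI = ({low:.2f}, {high:.2f})")
```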
14Critical z and t
- The critical z score is only a function of the desired confidence level of the results ($z_c = 1.96$ for a 95% confidence level)
- Critical t score is a function of the sample size (degrees of freedom, df = n - 1) and the desired confidence level
- As df gets very large, critical t approaches critical z
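As a quick check on the 1.96 figure, the two-tailed critical z can be recovered from the inverse normal CDF. Critical t values need a t distribution, which the standard library lacks, so this sketch covers z only.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Two-tailed critical z score for a confidence level such as 0.95."""
    alpha = 1 - confidence               # critical significance
    # Half of alpha sits in each tail, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.95, 0.99):
    print(f"{level:.0%} confidence: z_c = {critical_z(level):.3f}")
```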
15Confidence Level
- We have to accept some level of uncertainty in a statistical analysis; our conclusion might be wrong!
- Generally, a 95% level of confidence is used, unless life is on the line; then a 99% level of confidence is required
- Use 95% typically, hence critical significance is 0.050
16Confidence Level
- The level of confidence of your results, plus the critical significance, always equals exactly one
- For practically every statistical test, having the Significance of the result less than the critical value means to reject the null hypothesis
  - If Sig actual < Sig crit, reject null hypothesis
17Frequency and Percentage
- Frequency graphs and crosstabs can provide a lot of information just from counts of a nominal or ordinal measurement occurring, possibly given with the percentages of each event's occurrence
- Histograms can provide similar charts for ratio or interval scaled data
18Scatterplots
- Scatter plots or diagrams show the relationship between two or more measures
- The horizontal axis is generally the independent variable (X), sometimes also called a factor or grouping variable
- The vertical axis is generally the dependent variable (Y), which is the measure you're trying to understand
19Hypothesis Testing
- Some statistics are used in the context of testing a hypothesis: a statement whose truth you wish to determine
  - Are Philadelphians more likely to be Nobel Prize winners?
- The Null hypothesis is the opposite of the hypothesis, and generally says there is no difference or no effect observed
  - Philadelphians are no more likely to be Nobel Prize winners than any other group
20Hypothesis Testing
- Can't truly PROVE anything; can only determine if the differences observed are not likely to be due to chance
- Select one or more Tests of Significance to determine if there is a statistically significant difference (Yes/No); if Yes, then can
- Select one or more Measures of Association to describe the strength of the difference, and possibly its direction
21One versus Two Tailed Tests
- A null hypothesis which tests for no difference uses a two-tailed test
- A null hypothesis which specifically tests for greater than uses a one-tailed test
- A null hypothesis which specifically tests for less than uses a one-tailed test
- One versus two tailed changes the critical z or t score; a one-tailed test generally makes it easier to show significance, which is why two-tailed tests are used
22Z or T Test
- The z or t tests can be used to compare two distribution means, or compare one distribution mean to a fixed value (interval or ratio data)
- Compare the actual z or t score to the critical z or t score
- If the actual z or t score is closer to zero than the critical value, accept (that is, fail to reject) the null hypothesis
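A rough sketch of comparing one sample mean to a fixed value with a z test, for both the two-tailed case and the one-tailed "greater than" case. It assumes a large sample with a known standard deviation; all numbers are invented for illustration.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, fixed_value, sigma, n,
                      confidence=0.95, tails=2):
    """Return (z, critical_z, reject_null) for a one-sample z test."""
    z = (sample_mean - fixed_value) / (sigma / math.sqrt(n))
    crit = NormalDist().inv_cdf(1 - (1 - confidence) / tails)
    if tails == 2:
        reject = abs(z) > crit  # far enough from zero in either direction
    else:
        reject = z > crit       # one-tailed "greater than" case only
    return z, crit, reject

# Hypothetical: sample mean 51.8 vs fixed value 50, sigma 10, N = 100.
two = one_sample_z_test(51.8, 50, 10, 100, tails=2)
one = one_sample_z_test(51.8, 50, 10, 100, tails=1)
print("two-tailed:", two)
print("one-tailed:", one)
```

With these numbers the same z score fails the two-tailed test but passes the one-tailed test, which illustrates why the choice of tails matters.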
23Z or T Test (Two Tailed)
Notice this is for the z or t value, NOT the significance of that value
24Z or T Test (One Tailed)
(The case here is testing if the actual value is greater than the mean; for a less-than case, use only the negative critical value.)
25Is My Sample Normal?
- Boxplots and stem-and-leaf diagrams can help show graphically whether a sample has a fairly normal distribution
- The skewness and kurtosis of a data set can help identify non-normality, if their values are more than two times their own standard errors
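The two-standard-errors rule of thumb can be sketched as below. This uses simple moment-based skewness and excess kurtosis, and the large-sample approximations sqrt(6/N) and sqrt(24/N) for their standard errors. Both are assumptions; a statistics package may use slightly different small-sample formulas.

```python
import math

def skewness_and_kurtosis(data):
    """Moment-based skewness and excess kurtosis (0 for a normal curve)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

def looks_non_normal(data):
    """Flag data whose skewness or kurtosis exceeds twice its standard error."""
    n = len(data)
    skew, kurt = skewness_and_kurtosis(data)
    se_skew = math.sqrt(6 / n)   # approximate standard error (assumption)
    se_kurt = math.sqrt(24 / n)  # approximate standard error (assumption)
    return abs(skew) > 2 * se_skew or abs(kurt) > 2 * se_kurt

print(looks_non_normal([1, 2, 3, 4, 5]))   # symmetric sample
print(looks_non_normal([1] * 29 + [100]))  # one extreme value
```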
26T Tests
- T tests compare means for ratio or interval data
- Independent t test is for two different strata within one data set
- Paired t test is to compare measures of the same group before and after some event (drug test), or when the samples are otherwise believed to be dependent on each other
- One-sample t test compares one sample to a fixed value
27T Tests
- Null hypothesis is that there is no difference between the means
- Results (e.g. significance) may differ if variances are not equal, since df changes
- The Levene test checks for equal variances
  - Null hypothesis for the Levene test is that the variances are equal
  - If the Levene significance < 0.050, variances are not equal (reject the null hypothesis)
28Independent T Test Evaluation
- Three ways to check the results of a T test
  - If the T test's significance < 0.050, reject the null hypothesis
  - Check the stated t value against the critical t value for this df level; if the magnitude of t(actual) > t(critical), reject the null hypothesis
  - If the confidence interval for the difference between the means does not include zero, reject the null hypothesis
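A minimal sketch of two of the three checks (t value against critical t, and the confidence interval for the difference) for an independent t test that assumes equal variances. The significance check needs a t distribution from a statistics package, so it is left out. The groups are invented, and 2.306 is the two-tailed 95% critical t for df = 8 from a standard table.

```python
import math
import statistics

def independent_t_test(a, b, t_critical):
    """Pooled-variance t test; returns (t, df, ci, reject_by_t, reject_by_ci)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    diff = statistics.mean(a) - statistics.mean(b)
    t = diff / se
    ci = (diff - t_critical * se, diff + t_critical * se)
    reject_by_t = abs(t) > t_critical
    reject_by_ci = not (ci[0] <= 0 <= ci[1])  # zero outside the interval
    return t, df, ci, reject_by_t, reject_by_ci

group_a = [5, 6, 7, 8, 9]  # hypothetical stratum A
group_b = [1, 2, 3, 4, 5]  # hypothetical stratum B
t, df, ci, by_t, by_ci = independent_t_test(group_a, group_b, t_critical=2.306)
print(f"t = {t:.2f}, df = {df}, CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print("reject null:", by_t, by_ci)
```

The two checks always agree with each other, because both are built from the same standard error and critical value.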
29Evaluating Significance
30Paired T Test Evaluation
- Checks before and after test cases
- Includes a correlation factor (like r)
  - Can use paired test if significance < 0.050
  - Larger correlation factor means stronger relationship between the variables
- Test evaluation is the same as for the Independent T Test
  - Significance, t value, and confidence interval
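The paired test works on the before/after differences for each subject, which a short sketch can make concrete. The scores are invented, and 2.776 is the two-tailed 95% critical t for df = 4 from a standard table.

```python
import math
import statistics

def paired_t_test(before, after, t_critical):
    """Return (t, df, reject_null) computed from per-subject differences."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of mean diff
    t = statistics.mean(diffs) / se
    return t, n - 1, abs(t) > t_critical

before = [10, 12, 9, 11, 13]   # hypothetical scores before the event
after = [12, 14, 10, 13, 15]   # same subjects afterwards
t_paired, df_paired, reject = paired_t_test(before, after, t_critical=2.776)
print(f"t = {t_paired:.2f}, df = {df_paired}, reject null: {reject}")
```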
31One-Sample T Test
- Compare a sample mean to a fixed value
- Test shows the actual values of means, with their std deviation and std error
- Same interpretation of results
  - Significance, t value, and confidence interval
32F Test and ANOVA
- Compare several means against each other using Analysis of Variance (ANOVA) and the F test
- Like extending the T tests to many variables
- Want data from random samples of normal
populations with equal variances
33F Test and ANOVA
- Output includes the Levene test
  - Want significance for Levene > 0.050, so that equal variances can be assumed
  - Otherwise, should not use ANOVA
- Evaluate F by its significance
  - If Sig. < 0.050, reject the null hypothesis (there is a significant difference among the means)
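For reference, the F statistic behind a one-way ANOVA is the ratio of between-group to within-group mean squares. The sketch below computes it by hand for made-up groups. Its significance would come from an F distribution; here the table value 5.14 (critical F for 2 and 6 degrees of freedom at 0.05) is used instead.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical subsets
f_value, df_b, df_w = one_way_anova_f(groups)
F_CRIT_2_6 = 5.14                           # table value at 0.05
print(f"F = {f_value:.2f} (df {df_b}, {df_w}), reject null: {f_value > F_CRIT_2_6}")
```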
34Additional ANOVA Tests
- Once the F test shows there is some difference in the means across a subset, additional ANOVA tests can help identify more specific trends and differences
- Types of tests (see end of lecture 6) include
  - Pairwise Multiple Comparisons
  - Post Hoc Range Tests
35Pairwise Multiple Comparisons
- Pairwise Multiple Comparisons check two subsets of data at a time
- Bonferroni test is better for a small number of subsets
- Tukey test is better for many subsets
- Both assume subset variances are equal
- For each pair of subset values, Sig < 0.050 means the difference in means is significant
36Post Hoc Range Tests
- Post Hoc Range Tests look for groups of subsets which all have statistically similar means
- Tukey and Tukey's-b tests include Post Hoc Range Tests
- Each column of the output is a subset with statistically similar means
- Subsets may overlap substantially
37Contrasts Across Means
- Look across subset means to see if there is a trend, such as a linear increase or decrease across subsets
- Can check for Linear, Quadratic, or Cubic relationships
  - (i.e. first, second, or third order polynomials)
- Check Significance of F for the Unweighted version of each relationship (Linear, etc.); if Sig. < 0.050, reject the null hypothesis
38Determine Linearity
- An option under Compare Means / Means allows checking just for linearity
- This confirms the ANOVA test result for Linearity
- And gives R and Eta parameters, which are Measures of Association
39R and Eta
- Pearson's R measures how well the data fits the regression (-1 is a perfect negative correlation, 0 is no linear relationship, 1 is perfect positive correlation), and its square describes the amount of shared variance between them
- Eta squared gives how much of the variance in one variable is explained by the changes in the other variable
Named for English statistician Karl Pearson, 1857-1936 (per http://human-nature.com/nibbs/03/kpearson.html)
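A small sketch of Pearson's r computed from its definition, with r squared as the shared variance. The x and y values are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation divided by the product of spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]  # hypothetical independent variable
y = [2, 4, 5, 4, 5]  # hypothetical dependent variable
r = pearson_r(x, y)
print(f"r = {r:.3f}, shared variance r^2 = {r * r:.2f}")
```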
40Regression Analysis
- Regression Analysis looks at two interval or ratio-scaled variables (generically X and Y) and tries to fit an equation between them
- A dozen different equations are available
  - Linear, Power, Logarithmic, Exponential, etc.
- Significance is checked by ANOVA F, and Sig. of the regression coefficients; association is measured with R Squared
41Regression Analysis
- For a regression to have any significance, we must have ANOVA's Sig. F < 0.050
- Then each variable's coefficient (b0, b1, etc.) must have significance < 0.050
  - Otherwise the coefficient might be zero
- Then the better regression equations are ranked in order of strength by R Square, which is confirmed visually by plotting
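To make the coefficients b0 and b1 and R Square concrete, here is a minimal least-squares fit of the Linear equation only, with invented data. The significance tests for F and for each coefficient are not reproduced, since they require F and t distributions.

```python
def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx                # slope
    b0 = my - b1 * mx             # intercept
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return b0, b1, 1 - ss_res / ss_tot

xs = [1, 2, 3, 4, 5]  # hypothetical X values
ys = [2, 4, 5, 4, 5]  # hypothetical Y values
b0, b1, r_squared = fit_line(xs, ys)
print(f"Y = {b0:.2f} + {b1:.2f} X, R Square = {r_squared:.2f}")
```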
42Regression Analysis
- The standard error of coefficients is given, so confidence intervals can be formed
- Also helps report them meaningfully, so you don't report a value as 4.861435 if it has a standard error of 0.92
- Depending on the accuracy of the source data, you could report that result as 5 ± 1, or 4.9 ± 0.9, or 4.86 ± 0.92
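One possible way to automate this reporting rule is to round the value to the same decimal place as the leading digits of its standard error. The helper name and the `sig_digits` parameter are hypothetical.

```python
import math

def round_to_standard_error(value, std_error, sig_digits=1):
    """Round value and std_error so the error keeps sig_digits significant digits."""
    # Position of the error's first significant digit, e.g. 0.92 -> tenths.
    decimals = sig_digits - 1 - math.floor(math.log10(std_error))
    return round(value, decimals), round(std_error, decimals)

print(round_to_standard_error(4.861435, 0.92))                # one digit of error
print(round_to_standard_error(4.861435, 0.92, sig_digits=2))  # two digits of error
```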
43Crosstabs
- Crosstabs display data sorted by two or more variables in table form
- Often just counts of each category, and/or the percentage of counts
- Recoding data allows interval or ratio scale data to be put into groups (e.g. age 18-25)
44Pearson's Chi Square
- Measures how much the actual (observed) data differs from an even (expected) distribution of data
- The expected data can be a random distribution (same number of counts per cell), or adjusted for the actual total counts for each row and column
45Pearson's Chi Square Evaluation
- When chi square is larger than the critical value, reject the null hypothesis
  - Or if the significance of chi square is < 0.050, reject the null hypothesis
- Can also generate Chi square for a single variable
- Beware that Chi square is less meaningful for large matrices
  - Or, it's too easy for large matrices to show significance falsely using Chi square
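As a sketch of the calculation, chi square for a crosstab can be computed by building each expected count from its row and column totals. The counts are invented, and 3.841 is the 0.05 critical value for df = 1 from a standard chi-square table.

```python
def chi_square(observed):
    """Return (chi2, df) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

table = [[10, 20], [30, 40]]  # hypothetical 2x2 crosstab of counts
chi2, df = chi_square(table)
CHI2_CRIT_DF1 = 3.841         # table value at 0.05
print(f"chi square = {chi2:.3f}, df = {df}, reject null: {chi2 > CHI2_CRIT_DF1}")
```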
46Residuals
- A residual is the difference between the Observed and Estimated values for a cell
- Residuals can be plotted to look for outliers
- Residuals can be standardized by dividing by their standard deviation
- Cells with a standardized residual magnitude > 2 contribute a lot to Chi square
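A sketch of standardized residuals for a crosstab, assuming the usual Pearson form in which a cell's standard deviation is estimated by the square root of its expected count. The counts are invented.

```python
import math

def standardized_residuals(observed):
    """(observed - expected) / sqrt(expected) for every cell of the table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(observed):
        out_row = []
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            out_row.append((obs - expected) / math.sqrt(expected))
        result.append(out_row)
    return result

counts = [[30, 10], [10, 30]]  # hypothetical crosstab
residuals = standardized_residuals(counts)
large = [(i, j) for i, row in enumerate(residuals)
         for j, value in enumerate(row) if abs(value) > 2]
print(residuals)
print("cells with magnitude > 2:", large)
```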
47Measures of Association
- Measures of Association between two variables can be symmetric or directional
- Dozens of measures have been developed to work with chi square test
- Interpret them like r: zero means no correlation, larger values mean a stronger correlation
  - Some can be > 1
48Measures of Association
- Symmetric measures don't care which variable is dependent (Y)
- Directional measures DO care which variable is dependent (A = f(B) is not B = f(A))
- Some directional measures have a symmetric value, the weighted average of the other two
49Symmetric Measures
- The Contingency Coefficient is the main symmetric measure with a Chi Square test
  - Works even with nominal data
  - Evaluated like Pearson's r
- Phi and Cramer's V are other symmetric measures
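All three symmetric measures on this slide are simple functions of chi square and the total count, which the sketch below illustrates. The inputs (chi square of 20 from 80 observations in a 2x2 table) are hypothetical.

```python
import math

def symmetric_measures(chi2, n, rows, cols):
    """Phi, Cramer's V, and the contingency coefficient from chi square."""
    phi = math.sqrt(chi2 / n)
    cramers_v = math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))
    contingency = math.sqrt(chi2 / (chi2 + n))
    return phi, cramers_v, contingency

phi, cramers_v, contingency = symmetric_measures(chi2=20.0, n=80, rows=2, cols=2)
print(f"phi = {phi:.3f}, Cramer's V = {cramers_v:.3f}, C = {contingency:.3f}")
```

For a 2x2 table phi and Cramer's V coincide; they differ only for larger tables.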
50Directional Measures
- Directional measures range from 0 to 1
- Lambda is the recommended directional measure; it tells what proportion of the dependent variable is predicted by the independent variable (like Eta)
- Eta can be applied here if one variable is interval or ratio scaled
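Lambda can be read as the proportional reduction in prediction errors. The sketch below assumes the Goodman and Kruskal definition, with rows as the independent variable and columns as the dependent variable. The counts are invented.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the column category from the row category."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # Errors when always guessing the most common column overall.
    errors_without = n - max(col_totals)
    # Errors when guessing the most common column within each row.
    errors_with = n - sum(max(row) for row in table)
    return (errors_without - errors_with) / errors_without

counts = [[30, 10], [10, 30]]  # hypothetical crosstab
lam = goodman_kruskal_lambda(counts)
print(f"lambda = {lam:.2f}")
```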
51Relative Risk and Odds Ratio
- Use only with 2x2 tables
- Are quite directional
- Tells how much more likely one cell is to occur than the others
- Need to be very careful when interpreting
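For a 2x2 table laid out as [[a, b], [c, d]], with rows as the two groups and the first column as the outcome of interest (an assumed layout), the two ratios can be sketched as:

```python
def relative_risk(table):
    """Risk of the outcome in row 1 divided by the risk in row 2."""
    (a, b), (c, d) = table
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(table):
    """Odds of the outcome in row 1 divided by the odds in row 2."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

counts = [[30, 10], [10, 30]]  # hypothetical: rows = groups, columns = outcome yes/no
rr = relative_risk(counts)
odds = odds_ratio(counts)
print(f"relative risk = {rr:.1f}, odds ratio = {odds:.1f}")
```

The same table gives quite different numbers for the two ratios, which is one reason careful interpretation is needed.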
52Square Tables
- Tables with the same number of rows and columns (RxR), and the same variables in those rows and columns, can use kappa
- Measures strength of association, like r
- Check results for significance (< 0.050)
- Then judge the value of kappa using a fixed scale
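A sketch of kappa for a square table, assuming Cohen's definition: observed agreement on the diagonal, corrected for the agreement expected by chance. The counts are invented.

```python
def cohens_kappa(table):
    """Kappa for a square table with the same categories in rows and columns."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    observed = sum(table[i][i] for i in range(len(table))) / n
    by_chance = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - by_chance) / (1 - by_chance)

ratings = [[30, 10], [10, 30]]  # hypothetical: two raters, two categories
kappa = cohens_kappa(ratings)
print(f"kappa = {kappa:.2f}")
```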
53General RxC Measures
- Many measures can be used with a general table of R rows and C columns
- Gamma is the recommended measure (symmetric)
- Spearman's Correlation Coefficient is also widely used
- Ranges from -1 to 1, based on ordered categories
54Yule's Q
- Yule's Q is a special case of gamma for a 2x2 table
- Is judged on a fixed scale, like r
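To illustrate the special-case relationship, the sketch below computes gamma from concordant and discordant pairs for any table with ordered categories, and Yule's Q directly from a 2x2 table. The counts are invented.

```python
def gamma(table):
    """Goodman and Kruskal's gamma for a table with ordered rows and columns."""
    rows, cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # only cells in lower rows
                for m in range(cols):
                    if m > j:                 # lower and to the right
                        concordant += table[i][j] * table[k][m]
                    elif m < j:               # lower and to the left
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)

def yules_q(table):
    """Yule's Q for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d - b * c) / (a * d + b * c)

counts = [[30, 10], [10, 30]]  # hypothetical 2x2 crosstab
print(f"gamma = {gamma(counts):.2f}, Yule's Q = {yules_q(counts):.2f}")
```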