Quantitative Analysis of Survey Data and Other Assessments for Non-Experts


1
Quantitative Analysis of Survey Data and Other
Assessments for Non-Experts

How to do SoTL without a statistician on retainer

Gintaras Duda Creighton University
June 2, 2011
2
My Background

I am a theoretical particle physicist
Came to SoTL (PER) as a junior faculty member
New faculty workshop experience
Huge roadblocks: no experience with
How educational research is conducted
Quantitative or qualitative analysis
Weak background in statistics

3
Areas of SoTL interest

Attitude of introductory physics students
Particularly how it affects learning
Online discussion behavior
Realism in physics
Problem-Based Learning in upper division courses
Student note taking
How students use the internet to learn physics

4
Workshop Purpose

As SoTL matures, publication requires more and
more rigorous measures and evidence
Sophisticated statistical tests
Careful survey design and analysis
Mixed method research
Evidence, evidence, evidence!
But, many of us are untrained in these things

5
Who are you?

Please share with the group
Name, institution, and discipline
Why you picked this workshop
What you hope to gain

6
Workshop Purpose continued

Leave you with some simple tools to analyze
Likert scale surveys
Effects of instruction
Survey reliability and validity
No stats class or methods courses required

7
Part I. What to do with Likert Scales

Likert scale instruments seem ubiquitous in SoTL
work
Particularly useful in measuring students'
attitudes, feelings, opinions, dispositions, etc.
Can use pre-post scheme to see changes and
growth/deterioration
Of interest in Jesuit Pedagogy (another workshop)

8
Example from physics

Attitudinal surveys
Measure changes in students' attitudes towards
physics due to instruction
Instruments: VASS, MPEX, C-LASS, Attitude II, and
others
These instruments all show a similar trend
Students have more negative attitudes towards
physics after instruction

9
Example Questions from Attitude II Instrument

Physics is irrelevant to my life
I can use physics in my everyday life
I will/did find it difficult to understand how
physics applies in the real world
I see and understand physics in technology and
the world around me
5-point Likert scale: strongly agree, weakly
agree, neutral, weakly disagree, strongly disagree

10
One of my Likert Scale Instruments
11
What do I do with Likert Scale Data?

Two camps on analyzing Likert scale data
Interval Approach
Ordinal Approach
Methods for data analysis differ between the two
approaches

12
Interval Data

Basic philosophy: differences between responses
are all equal
i.e. the difference to a student between strongly
disagree and weakly disagree is the same as the
difference between a neutral response and weakly
agree
Basic technique: sum the data and do some
statistics

13
Ordinal Data

Basic philosophy: differences between responses
are not equal
i.e. students tend not to distinguish much
between "strongly" and "weakly" statements
3-point Likert scale more appropriate?
Basic technique: examine statistically the number
of students who agreed or disagreed

14
Controversy over neutral response

Good debate in the literature about the
neutral/neither agree nor disagree response
Some claim it's crucial
Some claim you should get rid of it
Not going to discuss it here

15
Analyzing Ordinal Data

One method is to reduce the problem to a
binomial analysis
Lump all disagrees together, all agrees together,
and don't worry about neutral responses
Visual method: agree-disagree (Redish) plots
E. Redish, J. Saul, and R. Steinberg, "Student
expectations in introductory physics," Am. J.
Phys. 66, 212-224 (1998).

16
Agree-Disagree Plots

Introduced by Redish et al. in their MPEX paper -
called Redish plots

New disagree percentage
New agree percentage
Change from pre to post must be > 2σ to be
considered significant (at the 5% probability level)
Standard deviation (σ)
E. Redish, J. Saul, and R. Steinberg, Am. J. Phys.
66, 212-224 (1998).
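
A rough sketch of the arithmetic behind an agree-disagree plot, not a reproduction of the MPEX paper's exact procedure. It assumes neutral responses are dropped, the agree and disagree counts are renormalized to fractions of what remains, and the standard deviation is the binomial one, sqrt(p*q/N), with N taken as the number of non-neutral responses. The function name and the counts are invented for illustration.

```python
from math import sqrt

def agree_disagree(agree, disagree):
    """Return (agree fraction, disagree fraction, sigma), ignoring neutrals.

    Assumption: sigma is the binomial standard deviation sqrt(p*q/N),
    with N taken as the number of non-neutral responses.
    """
    n = agree + disagree
    p = agree / n
    q = disagree / n
    return p, q, sqrt(p * q / n)

# Hypothetical counts of (agree, disagree) before and after instruction.
p_pre, q_pre, s_pre = agree_disagree(60, 40)
p_post, q_post, s_post = agree_disagree(45, 55)

# Compare the pre-to-post shift with twice the larger standard deviation.
sigma = max(s_pre, s_post)
shift = abs(p_post - p_pre)
significant = shift > 2 * sigma

print(f"agree fraction: {p_pre:.2f} -> {p_post:.2f}, sigma ~ {sigma:.3f}")
print("shift exceeds 2 sigma:", significant)
```

With these made-up counts the agree fraction moves by 0.15 while 2σ is about 0.10, so the shift would count as significant under the 2σ rule.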
17
Example of an Agree-Disagree Plot
Duda, G., Garrett, K., Am. J. Phys. 76, 1054
(2008).
18
Duda, G., Garrett, K., Am. J. Phys. 76, 1054
(2008).
19
Analyzing Interval Data

Basic idea here is to assign a numerical value to
each response
Strongly Disagree: -2 (or 0)
Weakly Disagree: -1 (or 1)
Neither Agree nor Disagree: 0 (or 2)
Weakly Agree: 1 (or 3)
Strongly Agree: 2 (or 4)
Sum the responses then analyze using standard
statistical techniques
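
The coding step can be sketched in a few lines of Python. The -2 to +2 values come from the slide; the response labels, the function name, and the three sample students are assumptions made up for this example.

```python
from statistics import mean, stdev

# Coding from the slide: -2 (strongly disagree) up to +2 (strongly agree).
CODES = {
    "strongly disagree": -2,
    "weakly disagree": -1,
    "neutral": 0,
    "weakly agree": 1,
    "strongly agree": 2,
}

def total_score(responses):
    """Sum one student's coded responses across all items."""
    return sum(CODES[r] for r in responses)

# Hypothetical data: three students answering three items each.
students = [
    ["strongly agree", "weakly agree", "neutral"],
    ["weakly disagree", "neutral", "weakly agree"],
    ["strongly disagree", "weakly disagree", "weakly disagree"],
]

totals = [total_score(s) for s in students]
print(totals)  # [3, 0, -4]
print("mean:", mean(totals), "standard deviation:", stdev(totals))
```

One caveat: negatively worded items (such as "Physics is irrelevant to my life") would normally have their sign flipped before summing, so that a higher total always means a more favorable attitude.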

20
Simple (Student's) t-test

The t-test is a simple (but robust) statistical
test
Tests a hypothesis: is there a difference between
two sets of data?
Are differences statistically significant?
95% confidence level, i.e. only a 5% probability
that a difference this large would arise from
statistical fluctuations alone

21
Example: The Gender Gap in Intro Physics
Is there a difference between male and female
students?
22
Which image is random?
Sometimes our eyes can deceive us! And sometimes
we think things are true because we'd like them
to be true
23
The Gender Gap: FMCE Gains
In the experimental group, there is no
statistically significant difference between the
two genders.
24
Student's t-test

Assumptions
Each data set follows a normal distribution
Parameters
One-tailed vs. two-tailed
Types: paired, two-sample equal variance, and
two-sample unequal variance tests
Can have different numbers of data points if
conducting an unpaired test
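
To make the two-sample unequal variance case concrete, here is a minimal sketch using only the Python standard library. It computes the t statistic and the (Welch) degrees of freedom; it does not compute the p-value, which you would get from a t table, from Excel's T.TEST, or from scipy.stats.ttest_ind(a, b, equal_var=False) if SciPy is available. The function name and the two groups are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for a two-sample unequal-variance (Welch) t-test."""
    na, nb = len(a), len(b)
    va = variance(a) / na   # squared standard error of group a's mean
    vb = variance(b) / nb   # squared standard error of group b's mean
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical scores; the groups need not be the same size.
group_a = [12, 15, 14, 10, 13, 16, 11]
group_b = [18, 17, 20, 16, 19, 21]

t, df = welch_t(group_a, group_b)
print(f"t = {t:.2f}, df = {df:.1f}")
```

For these invented scores |t| is close to 4.9 with about 11 degrees of freedom. The two-tailed 5% critical value for 11 degrees of freedom is roughly 2.2, so the difference would be judged significant.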

25
Demo
26
Two Sample t-test
Here p < 0.05, so the null hypothesis is
rejected: there is a statistically significant
difference between Group A and Group B
27
Measuring Effects of Instruction

Suppose you apply some educational innovation
Control group and experimental group
Or pre-test and post-test
How do you know if it's effective?
Say you give some sort of standard assessment
How big do the changes need to be to be
statistically significant?

28
Method 1: Use a t-test

You can always use a t-test
Compare scores of control vs. experimental group
or
Compare pre vs. post tests
More difficult due to other variables
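
For the pre vs. post comparison, where the same students take both tests, the paired form of the t-test applies. The sketch below is an illustration with invented scores and a hypothetical function name; it returns the t statistic and degrees of freedom, leaving the p-value to a t table, Excel's T.TEST, or scipy.stats.ttest_rel.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Return (t, df) for a paired t-test on matched pre/post scores."""
    # Work with each student's individual change, not the group means.
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical scores for the same six students, in the same order.
pre = [10, 12, 9, 14, 11, 13]
post = [13, 14, 10, 17, 12, 16]

t, df = paired_t(pre, post)
print(f"t = {t:.2f}, df = {df}")
```

Pairing removes student-to-student variation from the comparison, but it cannot rule out the other variables mentioned above, such as anything else that changed between the two test dates.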

29
Method 2: Effect Size

Effect Size (ES) is a method to quantify how
effective an educational intervention has been
relative to a control group
Extremely useful when there is no familiar scale
to judge outcomes

30
A thought experiment

Suppose we do a study to see if children learn
better in the morning or afternoon
Morning trial: 15.2 average on assessment
Afternoon trial: 17.9 average on assessment
Is this a big difference? It depends on overlap!

Robert Coe, "What is an Effect Size? A guide for
users"
31
Two distributions
If the distributions of scores looked like this,
you would think the result is quite significant
Robert Coe, "What is an Effect Size? A guide for
users"
32
Two distributions
But if the distributions of scores looked like
this you wouldn't be so impressed
Robert Coe, "What is an Effect Size? A guide for
users"
33
Effect Size Continued

The Effect Size
Compares the difference between groups in light
of the variance of scores within each group
ES = [(mean of experimental group) - (mean of
control group)] / (standard deviation)
Actually quite simple to calculate
Robert Coe has great information online about ES
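
As an illustration of how simple the calculation is, here is a short sketch. It assumes the standard deviation in the denominator is the pooled standard deviation of the two groups, which is one common choice; using the control group's standard deviation alone is another. The function name and scores are made up.

```python
from math import sqrt
from statistics import mean, variance

def effect_size(experimental, control):
    """Difference in means divided by the pooled standard deviation.

    Assumption: pooled SD is used; other choices of SD are possible.
    """
    n_e, n_c = len(experimental), len(control)
    pooled_var = ((n_e - 1) * variance(experimental) +
                  (n_c - 1) * variance(control)) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / sqrt(pooled_var)

# Hypothetical assessment scores for two small groups.
control = [14, 16, 15, 13, 17, 15]
experimental = [17, 19, 18, 16, 20, 18]

es = effect_size(experimental, control)
print(f"effect size = {es:.2f}")
```

Here the means differ by 3 points and the pooled standard deviation is about 1.4, so the effect size is about 2.1: the two distributions barely overlap. The same 3-point gap with a standard deviation of 6 would give an effect size of only 0.5.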

34
How to Interpret Effect Size
Robert Coe, "What is an Effect Size? A guide for
users"
35
How to Interpret Effect Size
Robert Coe, "What is an Effect Size? A guide for
users"
The IQ difference between typical freshmen and
Ph.D.s corresponds to an effect size of 0.8
36
Effect Size Example
Duda, G., Garrett, K., Am. J. Phys. 76, 1054
(2008).
37
Making a better survey

In my experience surveys and assessment
instruments are difficult to write
How do you know your instrument is:
Reliable?
Valid?
Are there alternatives to writing your own
instruments?

38
Reliability: Cronbach Alpha

Cronbach Alpha: a measure of how closely items in a
group are related
Cronbach Alpha is often used for instruments
which are not marked right or wrong
Think Likert scale
Measures whether students' responses are the same
for similar types of questions

39
How to Calculate Cronbach Alpha

You could calculate it by hand
or you could buy SPSS and figure out how to use it
or you could download an Excel spreadsheet which
is programmed to do this:
http://www.gifted.uconn.edu/siegle/research/Instrument%20Reliability%20and%20validity/reliabilitycalculator2.xls
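
For anyone curious what the "by hand" calculation involves, here is a minimal sketch using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the five students' scores are invented, and this is not a description of what the linked spreadsheet does internally.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per student."""
    k = len(rows[0])                 # number of items in the cluster
    items = list(zip(*rows))         # regroup the scores item by item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical data: five students, three related items on a 1-4 scale.
scores = [
    [4, 4, 3],
    [2, 3, 2],
    [3, 3, 3],
    [1, 2, 1],
    [4, 3, 4],
]

alpha = cronbach_alpha(scores)
print(f"alpha = {alpha:.2f}")
```

For this toy data set alpha comes out near 0.90. Negatively worded items should be reverse-coded before the calculation, otherwise they pull alpha down.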

40
Cronbach Alpha Values

Typically a Cronbach Alpha (α) > 0.8 is
considered good
At this level survey is reliable
However, there are exceptions
Different types of surveys/instruments may have
different natural levels of reliability
Experimental instruments may still be useful even
if α is around 0.6

41
Warning! Common Mistakes with Cronbach Alpha

Paper: "Calculating, Interpreting, and Reporting
Cronbach's Alpha Reliability Coefficient for
Likert-Type Scales" by Joseph A. Gliem and
Rosemary R. Gliem
Lesson
Use Cronbach Alpha for Likert scale surveys
Draw conclusions based on clusters of items
Single item reliability is generally very low

42
Instrument Validity

Validity is never universal
Valid for a certain population and for a
specific purpose
Three general categories of validity
Content validity
Predictive validity
Concurrent validity

43
Ideas for Establishing Validity

Establish content or face validity
Correlate with other independent measures such as
exam scores, course grades, other assessment
instruments
Predictive validity
Longitudinal studies and student tracking are
needed here
Concurrent validity
Compare with other assessment instruments or
calibrate with the proper groups
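
Correlating survey results with an independent measure can be sketched as below, using the Pearson correlation coefficient. The survey totals, exam scores, and function name are hypothetical; a real study would use far more than six students.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: each student's survey total and exam score.
survey = [3, -1, 4, 0, 2, -2]
exam = [85, 70, 90, 72, 80, 65]

r = pearson_r(survey, exam)
print(f"r = {r:.2f}")
```

A value of r near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or none. A strong correlation with exam scores is evidence toward validity, not proof of it.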

44
Survey/Assessment Creation Tips

Build in measures to show reliability
e.g. multiple questions within a survey on the
same topic (both positive and negative)
Questions that establish that students are taking
the survey seriously
For content driven assessments, research student
difficulties
Beta version: open-ended questions
Correlations can help show validity

45
An Example of evidence for Validity
Duda, G., Garrett, K., Am. J. Phys. 76, 1054
(2008).
46
Buros Institute of Mental Measurement

By providing professional assistance, expertise,
and information to users of commercially
published tests, the Institute promotes
meaningful and appropriate test selection,
utilization, and practice.

http://www.unl.edu/buros/bimm
47
Conclusion

Some simple statistical tests can provide
rigorous evidence of
Student learning
Instructional effectiveness
Improvements in attitude
All of these methods are extremely effective when
coupled with qualitative methods
Stats involved can be done with little or no
training

48
My SoTL advice

Plan a throw-away semester in any SoTL study: a
trial period to tinker with your study design
Flexibility to alter your study design when you
find it doesn't work
Involving students in SoTL work can be very
effective
Try to publish in discipline-specific journals
When in doubt, ask your students!

49
Good References

Analysis of Likert Scales (and attitudinal data
in general): CLASS survey
http://www.colorado.edu/sei/class/
Effect Size
"What is an Effect Size? A guide for users" by
Robert Coe (easily found via Google)
Coe also has an Excel spreadsheet online to
calculate effect size

50
Good references

Reliability and Validity
http://www.gifted.uconn.edu/siegle/research/Instrument%20Reliability%20and%20Validity/Reliability.htm
http://www.gifted.uconn.edu/siegle/research/Instrument%20Reliability%20and%20Validity/validity.htm
T-test
Step-by-step video on Excel:
http://www.youtube.com/watch?v=JlfLnx8sh-o

51
Good References

The FLAG: Field-Tested Learning Assessment Guide
www.flaguide.org
Contains broadly applicable, self-contained
modular classroom assessment techniques (CATs) and
discipline-specific tools for STEM instructors

52
Good References
John Creswell's books (and courses) have been
highly recommended to me
