Quantitative Analysis of Survey Data and Other Assessments for Non-Experts

Slides: 53
Provided by: creighton
Learn more at: http://www.creighton.edu
1
Quantitative Analysis of Survey Data and Other
Assessments for Non-Experts
  • How to do SoTL without a statistician on retainer



Gintaras Duda, Creighton University
June 2, 2011
2
My Background
  • I am a theoretical particle physicist
  • Came to SoTL (PER) as a junior faculty member
  • New faculty workshop experience
  • Huge roadblocks: no experience with
  • How educational research is conducted
  • Quantitative or qualitative analysis
  • Weak background in statistics

3
Areas of SoTL interest
  • Attitude of introductory physics students
  • Particularly how it affects learning
  • Online discussion behavior
  • Realism in physics
  • Problem-Based Learning in upper division courses
  • Student note taking
  • How students use the internet to learn physics

4
Workshop Purpose
  • As SoTL matures, publication requires more and
    more rigorous measures and evidence
  • Sophisticated statistical tests
  • Careful survey design and analysis
  • Mixed method research
  • Evidence, evidence, evidence!
  • But, many of us are untrained in these things

5
Who are you?
  • Please share with the group
  • Name, institution, and discipline
  • Why you picked this workshop
  • What you hope to gain

6
Workshop Purpose continued
  • Leave you with some simple tools to analyze:
  • Likert scale surveys
  • Effects of instruction
  • Survey reliability and validity
  • No stats class or methods courses required

7
Part I. What to do with Likert Scales
  • Likert scale instruments seem ubiquitous in SoTL
    work
  • Particularly useful in measuring students'
    attitudes, feelings, opinions, dispositions, etc.
  • Can use pre-post scheme to see changes and
    growth/deterioration
  • Of interest in Jesuit Pedagogy (another workshop)

8
Example from physics
  • Attitudinal surveys
  • Measure students' changes in attitude towards
    physics due to instruction
  • Instruments: VASS, MPEX, C-LASS, Attitude II, and
    others
  • These instruments all show a similar trend
  • Students have more negative attitudes towards
    physics after instruction

9
Example Questions from Attitude II Instrument
  • Physics is irrelevant to my life
  • I can use physics in my everyday life
  • I will (did) find it difficult to understand how
    physics applies in the real world
  • I see and understand physics in technology and
    the world around me
  • 5-point Likert scale: strongly agree, weakly
    agree, neutral, weakly disagree, strongly disagree

10
One of my Likert Scale Instruments
11
What do I do with Likert Scale Data?
  • Two camps on analyzing Likert scale data
  • Interval Approach
  • Ordinal Approach
  • Methods for data analysis differ between the two
    approaches

12
Interval Data
  • Basic philosophy: differences between responses
    are all equal
  • i.e. the difference to a student between strongly
    disagree and weakly disagree is the same as the
    difference between a neutral response and weakly
    agree
  • Basic technique: sum the data and do some
    statistics

13
Ordinal Data
  • Basic philosophy: differences between responses
    are not equal
  • i.e. students tend not to distinguish strongly
    between "strongly" and "weakly" statements
  • 3-point Likert scale more appropriate?
  • Basic technique: examine statistically the number
    of students who agreed or disagreed

14
Controversy over neutral response
  • Good debate in the literature about the
    neutral/neither agree nor disagree response
  • Some claim it's crucial
  • Some claim you should get rid of it
  • Not going to discuss it here

15
Analyzing Ordinal Data
  • One method is to reduce the problem to a
    binomial analysis
  • Lump all disagrees together, all agrees together,
    and don't worry about neutral responses
  • Visual method: agree-disagree (Redish) plots
  • E. Redish, J. Saul, and R. Steinberg, "Student
    expectations in introductory physics," Am. J.
    Phys. 66, 212-224 (1998).

16
Agree-Disagree Plots
  • Introduced by Redish et al. in their MPEX paper;
    called Redish plots

[Plot labels: new disagree percentage, new agree percentage]
Change from pre to post must be > 2σ to be
considered significant (at the 5% probability level),
where σ is the standard deviation
E. Redish, J. Saul, and R. Steinberg, Am. J. Phys.
66, 212-224 (1998).
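
The two-standard-deviation criterion on this slide can be checked with a few lines of code. The sketch below is only an illustration: it assumes the usual binomial estimate of the standard deviation, sqrt(p(1 - p)/N), drops neutral responses before computing the agree fraction, and takes the standard deviation from the pre-instruction sample. The function names and the counts are made up for the example.

```python
import math

def agree_fraction(agree, disagree):
    """Fraction of non-neutral responses that agree (neutrals are dropped)."""
    return agree / (agree + disagree)

def binomial_sigma(p, n):
    """Binomial standard deviation of a fraction p measured on n responses."""
    return math.sqrt(p * (1.0 - p) / n)

def shift_is_significant(pre, post):
    """pre and post are (agree_count, disagree_count) tuples.

    Returns (shift, sigma, significant) using the 2-sigma rule.
    Assumption: sigma is estimated from the pre-instruction sample.
    """
    p_pre = agree_fraction(*pre)
    p_post = agree_fraction(*post)
    sigma = binomial_sigma(p_pre, sum(pre))
    shift = p_post - p_pre
    return shift, sigma, abs(shift) > 2.0 * sigma

# Made-up counts: 60 agree / 40 disagree before, 45 agree / 55 disagree after.
shift, sigma, significant = shift_is_significant((60, 40), (45, 55))
print(f"shift = {shift:+.3f}, sigma = {sigma:.3f}, significant = {significant}")
```

With these invented counts the agree fraction drops by 0.15 while 2σ is roughly 0.10, so the change would count as significant under this rule.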
17
Example of an Agree-Disagree Plot
Duda, G., Garrett, K., Am. J. Phys. 76, 1054
(2008).
18
Duda, G., Garrett, K., Am. J. Phys. 76, 1054
(2008).
19
Analyzing Interval Data
  • Basic idea here is to assign a numerical value to
    each response
  • Strongly disagree: -2 (or 0)
  • Weakly disagree: -1 (or 1)
  • Neither agree nor disagree: 0 (or 2)
  • Weakly agree: 1 (or 3)
  • Strongly agree: 2 (or 4)
  • Sum the responses, then analyze using standard
    statistical techniques
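
As a rough illustration of the interval approach, the sketch below maps each response to the -2 to +2 values listed above and sums them per student. The sign flip for negatively worded items (such as "Physics is irrelevant to my life") is an assumption added here, not something the slides prescribe, and all names and responses are hypothetical.

```python
# Numerical values for the 5-point scale, using the -2..+2 convention.
SCORES = {
    "strongly disagree": -2,
    "weakly disagree": -1,
    "neutral": 0,
    "weakly agree": 1,
    "strongly agree": 2,
}

def score_student(responses, reversed_items=()):
    """Sum one student's responses.

    reversed_items holds the indices of negatively worded items whose
    sign is flipped so that a higher total always means a more
    favorable attitude (an assumption of this sketch).
    """
    total = 0
    for index, response in enumerate(responses):
        value = SCORES[response.lower()]
        if index in reversed_items:
            value = -value
        total += value
    return total

# Made-up responses; item 0 is treated as negatively worded.
responses = ["strongly disagree", "weakly agree", "neutral", "strongly agree"]
total = score_student(responses, reversed_items={0})
print(total)
```

The per-student totals can then be fed into a t-test or an effect size calculation.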

20
Simple (Student's) t-test
  • The t-test is a simple (but robust) statistical
    test
  • Tests a hypothesis: is there a difference between
    two sets of data?
  • Are differences statistically significant?
  • 95% confidence level, i.e. only a 5% probability
    that a difference this large would arise from
    statistical fluctuations alone

21
Example: The Gender Gap in Intro Physics
Is there a difference between male and female
students?
22
Which image is random?
Sometimes our eyes can deceive us! And sometimes
we think things are true because we'd like them
to be true
23
The Gender Gap: FMCE Gains
In the experimental group, there is no
statistically significant difference between the
two genders.
24
Student's t-test
  • Assumptions
  • Each data set follows a normal distribution
  • Parameters
  • One-tailed vs. two-tailed
  • Types: paired, two-sample equal variance, and
    two-sample unequal variance test
  • Can have different numbers of data points if
    conducting an unpaired test
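
For readers who prefer code to a spreadsheet, here is a minimal sketch of the two-sample unequal variance (Welch) t-test with a two-tailed p-value, using only the Python standard library. The p-value comes from a simple numerical integration of the t distribution, which is one reasonable way to do it rather than the method used in the workshop demo. The function names and scores are invented for illustration.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom (unequal variances)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # probability that |T| < |t|
    return 1 - central

# Made-up scores; the two groups may have different numbers of students.
group_a = [72, 85, 78, 90, 66, 81, 77, 88]
group_b = [65, 70, 62, 74, 68, 59, 71]
t, df = welch_t(group_a, group_b)
p = two_tailed_p(t, df)
print(f"t = {t:.2f}, df = {df:.1f}, p = {p:.4f}")
```

If p comes out below 0.05, the difference between the groups would be called statistically significant at the 95% confidence level.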

25
Demo
26
Two Sample t-test
Here p < 0.05, so the null hypothesis is
rejected: there is a statistically significant
difference between Group A and Group B
27
Measuring Effects of Instruction
  • Suppose you apply some educational innovation
  • Control group and experimental group
  • Or pre-test and post-test
  • How do you know if it's effective?
  • Say you give some sort of standard assessment
  • How big do the changes need to be to be
    statistically significant?

28
Method 1: Use a t-test
  • You can always use a t-test
  • Compare scores of control vs. experimental group
  • or
  • Compare pre vs. post tests
  • More difficult due to other variables

29
Method 2: Effect Size
  • Effect Size (ES) is a method to quantify how
    effective an educational intervention has been
    relative to a control group
  • Extremely useful when there is no familiar scale
    to judge outcomes

30
A thought experiment
  • Suppose we do a study to see if children learn
    better in the morning or afternoon
  • Morning trial: 15.2 average on assessment
  • Afternoon trial: 17.9 average on assessment
  • Is this a big difference? It depends on overlap!

Robert Coe, "What is an Effect Size: A guide for
users"
31
Two distributions
If the distributions of scores looked like this,
you would think the result is quite significant
Robert Coe, "What is an Effect Size: A guide for
users"
32
Two distributions
But if the distributions of scores looked like
this, you wouldn't be so impressed
Robert Coe, "What is an Effect Size: A guide for
users"
33
Effect Size Continued
  • The Effect Size
  • Compares the difference between groups in light
    of the variance of scores within each group
  • ES = [(mean of experimental group) - (mean of
    control group)] / (standard deviation)
  • Actually quite simple to calculate
  • Robert Coe has great information online about ES
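
The effect size formula on this slide takes only a few lines to compute. The slide does not say which standard deviation goes in the denominator, so the sketch below assumes the pooled standard deviation of the two groups, which is one common choice. The function name and scores are hypothetical.

```python
import math
from statistics import mean, variance

def effect_size(experimental, control):
    """(mean of experimental - mean of control) / pooled standard deviation.

    Assumption: the pooled SD is used; the control group's SD alone is
    another option.
    """
    n_e, n_c = len(experimental), len(control)
    pooled_var = (
        (n_e - 1) * variance(experimental) + (n_c - 1) * variance(control)
    ) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / math.sqrt(pooled_var)

# Made-up assessment scores.
experimental = [18, 20, 17, 19, 16]
control = [15, 17, 14, 16, 13]
es = effect_size(experimental, control)
print(f"ES = {es:.2f}")
```

Because the result is expressed in units of standard deviations, it can be compared across assessments that use different scoring scales.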

34
How to Interpret Effect Size
Robert Coe, "What is an Effect Size: A guide for
users"
35
How to Interpret Effect Size
Robert Coe, "What is an Effect Size: A guide for
users"
The IQ difference between typical freshmen and
Ph.D.s corresponds to an effect size of 0.8
36
Effect Size Example
Duda, G., Garrett, K., Am. J. Phys. 76, 1054
(2008).
37
Making a better survey
  • In my experience surveys and assessment
    instruments are difficult to write
  • How do you know your instrument is:
  • Reliable?
  • Valid?
  • Are there alternatives to writing your own
    instruments?

38
Reliability: Cronbach Alpha
  • Cronbach Alpha: a measure of how closely the
    items in a group are related
  • Cronbach Alpha is often used for instruments
    which are not marked right or wrong
  • Think Likert scale
  • Measures whether students' responses are the same
    for similar types of questions

39
How to Cronbach Alpha
  • You could calculate it by hand
  • or you buy SPSS and figure out how to use it
  • or you could download an Excel spreadsheet which
    is programmed to do this:
    http://www.gifted.uconn.edu/siegle/research/Instrument%20Reliability%20and%20validity/reliabilitycalculator2.xls
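
A fourth option is a short script. The sketch below uses the standard formula α = k/(k - 1) × (1 - Σ item variances / variance of total scores), where k is the number of items. It assumes every student answered every item and that negatively worded items have already been reverse-scored. The data are invented.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a table of scores.

    rows: one list per student, one numeric score per item
    (at least two items and two students).
    """
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Made-up responses: five students, three items scored 0 to 4.
responses = [
    [4, 4, 3],
    [2, 2, 1],
    [3, 3, 3],
    [0, 1, 0],
    [4, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

In this made-up data set the three items rise and fall together from student to student, so alpha comes out high (about 0.96).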

40
Cronbach Alpha Values
  • Typically a Cronbach Alpha (α) > 0.8 is
    considered good
  • At this level the survey is reliable
  • However, there are exceptions
  • Different types of surveys/instruments may have
    different natural levels of reliability
  • Experimental instruments may still be useful even
    if α is only about 0.6

41
Warning! Common Mistakes with Cronbach Alpha
  • Paper: "Calculating, Interpreting, and Reporting
    Cronbach's Alpha Reliability Coefficient for
    Likert-Type Scales" by Joseph A. Gliem and
    Rosemary R. Gliem
  • Lesson:
  • Use Cronbach Alpha for Likert scale surveys
  • Draw conclusions based on clusters of items
  • Single item reliability is generally very low

42
Instrument Validity
  • Validity is never universal
  • Valid for a certain population and for a
    specific purpose
  • Three general categories of validity
  • Content validity
  • Predictive validity
  • Concurrent validity

43
Ideas for Establishing Validity
  • Establish content or face validity
  • Correlate with other independent measures such as
    exam scores, course grades, other assessment
    instruments
  • Predictive validity
  • Longitudinal studies and student tracking are
    needed here
  • Concurrent validity
  • Compare with other assessment instruments or
    calibrate with the proper groups

44
Survey/Assessment Creation Tips
  • Build in measures to show reliability
  • e.g. multiple questions within a survey on the
    same topic (both positive and negative)
  • Questions that establish that students are taking
    the survey seriously
  • For content driven assessments, research student
    difficulties
  • Beta-version open ended questions
  • Correlations can help show validity
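
One simple way to look for such a correlation is the Pearson correlation coefficient between survey totals and an independent measure such as exam scores. The sketch below is a hand-rolled version using only the standard library. The variable names and data are hypothetical, and it assumes neither list is constant.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    covariance = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mx) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - my) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

# Made-up data: survey totals and exam scores for six students.
survey_totals = [5, -2, 3, 0, 6, -4]
exam_scores = [88, 61, 79, 70, 93, 58]
r = pearson_r(survey_totals, exam_scores)
print(f"r = {r:.2f}")
```

A value of r near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or none.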

45
An Example of evidence for Validity
Duda, G., Garrett, K., Am. J. Phys. 76, 1054
(2008).
46
Buros Institute of Mental Measurement
  • By providing professional assistance, expertise,
    and information to users of commercially
    published tests, the Institute promotes
    meaningful and appropriate test selection,
    utilization, and practice.

http://www.unl.edu/buros/bimm
47
Conclusion
  • Some simple statistical tests can provide
    rigorous evidence of:
  • Student learning
  • Instructional effectiveness
  • Improvements in attitude
  • All of these methods are extremely effective when
    coupled with qualitative methods
  • Stats involved can be done with little or no
    training

48
My SoTL advice
  • Plan a throw-away semester in any SoTL study
  • Trial period to tinker with your study design
  • Flexibility to alter your study design when you
    find it doesn't work
  • Involving students in SoTL work can be very
    effective
  • Try to publish in discipline-specific journals
  • When in doubt, ask your students!

49
Good References
  • Analysis of Likert scales (and attitudinal data
    in general): CLASS survey
  • http://www.colorado.edu/sei/class/
  • Effect Size
  • "What is an Effect Size: A guide for users" by
    Robert Coe (easily found via Google)
  • Coe also has an Excel spreadsheet online to
    calculate effect size

50
Good references
  • Reliability and Validity
  • http://www.gifted.uconn.edu/siegle/research/Instrument%20Reliability%20and%20Validity/Reliability.htm
  • http://www.gifted.uconn.edu/siegle/research/Instrument%20Reliability%20and%20Validity/validity.htm
  • T-test
  • Step-by-step video on Excel: http://www.youtube.com/watch?v=JlfLnx8sh-o

51
Good References
  • The FLAG: Field-Tested Learning Assessment Guide
  • www.flaguide.org
  • Contains broadly applicable, self-contained
    modular classroom assessment techniques (CATs) and
    discipline-specific tools for STEM instructors

52
Good References
John Creswell's books (and courses) have been
highly recommended to me