Introduction to Psychometrics

Slides: 19

Provided by: Gar68

Learn more at: https://psych.unl.edu

Transcript and Presenter's Notes

1
Introduction to Psychometrics

Psychometrics & Measurement Validity
Constructs & Measurement
Kinds of Items
Properties of a good measure
  Standardization
  Reliability
  Validity
Standardization & Inter-rater Reliability

Psychometrics
(Psychological Measurement)
The process of assigning a value to represent the
amount or kind of a specific attribute of an
individual.
Individuals can be participants, collectives,
stimuli, or processes
We do not measure individuals
We measure specific attributes of an individual

E.g., each participant in the Heptagonal
Condition was presented with a 2-inch-wide
polygon to view for 10 seconds. Then this polygon
and four similar ones were presented, and the
participant's reaction time to identify the
polygon presented previously was recorded.
We will focus on measuring attributes of persons
in this introduction!
3

Psychometrics is the centerpiece of scientific
empirical psychological research practice.
All psychological data result from some form of
measurement
Behaviors are collected by observation,
self-report or behavioral traces.
Measurement is the process of turning those
behaviors into data for analysis
For those data to be useful we need Measurement
Validity
The better the measurement, the better the data,
the more accurate and the more useful are the
conclusions of the data analysis for the
intended psychological research or application

Without Measurement Validity, there can't be
Internal Validity, External Validity, or
Statistical Conclusion Validity!
4
Most of what we try to measure in Psychology are
constructs. They're called this because most of
what we care about as psychologists are not
physical measurements, such as height, weight,
pressure & velocity. Instead, the stuff of
psychology (learning, motivation, anxiety,
social skills, depression, wellness, etc.)
consists of things that "don't really exist."
Rather, they are attributes and characteristics
that we've constructed to give organization and
structure to behavior. Essentially all of the
things we psychologists research, both as causes
and effects, are Attributive Hypotheses with
different levels of support and acceptance!
5
Measurement of constructs is more difficult than
measurement of physical properties! We can't
just walk up to someone with a scale, ruler,
graduated cylinder or velocimeter and measure how
depressed they are. We have to figure out some
way to turn observations of their behavior,
self-reports or traces of their behavior into
variables that give values for the constructs we
want to measure. So, measurement is, just like
the rest of what we've learned about so far in
this course, all about representation!
Measurement Validity is the extent to which the
data (variable values) we have represent the
behaviors (constructs) we want to study.
6

What are the different types of constructs we
measure from persons?
The most commonly discussed types are ...
Demographics - population/subpopulation identifiers
  e.g., age, gender, race/ethnicity, history variables
Ability/Skill - performance broadly defined
  e.g., scholastic skills, job-related skills, research DVs, etc.
Attitude/Opinion - how things are or should be
  e.g., polls, product evaluations, etc.
Personality - characterological & contextual attributes of an individual
  e.g., anxiety, psychoses, assertiveness, extroversion, etc.

However, it is difficult to categorize many of
the things we Psychologists measure ...
Diagnostic Category
  achievement - limits of what can be learned/expressed
  and/or personality - private & social expressions
  and/or attitude/opinion - beliefs & feelings
Social Skills
  achievement - something that has been learned?
  and/or personality - how we get along socially is part of who we are?
Intelligence
  innate (biological) preparedness for learning
  and/or achievement - earlier learning leads to more intelligence
Aptitude
  achievement - know things necessary to learn other things
  and/or specific capacity - the ability to learn certain skills

Each separate thing we measure is called an "item"
  e.g., a question, a problem, a page, a trial, etc.
Collections of items are called many things
  e.g., survey, questionnaire, instrument, measure, test, or scale
Three kinds of item collections you should know ...
  Scale (Test) - all items are put together to get a single score
  Subscale (Subtest) - item sets are put together to get multiple separate scores
  Survey - each item gives a specific piece of information
Most questionnaires, surveys or interviews
are a combination of all three.
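
The three kinds of item collections can be sketched in a few lines of Python. This is only an illustration: the item names, the ratings, and the two subscale groupings ("worry" and "mood") are made up and do not come from any real instrument.

```python
# Hypothetical responses of one person to a six-item questionnaire (1-5 ratings).
responses = {"q1": 4, "q2": 2, "q3": 5, "q4": 3, "q5": 1, "q6": 4}

# Scale (test): all items are put together to get a single score.
scale_score = sum(responses.values())

# Subscales (subtests): item sets are put together to get separate scores.
# These groupings are invented for the example.
subscales = {"worry": ["q1", "q2", "q3"], "mood": ["q4", "q5", "q6"]}
subscale_scores = {
    name: sum(responses[item] for item in items)
    for name, items in subscales.items()
}

# Survey: each item is kept as its own piece of information.
survey_report = dict(responses)

print(scale_score)       # single score across all six items
print(subscale_scores)   # one score per item set
print(survey_report)     # item-by-item values
```

Summing is only one way to combine items; averaging is also common and would work the same way here.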

9
There are scads of ways of classifying or
categorizing items; here are three ways that I
want you to be familiar with.

Kinds of items #1 - objective items vs. subjective items
  "objective" does not mean "true," "real," or "accurate"
  "subjective" does not mean "made up" or "inaccurate"
Defined by how the observer/interviewer/coder
transforms participants' responses into data

Objective Items - no evaluation or decision is needed
  either response = data, or a mathematical transformation
  e.g., multiple choice, T/F, matching, fill-in-the-blanks (strict)
Subjective Items - response must be evaluated and a decision
or judgment made about what the data value should be
  content coding, diagnostic systems, behavioral taxonomies
  e.g., essays, interview answers, drawings, facial expressions
10
A bit more about objective vs. subjective

Seems simple ...
  the objective measure IS the behavior of interest
  e.g., impolite statements, GPA, hourly sales, publications
  problems? Objective doesn't mean representative

Seems harder ...
  subjective rating of behavior IS the behavior of interest
  e.g., friend's eval, advisor's eval, manager's eval, Chair's eval
  problems? Good subjective measures are hard work, but ...

Hardest & most common ...
  construct of interest isn't a specific behavior
  e.g., social skills, preparation for the professoriate,
  sales skill, contribution to the department
  problems? What is the construct & how do we represent it?

Kind #2 - Judgments, Sentiments & Scored Sentiments

Judgments - do have a correct answer (e.g., 2 + 2 = 4)
  the behavior, response or trace must be scored
  (compared to the correct answer) to produce the variable/data
  scoring may be objective or subjective, depending on item

Scored Sentiments - do not have a correct answer
but do have an "indicative answer" (e.g., Do you
prefer to be alone?)
  the behavior, response or trace must be scored
  (compared to the indicative answer) to produce the variable/data
  scoring may be objective or subjective, depending on item

Sentiments - do not have a correct answer (e.g.,
Like Psyc350?) or have a correct answer, but we
won't check (e.g., age)
  the behavior, response or trace is the variable/data
  scoring may be objective or subjective, depending on item
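
A minimal sketch of how the three kinds of items turn a response into a data value. The function name `to_data`, the kind labels, and the example responses are all hypothetical, and the 0/1 scoring rule is just one simple objective scoring choice.

```python
def to_data(kind, response, key=None):
    """Turn one response into a data value, depending on the kind of item.

    kind: "judgment", "scored_sentiment", or "sentiment" (labels made up here).
    key:  the correct answer (judgment) or the indicative answer
          (scored sentiment); unused for plain sentiments.
    """
    if kind in ("judgment", "scored_sentiment"):
        # The response must be scored by comparing it to the key.
        return 1 if response == key else 0
    if kind == "sentiment":
        # The response itself is the data; nothing is checked.
        return response
    raise ValueError(f"unknown item kind: {kind}")


# Judgment: there is a correct answer.
math_item = to_data("judgment", response=4, key=4)

# Scored sentiment: no correct answer, but "yes" is taken as indicative.
alone_item = to_data("scored_sentiment", response="no", key="yes")

# Sentiment: the reported age is recorded as given.
age_item = to_data("sentiment", response=22)

print(math_item, alone_item, age_item)
```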

Using Judgments, Sentiments & Scored Sentiments
Judgments - do have a correct answer
  Ability/skill
  Intelligence
  Diagnostic category
  Aptitude

Scored Sentiments - do not have a correct answer
but do have an indicative answer
  Personality
  Diagnostic category
  Aptitude

Sentiments - do not have a correct answer or
have a correct answer, but we won't check
  Demographics
  Attitude/Opinion

13
Kind #3 - Direct Keying vs. Reverse Keying
We want the respondents to carefully read and
respond to each item of our scale/test. One
thing we do is to write the items so that some of
them are "backwards" or "reversed." Consider
these items from a depression measure:
1. It is tough to get out of bed some mornings.   disagree 1 2 3 4 5 agree
2. I'm generally happy about my life.             1 2 3 4 5
3. I sometimes just want to sit and cry.          1 2 3 4 5
4. Most of the time I have a smile on my face.    1 2 3 4 5
If the person is depressed, we would expect
them to give a fairly high rating for questions 1
& 3, but a low rating on 2 & 4. Before
aggregating these items into a composite scale or
test score, we would direct key items 1 & 3
(1=1, 2=2, 3=3, 4=4, 5=5) and reverse key
items 2 & 4 (1=5, 2=4, 3=3, 4=2, 5=1).
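
Direct and reverse keying can be sketched as follows. The ratings are invented for one hypothetical respondent, and the helper name `reverse_key` is an assumption made for this example; the rule it applies (low + high - rating on a 1-5 scale) reproduces the 1=5, 2=4, 3=3, 4=2, 5=1 mapping.

```python
def reverse_key(rating, low=1, high=5):
    """Flip a rating on a low-to-high scale (1<->5, 2<->4, 3 stays 3)."""
    return low + high - rating


# Hypothetical ratings from one respondent on the four depression items.
ratings = {1: 5, 2: 1, 3: 4, 4: 2}

# Items 2 and 4 are worded in the "happy" direction, so they are reversed.
reverse_items = {2, 4}

keyed = {
    item: reverse_key(rating) if item in reverse_items else rating
    for item, rating in ratings.items()
}

# After keying, a higher value means "more depressed" on every item,
# so the items can be aggregated into a composite score.
composite = sum(keyed.values())

print(keyed)
print(composite)
```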
14
Desirable Properties of Psychological Measures
Interpretability of Individual and Group Scores
Population Norms
Validity
Reliability
Standardization
15
Desirable Properties of Psychological Measures
Interpretability of Individual & Group Scores
Population Norms - Scoring Distribution & Cutoffs
Validity - Face, Content, Criterion-Related, Construct
Reliability - Inter-rater, Internal Consistency,
Test-Retest & Alternate Forms
Standardization - Administration & Scoring
16

Standardization
Administration - test is given the same way every time
  who administers the instrument
  specific instructions, order of items, timing, etc.
  Varies greatly
    - multiple-choice classroom test: hand it out
    - MMPI: hand it out
    - WAIS: whole books & courses
Scoring - test is scored the same way every time
  who scores the instrument
  correct, partial and incorrect answers, points awarded, etc.
  Varies greatly
    - multiple-choice test: fill in the bubble sheet
    - MMPI: whole books & courses
    - WAIS: whole books & courses

We need to assess the inter-rater reliability of
the scores from subjective items.
Have two or more raters score the same set of
tests (usually 25-50 of the tests)
Assess the consistency of the scores in different
ways for different types of items
Quantitative Items
  correlation, intraclass correlation, RMSD
Ordered Categorical Items
  agreement, Cohen's Kappa
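
A minimal sketch of some of these consistency indices in plain Python, using made-up scores from two hypothetical raters. Two assumptions to note: the kappa shown is the unweighted version, which ignores the ordering of the categories (a weighted kappa would give partial credit for near-misses), and the intraclass correlation is not shown.

```python
import math
from collections import Counter


def pearson_r(x, y):
    """Pearson correlation between two raters' quantitative scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmsd(x, y):
    """Root mean squared difference between two raters' scores."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def proportion_agreement(x, y):
    """Proportion of cases that both raters put in the same category."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


def cohens_kappa(x, y):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(x)
    observed = proportion_agreement(x, y)
    counts_x, counts_y = Counter(x), Counter(y)
    expected = sum(counts_x[c] * counts_y[c] for c in counts_x) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical quantitative scores (e.g., essay points) from two raters.
quant_a = [10, 12, 15, 18]
quant_b = [11, 12, 14, 19]

# Hypothetical categorical ratings from the same two raters.
cat_a = ["low", "low", "med", "high", "high", "med"]
cat_b = ["low", "med", "med", "high", "high", "med"]

print(round(pearson_r(quant_a, quant_b), 3))
print(round(rmsd(quant_a, quant_b), 3))
print(round(proportion_agreement(cat_a, cat_b), 3))
print(round(cohens_kappa(cat_a, cat_b), 3))
```

Correlation only shows whether the raters order the tests the same way; RMSD also picks up a rater who is consistently higher or lower than the other.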

Keep in mind - what we really want is rater
validity
  we don't really want raters to agree, we want
  them to be right!
  so it is best to compare raters with a
  "standard" rather than just with each other
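
The difference between raters agreeing and raters being right can be sketched with a small made-up example. The "standard" scores, the two raters' scores, and the helper name `proportion_matching` are all hypothetical.

```python
def proportion_matching(scores, reference):
    """Proportion of cases where a set of scores matches a reference set."""
    return sum(s == r for s, r in zip(scores, reference)) / len(reference)


# Hypothetical 0/1 scores for six answers: the "standard" (e.g., an expert
# key) and two raters who happen to make the same two mistakes.
standard = [1, 0, 1, 1, 0, 1]
rater_a = [1, 1, 1, 0, 0, 1]
rater_b = [1, 1, 1, 0, 0, 1]

# The raters agree with each other on every answer ...
agreement_ab = proportion_matching(rater_a, rater_b)

# ... but each matches the standard on only four of the six answers.
accuracy_a = proportion_matching(rater_a, standard)
accuracy_b = proportion_matching(rater_b, standard)

print(agreement_ab, accuracy_a, accuracy_b)
```

Here perfect inter-rater agreement hides the fact that both raters are wrong on a third of the answers, which only shows up when each rater is compared with the standard.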