Title: Validity: Conceptual Issues
1. Validity: Conceptual Issues
2. Contrasting Reliability and Validity
- Both are fundamental to a sophisticated understanding of psychometrics
- Must have a clear understanding of the relationship between the two
3. Definitions (notice the differences)
- Reliability
  - Degree to which differences in test scores reflect differences among people in their levels of the trait that affects those scores, whatever that trait may be
  - Quantitative property of the test scores
- Validity
  - Tied to the interpretation of test scores
  - Tied to theory and the implications of scores
4. Link
- Validity requires reliability
  - Stable traits (intelligence: IQ)
    - Measured at two points in time, scores should be stable across time (test-retest reliability)
    - If not, the test cannot be a valid test of IQ
  - States (depression: BDI)
    - If internal consistency is poor, the test can't be valid
- Reliability does not imply validity
  - Stable trait (autism: AQ)
    - May have excellent test-retest reliability or good internal consistency, but the scores may still not be interpreted in a valid manner
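The two reliability checks named on this slide can be sketched in a few lines of Python. This is only an illustration and is not part of the original slides: all scores are made up, and the helper names `pearson_r` and `cronbach_alpha` are hypothetical. Test-retest reliability is estimated here as the Pearson correlation between two administrations, and internal consistency as Cronbach's alpha.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of item scores per person."""
    k = len(responses[0])
    item_vars = [variance([person[i] for person in responses]) for i in range(k)]
    total_var = variance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up IQ-style scores for five people tested twice (illustration only).
time1 = [100, 110, 95, 120, 105]
time2 = [102, 108, 97, 118, 106]
print("test-retest r:", round(pearson_r(time1, time2), 3))

# Made-up responses of five people to a four-item scale (illustration only).
responses = [
    [3, 4, 3, 4],
    [1, 2, 1, 1],
    [4, 4, 5, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
]
print("Cronbach's alpha:", round(cronbach_alpha(responses), 3))
```

High values on both indices would show only that the scores are reliable. As the slide notes, they say nothing about whether the scores are interpreted validly.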
5. Iowa story
- Don't want to hire people who might abuse clients anymore!
- Personality tests
  - Is there a test that measures the construct?
  - Does it validly measure abusive personality?
  - Is there a test that was designed to predict the likelihood that a particular individual will abuse people?
6. What is validity?
- Definition
- Implications of the contemporary definition of validity
7. Validity: Definition
- Basic definition
  - The degree to which a test measures what it is supposed to measure
- Contemporary definition
  - The degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of the test
8. Implications of the contemporary definition
9. Implication 1
- Interpretation and use of test scores
10. Validity is about the interpretation and use of test scores
- NEO-PI-R
  - Conscientiousness scale: 48 items
  - High scores reflect an active process of planning, organizing, and carrying out tasks, and people with high scores on this scale are purposeful, strong-willed, and determined
11. NEO-PI-R: Conscientiousness Scale
- What is the correct question about the scale's validity or invalidity?
  - Are the test items valid or invalid?
  - Are the test scores valid or invalid?
  - Is the interpretation of the test scores valid or invalid?
12. Not "Are the items or scores valid or invalid?"
- The question is
  - Are the authors' interpretations of the scores valid or invalid?
  - Are conscientiousness scores validly interpreted in terms of planfulness, organization, and determination?
13. Proposed use of scores
- Employers may use the NEO-PI-R Conscientiousness scale to screen potential employees
- Belief: it differentiates potentially better and worse employees?
- Predictive power of the Conscientiousness scale score?
14. A hammer is a useful tool if you need to drive a nail
15. What if you need to saw a piece of wood?
- A hammer is not a useful tool regardless of the need; its usefulness depends on the task
16. Simplistic and inaccurate to say
- The Conscientiousness scale is valid, without regard to the way in which it will be interpreted and used
- Rather (what is accurate)
  - Scores can be interpreted validly as an indicator of conscientiousness
  - Scale is not valid as a measure of intelligence or extraversion
  - Not a valid predictor of successful employment
17. Compare
- Scores on the Conscientiousness scale of the NEO-PI-R are validly interpreted as a measure of conscientiousness.
- vs.
- The Conscientiousness scale of the NEO-PI-R is valid.
18. Implication 2
- Validity is a matter of degree
  - Strong vs. weak
  - NOT valid vs. invalid
- Select a test if there is strong enough evidence supporting the intended interpretation and use
- http://www.wired.com/wired/archive/9.12/aqtest.html
19. Concern about the Autism Spectrum Quotient
- Marginal internal consistency, so reliability is already of concern
- What about validity?
  - Is it valid to interpret a high score on the test as reflecting a high degree of autism traits?
20. Interpretation of AQ
21. Regret vs. Autism? (r = .45)
22. AQ
- http://www.wired.com/wired/archive/9.12/aqtest.html
23. What is to be measured?
- What are the relative strengths of the alternatives that are available to measure that construct?
- Select the best measures of the specific characteristics to be assessed
24. Implication 3
- Validity of a test's interpretation is based on evidence and theory
- Human resources: in her experience, use of the NEO-PI-R was useful in selection
25. Personality Color Test
- Based on color psychology (Max Lüscher)
- Color preferences reveal something about your personality
- Survey of the scientific literature finds almost no empirical evidence for the validity of color preferences as a measure of personality characteristics
26. Evidence for color test
- Less than clear
- Site implies validity
- Web site
  - "Is the test reliable? We leave that to your opinion. We can only say that there are a number of corporations and colleges that use the Lüscher test as part of their hiring/admissions processes. It can be a useful tool for doctors and psychologists as well and is used to get a quick overview of potential issues patients may have in their lives."
  - http://colorquiz.com/
27. Color Quiz
- Is the test useful as a measure of personality?
- Denied employment based on such a test?
28. Empirical evidence and theoretical underpinnings?
- Data from high-quality research must be available.
- Theory alone is not adequate.
29. Contemporary view of validity
- Although there are three traditional forms (content, criterion, and construct), the contemporary perspective highlights CONSTRUCT VALIDITY
30. Standards
- Standards for Educational and Psychological Testing, revised (1999)
- Co-published by
  - American Educational Research Association (AERA)
  - American Psychological Association (APA)
  - National Council on Measurement in Education (NCME)
31. Remember
- Contemporary perspective highlights CONSTRUCT VALIDITY
32. Standards outline 5 types of evidence relevant for establishing validity of test interpretations (AERA, APA, NCME, 1999)
Construct Validity, supported by evidence from:
- Test Content
- Internal Structure
- Response Processes
- Associations With Other Variables
- Consequences of Use
33. Construct Validity: Test Content
34. Validity Evidence: Test Content
- Match between the actual content of a test and the content that should be included in the test.
- Psychological nature of the construct should dictate the appropriate content of the test.
35. Face Validity
- Face validity: the degree to which a measure appears to be related to a specific construct in the judgment of non-experts, such as test takers and representatives of the legal system.
- The test LOOKS relevant, and this fact may increase the likelihood that the test will be well received by users and takers
36. Threats to content validity
- Construct-irrelevant content, e.g., the test includes questions on content not covered in the book, lecture, or discussion
- Construct under-representation, e.g., the test content fails to represent the full scope of the content implied by the construct
- Related practical issues, e.g., time, respondent fatigue, and respondent attention. Is the content a fair representation?
37. Content Validity vs. Face Validity
- Content validity is the
  - degree to which the content reflects the full domain of the construct
  - can only be evaluated by experts who have a deep understanding of the construct
- Face validity is the
  - degree to which non-experts perceive the test to be relevant to what they believe is being measured by it
38. Construct Validity: Internal Structure
39. Validity Evidence: Internal Structure of the Test
- For a test to be validly interpreted as a measure of a particular construct,
  - the actual structure of the test should match the theoretically based structure of the construct
- Does the theoretical basis suggest a unidimensional or a multidimensional structure?
40. Internal Structure
- Often assessed via examination of factor structure (factor analysis)
- Items that are more strongly correlated with each other than with other items form clusters called factors
- Factor analysis should clarify the number of factors within a set of test questions
- Example: self-esteem. Is the construct uni- or multidimensional?
41. Factor analysis
- Clarifies the number of factors
- Reveals associations among the factors within a multidimensional test
- Identifies which items are linked to which factors
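The first step, clarifying the number of factors, is often approached by inspecting the eigenvalues of the inter-item correlation matrix, which are the values drawn on a scree plot. The sketch below is an illustration under stated assumptions and not the slides' own analysis: the correlation matrix is hypothetical, the function name `symmetric_eigenvalues` is made up, and a plain Jacobi rotation routine stands in for what a statistics package would normally do. It covers only the eigenvalue step, not the estimation of factor loadings.

```python
import math


def symmetric_eigenvalues(matrix, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    n = len(matrix)
    a = [row[:] for row in matrix]  # work on a copy
    for _ in range(max_sweeps):
        off_diagonal = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off_diagonal < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical correlation matrix for four items: items 1-2 and items 3-4
# correlate .6 within each pair and only .1 across pairs.
corr = [
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
]
eigenvalues = symmetric_eigenvalues(corr)
for rank, value in enumerate(eigenvalues, start=1):
    print(f"factor {rank}: eigenvalue = {value:.2f}")
```

For this made-up matrix the first two eigenvalues (1.8 and 1.4) stand well above the remaining two (0.4 each), the kind of drop that would suggest a two-dimensional structure on a scree plot. A unidimensional scale would instead show one dominant eigenvalue.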
42. Rosenberg Self-Esteem Inventory (RSEI; Rosenberg, 1989)
- On the whole, I am satisfied with myself
- At times, I think I am no good at all
- I feel that I have a number of good qualities
- I am able to do things as well as most other people
- I feel I do not have much to be proud of
- I certainly feel useless at times
- I feel that I'm a person of worth, at least on an equal plane with others
- I wish I could have more respect for myself
- All in all, I am inclined to feel that I am a failure
- I take a positive attitude toward myself
43. RSEI: Scree Plot
- Number of factors evident in the plot?
- Question
  - This scree plot provides evidence for what type of structure?
    - Unidimensional
    - Multidimensional
44. Construct Validity: Response Processes
45. Validity Evidence: Response Processes
- Match between the psychological processes that respondents actually use when completing a measure and the processes that they should use.
- Example: "When I say start, raise your finger when you feel 10 s have elapsed."
- Assumption: respondents should use "feel" (it feels like time is up), but they could use another process such as covert counting, copying others, or looking at the second hand of a watch
46. Response processes
- If the response process actually used is different from the one assumed to be used, then the scores may not be interpretable as the test developer intended
- Attention to the internal feel of time passing vs. use of some selected process to intentionally mark the passage of time
47. Construct Validity: Associations With Other Variables
48. Validity Evidence: Associations With Other Variables
- Match between a measure's actual associations with other measures and the associations that the test should have with the other measures.
49. Convergent evidence
- The degree to which test scores are correlated with tests of related constructs
50. Discriminant evidence
- The degree to which test scores are uncorrelated with tests of unrelated constructs
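Convergent and discriminant evidence both come down to inspecting correlations between the test of interest and other measures. The sketch below illustrates that comparison with made-up data; the variable names and the helper `pearson_r` are hypothetical and not drawn from the slides, and real validation work would use far larger samples.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Made-up scores for six respondents (illustration only).
new_scale = [10, 12, 15, 18, 20, 23]        # test being evaluated
related_measure = [11, 14, 14, 19, 21, 22]  # measure of a related construct
unrelated_measure = [5, 9, 4, 8, 5, 8]      # measure of an unrelated construct

convergent_r = pearson_r(new_scale, related_measure)
discriminant_r = pearson_r(new_scale, unrelated_measure)

print(f"convergent r   = {convergent_r:.2f}  (expected to be high)")
print(f"discriminant r = {discriminant_r:.2f}  (expected to be near zero)")
```

The pattern matters more than any single coefficient: the interpretation gains support when the scores correlate strongly where theory says they should and weakly where theory says they should not.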
51. Example
- Hypothesis: schizophrenia and autism are diametrically opposed constructs
53. Measure of autism should be uncorrelated with measures of schizophrenia
54. Support for Crespi & Badcock's theory?
- NO: convergent evidence; the autism measure correlated positively with SZ measures
- Finding: AU and SZ are related constructs?
  - i.e., Crespi & Badcock are wrong
- Or
  - Not necessarily; the strong correlations could instead indicate weak validity of the AQ as a measure of the autism construct
55. Concurrent validity evidence
- The degree to which test scores are correlated with other relevant variables that are measured at the same time as the primary test of interest
- Is the SAT a measure of skills needed for academic success?
  - Compare SAT scores from the senior year of high school to senior-year high school GPA
56. Predictive validity evidence
- The degree to which test scores are correlated with relevant variables that are measured at a future point in time.
- Is the SAT a measure of skills needed for academic success?
  - Compare SAT scores from the senior year of high school to college freshman-year GPA
57. Validity Evidence: Consequences of Testing
- Social consequences of a test are a facet of validity
- Standards for Educational and Psychological Testing
  - Validity includes the intended and unintended consequences of test use
- E.g., do a construct and its measurement benefit one group?
58. Not all agree
- Consequences of a testing program should be considered a facet of the scientific evaluation of the meaning of a test score.
- Some feel that this is an intrusion of politics into science
- Can science be separated from personal and social values?
59. Summary
- Conceptual basis for validity
Construct Validity, supported by evidence from:
- Test Content
- Internal Structure
- Response Processes
- Associations With Other Variables
- Consequences of Use
60. Validity
- Standards for Educational and Psychological Testing (1999)
- The degree to which
  - evidence and theory support the
  - interpretations of test scores entailed by the
  - proposed uses of a test
61. Validity
- Are decisions based on valid interpretations of test scores?
  - Educational placement
  - Access to services
  - Hiring
  - Clinical decisions