Validity - PowerPoint PPT Presentation

Provided by: Chris1947

Transcript and Presenter's Notes

1
Validity

In our last class, we began to discuss some of
the ways in which we can assess the quality of
our measurements.
We discussed the concept of reliability (i.e.,
the degree to which measurements are free of
random error).

2
Why reliability alone is not enough

Understanding the degree to which measurements
are reliable, however, is not sufficient for
evaluating their quality.
In-class scale example
Recall that test-retest estimates of reliability
tend to range between 0 (low reliability) and 1
(high reliability).
Note: An online correlation calculator is
available at http://easycalculation.com/statistics/correlation.php

3
Validity

In this example, the measurements appear
reliable, but there is a problem . . .
Validity reflects the degree to which
measurements are free of both random error, E,
and systematic error, S.
O = T + E + S
(observed score = true score + random error + systematic error)
Systematic errors reflect the influence of any
non-random factor beyond what we're attempting to
measure.

4
Validity: Does systematic error accumulate?

Question: If we create a composite of multiple
observations, how will systematic errors
influence our estimates of the true score?

5
Validity: Does error accumulate?

Answer: Unlike random errors, systematic errors
accumulate.
Systematic errors exert a constant source of
influence on measurements. We will always
overestimate (or underestimate) T if systematic
error is present.

6
Note: Each measurement is 2 points higher than
the true value of 10. The errors do not average
out.
7
Note: Even when random error is present, E
averages to 0 but S does not. Thus, we have
reliable measures that have validity problems.
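The point of slides 6 and 7 can be checked with a small simulation. This is a minimal sketch, assuming each observation is the true score plus random error plus systematic error, with T = 10 and S = 2 as on the slide. The normally distributed random error (standard deviation 1), the sample size, and the function name are illustrative assumptions, not part of the lecture.

```python
import random
from statistics import mean


def simulate_observations(true_score, systematic_error, n, noise_sd=1.0, seed=0):
    """Return n observed scores, each equal to T + E + S.

    E is drawn from a normal distribution with mean 0 (an assumption made
    for this sketch); S is the same constant for every observation.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        true_score + rng.gauss(0.0, noise_sd) + systematic_error
        for _ in range(n)
    ]


observations = simulate_observations(true_score=10, systematic_error=2, n=10_000)
composite = mean(observations)

# Random error shrinks toward 0 in the average; the constant bias does not,
# so the composite lands near T + S rather than near T.
print(f"composite = {composite:.2f} (true score = 10)")
```

Increasing `n` makes the composite more stable, but it stabilizes around 12, not 10: more observations improve reliability without fixing the bias.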
8
Validity: Ensuring validity

What can we do to minimize the impact of
systematic errors?
One way to minimize their impact is to use a
variety of indicators: different sources of
information.
Different kinds of indicators of a latent
variable may not share the same systematic errors.
If true, then S will behave like random error
across measurements (but not within measurements).

9
Example

As an example, let's consider the measurement of
self-esteem.
Some methods, such as self-report questionnaires,
may lead people to over-estimate their
self-esteem. Most people want to think highly of
themselves.
Other methods, such as clinical ratings by
trained observers, may lead to under-estimates of
self-esteem. Clinicians, for example, may be
prone to assume that people are not as well-off
as they say they are.

10
Method 1: Self-reports
Method 2: Clinical ratings
Note: Method 1 systematically overestimates T
whereas Method 2 systematically underestimates T.
In combination, however, those systematic errors
cancel out.
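A short numerical sketch of this cancellation, assuming a true score of 4 and systematic errors of +1 and -1 for the two methods. These values are made up for illustration; real biases need not be equal in size, in which case they cancel only partially.

```python
from statistics import mean

TRUE_SCORE = 4  # hypothetical true self-esteem score (T)

# Hypothetical systematic errors (S) for each method.
method_bias = {"self_report": +1, "clinical_rating": -1}

# Observed score per method, ignoring random error: O = T + S.
observed = {name: TRUE_SCORE + bias for name, bias in method_bias.items()}

# Averaging across methods lets the opposite biases offset each other.
composite = mean(list(observed.values()))

print(observed)
print(composite)
```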
11
Another example

One problem with the use of self-report
questionnaire rating scales is that some people
tend to give high (or low) answers consistently
(i.e., regardless of the question being asked).
This is sometimes referred to as a "yea-saying"
or "nay-saying" bias (acquiescence).

12
1 = strongly disagree, 5 = strongly agree
Item                                                       T  S  O
I think I am a worthwhile person.                          4  1  5
I have high self-esteem.                                   4  1  5
I am confident in my ability to meet challenges in life.   4  1  5
My friends and family value me as a person.                4  1  5
Average score                                              4  1  5
In this example, we have someone with relatively
high self-esteem, but this person systematically
rates questions one point higher than he or she
should.
13
1 = strongly disagree, 5 = strongly agree
If we reverse key half of the items, the bias
averages out. Responses to reverse-keyed items
are counted in the opposite direction.
T = (4 + 4 + (6 - 2) + (6 - 2)) / 4 = 4
O = (5 + 5 + (6 - 3) + (6 - 3)) / 4 = 4
Item                                                           T  S  O
I think I am a worthwhile person.                              4  1  5
I have high self-esteem.                                       4  1  5
I am NOT confident in my ability to meet challenges in life.   2  1  3
My friends and family DO NOT value me as a person.             2  1  3
Average score (after reverse keying)                           4  1  4
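The reverse-keying arithmetic can be written as a small scoring routine. This is a sketch using the item responses from slides 12 and 13 on a 1-to-5 scale; the function and variable names are illustrative, not from the lecture.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 5


def score_item(response, reverse_keyed):
    """Return the keyed score; reverse-keyed items are flipped (1<->5, 2<->4)."""
    if reverse_keyed:
        return (SCALE_MIN + SCALE_MAX) - response  # 6 - response on a 1-5 scale
    return response


def scale_score(responses, reverse_flags):
    """Average the keyed item scores into a single scale score."""
    return mean([score_item(r, rev) for r, rev in zip(responses, reverse_flags)])


# Slide 12: all items keyed in the same direction, each inflated by 1 point.
all_forward = scale_score([5, 5, 5, 5], [False, False, False, False])

# Slide 13: the last two items are reverse keyed, same 1-point inflation.
half_reversed = scale_score([5, 5, 3, 3], [False, False, True, True])

print(all_forward)    # bias survives: 5 instead of the true 4
print(half_reversed)  # bias averages out: 4
```

The bias cancels exactly here only because half of the items are reverse keyed and the respondent inflates every answer by the same amount.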
14
Validity

To the extent that a measure has validity, we
say that it measures what it is supposed to
measure.
Question: How do you assess validity?

Very tough question to answer!
15
Different ways to think about validity

To the extent that a measure has validity, we can
say that it measures what it is supposed to
measure.
There are different reasons for measuring
psychological variables. The precise way in which
we assess validity depends on the reason that
we're taking the measurements in the first place.

16
Prediction

As an example, if one's goal is to develop a way
to determine who is at risk for developing
schizophrenia, one's goal is prediction.

17
Predictive Validity

We may begin by obtaining a group of people who
have schizophrenia and a group of people who do
not.
Then, we may try to figure out which kinds of
antecedent variables differentiate the two groups.

18
Antecedent variable                                                  Correct classifications
Lost a parent before the age of 10                                   10%
Parent or grandparent had schizophrenia                              50%
Mother was cold and aloof to the person when he or she was a child   15%
19
Predictive Validity

In short, some of these variables appear to be
better than others at discriminating
schizophrenics from non-schizophrenics.
The degree to which a measure can predict what it
is supposed to predict is called its predictive
validity.
When we are taking measurements for the purpose
of prediction, we assess validity as the degree
to which those predictions are accurate or useful.

20
[Classification table: Reality (schizophrenic: yes/no) by Measure (schizophrenic: yes/no). Cells shown on this slide: 10 and 40.]
21
                          Reality: Schizophrenic
                          Yes    No
Measure: Schizophrenic
  No                      10     10
  Yes                     40     40
50% ((40 + 10) / 100) of people were correctly
classified (with a 50% base rate. Yuck.)
22
                          Reality: Schizophrenic
                          Yes    No
Measure: Schizophrenic
  No                      0      98
  Yes                     1      1
99% ((98 + 1) / 100) of people were correctly
classified, but note the base rate problem.
Cohen's kappa is used to account for this
problem. Kappa in this example is .66.
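To make the base rate correction concrete, here is a minimal sketch of Cohen's kappa for a 2x2 classification table, applied to the counts on slides 21 and 22. The function name and argument names are illustrative, and the assignment of counts to cells follows one reading of the flattened slide tables (kappa comes out the same if the two off-diagonal cells are swapped).

```python
def cohens_kappa(yes_yes, yes_no, no_yes, no_no):
    """Cohen's kappa for a 2x2 table of counts.

    Arguments are named measure_reality: yes_no is the number of people
    the measure classifies as "yes" who are "no" in reality, and so on.
    """
    n = yes_yes + yes_no + no_yes + no_no
    observed = (yes_yes + no_no) / n          # proportion correctly classified
    measure_yes = (yes_yes + yes_no) / n      # how often the measure says "yes"
    reality_yes = (yes_yes + no_yes) / n      # base rate in reality
    # Agreement expected by chance alone, given the two marginal rates.
    expected = measure_yes * reality_yes + (1 - measure_yes) * (1 - reality_yes)
    return (observed - expected) / (1 - expected)


kappa_slide_21 = cohens_kappa(yes_yes=40, yes_no=40, no_yes=10, no_no=10)
kappa_slide_22 = cohens_kappa(yes_yes=1, yes_no=1, no_yes=0, no_no=98)

print(round(kappa_slide_21, 2))
print(round(kappa_slide_22, 2))
```

On the slide 21 counts, the 50% agreement is exactly what chance would produce, so kappa is 0. On the slide 22 counts, 99% agreement shrinks to a kappa of about .66 once the 97% chance agreement is taken into account.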
23
Construct Validity

Sometimes we're not interested in measuring
something just for technological purposes, such
as prediction.
We may be interested in measuring a construct in
order to learn more about it.
Example: We may be interested in measuring
self-esteem not because we want to predict
something with the measure per se, but because we
want to know how self-esteem develops, whether it
develops differently for males and females, etc.

24
Construct Validity

Notice that this is much different than what we
were discussing before. In our schizophrenia
example, it doesn't matter whether our measure of
schizophrenia really measured schizophrenic
tendencies per se.
As long as the measure helps us predict
schizophrenia well, we don't really care what it
measures.

25
Construct Validity

When we are interested in the theoretical
construct per se, however, the issue of exactly
what is being measured becomes much more
important.
The general strategy for assessing construct
validity involves (a) explicating the theoretical
relations among relevant variables and (b)
examining the degree to which the measure of the
construct relates to things that it should and
fails to relate to things that it should not.

26
Nomological Network

The nomological network represents the
interrelations among variables involving the
construct of interest.

[Figure: nomological network linking self-esteem to achieving in school, ability to cope, and (negatively, "-") distrusting friends.]
27
Nomological Network Validity

The process of assessing construct validity
basically involves determining the degree to
which our measure of the construct behaves in the
way assumed by the theoretical network in which
it is embedded.
If, theoretically, people with high self-esteem
should be more likely to succeed in school, then
our measure of self-esteem should be able to
predict people's grades in school.

28
Construct Validity

Notice here that establishing construct validity
involves prediction. The difference between
prediction in this context and prediction in the
previous context is that we are no longer trying
to predict school performance as best as we
possibly can.
Our measure of self-esteem should only predict
performance to the degree to which we would
expect these two variables to be related
theoretically.

29
Discriminant Validity

The measure should also fail to be related to
variables that, theoretically, are unrelated to
self-esteem.
The ability of a measure to fail to predict
irrelevant variables is referred to as the
measure's discriminant validity.

[Figure: the same nomological network with one added variable, liking coffee, whose relation to self-esteem is 0; distrusting friends remains negatively ("-") related.]
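One way to examine whether a measure behaves as its nomological network says it should is to correlate it with the other variables in the network. The sketch below uses Pearson's r on invented scores for five people; the data, variable names, and helper function are hypothetical and chosen only to show the expected pattern (positive, negative, and zero relations).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)


# Invented scores for five people (not real data).
self_esteem = [1, 2, 3, 4, 5]
school_grades = [2, 1, 4, 3, 5]     # theory: positively related
distrust_friends = [4, 5, 3, 1, 2]  # theory: negatively related
likes_coffee = [2, 4, 3, 4, 2]      # theory: unrelated

r_grades = pearson_r(self_esteem, school_grades)
r_distrust = pearson_r(self_esteem, distrust_friends)
r_coffee = pearson_r(self_esteem, likes_coffee)

print(round(r_grades, 2), round(r_distrust, 2), round(r_coffee, 2))
```

A correlation near zero with the irrelevant variable is the evidence of discriminant validity; a sizable correlation there would suggest the measure is picking up something other than self-esteem.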
30
Validity: Assessing validity

Finally, it is useful, but not necessary, for a
measure to have face validity.
Face validity: The degree to which a measure
appears to measure what it is supposed to
measure.
A questionnaire item designed to measure
self-esteem that reads "I have high self-esteem"
has face validity. An item that reads "I like
cabbage in my Frosted Flakes" does not.
In the context of prediction, face validity
doesn't matter. In the context of construct
validity, it matters more.

31
A Final Note on Construct Validity

The process of establishing construct validity is
one of the primary enterprises of psychological
research.
When we are measuring the association between two
variables to assess a measure's predictive or
discriminant validity, we are evaluating both (a)
the quality of the measure and (b) the soundness
of the nomological network.
It is not unusual for researchers to refine the
nomological network as they learn more about how
various measures are inter-related.