1. Chapter 4. Validity
- Does the test cover what we are told (or believe) it covers?
- To what extent?
- Is the assessment being used for an appropriate purpose?
2. Validity Topics
- Definition (usual and refined)
- Categories of validity evidence
- A. face validity
- B. content validity: table of specifications, alignment analysis, opportunity to learn
- C. criterion-related validity
- D. construct validity
- E. consequential validity
- Test fairness
3. Introduction
- Without good validity, all else is lost. Validity is the most important characteristic of a test or assessment technique.
- Usual Definition
- It measures what it purports to measure.
- Refined Definition
- It involves the interpretation of a score for a particular purpose or use (because a score may be valid for one use but not another).
- It is a matter of degree, not all-or-none. As a practical matter, our concern is to determine the extent (for example, in non-mathematical terms we might say slight, moderate, considerable).
4. Some Helpful Terms
- Construct
- The trait or characteristic that interests us. We might call it a target, or what we want to get at. We create a test to cover this attribute.
- Validity addresses how well an assessment technique provides useful information about the construct / target.
- Construct underrepresentation
- The test we made is not assessing all of the construct; our test misses things we should be assessing.
- Construct irrelevant variance
- The test we made is assessing things that are not really part of our construct; we are assessing irrelevant material that we don't want.
- See the next two slides for illustrations.
5. The Construct and Valid Measurement (illustration)
6. Varying Degrees of Construct Underrepresentation and Construct Irrelevant Variance (illustration)
7. A. Face Validity: Think of the idiom "on the face of it" . . .
- A test is said to have face validity if it "looks like" it is going to measure what it is supposed to measure.
- Face validity is not empirical: one is saying that the test appears as if it will work, as opposed to saying it has been shown to work.
- Face validity is often created to influence the opinions of participants who are not expert in testing methodologies, e.g. test takers, parents, politicians.
8. B. Content Validity: Most used in achievement tests and employment exams
- Meaning of this type of validity
- There is a good match between the content of the test and some well-defined domain of knowledge or behavior. Reference to content defines the orientation of the test.
- For teachers, considered the most important type of validity for
- your own classroom tests
- achievement tests
- Where do we find the well-defined domain?
- Examination of textbooks in the field, with special attention to the learning objectives at the beginning of chapters and the terms at the end
- Curriculum guides of school districts
- Ohio's Academic Content Standards
- So, now that we have the content topics identified, what should we actually expect students to know and be able to do in relation to these topics? This question deals with process or depth indicators. How should we make sure we include both the content and the depth expected in our tests?
9. The Table of Specifications: Building content validity into my own classroom tests
- Table of Specifications: this connects the content determined earlier to the mental processes students are expected to employ regarding this content
- Two-way table
- Content
- Bloom's taxonomy (simplest mental operation to the most complex)
- Each test item I create then falls into one cell.
- By creating the table, I can see the relative weight assigned to each cell. Is this what I want? (A small sketch of such a table follows below.)
10. Alignment Analysis: Checking content validity in existing tests
- These steps parallel building your own good test and constructing a table of specifications. There are some things to watch for and consider as you do this:
- Be wary of using the summary outline provided by the test maker; examine the actual test items.
- Match items on the test with content you are teaching; watch for mismatches:
- Items on the test you are not teaching
- Content you are teaching that is not tested
- This matching requires considerable judgment.
- The test does not have to cover every detail; it could be a representative sample.
- If stakes are high, use a panel of individuals. (A simple matching sketch follows below.)
11. Opportunity to Learn: But was it taught? . . .
- An emerging idea related to content validity is a concern called instructional validity. This relates to your behavior as a teacher. The content may be in the book; the content may be in the state standards . . . BUT . . . did you actually teach it? Some teachers skip items of instruction they don't like, don't understand, or don't have time for.
- If related items appear on a test, this would reduce the validity of the test, since the students had no opportunity to learn the knowledge or skill being assessed.
12. C. Criterion-Related Validity: While the term "test" is used, also think "measure" or "procedure"
- The basic idea: demonstrate the degree of accuracy of a test by comparing it with another test, measure, or procedure that has been demonstrated to be valid (i.e., a valued criterion).
- Two general contexts
- Predictive validity: one measure is given now, one later. The later test is known to be valid. This approach allows me to show my current test is valid by comparing it to a future valid test.
- For example, a behind-the-wheel driving test has been shown to be an accurate test of driving skills. By comparing the scores on a written rules-of-the-road test with the scores from the driving test, the written test can be validated using a criterion-related strategy.
- Concurrent validity: both measures are current. This approach allows me to show my test is valid by comparing it with an already valid test. I can do this if I can show my test varies directly with a measure of the same construct or inversely with a measure of an opposite construct.
- The computed statistic in both cases is r (which we now call a validity coefficient), and it has all the characteristics we have already discussed about correlation coefficients in general. (A computation sketch follows below.)
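For illustration, a minimal sketch of computing a validity coefficient in Python, using made-up scores for ten students on a new written test and an already validated criterion measure (the driving test from the example above):

from statistics import correlation  # Python 3.10+

written_test = [72, 85, 90, 66, 78, 95, 59, 88, 74, 81]  # new measure
road_test    = [70, 82, 93, 60, 75, 97, 55, 90, 72, 84]  # validated criterion

r = correlation(written_test, road_test)  # the validity coefficient
print(f"validity coefficient r = {r:.2f}")

The resulting r is read like any other correlation coefficient: the closer to 1.0 (or to -1.0, for a measure of an opposite construct), the stronger the criterion-related evidence.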
13. Special Considerations for Interpreting Criterion-Related Validity
- Group Variability
- The greater the variability, the greater the r.
- Reliability-Validity Relationship
- Reliability limits validity; reliability is a prerequisite to validity (see the bound below).
- Validity of the Criterion
- How good is the criterion? Do you agree with the operational definition of the criterion?
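In classical test theory, the reliability-validity relationship has a standard algebraic form: the validity coefficient between a test x and a criterion y cannot exceed the square root of the product of their reliabilities. In LaTeX notation:

r_{xy} \le \sqrt{r_{xx}\, r_{yy}}

where r_{xx} and r_{yy} are the reliability coefficients of the test and the criterion. For example, a test with reliability 0.64 cannot achieve a validity coefficient above \sqrt{0.64} = 0.80, even against a perfectly reliable criterion.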
14. D. Construct Validity
- When we ask about a test's construct validity, we are taking a broad view of the test. Does the test adequately measure the underlying, unobserved construct? The question is asked both in terms of
- convergent validity: are test scores related to behaviors and tests that they should be related to? and
- divergent validity: are test scores unrelated to behaviors and tests that they should be unrelated to?
- There is no single measure of construct validity. Construct validity is based on the accumulation of knowledge about the test and its relationship to other tests and behaviors.
- To establish construct validity, we demonstrate that the measure changes in a logical way when other conditions change. (A small convergent/divergent check is sketched below.)
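As a minimal sketch of the convergent/divergent logic, the Python below uses hypothetical scores on a new anxiety scale, an established anxiety scale (same construct, so r should be high), and a reading-speed measure (unrelated construct, so r should be near zero):

from statistics import correlation  # Python 3.10+

new_scale   = [12, 18, 25, 9, 30, 22, 15, 27]            # new anxiety scale
old_scale   = [14, 17, 27, 8, 29, 20, 16, 25]            # established anxiety scale
reading_wpm = [190, 220, 215, 205, 200, 185, 200, 185]   # unrelated construct

print(f"convergent r = {correlation(new_scale, old_scale):.2f}")    # expect high
print(f"divergent  r = {correlation(new_scale, reading_wpm):.2f}")  # expect near zero

No single pair of correlations settles the question; it is the accumulated pattern across many such comparisons that builds the construct-validity case.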
15. E. Consequential Validity: A recent, controversial entry into the assessment lexicon . . .
- Some professionals feel that, in the real world, the consequences that follow from the use of assessments are important indications of validity.
- Some professionals feel that these consequences are matters of politics and policymaking: important considerations, yes, but not matters of validity.
- On which side are we? As educators, we sometimes see the consequences as more important than the technical validity of the test. Judgments based on assessments we give and use have value implications and social consequences.
- What is the intended use of these test scores?
- How are the scores really being used?
- Does this testing lead to educational benefits?
- Are there negative spin-offs?
16. Test Fairness, Test Bias
- Test fairness and test bias have the same meaning, with opposite connotations.
- Fairness: an assessment or test measures a trait, construct, or target with equal validity for different groups.
- Bias: the groups do not differ in terms of real status on the trait, construct, or target being assessed, yet the test suggests they do.
17. Methods of Reviewing Fairness
- Test Companies (look in the test manual to see what a particular company did about test fairness issues on this test)
- Panel review: most popular, but is this just face validity?
- Differential item functioning (DIF): examines subsets of items (see the sketch after this list)
- Criterion-related validity: examines the whole test
- Teacher-Created Assessments (teachers need to be knowledgeable about, and sensitive to, issues of test fairness)
- Is there anything about my test that will unfairly advantage or disadvantage a student or group of students?
- Is there anything about the mechanics of the test that calls for skills other than those I intend to measure?
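To illustrate the idea behind DIF (not a full procedure such as Mantel-Haenszel, which a real review would use), the hypothetical sketch below compares two groups' success on a single item after matching students on overall score band. If matched students from the two groups succeed at clearly different rates, the item deserves scrutiny.

from collections import defaultdict

# Hypothetical data: (group, total-score band, answered the item correctly).
records = [
    ("A", "high", True), ("A", "high", True), ("A", "low", False),
    ("A", "low", True),  ("B", "high", True), ("B", "high", False),
    ("B", "low", False), ("B", "low", False),
]

counts = defaultdict(lambda: [0, 0])  # (group, band) -> [n correct, n total]
for group, band, correct in records:
    counts[(group, band)][0] += int(correct)
    counts[(group, band)][1] += 1

# Within each ability band, compare the groups' proportions correct.
for band in ("high", "low"):
    for group in ("A", "B"):
        n_correct, n_total = counts[(group, band)]
        print(f"band={band:4} group={group}: {n_correct}/{n_total} correct")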
18. Practical Advice
- For building your own tests, think content validity.
- For judging an externally prepared achievement test, start with a clear definition of what's to be covered.
- For criterion-related validity, take into account group variability and think about the validity of the criterion.
- For test fairness (bias), distinguish between differences in groups' average scores and group status on the trait.
- For your own assessments, try to eliminate the influence of any factors not related to what you want to measure.
19. Terms and Concepts to Review and Study on Your Own (1)
- alignment analysis
- Bloom's taxonomy
- concurrent validity
- consequential validity
- construct
- construct irrelevant variance
- construct underrepresentation
- construct validity
- content validity
- criterion-related validity
20. Terms and Concepts to Review and Study on Your Own (2)
- differential item functioning (DIF)
- external criterion
- face validity
- fairness (or its opposite, bias)
- instructional validity
- opportunity to learn
- predictive validity
- table of specifications (two-way table)
- validity
- validity coefficient