1
Chapter 4. Validity
  • Does the test cover what we are told (or believe)
    it covers?
  • To what extent?
  • Is the assessment being used for an appropriate
    purpose?

2
Validity Topics
  • Definition (usual and refined)
  • Categories of validity evidence
  • A. face validity
  • B. content validity: table of specifications,
    alignment analysis, opportunity to learn
  • C. criterion-related validity
  • D. construct validity
  • E. consequential validity
  • Test fairness

3
Introduction
  • Without good validity, all else is lost. Validity
    is the most important characteristic of a test or
    assessment technique.
  • Usual Definition
  • It measures what it purports to measure.
  • Refined Definition
  • It involves the interpretation of a score for a
    particular purpose or use (because a score may
    be valid for one use but not another).
  • It is a matter of degree, not all-or-none. As a
    practical matter, our concern is to determine the
    extent (for example, in non-mathematical terms we
    might say slight, moderate, or considerable).

4
Some Helpful Terms
  • Construct
  • The trait or characteristic that interests us.
    We might call it a "target" or "what we want to
    get at." We create a test to cover this
    attribute.
  • Validity addresses how well an assessment
    technique provides useful information about the
    construct / target.
  • Construct underrepresentation
  • The test we made is not assessing all of the
    construct; our test misses things we should be
    assessing.
  • Construct irrelevant variance
  • The test we made is assessing things that are not
    really part of our construct; we are assessing
    irrelevant material that we don't want.
  • See the next two slides for illustrations.

5
The Construct and Valid Measurement
6
Varying Degrees of Construct Underrepresentation
and Construct Irrelevant Variance
7
A. Face Validity
Think of the idiom "on the face of it" . . .
  • A test is said to have face validity if it "looks
    like" it is going to measure what it is supposed
    to measure.
  • Face validity is not empirical: one is saying
    that the test looks as if it will work, as opposed
    to saying it has been shown to work.
  • Face validity is often created to influence the
    opinions of participants who are not expert in
    testing methodologies, e.g., test takers, parents,
    and politicians.

8
B. Content Validity
Most used in achievement tests and employment exams
  • Meaning of this type of validity
  • There is a good match between the content of the
    test and some well-defined domain of knowledge or
    behavior. Reference to content defines the
    orientation of the test.
  • For teachers, considered the most important type
    of validity for
  • your own classroom tests
  • achievement tests
  • Where do we find the well-defined domain?
  • Examination of textbooks in the field, with
    special attention to the learning objectives at
    the beginning of a chapter and the terms at the
    end.
  • Curriculum guides of school districts
  • Ohio's Academic Content Standards
  • So, now we have the content topics identified,
    but what should we actually expect students to
    know and be able to do in relation to these
    topics? This question deals with process or
    "depth" indicators. How should we make sure we
    include both the content and the depth expected
    in our tests?

9
The Table of Specifications
Building content validity into my own classroom tests
  • Table of Specifications: this connects the
    content determined earlier to the mental
    processes students are expected to employ
    regarding this content
  • Two-way table
  • Content
  • Bloom's taxonomy (simplest mental operation to
    the most complex)
  • Each test item I create then falls into one cell
  • By creating the table, I can see the relative
    weight assigned to each cell (see the example
    below). Is this what I want?
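  As an illustration, a table of specifications for a short unit
  test might look like the one below. The content topics and item
  counts here are hypothetical; each cell counts the items at that
  intersection of content and mental process, and the totals show
  the relative weight given to each topic and each level.

                 Remember   Understand   Apply   Total
    Fractions        2           2         2       6
    Decimals         1           2         1       4
    Percents         1           1         3       5
    Total            4           5         6      15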

10
Alignment Analysis
Checking content validity in existing tests
  • These steps parallel the construction of your own
    good test and its table of specifications. There
    are some things to watch for and consider as you
    do this:
  • Be wary of using the summary outline provided by
    the test maker; examine the actual test items
  • Match items on the test with the content you are
    teaching; watch for mismatches
  • Items on the test you are not teaching
  • Content you are teaching that is not tested
  • This matching requires considerable judgment
  • The test does not have to cover every detail; it
    could be a representative sample
  • If stakes are high, use a panel of individuals

11
Opportunity to Learn
But was it taught . . .
  • An emerging idea related to content validity is a
    concern called "instructional validity." This
    relates to your behavior as a teacher. The content
    may be in the book; the content may be in the
    state standards . . . BUT . . . did you actually
    teach it? Some teachers skip items of
    instruction they don't like, don't understand, or
    don't have time for.
  • If related items appear on a test, this would
    reduce the validity of the test since the
    students had no opportunity to learn the
    knowledge or skill being assessed.

12
C. Criterion-Related Validity
While the term "test" is used, also think "measure" or
"procedure"
  • The basic idea: demonstrate the degree of
    accuracy of a test by comparing it with another
    test, measure, or procedure that has been
    demonstrated to be valid (i.e., a valued
    criterion).
  • Two general contexts
  • predictive validity - one measure is taken now,
    one later. The later test is known to be valid.
    This approach allows me to show my current test
    is valid by comparing it to a future valid test.
  • For example, a behind-the-wheel driving test has
    been shown to be an accurate test of driving
    skills. By comparing the scores on a written
    rules-of-the-road test with the scores from the
    driving test, the written test can be validated
    by using a criterion-related strategy.
  • concurrent validity - both measures are current.
    This approach allows me to show my test is valid
    by comparing it with an already valid test. I
    can do this if I can show my test varies directly
    with a measure of the same construct or
    inversely with a measure of an opposite
    construct.
  • The computed statistic in both cases is r
    (which we now call a "validity coefficient") and it
    has all the characteristics we have already
    discussed about correlation coefficients in
    general (see the sketch after this list).
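  To make the computation concrete, here is a minimal Python
  sketch of a concurrent validity check: correlating scores on a
  new test with scores on an established, valid measure. The
  score lists are hypothetical, for illustration only.

    # Compute a validity coefficient: the Pearson r between a new
    # test and an established criterion measure (hypothetical data).
    from statistics import correlation  # requires Python 3.10+

    new_test = [72, 85, 90, 66, 78, 88, 95, 70]   # test being validated
    criterion = [70, 80, 92, 60, 75, 85, 97, 68]  # established, valid measure

    r = correlation(new_test, criterion)  # the validity coefficient
    print(f"validity coefficient r = {r:.2f}")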

13
Special Considerations for Interpreting
Criterion-Related Validity
  • Group Variability
  • The greater the variability, the greater the r.
  • Reliability-Validity Relationship
  • Reliability limits validity; reliability is a
    prerequisite to validity (see the bound below)
  • Validity of the Criterion
  • How good is the criterion? Do you agree with the
    operational definition of the criterion?
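  The sense in which reliability limits validity can be stated
  precisely. A standard result of classical test theory bounds the
  validity coefficient by the reliabilities of the test x and the
  criterion y:

    $$ r_{xy} \le \sqrt{r_{xx}\, r_{yy}} $$

  So a test with reliability .64 can achieve a validity
  coefficient of at most .80, even against a perfectly reliable
  criterion.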

14
D. Construct Validity
  • When we ask about a test's construct validity, we
    are taking a broad view of the test. Does the
    test adequately measure the underlying,
    unobserved construct? The question is asked both
    in terms of
  • convergent validity: are test scores related to
    behaviors and tests that they should be related
    to? and
  • divergent validity: are test scores unrelated to
    behaviors and tests that they should be unrelated
    to?
  • There is no single measure of construct
    validity. Construct validity is based on the
    accumulation of knowledge about the test and its
    relationship to other tests and behaviors.
  • To establish construct validity, we demonstrate
    that the measure changes in a logical way when
    other conditions change (a small correlational
    sketch follows below).
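  As a hypothetical illustration of the convergent/divergent
  pattern, suppose we built a new math-anxiety scale: it should
  correlate strongly with an established anxiety measure
  (convergent) and only weakly with an unrelated vocabulary test
  (divergent). All data below are invented for the sketch.

    # Check convergent and divergent correlations (hypothetical data).
    from statistics import correlation  # requires Python 3.10+

    new_scale  = [12, 18, 25, 9, 30, 22, 15, 27]   # new math-anxiety scale
    anxiety    = [14, 20, 24, 10, 28, 21, 13, 26]  # established anxiety measure
    vocabulary = [55, 40, 48, 52, 45, 60, 41, 47]  # unrelated vocabulary test

    print(f"convergent r = {correlation(new_scale, anxiety):.2f}")    # expect high
    print(f"divergent  r = {correlation(new_scale, vocabulary):.2f}") # expect near zero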

15
E. Consequential Validity
A recent, controversial entry into the assessment lexicon . . .
  • Some professionals feel that, in the real world,
    the consequences that follow from the use of
    assessments are important indications of
    validity.
  • Some professionals feel that these consequences
    are matters of politics and policymaking:
    important considerations, yes, but not matters of
    validity.
  • On which side are we? As educators, we sometimes
    see the consequences as more important than the
    technical validity of the test. Judgments based
    on assessments we give and use have value
    implications and social consequences.
  • What is the intended use of these test scores?
  • How are the scores really being used?
  • Does this testing lead to educational benefits?
  • Are there negative spin-offs?

16
Test Fairness, Test Bias
  • Test fairness / test bias have the same meaning
    with opposite connotations
  • Fairness: an assessment or test measures a
    trait, construct, or target with equal validity
    for different groups.
  • Bias: the groups do not differ in terms of real
    status on the trait, construct, or target being
    assessed; yet the test suggests they do.

17
Methods of Reviewing Fairness
  • Test Companies (look in the test manual to see
    what a particular company did about test fairness
    issues on this test)
  • Panel review - most popular, but is this just
    face validity?
  • Differential item functioning (DIF) - examines
    subsets of items (see the sketch after this list)
  • Criterion-related validity - the whole test
  • Teacher-Created Assessments (teachers need to
    be knowledgeable about, and sensitive to, issues
    of test fairness)
  • Is there anything about my test that will
    unfairly advantage or disadvantage a student or
    group of students?
  • Is there anything about the mechanics of the test
    that calls for skills other than those I intend
    to measure?
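  To show the idea behind DIF in miniature, here is a simplified,
  purely illustrative sketch with invented data (operational DIF
  analyses use procedures such as the Mantel-Haenszel statistic):
  match examinees from two groups on total score, then see whether
  the matched groups still differ on a single item.

    # Simplified DIF-style check (illustrative only, hypothetical data):
    # within bands of matched total score, compare the proportion
    # answering one item correctly in each of two groups.
    examinees = [  # (group, total_score, item_correct)
        ("A", 18, 1), ("A", 18, 1), ("A", 12, 0), ("A", 12, 1),
        ("B", 18, 0), ("B", 18, 1), ("B", 12, 0), ("B", 12, 0),
    ]

    for band in sorted({total for _, total, _ in examinees}):
        for group in ("A", "B"):
            answers = [correct for g, total, correct in examinees
                       if g == group and total == band]
            if answers:
                print(f"total={band} group={group}: "
                      f"p(correct) = {sum(answers) / len(answers):.2f}")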

18
Practical Advice
  1. For building your own tests, think content
    validity.
  2. For judging externally prepared achievement tests,
    start with a clear definition of what's to be
    covered.
  3. For criterion-related validity, take into account
    group variability and think about the validity of
    the criterion.
  4. For test fairness (bias), distinguish between
    differences in groups' average scores and group
    status on the trait.
  5. For your own assessments, try to eliminate the
    influence of any factors not related to what you
    want to measure.

19
Terms & Concepts to Review and Study on Your Own (1)
  • alignment analysis
  • Bloom's taxonomy
  • concurrent validity
  • consequential validity
  • construct
  • construct irrelevant variance
  • construct underrepresentation
  • construct validity
  • content validity
  • criterion-related validity

20
Terms & Concepts to Review and Study on Your Own (2)
  • differential item functioning (DIF)
  • external criterion
  • face validity
  • fairness (or its opposite, bias)
  • instructional validity
  • opportunity to learn
  • predictive validity
  • table of specifications (two-way table)
  • validity
  • validity coefficient