Evaluation: Testing, Objective-to-Test-Item Matching and Judgments of Worth

1
Evaluation: Testing, Objective-to-Test-Item
Matching and Judgments of Worth
  • EDTEC 540
  • James Marshall

2
Session Overview
  • Evaluation Approaches
  • Testing: one possible data point in evaluation
  • Norm-referenced
  • Criterion-referenced
  • Objective-to-test-item matching
  • Measurement error, reliability and validity

3
Evaluation, typically
  • Typically, it doesn't happen! That said, it
    should.
  • And it is required for many funded projects
  • What happened? Were goals and objectives
    achieved? How can we find that out?
  • At the end is NOT the only time to measure worth.
    When else?
  • Strategies: tests, observations, surveys, chats
    with managers, look at work, results

4
Evaluation Approaches
  • Objectivist
  • Belief in a reality that can be known and
    measured. Prevalent in education and our
    business.
  • Objectives-based, deceptively simple. Establish
    goals -- set objectives -- tailor instruction to
    objectives -- judge effectiveness.
  • Measures are analytical/quantitative in nature.
  • Examples
  • Do first-graders know the letters of the
    alphabet?
  • Can the new account representative describe the
    features of each checking account as defined by
    the bank?
  • Others?
  • Advantages/disadvantages?

5
Evaluation Approaches
  • Constructivist
  • Belief that people construct their own realities.
    Advocates believe that truth is a matter of
    consensus, not measurement against an objective
    reality.
  • Evaluation creates detailed descriptions of that
    which is inside the head of the learner.
  • Reliance upon open-ended exercises, observation,
    cases and immersion in the field.
  • Observation is useful for us, in that IDs build
    prototypes, conduct formative evaluations, revise
    and cycle again.
  • Measures are qualitative in nature.
  • Examples
  • Role play exercise to deal with a hostile
    customer
  • Theme Park Tycoon: running a theme park for a
    year
  • Essay question asking you to describe your
    understanding of Educational Technology
  • Advantages/disadvantages?

6
Evaluation Approaches
  • Postmodern/Critical
  • Objectivists proclaim objectivity.
    Constructivists approve of subjectivity.
    Postmoderns are social activists.
  • Focus on questions of power: Who are you to set
    objectives for others? Use of deconstruction to
    see what's inside texts and materials.
  • Most interested in the hidden curriculum, such as
    the teaching of traditional gender roles.
  • What does the curriculum teach?
  • Why should IDs care about this evaluation
    approach?

7
Evaluation Frameworks: Kirkpatrick's Model
  • Level 4: Does it matter? Does it advance
    strategy?
  • Level 3: Are they doing it (objectives)
    consistently and appropriately?
  • Level 2: Can they do it (objectives)? Do they
    show the skills and abilities?
  • Level 1: Did they like the experience?
    Satisfaction? Use? Repeat use?

8
Evaluation Frameworks: CIPP
  • Context: assesses program/product needs, problems
    or opportunities specific to the project
    environment.
  • Input: assesses, evaluates and allocates project
    resources in order to meet identified needs and
    objectives, solve problems, and optimize program
    impact.
  • Process: assesses project implementation.
  • Product: assesses planned and unintended
    (unforeseen) outcomes, both to keep a project on
    track and to determine effectiveness or impact.

9
Types of Tests
  • Used to evaluate changes in skills and knowledge
  • Is testing alone sufficient?

10
Test Types: Norm-Referenced
  • Compare an individual's performance to the
    performance of other people.
  • Require varying item difficulties.
  • Assume not everybody is going to "get it"
  • Discern those who "got it" from those who didn't.

11
Normal Distribution

12
Test Types: Norm-Referenced
  • Norm-referenced tests compare the individual to
    the group.
  • Accomplished statistically by norming the test
    with large numbers of people.
  • Consider
  • You sat for the GRE and received the following
    scores. You need to retake the test.
  • What is your study plan?
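
The comparison of an individual to a norm group can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `norm_referenced_report` and every score in `norm_group` are made up for the example, not taken from any real test. It reports a z-score, the share of the norm group scoring below the individual, and the percentile implied by a normal curve with the same mean and spread.

```python
from statistics import NormalDist, mean, stdev


def norm_referenced_report(score, norm_group):
    """Describe one score relative to a norm group (hypothetical helper)."""
    mu = mean(norm_group)
    sigma = stdev(norm_group)
    # Distance from the group mean, in standard deviations.
    z = (score - mu) / sigma
    # Percentage of the norm group scoring strictly below this score.
    below = sum(1 for s in norm_group if s < score)
    percentile_rank = 100 * below / len(norm_group)
    # Percentile implied by a normal curve fitted to the norm group.
    normal_percentile = 100 * NormalDist(mu, sigma).cdf(score)
    return {
        "z": z,
        "percentile_rank": percentile_rank,
        "normal_percentile": normal_percentile,
    }


# Made-up norm group; a real test is normed on far more people.
norm_group = [400, 450, 500, 500, 550, 600, 650, 700]
report = norm_referenced_report(600, norm_group)
print(report)
```

Note that the report says where the person stands relative to others, but nothing about which specific skills need work.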

13
Test Types: Norm-Referenced
  • Limitations
  • Not especially helpful for
  • identifying individual skill deficiencies
  • identifying weaknesses in the instruction

14
Test Types: Criterion-Referenced
  • Compares an individual's performance to the
    acceptable standard of performance for those
    tasks.
  • Requires completely specified objectives.
  • Asks: Can this person do what has been
    specified in the objectives?
  • Results in yes-no decisions about competence.
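
A yes-no decision per objective could be sketched as below. The function `mastery_decisions`, the objective names, and the learner's results are hypothetical. The cutoffs (45 states, 2 objectives) are borrowed from the example objectives that appear elsewhere in this deck.

```python
def mastery_decisions(results, criteria):
    """Return a yes/no competence decision for each objective.

    results:  objective name -> number the learner got right
    criteria: objective name -> minimum number required for mastery
    """
    return {
        objective: results.get(objective, 0) >= cutoff
        for objective, cutoff in criteria.items()
    }


# Standards come from the objectives, not from other learners' scores.
criteria = {"label_states": 45, "write_objectives": 2}
learner = {"label_states": 47, "write_objectives": 1}  # made-up results
decisions = mastery_decisions(learner, criteria)
print(decisions)
```

Because each decision is tied to one objective, the output points directly at the skill that still needs work.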

15
Test Types: Criterion-Referenced
  • Applications
  • Diagnosis of individual skill deficiencies
  • Certification of skills
  • Evaluation and revision of instruction
  • Limitations
  • Tend to focus on specific skills
  • Results may not reflect general aptitudes
  • Everyone may get an A

16
Which Test is Which?
NR or CRT?
  • IQ test
  • GRE
  • SDSU Writing Competency
  • Red Cross Lifesaving Certificate
  • EDTEC 540 midterm and final exams

17
Which Test is Which?
NR or CRT?
  • Give out a CA driver's license
  • Pick students for Russian lang. training
  • Determine entrance into medical school
  • PADI Scuba Certification
  • Select one EDTEC scholarship recipient
  • Figure out where to revise a course
  • Decide which students need remediation

18
Utility of Test Scores
  • Selection screening (before)
  • mastery of prerequisites -- for
    remediation/placement
  • mastery of course objectives -- for acceleration
    (testing out)
  • Individual diagnosis and prescription (along the
    way)
  • Practice (along the way)
  • Grades: summative scores (at or after the end)
  • promotion
  • certification and licensure
  • Administrative
  • course evaluation
  • trainer accountability

19
Criterion-referenced Test Items
Objectives and matching items
  • Objective: Given a map of the USA with state
    borders marked, the learner will be able to
    (LWBAT) write the abbreviation for 45 of 50
    states in 15 minutes.
  • Item: Here is a map of the USA with the states
    outlined, but no names. Use the state
    abbreviations and fill them in. You've got 15
    minutes to get at least 45.
  • Objective: Given a pair of well-worn shoes, the
    LWBAT identify what's wrong with the shoes and
    the tools and materials necessary to fix them.
  • Item: Take a look at this pair of shoes. What
    problems do you see? What will you need to fix
    them?
  • Objective: Given a goal, the LWBAT write at least
    two appropriate objectives with proper ABCD
    parts.
  • Item: The goal of the instruction is "IDs will
    know how to write resumes." Write at least 2
    objectives with all four parts.

20
Matching Test Items to Objectives
  • Matching ensures validity
  • Validity is the extent to which the test measures
    what is important to performance. Does a high
    score on the test equate to high performance on
    the job?
  • The validity of a criterion-referenced test is
    enhanced when
  • objectives match real-world performances (based
    on solid analysis)
  • test items match stated objectives (including
    condition).

21
Match, or Not?
  • Given any stocked fruit or vegetable, the Ralphs
    Grocery Checker will be able to verbally state
    the code which matches the produce provided with
    100% accuracy.
  • Here is a persimmon from the produce department
    and the produce code job aid. Please state the
    produce code for this item. You may examine the
    persimmon and reference the job aid.

22
Match, or Not?
  • Given a tree in need of pruning, the gardener's
    apprentice will be able to select the correct
    tree pruning device, based upon the type of tree
    presented.
  • Here is an overgrown elm tree. Please select the
    appropriate tool with which you will prune the
    tree.

23
Match, or Not?
  • Given a descriptive order for a Café Mocha,
    including size, caf/decaf, type of milk, the
    barista will be able to create the drink as
    specified in the Starbucks Guide to Coffee
    Creations.
  • A customer has just ordered a Grande, non-fat,
    mocha. Please list the ingredients you will
    need, and describe the steps you would take to
    create the drink.

24
Evaluating a Training Program
  • Consider
  • Your evaluation uses a criterion-based test to
    see if the new account representatives can
    describe the different types of accounts offered
    by the bank.
  • All representatives were able to meet the
    specified criteria
  • Case closed? Or do you want to know more?

25
Ideas in Testing
  • Measurement Error
  • Validity
  • Reliability

26
Measurement Error
  • Many causes
  • mechanical or scoring errors
  • poor wording (confusing, ambiguous)
  • poor subject matter, content (validity)
  • score variation from one time to another
    (reliability)
  • score variation from "equivalent" tests
  • test administration procedure
  • inter-rater reliability
  • mood of the student

27
Validity
  • Does the test assess what's important? Does it
    really seek out the skill and knowledge linked to
    the world? (content validity)
  • Types
  • Content Validity (most important to us)
  • Predictive Validity (e.g. SAT, GRE)

28
Reliability
  • Are the scores produced by the test trustworthy
    and stable over time?
  • Assessed by
  • parallel (equivalent) forms or test-retest
  • internal consistency
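
Both approaches can be illustrated with a short sketch. It rests on assumptions that are not in the slides: test-retest reliability is estimated with a Pearson correlation between two administrations, internal consistency with Cronbach's alpha, and all scores are made up. The helper names `pearson_r` and `cronbach_alpha` are hypothetical.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two lists of paired scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(item_scores):
    """item_scores: one row per learner, one column per test item."""
    k = len(item_scores[0])
    # Variance of each item across learners.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of learners' total scores.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Test-retest: the same five learners tested twice (made-up scores).
first = [10, 12, 15, 18, 20]
second = [11, 12, 14, 19, 20]
r = pearson_r(first, second)

# Internal consistency: five learners, four items scored 1 or 0.
items = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(items)
print(round(r, 3), round(alpha, 3))
```

Values closer to 1 suggest more stable or more consistent scores. What counts as acceptable depends on how the scores will be used.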

29
Testing and Evaluation
  • A Look Ahead
  • ED 690 Procedures of Investigation
  • Provides introduction to evaluation procedures
    and methods
  • Introduces research process, statistical analysis
  • ED 791A, 791B, 791C
  • Evaluation sequence most often completed by EDTEC
    students, rather than writing a thesis
  • Conduct a full-scale evaluation (design,
    research, report) for a living, breathing client
    over a two-semester timeframe