Designing a Classroom Test - PowerPoint PPT Presentation

1
Designing a Classroom Test
  • Anthony Paolo, PhD
  • Director of Assessment & Evaluation
  • Office of Medical Education
  • Psychometrician for CTC
  • Teaching & Learning Technologies
  • September 2008

2
Content
  • Purpose of classroom test
  • Test blueprint specifications
  • Item writing
  • Assembling the test
  • Item analysis

3
Purpose of Classroom Test
  • Establish basis for assigning grades
  • Determine how well each student has achieved
    course objectives
  • Diagnose student problems
  • Identify areas where instruction needs
    improvement
  • Motivate students to study
  • Communicate what material is important

4
Test Blueprint
  • To ensure the test assesses what you want to measure
  • To ensure the test assesses the level or depth of learning you want to measure

5
Bloom's Revised Cognitive Taxonomy
  • Remembering & Understanding
  • Remembering: Retrieving, recognizing, and recalling relevant knowledge.
  • Understanding: Constructing meaning from information through interpreting, classifying, summarizing, inferring, explaining.
  • ITEM TYPES: MC, T/F, Matching, Short Answer
  • Applying & Analyzing
  • Applying: Implementing a procedure or process.
  • Analyzing: Breaking material into constituent parts, determining how the parts relate to one another and to an overall structure or purpose through differentiating, organizing, and attributing.
  • ITEM TYPES: MC, Short Answer, Problems, Essay
  • Evaluating & Creating
  • Evaluating: Making judgments based on criteria and standards through checking and critiquing.
  • Creating: Putting elements together to form a coherent or functional whole; reorganizing elements into a new pattern or structure through generating, planning, or producing.
  • ITEM TYPES: MC, Essay

6
Test Blueprint
7
Test Specifications
  • To ensure the test covers the content and/or
    objectives in the proper proportions
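As a rough illustration of turning blueprint proportions into item counts, a minimal Python sketch follows; the content areas, weights, and test length are hypothetical, not taken from the blueprint slides:

# Hypothetical content areas and weights; total_items is also illustrative.
blueprint = {
    "Content area A": 0.40,
    "Content area B": 0.35,
    "Content area C": 0.25,
}
total_items = 40

# Allocate items to each area in proportion to its weight.
for area, weight in blueprint.items():
    print(f"{area}: {round(weight * total_items)} items")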

8
Test Specifications
9
Item Writing General Guidelines (1)
  • Present a single, clearly defined problem that is based on a significant concept rather than trivial or esoteric ideas
  • Use simple, precise, unambiguous wording
  • Exclude extraneous or irrelevant information
  • Eliminate any systematic pattern of answers that may allow guessing correctly

10
Item Writing General Guidelines (2)
  • Avoid cultural, racial, ethnic, and sexual bias.
  • Avoid presupposed knowledge that favors one group over another (a "fly ball" item favors those who know baseball)
  • Refrain from providing unnecessary clues to the correct answer.
  • Avoid negatively phrased items (e.g., except, not)
  • Arrange answers in alphabetical/numerical order

11
Item Writing General Guidelines (3)
  • Avoid "None of the above" or "All of the above" type answers
  • Avoid "Both A & B" or "Neither A nor B" type answers

12
Item Writing: The Correct Answer
  • Is longer
  • Is more qualified or more general
  • Uses familiar phraseology
  • Is grammatically consistent with the item stem
  • Is one of two similar statements
  • Is one of two opposite statements

13
Item Writing: The Wrong Answer
  • Is usually the first or last option
  • Contains extreme words (always, never, nonsense, etc.)
  • Contains unexpected language or technical terms
  • Contains flippant remarks or completely unreasonable statements

14
Item Writing Grammatical Cues
15
Item Writing Logical Cues
16
Item Writing Absolute Terms
17
Item Writing Word Repeats
18
Item Writing Vague Terms
19
Item Writing Vague Terms
20
Item Writing
  • Effective test items match the desired depth of
    learning as directly as possible
  • Applying & Analyzing
  • Applying: Implementing a procedure or process.
  • Analyzing: Breaking material into constituent parts, determining how the parts relate to one another and to an overall structure or purpose through differentiating, organizing, and attributing.
  • ITEM TYPES: MC, Short Answer, Problems, Essay

21
Comparison of MC & Essay (1)
22
Comparison of MC & Essay (2)
23
Item Writing - Application
  • MC application of knowledge items tend to have
    long vignettes that require decisions.
  • Case et al. at the NBME investigated the impact of increasing levels of interpretation, analysis, and synthesis required to answer a question on item performance.
  • (Academic Medicine, 1996;71:528-530)

24
Item Writing - Application
25
Item Writing - Application
26
Item Writing - Application
27
Preparing & Assembling the Test
  • Provide general directions
  • Time allowed (allow enough time to complete test)
  • How items are scored
  • How to record answers
  • How to record name/ID
  • Arrange items systematically
  • Provide adequate space for short answer and essay
    responses
  • Placement of easier & harder items

28
Interpreting test scores
  • Teachers
  • High scores = good instruction
  • Low scores = poor students
  • Students
  • High scores = smart, well-prepared
  • Low scores = poor teaching, bad test

29
Interpreting test scores
  • High scores may also reflect a test that was too easy, measured only simple educational objectives, biased scoring, cheating, or unintentional clues to right answers
  • Low scores may also reflect a test that was too hard, tricky questions, content not covered in class, grader bias, or insufficient time to complete the test

30
Item Analysis
  • Main purpose of item analysis is to improve the
    test
  • Analyze items to identify
  • Potential mistakes in scoring
  • Ambiguous/tricky items
  • Alternatives that do not work well
  • Problems with time limits

31
Reliability
  • The reliability of a test refers to the extent to
    which a test is likely to produce consistent
    results.
  • Test-Retest
  • Split-Half
  • Internal consistency
  • Reliability coefficients range from 0 (no
    reliability) to 1 (perfect reliability)
  • Internal consistency usually measured by Kuder-Richardson 20 (KR-20) or Cronbach's coefficient alpha
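As a minimal sketch (not taken from the presentation or any particular scoring package), KR-20 can be computed directly from a students-by-items matrix of 0/1 item scores; the score matrix below is made up for illustration:

import numpy as np

def kr20(responses):
    # KR-20 = (k / (k - 1)) * (1 - sum(p * q) / variance of total scores)
    X = np.asarray(responses, dtype=float)
    k = X.shape[1]                         # number of items
    p = X.mean(axis=0)                     # proportion correct per item
    q = 1.0 - p
    total_var = X.sum(axis=1).var(ddof=1)  # sample variance of total scores
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_var)

# Hypothetical 5-student, 4-item score matrix.
scores = [[1, 1, 0, 1],
          [1, 0, 0, 1],
          [0, 0, 0, 1],
          [1, 1, 1, 1],
          [1, 1, 0, 0]]
print(round(kr20(scores), 2))   # about 0.59 for these made-up data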

32
Internal Consistency Reliability
  • High reliability means that the questions on the test tended to "hang together": students who answered a given question correctly were more likely to answer other questions correctly.
  • Low reliability means that the questions tended to be unrelated to each other in terms of who answered them correctly.

33
Reliability Coefficient Interpretation
  • General guidelines for homogeneous tests
  • .80 and above: Very good reliability
  • .70 to .80: Good reliability; a few test items may need to be improved
  • .50 to .70: Somewhat low; several items will likely need improvement (unless it is a short test of 15 or fewer items)
  • .50 and below: Questionable reliability; the test likely needs revision

34
Item difficulty (1)
  • Proportion of students who got the item correct, expressed as a percentage (ranges from 0 to 100)
  • Helps evaluate whether an item is suited to the level of examinee being tested.
  • Very easy or very hard items cannot adequately discriminate between student performance levels.
  • Spread of student scores is maximized with items of moderate difficulty.
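A minimal sketch of the difficulty calculation, assuming a students-by-items matrix of 0/1 scores (the data are hypothetical, not LXR output):

import numpy as np

# Hypothetical students x items matrix of 0/1 item scores.
responses = np.array([[1, 1, 0, 1],
                      [1, 0, 0, 1],
                      [0, 1, 0, 1],
                      [1, 1, 1, 1],
                      [1, 0, 0, 0]])

# Difficulty = percentage of students answering each item correctly (0 to 100).
difficulty = 100 * responses.mean(axis=0)
print(difficulty)   # [80. 60. 20. 80.]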

35
Item difficulty (2)
  • Moderate item difficulty is the point halfway
    between a perfect score and a chance score.
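For example, taking the chance score as 100 divided by the number of options, the halfway point works out as follows (a small illustrative calculation):

# Moderate difficulty = halfway between a perfect score (100) and the chance score.
def moderate_difficulty(n_options):
    chance = 100.0 / n_options       # chance score for an n-option item
    return (100.0 + chance) / 2.0

print(moderate_difficulty(4))   # 62.5 -> about 63% correct for a 4-option MC item
print(moderate_difficulty(2))   # 75.0 -> true/false items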

36
Item discrimination (1)
  • How well the item separates those who know the material from those who do not.
  • In LXR, measured by the Point-Biserial (rpb) correlation (ranges from -1 to 1).
  • rpb is the correlation between item and exam performance

37
Item discrimination (2)
  • A positive rpb means that those scoring higher on the exam were more likely to answer the item correctly (better discrimination)
  • A negative rpb means that high scorers on the exam answered the item wrong more frequently than low scorers (poor discrimination)
  • A desirable rpb correlation is 0.20 or higher.
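A minimal sketch of the point-biserial calculation and the 0.20 screen, assuming a students-by-items matrix of 0/1 scores; whether the item is removed from the total score before correlating varies by package (LXR's exact method is not shown here), and the data are hypothetical:

import numpy as np

def point_biserial(responses, item_index):
    # Correlation between one item's 0/1 scores and the total test score.
    # The item is left in the total here; some packages exclude it first.
    X = np.asarray(responses, dtype=float)
    item = X[:, item_index]
    total = X.sum(axis=1)
    return np.corrcoef(item, total)[0, 1]

scores = np.array([[1, 1, 0, 1],
                   [1, 0, 0, 1],
                   [0, 0, 0, 1],
                   [1, 1, 1, 1],
                   [1, 1, 0, 0]])

# Flag items below the 0.20 guideline.
for i in range(scores.shape[1]):
    rpb = point_biserial(scores, i)
    note = "" if rpb >= 0.20 else "  <- review"
    print(f"item {i + 1}: rpb = {rpb:.2f}{note}")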

38
Evaluation of Distractors
  • Distractors are designed to fool those who do not know the material; those who do not know the answer guess among the choices.
  • Distractors should be equally popular.
  • (Expected count per distractor = number who answered the item wrong / number of distractors)
  • Distractors ideally have a low or negative rpb
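A small sketch of the expected-versus-observed distractor check for a single five-option item; the keyed answer and response data below are hypothetical:

from collections import Counter

# Hypothetical responses to one five-option item; the keyed answer is 'B'.
choices = list("BBADBBCBEABBDBCB")
correct = "B"

wrong = [c for c in choices if c != correct]
n_distractors = 4                                  # options A, C, D, E

# Expected count per distractor = number answering wrong / number of distractors.
expected = len(wrong) / n_distractors
print("expected per distractor:", expected)        # 1.75
print("observed counts:", Counter(wrong))          # compare with the expected count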

39
LXR Example 1 (correct answer)
Very easy item; would probably review the alternatives to make sure they are not ambiguous and/or do not provide clues that they are wrong.
40
LXR Example 2 (correct answer)
Three of the alternatives are not functioning well; would review them.
41
LXR Example 3 (correct answer)
Probably a miskeyed item. The correct answer is
likely option E.
42
LXR Example 4 (correct answer)
Relatively hard item with good discrimination. Would review alternatives C & D to see why they attract relatively low and high numbers of students.
43
LXR Example 5 (correct answer)
Poor discrimination for correct choice B.
Choice E actually does a better job
discriminating. Would review item for proper
keying, ambiguous wording, proper wording of
alternatives, etc. This item needs revision.
44
Resources
  • Constructing Written Test Questions for the Basic and Clinical Sciences (www.nbme.org)
  • How to Prepare Better Multiple-Choice Test Items: Guidelines for University Faculty (Brigham Young University: testing.byu.edu/info/handbooks/betteritems.pdf)

45
Thank you for your time. Questions?