Transcript and Presenter's Notes

Title: Construction and Analysis of a Test

1
ITEM ANALYSIS
2
CONSTRUCTION AND ANALYSIS OF A TEST
3
Characteristics of a Good Test
  • Validity
  • It refers to the appropriateness or truthfulness of a tool. A tool is valid if it measures what it is supposed to measure.
  • Reliability
  • It refers to the trustworthiness or consistency of measurement of a tool, whatever it measures.

4
  • Objectivity
  • Refers to the absence of subjective bias in the interpretation of responses obtained by a tool.
  • Economy
  • The test should be simple and administered in a short time, saving money and time.

5
  • Practicability or Feasibility
  • The test should not require special infrastructure such as a dark room or a one-way see-through room.

6
Decision to gather evidence
↓
Decision to allocate resources
↓
Content analysis and test blueprint
↓
Item writing
↓
Item review 1
↓
Planning item scoring
↓
Production of trial tests
↓
Trials
↓
Item review 2
↓
Amendment (revise / replace / discard)
↓
More items needed? (if yes, return to item writing)
↓ No
Assembly of final tests
7
Trial Test
  • It involves time and resources.
  • Prepare the content analysis and blueprint.
  • Review each item before trial testing.

8
Content Analysis
  • Which area of the curriculum is selected?
  • Are there significant sections in the content?
  • Are there significant subdivisions in the content?
  • Which representative areas should be included?

9
Blueprint
  • Title
  • Fundamental purpose
  • The aspects of the curriculum covered
  • For whom the test is constructed
  • Time, date, who will administer and who will score
  • Weightage for recall, comprehension and reflective thinking

10
Blueprint
Content        Recall     Comprehension   Critical thinking   Total
Prose          2 items    2 items          5 items              9
Poetry         2 items    4 items          5 items             11
Grammar        2 items    4 items         12 items             18
Criticism      4 items    2 items          -                    6
Comparisons    4 items    2 items          -                    6
Total         14 items   14 items         22 items             50
11
Item Specification
Content        Recall                Comprehension          Critical thinking                                      Total
Prose          Items 2, 5            Items 12, 23           Items 28, 31, 32, 40, 50                                 9
Poetry         Items 6, 10           Items 13, 14, 16, 17   Items 33, 36, 37, 38, 39                                11
Grammar        Items 1, 7            Items 18, 19, 20, 21   Items 21, 29, 30, 41, 42, 43, 44, 45, 46, 47, 48, 49    18
Criticism      Items 3, 4, 8, 9      Items 34, 35           -                                                        6
Comparisons    Items 11, 15, 22, 25  Items 26, 27           -                                                        6
Total          14 items              14 items               22 items                                                50
12
Scoring Key
Item:            1  2  3  4  5  6  7  8  9  10
Correct option:  2  5  1  2  3  4  1  4  5   3
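
As an illustration, a minimal sketch of scoring an answer sheet against such a key; the pupil's responses and the function name are hypothetical, since the deck prescribes no particular implementation:

```python
# Score one pupil's answer sheet against the fixed key shown above:
# one point for each response that matches the keyed option.
key = {1: 2, 2: 5, 3: 1, 4: 2, 5: 3, 6: 4, 7: 1, 8: 4, 9: 5, 10: 3}

def score(responses):
    """responses: item number -> option chosen by the pupil."""
    return sum(1 for item, correct in key.items() if responses.get(item) == correct)

pupil = {1: 2, 2: 4, 3: 1, 4: 2, 5: 3, 6: 4, 7: 2, 8: 4, 9: 5, 10: 1}
print(score(pupil))  # -> 7 (three responses do not match the key)
```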
13
Item Review 1
  • Dependable inferences can be made about the choice of content.
  • All important parts of the curriculum are addressed.
  • Achievement over the whole range is assessed.

14
How to review?
  • Is the item clear in expression?
  • Are the items expressed in the simplest possible language?
  • Are there unintended clues to the correct answer?
  • Is the format reasonably consistent?
  • Is there a single, clearly correct answer for each item?
  • Is the type of item appropriate to the information required?
  • Are there enough items to provide adequate coverage of the behaviour to be assessed?

15
Purpose of the Trial Test
  • Establishes the difficulty of each item.
  • Identifies distracters that do not appear plausible.
  • Suggests the number of items to be included in the final test.
  • Establishes the contribution of each item to the discrimination between low- and high-achieving candidates.
  • Checks the adequacy of the administration instructions.
  • Identifies misconceptions held by the students through analysis of their responses.

16
Choosing a Sample
  • A sample of 100 to 150 students of varied abilities may be selected.
  • Male and female students should be approximately equal in number.
  • Judgment sampling technique, drawn from the target group.

17
Try out of the Test
  • The test is administered to a representative sample chosen from the target population for whom the test is intended, and scored. This pilot study is useful for the following:
  • To identify weak or defective items and reveal needed improvements.
  • To determine the difficulty level and discriminating power of each item so that a selection of items may be made.

18
  • To provide data needed to determine an appropriate time limit for the final test.
  • To standardize the instruction and procedures.
  • To know how to organize the items.
  • To decide the proper format.

19
Scoring the Trial Test
  • Needs training.
  • Not according to the scorer's judgment.
  • Refer to the scoring key.
  • Mechanical scoring is recommended to maintain accuracy.

20
Scores in the Matrix
Item    GEET  RAI  RAJU  RANI  SURI  POO  RITA  JOE  CATH  RUTH  Total
1          1    1     1     1     1    0     1    0     0     1      7
2          1    0     0     1     0    1     0    0     0     0      3
3          1    1     1     1     1    1     1    1     0     0      8
4          1    1     1     0     1    0     1    1     1     0      7
5          1    1     1     1     1    1     1    1     1     1     10
6          1    1     0     0     1    1     0    0     1     0      5
7          1    1     1     1     0    1     0    1     0     0      6
8          1    0     1     0     0    0     0    1     0     0      3
9          1    0     0     0     0    1     0    0     0     0      2
10         1    1     1     1     1    0     1    0     0     0      6
Total     10    7     7     6     6    6     5    5     3     2     57
21
Arranging Pupils
  • After scoring the trial test, individuals are placed in order from the highest to the lowest total score.

22
Arranging Pupils' Scores
Item    GEET  RAI  RAJU  RANI  SURI  POO  RITA  JOE  CATH  RUTH  Total
5          1    1     1     1     1    1     1    1     1     1     10
3          1    1     1     1     1    1     1    1     0     0      8
1          1    1     1     1     1    0     1    0     0     1      7
4          1    1     1     0     1    0     1    1     1     0      7
7          1    1     1     1     0    1     0    1     0     0      6
10         1    1     1     1     1    0     1    0     0     0      6
6          1    1     0     0     1    1     0    0     1     0      5
2          1    0     0     1     0    1     0    0     0     0      3
8          1    0     1     0     0    0     0    1     0     0      3
9          1    0     0     0     0    1     0    0     0     0      2
Total     10    7     7     6     6    6     5    5     3     2     57
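
A minimal sketch of this rearrangement in code, using the matrix from the previous slide: pupils are ordered high to low by total score, and items by the number of correct responses:

```python
# 0/1 trial-test matrix: rows = items, columns = pupils (from the table above).
pupils = ["GEET", "RAI", "RAJU", "RANI", "SURI", "POO", "RITA", "JOE", "CATH", "RUTH"]
matrix = {
    1:  [1, 1, 1, 1, 1, 0, 1, 0, 0, 1],
    2:  [1, 0, 0, 1, 0, 1, 0, 0, 0, 0],
    3:  [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    4:  [1, 1, 1, 0, 1, 0, 1, 1, 1, 0],
    5:  [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    6:  [1, 1, 0, 0, 1, 1, 0, 0, 1, 0],
    7:  [1, 1, 1, 1, 0, 1, 0, 1, 0, 0],
    8:  [1, 0, 1, 0, 0, 0, 0, 1, 0, 0],
    9:  [1, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    10: [1, 1, 1, 1, 1, 0, 1, 0, 0, 0],
}

# Pupil totals (column sums) and item totals (row sums).
pupil_totals = {p: sum(row[i] for row in matrix.values()) for i, p in enumerate(pupils)}
item_totals = {item: sum(row) for item, row in matrix.items()}

# Pupils ordered high -> low by score; items ordered easiest -> hardest.
ranked_pupils = sorted(pupils, key=pupil_totals.get, reverse=True)
ranked_items = sorted(matrix, key=item_totals.get, reverse=True)
print(ranked_items)  # -> [5, 3, 1, 4, 7, 10, 6, 2, 8, 9], as in the table above
```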
23
Indices of difficulty and discriminating power of
items
  • The top 27% of scorers constitutes the high achieving group and the bottom 27% constitutes the low achieving group.
  • The indices of discriminating power and difficulty level are computed for each item of the test using the following formulae.

24
Analysis of an Item
Pupil:     1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18
Response:  0  1  0  1  1  1  0  1  1   1   1   1   1   1   1   1   1   1
(1 = answered correctly, 0 = answered incorrectly)
25
  • Discriminating power = (Ph - Pl) / U
  • Difficulty level = (Ph + Pl) / U
  • Ph = number of pupils in the high achieving group who answered the item correctly.
  • Pl = number of pupils in the low achieving group who answered the item correctly.
  • U = total number of pupils in both groups.
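
A small computational sketch of the two indices. Reading Ph and Pl as counts of correct answers (rather than proportions) is an assumption, made so that dividing by U is meaningful and consistent with the worked examples on the following slides:

```python
def item_indices(ph, pl, u):
    """Discriminating power (Ph - Pl) / U and difficulty level (Ph + Pl) / U.

    ph, pl: number of correct answers in the high / low achieving group
    u: total number of pupils in both groups combined
    """
    return (ph - pl) / u, (ph + pl) / u

d, p = item_indices(ph=35, pl=15, u=100)
print(d, p)  # -> 0.2 0.5: an average item of average difficulty
```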

26
Types of Discriminators
  • Positive discriminator (the high achieving group answers the item correctly more often than the low achieving group)
  • Negative discriminator (the low achieving group outperforms the high achieving group)
  • Non-discriminator (no difference between the groups)

27
Graphical Analysis of Scores
  • Acceptable (or marginally acceptable) correct-answer response patterns.
  • Non-acceptable correct-answer response patterns.

28
Criteria for Selection
Discriminating power                       Difficulty level
0.4 and above   Excellent item             0.4 - 0.6   Average difficulty
0.3 - 0.4       Good item                  0.2 - 0.4   Difficult item
0.2 - 0.3       Average item               0.6 - 0.8   Easy item
0.1 - 0.2       Requires improvement       0.8 - 1.0   Very easy item
Below 0.1       Item to be dropped         0.0 - 0.2   Very difficult item
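
These bands translate directly into a lookup; here is a minimal sketch (behaviour at the exact cut points is a judgment call, since the slide leaves the boundaries unspecified):

```python
def rate_discrimination(d):
    """Map a discrimination index to the verbal label in the table above."""
    if d >= 0.4: return "Excellent item"
    if d >= 0.3: return "Good item"
    if d >= 0.2: return "Average item"
    if d >= 0.1: return "Requires improvement"
    return "Item to be dropped"

def rate_difficulty(p):
    """Map a difficulty index to the verbal label in the table above."""
    if p < 0.2:  return "Very difficult item"
    if p < 0.4:  return "Difficult item"
    if p <= 0.6: return "Average difficulty"
    if p <= 0.8: return "Easy item"
    return "Very easy item"

print(rate_discrimination(0.35), "/", rate_difficulty(0.55))
# -> Good item / Average difficulty
```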
29
Is this a good item ?
  • Compute the difficulty and discrimination indices for an item administered to 263 pupils, where 74 pupils answered the item correctly: 32 pupils in the upper group and 23 pupils in the lower group passed the item.
  • Is this a good item?

30
Is this a good item ?
  • Compute the difficulty and discrimination indices of a test item administered to 84 pupils if 52 test takers answered the item correctly: 20 in the upper group and 12 in the lower group.
  • Is this a good item?
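
A hedged worked sketch for both exercises, assuming the formula from slide 25 with Ph and Pl as counts, and upper and lower groups formed by the 27% rule from slide 23; the group sizes (about 71 and 23 pupils respectively) are an assumption, since the slides do not state them:

```python
def item_indices(ph, pl, u):
    return (ph - pl) / u, (ph + pl) / u

# Exercise 1: 263 pupils; 27% of 263 is about 71 per group, so U = 142.
d1, p1 = item_indices(ph=32, pl=23, u=142)
print(round(d1, 2), round(p1, 2))  # -> 0.06 0.39: weak discriminator, difficult item

# Exercise 2: 84 pupils; 27% of 84 is about 23 per group, so U = 46.
d2, p2 = item_indices(ph=20, pl=12, u=46)
print(round(d2, 2), round(p2, 2))  # -> 0.17 0.7: needs improvement, easy item
```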

31
Selection of Items
  • Based on the calculated values of item discrimination and difficulty, appropriate items are chosen for the final form of the standardized test.
  • Arrange the items in increasing order of difficulty.

32
Assembly of the Test in the Final Form
  • Items are first chosen on the basis of discriminating power, and from among these, items with the proper difficulty level are selected for the final form (a selection sketch follows below).
  • Care should be taken to see that at least 50% of the items are of average difficulty, 25% are easy, 20% are difficult and 5% are very difficult.
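
A minimal sketch of this two-stage selection, discrimination first and then difficulty quotas; the 0.2 discrimination cut-off and the pool format are illustrative assumptions, not prescribed by the slide:

```python
def assemble_final_form(pool, n_final):
    """pool: list of (item_id, discrimination, difficulty) tuples."""
    # Stage 1: keep only items that discriminate, best first.
    keep = sorted((it for it in pool if it[1] >= 0.2),
                  key=lambda it: it[1], reverse=True)
    # Stage 2: fill difficulty quotas -- 50% average, 25% easy,
    # 20% difficult, 5% very difficult.
    quotas = {"average": n_final * 50 // 100, "easy": n_final * 25 // 100,
              "difficult": n_final * 20 // 100, "very difficult": n_final * 5 // 100}
    chosen = []
    for item_id, d, p in keep:
        band = ("very difficult" if p < 0.2 else "difficult" if p < 0.4
                else "average" if p <= 0.6 else "easy" if p <= 0.8 else "very easy")
        if quotas.get(band, 0) > 0:      # very easy items have no quota, so skipped
            quotas[band] -= 1
            chosen.append(item_id)
    return chosen                        # may fall short if the pool is too small
```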

33
  • A detailed scoring scheme is also to be prepared, so as to ensure objective evaluation of pupil responses.
  • Appropriate instructions and procedures for administering the test also have to be developed and incorporated suitably in the test.

34
Advantages of Item Analysis
  • A powerful technique to improve instruction.
  • Helpful for guidance.
  • Supports valid measurement of instructional objectives.
  • Gives clues to the nature of misunderstandings and suggests remediation.

35
Reliability
  • Stability and trustworthiness of measurement is called reliability.
  • It should be free from error.
  • E.g., the Stanford-Binet IQ.
  • The score is a good estimate of the child's mental ability.

36
Methods of Determining Reliability
  • Four procedures for computing the reliability coefficient:
  • Test-retest method
  • Alternative or parallel form
  • Split-half technique
  • Rational equivalence

37
Test-Retest Method
  • Repetition of the test is the simplest method of determining agreement between two sets of scores.
  • The test is given and repeated on the same group, and the correlation is computed between the first and second sets of scores.
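
The reliability coefficient here is an ordinary Pearson correlation; a minimal sketch with made-up scores (the numbers are illustrative only):

```python
# Test-retest reliability: correlate two administrations of the same test
# to the same group of pupils.
from statistics import correlation  # Python 3.10+

first  = [10, 7, 7, 6, 6, 6, 5, 5, 3, 2]   # first administration
second = [9, 8, 7, 7, 5, 6, 6, 4, 4, 2]    # second administration
print(round(correlation(first, second), 2))  # -> 0.92, the reliability coefficient
```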

38
Defects in the Test-Retest Method
  • If the test is repeated immediately, many subjects will recall their first answers, which tends to increase their scores.
  • Practice and the confidence induced by familiarity also affect scores.
  • If the interval is longer (e.g., six months), growth changes will affect the retest.
  • Because of these defects, test-retest is generally less useful than the other methods.

39
Alternative or Parallel Form Method
  • When alternative or parallel forms of a test can be constructed, the correlation between form A and form B may be taken as a measure of the self-correlation of the test.
  • The alternative form method is satisfactory when sufficient time has intervened between the administrations of the two forms to weaken or eliminate memory and practice effects.

40
  • When form B of a test follows form A closely, scores on the second form will often be increased because of familiarity.
  • If such increases are approximately constant (3 to 5 points), the reliability coefficient of the test will not be affected, since the paired A and B scores maintain the same relative positions in the two distributions.

41
  • In drawing up alternative test forms, care must be exercised to match test materials for content, difficulty and form.
  • When alternative forms are virtually identical, reliability will be too high; otherwise, reliability will be too low.
  • An interval of at least two to four weeks should be allowed between administrations of the test.

42
The Split-Half Method
  • In this method the test is first divided into two equivalent halves and the correlation found for these half-tests.
  • From the reliability of the half-test, the self-correlation of the whole test is then estimated by the Spearman-Brown prophecy formula.
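
The Spearman-Brown prophecy formula steps the half-test correlation r up to the full-length estimate 2r / (1 + r). A minimal sketch with illustrative half scores; the odd-versus-even split is assumed here (how to divide is discussed on slide 45):

```python
# Split-half reliability: correlate the two half-test scores, then apply
# the Spearman-Brown prophecy formula r_full = 2 * r_half / (1 + r_half).
from statistics import correlation  # Python 3.10+

odd_half  = [5, 4, 4, 3, 3, 3, 2, 3, 1, 1]   # each pupil's score on odd items
even_half = [5, 3, 3, 3, 3, 3, 3, 2, 2, 1]   # each pupil's score on even items
r_half = correlation(odd_half, even_half)
r_full = 2 * r_half / (1 + r_half)
print(round(r_half, 2), round(r_full, 2))    # -> approximately 0.82 and 0.90
```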

43
  • The split half method is regarded by many as the
    best of the methods for measuring test
    reliability.

44
  • Advantage
  • All data for computing reliability are obtained on a single occasion, so that variations brought about by differences between two testing situations are eliminated.

45
  • How to divide?
  • Alternate statements (e.g., odd-numbered versus even-numbered items).
  • Assumes all the items are of equal difficulty.

46
Method of Rational Equivalence
  • This method represents an attempt to estimate the reliability of a test free from the objections raised against the methods outlined above.
  • Two forms of a test are equivalent when the items (a and A, b and B, c and C, etc.) are interchangeable and when the inter-item correlations are the same for both forms.
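
Rational equivalence is conventionally computed with the Kuder-Richardson formula KR-20; the slide gives no formula, so identifying it with KR-20 here is an interpretive assumption. A minimal sketch:

```python
# KR-20: r = k/(k-1) * (1 - sum(p_i * q_i) / variance(total scores)),
# where p_i is the proportion passing item i and q_i = 1 - p_i.
from statistics import pvariance

def kr20(matrix):
    """matrix: one row of 0/1 item scores per pupil."""
    n, k = len(matrix), len(matrix[0])
    totals = [sum(row) for row in matrix]
    pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in matrix) / n   # proportion passing item i
        pq += p * (1 - p)
    return k / (k - 1) * (1 - pq / pvariance(totals))
```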

47
Errors
  • Chance Error
  • Many psychological factors affect the reliability coefficient of a test: fluctuations in interest and attention, shifts in emotional attitude, and the differential effects of memory and practice.
  • Environmental factors such as distractions, noise, interruptions and scoring errors also contribute; all of these are called chance error, or error of measurement.
  • The scores may go up or down from the true value.

48
  • Constant Errors
  • Constant errors work in only one direction. A constant error raises or lowers all of the scores on a test but does not affect the reliability coefficient.
  • Such errors are more easily avoided than chance errors, for example by subtracting two points from a retest score to allow for practice.

49
Validity
  • The validity of a test, or of any measuring instrument, depends upon the fidelity with which it measures what it purports to measure.
  • A test is valid when the performances it measures correspond to the same performances as otherwise independently measured or objectively defined.

50
Difference between Reliability and Validity
  • Suppose a clock is set forward 20 minutes. If the clock is a good timepiece, the time it tells will be reliable (consistent) but will not be valid as judged by standard time.
  • Validity is a relative term.

51
  • A test is valid for a particular purpose or in a particular situation; it is not generally valid.

52
  • Content Validity
  • This requires content analysis. Validity is inferred by subject experts after going through the test items and giving their opinion on the extent to which the items form a fair representative sample of the universe of items that could be framed from the content areas being tested.

53
  • Construct Validity
  • This is the functional aspect of content validity.
  • Suppose the test is to measure the creative writing of students; then the items should cover creative expression only.
  • A well-known test of creative expression and the newly constructed creative expression test are both administered to a group of students for whom the test is meant.
  • The correlation coefficient computed between the scores from the two tests is an index of the validity of the newly constructed test.

54
  • Predictive Validity
  • It is concerned with the relation of test scores to some measure of future performance.
  • If scores on a spelling test help us differentiate between pupils who will succeed and pupils who will fail in a stenography course, then we can infer that the spelling test has predictive validity as far as stenography is concerned.
  • This type of validity is mainly useful in evaluating aptitude tests.

55
Relations of Validity and Reliability
  • They refer to different aspects of test efficiency.
  • A reliable test is theoretically valid, but may be practically invalid, as judged by its correlations with various independent criteria.
  • A highly valid test cannot be unreliable, since its correlation with a criterion is limited by its own index of reliability.

56
  • Want to make the best choice?
  • Then
  • ANALYSE AND CHOOSE

58
THANKS FOR MAKING ME HAPPY