Quality Control in Evaluation and Assessment

Document properties:
  • Title: Technology and Testing: Opportunity or Threat?
  • Author: Charles Alderson
  • Created: 3/26/2002 1:05:21 PM


1
Quality Control in Evaluation and Assessment
  • J Charles Alderson,
  • Department of Linguistics and Modern English
    Language,
  • Lancaster University

2
  • Assessment is central to language learning, in
    order
  • to establish where learners are at present and
    what level they have achieved,
  • to give learners feedback on their learning,
  • to diagnose their needs for further development,
    and
  • to enable the planning of curricula, materials
    and activities.

3
Outline
  • Current practice
  • Assessment for certification
  • Tradition one: teacher-centred, school-based
  • Tradition two: central, quality controlled
  • Basic parameters
  • What is needed to ensure parameters are met

4
Current practice
  • Quality of important examinations not monitored
  • No obligation to show that exams are relevant,
    fair, unbiased, reliable, and measure relevant
    skills
  • University degree in a foreign language qualifies
    one to examine language competence, despite lack
    of training in language testing
  • In many circumstances merely being a native
    speaker qualifies one to assess language
    competence.
  • Teachers assess students' ability without having
    been trained.

5
First tradition
  • Teacher-centred
  • School/university-based assessment
  • Teacher develops the questions
  • Teacher's opinion the only one that counts
  • Teacher-examiners have no explicit marking
    criteria
  • Assumption that by virtue of being a teacher,
    and having taught the student being examined,
    the teacher-examiner makes reliable and valid
    judgements
  • Authority, professionalism, reliability and
    validity of teacher rarely questioned
  • Rare for students to fail

6
Second tradition
  • Tests externally developed and administered
  • National or regional agencies responsible for
    development, following accepted standards
  • Tests centrally constructed, piloted and revised
  • Difficulty levels empirically determined
  • Externally trained assessors
  • Empirical equating to known standards or levels
    of proficiency

7
Basic parameters
  • Validity
  • Reliability
  • Practicality
  • Authenticity
  • Washback
  • Impact
  • Currency

8
  • "Validity in general refers to the
    appropriateness of a given test or any of its
    component parts as a measure of what it is
    purported to measure. A test is said to be valid
    to the extent that it measures what it is
    supposed to measure. It follows that the term
    'valid' when used to describe a test should
    usually be accompanied by the preposition 'for'.
    Any test may then be valid for some purposes, but
    not for others." (Henning, 1987)

9
Validity
  • Rational, empirical, construct
  • Internal and external validity
  • Face, content, construct
  • Concurrent, predictive
  • Construct

10
How can validity be established?
  • My parents think the test looks good.
  • The test measures what I have been taught.
  • My teachers tell me that the test is
    communicative and authentic.
  • If I take the Rigo utca test instead of the FCE,
    I will get the same result.
  • I got a good English test result, and I had no
    difficulty studying in English at university.

11
How can validity be established?
  • Does the test look valid to the general public?
  • Does the test match the curriculum, or its
    specifications?
  • Is the test based adequately on a relevant and
    acceptable theory?

12
How can validity be established?
  • Does the test yield results similar to those from
    a test known to be valid for the same audience
    and purpose?
  • Does the test predict a learner's future
    achievements?
  • Note: a test that is not reliable cannot, by
    definition, be valid

13
How can validity be established?
  • A test's items should work well: they should be
    of suitable difficulty, and good students should
    get them right, whilst weak students are expected
    to get them wrong.
  • All tests should be piloted, and the results
    analysed to see if the test performed as predicted
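
A rough sketch of how pilot results might be analysed along these lines. It computes, for each item, a facility value (the proportion of candidates answering correctly) and a simple discrimination index (the proportion correct in the strongest group minus the proportion correct in the weakest group). The function name, the top-third/bottom-third split and the pilot data are illustrative assumptions, not taken from the presentation.

```python
def item_analysis(responses):
    """Return a (facility, discrimination) pair for each item.

    responses: one list per candidate, holding 1 (right) or 0 (wrong)
    for each item. Hypothetical helper written for illustration only.
    """
    n_candidates = len(responses)
    n_items = len(responses[0])
    totals = [sum(candidate) for candidate in responses]

    # Rank candidates by total score, strongest first.
    ranked = sorted(range(n_candidates), key=lambda i: totals[i], reverse=True)

    # Compare the top third with the bottom third (an assumed cut-off).
    group_size = max(1, n_candidates // 3)
    top, bottom = ranked[:group_size], ranked[-group_size:]

    results = []
    for item in range(n_items):
        facility = sum(c[item] for c in responses) / n_candidates
        p_top = sum(responses[i][item] for i in top) / group_size
        p_bottom = sum(responses[i][item] for i in bottom) / group_size
        results.append((facility, p_top - p_bottom))
    return results


# Invented pilot data: six candidates (rows), five items (columns).
pilot = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]

for number, (facility, discrimination) in enumerate(item_analysis(pilot), start=1):
    print(f"item {number}: facility={facility:.2f} discrimination={discrimination:+.2f}")
```

In this invented data set, item 1 is answered correctly by everyone and so cannot separate strong from weak candidates, while item 5 is answered correctly only by the weakest candidates and would be a candidate for revision.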

14
Factors affecting validity
  • Unclear or non-existent theory
  • Lack of specifications
  • Lack of training of item/test writers
  • Lack of/unclear criteria for marking
  • Lack of piloting/pre-testing
  • Lack of detailed analysis of items/tasks
  • Lack of standard setting to CEF
  • Lack of feedback to candidates and teachers

15
Reliability
  • If I take the test again tomorrow, will I get the
    same result?
  • If I take a different version of the test, will I
    get the same result?
  • If the test had had different items, would I have
    got the same result?
  • Do all markers agree on the mark I got?
  • If a marker marks my test again tomorrow, will I
    get the same result?

16
Reliability
  • Over time: test-retest
  • Over different forms: parallel
  • Over different samples: homogeneity
  • Over different markers: inter-rater
  • Within one rater over time: intra-rater
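
As an illustration only, two statistics commonly used for these kinds of reliability can be sketched as follows: a Pearson correlation between two sets of marks (two sittings, two forms, two markers, or one marker on two occasions) and Cronbach's alpha as an index of homogeneity across items. The presentation does not name specific statistics, so the choice of these two, the function names and the sample marks are all assumptions.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equally long lists of marks."""
    mx, my = mean(x), mean(y)
    covariance = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mx) ** 2 for a in x))
    spread_y = sqrt(sum((b - my) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def cronbach_alpha(scores):
    """Cronbach's alpha for item scores.

    scores: one list per candidate, with a score for each item.
    Needs at least two items, two candidates, and some spread in totals.
    """
    k = len(scores[0])
    item_variances = [variance([c[i] for c in scores]) for i in range(k)]
    total_variance = variance([sum(c) for c in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented marks: the same five scripts marked on two occasions.
first_marking = [12, 15, 9, 18, 14]
second_marking = [13, 14, 10, 17, 15]
print(f"agreement between markings: r = {pearson(first_marking, second_marking):.2f}")

# Invented item scores: five candidates, three items.
item_scores = [[2, 3, 3], [1, 1, 2], [3, 3, 3], [0, 1, 1], [2, 2, 3]]
print(f"homogeneity: alpha = {cronbach_alpha(item_scores):.2f}")
```

The same `pearson` function covers the test-retest, parallel-forms, inter-rater and intra-rater cases; only the source of the two lists of marks changes.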

17
Factors affecting reliability
  • Poor administration conditions: noise, lighting,
    cheating
  • Lack of information beforehand
  • Lack of specifications
  • Lack of marker training
  • Lack of standardisation
  • Lack of monitoring
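
One possible way to monitor markers, sketched under assumptions: when two markers give categorical decisions (for example pass/fail) on the same scripts, Cohen's kappa expresses how far their agreement exceeds what chance alone would produce. The presentation does not prescribe this statistic; the function name and the decisions below are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two markers' categorical decisions.

    Undefined (division by zero) if both markers use one and the same
    category for every script.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each marker's category counts.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    return (observed - expected) / (1 - expected)


# Invented decisions by two markers on the same four scripts.
marker_1 = ["pass", "pass", "fail", "fail"]
marker_2 = ["pass", "fail", "fail", "fail"]
print(f"kappa = {cohens_kappa(marker_1, marker_2):.2f}")
```

Tracking a figure like this over successive marking sessions is one way a lack of training or standardisation might show up in practice.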

18
Practicality
  • Number of tests to be produced
  • Length of test in time
  • Cost of test
  • Cost of training
  • Cost of monitoring
  • Difficulty in piloting/pre-testing
  • Time to report results

19
Factors affecting practicality
  • Awareness of complexity and cost
  • Time to do the job: quick and dirty remains
    dirty
  • Funding to support development, monitoring and
    further development
  • Recognition of need for training of testers and
    of teachers

20
Authenticity
  • Genuineness of text
  • Naturalness of task
  • Naturalness of learner's response
  • Suitability of test for purpose
  • Match of test to learners' needs (if known)
  • Face validity
  • Expectations of stakeholders and culture

21
Factors affecting authenticity
  • A test is a test is a test
  • Availability of resources
  • Training of test developers/item writers
  • Relative importance of reliability over validity
  • Purpose of test: proficiency versus progress or
    diagnosis

22
Washback
  • Test can have positive or negative effects
  • Test can affect content of teaching
  • Test can affect method of teaching
  • Test can affect attitudes and motivation
  • Test can affect all teachers and students in same
    way, or individuals differently
  • Importance of test will affect washback

23
Factors affecting washback
  • Extent to which teachers know nature of test
  • Extent to which teachers understand rationale of
    test
  • Extent to which teachers consider how best to
    prepare learners for test
  • Nature of teachers' beliefs about teaching
  • Effort teachers are willing to make
  • Difficulty of test

24
Impact
  • Effect of test on society
  • Effect of test on stakeholders: employers, higher
    education, parents, politicians
  • Intended and unintended
  • Beneficial or detrimental

25
Factors affecting impact
  • Extent to which purpose of test is understood and
    accepted
  • Currency of test
  • Face validity of test
  • Stakes of test
  • Availability of information
  • Education of stakeholders re complexity of testing

26
Currency of test
  • Extent to which test is valued by stakeholders
  • Different stakeholders may have different
    perspectives: university vs employer, parents vs
    teachers, teachers vs principals? politicians vs
    professionals?

27
Factors affecting currency
  • Consequences of passing or failing: stakes
  • Extent to which stakeholders take results
    seriously into consideration
  • Beliefs about value of tests in general
  • Extent to which test matches expectations about
    tests in general or language tests in particular
  • Difficulty of test
  • Institution offering the test

28
General Issues
  • Teacher-based assessment vs central quality
    control
  • Internal vs external assessment
  • Quality control of exams (and the associated
    cost)
  • Piloting and pre-testing
  • Test analysis and the role of the expert
  • The existence of test specifications
  • Guidance and training for test developers and
    markers

29
General Issues (continued)
  • Feedback to candidates
  • Pass / fail rates
  • The currency of the old and the new traditions
  • The relationship with other languages and
    countries
  • The standards of the local exams in terms of
    "Europe"

30
Constraints on testing
  • Time: much less than for teaching
  • Sample: inevitably limited
  • Resources: always limited (money,
    infrastructure, trained personnel)
  • Assessment culture/tradition
  • Lack of awareness of problems and solutions

31
BUT WASHBACK
  • Testing is too important to be left to the
    teacher
  • Testing is too important to be left to the
    tester
  • Both are needed, to reflect and influence
    teaching, validly and reliably.

32
  • Assessment is central to language learning, in
    order
  • to establish where learners are at present and
    what level they have achieved,
  • to give learners feedback on their learning,
  • to diagnose their needs for further development,
    and
  • to enable the planning of curricula, materials
    and activities.