Standards-based assessment

About This Presentation

Title:

Standards-based assessment

Description:

Title: Validity in language assessment Author: Linguistics Applied Linguistics Last modified by: Raquel Created Date: 4/28/2006 11:56:42 AM Document presentation format – PowerPoint PPT presentation

Slides: 30

Provided by: Linguisti

Transcript and Presenter's Notes

Title: Standards-based assessment

1
Standards-based assessment

Tim McNamara
The University of Melbourne

2
Standards-based assessment and criterion
referencing

Standards-based assessment is a form of
criterion-referenced assessment (cf.
norm-referenced assessment).

3
Information derived from a Criterion-Referenced
Test

The degree to which the student has attained
criterion performance, for example whether he can
satisfactorily prepare an experimental report.
Glaser 1994 [1963], p. 6

4
Information derived from a Norm-Referenced Test

The relative ordering of individuals with respect
to their test performance, for example, whether
Student A can solve his problems more quickly
than Student B.
Glaser 1994 [1963], p. 6

5
Definition of a criterion-referenced test

A criterion-referenced test is one that is
deliberately constructed to yield measurements
that are directly interpretable in terms of
specified performance standards. Performance
standards are generally specified by defining a
class or domain of tasks that should be performed
by the individual.
Glaser and Nitko, 1971, p. 653

6
Definition of a criterion-referenced test (2)

A student's score on a criterion-referenced
measure provides explicit information as to what
the student can and can't do. Criterion-referenced
measures indicate the content of the
behavioural repertory, and the correspondence
between what an individual does and the
underlying continuum of achievement. Measures
which assess student achievement in terms of a
certain criterion standard thus provide
information as to the degree of competence
attained by a particular student which is
independent of reference to the performance of
others.
Glaser, 1963, p. 519

7
Norm-referenced test

Any test that is primarily designed to disperse
the performances of students in a normal
distribution based on their general abilities, or
proficiencies, for purposes of categorizing the
students into levels or comparing students'
performances to the performances of others who
formed the normative group.
Brown and Hudson (2002, p. 2)
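
The contrast between the two kinds of interpretation can be sketched in a few lines of Python. This is only an illustration, not something from the slides: the function names, the cut score of 65 and the cohort scores are all invented for the example. The criterion-referenced reading compares a score with a fixed standard, while the norm-referenced reading places the same score relative to a cohort.

```python
from bisect import bisect_left


def criterion_referenced(score, cut_score):
    """Report whether a score meets a fixed performance standard.

    The result does not depend on how anyone else performed.
    """
    return score >= cut_score


def norm_referenced(score, cohort_scores):
    """Report the percentage of the cohort scoring strictly below this score.

    The result depends entirely on the comparison group.
    """
    ordered = sorted(cohort_scores)
    below = bisect_left(ordered, score)  # count of scores strictly lower
    return 100.0 * below / len(ordered)


# Invented data for illustration only.
cohort = [42, 55, 61, 61, 70, 78, 83, 90]
cut = 65  # hypothetical standard

for s in (61, 70):
    print(s, criterion_referenced(s, cut), norm_referenced(s, cohort))
```

With a different cohort the norm-referenced figure for the same score changes, while the criterion-referenced decision stays the same, which is the sense in which Glaser describes it as independent of the performance of others.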

8
Is CRT behaviourist?

Criterion-referenced testing has its origins in
behaviourism, but need not be atomistic, purely
dichotomous, or reductive.

9
Criterion-referencing and levels on a continuum

Underlying the concept of achievement measurement
is the notion of a continuum of knowledge
acquisition ranging from no proficiency at all to
perfect performance. An individual's achievement
level falls at some point on this continuum as
indicated by the behaviors he displays during
testing. The degree to which his achievement
resembles desired performance at any level is
assessed by criterion-referenced measures of
achievement or proficiency.

10
Scales and CRT

The standard against which a student's
performance is compared when measured in this
manner is the behavior which defines each point
along the achievement continuum. The term
criterion, when used in this way, does not
necessarily refer to final end-of-course
behavior. Criterion levels can be established at
any point in instruction where it is necessary to
obtain information as to the adequacy of an
individual's performance.
Glaser, 1963, pp. 519-520

11
Interface with policy - scales and frameworks

Dominant movement in language education
internationally
Driven by need for accountability and emphasis on
demonstrable outcomes
Has adopted functionalist view of language
education (i.e. not cultural, intellectual,
values dimension)
Response to demands of globalization, efficiency
Curriculum and assessment addressed in single
framework
Emphasis on reporting

12
Format of standards

Standards are typically formulated as an ordered
series of statements about levels of achievement
or stages of development.
(There may be multiple sets of ordered statements
for different aspects of language development)

13
CEFR Levels A2, B1 (speaking)

A2: Can understand sentences and frequently used
expressions related to areas of most immediate
relevance (e.g. very basic personal and family
information, shopping, local geography,
employment). Can communicate in simple and
routine tasks requiring a simple and direct
exchange of information on familiar and routine
matters. Can describe in simple terms aspects of
his/her background, immediate environment and
matters in areas of immediate need.
B1: Can understand the main points of clear
standard input on familiar matters regularly
encountered in work, school, leisure, etc. Can
deal with most situations likely to arise whilst
travelling in an area where the language is
spoken. Can produce simple connected text on
topics which are familiar or of personal
interest. Can describe experiences and events,
dreams, hopes and ambitions and briefly give
reasons and explanations for opinions and plans.

14
Mislevy: claims and evidence

An assessment is a machine for reasoning (ASSESSMENT ARGUMENT)
about what students know, can do or have accomplished (CLAIMS)
based on a handful of things they say, do, or make in particular settings (OBSERVATIONS / EVIDENCE)

15
What is the CEFR?

It represents a construct definition: it is an
exercise in domain modelling
It provides a set of claims
It provides a general characterization of
evidence and tasks
It is not a test - it allows different kinds of
tests to be realizations of this construct

16
Possible functions of standards

Planning: to act as a series of objectives or
goals for teaching and learning; involves clear
and specific statements of teaching aims
Professional understanding: to inform teachers
about the typical progress of learning; more
complex statements which include contextual and
interpretative information in order to help the
teacher understand more fully the nature of the
emergent ability in the learner
Accountability: to act as statements of learning
outcomes for administrative purposes; tends to
be the dominant function

17
Formative vs summative assessment

Can standards-based assessment help with
formative assessment?

18
Gathering evidence to form basis of reporting

Gathering of evidence: a mixture of teacher-led
assessment and external examination
External evidence may be seen as intrusive,
insensitive to learning
Places burden on teacher for record keeping
Requires intensive professional development of
teachers
Best schemes provide good advice to teachers
about integrating assessment in instruction -
Assessment for learning movement

19
The assessment pyramid

LEVELS (NUMBERED)
LEVEL SUMMARIES
STRAND DESCRIPTIONS
WITHIN EACH MODE, EXAMPLES PROVIDED
ADVICE TO TEACHERS DETAILED EXAMPLES
TEACHER CHOOSES ACTIVITY CRITERIA

20
Competing demands in standards-based assessment

Validity demands: intellectual defensibility of construct; evidence of reliability; other validity evidence; concern for consequences
Managerialist demands: reporting; accountability
Teacher/learner demands: meaningfulness in instructional process; facilitation of learning; enhanced quality of teaching; minimization of administrative burden on teachers

21
Dylan Wiliam: Beyond norm- and criterion-referenced
tests

Norm-referenced - hard to interpret in terms of
what a student can do; limited to placing student
in cohort group
Criterion-referenced -
leads to narrowing of teaching
Also implies a cohort group

22
Wiliam on the role of teachers

An assessment is valid to the extent that you are
happy for teachers to teach towards the test
Therefore
Involve teachers in summative assessment
Increases reliability and validity
Externalize standards
Locates teacher as coach, not judge
Requires teachers to form a community of
practice

23
Wiliam on construct-referenced assessment

Criteria do not define but exemplify grades
Standards are shared by the community of
practice
Standards are implicit and evolve

24
Example: Standards and the PhD

Implies a yes/no decision about individuals
Impossible to specify criteria
But examination process proceeds successfully
Granting PhD is a performative utterance, an
illocutionary act (not a description) - the
person is launched on their career

25
Wiliam on summative and formative assessment

Effective summative assessment
Requires teachers to share a construct of quality
Effective formative assessment
Requires students to share the same construct of
quality
Requires teachers to possess an anatomy of quality

26
Wiliam on quality rather than criteria

Maxims cannot be understood, still less applied
by anyone not already possessing a good practical
knowledge of the art. They derive their interest
from our appreciation of the art and cannot
themselves either replace or establish that
appreciation. (Polanyi, 1958, p. 50)
Quality doesn't have to be defined. You
understand it without definition. Quality is a
direct experience independent of and prior to
intellectual abstractions. (Pirsig, 1991, p. 64)

27
Our questions

1 assessment vs testing vs evaluation vs
validation vs measurement
2 affective factors in assessment
3 influence of L1 on assessment
4 raters/judges
5 effect of tasks - (esp CELU)
6 criteria in writing and oral interaction
7 history of assessment
8 why assessment? Can we do without it?
9 performance assessment

28
Our questions

10 qualitative vs quantitative aspects
11 correction in an oral exam
12 assessment as a process - and the final exam?
13 scales/descriptors for oral language
14 should listening be part of the oral exam?
15 Are we assessing what we want to assess?
16 Defining standards - intermed/advanced etc
17 Inter-rater reliability?
