Suggestions for Using Information-Exchange Tasks for Oral Testing

Transcript and Presenter's Notes
1
Chapter 5
  • Suggestions for Using Information-Exchange Tasks
    for Oral Testing

2
In this chapter we explore
  • Four general criteria for designing language
    tests that can be applied to the design of oral
    tests
  • Washback effects
  • Suggestions for developing oral tests from
    information-exchange tasks
  • Evaluation criteria for oral tests

3
Four criteria for designing a good test
  • Carroll (1980) identifies four general criteria
    in foreign language testing:
  • Economy
  • Relevance
  • Acceptability
  • Comparability

4
Economy
  • By economy Carroll means obtaining the greatest
    amount of information about the learner's
    language in as little time as possible and with a
    minimum of energy expended.
  • For a test to be economical, it should merely
    sample the material covered, not exhaust it.
  • An instructor can select from among the many
    items covered and infer or project something
    about the learner's overall knowledge or ability.

5
Relevance
  • Relevance refers to the match between course and
    curriculum goals and the tests used to assess
    them.
  • For example, if you teach a course in
    conversational use of Italian, you would not want
    to give a formal composition as the final exam.
  • For a test to be relevant, it should reflect not
    simply what is taught but, more importantly, how
    it is taught.

6
Acceptability
  • Acceptability is a concept that takes the
    learner's point of view into consideration.
  • It implies learners' willingness to participate
    in the testing and their satisfaction that the
    test evaluates their progress.
  • For many learners, acceptability is tied to
    familiarity. If they are not familiar with a
    testing format or procedure, they may view it as
    unacceptable.

7
Comparability
  • Comparability is a concept that takes the
    institution's point of view into consideration.
  • Test scores for learners who are taught the same
    material by the same method should be similar.
  • For example, those enrolled in the 9am section of
    Portuguese 102 should have test scores similar to
    the scores of learners enrolled in the 2pm
    section if the two sections have common goals,
    materials, syllabi, and methods.

8
Washback effects
  • Krashen and Terrell (1983) made a statement that
    addresses the acceptability of a test.
  • "Testing can be done in a way that will have a
    positive effect on the students' progress. The
    key to effective testing is the realization that
    testing has a profound effect on what goes on in
    the classroom

9
Krashen and Terrell (1983) continued
  • Teachers are motivated to teach and students
    are motivated to study materials which will be
    covered on tests. Quite simply, if we want
    students to acquire a second language, we should
    give tests that promote the use of acquisition
    activities in and out of the classroom. In
    other words, our tests should motivate students
    to prepare for the tests by obtaining more
    comprehensible input and motivate teachers to
    supply it."
  • (Krashen & Terrell, 1983, p. 165)

10
Washback effect
  • What and how you test has ramifications for what
    instructors do in the classroom, what learners
    expect instructors to do in the classroom, and
    what learners do outside the classroom.
  • Testing cannot be viewed as an isolated event; it
    must be an integral part of the teaching and
    learning enterprise.

11
The relevance of a test
  • "Using an approach in the classroom which
    emphasizes the ability to exchange messages, and
    at the same time testing only the ability to
    apply grammar rules correctly, is an invitation
    to disaster."
  • (Krashen & Terrell, 1983, p. 165)

12
Oral testing in classrooms
  • Adapting information-exchange tasks for use as
    oral tests and quizzes
  • Lee and VanPatten define "communicative burden"
    as the responsibility of an individual test taker
    to initiate, respond, manage, and negotiate an
    oral event.
  • The communicative burden of a group discussion is
    less than the communicative burden of an oral
    interview.
  • In a discussion, multiple participants share the
    communicative burden, each one assuming the
    responsibilities of initiating, responding,
    managing, and negotiating the event.

13
Communicative burden
  • The communicative burden of a test format becomes
    an issue when the teacher is considering whether
    to give an oral quiz or test.
  • One might decide that an oral quiz at the end of
    a lesson in the first semester should have a low
    communicative burden, whereas a quiz at the end
    of a lesson in the fourth semester should have a
    greater one.
  • There are a number of instructional decisions to
    make regarding oral testing, and these decisions
    depend on a variety of pedagogical and practical
    factors.

14
Washback effect
  • These decisions may well have a washback effect
    on instruction.
  • By knowing and being familiar with the
    characteristics of the test, instructors may
    incorporate activities into the classroom that
    they feel will lead to success on the test.
  • The type of test can influence both what
    instructors emphasize and the way in which they
    emphasize it.

15
Content of the oral quiz
  • The content of the oral quiz or test can have
    another kind of washback effect on instruction.
  • If the content of the oral test is overtly tied
    to classroom activities, learners have a stronger
    motivation to participate in those activities.
  • Testing and teaching should be interrelated so
    that learners are responsible for what happens in
    class.

16
Demonstration
  • To demonstrate how teaching and testing can be
    interrelated, Lee and VanPatten convert four of
    the information-exchange tasks presented in
    Chapter 3 into test sections. One of these
    examples is illustrated here.
  • Recall the following activity from Chapter 3.

17
Compare your birthday experiences
  • Step 1: Fill in the chart as you interview a
    classmate.

    Birthday    | Where? | With whom? | Food? | Fun?
    2 yrs. ago  |        |            |       |
    5 yrs. ago  |        |            |       |
    10 yrs. ago |        |            |       |

  • Step 2: Now write a paragraph in which you
    compare and contrast your birthdays.
18
Test section on this activity
  • Phase 1: Warm-up. Make the test taker feel
    comfortable.
  • Phase 2: Initial questioning. Who was your
    partner? When is that person's birthday? When is
    your birthday?
  • Phase 3: Activity-related questions. Referring to
    the chart you made in class, tell me whether you
    and your partner have celebrated your birthdays
    in similar or different ways.

19
Two tests for evaluating spoken language
  • The first oral proficiency test is the Oral
    Proficiency Interview (OPI), which was developed
    by the American Council on the Teaching of
    Foreign Languages (ACTFL) in conjunction with the
    Educational Testing Service and several
    government agencies.
  • The other test is the Israeli National Oral
    Proficiency Test developed by Elana Shohamy and
    her colleagues.

20
The Oral Proficiency Interview (OPI)
  • The ACTFL Oral Proficiency Interview has been
    likened to a face-to-face conversation because an
    interviewer converses with an interviewee.
  • The goal of the OPI is to obtain a sample of
    speech that can be rated using the ACTFL
    Proficiency Guidelines as the measure.

21
Guidelines
  • These guidelines comprise level-by-level (from
    Novice to Superior) descriptions of learner
    performance:
  • The content that a learner at a particular level
    might dominate
  • Simple greetings, health matters, family, etc.
  • The functions the learner dominates
  • Narrating in the past, present, and future
  • The accuracy present in the learner's speech

22
Phases
  • The procedures used to elicit learner language
    during the OPI are termed phases.
  • Omaggio Hadley (1993, pp. 456-458) describes each
    phase as follows:

23
Phase 1: Warm-up
  • The warm-up portion of the interview is very
    brief and consists of greeting the interviewee,
    making him or her feel comfortable, and
    exchanging the social amenities that are normally
    used in everyday conversations.
  • Typically, the warm-up lasts less than three
    minutes.

24
Phase 2: Level check
  • This phase consists of establishing the highest
    level of proficiency at which the interviewee can
    sustain speaking performance.
  • This phase of the interview allows the person
    being tested to demonstrate his or her strengths.
  • Designed to elicit a speech sample that is
    adequate to prove that the person can function
    accurately at the level hypothesized by the
    interviewer during the warm-up phase.
  • Allows the interviewer to get a better idea of
    the actual proficiency level of the interviewee.

25
Phase 3: Probes
  • Probes are questions or tasks designed to elicit
    a language sample at one level of proficiency
    higher than the hypothesized level in order to
    establish a ceiling on the interviewee's
    performance.
  • The probes may result in linguistic breakdown,
    the point at which the interviewee ceases to
    function accurately or cogently because the task
    is too difficult.

26
Phase 4: Wind-down
  • When a ratable sample has been obtained, the
    tester brings the interviewee back to the level
    at which he or she functions most comfortably for
    the last few minutes of the interview.
  • This last phase gives the tester one more
    opportunity to verify that his or her rating is
    indeed correct.

27
Single-format test
  • Each test giver follows the standard, prescribed
    phases.
  • OPI training ensures that raters carry out the
    interview uniformly and apply the ratings
    consistently.
  • The OPI is referred to as a single-format test,
    for it consists of only one task (an interview)
    and there are no other components to the test.

28
Two concepts
  • There are two important concepts that emerge from
    a consideration of testing.
  • Bias refers to situations in which elicitation
    and evaluation procedures are not the same for
    all test takers. The test giver is the variable
    in this scenario.
  • Inter-rater reliability refers to the degree to
    which different raters evaluate the same
    performance in the same way. Given a set of
    criteria, all raters should apply them
    consistently (a brief sketch of how such
    agreement can be quantified follows).
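Inter-rater agreement can be quantified. The following is a minimal sketch, not part of the chapter: it computes Cohen's kappa for two hypothetical raters who scored the same eight test takers on an INOPT-style 4-10 band scale (the function and scores below are illustrative assumptions).

```python
# Minimal illustrative sketch: chance-corrected agreement between two raters.
# The scores are hypothetical values on a 4-10 band scale like the INOPT's.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum((freq_a[c] / n) * (freq_b[c] / n)
                   for c in set(freq_a) | set(freq_b))
    return (observed - expected) / (1 - expected)

# Hypothetical ratings for eight test takers.
rater_1 = [6, 7, 8, 9, 6, 10, 7, 8]
rater_2 = [6, 7, 7, 9, 6, 10, 8, 8]
print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")  # 1.0 would be perfect agreement
```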

29
Questions about OPI
  • Although useful for a variety of reasons, the OPI
    has been questioned because of its single-format
    nature.
  • Shohamy states that
  • "Viewing oral language as constituting a
    multiple of different speech styles and
    functions (e.g., discussing, arguing,
    apologizing, interviewing, conversing, being
    interviewed, reporting, etc.) means that

30
Shohamy continued
  • being interviewed, the speech style and
    function tapped in an oral interview, represents
    only a single type of oral interaction. No doubt
    that it is an important speech style, but
    clearly, there are also other oral interactions
    which are equally important in real life
    situations."
  • (Shohamy, 1987, p. 52)

31
The Israeli National Oral Proficiency Test
  • In a series of studies, Shohamy and her
    colleagues (Reyes, 1982; Shohamy, Reyes, &
    Bejerano, 1986) found that a learner's
    performance on an oral interview was not a valid
    predictor of that learner's performance on other
    oral tasks.
  • This test was introduced in Israel in 1986 as the
    national examination for students at the end of
    twelfth grade.

32
INOPT continued
  • The Israeli National Oral Proficiency Test in
    English as a Foreign Language (INOPT), in
    contrast to the OPI, is multicomponential by
    design and therefore more comprehensive.
  • In addition to the oral interview, three other
    tasks are also used to evaluate test takers'
    oral proficiency: role play, a reporting task,
    and group discussion.

33
Justification of the four formats
  • Each format elicits a different speech style, so
    that the test as a whole comprises a range of
    speech styles that reflect communicative language
    use in authentic situations.
  • Their research demonstrated that the test did
    discriminate well among various levels of oral
    proficiency.

34
Justification continued
  • Their statistical analyses of the test allowed
    them to conclude that each section of the test
    was indeed different from the other sections.
  • They concluded that if the goal was to test
    various speech styles, then each would need to be
    tested via separate oral tests.

35
Descriptions
  • Shohamy and her fellow researchers offer the
    following descriptions of the tasks used in the
    INOPT.
  • You will notice that their Oral Interview and the
    ACTFL OPI are quite similar.

36
Shohamy's tests
  • Test 1: Oral interview. The rationale underlying
    this test was to guide the test-takers into a
    dialogue with the tester.
  • Test 2: Role play. The rationale behind this test
    was to stimulate the test-taker to produce
    spontaneous speech behavior within given roles
    eliciting specific speech functions. In it, the
    test-taker had to play one role, with the tester
    playing another, both partners in a dialogue.

37
Shohamy's tests continued
  • Test 3: Reporting test. The rationale underlying
    this test was to stimulate the test-taker into a
    monologue in the foreign language. The student
    read a newspaper article silently in Hebrew and
    was asked to report its general content in
    English.
  • Test 4: Group discussion. The rationale underlying
    this test was to stimulate the test-takers into a
    spontaneous discussion of a controversial issue.

38
Evaluation criteria for tests of spoken language
  • The speech sample elicited via the OPI is judged
    against the ACTFL Proficiency Guidelines.
  • The following level descriptions are taken from
    Omaggio Hadley (1993, pp. 502-504).
  • Novice: The Novice level is characterized by the
    ability to communicate minimally with learned
    material.

39
Level descriptions continued
  • Intermediate
  • Create with the language by combining and
    recombining learned elements, though primarily in
    a reactive mode
  • Initiate, minimally sustain, and close in a
    simple way basic communicative tasks
  • Ask and answer questions

40
Level descriptions continued
  • Advanced
  • Converse in a clearly participatory fashion
  • Initiate, sustain, and bring to closure a wide
    variety of communicative tasks
  • Satisfy the requirements of school and work
    situations
  • Narrate and describe with paragraph-length
    connected discourse

41
Level descriptions continued
  • Superior
  • Participate effectively in most formal and
    informal conversations on practical, social,
    professional, and abstract topics
  • Support opinions and hypothesize using
    native-like discourse strategies

42
Judging speech samples
  • The four speech samples elicited by the INOPT are
    each judged separately according to the following
    scale (a brief illustrative sketch follows the
    scale).
  • 4: Unintelligible
  • No language produced
  • No interaction possible
  • 5: Hardly intelligible
  • Very poor language produced
  • Only simplest, fragmentary interaction possible

43
Judging speech samples continued
  • 6: Clearly intelligible
  • Simple language produced
  • Interaction possible
  • Not articulate
  • 7: Responsive in interaction
  • Slightly more sophisticated language produced
  • Consistent errors, but they do not interfere with
    fluency
  • Strong MT (mother tongue) interference
    (translated patterns, etc.)

44
Judging speech samples continued
  • 8: Almost effortless in expression
  • Adequate in interaction
  • Errors not consistent
  • 9: Facility of expression
  • Comfortable initiating in interaction
  • Sporadic mistakes
  • 10: No limitation whatsoever
  • Near-native
  • (Shohamy et al., 1986, p. 219)
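As an illustration of how results might be recorded against this band scale, here is a minimal sketch (the dictionary layout and task scores are hypothetical, not from Shohamy et al.); because each of the four speech samples is judged separately, each task keeps its own rating.

```python
# Minimal illustrative sketch: one rating per INOPT task on the 4-10 band scale.
# Band labels follow the slides above; the individual scores are hypothetical.
BANDS = {
    4: "Unintelligible",
    5: "Hardly intelligible",
    6: "Clearly intelligible",
    7: "Responsive in interaction",
    8: "Almost effortless in expression",
    9: "Facility of expression",
    10: "No limitation whatsoever",
}

# Each speech sample is judged separately, so each task keeps its own score.
ratings = {"oral interview": 7, "role play": 8, "reporting": 6, "group discussion": 7}

for task, score in ratings.items():
    print(f"{task}: {score} ({BANDS[score]})")
```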

45
Similarities between OPI and INOPT
  • Each contains some kind of interview.
  • Each uses holistic ratings (that is, a single
    final score for the entire test).
  • Bachman (1990, p. 328) argues that proficiency is
    not a unitary ability, but rather a componential
    one, because we can identify the pieces and
    constituent parts of oral proficiency.

46
Componential rating scales
  • If oral proficiency is not a unitary ability,
    then it should not be tested as such (Shohamy et
    al., 1986), and, just as important, it should not
    be scored as such (Bachman, 1990).
  • Bachman proposes that tests of oral proficiency
    be evaluated using componential scoring criteria
    and provides the following criteria, used in a
    test of oral proficiency he developed with a
    colleague (Bachman & Palmer, 1983).
  • The three scales assess grammatical, pragmatic,
    and sociolinguistic competence (a brief sketch
    applying these scales follows the tables below).

47
Sample of scale for grammatical competence
Rating | Range | Accuracy
0 | No systematic evidence of morphologic and syntactic structures | Control of few or no structures; errors of all or most possible types
3 | Large, but not complete, range of both morphologic and syntactic structures | Control of some structures used, but with many error types
6 | Complete range | No systematic errors
48
Sample of scale for pragmatic competence
Rating | Vocabulary | Cohesion
0 | Extremely limited (a few phrases and formulaic words) | No cohesion (utterances completely disjointed)
2 | Moderate size (frequently misses or searches for words) | Moderate cohesion (relationships between utterances generally marked)
4 | Extensive size (rarely, if ever, misses or searches for words) | Excellent cohesion (uses a variety of appropriate devices)
49
Sample of scale for sociolinguistic competence
Component | Low end of scale | High end of scale
Registers | 0: Evidence of only one register | 3.5: Control of both formal and informal registers
Nativeness | 1: Frequent nonnative but grammatical structures | 4: No nonnative but grammatical structures
Use of cultural references | 0.5: No evidence of ability to use cultural references | 4: Full control of appropriate cultural references
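To see what componential scoring adds over a single holistic score, here is a minimal sketch, assuming hypothetical subscores on the three sample scales above (grammatical 0-6, pragmatic 0-4, sociolinguistic roughly 0-4); it reports a profile per competence rather than one global number.

```python
# Minimal illustrative sketch of componential scoring in the spirit of
# Bachman & Palmer (1983). Scale maxima follow the sample scales above;
# the learner's subscores are hypothetical.
SCALE_MAX = {"grammatical": 6, "pragmatic": 4, "sociolinguistic": 4}

def profile(subscores):
    """Express each competence as a proportion of its scale maximum."""
    return {c: round(score / SCALE_MAX[c], 2) for c, score in subscores.items()}

learner = {"grammatical": 3, "pragmatic": 4, "sociolinguistic": 1.5}
print(profile(learner))
# {'grammatical': 0.5, 'pragmatic': 1.0, 'sociolinguistic': 0.38}
# Unlike a holistic score, the profile shows where diagnostic feedback is
# most needed (here, sociolinguistic competence).
```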
50
Pause to consider ... (p. 111)
  • ... the diagnostic uses of classroom tests. One of
    the important functions of classroom testing is
    its diagnostic function. By examining learners'
    performance on a test, we can provide them
    feedback on their strengths and weaknesses.
  • Does a global, holistic score provide an
    instructor the capability of giving diagnostic
    feedback? Think about what you would want to
    know about your own oral proficiency in the L2.
  • Would a high score on an oral proficiency test
    mean that you did not have weaknesses?
  • Would a low score indicate what specific things
    you could do to improve?

51
Summary of chapter 5
  • Adapted classroom activities for testing
    situations
  • Examined two tests for evaluating spoken language
  • Suggested the use of tests that examine a variety
    of speech styles and functions via multiple
    formats

52
Summary of chapter 5 continued
  • Presented several componential rating scales,
    which allow a more precise evaluation of the
    speech sample as well as a more detailed
    diagnosis of the learner's language.
  • Suggested that the choice of rating scales should
    depend on the types of oral interactions elicited
    and whether the interaction involves just a test
    giver or other learners.

53
Thinking more about it (p. 115)
  • 3. Consider the advantages and disadvantages of
    having three language learners perform an
    information-exchange task as an oral test or
    quiz. What rating scales would you use?