Word clusters and the idiomaticity of learner English

1
Word clusters and the idiomaticity
of learner English
  • xujiajin_at_bfsu.edu.cn
  • Beijing Foreign Studies University

2
Major points
  • Defining word clusters
  • Hypothesis and methodology
  • DIY step by step
  • Data display and interpretation

3
An alternative view of language
  • Some linguists see language as an edifice
    assembled from a large number of ready-made
    templates or building blocks.

4
Some Chinese examples (English glosses)
  • "Have you had your dinner?"
  • "Where are you going?"
  • "That's it."
  • "That is to say", "kind of"
  • "Then", "so"
  • Such lexical phrases are stored and uttered in one
    spurt at a time, and are highly frequent in use.

5
Word clusters
Defining clusters
  • A word cluster is a group of words which follow
    each other in a text (Scott 2004). In our case,
    we take a frequency-driven approach to clusters.

6
Aliases of "cluster"
Defining clusters
  • Clusters are similar to phrases in most pedagogic
    grammars, but go by various, sometimes confusing,
    names in the literature: formulaic sequences,
    lexical bundles, clusters, chunks, multi-word
    expressions, recurrent word combinations,
    pre-fabs, n-grams, etc.

7
Word list and multi-word list
Defining clusters
  • How is a chunk calculated?
  • Based on frequency
  • Lexical chunks are generated like a multi-word
    list.

8
How is a cluster calculated (e.g. 3-word)?
Defining clusters
  • The idea of respect comes from the concept that
    everyone, including yourself, has self-worth, and
    therefore should be treated with dignity. Say,
    for example, that you're having a discussion with
    your boyfriend or girlfriend and your opinions
    are different. While you may disagree with each
    other, each of you still has a right to your own
    feelings.
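
The calculation can be illustrated with a minimal Python sketch, assuming a simple regex tokenizer and a window that moves one word at a time and ignores sentence boundaries. The helper names `tokenize` and `clusters` are hypothetical and are not taken from the presentation or from any particular tool.

```python
import re
from collections import Counter

PASSAGE = (
    "The idea of respect comes from the concept that everyone, including "
    "yourself, has self-worth, and therefore should be treated with dignity. "
    "Say, for example, that you're having a discussion with your boyfriend "
    "or girlfriend and your opinions are different. While you may disagree "
    "with each other, each of you still has a right to your own feelings."
)

def tokenize(text):
    """Lowercase the text and keep words, allowing inner apostrophes and hyphens."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def clusters(tokens, n):
    """Slide a window of n words across the tokens, one word at a time."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = tokenize(PASSAGE)
trigrams = clusters(tokens, 3)

# The first windows are "the idea of", "idea of respect", "of respect comes", ...
print(trigrams[:3])

# Counting the windows turns the list into a frequency-ranked multi-word list.
frequency = Counter(trigrams)
print(frequency.most_common(5))
```

In a passage this short few clusters recur, so the frequency ranking only becomes informative when the same procedure is run over a whole corpus.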

16
Cluster and idiomaticity
Hypothesis and method
  • When a word cluster occurs with high frequency,
    this may indicate that the cluster is formulaic
    and that it contributes to idiomaticity and
    fluency (Wray 2002).

17
Cluster and idiomaticity
Hypothesis and method
  • Studies in pattern grammar (Hunston 1996), the
    lexical approach (Lewis 1993) to language
    teaching, etc. suggest that native speakers use
    many more chunks in their language production
    than L2 learners do. Altenberg (1998) reported
    that 96% of native speakers' language follows
    some prefabricated patterns.

18
Cluster and idiomaticity
Hypothesis and method
  • It is, therefore, believed that the more
    proficient an L2 learner is, the more formulaic
    language the learner uses. Such formulaicity
    makes the learner's language more idiomatic
    and fluent.

19
Measuring the idiomaticity of learner language
Hypothesis and method
  • Ideally, one would extract a set of clusters
    from native speakers' language and use that
    set to measure the idiomaticity of L2 learners'
    language.

20
Measuring the idiomaticity of learner language
Hypothesis and method
  • However, this is often difficult, as there may be
    thousands of such clusters (Pawley & Syder 1983),
    and clusters are often content-related (Schmitt &
    Carter 2004). It is not easy to find native
    speakers' language data on the same topics.

21
Measuring the idiomaticity of learner language
Hypothesis and method
  • The question is whether, if NS (native speaker)
    data cannot be found, proficient L2 learners'
    language data can serve the same purpose.
  • Word clusters from proficient L2 learners'
    language are used as the measure of learner
    language idiomaticity.

22
Data and Methodology
  • 90 essays written by university students, with
    scores assigned by expert human raters
  • 30 best compositions were chosen from the 90
  • Most frequent 3- and 4-word clusters were
    extracted from the 30 best compositions

23
Methodology
  • The remaining 60 texts were searched for the
    number of clusters found in the 30 best essays
  • Statistical analysis was then conducted to see
    whether the number of clusters contained in the
    60 compositions correlates with essay scores.

24
Flow chart
  • 90 essays → 30 Best + 60 Others
  • 30 Best → Ngram list → Clusters
  • Clusters + 60 Others → PatCount search
    (File-based Concordancing) → No. of clusters in
    each of the 60 texts
  • No. of clusters in each of the 60 texts →
    Correlation analysis

25
DIY steps
  • Cluster/Ngram extraction
  • 1. Choose the 30 best students' files
  • 2. Compute 3- and 4-word clusters
  • 3. Save cluster results
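
Steps 1 to 3 might look roughly like the following Python sketch. It is an illustration only: the two sample texts stand in for the 30 best essays, the `min_freq` cut-off is an assumption (the slides say only "most frequent"), and `frequent_clusters` is a hypothetical helper, not part of any named tool.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep words, allowing inner apostrophes and hyphens."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def frequent_clusters(texts, sizes=(3, 4), min_freq=2):
    """Pool all n-word clusters from the texts and keep those seen min_freq+ times."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for n in sizes:
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return {cluster: freq for cluster, freq in counts.items() if freq >= min_freq}

# Made-up stand-ins for the best essays (assumption, for illustration only).
best_essays = [
    "On the other hand, as a matter of fact, respect is earned.",
    "As a matter of fact, people differ. On the other hand, they agree.",
]

cluster_list = frequent_clusters(best_essays)
for cluster, freq in sorted(cluster_list.items()):
    print(freq, cluster)
```

Saving the result (step 3) then amounts to writing `cluster_list` to a file, one cluster per line.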

26
DIY steps
  • Counting clusters
  • 4. PatCount is used to search the 60 texts for
    the 3- and 4-word clusters.
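
For readers without PatCount, the counting step can be approximated in Python. The sketch below is a stand-in, not PatCount's actual interface: `count_clusters` is a hypothetical helper, the file names and texts are invented, and it assumes that every occurrence is counted, including overlapping clusters.

```python
import re

def tokenize(text):
    """Lowercase the text and keep words, allowing inner apostrophes and hyphens."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def count_clusters(text, cluster_list):
    """Count the word windows in text that match any cluster in cluster_list."""
    tokens = tokenize(text)
    wanted = set(cluster_list)
    sizes = {len(cluster.split()) for cluster in wanted}
    total = 0
    for n in sizes:
        for i in range(len(tokens) - n + 1):
            if " ".join(tokens[i:i + n]) in wanted:
                total += 1
    return total

# Hypothetical cluster list and texts, for illustration only.
cluster_list = ["on the other hand", "the other hand", "as a result"]
other_texts = {
    "essay_31.txt": "On the other hand, prices rose. As a result, sales fell.",
    "essay_32.txt": "Prices rose and sales fell.",
}

counts = {name: count_clusters(text, cluster_list)
          for name, text in other_texts.items()}
print(counts)
```

Because overlapping clusters are counted separately here, "on the other hand" and "the other hand" both score in the first text; counting distinct cluster types instead would be an equally defensible reading of "number of clusters".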

27
DIY steps
  • Correlation analysis
  • 5. The cluster counts for each text are
    correlated with the rater-assigned scores.
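
The slides do not say which correlation statistic was used. As one possibility, the sketch below computes a Pearson correlation coefficient by hand; the counts and scores are invented numbers for illustration, not results from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

# Invented example data: cluster counts and rater scores for four texts.
cluster_counts = [2, 4, 6, 8]
essay_scores = [10, 11, 13, 14]

r = pearson_r(cluster_counts, essay_scores)
print(round(r, 2))
```

A coefficient near 1 would mean that texts containing more of the clusters tend to receive higher scores; a rank-based statistic such as Spearman's rho would be a reasonable alternative if the scores are ordinal.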

28
Data display

29
Correlation coefficient
  • [Figure: correlation coefficients plotted on an axis from 0 to 1]

30
Summary
  • Content
  • Notion of cluster
  • Methodology
  • The integration of corpus techniques and
    statistical tests.

31
Some reflections
  • In-depth analysis of the results or the
    underlying rationale
  • Functional grouping of clusters
  • Use of clusters across different proficiency
    levels
  • Tag-sequence/POS-gram/colligation

32
Thank you!

33
Assignment
  • The correlation between tag sequences/POS-grams
    and idiomaticity
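
As a possible starting point for the assignment, POS-grams can be extracted in the same way as word clusters once a text has been tagged. The sketch below assumes input already tagged in a `word_TAG` format with Penn Treebank-style tags; both the format and the `pos_grams` helper are assumptions made for illustration.

```python
from collections import Counter

def pos_grams(tagged_text, n=3):
    """Return n-tag sequences from text written as space-separated word_TAG items."""
    tags = [item.rsplit("_", 1)[1] for item in tagged_text.split()]
    return [" ".join(tags[i:i + n]) for i in range(len(tags) - n + 1)]

# Hand-tagged fragment, for illustration only.
tagged = "the_DT idea_NN of_IN respect_NN comes_VBZ from_IN the_DT concept_NN"

grams = pos_grams(tagged)
frequency = Counter(grams)
print(grams)
```

The resulting tag sequences can be counted per text and correlated with scores in the same way as word clusters.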