Title: Learner Corpus Research in EAP: Past, Present and Future
1 Learner Corpus Research in EAP: Past, Present and Future
2 Outline of talk
- Research in
- English for General Academic Purposes (EGAP)
- English for Specific Academic Purposes (ESAP)
- Some core issues
- Current and future directions
- Focus on academic writing
3 EGAP learner corpus research: key areas
- Epistemic modality/stance markers
- Tense and aspect
- Vocabulary, collocations and phraseology
- (also connectors)
4 ESAP Learner Corpus Research
5 ESAP learner corpus research: key areas
- Lexical bundles
- Keywords to phraseology
- Metadiscoursal expressions
6 Some core issues
- Starting point for analysis
- derived from existing framework or inductive
- corpus-based vs. corpus-driven (Tognini-Bonelli)
- bottom-up vs. top-down
- Choice of reference/control corpus
- Explanation for data (taking into account variables)
7 EGAP Learner Corpus Research
8 Epistemic modality/stance markers
- Argumentative essays, e.g. ICLE sub-corpora
- governed by epistemic modality
- overlaps with Hyland's (1999) stance markers
- hedges, boosters, self-mentions, attitude markers
- difficult for students to master complex interplay between hedges and boosters
9 1. Learner Corpus Research (L1 vs. L2)
- Doubt and certainty (Hyland & Milton 1997)
- 2 corpora, 500,000 words each
- Learner: 150 essays by HK students for A Level UofEng
- Native-speaker: A level scripts by British school leavers of similar age and education
10 1. Learner Corpus Research
- Procedure
- Inventory of 75 of most frequently occurring epistemic lexical devices from
- Holmes's (1983) analysis of learned sections of Brown & LOB
- Research literature on modality (Coates 1983)
- Reference grammars (Quirk et al. 1972)
- 50 sentences of each item (cut-off point, cf. Altenberg)
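The sampling step above can be sketched in a few lines. This is only one reading of the slide: it assumes the 50 sentences per item were drawn at random, which the slide does not state. The function name `sample_sentences`, the toy corpus and the three items are hypothetical and are not taken from Hyland & Milton (1997).

```python
import random
import re

def sample_sentences(sentences, items, per_item=50, seed=0):
    """Return up to `per_item` sentences containing each item.

    Assumption: sampling is random; the slide only gives the number 50.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    samples = {}
    for item in items:
        # Whole-word, case-insensitive match for the lexical device
        pattern = re.compile(r"\b" + re.escape(item) + r"\b", re.IGNORECASE)
        hits = [s for s in sentences if pattern.search(s)]
        if len(hits) > per_item:
            hits = rng.sample(hits, per_item)
        samples[item] = hits
    return samples

# Hypothetical mini-corpus, one sentence per string
corpus = [
    "It may be true that prices will rise.",
    "Parents will certainly pay more.",
    "This may not help.",
    "Perhaps the policy should change.",
]
result = sample_sentences(corpus, ["may", "certainly", "perhaps"], per_item=1)
```

A plain word match like this cannot separate epistemic from non-epistemic uses of an item, so the sampled sentences would still need manual reading.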
11 1. Learner Corpus Research
- Key findings
- Chinese learners: limited range of devices, inappropriately strong convictions
- e.g. As I know, I am quite sure some parents are willing to pay
- vs. native-speaker
- e.g. On balance, it would seem that the only real solution to the problem would be to.
- Explanation
- due to lack of awareness of socio-pragmatic norms
- Teaching-induced effect (tutorial schools)
12 2. Learner Corpus Research (L1 vs. L2)
- Impact of culture on use of stance exponents in GRICLE (Hatzitheodorou & Mattheoudakis 2011)
- 2 sub-corpora of around 200,000 words each
- Learner: Greek component of ICLE (timed)
- Native-speaker: LOCNESS (not all timed), PELCRA (timed) (cf. Ädel 2008)
- Procedure
- Focus on adverbials (hedges, boosters, attitude markers)
- Hyland's (2005) categories of metadiscourse as reference point
- own list (e.g. happily)
13 2. Learner Corpus Research (L1 vs. L2)
- Results
- Motivation for research: initial reading (Greek learners' preference for wide range of boosters)
- Preliminary observations borne out by results: Greek learners more emphatic, extensive use of boosters
- Explanation (with reference to cultural factors)
- Consulted 1.7 million word Hellenic National Corpus
- Culturally induced from the Greek, authoritative style of writing
- NB unlike Hyland & Milton's (1997) study and the GRICLE (2011) study, McEnery & Kifle (2002) found overuse of hedging in argumentative writing of Eritrean students
14 3. Learner Corpus Research (L2 vs. L2)
- Use of modal and reporting verbs in expression of stance (Neff et al. 2003)
- Corpora
- 5 learner sub-corpora from ICLE (French, Spanish, Italian, Dutch, German)
- from 290,000 words in French to 195,000 in Spanish
- LOCNESS (150,000 words) as native reference corpus
- Focus on
- modal verbs (can, could, may, might, must)
- 9 reporting verbs (suggest, argue etc. among 12 most frequently used in preliminary analysis)
15 3. Learner Corpus Research (L2 vs. L2)
- Results and explanations
- can overused by all groups of non-native writers (massive overuse in Italian and Spanish sub-corpora)
- L1 transfer involving meaning, typology and sociolinguistic norms
- Typology
- Se puede apreciar un contraste entre...
- It can be appreciated a contrast between
- Positive politeness strategies
- The problems that we can find
- Developmental factors may play a role
- (cf. Aijmer 2002; Gilquin & Paquot 2008)
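Overuse claims of this kind rest on comparing relative frequencies in a learner corpus and a native reference corpus. A minimal sketch of that comparison for the five modals follows. The two text snippets are invented for illustration, the function names are hypothetical, and nothing here reproduces the procedure or figures of Neff et al. (2003).

```python
import re
from collections import Counter

MODALS = ("can", "could", "may", "might", "must")

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def modal_rates(text, per=10_000):
    """Frequency of each modal per `per` tokens."""
    tokens = tokenize(text)
    counts = Counter(t for t in tokens if t in MODALS)
    return {m: counts[m] * per / len(tokens) for m in MODALS}

def overuse_ratio(learner_text, reference_text):
    """Learner rate divided by reference rate; None if the reference rate is 0."""
    learner = modal_rates(learner_text)
    reference = modal_rates(reference_text)
    return {
        m: (learner[m] / reference[m]) if reference[m] else None
        for m in MODALS
    }

# Invented snippets standing in for a learner and a reference corpus
learner_text = "We can see that we can find problems and it can be said"
reference_text = "One may argue that this can be seen and it might help us now"
ratios = overuse_ratio(learner_text, reference_text)
```

A ratio above 1 points to overuse relative to the reference corpus and a ratio below 1 to underuse. The tokens are untagged, so a form such as may is counted whether or not it is a modal verb in context.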
16 4. ESAP Learner Corpus Study
- Stance options in data-description task in statistics (Wharton 2012)
- Corpus: 40 student texts (4,705 words)
- Procedures
- Inductive using NVivo
- (small set of data in under-researched genre)
- Examination and re-examination of texts
- Common Content Assertions (5 propositions)
- Common Stances in Assertions (bare, hedged, vague, boosted, reader-inclusive)
17 ESAP Learner Corpus Research
18 ESAP learner corpus research: key areas
- Lexical bundles
- Keywords to phraseology
- Metadiscoursal expressions
19 1. Learner Corpus Research: keywords to phraseology
- Function keywords in academic writing (Lee & Chen 2009)
- 2 apprentice corpora
- Chinese undergraduate dissertations (Eng. Lang.)
- Comparable L1 student corpus from BAWE
- Both compared with expert corpus of journal articles from same field
- Key finding (article usage)
- e.g. the students in this study
- students will be more motivated if.
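Keyword studies like this compare a word's frequency in a study corpus with its frequency in a reference corpus and score the difference with a keyness statistic. The sketch below uses the log-likelihood statistic, which is common in corpus work. The slide does not say which statistic Lee & Chen (2009) used, so this choice is an assumption, and the counts are hypothetical.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of one word across two corpora."""
    total = freq_study + freq_ref
    # Expected frequencies if the word were spread evenly over both corpora
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    for observed, expected in (
        (freq_study, expected_study),
        (freq_ref, expected_ref),
    ):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll

def is_overused(freq_study, size_study, freq_ref, size_ref):
    """True if the relative frequency is higher in the study corpus."""
    return freq_study / size_study > freq_ref / size_ref

# Hypothetical counts: a word occurring 120 times in a 100,000-word
# student corpus and 60 times in a 100,000-word expert corpus
score = log_likelihood(120, 100_000, 60, 100_000)
overused = is_overused(120, 100_000, 60, 100_000)
```

A score above 3.84 corresponds to p < 0.05 at one degree of freedom. The statistic does not show the direction of the difference, which is why the separate `is_overused` check is included.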
20 2. ESAP Corpus Research: keywords to phraseology
- Content keywords in Problem-Solution text (Flowerdew 2008)
- 2 corpora of around 225,000 words each
- Learner corpus of recommendation-based reports
- Expert corpus of professional reports
- (analogue rather than exemplar corpus, Tribble 2002)
- Key finding
- Overuse of superordinate terms in topic sentences
- e.g. find a solution to the problem
- Over-reliance on rubrics from assignment (lexical teddy bears, Hasselgren 1994)
21 Cf. Paquot's (2010) six interlanguage features
- Limited lexical repertoire
- Lack of register awareness
- Unidiomatic phraseology
- Semantic misuse (problem vs. issue or question)
- Overuse of connective devices
- Strings of connectives in subject-initial position
22 ESAP Learner Research: metadiscourse
- Hyland's (2005) model of metadiscourse
- Interactive
- guide reader through text using transitions etc.
- Interactional
- involves reader in text using hedges, boosters, attitude markers
23 1. ESAP Corpus Research: Interactional
- Anticipatory it in student and published writing (Hewings & Hewings 2002)
- 2 corpora of business writing
- Student corpus: 15 MBA dissertations by NNS (123,633 words)
- Comparable corpus: 28 papers from three different business studies journals (203,389 words)
24 1. ESAP Corpus Research: Interactional
- Procedures
- Excluded it when propositional content, or text-organising role
- four categories derived from data
- Hedges, attitude markers, emphatics, attribution
- Results normalised per 1,000 words
- Results and explanations
- Learners made less use of it clauses in hedging
- Greater use of it in other 3 categories
- Speculate more overt effort at persuasion on account of readership
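Normalising per 1,000 words is what makes two corpora of different sizes comparable. A minimal sketch follows, using the corpus sizes given for the student dissertations (123,633 words) and the journal papers (203,389 words). The raw counts are hypothetical placeholders, not results from Hewings & Hewings (2002).

```python
STUDENT_WORDS = 123_633   # corpus size from the slides
JOURNAL_WORDS = 203_389   # corpus size from the slides

def per_thousand(raw_count, corpus_words):
    """Convert a raw count to a frequency per 1,000 words."""
    return raw_count * 1000 / corpus_words

# Hypothetical raw counts of it-clauses used as hedges in each corpus
raw_hedges = {"student": 100, "journal": 300}

normalised = {
    "student": per_thousand(raw_hedges["student"], STUDENT_WORDS),
    "journal": per_thousand(raw_hedges["journal"], JOURNAL_WORDS),
}

for corpus_name, rate in normalised.items():
    print(f"{corpus_name}: {rate:.2f} per 1,000 words")
```

Without normalisation the raw counts would mostly reflect the larger size of the journal corpus.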
25 2. ESAP Corpus Research: Interactive
- Use of we in a learner corpus of reports (Luzon 2009)
- 2 corpora
- Learner corpus of reports written by Spanish engineering students
- Corpus for comparison (analogue): engineering RAs
- Key finding
- Lack of awareness of conventions for we
- e.g. With this paper we want to give you some recommendations
- We are going to consider the advantages and
26 Current and future directions
27 Very recent initiatives in Learner Corpora
- Aggregate vs. individual data (cf. Hong article on software in Tono et al. 2012, ICCI project)
- Longitudinal studies (Meunier & Littre 2013)
- More individual metadata
- MUCH longitudinal corpus (Eriksson et al.)
28 Very recent initiatives in Learner Corpora
- Domain- and genre-specific
- VESPA (Paquot), CALE (Callies)
- Linguistic theories (genre, SFL, pragmatics, politeness strategies, cognitive linguistics etc.)
- Computer-mediated, synchronous and asynchronous (McDonald et al. 2013)