Resources for Using Corpus Linguistics in Language Education


1
Resources for Using Corpus Linguistics in
Language Education
  • Kenji Kitao
  • Doshisha University
  • Kyoto, Japan
  • S. Kathleen Kitao
  • Doshisha Women's College
  • Kyoto, Japan

2
  • Resources for Corpus Linguistics
  • http://www.cis.doshisha.ac.jp/kkitao/library/resource/corpus/corpus.htm
  • These PowerPoint slides
  • http://www.cis.doshisha.ac.jp/kkitao/Japanese/bio/present.htm24

3
  • I. Presentation
  • A. Corpus linguistics and corpus-related
    resources
  • B. Using corpus-related resources for language
    teaching
  • C. Online resources for corpus linguistics
  • 1. Types of resources
  • 2. Examples of resources

4
  • II. Application
  • A. Assigned tasks
  • B. Free exploration

5
Presentation
  • Definitions
  • Corpus (Latin for body)
  • A text or collection of texts
  • Now generally used to refer to machine-readable
    texts

6
  • Corpus linguistics
  • the use of the empirical data from a corpus to
    study language usage and to find patterns of
    language usage by analyzing actual language use

7
  • Requirements
  • A corpus
  • Can be a single text or a large collection of
    texts
  • Larger corpora provide more reliable results, if
    the purpose is making generalizations about
    language use

8
  • Balanced corpora
  • A variety of genres, including academic writing,
    newspapers, fiction, and spoken language

9
  • Specialized corpora
  • Examples
  • Academic writing
  • Text by learners of English
  • Teachers can develop their own corpora
  • Newspaper articles
  • Learners' texts

10
  • Corpus analysis tool(s)
  • Types
  • Tools with specific corpora
  • Tools that can be used with any text or
    collection of texts
  • General
  • Word, Excel, etc.
  • Specialized
  • Count words
  • Find examples of specific words or parts of speech
  • Analyze word frequencies
  • Evaluate readability
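
The counting and frequency functions listed above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by any of the tools in this presentation; the function names and the sample sentence are made up for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def word_frequencies(text):
    """Count how often each word form occurs in the text."""
    return Counter(tokenize(text))

# Hypothetical sample text, used only for illustration.
sample = "The rain fell. The heavy rain fell all day."
freqs = word_frequencies(sample)

print("Total words:", len(tokenize(sample)))
for word, count in freqs.most_common(3):
    print(word, count)
```

A teacher could run the same functions over a reading passage to see which words occur most often before deciding which ones to pre-teach.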

11
  • Online Corpora
  • Free to all users
  • Available for a fee or for purchase
  • Available only to restricted users
  • In this presentation, we will only introduce
    resources that are free.

12
  • Using Corpus Linguistics for Language Teaching
  • Technology has become widespread and accessible
  • Larger, more powerful computers that can analyze
    large amounts of data quickly are available
  • Many corpus-related resources have become
    available
  • Language teachers and learners can use corpora

13
  • Two types of uses of corpus-related resources
  • Low-contact uses: the teacher uses resources to
    help in teaching, e.g., to find the difficult
    words in a reading passage
  • High-contact uses: students use the corpora
    themselves to learn about language, e.g., to find
    out which adjectives collocate with "rain"

14
  • Data-driven learning is a high contact use of
    corpus-related resources.
  • Using corpora to work out rules of grammar or
    usage, e.g., to determine if a word's connotation
    is positive or negative
  • Advantages of data-driven learning
  • Focus on authentic language
  • Encouragement of students to work out rules for
    themselves
  • Real, exploratory activities rather than drills
  • A learner-centered activity

15
  • Web sites with suggestions for data-driven
    learning activities
  • How to use concordances in teaching English: Some
    suggestions
  • http://www.nsknet.or.jp/~peterr-s/concordancing/usingconcs.html
  • Data-Driven Learning (DDL): the idea
  • http://www.ecml.at/projects/voll/rationale_and_help/booklets/resources/menu_booklet_ddl.htm

16
  • Corpus-related Internet resources
  • 1. General resources on corpus linguistics
  • 2. Vocabulary frequency lists and frequency level
    checkers
  • 3. Online corpora, concordancers and other
    text-analysis software
  • 4. E-texts
  • 5. Information about using corpus linguistics for
    language teaching

17
  • Resources for Corpus Linguistics
  • http://www.cis.doshisha.ac.jp/kkitao/library/resource/corpus/corpus.htm

18
  • 1. General resources on corpus linguistics
  • Web sites that help orient users to corpora and
    to what is available online for teachers to use
    in the classroom or in preparing material

19
  • The Compleat Lexical Tutor
  • http://www.lextutor.ca/
  • Resource for data-driven learning
  • Tutorials, resources for teachers, resources for
    research
  • Bookmarks for Corpus Linguists
  • http://devoted.to/corpora/
  • Extensive annotated list of links related to
    corpus linguistics, including software, tools,
    and frequency lists; papers and articles; and
    English and non-English corpora

20
  • 2. Vocabulary frequency lists and frequency level
    checkers
  • Frequency lists
  • Words used most frequently in English and thus
    words that are most useful for students to know
  • Often divided into sublists
  • 3000 Most Commonly Used Words in the English
    Language
  • http://www.paulnoll.com/China/Teach/English-3000-common-words.html

21
  • Specialized word list
  • Academic Word List
  • http://www.nottingham.ac.uk/alzsh3/acvocab/index.htm
  • List includes 570 headwords with their word
    families
  • Site includes an explanation of the word lists,
    the words in each sublist, suggestions for using
    the list, and a gapmaker that can be used to
    produce gap-filling exercises

22
  • Frequency-level checkers
  • Produce a list of the words in a text at each
    level of difficulty
  • Help a teacher understand how difficult the
    vocabulary in a reading passage is and which
    words students at different levels of proficiency
    might need to learn
  • JACET 8000 Word List
  • http://www01.tcp-ip.or.jp/shin/j8web/j8web.cgi
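
What a frequency-level checker does can be sketched roughly as follows. The two miniature level lists below are invented for the example and are not taken from the JACET 8000 or any other published list; a real checker would load full lists and would usually group inflected forms under a headword.

```python
import re

# Hypothetical miniature level lists (assumption: not from any published list).
LEVELS = {
    1: {"the", "a", "is", "in", "to", "and", "of", "language"},
    2: {"study", "pattern", "teacher"},
}

def profile(text, levels=LEVELS):
    """Group the distinct words of a text by the first level list containing them."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    result = {level: [] for level in levels}
    result["off-list"] = []
    for word in sorted(words):
        for level, members in levels.items():
            if word in members:
                result[level].append(word)
                break
        else:
            # The word was not found in any level list.
            result["off-list"].append(word)
    return result

report = profile("The teacher is in a study of corpus language.")
for level, words in report.items():
    print(level, words)
```

Words that end up in the off-list group are the ones a teacher would most likely need to gloss or pre-teach.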

23
  • 3. Online corpora, concordancers and other
    text-analysis software
  • Concordancers
  • A type of software for searching corpora
  • Produces a list of key words in context (KWIC),
    that is, search terms with the words that come
    before and after them.
  • May be able to search for parts of speech, e.g.,
    take, followed by a preposition
  • May be able to search for two words that are not
    next to each other
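
The idea of a KWIC list can be illustrated with a short Python sketch. It is a simplified stand-in for a real concordancer: it matches whole words only, ignores parts of speech, and uses an invented sample text.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, keyword, right) tuples with up to `width` context words per side."""
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# Hypothetical sample text, used only for illustration.
text = ("Heavy rain fell all night. By morning the rain had stopped, "
        "but light rain returned.")
results = kwic(text, "rain")

# Right-align the left context so the keyword lines up in one column.
for left, key, right in results:
    print(f"{left:>20}  {key}  {right}")
```

Aligning the search term in a single column is what makes patterns to its left and right easy for learners to spot.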

24
  • Corpora (or parts of corpora) may have spoken
    language, written language, American English,
    British English, academic English, and so on.
  • Specialized corpora include
  • parallel corpora, which have the same texts in
    different languages (to compare the same passages
    in different languages)
  • learner corpora, which have students' writing/
    speaking (to help identify learners' problems)

25
  • Examples of concordancers
  • Turbo Lingo
  • http://www.staff.amu.edu.pl/sipkadan/lingo.htm
  • Can enter a text or URL and get a list of KWIC,
    average sentence length, word frequency list, and
    other analyses

26
  • VIEW (Variation in English Words and Phrases)
  • http://view.byu.edu/
  • Concordancing tool for the British National
    Corpus
  • A powerful concordancing tool
  • Has a useful tutorial
  • Click on what you want to do to see samples of
    searches

27
  • Types of searches
  • Search by exact word, exact phrase, wildcard, or
    part of speech
  • For example, mysterious
  • Use ? or * as a wildcard
  • For example, point
  • Search for an exact word plus a part of speech
  • For example, white n
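
Wildcard searching can be sketched in Python under the common convention that ? stands for exactly one character and * for any run of characters. That convention is an assumption made for this example; the exact query syntax of VIEW or any other tool may differ. The word list is invented.

```python
from fnmatch import fnmatchcase

def wildcard_search(words, pattern):
    """Return the words matching a pattern where ? is one character and * is any run."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]

# Hypothetical word list, used only for illustration.
vocabulary = ["point", "points", "pointed", "appoint", "paint", "print"]

print(wildcard_search(vocabulary, "point*"))   # words beginning with "point"
print(wildcard_search(vocabulary, "p?int"))    # one free letter after "p"
print(wildcard_search(vocabulary, "*point*"))  # "point" anywhere in the word
```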

28
  • Compare usage of semantically related words
  • sheer/total n
  • Search for surrounding words
  • Nouns that follow the verb wrap
  • Limit the search to one register
  • Adjectives in tabloid newspapers

29
  • Compare usage between registers, e.g., news and
    speaking
  • we verb that ACAD vs SPOKEN
  • Find words with similar, more general, or more
    specific meanings
  • Similar words to small
  • More general than shriek
  • More specific than woman

30
  • Online concordancer
  • http://www.lextutor.ca/concordancers/concord_e.html
  • Can search a variety of corpora, including the
    Brown Corpus, the British National Corpus
    (written and spoken), a learner corpus, etc.
  • Produces a KWIC list for a given word and a list
    of collocates and their frequency
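
A list of collocates and their frequencies can be approximated by counting the words that appear within a small window around the search word. The sketch below assumes a window of one word on each side and an invented sample text; it is not how the Lextutor concordancer is implemented.

```python
import re
from collections import Counter

def collocates(text, node, span=1):
    """Count the words occurring within `span` words of each occurrence of `node`."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

# Hypothetical sample text, used only for illustration.
text = "Heavy rain fell. Heavy rain is common. Light rain fell later."
rain_collocates = collocates(text, "rain")

for word, count in rain_collocates.most_common():
    print(word, count)
```

With a large corpus, the same count would let students check for themselves which adjectives collocate with a noun such as "rain".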

31
  • WebCorp
  • http://www.webcorp.org.uk/
  • uses the Internet as a corpus and produces KWIC
    as well as providing other information

32
  • Online software to assess readability
  • Tests of document readability and suggestions on
    how to improve readability
  • http://www.online-utility.org/english/readability_test_and_improve.jsp
  • Can analyze texts of any length (some online
    text analysis programs have limits)

33
  • Can enter the text directly or enter a URL
  • e.g., http://www.cis.doshisha.ac.jp/kkitao/Japan/shimoda/s1.htm
  • Provides statistics
  • Number of characters
  • Number of words
  • Number of sentences
  • Number of syllables/word
  • Number of words/sentence

34
  • Calculates readability indexes, including
  • Gunning Fog Index
  • Coleman-Liau Index
  • Flesch-Kincaid Grade Level
  • Flesch Reading Ease
  • Lists sentences that might be rewritten to
    improve readability.
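
The statistics and two of the indexes above can be sketched as follows, using the standard published formulas for Flesch Reading Ease and Flesch-Kincaid Grade Level. The syllable counter is a rough heuristic assumed for this example, so the scores will differ somewhat from those of the online tool, whose exact counting rules are not described here.

```python
import re

def count_syllables(word):
    """Rough heuristic: count vowel groups, dropping a silent final 'e'."""
    word = word.lower()
    if word.endswith("e") and not word.endswith("le") and len(word) > 2:
        word = word[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", word)))

def readability(text):
    """Return basic text statistics and two Flesch readability scores."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # words per sentence
    spw = syllables / len(words)        # syllables per word
    return {
        "words": len(words),
        "sentences": len(sentences),
        "words_per_sentence": wps,
        "syllables_per_word": spw,
        "flesch_reading_ease": 206.835 - 1.015 * wps - 84.6 * spw,
        "flesch_kincaid_grade": 0.39 * wps + 11.8 * spw - 15.59,
    }

# Hypothetical sample text, used only for illustration.
stats = readability("The cat sat on the mat. It was happy.")
for name, value in stats.items():
    print(name, round(value, 2))
```

Both scores depend only on sentence length and word length in syllables, which is why shortening long sentences is the usual suggestion for improving readability.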

35
  • 4. E-texts
  • In some cases, teachers or students may want to
    develop their own corpora. Large numbers of
    e-texts are available.
  • Project Gutenberg
  • http://www.gutenberg.org/wiki/Main_Page
  • Large collection of downloadable fiction and
    non-fiction

36
  • Internet Public Library Online Texts
  • http://www.ipl.org/div/subject/browse/hum60.60.00/
  • A large number of online texts on a wide variety
    of subjects
  • Drew's Script-o-Rama
  • http://www.script-o-rama.com/oldindex.shtml
  • A website with a large number of scripts of
    movies and TV programs
  • American Rhetoric Online Speech Bank
  • http://www.americanrhetoric.com/speechbank.htm
  • A website with a large collection of speeches

37
  • 5. Information about using corpus linguistics for
    language teaching
  • Corpus-related websites specifically for language
    teachers
  • Learner corpora and SLA Research
  • http://leo.meikai.ac.jp/~tono/
  • Links to learner corpora made up of language
    produced by speakers of various languages, links
    to useful tools, a bibliography, and so on

38
  • Corpus linguistics: What it is and how it can be
    applied to teaching
  • http://iteslj.org/Articles/Krieger-Corpus.html
  • An article about corpus linguistics and how it
    can be used in the language classroom

39
Activities
  • Use a corpus to check grammar
  • http://www.lextutor.ca/grammar_tester/
  • Use the concordancer in the bottom frame to check
    the grammar of the sample sentences in the top
    half

40
  • Use a concordancer to make a gap-filler or a quiz
  • http://www.lextutor.ca/multi_conc/
  • http://www.nottingham.ac.uk/alzsh3/acvocab/awlgapmaker.htm

41
  • Find examples of a word and group them according
    to meaning
  • Examples (http://www.lextutor.ca/concordancers/concord_e.html)
  • party
  • run

42
  • Use the results of a KWIC search to determine how
    synonyms are used differently
  • Examples (http://www.lextutor.ca/concordancers/concord_e.html)
  • travel, journey, trip, voyage, tour
  • confident, fearless, pushy, upbeat, self-reliant

43
  • Use the Academic Word List web page to enter a
    text and make a gap-filling activity
  • http://www.nottingham.ac.uk/alzsh3/acvocab/awlgapmaker.htm
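
For teachers who want to see how a gap-filling activity could be generated, here is a minimal Python sketch. It is not the gapmaker's actual implementation; the short target list is hand-picked for the example and stands in for a full word list, and the sentence is invented.

```python
import re

def make_gap_fill(text, targets):
    """Replace each target word with a numbered blank; return (exercise, answers)."""
    targets = {t.lower() for t in targets}
    answers = []

    def blank(match):
        word = match.group(0)
        if word.lower() in targets:
            answers.append(word)
            return f"({len(answers)}) ______"
        return word

    exercise = re.sub(r"[A-Za-z]+", blank, text)
    return exercise, answers

# Hypothetical target words and sentence, used only for illustration.
exercise, answers = make_gap_fill(
    "Researchers analyze data to establish a theory.",
    ["analyze", "data", "theory"],
)
print(exercise)
print("Answers:", ", ".join(answers))
```

The answers can be shuffled and printed as a word bank above the exercise.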

44
  • Thank you