CULT 2004 Barcelona - PowerPoint PPT Presentation

Slides: 22
Provided by: kristava
Transcript and Presenter's Notes

1
CULT 2004 Barcelona
  • Krista Varantola
  • The contextual turn in learning to translate

2
Why personification?
  • Partners in collaboration
  • Human translators
  • Dictionaries
  • Corpora
  • -------------
  • Translation memory software
  • Machine translation
  • Other human experts
  • Encyclopedias
  • Handbooks
  • ...

3
Competence, performance, translation, translating
  • How to match competence with performance?
  • Translating: text production with or without
    source text
  • In L1-L2 translating?
  • In L1-L2 or L2-L1 translating in special domains

4
The contextual turn in translating
  • Turns in translation studies
  • Linguistic
  • Cultural
  • Historical
  • Translators' visibility
  • Social
  • What about translating?
  • A contextual turn
  • See papers at this conference!

5
Why is the contextual turn so strong just now?
  • New developments in
  • The electronic format
  • The Web
  • Access to corpora
  • Corpus-based dictionaries
  • Translators' problem-solving techniques
  • Dictionaries and corpora are converging
  • in the sense of "to come from other places to
    meet in a particular place and to work together"
  • Translators meet them there
  • to translate with them
  • Collaboration presupposes interaction

6
How could we improve this collaboration?
  • Dictionaries and Corpora
  • Very different but interdependent tools
  • Modern dictionaries are based on corpora
  • Dictionary development was the primary reason for
    large and general corpus collections
  • Lexicographers' corpus needs are different from
    those of translators
  • Translators can benefit from large and balanced
    general corpora (such as the BNC)
  • KWIC concordances
  • keyword extraction
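
A KWIC (keyword-in-context) concordance lists every hit for a search word with a fixed span of text on either side. The sketch below is only an illustration of that idea, not a tool named in the talk: the function name `kwic`, the `width` parameter and the sample sentences are all assumptions made for the example.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of `keyword`.

    `width` is the number of characters of context shown on each side
    (a hypothetical parameter chosen for this sketch).
    """
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keyword column lines up.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right:<{width}}")
    return lines

# Invented sample text, not taken from any real corpus.
sample = ("The lean years followed the boom. She bought lean meat at the "
          "market. Managers want a lean organisation.")

for line in kwic(sample, "lean", width=20):
    print(line)
```

Aligning the keyword in one column is what lets a translator scan the left and right neighbours for recurring patterns.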

7
Corpora
  • They also need smaller targeted corpora
  • permanent
  • disposable
  • parallel
  • comparable
  • Corpus information is by definition context-bound
  • Lexical items in corpora do not appear in
    isolation
  • Corpora help learners with real language
  • Corpora are more cumbersome to use than
    dictionaries
  • We need corpus tools to get information from
    corpora
  • Corpus information is not validated
    (peer-reviewed) the way dictionary information is

8
Dictionaries
  • Used to have the ideal of context-free, generally
    valid information
  • The electronic format allows more freedom
  • Greater awareness of the role of the context
  • More collocational information
  • More usage examples
  • Even access to corpus examples
  • Awareness of semantic prosody
  • Meaning a much less clear-cut category
  • Ordering of word senses based on their real
    frequency
  • Awareness of lexicographic relevance

9
Dictionary issues
  • Meaning and comments from corpus lexicographers
  • "I don't believe in word senses" (Atkins)
  • Word meaning can be regarded as (at best) yet
    another form of prototype (Rundell)
  • Words do not have separate meanings but meaning
    potentials which may be activated in particular
    contexts (Hanks)
  • A word can have about as many senses as a
    lexicographer cares to perceive (Hanks)
  • Instead of listing subsenses, the generative
    lexicon aims at making them result from the
    combination of the general meanings of the
    various words in a sentence (Zaenen)
  • In bilingual dictionaries, a translation
    equivalent, the gloss or glosses on the right
    hand side should be regarded as an approximation
    or key to the meaning of the entry word
    (Varantola)

10
Dictionary issues cont.
  • What is word meaning?
  • Word meaning is probabilistic
  • We discern tendencies or preferences
  • We need prototype theory and core meanings
    (Hanks)
  • What do we need?
  • A more phrasally-oriented approach with
    phraseology (Cowie)
  • information on collocational patterns and
    combinatory tendencies of prefabricated units
  • information on variation. Conversation is a
    journey (Cf. Lakoff and Johnson). Conversation
    drifts, revolves around, veers, wanders, moves.
    (Rundell)

11
Problems in present-day dictionaries
  • Equal prominence given to rare and unusual words
    and senses (Rundell)
  • Lack of pragmatics ("right" is used
    conversationally far more frequently than in the
    sense "correct")
  • Lack of lexicographic relevance
  • Number of lines in the entry vs. frequency in use
  • inarticulacy, inarticulateness appear a total of
    seven times in the BNC, yet they are allocated
    several lines in a well-known learner's dictionary

12
Solutions
  • The context rules
  • Meanings come from contexts
  • NODE 1998 - entries are divided into core
    senses which act as a gateway to other related
    subsenses
  • MED 2002 - words may have meaning clusters with
    several subsenses - cf. escape v. - essentially
    monosemous
  • Information on semantic prosody included
  • Cause has a marked preference for a negative
    object
  • Consequence a 5:1 ratio of negative to positive
  • Result has a mainly positive one
  • What about adjectives?
  • Old-fashioned - positive, value-free, negative
    (Rundell)
  • lean - lean meat, lean years, lean clothes, lean
    songs - Which of them are candidates for
    dictionary entry? (Hanks)
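
A negative-to-positive ratio of the kind quoted above could be obtained by classifying the collocates of a word and counting the classes. The following is a minimal sketch under stated assumptions: the polarity lexicon `POLARITY` and the list of collocates are hand-made toy data for illustration, not corpus findings and not the method used by the lexicographers cited.

```python
from collections import Counter

# Hypothetical, hand-made polarity lexicon (illustration only).
POLARITY = {
    "damage": "neg",
    "problems": "neg",
    "delay": "neg",
    "harm": "neg",
    "joy": "pos",
}

def prosody_counts(collocates, polarity=POLARITY):
    """Count collocates by polarity; unlisted words count as 'neutral'."""
    return Counter(polarity.get(w.lower(), "neutral") for w in collocates)

# Invented list of objects found after a verb, not real corpus output.
objects = ["damage", "problems", "delay", "harm", "joy", "changes"]

counts = prosody_counts(objects)
print(f"negative:positive = {counts['neg']}:{counts['pos']}")
print(f"neutral = {counts['neutral']}")
```

In practice the classification step is the hard part; a real study would rely on manually checked concordance lines rather than a fixed word list.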

13
What are corpora good for?
  • They are not word-oriented.
  • The searchable unit-size is not fixed in the way
    it is in dictionaries
  • They can be manipulated and dissected
  • They give information of word use or of the use
    of prefabricated chunks and longer stretches of
    text
  • They allow users to observe real-time meaning
    changes and developments
  • - gates (chaos theory)
  • They reassure the user (This expression is really
    used in this context)
  • They enable fuzzy searches and serendipitous
    finds
  • They are very helpful in solving genre- and text
    type-related questions in editing tasks

14
Wordness
  • Wordness in corpora and wordness in dictionaries
    are two very different concepts
  • Not all corpus words fulfil the stringent
    criteria of dictionary words
  • Corpus word essentially a meaningful string of
    letters separated by a blank?
  • permanency vs. non-permanency
  • creative uses vs. encapsulated meanings
  • Not all derivatives are in the dictionary
  • Nonce words have no place in dictionaries

15
Corpus words and dictionary words
  • Translators need information about dictionary
    words and corpus words
  • A typical question
  • Is this word/expression really used?
  • A Google check on the Web
  • If so, is the context relevant? (reassurance,
    e.g. unvalidated)
  • Another typical question
  • Is the dictionary word adequate - what else is
    there?
  • A Google check on the Web
  • (reassurance, e.g. iron-bar lever, lever, steel
    pole)

16
Problems with corpus use
  • Cumbersome to compile
  • Cumbersome to use
  • Non-systematized, non-digested
  • User-driven, user skills key to success
  • Reliability of information needs to be assessed
    by user
  • Analysis tools are not easy to use
  • Users have various corpus needs in and between
    different languages
  • Permanent collections
  • Non-permanent collections
  • Comparable corpora
  • Parallel corpora
  • Large and small corpora, etc.
  • Spoken-language corpora
  • Multiply tagged corpora, etc.

17
Intelligent tools and the user
  • What is intelligence in this case?
  • Intelligence as interactivity - a way of getting
    deeper and closer to the answer the user is
    satisfied with
  • Shallow intelligence vs. deeper intelligence in
    dictionaries
  • Shallower
  • Spell-checking systems, cross-references to
    near-synonyms, antonyms, etc.
  • Mechanical information in general
  • Deeper
  • User-definable user profiles, user-specified
    filters and displays
  • Look-up modes vs. browsing modes
  • A larger selection of data categories, such as
    categories relating to semantic prosody and
    difficult-to-spot collocational patterns
  • Information on semantic valence and sense
    differentiation (The potential of Frame
    Semantics)
  • Full compatibility with different corpora
  • Multimedia displays
  • Interactive suggestions to other available links
    (domain-specific information, encyclopedic
    information, contextually relevant information,
    suggestions based on user search pattern logs)

18
Intelligence in corpora
  • Shallower
  • Raw data or POS-tagged corpora, present-day
    stand-alone analysis tools
  • Deeper
  • Syntactically and semantically tagged corpora
  • Dictionary-linked corpora
  • Bidirectional access between corpora and
    dictionaries
  • Easy-to-use compiling tools
  • corpus design and balancing
  • parallel and comparable corpus compilation tools
  • housekeeping tools (Data source, genre,
  • data capture
  • Easy-to-use analysis tools
  • Easy-to-use WordSmith-style tools (concordancers,
    collocational patterning, frequency and keyword
    lists, statistical analyses)
  • WordSketch-style tools for search word usage
    patterns
  • Easy-to-use maintenance tools
  • Updating and consistency control
  • Syntactic and semantic taggers
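
Two of the analysis functions listed above, frequency lists and collocational patterning, can be sketched in a few lines. This is an assumption-laden illustration and not how WordSmith or any other named tool works internally: the helper names `tokenize`, `frequency_list` and `collocates`, the `window` parameter and the sample text are all invented for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def frequency_list(tokens, top=5):
    """Return the `top` most frequent tokens with their counts."""
    return Counter(tokens).most_common(top)

def collocates(tokens, node, window=2):
    """Count tokens occurring within `window` positions of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Skip the node word itself, keep its neighbours.
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts

# Invented sample text, not taken from any real corpus.
text = "Lean meat is cheap. Lean years are hard. They sell lean meat."
tokens = tokenize(text)

print(frequency_list(tokens, top=3))
print(collocates(tokens, "lean", window=1).most_common(3))
```

A keyword list in the stricter sense compares frequencies against a reference corpus; that comparison is left out of this sketch.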

19
Users and user skills: How to improve users?
  • Modern translator competence presupposes an
    understanding of modern performance-enhancing
    tools, such as electronic dictionaries, corpora
    and corpus tools, and of their limits
  • How are these skills acquired? - Through
    training
  • How are the trainers trained? - Through
    cooperation and conferences - CULT
  • Who is the boss in the collaborative
    environment?
  • The translator. The final responsibility lies
    with the translator and the translator decides
    on the most adequate solution with the help of
    up-to-date tools which all belong to one snap-on
    toolbox.

20
Dictionaries vs. corpora
  • Dictionaries chase a moving target (Atkins)
  • Corpora get closer to the moving target but even
    they are always slightly behind
  • Dictionary information is more elaborated and on
    a firmer basis
  • Corpus information is shakier but more real

21
Challenges
  • WE SHOULD FIND OUT WHY THE USER IS LOOKING UP A
    PARTICULAR WORD
  • WE SHOULD MOVE FROM INFORMATION OVERLOAD TO
    LEXICAL KNOWLEDGE MANAGEMENT
  • DICTIONARIES SHOULD NO LONGER BE STAND-ALONE
    PRODUCTS