1
Lexicalization across Languages
  • Robert Belvin - Fall 2005

2
Surprise from Spanish
  • The men walked across the street
  • Los hombres cruzaron la calle caminando
  • Los hombres caminaron a través de la calle
  • (In Spanish, this second version is used
    contrastively, to focus on walk)
  • We ran down the stairs
  • Bajamos las escaleras corriendo
  • Corrimos hacia abajo por las escaleras
  • (In Spanish, this second version is used
    contrastively, to focus on run)

3
Insights from Typological and Cognitive
Linguistics
  • Leonard Talmy (1985, etc.), Jackendoff, etc.
  • - It is possible to isolate meaning elements,
    especially by cross-linguistic investigation
    (transparent morphology, diathesis)
  • - For a motion-event verb: motion, path, figure,
    ground, manner, cause
  • Then:
  • - We can examine which semantic elements
    are expressed by which lexical and syntactic
    (surface) elements

4
Meaning-Surface elements not in 1-1 correspondence
  • The relationship is mostly not one-to-one
  • A combination of surface elements is sometimes
    required to express one semantic element
  • French negative: ne ... pas
  • Semitic definite article: def-N def-Adj
  • al-rajulu (a)l-thariiyun (Mod. Std. Arabic)
  • the-man the-wealthy = "the wealthy man"
  • More than one semantic element can be expressed by
    one lexical item (what we're looking at here). Talmy
    calls this lexicalization (note there is also a
    different NLP use, meaning: retrieve from the lexicon
    and associate with tokens POST)

5
Typical Pattern for English
  • Focusing on the verb root (vs. root + affixes)
  • Most common type for Indo-European languages (also
    Chinese)
  • Examples:
  • BE_LOC -- The lamp stood/lay/leaned on the table.
  • MOVE/GO -- The rock slid/rolled/bounced down the
    hill.
  • I ran/limped/jumped/stumbled/groped my way down
    the stairs.
6
Conceptual Semantics Representation
  • Jackendoff (1983, 1990, etc.). Implemented in an
    MT system by Bonnie Dorr (1993, U. Maryland)
  • (capitalized words are meta-language
    elements, which happen to be in English)
  • [EVENT GO_LOC
    ([THING<anim> __], [PATH TO [LOCATION __]])
    [MANNER WALKING]]
  • Lexical Conceptual Structure (LCS) of the verb root
    walk

7
LCS Composition
  • Combines with the LCS for the preposition across:
  • [PATH ACROSS [LOCATION __]]
  • and with the LCSs for the nouns Joe and street:
  • [THING<anim> Joe]   [LOCATION street]

8
LCS Composition
  • to yield:
  • [EVENT GO_LOC
    ([THING<anim> Joe], [PATH ACROSS [LOC street]])
    [MANNER WALKING]]
  • Joe walked across the street

9
Spanish prefers to incorporate Path
Talmy again, on Romance packaging of motion verbs
10
Spanish LCS for motion verb
  • In Jackendoff's notation, the verb root "cruzar" is
    then:
  • [EVENT GO_LOC
    ([THING __], [PATH ACROSS [LOC __]])]
  • Contrast with English walk:
  • [EVENT GO_LOC
    ([THING<anim> __], [PATH TO [LOCATION __]])
    [MANNER WALKING]]
  • The generic TO Path unifies with any other Path
    element

11
LCS Composition
  • Adjoin the manner adverbial caminando:
  • [MANNER WALKING]
  • and the location and figure expressions, to
    yield:
  • [EVENT GO_LOC
    ([THING<animate> José], [PATH ACROSS
    [LOCATION calle]])
    [MANNER WALKING]]
  • José cruzó la calle caminando

12
Other Languages Incorporate Figure
  • Atsugewi (Hokan, Northern Calif.)
  • We also have some examples in English (rain,
    spit, drool, etc.)
  • But most Atsugewi motion verbs are of this type
  • No attested languages incorporate
    motion + ground as the primary type; it is not
    clear why
13
Criteria for Determining Language Type
  • Talmy gives 3 criteria for determining a
    language's lexicalization type:
  • the verb type should be frequent in occurrence
  • colloquial in style (not literary)
  • pervasive (a wide range of semantic notions
    expressed)
  • Thus, English has spit and cross and
    descend, but the majority type is like walk and
    run. Similarly, Spanish has caminar and
    correr, but the majority type is like cruzar or
    bajar

14
General Analysis Procedure
  • Identify the main verb, retrieve its LCS, then fill
    the required arguments by unifying with compatible
    LCS elements; adjoin any additional
    (non-argumental) phrases
  • (symbol missing in source) indicates an obligatory
    argument
  • walk: [EVENT GO_LOCATION
    ([THING __], [PATH TO [LOCATION __]])
    [MANNER WALKING]]
  • across: [PATH ACROSS ([LOCATION __])]
  • joe: [THING<anim> joe]
  • street: [LOCATION street]

15
General Analysis Procedure
  • Since across will unify with the default PATH
    (expressed as TO), it gets incorporated as an
    argument of the GO event:
  • [EVENT GO_LOCATION
    ([THING Joe], [PATH ACROSS [LOCATION
    street]])
    [MANNER WALKING]]
  • This is the Composed LCS (CLCS; Dorr 93)
  • (versus the Root LCS (RLCS), which is just the verb
    root's LCS)

16
Lexical Selection in Generation
  • Look for the match that covers the most LCS
    elements possible: cruzar can cover two elements
  • CLCS:
  • [EVENT GO_LOC ([THING joe], [PATH ACROSS [LOC
    street]]) [MANNER WALKING]]
  • [EVENT GO_LOC ([THING __], [PATH ACROSS [LOC __]])]
    ← LCS included as part of the lexical entry for
    cruzar

17
Lexical Selection in Generation
  • There may be multiple matches: the verb caminar
    also covers two elements
  • CLCS:
  • [EVENT GO_LOC ([THING joe], [PATH ACROSS [LOC
    street]]) [MANNER WALKING]]
  • [EVENT GO_LOC ([THING __], [PATH TO [LOC __]])
    [MANNER WALKING]]
    ← LCS included as part of the lexical entry for
    caminar (cf. walk)
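
A minimal sketch of this coverage-based selection, under simplifying assumptions: each LCS is flattened to a set of (role, value) elements, the default TO path of caminar is left out because it unifies with any path, and the function names candidates and leftover are hypothetical. The entry for bajar uses a DOWN path as an assumed encoding.

```python
# Flattened CLCS for "Joe walked across the street" (arguments omitted).
CLCS = {("EVENT", "GO_LOC"), ("PATH", "ACROSS"), ("MANNER", "WALKING")}

# Hypothetical lexicon: the LCS elements each Spanish verb root lexicalizes.
LEXICON = {
    "cruzar": {("EVENT", "GO_LOC"), ("PATH", "ACROSS")},
    "caminar": {("EVENT", "GO_LOC"), ("MANNER", "WALKING")},
    "bajar": {("EVENT", "GO_LOC"), ("PATH", "DOWN")},
}

def candidates(clcs, lexicon):
    """Return (coverage, verbs) for the entries covering the most elements.

    An entry qualifies only if all of its elements occur in the CLCS.
    """
    scores = {v: len(e) for v, e in lexicon.items() if e <= clcs}
    best = max(scores.values(), default=0)
    return best, sorted(v for v, s in scores.items() if s == best)

def leftover(clcs, verb, lexicon):
    """Elements the chosen verb does not cover (to be realized elsewhere)."""
    return clcs - lexicon[verb]

coverage, verbs = candidates(CLCS, LEXICON)
print(coverage, verbs)
for v in verbs:
    print(v, "leaves", leftover(CLCS, v, LEXICON))
```

Both cruzar and caminar cover two elements here, so coverage alone does not decide between them. The leftover set shows what must still be expressed: the manner (caminando) if cruzar is chosen, the path if caminar is chosen.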

18
Scaling Problems
  • The familiar knowledge acquisition bottleneck
  • An application with 500 verbs is a problem; with
    5,000 it is nearly impossible, though there are
    now large annotation projects that may provide
    some of this knowledge in machine-readable form
  • Labor-intensive, with consistency problems

19
Automate Acquisition of LCSs?
  • Tool can lead the developer through a decision
    tree
  • Is the core verb meaning stative or eventive?
  • If eventive:
  • Does the core meaning entail GO, STAY or INCHO?
  • If GO:
  • Choose the best PATH from a list of 12
  • (ABOUT, ACROSS, ALONG, AWAY-FROM, DOWN, FROM, IN,
    TO, TOWARD, etc.)
  • Is a causative alternation possible?
  • Is the causative element CAUSE or LET?
  • etc., etc.
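
The decision tree above could be driven by a small function like the following sketch. It is an assumption-laden illustration, not the tool described on the slide: the answer keys, the function name lcs_skeleton, and the output strings are hypothetical, and only the nine path names actually listed are included.

```python
# Path primitives named on the slide; the remaining ones (of 12) are
# elided in the source and therefore not listed here.
PATHS = ["ABOUT", "ACROSS", "ALONG", "AWAY-FROM", "DOWN",
         "FROM", "IN", "TO", "TOWARD"]

def lcs_skeleton(answers):
    """Map a developer's answers to an LCS skeleton string.

    `answers` is a hypothetical dict with keys:
      aspect:    "stative" or "eventive"
      primitive: "GO", "STAY" or "INCHO" (eventive only)
      path:      one of PATHS (GO only)
      causative: None, "CAUSE" or "LET"
    """
    if answers["aspect"] == "stative":
        # The stative branch is not spelled out on the slide.
        return "STATE ..."
    prim = answers["primitive"]
    if prim == "GO":
        path = answers["path"]
        if path not in PATHS:
            raise ValueError(f"unknown path: {path}")
        core = f"EVENT GO (THING, PATH {path})"
    elif prim == "STAY":
        core = "EVENT STAY (THING, POSITION)"
    elif prim == "INCHO":
        core = "EVENT INCHO (STATE)"
    else:
        raise ValueError(f"unknown primitive: {prim}")
    causer = answers.get("causative")
    if causer in ("CAUSE", "LET"):
        return f"EVENT {causer} (THING, {core})"
    return core

print(lcs_skeleton({"aspect": "eventive", "primitive": "GO",
                    "path": "ACROSS"}))
```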

20
Automate Acquisition?
  • Still problems with labor intensive aspect of
    development
  • System was brittle
  • Many other kinds of information are required in
    addition to the thematic structure (LCS
    representation)
  • Problems may not be insurmountable: probabilistic
    methods seem to be a likely solution to the
    brittleness problem.
  • Learning algorithms - CMU automated elicitation
    method (Avenue project)

21
Toward Automating the Acquisition Process
  • Apply diagnostic tests, either via a human in the
    loop or by searching large corpora. Note that
    the hypothesis is that there's a finite (fairly
    small) number of templates (below is a large
    fraction of them)
  • EVENT → INCHO (STATE)
  • EVENT → GO (THING, PATH)
  • EVENT → STAY (THING, POSITION)
  • STATE → BE (THING, POSITION)
  • STATE → ORIENT (THING, PATH)
  • STATE → GO-EXTENT (THING, PATH)
  • EVENT → LET (THING/EVENT, EVENT/STATE)
  • EVENT → CAUSE-EXCHANGE (EVENT, EVENT)
  • EVENT → CAUSE (THING/EVENT, EVENT/STATE)

22
Automating the Acquisition Process--Diagnostics
  • Determining whether the verb root is eventive or
    stative:
  • - Does it occur in present progressive
    constructions?
  • John is walking to the store
  • - Does it occur in pseudo-cleft constructions?
  • What John did was walk to the store
  • - Does it have only a habitual interpretation if
    used in the simple present tense?
  • John walks to the store (every day)
  • ??Oh look, John walks to the store
  • If so, it's likely eventive

23
Automating the Acquisition Process--Diagnostics
  • Stative predicates pattern oppositely:
  • *John is knowing Spanish
  • *What John did was know Spanish
  • John knows Spanish - has a true present tense
    interpretation
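
The first two diagnostics (progressive and pseudo-cleft) lend themselves to a simple corpus search. The sketch below is a toy illustration under stated assumptions: the regular expressions, the function names, and the three-sentence corpus are hypothetical, and the caller supplies the -ing form rather than having it derived automatically.

```python
import re

def progressive_hits(ing_form, sentences):
    """Count sentences containing a form of 'be' followed by the -ing form."""
    pat = re.compile(
        rf"\b(?:am|is|are|was|were)\s+{re.escape(ing_form)}\b",
        re.IGNORECASE)
    return sum(1 for s in sentences if pat.search(s))

def pseudo_cleft_hits(base_form, sentences):
    """Count pseudo-clefts of the shape 'What X did was <verb>'."""
    pat = re.compile(
        rf"\bwhat\s+\w+\s+did\s+was\s+{re.escape(base_form)}\b",
        re.IGNORECASE)
    return sum(1 for s in sentences if pat.search(s))

def classify(ing_form, base_form, sentences):
    """Positive hits suggest eventivity; zero hits prove nothing."""
    hits = (progressive_hits(ing_form, sentences)
            + pseudo_cleft_hits(base_form, sentences))
    return "likely eventive" if hits > 0 else "no evidence (possibly stative)"

# Toy corpus built from the example sentences in the slides.
corpus = [
    "John is walking to the store.",
    "What John did was walk to the store.",
    "John knows Spanish.",
]
print(classify("walking", "walk", corpus))
print(classify("knowing", "know", corpus))
```

A verb with no hits is reported only as lacking evidence, since failing to find a construction in a corpus does not show that it is impossible.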

24
Automating the Acquisition Process--Diagnostics
  • Many other observable patterns which can offer
    clues
  • Levin Alternations (English Verb Classes and
    Alternations)
  • 11 different properties listed for this class of
    manner of motion verb
  • Wide variation in path expressions allowed
  • Induced Action alternation
  • Locative preposition drop
  • Locative inversion
  • There-insertion
  • Adjectival Passive
  • etc.
  • Taken alone, tests are inconclusive, but
    effective in combination

25
Automating the Acquisition Process--Diagnostics
  • Certain patterns of occurrence are indicative of
    very specific LCS characteristics
  • - Adjectives as the base for inchoative events, like
    redden, sadden, darken, lessen
  • [EVENT INCHO ([STATE BE ([THING __],
    [POSITION AT [PROPERTY RED]])])]

26
Automating the Acquisition Process--Diagnostics
  • Certain patterns of occurrence are indicative of
    very specific LCS characteristics
  • Causative alternations with animate subjects are
    very suggestive of
  • CAUSE (T<anim>, E)
  • The boat docked / The captain docked the boat
  • Non-movable objects appearing to go somewhere
    (fictive motion): [STATE GO-EXTENT ([THING __],
    [PATH [THING/LOCATION __]])]
  • The road went under the bridge

27
Automating the Acquisition Process--Diagnostics
  • Difference between Stative and Durative
  • Durative predicates can be volitionally
    maintained; stative ones generally can't
  • John deliberately sits in front of Bill versus
    *John is deliberately muddy
  • Note that even though the verb sit is intuitively
    durative (extends over a period of time) and
    homogeneous (sitting doesn't entail a change of
    state), it is not truly stative, since it passes
    the tests for eventivity we noted earlier.
  • In LCS terms, this translates to using EVENT
    STAY ... for durative events, and STATE BE ...
    for states.
  • Characteristic of processes: the imperfective
    paradox
  • John is running ⇒ John has run
  • versus accomplishments or achievements (change of
    state events)
  • John is building a house ⇏ John has built a house

28
Toward Automating the Acquisition Process
  • Either search large corpora or employ native
    speakers with an appropriate interface, as done for
    automated acquisition of transfer rules in
    Carbonell et al. (2002), the CMU Avenue project.
  • Some diagnostic questions are impossible to
    answer fully automatically because we're looking
    for the absence of something (e.g. a pseudo-cleft of
    a verb hypothesized to be stative): the fact that it
    doesn't occur doesn't mean it can't occur.