Linguistics 187/287 Week 2

Title: Contextually-related Entities Author: Francine Chen Last modified by: Powerset Inc. Created Date: 8/15/2003 12:17:06 AM Document presentation format – PowerPoint PPT presentation

Slides: 62
Provided by: Francin150
Learn more at: http://web.stanford.edu
1
Linguistics 187/287 Week 2
Engineering and Linguistic Generalizations

2
  • Homework
  • Due Friday
  • Can discuss in class or via email or ask us for
    office hours
  • Last assignment
  • How much time?
  • Trouble: access, procedure?
  • Issues: XLE, LFG, grammar?

3
Topics for this week
  • Notation in LFG (more background)
  • Templates
  • Lexical rules
  • Configurations
  • Feature declaration
  • Metarulemacro

4
Grammar engineering for deep processing
  • Draws on theoretical linguistics, software
    engineering
  • Theoretical linguistics => papers
  • Generalizations, universality, idealization
    (competence)
  • Software engineering => programs
  • Coverage, interface, QA, maintainability,
    efficiency, practicality
  • Grammar engineering
  • Grammar : Theory :: Program : Programming language
  • Reflect linguistic generalizations
  • Respect special cases of ordinary language
  • Deal with large-scale interactions
  • Theory/practice trade-offs

5
Grammar Engineering and Linguistic Theory
  • Description vs. representation
  • Program vs. data
  • Expressiveness of notation
  • Regular predicates for c-structure
  • Boolean combinations (esp. disjunction)
  • Equality, set-membership
  • Defaults and marking conventions
  • Constraining vs. defining, existentials, defaults
  • Abbreviation and factoring
  • Templates, macros, lexical rules
  • Configuration management
  • Combining rules, templates, lexicons
  • Priority of core/specializations/extensions

6
Description vs. Representation
  • Complexity trades (program vs. data)
  • Simplify descriptions but complicate
    representations
  • Complicate descriptions but simplify
    representations
  • Example: Arguments and adjuncts
  • Different behavior
  • Arguments: selected by predicate, unique
  • Adjuncts: modify predicate, multiple instances
  • Similar behavior: both can be questioned
  • Representation solution (HPSG)
  • ARG, ADJ → DEP = ARG ∪ ADJ
    (new type)
  • Description solution (LFG)
  • ARG, ADJ → ARG | ADJ

7
Description vs. Representation
  • External constraints on representation
  • Linguistic theory
  • Applications
  • Multilingual/cross-grammar similarity

8
Expressiveness of notation
  • Regular predicates for c-structure

Simple context-free rules
NP --> N      NP --> Det N
Compact notation
NP --> (Det) N      (optionality)
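
The compact rule above abbreviates the two simple rules. A minimal Python sketch of that expansion follows; the function name `expand_optional` and the list-of-symbols encoding are assumptions made for illustration and are not part of XLE.

```python
from itertools import product

def expand_optional(rhs):
    """Expand a right-hand side with optional symbols into plain alternatives.

    rhs is a list of symbols; a symbol written "(X)" is optional.
    """
    choices = []
    for sym in rhs:
        if sym.startswith("(") and sym.endswith(")"):
            choices.append([[], [sym[1:-1]]])   # either omit X or include it
        else:
            choices.append([[sym]])             # obligatory symbol
    # Pick one option per position and concatenate the picks.
    return [sum(combo, []) for combo in product(*choices)]

rules = expand_optional(["(Det)", "N"])
for alternative in rules:
    print("NP -->", " ".join(alternative))
```

Each optional symbol doubles the number of plain rules, which is one reason the regular-expression notation is more compact.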
9
Expressiveness of notation and Representation
  • Equality: attribute values
  • Set membership: sets and elements
  • Adjuncts: PP: (^ ADJUNCT)=!
  • PP: ! $ (^ ADJUNCT)
  • Coordination (more next week)
  • NP --> NP: ! $ ^; CONJ NP: ! $ ^.
  • Semantic forms
  • (^ PRED)='kick<(^ SUBJ)(^ OBJ)>'
  • Semantic relations, instantiation,
    subcategorization

10
Defaults and Marking Conventions
  • Constraining vs. defining
  • Must be assigned nom: (^ SUBJ CASE) =c nom
  • Is nom: (^ SUBJ CASE) = nom
  • Existentials
  • Must have case: (^ CASE)
  • Defaults
  • NTYPE: proper, pronoun, common
  • { (^ NTYPE) (^ NTYPE)~=common
    | (^ NTYPE)=common }
  • (make choices disjoint)
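
The difference between defining, constraining, and existential statements can be sketched over f-structures modeled as nested Python dicts. This is an illustrative model only: the helper names (`define`, `constrain`, `exists`) are hypothetical, and the sketch assumes constraints are checked after all defining equations have applied.

```python
def lookup(fs, path):
    """Follow a path of attributes; return None if any step is missing."""
    for attr in path:
        if not isinstance(fs, dict) or attr not in fs:
            return None
        fs = fs[attr]
    return fs

def define(fs, path, value):
    """Defining equation (=): supply the value, failing on a clash."""
    *attrs, last = path
    for attr in attrs:
        fs = fs.setdefault(attr, {})
    if fs.get(last, value) != value:
        raise ValueError("clash on " + " ".join(path))
    fs[last] = value

def constrain(fs, path, value):
    """Constraining equation (=c): something else must have supplied the value."""
    return lookup(fs, path) == value

def exists(fs, path):
    """Existential constraint: the attribute must have some value."""
    return lookup(fs, path) is not None

f = {}
before = constrain(f, ["SUBJ", "CASE"], "nom")   # nothing has assigned case yet
define(f, ["SUBJ", "CASE"], "nom")
after = constrain(f, ["SUBJ", "CASE"], "nom")
```

The constraining check fails on the empty structure and succeeds only once a defining equation has supplied the value.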

11
Abbreviations and Factoring
  • Templates
  • Capture generalizations of annotations
  • Maintainability: changes, mistakes
  • Compare: HPSG type hierarchy
  • Macros
  • Capture generalizations of rules
  • Lexical Rules
  • Theoretical proposal to manipulate predicates
  • Implemented to expand lexicons consistently

12
Example: The verb bakes
  • Belongs to several classes
  • Third-person, singular, present-tense verb
  • Transitive or intransitive
  • Shares
  • Some properties with falls
  • Other properties with cooked

13
The lexicon à la Kiparsky
  • A dumping ground for exceptions
  • A kind of appendix to the grammar, whose
    function is to list what is unpredictable and
    irregular about the words of a language

14
The lexicon à la Bresnan
  • A repository of linguistic generalizations
  • Active and passive forms are related by lexical
    rules, not syntactic transformations
  • (^ SUBJ) → (^ OBL-AG)
  • (^ OBJ) → (^ SUBJ)
  • Rules relating lexical items are a prime locus of
    syntactic generalizations

15
The lexicon à la Flickinger
  • A hierarchical structure of classes
  • Each class represents some piece of syntactic
    information
  • bakes belongs to
  • the third-person singular present-tense class
  • (like appears)
  • the transitive/intransitive class
  • (like cooked)
  • and others
  • Classes may be subclasses of other classes
  • Classes may partition other classes along several
    dimensions

16
LFG: Relations between descriptions
LFG can encode linguistic generalizations
as relations between descriptions of structures
  • LFG functional description is a collection of
    equations
  • These can be named
  • This name can stand for those equations in
    linguistic descriptions
  • Named descriptions are referred to as templates
  • Interpretation: simple substitution. The template
    description is substituted for a template name that
    appears in (is invoked by) another description

17
3SG and PRESENT templates
  • 3SG = (^ SUBJ PERSON) = 3
          (^ SUBJ NUM) = SG.
  • 3SG names (^ SUBJ PERSON)=3 (^ SUBJ NUM)=SG
  • PRESENT = (^ TENSE) = PRES.

@ marks invocation (in lexicon, rules, templates).
Substitute (^ TENSE)=PRES for @PRESENT in other descriptions.
18
Templates enable hierarchical generalizations
  • Template definitions can refer to other templates
    by name
  • E.g. further divide 3SG into
  • 3PERS = (^ SUBJ PERSON) = 3.
  • SING = (^ SUBJ NUM) = SG.
  • then 3SG = @3PERS @SING.
  • Hierarchy of references represents inclusion
    hierarchy of named descriptions
  • Frequently repeated subdescriptions
  • specified in one place
  • effective in many

19
Hierarchy of template invocations
  • Sharing in verb agreement

[Diagram: 3SG invokes SING and 3PERS; PRES3SG invokes PRESENT and 3SG]
  • Boolean combinations of template references
  • (just like ordinary descriptions)
  • Sharing is distinct from mode of combination

20
Functional description for bakes
  • { (^ PRED)='bake<SUBJ,OBJ>' | (^ PRED)='bake<SUBJ>' }
  • (^ TENSE)=PRES
  • (^ SUBJ PERS)=3
  • (^ SUBJ NUM)=SG
  • With agreement template
  • { (^ PRED)='bake<SUBJ,OBJ>' | (^ PRED)='bake<SUBJ>' }
  • @PRES3SG
  • Agreement template invoked by other verbs

21
Templates with parameters: Valency
ParGram convention: parameters begin with _
  • TRANS-OR-INTRANS(_p) =
    { (^ PRED) = '_p<SUBJ, OBJ>'
    | (^ PRED) = '_p<SUBJ>' }.
  • PRED value as a parameter of the template
  • @TRANS-OR-INTRANS(bake)
  • → { (^ PRED) = 'bake<SUBJ, OBJ>'
      | (^ PRED) = 'bake<SUBJ>' }
  • Arguments can substitute for any part of an
    f-description
  • Attributes
  • Values
  • Semantic relation-names
  • Descriptions

22
Valency hierarchy
  • TRANS-OR-INTRANS(p) =
    { @INTRANSITIVE(p) | @TRANSITIVE(p) }.
  • INTRANSITIVE(p) = (^ PRED)='p<SUBJ>'.
  • TRANSITIVE(p) = (^ PRED)='p<SUBJ, OBJ>'.

[Diagram: TRANS-OR-INTRANS invokes INTRANSITIVE and TRANSITIVE]
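
Parameterized templates add one step to substitution: the arguments replace the parameters before the body is spliced in. The sketch below is a hypothetical Python model, using the `@(NAME arg)` call form and a plain string replacement for parameters; it is not how XLE itself is implemented.

```python
import re

# name -> (parameter list, body)
TEMPLATES = {
    "INTRANSITIVE": (["_p"], "(^ PRED)='_p<SUBJ>'"),
    "TRANSITIVE": (["_p"], "(^ PRED)='_p<SUBJ, OBJ>'"),
    "TRANS-OR-INTRANS": (["_p"], "{ @(INTRANSITIVE _p) | @(TRANSITIVE _p) }"),
}

# Matches calls such as @(TRANSITIVE bake): a name followed by simple arguments.
CALL = re.compile(r"@\(([A-Z-]+)((?: [^\s()]+)*)\)")

def expand(description):
    """Expand template calls, substituting arguments for parameters."""
    def invoke(match):
        params, body = TEMPLATES[match.group(1)]
        for param, arg in zip(params, match.group(2).split()):
            body = body.replace(param, arg)
        return body
    while CALL.search(description):
        description = CALL.sub(invoke, description)
    return description

print(expand("@(TRANS-OR-INTRANS bake)"))
```

The same call with a different argument, for example `@(TRANS-OR-INTRANS cook)`, yields the same disjunction with a different semantic relation name.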
23
Templates and generalizations: bakes
  • bakes: @TRANS-OR-INTRANS(bake) @PRES3SG
  • TRANS-OR-INTRANS(p) shared by eat, cooked, ...
  • PRES3SG shared by appears, goes, cooks, ...
  • PRESENT
  • used by PRES3SG template
  • shared by bake, laugh, etc.

24
Lexical sharing
[Diagram: the template hierarchy (3PERS, SING, 3SG, PRESENT, PRES3SG, INTRANSITIVE, TRANSITIVE, TRANS-OR-INTRANS) shared by the lexical entries bakes, cooked, and falls]
25
Type hierarchy vs. templates
  • Templates can play the same role as hierarchical
    type systems in theories like HPSG
  • A notational device for factoring descriptions
  • Interpreted as simple substitution
  • Not part of a formal ontology
  • Do not require an elaborate mathematical
    characterization

26
Templates also invoked by Rules
  • Rule annotations can also call templates
  • Global changes, typo prevention
  • Example: adjunct annotation
  • PP: ! $ (^ ADJUNCT) (! ADJ-TYPE)=VP
  • ADVP: ! $ (^ ADJUNCT) (! ADJ-TYPE)=VP
  • ADJ(_T) = ! $ (^ ADJUNCT) (! ADJ-TYPE)=_T.
  • PP: @(ADJ VP)      PP: @(ADJ NP)
  • ADVP: @(ADJ VP)    ADVP: @(ADJ S)

27
Templates and Rules
  • Example: null pronouns
  • Push it! They left (in order) to be on time.
  • NULL-PRON(_P) = (_P PRED)='pro'
                    (_P PRON-TYPE)=null.
  • VPimp --> VP: @(NULL-PRON (^ SUBJ)).
  • VPimp --> VP: (^ SUBJ PRED)='pro'
                  (^ SUBJ PRON-TYPE)=null.

28
Templates: Extend notation
  • DEFAULT(D V) = { D D~=V | D=V }.
  • e.g. @(DEFAULT (^ NTYPE) common)
  • IF(P1 P2) = { ~P1 | P2 }.
  • IFF(P1 P2) = { P1 P2 | ~P1 ~P2 }.

29
Templates and Principles
  • Subject principle: every verb has a subject.
  • Implementation:
  • VERB = (^ SUBJ).
  • Put @VERB in every verbal entry.
  • or
  • Put @VERB in the templates called by the verbal
    entries.

30
Lexical Rules
  • Theoretical construct
  • Templates can often achieve the same result
  • Disjunction of several templates
  • Parameterization of a complex template

31
Lexical Rules: Example
  • Active
  • They ate the cake.
  • (^ PRED)='eat<(^ SUBJ)(^ OBJ)>'
  • Passive
  • The cake was eaten.
  • (^ PRED)='eat<NULL (^ SUBJ)>'
  • Could give VTRANS two disjuncts
  • Or manipulate PRED with a lexical rule
32
Lexical Rules: Example
  • Passive lexical rule
  • _SCHEMA is a subcategorization frame
  • PASSIVE(_SCHEMA) =
    { _SCHEMA (^ PASSIVE)=-
    | _SCHEMA
      (^ SUBJ) --> NULL
      (^ OBJ) --> (^ SUBJ)
      (^ PASSIVE)=c + }.
  • Example calls
  • TRANS(_P) = @(PASSIVE (^ PRED)='_P<(^ SUBJ)(^ OBJ)>').
  • DITRANS(_P) = @(PASSIVE (^ PRED)='_P<(^ SUBJ)(^ OBJ)(^ OBJ2)>').
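
The effect of the passive lexical rule on a subcategorization frame can be sketched as a simultaneous renaming of argument slots. This is a hypothetical Python model: the frame is a (relation, argument list) pair, and the names `passive` and `alternatives` are invented for illustration.

```python
REWRITES = {"SUBJ": "NULL", "OBJ": "SUBJ"}   # (^ SUBJ) --> NULL, (^ OBJ) --> (^ SUBJ)

def passive(frame):
    """Apply both rewrites at once, so the promoted OBJ is not then demoted."""
    relation, args = frame
    return relation, [REWRITES.get(arg, arg) for arg in args]

def alternatives(frame):
    """Return the two disjuncts of the rule: active and passive."""
    return [
        {"PRED": frame, "PASSIVE": "-"},
        {"PRED": passive(frame), "PASSIVE": "+"},
    ]

trans = alternatives(("eat", ["SUBJ", "OBJ"]))
ditrans = alternatives(("give", ["SUBJ", "OBJ", "OBJ2"]))
```

Both rewrites only rename or remove roles, which matches the limitation noted on the summary slide: nothing in this scheme can add an argument.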

33
Lexical Rules: Summary
  • Lexical rules manipulate arguments of predicates
  • capture systematic alternations like
    active-passive
  • Rename and remove roles
  • No good implementation for adding roles
  • causative
  • complex predicates
  • benefactives

34
Configuration Management
  • Combining rules, templates, lexicons, ...
  • System needs to know where everything is
  • For large grammars, need modularization (multiple
    grammar rule files, multiple lexicons)
  • Priority of core/specializations/extensions
  • Want to specialize a grammar
  • No questions in instruction manuals
  • Loosen subj-V agreement
  • Have lexicons of varying quality

35
Combining Rules, Templates, Lexicons
  • XLE configuration section
  • Specify what files are called
  • Specify which rule, template, and lexicon
    sections are used
  • RULES (TOY ENGLISH).
  • RULES (CORE ENGLISH) (SPECIAL ENGLISH).
  • Other grammar information

36
Configurations and Declarations
  • Configurations
  • File management
  • Priority
  • Declarations
  • Governable relations and semantics
  • Features
  • Global Operators
  • METARULEMACRO

37
Files
  • Priority ordered: rules/entries in later files
    override those in earlier ones
  • Example:
  • FILES standard-english-rules.lfg
          eureka-english-rules.lfg
          standard-english-lexicon.lfg
          eureka-english-lexicon.lfg.

38
Eureka vs. Standard rules
  • STANDARD ENGLISH RULES (1.0)
  • N --> { @NOUN-COMMON
          | @NOUN-PROPER }.
  • NOUN-COMMON --> ...
  • NOUN-PROPER --> ...
  • EUREKA ENGLISH RULES (1.0)
  • N --> { @NOUN-COMMON
          | @NOUN-PROPER
          | @NOUN-EUREKA
          | N PL }.
  • NOUN-EUREKA --> EUR-PART EUR-NUM.
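
The override behavior can be pictured as merging rule tables in file order, with later files winning. The sketch below is a simplified Python model, not XLE's loader; the NP rule is a hypothetical filler added to show that rules not redefined are kept.

```python
def merge_sections(files):
    """Merge rule tables in priority order; later files override earlier ones."""
    merged = {}
    for filename, rules in files:
        for category, rhs in rules.items():
            merged[category] = {"rhs": rhs, "source": filename}
    return merged

FILES = [
    ("standard-english-rules.lfg", {
        "N": ["@NOUN-COMMON", "@NOUN-PROPER"],
        "NP": ["(DET) N"],                       # hypothetical rule
    }),
    ("eureka-english-rules.lfg", {
        "N": ["@NOUN-COMMON", "@NOUN-PROPER", "@NOUN-EUREKA", "N PL"],
    }),
]

grammar = merge_sections(FILES)
for category, entry in grammar.items():
    print(category, "from", entry["source"])
```

The Eureka file restates the whole N rule rather than patching it, so the specialized grammar has to repeat the standard alternatives it wants to keep.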

39
Sections Used
  • All lexicon, rule, and template sections have
    names and versions.
  • These are called in priority order in the config.
  • Use with the file order to create overrides.
  • RULES (STANDARD RULES) (EUREKA RULES).
  • LEXENTRIES (all all).

Versions allow for future XLE upgrades
40
Multiple Lexicon Sections
  • LEXENTRIES (AUTOMATIC ENGLISH)
               (CORRECTED ENGLISH).
  • AUTOMATIC ENGLISH LEXICON (1.0)
  • appear V XLE { @(V-TRANS appear)
                 | @(V-INTRANS appear) }.
  • CORRECTED ENGLISH LEXICON (1.0)
  • appear V XLE { @(V-INTRANS appear)
                 | @(V-SUBJ-XCOMP appear) }.

41
Other Configuration Information
  • ROOTCAT: default top-level category
  • Standard: ROOT, Eureka: FIELD
  • Nondistributives for coordination
  • External attributes for applications
  • Character encoding
  • Reparse category and Optimality order for
    robustness
  • See XLE documentation for complete list

42
Declarations
  • Must declare grammatical and semantic functions
    for each grammar.
  • Used for completeness and coherence
  • GOVERNABLERELATIONS
  • Functions (features) that must be subcategorized
    for in the PRED
  • SUBJ OBJ OBL-? ?COMP etc.
  • SEMANTICFUNCTIONS
  • Functions that must have a PRED
  • ADJUNCT NMOD
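
How these declarations feed the completeness and coherence checks can be sketched over dict-based f-structures. This is an illustrative Python model: the governable set is a small assumed subset, the subcategorization list is passed in separately rather than parsed from the PRED, and the function names are hypothetical.

```python
GOVERNABLE = {"SUBJ", "OBJ", "OBJ2", "COMP", "XCOMP"}   # illustrative subset
SEMANTIC = {"ADJUNCT", "NMOD"}

def complete(fs, subcat):
    """Completeness: every function the PRED subcategorizes for is present."""
    return all(gf in fs for gf in subcat)

def coherent(fs, subcat):
    """Coherence: every governable function present is subcategorized for."""
    return all(gf in subcat for gf in fs if gf in GOVERNABLE)

def semantic_functions_ok(fs):
    """Every value of a semantic function must contain a PRED."""
    for attr in SEMANTIC.intersection(fs):
        members = fs[attr] if isinstance(fs[attr], list) else [fs[attr]]
        if any("PRED" not in member for member in members):
            return False
    return True

eat = {"PRED": "eat", "SUBJ": {"PRED": "they"}, "OBJ": {"PRED": "cake"}}
fall = {"PRED": "fall", "SUBJ": {"PRED": "it"}, "OBJ": {"PRED": "cake"}}
```

Here `eat` passes both checks against the frame SUBJ, OBJ, while `fall` with the frame SUBJ is incoherent because of the stray OBJ.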

43
Feature Declaration
  • List of all the features
  • GGF and semantic functions need not be listed
  • all other features must be listed
  • List of their possible values
  • atomic
  • f-structure
  • Multiple feature declarations
  • multilingual setting
  • grammar specialization

44
Why a feature declaration?
  • Good engineering practice
  • Catch typos and old analyses
  • Grammar easier to read
  • NB: Theory doesn't have typos

45
Declaration format
  • STANDARD LANGUAGE FEATURES (1.0)
  • feature1: -> $ {val1 val2 val3}.
  • feature2: -> $ {val4 val5}.
  • feature3: -> << [feature1 feature2].
  • feature4.
  • ----

46
Sample feature declaration
  • TOY ENGLISH FEATURES (1.0)
  • NUM: -> $ {sg pl}.
  • PERS: -> $ {1 2 3}.
  • TNS-ASP: -> << [TENSE MOOD ASPECT].
  • TENSE.
  • MOOD: -> $ {indicative subjunctive}.
  • ASPECT: -> << [PERF PROG].
  • PERF: -> $ {+ -}.
  • PROG: -> $ {+ -}.
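
What a feature declaration buys can be sketched as a validator over dict-based f-structures. This is a hypothetical Python model of the sample declaration, not XLE's checker: sets stand for atomic value lists, lists for the attributes allowed inside an f-structure value, and `ANY` for a feature declared without values. The `+` value for PERF and PROG is an assumption, since the transcript shows only `-`.

```python
ANY = None   # declared without a value list, as TENSE is above

DECLARATION = {
    "NUM": {"sg", "pl"},
    "PERS": {"1", "2", "3"},
    "TNS-ASP": ["TENSE", "MOOD", "ASPECT"],
    "TENSE": ANY,
    "MOOD": {"indicative", "subjunctive"},
    "ASPECT": ["PERF", "PROG"],
    "PERF": {"+", "-"},      # "+" assumed
    "PROG": {"+", "-"},      # "+" assumed
}

def violations(fs, decl=DECLARATION, path=""):
    """Return a list of messages for features or values that are not declared."""
    errors = []
    for attr, value in fs.items():
        where = path + attr
        if attr not in decl:
            errors.append(f"{where}: undeclared feature")
            continue
        allowed = decl[attr]
        if allowed is ANY:
            continue
        if isinstance(allowed, list):                 # f-structure-valued feature
            if not isinstance(value, dict):
                errors.append(f"{where}: expected an f-structure")
                continue
            errors += [f"{where}.{a}: not allowed here" for a in value if a not in allowed]
            inner = {a: v for a, v in value.items() if a in allowed}
            errors += violations(inner, decl, where + ".")
        elif isinstance(value, dict) or value not in allowed:
            errors.append(f"{where}: value {value!r} not declared")
    return errors

good = {"NUM": "sg", "TNS-ASP": {"TENSE": "pres", "ASPECT": {"PERF": "-"}}}
bad = {"NUM": "dual", "GEND": "fem"}
```

A misspelled feature name or a value left over from an old analysis shows up as a violation instead of silently producing an unexpected f-structure.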

47
XLE and the feature declaration
  • XLE will not load a grammar with a violation of
    the feature declaration.
  • To catch violations in the lexicon, the generator
    must be loaded.
  • regenerate some-sentence-to-parse
  • parse, then choose generate in f-str window
  • create-generator grammar-name.lfg
  • print-unused-feature-declarations

48
Multiple feature declarations
  • List in priority order in the configuration
  • FEATURES (STANDARD COMMON)
             (STANDARD ENGLISH).
  • New features are listed as usual
  • Changes to features use edit operators
  • + add a new value
  • & intersect the values
  • ! replace the feature entirely

49
Multiple feature declarations
  • STANDARD COMMON FEATURES (1.0)
  • NUM: -> $ {sg pl dual}.
  • CASE: -> $ {nom acc}.
  • TENSE: -> << [PAST FUTURE].
  • PAST: -> $ {+ -}.
  • FUTURE: -> $ {+ -}.
  • STANDARD ENGLISH FEATURES (1.0), with the resulting declaration on the right
  • PERS: -> $ {1 2 3}.              PERS: -> $ {1 2 3}.
  • &NUM: -> $ {sg pl}.              NUM: -> $ {sg pl}.
  • +CASE: -> $ {gen}.               CASE: -> $ {nom acc gen}.
  • !TENSE: -> $ {pres past fut}.    TENSE: -> $ {pres past fut}.
  • !PAST: -> .
  • !FUTURE: -> .
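
The combination of a common and a language-specific declaration can be sketched as set operations. This is a hypothetical Python model covering atomic-valued features only; the operator symbols `+` (add) and `&` (intersect) are assumed here, since the transcript preserves only `!` (replace).

```python
def apply_edits(base, edits):
    """Apply (operator, feature, values) edits to a copy of the base declaration."""
    result = {feature: set(values) for feature, values in base.items()}
    for op, feature, values in edits:
        values = set(values)
        if op == "+":                    # add new values
            result[feature] = result.get(feature, set()) | values
        elif op == "&":                  # keep only values allowed by both
            result[feature] = result[feature] & values
        elif op in ("!", ""):            # replace entirely, or declare a new feature
            result[feature] = values
        else:
            raise ValueError(f"unknown edit operator {op!r}")
    return result

COMMON = {"NUM": {"sg", "pl", "dual"}, "CASE": {"nom", "acc"}}
ENGLISH_EDITS = [
    ("", "PERS", ["1", "2", "3"]),
    ("&", "NUM", ["sg", "pl"]),
    ("+", "CASE", ["gen"]),
    ("!", "TENSE", ["pres", "past", "fut"]),
]

english = apply_edits(COMMON, ENGLISH_EDITS)
```

The common declaration is left untouched, so several language-specific declarations can be derived from it independently.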

50
Using Multiple Feature Decl.
  • Multilingual contexts
  • Language universal features
  • Customize to particular language
  • Grammar specialization
  • Add new features for odd constructions
  • Remove unused choices

51
Global Operations: METARULEMACRO
  • System defined function
  • Operates on every category
  • Global statements
  • Linguistic: subject condition
  • SUBJ < OBJ
  • coordination
  • Engineering: quotes
  • bracketing

52
METARULEMACRO
  • Right-hand side of each grammar rule is the
    result of applying the macro to the rule
  • METARULEMACRO(_CAT _BASECAT _RHS) =
    _RHS.

53
Punctuation and METARULEMACRO
  • Surround any constituent with quotes
  • METARULEMACRO(_CAT _BASECAT _RHS) =
    { _RHS
    | L-QT
      _CAT
      R-QT
    | L-DQT
      _CAT
      R-DQT }.

54
Punctuation cont.
  • Mary and John left them there.
  • We saw them in the garden.
  • 'They appeared and then disappeared.'

55
Punctuation Problem
  • Vacuous branching results in many analyses

[Tree diagram: a non-branching chain NP, etc., Nzero, N dominating bagels]
56
Solution: PUSHUP
  • If non-branching, push up to highest node.
  • METARULEMACRO(_CAT _BASECAT _RHS) =
    { _RHS
    | L-QT
      _CAT: @PUSHUP;
      R-QT }.
  • How to define PUSHUP?
  • Need to test existence of sister nodes: MOTHER,
    SISTER

PUSHUP ( MOTHER LEFT_SISTER)
( MOTHER RIGHT_SISTER)
( MOTHER LEFT_SISTER)
( MOTHER MOTHER) .
57
Summary
  • Lexical rules allow for generalizations over
    predicate alternations
  • Configurations and declarations allow management
    of large-scale grammars
  • readability and consistency
  • maintenance
  • specialization
  • Global operators allow for cross-grammar
    generalizations
  • coordination

59
The HPSG lexicon: a type hierarchy
  • More specific types inherit information from less
    specific
  • Types and subtypes
  • A mathematical relation between structures:
    AND/OR lattice
  • Different subtypes represent alternatives/disjunction
  • Multiple supertypes represent conjunction

[Diagram (Malouf): AND/OR type lattice under head, with the types noun, relational, c-noun, gerund, and verb]
  • LFG does not use typed feature structures for
    lexical generalizations

but type inheritance is not the only (best?)
way to express generalizations
60
Coordination without METARULEMACRO
  • Want to coordinate any constituent
  • Coordination macro
  • SCCOORD(_CAT) =
    _CAT: ! $ ^;
    COMMA
    _CAT: ! $ ^;
    CONJ
    _CAT: ! $ ^.
  • Put call in each rule
  • NP --> { (DET) AP N PP
           | @(SCCOORD NP) }.
  • Engineering problem
  • forget to call
  • put in wrong category

61
Coordination with METARULEMACRO
  • Call SCCOORD as part of MRM
  • METARULEMACRO(_CAT _BASECAT _RHS) =
    { _RHS
    | @(SCCOORD _CAT) }.
  • NP rule now
  • NP --> (DET) AP N PP.
  • Effectively
  • NP --> { (DET) AP N PP
           | @(SCCOORD NP) }.
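
The way METARULEMACRO wraps every rule can be sketched as a function applied to each right-hand side when the grammar is loaded. This Python model is an illustration only: right-hand sides are flat symbol lists, annotations are omitted, and the VP rule is a hypothetical addition.

```python
def sccoord(cat):
    """Same-category coordination: CAT COMMA CAT CONJ CAT."""
    return [cat, "COMMA", cat, "CONJ", cat]

def metarulemacro(cat, basecat, rhs):
    """Return the alternatives for cat: the written rule or coordination.

    basecat is accepted to mirror the three-parameter signature; it is unused here.
    """
    return [rhs, sccoord(cat)]

GRAMMAR = {
    "NP": ["(DET)", "AP", "N", "PP"],
    "VP": ["V", "NP"],                  # hypothetical rule
}

# The grammar writer states each rule once; the macro is applied to all of them.
EFFECTIVE = {cat: metarulemacro(cat, cat, rhs) for cat, rhs in GRAMMAR.items()}
```

Because the macro is applied to every category automatically, no rule can miss the coordination call or name the wrong category in it.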