Title: Computational Linguistics
1 Computational Linguistics
- What is it, and what (if any) are its unifying themes?
2 Computational linguistics
3 I often agree with XKCD
4 ... linguistics?
[Figure: fields placed along a spectrum from "more rigorous" to "less rigorous / more flakey": physics, chemistry, biology, psychology, neuropsychology, computational linguistics, literary criticism.]
5 What defines the rigor of a field?
- Whether results are reproducible
- Whether theories are testable/falsifiable
- Whether there is a common set of methods for similar problems
- Whether approaches to problems can yield interesting new questions/answers
6 Linguistics
7 [Figure: fields placed along a spectrum from "more rigorous" to "less rigorous": engineering, sociology, linguistics, literary criticism.]
8 The true situation with linguistics
[Figure: subfields of linguistics arranged along a spectrum from "more rigorous" to "less rigorous", including experimental phonetics, historical linguistics, psycholinguistics, some areas of sociolinguistics (e.g. Bill Labov), other areas of sociolinguistics (e.g. Deborah Tannen), theoretical linguistics (e.g. lexical-functional grammar), and theoretical linguistics (e.g. minimalist syntax).]
9 Okay, enough already. What is computational linguistics?
- Text normalization/segmentation
- Morphological analysis
- Automatic word pronunciation prediction
- Transliteration
- Word-class prediction, e.g. part-of-speech tagging
- Parsing
- Semantic role labeling
- Machine translation
- Dialog systems
- Topic detection
- Summarization
- Text retrieval
- Bioinformatics
- Language modeling for automatic speech recognition
- Computer-aided language learning (CALL)
10 Computational linguistics
- Often thought of as natural language engineering
- But there is also a serious scientific component
to it.
11 Why CL may seem ad hoc
- Wide variety of areas (as in linguistics)
- If it's natural language engineering, the goal is often just to build something that works
- Techniques tend to change in somewhat faddish ways
- For example, machine learning approaches fall in and out of favor
16 Machine learning in CL
- In general it's a plus, since it has meant that evaluation has become more rigorous
- But it's important that the field not turn into applied machine learning
- For this to be avoided, people need to continue to focus on what linguistic features are important
- Fortunately, this seems to be happening
17 Some interesting themes
- Finite-state methods
  - Many application areas
  - Raises interesting questions about how much of language is regular (in the sense of finite-state)
- Grammar induction
  - Linguists have done a poor job at their stated goal of explaining how humans learn grammar
- Computational models of language change
  - Historical evidence for language change is only partial. There are many changes in language for which we have no direct evidence.
18 Finite-state methods
- Used from the 1950s onwards
- Went out of fashion a bit during the 1980s
- Then a revival in the 1990s with the advent of
weighted finite-state methods
19 Some applications
- Analysis of word structure: morphology
- Analysis of sentence structure
- Part of speech tagging
- Parsing
- Speech recognition
- Text normalization
- Computational biology
20 Regular languages
- A regular language is a language with a finite alphabet that can be constructed out of one or more of the following operations (a small illustrative sketch follows):
  - Set union
  - Concatenation
  - Transitive closure (Kleene star)
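Below is a minimal sketch (my own illustration, not from the talk) of these three operations, approximated over finite sets of strings; true regular languages are infinite objects, so the Kleene star here is truncated at a fixed depth.

```python
# Toy versions of the three operations that define regular languages,
# working on finite sets of strings for illustration.

def union(l1: set[str], l2: set[str]) -> set[str]:
    """Set union of two languages."""
    return l1 | l2

def concat(l1: set[str], l2: set[str]) -> set[str]:
    """All concatenations of a string from l1 with a string from l2."""
    return {a + b for a in l1 for b in l2}

def star(l: set[str], depth: int = 3) -> set[str]:
    """Kleene star, truncated at `depth` repetitions so it stays finite."""
    result = {""}            # the empty string is always in l*
    layer = {""}
    for _ in range(depth):
        layer = concat(layer, l)
        result |= layer
    return result

# Example: (ab | c)* restricted to at most 2 repetitions
print(sorted(star(union({"ab"}, {"c"}), depth=2)))
# ['', 'ab', 'abab', 'abc', 'c', 'cab', 'cc']
```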
21 Finite-state automata: formal definition
Every regular language can be recognized by a finite-state automaton, and every finite-state automaton recognizes a regular language (Kleene's theorem).
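As a concrete illustration (my own sketch; the particular automaton is a hypothetical example, not one from the slides), here is a deterministic finite-state automaton recognizing the regular language (ab)*:

```python
# A DFA as a transition table, start state, and set of accepting states,
# recognizing (ab)* over the alphabet {a, b}.

def make_dfa():
    delta = {                      # transition function delta: Q x Sigma -> Q
        ("q0", "a"): "q1",
        ("q1", "b"): "q0",
    }
    return delta, "q0", {"q0"}     # transitions, start state, accepting states

def accepts(word: str) -> bool:
    """Run the DFA; reject if a required transition is missing."""
    delta, state, finals = make_dfa()
    for symbol in word:
        if (state, symbol) not in delta:
            return False
        state = delta[(state, symbol)]
    return state in finals

assert accepts("")
assert accepts("abab")
assert not accepts("aba")
```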
22 Representation of FSAs: State Diagram
23 Regular relations: formal definition
24 Finite-state transducers
25 An FST
26 Composition
- In addition to union, concatenation and Kleene closure, regular relations are closed under composition
- Composition is to be understood here the same way as composition in algebra
- R1 ∘ R2 means: take the output of R1 and feed it to the input of R2 (see the sketch below)
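A small sketch of relational composition (my own illustration, with relations represented extensionally as sets of input/output pairs; real FST composition operates on the machines themselves, not on enumerated pairs):

```python
# Relational composition: R1 o R2 pairs x with z whenever R1 maps x to
# some y and R2 maps that same y to z.

def compose(r1: set[tuple[str, str]],
            r2: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Feed the outputs of r1 into the inputs of r2."""
    return {(x, z) for (x, y1) in r1 for (y2, z) in r2 if y1 == y2}

# Hypothetical example: R1 realizes an affixed form, R2 spells out its
# pronunciation.
R1 = {("dog+s", "dogs")}
R2 = {("dogs", "d o g z")}
print(compose(R1, R2))   # {('dog+s', 'd o g z')}
```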
27 Composition: an illustration
28 R1 as a transducer
29 R2 as a transducer
30 R1 ∘ R2
31 Some things you can do with FSTs
- Text analysis/normalization
- Word segmentation
- Abbreviation expansion
- Digit-to-number-name mappings
- i.e. mapping from writing to language
- Morphological analysis
- Syntactic analysis
- E.g. part-of-speech tagging
- (With weights) pronunciation modeling and
language modeling for speech recognition
32 That's fine for engineering, but...
- Does it really account for the facts?
- Is morphology really regular?
- Is the mapping between writing and speech really
regular?
33 What is morphology?
- scripserunt is the third person plural perfect active of scribo ('I write')
- Morphology relates word forms
  - the lemma of scripserunt is scribo
- Morphology analyzes the structure of word forms
  - scripserunt has the structure scrib+s+erunt
34Morphology is a relation
- Imagine you have a Latin morphological analyzer
comprising - D a relation that maps between surface form and
decomposed form - L a relation that maps between decomposed form
and lemma - Then
- scripserunt ? D scribserunt
- scripserunt ? D ? L scribo
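A toy rendering of this pipeline (my own; the relation names D and L come from the slide, but representing them as single-entry Python dictionaries is purely illustrative):

```python
# Morphological analysis as relation composition:
# surface form -> decomposed form -> lemma.

D = {"scripserunt": "scrib+s+erunt"}   # surface form -> decomposed form
L = {"scrib+s+erunt": "scribo"}        # decomposed form -> lemma

def apply(relation: dict[str, str], form: str) -> str:
    return relation[form]

surface = "scripserunt"
decomposed = apply(D, surface)         # scripserunt o D
lemma = apply(L, decomposed)           # scripserunt o D o L
print(decomposed, lemma)               # scrib+s+erunt scribo
```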
35English regular plurals
- cat s cats /s/
- dog s dogs /z/
- spouse s spouses /?z/
- This can be implemented by a rule that composes
with the base word, inserting the relevant form
of the affix at the end
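A rough sketch of the allomorph choice (my own approximation, keyed off spelling for readability; a real transducer would operate on phonemes, and the ending lists here are heuristic):

```python
# Choose the English regular plural allomorph /s/, /z/, or /@z/ from the
# final sound of the base (approximated by its spelling).

def plural_allomorph(base: str) -> str:
    sibilant_endings = ("s", "z", "x", "ch", "sh", "se", "ce", "ge")
    voiceless_endings = ("p", "t", "k", "f", "th")
    if base.endswith(sibilant_endings):
        return "@z"                     # spouses, churches
    if base.endswith(voiceless_endings):
        return "s"                      # cats, cliffs
    return "z"                          # dogs, days

for word in ("cat", "dog", "spouse"):
    print(word, "->", plural_allomorph(word))
# cat -> s, dog -> z, spouse -> @z
```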
36 Templatic affixes in Yowlumne
Transducer for each affix transforms base into
required templatic form and appends the relevant
string.
37 Subtractive morphology
The transducer deletes the final VC of the base (a minimal sketch follows).
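A minimal sketch of that deletion rule (my own; the sample base is hypothetical, and the vowel inventory is simplified):

```python
# Subtractive morphology: derive a form by deleting the base's final
# vowel+consonant sequence, as stated on the slide.

VOWELS = set("aeiou")

def delete_final_vc(base: str) -> str:
    """Delete a final vowel+consonant sequence, if the base ends in one."""
    if len(base) >= 2 and base[-1] not in VOWELS and base[-2] in VOWELS:
        return base[:-2]
    return base

print(delete_final_vc("takin"))   # hypothetical base -> 'tak'
```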
38 Bontoc infixation
- Insert a marker > after the first consonant (if any)
- Change > into the infix -um- (a small sketch follows)
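A small sketch of the two-step recipe (my own implementation; fikas 'strong' → fumikas is the standard textbook Bontoc example, cited here from memory):

```python
# Bontoc infixation in two finite-state-style rewrite steps:
# (1) insert a marker after the first consonant, (2) realize it as -um-.

VOWELS = set("aeiou")

def insert_marker(word: str) -> str:
    """Step 1: insert '>' after the first consonant, if there is one."""
    if word and word[0] not in VOWELS:
        return word[0] + ">" + word[1:]
    return ">" + word if word else word

def realize_infix(word: str) -> str:
    """Step 2: rewrite the marker '>' as the infix 'um'."""
    return word.replace(">", "um")

print(realize_infix(insert_marker("fikas")))   # 'fumikas'
```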
39 Side note: infixation in English
[Figure: expletive infixation, splitting the place name Kalamazoo as Kalama-...-zoo.]
40 Reduplication: Gothic
Problem: mapping w to ww is not a regular relation.
41 Factoring Reduplication
- Prosodic constraints
- Copy verification transducer C
42 Non-Exact Copies
- Dakota (Inkelas & Zoll, 1999)
43 Non-Exact Copies
- Basic and modified stems in Sye (Inkelas & Zoll, 1999)
44 Morphological Doubling Theory (Inkelas & Zoll, 1999)
- Most linguistic accounts of reduplication assume that the copying is done as part of morphology
- In MDT:
  - Reduplication involves doubling at the morphosyntactic level, i.e. one is actually simply repeating words or morphemes
  - Phonological doubling is thus expected, but not required
45 Gothic Reduplication under Morphological Doubling Theory
46 Summary
- If Inkelas & Zoll are right, then all morphology can be computed using regular relations
- This in turn suggests that computational morphology has picked the right tool for the job
47 Another Example: Linguistic analysis of text
- Maps the stuff you see on the page, e.g. text written in the standard orthography of a language, into linguistic units (words, morphemes, phonemes)
- For example:
  - I ate a 25kg bass
  - aɪ eɪt ə twɛnti faɪv kɪləɡræm bæs
- This can be done using transducers
- But is the mapping between writing and language really regular (finite-state)?
48 Linguistic analysis of text
- Abbreviation expansion
- Disambiguation
- Number expansion
- Morphological analysis of words
- Word pronunciation
49 A transducer for number names
Consider a machine that maps between digit strings and their reading as number names in English (a compact sketch follows):
30,294,005,179,018,903.56 → thirty quadrillion, two hundred ninety-four trillion, five billion, one hundred seventy-nine million, eighteen thousand, nine hundred three, point five six
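A compact sketch of such a mapping (my own implementation, not the talk's transducer; a finite-state version would compile rules like these into a weighted FST):

```python
# Map a digit string, with optional commas and decimal point, to its
# English number-name reading.

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
SCALES = ["", " thousand", " million", " billion", " trillion",
          " quadrillion"]

def under_1000(n: int) -> str:
    parts = []
    if n >= 100:
        parts.append(ONES[n // 100] + " hundred")
        n %= 100
    if n >= 20:
        parts.append(TENS[n // 10] + ("-" + ONES[n % 10] if n % 10 else ""))
    elif n:
        parts.append(ONES[n])
    return " ".join(parts)

def number_name(digits: str) -> str:
    integer, _, fraction = digits.replace(",", "").partition(".")
    n = int(integer)
    if n == 0:
        words = "zero"
    else:
        groups, scale = [], 0
        while n:                          # peel off three digits at a time
            n, g = divmod(n, 1000)
            if g:
                groups.append(under_1000(g) + SCALES[scale])
            scale += 1
        words = ", ".join(reversed(groups))
    if fraction:
        words += ", point " + " ".join(ONES[int(d)] for d in fraction)
    return words

print(number_name("30,294,005,179,018,903.56"))
# thirty quadrillion, two hundred ninety-four trillion, five billion,
# one hundred seventy-nine million, eighteen thousand, nine hundred three,
# point five six
```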
50 Mapping between speech and writing
- It seems obvious on the face of it that the mapping between speech and its written form is regular. After all, the words are ordered in the same way as speech. Even the letters tend to be ordered in the same way as the sounds they represent.
51 Some examples where it isn't
[Figure: a transliterated example of honorific inversion, in which the written order of signs differs from the spoken order.]
52 Finite-state methods
- In morphology they seem almost exactly correct as characterizations of the natural phenomenon
- In the mapping from writing to language, again, finite-state models seem almost exactly correct
53 Grammar induction
The common nativist view in linguistics, from Gilbert Harman's review of Chomsky's New Horizons in the Study of Language and Mind (published in Journal of Philosophy, 98(5), May 2001): "Further reflection along these lines and a great deal of empirical study of particular languages has led to the 'principles and parameters' framework which has dominated linguistics in the last few decades. The idea is that languages are basically the same in structure, up to certain parameters, for example, whether the head of a phrase goes at the beginning of a phrase or at the end. Children do not have to learn the basic principles, they only need to set the parameters. Linguistics aims at stating the basic principles and parameters by considering how languages differ in certain more or less subtle respects. The result of this approach has been a truly amazing outpouring of discoveries about how languages are the same yet different."
54 Similarly
Cedric Boeckx and Norbert Hornstein. 2003. The Varying Aims of Linguistic Theory. "Children come equipped with a set of principles of grammar construction (i.e. Universal Grammar (UG)). The principles of UG have open parameters. Specific grammars arise once values for these open parameters are specified. Parameter values are determined on the basis of the primary linguistic data. A language-specific grammar, then, is simply a specification of the values that the principles of UG leave open."
55 My challenge (with Shalom Lappin)
57 Automatic induction of grammars from unannotated text
- Klein, Dan and Manning, Christopher. 2004. Corpus-based induction of syntactic structure: models of dependency and constituency. Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics.
- Lots of subsequent work
58 Different syntactic representations
59 Dependency Model with Valence (DMV)
- Each head generates a set of non-STOP arguments to one side, then a STOP argument, then similarly on the other side (see the schematic formula below)
- Trained using expectation maximization
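Schematically (my reconstruction from Klein and Manning 2004; the notation is mine, and no formula appears on the slide itself), the probability of the dependency tree D(h) rooted at head h factors as:

```latex
% DMV factorization (reconstruction; notation mine). For each head h and
% direction dir, arguments are generated until STOP; adj records whether
% any argument has yet been generated in that direction.
P(D(h)) = \prod_{dir \in \{\leftarrow,\rightarrow\}}
          \Bigg[ \prod_{a \in \mathrm{deps}(h,\,dir)}
            P(\neg\mathrm{STOP} \mid h, dir, adj)\,
            P(a \mid h, dir)\, P(D(a)) \Bigg]
          \, P(\mathrm{STOP} \mid h, dir, adj)
```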
60 Performance
61 Improvements
- Constituent structure can be induced in a similar way to inducing word classes (e.g. parts of speech), by considering the environments in which the putative constituent finds itself.
- In Klein & Manning's constituent-context model (CCM), the probability of a bracketing is computed as sketched below.
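The formula itself was on a slide image; as a hedged reconstruction from Klein and Manning's CCM papers, the model scores a sentence S together with a bracketing B as:

```latex
% CCM joint probability (my reconstruction; notation mine).
% alpha_ij is the yield of span <i,j>, beta_ij its linear context,
% and B_ij says whether <i,j> is a constituent under bracketing B.
P(S, B) = P(B) \prod_{\langle i,j \rangle}
          P(\alpha_{ij} \mid B_{ij})\, P(\beta_{ij} \mid B_{ij})
```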
62 Combined DMV+CCM
Subsequent work, e.g. Rens Bod's 2006 Unsupervised Data-Oriented Parsing, reports F-scores close to 83.0. For comparison, the best supervised parsers get about 91.0.
63 Some objections and a synopsis
- Children do not learn grammars from unannotated text corpora; they get a lot of guidance from the environmental situation
  - Sure
- Performance of automatic induction algorithms is still far from human performance, so they do not constitute evidence that we can do away with (nativist) linguistic theories of language acquisition
  - They do not show this. But the argument would have more weight if nativist theories had already been demonstrated to contribute to a working model of grammar induction
- But Computational Linguistics is starting to make some serious contributions to this 50-year-old debate
64 The evolution of complex structure in language
Examples from Stump, Gregory (2001) Inflectional Morphology: A Theory of Paradigm Structure. Cambridge University Press.
65 Evolutionary Modeling (A tiny sample)
- Hare, M. and Elman, J. L. (1995) Learning and morphological change. Cognition, 56(1): 61-98.
- Kirby, S. (1999) Function, Selection, and Innateness: The Emergence of Language Universals. Oxford.
- Nettle, D. (1999) Using Social Impact Theory to simulate language change. Lingua, 108(2-3): 95-117.
- de Boer, B. (2001) The Origins of Vowel Systems. Oxford.
- Niyogi, P. (2006) The Computational Nature of Language Learning and Evolution. Cambridge, MA: MIT Press.
66 A multi-agent simulation
- The system is seeded with a grammar and a small number of agents
- Each agent randomly selects a set of phonetic rules to apply to forms
- Agents are assigned to one of a small number of social groups
- Two parents beget child agents
- Children are exposed to a predetermined number of training forms combined from both parents
- Forms are presented proportional to their underlying frequency
- Children must learn to generalize to unseen slots for words
- The learning algorithm is similar to David Yarowsky and Richard Wicentowski (2000) Minimally supervised morphological analysis by multimodal alignment. Proceedings of ACL-2000, Hong Kong, pages 207-216.
  - Features include the last n characters of the input form, plus semantic class
  - Learners select the optimal surface form to derive other forms from ("optimal" meaning the form requiring the simplest resulting ruleset: a Minimum Description Length criterion)
- Forms are periodically pooled among all agents, and the n best forms are kept for each word and each slot
- The population grows, but is kept in check by "natural disasters" and a quasi-Malthusian model of resource limitations
- Agents age and die according to reasonably realistic mortality statistics (a toy sketch of the whole loop follows)
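A toy sketch of the overall loop (entirely my own construction; it keeps only the skeleton of the slide's design, with social groups, frequency weighting, and the Yarowsky and Wicentowski-style learner elided or stubbed):

```python
# Skeleton of the multi-agent simulation: agents carry phonetic rules,
# parents train children on pooled forms, and the population is culled
# under a resource limit.
import random

class Agent:
    def __init__(self, rules):
        self.rules = rules              # phonetic rules this agent applies
        self.lexicon = {}               # slot -> surface form
        self.age = 0

def train_child(parents, seed_forms, n_exposures):
    """Expose a child to forms pooled from both parents."""
    child = Agent(rules=random.choice([p.rules for p in parents]))
    pool = [f for p in parents for f in p.lexicon.items()] or seed_forms
    for _ in range(n_exposures):
        slot, form = random.choice(pool)   # frequency weighting elided
        child.lexicon[slot] = form
    # Generalization to unseen slots (the MDL-style learner) would go here.
    return child

def step(population, seed_forms):
    """One generation: a birth, aging, and quasi-Malthusian culling."""
    parents = random.sample(population, 2)
    population.append(train_child(parents, seed_forms, n_exposures=50))
    for agent in population:
        agent.age += 1
    max_size = 100                        # resource limit on growth
    population[:] = sorted(population, key=lambda a: a.age)[:max_size]

population = [Agent(rules=[]) for _ in range(10)]
seed_forms = [("sing.pres", "walk"), ("plur.past", "walked")]
for _ in range(20):
    step(population, seed_forms)
print(len(population), "agents after 20 generations")
```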
67 Final states for a given initial state
68 Another example
- Kirby, Simon. 2001. Spontaneous evolution of linguistic structure: an iterated learning model of the emergence of regularity and irregularity. IEEE Transactions on Evolutionary Computation, 5(2): 102-110.
- Assumes two meaning components, each with 5 values, for 25 possible words
- The initial speaker randomly selects examples from the 25, producing random strings for each, and teaches them to the hearer
- Not all of the slots are filled, thus producing a bottleneck: the hearer must compute forms for the missing slots (a minimal sketch follows)
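A minimal toy version of this loop (my own construction, under the slide's assumptions of a 5x5 meaning space and a transmission bottleneck; the gap-filling heuristic is a crude stand-in for Kirby's grammar induction):

```python
# Iterated learning with a bottleneck: each generation observes only part
# of the lexicon and must fill the missing slots compositionally.
import random

MEANINGS = [(a, b) for a in range(5) for b in range(5)]   # 25 meanings

def random_string():
    return "".join(random.choice("abcdefg") for _ in range(4))

def fill_gaps(observed):
    """Guess forms for unseen meanings by reusing halves of observed forms
    as 'morphemes' for each meaning component (a crude induction step)."""
    left = {m[0]: f[: len(f) // 2] for m, f in observed.items()}
    right = {m[1]: f[len(f) // 2:] for m, f in observed.items()}
    lexicon = dict(observed)
    for m in MEANINGS:
        if m not in lexicon and m[0] in left and m[1] in right:
            lexicon[m] = left[m[0]] + right[m[1]]
    return lexicon

# Initial speaker: random, unstructured (holistic) forms.
language = {m: random_string() for m in MEANINGS}
for generation in range(10):
    taught = dict(random.sample(sorted(language.items()), 15))  # bottleneck
    language = fill_gaps(taught)

print(len(language), "meanings expressible after 10 generations")
```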
69 The basic algorithm produces results that are too regular
[Figure: the initial state and the final state of the lexicon.]
70 A more realistic result
- Addition of other constraints, including:
  - a random tendency for speakers to omit symbols,
  - a frequency distribution over the 25 possible meaning combinations
71 Summary
- Evolutionary modeling is evolving slowly
- We are a long way from being able to model the complexities of known language evolution
- Nonetheless, computational approaches promise to lend insights into how complex social systems such as language change over time, and to complement discoveries in historical linguistics
72 Final thoughts
- Language is central to what it means to be human.
- Language is used to:
  - Communicate information
  - Communicate requests
  - Persuade, cajole
  - (In written form) record history
  - Deceive
- Other animals do some or most of these things (cf. Anindya Sinha's work on bonnet macaques)
- But humans are better at all of these
73 Final thoughts
- So the scientific study of language ought to be more central than it is
- We need to learn much more about how language works:
  - How humans evolved language
  - How languages changed over time
  - How humans learn language
- Computational linguistics can contribute to all of these questions.