Introduction to Computational Linguistics

1
Introduction to Computational Linguistics
  • Marie-Catherine de Marneffe

2
What is Computational Linguistics?
  • Getting computers to perform useful tasks
    involving human languages, whether for:
  • Enabling human-machine communication
  • Improving human-human communication
  • Doing stuff with language objects
  • Examples:
  • Machine Translation
  • Automatic Question Answering
  • Speech Recognition
  • Text-to-Speech Synthesis
  • Text Understanding

3
Some brief demos
  • Machine Translation
  • http://translate.google.com/translate_t
  • http://babelfish.altavista.com/
  • Text-To-Speech
  • http://www-306.ibm.com/software/pervasive/tech/demos/tts.shtml
  • Question Answering
  • http://www.powerset.com/

4
I. Syntax and Parsing
5
Syntax
  • Why should we care?
  • Grammar checkers
  • Question answering
  • Information extraction
  • Machine translation

6
Parsing
  • Parsing is the process of taking a string and a
    grammar and returning a parse tree for that
    string
  • Example string: a flight left

7
Phrase structure rules
  • S → NP VP
  • NP → Det N
  • VP → Verb
  • Det → a
  • N → flight
  • Verb → left
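
The rules on this slide can be written down as plain data. The following is a minimal sketch, not part of the original slides: the dictionary layout and the function name expand are illustrative assumptions. It rewrites S step by step until only words remain.

```python
# Toy grammar from the slide: each symbol maps to a list of possible
# right-hand sides. The dict layout is an illustrative choice.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "N": [["flight"]],
    "Verb": [["left"]],
}


def expand(symbol):
    """Rewrite a symbol until only words remain (always uses its first rule)."""
    if symbol not in GRAMMAR:  # a terminal, i.e. an actual word
        return [symbol]
    words = []
    for child in GRAMMAR[symbol][0]:
        words.extend(expand(child))
    return words


sentence = " ".join(expand("S"))
print(sentence)  # a flight left
```

Since every symbol here has exactly one rule, the grammar generates a single sentence. Parsing runs the other way: given the string, recover the rewriting steps.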

8
Context-Free Grammars (CFG)
  • Capture constituency and ordering
  • Constituency
  • How words group into units and how the various
    kinds of units behave
  • Ordering
  • What are the rules that govern the ordering of
    words and bigger units in the language

9
Context?
  • The notion of context in CFGs has nothing to do
    with the ordinary meaning of the word context in
    language
  • All it really means is that the non-terminal on
    the left-hand side of a rule is out there all by
    itself (free of context)
  • A → B C
  • means that I can rewrite an A as a B followed by
    a C, regardless of the context in which A is found

10
Parsing
  • Parsing: assigning correct trees to input strings
  • Correct tree:
  • a tree that covers all and only the elements of
    the input and has an S at the top
  • For now: enumerate all possible trees
  • A further task: disambiguation
  • means choosing the correct tree from among all
    the possible trees

11
Parsing involves search
  • As with everything of interest, parsing involves
    a search, which involves making choices
  • We'll look at some basic methods to give you an
    idea of the problem

12
Top-Down Parsing
  • Since we're trying to find trees rooted with an S
    (sentences), start with the rules that give us an
    S.
  • Then work your way down from there to the words.
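
A minimal sketch of the top-down idea, assuming the toy grammar from the phrase structure slide (the function names parse, expand_rhs and top_down_parses are hypothetical). It starts from S, expands rules downward, and keeps only the trees that cover the whole input. It would loop forever on a left-recursive rule such as one that rewrites NP as NP followed by PP, which is one reason real parsers need more care.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "N": [["flight"]],
    "Verb": [["left"]],
}


def parse(symbol, words, start):
    """Yield (tree, end) for every way `symbol` can cover words[start:end]."""
    if symbol not in GRAMMAR:  # terminal: must match the next input word
        if start < len(words) and words[start] == symbol:
            yield symbol, start + 1
        return
    for rhs in GRAMMAR[symbol]:  # try each rule for this symbol
        for children, end in expand_rhs(rhs, words, start):
            yield (symbol, children), end


def expand_rhs(rhs, words, start):
    """Yield (child_trees, end) for a sequence of symbols, left to right."""
    if not rhs:
        yield [], start
        return
    for first, mid in parse(rhs[0], words, start):
        for rest, end in expand_rhs(rhs[1:], words, mid):
            yield [first] + rest, end


def top_down_parses(sentence):
    """Return all trees rooted in S that cover the whole sentence."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


trees = top_down_parses("a flight left")
print(trees[0])
```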

13
Top-Down Space
S
14
Bottom-Up Parsing
  • Of course, we also want trees that cover the
    input words. So start with trees that link up
    with the words in the right way.
  • Then work your way up from there.
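
A minimal sketch of the bottom-up direction, again assuming the toy grammar from the phrase structure slide (the rule list layout and the name bottom_up_recognize are illustrative). It starts from the words and repeatedly replaces a right-hand side with its left-hand side until only S is left. It is a naive search that only answers yes or no and may revisit the same intermediate sequence many times.

```python
# Each rule is (left-hand side, right-hand side).
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("Verb",)),
    ("Det", ("a",)),
    ("N", ("flight",)),
    ("Verb", ("left",)),
]


def bottom_up_recognize(symbols):
    """Return True if the symbol sequence can be reduced to a single S."""
    symbols = tuple(symbols)
    if symbols == ("S",):
        return True
    for lhs, rhs in RULES:
        n = len(rhs)
        for i in range(len(symbols) - n + 1):
            if symbols[i:i + n] == rhs:
                # Replace the matched right-hand side by its left-hand side.
                reduced = symbols[:i] + (lhs,) + symbols[i + n:]
                if bottom_up_recognize(reduced):
                    return True
    return False


print(bottom_up_recognize("a flight left".split()))  # True
print(bottom_up_recognize("left a flight".split()))  # False
```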

15
Bottom-Up Space
16
Control
  • We need to keep track of the search space and
    have a strategy to make choices
  • Which node to try to expand next
  • Which grammar rule to use to expand a node
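
The two choices on this slide can be made concrete with an explicit agenda. The sketch below is one possible strategy, not the only one: a stack gives depth-first search, the leftmost symbol is always expanded first, and rules are tried in the order they are listed. The grammar is the toy grammar from the phrase structure slide, and the name recognize is hypothetical.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "N": [["flight"]],
    "Verb": [["left"]],
}


def recognize(sentence, grammar, start="S"):
    """Top-down, depth-first, left-to-right recognition with an agenda."""
    words = sentence.split()
    # Each search state: (symbols still to expand, position in the input).
    agenda = [((start,), 0)]
    while agenda:
        symbols, pos = agenda.pop()  # last in, first out: depth-first
        if not symbols:
            if pos == len(words):  # nothing left to expand, input consumed
                return True
            continue
        first, rest = symbols[0], symbols[1:]  # leftmost symbol first
        if first in grammar:
            # Push in reverse so the first rule listed is tried first.
            for rhs in reversed(grammar[first]):
                agenda.append((tuple(rhs) + rest, pos))
        elif pos < len(words) and words[pos] == first:
            agenda.append((rest, pos + 1))  # word matched, move on
    return False


print(recognize("a flight left", GRAMMAR))  # True
print(recognize("a flight", GRAMMAR))       # False
```

Swapping the stack for a queue would turn the same loop into breadth-first search.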

17
Top-Down, Depth-First, Left-to-Right Search
18
Example
19
Example
20
Example
21
Avoiding repeated work
  • Parsing is hard, and slow. It's wasteful to redo
    stuff over and over and over.
  • More efficient algorithm:
  • Dynamic programming parsing: CKY
    (Cocke-Kasami-Younger)
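
A minimal sketch of CKY used to count parses. CKY requires a grammar in Chomsky normal form, so the small grammar and lexicon below are assumptions written for this illustration, not taken from the slides. Each chart cell stores, per category, how many trees cover that span, so every sub-span is analysed only once.

```python
from collections import defaultdict

# Hypothetical CNF grammar: (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "Bond": ["NP"], "shot": ["V"], "the": ["Det"], "a": ["Det"],
    "spy": ["N"], "pistol": ["N"], "with": ["P"],
}


def cky_count(words, start="S"):
    """Count the parse trees rooted in `start` that cover all the words."""
    n = len(words)
    # chart[i][j][X] = number of trees with root X over words[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, []):
            chart[i][i + 1][tag] += 1
    for length in range(2, n + 1):          # span length, short to long
        for i in range(n - length + 1):     # span start
            j = i + length
            for k in range(i + 1, j):       # split point
                for parent, left, right in BINARY:
                    chart[i][j][parent] += chart[i][k][left] * chart[k][j][right]
    return chart[0][n][start]


print(cky_count("Bond shot the spy with a pistol".split()))  # 2
```

The three nested loops over spans and split points give the cubic running time in sentence length that CKY is known for.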

22
Ambiguity
  • Bond shot the spy with a pistol.

23
One possible structure
24
Another possible structure
25
Lots of ambiguity
  • VP → VP PP
  • NP → NP PP
  • Show me the meals on flight 286 from SF to
    Denver.
  • 14 parses!

26
Lots of ambiguity
  • Church and Patil (1982)
  • Number of parses for such sentences grows at the
    rate of the number of parenthesizations of
    arithmetic expressions
  • which grows as the Catalan numbers
  • PPs   Parses
  • 1     2
  • 2     5
  • 3     14
  • 4     42
  • 5     132
  • 6     429
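
The Catalan numbers have a closed form, so the growth can be checked directly. This sketch assumes that a sentence with n trailing PPs has as many parses as the Catalan number with index n + 1, which agrees with the first three rows above (2, 5, 14).

```python
from math import comb


def catalan(n):
    """n-th Catalan number: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


for pps in range(1, 7):
    print(pps, catalan(pps + 1))
```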

27
How to disambiguate parses?
  • Probabilistic methods
  • Augment the grammar rules with probabilities,
    computed on Treebanks
  • Modify the parser to keep only most probable
    parses
  • And at the end, return the most probable parse
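
A minimal sketch of the probabilistic idea: the probability of a tree is the product of the probabilities of the rules it uses, and the parser returns the tree with the highest product. The rule probabilities below are invented for illustration; in practice they would be estimated from treebank counts. Trees are nested tuples whose leaves are category labels, and the two trees are the verb-attachment and noun-attachment readings of "shot the spy with a pistol".

```python
from math import prod

# Hypothetical rule probabilities (each left-hand side sums to 1).
RULE_PROB = {
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("NP", ("Det", "N")): 0.8,
    ("NP", ("NP", "PP")): 0.2,
    ("PP", ("P", "NP")): 1.0,
}


def tree_prob(tree):
    """Probability of a tree = product of the probabilities of its rules."""
    if isinstance(tree, str):  # leaf category; word choice is ignored here
        return 1.0
    label, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return RULE_PROB[(label, rhs)] * prod(tree_prob(c) for c in children)


simple_np = ("NP", "Det", "N")
# The PP modifies the shooting.
verb_attach = ("VP", ("VP", "V", simple_np), ("PP", "P", simple_np))
# The PP modifies the spy.
noun_attach = ("VP", "V", ("NP", simple_np, ("PP", "P", simple_np)))

best = max([verb_attach, noun_attach], key=tree_prob)
print(best is verb_attach)  # True with these made-up probabilities
```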

28
A statistical scientific revolution
  • Computational Linguistics before 1990
  • Hand-built parsers, hand-built dialogue systems
  • High precision, low coverage methods
  • Computational Linguistics after 1995
  • Automatically trained parsers, unsupervised
    clustering, statistical machine translation
  • High coverage, low precision methods
  • LOGIC vs NGRAM (Gazdar, 1996)

29
Ambiguity
  • One morning I shot an elephant in my pajamas.
    How he got into my pajamas I don't know.
  • Groucho Marx

30
II. Text Understanding
31
The textual inference task
  • On the assumption that a piece of text T is true,
    does this imply the truth of the hypothesis H?
  • T: Sydney was the host city of the 2000 Olympics.
  • H: The Olympics have been held in Sydney.
  • T: Wal-Mart defended itself in court today
    against claims that its female employees were
    kept out of jobs in management because they are
    women.
  • H: Wal-Mart was sued for sexual discrimination.
  • PASCAL RTE Challenge (Dagan et al. 05)
  • US government AQUAINT program

32
The contradiction detection task
  • Given two sentences, are they contradictory to
    one another?
  • T: Sources in the intelligence community revealed
    that Abu Zubaydah was a low-level al-Qaeda
    operative handling minor logistics.
  • H: Abu Zubaydah was a high-ranking member of
    al-Qaeda.
  • T: UN Secretary General Kofi Annan has expressed
    deep concern over Saturday's Israeli commando
    raid deep inside Lebanon, calling it a truce
    violation.
  • H: Israel insisted it had not breached the
    ceasefire.

33
Why is it useful?
  • Several applications need automatic entailment
    and contradiction detection
  • e.g., determining similarities and differences
    in people's positions
  • "I think that it is the right idea. We can make
    sure that drivers who are illegal come out of the
    shadows." - Barack Obama
  • "I will not support driver's licenses for
    undocumented people." - Hillary Clinton
34
Approaches to text understanding
Graph matching approaches
35
Approaches to text understanding
Very precise, but poor recall
Shallow, but robust
Graph matching approaches
36
How the system works
37
  • Antonyms are only predictive of contradiction
    because they:
  • modify the same entity
  • occur in a context of the same polarity

38
How to develop and evaluate such a system?
  • Development sets
  • (training sets, when we learn something)
  • Test sets that contain the good answers
  • Two measures are used: precision and recall
  • Precision: exactness
  • Recall: completeness

items correctly retrieved / items in total
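
A minimal sketch of the two measures, using the standard definitions: precision is the share of retrieved items that are correct, and recall is the share of all correct items that were retrieved. The function name and the example item labels are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of system outputs."""
    retrieved, relevant = set(retrieved), set(relevant)
    correct = len(retrieved & relevant)  # items correctly retrieved
    precision = correct / len(retrieved) if retrieved else 0.0
    recall = correct / len(relevant) if relevant else 0.0
    return precision, recall


# The system flags four sentence pairs as contradictions; the test set
# says three pairs really are contradictions, two of which were flagged.
system_output = {"pair1", "pair2", "pair3", "pair4"}
gold_answers = {"pair1", "pair2", "pair5"}
p, r = precision_recall(system_output, gold_answers)
print(p, r)
```

A system that flags everything reaches perfect recall with poor precision, which is why the two measures are reported together.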
39
Why people get into this field
  • Passion for understanding how human language
    works
  • Passion for finding ways to use the power of
    computers to help process natural language

Kevin Knight