CSA2050 Introduction to Computational Linguistics - PowerPoint PPT Presentation

Slides: 22
Provided by: michael307
Transcript and Presenter's Notes

Title: CSA2050 Introduction to Computational Linguistics


1
CSA2050 Introduction to Computational Linguistics
  • Lecture 1
  • Overview

2
Lecture 1
  • Course Information
  • What is CL?
  • What is L?
  • Course Contents

3
Course Information
  • Web: http://www.cs.um.edu.mt/mros/csa2050
  • Lecturers: mike.rosner_at_um.edu.mt, ray.fabri_at_um.edu.mt,
    angelo.dalli_at_um.edu.mt
  • Book (nominally): Jurafsky & Martin, Speech and
    Language Processing, Prentice Hall, 2000, ISBN
    0-13-095069-6
  • NLTK

4
Human Language Technologies
  • Natural Language Processing (NLP)
    • Computational models of language analysis,
      interpretation, and generation.
    • Syntax/semantics interface
  • Natural Language Engineering
    • Emphasis on large-scale performance
    • Example: Google
  • Speech Technology
  • Computational Linguistics
    • Emphasis on mechanised linguistic theories.
    • Grew out of early Machine Translation efforts

5
CL: Two Main Disciplines
  • Computer Science
  • Linguistics

6
Linguistics
  • Phonetics: The study of speech sounds
  • Phonology: The study of sound systems
  • Morphology: The study of word structure
  • Syntax: The study of sentence structure
  • Semantics: The study of meaning
  • Pragmatics: The study of language use

7
Noam Chomsky
  • Noam Chomsky's work in the 1950s radically
    changed linguistics, making syntax central.
  • Chomsky has been the dominant figure in
    linguistics ever since.
  • Chomsky invented the generative approach to
    grammar.

8
Generative Grammar: Key Points
  • A language is a possibly infinite set of strings.
  • Grammar is a finite description of that set.
  • Grammar is precisely defined.
  • Theory of Grammar is a theory of human linguistic
    abilities.
  • Grammar should generate all and only the strings
    of the language.
  • Source: Sag & Wasow

9
A Simple Grammar and Lexicon
  • Grammar
    • S → NP VP
    • NP → N
    • VP → V NP
  • Lexicon
    • V → kicks
    • N → John
    • N → Bill
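
As a minimal sketch, the grammar and lexicon on this slide can be written down as Python dictionaries and expanded exhaustively. The names `GRAMMAR`, `LEXICON`, and `expand` are illustrative, not part of the lecture, and the approach assumes the grammar is finite (no recursive rules), which holds for this toy example.

```python
from itertools import product

# Rules from the slide: each left-hand symbol maps to its possible expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["N"]],
    "VP": [["V", "NP"]],
}
# Lexicon from the slide: each word category maps to its words.
LEXICON = {
    "V": ["kicks"],
    "N": ["John", "Bill"],
}

def expand(symbol):
    """Return every word sequence that `symbol` can derive (finite grammars only)."""
    if symbol in LEXICON:
        return [[word] for word in LEXICON[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each right-hand-side symbol, in order.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

sentences = [" ".join(words) for words in expand("S")]
for sentence in sentences:
    print(sentence)
```

Under these assumptions the grammar generates exactly four strings, including "John kicks John", which is well formed according to the rules even if it is unusual English.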

10
Generative Power of a Grammar
  • Overgeneration: G generates all but not only the strings of L
  • Undergeneration: G generates only but not all the strings of L
  • Goal: G generates all and only the strings of L
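
For finite toy examples, the relationship between the set of strings a grammar G generates and a target language L can be checked with plain set operations. The sketch below is only an illustration: the function name `compare` and the sample sets are hypothetical, and real languages are typically infinite, so they cannot be enumerated this way.

```python
def compare(generated, language):
    """Classify a grammar's output against a target language (both finite sets)."""
    missing = language - generated   # strings of L that G fails to produce
    extra = generated - language     # strings G produces that are not in L
    if not missing and not extra:
        return "all and only"
    if not missing:
        return "overgeneration"      # all, but not only
    if not extra:
        return "undergeneration"     # only, but not all
    return "neither all nor only"

# Hypothetical target language L and three hypothetical grammar outputs.
L = {"John kicks Bill", "Bill kicks John"}
G_over = L | {"John kicks John", "Bill kicks Bill"}
G_under = {"John kicks Bill"}
G_exact = set(L)

for name, g in [("G_over", G_over), ("G_under", G_under), ("G_exact", G_exact)]:
    print(name, "->", compare(g, L))
```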
11
Formal v. Natural Languages
  • Formal Languages
    • Numbers: 3290 1 1010101
    • Logic: ∀x man(x) → mortal(x)
    • C: if (i > 10) exit(0)
  • Natural Languages
    • English: John saw the dog
    • German: Johann hat den Hund gesehen
    • Maltese: Gianni ra kelb

12
Points of Similarity
  • A language is considered to be a (possibly
    infinite) set of sentences.
  • Sentences are sequences of tokens.
  • Formation rules determine which sequences are
    valid sentences.
  • Sentences have a definite structure.
  • Sentence structure related to meaning.
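
The idea that sentences are token sequences and that formation rules decide which sequences are valid can be sketched as a small recognizer. The rules below reuse the toy grammar S → NP VP, NP → N, VP → V NP with the words John, Bill, and kicks. The helper names `match` and `is_sentence` are illustrative, and the simple top-down strategy assumes a grammar without left recursion.

```python
# Formation rules: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["N"]],
    "VP": [["V", "NP"]],
}
# Lexicon: token -> word category.
LEXICON = {"kicks": "V", "John": "N", "Bill": "N"}

def match(symbol, tokens, pos):
    """Return the set of positions reachable after deriving `symbol` from tokens[pos:]."""
    if symbol not in GRAMMAR:
        # Word category: consume one token if its category matches.
        if pos < len(tokens) and LEXICON.get(tokens[pos]) == symbol:
            return {pos + 1}
        return set()
    ends = set()
    for rhs in GRAMMAR[symbol]:
        positions = {pos}
        for child in rhs:
            # Advance through the right-hand side one symbol at a time.
            positions = {e for p in positions for e in match(child, tokens, p)}
        ends |= positions
    return ends

def is_sentence(text):
    """A token sequence is a sentence if S can consume all of its tokens."""
    tokens = text.split()
    return len(tokens) in match("S", tokens, 0)

for candidate in ["John kicks Bill", "kicks John Bill", "John kicks"]:
    print(candidate, "->", is_sentence(candidate))
```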

13
Structure Affects Meaning
I shot an elephant in my trousers
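
The two readings of this sentence correspond to two different structures: the phrase "in my trousers" can attach to "an elephant" or to the shooting. A minimal sketch that enumerates both structures follows. The grammar is an assumption written for this one sentence, not taken from the lecture, and the names `parses` and `bracket` are illustrative.

```python
from functools import lru_cache

# Assumed toy grammar with binary rules only.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}
# Assumed lexicon: one category per word.
LEXICON = {
    "I": "NP", "shot": "V", "an": "Det", "elephant": "N",
    "in": "P", "my": "Det", "trousers": "N",
}
WORDS = "I shot an elephant in my trousers".split()

@lru_cache(maxsize=None)
def parses(symbol, start, end):
    """Return all trees (nested tuples) deriving WORDS[start:end] from `symbol`."""
    trees = []
    if end - start == 1 and LEXICON.get(WORDS[start]) == symbol:
        trees.append((symbol, WORDS[start]))
    for left, right in GRAMMAR.get(symbol, []):
        # Try every way of splitting the span between the two children.
        for split in range(start + 1, end):
            for l in parses(left, start, split):
                for r in parses(right, split, end):
                    trees.append((symbol, l, r))
    return tuple(trees)

def bracket(tree):
    """Render a tree as a labelled bracketing."""
    if isinstance(tree[1], str):
        return f"[{tree[0]} {tree[1]}]"
    return f"[{tree[0]} " + " ".join(bracket(child) for child in tree[1:]) + "]"

all_trees = parses("S", 0, len(WORDS))
for tree in all_trees:
    print(bracket(tree))
```

With this grammar the sentence receives two distinct trees, one per reading. A broader grammar would typically produce more.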
14
Points of Difference
  • Formal Languages
    • The grammar defines the language
    • Restricted application
    • Unambiguous
  • Natural Languages
    • The language defines the grammar
    • Universal application
    • Highly ambiguous

15
Ambiguity
  • Lexical Ambiguity: the sheep is in the pen
  • Syntactic Ambiguity: small animals and children
    laugh
  • Semantic Ambiguity: every girl loves a sailor
  • Pragmatic Ambiguity: can you pass the salt?
  • The management of ambiguity is central to the
    success of CL

16
Algorithms and Linguistics
  • Pure linguistics deals with
    • data
    • grammar rules
    • theories about grammar rules
  • Putting knowledge to some use involves
    processing.
  • Linguistic theory is silent about implementation
    issues
  • Implementation is central to Computational
    Linguistics

17
Computational Linguistics Issues
  • Representation of grammar and a lexicon
  • How is the structure of a given sentence actually
    discovered?
  • Generation of a sentence to express a particular
    meaning?
  • Learning a language with limited exposure to
    grammatical sentences?

18
Unimplemented theories can be dangerous
  • Representational details omitted.
  • Computer memory/complexity issues omitted.
  • Nature of individual steps may be unclear.
  • Difficult to test.
  • Potentially unimplementable

19
Computational Linguistics: Twin Goals
  • Scientific Goal: Contribute to Linguistics by
    adding a computational dimension.
  • Technological Goal: Develop a basis for machinery
    capable of handling human language that can
    support language engineering.

20
Applications of Computational Linguistics
  • Machine Translation
  • Information Retrieval/Extraction
  • Document Classification
  • Question Answering
  • Style and Spell Checking
  • Dialogue Systems
  • Speech

21
LECTURES
1 Overview
2 IE
3 POS RF
4 Tagging
5 Chunking
6 Syntax RF
7 NL Parsing
8 NL Generation
9 Morphology RF
10 Lexicon
11 Spell Checking
12 Dialogue
13 Speech
14 Revision