Title: 74.406 Natural Language Processing


1
74.406 Natural Language Processing
  • Christel Kemke
  • Department of Computer Science
  • University of Manitoba

74.406 Natural Language Processing, 1st term 2004/5
2
Evolution of Human Language
  • communication for "work"
  • social interaction
  • basis of cognition and thinking
  • (Whorf, Sapir)

3
Communication
"Communication is the intentional exchange of
information brought about by the production and
perception of signs drawn from a shared system of
conventional signs." (Russell & Norvig, p. 651)
4
Natural Language - General
  • Natural Language is characterized by
  • a common or shared set of signs: alphabet, lexicon
  • a systematic procedure to produce combinations of
    signs: syntax
  • a shared meaning of signs and combinations of
    signs: (constructive) semantics

5
Natural Language Processing Overview
  • Speech Recognition
  • Natural Language Processing
  • Syntax
  • Semantics
  • Pragmatics
  • Spoken Language

6
Natural Language and Speech
  • Speech Recognition
  • acoustic signal as input
  • conversion into phonemes and written words
  • Natural Language Processing
  • written text as input: sentences (or
    'utterances')
  • syntactic analysis: parsing, grammar
  • semantic analysis: "meaning", semantic
    representation
  • pragmatics: dialogue, discourse, metaphors
  • Spoken Language Processing
  • transcribed utterances
  • Phenomena of spontaneous speech

7
Words
8
Morphology
  • A morphological analyzer determines (at least)
  • the stem and ending of a word,
  • and usually delivers related information, like
  • the word class,
  • the case and
  • the person of the word.
  • The morphology can be part of the lexicon or
    implemented as a separate component, for example as
    a rule-based system.
  • eats → eat + s: verb, singular, 3rd person
  • dog → dog: noun, singular
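The two analyses above can be sketched as a very small rule-based analyzer. This is only an illustration: the stem table `STEMS`, the function name `analyze`, and the single "-s" rule are assumptions made for the example, not part of the course material.

```python
# Hypothetical stem table: stem -> word class (only 'eat' and 'dog' come from the slide).
STEMS = {"eat": "verb", "dog": "noun", "bone": "noun"}

def analyze(word):
    """Split a word into stem and ending and return related information."""
    if word in STEMS:
        # Bare stem: no ending.
        word_class = STEMS[word]
        features = ["singular"] if word_class == "noun" else ["base form"]
        return {"stem": word, "ending": "", "class": word_class, "features": features}
    if word.endswith("s") and word[:-1] in STEMS:
        # Single illustrative rule: stem + "s".
        stem = word[:-1]
        word_class = STEMS[stem]
        features = ["singular", "3rd person"] if word_class == "verb" else ["plural"]
        return {"stem": stem, "ending": "s", "class": word_class, "features": features}
    return None  # unknown word

print(analyze("eats"))
print(analyze("dog"))
```

A real analyzer would need many more rules (irregular forms, spelling changes such as "goes"), which is why morphology is often kept as its own component.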


9
Lexicon
  • The Lexicon contains information on words, as
  • inflected forms (e.g. goes, eats) or
  • word-stems (e.g. go, eat).
  • The Lexicon usually assigns a syntactic category,
  • the word class or Part-of-Speech category
  • Sometimes also
  • further syntactic information (see Morphology)
  • semantic information (e.g. semantic
    classifications like agent)
  • syntactic-semantic information, e.g. on verb
    complements, like "give requires a direct object".

10
Lexicon
  • Example contents
  • eats → verb, singular, 3rd person;
    can have direct object
  • dog → dog, noun, singular; animal
    (semantic annotation)
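One possible way to store such entries is a dictionary keyed by word form. The sketch below is an assumption about layout only: the class `LexEntry`, its field names, and the function `lookup` are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LexEntry:
    stem: str
    word_class: str          # part-of-speech category
    syntax: tuple = ()       # further syntactic information
    semantics: tuple = ()    # semantic annotation
    complements: tuple = ()  # syntactic-semantic information

# The two entries mirror the example contents on the slide.
LEXICON = {
    "eats": LexEntry("eat", "verb", ("singular", "3rd person"),
                     complements=("direct object (optional)",)),
    "dog": LexEntry("dog", "noun", ("singular",), semantics=("animal",)),
}

def lookup(word):
    """Return the lexicon entry for a word form, or None if it is not listed."""
    return LEXICON.get(word.lower())

print(lookup("dog"))
```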

11
POS (Part-of-Speech) Tagging
  • POS Tagging determines word class or
    part-of-speech category (basic syntactic
    categories) of single words or word-stems.
  • The: det (determiner)
  • dog: noun
  • eat, eats: verb (3rd singular)
  • the: det
  • bone: noun
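For the example sentence, tagging can be sketched as a plain table lookup. This is a deliberately minimal assumption (one tag per word, no ambiguity handling); the table `TAGS` and the function `pos_tag` are hypothetical names.

```python
# Hypothetical tag table covering only the words of the example.
TAGS = {"the": "det", "dog": "noun", "bone": "noun", "eat": "verb", "eats": "verb"}

def pos_tag(sentence):
    """Assign a part-of-speech category to every word of the sentence."""
    tokens = sentence.lower().rstrip(".").split()
    return [(token, TAGS.get(token, "unknown")) for token in tokens]

print(pos_tag("The dog eats the bone."))
```

Words with several possible categories need context to be tagged correctly, which a lookup table like this cannot provide.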

12
NLP - Syntactic Analysis
Components: Part-of-Speech (POS) Tagging, Morphological Analyzer, Lexicon, Grammar Rules, Parser
Example flow: eats → eat + s (verb, 3rd sing) → Verb; rule VP → Verb Noun; VP recognized → parse tree
13
Syntax
14
Language and Grammar
  • Natural Language described as Formal Language L
    using a Formal Grammar G
  • start-symbol S: sentence
  • non-terminals NT: syntactic constituents
  • terminals T: lexical entries / words
  • production rules P: grammar rules
  • Generate sentences or recognize sentences
    (Parsing) of the language L through the
    application of grammar rules.

15
Grammar
  • Terminals can be words, part-of-speech
    categories, or more complex lexical items
    (including additional syntactic/semantic
    information related to the word)
  • Non-Terminals represent (higher level) syntactic
    categories

16
Grammar
  • Most often we deal with Context-free Grammars,
    with a distinguished Start-symbol S (sentence).
  • det → the
  • noun → dog | bone
  • verb → eat | eats
  • NP → det noun (NP = noun phrase)
  • VP → verb (VP = verb phrase)
  • VP → verb NP
  • S → NP VP (S = sentence)
  • Here, POS Tagging is included in the grammar.
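To illustrate the "generate" direction, the grammar above can be written as a table and expanded exhaustively. This is a sketch that works only because this particular grammar has no recursion; the names `GRAMMAR` and `generate` are chosen for the example.

```python
from itertools import product

# The grammar from the slide: each non-terminal maps to its alternative right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["det", "noun"]],
    "VP": [["verb"], ["verb", "NP"]],
    "det": [["the"]],
    "noun": [["dog"], ["bone"]],
    "verb": [["eat"], ["eats"]],
}

def generate(symbol):
    """Return every word sequence derivable from symbol (the grammar is non-recursive)."""
    if symbol not in GRAMMAR:
        return [[symbol]]  # a terminal: the word itself
    results = []
    for rhs in GRAMMAR[symbol]:
        expansions = [generate(s) for s in rhs]
        for combo in product(*expansions):
            results.append([word for part in combo for word in part])
    return results

sentences = [" ".join(words) for words in generate("S")]
print(len(sentences))
print(sentences)
```

The output also contains strings such as "the dog eat the bone": the grammar has no agreement rules, so it overgenerates.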

17
Parsing (here LR, bottom-up)
  • Determine the syntactic structure of a sentence.
  • the → det (POS Tagging)
  • dog → noun
  • det noun → NP (Rule application)
  • eats → verb
  • the → det
  • bone → noun
  • det noun → NP
  • verb NP → VP
  • NP VP → S
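The sequence of steps above can be reproduced with a small bottom-up parser. Note the assumption: this sketch is a backtracking shift-reduce parser, not a table-driven LR parser, and the names `LEXICAL`, `RULES`, and `parse` are invented for the example.

```python
LEXICAL = {"the": "det", "dog": "noun", "bone": "noun", "eat": "verb", "eats": "verb"}
RULES = [
    (("det", "noun"), "NP"),
    (("verb", "NP"), "VP"),
    (("verb",), "VP"),
    (("NP", "VP"), "S"),
]

def parse(stack, words, steps):
    """Backtracking shift-reduce: return the steps that reduce the input to S, or None."""
    if stack == ["S"] and not words:
        return steps
    # Reduce: try every rule whose right-hand side matches the top of the stack.
    for rhs, lhs in RULES:
        n = len(rhs)
        if tuple(stack[-n:]) == rhs:
            found = parse(stack[:-n] + [lhs], words, steps + [f"{' '.join(rhs)} -> {lhs}"])
            if found is not None:
                return found
    # Shift: tag the next word and push its category onto the stack.
    if words and words[0] in LEXICAL:
        tag = LEXICAL[words[0]]
        return parse(stack + [tag], words[1:], steps + [f"{words[0]} -> {tag}"])
    return None  # dead end, the caller backtracks

for step in parse([], "the dog eats the bone".split(), []):
    print(step)
```

Backtracking matters here: reducing "verb" to VP right after "eats" leads to a dead end, so the parser undoes that step and shifts "the bone" first.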

18
Syntax Analysis / Parsing
  • Syntactic Structure often represented as Parse
    Tree.
  • Connect symbols according to applied grammar
    rules.

19
Parse Trees
20
Lexical Ambiguity
  • Several word senses or word categories
  • e.g. chase noun or verb
  • e.g. plant - ????

21
Syntactic Ambiguity
  • Several parse trees
  • e.g. The dog eats the bone in the park.
  • e.g. The dog eats the bone in the package.
  • Who/what is in the park and who/what is in the
    package?
  • Syntactically speaking
  • How do I bind the Prepositional Phrase "in the
    ..." ?
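The two readings can be made concrete by enumerating parse trees. The sketch below assumes an extended grammar that the slides do not give: the rules NP → NP PP, VP → VP PP, and PP → prep NP, plus the lexical entries for "in" and "park", are additions made only for this illustration, and the rule VP → verb is left out to keep all rules binary.

```python
from functools import lru_cache

LEXICON = {"the": "det", "dog": "noun", "bone": "noun", "park": "noun",
           "eats": "verb", "in": "prep"}
BINARY = [
    ("S", "NP", "VP"),
    ("NP", "det", "noun"),
    ("NP", "NP", "PP"),   # assumed rule: PP attached to a noun phrase
    ("VP", "verb", "NP"),
    ("VP", "VP", "PP"),   # assumed rule: PP attached to the verb phrase
    ("PP", "prep", "NP"),
]

def parses(words, symbol="S"):
    """Return all parse trees (as bracketed strings) of the word list."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def build(sym, i, j):
        trees = []
        if j - i == 1 and LEXICON.get(words[i]) == sym:
            trees.append(f"({sym} {words[i]})")
        for lhs, left, right in BINARY:
            if lhs != sym:
                continue
            for k in range(i + 1, j):  # every split point of the span
                for lt in build(left, i, k):
                    for rt in build(right, k, j):
                        trees.append(f"({sym} {lt} {rt})")
        return tuple(trees)

    return list(build(symbol, 0, len(words)))

trees = parses("the dog eats the bone in the park".split())
for tree in trees:
    print(tree)
```

One tree attaches "in the park" to "the bone", the other to the eating. Syntax alone cannot choose between them; the choice for "park" versus "package" needs semantic knowledge.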

22
Semantics
23
Semantic Representation
  • Represent the meaning of the sentence.
  • Generate
  • a logic-based representation or
  • a frame-based representation, e.g.
  • Fillmore's case frames
  • based on the syntactic structure, lexical
    entries, and particularly the head-verb
    (determines how to arrange parts of the sentence
    in the semantic representation).

24
Semantic Representation
  • Verb-centered representation
  • The verb (action, head) is regarded as the center of
    the verbal expression and determines the case frame
    with its possible case roles; other parts of the
    sentence are described in relation to the action
    as fillers of case slots. (cf. also Schank's CD
    Theory)
  • Typing of case roles possible (e.g. 'agent'
    refers to a specific sort or concept)

25
General Frame for eat
  • Agent: animate
  • Action: eat
  • Patiens: food
  • Manner: e.g. fast
  • Location: e.g. in the yard
  • Time: e.g. at noon

26
Frame with fillers for sample sentence
  • Agent: the dog
  • Action: eat
  • Patiens: the bone / the bone in the package
  • Location: in the park
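Filling the general frame for eat with typed case roles can be sketched as follows. The sort table `SORTS`, the function `fill_eat_frame`, and the shortcut of taking the last word of a phrase as its head are assumptions made for the example.

```python
# Hypothetical sort table: head noun -> semantic sort.
SORTS = {"dog": "animate", "bone": "food"}

# Case roles of the general frame for eat; None means the role is untyped here.
ROLE_TYPES = {"Agent": "animate", "Patiens": "food",
              "Manner": None, "Location": None, "Time": None}

def fill_eat_frame(fillers):
    """Build a frame for 'eat', checking typed roles against the sort table."""
    frame = {"Action": "eat"}
    for role, phrase in fillers.items():
        if role not in ROLE_TYPES:
            raise ValueError(f"unknown case role: {role}")
        required = ROLE_TYPES[role]
        head = phrase.split()[-1]  # assumption: the last word is the head noun
        if required is not None and SORTS.get(head) != required:
            raise ValueError(f"{phrase!r} is not of sort {required!r}")
        frame[role] = phrase
    return frame

print(fill_eat_frame({"Agent": "the dog", "Patiens": "the bone", "Location": "in the park"}))
```

With typed roles, a filler such as "the bone" in the Agent slot is rejected, because a bone is not of sort animate.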

28
  • General Frame for drive | Frame with fillers
  • Agent: animate | Agent: she
  • Action: drive | Action: drives
  • Patiens: vehicle | Patiens: the convertible
  • Manner: the way it is done | Manner: fast
  • Location: Location-spec | Location: in the Rocky Mountains
  • Source: Location-spec | Source: from home
  • Destination: Location-spec | Destination: to the ASIC conference
  • Time: Time-spec | Time: in the summer holiday

29
Pragmatics
30
Pragmatics
  • Pragmatics includes context-related aspects of NL
    expressions (utterances).
  • These are in particular anaphoric references,
    elliptic expressions, and deictic expressions:
  • anaphoric references: refer to items mentioned
    before
  • deictic expressions: simulate pointing
    gestures
  • elliptic expressions: incomplete expressions that
    relate to an item mentioned before

31
Pragmatics
  • I put the box on the top shelf.
  • I know that. But I can't find it there.

deictic expression
anaphoric reference
32
Pragmatics
  • I put the box on the top shelf.
  • I know that. But I can't find it there.

anaphoric reference
33
Pragmatics
  • I put the box on the top shelf.
  • I know that. But I can't find it there.

deictic expression
34
Pragmatics
  • I put the box on the top shelf.
  • I know that. But I can't find it there.
  • The candy-box?

deictic expression
anaphoric reference
elliptic expression
35
Pragmatics
  • I put the box on the top shelf.
  • The candy-box?

elliptic expression
36
Intentions
  • Intentions
  • One philosophical assumption is that natural
    language is used to achieve something
  • Do things with words.
  • The meaning of an utterance is essentially
    determined by the intention of the speaker.

37
Intentionality - Examples
  • What was said | What was meant
  • "There is a terrible draft here." | "Can you please
    close the window?"
  • "How does it look here?" | "I am really mad;
    clean up your room."
  • "Will this ever end?" | "I would prefer to be
    with my friends than to sit in class now."

38
Metaphors
  • Metaphors
  • The meaning of a sentence or expression is not
    directly inferable from the sentence structure
    and the word meanings. Metaphors transfer
    concepts and relations from one area of discourse
    into another area, for example, seeing time as a
    line (in space) or seeing friendship or life as a
    journey.

39
Metaphors - Examples
  • This car eats a lot of gas.
  • She devoured the book.
  • He was tied up with his clients.
  • Marriage is like a journey.
  • Their marriage was a one-way road into hell.
  • (see also George Lakoff, e.g. Women, Fire, and
    Dangerous Things)

40
Dialogue and Discourse
41
Discourse / Dialogue Structure
  • Grammar for various sentence types (speech acts):
    dialogue, discourse, story grammar
  • Distinguish questions, commands, and statements
  • Where is the remote-control?
  • Bring the remote-control!
  • The remote-control is on the brown table.
  • Dialogue Grammars describe possible sequences of
    Speech Acts in communication, e.g. that a
    question is followed by an answer/statement.
  • Similar for Discourse (like continuous texts).
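A dialogue grammar of this kind can be sketched as a table of which speech act may follow which. Both parts are strong simplifications and assumptions: `speech_act` classifies by final punctuation only, and the table `FOLLOWS` encodes just the one constraint mentioned above (a question is followed by a statement).

```python
def speech_act(utterance):
    """Classify an utterance as question, command, or statement by its final punctuation."""
    text = utterance.strip()
    if text.endswith("?"):
        return "question"
    if text.endswith("!"):
        return "command"
    return "statement"

# Hypothetical dialogue grammar: speech act -> acts allowed to follow it.
ANY = {"question", "command", "statement"}
FOLLOWS = {"question": {"statement"}, "command": ANY, "statement": ANY}

def well_formed(dialogue):
    """Check that every adjacent pair of utterances is licensed by the dialogue grammar."""
    acts = [speech_act(u) for u in dialogue]
    return all(nxt in FOLLOWS[cur] for cur, nxt in zip(acts, acts[1:]))

print(well_formed(["Where is the remote-control?",
                   "The remote-control is on the brown table."]))
```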

42
Speech
43
Speech Processing Systems: Types and
Characteristics
  • Speech Recognition vs. Speaker Recognition (Voice
    Recognition, Speaker Identification)
  • speaker-dependent vs. speaker-independent
  • training?
  • unlimited vs. large vs. small vocabulary
  • single word vs. continuous speech

44
Speech Recognition Phases
  • acoustic signal as input
  • signal analysis - spectrogram
  • feature extraction
  • phoneme recognition
  • word recognition
  • conversion into written words

45
Spoken Language
46
Spoken Language
  • Output of Speech Recognition System as input
    "text".
  • Can be associated with probabilities for
    different word sequences.
  • Contains ungrammatical structures, so-called
    "disfluencies", e.g. repetitions and corrections.

47
Spoken Language - Examples
  • no s- straight southwest
  • right to my my left
  • that is that is correct

Robin J. Lickley. HCRC Disfluency Coding Manual.
http://www.ling.ed.ac.uk/robin/maptask/HCRCdsm-01.html
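Simple disfluencies like the three examples above can be cleaned up with a rough heuristic: drop word fragments and collapse immediate repetitions. This is only a sketch under those assumptions and does not follow the HCRC coding scheme; the function name `remove_repetitions` and the limit `max_len` are invented for the example.

```python
def remove_repetitions(utterance, max_len=3):
    """Drop word fragments (ending in '-') and immediately repeated word sequences."""
    words = [w for w in utterance.split() if not w.endswith("-")]
    out = []
    i = 0
    while i < len(words):
        skipped = False
        for n in range(max_len, 0, -1):
            chunk = words[i:i + n]
            if len(chunk) == n and chunk == words[i + n:i + 2 * n]:
                i += n  # skip the first copy, keep the repeated one
                skipped = True
                break
        if not skipped:
            out.append(words[i])
            i += 1
    return " ".join(out)

for example in ["no s- straight southwest", "right to my my left", "that is that is correct"]:
    print(remove_repetitions(example))
```

Corrections in which the repair differs from the reparandum are not handled by this heuristic.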
48
Spoken Language - Disfluency
  • Reparandum and Repair

come to ... walk right to the ... the
right-hand side of the page
Reparandum
Repair
49
Spoken Language - Example
  • we're going to g-- ... turn straight back
    around for testing.
  • come to ... walk right to the ... right-hand
    side of the page.
  • right up ... past ... up on the left of the ...
    white mountain walk ... right up past.
  • i'm still ... i've still gone halfway back
    round the lake again.

50
Spoken Language - Example
  • I'd d if I need to go
  • it's basi-- see if you go over the old mill
  • you are going make a gradual slope to your
    right
  • I've got one I don't realize why it is there
