Part II. Statistical NLP

1
Advanced Artificial Intelligence
  • Part II. Statistical NLP

Applications of HMMs and PCFGs in NLP
Wolfram Burgard, Luc De Raedt, Bernhard Nebel,
Lars Schmidt-Thieme
Most slides taken (or adapted) from Adam
Przepiorkowski (Poland); figures from Manning and
Schuetze
2
Contents
  • Part of Speech Tagging
  • Task
  • Why
  • Approaches
  • Naive
  • VMM
  • HMM
  • Transformation Based Learning
  • Probabilistic Parsing
  • PCFGs and Tree Banks
  • Parts of chapters 10, 11, 12 of Statistical NLP,
    Manning and Schuetze, and Chapter 8 of Jurafsky
    and Martin, Speech and Language Processing.

3
Motivations and Applications
  • Part-of-speech tagging
  • The representative put chairs on the table
  • AT NN VBD NNS IN AT NN
  • AT JJ NN VBZ IN AT NN
  • Two possible taggings of the same sentence: the
    second reads representative as an adjective, put
    as a noun, and chairs as a verb
  • Some tags
  • AT = article, NN = singular or mass noun, VBD =
    verb (past tense), NNS = plural noun, IN =
    preposition, JJ = adjective, VBZ = verb (3rd
    person singular present)

4
Table 10.1
5
Why POS tagging?
  • First step in parsing
  • More tractable than full parsing, intermediate
    representation
  • Useful as a step for several other, more complex
    NLP tasks, e.g.
  • Information extraction
  • Word sense disambiguation
  • Speech Synthesis
  • Oldest task in Statistical NLP
  • Easy to evaluate
  • Inherently sequential

6
Different approaches
  • Start from a tagged training corpus
  • And learn from it
  • Simplest approach
  • For each word, predict its most frequent tag
    (a 0-th order Markov model); see the sketch below
  • Gets about 90% accuracy at the word level (English)
  • Best taggers
  • 96-97% accuracy at the word level (English)
  • At the sentence level, with e.g. 20 words per
    sentence, that is on average about one tagging
    error per sentence
  • Unsure how much better one can do (human
    annotators make errors too)
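
A minimal sketch of this most-frequent-tag baseline, assuming the
tagged corpus is given as a list of (word, tag) pairs; all names are
illustrative:

```python
from collections import Counter, defaultdict

def train_baseline(tagged_corpus):
    """Most-frequent-tag baseline (0-th order Markov model): for each
    word, remember the tag it occurred with most often in training."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag_baseline(model, sentence, default_tag="NN"):
    # back off to a default tag (here NN) for unseen words
    return [model.get(w, default_tag) for w in sentence]

corpus = [("the", "AT"), ("representative", "NN"), ("put", "VBD"),
          ("chairs", "NNS"), ("on", "IN"), ("the", "AT"), ("table", "NN")]
model = train_baseline(corpus)
print(tag_baseline(model, ["the", "representative", "put", "chairs"]))
```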

7
Notation / Table 10.2
8
Visible Markov Model
  • Assume the visible Markov model (VMM) of last week
  • We are representing the probability of the tag
    sequence only
  • Lexical (word) information remains implicit
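
The decomposition this slide alludes to, in the notation of Manning
and Schuetze, ch. 10 (with t_0 a sentence-initial pseudo-tag):

```latex
P(t_1, \ldots, t_n) = \prod_{i=1}^{n} P(t_i \mid t_{i-1})
```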

9
Table 10.3
10
Hidden Markov Model
  • Make the lexical information explicit and use
    HMMs
  • State values correspond to possible tags
  • Observations correspond to possible words
  • So, we have the model below
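
A hedged reconstruction of the formula behind "so, we have": HMM
tagging picks the tag sequence maximizing the joint probability of
tags and words,

```latex
\hat{t}_{1 \ldots n} = \operatorname*{arg\,max}_{t_1, \ldots, t_n}
  \prod_{i=1}^{n} P(w_i \mid t_i) \, P(t_i \mid t_{i-1})
```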

11
Estimating the parameters
  • From a tagged corpus, use maximum likelihood
    estimation
  • So, even though we are learning a hidden Markov
    model, everything is visible during learning!
  • Possibly apply smoothing (cf. n-grams); a
    counting sketch follows
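
A minimal counting sketch of these maximum likelihood estimates,
assuming the corpus is a list of sentences, each a list of
(word, tag) pairs; function and variable names are illustrative:

```python
from collections import Counter

def mle_hmm(tagged_sentences):
    """Relative-frequency estimates for an HMM tagger:
    P(t_i | t_{i-1}) = C(t_{i-1}, t_i) / C(t_{i-1}) and
    P(w_i | t_i) = C(t_i, w_i) / C(t_i)."""
    trans, emit, tag_count = Counter(), Counter(), Counter()
    for sent in tagged_sentences:
        prev = "<s>"                      # sentence-initial pseudo-tag
        tag_count[prev] += 1
        for word, tag in sent:
            trans[(prev, tag)] += 1       # tag bigram counts
            emit[(tag, word)] += 1        # tag-word counts
            tag_count[tag] += 1
            prev = tag
    p_trans = {pair: c / tag_count[pair[0]] for pair, c in trans.items()}
    p_emit = {pair: c / tag_count[pair[0]] for pair, c in emit.items()}
    return p_trans, p_emit
```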

12
Table 10.4
13
Tagging with HMM
  • For an unknown sentence, now employ the Viterbi
    algorithm to tag it (sketched below)
  • Similar techniques are employed for protein
    secondary structure prediction
  • Problems
  • The need for a large corpus
  • Unknown words (cf. Zipf's law)
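
A hedged sketch of Viterbi decoding over the estimates from the
previous slide; the tiny floor probability for unseen events stands
in for real smoothing and is an assumption of this sketch:

```python
import math

def viterbi(words, tags, p_trans, p_emit, start="<s>"):
    """Find the most probable tag sequence under an HMM.
    p_trans[(t_prev, t)] = P(t | t_prev), p_emit[(t, w)] = P(w | t)."""
    floor = 1e-12                 # stand-in for proper smoothing
    # delta[t]: best log-probability of any sequence ending in tag t
    delta = {t: math.log(p_trans.get((start, t), floor))
                + math.log(p_emit.get((t, words[0]), floor)) for t in tags}
    back = []
    for w in words[1:]:
        new_delta, pointers = {}, {}
        for t in tags:
            best_prev = max(tags, key=lambda tp: delta[tp]
                            + math.log(p_trans.get((tp, t), floor)))
            new_delta[t] = (delta[best_prev]
                            + math.log(p_trans.get((best_prev, t), floor))
                            + math.log(p_emit.get((t, w), floor)))
            pointers[t] = best_prev
        back.append(pointers)
        delta = new_delta
    # follow the back-pointers from the best final tag
    best = max(tags, key=lambda t: delta[t])
    path = [best]
    for pointers in reversed(back):
        best = pointers[best]
        path.append(best)
    return list(reversed(path))
```

For example, viterbi(["the", "representative"], ["AT", "NN", "JJ"],
p_trans, p_emit) with the estimates produced by mle_hmm above.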

14
Unknown words
  • Two classes of parts of speech
  • open and closed (e.g. articles)
  • For closed classes all words are known, so an
    unknown word must belong to an open class
  • Estimate P(w | t) for unknown words from word
    features such as capitalization and suffix; Z is
    the normalization constant in that estimate (see
    below)
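
One plausible reconstruction of the estimate the Z refers to (a
feature-based unknown-word model of the kind described by Manning
and Schuetze); the particular features here are assumptions:

```latex
P(w \mid t) = \frac{1}{Z} \,
  P(\text{unknown word} \mid t) \,
  P(\text{capitalized} \mid t) \,
  P(\text{suffix}(w) \mid t)
```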

15
What if no corpus is available?
  • Use traditional HMM training (Baum-Welch) but
  • Assume a dictionary (lexicon) that lists the
    possible tags for each word
  • One possibility: initialize the word generation
    (symbol emission) probabilities from the
    dictionary, as sketched below
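
A minimal sketch of such an initialization, assuming the lexicon
maps every word to its set of admissible tags; names are
illustrative:

```python
def init_emissions(lexicon):
    """Initialize HMM emission probabilities from a dictionary that
    lists the possible tags per word, before running Baum-Welch."""
    emit = {}
    for word, tags in lexicon.items():
        for tag in tags:
            # spread each word's mass uniformly over its admissible
            # tags; (tag, word) pairs the lexicon rules out stay at 0
            emit[(tag, word)] = 1.0 / len(tags)
    # renormalize per tag so that the P(w | t) sum to 1 for each t
    totals = {}
    for (tag, _), p in emit.items():
        totals[tag] = totals.get(tag, 0.0) + p
    return {(t, w): p / totals[t] for (t, w), p in emit.items()}

lexicon = {"the": {"AT"}, "chairs": {"NNS", "VBZ"}, "table": {"NN", "VB"}}
print(init_emissions(lexicon))
```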

16
(No Transcript)
17
Transformation Based Learning (Eric Brill)
  • Observation
  • Predicting the most frequent tag already results
    in excellent behaviour
  • Why not try to correct the mistakes that are
    made?
  • Apply transformation rules
  • IF condition THEN replace tag_i by tag_j
  • Which transformations / corrections are
    admissible?
  • How can these be learned? (see the sketch after
    the learning-algorithm slide)

18
Table 10.7/10.8
19
(No Transcript)
20
The learning algorithm
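
The original slide shows the algorithm only as a figure; below is a
minimal Python sketch of the greedy selection loop behind
transformation-based learning, assuming rules are
(condition, from_tag, to_tag) triples. The helper names (apply_rule,
errors) are illustrative, not from the slides:

```python
def apply_rule(rule, tags):
    """Retag every position whose tag matches from_tag and whose
    context condition holds; cond is a predicate over (tags, i)."""
    cond, from_tag, to_tag = rule
    return [to_tag if t == from_tag and cond(tags, i) else t
            for i, t in enumerate(tags)]

def errors(tags, gold):
    return sum(t != g for t, g in zip(tags, gold))

def tbl_learn(tags, gold, candidate_rules, min_gain=1):
    """Greedily pick the transformation that fixes the most remaining
    errors; stop when no candidate helps enough."""
    learned = []
    while True:
        scored = [(errors(tags, gold) - errors(apply_rule(r, tags), gold), r)
                  for r in candidate_rules]
        gain, best = max(scored, key=lambda x: x[0])
        if gain < min_gain:
            break
        tags = apply_rule(best, tags)
        learned.append(best)
    return learned, tags

# Brill's classic example rule: retag NN as VB after TO
rule = (lambda tags, i: i > 0 and tags[i - 1] == "TO", "NN", "VB")
print(tbl_learn(["TO", "NN"], ["TO", "VB"], [rule]))
```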
21
Remarks
  • Other machine learning methods could be applied
    as well (e.g. decision trees, rule learning )

22
Rule-based tagging
  • Oldest method, hand-crafted rules
  • Start by assigning all potential tags to each
    word
  • Disambiguate using manually created rules
  • E.g., for the word "that"
  • If
  • The next word is an adjective, an adverb or a
    quantifier,
  • And the following symbol is a sentence boundary,
  • And the previous word is not a consider-type verb
  • Then erase all tags apart from the adverbial tag
  • Else erase the adverbial tag
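
A hedged code rendering of this hand-written rule; the tag names (RB
for adverb, JJ/RB/QL for adjective/adverb/quantifier) and the
consider-type verb list are illustrative assumptions:

```python
def disambiguate_that(candidates, next_tag, next_is_sentence_final,
                      prev_word, consider_verbs=("consider", "regard")):
    """candidates: the set of tags initially assigned to "that".
    Returns the reduced tag set after applying the rule."""
    if (next_tag in {"JJ", "RB", "QL"}   # adjective, adverb or quantifier
            and next_is_sentence_final   # followed by a sentence boundary
            and prev_word not in consider_verbs):
        return {"RB"}                    # keep only the adverbial tag
    return candidates - {"RB"}           # otherwise erase the adverbial tag

# "... isn't that odd."  ->  keep only the adverbial reading
print(disambiguate_that({"CS", "DT", "RB"}, "JJ", True, "isn't"))
```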

23
Learning PCFGs for parsing
  • Learning from complete data
  • Everything is observed (visible); the examples
    are parse trees
  • Cf. POS tagging from tagged corpora
  • PCFGs: learning from tree banks
  • Easy: just counting
  • Learning from incomplete data
  • Harder: the EM approach
  • The inside-outside algorithm
  • Learning from the sentences alone (no parse trees
    given)

24
(No Transcript)
25
How does it work ?
  • R = { r | r is a rule that occurs in one of the
    parse trees in the corpus }
  • For all rules r in R do
  • Estimate the probability of each labelled rule
    by relative frequency
  • P(N → ζ) = Count(N → ζ) / Count(N)
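
A minimal sketch of this counting, assuming trees are nested
(label, child, ...) tuples whose leaves are plain word strings;
all names are illustrative:

```python
from collections import Counter

def treebank_pcfg(trees):
    """Read off a PCFG from a treebank by counting:
    P(N -> zeta) = Count(N -> zeta) / Count(N)."""
    rule_count, lhs_count = Counter(), Counter()

    def visit(node):
        label, children = node[0], node[1:]
        # the right-hand side: child labels, or the word for leaves
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        rule_count[(label, rhs)] += 1
        lhs_count[label] += 1
        for c in children:
            if not isinstance(c, str):
                visit(c)

    for tree in trees:
        visit(tree)
    return {rule: n / lhs_count[rule[0]] for rule, n in rule_count.items()}

toy = [("S", ("NP", "she"), ("VP", ("V", "runs")))]
print(treebank_pcfg(toy))
```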

26
Conclusions
  • POS tagging as an application of statistical NLP
  • VMM, HMMs, TBL
  • Statistical taggers
  • Good results for positional languages (English)
  • Relatively cheap to build
  • Overfitting avoidance needed
  • Difficult to interpret (black box)
  • Linguistically naive

27
Conclusions
  • Rule-based taggers
  • Very good results
  • Expensive to build
  • Presumably better for free word order languages
  • Interpretable
  • Transformation based learning
  • A good compromise?
  • Tree bank grammars
  • Pretty effective (and easy to learn)
  • But hard to get the corpus.