1
Shallow Parsing Using CRFs
  • Ashutosh Agarwal
  • Paper by Fei Sha and Fernando Pereira

2
Shallow Parsing (Chunking)
  • Introduction
  • Identifies the non-recursive parts of various
    phrase types in text
  • e.g., [NP Tom Hanks] ran around [NP America].
  • Tom Hanks and America are the two chunks
    identified
  • NP chunking: finding the non-recursive NPs
  • Also called base NPs

3
Previous Approaches
  • Machine Learning Approaches
  • K-order generative probabilistic models
  • e.g., Hidden markov models
  • Makes very naive independence assumptions
  • Otherwise intractable
  • As a sequence of classification tasks
  • Classification of lable depends on input data
    prev classified labels of words
  • Trained to make best local decision
  • Myopic about the effect of current decision on
    later decisions

4
CRF NP Chunker
  • Input to chunker POS tagged corpora
  • Output Sequence of 'B', 'I', 'O' where
  • 'B' represents beginning of chunk
  • 'I' represents continuation of chunk
  • 'O' represents outside of a chunk
  • Hence 'OI' can never occur
  • One label for each word in sentence
  • e.g. Tom Hanks ran around America.
  • Output BIOOB
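A minimal sketch of this encoding (not from the paper): the Python below converts chunk spans into BIO labels; the function name and the (start, end) span representation are assumptions for illustration.

```python
def chunks_to_bio(tokens, chunk_spans):
    """Convert chunk spans (start, end) over a token list into BIO labels.

    Hypothetical helper, not from the paper. 'end' is exclusive.
    """
    labels = ["O"] * len(tokens)          # default: outside any chunk
    for start, end in chunk_spans:
        labels[start] = "B"               # beginning of the chunk
        for i in range(start + 1, end):
            labels[i] = "I"               # continuation of the chunk
    return labels

# [NP Tom Hanks] ran around [NP America] -> B I O O B
print(chunks_to_bio(["Tom", "Hanks", "ran", "around", "America"],
                    [(0, 2), (4, 5)]))
```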

5
Chunking CRF
  • Has a second-order Markov dependency between
    chunk tags
  • i.e., the CRF labels pairs of consecutive tags
  • Thus, the label at position i is c_{i-1}c_i
  • And the label at i-1 is c_{i-2}c_{i-1}
  • And the label at 0 is c_0
  • These constraints can be enforced by giving the
    appropriate features a weight of negative
    infinity (see the sketch below)
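A sketch of this idea, assuming nothing beyond the slide: labels become pairs of consecutive chunk tags, and impossible transitions get a weight of negative infinity. The names below are hypothetical, not the paper's code.

```python
import itertools

TAGS = ["B", "I", "O"]

# A first-order model over paired labels c_{i-1}c_i captures the
# second-order dependency between chunk tags.
PAIR_LABELS = [a + b for a, b in itertools.product(TAGS, TAGS)]

def transition_weight(prev_pair, cur_pair):
    """Hard constraints as -infinity weights (illustrative sketch)."""
    if prev_pair[1] != cur_pair[0]:   # pairs must agree on the shared tag
        return float("-inf")
    if cur_pair == "OI":              # 'I' can never follow 'O'
        return float("-inf")
    return 0.0                        # learned weights would go here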

6
Chunking CRFs
  • We can factor our feature set as
  • f(y_{i-1}, y_i, x, i) = p(x, i) q(y_{i-1}, y_i)
  • p(x, i) is a predicate on the input sequence x
    and position i
  • e.g., the word at position i is 'the'
  • q(y_{i-1}, y_i) is a predicate on pairs of labels
  • e.g., the POS tags at positions i and i-1 are
    DT, NN, etc. (see the sketch below)
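A sketch of this factorization in Python, with hypothetical predicate names; the paper's actual predicate set covers many word, POS, and label patterns.

```python
def p_word_is_the(x, i):
    """Input predicate p(x, i): the word at position i is 'the'."""
    return x[i].lower() == "the"

def q_pair_is_B_I(y_prev, y_cur):
    """Label predicate q(y_{i-1}, y_i): labels at i-1 and i are B, I."""
    return (y_prev, y_cur) == ("B", "I")

def f(y_prev, y_cur, x, i):
    """Factored binary feature f = p(x, i) * q(y_{i-1}, y_i)."""
    return int(p_word_is_the(x, i) and q_pair_is_B_I(y_prev, y_cur))
```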

7
Features in Chunking CRFs
  • Since the number of POS tags, labels to be
    generated, etc. is finite
  • The number of features is also finite
  • Millions of features on large training sets
  • More features lead to better accuracy
  • But might also lead to overfitting

8
Example Feature Set
9
Evaluation Metric
  • Precision P
  • Fraction of output chunks that exactly match the
    reference chunks
  • Recall R
  • Fraction of reference chunks returned by the
    chunker
  • F1 score (used for comparison with other systems)
  • F1 = 2PR / (P + R) (see the sketch below)
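A small sketch of these chunk-level metrics, assuming chunks are represented as exact (start, end) spans; the function name is hypothetical.

```python
def prf1(predicted, reference):
    """Chunk-level precision, recall, and F1 over sets of spans.

    A predicted chunk counts only if it exactly matches a
    reference chunk.
    """
    correct = len(predicted & reference)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(reference) if reference else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# One of two predicted chunks matches: P = R = F1 = 0.5
print(prf1({(0, 2), (4, 5)}, {(0, 2), (3, 5)}))
```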

10
Empirical Results
11
Conclusion
  • Log-linear parsing models have the potential to
    supplant the currently dominant PCFG parsing
    models
  • They allow a much richer feature set
  • Simpler smoothing
  • They avoid the label-bias problem
  • Which is prevalent in classifier-based parsers

12
References
  • Fei Sha and Fernando Pereira. Shallow Parsing
    with Conditional Random Fields. HLT-NAACL 2003,
    University of Pennsylvania.