1
Natural Language Applications of Instance-Based
Learning
  • Stress acquisition (Daelemans et al., 1994)
  • Grapheme-to-phoneme conversion (van den Bosch &
    Daelemans, 1993)
  • Part-of-speech tagging (Daelemans et al., 1996)
  • Domain-specific lexical tagging (Cardie, 1993)
  • Word sense disambiguation (Ng & Lee, 1996;
    Mooney, 1996)
  • Partial parsing (Argamon et al., 1998; Cardie &
    Pierce, 1998)
  • PP-attachment (Daelemans et al., 1999)
  • Context-sensitive parsing (Simmons & Yu, 1992)
  • Text categorization (Riloff & Lehnert, 1994)

2
Part-of-Speech Tagging (Daelemans & Zavrel, 1996)
  • Similarity metric: Hamming distance
  • Feature weighting: information gain
  • Training method
  • Tagging (see the sketch after this slide)
  • If the word is in the lexicon, use the
    known-word case base
  • Else use the unknown-word case base
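A minimal sketch of the scheme on this slide, assuming toy data structures (each case base as a `(cases, tags, weights)` triple with cases as feature tuples); the real memory-based tagger differs in many details, and the helper names here are mine, not the paper's:

```python
import math
from collections import Counter

def information_gain(cases, labels, feature_idx):
    """Weight for one feature: label entropy minus the expected
    label entropy after splitting on the feature's values."""
    def entropy(items):
        n = len(items)
        return -sum((c / n) * math.log2(c / n)
                    for c in Counter(items).values())
    split = {}
    for case, label in zip(cases, labels):
        split.setdefault(case[feature_idx], []).append(label)
    expected = sum(len(s) / len(labels) * entropy(s)
                   for s in split.values())
    return entropy(labels) - expected

def weighted_hamming(a, b, weights):
    """IG-weighted Hamming distance: sum the weights of the
    features on which the two cases disagree."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)

def tag(word, context, lexicon, known_base, unknown_base):
    """Dispatch to the known- or unknown-word case base, then
    return the tag of the nearest stored case (1-NN here)."""
    cases, tags, weights = known_base if word in lexicon else unknown_base
    nearest = min(range(len(cases)),
                  key=lambda i: weighted_hamming(cases[i], context,
                                                 weights))
    return tags[nearest]
```

Here each feature's weight would be precomputed once with `information_gain` over the training cases, so informative features dominate the distance.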

3
Case Representation
  • Known words: 4 features
  • disambiguated tags of the two preceding tokens
    (2 features)
  • list of possible tags for the focus token
  • list of possible tags for the following token

4
Case Representation
  • Unknown words: 6 features (both representations
    are sketched after this slide)
  • disambiguated tag of the preceding token
  • list of possible tags for the following token
  • last three letters (3 features, in lieu of
    morphological info)
  • first letter (to provide capitalization and
    prefix information)
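Both case representations could be assembled roughly as follows; the helper names and the tag-set encoding are assumptions for illustration, not the original implementation:

```python
def known_word_case(prev2_tag, prev1_tag, focus_tags, next_tags):
    """Known word, 4 features: disambiguated tags of the two
    preceding tokens, plus the ambiguous tag sets of the focus
    and following tokens (each set joined into one symbol)."""
    return (prev2_tag, prev1_tag,
            "-".join(sorted(focus_tags)),
            "-".join(sorted(next_tags)))

def unknown_word_case(prev1_tag, next_tags, word):
    """Unknown word, 6 features: preceding disambiguated tag,
    following ambiguous tag set, last three letters (a stand-in
    for morphology), and the first letter (capitalization and
    prefix information)."""
    padded = word.rjust(3, "_")   # pad very short words
    return (prev1_tag, "-".join(sorted(next_tags)),
            padded[-3], padded[-2], padded[-1], word[0])
```

For example, `unknown_word_case("DT", {"NN", "NNP"}, "Toyota")` yields `("DT", "NN-NNP", "o", "t", "a", "T")`.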

5
Results
  • Penn Treebank WSJ (Marcus et al., 1993)
  • 2 million words of training data
  • 200,000-word blind test set
  • numbers are considered unknown words
  • Performs comparably to transformation-based
    learning (Brill, 1995)
  • Outperforms the best Markov-model-based tagger
    (Weischedel et al., 1993)

6
Domain-Specific Lexical Tagging (Cardie, 1993,
1994)
  • Information extraction system: business joint
    ventures
  • Task: for each word in the input stream,
    determine its
  • part-of-speech (18 tags)
  • general semantic class (14 classes)
  • specific semantic class (42 classes)
  • information extraction concept (11 concepts)
  • Treat each prediction task independently

7
Case Retrieval
  • Similarity metric
  • k-nearest neighbor
  • Hamming distance (partial matches allowed)
  • Use training corpus to determine k
  • Feature selection: decision trees
  • Retrieval algorithm (sketched after this slide)
  • Retrieve the top k cases.
  • Return those cases whose focus-word feature
    matches the focus word, if any exist. Otherwise
    return all k cases.
  • Let the retrieved cases vote on the four class
    values.
  • Case base construction
  • p-o-s, IE concept
  • 120 sentences from the MUC business JV corpus
  • 2,056 cases for open-class words
  • semantic classes
  • 175 sentences from the MUC business JV corpus
  • 3,060 cases for open-class words
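A sketch of the retrieval algorithm above, assuming each stored case is a dict with a `features` tuple and a `classes` dict holding the four class values (the dict layout is an assumption for illustration):

```python
from collections import Counter

def hamming(a, b):
    """Plain Hamming distance; partial matches simply disagree
    on fewer features and so score lower."""
    return sum(x != y for x, y in zip(a, b))

def retrieve_and_vote(query, case_base, k, focus_idx,
                      tasks=("pos", "gen_sem", "spec_sem", "concept")):
    """Retrieve the top k cases, keep those whose focus-word
    feature matches the query's if any exist, then let the
    survivors vote on each class value independently."""
    top_k = sorted(case_base,
                   key=lambda c: hamming(c["features"], query))[:k]
    same_focus = [c for c in top_k
                  if c["features"][focus_idx] == query[focus_idx]]
    voters = same_focus or top_k   # otherwise fall back to all k
    return {t: Counter(c["classes"][t]
                       for c in voters).most_common(1)[0][0]
            for t in tasks}
```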

8
Results
  • Lexical tagging tasks
  • part-of-speech tagging: 95.0%
  • general semantic class: 85.7%
  • specific semantic class: 86.3%
  • IE concept: 96.8%
  • 60-70% for non-nil concepts
  • Replacing the CBL system for semantic class
    tagging with handcrafted, conservative
    heuristics caused a severe drop in the recall
    of the IE system (41%).
  • No separate procedure or case representation is
    needed for unknown words: 89.1% accuracy for
    p-o-s tagging of unknown words.
  • Can detect when known words are appearing in
    entirely new contexts.

9
Word sense disambiguation (Ng & Lee, 1996)
  • Sense definitions from WordNet
  • Builds one classifier per word
  • Case representation for word w
  • Li: correct p-o-s of the word i positions to
    the left
  • Ri: correct p-o-s of the word i positions to
    the right
  • M: morphological form of w
  • K1 ... Km: binary features indicating the
    presence of m keywords that frequently co-occur
    with w in the same sentence
  • determined by computing P(sense i | keyword k)
    for all words that appear with w in a sentence
  • P(sense i | keyword k) > M1
  • k must appear > M2 times with sense i
  • at most M3 keywords are chosen
  • C1 ... C9: local collocations containing w
  • e.g. interest, interest rate, national
    interest in
  • V: verb that takes w as an object and is
    predictive of sense i

L3, L2, L1, R3, R2, R1, M, K1, ..., Km, C1, ...,
C9, V
(this vector is assembled in the sketch below)
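One way to assemble that vector; the collocation windows, padding symbol, and helper names are illustrative assumptions rather than the paper's exact choices:

```python
# Nine local-collocation windows as (lo, hi) offsets around the
# focus word; the focus word itself is included when lo <= 0 <= hi.
# These particular windows are illustrative, not Ng & Lee's list.
WINDOWS = [(-1, -1), (1, 1), (-2, -2), (2, 2), (-2, -1),
           (-1, 1), (1, 2), (-3, -1), (1, 3)]

def wsd_case(tokens, pos_tags, focus, keywords, verb):
    """Build (L3..L1, R3..R1, M, K1..Km, C1..C9, V) for the word
    at position `focus`; `keywords` is the precomputed keyword
    list for this word and `verb` its governing verb (or None)."""
    def pos_at(offset):
        i = focus + offset
        return pos_tags[i] if 0 <= i < len(pos_tags) else "<pad>"
    def colloc(lo, hi):
        return " ".join(tokens[i] if 0 <= i < len(tokens) else "<pad>"
                        for i in range(focus + lo, focus + hi + 1))
    left = [pos_at(-d) for d in (3, 2, 1)]     # L3, L2, L1
    right = [pos_at(d) for d in (3, 2, 1)]     # R3, R2, R1
    morph = tokens[focus].lower()              # M: surface form as a
                                               # stand-in for morphology
    sentence = set(tokens)
    k_feats = [int(k in sentence) for k in keywords]    # K1 .. Km
    c_feats = [colloc(lo, hi) for lo, hi in WINDOWS]    # C1 .. C9
    return left + right + [morph] + k_feats + c_feats + [verb]
```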
10
Case Retrieval
  • Uses the PEBLS system (Cost & Salzberg, 1993)
  • 1-nearest neighbor
  • distance between two values v1 and v2 of
    feature f (sketched after this slide):
    d(v1, v2) = Σ_i | C1,i / C1 - C2,i / C2 |
  • where
  • C1,i is the number of training examples with
    value v1 for f that are classified as sense i,
    and
  • C1 is the number of training examples with
    value v1 for f in any sense
  • (C2,i and C2 are defined analogously for v2)
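A sketch of that value-difference computation (the modified value difference metric used by PEBLS), with an assumed fallback for unseen values:

```python
from collections import Counter, defaultdict

def value_counts(training, feature_idx):
    """Precompute C_{v,i} (count of value v labeled sense i) and
    C_v (count of value v overall) for one feature, from
    (case, sense) training pairs."""
    by_sense = defaultdict(Counter)
    for case, sense in training:
        by_sense[case[feature_idx]][sense] += 1
    totals = {v: sum(c.values()) for v, c in by_sense.items()}
    senses = {s for c in by_sense.values() for s in c}
    return by_sense, totals, senses

def mvdm(v1, v2, by_sense, totals, senses):
    """d(v1, v2) = sum_i |C1,i/C1 - C2,i/C2|: two values are close
    when they induce similar sense distributions."""
    if v1 not in totals or v2 not in totals:
        return 1.0   # arbitrary fallback for unseen values,
                     # not PEBLS's actual handling
    return sum(abs(by_sense[v1][i] / totals[v1] -
                   by_sense[v2][i] / totals[v2]) for i in senses)
```

The distance between two full cases then sums these per-feature value differences; PEBLS additionally weights exemplars by their reliability, which is omitted here.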

11
Evaluation
  • Bruce & Wiebe (1994) data set
  • 2,369 sentences with the noun interest
  • 6 possible senses
  • 100 trials: 600 random test sentences, 1,769
    training sentences each
  • Accuracy: 87.4% (1.37% std. dev.)
  • Bruce & Wiebe attain 78% accuracy using
    decomposable probabilistic models
  • substantially higher accuracy than any previous
    WSD work on interest
  • New corpus
  • 192,800 sense-tagged words
  • 121 nouns, 7.8 senses per noun
  • 70 verbs, 12.0 senses per verb
  • these frequently occurring words account for
    20% of all word occurrences
  • test set 1 (Brown corpus): 54.0% vs. 47.1% for
    the most frequent sense
  • test set 2 (WSJ): 68.6% vs. 63.7% for the most
    frequent sense