1
SVM and Perceptron Based IE Systems

Yaoyong Li, Kalina Bontcheva, Hamish Cunningham
Department of Computer Science
University of Sheffield
{yaoyong,kalina,hamish}@dcs.shef.ac.uk
http://gate.ac.uk/ http://nlp.shef.ac.uk/

2
Outline

Description of our participating systems
Discussion of the results
Additional experiments for analysing the algorithms

3
Three Participating Systems

Three systems participated in task1, task2a and task2b.
System1: combined the tags from System2 and System3.
System2: uneven margins SVM.
System3: Perceptron with uneven margins (PAUM).
Classifier-based framework for IE.

4
Pre-processing

Context window (window_size = 10).
Reciprocal weighting: weight by 1/j the features from the jth token to the left or right of the current token.
NLP features: Token, Capitalisation, Token types, Entity information; POS was not used (see below).
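
A minimal sketch of the reciprocal weighting over a context window, in Python. The function name, the feature keys and the choice to give the current token weight 1 are assumptions made for illustration; only the 1/j weighting and the window size come from the slide, and only the token string is used as a feature here.

```python
from collections import defaultdict

def window_features(tokens, index, window_size=10):
    """Sparse feature vector (a dict) for tokens[index].

    Hypothetical helper: the key layout ("left"/"right"/"current", token)
    is an assumption, not the authors' feature encoding.
    """
    features = defaultdict(float)
    # Assumption: the current token's own feature gets full weight.
    features[("current", tokens[index])] = 1.0
    for j in range(1, window_size + 1):
        weight = 1.0 / j  # reciprocal weighting: nearer tokens count more
        if index - j >= 0:
            features[("left", tokens[index - j])] += weight
        if index + j < len(tokens):
            features[("right", tokens[index + j])] += weight
    return dict(features)
```

For example, with window_size=2 the token two places to the left contributes a feature with weight 0.5, while the adjacent token contributes weight 1.0.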

5
Classification Problems

Two classification problems for each type of entity:
one for recognising the start token, and another for recognising the end token.
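
As an illustration, the training labels for the two problems could be derived from annotated entity spans roughly as follows. The span format (inclusive token indices plus a type string) and the function name are assumptions for this sketch.

```python
def boundary_labels(num_tokens, entities, entity_type):
    """Build start-token and end-token label sequences for one entity type.

    entities: list of (first_index, last_index, type) with inclusive
    token indices (an assumed annotation format).
    """
    start = [0] * num_tokens
    end = [0] * num_tokens
    for first, last, etype in entities:
        if etype == entity_type:
            start[first] = 1  # positive example for the start classifier
            end[last] = 1     # positive example for the end classifier
    return start, end
```

A single-token entity is a positive example for both classifiers, and most tokens are negative for both, which is where the imbalanced data comes from.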

6
Post-processing

The procedure has three stages.
First, make the start and end tags consistent by removing spurious tags.
Then, remove the tags of any entity whose length does not equal the length of any entity of the same type in the training set.
Finally, choose the best entity type for a piece of text according to the scores.
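
The three stages might be sketched as below. This is an illustration under assumptions, not the authors' implementation: the input format, the use of the product of start and end scores as the entity score, and the greedy resolution of overlapping candidates are all choices made for the example.

```python
def post_process(starts, ends, train_lengths):
    """Turn start/end tags into entities.

    starts, ends: dict mapping entity type -> list of (token_index, score).
    train_lengths: dict mapping entity type -> set of entity lengths
    observed in the training set.
    Returns a sorted list of (start, end, type, score) without overlaps.
    """
    candidates = []
    for etype, start_tags in starts.items():
        for s, s_score in start_tags:
            for e, e_score in ends.get(etype, []):
                # Stage 1: a start tag only matches an end tag at or after it;
                # unmatched (spurious) tags produce no candidate at all.
                if e < s:
                    continue
                # Stage 2: drop candidates whose length was never seen
                # for this type in training.
                if (e - s + 1) not in train_lengths.get(etype, set()):
                    continue
                # Assumption: entity score = start score * end score.
                candidates.append((s, e, etype, s_score * e_score))
    # Stage 3: for overlapping candidates keep the best-scoring one.
    candidates.sort(key=lambda c: c[3], reverse=True)
    chosen = []
    for cand in candidates:
        if all(cand[1] < c[0] or cand[0] > c[1] for c in chosen):
            chosen.append(cand)
    return sorted(chosen)
```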

7
The SVM Classifier

The classification hyperplane has the same margin to the negative and the positive training examples.

8
Uneven Margins SVM -- for imbalanced data

9
Uneven Margins SVM

Introduce an uneven margins parameter t into the SVM (see Li and Shawe-Taylor, 2003).
t is the ratio of the negative margin to the positive margin.
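
One way to write the resulting optimisation problem is sketched below. It is a reconstruction consistent with the definition of t on this slide, not a quotation from the cited paper; w, b, the slack variables ξ_i and the cost parameter C are the usual soft-margin SVM symbols and are assumed here.

```latex
\begin{aligned}
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}}\quad
  & \langle \mathbf{w},\mathbf{w}\rangle + C\sum_{i=1}^{m}\xi_i \\
\text{subject to}\quad
  & \langle \mathbf{w},\mathbf{x}_i\rangle + b \ge 1 - \xi_i
      && \text{if } y_i = +1, \\
  & \langle \mathbf{w},\mathbf{x}_i\rangle + b \le -t + \xi_i
      && \text{if } y_i = -1, \\
  & \xi_i \ge 0
      && i = 1,\dots,m.
\end{aligned}
```

With t = 1 this reduces to the standard (even margins) SVM; with t < 1 the positive examples keep the larger margin.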

10
Uneven Margins Perceptron

Perceptron: a simple and fast linear learning algorithm.
PAUM introduces two margin parameters into the Perceptron (see Li et al., 2002).
PAUM's performance was comparable to that of the SVM on document classification.
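
A minimal sketch of a Perceptron with two margin parameters, in plain Python. The update rule (add the example to the weights and shift the bias by the squared data radius whenever an example falls inside its class margin) is written from memory of the cited algorithm and should be read as an assumption; the names tau_pos, tau_neg, eta and max_epochs are chosen for this example.

```python
def train_paum(examples, labels, tau_pos, tau_neg, eta=1.0, max_epochs=100):
    """Train a linear classifier with separate positive and negative margins.

    examples: list of feature vectors (lists of numbers).
    labels: list of +1 / -1.
    Returns (weights, bias).
    """
    w = [0.0] * len(examples[0])
    b = 0.0
    # Squared radius of the data, used to scale the bias update (assumed rule).
    r_sq = max(sum(v * v for v in x) for x in examples)
    for _ in range(max_epochs):
        updated = False
        for x, y in zip(examples, labels):
            tau = tau_pos if y == 1 else tau_neg
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin <= tau:  # inside the class margin: on-line update
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y * r_sq
                updated = True
        if not updated:  # every example clears its margin
            break
    return w, b
```

Setting tau_pos = tau_neg = 0 gives the ordinary Perceptron; a larger tau_pos than tau_neg asks for more room around the (rarer) positive examples.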

11
Combination

System2 and System3 had much in common but were complementary in other respects:
quadratic kernel vs linear kernel,
batch optimisation vs on-line optimisation.
Combination was done at the level of tags, not on the scores of the classifiers.
Take the tags of System2 and System3 together as the tags of System1.
Where System2 and System3 conflict, adopt the tags from System2.
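
A sketch of the tag-level combination. The slides do not define a conflict, so this example assumes that a System3 tag conflicts with System2 whenever its token span overlaps a System2 tag; the tag format is likewise an assumption.

```python
def combine_tags(system2_tags, system3_tags):
    """Union of the two systems' tags, preferring System2 on conflicts.

    Tags are (start, end, entity_type) tuples with inclusive token indices.
    Assumption: a conflict is any overlap between a System3 span and a
    System2 span.
    """
    combined = list(system2_tags)
    for tag in system3_tags:
        overlaps = any(
            tag[0] <= other[1] and other[0] <= tag[1] for other in system2_tags
        )
        if not overlaps:  # only System3 found something here: keep it
            combined.append(tag)
    return sorted(set(combined))
```

Because the combination only ever adds tags to those of System2, it can raise recall but cannot raise precision on the tags System2 already produced.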

12
Active learning -- task2b

We used the Gram-Schmidt orthogonalisation algorithm (GS) to select training documents.
GS chooses a subset of examples such that, in feature space,
they are furthest from each other,
and also furthest from a predefined subset if one is given.
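
A minimal sketch of greedy selection by Gram-Schmidt orthogonalisation, in plain Python. At each step it picks the document vector with the largest residual after projecting out the directions already covered; the function names and the greedy largest-residual criterion are assumptions for this illustration.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _residual(vector, basis):
    """Part of vector orthogonal to every (orthonormal) basis vector."""
    r = [float(v) for v in vector]
    for q in basis:
        coeff = _dot(r, q)
        r = [ri - coeff * qi for ri, qi in zip(r, q)]
    return r

def _extend(basis, vector, eps=1e-12):
    """Add the normalised residual of vector to the basis, if non-zero."""
    r = _residual(vector, basis)
    norm = math.sqrt(_dot(r, r))
    if norm > eps:
        basis.append([ri / norm for ri in r])

def gram_schmidt_select(vectors, k, predefined=(), eps=1e-12):
    """Indices of up to k vectors that are far from each other and from
    the optional predefined subset, in the residual-norm sense."""
    basis = []
    for p in predefined:  # start from the already-selected documents
        _extend(basis, p)
    chosen = []
    for _ in range(min(k, len(vectors))):
        best_index, best_norm = None, eps
        for i, v in enumerate(vectors):
            if i in chosen:
                continue
            r = _residual(v, basis)
            norm = math.sqrt(_dot(r, r))
            if norm > best_norm:
                best_index, best_norm = i, norm
        if best_index is None:  # nothing new left to add
            break
        chosen.append(best_index)
        _extend(basis, vectors[best_index])
    return chosen
```

For instance, given the vectors [2, 0], [1.8, 0.2] and [0, 1], selecting two returns the first and the third, since the second is almost parallel to the first.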

13
Results -- task1

The combination obtained higher recall at the cost of precision.
PAUM needed much less computation time and memory than the SVM.

14
Results -- task2a

15
Results -- task2b

No significant improvement over randomly selecting documents.
Documents were compared using a bag-of-words model, which is not relevant to the task of recognising specific entities in a document.

16
Uneven Margins of SVM

F1 for 4-fold CV on task1 with different uneven margins settings.
The uneven margins models were better than the even margins model.

17
Uneven Margins -- small data

Comparison of uneven and even margins SVM on task2a.
The uneven margins parameter is particularly useful for small data.

18
NLP features for IE

19
Weighting Scheme

20
Post-processing

21
Conclusions

The uneven margins parameter was indeed helpful to the SVM for IE, especially for small data.
PAUM performed well for IE.
The combination system had better results than both the SVM and PAUM systems.
Future work:
Exploit the specific entity information for active learning.
Apply uneven margins SVM and PAUM to other NLP learning tasks.

22
Thanks!
