1
SVM and Perceptron Based IE Systems
  • Yaoyong Li, Kalina Bontcheva, Hamish Cunningham
  • Department of Computer Science
  • University of Sheffield
  • {yaoyong,kalina,hamish}@dcs.shef.ac.uk
  • http://gate.ac.uk/ http://nlp.shef.ac.uk/


2
Outline
  • Description of our participating systems
  • Discussion of the results
  • Additional experiments for analysing the
    algorithms

3
Three Participating Systems
  • Three systems participated in task1, task2a and
    task2b.
  • System1: combines the tags from System2 and
    System3.
  • System2: uneven margins SVM.
  • System3: Perceptron with uneven margins (PAUM).
  • All three use a classifier-based framework for IE.

4
Pre-processing
  • Context window (window_size = 10).
  • Reciprocal weighting: features from the jth token
    to the left or right of the current token are
    weighted by 1/j.
  • NLP features: token, capitalisation, token type
    and entity information; POS not used (see below).
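
The reciprocal weighting above can be illustrated with a short sketch. It is not the authors' code: the function name `window_features`, the feature-name prefixes, and the choice to pool left and right context into one feature space are all assumptions made for illustration, and only the token string is used as a feature.

```python
from collections import defaultdict


def window_features(tokens, i, window_size=10):
    """Sparse feature vector for tokens[i] built from its context window.

    The current token gets weight 1; a token j positions to the left or
    right gets weight 1/j (reciprocal weighting). Hypothetical helper.
    """
    feats = defaultdict(float)
    feats[f"tok={tokens[i]}"] += 1.0
    for j in range(1, window_size + 1):
        for pos in (i - j, i + j):
            if 0 <= pos < len(tokens):
                # Assumption: left and right context share one feature space.
                feats[f"ctx={tokens[pos]}"] += 1.0 / j
    return dict(feats)
```

Nearby tokens thus dominate the representation, while tokens at the edge of the window contribute only a tenth of their unweighted value.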

5
Classification Problems
  • Two binary classification problems for each type
    of entity.
  • One recognises the start token of an entity, the
    other recognises the end token.
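
As a rough sketch of this setup, the two label sequences for one entity type could be derived from annotated spans as below. The function name `boundary_labels`, the inclusive `(start, end)` span format, and the +1/-1 label encoding are assumptions, not taken from the slides.

```python
def boundary_labels(n_tokens, entities):
    """Build start-token and end-token labels for one entity type.

    entities: list of (start, end) inclusive token indices (assumed format).
    Returns two lists of +1/-1 labels, one per classification problem.
    """
    start = [-1] * n_tokens
    end = [-1] * n_tokens
    for s, e in entities:
        start[s] = 1  # positive example for the start-token classifier
        end[e] = 1    # positive example for the end-token classifier
    return start, end
```

Because only boundary tokens are positive examples, each problem is heavily imbalanced, which is the situation the uneven margins parameter is meant to address.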

6
Post-processing
  • The procedure has three stages.
  • First, keep the start and end tags consistent by
    removing spurious tags.
  • Then, remove any tagged entity whose length does
    not equal the length of any entity of the same
    type in the training set.
  • Finally, choose the best entity type for a piece
    of text according to the scores.
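
One way the three stages might fit together is sketched below. The slides do not say how start and end scores are combined or how competing candidates are resolved, so summing the two scores and resolving overlaps greedily are assumptions, as are the function name `post_process` and its input format.

```python
def post_process(starts, ends, train_lengths):
    """Hypothetical three-stage post-processing.

    starts, ends: dict mapping entity type -> {token index: classifier score}.
    train_lengths: dict mapping entity type -> set of lengths seen in training.
    Returns non-overlapping (start, end, type, score) tuples.
    """
    candidates = []
    for etype, start_scores in starts.items():
        end_scores = ends.get(etype, {})
        allowed = train_lengths.get(etype, set())
        for s, s_score in start_scores.items():
            for e, e_score in end_scores.items():
                length = e - s + 1
                # Stage 1: a start tag must pair with an end tag at or after it.
                # Stage 2: the span length must occur in training for this type.
                if length >= 1 and length in allowed:
                    # Assumption: span score is the sum of the two scores.
                    candidates.append((s, e, etype, s_score + e_score))
    # Stage 3: for overlapping text, keep the best-scoring candidate.
    candidates.sort(key=lambda c: c[3], reverse=True)
    kept = []
    for s, e, etype, score in candidates:
        if all(e < ks or s > ke for ks, ke, _, _ in kept):
            kept.append((s, e, etype, score))
    return sorted(kept)
```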

7
The SVM Classifier
  • The standard SVM's classification hyperplane has
    the same margin to the negative and the positive
    training examples.

8
Uneven Margins SVM -- for imbalanced data

9
Uneven Margins SVM
  • We introduce an uneven margins parameter t into
    the SVM (see Li and Shawe-Taylor, 2003).
  • t is the ratio of the negative margin to the
    positive margin.
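
For orientation, the idea can be written as a soft-margin optimisation problem. The sketch below uses standard SVM notation (weight vector w, bias b, slack variables ξ_i, cost C, m training examples), none of which appears on the slides; Li and Shawe-Taylor (2003) remains the authoritative statement.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \langle w, w\rangle + C \sum_{i=1}^{m} \xi_i \\
\text{subject to}\quad
  & \langle w, x_i\rangle + b \ge 1 - \xi_i  && \text{if } y_i = +1, \\
  & \langle w, x_i\rangle + b \le -t + \xi_i && \text{if } y_i = -1, \\
  & \xi_i \ge 0 && \text{for } i = 1, \dots, m.
\end{aligned}
```

Setting t = 1 recovers the standard SVM with equal margins, and t < 1 shrinks the negative margin relative to the positive one.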

10
Uneven Margins Perceptron
  • Perceptron: a simple and fast linear learning
    algorithm.
  • PAUM introduces two margin parameters into the
    Perceptron (see Li et al., 2002).
  • PAUM's performance was comparable to the SVM on
    document classification.
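
A minimal sketch of a Perceptron with two margin parameters follows. It is an illustration rather than the authors' implementation: the names `train_paum`, `tau_pos`, `tau_neg`, `eta` and `max_epochs` are assumptions, and the bias update scaled by the squared data radius follows the usual textbook form of the margin Perceptron.

```python
def train_paum(examples, tau_pos, tau_neg, eta=1.0, max_epochs=100):
    """Perceptron with separate margins for positive and negative examples.

    examples: list of (x, y), x a list of floats, y in {+1, -1}.
    An example triggers an update when y * score does not exceed the
    margin for its class (tau_pos for y=+1, tau_neg for y=-1).
    """
    dim = len(examples[0][0])
    w = [0.0] * dim
    b = 0.0
    # Squared radius of the data, used to scale the bias update.
    r2 = max(sum(v * v for v in x) for x, _ in examples)
    for _ in range(max_epochs):
        updates = 0
        for x, y in examples:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            margin = tau_pos if y > 0 else tau_neg
            if y * score <= margin:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y * r2
                updates += 1
        if updates == 0:  # every example clears its margin
            break
    return w, b
```

Choosing `tau_pos` larger than `tau_neg` pushes the hyperplane away from the rare positive examples, mirroring the role of the uneven margins parameter in the SVM. Each update touches one example only, which is why training is on-line and cheap.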

11
Combination
  • System2 and System3 had much in common but were
    also complementary in some respects:
  • quadratic kernel vs linear kernel,
  • batch optimisation vs on-line optimisation.
  • The systems were combined at the level of tags,
    not on the basis of the classifier scores.
  • System1 takes the tags of both System2 and
    System3, adopting the tags from System2 wherever
    the two systems conflict.
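
The tag-level combination could look roughly like the sketch below. The slides do not define what counts as a conflict, so treating any overlap between a System3 tag and a System2 tag as a conflict is an assumption, as are the function name `combine_tags` and the `(start, end, type)` tag format.

```python
def combine_tags(sys2_tags, sys3_tags):
    """Union of two systems' tags, preferring System2 on conflict.

    Each tag is (start, end, type) with inclusive token indices.
    Assumption: a System3 tag conflicts if it overlaps any System2 tag.
    """
    combined = list(sys2_tags)
    for s3, e3, t3 in sys3_tags:
        conflict = any(s3 <= e2 and s2 <= e3 for s2, e2, _ in sys2_tags)
        if not conflict:
            combined.append((s3, e3, t3))
    return sorted(set(combined))
```

Because System1 keeps every non-conflicting tag from either system, it can only add tags relative to System2, which fits with a gain in recall at some cost in precision.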

12
Active learning -- task2b
  • We used the Gram-Schmidt orthogonalisation
    algorithm (GS) to select training documents.
  • GS chooses a subset of examples such that, in
    feature space,
  • they are furthest from each other,
  • and also furthest from a predefined subset, if one is given.
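
A greedy selection in the spirit of Gram-Schmidt is sketched below: at each step it picks the example with the largest residual after projecting out the directions already covered. This is an assumed reading of the slide, working on explicit feature vectors rather than a kernel-induced feature space; the name `gs_select` and its parameters are hypothetical.

```python
def gs_select(vectors, k, predefined=()):
    """Greedily pick k vectors with the largest Gram-Schmidt residuals.

    vectors: candidate feature vectors (lists of numbers).
    predefined: optional vectors whose span is projected out first.
    Returns indices into `vectors` in order of selection.
    """
    n_pre = len(predefined)
    residuals = [[float(c) for c in v] for v in list(predefined) + list(vectors)]

    def deflate(idx):
        """Remove the direction of residuals[idx] from every residual."""
        norm2 = sum(c * c for c in residuals[idx])
        if norm2 <= 1e-12:  # nothing new in this direction
            return False
        q = [c / norm2 ** 0.5 for c in residuals[idx]]
        for i, r in enumerate(residuals):
            proj = sum(a * b for a, b in zip(r, q))
            residuals[i] = [a - proj * b for a, b in zip(r, q)]
        return True

    for idx in range(n_pre):
        deflate(idx)

    chosen = []
    candidates = set(range(n_pre, len(residuals)))
    while len(chosen) < k and candidates:
        best = max(sorted(candidates),
                   key=lambda i: sum(c * c for c in residuals[i]))
        if not deflate(best):
            break  # all remaining candidates lie in the covered span
        chosen.append(best - n_pre)
        candidates.remove(best)
    return chosen
```

With documents represented as bag-of-words vectors, a near-duplicate of an already selected document has a small residual and is unlikely to be picked.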

13
Results -- task1
  • Combination obtained higher recall by sacrificing
    precision.
  • PAUM needed much less computation time and memory
    than SVM.

14
Results -- task2a

15
Results -- task2b
  • No significant improvement over selecting
    documents at random.
  • GS compared documents using a bag-of-words model,
    which is not relevant to the task of recognising
    specific entities in a document.

16
Uneven Margins of SVM
  • F1 for 4-fold CV on task1 with different uneven
    margins settings.
  • The uneven margins models were better than the
    even margins model.

17
Uneven Margins -- small data
  • Comparison of uneven and even margins SVM on
    task2a.
  • The uneven margins parameter is particularly
    useful for small data.

18
NLP features for IE

19
Weighting Scheme

20
Post-processing

21
Conclusions
  • The uneven margins parameter was indeed helpful
    to the SVM for IE, especially for small data.
  • PAUM performed well for IE.
  • The combination system had better results than
    both the SVM and PAUM systems.
  • Future work:
  • Exploit specific entity information for active
    learning.
  • Apply uneven margins SVM and PAUM to other NLP
    learning tasks.

22
Thanks!