Japanese Dependency Structure Analysis Based on Maximum Entropy Models
1
Japanese Dependency Structure Analysis Based on
Maximum Entropy Models
  • Kiyotaka Uchimoto, Satoshi Sekine
  • Hitoshi Isahara
  • Kansai Advanced Research Center, Communications
    Research Laboratory
  • New York University

2
Outline
  • Background
  • Probability model for estimating dependency
    likelihood
  • Experiments and discussion
  • Conclusion

3
Background
  • Japanese dependency structure analysis

太郎は赤いバラを買いました。 (Taro bought a red rose.)
  • Preparing a dependency matrix
  • Finding an optimal set of dependencies for the
    entire sentence

4
Background (2)
  • Approaches to preparing a dependency matrix
  • Rule-based approach
  • Several problems with handcrafted rules
  • Coverage and consistency
  • The rules have to be changed according to the
    target domain.
  • Corpus-based approach

5
Background (3)
  • Corpus-based approach
  • Learning the likelihoods of dependencies from a
    tagged corpus (Collins, 1996; Fujio and
    Matsumoto, 1998; Haruno et al., 1998)
  • Probability estimation based on the maximum
    entropy models (Ratnaparkhi, 1997)
  • Maximum Entropy model
  • learns the weights of given features from a
    training corpus

6
Probability model
(figure: two bunsetsus with a candidate dependency between them, tagged as dependent or not)
  • Assigning one of two tags
  • Whether or not there is a dependency between two
    bunsetsus
  • Probabilities of dependencies are estimated from
    the M. E. model.
  • Overall dependencies in a sentence
  • Product of probabilities of all dependencies
  • Assumption: dependencies are independent of each
    other.
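Under the independence assumption above, the score of a candidate structure for the whole sentence is just the product of the pairwise dependency probabilities. A minimal sketch; the function name and the example probabilities are illustrative, not from the paper:

```python
from math import prod

def sentence_likelihood(dependency_probs):
    """Score a candidate dependency structure as the product of the
    M. E. probabilities of its individual dependencies, assuming the
    dependencies are independent of each other."""
    return prod(dependency_probs)

# e.g. a three-dependency structure whose pairwise probabilities
# were estimated by the M. E. model
score = sentence_likelihood([0.9, 0.8, 0.95])
print(round(score, 3))  # 0.684
```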

7
M. E. model
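The M. E. model used here is the standard exponential form (Ratnaparkhi, 1997), sketched below as a reconstruction rather than a copy of the slide; a is the dependency tag (dependent or not), b the context built from the features of a bunsetsu pair, the f_i are binary feature functions, and the λ_i are the weights learned from the training corpus:

```latex
p(a \mid b) \;=\; \frac{1}{Z(b)} \exp\Big(\sum_i \lambda_i f_i(a, b)\Big),
\qquad
Z(b) \;=\; \sum_{a'} \exp\Big(\sum_i \lambda_i f_i(a', b)\Big)
```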
8
Feature sets
  • Basic features (expanded from Haruno's list
    (Haruno, 1998))
  • Attributes on a bunsetsu itself
  • Character strings, parts of speech, and
    inflection types of bunsetsu
  • Attributes between bunsetsus
  • Existence of punctuation, and the distance
    between bunsetsus
  • Combined features

9
Feature sets
(figure: a candidate dependency between an anterior and a posterior bunsetsu in the example sentence 太郎は (Taro_wa, Taro) 赤い (Aka_i, red) バラを (bara_wo, rose) 買いました (kai_mashita, bought); features a-d label the head and type of the two bunsetsus, and e the relation between them)
  • Basic features: a, b, c, d, e
  • Combined features
  • Twin (b,c), Triplet (b,c,e), Quadruplet
    (a,b,c,d), Quintuplet (a,b,c,d,e)
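The combined features could be generated from the basic ones roughly as follows; the tuple groupings follow the slide, while the function and the example feature values are illustrative assumptions:

```python
def combined_features(a, b, c, d, e):
    """Build the combined features from the five basic feature values
    of a bunsetsu pair: twin (b,c), triplet (b,c,e),
    quadruplet (a,b,c,d), quintuplet (a,b,c,d,e)."""
    return {
        "twin": (b, c),
        "triplet": (b, c, e),
        "quadruplet": (a, b, c, d),
        "quintuplet": (a, b, c, d, e),
    }

# hypothetical feature values for a bunsetsu pair
feats = combined_features("wa", "noun", "verb", "comma", "adjacent")
print(feats["triplet"])  # ('noun', 'verb', 'adjacent')
```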

10
Algorithm
  • Detect the dependencies in a sentence by
    analyzing it backwards (from right to left).
  • Characteristics of Japanese dependencies
  • Dependencies are directed from left to right
  • Dependencies do not cross
  • A bunsetsu, except for the rightmost one, depends
    on only one bunsetsu
  • In many cases, the left context is not necessary
    to determine a dependency
  • Beam search
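Putting the constraints together, the backward beam search can be sketched as below. Only the constraints (left-to-right, non-crossing, one head per bunsetsu) come from the slide; the probability function, beam width, and data layout are assumptions for illustration. Non-crossing is enforced by letting bunsetsu i choose its head only along the chain of heads starting at i+1:

```python
def beam_search(n, dep_prob, beam_width=5):
    """Backward beam search over dependency structures of n bunsetsus.

    dep_prob(i, j, heads) should return the estimated probability that
    bunsetsu i depends on bunsetsu j (this is where the M. E. model
    would plug in); heads maps each already-analyzed bunsetsu to its
    head. Returns the best (score, heads) pair found.
    """
    beam = [(1.0, {})]                  # one empty hypothesis
    for i in range(n - 2, -1, -1):      # right to left; the rightmost
        candidates = []                 # bunsetsu has no head
        for score, heads in beam:
            # candidate heads for i: follow the chain of heads starting
            # at i+1, which guarantees no crossing dependencies
            j = i + 1
            while True:
                p = dep_prob(i, j, heads)
                candidates.append((score * p, {**heads, i: j}))
                if j not in heads:      # j is the rightmost bunsetsu
                    break
                j = heads[j]
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = candidates[:beam_width]  # prune to the beam width
    return beam[0]
```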

11
Experiments
  • Using the Kyoto University text corpus (Kurohashi
    and Nagao, 1997)
  • a tagged corpus of the Mainichi newspaper
  • Training: 7,958 sentences (Jan. 1st to 8th)
  • Testing: 1,246 sentences (Jan. 9th)
  • The input sentences were morphologically analyzed
    and their bunsetsus were identified correctly.

12
Results of dependency analysis
  • When analyzing a sentence backwards, the previous
    context has almost no effect on the accuracy.

13
Relationship between the number of bunsetsus and
accuracy
  • The accuracy does not significantly degrade with
    increasing sentence length.

14
Features and accuracy
  • Experiments without the feature sets
  • Useful basic features
  • Type of the anterior bunsetsu (-17.41) and the
    part-of-speech tag of the head word on the
    posterior bunsetsu (-10.99)
  • Distance between bunsetsus (-2.50), the
    existence of punctuation in the bunsetsu
    (-2.52), and the existence of brackets (-1.06)
  • preferential rules with respect to the features

15
Features and accuracy
  • Experiments without the feature sets
  • Combined features are useful (-18.31).
  • Basic features are related to each other.

16
Lexical features and accuracy
  • Experiment with the lexical features of the head
    word
  • Better accuracy than that without them (-0.84)
  • Many idiomatic expressions
  • They had high dependency probabilities.
  • 応じて (oujite, according to) --- 決める (kimeru, decide)
  • 形で (katachi_de, in the form of) --- 行われる (okonawareru, be held)
  • More training data
  • Expect to collect more of such expressions

17
Amount of training data and accuracy
  • Accuracy of 81.84% even with 250 training
    sentences
  • M. E. framework has suitable characteristics for
    overcoming the data sparseness problem.

18
Comparison with related works
19
Comparison with related works (2)
  • Combining a parser based on a handmade CFG and a
    probabilistic dependency model (Shirai, 1998)
  • Using several corpora: the EDR corpus, RWC
    corpus, and Kyoto University corpus
  • Accuracy achieved by our model was about 3%
    higher than that of Shirai's model.
  • Using a much smaller set of training data.

20
Comparison with related works (3)
  • M. E. model (Ehara, 1998)
  • A set of features similar to ours
  • Only two-feature combinations are used
  • Using TV news articles for training and testing
  • Average sentence length: 17.8 bunsetsus
  • cf. 10 bunsetsus in the Kyoto University corpus
  • Difference in the combined features
  • We also use triplet, quadruplet, and quintuplet
    features (5.86% improvement)
  • Accuracy of our system was about 10% higher than
    Ehara's system.

21
Comparison with related works (4)
  • Maximum Likelihood model (Fujio, 1998)
  • Decision tree models and a boosting method
    (Haruno, 1998)
  • A set of features similar to ours
  • Using the EDR corpus for training and testing
  • EDR corpus is ten times as large as our corpus.
  • Accuracy was around 85%, which is slightly worse
    than ours.

22
Comparison with related works (5)
  • Experiments with Fujio's and Haruno's feature
    sets
  • The important factor in the statistical
    approaches is feature selection.

23
Future work
  • Feature selection
  • Automatic feature selection (Berger, 1996, 1998;
    Shirai, 1998)
  • Considering new features
  • How to deal with coordinate structures
  • Taking into account a wide range of information

24
Conclusion
  • Japanese dependency structure analysis based on
    the M. E. model.
  • Dependency accuracy of our system
  • 87.2% using the Kyoto University corpus
  • Experiments without feature sets
  • Some basic and combined features strongly
    contribute to improving the accuracy.
  • Amount of training data and accuracy
  • Good accuracy even with a small set of training
    data
  • M. E. framework has suitable characteristics for
    overcoming the data sparseness problem.