Anne Li-E Liu


Slides: 30

Provided by: roch123

Learn more at: https://www.cs.rochester.edu


Transcript and Presenter's Notes


1
Automated Suggestions for Miscollocations

Anne Li-E Liu
David Wible
Nai-lung Tsao

2
Overview

Introduction
Methodology
Experimental Results
Conclusion

3
Introduction

Our study focuses on how to find suggestions for
miscollocations automatically.
In this paper, only verb-noun collocations and
miscollocations are considered.

4
Introduction

Howarth's (1998) investigation of collocations
found in L1 and L2 writers' writing.
Granger's (1998) analysis of adverb-adjective
collocations.
Liu's (2002) lexical semantic analysis of the
verb-noun miscollocations in the English Taiwan
Learner Corpus.

5
Introduction

Projects using learner corpora to analyze and
categorize learner errors:
NICT JLE (Japanese Learner English) Corpus
The Chinese Learner English Corpus (CLEC)
English Taiwan Learner Corpus (or TLC) (Wible et
al., 2003).

6
An example
1. solve
2. pose
3. tackle
4. grapple
5. alleviate
6. overcome
7. exacerbate
8. compound
9. beset
10. resolve

She tries to improve her students' problems.

7
Method

Three features of collocate candidates are used:
1. Word association strength
2. Semantic similarity
3. Intercollocability (Cowie and Howarth, 1996)

8
Resource

84 VN miscollocations in TLC (Liu, 2002).
Training data: 42. Testing data: 42.
Two knowledge resources: BNC and WordNet.
Two human evaluators.

9
Word Association Strength

Mutual Information (Church et al., 1991)
Two purposes:
1. All suggested correct collocations have to be
identified as collocations.
2. The higher the word association strength, the
more likely the candidate is to be a correct
substitute for the wrong collocate.
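
As an illustration of word association strength, the sketch below computes mutual information for a verb-noun pair, assuming the standard pointwise formulation log2(P(v, n) / (P(v) P(n))). The function name and the toy counts are hypothetical; they are not BNC figures and not the authors' code.

```python
import math

def pointwise_mi(pair_counts, verb, noun):
    """Return log2(P(verb, noun) / (P(verb) * P(noun))) estimated from pair counts."""
    total = sum(pair_counts.values())
    verb_total = sum(c for (v, _), c in pair_counts.items() if v == verb)
    noun_total = sum(c for (_, n), c in pair_counts.items() if n == noun)
    joint = pair_counts.get((verb, noun), 0)
    if joint == 0:
        return float("-inf")  # the pair was never observed together
    return math.log2(joint * total / (verb_total * noun_total))

# Hypothetical verb-noun co-occurrence counts (illustrative only).
toy_counts = {
    ("solve", "problem"): 8,
    ("solve", "puzzle"): 2,
    ("improve", "situation"): 6,
    ("improve", "quality"): 4,
}

print(pointwise_mi(toy_counts, "solve", "problem"))
print(pointwise_mi(toy_counts, "improve", "problem"))
```

In this toy data the attested pair scores higher than the unattested pair improve problem, which is the behaviour both purposes above rely on.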

10
Semantic Similarity

A semantic relation holds between a miscollocate
and its correct counterpart (Gitsaki et al.,
2000; Liu, 2002).
The synsets of WordNet are treated as nodes in a
graph.
Measure the graph-theoretic distance between
synsets.

Example: say a story (miscollocation), tell a
story, think of a story.
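
A minimal sketch of a graph-theoretic distance, assuming an unweighted graph searched breadth-first and a similarity of 1 / (1 + distance). The slides do not state which WordNet-based measure was used, so both the scoring formula and the toy graph (whose node names are placeholders, not real WordNet synsets) are assumptions.

```python
from collections import deque

def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the number of edges, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

def similarity(graph, a, b):
    """Map distance to a score in (0, 1]; 0.0 when no path exists (assumed convention)."""
    d = shortest_path_length(graph, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)

# Hypothetical undirected toy graph standing in for a synset graph.
toy_graph = {
    "say": ["express"],
    "tell": ["express", "narrate"],
    "express": ["say", "tell"],
    "narrate": ["tell"],
}

print(shortest_path_length(toy_graph, "say", "tell"))
print(similarity(toy_graph, "say", "narrate"))
```

A score bounded by 0 and 1 is convenient here because it can be split into equal-width levels.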
11
Semantic Similarity
12
Intercollocability

Cowie and Howarth (1996) propose that certain
collocations form clusters on the basis of the
shared meaning.

Example collocations in one cluster: convey point,
get across the message, communicate concern,
convey feeling, express concern.
13
Intercollocability

Collocations in a cluster show a certain degree
of intercollocability.

[Diagram: express one's concern; verbs express,
communicate; nouns concern, feeling, condolences]
14
Intercollocability

She tries to improve her students' problems.

Starting point: improve problem.
problem: 86 verb collocates
improve: 52 noun collocates
[Diagram: verbs improve, resolve, reduce; nouns
problem, situation, matter, way]
15
Intercollocability
[Diagram: noun collocates situation, matter,
problem, way, quality, efficiency, effectiveness;
of these, situation, matter, problem, and way are
linked to resolve and reduce]

The cluster is partially created and the link
between improve, resolve and reduce is developed
by virtue of the overlapping noun collocates.

16
Intercollocability

Quantify intercollocability
The number of shared collocates

17
[Diagram: noun collocates situation, matter,
problem, way, quality, efficiency, effectiveness;
of these, situation, matter, problem, and way are
linked to resolve and reduce]

shared collocates (resolve, improve): 3
shared collocates (reduce, improve): 3
The more shared collocates a verb has with the
wrong verb, the more likely this verb is a good
candidate.
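
Counting shared collocates amounts to a set intersection. The sketch below uses the nouns named on the slide. Treating situation, matter, way, quality, efficiency, and effectiveness as the collocates of improve is one reading of the diagram (the one that reproduces the count of 3), not something the transcript states outright.

```python
def shared_collocates(collocates, verb_a, verb_b):
    """Return the noun collocates that two verbs have in common."""
    return collocates[verb_a] & collocates[verb_b]

# Noun collocates read off the slide; the assignment to "improve" is an assumption.
noun_collocates = {
    "improve": {"situation", "matter", "way", "quality", "efficiency", "effectiveness"},
    "resolve": {"situation", "matter", "problem", "way"},
    "reduce": {"situation", "matter", "problem", "way"},
}

for candidate in ("resolve", "reduce"):
    shared = shared_collocates(noun_collocates, candidate, "improve")
    print(candidate, len(shared), sorted(shared))
```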

18
Integrate the 3 features

The probabilistic model

19
Training

Probability distribution of word association
strength
MI values are binned into 5 levels
(<1.5, 1.5-3.0, 3.0-4.5, 4.5-6, >6)
P(MI level)
P(MI level | Sc)
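
The binning step can be sketched as follows. The level boundaries come from the slide; sending a value that falls exactly on 3.0 or 4.5 to the upper level is an assumption, and the sample MI values are hypothetical.

```python
from collections import Counter

def mi_level(mi):
    """Map an MI value to one of five levels (0-4) using the slide's boundaries."""
    if mi < 1.5:
        return 0
    if mi < 3.0:
        return 1
    if mi < 4.5:
        return 2
    if mi <= 6.0:
        return 3
    return 4

def level_distribution(values):
    """Estimate P(MI level) as relative frequencies over the given values."""
    counts = Counter(mi_level(v) for v in values)
    return [counts[level] / len(values) for level in range(5)]

# Hypothetical MI values for a handful of candidate collocates.
sample_mi = [0.5, 2.0, 2.5, 5.0]
print(level_distribution(sample_mi))
```

Running the same estimate over only those training candidates that were judged correct would give the conditional distribution P(MI level | Sc).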

20
Training

Probability distribution of semantic similarity
Similarity scores are binned into 5 levels
(0.0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, and 0.8-1.0)
P(SS level)
P(SS level | Sc)

21
Training

Probability distribution of intercollocability
The normalized number of shared collocates is
binned into 5 levels
(0.0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, and 0.8-1.0)
P(SC level)
P(SC level | Sc)
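
One way the three level distributions could be combined is a naive Bayes style score, P(Sc) multiplied by the ratio P(level | Sc) / P(level) for each feature. The transcript does not preserve the model's formula, so the independence assumption, the function names, and every probability in the tables below are hypothetical.

```python
def posterior_score(levels, prior, likelihoods, marginals):
    """Score a candidate as P(Sc) * product of P(level | Sc) / P(level) over its features."""
    score = prior
    for feature, level in levels.items():
        score *= likelihoods[feature][level] / marginals[feature][level]
    return score

def rank_candidates(candidates, prior, likelihoods, marginals, k=5):
    """Return the k candidate verbs with the highest scores, best first."""
    scored = {
        verb: posterior_score(levels, prior, likelihoods, marginals)
        for verb, levels in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]

# Hypothetical trained tables: one probability per level (0-4) for each feature.
likelihoods = {  # P(level | Sc)
    "MI": [0.05, 0.10, 0.20, 0.30, 0.35],
    "SS": [0.05, 0.10, 0.15, 0.30, 0.40],
    "SC": [0.10, 0.15, 0.20, 0.25, 0.30],
}
marginals = {  # P(level)
    "MI": [0.30, 0.25, 0.20, 0.15, 0.10],
    "SS": [0.40, 0.25, 0.15, 0.12, 0.08],
    "SC": [0.35, 0.25, 0.20, 0.12, 0.08],
}
prior = 0.1  # hypothetical P(Sc)

# Hypothetical feature levels for three candidate verbs.
candidates = {
    "solve": {"MI": 3, "SS": 4, "SC": 4},
    "beset": {"MI": 1, "SS": 0, "SC": 0},
    "tackle": {"MI": 2, "SS": 3, "SC": 3},
}

print(rank_candidates(candidates, prior, likelihoods, marginals, k=3))
```

Because the features are treated as independent, the score is useful for ranking candidates but need not be a calibrated probability. Dropping a feature from the `levels` dictionaries gives a single-feature or two-feature variant.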

22
Experiments

Different combinations of the three features.

Model  Feature(s) considered
M1     MI (Mutual Information)
M2     SS (Semantic Similarity)
M3     SC (Shared Collocates)
M4     MI + SS
M5     MI + SC
M6     SS + SC
M7     MI + SS + SC
23
Results
K-Best  M1 (MI)  M2 (SS)  M3 (SC)  M4 (MI+SS)  M5 (MI+SC)  M6 (SS+SC)  M7 (MI+SS+SC)
1       16.67    40.48    22.62    48.81       29.76       55.95       53.75
2       36.90    53.45    38.10    60.71       44.05       63.10       67.86
3       47.62    64.29    50.00    71.43       59.52       77.38       78.57
4       52.38    67.86    63.10    77.38       72.62       80.95       82.14
5       64.29    75.00    72.62    83.33       78.57       83.33       85.71
6       65.48    77.38    75.00    85.71       83.33       84.52       88.10
7       67.86    77.38    77.38    86.90       86.90       86.90       89.29
8       70.24    80.95    82.14    86.90       89.29       88.10       91.67
9       72.62    83.33    85.71    88.10       92.86       90.48       92.86
10      76.19    86.90    88.10    88.10       94.05       90.48       94.05
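
The slides do not define how the K-Best percentages were computed. One plausible reading is the share of test miscollocations for which at least one accepted correction appears among the top K suggestions. The sketch below implements that reading; the rankings and accepted sets are hypothetical.

```python
def k_best_hit_rate(ranked_suggestions, accepted, k):
    """Percentage of items with at least one accepted verb among the top-k suggestions."""
    hits = sum(
        1
        for item, ranking in ranked_suggestions.items()
        if any(verb in accepted[item] for verb in ranking[:k])
    )
    return 100.0 * hits / len(ranked_suggestions)

# Hypothetical model output: ranked verb suggestions per miscollocation.
ranked_suggestions = {
    "pay time": ["spend", "waste", "devote"],
    "reach purpose": ["achieve", "account", "trade"],
    "get knowledge": ["share", "acquire", "obtain"],
}
# Hypothetical corrections accepted by human evaluators.
accepted = {
    "pay time": {"spend", "devote"},
    "reach purpose": {"achieve", "fulfill"},
    "get knowledge": {"acquire", "gain", "obtain"},
}

for k in (1, 2, 3):
    print(k, round(k_best_hit_rate(ranked_suggestions, accepted, k), 2))
```

Under this definition the rate can only stay level or rise as K grows, which matches the shape of every column in the table.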
24
Results (cont.)
The K-Best suggestions for get knowledge.
K-Best  M2        M6        M7
1       aim       obtain    acquire
2       generate  share     share
3       draw      develop   obtain
4       obtain    generate  develop
5       develop   acquire   gain
25
The K-Best suggestions for reach purpose.
K-Best  M2        M6        M7
1       achieve   achieve   achieve
2       teach     account   account
3       explain   trade     trade
4       account   treat     fulfill
5       trade     allocate  serve
26
The K-Best suggestions for pay time.
K-Best  M2        M6        M7
1       devote    spend     spend
2       spend     invest    waste
3       expend    devote    devote
4       spare     date      invest
5       invest    waste     date
27
Conclusion

A probabilistic model to integrate features.
The early experimental result shows the potential
of this research.

28
Future work

Applying such mechanisms to other types of
miscollocations.
Miscollocation detection will be one of the main
points of this research.
A larger number of miscollocations should be
included in order to verify our approach.

Thank you!
Q & A
Anne Li-E Liu lel29_at_cam.ac.uk
David Wible wible45_at_yahoo.com
Nai-Lung Tsao beaktsao_at_gmail.com
