Unsupervised Constraint Driven Learning for Transliteration Discovery

Author: Ming-Wei Chang. Last modified by: Ming-Wei Chang.

Slides: 35

Provided by: Ming119

Learn more at: https://cogcomp.seas.upenn.edu

Transcript and Presenter's Notes

1
Unsupervised Constraint Driven Learning for
Transliteration Discovery

M. Chang, D. Goldwasser, D. Roth, and Y. Tu

2
What I am going to do today

Goal 1: Present the transliteration work
Get feedback!
Goal 2: Think about this work in terms of CCM
Tutorial . ?
I will try to present this work in a slightly different way
Some of this is my personal commentary
It differs from our discussion yesterday
Please give us comments on this
Make this work more general (not only transliteration)

3
Wait a sec! What is CCM?

I have gotten this question 100 times already!
Informal answer:
Everything that uses constraints is a CCM!
Formal answer:
No constraints
CCM
We do not define the training method
Definition: a CCM (Constrained Conditional Model) makes predictions with constraints!
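
The equations that went with "No constraints" and "CCM" on this slide did not survive extraction. As a reference point only, a CCM is commonly written in the literature in the form below. The symbols w, \phi, \rho_k and d_{C_k} are the conventional ones and are assumptions with respect to this slide, not a recovery of what was shown.

```latex
% No constraints: an ordinary linear model
y^{*} = \arg\max_{y} \; w^{\top} \phi(x, y)

% CCM: the same objective, with a penalty \rho_k for violating constraint C_k
y^{*} = \arg\max_{y} \; w^{\top} \phi(x, y) \;-\; \sum_{k} \rho_k \, d_{C_k}(x, y)
```

With hard constraints, \rho_k is taken to be infinite, so any violating y is ruled out at prediction time regardless of how the weights w were trained.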

4
Constraints Driven Learning

Why constraints?
The goal: building a good system easily
We have prior knowledge at hand
Why not inject knowledge directly?
How useful are constraints?
Useful for supervised learning (Yih and Roth 04, and many others)
Useful for semi-supervised learning (Chang et al., ACL 2007)
Sometimes more efficient than labeling data directly

5
Unsupervised Constraint Driven Learning

In this work
We do not use any labeled instances
We achieve good performance, competitive with several supervised models
Compared to Chang et al., ACL 2007
ACL 07 used a small amount of labeled data (5-20)
Reason: bad models cannot benefit from constraints!
For some applications, we have very good resources
We do not need labeled instances at all!

6
In a nutshell

Traditional semi-supervised learning.
Model can drift from the correct one.

Unsupervised Learning
[Diagram labels: Resource, Model, Unlabeled Data]
Prediction: label unlabeled data
Feedback: learn from labeled data
7
In a nutshell
CODL improves a simple model using expressive constraints.
CODL uses constraints to generate better training samples in unsupervised learning.
[Diagram labels: Model, Better Model, Prediction + Constraints, Feedback, Unlabeled Data, More accurate labeling]
8
Outline

Constraint Driven Learning (CoDL)
Transliteration Discovery
Algorithm
Experimental Results

9
Transliteration Generation (Not our focus)

Given a source word, what is the target transliteration?
Bush → [target-language text lost in extraction]
Sushi → [target-language text lost in extraction]
Issues
Ambiguity
The same source word can have many different transliterations
Think about Chinese
What we want: find the most widely used transliteration

10
Transliteration Discovery (Our focus)

Problem setting
Given two lists of words, map them!
Advantages
A relatively easy problem
Can find the most widely used transliteration
Assumptions
The source language is English
Each source entity has a transliteration among the target candidates
Target candidates might not be named entities

11
Outline

Constraint Driven Learning (CoDL)
Transliteration Discovery
Algorithm
Experimental Results

12
Algorithm Outline

Prediction Model
How to use an existing resource to construct the model?
Constraints?
Learning Algorithm

13
The Prediction Model

How do we make predictions?
Given a source word, how do we predict the best target?
Model 1: (Vs, Vt) → Yes or No
Issue: not many obvious constraints can be added
Not a structured prediction problem
Model 2: (Vs, Vt) → hidden variables F → Yes or No
Predicting F is a structured prediction problem
We can add constraints more easily

14
The Prediction Model

Score for a pair
A CCM formulation
A slightly different scoring function

More on this point in the next few slides
15
Prediction Model: Another View

The scoring function looks like weights times features!
If there is a bad feature, score → −∞
Our hidden variables (feature vectors)
Character mappings

Everything
(a,a), (o,O), (w,_),
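
The "weights times features" view can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `score_pair`, the `allowed` set standing in for the constraints, and the default weight of 0.0 for unseen mappings are all hypothetical choices made for the example.

```python
import math

def score_pair(mappings, weights, allowed):
    """Score one candidate set of character mappings (the hidden variables).

    mappings: iterable of (source_char, target_char) pairs that are switched on
    weights:  dict from character pair to weight
    allowed:  set of character pairs the constraints permit (assumed encoding)
    """
    total = 0.0
    for pair in mappings:
        if pair not in allowed:
            # A bad feature sends the score to minus infinity.
            return -math.inf
        # Otherwise the score is a plain weighted sum of the active features.
        total += weights.get(pair, 0.0)
    return total
```

Returning early on a bad feature mirrors the slide's point that a single disallowed mapping rules the whole candidate out, no matter how good the remaining mappings are.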

17
Algorithm Outline

Prediction Model
How to use an existing resource to construct the model?
Constraints?
Learning Algorithm

18
Resource: Romanization Table

Hebrew, Russian
How can you type Hebrew or Russian?
Use an English keyboard: C maps to a similar character, C or S, in Hebrew or Russian
Very easy to get
Ambiguous
Special case: Chinese (Pin-Yin)
?? ? shòu si (Low ambiguity)
Map Pin-Yin to English (sushi)
Romanization Table? a ?a

19
Initialize the Table

Every character pair in the Romanization Table: weight 0
Everything else: weight -1
There could be better ways to initialize
Note: every (v_s, v_t) gets a score of zero without constraints
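
The initialization rule on this slide is small enough to write down directly. The sketch below is only an illustration: `make_weight_table` is a hypothetical helper name, and the romanization table is assumed to be given as a collection of (source_char, target_char) pairs.

```python
def make_weight_table(romanization_pairs, default=-1.0):
    """Return a lookup function for the initial character-pair weights."""
    table = {pair: 0.0 for pair in romanization_pairs}

    def weight(pair):
        # Pairs listed in the romanization table start at 0; all others at -1.
        return table.get(pair, default)

    return weight
```

Because no pair starts with a positive weight, an unconstrained scorer can always reach the maximum of zero, which is why the slide notes that the table alone does not discriminate between candidates.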

20
Algorithm Outline

Prediction Model
How to use an existing resource to construct the model?
Constraints?
Learning Algorithm

21
Constraints

General constraints
Coverage: every character needs to be mapped at least once
No crossing: character mappings cannot cross each other
Language-specific constraints
General restricted mapping
Initial restricted mapping
Length restriction
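
The two general constraints can be checked mechanically once a character mapping is written as index pairs. The sketch below assumes a mapping is a list of `(i, j)` links between source position `i` and target position `j`. That encoding and both function names are assumptions for illustration, and mappings to an empty character are not modeled here.

```python
def satisfies_coverage(links, src_len, tgt_len):
    """Coverage: every source and target position appears in at least one link."""
    src_covered = {i for i, _ in links}
    tgt_covered = {j for _, j in links}
    return src_covered == set(range(src_len)) and tgt_covered == set(range(tgt_len))

def satisfies_no_crossing(links):
    """No crossing: there are no links (i1, j1), (i2, j2) with i1 < i2 and j1 > j2."""
    ordered = sorted(links)
    # After sorting by source position, target positions must never decrease.
    return all(a[1] <= b[1] for a, b in zip(ordered, ordered[1:]))
```

For example, the links `[(0, 0), (1, 1), (2, 1)]` between a three-character source and a two-character target pass both checks, while `[(0, 1), (1, 0)]` is rejected as a crossing.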

22
Constraints
Many other works use similar information as well!

Pin-Yin to English

23
Algorithm Outline

Prediction Model
How to use an existing resource to construct the model?
Constraints?
Learning Algorithm

24
High-Level Overview

Model ← Resource
Until convergence:
Use Model + Constraints to get labels (for both F and y)
Update the Model with the newly labeled F and y (without constraints) (details in the next slide)
Similar to ACL 07
Update the model without constraints
Difference from ACL 07
We get feedback from the labels of both the hidden variables and the output
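
The loop described on this slide can be sketched as a small driver function. Everything passed in is a placeholder: `construct`, `constrained_inference` and `learn` are hypothetical callables standing in for the resource-based initialization, the constrained prediction of (F, y), and the unconstrained model update. The convergence test (stop when the model no longer changes) and the `max_iters` cap are also assumptions.

```python
def unsupervised_codl(resource, unlabeled, construct, constrained_inference,
                      learn, max_iters=10):
    """Build a model from a resource, then alternate labeling and learning."""
    model = construct(resource)              # Model <- Resource
    for _ in range(max_iters):
        labeled = []
        for x in unlabeled:
            # Model + constraints predict both the hidden variables F and the label y.
            F, y = constrained_inference(x, model)
            labeled.append((x, F, y))
        new_model = learn(labeled)           # update without constraints
        if new_model == model:               # assumed convergence test
            break
        model = new_model
    return model
```

Keeping the constraints inside `constrained_inference` only, and out of `learn`, reflects the slide's point that constraints shape the labels while the model update itself stays unconstrained.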

25
Training
Predict hidden variables and the labels
Update Algorithm
26
Outline

Constraint Driven Learning (CoDL)
Transliteration Discovery
Algorithm
Experimental Results

27
Experimental Setting

Evaluation
ACC: the top candidate is (one of) the right answers
Learning algorithm
Linear SVM with C = 0.5
Datasets
English-Hebrew: 300 x 300
English-Chinese: 581 x 681
English-Russian: 727 x 50648 (target includes all words)
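
The ACC metric above is top-1 accuracy with possibly several correct answers per source word. A minimal sketch, assuming scores are stored as nested dictionaries and gold answers as sets (both data layouts and the function name are assumptions):

```python
def top1_accuracy(scores, gold):
    """Fraction of source words whose best-scoring candidate is a right answer.

    scores: {source_word: {candidate: score}}
    gold:   {source_word: set of acceptable target words}
    """
    correct = 0
    for source, candidates in scores.items():
        best = max(candidates, key=candidates.get)   # top candidate
        if best in gold[source]:                     # "one of" the right answers
            correct += 1
    return correct / len(scores)
```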

28
Results - Hebrew
29
Results - Russian
30
Analysis
A small Russian subset was used here

1) Without constraints (on features), the Romanization Table is useless!
2) General constraints are more important!
3) Learning has a great impact here! But constraints are very important, too!
4) Better constraints lead to better final results
31
Related Works (Need more work here)

Learning the score for Edit Distance
Previous transliteration works
Machine translation?

32
Conclusion

ML: unsupervised constraint driven algorithm
Use hidden variables to find more constraints (e.g. co-ref)
Use constraints to find a cleaner feature representation
Transliteration
Use the Romanization Table as the starting point
We can get good results without training data
The right constraints (modeling) are the key
Future work
Transliteration model: better model, quicker inference
CoDL: other applications for unsupervised CoDL

33
Constraint - Driven Learning (CODL)
θ = learn(Tr)    (any supervised learning algorithm parametrized by θ)
For N iterations do
    T = ∅
    For each x in unlabeled dataset
        y = Inference(x, C, θ)    (any inference algorithm, with constraints)
        T = T ∪ {(x, y)}    (augmenting the training set: feedback)
    θ = γθ + (1 − γ) learn(T)    (learn from new training data; weight supervised and unsupervised models, Nigam 2000)
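
The weighting step at the end of each iteration can be sketched as a convex combination of two parameter vectors. The sketch below stores parameters as dictionaries and interpolates with the initial supervised model on every iteration. The slide does not show whether the θ on the right-hand side is the initial or the current model, so that choice, along with the names `interpolate` and `codl`, is an assumption.

```python
def interpolate(theta_sup, theta_new, gamma):
    """Return gamma * theta_sup + (1 - gamma) * theta_new, key by key."""
    keys = set(theta_sup) | set(theta_new)
    return {k: gamma * theta_sup.get(k, 0.0) + (1 - gamma) * theta_new.get(k, 0.0)
            for k in keys}

def codl(theta0, unlabeled, constraints, inference, learn, gamma, n_iters):
    """Constraint-driven learning loop with a weighted parameter update."""
    theta = dict(theta0)
    for _ in range(n_iters):
        # Label every unlabeled example with constrained inference (feedback set T).
        T = [(x, inference(x, constraints, theta)) for x in unlabeled]
        # Weight the initial model against the model learned from T.
        theta = interpolate(theta0, learn(T), gamma)
    return theta
```

With `gamma = 0` the update reduces to `learn(T)` alone, so the model is driven entirely by the constraint-labeled data.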
34
Unsupervised Constraint - Driven Learning
θ = Construct(Resource)    (construct the model with resources)
For N iterations do
    T = ∅
    For each x in unlabeled dataset
        y = Inference(x, C, θ)    (any inference algorithm, with constraints)
        T = T ∪ {(x, y)}    (augmenting the training set: feedback)
    θ = γθ + (1 − γ) learn(T)    (learn from new training data; γ = 0 in this work)
