Experimental Results

After adding knowledge, POS tagging error reduces by 42%; SRL error reduces by 25% on be verbs and by 9% on all verbs.

Constrained Conditional Model
- Incorporate prior knowledge as constraints c = {C_j(.)}.
- Learn the weight vector w ignoring c.
- Impose constraints c at inference time.

Domain Adaptation
Problem: Performance of statistical systems drops significantly when tested on a domain different from the training domain. Example: in the CoNLL 2007 shared task, the annotation standard differed across the source and target domains.
Motivation: Prior knowledge is cheap and readily available for many domains.
Solution: Use prior knowledge about the target domain for better adaptation.
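The Constrained Conditional Model recipe above (learn w without constraints, then impose them at inference) can be sketched as follows. The toy indicator features, weights, and the single period constraint are illustrative assumptions, not the poster's actual model.

```python
from itertools import product

TAGS = ["PRP", "VB", "NNS", "."]

def score(w, words, tags):
    # Linear score w . phi(x, y) with simple word-tag indicator features.
    return sum(w.get((word, tag), 0.0) for word, tag in zip(words, tags))

def constrained_argmax(w, words, constraints):
    # Enumerate candidate tag sequences, keep only those satisfying every
    # constraint C_j, and return the highest-scoring survivor.  The weights
    # w were learned WITHOUT the constraints.
    best, best_score = None, float("-inf")
    for tags in product(TAGS, repeat=len(words)):
        if not all(c(words, tags) for c in constraints):
            continue
        s = score(w, words, tags)
        if s > best_score:
            best, best_score = list(tags), s
    return best

def period_constraint(words, tags):
    # Example constraint: the token "." must receive the tag ".".
    return all(t == "." for wd, t in zip(words, tags) if wd == ".")

# Hypothetical learned weights, including a bad one the constraint overrides.
w = {("I", "PRP"): 1.0, ("eat", "VB"): 1.0, ("fruits", "NNS"): 1.0,
     (".", "VB"): 0.5}
print(constrained_argmax(w, ["I", "eat", "fruits", "."], [period_constraint]))
# -> ['PRP', 'VB', 'NNS', '.']
```

Without the constraint, the bad weight would tag "." as VB; imposing the constraint at inference corrects the output without retraining.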
System         POS   SRL (All Verbs)   SRL (Be Verbs)
Baseline       86.2  58.1              15.5
Self-training  86.2  58.3              13.7
PDA-KW         91.8  62.1              34.5
PDA-ST         92.0  62.4              36.4
- For POS tagging, we do not have any domain-independent knowledge.
- For SRL, we use some domain-independent knowledge. Example: two arguments cannot overlap.
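The non-overlap rule can be checked mechanically. A minimal sketch, assuming argument spans are (start, end) token offsets with end exclusive (the span format is an assumption, not the poster's representation):

```python
# Domain-independent SRL constraint: no two argument spans may overlap.

def overlaps(a, b):
    # Two half-open spans (start, end) overlap iff each starts before
    # the other ends.
    return a[0] < b[1] and b[0] < a[1]

def satisfies_no_overlap(spans):
    # A candidate argument set is legal iff all spans are pairwise disjoint.
    return not any(overlaps(spans[i], spans[j])
                   for i in range(len(spans))
                   for j in range(i + 1, len(spans)))

print(satisfies_no_overlap([(0, 1), (2, 3)]))  # disjoint A0 and A1 -> True
print(satisfies_no_overlap([(0, 2), (1, 3)]))  # overlapping spans  -> False
```

At inference time, candidate argument sets failing this check are simply discarded.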
POS Tagging

I     eat   fruits   .
PRP   VB    NNS      .

When a POS tagger trained on the WSJ domain is tested on the Bio domain, F1 drops by 9%.

PDA-KW
- Incorporate target-domain-specific knowledge c' = {C_k(.)} as constraints.
- Impose constraints c and c' at inference time.
- Adaptation without retraining.

Semantic Role Labeling (SRL)

I     eat   fruits   .
A0    V     A1

When an SRL system trained on the WSJ domain is tested on Ontonotes, F1 drops by 18%.

Comparison with JiangZh07

System       POS   Amount of Target Labeled Data
PDA-ST       92.0  0
JiangZh07-1  87.2  300
JiangZh07-2  94.2  2730

- Without using any labeled data, prior knowledge reduces error by 38% over using 300 labeled sentences.
- Without using any labeled data, prior knowledge recovers 72% of the accuracy gain of adding 2730 labeled sentences.

Prior Knowledge on Ontonotes
Be verbs are unseen in the training domain.
- If the be verb is immediately followed by a verb, there can be no core argument. Example: John is eating.
- If the be verb is followed by the word "like", core arguments A0 and A1 are possible. Example: And he's like why's the door open?
- Otherwise, A1 and A2 are possible. Example: John is a good man.
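The three be-verb rules above amount to a lookup from the token following the be verb to the set of admissible core arguments. A sketch (the function name and the word/POS input format are assumptions):

```python
# Ontonotes prior knowledge for "be" verbs: which core arguments are
# admissible, keyed on the token immediately after the be verb.

def allowed_core_args(next_word, next_pos):
    if next_pos.startswith("VB"):       # "John is eating." -> no core args
        return set()
    if next_word.lower() == "like":     # "And he's like why's the door open?"
        return {"A0", "A1"}
    return {"A1", "A2"}                 # "John is a good man."

print(allowed_core_args("eating", "VBG"))  # -> set()
print(allowed_core_args("like", "IN"))     # -> {'A0', 'A1'}
print(allowed_core_args("a", "DT"))        # -> {'A1', 'A2'}
```

At inference, any SRL candidate assigning a core argument outside this set to a be verb is ruled out.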
PDA-ST
- Motivation: Constraints are accurate but apply rarely. Can we generalize to cases where the constraints do not apply?
- Solution: Embed constraints into self-training.

(Figure: frame file of the be verb.)

Notation: Ds = source domain labeled data, Du = target domain unlabeled data, Dt = target domain test data.
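The idea of embedding constraints into self-training can be sketched schematically: train on Ds, label Du under the constraints, add the constrained predictions to the training pool, and retrain. The trainer, predictor, and constraint interfaces below are assumptions, not the poster's actual system.

```python
# Schematic PDA-ST loop.  A "model" here is a toy word -> tag dictionary.

def self_train(train, predict_constrained, Ds, Du, rounds=1):
    pool = list(Ds)
    model = train(pool)
    for _ in range(rounds):
        # Constrained predictions on Du become pseudo-labeled examples,
        # so the constraints shape what the retrained model learns.
        pseudo = [(x, predict_constrained(model, x)) for x in Du]
        model = train(pool + pseudo)
    return model

def train(data):
    model = {}
    for words, tags in data:
        model.update(zip(words, tags))
    return model

def predict_constrained(model, words):
    # Illustrative constraint: the token "." must receive the tag ".";
    # unseen words default to NN.
    return ["." if w == "." else model.get(w, "NN") for w in words]

Ds = [(["I", "eat"], ["PRP", "VB"])]
Du = [["I", "eat", "fruits", "."]]
model = self_train(train, predict_constrained, Ds, Du)
print(model.get("fruits"), model.get("."))  # -> NN .
```

The constrained labels on Du generalize the knowledge to words the constraints themselves never mention, which is exactly the motivation stated above.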
Prior Knowledge on BioMed
- Hyphenated compounds are tagged as NN. Example: H-ras
- Digit-letter combinations should be tagged as NN. Example: CTNNB1
- A hyphen should be tagged as HYPH.
- Any word unseen in the source domain that is followed by the word "gene" should be tagged as NN. Example: ras gene
- If a word never appears with the tag NNP in the training data, predict NN instead of NNP. Example: polymerase chain reaction ( PCR )
Source: annotation wiki. Only names of persons, locations, etc. are proper nouns, and these are very few; gene, disease, and drug names are marked as common nouns.

Self-training
- Motivation: How good is self-training without knowledge?
- Same as PDA-ST, except that the red-boxed line of the algorithm figure is replaced, i.e., predictions on Du are made without the constraints.

Conclusion
- Prior knowledge gives results competitive with using labeled data.
- Future work:
  - Improve the results for self-training.
  - Find theoretical justifications for self-training.
  - Apply PDA to more tasks/domains.

Suggestions?

References
- J. Jiang and C. Zhai. Instance Weighting for Domain Adaptation in NLP. ACL 2007.
- G. Kundu and D. Roth. Adapting Text Instead of the Model: An Open Domain Approach. CoNLL 2011.
- J. Blitzer, R. McDonald, and F. Pereira. Domain Adaptation with Structural Correspondence Learning. EMNLP 2006.
This research is sponsored by ARL and DARPA under the Machine Reading program.