noun verb noun → subj predicate object

Description:

Title: PowerPoint Author: nakagawa Last modified by: Hiroshi Nakagawa Created Date: 9/12/2001 3:16:16 AM Document presentation format – PowerPoint PPT presentation

Slides: 67
Provided by: naka8

Transcript and Presenter's Notes

1
????
  • ???? ????????
  • ?????????????? ??)
  • ????

2
??????
  • ?????????????
  • ???????????
  • noun verb noun → subj predicate object
  • ?????
  • (action=???, agent=?, target=???, time=past)
  • ????????(????????????????????????? ???????
  • (action=eat, agent=I, target=an apple, time=past)
  • ???????????(?????)??????????
  • ???????
  • noun=I, verb(past)=ate, noun=an apple
  • ??? I ate an apple.

3
??????
  • ???????????????????????????
  • ??????????????????????????????????
  • ??????????????????
  • ????????????????
  • ? ? hot water?
  • ???????? ?
  • check???????????!

4
????
  • ??????
  • ??? ? APPLE
  • ?????
  • APPLE → if bare noun or singular: apple
  • if plural: apples
  • ??????? an apple,???? apples??????????????????

5
???????? (example-based translation)
  • ???????????????????????
  • ??????????? ?? I ate an orange.
  • ?????????????
  • ???????????????????????
  • ??????????
  • ???????????????????
  • ????????????an apple?????
  • ????I ate an apple.
  • ??????????????????????????????????????????????????
    ???????????

6
???????? (example-based translation)
  • ??????????????????????????????????????????????
  • ????????????????????????
  • ???????????????

7
???????Statistical Machine Translation (SMT)
  • ??????????????????????????
  • 2???????????
  • ???????????? aligned corpus
  • ????????????????????????????????
  • ??????????????????????
  • ???????
  • IBM? Peter Brown, S. Della Pietra, V. Della
    Pietra, Robert Mercer??1993??CL???The
    Mathematics of Statistical Machine
    Translation: Parameter Estimation??????

8
Bayes???
  • Canadian Hansard French-English Bilingual
    corpus
  • ?????????f ????????????? e ????
  • Given a French string f, find $\hat{e} = \arg\max_e \Pr(e \mid f)$
  • ???f???????e???????!!
  • then
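The step that follows "then" can be written out with Bayes' theorem. The block below is a sketch that assumes the standard noisy-channel formulation, which is consistent with the Pr(f|e)Pr(e) factorization discussed on the next slide. Pr(f) does not depend on e, so it drops out of the maximization:

```latex
\hat{e} = \arg\max_e \Pr(e \mid f)
        = \arg\max_e \frac{\Pr(f \mid e)\,\Pr(e)}{\Pr(f)}
        = \arg\max_e \Pr(f \mid e)\,\Pr(e)
```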

9
??Pr(e|f)?????Pr(f|e)Pr(e)??
  • ???f???????e???????!!
  • ????????????????
  • ????????????????? f ???????????????????????
  • Pr(e|f)????????????e???????????????????????
  • ????????????????????Pr(e)????????????????

10
Alignment??
  • The_1 poor_2 don't_3 have_4 any_5 money_6
  • Les_1 pauvres_2 sont_3 démunis_4
  • (Les pauvres sont démunis |
    The(1) poor(2) don't(3,4) have(3,4) any(3,4) money(3,4))
  • A(e,f)a
  • ? e,f??????

11
??
  • Alignment?????Pr(f,a|e)
  • ???Pr(f,a,e)???????

12
IBM Model 1
  • ?????????????????????????????????-(1)
  • ?????????????????-(2)

13
Model 1
  • ????????Alignment a_j ?0?? m ??????????1/(l+1)?????
    ???????????Pr(f|e)???????
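For reference, the uniform-alignment assumption leads to the following form of Pr(f|e) in the notation of Brown et al. (1993). This is a sketch taken from the published model rather than from the slide itself. The symbols assumed here are ε (the sentence-length probability), l and m (the lengths of e and f), and t(f_j | e_{a_j}) (the word translation probability):

```latex
\Pr(f \mid e)
  = \frac{\epsilon}{(l+1)^m} \sum_{a_1=0}^{l} \cdots \sum_{a_m=0}^{l} \prod_{j=1}^{m} t(f_j \mid e_{a_j})
  = \frac{\epsilon}{(l+1)^m} \prod_{j=1}^{m} \sum_{i=0}^{l} t(f_j \mid e_i)
```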

14
  • c() ????(fe)????????e ? ???????f
    ?????????2???????alignment a ????f,e????????????

15
  • (9)?? ????f?e???????????????(alignment
    a????????)?????
  • f = f1, f2 (= f), f3, ..., f7 (= f), ..., fm
  • e = e1 (= e), e2, ..., e8 (= e), ..., el
  • ????????S???? (f(s)|e(s)), s=1,...,S ????????????????S
    ????????????(10)??????????????????????????????
    (10)??ePr(fe) ??e???????

16
t(f|e)????????????
  • ????????
  • ?
  • ??????

17
  • ??????????????????

18
????EM?t(f|e)???-1
  • t(f|e)???????????
  • ?(f(s),e(s)), 1 <= s <= S ?????
  • ??????
  • ??? f,e
    ?f(s),e(s)???
  • ?????0????

19
????EM?t(f|e)???-2
  • ? ??????1??????
  • ???e??????t(f|e)???????????
  • t(f|e)???????2,3??????
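The EM procedure for t(f|e) outlined on these two slides can be sketched as below. This is a minimal illustration of IBM Model 1 training, not the author's code: the function name `train_model1`, the `<NULL>` token for the empty English word, the uniform initial value, and the toy corpus are all assumptions made for the example.

```python
from collections import defaultdict

NULL = "<NULL>"  # hypothetical token standing in for the empty English word e0


def train_model1(corpus, iterations=10):
    """Estimate t(f|e) with EM from (french_words, english_words) pairs."""
    f_vocab = {f for f_sent, _ in corpus for f in f_sent}
    uniform = 1.0 / len(f_vocab)
    # Step 1: start from a non-zero (here uniform) value for every t(f|e).
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(f|e)
        total = defaultdict(float)  # sum of c(f|e) over f, for each e

        # Step 2 (E-step): collect expected counts from every sentence pair.
        for f_sent, e_sent in corpus:
            e_words = [NULL] + list(e_sent)
            for f in f_sent:
                norm = sum(t[(f, e)] for e in e_words)
                for e in e_words:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share

        # Step 3 (M-step): renormalise so that sum_f t(f|e) = 1 for each e.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )

    return dict(t)


# Toy corpus (assumed data): "la" co-occurs with "the" in both pairs.
corpus = [
    (["la", "maison"], ["the", "house"]),
    (["la", "fleur"], ["the", "flower"]),
]
t = train_model1(corpus)
```

Because "la" co-occurs with "the" in both sentence pairs, repeating steps 2 and 3 shifts probability mass toward t(la|the) and, in turn, toward t(maison|house) and t(fleur|flower).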

20
Model 2
  • Alignment?????????????

21
????????????h?????????
22
Model 1?????????
  • Model 1 ??(l+1)^-1 ???a(i|j,m,l)?Model 2
    ????????????
  • ?????EM???????t(f|e)????
  • ?????Model 1????????

23
Model 3
  • 1???n????? not => ne pas
  • n0(??????)??????? ???????????
  • ????????????????
  • ????????
  • ???????????????

24
  • ???? n(φ|e) ????e?φ?????????????????
  • ????t(f|e)????e????????f????????
  • ???d(j|i,m,l)????l,???????m,???????i???????????j?
    ???????
  • ?????????f0

25
  • ????????????????????????????????f0???????p1???????
    ????

26
???????

fi ! ?ei???????fi?????????????????????????
27
  • (32)??????n,t,d,p0,1?????????1???????????????????
    ????? Pr(ef) ??????????
  • ????model1,2????????????????????????
  • ?????????????????????????

28
???????????????
  • ???MT??????????????????BLEU???????????????????????
    ?????????
  • ????SYSTRAN??????????MT????BLEU???????????????????
    ???????
  • ??????????SMT????????????????????????

29
?????MT?????
  • ????????????????MT????????????????????

      | ????????(RMT)   | ???? (SMT)
BLEU  | ????            | ????
????? | ??????          | ????????
????  | ??????????????? | ???????????????
30
????MT????(1)
  • BLEU
  • WER(word error rate)
  • ??????????????????????
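As a concrete reading of WER, the sketch below computes the word-level edit distance (substitutions, insertions, deletions) between a hypothesis and a single reference, divided by the reference length. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration. The example sentences are taken from the earlier slides.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution (free if words match)
            ))
        prev = cur
    return prev[-1] / len(ref)


wer = word_error_rate("I ate an apple", "I ate an orange")  # one substitution
```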

31
????MT????(2)
  • PER(position independent WER)
  • GTM(General Text Matcher)
  • ?????????????????????MMS
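PER ignores word order, so it can be computed from bags of words. The sketch below uses one common formulation, PER = 1 - (C - max(0, |hyp| - |ref|)) / |ref|, where C is the number of words shared by hypothesis and reference. Both the formulation and the function name are assumptions here, since the slide does not state the formula.

```python
from collections import Counter


def position_independent_error_rate(reference, hypothesis):
    """PER under the bag-of-words formulation stated in the lead-in."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Multiset intersection: words matched regardless of position.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    # Extra hypothesis words beyond the reference length count as errors.
    surplus = max(0, len(hyp) - len(ref))
    return 1.0 - (matches - surplus) / len(ref)


# A reordered hypothesis is error-free under PER, unlike under WER.
per_reordered = position_independent_error_rate("I ate an apple", "an apple I ate")
per_substituted = position_independent_error_rate("I ate an apple", "I ate an orange")
```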

32
???????SMT???
  • ?????JAPIO???????????/PAJ?????(1993???2004????12?
    ??G06??77??????1000??????????500????
  • ???????SMT
  • ????????(???????????????)???
  • ??????????
  • ?????????????

33
???
Tomorrow I will go to the conference
in Japan
?? F ??? ??? ????
?????MT?????????
BLEU   | WER   | PER
0.2713 | 0.848 | 0.452
MT2006(NIST??)??BLEU?0.35? ?????????
34
?????????????? -- Aligned corpus ???--
  • Parallel Corpus(?????????)
  • Aligned Corpus ?????????????????2????????????????
    ???????? ?????(align) ???????
  • 90?????????2???????????????????????
  • 90??????Noisy Parallel Corpus ??????????
    (Fung94,Fung98)

35
???????????????????
  • Gale and Church 1993
  • 2????? S,T??????(Alignment) A?????
  • S?T?????????? bead ????
  • ? B(?? language),
  • B(les eaux mineral, mineral water)
  • Alignment = $\arg\max_A P(A \mid S, T) = \arg\max_A P(A, S, T)$
  • Bk? ??????k???bead

36
???????????????????
  • ? B(?? language),
  • B(les eaux mineral, mineral water)
  • ???????????????????????collocation?????2??????????
    ????bead Bk??????P(Bk)?????????????
  • ???????????DP???

37
???????????????????
  • ???????????????????????????????????????
  • ????????????????????????????????????????
  • ????????????????????????????????DP????
  • Bead??????????????????????????????????????????????
    ?????
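The bead-based DP search referred to on these slides can be sketched as follows. This is a simplified illustration and not Gale and Church's probabilistic cost: the cost of a bead here is the absolute character-length difference between its two sides plus a fixed penalty per bead type. The penalty values, the set of bead types, and the function name `align_sentences` are all assumptions made for the example.

```python
# Bead types (sentences on side S, sentences on side T) with assumed penalties.
BEADS = {(1, 1): 0, (1, 0): 10, (0, 1): 10, (2, 1): 3, (1, 2): 3}


def align_sentences(src, tgt):
    """Return the minimum-cost sequence of beads covering src and tgt."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0

    # Forward DP over prefixes (i sentences of src, j sentences of tgt).
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                len_s = sum(len(s) for s in src[i:ni])
                len_t = sum(len(s) for s in tgt[j:nj])
                c = cost[i][j] + abs(len_s - len_t) + penalty
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)

    # Trace back from the end to recover the beads in order.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    return beads[::-1]


beads = align_sentences(["aaaa", "bb", "cccccc"], ["AAAA", "BBCCCCCC"])
```

On this toy input the cheapest path is a 1-1 bead followed by a 2-1 bead, because merging the second and third source sentences matches the length of the second target sentence exactly.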

38
???????????????
  • ?????????? GaleChurch93, Nissen et al 98
  • A?? Wang Waibel 97
  • ???????????????????????? Kay Roscheisen 93

39
Aligned corpus ??????collocation ???????
  • ???????
  • ??????????????????????w1,w2(????????????)????????
    ????????
  • w1,w2 ???????????????????????0??EM??????????Kupiec93
  • w1,w2?????? Haruno93, Smadja96
  • w1,w2?Dice?? Smadja96
  • ??-??????
  • Likelihood ratio Melamed97

40
??2???????????????????
  • ?????????(??????)??W1(??A),W2(??B)?contingency
    matrix (??) a+b+c+d=N
  • ?????
  • Dice??

      | W2?? | W2???
W1??  | a    | b
W1??? | c    | d
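Both association measures can be computed directly from the four cells of the contingency matrix. The sketch below assumes the usual definitions (the slide's own formulas did not survive extraction): the Dice coefficient is 2a / ((a+b) + (a+c)), and mutual information is taken in its pointwise form, log2(aN / ((a+b)(a+c))). The function names are hypothetical.

```python
import math


def dice(a, b, c):
    """Dice coefficient: a = both words occur, b = W1 only, c = W2 only."""
    return 2 * a / ((a + b) + (a + c))


def pointwise_mi(a, b, c, d):
    """Pointwise mutual information in bits, with N = a + b + c + d."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")  # the two words never co-occur
    return math.log2(a * n / ((a + b) * (a + c)))


# Assumed counts: the two words always appear together, in 10 of 20 segments.
d_score = dice(10, 0, 0)
mi_score = pointwise_mi(10, 0, 0, 10)
```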
41
Champollion (Smadja et al. 96)
  • Translating collocations based on sentence
    aligned bilingual corpus
  • 1????????Xtract? collocation ???
  • ????????????????????????????????????? collocation
    ???????
  • ???????????????Dice??????????

42
Champollion
  • Dice????X0,Y0(????????collocation????????)??????
    ??????????????????
  • ????? collocation (?????)?Dice????????????????
  • ?????????????????????

43
Champollion
  • Canadian Hansards (50MB order)
  • 3000, 5000 collocations extracted by Xtract
  • ????300 collocations ???
  • Xtract? error rate: 11%
  • Incorrect translations: 24%
  • Correct translations: 65%
  • Champollion's precision: 73%

44
Likelihood ratio Melamed 97
  • Melamed 97 ????????one-to-one?????????
  • 2?????u,v???????????????????
  • ????????????????
  • ????u,v?????????
  • λ+ = P(????????)? λ- = P(?????????)?????????????
  • λ+, λ- ??u,v P(k(u,v)|n(u,v),λ)??????????????
  • ??????????L(u,v) = B(k(u,v)|n(u,v), λ+) / B(k(u,v)|n(u,v), λ-)
  • ? ????????????
  • recall 90% -- precision 87%
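The likelihood ratio L(u,v) can be evaluated as a ratio of two binomial probabilities. The sketch below assumes that k(u,v) is the number of times u and v are linked, n(u,v) is the number of times they co-occur, and λ+ and λ- are the link probabilities for true and false translation pairs. The function names and the parameter values 0.9 and 0.1 are illustrative assumptions, not figures from Melamed 97.

```python
from math import comb, isclose


def binomial(k, n, p):
    """B(k | n, p): probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def likelihood_ratio(k, n, lambda_plus, lambda_minus):
    """L(u,v) = B(k | n, lambda+) / B(k | n, lambda-)."""
    return binomial(k, n, lambda_plus) / binomial(k, n, lambda_minus)


# Linked in 9 of 10 co-occurrences: strong evidence for a translation pair.
strong = likelihood_ratio(9, 10, 0.9, 0.1)
# Linked in only 1 of 10 co-occurrences: evidence against.
weak = likelihood_ratio(1, 10, 0.9, 0.1)
```

A ratio well above 1 favors treating (u,v) as mutual translations, while a ratio well below 1 favors rejecting the pair.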

45
?????????????? -- non aligned corpus ???--
  • Non-aligned Corpus ???
  • Align???????????? Fung95ACL
  • Alignment ??????????????
  • ?????? Fung95WVLC
  • ??????????? Rapp95,Tanaka96 ,Fung98,
  • ???????????????

46
?????????????? -- non aligned corpus ???--
  • ????????????????????? collocation
    ?????????????????????????

47
Noisy Parallel Corpus ???
  • Fung94 ACL ?? English-French parallel corpus
    (Hansards) ?????????????????
  • ???????????????K????
  • ???Wen????Wfr?K????????????(<1,0,1,0,0>?????)?????
    ????MI???
  • MI?????????????
  • ??????MI???????????(t-score?????)????
  • K????????????Fung95??????

48
???????????????Alignment ???????????????????Noisy
Parallel Corpus ?????
  • Fung95 ACL95
  • Alignment ? ???????????
  • Step1. ???????????????????????????????????????????
    ????????
  • ?????????????????
  • ???????DP?????????????????????????????????????????
    ??????????
  • ???????????????
  • ????????????????????? alignment ??????

49
Fung 95 ACL ???
  • Step2 ??????????
  • ????????????????s1,s2,???(K????????????)
  • ???????????????????????( i of si
    )?????????????????? alignment ????????????????????
    ?
  • ???????????????????????

50
Fung 95 ACL ???
  • ??
  • 6000?????????????
  • ?????? 128? 80????
  • ?????? 533? 70?????
  • ???? 73???
  • ????????????????

51
?????????2?????????????
  • Non-parallel comparable corpora
  • Similarity of context is the cue.
  • language A: a b X c d
  • language B: a b Y c d
  • Calculating context similarity is computationally heavy

52
Context Heterogeneity (Fung95WVLC)
  • ????????? parallel ???????
  • ?? trigram ????????????
  • ??L ??0 ??R
  • ??????0 ? context heterogeneity ?
  • ??L?????a, ??R?????b
  • ??0 ?????c ???
  • Left-heterogeneity = a/c, right-heterogeneity = b/c
  • ?????????????(??)?????
  • ????????w1,w2? context heterogeneity x1,y1 (for
    w1), x2,y2 (for w2)????

53
Context heterogeneity ???
  • e??????????????????????????????????????
  • ??????????non-aligned corpus ?????????
  • Context heterogeneity ?????????????????????????
  • ?????????????
  • ????????????????????????????????????

54
??????????????? Rapp
  • Rapp95,99 ???????????????????????????????
  • ????????????????
  • ?????????????wd???2?????????????
  • ?????????????????? wa,wb ??????????MI?????????????
    (????????)?????????

55
??????????????? Rapp
  • ?????
  • ?????????(??????)??wa,wb(??B)?contingency matrix
    (??) a+b+c+d=N,
  • wa,wb??? a, wa?? b ,wb?? c,??????d
  • ???????????????????wa?????w(ger)?????w(eng)???????
    ???????

56
??????????????? Rapp
  • ???w(ger)???w(eng)???????????
  • ??s(w(ger),w(eng))?wa????????????????????S????
  • S???????wa??????????
  • ????????????????100??????????????
  • ?1?????????72?
  • ??10???????????????????89???

57
??????????????? Fung
  • Fung98 Rapp????????
  • ????????????????????W?????????
  • W?????????????????????????? Wd??????????????????
  • W????????Wd?????tf
  • W?????document ????????idf (W?????????????????

58
??????????????? Fung
  • tfidf ???Wd????????Wd?????????????????k????
  • cosine, Dice???????????????Wd???????
  • ?1????????30
  • ?20????????76???????

59
??????????????? Tanaka K
  • Tanaka96 ????MI????EDICT
  • ??A T ??B
  • ??u ??k
  • A TAT
    vs B
  • ??v ??l
  • Tij = p(??B???j | ??A???i)
  • TAT?B ???????????????T?????????
  • 378????????
    ???????? 82??????????????85????

62
??????????????????nakagawa2000 LREC WTRC, 2001
NLPRS
  • ?????????????????
  • ??????????????????????????
  • ??????wj????????we1,we2,..????????(EDICT)?????????
    ??
  • wj ??????? wj ???????????????????wei (i1,2,..)
    ???

63
?????????? ????(???)
??????????????(??)
1? N? memory system 100? ?????
1? N-2???????? N3??????? N50???? 100? ????
?
?

64
Distance
  • distance(Xe, Xj) = rank(Xe) - rank(Xj)
  • If distance(Xe, Xj) is small, Xe is likely the
    translation of Xj.
  • If distance(Xe1, Xj) < distance(Xe2, Xj) < ...,
  • then Xe1 is the most likely translation of Xj

65
Example of distance
??????? 0.051493 ??????
0.956459 memory system ?????
1.234347 ???? 3.809609 ??? 63.498688
??,?????????????????60??80???
66
??????????????? ???
  • Rapp?Fung, Tanaka ????
  • ????????????????
  • ????????????(???????)????????
  • ????????????????????????????????????????
  • ????????????????????(local minimum)???????????????