1
The NESPOLE Interchange Format (IF)
  • Lori Levin, Emanuele Pianta, Donna Gates, Kay
    Peterson, Dorcas Wallace, Herve Blanchon, Roldano
    Cattoni, Jean-Philippe Gibaud, Chad Langley, Alon
    Lavie, Nadia Mana, Fabio Pianesi

2
Outline
  • Approaches to MT: Interlingua, Transfer, Direct.
  • The NESPOLE! Interlingua.
  • Overview and motivation
  • Linguistic coverage
  • Tools and resources.
  • Evaluating an interlingua
  • Coverage: how do we measure coverage of the
    domain?
  • Reliability: so that an analyzer written by one
    person in Italy can work with a generator written
    by someone he has never met in Korea.
  • Scalability: move to broader semantic domains
    without a constant increase in the amount of work.

3
Outline
  • Approaches to MT: Interlingua, Transfer, Direct.
  • The NESPOLE! Interlingua.
  • Overview and motivation
  • Linguistic coverage
  • Tools and resources.
  • Evaluating an interlingua
  • Coverage
  • Reliability
  • Scalability

4
What is an interlingua?
  • Representation of meaning or speaker intention.
  • Sentences that are equivalent for the translation
    task have the same interlingua representation.
  • The room costs 100 Euros per night.
  • The room is 100 Euros per night.
  • The price of the room is 100 Euros per night.

5
Vauquois MT Triangle
[Diagram: the Vauquois triangle. At the apex, the interlingua:
give-information+personal-data (name=alex_waibel). At the transfer level,
source and target syntactic structures, e.g. (s (vp accusative_pronoun
chiamare proper_name)) and (s (np possessive_pronoun name) (vp be
proper_name)), related by transfer rules. At the base, direct translation
from the source sentence "Mi chiamo Alex Waibel" to the target sentence
"My name is Alex Waibel."]
6
Other Approaches to Machine Translation
  • Direct:
  • Very little analysis of the source language.
  • Transfer:
  • Analysis of the source language.
  • The structure of the source language input may
    not be the same as the structure of the target
    language sentence.
  • Transfer rules relate source language structures
    to target language structures.

7
Note
  • Some transfer systems may produce a more detailed
    meaning representation than some interlingua
    systems.
  • The difference is whether translation equivalents
    in the source and target languages are related by
    a single canonical representation.

8
Multilingual Translation with an Interlingua
[Diagram: analyzers for French, Italian, Japanese, English, German,
Korean, Catalan, Spanish, Arabic, and Chinese map input into a single
interlingua; generators map the interlingua back out into each language.
Chinese (input sentence): San1 tian1 qian2, wo3 kai1 shi3 jue2 de2 tong4
Interlingua: give-information+onset+body-state
  (body-state-spec=pain, time=(interval=3d, relative=before))
English (output sentence): The pain started three days ago.
Chinese (paraphrase): wo3 yi3 jin1 tong4 le4 san1 tian1]
9
Multilingual translation with transfer
  • Transfer-rules-1: Arabic-Catalan
  • Transfer-rules-2: Catalan-Arabic
  • Transfer-rules-3: Arabic-Chinese
  • Transfer-rules-4: Chinese-Arabic
  • Transfer-rules-5: Arabic-English
  • Transfer-rules-6: English-Arabic
  • Etc.: one rule set per ordered language pair
    (see the sketch below).
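
The bookkeeping behind this contrast can be made concrete. Here is a
minimal sketch (mine, not from the presentation) of how the two
architectures grow as languages are added: transfer needs one rule set per
ordered language pair, while an interlingua needs one analysis grammar and
one generation grammar per language.

```python
# A minimal sketch (not from the presentation) contrasting how the two
# architectures grow as languages are added.

LANGUAGES = ["Arabic", "Catalan", "Chinese", "English", "French",
             "German", "Italian", "Japanese", "Korean", "Spanish"]

def transfer_rule_sets(n: int) -> int:
    """One rule set per ordered pair of distinct languages: n * (n - 1)."""
    return n * (n - 1)

def interlingua_grammars(n: int) -> int:
    """One analysis grammar plus one generation grammar per language."""
    return 2 * n

n = len(LANGUAGES)
print(f"{n} languages: {transfer_rule_sets(n)} transfer rule sets vs. "
      f"{interlingua_grammars(n)} interlingua grammars")
# 10 languages: 90 transfer rule sets vs. 20 interlingua grammars
```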

10
Advantages of Interlingua
  • Add a new language easily:
  • get all-ways translation to all previous
    languages by adding one grammar for analysis and
    one grammar for generation
  • Mono-lingual development teams.
  • Paraphrase:
  • Generate a new source language sentence from the
    interlingua so that the user can confirm the
    meaning

11
Disadvantages of Interlingua
  • Meaning is arbitrarily deep.
  • What level of detail do you stop at?
  • If it is too simple, meaning will be lost in
    translation.
  • If it is too complex, analysis and generation
    will be too difficult.
  • Should be applicable to all languages.
  • Human development time.

12
Interlingual MT Systems
  • University of Maryland: Lexical Conceptual
    Structure (Dorr)
  • Carnegie Mellon:
  • KANTOO (Mitamura and Nyberg)
  • Nespole/C-STAR (Waibel, Levin, Lavie)
  • UNL (Universal Networking Language)
  • Mikrokosmos (Nirenburg)
  • Verbmobil: Domain actions (Block)

13
Outline
  • Approaches to MT: Interlingua, Transfer, Direct.
  • The NESPOLE! Interlingua.
  • Overview and motivation
  • Linguistic coverage
  • Tools and resources.
  • Evaluating an interlingua
  • Reliability
  • Coverage

14
A Travel Dialogue (Translated from Italian)
  • A: Albergo Gabbia D'Oro. Good evening.
  • B: My name is Anna Maria DeGasperi. I'm calling
    from Rome. I wish to book two single rooms.
  • A: Yes.
  • B: From Monday to Friday the 18th, I'm sorry, to
    Monday the 21st.
  • A: Friday the 18th of June.
  • B: The 18th of July. I'm sorry.
  • A: Friday the 18th of July to, you were saying,
    Sunday.
  • B: No. Through Monday the 21st.

15
A Travel Dialogue (Continued)
  • B: So with departure on Tuesday the 22nd.
  • A: Then leaving on the 22nd. Yes. We have two
    singles certainly.
  • B: Yes.
  • A: Would you like breakfast?
  • B: Is it possible to have all meals?
  • A: No. We serve meals only in the evening.
  • B: Ok. If you can do breakfast and dinner.
  • A: Ok.
  • B: Do you need a deposit?

16
A Travel Dialogue (Continued)
  • A: You can give me your credit card number.
  • B: Ok. Just a moment. Ok. My name is Anna
    Maria DeGasperi. The card is 005792005792.
  • A: Good.
  • B: Expiration 2002.
  • A: 2002. Good. Thank you. We need a
    confirmation on the 18th of July before 6pm.
  • B: Goodbye.
  • A: Thanks. Goodbye.
  • B: Thanks. Goodbye.

17
A Non-Task-Oriented Dialogue (We can't translate
this.)
  • A: Are you cooking?
  • B: My father is cooking. I'm cleaning. I just
    finished cleaning the bathroom.
  • A: Look. What do you know about Monica?
  • B: I don't know anything. Look. I don't know
    anything.
  • A: You don't know anything? I wrote her three
    weeks ago, but if she hasn't received the letter,
    they would have returned it. I hope she received
    it.
  • B: Because Celia told me that the address that
    Monica had given us was wrong. She said that if
    I was going to write to her, well, ...

From the Spanish CallHome corpus: unlimited
conversation between family members.
18
The Ideal MT System
  • Fully automatic
  • High quality
  • Domain independent (any topic)
  • ...isn't within the current state-of-the-art.

19
Design Principles of the Interchange Format
  • Suitable for task-oriented dialogue
  • Based on speaker's intent, not literal meaning
  • "Can you pass the salt?" is represented only as a
    request for the hearer to perform an action, not
    as a question about the hearer's ability.
  • Abstract away from the peculiarities of any
    particular language
  • resolve translation mismatches.

20
Translation Mismatches
  • Sentences that are translation-equivalents in two
    languages do not have the same syntactic
    structure or predicate-argument structure.
    (UNITRAN, Eurotra)
  • I like to swim.
  • I swam across the river.
  • Sue met with Sam/Sue met Sam.

21
Design Principles (continued)
  • Domain-independent framework with domain-specific
    parts
  • Simple and reliable enough to use
  • at multiple research sites with high intercoder
    agreement.
  • with widely varying types of parsers and
    generators.
  • Allow robust language engines
  • Underspecification must be possible.
  • Fragments must be represented.

22
Speech Acts: Speaker intention vs. literal meaning
  • Can you pass the salt?
  • Literal meaning: The speaker asks for information
    about the hearer's ability.
  • Speaker intention: The speaker requests the
    hearer to perform an action.

23
Remember this term: Domain Action
24
Domain Actions: Extended, Domain-Specific Speech
Acts
  • give-information+existence+body-state
  • It hurts.
  • give-information+onset+body-object
  • The rash started three days ago.
  • request-information+availability+room
  • Are there any rooms available?
  • request-information+personal-data
  • What is your name?

25
Domain Actions: Extended, Domain-Specific Speech
Acts
  • In domain:
  • I sprained my ankle yesterday.
  • When did the headache start?
  • Out of domain:
  • Yesterday I slipped in the driveway on my way to
    the garage.
  • The headache started after my boss noticed that I
    deleted the file.

26
Formulaic Utterances
  • Good night.
  • tisbaH cala xEr
  • (literal gloss: waking up on good)
  • Romanization of Arabic from CallHome Egypt.

27
Same intention, different syntax
  • rigly bitiwgacny
  • my leg hurts
  • candy wagac fE rigly
  • I have pain in my leg
  • rigly bitiClimny
  • my leg hurts
  • fE wagac fE rigly
  • there is pain in my leg
  • rigly bitinqaH calya
  • my leg bothers on me
  • Romanization of Arabic from CallHome Egypt.

28
Language Neutrality
  • Comes from representing speaker intention rather
    than literal meaning for formulaic and
    task-oriented sentences.
  • How about ...? → suggestion
  • Why don't you ...? → suggestion
  • Could you tell me ...? → request info
  • I was wondering ... → request info

29
Domain Action Interlingua and Lexical Semantic
Interlingua
  • and how will you be paying for this
  • Domain Action representation:
  • a:request-information+payment (method=question)
  • Lexical Semantic representation:
  • predicate: pay
  • time: future
  • agent: hearer
  • product: (distance=proximate, type=demonstrative)
  • manner: question

30
Complementary Approaches
  • Domain actions: limited to task-oriented
    sentences
  • Lexical semantics: less appropriate for formulaic
    speech acts that should not be translated
    literally

31
Components of the Interchange Format
  • speaker: a (agent)
  • speech act: give-information
  • concept: availability+room
  • arguments: (room-type=(single & double),
    time=md12)
(How these components compose is sketched below.)
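
A minimal sketch of how the four components compose into the textual IF
form speaker:speech-act+concepts (arguments). The class name and field
layout are mine, not part of the IF specification.

```python
from dataclasses import dataclass, field

@dataclass
class IFExpression:
    """The four IF components; a sketch, not the project's own tooling."""
    speaker: str                            # "a" (agent) or "c" (client)
    speech_act: str                         # e.g. "give-information"
    concepts: list = field(default_factory=list)
    arguments: str = ""                     # argument list kept as raw text

    def render(self) -> str:
        head = self.speaker + ":" + "+".join([self.speech_act] + self.concepts)
        return f"{head} ({self.arguments})" if self.arguments else head

expr = IFExpression("a", "give-information", ["availability", "room"],
                    "room-type=(single & double), time=md12")
print(expr.render())
# a:give-information+availability+room (room-type=(single & double), time=md12)
```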

32
Components of IF (as of February 2002)
  • 61 speech acts (e.g., give-information)
  • domain independent;
  • 20 are dialog managing
  • 108 concepts (e.g., availability, accommodation)
  • mostly domain dependent
  • 304 arguments (e.g., room-type, time)
  • domain dependent and independent
  • 7,652 values (e.g., single, double, 12th)

33
Examples
  • no that's not necessary
  • c:negate
  • yes I am
  • c:affirm
  • my name is alex waibel
  • c:give-information+personal-data
    (person-name=(given-name=alex, family-name=waibel))
  • and how will you be paying for this
  • a:request-information+payment (method=question)
  • I have a mastercard
  • c:give-information+payment (method=mastercard)
(A rough parser for this notation is sketched below.)
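
Here is a rough parser for this notation (my own sketch, not the
project's IF Checker) that splits an expression into speaker, speech
act, concepts, and the raw argument string.

```python
import re

# Matches  speaker:speech-act+concept* (arguments)?  as a sketch only.
IF_PATTERN = re.compile(r"^(?P<speaker>[ac]):(?P<da>[^ (]+)"
                        r"(?:\s*\((?P<args>.*)\))?\s*$", re.S)

def parse_if(text: str) -> dict:
    m = IF_PATTERN.match(text.strip())
    if m is None:
        raise ValueError(f"not an IF expression: {text!r}")
    speech_act, *concepts = m.group("da").split("+")
    return {"speaker": m.group("speaker"), "speech_act": speech_act,
            "concepts": concepts, "arguments": m.group("args") or ""}

print(parse_if("c:give-information+payment (method=mastercard)"))
# {'speaker': 'c', 'speech_act': 'give-information',
#  'concepts': ['payment'], 'arguments': 'method=mastercard'}
```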

34
Outline
  • Approaches to MT: Interlingua, Transfer, Direct.
  • The NESPOLE! Interlingua.
  • Overview and motivation
  • Linguistic coverage
  • Tools and resources.
  • Evaluating an interlingua
  • Reliability
  • Coverage
  • Scalability

35
Conventional Speech Acts
  • thank you. → c:thank
  • can I help you? → a:offer+help (who=i,
    to-whom=you)
  • <uh> my name is Chad → c:give-information+
    personal-data (person-name=(given-name=chad))

36
Fragments: ellipsis
  • <B> and <uh> <hm> in a restaurant. →
    a:give-information+concept
    (conjunction=discourse, location=(restaurant,
    identifiability=no))
  • <uh> which town? → c:request-information+concept
    (concept-spec=(town, identifiability=question))

37
Fragments: abandoned
  • You should ...
  • a:suggest+concept (who=you)
  • What should I ...
  • c:request-suggestion+concept (who=i)

38
Coordination of Sentences
  • I want to go to France and I would prefer to
    leave today.
  • c:give-information+disposition+trip
    (destination=(object-name=france),
    disposition=(who=i, desire))
  • c:give-information+disposition+departure
    (conjunction=discourse,
    time=(relative-time=today),
    disposition=(who=i, preference))

39
Coordination of sentences, reduced
  • I want to leave Pittsburgh at 2 and return from
    Rome at 5.
  • c:give-information+disposition+departure
    (conjunction=discourse,
    origin=(object-name=pittsburgh),
    disposition=(who=i, desire),
    time=(clock=(hours=2)))
  • c:give-information+trip (conjunction=discourse,
    factuality=unspecified, trip-spec=return,
    origin=(object-name=rome),
    time=(clock=(hours=5)))

40
Conjunctive Set
  • I like festivals and plays.
  • c:give-information+disposition+event (...
    event-spec=(operator=conjunct, (festival,
    quantity=plural), (play, quantity=plural)))

41
Conjunction of modifiers
  • I prefer red and blue cars.
  • c:give-information+disposition+vehicle (...
    vehicle-spec=(car, quantity=plural,
    color=(operator=conjunct, red, blue)))

42
Disjunctive Sets
  • I prefer hotels or cabins.
  • c:give-information+disposition+accommodation (...
    accommodation-spec=(operator=disjunct, (hotel,
    quantity=plural), (cabin, quantity=plural)))

43
Contrastive Set
  • I like hotels but not cabins.
  • c:give-information+disposition+accommodation (...
    accommodation-spec=(operator=contrast, (hotel,
    quantity=plural), (polarity=negative, cabin,
    quantity=plural)))

44
Attitudes: often a source of mismatches
  • Disposition
  • Eventuality
  • Evidentiality
  • Feasibility
  • Knowledge
  • Obligation
  • Main verbs in English that occur in other
    languages as affixes, adverbs, or other
    constructions that are not clearly bi-clausal.

45
Disposition
  • <uhm> <P> and I would like to arrive <P> around
    September ninth.
  • c:give-information+disposition+arrival
    (disposition=(who=i, desire),      [attitude]
    conjunction=discourse,             [rhetorical information]
    time=(exactness=approximate,
    month=9, md=9))                    [time]

46
Disposition
  • I would like to stay in a hotel.
  • disposition=desire
  • I hate mushroom picking.
  • disposition=dislike
  • I am waiting to see the circle.
  • disposition=expectation
  • But wouldn't matter.
  • disposition=indifferent
  • When do you plan on arriving in Pittsburgh?
  • disposition=intention

47
Eventuality
  • It is possible I may be arriving earlier.
  • give-information+eventuality+arrival
    (eventuality=possible)
  • I'm sure that they will arrive tomorrow.
  • Maybe there is something beautiful to see.
  • It is not impossible.

48
Evidentiality: Source of information
  • Apparently there are many castles.
  • give-information+evidentiality+attraction
  • I heard there are many castles.
  • I noticed there is a winter package available.
  • I've been told I must leave before ten.

49
Feasibility
  • You can rent skis at the resort.
  • give-information+feasibility+rent+equipment
    (feasibility=feasible)

50
Knowledge
  • I didn't know that Trento has lakes.
  • give-information+negation+knowledge+contain+
    attraction
    (knowledge=(who=i, polarity=negative),
    contain=(lake, quantity=plural),
    attraction-spec=name-trento)
  • I know the location of the hotel.

51
Obligation
  • You must make a reservation.
  • give-information+obligation+reservation
    (obligation=required)
  • You may cancel at any time.
  • We require that you cancel 24 hours in advance.

52
Negation with limited facilities for
representing scope
  • Of conventional speech act
  • I didn't hear.
  • negate-dialogue-hear
  • Of main predication
  • I did not make a reservation.
  • give-information+negation+reserve+accommodation
    (polarity=negative)
  • Of attitude
  • I don't know if it's all right.
  • give-information+negation+knowledge+feature+
    object
    (knowledge=(who=i, polarity=negative),
    object-spec=pronoun,
    feature=(modifier=acceptable))

53
Negation
  • I didn't promise I would not come.
  • negate-promise+negation+action
  • Of a concept
  • There is no downhill skiing?
  • request-information+existence+activity
    (activity-spec=(polarity=negative,
    downhill_skiing))

54
Relative Clauses
  • Broken into two IFs.
  • I want the hotel that you suggested.
  • give-information+disposition+accommodation
    (disposition=(desire, who=i),
    accommodation-spec=(hotel, identifiability=yes))
  • give-information+recommendation+object
    (object-spec=relative, who=you, e-time=previous)
  • Sentence-internal relative clauses (e.g.,
    modifying the subject) are not handled very well.
  • The only hotel that I can show you is a four-star
    hotel.
  • No long-distance gaps.
  • They are rare anyway.

55
Some simple relative clauses aren't broken
  • The hotel that is in Cavalese
  • give-information+concept
    (accommodation-spec=(hotel,
    identifiability=yes, location=name-cavalese))

56
Yes-No Questions
  • Conventional speech act
  • Do you hear me?
  • dialog-request-hear
  • Does the flight leave at 2:00?
  • Tell me if the flight leaves at 2:00.
  • request-information+departure
    (transportation-spec=(flight,
    identifiability=yes), time=(clock=(hours=2)))

57
Wh-questions
  • Who is traveling?
  • request-information+trip (who=question)
  • When are you traveling?
  • What date are you traveling?
  • How quiet is the hotel?
  • Where are you traveling to?
  • How are you traveling?
  • What are you doing?
  • No long-distance gaps.

58
Rhetorical Relations
  • Therefore I arrived late.
  • give-information+arrival
    (cause=discourse, who=i)
  • I arrived late because of the snow.
  • give-information+arrival
    (who=i, cause=snow, e-time=previous, time=late)

59
Rhetorical Relations
  • ... because I was tired.
  • give-information+feature+person
    (rhetorical=cause, ...)
  • Other relations: after, before, besides,
    co-occurrence, concessive, condition,
    contrastive, dependency, purpose, related-to,
    restrictive-result, result, while

60
Outline
  • Approaches to MT: Interlingua, Transfer, Direct.
  • The NESPOLE! Interlingua.
  • Overview and motivation
  • Linguistic coverage
  • Tools and resources.
  • Evaluating an interlingua
  • Reliability
  • Coverage
  • Scalability

61
The Interchange Format Database
Record format (one record per line):
d.u.sdu  olang:X  lang:Y  Prv:Z   <sdu in language Y on one line>
d.u.sdu  olang:X  lang:Z  Prv:Z   <sdu in language Z on one line>
d.u.sdu  IF       Prv:Z           <IF on one line>
d.u.sdu  comments                 <your comments go here>

Example:
61.2.3  olang:I  lang:I  Prv:IRST  telefono per prenotare delle stanze
        per quattro colleghi
61.2.3  olang:I  lang:E  Prv:IRST  I'm calling to book some rooms for
        four colleagues
61.2.3  IF  Prv:IRST  c:request-action+reservation+features+room
        (for-whom=(associate, quantity=4))
61.2.3  comments  dial-oo5-spkB-roca0-02-3
(A record-grouping sketch follows.)
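
A small sketch of reading such records, grouping lines by their
dialogue-unit key. The tab separators are my assumption; the slide's own
field delimiters were lost in transcription.

```python
from collections import defaultdict

SAMPLE = (
    "61.2.3\tolang:I\tlang:E\tPrv:IRST\t"
    "I'm calling to book some rooms for four colleagues\n"
    "61.2.3\tIF\tPrv:IRST\tc:request-action+reservation+features+room "
    "(for-whom=(associate, quantity=4))\n"
    "61.2.3\tcomments\tdial-oo5-spkB-roca0-02-3\n"
)

records = defaultdict(list)
for line in SAMPLE.splitlines():
    key, *fields = line.split("\t")   # first field is the d.u.sdu key
    records[key].append(fields)

for key, entries in records.items():
    for fields in entries:
        print(key, fields[0], "->", fields[-1])
```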
62
NESPOLE! Database
  • Annotated turns (end 2001)
  • English: 815 (235 distinct DAs)
  • German: 2,873 (367)
  • Italian: 1,286 (233)
  • French: 234 (94)
  • Total distinct DAs: 610
  • Annotated turns (end 2002): 30-40% more

63
Tools and Resources
  • IF specifications (available on the web)
  • http://www.is.cs.cmu.edu/nespole/db/index.html
  • IF discussion board
  • http://peace.is.cs.cmu.edu/ISL/get/if.html
  • C-STAR and NESPOLE! Data Bases
  • http://www.is.cs.cmu.edu/nespole/db/index.html
  • IF Checker (web interface)
  • http://tcc.itc.it/projects/xig/xig-on-line.html
  • IF test suite
  • http://tcc.itc.it/projects/xig/xig-ts.html
  • IF emacs mode

64
The C-STAR Interchange Format Database
Language    Dialogues    Sentences/Utterances
English     36           2,466
Korean      70           1,142
Italian     5            233
Japanese    124          5,887

Distinct dialogue acts: 554 (310 agent, 244 client)
65
Outline
  • Approaches to MT: Interlingua, Transfer, Direct.
  • The NESPOLE! Interlingua.
  • Overview and motivation
  • Linguistic coverage
  • Tools and resources.
  • Evaluating an interlingua
  • Reliability
  • Coverage
  • Scalability

66
Comparison of two interlinguas
  • I would like to make a reservation for the fourth
    through the seventh of July.
  • IF-1 (C-STAR II, 1997-1999):
  • c:request-action+reservation+temporal+hotel
    (time=(start-time=md4, end-time=(md7, july)))
  • IF-2 (NESPOLE, 2000-2002):
  • c:give-information+disposition+reservation+
    accommodation
    (disposition=(who=i, desire),
    reservation-spec=(reservation,
    identifiability=no),
    accommodation-spec=hotel,
    object-time=(start-time=(md4),
    end-time=(md7, month=7,
    incl-excl=inclusive)))

67
Comparison of four databases (travel domain, role
playing, spontaneous speech)
Same data, different interlingua
  • DB-1: C-STAR II English database tagged with IF-1
  • 2278 sentences
  • DB-2: C-STAR II English database tagged with IF-2
  • 2564 sentences
  • DB-3: NESPOLE English database tagged with IF-2
  • 1446 sentences
  • Only about 50% of the vocabulary overlaps with
    the C-STAR database.
  • DB-4: Combined database tagged with IF-2
  • 4010 sentences

Significantly larger domain
68
Outline
  • Approaches to MT: Interlingua, Transfer, Direct.
  • The NESPOLE! Interlingua.
  • Overview and motivation
  • Linguistic coverage
  • Tools and resources.
  • Evaluating an interlingua
  • Reliability
  • Coverage
  • Scalability

69
Measuring Coverage
  • No-tag rate (sketched below)
  • Can a human expert assign an interlingua
    representation to each sentence?
  • C-STAR II no-tag rate: 7.3%
  • NESPOLE no-tag rate: 2.4%
  • 300 more sentences were covered in the C-STAR
    English database.
  • End-to-end translation performance: measures
    recognizer, analyzer, and generator performance
    in combination with interlingua coverage.
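
The no-tag rate itself is a simple ratio. A sketch with illustrative
counts; the raw sentence counts behind the percentages are not on the
slide.

```python
def no_tag_rate(untaggable: int, total: int) -> float:
    """Percent of sentences to which no interlingua could be assigned."""
    return 100.0 * untaggable / total

# Illustrative only: 167 untaggable sentences out of 2278 gives ~7.3%.
print(f"{no_tag_rate(167, 2278):.1f}%")
```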

70
Outline
  • Approaches to MT: Interlingua, Transfer, Direct.
  • The NESPOLE! Interlingua.
  • Overview and motivation
  • Linguistic coverage
  • Tools and resources.
  • Evaluating an interlingua
  • Reliability
  • Coverage
  • Scalability

71
Example of failure of reliability
  • Input: 3:00, right?
  • Interlingua: verify (time=3:00)
  • Poor choice of speech act name: does it mean
    that the speaker is confirming the time or
    requesting verification from the user?
  • Output: 3:00 is right.

72
Measuring Reliability: Cross-site evaluations
  • Compare performance of
  • Analyzer → interlingua → generator
  • Where the analyzer and generator are built at the
    same site (or by the same person)
  • Where the analyzer and generator are built at
    different sites (or by different people who may
    not know each other)
  • C-STAR II interlingua: comparable end-to-end
    performance within sites and across sites.
  • around 60% acceptable translations from speech
    recognizer output.
  • NESPOLE interlingua: cross-site end-to-end
    performance is lower (but not clearly because of
    the IF).

73
Intercoder agreement: average of pairwise percent
agreement (sketched below)
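
A sketch of the metric named on this slide: for each pair of coders, the
percent of utterances they tag identically, averaged over all pairs. The
coders and tags below are invented for illustration.

```python
from itertools import combinations

def pairwise_percent_agreement(codings: dict) -> float:
    """Average, over all coder pairs, of the percent of identical tags."""
    totals = []
    for a, b in combinations(codings, 2):
        same = sum(x == y for x, y in zip(codings[a], codings[b]))
        totals.append(100.0 * same / len(codings[a]))
    return sum(totals) / len(totals)

codings = {  # three coders tagging the same four utterances (invented)
    "coder1": ["c:affirm", "c:negate", "a:offer+help", "c:thank"],
    "coder2": ["c:affirm", "c:negate", "a:offer+help", "c:acknowledge"],
    "coder3": ["c:affirm", "c:verify", "a:offer+help", "c:thank"],
}
print(f"{pairwise_percent_agreement(codings):.1f}%")  # 66.7%
```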

74
Workshop on Interlingua Reliability (SIG-IL)
  • Association for Machine Translation in the
    Americas
  • October 8, 2002
  • Tiburon, California
  • Intent to participate in coding experiment
    (dependency representation)
  • (email to lsl@cs.cmu.edu)

75
Outline
  • Approaches to MT: Interlingua, Transfer, Direct.
  • The NESPOLE! Interlingua.
  • Overview and motivation
  • Linguistic coverage
  • Tools and resources.
  • Evaluating an interlingua
  • Reliability
  • Coverage
  • Scalability

76
Comparison of four databases (travel domain, role
playing, spontaneous speech)
Same data, different interlingua
  • DB-1: C-STAR II English database tagged with IF-1
  • 2278 sentences
  • DB-2: C-STAR II English database tagged with IF-2
  • 2564 sentences
  • DB-3: NESPOLE English database tagged with IF-2
  • 1446 sentences
  • Only about 50% of the vocabulary overlaps with
    the C-STAR database.
  • DB-4: Combined database tagged with IF-2
  • 4010 sentences

Significantly larger domain
77
Measuring Scalability: Coverage Rate
  • What percent of the database is covered by the
    top n most frequent domain actions? (See the
    sketch below.)
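
A sketch of this computation over a database represented as a list of
domain-action tags, one per sentence. The tags and counts below are
invented for illustration.

```python
from collections import Counter

def coverage_rate(domain_actions: list, n: int) -> float:
    """Percent of sentences whose DA is among the n most frequent DAs."""
    counts = Counter(domain_actions)
    top_n_total = sum(c for _, c in counts.most_common(n))
    return 100.0 * top_n_total / len(domain_actions)

tags = (["give-information+price"] * 50
        + ["request-information+availability+room"] * 30
        + ["c:thank"] * 15 + ["negate-dialogue-hear"] * 5)
for n in (1, 2, 3):
    print(n, f"{coverage_rate(tags, n):.0f}%")  # 50%, 80%, 95%
```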

78
Measuring Scalability: Number of domain actions
as a function of database size
  • Sample size from 100 to 3000 sentences in
    increments of 25.
  • Average number of unique domain actions over ten
    random samples for each sample size.
  • Each sample includes a random selection of
    frequent and infrequent domain actions. (The
    procedure is sketched below.)
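
A sketch of the sampling procedure just described, with `database`
standing in for the list of domain-action tags, one per sentence.

```python
import random

def avg_unique_das(database: list, size: int, trials: int = 10) -> float:
    """Average number of unique DAs over `trials` random samples."""
    totals = [len(set(random.sample(database, size))) for _ in range(trials)]
    return sum(totals) / trials

def growth_curve(database: list) -> list:
    """(sample size, average unique DAs) for sizes 100..3000, step 25."""
    sizes = range(100, min(3000, len(database)) + 1, 25)
    return [(s, avg_unique_das(database, s)) for s in sizes]

# Usage: database = [DA tag for each tagged sentence]; plot growth_curve(database).
```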

79
[Figure: average number of unique domain actions as a function of sample
size; no transcript available.]
80
Comparison of four databases (travel domain, role
playing, spontaneous speech)
Same data, different interlingua
  • English database 1 tagged with interlingua 1
    2278 sentences
  • English database 1 tagged with interlingua 2
    2564 sentences
  • English database 2 tagged with interlingua 2
    1446 sentences
  • Only about 50% of the vocabulary overlaps with
    the English database 1.
  • Combined databases tagged with interlingua 2
    4010 sentences

Significantly larger domain
81
Conclusions
  • An interlingua based on domain actions is
    suitable for task-oriented dialogue
  • Reliable
  • Good coverage
  • Scalable without explosion of domain actions
  • It is possible to evaluate an interlingua for
  • Reliability
  • Expressivity
  • Scalability

82
How to have success with an interlingua in a
multi-site project
  • Keep it simple.
  • Periodically check for intercoder agreement.
  • Good documentation
  • Discussion board for developers
  • Know your language typology.