Microplanning: Sentence planning, Part 1 — PowerPoint presentation transcript
1
Microplanning (Sentence planning), Part 1
  • Kees van Deemter

2
Natural Language Generation
  • Taking some computer-readable gibberish
  • Translating it into proper English
  • Applications include:
  • dialogue/chat systems
  • on-line help
  • summarisation
  • document authoring

3
NLG Tasks (as explained by Anja)
  • Content determination: decide what to say;
    construct a set of messages
  • Discourse planning: ordering, structuring;
    concepts, rhetorical relationships
  • Sentence aggregation: divide content into
    sentences; construct sentence plans
  • Lexicalisation: map concepts and relations to
    lexemes (≈ words)
  • Referring expression generation: decide how to
    refer to objects
  • Linguistic realisation: put it all together in
    acceptable words and sentences

4
Modular structure of NLG systems (in theory!)
TEXT PLANNER: Content determination, Discourse planning
SENTENCE PLANNER / MICROPLANNER: Sentence aggregation, Lexicalisation, Referring expressions
REALISER: Realisation
5
Last week: Input to realisation
  • message-id: msg02
  • relation: C_DEPARTURE
  • args:
    • departing-entity: C_CALEDON-EXPRESS
    • departure-location: C_ABERDEEN
    • departure-time: C_1000
    • departure-platform: C_7

6
Microplanning 1: Aggregation
  • Distributing information over different
    sentences. Example:
  • a. The Caledonian express departs Aberdeen at
    10:00, from platform 7
  • b. The Caledonian express departs Aberdeen at
    10:00. The Caledonian express departs from
    platform 7

7
Microplanning 2: GRE
  • GRE: Generation of Referring Expressions
  • Explaining which objects you're talking about
  • a. The Caledonian express departs Aberdeen at
    10:00, from platform 7
  • b. The Caledonian express departs -- at
    10:00. The train departs from this platform

8
Microplanning 3: Lexical choice
  • Using different words for the same concept
  • a. The Caledonian express departs Aberdeen at
    ten o'clock, from platform 7
  • b. The Caledonian express departs Aberdeen
    at ten. The Caledonian express leaves from
    platform 7

9
In practice, tasks can be performed in a different
order
  • Example: aggregation can be performed on
    messages

10
  • message-id: msg02
  • relation: C_DEPARTURE_1
  • departing-entity: C_CALEDON-EXPRESS
  • args: departure-location: C_ABERDEEN
  • departure-time: C_1000

  • message-id: msg03
  • relation: C_DEPARTURE_2
  • args: departure-entity: C_CALEDON-EXPRESS
  • departure-platform: C_7
11
  • Aggregation can also be performed after
    realisation:
  • The Caledonian express departs Aberdeen
    at 10:00 from platform 7
  • ⇒
  • The Caledonian express departs Aberdeen
    at 10:00. The Caledonian express departs
    from platform 7

12
Let's focus on GRE, but ...
  • A little detour: NLG systems do not always work
    as you've been told
  • Some practically deployed systems combine canned
    text with NLG
  • One possibility: the system has a library of language
    templates, with gaps that need to be filled.
    E.g.,

13
  • TRAIN departs TOWN at TIME
  • TRAIN departs TOWN from PLATFORM
  • We apologise for the fact that TRAIN is
    delayed by AMOUNT
  • Question: which of the other tasks are still
    relevant?
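The template idea above can be sketched as simple slot filling; the template texts are from the slide, while the filling mechanism and slot syntax are my assumptions:

```python
# A minimal sketch of template-based generation: canned strings with gaps.
# Template wording follows the slide; [SLOT] markers are an assumption.
templates = {
    "departure": "[TRAIN] departs [TOWN] at [TIME]",
    "apology": "We apologise for the fact that [TRAIN] is delayed by [AMOUNT]",
}

def fill(template_name, slots):
    """Replace each [SLOT] gap with its value."""
    text = templates[template_name]
    for slot, value in slots.items():
        text = text.replace(f"[{slot}]", value)
    return text

print(fill("departure", {"TRAIN": "the Caledonian express",
                         "TOWN": "Aberdeen", "TIME": "10:00"}))
```

Note that referring expression generation is still needed to decide what fills the TRAIN gap.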

14
Let's move on to GRE
  • Why/when is GRE useful?

15
  • 1. The referent has a familiar name, but it's not
    unique, e.g., John Smith
  • 2. The referent has no familiar name: trains,
    furniture, trees, atomic particles, ...
  • (Databases use keys, e.g.,
    Smith73527, TRAIN-3821)
  • 3. Similar sets of objects
  • 4. NL is too economical to have names for
    everything

16
Last week: Input to realisation
  • message-id: msg02
  • relation: C_DEPARTURE
  • args:
    • departing-entity: C_CALEDON-EXPRESS
    • departure-location: C_ABERDEEN
    • departure-time: C_1000

18
This week: more realistic input
  • message-id: msg02
  • relation: C_DEPARTURE
  • departing-entity: C_34435
  • args: departure-location: .....
  • departure-time: .....

Possible realisations: 'the Caledonian (express)', 'the Aberdeen-Glasgow
express', 'the blue train on your left', 'the
train'
19
  • Communication is about telling the truth ...
  • but that's not all there is to it
  • Paul Grice (around 1970): principles of rational,
    cooperative communication
  • GRE is a good case study (R. Dale and E. Reiter,
    Cognitive Science, 1995)

20
Grice's maxims of conversation
  • Quality: only say what you know to be true
  • Quantity: give enough but not too much
    information
  • Relevance: be relevant
  • Manner: be clear and brief
  • (There is overlap between these four)

21
Maxims are a two-edged sword
  • 1. They say how one should normally speak/write.
    Example:
  • 'Yes, there's a gasoline station around the
    corner' (when it's no longer operational)
  • quality: yes, it's true
  • quantity: probably yes
  • relevance: no, not relevant to the hearer's
    intentions
  • manner: it's brief, clear, etc.

22
Maxims are a two-edged sword
  • 2. They can also be exploited. Example:
  • Asked to write an academic reference: 'Kees
    always came to my lectures and he's a nice guy'
  • quality: yes, it's true (let's assume)
  • quantity: no -- how about academic achievements?
  • relevance: yes
  • manner: yes

23
  • Application to GRE
  • Dale & Reiter: the best description of an object
    fulfils the Gricean maxims. E.g.,
  • (Quality) list properties truthfully
  • (Quantity) use properties that allow
    identification, without containing more info
  • (Relevance) use properties that are of
    interest in the situation
  • (Manner) be brief

24
D&R's expectation
  • Violation of a maxim leads to implicatures.
  • For example,
  • [Quantity] 'the pitbull' (when there is only
    one dog)
  • [Manner] 'Get the cordless drill that's in the
    toolbox' (Appelt)
  • There's just one problem:

25
People don't always speak this way
  • For example,
  • [Manner] 'the red chair' (when there is only one
    red object in the domain)
  • [Manner/Quantity] 'I broke my arm' (when I have
    two)
  • General: empirical work shows much redundancy
  • Similar for other maxims, e.g.,
  • [Quality] 'the man with the martini' (Donnellan)

26
Example Situation
(Figure not preserved: five pieces of furniture, labelled a-e, with prices
and origins -- a: 100, b: 150, c: 100, d: 150, e: ?; some Swedish, some
Italian. The KB on the next slide formalizes the picture.)
27
Formalized in a KB
  • Type: furniture (abcde), desk (ab), chair (cde)
  • Origin: Sweden (ac), Italy (bde)
  • Colours: dark (ade), light (bc), grey (a)
  • Price: 100 (ac), 150 (bd), 250 (none)
  • Contains: wood (none), metal (abcde), cotton (d)
  • Assumption: all this is shared knowledge.

28
Game
  • 1. Describe object a.
  • 2. Describe object e.
  • 3. Describe object d.

29
Game
  • 1. Describe object a: desk, Sweden, grey
  • 2. Describe object e: no solution
  • 3. Describe object d: Italy, 150

30
Violations of the maxims
  • Manner:
  • 'The 100 grey Swedish desk which is made of
    metal'
  • (description of a)
  • Relevance:
  • 'The cotton chair is a fire hazard?'
  • '?Then why not buy the Swedish chair?'
  • (descriptions of d and c respectively)

31
  • In fact, there is a second problem with
    Quantity/Manner. Consider the following
    formalization:
  • Full Brevity: never use more than the minimal
    number of properties required for identification
    (Dale 1989)
  • An algorithm:

32
  • Dale 1989:
  • Check whether 1 property is enough
  • Check whether 2 properties are enough
  • ...
  • Etc., until
  • success: a minimal description is generated, or
  • failure: no description is possible
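The Full Brevity search can be sketched as a brute-force loop over property combinations; a minimal sketch using the KB of slide 27 (the property names are from the slides, the encoding is my assumption):

```python
from itertools import combinations

# The KB from slide 27: property name -> extension (set of objects).
KB = {
    "furniture": set("abcde"), "desk": set("ab"), "chair": set("cde"),
    "sweden": set("ac"), "italy": set("bde"),
    "dark": set("ade"), "light": set("bc"), "grey": set("a"),
    "100": set("ac"), "150": set("bd"), "250": set(),
    "wood": set(), "metal": set("abcde"), "cotton": set("d"),
}

def full_brevity(target, kb=KB):
    """Try 1-property descriptions, then 2, ... (Dale 1989).
    Worst case this inspects all 2^n combinations of the n properties."""
    props = list(kb)
    for size in range(1, len(props) + 1):
        for combo in combinations(props, size):
            extensions = [kb[p] for p in combo]
            if set.intersection(*extensions) == {target}:
                return list(combo)  # minimal description found
    return None  # no distinguishing description exists

print(full_brevity("a"))  # 'grey' alone already singles out a
print(full_brevity("e"))  # None: e cannot be distinguished with these properties
```

The nested loop makes the exponential cost visible: for object e every one of the 2^14 - 1 combinations is checked before failure is reported.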

33
Problem: exponential complexity
  • Worst case, this algorithm would have to inspect
    all combinations of properties: n properties gives
    2^n combinations.
  • Recall: one grain of rice on square one, twice
    as many on each subsequent square.
  • Some algorithms may be faster, but ...
  • Theoretical result: any such algorithm must be
    exponential in the number of properties.

34
  • D&R conclude that Full Brevity cannot be
    achieved in practice.
  • They designed an algorithm that only approximates
    Full Brevity: the Incremental Algorithm (next
    time)

35
Microplanning (Sentence planning), Part 2
  • Kees van Deemter

36
Exercise
  • Two issues to think about:
  • 1. Which NLG tasks does a template-based system
    need to perform?
  • 2. How would you set up a GRE program to resolve
    the issues that have come up so far?

37
  • Ex. 1. Earlier examples of templates
  • TRAIN departs TOWN at TIME
  • TRAIN departs TOWN from PLATFORM
  • We apologise for the fact that TRAIN is
    delayed by AMOUNT

38
  • Simple extreme: use canned text
  • C_3445: 'the Caledonian Express'
  • C_3446: 'the Gatwick Express'
  • C_327: 'the train from B'ton to Hastings'
  • Sophisticated extreme: use full NLG

39
Ex. 2. Some problems in GRE
  • Producing a distinguishing description whenever
    one exists
  • Honouring Grice's maxims ...
  • ... except where people don't
  • Doing everything in a computationally feasible
    way
  • We'll show you Dale & Reiter's response.

40
  • Incremental Algorithm (informal)
  • Properties are considered in a fixed order
    P
  • A property is included if it is 'useful':
    true of the target, false of some distractors
  • Stop when 'done', so earlier properties have a
    greater chance of being included (e.g., a
    perceptually salient property)
  • Therefore called a preference order.

41
  • r: the individual to be described
    (target referent)
  • P: the list of properties, in preference order
  • P: a property
  • L: the list of properties in the description
  • (Recall: we're not worried about realization
    today)

42
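The formal algorithm on this slide is not preserved in the transcript; a minimal Python sketch following the informal description above, using the KB of slide 27 (the particular preference order is my choice):

```python
# Dale & Reiter's Incremental Algorithm, sketched from the informal
# description on the previous slide. The KB is from slide 27; the
# preference order below is an assumption.
KB = {
    "furniture": set("abcde"), "desk": set("ab"), "chair": set("cde"),
    "sweden": set("ac"), "italy": set("bde"),
    "dark": set("ade"), "light": set("bc"), "grey": set("a"),
    "100": set("ac"), "150": set("bd"),
    "metal": set("abcde"), "cotton": set("d"),
}
PREFERENCE = ["furniture", "desk", "chair", "sweden", "italy",
              "dark", "light", "grey", "100", "150", "metal", "cotton"]

def incremental(r, domain=set("abcde"), kb=KB, preference=PREFERENCE):
    """Return the list L of properties describing target r, or None."""
    L = []
    distractors = domain - {r}
    for p in preference:                        # fixed preference order
        if r in kb[p] and distractors - kb[p]:  # true of r, false of a distractor
            L.append(p)
            distractors &= kb[p]                # rule out excluded distractors
        if not distractors:
            return L                            # done: r uniquely identified
    return None                                 # failure: distractors remain

print(incremental("a"))  # ['desk', 'sweden']
print(incremental("e"))  # None -- as in the game on slide 29
```

Each property is inspected at most once, which is where the linear O(m) behaviour on the next slide comes from.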

43
Properties of the IA
  • Hill climbing: better and better approximations
    of the target
  • Linear complexity:
  • if m = number of properties,
  • then O(m) actions are needed in the worst
    case
  • Can do smart things with the preference order
  • (think of properties that are hard to test)

44
Back to the KB
  • Type: furniture (abcde), desk (ab), chair (cde)
  • Origin: Sweden (ac), Italy (bde)
  • Colours: dark (ade), light (bc), grey (a)
  • Price: 100 (ac), 150 (bd), 250 (none)
  • Contains: wood (none), metal (abcde), cotton (d)
  • Assumption: all this is shared knowledge.

45
Back to our game
  • 1. Describe object a.
  • 3. Describe object d.
  • Can you see room for improvement?

46
Improving the Incremental Algorithm
  • 1. Absent nouns. The IA may fail to include a
    property that can be expressed through a noun.
    E.g., L = <grey>
  • Solution:
  • -- Make noun-ish properties most preferred
  • -- Last line of the algorithm: if no noun-ish
    properties have been included, add one

47
Improving the Incremental Algorithm
  • 2. Relations between properties
  • Add f to the domain. Now,
    to describe a, the IA includes these properties:
  • furniture → abcde
  • desk → ab
  • Sweden → a
  • After lexicalisation and realisation, e.g.
  • 'the piece of furniture that's a Swedish desk'

48
Relations between properties
  • Dale and Reiter proposed to take the
    attribute-value structure into account
  • Type: furniture (abcde), desk (ab), chair (cde)
  • -- For a given attribute, use the value that
    removes the most distractors
  • -- If there is a tie, use the most general value

49
Relations between properties
  • Type: furniture (abcde), desk (ab),
    chair (cde)
  • -- To refer to a when C = {a,b,c,d,e,f}:
  • 'desk' is better than 'furniture'
  • (desk removes c,d,e,f; furniture removes f)
  • -- To refer to a when C = {a,b,f}:
  • 'furniture' is better than 'desk'
  • (furniture removes f; desk removes f -- a tie,
    so the more general value wins)
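The two value-selection rules above can be sketched as a small helper; a minimal sketch, assuming values are listed from most general to most specific:

```python
# D&R's value selection for one attribute: pick the value that removes
# the most distractors; on a tie, keep the most general value.
# 'values' is ordered most general -> most specific (an assumption).
TYPE = [("furniture", set("abcde")), ("desk", set("ab")), ("chair", set("cde"))]

def best_value(r, distractors, values):
    best, best_removed = None, -1
    for name, ext in values:           # general values are seen first
        if r not in ext:
            continue                   # the value must be true of the target
        removed = len(distractors - ext)
        if removed > best_removed:     # strictly better: more distractors removed
            best, best_removed = name, removed
    return best                        # ties keep the earlier (more general) value

# Referring to a among C = {a,b,c,d,e,f}: 'desk' removes c,d,e,f
print(best_value("a", set("bcdef"), TYPE))
# Referring to a among C = {a,b,f}: tie (both remove only f) -> 'furniture'
print(best_value("a", set("bf"), TYPE))
```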

50
Improving the Incremental Algorithm
  • 3. Negations. Try to describe object e. The IA
    fails:
  • furniture → abcde
  • chair → cde
  • Italy → de
  • (dark) → de
  • (metal) → de
  • It might have succeeded:
  • Italy → bde
  • not-150 → e

51
Adding negations where useful
  • Type: furniture (abcde), desk (ab), chair
    (cde)
  • Origin: Sweden (ac), Italy (bde)
  • Colours: dark (ade), light (bc), grey (a),
    not-grey (bcde)
  • Price: 100 (ac), 150 (bd), 250 (none),
    not-150 (ace)
  • Contains: wood (none), metal (abcde), cotton (d),
    not-cotton (abce)

53
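The content of this slide is not preserved in the transcript; the 'adding negations' idea above can be sketched as a preprocessing step on the KB (the criterion for which negations to add is my assumption):

```python
# Slide 51's idea as code: add not-P for each property P, with the
# complement as its extension. Which negations to add is a design
# choice; here only informative ones are kept.
DOMAIN = set("abcde")
KB = {
    "furniture": set("abcde"), "desk": set("ab"), "chair": set("cde"),
    "sweden": set("ac"), "italy": set("bde"),
    "dark": set("ade"), "light": set("bc"), "grey": set("a"),
    "100": set("ac"), "150": set("bd"),
    "metal": set("abcde"), "cotton": set("d"),
}

def add_negations(kb, domain=DOMAIN):
    """Return kb extended with not-P entries where the negation is
    informative (neither empty nor true of the whole domain)."""
    out = dict(kb)
    for p, ext in kb.items():
        neg = domain - ext
        if neg and neg != domain:
            out["not-" + p] = neg
    return out

kb2 = add_negations(KB)
print(kb2["not-grey"])  # {b, c, d, e} -- as on slide 51
print(kb2["not-150"])   # {a, c, e}: now 'Italy' plus 'not-150' pins down e
```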

54
Improving the Incremental Algorithm
  • 4. Referring to sets. Try to describe the set
    {d,e}. No problem!
  • The algorithm was designed for reference to
    individuals, but it can be reused for sets:
  • chair → {c,d,e}
  • Italy → {d,e}
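The reuse for sets amounts to one change: a property is only usable if it is true of every target. A minimal sketch (same KB and preference order as before, both taken from the slides with my encoding):

```python
# The Incremental Algorithm reused for a set of targets (slide 54).
KB = {
    "furniture": set("abcde"), "desk": set("ab"), "chair": set("cde"),
    "sweden": set("ac"), "italy": set("bde"),
    "dark": set("ade"), "light": set("bc"), "grey": set("a"),
    "100": set("ac"), "150": set("bd"),
    "metal": set("abcde"), "cotton": set("d"),
}
PREFERENCE = ["furniture", "desk", "chair", "sweden", "italy",
              "dark", "light", "grey", "100", "150", "metal", "cotton"]

def incremental_sets(targets, domain=set("abcde"), kb=KB, preference=PREFERENCE):
    L = []
    distractors = domain - targets
    for p in preference:
        # true of ALL targets, and rules out at least one distractor
        if targets <= kb[p] and distractors - kb[p]:
            L.append(p)
            distractors &= kb[p]
        if not distractors:
            return L
    return None

print(incremental_sets({"d", "e"}))  # ['chair', 'italy'] -- as on the slide
```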

55
  • Type: furniture (abcde), desk (ab), chair
    (cde)
  • Origin: Sweden (ac), Italy (bde)
  • Colours: dark (ade), light (bc), grey (a),
    not-grey (bcde)
  • Price: 100 (ac), 150 (bd), 250 (none),
    not-150 (ace)
  • Contains: wood (none), metal (abcde), cotton (d),
    not-cotton (abce)

56
Improving the Incremental Algorithm
  • 5. Other logical operations. Try to describe
    the set {b,c,d}
  • It's not possible (even with negation):
  • furniture → {a,b,c,d,e}
  • not-grey → {b,c,d,e}
  • Yet, a description might have been possible:
  • 'light or 150' → {b,c} ∪ {b,d} = {b,c,d}

57
Some other limitations
  • How might one generate 'the small mouse', 'the
    large mouse' (where large and small are context
    dependent)?
  • How could you generate non-literal descriptions,
    e.g., 'she's married to the yellow spectacles'?
  • Recursive descriptions: 'the mother of
    the father of the brother of ... of my
    neighbour'
  • And so on ... We'll focus on one more issue:
    context

58
Sometimes no description is needed
  • 'The ..... leaves at 10. It departs from ...'
    (earlier example)
  • Let every xi be a referring expression:
  • ....x1....x2.....x3.......
    ..........x2....x2....x1..
    ....x1....x4.....x5.......
  • Full definite descriptions are one option among
    many

59
Some different ways of referring
  • 'the train from Aberdeen to Glasgow'
    (description)
  • 'the Caledonian Express' (proper name)
  • 'this train' (demonstrative)
  • 'that train over there' (demonstrative)
  • 'it' (pronoun)

60
Category choice
  • Choosing between pronouns, demonstratives,
    descriptions, proper names, etc.
  • Theories about category choice are often studied
    using corpora, via hypothesis testing or
    learning.
  • Salience is a key concept, which takes a
    different form in different theories

61
Salience is difficult to define
  • Talking about John may make him salient:
  • John ... Henry ... Henry ... John ...
    John ... John ... John
  • John's physical proximity may make him salient
  • salient/important/striking/attention-catching/...
    measured in different ways

62
Henschel, Cheng & Poesio (2000) (just to give you
the flavour)
  • Choose a pronoun if:
  • the antecedent is realized as subject or is
    discourse-old
  • no competing referent is realized as subject or is
    discourse-old
  • no competing referent is amplified by an appositive
    or a restrictive relative clause
  • Otherwise choose a definite description
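The rule above can be written down as a boolean check; a minimal sketch, where the representation of referents (the attribute names below) is my assumption:

```python
# Henschel, Cheng & Poesio's category-choice rule, as stated on the slide.
# Referents are modelled as dicts with boolean flags (an assumption).
def choose_category(antecedent, competitors):
    """Return 'pronoun' or 'definite description'."""
    def prominent(x):
        # realized as subject, or discourse-old
        return x.get("subject", False) or x.get("discourse_old", False)

    if (prominent(antecedent)
            and not any(prominent(c) for c in competitors)
            and not any(c.get("amplified", False) for c in competitors)):
        return "pronoun"
    return "definite description"

ante = {"subject": True}
print(choose_category(ante, [{"subject": False}]))        # pronoun
print(choose_category(ante, [{"discourse_old": True}]))   # definite description
```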

63
  • In this lecture, we ignore category choice,
    focussing on the generation of definite
    descriptions.
  • So far, we have also ignored salience, even
    though it is also relevant for definite
    descriptions

64
Salience in GRE
  • Book: Reiter and Dale (2000), Building Natural
    Language Generation Systems:
  • restrict the Domain to elements that are salient enough
  • Krahmer and Theune (2002):
  • this disregards different degrees of salience
    within the Domain
  • this fails to reflect that even the least salient
    object can be referable

65
Degrees of salience
  • If our chihuahua is the most salient object in
    the Domain, then 'the dog' refers to it.
  • If our chihuahua is the least salient object in
    the Domain, then we might refer to him saying 'the
    small ratty creature that's trying to hide behind
    the chair'

66
Krahmer and Theune (2002)
  • 'the dog' = 'the most salient dog'
  • This assumption is exploited by re-interpreting
    the Domain (in the Incremental Algorithm):
    restrict it to the objects that are at least as
    salient as the target referent

67

68
SalMax = {a,c}, SalMid = {b}, SalMin = {d,e}
  • Type: furniture (abcde), desk (ab), chair (cde)
  • Origin: Sweden (ac), Italy (bde)
  • Colours: dark (ade), light (bc), brown (a)
  • Price: 100 (ac), 150 (bd), 250 (none)
  • Contains: wood (none), metal (abcde), cotton (d)
  • Exercise: Describe a; describe b; describe d

69
SalMax = {a,c}, SalMid = {b}, SalMin = {d,e}
  • Type: furniture (abcde), desk (ab), chair (cde)
  • Origin: Sweden (ac), Italy (bde)
  • Colours: dark (ade), light (bc), brown (a)
  • Price: 100 (ac), 150 (bd), 250 (none)
  • Contains: wood (none), metal (abcde), cotton (d)
  • a: Domain = {a,c}; description: desk
  • b: Domain = {a,b,c}; description: desk,
    Italy
  • d: Domain = {a,b,c,d,e}; description: chair,
    Italy, 150
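The Domain restriction behind these answers can be sketched in a few lines; the three salience levels follow slide 68 (SalMax > SalMid > SalMin), while the numeric encoding is my assumption:

```python
# Krahmer & Theune's move, sketched: before running the Incremental
# Algorithm, restrict the Domain to objects at least as salient as the
# target. Levels follow slide 68: SalMax = {a,c} > SalMid = {b} >
# SalMin = {d,e}; the numbers are an assumption.
SALIENCE = {"a": 3, "c": 3, "b": 2, "d": 1, "e": 1}

def restrict_domain(r, domain, salience=SALIENCE):
    """Keep only the objects as salient as the target r, or more."""
    return {x for x in domain if salience[x] >= salience[r]}

domain = set("abcde")
print(restrict_domain("a", domain))  # {'a', 'c'}
print(restrict_domain("b", domain))  # {'a', 'b', 'c'}
print(restrict_domain("d", domain))  # whole domain: the least salient object
```

Running the Incremental Algorithm on each restricted Domain yields exactly the descriptions listed above (desk; desk + Italy; chair + Italy + 150).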

70
  • Krahmer & Theune are agnostic about how salience
    is determined
  • They focus on textual salience:
  • ....x1....x2.....x3.......
    ..........x2....x2....x1..
    ....x1....x4.....x5.......
  • Also influenced by physical proximity (e.g.,
    'the door' = 'the nearest door')

71
Before abandoning GRE ...
  • Let's return to the pipeline architecture

72
Modular structure of NLG systems
TEXT PLANNER: Content determination, Discourse planning
SENTENCE PLANNER / MICROPLANNER: Sentence aggregation, Lexicalisation, Referring expressions
REALISER: Realisation
73
  • We have talked about Microplanning,
  • focussing on GRE
  • We've focussed even further, on determining the
    properties in a description
  • Who decides:
  • what words to use?
  • how to put them together? ('The chair that's from
    Italy and that contains wood')

74
  • One can see GRE as a microcosm of NLG, containing:
  • content determination (which properties?)
  • aggregation (which grouping?)
  • lexicalisation (which words?)
  • realisation (which linguistic construction?)

75
Next week
  • Multimodality