CS5545: Realisation - PowerPoint PPT Presentation

Provided by: computin7
Title: CS5545: Realisation


1
CS5545: Realisation
  • Background Reading: Building Natural Language
    Generation Systems, chap 6

2
Realisation
  • Third (last) NLG stage
  • Generate actual text
  • Take care of details of language
  • Syntactic details
  • E.g., agreement (the dog runs vs the dogs run)
  • Morphological details
  • E.g., plurals (dog/dogs vs box/boxes)
  • Presentation details
  • E.g., fit to 80-column width

3
Realisation
  • Problem: There are lots of finicky details of
    language which most people developing NLG systems
    don't want to worry about
  • Solution: Automate this using a realiser

4
Syntax
  • Sentences must obey the rules of English grammar
  • Grammar specifies the order in which words appear,
    which extra function words are needed, and which
    word forms to use
  • Many aspects of grammar are somewhat bizarre

5
Syntactic Details: Verb Group
  • Verb group is the main verb plus helping words
    (auxiliaries).
  • Encodes information in fairly bizarre ways, e.g.
    tense
  • John will watch TV (future: add will)
  • John watches TV (present: -s form of verb for
    third-person singular subjects)
  • John is watching TV (progressive: form of BE
    verb, plus -ing form of verb)
  • John watched TV (past: use -ed form of verb)

6
Verb group
  • Negation
  • Usually add not after the first word of the verb
    group
  • John will not watch TV
  • Exception: add do not before a one-word VG
  • Inflect do, and use the infinitive form of the
    main verb
  • John did not watch TV
  • Exception to the exception: use the first strategy
    if the verb is a form of BE
  • John is not happy

7
Realiser
  • Just tell the realiser the verb, the tense, and
    whether it is negated, and it will figure out the VG
  • (watch, future) -> will watch
  • (watch, past, negated) -> did not watch
  • Etc
  • Similarly automate other obscure encodings of
    information
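
The verb-group rules from the previous slides can be sketched in a few lines of Java. This is only an illustration, not the simplenlg implementation: the class VerbGroupSketch, its Tense enum, and the tiny irregular-verb table are hypothetical names introduced here, and the sketch assumes a third-person singular subject.

```java
import java.util.Map;

// Hypothetical sketch (not the simplenlg API): builds an English verb group
// from (verb, tense, negated), assuming a third-person singular subject.
class VerbGroupSketch {
    enum Tense { PAST, PRESENT, FUTURE }

    // Tiny irregular-past table for illustration; a real lexicon is far larger.
    private static final Map<String, String> IRREGULAR_PAST =
            Map.of("eat", "ate", "see", "saw", "go", "went");

    static String realise(String verb, Tense tense, boolean negated) {
        if (verb.equals("be")) {
            return realiseBe(tense, negated);
        }
        switch (tense) {
            case FUTURE:
                // "not" goes after the first word of the verb group.
                return negated ? "will not " + verb : "will " + verb;
            case PAST:
                // One-word VG: insert inflected "do" and use the infinitive.
                return negated ? "did not " + verb : pastForm(verb);
            default:
                return negated ? "does not " + verb : thirdPersonSingular(verb);
        }
    }

    // BE is the exception to the exception: "not" simply follows the verb.
    private static String realiseBe(Tense tense, boolean negated) {
        switch (tense) {
            case FUTURE:
                return negated ? "will not be" : "will be";
            case PAST:
                return negated ? "was not" : "was";
            default:
                return negated ? "is not" : "is";
        }
    }

    // Regular past adds -ed (or -d after a final e); ignores consonant doubling.
    private static String pastForm(String verb) {
        String irregular = IRREGULAR_PAST.get(verb);
        if (irregular != null) {
            return irregular;
        }
        return verb.endsWith("e") ? verb + "d" : verb + "ed";
    }

    // Adds -es after s, x, z, ch, sh; otherwise -s.
    private static String thirdPersonSingular(String verb) {
        return verb.matches(".*(s|x|z|ch|sh)") ? verb + "es" : verb + "s";
    }
}
```

For example, VerbGroupSketch.realise("watch", VerbGroupSketch.Tense.PAST, true) returns "did not watch", matching the second case on the slide.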

8
Morphology
  • Words have different forms
  • Nouns have plural
  • Dog, dogs
  • Verbs have a third-person singular form and participles
  • Watch, watches, watching, watched
  • Adjectives have comparative, superlative
  • Big, bigger, biggest

9
Formation of variants
  • Example: plural
  • Usually add s (dogs)
  • But add es if base noun ends in certain letters
    (boxes, guesses)
  • Also change a final y after a consonant to i
    before adding es (tries)
  • Many special cases
  • children (vs childs), people (vs persons), etc

10
Realiser
  • Calculates variants automatically
  • (dog, plural) -> dogs
  • (box, plural) -> boxes
  • (child, plural) -> children
  • etc
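
A minimal Java sketch of the plural rules above, to show how a realiser might combine an exception table with suffix rules. The class PluralSketch and its two-entry irregular table are assumptions made for illustration; they are not part of simplenlg, and a real lexicon covers many more special cases.

```java
import java.util.Map;

// Hypothetical sketch (not the simplenlg Lexicon): pluralises English nouns.
class PluralSketch {
    // Special cases are looked up first; this table is deliberately tiny.
    private static final Map<String, String> IRREGULAR =
            Map.of("child", "children", "person", "people");

    static String plural(String noun) {
        String irregular = IRREGULAR.get(noun);
        if (irregular != null) {
            return irregular;
        }
        // Add -es after s, x, z, ch, sh (boxes, guesses).
        if (noun.matches(".*(s|x|z|ch|sh)")) {
            return noun + "es";
        }
        // Consonant + y becomes -ies (tries); vowel + y just takes -s (days).
        if (noun.matches(".*[^aeiou]y")) {
            return noun.substring(0, noun.length() - 1) + "ies";
        }
        // Default rule: add -s (dogs).
        return noun + "s";
    }
}
```

With this sketch, PluralSketch.plural("box") gives "boxes" and PluralSketch.plural("child") gives "children".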

11
Punctuation
  • Rules for structures
  • Sentences have the first word capitalised and end
    in a full stop
  • My dog ate the bone, the meat, etc.
  • Lists have a conjunction (e.g., and) between the
    last two elements, and commas between the others
  • I saw Tom, Sue, Zoe, and Ciaran at the meeting.
  • Etc
  • Realiser can automatically insert appropriate
    punctuation for a structure

12
Punctuation
  • Rules on combinations of punctuation
  • Don't add a full stop if the sentence already ends
    in a full stop
  • He lives in Washington D.C. (correct)
  • He lives in Washington D.C.. (incorrect)
  • Brackets absorb some full stops
  • John lives in Aberdeen (he used to live in
    Edinburgh). (correct)
  • John lives in Aberdeen (he used to live in
    Edinburgh.). (incorrect)
  • Again, the realiser can automate this
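
The punctuation rules on the last two slides can be sketched as two small helpers. This is an illustrative sketch only: PunctuationSketch, joinList, and toSentence are hypothetical names, not simplenlg methods, and the sketch handles only the cases shown on the slides.

```java
import java.util.List;

// Hypothetical sketch (not the simplenlg API): list and sentence punctuation.
class PunctuationSketch {
    // Commas between elements, with a conjunction before the last one.
    static String joinList(List<String> items) {
        int n = items.size();
        if (n == 0) {
            return "";
        }
        if (n == 1) {
            return items.get(0);
        }
        if (n == 2) {
            return items.get(0) + " and " + items.get(1);
        }
        return String.join(", ", items.subList(0, n - 1)) + ", and " + items.get(n - 1);
    }

    // Capitalises the first letter and ensures exactly one final full stop.
    static String toSentence(String text) {
        if (text.isEmpty()) {
            return text;
        }
        String s = Character.toUpperCase(text.charAt(0)) + text.substring(1);
        // A closing bracket absorbs a full stop that sits just inside it.
        if (s.endsWith(".)")) {
            s = s.substring(0, s.length() - 2) + ")";
        }
        // Do not add a second full stop after one that is already there.
        return s.endsWith(".") ? s : s + ".";
    }
}
```

For example, toSentence("he lives in Washington D.C.") returns "He lives in Washington D.C." with no doubled full stop.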

13
Pouring
  • Usually we insert spaces between tokens, but not
    always
  • My dog (correct)
  • Mydog (incorrect)
  • I saw John, and said hello. (correct)
  • I saw John , and said hello . (incorrect)
  • Automated by the realiser
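
A possible sketch of the spacing rule: put a space before every token except punctuation that attaches to the preceding word. The class SpacingSketch and the set of attaching punctuation marks are assumptions chosen for this example, not simplenlg behaviour.

```java
import java.util.List;

// Hypothetical sketch (not the simplenlg API): joins tokens with spaces,
// except before punctuation that attaches to the previous token.
class SpacingSketch {
    static String join(List<String> tokens) {
        StringBuilder out = new StringBuilder();
        for (String token : tokens) {
            // Commas, full stops, closing brackets, etc. take no leading space.
            boolean attachesLeft = token.matches("[,.;:!?)]");
            // Nothing is inserted directly after an opening bracket either.
            boolean afterOpenBracket =
                    out.length() > 0 && out.charAt(out.length() - 1) == '(';
            if (out.length() > 0 && !attachesLeft && !afterOpenBracket) {
                out.append(' ');
            }
            out.append(token);
        }
        return out.toString();
    }
}
```

Calling SpacingSketch.join(List.of("I", "saw", "John", ",", "and", "said", "hello", ".")) yields "I saw John, and said hello."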

14
Pouring
  • Often want to insert line breaks to make text fit
    into a page of given width
  • Breaks should go between words if possible
  • Breaks should go between words if poss
  • ible (incorrect: the break falls inside a word)
  • If not possible, break between syllables and add
    a hyphen
  • Realiser automates
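
Line breaking between words can be sketched as a greedy fill. This is a simplified illustration with hypothetical names (WrapSketch, wrap); it does not hyphenate, so a single word longer than the page width is left on a line of its own rather than broken between syllables.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the simplenlg API): greedy word wrapping.
class WrapSketch {
    static List<String> wrap(String text, int width) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            // Start a new line if adding this word would exceed the width.
            if (line.length() > 0 && line.length() + 1 + word.length() > width) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}
```

With a width of 20, the text "Breaks should go between words if possible" is split into "Breaks should go", "between words if", and "possible".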

15
Output formatting
  • Many possible output formats
  • Simple text
  • HTML
  • RTF (MS Word)
  • Realiser can automatically add the appropriate
    markup for each format
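
As a rough illustration of format-specific markup, the sketch below emits the same paragraph either as plain text or as HTML. MarkupSketch, its Format enum, and the paragraph method are hypothetical and cover only two of the formats listed; RTF is omitted.

```java
// Hypothetical sketch (not the simplenlg API): one paragraph, two output formats.
class MarkupSketch {
    enum Format { TEXT, HTML }

    static String paragraph(String text, Format format) {
        if (format == Format.HTML) {
            // Escape characters that are special in HTML, then wrap in <p>.
            String escaped = text.replace("&", "&amp;")
                                 .replace("<", "&lt;")
                                 .replace(">", "&gt;");
            return "<p>" + escaped + "</p>";
        }
        // Plain text: the paragraph simply ends with a newline.
        return text + "\n";
    }
}
```

The caller chooses the format once, and the generated text itself does not need to change.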

16
Summary
  • Realiser automates the finicky details of
    language
  • So the NLG developer doesn't have to worry about
    these
  • One of the advantages of NLG

17
simplenlg classes
  • Realiser: realises objects
  • SyntaxPhraseSpec
  • SPhraseSpec: represents a sentence as verb,
    subjects, complements
  • NPPhraseSpec: represents a noun phrase as
    determiner, modifiers, noun
  • Can be subject or complement in SPhraseSpec
  • PPPhraseSpec: represents a prepositional phrase
  • Can be modifier in SPhraseSpec, NPPhraseSpec

18
Realiser API
  • class Realiser
  • public Realiser() // default realiser (text)
  • public Realiser(boolean HTML) // HTML output?
  • public String realise(Object spec)
  • Usage
  • // create text spec ts
  • Realiser r = new Realiser();
  • String output = r.realise(ts);

19
SPhraseSpec example
  • SPhraseSpec p = new SPhraseSpec();
  • p.addSubject("my dog");
  • p.addSubject("your cat");
  • p.addVerb("like");
  • p.addComplement("balls");
  • p.setTense(SPhraseSpec.Tense.FUTURE);
  • p.setNegated(true);
  • Results in
  • My dog and your cat will not like balls.

20
NPPhraseSpec example
  • NPPhraseSpec subject = new NPPhraseSpec("dog");
  • subject.addModifier("big");
  • subject.addDeterminer("the");
  • Results in
  • the big dog

21
PPPhraseSpec example
  • PPPhraseSpec pp = new PPPhraseSpec("in", "the
    park");
  • SPhraseSpec p = new SPhraseSpec("I", "be");
  • p.addModifier(pp);
  • Results in
  • I am in the park

22
Lexicon
  • Lexicon class allows you to directly do
    morphology
  • Usually don't need to call it explicitly

23
Lexicon example
  • Lexicon lex = new Lexicon();
  • System.out.println(lex.getPlural("child"));
  • System.out.println(lex.getPast("eat"));
  • System.out.println(lex.getPastParticiple("eat"));
  • System.out.println(lex.getSuperlative("happy"));
  • This will print out
  • children
  • ate
  • eaten
  • happiest

24
Other feature: Lexicaliser
  • Lexicaliser (different from Lexicon) allows you
    to specify templates which produce SPhraseSpec
    from Protégé instances.
  • Won't discuss here; see the documentation

25
simplenlg
  • Still being developed; currently working on
  • Referring expressions
  • Better documentation
  • Let me know if you have any comments or
    suggestions!