CSA3050: Natural Language Algorithms

Slides: 28
Provided by: MikeR2


1
CSA3050 Natural Language Algorithms
  • Finite State Devices

2
Sources
  • Blackburn & Striegnitz, Ch. 2

3
Parsers vs. Recognisers
  • Recognizers tell us whether a given input is
    accepted by some finite state automaton.
  • Often we would like to have an explanation of why
    it was accepted.
  • Parsers give us that kind of explanation.
  • What form does it take?

4
Finite State Parser
  • The output of a finite state parser is a sequence
    of nodes and arcs. If we gave the input
    [h,a,h,a,!] to a parser for our first laughing
    automaton, it should give us
    [1,h,2,a,3,h,2,a,3,!,4].
  • The technique in Prolog for turning a recognizer
    into a parser is to add one or more extra
    arguments to keep track of the structure that was
    found.

5
Base Case
  • Recogniser
    recognize1(Node, []) :-
        final(Node).
  • Parser
    parse1(Node, [], [Node]) :-
        final(Node).

6
Recursive Case
  • Recogniser
    recognize1(Node1, String) :-
        arc(Node1, Node2, Label),
        traverse1(Label, String, NewString),
        recognize1(Node2, NewString).
  • Parser
    parse1(Node1, String, [Node1,Label|Path]) :-
        arc(Node1, Node2, Label),
        traverse1(Label, String, NewString),
        parse1(Node2, NewString, Path).
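
The base and recursive cases can be put together into a small runnable program. This is a minimal sketch: the arcs of the laughing automaton are inferred from the path quoted earlier, and the three-argument traverse1 (consume one symbol equal to the arc label) is an assumption, since the slides do not show its definition.

```prolog
% Laughing automaton; arcs inferred from the path 1,h,2,a,3,h,2,a,3,!,4.
initial(1).
final(4).
arc(1,2,h).
arc(2,3,a).
arc(3,2,h).
arc(3,4,!).

% Assumed helper: consume one input symbol that equals the arc label.
traverse1(Label, [Label|Symbols], Symbols).

% Recogniser: succeeds if the string leads from Node to a final state.
recognize1(Node, []) :-
    final(Node).
recognize1(Node1, String) :-
    arc(Node1, Node2, Label),
    traverse1(Label, String, NewString),
    recognize1(Node2, NewString).

% Parser: same control flow, plus a third argument that records the path.
parse1(Node, [], [Node]) :-
    final(Node).
parse1(Node1, String, [Node1,Label|Path]) :-
    arc(Node1, Node2, Label),
    traverse1(Label, String, NewString),
    parse1(Node2, NewString, Path).
```

With these clauses, ?- parse1(1,[h,a,h,a,!],P). binds P to [1,h,2,a,3,h,2,a,3,!,4], while recognize1(1,[h,a,h,a,!]) simply succeeds.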

7
Complex Labels
  • So far we have only considered transitions with
    single-character labels.
  • More complex labels are possible e.g. symbols
    comprising several characters.
  • We can construct an FSA recognizing English noun
    phrases that can be built from the words: the,
    a, wizard, witch, broomstick, hermione, harry,
    ron, with, fast.

8
FSA for Noun Phrases

9
FSA for NPs in Prolog
  • initial(1).
    final(3).
    arc(1,2,a).       arc(1,2,the).
    arc(2,2,brave).   arc(2,2,fast).
    arc(2,3,witch).   arc(2,3,wizard).
    arc(2,3,broomstick).
    arc(2,3,rat).
    arc(1,3,harry).   arc(1,3,ron).
    arc(1,3,hermione).
    arc(3,1,with).

10
Parsing a Noun Phrase
  • testparse1(Symbols,Parse) :-
        initial(Node),
        parse1(Node,Symbols,Parse).
  • ?- testparse1([the,fast,wizard],Z).
  • Z = [1, the, 2, fast, 2, wizard, 3]
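
The noun-phrase FSA and the parser can be combined as follows. This is a sketch under one assumption: traverse1 is taken to consume a single word equal to the arc label, which the slides do not spell out. The arc from 3 back to 1 labelled with is what lets one noun phrase embed another.

```prolog
initial(1).
final(3).
arc(1,2,a).
arc(1,2,the).
arc(2,2,brave).
arc(2,2,fast).
arc(2,3,witch).
arc(2,3,wizard).
arc(2,3,broomstick).
arc(2,3,rat).
arc(1,3,harry).
arc(1,3,ron).
arc(1,3,hermione).
arc(3,1,with).

% Assumed helper: consume one word that equals the arc label.
traverse1(Label, [Label|Symbols], Symbols).

parse1(Node, [], [Node]) :-
    final(Node).
parse1(Node1, String, [Node1,Label|Path]) :-
    arc(Node1, Node2, Label),
    traverse1(Label, String, NewString),
    parse1(Node2, NewString, Path).

% Start parsing from the initial state.
testparse1(Symbols, Parse) :-
    initial(Node),
    parse1(Node, Symbols, Parse).
```

For example, ?- testparse1([harry,with,the,fast,broomstick],Z). gives Z = [1,harry,3,with,1,the,2,fast,2,broomstick,3].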

11
Rewriting Categories
  • It is also possible to obtain a more abstract
    parse, e.g.
  • ?- testparse2([the,fast,wizard],Z).
  • Z = [1, det, 2, adj, 2, noun, 3]
  • What changes are required to obtain this
    behaviour?

12
1. Changes to the FSA
  • FSA                  Lexicon
    initial(1).          lex(a,det).
    final(3).            lex(the,det).
    arc(1,2,det).        lex(fast,adj).
    arc(2,2,adj).        lex(brave,adj).
    arc(2,3,cn).         lex(witch,cn).
    arc(1,3,pn).         lex(wizard,cn).
    arc(3,1,prep).       lex(broomstick,cn).
                         lex(rat,cn).
                         lex(harry,pn).
                         lex(hermione,pn).
                         lex(ron,pn).
                         lex(with,prep).

13
2. Changes to the Parser
  • Parse2
    parse2(Node1, String, [Node1,Label|Path]) :-
        arc(Node1, Node2, Label),
        traverse2(Label, String, NewString),
        parse2(Node2, NewString, Path).
    traverse2(Label, [Symbol|Symbols], Symbols) :-
        lex(Symbol, Label).
  • Parse1
    parse1(Node1, String, [Node1,Label|Path]) :-
        arc(Node1, Node2, Label),
        traverse1(Label, String, NewString),
        parse1(Node2, NewString, Path).
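
A minimal runnable sketch of the category-based parser, combining the FSA, the lexicon and parse2. The base case of parse2 and the testparse2 wrapper are assumptions modelled on parse1 and testparse1. Note that the last arc is labelled cn, the lexicon's tag for common nouns, where the example output above writes noun.

```prolog
initial(1).
final(3).
arc(1,2,det).
arc(2,2,adj).
arc(2,3,cn).
arc(1,3,pn).
arc(3,1,prep).

lex(a,det).
lex(the,det).
lex(fast,adj).
lex(brave,adj).
lex(witch,cn).
lex(wizard,cn).
lex(broomstick,cn).
lex(rat,cn).
lex(harry,pn).
lex(hermione,pn).
lex(ron,pn).
lex(with,prep).

% An arc labelled with a category accepts any word of that category.
traverse2(Label, [Symbol|Symbols], Symbols) :-
    lex(Symbol, Label).

% Assumed base case, analogous to parse1.
parse2(Node, [], [Node]) :-
    final(Node).
parse2(Node1, String, [Node1,Label|Path]) :-
    arc(Node1, Node2, Label),
    traverse2(Label, String, NewString),
    parse2(Node2, NewString, Path).

% Assumed wrapper, analogous to testparse1.
testparse2(Symbols, Parse) :-
    initial(Node),
    parse2(Node, Symbols, Parse).
```

With this lexicon, ?- testparse2([the,fast,wizard],Z). gives Z = [1,det,2,adj,2,cn,3]. The only change from parse1 is the lexicon lookup inside traverse2.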

14
Handling Jumps
  • traverse3('#',String,String).
  • traverse3(Cat,[Word|Words],Words) :-
        lex(Word,Cat).
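
To see a jump at work, the sketch below adds one jump arc to the category FSA so that the determiner becomes optional. Everything beyond traverse3 is an assumption made for illustration: the jump label is taken to be '#', the arc from 1 to 2 labelled '#' is hypothetical, and parse3 and testparse3 are modelled on the earlier parsers. The lexicon is cut down to the words used in the queries.

```prolog
initial(1).
final(3).
arc(1,2,det).
arc(1,2,'#').      % hypothetical jump arc: skip the determiner
arc(2,2,adj).
arc(2,3,cn).

lex(the,det).
lex(fast,adj).
lex(wizard,cn).

% A jump consumes no input; any other label is looked up in the lexicon.
traverse3('#', String, String).
traverse3(Cat, [Word|Words], Words) :-
    lex(Word, Cat).

parse3(Node, [], [Node]) :-
    final(Node).
parse3(Node1, String, [Node1,Label|Path]) :-
    arc(Node1, Node2, Label),
    traverse3(Label, String, NewString),
    parse3(Node2, NewString, Path).

testparse3(Symbols, Parse) :-
    initial(Node),
    parse3(Node, Symbols, Parse).
```

Here ?- testparse3([fast,wizard],Z). gives Z = [1,#,2,adj,2,cn,3], and ?- testparse3([the,wizard],Z). gives Z = [1,det,2,cn,3]. A cycle made only of jump arcs would make this parser loop, so the sketch avoids one.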

15
Finite State Transducers
  • A finite state transducer essentially is a finite
    state automaton that works on two (or more)
    tapes.
  • The most common way to think about transducers is
    as a kind of 'translating machine' which works
    by reading from one tape and writing onto the
    other.

16
A Translator from a to b
  • initial state: arrowhead
  • final state: double circle
  • a:b: read a from the first tape and write b to the
    second tape

17
Prolog Representation
  • :- op(250,xfx,:).
    initial(1).
    final(1).
    arc(1,1,a:b).

18
Modes of Operation
  • generation mode: It writes a string of a's on one
    tape and a string of b's on the other tape. Both
    strings have the same length.
  • recognition mode: It accepts when the word on the
    first tape consists of exactly as many a's as the
    word on the second tape consists of b's.
  • translation mode (left to right): It reads a's
    from the first tape and writes a b for every a
    that it reads onto the second tape.
  • translation mode (right to left): It reads b's
    from the second tape and writes an a for every b
    that it reads onto the first tape.

19
Transducers and Jumps
  • Transducers can make jumps going from one state
    to another without doing anything on either one
    or on both of the tapes.
  • So, transitions of the form a:#, #:a or #:# are
    possible.

20
Simple Transducer in Prolog
  • transduce1(Node, [], []) :-
        final(Node).
  • transduce1(Node1, Tape1, Tape2) :-
        arc(Node1, Node2, Label),
        traverse1(Label, Tape1, NewTape1, Tape2, NewTape2),
        transduce1(Node2, NewTape1, NewTape2).

21
Traverse for FST
  • traverse1(L1:L2, [L1|RestTape1], RestTape1,
              [L2|RestTape2], RestTape2).
  • testtrans1(Tape1,Tape2) :-
        initial(Node),
        transduce1(Node,Tape1,Tape2).
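
The a-to-b translator, transduce1 and traverse1 fit together as in the sketch below, which can be used to try out the modes of operation. One assumption: the op declaration is left out, because : is already an infix operator in SWI-Prolog, and a:b is used here only as a pair-building term.

```prolog
initial(1).
final(1).
arc(1,1,a:b).

% Read L1 from the first tape and L2 from the second tape.
traverse1(L1:L2, [L1|RestTape1], RestTape1, [L2|RestTape2], RestTape2).

% Both tapes must be used up in a final state.
transduce1(Node, [], []) :-
    final(Node).
transduce1(Node1, Tape1, Tape2) :-
    arc(Node1, Node2, Label),
    traverse1(Label, Tape1, NewTape1, Tape2, NewTape2),
    transduce1(Node2, NewTape1, NewTape2).

testtrans1(Tape1, Tape2) :-
    initial(Node),
    transduce1(Node, Tape1, Tape2).
```

Translation left to right: ?- testtrans1([a,a,a],T). gives T = [b,b,b]. Translation right to left: ?- testtrans1(T,[b,b]). gives T = [a,a]. Recognition: testtrans1([a,a],[b,b]) succeeds and testtrans1([a],[b,b]) fails. Generation: ?- testtrans1(X,Y). enumerates pairs of equal length on backtracking, starting with X = [], Y = [].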

22
Handling Jumps: 4 cases
  • Jump on both tapes.
  • Jump on the first but not on the second tape.
  • Jump on the second but not on the first tape.
  • Jump on neither tape (this is what traverse1
    does).

23
4 Corresponding Clauses
  • traverse2('#':'#',Tape1,Tape1,Tape2,Tape2).
  • traverse2('#':L2,Tape1,Tape1,[L2|RestTape2],RestTape2).
  • traverse2(L1:'#',[L1|RestTape1],RestTape1,Tape2,Tape2).
  • traverse2(L1:L2,[L1|RestTape1],RestTape1,
              [L2|RestTape2],RestTape2).
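
The four clauses can be tried on a small transducer that uses a jump on the first tape. This is only a sketch: '#' is assumed to be the jump symbol, the machine below (it writes two b's for every a it reads) is a hypothetical example that does not come from the slides, and transduce2 and testtrans2 are modelled on transduce1 and testtrans1.

```prolog
% Hypothetical machine: every a on tape 1 becomes b b on tape 2.
initial(1).
final(1).
arc(1,2,a:b).
arc(2,1,'#':b).    % jump on tape 1, write b on tape 2

% One clause per case: jump on both tapes, on the first only,
% on the second only, or on neither.
traverse2('#':'#', Tape1, Tape1, Tape2, Tape2).
traverse2('#':L2, Tape1, Tape1, [L2|RestTape2], RestTape2).
traverse2(L1:'#', [L1|RestTape1], RestTape1, Tape2, Tape2).
traverse2(L1:L2, [L1|RestTape1], RestTape1, [L2|RestTape2], RestTape2).

transduce2(Node, [], []) :-
    final(Node).
transduce2(Node1, Tape1, Tape2) :-
    arc(Node1, Node2, Label),
    traverse2(Label, Tape1, NewTape1, Tape2, NewTape2),
    transduce2(Node2, NewTape1, NewTape2).

testtrans2(Tape1, Tape2) :-
    initial(Node),
    transduce2(Node, Tape1, Tape2).
```

For example, ?- testtrans2([a,a],T). gives T = [b,b,b,b]. The clause order does not matter for ground input here, since for a given label and tape at most one clause applies.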

24
Morphological Analysis with FSTs
  • Morphology is concerned with the internal
    structure of words.
  • How can a word be decomposed into morphemes?
  • How do the morphemes combine?
  • What are legitimate combinations?
  • The basic idea is to write FSTs that map the surface
    form of a word to a description of the morphemes
    that constitute that word, or vice versa.
  • Example: wizard+s to wizard+PL or kiss+ed to
    kiss+PAST.

25
Plural Nouns in English
  • Regular Forms
  • add an s, as in wizard+s.
  • add es, as in witch+es.
  • Handled with morpho-phonological rules that
    insert an e whenever the morpheme preceding the s
    ends in s, x, ch or another fricative.
  • Irregular forms
  • mouse/mice
  • automaton/automata
  • Handled on a case-by-case basis
  • This requires a transducer that translates wizard+s
    into wizard+PL, witch+es into witch+PL, mice into
    mouse+PL and automata into automaton+PL.

26
FST for English Plurals

27
FST in Prolog
  • lex(wizard:wizard,'STEM-REG1').
    lex(witch:witch,'STEM-REG2').
    lex(automaton:automaton,'IRREG-SG').
    lex(automata:'automaton-PL','IRREG-PL').
    lex(mouse:mouse,'IRREG-SG').
    lex(mice:'mouse-PL','IRREG-PL').
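
One way these lexicon entries might be plugged into a transducer is sketched below. Only the lex facts come from the slides. The state graph, the traverse4, transduce4 and testtrans4 predicates, and the choice to give the surface form as a list of morphemes such as [wizard,s] are all assumptions made for illustration, because the diagram of the plural FST is not part of this transcript.

```prolog
% Lexicon entries: lex(Surface:Lexical, Category).
lex(wizard:wizard, 'STEM-REG1').
lex(witch:witch, 'STEM-REG2').
lex(automaton:automaton, 'IRREG-SG').
lex(automata:'automaton-PL', 'IRREG-PL').
lex(mouse:mouse, 'IRREG-SG').
lex(mice:'mouse-PL', 'IRREG-PL').

% Hypothetical state graph (not the one from the missing diagram).
initial(1).
final(2).
final(3).
final(4).
arc(1,2,'STEM-REG1').
arc(1,3,'STEM-REG2').
arc(1,4,'IRREG-SG').
arc(1,4,'IRREG-PL').
arc(2,4,s:'-PL').
arc(3,4,es:'-PL').

% A literal pair label moves one symbol on each tape.
traverse4(In:Out, [In|T1], T1, [Out|T2], T2).
% A category label is expanded through the lexicon.
traverse4(Cat, [In|T1], T1, [Out|T2], T2) :-
    atom(Cat),
    lex(In:Out, Cat).

transduce4(Node, [], []) :-
    final(Node).
transduce4(Node1, Tape1, Tape2) :-
    arc(Node1, Node2, Label),
    traverse4(Label, Tape1, NewTape1, Tape2, NewTape2),
    transduce4(Node2, NewTape1, NewTape2).

testtrans4(Tape1, Tape2) :-
    initial(Node),
    transduce4(Node, Tape1, Tape2).
```

Under these assumptions, ?- testtrans4([wizard,s],X). gives X = [wizard,'-PL'], ?- testtrans4([mice],X). gives X = ['mouse-PL'], and the reverse query ?- testtrans4(X,[witch,'-PL']). gives X = [witch,es].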