Title: Finite State Automata
2. Finite State Automata
- A very simple and intuitive formalism suitable for certain tasks
- A bit like a flow chart, but can be used for both recognition and generation
- Transition network
- Unique start point
- Series of states linked by transitions
- Transitions represent input to be accounted for, or output to be generated
- Legal exit-point(s) explicitly identified
3. Example (Jurafsky & Martin, Figure 2.10)
- The loop on q3 means that the automaton can account for strings of unbounded length
- Deterministic because, in any state, its behaviour is fully predictable
4. Non-deterministic FSA (Jurafsky & Martin, Figure 2.18)
- At state q2 with input a there is a choice of transitions
- We can also have jump arcs (or empty transitions), which also introduce non-determinism
5. Augmented Transition Networks
- ATNs were used for parsing in the 60s and 70s
- For parsing, you need to pass constraints (e.g. for agreement) as well as account for input: the Transition Networks were augmented by having a register into/from which such information could be put/taken
- It's easy to write recognizers, but computing structure is difficult
- ATNs quickly become very complex: one solution is to have a cascade of ATNs, where transitions can call other networks
6. Augmented Transition Networks
[Figure: ATN diagram. Arc labels: push NP (put num); push VP (get num); det (put num); adj; n (put num); prep; pop NP.]
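The register idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicon, the function names (`put_num`, `parse_np`, `parse_vp`, `accepts`) and the decision to leave out the prep arc are all hypothetical and not taken from the slides. The NP network puts a number value into the `num` register, and the VP network gets it back to check agreement.

```python
# Hypothetical toy lexicon: word -> (category, number or None if unmarked).
LEXICON = {
    "a": ("det", "sg"), "these": ("det", "pl"), "the": ("det", None),
    "big": ("adj", None),
    "dog": ("n", "sg"), "dogs": ("n", "pl"),
    "barks": ("v", "sg"), "bark": ("v", "pl"),
}

def category(words, i):
    """Look up the word at position i; (None, None) if out of range or unknown."""
    if i < len(words):
        return LEXICON.get(words[i], (None, None))
    return (None, None)

def put_num(registers, num):
    """'put num': store a number value, failing if it clashes with an earlier one."""
    if num is None:
        return True
    if registers.get("num") not in (None, num):
        return False
    registers["num"] = num
    return True

def parse_np(words, i, registers):
    """NP network: det (put num), any number of adj, n (put num), then pop."""
    cat, num = category(words, i)
    if cat != "det" or not put_num(registers, num):
        return None
    i += 1
    while category(words, i)[0] == "adj":
        i += 1
    cat, num = category(words, i)
    if cat != "n" or not put_num(registers, num):
        return None
    return i + 1

def parse_vp(words, i, registers):
    """VP network (a single verb here): 'get num' and check agreement."""
    cat, num = category(words, i)
    if cat != "v" or num != registers.get("num"):
        return None
    return i + 1

def accepts(sentence):
    """Top-level network: push NP, push VP, and require all input to be used."""
    words = sentence.split()
    registers = {}
    i = parse_np(words, 0, registers)
    if i is None:
        return False
    i = parse_vp(words, i, registers)
    return i == len(words)

print(accepts("these big dogs bark"))  # True
print(accepts("these dog barks"))      # False
```

Note that this is a recognizer only; as the slide says, computing structure is the harder part.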
7. Exercises
0,b,1
1,a,2
2,a,3
3,a,3
3,!,end
fsa([[0,b,1],[1,a,2],[2,a,3],[3,a,3],[3,!,end]]).
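As a rough sketch of how such a transition list drives a deterministic recognizer, the same five transitions can be stored in a Python dictionary. The function name `recognize` and the dictionary layout are illustrative assumptions, not the course's Prolog program.

```python
# Transitions copied from the exercise: (state, symbol) -> next state.
TRANSITIONS = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,
    (3, "!"): "end",
}

def recognize(symbols, transitions=TRANSITIONS, start=0, final="end"):
    """Return True if the deterministic FSA accepts the symbol sequence."""
    state = start
    for symbol in symbols:
        key = (state, symbol)
        if key not in transitions:
            return False  # no legal transition from here: reject
        state = transitions[key]
    return state == final

print(recognize("b a a a a !".split()))  # True
print(recognize("b a !".split()))        # False
```

Because each (state, symbol) pair has at most one target, no search or backtracking is needed.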
8. NDFSA
0,b,1
1,a,2
2,a,3
3,empty,2
3,!,end
fsa([[0,b,1],[1,a,2],[2,a,3],[3,empty,2],[3,!,end]]).
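With an empty transition the recognizer has to consider more than one continuation. One common way to do this, sketched here in Python as an assumption about how it might be implemented (the names `ARCS`, `EMPTY` and `nd_recognize` are hypothetical), is to keep an agenda of (state, input position) pairs still to be explored.

```python
EMPTY = "empty"

# Arcs copied from the slide: (source state, label, target state).
ARCS = [
    (0, "b", 1),
    (1, "a", 2),
    (2, "a", 3),
    (3, EMPTY, 2),
    (3, "!", "end"),
]

def nd_recognize(symbols, arcs=ARCS, start=0, final="end"):
    """Agenda-based search over (state, position) pairs."""
    agenda = [(start, 0)]
    seen = set()
    while agenda:
        state, pos = agenda.pop()
        if (state, pos) in seen:
            continue  # already explored this configuration
        seen.add((state, pos))
        if state == final and pos == len(symbols):
            return True
        for source, label, target in arcs:
            if source != state:
                continue
            if label == EMPTY:
                agenda.append((target, pos))      # jump arc: consume nothing
            elif pos < len(symbols) and symbols[pos] == label:
                agenda.append((target, pos + 1))  # consume one symbol
    return False

print(nd_recognize("b a a a a !".split()))  # True
print(nd_recognize("b a !".split()))        # False
```

The `seen` set keeps the search finite even if a network contains a cycle of empty transitions.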
9. FSA and NDFSA programs
First load (consult) the file, e.g. 219.pl
?- help.
Options are as follows:
run - a simple recognizer; on prompt, type in a string with a space between each element, ending in . or ! or ?
run(v) - verbose recognizer; gives a trace of transitions
gen(X) - generate text; will interact at choice points
rec(X,quiet) - to generate text deterministically. Type ; to get other grammatical sequences
?- run.
Enter your string
b a a a a !
yes
10. FSA and NDFSA programs
?- run(v).
Enter your string
b a a a a !
0-b-1 1-a-2 2-a-3 3-skip-2 2-a-3 3-skip-2 2-a-3 3-skip-2 3-!-end
yes
11. FSA and NDFSA programs
?- gen(X).
Choice at state 3. Choose state from (1) !,end (2) empty,2
Select choice number
2.
Choice at state 3. Choose state from (1) !,end (2) empty,2
Select choice number
2.
Choice at state 3. Choose state from (1) !,end (2) empty,2
Select choice number
1.
X = [b,a,a,a,a,!] ?
yes
12. FSA and NDFSA programs
?- rec(X,quiet).
X = [b,a,a] ?
X = [b,a,a,a] ?
X = [b,a,a,a,a] ?
X = [b,a,a,a,a,a] ?
yes
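Enumerating the strings of a network one after another, as in the session above, can be imitated with a breadth-first search that yields each output when it reaches the end state. This is a sketch under assumptions, not the behaviour of the Prolog program: the names `ARCS`, `EMPTY` and `generate` are hypothetical, and the exact form and order of the results may differ from what `rec(X,quiet)` prints.

```python
from collections import deque
from itertools import islice

EMPTY = "empty"
ARCS = [
    (0, "b", 1),
    (1, "a", 2),
    (2, "a", 3),
    (3, EMPTY, 2),
    (3, "!", "end"),
]

def generate(arcs, start=0, final="end"):
    """Lazily yield accepted symbol lists, fewest transitions first."""
    queue = deque([(start, [])])
    while queue:
        state, output = queue.popleft()
        if state == final:
            yield output
            continue
        for source, label, target in arcs:
            if source != state:
                continue
            if label == EMPTY:
                queue.append((target, output))            # emit nothing
            else:
                queue.append((target, output + [label]))  # emit the label

# The language is infinite, so take only the first few results.
for symbols in islice(generate(ARCS), 3):
    print(" ".join(symbols))
# b a a !
# b a a a !
# b a a a a !
```

Asking for the next item of the generator plays the role of requesting another solution at the Prolog prompt.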
13. FSAs and regular expressions
- FSAs have a close relationship with regular expressions, a formalism for expressing strings, mainly used for searching texts, or stipulating patterns of strings
- Regular expressions are defined by combinations of literal characters and special operators
14. Regular expressions
Character   Meaning                 Examples
[ ]         alternatives            /[aeiou]/, /m[ae]n/
-           range                   /[a-z]/
[^ ]        not                     /[^pbm]/, /[^ox]s/
?           optionality             /Kath?mandu/
*           zero or more            /baa*!/
+           one or more             /ba+!/
.           any character           /cat.[aeiou]/
^, $        start, end of line
\           not special character   \.\?\
|           alternate strings       /cat|dog/
( )         substring               /cit(y|ies)/
etc.
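These operators can be tried out with Python's standard `re` module, whose syntax for them matches the table. The helper names `found` and `whole` and the test strings are illustrative choices, not part of the slides.

```python
import re

def found(pattern, text):
    """True if the pattern matches somewhere in the text."""
    return re.search(pattern, text) is not None

def whole(pattern, text):
    """True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

print(found(r"m[ae]n", "women"))         # True: alternatives
print(found(r"[^pbm]at", "bat"))         # False: negated set excludes b
print(whole(r"Kath?mandu", "Katmandu"))  # True: the h is optional
print(whole(r"baa*!", "ba!"))            # True: zero or more a after "ba"
print(whole(r"ba+!", "b!"))              # False: needs at least one a
print(found(r"cat.[aeiou]", "cat-a"))    # True: . matches any character
print(found(r"^The", "The end."))        # True: start of line
print(found(r"end\.$", "The end."))      # True: escaped literal full stop
print(whole(r"cat|dog", "dog"))          # True: alternate strings
print(whole(r"cit(y|ies)", "cities"))    # True: grouped substring
```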
15. Regular expressions
- A regular expression can be mapped onto an FSA
- Can be a good way of handling morphology
- Especially in connection with Finite State
Transducers
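The mapping between a regular expression and an FSA can be illustrated empirically. The sketch below assumes that the expression /baa+!/ corresponds to a five-transition automaton (b, a, a, loop on a, then !), and checks that the two agree on every string over the symbols b, a and ! up to a given length. The names `DFA`, `dfa_accepts` and `agree_up_to` are hypothetical, and an exhaustive check over short strings is a sanity test, not a proof of equivalence.

```python
import re
from itertools import product

# Hand-built automaton for "b", then two or more "a", then "!".
DFA = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,
    (3, "!"): 4,
}
FINAL = {4}
PATTERN = re.compile(r"baa+!")

def dfa_accepts(text):
    """Run the automaton over the characters of text."""
    state = 0
    for ch in text:
        state = DFA.get((state, ch))
        if state is None:
            return False
    return state in FINAL

def agree_up_to(max_len, alphabet="ba!"):
    """Compare automaton and regular expression on all short strings."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            text = "".join(chars)
            if dfa_accepts(text) != (PATTERN.fullmatch(text) is not None):
                return False
    return True

print(agree_up_to(6))  # True
```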
16. Finite State Transducers
- A transducer defines a relationship (a mapping) between two things
- Typically used for two-level morphology, but can be used for other things
- Like an FSA, but each state transition stipulates a pair of symbols, and thus a mapping
17. Finite State Transducers
- Three functions
- Recognizer (verification): takes a pair of strings and verifies if the FST is able to map them onto each other
- Generator (synthesis): can generate a legal pair of strings
- Translator (transduction): given one string, can generate the corresponding string
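The three functions can be sketched over one small transducer. Everything here is an illustrative assumption: the arcs describe a toy network for a single noun, the empty string is written `""`, and the names `translate`, `recognize` and `generate` are hypothetical. The network has no cycles and no loops of empty upper symbols, which keeps the recursive search finite.

```python
EPS = ""  # empty string on one side of an arc

# Toy transducer: (source, upper symbol, lower symbol, target).
ARCS = [
    (0, "c", "c", 1),
    (1, "a", "a", 2),
    (2, "t", "t", 3),
    (3, "N", EPS, 4),
    (4, "P", "s", 5),
    (4, "S", EPS, 5),
]
START, FINALS = 0, {5}

def translate(upper, state=START, i=0):
    """Translator: yield each lower symbol list paired with the upper list."""
    if state in FINALS and i == len(upper):
        yield []
    for src, up, low, dst in ARCS:
        if src != state:
            continue
        if up == EPS:
            step = 0
        elif i < len(upper) and upper[i] == up:
            step = 1
        else:
            continue
        for rest in translate(upper, dst, i + step):
            yield ([] if low == EPS else [low]) + rest

def recognize(upper, lower):
    """Recognizer: can the transducer map upper onto lower?"""
    return any(result == lower for result in translate(upper))

def generate(state=START):
    """Generator: yield every legal (upper, lower) pair of symbol lists."""
    if state in FINALS:
        yield [], []
    for src, up, low, dst in ARCS:
        if src == state:
            for ups, lows in generate(dst):
                yield (([] if up == EPS else [up]) + ups,
                       ([] if low == EPS else [low]) + lows)

print(list(translate("c a t N P".split())))           # [['c', 'a', 't', 's']]
print(recognize("c a t N S".split(), "c a t".split()))  # True
for upper, lower in generate():
    print(" ".join(upper), "->", " ".join(lower))
```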
18. Some conventions
- Transitions are marked by ":"
- A non-changing transition x:x can be shown simply as x
- Wild-cards are shown as @
- The empty string is shown as ε
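19. An example (J&M Fig. 3.9, p.74)
[Figure: transducer with states q0 to q7, mapping the lexical level onto the intermediate level (lexical:intermediate). From q0, one set of arcs spells f o x, c a t, d o g and leads to q1; a second spells g o o s e, s h e e p, m o u s e and leads to q2; a third spells g o:e o:e s e, s h e e p, m o:i u:ε s:c e and leads to q3. Each branch continues with an N:ε arc (to q4, q5, q6) and then a P or S arc into q7; the P arc from q4 also carries s.]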
20.
- 0 f:f o:o x:x 1 N:ε 4 P s:s 7
- 0 f:f o:o x:x 1 N:ε 4 S 7
- 0 c:c a:a t:t 1 N:ε 4 P s:s 7
- 0 s:s h:h e:e e:e p:p 2 N:ε 5 S 7
- 0 g:g o:o o:o s:s e:e 2 N:ε 5 P 7
f o x N P s → f o x s
f o x N S → f o x
c a t N P s → c a t s
s h e e p N S → s h e e p
g o o s e N P → g e e s e
21. Lexical:surface mapping (J&M Fig. 3.14, p.78)
f o x N P s → f o x s
c a t N P s → c a t s
ε → e / {x, s, z} __ s
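The e-insertion rule can be approximated with a regular-expression substitution. This sketch assumes that the intermediate level marks a morpheme boundary with `^` and the end of the word with `#`; those two symbols, the function name `e_insertion` and the test words are assumptions for illustration, and the real transducer handles more contexts than this one substitution does.

```python
import re

def e_insertion(intermediate):
    """Insert e between a stem ending in x, s or z and the suffix s."""
    # "^" = morpheme boundary, "#" = word end (assumed notation).
    with_e = re.sub(r"([xsz])\^(?=s#)", r"\1e", intermediate)
    # Remove any remaining boundary symbols to get the surface form.
    return with_e.replace("^", "").replace("#", "")

print(e_insertion("fox^s#"))  # foxes
print(e_insertion("cat^s#"))  # cats
print(e_insertion("fox#"))    # fox
```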
22.
- 0 f:f 0 o:o 0 x:x 1 ε 2 ε:e 3 s:s 4 0
- 0 c:c 0 a:a 0 t:t 0 ε 0 s:s 0 0
f o x s → f o x e s
c a t s → c a t s
[Figure: transducer with states q0 to q5. Arc labels include: z, s, x; z, x; s; e; other.]
23. FST
- Can be generated automatically
- Therefore, slightly different formalism
24. FST compiler
http://www.xrce.xerox.com/competencies/content-analysis/fsCompiler/fsinput.html
d o g N P .x. d o g s
c a t N P .x. c a t s
f o x N P .x. f o x e s
g o o s e N P .x. g e e s e
s0: c -> s1, d -> s2, f -> s3, g -> s4.
s1: a -> s5.
s2: o -> s6.
s3: o -> s7.
s4: <o:e> -> s8.
s5: t -> s9.
s6: g -> s9.
s7: x -> s10.
s8: <o:e> -> s11.
s9: <N:s> -> s12.
s10: <N:e> -> s13.
s11: s -> s14.
s12: <P:0> -> fs15.
s13: <P:s> -> fs15.
s14: e -> s16.
fs15: (no arcs)
s16: <N:0> -> s12.
25.
fst( s0,c,s1, d,s2, f,s3, g,s4,
     s1,a,s5,
     s2,o,s6,
     s3,o,s7,
     s4,o,e,s8,
     s5,t,s9,
     s6,g,s9,
     s7,x,s10,
     s8,o,e,s11,
     s9,'N',s,s12,
     s10,'N',e,s13,
     s11,s,s14,
     s12,'P',0,fs15,
     s13,'P',s,fs15,
     s14,e,s16,
     fs15,noarcs,
     s16,'N',0,s12 ).
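To see what this network does, its arcs can be copied into a Python table and applied to a lexical string. This is a sketch, not the course's Prolog program: the dictionary layout and the name `apply_down` are hypothetical, and it assumes that `0` on the lower side stands for the empty string and that `fs15` is the only final state.

```python
EPS = "0"  # assumed to denote the empty string

# state -> list of (upper symbol, lower symbol, next state)
NETWORK = {
    "s0": [("c", "c", "s1"), ("d", "d", "s2"), ("f", "f", "s3"), ("g", "g", "s4")],
    "s1": [("a", "a", "s5")],
    "s2": [("o", "o", "s6")],
    "s3": [("o", "o", "s7")],
    "s4": [("o", "e", "s8")],
    "s5": [("t", "t", "s9")],
    "s6": [("g", "g", "s9")],
    "s7": [("x", "x", "s10")],
    "s8": [("o", "e", "s11")],
    "s9": [("N", "s", "s12")],
    "s10": [("N", "e", "s13")],
    "s11": [("s", "s", "s14")],
    "s12": [("P", "0", "fs15")],
    "s13": [("P", "s", "fs15")],
    "s14": [("e", "e", "s16")],
    "fs15": [],
    "s16": [("N", "0", "s12")],
}

def apply_down(upper, start="s0", final="fs15"):
    """Map a list of upper (lexical) symbols to lower symbols, or None."""
    state, output = start, []
    for symbol in upper:
        for up, low, target in NETWORK[state]:
            if up == symbol:
                if low != EPS:
                    output.append(low)
                state = target
                break
        else:
            return None  # no arc for this symbol: not in the lexicon
    return output if state == final else None

for word in ["d o g N P", "f o x N P", "g o o s e N P"]:
    print(word, "->", " ".join(apply_down(word.split())))
# d o g N P -> d o g s
# f o x N P -> f o x e s
# g o o s e N P -> g e e s e
```

A single pass is enough here because no state has two arcs with the same upper symbol.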
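26. FST 3.9
[Figure: the transducer of Fig. 3.9 redrawn with start state s0 in place of q0 and the number arcs labelled PL and SG. The noun arcs (f o x, c a t, d o g; g o o s e, s h e e p, m o u s e; g o:e o:e s e, s h e e p, m o:i u:ε s:c e), the N:ε arcs and states q1 to q7 are as before; the PL arc from q4 also carries s.]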
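27. FST 3.9 (portion)
f o x, c a t, d o g
s0,f,s1, c,s3, d,s5,
s1,o,s2,
s2,x,q1,
s3,a,s4,
s4,t,q1,
s5,o,s6,
s6,g,q1,
[Figure: the single arc from s0 to q1 expanded letter by letter: s0 -f-> s1 -o-> s2 -x-> q1; s0 -c-> s3 -a-> s4 -t-> q1; s0 -d-> s5 -o-> s6 -g-> q1.]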