Title: Finite State Automata
1Finite State Automata
- DFSAs (Deterministic)
- NFSAs (Non-deterministic)
- FSTs (Transducers)
- Other Issues
- Reading
- Jurafsky & Martin, Ch. 2
2Finite State Automata
- Sheep talk
- baa!
- baaa!
- baaaa!
- ...
- States
- Start state
- Final/Accepting state
- Transitions
- FSTN
- Recogniser (Acceptor)
- Generator
3Components
- S = {s1, ..., sN}: a finite set of N states
- Note notational difference from JM
- K = {k1, ..., kM}: a finite set of M input symbols (alphabet)
- s0: start state
- F: set of final states
- δ(s, k): transition function, giving the next state for a state s in S and an input symbol k in K
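The five components can be written down directly as data. The sketch below encodes a sheep-talk machine (baa!, baaa!, ...) this way; the integer state names, the constant names, the dictionary layout and the helper `is_well_formed` are assumptions made for illustration, not notation from the slides.

```python
# A minimal sketch of the five components for a sheep-talk machine.
# State numbering and all names here are illustrative assumptions.
STATES = {0, 1, 2, 3, 4}        # S: finite set of states
ALPHABET = {"b", "a", "!"}      # K: finite set of input symbols
START = 0                       # s0: start state
FINALS = {4}                    # F: set of final states

# delta: (state, symbol) -> next state.
# A missing entry means there is no transition for that pair.
DELTA = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,   # loop: any number of further a's
    (3, "!"): 4,
}


def is_well_formed(states, alphabet, start, finals, delta):
    """Check that every component only mentions known states and symbols."""
    return (
        start in states
        and finals <= states
        and all(
            s in states and k in alphabet and t in states
            for (s, k), t in delta.items()
        )
    )


print(is_well_formed(STATES, ALPHABET, START, FINALS, DELTA))
```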
4Transition Table
5Input Tape
6Algorithm
- function D-RECOGNIZE(tape, machine) returns accept or reject
- index ← Beginning of tape
- current-state ← Initial state of machine
- loop
- if End of input has been reached then
- if current-state is an accept state then
- return accept
- else
- return reject
- end
- elsif transition-table[current-state, tape[index]] is empty then
- return reject
- else
- current-state ← transition-table[current-state, tape[index]]
- index ← index + 1
- end
- end
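The pseudocode above can be sketched in Python roughly as follows. The function name `d_recognize`, the dictionary-based transition table and the sheep-talk table `SHEEP_TABLE` are assumptions chosen for illustration; a missing dictionary entry plays the role of an empty table cell.

```python
# Illustrative transition table for sheep talk (an assumption, not from the slides).
SHEEP_TABLE = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,
    (3, "!"): 4,
}


def d_recognize(tape, start, finals, table):
    """Return True (accept) or False (reject) for the given input tape."""
    index = 0                   # beginning of tape
    current_state = start       # initial state of machine
    while True:
        if index == len(tape):              # end of input reached
            return current_state in finals
        key = (current_state, tape[index])
        if key not in table:                # empty table cell: no transition
            return False
        current_state = table[key]
        index += 1


print(d_recognize("baaa!", 0, {4}, SHEEP_TABLE))
print(d_recognize("ba!", 0, {4}, SHEEP_TABLE))
```

Because the machine is deterministic, each step either follows the single available transition or rejects, so the loop never has to revisit an earlier position on the tape.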
7Fail State
8Example Numbers (1-99)
9NFSAs
- Choice of path
- or
- ε-transition
10Strategies for NFSAs
- Backup
- Place marker at choice point
- If wrong choice made, try another (backtrack)
- Look-ahead
- Look ahead in input to help decide on path
- Parallelism
- When choice-point reached explore alternative
paths in parallel
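As a rough illustration of the parallelism strategy, the sketch below tracks the set of all nodes the machine could be in after each input symbol. The names `parallel_recognize`, `eps_closure` and `NTABLE`, and the use of the empty string as the ε label, are assumptions for this example only.

```python
EPS = ""  # assumed label for ε-transitions in this sketch

# Illustrative non-deterministic sheep-talk table: node 2 has a choice on "a".
NTABLE = {
    (0, "b"): {1},
    (1, "a"): {2},
    (2, "a"): {2, 3},
    (3, "!"): {4},
}


def eps_closure(nodes, table):
    """All nodes reachable from `nodes` using only ε-transitions."""
    closure = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for target in table.get((node, EPS), set()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


def parallel_recognize(tape, start, finals, table):
    """Follow every alternative path at once by tracking a set of nodes."""
    current = eps_closure({start}, table)
    for symbol in tape:
        reached = set()
        for node in current:
            reached |= table.get((node, symbol), set())
        current = eps_closure(reached, table)
        if not current:         # every path has died
            return False
    return bool(current & finals)


print(parallel_recognize("baaa!", 0, {4}, NTABLE))
```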
11Backup
- We will focus on backup
- We will refer to an automaton (machine) state as a node
- A search-state will contain
- Node that we can go to from current position
- Corresponding position on tape
- Don't confuse search state with machine state
12NFSA Transition Table
13Algorithm...
function ND-RECOGNIZE(tape, machine) returns accept or reject
  agenda ← {(Initial state of machine, beginning of tape)}
  current-search-state ← NEXT(agenda)
  loop
    if ACCEPT-STATE?(current-search-state) returns true then
      return accept
    else
      agenda ← agenda ∪ GENERATE-NEW-STATES(current-search-state)
    end
    if agenda is empty then
      return reject
    else
      current-search-state ← NEXT(agenda)
    end
  end
14...Algorithm
function GENERATE-NEW-STATES(current-state) returns a set of search-states
  current-node ← the node the current search-state is in
  index ← the point on the tape the current search-state is looking at
  return a list of search states from transition table as follows:
    (transition-table[current-node, ε], index)
    ∪
    (transition-table[current-node, tape[index]], index + 1)

function ACCEPT-STATE?(search-state) returns true or false
  current-node ← the node search-state is in
  index ← the point on the tape search-state is looking at
  if index is at the end of the tape and current-node is an accept state of machine then
    return true
  else
    return false
  end
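A possible Python rendering of the three functions is sketched below. The function names, the dictionary-of-sets transition table `EPS_TABLE`, the empty string standing for ε, and the `seen` set (which stops the same search state being queued twice and is not part of the pseudocode) are all assumptions for illustration. The agenda is used as a stack here.

```python
EPS = ""  # assumed label for ε-transitions in this sketch

# Illustrative sheep-talk NFSA with an ε-transition from node 3 back to node 2.
EPS_TABLE = {
    (0, "b"): {1},
    (1, "a"): {2},
    (2, "a"): {3},
    (3, "!"): {4},
    (3, EPS): {2},
}


def generate_new_states(search_state, tape, table):
    """Search states reachable in one step from (node, index)."""
    node, index = search_state
    # ε-transitions do not consume input, so the tape position is unchanged.
    new_states = [(n, index) for n in sorted(table.get((node, EPS), ()))]
    # Ordinary transitions consume the symbol under the tape position.
    if index < len(tape):
        targets = table.get((node, tape[index]), ())
        new_states += [(n, index + 1) for n in sorted(targets)]
    return new_states


def accept_state(search_state, tape, finals):
    """True if the tape is exhausted and the node is an accept state."""
    node, index = search_state
    return index == len(tape) and node in finals


def nd_recognize(tape, start, finals, table):
    agenda = [(start, 0)]       # initial search state
    seen = set(agenda)          # assumption: avoid re-queuing a search state
    while agenda:
        current = agenda.pop()  # NEXT(agenda), here last in first out
        if accept_state(current, tape, finals):
            return True
        for state in generate_new_states(current, tape, table):
            if state not in seen:
                seen.add(state)
                agenda.append(state)
    return False                # agenda is empty: reject


print(nd_recognize("baaa!", 0, {4}, EPS_TABLE))
```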
15Algorithm details
- ND-RECOGNIZE
- agenda keeps track of all currently unexplored choices
- current-search-state is the branch choice currently being explored
- Create initial search state and place on agenda
- NEXT
- Retrieves item from agenda
- Order?
- GENERATE-NEW-STATES
- Creates search states for any ε-transitions and
- any normal input symbol transition
16Search Strategies
- agenda as a stack
- LIFO (Last In First Out) strategy
- Depth-first search
- agenda as a queue
- FIFO (First In First Out) strategy
- Breadth-first search
- Work through some examples yourself
- Use a pencil and paper
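To compare the two strategies on a small case, the sketch below records the order in which search states are taken off the agenda. The function `explore`, its `strategy` argument and the table `CHOICE_TABLE` are hypothetical names introduced for this example; the only difference between the two runs is which end of the agenda NEXT removes from.

```python
from collections import deque

# Illustrative NFSA for sheep talk: node 2 has two choices on "a".
CHOICE_TABLE = {
    (0, "b"): {1},
    (1, "a"): {2},
    (2, "a"): {2, 3},
    (3, "!"): {4},
}


def explore(tape, start, finals, table, strategy):
    """Return (accepted, visited) where visited lists search states in NEXT order."""
    agenda = deque([(start, 0)])
    visited = []
    while agenda:
        # Stack: take the newest item. Queue: take the oldest item.
        node, index = agenda.pop() if strategy == "stack" else agenda.popleft()
        visited.append((node, index))
        if index == len(tape) and node in finals:
            return True, visited
        if index < len(tape):
            for target in sorted(table.get((node, tape[index]), ())):
                agenda.append((target, index + 1))
    return False, visited


dfs_ok, dfs_order = explore("baaa!", 0, {4}, CHOICE_TABLE, "stack")
bfs_ok, bfs_order = explore("baaa!", 0, {4}, CHOICE_TABLE, "queue")
print(dfs_order)
print(bfs_order)
```

With the stack, the most recently generated choice is followed first and the search backs up only when that path dies; with the queue, all search states at one tape position are examined before any at the next.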
17Regular Languages and FSAs
- Regular Languages are characterisable by FSAs
- The class of regular languages over K is defined as
- ∅ (empty set) is a regular language
- ∀a ∈ K ∪ {ε}, {a} is a regular language
- Pay attention to regular expressions in Perl
18Regular Languages and FSAs
- If L1 and L2 are regular languages, then so are
- L1 · L2 = {xy | x ∈ L1, y ∈ L2}, the concatenation of L1 and L2
- L1 ∪ L2, the union or disjunction of L1 and L2
- L1*, the Kleene closure of L1
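For finite languages represented as Python sets of strings, the three operations can be sketched as below. The function names are assumptions, and because the Kleene closure is infinite whenever the language contains a non-empty string, `kleene` here only builds it up to an assumed `max_repeats` bound.

```python
def concat(l1, l2):
    """Every string of l1 followed by every string of l2."""
    return {x + y for x in l1 for y in l2}


def union(l1, l2):
    """Strings that are in l1 or in l2."""
    return l1 | l2


def kleene(language, max_repeats):
    """Finite approximation of the Kleene closure: 0..max_repeats concatenations."""
    result = {""}           # zero repetitions gives the empty string
    layer = {""}
    for _ in range(max_repeats):
        layer = concat(layer, language)
        result |= layer
    return result


print(sorted(concat({"a", "b"}, {"c"})))
print(sorted(union({"a"}, {"b"})))
print(sorted(kleene({"ab"}, 2)))
```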
19Concatenation
20Union
21Closure
22Regular Languages
- Regular Languages are closed under
- Intersection: L1, L2 regular ⇒ L1 ∩ L2 regular
- L1 ∩ L2 = set of strings that are in both L1 and L2
- Difference: L1, L2 regular ⇒ L1 − L2 regular
- L1 − L2 = set of strings that are in L1 but not L2
- Complementation: L regular ⇒ K* − L regular
- K* = infinite set of all possible strings formed from alphabet K
- Reversal: L regular ⇒ L^R regular
- L^R = set of reversals of all strings in L
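Two of these closure properties can be illustrated with deterministic machines: complementation by swapping accepting and non-accepting states of a complete machine, and intersection by running two machines in lockstep (the product construction). The sketch below assumes a tuple layout `(states, start, finals, delta)`, a hypothetical `sink` fail state, example machines `SHEEP` and `EVEN_A`, and input tapes that use only alphabet symbols.

```python
from itertools import product

ALPHABET = {"b", "a", "!"}


def complete(states, alphabet, start, finals, delta, sink="sink"):
    """Add a fail state so every (state, symbol) pair has a transition."""
    all_states = set(states) | {sink}
    full = {(s, k): delta.get((s, k), sink) for s, k in product(all_states, alphabet)}
    return all_states, start, set(finals), full


def accepts(dfa, tape):
    _, state, finals, delta = dfa
    for symbol in tape:
        state = delta[(state, symbol)]
    return state in finals


def complement(dfa):
    """Same machine with accepting and non-accepting states swapped."""
    states, start, finals, delta = dfa
    return states, start, states - finals, delta


def intersection(dfa1, dfa2, alphabet):
    """Product machine: accepts only when both component machines accept."""
    states1, start1, finals1, delta1 = dfa1
    states2, start2, finals2, delta2 = dfa2
    states = set(product(states1, states2))
    delta = {
        ((p, q), k): (delta1[(p, k)], delta2[(q, k)])
        for (p, q) in states
        for k in alphabet
    }
    finals = {(p, q) for (p, q) in states if p in finals1 and q in finals2}
    return states, (start1, start2), finals, delta


# Example machines (assumptions): sheep talk, and "even number of a's".
SHEEP = complete(
    {0, 1, 2, 3, 4}, ALPHABET, 0, {4},
    {(0, "b"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 3, (3, "!"): 4},
)
EVEN_A = complete(
    {"even", "odd"}, ALPHABET, "even", {"even"},
    {("even", "a"): "odd", ("odd", "a"): "even",
     ("even", "b"): "even", ("odd", "b"): "odd",
     ("even", "!"): "even", ("odd", "!"): "odd"},
)

NOT_SHEEP = complement(SHEEP)
EVEN_SHEEP = intersection(SHEEP, EVEN_A, ALPHABET)
print(accepts(NOT_SHEEP, "ba!"), accepts(EVEN_SHEEP, "baa!"))
```

Difference then follows from these two, since L1 − L2 is the intersection of L1 with the complement of L2.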
23Formal Languages and Natural Languages
- Any set of strings is a formal language
- L1 = {a, aa, aaa, aaaa, aaaaa, ...}
- L2 = {zzmy, niwhiuhew, sjehuiwheu}
- L3 = {dog, cat, elephant}
- The systems that we write will accept or map words in a formal language.
- In practical natural-language processing, we try to make these formal languages as close as possible to a natural language, e.g. Swahili. I.e. we try to model a natural language as perfectly as possible.
24Concatenation can form Real Words
- Root Language: work, talk, walk
- Suffix Language: ing, ed, s
- The concatenation of the Suffix language after the Root language: working, worked, works, talking, talked, talks, walking, walked, walks
25Concatenation can also form Bad Words
- Root Language: try, plot, wiggle
- Suffix Language: ing, ed, s
- Raw Concatenation Result/Level/Language: trys, tryed, trying, plots, ploted, ploting, wiggles, wiggleed, wiggleing
- Desired Final Result/Level/Language: tries, tried, trying, plots, plotted, plotting, wiggles, wiggled, wiggling
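The two concatenation examples can be reproduced with a few lines of Python. The function name `concatenate` and the variable names are assumptions; the word lists are the ones from the slides.

```python
def concatenate(roots, suffixes):
    """Raw concatenation: every root followed by every suffix."""
    return [root + suffix for root in roots for suffix in suffixes]


SUFFIXES = ["ing", "ed", "s"]

# Roots where raw concatenation happens to give real words.
real_words = concatenate(["work", "talk", "walk"], SUFFIXES)

# Roots where raw concatenation gives some bad words.
raw = concatenate(["try", "plot", "wiggle"], SUFFIXES)
desired = {
    "tries", "tried", "trying",
    "plots", "plotted", "plotting",
    "wiggles", "wiggled", "wiggling",
}
bad_words = [word for word in raw if word not in desired]

print(real_words)
print(bad_words)
```

The words left in `bad_words` are exactly the ones where spelling changes at the boundary between root and suffix, which raw concatenation cannot express.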