Title: Finite Automata
1Finite Automata
2Outline (this and next lecture)
- Introduction
- Deterministic finite automata (DFAs)
- Non-deterministic finite automata (NFAs)
- NFAs to DFAs
- Simplifying DFAs
- Regular expressions ↔ finite automata
- Only attempt up to question 5 of tutorial 4, the
rest will be covered next lecture
3An automatic one way door
Consider the control system for a one-way swinging door. There are two states, Open and Closed. It has two inputs: person detected at position A and person detected at position B. If the door is closed, it should open only if a person is detected at A but not B. The door should close only if no one is detected.
[Figure: door with sensor positions A and B]
4Control schematic
States: Open, Closed
Transitions:
- Closed → Open: A, no B
- Open → Closed: No A or B
- Closed → Closed: A and B; B, no A; No A or B
- Open → Open: A and B; A, no B; B, no A
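The door controller can be sketched as a two-state machine in a few lines of Python. This is only an illustration: the function names `next_state` and `run`, and the encoding of each input as a pair of booleans (detected at A, detected at B), are assumptions made for the example.

```python
def next_state(state: str, a: bool, b: bool) -> str:
    """One step of the door controller; a/b mean 'person detected at A/B'."""
    if state == "Closed":
        # Open only if someone is at A and nobody is at B.
        return "Open" if (a and not b) else "Closed"
    # state == "Open": close only if no one is detected at all.
    return "Closed" if (not a and not b) else "Open"


def run(readings, state: str = "Closed") -> str:
    """Feed a sequence of (a, b) sensor readings to the controller."""
    for a, b in readings:
        state = next_state(state, a, b)
    return state


print(run([(True, False)]))                  # Open
print(run([(True, False), (False, True)]))   # Open
print(run([(True, False), (False, False)]))  # Closed
```

The machine has no memory beyond its current state, which is what makes it a finite automaton.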
5This is a finite automaton!
- A finite automaton is usually represented like this, as a directed graph
- Two parts of a directed graph
- The states (also called nodes or vertices)
- The edges with arrows, which represent the allowed transitions
- One state is usually picked out as the starting point
- For so-called accepting automata, some states are chosen to be final states
6Strings and automata
The input data are represented by a string over
some alphabet and it determines how the machine
progresses from state to state. Beginning in the
start state, the characters of the input string
cause the machine to change from one state to
another. Accepting automata give only yes or no
answers, depending on whether they end up in a
final state. Strings which end in a final
state are accepted by the automaton.
7Example
The labeled graph in the figure above represents an FA over the alphabet A = {a, b} with start state 0 and final state 3. Final states are denoted by a double circle.
8Deterministic finite automata (DFAs)
- The previous graph was an example of a deterministic finite automaton: every node had two edges (a and b) coming out
- A DFA over a finite alphabet A is a finite directed graph with the property that each node emits one labeled edge for each distinct element of A.
9More formally
- A DFA accepts a string w in A* if there is a path from the start state to some final state such that w is the concatenation of the labels on the edges of the path.
- Otherwise, the DFA rejects w.
- The set of all strings accepted by a DFA M is called the language of M and is denoted by L(M)
10EXAMPLE (a + b)*
- Construct a DFA to recognize the regular language represented by the regular expression (a + b)* over the alphabet A = {a, b}.
- This is the set {a, b}* of all strings over {a, b}. This can be recognised by
[Figure: DFA for (a + b)*]
11EXAMPLE a(a + b)*
- Find a DFA to recognize the language represented by the regular expression a(a + b)* over the alphabet A = {a, b}.
- This is the set of all strings in A* which begin with a. One possible DFA is
[Figure: DFA for a(a + b)*]
12EXAMPLE Pattern recognition
- Build a DFA to recognize the regular language represented by the regular expression (a + b)*abb over the alphabet A = {a, b}.
- The language is the set of strings that begin with anything, but must end with the string abb.
- Effectively, we're looking for strings which have a particular pattern to them
13Solution - (a + b)*abb
The diagram below shows a DFA to recognize this language.
- If in state 1, the last character was a
- If in state 2, the last two symbols were ab
- If in state 3, the last three were abb
14State transition function
We can also represent a DFA by a state transition function, which we'll denote by T, where any state transition from state i to state j on input a is represented by T(i, a) = j. To describe a full DFA we need to know what states there are, which are the start and final ones, and the set of transitions between them.
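A transition function maps directly onto a lookup table. The sketch below stores T as a Python dictionary and runs a DFA for (a + b)*abb. The transition table is an assumption chosen to agree with the state descriptions on the solution slide (state 1: last character was a; state 2: last two were ab; state 3: last three were abb). The names `accepts` and `T_ABB` are hypothetical.

```python
def accepts(T, start, finals, w):
    """Run a DFA: follow T(state, ch) for each character, then test the end state."""
    state = start
    for ch in w:
        state = T[(state, ch)]
    return state in finals


# Assumed DFA for (a + b)*abb: start state 0, final state 3.
T_ABB = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}

print(accepts(T_ABB, 0, {3}, "aabb"))  # True
print(accepts(T_ABB, 0, {3}, "abba"))  # False
```

Because every (state, letter) pair has exactly one entry, the loop never has to choose between paths, which is the deterministic property.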
15Regular languages
- The class of regular languages is exactly the same as the class of languages accepted by DFAs!
- Kleene (1956)
- For any regular language, we can find a DFA which recognizes it!
16Applications of DFAs
- DFAs are very often used for pattern matching, e.g. searching for words/structures in strings
- This is used often in UNIX, particularly by the grep command, which searches for combinations of strings and wildcards (*, ?)
- grep takes its name from the ed command g/re/p: Globally search for a Regular Expression and Print
- DFAs are also used to design and check simple circuits, to verify protocols, etc.
- They are of use whenever significant memory is not required
17Non-deterministic Finite Automata
- DFAs are called deterministic because, following any input string, we know exactly which state it's in and the path it took to get there
- For NFAs, sometimes there is more than one direction we can go with the same input character
- Non-determinism can occur because, following a particular string, one could be in many possible states, or have taken different paths to end at the same state!
18NFAs
- A non-deterministic finite automaton (NFA) over an alphabet A is a finite directed graph with each node having zero or more edges.
- Each edge is labelled either with a letter from A or with Λ.
- Multiple edges may be emitted from the same node with the same label.
- Some letters may not have an edge associated with them. Strings following such paths are not recognised.
19Non-determinism
- If an edge is labelled with the empty string Λ, then we can travel the edge without consuming an input letter. Effectively we could be in either state, and so the possible paths could branch.
- If there are two edges with the same label, we can take either path.
- An NFA recognises a string if any one of the possible states it could be in after reading the string is a final state
- Otherwise, it rejects it.
20NFAs versus DFAs
NFA for aa
DFA for aa
Why is the top an NFA while the bottom is a DFA?
21EXAMPLE
- Draw two NFAs to recognize the language of the regular expression ab + a*a.
- This NFA has a Λ edge, which allows us to travel to state 2 without consuming an input letter.
- The upper path corresponds to ab and the lower one to a*a
22An equivalent NFA
This NFA also recognizes the same language.
Perhaps it's easier to see this by considering
the equality ab + a*a = ab + aa*.
23NFA transition functions
- Since there may be nondeterminism, we'll let the values of this function be sets of states.
- For example, if there are no edges from state k labelled with a, we'll write T(k, a) = ∅.
- If there are three edges from state k, all labelled with a, going to states i, j and k, we'll write T(k, a) = {i, j, k}.
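A set-valued transition function can be simulated by tracking every state the NFA could currently be in. The sketch below is a minimal illustration, not a definitive implementation. It assumes an NFA for ab + a*a with hypothetical state numbers (0 is the start, 3 is final, and a Λ edge leads from 0 to 2), and it encodes the Λ label as the empty string. Missing dictionary entries play the role of T(k, a) = ∅.

```python
LAMBDA = ""  # assumed encoding of the Λ label: an edge that consumes no input


def closure(T, states):
    """All states reachable from `states` using only Λ edges."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in T.get((s, LAMBDA), set()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def nfa_accepts(T, start, finals, w):
    """Accept if any state reachable after reading w is final."""
    current = closure(T, {start})
    for ch in w:
        moved = set()
        for s in current:
            moved |= T.get((s, ch), set())  # a missing entry means the empty set
        current = closure(T, moved)
    return bool(current & finals)


# Assumed NFA for ab + a*a.
T_NFA = {
    (0, "a"): {1},
    (1, "b"): {3},
    (0, LAMBDA): {2},
    (2, "a"): {2, 3},  # two edges with the same label from state 2
}

print(nfa_accepts(T_NFA, 0, {3}, "ab"))   # True
print(nfa_accepts(T_NFA, 0, {3}, "aaa"))  # True
print(nfa_accepts(T_NFA, 0, {3}, "aab"))  # False
```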
24Comments on non-determinism
- All digital computers are deterministic; quantum computers may be another story!
- The usual mechanism for deterministic computers is to try one particular path and to backtrack to the last decision point if that path proves poor
- Parallel computers make non-determinism almost realizable. We can let each process make a random choice at each branch point, thereby exploring many possible trees.
25Perhaps surprisingly
- The class of regular languages is exactly the same as the class of languages accepted by NFAs!
- Rabin and Scott (1959)
- Just like for DFAs!
- Every NFA has an equivalent DFA which recognises the same language.
26NFAs to DFAs
- We prove the equivalence of NFAs and DFAs by showing how, for any NFA, to construct a DFA which recognises the same language
- Generally the DFA will have more possible states than the NFA. If the NFA has n states, then the DFA could have as many as 2^n states!
- Example: the NFA has three states A, B, C
- the DFA could have eight: ∅, {A}, {B}, {C}, {A, B}, {A, C}, {B, C}, {A, B, C}
- These correspond to the possible sets of states the NFA could be in after any string
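The count of 2^n comes from listing every subset of the NFA's state set. A short sketch, using a hypothetical helper named `subsets`, enumerates them for the three-state example:

```python
from itertools import combinations


def subsets(states):
    """Every subset of `states`, from the empty set up to the full set."""
    items = sorted(states)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]


candidates = subsets({"A", "B", "C"})
print(len(candidates))  # 8
```

In practice most of these candidate states are often unreachable, so a construction that only explores reachable sets usually produces far fewer.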
27DFA construction
- Begin in the NFA start state, which could be a set of several states if it's connected to any others by Λ edges
- Determine the set of possible NFA states you could be in after receiving each character. Each set is a new DFA state, and is connected to the start by that character.
- Repeat for each new DFA state, exploring the possible results for each character until the system is closed
- DFA final states are any that contain an NFA final state
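28Example (a + b)*ab
[Figure: NFA with states A, B and C, with edges labelled a,b, a and b]
- The start state is A, but following an a you could be in A or B; following a b you could only be in state A
[Figure: resulting DFA with states {A}, {A, B} and {A, C}, with edges labelled a and b]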
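The construction can be sketched in Python as a worklist over sets of NFA states. The NFA edges used here are an assumption consistent with the example: A loops on a and b, A goes to B on a, B goes to C on b, and C is final. For brevity this sketch also assumes the NFA has no Λ edges; handling them would mean taking a Λ-closure of the start set and of every target set. The function name `subset_construction` is hypothetical.

```python
def subset_construction(T, start, finals, alphabet):
    """Build a DFA whose states are sets of NFA states (no Λ edges assumed)."""
    start_set = frozenset({start})
    dfa_T, todo, seen = {}, [start_set], {start_set}
    while todo:
        S = todo.pop()
        for ch in alphabet:
            # Union of T(s, ch) over every NFA state s in the current set.
            target = frozenset(t for s in S for t in T.get((s, ch), ()))
            dfa_T[(S, ch)] = target
            if target not in seen:  # a new DFA state: explore it later
                seen.add(target)
                todo.append(target)
    dfa_finals = {S for S in seen if S & finals}
    return dfa_T, start_set, dfa_finals, seen


# Assumed NFA for (a + b)*ab.
T_NFA_AB = {
    ("A", "a"): {"A", "B"},
    ("A", "b"): {"A"},
    ("B", "b"): {"C"},
}

dfa_T, dfa_start, dfa_finals, dfa_states = subset_construction(
    T_NFA_AB, "A", {"C"}, "ab")
print(sorted(sorted(S) for S in dfa_states))
# [['A'], ['A', 'B'], ['A', 'C']]
```

Only three of the eight possible subsets are reachable here, and {A, C} is the single final state because it is the only one containing C.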
29Summary
- Regular expressions represent the regular languages.
- DFAs recognize the regular languages.
- NFAs also recognize the regular languages.
30Finite automata
- So far, we've introduced two kinds of automata: deterministic and non-deterministic.
- We've shown that we can find a DFA to recognise any language that a given NFA recognises.
- We've asserted that both DFAs and NFAs recognise the regular languages, which themselves are represented by regular expressions
- We prove this by construction, by showing how any regular expression can be made into an NFA and vice versa
31Regular Expressions → Finite Automata
- Given a regular expression, we can find an automaton which recognises its language
- Start the algorithm with a machine that has a start state, a single final state, and an edge labelled with the given regular expression as follows
[Figure: start state joined to the final state by one edge labelled with the regular expression]
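32Four step algorithm
- 1. If an edge is labelled with ∅, then erase the edge.
- 2. Transform any diagram like [Figure: edge from i to j labelled R + S] into the diagram [Figure: two edges from i to j, one labelled R and one labelled S]
- 3. Transform any diagram like [Figure: edge from i to j labelled RS] into the diagram [Figure: edge labelled R from i to a new state, then edge labelled S from the new state to j]
- 4. Transform any diagram like [Figure: edge from i to j labelled R*] into the diagram [Figure: Λ edge from i to a new state, a loop labelled R on the new state, and a Λ edge from the new state to j]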
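35Example a* + ab
- Construct an NFA for the regular expression a* + ab
- Start with [Figure: single edge from the start state to the final state labelled a* + ab]
- Apply rule 2 [Figure: two edges from the start state to the final state, labelled a* and ab]
36Example a* + ab
- Apply rule 4 to a* [Figure: the a* edge replaced by a new state with a loop labelled a, entered and left by Λ edges; the ab edge unchanged]
- Apply rule 3 to ab [Figure: the ab edge replaced by an edge labelled a into a new state, followed by an edge labelled b]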
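The edge-rewriting procedure can be sketched in Python by repeatedly replacing any edge whose label is still a compound expression. Everything about the representation is an assumption made for illustration: regular expressions are nested tuples (`("sym", "a")`, `("sum", R, S)`, `("cat", R, S)`, `("star", R)`, `("lam",)`, `("empty",)`), Λ edges are emitted with the empty string as label, and the reading of rules 2, 3 and 4 as sum, product and closure is inferred from the worked example.

```python
def regex_to_nfa(regex):
    """Return (edges, start, final); each edge is (src, label, dst), "" means Λ."""
    start, final = 0, 1
    next_state = 2
    pending = [(start, regex, final)]  # edges whose labels may still be compound
    edges = []
    while pending:
        i, r, j = pending.pop()
        kind = r[0]
        if kind == "empty":      # rule 1: erase the edge
            continue
        elif kind == "sum":      # rule 2: two parallel edges
            pending += [(i, r[1], j), (i, r[2], j)]
        elif kind == "cat":      # rule 3: route through a new middle state
            k = next_state
            next_state += 1
            pending += [(i, r[1], k), (k, r[2], j)]
        elif kind == "star":     # rule 4: loop on a new state, with Λ edges in and out
            k = next_state
            next_state += 1
            pending += [(i, ("lam",), k), (k, r[1], k), (k, ("lam",), j)]
        elif kind == "lam":
            edges.append((i, "", j))
        else:                    # ("sym", letter): nothing left to expand
            edges.append((i, r[1], j))
    return edges, start, final


# a* + ab
example = ("sum", ("star", ("sym", "a")), ("cat", ("sym", "a"), ("sym", "b")))
edges, start, final = regex_to_nfa(example)
print(len(edges))  # 5
```

For a* + ab this yields five edges over four states: two Λ edges and an a loop for the a* branch, and an a edge followed by a b edge for the ab branch.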
37Finite Automaton → Regular Expression
- 1. Create a new start state s, and draw a new edge labelled with Λ from s to the original start state.
- 2. Create a new final state f, and draw new edges labelled with Λ from all the original final states to f
- 3. For each pair of states i and j that have more than one edge from i to j, replace all the edges from i to j by a single edge labelled with the regular expression formed by the sum of the labels on each of the edges from i to j.
- 4. Construct a sequence of new machines by eliminating one state at a time until the only states remaining are s and f.
39Eliminating states
- As each state is eliminated, a new machine is constructed from the previous machine as follows
- Let old(i, j) denote the label on edge ⟨i, j⟩ of the current machine. If no edge exists, label it ∅.
- Assume that we wish to eliminate state k. For each pair of edges ⟨i, k⟩ (incoming edge) and ⟨k, j⟩ (outgoing edge) we create a new edge label new(i, j)
40Eliminate state k
- The label of this new edge is given by
- new(i, j) = old(i, j) + old(i, k) old(k, k)* old(k, j)
- All other edges, not involving state k, remain the same: new(i, j) = old(i, j).
- After eliminating all states except s and f, we wind up with a two-state machine with the single edge ⟨s, f⟩ labelled with the desired regular expression new(s, f)
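The elimination formula can be sketched in Python with labels stored as strings. All names here (`EMPTY`, `LAM`, `cat`, `plus`, `star`, `eliminate`, `to_regex`) are hypothetical, and the machine at the bottom is an assumed example: a DFA for a(a + b)* with start state 0, final state 1 and a dead state 2, already extended with s and f and with parallel edges merged by hand. The helpers apply only the simplifications ∅R = ∅, ΛR = R, ∅ + R = R and ∅* = Λ* = Λ.

```python
EMPTY = None  # stands for the label ∅ (no edge)
LAM = "Λ"


def is_group(r):
    """True if r is a single parenthesised group such as "(a + b)"."""
    depth = 0
    for idx, ch in enumerate(r):
        depth += ch == "("
        depth -= ch == ")"
        if depth == 0:
            return idx == len(r) - 1 and r[0] == "("
    return False


def cat(*parts):
    if any(p is EMPTY for p in parts):
        return EMPTY                      # ∅ absorbs concatenation
    kept = [p for p in parts if p != LAM]  # Λ is the identity
    return "".join(kept) if kept else LAM


def plus(r, s):
    if r is EMPTY:
        return s
    if s is EMPTY:
        return r
    return r if r == s else f"({r} + {s})"


def star(r):
    if r is EMPTY or r == LAM:
        return LAM
    return f"{r}*" if len(r) == 1 or is_group(r) else f"({r})*"


def eliminate(old, k):
    """new(i, j) = old(i, j) + old(i, k) old(k, k)* old(k, j) for all i, j != k."""
    states = {s for edge in old for s in edge} - {k}
    loop = star(old.get((k, k), EMPTY))
    new = {}
    for i in states:
        for j in states:
            via_k = cat(old.get((i, k), EMPTY), loop, old.get((k, j), EMPTY))
            label = plus(old.get((i, j), EMPTY), via_k)
            if label is not EMPTY:
                new[(i, j)] = label
    return new


def to_regex(old, order):
    for k in order:
        old = eliminate(old, k)
    return old.get(("s", "f"), EMPTY)


machine = {
    ("s", 0): LAM, (0, 1): "a", (0, 2): "b",
    (1, 1): "(a + b)", (2, 2): "(a + b)", (1, "f"): LAM,
}
print(to_regex(machine, [2, 0, 1]))  # a(a + b)*
```

Eliminating state 2 first adds nothing, because no edge leaves it towards f, so every label routed through it is ∅.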
41Example
Initial DFA
Steps 1 and 2: Add start and final states
42Eliminate state 2 (No path to f)
Eliminate state 0
Eliminate state 1
Final regular expression
43Finding simpler automata
- Sometimes our constructions lead to more complicated automata than we need, having more states than are really necessary
- Next, we look for ways of making DFAs with a minimum number of states
- Myhill-Nerode theorem
- Every regular expression has a unique minimum state DFA, up to a simple renaming of the states
44Finding minimum state DFA
- Two steps to minimizing a DFA
- 1) Discover which, if any, pairs of states are indistinguishable. Two states, s and t, are equivalent (indistinguishable) if, for all possible strings w, the states T(s, w) and T(t, w) are either both final or both non-final.
- 2) Combine all equivalent states into a single state, modifying the transition functions appropriately.
45Consider the DFA
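[Figure: DFA containing states 1 and 2, with edges labelled a, b and a,b]
- States 1 and 2 are indistinguishable! Starting in either, b* is rejected and anything with an a in it is accepted.
[Figure: the same DFA with states 1 and 2 merged into one state]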
46Part 1 finding indistinguishable pairs
- Remove all inaccessible states, where no path exists to them from the start state.
- Construct a grid of pairs of states.
- Begin by marking those pairs which are clearly distinguishable, where one is final and the other non-final.
- Next eliminate all pairs which, on the same input, lead to a distinguishable pair of states. Repeat until you have considered all pairs.
- The remaining pairs are indistinguishable.
47Part 2 construct minimum DFA
- Construct a new DFA where any pairs of indistinguishable states form a single state in the new DFA
- The start state will be the state containing the original start state.
- The final states will be those which contain original final states
- The transitions will be the full set of transitions from the original states (these should all be consistent.)
48Example
[Figure: DFA with states 0, 1, 2, 3 and 4, with edges labelled a, b and a,b; state 4 is final]
What are the distinguishable pairs of states? Clearly, {0, 4}, {1, 4}, {2, 4}, {3, 4} are all distinguishable because 4 is final but none of the others are.
49Grid of pairs of states
- We eliminate these as possible indistinguishable pairs
- Next consider {0, 1}. With input a, this becomes {3, 4}, which is distinguishable, so {0, 1} is as well.
- Similarly, we can show {0, 2} and {0, 3} are also distinguishable, leading to the modified grid
50Remaining pairs
- We are left with
- {1, 2}: given a → {4, 4}; given b → {2, 1}
- {2, 3}: given a → {4, 4}; given b → {1, 2}
- {1, 3}: given a → {4, 4}; given b → {2, 2}
- These do not lead to pairs we know to be distinguishable, and are therefore indistinguishable!
51Construct minimal DFA
- States 1, 2, and 3 are all indistinguishable, thus the minimal DFA will have three states: {0}, {1, 2, 3}, {4}
- Since originally T(0, a) = 3 and T(0, b) = 1, the new transitions are T({0}, a) = T({0}, b) = {1, 2, 3}
- Similarly, T({1, 2, 3}, a) = {4} and T({1, 2, 3}, b) = {1, 2, 3}
- Finally, as before, T({4}, a) = {4} and T({4}, b) = {4}
52Resulting minimal DFA
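- The resulting DFA is much simpler
[Figure: minimal DFA with states {0}, {1, 2, 3} and {4}: a,b from {0} to {1, 2, 3}; b loop on {1, 2, 3}; a from {1, 2, 3} to {4}; a,b loop on {4}]
This recognises strings matching the regular expression (a + b) b* a (a + b)*. This is the simplest DFA which will recognise this language!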
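The two-part minimization can be sketched in Python on this example. The transition table below is read off the slides (T(0, a) = 3, T(0, b) = 1, the a and b successors listed for the remaining pairs, and T(4, a) = T(4, b) = 4). The function names are hypothetical, and the sketch assumes inaccessible states have already been removed.

```python
from itertools import combinations


def indistinguishable_pairs(states, alphabet, T, finals):
    """Part 1: mark distinguishable pairs until nothing changes; return the rest."""
    all_pairs = {frozenset(p) for p in combinations(states, 2)}
    # Pairs with exactly one final state are clearly distinguishable.
    marked = {p for p in all_pairs if len(p & finals) == 1}
    changed = True
    while changed:
        changed = False
        for pair in all_pairs - marked:
            s, t = tuple(pair)
            for ch in alphabet:
                succ = frozenset({T[(s, ch)], T[(t, ch)]})
                if succ in marked:  # the same input leads to a distinguishable pair
                    marked.add(pair)
                    changed = True
                    break
    return all_pairs - marked


def merge(states, pairs):
    """Part 2: group states into blocks of mutually indistinguishable states."""
    blocks = []
    for s in states:
        for block in blocks:
            if any(frozenset({s, t}) in pairs for t in block):
                block.add(s)
                break
        else:
            blocks.append({s})
    return [frozenset(b) for b in blocks]


def minimal_T(blocks, alphabet, T):
    """Transitions between blocks, taken from any representative of each block."""
    block_of = {s: b for b in blocks for s in b}
    return {(b, ch): block_of[T[(next(iter(b)), ch)]]
            for b in blocks for ch in alphabet}


T_EX = {
    (0, "a"): 3, (0, "b"): 1,
    (1, "a"): 4, (1, "b"): 2,
    (2, "a"): 4, (2, "b"): 1,
    (3, "a"): 4, (3, "b"): 2,
    (4, "a"): 4, (4, "b"): 4,
}
STATES = [0, 1, 2, 3, 4]

pairs = indistinguishable_pairs(STATES, "ab", T_EX, {4})
blocks = merge(STATES, pairs)
print([sorted(b) for b in blocks])  # [[0], [1, 2, 3], [4]]
```

Using any single representative in `minimal_T` is safe because, as the slides note, the transitions out of indistinguishable states are all consistent with one another.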
53Conclusions
- We now have many equivalent ways of representing regular languages: DFAs, NFAs, regular expressions and regular grammars
- We can also now simply(?!) move between these various representations.
- We'll see next lecture that the automata representation leads to a simple way of recognising some languages which are not regular.
- We'll also begin to consider more powerful language types and correspondingly more powerful computing models!