Finite-State automata
Sheep language
baa!
baaa!
baaaa!
baaaaa!
baaaaaa!
and so on....
The machine for this language
Diagram and terminological conventions
- Circles in diagrams are called nodes or
vertices or states.
- Arrows between nodes are called (labeled) transitions
or (labeled) arcs
- Labels on arcs represent input that must be consumed when that
arc is taken
- Arrow at left (which doesn't come
from a state) represents entry point or initial
state of automaton.
- Double circle represents final state.
- baa!: machine succeeds
- baa...a!: machine succeeds
- /baa+!/: regular expression for the
language accepted by this machine
Failure examples:
- aba! (Fig. 2.11): machine fails (q0)
- bba! (Fig. 2.11): machine fails (q1)
- bab! (Fig. 2.11): machine fails (q2)
- ba! (Fig. 2.11): machine fails (q2)
- baab! (Fig. 2.11): machine fails (q3)
- baaaaaa!!,baaaaaa!a,: machine fails (q4)
A property of
deterministic Finite State automata:
a unique failure state for any string.
Machine for accepting spoken numbers
1-99
Represent FSA as a state-transition table
State-transition table for sheep language
Conventions for state-transition table
- Initial state listed first
- Final states have ":" (State 4 here)
- "Null" symbol for failure.
More formally (p. 36):
- Q: a finite set of states (rows in table)
- Sigma: a finite input alphabet of symbols (columns in table)
- q0: the start state (first state)
- F (subset of Q): Final States, the states with colons
- delta(q,i): the transition function from states
and inputs to states (what's in the cell for row q
and column i).
Properties of Regular Languages
Observation
|
Many infinite sets of strings are regular languages.
|
Example:
Sigma*
|
Observation
|
All finite sets of strings are regular languages.
|
Observation
|
An FSA accepting an infinite language must have a loop.
|
Reason:
A finite set of states must serve to accept an
infinite set of strings.
|
Formalization of this intuition:
Theorem: The Pumping Lemma
|
If L is an infinite finite automaton language
over alphabet Sigma, then there are strings
x,y,z in Sigma*, y non empty,
such that xynz is in L for
all n > 0 or n = 0.
|
Example
|
a(ba)*c is a regular language.
Pumping String:
- x=a
- y=ba
- z=c
|
Sketch of proof: We have an infinite language
but a finite number of states. Let
n be the
number of states and consider a string
s of length
n. In admitting
s, some state S
i must have been visited twice.
Let
s=
xyz,
where
y is the substring admitted
by the state sequence connecting
S
i to S
i. Then
for any
n > 0,
xynz
is admitted by the machine.
The theorem is often used in its contrapositive
form:
Contrapositive Pumping Lemma
|
If there are no strings
x,y,z in Sigma*, y non empty,
such that xynz is in L for
all n > 0 or n = 0,
then L is not an infinite finite automaton language.
|
Example
|
It can be shown, using The Pumping Lemma,
that
anbn
is not an infinite finite automaton language.
|
Proof by enumeration of cases. Each of
the following exhaustive listing of possibilities
is impossible:
- x is the empty string, z is the empty string.
- x is the empty string, z is non empty.
- z consists entirely of a's.
- z consists entirely of b's.
- z consists of a string of a's followed by a string of b's.
- x is non empty, z is the empty string.
- x consists entirely of a's.
- x consists entirely of b's.
- x consists of a string of a's followed by a string of b's.
- x consists of a nonempty string of a's. y ends with
a non empty string of b's
- y is all a's
- y is all b's.
- y is a sequence of a's followed by a sequence of b's.
Implementing FSAs
Deterministic Finite-State Automaton (DFSA) as a recognizer or string
accepter:
- Input tape with one symbol on each cell of the tape.
Machine set to read beginning of tape, set in starting state.
Standard input tape set up
Another property of input tapes, at least the way we're going to think
about them, is that we cxan think of them as being
indexed. Each cell on the tape has a number.
Usually the starting position is the tape is numbered 0 and
we move from one symbol to the right on the tape by adding 1.
- Advance to the next cell
(next index) on the tape. If the symbol there matches
the label on an arc leaving
the current state of the machine, follow that arc to the next state. If not,
fail.
- If there are no more symbols on the tape, we must be
in a final state. If not, fail.
Implementing a DFSA
- Data Types and data structures
- Index: an integer (0, 1, 2, 3 ...)
- Tape: a one-column table (one-argument function).
Each index is associated with exactly one
symbol on the input tape.
0 b
1 a
2 a
3 a
4 !
5 eoi
tape(index) returns the symbol for that index.
This stands for "end of input".
- Transition-table is a table with as many columns as
there are symbols in the input alphabet ({a,b,!} Fig. 2.12)
transition-table(current-state, symbol)
means look in the row for the
current state, the column for symbol, and return
the new state indicated by the table.
transition-table(2,a) = 3
for the table in 2.12. In general we have to
look on the tape for a symbol. So:
transition-table(2,tape(3)) = 3
Notice how everything is getting turned into
operations on numbers.
- Pseudo code for DFSA
- Trace of a DFSA
Non-Deterministic Finite-State Automata (NFSA)
NFSAs as recognizers:
- Machine set to read beginning of tape, set in starting state.
- Advance to the next cell
(next index) on the tape. If the symbol there matches
the label on an arc leaving
the current state of the machine, follow that arc to the next state. If not,
try another path through the machine
that matches the input thus far. If there are no more
paths through the machine that match the input, fail.
- If there are no more symbols on the tape, we must be
in a final state. If not, try another path through the machine
that matches the input. If there are no more symbols on the tape and no other
paths through the machine that match the input, fail.
This is a
search problem, because of the steps that
include "try another ..."
Implementing a NFSA
- Pseudo code for NFSA
- Trace of an NFSA