Finite Automata and Regular Expressions


2025-08-07

Types of Finite Automata

Deterministic Finite Automata (DFA)

A DFA is defined as a 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$ where:

$Q$ : Finite set of states
$\Sigma$ : Input alphabet
$\delta$ : Transition function $\delta: Q \times \Sigma \rightarrow Q$
$q_0$ : Start state ( $q_0 \in Q$ )
$F$ : Set of accept states ( $F \subseteq Q$ )

Important

In a DFA, for every state and input symbol, there is exactly one next state.
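The "exactly one next state" condition can be checked mechanically. The sketch below assumes a DFA stored in plain Python containers, with $\delta$ as a dictionary keyed by (state, symbol) pairs; the name `is_valid_dfa` and this encoding are illustrative choices, not part of the formal definition.

```python
def is_valid_dfa(states, alphabet, delta, start, accepts):
    """Check the 5-tuple constraints for a DFA stored in sets and a dict."""
    # q0 must be in Q and F must be a subset of Q.
    if start not in states or not accepts <= states:
        return False
    # delta must be defined on every (state, symbol) pair and nowhere else,
    # which is what "exactly one next state" means for a dictionary.
    expected_keys = {(q, a) for q in states for a in alphabet}
    if set(delta) != expected_keys:
        return False
    # Every transition must land back inside Q.
    return all(target in states for target in delta.values())
```

A dictionary already rules out two targets for one key, so the only way to violate the condition in this encoding is a missing or stray entry.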

Nondeterministic Finite Automata (NFA)

An NFA is defined as a 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$ where:

$Q$ : Finite set of states
$\Sigma$ : Input alphabet
$\delta$ : Transition function $\delta: Q \times (\Sigma \cup \{\epsilon\}) \rightarrow \mathcal{P}(Q)$
$q_0$ : Start state ( $q_0 \in Q$ )
$F$ : Set of accept states ( $F \subseteq Q$ )

Tips

NFAs can have:

Multiple transitions for the same symbol from a state
$\epsilon$ -transitions (moves without reading input)
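Both features can be simulated by tracking the set of states the NFA could currently be in. The following is a minimal sketch, assuming $\delta$ is a dictionary from (state, symbol) to a set of states and that the empty string `""` stands for $\epsilon$; the sample automaton (strings over {a, b} ending in "ab") and all names are hypothetical.

```python
def epsilon_closure(states, delta):
    """Return every state reachable from `states` using only epsilon moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def nfa_accepts(delta, start, accepts, word):
    """Simulate the NFA by following all possible runs at once."""
    current = epsilon_closure({start}, delta)
    for symbol in word:
        moved = set()
        for q in current:
            moved |= delta.get((q, symbol), set())
        current = epsilon_closure(moved, delta)
    return bool(current & accepts)


# Hypothetical example: "s" jumps to "p" on epsilon, and "p" has two
# transitions on "a" (stay in "p", or guess that the final "ab" starts here).
ENDS_IN_AB = {
    ("s", ""): {"p"},
    ("p", "a"): {"p", "q"},
    ("p", "b"): {"p"},
    ("q", "b"): {"f"},
}
```

For instance, `nfa_accepts(ENDS_IN_AB, "s", {"f"}, "aab")` follows both choices on each "a" and accepts because one run ends in "f".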

Equivalence of DFA and NFA

Theorem: For every NFA, there exists an equivalent DFA that recognizes the same language.

Subset Construction Method

The standard algorithm to convert an NFA to a DFA:

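Start with the $\epsilon$-closure of the NFA's start state; this set of NFA states is the DFA's start state
For each DFA state (a set of NFA states) and each input symbol, collect the NFA states reachable on that symbol and take their $\epsilon$-closure; the result is the next DFA state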
A DFA state is accepting if it contains any NFA accept state
Continue until no new states are generated
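The steps above can be sketched as a worklist algorithm. This is an illustrative version, assuming the NFA's $\delta$ is a dictionary from (state, symbol) to a set of states with `""` standing for $\epsilon$; the function names and the return shape are choices made for this sketch.

```python
from collections import deque


def eps_closure(states, delta):
    """Set of states reachable from `states` by epsilon moves alone."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def subset_construction(alphabet, delta, start, accepts):
    """Return (dfa_states, dfa_delta, dfa_start, dfa_accepts)."""
    dfa_start = eps_closure({start}, delta)
    dfa_delta, dfa_accepts = {}, set()
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:                      # stop once no new subsets appear
        current = queue.popleft()
        if current & accepts:         # contains some NFA accept state
            dfa_accepts.add(current)
        for symbol in alphabet:
            moved = set()
            for q in current:
                moved |= delta.get((q, symbol), set())
            target = eps_closure(moved, delta)
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen, dfa_delta, dfa_start, dfa_accepts
```

Only subsets reachable from the start set are generated, so the result is often far smaller than the $2^{|Q|}$ worst case.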

Regular Expressions to Automata

Thompson's Construction

A systematic method to convert regular expressions to NFAs:

Base case: Handle $\emptyset$ , $\epsilon$ , and single symbols

Inductive case: Handle union, concatenation, and Kleene star

Base Cases

Regular Expression	NFA Construction
$\emptyset$	Single state, no transitions
$\epsilon$	Start state with an $\epsilon$-transition to a separate accept state
$a \in \Sigma$	Two states with transition labeled 'a'

Inductive Cases

Union $(r_1 + r_2)$ :
- Create new start and accept states
- Add $\epsilon$ -transitions from new start to $r_1$ and $r_2$ starts
- Add $\epsilon$ -transitions from $r_1$ and $r_2$ accepts to new accept
Concatenation $(r_1 \cdot r_2)$ :
- Connect accept state of $r_1$ to start state of $r_2$ with $\epsilon$ -transition
Kleene Star $(r_1^*)$ :
- Create new start and accept states
- Add $\epsilon$-transitions:
  - From new start to $r_1$ start
  - From $r_1$ accept to new accept
  - From $r_1$ accept back to $r_1$ start
  - From new start to new accept (empty string case)
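The base and inductive cases translate almost line for line into a recursive builder. The sketch below assumes the regular expression is already parsed into nested tuples such as `("union", r1, r2)`; that encoding, the integer state names, and the helper `matches` are assumptions of this example. For $\emptyset$ it uses a start state plus an unreachable accept state so that every fragment has the same (start, accept) shape.

```python
from itertools import count


def thompson(regex):
    """Return (start, accept, delta) for a regex given as nested tuples."""
    fresh = count()          # supplies new integer state names
    delta = {}               # (state, label) -> set of states; "" is epsilon

    def add(src, label, dst):
        delta.setdefault((src, label), set()).add(dst)

    def build(node):
        kind = node[0]
        if kind == "concat":
            s1, f1 = build(node[1])
            s2, f2 = build(node[2])
            add(f1, "", s2)                  # r1 accept -> r2 start
            return s1, f2
        sub = [build(child) for child in node[1:]] if kind in ("union", "star") else []
        s, f = next(fresh), next(fresh)      # new start and accept states
        if kind == "eps":
            add(s, "", f)
        elif kind == "sym":
            add(s, node[1], f)
        elif kind == "union":
            for s_i, f_i in sub:
                add(s, "", s_i)
                add(f_i, "", f)
        elif kind == "star":
            (s1, f1), = sub
            add(s, "", s1)
            add(f1, "", f)
            add(f1, "", s1)                  # repeat r1
            add(s, "", f)                    # empty string case
        elif kind != "empty":                # "empty" adds no transitions
            raise ValueError(f"unknown node: {kind}")
        return s, f

    start, accept = build(regex)
    return start, accept, delta


def matches(regex, word):
    """Run the constructed epsilon-NFA on `word`."""
    start, accept, delta = thompson(regex)

    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for r in delta.get((q, ""), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({r for q in current for r in delta.get((q, ch), ())})
    return accept in current
```

Each operator adds at most two states, so the NFA has a number of states linear in the size of the expression.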

Minimization of DFAs

Equivalence Relations

Two states $p$ and $q$ are equivalent if for all strings $w$ :

$\delta^*(p, w) \in F$ if and only if $\delta^*(q, w) \in F$

Myhill-Nerode Theorem

Theorem: A language $L$ is regular if and only if the relation $\equiv_L$ has finite index, where $x \equiv_L y$ means that for every string $z$ , $xz \in L$ if and only if $yz \in L$ .

The minimal DFA for $L$ has exactly as many states as there are equivalence classes of $\equiv_L$ .
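Two short worked cases may help make the theorem concrete. Say that a suffix $z$ distinguishes $x$ and $y$ when exactly one of $xz$ and $yz$ lies in the language. The first language is the one from Example 1 later on this page; the second is a standard non-regular language chosen here for illustration.

```latex
\begin{align*}
L_1 &= \{\, w \in \{a,b\}^* : w \text{ has an even number of } a\text{'s} \,\}
  && \text{classes: } [\epsilon],\ [a]
  && \Rightarrow \text{index } 2 \\
L_2 &= \{\, a^n b^n : n \ge 0 \,\}
  && a^i b^i \in L_2,\ a^j b^i \notin L_2 \ (i \ne j)
  && \Rightarrow \text{infinite index}
\end{align*}
```

For $L_1$, two strings are indistinguishable exactly when their counts of 'a' have the same parity, which gives two classes and matches the two states of the DFA in Example 1. For $L_2$, the suffix $b^i$ distinguishes $a^i$ from $a^j$ whenever $i \ne j$, so the strings $a^0, a^1, a^2, \dots$ all fall into different classes and $L_2$ is not regular.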

Minimization Algorithm

Initial Partition: Separate states into accept and non-accept states
Refinement: For each group, split apart states whose transitions on some input symbol lead to different groups
Repeat: Continue refinement until no more splits are possible
Merge: States in the same final group are equivalent and can be merged
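The partition-refinement steps above can be sketched as follows. This version assumes a total DFA stored as a dictionary from (state, symbol) to state, and it first discards unreachable states, a step the list above does not mention but that is needed for the result to be minimal. The function name and return value (the final set of groups) are choices made for this example.

```python
def minimize(alphabet, delta, start, accepts):
    """Return the final partition: a set of frozensets of equivalent states."""
    symbols = sorted(alphabet)

    # Keep only states reachable from the start state.
    reachable, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for a in symbols:
            t = delta[(q, a)]
            if t not in reachable:
                reachable.add(t)
                stack.append(t)

    # Initial partition: accepting vs. non-accepting (dropping empty groups).
    partition = [g for g in (reachable & accepts, reachable - accepts) if g]

    while True:
        group_of = {q: i for i, group in enumerate(partition) for q in group}
        refined = []
        for group in partition:
            # States stay together only if every symbol sends them
            # to the same group.
            buckets = {}
            for q in group:
                signature = tuple(group_of[delta[(q, a)]] for a in symbols)
                buckets.setdefault(signature, set()).add(q)
            refined.extend(buckets.values())
        if len(refined) == len(partition):   # no group was split
            break
        partition = refined

    return {frozenset(group) for group in partition}
```

Each returned group becomes one state of the minimized DFA; its transitions can be read off from any representative of the group.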

Practical Examples

Example 1: DFA for Even Number of 'a's

States: {q0, q1}
Alphabet: {a, b}
Start: q0
Accept: {q0}

Transitions:
δ(q0, a) = q1
δ(q0, b) = q0  
δ(q1, a) = q0
δ(q1, b) = q1
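The transition table above can be run directly. In this sketch the dictionary `DELTA` and the function name `accepts_even_a` are illustrative; the states, start state, and accept set are copied from the example.

```python
# Transition function of Example 1, keyed by (state, symbol).
DELTA = {
    ("q0", "a"): "q1",
    ("q0", "b"): "q0",
    ("q1", "a"): "q0",
    ("q1", "b"): "q1",
}


def accepts_even_a(word):
    """Return True if `word` (over {a, b}) has an even number of 'a's."""
    state = "q0"                       # start state
    for symbol in word:
        state = DELTA[(state, symbol)]  # exactly one next state
    return state in {"q0"}             # accept set
```

Reading 'a' toggles between q0 and q1 while 'b' leaves the state unchanged, so the state records the parity of the number of 'a's seen so far.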

Example 2: NFA for $(a + b)^* aa (a + b)^*$

This NFA recognizes strings containing "aa" as a substring.
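One possible three-state NFA for this language is sketched below. The state names q0, q1, q2 and the transition table are assumptions made for illustration, since the example does not list them: q0 loops on every symbol and may guess that an "aa" starts, q1 waits for the second 'a', and q2 is an accepting sink.

```python
# Hypothetical NFA for "contains aa": (state, symbol) -> set of next states.
NFA_DELTA = {
    ("q0", "a"): {"q0", "q1"},   # stay, or guess that "aa" starts here
    ("q0", "b"): {"q0"},
    ("q1", "a"): {"q2"},         # second 'a' confirms the guess
    ("q2", "a"): {"q2"},
    ("q2", "b"): {"q2"},
}


def contains_aa(word):
    """Simulate the NFA by tracking every state it could be in."""
    current = {"q0"}
    for symbol in word:
        current = {t for q in current for t in NFA_DELTA.get((q, symbol), set())}
    return "q2" in current
```

There is no transition from q1 on 'b', so a wrong guess simply dies while the run that stayed in q0 carries on.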

Key Theorems

Kleene's Theorem

Theorem: A language can be described by a regular expression if and only if it is recognized by some finite automaton (DFA or NFA).

Equivalence of Models

All the following are equivalent for a language $L$ :

$L$ is regular (described by a regular expression)
$L$ is recognized by a DFA
$L$ is recognized by an NFA
$L$ is recognized by an $\epsilon$ -NFA
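The complement of $L$ is regular

The pumping lemma does not belong in this list. Every regular language has a pumping length, but some non-regular languages also satisfy the lemma, so it is a necessary condition for regularity and not a sufficient one. It can show that a language is not regular, never that it is.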

Applications

Regular expressions and finite automata are used in:

Lexical analysis: Compilers use DFAs to tokenize source code
Text search: Pattern matching in editors and search engines
Network protocols: State machines in protocol implementation
Digital circuits: Finite state machines in hardware design

Warning

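While NFAs are often easier to construct and understand, DFAs are generally faster to run because each input symbol triggers exactly one transition. The trade-off is size: the subset construction can produce up to $2^{|Q|}$ DFA states for an NFA with $|Q|$ states.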

Generalized Nondeterministic Finite Automata (GNFA)

Definition

A GNFA is a variant of NFA where:

Transitions are labeled with regular expressions (not just symbols)
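Exactly one start state, with transitions to every other state and no incoming transitions
Exactly one accept state, distinct from the start state, with transitions from every other state and no outgoing transitions
Every other state has a transition to itself and to every state except the start state (missing transitions are labeled $\emptyset$ )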

GNFA to Regular Expression Conversion

Algorithm:

Convert the NFA to a GNFA: add a new start state with an $\epsilon$-transition to the old start state, and a new accept state with $\epsilon$-transitions from every old accept state
Eliminate states one by one (except start and accept)
When eliminating state $q_{rip}$ , update transitions:
- For every pair of remaining states $q_i$ and $q_j$ :
  - Replace $R_{ij}$ with $R_{ij} + R_{i,rip} \cdot (R_{rip,rip})^* \cdot R_{rip,j}$
Final expression is the label on the transition from the start state to the accept state

This provides a systematic way to convert any finite automaton to a regular expression.
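A compact sketch of state elimination is shown below. It makes several assumptions for brevity: the input is a DFA given as a dictionary from (state, symbol) to state, symbols are single alphanumeric characters, expressions are plain strings using `+` for union and `ε` for the empty string, and `None` stands for $\emptyset$. The helper names are illustrative, and no simplification beyond the obvious identities is attempted.

```python
def union(r, s):
    """r + s, where None stands for the empty-set expression."""
    if r is None:
        return s
    if s is None:
        return r
    return r if r == s else f"({r}+{s})"


def concat(r, s):
    if r is None or s is None:
        return None              # concatenating with the empty set gives it back
    if r == "ε":
        return s
    if s == "ε":
        return r
    return r + s


def star(r):
    if r is None or r == "ε":
        return "ε"               # both (empty set)* and ε* denote {ε}
    return f"({r})*"


def dfa_to_regex(states, delta, start, accepts):
    """Eliminate every original state; return the start-to-accept label."""
    new_start, new_accept = object(), object()   # fresh GNFA endpoints
    labels = {}

    def add(i, j, r):
        labels[(i, j)] = union(labels.get((i, j)), r)

    add(new_start, start, "ε")
    for q in accepts:
        add(q, new_accept, "ε")
    for (q, symbol), target in delta.items():
        add(q, target, symbol)

    remaining = list(states)
    while remaining:
        rip = remaining.pop()
        loop = star(labels.get((rip, rip)))
        for i in remaining + [new_start]:
            for j in remaining + [new_accept]:
                # R_ij + R_i,rip (R_rip,rip)* R_rip,j
                detour = concat(concat(labels.get((i, rip)), loop),
                                labels.get((rip, j)))
                updated = union(labels.get((i, j)), detour)
                if updated is not None:
                    labels[(i, j)] = updated
        labels = {k: v for k, v in labels.items() if rip not in k}
    return labels.get((new_start, new_accept))   # None means the empty language
```

The order in which states are eliminated does not affect the language of the result, but it can change the size of the expression considerably.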

Testing Properties of Regular Languages

Emptiness Testing

Problem: Given a DFA $M$ , is $L(M) = \emptyset$ ?

Algorithm: Search the transition graph from the start state; $L(M) = \emptyset$ exactly when no accept state is reachable

Time Complexity: $O(|Q| + |\delta|)$ using DFS or BFS
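The reachability check is a plain graph search. The sketch below uses BFS and assumes a total DFA stored as a dictionary from (state, symbol) to state; the name `is_empty` is illustrative.

```python
from collections import deque


def is_empty(alphabet, delta, start, accepts):
    """Return True if no accept state is reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        q = queue.popleft()
        if q in accepts:
            return False          # some string leads to an accept state
        for symbol in alphabet:
            target = delta[(q, symbol)]
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return True
```

Each state is enqueued at most once and each transition is inspected at most once, which is where the $O(|Q| + |\delta|)$ bound comes from.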

Equivalence Testing

Problem: Given DFAs $M_1$ and $M_2$ , is $L(M_1) = L(M_2)$ ?

Algorithm:

Construct DFA for $(L(M_1) \cap \overline{L(M_2)}) \cup (L(M_2) \cap \overline{L(M_1)})$
Test if this language is empty
The language is empty if and only if $L(M_1) = L(M_2)$
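In practice the symmetric-difference DFA does not have to be built explicitly. The sketch below explores the product automaton on the fly and looks for a reachable pair of states where exactly one side accepts; such a pair is an accept state of the symmetric-difference DFA. It assumes both DFAs are total, share the same alphabet, and are passed as hypothetical `(delta, start, accepts)` triples.

```python
from collections import deque


def equivalent(alphabet, dfa1, dfa2):
    """Return True if the two DFAs accept exactly the same strings."""
    (delta1, start1, accepts1), (delta2, start2, accepts2) = dfa1, dfa2
    seen = {(start1, start2)}
    queue = deque([(start1, start2)])
    while queue:
        p, q = queue.popleft()
        # Exactly one side accepting means some string is in one
        # language but not the other.
        if (p in accepts1) != (q in accepts2):
            return False
        for symbol in alphabet:
            pair = (delta1[(p, symbol)], delta2[(q, symbol)])
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True
```

At most $|Q_1| \cdot |Q_2|$ pairs are visited, so the check runs in time proportional to the size of the product automaton.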

The equivalence between regular expressions and finite automata provides a powerful theoretical foundation for pattern recognition. Regular expressions offer a concise descriptive language, while finite automata provide an efficient computational model. This equivalence enables both theoretical analysis and practical implementation of pattern matching systems.

The systematic conversion algorithms (Thompson's construction for regex→NFA, subset construction for NFA→DFA, and GNFA method for automata→regex) complete the circle of equivalence and provide practical tools for implementation.


Copyright

Licensed under: Attribution-NonCommercial-NoDerivatives 4.0 International (CC-BY-NC-ND-4.0)