Finite Automata and Regular Expressions
About 1006 wordsAbout 13 min
2025-08-07
Types of Finite Automata
Deterministic Finite Automata (DFA)
A DFA is defined as a 5-tuple M=(Q,Σ,δ,q0,F) where:
- Q: Finite set of states
- Σ: Input alphabet
- δ: Transition function δ:Q×Σ→Q
- q0: Start state (q0∈Q)
- F: Set of accept states (F⊆Q)
Important
In a DFA, for every state and input symbol, there is exactly one next state.
Nondeterministic Finite Automata (NFA)
An NFA is defined as a 5-tuple M=(Q,Σ,δ,q0,F) where:
- Q: Finite set of states
- Σ: Input alphabet
- δ: Transition function δ:Q×(Σ∪{ϵ})→P(Q)
- q0: Start state (q0∈Q)
- F: Set of accept states (F⊆Q)
Tips
NFAs can have:
- Multiple transitions for the same symbol from a state
- ϵ-transitions (moves without reading input)
Equivalence of DFA and NFA
Theorem: For every NFA, there exists an equivalent DFA that recognizes the same language.
Subset Construction Method
The standard algorithm to convert an NFA to a DFA:
- Start with the ϵ-closure of the NFA's start state
- For each state in the DFA, compute transitions for each input symbol
- A DFA state is accepting if it contains any NFA accept state
- Continue until no new states are generated
Regular Expressions to Automata
Thompson's Construction
A systematic method to convert regular expressions to NFAs:
Base case: Handle ∅, ϵ, and single symbols
Inductive case: Handle union, concatenation, and Kleene star
Base Cases
Regular Expression | NFA Construction |
---|---|
∅ | Single state, no transitions |
ϵ | Single state with ϵ-transition to accept state |
a∈Σ | Two states with transition labeled 'a' |
Inductive Cases
Union (r1+r2):
- Create new start and accept states
- Add ϵ-transitions from new start to r1 and r2 starts
- Add ϵ-transitions from r1 and r2 accepts to new accept
Concatenation (r1⋅r2):
- Connect accept state of r1 to start state of r2 with ϵ-transition
Kleene Star (r1∗):
- Create new start/accept state
- Add ϵ-transitions:
- From new start to r1 start
- From r1 accept to new accept
- From r1 accept back to r1 start
- From new start to new accept (empty string case)
Minimization of DFAs
Equivalence Relations
Two states p and q are equivalent if for all strings w:
- δ∗(p,w)∈F if and only if δ∗(q,w)∈F
Myhill-Nerode Theorem
Theorem: A language L is regular if and only if the relation ≡L has finite index.
The minimal DFA for L has exactly as many states as there are equivalence classes of ≡L.
Minimization Algorithm
- Initial Partition: Separate states into accept and non-accept states
- Refinement: For each group, split based on transition behavior
- Repeat: Continue refinement until no more splits are possible
- Merge: States in the same final group are equivalent and can be merged
Practical Examples
Example 1: DFA for Even Number of 'a's
States: {q0, q1}
Alphabet: {a, b}
Start: q0
Accept: {q0}
Transitions:
δ(q0, a) = q1
δ(q0, b) = q0
δ(q1, a) = q0
δ(q1, b) = q1
Example 2: NFA for (a+b)∗aa(a+b)∗
This NFA recognizes strings containing "aa" as a substring.
Key Theorems
Kleene's Theorem
Theorem: A language is regular if and only if it is recognized by some finite automaton (DFA or NFA).
Equivalence of Models
All the following are equivalent for a language L:
- L is regular (described by a regular expression)
- L is recognized by a DFA
- L is recognized by an NFA
- L is recognized by an ϵ-NFA
- The complement of L is regular
- L has a pumping length satisfying the pumping lemma
Applications
Regular expressions and finite automata are used in:
- Lexical analysis: Compilers use DFAs to tokenize source code
- Text search: Pattern matching in editors and search engines
- Network protocols: State machines in protocol implementation
- Digital circuits: Finite state machines in hardware design
Warning
While NFAs are often easier to construct and understand, DFAs are generally more efficient for actual implementation since they have no nondeterminism.
Generalized Nondeterministic Finite Automata (GNFA)
Definition
A GNFA is a variant of NFA where:
- Transitions are labeled with regular expressions (not just symbols)
- Exactly one start state (no incoming transitions)
- Exactly one accept state (no outgoing transitions)
- All other states have transitions to every other state (except start and accept)
GNFA to Regular Expression Conversion
Algorithm:
- Convert NFA to GNFA by adding new start and accept states
- Eliminate states one by one (except start and accept)
- When eliminating state qrip, update transitions:
- For every pair of states qi and qj:
- Replace Rij with Rij+Ri,rip⋅(Rrip,rip)∗⋅Rrip,j
- Final expression is the transition from start to accept state
This provides a systematic way to convert any finite automaton to a regular expression.
Testing Properties of Regular Languages
Emptiness Testing
Problem: Given a DFA M, is L(M)=∅?
Algorithm: Check if any accept state is reachable from the start state
- Time Complexity: O(∣Q∣+∣δ∣) using DFS or BFS
Equivalence Testing
Problem: Given DFAs M1 and M2, is L(M1)=L(M2)?
Algorithm:
- Construct DFA for (L(M1)∩L(M2))∪(L(M2)∩L(M1))
- Test if this language is empty
- If empty, then L(M1)=L(M2)
Summary
The equivalence between regular expressions and finite automata provides a powerful theoretical foundation for pattern recognition. Regular expressions offer a concise descriptive language, while finite automata provide an efficient computational model. This equivalence enables both theoretical analysis and practical implementation of pattern matching systems.
The systematic conversion algorithms (Thompson's construction for regex→NFA, subset construction for NFA→DFA, and GNFA method for automata→regex) complete the circle of equivalence and provide practical tools for implementation.
Changelog
2aa48
-web-deploy(Auto): Update base URL for web-pages branchon
Copyright
Copyright Ownership:WARREN Y.F. LONG
License under:Attribution-NonCommercial-NoDerivatives 4.0 International (CC-BY-NC-ND-4.0)