Book contents
- Frontmatter
- Contents
- Preface
- Part I Formal Background
- Part II From Theory to Practice
- 7 The C(M) language
- 8 C(M) Implementation of Finite-State Devices
- 9 The Aho–Corasick Algorithm
- 10 The Minimal Deterministic Finite-State Automaton for a Finite Language
- 11 Constructing Finite-State Devices for Text Rewriting
- References
- Index
9 - The Aho–Corasick Algorithm
from Part II - From Theory to Practice
Published online by Cambridge University Press: 29 July 2019
- Frontmatter
- Contents
- Preface
- Part I Formal Background
- Part II From Theory to Practice
- 7 The C(M) language
- 8 C(M) Implementation of Finite-State Devices
- 9 The Aho–Corasick Algorithm
- 10 The Minimal Deterministic Finite-State Automaton for a Finite Language
- 11 Constructing Finite-State Devices for Text Rewriting
- References
- Index
Summary
This chapter describes a special construction based on finite-state automata with important applications: the Aho–Corasick algorithm is used to efficiently find all occurrences of a finite set of strings (also called pattern set, or dictionary) in a given input string, called the ‘text’. Search is ‘online’, which means that the input text is neither fixed nor preprocessed in any way. This problem is a special instance of pattern matching in strings, and other automata constructions are used to solve other pattern matching tasks. From an automaton point of view, the Aho–Corasick algorithm comes in two variants. We first present the more efficient version where a classical deterministic finite-state automaton is built for text search. The disadvantage of this first construction is that the resulting automaton can become very large, in particular for large pattern alphabets. Afterwards we present the second version, where an automaton with additional transitions of a particular kind is built, yielding a much smaller device for text search.
Keywords
- Type
- Chapter
- Information
- Finite-State TechniquesAutomata, Transducers and Bimachines, pp. 236 - 252Publisher: Cambridge University PressPrint publication year: 2019