<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><meta name="generator" content="rustdoc"><meta name="description" content="Provides non-deterministic finite automata (NFA) and regex engines that use them."><title>regex_automata::nfa - Rust</title><script>if(window.location.protocol!=="file:")document.head.insertAdjacentHTML("beforeend","SourceSerif4-Regular-6b053e98.ttf.woff2,FiraSans-Italic-81dc35de.woff2,FiraSans-Regular-0fe48ade.woff2,FiraSans-MediumItalic-ccf7e434.woff2,FiraSans-Medium-e1aa3f0a.woff2,SourceCodePro-Regular-8badfe75.ttf.woff2,SourceCodePro-Semibold-aa29a496.ttf.woff2".split(",").map(f=>`<link rel="preload" as="font" type="font/woff2"href="../../static.files/${f}">`).join(""))</script><link rel="stylesheet" href="../../static.files/normalize-9960930a.css"><link rel="stylesheet" href="../../static.files/rustdoc-ca0dd0c4.css"><meta name="rustdoc-vars" data-root-path="../../" data-static-root-path="../../static.files/" data-current-crate="regex_automata" data-themes="" data-resource-suffix="" data-rustdoc-version="1.93.1 (01f6ddf75 2026-02-11) (Arch Linux rust 1:1.93.1-1)" data-channel="1.93.1" data-search-js="search-9e2438ea.js" data-stringdex-js="stringdex-a3946164.js" data-settings-js="settings-c38705f0.js" ><script src="../../static.files/storage-e2aeef58.js"></script><script defer src="../sidebar-items.js"></script><script defer src="../../static.files/main-a410ff4d.js"></script><noscript><link rel="stylesheet" href="../../static.files/noscript-263c88ec.css"></noscript><link rel="alternate icon" type="image/png" href="../../static.files/favicon-32x32-eab170b8.png"><link rel="icon" type="image/svg+xml" href="../../static.files/favicon-044be391.svg"></head><body class="rustdoc mod"><!--[if lte IE 11]><div class="warning">This old browser is unsupported and will most likely display funky things.</div><![endif]--><rustdoc-topbar><h2><a href="#">Module 
nfa</a></h2></rustdoc-topbar><nav class="sidebar"><div class="sidebar-crate"><h2><a href="../../regex_automata/index.html">regex_<wbr>automata</a><span class="version">0.4.14</span></h2></div><div class="sidebar-elems"><section id="rustdoc-toc"><h2 class="location"><a href="#">Module nfa</a></h2><h3><a href="#">Sections</a></h3><ul class="block top-toc"><li><a href="#why-only-a-thompson-nfa" title="Why only a Thompson NFA?">Why only a Thompson NFA?</a></li></ul><h3><a href="#modules">Module Items</a></h3><ul class="block"><li><a href="#modules" title="Modules">Modules</a></li></ul></section><div id="rustdoc-modnav"><h2 class="in-crate"><a href="../index.html">In crate regex_<wbr>automata</a></h2></div></div></nav><div class="sidebar-resizer" title="Drag to resize sidebar"></div><main><div class="width-limiter"><section id="main-content" class="content"><div class="main-heading"><div class="rustdoc-breadcrumbs"><a href="../index.html">regex_automata</a></div><h1>Module <span>nfa</span> <button id="copy-path" title="Copy item path to clipboard">Copy item path</button></h1><rustdoc-toolbar></rustdoc-toolbar><span class="sub-heading"><a class="src" href="../../src/regex_automata/nfa/mod.rs.html#1-55">Source</a> </span></div><details class="toggle top-doc" open><summary class="hideme"><span>Expand description</span></summary><div class="docblock"><p>Provides non-deterministic finite automata (NFA) and regex engines that use
them.</p>
<p>While NFAs and DFAs (deterministic finite automata) have equivalent <em>theoretical</em>
power, their usage in practice tends to result in different engineering
trade-offs. This isn’t meant to be a comprehensive treatment of the topic, but here
are a few key trade-offs that are, at minimum, true for this crate:</p>
<ul>
<li>NFAs tend to be represented sparsely, whereas DFAs are represented densely.
Sparse representations use less memory, but are slower to traverse. Conversely,
dense representations use more memory, but are faster to traverse. (Sometimes
these lines are blurred. For example, an <code>NFA</code> might choose to represent a
particular state in a dense fashion, and a DFA can be built using a sparse
representation via <a href="../dfa/sparse/struct.DFA.html" title="struct regex_automata::dfa::sparse::DFA"><code>sparse::DFA</code></a>.)</li>
<li>NFAs have epsilon transitions and DFAs don’t. In practice, this means that
handling a single byte in a haystack with an NFA at search time may require
visiting multiple NFA states. In a DFA, each byte requires visiting only
a single state. Stated differently, an NFA requires a variable number of CPU
instructions to process one byte in a haystack, whereas a DFA uses a constant
number of CPU instructions to process one byte.</li>
<li>NFAs are generally easier to augment with secondary storage. For example, the
<a href="thompson/pikevm/struct.PikeVM.html" title="struct regex_automata::nfa::thompson::pikevm::PikeVM"><code>thompson::pikevm::PikeVM</code></a> uses an NFA to match, but also uses additional
memory beyond the model of a finite state machine to track offsets for matching
capturing groups. Conversely, the most a DFA can do is report the offset (and
pattern ID) at which a match occurred. This is generally why we also compile
DFAs in reverse: once a forward search has found the end of a match, a reverse
DFA can be run from that offset to also find the start of the match.</li>
<li>NFAs take worst-case linear time to build, but DFAs take worst-case
exponential time to build. The <a href="../hybrid/index.html" title="mod regex_automata::hybrid">hybrid NFA/DFA</a> mitigates this
challenge for DFAs in many practical cases.</li>
</ul>
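<p>The epsilon-transition point in the list above can be illustrated with a toy simulation. This is a minimal sketch and not this crate's implementation: the <code>State</code> enum, the <code>run</code> and <code>add_closure</code> functions, and the hand-built NFA for <code>a*b</code> are hypothetical names chosen for illustration. It counts how many states are touched per haystack byte, which varies because every reachable epsilon transition has to be followed.</p>

```rust
/// A toy NFA state (hypothetical; not the crate's own state type).
#[derive(Clone, Copy)]
enum State {
    /// Consume exactly `byte`, then move to `next`.
    Byte { byte: u8, next: usize },
    /// Two epsilon transitions: move to `a` and `b` without consuming input.
    Split { a: usize, b: usize },
    /// Accepting state.
    Match,
}

/// Add `id`, and every state reachable from it via epsilon transitions,
/// to `set`. Returns the number of states visited while doing so.
fn add_closure(nfa: &[State], id: usize, set: &mut Vec<usize>, seen: &mut [bool]) -> usize {
    let mut visited = 0;
    let mut stack = vec![id];
    while let Some(id) = stack.pop() {
        if seen[id] {
            continue;
        }
        seen[id] = true;
        visited += 1;
        match nfa[id] {
            State::Split { a, b } => {
                stack.push(b);
                stack.push(a);
            }
            // Only non-epsilon states are kept in the active set.
            _ => set.push(id),
        }
    }
    visited
}

/// Run the NFA over the whole haystack (anchored at both ends). Returns
/// whether it matched and how many states were touched for each byte.
fn run(nfa: &[State], start: usize, haystack: &[u8]) -> (bool, Vec<usize>) {
    let mut current = Vec::new();
    let mut seen = vec![false; nfa.len()];
    add_closure(nfa, start, &mut current, &mut seen);
    let mut work = Vec::new();
    for &b in haystack {
        let mut next = Vec::new();
        let mut seen = vec![false; nfa.len()];
        let mut touched = 0;
        for &id in &current {
            touched += 1;
            if let State::Byte { byte, next: target } = nfa[id] {
                if byte == b {
                    touched += add_closure(nfa, target, &mut next, &mut seen);
                }
            }
        }
        work.push(touched);
        current = next;
    }
    let matched = current.iter().any(|&id| matches!(nfa[id], State::Match));
    (matched, work)
}

/// Hand-built NFA for `a*b`, with state 0 as the start state.
fn a_star_b() -> Vec<State> {
    vec![
        State::Split { a: 1, b: 2 },
        State::Byte { byte: b'a', next: 0 },
        State::Byte { byte: b'b', next: 3 },
        State::Match,
    ]
}
```

<p>In this sketch, <code>run(&amp;a_star_b(), 0, b"aab")</code> touches a different number of states for the final <code>b</code> than for each <code>a</code>. A DFA for the same pattern would perform one table lookup per byte.</p>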
<p>There are likely other differences, but the bottom line is that NFAs tend to be
more memory efficient and give easier opportunities for increasing expressive
power, whereas DFAs are faster to search with.</p>
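<p>The memory versus speed trade-off between sparse and dense representations can be sketched with two toy state types. The names <code>SparseState</code>, <code>DenseState</code> and <code>DEAD</code> are assumptions made for this illustration and do not mirror the crate's internal layout. The dense state spends a slot on every possible byte so that lookup is a single index. The sparse state stores only the byte ranges that have a transition and searches them.</p>

```rust
use std::cmp::Ordering;
use std::mem::size_of;

/// Hypothetical identifier meaning "no transition".
const DEAD: u32 = 0;

/// Sparse state: only the byte ranges that have a transition, as
/// `(start, end, next_state)` sorted by `start` and non-overlapping.
struct SparseState {
    ranges: Vec<(u8, u8, u32)>,
}

impl SparseState {
    /// Lookup cost grows with the number of ranges (binary search).
    fn next(&self, byte: u8) -> u32 {
        self.ranges
            .binary_search_by(|&(start, end, _)| {
                if byte < start {
                    Ordering::Greater
                } else if byte > end {
                    Ordering::Less
                } else {
                    Ordering::Equal
                }
            })
            .map(|i| self.ranges[i].2)
            .unwrap_or(DEAD)
    }

    /// Bytes used by the transition data itself.
    fn memory(&self) -> usize {
        self.ranges.len() * size_of::<(u8, u8, u32)>()
    }
}

/// Dense state: one slot for each of the 256 possible byte values.
struct DenseState {
    table: [u32; 256],
}

impl DenseState {
    /// Expand a sparse state into a full 256-entry table.
    fn from_sparse(sparse: &SparseState) -> DenseState {
        let mut table = [DEAD; 256];
        for &(start, end, next) in &sparse.ranges {
            for b in start..=end {
                table[b as usize] = next;
            }
        }
        DenseState { table }
    }

    /// Lookup is a single array index, regardless of the pattern.
    fn next(&self, byte: u8) -> u32 {
        self.table[byte as usize]
    }

    /// Bytes used by the transition data itself.
    fn memory(&self) -> usize {
        size_of::<[u32; 256]>()
    }
}
```

<p>Both types answer the same question for any byte. For a state with two ranges such as <code>0-9</code> and <code>a-z</code>, the sparse form stores two entries where the dense form stores 256.</p>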
<h2 id="why-only-a-thompson-nfa"><a class="doc-anchor" href="#why-only-a-thompson-nfa">§</a>Why only a Thompson NFA?</h2>
<p>Currently, the only kind of NFA we support in this crate is a <a href="https://en.wikipedia.org/wiki/Thompson%27s_construction">Thompson
NFA</a>. This refers
to a specific construction algorithm that takes the syntax of a regex
pattern and converts it to an NFA. Specifically, it makes liberal use of
epsilon transitions in order to keep its structure simple. In exchange, its
construction time is linear in the size of the regex. A Thompson NFA also
guarantees that, given any state and a character in a haystack, there is at
most one transition defined for it. (Although there may be many epsilon
transitions.)</p>
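<p>The two properties described above, linear construction and at most one non-epsilon transition per state, can be seen in a small sketch of a Thompson-style compiler. This is an illustration under stated assumptions and not the compiler used by this crate: the <code>Ast</code> and <code>State</code> types and the <code>build</code>, <code>compile</code> and <code>node_count</code> functions are hypothetical, the syntax is limited to single bytes, concatenation, alternation and <code>*</code>, and the variant shown adds at most one state per syntax node rather than following the textbook construction exactly.</p>

```rust
/// A tiny regex syntax tree (hypothetical, for illustration only).
enum Ast {
    Byte(u8),
    Concat(Box<Ast>, Box<Ast>),
    Alt(Box<Ast>, Box<Ast>),
    Star(Box<Ast>),
}

/// Each state has either one byte transition or only epsilon transitions.
#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Byte { byte: u8, next: usize },
    Split { a: usize, b: usize },
    Match,
}

/// Number of nodes in the syntax tree, i.e. the "size of the regex" here.
fn node_count(ast: &Ast) -> usize {
    match ast {
        Ast::Byte(_) => 1,
        Ast::Star(inner) => 1 + node_count(inner),
        Ast::Concat(l, r) | Ast::Alt(l, r) => 1 + node_count(l) + node_count(r),
    }
}

/// Compile `ast` so that, once it has matched, control continues at `next`.
/// Returns the identifier of the entry state of the compiled fragment.
fn compile(ast: &Ast, next: usize, states: &mut Vec<State>) -> usize {
    match ast {
        Ast::Byte(byte) => {
            states.push(State::Byte { byte: *byte, next });
            states.len() - 1
        }
        Ast::Concat(l, r) => {
            // Compile the right side first so the left side can point at it.
            let r_start = compile(r, next, states);
            compile(l, r_start, states)
        }
        Ast::Alt(l, r) => {
            let a = compile(l, next, states);
            let b = compile(r, next, states);
            states.push(State::Split { a, b });
            states.len() - 1
        }
        Ast::Star(inner) => {
            // Reserve the split first so the loop body can jump back to it.
            let split = states.len();
            states.push(State::Split { a: 0, b: 0 });
            let body = compile(inner, split, states);
            states[split] = State::Split { a: body, b: next };
            split
        }
    }
}

/// Build an NFA for `ast`. State 0 is always the match state.
/// Returns the states and the identifier of the start state.
fn build(ast: &Ast) -> (Vec<State>, usize) {
    let mut states = vec![State::Match];
    let start = compile(ast, 0, &mut states);
    (states, start)
}
```

<p>Each syntax node is visited once and pushes at most one state, so the number of states in this sketch never exceeds <code>node_count(ast) + 1</code>. Alternation and repetition are expressed only through <code>Split</code> states, which is where the epsilon transitions come from.</p>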
<p>It’s possible that other types of NFAs will be added in the future, such as a
<a href="https://en.wikipedia.org/wiki/Glushkov%27s_construction_algorithm">Glushkov NFA</a>.
But currently, this crate only provides a Thompson NFA.</p>
</div></details><h2 id="modules" class="section-header">Modules<a href="#modules" class="anchor">§</a></h2><dl class="item-table"><dt><a class="mod" href="thompson/index.html" title="mod regex_automata::nfa::thompson">thompson</a></dt><dd>Defines a Thompson NFA and provides the <a href="thompson/pikevm/struct.PikeVM.html" title="struct regex_automata::nfa::thompson::pikevm::PikeVM"><code>PikeVM</code></a> and
<a href="thompson/backtrack/struct.BoundedBacktracker.html" title="struct regex_automata::nfa::thompson::backtrack::BoundedBacktracker"><code>BoundedBacktracker</code></a> regex engines.</dd></dl></section></div></main></body></html>