Files
GopherGate/target/doc/regex_automata/nfa/index.html
2026-02-26 12:00:21 -05:00

48 lines
7.3 KiB
HTML
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><meta name="generator" content="rustdoc"><meta name="description" content="Provides non-deterministic finite automata (NFA) and regex engines that use them."><title>regex_automata::nfa - Rust</title><script>if(window.location.protocol!=="file:")document.head.insertAdjacentHTML("beforeend","SourceSerif4-Regular-6b053e98.ttf.woff2,FiraSans-Italic-81dc35de.woff2,FiraSans-Regular-0fe48ade.woff2,FiraSans-MediumItalic-ccf7e434.woff2,FiraSans-Medium-e1aa3f0a.woff2,SourceCodePro-Regular-8badfe75.ttf.woff2,SourceCodePro-Semibold-aa29a496.ttf.woff2".split(",").map(f=>`<link rel="preload" as="font" type="font/woff2"href="../../static.files/${f}">`).join(""))</script><link rel="stylesheet" href="../../static.files/normalize-9960930a.css"><link rel="stylesheet" href="../../static.files/rustdoc-ca0dd0c4.css"><meta name="rustdoc-vars" data-root-path="../../" data-static-root-path="../../static.files/" data-current-crate="regex_automata" data-themes="" data-resource-suffix="" data-rustdoc-version="1.93.1 (01f6ddf75 2026-02-11) (Arch Linux rust 1:1.93.1-1)" data-channel="1.93.1" data-search-js="search-9e2438ea.js" data-stringdex-js="stringdex-a3946164.js" data-settings-js="settings-c38705f0.js" ><script src="../../static.files/storage-e2aeef58.js"></script><script defer src="../sidebar-items.js"></script><script defer src="../../static.files/main-a410ff4d.js"></script><noscript><link rel="stylesheet" href="../../static.files/noscript-263c88ec.css"></noscript><link rel="alternate icon" type="image/png" href="../../static.files/favicon-32x32-eab170b8.png"><link rel="icon" type="image/svg+xml" href="../../static.files/favicon-044be391.svg"></head><body class="rustdoc mod"><!--[if lte IE 11]><div class="warning">This old browser is unsupported and will most likely display funky things.</div><![endif]--><rustdoc-topbar><h2><a href="#">Module nfa</a></h2></rustdoc-topbar><nav class="sidebar"><div class="sidebar-crate"><h2><a href="../../regex_automata/index.html">regex_<wbr>automata</a><span class="version">0.4.14</span></h2></div><div class="sidebar-elems"><section id="rustdoc-toc"><h2 class="location"><a href="#">Module nfa</a></h2><h3><a href="#">Sections</a></h3><ul class="block top-toc"><li><a href="#why-only-a-thompson-nfa" title="Why only a Thompson NFA?">Why only a Thompson NFA?</a></li></ul><h3><a href="#modules">Module Items</a></h3><ul class="block"><li><a href="#modules" title="Modules">Modules</a></li></ul></section><div id="rustdoc-modnav"><h2 class="in-crate"><a href="../index.html">In crate regex_<wbr>automata</a></h2></div></div></nav><div class="sidebar-resizer" title="Drag to resize sidebar"></div><main><div class="width-limiter"><section id="main-content" class="content"><div class="main-heading"><div class="rustdoc-breadcrumbs"><a href="../index.html">regex_automata</a></div><h1>Module <span>nfa</span>&nbsp;<button id="copy-path" title="Copy item path to clipboard">Copy item path</button></h1><rustdoc-toolbar></rustdoc-toolbar><span class="sub-heading"><a class="src" href="../../src/regex_automata/nfa/mod.rs.html#1-55">Source</a> </span></div><details class="toggle top-doc" open><summary class="hideme"><span>Expand description</span></summary><div class="docblock"><p>Provides non-deterministic finite automata (NFA) and regex engines that use
them.</p>
<p>While NFAs and DFAs (deterministic finite automata) have equivalent <em>theoretical</em>
power, their usage in practice tends to result in different engineering trade
offs. While this isnt meant to be a comprehensive treatment of the topic, here
are a few key trade offs that are, at minimum, true for this crate:</p>
<ul>
<li>NFAs tend to be represented sparsely where as DFAs are represented densely.
Sparse representations use less memory, but are slower to traverse. Conversely,
dense representations use more memory, but are faster to traverse. (Sometimes
these lines are blurred. For example, an <code>NFA</code> might choose to represent a
particular state in a dense fashion, and a DFA can be built using a sparse
representation via <a href="../dfa/sparse/struct.DFA.html" title="struct regex_automata::dfa::sparse::DFA"><code>sparse::DFA</code></a>.</li>
<li>NFAs have epsilon transitions and DFAs dont. In practice, this means that
handling a single byte in a haystack with an NFA at search time may require
visiting multiple NFA states. In a DFA, each byte only requires visiting
a single state. Stated differently, NFAs require a variable number of CPU
instructions to process one byte in a haystack where as a DFA uses a constant
number of CPU instructions to process one byte.</li>
<li>NFAs are generally easier to amend with secondary storage. For example, the
<a href="thompson/pikevm/struct.PikeVM.html" title="struct regex_automata::nfa::thompson::pikevm::PikeVM"><code>thompson::pikevm::PikeVM</code></a> uses an NFA to match, but also uses additional
memory beyond the model of a finite state machine to track offsets for matching
capturing groups. Conversely, the most a DFA can do is report the offset (and
pattern ID) at which a match occurred. This is generally why we also compile
DFAs in reverse, so that we can run them after finding the end of a match to
also find the start of a match.</li>
<li>NFAs take worst case linear time to build, but DFAs take worst case
exponential time to build. The <a href="../hybrid/index.html" title="mod regex_automata::hybrid">hybrid NFA/DFA</a> mitigates this
challenge for DFAs in many practical cases.</li>
</ul>
<p>There are likely other differences, but the bottom line is that NFAs tend to be
more memory efficient and give easier opportunities for increasing expressive
power, where as DFAs are faster to search with.</p>
<h2 id="why-only-a-thompson-nfa"><a class="doc-anchor" href="#why-only-a-thompson-nfa">§</a>Why only a Thompson NFA?</h2>
<p>Currently, the only kind of NFA we support in this crate is a <a href="https://en.wikipedia.org/wiki/Thompson%27s_construction">Thompson
NFA</a>. This refers
to a specific construction algorithm that takes the syntax of a regex
pattern and converts it to an NFA. Specifically, it makes gratuitous use of
epsilon transitions in order to keep its structure simple. In exchange, its
construction time is linear in the size of the regex. A Thompson NFA also makes
the guarantee that given any state and a character in a haystack, there is at
most one transition defined for it. (Although there may be many epsilon
transitions.)</p>
<p>Its possible that other types of NFAs will be added in the future, such as a
<a href="https://en.wikipedia.org/wiki/Glushkov%27s_construction_algorithm">Glushkov NFA</a>.
But currently, this crate only provides a Thompson NFA.</p>
</div></details><h2 id="modules" class="section-header">Modules<a href="#modules" class="anchor">§</a></h2><dl class="item-table"><dt><a class="mod" href="thompson/index.html" title="mod regex_automata::nfa::thompson">thompson</a></dt><dd>Defines a Thompson NFA and provides the <a href="thompson/pikevm/struct.PikeVM.html" title="struct regex_automata::nfa::thompson::pikevm::PikeVM"><code>PikeVM</code></a> and
<a href="thompson/backtrack/struct.BoundedBacktracker.html" title="struct regex_automata::nfa::thompson::backtrack::BoundedBacktracker"><code>BoundedBacktracker</code></a> regex engines.</dd></dl></section></div></main></body></html>