Tools

zearch — searches for regular expression matches directly on Re-Pair grammar-compressed text, without decompression

grep — GNU grep on plain (uncompressed) text

ripgrep — ripgrep on plain (uncompressed) text

hyperscan — Intel Hyperscan engine on plain (uncompressed) text

lz4|hyperscan — decompress with LZ4, then search with Hyperscan

zstd|hyperscan — decompress with Zstandard, then search with Hyperscan

lz4|grep — decompress with LZ4, then search with grep

zstd|grep — decompress with Zstandard, then search with grep

lz4|ripgrep — decompress with LZ4, then search with ripgrep

zstd|ripgrep — decompress with Zstandard, then search with ripgrep

repair — Re-Pair decompression only (no search; decompression baseline)

lz4 — LZ4 decompression only (no search; decompression baseline)

zstd — Zstandard decompression only (no search; decompression baseline)

gzip — Gzip decompression only (no search; decompression baseline)
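
To make the "decompress, then search" labels concrete, here is a minimal sketch of how a label such as lz4|grep could map to a shell pipeline. The exact command lines used in the benchmark are not given on this page, so the flags below (decompress to stdout, count matching lines) and the helper name pipeline_command are assumptions for illustration only. Hyperscan is left out because it is a library without a standard command-line front end.

```python
import shlex

# Assumed decompress-to-stdout invocations; the flags actually used in the
# benchmark are not stated on this page.
DECOMPRESSORS = {
    "lz4": ["lz4", "-d", "-c"],
    "zstd": ["zstd", "-d", "-c"],
}

# Assumed search invocations that count matching lines read from stdin.
SEARCHERS = {
    "grep": ["grep", "-c", "-E", "-e"],
    "ripgrep": ["rg", "-c", "-e"],
}


def pipeline_command(label, pattern, path):
    """Build the shell command for a label such as 'lz4|grep' (hypothetical helper)."""
    decompressor, searcher = label.split("|")
    left = DECOMPRESSORS[decompressor] + [path]   # decompress the file to stdout
    right = SEARCHERS[searcher] + [pattern]       # search the decompressed stream
    # shlex.join quotes each argument so the pattern is safe to pass to a shell.
    return shlex.join(left) + " | " + shlex.join(right)


if __name__ == "__main__":
    print(pipeline_command("lz4|grep", "a+b", "corpus.lz4"))
```

The decompression-only baselines (repair, lz4, zstd, gzip) correspond to the left-hand side of such a pipeline on its own, with the output discarded.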


Overview

The running time shown for each regular expression is a confidence interval computed over 30 runs, measured after one warm-up run. When the confidence intervals of two experiments do not overlap, we have enough statistical evidence to claim that one tool outperforms the other on that experiment. An execution that takes more than 10 times as long as zearch on the same experiment is counted as a timeout.
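
The measurement procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the harness used to produce the results: the page does not give the confidence level or the interval formula, so the sketch assumes a 95% interval under a normal approximation (z = 1.96), and the function names are hypothetical.

```python
import math
import statistics
import time


def measure(run, runs=30):
    """Time `run` over `runs` executions, after one unrecorded warm-up run."""
    run()  # warm-up run, excluded from the measurements
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        run()
        times.append(time.perf_counter() - start)
    return times


def confidence_interval(samples, z=1.96):
    """Return (low, high) for the mean; z = 1.96 assumes a 95% normal interval."""
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return (mean - z * sem, mean + z * sem)


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def is_timeout(elapsed, zearch_time, factor=10):
    """True if a run took more than `factor` times the zearch running time."""
    return elapsed > factor * zearch_time


if __name__ == "__main__":
    fast = confidence_interval(measure(lambda: sum(range(1_000))))
    slow = confidence_interval(measure(lambda: sum(range(100_000))))
    print(fast, slow, intervals_overlap(fast, slow))
```

With only 30 samples, a Student's t critical value would give a slightly wider interval than the normal approximation; the choice here is made for simplicity.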


Subtitles

Regular Expressions

Graphs

Gutenberg

Regular Expressions

Graphs

CSV

Regular Expressions

Graphs

Logs

Regular Expressions

Graphs

Qwerty

Regular Expressions

Graphs