Version: v4.4.2+incompatible, Published: Dec 4, 2025, License: Apache-2.0

Repository

github.com/ashvardanian/stringzilla

StringZilla Scripts

This directory contains benchmarks, tests, and exploratory scripts for the StringZilla library. They focus on the library's internal functionality rather than on comparisons with third-party alternatives.

For comparative performance analysis, please refer to StringWars.
To understand the distributional properties of hash functions, see HashEvals.

Benchmark Programs

Benchmarks validate SIMD-accelerated backends against serial baselines and measure throughput on real-world workloads.

bench_find.cpp - bidirectional substring search, byte search, and byteset search
bench_token.cpp - token-level operations: hashing, checksums, equality, and ordering
bench_sequence.cpp - sorting, partitioning, and set intersections of string arrays
bench_memory.cpp - memory operations: copies, moves, fills, and lookup table transformations
bench_container.cpp - STL associative containers (std::map, std::unordered_map) with string keys
bench_similarities.cpp - Levenshtein, Needleman-Wunsch, and Smith-Waterman scoring on CPU
bench_fingerprints.cpp - MinHash rolling fingerprints and multi-pattern search on CPU
bench_similarities.cu - similarity scoring algorithms on CUDA GPUs
bench_fingerprints.cu - fingerprinting algorithms on CUDA GPUs

All benchmarks are configured through environment variables. See the header comment of each benchmark file for details.
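The validate-then-measure pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in the benchmark files: the variable names `BENCH_REPEATS` and `BENCH_NEEDLE` are hypothetical, `fast_find` is a stand-in for an accelerated backend, and the real variable names are the ones documented in each file header.

```python
import os
import time


def serial_find(haystack: bytes, needle: bytes) -> int:
    """Naive substring search, used here as the serial reference baseline."""
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        if haystack[i:i + m] == needle:
            return i
    return -1


def fast_find(haystack: bytes, needle: bytes) -> int:
    """Stand-in for an accelerated backend; here simply the built-in bytes.find."""
    return haystack.find(needle)


def load_config(env=os.environ) -> dict:
    """Read settings from the environment, falling back to defaults.

    BENCH_REPEATS and BENCH_NEEDLE are hypothetical names for this sketch.
    """
    return {
        "repeats": int(env.get("BENCH_REPEATS", "3")),
        "needle": env.get("BENCH_NEEDLE", "needle").encode(),
    }


def run_benchmark(haystack: bytes, config: dict) -> float:
    """Validate the candidate against the baseline, then return bytes per second."""
    needle = config["needle"]
    expected = serial_find(haystack, needle)
    actual = fast_find(haystack, needle)
    if actual != expected:
        raise AssertionError(f"backend mismatch: {actual} != {expected}")

    start = time.perf_counter()
    for _ in range(config["repeats"]):
        fast_find(haystack, needle)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very small inputs.
    return len(haystack) * config["repeats"] / max(elapsed, 1e-9)


if __name__ == "__main__":
    text = b"hay " * 10_000 + b"needle"
    throughput = run_benchmark(text, load_config())
    print(f"{throughput / 1e6:.1f} MB/s")
```

Checking correctness before timing keeps a fast but wrong backend from reporting a misleading throughput number.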

Test Programs

Unit tests validate correctness across all backends and programming languages.

test_stringzilla.cpp - C++ API tests against STL baselines
test_stringzilla.py - Python API tests against native strings
test_stringzillas.cpp - parallel CPU backend tests
test_stringzillas.cu - CUDA backend tests
test.js - JavaScript API tests

Exploratory Notebooks

Jupyter notebooks for algorithm visualization and analysis.

explore_levenshtein.ipynb - edit distance algorithms and diagonal traversal
explore_fingerprint.ipynb - MinHash and rolling fingerprints
explore_unicode.ipynb - UTF-8 handling and Unicode normalization
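As a rough illustration of the diagonal traversal mentioned for explore_levenshtein.ipynb, the sketch below computes the Levenshtein distance one anti-diagonal at a time. It is a minimal example written for this README and is not taken from the notebook; the function name `levenshtein_antidiagonal` is hypothetical. Each cell on anti-diagonal `d = i + j` depends only on cells from diagonals `d - 1` and `d - 2`, so cells within one diagonal are independent of each other, which is what makes this order attractive for SIMD and GPU evaluation.

```python
def levenshtein_antidiagonal(a: str, b: str) -> int:
    """Levenshtein distance, evaluated anti-diagonal by anti-diagonal.

    Cell (i, j) holds the distance between a[:i] and b[:j].
    Buffers are indexed by the row i; the column is j = d - i.
    """
    n, m = len(a), len(b)
    prev2 = [0] * (n + 1)  # diagonal d - 2
    prev1 = [0] * (n + 1)  # diagonal d - 1
    curr = [0] * (n + 1)   # diagonal d

    for d in range(n + m + 1):
        i_lo = max(0, d - m)  # keeps j = d - i within 0..m
        i_hi = min(n, d)      # keeps i within 0..n and j >= 0
        for i in range(i_lo, i_hi + 1):
            j = d - i
            if i == 0:
                curr[i] = j  # a[:0] -> b[:j] needs j insertions
            elif j == 0:
                curr[i] = i  # a[:i] -> b[:0] needs i deletions
            else:
                cost = 0 if a[i - 1] == b[j - 1] else 1
                curr[i] = min(
                    prev1[i - 1] + 1,     # cell (i-1, j): deletion
                    prev1[i] + 1,         # cell (i, j-1): insertion
                    prev2[i - 1] + cost,  # cell (i-1, j-1): match or substitution
                )
        # Rotate buffers: the current diagonal becomes the previous one.
        prev2, prev1, curr = prev1, curr, prev2

    # After the final rotation, prev1 holds the last diagonal, whose only cell is (n, m).
    return prev1[n]


if __name__ == "__main__":
    print(levenshtein_antidiagonal("kitten", "sitting"))
```

Only three diagonal buffers are kept in memory, so the space cost is linear in the length of the first string rather than quadratic.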
