bench_batch

command

Version: v1.26.2 (latest) | Published: Mar 27, 2026 | License: Apache-2.0 | Imports: 11 | Imported by: 0

Repository

github.com/zerfoo/zerfoo

Documentation

Overview

Command bench_batch benchmarks the throughput of continuous batching against a session pool baseline.

Continuous batching dynamically batches decode steps from multiple concurrent sessions into a single forward pass, amortizing GPU kernel launch and memory transfer overhead. The session pool baseline runs each session independently, serialized on the shared graph mutex.
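
The difference between the two strategies can be illustrated with a toy simulation. The sketch below is not zerfoo's implementation: `toyModel`, `runSessionPool`, and `runContinuousBatching` are hypothetical names, and the "model" only counts forward passes instead of running GPU kernels. It assumes each session has a fixed token budget and shows that the session pool issues one forward pass per token per session, while continuous batching issues one pass per decode step across all active sessions.

```go
package main

import (
	"fmt"
	"sync"
)

// toyModel stands in for a shared compute graph (hypothetical type).
// It only counts forward passes; a real model would launch kernels here.
type toyModel struct {
	mu     sync.Mutex
	passes int
}

// forward simulates one forward pass over a batch of the given size.
// The mutex mirrors the shared graph mutex described above.
func (m *toyModel) forward(batchSize int) {
	m.mu.Lock()
	defer m.mu.Unlock()
	_ = batchSize // the per-pass overhead is paid once, whatever the batch size
	m.passes++
}

// runSessionPool decodes every session independently. Each decode step is a
// batch-of-one forward pass, serialized on the model mutex.
func runSessionPool(budgets []int) int {
	m := &toyModel{}
	var wg sync.WaitGroup
	for _, budget := range budgets {
		wg.Add(1)
		go func(tokens int) {
			defer wg.Done()
			for t := 0; t < tokens; t++ {
				m.forward(1)
			}
		}(budget)
	}
	wg.Wait()
	return m.passes
}

// runContinuousBatching gathers the pending decode step of every session that
// still has tokens left and runs them as a single forward pass per step.
func runContinuousBatching(budgets []int) int {
	m := &toyModel{}
	remaining := append([]int(nil), budgets...)
	for {
		active := 0
		for i := range remaining {
			if remaining[i] > 0 {
				remaining[i]--
				active++
			}
		}
		if active == 0 {
			break // every session has finished decoding
		}
		m.forward(active)
	}
	return m.passes
}

func main() {
	// 8 sessions of 128 tokens each, matching the values in the usage line.
	budgets := make([]int, 8)
	for i := range budgets {
		budgets[i] = 128
	}
	fmt.Println("session pool forward passes:       ", runSessionPool(budgets))
	fmt.Println("continuous batching forward passes:", runContinuousBatching(budgets))
}
```

In this toy model the session pool performs as many passes as the sum of all token budgets, and continuous batching performs as many as the largest budget. Real throughput gains depend on how well the backend scales with batch size, which this sketch does not model.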

Usage:

bench_batch --model /path/to/model.gguf [--sessions 8] [--tokens 128] [--backend cuda] [--warmup 2]
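
For readers who want to build a similar tool, the flags in the usage line could be parsed with the standard library `flag` package, which accepts both `-name` and `--name` forms. This is a minimal sketch rather than the command's actual source: the `config` type and `parseArgs` function are hypothetical, the defaults are assumed from the bracketed values in the usage line, and treating `--model` as required is inferred from it being the only unbracketed flag.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// config holds the benchmark options (hypothetical type for illustration).
type config struct {
	Model    string
	Sessions int
	Tokens   int
	Backend  string
	Warmup   int
}

// parseArgs parses command-line arguments into a config.
// Defaults are assumptions taken from the usage line, not confirmed values.
func parseArgs(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("bench_batch", flag.ContinueOnError)
	fs.StringVar(&c.Model, "model", "", "path to a GGUF model file (assumed required)")
	fs.IntVar(&c.Sessions, "sessions", 8, "number of concurrent sessions")
	fs.IntVar(&c.Tokens, "tokens", 128, "tokens to decode per session")
	fs.StringVar(&c.Backend, "backend", "cuda", "compute backend")
	fs.IntVar(&c.Warmup, "warmup", 2, "warmup runs before measuring")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	if c.Model == "" {
		return c, errors.New("--model is required")
	}
	if c.Sessions < 1 || c.Tokens < 1 || c.Warmup < 0 {
		return c, errors.New("--sessions and --tokens must be positive, --warmup non-negative")
	}
	return c, nil
}

func main() {
	// Fixed example arguments; a real command would pass os.Args[1:].
	cfg, err := parseArgs([]string{"--model", "/path/to/model.gguf", "--sessions", "4"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Using a dedicated `flag.FlagSet` instead of the package-level flags keeps the parser testable, since it can be called with any argument slice.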

Source Files

main.go
