bench_disagg

command
v1.32.0 Latest
Published: Mar 28, 2026 License: Apache-2.0 Imports: 16 Imported by: 0

Documentation

Overview

Command bench_disagg benchmarks disaggregated vs collocated serving throughput.

In disaggregated mode, prefill and decode run on separate workers behind a gateway that routes requests via least-loaded scheduling. In collocated mode, a single worker handles both prefill and decode sequentially.
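The least-loaded routing mentioned above can be sketched as follows. This is a minimal illustration, not the command's actual gateway code: the `worker` type, its `inFlight` counter, and `pickLeastLoaded` are hypothetical names, and "load" is assumed here to mean the number of in-flight requests.

```go
package main

import "fmt"

// worker is a hypothetical stand-in for a prefill or decode worker.
// inFlight is an assumed load metric: requests currently assigned to it.
type worker struct {
	name     string
	inFlight int
}

// pickLeastLoaded returns the index of the worker with the fewest in-flight
// requests, or -1 if the slice is empty. Ties go to the lowest index.
func pickLeastLoaded(workers []worker) int {
	best := -1
	for i, w := range workers {
		if best == -1 || w.inFlight < workers[best].inFlight {
			best = i
		}
	}
	return best
}

func main() {
	decode := []worker{
		{name: "decode-0", inFlight: 3},
		{name: "decode-1", inFlight: 1},
		{name: "decode-2", inFlight: 2},
	}
	i := pickLeastLoaded(decode)
	decode[i].inFlight++ // account for the newly routed request
	fmt.Println("routed to", decode[i].name)
}
```

A real gateway would update the counters concurrently (for example with a mutex or atomics) and decrement them when a request completes; the sketch leaves that out to keep the selection rule visible.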

The benchmark measures requests/sec, mean time to first token (TTFT), and P99 latency for both modes at configurable concurrency levels.

Usage:

    bench_disagg [--concurrent 16] [--requests 100] [--tokens 50]
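For readers unfamiliar with Go command-line handling, the sketch below shows how flags of this shape could be parsed with the standard `flag` package. It is not the command's source: the `config` type and `parseArgs` are hypothetical, the values 16, 100, and 50 are assumed to be defaults based on the usage line, and the help strings are guesses at each flag's meaning.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// config holds the three options from the usage line. Field meanings are
// assumptions; only the flag names come from the documentation.
type config struct {
	concurrent int
	requests   int
	tokens     int
}

// parseArgs parses args (without the program name) into a config.
// The defaults mirror the values in the usage line, assumed to be defaults.
func parseArgs(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("bench_disagg", flag.ContinueOnError)
	fs.IntVar(&c.concurrent, "concurrent", 16, "concurrency level (assumed meaning)")
	fs.IntVar(&c.requests, "requests", 100, "number of requests (assumed meaning)")
	fs.IntVar(&c.tokens, "tokens", 50, "token count per request (assumed meaning)")
	err := fs.Parse(args)
	return c, err
}

func main() {
	cfg, err := parseArgs(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("concurrent=%d requests=%d tokens=%d\n", cfg.concurrent, cfg.requests, cfg.tokens)
}
```

The `flag` package accepts both `-concurrent 32` and `--concurrent 32`, so the double-dash form in the usage line works without extra handling.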
