benchrunner

command
v0.4.1
Published: Jun 8, 2026 License: MIT Imports: 17 Imported by: 0

Documentation

Overview

benchrunner is a black-box benchmark harness that compares coding agents by running them against a set of tasks and collecting structured traces.

Usage:

go run ./bench/cmd/benchrunner/ [flags]
go build ./bench/cmd/benchrunner/ && ./benchrunner [flags]

Flags:

--agent string      Filter to a single agent ID (e.g., "deepseekcode-current")
--task string       Filter to a single task ID (e.g., "ctx-long-readonly")
--dry-run           Show what would run without executing
--bench-dir string  Root bench directory (default "bench")
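
As an illustration of how a flag set like the one above could be declared, here is a minimal sketch using Go's standard flag package, which accepts both the -flag and --flag spellings. It is not taken from the benchrunner source: the config type and the parseFlags and selected helpers are hypothetical names, and the rule that an empty filter matches every agent or task is an assumption.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// config holds the parsed options. The type and its field names are
// hypothetical; only the flag names and the "bench" default come from
// the documentation above.
type config struct {
	Agent    string
	Task     string
	DryRun   bool
	BenchDir string
}

// parseFlags parses args (without the program name) into a config.
// Taking args as a parameter keeps the function easy to test.
func parseFlags(args []string) (config, error) {
	var cfg config
	fs := flag.NewFlagSet("benchrunner", flag.ContinueOnError)
	fs.StringVar(&cfg.Agent, "agent", "", "Filter to a single agent ID")
	fs.StringVar(&cfg.Task, "task", "", "Filter to a single task ID")
	fs.BoolVar(&cfg.DryRun, "dry-run", false, "Show what would run without executing")
	fs.StringVar(&cfg.BenchDir, "bench-dir", "bench", "Root bench directory")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	return cfg, nil
}

// selected reports whether id passes a filter.
// Assumption: an empty filter means "no filtering", so every ID matches.
func selected(filter, id string) bool {
	return filter == "" || filter == id
}

func main() {
	cfg, err := parseFlags(os.Args[1:])
	if err != nil {
		// The FlagSet has already written the error and usage to stderr.
		os.Exit(2)
	}
	fmt.Printf("%+v\n", cfg)
}
```

With this sketch, passing --agent deepseekcode-current --dry-run yields a config whose Agent is set, whose Task is empty, and whose BenchDir keeps the documented default of "bench".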
