package benchmark

Version: v1.38.1 (latest) | Published: Mar 30, 2026 | License: Apache-2.0 | Imports: 7 | Imported by: 0

Repository

github.com/zerfoo/zerfoo

Documentation ¶

Overview ¶

Package benchmark provides a standardized benchmark suite for measuring ML model inference performance: decode throughput (tok/s), prefill throughput (tok/s), memory usage, and time to first token.

Index ¶

func ResultsJSON(results []BenchmarkResult) ([]byte, error)
type BenchmarkResult
- func RunB(b *testing.B, cfg Config, infer InferenceFunc) []BenchmarkResult
type Config
- func (c Config) Validate() error
type InferenceFunc
type ModelSpec
type RunMetrics
type Suite
- func NewSuite(cfg Config, infer InferenceFunc) (*Suite, error)
- func (s *Suite) Run(ctx context.Context) ([]BenchmarkResult, error)

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func ResultsJSON ¶

func ResultsJSON(results []BenchmarkResult) ([]byte, error)

ResultsJSON returns the benchmark results as a JSON byte slice.
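
To illustrate the shape of the output, the following is a minimal sketch that marshals a local copy of BenchmarkResult with encoding/json. The lowercase type, the resultsJSON helper, the sample values, and the choice of json.MarshalIndent are assumptions made for illustration; the package's own ResultsJSON may format its output differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// benchmarkResult is a local stand-in mirroring the documented
// BenchmarkResult fields and JSON tags. It is not the package's type.
type benchmarkResult struct {
	ModelName           string  `json:"model_name"`
	Quantization        string  `json:"quantization"`
	BatchSize           int     `json:"batch_size"`
	DecodeTokensPerSec  float64 `json:"decode_tokens_per_sec"`
	PrefillTokensPerSec float64 `json:"prefill_tokens_per_sec"`
	MemoryUsageMB       float64 `json:"memory_usage_mb"`
	TimeToFirstTokenMS  float64 `json:"time_to_first_token_ms"`
	Timestamp           string  `json:"timestamp"`
}

// resultsJSON is a hypothetical equivalent of ResultsJSON: it encodes
// the slice as an indented JSON array.
func resultsJSON(results []benchmarkResult) ([]byte, error) {
	return json.MarshalIndent(results, "", "  ")
}

func main() {
	// All values below are made up for the example.
	out, err := resultsJSON([]benchmarkResult{{
		ModelName:          "example-model",
		Quantization:       "q4_0",
		BatchSize:          1,
		DecodeTokensPerSec: 64,
		Timestamp:          "2026-03-30T00:00:00Z",
	}})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the struct tags are snake_case, downstream tools can consume keys such as decode_tokens_per_sec without knowing the Go field names.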

Types ¶

type BenchmarkResult ¶

type BenchmarkResult struct {
	ModelName           string  `json:"model_name"`
	Quantization        string  `json:"quantization"`
	BatchSize           int     `json:"batch_size"`
	DecodeTokensPerSec  float64 `json:"decode_tokens_per_sec"`
	PrefillTokensPerSec float64 `json:"prefill_tokens_per_sec"`
	MemoryUsageMB       float64 `json:"memory_usage_mb"`
	TimeToFirstTokenMS  float64 `json:"time_to_first_token_ms"`
	Timestamp           string  `json:"timestamp"`
}

BenchmarkResult holds the metrics from a single benchmark configuration.

func RunB ¶

func RunB(b *testing.B, cfg Config, infer InferenceFunc) []BenchmarkResult

RunB is a helper for integrating the suite with Go's testing.B. It creates a Suite, runs it inside the benchmark function, and reports decode throughput (tok/s) as the benchmark metric.
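
As background on how a custom metric reaches Go's benchmark output, here is a small sketch using only the standard testing package. It does not call RunB; fakeDecodeTokPerSec, measureDecode, and the unit name "decode_tok/s" are hypothetical, and how RunB actually names or aggregates its metric is not stated in this documentation.

```go
package main

import (
	"fmt"
	"testing"
)

// fakeDecodeTokPerSec is a hypothetical stand-in for one measured run.
func fakeDecodeTokPerSec() float64 { return 42 }

// measureDecode runs a benchmark body outside "go test" through
// testing.Benchmark and returns the custom metric that body reported.
func measureDecode() float64 {
	res := testing.Benchmark(func(b *testing.B) {
		var total float64
		for i := 0; i < b.N; i++ {
			total += fakeDecodeTokPerSec()
		}
		// Report the mean as a custom metric. The unit name is an
		// assumption chosen for this example.
		b.ReportMetric(total/float64(b.N), "decode_tok/s")
	})
	// Custom metrics are exposed through BenchmarkResult.Extra.
	return res.Extra["decode_tok/s"]
}

func main() {
	fmt.Printf("decode throughput: %.1f tok/s\n", measureDecode())
}
```

Inside a regular Benchmark function the same ReportMetric call adds a column to the "go test -bench" output, which is presumably the mechanism a helper like RunB relies on.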

type Config ¶

type Config struct {
	Models        []ModelSpec `json:"models"`
	Quantizations []string    `json:"quantizations"`
	BatchSizes    []int       `json:"batch_sizes"`
	WarmupRuns    int         `json:"warmup_runs"`
	BenchmarkRuns int         `json:"benchmark_runs"`
}

Config controls what the benchmark suite measures.
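
Since Config carries JSON tags, a configuration can plausibly be kept in a file and decoded at startup. The sketch below decodes such a document into local stand-in types that mirror the documented fields. The model path, model name, architecture, and quantization labels are invented for the example, and parseConfig is a hypothetical helper, not part of the package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelSpec and config are local stand-ins mirroring the documented
// ModelSpec and Config fields and JSON tags.
type modelSpec struct {
	Path         string `json:"path"`
	Name         string `json:"name"`
	Architecture string `json:"architecture"`
}

type config struct {
	Models        []modelSpec `json:"models"`
	Quantizations []string    `json:"quantizations"`
	BatchSizes    []int       `json:"batch_sizes"`
	WarmupRuns    int         `json:"warmup_runs"`
	BenchmarkRuns int         `json:"benchmark_runs"`
}

// exampleConfig uses made-up values throughout.
const exampleConfig = `{
  "models": [
    {"path": "models/example.bin", "name": "example-model", "architecture": "llama"}
  ],
  "quantizations": ["q4_0", "q8_0"],
  "batch_sizes": [1, 8],
  "warmup_runs": 2,
  "benchmark_runs": 5
}`

// parseConfig is a hypothetical helper that decodes a JSON config.
func parseConfig(data []byte) (config, error) {
	var c config
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	c, err := parseConfig([]byte(exampleConfig))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d model(s), %d quantization(s), %d batch size(s)\n",
		len(c.Models), len(c.Quantizations), len(c.BatchSizes))
}
```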

func (Config) Validate ¶

func (c Config) Validate() error

Validate checks that the configuration is well-formed.
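
The documentation does not list the rules Validate enforces. The sketch below shows checks that would be reasonable for a configuration of this shape; every rule in it is an assumption, and the types and the validConfig helper are local stand-ins rather than the package's own.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the documented ModelSpec and Config types.
type modelSpec struct{ Path, Name, Architecture string }

type config struct {
	Models        []modelSpec
	Quantizations []string
	BatchSizes    []int
	WarmupRuns    int
	BenchmarkRuns int
}

// validate applies hypothetical well-formedness rules. The real
// Validate may check more, fewer, or different conditions.
func (c config) validate() error {
	if len(c.Models) == 0 {
		return errors.New("config: at least one model is required")
	}
	if len(c.Quantizations) == 0 {
		return errors.New("config: at least one quantization is required")
	}
	if len(c.BatchSizes) == 0 {
		return errors.New("config: at least one batch size is required")
	}
	for _, b := range c.BatchSizes {
		if b <= 0 {
			return fmt.Errorf("config: batch size %d must be positive", b)
		}
	}
	if c.WarmupRuns < 0 {
		return errors.New("config: warmup runs must not be negative")
	}
	if c.BenchmarkRuns < 1 {
		return errors.New("config: at least one benchmark run is required")
	}
	return nil
}

// validConfig returns a config that passes the hypothetical rules.
func validConfig() config {
	return config{
		Models:        []modelSpec{{Name: "example-model"}},
		Quantizations: []string{"q4_0"},
		BatchSizes:    []int{1, 8},
		WarmupRuns:    1,
		BenchmarkRuns: 3,
	}
}

func main() {
	fmt.Println("valid config error:", validConfig().validate())
	fmt.Println("empty config error:", config{}.validate())
}
```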

type InferenceFunc ¶

type InferenceFunc func(ctx context.Context, model ModelSpec, quantization string, batchSize int) (RunMetrics, error)

InferenceFunc is the function signature that the suite calls to run a single inference benchmark. Implementations should return metrics for one run of the given model, quantization, and batch size.
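
A minimal sketch of a function with this shape follows, using local stand-in types. The token counts and timings are hard-coded so the arithmetic is easy to follow: throughput is tokens divided by elapsed seconds, and time to first token is taken here to be the prefill duration, which is an assumption. A real implementation would time an actual model; fakeInfer, tokensPerSec, and sampleMetrics are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Local stand-ins for the documented ModelSpec and RunMetrics types.
type modelSpec struct{ Path, Name, Architecture string }

type runMetrics struct {
	DecodeTokensPerSec  float64
	PrefillTokensPerSec float64
	MemoryUsageMB       float64
	TimeToFirstTokenMS  float64
}

// tokensPerSec converts a token count and elapsed time to throughput.
func tokensPerSec(tokens int, elapsed time.Duration) float64 {
	return float64(tokens) / elapsed.Seconds()
}

// fakeInfer has the same parameter and result shape as InferenceFunc
// but returns metrics derived from fixed, made-up measurements.
func fakeInfer(ctx context.Context, model modelSpec, quantization string, batchSize int) (runMetrics, error) {
	if err := ctx.Err(); err != nil {
		return runMetrics{}, err // honor cancellation before doing work
	}
	prefillTokens, prefillTime := 512, 250*time.Millisecond
	decodeTokens, decodeTime := 128, 2*time.Second
	return runMetrics{
		DecodeTokensPerSec:  tokensPerSec(decodeTokens, decodeTime),
		PrefillTokensPerSec: tokensPerSec(prefillTokens, prefillTime),
		MemoryUsageMB:       1024, // placeholder value
		TimeToFirstTokenMS:  float64(prefillTime) / float64(time.Millisecond),
	}, nil
}

// sampleMetrics runs fakeInfer once with example arguments.
func sampleMetrics() (runMetrics, error) {
	return fakeInfer(context.Background(), modelSpec{Name: "example-model"}, "q4_0", 1)
}

func main() {
	m, err := sampleMetrics()
	if err != nil {
		panic(err)
	}
	fmt.Printf("decode %.0f tok/s, prefill %.0f tok/s, TTFT %.0f ms\n",
		m.DecodeTokensPerSec, m.PrefillTokensPerSec, m.TimeToFirstTokenMS)
}
```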

type ModelSpec ¶

type ModelSpec struct {
	Path         string `json:"path"`
	Name         string `json:"name"`
	Architecture string `json:"architecture"`
}

ModelSpec identifies a model to benchmark.

type RunMetrics ¶

type RunMetrics struct {
	DecodeTokensPerSec  float64
	PrefillTokensPerSec float64
	MemoryUsageMB       float64
	TimeToFirstTokenMS  float64
}

RunMetrics holds the raw measurements from a single inference run.

type Suite ¶

type Suite struct {
	// contains filtered or unexported fields
}

Suite orchestrates running standardized benchmarks across all combinations of models, quantizations, and batch sizes.

func NewSuite ¶

func NewSuite(cfg Config, infer InferenceFunc) (*Suite, error)

NewSuite creates a benchmark suite with the given configuration and inference function.

func (*Suite) Run ¶

func (s *Suite) Run(ctx context.Context) ([]BenchmarkResult, error)

Run executes every combination of model, quantization, and batch size. For each combination it performs the warmup runs followed by the benchmark runs, and it returns one BenchmarkResult per combination holding the mean of each metric across the benchmark runs.
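
The control flow described above can be sketched as nested loops over the three dimensions, with warmup results discarded and benchmark results averaged. This is an illustration only: the types are simplified stand-ins tracking a single metric, and the iteration order and the choice to stop at the first error are assumptions, not documented behavior.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Simplified stand-ins: only decode throughput is tracked here.
type metrics struct{ DecodeTokensPerSec float64 }

type inferFunc func(ctx context.Context, model, quant string, batch int) (metrics, error)

type result struct {
	Model, Quant string
	Batch        int
	MeanDecode   float64
}

// run visits every model x quantization x batch-size combination,
// discards warmup runs, and averages the benchmark runs.
func run(ctx context.Context, models, quants []string, batches []int,
	warmup, runs int, infer inferFunc) ([]result, error) {
	if runs < 1 {
		return nil, errors.New("at least one benchmark run is required")
	}
	var results []result
	for _, m := range models {
		for _, q := range quants {
			for _, b := range batches {
				for i := 0; i < warmup; i++ {
					if _, err := infer(ctx, m, q, b); err != nil {
						return nil, err
					}
				}
				var sum float64
				for i := 0; i < runs; i++ {
					r, err := infer(ctx, m, q, b)
					if err != nil {
						return nil, err
					}
					sum += r.DecodeTokensPerSec
				}
				results = append(results, result{m, q, b, sum / float64(runs)})
			}
		}
	}
	return results, nil
}

// demo runs 1 model x 2 quantizations x 2 batch sizes with 1 warmup
// and 3 benchmark runs each, counting calls to the fake inference.
func demo() ([]result, int, error) {
	calls := 0
	infer := func(ctx context.Context, model, quant string, batch int) (metrics, error) {
		calls++
		return metrics{DecodeTokensPerSec: float64(10 * batch)}, nil
	}
	res, err := run(context.Background(), []string{"example-model"},
		[]string{"q4_0", "q8_0"}, []int{1, 8}, 1, 3, infer)
	return res, calls, err
}

func main() {
	res, calls, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d results from %d inference calls\n", len(res), calls)
}
```

The number of results equals the product of the three slice lengths, and the number of inference calls is that product times the sum of warmup and benchmark runs, which is worth keeping in mind when sizing a configuration.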

Source Files ¶

suite.go
