Documentation
Overview
Recipe 08: Batch Inference
Run inference over many prompts concurrently using goroutines. This pattern is useful for processing datasets, evaluations, or any batch workload.
The program loads a single model and fans out generation across a configurable number of worker goroutines, collecting results in order.
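The fan-out described above can be sketched as a small worker pool. This is a minimal illustration, not the recipe's actual source: `generate` is a hypothetical stand-in for the real model call (the inference API is not shown in this overview), and `runBatch` is a name chosen only for this example. Whether a single loaded model can safely be shared across goroutines depends on the binding in use, so treat the shared `gen` function here as an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// generate is a hypothetical stand-in for a real model call.
// It only transforms the prompt so the example runs without a model file.
func generate(prompt string) string {
	return "echo: " + strings.ToUpper(prompt)
}

// runBatch fans prompts out to the given number of worker goroutines and
// returns the outputs in the same order as the input prompts.
func runBatch(prompts []string, workers int, gen func(string) string) []string {
	if workers < 1 {
		workers = 1
	}
	results := make([]string, len(prompts))
	jobs := make(chan int) // carries indexes into prompts

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handled by exactly one worker, so writing
				// to results[i] needs no mutex.
				results[i] = gen(prompts[i])
			}
		}()
	}

	// Feed every prompt index to the pool, then signal that no more work remains.
	for i := range prompts {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	prompts := []string{"alpha", "beta", "gamma"}
	for i, out := range runBatch(prompts, 2, generate) {
		fmt.Printf("%d: %s\n", i, out)
	}
}
```

Writing each result into a preallocated slice at the prompt's own index keeps the output in input order regardless of which worker finishes first, which avoids a separate sorting or reassembly step.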
Usage:
go run ./docs/cookbook/08-batch-inference/ --model path/to/model.gguf