examples/

README

Zerfoo Examples

These examples demonstrate Zerfoo's core value: embeddable ML inference in pure Go. Each example is a standalone program that you can build with go build and then run.

Prerequisites

Go 1.25 or later -- Download Go
A GGUF model file -- download one from HuggingFace. For a quick start, pull Gemma 3 1B Q4:

zerfoo pull google/gemma-3-1b-it-qat-q4_0-gguf

Or download directly:

# The model file will be cached in ~/.cache/zerfoo/
zerfoo pull gemma-3-1b-q4

CUDA toolkit (optional) -- only needed for GPU acceleration. All examples work on CPU out of the box.

Available Examples

Example	Description
`inference/`	Load a GGUF model and generate text from a prompt. Demonstrates the core `inference.LoadFile` and `model.Generate` API with sampling options (temperature, top-K, top-P) and token streaming.
`api-server/`	Start an OpenAI-compatible HTTP server backed by a GGUF model. Demonstrates `serve.NewServer` with graceful shutdown. The server is a drop-in replacement for the OpenAI API, so existing OpenAI clients can talk to it.
`embedding/`	Embed inference inside a custom Go HTTP handler. Demonstrates the pattern of loading a model once at startup and serving many concurrent requests through your own routing and request/response types.

Running an Example

# Build and run the inference example
go build -o inference ./examples/inference/
./inference path/to/model.gguf "What is the capital of France?"

# With GPU acceleration (automatic if CUDA is available)
./inference --device cuda path/to/model.gguf "What is the capital of France?"

Directories

Path	Synopsis
api-server	Command api-server demonstrates starting an OpenAI-compatible inference server.
embedding	Command embedding demonstrates embedding Zerfoo inference inside a Go HTTP handler.
inference	Command inference demonstrates loading a GGUF model and generating text.
