examples/

directory
v1.7.0
Published: Mar 18, 2026 License: Apache-2.0

README

Zerfoo Examples

These examples demonstrate Zerfoo's core value: embeddable ML inference in pure Go. Each example is a standalone program that you can compile with go build and then run.

Prerequisites

  • Go 1.25 or later -- Download Go
  • A GGUF model file -- download one from HuggingFace. For a quick start, pull Gemma 3 1B Q4:
        zerfoo pull google/gemma-3-1b-it-qat-q4_0-gguf

    Or download directly:

        # The model file will be cached in ~/.cache/zerfoo/
        zerfoo pull gemma-3-1b-q4
  • CUDA toolkit (optional) -- only needed for GPU acceleration. All examples work on CPU out of the box.

Available Examples

| Example | Description | Prerequisites |
|---|---|---|
| chat/ | Interactive chatbot CLI. Demonstrates the zerfoo.Load and model.Chat one-line API with a readline loop. | GGUF model file |
| embedding-search/ | Semantic search over a document corpus using model embeddings and cosine similarity. | GGUF model file |
| rag/ | Retrieval-augmented generation: embed documents, retrieve relevant ones, and generate grounded answers. | GGUF model file |
| code-completion/ | Generate code completions from partial code snippets using inference.LoadFile and model.Generate. | GGUF model file |
| summarization/ | Summarize text from a string or file using a language model. | GGUF model file |
| translation/ | Translate text between languages using a multilingual model. | GGUF model file |
| classification/ | Text classification with grammar-constrained JSON output using inference.WithGrammar. | GGUF model file |
| vision-analysis/ | Analyze images using a vision-capable model with inference.Message.Images. | Vision GGUF model + image |
| audio-transcription/ | Speech-to-text using the OpenAI-compatible /v1/audio/transcriptions endpoint. | Whisper GGUF model + audio file |
| agentic-tool-use/ | Function calling (tool use) with zerfoo.WithTools for agentic AI patterns. | GGUF model file |
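
The embedding-search example is described as ranking documents by cosine similarity between embeddings. As a rough illustration of that ranking step only, here is a minimal standard-library sketch. It does not call Zerfoo: the toy 3-dimensional vectors stand in for real model embeddings, and the names cosineSimilarity, scoredDoc, and rank are hypothetical helpers, not part of the Zerfoo API.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosineSimilarity returns the cosine of the angle between a and b.
// It returns 0 when the lengths differ or either vector has zero magnitude.
func cosineSimilarity(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// scoredDoc pairs a document index with its similarity to the query.
// (Hypothetical helper type for this sketch.)
type scoredDoc struct {
	Index int
	Score float64
}

// rank scores every document against the query and sorts by descending score.
func rank(query []float32, docs [][]float32) []scoredDoc {
	out := make([]scoredDoc, len(docs))
	for i, d := range docs {
		out[i] = scoredDoc{Index: i, Score: cosineSimilarity(query, d)}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	// Toy vectors; in a real program these would come from a model's embeddings.
	docs := [][]float32{{1, 0, 0}, {0, 1, 0}, {1, 1, 0}}
	query := []float32{1, 0.1, 0}
	for _, r := range rank(query, docs) {
		fmt.Printf("doc %d  score %.3f\n", r.Index, r.Score)
	}
}
```

In a retrieval pipeline, the top-ranked indices would typically be used to pick which documents to pass along as context.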

Additional Examples

| Example | Description | Prerequisites |
|---|---|---|
| inference/ | Load a GGUF model and generate text with sampling options and token streaming. | GGUF model file |
| streaming/ | Streaming chat generation using model.ChatStream with per-token output. | GGUF model file |
| embedding/ | Embed inference inside a custom Go HTTP handler for concurrent request serving. | GGUF model file |
| api-server/ | Start an OpenAI-compatible HTTP server with serve.NewServer and graceful shutdown. | GGUF model file |
| json-output/ | Grammar-guided decoding that constrains output to valid JSON matching a schema. | GGUF model file |
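
The classification and json-output examples constrain model output to valid JSON. A caller may still want to check the decoded value against its own rules before acting on it. The following is a minimal sketch of that consumer-side check using only the standard library. The classification struct, the "category" field name, and the allowed labels are assumptions made for illustration; the model output is a hard-coded string rather than a real Zerfoo call.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// classification is a hypothetical shape for a grammar-constrained response.
type classification struct {
	Category string `json:"category"`
}

// parseClassification decodes raw strictly (no unknown fields, no trailing
// data) and checks that the category is one of the allowed labels.
func parseClassification(raw string, allowed []string) (classification, error) {
	var c classification
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&c); err != nil {
		return classification{}, fmt.Errorf("decode model output: %w", err)
	}
	// A second Decode must hit end of input; anything else is trailing data.
	var extra json.RawMessage
	if err := dec.Decode(&extra); err != io.EOF {
		return classification{}, fmt.Errorf("unexpected data after JSON object")
	}
	for _, label := range allowed {
		if c.Category == label {
			return c, nil
		}
	}
	return classification{}, fmt.Errorf("category %q is not one of %v", c.Category, allowed)
}

func main() {
	allowed := []string{"sports", "politics", "technology"}
	raw := `{"category": "technology"}` // stand-in for model output
	c, err := parseClassification(raw, allowed)
	if err != nil {
		fmt.Println("invalid output:", err)
		return
	}
	fmt.Println("category:", c.Category)
}
```

Strict decoding is a design choice here: rejecting unknown fields and trailing data surfaces a mismatch between the grammar and the consuming struct early instead of silently ignoring it.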

Running an Example

# Build and run the chat example
go build -o chat ./examples/chat/
./chat --model path/to/model.gguf

# Build and run the code completion example
go build -o code-completion ./examples/code-completion/
./code-completion --model path/to/model.gguf --code "func fibonacci(n int) int {"

# With GPU acceleration
./code-completion --model path/to/model.gguf --device cuda --code "func add(a, b int) int {"
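
The commands above pass --model, --device, and --code flags. For readers writing their own example program, here is a minimal sketch of how such flags could be parsed with Go's standard flag package. The flag names are taken from the commands above; the options struct, the parseArgs helper, the "cpu" default, and the restriction to "cpu" or "cuda" are assumptions for illustration and may not match how the real examples handle arguments.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// options holds the parsed command-line settings (hypothetical type).
type options struct {
	Model  string
	Device string
	Code   string
}

// parseArgs parses args (without the program name) into options.
// Go's flag package accepts both -model and --model spellings.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("code-completion", flag.ContinueOnError)
	fs.StringVar(&o.Model, "model", "", "path to a GGUF model file")
	fs.StringVar(&o.Device, "device", "cpu", "compute device (assumed: cpu or cuda)")
	fs.StringVar(&o.Code, "code", "", "partial code snippet to complete")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if o.Model == "" {
		return options{}, errors.New("--model is required")
	}
	if o.Device != "cpu" && o.Device != "cuda" {
		return options{}, fmt.Errorf("unsupported device %q", o.Device)
	}
	return o, nil
}

func main() {
	opts, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	// A real program would load the model and run generation here.
	fmt.Printf("model=%s device=%s code=%q\n", opts.Model, opts.Device, opts.Code)
}
```

Using a dedicated FlagSet rather than the global flag set keeps parsing testable, since parseArgs can be called with any argument slice.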

Further Reading

See docs/getting-started.md for a full tutorial covering CLI usage, library API, and the OpenAI-compatible server.

Directories

| Path | Synopsis |
|---|---|
| agentic-tool-use | Command agentic-tool-use demonstrates function calling (tool use) with a language model using the zerfoo one-line API. |
| api-server | Command api-server demonstrates starting an OpenAI-compatible inference server. |
| audio-transcription | Command audio-transcription demonstrates speech-to-text using the Zerfoo OpenAI-compatible API server. |
| chat | Command chat demonstrates a simple interactive chatbot using the zerfoo one-line API. |
| classification | Command classification demonstrates text classification using grammar-constrained decoding to guarantee a valid JSON response with a category label. |
| code-completion | Command code-completion demonstrates using a language model for code completion. |
| embedding | Command embedding demonstrates embedding Zerfoo inference inside a Go HTTP handler. |
| embedding-search | Command embedding-search demonstrates semantic search using model embeddings. |
| inference | Command inference demonstrates loading a GGUF model and generating text. |
| json-output | Command json-output demonstrates grammar-guided decoding with a JSON schema. |
| rag | Command rag demonstrates retrieval-augmented generation using Zerfoo. |
| streaming | Command streaming demonstrates streaming chat generation using the zerfoo API. |
| summarization | Command summarization demonstrates text summarization using a GGUF language model. |
| translation | Command translation demonstrates text translation using a GGUF language model. |
| vision-analysis | Command vision-analysis demonstrates multimodal inference with image input. |
