examples/

directory
v1.26.2
Published: Mar 27, 2026 License: Apache-2.0

README

Zerfoo Examples

These examples demonstrate Zerfoo's core value: embeddable ML inference in pure Go. Each example is a standalone program that you build with go build and then run as a regular binary.

Prerequisites

  • Go 1.25 or later -- Download Go
  • A GGUF model file -- download one from Hugging Face. For a quick start, pull Gemma 3 1B Q4:

    zerfoo pull google/gemma-3-1b-it-qat-q4_0-gguf

    Or download directly:

    # The model file will be cached in ~/.cache/zerfoo/
    zerfoo pull gemma-3-1b-q4

  • CUDA toolkit (optional) -- only needed for GPU acceleration. All examples work on CPU out of the box.

Available Examples

| Example | Description | Prerequisites |
|---|---|---|
| chat/ | Interactive chatbot CLI. Demonstrates the zerfoo.Load and model.Chat one-line API with a readline loop. | GGUF model file |
| embedding-search/ | Semantic search over a document corpus using model embeddings and cosine similarity. | GGUF model file |
| rag/ | Retrieval-augmented generation: embed documents, retrieve relevant ones, and generate grounded answers. | GGUF model file |
| code-completion/ | Generate code completions from partial code snippets using inference.LoadFile and model.Generate. | GGUF model file |
| summarization/ | Summarize text from a string or file using a language model. | GGUF model file |
| translation/ | Translate text between languages using a multilingual model. | GGUF model file |
| classification/ | Text classification with grammar-constrained JSON output using inference.WithGrammar. | GGUF model file |
| vision-analysis/ | Analyze images using a vision-capable model with inference.Message.Images. | Vision GGUF model + image |
| audio-transcription/ | Speech-to-text using the OpenAI-compatible /v1/audio/transcriptions endpoint. | Whisper GGUF model + audio file |
| agentic-tool-use/ | Function calling (tool use) with zerfoo.WithTools for agentic AI patterns. | GGUF model file |

Additional Examples

| Example | Description | Prerequisites |
|---|---|---|
| inference/ | Load a GGUF model and generate text with sampling options and token streaming. | GGUF model file |
| streaming/ | Streaming chat generation using model.ChatStream with per-token output. | GGUF model file |
| embedding/ | Embed inference inside a custom Go HTTP handler for concurrent request serving. | GGUF model file |
| api-server/ | Start an OpenAI-compatible HTTP server with serve.NewServer and graceful shutdown. | GGUF model file |
| json-output/ | Grammar-guided decoding that constrains output to valid JSON matching a schema. | GGUF model file |
| fine-tuning/ | LoRA fine-tuning of a tabular model: pre-train, adapt, merge, save/load. | None (synthetic data) |
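
Grammar-constrained output, as in classification/ and json-output/, is typically decoded into a Go struct on the consuming side. The sketch below shows one way to do that strictly with encoding/json. The `classification` struct, the label set, and `parseClassification` are assumptions made for illustration; the actual schema is whatever the example defines.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// classification is a hypothetical output shape with a single category label.
type classification struct {
	Category string `json:"category"`
}

// allowed is an illustrative label set, not one taken from the Zerfoo examples.
var allowed = map[string]bool{"positive": true, "negative": true, "neutral": true}

// parseClassification decodes raw model output, rejecting unknown fields,
// trailing data, and labels outside the allowed set.
func parseClassification(raw string) (classification, error) {
	var c classification
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&c); err != nil {
		return classification{}, fmt.Errorf("invalid JSON: %w", err)
	}
	// A second Decode must hit end of input; anything else is trailing data.
	var extra json.RawMessage
	if err := dec.Decode(&extra); err != io.EOF {
		return classification{}, fmt.Errorf("unexpected data after JSON object")
	}
	if !allowed[c.Category] {
		return classification{}, fmt.Errorf("unexpected category %q", c.Category)
	}
	return c, nil
}

func main() {
	out := `{"category":"positive"}` // stand-in for model output
	c, err := parseClassification(out)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("category:", c.Category)
}
```

Even when decoding is constrained by a grammar, a check like this keeps the calling code robust if the grammar and the struct ever drift apart.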

Running an Example

# Build and run the chat example
go build -o chat ./examples/chat/
./chat --model path/to/model.gguf

# Build and run the code completion example
go build -o code-completion ./examples/code-completion/
./code-completion --model path/to/model.gguf --code "func fibonacci(n int) int {"

# With GPU acceleration
./code-completion --model path/to/model.gguf --device cuda --code "func add(a, b int) int {"
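
For readers writing their own example in the same style, the command lines above could be parsed with the standard flag package roughly as follows. This is a sketch, not the examples' actual source: the `options` struct, the `parseArgs` helper, the `cpu` default, and treating `--model` as required are all assumptions.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"os"
)

// options holds the command-line settings used in the commands above.
type options struct {
	Model  string
	Device string
	Code   string
}

// parseArgs parses --model, --device, and --code from args.
// Defaults and validation rules here are illustrative assumptions.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("code-completion", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // report errors through the return value instead
	fs.StringVar(&o.Model, "model", "", "path to a GGUF model file")
	fs.StringVar(&o.Device, "device", "cpu", "compute device: cpu or cuda")
	fs.StringVar(&o.Code, "code", "", "partial code to complete")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if o.Model == "" {
		return options{}, errors.New("--model is required")
	}
	if o.Device != "cpu" && o.Device != "cuda" {
		return options{}, fmt.Errorf("unsupported device %q", o.Device)
	}
	return o, nil
}

func main() {
	o, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Printf("model=%s device=%s code=%q\n", o.Model, o.Device, o.Code)
}
```

Go's flag package accepts both `-model` and `--model`, so the double-dash form used in the commands above works unchanged.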

Further Reading

See docs/getting-started.md for a full tutorial covering CLI usage, library API, and the OpenAI-compatible server.

Directories

  • Command agentic-tool-use demonstrates function calling (tool use) with a language model using the zerfoo one-line API.
  • Command api-server demonstrates starting an OpenAI-compatible inference server.
  • Command audio-transcription demonstrates speech-to-text using the Zerfoo OpenAI-compatible API server.
  • Command automl demonstrates using the AutoML coordinator to search over hyperparameter configurations with Bayesian optimization and early stopping.
  • Command chat demonstrates a simple interactive chatbot using the zerfoo one-line API.
  • Command classification demonstrates text classification using grammar-constrained decoding to guarantee a valid JSON response with a category label.
  • Command code-completion demonstrates using a language model for code completion.
  • Command distributed-training demonstrates setting up FSDP distributed training with gradient accumulation using the zerfoo distributed and training packages.
  • Command embedding demonstrates embedding Zerfoo inference inside a Go HTTP handler.
  • Command embedding-search demonstrates semantic search using model embeddings.
  • Command fine-tuning demonstrates parameter-efficient fine-tuning using LoRA (Low-Rank Adaptation) on a tabular model.
  • Command inference demonstrates loading a GGUF model and generating text.
  • Command json-output demonstrates grammar-guided decoding with a JSON schema.
  • Command langchain-chatbot demonstrates using the Zerfoo LangChain adapter as a drop-in LLM for a simple interactive chatbot loop.
  • Command rag demonstrates retrieval-augmented generation using Zerfoo.
  • Command streaming demonstrates streaming chat generation using the zerfoo API.
  • Command summarization demonstrates text summarization using a GGUF language model.
  • Command text-embedding demonstrates extracting text embedding vectors from a loaded GGUF model using the inference package.
  • Command timeseries demonstrates time-series forecasting with the N-BEATS model using the zerfoo timeseries package.
  • Command translation demonstrates text translation using a GGUF language model.
  • Command vision-analysis demonstrates multimodal inference with image input.
  • Command weaviate-search demonstrates using the Zerfoo Weaviate adapter to embed a corpus of documents and perform cosine-similarity semantic search without requiring a live Weaviate instance.
