Analyze images using a vision-capable GGUF language model.
## How it works

1. Reads an image file (JPEG or PNG) from disk
2. Loads a vision-capable GGUF model via `inference.LoadFile`
3. Sends the image as part of an `inference.Message` with the `Images` field set
4. Generates a text description or analysis using `model.Chat`

This uses the same multimodal API that powers the OpenAI-compatible `/v1/chat/completions` endpoint for vision requests.
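The steps above can be sketched in Go. The names `inference.LoadFile`, `inference.Message`, the `Images` field, and `model.Chat` come from this document, but their exact signatures are assumptions and are shown only in comments; the runnable part of the sketch validates the image bytes (JPEG or PNG magic numbers) before they would be handed to the model.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// detectImageType sniffs the file's magic bytes so unsupported files
// are rejected before a chat message is built. Only JPEG and PNG are
// accepted, matching the formats listed above.
func detectImageType(data []byte) (string, error) {
	switch {
	case len(data) >= 8 && string(data[:8]) == "\x89PNG\r\n\x1a\n":
		return "png", nil
	case len(data) >= 3 && data[0] == 0xFF && data[1] == 0xD8 && data[2] == 0xFF:
		return "jpeg", nil
	}
	return "", fmt.Errorf("unsupported image format (want JPEG or PNG)")
}

func main() {
	if len(os.Args) < 2 {
		return // no image path given; nothing to do in this sketch
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	kind, err := detectImageType(data)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("image type:", kind)

	// The validated bytes would then be attached to a message and sent
	// to a vision-capable model (hypothetical signatures, not verified):
	//
	//	model, _ := inference.LoadFile("path/to/vision-model.gguf")
	//	msg := inference.Message{
	//		Role:    "user",
	//		Content: "Describe this image.",
	//		Images:  [][]byte{data},
	//	}
	//	reply, _ := model.Chat([]inference.Message{msg})
	//	fmt.Println(reply)
}
```

Sniffing magic bytes rather than trusting the file extension means a mislabeled file fails with a clear error instead of producing garbage output from the vision encoder.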
## Prerequisites

Requires a vision-capable model (e.g., LLaVA, or Gemma 3 with a vision encoder). Text-only models will ignore the image data.
## Usage

```sh
go build -o vision-analysis ./examples/vision-analysis/

# Describe an image
./vision-analysis --model path/to/vision-model.gguf --image photo.jpg

# Ask a specific question about an image
./vision-analysis --model path/to/vision-model.gguf --image chart.png \
    --prompt "What trend does this chart show?"

# With GPU acceleration
./vision-analysis --model path/to/vision-model.gguf --device cuda --image photo.jpg
```
Command vision-analysis demonstrates multimodal inference with image input.
It loads a vision-capable GGUF model, reads an image file, and asks the model
to describe or analyze the image. This uses the same inference.Message API
that the OpenAI-compatible server uses for vision requests.
Usage:

	go build -o vision-analysis ./examples/vision-analysis/
	./vision-analysis --model path/to/vision-model.gguf --image photo.jpg
	./vision-analysis --model path/to/vision-model.gguf --image photo.jpg \
		--prompt "What objects are in this image?"