llmkit

package module
v0.2.0
Published: May 7, 2026 License: MIT Imports: 20 Imported by: 0

README

LLMKit

Go library for unified LLM API access. Write OpenAI-shaped requests, hit any provider. The per-provider config in providers/ is generated; runtime behavior (HTTP, transforms, agent loop) is hand-coded with the help of AI.

Install

go get github.com/aktagon/llmkit-go

Quick Start

resp, err := llmkit.Prompt(ctx,
    llmkit.Provider{Name: "anthropic", APIKey: os.Getenv("ANTHROPIC_API_KEY")},
    llmkit.Request{System: "You are helpful", User: "Hello"},
)
fmt.Println(resp.Text)

Providers

Provider Default Model Env Var
anthropic claude-sonnet-4-6 ANTHROPIC_API_KEY
openai gpt-4o-2024-08-06 OPENAI_API_KEY
google gemini-2.5-flash GOOGLE_API_KEY
grok grok-3-fast GROK_API_KEY
mistral mistral-large-latest MISTRAL_API_KEY
deepseek deepseek-chat DEEPSEEK_API_KEY
groq llama-3.3-70b-versatile GROQ_API_KEY
together meta-llama/Llama-3.3-70B-Instruct-Turbo TOGETHER_API_KEY
fireworks accounts/fireworks/models/llama-v3p3-70b-instruct FIREWORKS_API_KEY
perplexity sonar-pro PERPLEXITY_API_KEY
openrouter openai/gpt-4o OPENROUTER_API_KEY
qwen qwen-plus DASHSCOPE_API_KEY
zhipu glm-4-plus ZHIPU_API_KEY
moonshot moonshot-v1-8k MOONSHOT_API_KEY
doubao doubao-1.5-pro-32k-250115 ARK_API_KEY
ernie ernie-4.0-8k QIANFAN_API_KEY
ollama llama3.2 OLLAMA_API_KEY
cohere command-r-plus COHERE_API_KEY
ai21 jamba-1.5-large AI21_API_KEY
cerebras llama-3.3-70b CEREBRAS_API_KEY
sambanova Meta-Llama-3.3-70B-Instruct SAMBANOVA_API_KEY
yi yi-large YI_API_KEY
minimax MiniMax-Text-01 MINIMAX_API_KEY
lmstudio default LM_STUDIO_API_KEY
vllm default VLLM_API_KEY

25 providers, 3 API shapes. Adding an OpenAI-compatible provider requires only a Turtle file. Zero Go code.

API

Prompt

One-shot request:

resp, err := llmkit.Prompt(ctx, provider, llmkit.Request{
    System: "You are helpful",
    User:   "What is 2+2?",
}, llmkit.WithTemperature(0.7))

fmt.Println(resp.Text)               // "4"
fmt.Println(resp.Tokens.Input)       // prompt tokens
fmt.Println(resp.Tokens.Output)      // completion tokens
fmt.Println(resp.Tokens.CacheRead)   // tokens served from cache (all caching modes)
fmt.Println(resp.Tokens.CacheWrite)  // tokens written to cache (Anthropic explicit caching)
fmt.Println(resp.Tokens.Reasoning)   // internal reasoning tokens (OpenAI o1/o3/o4, Gemini 2.5+ thinking)

Capability-scoped fields (CacheRead, CacheWrite, Reasoning) are zero when the provider doesn't report them separately.
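
For example, a logging hook can branch on these fields without provider-specific checks (illustrative sketch; assumes the standard log package):

// Report cache effectiveness and reasoning usage only when the provider returns them.
if resp.Tokens.CacheRead > 0 || resp.Tokens.CacheWrite > 0 {
    log.Printf("cache: %d read, %d written", resp.Tokens.CacheRead, resp.Tokens.CacheWrite)
}
if resp.Tokens.Reasoning > 0 {
    log.Printf("reasoning tokens: %d", resp.Tokens.Reasoning)
}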

PromptStream

Streaming with callback:

resp, err := llmkit.PromptStream(ctx, provider, req, func(chunk string) {
    fmt.Print(chunk) // prints as tokens arrive
})

Structured Output

Pass a JSON schema to get typed responses:

resp, err := llmkit.Prompt(ctx, provider, llmkit.Request{
    User:   "The sky is blue",
    Schema: `{"type":"object","properties":{"color":{"type":"string"}}}`,
})
// resp.Text == `{"color":"blue"}`
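
Since resp.Text is plain JSON, decoding into a struct is a one-liner (sketch; the struct is illustrative):

var out struct {
    Color string `json:"color"`
}
if err := json.Unmarshal([]byte(resp.Text), &out); err != nil {
    log.Fatal(err)
}
fmt.Println(out.Color) // "blue"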

Agent with Tools

Multi-turn conversations with function calling:

agent := llmkit.NewAgent(provider)
agent.SetSystem("You are a calculator")
agent.AddTool(llmkit.Tool{
    Name:        "add",
    Description: "Add two numbers",
    Schema:      map[string]any{"type": "object", "properties": map[string]any{
        "a": map[string]any{"type": "number"},
        "b": map[string]any{"type": "number"},
    }},
    Run: func(args map[string]any) (string, error) {
        return fmt.Sprintf("%g", args["a"].(float64)+args["b"].(float64)), nil
    },
})

resp, err := agent.Chat(ctx, "What is 2+3?")
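
History is kept between turns, so a follow-up can refer to the previous answer; Reset wipes history and tools (sketch):

resp, err = agent.Chat(ctx, "Now add 10 to that result") // sees the earlier turn
fmt.Println(resp.Text)

agent.Reset() // clears history and tools for a fresh session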

UploadFile

Upload files to a provider:

file, err := llmkit.UploadFile(ctx, provider, "document.pdf")
resp, err := llmkit.Prompt(ctx, provider, llmkit.Request{
    User:  "Summarize this document",
    Files: []llmkit.File{file},
})

GenerateImage

Generate images from text, optionally conditioned on reference images for editing or composition. Currently supports Google's Nano Banana 2 (gemini-3.1-flash-image-preview) and Pro (gemini-3-pro-image-preview).

Text-to-image — pass Prompt for the terse hot path:

resp, err := llmkit.GenerateImage(ctx,
    llmkit.Provider{Name: providers.Google, APIKey: key},
    llmkit.ImageRequest{
        Model:  "gemini-3.1-flash-image-preview",
        Prompt: "A nano banana dish in a fancy restaurant",
    },
    llmkit.WithAspectRatio("16:9"),
    llmkit.WithImageSize("2K"),
)
os.WriteFile("out.png", resp.Images[0].Bytes, 0o644)

For editing or compositional generation, pass Parts — an ordered sequence of text and image parts. The Text(...) and Image(...) constructors build each part; the on-wire ordering matches the slice order, so the model attends to descriptions and references in the pairing you intend:

resp, err := llmkit.GenerateImage(ctx, provider,
    llmkit.ImageRequest{
        Model: "gemini-3.1-flash-image-preview",
        Parts: []llmkit.Part{
            llmkit.Text("Person:"),
            llmkit.Image("image/png", personBytes),
            llmkit.Text("Outfit:"),
            llmkit.Image("image/png", outfitBytes),
            llmkit.Text("Generate the person wearing the outfit."),
        },
    },
)

Set exactly one of Prompt or Parts — both empty or both set returns a *ValidationError.

Aspect ratios and sizes are validated against a per-model whitelist before the HTTP request — WithImageSize("512") on Pro returns *ValidationError without paying for a 4xx round-trip.
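
Handling the rejection looks like this (sketch, assuming the *ValidationError is surfaced in the error chain):

_, err := llmkit.GenerateImage(ctx, provider, req, llmkit.WithImageSize("512"))
var verr *llmkit.ValidationError
if errors.As(err, &verr) {
    fmt.Printf("invalid %s: %s\n", verr.Field, verr.Message) // caught before any HTTP call
}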

Model Aspect ratios Sizes
Nano Banana 2 (Flash) 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9, 1:4, 4:1, 1:8, 8:1 512, 1K, 2K, 4K
Nano Banana Pro 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 1K, 2K, 4K

Up to 14 reference images per request. See examples/image-gen for a text-to-image + edit pass.
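
A follow-up edit pass can feed a generated image straight back in as a reference, roughly what examples/image-gen does (the prompt text here is illustrative):

edit, err := llmkit.GenerateImage(ctx, provider,
    llmkit.ImageRequest{
        Model: "gemini-3.1-flash-image-preview",
        Parts: []llmkit.Part{
            llmkit.Image(resp.Images[0].MimeType, resp.Images[0].Bytes), // first output as reference
            llmkit.Text("Same dish, plated on black slate."),
        },
    },
)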

Options

llmkit.WithTemperature(0.7)
llmkit.WithTopP(0.9)
llmkit.WithTopK(40)
llmkit.WithMaxTokens(1000)
llmkit.WithStopSequences("END")
llmkit.WithSeed(42)
llmkit.WithFrequencyPenalty(0.5)
llmkit.WithPresencePenalty(0.5)
llmkit.WithThinkingBudget(2000)
llmkit.WithReasoningEffort("high")

Option anthropic openai google grok
temperature x x x x
top_p x x x x
top_k x x x
max_tokens x x x x
stop_sequences x x x x
seed x x x
frequency_penalty x x
presence_penalty x x
thinking_budget x x
reasoning_effort x x

Middleware

Register pre/post hooks around LLM requests, tool calls, cache creation, uploads, and batch submits. Pre-phase middleware can veto an operation by returning a non-nil error; post-phase runs for observation only.

import (
    "context"
    "fmt"

    "github.com/aktagon/llmkit-go"
    "github.com/aktagon/llmkit-go/providers"
)

// Observation: log token usage after every LLM request.
func logUsage(ctx context.Context, e providers.Event) error {
    if e.Op == providers.OpLLMRequest && e.Phase == providers.PhasePost {
        fmt.Printf("%s/%s: %d in, %d out, took %s\n",
            e.Provider, e.Model,
            e.Usage.Input, e.Usage.Output, e.Duration)
    }
    return nil
}

// Veto: abort if a daily budget is exceeded (pre-phase).
func budgetGate(limit float64, spent *float64) providers.MiddlewareFn {
    return func(ctx context.Context, e providers.Event) error {
        if e.Op == providers.OpLLMRequest && e.Phase == providers.PhasePre && *spent >= limit {
            return fmt.Errorf("daily budget $%.2f exceeded", limit)
        }
        return nil
    }
}

llmkit.Prompt(ctx, p, req,
    llmkit.WithMiddleware(budgetGate(5.00, &spent), logUsage),
)

See examples/middleware/ for a spend-cap implementation with a price table and mutex-guarded accumulation. Middlewares fire in registration order; the first pre-phase non-nil error aborts.

Streaming uses the same middleware shape: one pre-phase before the request, one post-phase after the stream closes. Event.Usage reflects the accumulated usage at stream close. Per-chunk observation stays on your StreamCallback.

CLI

# Install
go install github.com/aktagon/llmkit-go/cmd/llmkit@latest

# Usage
llmkit -provider anthropic -system "You are helpful" -user "Hello"
llmkit -provider openai -stream -system "Count to 5" -user "Go"
llmkit -provider google -system "Extract color" -user "Sky is blue" \
  -schema '{"type":"object","properties":{"color":{"type":"string"}}}'

Architecture

  • Generated (providers/*.go) — per-provider config: URLs, auth, options, JSON paths. Typed structs with no logic.
  • Hand-coded (llmkit.go, transforms.go, agent.go, http.go, batch.go, caching.go, errors.go, sigv4.go) — HTTP, request/response transforms, streaming, agent loop, batch lifecycle, caching, auth signing.

Transforms are derived from config fields, not provider names.

This repo is a read-only mirror of a private source. File issues and feature requests here; patches should be submitted against the private source via christian@aktagon.com.

License

MIT

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type APIError

type APIError struct {
	Provider   string
	StatusCode int
	Type       string
	Message    string
	Retryable  bool
	RetryAfter time.Duration
}

APIError represents a provider API error.

func (*APIError) Error

func (e *APIError) Error() string
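
Retryable and RetryAfter support a simple retry; a sketch assuming the *APIError is surfaced in the error chain:

resp, err := llmkit.Prompt(ctx, provider, req)
var apiErr *llmkit.APIError
if errors.As(err, &apiErr) && apiErr.Retryable {
    time.Sleep(apiErr.RetryAfter) // honor the provider's retry hint
    resp, err = llmkit.Prompt(ctx, provider, req)
}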

type Agent

type Agent struct {
	// contains filtered or unexported fields
}

Agent manages multi-turn conversations with optional tool calling.

func NewAgent

func NewAgent(p Provider, opts ...Option) *Agent

NewAgent creates a new agent for multi-turn conversations.

func (*Agent) AddTool

func (a *Agent) AddTool(tool Tool)

AddTool registers a tool the LLM can call.

func (*Agent) Chat

func (a *Agent) Chat(ctx context.Context, msg string) (Response, error)

Chat sends a message and returns the response, executing tool calls if needed.

func (*Agent) Reset

func (a *Agent) Reset()

Reset clears conversation history and tools.

func (*Agent) SetSystem

func (a *Agent) SetSystem(system string)

SetSystem sets the system prompt.

type BatchHandle

type BatchHandle struct {
	ID       string
	Provider Provider
}

BatchHandle represents an in-progress batch job.

func SubmitBatch

func SubmitBatch(ctx context.Context, p Provider, reqs []Request, opts ...Option) (BatchHandle, error)

SubmitBatch submits a batch of requests and returns a handle for polling.
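
A typical submit-then-wait flow (sketch; WaitBatch is documented below):

handle, err := llmkit.SubmitBatch(ctx, provider, reqs)
if err != nil {
    log.Fatal(err)
}
results, err := llmkit.WaitBatch(ctx, handle) // blocks until the batch completes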

type File

type File struct {
	ID       string
	URI      string
	MimeType string
	Name     string
}

File references an uploaded file.

func UploadFile

func UploadFile(ctx context.Context, p Provider, path string, opts ...Option) (File, error)

UploadFile uploads a file to a provider and returns a File reference.

type ImageData added in v0.2.0

type ImageData struct {
	MimeType string
	Bytes    []byte
}

ImageData is one decoded image in an ImageResponse.

type ImageOption added in v0.2.0

type ImageOption func(*imageOptions)

ImageOption configures GenerateImage.

func WithAspectRatio added in v0.2.0

func WithAspectRatio(ratio string) ImageOption

WithAspectRatio constrains the output aspect ratio (e.g., "16:9"). The value must appear in ImageGenConfig(provider).Models[].AspectRatios for the requested model, otherwise GenerateImage returns ValidationError.

func WithImageHTTPClient added in v0.2.0

func WithImageHTTPClient(c *http.Client) ImageOption

WithImageHTTPClient overrides the http.Client used for the GenerateImage call.

func WithImageMiddleware added in v0.2.0

func WithImageMiddleware(fns ...providers.MiddlewareFn) ImageOption

WithImageMiddleware registers pre/post hooks that fire around the image generation request. Op is providers.OpImageGeneration. Pre-phase can veto.

func WithImageSize added in v0.2.0

func WithImageSize(size string) ImageOption

WithImageSize sets the output resolution (e.g., "1K", "2K", "4K", "512"). Same per-model whitelist enforcement as WithAspectRatio.

func WithIncludeText added in v0.2.0

func WithIncludeText() ImageOption

WithIncludeText asks the model to also emit text parts (captions, refusals) alongside images. Defaults to off — most callers want pure image output.

type ImageRequest added in v0.2.0

type ImageRequest struct {
	Model  string
	Prompt string
	Parts  []Part
}

ImageRequest is the canonical image-generation request.

Model is required: image-generation models are explicit choices and the text-generation default (e.g., gemini-2.5-flash) does not generate images.

Input is provided in one of two mutually-exclusive forms:

  • Prompt: terse sugar for the text-only hot path. Internally desugars to Parts: []Part{Text(Prompt)} before serialisation.
  • Parts: canonical multimodal input. A positionally-ordered sequence of text and image parts; required for editing and compositional generation where caller-controlled ordering matters.

Pre-flight validation requires exactly one of Prompt or Parts to be non-empty (XOR). Image-typed parts respect ImageGenConfig.MaxInputCount.

type ImageResponse added in v0.2.0

type ImageResponse struct {
	Images []ImageData
	Text   string
	Tokens Usage
}

ImageResponse is the canonical image-generation response.

func GenerateImage added in v0.2.0

func GenerateImage(ctx context.Context, p Provider, req ImageRequest, opts ...ImageOption) (ImageResponse, error)

GenerateImage produces one or more images from a text prompt, optionally conditioned on reference images for editing or composition. Input is either Prompt (sugar for the text-only case) or Parts (canonical multimodal sequence) — exactly one must be set. Pre-flight validation rejects unsupported aspect ratios, sizes, and image-part counts before any HTTP call.

type InputImage added in v0.2.0

type InputImage struct {
	URL      string // URL or base64 data URI
	MimeType string
	Detail   string // "auto", "low", "high" (provider-specific)
}

InputImage references an image attached to a text-generation request (vision input). Distinct from Part's Image() constructor used for image-generation calls; the two concepts target different capabilities and the migration to a unified Part-based vocabulary for text generation is tracked separately (see ADR-008 OQ-2).

type MediaRef added in v0.2.0

type MediaRef struct {
	MimeType string
	Bytes    []byte
}

MediaRef is an inline media payload (mime type + raw bytes). Reused by every Part variant that carries non-text content.

type Message

type Message struct {
	Role    string // "user" or "assistant"
	Content string
}

Message represents a single conversation turn.

type MiddlewareVetoError

type MiddlewareVetoError struct {
	Cause error
}

MiddlewareVetoError wraps a pre-phase veto. Callers can errors.As against this type to discriminate a veto from a transport or provider error.

func (*MiddlewareVetoError) Error

func (e *MiddlewareVetoError) Error() string

func (*MiddlewareVetoError) Unwrap

func (e *MiddlewareVetoError) Unwrap() error
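
A sketch of discriminating a veto, reusing the budgetGate middleware from the README:

_, err := llmkit.Prompt(ctx, p, req, llmkit.WithMiddleware(budgetGate(5.00, &spent)))
var veto *llmkit.MiddlewareVetoError
if errors.As(err, &veto) {
    fmt.Println("blocked by middleware:", veto.Unwrap())
}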

type Option

type Option func(*options)

Option configures a Prompt or Agent call.

func CacheTTL

func CacheTTL(d time.Duration) Option

CacheTTL sets the cache time-to-live. Used by resource caching (Google). Ignored by providers with automatic or explicit caching.

func WithCaching

func WithCaching() Option

WithCaching enables prompt caching for providers that support it. Behavior depends on the provider's caching mode (automatic, explicit, or resource).
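
Combined with CacheTTL for resource caching (sketch):

resp, err := llmkit.Prompt(ctx, provider, req,
    llmkit.WithCaching(),
    llmkit.CacheTTL(10*time.Minute), // only meaningful for resource caching (Google)
)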

func WithFrequencyPenalty

func WithFrequencyPenalty(v float64) Option

WithFrequencyPenalty sets the repetition penalty (-2.0 to 2.0).

func WithHTTPClient

func WithHTTPClient(c *http.Client) Option

WithHTTPClient sets a custom HTTP client.

func WithMaxTokens

func WithMaxTokens(n int) Option

WithMaxTokens sets the maximum output length.

func WithMaxToolIterations

func WithMaxToolIterations(n int) Option

WithMaxToolIterations sets the maximum tool call loop iterations for Agent.

func WithMiddleware

func WithMiddleware(fns ...providers.MiddlewareFn) Option

WithMiddleware registers pre/post hooks that fire around LLM requests, tool calls, cache creation, uploads, and batch submits. Pre-phase middleware can veto an operation by returning a non-nil error. Post-phase return values are ignored (observation only). Middlewares fire in registration order.

func WithPresencePenalty

func WithPresencePenalty(v float64) Option

WithPresencePenalty sets the diversity encouragement (-2.0 to 2.0).

func WithReasoningEffort

func WithReasoningEffort(v string) Option

WithReasoningEffort sets reasoning intensity ("low", "medium", "high").

func WithSeed

func WithSeed(n int64) Option

WithSeed sets the seed for deterministic generation.

func WithStopSequences

func WithStopSequences(seqs ...string) Option

WithStopSequences sets generation halt strings.

func WithTemperature

func WithTemperature(v float64) Option

WithTemperature sets the sampling temperature (0.0-2.0).

func WithThinkingBudget

func WithThinkingBudget(n int) Option

WithThinkingBudget sets the extended thinking token budget.

func WithTopK

func WithTopK(n int) Option

WithTopK sets top-K token limiting.

func WithTopP

func WithTopP(v float64) Option

WithTopP sets nucleus sampling probability (0.0-1.0).

type Part added in v0.2.0

type Part struct {
	Text  string
	Image *MediaRef
}

Part is the universal multimodal input atom. Exactly one of Text or Image is set; both empty or both set is invalid (rejected by pre-flight validation). Construct via the package-level Text() and Image() helpers.

func Image

func Image(mime string, b []byte) Part

Image constructs an image-bearing Part. mime is the IANA media type (e.g., "image/png"); b is the raw bytes (not base64-encoded).

func Text added in v0.2.0

func Text(s string) Part

Text constructs a text-bearing Part.

type Provider

type Provider struct {
	Name    string // "anthropic", "openai", "google", "grok"
	APIKey  string
	Model   string // optional, uses default if empty
	BaseURL string // optional, overrides default API endpoint
}

Provider identifies an LLM provider with its API key and optional overrides.
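
Example with an explicit model override (the model name is illustrative; omit Model to use the default from the provider table):

p := llmkit.Provider{
    Name:   "openai",
    APIKey: os.Getenv("OPENAI_API_KEY"),
    Model:  "gpt-4o-mini", // optional override
}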

type Request

type Request struct {
	System   string       // system prompt
	User     string       // user message (for single-turn)
	Messages []Message    // conversation history (for multi-turn)
	Schema   string       // JSON schema for structured output (optional)
	Files    []File       // file attachments (optional)
	Images   []InputImage // image inputs (optional)
}

Request is the canonical request format (OpenAI-compatible shape).
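
A multi-turn request without an Agent (sketch):

req := llmkit.Request{
    System: "You are helpful",
    Messages: []llmkit.Message{
        {Role: "user", Content: "My name is Ada."},
        {Role: "assistant", Content: "Nice to meet you, Ada."},
        {Role: "user", Content: "What is my name?"},
    },
}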

type Response

type Response struct {
	Text   string
	Tokens Usage
}

Response is the canonical response format.

func Prompt

func Prompt(ctx context.Context, p Provider, req Request, opts ...Option) (Response, error)

Prompt sends a one-shot request to an LLM provider.

func PromptBatch

func PromptBatch(ctx context.Context, p Provider, reqs []Request, opts ...Option) ([]Response, error)

PromptBatch sends multiple requests as a batch and blocks until all results are ready. Uses the provider's batch config from the ontology to determine input mode and lifecycle.

func PromptStream

func PromptStream(ctx context.Context, p Provider, req Request, callback StreamCallback, opts ...Option) (Response, error)

PromptStream sends a streaming request, calling back with each text chunk. Returns the final response with accumulated text and usage. Middleware fires exactly once pre-phase and once post-phase, bracketing the whole stream. Usage in post-phase is the accumulated total at stream close.

func WaitBatch

func WaitBatch(ctx context.Context, handle BatchHandle, opts ...Option) ([]Response, error)

WaitBatch polls until the batch is complete and returns results.

type StreamCallback

type StreamCallback func(chunk string)

StreamCallback is called with each text chunk during streaming.

type Tool

type Tool struct {
	Name        string
	Description string
	Schema      map[string]any
	Run         func(map[string]any) (string, error)
}

Tool defines a callable function for the agent.

type Usage

type Usage = providers.Usage

Usage holds token consumption metrics. Aliased to providers.Usage so middleware events and the public API share one type without conversion.

type ValidationError

type ValidationError struct {
	Field   string
	Message string
}

ValidationError represents a request validation error.

func (*ValidationError) Error

func (e *ValidationError) Error() string

Directories

Path Synopsis
cmd
llmkit command
examples
image-gen command
Example: text-to-image generation against Google's Nano Banana 2 (Gemini 3.1 Flash Image), with a follow-up edit pass that uses the first output as a reference image.
middleware command
Example: spend-cap middleware.
