goagent

package module v0.5.6
Published: Apr 14, 2026 License: Apache-2.0 Imports: 13 Imported by: 3

README

goagent

⚠️ Work in Progress — API may change without notice. Not production-ready yet.

A minimal, Go-idiomatic framework for building AI agents with a ReAct loop and pluggable model providers.


Install

go get github.com/Germanblandin1/goagent

Sub-modules

Each sub-module is versioned and installed independently.

Module Description
providers/anthropic Anthropic Messages API provider (Claude)
providers/ollama Local Ollama provider + embedder
providers/voyage Voyage AI embedder
mcp MCP client + server integration
rag RAG pipeline — chunking, embedding, retrieval
otel OpenTelemetry spans and metrics
ratelimit Token-bucket rate limiters for tool dispatch
memory/vector/pgvector Persistent VectorStore — PostgreSQL + pgvector
memory/vector/qdrant Persistent VectorStore — Qdrant
memory/vector/sqlitevec Persistent VectorStore — SQLite + sqlite-vec (CGO)
memory/vector/tiktoken Exact token-count SizeEstimator via tiktoken

Quickstart

agent, err := goagent.New(
    goagent.WithProvider(ollama.New()),
    goagent.WithModel("qwen3"),
)
if err != nil {
    log.Fatal(err)
}

answer, err := agent.Run(context.Background(), "What is the capital of France?")

Package layout

goagent/              Core — Agent, ReAct loop, interfaces
├── mcp/              MCP client + server (stdio and SSE transports)
├── memory/           Short-term and long-term memory
│   ├── storage/      InMemory storage
│   ├── policy/       FixedWindow, TokenWindow, NoOp
│   └── vector/       VectorStore, chunkers, similarity, size estimators
│       ├── pgvector/ Persistent VectorStore — PostgreSQL + pgvector (HNSW)
│       └── sqlitevec/ Persistent VectorStore — SQLite + sqlite-vec (CGO)
├── providers/
│   ├── anthropic/    Anthropic Messages API (Claude)
│   ├── ollama/       Local Ollama via OpenAI-compatible API (+ embedder)
│   └── voyage/       Voyage AI embedder
├── rag/              RAG pipeline — Pipeline, NewTool, observers, formatters
├── examples/
│   ├── calculator/           Tool use with arithmetic
│   ├── chatbot/              Multi-turn conversation
│   ├── chatbot-persistent/   Persistent memory across sessions
│   ├── chatbot-mcp-fs/       Filesystem access via MCP stdio
│   └── rag_batch_index/      Interactive RAG chatbot — BatchEmbedder + Qdrant
└── internal/testutil/        Shared mocks

Core concepts

ReAct loop

Agent.Run alternates between calling the model and executing tool calls until the model produces a final answer, the context is cancelled, or the iteration budget is exhausted. All tool calls within a single turn are dispatched concurrently.

                    ┌──────────────────────┐
                    │        prompt        │
                    └──────────┬───────────┘
                               │
                    ┌──────────▼───────────┐
             ┌─────▶│        model         │◀── OnIterationStart
             │      └──────────┬───────────┘
             │                 │
             │      ┌──────────▼────────────────────────┐
             │      │    response has tool calls?       │
             │      └──────────┬────────────┬───────────┘
             │                yes           no
             │      ┌──────────▼──────┐  ┌──▼──────────────┐
             │      │ dispatch tools  │  │     answer      │
             │      │  (concurrent)   │  │    (return)     │
             │      │  OnToolCall     │  └─────────────────┘
             │      │  OnToolResult   │       OnResponse
             │      └──────────┬──────┘
             │                 │
             └─────────────────┘  (next iteration)

Tools

Implement the Tool interface or use the ToolFunc helper. Use SchemaFrom to derive the JSON Schema from a struct instead of building the map by hand:

echo := goagent.ToolFunc("echo", "Returns the input text.",
    goagent.SchemaFrom(struct {
        Text string `json:"text" jsonschema_description:"Text to echo back."`
    }{}),
    func(_ context.Context, args map[string]any) (string, error) {
        return args["text"].(string), nil
    },
)

agent, _ := goagent.New(
    goagent.WithProvider(ollama.New()),
    goagent.WithModel("qwen3"),
    goagent.WithTool(echo),
)

SchemaFrom supports four struct tags:

Tag Effect
json:"name" Property name; "-" skips the field
json:"name,omitempty" Optional field (omitted from "required")
jsonschema_description:"text" Adds "description" to the property
jsonschema_enum:"a,b,c" Adds "enum" with the comma-separated values

MCP (Model Context Protocol)

Connect external tool servers over stdio or SSE. The agent discovers all tools automatically at startup.

// Spawn a local MCP server as a subprocess
agent, err := goagent.New(
    goagent.WithProvider(ollama.New()),
    goagent.WithModel("qwen3"),
    mcp.WithStdio("./my-mcp-server", "--flag"),
)
if err != nil {
    log.Fatal(err) // connection or discovery error
}
defer agent.Close()

// Connect to a running HTTP+SSE server
agent, err := goagent.New(
    goagent.WithProvider(anthropic.New()),
    mcp.WithSSE("http://localhost:8080/sse"),
)

Multiple servers can be attached in a single New call. Build your own MCP server with mcp.NewServer:

s := mcp.NewServer("my-tools", "1.0.0")
s.MustAddTool("echo", "Returns the input unchanged",
    goagent.SchemaFrom(struct {
        Text string `json:"text" jsonschema_description:"Text to echo back."`
    }{}),
    func(_ context.Context, args map[string]any) (string, error) {
        return args["text"].(string), nil
    },
)
log.Fatal(s.ServeStdio())

Memory

mem := memory.NewShortTerm(
    memory.WithStorage(storage.NewInMemory()),
    memory.WithPolicy(policy.NewFixedWindow(20)),
)

agent, _ := goagent.New(
    goagent.WithProvider(ollama.New()),
    goagent.WithShortTermMemory(mem),
)

Available policies: NewNoOp(), NewFixedWindow(n) (last n groups), NewTokenWindow(maxTokens) (most recent groups within a token budget). Both window policies treat an assistant+tool_use message and all its tool_result replies as a single atomic group, so the tool-call invariant is preserved at every window boundary regardless of where the cut falls.
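Switching policies is a one-line change. As a sketch reusing the constructors above (the 4000-token budget is illustrative):

```go
// Keep only the most recent message groups that fit within ~4000 tokens.
// Groups are atomic: an assistant tool_use message and its tool_result
// replies are kept or dropped together, never split.
mem := memory.NewShortTerm(
    memory.WithStorage(storage.NewInMemory()),
    memory.WithPolicy(policy.NewTokenWindow(4000)),
)
```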

Long-term memory enables semantic retrieval across sessions. It requires a VectorStore and an Embedder; both are provided by the framework:

store := vector.NewInMemoryStore()

embedder := ollama.NewEmbedder(
    ollama.WithEmbedModel("nomic-embed-text"),
)
// or: voyage.NewEmbedder(voyage.WithEmbedModel("voyage-3"))

ltm, err := memory.NewLongTerm(
    memory.WithVectorStore(store),
    memory.WithEmbedder(embedder),
)

agent, _ := goagent.New(
    goagent.WithName("my-agent"),         // session namespace
    goagent.WithProvider(ollama.New()),
    goagent.WithShortTermMemory(mem),
    goagent.WithLongTermMemory(ltm),
)

For long documents, plug in a chunker before embedding:

ltm, err := memory.NewLongTerm(
    memory.WithVectorStore(store),
    memory.WithEmbedder(embedder),
    memory.WithChunker(vector.NewTextChunker(
        vector.WithMaxSize(500),
        vector.WithOverlap(50),
    )),
)

Available chunkers:

Chunker When to use
NewNoOpChunker() Conversational messages (default)
NewTextChunker(...) Long text blocks; word-boundary splits with configurable max size and overlap
NewSentenceChunker(...) Narrative or prose text; overlap counted in complete sentences
NewRecursiveChunker(...) Markdown and structured docs; respects \n\n\n → sentence → word hierarchy
NewBlockChunker(...) Mixed multimodal content; text chunked, images pass through, PDFs split by page
NewPageChunker(...) PDF-only per-page chunking

Persistent vector stores

vector.NewInMemoryStore() is the default and is suitable for development and single-process use. For production deployments that require durability or shared access across processes, two persistent backends are available as separate sub-modules.

PostgreSQL + pgvector

go get github.com/Germanblandin1/goagent/memory/vector/pgvector

import (
    "database/sql"
    _ "github.com/jackc/pgx/v5/stdlib"
    "github.com/Germanblandin1/goagent/memory/vector/pgvector"
)

db, _ := sql.Open("pgx", "postgres://user:pass@localhost/mydb")

// Optional: create the extension, table, and HNSW index automatically.
pgvector.Migrate(ctx, db, pgvector.MigrateConfig{
    TableName: "goagent_embeddings",
    Dims:      1024, // must match your embedding model
})

store, err := pgvector.New(db, pgvector.TableConfig{
    Table:          "goagent_embeddings",
    IDColumn:       "id",
    VectorColumn:   "embedding",
    TextColumn:     "content",
    MetadataColumn: "metadata", // optional JSONB column
})

TableConfig describes the caller's existing table — the package imposes no schema. Metadata filtering is supported natively via goagent.WithFilter, which translates to a JSONB containment query (metadata @> filter::jsonb) applied server-side before similarity ranking. For large tables, add a GIN index: CREATE INDEX ON goagent_embeddings USING gin(metadata jsonb_path_ops).
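A sketch of a filtered query (the exact Search signature is an assumption here — query vector, top-k, then options):

```go
// Only rows whose metadata JSONB contains {"env": "prod"} are considered;
// the containment check runs server-side before similarity ranking.
results, err := store.Search(ctx, queryVec, 5,
    goagent.WithFilter(map[string]any{"env": "prod"}),
)
```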

Three distance functions are available via WithDistanceFunc:

Constant Operator Score Recommended when
pgvector.Cosine (default) <=> 1 − distance ∈ [0, 1] Most text embedding models
pgvector.L2 <-> 1 / (1 + distance) ∈ (0, 1] Non-normalised vectors
pgvector.InnerProduct <#> −distance Normalised vectors, speed-sensitive

The DistanceFunc passed to New and the HNSW operator class used by Migrate must match — a mismatch causes pgvector to fall back to a sequential scan.

pgvector.New accepts a Querier — the minimal interface satisfied by both *sql.DB and *sql.Tx — so queries can run inside a caller-managed transaction.
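A sketch of transactional use, assuming cfg holds the TableConfig shown above:

```go
// Vector writes commit atomically with other application writes.
tx, err := db.BeginTx(ctx, nil)
if err != nil {
    return err
}
defer tx.Rollback() // no-op after a successful Commit

txStore, err := pgvector.New(tx, cfg) // *sql.Tx satisfies Querier
if err != nil {
    return err
}
// ... txStore upserts run inside the same transaction ...
return tx.Commit()
```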

Qdrant

go get github.com/Germanblandin1/goagent/memory/vector/qdrant

import (
    "github.com/qdrant/go-client/qdrant"
    goagent_qdrant "github.com/Germanblandin1/goagent/memory/vector/qdrant"
)

client, _ := qdrant.NewClient(&qdrant.Config{Host: "localhost", Port: 6334})

// Optional: create the collection automatically.
goagent_qdrant.CreateCollection(ctx, client, goagent_qdrant.CollectionConfig{
    CollectionName: "goagent_embeddings",
    Dims:           1536,
})

store, err := goagent_qdrant.New(client, goagent_qdrant.Config{
    CollectionName: "goagent_embeddings",
})

Metadata filtering via goagent.WithFilter is translated to Qdrant Must conditions and evaluated server-side — Qdrant never scores points that do not pass the filter. goagent.WithScoreThreshold maps to Qdrant's native score_threshold field, also server-side.
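A sketch combining both server-side options (Search signature assumed, as with the other stores; the field values are illustrative):

```go
// Qdrant applies the Must filter and the score threshold before returning
// points, so low-relevance matches never reach the client.
results, err := store.Search(ctx, queryVec, 5,
    goagent.WithFilter(map[string]any{"tenant": "acme"}),
    goagent.WithScoreThreshold(0.6),
)
```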

SQLite + sqlite-vec

go get github.com/Germanblandin1/goagent/memory/vector/sqlitevec

Build requirement: this package requires CGO. Run once from the repository root:

go env -w CGO_CFLAGS="-I$(pwd)/memory/vector/sqlitevec/csrc -I$(go env GOMODCACHE)/github.com/mattn/go-sqlite3@v1.14.40"

import "github.com/Germanblandin1/goagent/memory/vector/sqlitevec"

// Open registers the sqlite-vec extension and opens the database.
db, _ := sqlitevec.Open("/path/to/mydb.sqlite")

// Optional: create the data table and the vec0 virtual table.
sqlitevec.Migrate(ctx, db, sqlitevec.MigrateConfig{
    TableName: "goagent_embeddings",
    Dims:      768, // must match your embedding model
})

store, err := sqlitevec.New(db, sqlitevec.TableConfig{
    Table:          "goagent_embeddings",
    IDColumn:       "id",
    VectorColumn:   "embedding",
    TextColumn:     "content",
    MetadataColumn: "metadata", // optional TEXT/JSON column
})

The schema uses two tables: a regular data table (Table) and a vec0 virtual table (Table+"_vec") that provides indexed KNN search. Migrate creates both.

Two distance metrics are available via WithDistanceMetric:

Constant Algorithm Score Notes
sqlitevec.L2 (default) Euclidean KNN via MATCH 1 / (1 + distance) ∈ (0, 1] Index-accelerated
sqlitevec.Cosine vec_distance_cosine full scan 1 − distance ∈ [0, 1] Full scan — use for small datasets or normalise vectors and use L2

When not using Open, call sqlitevec.Register() before sql.Open to load the extension.
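A sketch of that manual path. The driver name passed to sql.Open is an assumption based on the go-sqlite3 dependency — check the package docs for the name Register actually installs:

```go
// Load the sqlite-vec extension into the driver before opening the DB.
sqlitevec.Register()
db, err := sql.Open("sqlite3", "/path/to/mydb.sqlite") // driver name assumed
if err != nil {
    log.Fatal(err)
}
```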

Optional store capabilities

All three persistent backends (and InMemoryStore) implement optional interfaces beyond VectorStore. Check for support with a type assertion — stores that do not implement the interface simply fail the assertion cleanly.

CountableStore — count entries without a vector query:

if cs, ok := store.(goagent.CountableStore); ok {
    // Total entries in the store.
    n, err := cs.Count(ctx)

    // Entries whose metadata matches the filter.
    n, err = cs.Count(ctx, goagent.WithFilter(map[string]any{"env": "prod"}))
}

WithFilter applies the same metadata matching as Search. WithScoreThreshold is silently ignored — there is no query vector to score against.

Useful for health checks, monitoring store growth, or debugging index state without querying the underlying database directly.

VectorStoreObserver — observe every store operation with elapsed duration and error:

observed := goagent.NewObservableStore(store,
    goagent.VectorStoreObserver{
        AfterUpsert: func(ctx context.Context, d time.Duration, err error) { /* ... */ },
        AfterSearch: func(ctx context.Context, results int, d time.Duration, err error) { /* ... */ },
        AfterDelete: func(ctx context.Context, d time.Duration, err error) { /* ... */ },
        AfterCount:  func(ctx context.Context, n int, d time.Duration, err error) { /* ... */ },
    },
)

NewObservableStore wraps any VectorStore (including BulkVectorStore) and forwards all calls transparently. Compose multiple observer sets with MergeVectorStoreObservers. For OpenTelemetry instrumentation, use otel.NewVectorStoreObserver (see OpenTelemetry section).

Token budget on retrieval

Cap the token cost of long-term memory retrieval with WithTokenBudget:

results, err := ltm.Retrieve(ctx, query,
    goagent.WithTokenBudget(2000, func(ctx context.Context, text string) int {
        return len(strings.Fields(text)) // rough word-count estimate
    }),
)

WithTokenBudget walks the results (score-descending) and stops before the first result that would push the running total over the budget. The estimator is a plain func(ctx, text) int — plug in a tiktoken counter, a word counter, or a character counter.

RAG (Retrieval-Augmented Generation)

The rag sub-module provides a standalone pipeline for indexing documents and exposing them as an agent tool. It is decoupled from long-term memory — use it when you want to index a corpus offline and let the agent search it on demand.

go get github.com/Germanblandin1/goagent/rag

import "github.com/Germanblandin1/goagent/rag"

// 1. Build the pipeline
pipeline, err := rag.NewPipeline(
    vector.NewRecursiveChunker(vector.WithRCMaxSize(400), vector.WithRCOverlap(40)),
    ollama.NewEmbedder(ollama.WithEmbedModel("nomic-embed-text")),
    vector.NewInMemoryStore(),
)

// 2. Index documents
docs := []rag.Document{
    {Source: "readme.md", Content: []goagent.ContentBlock{goagent.TextBlock(text)}},
}
if err := pipeline.Index(ctx, docs...); err != nil { log.Fatal(err) }

// 3. Wrap as a tool and give it to the agent
searchTool := rag.NewTool(pipeline,
    rag.WithToolName("search_docs"),
    rag.WithToolDescription("Search the project documentation."),
    rag.WithTopK(3),
)

agent, _ := goagent.New(
    goagent.WithProvider(ollama.New()),
    goagent.WithModel("llama3.2"),
    goagent.WithTool(searchTool),
)

Observe indexing and retrieval with the built-in callbacks:

pipeline, _ := rag.NewPipeline(chunker, embedder, store,
    rag.WithIndexObserver(func(ctx context.Context, source string,
        chunked, embedded, skipped int, dur time.Duration, err error) {
        slog.Info("indexed", "source", source, "chunks", embedded, "skipped", skipped)
    }),
    rag.WithSearchObserver(func(ctx context.Context, query string,
        results []rag.SearchResult, dur time.Duration, err error) {
        if len(results) > 0 && results[0].Score < 0.5 {
            slog.Warn("low quality retrieval", "query", query, "score", results[0].Score)
        }
    }),
)

Both observers receive the caller's ctx, so active OTel spans are accessible via trace.SpanFromContext(ctx) without the rag package importing the OTel SDK.

BatchEmbedder fast path

When the configured embedder implements goagent.BatchEmbedder, Pipeline.Index embeds all chunks of a document in a single BatchEmbed call instead of K serial round-trips. OllamaEmbedder implements this interface out of the box:

embedder := ollama.NewEmbedder(ollama.WithEmbedModel("nomic-embed-text"))
// OllamaEmbedder implements goagent.BatchEmbedder — Pipeline picks the fast path automatically.

pipeline, _ := rag.NewPipeline(
    vector.NewRecursiveChunker(vector.WithRCMaxSize(400)),
    embedder,
    store,
)

The same optimization applies to LongTermMemory.Store() — when BatchEmbedder is available, all chunks from a turn are embedded in one call.
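The fast path is selected via a type assertion, which a caller can mirror at startup to confirm their embedder qualifies (a sketch):

```go
// A plain Embedder gets one Embed call per chunk; a BatchEmbedder gets a
// single BatchEmbed call per document.
var e any = embedder
if _, ok := e.(goagent.BatchEmbedder); ok {
    log.Println("batch embedding path enabled")
}
```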

Extended thinking & effort

// Extended thinking — fixed budget
goagent.WithThinking(10_000)

// Extended thinking — adaptive (model decides)
goagent.WithAdaptiveThinking()

// Effort level
goagent.WithEffort("medium")  // "high" | "medium" | "low" | "" (model default)

Ollama captures reasoning from the reasoning field or <think>…</think> tags automatically.

Dispatch resilience

Per-tool timeouts and circuit breaking are configured at agent construction time and apply to every tool call in every Run.

agent, _ := goagent.New(
    goagent.WithProvider(provider),
    goagent.WithTool(myTool),

    // Cancel a tool's context if it takes longer than 5 s.
    goagent.WithToolTimeout(5*time.Second),

    // Open the circuit after 3 consecutive failures; reset after 30 s.
    goagent.WithCircuitBreaker(3, 30*time.Second),
)

Circuit-breaker state persists across Run calls on the same agent. Use OnCircuitOpen to observe rejections:

goagent.WithHooks(goagent.Hooks{
    OnCircuitOpen: func(ctx context.Context, toolName string, openUntil time.Time) {
        log.Printf("tool %s disabled until %s", toolName, openUntil.Format(time.RFC3339))
    },
})

For custom cross-cutting logic (metrics, retry, rate-limiting), implement DispatchMiddleware:

func metricsMiddleware(next goagent.DispatchFunc) goagent.DispatchFunc {
    return func(ctx context.Context, name string, args map[string]any) ([]goagent.ContentBlock, error) {
        start := time.Now()
        result, err := next(ctx, name, args)
        recordMetric(name, time.Since(start), err)
        return result, err
    }
}

agent, _ := goagent.New(
    goagent.WithProvider(provider),
    goagent.WithTool(myTool),
    goagent.WithDispatchMiddleware(metricsMiddleware),
)

The full chain order (outermost first): logging → timeout → circuit breaker → custom → Execute.

Observability hooks

All callbacks receive ctx context.Context as the first argument. OnRunStart returns a context.Context — use it to embed values (e.g. a trace span) that will be forwarded to every subsequent hook in the same run.

goagent.WithHooks(goagent.Hooks{
    OnRunStart:          func(ctx context.Context) context.Context                                              { return ctx },
    OnRunEnd:            func(ctx context.Context, result goagent.RunResult)                                    { /* ... */ },
    OnProviderRequest:   func(ctx context.Context, iteration int, model string, messageCount int)               { /* ... */ },
    OnProviderResponse:  func(ctx context.Context, iteration int, event goagent.ProviderEvent)                  { /* ... */ },
    OnIterationStart:    func(ctx context.Context, i int)                                                       { /* ... */ },
    OnThinking:          func(ctx context.Context, text string)                                                 { /* ... */ },
    OnToolCall:          func(ctx context.Context, name string, args map[string]any)                            { /* ... */ },
    OnToolResult:        func(ctx context.Context, name string, _ []goagent.ContentBlock, d time.Duration, err error) { /* ... */ },
    OnCircuitOpen:       func(ctx context.Context, toolName string, openUntil time.Time)                        { /* ... */ },
    OnResponse:          func(ctx context.Context, text string, iterations int)                                 { /* ... */ },
    OnShortTermLoad:     func(ctx context.Context, results int, d time.Duration, err error)                     { /* ... */ },
    OnShortTermAppend:   func(ctx context.Context, msgs int, d time.Duration, err error)                        { /* ... */ },
    OnLongTermRetrieve:  func(ctx context.Context, results []goagent.ScoredMessage, d time.Duration, err error) { /* ... */ },
    OnLongTermStore:     func(ctx context.Context, msgs int, d time.Duration, err error)                        { /* ... */ },
})

Multiple independent hook sets can be composed with MergeHooks. Each set's OnRunStart return value is chained — the enriched context from one hook is passed to the next, so span hierarchies nest correctly:

goagent.WithHooks(goagent.MergeHooks(metricsHooks, loggingHooks))

OpenTelemetry

The otel sub-module translates hook events into OpenTelemetry spans and RED metrics (Rate, Errors, Duration) with no manual instrumentation required.

go get github.com/Germanblandin1/goagent/otel

import (
    "go.opentelemetry.io/otel/metric"
    "go.opentelemetry.io/otel/trace"
    agentotel "github.com/Germanblandin1/goagent/otel"
)

hooks, err := agentotel.NewHooks(tracer, meter)
if err != nil {
    log.Fatal(err)
}

agent, err := goagent.New(
    goagent.WithProvider(provider),
    goagent.WithModel("claude-sonnet-4-6"),
    goagent.WithHooks(hooks),
)

If the caller context already carries an active span (e.g. from an HTTP handler), the agent's spans are automatically nested under it:

ctx, span := tracer.Start(r.Context(), "handle_request")
defer span.End()
result, err := agent.Run(ctx, prompt)

Span hierarchy emitted per Run call:

goagent.run
  ├── goagent.provider.complete   (one per LLM call)
  ├── goagent.tool.<name>         (one per tool execution)
  ├── goagent.memory.short_term.load
  ├── goagent.memory.short_term.append
  ├── goagent.memory.long_term.retrieve
  └── goagent.memory.long_term.store

RED metrics recorded:

Metric Instrument Unit Useful for
goagent.run.duration Histogram s p50/p99 latency per run
goagent.run.errors Counter {error} Run error rate
goagent.provider.duration Histogram s LLM call latency per iteration
goagent.provider.tokens.input Counter {token} Input token spend
goagent.provider.tokens.output Counter {token} Output token spend
goagent.tool.duration Histogram s Tool latency by tool.name
goagent.tool.errors Counter {error} Tool error rate by tool.name
goagent.memory.load.duration Histogram s Memory read latency
goagent.memory.append.duration Histogram s Memory write latency

tool.duration and tool.errors carry the tool.name attribute, so you can break down latency and error rates per tool in Grafana or any OTel-compatible backend.

To instrument a VectorStore independently of the agent, use otel.NewVectorStoreObserver:

observer, err := agentotel.NewVectorStoreObserver(tracer, meter)
if err != nil {
    log.Fatal(err)
}

store = goagent.NewObservableStore(store, observer)

This records spans and RED metrics for every Upsert, Search, Delete, Count, BulkUpsert, and BulkDelete call. Additional metrics recorded:

Metric Instrument Unit Useful for
goagent.vector.upsert.duration Histogram s Write latency per backend
goagent.vector.search.duration Histogram s Query latency per backend
goagent.vector.search.results Histogram {result} Result set size distribution
goagent.vector.delete.duration Histogram s Delete latency
goagent.vector.bulk_upsert.duration Histogram s Bulk write latency
goagent.vector.bulk_upsert.batch_size Histogram {entry} Entries per bulk call
goagent.vector.errors Counter {error} Error rate by operation

Multimodal input

answer, err := agent.RunBlocks(ctx,
    goagent.TextBlock("Describe this image"),
    goagent.ImageBlock(imgData, "image/png"),
)

Configuration reference

Option Default Description
WithProvider(p) Required. Model backend
WithModel(m) Required. Model identifier
WithTool(t) Register a tool (repeatable)
WithSystemPrompt(s) System instruction for every run
WithMaxIterations(n) 10 Max ReAct iterations
WithThinking(budget) Extended thinking, fixed token budget
WithAdaptiveThinking() Extended thinking, model-chosen budget
WithEffort(level) "" "high", "medium", "low", or ""
WithToolTimeout(d) 0 (off) Per-tool deadline; cancels the tool's context after d
WithCircuitBreaker(n, d) Open circuit after n consecutive failures; reset after d
WithDispatchMiddleware(mw) Append a custom DispatchMiddleware to the chain (repeatable)
WithHooks(h) Observability callbacks
WithRunResult(dst) Write RunResult metrics to *dst after each Run (synchronous alternative to OnRunEnd)
WithName(name) Agent identity / session namespace for long-term memory
WithShortTermMemory(m) Conversation history within a session
WithLongTermMemory(m) Semantic retrieval across sessions
WithWritePolicy(p) StoreAlways What to persist to long-term memory
WithLongTermTopK(k) 3 Messages to retrieve from long-term memory
WithShortTermTraceTools(b) true Include tool traces in short-term history
WithLogger(l) slog.Default() Structured logger

Error handling

var maxErr *goagent.MaxIterationsError
var provErr *goagent.ProviderError

switch {
case errors.As(err, &maxErr):
    fmt.Printf("gave up after %d iterations\n", maxErr.Iterations)
case errors.As(err, &provErr):
    fmt.Printf("provider error: %v\n", provErr.Cause)
}

Error When
*ProviderError Provider returned an error
*MaxIterationsError Iteration budget exhausted
*ToolExecutionError A tool failed (wraps *CircuitOpenError when circuit is open)
*CircuitOpenError Tool call rejected because the circuit breaker is open
*mcp.MCPConnectionError MCP server unreachable at startup
*mcp.MCPDiscoveryError MCP tool listing failed
ErrToolNotFound Requested tool does not exist

Providers

Provider Package Notes
Anthropic providers/anthropic Reads ANTHROPIC_API_KEY; supports text, images (5 MB), PDFs (32 MB)
Ollama providers/ollama Default http://localhost:11434/v1; supports text and images; includes NewEmbedder
Voyage AI providers/voyage Reads VOYAGE_API_KEY; embedder only (e.g. "voyage-3")

License

Apache 2.0 — see LICENSE.

Documentation

Overview

Package goagent provides a Go-idiomatic framework for building AI agents with a ReAct loop and pluggable providers.

goagent orchestrates the interaction between a language model and a set of tools. On each iteration of the ReAct loop the model either produces a final text answer or requests one or more tool calls. The agent dispatches those calls in parallel, feeds the results back, and repeats until the model stops or the iteration budget (WithMaxIterations) is exhausted.

The framework is built around three small interfaces:

  • Provider — wraps an LLM backend (Ollama, Anthropic, OpenAI-compatible, …).
  • Tool — a capability the model can invoke (calculator, web search, …).
  • ShortTermMemory / LongTermMemory — optional conversation persistence.

All configuration uses functional options passed to New.

Basic usage

provider := ollama.New(ollama.WithBaseURL("http://localhost:11434"))

add := goagent.ToolFunc("add", "Sum two numbers",
    goagent.SchemaFrom(struct {
        A float64 `json:"a" jsonschema_description:"First operand."`
        B float64 `json:"b" jsonschema_description:"Second operand."`
    }{}),
    func(ctx context.Context, args map[string]any) (string, error) {
        a, _ := args["a"].(float64)
        b, _ := args["b"].(float64)
        return fmt.Sprintf("%g", a+b), nil
    },
)

agent, err := goagent.New(
    goagent.WithProvider(provider),
    goagent.WithTool(add),
    goagent.WithMaxIterations(5),
)
if err != nil {
    log.Fatal(err)
}

answer, err := agent.Run(ctx, "What is 2 + 3?")

Multimodal content

[RunBlocks] accepts images and documents alongside text:

answer, err := agent.RunBlocks(ctx,
    goagent.ImageBlock(pngBytes, "image/png"),
    goagent.TextBlock("Describe this image"),
)

Memory

By default each Agent.Run call is stateless. To maintain conversation context across calls, configure a ShortTermMemory via WithShortTermMemory. For semantic retrieval across sessions, add a LongTermMemory via WithLongTermMemory. Implementations live in the memory sub-package.

Sub-packages

  • memory — ShortTermMemory and LongTermMemory with pluggable storage and policies.
  • memory/storage — persistence backends (in-memory; bring your own).
  • memory/policy — read-time filters: FixedWindow, TokenWindow, NoOp.
  • providers/anthropic — Provider for the Anthropic Messages API (Claude).
  • providers/ollama — Provider for Ollama (OpenAI-compatible API).
  • internal/testutil — mock implementations for testing agents.

Example

Example demonstrates creating an Agent with a mock provider and running it.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent"
	"github.com/Germanblandin1/goagent/internal/testutil"
)

func main() {
	agent, err := goagent.New(
		goagent.WithProvider(testutil.NewMockProvider(
			goagent.CompletionResponse{
				Message:    goagent.AssistantMessage("4"),
				StopReason: goagent.StopReasonEndTurn,
			},
		)),
	)
	if err != nil {
		log.Fatal(err)
	}

	result, err := agent.Run(context.Background(), "what is 2+2?")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(result)
}
Output:
4

Constants

This section is empty.

Variables

var ErrInvalidMediaType = errors.New("invalid media type")

ErrInvalidMediaType is returned when a content block has a MIME type that is not in the set of supported types for its content kind.

var ErrToolNotFound = errors.New("tool not found")

ErrToolNotFound is returned when the model requests a tool that was not registered with the agent.

var ErrUnsupportedContent = errors.New("unsupported content type")

ErrUnsupportedContent is returned when the provider does not support a content type present in the request (e.g. documents on an OpenAI-compatible provider, or images on a text-only model).

Functions

func TextFrom

func TextFrom(blocks []ContentBlock) string

TextFrom extracts and concatenates the text from a slice of ContentBlocks. Non-text blocks are ignored. Adjacent text values are separated by a space.

Intended as a convenience for Embedder implementations that only handle text:

func (e *myEmbedder) Embed(ctx context.Context, content []goagent.ContentBlock) ([]float32, error) {
    return e.client.Embed(ctx, goagent.TextFrom(content))
}

Example

ExampleTextFrom shows how TextFrom extracts text from a mixed slice of ContentBlocks, ignoring non-text blocks like images.

package main

import (
	"fmt"

	"github.com/Germanblandin1/goagent"
)

func main() {
	blocks := []goagent.ContentBlock{
		goagent.TextBlock("hello"),
		goagent.ImageBlock([]byte{0xFF}, "image/png"),
		goagent.TextBlock("world"),
	}
	fmt.Println(goagent.TextFrom(blocks))
}
Output:
hello world

func ValidDocumentMediaType

func ValidDocumentMediaType(mediaType string) bool

ValidDocumentMediaType reports whether mediaType is a supported document MIME type.

func ValidImageMediaType

func ValidImageMediaType(mediaType string) bool

ValidImageMediaType reports whether mediaType is a supported image MIME type.
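A sketch of guarding block construction with these validators (variable names are illustrative; wrapping ErrInvalidMediaType is this sketch's choice, not the library's):

```go
// Reject unsupported MIME types before building the content block.
if !goagent.ValidImageMediaType(mediaType) {
    return fmt.Errorf("%w: %s", goagent.ErrInvalidMediaType, mediaType)
}
block := goagent.ImageBlock(data, mediaType)
```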

Types

type Agent

type Agent struct {
	// contains filtered or unexported fields
}

Agent runs a ReAct loop: it alternates between calling the LLM provider and dispatching tool calls until the model produces a final answer or the iteration budget is exhausted.

By default an Agent is stateless — each Run call is independent and carries no memory of previous calls. To persist conversation history across calls, configure a ShortTermMemory via WithShortTermMemory. For semantic retrieval across sessions, configure a LongTermMemory via WithLongTermMemory.

Use WithName to assign a stable identity to the agent. When a LongTermMemory is configured, the name is used as the session namespace so that multiple agents sharing the same memory backend can only see their own entries.

Concurrency

Agent itself holds no mutable state after construction; all fields are set once by New and never written again. Whether concurrent calls to Run are safe depends entirely on the implementations injected by the caller:

  • Provider: safe if the implementation is. All built-in providers are.
  • ShortTermMemory: concurrent Run calls produce undefined message ordering. See WithShortTermMemory.
  • LongTermMemory: concurrent Retrieve and Store calls produce undefined ordering. See WithLongTermMemory.
  • Logger: slog.Logger is documented as safe for concurrent use.
  • Circuit breaker state is protected by mutexes and is safe for concurrent use across simultaneous Run calls.

If no memory backend is configured (the default), Run is safe to call from multiple goroutines simultaneously.

func New

func New(opts ...Option) (*Agent, error)

New creates an Agent with the provided options applied over sensible defaults. A Provider must be supplied via WithProvider before calling Run.

If any WithMCP* options are present, New establishes the MCP connections, discovers their tools, and returns an error if any connection fails. On error, all already-opened connections are closed before returning.

Call Close to release MCP connections when the agent is no longer needed:

agent, err := goagent.New(...)
if err != nil {
    log.Fatal(err)
}
defer agent.Close()

func (*Agent) Close

func (a *Agent) Close() error

Close releases all MCP connections opened during New. For stdio transports: terminates the subprocess. For SSE transports: closes the HTTP connection. Idempotent — multiple calls are safe. Errors are logged at Warn level; the method always returns nil.

func (*Agent) Run

func (a *Agent) Run(ctx context.Context, prompt string) (string, error)

Run executes the ReAct loop for the given prompt and returns the model's final text response.

This is the main entry point for text interactions. For sending images, documents, or other multimodal content, use RunBlocks.

If a ShortTermMemory is configured via WithShortTermMemory, Run loads the conversation history before calling the provider and persists the new turn after producing an answer (or exhausting iterations).

If a LongTermMemory is configured via WithLongTermMemory, Run retrieves semantically relevant context before building the request, and stores the completed turn (subject to WritePolicy) after each Run.

Memory write failures are logged at Warn level and do not cause Run to return an error.

Possible errors:

  • *ProviderError — the provider returned an error
  • *MaxIterationsError — the iteration budget was exhausted
  • context.Canceled / context.DeadlineExceeded — context was cancelled
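
Callers can branch on these error types with errors.As / errors.Is. A sketch (the log messages are illustrative, not part of the package):

```go
answer, err := agent.Run(ctx, prompt)
if err != nil {
	var maxErr *goagent.MaxIterationsError
	switch {
	case errors.As(err, &maxErr):
		// Iteration budget exhausted; LastThought may carry a partial answer.
		log.Printf("no final answer after %d iterations: %q", maxErr.Iterations, maxErr.LastThought)
	case errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded):
		log.Printf("run cancelled: %v", err)
	default:
		// Includes *ProviderError; inspect with errors.As if needed.
		log.Printf("run failed: %v", err)
	}
	return
}
fmt.Println(answer)
```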
Example (WithMemory)

ExampleAgent_Run_withMemory shows that completed turns are persisted to ShortTermMemory and available for the next Run call.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent"
	"github.com/Germanblandin1/goagent/internal/testutil"
	"github.com/Germanblandin1/goagent/memory"
)

func main() {
	mem := memory.NewShortTerm()

	agent, err := goagent.New(
		goagent.WithProvider(testutil.NewMockProvider(
			goagent.CompletionResponse{
				Message:    goagent.AssistantMessage("pong"),
				StopReason: goagent.StopReasonEndTurn,
			},
		)),
		goagent.WithShortTermMemory(mem),
	)
	if err != nil {
		log.Fatal(err)
	}

	if _, err := agent.Run(context.Background(), "ping"); err != nil {
		log.Fatal(err)
	}

	msgs, err := mem.Messages(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(msgs))
	fmt.Println(msgs[0].TextContent())
	fmt.Println(msgs[1].TextContent())
}
Output:
2
ping
pong
Example (WithShortTermMemory)

ExampleAgent_Run_withShortTermMemory shows how to configure short-term memory so that conversation history persists across Run calls.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent"
	"github.com/Germanblandin1/goagent/internal/testutil"
	"github.com/Germanblandin1/goagent/memory"
)

func main() {
	mock := testutil.NewMockProvider(
		goagent.CompletionResponse{
			Message:    goagent.AssistantMessage("Paris."),
			StopReason: goagent.StopReasonEndTurn,
		},
		goagent.CompletionResponse{
			Message:    goagent.AssistantMessage("It has about 2 million inhabitants."),
			StopReason: goagent.StopReasonEndTurn,
		},
	)

	mem := memory.NewShortTerm()

	agent, err := goagent.New(
		goagent.WithProvider(mock),
		goagent.WithShortTermMemory(mem),
	)
	if err != nil {
		log.Fatal(err)
	}

	ctx := context.Background()
	first, _ := agent.Run(ctx, "What is the capital of France?")
	fmt.Println(first)

	second, _ := agent.Run(ctx, "What is its population?")
	fmt.Println(second)
}
Output:
Paris.
It has about 2 million inhabitants.
Example (WithTools)

ExampleAgent_Run_withTools shows how a tool is registered and invoked during a ReAct loop. The mock provider first requests a tool call, then produces the final answer after receiving the tool result.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent"
	"github.com/Germanblandin1/goagent/internal/testutil"
)

func main() {
	echo := goagent.ToolFunc("echo", "returns the input text unchanged", nil,
		func(_ context.Context, args map[string]any) (string, error) {
			return fmt.Sprintf("%v", args["text"]), nil
		},
	)

	mock := testutil.NewMockProvider(
		// Iteration 1: model requests the echo tool.
		goagent.CompletionResponse{
			Message: goagent.Message{
				Role: goagent.RoleAssistant,
				ToolCalls: []goagent.ToolCall{
					{ID: "call-1", Name: "echo", Arguments: map[string]any{"text": "hello"}},
				},
			},
			StopReason: goagent.StopReasonToolUse,
		},
		// Iteration 2: model produces the final answer.
		goagent.CompletionResponse{
			Message:    goagent.AssistantMessage("hello"),
			StopReason: goagent.StopReasonEndTurn,
		},
	)

	agent, err := goagent.New(
		goagent.WithProvider(mock),
		goagent.WithTool(echo),
	)
	if err != nil {
		log.Fatal(err)
	}

	result, err := agent.Run(context.Background(), "echo hello")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(result)
}
Output:
hello

func (*Agent) RunBlocks

func (a *Agent) RunBlocks(ctx context.Context, blocks ...ContentBlock) (string, error)

RunBlocks executes the ReAct loop with multimodal content and returns the model's final text response.

It accepts one or more ContentBlock in any combination of types. For text-only prompts, prefer Run, which is more ergonomic.

Example:

result, err := agent.RunBlocks(ctx,
    goagent.ImageBlock(imgData, "image/png"),
    goagent.TextBlock("What animal is this?"),
)

Possible errors:

  • error if no content blocks are provided
  • ErrInvalidMediaType if a content block has an unsupported MIME type
  • *ProviderError — the provider returned an error
  • *UnsupportedContentError — the provider does not support a content type
  • *MaxIterationsError — the iteration budget was exhausted
  • context.Canceled / context.DeadlineExceeded — context was cancelled
Example

ExampleAgent_RunBlocks shows how to use RunBlocks with multimodal content. Here we send a text block; in practice you would combine text with image or document blocks.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent"
	"github.com/Germanblandin1/goagent/internal/testutil"
)

func main() {
	agent, err := goagent.New(
		goagent.WithProvider(testutil.NewMockProvider(
			goagent.CompletionResponse{
				Message:    goagent.AssistantMessage("It's a cat"),
				StopReason: goagent.StopReasonEndTurn,
			},
		)),
	)
	if err != nil {
		log.Fatal(err)
	}

	result, err := agent.RunBlocks(context.Background(),
		goagent.TextBlock("What animal is this?"),
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(result)
}
Output:
It's a cat

type BatchEmbedder added in v0.5.4

type BatchEmbedder interface {
	Embedder
	BatchEmbed(ctx context.Context, inputs [][]ContentBlock) ([][]float32, error)
}

BatchEmbedder is an optional extension of Embedder that converts multiple content slices to vectors in a single API call. Store uses it when available to collapse N×K individual Embed calls into one round trip — a significant win for remote embedding APIs (OpenAI, Voyage, Cohere) where HTTP overhead dominates latency.

Implementations must return a slice of exactly len(inputs) vectors. A nil vector at index i signals that the input had no embeddable content (equivalent to a single Embed returning ErrNoEmbeddeableContent for that slot). A non-nil error aborts the entire batch.

This mirrors the BulkVectorStore pattern: declare the interface, and longTermMemory.Store will detect it at runtime via a type assertion.
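
A minimal sketch of a text-only BatchEmbedder. remoteClient and its EmbedTexts method are hypothetical stand-ins for a real embedding client; only the goagent names are from this package:

```go
// batchEmbedder wraps a hypothetical remote text-embedding client.
type batchEmbedder struct {
	client *remoteClient // assumed: EmbedTexts(ctx, []string) ([][]float32, error)
}

func (e *batchEmbedder) Embed(ctx context.Context, content []goagent.ContentBlock) ([]float32, error) {
	vecs, err := e.BatchEmbed(ctx, [][]goagent.ContentBlock{content})
	if err != nil {
		return nil, err
	}
	return vecs[0], nil
}

func (e *batchEmbedder) BatchEmbed(ctx context.Context, inputs [][]goagent.ContentBlock) ([][]float32, error) {
	texts := make([]string, len(inputs))
	for i, blocks := range inputs {
		texts[i] = goagent.TextFrom(blocks) // text-only: non-text blocks are dropped
	}
	raw, err := e.client.EmbedTexts(ctx, texts) // one round trip for the whole batch
	if err != nil {
		return nil, err // a non-nil error aborts the entire batch
	}
	out := make([][]float32, len(inputs))
	for i := range inputs {
		if texts[i] == "" {
			out[i] = nil // signals no embeddable content in this slot
			continue
		}
		out[i] = raw[i]
	}
	return out, nil
}
```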

type BulkVectorStore added in v0.5.4

type BulkVectorStore interface {
	VectorStore

	// BulkUpsert stores or updates all entries in a single batch.
	// The operation is idempotent; when entries contains duplicate IDs the
	// last occurrence wins (same behaviour as repeated Upsert calls).
	BulkUpsert(ctx context.Context, entries []UpsertEntry) error

	// BulkDelete removes all entries with the given ids in a single batch.
	// IDs that do not exist are silently ignored.
	BulkDelete(ctx context.Context, ids []string) error
}

BulkVectorStore extends VectorStore with batch operations for stores that support efficient multi-row writes. Callers type-assert to BulkVectorStore and fall back to individual [VectorStore.Upsert] / [VectorStore.Delete] calls when the store does not implement it.

Implementations that do implement BulkVectorStore should ensure that a single BulkUpsert or BulkDelete call is cheaper than an equivalent number of individual Upsert / Delete calls (e.g. by using a database transaction or a native batch RPC).
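
The caller-side fallback can be sketched as follows, assuming [VectorStore.Upsert] accepts a single UpsertEntry (check the VectorStore docs for the exact signature):

```go
// upsertAll uses BulkUpsert when the store supports it and falls back to
// individual Upsert calls otherwise.
func upsertAll(ctx context.Context, store goagent.VectorStore, entries []goagent.UpsertEntry) error {
	if bulk, ok := store.(goagent.BulkVectorStore); ok {
		return bulk.BulkUpsert(ctx, entries) // single batch write
	}
	for _, e := range entries {
		if err := store.Upsert(ctx, e); err != nil {
			return err
		}
	}
	return nil
}
```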

type CircuitOpenError added in v0.5.1

type CircuitOpenError struct {
	Tool      string
	OpenUntil time.Time
}

CircuitOpenError is returned by the dispatcher when a tool's circuit breaker is in the open state and the call is rejected immediately.

func (*CircuitOpenError) Error added in v0.5.1

func (e *CircuitOpenError) Error() string

type CompletionRequest

type CompletionRequest struct {
	// Model is the exact model identifier forwarded to the provider
	// (e.g. "llama3", "qwen3"). Interpretation is provider-specific;
	// the framework passes it through without validation.
	Model string

	// SystemPrompt is the system-level instruction for the model.
	// Providers must forward it using their native mechanism — for
	// OpenAI-compatible APIs this means prepending a system message;
	// for Anthropic it maps to the top-level "system" field.
	// Empty string means no system prompt.
	SystemPrompt string

	// Messages is the conversation history to send, in chronological order.
	// The slice is never nil when sent by Agent, but Provider implementations
	// must handle a nil or empty slice without panicking.
	Messages []Message

	// Tools is the list of tools the model may call during this completion.
	// Nil or empty means no tool use is available for this request.
	// Providers must not error when this field is nil.
	Tools []ToolDefinition

	// Thinking configures extended thinking for this request.
	// nil means thinking is disabled (default behaviour).
	// Providers that do not support thinking must ignore this field.
	Thinking *ThinkingConfig

	// Effort controls the overall effort the model puts into its response,
	// affecting text, tool calls, and thinking (when enabled).
	// Valid values: "high", "medium", "low". Empty string means the model's
	// default (equivalent to "high"). Thinking and Effort are orthogonal —
	// each can be set independently.
	// Providers that do not support effort must ignore this field.
	Effort string
}

CompletionRequest is the input to a provider's Complete call.

type CompletionResponse

type CompletionResponse struct {
	// Message is the model's reply. Role is always RoleAssistant.
	// Content may be empty if the model produced only tool calls.
	// ToolCalls is non-empty when StopReason is StopReasonToolUse.
	Message Message

	// StopReason indicates why the model stopped generating.
	StopReason StopReason

	// Usage reports token consumption for this completion.
	// Providers should populate this when the API returns it;
	// zero values are valid if the backend does not expose token counts.
	Usage Usage
}

CompletionResponse is the output from a provider's Complete call.

type ContentBlock

type ContentBlock struct {
	Type     ContentType
	Text     string
	Image    *ImageData
	Document *DocumentData
	Thinking *ThinkingData
}

ContentBlock represents a unit of content within a message. Exactly one of Text, Image, Document, or Thinking is set, depending on the value of Type; the others hold their zero values.

Use the helpers TextBlock, ImageBlock, DocumentBlock, and ThinkingBlock to construct content blocks instead of building the struct directly.

func DocumentBlock

func DocumentBlock(data []byte, mediaType, title string) ContentBlock

DocumentBlock creates a document ContentBlock from raw bytes. mediaType must be one of: "application/pdf", "text/plain". title is optional — if non-empty, it gives the model context about the document.

func ImageBlock

func ImageBlock(data []byte, mediaType string) ContentBlock

ImageBlock creates an image ContentBlock from raw bytes. mediaType must be one of: "image/jpeg", "image/png", "image/gif", "image/webp".

func TextBlock

func TextBlock(s string) ContentBlock

TextBlock creates a text ContentBlock.

func ThinkingBlock

func ThinkingBlock(thinking, signature string) ContentBlock

ThinkingBlock creates a ContentBlock that carries the model's internal reasoning. signature is the opaque cryptographic token from the Anthropic API; pass an empty string for local models that do not use this mechanism.

type ContentType

type ContentType string

ContentType identifies the kind of content in a ContentBlock.

const (
	// ContentText indicates the block contains plain text.
	ContentText ContentType = "text"

	// ContentImage indicates the block contains an image.
	// Supported formats: JPEG, PNG, GIF, WebP.
	ContentImage ContentType = "image"

	// ContentDocument indicates the block contains a document.
	// Supported formats: PDF, plain text.
	ContentDocument ContentType = "document"

	// ContentThinking indicates the block contains the model's internal
	// reasoning produced before the final response or a tool call.
	// The text may be a summary (Claude 4+) or the full chain-of-thought
	// (Claude Sonnet 3.7, local models).
	//
	// Thinking blocks are produced by the model and must not be constructed
	// by callers except when echoing them back to the provider (which the
	// Agent does automatically). Use ThinkingBlock to build one if needed.
	ContentThinking ContentType = "thinking"
)

type CountableStore added in v0.5.5

type CountableStore interface {
	Count(ctx context.Context, opts ...SearchOption) (int64, error)
}

CountableStore is implemented by VectorStore backends that support counting stored entries without a vector query. Use a type assertion to check support:

if cs, ok := store.(goagent.CountableStore); ok {
    n, err := cs.Count(ctx)
}

Accepts the same SearchOption variadic as Search for consistency; only WithFilter is meaningful — WithScoreThreshold is silently ignored because there is no query vector to score against. Session scope follows the context (set via [WithSessionID]).

type DispatchFunc added in v0.5.1

type DispatchFunc func(ctx context.Context, name string, args map[string]any) ([]ContentBlock, error)

DispatchFunc is the function signature for tool dispatch. It is used as the base of the middleware chain.

type DispatchMiddleware added in v0.5.1

type DispatchMiddleware func(next DispatchFunc) DispatchFunc

DispatchMiddleware wraps a DispatchFunc to add cross-cutting behavior (logging, timeouts, circuit breaking, metrics, etc.). Middlewares are applied outermost-first: the first middleware in the slice is the first to execute and the last to return.
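
A sketch of a timing middleware built on the documented signatures (the logger wiring is illustrative):

```go
// timing logs how long each tool call takes, passing the result through
// unchanged.
func timing(logger *slog.Logger) goagent.DispatchMiddleware {
	return func(next goagent.DispatchFunc) goagent.DispatchFunc {
		return func(ctx context.Context, name string, args map[string]any) ([]goagent.ContentBlock, error) {
			start := time.Now()
			blocks, err := next(ctx, name, args)
			logger.Debug("tool dispatched", "tool", name, "duration", time.Since(start), "err", err)
			return blocks, err
		}
	}
}
```

Register it with goagent.WithDispatchMiddleware(timing(logger)).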

type DocumentData

type DocumentData struct {
	MediaType string // MIME type: "application/pdf", "text/plain"
	Data      []byte // raw document content
	Title     string // optional — gives the model context about the document
}

DocumentData holds a document to send to the model. For PDFs, Claude processes both text and visual content (tables, charts, embedded images) page by page.

Supported formats: application/pdf, text/plain. Anthropic limit: 32 MB per document.

type Embedder

type Embedder interface {
	Embed(ctx context.Context, content []ContentBlock) ([]float32, error)
}

Embedder converts message content into a dense vector representation suitable for semantic similarity search. Implementations receive the full []ContentBlock so they can handle text, image, and document blocks natively — for example, by routing ContentImage blocks to a vision embedding model.

The vector for a given content slice must be consistent across calls (deterministic given the same model and input).

Use TextFrom to extract and concatenate all text blocks in implementations that only support text.

type Hooks

type Hooks struct {
	// OnRunStart is called at the beginning of each Run/RunBlocks call,
	// before loading memory or building messages. It may return an enriched
	// ctx (e.g. containing an OTel span) that the loop will use for all
	// subsequent hook calls and operations. If it returns nil, the loop
	// uses the original ctx unchanged.
	//
	// For callers that only need to observe the event without enriching ctx:
	//   OnRunStart: func(ctx context.Context) context.Context { return ctx }
	OnRunStart func(ctx context.Context) context.Context

	// OnRunEnd is called at the end of each Run/RunBlocks call, just
	// before returning to the caller. It is always called — on success,
	// provider failure, MaxIterationsError, and context cancellation.
	//
	// ctx is the same context returned by OnRunStart (or the original ctx
	// if OnRunStart was not set), allowing span finalisation and cleanup.
	//
	// result contains accumulated metrics: total duration, iterations,
	// total token usage, total tool calls, and total tool execution time.
	// result.Err is nil when the Run succeeds.
	OnRunEnd func(ctx context.Context, result RunResult)

	// OnProviderRequest is called before each Provider.Complete call,
	// once per iteration of the ReAct loop.
	// iteration is 0-indexed. model is the identifier sent to the provider.
	// messageCount is the number of messages in the request.
	OnProviderRequest func(ctx context.Context, iteration int, model string, messageCount int)

	// OnProviderResponse is called after each Provider.Complete call,
	// on both success and provider error.
	// iteration is 0-indexed. event contains the call duration, token
	// usage, stop reason, and error (if any).
	//
	// On provider error, event.Err carries the underlying error (before
	// wrapping as *ProviderError) and Usage/StopReason are zero values.
	OnProviderResponse func(ctx context.Context, iteration int, event ProviderEvent)

	// OnIterationStart is called at the start of each ReAct loop
	// iteration, before calling the provider.
	// iteration is 0-indexed: the first iteration is 0.
	OnIterationStart func(ctx context.Context, iteration int)

	// OnThinking is called when the model produces a thinking block.
	// text is the reasoning content — it may be a summary on Claude 4+
	// or the full reasoning on local models and Claude Sonnet 3.7.
	//
	// Called once per thinking block in the model's response. If the
	// response has multiple thinking blocks (interleaved thinking),
	// it is called once for each, in order.
	//
	// Only called when thinking is enabled (WithThinking,
	// WithAdaptiveThinking) or when a local model produces thinking.
	OnThinking func(ctx context.Context, text string)

	// OnToolCall is called when the model requests a tool invocation,
	// before the dispatcher executes it.
	// Called once per tool call in the model's response. If the model
	// requests N tools in parallel, it is called N times before dispatch.
	OnToolCall func(ctx context.Context, name string, args map[string]any)

	// OnToolResult is called after a tool finishes executing.
	// content is the result that will be sent back to the model.
	// duration is how long the execution took.
	// err is nil on success, or the error if the tool failed.
	//
	// Called even when the tool fails — err contains the error.
	// Called once per tool call, after all parallel calls complete.
	OnToolResult func(ctx context.Context, name string, content []ContentBlock, duration time.Duration, err error)

	// OnCircuitOpen is called when a tool's circuit breaker transitions to the
	// open state and rejects a call. toolName is the name of the disabled tool
	// and openUntil is the earliest time the circuit may close again.
	OnCircuitOpen func(ctx context.Context, toolName string, openUntil time.Time)

	// OnResponse is called when the model produces the final response,
	// just before Run/RunBlocks returns to the caller.
	// text is the extracted text response (without thinking blocks).
	// iterations is the total number of iterations the loop used (1-indexed).
	//
	// Also called when the loop is exhausted (MaxIterationsError) —
	// text may be "" if the last iteration ended with a tool use.
	OnResponse func(ctx context.Context, text string, iterations int)

	// OnShortTermLoad is called after the agent loads conversation history
	// from short-term memory at the start of each Run, on both success
	// and error.
	// results is the number of messages loaded (0 if err != nil).
	// duration is how long the operation took.
	// err is nil on success.
	//
	// Only called when a ShortTermMemory is configured.
	OnShortTermLoad func(ctx context.Context, results int, duration time.Duration, err error)

	// OnShortTermAppend is called after the agent persists the turn to
	// short-term memory at the end of each Run, on both success and error.
	// msgs is the number of messages that were stored.
	// duration is how long the operation took.
	// err is nil on success.
	//
	// Only called when a ShortTermMemory is configured.
	OnShortTermAppend func(ctx context.Context, msgs int, duration time.Duration, err error)

	// OnLongTermRetrieve is called after the agent queries long-term memory
	// at the start of each Run, on both success and error.
	// results is the slice of messages retrieved, each paired with its
	// similarity score. The slice is nil when err is non-nil.
	// duration is how long the retrieval operation took.
	// err is nil on success.
	//
	// Score in each ScoredMessage is the cosine similarity in [0.0, 1.0]
	// when the underlying VectorStore implements ScoredStore.
	// Score is 0.0 when the store does not expose scores.
	//
	// Only called when a LongTermMemory is configured.
	OnLongTermRetrieve func(ctx context.Context, results []ScoredMessage, duration time.Duration, err error)

	// OnLongTermStore is called after the agent persists a turn to
	// long-term memory at the end of each Run, on both success and error.
	// msgs is the number of messages that were stored.
	// duration is how long the storage operation took.
	// err is nil on success.
	//
	// Only called when a LongTermMemory is configured and the
	// WritePolicy decided to persist the turn. Not called when the
	// policy discards the turn.
	OnLongTermStore func(ctx context.Context, msgs int, duration time.Duration, err error)
}

Hooks allows observing events in the ReAct loop without modifying its behaviour. All fields are optional — a nil hook is silently skipped.

Hooks are invoked synchronously within the loop. If a hook needs to perform heavy work (e.g. sending to an external service), it should spawn a goroutine internally to avoid blocking the loop.

The zero value of Hooks is functional and invokes no callbacks.

Example:

agent, _ := goagent.New(
    goagent.WithProvider(provider),
    goagent.WithHooks(goagent.Hooks{
        OnToolCall: func(_ context.Context, name string, args map[string]any) {
            fmt.Printf("tool: %s\n", name)
        },
    }),
)

func MergeHooks added in v0.5.1

func MergeHooks(hooks ...Hooks) Hooks

MergeHooks combines multiple Hooks structs into one. For each hook field, the merged hook calls every non-nil callback in order. Fields where no input hook has a callback remain nil, preserving the zero-value semantics of the Hooks struct.

OnRunStart is special: each hook's return value is passed as the input ctx to the next hook in the chain, so multiple hooks can each enrich the context (e.g. adding an OTel span, a request ID, and a logger simultaneously).

This enables composing independent hook sets (e.g. OTel tracing + custom logging) without manual wiring.

agent, _ := goagent.New(
    goagent.WithHooks(goagent.MergeHooks(otelHooks, loggingHooks, metricsHooks)),
)

type ImageData

type ImageData struct {
	MediaType string // MIME type: "image/jpeg", "image/png", "image/gif", "image/webp"
	Data      []byte // raw image content
}

ImageData holds an image to send to the model. Data is the raw image content — the provider layer encodes it to base64.

Supported formats: image/jpeg, image/png, image/gif, image/webp. Anthropic limit: 5 MB per image, ~1600x1600 px recommended.

type LongTermMemory

type LongTermMemory interface {
	// Store persists msgs for future retrieval.
	Store(ctx context.Context, msgs ...Message) error

	// Retrieve returns the topK messages most semantically similar to the
	// given content, each paired with its similarity score.
	//
	// The full []ContentBlock is passed so that embedder implementations
	// that support vision or documents can build a meaningful query vector
	// even when the prompt contains no text.
	//
	// Score in each ScoredMessage is the cosine similarity in [0.0, 1.0]
	// as computed by the underlying VectorStore.
	//
	// opts are forwarded to the underlying VectorStore.Search call.
	Retrieve(ctx context.Context, query []ContentBlock, topK int, opts ...SearchOption) ([]ScoredMessage, error)
}

LongTermMemory stores and retrieves messages across sessions by semantic relevance. Unlike ShortTermMemory, retrieval is similarity-based, not positional — the store may contain thousands of messages but only the most relevant ones are surfaced on each Run. Implementations must be safe for concurrent use.

type MCPConnectorFn

type MCPConnectorFn func(ctx context.Context, logger *slog.Logger) ([]Tool, io.Closer, error)

MCPConnectorFn is a function that establishes an MCP connection, discovers tools, and returns them along with a Closer for lifecycle management. It is called by New for each WithMCP* option applied to the agent.

type MaxIterationsError

type MaxIterationsError struct {
	// Iterations is the budget that was exhausted (set by WithMaxIterations).
	Iterations int

	// LastThought is the text content of the model's last assistant message
	// before the budget ran out. It is empty if the model's last turn
	// produced only tool calls with no accompanying text. Useful for
	// debugging runaway loops or surfacing a partial answer to the user.
	LastThought string
}

MaxIterationsError is returned by Run when the agent exhausts its iteration budget without producing a final answer.

func (*MaxIterationsError) Error

func (e *MaxIterationsError) Error() string

type Message

type Message struct {
	Role    Role
	Content []ContentBlock

	// ToolCalls is non-empty when the model requests one or more tool
	// invocations. Only set on assistant messages (Role == RoleAssistant).
	ToolCalls []ToolCall

	// ToolCallID is the ID of the ToolCall this message is a result for.
	// Must be set — and must exactly match ToolCall.ID — when Role == RoleTool.
	// The Agent sets this automatically when building tool result messages;
	// Provider implementations must populate ToolCall.ID for the correlation
	// to work correctly.
	ToolCallID string

	// Metadata is a free-form map for infrastructure components (RAG pipelines,
	// vector stores) to attach auxiliary information such as source document,
	// chunk index, or retrieval score.
	//
	// The Agent and all Provider implementations ignore this field — it is never
	// serialised or sent to the model. Callers must not rely on Metadata being
	// propagated through the provider round-trip.
	Metadata map[string]any
}

Message is a single turn in a conversation.

Content is a slice of ContentBlock that can hold text, images, documents, or any combination. For simple text messages, use the helpers TextMessage() or UserMessage(). To extract concatenated text from all blocks, use TextContent().

func AssistantMessage

func AssistantMessage(text string) Message

AssistantMessage creates an assistant-role Message with text content. Shorthand for TextMessage(RoleAssistant, text).

func TextMessage

func TextMessage(role Role, text string) Message

TextMessage creates a Message with a single text content block.

func UserMessage

func UserMessage(text string) Message

UserMessage creates a user-role Message with text content. Shorthand for TextMessage(RoleUser, text).

func (Message) HasContentType

func (m Message) HasContentType(ct ContentType) bool

HasContentType reports whether the message contains at least one ContentBlock of the given type.

func (Message) TextContent

func (m Message) TextContent() string

TextContent returns the concatenation of all ContentText blocks in the message, separated by newlines. Returns an empty string if the message contains no text blocks.
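
Both helpers together, on a mixed message (imgData is a hypothetical byte slice holding PNG data):

```go
msg := goagent.Message{
	Role: goagent.RoleUser,
	Content: []goagent.ContentBlock{
		goagent.TextBlock("describe this image"),
		goagent.ImageBlock(imgData, "image/png"),
	},
}
if msg.HasContentType(goagent.ContentImage) {
	// TextContent ignores the image block and returns "describe this image".
	fmt.Println(msg.TextContent())
}
```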

type Option

type Option func(*options)

Option is a functional option for configuring an Agent.

func WithAdaptiveThinking

func WithAdaptiveThinking() Option

WithAdaptiveThinking enables thinking in adaptive mode: the model decides how much to reason based on the complexity of each prompt. Recommended for claude-opus-4-6 and claude-sonnet-4-6.

On models that do not support adaptive mode, the provider may fall back to a manual budget.

func WithCircuitBreaker added in v0.5.1

func WithCircuitBreaker(maxFailures int, resetTimeout time.Duration) Option

WithCircuitBreaker enables per-tool circuit breaking. After maxFailures consecutive failures, the tool is disabled for resetTimeout. Disabled tools return CircuitOpenError immediately without calling Execute. The OnCircuitOpen hook (if set) is called each time a call is rejected. maxFailures must be > 0; resetTimeout must be > 0.
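
Combined with the OnCircuitOpen hook, rejections can be observed as they happen; a sketch using only documented options:

```go
agent, err := goagent.New(
	goagent.WithProvider(provider),
	goagent.WithCircuitBreaker(3, 30*time.Second), // open after 3 consecutive failures
	goagent.WithHooks(goagent.Hooks{
		OnCircuitOpen: func(_ context.Context, toolName string, openUntil time.Time) {
			log.Printf("circuit open for %q until %s", toolName, openUntil.Format(time.RFC3339))
		},
	}),
)
```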

func WithDispatchMiddleware added in v0.5.1

func WithDispatchMiddleware(mw DispatchMiddleware) Option

WithDispatchMiddleware appends a custom DispatchMiddleware to the chain. Custom middlewares run after the built-in ones (logging → timeout → circuit breaker → custom → Execute). Multiple calls append in order: first call = outermost custom middleware.

func WithEffort

func WithEffort(level string) Option

WithEffort controls the overall effort the model puts into its response, affecting text quality, tool call accuracy, and reasoning depth.

Valid values:

  • "high": maximum effort — equivalent to the model's default behaviour.
  • "medium": balanced quality and cost — suitable for most tasks.
  • "low": faster and cheaper responses — best for simple classification or extraction tasks.

Effort and thinking are orthogonal and can be combined freely. Supported models: claude-opus-4-6, claude-sonnet-4-6, claude-opus-4-5. Models that do not support effort silently ignore this setting.

func WithHooks

func WithHooks(h Hooks) Option

WithHooks registers observability callbacks for the ReAct loop. All fields of Hooks are optional — only non-nil hooks are invoked.

Example:

agent, _ := goagent.New(
    goagent.WithProvider(provider),
    goagent.WithHooks(goagent.Hooks{
        OnToolCall: func(_ context.Context, name string, args map[string]any) {
            fmt.Printf("calling tool: %s\n", name)
        },
    }),
)

func WithLogger

func WithLogger(l *slog.Logger) Option

WithLogger sets the structured logger for the agent's operational output.

The agent logs at three levels:

  • Info: run lifecycle (start, end, cancellation)
  • Warn: recoverable failures (provider errors, memory write errors, circuit breaker open)
  • Debug: per-iteration detail (provider calls, tool dispatch)

Default: slog.Default(). The logger is never nil internally — the default is always applied. To suppress all log output, pass a logger with a discard handler:

goagent.WithLogger(slog.New(slog.NewTextHandler(io.Discard, nil)))

func WithLongTermMemory

func WithLongTermMemory(m LongTermMemory) Option

WithLongTermMemory configures the agent to retrieve semantically relevant context from past sessions before each Run, and to store the completed turn after each Run (subject to the configured WritePolicy). The default is no long-term memory.

Note: concurrent Run calls on the same Agent will issue concurrent Retrieve and Store calls to this backend. Whether that is safe depends on the LongTermMemory implementation supplied by the caller.

func WithLongTermTopK

func WithLongTermTopK(k int) Option

WithLongTermTopK sets how many messages the long-term memory retrieves per Run. Default: 3.

func WithMCPConnector

func WithMCPConnector(fn MCPConnectorFn) Option

WithMCPConnector registers an MCP connector that is called during New to establish a connection, discover tools, and obtain a Closer for lifecycle management. The connection is established once during New; if it fails, New returns an error and closes any already-opened connections.

Callers typically use the higher-level helpers in the mcp sub-package (mcp.WithStdio, mcp.WithSSE) rather than calling this directly.

func WithMaxIterations

func WithMaxIterations(n int) Option

WithMaxIterations limits how many reasoning iterations the agent may perform. Defaults to 10.

func WithModel

func WithModel(model string) Option

WithModel sets the model identifier forwarded to the provider. Required. The value is provider-specific (e.g. "qwen3" for Ollama, "claude-sonnet-4-6" for Anthropic). If not set, the provider may return an error.

func WithName

func WithName(name string) Option

WithName assigns a name to the agent. When a LongTermMemory is configured, the name is used as the session namespace: all vectors stored and retrieved during Run are scoped to this name, so two agents sharing the same LongTermMemory backend but different names cannot see each other's entries. If not set, no session filtering is applied.

func WithProvider

func WithProvider(p Provider) Option

WithProvider sets the LLM backend used by the agent.

func WithRunResult added in v0.5.1

func WithRunResult(dst *RunResult) Option

WithRunResult configures a destination pointer that the agent writes after each Run/RunBlocks call completes. The pointed-to RunResult is overwritten on every call, so the caller should read it before starting the next Run.

This is a synchronous, non-hook alternative to Hooks.OnRunEnd for callers that prefer inspecting metrics after Run returns rather than inside a callback.

Note: sharing the same pointer across concurrent Run calls is a data race. Use one pointer per goroutine or use Hooks.OnRunEnd with a mutex instead.

func WithShortTermMemory

func WithShortTermMemory(m ShortTermMemory) Option

WithShortTermMemory configures the agent to persist and replay conversation history across Run calls using the provided ShortTermMemory. The default is stateless — each Run starts with no history.

Note: sharing a ShortTermMemory across concurrent Run calls produces undefined message ordering. Sequential use (one Run at a time) is safe.

func WithShortTermTraceTools

func WithShortTermTraceTools(include bool) Option

WithShortTermTraceTools controls whether the full ReAct trace (tool calls and their results) is included when persisting to short-term memory.

  • true (default): the complete trace is stored — user message, all intermediate assistant turns, tool results, and final assistant answer. The next Run will see the full reasoning history.
  • false: only the user message and the final assistant answer are stored, discarding intermediate tool call steps.

func WithSystemPrompt

func WithSystemPrompt(prompt string) Option

WithSystemPrompt sets a system-level instruction sent to the provider on every Run call.

func WithThinking

func WithThinking(budgetTokens int) Option

WithThinking enables extended thinking with a fixed token budget. The model uses up to budgetTokens tokens for internal reasoning before responding or invoking a tool.

Minimum: 1024 tokens. Recommended ranges:

  • Simple tasks: 4,000–8,000
  • Complex tasks (math, code): 10,000–16,000
  • Deep reasoning: 16,000–32,000

For budgets above 32,000 tokens, consider WithAdaptiveThinking instead. Supported models: claude-sonnet-4-6, claude-opus-4-6, claude-sonnet-3-7, claude-opus-4, claude-opus-4-5.

func WithTool

func WithTool(t Tool) Option

WithTool registers a tool the agent may invoke during the ReAct loop.

func WithToolTimeout added in v0.5.1

func WithToolTimeout(d time.Duration) Option

WithToolTimeout sets an independent deadline for each individual tool call, separate from the parent context. If a tool does not complete within d, its context is cancelled and the failure is recorded (including by the circuit breaker if configured). Zero disables per-tool timeouts.

func WithWritePolicy

func WithWritePolicy(p WritePolicy) Option

WithWritePolicy sets the function that decides whether a completed turn (prompt + final response) is stored in long-term memory. Default: StoreAlways. Only effective when WithLongTermMemory is configured.

type Provider

type Provider interface {
	Complete(ctx context.Context, req CompletionRequest) (CompletionResponse, error)
}

Provider is the interface that wraps a language model backend. Callers supply a Provider to Agent via WithProvider.

func RetryProvider added in v0.5.1

func RetryProvider(inner Provider, policy RetryPolicy) Provider

RetryProvider wraps inner with exponential-backoff retry logic. On each failed Complete call, the policy decides whether to retry, how long to wait, and how many total attempts to make.

When the error carries a server-suggested delay (e.g. Retry-After header) and RetryPolicy.RetryAfter is set, that delay takes precedence over the computed backoff for that attempt.

Context cancellation is respected between retries: if ctx is cancelled while waiting, the wait returns immediately with ctx.Err().

type ProviderError

type ProviderError struct {
	Provider string
	Cause    error
}

ProviderError wraps an error returned by a Provider's Complete method.

func (*ProviderError) Error

func (e *ProviderError) Error() string

func (*ProviderError) Unwrap

func (e *ProviderError) Unwrap() error

Unwrap enables errors.Is and errors.As to inspect the underlying cause.

type ProviderEvent added in v0.5.1

type ProviderEvent struct {
	// Duration is the wall-clock time of the Provider.Complete call.
	Duration time.Duration

	// Usage reports token consumption for this completion.
	// Zero value if the call failed or the provider does not report usage.
	Usage Usage

	// StopReason indicates why the model stopped generating.
	// Only meaningful when Err is nil.
	StopReason StopReason

	// ToolCalls is the number of tool calls the model requested in this
	// completion. Zero when the model produced a final answer or an error.
	ToolCalls int

	// Err is nil on success. On provider failure it carries the underlying
	// error (before wrapping as *ProviderError).
	Err error
}

ProviderEvent captures metrics from a single provider call within the ReAct loop. It is passed to Hooks.OnProviderResponse after each Provider.Complete invocation, whether it succeeds or fails.

type RetryPolicy added in v0.5.1

type RetryPolicy struct {
	// MaxAttempts is the total number of attempts including the initial call.
	// Values ≤ 1 disable retries (the call is made once). Default: 3.
	MaxAttempts int

	// InitialDelay is the base wait time before the first retry.
	// Actual delay includes jitter (±25%). Default: 200ms.
	InitialDelay time.Duration

	// MaxDelay caps the computed delay for any single retry.
	// Default: 10s.
	MaxDelay time.Duration

	// Multiplier scales the delay on each successive retry.
	// Default: 2.0.
	Multiplier float64

	// Retryable decides whether an error should be retried.
	// When nil every non-nil error is retried.
	// Returning false stops the retry loop immediately.
	Retryable func(error) bool

	// RetryAfter extracts a server-suggested wait time from the error.
	// When non-nil and the returned duration is > 0, that duration is used
	// instead of the computed exponential delay for that attempt.
	// Typical use: parse a Retry-After header from a 429/503 response.
	// When nil or when it returns ≤ 0, the normal backoff applies.
	RetryAfter func(error) time.Duration
}

RetryPolicy configures retry behaviour for RetryProvider and RetryMiddleware.

Zero-value fields use sensible defaults: 3 attempts, 200ms initial delay, 10s max delay, 2x multiplier, retry all errors.

type Role

type Role string

Role identifies who authored a message in a conversation.

const (
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"

	// RoleSystem is provided for completeness and for callers that build
	// Provider implementations or raw message slices. When using Agent,
	// the system prompt is set via WithSystemPrompt and is forwarded to the
	// provider through CompletionRequest.SystemPrompt — never as a Message
	// with this role. Callers do not need to construct RoleSystem messages
	// directly.
	RoleSystem Role = "system"

	// RoleDocument marks messages that represent indexed document chunks in a
	// RAG pipeline. These messages live only in the VectorStore — they are
	// never sent to a provider directly. The rag sub-package provides
	// formatters that extract their Content and embed it into tool-result
	// blocks before the model sees the retrieved context.
	RoleDocument Role = "document"
)

type RunResult added in v0.5.1

type RunResult struct {
	// Duration is the wall-clock time of the entire Run, from entry to return.
	Duration time.Duration

	// Iterations is the number of ReAct iterations executed (1-indexed).
	// On success this is the iteration that produced the final answer.
	// On MaxIterationsError this equals the configured maximum.
	Iterations int

	// TotalUsage is the sum of token counts across all provider calls.
	TotalUsage Usage

	// ToolCalls is the total number of tool invocations across all iterations.
	ToolCalls int

	// ToolTime is the total wall-clock time spent executing tools.
	// Parallel tool calls within the same iteration overlap, so this may
	// exceed the actual elapsed time.
	ToolTime time.Duration

	// Err is nil when Run succeeds. Otherwise it carries the same error
	// that Run returns (*ProviderError, *MaxIterationsError, context error, etc.).
	Err error
}

RunResult aggregates metrics for an entire Run/RunBlocks call. It is passed to Hooks.OnRunEnd and optionally written to a caller-supplied pointer via WithRunResult.

type Schema added in v0.5.1

type Schema = map[string]any

Schema is a JSON Schema object represented as a plain map. It is interchangeable with map[string]any and is the type expected by [ToolDefinition.Parameters].

func SchemaFrom added in v0.5.1

func SchemaFrom(v any) Schema

SchemaFrom derives a JSON Schema from a struct value using reflection. Pass an empty struct literal so the caller does not need to import reflect:

schema := goagent.SchemaFrom(struct {
    Query string `json:"query" jsonschema_description:"The search query"`
    TopK  int    `json:"top_k,omitempty"`
}{})

Supported tags

  • json:"name" → property name in the schema; "-" skips the field
  • json:"name,omitempty" → field is optional (not added to "required")
  • jsonschema_description:"text" → adds a "description" key to the property
  • jsonschema_enum:"a,b,c" → adds an "enum" key with the comma-separated values

Type mapping

string              → "string"
int / int8…int64 /
uint / uint8…uint64 → "integer"
float32 / float64   → "number"
bool                → "boolean"
[]T / [n]T          → "array"
anything else       → "string"  (conservative fallback)

Note: slice and array fields map to {"type": "array"} without an "items" property — the element type is not reflected. If the model needs to know the element type, build the schema manually or augment the returned map after calling SchemaFrom.

Pointer arguments are dereferenced before inspection. For non-struct types (after pointer dereferencing), SchemaFrom returns {"type":"object"} without panicking.

type ScoredMessage added in v0.5.1

type ScoredMessage struct {
	Message Message
	Score   float64
}

ScoredMessage combines a Message with the similarity score calculated by the VectorStore at retrieval time.

Score is in [0.0, 1.0] for stores that use cosine similarity with normalised vectors (most modern text embedders produce unit vectors). Score is 0.0 when the VectorStore does not expose similarity scores (i.e. does not implement scored search internally).

type SearchOption added in v0.5.3

type SearchOption func(*SearchOptions)

SearchOption configures an individual Search call.

func WithFilter added in v0.5.3

func WithFilter(f map[string]any) SearchOption

WithFilter returns a SearchOption that restricts search results to entries whose metadata contains all key–value pairs in f. Support is implementation-defined; implementations that do not support metadata filtering silently ignore this option.

func WithScoreThreshold added in v0.5.3

func WithScoreThreshold(min float64) SearchOption

WithScoreThreshold returns a SearchOption that discards results whose similarity score is strictly below min.

func WithTokenBudget added in v0.5.5

func WithTokenBudget(budget int, estimator func(ctx context.Context, text string) int) SearchOption

WithTokenBudget returns a SearchOption that limits the total token cost of results returned by [LongTermMemory.Retrieve]. Results are already ordered by descending similarity score; the loop stops as soon as the next result would exceed the remaining budget.

estimator is called once per result with the concatenated text content of that message. Pass e.g. est.Measure for any memory/vector.SizeEstimator. If estimator is nil or budget ≤ 0 the option has no effect.

This option is a no-op when passed directly to [VectorStore.Search]; it is only applied by [LongTermMemory.Retrieve].

type SearchOptions added in v0.5.3

type SearchOptions struct {
	// ScoreThreshold, when non-nil, causes Search to discard results whose
	// similarity score is strictly below this value. The score convention
	// follows ScoredMessage.Score for the underlying store.
	ScoreThreshold *float64

	// Filter restricts results to entries whose metadata contains all the
	// key–value pairs in this map. Support is implementation-defined:
	// implementations that do not support metadata filtering silently ignore it.
	Filter map[string]any

	// TokenBudget, when positive, caps the total estimated token cost of the
	// results returned by [LongTermMemory.Retrieve]. Results are evaluated in
	// score-descending order; once a result would exceed the remaining budget,
	// that result and all subsequent ones are dropped.
	// TokenEstimator must be non-nil when TokenBudget > 0; otherwise the
	// option has no effect.
	// This field is consumed by [LongTermMemory] and is invisible to
	// [VectorStore] implementations.
	TokenBudget int

	// TokenEstimator is called once per result with the concatenated text of
	// that message's content blocks to determine its token cost.
	// Provide e.g. est.Measure for any memory/vector.SizeEstimator.
	TokenEstimator func(ctx context.Context, text string) int
}

SearchOptions holds the resolved configuration for a single Search call. Fields at their zero value impose no constraint. Implementations that do not support a given field must silently ignore it.

type ShortTermMemory

type ShortTermMemory interface {
	// Messages returns the messages to include in the next provider request,
	// with the configured read policy applied (e.g. FixedWindow, TokenWindow).
	// The returned slice is a defensive copy; callers may modify it freely.
	Messages(ctx context.Context) ([]Message, error)

	// Append adds msgs to the store in the order provided.
	// Filtering occurs at read time (Messages), never at write time.
	Append(ctx context.Context, msgs ...Message) error
}

ShortTermMemory manages the active conversation history for an Agent. It is used to maintain context across multiple Run calls within a session. Implementations must be safe for concurrent use.

type StopReason

type StopReason int

StopReason indicates why the model stopped generating.

const (
	StopReasonEndTurn   StopReason = iota // model produced a final answer
	StopReasonMaxTokens                   // token limit reached
	StopReasonToolUse                     // model wants to call tools
	StopReasonError                       // provider-level error
)

func (StopReason) String

func (s StopReason) String() string

String returns a human-readable representation of the stop reason.

type ThinkingConfig

type ThinkingConfig struct {
	Enabled      bool
	BudgetTokens int
}

ThinkingConfig configures the model's extended thinking mode.

Manual mode: Enabled true, BudgetTokens > 0. The model uses up to BudgetTokens tokens for internal reasoning before responding or invoking a tool. Minimum: 1024 tokens.

Adaptive mode: Enabled true, BudgetTokens 0. The model decides how much to reason based on prompt complexity. Recommended for Opus 4.6 and Sonnet 4.6.

Disabled: Enabled false (or nil pointer in CompletionRequest).

type ThinkingData

type ThinkingData struct {
	Thinking  string
	Signature string
}

ThinkingData holds the model's internal reasoning produced during extended thinking. The Agent preserves this data within a turn so the provider can echo it back to the API (required by Anthropic for thinking continuity).

Thinking is the reasoning text — may be a summary in Claude 4+ or the full chain-of-thought in Claude Sonnet 3.7 and local models.

Signature is the opaque cryptographic token issued by the Anthropic API to verify the block's authenticity. It must be echoed back unchanged and must never be logged, inspected, or modified. For local models (Ollama) that do not use this mechanism, Signature is an empty string.

type Tool

type Tool interface {
	Definition() ToolDefinition
	Execute(ctx context.Context, args map[string]any) ([]ContentBlock, error)
}

Tool is the interface that callers implement to give capabilities to an agent.

Execute receives the arguments parsed by the model and returns content that is injected into the conversation as the tool result. The content can be text, images, documents, or any combination.

For tools that only return text (the most common case), use ToolFunc which accepts func(...) (string, error) and wraps it automatically.

func RetryTool added in v0.5.1

func RetryTool(inner Tool, policy RetryPolicy) Tool

RetryTool wraps inner with exponential-backoff retry logic. On each failed Execute call, the policy decides whether to retry, how long to wait, and how many total attempts to make.

The caller decides which tools get retry behaviour:

apiTool := goagent.RetryTool(myAPITool, goagent.RetryPolicy{
    MaxAttempts:  3,
    InitialDelay: 100 * time.Millisecond,
})

agent, _ := goagent.New(
    goagent.WithTool(apiTool),   // retried
    goagent.WithTool(localTool), // not retried
)

func ToolBlocksFunc

func ToolBlocksFunc(
	name, description string,
	parameters map[string]any,
	fn func(ctx context.Context, args map[string]any) ([]ContentBlock, error),
) Tool

ToolBlocksFunc creates a Tool from a plain function that returns multimodal content. Use this when the tool needs to return images, documents, or a combination of content types.

For tools that only return text, prefer ToolFunc which is more ergonomic.

Example

ExampleToolBlocksFunc shows how to create a Tool that returns multimodal content blocks instead of plain text.

package main

import (
	"context"
	"fmt"

	"github.com/Germanblandin1/goagent"
)

func main() {
	t := goagent.ToolBlocksFunc("screenshot", "captures the current screen", nil,
		func(_ context.Context, _ map[string]any) ([]goagent.ContentBlock, error) {
			return []goagent.ContentBlock{
				goagent.TextBlock("captured"),
			}, nil
		},
	)
	fmt.Println(t.Definition().Name)
	fmt.Println(t.Definition().Description)
}
Output:
screenshot
captures the current screen

func ToolFunc

func ToolFunc(
	name, description string,
	parameters map[string]any,
	fn func(ctx context.Context, args map[string]any) (string, error),
) Tool

ToolFunc creates a Tool from a plain function that returns text. The returned string is wrapped in a single text ContentBlock automatically, avoiding the need to define a new struct for simple tools.

For tools that need to return images, documents, or mixed content, use ToolBlocksFunc instead.

Example

ExampleToolFunc shows how to create a Tool from a plain function.

package main

import (
	"context"
	"fmt"

	"github.com/Germanblandin1/goagent"
)

func main() {
	t := goagent.ToolFunc("add", "adds two numbers", nil,
		func(_ context.Context, _ map[string]any) (string, error) {
			return "42", nil
		},
	)
	fmt.Println(t.Definition().Name)
	fmt.Println(t.Definition().Description)
}
Output:
add
adds two numbers

type ToolCall

type ToolCall struct {
	// ID is the opaque identifier assigned by the model to this tool call.
	// It must be echoed back in Message.ToolCallID of the corresponding
	// tool result message so the model can correlate the result with the
	// request. Provider implementations must populate this field; leaving
	// it empty will cause the next completion to fail on APIs that enforce
	// the tool call / tool result pairing (e.g. Anthropic, OpenAI).
	ID string

	// Name is the tool name the model wants to invoke, matching the
	// Name field of a registered ToolDefinition.
	Name string

	// Arguments contains the arguments the model supplied for this call,
	// decoded from the provider's JSON payload. Keys match the parameter
	// names defined in ToolDefinition.Parameters.
	Arguments map[string]any
}

ToolCall represents a request from the model to invoke a tool.

type ToolDefinition

type ToolDefinition struct {
	Name        string
	Description string
	Parameters  map[string]any
}

ToolDefinition describes a tool's name, purpose, and parameter schema. Parameters must be a valid JSON Schema object as map[string]any.

type ToolExecutionError

type ToolExecutionError struct {
	ToolName string
	Args     map[string]any
	Cause    error
}

ToolExecutionError wraps an error returned by a tool's Execute method, adding the tool name and arguments for diagnosis.

func (*ToolExecutionError) Error

func (e *ToolExecutionError) Error() string

func (*ToolExecutionError) Unwrap

func (e *ToolExecutionError) Unwrap() error

Unwrap enables errors.Is and errors.As to inspect the underlying cause.

type ToolPanicError added in v0.5.1

type ToolPanicError struct {
	// ToolName is the name of the tool that panicked.
	ToolName string

	// Value is the value passed to panic().
	Value any

	// Stack is the goroutine stack trace captured at the point of recovery.
	Stack []byte
}

ToolPanicError is returned when a tool's Execute method panics. The panic is recovered and wrapped into this error so it propagates through the middleware chain as a normal error instead of crashing the host process.

func (*ToolPanicError) Error added in v0.5.1

func (e *ToolPanicError) Error() string

func (*ToolPanicError) StackTrace added in v0.5.1

func (e *ToolPanicError) StackTrace() string

StackTrace returns the goroutine stack trace captured at the point of recovery. Useful for logging and debugging.

type ToolResult

type ToolResult struct {
	ToolCallID string
	Name       string
	Content    []ContentBlock
	// Err is nil on success. On tool failure it carries the underlying error,
	// typically *ToolExecutionError or *CircuitOpenError.
	Err error
	// Duration is how long Execute took. It is zero when the tool was not
	// found (ErrToolNotFound) — no execution occurred in that case.
	Duration time.Duration
}

ToolResult holds the outcome of a single tool execution dispatched by the agent.

type UnsupportedContentError

type UnsupportedContentError struct {
	ContentType ContentType
	Provider    string
	Reason      string
}

UnsupportedContentError provides detail about which content type is not supported and by which provider. It wraps ErrUnsupportedContent so callers can match with errors.Is(err, ErrUnsupportedContent).

func (*UnsupportedContentError) Error

func (e *UnsupportedContentError) Error() string

func (*UnsupportedContentError) Unwrap

func (e *UnsupportedContentError) Unwrap() error

Unwrap returns ErrUnsupportedContent so errors.Is works.

type UpsertEntry added in v0.5.4

type UpsertEntry struct {
	ID      string
	Vector  []float32
	Message Message
}

UpsertEntry holds the data for a single [BulkVectorStore.BulkUpsert] element.

type Usage

type Usage struct {
	// InputTokens is the number of tokens in the prompt (including history).
	InputTokens int

	// OutputTokens is the number of tokens in the model's response.
	OutputTokens int

	// CacheCreationInputTokens is the number of tokens written to the prompt
	// cache during this request. Only populated by providers that support
	// prompt caching (e.g. Anthropic); zero otherwise.
	CacheCreationInputTokens int

	// CacheReadInputTokens is the number of tokens read from the prompt
	// cache during this request. Only populated by providers that support
	// prompt caching (e.g. Anthropic); zero otherwise.
	CacheReadInputTokens int
}

Usage reports token consumption for a completion.

type VectorStore

type VectorStore interface {
	// Upsert stores or updates a message with its embedding vector.
	// id must be a stable identifier for the message (e.g. content hash).
	Upsert(ctx context.Context, id string, vector []float32, msg Message) error

	// Search returns the topK messages most similar to the given vector,
	// each paired with its similarity score. Score is in [0.0, 1.0] for
	// stores that use cosine similarity with normalised vectors.
	// opts may constrain results (e.g. WithScoreThreshold, WithFilter).
	Search(ctx context.Context, vector []float32, topK int, opts ...SearchOption) ([]ScoredMessage, error)

	// Delete removes the entry with the given id from the store.
	// It is a no-op if id does not exist.
	Delete(ctx context.Context, id string) error
}

VectorStore stores (message, embedding) pairs and supports similarity search. This module does not ship a VectorStore implementation; the caller must supply one (e.g. a pgvector client, a Chroma adapter, or an in-process approximate nearest-neighbour store).

func NewObservableStore added in v0.5.5

func NewObservableStore(inner VectorStore, obs VectorStoreObserver) VectorStore

NewObservableStore wraps inner with the provided observer callbacks. Every VectorStore method fires the corresponding callback after the inner call returns, passing the elapsed duration and any error.

If inner also implements BulkVectorStore, the returned value implements it too: [BulkVectorStore.BulkUpsert] and [BulkVectorStore.BulkDelete] each fire their respective callbacks. The RAG pipeline detects BulkVectorStore via a type assertion, so the wrapper is transparent to it.

type VectorStoreObserver added in v0.5.5

type VectorStoreObserver struct {
	// OnUpsert is called after every [VectorStore.Upsert] with the entry id,
	// elapsed duration, and any error returned by the inner store.
	OnUpsert func(ctx context.Context, id string, dur time.Duration, err error)

	// OnSearch is called after every [VectorStore.Search] with the requested
	// topK, the number of results actually returned, elapsed duration, and
	// any error.
	OnSearch func(ctx context.Context, topK int, results int, dur time.Duration, err error)

	// OnDelete is called after every [VectorStore.Delete] with the entry id,
	// elapsed duration, and any error.
	OnDelete func(ctx context.Context, id string, dur time.Duration, err error)

	// OnBulkUpsert is called after every [BulkVectorStore.BulkUpsert] with the
	// number of entries in the batch, elapsed duration, and any error.
	// It is only invoked when the inner store implements [BulkVectorStore].
	OnBulkUpsert func(ctx context.Context, count int, dur time.Duration, err error)

	// OnBulkDelete is called after every [BulkVectorStore.BulkDelete] with the
	// number of ids in the batch, elapsed duration, and any error.
	// It is only invoked when the inner store implements [BulkVectorStore].
	OnBulkDelete func(ctx context.Context, count int, dur time.Duration, err error)
}

VectorStoreObserver holds optional callbacks fired after each VectorStore operation completes. All fields are optional; nil callbacks are silently skipped. Callbacks are invoked synchronously after the operation returns, so heavy work (e.g. writing to an external metrics system) should spawn a goroutine.

Use NewObservableStore to wrap a VectorStore with these callbacks. Use MergeVectorStoreObservers to compose multiple observers.

func MergeVectorStoreObservers added in v0.5.5

func MergeVectorStoreObservers(observers ...VectorStoreObserver) VectorStoreObserver

MergeVectorStoreObservers returns a VectorStoreObserver that calls each provided observer in order. Nil-field observers in the input are silently ignored; when no input has a callback for a field, that field remains nil, preserving zero-value semantics.

obs := goagent.MergeVectorStoreObservers(logObserver, otelObserver)
store := goagent.NewObservableStore(rawStore, obs)

type WritePolicy

type WritePolicy func(prompt, response Message) []Message

WritePolicy decides what to persist after a completed turn. It is called once per Run after the final answer is produced.

Returning nil discards the turn — nothing is written to long-term memory. Returning a non-nil slice (even an empty one) stores exactly those messages.

This design lets policies both filter and transform: a policy may return the original user+assistant pair unchanged (like StoreAlways), a condensed single message (like a summarising judge agent), or any custom set of messages.

prompt is the user Message that opened the turn (may contain images, documents, or other multimodal blocks). response is the final assistant Message. Both are passed in full so policies can inspect or forward binary content without losing it.

var StoreAlways WritePolicy = func(p, r Message) []Message {
	return []Message{p, r}
}

StoreAlways is a WritePolicy that persists every turn as the original user+assistant message pair. It is the default when WithLongTermMemory is configured without an explicit WritePolicy.

func MinLength

func MinLength(n int) WritePolicy

MinLength returns a WritePolicy that stores the original user+assistant pair only when the combined character count of their text content exceeds n. Returns nil (discard) when the combined length is n or fewer characters. Useful for filtering out trivial exchanges ("ok", "thanks", "go on") that add noise to the long-term store without carrying durable information.

Example

ExampleMinLength shows how MinLength filters out short exchanges from long-term memory storage. A nil return means the turn is discarded; a non-nil slice is stored as-is.

package main

import (
	"fmt"

	"github.com/Germanblandin1/goagent"
)

func main() {
	policy := goagent.MinLength(10)

	// 4 chars total ("hi" + "ok") — below threshold, turn is discarded.
	fmt.Println(policy(goagent.UserMessage("hi"), goagent.AssistantMessage("ok")) == nil)

	// 15 chars total ("hello world" + "fine") — above threshold, returns the
	// default user+assistant pair ready to be passed to LongTermMemory.Store.
	msgs := policy(goagent.UserMessage("hello world"), goagent.AssistantMessage("fine"))
	fmt.Printf("%d messages: %s / %s\n", len(msgs), msgs[0].Role, msgs[1].Role)
}
Output:
true
2 messages: user / assistant

Directories

Path Synopsis
internal
testutil
Package testutil provides shared test helpers for the goagent module.
memory
Package memory provides ShortTermMemory and LongTermMemory implementations for the goagent framework.
policy
Package policy provides Policy implementations for the goagent memory system.
storage
Package storage provides Storage implementations for the goagent memory system.
vector
Package vector provides embeddings, chunking, similarity search, and an in-process vector store for goagent's long-term memory subsystem.
rag module
