sleipnir

package module
v0.1.1-alpha Latest
Published: Apr 24, 2026 License: Apache-2.0 Imports: 12 Imported by: 0

README

Sleipnir

A minimal Go multi-agent LLM orchestration harness.

In Norse mythology, Sleipnir is Odin's eight-legged horse, on which he rides across sky and sea.

Use of AI

This software is developed with the assistance of AI.

Status

Alpha. Initial implementation complete. API may change before v1.

What it is

  • A thin canonical agent loop in readable Go.
  • First-class parallel sub-agent dispatch: sub-agents are just tools.
  • Typed event stream suitable for SSE, CLI, or test assertions.
  • Middleware hooks for context compression, token accounting, observability, and retries.
  • Provider-agnostic via any-llm-go with runtime model/provider selection per agent.

What it is not

A framework. No DI container, no singletons, no declarative YAML (in v1), no persistence, no retrieval or vector-store opinions. The harness owns no data; conversation history is an argument in and an argument out.

Requirements

  • Go 1.25+

Install

go get sleipnir.dev/sleipnir

Examples

Both examples use a scripted demoProvider and run without an API key.

examples/hello/ — minimal working harness: one agent, one typed tool, one custom event sink. Shows NewHarness, RegisterAgent, NewTypedTool, Run, and the Sink interface.

go run ./examples/hello/

examples/orchestrator/ — parent agent dispatching two registered sub-agents. Demonstrates AgentAsTool, per-agent model routing via MapRouter.Overrides, ExtraTools with OmitExtraToolsInheritance, and an event log that collects then replays all agent lifecycle and tool events.

go run ./examples/orchestrator/

Concepts

A Harness is the central object: you register one or more AgentSpec values into it, then call Run to drive a single agent through the LLM loop. Each loop iteration calls the provider, dispatches any tool calls in parallel (up to MaxParallelTools), and feeds results back until the model returns a final text response. Tools that wrap other agents give you sub-agent orchestration for free — a sub-agent is just a Tool. Events stream out through a Sink as the run progresses, so the caller can observe tokens, tool calls, and agent lifecycle without polling. The LLM provider is never hardcoded: a ModelRouter resolves a ModelConfig (provider, model, sampling parameters) for each agent at call time, so you can route different agents to different models or providers without changing agent code.
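The bounded parallel dispatch described above can be sketched with a counting semaphore. This is a minimal illustration rather than the harness's actual loop: the call type, the dispatch function, and the handler passed to it are hypothetical stand-ins, and only the idea of capping concurrency at MaxParallelTools comes from the text.

```go
package main

import (
	"fmt"
	"sync"
)

// call is a hypothetical stand-in for one tool call requested by the model.
type call struct {
	ID   string
	Name string
}

// dispatch runs fn for every call with at most maxParallel running at once
// and returns the results in the same order as calls.
func dispatch(calls []call, maxParallel int, fn func(call) string) []string {
	if maxParallel < 1 {
		maxParallel = 1
	}
	results := make([]string, len(calls))
	sem := make(chan struct{}, maxParallel) // counting semaphore
	var wg sync.WaitGroup
	for i, c := range calls {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxParallel calls are in flight
		go func(i int, c call) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i] = fn(c) // each goroutine writes its own slot
		}(i, c)
	}
	wg.Wait()
	return results
}

func main() {
	calls := []call{{"1", "search"}, {"2", "fetch"}, {"3", "summarise"}}
	out := dispatch(calls, 2, func(c call) string { return c.Name + ":ok" })
	fmt.Println(out) // prints [search:ok fetch:ok summarise:ok]
}
```

Writing each result into its own slice index keeps the output order stable regardless of which call finishes first.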

API overview

Harness

NewHarness(cfg Config) (*Harness, error) creates a harness from a Config. RegisterAgent(spec AgentSpec) error adds an agent; once Run is called the harness is frozen against further registrations (unless AllowLateRegistration is set). Run(ctx, RunInput) (RunOutput, error) executes the named agent and returns the final text, full message history, aggregate token usage, and the stop reason. Config can be built directly or loaded from environment variables with LoadConfigFromEnv().

Agents

An AgentSpec declares an agent's name, an optional description and input schema (used when the agent is exposed as a tool), a SystemPrompt function that receives the AgentInput and returns a string, a list of Tool values, a list of Middleware values, and per-agent overrides for MaxIterations and MaxParallelTools. Zero values for the iteration and parallelism limits fall back to the harness-wide defaults in Config.
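The zero-value fallback can be pictured as follows. This is a sketch under assumptions: the limits type and the effective function are hypothetical, and the default values are illustrative rather than the documented ones.

```go
package main

import "fmt"

// limits is a local stand-in for the two numeric fields that AgentSpec
// (per-agent overrides) and Config (harness-wide defaults) have in common.
type limits struct {
	MaxIterations    int
	MaxParallelTools int
}

// effective returns the per-agent value when it is non-zero and the
// harness-wide default otherwise.
func effective(agent, defaults limits) limits {
	out := agent
	if out.MaxIterations == 0 {
		out.MaxIterations = defaults.MaxIterations
	}
	if out.MaxParallelTools == 0 {
		out.MaxParallelTools = defaults.MaxParallelTools
	}
	return out
}

func main() {
	defaults := limits{MaxIterations: 20, MaxParallelTools: 4} // illustrative values
	agent := limits{MaxIterations: 5}                          // MaxParallelTools left at zero
	fmt.Println(effective(agent, defaults))                    // prints {5 4}
}
```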

Tools

The Tool interface has two methods: Definition() ToolDefinition (name, description, JSON schema) and Invoke(ctx, json.RawMessage) (ToolResult, error). NewFuncTool constructs a tool from a raw-JSON handler when you want full control over argument parsing. NewTypedTool[T] constructs a tool from a typed handler — the JSON schema is derived automatically from the type parameter via invopop/jsonschema. ToolResult carries a Content string and an IsError bool; a non-nil error return from Invoke signals an infrastructure failure that is isolated from the run, while ToolResult{IsError: true} signals a structured failure that the LLM can read and react to.
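The relationship between a raw-JSON handler and a typed handler can be sketched with a small generic adapter. The code below is not the package's NewTypedTool: ToolResult is redeclared locally so the sketch compiles on its own, typed, addArgs, and addTool are hypothetical names, and schema derivation is left out entirely. Reporting malformed arguments as a structured failure is an assumption made for the sketch.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
)

// ToolResult mirrors the documented struct; it is redeclared locally so the
// sketch compiles without the sleipnir module.
type ToolResult struct {
	Content string
	IsError bool
}

// typed adapts a handler that takes a decoded T into a raw-JSON handler.
// Malformed arguments become a structured failure the model can read.
func typed[T any](fn func(context.Context, T) (ToolResult, error)) func(context.Context, json.RawMessage) (ToolResult, error) {
	return func(ctx context.Context, raw json.RawMessage) (ToolResult, error) {
		var args T
		if err := json.Unmarshal(raw, &args); err != nil {
			return ToolResult{Content: "invalid arguments: " + err.Error(), IsError: true}, nil
		}
		return fn(ctx, args)
	}
}

// addArgs is a hypothetical argument type for the example tool.
type addArgs struct {
	A int `json:"a"`
	B int `json:"b"`
}

// addTool is the adapted handler; T is inferred as addArgs.
var addTool = typed(func(_ context.Context, in addArgs) (ToolResult, error) {
	return ToolResult{Content: fmt.Sprint(in.A + in.B)}, nil
})

func main() {
	res, err := addTool(context.Background(), json.RawMessage(`{"a":2,"b":3}`))
	fmt.Println(res.Content, res.IsError, err) // prints 5 false <nil>
}
```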

Router

ModelRouter is an interface with a single method: Resolve(ctx, agentName) (ModelConfig, error). MapRouter is a ready-made implementation: set Default for the baseline model config and add per-agent entries in Overrides. ModelConfig holds the anyllm.Provider, model string, reasoning effort, temperature, and max output tokens.
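A router with a default and per-agent overrides plausibly resolves as shown below. This is a sketch, not the package's MapRouter: ModelConfig is trimmed to a single field, mapRouter and modelFor are local names, the model names are made up, and the lookup order (override first, then default) is an assumption based on the description above.

```go
package main

import (
	"context"
	"fmt"
)

// ModelConfig is a trimmed local stand-in; the documented struct also carries
// an anyllm.Provider, reasoning effort, temperature, and max output tokens.
type ModelConfig struct {
	Model string
}

// mapRouter mimics the documented MapRouter fields.
type mapRouter struct {
	Default   ModelConfig
	Overrides map[string]ModelConfig
}

// Resolve returns the override for agentName when one exists and the default
// otherwise. Reading a nil map is safe in Go, so Overrides may be left unset.
func (r mapRouter) Resolve(_ context.Context, agentName string) (ModelConfig, error) {
	if cfg, ok := r.Overrides[agentName]; ok {
		return cfg, nil
	}
	return r.Default, nil
}

// modelFor is a convenience wrapper used by main.
func modelFor(r mapRouter, agent string) string {
	cfg, _ := r.Resolve(context.Background(), agent)
	return cfg.Model
}

func main() {
	r := mapRouter{
		Default:   ModelConfig{Model: "small-model"}, // hypothetical model names
		Overrides: map[string]ModelConfig{"researcher": {Model: "large-model"}},
	}
	fmt.Println(modelFor(r, "researcher"), modelFor(r, "writer")) // prints large-model small-model
}
```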

Events

All events implement the Event interface. The full set is: AgentStartEvent, AgentEndEvent, TokenEvent, ThinkingEvent, ToolCallEvent, ToolResultEvent, QuestionEvent, TodoEvent, and ErrorEvent. Every event that carries an agent name also implements AgentNamer (EventAgent() string), which lets sinks filter by agent. The Sink interface has a single method, Send(Event). BufferedSink is a channel-backed implementation: create one with NewBufferedSink(ctx, size), pass it in RunInput.Events, and read events from Events(). Dropped events (when the channel is full) are counted by DroppedCount().
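A channel-backed sink that counts drops instead of blocking can be sketched like this. The code is a local illustration, not the package's BufferedSink: Event is reduced to a placeholder, the constructor omits the context argument that NewBufferedSink takes, and the non-blocking select is an assumption about how "dropped when the channel is full" is achieved.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Event is a placeholder; the documented interface has unexported methods.
type Event any

// bufferedSink is a local sketch of a drop-counting sink.
type bufferedSink struct {
	ch      chan Event
	dropped atomic.Int64
}

func newBufferedSink(size int) *bufferedSink {
	return &bufferedSink{ch: make(chan Event, size)}
}

// Send never blocks: when the buffer is full the event is dropped and counted.
func (s *bufferedSink) Send(e Event) {
	select {
	case s.ch <- e:
	default:
		s.dropped.Add(1)
	}
}

func (s *bufferedSink) Events() <-chan Event { return s.ch }
func (s *bufferedSink) DroppedCount() int64  { return s.dropped.Load() }

func main() {
	sink := newBufferedSink(2)
	for i := 0; i < 5; i++ {
		sink.Send(i) // nobody is reading yet, so only two events fit
	}
	fmt.Println(len(sink.Events()), sink.DroppedCount()) // prints 2 3
}
```

A sink of this shape keeps a slow consumer from stalling the agent loop, at the cost of losing events, which is why checking the drop count after a run is worthwhile.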

Errors

Harness lifecycle: ErrHarnessFrozen, ErrAgentNotRegistered, ErrToolNameCollision.

Run budget: ErrIterationBudget (loop exhausted MaxIterations), ErrTokenBudget (cumulative usage exceeded RunInput.MaxTotalTokens). Both are returned alongside a partial RunOutput.
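Because budget errors arrive together with a partial RunOutput, a caller can match the sentinel and still use what was produced. The sketch below uses local stand-ins (errIterationBudget, runOutput, run, describe) rather than the real API, and it assumes the error may be wrapped, which errors.Is handles either way.

```go
package main

import (
	"errors"
	"fmt"
)

// errIterationBudget stands in for sleipnir.ErrIterationBudget.
var errIterationBudget = errors.New("agent loop exhausted MaxIterations")

// runOutput is a trimmed stand-in for RunOutput.
type runOutput struct {
	Text    string
	Stopped string
}

// run is a fake Run that always exhausts its budget and returns partial
// output together with a wrapped sentinel error.
func run() (runOutput, error) {
	partial := runOutput{Text: "draft so far", Stopped: "iteration_budget"}
	return partial, fmt.Errorf("agent %q: %w", "writer", errIterationBudget)
}

// describe shows the caller-side check: match the sentinel with errors.Is and
// keep using the partial output instead of discarding it.
func describe(out runOutput, err error) string {
	switch {
	case err == nil:
		return "done: " + out.Text
	case errors.Is(err, errIterationBudget):
		return "partial (" + out.Stopped + "): " + out.Text
	default:
		return "failed: " + err.Error()
	}
}

func main() {
	fmt.Println(describe(run())) // prints partial (iteration_budget): draft so far
}
```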

Human-in-the-loop: ErrHITLTimeout, ErrHITLCancelled.

Middleware: ErrCompactionFailed (compactor failed; run proceeds uncompacted).

Sub-packages

Package Purpose
sleipnir.dev/sleipnir/middleware/retry DefaultRetryPolicy — exponential backoff on transient LLM errors
sleipnir.dev/sleipnir/middleware/slogobs SlogObserver — structured log observer for all LLM and tool events
sleipnir.dev/sleipnir/middleware/accounting TokenAccountant — per-agent and aggregate token tracking
sleipnir.dev/sleipnir/middleware/compact Compactor — context-window compaction when history approaches the model limit
sleipnir.dev/sleipnir/mcpadapter Load tools from an MCP server into any agent
sleipnir.dev/sleipnir/sleipnirtest StubProvider, EventCollector, StubTool — test helpers for agent behaviour

License

Apache-2.0.

Documentation

Overview

Package sleipnir provides a multi-agent LLM orchestration harness. It is a thin canonical agent loop with first-class parallel sub-agents, a typed event stream, and a provider-agnostic LLM boundary.

Index

Constants

This section is empty.

Variables

var (
	ErrHarnessFrozen      = errors.New("sleipnir: harness is frozen; RegisterAgent called after first Run with AllowLateRegistration disabled")
	ErrAgentNotRegistered = errors.New("sleipnir: agent not registered")
	ErrToolNameCollision  = errors.New("sleipnir: duplicate tool name detected")
)

Harness lifecycle errors.

var (
	ErrIterationBudget = errors.New("sleipnir: agent loop exhausted MaxIterations without a final text response")
	ErrTokenBudget     = errors.New("sleipnir: cumulative token usage exceeded RunInput.MaxTotalTokens")
)

Run budget errors.

var (
	ErrHITLTimeout   = errors.New("sleipnir: HITL handler did not return within SLEIPNIR_HITL_TIMEOUT")
	ErrHITLCancelled = errors.New("sleipnir: HITL handler returned because context was cancelled")
)

Human-in-the-loop errors.

var (
	ErrCompactionFailed = errors.New("sleipnir: compactor middleware failed; run proceeds uncompacted")
)

Middleware errors.

Functions

func WithCompactStore

func WithCompactStore(ctx context.Context, cs CompactStore) context.Context

WithCompactStore stores cs in ctx. Called by the harness at the start of every runLoop invocation; also usable in tests for unit-testing middleware that reads the store.

Types

type AgentEndEvent

type AgentEndEvent struct {
	AgentName string
	Usage     Usage
	Stopped   StopReason
}

func (AgentEndEvent) EventAgent

func (e AgentEndEvent) EventAgent() string

type AgentInfo

type AgentInfo struct {
	Name       string
	ParentName string // "" for top-level agents
	Depth      int    // 0 for top-level
	IsSubAgent bool
}

type AgentInput

type AgentInput struct {
	Prompt  string
	Input   json.RawMessage
	History []anyllm.Message
}

type AgentNamer

type AgentNamer interface {
	EventAgent() string
}

AgentNamer is exported so that sleipnirtest can use it.

type AgentSpec

type AgentSpec struct {
	Name             string
	Description      string
	InputSchema      map[string]any
	SystemPrompt     func(AgentInput) string
	Tools            []Tool
	Middlewares      []Middleware
	MaxIterations    int
	MaxParallelTools int
}

type AgentStartEvent

type AgentStartEvent struct{ AgentName, ParentName string }

func (AgentStartEvent) EventAgent

func (e AgentStartEvent) EventAgent() string

type BaseMiddleware

type BaseMiddleware struct{}

BaseMiddleware can be embedded in any struct to satisfy the Middleware marker interface. External packages (e.g. sleipnirtest, application code) must embed this to produce valid Middleware values.

type BufferedSink

type BufferedSink struct {
	// contains filtered or unexported fields
}

func NewBufferedSink

func NewBufferedSink(ctx context.Context, size int) *BufferedSink

func (*BufferedSink) DroppedCount

func (s *BufferedSink) DroppedCount() int64

func (*BufferedSink) Events

func (s *BufferedSink) Events() <-chan Event

func (*BufferedSink) Send

func (s *BufferedSink) Send(e Event)

type CachedRouter

type CachedRouter struct {
	// contains filtered or unexported fields
}

CachedRouter wraps a ModelRouter and caches resolved ModelConfig values keyed by agent name. The cache persists for the lifetime of the CachedRouter instance. Construct a new CachedRouter per Run to get per-Run scoping.
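The caching behaviour can be illustrated with a small wrapper. This is a sketch of the idea only: cachedRouter, newCachedRouter, and countingRouter are local names, ModelConfig is trimmed to one field, and the choices to guard the map with a mutex and to leave errors uncached are assumptions.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

type ModelConfig struct{ Model string }

type ModelRouter interface {
	Resolve(ctx context.Context, agentName string) (ModelConfig, error)
}

// countingRouter is a hypothetical inner router that records how often it is asked.
type countingRouter struct{ calls int }

func (r *countingRouter) Resolve(_ context.Context, agentName string) (ModelConfig, error) {
	r.calls++
	return ModelConfig{Model: "model-for-" + agentName}, nil
}

// cachedRouter is a local sketch of the caching idea, not the package's type.
type cachedRouter struct {
	inner ModelRouter
	mu    sync.Mutex
	cache map[string]ModelConfig
}

func newCachedRouter(inner ModelRouter) *cachedRouter {
	return &cachedRouter{inner: inner, cache: make(map[string]ModelConfig)}
}

// Resolve consults the inner router once per agent name; errors are not cached.
func (r *cachedRouter) Resolve(ctx context.Context, agentName string) (ModelConfig, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if cfg, ok := r.cache[agentName]; ok {
		return cfg, nil
	}
	cfg, err := r.inner.Resolve(ctx, agentName)
	if err != nil {
		return ModelConfig{}, err
	}
	r.cache[agentName] = cfg
	return cfg, nil
}

func main() {
	inner := &countingRouter{}
	router := newCachedRouter(inner) // one per Run gives per-Run scoping
	for i := 0; i < 3; i++ {
		router.Resolve(context.Background(), "writer")
	}
	fmt.Println(inner.calls) // prints 1
}
```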

func NewCachedRouter

func NewCachedRouter(inner ModelRouter) *CachedRouter

func (*CachedRouter) Resolve

func (r *CachedRouter) Resolve(ctx context.Context, agentName string) (ModelConfig, error)

type CompactStore

type CompactStore interface {
	GetWatermark(agentName string) int // returns 0 if not set
	SetWatermark(agentName string, n int)
}

CompactStore tracks stable compaction watermarks per agent, scoped to one run. Stored in ctx by the harness; consumed by the Compactor middleware.

func CompactStoreFrom

func CompactStoreFrom(ctx context.Context) (CompactStore, bool)

CompactStoreFrom returns the CompactStore added by the harness, or (nil, false). Used by middleware/compact — not needed by most callers.
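The pair WithCompactStore and CompactStoreFrom follows the usual Go pattern of carrying a value in a context under an unexported key. The sketch below reproduces that pattern with local names (withStore, storeFrom, storeKey, mapStore, roundTrip); the in-memory store is a hypothetical test helper and is not part of the package.

```go
package main

import (
	"context"
	"fmt"
)

// CompactStore mirrors the documented interface.
type CompactStore interface {
	GetWatermark(agentName string) int
	SetWatermark(agentName string, n int)
}

// mapStore is a hypothetical in-memory implementation for tests.
// It is not safe for concurrent use.
type mapStore map[string]int

func (m mapStore) GetWatermark(agentName string) int    { return m[agentName] } // 0 if not set
func (m mapStore) SetWatermark(agentName string, n int) { m[agentName] = n }

// storeKey is an unexported key type, so no other package can collide with it.
type storeKey struct{}

func withStore(ctx context.Context, cs CompactStore) context.Context {
	return context.WithValue(ctx, storeKey{}, cs)
}

func storeFrom(ctx context.Context) (CompactStore, bool) {
	cs, ok := ctx.Value(storeKey{}).(CompactStore)
	return cs, ok
}

// roundTrip writes a watermark through a context-carried store and reads it
// back, the way a middleware under test would.
func roundTrip(agent string, n int) (int, bool) {
	ctx := withStore(context.Background(), mapStore{})
	cs, ok := storeFrom(ctx)
	if !ok {
		return 0, false
	}
	cs.SetWatermark(agent, n)
	return cs.GetWatermark(agent), true
}

func main() {
	fmt.Println(roundTrip("writer", 12)) // prints 12 true
	_, ok := storeFrom(context.Background())
	fmt.Println(ok) // prints false
}
```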

type Config

type Config struct {
	DefaultMaxIterations    int
	DefaultMaxParallelTools int
	CompactThreshold        float64 // 0.75 = 75% of model context; validated (0.0, 1.0]
	CompactModel            string
	MaxLLMRetries           int
	HITLTimeout             time.Duration
	// LogLevel controls the minimum log level for the harness logger.
	//
	// NOTE: this field is only honoured when set via [LoadConfigFromEnv] or
	// the SLEIPNIR_LOG_LEVEL environment variable. Direct struct construction
	// cannot set the internal logLevelSet flag, so [resolveDefaults] will
	// silently override any value set here to [slog.LevelInfo].
	LogLevel              slog.Level
	LogFormat             string // "json" | "text"
	Middlewares           []Middleware
	AllowLateRegistration bool
	// contains filtered or unexported fields
}

Config holds harness-wide settings. All fields are optional; zero values are replaced with documented defaults by [resolveDefaults].

Note on [Config.Middlewares]: nil and empty slice are both valid; the harness treats them identically.

func LoadConfigFromEnv

func LoadConfigFromEnv() (Config, error)

LoadConfigFromEnv reads SLEIPNIR_* environment variables and returns a fully resolved Config. Missing variables fall back to documented defaults. No files are read; no globals are mutated.

type ContextRewriter

type ContextRewriter interface {
	Middleware
	RewriteBeforeLLMCall(ctx context.Context, req *LLMRequest) error
}

ContextRewriter can mutate an LLMRequest immediately before it is sent to the provider. Mutations are visible to the provider but not persisted to the conversation history. Returning a non-nil error emits an ErrorEvent; the request is sent as-is.

type ErrorEvent

type ErrorEvent struct {
	AgentName string
	Err       error
}

func (ErrorEvent) EventAgent

func (e ErrorEvent) EventAgent() string

type Event

type Event interface {
	// contains filtered or unexported methods
}

type HITLHandler

type HITLHandler interface {
	AskUser(ctx context.Context, agent, question, contextBlurb string) (string, error)
}

type Harness

type Harness struct {
	// contains filtered or unexported fields
}

func NewHarness

func NewHarness(cfg Config) (*Harness, error)

func (*Harness) AgentAsTool

func (h *Harness) AgentAsTool(name string) (Tool, error)

AgentAsTool returns a Tool that, when dispatched by the harness, recursively runs the named sub-agent. The tool's Definition is derived from the registered AgentSpec (Name, Description, InputSchema).

Returns ErrAgentNotRegistered if no agent with that name has been registered.

func (*Harness) RegisterAgent

func (h *Harness) RegisterAgent(spec AgentSpec) error

RegisterAgent registers an agent with the harness. It returns ErrHarnessFrozen if Run has already been called and AllowLateRegistration is disabled.

func (*Harness) Run

func (h *Harness) Run(ctx context.Context, in RunInput) (RunOutput, error)

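Run executes the agent named in RunInput.AgentName and returns the final text, full message history, aggregate token usage, and the stop reason.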

type LLMObserver

type LLMObserver interface {
	Middleware
	OnLLMCall(ctx context.Context, req *LLMRequest, resp *LLMResponse, err error)
}

LLMObserver is called after every provider call, whether it succeeds or fails. It must not mutate req or resp.

type LLMRequest

type LLMRequest struct {
	Agent    AgentInfo
	Messages []anyllm.Message
	Tools    []anyllm.Tool // converted from []ToolDefinition via toolsToAnyllm()
	Model    ModelConfig
}

LLMRequest is the request passed to the LLM. Middleware may mutate any field except Agent, which must be left unchanged.

type LLMResponse

type LLMResponse struct {
	Agent   AgentInfo
	Message anyllm.Message // the assistant turn (Role == RoleAssistant)
	Usage   Usage          // converted from anyllm.Usage, TotalTokens computed
}

LLMResponse wraps the provider response and carries AgentInfo for observers.

type MapRouter

type MapRouter struct {
	Default   ModelConfig
	Overrides map[string]ModelConfig
}

func (MapRouter) Resolve

func (r MapRouter) Resolve(_ context.Context, agentName string) (ModelConfig, error)

type Middleware

type Middleware interface {
	// contains filtered or unexported methods
}

Middleware is a marker interface for middleware values. A middleware may implement any subset of ContextRewriter, LLMObserver, ToolObserver, and RetryPolicy. The harness discovers capabilities via type assertion.
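Capability discovery by type assertion, combined with a marker interface satisfied through embedding, can be sketched as follows. Everything here is a simplified local model: the marker method name isMiddleware is hypothetical, the capability interfaces have reduced signatures, and logger and capabilities are illustrative names.

```go
package main

import (
	"fmt"
	"time"
)

// Middleware is a marker: its only method is unexported, so in a real
// multi-package layout other packages can satisfy it only by embedding
// BaseMiddleware.
type Middleware interface{ isMiddleware() }

type BaseMiddleware struct{}

func (BaseMiddleware) isMiddleware() {}

// Simplified capability interfaces; the documented ones take a context and
// request/response pointers.
type LLMObserver interface {
	Middleware
	OnLLMCall(agent string)
}

type RetryPolicy interface {
	Middleware
	ShouldRetry(attempt int, err error) (bool, time.Duration)
}

// logger is a hypothetical middleware that only observes LLM calls.
type logger struct{ BaseMiddleware }

func (logger) OnLLMCall(agent string) {}

// capabilities reports which optional interfaces m implements.
func capabilities(m Middleware) []string {
	var caps []string
	if _, ok := m.(LLMObserver); ok {
		caps = append(caps, "LLMObserver")
	}
	if _, ok := m.(RetryPolicy); ok {
		caps = append(caps, "RetryPolicy")
	}
	return caps
}

func main() {
	fmt.Println(capabilities(logger{}))         // prints [LLMObserver]
	fmt.Println(capabilities(BaseMiddleware{})) // prints []
}
```

The benefit of this shape is that a middleware implements only the hooks it needs, and adding a new capability interface later does not break existing middleware values.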

type ModelConfig

type ModelConfig struct {
	Provider        anyllm.Provider
	Model           string
	ReasoningEffort anyllm.ReasoningEffort
	Temperature     *float64
	MaxOutputTokens *int
}

type ModelRouter

type ModelRouter interface {
	Resolve(ctx context.Context, agentName string) (ModelConfig, error)
}

type QuestionEvent

type QuestionEvent struct{ AgentName, QuestionID, Question string }

func (QuestionEvent) EventAgent

func (e QuestionEvent) EventAgent() string

type RetryPolicy

type RetryPolicy interface {
	Middleware
	ShouldRetry(ctx context.Context, attempt int, err error) (retry bool, backoff time.Duration)
}

RetryPolicy decides whether a failed provider call should be retried and how long to wait before the next attempt. The first RetryPolicy in the chain wins; subsequent policies are not consulted. attempt is 0-based: 0 on the first failure, 1 on the second, and so on.
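A policy with the documented ShouldRetry signature and 0-based attempts might look like the sketch below. It is not the package's DefaultRetryPolicy: backoffPolicy, its fields, and the schedule helper are hypothetical, and the doubling schedule is one common choice among several.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// backoffPolicy is a hypothetical policy used only for illustration.
type backoffPolicy struct {
	MaxAttempts int           // number of failures that will be retried
	Base        time.Duration // delay after the first failure
}

// ShouldRetry follows the documented signature. attempt is 0-based, so the
// delays are Base, 2*Base, 4*Base, and so on. Cancellation is never retried.
func (p backoffPolicy) ShouldRetry(ctx context.Context, attempt int, err error) (bool, time.Duration) {
	if ctx.Err() != nil || errors.Is(err, context.Canceled) {
		return false, 0
	}
	if attempt >= p.MaxAttempts {
		return false, 0
	}
	return true, p.Base << attempt
}

// schedule lists the delays, in milliseconds, that p grants to a call that
// keeps failing with a transient error.
func schedule(p backoffPolicy) []int64 {
	var delays []int64
	err := errors.New("transient failure")
	for attempt := 0; ; attempt++ {
		retry, wait := p.ShouldRetry(context.Background(), attempt, err)
		if !retry {
			return delays
		}
		delays = append(delays, wait.Milliseconds())
	}
}

func main() {
	p := backoffPolicy{MaxAttempts: 3, Base: 100 * time.Millisecond}
	fmt.Println(schedule(p)) // prints [100 200 400]
}
```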

type RunInput

type RunInput struct {
	AgentName                 string
	Prompt                    string
	Input                     any
	History                   []anyllm.Message
	Router                    ModelRouter
	Events                    Sink
	HITL                      HITLHandler
	ExtraTools                []Tool
	OmitExtraToolsInheritance bool // false (default) = sub-agents inherit ExtraTools
	MaxTotalTokens            int
}

type RunOutput

type RunOutput struct {
	Text     string
	Messages []anyllm.Message
	Usage    Usage
	Stopped  StopReason
}

type Sink

type Sink interface{ Send(Event) }

type StopReason

type StopReason string
const (
	StopDone             StopReason = "done"
	StopIterationBudget  StopReason = "iteration_budget"
	StopTokenBudget      StopReason = "token_budget"
	StopHITLTimeout      StopReason = "hitl_timeout"
	StopHITLCancelled    StopReason = "hitl_cancelled"
	StopContextCancelled StopReason = "context_cancelled"
	StopContextTimeout   StopReason = "context_timeout"
)

type ThinkingEvent

type ThinkingEvent struct{ AgentName, Text string }

func (ThinkingEvent) EventAgent

func (e ThinkingEvent) EventAgent() string

type TodoEvent

type TodoEvent struct {
	AgentName, ToolCallID string
	Todos                 []TodoItem
}

func (TodoEvent) EventAgent

func (e TodoEvent) EventAgent() string

type TodoItem

type TodoItem struct {
	ID     string     `json:"id"`
	Text   string     `json:"text"`
	Status TodoStatus `json:"status"`
}

type TodoStatus

type TodoStatus string
const (
	TodoPending    TodoStatus = "pending"
	TodoInProgress TodoStatus = "in_progress"
	TodoDone       TodoStatus = "done"
)

type TokenEvent

type TokenEvent struct{ AgentName, Text string }

func (TokenEvent) EventAgent

func (e TokenEvent) EventAgent() string

type Tool

type Tool interface {
	Definition() ToolDefinition
	Invoke(ctx context.Context, args json.RawMessage) (ToolResult, error)
}

Tool defines the behaviour of an agent tool.

There are two distinct failure modes:

  • Infrastructure failure that the LLM cannot meaningfully interpret: Invoke returns a non-nil error. The harness wraps it as ToolResult{IsError: true, Content: "tool execution failed: "+err.Error()}, emits an ErrorEvent, and continues the run.
  • Structured failure that the LLM can read and react to: Invoke returns (ToolResult{IsError: true}, nil), and the result is passed to the LLM as a tool_result.
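The two failure modes can be told apart purely from the pair of values that Invoke returns. The sketch below models that decision with local stand-ins: normalise and the reportErr hook are hypothetical (the hook stands in for emitting an ErrorEvent), and errRefused is a made-up error value.

```go
package main

import (
	"errors"
	"fmt"
)

// ToolResult is redeclared locally so the sketch compiles on its own.
type ToolResult struct {
	Content string
	IsError bool
}

// errRefused is a made-up infrastructure error.
var errRefused = errors.New("connection refused")

// normalise turns the two return values of Invoke into the single result that
// is fed back to the model. reportErr is a hypothetical hook standing in for
// emitting an ErrorEvent.
func normalise(res ToolResult, err error, reportErr func(error)) ToolResult {
	if err != nil {
		reportErr(err)
		return ToolResult{IsError: true, Content: "tool execution failed: " + err.Error()}
	}
	return res // structured failures and successes pass through unchanged
}

func main() {
	var reported []error
	report := func(err error) { reported = append(reported, err) }

	infra := normalise(ToolResult{}, errRefused, report)
	structured := normalise(ToolResult{Content: "city not found", IsError: true}, nil, report)

	fmt.Println(infra.Content)      // prints tool execution failed: connection refused
	fmt.Println(structured.Content) // prints city not found
	fmt.Println(len(reported))      // prints 1
}
```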

func AskUserTool

func AskUserTool(handler HITLHandler) Tool

AskUserTool returns a Tool that, when invoked, emits a QuestionEvent and then delegates to the provided HITLHandler. Pass via RunInput.ExtraTools. If handler is nil, Invoke returns ToolResult{IsError: true} immediately.

func NewFuncTool

func NewFuncTool(name, desc string, schema map[string]any, fn func(context.Context, json.RawMessage) (ToolResult, error)) Tool

func NewTypedTool

func NewTypedTool[T any](name, desc string, fn func(context.Context, T) (ToolResult, error)) (Tool, error)

func TodoReadTool

func TodoReadTool() Tool

TodoReadTool returns a Tool that reads the current todo list for the invoking agent. Returns "[]" if no list has been written yet. Pass via RunInput.ExtraTools.

func TodoWriteTool

func TodoWriteTool() Tool

TodoWriteTool returns a Tool that replaces the current todo list for the invoking agent and emits a TodoEvent. The list is keyed by agent name: state persists across turns, but two concurrent calls from the same agent are not isolated from each other. Pass via RunInput.ExtraTools.

type ToolCall

type ToolCall struct {
	Agent      AgentInfo
	ToolCallID string
	ToolName   string
	Args       json.RawMessage
}

ToolCall carries the agent identity and call parameters for ToolObserver. Middleware must not mutate Agent fields. ToolCall is distinct from anyllm.ToolCall; it carries harness-level agent identity.

type ToolCallEvent

type ToolCallEvent struct {
	AgentName, ToolCallID, ToolName string
	Args                            json.RawMessage
}

func (ToolCallEvent) EventAgent

func (e ToolCallEvent) EventAgent() string

type ToolDefinition

type ToolDefinition struct {
	Name        string
	Description string
	InputSchema map[string]any
}

type ToolObserver

type ToolObserver interface {
	Middleware
	OnToolCall(ctx context.Context, call *ToolCall, result *ToolResult, err error)
}

ToolObserver is called after every tool invocation, including sub-agent calls. It must not mutate call or result.

type ToolResult

type ToolResult struct {
	Content string
	IsError bool
}

type ToolResultEvent

type ToolResultEvent struct {
	AgentName, ToolCallID string
	Result                string
	IsError               bool
}

func (ToolResultEvent) EventAgent

func (e ToolResultEvent) EventAgent() string

type Usage

type Usage struct {
	InputTokens  int64
	OutputTokens int64
	TotalTokens  int64
}

Directories

Path Synopsis
examples
hello command
hello demonstrates the minimal Sleipnir setup: one agent, one typed tool, and a custom event sink.
orchestrator command
orchestrator demonstrates a parent agent dispatching two sub-agents via AgentAsTool, with per-agent model routing, ExtraTools isolation, and event collection for replay.
internal
Package mcpadapter bridges the MCP SDK with Sleipnir tools.
middleware
