agents

v0.1.0
Published: Apr 30, 2026 License: MIT Imports: 9 Imported by: 0

README

agents — Go LLM agents framework

A standalone, stdlib-only Go module providing the building blocks for LLM-driven agents: the 5 classic agent paradigms (Simple, ReAct, Reflection, Plan-and-Solve, FunctionCall), Memory (Working/Episodic/Semantic), RAG, Context engineering, communication protocols (MCP / A2A / ANP), multi-agent orchestration (Pipeline, RoundRobin, RolePlay, StateGraph), Agentic-RL evaluation, and benchmarks (BFCL, GAIA, LLM-as-Judge, Win Rate).

v0.1.0: learning / prototype stage. The API may break between 0.x releases; wait for v1.0 for any stability commitment. Source spec: docs/superpowers/specs/2026-04-27-pkg-llm-agents-design.md in the parent AICS repo.

Install

go get github.com/costa92/aics-core/pkg/llm/agents@v0.1.0

Quick start

package main

import (
	"context"
	"fmt"

	"github.com/costa92/aics-core/pkg/llm/agents"
	"github.com/costa92/aics-core/pkg/llm/agents/builtin"
	"github.com/costa92/aics-core/pkg/llm/agents/llm"
)

// myClient satisfies llm.Client. Plug your own provider — OpenAI,
// Ollama, Anthropic, anything that returns LLM responses.
type myClient struct{}

func (myClient) Generate(ctx context.Context, req llm.GenerateRequest) (llm.GenerateResponse, error) {
	// Canned reply in the ReAct "Thought: / Final:" format so the loop terminates.
	return llm.GenerateResponse{Text: "Thought: 12 * 8 = 96\nFinal: 96", FinishReason: llm.FinishReasonStop}, nil
}
func (myClient) GenerateStream(ctx context.Context, req llm.GenerateRequest) (<-chan llm.StreamChunk, error) {
	ch := make(chan llm.StreamChunk, 1)
	close(ch)
	return ch, nil
}

func main() {
	reg := agents.NewRegistry(builtin.NewCalculator())
	a := agents.NewReActAgent(myClient{}, agents.ReActOptions{Registry: reg, MaxSteps: 4})
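	res, err := a.Run(context.Background(), "What is 12 times 8?")
	if err != nil {
		fmt.Println("run failed:", err)
		return
	}
	fmt.Println(res.Answer) // Result exposes Answer, Trace, and Usage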
}

Packages

Package             Purpose
agents              Agent / Tool interface + 5 paradigm constructors (Simple/ReAct/Reflection/PlanAndSolve/FunctionCall) + Chain + Async + Registry
agents/llm          LLM contract: Client interface, GenerateRequest/Response, Message, Tool, ToolCall, StreamChunk, StreamUsage, FinishReason
agents/builtin      Calculator, MockSearch, NoteTool, TerminalTool
agents/memory       WorkingMemory, EpisodicMemory, SemanticMemory, Manager, MemoryTool
agents/rag          Embedder (HashEmbedder), Chunker, InMemoryStore, RAGSystem, MQE, HyDE
agents/context      GSSC ContextBuilder (Gather → Select → Structure → Compress)
agents/comm         Envelope + Transport (HTTP, Stdio)
agents/comm/mcp     Model Context Protocol client/server (toy spec coverage)
agents/comm/a2a     Agent-to-Agent task lifecycle
agents/comm/anp     Agent Network Protocol (in-memory registry)
agents/orchestrate  Pipeline, RoundRobinChat, RolePlay, StateGraph, Termination
agents/rl           Dataset, Trajectory, Reward, Evaluator, TrainerProxy (no training; Python TRL bridge)
agents/bench        BFCL, GAIA, LLM-as-Judge, Win Rate, Reporter (mini fixtures only)

Local development

This module is developed inside the parent AICS monorepo. To work on it locally with main-repo callers picking up your changes:

# at the AICS repo root
cat > go.work <<'EOF'
go 1.26.0
use (
    .
    ./pkg/llm/agents
)
EOF

go.work is gitignored. CI and external go get callers rely on the require directive in the parent module's go.mod resolving to the published tag.

Status & roadmap

  • ✅ All 9 design phases shipped (see source spec)
  • ✅ 12 packages, ~5000 LOC, stdlib only
  • examples/agents-demo (in parent AICS repo) end-to-end runnable
  • ⏸ v1.0 — pending real-world feedback

License

MIT — see LICENSE.

Documentation

Overview

Package agents implements five Agent paradigms (Simple, ReAct, Reflection, Plan-and-Solve, FunctionCall) and a Tool subsystem (Registry, Chain, AsyncRunner) on top of pkg/llm.

Portability contract

agents is a subpackage of pkg/llm and inherits the same constraints (see pkg/llm/doc.go and docs/CODING-CONVENTIONS.md §1):

  • No imports from internal/*
  • No project-specific pkg/* (no pkg/errors / pkg/store / ...)
  • No business vocabulary (tenant, faq, kb, channel, ...)
  • Sentinel errors only — callers translate to project taxonomy via errors.Is

Quick start

client, _ := llm.NewClient(llm.Config{Provider: "mock"})
agent := agents.NewSimpleAgent(client, agents.SimpleOptions{})
res, err := agent.Run(ctx, "Hello")

See examples/agents-demo for a complete demo of all five agents and the tool subsystem.

Agents

  • SimpleAgent: single-shot LLM forward.
  • ReActAgent: Thought → Action → Observation loop, parses LLM output for "Action: <tool>" / "Args: <json>" / "Final: <answer>" lines.
  • ReflectionAgent: gen → critique → revise loop, exits early on critique containing "APPROVED".
  • PlanAndSolveAgent: plan once (numbered steps), execute each step, synthesize final answer.
  • FunctionCallAgent: single-turn native function-calling. Uses pkg/llm.Tool / resp.ToolCalls instead of prompt parsing and executes the returned tool calls in parallel through AsyncRunner with bounded parallelism.

Observability

Every Agent exposes two observation channels:

  • Options.OnStep func(Step): synchronous callback fired at each trace step. Lowest overhead; use for in-process logging.
  • RunStream(ctx, input) (<-chan StepEvent, error): channel-based streaming for cross-boundary consumers (HTTP SSE, gRPC stream). The channel is closed when the agent finishes; final event has Done=true with either Final or Err set.

Tools

  • Tool interface: Name / Description / Schema / Execute.
  • NewFuncTool: wrap a plain function as a Tool without writing a struct.
  • Registry: name→Tool with sorted List, AsLLMTools, PromptDescription helpers.
  • Chain: pipes Tools sequentially; itself satisfies Tool.
  • AsyncRunner: parallel Task execution with ctx cancellation; per-Task errors do not abort siblings.

Built-in tools (Calculator and MockSearch, among others) live in pkg/llm/agents/builtin.

Index

Constants

This section is empty.

Variables

var (
	ErrMaxStepsExceeded      = errors.New("agents: max steps exceeded")
	ErrToolNotFound          = errors.New("agents: tool not found")
	ErrToolAlreadyRegistered = errors.New("agents: tool already registered")
	ErrPlanningFailed        = errors.New("agents: planning failed")
	ErrParseToolCall         = errors.New("agents: failed to parse tool call")
	ErrEmptyInput            = errors.New("agents: empty input")
)

Sentinel errors. Subpackage stays portable — does not import pkg/errors. Callers in internal/* translate via errors.Is at the boundary.

Functions

func AsLLMTool

func AsLLMTool(t Tool) llm.Tool

AsLLMTool translates an agents.Tool into pkg/llm.Tool so it can be passed to llm.Client via GenerateRequest.Tools (native function-calling).

Types

type Agent

type Agent interface {
	Name() string
	Run(ctx context.Context, input string) (Result, error)
	// RunStream emits trace step events through a channel. The channel is closed
	// when the Agent finishes; the final event always has Done=true with either
	// Final or Err set. Phase 8 SSE handlers are the natural consumer; service
	// layers don't need to write Step→event conversion themselves.
	RunStream(ctx context.Context, input string) (<-chan StepEvent, error)
}

Agent is the minimal contract every Agent implementation satisfies.

type AsyncRunner

type AsyncRunner struct {
	// contains filtered or unexported fields
}

AsyncRunner executes Tasks in parallel.

  • Single Task failures are captured in TaskResult.Err and do not abort other Tasks (no fail-fast).
  • The function-level error is set only when ctx is cancelled / times out.

func NewAsyncRunner

func NewAsyncRunner(maxParallel int) *AsyncRunner

NewAsyncRunner returns an AsyncRunner. maxParallel <= 0 means unlimited.

func (*AsyncRunner) Execute

func (r *AsyncRunner) Execute(ctx context.Context, tasks []Task) ([]TaskResult, error)

Execute fans out tasks, waits for all, returns results sorted by Index.

type Chain

type Chain struct {
	// contains filtered or unexported fields
}

Chain pipes Tools sequentially: each tool's string output becomes the next tool's args (wrapped as {"input": "..."}). Chain itself satisfies Tool, so it can be registered or nested.

func NewChain

func NewChain(name string, tools ...Tool) *Chain

NewChain constructs a Chain. The first tool receives the original args; subsequent tools receive {"input": <prev_output>}.

func (*Chain) Description

func (c *Chain) Description() string

Description implements Tool.

func (*Chain) Execute

func (c *Chain) Execute(ctx context.Context, args json.RawMessage) (string, error)

Execute runs each tool in sequence, piping output → next args as {"input": <prev>}.

func (*Chain) Name

func (c *Chain) Name() string

Name implements Tool.

func (*Chain) Schema

func (c *Chain) Schema() json.RawMessage

Schema implements Tool — delegates to the first tool's schema.

type ExecuteFunc

type ExecuteFunc func(ctx context.Context, args json.RawMessage) (string, error)

ExecuteFunc is the signature used when wrapping a plain function as a Tool.

type FunctionCallAgent

type FunctionCallAgent struct {
	// contains filtered or unexported fields
}

FunctionCallAgent uses native OpenAI-style function-calling: pkg/llm.Tool + resp.ToolCalls instead of prompt-based parsing. Single-turn — emits one LLM call, executes returned tool calls in parallel via AsyncRunner, aggregates outputs as the answer.

Why single-turn: pkg/llm.Message has no ToolCallID field, so we can't feed tool results back to the LLM per OpenAI spec for multi-turn function-calling. Multi-turn would require a pkg/llm enhancement (out of scope for this phase).

func NewFunctionCallAgent

func NewFunctionCallAgent(client llm.Client, opts FunctionCallOptions) *FunctionCallAgent

NewFunctionCallAgent constructs a FunctionCallAgent.

func (*FunctionCallAgent) Name

func (a *FunctionCallAgent) Name() string

Name implements Agent.

func (*FunctionCallAgent) Run

func (a *FunctionCallAgent) Run(ctx context.Context, input string) (Result, error)

Run executes one round of function-calling.

func (*FunctionCallAgent) RunStream

func (a *FunctionCallAgent) RunStream(ctx context.Context, input string) (<-chan StepEvent, error)

RunStream emits step events through a channel; see Agent interface docs.

type FunctionCallOptions

type FunctionCallOptions struct {
	Name         string     // default "function-call"
	Registry     *Registry  // required
	SystemPrompt string     // optional
	MaxParallel  int        // default 4 — caps goroutines spawned per Run
	OnStep       func(Step) // optional
}

FunctionCallOptions configures FunctionCallAgent.

type PlanAndSolveAgent

type PlanAndSolveAgent struct {
	// contains filtered or unexported fields
}

PlanAndSolveAgent: plan once (LLM emits N steps), then execute each step with one LLM call per step, and finally synthesize a final answer.

func NewPlanAndSolveAgent

func NewPlanAndSolveAgent(client llm.Client, opts PlanAndSolveOptions) *PlanAndSolveAgent

NewPlanAndSolveAgent constructs a PlanAndSolveAgent.

func (*PlanAndSolveAgent) Name

func (a *PlanAndSolveAgent) Name() string

Name implements Agent.

func (*PlanAndSolveAgent) Run

func (a *PlanAndSolveAgent) Run(ctx context.Context, input string) (Result, error)

Run executes plan → step1..stepN → synthesize.

func (*PlanAndSolveAgent) RunStream

func (a *PlanAndSolveAgent) RunStream(ctx context.Context, input string) (<-chan StepEvent, error)

RunStream emits step events through a channel; see Agent interface docs.

type PlanAndSolveOptions

type PlanAndSolveOptions struct {
	Name        string     // default "plan-and-solve"
	MaxSteps    int        // default 8 — caps the number of steps the planner may emit
	PlanPrompt  string     // default planPromptDefault
	StepPrompt  string     // default stepPromptDefault
	SynthPrompt string     // default synthPromptDefault
	OnStep      func(Step) // optional, called for each trace step (synchronous)
}

PlanAndSolveOptions configures PlanAndSolveAgent.

type ReActAgent

type ReActAgent struct {
	// contains filtered or unexported fields
}

ReActAgent runs a Thought→Action→Observation loop until the LLM emits a "Final:" line or MaxSteps is exceeded.

Output format the LLM is instructed to emit:

Thought: <reasoning>
Action: <tool_name>
Args: <json>
-- or --
Thought: <reasoning>
Final: <answer>
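
A line-oriented parser for this format might look like the sketch below. It is only an illustration of the documented line prefixes: parseReAct and the parsed struct are hypothetical, and the real agent may handle multi-line values or malformed output differently (for example by returning ErrParseToolCall).

```go
package main

import (
	"fmt"
	"strings"
)

// parsed holds the fields recognised in one LLM reply.
type parsed struct {
	Thought string
	Action  string
	Args    string
	Final   string
}

// parseReAct scans the reply line by line and keeps the text after each
// known prefix. Lines without a known prefix are ignored in this sketch.
func parseReAct(output string) parsed {
	var p parsed
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(line, "Thought:"):
			p.Thought = strings.TrimSpace(strings.TrimPrefix(line, "Thought:"))
		case strings.HasPrefix(line, "Action:"):
			p.Action = strings.TrimSpace(strings.TrimPrefix(line, "Action:"))
		case strings.HasPrefix(line, "Args:"):
			p.Args = strings.TrimSpace(strings.TrimPrefix(line, "Args:"))
		case strings.HasPrefix(line, "Final:"):
			p.Final = strings.TrimSpace(strings.TrimPrefix(line, "Final:"))
		}
	}
	return p
}

func main() {
	step := parseReAct("Thought: need arithmetic\nAction: calculator\nArgs: {\"expr\":\"12*8\"}")
	fmt.Printf("%+v\n", step)

	done := parseReAct("Thought: I have the result\nFinal: 96")
	fmt.Printf("%+v\n", done)
}
```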

func NewReActAgent

func NewReActAgent(client llm.Client, opts ReActOptions) *ReActAgent

NewReActAgent constructs a ReActAgent.

func (*ReActAgent) Name

func (a *ReActAgent) Name() string

Name implements Agent.

func (*ReActAgent) Run

func (a *ReActAgent) Run(ctx context.Context, input string) (Result, error)

Run executes the ReAct loop.

func (*ReActAgent) RunStream

func (a *ReActAgent) RunStream(ctx context.Context, input string) (<-chan StepEvent, error)

RunStream emits step events through a channel; see Agent interface docs.

type ReActOptions

type ReActOptions struct {
	Name         string     // default "react"
	Registry     *Registry  // tools the agent can call (nil = no tools, only Final allowed)
	MaxSteps     int        // default 8 — bound on round-trips before ErrMaxStepsExceeded
	SystemPrompt string     // optional override; default = reactSystemPrompt
	OnStep       func(Step) // optional, called for each trace step (synchronous)
}

ReActOptions configures ReActAgent.

type ReflectionAgent

type ReflectionAgent struct {
	// contains filtered or unexported fields
}

ReflectionAgent: generate → critique → revise → ... up to MaxRounds. If the critique contains "APPROVED" (case-insensitive), stops early.

func NewReflectionAgent

func NewReflectionAgent(client llm.Client, opts ReflectionOptions) *ReflectionAgent

NewReflectionAgent constructs a ReflectionAgent.

func (*ReflectionAgent) Name

func (a *ReflectionAgent) Name() string

Name implements Agent.

func (*ReflectionAgent) Run

func (a *ReflectionAgent) Run(ctx context.Context, input string) (Result, error)

Run executes the gen→critique→revise loop.

func (*ReflectionAgent) RunStream

func (a *ReflectionAgent) RunStream(ctx context.Context, input string) (<-chan StepEvent, error)

RunStream emits step events through a channel; see Agent interface docs.

type ReflectionOptions

type ReflectionOptions struct {
	Name           string     // default "reflection"
	MaxRounds      int        // default 2
	GenPrompt      string     // default genPromptDefault
	CritiquePrompt string     // default critiquePromptDefault
	RevisePrompt   string     // default revisePromptDefault
	OnStep         func(Step) // optional, called for each trace step (synchronous)
}

ReflectionOptions configures ReflectionAgent.

type Registry

type Registry struct {
	// contains filtered or unexported fields
}

Registry holds a name→Tool map. Use one Registry per Agent (or share) so test isolation is preserved (no init()-time global singleton).

func NewRegistry

func NewRegistry(tools ...Tool) *Registry

NewRegistry returns a Registry with the given tools batch-registered. Panics if duplicate names appear in the variadic args (constructor-time safety: prefer panic over silent shadowing).

func (*Registry) AsLLMTools

func (r *Registry) AsLLMTools() []llm.Tool

AsLLMTools returns all tools formatted for llm.GenerateRequest.Tools.

func (*Registry) Get

func (r *Registry) Get(name string) (Tool, bool)

Get returns the tool registered under name.

func (*Registry) List

func (r *Registry) List() []Tool

List returns all tools sorted by Name (deterministic).

func (*Registry) PromptDescription

func (r *Registry) PromptDescription() string

PromptDescription formats all tools as a "- name: description" list suitable for prompt injection. Empty registry returns "(none)\n". Used by ReActAgent internally and exposed for external prompt assembly (e.g., Phase 8 PlannerAgent).
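
For external prompt assembly, the output shape described above can be reproduced roughly as follows. toolInfo and promptDescription are local, hypothetical stand-ins; sorting by name and ending every line with a newline are assumptions based on the List and empty-registry descriptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// toolInfo is a hypothetical stand-in carrying only what the prompt needs.
type toolInfo struct {
	Name        string
	Description string
}

// promptDescription renders one "- name: description" line per tool,
// sorted by name, or "(none)\n" when there are no tools.
func promptDescription(tools []toolInfo) string {
	if len(tools) == 0 {
		return "(none)\n"
	}
	sorted := append([]toolInfo(nil), tools...) // copy so the caller's slice is untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	var b strings.Builder
	for _, t := range sorted {
		fmt.Fprintf(&b, "- %s: %s\n", t.Name, t.Description)
	}
	return b.String()
}

func main() {
	tools := []toolInfo{
		{Name: "search", Description: "Look up a query"},
		{Name: "calculator", Description: "Evaluate arithmetic"},
	}
	fmt.Print("Available tools:\n" + promptDescription(tools))
	fmt.Print("Available tools:\n" + promptDescription(nil))
}
```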

func (*Registry) Register

func (r *Registry) Register(t Tool) error

Register adds a tool. Returns ErrToolAlreadyRegistered if name collides.

type Result

type Result struct {
	Answer string
	Trace  []Step
	Usage  Usage
}

Result carries the final answer plus full trace and accumulated usage.

Trace memory contract (eng review 2026-04-27): Result.Trace is a debug snapshot for synchronous Run() callers and has no size limit. Streaming consumers (RunStream / SSE / gRPC stream) should consume StepEvents only and ignore Result.Trace at the end, since it repeats the same information. Phase 8 SSE handlers should discard res.Trace once the channel closes (events already flushed to client). High-concurrency services that ignore this rule end up holding 50–100 Steps (~4KB each, so 200–400KB) per in-flight handler; at 100 concurrent handlers that is roughly 20–40MB wasted.

type SimpleAgent

type SimpleAgent struct {
	// contains filtered or unexported fields
}

SimpleAgent forwards user input to an llm.Client in a single call. No tools, no loop — the simplest possible Agent.

func NewSimpleAgent

func NewSimpleAgent(client llm.Client, opts SimpleOptions) *SimpleAgent

NewSimpleAgent constructs a SimpleAgent.

func (*SimpleAgent) Name

func (a *SimpleAgent) Name() string

Name implements Agent.

func (*SimpleAgent) Run

func (a *SimpleAgent) Run(ctx context.Context, input string) (Result, error)

Run sends one prompt and returns the LLM's reply as the final answer.

func (*SimpleAgent) RunStream

func (a *SimpleAgent) RunStream(ctx context.Context, input string) (<-chan StepEvent, error)

RunStream emits step events through a channel; see Agent interface docs.

type SimpleOptions

type SimpleOptions struct {
	Name         string     // default "simple"
	SystemPrompt string     // optional, prepended to user input as a system context
	OnStep       func(Step) // optional, called for each trace step (synchronous)
}

SimpleOptions configures SimpleAgent.

type Step

type Step struct {
	Kind    StepKind
	Content string // Thought / Reflection / Plan body
	Tool    string // Action only
	Args    string // Action only — raw JSON string
	Result  string // Observation only
}

Step is one entry in the trace. Kind decides which fields are meaningful.

type StepEvent

type StepEvent struct {
	Step  Step
	Done  bool
	Final *Result
	Err   error
}

StepEvent is the transport unit emitted by RunStream.

  • Done = false: Step is an intermediate event, Final/Err are nil.
  • Done = true: terminal event, exactly one of Final or Err is non-nil.
  • Channel close after the terminal event signals no more events.

type StepKind

type StepKind string

StepKind enumerates trace step types.

const (
	StepThought     StepKind = "thought"
	StepAction      StepKind = "action"
	StepObservation StepKind = "observation"
	StepReflection  StepKind = "reflection"
	StepPlan        StepKind = "plan"
	StepFinal       StepKind = "final"
)

type Task

type Task struct {
	Tool Tool
	Args json.RawMessage
}

Task pairs a Tool with its args for a single async invocation.

type TaskResult

type TaskResult struct {
	Index  int // position in the input tasks slice
	Output string
	Err    error
}

TaskResult carries one Task's outcome.

type Tool

type Tool interface {
	Name() string
	Description() string
	Schema() json.RawMessage
	Execute(ctx context.Context, args json.RawMessage) (string, error)
}

Tool is a capability unit an Agent may invoke.

Description is shown to the LLM (it decides whether to call); Schema describes the parameters as raw JSON Schema (we don't validate it — upstream provider does); Execute does the work and returns a string suitable for either prompt-injection (ReActAgent's Observation) or aggregation (FunctionCallAgent's answer).

func NewFuncTool

func NewFuncTool(name, description string, schema json.RawMessage, fn ExecuteFunc) Tool

NewFuncTool wraps a plain function as a Tool — saves writing a struct with Name/Description/Schema/Execute methods when the tool is trivial.

tool := agents.NewFuncTool(
    "weather",
    "Get weather for a city",
    json.RawMessage(`{"type":"object","properties":{"city":{"type":"string"}}}`),
    func(ctx context.Context, args json.RawMessage) (string, error) {
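        var p struct {
            City string `json:"city"`
        }
        // Reject malformed args instead of silently using an empty city.
        if err := json.Unmarshal(args, &p); err != nil {
            return "", err
        }
        return "sunny in " + p.City, nil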
    },
)

type Usage

type Usage struct {
	LLMCalls int
	Tokens   int
}

Usage tracks LLM cost across a single Run.

Directories

Path         Synopsis
bench        Package bench layers benchmark + judging primitives on top of pkg/llm/agents/rl.
builtin      Package builtin provides ready-to-use Tool implementations for agents.
comm         Package comm provides the shared transport + envelope layer for the agent communication protocols (MCP, A2A, ANP).
comm/a2a     Package a2a is a simplified Agent-to-Agent protocol over HTTP.
comm/anp     Package anp is a minimal Agent Network Protocol implementation.
comm/mcp     Package mcp implements a minimal MCP (Model Context Protocol) client + toy server.
context      Package context implements GSSC (Gather→Select→Structure→Compress) context engineering for LLM agents.
llm          Package llm owns the LLM-provider contract for the agents framework.
memory       Package memory implements 3 in-process Memory types + a Manager + an agents.Tool adapter.
orchestrate  Package orchestrate implements 4 multi-Agent orchestration paradigms on top of pkg/llm/agents.Agent.
rag          Package rag implements Retrieval-Augmented Generation primitives.
rl           Package rl implements the EVALUATION layer of Agentic Reinforcement Learning.
