agentmeter

v0.0.1 (latest) | Published: Mar 10, 2026 | License: Apache-2.0 | Imports: 4 | Imported by: 1

Repository

github.com/erlangb/agentmeter

README ¶

agentmeter

A Go dev tool for inspecting and debugging LLM agent systems directly in your terminal.

When you're building multi-agent pipelines — planner, retriever, executor, whatever — things get hard to follow fast. agentmeter gives you a live, structured view of every step: which agent ran, what it said, what tools it called, how long it took, and what it cost. No dashboards, no cloud, no setup. Just your terminal.

It's framework-agnostic. The core has no SDK dependencies. An adapter for Eino is available as a separate module.

go get github.com/erlangb/agentmeter

[Screenshot: step output]

[Screenshot: history output]

Quick start

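// Imports used below: "os", "time", "github.com/erlangb/agentmeter",
// "github.com/erlangb/agentmeter/pricing", "github.com/erlangb/agentmeter/reasoning".
meter := agentmeter.New(pricing.WithDefaultPricing())
meter.Reset("run-1") // start a new run labelled "run-1"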

start := time.Now()
// ... model call ...
meter.Record(agentmeter.AgentStep{
    Role:      "model",
    Cluster:   agentmeter.ClusterCognitive,
    AgentName: "planner",
    ModelID:   "gpt-4o",
    StartedAt: start,
    Content:   "I'll search for recent Go releases.",
    Usage:     agentmeter.TokenUsage{PromptTokens: 200, CompletionTokens: 30},
})

start = time.Now()
// ... tool call ...
meter.Record(agentmeter.AgentStep{
    Role:      "tool",
    Cluster:   agentmeter.ClusterAction,
    AgentName: "executor",
    StartedAt: start,
    ToolName:  "web_search",
    Content:   "Go 1.23 was released in August 2024...",
})

meter.Finalize() // seal the run and append it to history

// Render the recorded run to the terminal.
printer := reasoning.NewPrinter(os.Stdout)
printer.Print(meter.Snapshot())

Adapters

Each adapter is a separate Go sub-module — pull in only what you need.

Eino

go get github.com/erlangb/agentmeter/adapters/eino

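meter := agentmeter.New(pricing.WithDefaultPricing())
handler := einometer.NewAgentMeterHandler(meter)

runner, err := graph.Compile(ctx, compose.WithGlobalCallbacks(handler))
if err != nil {
    log.Fatal(err)
}
if _, err := runner.Invoke(ctx, "input"); err != nil {
    log.Fatal(err)
}

printer := reasoning.NewPrinter(os.Stdout)
printer.Print(meter.Snapshot())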

Captures: chain start/end → Reset/Finalize, ChatModel output → ClusterCognitive step (with token usage, model ID, tool calls), ToolsNode output → ClusterAction steps, errors → ClusterError.

Core concepts

StepCluster

The fixed vocabulary that drives rendering and counters. Set it on every step.

Constant	Use for	Effect
`ClusterCognitive`	Model inference, thinking	Increments `ModelCalls`
`ClusterAction`	Tool result	Rendered as a tool block
`ClusterMessage`	User-facing output	Rendered as a message
`ClusterError`	Failures, exceptions	Rendered in red

Role is a free-form string you own. The library defines no role constants.
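To make the split between Cluster and Role concrete, here is a minimal sketch of counter gating. It redeclares the cluster constants locally so it compiles without the module. The `step` type and `countModelCalls` helper are hypothetical and only illustrate the documented rule that `ClusterCognitive` steps are the ones counted as model calls.

```go
package main

import "fmt"

// StepCluster and its constants mirror the documented values; they are
// redeclared here so the sketch is self-contained.
type StepCluster string

const (
	ClusterCognitive StepCluster = "cognitive"
	ClusterAction    StepCluster = "action"
	ClusterMessage   StepCluster = "message"
	ClusterError     StepCluster = "error"
)

// step is a hypothetical, trimmed-down stand-in for AgentStep.
type step struct {
	Role    string
	Cluster StepCluster
}

// countModelCalls counts cognitive steps only. Role is never consulted.
func countModelCalls(steps []step) int {
	n := 0
	for _, s := range steps {
		if s.Cluster == ClusterCognitive {
			n++
		}
	}
	return n
}

func main() {
	steps := []step{
		{Role: "model", Cluster: ClusterCognitive},
		{Role: "tool", Cluster: ClusterAction},
		{Role: "rerank", Cluster: ClusterCognitive},
		{Role: "model", Cluster: ClusterError},
	}
	fmt.Println(countModelCalls(steps)) // 2
}
```

The last step carries the role "model" but is not counted, because its cluster is `ClusterError`.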

Run label vs agent name

Field	Scope	Set by
`Snapshot.Label`	The run	`meter.Reset(label)`
`AgentStep.AgentName`	Each step	You, in `Record()`

Label groups the session. AgentName tells you which agent produced each step. Useful when planner, retriever, and executor all share one meter.
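As an illustration of why the two fields are separate, the sketch below tallies steps per agent within a single run. The `step` type and `stepsPerAgent` helper are hypothetical stand-ins, not part of the library. With the real package you would presumably range over `snap.Steps` instead.

```go
package main

import (
	"fmt"
	"sort"
)

// step is a hypothetical stand-in carrying only the field this sketch needs.
type step struct {
	AgentName string
}

// stepsPerAgent counts how many steps each agent contributed to one run.
func stepsPerAgent(steps []step) map[string]int {
	counts := make(map[string]int)
	for _, s := range steps {
		counts[s.AgentName]++
	}
	return counts
}

func main() {
	// One run (one label), three agents sharing the same meter.
	steps := []step{
		{AgentName: "planner"},
		{AgentName: "retriever"},
		{AgentName: "executor"},
		{AgentName: "planner"},
	}
	counts := stepsPerAgent(steps)

	// Sort the names so the output order is stable.
	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s: %d\n", name, counts[name])
	}
}
```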

Snapshot

Snapshot() returns an immutable, mutex-free copy of the current run — safe to log, marshal, or pass around.

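snap := meter.Snapshot()
snap.Label                          // run label set by Reset
snap.Steps                          // []AgentStep, chronological
snap.TokenSummary.ByModel           // token breakdown per model
snap.TokenSummary.EstimatedCostUSD  // cost from CostFunc, computed at Snapshot time
snap.TotalDuration                  // accumulated wall-clock time across all steps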

History

Finalize() seals a run and appends it to a bounded history (default: 100 runs).

history := meter.History() // []Snapshot
printer.PrintHistory(history)
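A bounded history can be pictured with the following sketch. It is not the library's implementation: the `history` type is hypothetical, and dropping the oldest run first once the cap is reached is an assumption, since the documentation only states that the history is bounded.

```go
package main

import "fmt"

// snapshot is a hypothetical stand-in for agentmeter.Snapshot.
type snapshot struct{ Label string }

// history keeps at most max entries. Evicting the oldest entry first is an
// assumption of this sketch, not documented behaviour.
type history struct {
	max  int
	runs []snapshot
}

// add appends a completed run and trims the slice back down to max entries.
func (h *history) add(s snapshot) {
	h.runs = append(h.runs, s)
	if len(h.runs) > h.max {
		// Copy the tail into a fresh slice so evicted entries can be collected.
		h.runs = append([]snapshot(nil), h.runs[len(h.runs)-h.max:]...)
	}
}

func main() {
	h := &history{max: 2}
	for _, label := range []string{"run-1", "run-2", "run-3"} {
		h.add(snapshot{Label: label})
	}
	for _, s := range h.runs {
		fmt.Println(s.Label)
	}
}
```

With a cap of 2, the sketch prints `run-2` and `run-3`.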

Pricing

// Built-in table: OpenAI, Anthropic Claude, Google Gemini
meter := agentmeter.New(pricing.WithDefaultPricing())

// Or roll your own
meter := agentmeter.New(agentmeter.WithCostFunc(func(s agentmeter.TokenSummary) float64 {
    u := s.AggregateTokenUsage()
    return float64(u.PromptTokens)*0.000002 + float64(u.CompletionTokens)*0.000006
}))

Cost is computed lazily at Snapshot() time. See pricing/pricing.go for the full model list.
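The custom cost function above applies one flat rate to the aggregate. A per-model variant might look like the sketch below. The local `TokenUsage` and `TokenSummary` types are trimmed copies of the documented structs, and the model names and rates are placeholders chosen for easy arithmetic, not real prices.

```go
package main

import "fmt"

// TokenUsage and TokenSummary are trimmed local copies of the documented
// structs so the sketch compiles without the module.
type TokenUsage struct {
	PromptTokens     int
	CompletionTokens int
}

type TokenSummary struct {
	ByModel map[string]TokenUsage
}

// rate holds USD prices per one million tokens. The figures are placeholders.
type rate struct{ promptPerM, completionPerM float64 }

var rates = map[string]rate{
	"model-a": {promptPerM: 2.0, completionPerM: 8.0},
	"model-b": {promptPerM: 0.5, completionPerM: 1.5},
}

// perModelCost has the same shape as CostFunc: it takes a summary and
// returns an estimated cost in USD.
func perModelCost(s TokenSummary) float64 {
	var total float64
	for model, u := range s.ByModel {
		r, ok := rates[model]
		if !ok {
			continue // unknown models contribute nothing in this sketch
		}
		total += float64(u.PromptTokens)/1e6*r.promptPerM +
			float64(u.CompletionTokens)/1e6*r.completionPerM
	}
	return total
}

func main() {
	s := TokenSummary{ByModel: map[string]TokenUsage{
		"model-a": {PromptTokens: 1_000_000, CompletionTokens: 500_000},
		"model-b": {PromptTokens: 2_000_000},
	}}
	fmt.Printf("%.2f\n", perModelCost(s)) // 7.00
}
```

Skipping unknown models silently is a design choice of the sketch. Depending on your needs, falling back to a default rate may be preferable so that new models do not show up as free.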

Terminal output

printer := reasoning.NewPrinter(os.Stdout)
printer.Print(snap)           // single run
printer.PrintHistory(history) // all runs with aggregate summary

Examples

Example	What it shows
`go run ./examples/basic/`	One run: model → tool → model
`go run ./examples/history/`	Multi-run history with aggregate cost
`go run ./examples/mixed_pricing/`	GPT-4o and Gemini in one run
`go run ./examples/custom_cost/`	Custom `CostFunc`
`go run ./examples/terminal_output/`	Coloured, plain, and custom terminal styles

Options

meter := agentmeter.New(
    agentmeter.WithCostFunc(costFn),
    agentmeter.WithMaxHistory(50),
)

Testing

go test -v -race ./...
go vet ./...

# eino adapter
cd adapters/eino && go test -v -race ./...

Documentation ¶

Overview ¶

Package agentmeter provides framework-agnostic observability for LLM agent runs. It records reasoning traces, token usage, tool calls, and estimated costs without introducing any external dependencies in the core package.

Constants ¶

const DefaultMaxHistory = 100

DefaultMaxHistory is the default cap on completed-run snapshots retained in History.

Types ¶

type AgentStep ¶

type AgentStep struct {
	// Role is a free descriptive label set by the client (e.g. "model", "tool",
	// "retrieval"). It appears in headers and logs but does not drive rendering
	// or counter gating — use Cluster for that.
	Role StepRole
	// Cluster drives display and counter gating. Set to one of the Cluster*
	// constants. ClusterCognitive increments ModelCalls; ClusterAction renders
	// as a tool result; ClusterMessage and ClusterError have dedicated styles.
	Cluster StepCluster
	// AgentName is the name of the agent that performed this step.
	AgentName string
	// Provider identifies the API provider serving the model
	// (e.g. "openai", "anthropic", "google", "azure_openai", "aws_bedrock").
	// Follows the gen_ai.system semantic convention from OpenTelemetry.
	// Optional: leave empty if the provider is unambiguous from ModelID in your setup.
	// Explicit over inferred — the same ModelID can be served by multiple providers
	// (e.g. "claude-3-5-sonnet" via Anthropic direct, AWS Bedrock, or Vertex AI).
	Provider string
	// ModelID is the model identifier (e.g. "gpt-4o"). Non-empty for cognitive steps.
	ModelID string
	// Content holds the primary text output (model response text or tool result).
	Content string
	// ThinkingContent holds chain-of-thought text for thinking steps.
	ThinkingContent string
	// ToolName is the name of the tool invoked. Non-empty for action steps.
	ToolName string
	// ToolInput is the serialised input passed to the tool.
	ToolInput string
	// ToolCallID is the provider-assigned correlation ID linking a tool call
	// in an assistant message to its corresponding tool result step.
	ToolCallID string
	// ToolCalls is a list of tool-call summaries (e.g. "search({q:\"go\"})") for
	// model steps that dispatched one or more tools. Empty for other step types.
	ToolCalls []string
	// Usage records token counts for model and thinking steps; zero for tool steps.
	Usage TokenUsage
	// StartedAt is the wall-clock time at which this step began.
	// If Duration is zero and StartedAt is non-zero, Record() auto-computes
	// Duration = time.Since(StartedAt). Set this immediately before issuing
	// the model or tool call — before any blocking I/O — for accurate timing.
	StartedAt time.Time
	// Duration is the wall-clock time spent in this step.
	// If set explicitly it takes precedence over StartedAt-based auto-calculation.
	Duration time.Duration
}

AgentStep is an immutable record of one step in an agent run.

type CostFunc ¶

type CostFunc func(TokenSummary) float64

CostFunc computes the estimated cost in USD for a given TokenSummary. Implementations should be pure functions with no side effects. A nil CostFunc is valid; Meter will skip cost computation in that case.

type Meter ¶

type Meter struct {
	// contains filtered or unexported fields
}

Meter records the reasoning trace of one or more agent runs and accumulates a bounded history of completed runs. Safe for concurrent use.

func New ¶

func New(opts ...Option) *Meter

New creates a Meter instance with the supplied options applied. It is safe to call New with no options; all settings have safe defaults.

func (*Meter) ClearHistory ¶

func (t *Meter) ClearHistory()

ClearHistory removes all completed-run snapshots from memory.

func (*Meter) Finalize ¶

func (t *Meter) Finalize()

Finalize marks the current run complete and appends a Snapshot to history. Calling it more than once on the same run is a no-op.

func (*Meter) History ¶

func (t *Meter) History() []Snapshot

History returns a copy of all completed runs in chronological order. Runs are added by Finalize(). The slice is bounded by the MaxHistory option (default 100). Use ClearHistory to reset it.

func (*Meter) Record ¶

func (t *Meter) Record(s AgentStep)

Record appends s to the current run and updates token aggregates. Duration is auto-computed from StartedAt if not set explicitly.

func (*Meter) Reset ¶

func (t *Meter) Reset(label string)

Reset clears the current run state and prepares Meter to record a new run. label is a human-readable identifier for this run (e.g. "conversation-42", "search-agent-run-3"). It is stored in Snapshot.Label and does not affect how individual steps are recorded. Steps carry their own AgentName field. Reset does not clear history.

func (*Meter) Snapshot ¶

func (t *Meter) Snapshot() Snapshot

Snapshot returns a point-in-time, immutable view of the current run. The returned Snapshot is a deep copy: mutating it does not affect Meter. EstimatedCostUSD is computed here if a CostFunc was supplied.
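Why a deep copy matters here: `AgentStep` contains a slice field (`ToolCalls`), so copying the steps slice element by element would still share backing arrays with the meter. The sketch below shows one way to copy such a structure. The `step` type and `deepCopySteps` helper are hypothetical and say nothing about how the library itself performs the copy.

```go
package main

import "fmt"

// step is a hypothetical stand-in with one value field and one slice field.
type step struct {
	Content   string
	ToolCalls []string
}

// deepCopySteps copies each step and gives every copy its own ToolCalls slice.
func deepCopySteps(in []step) []step {
	out := make([]step, len(in))
	for i, s := range in {
		out[i] = s
		out[i].ToolCalls = append([]string(nil), s.ToolCalls...)
	}
	return out
}

func main() {
	live := []step{{Content: "plan", ToolCalls: []string{"search"}}}
	snap := deepCopySteps(live)

	// Mutating the copy leaves the original untouched.
	snap[0].Content = "edited"
	snap[0].ToolCalls[0] = "edited"

	fmt.Println(live[0].Content, live[0].ToolCalls[0]) // plan search
}
```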

type Option ¶

type Option func(*config)

Option is a functional option that configures a Meter instance. Options are applied in the order they are passed to New().

func WithCostFunc ¶

func WithCostFunc(fn CostFunc) Option

WithCostFunc sets the CostFunc used to estimate cost in USD for each run. Pass nil to disable cost estimation (the default).

func WithMaxHistory ¶

func WithMaxHistory(n int) Option

WithMaxHistory sets the maximum number of completed runs to retain in history. n must be positive; zero or negative values are ignored and the default is kept.
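The option behaviour documented above (applied in order, non-positive history sizes ignored) follows the usual functional-options pattern. Here is a minimal sketch of that pattern. The library's real `config` type is unexported, so the `config` struct, `withMaxHistory`, and `newConfig` below are illustrative assumptions.

```go
package main

import "fmt"

const defaultMaxHistory = 100

// config is a hypothetical stand-in for the library's unexported config.
type config struct{ maxHistory int }

type option func(*config)

// withMaxHistory ignores zero or negative values, leaving the current value.
func withMaxHistory(n int) option {
	return func(c *config) {
		if n > 0 {
			c.maxHistory = n
		}
	}
}

// newConfig starts from defaults and applies options in the order given.
func newConfig(opts ...option) config {
	c := config{maxHistory: defaultMaxHistory}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	fmt.Println(newConfig().maxHistory)                                       // 100
	fmt.Println(newConfig(withMaxHistory(50)).maxHistory)                     // 50
	fmt.Println(newConfig(withMaxHistory(-1)).maxHistory)                     // 100
	fmt.Println(newConfig(withMaxHistory(50), withMaxHistory(10)).maxHistory) // 10
}
```

Because options run in order, the last valid value wins when the same option is passed more than once.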

type Snapshot ¶

type Snapshot struct {
	// Label is the run identifier set by Reset(label). It identifies the run,
	// not the agents within it. Individual steps carry their own AgentName.
	Label string
	// Steps is a deep copy of the reasoning steps recorded so far.
	Steps []AgentStep
	// TokenSummary is an aggregated token and cost summary for this run.
	TokenSummary TokenSummary
	// TotalDuration is the accumulated wall-clock time across all steps in this snapshot.
	TotalDuration time.Duration
}

Snapshot is a point-in-time, immutable view of an agent run. It is returned by Meter.Snapshot() and stored in History(). Unlike Meter, a Snapshot carries no mutex and is safe to share freely.

type StepCluster ¶

type StepCluster string

StepCluster is the stable rendering and routing category for a step. It is the library's fixed vocabulary; Role is owned by the client. Use one of the Cluster* constants when recording a step.

const (
	// ClusterCognitive covers model inference and extended-thinking steps.
	// Steps with this cluster increment ModelCalls and are rendered with
	// thinking/plan/call labels.
	ClusterCognitive StepCluster = "cognitive"
	// ClusterAction covers tool calls and external side-effects.
	ClusterAction StepCluster = "action"
	// ClusterMessage covers text responses directed at the user.
	ClusterMessage StepCluster = "message"
	// ClusterError covers failures, exceptions, and retries.
	ClusterError StepCluster = "error"
)

type StepRole ¶

type StepRole string

StepRole is a free-form descriptive label for a reasoning step. The library defines the type but no constants — clients supply whatever role labels make sense for their framework or application (e.g. "model", "tool", "retrieval", "rerank"). Rendering and counter gating are driven by StepCluster, not by Role.

type TokenSummary ¶

type TokenSummary struct {
	// ByModel holds token usage aggregated per model ID.
	ByModel map[string]TokenUsage
	// ModelCalls is the number of ClusterCognitive steps recorded.
	ModelCalls int
	// EstimatedCostUSD is the cost computed by CostFunc at Snapshot time.
	EstimatedCostUSD float64
}

TokenSummary aggregates the token usage of every step in a run, grouped by model ID, together with the model-call count and the estimated cost.

func (TokenSummary) AggregateTokenUsage ¶

func (s TokenSummary) AggregateTokenUsage() TokenUsage

AggregateTokenUsage returns the sum of all per-model token usage.

type TokenUsage ¶

type TokenUsage struct {
	PromptTokens      int
	CompletionTokens  int
	TotalTokens       int
	CachedInputTokens int
	CacheWriteTokens  int
	ReasoningTokens   int
}

TokenUsage captures raw token counts. It is the "source of truth" for what happened in a single model interaction.

func (TokenUsage) Add ¶

func (u TokenUsage) Add(other TokenUsage) TokenUsage

Add returns a new TokenUsage that is the sum of u and other. Neither operand is modified.
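The value semantics of `Add`, and the way a per-model map can be folded into one total, can be sketched as follows. The struct is a local copy of the documented fields. The field-by-field `Add` body and the `aggregate` helper are assumptions about a plausible implementation, not the library's source.

```go
package main

import "fmt"

// TokenUsage is a local copy of the documented struct.
type TokenUsage struct {
	PromptTokens      int
	CompletionTokens  int
	TotalTokens       int
	CachedInputTokens int
	CacheWriteTokens  int
	ReasoningTokens   int
}

// Add uses a value receiver, so u and other are never modified.
func (u TokenUsage) Add(other TokenUsage) TokenUsage {
	return TokenUsage{
		PromptTokens:      u.PromptTokens + other.PromptTokens,
		CompletionTokens:  u.CompletionTokens + other.CompletionTokens,
		TotalTokens:       u.TotalTokens + other.TotalTokens,
		CachedInputTokens: u.CachedInputTokens + other.CachedInputTokens,
		CacheWriteTokens:  u.CacheWriteTokens + other.CacheWriteTokens,
		ReasoningTokens:   u.ReasoningTokens + other.ReasoningTokens,
	}
}

// aggregate folds a per-model map into a single total (hypothetical helper).
func aggregate(byModel map[string]TokenUsage) TokenUsage {
	var total TokenUsage
	for _, u := range byModel {
		total = total.Add(u)
	}
	return total
}

func main() {
	byModel := map[string]TokenUsage{
		"gpt-4o":  {PromptTokens: 200, CompletionTokens: 30, TotalTokens: 230},
		"model-b": {PromptTokens: 100, CompletionTokens: 20, TotalTokens: 120},
	}
	total := aggregate(byModel)
	fmt.Println(total.PromptTokens, total.CompletionTokens, total.TotalTokens) // 300 50 350
}
```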

Directories ¶

Path	Synopsis
adapters/eino (module)
examples/basic (command)	Package main is the minimal agentmeter example.
examples/custom_cost (command)
examples/history (command)	Package main demonstrates multi-run history with agentmeter.
examples/mixed_pricing (command)	Package main demonstrates per-model cost tracking with agentmeter.
examples/terminal_output (command)	Package main demonstrates the terminal rendering options of agentmeter.
pricing
reasoning	Package reasoning provides terminal rendering utilities for agentmeter traces.
