graymatter

package module
v0.2.1 Latest
Published: Apr 11, 2026 License: MIT Imports: 7 Imported by: 0

README

GrayMatter


Three lines of code to give your AI agents persistent memory.
Single Go binary. Zero infra. Works with Claude Code or any tool that calls the Anthropic Messages API.

Free, offline, no account required.


go get github.com/angelnicolasc/graymatter
mem := graymatter.New(".graymatter")
mem.Remember("agent", "user prefers bullet points, hates long intros")
ctx, _ := mem.Recall("agent", "how should I format this response?")
// ["user prefers bullet points, hates long intros"]

Why

Every AI agent today is stateless by default. Every run starts from zero.

Mem0, Zep, and Supermemory solve this, but in Python or TypeScript, and they require a server. Go has no production-ready, embeddable, zero-dependency memory layer for agents. That gap is GrayMatter.

~90% token reduction at 100 sessions versus full-history injection. No Docker. No Redis. No Python. No API key required for storage.


Install

Binary (recommended):

# macOS (Apple Silicon)
curl -sSL -o graymatter.tar.gz https://github.com/angelnicolasc/graymatter/releases/download/v0.2.1/graymatter_0.2.1_darwin_arm64.tar.gz
tar -xzf graymatter.tar.gz
sudo mv graymatter /usr/local/bin/

# Windows (PowerShell)
iwr https://github.com/angelnicolasc/graymatter/releases/download/v0.2.1/graymatter_0.2.1_windows_amd64.zip -OutFile graymatter.zip
Expand-Archive graymatter.zip -DestinationPath .\graymatter_cli

Go install:

go install github.com/angelnicolasc/graymatter/cmd/graymatter@latest

Library:

go get github.com/angelnicolasc/graymatter

Library usage

Three functions cover the core API surface.

import "github.com/angelnicolasc/graymatter"

// Open (or create) a memory store in the given directory.
mem := graymatter.New(".graymatter")
defer mem.Close()

// Store an observation.
mem.Remember("sales-closer", "Maria didn't reply Wednesday. Third touchpoint due Friday.")

// Retrieve relevant context for a query.
ctx, _ := mem.Recall("sales-closer", "follow up Maria")
// ctx is a []string ready to inject into a system prompt:
// ["Maria didn't reply Wednesday. Third touchpoint due Friday."]

Every method has a context-aware variant that respects deadlines and cancellation signals end-to-end — no wrappers needed:

ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()

if err := mem.RememberCtx(ctx, "agent", "observation"); err != nil { ... }
results, err := mem.RecallCtx(ctx, "agent", "query")

Full agent pattern

mem := graymatter.New(project.Root + "/.graymatter")
defer mem.Close()

// 1. Recall before calling the LLM.
memCtx, _ := mem.Recall(skill.Name, task.Description)

// The Anthropic Messages API takes the system prompt as a top-level
// parameter, not as a message role; fold recalled memory into it.
systemPrompt := skill.Identity + "\n\n## Memory\n" + strings.Join(memCtx, "\n")

messages := []anthropic.MessageParam{
    {Role: "user", Content: task.Description},
}

// 2. Call your LLM, passing systemPrompt and messages.
response, _ := client.Messages.New(ctx, anthropic.MessageNewParams{...})

// 3. Remember after the run.
mem.Remember(skill.Name, extractKeyFacts(response))

Config

mem, err := graymatter.NewWithConfig(graymatter.Config{
    DataDir:          ".graymatter",
    TopK:             8,
    EmbeddingMode:    graymatter.EmbeddingAuto,  // Ollama → OpenAI → Anthropic → keyword
    OllamaURL:        "http://localhost:11434",
    OllamaModel:      "nomic-embed-text",
    AnthropicAPIKey:  os.Getenv("ANTHROPIC_API_KEY"),
    OpenAIAPIKey:     os.Getenv("OPENAI_API_KEY"),
    DecayHalfLife:    30 * 24 * time.Hour,        // 30 days
    AsyncConsolidate: true,
})

CLI

graymatter init                                    # create .graymatter/ + .mcp.json
graymatter remember "agent" "text to remember"    # store a fact
graymatter remember --shared "text"               # store in shared namespace (all agents)
graymatter recall   "agent" "query"               # print context
graymatter recall   --all "agent" "query"         # merge agent + shared memory
graymatter checkpoint list    "agent"             # show saved checkpoints
graymatter checkpoint resume  "agent"             # print latest checkpoint as JSON
graymatter mcp serve                              # start MCP server (Claude Code / Cursor)
graymatter mcp serve --http :8080                 # HTTP transport
graymatter export --format obsidian --out ~/vault # dump to Obsidian vault
graymatter tui                                    # 4-view terminal UI
graymatter run agent.md [--background]            # run a SKILL.md agent file
graymatter sessions list                          # list managed agent sessions
graymatter plugin install manifest.json           # install a plugin
graymatter server --addr :8080                    # REST API server

Global flags: --dir (data dir), --quiet, --json


Observability

The REST server (graymatter server) exposes a /metrics endpoint powered by Go's standard expvar package — zero extra dependencies.

GET /metrics
{
  "requests_total":     {"remember": 120, "recall": 340, "healthz": 5},
  "request_latency_us": {"remember": 4200, "recall": 1800},
  "facts_total":        {"stored": 120},
  "recall_total":       {"served": 340}
}

For library users, memory.StoreConfig exposes hooks for APM integration:

store, err := memory.Open(memory.StoreConfig{
    DataDir:       ".graymatter",
    DecayHalfLife: 30 * 24 * time.Hour,

    // Called after every Recall with agent ID, query, result count, and latency.
    OnRecall: func(agentID, query string, n int, d time.Duration) {
        metrics.RecordHistogram("graymatter.recall.latency", d.Seconds())
    },

    // Called after every successful Put with agent ID, fact ID, and latency.
    OnPut: func(agentID, factID string, d time.Duration) {
        metrics.Increment("graymatter.facts.stored")
    },

    // Routes internal log events to any standard logger.
    Logger: slog.NewLogLogger(slog.Default().Handler(), slog.LevelDebug),

    // Swap the vector backend entirely — bring your own Qdrant, pgvector, etc.
    VectorBackend: myQdrantAdapter,
})

Claude Code / Cursor (MCP)

graymatter init     # creates .mcp.json automatically

Claude Code detects .mcp.json automatically. Five tools become available:

Tool              | What it does
memory_search     | Recall facts for a query
memory_add        | Store a new fact
checkpoint_save   | Snapshot current session
checkpoint_resume | Restore last checkpoint
memory_reflect    | Add / update / forget / link memories (agent self-edit)

Or add manually to your project's .mcp.json:

{
  "mcpServers": {
    "graymatter": {
      "command": "graymatter",
      "args": ["mcp", "serve"]
    }
  }
}

Storage

Layer        | Tech                  | What it holds
KV store     | bbolt (pure Go, ACID) | Sessions, checkpoints, facts, metadata, KG
Vector index | chromem-go (pure Go)  | Semantic embeddings, hybrid retrieval
Export       | Markdown files        | Human-readable, git-friendly, Obsidian-compatible

Single file: ~/.graymatter/gray.db
Single folder: .graymatter/vectors/

No migrations. No schema versions. Append-only with decay-based eviction.


Embeddings

GrayMatter degrades gracefully. It works without any embedding model.

Mode             | When
Ollama (default) | Machine has Ollama running with nomic-embed-text
OpenAI           | OPENAI_API_KEY set, Ollama not available
Anthropic        | ANTHROPIC_API_KEY set, Ollama and OpenAI not available
Keyword-only     | No embedding available — TF-IDF + recency, zero deps

Auto-detection order in EmbeddingAuto mode: Ollama → OpenAI → Anthropic → keyword.

# Pull the embedding model once (Ollama):
ollama pull nomic-embed-text

# Or set an API key (OpenAI or Anthropic):
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
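The keyword-only fallback is easy to picture. A toy sketch of the term-overlap core (illustrative only; GrayMatter's actual scorer layers TF-IDF weighting and recency decay on top of this idea):

```go
package main

import (
	"fmt"
	"strings"
)

// overlapScore counts how many distinct query terms appear in the fact,
// case-insensitively. A real TF-IDF scorer would also down-weight common
// terms and boost recently accessed facts.
func overlapScore(fact, query string) int {
	terms := map[string]bool{}
	for _, t := range strings.Fields(strings.ToLower(fact)) {
		terms[t] = true
	}
	n := 0
	for _, q := range strings.Fields(strings.ToLower(query)) {
		if terms[q] {
			n++
		}
	}
	return n
}

func main() {
	fact := "Maria didn't reply Wednesday. Third touchpoint due Friday."
	fmt.Println(overlapScore(fact, "follow up maria reply")) // 2
}
```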

Memory lifecycle

Recall(agent, task)          ← hybrid: vector + keyword + recency → top-8 facts
    ↓
Inject into system prompt    ← your 3 lines of code
    ↓
Agent runs
    ↓
Remember(agent, observation) ← store key facts during/after run
    ↓
Consolidate() [async]        ← summarise + decay + prune (LLM optional)

Consolidation is the only "smart" step. Everything else is deterministic. Without consolidation, GrayMatter still works — it just doesn't compress over time.
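The hybrid step above names reciprocal rank fusion (RRF). A generic sketch of how RRF merges ranked lists, using the standard algorithm and the conventional constant k=60 (not GrayMatter's actual code):

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges several ranked lists with Reciprocal Rank Fusion:
// score(d) = sum over lists of 1/(k + rank(d)), ranks starting at 1.
// Documents appearing high in multiple lists float to the top.
func rrfFuse(k float64, rankings ...[]string) []string {
	scores := map[string]float64{}
	for _, ranking := range rankings {
		for rank, id := range ranking {
			scores[id] += 1.0 / (k + float64(rank+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return scores[ids[i]] > scores[ids[j]] })
	return ids
}

func main() {
	vector := []string{"factA", "factB", "factC"}  // vector-similarity order
	keyword := []string{"factB", "factD", "factA"} // keyword-score order
	fmt.Println(rrfFuse(60, vector, keyword))      // [factB factA factD factC]
}
```

factB wins because it ranks well in both lists, even though it tops neither on its own; that is the property that makes RRF a good fit for fusing vector, keyword, and recency rankings.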

Consolidation auto-enables when ANTHROPIC_API_KEY is set. To use Ollama:

cfg := graymatter.DefaultConfig()
cfg.ConsolidateLLM = "ollama"

Token efficiency

Numbers produced by go run ./benchmarks/token_count — real Recall calls, keyword embedder, no LLM required:

Sessions | Full injection | GrayMatter  | Reduction
1        | ~80 tokens     | ~80 tokens  | 0%
10       | ~630 tokens    | ~550 tokens | 12%
30       | ~1,880 tokens  | ~550 tokens | 71%
100      | ~6,960 tokens  | ~670 tokens | 90%

Each "session" = one paragraph-length agent observation (~60 words). GrayMatter always injects only the top-8 most relevant observations for the query. With vector embeddings, recall precision improves while the reduction ratios stay similar.

Reproduce locally:

go run ./benchmarks/token_count

Build from source

git clone https://github.com/angelnicolasc/graymatter
cd graymatter
CGO_ENABLED=0 go build -ldflags="-s -w -X main.version=dev" -o graymatter ./cmd/graymatter

Output: single static binary, ~10 MB, no runtime dependencies.


Testing

The full test suite requires no LLM and no network — every test uses t.TempDir() with a keyword embedder or injected stubs. Runs clean on Linux, macOS, and Windows in CI.

# Core library
go test -count=1 -timeout=120s ./pkg/memory/...

# CLI / server / plugins
cd cmd/graymatter && go test -count=1 -timeout=120s ./internal/...

Package          | Tests                          | What's covered
pkg/memory       | 42 unit tests + 3 fuzz targets | Store lifecycle, hybrid recall, RRF fusion, decay math, semaphore, concurrent writes, vector paths, dimension guard
internal/harness | 21                             | Agent file parsing, retry/backoff, session recovery
internal/kg      | 21                             | Graph CRUD, entity extraction, weight decay, Obsidian export
internal/server  | 11                             | All REST endpoints, concurrent remember/recall, cancelled-context requests
internal/plugin  | 10                             | Install, list, remove, E2E echo plugin binary

Fuzz targets (pkg/memory): FuzzTokenize, FuzzUnmarshalFact, FuzzKeywordScore — each with a seeded corpus so they run deterministically in CI and can be extended with go test -fuzz.

Core library coverage: 73.5% (CI gate: ≥ 70%). Measured without mocks — real bbolt + chromem-go instances in a temp directory.

Token-reduction benchmark (also zero deps):

go run ./benchmarks/token_count

What GrayMatter is NOT

  • Not a framework. Not an agent runner. Not a replacement for your existing tooling.
  • Not a hosted service. Not a SaaS. Not a cloud product.
  • Not a knowledge base UI. Not Notion. Not Obsidian.
  • Not trying to win the enterprise memory market.

It is exactly one thing: the missing stateful layer for Go CLI agents, packaged as a library you import in two lines.


Roadmap

  • Library: Remember / Recall / Consolidate
  • bbolt + chromem-go storage
  • Ollama + OpenAI + Anthropic + keyword-only embedding
  • Hybrid retrieval (vector + keyword + recency, RRF fusion)
  • CLI: init remember recall checkpoint export run sessions plugin server
  • MCP server (Claude Code / Cursor) + memory_reflect self-edit tool
  • Knowledge graph (entity extraction, node/edge linking, Obsidian export)
  • Shared memory across agents (--shared, --all flags, __shared__ namespace)
  • REST API server mode (graymatter server --addr :8080)
  • Plugin system (JSON line protocol, graymatter plugin install/list/remove)
  • 4-view Bubble Tea TUI (Memory / Sessions / Knowledge Graph / Stats)
  • Context-propagation API (RememberCtx, RecallCtx, RecallAllCtx, …)
  • Pluggable VectorStore interface (swap chromem-go for Qdrant, pgvector, etc.)
  • expvar /metrics endpoint — zero-dep, stdlib-only observability
  • OnRecall / OnPut / Logger hooks for APM integration
  • Embedding dimension guard — warns on provider switch instead of silent corruption
  • go.work workspace — core library imports zero TUI/CLI dependencies
  • Three-platform CI (Linux, macOS, Windows) + 73.5% coverage gate
  • Fuzz testing: FuzzTokenize, FuzzUnmarshalFact, FuzzKeywordScore
  • Ollama-backed consolidation LLM (Ollama as summariser, not just embedder)
  • WebSocket streaming for REST API

GrayMatter — v0.2.1 — April 2026

Documentation

Overview

Package graymatter provides persistent memory for Go AI agents.

Single static binary. Zero infra. Three public functions.

mem := graymatter.New(".graymatter")
mem.Remember("agent", "user prefers bullet points")
ctx, _ := mem.Recall("agent", "how should I format this?")
// ctx is a []string ready to inject into a system prompt

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	// DataDir is the directory where gray.db and vector files are stored.
	// Default: ".graymatter"
	DataDir string

	// TopK is the maximum number of facts returned by Recall.
	// Default: 8
	TopK int

	// EmbeddingMode controls which embedding backend is used.
	// Default: EmbeddingAuto (Ollama → OpenAI → Anthropic → keyword)
	EmbeddingMode EmbeddingMode

	// OllamaURL is the base URL of the Ollama API.
	// Default: value of GRAYMATTER_OLLAMA_URL env var, or "http://localhost:11434"
	OllamaURL string

	// OllamaModel is the embedding model used with Ollama.
	// Default: value of GRAYMATTER_OLLAMA_MODEL env var, or "nomic-embed-text"
	OllamaModel string

	// AnthropicAPIKey for the Anthropic embeddings and consolidation endpoints.
	// Default: value of ANTHROPIC_API_KEY env var.
	AnthropicAPIKey string

	// OpenAIAPIKey for the OpenAI Embeddings API (text-embedding-3-small).
	// Default: value of OPENAI_API_KEY env var.
	OpenAIAPIKey string

	// OpenAIModel overrides the OpenAI embedding model.
	// Default: value of GRAYMATTER_OPENAI_MODEL env var, or "text-embedding-3-small"
	OpenAIModel string

	// ConsolidateLLM specifies which LLM provider drives memory consolidation.
	// Values: "anthropic", "ollama", "" (disable consolidation).
	// Default: "anthropic" if ANTHROPIC_API_KEY is set, else "" (disabled).
	// To use Ollama as the consolidation LLM, set this field explicitly to "ollama".
	ConsolidateLLM string

	// ConsolidateModel is the model used for consolidation summarisation.
	// Default: "claude-haiku-4-5-20251001"
	ConsolidateModel string

	// ConsolidateThreshold is the minimum fact count that triggers consolidation.
	// Default: 100
	ConsolidateThreshold int

	// DecayHalfLife is the half-life for the exponential weight decay curve.
	// Facts not accessed within this window lose half their retrieval weight.
	// Default: 720h (30 days)
	DecayHalfLife time.Duration

	// AsyncConsolidate runs consolidation in a background goroutine after Remember.
	// Default: true
	AsyncConsolidate bool

	// MaxAsyncConsolidations bounds how many consolidation goroutines may run
	// concurrently. Additional triggers while at capacity are silently dropped.
	// Default: 2
	MaxAsyncConsolidations int

	// OnConsolidateError is called when an async consolidation goroutine returns
	// an error. If nil, errors are discarded. The callback must be safe for
	// concurrent use.
	OnConsolidateError func(agentID string, err error)
}

Config holds all GrayMatter configuration. All fields have sane defaults via DefaultConfig(). Zero-value Config is not valid — always call DefaultConfig().

func DefaultConfig

func DefaultConfig() Config

DefaultConfig returns a Config with all defaults applied from environment variables and runtime probes. Safe to call multiple times.

func (Config) GetAnthropicAPIKey

func (c Config) GetAnthropicAPIKey() string

func (Config) GetConsolidateLLM

func (c Config) GetConsolidateLLM() string

func (Config) GetConsolidateModel

func (c Config) GetConsolidateModel() string

func (Config) GetConsolidateThreshold

func (c Config) GetConsolidateThreshold() int

func (Config) GetDecayHalfLife

func (c Config) GetDecayHalfLife() time.Duration

type EmbeddingMode

type EmbeddingMode int

EmbeddingMode controls how GrayMatter generates vector embeddings.

const (
	// EmbeddingAuto detects the best available provider at runtime.
	// Detection order: Ollama → OpenAI → Anthropic → keyword-only.
	EmbeddingAuto EmbeddingMode = iota
	// EmbeddingOllama forces Ollama (requires a running Ollama instance).
	EmbeddingOllama
	// EmbeddingAnthropic forces Anthropic API (requires ANTHROPIC_API_KEY).
	EmbeddingAnthropic
	// EmbeddingKeyword disables vector search; uses keyword+recency scoring only.
	EmbeddingKeyword
	// EmbeddingOpenAI forces OpenAI Embeddings API (requires OPENAI_API_KEY).
	EmbeddingOpenAI
)

type Memory

type Memory struct {
	// contains filtered or unexported fields
}

Memory is the primary handle for GrayMatter operations. It is safe for concurrent use.

func New

func New(dataDir string) *Memory

New creates a Memory with default configuration rooted at dataDir. If initialisation fails, it logs the error to stderr and returns a no-op Memory that never panics (callers need not check for nil).

func NewWithConfig

func NewWithConfig(cfg Config) (*Memory, error)

NewWithConfig creates a Memory with explicit configuration. Returns an error if the data directory cannot be created or the database cannot be opened.

func (*Memory) Close

func (m *Memory) Close() error

Close flushes pending writes and closes the underlying database. Always call Close when done; failing to do so may leave gray.db locked.

func (*Memory) Config

func (m *Memory) Config() Config

Config returns the active configuration.

func (*Memory) Consolidate

func (m *Memory) Consolidate(ctx context.Context, agentID string) error

Consolidate summarises and compacts memories for agentID. It calls the configured LLM to produce summary facts, applies the exponential decay curve, and prunes dead facts.

Consolidate is automatically triggered async after Remember when Config.AsyncConsolidate is true. Call it manually for synchronous control.

func (*Memory) Extract

func (m *Memory) Extract(ctx context.Context, llmResponse string) ([]string, error)

Extract calls the configured LLM and returns atomic facts distilled from llmResponse. Each returned string is a self-contained declarative sentence suitable for passing directly to Remember.

Requires an Anthropic API key. Without one, Extract returns the raw response as a single-element slice so the caller always receives a usable result.

facts, _ := mem.Extract(ctx, assistantReply)
for _, f := range facts {
    mem.Remember("agent", f)
}

func (*Memory) Recall

func (m *Memory) Recall(agentID, query string) ([]string, error)

Recall returns the top-k most relevant facts for agentID given query. The returned []string is ready to be joined and injected into a system prompt.

ctx, _ := mem.Recall("sales-closer", "follow up Maria")
systemPrompt += "\n\n## Memory\n" + strings.Join(ctx, "\n")

func (*Memory) RecallAll

func (m *Memory) RecallAll(agentID, query string) ([]string, error)

RecallAll merges agent-scoped and shared memory results for agentID, deduplicates, and returns at most TopK combined facts.
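A sketch of the merge-and-deduplicate behaviour this describes, keeping first-seen order and capping at TopK (illustrative, not the actual implementation):

```go
package main

import "fmt"

// mergeDedup keeps the first occurrence of each fact across the input
// lists, preserving order, and stops once topK facts are collected.
func mergeDedup(topK int, lists ...[]string) []string {
	seen := make(map[string]bool)
	out := make([]string, 0, topK)
	for _, list := range lists {
		for _, fact := range list {
			if len(out) == topK {
				return out
			}
			if !seen[fact] {
				seen[fact] = true
				out = append(out, fact)
			}
		}
	}
	return out
}

func main() {
	agent := []string{"fact A", "fact B"}
	shared := []string{"fact B", "fact C", "fact D"}
	fmt.Println(mergeDedup(3, agent, shared)) // [fact A fact B fact C]
}
```

Passing the agent list first gives agent-scoped facts priority over shared ones when the TopK cap bites, which matches the "at most TopK combined" wording above.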

func (*Memory) RecallAllCtx added in v0.2.0

func (m *Memory) RecallAllCtx(ctx context.Context, agentID, query string) ([]string, error)

RecallAllCtx is the context-aware variant of RecallAll.

func (*Memory) RecallCtx added in v0.2.0

func (m *Memory) RecallCtx(ctx context.Context, agentID, query string) ([]string, error)

RecallCtx is the context-aware variant of Recall. Use this when you need timeout control or tracing propagation.

func (*Memory) RecallShared

func (m *Memory) RecallShared(query string) ([]string, error)

RecallShared returns the top-k most relevant shared facts for query.

func (*Memory) RecallSharedCtx added in v0.2.0

func (m *Memory) RecallSharedCtx(ctx context.Context, query string) ([]string, error)

RecallSharedCtx is the context-aware variant of RecallShared.

func (*Memory) Remember

func (m *Memory) Remember(agentID, text string) error

Remember stores an observation associated with agentID. It is safe to call Remember concurrently from multiple goroutines.

mem.Remember("sales-closer", "Maria didn't reply Wednesday. Third touchpoint due Friday.")

func (*Memory) RememberCtx added in v0.2.0

func (m *Memory) RememberCtx(ctx context.Context, agentID, text string) error

RememberCtx is the context-aware variant of Remember. Use this when you need timeout control or tracing propagation.

func (*Memory) RememberExtracted

func (m *Memory) RememberExtracted(ctx context.Context, agentID, llmResponse string) error

RememberExtracted combines Extract and Remember in a single call: it extracts atomic facts from llmResponse and stores each one for agentID. This is the idiomatic replacement for the extractKeyFacts() pattern shown in the README.

mem.RememberExtracted(ctx, "sales-closer", assistantReply)

func (*Memory) RememberShared

func (m *Memory) RememberShared(text string) error

RememberShared stores an observation in the shared memory namespace, readable by all agents via RecallShared and RecallAll.

func (*Memory) RememberSharedCtx added in v0.2.0

func (m *Memory) RememberSharedCtx(ctx context.Context, text string) error

RememberSharedCtx is the context-aware variant of RememberShared.

func (*Memory) Store

func (m *Memory) Store() *memory.Store

Store exposes the internal Store for advanced use (CLI, MCP, TUI). Callers outside the graymatter package use this to access full CRUD.

Directories

Path Synopsis
benchmarks
token_count command
Command token_count benchmarks GrayMatter's token efficiency versus full-history injection.
cmd
graymatter module
examples
agent command
Package main shows the canonical GrayMatter integration pattern for a skill-based agent that calls the Anthropic Messages API.
plugin-hello command
Command plugin-hello is a reference GrayMatter plugin that implements the hello_greet MCP tool.
standalone command
Package main demonstrates GrayMatter with a bare Anthropic Messages API call.
pkg
embedding
Package embedding provides pluggable vector embedding backends for GrayMatter.
