recall

module
v0.1.0 Latest Latest
Published: May 17, 2026 License: MIT

recall

Embedded, atomic, microsecond-fast key/value + entity store with git-backed time travel, exposed as a CLI and an MCP server. Single static Go binary. No service to run.

Have I seen this thing? Have I acted on every sub-event? What did my state look like an hour ago? recall answers all three.

Why

LLM agents and shell pipelines often need persistent, atomic, very fast memory of "things I've already done" — but spinning up Redis or a database service for that is overkill. recall is:

  • Embedded — single file at ~/.recall/recall.db, no daemon.
  • Atomic — every op is an ACID transaction inside bbolt (the same engine etcd and consul use).
  • Fast — µs-class point lookups, fsync-bound writes (~17ms, or ~322µs batched, or ~30µs with NoSync).
  • Hierarchical — path-shaped keys give you cheap prefix scans (the HTTP-router trick).
  • Time-travelling — recall snapshot commits a consistent copy of the whole store to a git repo. recall at <commit> ... reads any historical state.
  • Dual frontend — same core powers a CLI (recall) and an MCP server (recall-mcp) for LLMs.
  • Zero CGO — pure Go, builds anywhere go builds.
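
The prefix-scan idea behind path-shaped keys can be sketched without bbolt. The snippet below is an illustration only: it assumes keys are held in sorted order, as a B+tree holds them, and the function name prefixScan and the sample keys are hypothetical, not part of recall.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// prefixScan returns every key in the sorted slice that starts with prefix.
// It binary-searches to the first candidate and then walks forward until the
// prefix stops matching, the same shape as a cursor seek followed by next.
func prefixScan(sortedKeys []string, prefix string) []string {
	var out []string
	for i := sort.SearchStrings(sortedKeys, prefix); i < len(sortedKeys) && strings.HasPrefix(sortedKeys[i], prefix); i++ {
		out = append(out, sortedKeys[i])
	}
	return out
}

func main() {
	keys := []string{
		"pr/42/checks/ci-lint",
		"pr/42/comments/c-9981",
		"pr/42/comments/c-9982",
		"pr/43/comments/c-1",
	}
	sort.Strings(keys) // prefixScan requires sorted input
	fmt.Println(prefixScan(keys, "pr/42/comments/"))
	// [pr/42/comments/c-9981 pr/42/comments/c-9982]
}
```

The cost is one seek plus one step per matching key, so keys that share a path prefix are cheap to enumerate no matter how many unrelated keys the store holds.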

Install

Pre-built binaries

Grab the right archive for your platform from the releases page and drop recall + recall-mcp somewhere on your $PATH.

From source

go install github.com/dreamware-nz/recall/cmd/recall@latest
go install github.com/dreamware-nz/recall/cmd/recall-mcp@latest

From a clone

git clone https://github.com/dreamware-nz/recall.git
cd recall
make install        # -> $(go env GOPATH)/bin/{recall,recall-mcp}

Set RECALL_HOME to override the default location (~/.recall).

Data model

recall exposes five primitives. Pick the one that fits the shape of your question.

Primitive  Use when
kv         Generic string → bytes
set        "Have I seen X?" (membership)
ctr        Atomic counters
ent        Named records with attributes (a PR, an email, a file)
child      Status-bearing items under an entity (a PR's comments, a PR's CI checks)

Everything else (predicates, completion checks) is composition over these.

Quick start: the PR-review use case

# 1. Cheap "have I seen it" set
recall set add seen-prs "dreamware-nz/recall#42"
recall set has seen-prs "dreamware-nz/recall#42"   # -> true

# 2. Rich tracking with sub-events
recall ent put pr "dreamware-nz/recall#42" head_sha=abc state=open

recall child put pr "dreamware-nz/recall#42" comments c-9981 --status=pending body="rename foo"
recall child put pr "dreamware-nz/recall#42" comments c-9982 --status=pending
recall child put pr "dreamware-nz/recall#42" checks   ci/test --status=running
recall child put pr "dreamware-nz/recall#42" checks   ci/lint --status=success

# 3. The completion predicate (composition, no DSL needed)
recall child count pr "dreamware-nz/recall#42" comments --status=pending   # 2
recall child put   pr "dreamware-nz/recall#42" comments c-9981 --status=acted
recall child count pr "dreamware-nz/recall#42" comments --status=pending   # 1

# 4. Snapshot for time travel
recall snapshot -m "after first review pass"
recall log
# dc62d1f1  2026-05-17 08:49:41  after first review pass

# 5. Read historical state
recall at dc62d1f1 child count pr "dreamware-nz/recall#42" comments --status=pending

"Have I acted on the whole PR?"

A done predicate is just two counts:

[ "$(recall child count pr "$PR" comments --status-not=acted 2>/dev/null || \
     recall child count pr "$PR" comments --status=pending)" = "0" ] && \
[ "$(recall child count pr "$PR" checks   --status=failing)" = "0" ] && echo "done"

(For now --status matches exactly; combine multiple counts client-side. A rule DSL may come later if the same predicate gets written twice.)
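
The same composition can be written in Go. The sketch below assumes a hypothetical countFunc that wraps whatever performs the count (for example, a call out to recall child count); neither countFunc nor isDone is part of recall, and the status names simply follow the shell example above.

```go
package main

import "fmt"

// countFunc is a hypothetical stand-in for "recall child count": it reports
// how many children of a collection have exactly the given status.
type countFunc func(collection, status string) (int, error)

// isDone combines two exact-match counts client-side: no pending comments
// and no failing checks.
func isDone(count countFunc) (bool, error) {
	pending, err := count("comments", "pending")
	if err != nil {
		return false, err
	}
	failing, err := count("checks", "failing")
	if err != nil {
		return false, err
	}
	return pending == 0 && failing == 0, nil
}

func main() {
	// In-memory fake so the sketch runs without a recall store.
	counts := map[string]int{"comments/pending": 1, "checks/failing": 0}
	fake := func(coll, status string) (int, error) {
		return counts[coll+"/"+status], nil
	}

	done, _ := isDone(fake)
	fmt.Println(done) // false

	counts["comments/pending"] = 0
	done, _ = isDone(fake)
	fmt.Println(done) // true
}
```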

Head SHA rolled forward

recall child supersede pr "$PR" comments    # mark all old comments superseded
recall ent put pr "$PR" head_sha=newsha     # merge: only head_sha changes

More example use cases

The entity + child pattern fits anywhere a parent thing has many sub-signals you want to track individually:

Use case                 Entity                  Children
Threaded conversations   thread/<channel>/<ts>   messages/<id>, mentions/<user>
Document review          doc/<id>                sections/<name>, reviewers/<who>
Multi-stage pipelines    file/<sha>              stages/{download,extract,embed,index}
Notification dedup       (just a set)            set:notified membership
Webhook idempotency      (just a kv)             kv:webhook/<provider>/<event_id>
Agent task state         task/<id>               subtasks/<n> with status
RSS / feed reading       feed/<source>           items/<id>
Code/doc reviews (any)   review/<id>             comments/<id>, signoffs/<who>
GitHub PR tracking       pr/<owner>/<repo>#<n>   comments/, reviews/, checks/

A scripts/gh-to-recall shim ships in the repo to demonstrate the GitHub-PR variant end-to-end, including child_supersede on head-SHA rollover. Use it as a template — the same shape works for email triage, Slack thread followups, document signoff workflows, you name it.

CLI reference

recall kv      set|get|del|has|list
recall set     add|rem|has|members|card
recall ctr     incr|get|set
recall ent     put|get|del|list
recall child   put|get|del|list|count|supersede
recall snapshot [-m msg]
recall log     [-n N]
recall at <ref> <read-cmd...>
recall batch              < ops.jsonl    # write ops, single tx
recall where

Run recall help for full usage. Attributes are written as key=value after the positional args, e.g. recall ent put pr 42 head_sha=abc state=open.
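
As a rough illustration of the key=value convention, the sketch below splits trailing arguments into an attribute map. It is an assumption about how such parsing could work, not recall's actual parser; parseAttrs is a hypothetical name, and splitting on the first "=" only is a design choice made here so that values may themselves contain "=".

```go
package main

import (
	"fmt"
	"strings"
)

// parseAttrs turns arguments such as "head_sha=abc" into a map.
// Only the first "=" separates key from value.
func parseAttrs(args []string) (map[string]string, error) {
	attrs := make(map[string]string, len(args))
	for _, a := range args {
		k, v, ok := strings.Cut(a, "=")
		if !ok || k == "" {
			return nil, fmt.Errorf("attribute %q is not in key=value form", a)
		}
		attrs[k] = v
	}
	return attrs, nil
}

func main() {
	attrs, err := parseAttrs([]string{"head_sha=abc", "state=open"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(attrs["head_sha"], attrs["state"]) // abc open
}
```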

MCP server

Run recall-mcp over stdio. The tools mirror the CLI primitives, so every primitive is available to the LLM.

Example configuration

For Claude Desktop / Crush / any MCP client:

{
  "mcpServers": {
    "recall": {
      "command": "/usr/local/bin/recall-mcp",
      "env": { "RECALL_HOME": "/Users/you/.recall" }
    }
  }
}

Tool surface

Tool                                                                             Purpose
kv_set / kv_get / kv_has / kv_del / kv_list                                      Plain KV
set_add / set_has / set_rem / set_members / set_card                             Membership
counter_incr / counter_get                                                       Atomic counters
entity_put / entity_get / entity_del / entity_list                               Named records
child_put / child_get / child_del / child_list / child_count / child_supersede   Status-bearing children
snapshot / history_log / read_at                                                 Time travel

read_at takes a ref, an op (one of kv_get, kv_has, set_has, set_card, counter_get, entity_get, child_count), and op-specific args. The whole DB at that commit is opened read-only — no need to copy state around.

Architecture

                  +-------------------+
   CLI  ---->     |                   |
                  |    internal/      |
  MCP  ---->      |     store         |  <-- single bbolt file ~/.recall/recall.db
                  |     history       |  <-- go-git repo at ~/.recall/history/
                  |     config        |
                  +-------------------+

bbolt buckets (path-shaped keys for cheap prefix scans):

kv                          string -> bytes
set:<name>                  member -> empty
ctr                         name   -> int64 (big-endian)
ent:<kind>                  id     -> msgpack(attrs)
child                       <kind>/<id>/<coll>/<child_id> -> msgpack(record)
idx:status                  <kind>/<id>/<coll>/<status>/<child_id> -> empty
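
The ctr encoding can be illustrated with encoding/binary. This is a minimal sketch of storing an int64 as 8 big-endian bytes; the helper names are hypothetical, and treating a missing value as zero is an assumption rather than documented recall behaviour.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeCounter stores v as 8 big-endian bytes.
func encodeCounter(v int64) []byte {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, uint64(v))
	return buf
}

// decodeCounter reverses encodeCounter. Anything that is not exactly
// 8 bytes (including nil) is read as zero, an assumption of this sketch.
func decodeCounter(b []byte) int64 {
	if len(b) != 8 {
		return 0
	}
	return int64(binary.BigEndian.Uint64(b))
}

// incr is the read-modify-write step that a single write transaction
// would wrap to make the increment atomic.
func incr(stored []byte, delta int64) []byte {
	return encodeCounter(decodeCounter(stored) + delta)
}

func main() {
	v := incr(nil, 1)
	v = incr(v, 41)
	fmt.Println(decodeCounter(v), v) // 42 [0 0 0 0 0 0 0 42]
}
```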

The idx:status bucket is the trick that makes child_count --status=pending O(matches), not O(all_children). Status changes update the index atomically inside the same transaction.
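
To make the index idea concrete, here is a toy in-memory version. It is a sketch under stated assumptions, not recall's code: a sorted string slice stands in for the bbolt bucket, the statusIndex type and its methods are hypothetical, and put plays the role of a single write transaction that updates the record and the index together.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// statusIndex is a toy stand-in for the child and idx:status buckets.
type statusIndex struct {
	status map[string]string // "<kind>/<id>/<coll>/<child_id>" -> status
	idx    []string          // sorted "<kind>/<id>/<coll>/<status>/<child_id>"
}

func newStatusIndex() *statusIndex {
	return &statusIndex{status: map[string]string{}}
}

func idxKey(parent, coll, status, childID string) string {
	return parent + "/" + coll + "/" + status + "/" + childID
}

// put records the child's status and moves its index entry in one step.
func (s *statusIndex) put(parent, coll, childID, status string) {
	childKey := parent + "/" + coll + "/" + childID
	if old, ok := s.status[childKey]; ok {
		s.remove(idxKey(parent, coll, old, childID))
	}
	s.status[childKey] = status
	s.insert(idxKey(parent, coll, status, childID))
}

// insert adds k at its sorted position (no-op if already present).
func (s *statusIndex) insert(k string) {
	i := sort.SearchStrings(s.idx, k)
	if i < len(s.idx) && s.idx[i] == k {
		return
	}
	s.idx = append(s.idx, "")
	copy(s.idx[i+1:], s.idx[i:])
	s.idx[i] = k
}

// remove deletes k if present.
func (s *statusIndex) remove(k string) {
	i := sort.SearchStrings(s.idx, k)
	if i < len(s.idx) && s.idx[i] == k {
		s.idx = append(s.idx[:i], s.idx[i+1:]...)
	}
}

// count visits only the entries under "<parent>/<coll>/<status>/".
func (s *statusIndex) count(parent, coll, status string) int {
	prefix := parent + "/" + coll + "/" + status + "/"
	n := 0
	for i := sort.SearchStrings(s.idx, prefix); i < len(s.idx) && strings.HasPrefix(s.idx[i], prefix); i++ {
		n++
	}
	return n
}

func main() {
	s := newStatusIndex()
	s.put("pr/42", "comments", "c-9981", "pending")
	s.put("pr/42", "comments", "c-9982", "pending")
	fmt.Println(s.count("pr/42", "comments", "pending")) // 2
	s.put("pr/42", "comments", "c-9981", "acted")
	fmt.Println(s.count("pr/42", "comments", "pending")) // 1
}
```

The slice insert here is linear and only keeps the example short; the point is that count touches one index entry per match and never reads the child records themselves.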

Time-travel semantics

  • A snapshot is bbolt.Tx.WriteTo, which produces a consistent copy of the entire database even under concurrent writes.
  • Snapshots are committed as a single file (recall.db) into a git repo at ~/.recall/history/. Git handles dedup, packing, branching.
  • recall at <ref> <read-cmd> checks out the snapshot blob from that commit into a temp file, opens it read-only, runs the read, closes it. No reflog gymnastics needed.

Snapshots are explicit and cheap — call recall snapshot whenever a checkpoint makes sense (after a logical milestone, on a cron, on shutdown). They are not taken on every write because that would destroy the microsecond-class performance budget.
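
The "temp file, read, close" step of recall at can be sketched with the standard library alone. The snippet assumes the snapshot bytes have already been fetched from the commit (that part is left out), and withSnapshot is a hypothetical helper, not a function in recall.

```go
package main

import (
	"fmt"
	"os"
)

// withSnapshot writes blob (the bytes of recall.db at some commit) to a
// temp file, hands the path to read, and removes the file afterwards.
func withSnapshot(blob []byte, read func(path string) error) error {
	f, err := os.CreateTemp("", "recall-at-*.db")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name()) // clean up whether or not read succeeds

	if _, err := f.Write(blob); err != nil {
		f.Close()
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}
	return read(f.Name())
}

func main() {
	blob := []byte("placeholder for a bbolt file")
	err := withSnapshot(blob, func(path string) error {
		data, err := os.ReadFile(path) // a real reader would open the DB read-only here
		if err != nil {
			return err
		}
		fmt.Println("snapshot bytes:", len(data))
		return nil
	})
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

Because the historical copy lives in its own file, the read never touches the live database, which is why no state needs to be copied around or restored afterwards.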

Performance

Measured on Apple M1, go test -bench, default options (durable, fsync on every commit) unless noted. Numbers are per-op.

Operation                                     Time                        Notes
KVGet (hot)                                   0.86 µs                     mmap-backed B+tree, sub-microsecond
KVHas (hit)                                   1.0 µs
KVHas (miss)                                  1.1 µs                      misses just as fast — no Bloom filter needed
SHas in 100k-member set                       0.94 µs
SHas miss in 100k set                         0.90 µs
CtrGet                                        0.76 µs
ChildCount(status=pending) over 1k children   11.9 µs                     uses idx:status
ChildList(status=pending) returning 500       751 µs                      ~1.5 µs per result
KVSet (one tx)                                18.9 ms                     fsync-bound
SAdd (one tx)                                 18.0 ms                     fsync-bound
ChildPut (one tx)                             16.7 ms                     fsync-bound
Batch of 50 ChildPut ops                      16.1 ms total → 322 µs/op   one fsync amortised
KVSet with NoSync                             29.5 µs
ChildPut with NoSync                          39.9 µs
SAdd with NoSync                              33.5 µs

What this means

Reads are genuinely Redis-class. Writes are fsync-bound on default settings — about 60 writes/sec sequentially. This is fine for the interactive use case (one PR sync is dozens of writes, ~1 second), but if you need bulk throughput, use one of the two knobs below.
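
The throughput figures follow directly from the per-op latencies in the table. The snippet below just does that arithmetic for the ChildPut rows; the variable and function names are illustrative.

```go
package main

import (
	"fmt"
	"time"
)

// Per-op ChildPut latencies taken from the table above.
var (
	durablePut = 16700 * time.Microsecond // one tx, fsync on commit
	batchedPut = 322 * time.Microsecond   // batch of 50, one fsync
	noSyncPut  = 39900 * time.Nanosecond  // NoSync
)

// opsPerSecond converts a per-op latency into sequential throughput.
func opsPerSecond(perOp time.Duration) float64 {
	return float64(time.Second) / float64(perOp)
}

// speedup reports how many times faster "fast" is than "slow".
func speedup(slow, fast time.Duration) float64 {
	return float64(slow) / float64(fast)
}

func main() {
	fmt.Printf("durable: %.0f writes/s\n", opsPerSecond(durablePut))
	fmt.Printf("batched: %.0f writes/s (%.0fx)\n", opsPerSecond(batchedPut), speedup(durablePut, batchedPut))
	fmt.Printf("nosync:  %.0f writes/s (%.0fx)\n", opsPerSecond(noSyncPut), speedup(durablePut, noSyncPut))
}
```

For ChildPut this works out to roughly 60 durable writes per second, a batching gain of about 52×, and a NoSync gain of about 420×; the slower KVSet row gives a larger NoSync ratio, which is why the headline figures are quoted as approximations.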

Knob 1: Batch — many ops, one fsync

err := s.Batch(func(b *store.WriteTx) error {
    _ = b.SAdd("seen-prs", id)
    _, _ = b.EntPut("pr", id, attrs, true)
    for _, c := range comments {
        _, _ = b.ChildPut("pr", id, "comments", c.ID, "pending", nil, false)
    }
    return nil
})

All ops commit together (or none do — full rollback on error). Throughput jumps ~50× because the single fsync at the end is the only durability cost.

The CLI exposes this as recall batch, reading JSONL from stdin:

{
  echo '["set","add","seen-prs","pr#42"]'
  echo '["ent","put","pr","pr#42","head_sha=abc"]'
  echo '["child","put","pr","pr#42","comments","c1","--status=pending"]'
  # ...
} | recall batch

scripts/gh-to-recall sync uses this — one PR sync is one fsync.
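
When the ops come from a program rather than a shell block, the same JSONL can be produced with encoding/json. This is a sketch only: it assumes each line is a JSON array of CLI arguments, as in the shell example above, and encodeOps is a hypothetical helper, not part of recall.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// encodeOps renders each op as one JSON array per line.
func encodeOps(ops [][]string) (string, error) {
	var sb strings.Builder
	enc := json.NewEncoder(&sb) // Encode appends a newline after each value
	for _, op := range ops {
		if err := enc.Encode(op); err != nil {
			return "", err
		}
	}
	return sb.String(), nil
}

func main() {
	ops := [][]string{
		{"set", "add", "seen-prs", "pr#42"},
		{"ent", "put", "pr", "pr#42", "head_sha=abc"},
		{"child", "put", "pr", "pr#42", "comments", "c1", "--status=pending"},
	}
	out, err := encodeOps(ops)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out) // pipe this program's stdout into "recall batch"
}
```

Using a JSON encoder instead of string concatenation means quotes and backslashes inside comment bodies or other attribute values are escaped correctly.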

Knob 2: NoSync — trade durability for ~500× throughput

s, err := store.OpenWith(path, store.Options{NoSync: true})

Or set RECALL_NOSYNC=1 in the environment. Writes drop to ~30–40 µs. Use this for cache-grade data where losing the last few writes after a crash is acceptable. Call s.Sync() to force a manual fsync at logical checkpoints.

Don't mix NoSync with primary-state data unless you've thought hard about the crash window.
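
One way to bound that crash window is to force a sync every N writes. The sketch below is hypothetical throughout: the syncer interface assumes Sync returns an error, and checkpointer and fakeStore exist only so the example runs without recall.

```go
package main

import "fmt"

// syncer covers the single method the text mentions. The error return
// is an assumption of this sketch.
type syncer interface {
	Sync() error
}

// checkpointer forces a sync after every `every` writes.
type checkpointer struct {
	s       syncer
	every   int
	pending int
}

// wrote should be called after each successful NoSync write.
func (c *checkpointer) wrote() error {
	c.pending++
	if c.pending < c.every {
		return nil
	}
	c.pending = 0
	return c.s.Sync()
}

// fakeStore counts Sync calls in place of a real store.
type fakeStore struct{ syncs int }

func (f *fakeStore) Sync() error { f.syncs++; return nil }

func main() {
	fs := &fakeStore{}
	cp := &checkpointer{s: fs, every: 100}
	for i := 0; i < 250; i++ {
		if err := cp.wrote(); err != nil {
			fmt.Println("sync failed:", err)
			return
		}
	}
	fmt.Println("syncs:", fs.syncs) // syncs: 2
}
```

With this shape a crash can lose at most the writes made since the last checkpoint, so `every` becomes the explicit knob for how much data is at risk.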

Why not Redis / SQLite / Loveliness / Dolt?

  • Redis: needs a service, not embedded, no time travel.
  • SQLite: great alternative; we picked bbolt for purer KV ergonomics and zero-CGO build.
  • Loveliness (sibling project): a clustered graph DB with Raft and Bolt protocol — exactly the opposite of what recall is. Use Loveliness when you need a graph service across machines; use recall when one process needs fast, durable, atomic memory.
  • Dolt: "Git for data" with SQL — heavier dep, single-writer-ish, but worth considering if branching/merging data (not just snapshots) becomes a hard requirement.

Roadmap

  • TTL / expiry on KV and sets.
  • Optional declarative rule files for completion predicates (only if the same predicate keeps appearing).
  • HTTP frontend (for non-MCP clients) — probably never; the CLI already covers it.
  • Optional Bloom-filter cache in front of set_has for huge sets (only if profiling demands it).

Contributing

Issues and PRs welcome. The bar:

  1. go test ./... passes.
  2. go vet ./... clean.
  3. New behaviour gets a test.
  4. Public API changes update the README.

CI runs build + test + vet on Linux, macOS, and Windows on every push and PR. A tag matching v*.*.* triggers a goreleaser run that publishes cross-platform binaries and checksums.

make test       # go test ./...
make build      # produce ./bin/{recall,recall-mcp}
make install    # to $(go env GOPATH)/bin

Status

Early but functional. Core primitives, CLI, MCP, time travel, batched writes, no-sync mode, and tests all work. API may change before 1.0.

License

MIT — see LICENSE.

Directories

Path                       Synopsis
cmd/recall (command)       recall is the CLI frontend for the recall embedded store.
cmd/recall-mcp (command)   recall-mcp exposes the recall store as an MCP server over stdio.
internal/config            Package config resolves where recall stores its data.
internal/history           Package history wraps go-git to give recall time-travel for its bbolt database.
internal/store             Package store is the embedded persistence layer for recall.
