# FireMemory

Local-first semantic memory engine for AI agents.

FireMemory stores everything in a single `.fbrain` file — no server, no cloud, no configuration.
Agents read and write memory through MCP via `fquery mcp`.
ML models (~325 MB) are downloaded automatically on first use.
## 60-second quickstart

### 1. Install

**macOS / Linux**

```sh
curl -fsSL https://raw.githubusercontent.com/phmotad/firememory/main/scripts/install.sh | bash
```

**Windows (PowerShell)** — installs `fmem`; `fquery` requires WSL2 or Docker

```powershell
irm https://raw.githubusercontent.com/phmotad/firememory/main/scripts/install.ps1 | iex
```

**Homebrew**

```sh
brew tap phmotad/firememory
brew install firememory
```

**Scoop**

```sh
scoop bucket add phmotad https://github.com/phmotad/scoop-firememory
scoop install firememory
```
### 2. Wire your editor

```sh
fquery init-mcp claude-code   # Claude Code
fquery init-mcp cursor        # Cursor
fquery init-mcp windsurf      # Windsurf
fquery init-mcp zed           # Zed
```

This writes the MCP server entry into the editor's config file and prints the path it modified.
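For orientation, the written entry is typically shaped like the standard MCP server config below. This is a sketch only — the exact key names and the `firememory` server label are assumptions, not confirmed output; use `fquery init-mcp <client> --print` to see exactly what would be written for your client:

```json
{
  "mcpServers": {
    "firememory": {
      "command": "fquery",
      "args": ["mcp"]
    }
  }
}
```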
### 3. Create a brainfile

```sh
fmem init ~/my.fbrain
```

Or skip this — `fmem stats` and any `fquery` tool call will auto-create `~/.firememory/default.fbrain` if it doesn't exist.
### 4. Restart your editor

The MCP server starts on demand. On the first call, `fquery mcp` downloads the three ML models (~325 MB, runs once). Subsequent starts are instant.
## What it is

FireMemory is not a vector database, not a RAG layer, and not SQL.
It is a cognitive memory engine: it understands what is being stored, deduplicates semantically, builds a knowledge graph, and assembles context windows tailored to a query.

| Concept | FireMemory |
| --- | --- |
| Storage format | Single `.fbrain` file (bbolt) |
| Embeddings | `multilingual-e5-small` INT8 (local ONNX) |
| Entity extraction | `GLiNER-small-v2.1` INT8 (local ONNX) |
| Intent / classification | `DeBERTa-v3-small` INT8 (local ONNX) |
| Model size | ~325 MB total, downloaded once |
| Transport | MCP over stdio (`fquery mcp`) |
| Privacy | 100% local — nothing leaves your machine |
## Agent connectivity

Agents talk to FireQuery (the MCP layer), not directly to FireMemory.

```
Your editor agent
      │ MCP (stdio)
      ▼
fquery mcp     ← FireQuery: validates, classifies, enriches
      │
      ▼
.fbrain file   ← FireMemory: stores, recalls, syncs
```
| Tool | Description |
| --- | --- |
| `remember` | Store a memory (deduplication is automatic) |
| `recall` | Semantic search over stored memories |
| `get_context` | Retrieve a ranked context window for a query |
| `sync` | Run slow-path enrichment (entities, relations, graph) |
| `explain` | Explain a stored memory |
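On the wire, each tool invocation is a standard MCP `tools/call` JSON-RPC request over stdio. A sketch of a `remember` call — the argument key `text` is an assumption for illustration, not FireQuery's documented schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "remember",
    "arguments": { "text": "User prefers Go for backend services" }
  }
}
```

Your editor constructs these requests for you; you never write them by hand.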
## CLI reference

### fmem

```
fmem init <file.fbrain>               create a new brainfile
fmem remember <file.fbrain> <text>    store a memory
fmem recall <file.fbrain> <query>     semantic search
fmem sync <file.fbrain>               entity/relation enrichment
fmem context <file.fbrain> <query>    build a context window
fmem inspect <file.fbrain>            show manifest
fmem snapshot <file.fbrain>           full data dump (JSON)
fmem backup <file.fbrain> <dest>      copy to backup path
fmem restore <backup> <file.fbrain>   restore from backup
fmem compact <file.fbrain>            reclaim space (bbolt vacuum)
fmem stats [<file.fbrain>]            memory counts
fmem default                          print/create default brainfile path
fmem version                          print version
```
### fquery

```
fquery mcp                    start MCP server (stdio)
fquery init-mcp <client>      configure editor MCP entry
                              clients: claude-code, cursor, windsurf, zed
    --print                   dry-run: show config that would be written
    --config <path>           override config file path
fquery models list            show downloaded model status
fquery models pull            download missing models
fquery models pull --force    re-download all models
fquery models gc              remove cached models
fquery devices                list compute devices (CPU/GPU)
fquery doctor                 run diagnostics
fquery version                print version
```
## Models

FireQuery uses three local ONNX INT8 models, downloaded automatically:

| Model | Use | Size |
| --- | --- | --- |
| `multilingual-e5-small` | Embeddings, semantic recall | ~120 MB |
| `deberta-v3-small` | Intent & trigger classification | ~72 MB |
| `gliner-small-v2.1` | Named entity extraction | ~121 MB |
Models are stored in:

- macOS — `~/Library/Caches/firememory/models`
- Linux — `~/.cache/firememory/models`
- Windows — `%LOCALAPPDATA%\firememory\models`

Override the location with the `FIREMEMORY_MODELS_DIR` environment variable.
To remove cached models, run `fquery models gc`.
## Docker

```sh
docker run --rm -i \
  -v "$HOME/.firememory/models:/models" \
  ghcr.io/phmotad/firequery mcp
```

Models are cached in the mounted volume and downloaded on first run.
## Build from source

Requires Go 1.24 and a C compiler (for CGO).

```sh
git clone https://github.com/phmotad/firememory
cd firememory
make build   # produces bin/fmem and bin/fquery (with -tags onnx)
make test    # runs all tests (offline-safe, no models needed)
```

Release binaries are built with goreleaser, and the ONNX Runtime shared library is bundled in each archive (no separate install needed).
## Architecture

```
cmd/fmem          — FireMemory CLI
cmd/fquery        — FireQuery CLI + MCP server
internal/
  engine/         — remember / recall / sync / context / explain
  storage/        — bbolt store behind the Store interface
  brainfile/      — .fbrain format, validation, migration
  dedup/          — semantic deduplication (hash + embedding)
  embedder/       — Embedder interface (E5, deterministic, external)
  graph/          — knowledge graph (entities + relations)
  firequery/      — cognitive interface layer (pipeline, MCP, contracts)
  firequery/onnx  — ONNX inference backend (build tag: onnx)
  modelcache/     — auto-download, verify, extract ML models
  initcfg/        — write MCP entries into editor config files
  defaultbrain/   — default brainfile path + auto-init
  version/        — version string injected at build time
```

Fast path (`remember`): hash → embed → dedup → persist
Slow path (`sync`): extract entities → build relations → update graph
## Contributing

See CONTRIBUTING.md. All tests must pass (`go test ./...`) before submitting a PR.
The ONNX backend is behind `//go:build onnx` — tests run offline without models by design.