firememory module v0.1.0
Published: May 3, 2026 License: Apache-2.0

FireMemory


Local-first semantic memory engine for AI agents.

FireMemory stores everything in a single .fbrain file — no server, no cloud, no configuration. Agents read and write memory through MCP via fquery mcp. ML models (~325 MB) are downloaded automatically on first use.


60-second quickstart

1. Install

macOS / Linux

curl -fsSL https://raw.githubusercontent.com/phmotad/firememory/main/scripts/install.sh | bash

Windows (PowerShell) — installs fmem; fquery requires WSL2 or Docker

irm https://raw.githubusercontent.com/phmotad/firememory/main/scripts/install.ps1 | iex

Homebrew

brew tap phmotad/firememory
brew install firememory

Scoop

scoop bucket add phmotad https://github.com/phmotad/scoop-firememory
scoop install firememory
2. Wire your editor
fquery init-mcp claude-code   # Claude Code
fquery init-mcp cursor        # Cursor
fquery init-mcp windsurf      # Windsurf
fquery init-mcp zed           # Zed

This writes the MCP server entry into the editor's config file and prints the path it modified.

3. Create a brainfile
fmem init ~/my.fbrain

Or skip this — fmem stats and any fquery tool call will auto-create ~/.firememory/default.fbrain if it doesn't exist.

4. Restart your editor

The MCP server starts on demand. On the first call, fquery mcp downloads the three ML models (~325 MB, runs once). Subsequent starts are instant.


What it is

FireMemory is not a vector database, not a RAG layer, and not SQL.

It is a cognitive memory engine: it understands what is being stored, deduplicates semantically, builds a knowledge graph, and assembles context windows tailored to a query.

Concept                  FireMemory
Storage format           Single .fbrain file (bbolt)
Embeddings               multilingual-e5-small INT8 (local ONNX)
Entity extraction        GLiNER-small-v2.1 INT8 (local ONNX)
Intent / classification  DeBERTa-v3-small INT8 (local ONNX)
Model size               ~325 MB total, downloaded once
Transport                MCP over stdio (fquery mcp)
Privacy                  100% local — nothing leaves your machine

Agent connectivity

Agents talk to FireQuery (the MCP layer), not directly to FireMemory.

Your editor agent
      │  MCP (stdio)
      ▼
  fquery mcp          ← FireQuery: validates, classifies, enriches
      │
      ▼
  .fbrain file        ← FireMemory: stores, recalls, syncs
Supported MCP tools
Tool         Description
remember     Store a memory (deduplication is automatic)
recall       Semantic search over stored memories
get_context  Retrieve a ranked context window for a query
sync         Run slow-path enrichment (entities, relations, graph)
explain      Explain a stored memory

CLI reference

fmem
fmem init <file.fbrain>                 create a new brainfile
fmem remember <file.fbrain> <text>      store a memory
fmem recall <file.fbrain> <query>       semantic search
fmem sync <file.fbrain>                 entity/relation enrichment
fmem context <file.fbrain> <query>      build a context window
fmem inspect <file.fbrain>              show manifest
fmem snapshot <file.fbrain>             full data dump (JSON)
fmem backup <file.fbrain> <dest>        copy to backup path
fmem restore <backup> <file.fbrain>     restore from backup
fmem compact <file.fbrain>              reclaim space (bbolt vacuum)
fmem stats [<file.fbrain>]              memory counts
fmem default                            print/create default brainfile path
fmem version                            print version
fquery
fquery mcp                              start MCP server (stdio)
fquery init-mcp <client>               configure editor MCP entry
  clients: claude-code, cursor, windsurf, zed
  --print                               dry-run: show config that would be written
  --config <path>                       override config file path
fquery models list                      show downloaded model status
fquery models pull                      download missing models
fquery models pull --force              re-download all models
fquery models gc                        remove cached models
fquery devices                          list compute devices (CPU/GPU)
fquery doctor                           run diagnostics
fquery version                          print version

Models

FireQuery uses three local ONNX INT8 models, downloaded automatically:

Model                  Use                              Size
multilingual-e5-small  Embeddings, semantic recall      ~120 MB
deberta-v3-small       Intent & trigger classification  ~72 MB
gliner-small-v2.1      Named entity extraction          ~121 MB

Models are stored in:

  • macOS:   ~/Library/Caches/firememory/models
  • Linux:   ~/.cache/firememory/models
  • Windows: %LOCALAPPDATA%\firememory\models

Override with FIREMEMORY_MODELS_DIR.

To remove: fquery models gc


Docker

docker run --rm -i \
  -v "$HOME/.firememory/models:/models" \
  ghcr.io/phmotad/firequery mcp

Models are cached in the mounted volume and downloaded on first run.


Build from source

Requires Go 1.24 and a C compiler (for CGO).

git clone https://github.com/phmotad/firememory
cd firememory
make build          # produces bin/fmem and bin/fquery (with -tags onnx)
make test           # runs all tests (offline-safe, no models needed)

Release binaries are built with goreleaser and the ONNX Runtime shared library is bundled in each archive (no separate install needed).


Architecture

cmd/fmem       — FireMemory CLI
cmd/fquery     — FireQuery CLI + MCP server

internal/
  engine/        — remember / recall / sync / context / explain
  storage/       — bbolt store behind the Store interface
  brainfile/     — .fbrain format, validation, migration
  dedup/         — semantic deduplication (hash + embedding)
  embedder/      — Embedder interface (E5, deterministic, external)
  graph/         — knowledge graph (entities + relations)
  firequery/     — cognitive interface layer (pipeline, MCP, contracts)
  firequery/onnx — ONNX inference backend (build tag: onnx)
  modelcache/    — auto-download, verify, extract ML models
  initcfg/       — write MCP entries into editor config files
  defaultbrain/  — default brainfile path + auto-init
  version/       — version string injected at build time

Fast path (remember): hash → embed → dedup → persist
Slow path (sync): extract entities → build relations → update graph


Contributing

See CONTRIBUTING.md. All tests must pass (go test ./...) before submitting a PR.

The ONNX backend is behind //go:build onnx — tests run offline without models by design.

Directories

Path Synopsis
cmd
fmem command
fquery command
fquery-mcp command
internal
defaultbrain
Package defaultbrain resolves and auto-initialises the default brainfile (~/.firememory/default.fbrain) used when no explicit path is given.
firequery/onnx
Package onnx provides ONNX Runtime-based inference for FireQuery specialists.
initcfg
Package initcfg writes MCP server entries into the configuration files used by popular AI coding clients (Claude Code, Cursor, Windsurf, …).
modelcache
Package modelcache manages automatic download and local caching of ONNX models required by the FireQuery inference pipeline.
protocol/adp
Package adp contains the internal agent data protocol contracts.
util
Package util contains shared helpers that do not belong to a domain package.
