mmapforge

v0.1.0
Published: Feb 21, 2026 License: MIT Imports: 14 Imported by: 0

README

mmapforge

Incubating - still a work in progress.

A zero-copy, mmap-backed typed record store for Go. No serialization. No allocation on reads. No external dependencies.

You define a struct, annotate it, run the code generator, and get a fully typed store that reads and writes directly from memory-mapped files. Field access is a single memory load - ~3ns per read on Apple M4 Pro, zero heap allocations.

Install

go install github.com/CreditWorthy/mmapforge/cmd/mmapforge@latest

This installs the mmapforge code generator. Then add the library to your project:

go get github.com/CreditWorthy/mmapforge

Usage

1. Define your struct

Create a Go file with a struct annotated with mmap tags and a mmapforge:schema comment:

package mypackage

//go:generate mmapforge -input types.go

// mmapforge:schema version=1
type Tick struct {
    Symbol    string  `mmap:"symbol,64"`
    Price     float64 `mmap:"price"`
    Volume    float64 `mmap:"volume"`
    Timestamp uint64  `mmap:"timestamp"`
}

String fields take a max size after the name (e.g. symbol,64 for a 64-byte string). Numeric fields are fixed size.

2. Generate the store
go generate ./...

This creates a tick_store.go file with a fully typed TickStore that has Get/Set methods for every field, plus Append, Len, Close, and Sync.

3. Use it
// Create a new store
store, err := NewTickStore("ticks.mmf")
if err != nil {
    log.Fatal(err)
}
defer store.Close()

// Append a record
idx, err := store.Append()
if err != nil {
    log.Fatal(err)
}

// Write fields
store.SetSymbol(idx, "AAPL")
store.SetPrice(idx, 189.50)
store.SetVolume(idx, 52_000_000)
store.SetTimestamp(idx, uint64(time.Now().UnixNano()))

// Read fields
price, err := store.GetPrice(idx)
if err != nil {
    log.Fatal(err)
}
log.Printf("price: %v", price)

All reads and writes go directly to the memory-mapped file. No serialization, no copies. Concurrent reads are lock-free via per-record seqlocks.

Why

Most storage libraries serialize your data on write and deserialize on read. That costs CPU time and heap allocations. mmapforge skips all of that - your data lives in a flat binary format on disk, memory-mapped into your process. Reading a field is just pointer arithmetic into the mapped region.

This is useful for:

  • Game state - thousands of entities updated every tick
  • Time-series data - append-only streams of fixed-size records
  • Caches - memory-mapped shared state between processes
  • Anything where read speed matters more than flexibility
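
To make the "pointer arithmetic" point above concrete, here is a minimal sketch that computes a field's byte position in a flat buffer and decodes it in place. It uses an ordinary byte slice rather than a real mapping, and it assumes that records sit back to back right after the 64-byte header and that values are little-endian. The helper names (fieldPos, readFloat64, writeFloat64) and the 32-byte record are invented for the example and are not part of mmapforge.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

const headerSize = 64 // matches the documented HeaderSize

// fieldPos returns the absolute byte position of a field.
// Assumption: records are stored contiguously right after the header.
func fieldPos(idx int, recordSize, fieldOffset uint32) int {
	return headerSize + idx*int(recordSize) + int(fieldOffset)
}

// readFloat64 decodes 8 bytes in place; nothing is copied or allocated.
func readFloat64(buf []byte, idx int, recordSize, fieldOffset uint32) float64 {
	p := fieldPos(idx, recordSize, fieldOffset)
	return math.Float64frombits(binary.LittleEndian.Uint64(buf[p : p+8]))
}

// writeFloat64 encodes v directly into the buffer at the field position.
func writeFloat64(buf []byte, idx int, recordSize, fieldOffset uint32, v float64) {
	p := fieldPos(idx, recordSize, fieldOffset)
	binary.LittleEndian.PutUint64(buf[p:p+8], math.Float64bits(v))
}

func main() {
	const recordSize = 32 // hypothetical: 8-byte seq counter plus three 8-byte fields
	buf := make([]byte, headerSize+4*recordSize) // stand-in for the mapped file
	writeFloat64(buf, 2, recordSize, 16, 189.50)
	fmt.Println(fieldPos(2, recordSize, 16), readFloat64(buf, 2, recordSize, 16))
}
```

With a real mapping the slice would be backed by the file's pages, so the same index arithmetic reads straight from the page cache.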

Benchmarks

All benchmarks run on Apple M4 Pro, darwin/arm64, Go 1.24. Run with:

go test ./... -bench=. -benchmem
Core Store — Read Path
Benchmark                  ns/op  B/op  allocs/op
ReadUint64                 1.79   0     0
ReadFloat64                1.80   0     0
ReadInt32                  1.79   0     0
ReadUint8                  1.79   0     0
ReadString                 2.30   0     0
ReadMultiField (4 fields)  7.52   0     0
Core Store — Write Path
Benchmark                   ns/op  B/op  allocs/op
WriteUint64                 1.81   0     0
WriteFloat64                1.81   0     0
WriteInt32                  2.01   0     0
WriteString                 4.09   0     0
WriteMultiField (4 fields)  13.44  0     0
Seqlock
Benchmark      ns/op  B/op  allocs/op
SeqReadBegin   0.44   0     0
SeqWriteCycle  1.40   0     0
Append
Benchmark  ns/op  B/op  allocs/op
Append     5.12   0     0
Generated Store (MarketCap example) — Per-field Get
Benchmark     ns/op  B/op  allocs/op
GetID         3.11   0     0
GetPrice      3.23   0     0
GetVolume     3.25   0     0
GetMarketCap  3.23   0     0
GetStale      3.17   0     0
Generated Store (MarketCap example) — Per-field Set
Benchmark     ns/op  B/op  allocs/op
SetID         3.55   0     0
SetPrice      3.62   0     0
SetVolume     3.62   0     0
SetMarketCap  3.62   0     0
SetStale      3.58   0     0
Generated Store — Bulk Operations
Benchmark                     ns/op  B/op  allocs/op
BulkGet (all fields, atomic)  18.83  48    1
BulkSet (all fields, atomic)  10.62  0     0
Header
Benchmark     ns/op  B/op  allocs/op
EncodeHeader  2.12   0     0
DecodeHeader  13.19  64    1
Layout Engine
Benchmark                  ns/op  B/op  allocs/op
ComputeLayout (2 fields)   88.61  272   3
ComputeLayout (5 fields)   178.1  640   3
ComputeLayout (10 fields)  446.2  1768  6
SchemaHash                 666.5  808   22
vs. os.File + encoding/binary Baseline
Benchmark         ns/op  Speedup
mmap ReadUint64   1.79
os.File ReadAt    319.6  179× slower
mmap WriteUint64  1.81
os.File WriteAt   609.6  337× slower

Crash Safety

mmapforge is a datastore primitive, not a database. It provides fast, typed, memory-mapped storage but makes no durability or transactional guarantees. Here is what happens if the process dies unexpectedly:

What's protected
  • Seqlock recovery - if a writer crashes mid-write, the per-record sequence counter gets stuck at an odd value. On the next OpenStore, all stuck counters are automatically reset so readers don't spin forever. The data in that record may be partially written (torn).
What's not protected
  • Torn multi-field writes - writing multiple fields is not atomic. If the process dies mid-write, some fields may have the new value and others the old value. Single aligned 8-byte writes (WriteUint64, WriteFloat64, etc.) are hardware-atomic on x86/arm64.
  • Stale header - the on-disk header RecordCount is updated on Sync() or Close(). If neither is called before a crash, the header may report fewer records than were actually appended. The data is present in the file but the count is stale.
  • No fsync on write - writes go to the kernel page cache via mmap. They are not flushed to stable storage until Sync() is called or the kernel decides to write back dirty pages. A power failure (not just process crash) can lose recently written data.
Recommendations
  • Call Sync() periodically if you need durability.
  • Use mmapforge for hot in-process data (caches, game state, real-time feeds), not as a primary durable store.
  • If you need crash-safe transactions, put a WAL or database in front.

Documentation

Index

Constants

const DefaultMaxVA = 1 << 30

DefaultMaxVA is the fallback virtual address reservation when no reserveVA is passed to Map. Set low because callers should always provide an explicit value (e.g. StoreReserveVA). The actual reservation is clamped to at least the page-aligned file size, so this only controls headroom for future growth.

const HeaderSize = 64

HeaderSize is the fixed size of the file header in bytes.

const MagicString = "MMFG"

MagicString is the string form of Magic for display purposes.

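const SeqFieldSize = 8

SeqFieldSize is the size in bytes of the per-record seqlock sequence counter stored at offset 0 of every record.
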
const StoreReserveVA = 1 << 30

StoreReserveVA is the default virtual address reservation for Store files (1 GB).

const Version uint32 = 1

Version is the current binary format version.

Variables

var (
	ErrSchemaMismatch = errors.New("mmapforge: schema hash mismatch")
	ErrOutOfBounds    = errors.New("mmapforge: index out of bounds")
	ErrCorrupted      = errors.New("mmapforge: file corrupted")
	ErrBadMagic       = errors.New("mmapforge: invalid magic bytes")
	ErrStringTooLong  = errors.New("mmapforge: string exceeds max size")
	ErrBytesTooLong   = errors.New("mmapforge: bytes exceeds max size")
	ErrReadOnly       = errors.New("mmapforge: store is read-only")
	ErrClosed         = errors.New("mmapforge: store is closed")
	ErrInvalidBool    = errors.New("mmapforge: invalid bool value")
	ErrTypeMismatch   = errors.New("mmapforge: field type changed during migration")
)

Sentinel errors returned by Store and Region operations.

var Magic = [4]byte{'M', 'M', 'F', 'G'}

Magic is the 4-byte file signature written at the start of every mmapforge file.

Functions

func EncodeHeader

func EncodeHeader(dst []byte, h *Header) error

EncodeHeader writes h into the first 64 bytes of dst.

func SchemaHash

func SchemaHash(fields []FieldDescriptor) [32]byte

SchemaHash computes the SHA-256 of a canonical field descriptor string. Fields are sorted by name, so the hash is independent of layout order.
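
A minimal sketch of the idea, assuming a canonical string of the form "name:type:size;" per field. That format is invented for this example; the real canonical string is not documented here, so hashes produced by this sketch will not match the library's.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// fieldDescriptor mirrors the documented FieldDescriptor shape.
type fieldDescriptor struct {
	Name string
	Type string
	Size uint32
}

// schemaHash sorts a copy of the fields by name, joins them into one
// canonical string, and hashes it with SHA-256.
// Assumption: the "name:type:size;" encoding is made up for illustration.
func schemaHash(fields []fieldDescriptor) [32]byte {
	sorted := append([]fieldDescriptor(nil), fields...) // leave the input untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	var sb strings.Builder
	for _, f := range sorted {
		fmt.Fprintf(&sb, "%s:%s:%d;", f.Name, f.Type, f.Size)
	}
	return sha256.Sum256([]byte(sb.String()))
}

func main() {
	a := []fieldDescriptor{{"price", "float64", 8}, {"symbol", "string", 64}}
	b := []fieldDescriptor{{"symbol", "string", 64}, {"price", "float64", 8}}
	fmt.Println(schemaHash(a) == schemaHash(b)) // same fields, different order

	b[1].Type = "float32"
	fmt.Println(schemaHash(a) == schemaHash(b)) // a type change alters the hash
}
```

Sorting before hashing means reordering struct fields does not invalidate existing files, while renaming, retyping, or resizing a field does.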

Types

type AccessPattern

type AccessPattern int

AccessPattern hints to the kernel how the mapped region will be read. When a mapped page is first touched, the OS loads it from disk on demand (a page fault). Knowing the access pattern ahead of time lets the kernel prefetch more intelligently.

const (
	// Sequential <- we read front to back; kernel will aggressively prefetch ahead
	Sequential AccessPattern = iota

	// Random <- we jump around; kernel skips prefetch, keeps more pages cached instead
	Random
)

type FieldDef

type FieldDef struct {
	Name    string
	GoName  string
	Type    FieldType
	MaxSize uint32
}

FieldDef is the input to the layout engine: one per struct field.

type FieldDescriptor

type FieldDescriptor struct {
	Name string
	Type string
	Size uint32
}

FieldDescriptor is the canonical representation of a field for schema hashing.

type FieldLayout

type FieldLayout struct {
	FieldDef
	Offset uint32
	Size   uint32
	Align  uint32
}

FieldLayout is the output: a field with its computed offset and size.

type FieldType

type FieldType int

FieldType enumerates the supported binary field types.

const (
	FieldBool FieldType = iota
	FieldInt8
	FieldUint8
	FieldInt16
	FieldUint16
	FieldInt32
	FieldUint32
	FieldInt64
	FieldUint64
	FieldFloat32
	FieldFloat64
	FieldString
	FieldBytes
)

func (FieldType) String

func (t FieldType) String() string

String returns the canonical name for a field type.

type Header

type Header struct {
	Magic         [4]byte
	FormatVersion uint32
	SchemaHash    [32]byte
	SchemaVersion uint32
	RecordSize    uint32
	RecordCount   uint64
	Capacity      uint64
}

Header is the 64-byte metadata block at the start of every mmapforge file.

func DecodeHeader

func DecodeHeader(src []byte) (*Header, error)

DecodeHeader reads the first 64 bytes of src into a Header.
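
The Header fields add up to exactly 64 bytes (4 + 4 + 32 + 4 + 4 + 8 + 8), which suggests a packed encoding with no padding. The sketch below round-trips such a header, assuming the fields are written in struct order and little-endian. Both are guesses, so do not rely on these offsets to parse real files.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

const headerSize = 64

var magic = [4]byte{'M', 'M', 'F', 'G'}

type header struct {
	Magic         [4]byte
	FormatVersion uint32
	SchemaHash    [32]byte
	SchemaVersion uint32
	RecordSize    uint32
	RecordCount   uint64
	Capacity      uint64
}

// encodeHeader writes h into the first 64 bytes of dst.
// Assumption: struct order, little-endian, no padding.
func encodeHeader(dst []byte, h *header) error {
	if len(dst) < headerSize {
		return errors.New("dst shorter than header")
	}
	le := binary.LittleEndian
	copy(dst[0:4], h.Magic[:])
	le.PutUint32(dst[4:8], h.FormatVersion)
	copy(dst[8:40], h.SchemaHash[:])
	le.PutUint32(dst[40:44], h.SchemaVersion)
	le.PutUint32(dst[44:48], h.RecordSize)
	le.PutUint64(dst[48:56], h.RecordCount)
	le.PutUint64(dst[56:64], h.Capacity)
	return nil
}

// decodeHeader reads the first 64 bytes of src and checks the magic bytes.
func decodeHeader(src []byte) (*header, error) {
	if len(src) < headerSize {
		return nil, errors.New("src shorter than header")
	}
	if !bytes.Equal(src[0:4], magic[:]) {
		return nil, errors.New("invalid magic bytes")
	}
	le := binary.LittleEndian
	h := &header{
		FormatVersion: le.Uint32(src[4:8]),
		SchemaVersion: le.Uint32(src[40:44]),
		RecordSize:    le.Uint32(src[44:48]),
		RecordCount:   le.Uint64(src[48:56]),
		Capacity:      le.Uint64(src[56:64]),
	}
	copy(h.Magic[:], src[0:4])
	copy(h.SchemaHash[:], src[8:40])
	return h, nil
}

func main() {
	buf := make([]byte, headerSize)
	in := &header{Magic: magic, FormatVersion: 1, SchemaVersion: 1, RecordSize: 32, RecordCount: 3, Capacity: 128}
	if err := encodeHeader(buf, in); err != nil {
		fmt.Println("encode:", err)
		return
	}
	out, err := decodeHeader(buf)
	if err != nil {
		fmt.Println("decode:", err)
		return
	}
	fmt.Println("round trip equal:", *out == *in)
}
```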

type RecordLayout

type RecordLayout struct {
	Fields     []FieldLayout
	RecordSize uint32
}

RecordLayout is the complete layout for one struct.

func ComputeLayout

func ComputeLayout(fields []FieldDef) (*RecordLayout, error)

ComputeLayout takes field definitions in declaration order and returns the byte layout with proper alignment. Returns an error if any field definition is invalid.

The first 8 bytes of every record are reserved for the seqlock sequence counter. User fields start at offset 8.
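
A simplified sketch of such a layout pass is shown below. It places fields in declaration order starting at offset 8 and pads each one to its alignment. The rounding of the record size up to a multiple of 8 (so the next record's counter stays aligned) is an assumption, and the types here are trimmed-down stand-ins for FieldDef and FieldLayout.

```go
package main

import "fmt"

const seqFieldSize = 8 // documented SeqFieldSize: the seqlock counter comes first

type fieldDef struct {
	Name  string
	Size  uint32 // bytes occupied by the field
	Align uint32 // required alignment, must be a power of two
}

type fieldLayout struct {
	fieldDef
	Offset uint32
}

// alignUp rounds n up to the next multiple of align (a power of two).
func alignUp(n, align uint32) uint32 { return (n + align - 1) &^ (align - 1) }

// computeLayout assigns offsets in declaration order, inserting padding where
// a field's alignment requires it, and returns the total record size.
// Assumption: the record size is rounded up to 8 bytes.
func computeLayout(fields []fieldDef) ([]fieldLayout, uint32, error) {
	out := make([]fieldLayout, 0, len(fields))
	offset := uint32(seqFieldSize)
	for _, f := range fields {
		if f.Size == 0 || f.Align == 0 || f.Align&(f.Align-1) != 0 {
			return nil, 0, fmt.Errorf("invalid field %q", f.Name)
		}
		offset = alignUp(offset, f.Align)
		out = append(out, fieldLayout{fieldDef: f, Offset: offset})
		offset += f.Size
	}
	return out, alignUp(offset, 8), nil
}

func main() {
	fields := []fieldDef{{"stale", 1, 1}, {"price", 8, 8}, {"count", 4, 4}}
	layout, size, err := computeLayout(fields)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, f := range layout {
		fmt.Printf("%-6s offset=%d size=%d\n", f.Name, f.Offset, f.Size)
	}
	fmt.Println("record size:", size)
}
```

In this example the 1-byte field at offset 8 is followed by 7 bytes of padding so the 8-byte field lands at offset 16. Declaring wide fields first would avoid that gap.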

func (*RecordLayout) Descriptors

func (r *RecordLayout) Descriptors() []FieldDescriptor

Descriptors converts the layout to FieldDescriptors for schema hashing.

type Region

type Region struct {
	// contains filtered or unexported fields
}

Region is a page-aligned, memory-mapped view of a file with a stable base address. A large virtual address range is reserved up front with PROT_NONE. The file is mapped over the start of that range using MAP_FIXED. On Grow the file is extended and remapped at the same base address, so pointers and slices obtained from Slice remain valid as long as they fall within the previously mapped size.

Owns the underlying *os.File. Safe for concurrent reads after Map returns.

func Map

func Map(f *os.File, size int, writable bool, access AccessPattern, reserveVA ...int) (*Region, error)

Map opens a memory-mapped view of f starting at offset 0.

A virtual address range of reserveVA bytes is reserved (PROT_NONE, anonymous), falling back to DefaultMaxVA when reserveVA is omitted or 0. The file is then mapped over the first size bytes of that reservation using MAP_FIXED|MAP_SHARED. If the file is smaller than the requested size, it is extended via Truncate.

reserveVA must be >= size. Pass 0 to use DefaultMaxVA.

Caller must call Close when done.

func (*Region) Close

func (r *Region) Close() error

Close unmaps the region and closes the file descriptor.

func (*Region) Grow

func (r *Region) Grow(minSize int) error

Grow remaps the file to at least minSize bytes (page-aligned) at the same base address using MAP_FIXED. No-op if already large enough.

Because the base address never changes, slices from previous Slice calls remain valid (they point into the same VA range). New pages beyond the old size become accessible after Grow returns.

Must be externally serialized (Store.appendMu).

func (*Region) Mapped

func (r *Region) Mapped() int

Mapped returns the size of the mapped region in bytes.

func (*Region) Slice

func (r *Region) Slice(offset, n int) []byte

Slice returns the mmap byte range [offset, offset+n) from the stable base. An out-of-range request panics on purpose so that layout bugs surface fast. The returned slice is valid for the lifetime of the Region (the base address never changes).

func (*Region) Sync

func (r *Region) Sync() error

Sync flushes dirty pages to disk via msync. It blocks until the kernel confirms the write hit stable storage.

func (*Region) Unmap

func (r *Region) Unmap() error

Unmap releases the entire VA reservation. Idempotent.

type Store

type Store struct {
	// contains filtered or unexported fields
}

Store is the base mmap-backed record store.

func CreateStore

func CreateStore(path string, layout *RecordLayout, schemaVersion uint32) (*Store, error)

CreateStore creates a new mmapforge file at path with the given layout and schema version.

func OpenStore

func OpenStore(path string, layout *RecordLayout) (*Store, error)

OpenStore opens an existing mmapforge file and validates the schema hash.

func (*Store) Append

func (s *Store) Append() (int, error)

Append adds a new zero-filled record and returns its index.

func (*Store) Cap

func (s *Store) Cap() int

Cap returns how many records fit in the current file mapping.

func (*Store) Close

func (s *Store) Close() error

Close syncs and closes the store. All references into store memory become invalid.

func (*Store) Len

func (s *Store) Len() int

Len returns the number of records in the store.

func (*Store) ReadBool

func (s *Store) ReadBool(idx int, offset uint32) (bool, error)

ReadBool reads a bool from record idx at the given byte offset.

func (*Store) ReadBytes

func (s *Store) ReadBytes(idx int, offset, fieldSize, maxSize uint32) ([]byte, error)

ReadBytes returns a zero-copy byte slice from the mmap region. The returned slice is valid only until Close() is called.

func (*Store) ReadFloat32

func (s *Store) ReadFloat32(idx int, offset uint32) (float32, error)

ReadFloat32 reads a float32 from record idx at the given byte offset.

func (*Store) ReadFloat64

func (s *Store) ReadFloat64(idx int, offset uint32) (float64, error)

ReadFloat64 reads a float64 from record idx at the given byte offset.

func (*Store) ReadInt8

func (s *Store) ReadInt8(idx int, offset uint32) (int8, error)

ReadInt8 reads an int8 from record idx at the given byte offset.

func (*Store) ReadInt16

func (s *Store) ReadInt16(idx int, offset uint32) (int16, error)

ReadInt16 reads an int16 from record idx at the given byte offset.

func (*Store) ReadInt32

func (s *Store) ReadInt32(idx int, offset uint32) (int32, error)

ReadInt32 reads an int32 from record idx at the given byte offset.

func (*Store) ReadInt64

func (s *Store) ReadInt64(idx int, offset uint32) (int64, error)

ReadInt64 reads an int64 from record idx at the given byte offset.

func (*Store) ReadString

func (s *Store) ReadString(idx int, offset, fieldSize, maxSize uint32) (string, error)

ReadString returns a zero-copy string from the mmap region. The returned string is valid only until Close() is called. fieldSize is the total field size.

func (*Store) ReadUint8

func (s *Store) ReadUint8(idx int, offset uint32) (uint8, error)

ReadUint8 reads a uint8 from record idx at the given byte offset.

func (*Store) ReadUint16

func (s *Store) ReadUint16(idx int, offset uint32) (uint16, error)

ReadUint16 reads a uint16 from record idx at the given byte offset.

func (*Store) ReadUint32

func (s *Store) ReadUint32(idx int, offset uint32) (uint32, error)

ReadUint32 reads a uint32 from record idx at the given byte offset.

func (*Store) ReadUint64

func (s *Store) ReadUint64(idx int, offset uint32) (uint64, error)

ReadUint64 reads a uint64 from record idx at the given byte offset.

func (*Store) SeqBeginWrite

func (s *Store) SeqBeginWrite(idx int)

SeqBeginWrite marks the start of a write to record idx. Increments the 8-byte sequence counter at offset 0 of the record to an odd value. Caller must call SeqEndWrite when the write is complete.

func (*Store) SeqEndWrite

func (s *Store) SeqEndWrite(idx int)

SeqEndWrite marks the end of a write to record idx. Increments the sequence counter to an even value.

func (*Store) SeqReadBegin

func (s *Store) SeqReadBegin(idx int) uint64

SeqReadBegin loads the sequence counter for record idx. If the value is odd, a write is in progress and the caller should spin.

func (*Store) SeqReadValid

func (s *Store) SeqReadValid(idx int, seq uint64) bool

SeqReadValid returns true if seq is even (no write in progress) and the current counter still matches seq (no write happened during the read).
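
Taken together, the four Seq methods form the usual seqlock protocol: a writer brackets its stores with two increments, and a reader retries until it sees the same even counter before and after its loads. The sketch below models this on an in-memory struct with sync/atomic (Go 1.19 or later) rather than on a Store. It assumes a single writer at a time, and the method names are local analogues of the documented ones.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// record mimics one stored record: a sequence counter followed by two fields.
// The fields are atomics so the sketch is race-free under Go's memory model.
type record struct {
	seq  atomic.Uint64
	a, b atomic.Uint64
}

func (r *record) beginWrite() { r.seq.Add(1) } // counter becomes odd
func (r *record) endWrite()   { r.seq.Add(1) } // counter becomes even again

func (r *record) readBegin() uint64 { return r.seq.Load() }

// readValid reports whether a read that started at seq saw a stable record.
func (r *record) readValid(seq uint64) bool {
	return seq%2 == 0 && r.seq.Load() == seq
}

// readPair retries until it observes both fields from the same write.
func (r *record) readPair() (uint64, uint64) {
	for {
		seq := r.readBegin()
		a, b := r.a.Load(), r.b.Load()
		if r.readValid(seq) {
			return a, b
		}
	}
}

func main() {
	var r record
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // single writer: always stores the same value in both fields
		defer wg.Done()
		for i := uint64(1); i <= 100000; i++ {
			r.beginWrite()
			r.a.Store(i)
			r.b.Store(i)
			r.endWrite()
		}
	}()

	torn := 0
	for i := 0; i < 100000; i++ {
		if a, b := r.readPair(); a != b {
			torn++
		}
	}
	wg.Wait()
	fmt.Println("torn reads:", torn)
}
```

Readers never block the writer and take no lock; the cost is that a reader may repeat its loads when a write overlaps.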

func (*Store) Sync

func (s *Store) Sync() error

Sync flushes the header and dirty pages to disk.

func (*Store) WriteBool

func (s *Store) WriteBool(idx int, offset uint32, val bool) error

WriteBool writes a bool to record idx at the given byte offset.

func (*Store) WriteBytes

func (s *Store) WriteBytes(idx int, offset, fieldSize, maxSize uint32, val []byte) error

WriteBytes writes a length-prefixed byte slice into the field, zero-padding the remainder.

func (*Store) WriteFloat32

func (s *Store) WriteFloat32(idx int, offset uint32, val float32) error

WriteFloat32 writes a float32 to record idx at the given byte offset.

func (*Store) WriteFloat64

func (s *Store) WriteFloat64(idx int, offset uint32, val float64) error

WriteFloat64 writes a float64 to record idx at the given byte offset.

func (*Store) WriteInt8

func (s *Store) WriteInt8(idx int, offset uint32, val int8) error

WriteInt8 writes an int8 to record idx at the given byte offset.

func (*Store) WriteInt16

func (s *Store) WriteInt16(idx int, offset uint32, val int16) error

WriteInt16 writes an int16 to record idx at the given byte offset.

func (*Store) WriteInt32

func (s *Store) WriteInt32(idx int, offset uint32, val int32) error

WriteInt32 writes an int32 to record idx at the given byte offset.

func (*Store) WriteInt64

func (s *Store) WriteInt64(idx int, offset uint32, val int64) error

WriteInt64 writes an int64 to record idx at the given byte offset.

func (*Store) WriteString

func (s *Store) WriteString(idx int, offset, fieldSize, maxSize uint32, val string) error

WriteString writes a length-prefixed string into the field, zero-padding the remainder.

func (*Store) WriteUint8

func (s *Store) WriteUint8(idx int, offset uint32, val uint8) error

WriteUint8 writes a uint8 to record idx at the given byte offset.

func (*Store) WriteUint16

func (s *Store) WriteUint16(idx int, offset uint32, val uint16) error

WriteUint16 writes a uint16 to record idx at the given byte offset.

func (*Store) WriteUint32

func (s *Store) WriteUint32(idx int, offset uint32, val uint32) error

WriteUint32 writes a uint32 to record idx at the given byte offset.

func (*Store) WriteUint64

func (s *Store) WriteUint64(idx int, offset uint32, val uint64) error

WriteUint64 writes a uint64 to record idx at the given byte offset.

Directories

Path Synopsis
cmd
mmapforge command
internal
