stockpile

package module
v0.0.0-...-59b0c1d Latest
Published: Apr 8, 2026 License: MIT Imports: 15 Imported by: 0

README

Stockpile

Fast lookups for hundreds of millions of chess position evaluations from the Lichess database.

eval, err := client.Lookup(ctx, "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -")
fmt.Println(eval.Score()) // +0.20

Why?

Running Stockfish analysis is expensive. Analyzing a single game at depth 20+ takes several seconds of CPU time. Disco Chess is a chess training platform built around the Woodpecker Method — solving the same puzzles repeatedly until pattern recognition becomes automatic. Its review queue resurfaces mistakes from both training cycles and users' actual games, which means analyzing thousands of imported games to find missed tactics. That's a lot of Stockfish:

  • Scaling headaches: Spin up worker pools, manage job queues, handle bursty workloads (sound familiar?)
  • Slow feedback: Users wait minutes for game analysis to complete
  • High costs: CPU-intensive workloads don't come cheap

Stockpile sidesteps most of this. Lichess has already analyzed hundreds of millions of positions with Stockfish at depth 30+. Why redo work they've already done? Look up what exists, run Stockfish only for the gaps.

How?

  1. Shard by material — Positions are distributed across 32K shards based on piece counts. Positions with similar material land in the same shard.

  2. Sort by FEN — Within each shard, positions are sorted lexicographically. Lookups use binary search.

  3. Cache hot shards — An LRU cache keeps frequently accessed shards in memory. Game analysis hits the same few shards repeatedly.

The key insight: consecutive positions in a chess game usually have the same material, because most moves are not captures. Material-based sharding keeps them together, which raises the cache hit rate.
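
To make the caching step concrete, here is a minimal sketch of an LRU cache keyed by shard ID. It is not the library's cachedstore implementation: the names shardCache, newShardCache, get, and put are hypothetical, and the sketch is not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry pairs a shard ID with its decompressed bytes.
type entry struct {
	shardID int
	data    []byte
}

// shardCache is a tiny LRU cache keyed by shard ID (hypothetical type,
// not part of stockpile). It is not safe for concurrent use.
type shardCache struct {
	capacity int
	order    *list.List            // front = most recently used
	items    map[int]*list.Element // shard ID -> element in order
}

func newShardCache(capacity int) *shardCache {
	return &shardCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[int]*list.Element),
	}
}

// get returns the cached shard and marks it as most recently used.
func (c *shardCache) get(id int) ([]byte, bool) {
	el, ok := c.items[id]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).data, true
}

// put inserts or refreshes a shard and evicts the least recently used
// shard once the cache holds more than capacity entries.
func (c *shardCache) put(id int, data []byte) {
	if el, ok := c.items[id]; ok {
		el.Value.(*entry).data = data
		c.order.MoveToFront(el)
		return
	}
	c.items[id] = c.order.PushFront(&entry{shardID: id, data: data})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).shardID)
	}
}

func main() {
	cache := newShardCache(2)
	cache.put(1, []byte("shard 1"))
	cache.put(2, []byte("shard 2"))
	cache.get(1)                    // shard 1 becomes the most recently used
	cache.put(3, []byte("shard 3")) // evicts shard 2
	_, ok := cache.get(2)
	fmt.Println(ok) // false
}
```

Because a game tends to stay within a small set of shards, even a small capacity can keep most lookups in memory.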

Features

  • Fast lookups with LRU caching
  • Hundreds of millions of positions from Lichess Stockfish evaluations (depth 30+)
  • Pluggable storage: Local filesystem, GCS, S3
  • Material-based sharding for cache locality during game analysis
  • No external services required at runtime (all data is self-contained)

Quick Start

Installation

go get github.com/discochess/stockpile

Build the Database

Download and process the Lichess evaluation database:

# Install CLI
go install github.com/discochess/stockpile/cmd/stockpile@latest

# Build from Lichess source (downloads ~17GB)
stockpile build --output ./data

# Or from a local file
stockpile build --source ./lichess_db_eval.jsonl.zst --output ./data

Build options:

Flag          Default   Description
--output      ./data    Output directory for shards (local)
--output-gcs            GCS path for output (gs://bucket/prefix)
--shards      32768     Number of shards to create
--strategy    material  Sharding strategy: material, fnv32
--workers     4         Parallel workers for compression
--max-memory  1024      Max memory (MB) before spilling to disk

Memory note: The build process can be memory-intensive. If you experience OOM kills, lower --max-memory (e.g., --max-memory 512). For long builds, use caffeinate on macOS:

caffeinate -dims stockpile build --source ./lichess_db_eval.jsonl.zst --output ./data --workers 10

GCS output: For cloud deployments, build directly to GCS:

stockpile build --output-gcs gs://my-bucket/stockpile

This builds locally to a temp directory, then uploads the result to GCS. It is suitable for a monthly cron job that picks up new positions from Lichess.

Use the Library

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/discochess/stockpile"
)

func main() {
    // Create client with default settings (LRU cache, zstd decompression).
    opt, err := stockpile.WithDataDir("./data")
    if err != nil {
        log.Fatal(err)
    }
    client, err := stockpile.New(opt)
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    // Look up a position.
    ctx := context.Background()
    eval, err := client.Lookup(ctx, "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Score: %s\n", eval.Score())  // +0.20
    fmt.Printf("Depth: %d\n", eval.Depth)    // 36
    if pv := eval.BestPV(); pv != nil {
        fmt.Printf("PV: %s\n", pv.Line)      // e2e4 e7e5 g1f3
    }
}

For advanced configuration (custom cache size, cloud storage), see examples/.

CLI Usage

# Look up a position
stockpile lookup "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"

# Show database stats
stockpile stats --data-dir ./data

# Verify database integrity
stockpile verify --data-dir ./data

Architecture

Design Principles

Inspired by SSTables: sorted immutable files with binary search.

  1. Simplicity over cleverness — Standard formats (JSONL, zstd) over custom binary formats
  2. Pluggable components — Interfaces for storage, cache, and sharding strategy
  3. Zero runtime dependencies — All data self-contained, no external services required

Data Flow

Build Phase:
  Lichess DB (.zst) → Decompress → Shard by Material → Sort by FEN → Compress

Lookup Phase:
  FEN → Compute Shard ID → Check Cache → Decompress (if miss) → Binary Search

Shard File Format

Each shard is a zstd-compressed JSONL file with lines sorted by FEN:

shards/
├── 00000.zst
├── 00001.zst
├── ...
└── 32767.zst

Each line matches the Lichess format:

{"fen":"rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq -","evals":[{"pvs":[{"cp":20,"line":"e7e5"}],"knodes":3000,"depth":36}]}
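
As an illustration of the record layout, the following sketch decodes one shard line with encoding/json. The struct names are hypothetical and are not the library's internal types. The "mate" key does not appear in the sample line above; it is an assumption based on the Mate field documented on the PV type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// pv mirrors one principal variation. Exactly one of CP or Mate is
// expected to be set; the "mate" key is an assumption (see lead-in).
type pv struct {
	CP   *int   `json:"cp"`
	Mate *int   `json:"mate"`
	Line string `json:"line"`
}

// eval mirrors one entry of the "evals" array.
type eval struct {
	PVs    []pv `json:"pvs"`
	Knodes int  `json:"knodes"`
	Depth  int  `json:"depth"`
}

// record mirrors one JSONL line of a shard.
type record struct {
	FEN   string `json:"fen"`
	Evals []eval `json:"evals"`
}

// parseRecord decodes a single JSONL line.
func parseRecord(line []byte) (record, error) {
	var r record
	err := json.Unmarshal(line, &r)
	return r, err
}

func main() {
	line := []byte(`{"fen":"rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq -","evals":[{"pvs":[{"cp":20,"line":"e7e5"}],"knodes":3000,"depth":36}]}`)
	r, err := parseRecord(line)
	if err != nil {
		log.Fatal(err)
	}
	best := r.Evals[0].PVs[0]
	fmt.Println(r.Evals[0].Depth, *best.CP, best.Line) // 36 20 e7e5
}
```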

Material-Based Sharding

The default strategy encodes piece counts into a shard ID:

Piece Type                Bits    Range
White/Black Queens        3 each  0-7
White/Black Rooks         3 each  0-7
White/Black Minors (B+N)  3 each  0-7
Side to move              1       0-1

Total: 6 × 3 + 1 = 19 bits, reduced modulo the shard count (32,768 by default).

This clusters positions with similar material together. Games progress through predictable material phases, so consecutive positions land in the same shard.
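
The encoding can be sketched as follows. The function name materialShardID, the order in which fields are packed, and the clamping of counts above 7 are all assumptions made for illustration; the actual materialshard package may lay the bits out differently.

```go
package main

import (
	"fmt"
	"strings"
)

// materialShardID packs six 3-bit piece counts and one side-to-move bit
// into a 19-bit key, then reduces it modulo totalShards.
// Field order and clamping are assumptions, not the library's layout.
func materialShardID(fen string, totalShards int) int {
	fields := strings.Fields(fen)
	if len(fields) < 2 || totalShards <= 0 {
		return 0
	}

	// counts: white queens, black queens, white rooks, black rooks,
	// white minors (B+N), black minors (b+n).
	var counts [6]int
	for _, r := range fields[0] {
		switch r {
		case 'Q':
			counts[0]++
		case 'q':
			counts[1]++
		case 'R':
			counts[2]++
		case 'r':
			counts[3]++
		case 'B', 'N':
			counts[4]++
		case 'b', 'n':
			counts[5]++
		}
	}

	key := 0
	for _, n := range counts {
		if n > 7 {
			n = 7 // clamp so the count fits in 3 bits
		}
		key = key<<3 | n
	}

	side := 0
	if fields[1] == "b" {
		side = 1
	}
	key = key<<1 | side

	return key % totalShards
}

func main() {
	start := "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"
	afterE4E5 := "rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq -"
	// No capture has happened and the same side is to move,
	// so both positions map to the same shard.
	fmt.Println(materialShardID(start, 32768) == materialShardID(afterE4E5, 32768)) // true
}
```

Under this sketch, a quiet game alternates between two shards (one per side to move) until a capture or promotion changes the material.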

Lookup within a shard:

  1. Decompress (zstd)
  2. Binary search by FEN
  3. Parse matching JSON line

During the binary search, each line's FEN is extracted with a plain string search for "fen":" rather than a full JSON parse; only the matching line is parsed as JSON.
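
A minimal sketch of this lookup, assuming the decompressed shard has already been split into a slice of lines sorted by FEN. The helper names extractFEN and findLine are hypothetical, and the real search package may operate on raw bytes instead.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const fenPrefix = `"fen":"`

// extractFEN pulls the FEN out of a JSONL line without parsing the JSON.
// FEN strings contain no quotes, so the next '"' ends the value.
func extractFEN(line string) string {
	start := strings.Index(line, fenPrefix)
	if start < 0 {
		return ""
	}
	rest := line[start+len(fenPrefix):]
	end := strings.IndexByte(rest, '"')
	if end < 0 {
		return ""
	}
	return rest[:end]
}

// findLine binary-searches lines (sorted lexicographically by FEN)
// and returns the line whose FEN matches exactly.
func findLine(lines []string, fen string) (string, bool) {
	i := sort.Search(len(lines), func(i int) bool {
		return extractFEN(lines[i]) >= fen
	})
	if i < len(lines) && extractFEN(lines[i]) == fen {
		return lines[i], true
	}
	return "", false
}

func main() {
	// Three lines, already sorted by FEN.
	lines := []string{
		`{"fen":"4k3/8/8/8/8/8/8/4K2R w K -","evals":[]}`,
		`{"fen":"rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq -","evals":[]}`,
		`{"fen":"rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -","evals":[]}`,
	}
	_, ok := findLine(lines, "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -")
	fmt.Println(ok) // true
}
```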

Thread Safety

The Client is safe for concurrent use. Stats use atomic operations. Cache and store implementations handle concurrent access.

Sentinel Errors

var (
    ErrNotFound = errors.New("stockpile: position not found")
    ErrClosed   = errors.New("stockpile: client closed")
    ErrNoStore  = errors.New("stockpile: no store provided")
)

Benchmarking

Compare sharding strategies with real game data:

# Install benchmark CLI
go install github.com/discochess/stockpile/cmd/stockpile-bench@latest

# Run simulation with PGN games
stockpile-bench run --games games.pgn --strategies material,fnv32

# Generate markdown report
stockpile-bench run --games games.pgn --format markdown --output report.md --verbose
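
For contrast with material-based sharding, a hash-based strategy can be sketched with the standard library's FNV-1a implementation. The function name fnvShardID is hypothetical, and hashing the full FEN string is an assumption about what the fnv32 strategy uses as input.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fnvShardID hashes the FEN with 32-bit FNV-1a and reduces the result
// modulo totalShards. Hashing the whole FEN string is an assumption.
func fnvShardID(fen string, totalShards uint32) uint32 {
	if totalShards == 0 {
		return 0
	}
	h := fnv.New32a()
	h.Write([]byte(fen)) // Write on a hash.Hash never returns an error
	return h.Sum32() % totalShards
}

func main() {
	// Two consecutive positions from the same game.
	before := "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"
	after := "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq -"
	fmt.Println(fnvShardID(before, 32768), fnvShardID(after, 32768))
}
```

A hash spreads positions evenly across shards but ignores game structure, so consecutive positions generally land in unrelated shards. That trade-off is what the benchmark is meant to measure.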

Storage Backends

Local Filesystem (default)

store, _ := diskstore.New("./data", zstdcodec.New())

Google Cloud Storage

store, _ := gcsstore.New(ctx, "my-bucket", gcsstore.WithPrefix("stockpile/"))

AWS S3

store, _ := s3store.New(ctx, "my-bucket", s3store.WithPrefix("stockpile/"))

Performance

Performance depends on storage backend, cache size, and access patterns. Warm cache lookups (shard already in memory) are fast. Cold lookups require decompression. Cloud storage adds network latency.

Run benchmarks on your hardware with stockpile-bench to measure actual performance.

Data Source

Evaluations come from the Lichess evaluation database:

  • Hundreds of millions of unique positions (and growing)
  • Stockfish 16+ at depth 30+
  • Updated monthly

Project Structure

stockpile/
├── cmd/
│   ├── stockpile/              # Main CLI (build, lookup, stats, verify)
│   └── stockpile-bench/        # Benchmark CLI
├── internal/
│   ├── builder/                # Database build pipeline
│   ├── codec/                  # Compression codecs (zstd, gzip, noop)
│   ├── search/                 # Binary search on sorted JSONL
│   ├── shard/                  # Sharding strategies
│   │   ├── materialshard/      # Material-based (default)
│   │   └── fnvshard/           # FNV32 hash
│   ├── stats/                  # Metrics collection
│   └── store/                  # Storage backends
│       ├── diskstore/          # Local filesystem
│       ├── gcsstore/           # Google Cloud Storage
│       ├── s3store/            # AWS S3
│       └── cachedstore/        # LRU caching wrapper
├── benchmark/                  # Benchmarking infrastructure
├── examples/                   # Example applications
└── fx/                         # Uber fx modules for DI

Fx Modules

For applications using Uber Fx:

import "github.com/discochess/stockpile/fx/diskstockpilefx"

fx.New(
    diskstockpilefx.Module,
    // ... your modules
)

License

MIT License - see LICENSE for details.

Contributing

Contributions welcome! Please read the Architecture section above first.

Documentation

Overview

Package stockpile provides fast lookups into pre-computed chess position evaluations from the Lichess database.

Example usage:

opt, err := stockpile.WithDataDir("/path/to/data")
if err != nil {
    log.Fatal(err)
}
client, err := stockpile.New(opt)
if err != nil {
    log.Fatal(err)
}
defer client.Close()

eval, err := client.Lookup(ctx, "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Evaluation: %s\n", eval.Score())

Constants

This section is empty.

Variables

var (
	// ErrNotFound indicates the position was not found in the database.
	ErrNotFound = errors.New("stockpile: position not found")

	// ErrClosed indicates the client has been closed.
	ErrClosed = errors.New("stockpile: client closed")

	// ErrNoStore indicates no store was provided.
	ErrNoStore = errors.New("stockpile: no store provided")
)

Sentinel errors for well-defined error conditions.

Functions

This section is empty.

Types

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client provides access to the Lichess evaluation database. A Client is safe for concurrent use by multiple goroutines.

func New

func New(opts ...Option) (*Client, error)

New creates a new Client with the given options. If no options are provided, sensible defaults are used.

func (*Client) Close

func (c *Client) Close() error

Close releases all resources associated with the client. After Close, the client should not be used.

func (*Client) Lookup

func (c *Client) Lookup(ctx context.Context, fen string) (*Eval, error)

Lookup returns the evaluation for a given FEN position. Returns ErrNotFound if the position is not in the database.

func (*Client) ShardStrategy

func (c *Client) ShardStrategy() shard.Strategy

ShardStrategy returns the sharding strategy used by this client.

func (*Client) Store

func (c *Client) Store() store.Store

Store returns the storage backend used by this client.

type Eval

type Eval struct {
	// FEN is the position in Forsyth-Edwards Notation.
	FEN string

	// Depth is the search depth used to compute this evaluation.
	Depth int

	// Knodes is the number of kilo-nodes searched.
	Knodes int

	// PVs contains all principal variations from multi-PV analysis.
	// The first PV is the best line.
	PVs []PV
}

Eval represents a chess position evaluation from the Lichess database.

func (*Eval) BestPV

func (e *Eval) BestPV() *PV

BestPV returns the best principal variation, or nil if none available.

func (*Eval) IsMate

func (e *Eval) IsMate() bool

IsMate returns true if the best line is a forced checkmate.

func (*Eval) Score

func (e *Eval) Score() string

Score returns a human-readable score string for the best line. Examples: "+1.25", "-0.50", "#3", "#-5"

type Option

type Option interface {
	// contains filtered or unexported methods
}

Option configures a Client.

func WithDataDir

func WithDataDir(dir string) (Option, error)

WithDataDir configures the client from a data directory. It reads the manifest.json to auto-configure shard count and strategy, and creates a disk-based store with zstd compression. This is the recommended way to create a client for local data.

func WithLogger

func WithLogger(l *zap.Logger) Option

WithLogger sets the logger. If not set, a no-op logger is used.

func WithShardStrategy

func WithShardStrategy(s shard.Strategy) Option

WithShardStrategy sets the sharding strategy to use. If not set, material-based sharding is used.

func WithStats

func WithStats(c stats.Collector) Option

WithStats sets the stats collector. If not set, a no-op collector is used.

func WithStore

func WithStore(s store.Store) Option

WithStore sets the storage backend to use.

func WithTotalShards

func WithTotalShards(n int) Option

WithTotalShards sets the total number of shards. Default is 32768 (2^15).

type PV

type PV struct {
	// Centipawns is the evaluation in centipawns from White's perspective.
	// Positive values favor White, negative values favor Black.
	// Nil if the position has a forced mate.
	Centipawns *int

	// Mate is the number of moves until checkmate.
	// Positive values mean White delivers mate, negative means Black.
	// Nil if there is no forced mate.
	Mate *int

	// Line is the sequence of moves in UCI notation.
	Line string
}

PV represents a principal variation (line of play) from the engine.

func (*PV) IsMate

func (pv *PV) IsMate() bool

IsMate returns true if this PV is a forced checkmate.

func (*PV) Score

func (pv *PV) Score() string

Score returns a human-readable score string. Examples: "+1.25", "-0.50", "#3", "#-5"
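
The documented examples suggest a simple formatting rule: centipawns are shown as signed pawns with two decimals, and mates as "#" followed by the signed move count. A minimal sketch under that assumption follows; formatScore is a hypothetical helper, and the handling of a zero or missing score is a guess.

```go
package main

import "fmt"

// formatScore renders a score the way the documented examples look.
// Mate takes precedence over centipawns; both nil yields "".
func formatScore(cp, mate *int) string {
	switch {
	case mate != nil:
		return fmt.Sprintf("#%d", *mate)
	case cp != nil:
		return fmt.Sprintf("%+.2f", float64(*cp)/100)
	default:
		return ""
	}
}

func main() {
	cp, mate := 125, -5
	fmt.Println(formatScore(&cp, nil))   // +1.25
	fmt.Println(formatScore(nil, &mate)) // #-5
}
```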

Directories

Path Synopsis
benchmark
  analysis
    Package analysis provides statistical analysis for benchmark results.
  pgn
    Package pgn provides utilities for extracting FEN positions from PGN files.
  reporting
    Package reporting provides report generation for benchmark results.
  simulation
    Package simulation provides tools for simulating shard access patterns.
cmd
  analyze-games command
    Command analyze-games analyzes positions from a PGN file using stockpile.
  stockpile command
    Package main provides the stockpile CLI tool for managing and querying chess position evaluations from the Lichess database.
  stockpile-bench command
    Package main provides the stockpile-bench CLI tool for benchmarking sharding strategies with real game data.
examples
  benchmark-demo command
    Package main demonstrates benchmarking sharding strategies with the stockpile library.
  game-analysis command
    Package main demonstrates analyzing a chess game with stockpile.
  quickstart command
    Package main demonstrates basic stockpile usage.
fx
  diskstockpilefx
    Package diskstockpilefx provides an fx module for a disk-backed stockpile client.
  gcpstockpilefx
    Package gcpstockpilefx provides an fx module for a GCS-backed stockpile client.
  memorystockpilefx
    Package memorystockpilefx provides an fx module for an in-memory stockpile client.
internal
  builder
    Package builder implements the data build pipeline for stockpile.
  codec
    Package codec provides compression and decompression for shard data.
  codec/gzipcodec
    Package gzipcodec provides a gzip compression codec.
  codec/noopcodec
    Package noopcodec provides a no-op codec (no compression).
  codec/zstdcodec
    Package zstdcodec provides a zstd compression codec.
  fen
    Package fen provides FEN (Forsyth-Edwards Notation) parsing utilities.
  search
    Package search implements binary search within sorted shard data.
  shard
    Package shard defines the sharding strategy interface for distributing chess positions across multiple shard files.
  shard/fnvshard
    Package fnvshard implements FNV-1a hash-based sharding for chess positions.
  shard/materialshard
    Package materialshard implements material-based sharding for chess positions.
  stats
    Package stats provides a unified interface for collecting metrics.
  stats/logger
    Package logger provides a zap-based stats collector that logs metrics.
  stats/prometheus
    Package prometheus provides a Prometheus-based stats collector.
  store
    Package store defines the storage backend interface for reading shard files.
  store/cachedstore
    Package cachedstore provides a caching wrapper for Store implementations.
  store/cachedstore/cachestrategy
    Package cachestrategy defines cache eviction strategy interfaces.
  store/cachedstore/cachestrategy/lru
    Package lru implements an LRU cache eviction strategy.
  store/cachedstore/memory
    Package memory implements an in-memory cache backend.
  store/diskcache
    Package diskcache implements a disk-based cache that wraps a remote store.
  store/diskstore
    Package diskstore implements a disk-based filesystem storage backend.
  store/gcsstore
    Package gcsstore implements a Google Cloud Storage backend.
  store/memstore
    Package memstore provides an in-memory store implementation for testing.
  store/s3store
    Package s3store implements an AWS S3 storage backend.
