lockd

package module
v0.9.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 25, 2026 License: MIT Imports: 77 Imported by: 0

README

lockd

lockd is a single-binary coordination plane that fuses exclusive leases, atomic JSON state + attachments, indexed document search with streaming queries, an at-least-once queue, and an MCP facade for agent workflows into one API—so you don’t need a lock service, a document store, a message broker, and a separate agent memory/sync facade just to run reliable workflows. It runs its own TC leader election and implicit XA, so multi-key updates and queue acknowledgements commit safely across nodes and even across backend "islands" without external consensus systems. The same storage abstraction powers disk, S3/MinIO, Azure Blob, or in-memory backends with optional envelope encryption and mTLS, while built-in back-pressure (QRF/LSF) keeps the service stable under load. You get a production-grade Go SDK + CLI in a CGO-free, statically linked binary that’s happy as PID 1 yet feature-rich enough for orchestration, ETL checkpoints, IoT fleets, and long-running services.

Typical flow

In production, a worker claims a shard/tenant with a lease, loads the last committed state (and any attachments), stages updates under a transaction, optionally advances work via the queue, then commits. If the worker crashes or the lease expires, staging is rolled back and the next worker resumes from the last committed checkpoint.

  1. Acquire a lease for the workload key and get a txn_id for multi-key work.
  2. Read the current JSON state and any required attachments.
  3. Perform work; stage state updates/attachments, and optionally enqueue downstream tasks.
  4. Release to commit (or rollback on failure); queue acks/enlisted dequeues follow the same txn.
  5. On crash/TTL expiry, the lease rolls back and another worker resumes safely.

Note from the author

lockd started as an experiment to see how far you could really go with AI today. I’ve spent over two decades working across IT operations and software development, with a strong focus on distributed systems - enough experience to lead a project like lockd, but not the time, cognitive bandwidth, or budget to build something like this on my own.

What’s been achieved here genuinely amazes me. It shows what augmentation really means — how far human expertise can be amplified when combined with capable AI systems. It’s not fair to say lockd was "vibe coded"; you still need solid foundations in distributed computing. But nearly all of the code was written by an AI coding agent, while I’ve acted as the solution architect - steering design decisions, ensuring technical feasibility, and keeping the vision cohesive.

The result is far beyond what I first imagined. Just a few years ago, if someone told you I built lockd in less than two weeks of spare time, you’d probably laugh. Today, you might just believe it.

We’re living in an incredible moment - where building at the speed of twelve parsecs or less is no longer science fiction.

Features

Coordination primitives
  • Exclusive leases per key with configurable TTLs, keep-alives, fencing tokens, and sweeper reaping for expired holders.
  • Create-only acquire mode via if_not_exists / --if-not-exists to fail fast with already_exists when a key is already initialized.
  • Acquire-for-update helpers wrap acquire → get → update, keeping leases alive while user code runs and releasing them automatically.
  • Public reads let dashboards or downstream systems fetch published state without holding leases (/v1/get?public=1, Client.Get default behavior).
  • Reserved namespaces: user namespaces must not start with .; .txns and other dot-prefixed namespaces are reserved for lockd internals.
Identifiers & namespaces

Lease IDs and transaction IDs are compact xid strings (20 lowercase base32 chars, e.g. c5v9d0sl70b3m3q8ndg0). Keys/manifests/segments retain uuidv7 for ordering. Use the txn_id returned by Acquire to join multiple keys into the same local transaction; namespaces that start with . are reserved for lockd internals (e.g. .txns stores transaction decisions) and are rejected.

  • Atomic JSON state up to ~100 MB (configurable) with CAS (version + ETag) headers.
  • State attachments: stream binary files alongside a key (staged under the lease transaction and committed on release).
  • Namespace-aware indexing across S3/MinIO, Azure, or disk backends. Query with RFC 6901 selectors (/workflow/state="done", braces, >=, etc.).
  • Streaming query results: keys-only (compact JSON) or NDJSON documents with metadata, consumable via SDK/CLI.
  • Encryption everywhere when configured—metadata, state blobs, queue payloads all flow through internal/storage.Crypto.
Queue subsystem
  • At-least-once queue built on the same storage: enqueue, stateless/stateful dequeue, ack/nack/extend, DLQ after max_attempts.
  • Visibility controls (ack deadlines, reschedule, extend) and QRF throttling hooks for overload protection.
  • State sidecar per message to hold workflow context; created lazily via EnsureStateExists.
APIs & tooling
  • Simple HTTP/JSON API (no gRPC) with optional mTLS. All endpoints support correlation IDs, structured errors, and fencing tokens.
  • Go SDK (client) with retries, structured APIError, acquire-for-update, streaming query helpers, document helpers (client.Document), and attachment helpers.
  • Cobra/Viper CLI that mirrors the SDK (leases, attachments, queue operations, lockd client query, lockd client namespace, etc.) and is covered by unit tests.
  • MCP facade server (lockd mcp, package pkt.systems/lockd/mcp) for agent workflows with lock/state/queue/query/attachment tools, LQL query+mutate support, and large-payload stream capabilities (docs/MCP.md).
Storage backends
  • S3 / MinIO / S3-compatible using conditional copy CAS and optional KMS/SSE.
  • Azure Blob Storage (Shared Key or SAS) with the same storage port.
  • Disk backend optimized for SSD/NVMe, with optional inotify watcher for queue change feed; works over NFS via pure polling.
  • In-memory backend for tests and demos.
Operations & observability
  • Structured logging (pslog) with subsystem tagging (server.lifecycle.core, search.index, queue.dispatcher, etc.).
  • Transaction telemetry: decision/apply/replay/sweep logs (txn.*, txn.tc.*, txn.rm.*), watchdog warnings for long-running operations, and Prometheus metrics (lockd.txn.*, lockd.txn.fanout.*).
  • Internal batching: disk/NFS backends group logstore fsyncs automatically to amortize load without a bulk API; see docs/BATCHING.md.
  • Watchdogs baked into unit/integration tests to catch hangs instantly.
  • lockd verify store diagnostics ensure backend credentials + permissions + encryption descriptors are valid before deploying.
  • Integration suites (run-integration-suites.sh) cover every backend/feature combination; use them before landing cross-cutting changes.

Architecture

PlantUML sources live in docs/diagrams/ (render with make diagrams):

  • component-view.puml — component-level view of lockd subsystems and XA coordination.
  • xa-happy-path.puml — multi-key commit on a single backend.
  • xa-rollback-timeout.puml — rollback after lease/decision expiry.
  • xa-tc-rm-fanout.puml — TC decision + RM fan-out across backends.
  • xa-queue-dequeue.puml — queue dequeue enlisted in a transaction.
Subsystems (sys hierarchy)

Every structured log carries a sys field (system.subsystem.component). The strings come directly from svcfields.WithSubsystem and represent concrete code paths. The primary production subsystems are:

  • client.sdk – Go SDK calls (leases, queue APIs, acquire-for-update helper).
  • client.cli, cli.root, cli.verify – Cobra/Viper CLI plumbing, namespace helpers, and the lockd verify workflows.
  • api.http.server – TLS listener, mux, and server-level errors. Each handler also emits api.http.router.<operation> (for example api.http.router.acquire, api.http.router.queue.enqueue, api.http.router.query).
  • control.lsf.observer – Host sampling loop that feeds the QRF controller.
  • control.connguard – Listener-level filtering for suspicious TLS and plain-TCP handshakes.
  • control.qrf.controller – Applies throttling/Retry-After headers and exposes demand metrics to HTTP handlers.
  • server.lifecycle.core – Supervises background loops (sweeper, telemetry, namespace config) and owns process-level lifecycle logging.
  • server.shutdown.controller – Drives drain-aware shutdown, emits Shutdown-Imminent, and monitors the close budget.
  • namespace.config – Manages per-namespace capabilities (scan vs. index) and backs /v1/namespace plus search adapters.
  • queue.dispatcher.core – Ready cache, change-feed watcher, and consumer demand reconciliation.
  • search.scan – Selector evaluation + scan adapter (explicit engine or fallback).
  • search.index – Index manager, memtable flushers, /v1/index/flush.
  • storage.pipeline.snappy.pre_encrypt – Optional compression pass executed before encryption whenever storage compression is enabled.
  • storage.crypto.envelope – Kryptograf envelope encryption for metadata/state/queue payloads.
  • storage.backend.core – Storage adapters (S3/MinIO, disk, Azure, mem).
  • observability.telemetry.exporter – OTLP trace exporter + Prometheus metrics endpoint bootstrap.

Additional prefixes show up in specialised contexts:

  • bench.disk.*, bench.minio.* – Benchmark harnesses and perf suites.
  • api.http.router.* – Every HTTP route (acquire, query, queue ops, etc.).
  • client.cli.* / cli.verify – CLI subcommands and auth workflows.
Request flow
  1. AcquirePOST /v1/acquire → acquire lease (optionally blocking). Include X-Txn-ID (or txn_id in the body) to join an existing transaction across keys. Set if_not_exists=true for create-only semantics (returns 409 already_exists when state already exists for the key). Namespaces starting with . are rejected to protect internal records such as .txns.
  2. Get statePOST /v1/get → stream JSON state with CAS headers. Supply X-Lease-ID + X-Fencing-Token from the acquire response.
  3. Update statePOST /v1/update → upload new JSON with X-If-Version and/or X-If-State-ETag to enforce CAS. Include X-Lease-ID, X-Txn-ID, and the current X-Fencing-Token.
  4. Attach files (optional)POST /v1/attachments → stream binary payloads while holding the lease (X-Lease-ID, X-Txn-ID). List via GET /v1/attachments, download via GET /v1/attachment, and delete via DELETE.
  5. Remove state (optional)POST /v1/remove → delete the stored JSON blob while holding the lease. Honor the same X-Lease-ID, X-Txn-ID, X-Fencing-Token, and CAS headers (X-If-Version, X-If-State-ETag) as updates.
  6. ReleasePOST /v1/release → commit by default, or pass rollback=true to discard staged changes. The request body must include the xid txn_id from Acquire; the sweeper handles timeouts for crashed workers.
Atomic acquire + update helper

AcquireForUpdate combines the normal acquire → get → update → release flow into a single call that takes a user-supplied function. The handler receives an AcquireForUpdateContext exposing the current StateSnapshot along with helper methods (Update, UpdateBytes, Save, Remove, etc.). While the handler runs the client keeps the lease alive in the background; once the handler returns the helper always releases the lease.

Only the initial acquire/get handshake retries automatically (respecting client.WithAcquireFailureRetries and related backoff settings). After the handler begins executing, any error is surfaced immediately so callers can decide whether to re-run the helper.

The legacy /v1/acquire-for-update streaming endpoint has been removed; all clients must use the callback helper described above.

Attachments

Attachments are staged under the lease transaction and committed on release. Each attachment has a name, UUIDv7, size, content type, and timestamps.

eval "$(lockd client acquire --key orders)"
lockd client attachments put --name invoice.pdf --file ./invoice.pdf --content-type application/pdf
lockd client release

Public reads are available after release:

lockd client attachments list --key orders --public
lockd client attachments get --key orders --name invoice.pdf --public -o invoice.pdf

See docs/ATTACHMENTS.md for SDK and API details.

XA transactions (TC + RM)

Lockd records decisions in .txns and stages writes under state/<key>/.staging/<txn_id> (attachments live under state/<key>/.staging/<txn_id>/attachments/<uuidv7> and commit to state/<key>/attachments/<uuidv7>). A single TC leader is elected via quorum leases; normal client SDK flows do not call /v1/txn/decide directly. Instead, Release/Ack/Nack trigger the core decider, which records pending|commit|rollback and fans out to RM apply endpoints (/v1/txn/commit, /v1/txn/rollback) with a tc_term fencing token that RMs use to reject stale decisions. Any TC endpoint can receive /v1/txn/decide (TC-auth only); non-leaders forward to the leader. Queue dequeues can be enlisted in a transaction; commit ACKs messages, rollback NACKs them, and stale leases return 409 queue_message_lease_mismatch.

TC cluster membership leases are stored under .lockd/tc-cluster/leases/<identity> and managed with lockd tc announce|leave|list (servers refresh leases automatically). Use --self plus optional --join seeds (config key tc-join) to bootstrap leader election. RM endpoints are registered under .lockd/tc-rm-members via /v1/tc/rm/register (server cert required when mTLS is enabled) and replicated across the TC cluster.

For TC tooling (TC-auth), use lockd txn prepare|commit|rollback|replay. Normal SDK/CLI flows use lockd client release --rollback or queue ACK/NACK. Full SDK/CLI examples and failure-mode guidance live in docs/XA.md.

Recovery + watchdogs

  • .txns records are durable and re-applied after process restarts; staged blobs under .staging/ are swept automatically.
  • POST /v1/txn/replay (or client.TxnReplay) forces a decision to re-run if you need deterministic recovery in tests or during incident response.
  • Integration suites include commit/rollback restart coverage for mem/disk and a txn soak that fails fast on stalled sweeps or leaked staged data.
Internal layout
  • server.go – server wiring, storage retry wrapper, sweeper.
  • internal/httpapi – HTTP handlers for the API surface.
  • internal/storage – backend interface, retry wrapper, S3/Disk/Memory.
  • client – public Go SDK.
  • cmd/lockd – CLI entrypoint (Cobra/Viper).
  • tlsutil – bundle loading/generation helpers.
  • integration/ – end-to-end tests (mem, disk, NFS, AWS, MinIO, Azure, OTLP, queues).

API documentation

The HTTP handlers embed swaggo annotations so the OpenAPI description can be produced straight from the code. Run make swagger (or go generate ./swagger) with the swag binary on your PATH to refresh the spec. Generation writes three artifacts to swagger/docs/:

  • swagger.json and swagger.yaml for downstream tooling.
  • swagger.html, a self-contained Swagger UI page that inlines the JSON spec.

These files live alongside the CLI so the reusable server library stays swaggo-free. Serve the HTML with any static web host—or just open it locally—to explore the API interactively.

Storage Backends

lockd picks the storage implementation from the --store flag (or LOCKD_STORE environment variable) by inspecting the URL scheme. Each backend maps to a dedicated driver with its own consistency and HA characteristics.

Scheme Example Driver Notes
mem:// or empty mem:// In-memory Ephemeral; test only.
aws:// aws://my-bucket/prefix AWS S3 (AWS SDK) Uses the official AWS SDK and credential chain (not an alias of s3://).
s3:// s3://localhost:9000/lockd-data?insecure=1 S3-compatible (MinIO, Localstack, etc.) TLS enabled by default. Append ?insecure=1 for HTTP. Uses explicit LOCKD_S3_* credentials.
disk:// disk:///var/lib/lockd-data Local disk / NFS SSD/NVMe‑tailored logstore. NFS is supported via the same driver.
azure:// azure://account/container/prefix Azure Blob Storage Host = account; path = container + optional prefix.

For AWS, the standard credential chain (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY, profiles, IAM roles, etc.) is used. For other S3-compatible stores, lockd reads LOCKD_S3_ACCESS_KEY_ID and LOCKD_S3_SECRET_ACCESS_KEY (falling back to LOCKD_S3_ROOT_USER/LOCKD_S3_ROOT_PASSWORD). No secret keys are stored in the lockd config file.

AWS S3 (aws://)
  • Uses the AWS SDK (v2) and the standard AWS credential chain.
  • HA: concurrent and failover are supported. Concurrent mode is the default choice for multi-node clusters sharing the same bucket/prefix.
  • Best fit for production HA on AWS. Use --aws-region (or env) plus optional --aws-kms-key-id for SSE‑KMS.
  • URL format: aws://<bucket>/<optional-prefix>.
S3-compatible (s3://)
  • Uses temp uploads + CopyObject with conditional headers for CAS.
  • Supports SSE (aws:kms / AES256) and custom endpoints (MinIO, Localstack).
  • Default retry budget: 12 attempts, 500 ms base delay, capped at 15 s.
  • HA: concurrent and failover are supported. For shared MinIO clusters, concurrent mode is typical; for single‑node MinIO, failover also works.

Configuration (flags or env via LOCKD_ prefix):

Flag / Env Description
--store / LOCKD_STORE aws://bucket/prefix or s3://host:port/bucket
--aws-region AWS region for aws:// stores (required unless provided via env)
--s3-sse AES256 or aws:kms
--s3-kms-key-id KMS key for generic s3:// stores
--aws-kms-key-id KMS key for aws:// stores
--s3-max-part-size Multipart upload part size
--s3-encrypt-buffer-budget Max total bytes for buffered S3 encryption payloads before streaming

Shutdown tuning:

Flag / Env Description
--drain-grace / LOCKD_DRAIN_GRACE How long to keep serving existing lease holders before HTTP shutdown begins (default 10s; set 0s to disable draining).
--shutdown-timeout / LOCKD_SHUTDOWN_TIMEOUT Overall shutdown budget, split 80/20 between drain and HTTP server teardown (default 10s; set 0s to rely on the orchestrator’s deadline).
HA modes

Lockd defaults to failover mode (single active writer per backend). Passive nodes return HTTP 503 so clients can retry another endpoint. Use --ha concurrent to enable concurrent multi-writer semantics when multiple servers share the same backend, --ha single for an operator-managed single-node deployment, or --ha auto to start in single mode and promote to failover when another live node is observed.

  • --ha-lease-ttl controls failover lease duration and auto heartbeat cadence. Default is 10s; values below 5s are rejected.
  • --ha-single-presence-ttl controls the long-lived .ha advertisement used by ha=single on backends without native single-writer detection. Default is 5m.
  • On disk/NFS, ha=single does not write .ha single-presence heartbeats; mixed single/auto fencing uses native single-writer markers instead.
Disk / NFS (logstore)
  • Uses a log-structured store on disk/NFS to minimize small-file churn: writes are append-only into segment files, each segment is recorded in a manifest, and periodic snapshots bound recovery time. Reads consult the in-memory index + manifest to find the newest record for a key, while background compaction rewrites hot data into fewer segments.
  • Optional retention (--disk-retention, LOCKD_DISK_RETENTION) prunes keys whose metadata updated_at_unix is older than the configured duration. Set to 0 (default) to keep data indefinitely.
  • Disk/NFS backends use durable group commit driven by natural backpressure. --logstore-commit-max-ops caps batch size, and --logstore-segment-size rolls segment files at a fixed size.
  • Background snapshot compaction is enabled by default. Tune it with --logstore-compaction, --logstore-compaction-interval (default 30m), --logstore-compaction-min-segments (default 2), --logstore-compaction-min-reclaim-size (default 64MiB), --logstore-compaction-delete-grace (default 15m), --logstore-compaction-max-io-bytes-per-sec (default 8MiB/s; 0 uses the default throttle), and --disable-logstore-compaction-throttling.
  • HA: the logstore remains single-writer. --ha concurrent is automatically downgraded to failover. Use --ha single for an operator-managed standalone node, --ha auto for single-writer startup with promotion to failover on peer detection, or --ha failover for explicit lease-based coordination.
  • The janitor sweep interval defaults to half the retention window (clamped between 1 minute and 1 hour). Override via --disk-janitor-interval.
  • Configure with --store disk:///var/lib/lockd-data. All files live beneath the specified root; lockd creates logstore/segments, logstore/manifest, and logstore/snapshots directories per namespace.
  • For NFS: supported via the same driver with polling (queue watcher disabled on unsupported filesystems). Use failover mode to avoid concurrent writers.
Azure Blob Storage
  • URL format: azure://<account>/<container>/<optional-prefix> (the account comes from the host component).
  • Supply credentials via --azure-key / LOCKD_AZURE_ACCOUNT_KEY (Shared Key) or --azure-sas-token / LOCKD_AZURE_SAS_TOKEN (SAS). Standard Azure environment variables such as AZURE_STORAGE_ACCOUNT are also honoured if the account is omitted from the URL.
  • Default endpoint is https://<account>.blob.core.windows.net; override with --azure-endpoint or ?endpoint= on the store URL when using custom domains/emulators.
  • Authentication supports either account keys (Shared Key) or SAS tokens. Provide exactly one of the above secrets; the CLI no longer requires --azure-account because the account name is embedded in the store URL.
  • HA: concurrent and failover are supported; concurrent mode is the normal choice for multi‑node clusters sharing the same container/prefix.
  • Example:
set -a && source .env.azure && set +a
lockd --store "azure://lockdintegration/container/pipelines" --listen :9341 --disable-mtls
Memory

In-process backend utilized for unit tests (mem://); it can also serve for experimentation or support a no-footprint ephemeral instance. HA is not applicable: each process has its own isolated in-memory store.

Drain-aware shutdown

When the server receives a shutdown signal it immediately advertises a Shutdown-Imminent header, rejects new acquires with a shutdown_draining API error, and keeps existing lease holders alive until the drain grace expires. By default the 10 s --shutdown-timeout budget is split 80/20: approximately 8 s are dedicated to draining active leases and the remaining 2 s are reserved for http.Server.Shutdown to close idle connections cleanly. Setting --drain-grace 0s (or LOCKD_DRAIN_GRACE=0s) skips the drain window entirely when an orchestrator already enforces a strict deadline.

Clients are drain-aware too. The Go SDK and CLI listen for the Shutdown-Imminent header and, after in-flight work completes, automatically release the lease so another worker can be scheduled. Opt out via client.WithDrainAwareShutdown(false), --drain-aware-shutdown=false, or LOCKD_CLIENT_DRAIN_AWARE=false if your workflow prefers to hold the lease until the next session. KeepAlive and Release continue to succeed during the drain window, so operators can monitor progress (metrics/logs) while state is flushed to storage.

Configuration & CLI

lockd exposes flags mirrored by LOCKD_* environment variables. Example:

# AWS S3 with region-based endpoint
export LOCKD_STORE="aws://my-bucket/prefix"
export LOCKD_AWS_REGION="us-west-2"
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
lockd \
  --listen :9341 \
  --store "$LOCKD_STORE" \
  --default-namespace workflows \
  --json-max 100MB \
  --default-ttl 30s \
  --max-ttl 30m \
  --acquire-block 60s \
  --sweeper-interval 5s \
  --bundle $HOME/.lockd/server.pem

# MinIO running locally over HTTP
export LOCKD_STORE="s3://localhost:9000/lockd-data?insecure=1"
export LOCKD_S3_ACCESS_KEY_ID="lockddev"
export LOCKD_S3_SECRET_ACCESS_KEY="lockddevpass"
lockd --store "$LOCKD_STORE" --listen :9341 --bundle $HOME/.lockd/server.pem

# Azure Blob Storage (account key)
set -a && source .env.azure && set +a
# .env.azure should export LOCKD_STORE=azure://account/container/prefix and LOCKD_AZURE_ACCOUNT_KEY=...
lockd --store "$LOCKD_STORE" --listen :9341 --disable-mtls

# IPv4-only binding
lockd --listen-proto tcp4 --listen 0.0.0.0:9341 --store "$LOCKD_STORE"

# Prefer stdlib JSON compaction for tiny payloads
lockd --store mem:// --json-util stdlib

The default listen address is :9341, chosen from the unassigned IANA space to avoid clashes with common cloud-native services.

Namespaces

Every key, queue, and workflow lease lives inside a namespace. When callers omit the field, the server falls back to --default-namespace (defaults to "default"). Use --default-namespace / LOCKD_DEFAULT_NAMESPACE to set the cluster-wide default, or supply namespace in API requests to isolate workloads.

The Go SDK mirrors this behaviour via client.WithDefaultNamespace. CLI commands expose --namespace / -n and honor LOCKD_CLIENT_NAMESPACE for lease/state operations plus LOCKD_QUEUE_NAMESPACE for queue helpers. Queue dequeue commands export LOCKD_QUEUE_NAMESPACE (along with message metadata) so follow-up ack/nack/extend calls can reuse the value without repeating --namespace. Set these environment variables in your shell to avoid passing the flag on every invocation.

Namespaces cascade to storage: metadata and payload objects are prefixed with <namespace>/... regardless of backend. When adding new features or tests, ensure we exercise at least one non-default namespace to avoid regressions in prefix handling.

Metadata attributes & hidden keys

Each key’s metadata protobuf stores lease info plus user-controlled attributes. The server exposes POST /v1/metadata so callers can mutate attributes without rewriting the JSON state. Today the flag query_hidden (persisted as the lockd.query.exclude attribute) hides a key from /v1/query results while keeping it readable via the lease or public=1 GET helpers. Include the txn_id from Acquire via X-Txn-ID and toggle it by holding the lease and calling:

curl -sS -X POST "https://host:9341/v1/metadata?key=orders&namespace=default" \
  -H "X-Lease-ID: $LEASE_ID" \
  -H "X-Txn-ID: $TXN_ID" \
  -H "X-Fencing-Token: $FENCING" \
  -d '{"query_hidden":true}'

The same attribute can ride along with state uploads by setting the X-Lockd-Meta-Query-Hidden header; the Go SDK exposes client.WithQueryHidden() / client.WithQueryVisible() helpers and LeaseSession.UpdateMetadata / Client.UpdateMetadata for direct control. Server-side diagnostics (e.g. lockd-diagnostics/*) are automatically hidden so queries never leak internal housekeeping blobs. Future metadata attributes will use the same endpoint and response envelope (api.MetadataUpdateResponse).

Query return modes

POST /v1/query still returns {namespace, keys, cursor} JSON by default, but you can now request document streams by passing return=documents. In document mode the server replies with Content-Type: application/x-ndjson, streams rows immediately, and emits cursor/index/metadata as HTTP trailers at end-of-stream:

{"ns":"default","key":"orders/123","ver":42,"doc":{"status":"ready"}}

Only published documents are streamed, so the semantics match public=1 GET. The Go SDK exposes client.WithQueryReturnDocuments() plus a revamped QueryResponse that provides Mode(), Keys(), and ForEach helpers. In streaming mode each row exposes row.DocumentReader() (close it when you’re done) or the convenience row.DocumentInto(...) / row.Document() helpers. If you only need the identifiers, call Keys() to drain the stream without materialising any documents. In return=documents, query metadata is trailer-only: X-Lockd-Query-Cursor, X-Lockd-Query-Index-Seq, X-Lockd-Query-Metadata. For keys-only mode, metadata remains in the JSON body plus response headers.

Queue dispatcher tuning

The queue dispatcher multiplexes storage polling across all connected consumers. Use these flags (or matching LOCKD_QUEUE_* environment variables) to adjust its behaviour:

Flag / Env Description
--queue-max-consumers (LOCKD_QUEUE_MAX_CONSUMERS) Caps the number of concurrent dequeue waiters handled by a single server (default 1000).
--queue-poll-interval (LOCKD_QUEUE_POLL_INTERVAL) Baseline interval between storage scans when no change notifications arrive (default 3s).
--queue-poll-jitter (LOCKD_QUEUE_POLL_JITTER) Adds randomised delay up to the specified duration to stagger concurrent servers (default 500ms; set 0 to disable).
--disk-queue-watch (LOCKD_DISK_QUEUE_WATCH) Enables Linux/inotify queue watchers on the disk backend (default true; automatically ignored on unsupported filesystems such as NFS).
--disable-mem-queue-watch (LOCKD_DISABLE_MEM_QUEUE_WATCH) Disables in-process queue notifications for the in-memory backend (default false).
Development Environment (Docker Compose + Prometheus)

devenv/docker-compose.yaml brings up a complete local stack—Jaeger, the OTLP collector (traces), MinIO (for S3-compatible backends), a single-node etcd service for YCSB comparisons, and a disk-backed lockd container. Start the environment with:

cd devenv
docker compose up -d

The MinIO container listens on localhost:9000 (S3 API) / 9001 (console), with credentials lockddev / lockddevpass. Point lockd at the local bucket via:

export LOCKD_STORE="s3://localhost:9000/lockd-data?insecure=1"
export LOCKD_S3_ACCESS_KEY_ID="lockddev"
export LOCKD_S3_SECRET_ACCESS_KEY="lockddevpass"

Set --otlp-endpoint (or LOCKD_OTLP_ENDPOINT) to export traces to the compose collector. Omit the scheme to default to OTLP/gRPC on port 4317; use grpc:///grpcs:// to force gRPC, or http:///https:// (default port 4318) for the HTTP transport. With the compose stack running:

LOCKD_STORE=mem:// \
LOCKD_OTLP_ENDPOINT=localhost \
lockd

Metrics are served via a Prometheus scrape endpoint on a separate listener. Set --metrics-listen (or LOCKD_METRICS_LISTEN) to enable it, for example:

LOCKD_METRICS_LISTEN=:9464 \
lockd

Profiling tools can be enabled independently. Use --pprof-listen to expose /debug/pprof endpoints and --enable-profiling-metrics to export Go runtime metrics via Prometheus:

LOCKD_METRICS_LISTEN=:9464 \
LOCKD_ENABLE_PROFILING_METRICS=1 \
LOCKD_PPROF_LISTEN=:6060 \
lockd

The compose-managed lockd container persists its bootstrap bundles under devenv/volumes/lockd-config/ (client bundle: client.pem) and its disk backend under devenv/volumes/lockd-storage/. The bundled etcd instance listens on localhost:2379 for single-node YCSB comparisons.

Prometheus is exposed on localhost:9090, and Grafana on localhost:3000 (login admin / admin). Grafana auto-provisions a “Lockd Overview” dashboard that reads metrics from Prometheus. Prometheus scrapes the lockd metrics listener on the compose network.

All HTTP endpoints are wrapped with otelhttp, storage backends emit child spans, and structured logs attach trace_id/span_id when a span is active. Transaction telemetry is exported as metrics such as lockd.txn.decisions.recorded, lockd.txn.apply.applied/failed/retries, lockd.txn.decide|apply|replay|sweep.duration_ms, and lockd.txn.fanout.*. Lease operations, queue operations, and attachments also emit telemetry (lockd.lease.*, lockd.queue.*, lockd.attachments.*).

Dev Environment Assurance

Run go run ./devenv/assure to execute an end-to-end probe that verifies the compose stack. The tool assumes the default compose settings (MinIO on localhost:9000, OTLP collector on localhost:4317, Jaeger UI/API on localhost:16686), connects to MinIO, starts an in-process lockd server against the dev bucket, performs lease/state mutations, confirms the underlying objects exist, and queries Jaeger’s HTTP API to ensure OTLP traces arrived via the bundled collector. It’s a zero-config “go run” sanity check once docker compose up is running.

Correlation IDs

Every request processed by lockd carries a correlation identifier:

  • Incoming clients may provide X-Correlation-Id; the server accepts printable ASCII values up to 128 characters. Invalid identifiers are ignored and a fresh UUID is generated instead.
  • Successful acquire responses include correlation_id in the JSON payload and echo X-Correlation-Id in the response headers. Follow-up operations must continue sending the same header.
  • Spans exported via OTLP include the lockd.correlation_id attribute so traces and logs can be tied together.
  • Queue enqueue/dequeue responses and the ack/nack/extend APIs echo correlation_id, preserving the identifier across retries, DLQ moves, and stateful leases.
  • The Go client automatically propagates correlation IDs. Seed a context with client.WithCorrelationID(ctx, id) (use client.GenerateCorrelationID() to mint one) and all lease/session operations will emit the header. For generic API clients, wrap an http.Client with client.WithCorrelationHTTPClient or a custom transport via client.WithCorrelationTransport to overwrite X-Correlation-Id on every request. client.CorrelationIDFromResponse extracts the identifier from HTTP responses when needed.
Lease fencing tokens

Every successful acquire response includes a fencing_token and echoing X-Fencing-Token on follow-up requests is mandatory. The Go SDK manages the token automatically when you reuse the same client.Client. For CLI workflows you can export the token so subsequent commands pick it up:

eval "$(lockd client acquire --server localhost:9341 --owner worker --key orders)"
lockd client keepalive --lease "$LOCKD_CLIENT_LEASE_ID" --key orders
lockd client update --lease "$LOCKD_CLIENT_LEASE_ID" --fencing-token "$LOCKD_CLIENT_FENCING_TOKEN" --key orders payload.json

lockd client acquire also exports LOCKD_CLIENT_TXN_ID; lease-bound mutations default to that transaction id unless you override --txn-id.

If the server detects a stale token it returns 403 fencing_mismatch, ensuring delayed or replayed requests cannot clobber state after a lease changes hands.

Why fencing tokens matter

The token is a strictly monotonic counter that advances on every successful acquire. Compared with relying purely on lease IDs and CAS/version checks, it adds several key safeguards:

  • Lease turnover without state writes – metadata version only increments when the JSON blob changes. If lease A expires and lease B acquires but has not updated the state yet, the version remains unchanged. The fencing token has already increased, so any delayed keepalive/update from lease A is rejected immediately.
  • Ordering, not just identity – lease IDs are random, so downstream systems cannot tell which one is newer. By carrying the token, workers give databases and queues a simple “greater-than” check: accept writes with the highest token, reject anything older.
  • Cache resilience inside the server – the handler caches lease metadata to avoid extra storage reads. A stale request might otherwise slip through by reading the cached entry; the fencing token still forces a mismatch and blocks the outdated lease holder.
  • Protection for downstream systems – workers can forward the token to other services (databases, queues) and let them reject stale writers. CAS keeps lockd’s JSON state consistent, while fencing tokens extend that guarantee to anything else the worker touches.
  • Operational guardrails – CLI/scripts that stash lease IDs in environment variables gain an extra safety net. If an operator forgets to refresh after a re-acquire, the stale token triggers a clear 403 instead of silently updating the wrong state.
Client retries & callback behaviour

Handshake retries for AcquireForUpdate are bounded by DefaultFailureRetries (currently 5). Each time the initial acquire/get sequence encounters lease_required, throttling, or a transport error, the helper burns one retry according to the configured backoff (client.WithAcquireFailureRetries, client.WithAcquireBackoff). Once the handler starts running, any subsequent error is returned immediately so the caller can decide whether to invoke the helper again.

While the handler executes, the helper issues keepalives at half the TTL. A failed keepalive cancels the handler’s context, surfaces the error, and the helper releases the lease (treating lease_required as success). Other client calls (Get, Update, KeepAlive, Release) continue to surface lease_required immediately so callers can choose their own retry strategy.

For multi-host deployments, build clients with client.NewWithEndpoints([]string{...}). The SDK rotates through the supplied base URLs when a request fails, carrying the same bounded retry budget across endpoints, so failovers remain deterministic.

When Acquire is called with create-only semantics (if_not_exists=true), already_exists is treated as terminal and returned immediately (not retried).

Configuration files

lockd can also read a YAML configuration file (loaded via Viper). At start-up the server looks for $HOME/.lockd/config.yaml; use -c/--config (or LOCKD_CONFIG) to point at an alternate file:

lockd -c /etc/lockd/config.yaml
# or
LOCKD_CONFIG=/etc/lockd/config.yaml lockd

Generate a template with sensible defaults using the helper command:

lockd config gen            # writes $HOME/.lockd/config.yaml
lockd config gen --stdout   # print the template instead of writing a file
lockd config gen --out /tmp/lockd.yaml --force

The generated file contains the same keys as the CLI flags (for example listen-proto, json-max, json-util, payload-spool-mem, disk-retention, disk-janitor-interval, s3-region, s3-disable-tls). When present, the configuration file is read before environment variables so you can override individual settings via LOCKD_* exports or command-line flags.

Bootstrap config, CA, and client certs

Use lockd --bootstrap /path/to/config-dir to idempotently generate a CA bundle (ca.pem), server bundle with kryptograf metadata (server.pem + server.denylist), a starter client certificate (client.pem), and a config.yaml wired to that material. The flag is safe to run on every start; it skips files that already exist and only fills in the missing pieces. Container images built from this repo run with --bootstrap /config by default so mounting an empty /config volume automatically provisions mTLS + storage encryption.

Container image

Build a minimal image with the provided Containerfile:

nerdctl build -t lockd:latest .
# or with podman...
podman build -t lockd:latest .

The final stage is FROM scratch with /lockd as entrypoint. It exposes two volumes:

  • /config – persisted certificates, config, denylist, and kryptograf material
  • /storage – default disk:///storage backend for state/queue data

Run the server:

nerdctl run -p 9341:9341 -v lockd-config:/config -v lockd-data:/storage localhost/lockd:latest

Because the entrypoint appends --bootstrap /config, mounting a fresh config volume auto-creates CA/server/client bundles plus config.yaml. You can also invoke admin commands directly:

nerdctl run -ti -v lockd-config:/config localhost/lockd:latest auth new client \
  --cn worker-2 --out /config/client-worker-2.pem

Environment variables still override everything (for example supply LOCKD_STORE, LOCKD_S3_ENDPOINT, etc.) so the same image works for disk, MinIO, Azure, or AWS deployments.

JSON Compaction Engines

Lockd ships with three drop-in JSON compactors. Select one with --json-util or the LOCKD_JSON_UTIL environment variable:

  • lockd (default) – streaming writer tuned for multi-megabyte payloads.
  • jsonv2 – tokenizer inspired by Go 1.25’s experimental JSONv2 runtime.
  • stdlib – Go’s stock encoding/json.Compact helper for minimal dependencies.

Benchmarks on a 13th Gen Intel Core i7-1355U (Go 1.25.2):

Implementation Small (~150B) ns/op (allocs) Medium (~60KB) ns/op (allocs) Large (~2MB) ns/op (allocs)
encoding/json 380 (1) 69,818 (1) 4,032,017 (1)
lockd 1,609 (5) 98,083 (20) 2,818,759 (26)
jsonv2 1,572 (5) 92,892 (16) 3,046,118 (23)

The default remains lockd, which provides the best throughput on large payloads — the primary lockd use-case — while jsonv2 or stdlib can be selected for small, latency-sensitive workloads. Re-run the suite locally with:

go test -bench=BenchmarkCompact -benchmem ./internal/jsonutil
Streaming State Updates & Payload Spooling

The client SDK now exposes streaming helpers so large JSON blobs no longer need to be buffered in memory. On the same 13th Gen Intel Core i7-1355U host:

Benchmark ns/op MB/s B/op allocs/op
BenchmarkClientGetBytes 707,158 370.72 1,218,417 131
BenchmarkClientGetStream 219,847 1,192.45 8,100 97
BenchmarkClientUpdateBytes 241,807 1,084.16 43,330 114
BenchmarkClientUpdateStream 402,225 651.74 9,759 122

Streaming reads cut allocations by ~150× and uploads by ~4.4×; throughput is a touch lower for uploads because the payload is generated on the fly, but avoids materialising entire documents in memory.

On the server side, lockd compacts JSON through an in-memory spool that spills to disk once a threshold is exceeded. By default up to 4 MiB of the request is kept in RAM. You can tune this via --payload-spool-mem / LOCKD_PAYLOAD_SPOOL_MEM / payload-spool-mem in the config file to trade memory for CPU (or vice versa).

Running the MinIO-backed benchmarks with the default threshold:

Benchmark ns/op MB/s B/op allocs/op
BenchmarkLockdLargeJSON 112,064,222 46.78 22,309,298 6,166
BenchmarkLockdLargeJSONStream 59,486,906 88.14 22,279,726 6,315
BenchmarkLockdSmallJSON 76,891,365 0.01 1,790,225 7,513
BenchmarkLockdSmallJSONStream 18,178,942 0.03 439,805 2,261

Large uploads still allocate heavily because the spool buffers the first 4 MiB before spilling to disk. Lowering the threshold (for example --payload-spool-mem=1MB) pushes more work onto disk IO, which may improve tail latency on constrained hosts. Small updates benefit significantly from streaming even with the default threshold. Choose a value that matches your workload and disk characteristics; the benchmarks above were gathered via:

set -a && source .env.local && set +a && go test -run=^$ -bench=BenchmarkClientGet -benchmem ./client
set -a && source .env.local && set +a && go test -run=^$ -bench=BenchmarkClientUpdate -benchmem ./client
set -a && source .env.local && set +a && go test -run='^$' -bench='BenchmarkLockd(LargeJSON|SmallJSON)' -benchmem ./integration/minio -tags "integration minio bench"
Example Use-cases

In addition to coordinating workflow checkpoints, lockd’s lease + atomic JSON model unlocks several other patterns once performance and durability goals are met:

  • Feature flag shards – hold per-segment experiment state and atomically roll back under contention without adding a new datastore.
  • Session handoff / sticky routing – track live client sessions across stateless edge workers using short leases and protobuf metadata blobs.
  • IoT rollout controller – drive firmware or configuration rollouts where each device claims work and reports progress exactly once.
  • Distributed cron / windowing – serialize recurring jobs (per key) so retries don’t overlap, while keeping per-run state directly in lockd.

Acquire-for-update is particularly useful for these scenarios because the state reader holds the lease while a worker inspects the JSON payload. Once it computes the next cursor it can call Update followed by Release, avoiding any race window between separate update and release calls.

Benchmarking with MinIO

./run-benchmark-suites.sh provides a single entry point for benchmark runs. Invoke ./run-benchmark-suites.sh list to see the available suites (disk, minio, mem/lq) and run ./run-benchmark-suites.sh minio (or all) to execute them with the correct build tags and per-suite logs in benchmark-logs/.

With MinIO running locally (for example on localhost:9000) you can compare raw object-store performance against lockd by running the benchmark suite:

# Large (5 MiB) and small payload benchmarks + concurrency tests
go test -bench . -run '^$' -tags "integration minio bench" ./integration/minio

The harness measures both sequential and concurrent scenarios for large (~5 MiB) and small (~512 B) payloads:

  • Raw MinIO PutObject throughput (large/small).
  • lockd acquire/update/release cycles (large/small).
  • Raw MinIO concurrent writers on distinct keys (large/small).
  • lockd concurrent writers on distinct keys (large/small).

Benchmarks assume the same environment variables as the MinIO integration tests (LOCKD_STORE, MINIO_ROOT_USER, MINIO_ROOT_PASSWORD, etc.). Use LOCKD_STORE=s3://localhost:9000/lockd-integration?insecure=1 for a default local setup, or point it at HTTPS by omitting the ?insecure=1 query string.

Benchmarking with Disk

The disk backend benchmarks pit raw disk.Store throughput against the HTTP API and include an optional NFS-targeted scenario. Run them with:

set -a && source .env.local && set +a && go test -run=^$ -bench='Benchmark(Disk|LockdDisk)' -benchmem ./integration/disk -tags "integration disk bench"
set -a && source .env.local && set +a && go test -run ^$ -bench BenchmarkLockdDiskLargeJSONNFS -benchmem ./integration/disk -tags "integration disk bench"

Source .env.disk (or export the variables manually) before running; the suite fails fast if the required paths are missing:

  • LOCKD_STORE=disk:///absolute/path/on/ssd-or-nvme – disk backend root for local disk benchmarks.
  • LOCKD_STORE=disk:///absolute/path/to/nfs-mount – NFS benchmark root when targeting an NFS-backed mount.
In-memory queue benchmarks

The benchmark/mem/lq package measures dispatcher throughput using the in-process mem:// backend. Each case launches real servers and clients so the numbers reflect the full subscribe/ack handshake rather than synthetic mocks:

go test -bench . -tags "bench mem lq" ./benchmark/mem/lq

Scenarios currently included:

Name Description
single_server_prefetch1_100p_100c 100 producers / 100 consumers on one server with subscribe prefetch=1 (baseline).
single_server_prefetch4_100p_100c Same workload with prefetch batches of four (default tuning).
single_server_subscribe_100p_1c High fan-in into a single blocking subscriber (prefetch=16).
single_server_dequeue_guard Legacy dequeue path kept as a regression guard.
double_server_prefetch4_100p_100c Two servers sharing the same mem store to verify routing/failover performance.

Only the total messages and enqueue/dequeue rates are printed by default so CI stays readable. Extra instrumentation can be toggled per run:

  • MEM_LQ_BENCH_EXTRA=1 – export latency-derived metrics such as dequeue_ms/op, ack_ms/op, messages_per_batch, and dequeue_gap_max_ms.
  • MEM_LQ_BENCH_DEBUG=1 – emit verbose client logs (including stale/idempotent ACKs) when chasing data races.
  • MEM_LQ_BENCH_TRACE=1 – attach the trace logger to both servers; combine with LOCKD_READY_CACHE_TRACE=1 to trace ready-cache refreshes.
  • MEM_LQ_BENCH_TRACE_GAPS=1 – print per-delivery gaps over 10 ms to stdout.
  • MEM_LQ_BENCH_PRODUCERS, MEM_LQ_BENCH_CONSUMERS, MEM_LQ_BENCH_MESSAGES, MEM_LQ_BENCH_PREFETCH, and MEM_LQ_BENCH_WARMUP – override the baked-in workload sizes without editing the source file.
  • MEM_LQ_BENCH_CPUPROFILE=/tmp/cpu.pprof – capture a CPU profile for the run.

These knobs let you keep the fast, quiet default for routine regression runs while enabling deep tracing when profiling throughput locally.

In-process client & background server helper

For tests or embedded use-cases you can run lockd entirely in-process. The client/inprocess package starts a private Unix-socket server and returns a regular client facade:

ctx := context.Background()
cfg := lockd.Config{Store: "mem://", DisableMTLS: true}
inproc, err := inprocess.New(ctx, cfg)
if err != nil { log.Fatal(err) }
defer inproc.Close(ctx)

lease, err := inproc.Acquire(ctx, api.AcquireRequest{Key: "jobs", Owner: "worker", TTLSeconds: 15})
if err != nil { log.Fatal(err) }
defer inproc.Release(ctx, api.ReleaseRequest{Key: "jobs", LeaseID: lease.LeaseID})

Use client.BlockWaitForever (default) to wait indefinitely when acquiring, or client.BlockNoWait to fail immediately if the lease is already held.

Behind the scenes it relies on lockd.StartServer, which launches the server in a goroutine and returns a handle with a Stop method. You can use the helper directly when wiring tests around Unix-domain sockets:

cfg := lockd.Config{Store: "mem://", ListenProto: "unix", Listen: "/tmp/lockd.sock", DisableMTLS: true}
handle, err := lockd.StartServer(ctx, cfg)
if err != nil { log.Fatal(err) }
defer handle.Stop(context.Background())

cli, err := client.New("unix:///tmp/lockd.sock")
if err != nil { log.Fatal(err) }
sess, err := cli.Acquire(ctx, api.AcquireRequest{Key: "demo", Owner: "worker", TTLSeconds: 20})
if err != nil { log.Fatal(err) }
defer sess.Close()

Health endpoints:

  • /healthz – liveness probe
  • /readyz – readiness probe
Test server helper & chaos proxy

The repository includes a dedicated testing harness (lockd.NewTestServer) that starts a fully configured server, returns a shutdown function, and can emit logs through testing.T. It also accepts a ChaosConfig that injects bounded latency, drops, or a single forced disconnect via an in-process TCP proxy. This is used heavily by the Go client’s integration tests to validate AcquireForUpdate failover logic across multiple endpoints.

Storage verification

lockd verify store validates credentials (list/get/put/delete) and prints a suggested IAM policy when access fails. Disk backends run a multi-replica simulation (metadata CAS and payload writes) so locking bugs or stale CAS tokens fail fast.

# Verify a disk mount before starting the server
LOCKD_STORE=disk:///var/lib/lockd lockd verify store

# Verify Azure Blob credentials (requires LOCKD_AZURE_ACCOUNT_KEY or LOCKD_AZURE_SAS_TOKEN)
LOCKD_STORE=azure://lockdaccount/lockd-container LOCKD_AZURE_ACCOUNT_KEY=... lockd verify store

When --store uses disk://, the same verification runs automatically during server startup and the process exits if any check fails.

Structured logging convention

To keep observability output, docs, and dashboards aligned, every structured log emitted by lockd adheres to the following rules:

  • app is always lockd, whether the line originates from the server, CLI, or in-process harness.
  • sys is mandatory and encodes the origin hierarchy as system.subsystem.component[.subcomponent]. Use as many segments as needed to pinpoint the emitting code (for example storage.crypto.envelope, queue.dispatcher.ready_cache.prune, or server.shutdown.controller.drain).
  • We no longer emit svc or component keys—sys replaces both. Any additional structured fields (latency, owner, key, cid, etc.) stay as-is.

High-volume events (per-message queue delivery/enqueue logs, subscription delivery traces) are emitted at Debug/Trace so Info remains a summary signal.

OpenTelemetry traces and Prometheus metrics follow the same convention: every server-managed span records lockd.sys=<value> alongside lockd.operation, so traces and metrics can be filtered with the same identifiers as logs.

TLS (mTLS)

mTLS is enabled by default. lockd looks for a bundle at $HOME/.lockd/server.pem unless --bundle points elsewhere. Disable with --disable-mtls / LOCKD_DISABLE_MTLS=1 (testing only).

Lockd also enforces a listener-level connection guard by default via a dedicated control.connguard component. It classifies suspicious connection attempts (for example bad TLS handshakes or zero-byte/probe-timeout plain-TCP attempts) before the HTTP handler layer and can temporarily block a source IP. This behavior is controlled by connguard-* flags and LOCKD_CONNGUARD_* environment variables; the full matrix is documented in docs/QRF.md.

Bundle format (PEM concatenated):

  1. CA certificate (trust anchor)
  2. Server certificate (leaf + chain)
  3. Server private key
  4. Optional denylist block (LOCKD DENYLIST)

The CA private key lives in ca.pem and should be stored securely. Keep it offline when possible; only the CA certificate is bundled with each server.

Generating certificates
# Create a Certificate Authority bundle (ca.pem)
lockd auth new ca --cn lockd-root

# Issue a server certificate signed by the CA
lockd auth new server --ca-in $HOME/.lockd/ca.pem --hosts lockd.example.com --hosts 127.0.0.1

# Explicit wildcard SAN (equivalent to default behavior when --hosts is omitted)
lockd auth new server --ca-in $HOME/.lockd/ca.pem --hosts '*'

# Issue a new client certificate signed by the bundle CA
lockd auth new client --ca-in $HOME/.lockd/ca.pem --cn worker-1

# Issue a TC client certificate for TC-to-TC / TC-to-RM calls
lockd auth new tcclient --ca-in $HOME/.lockd/ca.pem --cn tc-worker-1

# Revoke previously issued client certificates (by serial number)
lockd auth revoke client <hex-serial> [<hex-serial>...]

# Inspect bundle details (CA, server cert, denylist)
lockd auth inspect server        # lists all server*.pem bundles
lockd auth inspect client --in $HOME/.lockd/client-*.pem   # includes tc-client*.pem as well

# Verify bundles (validity, EKUs, denylist enforcement)
lockd auth verify server        # scans all server*.pem in the config dir
lockd auth verify client --server-in $HOME/.lockd/server.pem

The commands default to $HOME/.lockd/, creating the directory with 0700 and files with 0600 permissions. Use --out/--ca-in/--force to override file locations. ca.pem contains the trust anchor + private key and is intended to be stored in a secure location separate from the server runtime bundle.

lockd auth new server treats --hosts as SAN DNS/IP values and accepts both repeatable flags and CSV input. If --hosts is omitted, lockd issues a wildcard (*) SAN by default.

When lockd auth new server writes to the default location and server.pem already exists, the CLI automatically picks the next available name (server02.pem, server03.pem, …) so existing bundles are preserved without requiring --force.

Client issuance follows the same pattern: the first default bundle is written to client.pem, then client02.pem, client03.pem, and so on when rerun.

lockd auth verify ensures that the server bundle presents a CA + ServerAuth certificate (with matching server private key) and that client bundles were issued by the same CA and are not present on the denylist.

lockd auth revoke client updates the denylist for every server*.pem bundle in the same directory as the referenced server bundle so multi-replica nodes block revoked certificates consistently. Pass --propagate=false to limit the update to just the specified bundle when needed (e.g. staging experiments).

Storage Encryption

All metadata, lock state JSON blobs, and queue payloads are encrypted at rest by default using pkt.systems/kryptograf. When you run lockd auth new ca the CLI generates a kryptograf root key and a global metadata descriptor and embeds them alongside the CA certificate/key in ca.pem. Subsequent lockd auth new server invocations propagate the same material into each server.pem bundle so every replica can reconstruct the metadata DEK on startup.

At runtime the server mints per-object DEKs (one per lock state blob, per queue message metadata document, and per queue payload) derived from stable contexts (e.g. state:<key>, queue-meta:q/<queue>/msg/<id>.pb, queue-payload:q/<queue>/msg/<id>.bin). Descriptors for these DEKs are stored alongside the objects themselves so reads remain stateless. Encrypted objects use deterministic content-types:

  • Metadata protobuf: application/vnd.lockd+protobuf-encrypted
  • JSON state: application/vnd.lockd+json-encrypted
  • Queue payloads / DLQ binaries: application/vnd.lockd.octet-stream+encrypted

Disable encryption (testing only) with --disable-storage-encryption (or LOCKD_DISABLE_STORAGE_ENCRYPTION=1). Optional Snappy compression is available via --storage-encryption-snappy (only honoured when encryption remains enabled); when encryption is disabled, the original content-types (application/x-protobuf, application/json, application/octet-stream) are restored automatically.

lockd verify store now exercises the decrypt path by reading (or, when the store is empty, synthesising and deleting) sample metadata/state and queue objects. Failures surface immediately so misconfigured bundles or mismatched descriptors are caught during deployment. Because storage encryption is tied to the bundle, servers must load server.pem even when mTLS is disabled.

Go Client Usage

cli, err := client.New("https://lockd.example.com")
if err != nil { log.Fatal(err) }
sess, err := cli.Acquire(ctx, api.AcquireRequest{
    Key:        "orders",
    Owner:      "worker-1",
    TTLSeconds: 30,
    BlockSecs:  client.BlockWaitForever,
})
if err != nil { log.Fatal(err) }
defer sess.Close()

var payload struct {
    Data []byte
    Counter int
}

if err := sess.Load(ctx, &payload); err != nil { log.Fatal(err) }
payload.Counter++
if err := sess.Save(ctx, payload); err != nil { log.Fatal(err) }

// Customise timeouts (HTTP requests + close/release window) when constructing the client:
cli, err := client.New(
    "https://lockd.example.com",
    client.WithHTTPTimeout(30*time.Second),
    client.WithCloseTimeout(10*time.Second),
    client.WithKeepAliveTimeout(8*time.Second),
)

Create-only acquire (initialize once, fail if state already exists):

_, err := cli.Acquire(ctx, api.AcquireRequest{
    Key:         "orders",
    Owner:       "initializer",
    TTLSeconds:  30,
    BlockSecs:   client.BlockNoWait,
    IfNotExists: true,
})
if err != nil {
    if errors.Is(err, client.ErrAlreadyExists) {
        // Key was already initialized by another worker.
        return
    }
    log.Fatal(err)
}

Use either errors.Is(err, client.ErrAlreadyExists) or client.IsAlreadyExists(err); for SDK-returned acquire errors they are equivalent checks.

To connect over a Unix domain socket (useful when the server runs on the same host), point the client at unix:///path/to/lockd.sock:

cli, err := client.New("unix:///var/run/lockd.sock")
if err != nil { log.Fatal(err) }
// run the server with --disable-mtls (or supply a client bundle)
sess, err := cli.Acquire(ctx, api.AcquireRequest{Key: "orders", Owner: "worker-uds", TTLSeconds: 30})
if err != nil { log.Fatal(err) }
defer sess.Close()

Acquire automatically retries conflicts and transient 5xx/429 responses with exponential backoff.

Client CLI

lockd client ships alongside the server binary for quick interactions with a running cluster. Flags mirror the Go SDK defaults and honour LOCKD_CLIENT_* environment variables.

# Acquire and release leases (exports LOCKD_CLIENT_* env vars)
eval "$(lockd client acquire --server 127.0.0.1:9341 --owner worker-1 --ttl 30s --key orders)"
lockd client keepalive --ttl 45s --key orders
lockd client release --key orders

# State operations / pipe through edit
lockd client get --key orders -o - \
  | lockd client edit /status/counter++ \
  | lockd client update --key orders
lockd client remove --key orders

# Atomic JSON mutations (streaming server-side mutate under the active lease)
lockd client mutate --key orders /progress/step=fetch /progress/count++ time:/progress/updated=NOW

# Client-local streaming mutate (required for file-backed mutators)
lockd client mutate --local --key orders 'base64file:/payload=blob.bin' /filename=blob.bin

# Rich mutations with brace/quoted syntax (LQL)
lockd client mutate --key ledger \
  '/data{/hello key="mars traveler",/count++}' \
  /meta/previous=world \
  time:/meta/processed=NOW

# Client-local streaming set (no full document buffering)
lockd client set --key ledger 'textfile:/content=notes.txt' /filename=notes.txt

# Local JSON helper (no server interaction)
lockd client edit checkpoint.json /progress/step="done" /progress/count=+5

# Query keys or stream documents
lockd client query '/report/status="staged"'
lockd client query --output text --file keys.txt '/report/status="staged"'
lockd client query --documents --file staged.ndjson '/progress/count>=10'
lockd client query --documents --directory ./export '/report/region="emea"'

`lockd client query` parses the same LQL selector syntax as the server and
defaults to a compact JSON array of keys. Pass `--output text` for newline
lists that are easy to pipe into other shell tools. `--documents` switches the
request to `return=documents`, streaming each JSON state as NDJSON (to stdout by
default, or `--file`/`--directory` to store the feed). Directory mode writes one
`<key>.json` file per match, making it trivial to diff or archive results.
Selectors support shorthand comparison syntax, so `/progress/count>=10` is
automatically rewritten into the full `range{...}` clause, and `/status="ok"`
maps to an equality predicate without brace boilerplate. The selector argument
is required; to query “everything” explicitly pass an empty string (e.g.
`lockd client query ""`).

Pass --public to lockd client get when you only need the last published state and don’t want to hold a lease (the CLI calls /v1/get?public=1 under the hood). The default mode still enforces lease ownership and fencing tokens so writers remain serialized.

Every lockd client subcommand accepts an optional --key (-k) flag. When you omit --key, the command falls back to LOCKD_CLIENT_KEY (typically set by the most recent acquire). Invoking acquire without --key requests a server-generated identifier; the resulting value is exported via LOCKD_CLIENT_KEY so follow-up calls can rely on the environment.

Use lockd client acquire --if-not-exists to request create-only semantics. When the key already exists, acquire fails with already_exists and exits without retrying.

Queue operations

Queue commands ship alongside the standard lease helpers:

# Enqueue a JSON payload (stdin, --data, or --file)
lockd client queue enqueue \
  --queue orders \
  --content-type application/json \
  --data '{"op":"ship","order_id":42}'

# Dequeue and export LOCKD_QUEUE_* environment variables
eval "$(lockd client queue dequeue --queue orders --owner worker-1)"
printf 'payload stored at %s\n' "$LOCKD_QUEUE_PAYLOAD_PATH"

# Use the exported metadata to ack/nack/extend
lockd client queue ack
lockd client queue nack --delay 15s --reason "upstream retry"
lockd client queue nack --intent defer --delay 15s
lockd client queue extend --extend 45s

queue dequeue supports a --stateful flag which acquires both the message and workflow state leases; the exported LOCKD_QUEUE_STATE_* variables align with the fields consumed by queue ack/nack/extend.

queue nack supports --intent (-i) with two values:

  • failure (default): records a failed processing attempt, increments failure_attempts, and counts toward max_attempts.
  • defer: intentional requeue for non-failure workflow waits; does not increment failure_attempts and does not consume the max_attempts failure budget.

Use defer when work should be retried later without being treated as a processing failure. --reason is only valid with --intent failure.

Custom clients must send /v1/queue/enqueue as multipart/form-data (or multipart/related) with two parts named meta and payload. The meta part contains a JSON-encoded api.EnqueueRequest; the payload part streams the message body and may use any Content-Type (e.g. application/json). Earlier builds auto-detected JSON, but current releases require the explicit field names, matching the Go SDK and CLI.

Payloads are streamed directly to disk. When --payload-out is omitted the CLI creates a temporary file and exports its location via LOCKD_QUEUE_PAYLOAD_PATH, making it easy to hand off large bodies to other tools without buffering in memory.

The CLI auto-discovers client*.pem bundles under $HOME/.lockd/ (or use --bundle) and performs the same host-agnostic mTLS verification as the SDK. mutate, set, and edit consume the shared LQL mutation DSL. mutate is the streaming server-side path (/v1/mutate) unless -l/--local is set. set now uses a client-local streaming Get -> LQL MutateStream -> Update pipeline so large documents and file-backed mutators do not require full JSON buffering in memory. file:, textfile:, and base64file: mutation values are local-only and require set or mutate --local. Paths now use JSON Pointer syntax (RFC 6901) (/progress/counter++), so literal dots, spaces, or Unicode in keys are handled transparently (only / and ~ are escaped as ~1/~0). The mutator also supports brace blocks that fan out to nested fields (/data{/hello key="mars traveler",/count++}), increments (=+3/--), removals (rm:/delete:), and time: prefixes for RFC3339 timestamps. Commas and newlines can be mixed freely, e.g. lockd client mutate --key ledger 'meta{/owner="alice",/previous="world"}'.

Timeout knobs mirror the Go client: --timeout (HTTP dial+request), --close-timeout (release window), and --keepalive-timeout (LOCKD_CLIENT_TIMEOUT, LOCKD_CLIENT_CLOSE_TIMEOUT, and LOCKD_CLIENT_KEEPALIVE_TIMEOUT respectively). Use --drain-aware-shutdown/LOCKD_CLIENT_DRAIN_AWARE (enabled by default) to control whether the CLI auto-releases leases when the server signals a drain phase via the Shutdown-Imminent header.

Use - with --output to stream results to standard output or with file inputs to read from standard input (e.g. -o -, lockd client update ... -). When the acquire command runs in text mode it prints shell-compatible export LOCKD_CLIENT_*= assignments, making eval "$(lockd client acquire ...)" a convenient way to populate environment variables for subsequent commands.

When mTLS is enabled (default) the CLI assumes HTTPS for bare host[:port] values; when you pass --disable-mtls it assumes HTTP. Supplying an explicit http://.../https://... URL is always honoured.

Integration Tests

Integration suites are selected via build tags. Every run must include the integration tag plus one (or more) backend tags; optional feature tags narrow the scope further. The general pattern is:

go test -tags "integration <backend> [feature ...]" ./integration/...

For everyday development, ./run-integration-suites.sh wraps these invocations, sources the required .env.<backend> files, and stores logs under integration-logs/. Use list to see available suites (mem, disk, nfs, aws, azure, minio, plus /lq, /query, and /crypto variants), pass specific suites such as disk disk/lq or nfs nfs/lq, or run the full matrix with all. The helper honors LOCKD_GO_TEST_TIMEOUT, uses LOCKD_AWS_GO_TEST_TIMEOUT (default 10m) for AWS suites, exports LOCKD_TEST_STORAGE_ENCRYPTION=1 so disk/nfs/etc. suites run with envelope crypto by default, and exposes --disable-crypto to flip the env var when targeting legacy buckets.

Current backends:

Backend Notes Examples
mem Uses the in-memory store; no environment needed. go test -tags "integration mem" ./integration/... (all mem suites) / go test -tags "integration mem lq" ./integration/... (queue scenarios only) / go test -tags "integration mem query" ./integration/... (query-only scenarios).
disk Local disk backend. Requires .env.disk (see integration/disk). set -a && source .env.disk && set +a && go test -tags "integration disk" ./integration/... / ... -tags "integration disk lq" ... for queue-only coverage.
nfs Disk backend mounted on NFS. Source .env.nfs so LOCKD_STORE points at the NFS mount. set -a && source .env.nfs && set +a && go test -tags "integration nfs lq" ./integration/....
aws Real S3 credentials via .env. set -a && source .env && set +a && go test -tags "integration aws" ./integration/....
minio, azure S3-compatible / Azure Blob suites. e.g. go test -tags "integration minio" ./integration/... (requires appropriate env).

The queue-specific feature tag is lq, and query-specific coverage uses the query tag. A suite built with integration && mem && lq, for example, only compiles the queue wrappers in integration/mem/lq, while integration && mem && query targets the integration/mem/query tests without running the rest of the mem suite. We’ll extend the same layout to the AWS, Azure, and MinIO queue suites next so go test -tags "integration aws lq" ./integration/... (and similar) will target their queue scenarios without running unrelated tests.

For the Go client’s AcquireForUpdate failover tests:

go test -tags integration -run 'TestAcquireForUpdateCallback(SingleServer|FailoverMultiServer)' ./client

These harnesses ship with watchdog timers that panic after 5–10 s and print full goroutine dumps via runtime.Stack, making hangs immediately actionable— retain these guards whenever tests are updated.

Roadmap

  • Javascript/Typescript client (bun/deno-compatible).
  • Python client.
  • C# client.
  • Client helpers (auto keepalive, JSON patch utilities).
  • Metrics/observability and additional diagnostics.

License

MIT – see LICENSE.

Documentation

Overview

Package lockd exposes the Go APIs behind the single-binary coordination plane that combines exclusive leases, atomic JSON state (with search/index), binary attachments, and an at-least-once queue. The server runs cleanly as PID 1 or can be embedded as a library; the same storage abstraction powers disk, S3/MinIO, Azure Blob, and in-memory backends with optional envelope encryption.

Copyright (C) 2025 Michel Blomgren <https://pkt.systems>

Running a server

The server listens on the network specified by Config.ListenProto (default tcp) and address Config.Listen. Mutual TLS is enabled by default.

cfg := lockd.Config{
    Store:       "s3://locks/prod",
    Listen:      ":9341",
    ListenProto: "tcp",
    BundlePath:  "/etc/lockd/server.pem",
    DefaultNamespace: "analytics",
}
srv, err := lockd.NewServer(cfg)
if err != nil { log.Fatal(err) }
go func() {
    if err := srv.Start(); err != nil {
        log.Fatalf("lockd: %v", err)
    }
}()
defer func() {
    if err := srv.Shutdown(context.Background()); err != nil {
        log.Printf("lockd shutdown: %v", err)
    }
}()

Disk/NFS backends use a log-structured store with durable group commit. Batching is driven by natural backpressure: fsync occurs for each batch and the queue builds only while prior fsyncs are in-flight. LogstoreCommitMaxOps caps batch size. LogstoreSegmentSize controls when segment files roll. Background snapshot compaction is enabled by default on disk/NFS. It compacts sealed history only, installs snapshots atomically, and deletes obsolete files after a grace period. Tune it with LogstoreCompactionEnabled, LogstoreCompactionInterval, LogstoreCompactionMinSegments, LogstoreCompactionMinReclaimBytes, LogstoreCompactionDeleteGrace, LogstoreCompactionMaxIOBytesPerSec, and DisableLogstoreCompactionThrottling. DiskLockFileCacheSize caps cached lockfile descriptors for disk/NFS (default 2048; set negative to disable caching). Hot state reads can be cached in-process via StateCacheBytes (default 64 MiB; set 0 to disable). QueryDocPrefetch controls parallel fetch depth for query return=documents (default 8; set 1 to disable). In HA concurrent mode, single-writer optimizations are disabled.

cfg := lockd.Config{
    Store:                              "disk:///var/lib/lockd-data",
    LogstoreCommitMaxOps:               128,                // disk/NFS only
    LogstoreSegmentSize:                64 << 20,           // disk/NFS only (bytes)
    LogstoreCompactionInterval:         30 * time.Minute,   // disk/NFS only
    LogstoreCompactionDeleteGrace:      15 * time.Minute,   // disk/NFS only
    LogstoreCompactionMaxIOBytesPerSec: 8 << 20,            // disk/NFS only
    DiskLockFileCacheSize:              2048,               // disk/NFS only
    StateCacheBytes:                    64 << 20,           // cache hot state payloads
    QueryDocPrefetch:                   8,                  // return=documents prefetch depth
}

The CLI mirrors this with --logstore-commit-max-ops, --logstore-segment-size, --logstore-compaction, --logstore-compaction-interval, --logstore-compaction-min-segments, --logstore-compaction-min-reclaim-size, --logstore-compaction-delete-grace, --logstore-compaction-max-io-bytes-per-sec, --disable-logstore-compaction-throttling, --disk-lock-file-cache-size, --state-cache-bytes, and --query-doc-prefetch.

HA modes

When multiple lockd servers share the same backend, HAMode controls concurrent vs coordinated behaviour. HAMode="failover" (default) uses a lease stored under the internal .ha/activelease key to elect a single active writer; passive nodes return HTTP 503 so clients can retry another endpoint. HAMode="concurrent" enables multi-writer semantics. HAMode="single" disables HA coordination entirely and assumes the backend is owned by one server process. HAMode="auto" starts in single-writer mode and promotes to failover when it observes another live node. HALeaseTTL controls the lease duration in failover mode and heartbeat cadence in auto mode. On backends without native single-writer detection, HASinglePresenceTTL controls how long a single-mode presence record fences peers.

cfg := lockd.Config{
    Store:               "disk:///var/lib/lockd-data",
    HAMode:              "failover",
    HALeaseTTL:          10 * time.Second,
    HASinglePresenceTTL: 5 * time.Minute, // object-store style backends only
}

The CLI mirrors this with --ha, --ha-lease-ttl, and --ha-single-presence-ttl.

Namespaces partition keys and metadata. When callers omit the namespace, the server falls back to Config.DefaultNamespace (default "default"). Setting the field on Config, providing Namespace in api.AcquireRequest, or configuring clients via client.WithDefaultNamespace keeps each workload’s state isolated under its own prefix. Namespaces that start with a dot are reserved for lockd internals (e.g. .txns stores transaction records) and are rejected by both the HTTP layer and the core service. Always use user-defined namespaces that do not begin with '.'.

Unix domain sockets

For same-host sidecars you can serve over a Unix socket by setting ListenProto to "unix". Clean-up is automatic and mTLS can be disabled when the connection never leaves the machine.

cfg := lockd.Config{
    Store:        "mem://",
    ListenProto:  "unix",
    Listen:       "/var/run/lockd.sock",
    DisableMTLS:  true,
}
handle, err := lockd.StartServer(ctx, cfg)
if err != nil { log.Fatal(err) }
defer handle.Stop(context.Background())

Client SDK

The Go client (pkt.systems/lockd/client) wraps the HTTP API. The base URL decides the transport:

  • https://host:9341 – production mTLS connection (default)
  • http://host:9341 – plain HTTP for trusted networks or local testing
  • unix:///path/to/lockd.sock – Unix-domain sockets (requires DisableMTLS or supplying a client bundle)

Example:

cli, err := client.New("https://lockd.example.com")
if err != nil { log.Fatal(err) }
sess, err := cli.Acquire(ctx, api.AcquireRequest{
    Namespace: "orders-v2",
    Key:        "orders",
    Owner:      "worker-1",
    TTLSeconds: 30,
    BlockSecs:  client.BlockWaitForever,
    // Optional: join an existing transaction across keys.
    TxnID:      existingTxnID,
})
if err != nil { log.Fatal(err) }
defer sess.Close()
var state struct {
    Hello   string
    World   string
    Counter int
}
if err := sess.Load(ctx, &state); err != nil { log.Fatal(err) }
state.Counter++
if err := sess.Save(ctx, state); err != nil { log.Fatal(err) }

For create-only initialization, set AcquireRequest.IfNotExists=true. When state already exists for the key, acquire fails with error code "already_exists". The SDK exposes both helper styles: client.IsAlreadyExists(err) and errors.Is(err, client.ErrAlreadyExists). You only need one check style; for SDK-returned acquire errors they are equivalent.

_, err = cli.Acquire(ctx, api.AcquireRequest{
    Namespace:   "orders-v2",
    Key:         "orders",
    Owner:       "initializer",
    TTLSeconds:  30,
    BlockSecs:   client.BlockNoWait,
    IfNotExists: true,
})
if err != nil && client.IsAlreadyExists(err) {
    // Another worker initialized the key first.
}

The lease session carries the TxnID minted by Acquire. All lease-bound mutations (Update/Remove/UpdateMetadata/Release/attachments) require that transaction id. The SDK wires X-Txn-ID automatically when you use LeaseSession; custom HTTP clients must supply it (and include txn_id in the release request body).

The client tracks fencing tokens automatically; reusing the same Client instance ensures follow-up KeepAlive/Get/Update/Release calls include the freshest X-Fencing-Token. For multi-process flows the CLI exports the token via LOCKD_CLIENT_FENCING_TOKEN, and your program can register it manually with Client.RegisterLeaseToken.

Shutdown sequencing is controlled via Config.DrainGrace and Config.ShutdownTimeout. Draining gives existing lease holders time to wrap up (default 10 s) before the HTTP server begins closing connections; the timeout caps the combined drain + HTTP shutdown interval (default 10 s). The defaults mirror the CLI flags (--drain-grace, --shutdown-timeout) and can be disabled by setting them to 0. Each server splits the budget 80/20 so a 10 s timeout reserves ~8 s for draining and ~2 s for http.Server.Shutdown.

Sweeping is split into two paths: (1) transparent cleanup on relevant ops (e.g., if an acquire encounters an expired lease) and (2) an idle maintenance sweeper that runs only after a period of inactivity. The idle sweeper is rate-limited (max ops and max runtime) and can be configured to pause between operations to reduce backend pressure. Transaction replay has its own throttle so active traffic can kick replay without inheriting the idle sweeper cadence. Configure these in the server SDK with:

cfg := lockd.Config{
    Store:            "aws://lockd-prod",
    SweeperInterval:  5 * time.Minute,   // idle sweep tick
    IdleSweepGrace:   5 * time.Minute,   // idle time before a sweep can start
    IdleSweepOpDelay: 100 * time.Millisecond,
    IdleSweepMaxOps:  1000,
    IdleSweepMaxRuntime: 30 * time.Second,
    TxnReplayInterval:   5 * time.Second, // throttle for replay on active ops
}

To disable the idle sweeper entirely, set SweeperInterval <= 0 (replay throttling is independent and controlled by TxnReplayInterval).

Queue transactions use a dedicated worklist to avoid list scans when a queue message is enlisted in a transaction decision. When a commit/rollback includes a queue message participant, the TC records a small per-queue decision list under .txn-queue-decisions. Dequeue checks that worklist at most once per cache window (no background sweeps) and applies up to a capped number of items, making rollback-visible messages reappear after restarts without scanning the queue. Non-transactional queues do not write these markers, so they incur no extra IO beyond the cached check.

Configure the worklist behavior with:

cfg := lockd.Config{
    QueueDecisionCacheTTL: 60 * time.Second, // cache empty worklist checks
    QueueDecisionMaxApply: 50,               // max items applied per dequeue
    QueueDecisionApplyTimeout: 2 * time.Second, // time budget per dequeue apply
}

The CLI mirrors these as --queue-decision-cache-ttl, --queue-decision-max-apply, and --queue-decision-apply-timeout.

The Go client and CLI are drain-aware by default: when the server emits the Shutdown-Imminent header, the SDK auto-releases leases once in-flight work is done. Opt out by using client.WithDrainAwareShutdown(false) or passing --drain-aware-shutdown=false / LOCKD_CLIENT_DRAIN_AWARE=false to the CLI.

Multi-host deployments can construct the client with multiple base URLs via client.NewWithEndpoints([]string{...}). The SDK rotates through the provided endpoints on failure, carrying the same bounded retry budget so that reacquire attempts remain deterministic even when the primary server drops mid-request.

Acquire-for-update workflow

AcquireForUpdate wraps the usual acquire → get → update → release cycle in a single helper. Callers supply a function that receives an AcquireForUpdateContext; the context exposes the current StateSnapshot (so the handler can stream or decode the JSON) and forwards convenience methods like Update/UpdateBytes/Save/Remove. The helper keeps the lease alive in the background while the handler executes and always releases the lease when the handler returns. Both LeaseSession and AcquireForUpdateContext expose Remove helpers that delete the JSON blob while holding the lease. They honour the same conditional headers (X-If-Version, X-If-State-ETag) as updates and clear the server-side metadata when the delete succeeds.

Handshake failures (for example lease_required during the initial Get) consume the same bounded retry budget controlled by client.WithAcquireFailureRetries and client.WithAcquireBackoff. Once the handler starts running, further errors are surfaced immediately—the typical pattern is to return the error (acquire-for-update propagates it) and decide whether to retry at a higher level.

AcquireForUpdate forwards the acquire request unchanged. If IfNotExists=true and the key already exists, it returns already_exists and does not invoke the handler.

Explicit Release calls elsewhere still treat lease_required as success so teardown never hangs even if the lease has already been reclaimed.

State attachments

Keys can carry multiple named binary attachments. Attachments are staged under the lease transaction just like JSON state updates: attach files while holding the lease, and they become visible to public reads once the lease is released (commit). Attachments are stored under state/<key>/attachments/<id> and flow through the same encryption-at-rest pipeline as state/queue payloads. Lease-bound attachment operations require X-Txn-ID; public reads do not.

Use ListAttachments/RetrieveAttachment on the lease to inspect staged files, and DeleteAttachment/DeleteAllAttachments to stage removals that apply on release (rollbacks discard staged changes).

lease, err := cli.Acquire(ctx, api.AcquireRequest{Key: "orders", Owner: "worker-1", TTLSeconds: 30})
if err != nil { log.Fatal(err) }
defer lease.Close()

if _, err := lease.Attach(ctx, client.AttachRequest{
    Name:        "invoice.pdf",
    ContentType: "application/pdf",
    Body:        fileReader,
}); err != nil {
    log.Fatal(err)
}
if err := lease.Release(ctx); err != nil { log.Fatal(err) }

// Public reads can list/retrieve attachments after release.
resp, err := cli.Get(ctx, "orders")
if err != nil { log.Fatal(err) }
defer resp.Close()
attachments, err := resp.ListAttachments(ctx)
if err != nil { log.Fatal(err) }
_ = attachments

Queue service

lockd includes an at-least-once queue built on the same storage backends, encryption pipeline, and namespace layout as the lease/state surface. The HTTP API exposes /v1/queue/enqueue, /v1/queue/dequeue, /v1/queue/dequeue/state, /v1/queue/subscribe, /v1/queue/ack, /v1/queue/nack, and /v1/queue/extend. Namespaces are required for every queue call; omitting the field falls back to Config.DefaultNamespace, and the Go client/CLI mirror the same default/override behaviour (LOCKD_QUEUE_NAMESPACE for CLI flows).

Producers stream payloads (optionally zero-length) alongside JSON metadata describing queue, delay_seconds, visibility_timeout_seconds, ttl_seconds, max_attempts, and arbitrary attributes. Consumers issue dequeue requests with an owner identity; responses stream message metadata and payload via multipart/related parts so large blobs never buffer in RAM. State-aware dequeues acquire a workflow lease in the same request and expose helpers that keep message + state fencing tokens aligned when acking, nacking, or extending visibility.

The queue dispatcher multiplexes watchers/pollers across namespaces and storage backends. Disk/mem stores use native notifications (inotify / in-process) when available and fall back to polling when disabled or running on NFS. Object stores rely on the same observed-key tracking used by the lock surface so stale reads never reset metadata.

Perimeter defence (LSF + QRF)

Each server embeds a **Local Security Force (LSF)** observer that samples host metrics (memory, swap, load averages, CPU) plus per-endpoint inflight counters, and a **Quick Reaction Force (QRF)** controller that applies adaptive back-pressure. When the QRF soft-arms or engages, lockd paces requests server-side by waiting for a computed delay before continuing. Only if the delay exceeds the configured max wait does lockd respond with HTTP 429, surface a Retry-After hint, and tag the response with X-Lockd-QRF-State. The Go client honours these signals automatically; other clients should do the same to keep queues draining while the perimeter defence recovers. Configuration knobs and workflow details live in docs/QRF.md. By default the controller leans on host-wide memory budgets (80 % soft / 90 % hard) and load-average multipliers derived from the LSF baseline (4×/8×) while queue/lock/query inflight guards remain disabled unless configured explicitly.

Telemetry

Traces are exported over OTLP when Config.OTLPEndpoint is set (gRPC by default; use grpc://, grpcs://, http://, or https:// to force a transport). Metrics are exposed via a Prometheus scrape endpoint when Config.MetricsListen is non-empty (for example :9464). Runtime profiling metrics (goroutines, heap, scheduler latency) are opt-in via Config.EnableProfilingMetrics. A pprof debug listener can be exposed with Config.PprofListen. All three can be enabled together or independently:

cfg := lockd.Config{
    Store:         "disk:///var/lib/lockd",
    OTLPEndpoint:  "localhost:4317",
    MetricsListen: ":9464",
    EnableProfilingMetrics: true,
    PprofListen:            ":6060",
}
srv, err := lockd.NewServer(cfg)
if err != nil { log.Fatal(err) }
defer srv.Close(context.Background())

Embedded servers can override metrics via lockd.WithMetricsListen when using helpers such as StartServer or client/inprocess.

Embedding and helpers

StartServer launches a server in a goroutine, waits for readiness, and returns a handle with a Stop method. It’s useful when wiring lockd into existing processes or sidecars. The client/inprocess package builds on top of it, starting an embedded server (MTLS disabled) and returning a ready-to-use client facade:

cfg := lockd.Config{Store: "mem://", DisableMTLS: true}
inproc, err := inprocess.New(ctx, cfg)
if err != nil { log.Fatal(err) }
defer inproc.Close(ctx)

sess, err := inproc.Acquire(ctx, api.AcquireRequest{Key: "demo", Owner: "inproc", TTLSeconds: 20})
if err != nil { log.Fatal(err) }
defer sess.Close()

Storage backends

Configure the storage layer via Config.Store:

  • mem:// – in-memory (tests and local experimentation)
  • disk:///var/lib/lockd-data – SSD/NVMe-oriented disk backend with optional retention
  • azure://account/container – Azure Blob Storage (Shared Key or SAS auth)
  • aws://bucket/prefix – AWS S3 (uses standard AWS credential sources, requires region)
  • s3://host:port/bucket – MinIO or other S3-compatible stores (TLS on unless ?insecure=1)

JSON uploads are compacted using the selected compactor (see Config.JSONUtil), and large payloads spill to disk after Config.SpoolMemoryThreshold.

LQL query & mutation language

Both the CLI and HTTP APIs share a common selector/mutation DSL implemented by pkt.systems/lql. Selectors accept JSON Pointer field paths (and.eq{field=/status,value=open}, or.1.range{field=/progress/percent,gte=50}), while mutations cover assignments, arithmetic (++, --, =+5), removals (rm:/delete:), time: aliases for RFC3339 timestamps, and brace shorthand that fans out to nested keys. Examples:

lockd client set --key ledger \
    '/data{/hello key="mars traveler",/count++}' \
    /meta/previous=world \
    time:/meta/processed=NOW

Keys follow RFC 6901 JSON Pointer semantics (leading /; escape / as ~1 and ~ as ~0). Commas/newlines can be mixed freely—making it practical to paste production-style JSON paths into CLI tests, Go unit tests, or query strings (/v1/query?and.eq{...}).

Consult README.md for detailed guidance, additional examples, and operational considerations (TLS, auth bundles, environment variables).

Index

Constants

View Source
const (

	// SPIFFESDKPrefix is the SPIFFE URI prefix for SDK client identities.
	SPIFFESDKPrefix = "spiffe://lockd/sdk/"
	// SPIFFETCPrefix is the SPIFFE URI prefix for transaction coordinator client identities.
	SPIFFETCPrefix = "spiffe://lockd/tc/"
	// SPIFFEServerPrefix is the SPIFFE URI prefix for server identities.
	SPIFFEServerPrefix = "spiffe://lockd/server/"
)
View Source
const (
	// JSONUtilLockd selects the native lockd streaming JSON compactor.
	JSONUtilLockd = "lockd"
	// JSONUtilJSONV2 enables the Go 1.25 json/v2 tokenizer pipeline.
	JSONUtilJSONV2 = "jsonv2"
	// JSONUtilStdlib opts into the encoding/json standard library implementation.
	JSONUtilStdlib = "stdlib"
)
View Source
const (

	// DefaultQueuePollInterval controls how often the dispatcher polls storage when no event hint exists.
	DefaultQueuePollInterval = 3 * time.Second
	// DefaultQueuePollJitter adds randomised delay to poll intervals to stagger load.
	DefaultQueuePollJitter = 500 * time.Millisecond
	// DefaultQueueResilientPollInterval bounds how often watchers fall back to polling to recover missed events.
	DefaultQueueResilientPollInterval = 5 * time.Minute
	// DefaultQueueListPageSize caps how many queue metadata entries are fetched per poll.
	DefaultQueueListPageSize = 128
)
View Source
const (
	// DefaultPayloadSpoolMemoryThreshold defines how much JSON payload is buffered in memory before spilling to disk.
	DefaultPayloadSpoolMemoryThreshold = defaultSpoolMemoryThreshold
	// DefaultListen is the default TCP endpoint the server binds to.
	DefaultListen = ":9341"
	// DefaultListenProto controls the scheme used when no protocol is configured.
	DefaultListenProto = "tcp"
	// DefaultMetricsListen is the default metrics endpoint (Prometheus scrape).
	// Empty disables metrics unless explicitly configured.
	DefaultMetricsListen = ""
	// DefaultPprofListen is the default pprof debug listener (empty disables).
	DefaultPprofListen = ""
	// DefaultStore points the server at the in-memory backend when no store is provided.
	DefaultStore = "mem://"
	// DefaultHAMode controls coordination behaviour when multiple servers share a backend.
	DefaultHAMode = "failover"
	// DefaultHALeaseTTL controls how long HA failover leases are held in failover mode.
	DefaultHALeaseTTL = 10 * time.Second
	// DefaultHASinglePresenceTTL controls how long single-mode presence records
	// remain live on backends that require .ha advertisement.
	DefaultHASinglePresenceTTL = 5 * time.Minute
	// DefaultJSONMaxBytes bounds incoming JSON payloads.
	DefaultJSONMaxBytes = 100 * 1024 * 1024
	// DefaultAttachmentMaxBytes bounds attachment payloads when not specified by the caller.
	DefaultAttachmentMaxBytes = int64(1 << 40)
	// DefaultDefaultTTL is the baseline lease duration handed to new acquirers.
	DefaultDefaultTTL = 30 * time.Second
	// DefaultMaxTTL is the hard ceiling enforced on user-supplied TTLs.
	DefaultMaxTTL = 30 * time.Minute
	// DefaultAcquireBlock controls how long acquire requests block before timing out.
	DefaultAcquireBlock = 60 * time.Second
	// DefaultSweeperInterval sets the tick frequency for idle maintenance sweeps.
	DefaultSweeperInterval = 5 * time.Minute
	// DefaultTxnReplayInterval throttles transaction replay sweeps on active operations.
	DefaultTxnReplayInterval = 5 * time.Second
	// DefaultQueueDecisionCacheTTL bounds how long empty queue decision checks are cached.
	DefaultQueueDecisionCacheTTL = 60 * time.Second
	// DefaultQueueDecisionMaxApply caps how many queue decision items are applied per dequeue attempt.
	DefaultQueueDecisionMaxApply = 50
	// DefaultQueueDecisionApplyTimeout bounds how long a dequeue spends applying queued decisions.
	DefaultQueueDecisionApplyTimeout = 2 * time.Second
	// DefaultIdleSweepGrace controls how long the server must be idle before running maintenance sweeps.
	DefaultIdleSweepGrace = 5 * time.Minute
	// DefaultIdleSweepOpDelay pauses between maintenance sweep operations to reduce backend pressure.
	DefaultIdleSweepOpDelay = 100 * time.Millisecond
	// DefaultIdleSweepMaxOps caps how many sweep operations execute per run.
	DefaultIdleSweepMaxOps = 1000
	// DefaultIdleSweepMaxRuntime caps how long a maintenance sweep run may execute.
	DefaultIdleSweepMaxRuntime = 30 * time.Second
	// DefaultDrainGrace is the grace period granted before HTTP shutdown begins.
	DefaultDrainGrace = 10 * time.Second
	// DefaultShutdownTimeout caps the total shutdown time (drain + HTTP server).
	DefaultShutdownTimeout = 10 * time.Second
	// DefaultMaxConcurrentStreams sets the default HTTP/2 MaxConcurrentStreams when not explicitly configured.
	DefaultMaxConcurrentStreams = 1024
	// DefaultLogstoreCommitMaxOps caps how many logstore entries are committed per fsync batch.
	DefaultLogstoreCommitMaxOps = 4096
	// DefaultLogstoreSegmentSize caps the size of a single logstore segment before rolling.
	DefaultLogstoreSegmentSize = int64(64 << 20)
	// DefaultLogstoreCompactionEnabled enables background logstore compaction on disk/NFS.
	DefaultLogstoreCompactionEnabled = true
	// DefaultLogstoreCompactionInterval controls how often the background compactor checks for eligible work.
	DefaultLogstoreCompactionInterval = 30 * time.Minute
	// DefaultLogstoreCompactionMinSegments requires at least this many sealed snapshot/segment files before compacting.
	DefaultLogstoreCompactionMinSegments = 2
	// DefaultLogstoreCompactionMinReclaimBytes requires at least this many bytes of reclaimable sealed data before compacting.
	DefaultLogstoreCompactionMinReclaimBytes = int64(64 << 20)
	// DefaultLogstoreCompactionDeleteGrace delays deletion of obsolete compacted files so in-flight readers can finish.
	DefaultLogstoreCompactionDeleteGrace = 15 * time.Minute
	// DefaultLogstoreCompactionMaxIOBytesPerSec throttles background compaction IO on constrained systems.
	DefaultLogstoreCompactionMaxIOBytesPerSec = int64(8 << 20)
	// DefaultQueryDocPrefetch caps the number of in-flight document fetches for query return=documents.
	// A conservative default avoids over-saturating local disk backends; callers can raise this explicitly.
	DefaultQueryDocPrefetch = 1
	// DefaultDiskLockFileCacheSize caps cached lockfile descriptors (disk/NFS).
	DefaultDiskLockFileCacheSize = 2048
	// DefaultS3MaxPartSize tunes multipart uploads when writing state to S3-compatible stores.
	DefaultS3MaxPartSize = 16 * 1024 * 1024
	// DefaultS3SmallEncryptBufferBudget caps concurrent small-object encryption buffers for S3 backends.
	DefaultS3SmallEncryptBufferBudget = 64 * 1024 * 1024
	// DefaultStorageRetryMaxAttempts describes how many transient storage errors are retried.
	DefaultStorageRetryMaxAttempts = 6
	// DefaultStorageRetryBaseDelay configures the base delay between storage retries.
	DefaultStorageRetryBaseDelay = 100 * time.Millisecond
	// DefaultStorageRetryMaxDelay caps the exponential backoff between storage retries.
	DefaultStorageRetryMaxDelay = 5 * time.Second
	// DefaultStorageRetryMultiplier defines the exponential backoff ratio.
	DefaultStorageRetryMultiplier = 2.0
	// DefaultClientBlock drives the CLI client's default acquire block duration.
	DefaultClientBlock = 10 * time.Second
	// DefaultAzureEndpoint is empty so we can derive endpoints automatically for public regions.
	DefaultAzureEndpoint = ""
	// DefaultAzureEndpointPattern expands Azure account names into their HTTPS endpoint.
	DefaultAzureEndpointPattern = "https://%s.blob.core.windows.net"
	// DefaultAzureEndpointHelp documents the Azure endpoint format in CLI help output.
	DefaultAzureEndpointHelp = "https://<account>.blob.core.windows.net"
	// DefaultConfigFileName is the config file searched for when --config is omitted.
	DefaultConfigFileName = "config.yaml"
	// DefaultServerBundleName is the PEM bundle name emitted by lockd auth helpers.
	DefaultServerBundleName = "server.pem"
)
View Source
const (
	// DefaultQRFSoftDelay sets the base delay while the QRF is soft-armed.
	DefaultQRFSoftDelay = 50 * time.Millisecond
	// DefaultQRFEngagedDelay sets the base delay when the QRF is fully engaged.
	DefaultQRFEngagedDelay = 250 * time.Millisecond
	// DefaultQRFRecoveryDelay sets the base delay while recovering.
	DefaultQRFRecoveryDelay = 200 * time.Millisecond
	// DefaultQRFMaxWait caps how long a request will wait under QRF pacing before failing.
	DefaultQRFMaxWait = 5 * time.Second
	// DefaultStateCacheBytes caps in-memory cached state payloads for hot reads.
	DefaultStateCacheBytes = 64 << 20
	// DefaultTCFanoutTimeout bounds how long the TC waits per RM apply request.
	DefaultTCFanoutTimeout = 5 * time.Second
	// DefaultTCFanoutMaxAttempts describes how many times to retry RM apply calls.
	DefaultTCFanoutMaxAttempts = 4
	// DefaultTCFanoutBaseDelay configures the base backoff between RM apply retries.
	DefaultTCFanoutBaseDelay = 100 * time.Millisecond
	// DefaultTCFanoutMaxDelay caps RM apply retry backoff.
	DefaultTCFanoutMaxDelay = 2 * time.Second
	// DefaultTCFanoutMultiplier defines the exponential retry multiplier.
	DefaultTCFanoutMultiplier = 2.0
	// DefaultTCDecisionRetention retains decided txn records for recovery/fan-out.
	DefaultTCDecisionRetention = 24 * time.Hour
	// DefaultQRFRecoverySamples controls how many consecutive healthy samples are required before disengaging.
	DefaultQRFRecoverySamples = 5
	// DefaultQRFMemorySoftLimitPercent applies a soft guardrail when overall memory usage crosses this percentage.
	DefaultQRFMemorySoftLimitPercent = 75.0
	// DefaultQRFMemoryHardLimitPercent applies a hard guardrail when overall memory usage crosses this percentage.
	DefaultQRFMemoryHardLimitPercent = 85.0
	// DefaultQRFMemoryStrictHeadroomPercent discounts this much usage when reclaimable cache is unknown.
	DefaultQRFMemoryStrictHeadroomPercent = 15.0
	// DefaultQRFMemorySoftLimitBytes is disabled by default; set explicitly to enforce a process RSS cap.
	DefaultQRFMemorySoftLimitBytes = 0
	// DefaultQRFMemoryHardLimitBytes is disabled by default; set explicitly to enforce a hard process RSS cap.
	DefaultQRFMemoryHardLimitBytes = 0
	// DefaultQRFSwapSoftLimitPercent disables swap-based QRF by default.
	DefaultQRFSwapSoftLimitPercent = 0.0
	// DefaultQRFSwapHardLimitPercent disables swap-based QRF by default.
	DefaultQRFSwapHardLimitPercent = 0.0
	// DefaultQRFCPUPercentSoftLimit applies a soft guardrail when CPU utilisation crosses this percentage.
	DefaultQRFCPUPercentSoftLimit = 70.0
	// DefaultQRFCPUPercentHardLimit applies a hard guardrail when CPU utilisation crosses this percentage.
	DefaultQRFCPUPercentHardLimit = 85.0
	// DefaultQRFLoadSoftLimitMultiplier is the baseline load-average multiplier that soft-arms the QRF.
	DefaultQRFLoadSoftLimitMultiplier = 4.0
	// DefaultQRFLoadHardLimitMultiplier is the load-average multiplier that fully engages the QRF.
	DefaultQRFLoadHardLimitMultiplier = 8.0
	// DefaultQRFQueueConsumerSoftLimitRatio controls the default soft ceiling for concurrent queue consumers.
	DefaultQRFQueueConsumerSoftLimitRatio = 0.75
	// DefaultConnguardFailureThreshold is the number of suspicious connection events required before hard-blocking an IP.
	DefaultConnguardFailureThreshold = 5
	// DefaultConnguardFailureWindow is the rolling window for suspicious connect attempts.
	DefaultConnguardFailureWindow = 30 * time.Second
	// DefaultConnguardBlockDuration controls how long an IP remains blocked.
	DefaultConnguardBlockDuration = 5 * time.Minute
	// DefaultConnguardProbeTimeout bounds the wait for early classification on plain TCP connections.
	DefaultConnguardProbeTimeout = 250 * time.Millisecond
	// DefaultLSFSampleInterval configures how frequently the LSF observer samples system metrics.
	DefaultLSFSampleInterval = 200 * time.Millisecond
	// DefaultLSFLogInterval controls how often the LSF emits lockd.lsf.sample telemetry logs.
	DefaultLSFLogInterval = 15 * time.Second

	// DefaultIndexerFlushDocs determines how many documents trigger a flush.
	DefaultIndexerFlushDocs = 2000
	// DefaultIndexerFlushInterval bounds how long a memtable buffers before flushing.
	DefaultIndexerFlushInterval = 30 * time.Second
)
View Source
const (
	// DefaultNamespace applies when callers omit a namespace.
	DefaultNamespace = namespaces.Default
)

Variables

This section is empty.

Functions

func BuildAzureConfig

func BuildAzureConfig(cfg Config) (azurestore.Config, error)

BuildAzureConfig derives the Azure backend configuration.

func CreateCABundle added in v0.7.0

func CreateCABundle(req CreateCABundleRequest) ([]byte, error)

CreateCABundle generates a CA PEM bundle containing CA cert+key and kryptograf metadata material.

func CreateCABundleFile added in v0.7.0

func CreateCABundleFile(req CreateCABundleFileRequest) error

CreateCABundleFile writes a generated CA bundle to path. Parent directories are created with mode 0700 and the output file uses mode 0600.

func CreateClientBundle added in v0.7.0

func CreateClientBundle(req CreateClientBundleRequest) ([]byte, error)

CreateClientBundle generates an SDK client PEM bundle signed by the supplied CA bundle.

func CreateClientBundleFile added in v0.7.0

func CreateClientBundleFile(req CreateClientBundleFileRequest) error

CreateClientBundleFile writes a generated SDK client bundle to path. Parent directories are created with mode 0700 and the output file uses mode 0600.

func CreateServerBundle added in v0.7.0

func CreateServerBundle(req CreateServerBundleRequest) ([]byte, error)

CreateServerBundle generates a server PEM bundle signed by the supplied CA bundle.

func CreateServerBundleFile added in v0.7.0

func CreateServerBundleFile(req CreateServerBundleFileRequest) error

CreateServerBundleFile writes a generated server bundle to path. Parent directories are created with mode 0700 and the output file uses mode 0600.

func CreateTCClientBundle added in v0.7.0

func CreateTCClientBundle(req CreateTCClientBundleRequest) ([]byte, error)

CreateTCClientBundle generates a TC client PEM bundle signed by the supplied CA bundle.

func CreateTCClientBundleFile added in v0.7.0

func CreateTCClientBundleFile(req CreateTCClientBundleFileRequest) error

CreateTCClientBundleFile writes a generated TC client bundle to path. Parent directories are created with mode 0700 and the output file uses mode 0600.

func DefaultBundlePath

func DefaultBundlePath() (string, error)

DefaultBundlePath returns the default server bundle location.

func DefaultCAPath

func DefaultCAPath() (string, error)

DefaultCAPath returns the default CA bundle location.

func DefaultConfigDir

func DefaultConfigDir() (string, error)

DefaultConfigDir returns the default configuration directory ($HOME/.lockd).

func DefaultQueueMaxConsumers

func DefaultQueueMaxConsumers() int

DefaultQueueMaxConsumers returns an adaptive per-server consumer ceiling derived from CPU count.

func DefaultTCTrustDir added in v0.1.0

func DefaultTCTrustDir() (string, error)

DefaultTCTrustDir returns the default directory holding trusted TC CA certificates.

func NewTestingLogger

func NewTestingLogger(t testing.TB, level pslog.Level) pslog.Logger

NewTestingLogger creates a pslog logger that writes through testing.TB.

func NormalizeNamespace added in v0.1.0

func NormalizeNamespace(ns, fallback string) (string, error)

NormalizeNamespace delegates to namespaces.Normalize for config-level usage.

func OpenBackend added in v0.3.0

func OpenBackend(cfg Config, crypto *storage.Crypto) (storage.Backend, error)

OpenBackend constructs a storage backend from the supplied config and crypto. Intended for server-side tooling; callers must Close() the returned backend.

func ResolveClientBundlePath added in v0.1.0

func ResolveClientBundlePath(role ClientBundleRole, explicitPath string) (string, error)

ResolveClientBundlePath resolves or validates the client bundle path for a role. When explicitPath is empty, it auto-discovers bundle files under the default config dir.

func ResolveClientBundlePathWithHint added in v0.7.0

func ResolveClientBundlePathWithHint(role ClientBundleRole, explicitPath, hint string) (string, error)

ResolveClientBundlePathWithHint resolves or validates the client bundle path for a role. This variant allows callers to override the CLI hint shown in error messages.

func SPIFFEURIForRole added in v0.1.0

func SPIFFEURIForRole(role ClientBundleRole, name string) (*url.URL, error)

SPIFFEURIForRole builds the default SPIFFE URI for a role and common name.

func SPIFFEURIForServer added in v0.1.0

func SPIFFEURIForServer(nodeID string) (*url.URL, error)

SPIFFEURIForServer builds the SPIFFE URI for a server node identity.

func ValidJSONUtils

func ValidJSONUtils() []string

ValidJSONUtils returns the supported jsonutil implementations.

Types

type AWSConfigResult added in v0.3.0

type AWSConfigResult struct {
	Config      awsstore.Config
	Credentials CredentialSummary
}

AWSConfigResult captures AWS configuration and selected credentials.

func BuildAWSConfig

func BuildAWSConfig(cfg Config) (AWSConfigResult, error)

BuildAWSConfig parses aws:// URLs that target AWS S3 with regional configuration.

type ChaosConfig

type ChaosConfig struct {
	// Seed controls the pseudo-random source. When zero, time.Now is used.
	Seed int64

	// MinDelay and MaxDelay bound per-chunk latency. When both zero no delay is added.
	MinDelay time.Duration
	MaxDelay time.Duration

	// DropProbability skips forwarding a chunk (0.0-1.0).
	DropProbability float64

	// ResetProbability closes both connections abruptly (0.0-1.0) evaluated per chunk.
	ResetProbability float64

	// DisconnectAfter closes the downstream connection after the specified duration (0 disables).
	DisconnectAfter time.Duration

	// BandwidthBytesPerSecond throttles throughput when >0.
	BandwidthBytesPerSecond int64

	// ChunkSize controls read/write batch size. Defaults to 32k if <=0.
	ChunkSize int

	// MaxDisconnects limits how many times DisconnectAfter is applied across connections (0 = unlimited).
	MaxDisconnects int
}

ChaosConfig describes network perturbations applied by the chaos proxy.

type ClientBundleRole added in v0.1.0

type ClientBundleRole int

ClientBundleRole identifies which client bundle role to resolve.

const (
	// ClientBundleRoleSDK resolves SDK client bundles.
	ClientBundleRoleSDK ClientBundleRole = iota
	// ClientBundleRoleTC resolves transaction coordinator client bundles.
	ClientBundleRoleTC
)

func (ClientBundleRole) String added in v0.1.0

func (r ClientBundleRole) String() string

type CloseOption added in v0.1.0

type CloseOption func(*closeOptions)

CloseOption customises server shutdown semantics.

func WithDrainLeases added in v0.1.0

func WithDrainLeases(grace time.Duration) CloseOption

WithDrainLeases configures the shutdown grace period used to let existing lease holders flush state. Passing a negative duration disables draining.

func WithDrainLeasesPolicy added in v0.1.0

func WithDrainLeasesPolicy(policy DrainLeasesPolicy) CloseOption

WithDrainLeasesPolicy applies a full drain policy to shutdown calls.

func WithShutdownTimeout added in v0.1.0

func WithShutdownTimeout(d time.Duration) CloseOption

WithShutdownTimeout caps the total time allowed for drain plus HTTP shutdown. Zero disables the explicit cap and relies solely on the provided context.

type Config

type Config struct {
	// Listen is the server bind address (for example ":9341").
	Listen string
	// ListenProto selects listener type (for example "tcp").
	ListenProto string
	// MetricsListen is the metrics endpoint bind address; empty disables metrics.
	MetricsListen string
	// MetricsListenSet reports whether MetricsListen was explicitly set by caller/flags/env.
	MetricsListenSet bool
	// PprofListen is the pprof endpoint bind address; empty disables pprof.
	PprofListen string
	// PprofListenSet reports whether PprofListen was explicitly set by caller/flags/env.
	PprofListenSet bool
	// EnableProfilingMetrics enables runtime profiling metrics on the metrics endpoint.
	EnableProfilingMetrics bool
	// EnableProfilingMetricsSet reports whether profiling metrics toggle was explicitly set.
	EnableProfilingMetricsSet bool
	// Store is the backend DSN (for example mem://, disk://..., s3://..., azure://...).
	Store string
	// HAMode controls cluster coordination strategy ("concurrent", "failover", "single", or "auto").
	HAMode string
	// HALeaseTTL controls leader-lease lifetime in failover mode and heartbeat cadence in auto mode.
	HALeaseTTL time.Duration
	// HASinglePresenceTTL controls how long single-mode presence records remain
	// live on backends that require .ha advertisement.
	HASinglePresenceTTL time.Duration
	// DefaultNamespace is used when requests omit namespace.
	DefaultNamespace string
	// JSONMaxBytes caps incoming JSON payload size.
	JSONMaxBytes int64
	// AttachmentMaxBytes caps attachment payload size.
	AttachmentMaxBytes int64
	// JSONUtil selects JSON implementation (lockd/jsonv2/stdlib).
	JSONUtil string
	// SpoolMemoryThreshold controls memory buffering before payload spill-to-disk.
	SpoolMemoryThreshold int64
	// DiskRetention controls retention for disk-backed transient files/log fragments.
	DiskRetention time.Duration
	// DiskJanitorInterval controls how often disk retention janitor runs.
	DiskJanitorInterval time.Duration
	// DiskQueueWatch enables native filesystem queue-watch acceleration on disk backends.
	DiskQueueWatch bool
	// DiskLockFileCacheSize caps cached lock-file descriptors for disk/NFS locking.
	DiskLockFileCacheSize int
	// LogstoreCommitMaxOps caps logstore entries committed per fsync batch.
	LogstoreCommitMaxOps int
	// LogstoreSegmentSize caps logstore segment size before rolling.
	LogstoreSegmentSize int64
	// LogstoreCompactionEnabled enables background logstore compaction on disk/NFS.
	LogstoreCompactionEnabled bool
	// LogstoreCompactionEnabledSet reports whether LogstoreCompactionEnabled was explicitly set.
	LogstoreCompactionEnabledSet bool
	// LogstoreCompactionInterval controls how often the background compactor checks for eligible work.
	LogstoreCompactionInterval time.Duration
	// LogstoreCompactionMinSegments requires at least this many sealed snapshot/segment files before compacting.
	LogstoreCompactionMinSegments int
	// LogstoreCompactionMinReclaimBytes requires at least this many bytes of reclaimable sealed data before compacting.
	LogstoreCompactionMinReclaimBytes int64
	// LogstoreCompactionDeleteGrace delays deletion of obsolete compacted files after cutover.
	LogstoreCompactionDeleteGrace time.Duration
	// LogstoreCompactionMaxIOBytesPerSec throttles background compaction IO (0 uses the default throttle).
	LogstoreCompactionMaxIOBytesPerSec int64
	// DisableLogstoreCompactionThrottling disables background compaction IO throttling entirely.
	DisableLogstoreCompactionThrottling bool
	// DisableMemQueueWatch disables in-memory queue watch hints (poll-only fallback).
	DisableMemQueueWatch bool
	// DefaultTTL is the default lease TTL for acquire/dequeue operations.
	DefaultTTL time.Duration
	// MaxTTL is the maximum allowed lease TTL.
	MaxTTL time.Duration
	// AcquireBlock is the default acquire/dequeue blocking window.
	AcquireBlock time.Duration
	// SweeperInterval controls maintenance sweep cadence.
	SweeperInterval time.Duration
	// TxnReplayInterval controls how often transaction replay scans run.
	TxnReplayInterval time.Duration
	// QueueDecisionCacheTTL controls empty decision-cache TTL for queue decision checks.
	QueueDecisionCacheTTL time.Duration
	// QueueDecisionMaxApply caps decision records applied per dequeue attempt.
	QueueDecisionMaxApply int
	// QueueDecisionApplyTimeout caps dequeue time spent applying queued decisions.
	QueueDecisionApplyTimeout time.Duration
	// StateCacheBytes caps in-memory state cache size (0 uses default, negative disables).
	StateCacheBytes int64
	// StateCacheBytesSet reports whether StateCacheBytes was explicitly set.
	StateCacheBytesSet bool
	// IdleSweepGrace is required idle time before maintenance sweeps begin.
	IdleSweepGrace time.Duration
	// IdleSweepOpDelay inserts pacing delay between maintenance operations.
	IdleSweepOpDelay time.Duration
	// IdleSweepMaxOps caps maintenance operations per sweep pass.
	IdleSweepMaxOps int
	// IdleSweepMaxRuntime caps wall-clock duration of a sweep pass.
	IdleSweepMaxRuntime time.Duration
	// DrainGrace is pre-shutdown lease-drain grace period.
	DrainGrace time.Duration
	// DrainGraceSet reports whether DrainGrace was explicitly set.
	DrainGraceSet bool
	// ShutdownTimeout caps total graceful shutdown duration (drain + HTTP shutdown).
	ShutdownTimeout time.Duration
	// ShutdownTimeoutSet reports whether ShutdownTimeout was explicitly set.
	ShutdownTimeoutSet bool
	// OTLPEndpoint enables OTLP export to the given collector endpoint.
	OTLPEndpoint string
	// DisableHTTPTracing disables OpenTelemetry spans for HTTP handlers.
	DisableHTTPTracing bool
	// DisableStorageTracing disables OpenTelemetry spans in storage backends.
	DisableStorageTracing bool

	// DisableMTLS disables mutual TLS on the public HTTP server.
	DisableMTLS bool
	// BundlePath points to server PEM bundle (CA + server cert + key + metadata material).
	BundlePath string
	// BundlePathDisableExpansion disables env/tilde expansion for BundlePath.
	BundlePathDisableExpansion bool
	// BundlePEM provides server bundle bytes directly (takes precedence when non-empty).
	BundlePEM []byte
	// DenylistPath points to serial denylist file merged with bundle denylist entries.
	DenylistPath string

	// HTTP2MaxConcurrentStreams sets HTTP/2 MaxConcurrentStreams; 0 uses default.
	HTTP2MaxConcurrentStreams int
	// HTTP2MaxConcurrentStreamsSet reports whether HTTP2MaxConcurrentStreams was explicitly set.
	HTTP2MaxConcurrentStreamsSet bool

	// TCTrustDir is directory of trusted CA certs for TC federation calls.
	TCTrustDir string
	// TCDisableAuth disables TC peer/client auth checks (testing/isolated setups only).
	TCDisableAuth bool
	// TCAllowDefaultCA allows trust fallback to local default CA material when explicit trust is absent.
	TCAllowDefaultCA bool

	// SelfEndpoint is this node's externally reachable endpoint for TC federation.
	SelfEndpoint string
	// TCJoinEndpoints is optional seed endpoint list used for initial TC peer discovery.
	TCJoinEndpoints []string

	// TCFanoutTimeout caps each remote apply attempt during TC fan-out.
	TCFanoutTimeout time.Duration
	// TCFanoutMaxAttempts caps retry attempts for TC fan-out calls.
	TCFanoutMaxAttempts int
	// TCFanoutBaseDelay is exponential backoff base for TC fan-out retries.
	TCFanoutBaseDelay time.Duration
	// TCFanoutMaxDelay caps TC fan-out backoff.
	TCFanoutMaxDelay time.Duration
	// TCFanoutMultiplier is exponential growth factor for TC fan-out retries.
	TCFanoutMultiplier float64
	// TCDecisionRetention keeps decided transaction records for replay/fan-out recovery.
	TCDecisionRetention time.Duration
	// TCClientBundlePath points to TC client bundle used for mTLS fan-out calls.
	TCClientBundlePath string

	// DisableStorageEncryption disables kryptograf envelope encryption at rest.
	DisableStorageEncryption bool
	// StorageEncryptionSnappy enables Snappy compression before encryption.
	StorageEncryptionSnappy bool
	// MetadataRootKey is kryptograf root key used to derive metadata/object keys.
	MetadataRootKey keymgmt.RootKey
	// MetadataDescriptor is kryptograf descriptor used for metadata encryption context.
	MetadataDescriptor keymgmt.Descriptor
	// MetadataContext is CA-derived context identifier used for encryption material lookup.
	MetadataContext string
	// DisableKryptoPool disables pooled crypto buffers (diagnostic mode).
	DisableKryptoPool bool

	// S3SSE controls server-side encryption mode for S3 writes (for example AES256/KMS).
	S3SSE string
	// S3KMSKeyID is KMS key identifier for S3 SSE-KMS mode.
	S3KMSKeyID string
	// AWSKMSKeyID is AWS KMS key identifier used by lockd's envelope crypto integrations.
	AWSKMSKeyID string
	// S3MaxPartSize controls multipart upload part size.
	S3MaxPartSize int64
	// S3SmallEncryptBufferBudget caps concurrent small-object encryption buffers.
	S3SmallEncryptBufferBudget int64
	// AWSRegion sets AWS region for aws:// and related integrations.
	AWSRegion string
	// S3AccessKeyID sets static S3 access key credential.
	S3AccessKeyID string
	// S3SecretAccessKey sets static S3 secret credential.
	S3SecretAccessKey string
	// S3SessionToken sets optional session token for temporary S3 credentials.
	S3SessionToken string

	// AzureAccount is the Azure storage account name.
	AzureAccount string
	// AzureAccountKey is the shared-key credential for Azure Blob.
	AzureAccountKey string
	// AzureEndpoint overrides Azure Blob endpoint URL.
	AzureEndpoint string
	// AzureSASToken configures SAS-token auth for Azure Blob.
	AzureSASToken string

	// StorageRetryMaxAttempts caps transient backend retry attempts.
	StorageRetryMaxAttempts int
	// StorageRetryBaseDelay is exponential retry base delay for backend operations.
	StorageRetryBaseDelay time.Duration
	// StorageRetryMaxDelay caps backend retry backoff.
	StorageRetryMaxDelay time.Duration
	// StorageRetryMultiplier is exponential growth factor for backend retries.
	StorageRetryMultiplier float64

	// QueueMaxConsumers caps concurrent queue consumer workers per server.
	QueueMaxConsumers int
	// QueuePollInterval controls queue poll cadence when no watch hint exists.
	QueuePollInterval time.Duration
	// QueuePollJitter adds random delay to queue polling intervals.
	QueuePollJitter time.Duration
	// QueueResilientPollInterval is fallback full-poll cadence to recover missed events.
	QueueResilientPollInterval time.Duration
	// QueueListPageSize caps queue list page size per poll request.
	QueueListPageSize int

	// IndexerFlushDocs flushes in-memory index docs after this many buffered docs.
	IndexerFlushDocs int
	// IndexerFlushInterval flushes in-memory index docs after this wall-clock interval.
	IndexerFlushInterval time.Duration
	// IndexerFlushDocsSet reports whether IndexerFlushDocs was explicitly set.
	IndexerFlushDocsSet bool
	// IndexerFlushIntervalSet reports whether IndexerFlushInterval was explicitly set.
	IndexerFlushIntervalSet bool

	// QueryDocPrefetch caps concurrent state fetches for query return=documents.
	QueryDocPrefetch int

	// QRFDisabled disables Quick Reaction Force request shaping.
	QRFDisabled bool
	// QRFQueueSoftLimit soft-arms QRF when in-flight queue leases exceed this count.
	QRFQueueSoftLimit int64
	// QRFQueueHardLimit fully engages QRF when in-flight queue leases exceed this count.
	QRFQueueHardLimit int64
	// QRFQueueConsumerSoftLimit soft-arms QRF when active queue consumers exceed this count.
	QRFQueueConsumerSoftLimit int64
	// QRFQueueConsumerHardLimit fully engages QRF when active queue consumers exceed this count.
	QRFQueueConsumerHardLimit int64
	// QRFLockSoftLimit soft-arms QRF when in-flight lock leases exceed this count.
	QRFLockSoftLimit int64
	// QRFLockHardLimit fully engages QRF when in-flight lock leases exceed this count.
	QRFLockHardLimit int64
	// QRFQuerySoftLimit soft-arms QRF when concurrent query load exceeds this count.
	QRFQuerySoftLimit int64
	// QRFQueryHardLimit fully engages QRF when concurrent query load exceeds this count.
	QRFQueryHardLimit int64
	// QRFMemorySoftLimitBytes soft-arms QRF when process memory exceeds this absolute byte threshold.
	QRFMemorySoftLimitBytes uint64
	// QRFMemoryHardLimitBytes fully engages QRF when process memory exceeds this absolute byte threshold.
	QRFMemoryHardLimitBytes uint64
	// QRFMemorySoftLimitPercent soft-arms QRF when process memory exceeds this percentage.
	QRFMemorySoftLimitPercent float64
	// QRFMemoryHardLimitPercent fully engages QRF when process memory exceeds this percentage.
	QRFMemoryHardLimitPercent float64
	// QRFMemoryStrictHeadroomPercent reserves this headroom when reclaimable cache is uncertain.
	QRFMemoryStrictHeadroomPercent float64
	// QRFSwapSoftLimitBytes soft-arms QRF when swap usage exceeds this absolute byte threshold.
	QRFSwapSoftLimitBytes uint64
	// QRFSwapHardLimitBytes fully engages QRF when swap usage exceeds this absolute byte threshold.
	QRFSwapHardLimitBytes uint64
	// QRFSwapSoftLimitPercent soft-arms QRF when swap usage exceeds this percentage.
	QRFSwapSoftLimitPercent float64
	// QRFSwapHardLimitPercent fully engages QRF when swap usage exceeds this percentage.
	QRFSwapHardLimitPercent float64
	// QRFCPUPercentSoftLimit soft-arms QRF when CPU utilization exceeds this percentage.
	QRFCPUPercentSoftLimit float64
	// QRFCPUPercentHardLimit fully engages QRF when CPU utilization exceeds this percentage.
	QRFCPUPercentHardLimit float64
	// QRFCPUPercentSoftLimitSet reports whether QRFCPUPercentSoftLimit was explicitly set.
	QRFCPUPercentSoftLimitSet bool
	// QRFCPUPercentHardLimitSet reports whether QRFCPUPercentHardLimit was explicitly set.
	QRFCPUPercentHardLimitSet bool
	// QRFLoadSoftLimitMultiplier soft-arms QRF when load average exceeds this CPU-multiplier.
	QRFLoadSoftLimitMultiplier float64
	// QRFLoadHardLimitMultiplier fully engages QRF when load average exceeds this CPU-multiplier.
	QRFLoadHardLimitMultiplier float64
	// QRFRecoverySamples is number of healthy samples required to disengage/recover.
	QRFRecoverySamples int
	// QRFSoftDelay is per-request pacing delay while soft-armed.
	QRFSoftDelay time.Duration
	// QRFEngagedDelay is per-request pacing delay while engaged.
	QRFEngagedDelay time.Duration
	// QRFRecoveryDelay is per-request pacing delay while recovering.
	QRFRecoveryDelay time.Duration
	// QRFMaxWait caps total time a request may wait under QRF pacing.
	QRFMaxWait time.Duration

	// LSFSampleInterval controls Local Security Force sample cadence.
	LSFSampleInterval time.Duration
	// LSFLogInterval controls cadence of lockd.lsf.sample telemetry logs.
	LSFLogInterval time.Duration
	// LSFLogIntervalSet reports whether LSFLogInterval was explicitly set.
	LSFLogIntervalSet bool

	// ConnguardEnabled enables suspicious-connection protection in the TCP listener path.
	// This setting is unsupported for listen-proto=unix.
	ConnguardEnabled bool
	// ConnguardEnabledSet reports whether ConnguardEnabled was explicitly set.
	ConnguardEnabledSet bool
	// ConnguardFailureThreshold controls how many suspicious connection events trigger a hard block.
	ConnguardFailureThreshold int
	// ConnguardFailureWindow is the rolling window used to count suspicious connection events.
	ConnguardFailureWindow time.Duration
	// ConnguardBlockDuration controls how long a suspicious source IP is blocked.
	ConnguardBlockDuration time.Duration
	// ConnguardProbeTimeout controls how long plain TCP connections are probed before classification.
	ConnguardProbeTimeout time.Duration
}

Config captures the tunables for a lockd.Server instance.

func (Config) MTLSEnabled added in v0.1.0

func (c Config) MTLSEnabled() bool

MTLSEnabled reports whether mutual TLS is active.

func (Config) StorageEncryptionEnabled

func (c Config) StorageEncryptionEnabled() bool

StorageEncryptionEnabled reports whether kryptograf envelope encryption is active.

func (*Config) Validate

func (c *Config) Validate() error

Validate applies defaults and sanity-checks the configuration.

type CreateCABundleFileRequest added in v0.7.0

type CreateCABundleFileRequest struct {
	// Path is the destination PEM file path. This field is required.
	Path string
	// Force controls overwrite behavior.
	// When false, writing fails if Path already exists.
	Force bool
	// CreateCABundleRequest configures CA generation.
	CreateCABundleRequest
}

CreateCABundleFileRequest controls CA bundle generation + file write.

type CreateCABundleRequest added in v0.7.0

type CreateCABundleRequest struct {
	// CommonName sets the CA certificate subject CN.
	// When empty, the default is "lockd-ca".
	CommonName string
	// ValidFor sets CA certificate validity duration.
	// When <= 0, the default is 10 years (10 * 365 * 24h).
	ValidFor time.Duration
}

CreateCABundleRequest controls CA bundle generation.

type CreateClientBundleFileRequest added in v0.7.0

type CreateClientBundleFileRequest struct {
	// Path is the destination PEM file path. This field is required.
	Path string
	// Force controls overwrite behavior.
	// When false, writing fails if Path already exists.
	Force bool
	// CreateClientBundleRequest configures client bundle generation.
	CreateClientBundleRequest
}

CreateClientBundleFileRequest controls SDK client bundle generation + file write.

type CreateClientBundleRequest added in v0.7.0

type CreateClientBundleRequest struct {
	// CABundlePEM is the CA bundle content (CA cert + CA key + kryptograf metadata).
	// This field is required.
	CABundlePEM []byte
	// CommonName sets the client certificate subject CN.
	// When empty, the default is "lockd-client".
	CommonName string
	// ValidFor sets client certificate validity duration.
	// When <= 0, the default is 1 year (365 * 24h).
	ValidFor time.Duration
	// NamespaceClaims defines namespace ACL claims in CLI-compatible format.
	// Accepted formats are "namespace" (defaults to rw) or "namespace=perm",
	// and each entry may contain comma-separated values.
	// Perm values are r, w, rw.
	// If no explicit claims/flags are provided, default namespace rw is added.
	NamespaceClaims []string
	// ReadAll adds the ALL=r claim (alias of CLI --read-all).
	ReadAll bool
	// WriteAll adds the ALL=w claim (alias of CLI --write-all).
	WriteAll bool
	// ReadWriteAll adds the ALL=rw claim (alias of CLI --rw-all).
	ReadWriteAll bool
}

CreateClientBundleRequest controls SDK client bundle generation from an existing CA bundle.

type CreateServerBundleFileRequest added in v0.7.0

type CreateServerBundleFileRequest struct {
	// Path is the destination PEM file path. This field is required.
	Path string
	// Force controls overwrite behavior.
	// When false, writing fails if Path already exists.
	Force bool
	// CreateServerBundleRequest configures server bundle generation.
	CreateServerBundleRequest
}

CreateServerBundleFileRequest controls server bundle generation + file write.

type CreateServerBundleRequest added in v0.7.0

type CreateServerBundleRequest struct {
	// CABundlePEM is the CA bundle content (CA cert + CA key + kryptograf metadata).
	// This field is required.
	CABundlePEM []byte
	// CommonName sets the server certificate subject CN.
	// When empty, the default is "lockd-server".
	CommonName string
	// ValidFor sets server certificate validity duration.
	// When <= 0, the default is 1 year (365 * 24h).
	ValidFor time.Duration
	// Hosts lists DNS names/IPs for SANs. Values are trimmed.
	// When empty, a wildcard DNS SAN "*" is used.
	Hosts []string
	// NodeID controls the server SPIFFE URI identity (spiffe://lockd/server/<NodeID>).
	// When empty, a new UUIDv7 is generated.
	NodeID string
	// Denylist optionally seeds revoked client serials embedded in the server bundle.
	// Nil/empty means no revoked serials.
	Denylist []string
}

CreateServerBundleRequest controls server bundle generation from an existing CA bundle.

type CreateTCClientBundleFileRequest added in v0.7.0

type CreateTCClientBundleFileRequest struct {
	// Path is the destination PEM file path. This field is required.
	Path string
	// Force controls overwrite behavior.
	// When false, writing fails if Path already exists.
	Force bool
	// CreateTCClientBundleRequest configures TC client bundle generation.
	CreateTCClientBundleRequest
}

CreateTCClientBundleFileRequest controls TC client bundle generation + file write.

type CreateTCClientBundleRequest added in v0.7.0

type CreateTCClientBundleRequest struct {
	// CABundlePEM is the CA bundle content (CA cert + CA key + kryptograf metadata).
	// This field is required.
	CABundlePEM []byte
	// CommonName sets the TC client certificate subject CN.
	// When empty, the default is "lockd-tc-client".
	CommonName string
	// ValidFor sets client certificate validity duration.
	// When <= 0, the default is 1 year (365 * 24h).
	ValidFor time.Duration
}

CreateTCClientBundleRequest controls TC client bundle generation from an existing CA bundle.

type CredentialSummary

type CredentialSummary struct {
	AccessKey string
	HasSecret bool
	Source    string
}

CredentialSummary describes which credentials were selected for object storage.

type DiskConfigResult added in v0.1.0

type DiskConfigResult struct {
	Config disk.Config
	Root   string
}

DiskConfigResult captures disk configuration and its root path.

func BuildDiskConfig

func BuildDiskConfig(cfg Config) (DiskConfigResult, error)

BuildDiskConfig parses disk:// URLs into a disk.Config.

type DrainLeasesPolicy added in v0.1.0

type DrainLeasesPolicy struct {
	// GracePeriod defines how long the server should keep serving requests
	// (while denying new leases) before beginning the HTTP shutdown. Zero skips
	// the grace window.
	GracePeriod time.Duration

	// ForceRelease toggles metadata rewrites that explicitly clear outstanding
	// leases when the grace period elapses. This is experimental and disabled by
	// default.
	ForceRelease bool

	// NotifyClients controls whether the server surfaces Shutdown-Imminent
	// headers while draining so clients can release proactively.
	NotifyClients bool
}

DrainLeasesPolicy describes how the server should attempt to let existing lease holders finish work before the HTTP server stops accepting new connections.

type Option

type Option func(*options)

Option configures server instances.

func WithBackend

func WithBackend(b storage.Backend) Option

WithBackend injects a pre-built backend (useful for tests).

func WithClock

func WithClock(c clock.Clock) Option

WithClock injects a custom clock implementation.

func WithDefaultCloseOptions added in v0.1.0

func WithDefaultCloseOptions(opts ...CloseOption) Option

WithDefaultCloseOptions sets the server-wide defaults applied to Close/Shutdown calls.

func WithLSFLogInterval

func WithLSFLogInterval(interval time.Duration) Option

WithLSFLogInterval overrides the cadence for lockd.lsf.sample telemetry logs; use 0 to disable logging.

func WithLogger

func WithLogger(l pslog.Logger) Option

WithLogger supplies a custom logger. Passing nil falls back to pslog.NoopLogger().

func WithMetricsListen added in v0.2.0

func WithMetricsListen(addr string) Option

WithMetricsListen overrides the metrics listener address (empty disables metrics).

func WithOTLPEndpoint

func WithOTLPEndpoint(endpoint string) Option

WithOTLPEndpoint overrides the OTLP collector endpoint used for telemetry.

func WithPprofListen added in v0.3.0

func WithPprofListen(addr string) Option

WithPprofListen overrides the pprof listener address (empty disables).

func WithProfilingMetrics added in v0.3.0

func WithProfilingMetrics(enabled bool) Option

WithProfilingMetrics toggles Go runtime profiling metrics on the metrics endpoint.

func WithTCFanoutGate added in v0.1.0

func WithTCFanoutGate(gate txncoord.FanoutGate) Option

WithTCFanoutGate injects a hook between local apply and remote fan-out (test-only).

func WithTCLeaderLeaseTTL added in v0.1.0

func WithTCLeaderLeaseTTL(ttl time.Duration) Option

WithTCLeaderLeaseTTL overrides the lease TTL used for TC leader election.

type S3ConfigResult added in v0.1.0

type S3ConfigResult struct {
	Config      s3.Config
	Credentials CredentialSummary
}

S3ConfigResult captures S3 configuration and selected credentials.

func BuildGenericS3Config

func BuildGenericS3Config(cfg Config) (S3ConfigResult, error)

BuildGenericS3Config parses s3:// URLs that target generic S3-compatible services (MinIO, etc.).

type Server

type Server struct {
	// contains filtered or unexported fields
}

Server wraps the HTTP server, storage backend, and supporting components.

func NewServer

func NewServer(cfg Config, opts ...Option) (*Server, error)

NewServer constructs a lockd server according to cfg. Example:

cfg := lockd.Config{Store: "mem://", Listen: ":9341", ListenProto: "tcp"}
srv, err := lockd.NewServer(cfg)
if err != nil {
    log.Fatal(err)
}
go srv.Start()

func (*Server) Abort added in v0.1.0

func (s *Server) Abort(ctx context.Context) error

Abort stops serving and background loops without leaving the TC cluster. Intended for tests that need to simulate abrupt server loss.

func (*Server) Close

func (s *Server) Close(opts ...CloseOption) error

Close gracefully shuts the server down using a background context.

func (*Server) ForceQRFObserve

func (s *Server) ForceQRFObserve(snapshot qrf.Snapshot)

ForceQRFObserve injects a metrics snapshot into the QRF controller. It is intended for tests that need to drive the perimeter-defence state machine deterministically.

func (*Server) Handler

func (s *Server) Handler() http.Handler

Handler returns the underlying HTTP handler so lockd can be mounted inside an existing mux when embedding the server into another program.

func (*Server) LastServeError

func (s *Server) LastServeError() error

LastServeError returns the most recent error reported by the underlying HTTP server. It is primarily useful for diagnostics; Shutdown already reports any fatal serve/shutdown errors to callers.

func (*Server) ListenerAddr

func (s *Server) ListenerAddr() net.Addr

ListenerAddr returns the bound listener address once available.

func (*Server) QRFState

func (s *Server) QRFState() qrf.State

QRFState returns the current state of the perimeter defence controller.

func (*Server) QRFStatus

func (s *Server) QRFStatus() qrf.Status

QRFStatus returns the current controller state, reason, and last snapshot.

func (*Server) Shutdown

func (s *Server) Shutdown(ctx context.Context) error

Shutdown gracefully stops the server and returns any fatal serve/shutdown error. The returned error will be nil for clean shutdowns.

func (*Server) ShutdownWithOptions added in v0.1.0

func (s *Server) ShutdownWithOptions(ctx context.Context, opts ...CloseOption) error

ShutdownWithOptions gracefully stops the server while applying custom close behaviour.

func (*Server) Start

func (s *Server) Start() error

Start begins serving requests and blocks until the server stops.

func (*Server) WaitUntilReady

func (s *Server) WaitUntilReady(ctx context.Context) error

WaitUntilReady blocks until the server listener is initialized or context ends.

type ServerHandle added in v0.1.0

type ServerHandle struct {
	Server *Server
	Stop   func(context.Context, ...CloseOption) error
}

ServerHandle wraps a running server and its shutdown hook.

func StartServer

func StartServer(ctx context.Context, cfg Config, opts ...Option) (ServerHandle, error)

StartServer starts a lockd server in a background goroutine and waits until it is ready to accept connections. It returns the running server alongside a stop function that gracefully shuts it down. Example:

cfg := lockd.Config{Store: "mem://", ListenProto: "unix", Listen: "/tmp/lockd.sock", DisableMTLS: true}
handle, err := lockd.StartServer(ctx, cfg)
if err != nil {
    log.Fatal(err)
}
defer handle.Stop(context.Background())

type TestMTLSCredentials added in v0.1.0

type TestMTLSCredentials struct {
	// contains filtered or unexported fields
}

TestMTLSCredentials captures ephemeral test-only MTLS material (server bundle + client credentials).

func NewTestMTLSCredentialsFromBundles added in v0.1.0

func NewTestMTLSCredentialsFromBundles(serverBundle, clientBundle []byte) (TestMTLSCredentials, error)

NewTestMTLSCredentialsFromBundles constructs test MTLS credentials using the provided server and client bundles.

func (TestMTLSCredentials) ClientBundle added in v0.1.0

func (c TestMTLSCredentials) ClientBundle() []byte

ClientBundle returns a copy of the PEM-encoded client bundle associated with the credentials.

func (TestMTLSCredentials) NewHTTPClient added in v0.1.0

func (c TestMTLSCredentials) NewHTTPClient() (*http.Client, error)

NewHTTPClient constructs an HTTP client configured for MTLS using the embedded client bundle.

func (TestMTLSCredentials) ServerBundle added in v0.1.0

func (c TestMTLSCredentials) ServerBundle() []byte

ServerBundle returns a copy of the PEM-encoded server bundle associated with the credentials.

func (TestMTLSCredentials) Valid added in v0.1.0

func (c TestMTLSCredentials) Valid() bool

Valid reports whether the credentials contain MTLS material.

type TestServer

type TestServer struct {
	Server   *Server
	BaseURL  string
	Listener net.Addr
	Client   *client.Client
	Config   Config
	// contains filtered or unexported fields
}

TestServer wraps a running lockd.Server with convenient handles for tests.

func NewTestServer

func NewTestServer(ctx context.Context, opts ...TestServerOption) (*TestServer, error)

NewTestServer starts a lockd server suitable for tests. Call Stop to clean up resources.

func StartTestServer

func StartTestServer(t testing.TB, opts ...TestServerOption) *TestServer

StartTestServer is a convenience wrapper that fails the test on error and registers cleanup.

func (*TestServer) Abort added in v0.1.0

func (ts *TestServer) Abort(ctx context.Context) error

Abort stops the server abruptly without leaving TC cluster membership. Intended for tests that need crash-like behaviour.

func (*TestServer) Addr

func (ts *TestServer) Addr() net.Addr

Addr returns the listener address the server is bound to.

func (*TestServer) Backend

func (ts *TestServer) Backend() storage.Backend

Backend exposes the storage backend used by the server.

func (*TestServer) NewClient

func (ts *TestServer) NewClient(opts ...client.Option) (*client.Client, error)

NewClient returns a new client configured against the test server.

func (*TestServer) NewEndpointsClient added in v0.1.0

func (ts *TestServer) NewEndpointsClient(endpoints []string, opts ...client.Option) (*client.Client, error)

NewEndpointsClient returns a client configured with explicit endpoints while inheriting the test server defaults (mTLS, timeouts, logging, etc.).

func (*TestServer) NewHTTPClient added in v0.1.0

func (ts *TestServer) NewHTTPClient() (*http.Client, error)

NewHTTPClient returns a raw HTTP client configured for the test server's MTLS settings.

func (*TestServer) Stop

func (ts *TestServer) Stop(ctx context.Context, opts ...CloseOption) error

Stop shuts down the server using the provided context.

func (*TestServer) TestMTLSCredentials added in v0.1.0

func (ts *TestServer) TestMTLSCredentials() TestMTLSCredentials

TestMTLSCredentials returns a clone of the MTLS material backing the test server (when enabled).

func (*TestServer) URL

func (ts *TestServer) URL() string

URL returns the base URL clients should use to reach the server.

type TestServerOption

type TestServerOption func(*testServerOptions)

TestServerOption customises NewTestServer/StartTestServer behaviour.

func WithTestBackend

func WithTestBackend(backend storage.Backend) TestServerOption

WithTestBackend injects a pre-built backend (shared between servers if desired).

func WithTestChaos

func WithTestChaos(cfg *ChaosConfig) TestServerOption

WithTestChaos enables an in-process chaos proxy in front of the listener. Passing nil disables chaos behaviour.

func WithTestClientOptions

func WithTestClientOptions(opts ...client.Option) TestServerOption

WithTestClientOptions appends client options used when auto-constructing the helper client.

func WithTestClock added in v0.1.0

func WithTestClock(c clock.Clock) TestServerOption

WithTestClock injects a custom clock implementation for the server.

func WithTestCloseDefaults added in v0.1.0

func WithTestCloseDefaults(opts ...CloseOption) TestServerOption

WithTestCloseDefaults overrides the shutdown CloseOptions applied to StartTestServer instances. Passing no options restores the production defaults (currently 8s drain / 10s overall).

func WithTestConfig

func WithTestConfig(cfg Config) TestServerOption

WithTestConfig provides an explicit Config to use. Missing fields will be defaulted during validation.

func WithTestConfigFunc

func WithTestConfigFunc(fn func(*Config)) TestServerOption

WithTestConfigFunc applies a mutation to the server configuration before start.

func WithTestListener

func WithTestListener(proto, address string) TestServerOption

WithTestListener overrides the listen protocol and address.

func WithTestLogger

func WithTestLogger(logger pslog.Logger) TestServerOption

WithTestLogger supplies a custom logger.

func WithTestLoggerFromTB

func WithTestLoggerFromTB(t testing.TB, level pslog.Level) TestServerOption

WithTestLoggerFromTB routes server logs to the provided testing logger at the supplied level.

func WithTestLoggerTB

func WithTestLoggerTB(t testing.TB) TestServerOption

WithTestLoggerTB uses the testing logger with Debug level.

func WithTestMTLS added in v0.1.0

func WithTestMTLS() TestServerOption

WithTestMTLS forces StartTestServer to configure mutual TLS, regardless of the environment toggle.

func WithTestMTLSCredentials added in v0.1.0

func WithTestMTLSCredentials(creds TestMTLSCredentials) TestServerOption

WithTestMTLSCredentials reuses the provided MTLS material for the test server.

func WithTestStartTimeout

func WithTestStartTimeout(d time.Duration) TestServerOption

WithTestStartTimeout overrides the wait timeout when starting the server.

func WithTestStore

func WithTestStore(store string) TestServerOption

WithTestStore sets the storage URL while still defaulting other values.

func WithTestTCFanoutGate added in v0.1.0

func WithTestTCFanoutGate(gate txncoord.FanoutGate) TestServerOption

WithTestTCFanoutGate injects a hook between local apply and remote fan-out.

func WithTestTCLeaderLeaseTTL added in v0.1.0

func WithTestTCLeaderLeaseTTL(ttl time.Duration) TestServerOption

WithTestTCLeaderLeaseTTL overrides the lease TTL used for TC leader election.

func WithTestUnixSocket

func WithTestUnixSocket(path string) TestServerOption

WithTestUnixSocket configures the server to listen on the provided unix socket path.

func WithoutTestClient

func WithoutTestClient() TestServerOption

WithoutTestClient disables automatic client creation.

func WithoutTestMTLS added in v0.1.0

func WithoutTestMTLS() TestServerOption

WithoutTestMTLS disables automatic mTLS configuration for this test server.

Directories

Path Synopsis
benchmark
Package client provides the Go SDK for talking to a lockd cluster over HTTP.
Package client provides the Go SDK for talking to a lockd cluster over HTTP.
cmd
licensegen command
lockd command
lockd-bench command
integration
internal/locktest
Package locktest provides reusable lock/lease integration test scenarios.
Package locktest provides reusable lock/lease integration test scenarios.
internal/storepath
Package storepath provides helpers for scoping integration test store URLs.
Package storepath provides helpers for scoping integration test store URLs.
internal
connguard
Package connguard provides listener-level protection for suspicious TCP/TLS connections before requests reach HTTP handlers.
Package connguard provides listener-level protection for suspicious TCP/TLS connections before requests reach HTTP handlers.
lsf
nsauth
Package nsauth parses and evaluates certificate-based namespace authorization claims encoded in URI SAN entries.
Package nsauth parses and evaluates certificate-based namespace authorization claims encoded in URI SAN entries.
qrf
storage/aws
Package aws provides the AWS S3 storage backend built on the AWS SDK v2.
Package aws provides the AWS S3 storage backend built on the AWS SDK v2.
mcp
Package mcp provides the lockd MCP facade server.
Package mcp provides the lockd MCP facade server.
admin
Package admin provides an SDK-friendly administrative surface for lockd MCP OAuth/bootstrap lifecycle operations.
Package admin provides an SDK-friendly administrative surface for lockd MCP OAuth/bootstrap lifecycle operations.
oauth
Package oauth implements local OAuth 2.1 primitives for the lockd MCP facade, including confidential client management, authorization/token handlers, and bearer-token verification.
Package oauth implements local OAuth 2.1 primitives for the lockd MCP facade, including confidential client management, authorization/token handlers, and bearer-token verification.
state
Package state provides encrypted on-disk persistence for lockd MCP OAuth configuration and client credentials.
Package state provides encrypted on-disk persistence for lockd MCP OAuth configuration and client credentials.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL