nimbus

package module
v0.1.0 Latest
Published: Jun 21, 2026 License: Apache-2.0 Imports: 16 Imported by: 0

README

Nimbus

Cloud Run-first caching for Go. A fast in-process L1, a shared versioned L2 (the source of truth), and a Pub/Sub invalidation bus that keeps per-instance caches coherent across ephemeral, autoscaling instances.


Status: v0.1.0. L1, L2 (Redis), stampede protection, stale-while-revalidate, the cross-instance Pub/Sub bus, OpenTelemetry metrics, and a deployable Cloud Run example (Dockerfile + Terraform) are all implemented and tested; the integration suite runs against real Redis and the Pub/Sub emulator via testcontainers. Pre-1.0: minor (0.x) releases may still change the API; patch releases never do.


Why caching on Cloud Run is hard

Naive caching breaks on Cloud Run for three reasons:

  1. Instances are ephemeral and scale to zero. In-memory caches vanish on scale-down. When traffic spikes from 0 to N, every new instance starts cold and stampedes the database at the worst moment.
  2. There is no shared memory between instances. A request lands on any instance, so in-process caches diverge, and because instances are ephemeral you cannot address them individually to invalidate.
  3. Cost vs. latency is a real tradeoff. An external cache (Memorystore) gives coherence but adds cost, latency, and a VPC connector; in-memory is free but volatile.

Nimbus handles these tradeoffs so you do not re-implement them in every service.


How it works

Three tiers, one rule: L2 is the source of truth; L1 is a best-effort per-instance accelerator; the bus is a latency optimization, not the sole coherence mechanism.

flowchart LR
    R[Request] --> L1[L1 in-process LRU+TTL]
    L1 -- miss --> L2[(L2 Redis / Memorystore<br/>versioned, source of truth)]
    L2 -- miss --> SF[singleflight] --> O[(origin loader)]
    W[Set / Invalidate] --> BUS{{Pub/Sub bus}}
    BUS -.evict L1.-> I2[instance B]
    BUS -.evict L1.-> I3[instance N]

The correctness backbone is the fill invariant: no value enters L1 except stamped with a version minted by L2, decided atomically against concurrent invalidations (a CAS on write). A value loaded by an in-flight fill is discarded, not cached, if the key was invalidated while the loader ran. An instance that misses a bus broadcast still converges on its next L2 read, because L2 versions are authoritative.
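The fill invariant can be illustrated with a toy in-memory store. This is a minimal sketch, not the nimbus implementation: `versionedStore`, `CompareAndFill`, and `fill` are hypothetical names, and the assumption that a successful fill mints a new version is made only for the illustration. The point it shows is that the version is observed before the loader runs and the value is committed only if no invalidation bumped it in between.

```go
package main

import (
	"fmt"
	"sync"
)

// versionedStore is a toy stand-in for an authoritative L2: every key carries
// a version that Invalidate bumps. It is NOT the nimbus store API.
type versionedStore struct {
	mu       sync.Mutex
	versions map[string]uint64
	values   map[string]string
}

func newVersionedStore() *versionedStore {
	return &versionedStore{versions: map[string]uint64{}, values: map[string]string{}}
}

// Version returns the current version for key (0 if never touched).
func (s *versionedStore) Version(key string) uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.versions[key]
}

// Invalidate drops the value and bumps the version.
func (s *versionedStore) Invalidate(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.values, key)
	s.versions[key]++
}

// CompareAndFill stores val only if the key's version still equals seen.
// It returns the version the value was stamped with and whether it was stored.
func (s *versionedStore) CompareAndFill(key, val string, seen uint64) (uint64, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.versions[key] != seen {
		return 0, false // invalidated while the loader ran: discard
	}
	s.versions[key]++
	s.values[key] = val
	return s.versions[key], true
}

// fill observes the version, runs the loader outside the lock, then commits
// through the compare-and-swap.
func fill(s *versionedStore, key string, load func() string) (string, uint64, bool) {
	seen := s.Version(key)
	val := load()
	ver, ok := s.CompareAndFill(key, val, seen)
	return val, ver, ok
}

func main() {
	s := newVersionedStore()

	_, ver, ok := fill(s, "user:42", func() string { return "alice" })
	fmt.Println(ver, ok) // 1 true

	// An invalidation that lands while the loader runs makes the fill lose.
	_, _, ok = fill(s, "user:42", func() string {
		s.Invalidate("user:42")
		return "stale-alice"
	})
	fmt.Println(ok) // false
}
```

In this sketch only a value that wins the compare-and-swap would be eligible for L1, which is the property the paragraph above describes.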

Stores are string-keyed internally (Store[V]); the user's key type K lives only on the public Cache[K, V], so eviction-by-key is consistent across L1, L2, and the bus.


Quickstart

cache, err := nimbus.NewBuilder[string, User](loadUser).
    L1(memory.New[User]()).               // in-process accelerator
    L2(redisstore.New[User](rdb, store.JSON[User]())). // shared source of truth
    TTL(30*time.Second, 5*time.Minute).   // fresh window + stale-while-revalidate window
    Jitter(0.1).                          // desync expiries
    NegativeTTL(2 * time.Second).         // cache known-absent keys
    Build()
if err != nil {
    log.Fatal(err)
}
defer cache.Close()

// Read-through with stampede protection: concurrent misses collapse to one load.
v, err := cache.GetOrLoad(ctx, "user:42")
switch {
case errors.Is(err, nimbus.ErrNotFound): // known-absent (negative hit)
case err != nil:                           // transient failure (not cached)
default:                                    // use v
}

The loader signals a missing key by returning nimbus.ErrNotFound, which triggers negative caching. The builder names the types once; every option is a plain method.


Caching techniques

| Technique | Solves |
| --- | --- |
| Multi-tier L1/L2 | speed of in-process + coherence of a shared store |
| Singleflight | cold-start thundering herd on the origin |
| Stale-while-revalidate | tail latency and backend resilience |
| TTL + jitter | cache avalanche from synchronized expiry |
| Negative caching | DB pressure from missing-key lookups |
| Versioned fill invariant | the fill-after-invalidate stale-read race |
| Cross-instance invalidation (Pub/Sub) | coherence across ephemeral replicas |
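To make the singleflight technique concrete, here is a minimal standard-library sketch of call collapsing. Nimbus itself wraps golang.org/x/sync/singleflight (see the Directories listing); the `group`, `call`, `waiting`, and `collapsedLoads` names below are hypothetical and exist only for this illustration. The demo holds the loader open until every follower has joined, so the loader count is deterministic.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// call tracks one in-flight load and the callers waiting on it.
type call struct {
	wg      sync.WaitGroup
	val     string
	err     error
	waiters int
}

// group collapses concurrent Do calls for the same key into one execution.
type group struct {
	mu    sync.Mutex
	calls map[string]*call
}

func (g *group) Do(key string, fn func() (string, error)) (string, error) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[key]; ok {
		c.waiters++
		g.mu.Unlock()
		c.wg.Wait() // a load is already in flight: share its result
		return c.val, c.err
	}
	c := &call{}
	c.wg.Add(1)
	g.calls[key] = c
	g.mu.Unlock()

	c.val, c.err = fn() // only the first caller runs the loader

	g.mu.Lock()
	delete(g.calls, key)
	g.mu.Unlock()
	c.wg.Done()
	return c.val, c.err
}

// waiting reports how many callers are parked on key's in-flight load.
func (g *group) waiting(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.calls[key]; ok {
		return c.waiters
	}
	return 0
}

// collapsedLoads starts one leader plus `followers` concurrent callers for the
// same key and returns how many times the loader actually ran.
func collapsedLoads(followers int) int64 {
	var g group
	var loads atomic.Int64
	started := make(chan struct{})
	release := make(chan struct{})
	loader := func() (string, error) {
		loads.Add(1)
		close(started)
		<-release // hold the load open until every follower has joined
		return "alice", nil
	}

	var wg sync.WaitGroup
	run := func() {
		defer wg.Done()
		_, _ = g.Do("user:42", loader)
	}
	wg.Add(1)
	go run()
	<-started
	wg.Add(followers)
	for i := 0; i < followers; i++ {
		go run()
	}
	for g.waiting("user:42") < followers {
		runtime.Gosched()
	}
	close(release)
	wg.Wait()
	return loads.Load()
}

func main() {
	fmt.Println(collapsedLoads(8)) // 1
}
```

Nine concurrent callers produce a single origin load here, which is the behavior that protects the origin during a cold start.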

Performance

Benchmarks (go test -bench=. -benchmem). Hot paths allocate zero times per operation for string keys (and any key type whose KeyString renders without allocating). Integer keys take an allocation-light strconv path — zero allocations for small magnitudes, at most one for the rendered key string otherwise — while other key types fall back to fmt; supply KeyString to keep them zero-alloc. Numbers below are from an Apple-class CPU running the darwin/amd64 toolchain under emulation, so treat them as indicative and relative, not absolute.

| Operation | Latency | Allocations |
| --- | --- | --- |
| L1 Get (hit) | ~95 ns/op | 0 B, 0 allocs |
| L1 Set | ~47 ns/op | 0 B, 0 allocs |
| L1 Get (parallel) | ~115 ns/op | 0 B, 0 allocs |
| GetOrLoad (L1 hit) | ~181 ns/op | 0 B, 0 allocs |
| GetOrLoad (L1 hit, parallel) | ~307 ns/op | 0 B, 0 allocs |
| Get (L1 hit) | ~179 ns/op | 0 B, 0 allocs |
| Invalidation dedupe (Seen) | ~22 ns/op | 0 B, 0 allocs |
| In-process bus publish (Mem) | ~185 ns/op | 0 B, 0 allocs |

The ~85 ns gap between a raw L1 Get and GetOrLoad is the key-string mapping, freshness check, and stats accounting on the hot path. L2 (Redis) reads and the gcppubsub bus are dominated by network and Redis/Pub/Sub round-trips, not CPU, so they are exercised in the integration suite rather than micro-benchmarked.

Reproduce with:

go test -run='^$' -bench=. -benchmem ./...

Observability

The metrics adapter exports cache statistics as OpenTelemetry metrics. It ships as a separate module so the core carries no OpenTelemetry dependency — add it only if you want OTel:

go get github.com/ant-caor/nimbus/metrics

It observes Stats() through asynchronous instruments, so it adds nothing to the hot path:

reg, err := metrics.Register(meter, cache) // meter is an otel metric.Meter
defer reg.Unregister()

It reports nimbus.hits, .stale_hits, .misses, .loads, .load_errors, .negative_hits, .refreshes, .bus_evicts, .evictions, and .l1.entries.

Try it locally

demo/local/ brings up two cache instances + Redis in Docker so you can watch L1/L2 sharing on your laptop (see demo/local/README.md). For the full cross-instance bus over Pub/Sub deployed to Cloud Run via Terraform, see examples/cloudrun/ (Cloud Run + Memorystore + Pub/Sub + Direct VPC Egress).


Invalidation transports

The bus is an interface (invalidation.Bus), so the transport is pluggable. The in-process and Redis transports live in the core module; the GCP transport is a separate module (go get github.com/ant-caor/nimbus/invalidation/gcppubsub) so the core never pulls the GCP/gRPC tree:

| Transport | Package | Module | Notes |
| --- | --- | --- | --- |
| GCP Pub/Sub (pull) | invalidation/gcppubsub | separate | true fan-out; needs always-on CPU |
| GCP Pub/Sub (push) | invalidation/gcppubsub | separate | load-balanced; throttle-safe for request-only-CPU Cloud Run |
| Redis Pub/Sub | invalidation/redispubsub | core | reuses the Redis client your L2 already holds; no extra infra, no GCP dependency |
| In-process | invalidation.Mem | core | single-process fan-out for tests |

The Redis Pub/Sub bus + Redis L2 is a fully GCP-free, cloud-agnostic coherence stack: it runs on any cloud or on-prem wherever Redis is reachable.

bus := redispubsub.New(rdb, "myapp:invalidations") // rdb is your rueidis.Client
cache, err := nimbus.NewBuilder[string, User](loadUser).
    L2(redisstore.New[User](rdb, store.JSON[User]())).
    Bus(bus).
    Build()

When not to use this

  • A single long-lived instance with plenty of memory: a plain in-process cache (Ristretto, Otter) is simpler.
  • Reads that must be strongly consistent: Nimbus is eventually consistent across instances by design.
  • Near-100% write workloads: the machinery is not worth it.

Nimbus is for read-heavy, high-traffic services on autoscaling Cloud Run where cold starts and cross-instance coherence actually bite.


Alternatives / comparison

Most Go caching libraries answer a different question. Ristretto and Otter are single-instance in-process caches optimized for raw eviction throughput; groupcache does peer-to-peer replication across long-lived processes. Nimbus is built for ephemeral, autoscaling instances that need a shared, versioned source of truth and cross-instance coherence.

| | Nimbus | groupcache | Ristretto / Otter |
| --- | --- | --- | --- |
| Topology | L1 + versioned shared L2 + invalidation bus | peer-to-peer per-process replication | single-instance in-process |
| Cross-instance coherence | yes (bus eviction + L2 versions) | partial (peers fill from each other; no shared store) | none |
| Shared source of truth | yes (versioned Redis L2) | no | no |
| Invalidation / writes | Set/Invalidate/InvalidateTag + TTL + SWR | immutable values, no TTL; only coarse Remove | TTL/eviction + explicit delete, single instance |
| Stampede protection | yes (singleflight, cross-instance via L2) | yes (cluster-wide; owner peer loads) | Ristretto: no; Otter v2: loader single-flight |
| Fits scale-to-zero / autoscaling | yes (designed for it) | poor (peers are long-lived, addressed individually) | n/a (one instance) |
| Cloud coupling | GCP or cloud-agnostic (Redis L2 + Redis Pub/Sub bus) | none | none |
| Primary strength | serverless coherence across replicas | LAN peer fan-out for static-ish data | fastest single-node eviction |

Use Nimbus when you need cross-instance coherence on ephemeral, autoscaling instances (Cloud Run, Knative, autoscaled containers): cold starts stampede your origin, requests land on any replica, and you cannot address instances to invalidate them. Use groupcache when a fixed set of long-lived peers can replicate mostly-immutable values among themselves. Use Ristretto or Otter when a single instance with plenty of memory suffices and you just want the fastest in-process eviction — Nimbus deliberately does not try to out-compete them there, and its hand-written L1 sits behind the store.Store interface so a faster engine can be dropped in.


Design notes

The in-memory cache space in Go is well served by Ristretto and Otter. Nimbus does not try to out-compete them on raw eviction; its value is the serverless orchestration layer: tiering, the versioned fill invariant, cross-instance coherence, and a deployable reference. The L1 is hand-written behind the store.Store interface, so a faster engine can be dropped in later.

The core module github.com/ant-caor/nimbus depends on only rueidis and golang.org/x/sync. Provider- and OTel-specific code is isolated into its own modules — invalidation/gcppubsub (GCP), metrics (OpenTelemetry), and the testcontainers tree under test/integration/ — so a dependent never pulls those graphs unless it imports them. go list -m all on the core resolves to roughly a dozen modules rather than two hundred.

A deeper write-up of the coherence protocol lives in DESIGN.md.


Versioning and compatibility

Nimbus follows Semantic Versioning. While the project is pre-1.0, minor releases (0.x) may contain breaking changes; patch releases never do. Breaking changes are called out in CHANGELOG.md with a migration note. After 1.0, the public API of the root package follows the standard Go compatibility promise within a major version.

  • Supported Go: the latest 1.25 patch. The floor (Go 1.25) is set by a transitive rueidis requirement, not by choice.
  • Module path: github.com/ant-caor/nimbus. A future v2+ would use the /v2 import-path suffix per Go module versioning.
  • Scope of the promise: exported symbols of the root package and documented subpackages. Anything under internal/ is not part of the public API.

The release process is documented in RELEASING.md, and security reporting in SECURITY.md.


License

Apache-2.0. See LICENSE and NOTICE.

Documentation

Overview

Package nimbus is a Cloud Run-first cache for Go.

It combines a fast in-process L1, a shared versioned L2 (the source of truth), and a Pub/Sub invalidation bus that keeps per-instance caches coherent across ephemeral, autoscaling Cloud Run instances.

The correctness backbone is the fill invariant: no value enters L1 except stamped with a version minted by the authoritative L2 store, decided atomically against concurrent invalidations. The bus is a latency optimization, not the sole coherence mechanism; an instance that misses a broadcast still converges on its next L2 read. See DESIGN.md for details.

Example

Example shows the simplest use: an L1-only cache with read-through and stampede protection. The loader runs once; subsequent reads hit L1.

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/ant-caor/nimbus"
)

func main() {
	loader := func(_ context.Context, key string) (int, error) {
		return len(key), nil // pretend this is an expensive lookup
	}

	cache, err := nimbus.NewBuilder[string, int](loader).
		TTL(time.Minute, 0).
		Build()
	if err != nil {
		panic(err)
	}
	defer func() { _ = cache.Close() }()

	v, _ := cache.GetOrLoad(context.Background(), "hello")
	fmt.Println(v)
}
Output:
5

Constants

This section is empty.

Variables

var ErrClosed = errors.New("nimbus: cache closed")

ErrClosed is returned by cache operations invoked after Close.

var ErrNotFound = errors.New("nimbus: not found")

ErrNotFound is the negative-cache sentinel. A Loader returns ErrNotFound to signal that a key is known-absent, which makes nimbus store a negative entry (subject to the negative TTL). GetOrLoad returns ErrNotFound when it serves such a negative hit. It is distinct from a transient load failure, which is never cached.

Functions

This section is empty.

Types

type Builder

type Builder[K comparable, V any] struct {
	// contains filtered or unexported fields
}

Builder constructs a Cache. The user names the key and value types once, on NewBuilder; every configuration method is a plain (non-generic) method, so there is no per-option type-annotation tax.

func NewBuilder

func NewBuilder[K comparable, V any](loader Loader[K, V]) *Builder[K, V]

NewBuilder starts building a Cache around loader. Return ErrNotFound from the loader to request negative-caching of a key.

func (*Builder[K, V]) BackgroundRefresh

func (b *Builder[K, V]) BackgroundRefresh(workers int) *Builder[K, V]

BackgroundRefresh selects background revalidation with the given worker count. This requires Cloud Run always-on CPU; see RefreshBackground.

func (*Builder[K, V]) Build

func (b *Builder[K, V]) Build() (Cache[K, V], error)

Build validates the configuration and returns a ready Cache.

func (*Builder[K, V]) Bus

func (b *Builder[K, V]) Bus(bus invalidation.Bus) *Builder[K, V]

Bus sets the cross-instance invalidation bus. Defaults to a no-op bus.

func (*Builder[K, V]) Clock

func (b *Builder[K, V]) Clock(c clock.Clock) *Builder[K, V]

Clock injects a time source for deterministic tests.

func (*Builder[K, V]) Jitter

func (b *Builder[K, V]) Jitter(frac float64) *Builder[K, V]

Jitter applies +/- frac randomization to the fresh TTL to avoid synchronized expiry. frac must be in [0, 1].

func (*Builder[K, V]) KeyString

func (b *Builder[K, V]) KeyString(fn func(K) string) *Builder[K, V]

KeyString overrides how a key K is rendered to the string used by L1, L2, and the bus. The default renders string keys directly and integer keys via strconv (both allocation-light on the hot path) and falls back to fmt for any other type; supply KeyString to keep non-string, non-integer keys zero-allocation.
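The sketch below contrasts a default-style renderer with a custom one for a struct key. Both `renderKey` and `userKey` are hypothetical: `renderKey` only mirrors the documented order (strings directly, integers via strconv, fmt for everything else) and makes no claim about how nimbus implements it or how many allocations it performs, while `userKey` is the kind of function one might pass to KeyString.

```go
package main

import (
	"fmt"
	"strconv"
)

// renderKey is an illustrative default: strings pass through, a few integer
// types go through strconv, and everything else falls back to fmt.
func renderKey[K comparable](k K) string {
	switch v := any(k).(type) {
	case string:
		return v
	case int:
		return strconv.Itoa(v)
	case int64:
		return strconv.FormatInt(v, 10)
	case uint64:
		return strconv.FormatUint(v, 10)
	default:
		return fmt.Sprint(v) // generic, reflection-based fallback
	}
}

// userID is a hypothetical composite key type.
type userID struct {
	Tenant string
	ID     int64
}

// userKey renders a userID without going through fmt, producing a stable,
// readable key string.
func userKey(u userID) string {
	return u.Tenant + ":" + strconv.FormatInt(u.ID, 10)
}

func main() {
	id := userID{Tenant: "acme", ID: 42}
	fmt.Println(renderKey("user:42")) // user:42
	fmt.Println(renderKey(42))        // 42
	fmt.Println(renderKey(id))        // {acme 42}
	fmt.Println(userKey(id))          // acme:42
}
```

A custom renderer also fixes the key format explicitly, which matters because the same string is used for eviction across L1, L2, and the bus.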

func (*Builder[K, V]) L1

func (b *Builder[K, V]) L1(s store.Store[V]) *Builder[K, V]

L1 sets the in-process accelerator tier. Defaults to an in-memory LRU.

func (*Builder[K, V]) L2

func (b *Builder[K, V]) L2(s store.VersionedStore[V]) *Builder[K, V]

L2 sets the shared, authoritative tier (the source of truth).

func (*Builder[K, V]) MaxConcurrentRefresh

func (b *Builder[K, V]) MaxConcurrentRefresh(n int) *Builder[K, V]

MaxConcurrentRefresh caps how many request-bound revalidations may run at once, so a synchronized stale wave cannot fan out into an unbounded burst of loader calls against the origin. On saturation a refresh is dropped (the entry stays stale and re-triggers on the next request); stale-while-revalidate keeps serving meanwhile. n <= 0 selects a CPU-scaled default. Has no effect in background refresh mode, whose concurrency is its worker count.
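The drop-on-saturation behavior can be sketched with a buffered channel used as a counting semaphore. This is an assumption about one reasonable way to implement such a cap, not a description of nimbus internals; `refreshLimiter` and `TryRun` are hypothetical names.

```go
package main

import "fmt"

// refreshLimiter caps concurrent refreshes with a buffered channel used as a
// counting semaphore.
type refreshLimiter struct {
	slots chan struct{}
}

func newRefreshLimiter(n int) *refreshLimiter {
	return &refreshLimiter{slots: make(chan struct{}, n)}
}

// TryRun runs fn if a slot is free and reports whether it ran. When every
// slot is taken it returns false immediately instead of queueing, so the
// caller keeps serving the stale entry and retries on a later request.
func (l *refreshLimiter) TryRun(fn func()) bool {
	select {
	case l.slots <- struct{}{}:
		defer func() { <-l.slots }()
		fn()
		return true
	default:
		return false
	}
}

func main() {
	l := newRefreshLimiter(1)

	var nested bool
	ran := l.TryRun(func() {
		// The single slot is held by the outer refresh, so this one is dropped.
		nested = l.TryRun(func() {})
	})
	fmt.Println(ran, nested) // true false

	// The slot is free again once the outer refresh has returned.
	fmt.Println(l.TryRun(func() {})) // true
}
```

Dropping rather than blocking keeps the request path latency bounded, at the cost of serving a stale value slightly longer.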

func (*Builder[K, V]) MaxTTL

func (b *Builder[K, V]) MaxTTL(d time.Duration) *Builder[K, V]

MaxTTL caps the absolute lifetime of an entry so stale-while-revalidate cannot renew it indefinitely without reconciling against L2.

func (*Builder[K, V]) NegativeTTL

func (b *Builder[K, V]) NegativeTTL(d time.Duration) *Builder[K, V]

NegativeTTL sets how long a known-absent key is negatively cached. If unset, it falls back to the fresh TTL; if that is also unset, negative entries persist until evicted. Negative entries are L1-only and converge across instances via the bus or this TTL (not via an L2 read), so keep it modest.

func (*Builder[K, V]) RefreshMode

func (b *Builder[K, V]) RefreshMode(m RefreshMode) *Builder[K, V]

RefreshMode selects request-bound (default) or background revalidation.

func (*Builder[K, V]) RefreshTimeout

func (b *Builder[K, V]) RefreshTimeout(d time.Duration) *Builder[K, V]

RefreshTimeout bounds how long a single stale-while-revalidate refresh may run. Defaults to 5s.

func (*Builder[K, V]) TTL

func (b *Builder[K, V]) TTL(fresh, staleWindow time.Duration) *Builder[K, V]

TTL sets the fresh duration and the additional stale-while-revalidate window.

type Cache

type Cache[K comparable, V any] interface {
	// Get returns a cached value if present and servable (fresh or stale). It is
	// a read-only peek: it never invokes the loader, never schedules a
	// revalidation, and does not update Stats. A negative entry reports false.
	Get(ctx context.Context, key K) (V, bool, error)
	// GetOrLoad returns the value, loading it through the loader on a miss with
	// stampede protection. It returns ErrNotFound on a negative hit.
	GetOrLoad(ctx context.Context, key K) (V, error)
	// Set writes a value and broadcasts an invalidation so other instances drop
	// any stale or negative entry for the key.
	Set(ctx context.Context, key K, val V, opts ...EntryOption) error
	// Invalidate evicts a key locally and broadcasts the eviction.
	Invalidate(ctx context.Context, key K) error
	// InvalidateTag evicts every key carrying tag and broadcasts the eviction.
	InvalidateTag(ctx context.Context, tag string) error
	// Stats returns a counter snapshot.
	Stats() Stats
	// Close stops background work and the bus subscription. It does not close
	// stores or clients passed in by the caller.
	Close() error
}

Cache is a Cloud Run-first cache keyed by a user type K with values of type V.

type EntryOption

type EntryOption interface {
	// contains filtered or unexported methods
}

EntryOption customizes a single Set. It is an opaque, sealed interface: options are produced only by the With* constructors in this package, which keeps the set of options extensible without exposing the entry's internals or widening the public API surface.
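For readers unfamiliar with sealed option interfaces, the sketch below shows the general Go pattern: an interface with an unexported method can only be implemented inside its own package, so callers can obtain options only through constructors. All names here (`entryOption`, `entryConfig`, `withTags`, `buildConfig`) are hypothetical lowercase stand-ins, not the nimbus internals.

```go
package main

import "fmt"

// entryConfig is the private state that options mutate.
type entryConfig struct {
	tags []string
}

// entryOption is sealed: apply is unexported, so in a real library no type
// outside the defining package can implement it.
type entryOption interface {
	apply(*entryConfig)
}

// tagsOption is the concrete option behind withTags.
type tagsOption struct {
	tags []string
}

func (t tagsOption) apply(c *entryConfig) {
	c.tags = append(c.tags, t.tags...)
}

// withTags is the only way for a caller to obtain a tags option.
func withTags(tags ...string) entryOption {
	return tagsOption{tags: tags}
}

// buildConfig folds the options into a config, in order.
func buildConfig(opts ...entryOption) entryConfig {
	var c entryConfig
	for _, o := range opts {
		o.apply(&c)
	}
	return c
}

func main() {
	cfg := buildConfig(withTags("team:7", "plan:pro"), withTags("region:eu"))
	fmt.Println(cfg.tags) // [team:7 plan:pro region:eu]
}
```

New options can be added later as further constructors without changing the signature of the function that accepts them.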

func WithTags

func WithTags(tags ...string) EntryOption

WithTags associates the written key with one or more tags so it can later be invalidated via InvalidateTag.

type Loader

type Loader[K comparable, V any] func(ctx context.Context, key K) (V, error)

Loader fetches the authoritative value for key on a cold miss or on revalidation. Return ErrNotFound to request negative-caching of key; any other error is treated as transient and is not cached.

type RefreshMode

type RefreshMode int

RefreshMode selects how stale-while-revalidate revalidation is executed.

The distinction matters on Cloud Run: with request-only CPU allocation, background goroutines are throttled to near-zero between requests, so a detached refresh can stall. See DESIGN.md.

const (
	// RefreshRequestBound runs revalidation within the lifecycle of the request
	// that observed the stale entry, so CPU is allocated while that request is
	// in flight. It is the default and is safe under Cloud Run request-only CPU.
	RefreshRequestBound RefreshMode = iota

	// RefreshBackground runs revalidation on a long-lived worker pool off the
	// request path. It REQUIRES Cloud Run always-on CPU (instance-based
	// billing); otherwise refreshes stall between requests.
	RefreshBackground
)

func (RefreshMode) String

func (m RefreshMode) String() string

String implements fmt.Stringer.

type Stats

type Stats struct {
	Hits         uint64
	StaleHits    uint64 // served stale while a revalidation was scheduled
	Misses       uint64
	Loads        uint64
	LoadErrors   uint64
	NegativeHits uint64
	Refreshes    uint64 // stale-while-revalidate refreshes scheduled
	BusEvicts    uint64 // L1 entries evicted by cross-instance invalidation events
	L2Errors     uint64 // fills that degraded to an origin response because L2 was unreachable
	Evictions    uint64
	L1Len        int
}

Stats is a snapshot of cache counters. The counters are monotonic and read independently, so a snapshot is eventually-consistent rather than a single consistent instant (Hits and Misses may be sampled a few operations apart).
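The sketch below shows why such a snapshot is only approximately consistent, and how a derived figure such as a hit ratio might be computed from it. `counters`, `snapshot`, and `hitRatio` are hypothetical; only the Hits and Misses field names are borrowed from Stats.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// counters holds monotonic counters updated on the hot path.
type counters struct {
	hits   atomic.Uint64
	misses atomic.Uint64
}

// snapshot is a plain copy of the counters at roughly one moment.
type snapshot struct {
	Hits   uint64
	Misses uint64
}

// Snapshot reads each counter independently. No lock spans the two loads, so
// operations may land between them; that is the eventual consistency the
// Stats documentation describes.
func (c *counters) Snapshot() snapshot {
	return snapshot{Hits: c.hits.Load(), Misses: c.misses.Load()}
}

// hitRatio derives hits / (hits + misses), guarding the empty case.
func hitRatio(s snapshot) float64 {
	total := s.Hits + s.Misses
	if total == 0 {
		return 0
	}
	return float64(s.Hits) / float64(total)
}

func main() {
	var c counters
	c.hits.Add(3)
	c.misses.Add(1)
	fmt.Println(hitRatio(c.Snapshot())) // 0.75
}
```

Because the counters are monotonic, rates over a time window are best computed from the difference between two snapshots rather than from a single one.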

Directories

| Path | Synopsis |
| --- | --- |
| examples/basic (command) | Command basic is the smallest Nimbus example: an L1-only cache with stampede protection. |
| internal/clock | Package clock provides an injectable time source so TTL behavior is deterministic in tests and clock skew can be simulated per instance. |
| internal/singleflight | Package singleflight is a thin generic wrapper over golang.org/x/sync/singleflight. |
| invalidation | Package invalidation defines the cross-instance eviction bus contracts. |
| invalidation/redispubsub | Package redispubsub implements the nimbus invalidation bus over Redis Pub/Sub (via rueidis). |
| invalidation/gcppubsub (module) | |
| metrics (module) | |
| redisstore | Package redisstore is nimbus's shared, authoritative L2 tier, backed by Redis or Memorystore via rueidis. |
| refresh | Package refresh defines how stale-while-revalidate revalidation is scheduled. |
| store | Package store defines the storage contracts for nimbus tiers. |
| store/memory | Package memory is nimbus's own in-process L1 store: a sharded LRU with TTL. |
