kollect

Version: v0.7.0
Published: Jun 17, 2026 License: MIT

README


Git-simple to start · platform-grade to grow

Kollect

Kubernetes knows what's running right now. Kollect turns that into a durable record your whole platform can use — a Git history you can diff, a database your portal can query, an event stream your automation can react to. Declare what matters in a few CRs (select by GVK, extract with CEL), and every sink receives the same rows, in parallel.
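
To make "select by GVK, extract with CEL" concrete, here is a minimal Go sketch of the two steps on a decoded object. It is not Kollect's implementation: the `object`, `matches`, `extractImages`, and `dig` names are hypothetical, and the extraction is plain Go standing in for a CEL expression along the lines of `object.spec.template.spec.containers.map(c, c.image)` (the variable name `object` is also an assumption).

```go
package main

import "fmt"

// object is a decoded Kubernetes object as generic nested maps (hypothetical helper type).
type object = map[string]any

// matches reports whether obj has the wanted apiVersion and kind,
// which is the essence of selecting by GVK.
func matches(obj object, apiVersion, kind string) bool {
	return obj["apiVersion"] == apiVersion && obj["kind"] == kind
}

// dig walks nested maps by key and returns nil if any step is missing.
func dig(obj object, path ...string) any {
	var cur any = obj
	for _, key := range path {
		m, ok := cur.(object)
		if !ok {
			return nil
		}
		cur = m[key]
	}
	return cur
}

// extractImages is a plain-Go stand-in for a CEL extraction expression.
// It returns the image of every container in the pod template.
func extractImages(obj object) []string {
	containers, _ := dig(obj, "spec", "template", "spec", "containers").([]any)
	var images []string
	for _, c := range containers {
		if m, ok := c.(object); ok {
			if img, ok := m["image"].(string); ok {
				images = append(images, img)
			}
		}
	}
	return images
}

func main() {
	deploy := object{
		"apiVersion": "apps/v1", "kind": "Deployment",
		"metadata": object{"namespace": "shop", "name": "cart"},
		"spec": object{"template": object{"spec": object{
			"containers": []any{object{"name": "cart", "image": "registry.example/cart:1.4.2"}},
		}}},
	}
	if matches(deploy, "apps/v1", "Deployment") {
		fmt.Println(extractImages(deploy)) // [registry.example/cart:1.4.2]
	}
}
```

In Kollect itself the selector and the expressions are declared in CRs rather than written as code; the sketch only shows the shape of the data flow.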

Start with one Git repo. Grow to a whole platform. On day one, a single pipeline gives you a Git-committed inventory — git log is your audit trail, git diff is your drift report, no scripts, no apiserver hammering. As adoption grows, nothing gets rebuilt: the same rows fan out to Postgres, Kafka, and object storage, and KollectScope keeps it multi-tenant — every team owns its inventory as configuration, not code, in its own namespace. Consumers read export data, never unbounded list/watch against the live cluster.
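
The fan-out idea, where the same rows reach every sink, can be sketched in a few lines of Go. This is an illustration under assumptions, not the operator's code: `Row`, `Sink`, `memorySink`, and `fanOut` are hypothetical names, and the in-memory sink stands in for a real Git, Postgres, or Kafka backend.

```go
package main

import (
	"fmt"
	"sync"
)

// Row is a hypothetical exported inventory row (not Kollect's real type).
type Row struct {
	Namespace, Name, Image string
}

// Sink is a hypothetical interface: anything that accepts a batch of rows.
type Sink interface {
	Name() string
	Write(rows []Row) error
}

// memorySink records what it receives; it stands in for a real backend.
type memorySink struct {
	name string
	mu   sync.Mutex
	got  []Row
}

func (s *memorySink) Name() string { return s.name }

func (s *memorySink) Write(rows []Row) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.got = append(s.got, rows...)
	return nil
}

// fanOut sends the same rows to every sink concurrently and collects
// per-sink errors, so one failing sink does not hide the others.
func fanOut(rows []Row, sinks []Sink) map[string]error {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs = make(map[string]error)
	)
	for _, s := range sinks {
		wg.Add(1)
		go func(s Sink) {
			defer wg.Done()
			if err := s.Write(rows); err != nil {
				mu.Lock()
				errs[s.Name()] = err
				mu.Unlock()
			}
		}(s)
	}
	wg.Wait()
	return errs
}

func main() {
	rows := []Row{{Namespace: "shop", Name: "cart", Image: "registry.example/cart:1.4.2"}}
	git, pg := &memorySink{name: "git"}, &memorySink{name: "postgres"}
	errs := fanOut(rows, []Sink{git, pg})
	fmt.Println(len(git.got), len(pg.got), len(errs)) // 1 1 0
}
```

Adding a third sink later means appending it to the slice; the rows and the producers stay unchanged, which is the property the paragraph above describes.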

Read the docs: konih.github.io/kollect — architecture, quick start, CR reference, ADRs, and examples. This README is the front door; the site is the map.

Pre-beta. APIs and defaults may change until the first release candidate. See the roadmap for current status.

Why Kollect?

  • Decoupled read model — consumers query a sink, not the apiserver. No RBAC blast radius, no watch-storm risk, no etcd size limits (why).
  • Event-driven, no polling — one shared informer per GVK keeps inventory current as the cluster changes (ADR-0301).
  • Schema-flexible — declare the attributes you want in a KollectProfile; no bespoke collector per resource kind.
  • Pluggable sinks, no privileged backend — the same snapshot fans out to Git, Postgres, object store, or an event stream (sink taxonomy).
  • Multi-tenant by design — KollectScope gates which teams, namespaces, and sinks each tenant may use.
  • Fleet-ready — N single-mode operators → one shared sink, partitioned by spec.cluster; no central hub tier to operate (ADR-0501).
  • Built for scale — a 10,000-row baseline validated in CI, a 100,000-row design target per cluster with export sharding, plus tunable reconcile/dispatch concurrency (performance).
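
As a rough illustration of "one shared informer per GVK" from the list above, the following Go sketch keeps a registry keyed by GVK and reuses the same informer for every subscriber. It is a simplified model with hypothetical names (`registry`, `sharedInformer`, `Subscribe`, `Notify`); a real operator would typically build on the shared informer machinery in client-go rather than on this toy type.

```go
package main

import (
	"fmt"
	"sync"
)

// GVK identifies a Kubernetes resource type by group, version, and kind.
type GVK struct{ Group, Version, Kind string }

// sharedInformer is a hypothetical stand-in for a watch-backed informer.
// It only tracks the handlers that want to hear about changes.
type sharedInformer struct {
	handlers []func(event string)
}

// registry hands out one informer per GVK, no matter how many targets ask.
type registry struct {
	mu        sync.Mutex
	informers map[GVK]*sharedInformer
}

func newRegistry() *registry {
	return &registry{informers: make(map[GVK]*sharedInformer)}
}

// Subscribe registers a handler on the informer for gvk, creating the
// informer only on first use so later targets reuse the same watch.
func (r *registry) Subscribe(gvk GVK, h func(event string)) *sharedInformer {
	r.mu.Lock()
	defer r.mu.Unlock()
	inf, ok := r.informers[gvk]
	if !ok {
		inf = &sharedInformer{}
		r.informers[gvk] = inf
	}
	inf.handlers = append(inf.handlers, h)
	return inf
}

// Notify pushes one change event to every handler registered for gvk.
func (r *registry) Notify(gvk GVK, event string) {
	r.mu.Lock()
	var handlers []func(string)
	if inf, ok := r.informers[gvk]; ok {
		handlers = append(handlers, inf.handlers...)
	}
	r.mu.Unlock()
	for _, h := range handlers {
		h(event)
	}
}

func main() {
	reg := newRegistry()
	deployments := GVK{Group: "apps", Version: "v1", Kind: "Deployment"}
	a := reg.Subscribe(deployments, func(e string) { fmt.Println("target A saw", e) })
	b := reg.Subscribe(deployments, func(e string) { fmt.Println("target B saw", e) })
	fmt.Println("shared:", a == b, "informers:", len(reg.informers))
	reg.Notify(deployments, "update shop/cart")
}
```

Two targets that select Deployments end up on one informer, so the apiserver sees a single watch for that GVK regardless of how many consumers exist.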

See it end-to-end

A real pipeline is a handful of Kubernetes resources. This is the Deployment-inventory walkthrough — collect container images from Deployments and export them to Postgres (for portals) and Git (for audit) at the same time:

flowchart LR
  Profile["<b>KollectProfile</b><br/>Deployment schema"]
  Target["<b>KollectTarget</b><br/>select Deployments"]
  Inv["<b>KollectInventory</b><br/>aggregate · debounce · export"]
  Snap["<b>KollectSnapshotSink</b>"]
  Db["<b>KollectDatabaseSink</b>"]
  Ev["<b>KollectEventSink</b>"]
  K8s[("Kubernetes API")]

  Profile --> Target
  K8s -- "informer per GVK" --> Target
  Target --> Inv
  Inv --> Snap
  Inv --> Db
  Inv --> Ev
  Snap --> SnapOut["Git · GitLab · S3 · GCS"]
  Db --> DbOut["Postgres · MongoDB"]
  Ev --> EvOut["Kafka"]

Quick start (MVP)

Spin up the full pipeline on a local kind cluster in one command (needs Docker, kind, kubectl, and Task):

git clone https://github.com/konih/kollect.git && cd kollect
task dev-up                       # build, create kind cluster, install operator + sample CRs
kubectl get kinv,ktgt,ksnap,kdb -A    # watch the pipeline come up

task dev-up builds the manager, boots a kollect-dev kind cluster, installs the operator, and applies the sample Profile → Sink → Target → Inventory pipeline. Watch the KollectInventory Ready condition, then read your sink — the live demo repo shows what the Git export looks like.

Full walkthrough — prerequisites, Helm install, maturity notes: Quick start →

How it works

Figure: Kollect operator pipeline, from the Kubernetes API through shared informers and the in-memory collect store to debounced KollectInventory exports into Git, GitLab, S3, GCS, Postgres, MongoDB, and Kafka sink projections.

The in-memory snapshot per inventory is canonical; every sink is a projection of it — no single backend is privileged (sink roles). Sinks are split into three CRD families (ADR-0414):
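
One way to read "every sink is a projection" is that each sink format is a pure function of the same snapshot, and no sink reads from another. The Go sketch below illustrates that idea only; `Snapshot`, `asFiles`, and `asEvents` are hypothetical names, and the file paths and message format are invented for the example.

```go
package main

import "fmt"

// Row is a hypothetical inventory row.
type Row struct{ Namespace, Name, Image string }

// Snapshot is the canonical in-memory state for one inventory (assumed shape).
type Snapshot struct{ Rows []Row }

// asFiles projects the snapshot into a path-to-content tree, the way a
// Git-style snapshot sink might lay it out. The paths are invented here.
func asFiles(s Snapshot) map[string]string {
	files := make(map[string]string, len(s.Rows))
	for _, r := range s.Rows {
		files[r.Namespace+"/"+r.Name+".txt"] = "image: " + r.Image + "\n"
	}
	return files
}

// asEvents projects the same snapshot into one message per row, the way
// an event sink might publish it. The message format is invented here.
func asEvents(s Snapshot) []string {
	events := make([]string, 0, len(s.Rows))
	for _, r := range s.Rows {
		events = append(events, r.Namespace+"/"+r.Name+"="+r.Image)
	}
	return events
}

func main() {
	snap := Snapshot{Rows: []Row{
		{Namespace: "shop", Name: "cart", Image: "registry.example/cart:1.4.2"},
	}}
	fmt.Print(asFiles(snap)["shop/cart.txt"])
	fmt.Println(asEvents(snap))
}
```

Because both functions take the snapshot as their only input, either projection can be rebuilt from scratch without consulting the other backend.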

| Sink family | Examples | Good for |
| --- | --- | --- |
| KollectSnapshotSink | Git, GitLab, S3, GCS | Audit, diff, GitOps-friendly history |
| KollectDatabaseSink | Postgres, MongoDB | Rich queries for portals and dashboards |
| KollectEventSink | Kafka, NATS | Change streams, downstream consumers |

Supported & planned sinks

Honest maturity tiers — see the roadmap for release timing.

| Family CRD | spec.type | Status |
| --- | --- | --- |
| KollectSnapshotSink | git | Core — production-ready |
| KollectSnapshotSink | gitlab | Core |
| KollectSnapshotSink | s3 | Core |
| KollectSnapshotSink | gcs | Beta — shipped, maturing |
| KollectDatabaseSink | postgres | Core |
| KollectDatabaseSink | mongodb | Beta |
| KollectDatabaseSink | bigquery | Beta — analytics SQL; v0.7.x hardening |
| KollectEventSink | kafka | Beta |
| KollectEventSink | nats | Beta — JetStream emitter; v0.7.x hardening |
| KollectSnapshotSink | azureblob | Planned — needs real backend (roadmap) |
| KollectSnapshotSink | Parquet on S3/GCS | Planned — layout on existing object-store sinks |

Full payloads live in the sinks; a CR's .status holds summaries only, because etcd limits object size.
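
A status summary of this kind can stay small and bounded however many rows the sink holds. The Go sketch below shows one possible shape, a row count plus an order-independent digest; the `Summary` type and its field names are assumptions for illustration, not Kollect's actual status schema.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// Row is a hypothetical inventory row.
type Row struct{ Namespace, Name, Image string }

// Summary is a small, fixed-size value of the kind that fits in a CR status.
// The field names are hypothetical.
type Summary struct {
	RowCount int
	Digest   string // hex SHA-256 over the sorted rows
}

// summarize reduces any number of rows to a count and a digest. Rows are
// sorted first so the digest does not depend on collection order.
func summarize(rows []Row) Summary {
	lines := make([]string, 0, len(rows))
	for _, r := range rows {
		lines = append(lines, r.Namespace+"/"+r.Name+"="+r.Image)
	}
	sort.Strings(lines)
	sum := sha256.Sum256([]byte(strings.Join(lines, "\n")))
	return Summary{RowCount: len(rows), Digest: hex.EncodeToString(sum[:])}
}

func main() {
	s := summarize([]Row{{Namespace: "shop", Name: "cart", Image: "registry.example/cart:1.4.2"}})
	fmt.Println(s.RowCount, len(s.Digest)) // 1 64
}
```

A consumer can compare the digest with the one it last saw to decide whether the sink is worth re-reading, without the full payload ever passing through the apiserver.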

Performance

Kollect is built for large single clusters and multi-cluster fleets, with honest, tested targets (ADR-0603) — 10,000+ rows validated in nightly load tests, 100,000-row design target per cluster, and fleet fan-in with no hub merge tier. Tuning knobs (reconcile concurrency, export debounce, sharding) are in the performance guide.
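
Export sharding is listed here as a tuning knob without further detail, so the following Go sketch shows only the generic technique: assign each row key to one of n shards with a stable hash so that shards can be exported independently. The `shardFor` and `shard` helpers and the choice of FNV-1a are assumptions; the performance guide is the authority on how Kollect actually shards.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a row key to one of n shards with a stable FNV-1a hash.
// n must be greater than zero.
func shardFor(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on an FNV hash never returns an error
	return int(h.Sum32() % uint32(n))
}

// shard groups keys into n buckets. The same key always lands in the same
// bucket, so a changed row only dirties one shard.
func shard(keys []string, n int) [][]string {
	out := make([][]string, n)
	for _, k := range keys {
		i := shardFor(k, n)
		out[i] = append(out[i], k)
	}
	return out
}

func main() {
	keys := []string{"shop/cart", "shop/checkout", "ops/prometheus", "ops/grafana"}
	for i, s := range shard(keys, 2) {
		fmt.Println("shard", i, s)
	}
}
```

With a stable assignment, a change to one Deployment rewrites a single shard rather than the whole export, which is what makes a 100,000-row target plausible to write incrementally.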

Learn more

| Topic | Link |
| --- | --- |
| Problem statement, CRD model, reconciliation | Architecture |
| Locked platform decisions | Platform decisions |
| CR fields, RBAC, failure modes | CR reference |
| Multi-cluster fleet | ADR-0501 |
| Sink taxonomy (state vs stream) | ADR-0401 |
| Build-order phases and status | Roadmap |
| Examples index | Examples |
| Example: Deployment → Git export | Walkthrough |
| Live demo inventory (Git sink) | kollect-inventory-demo |

Developers: run task lint, task test, and task verify before opening a PR — CONTRIBUTING.md.

Community

Contributing CONTRIBUTING.md — DCO, PR workflow, good first tasks
Code of Conduct CODE_OF_CONDUCT.md — Contributor Covenant v2.1
Governance GOVERNANCE.md — roles, decisions, continuity

Security

Report vulnerabilities privately — see SECURITY.md. Security architecture: docs/ASSURANCE-CASE.md.

License

Copyright (c) 2026 Konrad Heimel. Licensed under the MIT License.

Directories

Path Synopsis
api
v1alpha1
Package v1alpha1 contains API Schema definitions for the v1alpha1 API group.
internal
aggregate
Package aggregate holds cross-target rollup helpers for Phase 4 (ADR-0304).
errors
Package errors provides typed reconcile error classes (ADR-0602).
export
Package export defines the versioned inventory export data contract (ADR-0405).
pathvalidate
Package pathvalidate holds shared relative-path rules for Git and object-store export paths.
sink/cap
Package cap holds sink capability types shared by the registry and backends without import cycles.
sink/layout
Package layout projects an inventory snapshot into the readable file tree written by Git/GitLab snapshot sinks (ADR-0419).
sink/objectstore
Package objectstore holds shared helpers for Git/S3/GCS snapshot path layout (ADR-0401, ADR-0407).
sink/parquet
Package parquet encodes inventory snapshots to Parquet (ADR-0401 hybrid schema, Q11).
sink/preview
Package preview renders read-only sink implications without side effects (ADR-0416).
test
