cerberus

module
v0.0.0-...-23f8742 Latest Latest
Published: May 1, 2026 License: MIT


Three-headed query gateway: translate PromQL, LogQL, and TraceQL into native ClickHouse SQL against OpenTelemetry data.


Cerberus accepts the canonical Prometheus, Loki, and Tempo query languages and rewrites them into ClickHouse SQL targeting tables created by the OpenTelemetry Collector ClickHouse exporter. Existing Grafana dashboards, alert rules, and operator muscle memory keep working when the storage backend is ClickHouse.

What you get

  • PromQL v0.2: vector / matrix selectors, the full aggregation set (sum, avg, topk, quantile, stddev, count_values, …), rate/irate/increase/delta, *_over_time, histogram_quantile, label_replace/label_join, arithmetic + comparison + set binary operators with scalar broadcasting plus cross-cardinality vector matching (on(...), ignoring(...), group_left, group_right), the @ modifier (@ ts, @ start(), @ end()), the offset modifier, subqueries (<func>_over_time(<expr>[D:R])), and a function library (abs, ceil, floor, exp, ln, log2, log10, sqrt, sort, sort_desc, absent, absent_over_time, changes, deriv).
  • LogQL alpha: matchers, line filters with or-chains, label filters with regex / numeric / duration / bytes comparisons, JSON / logfmt / regex / pattern parsers, unwrap, every range aggregation (count_over_time, rate, bytes_over_time, bytes_rate, sum_over_time, avg_over_time, min/max/quantile/stddev/stdvar/first/last_over_time), every vector aggregation (sum, avg, min, max, count, stddev, stdvar, topk, bottomk) with by(...) push-down, and binops (<vec> <op> <lit> and <vec> <op> <vec>).
  • TraceQL v0.2: spanset filters with the full comparator set including regex, intrinsic mapping (name, duration, status), Map and JSON layout-aware attribute scopes (resource.X, span.X, .X), spanset operators (&&, ||, >>, >, ~), pipeline aggregates as ScalarFilter (| count() > N, | sum(duration) > 1s, | avg/min/max(...) …) with optional | by(<attr>) grouping, and trace-summary projection.
  • OpenTelemetry-native schema: targets the otel_metrics_*, otel_logs, and otel_traces tables emitted by the collector's ClickHouse exporter, in either the Map (default) or JSON (CH 25+) attribute layout.
  • Production deployment shapes out of the box: container image at ghcr.io/tsouza/cerberus, a Helm chart with TLS / Ingress / HPA / PDB / NetworkPolicy / ServiceMonitor, sidecar-with-ClickHouse and standalone topologies, and a k3s end-to-end harness as a reference deployment.

How it works

Cerberus is a single Go binary that imports each query language's canonical upstream parser directly:

  • PromQL: github.com/prometheus/prometheus/promql/parser
  • LogQL: github.com/grafana/loki/v3/pkg/logql/syntax
  • TraceQL: github.com/grafana/tempo/pkg/traceql

Each translator package walks the parser AST in-process and emits a Plan IR: an ordered pipeline of ClickHouse SQL steps with optional Go-side post-processing. The HTTP server runs the plan via the official clickhouse-go/v2 client and reshapes results into whatever response shape the source language's data source expects.

  query string                                                  JSON response
       │                                                              ▲
       ▼                                                              │
┌──────────────────┐    ┌──────────────────────┐    ┌──────────────────────────┐
│ upstream parser  │ ─► │ translator (per      │ ─► │ HTTP server              │
│ (in-process)     │    │ language)            │    │ (executes Plan via       │
│  PromQL / LogQL  │    │ AST → Plan IR        │    │  clickhouse-go/v2,       │
│  TraceQL         │    │ (multi-step)         │    │  shapes Prom/Loki/Tempo  │
└──────────────────┘    └──────────────────────┘    │  response envelopes)     │
                                                    └──────────────────────────┘

Importing the canonical parsers (rather than reimplementing each grammar) keeps cerberus bug-for-bug compatible with what Prometheus, Loki, and Tempo themselves accept.

See docs/architecture.md for a deeper dive into the Plan IR.
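
As a rough illustration of the pipeline described above, the sketch below models a plan as an ordered list of steps, each of which carries static SQL, builds SQL from the previous step's rows, or post-processes rows in Go. The field names echo the Step.SQL / Step.Build / Step.Transform names from the repository layout, but the signatures, the Row type, the Runner interface, and Execute are assumptions made for this example, not cerberus's actual API.

```go
package main

import "fmt"

// Row is one result row keyed by column name (illustrative type).
type Row map[string]any

// Step is one stage of a plan. The signatures are assumed for this sketch.
type Step struct {
	SQL       string                           // static SQL known at translate time
	Build     func(prev []Row) (string, error) // SQL derived from the previous rows
	Transform func(prev []Row) ([]Row, error)  // Go-side post-processing, no SQL
}

// Plan is an ordered pipeline of steps.
type Plan struct{ Steps []Step }

// Runner executes a single SQL statement (hypothetical interface).
type Runner interface {
	Query(sql string) ([]Row, error)
}

// Execute runs the steps in order, feeding each step the previous rows.
func Execute(p Plan, r Runner) ([]Row, error) {
	var rows []Row
	for i, s := range p.Steps {
		sql := s.SQL
		if s.Build != nil {
			built, err := s.Build(rows)
			if err != nil {
				return nil, fmt.Errorf("step %d build: %w", i+1, err)
			}
			sql = built
		}
		if sql == "" && s.Transform == nil {
			return nil, fmt.Errorf("step %d: nothing to do", i+1)
		}
		if sql != "" {
			out, err := r.Query(sql)
			if err != nil {
				return nil, fmt.Errorf("step %d query: %w", i+1, err)
			}
			rows = out
		}
		if s.Transform != nil {
			out, err := s.Transform(rows)
			if err != nil {
				return nil, fmt.Errorf("step %d transform: %w", i+1, err)
			}
			rows = out
		}
	}
	return rows, nil
}

// stubRunner records the SQL it receives and returns one canned row.
type stubRunner struct{ seen []string }

func (s *stubRunner) Query(sql string) ([]Row, error) {
	s.seen = append(s.seen, sql)
	return []Row{{"value": 1.0}}, nil
}

func main() {
	plan := Plan{Steps: []Step{
		{SQL: "SELECT 1 AS value"},
		{Transform: func(prev []Row) ([]Row, error) {
			return append(prev, Row{"value": 2.0}), nil
		}},
	}}
	r := &stubRunner{}
	rows, err := Execute(plan, r)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(rows), r.seen)
}
```

Keeping the runner behind an interface is what lets tests swap in a stub, which matches the README's note that the server's real runner is backed by clickhouse-go/v2 while tests use a stub.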

Data source: the OTel ClickHouse exporter

Cerberus does not write data — it only reads. The expected producer is the OpenTelemetry Collector ClickHouse exporter (opentelemetry-collector-contrib/exporter/clickhouseexporter). That exporter is the source of truth for the table layout cerberus targets:

| Signal  | Tables |
|---------|--------|
| Metrics | otel_metrics_gauge, otel_metrics_sum, otel_metrics_histogram, otel_metrics_exponential_histogram, otel_metrics_summary |
| Logs    | otel_logs |
| Traces  | otel_traces (+ optional otel_traces_trace_id_ts lookup) |

Two attribute storage shapes are supported, mirroring the exporter's json config knob: Map (the default — Map(LowCardinality(String), String) columns) and JSON (ClickHouse v25+ native JSON type, logs and traces only). Pick one with --layout map | json on the cerberus binary.
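
To make the difference between the two layouts concrete, here is a minimal sketch of a layout-aware attribute lookup. Layout, ParseLayout, and AttributeLookup are hypothetical names for this example; the README mentions an attribute_lookup helper but does not show its signature. The Map form uses ClickHouse's map subscript syntax. The JSON form assumes dotted subcolumn path access on the JSON type and should be checked against the ClickHouse version in use.

```go
package main

import (
	"fmt"
	"strings"
)

// Layout selects how attributes are stored (hypothetical type).
type Layout string

const (
	LayoutMap  Layout = "map"
	LayoutJSON Layout = "json"
)

// ParseLayout validates a --layout style value.
func ParseLayout(s string) (Layout, error) {
	switch Layout(strings.ToLower(strings.TrimSpace(s))) {
	case LayoutMap:
		return LayoutMap, nil
	case LayoutJSON:
		return LayoutJSON, nil
	}
	return "", fmt.Errorf("unknown layout %q (want map or json)", s)
}

// sqlString escapes backslashes and single quotes for a SQL string literal.
var sqlString = strings.NewReplacer(`\`, `\\`, `'`, `\'`)

// AttributeLookup renders a SQL expression that reads one attribute.
func AttributeLookup(l Layout, column, key string) string {
	if l == LayoutJSON {
		// Assumption: JSON columns expose attributes as dotted subcolumn
		// paths. Keys needing quoting are not handled in this sketch.
		return column + "." + key
	}
	// Map(LowCardinality(String), String): subscript with a string literal.
	return fmt.Sprintf("%s['%s']", column, sqlString.Replace(key))
}

func main() {
	for _, raw := range []string{"map", "json"} {
		l, err := ParseLayout(raw)
		if err != nil {
			panic(err)
		}
		fmt.Println(l, "=>", AttributeLookup(l, "ResourceAttributes", "service.name"))
	}
}
```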

To run the e2e harness without a live collector pipeline, the project vendors the exporter's DDL verbatim under e2e/schema/upstream/, pinned to v0.150.0 of opentelemetry-collector-contrib. The e2e/schema package renders those templates with single-node defaults (MergeTree(), no cluster, no TTL) and the e2e tests apply them at bootstrap. To bump the pin, change UpstreamVersion in e2e/schema/schema.go and re-fetch the files from the matching tag.
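
The idea of filling a DDL template with single-node defaults can be sketched as below. The {{database}}, {{cluster}}, {{engine}}, and {{ttl}} placeholders, the Defaults type, and the one-column table body are all invented for illustration; the vendored upstream templates use their own placeholder syntax and full column lists, which this example does not reproduce.

```go
package main

import (
	"fmt"
	"strings"
)

// Defaults holds the values substituted into a template (hypothetical type).
type Defaults struct {
	Database string
	Engine   string // e.g. "MergeTree()"
	Cluster  string // empty on a single node
	TTL      string // empty means no TTL clause
}

// SingleNode returns the defaults the README describes:
// MergeTree(), no cluster, no TTL.
func SingleNode(database string) Defaults {
	return Defaults{Database: database, Engine: "MergeTree()"}
}

// Render substitutes the placeholders and collapses the whitespace left
// behind by empty values. Collapsing whitespace is only safe for this toy
// template; it would alter string literals in real DDL.
func Render(tmpl string, d Defaults) string {
	r := strings.NewReplacer(
		"{{database}}", d.Database,
		"{{cluster}}", d.Cluster,
		"{{engine}}", d.Engine,
		"{{ttl}}", d.TTL,
	)
	return strings.Join(strings.Fields(r.Replace(tmpl)), " ")
}

func main() {
	// Illustrative template, not the upstream DDL.
	tmpl := `CREATE TABLE IF NOT EXISTS {{database}}.otel_logs {{cluster}}
	(Timestamp DateTime64(9))
	ENGINE = {{engine}} {{ttl}}
	ORDER BY (Timestamp)`
	fmt.Println(Render(tmpl, SingleNode("otel")))
}
```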

For production, run the exporter with create_schema: true (the default) so it manages the DDL itself, then point cerberus at the same ClickHouse host, database, and layout. The exporter's README.md documents every configuration knob.

HTTP feature matrix

Point a stock Prometheus / Loki / Tempo data source at http://cerberus:9090 and queries are translated and executed against ClickHouse.

| Endpoint | Source compat | Translator status | HTTP status |
|----------|---------------|-------------------|-------------|
| GET/POST /api/v1/query | Prometheus instant query | full alpha+ coverage | 200/4xx (envelope) |
| GET/POST /api/v1/query_range | Prometheus range query | translator emits matrix-shaped Plan; HTTP returns matrix envelope | 200/4xx (envelope) |
| GET/POST /api/v1/series | Prometheus series metadata | wired against match[]; SELECT DISTINCT over metrics tables | 200/4xx (envelope) |
| GET/POST /api/v1/labels | Prometheus label names | wired; union of label keys across metrics tables | 200/4xx (envelope) |
| GET /api/v1/label/{name}/values | Prometheus label values | wired; SELECT DISTINCT for the named key | 200/4xx (envelope) |
| GET/POST /loki/api/v1/query | Loki instant query | translator covers matchers, pipelines, range + vector aggs, binops; HTTP wired | 200/4xx (envelope) |
| GET/POST /loki/api/v1/query_range | Loki range query | translator covers matchers, pipelines, range + vector aggs, binops; HTTP wired | 200/4xx (envelope) |
| GET /loki/api/v1/labels | Loki label names | wired; union of label keys across LogAttributes + ResourceAttributes | 200/4xx (envelope) |
| GET /loki/api/v1/label/{name}/values | Loki label values | wired; SELECT DISTINCT for the named key | 200/4xx (envelope) |
| GET /loki/api/v1/tail | Loki streaming logs (websocket) | deferred to v1.x | not registered |
| GET /api/search | Tempo trace search | translator covers spansets, operators, pipeline aggregates; HTTP wired | 200/4xx (envelope) |
| GET /api/traces/{id} | Tempo trace lookup | wired; OTLP-encoded batch shape grouped by ResourceAttributes | 200/4xx (envelope) |
| GET /healthz, GET /readyz | liveness / readiness | always 200 | 200 |

The HTTP error envelope matches Prometheus: {"status":"error","errorType":"bad_data" | "internal","error":"…"}. Translator parse / unsupported errors map to 400; everything else to 500.
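
A minimal sketch of that mapping follows, assuming the translator signals parse and unsupported-construct failures through sentinel errors. ErrParse, ErrUnsupported, errBackend, Classify, and WriteError are hypothetical names for this example; only the envelope fields and the 400/500 split come from the description above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// ErrorEnvelope mirrors the Prometheus-style error body.
type ErrorEnvelope struct {
	Status    string `json:"status"`
	ErrorType string `json:"errorType"`
	Error     string `json:"error"`
}

// Hypothetical sentinel errors; cerberus's real error types may differ.
var (
	ErrParse       = errors.New("parse error")
	ErrUnsupported = errors.New("unsupported construct")
	errBackend     = errors.New("clickhouse unavailable")
)

// Classify maps an error to an HTTP status code and an envelope:
// parse / unsupported errors become 400 bad_data, everything else 500 internal.
func Classify(err error) (int, ErrorEnvelope) {
	if errors.Is(err, ErrParse) || errors.Is(err, ErrUnsupported) {
		return http.StatusBadRequest,
			ErrorEnvelope{Status: "error", ErrorType: "bad_data", Error: err.Error()}
	}
	return http.StatusInternalServerError,
		ErrorEnvelope{Status: "error", ErrorType: "internal", Error: err.Error()}
}

// WriteError sends the envelope with the matching status code.
func WriteError(w http.ResponseWriter, err error) {
	code, env := Classify(err)
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(code)
	_ = json.NewEncoder(w).Encode(env)
}

func main() {
	code, env := Classify(fmt.Errorf("promql: %w", ErrParse))
	body, _ := json.Marshal(env)
	fmt.Println(code, string(body))
	// prints: 400 {"status":"error","errorType":"bad_data","error":"promql: parse error"}

	code, _ = Classify(errBackend)
	fmt.Println(code) // prints: 500
}
```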

The translator-side feature streams (PromQL alpha+, LogQL alpha, TraceQL v0.2) and the HTTP routing layer are now both wired against main. Each route translates the source query, executes the resulting Plan via clickhouse-go/v2, and reshapes rows into the data-source-specific response envelope. Loki /tail (websocket streaming) remains the only deliberately deferred route.
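
Because the routes follow the upstream HTTP APIs, a client talks to cerberus the way it would talk to Prometheus. The sketch below only builds an instant-query URL using the standard query and time parameters of the Prometheus HTTP API. InstantQueryURL is a hypothetical helper, and the base address is the one used as an example in this README.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// InstantQueryURL builds a Prometheus-style instant query URL against base.
// unixSeconds is the evaluation timestamp.
func InstantQueryURL(base, query string, unixSeconds int64) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimSuffix(u.Path, "/") + "/api/v1/query"

	q := url.Values{}
	q.Set("query", query)
	q.Set("time", strconv.FormatInt(unixSeconds, 10))
	u.RawQuery = q.Encode() // keys are emitted in sorted order

	return u.String(), nil
}

func main() {
	u, err := InstantQueryURL("http://cerberus:9090", `up{job="api"}`, 1700000000)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
	// prints: http://cerberus:9090/api/v1/query?query=up%7Bjob%3D%22api%22%7D&time=1700000000
}
```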

Compatibility matrix

| Component | Tested | Notes |
|-----------|--------|-------|
| ClickHouse | 24.x, 25.x | 25+ is required for the JSON attribute layout. e2e harness pins clickhouse/clickhouse-server:24.8. |
| Prometheus parser | prometheus/prometheus@v0.55.0 | replaced via replace in go.mod to dodge the upstream v1.8.x tag. |
| Loki parser | grafana/loki/v3@v3.6.10 | dependabot pinned to < 3.7.0 until dskit's memberlist fork lands a tagged release. |
| Tempo parser | grafana/tempo@v1.5.0 | TraceQL v2 syntax (&>, ~>, structural { A } { B }, bare ` |
| OTel collector exporter DDL | v0.150.0 | vendored in e2e/schema/upstream/. Bump via e2e/schema/schema.go. |
| Kubernetes | >= 1.27 | declared in Chart.yaml. e2e-k8s harness validates against rancher/k3s:v1.31.4-k3s1. |
| Go toolchain | 1.25.0 | repo-wide; LogQL bumped the minimum to 1.23. |

Repository layout

cmd/
  cerberus/                  CLI + HTTP-server entry point
internal/
  schema/                    OTel ClickHouse exporter table/column constants;
                             attribute_lookup helper for both Map and JSON layouts
  translator/                Translator interface + cross-language types
    plan/                    Plan IR: multi-step Step.SQL / Step.Build / Step.Transform
    promql/                  PromQL AST → Plan
    logql/                   LogQL AST  → Plan
    traceql/                 TraceQL    → Plan (with reflection-based AST accessors
                             + compat canary against the pinned tempo build)
  server/                    HTTP server with Prometheus / Loki / Tempo APIs
                             plus the plan.Runner abstraction (real impl
                             backed by clickhouse-go/v2; tests use a stub)
deploy/
  cerberus-chart/            Helm chart (sidecar + standalone modes)
docs/                        architecture, deployment, migration, tuning, troubleshooting
e2e/                         integration harness against a real ClickHouse
  schema/upstream/           OTel exporter DDL vendored from
                             opentelemetry-collector-contrib v0.150.0
  schema/                    Go renderer that fills the upstream templates
                             with single-node defaults
  seed/                      deterministic fixture writer (metrics + logs +
                             traces) used by `e2e_test.go`
  e2e_test.go                builds the cerberus HTTP server in-process and
                             exercises the public API against ClickHouse
                             (build tag `e2e`)
e2e-k8s/                     k3s + OTel collector + Altinity ClickHouse harness
                             (build tag `k8se2e`); separate Go module

Prerequisites

  • Go ≥ 1.25 (go.mod directive). Build / test only — runtime needs none.
  • just — task runner. Every command in this README maps to a just recipe.
  • ClickHouse for end-to-end tests and at runtime when serving.

Quick start

Three ways to run cerberus, in order of production-readiness:

1. Helm chart

The chart at deploy/cerberus-chart/ supports two topologies: as a sidecar container alongside ClickHouse (zero network hop, the e2e-k8s reference layout), or as a standalone Deployment fronted by a Service.

# Standalone Deployment + Service.
helm install cerberus ./deploy/cerberus-chart \
    --set sidecar.enabled=false \
    --set clickhouse.addr=clickhouse:9000 \
    --set clickhouse.database=otel

For the sidecar mode and the full TLS / Ingress / HPA / PDB / NetworkPolicy / ServiceMonitor knob set, see docs/deployment.md and the chart README.

2. Container image

Cerberus publishes a static, distroless nonroot image to ghcr.io/tsouza/cerberus for linux/amd64 and linux/arm64 on every v*.*.* tag.

docker run --rm -p 9090:9090 \
    -e CLICKHOUSE_ADDR=clickhouse:9000 \
    -e CLICKHOUSE_DATABASE=otel \
    -e CLICKHOUSE_USER=default \
    ghcr.io/tsouza/cerberus:latest \
    serve --listen 0.0.0.0:9090

3. Binary (single-node demo / dev)

# Build into ./bin/cerberus (or just download from GH releases).
just build

./bin/cerberus serve --listen 0.0.0.0:9090 \
    --clickhouse-addr localhost:9000 \
    --clickhouse-database otel \
    --clickhouse-user default

Server flags also fall back to env vars: CERBERUS_LISTEN, CERBERUS_LAYOUT, CLICKHOUSE_ADDR, CLICKHOUSE_DATABASE, CLICKHOUSE_USER, CLICKHOUSE_PASSWORD.
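
One common way to get this behavior in Go is to use the environment value as the flag's default, so an explicit flag always wins. The sketch below does that with the standard flag package. Config, envOr, and ParseServeFlags are hypothetical names. The flag and variable names are the ones listed in this README, and the password is read from the environment only because no password flag is shown here.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Config holds the resolved server settings (hypothetical type).
type Config struct {
	Listen, Layout, Addr, Database, User, Password string
}

// envOr returns the variable's value, or def when it is unset or empty.
func envOr(lookup func(string) (string, bool), key, def string) string {
	if v, ok := lookup(key); ok && v != "" {
		return v
	}
	return def
}

// ParseServeFlags resolves each setting as: flag, then env var, then default.
func ParseServeFlags(args []string, lookup func(string) (string, bool)) (Config, error) {
	var c Config
	fs := flag.NewFlagSet("serve", flag.ContinueOnError)
	fs.StringVar(&c.Listen, "listen", envOr(lookup, "CERBERUS_LISTEN", ":9090"), "listen address")
	fs.StringVar(&c.Layout, "layout", envOr(lookup, "CERBERUS_LAYOUT", "map"), "attribute layout: map or json")
	fs.StringVar(&c.Addr, "clickhouse-addr", envOr(lookup, "CLICKHOUSE_ADDR", ""), "ClickHouse address")
	fs.StringVar(&c.Database, "clickhouse-database", envOr(lookup, "CLICKHOUSE_DATABASE", ""), "ClickHouse database")
	fs.StringVar(&c.User, "clickhouse-user", envOr(lookup, "CLICKHOUSE_USER", ""), "ClickHouse user")
	// Assumption: the password is taken from the environment only.
	c.Password = envOr(lookup, "CLICKHOUSE_PASSWORD", "")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	return c, nil
}

func main() {
	cfg, err := ParseServeFlags(os.Args[1:], os.LookupEnv)
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("listen=%s layout=%s clickhouse=%s db=%s user=%s\n",
		cfg.Listen, cfg.Layout, cfg.Addr, cfg.Database, cfg.User)
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the resolution logic testable without touching the process environment.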

Migrating from an existing stack

If you're already running Prometheus / Loki / Tempo and want to move the storage backend to ClickHouse without rewriting your dashboards or alert rules, see the migration guides under docs/.

Building from source

# Build the cerberus binary into ./bin/cerberus
just build

# Release build (-ldflags='-s -w' -trimpath)
just release

# Run all tests
just test

# Lint: gofmt --check + go vet + staticcheck (if installed)
just lint

# Auto-format the source
just fmt

# Quick CI gate locally
just ci

# Mutation testing via gremlins (advisory; coverage-aware).
# Whole module (slow), single package, or diff-vs-main:
just mutate
just mutate-pkg internal/translator/promql
just mutate-diff

# Translate a query to ClickHouse SQL
just translate promql 'up{job="api"}'

# Inspect the parsed AST for a query
just parse-ast promql 'rate(http_requests_total{job="api"}[5m])'

# Run the HTTP server (default :9090)
just serve

# Build the container image
just build-image

# Render the Helm chart locally
just helm-template

just (no args) lists every recipe.

CLI

cerberus translate --lang promql 'up{job="api"}'
cerberus translate --lang logql  '{job="api"} |= "error"'
cerberus translate --lang traceql '{ duration > 1s } | count() > 10'

cerberus parse-ast --lang promql 'rate(up[5m])'

cerberus serve --listen 0.0.0.0:9090 \
    --clickhouse-addr localhost:9000 \
    --clickhouse-database otel \
    --clickhouse-user default

--layout {map | json} switches between the OTel ClickHouse exporter's default Map-typed attribute storage and the newer JSON-typed variant. Defaults to map.

For multi-step plans, cerberus translate prints each step in order separated by ;. Deferred or transform-only steps emit a -- step N: dynamic SQL or transform-only marker.
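
The output format described above might be produced along these lines. This is a sketch under assumptions: the Step fields, the Render function, the exact marker wording, and the newline after each ; are guesses for illustration, and the real CLI output may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// Step is a simplified plan step for printing purposes (hypothetical type).
type Step struct {
	SQL           string // static SQL, if any
	Dynamic       bool   // SQL is only known at run time
	TransformOnly bool   // Go-side post-processing, no SQL
}

// Render prints each step in order, separated by ";". Steps without static
// SQL are replaced by a SQL comment marker carrying the 1-based step number.
func Render(steps []Step) string {
	parts := make([]string, 0, len(steps))
	for i, s := range steps {
		switch {
		case s.TransformOnly:
			parts = append(parts, fmt.Sprintf("-- step %d: transform-only", i+1))
		case s.Dynamic:
			parts = append(parts, fmt.Sprintf("-- step %d: dynamic SQL", i+1))
		default:
			parts = append(parts, s.SQL)
		}
	}
	return strings.Join(parts, ";\n")
}

func main() {
	fmt.Println(Render([]Step{
		{SQL: "SELECT count() FROM otel_traces"},
		{Dynamic: true},
		{TransformOnly: true},
	}))
}
```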

Documentation

The docs/ directory covers architecture, deployment, migration, tuning, and troubleshooting.

License

MIT. See LICENSE.

Directories

| Path | Synopsis |
|------|----------|
| cmd/cerberus (command) | Command cerberus is the CLI + HTTP-server entry point. |
| e2e/schema | Package schema renders the OpenTelemetry Collector ClickHouse exporter's DDL into executable CREATE TABLE statements. |
| e2e/seed (command) | Command seed inserts a deterministic fixture set into a running ClickHouse so the e2e tests have known data to assert against. |
| e2e/seed/fixtures | Package fixtures contains the deterministic fixture data and the helper that writes it into ClickHouse. |
| internal/auth | Package auth provides HTTP authentication backends for cerberus. |
| internal/observability | Package observability bundles cerberus's first-class self-observability stack: structured logs (`log/slog`), Prometheus metrics (`prometheus/client_golang`), and OpenTelemetry traces. |
| internal/schema | Package schema captures the OpenTelemetry Collector ClickHouse exporter schema (table names, column names) and the helper for rendering attribute lookups under both the Map-based and JSON-typed layouts. |
| internal/server | Package server is the HTTP layer with Prometheus / Loki / Tempo API compatibility. |
| internal/server/middleware | Package middleware contains HTTP middleware used by the cerberus server: per-request structured logging, Prometheus metrics instrumentation, and OTel span correlation. |
| internal/tenant | Package tenant provides multi-tenant context plumbing for cerberus. |
| internal/translator | Package translator defines the cross-language types that PromQL/LogQL/TraceQL translators all produce and that the HTTP server consumes. |
| internal/translator/logql | Package logql translates LogQL expressions into ClickHouse SQL targeting the OTel exporter's otel_logs table. |
| internal/translator/plan | Package plan defines the intermediate execution model that translators produce and the HTTP layer consumes. |
| internal/translator/promql | Package promql translates PromQL expressions into ClickHouse SQL targeting the OTel exporter's otel_metrics_* tables. |
| internal/translator/traceql | Package traceql — accessors.go bridges the upstream tempo AST to the concrete-typed shapes the translator works with. |
