trace

package
v0.2.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 29, 2026 License: MPL-2.0 Imports: 21 Imported by: 0

Documentation

Overview

Package trace records per-request artifacts (the inbound envelope, every op execution, the final response, plus a timeline of routing events) to durable storage. Developers browse the result to see exactly what happened during a request without rerunning it or turning on debug logging.

The package is interface-first: the chassis depends only on `Sink` and `RequestTracer`. `NoopSink` (mode=off) is the prod default and a true zero-cost no-op. `FileSink` (mode=summary|full) writes the artifact tree under a configured directory. A future `SQLiteSink` can be added behind the same interface without touching the processor or inlets.

File layout under `<trace_dir>/requests/<rid>/`:

in.json            inlet's initial envelope (input to the request)
out.json           chassis's final response (after all merges)
timeline.jsonl     line-per-event log of stages/steps/jumps
steps/
  NNNN-<name>/     one folder per fired op, prefixed by zero-
                   padded scope so the dir lex-sorts in scope order
    op.json        the rule's stored definition (txcl, exec, etc.)
    in.json        envelope handed to the handler
    out.json       handler's raw response
    meta.json      timing, sizes, status, transport

`in.json`/`out.json` are paired at every level — request root and per-step — so a developer can `diff` request-level in vs out, or step-level in vs out, with the same naming.

In `summary` mode the per-step `op.json`/`in.json`/`out.json` are omitted; `meta.json` and the timeline still record what ran and how long it took.

Index

Constants

This section is empty.

Variables

View Source
var ErrNotFound = errors.New("trace: request not found")

ErrNotFound is returned by Reader.Get when the rid is absent.

Functions

func ApplyHints

func ApplyHints(b []byte, h Hints) []byte

ApplyHints applies omits-then-redacts to b. Order matters at the margin: applying omits first lets `_txc.lmtp.msg` vanish without wasting a sentinel write on `_txc.lmtp.msg.headers.authorization`. It also makes behavior deterministic when an author lists the same path in both kinds. Returns the (possibly mutated) byte slice; the underlying array may or may not be reused by sjson.

func Register

func Register(name string, c Constructor)

Register adds a named Sink backend. Called from init() in the backend package; built-ins ("file", "noop") register from this package's own init (file.go / noop.go), so no blank import is needed for them.

func RegisterArmable

func RegisterArmable(name string, c ArmableConstructor)

RegisterArmable registers a named live-stream backend. Called from init() in the backend package; backends are activated by blank import (same discipline as Sink/Reader).

func RegisterReader

func RegisterReader(name string, c ReaderConstructor)

RegisterReader registers a named Reader backend (called from init(); built-ins register from this package, so no blank import is needed).

func WithContext

func WithContext(parent context.Context, t RequestTracer) context.Context

WithContext returns a copy of parent carrying the given tracer. A nil tracer is acceptable — FromContext returns NoopTracer when the key is absent, so callers don't need nil checks at every call site.

Types

type Armable

type Armable interface {
	// Subscribe returns a per-request subscription delivering closed
	// traces newer than sinceCursor (opaque to the client; empty =
	// "from now on"). buf is the channel buffer hint; backends may
	// clamp. The subscription is bound to ctx and drains on
	// cancellation.
	Subscribe(ctx context.Context, sinceCursor string, buf int) (Subscription, error)

	// Close drains any backend-side resources (NATS connections,
	// background goroutines). Respect the ctx deadline.
	Close(ctx context.Context) error
}

Armable is the read-side seam for *live* trace streams. Unlike Reader (which serves the persistent archive — newest-N + Get-by-rid), an Armable delivers each request's closed trace to one or more subscribers as soon as End fires. Backends that have no live-stream story (file, noop) register no Armable; backends that do (e.g. a NATS subscriber overlay) register one and the admin server exposes GET /traces/stream against it.

The seam lives on the *subscriber* side — i.e. an Armable knows how to obtain a feed of closed traces, regardless of how the chassis emits them. In the NATS overlay the publisher is a separate trace.Sink and the Armable is a NATS subscriber; multi-node fan-in is automatic (every chassis publishes on the same subject hierarchy and the admin subscriber receives them all).

func OpenArmable

func OpenArmable(name string, cfg StoreConfig) (Armable, error)

OpenArmable constructs the named live-stream backend. Returns an error if the backend has no Armable registered — the admin server uses this to gate /traces/stream registration: backends without an Armable (file, noop) silently skip the route, backends with one (nats) get the route mounted.

type ArmableConstructor

type ArmableConstructor func(StoreConfig) (Armable, error)

ArmableConstructor builds an Armable from StoreConfig (same config the Sink/Reader sides get).

type AsyncOpts

type AsyncOpts struct {
	BufferSize   int
	BodyCapBytes int
	StaleAfter   time.Duration
	DropCounter  *atomic.Int64
}

AsyncOpts configures an AsyncSink.

BufferSize    — channel capacity. When full, new events are dropped
                rather than blocking the request path. Default 1024.
BodyCapBytes  — maximum bytes kept per body (Input, Output, Payload,
                final). Larger payloads are truncated; meta.json
                records the original size and a `*_truncated` flag.
                0 disables capping. Default 65536.
StaleAfter    — idle TTL for per-request tracers held in the worker's
                map. If a tracer hasn't received Step/Event/End for
                this long, the worker closes it to avoid leaked file
                descriptors when End events drop under load. Default
                60s.
DropCounter   — optional atomic counter incremented when the buffer
                is full and an event is dropped. nil to ignore.

type AsyncSink

type AsyncSink struct {
	// contains filtered or unexported fields
}

AsyncSink wraps another Sink with a worker goroutine and a buffered channel. Every per-request method (Begin, Step, Event, End) goes through the channel — the request path NEVER touches the disk and returns within a few microseconds whether tracing is on or off.

The worker drains the channel in order, looks each event up in its tracer map by rid, and forwards to the per-request tracer it got from base.Begin. When an End arrives, the worker spawns a flush goroutine so it doesn't sit blocked through the per-request disk burst — many flushes can run in parallel, and the worker returns to draining the channel immediately.

Per-request ordering is preserved by the FIFO channel: all events from one request originate in the same request goroutine and are enqueued in order.

Drop semantics under overload:

  • If a Begin enqueue drops, the tracer never gets created; all subsequent Step/Event/End for that rid silently no-op in the worker. Trace for that request is missing entirely (better than half-formed).
  • If an End enqueue drops AFTER a Begin landed, the tracer would linger in the worker's map. The ticker-based GC closes any tracer idle longer than StaleAfter so the leak is bounded.

func NewAsyncSink

func NewAsyncSink(base Sink, opts AsyncOpts) *AsyncSink

NewAsyncSink wraps base with the given options. The worker goroutine starts immediately.

func (*AsyncSink) Begin

func (s *AsyncSink) Begin(info RequestInfo) RequestTracer

Begin enqueues a begin event and returns an asyncTracer keyed by the request rid. The actual base.Begin call happens in the worker — dir creation, in.json write, and timeline.jsonl open all run off the request path so request latency is independent of disk speed.

If the channel is full when Begin enqueues, the tracer is silently dropped: the request continues normally (this isn't a request error), but no trace artifacts will be written for it. Better than producing a partial trace.

func (*AsyncSink) Close

func (s *AsyncSink) Close(ctx context.Context) error

Close stops accepting new work, drains pending work, waits for any in-flight flush goroutines, and closes the underlying base sink. Returns ctx.Err() if the drain doesn't complete before ctx is cancelled.

func (*AsyncSink) DroppedCount

func (s *AsyncSink) DroppedCount() int64

DroppedCount returns the number of events dropped because the buffer was full (or zero if no DropCounter was configured).

type ClosedTrace

type ClosedTrace struct {
	RequestDetail        // embedded: same shape as Reader.Get
	Cursor        string // monotonic per-subscription; opaque to client
}

ClosedTrace is what flows live to admin clients on End. It carries the same fields Reader.Get would return (so the admin handler can map to its wire struct verbatim) plus an opaque per-subscription Cursor the client echoes back on reconnect.

type Constructor

type Constructor func(StoreConfig) (Sink, error)

Constructor builds a Sink from StoreConfig.

type FileSink

type FileSink struct {
	Dir  string
	Mode Mode
}

FileSink writes the per-request artifact tree under Dir. Concurrent calls to Begin are safe: each request gets its own subdirectory keyed by RID, so there's no cross-request contention.

Writes are STREAMING — each Step/Event lands on disk as it arrives. We tried deferring all writes to End in a single burst (in case batching helped) but benched slower across the board: the worker got blocked through each request's flush burst, the channel filled, and overall throughput dropped. Streaming smears the small writes out so the worker stays busy without stalling.

func NewFileSink

func NewFileSink(dir string, mode Mode) (*FileSink, error)

NewFileSink returns a sink that writes to dir in the given mode. dir is created (with parents) if it doesn't exist; an error from MkdirAll fails fast at startup rather than on the first request.

func (*FileSink) Begin

func (s *FileSink) Begin(info RequestInfo) RequestTracer

Begin makes the per-request directory, writes in.json, and opens timeline.jsonl. Returns a tracer that's safe for concurrent Step/Event calls; End is expected exactly once.

func (*FileSink) Close

func (s *FileSink) Close(context.Context) error

Close is a no-op for FileSink — sink-level resources aren't held; each request tracer closes its own timeline file in End.

type HintLookup

type HintLookup func(tenant, stack string) Hints

HintLookup answers "what hints apply to this (tenant, stack) write?" AsyncSink calls it on the worker thread, off the request hot path. nil lookup ⇒ no redaction ever happens (the zero-cost default).

Stack may be the empty string when an envelope hasn't been routed (boot/% fallback, system paths); callers should treat that as "lookup with stack=”" — the registry returns empty hints unless an untenanted/unstacked rule explicitly declared something.

type Hints

type Hints struct {
	Redact []string
	Omit   []string
}

Hints are the redact/omit path lists for one (tenant, stack) slot. Redact replaces matched path values with the sentinel "[REDACTED]"; Omit deletes the path entirely. Both lists are gjson dot-paths (exact match only; no wildcards in v1).

Same path appearing in both lists is resolved at build time: Omit wins, the path is dropped from Redact. See chassis/server/redact.go for the build-side dedupe.

func (Hints) Empty

func (h Hints) Empty() bool

Empty reports whether either list has work to do.

type ListQuery

type ListQuery struct {
	Limit       int
	Grep        string
	IfNoneMatch string
}

ListQuery is the read-side query for the trace list. IfNoneMatch is the client's cached ETag; backends that can cheaply detect "unchanged" set ListResult.NotModified so the handler can return 304 without a body. Limit/Grep mirror the ?limit=/?grep= params.

type ListResult

type ListResult struct {
	Traces      []Summary
	Total       int
	ETag        string
	NotModified bool
}

ListResult is the aggregated list response. ETag is an OPAQUE cursor: the file backend uses an fs-stat fingerprint; a DB/index backend uses e.g. max ingest-id + count. The admin layer treats it as opaque and only echoes/compares it. NotModified ⇒ Traces is empty and the handler should emit 304.

type Mode

type Mode string

Mode controls how much detail Sinks write.

off      no writes — production default.
summary  request + timeline + per-step meta. No payload bytes.
full     everything, including handler in/out bodies per step.
const (
	ModeOff     Mode = "off"
	ModeSummary Mode = "summary"
	ModeFull    Mode = "full"
)

func ParseMode

func ParseMode(s string) Mode

ParseMode normalizes user-supplied strings. Anything unrecognized falls back to ModeOff so a typo in TXCO_TRACE_MODE doesn't silently enable tracing in production.

type NoopSink

type NoopSink struct{}

NoopSink is the zero-cost Sink used when TXCO_TRACE_MODE=off. Every method is a no-op; nothing touches the filesystem or allocates.

func (NoopSink) Begin

Begin returns a NoopTracer, which discards everything written to it.

func (NoopSink) Close

func (NoopSink) Close(context.Context) error

Close is a no-op — there's no work to drain.

type NoopTracer

type NoopTracer struct{}

NoopTracer is the per-request side of NoopSink. Exported because FromContext returns it when no tracer is attached to the ctx — callers can compare for it if they need to short-circuit setup.

func (NoopTracer) End

func (NoopTracer) End(status string, final []byte)

func (NoopTracer) Event

func (NoopTracer) Event(TimelineEvent)

func (NoopTracer) Step

func (NoopTracer) Step(StepInfo)

type Reader

type Reader interface {
	// List returns newest-first summaries (≤ q.Limit), an opaque ETag,
	// and Total (full count, or match count when q.Grep != "").
	List(ctx context.Context, q ListQuery) (ListResult, error)

	// Get aggregates one request; full also embeds payloads. ErrNotFound
	// when rid is absent.
	Get(ctx context.Context, rid string, full bool) (RequestDetail, error)

	// IndexNames returns the newest ≤ max request ids plus the total
	// count, for the minimal HTML index page.
	IndexNames(ctx context.Context, max int) (names []string, total int, err error)

	// RawFS exposes the raw artifact tree for the browse-it file server,
	// when the backend is filesystem-shaped. (nil,false) ⇒ the admin
	// serves 404 for the raw path (non-fs backends).
	RawFS() (http.FileSystem, bool)
}

Reader is the read side of a trace backend. The admin endpoints go through this instead of the filesystem directly, so a separate-machine admin can read traces a chassis shipped to a central store. The built-in "file" reader preserves byte-for-byte the legacy fs behavior; "noop" returns empty/NotFound.

func OpenReader

func OpenReader(name string, cfg StoreConfig) (Reader, error)

OpenReader constructs the named Reader; unknown name is a hard error.

type ReaderConstructor

type ReaderConstructor func(StoreConfig) (Reader, error)

ReaderConstructor builds a Reader from StoreConfig (same config the Sink side gets).

type RedactingSink

type RedactingSink struct {
	// contains filtered or unexported fields
}

RedactingSink wraps another Sink and applies a per-(tenant, stack) HintLookup to every envelope-bearing event before forwarding to the inner sink. Composable above or below an AsyncSink:

AsyncSink(RedactingSink(FileSink))   ← redaction runs on the
                                       async worker, off the
                                       request hot path
RedactingSink(FileSink)              ← sync mode; redaction runs
                                       on the request goroutine
                                       just before disk write

nil lookup ⇒ NewRedactingSink returns the inner sink unchanged. The hot path stays exactly as it was when no redaction is configured.

func (*RedactingSink) Begin

func (s *RedactingSink) Begin(info RequestInfo) RequestTracer

Begin captures the request's tenant and (initial) stack, applies hints to the inbound payload, then forwards to the inner sink.

func (*RedactingSink) Close

func (s *RedactingSink) Close(ctx context.Context) error

Close forwards to the inner sink.

type RequestDetail

type RequestDetail struct {
	RID              string
	Src              string
	Tenant           string
	Stack            string
	Route            string
	StartedAt        string
	FinishedAt       string
	DurationMs       *int64
	Status           string
	PayloadBytes     int64
	PayloadTruncated bool
	Steps            []Step
	In               map[string]any
	Out              any
}

RequestDetail is the aggregated per-request document (everything the admin detail endpoint returns except the continuation cross-links, which the admin layer composes from the run store — kept out of here so the trace package doesn't depend on chassis/continuation).

type RequestInfo

type RequestInfo struct {
	RID          string
	Src          string
	Tenant       string
	Stack        string
	StartedAt    time.Time
	Payload      []byte
	PayloadBytes int
}

RequestInfo is what the chassis hands the sink at request start. Payload is the raw envelope bytes that landed in the chassis after inlet construction and any ingress stamping.

PayloadBytes is the ORIGINAL payload size before any truncation a wrapping sink may apply. Zero means "use len(Payload)" — direct callers (no wrapper) don't need to set it. AsyncSink sets it to the true size before capping `Payload` to BodyCapBytes so meta.json records what was sent, not just what we kept on disk.

type RequestTracer

type RequestTracer interface {
	Step(info StepInfo)
	Event(ev TimelineEvent)
	End(status string, finalPayload []byte)
}

RequestTracer is the per-request handle. Step / Event must be safe for concurrent calls (parallel ops at the same scope fire from different goroutines). End is called exactly once.

func FromContext

func FromContext(ctx context.Context) RequestTracer

FromContext returns the tracer attached by the bus loop, or a NoopTracer if none is set (typical when tests or admin handlers drive the processor outside a request lifecycle).

type Sink

type Sink interface {
	Begin(info RequestInfo) RequestTracer
	Close(ctx context.Context) error
}

Sink hands out a RequestTracer per inbound request. Implementations must be safe for concurrent calls to Begin.

Close is called once on chassis shutdown. Synchronous sinks (NoopSink, FileSink) can return nil immediately. Buffered/async sinks should drain any in-flight work, respecting the ctx deadline — return ctx.Err() if the drain doesn't complete in time.

func NewRedactingSink

func NewRedactingSink(inner Sink, lookup HintLookup) Sink

NewRedactingSink wraps inner with redaction. When lookup is nil (no rule declared any redact/omit), returns inner unchanged — no allocation, no wrapper.

func Open

func Open(name string, cfg StoreConfig) (Sink, error)

Open constructs the named backend. Unknown name is a hard error (fail-fast at boot) listing what is available.

type Step

type Step struct {
	Name            string
	Operation       string
	Transport       string
	Stack           string
	Scope           int
	StartedAt       string
	FinishedAt      string
	DurationMs      int64
	Status          string
	InputBytes      int64
	OutputBytes     int64
	InputTruncated  bool
	OutputTruncated bool
	Error           string
	In              any
	Out             any
}

Step is one op execution in the aggregated detail.

type StepInfo

type StepInfo struct {
	Stack      string
	Scope      int
	Name       string
	Operation  string
	Transport  string
	Txcl       string
	Input      []byte
	Output     []byte
	StartedAt  time.Time
	FinishedAt time.Time
	Status     string
	Error      string

	// InputBytes / OutputBytes are the ORIGINAL payload sizes before
	// any truncation a wrapping sink may apply. Zero means "use
	// len(Input)/len(Output)" — direct callers don't need to set
	// these. AsyncSink populates them before capping the slices so
	// meta.json records what the handler actually saw, not just what
	// we kept on disk.
	InputBytes  int
	OutputBytes int
}

StepInfo describes one op execution.

Stack/Scope/Name uniquely identify the rule that fired (the same triple the chassis stamps as `_txc.op` on outbound envelopes). Operation is the literal EXEC operand (a URL, txco://, or stage jump). Transport classifies the dispatch path.

Input is the envelope bytes the chassis posted to the handler; Output is the handler's raw response. Both are recorded only in ModeFull.

type StoreConfig

type StoreConfig struct {
	Dir  string
	Mode Mode
}

StoreConfig carries backend-selecting options resolved from chassis config. Only the file/noop backends are wired in open core; an out-of-tree backend (e.g. a queue/object-store shipper for a separate-machine admin) registers itself via init() + blank import and reads any additional config (endpoint/token) from its OWN env in its constructor — the same seam discipline as chassis/continuation/factory.go and chassis/artifact/factory.go.

type Subscription

type Subscription interface {
	Events() <-chan ClosedTrace
	Close()
}

Subscription is the per-request handle the admin endpoint reads from. Events() yields closed traces in arrival order (best-effort — the bus may not preserve cross-node ordering; clients sort on display by RequestDetail timestamps). Close() releases the subscription; backends MUST tolerate Close() being called multiple times.

type Summary

type Summary struct {
	RID        string
	Src        string
	Tenant     string
	Stack      string
	Route      string
	StartedAt  string
	FinishedAt string
	DurationMs *int64
	Status     string
}

Summary is one row in the trace list (the wire shape the admin list endpoint emits; the admin handler copies these fields verbatim into its JSON struct).

type TimelineEvent

type TimelineEvent struct {
	Ts     time.Time
	Event  string
	Fields map[string]any
}

TimelineEvent is a single line in timeline.jsonl. Event values are stable strings; Fields is open-ended (the sink marshals as-is).

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL