gembatesting

package

Version: v0.0.0-...-39bba70 (latest) | Published: May 13, 2026 | License: MIT | Imports: 12 | Imported by: 0

Repository

github.com/GembaCore/gemba-core

README ¶

`github.com/GembaCore/gemba-core/testing`

Conformance harness for Gemba adaptors. Import this package from your adaptor's own go test suite to validate that your WorkPlane or OrchestrationPlaneAdaptor implementation satisfies the contracts in docs/adaptors/workplane.md and docs/adaptors/orchestration.md before you ship.

Resolves DD-12, DD-15. Published early per the Foolery-spike lesson (docs/prior-art/foolery.md): a contract is only real when external authors can run the contract's own tests against their code.

Why this package exists

Foolery's BackendPort is the canonical shape for "extension point you actually treat as external": the contract type lives next to the tests that bind the contract, and third-party backends can go get both to validate their implementation in their own CI. Gemba follows the same pattern — anything less leaves adaptor authors hand-rolling tests against docs/adaptors/*.md, which rots the moment the contract tightens.

Quick start

package beads_test

import (
    "context"
    "testing"

    "github.com/GembaCore/gemba-core/core"
    gembatesting "github.com/GembaCore/gemba-core/testing"

    "yourorg/beads" // your adaptor
)

func TestBeadsConformance(t *testing.T) {
    ctx := context.Background()

    // Construct the adaptor under test; fill in beads.Config for your backend.
    impl := beads.New(ctx, beads.Config{...})

    // KnownMissingID enables the Group F not-found probe.
    gembatesting.RunWorkPlaneConformance(t, impl, &gembatesting.WorkPlaneFixture{
        KnownMissingID: core.WorkItemID("gemba/gemba/bd-does-not-exist"),
    })
}

Run it:

$ go test -v -run TestBeadsConformance ./...
=== RUN   TestBeadsConformance
=== RUN   TestBeadsConformance/A_describe_returns_valid_manifest
=== RUN   TestBeadsConformance/A_manifest_round_trips_json
=== RUN   TestBeadsConformance/A_describe_is_idempotent
=== RUN   TestBeadsConformance/E_capability_denial_matches_manifest
=== RUN   TestBeadsConformance/F_not_found_is_tagged_adaptor_error
--- PASS: TestBeadsConformance (0.00s)

Entry points

Two parallel APIs — same probes, same group layout — cover the two contexts a conformance run happens in:

- RunWorkPlaneConformance(t, impl, fixture) / RunOrchestrationConformance: drive probes from a *testing.T (e.g., a TestXxxConformance test in your adaptor's Go test suite).
- RunWorkPlaneProbes(impl, fixture) / RunOrchestrationProbes: programmatic entry points with no *testing.T dependency that return a structured *Report. Used by the gemba adaptor test CLI (gm-e3.5) and by any CI system that would rather consume JSON than parse go test output.

`RunWorkPlaneConformance(t, impl, fixture)`

Exercises the probes in docs/adaptors/workplane.md:

| Group | Probe | Asserts |
| --- | --- | --- |
| A | `describe_returns_valid_manifest` | `Describe()` returns; `CapabilityManifest.Validate()` passes; `ProtocolVersion` matches core. |
| A | `manifest_round_trips_json` | Manifest round-trips through `encoding/json` byte-identical. |
| A | `describe_is_idempotent` | Two consecutive `Describe()` calls return equal manifests. |
| E | `capability_denial_matches_manifest` | When the manifest opts out of a gated op (`attach_evidence`, `list_sprints`, `read_budget_rollup`), the adaptor returns a `capability_denied` `*core.AdaptorError` for that op. |
| F | `not_found_is_tagged_adaptor_error` | `GetWorkItem(KnownMissingID)` returns a tagged error that also satisfies `errors.Is(err, core.ErrNotFound)`. Skipped when `KnownMissingID` is empty. |
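To make the `manifest_round_trips_json` assertion concrete, here is a minimal sketch of an encode, decode, re-encode comparison. It is an illustration, not the harness's own code: the `manifest` struct and its fields are hypothetical stand-ins for `core.CapabilityManifest`, and `roundTripsByteIdentical` is a name invented for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// manifest is a hypothetical stand-in for core.CapabilityManifest; the real
// type is assumed to carry more fields than shown here.
type manifest struct {
	Name            string   `json:"name"`
	ProtocolVersion string   `json:"protocol_version"`
	Capabilities    []string `json:"capabilities,omitempty"`
}

// roundTripsByteIdentical encodes m, decodes the bytes into a fresh value,
// re-encodes that value, and reports whether both encodings match exactly.
func roundTripsByteIdentical(m manifest) (bool, error) {
	first, err := json.Marshal(m)
	if err != nil {
		return false, err
	}
	var decoded manifest
	if err := json.Unmarshal(first, &decoded); err != nil {
		return false, err
	}
	second, err := json.Marshal(decoded)
	if err != nil {
		return false, err
	}
	return bytes.Equal(first, second), nil
}

func main() {
	ok, err := roundTripsByteIdentical(manifest{
		Name:            "beads",
		ProtocolVersion: "1",
		Capabilities:    []string{"attach_evidence"},
	})
	fmt.Println(ok, err)
}
```

Running the same check against your own manifest type before wiring up the harness can help narrow down a Group A failure; a custom `MarshalJSON` without a matching `UnmarshalJSON` is one plausible way to break the round trip.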

`RunOrchestrationConformance(t, impl, fixture)`

Exercises the probes in docs/adaptors/orchestration.md:

| Group | Probe | Asserts |
| --- | --- | --- |
| A | `describe_returns_valid_manifest` | Manifest has non-empty `workspace_kinds`, `group_modes`, `cost_axes`, `escalation_kinds`, `peek_modes` and a valid `Transport`; `OrchestrationAPIVersion` matches core. |
| A | `manifest_round_trips_json` | Manifest re-decodes byte-identical through `encoding/json`. |
| B | `list_pending_requests_returns_empty_slice` | For a freshly-started session, `ListPendingRequests` returns a non-nil empty slice. |
| B | `end_session_populates_close_reason` | First terminal close populates `Session.CloseReason` with a valid `SessionCloseReason`. |
| B | `end_session_idempotent_under_same_nonce` | Second `EndSession` with the same `(sessionID, nonce)` is a no-op: no error. |
| B | `end_session_terminal_absorbing` | `EndSession` under a fresh nonce on an already-terminal session is a no-op: no error. |
| F | `list_pending_requests_unknown_session_is_tagged` | `ListPendingRequests` on an unknown id returns a tagged `KindSessionNotFound` error. Skipped when `KnownMissingSessionID` is empty. |
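The three `end_session_*` rows describe behaviour that is easy to get subtly wrong, so a small in-memory sketch may help. Everything here is hypothetical: the `store` type, the `EndSession(id, nonce, reason)` signature, and the `errSessionNotFound` sentinel are invented for illustration and are not the signatures of `core.OrchestrationPlaneAdaptor`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errSessionNotFound is a hypothetical sentinel; the real contract uses a
// tagged KindSessionNotFound adaptor error.
var errSessionNotFound = errors.New("session not found")

type session struct {
	closed      bool
	closeReason string
}

// store is a hypothetical in-memory session table.
type store struct {
	mu       sync.Mutex
	sessions map[string]*session
}

func newStore(ids ...string) *store {
	s := &store{sessions: make(map[string]*session)}
	for _, id := range ids {
		s.sessions[id] = &session{}
	}
	return s
}

// EndSession closes the session on the first call and records the reason.
// Any later call returns nil and leaves the recorded reason untouched. The
// nonce is accepted but not stored: because the terminal state absorbs every
// later call, a same-nonce replay needs no special handling in this sketch.
func (s *store) EndSession(id, nonce, reason string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	sess, ok := s.sessions[id]
	if !ok {
		return errSessionNotFound
	}
	if sess.closed {
		return nil
	}
	sess.closed = true
	sess.closeReason = reason
	return nil
}

// CloseReason returns the recorded reason, or "" for open or unknown sessions.
func (s *store) CloseReason(id string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if sess, ok := s.sessions[id]; ok {
		return sess.closeReason
	}
	return ""
}

func main() {
	s := newStore("sess-1")
	fmt.Println(s.EndSession("sess-1", "n1", "completed")) // first close
	fmt.Println(s.EndSession("sess-1", "n1", "completed")) // same-nonce replay
	fmt.Println(s.EndSession("sess-1", "n2", "aborted"))   // fresh nonce, already terminal
	fmt.Println(s.CloseReason("sess-1"))
}
```

An adaptor that distinguishes replays from fresh nonces for other reasons would need to persist the nonce; the sketch only shows the minimum needed to satisfy the three documented assertions.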

The Group B probes need a live session. Supply OrchestrationFixture.SessionStarter:

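gembatesting.RunOrchestrationConformance(t, impl, &gembatesting.OrchestrationFixture{
    KnownMissingSessionID: "sess-does-not-exist",
    // SessionStarter returns a live session id plus a cleanup callback.
    // myAdaptor is assumed to be the concrete adaptor behind impl;
    // MintFixtureSession and DestroyFixtureSession are placeholders for
    // whatever helpers your adaptor exposes.
    SessionStarter: func(t *testing.T, _ core.OrchestrationPlaneAdaptor) (string, func()) {
        t.Helper()
        id, err := myAdaptor.MintFixtureSession(context.Background())
        if err != nil {
            t.Fatal(err)
        }
        return id, func() { myAdaptor.DestroyFixtureSession(id) }
    },
})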

`RunWorkPlaneProbes(impl, fixture) *Report` / `RunOrchestrationProbes`

The programmatic runner. Same probes as above, but failures accumulate into a *Report rather than failing a *testing.T. The CLI path:

gemba adaptor test --transport jsonl --target builtin:noop-work
gemba adaptor test --transport jsonl --target builtin:noop-work --json
gemba adaptor test --transport jsonl --target builtin:noop-work --junit out.xml

--target builtin:noop-work / builtin:noop-orch exercises the in-process reference adaptors. Remote targets (URL/socket/cmd) require a transport wire client (gm-e4.x); until that client lands, they fail fast with a structured not-yet-implemented error.

The orchestration fixture used by the CLI path supplies ProgrammaticSessionStarter (a *testing.T-free counterpart to SessionStarter); it is called once per Group B probe because those probes individually close the session they receive.

Fixture contract

| Field | Required? | Used by |
| --- | --- | --- |
| `WorkPlaneFixture.KnownMissingID` | optional | Group F not-found probe (`GetWorkItem`). |
| `OrchestrationFixture.KnownMissingSessionID` | optional | Group F unknown-session probe (`ListPendingRequests`). |
| `OrchestrationFixture.SessionStarter` | optional | All Group B lifecycle probes (end session, close reason, idempotency). |

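Probes whose fixture data is missing are skipped, not failed: the corresponding t.Run subtests do not execute, so an empty fixture never breaks your suite. A skipped probe is not a pass, though; the contract it covers simply goes unchecked.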

Stability

Signatures are stable across minor versions. New probe subtests MAY land as part of minor releases; passing adaptors MUST keep passing. Breaking changes to probe semantics land with a bump of core.ProtocolVersion.

Reference implementation

internal/adapter/noop/ ships a minimal in-memory adaptor that passes both harnesses (see internal/adapter/noop/conformance_test.go). Clone the test file as the starting point for your own adaptor suite.

Documentation ¶

Overview ¶

Package testing publishes Gemba's conformance test harness as an importable library so third-party adaptor authors can run our contract tests inside their own `go test` suites (gm-2am, Foolery-spike lesson).

Import as:

import gembatesting "github.com/GembaCore/gemba-core/testing"

The alias avoids the obvious clash with the standard-library `testing` package, which every harness consumer also imports.

Entry points ¶

testing.T-driven (Go test binary):

RunWorkPlaneConformance(t, impl, fixture)
RunOrchestrationConformance(t, impl, fixture)

Programmatic (gm-e3.5, powers `gemba adaptor test`):

RunWorkPlaneProbes(impl, fixture) *Report
RunOrchestrationProbes(impl, fixture) *Report
WriteTextReport(w, r) / WriteJUnit(w, r)

Both APIs exercise the same probe set. The testing.T-driven path surfaces failures through t.Run subtests; the programmatic path collects them into a structured *Report that the CLI renders or exports to JUnit. A fixture argument lets the adaptor pre-seed the data the probes need (known-missing ids, a live session id) so that adaptors with no way to construct fixture state still pass the fixture-independent groups.

Conformance groups ¶

WorkPlane (docs/adaptors/workplane.md):

Group A — manifest validity, JSON round-trip, Describe idempotency.
Group E — capability denial: gated ops the manifest opts out of surface as `capability_denied` AdaptorError (gm-4qf).
Group F — error algebra: every observed boundary error is a tagged *core.AdaptorError (gm-faz).
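As an illustration of what "tagged" means for Group F, the sketch below shows one way an error can both carry a kind tag and satisfy `errors.Is` against a sentinel. The names are assumptions made for this example: `adaptorError` stands in for `*core.AdaptorError`, `errNotFound` for `core.ErrNotFound`, and the `"not_found"` kind string is invented.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for core.ErrNotFound.
var errNotFound = errors.New("not found")

// adaptorError is a hypothetical stand-in for *core.AdaptorError: a kind tag
// plus a wrapped sentinel.
type adaptorError struct {
	Kind string
	Op   string
	Err  error
}

func (e *adaptorError) Error() string {
	return fmt.Sprintf("%s: %s: %v", e.Op, e.Kind, e.Err)
}

// Unwrap exposes the sentinel so errors.Is can walk the chain.
func (e *adaptorError) Unwrap() error { return e.Err }

// getWorkItem is a hypothetical lookup that always misses.
func getWorkItem(id string) error {
	return &adaptorError{Kind: "not_found", Op: "GetWorkItem(" + id + ")", Err: errNotFound}
}

// isTaggedNotFound mirrors the two documented assertions: the error is the
// tagged type, and it matches the not-found sentinel.
func isTaggedNotFound(err error) bool {
	var ae *adaptorError
	return errors.As(err, &ae) && errors.Is(err, errNotFound)
}

func main() {
	fmt.Println(isTaggedNotFound(getWorkItem("gemba/gemba/bd-does-not-exist")))
	fmt.Println(isTaggedNotFound(errors.New("plain error")))
}
```

Returning the bare sentinel, or a `fmt.Errorf` string without `%w`, would fail one of the two checks, which is the kind of boundary leak this group is meant to catch.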

Orchestration (docs/adaptors/orchestration.md):

Group A — manifest validity, required-axes coverage.
Group B — session lifecycle: EndSession idempotency (same-nonce replay + terminal-absorbing), CloseReason populated, active-turn protection, scope teardown, ListPendingRequests exists.
Group F — error algebra: ListPendingRequests on a bogus session id returns a tagged KindSessionNotFound.

Stability ¶

The function signatures are stable across minor releases; new probe groups MAY be added behind new subtests without bumping the major version, but existing passing adaptors MUST keep passing. Breaking probe changes land with a bump of core.ProtocolVersion.

Index ¶

func RunOrchestrationConformance(t *testing.T, impl core.OrchestrationPlaneAdaptor, ...)
func RunWorkPlaneConformance(t *testing.T, impl core.WorkPlane, fixture *WorkPlaneFixture)
func WriteJUnit(w io.Writer, r *Report) error
func WriteTextReport(w io.Writer, r *Report)
type GroupResult
type JUnitFailure
type JUnitSkipped
type JUnitTestCase
type JUnitTestSuite
type JUnitTestSuites
type OrchestrationFixture
type ProbeResult
type Report
- func RunOrchestrationProbes(impl core.OrchestrationPlaneAdaptor, fixture *OrchestrationFixture) *Report
- func RunWorkPlaneProbes(impl core.WorkPlane, fixture *WorkPlaneFixture) *Report
- func (r *Report) Passed() bool
- func (r *Report) Totals() (passed, failed, skipped int)
type WorkPlaneFixture

Functions ¶

func RunOrchestrationConformance ¶

func RunOrchestrationConformance(t *testing.T, impl core.OrchestrationPlaneAdaptor, fixture *OrchestrationFixture)

RunOrchestrationConformance runs the OrchestrationPlane contract probes against impl. Subtests are named by conformance group so failures surface directly; fixture-independent probes always run.

fixture may be nil; that is equivalent to a zero-value OrchestrationFixture.

func RunWorkPlaneConformance ¶

func RunWorkPlaneConformance(t *testing.T, impl core.WorkPlane, fixture *WorkPlaneFixture)

RunWorkPlaneConformance runs the WorkPlane contract probes against impl (see package doc for group breakdown). Each group is a t.Run subtest so failures point directly at the offending probe; callers should invoke this from a top-level TestXxxConformance and let the subtest names surface in go test -v output.

fixture may be nil; that is equivalent to passing a zero-value WorkPlaneFixture — fixture-independent probes still run.

Usage:

func TestBeadsConformance(t *testing.T) {
    // ctx and cfg come from your own test setup.
    impl := bd.New(ctx, cfg)
    gembatesting.RunWorkPlaneConformance(t, impl, &gembatesting.WorkPlaneFixture{
        KnownMissingID: "gemba/gemba/gm-does-not-exist",
    })
}

func WriteJUnit ¶

func WriteJUnit(w io.Writer, r *Report) error

WriteJUnit serializes r as a JUnit XML <testsuites> document. The `gemba adaptor test --junit <path>` flag emits this exact format.

func WriteTextReport ¶

func WriteTextReport(w io.Writer, r *Report)

WriteTextReport emits a human-readable per-group summary of r to w. It is the default format for `gemba adaptor test` and mirrors what go test -v shows, except that skipped and not-applicable groups are called out explicitly so operators don't misread a missing fixture as a pass.

Types ¶

type GroupResult ¶

type GroupResult struct {
	Name          string        `json:"name"`
	NotApplicable bool          `json:"not_applicable,omitempty"`
	Note          string        `json:"note,omitempty"`
	Probes        []ProbeResult `json:"probes,omitempty"`
}

GroupResult carries one conformance group's probes. Name is the group letter + short description (e.g. "A: describe + capability shape"). NotApplicable=true means the group has no probes on this plane in the current release (e.g. WorkPlane has no edge interface yet, so Group C reports not-applicable).

type JUnitFailure ¶

type JUnitFailure struct {
	Message string `xml:"message,attr"`
	Body    string `xml:",chardata"`
}

JUnitFailure carries the rolled-up probe messages as a <failure> element. Message is the one-line summary; the body is the full message block so operators can see every assertion that fired.

type JUnitSkipped ¶

type JUnitSkipped struct {
	Message string `xml:"message,attr"`
}

JUnitSkipped carries the skip reason.

type JUnitTestCase ¶

type JUnitTestCase struct {
	Name      string        `xml:"name,attr"`
	ClassName string        `xml:"classname,attr"`
	Failure   *JUnitFailure `xml:"failure,omitempty"`
	Skipped   *JUnitSkipped `xml:"skipped,omitempty"`
}

JUnitTestCase represents one probe. Exactly one of Failure / Skipped is set on a non-passing case.

type JUnitTestSuite ¶

type JUnitTestSuite struct {
	Name     string          `xml:"name,attr"`
	Tests    int             `xml:"tests,attr"`
	Failures int             `xml:"failures,attr"`
	Skipped  int             `xml:"skipped,attr"`
	Cases    []JUnitTestCase `xml:"testcase"`
}

JUnitTestSuite represents one conformance group.

type JUnitTestSuites ¶

type JUnitTestSuites struct {
	XMLName  xml.Name         `xml:"testsuites"`
	Name     string           `xml:"name,attr,omitempty"`
	Tests    int              `xml:"tests,attr"`
	Failures int              `xml:"failures,attr"`
	Skipped  int              `xml:"skipped,attr"`
	Suites   []JUnitTestSuite `xml:"testsuite"`
}

JUnitTestSuites is the root element JUnit consumers (GitHub Actions, GitLab, Jenkins) look for. The fields are intentionally the common subset every renderer we care about understands — more elaborate extensions (system-out, properties, categories) are out of scope here.
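The struct tags above are enough to see what the XML looks like. The sketch below copies the documented types into unexported local equivalents (so it runs without importing the harness), encodes a hand-written example with `encoding/xml`, and reads the root `failures` attribute back. It is not the body of `WriteJUnit`; the example data and the helper names `encodeSuites` and `failureCount` are invented for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Local copies of the documented JUnit types, unexported so the sketch
// stands alone.
type junitFailure struct {
	Message string `xml:"message,attr"`
	Body    string `xml:",chardata"`
}

type junitSkipped struct {
	Message string `xml:"message,attr"`
}

type junitTestCase struct {
	Name      string        `xml:"name,attr"`
	ClassName string        `xml:"classname,attr"`
	Failure   *junitFailure `xml:"failure,omitempty"`
	Skipped   *junitSkipped `xml:"skipped,omitempty"`
}

type junitTestSuite struct {
	Name     string          `xml:"name,attr"`
	Tests    int             `xml:"tests,attr"`
	Failures int             `xml:"failures,attr"`
	Skipped  int             `xml:"skipped,attr"`
	Cases    []junitTestCase `xml:"testcase"`
}

type junitTestSuites struct {
	XMLName  xml.Name         `xml:"testsuites"`
	Name     string           `xml:"name,attr,omitempty"`
	Tests    int              `xml:"tests,attr"`
	Failures int              `xml:"failures,attr"`
	Skipped  int              `xml:"skipped,attr"`
	Suites   []junitTestSuite `xml:"testsuite"`
}

// exampleSuites builds a hand-written document: one passing, one failing probe.
func exampleSuites() junitTestSuites {
	return junitTestSuites{
		Name: "work", Tests: 2, Failures: 1,
		Suites: []junitTestSuite{{
			Name: "A", Tests: 2, Failures: 1,
			Cases: []junitTestCase{
				{Name: "describe_returns_valid_manifest", ClassName: "A"},
				{
					Name: "manifest_round_trips_json", ClassName: "A",
					Failure: &junitFailure{Message: "round trip differs", Body: "hypothetical assertion text"},
				},
			},
		}},
	}
}

// encodeSuites renders s as indented XML with the standard XML header.
func encodeSuites(s junitTestSuites) ([]byte, error) {
	body, err := xml.MarshalIndent(s, "", "  ")
	if err != nil {
		return nil, err
	}
	return append([]byte(xml.Header), body...), nil
}

// failureCount decodes a document and returns the root failures attribute,
// the way a CI consumer might.
func failureCount(doc []byte) (int, error) {
	var s junitTestSuites
	if err := xml.Unmarshal(doc, &s); err != nil {
		return 0, err
	}
	return s.Failures, nil
}

func main() {
	doc, err := encodeSuites(exampleSuites())
	if err != nil {
		fmt.Println("encode:", err)
		return
	}
	fmt.Println(string(doc))
}
```

Because `Failure` and `Skipped` are pointers, a passing case encodes as an empty `<testcase>` element with no children, which matches the "exactly one of Failure / Skipped on a non-passing case" rule stated for JUnitTestCase.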

type OrchestrationFixture ¶

type OrchestrationFixture struct {
	// KnownMissingSessionID is a session id the adaptor is guaranteed
	// NOT to have. The Group F probe calls ListPendingRequests(id) and
	// asserts the error is a tagged KindSessionNotFound AdaptorError.
	// Leave empty to skip.
	KnownMissingSessionID string

	// SessionStarter provisions a live session for the Group B
	// lifecycle probes. It returns the session id and a cleanup
	// callback the harness runs after the lifecycle probes complete.
	// Leave nil to skip the session-dependent probes.
	//
	// Used by the testing.T entry point (RunOrchestrationConformance).
	SessionStarter func(t *testing.T, impl core.OrchestrationPlaneAdaptor) (sessionID string, cleanup func())

	// ProgrammaticSessionStarter is the testing.T-free counterpart of
	// SessionStarter, used by RunOrchestrationProbes (the `gemba
	// adaptor test` CLI). It returns a fresh session id and an optional
	// cleanup callback, or an error if provisioning failed; the error
	// surfaces in the probe result instead of silently skipping.
	//
	// Each Group B probe requests its own session because most of them
	// move the session to terminal state. Adaptor authors implementing
	// both starters typically share a single internal helper that
	// mints an assignment and calls StartSession.
	ProgrammaticSessionStarter func(impl core.OrchestrationPlaneAdaptor) (sessionID string, cleanup func(), err error)

	// AcquireWorkspaceRequest seeds the Group D acquire/release event-
	// emission probe (gm-e3.6.3). Leave nil to skip the workspace half
	// of the event-emission group; adaptors without a workspace
	// allocator (noop) won't run it. The probe pairs the request with
	// a matching ReleaseWorkspace call so the workspace doesn't leak
	// into a follow-up probe.
	AcquireWorkspaceRequest *core.WorkspaceRequest
}

OrchestrationFixture carries the optional state that fixture-dependent probes need. Probes that require a live session are skipped when SessionStarter (or, on the programmatic path, ProgrammaticSessionStarter) is nil; the manifest and error-algebra probes are unaffected and still run.
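The field comments mention that authors implementing both starters typically share one internal helper. A rough sketch of that arrangement follows. The `sessionAdaptor` interface, its `StartSession` and `EndSession` signatures, and the `fakeAdaptor` are all hypothetical stand-ins; only the two starter function shapes are modelled on the documented fields.

```go
package main

import (
	"errors"
	"fmt"
	"testing"
)

// sessionAdaptor is a hypothetical, cut-down stand-in for
// core.OrchestrationPlaneAdaptor; the real method set differs.
type sessionAdaptor interface {
	StartSession(assignment string) (string, error)
	EndSession(id string) error
}

// mintSession is the single internal helper both starters share.
func mintSession(a sessionAdaptor) (string, func(), error) {
	id, err := a.StartSession("fixture-assignment")
	if err != nil {
		return "", nil, fmt.Errorf("mint fixture session: %w", err)
	}
	return id, func() { _ = a.EndSession(id) }, nil
}

// programmaticStarter has the shape of ProgrammaticSessionStarter: the error
// is returned so the runner can record it in the probe result.
func programmaticStarter(a sessionAdaptor) (string, func(), error) {
	return mintSession(a)
}

// testStarter has the shape of SessionStarter: an error aborts the test.
func testStarter(t *testing.T, a sessionAdaptor) (string, func()) {
	t.Helper()
	id, cleanup, err := mintSession(a)
	if err != nil {
		t.Fatal(err)
	}
	return id, cleanup
}

// fakeAdaptor is an in-memory implementation used only to run the sketch.
type fakeAdaptor struct {
	next int
	live map[string]bool
}

func (f *fakeAdaptor) StartSession(string) (string, error) {
	f.next++
	id := fmt.Sprintf("sess-%d", f.next)
	f.live[id] = true
	return id, nil
}

func (f *fakeAdaptor) EndSession(id string) error {
	if !f.live[id] {
		return errors.New("unknown session")
	}
	delete(f.live, id)
	return nil
}

func main() {
	a := &fakeAdaptor{live: map[string]bool{}}
	// Each Group B probe gets its own session, so the starter runs repeatedly.
	for i := 0; i < 2; i++ {
		id, cleanup, err := programmaticStarter(a)
		if err != nil {
			fmt.Println("start:", err)
			return
		}
		fmt.Println("started", id)
		cleanup()
	}
	fmt.Println("live sessions left:", len(a.live))
}
```

Keeping the provisioning logic in one place means the go test path and the CLI path cannot drift apart in how they build fixture sessions.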

type ProbeResult ¶

type ProbeResult struct {
	Name     string   `json:"name"`
	Passed   bool     `json:"passed"`
	Skipped  bool     `json:"skipped,omitempty"`
	Messages []string `json:"messages,omitempty"`
}

ProbeResult is one probe's outcome. Passed=false with messages is a failure; Skipped=true means a required fixture was missing (e.g. no KnownMissingID, so the not-found probe could not run).

type Report ¶

type Report struct {
	// Plane is "work" or "orchestration".
	Plane string `json:"plane"`
	// AdaptorName is the manifest-declared name (best-effort; empty if
	// Describe failed hard enough that no name is available).
	AdaptorName string `json:"adaptor_name,omitempty"`
	// AdaptorVersion is the manifest-declared version.
	AdaptorVersion string `json:"adaptor_version,omitempty"`
	// ProtocolVersion is what core.ProtocolVersion evaluated to at run
	// time; recorded so CI can correlate a failing run to a core pin.
	ProtocolVersion string `json:"protocol_version"`
	// StartedAt / FinishedAt bound the whole run for CI timing metrics.
	StartedAt  time.Time `json:"started_at"`
	FinishedAt time.Time `json:"finished_at"`
	// Groups is the A–F grouping defined in docs/adaptors/*.md.
	Groups []GroupResult `json:"groups"`
}

Report is the machine-readable summary of a conformance run. It is the shape the `gemba adaptor test` CLI emits (gm-e3.5), and what third-party CI systems read when they call the programmatic runner directly instead of running go test.

One Report covers one plane (work or orchestration). A CLI run that tests both planes produces one Report per plane.
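For CI systems that consume the JSON form, the struct tags above define the wire shape. The sketch below decodes a report into local mirror types and lists the probes that failed. The sample document is hand-written for this example (it is not captured CLI output), the group name "F: error algebra" is an assumed label, and the mirror types keep only the fields the example needs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirrors of the documented JSON shape, trimmed to the fields used here.
type probeResult struct {
	Name     string   `json:"name"`
	Passed   bool     `json:"passed"`
	Skipped  bool     `json:"skipped,omitempty"`
	Messages []string `json:"messages,omitempty"`
}

type groupResult struct {
	Name   string        `json:"name"`
	Probes []probeResult `json:"probes,omitempty"`
}

type report struct {
	Plane  string        `json:"plane"`
	Groups []groupResult `json:"groups"`
}

// sampleReport is a hand-written example, not real tool output.
const sampleReport = `{
  "plane": "work",
  "protocol_version": "1",
  "groups": [
    {"name": "A: describe + capability shape", "probes": [
      {"name": "describe_returns_valid_manifest", "passed": true}
    ]},
    {"name": "F: error algebra", "probes": [
      {"name": "not_found_is_tagged_adaptor_error", "passed": false,
       "messages": ["hypothetical: error is not a tagged adaptor error"]},
      {"name": "some_fixture_dependent_probe", "passed": false, "skipped": true}
    ]}
  ]
}`

// failedProbes returns "group/probe" for every probe that neither passed nor
// was skipped, following the ProbeResult semantics documented above.
func failedProbes(data []byte) ([]string, error) {
	var r report
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, err
	}
	var failed []string
	for _, g := range r.Groups {
		for _, p := range g.Probes {
			if !p.Passed && !p.Skipped {
				failed = append(failed, g.Name+"/"+p.Name)
			}
		}
	}
	return failed, nil
}

func main() {
	failed, err := failedProbes([]byte(sampleReport))
	if err != nil {
		fmt.Println("decode:", err)
		return
	}
	for _, name := range failed {
		fmt.Println("FAILED", name)
	}
}
```

Checking `Skipped` before treating `Passed == false` as a failure matters: a skipped probe reports a fixture gap rather than a contract violation.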

func RunOrchestrationProbes ¶

func RunOrchestrationProbes(impl core.OrchestrationPlaneAdaptor, fixture *OrchestrationFixture) *Report

RunOrchestrationProbes runs the OrchestrationPlane conformance probes against impl and returns a structured Report. Programmatic counterpart to RunOrchestrationConformance (gm-e3.5).

fixture may be nil.

func RunWorkPlaneProbes ¶

func RunWorkPlaneProbes(impl core.WorkPlane, fixture *WorkPlaneFixture) *Report

RunWorkPlaneProbes runs the WorkPlane conformance probes against impl and returns a structured Report. It is the programmatic counterpart to RunWorkPlaneConformance: same probes, same group layout, but no *testing.T dependency so the `gemba adaptor test` CLI can drive it.

fixture may be nil. Probes that require fixture state record a ProbeResult{Skipped: true} rather than silently disappearing, so CI can tell a missing fixture from a passed probe.

func (*Report) Passed ¶

func (r *Report) Passed() bool

Passed reports whether every non-skipped probe across every group passed. Skipped probes do not count as failure (they document a fixture gap, not a contract violation).

func (*Report) Totals ¶

func (r *Report) Totals() (passed, failed, skipped int)

Totals returns aggregate pass/fail/skip counts across the whole report. Convenient for CLI summary lines and JUnit <testsuites> attributes.
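The documented semantics of Passed and Totals can be restated as a few lines of code. This is a sketch under those stated rules only, not the package's implementation; the lowercase types and methods are local stand-ins.

```go
package main

import "fmt"

// Minimal local stand-ins for ProbeResult, GroupResult and Report.
type probeResult struct {
	Name    string
	Passed  bool
	Skipped bool
}

type groupResult struct {
	Name   string
	Probes []probeResult
}

type report struct {
	Groups []groupResult
}

// totals counts each probe exactly once: skipped takes precedence, then
// passed, and everything else is a failure.
func (r *report) totals() (passed, failed, skipped int) {
	for _, g := range r.Groups {
		for _, p := range g.Probes {
			switch {
			case p.Skipped:
				skipped++
			case p.Passed:
				passed++
			default:
				failed++
			}
		}
	}
	return passed, failed, skipped
}

// passed reports whether no non-skipped probe failed.
func (r *report) passed() bool {
	_, failed, _ := r.totals()
	return failed == 0
}

func main() {
	r := &report{Groups: []groupResult{
		{Name: "A", Probes: []probeResult{{Name: "describe_returns_valid_manifest", Passed: true}}},
		{Name: "F", Probes: []probeResult{{Name: "not_found_is_tagged_adaptor_error", Skipped: true}}},
	}}
	p, f, s := r.totals()
	fmt.Printf("passed=%d failed=%d skipped=%d ok=%v\n", p, f, s, r.passed())
}
```

Under these rules a report made up entirely of skipped probes still counts as passed, so CI gates that care about coverage may want to check the skipped count as well.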

type WorkPlaneFixture ¶

type WorkPlaneFixture struct {
	// KnownMissingID is a WorkItemID the adaptor is guaranteed NOT to
	// have. The Group F not-found probe calls GetWorkItem(id) and asserts
	// the error is a tagged *core.AdaptorError that also satisfies
	// errors.Is(err, core.ErrNotFound).
	KnownMissingID core.WorkItemID

	// SeedWorkItemWithEdges is an adaptor-supplied hook that stages one
	// WorkItem with a known-good Relationships slice inside the backend
	// and returns its id plus the expected relationships. The Group C
	// round-trip probe drives GetWorkItem(id) and checks the returned
	// relationships match the declaration. Leave nil when the adaptor
	// has no seed hook; the probe is skipped rather than failing.
	SeedWorkItemWithEdges func(impl core.WorkPlane) (core.WorkItemID, []core.Relationship, error)

	// ReadySetGraphEvolution — Group G (R3). Adaptor drives a
	// three-step scenario: stage two items A → (blocked by) B; observe
	// that B is in the ready-set and A is not; unblock A (close B or
	// remove the edge); observe A moves into the ready-set. Any
	// deviation returns a non-nil error. Gate:
	// DependencyGraphNative && ReadySetQuery.
	ReadySetGraphEvolution func(impl core.WorkPlane) error

	// DiscoveredFromMidExecution — Group G (R3, beads-specific). An
	// adaptor that declares an EdgeExtension named
	// "beads:discovered_from" stages a "discovered" item mid-execution
	// and asserts the new edge is visible in the parent's read view.
	// Gate: edge extension declared + hook non-nil.
	DiscoveredFromMidExecution func(impl core.WorkPlane) error

	// VersionedStateRoundTrip — Group H (R4). Adaptor exports state
	// via the declared transport, imports it into a fresh instance,
	// and asserts the round-tripped rows match. Runs once per
	// declared VersioningTransport entry that is not "none". Gate:
	// versioning_transport contains at least one real transport.
	VersionedStateRoundTrip func(impl core.WorkPlane, transport core.VersioningTransport) error

	// BranchMergeRoundTrip — Group H (R4). For transports that carry
	// branch semantics (git / dolt), the hook exercises a three-way
	// merge round-trip. Gate: versioning_transport contains git or
	// dolt.
	BranchMergeRoundTrip func(impl core.WorkPlane, transport core.VersioningTransport) error

	// ConcurrencyStressN — Group I (R5). Default N=16 concurrent
	// writers; the hook MAY use a smaller N and report it via the
	// returned int. Returns (N actually run, error). Gate:
	// concurrency_model is declared and not "optimistic".
	ConcurrencyStressN func(impl core.WorkPlane, n int) (int, error)

	// ReadAfterWriteCrossWriter — Group I (R5). Writes from goroutine
	// A, reads from goroutine B, asserts visibility within the
	// adaptor-declared event-latency budget.
	ReadAfterWriteCrossWriter func(impl core.WorkPlane) error

	// SessionDeathRecovery — Group J (R6). Simulates agent session
	// death mid-write; confirms state lands in the store. Gate:
	// agent_session_decoupling.
	SessionDeathRecovery func(impl core.WorkPlane) error

	// WorkPickupBySecondAgent — Group J (R6). First agent claims a
	// work item then disappears; second agent force-steals and
	// completes it. Gate: agent_session_decoupling.
	WorkPickupBySecondAgent func(impl core.WorkPlane) error

	// ReadySetSubscribeLatency — Group K (R8). Subscribe to ready-set
	// transitions; assert subscription receives the right events
	// within the adaptor-declared latency budget. Gate:
	// orchestrator_hooks ∋ ready-set-subscribe.
	ReadySetSubscribeLatency func(impl core.WorkPlane) error

	// ClaimAtomic — Group K (R8). Two concurrent claims on the same
	// item; exactly one succeeds, the other receives a typed conflict.
	// Gate: orchestrator_hooks ∋ claim-atomic.
	ClaimAtomic func(impl core.WorkPlane) error

	// EscalationIngestRoundTrip — Group K (R8). Adaptor accepts a
	// structured escalation and reads it back unchanged. Gate:
	// orchestrator_hooks ∋ escalation-ingest.
	EscalationIngestRoundTrip func(impl core.WorkPlane) error

	// WorkCompleteAck — Group K (R8). Adaptor round-trips a
	// work-complete write with an ack payload. Gate:
	// orchestrator_hooks ∋ work-complete-ack.
	WorkCompleteAck func(impl core.WorkPlane) error

	// PoolBulkDispatch — Group K (R8). N work items dispatched to a
	// pool of agents in one round-trip. Gate: orchestrator_hooks ∋
	// pool-bulk-dispatch.
	PoolBulkDispatch func(impl core.WorkPlane) error
}

WorkPlaneFixture carries the optional state some conformance probes need to run. Leave any field zero to skip the corresponding probe — fixture-independent probes (manifest validity, JSON round-trip, capability denial) still run regardless.
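To give a feel for what a fixture hook body might look like, here is a sketch of the ClaimAtomic scenario (two concurrent claims, exactly one winner, one typed conflict). It runs against a hypothetical `claimer` interface rather than `core.WorkPlane`, whose claim API is not shown in this document; `errClaimConflict`, `claimStore`, and `claimAtomicHook` are likewise invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errClaimConflict is a hypothetical typed conflict error.
var errClaimConflict = errors.New("claim conflict")

// claimer is a hypothetical stand-in for the claim-related part of
// core.WorkPlane.
type claimer interface {
	Claim(item, agent string) error
}

// claimStore is an in-memory claimer whose check-and-set runs under a mutex.
type claimStore struct {
	mu     sync.Mutex
	owners map[string]string
}

func newClaimStore() *claimStore {
	return &claimStore{owners: make(map[string]string)}
}

func (s *claimStore) Claim(item, agent string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, taken := s.owners[item]; taken {
		return errClaimConflict
	}
	s.owners[item] = agent
	return nil
}

// claimAtomicHook races two claims on one item and returns a non-nil error
// unless exactly one wins and the other gets the typed conflict.
func claimAtomicHook(c claimer) error {
	const item = "item-1"
	errs := make([]error, 2)
	var wg sync.WaitGroup
	for i := range errs {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			errs[i] = c.Claim(item, fmt.Sprintf("agent-%d", i))
		}(i)
	}
	wg.Wait()

	wins, conflicts := 0, 0
	for _, err := range errs {
		switch {
		case err == nil:
			wins++
		case errors.Is(err, errClaimConflict):
			conflicts++
		default:
			return fmt.Errorf("unexpected claim error: %w", err)
		}
	}
	if wins != 1 || conflicts != 1 {
		return fmt.Errorf("want 1 win and 1 conflict, got %d and %d", wins, conflicts)
	}
	return nil
}

func main() {
	fmt.Println(claimAtomicHook(newClaimStore()))
}
```

The hook reports deviations by returning an error instead of calling into `*testing.T`, which is what lets the same function serve both the go test and the programmatic entry points.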
