Documentation ¶
Overview ¶
Package concepts owns the controlled-vocabulary half of the gm-s47n two-axis planner (see docs/design/work-planning.md §6).
The package is self-contained: it does NOT depend on internal/core's WorkItem schema. Beads are surfaced through a small BeadConceptStore interface so the package compiles and tests run before gm-s47n.1.1 lands the WorkItem.concepts field.
Three interlocking surfaces:
- Bootstrap runs N pluggable [BootstrapSource]s in parallel, unions the candidates, normalizes, dedupes, and caps. Ships three sources: Go packages, route prefixes, fixture taxonomy.
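- DetectDrift reads bead concept usage and reports near-duplicate merge candidates and singleton delete candidates; SuggestionsFromDrift turns that report into pending suggestions. Drifter follow-ups are deferred to gm-s47n.3. Pure and deterministic: it never mutates state.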
- [ReviewQueue] persists suggestions and operator decisions; an approval drives historical rewrites via ApplyMerge / ApplyRename / ApplyDelete over the BeadConceptStore.
Storage lives in <workspace>/.gemba/concepts/.
Index ¶
- Constants
- Variables
- func AppendDecision(workspace string, d Decision) error
- func ApplyDelete(ctx context.Context, store BeadConceptStore, term string) (int, error)
- func ApplyMerge(ctx context.Context, store BeadConceptStore, from, to string) (int, error)
- func ApplyRename(ctx context.Context, store BeadConceptStore, from, to string) (int, error)
- func Bootstrap(ctx context.Context, root string, sources []BootstrapSource, ...) (*Vocabulary, *BootstrapResult, error)
- func EnsureStoreDir(workspace string) (string, error)
- func NewSuggestionID() string
- func Normalize(name string) string
- func SaveSuggestions(workspace string, list *SuggestionList) error
- func SaveVocabulary(workspace string, v *Vocabulary) error
- func StoreDir(workspace string) string
- type BeadConceptStore
- type BeadConcepts
- type BootstrapBucket
- type BootstrapError
- type BootstrapOpts
- type BootstrapResult
- type BootstrapSource
- type Candidate
- type Decision
- type Drift
- type DriftOpts
- type FixtureTaxonomySource
- type GoPackagesSource
- type MemoryStore
- type NearDuplicate
- type RoutePrefixesSource
- type Singleton
- type Suggestion
- type SuggestionKind
- type SuggestionList
- func (l *SuggestionList) Add(s Suggestion) bool
- func (l *SuggestionList) Approved() []Suggestion
- func (l *SuggestionList) Find(id string) (*Suggestion, bool)
- func (l *SuggestionList) Mark(id string, status SuggestionStatus) error
- func (l *SuggestionList) Pending() []Suggestion
- func (l *SuggestionList) Rejected() []Suggestion
- type SuggestionStatus
- type Term
- type Vocabulary
- func (v *Vocabulary) Active() []Term
- func (v *Vocabulary) Add(t Term) (*Term, bool)
- func (v *Vocabulary) Find(name string) (*Term, bool)
- func (v *Vocabulary) Merge(from, to string) (*Term, error)
- func (v *Vocabulary) Rename(from, to string) (*Term, error)
- func (v *Vocabulary) Retire(name string) bool
- func (v *Vocabulary) Sort()
- type VocabularyError
Constants ¶
const (
	StoreDirName     = "concepts"
	VocabularyFile   = "vocabulary.json"
	SuggestionsFile  = "suggestions.json"
	DecisionsLogFile = "decisions.log"
)
Store paths inside <workspace>/.gemba/concepts/. The names live as constants so the CLI, the SPA (when it lands), and any future importer all hit the same files.
Variables ¶
var (
	ErrSuggestionNotFound = errors.New("concepts: suggestion not found")
	ErrSuggestionDecided  = errors.New("concepts: suggestion already decided")
)
ErrSuggestionNotFound / ErrSuggestionDecided are sentinel errors the CLI checks via errors.Is so it can map them to user-facing messages without string parsing.
Functions ¶
func AppendDecision ¶
AppendDecision adds one entry to the decisions JSONL log. The log is append-only: every approve / reject becomes a permanent line so the audit trail survives vocabulary edits.
func ApplyDelete ¶
ApplyDelete drops `term` from every bead that has it. Returns the count of beads changed.
func ApplyMerge ¶
ApplyMerge rewrites every bead whose concept set contains `from` so it now contains `to` instead. Beads already carrying both terms drop the `from` (no double entry). Returns the count of beads changed.
func ApplyRename ¶
ApplyRename changes every occurrence of `from` to `to` across every bead. Identical mechanics to ApplyMerge — the difference is in the vocabulary layer (Rename keeps the term as the surviving one, Merge collapses two pre-existing terms). Beads carrying both terms dedup to a single `to`.
func Bootstrap ¶
func Bootstrap(ctx context.Context, root string, sources []BootstrapSource, opts BootstrapOpts) (*Vocabulary, *BootstrapResult, error)
Bootstrap runs every source in parallel, unions the candidates, normalizes, dedupes, and caps. Returns a fresh Vocabulary with stable name order. Source selection is the caller's responsibility; pass DefaultSources for the ship-with set.
Errors from individual sources are collected and surfaced via [BootstrapResult.Errors]; the vocabulary is returned even when some sources failed because the surviving sources still produce useful starter terms.
func EnsureStoreDir ¶
EnsureStoreDir mkdir-p's the concepts directory. Idempotent.
func NewSuggestionID ¶
func NewSuggestionID() string
NewSuggestionID mints a short hex id, brief enough for an operator to type. Collisions inside one workspace are vanishingly unlikely (8 hex chars = 4 bytes, i.e. 32 bits of entropy).
func Normalize ¶
Normalize collapses a candidate name into the canonical lower-kebab-case form. Whitespace, underscores, slashes, and dots all become hyphens; runs of separators collapse to one; trailing separators are trimmed.
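The rules above can be approximated in a few lines. This normalize is a hypothetical re-implementation for illustration; trimming leading separators and treating existing hyphens as separators are assumptions beyond what the documentation states.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize lowercases name and turns runs of separators into single hyphens.
func normalize(name string) string {
	var b strings.Builder
	pendingSep := false
	for _, r := range strings.ToLower(strings.TrimSpace(name)) {
		switch r {
		case ' ', '\t', '\n', '_', '/', '.', '-':
			// Remember the separator; emit it only if more text follows.
			pendingSep = true
		default:
			if pendingSep && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingSep = false
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	for _, in := range []string{"React Query", "internal/core__db.", "React-Query"} {
		fmt.Printf("%q -> %q\n", in, normalize(in))
	}
}
```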
func SaveSuggestions ¶
func SaveSuggestions(workspace string, list *SuggestionList) error
func SaveVocabulary ¶
func SaveVocabulary(workspace string, v *Vocabulary) error
SaveVocabulary writes vocabulary.json atomically (write to a sibling .tmp + rename) so a crashed process never leaves a half-written file.
Types ¶
type BeadConceptStore ¶
type BeadConceptStore interface {
// List returns every bead's id and current concept set.
List(ctx context.Context) ([]BeadConcepts, error)
// Set replaces the concept set on the named bead. The slice is
// owned by the caller after return — implementations that need to
// retain it should copy.
Set(ctx context.Context, beadID string, concepts []string) error
}
BeadConceptStore is the integration boundary between the concepts package and the WorkItem.concepts schema landing in gm-s47n.1.1. Production wiring (a thin adapter over WorkPlane) is scheduled for that bead; until then the in-memory implementation in this file powers tests and CLI dry-runs.
type BeadConcepts ¶
type BeadConcepts struct {
BeadID string
Concepts []string
// CreatedAt + ClosedAt feed the singleton-decay heuristic. Both
// optional; zero values disable the time-window filter.
CreatedAt time.Time
ClosedAt *time.Time
}
BeadConcepts is the per-bead projection a BeadConceptStore returns: the id, the current concept set, and the optional timestamps used by the singleton heuristic. Both the drift detector and the historical rewrite consume this shape.
type BootstrapBucket ¶
type BootstrapError ¶
func (BootstrapError) Error ¶
func (e BootstrapError) Error() string
type BootstrapOpts ¶
type BootstrapOpts struct {
// Max caps the number of terms in the resulting vocabulary. The
// bead description targets 30-60; default is 60. Sources are
// queried in order so an early source's candidates fill first;
// callers wanting different priority can reorder the slice.
Max int
}
BootstrapOpts controls collection limits.
func DefaultBootstrapOpts ¶
func DefaultBootstrapOpts() BootstrapOpts
DefaultBootstrapOpts is the ship-with policy: at most 60 terms.
type BootstrapResult ¶
type BootstrapResult struct {
Total int
Skipped int // candidates dropped because Max was hit
BySource []BootstrapBucket // count of terms attributed per source
Errors []BootstrapError // per-source failures (other sources still ran)
}
BootstrapResult is the operator-visible report of one bootstrap run. Mostly diagnostic; the vocabulary itself is the load-bearing output.
type BootstrapSource ¶
type BootstrapSource interface {
// Name is a stable identifier the [Term.Source] field carries
// so a future operator can tell which source proposed a term.
// Convention: lower-kebab-case noun phrase.
Name() string
// Extract returns the candidates this source observed under root.
// Implementations MUST return an empty slice with a nil error when
// there's nothing to extract: a workspace without an `internal/`
// directory is a legitimate state, not a failure.
Extract(ctx context.Context, root string) ([]Candidate, error)
}
BootstrapSource extracts candidate vocabulary terms from one observable feature of the workspace. The interface stays small so adding a source (e.g. an org-internal Linear label exporter) only requires implementing Name and Extract.
func DefaultSources ¶
func DefaultSources() []BootstrapSource
DefaultSources returns the bootstrap sources gemba ships. Order matters — earlier sources fill the cap first when [BootstrapOpts.Max] is small. Operators wanting a different priority compose their own slice.
type Candidate ¶
type Candidate struct {
Name string
Description string
// Source is overwritten by Bootstrap with the originating
// BootstrapSource.Name(); implementations can leave it empty.
Source string
}
Candidate is one proposed vocabulary entry. Bootstrap collects, normalizes, and dedupes by Name; the first source to propose a name wins for the Source label.
type Decision ¶
type Decision struct {
SuggestionID string `json:"suggestion_id"`
Kind SuggestionKind `json:"kind"`
From string `json:"from,omitempty"`
To string `json:"to,omitempty"`
Action string `json:"action"` // "approved" | "rejected"
Reason string `json:"reason,omitempty"`
By string `json:"by"`
BeadsChanged int `json:"beads_changed,omitempty"`
At time.Time `json:"at"`
}
Decision is one append-only entry in the decisions log.
func ApplyDecision ¶
func ApplyDecision(
	ctx context.Context,
	v *Vocabulary,
	list *SuggestionList,
	store BeadConceptStore,
	id string,
	by string,
) (Decision, error)
ApplyDecision is the CLI-facing entry point. It looks up the suggestion, marks it approved on the in-memory list, applies the vocabulary side and the bead-side rewrite, and returns a Decision that carries the count of beads changed for the audit log. The caller is responsible for persisting the vocabulary and the suggestion list afterwards.
func ReadDecisions ¶
ReadDecisions returns every decision in the log, in order. Used by the CLI's `concepts log` view; the file itself stays appendable for new entries.
func RejectDecision ¶
func RejectDecision(list *SuggestionList, id, by, reason string) (Decision, error)
RejectDecision marks a suggestion rejected and returns the audit-log entry.
type Drift ¶
type Drift struct {
NearDuplicates []NearDuplicate `json:"near_duplicates,omitempty"`
Singletons []Singleton `json:"singletons,omitempty"`
}
Drift is the detector's report.
func DetectDrift ¶
func DetectDrift(beads []BeadConcepts, opts DriftOpts) Drift
DetectDrift reads bead concepts and returns the current drift state. Pure: same input → same output, no mutation.
Drifters (semantic neighbor walking) live in gm-s47n.3 — the source analysis abstraction is the right place for embedding-based work, not this co-occurrence-only detector. This function ships the two signal types the bead description called out as concrete (.7.2).
type DriftOpts ¶
type DriftOpts struct {
// NearDuplicateJaccard is the minimum Jaccard similarity a pair
// of terms must share before the detector flags them as
// near-duplicates. Default 0.6.
NearDuplicateJaccard float64
// NearDuplicateUseRatio guards against flagging a pair where
// one term is heavily used and the other is a singleton — the
// usage profiles must be comparable. min(|a|,|b|)/max(|a|,|b|).
// Default 0.5.
NearDuplicateUseRatio float64
// SingletonDormantDays is how long after a bead's ClosedAt a
// singleton-on-that-bead must wait before the detector emits a
// delete suggestion. Default 90 (per spec §6.2). Set to 0 to
// disable the dormant gate (every singleton becomes a suggestion).
SingletonDormantDays int
// SingletonMaxUses is the inclusive upper bound on the bead-count
// for a term to qualify as a singleton candidate. Default 2 — the
// spec's "fewer than 3 beads". Set to 1 for the strict "exactly
// one bead" interpretation.
SingletonMaxUses int
// Now is the reference time for dormant calculations. Tests
// inject a fixed time so cases stay deterministic; production
// leaves it zero (defaults to time.Now().UTC()).
Now time.Time
}
DriftOpts tunes the detector's thresholds. Defaults match the values documented in docs/design/work-planning.md §6.2.
func DefaultDriftOpts ¶
func DefaultDriftOpts() DriftOpts
DefaultDriftOpts is the policy that ships. Threshold values target the intent of work-planning.md §6.2 (cosine ≥ 0.85 near-dups, singletons "< 3 beads after 90 days") translated to the Jaccard + dormant-only metrics this detector ships:
- Jaccard 0.7 lands at a similar precision to cosine 0.85 on the small-sparse-set distribution beads produce in practice.
- Singleton dormant 90d matches the spec's "after 90 days" gate. Use-count < 3 (rather than == 1) is enforced via [SingletonMaxUses].
type FixtureTaxonomySource ¶
type FixtureTaxonomySource struct{}
FixtureTaxonomySource emits the top-level subdirectory names of testing/e2e/specs/. The e2e library has already validated that each tier names a real surface (smoke / chrome / drawers / grid / realtime / etc.); reusing that taxonomy gives the concept set a language operators are already fluent in.
func (FixtureTaxonomySource) Name ¶
func (FixtureTaxonomySource) Name() string
type GoPackagesSource ¶
type GoPackagesSource struct{}
GoPackagesSource walks internal/ + cmd/ under root and emits a candidate per unique Go package name. Internal package names are the most stable signal of "what a contributor calls a thing" — a directory whose package is named `concepts` is observably about concepts whether or not the operator remembered to label it.
func (GoPackagesSource) Name ¶
func (GoPackagesSource) Name() string
type MemoryStore ¶
type MemoryStore struct {
// contains filtered or unexported fields
}
MemoryStore is the in-memory BeadConceptStore. Production-grade — the CLI uses it for dry-runs and the test suite uses it everywhere. The historical-rewrite math is the same regardless of which store sits behind the interface.
func NewMemoryStore ¶
func NewMemoryStore() *MemoryStore
NewMemoryStore returns an empty store. Callers seed via Set.
func (*MemoryStore) List ¶
func (s *MemoryStore) List(_ context.Context) ([]BeadConcepts, error)
List implements BeadConceptStore.
func (*MemoryStore) Set ¶
Set implements BeadConceptStore.
type NearDuplicate ¶
type NearDuplicate struct {
A string `json:"a"`
B string `json:"b"`
Jaccard float64 `json:"jaccard"`
UsesA int `json:"uses_a"`
UsesB int `json:"uses_b"`
}
NearDuplicate flags a pair of terms whose co-occurrence pattern suggests they're being used interchangeably.
type RoutePrefixesSource ¶
type RoutePrefixesSource struct{}
RoutePrefixesSource extracts the top-level UI route names the SPA already exposes — every `<Route path="/foo" ...>` literal in web/src/App.tsx becomes a candidate. Routes are user-facing surfaces the operator already named, which makes them excellent concept seeds.
func (RoutePrefixesSource) Name ¶
func (RoutePrefixesSource) Name() string
type Singleton ¶
type Singleton struct {
Term string `json:"term"`
BeadID string `json:"bead_id"`
ClosedAt *time.Time `json:"closed_at,omitempty"`
DormantFor int `json:"dormant_days,omitempty"`
}
Singleton flags a term used on exactly one bead. Carries that bead's id + close timestamp so the operator can decide whether the concept ever generalized.
type Suggestion ¶
type Suggestion struct {
ID string `json:"id"`
Kind SuggestionKind `json:"kind"`
From string `json:"from,omitempty"`
To string `json:"to,omitempty"`
Reason string `json:"reason"`
Source string `json:"source"` // "drift:near-duplicate" | "drift:singleton" | "operator"
CreatedAt time.Time `json:"created_at"`
Status SuggestionStatus `json:"status"`
}
Suggestion is a proposed vocabulary change. The drift detector emits these (status=pending); the operator approves or rejects; the apply path materializes approved changes through the vocabulary + the bead store.
func SuggestionsFromDrift ¶
func SuggestionsFromDrift(d Drift, existing []Suggestion) []Suggestion
SuggestionsFromDrift converts a drift report into pending suggestions. Idempotent against the existing list: a near-duplicate that's already in the queue (same Kind + From + To, regardless of order) doesn't get a second entry.
type SuggestionKind ¶
type SuggestionKind string
SuggestionKind enumerates the closed set of changes the queue can surface. To add a new kind, declare it here, handle it in ApplyDecision, and add the corresponding Vocabulary / BeadConceptStore handler.
const (
	KindMerge  SuggestionKind = "merge"
	KindRename SuggestionKind = "rename"
	KindDelete SuggestionKind = "delete"
)
type SuggestionList ¶
type SuggestionList struct {
Suggestions []Suggestion `json:"suggestions"`
}
LoadSuggestions / SaveSuggestions mirror the vocabulary helpers.
func LoadSuggestions ¶
func LoadSuggestions(workspace string) (*SuggestionList, error)
func (*SuggestionList) Add ¶
func (l *SuggestionList) Add(s Suggestion) bool
Add appends a suggestion. No-op when the (kind, from, to) tuple is already pending or approved — rejected suggestions don't block a re-proposal because the operator's earlier "no" was about that instance, not the entire idea.
func (*SuggestionList) Approved ¶
func (l *SuggestionList) Approved() []Suggestion
Approved / Rejected accessors mirror Pending.
func (*SuggestionList) Find ¶
func (l *SuggestionList) Find(id string) (*Suggestion, bool)
Find returns the suggestion with the given id (and a found bool).
func (*SuggestionList) Mark ¶
func (l *SuggestionList) Mark(id string, status SuggestionStatus) error
Mark updates a suggestion's status. Returns ErrSuggestionNotFound when the id doesn't match. Only pending suggestions can transition; re-marking a decided suggestion is an error so an operator can't silently flip a historical decision.
func (*SuggestionList) Pending ¶
func (l *SuggestionList) Pending() []Suggestion
Pending returns the slice of pending suggestions in stable order (by Kind, then From, then To). Callers wanting all statuses iterate the SuggestionList directly.
func (*SuggestionList) Rejected ¶
func (l *SuggestionList) Rejected() []Suggestion
type SuggestionStatus ¶
type SuggestionStatus string
SuggestionStatus tracks the operator's decision lifecycle.
const (
	StatusPending  SuggestionStatus = "pending"
	StatusApproved SuggestionStatus = "approved"
	StatusRejected SuggestionStatus = "rejected"
)
type Term ¶
type Term struct {
Name string `json:"name"`
Source string `json:"source"`
Description string `json:"description,omitempty"`
// Aliases are names that merged into this term. Kept on the
// surviving term so lookups for the retired name still resolve
// without walking the suggestions log.
Aliases []string `json:"aliases,omitempty"`
CreatedAt time.Time `json:"created_at"`
UpdatedAt time.Time `json:"updated_at"`
// Retired terms stay in the vocabulary so historical rewrites
// can find them; lookups for active terms filter via [Vocabulary.Active].
Retired bool `json:"retired,omitempty"`
RetiredAt *time.Time `json:"retired_at,omitempty"`
}
Term is one entry in the controlled vocabulary. Names are normalized lower-kebab-case so a bead carrying "React-Query" matches a vocabulary term "react-query".
type Vocabulary ¶
type Vocabulary struct {
Terms []Term `json:"terms"`
}
Vocabulary is the closed set of terms, stably ordered by Name. The in-memory shape mirrors the on-disk vocabulary.json so the file stays diff-friendly for review in beads dolt commits.
func LoadVocabulary ¶
func LoadVocabulary(workspace string) (*Vocabulary, error)
LoadVocabulary reads vocabulary.json. Returns an empty Vocabulary (not an error) when the file doesn't exist — a fresh workspace's first read is "no terms yet", which the bootstrap path handles.
func (*Vocabulary) Active ¶
func (v *Vocabulary) Active() []Term
Active returns just the non-retired terms, copied so callers can't mutate vocabulary state.
func (*Vocabulary) Add ¶
func (v *Vocabulary) Add(t Term) (*Term, bool)
Add inserts a term, returning the inserted term and a bool that reports whether it was new. Re-adding an existing name is a no-op that returns the existing term — bootstrap sources can run multiple times without piling up duplicates.
func (*Vocabulary) Find ¶
func (v *Vocabulary) Find(name string) (*Term, bool)
Find returns the term with the given canonical name (or any alias), and a bool reporting whether it was found. Includes retired terms — historical rewrites need them.
func (*Vocabulary) Merge ¶
func (v *Vocabulary) Merge(from, to string) (*Term, error)
Merge folds the `from` term into `to`: from's name is added as an alias on to, from is retired. Both terms must already exist in the vocabulary. Returns the surviving term and an error when either name is missing.
Merge is the vocabulary-level half of the rewrite pipeline; the historical bead rewrite is ApplyMerge over a BeadConceptStore.
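The alias bookkeeping behind Merge can be sketched on a reduced term type. Everything here is hypothetical: term, vocab, merge, and in particular resolve, which prefers active terms when a name matches both a retired term and an alias (the documented Find also returns retired terms, and its precedence is not specified).

```go
package main

import "fmt"

// term is a hypothetical, reduced stand-in for Term.
type term struct {
	Name    string
	Aliases []string
	Retired bool
}

type vocab struct{ terms []term }

func (v *vocab) byName(name string) *term {
	for i := range v.terms {
		if v.terms[i].Name == name {
			return &v.terms[i]
		}
	}
	return nil
}

// merge records from as an alias on to and retires from.
func (v *vocab) merge(from, to string) (*term, error) {
	f, t := v.byName(from), v.byName(to)
	if f == nil || t == nil || f == t {
		return nil, fmt.Errorf("merge %q into %q: both terms must exist and differ", from, to)
	}
	t.Aliases = append(t.Aliases, from)
	f.Retired = true
	return t, nil
}

// resolve returns the active term matching name directly or via an alias.
func (v *vocab) resolve(name string) *term {
	for i := range v.terms {
		t := &v.terms[i]
		if t.Retired {
			continue
		}
		if t.Name == name {
			return t
		}
		for _, a := range t.Aliases {
			if a == name {
				return t
			}
		}
	}
	return nil
}

func main() {
	v := &vocab{terms: []term{{Name: "auth"}, {Name: "login"}}}
	if _, err := v.merge("login", "auth"); err != nil {
		panic(err)
	}
	fmt.Println(v.resolve("login").Name) // the retired name resolves to the survivor
}
```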
func (*Vocabulary) Rename ¶
func (v *Vocabulary) Rename(from, to string) (*Term, error)
Rename swaps a term's canonical name in place. The old name is preserved as an alias so beads carrying the old name still resolve.
func (*Vocabulary) Retire ¶
func (v *Vocabulary) Retire(name string) bool
Retire marks the named term as retired and stamps RetiredAt. No-op when the term is already retired; returns false when the name matches no term at all.
func (*Vocabulary) Sort ¶
func (v *Vocabulary) Sort()
Sort sorts the vocabulary's terms by name in place. Storage layer calls this before serializing so the on-disk order is stable.
type VocabularyError ¶
VocabularyError is the typed error returned by mutator methods so callers can branch on Reason without string parsing.
func (*VocabularyError) Error ¶
func (e *VocabularyError) Error() string