ckgalign

package
v0.0.0-...-b99cd60 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 21, 2026 License: AGPL-3.0 Imports: 5 Imported by: 0

Documentation

Overview

Package ckgalign builds an in-memory index from a CKG SQLite store (graph.db) and resolves each CKV chunk's CKGNodeID by matching (file_path, start_line) — exact start-line preferred, then smallest containing line range.

Used by `ckv build --ckg <dir>` to populate chunks.ckg_node_id, the 1:1 alignment that cks composer relies on to disambiguate same-named symbols across packages (e.g. eight different `Finalize` methods).

One-shot use: Load() reads every alignment-candidate node row into RAM (~25 MB for a 256k-node graph) and closes the DB handle before returning. Lookup() is O(N_per_file), which is fine for the chunk-emit loop because each per-file slice is small (on the order of tens of symbols on average).

Index

Constants

const MinOverlapLines = 2

MinOverlapLines is the minimum number of lines a chunk range and a ckg node range must share for step-3 (range overlap) to claim a match. A single shared line is almost always a boundary artifact — the chunk's closing `}` lying on the same line as the next function's opening brace. Requiring >= 2 shared lines avoids the Go method-body family mismatch (IsHomestead chunk binding to IsDAOFork node, etc.) while preserving the substantive-overlap case the step was designed for.
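
The threshold can be illustrated with a minimal sketch. The helper names `sharedLines` and `overlapQualifies` are hypothetical and not part of this package; the sketch only assumes that both ranges are inclusive.

```go
package main

import "fmt"

// minOverlapLines mirrors the documented MinOverlapLines value.
const minOverlapLines = 2

// sharedLines counts the lines common to the inclusive ranges
// [aStart, aEnd] and [bStart, bEnd]; it returns 0 when they are disjoint.
func sharedLines(aStart, aEnd, bStart, bEnd int) int {
	lo, hi := aStart, aEnd
	if bStart > lo {
		lo = bStart
	}
	if bEnd < hi {
		hi = bEnd
	}
	if hi < lo {
		return 0
	}
	return hi - lo + 1
}

// overlapQualifies reports whether a chunk and a node share enough lines
// for the range-overlap step to claim a match (hypothetical helper).
func overlapQualifies(chunkStart, chunkEnd, nodeStart, nodeEnd int) bool {
	return sharedLines(chunkStart, chunkEnd, nodeStart, nodeEnd) >= minOverlapLines
}

func main() {
	// The chunk ends on line 20, the same line the next node starts on:
	// one shared line, which is below the threshold.
	fmt.Println(overlapQualifies(10, 20, 20, 30)) // false
	// Six shared lines (15 through 20) is a substantive overlap.
	fmt.Println(overlapQualifies(10, 20, 15, 30)) // true
}
```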

const NearestTolerance = 5

NearestTolerance is the maximum line gap between a chunk and a ckg node to be considered a candidate for the nearest-match step. 5 lines absorbs the common Go pattern where the ckg node covers only the function signature (e.g. `params.ChainConfig.IsConstantinople@:1017-1019`) while the ckv chunk covers the function body (`@:1020-1022`) — a 3-line gap no overlap/containment can catch. Larger tolerances start to introduce false positives for densely-packed const/var declarations.
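
As a rough illustration of the signature/body example above, the sketch below measures the gap from start line to start line, which is an assumption chosen because it reproduces the 3-line figure (1020 − 1017). The helper names `startGap` and `withinTolerance` are hypothetical, and treating the tolerance as inclusive is also an assumption.

```go
package main

import "fmt"

// nearestTolerance mirrors the documented NearestTolerance value.
const nearestTolerance = 5

// startGap returns the absolute distance between a chunk's start line and
// a node's start line. Measuring start to start is an assumption.
func startGap(chunkStart, nodeStart int) int {
	if chunkStart < nodeStart {
		return nodeStart - chunkStart
	}
	return chunkStart - nodeStart
}

// withinTolerance reports whether the node is close enough to be a
// nearest-match candidate (inclusive bound assumed).
func withinTolerance(chunkStart, nodeStart int) bool {
	return startGap(chunkStart, nodeStart) <= nearestTolerance
}

func main() {
	// Node covers the signature at 1017-1019, chunk covers the body at 1020-1022.
	fmt.Println(startGap(1020, 1017), withinTolerance(1020, 1017)) // 3 true
	// A node 13 lines away is outside the tolerance.
	fmt.Println(withinTolerance(1030, 1017)) // false
}
```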

Variables

This section is empty.

Functions

This section is empty.

Types

type Entry

type Entry struct {
	ID        string
	StartLine int
	EndLine   int
	// CanonicalID is ckg's globally-unique, import-path-qualified symbol id
	// (ADR-0001), copied verbatim so a CKV chunk inherits the exact key ckg's
	// FindByCanonicalID resolves on. Empty when the ckg graph predates
	// canonical_id (schema < 1.16) or for symbols ckg leaves unqualified.
	CanonicalID string
}

Entry is one ckg node row keyed by (start_line, end_line).

type Index

type Index struct {
	// contains filtered or unexported fields
}

Index holds every alignment-candidate ckg node grouped by file_path, each file's slice sorted ascending by start_line.
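
The layout described above could look roughly like the following sketch. The unexported fields of Index are not documented, so `nodeRow`, `entry`, and `groupByFile` are hypothetical stand-ins rather than the package's actual representation.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a stand-in for the documented Entry type.
type entry struct {
	ID        string
	StartLine int
	EndLine   int
}

// nodeRow is a hypothetical shape for one row read from the store.
type nodeRow struct {
	FilePath string
	Entry    entry
}

// groupByFile buckets rows by file path and sorts each bucket ascending
// by start line, matching the documented invariant.
func groupByFile(rows []nodeRow) map[string][]entry {
	byFile := make(map[string][]entry)
	for _, r := range rows {
		byFile[r.FilePath] = append(byFile[r.FilePath], r.Entry)
	}
	for _, entries := range byFile {
		es := entries // shares the backing array, so the sort is in place
		sort.SliceStable(es, func(i, j int) bool {
			return es[i].StartLine < es[j].StartLine
		})
	}
	return byFile
}

func main() {
	idx := groupByFile([]nodeRow{
		{"a.go", entry{ID: "second", StartLine: 40, EndLine: 60}},
		{"b.go", entry{ID: "only", StartLine: 5, EndLine: 9}},
		{"a.go", entry{ID: "first", StartLine: 10, EndLine: 30}},
	})
	fmt.Println(len(idx), idx["a.go"][0].ID, idx["a.go"][1].ID) // 2 first second
}
```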

func Load

func Load(ckgPath string) (*Index, error)

Load opens <ckgPath>/graph.db read-only and indexes alignment-eligible node rows. Returns a populated *Index ready for Lookup. The DB handle is closed before return.

"Alignment-eligible" excludes pseudo nodes whose file_path does not describe a normal source span — `file:`/`hunk:`/`import:` prefixes in ckg's qualified_name space, and rows with empty file_path or non-positive start_line. Everything else (Function, Method, Type, Constant, Variable, Interface, Struct, Field, etc.) is a candidate.

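The eligibility rule above can be sketched as a predicate. This is a minimal illustration, assuming the pseudo-node prefixes are tested against the node's qualified name; the function name `alignmentEligible` and its parameters are hypothetical, and the real filter may be applied in SQL instead.

```go
package main

import (
	"fmt"
	"strings"
)

// pseudoPrefixes lists the qualified-name prefixes the documentation
// names as pseudo nodes.
var pseudoPrefixes = []string{"file:", "hunk:", "import:"}

// alignmentEligible reports whether a node row describes a normal source
// span and may therefore take part in alignment.
func alignmentEligible(qualifiedName, filePath string, startLine int) bool {
	if filePath == "" || startLine <= 0 {
		return false
	}
	for _, p := range pseudoPrefixes {
		if strings.HasPrefix(qualifiedName, p) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(alignmentEligible("params.ChainConfig.IsHomestead", "params/config.go", 900)) // true
	fmt.Println(alignmentEligible("import:fmt", "params/config.go", 3))                       // false
	fmt.Println(alignmentEligible("params.Version", "", 12))                                  // false
}
```
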
func (*Index) EntryCount

func (ix *Index) EntryCount() int

EntryCount returns the total entry count across all files (diagnostic / footprint emission).

func (*Index) FileCount

func (ix *Index) FileCount() int

FileCount returns the number of unique files indexed (diagnostic / footprint emission).

func (*Index) Lookup

func (ix *Index) Lookup(filePath string, startLine, endLine int) string

Lookup returns the ckg node id that best matches (filePath, startLine, endLine) by trying four progressively looser strategies in order:

  1. exact start_line match (smallest range tiebreak),
  2. node whose [start_line, end_line] contains chunk startLine (smallest range tiebreak),
  3. range overlap: chunk [startLine, endLine] and node [s, e] share at least MinOverlapLines lines (smallest-gap tiebreak, equivalent to smallest |s − startLine| when both ranges are valid),
  4. nearest non-overlapping node within NearestTolerance lines, picking the smallest gap.

endLine == 0 (or endLine < startLine) is treated as endLine = startLine, so older callers and zero-span chunks can still match through the exact and containing-range steps.

filePath must have the same form that ckg stored, that is, relative to the source root. When several nodes match at the same step, the smallest-range tiebreak picks the inner Method/Field node over the enclosing Type node, so a chunk emitted at a method body line maps to the method, not its enclosing type.

Lookup returns the matched ckg node ID, or "" when nothing matches. It is a thin wrapper over LookupEntry, kept for callers that only need the id.
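
The ladder can be sketched over a single file's entries as follows. This is an illustration under stated assumptions, not the package's implementation: `lookupIn`, `entry`, and `shared` are hypothetical names, the gap is measured start line to start line, the overlap step uses the MinOverlapLines threshold described under Constants, and ties keep the first candidate encountered.

```go
package main

import "fmt"

const (
	minOverlapLines  = 2 // mirrors MinOverlapLines
	nearestTolerance = 5 // mirrors NearestTolerance
)

// entry is a stand-in for the documented Entry type.
type entry struct {
	ID        string
	StartLine int
	EndLine   int
}

func abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

// shared counts the lines common to the chunk [cs, ce] and node e.
func shared(e entry, cs, ce int) int {
	lo, hi := cs, ce
	if e.StartLine > lo {
		lo = e.StartLine
	}
	if e.EndLine < hi {
		hi = e.EndLine
	}
	if hi < lo {
		return 0
	}
	return hi - lo + 1
}

// lookupIn tries each step in order and returns the lowest-scoring
// candidate of the first step that matches anything, or nil.
func lookupIn(entries []entry, startLine, endLine int) *entry {
	if endLine < startLine { // also covers endLine == 0
		endLine = startLine
	}
	span := func(e entry) int { return e.EndLine - e.StartLine }
	gap := func(e entry) int { return abs(e.StartLine - startLine) }
	steps := []struct {
		ok    func(e entry) bool
		score func(e entry) int
	}{
		// 1. exact start line, smallest range wins
		{func(e entry) bool { return e.StartLine == startLine }, span},
		// 2. node contains the chunk start line, smallest range wins
		{func(e entry) bool { return e.StartLine <= startLine && startLine <= e.EndLine }, span},
		// 3. substantive overlap, smallest gap wins
		{func(e entry) bool { return shared(e, startLine, endLine) >= minOverlapLines }, gap},
		// 4. nearest non-overlapping node within tolerance, smallest gap wins
		{func(e entry) bool {
			return shared(e, startLine, endLine) == 0 && gap(e) <= nearestTolerance
		}, gap},
	}
	for _, step := range steps {
		var best *entry
		for i := range entries {
			if step.ok(entries[i]) && (best == nil || step.score(entries[i]) < step.score(*best)) {
				best = &entries[i]
			}
		}
		if best != nil {
			return best
		}
	}
	return nil
}

func main() {
	file := []entry{
		{ID: "Type", StartLine: 10, EndLine: 50},
		{ID: "Method", StartLine: 20, EndLine: 30},
		{ID: "Sig", StartLine: 100, EndLine: 102},
	}
	fmt.Println(lookupIn(file, 22, 24).ID)       // Method
	fmt.Println(lookupIn(file, 103, 105).ID)     // Sig
	fmt.Println(lookupIn(file, 200, 210) == nil) // true
}
```

In the first call both the type and the method contain line 22, and the smaller range wins. In the second call nothing overlaps, so the signature-only node three lines above is picked by the nearest-match step.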

func (*Index) LookupEntry

func (ix *Index) LookupEntry(filePath string, startLine, endLine int) *Entry

LookupEntry returns the full matched ckg node (id + canonical_id) or nil. The matching ladder is unchanged from the original Lookup; exposing the Entry lets callers copy both the node ID and the canonical_id in one pass.
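
The wrapper relationship between the two methods might be sketched like this. The `index` and `entry` types are hypothetical stand-ins, and the matcher is reduced to an exact start-line comparison purely to keep the example short; only the nil-to-empty-string mapping is the point.

```go
package main

import "fmt"

// entry is a stand-in for the documented Entry type.
type entry struct {
	ID          string
	CanonicalID string
	StartLine   int
}

// index is a hypothetical stand-in for Index.
type index struct {
	byFile map[string][]entry
}

// lookupEntry returns the matched entry or nil. Only an exact start-line
// match is implemented here; the real ladder has four steps.
func (ix *index) lookupEntry(filePath string, startLine int) *entry {
	entries := ix.byFile[filePath]
	for i := range entries {
		if entries[i].StartLine == startLine {
			return &entries[i]
		}
	}
	return nil
}

// lookup is the thin wrapper: it maps a nil entry to "" and otherwise
// returns only the node ID.
func (ix *index) lookup(filePath string, startLine int) string {
	if e := ix.lookupEntry(filePath, startLine); e != nil {
		return e.ID
	}
	return ""
}

func main() {
	ix := &index{byFile: map[string][]entry{
		"core/state.go": {{ID: "n1", CanonicalID: "example.com/core.State.Finalize", StartLine: 42}},
	}}
	if e := ix.lookupEntry("core/state.go", 42); e != nil {
		fmt.Println(e.ID, e.CanonicalID) // both fields from one pass
	}
	fmt.Printf("%q\n", ix.lookup("core/state.go", 7)) // ""
}
```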
