analysis

package
v1.4.0
Published: May 4, 2026 License: Apache-2.0 Imports: 21 Imported by: 0

Documentation

Overview

Package analysis implements the Phase 0 analysis engine for pcke.

It provides file tree scanning, git intelligence via go-git, secrets filtering, and path-based classification heuristics. The scanner orchestrates these components and persists results to the kdb store via kdb.DB.Update transactions.

Concurrency: a single Scanner instance should be used per scan invocation. The underlying kdb.DB manages its own locking.

See PRD §5.5 for the analysis pipeline design.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CheckBranchMismatch added in v0.2.0

func CheckBranchMismatch(ctx context.Context, db *kdb.DB, root string) string

CheckBranchMismatch reads the stored scan branch from the knowledge base and compares it against the current HEAD branch. Returns a non-empty warning message if they differ, or "" if they match (or if no prior scan exists).
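
The return contract described above can be sketched without the knowledge base. This is a minimal illustration, not the package's implementation: branchMismatchWarning is a hypothetical helper, the two branch names are passed in directly instead of being read from kdb and git, and the warning text is invented.

```go
package main

import "fmt"

// branchMismatchWarning is a hypothetical stand-in for the comparison step.
// stored is the branch recorded by the last scan ("" means no prior scan),
// current is the branch HEAD points at.
func branchMismatchWarning(stored, current string) string {
	if stored == "" || stored == current {
		return "" // no prior scan, or branches match
	}
	// The message wording is an assumption made for this sketch.
	return fmt.Sprintf("knowledge base was scanned on %q but HEAD is on %q", stored, current)
}

func main() {
	fmt.Println(branchMismatchWarning("main", "feature/x"))
}
```

Because an empty string means "nothing to report", a caller can print the result only when it is non-empty.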

func DetectModule

func DetectModule(relPath string) string

DetectModule infers a module name from a file path. It returns the first directory component under well-known roots (internal/, pkg/, lib/, src/, cmd/) or the first directory component if no known root is found. For files at the repository root, it returns "(root)".
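
One way to read the rule above, as a rough sketch. detectModule here is a hypothetical reimplementation, not the package's code. It assumes slash-separated paths, that a well-known root only counts when it is the first path component, and that a file sitting directly in a known root (for example internal/doc.go) falls back to the root's own name.

```go
package main

import (
	"fmt"
	"strings"
)

// knownRoots lists the well-known roots named in the documentation.
var knownRoots = map[string]bool{
	"internal": true, "pkg": true, "lib": true, "src": true, "cmd": true,
}

// detectModule is a hypothetical sketch of the documented heuristic.
func detectModule(relPath string) string {
	parts := strings.Split(relPath, "/")
	dirs := parts[:len(parts)-1] // drop the file name
	if len(dirs) == 0 {
		return "(root)"
	}
	// Assumption: a known root only applies as the first component.
	if knownRoots[dirs[0]] && len(dirs) > 1 {
		return dirs[1]
	}
	return dirs[0]
}

func main() {
	for _, p := range []string{"internal/analysis/scanner.go", "docs/guide.md", "main.go"} {
		fmt.Printf("%s -> %s\n", p, detectModule(p))
	}
}
```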

func IsSecretPath

func IsSecretPath(relPath string) bool

IsSecretPath reports whether the given relative path matches a secret-file pattern and should be excluded from analysis.
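
The documentation does not list the secret-file patterns. The sketch below shows one plausible shape for such a check using glob matching on the file name. The pattern list and the isSecretPath helper are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"path"
)

// secretPatterns is an illustrative list; the real package's patterns
// are not documented here.
var secretPatterns = []string{".env", ".env.*", "*.pem", "*.key", "id_rsa"}

// isSecretPath is a hypothetical sketch: it matches only the base name
// of a slash-separated relative path against each glob pattern.
func isSecretPath(relPath string) bool {
	base := path.Base(relPath)
	for _, pattern := range secretPatterns {
		if ok, _ := path.Match(pattern, base); ok {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isSecretPath("deploy/.env.local"), isSecretPath("cmd/pcke/main.go"))
}
```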

func Language

func Language(ext string) string

Language returns the detected programming language from a file extension. Returns an empty string for unrecognised extensions.
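
A lookup table is the simplest way to get this behaviour. The sketch below is not the package's table: the language helper, the language labels, and the assumption that ext carries a leading dot (as filepath.Ext returns it) are all illustrative.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// languages is an illustrative subset; labels are assumptions.
var languages = map[string]string{
	".go":   "Go",
	".py":   "Python",
	".js":   "JavaScript",
	".ts":   "TypeScript",
	".java": "Java",
}

// language is a hypothetical sketch. A missing map key yields the zero
// value, so unrecognised extensions return "".
func language(ext string) string {
	return languages[strings.ToLower(ext)]
}

func main() {
	fmt.Println(language(filepath.Ext("cmd/pcke/main.go")))
}
```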

func RedactSecrets

func RedactSecrets(content string) (string, bool)

RedactSecrets replaces secret values in content with "[REDACTED]". It returns the redacted content and true if any redaction was applied.
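
The (string, bool) return shape can be illustrated with a single regular expression. This is a sketch under stated assumptions: redactSecrets and its pattern are hypothetical, it only recognises key/value assignments for three key names, and the real package may detect secrets quite differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// secretAssignment matches `key = value` or `key: value` for a few key
// names. The key list is an assumption made for this sketch.
var secretAssignment = regexp.MustCompile(
	`(?i)\b(password|api_key|token)(\s*[:=]\s*)("[^"]*"|\S+)`)

// redactSecrets keeps the key and separator and replaces only the value.
// The bool reports whether the content changed.
func redactSecrets(content string) (string, bool) {
	out := secretAssignment.ReplaceAllString(content, "${1}${2}[REDACTED]")
	return out, out != content
}

func main() {
	out, changed := redactSecrets("password = hunter2")
	fmt.Println(out, changed)
}
```

Returning the flag alongside the content lets a caller count redactions without comparing strings itself.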

Types

type AuthorCommits

type AuthorCommits struct {
	Author  string
	Commits int
}

AuthorCommits pairs an author name with their commit count for a module.

type EvolutionLog

type EvolutionLog struct {
	ID         string    `json:"id"`
	NodeID     string    `json:"node_id"`
	CommitHash string    `json:"commit_hash"`
	ChangeType string    `json:"change_type"`
	Author     string    `json:"author"`
	Timestamp  time.Time `json:"timestamp"`
}

EvolutionLog records a change event for a knowledge node.

type FileClass

type FileClass int

FileClass categorises a file by its role in the project.

const (
	// ClassUnknown indicates the file role could not be determined.
	ClassUnknown FileClass = iota
	// ClassSource marks production source code.
	ClassSource
	// ClassTest marks test files.
	ClassTest
	// ClassEntryPoint marks CLI or application entry points.
	ClassEntryPoint
	// ClassAPI marks API/route definition files.
	ClassAPI
	// ClassDataLayer marks model/entity files.
	ClassDataLayer
	// ClassInfra marks infrastructure-as-code files.
	ClassInfra
	// ClassConfig marks configuration files.
	ClassConfig
	// ClassDoc marks documentation files.
	ClassDoc
	// ClassAsset marks non-code assets (images, fonts, etc.).
	ClassAsset
)

func Classify

func Classify(relPath string) FileClass

Classify determines the FileClass for a file based on its path. The relPath should be slash-separated and relative to the repository root.

func (FileClass) String

func (c FileClass) String() string

String returns a human-readable label for the classification.
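
To show how an iota-based class, its String method, and a path-based classifier fit together, here is a reduced sketch. The fileClass type, the three rules, and the label strings are assumptions. The package's actual rules and labels are not documented on this page.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// fileClass is a cut-down, hypothetical version of FileClass.
type fileClass int

const (
	classUnknown fileClass = iota
	classSource
	classTest
	classDoc
)

// String returns a label; the exact strings are assumptions.
func (c fileClass) String() string {
	switch c {
	case classSource:
		return "source"
	case classTest:
		return "test"
	case classDoc:
		return "doc"
	default:
		return "unknown"
	}
}

// classify applies a few illustrative rules. The test rule comes first
// so that _test.go files are not reported as plain source.
func classify(relPath string) fileClass {
	switch {
	case strings.HasSuffix(relPath, "_test.go"):
		return classTest
	case path.Ext(relPath) == ".md":
		return classDoc
	case path.Ext(relPath) == ".go":
		return classSource
	default:
		return classUnknown
	}
}

func main() {
	fmt.Println(classify("README.md"), classify("internal/analysis/scan_test.go"))
}
```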

type FileStats

type FileStats struct {
	TotalCommits   int
	RecentCommits  int // Commits in the last 90 days.
	Stability      float64
	LastAuthor     string
	LastChangeType string
	LastCommitTime time.Time
}

FileStats holds per-file git statistics.

type GitIntel

type GitIntel struct {
	// contains filtered or unexported fields
}

GitIntel extracts intelligence from a git repository.

func NewGitIntel

func NewGitIntel(dir string) (*GitIntel, error)

NewGitIntel opens the git repository at dir and returns a GitIntel. Returns an error if dir is not inside a git working tree.

func (*GitIntel) Authorship

func (g *GitIntel) Authorship() (map[string][]AuthorCommits, error)

Authorship returns authorship information for all files, grouped by module. The key is the module name (from DetectModule), the value lists authors sorted by commit count (descending).
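
The ordering of each module's author list can be sketched as follows. The authorCommits type mirrors AuthorCommits, while rank and the tie-break by author name are assumptions. The documentation only promises descending commit count.

```go
package main

import (
	"fmt"
	"sort"
)

// authorCommits mirrors the documented AuthorCommits struct.
type authorCommits struct {
	Author  string
	Commits int
}

// rank is a hypothetical helper that turns per-author counts for one
// module into a slice sorted by commit count, highest first.
func rank(counts map[string]int) []authorCommits {
	out := make([]authorCommits, 0, len(counts))
	for author, n := range counts {
		out = append(out, authorCommits{Author: author, Commits: n})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Commits != out[j].Commits {
			return out[i].Commits > out[j].Commits
		}
		// Assumption: break ties by name so the order is deterministic.
		return out[i].Author < out[j].Author
	})
	return out
}

func main() {
	fmt.Println(rank(map[string]int{"ana": 2, "bo": 5, "cy": 2}))
}
```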

func (*GitIntel) CurrentBranch added in v0.2.0

func (g *GitIntel) CurrentBranch() string

CurrentBranch returns the short name of the current branch (e.g., "main"). Returns an empty string if HEAD is detached.

func (*GitIntel) DetectRenames added in v0.2.0

func (g *GitIntel) DetectRenames(sinceHash string) ([]RenameEntry, error)

DetectRenames finds file renames in git history since the given commit hash. If sinceHash is empty, it scans the last 100 commits. Returns renames detected via tree-diff similarity matching (equivalent to git log --follow --diff-filter=R).

func (*GitIntel) FileHistory

func (g *GitIntel) FileHistory(relPath string) (FileStats, error)

FileHistory returns FileStats for the given file path. The path should be relative to the repository root, using forward slashes.

func (*GitIntel) GitIgnoredFiles

func (g *GitIntel) GitIgnoredFiles() (map[string]bool, error)

GitIgnoredFiles returns the set of files that are git-ignored. This uses the worktree status to detect ignored files.

func (*GitIntel) HeadHash

func (g *GitIntel) HeadHash() (string, error)

HeadHash returns the current HEAD commit hash as a hex string.

type KnowledgeNode

type KnowledgeNode struct {
	ID          string    `json:"id"`
	Type        string    `json:"type"`
	Name        string    `json:"name"`
	FilePath    string    `json:"file_path"`
	Language    string    `json:"language"`
	Module      string    `json:"module"`
	Class       string    `json:"class"`
	Source      string    `json:"source,omitempty"`
	Stability   float64   `json:"stability"`
	Status      string    `json:"status"`
	ContentHash string    `json:"content_hash"`
	CreatedAt   time.Time `json:"created_at"`
	UpdatedAt   time.Time `json:"updated_at"`

	// Deep analysis fields (populated when --deep is used).
	Entities []ast.Entity `json:"entities,omitempty"`
	Imports  []ast.Import `json:"imports,omitempty"`
}

KnowledgeNode represents a single file's analysis results persisted in the knowledge base. See PRD §5.2.

type Relation added in v0.2.0

type Relation struct {
	ID           string    `json:"id"`
	SourceNodeID string    `json:"source_node_id"`
	TargetNodeID string    `json:"target_node_id"`
	Type         string    `json:"type"`
	Source       string    `json:"source"`
	CreatedAt    time.Time `json:"created_at"`
}

Relation records a directed edge between two knowledge nodes. See PRD §5.2 — collection: relations.

type RenameEntry added in v0.2.0

type RenameEntry struct {
	OldPath    string
	NewPath    string
	CommitHash string
	Author     string
	Timestamp  time.Time
}

RenameEntry records a detected file rename in git history.

type ScanOption added in v0.2.0

type ScanOption func(*Scanner)

ScanOption configures optional Scanner behaviour.

func WithDeep added in v0.2.0

func WithDeep() ScanOption

WithDeep enables AST-based deep analysis (tree-sitter entity extraction).
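
ScanOption follows the functional options pattern. The sketch below reproduces the pattern with local stand-in types. The scanner struct and its deep field are assumptions, since the real Scanner's fields are unexported.

```go
package main

import "fmt"

// scanner is a hypothetical stand-in for Scanner.
type scanner struct {
	root string
	deep bool
}

// scanOption mirrors the shape of ScanOption: a function that mutates
// the scanner being constructed.
type scanOption func(*scanner)

// withDeep mirrors WithDeep by switching on the deep flag.
func withDeep() scanOption {
	return func(s *scanner) { s.deep = true }
}

// newScanner applies each option in order after setting defaults.
func newScanner(root string, opts ...scanOption) *scanner {
	s := &scanner{root: root}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	fmt.Println(newScanner(".").deep, newScanner(".", withDeep()).deep)
}
```

The variadic parameter keeps the common call short while leaving room for further options without changing the constructor signature.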

type ScanResult

type ScanResult struct {
	NodesCreated      int
	NodesUpdated      int
	NodesDeleted      int
	FilesScanned      int
	FilesSkipped      int
	SecretsFound      int
	EntitiesExtracted int
	RelationsCreated  int
	CommitHash        string
	Duration          time.Duration
}

ScanResult summarises a completed scan.

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner performs file tree analysis and persists results to a kdb database.

func NewScanner

func NewScanner(root string, db *kdb.DB, cfg config.ScanConfig, opts ...ScanOption) (*Scanner, error)

NewScanner creates a Scanner for the repository at root, persisting results into db. The cfg controls redaction and exclusion behaviour. Use WithDeep to enable AST-based entity extraction.

func (*Scanner) LastScanCommit

func (s *Scanner) LastScanCommit(ctx context.Context) string

LastScanCommit returns the commit hash of the last successful scan, or an empty string if no scan has been performed.

func (*Scanner) Scan

func (s *Scanner) Scan(ctx context.Context, full bool) (*ScanResult, error)

Scan performs a full or incremental scan based on the full flag. A full scan rebuilds all nodes. An incremental scan only processes files changed since the last scan commit.

type Watcher added in v0.9.0

type Watcher struct {
	// contains filtered or unexported fields
}

Watcher watches the repository for file changes and triggers scans.

func NewWatcher added in v0.9.0

func NewWatcher(root string, db *kdb.DB, cfg config.ScanConfig, opts WatcherOpts) (*Watcher, error)

NewWatcher creates a file watcher for the repository at root.

func (*Watcher) Run added in v0.9.0

func (w *Watcher) Run(ctx context.Context) error

Run starts the watcher loop. It blocks until ctx is cancelled or Stop is called.

func (*Watcher) ShouldIgnore added in v0.9.0

func (w *Watcher) ShouldIgnore(path string) bool

ShouldIgnore reports whether the watcher should ignore a change to path. Exported for testing.

func (*Watcher) Stop added in v0.9.0

func (w *Watcher) Stop()

Stop signals the watcher to stop.
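
A common way to build a Run that blocks until either the context is cancelled or Stop is called is a select over ctx.Done() and a stop channel. The sketch below shows only that shutdown logic. The watcher type, the use of sync.Once, and the choice of return values (nil after Stop, ctx.Err() after cancellation) are assumptions, not documented behaviour.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// watcher is a hypothetical stand-in for Watcher.
type watcher struct {
	stop chan struct{}
	once sync.Once
}

func newWatcher() *watcher { return &watcher{stop: make(chan struct{})} }

// Stop closes the stop channel; sync.Once makes repeated calls safe.
func (w *watcher) Stop() { w.once.Do(func() { close(w.stop) }) }

// Run blocks until ctx is cancelled or Stop is called.
func (w *watcher) Run(ctx context.Context) error {
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-w.stop:
		return nil
	}
}

// stopThenRun calls Stop twice, then Run, which returns immediately.
func stopThenRun() error {
	w := newWatcher()
	w.Stop()
	w.Stop()
	return w.Run(context.Background())
}

// cancelThenRun runs with an already-cancelled context.
func cancelThenRun() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return newWatcher().Run(ctx)
}

func main() {
	fmt.Println(stopThenRun(), cancelThenRun())
}
```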

type WatcherOpts added in v0.9.0

type WatcherOpts struct {
	Verbose  bool          // Print verbose output.
	Debounce time.Duration // Debounce interval (default 500ms).
	// OnScan is called after each successful scan with the result.
	OnScan func(result *ScanResult)
}

WatcherOpts configures the file watcher.
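
The Debounce field suggests that bursts of file events are collapsed into a single scan. One way to do that is to restart a timer on every event and act only when it expires. This is a sketch, not the watcher's implementation: debounce, burstFires, and the event channel are hypothetical, and only the 500ms default comes from the documentation.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// defaultDebounce matches the documented default interval.
const defaultDebounce = 500 * time.Millisecond

// debounce calls fire once events has been quiet for d.
func debounce(ctx context.Context, events <-chan struct{}, d time.Duration, fire func()) {
	if d <= 0 {
		d = defaultDebounce
	}
	var timer *time.Timer
	var timerC <-chan time.Time // nil until an event arrives
	for {
		select {
		case <-ctx.Done():
			if timer != nil {
				timer.Stop()
			}
			return
		case _, ok := <-events:
			if !ok {
				events = nil // stop reading; a pending timer may still fire
				continue
			}
			if timer != nil {
				timer.Stop()
			}
			timer = time.NewTimer(d) // restart the quiet period
			timerC = timer.C
		case <-timerC:
			timerC = nil
			fire()
		}
	}
}

// burstFires sends three back-to-back events and reports how often fire ran.
func burstFires() int {
	ctx, cancel := context.WithCancel(context.Background())
	events := make(chan struct{})
	done := make(chan struct{})
	fired := 0
	go func() {
		defer close(done)
		debounce(ctx, events, 20*time.Millisecond, func() { fired++ })
	}()
	for i := 0; i < 3; i++ {
		events <- struct{}{}
	}
	time.Sleep(200 * time.Millisecond)
	cancel()
	<-done // wait for the goroutine before reading fired
	return fired
}

func main() {
	fmt.Println(burstFires())
}
```

In a watcher, fire would trigger the scan and then invoke the OnScan callback with its result.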

Directories

Path Synopsis
Package annotations extracts @pcke-rule and @pcke-lesson annotations from source code comments across supported languages (Go, Python, JavaScript, TypeScript, Java).
Package ast provides tree-sitter powered AST analysis for source code entity extraction.
