inspector

package
v0.2.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 21, 2026 License: MIT Imports: 34 Imported by: 0

Documentation

Overview

Package inspector analyzes source trees and extracts code-quality metrics aimed at finding where to improve a codebase.

For each supported file it reports a code/comment/blank line breakdown, import and variable counts, and, for every function, cyclomatic complexity, cognitive complexity, nesting depth, parameter count and a Maintainability Index. It also computes file-level Halstead measures, git churn and a hotspot score (complexity * churn), detects duplicate code across files, and builds an import dependency graph with fan-in/fan-out and cycle detection.

Go is parsed with the standard library go/ast. A broad set of other languages (Python, JavaScript/JSX, TypeScript/TSX, Rust, Java, C, C++, C#, Ruby, PHP, Bash, Scala, CSS, HTML, JSON) is parsed with tree-sitter and bundled by default. Any additional tree-sitter grammar can be added at startup with RegisterLanguage; it is analyzed by a generic adapter that auto-derives its metric hints by introspecting the grammar's node-kind and field vocabulary (Python/JS/TS use higher-fidelity hand-tuned specs). Because the tree-sitter grammars are C, building any program that imports this package requires cgo: set CGO_ENABLED=1 and have a C compiler (gcc or clang) on PATH.

The simplest entry point is Inspect, which runs everything and returns a single Report:

report, err := inspector.Inspect("./path/to/project", inspector.Options{})
if err != nil {
	log.Fatal(err)
}
for _, h := range report.Summary.TopHotspots {
	fmt.Printf("%s\thot=%.0f cyc=%d churn=%d\n", h.Path, h.Hotspot, h.Cyclomatic, h.Churn)
}

The individual stages (BuildTree, ComputeChurn, BuildSummary, DetectDuplication, BuildDependencyGraph) are also exported for callers that need finer control.

Constants

const DefaultDuplicationMinTokens = 50

DefaultDuplicationMinTokens is the default clone window size (in normalized tokens) for duplicate-code detection.

Variables

This section is empty.

Functions

func BuildExcludeSet

func BuildExcludeSet(includeDefaults bool, extras []string) map[string]struct{}

BuildExcludeSet constructs a normalized set of excluded directory names.

func ComputeChurn

func ComputeChurn(root *TreeNode, scanRoot string) bool

ComputeChurn annotates each file node with its git commit frequency and a hotspot score (complexity * churn). It returns false when the scan root is not inside a git work tree or git is unavailable, leaving the tree untouched.
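
The hotspot score described above is a plain product, so a file that is complex but never changes, or changes often but is trivial, scores low. The following is a minimal sketch of that arithmetic and of ranking by it. It does not call the package: fileStat, hotspot and rankByHotspot are hypothetical names introduced for illustration, and the numbers are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// fileStat is a hypothetical stand-in for a file node; it carries only the
// two inputs of the hotspot score.
type fileStat struct {
	Path       string
	Cyclomatic int // summed cyclomatic complexity of the file
	Churn      int // number of commits that touched the file
}

// hotspot returns complexity * churn.
func hotspot(f fileStat) float64 {
	return float64(f.Cyclomatic) * float64(f.Churn)
}

// rankByHotspot returns a copy of files sorted by descending hotspot score,
// truncated to at most topN entries.
func rankByHotspot(files []fileStat, topN int) []fileStat {
	out := append([]fileStat(nil), files...)
	sort.SliceStable(out, func(i, j int) bool {
		return hotspot(out[i]) > hotspot(out[j])
	})
	if topN >= 0 && len(out) > topN {
		out = out[:topN]
	}
	return out
}

func main() {
	files := []fileStat{
		{"parser.go", 40, 3},
		{"main.go", 5, 20},
		{"util.go", 12, 12},
	}
	for _, f := range rankByHotspot(files, 2) {
		fmt.Printf("%s\thot=%.0f cyc=%d churn=%d\n", f.Path, hotspot(f), f.Cyclomatic, f.Churn)
	}
}
```

In this sample the moderately complex, frequently edited file outranks both the most complex file and the most frequently edited one.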

func RegisterLanguage added in v0.2.0

func RegisterLanguage(cfg LanguageConfig)

RegisterLanguage adds (or overrides) support for a tree-sitter grammar mapped to one or more file extensions. The grammar is analyzed with the generic heuristic adapter; pass Hints to improve accuracy. It is safe to call at program startup before analysis begins.

func SupportedExtensions added in v0.2.0

func SupportedExtensions() []string

SupportedExtensions returns the registered file extensions, sorted.

func SupportedLanguages added in v0.2.0

func SupportedLanguages() []string

SupportedLanguages returns the registered language names, sorted.

Types

type Config

type Config struct {
	ExcludedDirs    map[string]struct{}
	ExcludePatterns []string
	SupportedOnly   bool
	// AnalyzerWorkers controls per-directory file analysis workers.
	// 0 uses automatic sizing, 1 forces sequential file analysis.
	AnalyzerWorkers int
	// GitChurn enables per-file commit-frequency and hotspot scoring.
	GitChurn bool
}

Config controls traversal and filtering behavior.

type DependencyReport

type DependencyReport struct {
	Nodes            int
	Edges            int
	ExternalImports  int
	MostDependedOn   []DependencyStat // highest fan-in: change here ripples widely
	MostDependencies []DependencyStat // highest fan-out: most fragile
	Cycles           [][]string
}

DependencyReport summarizes the internal import graph. Nodes are files for JavaScript/TypeScript/Python and packages (directories) for Go.

func BuildDependencyGraph

func BuildDependencyGraph(root *TreeNode, scanRoot string, topN int) DependencyReport

BuildDependencyGraph resolves intra-project imports into a graph and computes fan-in/fan-out plus dependency cycles.
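
To make fan-in, fan-out and cycle detection concrete, here is a small sketch over an adjacency map. It is not the package's implementation: graph, fanInOut and findCycle are hypothetical names, and the depth-first search used for cycles is one common technique chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// graph maps a node (a file or package path) to the nodes it imports.
type graph map[string][]string

// fanInOut returns, per node, how many edges point at it (fan-in) and how
// many edges leave it (fan-out).
func fanInOut(g graph) (fanIn, fanOut map[string]int) {
	fanIn, fanOut = map[string]int{}, map[string]int{}
	for from, tos := range g {
		fanOut[from] = len(tos)
		for _, to := range tos {
			fanIn[to]++
		}
	}
	return fanIn, fanOut
}

// findCycle returns one import cycle as a path that starts and ends on the
// same node, or nil when the graph is acyclic.
func findCycle(g graph) []string {
	const (
		unseen = iota
		active // on the current DFS stack
		done
	)
	state := map[string]int{}
	var stack []string
	var visit func(n string) []string
	visit = func(n string) []string {
		state[n] = active
		stack = append(stack, n)
		for _, next := range g[n] {
			switch state[next] {
			case active:
				// Back edge: the cycle is the stack from next onward.
				for i, s := range stack {
					if s == next {
						return append(append([]string(nil), stack[i:]...), next)
					}
				}
			case unseen:
				if c := visit(next); c != nil {
					return c
				}
			}
		}
		stack = stack[:len(stack)-1]
		state[n] = done
		return nil
	}
	// Visit nodes in sorted order so the result is deterministic.
	nodes := make([]string, 0, len(g))
	for n := range g {
		nodes = append(nodes, n)
	}
	sort.Strings(nodes)
	for _, n := range nodes {
		if state[n] == unseen {
			if c := visit(n); c != nil {
				return c
			}
		}
	}
	return nil
}

func main() {
	g := graph{"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"a"}}
	fanIn, fanOut := fanInOut(g)
	fmt.Println("fan-in of a:", fanIn["a"], "fan-out of d:", fanOut["d"])
	fmt.Println("cycle:", findCycle(g))
}
```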

type DependencyStat

type DependencyStat struct {
	Node   string
	FanIn  int
	FanOut int
}

DependencyStat holds the fan-in/fan-out of a single graph node.

type DuplicateBlock

type DuplicateBlock struct {
	Tokens     int
	Lines      int
	FirstPath  string
	FirstStart int
	FirstEnd   int
	OtherPath  string
	OtherStart int
	OtherEnd   int
}

DuplicateBlock is a pair of code regions that share an identical normalized token sequence (variable names and literal values are normalized, so renamed copies still match).

type DuplicationReport

type DuplicationReport struct {
	MinTokens       int
	Blocks          []DuplicateBlock
	TotalBlocks     int
	DuplicatedLines int
}

DuplicationReport summarizes clone detection across a scan.

func DetectDuplication

func DetectDuplication(root *TreeNode, minTokens, topN int) DuplicationReport

DetectDuplication finds duplicated token sequences of at least minTokens across all analyzed files in the tree. topN caps the reported block list.
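
The idea of matching on normalized tokens can be sketched with the standard library alone. The example below assumes Go input and uses go/scanner to tokenize it; normalize and sharesWindow are hypothetical helpers, and the fixed-size window comparison is a simplification for illustration, not the package's algorithm.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
	"strings"
)

// normalize tokenizes Go source and replaces every identifier with "ID" and
// every literal with "LIT". Keywords, operators and punctuation are kept.
func normalize(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // mode 0: comments are skipped
	var out []string
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			return out
		}
		switch {
		case tok == token.IDENT:
			out = append(out, "ID")
		case tok.IsLiteral():
			out = append(out, "LIT")
		default:
			out = append(out, tok.String())
		}
	}
}

// sharesWindow reports whether a and b contain at least one identical run of
// minTokens consecutive normalized tokens.
func sharesWindow(a, b []string, minTokens int) bool {
	if minTokens <= 0 {
		return false
	}
	seen := map[string]bool{}
	for i := 0; i+minTokens <= len(a); i++ {
		seen[strings.Join(a[i:i+minTokens], " ")] = true
	}
	for i := 0; i+minTokens <= len(b); i++ {
		if seen[strings.Join(b[i:i+minTokens], " ")] {
			return true
		}
	}
	return false
}

func main() {
	original := "total := 0\nfor i := 0; i < n; i++ { total += i }"
	renamed := "sum := 1\nfor k := 5; k < m; k++ { sum += k }"
	fmt.Println(sharesWindow(normalize(original), normalize(renamed), 10))
}
```

Because names and literal values collapse to placeholders, the renamed copy still matches the original, which is the behavior DuplicateBlock describes.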

type FileHotspot

type FileHotspot struct {
	Path            string
	Language        string
	Cyclomatic      int
	Churn           int
	Hotspot         float64
	LineCount       int
	Maintainability float64
}

FileHotspot is a ranked file entry in a Summary.

type FileMetrics

type FileMetrics struct {
	Language      string
	LineCount     int // physical line count
	CodeLines     int
	CommentLines  int
	BlankLines    int
	ImportCount   int
	VariableCount int
	// TodoCount counts TODO/FIXME/HACK/XXX markers found in comments.
	TodoCount int
	// Cyclomatic is the sum of every function's cyclomatic complexity.
	Cyclomatic int
	// MaxComplexity is the highest single-function cyclomatic complexity.
	MaxComplexity int
	// Halstead holds the file-level Halstead measures.
	Halstead Halstead
	// Maintainability is a 0-100 Maintainability Index for the file.
	Maintainability float64
	// Imports holds the raw import specifiers found in the file, used to build
	// the dependency graph.
	Imports   []string `json:",omitempty"`
	Functions []FunctionInfo
}

FileMetrics stores per-file metrics extracted by analyzers.

func AnalyzeFile

func AnalyzeFile(path string) (*FileMetrics, bool, error)

AnalyzeFile extracts metrics for supported source files. The returned bool reports whether the file extension is supported.

func (*FileMetrics) CommentRatio

func (m *FileMetrics) CommentRatio() float64

CommentRatio returns comment lines as a fraction of code+comment lines.
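
As a worked example of the ratio, the sketch below reproduces the stated formula outside the package. commentRatio is a hypothetical helper, and returning 0 when there are no code or comment lines is an assumption: the documentation does not say how an empty denominator is handled.

```go
package main

import "fmt"

// commentRatio returns comment lines as a fraction of code+comment lines.
// Blank lines do not enter the calculation. The zero result for an empty
// denominator is an assumption made for this sketch.
func commentRatio(codeLines, commentLines int) float64 {
	total := codeLines + commentLines
	if total == 0 {
		return 0
	}
	return float64(commentLines) / float64(total)
}

func main() {
	// 75 code lines and 25 comment lines: a quarter of the non-blank lines
	// are comments.
	fmt.Println(commentRatio(75, 25))
}
```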

func (*FileMetrics) FunctionCount

func (m *FileMetrics) FunctionCount() int

FunctionCount returns the number of discovered functions in this file.

type FunctionHotspot

type FunctionHotspot struct {
	Path       string
	Name       string
	Line       int
	Cyclomatic int
	Cognitive  int
	LineCount  int
}

FunctionHotspot is a ranked function entry in a Summary.

type FunctionInfo

type FunctionInfo struct {
	Name      string
	Signature string
	Line      int
	LineCount int

	// Cyclomatic is the McCabe cyclomatic complexity: decision points + 1.
	Cyclomatic int
	// Cognitive is an approximation of SonarSource cognitive complexity; it
	// penalizes nesting and is a better "how hard to understand" proxy.
	Cognitive int
	// MaxNesting is the deepest control-flow nesting level inside the function.
	MaxNesting int
	// Params is the number of declared parameters.
	Params int
	// Maintainability is a 0-100 Maintainability Index for the function
	// (higher is better).
	Maintainability float64
}

FunctionInfo stores function metadata discovered in a source file.
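
The "decision points + 1" definition of Cyclomatic can be illustrated with go/ast. The sketch below is an approximation written for this page, not the package's analyzer: which node kinds count as decision points (if, for, range, non-default case and select clauses, && and ||) is an assumption, and function literals are counted toward the enclosing function.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is the source analyzed by main.
const sample = `package p

func classify(n int) string {
	if n < 0 || n > 100 {
		return "out"
	}
	for i := 0; i < n; i++ {
		switch {
		case i%2 == 0:
			continue
		default:
		}
	}
	return "ok"
}

func plain() {}
`

// cyclomatic returns decision points + 1 for each top-level function in src,
// keyed by function name.
func cyclomatic(src string) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}
	out := map[string]int{}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		n := 1 // a function with no branches has complexity 1
		ast.Inspect(fn.Body, func(node ast.Node) bool {
			switch x := node.(type) {
			case *ast.IfStmt, *ast.ForStmt, *ast.RangeStmt:
				n++
			case *ast.CaseClause:
				if x.List != nil { // a default clause has a nil List
					n++
				}
			case *ast.CommClause:
				if x.Comm != nil { // a default clause has a nil Comm
					n++
				}
			case *ast.BinaryExpr:
				if x.Op == token.LAND || x.Op == token.LOR {
					n++
				}
			}
			return true
		})
		out[fn.Name.Name] = n
	}
	return out, nil
}

func main() {
	counts, err := cyclomatic(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("classify:", counts["classify"], "plain:", counts["plain"])
}
```

Under these counting rules classify has four decision points (the if, the ||, the for and one non-default case), so its complexity is 5, while the empty function scores 1.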

type Halstead

type Halstead struct {
	Vocabulary int     // distinct operators + distinct operands
	Length     int     // total operators + total operands
	Volume     float64 // Length * log2(Vocabulary)
	Difficulty float64 // (distinctOperators/2) * (totalOperands/distinctOperands)
	Effort     float64 // Difficulty * Volume
}

Halstead holds the Halstead complexity measures derived from the operators and operands in a unit of code.
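
The formulas in the field comments can be checked by hand. The sketch below applies them to four raw counts; halstead and compute are hypothetical names that mirror the documented fields, and treating an empty vocabulary or zero distinct operands as 0 is an assumption, since the documentation does not cover those cases.

```go
package main

import (
	"fmt"
	"math"
)

// halstead mirrors the documented fields for illustration.
type halstead struct {
	Vocabulary int
	Length     int
	Volume     float64
	Difficulty float64
	Effort     float64
}

// compute derives the measures from distinct/total operator and operand
// counts using the formulas given in the field comments.
func compute(distinctOps, distinctOperands, totalOps, totalOperands int) halstead {
	h := halstead{
		Vocabulary: distinctOps + distinctOperands,
		Length:     totalOps + totalOperands,
	}
	if h.Vocabulary > 0 { // assumption: Volume is 0 for empty input
		h.Volume = float64(h.Length) * math.Log2(float64(h.Vocabulary))
	}
	if distinctOperands > 0 { // assumption: Difficulty is 0 without operands
		h.Difficulty = float64(distinctOps) / 2 *
			float64(totalOperands) / float64(distinctOperands)
	}
	h.Effort = h.Difficulty * h.Volume
	return h
}

func main() {
	// 4 distinct operators used 6 times, 4 distinct operands used 10 times.
	h := compute(4, 4, 6, 10)
	fmt.Printf("vocabulary=%d length=%d volume=%.0f difficulty=%.1f effort=%.0f\n",
		h.Vocabulary, h.Length, h.Volume, h.Difficulty, h.Effort)
}
```

With these counts the vocabulary is 8 and the length is 16, so Volume is 16 * log2(8) = 48, Difficulty is (4/2) * (10/4) = 5, and Effort is 240.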

type LanguageAnalyzer

type LanguageAnalyzer interface {
	Analyze(source []byte) (*FileMetrics, error)
}

LanguageAnalyzer extracts metrics from the source of a single language.

type LanguageConfig added in v0.2.0

type LanguageConfig struct {
	Name       string           // canonical language name, e.g. "rust"
	Extensions []string         // file extensions including the dot, e.g. []string{".rs"}
	Grammar    *sitter.Language // required tree-sitter grammar
	Hints      *LanguageHints   // optional; augments the heuristic defaults
}

LanguageConfig registers a tree-sitter grammar for a set of file extensions. Construct Grammar with sitter.NewLanguage(grammarPackage.Language()).

type LanguageHints added in v0.2.0

type LanguageHints struct {
	FunctionKinds []string // node kinds that define a function/method
	DecisionKinds []string // node kinds counting +1 cyclomatic
	NestingKinds  []string // node kinds that increase nesting (cognitive/depth)
	ImportKinds   []string // node kinds that are imports (counted)
	NameField     string   // field name for a definition's name (default "name")
	ParamsField   string   // field name for the parameter list (default "parameters")
}

LanguageHints lets a caller refine the generic analyzer for a language by naming the relevant tree-sitter node kinds. Any unset field falls back to the curated cross-language defaults.

type Options

type Options struct {
	// Excludes are file or directory names/glob patterns to skip.
	Excludes []string
	// NoDefaultExcludes disables the built-in excludes (.git, node_modules, ...).
	NoDefaultExcludes bool
	// SupportedOnly prunes unsupported files from the tree.
	SupportedOnly bool
	// Workers controls per-directory analysis workers: 0 = auto, 1 = sequential.
	Workers int
	// TopN caps each ranked list (0 = 10).
	TopN int
	// DupMinTokens is the clone window size (0 = DefaultDuplicationMinTokens).
	DupMinTokens int
	// NoGit disables git churn and hotspot scoring.
	NoGit bool
	// NoDup disables duplicate-code detection.
	NoDup bool
	// NoDeps disables the import dependency graph.
	NoDeps bool
}

Options configures a full inspection run. The zero value is valid and produces sensible defaults: built-in directory excludes applied, automatic worker sizing, top-10 ranked lists, and git churn, duplicate detection and the dependency graph all enabled.
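
The zero-value behavior can be pictured as a small resolution step. The sketch below uses a local options type that copies a few of the documented fields; resolved and resolve are hypothetical and only show how a zero TopN, a zero DupMinTokens and the negative-sense No* flags translate into effective settings. How negative values are treated is not documented, so the sketch passes them through unchanged.

```go
package main

import "fmt"

const (
	defaultTopN         = 10 // documented default for TopN
	defaultDupMinTokens = 50 // value of DefaultDuplicationMinTokens
)

// options is a local copy of some documented Options fields.
type options struct {
	TopN         int
	DupMinTokens int
	NoGit        bool
	NoDup        bool
	NoDeps       bool
}

// resolved is a hypothetical view of the effective settings.
type resolved struct {
	TopN         int
	DupMinTokens int
	Git          bool
	Dup          bool
	Deps         bool
}

// resolve replaces zero values with the documented defaults and inverts the
// No* flags so that every stage is enabled unless explicitly turned off.
func resolve(o options) resolved {
	r := resolved{
		TopN:         o.TopN,
		DupMinTokens: o.DupMinTokens,
		Git:          !o.NoGit,
		Dup:          !o.NoDup,
		Deps:         !o.NoDeps,
	}
	if r.TopN == 0 {
		r.TopN = defaultTopN
	}
	if r.DupMinTokens == 0 {
		r.DupMinTokens = defaultDupMinTokens
	}
	return r
}

func main() {
	fmt.Printf("%+v\n", resolve(options{}))
	fmt.Printf("%+v\n", resolve(options{TopN: 3, NoGit: true}))
}
```

Phrasing the switches as NoGit, NoDup and NoDeps is what lets the zero value enable every stage.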

type Report

type Report struct {
	Root         *TreeNode
	Summary      Summary
	Duplication  *DuplicationReport `json:",omitempty"`
	Dependencies *DependencyReport  `json:",omitempty"`
}

Report is the aggregate result of a full inspection.

func Inspect

func Inspect(path string, opts Options) (*Report, error)

Inspect analyzes the tree rooted at path and returns the metrics tree plus the ranked summary, duplicate-code report and dependency graph. It is the primary entry point for embedding the inspector in another program.

Building any program that calls Inspect requires cgo (CGO_ENABLED=1 and a C compiler) because the tree-sitter grammars are C.

type Summary

type Summary struct {
	Files              int
	SupportedFiles     int
	TotalLines         int
	TotalCode          int
	TotalComment       int
	TotalBlank         int
	TotalFunctions     int
	TotalTodos         int
	GitChurn           bool
	TopHotspots        []FileHotspot
	MostComplex        []FunctionHotspot
	Longest            []FunctionHotspot
	LowestMaintainable []FileHotspot
}

Summary is an aggregate, ranked view of a scan, built to surface the highest-value places to improve.

func BuildSummary

func BuildSummary(root *TreeNode, topN int, gitChurn bool) Summary

BuildSummary aggregates the tree into ranked lists. topN caps each list.

type TreeNode

type TreeNode struct {
	Name     string
	Path     string
	RelPath  string `json:",omitempty"` // path relative to the scan root
	IsDir    bool
	Metrics  *FileMetrics `json:",omitempty"`
	Children []*TreeNode  `json:",omitempty"`
	Warning  string       `json:",omitempty"`

	// Churn is the number of git commits that touched this file.
	Churn int `json:",omitempty"`
	// Hotspot is the refactoring-priority score: complexity * churn.
	Hotspot float64 `json:",omitempty"`
}

TreeNode is a directory or file entry in the output tree.

func BuildTree

func BuildTree(rootPath string, cfg Config) (*TreeNode, error)

BuildTree traverses the target directory and builds a deterministic tree.
