Documentation ¶
Overview ¶
Package inspector analyzes source trees and extracts code-quality metrics aimed at finding where to improve a codebase.
For each supported file it reports a code/comment/blank line breakdown, import and variable counts, and per-function cyclomatic complexity, cognitive complexity, nesting depth, parameter count and a Maintainability Index. It also computes file-level Halstead measures, git churn and a hotspot score (complexity x churn), detects duplicate code across files, and builds an import dependency graph with fan-in/fan-out and cycle detection.
Go is parsed with the standard library package go/ast. A broad set of other languages (Python, JavaScript/JSX, TypeScript/TSX, Rust, Java, C, C++, C#, Ruby, PHP, Bash, Scala, CSS, HTML, JSON) is bundled by default and parsed with tree-sitter. Any additional tree-sitter grammar can be registered at startup with RegisterLanguage; it is analyzed by a generic adapter that derives its metric hints automatically by introspecting the grammar's node-kind and field vocabulary (Python, JavaScript and TypeScript use higher-fidelity hand-tuned specs).
Because the tree-sitter grammars are written in C, building any program that imports this package requires cgo: set CGO_ENABLED=1 and have a C compiler (gcc or clang) on PATH.
The simplest entry point is Inspect, which runs everything and returns a single Report:
report, err := inspector.Inspect("./path/to/project", inspector.Options{})
if err != nil {
	log.Fatal(err)
}
for _, h := range report.Summary.TopHotspots {
	fmt.Printf("%s\thot=%.0f cyc=%d churn=%d\n", h.Path, h.Hotspot, h.Cyclomatic, h.Churn)
}
The individual stages (BuildTree, ComputeChurn, BuildSummary, DetectDuplication, BuildDependencyGraph) are also exported for callers that need finer control.
Index ¶
- Constants
- func BuildExcludeSet(includeDefaults bool, extras []string) map[string]struct{}
- func ComputeChurn(root *TreeNode, scanRoot string) bool
- func RegisterLanguage(cfg LanguageConfig)
- func SupportedExtensions() []string
- func SupportedLanguages() []string
- type Config
- type DependencyReport
- type DependencyStat
- type DuplicateBlock
- type DuplicationReport
- type FileHotspot
- type FileMetrics
- type FunctionHotspot
- type FunctionInfo
- type Halstead
- type LanguageAnalyzer
- type LanguageConfig
- type LanguageHints
- type Options
- type Report
- type Summary
- type TreeNode
Constants ¶
const DefaultDuplicationMinTokens = 50
DefaultDuplicationMinTokens is the default clone window size (in normalized tokens) for duplicate-code detection.
Variables ¶
This section is empty.
Functions ¶
func BuildExcludeSet ¶
func BuildExcludeSet(includeDefaults bool, extras []string) map[string]struct{}
BuildExcludeSet constructs a normalized set of excluded directory names.
func ComputeChurn ¶
func ComputeChurn(root *TreeNode, scanRoot string) bool
ComputeChurn annotates each file node with its git commit frequency and a hotspot score (complexity * churn). It returns false when the scan root is not inside a git work tree or git is unavailable, leaving the tree untouched.
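The hotspot score described above is a plain product, so ranking by it can be sketched without the package. The example below is illustrative only: fileStat, hotspot and rankByHotspot are hypothetical names, not part of this package, and the numbers are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// fileStat is a hypothetical stand-in for the per-file inputs to the score;
// it is not a type exported by the package.
type fileStat struct {
	Path       string
	Cyclomatic int // summed cyclomatic complexity of the file
	Churn      int // number of commits that touched the file
}

// hotspot returns complexity * churn, as the ComputeChurn description states.
func hotspot(f fileStat) float64 {
	return float64(f.Cyclomatic) * float64(f.Churn)
}

// rankByHotspot returns a copy of files sorted by descending hotspot score.
func rankByHotspot(files []fileStat) []fileStat {
	out := append([]fileStat(nil), files...)
	sort.SliceStable(out, func(i, j int) bool {
		return hotspot(out[i]) > hotspot(out[j])
	})
	return out
}

func main() {
	files := []fileStat{
		{"a.go", 40, 2},  // complex but rarely changed
		{"b.go", 12, 30}, // moderately complex, changed often
		{"c.go", 90, 0},  // very complex, never changed: score 0
	}
	for _, f := range rankByHotspot(files) {
		fmt.Printf("%s\thot=%.0f\n", f.Path, hotspot(f))
	}
}
```

Because the score is a product, a file with zero churn always ranks last regardless of its complexity, which matches the intent of surfacing code that is both hard to read and frequently edited.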
func RegisterLanguage ¶ added in v0.2.0
func RegisterLanguage(cfg LanguageConfig)
RegisterLanguage adds (or overrides) support for a tree-sitter grammar mapped to one or more file extensions. The grammar is analyzed with the generic heuristic adapter; pass Hints to improve accuracy. It is safe to call at program startup before analysis begins.
func SupportedExtensions ¶ added in v0.2.0
func SupportedExtensions() []string
SupportedExtensions returns the registered file extensions, sorted.
func SupportedLanguages ¶ added in v0.2.0
func SupportedLanguages() []string
SupportedLanguages returns the registered language names, sorted.
Types ¶
type Config ¶
type Config struct {
ExcludedDirs map[string]struct{}
ExcludePatterns []string
SupportedOnly bool
// AnalyzerWorkers controls per-directory file analysis workers.
// 0 uses automatic sizing, 1 forces sequential file analysis.
AnalyzerWorkers int
// GitChurn enables per-file commit-frequency and hotspot scoring.
GitChurn bool
}
Config controls traversal and filtering behavior.
type DependencyReport ¶
type DependencyReport struct {
Nodes int
Edges int
ExternalImports int
MostDependedOn []DependencyStat // highest fan-in: change here ripples widely
MostDependencies []DependencyStat // highest fan-out: most fragile
Cycles [][]string
}
DependencyReport summarizes the internal import graph. Nodes are files for JavaScript/TypeScript/Python and packages (directories) for Go.
func BuildDependencyGraph ¶
func BuildDependencyGraph(root *TreeNode, scanRoot string, topN int) DependencyReport
BuildDependencyGraph resolves intra-project imports into a graph and computes fan-in/fan-out plus dependency cycles.
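To make fan-in, fan-out and cycle detection concrete, here is a minimal sketch over an adjacency map. It does not reproduce how BuildDependencyGraph resolves imports or reports cycles; graph, fanInOut and findCycles are hypothetical names, and the depth-first search shown reports one cycle per back edge rather than enumerating every elementary cycle.

```go
package main

import (
	"fmt"
	"sort"
)

// graph maps a node (a file or package) to the nodes it imports.
type graph map[string][]string

// fanInOut returns, per node, how many nodes import it (fan-in) and how
// many nodes it imports (fan-out).
func fanInOut(g graph) (in, out map[string]int) {
	in, out = map[string]int{}, map[string]int{}
	for from, tos := range g {
		out[from] += len(tos)
		for _, to := range tos {
			in[to]++
		}
	}
	return in, out
}

// findCycles runs a depth-first search and records a cycle whenever it
// follows an edge to a node that is still on the current search path.
func findCycles(g graph) [][]string {
	const (
		unvisited = iota
		onPath
		done
	)
	state := map[string]int{}
	var path []string
	var cycles [][]string

	var visit func(n string)
	visit = func(n string) {
		state[n] = onPath
		path = append(path, n)
		for _, next := range g[n] {
			switch state[next] {
			case unvisited:
				visit(next)
			case onPath:
				// The cycle is the path suffix starting at next.
				for i, p := range path {
					if p == next {
						cycles = append(cycles, append([]string(nil), path[i:]...))
						break
					}
				}
			}
		}
		path = path[:len(path)-1]
		state[n] = done
	}

	// Visit roots in sorted order so the result is deterministic.
	nodes := make([]string, 0, len(g))
	for n := range g {
		nodes = append(nodes, n)
	}
	sort.Strings(nodes)
	for _, n := range nodes {
		if state[n] == unvisited {
			visit(n)
		}
	}
	return cycles
}

func main() {
	g := graph{"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"a"}}
	in, out := fanInOut(g)
	fmt.Println("fan-in of a:", in["a"], "fan-out of d:", out["d"])
	fmt.Println("cycles:", findCycles(g))
}
```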
type DependencyStat ¶
DependencyStat holds the fan-in/fan-out of a single graph node.
type DuplicateBlock ¶
type DuplicateBlock struct {
Tokens int
Lines int
FirstPath string
FirstStart int
FirstEnd int
OtherPath string
OtherStart int
OtherEnd int
}
DuplicateBlock is a pair of code regions that share an identical normalized token sequence (variable names and literal values are normalized, so renamed copies still match).
type DuplicationReport ¶
type DuplicationReport struct {
MinTokens int
Blocks []DuplicateBlock
TotalBlocks int
DuplicatedLines int
}
DuplicationReport summarizes clone detection across a scan.
func DetectDuplication ¶
func DetectDuplication(root *TreeNode, minTokens, topN int) DuplicationReport
DetectDuplication finds duplicated token sequences of at least minTokens across all analyzed files in the tree. topN caps the reported block list.
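The idea of matching normalized token windows can be sketched with the standard library alone. The example below handles Go source only, using go/scanner, and is an assumption about the general technique rather than the package's own tokenizer or matching strategy; normalize and sharedWindows are hypothetical names.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
	"strings"
)

// normalize tokenizes Go source and replaces every identifier with "ID" and
// every basic literal with "LIT", so renamed copies yield the same sequence.
func normalize(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // mode 0 skips comments
	var out []string
	for {
		_, tok, _ := s.Scan()
		switch {
		case tok == token.EOF:
			return out
		case tok == token.IDENT: // checked first: IsLiteral is also true for IDENT
			out = append(out, "ID")
		case tok.IsLiteral():
			out = append(out, "LIT")
		default:
			out = append(out, tok.String())
		}
	}
}

// sharedWindows returns the token offsets {i, j} of every window of
// minTokens tokens that occurs at a[i:] and at b[j:].
func sharedWindows(a, b []string, minTokens int) [][2]int {
	seen := map[string][]int{}
	for i := 0; i+minTokens <= len(a); i++ {
		key := strings.Join(a[i:i+minTokens], " ")
		seen[key] = append(seen[key], i)
	}
	var hits [][2]int
	for j := 0; j+minTokens <= len(b); j++ {
		key := strings.Join(b[j:j+minTokens], " ")
		for _, i := range seen[key] {
			hits = append(hits, [2]int{i, j})
		}
	}
	return hits
}

func main() {
	a := normalize("x := a + 1")
	b := normalize("total := count + 42") // same shape, different names and values
	fmt.Println(a)
	fmt.Println(sharedWindows(a, b, 5))
}
```

A real detector would use a much larger window (the default here is 50 tokens), merge overlapping hits into maximal blocks, and map token offsets back to line ranges.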
type FileHotspot ¶
type FileHotspot struct {
Path string
Language string
Cyclomatic int
Churn int
Hotspot float64
LineCount int
Maintainability float64
}
FileHotspot is a ranked file entry in a Summary.
type FileMetrics ¶
type FileMetrics struct {
Language string
LineCount int // physical line count
CodeLines int
CommentLines int
BlankLines int
ImportCount int
VariableCount int
// TodoCount counts TODO/FIXME/HACK/XXX markers found in comments.
TodoCount int
// Cyclomatic is the sum of every function's cyclomatic complexity.
Cyclomatic int
// MaxComplexity is the highest single-function cyclomatic complexity.
MaxComplexity int
// Halstead holds the file-level Halstead measures.
Halstead Halstead
// Maintainability is a 0-100 Maintainability Index for the file.
Maintainability float64
// Imports holds the raw import specifiers found in the file, used to build
// the dependency graph.
Imports []string `json:",omitempty"`
Functions []FunctionInfo
}
FileMetrics stores per-file metrics extracted by analyzers.
func AnalyzeFile ¶
func AnalyzeFile(path string) (*FileMetrics, bool, error)
AnalyzeFile extracts metrics for supported source files. The returned bool reports whether the file extension is supported.
func (*FileMetrics) CommentRatio ¶
func (m *FileMetrics) CommentRatio() float64
CommentRatio returns comment lines as a fraction of code+comment lines.
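As a worked example of that definition, the ratio can be written as a small standalone function. This is a sketch, not the method's source: commentRatio is a hypothetical name, and returning 0 when there are no code or comment lines is an assumption.

```go
package main

import "fmt"

// commentRatio returns comment lines as a fraction of code+comment lines.
// Blank lines are excluded from the denominator. Returning 0 for an empty
// denominator is an assumption of this sketch.
func commentRatio(codeLines, commentLines int) float64 {
	total := codeLines + commentLines
	if total == 0 {
		return 0
	}
	return float64(commentLines) / float64(total)
}

func main() {
	// 25 comment lines out of 75 code + 25 comment lines.
	fmt.Printf("%.2f\n", commentRatio(75, 25)) // prints 0.25
}
```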
func (*FileMetrics) FunctionCount ¶
func (m *FileMetrics) FunctionCount() int
FunctionCount returns the number of discovered functions in this file.
type FunctionHotspot ¶
type FunctionHotspot struct {
Path string
Name string
Line int
Cyclomatic int
Cognitive int
LineCount int
}
FunctionHotspot is a ranked function entry in a Summary.
type FunctionInfo ¶
type FunctionInfo struct {
Name string
Signature string
Line int
LineCount int
// Cyclomatic is the McCabe cyclomatic complexity: decision points + 1.
Cyclomatic int
// Cognitive is an approximation of SonarSource cognitive complexity; it
// penalizes nesting and is a better "how hard to understand" proxy.
Cognitive int
// MaxNesting is the deepest control-flow nesting level inside the function.
MaxNesting int
// Params is the number of declared parameters.
Params int
// Maintainability is a 0-100 Maintainability Index for the function
// (higher is better).
Maintainability float64
}
FunctionInfo stores function metadata discovered in a source file.
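The "decision points + 1" definition of Cyclomatic can be illustrated for Go source with go/ast. The sketch below is not the package's analyzer: which node types count as decision points (if, for, range, non-default case and select clauses, && and ||) is an assumption, and cyclomatic and sample are hypothetical names.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is a small function with four decision points.
const sample = `package p

func classify(n int, ok bool) string {
	if n < 0 && ok {
		return "neg"
	}
	for i := 0; i < n; i++ {
		switch {
		case i%2 == 0:
			continue
		default:
		}
	}
	return "pos"
}
`

// cyclomatic returns decision points + 1 for each function declared in src,
// keyed by function name.
func cyclomatic(src string) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}
	out := map[string]int{}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		score := 1
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			switch n := n.(type) {
			case *ast.IfStmt, *ast.ForStmt, *ast.RangeStmt:
				score++
			case *ast.CaseClause:
				if n.List != nil { // a nil List is the default clause
					score++
				}
			case *ast.CommClause:
				if n.Comm != nil { // a nil Comm is the default clause
					score++
				}
			case *ast.BinaryExpr:
				if n.Op == token.LAND || n.Op == token.LOR {
					score++
				}
			}
			return true
		})
		out[fn.Name.Name] = score
	}
	return out, nil
}

func main() {
	scores, err := cyclomatic(sample)
	if err != nil {
		panic(err)
	}
	// if, &&, for and one non-default case: 4 decision points + 1.
	fmt.Println("classify:", scores["classify"]) // prints classify: 5
}
```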
type Halstead ¶
type Halstead struct {
Vocabulary int // distinct operators + distinct operands
Length int // total operators + total operands
Volume float64 // Length * log2(Vocabulary)
Difficulty float64 // (distinctOperators/2) * (totalOperands/distinctOperands)
Effort float64 // Difficulty * Volume
}
Halstead holds the Halstead complexity measures derived from the operators and operands in a unit of code.
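The formulas in the field comments can be evaluated directly from the four base counts. The sketch below mirrors the struct locally so it compiles on its own; compute and its parameter names are hypothetical, and guarding against zero counts is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// halstead mirrors the fields of the Halstead struct for this sketch.
type halstead struct {
	Vocabulary int
	Length     int
	Volume     float64
	Difficulty float64
	Effort     float64
}

// compute derives the measures from distinct and total operator/operand
// counts, following the formulas in the field comments.
func compute(distinctOps, distinctOperands, totalOps, totalOperands int) halstead {
	h := halstead{
		Vocabulary: distinctOps + distinctOperands,
		Length:     totalOps + totalOperands,
	}
	if h.Vocabulary > 0 {
		h.Volume = float64(h.Length) * math.Log2(float64(h.Vocabulary))
	}
	if distinctOperands > 0 {
		h.Difficulty = float64(distinctOps) / 2 *
			float64(totalOperands) / float64(distinctOperands)
	}
	h.Effort = h.Difficulty * h.Volume
	return h
}

func main() {
	// 4 distinct operators, 4 distinct operands, 8 uses of each:
	// Vocabulary 8, Length 16, Volume 16*log2(8) = 48,
	// Difficulty (4/2)*(8/4) = 4, Effort 4*48 = 192.
	fmt.Printf("%+v\n", compute(4, 4, 8, 8))
}
```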
type LanguageAnalyzer ¶
type LanguageAnalyzer interface {
Analyze(source []byte) (*FileMetrics, error)
}
LanguageAnalyzer extracts metrics from the source of a single language.
type LanguageConfig ¶ added in v0.2.0
type LanguageConfig struct {
Name string // canonical language name, e.g. "rust"
Extensions []string // file extensions including the dot, e.g. []string{".rs"}
Grammar *sitter.Language // required tree-sitter grammar
Hints *LanguageHints // optional; augments the heuristic defaults
}
LanguageConfig registers a tree-sitter grammar for a set of file extensions. Construct Grammar with sitter.NewLanguage(grammarPackage.Language()).
type LanguageHints ¶ added in v0.2.0
type LanguageHints struct {
FunctionKinds []string // node kinds that define a function/method
DecisionKinds []string // node kinds counting +1 cyclomatic
NestingKinds []string // node kinds that increase nesting (cognitive/depth)
ImportKinds []string // node kinds that are imports (counted)
NameField string // field name for a definition's name (default "name")
ParamsField string // field name for the parameter list (default "parameters")
}
LanguageHints lets a caller refine the generic analyzer for a language by naming the relevant tree-sitter node kinds. Any unset field falls back to the curated cross-language defaults.
type Options ¶
type Options struct {
// Excludes are file or directory names/glob patterns to skip.
Excludes []string
// NoDefaultExcludes disables the built-in excludes (.git, node_modules, ...).
NoDefaultExcludes bool
// SupportedOnly prunes unsupported files from the tree.
SupportedOnly bool
// Workers controls per-directory analysis workers: 0 = auto, 1 = sequential.
Workers int
// TopN caps each ranked list (0 = 10).
TopN int
// DupMinTokens is the clone window size (0 = DefaultDuplicationMinTokens).
DupMinTokens int
// NoGit disables git churn and hotspot scoring.
NoGit bool
// NoDup disables duplicate-code detection.
NoDup bool
// NoDeps disables the import dependency graph.
NoDeps bool
}
Options configures a full inspection run. The zero value is valid and produces sensible defaults: built-in directory excludes applied, automatic worker sizing, top-10 ranked lists, and git churn, duplicate detection and the dependency graph all enabled.
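One common way to give a zero value sensible defaults is to resolve it once before use. The sketch below shows that pattern for the two numeric fields with documented defaults; it is an assumption about the approach, not the package's code, and options and withDefaults are hypothetical names.

```go
package main

import "fmt"

// defaultDuplicationMinTokens mirrors the documented constant value.
const defaultDuplicationMinTokens = 50

// options is a local stand-in holding only the two numeric fields whose
// zero-value defaults are documented.
type options struct {
	TopN         int
	DupMinTokens int
}

// withDefaults returns a copy in which zero fields take their documented
// defaults; explicitly set fields are left alone.
func withDefaults(o options) options {
	if o.TopN == 0 {
		o.TopN = 10
	}
	if o.DupMinTokens == 0 {
		o.DupMinTokens = defaultDuplicationMinTokens
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", withDefaults(options{}))        // both defaults applied
	fmt.Printf("%+v\n", withDefaults(options{TopN: 3})) // TopN kept at 3
}
```

The boolean fields need no such step: they are phrased negatively (NoGit, NoDup, NoDeps), so their zero value already means "enabled".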
type Report ¶
type Report struct {
Root *TreeNode
Summary Summary
Duplication *DuplicationReport `json:",omitempty"`
Dependencies *DependencyReport `json:",omitempty"`
}
Report is the aggregate result of a full inspection.
func Inspect ¶
Inspect analyzes the tree rooted at path and returns the metrics tree plus the ranked summary, duplicate-code report and dependency graph. It is the primary entry point for embedding the inspector in another program.
Building any program that calls Inspect requires cgo (CGO_ENABLED=1 and a C compiler on PATH) because the tree-sitter grammars are written in C.
type Summary ¶
type Summary struct {
Files int
SupportedFiles int
TotalLines int
TotalCode int
TotalComment int
TotalBlank int
TotalFunctions int
TotalTodos int
GitChurn bool
TopHotspots []FileHotspot
MostComplex []FunctionHotspot
Longest []FunctionHotspot
LowestMaintainable []FileHotspot
}
Summary is an aggregate, ranked view of a scan, built to surface the highest-value places to improve.
type TreeNode ¶
type TreeNode struct {
Name string
Path string
RelPath string `json:",omitempty"` // path relative to the scan root
IsDir bool
Metrics *FileMetrics `json:",omitempty"`
Children []*TreeNode `json:",omitempty"`
Warning string `json:",omitempty"`
// Churn is the number of git commits that touched this file.
Churn int `json:",omitempty"`
// Hotspot is the refactoring-priority score: complexity * churn.
Hotspot float64 `json:",omitempty"`
}
TreeNode is a directory or file entry in the output tree.
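Since the tree is a plain recursive structure, callers can traverse it with a short depth-first walk. The sketch below uses a local node type that mirrors a few TreeNode fields so it compiles without the package; walk is a hypothetical helper, not an exported function.

```go
package main

import "fmt"

// node mirrors a subset of TreeNode's fields for this standalone sketch.
type node struct {
	RelPath  string
	IsDir    bool
	Churn    int
	Children []*node
}

// walk calls fn for every file (non-directory) node, depth first.
func walk(n *node, fn func(*node)) {
	if n == nil {
		return
	}
	if !n.IsDir {
		fn(n)
	}
	for _, c := range n.Children {
		walk(c, fn)
	}
}

func main() {
	root := &node{RelPath: ".", IsDir: true, Children: []*node{
		{RelPath: "main.go", Churn: 5},
		{RelPath: "pkg", IsDir: true, Children: []*node{
			{RelPath: "pkg/util.go", Churn: 2},
		}},
	}}

	files, totalChurn := 0, 0
	walk(root, func(f *node) {
		files++
		totalChurn += f.Churn
	})
	fmt.Println(files, totalChurn) // prints 2 7
}
```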