analyzer

package
v0.0.0-...-515337b
Published: Jan 14, 2026 License: GPL-3.0 Imports: 11 Imported by: 0

Documentation

Overview

Package analyzer provides the analyzer framework for engineering intelligence: it registers and runs analyzers and provides confidence scoring for git-history-based analysis.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CalculateMetricConfidence

func CalculateMetricConfidence(sampleSize, expectedMinimum, idealSample int) int

CalculateMetricConfidence calculates confidence for a specific metric based on the sample size and expected minimum
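The exact scoring curve is not documented, but a plausible sketch is a clamped linear interpolation between the expected minimum and the ideal sample size — treat this as an assumption, not the package's actual implementation:

```go
package main

import "fmt"

// calculateMetricConfidence sketches one plausible sample-size scoring:
// 0 below the expected minimum, 100 at or above the ideal sample size,
// and linear in between. This is an assumption about the real function.
func calculateMetricConfidence(sampleSize, expectedMinimum, idealSample int) int {
	switch {
	case sampleSize < expectedMinimum:
		return 0
	case sampleSize >= idealSample:
		return 100
	default:
		return 100 * (sampleSize - expectedMinimum) / (idealSample - expectedMinimum)
	}
}

func main() {
	fmt.Println(
		calculateMetricConfidence(5, 10, 100),   // below minimum → 0
		calculateMetricConfidence(55, 10, 100),  // halfway → 50
		calculateMetricConfidence(200, 10, 100), // above ideal → 100
	)
}
```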

func Clear

func Clear()

Clear removes all registered analyzers (useful for testing)

func DeepenHistory

func DeepenHistory(repoPath string, commits int) error

DeepenHistory fetches additional commits (useful for incremental deepening)

func EstimateTime

func EstimateTime(analyzer string, fileCount int) int

EstimateTime returns estimated scan time in seconds based on file count

func GroupByDependencies

func GroupByDependencies(analyzers []Analyzer) ([][]Analyzer, error)

GroupByDependencies groups analyzers into levels based on dependencies. Analyzers in the same level can run in parallel.

func List

func List() []string

List returns all registered analyzer names

func Register

func Register(a Analyzer)

Register adds an analyzer to the registry. This is typically called from analyzer init() functions.
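The init()-registration pattern can be sketched with a minimal registry of names (the real Register takes an Analyzer value, and registration happens as a side effect of importing each analyzer subpackage):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// A minimal registry sketch mirroring the Register/List pattern the docs
// describe; plain names stand in for Analyzer values.
var (
	mu       sync.Mutex
	registry = map[string]bool{}
)

func register(name string) {
	mu.Lock()
	defer mu.Unlock()
	registry[name] = true
}

func list() []string {
	mu.Lock()
	defer mu.Unlock()
	names := make([]string, 0, len(registry))
	for n := range registry {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// In a real analyzer package, this runs when the package is imported.
func init() { register("code-security") }

func main() {
	fmt.Println(list())
}
```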

func TotalEstimate

func TotalEstimate(analyzers []string, fileCount int) int

TotalEstimate returns total estimated time for all analyzers

func UnmarshalConfig

func UnmarshalConfig[T any](opts *ScanOptions, defaultCfg T) T

UnmarshalConfig unmarshals the FeatureConfig from ScanOptions into the target type. Returns defaultCfg if FeatureConfig is nil or unmarshaling fails.
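One plausible implementation of that contract — an assumption, not the package's actual code — round-trips the config map through JSON into the target type, starting from the defaults so omitted keys keep their default values:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unmarshalConfig sketches the documented contract: decode a feature
// config map into T, falling back to defaultCfg when the map is nil or
// malformed. The JSON round-trip is an assumed implementation detail.
func unmarshalConfig[T any](cfg map[string]interface{}, defaultCfg T) T {
	if cfg == nil {
		return defaultCfg
	}
	raw, err := json.Marshal(cfg)
	if err != nil {
		return defaultCfg
	}
	out := defaultCfg // start from defaults so omitted keys keep them
	if err := json.Unmarshal(raw, &out); err != nil {
		return defaultCfg
	}
	return out
}

// buildConfig is a hypothetical feature config for illustration.
type buildConfig struct {
	MaxWorkers int  `json:"max_workers"`
	FlakyTests bool `json:"flaky_tests"`
}

func main() {
	def := buildConfig{MaxWorkers: 4}
	got := unmarshalConfig(map[string]interface{}{"flaky_tests": true}, def)
	fmt.Println(got.MaxWorkers, got.FlakyTests) // default preserved for omitted key
}
```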

func Unshallow

func Unshallow(repoPath string) error

Unshallow fetches full history for a shallow clone

Types

type Analyzer

type Analyzer interface {
	// Name returns the analyzer identifier (e.g., "code-security")
	Name() string

	// Description returns a human-readable description
	Description() string

	// Run executes the analyzer and returns results
	Run(ctx context.Context, opts *ScanOptions) (*ScanResult, error)

	// Dependencies returns analyzers that must run first (e.g., "code-packages")
	Dependencies() []string

	// EstimateDuration returns estimated duration based on file count
	EstimateDuration(fileCount int) time.Duration

	// Requirements returns what this analyzer needs to run properly
	// Analyzers should return their requirements so the runner can validate
	// before execution and provide helpful error messages
	Requirements() AnalyzerRequirements
}

Analyzer defines the interface all analyzers must implement

func Get

func Get(name string) (Analyzer, bool)

Get returns an analyzer by name

func GetAll

func GetAll() []Analyzer

GetAll returns all registered analyzers

func GetByNames

func GetByNames(names []string) ([]Analyzer, error)

GetByNames returns analyzers for the given names. Returns an error if any analyzer is not found.

func TopologicalSort

func TopologicalSort(analyzers []Analyzer) ([]Analyzer, error)

TopologicalSort orders analyzers by dependencies. Analyzers with no dependencies come first, then analyzers that depend on them.

type AnalyzerRequirements

type AnalyzerRequirements struct {
	// RequiresFullHistory indicates the analyzer needs complete git history
	// (e.g., PR metrics, git blame analysis, rework rate detection)
	RequiresFullHistory bool `json:"requires_full_history,omitempty"`

	// RequiresGitHubAPI indicates the analyzer needs GitHub API access
	RequiresGitHubAPI bool `json:"requires_github_api,omitempty"`

	// RequiresSBOM indicates the analyzer depends on SBOM generation
	RequiresSBOM bool `json:"requires_sbom,omitempty"`

	// MinCommitCount is minimum commits needed for meaningful analysis (0 = no minimum)
	MinCommitCount int `json:"min_commit_count,omitempty"`

	// Description explains why these requirements exist
	Description string `json:"description,omitempty"`
}

AnalyzerRequirements defines what an analyzer needs to run properly

type CloneState

type CloneState struct {
	// IsShallow is true if the repository is a shallow clone
	IsShallow bool `json:"is_shallow"`

	// Depth is the clone depth (0 = full clone)
	Depth int `json:"depth"`

	// CommitCount is the number of commits available in the clone
	CommitCount int `json:"commit_count"`

	// CanUnshallow is true if we can fetch more history (has remote)
	CanUnshallow bool `json:"can_unshallow"`

	// RepoPath is the path to the repository
	RepoPath string `json:"repo_path,omitempty"`
}

CloneState provides information about the repository clone

func DetectCloneState

func DetectCloneState(repoPath string) *CloneState

DetectCloneState analyzes a repository to determine its clone state

type ConfidenceComponents

type ConfidenceComponents struct {
	// CommitCount score based on number of commits analyzed (0-100)
	CommitCount int `json:"commit_count"`

	// PeriodCoverage score based on coverage of analysis period (0-100)
	PeriodCoverage int `json:"period_coverage"`

	// Recency score based on freshness of data (0-100)
	Recency int `json:"recency"`

	// CloneDepth score based on shallow vs full clone (0-100)
	CloneDepth int `json:"clone_depth"`

	// DataCompleteness score based on available data sources (0-100)
	DataCompleteness int `json:"data_completeness"`
}

ConfidenceComponents contains the individual factor scores

type ConfidenceInput

type ConfidenceInput struct {
	// TotalCommits is the number of commits analyzed
	TotalCommits int

	// PeriodDays is the analysis period in days
	PeriodDays int

	// FirstCommitDate is when the first commit in period occurred
	FirstCommitDate time.Time

	// LastCommitDate is when the most recent commit occurred
	LastCommitDate time.Time

	// DaysWithCommits is how many unique days have commits
	DaysWithCommits int

	// IsShallowClone indicates if the repository is a shallow clone
	IsShallowClone bool

	// HasPRData indicates if PR/review data is available
	HasPRData bool

	// HasReviewData indicates if code review data is available
	HasReviewData bool

	// HasGitHubAPI indicates if GitHub API was successfully accessed
	HasGitHubAPI bool

	// ContributorCount is the number of unique contributors
	ContributorCount int
}

ConfidenceInput contains data needed to calculate confidence

type ConfidenceLevel

type ConfidenceLevel string

ConfidenceLevel represents confidence classification

const (
	ConfidenceHigh    ConfidenceLevel = "high"     // 80-100: Metrics are reliable
	ConfidenceMedium  ConfidenceLevel = "medium"   // 60-79: Reasonably reliable
	ConfidenceLow     ConfidenceLevel = "low"      // 40-59: Use with caution
	ConfidenceVeryLow ConfidenceLevel = "very_low" // 0-39: Insufficient data
)
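The documented bands translate directly into a classification function; this mirrors the constant comments above (80+, 60-79, 40-59, below 40):

```go
package main

import "fmt"

type ConfidenceLevel string

const (
	ConfidenceHigh    ConfidenceLevel = "high"
	ConfidenceMedium  ConfidenceLevel = "medium"
	ConfidenceLow     ConfidenceLevel = "low"
	ConfidenceVeryLow ConfidenceLevel = "very_low"
)

// levelFor maps a 0-100 score onto the documented confidence bands.
func levelFor(score int) ConfidenceLevel {
	switch {
	case score >= 80:
		return ConfidenceHigh
	case score >= 60:
		return ConfidenceMedium
	case score >= 40:
		return ConfidenceLow
	default:
		return ConfidenceVeryLow
	}
}

func main() {
	fmt.Println(levelFor(85), levelFor(65), levelFor(45), levelFor(10))
}
```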

type ConfidenceWeights

type ConfidenceWeights struct {
	CommitCount      float64
	PeriodCoverage   float64
	Recency          float64
	CloneDepth       float64
	DataCompleteness float64
}

ConfidenceWeights defines the weight of each factor in the overall score

func DefaultConfidenceWeights

func DefaultConfidenceWeights() ConfidenceWeights

DefaultConfidenceWeights returns the default weights for confidence calculation

type DORAConfidenceInput

type DORAConfidenceInput struct {
	// DeploymentCount is the number of deployments in the period
	DeploymentCount int

	// PeriodDays is the analysis period
	PeriodDays int

	// HasTags indicates if deployments were detected via git tags
	HasTags bool

	// HasGitHubReleases indicates if GitHub releases API was used
	HasGitHubReleases bool

	// CommitCount for the period
	CommitCount int

	// IsShallowClone indicates limited history
	IsShallowClone bool
}

DORAConfidenceInput contains data for DORA metric confidence calculation

type DataConfidence

type DataConfidence struct {
	// Overall confidence score (0-100)
	Score int `json:"score"`

	// Level is the confidence classification
	Level ConfidenceLevel `json:"level"`

	// Components shows the breakdown of confidence factors
	Components ConfidenceComponents `json:"components"`

	// Factors describes what affected the confidence score
	Factors []string `json:"factors,omitempty"`
}

DataConfidence represents confidence in analysis results based on data quality

func CalculateConfidence

func CalculateConfidence(input ConfidenceInput) DataConfidence

CalculateConfidence computes confidence based on data quality factors

func CalculateConfidenceWithWeights

func CalculateConfidenceWithWeights(input ConfidenceInput, weights ConfidenceWeights) DataConfidence

CalculateConfidenceWithWeights computes confidence with custom weights

func CalculateDORAConfidence

func CalculateDORAConfidence(input DORAConfidenceInput) DataConfidence

CalculateDORAConfidence computes confidence for DORA metrics

type Feature

type Feature struct {
	Name        string
	Description string
	Default     bool // Enabled by default?
}

Feature describes a feature within a super scanner

type FeatureConfidence

type FeatureConfidence struct {
	// Feature name
	Feature string `json:"feature"`

	// Confidence score and level
	DataConfidence

	// MetricConfidences contains per-metric confidence if applicable
	MetricConfidences map[string]int `json:"metric_confidences,omitempty"`
}

FeatureConfidence represents confidence for a specific scanner feature

type FinalizeResultOptions

type FinalizeResultOptions struct {
	Name      string
	Version   string
	Start     time.Time
	RepoPath  string
	OutputDir string
	Summary   interface{}
	Findings  interface{}
	Metadata  interface{}
}

FinalizeResultOptions contains options for finalizing a scan result

type Finding

type Finding struct {
	ID          string          `json:"id,omitempty"`
	Title       string          `json:"title,omitempty"`
	Description string          `json:"description,omitempty"`
	Severity    string          `json:"severity"`
	Category    string          `json:"category,omitempty"`
	Package     string          `json:"package,omitempty"`
	Version     string          `json:"version,omitempty"`
	File        string          `json:"file,omitempty"`
	Line        int             `json:"line,omitempty"`
	Confidence  string          `json:"confidence,omitempty"`
	References  []string        `json:"references,omitempty"`
	Metadata    json.RawMessage `json:"metadata,omitempty"`
}

Finding is a common finding structure

type IncrementalOptions

type IncrementalOptions struct {
	// Enabled indicates whether to use incremental scanning
	Enabled bool `json:"enabled"`

	// BaseCommit is the commit to diff against (typically the last scanned commit)
	BaseCommit string `json:"base_commit"`

	// CurrentCommit is the current commit being scanned
	CurrentCommit string `json:"current_commit"`

	// ChangedFiles is the list of files changed since BaseCommit
	// Scanners can use this to focus on only changed files
	ChangedFiles []string `json:"changed_files,omitempty"`

	// PreviousResultPath is the path to the previous scan results
	// Used for merging unchanged results with new results
	PreviousResultPath string `json:"previous_result_path,omitempty"`

	// ForceFullScan forces a full scan even in incremental mode
	// Useful when structural changes require re-analyzing everything
	ForceFullScan bool `json:"force_full_scan,omitempty"`
}

IncrementalOptions configures incremental scanning

type NativeRunner

type NativeRunner struct {
	ZeroHome   string
	Timeout    time.Duration
	Parallel   int
	OnProgress func(analyzer string, status Status, summary string)
}

NativeRunner executes Go-native analyzers

func NewNativeRunner

func NewNativeRunner(zeroHome string) *NativeRunner

NewNativeRunner creates a new native analyzer runner

func (*NativeRunner) RunAnalyzers

func (r *NativeRunner) RunAnalyzers(ctx context.Context, opts RunOptions) (*RunResult, error)

RunAnalyzers executes all configured analyzers for a repository

type Progress

type Progress struct {
	Current        string
	CompletedCount int
	TotalCount     int
	Results        map[string]*Result
	// contains filtered or unexported fields
}

Progress tracks analyzer progress for a repo

func NewProgress

func NewProgress(analyzers []string) *Progress

NewProgress creates a new progress tracker

func (*Progress) GetProgress

func (p *Progress) GetProgress() (completed, total int, current string)

GetProgress returns current progress info

func (*Progress) SetComplete

func (p *Progress) SetComplete(analyzer string, summary string, duration time.Duration)

SetComplete marks an analyzer as complete

func (*Progress) SetFailed

func (p *Progress) SetFailed(analyzer string, err error, duration time.Duration)

SetFailed marks an analyzer as failed

func (*Progress) SetRunning

func (p *Progress) SetRunning(analyzer string)

SetRunning marks an analyzer as running

func (*Progress) SetSkipped

func (p *Progress) SetSkipped(analyzer string)

SetSkipped marks an analyzer as skipped

type RepoMetadata

type RepoMetadata struct {
	// GitHubOrg is the organization or user that owns the repo (e.g., "expressjs")
	GitHubOrg string `json:"github_org"`

	// GitHubRepo is the repository name (e.g., "express")
	GitHubRepo string `json:"github_repo"`

	// RepoURL is the full GitHub URL (e.g., "https://github.com/expressjs/express")
	RepoURL string `json:"repo_url"`

	// CommitSHA is the exact commit being scanned
	CommitSHA string `json:"commit_sha"`

	// Branch is the branch name (e.g., "main", "master")
	Branch string `json:"branch"`

	// ScanProfile is the profile used for scanning (e.g., "all-quick")
	ScanProfile string `json:"scan_profile"`

	// ScannerVersion is the Zero version
	ScannerVersion string `json:"scanner_version"`
}

RepoMetadata contains GitHub repository information for evidence tracking

type RequirementError

type RequirementError struct {
	Analyzer    string `json:"analyzer"`
	Requirement string `json:"requirement"`
	Message     string `json:"message"`
	CanAutoFix  bool   `json:"can_auto_fix"`
}

RequirementError is returned when an analyzer's requirements are not met

func ValidateRequirements

func ValidateRequirements(analyzer Analyzer, cloneState *CloneState) *RequirementError

ValidateRequirements checks if an analyzer's requirements are met

func (*RequirementError) Error

func (e *RequirementError) Error() string

type Result

type Result struct {
	Analyzer string
	Status   Status
	Summary  string
	Duration time.Duration
	Error    error
	Output   json.RawMessage
}

Result holds the result of an analyzer run

type RunOptions

type RunOptions struct {
	RepoPath       string
	OutputDir      string
	Analyzers      []Analyzer
	SkipAnalyzers  []string
	Timeout        time.Duration
	Parallel       int
	FeatureConfigs map[string]map[string]interface{} // Analyzer name -> feature config
	RepoMetadata   *RepoMetadata                     // Repository metadata for evidence collection
	AutoUnshallow  bool                              // Automatically unshallow if analyzer requires full history
	Verbose        bool                              // Enable verbose logging
}

RunOptions configures an analyzer run

type RunResult

type RunResult struct {
	Success  bool
	Results  map[string]*Result
	Duration time.Duration
}

RunResult holds the result of running all analyzers on a repo

type Runner

type Runner struct {
	// contains filtered or unexported fields
}

Runner executes analyzers (wraps NativeRunner for backward compatibility)

func NewRunner

func NewRunner(zeroHome string) *Runner

NewRunner creates a new analyzer runner

func (*Runner) Run

func (r *Runner) Run(ctx context.Context, repo, profile string, progress *Progress, skipAnalyzers []string) (*RunResult, error)

Run executes all analyzers for a repository using native Go analyzers

type ScanOptions

type ScanOptions struct {
	// RepoPath is the path to the repository to scan
	RepoPath string

	// OutputDir is where to write results (e.g., .zero/repos/owner/repo/analysis)
	OutputDir string

	// SBOMPath is path to pre-generated SBOM (optional, for scanners that need it)
	SBOMPath string

	// Timeout is the maximum duration for this scanner
	Timeout time.Duration

	// Verbose enables verbose logging
	Verbose bool

	// OnStatus is called with progress messages during scanning
	// This allows scanners to report what they're doing in real-time
	OnStatus func(message string)

	// ExtraArgs contains scanner-specific options
	ExtraArgs map[string]string

	// FeatureConfig contains feature-specific configuration for super scanners
	FeatureConfig map[string]interface{}

	// Repository metadata for evidence collection
	RepoMetadata *RepoMetadata

	// CloneState provides info about the repository clone (shallow vs full)
	CloneState *CloneState

	// Incremental scanning support
	Incremental *IncrementalOptions
}

ScanOptions contains inputs for a scanner run

type ScanResult

type ScanResult struct {
	// Analyzer is the scanner name
	Analyzer string `json:"analyzer"`

	// Version is the scanner version
	Version string `json:"version"`

	// Timestamp is when the scan completed
	Timestamp string `json:"timestamp"`

	// DurationSeconds is how long the scan took
	DurationSeconds int `json:"duration_seconds"`

	// Repository is the repo that was scanned
	Repository string `json:"repository,omitempty"`

	// Summary contains aggregated findings info
	Summary json.RawMessage `json:"summary"`

	// Findings contains detailed findings (optional for some scanners)
	Findings json.RawMessage `json:"findings,omitempty"`

	// Metadata contains scanner-specific metadata
	Metadata json.RawMessage `json:"metadata,omitempty"`

	// Error contains error message if scan failed
	Error string `json:"error,omitempty"`
}

ScanResult represents scanner output

func FinalizeResult

func FinalizeResult(opts FinalizeResultOptions) (*ScanResult, error)

FinalizeResult creates and writes a ScanResult with common patterns. This consolidates the repeated finalization logic found in all analyzers.

func NewScanResult

func NewScanResult(analyzer, version string, start time.Time) *ScanResult

NewScanResult creates a new scan result with common fields populated

func (*ScanResult) SetFindings

func (r *ScanResult) SetFindings(findings interface{}) error

SetFindings marshals and sets the findings

func (*ScanResult) SetMetadata

func (r *ScanResult) SetMetadata(metadata interface{}) error

SetMetadata marshals and sets the metadata

func (*ScanResult) SetSummary

func (r *ScanResult) SetSummary(summary interface{}) error

SetSummary marshals and sets the summary

func (*ScanResult) WriteJSON

func (r *ScanResult) WriteJSON(path string) error

WriteJSON writes the result to a JSON file

type ScanSummary

type ScanSummary struct {
	TotalFindings int            `json:"total_findings,omitempty"`
	Critical      int            `json:"critical,omitempty"`
	High          int            `json:"high,omitempty"`
	Medium        int            `json:"medium,omitempty"`
	Low           int            `json:"low,omitempty"`
	Info          int            `json:"info,omitempty"`
	TotalPackages int            `json:"total_packages,omitempty"`
	ByType        map[string]int `json:"by_type,omitempty"`
	Status        string         `json:"status,omitempty"`
}

ScanSummary is a common summary structure used by many scanners

type Status

type Status string

Status represents analyzer execution status

const (
	StatusQueued   Status = "queued"
	StatusRunning  Status = "running"
	StatusComplete Status = "complete"
	StatusFailed   Status = "failed"
	StatusSkipped  Status = "skipped"
	StatusTimeout  Status = "timeout"
)

Directories

Path Synopsis
Package build provides the CI/CD optimization analyzer. Features: cost analysis, caching optimization, parallelization, flaky test detection.
Package codeownership provides code ownership analysis with benchmark tiers.
Package codepackages implements the consolidated code packages analyzer. This analyzer generates SBOMs and performs comprehensive package analysis.
Package codequality provides the consolidated code quality analyzer.
Package codesecurity provides the consolidated code security analyzer.
Package common provides shared utilities for analyzer implementations.
Package developerexperience provides the consolidated developer experience analyzer. Features: onboarding, tooling, workflow.
Package devops provides the consolidated DevOps and CI/CD security analyzer. Renamed from infra; now includes all infrastructure, CI/CD, and GitHub Actions security.
Package infraconfig provides the infrastructure configuration analyzer.
Package repogovernance provides the repository governance analyzer.
Package techid provides the consolidated technology identification analyzer. Includes AI/ML security and ML-BOM generation.
Package toolconfig provides the developer tool configuration analyzer.
