analyzer

package
v0.0.0-...-515337b
Published: Jan 14, 2026 License: GPL-3.0 Imports: 11 Imported by: 0

Documentation

Overview

Package analyzer provides the analyzer framework for engineering intelligence: it registers and runs analyzers and provides confidence scoring for git-history-based analysis.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CalculateMetricConfidence

func CalculateMetricConfidence(sampleSize, expectedMinimum, idealSample int) int

CalculateMetricConfidence calculates confidence for a specific metric based on the sample size and expected minimum
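The exact scoring curve is not documented, but a plausible sketch is a clamped linear interpolation between the expected minimum and the ideal sample size — treat this as an assumption, not the package's actual implementation:

```go
package main

import "fmt"

// calculateMetricConfidence sketches one plausible sample-size scoring:
// 0 below the expected minimum, 100 at or above the ideal sample size,
// and linear in between. This is an assumption about the real function.
func calculateMetricConfidence(sampleSize, expectedMinimum, idealSample int) int {
	switch {
	case sampleSize < expectedMinimum:
		return 0
	case sampleSize >= idealSample:
		return 100
	default:
		return 100 * (sampleSize - expectedMinimum) / (idealSample - expectedMinimum)
	}
}

func main() {
	fmt.Println(
		calculateMetricConfidence(5, 10, 100),   // below minimum → 0
		calculateMetricConfidence(55, 10, 100),  // halfway → 50
		calculateMetricConfidence(200, 10, 100), // above ideal → 100
	)
}
```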

func Clear

func Clear()

Clear removes all registered analyzers (useful for testing)

func DeepenHistory

func DeepenHistory(repoPath string, commits int) error

DeepenHistory fetches additional commits (useful for incremental deepening)

func EstimateTime

func EstimateTime(analyzer string, fileCount int) int

EstimateTime returns estimated scan time in seconds based on file count

func GroupByDependencies

func GroupByDependencies(analyzers []Analyzer) ([][]Analyzer, error)

GroupByDependencies groups analyzers into levels based on dependencies. Analyzers in the same level can run in parallel.

func List

func List() []string

List returns all registered analyzer names

func Register

func Register(a Analyzer)

Register adds an analyzer to the registry. This is typically called from analyzer init() functions.
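The init()-registration pattern can be sketched with a minimal registry of names (the real Register takes an Analyzer value, and registration happens as a side effect of importing each analyzer subpackage):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// A minimal registry sketch mirroring the Register/List pattern the docs
// describe; plain names stand in for Analyzer values.
var (
	mu       sync.Mutex
	registry = map[string]bool{}
)

func register(name string) {
	mu.Lock()
	defer mu.Unlock()
	registry[name] = true
}

func list() []string {
	mu.Lock()
	defer mu.Unlock()
	names := make([]string, 0, len(registry))
	for n := range registry {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// In a real analyzer package, this runs when the package is imported.
func init() { register("code-security") }

func main() {
	fmt.Println(list())
}
```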

func TotalEstimate

func TotalEstimate(analyzers []string, fileCount int) int

TotalEstimate returns total estimated time for all analyzers

func UnmarshalConfig

func UnmarshalConfig[T any](opts *ScanOptions, defaultCfg T) T

UnmarshalConfig unmarshals the FeatureConfig from ScanOptions into the target type. Returns defaultCfg if FeatureConfig is nil or unmarshaling fails.
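One plausible implementation of that contract — an assumption, not the package's actual code — round-trips the config map through JSON into the target type, starting from the defaults so omitted keys keep their default values:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unmarshalConfig sketches the documented contract: decode a feature
// config map into T, falling back to defaultCfg when the map is nil or
// malformed. The JSON round-trip is an assumed implementation detail.
func unmarshalConfig[T any](cfg map[string]interface{}, defaultCfg T) T {
	if cfg == nil {
		return defaultCfg
	}
	raw, err := json.Marshal(cfg)
	if err != nil {
		return defaultCfg
	}
	out := defaultCfg // start from defaults so omitted keys keep them
	if err := json.Unmarshal(raw, &out); err != nil {
		return defaultCfg
	}
	return out
}

// buildConfig is a hypothetical feature config for illustration.
type buildConfig struct {
	MaxWorkers int  `json:"max_workers"`
	FlakyTests bool `json:"flaky_tests"`
}

func main() {
	def := buildConfig{MaxWorkers: 4}
	got := unmarshalConfig(map[string]interface{}{"flaky_tests": true}, def)
	fmt.Println(got.MaxWorkers, got.FlakyTests) // default preserved for omitted key
}
```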

func Unshallow

func Unshallow(repoPath string) error

Unshallow fetches full history for a shallow clone

Types

type Analyzer

type Analyzer interface {
	// Name returns the analyzer identifier (e.g., "code-security")
	Name() string

	// Description returns a human-readable description
	Description() string

	// Run executes the analyzer and returns results
	Run(ctx context.Context, opts *ScanOptions) (*ScanResult, error)

	// Dependencies returns analyzers that must run first (e.g., "code-packages")
	Dependencies() []string

	// EstimateDuration returns estimated duration based on file count
	EstimateDuration(fileCount int) time.Duration

	// Requirements returns what this analyzer needs to run properly
	// Analyzers should return their requirements so the runner can validate
	// before execution and provide helpful error messages
	Requirements() AnalyzerRequirements
}

Analyzer defines the interface all analyzers must implement

func Get

func Get(name string) (Analyzer, bool)

Get returns an analyzer by name

func GetAll

func GetAll() []Analyzer

GetAll returns all registered analyzers

func GetByNames

func GetByNames(names []string) ([]Analyzer, error)

GetByNames returns analyzers for the given names. Returns an error if any analyzer is not found.

func TopologicalSort

func TopologicalSort(analyzers []Analyzer) ([]Analyzer, error)

TopologicalSort orders analyzers by dependencies. Analyzers with no dependencies come first, then analyzers that depend on them.

type AnalyzerRequirements

type AnalyzerRequirements struct {
	// RequiresFullHistory indicates the analyzer needs complete git history
	// (e.g., PR metrics, git blame analysis, rework rate detection)
	RequiresFullHistory bool `json:"requires_full_history,omitempty"`

	// RequiresGitHubAPI indicates the analyzer needs GitHub API access
	RequiresGitHubAPI bool `json:"requires_github_api,omitempty"`

	// RequiresSBOM indicates the analyzer depends on SBOM generation
	RequiresSBOM bool `json:"requires_sbom,omitempty"`

	// MinCommitCount is minimum commits needed for meaningful analysis (0 = no minimum)
	MinCommitCount int `json:"min_commit_count,omitempty"`

	// Description explains why these requirements exist
	Description string `json:"description,omitempty"`
}

AnalyzerRequirements defines what an analyzer needs to run properly

type CloneState

type CloneState struct {
	// IsShallow is true if the repository is a shallow clone
	IsShallow bool `json:"is_shallow"`

	// Depth is the clone depth (0 = full clone)
	Depth int `json:"depth"`

	// CommitCount is the number of commits available in the clone
	CommitCount int `json:"commit_count"`

	// CanUnshallow is true if we can fetch more history (has remote)
	CanUnshallow bool `json:"can_unshallow"`

	// RepoPath is the path to the repository
	RepoPath string `json:"repo_path,omitempty"`
}

CloneState provides information about the repository clone

func DetectCloneState

func DetectCloneState(repoPath string) *CloneState

DetectCloneState analyzes a repository to determine its clone state

type ConfidenceComponents

type ConfidenceComponents struct {
	// CommitCount score based on number of commits analyzed (0-100)
	CommitCount int `json:"commit_count"`

	// PeriodCoverage score based on coverage of analysis period (0-100)
	PeriodCoverage int `json:"period_coverage"`

	// Recency score based on freshness of data (0-100)
	Recency int `json:"recency"`

	// CloneDepth score based on shallow vs full clone (0-100)
	CloneDepth int `json:"clone_depth"`

	// DataCompleteness score based on available data sources (0-100)
	DataCompleteness int `json:"data_completeness"`
}

ConfidenceComponents contains the individual factor scores

type ConfidenceInput

type ConfidenceInput struct {
	// TotalCommits is the number of commits analyzed
	TotalCommits int

	// PeriodDays is the analysis period in days
	PeriodDays int

	// FirstCommitDate is when the first commit in period occurred
	FirstCommitDate time.Time

	// LastCommitDate is when the most recent commit occurred
	LastCommitDate time.Time

	// DaysWithCommits is how many unique days have commits
	DaysWithCommits int

	// IsShallowClone indicates if the repository is a shallow clone
	IsShallowClone bool

	// HasPRData indicates if PR/review data is available
	HasPRData bool

	// HasReviewData indicates if code review data is available
	HasReviewData bool

	// HasGitHubAPI indicates if GitHub API was successfully accessed
	HasGitHubAPI bool

	// ContributorCount is the number of unique contributors
	ContributorCount int
}

ConfidenceInput contains data needed to calculate confidence

type ConfidenceLevel

type ConfidenceLevel string

ConfidenceLevel represents confidence classification

const (
	ConfidenceHigh    ConfidenceLevel = "high"     // 80-100: Metrics are reliable
	ConfidenceMedium  ConfidenceLevel = "medium"   // 60-79: Reasonably reliable
	ConfidenceLow     ConfidenceLevel = "low"      // 40-59: Use with caution
	ConfidenceVeryLow ConfidenceLevel = "very_low" // 0-39: Insufficient data
)
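The documented bands translate directly into a classification function; this mirrors the constant comments above (80+, 60-79, 40-59, below 40):

```go
package main

import "fmt"

type ConfidenceLevel string

const (
	ConfidenceHigh    ConfidenceLevel = "high"
	ConfidenceMedium  ConfidenceLevel = "medium"
	ConfidenceLow     ConfidenceLevel = "low"
	ConfidenceVeryLow ConfidenceLevel = "very_low"
)

// levelFor maps a 0-100 score onto the documented confidence bands.
func levelFor(score int) ConfidenceLevel {
	switch {
	case score >= 80:
		return ConfidenceHigh
	case score >= 60:
		return ConfidenceMedium
	case score >= 40:
		return ConfidenceLow
	default:
		return ConfidenceVeryLow
	}
}

func main() {
	fmt.Println(levelFor(85), levelFor(65), levelFor(45), levelFor(10))
}
```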

type ConfidenceWeights

type ConfidenceWeights struct {
	CommitCount      float64
	PeriodCoverage   float64
	Recency          float64
	CloneDepth       float64
	DataCompleteness float64
}

ConfidenceWeights defines the weight of each factor in the overall score

func DefaultConfidenceWeights

func DefaultConfidenceWeights() ConfidenceWeights

DefaultConfidenceWeights returns the default weights for confidence calculation

type DORAConfidenceInput

type DORAConfidenceInput struct {
	// DeploymentCount is the number of deployments in the period
	DeploymentCount int

	// PeriodDays is the analysis period
	PeriodDays int

	// HasTags indicates if deployments were detected via git tags
	HasTags bool

	// HasGitHubReleases indicates if GitHub releases API was used
	HasGitHubReleases bool

	// CommitCount for the period
	CommitCount int

	// IsShallowClone indicates limited history
	IsShallowClone bool
}

DORAConfidenceInput contains data for DORA metric confidence calculation

type DataConfidence

type DataConfidence struct {
	// Overall confidence score (0-100)
	Score int `json:"score"`

	// Level is the confidence classification
	Level ConfidenceLevel `json:"level"`

	// Components shows the breakdown of confidence factors
	Components ConfidenceComponents `json:"components"`

	// Factors describes what affected the confidence score
	Factors []string `json:"factors,omitempty"`
}

DataConfidence represents confidence in analysis results based on data quality

func CalculateConfidence

func CalculateConfidence(input ConfidenceInput) DataConfidence

CalculateConfidence computes confidence based on data quality factors

func CalculateConfidenceWithWeights

func CalculateConfidenceWithWeights(input ConfidenceInput, weights ConfidenceWeights) DataConfidence

CalculateConfidenceWithWeights computes confidence with custom weights

func CalculateDORAConfidence

func CalculateDORAConfidence(input DORAConfidenceInput) DataConfidence

CalculateDORAConfidence computes confidence for DORA metrics

type Feature

type Feature struct {
	Name        string
	Description string
	Default     bool // Enabled by default?
}

Feature describes a feature within a super scanner

type FeatureConfidence

type FeatureConfidence struct {
	// Feature name
	Feature string `json:"feature"`

	// Confidence score and level
	DataConfidence

	// MetricConfidences contains per-metric confidence if applicable
	MetricConfidences map[string]int `json:"metric_confidences,omitempty"`
}

FeatureConfidence represents confidence for a specific scanner feature

type FinalizeResultOptions

type FinalizeResultOptions struct {
	Name      string
	Version   string
	Start     time.Time
	RepoPath  string
	OutputDir string
	Summary   interface{}
	Findings  interface{}
	Metadata  interface{}
}

FinalizeResultOptions contains options for finalizing a scan result

type Finding

type Finding struct {
	ID          string          `json:"id,omitempty"`
	Title       string          `json:"title,omitempty"`
	Description string          `json:"description,omitempty"`
	Severity    string          `json:"severity"`
	Category    string          `json:"category,omitempty"`
	Package     string          `json:"package,omitempty"`
	Version     string          `json:"version,omitempty"`
	File        string          `json:"file,omitempty"`
	Line        int             `json:"line,omitempty"`
	Confidence  string          `json:"confidence,omitempty"`
	References  []string        `json:"references,omitempty"`
	Metadata    json.RawMessage `json:"metadata,omitempty"`
}

Finding is a common finding structure

type IncrementalOptions

type IncrementalOptions struct {
	// Enabled indicates whether to use incremental scanning
	Enabled bool `json:"enabled"`

	// BaseCommit is the commit to diff against (typically the last scanned commit)
	BaseCommit string `json:"base_commit"`

	// CurrentCommit is the current commit being scanned
	CurrentCommit string `json:"current_commit"`

	// ChangedFiles is the list of files changed since BaseCommit
	// Scanners can use this to focus on only changed files
	ChangedFiles []string `json:"changed_files,omitempty"`

	// PreviousResultPath is the path to the previous scan results
	// Used for merging unchanged results with new results
	PreviousResultPath string `json:"previous_result_path,omitempty"`

	// ForceFullScan forces a full scan even in incremental mode
	// Useful when structural changes require re-analyzing everything
	ForceFullScan bool `json:"force_full_scan,omitempty"`
}

IncrementalOptions configures incremental scanning

type NativeRunner

type NativeRunner struct {
	ZeroHome   string
	Timeout    time.Duration
	Parallel   int
	OnProgress func(analyzer string, status Status, summary string)
}

NativeRunner executes Go-native analyzers

func NewNativeRunner

func NewNativeRunner(zeroHome string) *NativeRunner

NewNativeRunner creates a new native analyzer runner

func (*NativeRunner) RunAnalyzers

func (r *NativeRunner) RunAnalyzers(ctx context.Context, opts RunOptions) (*RunResult, error)

RunAnalyzers executes all configured analyzers for a repository

type Progress

type Progress struct {
	Current        string
	CompletedCount int
	TotalCount     int
	Results        map[string]*Result
	// contains filtered or unexported fields
}

Progress tracks analyzer progress for a repo

func NewProgress

func NewProgress(analyzers []string) *Progress

NewProgress creates a new progress tracker

func (*Progress) GetProgress

func (p *Progress) GetProgress() (completed, total int, current string)

GetProgress returns current progress info

func (*Progress) SetComplete

func (p *Progress) SetComplete(analyzer string, summary string, duration time.Duration)

SetComplete marks an analyzer as complete

func (*Progress) SetFailed

func (p *Progress) SetFailed(analyzer string, err error, duration time.Duration)

SetFailed marks an analyzer as failed

func (*Progress) SetRunning

func (p *Progress) SetRunning(analyzer string)

SetRunning marks an analyzer as running

func (*Progress) SetSkipped

func (p *Progress) SetSkipped(analyzer string)

SetSkipped marks an analyzer as skipped

type RepoMetadata

type RepoMetadata struct {
	// GitHubOrg is the organization or user that owns the repo (e.g., "expressjs")
	GitHubOrg string `json:"github_org"`

	// GitHubRepo is the repository name (e.g., "express")
	GitHubRepo string `json:"github_repo"`

	// RepoURL is the full GitHub URL (e.g., "https://github.com/expressjs/express")
	RepoURL string `json:"repo_url"`

	// CommitSHA is the exact commit being scanned
	CommitSHA string `json:"commit_sha"`

	// Branch is the branch name (e.g., "main", "master")
	Branch string `json:"branch"`

	// ScanProfile is the profile used for scanning (e.g., "all-quick")
	ScanProfile string `json:"scan_profile"`

	// ScannerVersion is the Zero version
	ScannerVersion string `json:"scanner_version"`
}

RepoMetadata contains GitHub repository information for evidence tracking

type RequirementError

type RequirementError struct {
	Analyzer    string `json:"analyzer"`
	Requirement string `json:"requirement"`
	Message     string `json:"message"`
	CanAutoFix  bool   `json:"can_auto_fix"`
}

RequirementError is returned when an analyzer's requirements are not met

func ValidateRequirements

func ValidateRequirements(analyzer Analyzer, cloneState *CloneState) *RequirementError

ValidateRequirements checks if an analyzer's requirements are met

func (*RequirementError) Error

func (e *RequirementError) Error() string

type Result

type Result struct {
	Analyzer string
	Status   Status
	Summary  string
	Duration time.Duration
	Error    error
	Output   json.RawMessage
}

Result holds the result of an analyzer run

type RunOptions

type RunOptions struct {
	RepoPath       string
	OutputDir      string
	Analyzers      []Analyzer
	SkipAnalyzers  []string
	Timeout        time.Duration
	Parallel       int
	FeatureConfigs map[string]map[string]interface{} // Analyzer name -> feature config
	RepoMetadata   *RepoMetadata                     // Repository metadata for evidence collection
	AutoUnshallow  bool                              // Automatically unshallow if analyzer requires full history
	Verbose        bool                              // Enable verbose logging
}

RunOptions configures an analyzer run

type RunResult

type RunResult struct {
	Success  bool
	Results  map[string]*Result
	Duration time.Duration
}

RunResult holds the result of running all analyzers on a repo

type Runner

type Runner struct {
	// contains filtered or unexported fields
}

Runner executes analyzers (wraps NativeRunner for backward compatibility)

func NewRunner

func NewRunner(zeroHome string) *Runner

NewRunner creates a new analyzer runner

func (*Runner) Run

func (r *Runner) Run(ctx context.Context, repo, profile string, progress *Progress, skipAnalyzers []string) (*RunResult, error)

Run executes all analyzers for a repository using native Go analyzers

type ScanOptions

type ScanOptions struct {
	// RepoPath is the path to the repository to scan
	RepoPath string

	// OutputDir is where to write results (e.g., .zero/repos/owner/repo/analysis)
	OutputDir string

	// SBOMPath is path to pre-generated SBOM (optional, for scanners that need it)
	SBOMPath string

	// Timeout is the maximum duration for this scanner
	Timeout time.Duration

	// Verbose enables verbose logging
	Verbose bool

	// OnStatus is called with progress messages during scanning
	// This allows scanners to report what they're doing in real-time
	OnStatus func(message string)

	// ExtraArgs contains scanner-specific options
	ExtraArgs map[string]string

	// FeatureConfig contains feature-specific configuration for super scanners
	FeatureConfig map[string]interface{}

	// Repository metadata for evidence collection
	RepoMetadata *RepoMetadata

	// CloneState provides info about the repository clone (shallow vs full)
	CloneState *CloneState

	// Incremental scanning support
	Incremental *IncrementalOptions
}

ScanOptions contains inputs for a scanner run

type ScanResult

type ScanResult struct {
	// Analyzer is the scanner name
	Analyzer string `json:"analyzer"`

	// Version is the scanner version
	Version string `json:"version"`

	// Timestamp is when the scan completed
	Timestamp string `json:"timestamp"`

	// DurationSeconds is how long the scan took
	DurationSeconds int `json:"duration_seconds"`

	// Repository is the repo that was scanned
	Repository string `json:"repository,omitempty"`

	// Summary contains aggregated findings info
	Summary json.RawMessage `json:"summary"`

	// Findings contains detailed findings (optional for some scanners)
	Findings json.RawMessage `json:"findings,omitempty"`

	// Metadata contains scanner-specific metadata
	Metadata json.RawMessage `json:"metadata,omitempty"`

	// Error contains error message if scan failed
	Error string `json:"error,omitempty"`
}

ScanResult represents scanner output

func FinalizeResult

func FinalizeResult(opts FinalizeResultOptions) (*ScanResult, error)

FinalizeResult creates and writes a ScanResult with common patterns. This consolidates the repeated finalization logic found in all analyzers.

func NewScanResult

func NewScanResult(analyzer, version string, start time.Time) *ScanResult

NewScanResult creates a new scan result with common fields populated

func (*ScanResult) SetFindings

func (r *ScanResult) SetFindings(findings interface{}) error

SetFindings marshals and sets the findings

func (*ScanResult) SetMetadata

func (r *ScanResult) SetMetadata(metadata interface{}) error

SetMetadata marshals and sets the metadata

func (*ScanResult) SetSummary

func (r *ScanResult) SetSummary(summary interface{}) error

SetSummary marshals and sets the summary

func (*ScanResult) WriteJSON

func (r *ScanResult) WriteJSON(path string) error

WriteJSON writes the result to a JSON file

type ScanSummary

type ScanSummary struct {
	TotalFindings int            `json:"total_findings,omitempty"`
	Critical      int            `json:"critical,omitempty"`
	High          int            `json:"high,omitempty"`
	Medium        int            `json:"medium,omitempty"`
	Low           int            `json:"low,omitempty"`
	Info          int            `json:"info,omitempty"`
	TotalPackages int            `json:"total_packages,omitempty"`
	ByType        map[string]int `json:"by_type,omitempty"`
	Status        string         `json:"status,omitempty"`
}

ScanSummary is a common summary structure used by many scanners

type Status

type Status string

Status represents analyzer execution status

const (
	StatusQueued   Status = "queued"
	StatusRunning  Status = "running"
	StatusComplete Status = "complete"
	StatusFailed   Status = "failed"
	StatusSkipped  Status = "skipped"
	StatusTimeout  Status = "timeout"
)

Directories

Path Synopsis
Package build provides the CI/CD optimization analyzer. Features: cost analysis, caching optimization, parallelization, flaky test detection.
Package codeownership provides code ownership analysis with benchmark tiers.
Package codepackages implements the consolidated code packages analyzer. This analyzer generates SBOMs and performs comprehensive package analysis.
Package codequality provides the consolidated code quality analyzer.
Package codesecurity provides the consolidated code security analyzer.
Package common provides shared utilities for analyzer implementations.
Package developerexperience provides the consolidated developer experience analyzer. Features: onboarding, tooling, workflow.
Package devops provides the consolidated DevOps and CI/CD security analyzer. Renamed from infra; now includes all infrastructure, CI/CD, and GitHub Actions security.
Package infraconfig provides the infrastructure configuration analyzer.
Package repogovernance provides the repository governance analyzer.
Package techid provides the consolidated technology identification analyzer. Includes AI/ML security and ML-BOM generation.
Package toolconfig provides the developer tool configuration analyzer.
