Documentation ¶
Index ¶
- Constants
- Variables
- func CollectOverriddenJobs(o *GitlabPipelineOriginDataFull, data *GitlabPipelineOriginData) []ir.OverriddenJob
- func FetchGitHubBranchProtection(host, owner, repo string, opts BranchFetchOptions) ([]ir.Branch, error)
- func FetchGitHubDefaultBranch(host, owner, repo string) (string, error)
- func ParseGitlabComponentPath(path string, instanceURL string) (string, string, string)
- func ScanGitHubWorkflows(projectPath, defaultBranch, rootDir, apiHost string, enrichActionMetadata bool) (pipeline *ir.NormalizedPipeline, partialErrors []error, err error)
- func ScanGitHubWorkflowsRemote(host, owner, repo, ref string, enrichActionMetadata bool, ...) (*ir.NormalizedPipeline, []error, error)
- func ScanGitHubWorkflowsWithProgress(projectPath, defaultBranch, rootDir, apiHost string, enrichActionMetadata bool, ...) (pipeline *ir.NormalizedPipeline, partialErrors []error, err error)
- func ToNormalizedPipeline(projectPath string, defaultBranch string, ciConfigPath string, ...) *ir.NormalizedPipeline
- func TotalProgressStepsForPipeline(pipeline *ir.NormalizedPipeline) int
- type BranchFetchOptions
- type GitHubMetadata
- type GitHubMetadataClient
- type GitlabPipelineImageData
- type GitlabPipelineImageDataCollection
- type GitlabPipelineImageInfo
- type GitlabPipelineImageMetrics
- type GitlabPipelineJobData
- type GitlabPipelineJobGitlabComponent
- type GitlabPipelineJobPlumberOrigin
- type GitlabPipelineOriginData
- type GitlabPipelineOriginDataCollection
- type GitlabPipelineOriginDataFull
- type GitlabPipelineOriginDataGeneric
- type GitlabPipelineOriginDataProjectSpecific
- type GitlabPipelineOriginMetrics
- type GitlabProtectionAnalysisData
- type GitlabProtectionData
- type GitlabProtectionDataBranch
- type GitlabProtectionDataCollection
- type GitlabProtectionMetrics
- type ProgressFunc
- type RuleParameters
Constants ¶
const (
	BehaviorWhenCommitIsAddedKeepApprovalsId = iota + 1
	BehaviorWhenCommitIsAddedRemoveCodeOwnerApprovalsId
	BehaviorWhenCommitIsAddedRemoveApprovalsId
)
Behavior when commit is added constants
const (
	BehaviorWhenCommitIsAddedKeepApprovalsText   = "Keep approvals"
	BehaviorWhenCommitIsAddedRemoveCodeOwnerText = "Remove approvals by Code Owners if their files changed"
	BehaviorWhenCommitIsAddedRemoveApprovalsText = "Remove all approvals"
)
Behavior when commit is added text values
const (
	SquashOptionNever      = "never"       // Never squash
	SquashOptionAlways     = "always"      // Always squash
	SquashOptionDefaultOn  = "default_on"  // Squash by default (can be turned off)
	SquashOptionDefaultOff = "default_off" // Don't squash by default (can be turned on)
)
GitLab squash option constants
const DataCollectionTypeGitlabPipelineImageVersion = "0.2.0"
const DataCollectionTypeGitlabPipelineOriginVersion = "0.2.0"
const (
DataCollectionTypeGitlabProtectionVersion = "0.2.0"
)
const EnvDisableGitHubAPI = "PLUMBER_DISABLE_GITHUB_API"
EnvDisableGitHubAPI, when set to a truthy value, forces the GitHub metadata client into degraded mode regardless of gh auth state. Set to "1" by the test suite to keep unit tests offline and fast; production code does not read this variable.
Variables ¶
var ErrAuthRequired = fmt.Errorf(
"GitHub authentication required for upstream-fetch mode (--github-url). " +
"Set up one of:\n" +
" export GH_TOKEN=<token> # personal token (see README §Step 3 for scope guidance)\n" +
" export GITHUB_TOKEN=<token> # auto-set in GitHub Actions runners\n" +
" gh auth login # recommended for local dev")
ErrAuthRequired is the actionable error surfaced when go-gh cannot resolve any auth credential. The message points the user at the three supported sources and the README section that documents scopes. Exported so cmd/ and control/ layers can detect the sentinel via errors.Is and short-circuit their normal wrap/log behaviour — a redundant logrus error log on top of cobra's "Error:" prefix on top of "analysis failed:" on top of "github api client:" produces a frame stack instead of the actionable message we want.
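The sentinel-detection pattern described above can be sketched as follows. This is a minimal illustration, not the package's code: `errAuthRequired` is a local stand-in for ErrAuthRequired, and `newAPIClient`, `runAnalysis`, and `userMessage` are hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// errAuthRequired is a local stand-in for the package's ErrAuthRequired
// sentinel so that this sketch compiles on its own.
var errAuthRequired = errors.New("GitHub authentication required")

// newAPIClient is a hypothetical lower layer that fails with the sentinel
// and wraps it with a "github api client:" frame.
func newAPIClient() error {
	return fmt.Errorf("github api client: %w", errAuthRequired)
}

// runAnalysis is a hypothetical middle layer that adds its own frame.
func runAnalysis() error {
	if err := newAPIClient(); err != nil {
		return fmt.Errorf("analysis failed: %w", err)
	}
	return nil
}

// userMessage returns the bare actionable message when the sentinel is
// anywhere in the wrap chain, and the full wrapped message otherwise.
func userMessage(err error) string {
	if errors.Is(err, errAuthRequired) {
		return errAuthRequired.Error()
	}
	return err.Error()
}

func main() {
	err := runAnalysis()
	fmt.Println("wrapped:", err)
	fmt.Println("shown:  ", userMessage(err))
}
```

Because every layer wraps with `%w`, `errors.Is` still finds the sentinel two frames up, which is what lets an outer layer drop the accumulated prefixes.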
Functions ¶
func CollectOverriddenJobs ¶ added in v0.3.0
func CollectOverriddenJobs(o *GitlabPipelineOriginDataFull, data *GitlabPipelineOriginData) []ir.OverriddenJob
CollectOverriddenJobs returns the jobs inherited from origin that were locally redefined with forbidden CI/CD keys. The IR uses it to expose override metadata to Rego policies; the PBOM generator reuses it so both paths share the same rule for what counts as an override.
func FetchGitHubBranchProtection ¶ added in v0.3.0
func FetchGitHubBranchProtection(host, owner, repo string, opts BranchFetchOptions) ([]ir.Branch, error)
FetchGitHubBranchProtection resolves branch-protection state for the names the caller asks about, with pagination/cost optimised for the typical "I just want main protected" config. See BranchFetchOptions for how ExactNames and Listing combine.
host is the GitHub API host (empty → api.github.com; non-empty → GHES). Auth is consumed from the same go-gh chain used elsewhere (GH_TOKEN / GH_ENTERPRISE_TOKEN / gh auth login). Without auth, or when the token lacks `repo` / Administration:read scope, the API returns 403/404; those responses degrade silently to whatever subset has already been collected (the Rego rule then sees fewer branches and may emit fewer findings; staying quiet is preferable to crashing for a partial-data control).
Mapping decisions GitHub → IR shape:
- branch.protected = true when the API marks the branch as such (the listing endpoint already merges classic Branch Protection and the newer Repository Rulesets, so this flag is correct regardless of which mechanism the repo uses).
- allowForcePush = api.AllowForcePushes.Enabled.
- codeOwnerApprovalRequired = api.RequiredPullRequestReviews.RequireCodeOwnerReviews.
- min*AccessLevel: deliberately left 0 on GitHub. GitLab uses a numeric 0..60 access ladder where 0 = "no one allowed" (strictest). The legacy ISSUE-505 rule treats config min=0 as "always violates", which would false-positive on every GitHub branch that simply requires PR reviews. GitHub has no equivalent ladder; encoding an approximation produced misleading findings. The other ISSUE-505 reasons (allowForcePush, codeOwnerApprovalRequired) still apply.
func FetchGitHubDefaultBranch ¶ added in v0.3.0
FetchGitHubDefaultBranch resolves the repo's default branch name via the REST API. Returns an empty string with no error when the repo can't be queried (degraded mode), which keeps the `defaultMustBeProtected` rule a silent no-op rather than a noisy crash.
func ParseGitlabComponentPath ¶
ParseGitlabComponentPath parses a GitLab component path to extract:
1. The instance (if any)
2. The clean path without instance prefix
3. The version (if any)
func ScanGitHubWorkflows ¶ added in v0.3.0
func ScanGitHubWorkflows(projectPath, defaultBranch, rootDir, apiHost string, enrichActionMetadata bool) (pipeline *ir.NormalizedPipeline, partialErrors []error, err error)
ScanGitHubWorkflows reads every .yml/.yaml file under <rootDir>/.github/workflows/ and aggregates them into a single NormalizedPipeline. Job names are namespaced by the workflow file base name ("ci/lint", "release/build", ...) so two workflows can expose identically-named jobs without clashing in the IR.
A missing workflows directory is not an error: the returned pipeline simply carries no jobs. Individual unreadable or unparseable files are returned in partialErrors so the caller can surface them without aborting the whole scan.
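The job-name namespacing can be illustrated with a small sketch. `namespacedJobName` is a hypothetical helper written for this example; the scanner's actual implementation is not shown in this documentation and may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// namespacedJobName is a hypothetical helper: it prefixes a job ID with the
// workflow file's base name, extension removed, separated by a slash.
func namespacedJobName(workflowPath, jobID string) string {
	base := filepath.Base(workflowPath)
	base = strings.TrimSuffix(base, filepath.Ext(base))
	return base + "/" + jobID
}

func main() {
	fmt.Println(namespacedJobName(".github/workflows/ci.yml", "lint"))
	fmt.Println(namespacedJobName(".github/workflows/release.yaml", "build"))
}
```

With this scheme, a `build` job in `ci.yml` and a `build` job in `release.yaml` map to distinct keys.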
func ScanGitHubWorkflowsRemote ¶ added in v0.3.0
func ScanGitHubWorkflowsRemote(host, owner, repo, ref string, enrichActionMetadata bool, progressFn ProgressFunc) (*ir.NormalizedPipeline, []error, error)
ScanGitHubWorkflowsRemote fetches `.github/workflows/*.{yml,yaml}` from a GitHub project via the Contents API and runs them through the same parser as the local scanner. Used by `plumber analyze --project owner/repo` (with optional --github-url for GHES) when the user is not inside a local checkout.
host empty → api.github.com. ref empty → repo's default branch. Auth resolves via the same go-gh chain the metadata client uses (GH_TOKEN / GH_ENTERPRISE_TOKEN / GITHUB_TOKEN / gh auth login).
Repo-side artefacts that need a local checkout (Dockerfiles, dependabot.yml, SECURITY.md, Renovate config) are NOT collected in remote mode — controls that depend on them simply see absent inputs and produce no findings. Same degraded-mode contract as missing API auth elsewhere.
func ScanGitHubWorkflowsWithProgress ¶ added in v0.3.0
func ScanGitHubWorkflowsWithProgress(projectPath, defaultBranch, rootDir, apiHost string, enrichActionMetadata bool, progressFn ProgressFunc) (pipeline *ir.NormalizedPipeline, partialErrors []error, err error)
ScanGitHubWorkflowsWithProgress mirrors ScanGitHubWorkflows but notifies the caller through progressFn as it works. The progress total is sized so the bar advances monotonically end-to-end:
step 1          Scanning workflow files
step 2..(1+N)   Resolving action <n> (N unique refs)
step 2+N        Scan complete
The last step (policy evaluation) is reported by the caller (RunGitHubAnalysis) using the same total so the bar keeps climbing. progressFn may be nil; callers that don't care about progress should call the plain ScanGitHubWorkflows variant.
func ToNormalizedPipeline ¶ added in v0.3.0
func ToNormalizedPipeline( projectPath string, defaultBranch string, ciConfigPath string, origin *GitlabPipelineOriginData, images *GitlabPipelineImageData, protection *GitlabProtectionAnalysisData, ) *ir.NormalizedPipeline
ToNormalizedPipeline projects the GitLab collector outputs onto a provider-agnostic IR. Phase 1b: only the fields required by the first rule ported to Rego (image/mutable_tag) are mapped. Additional fields (services, includes, branch protection, etc.) will be filled in as each rule is migrated.
This function is pure: no I/O, no external state. It is safe to call from tests with hand-built fixtures.
func TotalProgressStepsForPipeline ¶ added in v0.3.0
func TotalProgressStepsForPipeline(pipeline *ir.NormalizedPipeline) int
TotalProgressStepsForPipeline returns the grand total the caller (RunGitHubAnalysis / RunGitHubAnalysisRemote) should use when emitting its own progress updates for the post-scan phases, so the bar stays in sync with what the collector already reported.
Layout in slots, both modes:
1 "Scanning" (local) or "Listing" (remote)
2..(1+N) per-file fetch ticks (remote only;
WorkflowFileCount is 0 in local mode)
(2+N)..(1+N+M) per-action enrichment ticks (M = unique refs)
(2+N+M) "Resolving branch protection"
(3+N+M) "Evaluating policies"
(4+N+M) "Analysis complete"
Total = N + M + 4. WorkflowFileCount is populated by ScanGitHubWorkflowsRemote; local scans leave it at zero so the formula collapses to M + 4 there.
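The arithmetic of the slot layout can be checked with a short sketch. `totalProgressSteps` is a hypothetical stand-alone function that takes N and M directly; how the real function derives them from the pipeline is not covered here.

```go
package main

import "fmt"

// totalProgressSteps is a hypothetical stand-alone version of the formula.
// workflowFiles is N (zero for local scans); uniqueActionRefs is M.
func totalProgressSteps(workflowFiles, uniqueActionRefs int) int {
	// Four fixed slots: scanning or listing, resolving branch protection,
	// evaluating policies, analysis complete.
	const fixedSlots = 4
	return workflowFiles + uniqueActionRefs + fixedSlots
}

func main() {
	fmt.Println(totalProgressSteps(0, 5)) // local scan with 5 unique action refs
	fmt.Println(totalProgressSteps(3, 5)) // remote scan that fetched 3 workflow files
}
```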
Types ¶
type BranchFetchOptions ¶ added in v0.3.0
type BranchFetchOptions struct {
// ExactNames are branch names without glob characters. Each is
// fetched directly via /repos/{owner}/{repo}/branches/{name}; a
// 404 is treated as "branch doesn't exist on this repo" and
// silently skipped (the rego rule then has nothing to flag for
// that name). Duplicates are deduped.
ExactNames []string
// Listing, when true, additionally paginates the /branches
// endpoint (capped at maxBranchListingPages * 100 entries) so
// wildcard patterns can match. Off by default because the
// targeted path covers the typical config and avoids the
// pagination foot-gun.
Listing bool
// OnProgress, when non-nil, is invoked at user-meaningful
// checkpoints during the fetch: each listing page (with a
// running branches-seen count) and each per-branch protection-
// detail call. The caller is expected to forward these to its
// progress spinner as label updates at the same global slot;
// the messages are short single-line strings already shaped for
// terminal display. Used by the CLI to keep the bar's label
// alive during the otherwise-silent "Resolving branch
// protection" phase on large repos (grafana/grafana has 772
// branches across 8 listing pages, ~10s of API time).
OnProgress func(message string)
// InScope, when non-nil, gates the slow protection-detail calls
// during the listing pagination: branches for which InScope
// returns false are still added to the IR (Protected flag
// preserved from the listing so ISSUE-501 still has the data it
// needs) but the classic /protection + Rulesets endpoints are
// skipped for them. Saves hundreds of API calls on repos where
// the listing returns many protected branches that do not match
// any of the user's configured namePatterns (grafana's hundreds
// of `release-X.Y.Z` branches when the config asks for
// `release/*`, for example). The rego rule applies the same
// scope check at evaluation time; this just avoids paying for
// data the rule is going to discard.
InScope func(name string) bool
}
BranchFetchOptions controls which branches FetchGitHubBranchProtection reaches out for. The split between targeted and listing modes is deliberate: a single `?per_page=100` page on a busy repo (think grafana/grafana with thousands of `dependabot/*` and `release-*` branches) does not necessarily contain `main` — the alphabetical page-1 falls through long before we reach the default branch — so a naive listing silently produces "0 branches to protect" findings on every realistic config. Targeted /branches/{name} bypasses that problem entirely; listing is only used when the user has at least one wildcard pattern (e.g. `release/*`) we cannot enumerate ahead of time.
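As a rough illustration of the targeted/listing split, a caller might derive the options from its configured branch patterns along these lines. The `branchFetchOptions` type mirrors only two fields of BranchFetchOptions, `optionsFromPatterns` is a hypothetical helper, and treating `*`, `?` and `[` as the glob characters is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// branchFetchOptions mirrors only the two fields this sketch needs.
type branchFetchOptions struct {
	ExactNames []string
	Listing    bool
}

// optionsFromPatterns is a hypothetical helper. Patterns without glob
// characters become deduplicated exact names; any pattern with a glob
// character switches listing on. The glob character set is an assumption.
func optionsFromPatterns(patterns []string) branchFetchOptions {
	var opts branchFetchOptions
	seen := make(map[string]bool)
	for _, p := range patterns {
		if strings.ContainsAny(p, "*?[") {
			opts.Listing = true
			continue
		}
		if !seen[p] {
			seen[p] = true
			opts.ExactNames = append(opts.ExactNames, p)
		}
	}
	return opts
}

func main() {
	opts := optionsFromPatterns([]string{"main", "release/*", "main", "develop"})
	fmt.Println(opts.ExactNames, opts.Listing)
}
```

A config that names only `main` therefore never triggers pagination, which matches the "off by default" intent of Listing.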
type GitHubMetadata ¶ added in v0.3.0
type GitHubMetadata struct {
RepoArchived bool
RefExists bool
RefKind string
TagSha string
LatestTag string
LatestReleaseSha string
RefIsAmbiguous bool
Advisories []string
}
GitHubMetadata holds the facts the API-backed policies need to know about a single `owner/repo@ref` action reference.
- RepoArchived: the GitHub repo hosting the action is archived.
- RefExists: the ref (tag / branch / commit SHA) resolves.
- RefKind: "tag", "branch", "commit", "unknown".
- TagSha: when RefKind=="tag", the commit SHA the tag currently points at.
- LatestTag: the repo's newest release tag, "" when the API returns no releases.
- LatestReleaseSha: the SHA that tag resolves to upstream.
- RefIsAmbiguous: the ref resolves as BOTH a tag and a branch (ref-confusion).
- Advisories: security advisory identifiers from the GitHub Advisory Database whose affected version range covers this ref, if any.
Zero value (all fields empty / false) is explicitly "unknown" — it is also what the policies see when the API call failed. They should treat zero value as "I don't know" and stay silent.
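One way a consumer could honour the "zero value means unknown" contract is sketched below. The `gitHubMetadata` type is a local copy of the struct's fields, and `isUnknown` and `shouldFlagArchived` are hypothetical helpers; the real policies are written in Rego and are not reproduced here.

```go
package main

import "fmt"

// gitHubMetadata is a local copy of the documented fields.
type gitHubMetadata struct {
	RepoArchived     bool
	RefExists        bool
	RefKind          string
	TagSha           string
	LatestTag        string
	LatestReleaseSha string
	RefIsAmbiguous   bool
	Advisories       []string
}

// isUnknown reports whether m carries no information at all. The struct
// contains a slice, so it cannot be compared with ==; each field is
// checked explicitly instead.
func isUnknown(m gitHubMetadata) bool {
	return !m.RepoArchived && !m.RefExists && m.RefKind == "" &&
		m.TagSha == "" && m.LatestTag == "" && m.LatestReleaseSha == "" &&
		!m.RefIsAmbiguous && len(m.Advisories) == 0
}

// shouldFlagArchived is a hypothetical check keyed on positive evidence:
// it fires only when the API actually reported the repo as archived.
func shouldFlagArchived(m gitHubMetadata) bool {
	return m.RepoArchived
}

func main() {
	var failed gitHubMetadata // what a failed lookup looks like
	fmt.Println(isUnknown(failed), shouldFlagArchived(failed))

	archived := gitHubMetadata{RepoArchived: true, RefExists: true, RefKind: "tag"}
	fmt.Println(isUnknown(archived), shouldFlagArchived(archived))
}
```

Keying the check on `RepoArchived == true` rather than on `RefExists == false` is what keeps a failed lookup silent.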
type GitHubMetadataClient ¶ added in v0.3.0
type GitHubMetadataClient struct {
// contains filtered or unexported fields
}
GitHubMetadataClient resolves `owner/repo@ref` references against the real GitHub REST API (via github.com/cli/go-gh which reuses the installed `gh` CLI's stored credentials) and caches every answer so the collector never hits the API twice for the same key. Safe for concurrent use.
When `gh` is not authenticated — or go-gh cannot find a token — the client operates in degraded mode: every lookup returns an empty GitHubMetadata and Available() returns false. Policies are expected to key their deny rules on the positive evidence the client surfaces, so the degraded-mode output is a zero-finding run rather than a crash.
func NewGitHubMetadataClient ¶ added in v0.3.0
func NewGitHubMetadataClient() *GitHubMetadataClient
NewGitHubMetadataClient builds a client using the gh-CLI auth store. Returns a usable client even when authentication is missing — see Available() to check. Honors the PLUMBER_DISABLE_GITHUB_API env var which short-circuits the client into degraded mode regardless of auth state.
Targets api.github.com by default. For GitHub Enterprise Server instances, use NewGitHubMetadataClientForHost with the GHES API host (e.g. "ghes.example.com" or "ghes.example.com/api/v3").
func NewGitHubMetadataClientForHost ¶ added in v0.3.0
func NewGitHubMetadataClientForHost(host string) *GitHubMetadataClient
NewGitHubMetadataClientForHost is the GHES-aware constructor. When host is empty the client targets api.github.com via the default go-gh resolution chain (gh auth, GH_TOKEN, GITHUB_TOKEN). When host is non-empty the client is bound to that host — pair with a GH_TOKEN (or GH_ENTERPRISE_TOKEN) that has access to the GHES instance.
func (*GitHubMetadataClient) Available ¶ added in v0.3.0
func (c *GitHubMetadataClient) Available() bool
Available reports whether the client has a usable gh auth token.
func (*GitHubMetadataClient) Resolve ¶ added in v0.3.0
func (c *GitHubMetadataClient) Resolve(ownerRepoRef string) GitHubMetadata
Resolve looks up "owner/repo@ref" and returns what the API told us. Never returns an error — all failures degrade to "unknown" (zero-valued GitHubMetadata). Repeated calls for the same key return the cached value.
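The cache-and-degrade behaviour can be sketched with a small memoizing wrapper. Everything here is illustrative: `metadata`, `cachingResolver`, `newCachingResolver`, and `failingFetch` are hypothetical names, and caching failed lookups as well as successful ones is an assumption of this sketch rather than a documented property of the client.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// metadata is a trimmed stand-in for GitHubMetadata.
type metadata struct{ RefExists bool }

// cachingResolver memoizes lookups behind a mutex so it can be shared
// between goroutines.
type cachingResolver struct {
	mu    sync.Mutex
	cache map[string]metadata
	fetch func(key string) (metadata, error)
}

func newCachingResolver(fetch func(string) (metadata, error)) *cachingResolver {
	return &cachingResolver{cache: make(map[string]metadata), fetch: fetch}
}

// Resolve never returns an error: a failed fetch degrades to the zero
// value, which is then cached like any other answer (an assumption).
// Holding the lock during fetch serializes lookups; that keeps the sketch
// short but is not tuned for throughput.
func (r *cachingResolver) Resolve(key string) metadata {
	r.mu.Lock()
	defer r.mu.Unlock()
	if m, ok := r.cache[key]; ok {
		return m
	}
	m, err := r.fetch(key)
	if err != nil {
		m = metadata{}
	}
	r.cache[key] = m
	return m
}

// failingFetch simulates an API call that always fails.
func failingFetch(string) (metadata, error) {
	return metadata{RefExists: true}, errors.New("simulated API failure")
}

func main() {
	calls := 0
	r := newCachingResolver(func(string) (metadata, error) {
		calls++
		return metadata{RefExists: true}, nil
	})
	r.Resolve("actions/checkout@v4")
	r.Resolve("actions/checkout@v4")
	fmt.Println("fetch calls:", calls)
	fmt.Println("degraded:", newCachingResolver(failingFetch).Resolve("owner/repo@ref"))
}
```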
func (*GitHubMetadataClient) ResolveTagSha ¶ added in v0.3.0
func (c *GitHubMetadataClient) ResolveTagSha(ownerRepo, tag string) string
ResolveTagSha exposes the tag → SHA lookup publicly so the ref-version-mismatch enrichment can query the commented tag without going through the full Resolve() probe chain.
type GitlabPipelineImageData ¶
type GitlabPipelineImageData struct {
// Gitlab CI configuration
MergedConf *gitlab.GitlabCIConf
CiValid bool
CiMissing bool
// Default image and variables
DefaultImage string
InstanceVars map[string]string
GroupVars map[string]string
ProjectVars map[string]string
GlobalVars map[string]string
// Images found in the pipeline
Images []GitlabPipelineImageInfo `json:"images"`
}
type GitlabPipelineImageDataCollection ¶
type GitlabPipelineImageDataCollection struct{}
func (*GitlabPipelineImageDataCollection) Run ¶
func (dc *GitlabPipelineImageDataCollection) Run(project *gitlab.ProjectInfo, token string, conf *configuration.Configuration, pipelineOriginData *GitlabPipelineOriginData) (*GitlabPipelineImageData, *GitlabPipelineImageMetrics, error)
type GitlabPipelineImageInfo ¶
type GitlabPipelineJobData ¶
type GitlabPipelineJobGitlabComponent ¶
type GitlabPipelineJobGitlabComponent struct {
RepoFullPath string `json:"repoFullPath"`
RepoWebPath string `json:"repoWebPath"`
RepoName string `json:"repoName"`
ComponentName string `json:"componentName"`
ComponentLatestVersion string `json:"componentLatestVersion"`
ComponentIncludePath string `json:"componentIncludePath"`
}
GitlabPipelineJobGitlabComponent represents a GitLab component
type GitlabPipelineJobPlumberOrigin ¶ added in v0.1.31
type GitlabPipelineJobPlumberOrigin struct {
ID uint `json:"id"`
Path string `json:"path"`
LatestVersion string `json:"latestVersion"`
RepoDefaultBranch string `json:"repoDefaultBranch"`
}
GitlabPipelineJobPlumberOrigin represents a Plumber template origin
type GitlabPipelineOriginData ¶
type GitlabPipelineOriginData struct {
// Gitlab CI catalog data
GitlabCatalogResources []gitlab.CICatalogResource
GitlabCatalogComponentMap map[string]int // path -> index in catalogResources
VersionMap map[string][]string // path -> []versions (newest first)
// Gitlab CI configuration
Conf *gitlab.GitlabCIConf
ConfString string
MergedConf *gitlab.GitlabCIConf
MergedResponse *gitlab.MergedCIConfResponse
CiValid bool
CiMissing bool
CiErrors []string // Specific CI config errors for output
LimitedAnalysis bool
// Origins and jobs data
Origins []GitlabPipelineOriginDataFull
// CI conf content
JobMap map[string]*GitlabPipelineJobData
JobExtendsMap map[string][]string
JobHardcodedMap map[string]bool
JobHardcodedContent map[string]interface{}
}
type GitlabPipelineOriginDataCollection ¶
type GitlabPipelineOriginDataCollection struct{}
func (*GitlabPipelineOriginDataCollection) Run ¶
func (dc *GitlabPipelineOriginDataCollection) Run(project *gitlab.ProjectInfo, token string, conf *configuration.Configuration) (*GitlabPipelineOriginData, *GitlabPipelineOriginMetrics, error)
type GitlabPipelineOriginDataFull ¶
type GitlabPipelineOriginDataFull struct {
// Origin data generic and specific
GitlabPipelineOriginDataGeneric
GitlabPipelineOriginDataProjectSpecific
}
type GitlabPipelineOriginDataGeneric ¶
type GitlabPipelineOriginDataGeneric struct {
OriginType string `json:"originType"`
FromPlumber bool `json:"fromPlumber"`
FromGitlabCatalog bool `json:"fromGitlabCatalog"`
PlumberOrigin GitlabPipelineJobPlumberOrigin `json:"plumberOrigin"`
GitlabIncludeOrigin gitlab.IncludeOriginWithoutRef `json:"gitlabIncludeOrigin"`
GitlabComponent GitlabPipelineJobGitlabComponent `json:"gitlabComponent"`
OriginHash uint64 `json:"originHash"`
}
type GitlabPipelineOriginDataProjectSpecific ¶
type GitlabPipelineOriginDataProjectSpecific struct {
// Data specific to this project
Version string `json:"version"`
UpToDate bool `json:"upToDate"`
Nested bool `json:"nested"`
// Job related data
Jobs []GitlabPipelineJobData `json:"jobs"`
}
type GitlabPipelineOriginMetrics ¶
type GitlabPipelineOriginMetrics struct {
// Data metrics: jobs
JobTotal uint `json:"jobTotal"`
JobHardcoded uint `json:"jobHardcoded"`
// Data metrics: origin
OriginTotal uint `json:"originTotal"`
OriginComponent uint `json:"originComponent"`
OriginLocal uint `json:"originLocal"`
OriginProject uint `json:"originProject"`
OriginRemote uint `json:"originRemote"`
OriginTemplate uint `json:"originTemplate"`
OriginGitLabCatalog uint `json:"originGitLabCatalog"`
OriginOutdated uint `json:"originOutdated"`
}
type GitlabProtectionAnalysisData ¶
type GitlabProtectionAnalysisData struct {
Branches []string `json:"branches"`
BranchProtections []gitlab.BranchProtection `json:"branchProtections"`
MRApprovalRules []*glab.ProjectApprovalRule `json:"mrApprovalRules"`
MRApprovalSettings *glab.ProjectApprovals `json:"mrApprovalSettings"`
MRSettings *glab.Project `json:"mrSettings"`
ProjectMembers []gitlab.GitlabMemberInfo `json:"projectMembers"`
}
GitlabProtectionAnalysisData holds all the data needed by protection controls
type GitlabProtectionData ¶
type GitlabProtectionData struct {
Branches []*GitlabProtectionDataBranch `json:"branches"`
}
GitlabProtectionData holds the collected protection data
type GitlabProtectionDataBranch ¶
type GitlabProtectionDataBranch struct {
BranchName string `json:"branchName"`
Default bool `json:"default"`
}
GitlabProtectionDataBranch holds branch information
type GitlabProtectionDataCollection ¶
type GitlabProtectionDataCollection struct{}
GitlabProtectionDataCollection handles protection data collection
func (*GitlabProtectionDataCollection) Run ¶
func (dc *GitlabProtectionDataCollection) Run( project *gitlab.ProjectInfo, token string, conf *configuration.Configuration, ) (*GitlabProtectionAnalysisData, *GitlabProtectionMetrics, error)
Run fetches all GitLab protection data needed by the controls
type GitlabProtectionMetrics ¶
type GitlabProtectionMetrics struct {
Branches int `json:"branches"`
}
GitlabProtectionMetrics holds metrics about protection data
type ProgressFunc ¶ added in v0.3.0
ProgressFunc is the signature callers use to observe the progress of long-running collector operations — currently the GitHub API enrichment phase.
type RuleParameters ¶ added in v0.3.0
type RuleParameters struct {
// pull_request rule
RequireCodeOwnerReview bool `json:"require_code_owner_review,omitempty"`
}
RuleParameters is the union of parameter shapes across rule types we care about. JSON unmarshal populates only the fields present in the source — extras are silently ignored.
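The "extras are silently ignored" behaviour is standard `encoding/json` decoding into a struct, as this sketch shows. The `ruleParameters` type is a local copy of the struct, `parseRuleParameters` is a hypothetical helper, and the extra key in the sample payload is included purely for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ruleParameters is a local copy of the documented struct.
type ruleParameters struct {
	RequireCodeOwnerReview bool `json:"require_code_owner_review,omitempty"`
}

// parseRuleParameters is a hypothetical helper around json.Unmarshal.
// Keys with no matching struct field are dropped without an error.
func parseRuleParameters(raw []byte) (ruleParameters, error) {
	var p ruleParameters
	err := json.Unmarshal(raw, &p)
	return p, err
}

func main() {
	// The second key has no matching field and is ignored.
	raw := []byte(`{"require_code_owner_review": true, "required_approving_review_count": 2}`)
	p, err := parseRuleParameters(raw)
	fmt.Println(p.RequireCodeOwnerReview, err)
}
```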