shell

package
v1.5.1 Latest
Warning: This package is not in the latest version of its module.
Published: Jun 5, 2026 License: MIT Imports: 2 Imported by: 0

Documentation

Overview

Package shell parses Bash command strings into normalized segments for policy evaluation: subshell extraction, quote masking, heredoc handling, and pipeline splitting.

Constants

This section is empty.

Variables

var (
	// EnvVarRE matches a run of NAME=value assignments at the start of a
	// segment, each followed by whitespace (so a *trailing* command exists).
	// The unquoted-value branch deliberately excludes `(` so that a bash array
	// literal `NAME=(...)` is not mistaken for `NAME=` followed by `(...)`,
	// and excludes `;` so we never grab a value across a statement boundary.
	EnvVarRE            = regexp.MustCompile(`^(\w+=(?:"[^"]*"|'[^']*'|[^\s(;]*)\s+)+`)
	CommentLineRE       = regexp.MustCompile(`(?m)^[ \t]*#[^\n]*(?:\n|$)`)
	RedirectRE          = regexp.MustCompile(`>\s*\S|>>`)
	SafeRedirectRE      = regexp.MustCompile(`(?:[12]\s*)?>\s*/dev/null\b|2\s*>\s*&\s*1|>\s*&\s*2`)
	InputRedirectRE     = regexp.MustCompile(`<\s*\S`)
	SafeInputRedirectRE = regexp.MustCompile(`<<<?|<\s*/dev/null\b`)
	HeredocRE           = regexp.MustCompile(`^[^\n]*<<`)
	ArithBodyRE         = regexp.MustCompile(`^\s*\(`)
)

Exported regexes used by callers to detect shell features in raw command strings (env-var prefixes, comments, redirections, heredocs, arithmetic).
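
As an illustration of how a caller might apply these patterns, the sketch below copies EnvVarRE, RedirectRE, and SafeRedirectRE verbatim into a standalone program. The helper names stripEnvPrefix and hasUnsafeRedirect are hypothetical and not part of this package; the order of operations (drop safe redirects first, then test for any remaining redirect) is an assumption about typical use.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns copied verbatim from the var block above.
var (
	envVarRE       = regexp.MustCompile(`^(\w+=(?:"[^"]*"|'[^']*'|[^\s(;]*)\s+)+`)
	redirectRE     = regexp.MustCompile(`>\s*\S|>>`)
	safeRedirectRE = regexp.MustCompile(`(?:[12]\s*)?>\s*/dev/null\b|2\s*>\s*&\s*1|>\s*&\s*2`)
)

// stripEnvPrefix (hypothetical helper) drops a leading run of NAME=value
// assignments, but only when whitespace and therefore a command follows.
func stripEnvPrefix(seg string) string {
	return envVarRE.ReplaceAllString(seg, "")
}

// hasUnsafeRedirect (hypothetical helper) removes the redirects that
// SafeRedirectRE recognizes and reports whether any redirect is left.
func hasUnsafeRedirect(seg string) bool {
	return redirectRE.MatchString(safeRedirectRE.ReplaceAllString(seg, ""))
}

func main() {
	fmt.Printf("%q\n", stripEnvPrefix(`FOO=bar BAZ="x y" make build`))
	fmt.Printf("%q\n", stripEnvPrefix(`ARR=(a b) echo hi`)) // array literal stays intact
	fmt.Println(hasUnsafeRedirect(`echo hi > /dev/null 2>&1`))
	fmt.Println(hasUnsafeRedirect(`echo hi > out.txt`))
}
```

Because the unquoted-value branch excludes `(` and `;`, neither `ARR=(a b) echo hi` nor `A=1; rm x` loses anything to the prefix match.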

Functions

func ExtractFindExecBody added in v1.3.0

func ExtractFindExecBody(seg string) []string

ExtractFindExecBody extracts all command portions from find -exec or -execdir clauses. It returns a slice of bodies found; an empty slice means no matches.

func ExtractHereStringBody added in v1.3.0

func ExtractHereStringBody(seg string) (string, bool)

ExtractHereStringBody extracts the inner command body from "sh <<< 'body'" or "bash <<< body" patterns where the opener is an interpreter (sh/bash/zsh/dash/ash). It returns (body, true) when a match is found, or ("", false) otherwise. Only the first match is returned. The right-hand string is parsed respecting single quotes, double quotes, and unquoted forms.

func ExtractHeredocs added in v1.0.1

func ExtractHeredocs(cmd string) (string, bool)

ExtractHeredocs removes heredoc bodies from cmd, keeping only the opener lines. It returns the processed string and true when all heredocs terminated normally. If an unterminated heredoc is found, it returns false — the caller should treat the command as requiring manual review rather than silently dropping content.
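
The contract described above can be sketched as follows. This is an illustrative re-implementation, not the package's code. It assumes at most one heredoc per opener line, a delimiter made of word characters, and that the lines kept so far are returned when a heredoc is unterminated. The names openerRE and extractHeredocs are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// openerRE (hypothetical) matches "<<" or "<<-" followed by an optionally
// quoted delimiter word. The leading (?:^|[^<]) keeps "<<<" here-strings
// from being read as heredoc openers.
var openerRE = regexp.MustCompile(`(?:^|[^<])<<(-?)\s*['"]?(\w+)`)

// extractHeredocs keeps opener lines and drops each heredoc body together
// with its terminator line. It returns false for an unterminated heredoc.
func extractHeredocs(cmd string) (string, bool) {
	lines := strings.Split(cmd, "\n")
	var kept []string
	for i := 0; i < len(lines); i++ {
		kept = append(kept, lines[i])
		m := openerRE.FindStringSubmatch(lines[i])
		if m == nil {
			continue
		}
		stripTabs, delim := m[1] == "-", m[2]
		closed := false
		// Skip body lines until the terminator; i ends on the terminator.
		for i++; i < len(lines); i++ {
			line := lines[i]
			if stripTabs {
				line = strings.TrimLeft(line, "\t") // "<<-" allows tab-indented terminators
			}
			if line == delim {
				closed = true
				break
			}
		}
		if !closed {
			return strings.Join(kept, "\n"), false
		}
	}
	return strings.Join(kept, "\n"), true
}

func main() {
	out, ok := extractHeredocs("cat <<EOF\nrm -rf /\nEOF\necho done")
	fmt.Printf("%q %v\n", out, ok)
	out, ok = extractHeredocs("cat <<EOF\nnever closed")
	fmt.Printf("%q %v\n", out, ok)
}
```

A caller following the guidance above would route the command to manual review whenever the second return value is false.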

func ExtractProcSubst added in v1.3.0

func ExtractProcSubst(cmd string) (bodies []string, outer string)

ExtractProcSubst returns all <(...) and >(...) bodies and the outer command with each occurrence replaced by __PROCSUBST__. The scan is byte-level, which is safe for UTF-8 because all sentinels are ASCII and multibyte sequences never contain ASCII bytes.

func ExtractShellCBody added in v1.3.0

func ExtractShellCBody(seg string) []string

ExtractShellCBody extracts all inner command bodies from "sh -c '<body>'" style invocations. It returns a slice of bodies found; an empty slice means no matches.

func ExtractSubshells

func ExtractSubshells(cmd string) (bodies []string, outer string)

ExtractSubshells returns all $(...) bodies and the outer command with each occurrence replaced by __SUBSHELL__. The scan is byte-level, which is safe for UTF-8 because all sentinels are ASCII and multibyte sequences never contain ASCII bytes.
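
A byte-level scan of this kind might look like the sketch below. It is an illustrative re-implementation under stated assumptions, not the package's code: only outermost $(...) spans are collected (nested ones stay inside the body), parentheses are balanced by depth counting without interpreting quotes, and an unterminated $( is copied through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// extractSubshells (hypothetical re-implementation) returns the body of
// each outermost $(...) and the command with those spans replaced by the
// __SUBSHELL__ placeholder.
func extractSubshells(cmd string) (bodies []string, outer string) {
	var out strings.Builder
	i := 0
	for i < len(cmd) {
		if cmd[i] == '$' && i+1 < len(cmd) && cmd[i+1] == '(' {
			depth, j := 1, i+2
			// Advance j past the matching close paren, tracking nesting.
			for j < len(cmd) && depth > 0 {
				switch cmd[j] {
				case '(':
					depth++
				case ')':
					depth--
				}
				j++
			}
			if depth == 0 {
				bodies = append(bodies, cmd[i+2:j-1])
				out.WriteString("__SUBSHELL__")
				i = j
				continue
			}
		}
		// Copying single bytes is safe: '$', '(' and ')' are ASCII and can
		// never appear inside a multibyte UTF-8 sequence.
		out.WriteByte(cmd[i])
		i++
	}
	return bodies, out.String()
}

func main() {
	bodies, outer := extractSubshells("echo $(date) and $(ls $(pwd))")
	fmt.Printf("%q\n%q\n", bodies, outer)
}
```

In this sketch a nested substitution such as `$(pwd)` remains inside its parent body, so a caller would run the extraction again on each body to reach it.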

func ExtractXargsShellCBody added in v1.3.0

func ExtractXargsShellCBody(seg string) []string

ExtractXargsShellCBody extracts all inner command bodies from xargs patterns that end with "sh -c '...'". It returns a slice of bodies found; an empty slice means no matches.

func FindSplitBoundaries

func FindSplitBoundaries(cmd string) [][2]int

FindSplitBoundaries scans cmd for pipeline delimiters (|, ||, &&, ;, \n), skipping content inside quotes. Returns [start, end) index pairs.
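
The quote-aware scan can be sketched as below. This is a hypothetical re-implementation, not the package's code. It assumes each returned pair is the half-open byte span of one delimiter, and that a backslash escapes the next byte outside quotes and inside double quotes.

```go
package main

import "fmt"

// findSplitBoundaries (hypothetical re-implementation) returns the
// [start, end) byte span of every |, ||, &&, ; or newline that lies
// outside single and double quotes.
func findSplitBoundaries(cmd string) [][2]int {
	var spans [][2]int
	var quote byte // 0 outside quotes, otherwise the opening quote byte
	for i := 0; i < len(cmd); i++ {
		c := cmd[i]
		switch {
		case quote != 0:
			if c == '\\' && quote == '"' {
				i++ // skip the escaped byte inside double quotes
			} else if c == quote {
				quote = 0
			}
		case c == '\\':
			i++ // skip the escaped byte outside quotes
		case c == '\'' || c == '"':
			quote = c
		case i+1 < len(cmd) && (cmd[i:i+2] == "||" || cmd[i:i+2] == "&&"):
			spans = append(spans, [2]int{i, i + 2})
			i++ // consume the second byte of the two-byte operator
		case c == '|' || c == ';' || c == '\n':
			spans = append(spans, [2]int{i, i + 1})
		}
	}
	return spans
}

func main() {
	fmt.Println(findSplitBoundaries(`a | b && c; echo "x | y"`))
}
```

The two-byte operators are tested before the single `|` so that `||` is reported once rather than as two pipes.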

func HasShellCNonLiteralBody added in v1.3.0

func HasShellCNonLiteralBody(seg string) bool

HasShellCNonLiteralBody returns true when seg contains a shell -c invocation where the -c flag is not followed by a literal quoted string. This catches dangerous patterns like "bash -c $cmd" or "bash -c $(echo foo)" that bypass the normal shell-c recursion checks.
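
One way to approximate this check is a single regular expression, shown below as a hedged sketch. The pattern nonLiteralRE and the interpreter list (borrowed from the ExtractHereStringBody description) are assumptions, not the package's implementation. The sketch also treats any quoted argument as literal, so `bash -c "$cmd"` is not flagged here.

```go
package main

import (
	"fmt"
	"regexp"
)

// nonLiteralRE (hypothetical) matches an interpreter followed by -c where
// the next argument does not open with a quote, or where -c ends the input.
var nonLiteralRE = regexp.MustCompile(`\b(?:sh|bash|zsh|dash|ash)\s+-c(?:\s+[^\s'"]|\s*$)`)

// hasShellCNonLiteralBody reports whether seg contains a "shell -c"
// invocation whose body is not a quoted string.
func hasShellCNonLiteralBody(seg string) bool {
	return nonLiteralRE.MatchString(seg)
}

func main() {
	for _, seg := range []string{
		`bash -c $cmd`,
		`bash -c $(echo foo)`,
		`sh -c 'ls -l'`,
	} {
		fmt.Printf("%-22s %v\n", seg, hasShellCNonLiteralBody(seg))
	}
}
```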

func IsSafePath added in v1.3.0

func IsSafePath(pathSpec string, exemptPaths []string) bool

IsSafePath returns true when the path spec's source component (left of the first ':') starts with an exempt prefix and contains no ".." component. Leading and trailing quote characters are stripped first so that both -v /tmp/x and -v "/tmp/x" are treated identically.
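
The rule reads naturally as three steps: strip quotes, isolate the source component, then test for traversal and prefix. The sketch below follows that reading. It is a hypothetical re-implementation, and the plain strings.HasPrefix comparison is an assumption about how "starts with an exempt prefix" is evaluated.

```go
package main

import (
	"fmt"
	"strings"
)

// isSafePath (hypothetical re-implementation) applies the documented rule
// to the part of pathSpec left of the first ':'.
func isSafePath(pathSpec string, exemptPaths []string) bool {
	spec := strings.Trim(pathSpec, `"'`) // drop leading and trailing quote characters
	src, _, _ := strings.Cut(spec, ":")  // source component of source:dest
	for _, part := range strings.Split(src, "/") {
		if part == ".." {
			return false // traversal component
		}
	}
	for _, prefix := range exemptPaths {
		if strings.HasPrefix(src, prefix) {
			return true
		}
	}
	return false
}

func main() {
	exempt := []string{"/tmp/"}
	fmt.Println(isSafePath(`"/tmp/x:/data"`, exempt))
	fmt.Println(isSafePath("/tmp/../etc:/data", exempt))
	fmt.Println(isSafePath("/etc/passwd", exempt))
}
```

With a plain prefix test, an exempt entry of `/tmp` would also accept `/tmpevil`; in this sketch, listing prefixes with a trailing slash avoids that.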

func JoinContinuations

func JoinContinuations(cmd string) string

JoinContinuations replaces line-continuation sequences (\<newline>) with a space.
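
A minimal sketch of the described behavior, assuming nothing beyond the sentence above (the function name joinContinuations is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// joinContinuations (hypothetical re-implementation) replaces every
// backslash-newline pair with a single space.
func joinContinuations(cmd string) string {
	return strings.ReplaceAll(cmd, "\\\n", " ")
}

func main() {
	fmt.Printf("%q\n", joinContinuations("ls -l\\\n-a"))
}
```

Bash itself deletes the backslash-newline pair without inserting anything; substituting a space keeps the tokens on either side separate for later splitting.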

func RemoveQuotedContent

func RemoveQuotedContent(cmd string) string

RemoveQuotedContent masks content inside single/double quotes with '_' so shell operators inside strings are not mistaken for command boundaries.
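
The masking step might be sketched as follows. This is an illustrative re-implementation under assumptions: the quote characters themselves are kept, each masked byte becomes one `_` so byte offsets stay aligned with the original string, and escaped quotes inside double quotes are not handled.

```go
package main

import "fmt"

// removeQuotedContent (hypothetical re-implementation) overwrites every
// byte between matching quote characters with '_'.
func removeQuotedContent(cmd string) string {
	out := []byte(cmd)
	var quote byte // 0 outside quotes, otherwise the opening quote byte
	for i := 0; i < len(out); i++ {
		c := out[i]
		switch {
		case quote == 0:
			if c == '\'' || c == '"' {
				quote = c
			}
		case c == quote:
			quote = 0 // closing quote is kept
		default:
			out[i] = '_' // byte inside quotes
		}
	}
	return string(out)
}

func main() {
	fmt.Println(removeQuotedContent(`echo "a | b" && ls`))
}
```

Keeping the length unchanged means offsets found in the masked string can be applied directly to the original command.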

func SplitPipeline

func SplitPipeline(cmd string) []string

SplitPipeline splits cmd at shell pipeline boundaries and returns cleaned segments with env-var prefixes, comment-only entries, and blanks removed. A segment that starts with '-' is appended to the previous segment to handle flag-only pipeline components.

func SplitPipelineDetailed added in v1.5.0

func SplitPipelineDetailed(cmd string) ([]string, [][]string)

SplitPipelineDetailed is SplitPipeline plus a parallel slice of the env-var assignment names that prefixed each segment in the original input. The names slice is the same length as the segments slice; entries are nil when a segment had no leading assignments. Callers that need to enforce a policy over assignment *names* (e.g. blocking LD_PRELOAD=evil ls) consume the names here; callers that only care about the command itself can keep using SplitPipeline. A standalone NAME=value with no trailing command is left intact in the segment (and produces no name entry), as in SplitPipeline.
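
To show how assignment names could be recovered for a single segment, the sketch below peels one assignment at a time using a pattern derived from EnvVarRE. The names assignRE and splitEnvPrefix and the blocked-name map are hypothetical; the package's own extraction logic is not shown in this documentation.

```go
package main

import (
	"fmt"
	"regexp"
)

// assignRE (hypothetical) is the single-assignment form of EnvVarRE with
// the variable name captured.
var assignRE = regexp.MustCompile(`^(\w+)=(?:"[^"]*"|'[^']*'|[^\s(;]*)\s+`)

// splitEnvPrefix returns the names of the leading NAME=value assignments
// and the remainder of the segment. A standalone NAME=value with nothing
// after it does not match and is left in rest.
func splitEnvPrefix(seg string) (names []string, rest string) {
	rest = seg
	for {
		m := assignRE.FindStringSubmatch(rest)
		if m == nil {
			return names, rest
		}
		names = append(names, m[1])
		rest = rest[len(m[0]):]
	}
}

func main() {
	blocked := map[string]bool{"LD_PRELOAD": true} // example policy, not from the package
	names, rest := splitEnvPrefix(`LD_PRELOAD=evil.so LANG=C ls -l`)
	for _, n := range names {
		if blocked[n] {
			fmt.Printf("blocked assignment %s before %q\n", n, rest)
		}
	}
}
```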

func StripComments

func StripComments(cmd string) string

StripComments removes shell comment lines from cmd.
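
Since CommentLineRE is exported, comment-line removal can be reproduced with a single replace, as sketched below. Whether StripComments does exactly this is an assumption; the pattern is copied verbatim from the Variables section.

```go
package main

import (
	"fmt"
	"regexp"
)

// commentLineRE is copied verbatim from CommentLineRE above.
var commentLineRE = regexp.MustCompile(`(?m)^[ \t]*#[^\n]*(?:\n|$)`)

// stripComments (hypothetical re-implementation) deletes lines whose first
// non-blank character is '#', including the trailing newline.
func stripComments(cmd string) string {
	return commentLineRE.ReplaceAllString(cmd, "")
}

func main() {
	in := "# setup\necho hi # trailing\n  # indented\nls"
	fmt.Printf("%q\n", stripComments(in))
}
```

Only whole comment lines match this pattern; a trailing `# ...` after a command is left in place.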

func StripExemptPaths added in v1.0.1

func StripExemptPaths(seg string, flagRE *regexp.Regexp, exemptPaths []string) string

StripExemptPaths rewrites flag+path occurrences in seg where the captured path is safe: starts with one of exemptPaths and contains no ".." traversal. Each matched occurrence is replaced with __SAFE_PATH__. flagRE must have exactly one capture group that captures the path or colon-separated mount spec (source:dest). The docker-volume scope (matching -v / --volume flags) is one example use case.

func StripLeadingPath added in v1.5.0

func StripLeadingPath(seg string) string

StripLeadingPath rewrites the first token of seg to its basename when the token is a path (contains a `/`). Bash treats any command word containing a slash as a direct path lookup rather than a PATH search, so the basename names the program being run: `/bin/sed`, `/usr/local/bin/sed`, `./tools/sed`, and `tools/sed` all run a program named `sed`, and allow/deny rules written against the program name should match them all. Tokens without a slash are returned unchanged. The rest of seg (arguments, flags) is left untouched; only the leading command word is normalized.
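
The normalization can be sketched in a few lines. This is a hypothetical re-implementation that assumes seg has no leading whitespace and that tokens are separated by spaces or tabs.

```go
package main

import (
	"fmt"
	"strings"
)

// stripLeadingPath (hypothetical re-implementation) replaces the first
// token with its basename when that token contains a '/'.
func stripLeadingPath(seg string) string {
	end := strings.IndexAny(seg, " \t")
	if end < 0 {
		end = len(seg) // the segment is a single token
	}
	first := seg[:end]
	slash := strings.LastIndexByte(first, '/')
	if slash < 0 {
		return seg // no slash: a PATH lookup, leave unchanged
	}
	return first[slash+1:] + seg[end:]
}

func main() {
	fmt.Println(stripLeadingPath("/usr/local/bin/sed -i s/a/b/ f.txt"))
	fmt.Println(stripLeadingPath("./tools/sed x"))
	fmt.Println(stripLeadingPath("sed -e s/a/b/ f.txt"))
}
```

Slashes in later arguments, such as the `s/a/b/` expression, are untouched because only the first token is inspected.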

func StripSafeRedirects added in v1.5.0

func StripSafeRedirects(masked string, safeTargets []string) string

StripSafeRedirects rewrites every `> target`, `>> target`, `N> target`, and `N>> target` occurrence in masked to a sentinel when target's source path passes IsSafePath against safeTargets. The masked form (quotes already replaced with `_`) is required so that operators inside strings are not mistaken for redirects. Unsafe or non-matching redirects are left untouched so the existing RedirectRE check still catches them. Note: because the target span is replaced, deny patterns written against literal redirect targets (e.g. `> /tmp/foo`) won't match; none exist today, but adding one would silently no-op.

func StripWrappers added in v1.0.1

func StripWrappers(seg string, extra []string) string

StripWrappers iteratively removes leading process-wrapper prefixes from seg. The built-in wrappers (timeout, time, nice, nohup, stdbuf, bare xargs) are always stripped. extra lists additional single-command wrapper names from config.

Types

This section is empty.
