Documentation ¶
Overview ¶
Package shell parses Bash command strings into normalized segments for policy evaluation: subshell extraction, quote masking, heredoc handling, and pipeline splitting.
Index ¶
- Variables
- func ExtractFindExecBody(seg string) []string
- func ExtractHereStringBody(seg string) (string, bool)
- func ExtractHeredocs(cmd string) (string, bool)
- func ExtractProcSubst(cmd string) (bodies []string, outer string)
- func ExtractShellCBody(seg string) []string
- func ExtractSubshells(cmd string) (bodies []string, outer string)
- func ExtractXargsShellCBody(seg string) []string
- func FindSplitBoundaries(cmd string) [][2]int
- func HasShellCNonLiteralBody(seg string) bool
- func IsSafePath(pathSpec string, exemptPaths []string) bool
- func JoinContinuations(cmd string) string
- func RemoveQuotedContent(cmd string) string
- func SplitPipeline(cmd string) []string
- func SplitPipelineDetailed(cmd string) ([]string, [][]string)
- func StripComments(cmd string) string
- func StripExemptPaths(seg string, flagRE *regexp.Regexp, exemptPaths []string) string
- func StripLeadingPath(seg string) string
- func StripSafeRedirects(masked string, safeTargets []string) string
- func StripWrappers(seg string, extra []string) string
Constants ¶
This section is empty.
Variables ¶
var (
	// EnvVarRE matches a run of NAME=value assignments at the start of a
	// segment, each followed by whitespace (so a *trailing* command exists).
	// The unquoted-value branch deliberately excludes `(` so that a bash array
	// literal `NAME=(...)` is not mistaken for `NAME=` followed by `(...)`,
	// and excludes `;` so we never grab a value across a statement boundary.
	EnvVarRE            = regexp.MustCompile(`^(\w+=(?:"[^"]*"|'[^']*'|[^\s(;]*)\s+)+`)
	CommentLineRE       = regexp.MustCompile(`(?m)^[ \t]*#[^\n]*(?:\n|$)`)
	RedirectRE          = regexp.MustCompile(`>\s*\S|>>`)
	SafeRedirectRE      = regexp.MustCompile(`(?:[12]\s*)?>\s*/dev/null\b|2\s*>\s*&\s*1|>\s*&\s*2`)
	InputRedirectRE     = regexp.MustCompile(`<\s*\S`)
	SafeInputRedirectRE = regexp.MustCompile(`<<<?|<\s*/dev/null\b`)
	HeredocRE           = regexp.MustCompile(`^[^\n]*<<`)
	ArithBodyRE         = regexp.MustCompile(`^\s*\(`)
)
Exported regexes used by callers to detect shell features in raw command strings (env-var prefixes, comments, redirections, heredocs, arithmetic).
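As a rough illustration of how a caller might combine two of these patterns, the sketch below removes the redirects that SafeRedirectRE treats as harmless and then tests what is left with RedirectRE. Both patterns are copied verbatim from the block above; the helper name hasUnsafeRedirect and the strip-then-test ordering are assumptions made for this example, not part of the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns copied verbatim from the package's exported variables.
var (
	redirectRE     = regexp.MustCompile(`>\s*\S|>>`)
	safeRedirectRE = regexp.MustCompile(`(?:[12]\s*)?>\s*/dev/null\b|2\s*>\s*&\s*1|>\s*&\s*2`)
)

// hasUnsafeRedirect is a hypothetical helper: it deletes every safe redirect
// and reports whether any output redirect remains.
func hasUnsafeRedirect(cmd string) bool {
	return redirectRE.MatchString(safeRedirectRE.ReplaceAllString(cmd, ""))
}

func main() {
	fmt.Println(hasUnsafeRedirect(`make 2>&1 > /dev/null`)) // false
	fmt.Println(hasUnsafeRedirect(`echo hi > out.txt`))     // true
}
```

A real caller would normally run this on the quote-masked form of the command, so that a `>` inside a string literal is not counted.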
Functions ¶
func ExtractFindExecBody ¶ added in v1.3.0
ExtractFindExecBody extracts all command portions from find -exec or -execdir clauses. It returns a slice of bodies found; an empty slice means no matches.
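To make the idea concrete, here is a minimal sketch of the kind of extraction described above. It is not the package's implementation: findExecBodies is a hypothetical name, and the sketch tokenizes on whitespace only, so it does not understand quoting.

```go
package main

import (
	"fmt"
	"strings"
)

// findExecBodies is a hypothetical approximation of ExtractFindExecBody.
// It collects the tokens between -exec/-execdir and the terminating
// `\;`, `;`, or `+`. An unterminated clause is silently dropped.
func findExecBodies(seg string) []string {
	var bodies, cur []string
	inExec := false
	for _, tok := range strings.Fields(seg) {
		switch {
		case !inExec && (tok == "-exec" || tok == "-execdir"):
			inExec = true
			cur = nil
		case inExec && (tok == `\;` || tok == ";" || tok == "+"):
			bodies = append(bodies, strings.Join(cur, " "))
			inExec = false
		case inExec:
			cur = append(cur, tok)
		}
	}
	return bodies
}

func main() {
	cmd := `find . -name '*.go' -exec grep -l TODO {} \; -execdir rm {} +`
	for _, b := range findExecBodies(cmd) {
		fmt.Printf("%q\n", b)
	}
}
```

Each returned body can then be evaluated as a command in its own right, which is presumably why the function exists.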
func ExtractHereStringBody ¶ added in v1.3.0
ExtractHereStringBody extracts the inner command body from "sh <<< 'body'" or "bash <<< body" patterns where the opener is an interpreter (sh/bash/zsh/ dash/ash). It returns (body, true) when a match is found, or ("", false) otherwise. Only the first match is returned. The right-hand string is parsed respecting single quotes, double quotes, and unquoted forms.
func ExtractHeredocs ¶ added in v1.0.1
ExtractHeredocs removes heredoc bodies from cmd, keeping only the opener lines. It returns the processed string and true when all heredocs terminated normally. If an unterminated heredoc is found, the second return value is false; the caller should treat the command as requiring manual review rather than silently dropping content.
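The contract above (drop the body, keep the opener, report unterminated heredocs) can be sketched as follows. This is a simplified stand-in under stated assumptions, not the package's code: stripHeredocBodies and openerRE are hypothetical names, only one heredoc per line is handled, and the delimiter is assumed to be a plain word.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// openerRE is a hypothetical, simplified opener pattern: `<<` not preceded by
// another `<` (to skip here-strings), an optional `-`, optional quotes, and a
// word delimiter.
var openerRE = regexp.MustCompile(`(?:^|[^<])<<(-?)\s*['"]?(\w+)['"]?`)

// stripHeredocBodies keeps opener lines, drops body and terminator lines, and
// returns false if a heredoc never reaches its terminator.
func stripHeredocBodies(cmd string) (string, bool) {
	lines := strings.Split(cmd, "\n")
	var out []string
	for i := 0; i < len(lines); i++ {
		out = append(out, lines[i])
		m := openerRE.FindStringSubmatch(lines[i])
		if m == nil {
			continue
		}
		stripTabs, delim := m[1] == "-", m[2]
		closed := false
		for i++; i < len(lines); i++ {
			line := lines[i]
			if stripTabs {
				line = strings.TrimLeft(line, "\t") // `<<-` ignores leading tabs
			}
			if line == delim {
				closed = true
				break
			}
		}
		if !closed {
			return strings.Join(out, "\n"), false
		}
	}
	return strings.Join(out, "\n"), true
}

func main() {
	fmt.Println(stripHeredocBodies("cat <<EOF\nrm -rf /\nEOF\necho done"))
	fmt.Println(stripHeredocBodies("cat <<EOF\nnever closed"))
}
```

Returning false instead of a best-effort string matters for policy code: an unterminated heredoc means the scanner cannot tell body text from commands.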
func ExtractProcSubst ¶ added in v1.3.0
ExtractProcSubst returns all <(...) and >(...) bodies and the outer command with each occurrence replaced by __PROCSUBST__. The scan is byte-level, which is safe for UTF-8 input: every sentinel is ASCII, and no byte of a multibyte UTF-8 sequence falls in the ASCII range.
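The UTF-8 safety argument can be checked directly. The sketch below (hasASCIIInMultibyte is a hypothetical helper, not part of the package) re-encodes each rune and confirms that multibyte encodings never contain a byte below 0x80, so a byte comparison against `<`, `>`, `(`, or `)` cannot match in the middle of a character.

```go
package main

import "fmt"

// hasASCIIInMultibyte reports whether any rune of s that encodes to more than
// one byte contains a byte in the ASCII range (below 0x80).
func hasASCIIInMultibyte(s string) bool {
	for _, r := range s {
		enc := string(r) // UTF-8 encoding of this rune
		if len(enc) == 1 {
			continue // plain ASCII character
		}
		for i := 0; i < len(enc); i++ {
			if enc[i] < 0x80 {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(hasASCIIInMultibyte("diff <(echo é) >(cat 日本語)")) // false
}
```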
func ExtractShellCBody ¶ added in v1.3.0
ExtractShellCBody extracts all inner command bodies from "sh -c '<body>'" style invocations. It returns a slice of bodies found; an empty slice means no matches.
func ExtractSubshells ¶
ExtractSubshells returns all $(...) bodies and the outer command with each occurrence replaced by __SUBSHELL__. The scan is byte-level, which is safe for UTF-8 input: every sentinel is ASCII, and no byte of a multibyte UTF-8 sequence falls in the ASCII range.
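A byte-level scan of this kind might look like the sketch below. It is an illustration only: extractSubshells is a hypothetical reimplementation that tracks parenthesis depth, ignores quoting, does not special-case `$((...))` arithmetic, and returns only top-level bodies (nested substitutions stay inside their parent body). The real function may differ on all of these points.

```go
package main

import (
	"fmt"
	"strings"
)

// extractSubshells replaces each top-level $(...) with __SUBSHELL__ and
// returns the bodies that were removed.
func extractSubshells(cmd string) (bodies []string, outer string) {
	var b strings.Builder
	for i := 0; i < len(cmd); {
		if cmd[i] == '$' && i+1 < len(cmd) && cmd[i+1] == '(' {
			depth, j := 1, i+2
			for j < len(cmd) && depth > 0 {
				switch cmd[j] {
				case '(':
					depth++
				case ')':
					depth--
				}
				j++
			}
			if depth == 0 { // balanced: cmd[j-1] is the closing paren
				bodies = append(bodies, cmd[i+2:j-1])
				b.WriteString("__SUBSHELL__")
				i = j
				continue
			}
		}
		b.WriteByte(cmd[i])
		i++
	}
	return bodies, b.String()
}

func main() {
	bodies, outer := extractSubshells("echo $(date) and $(ls $(pwd))")
	fmt.Printf("%q\n%s\n", bodies, outer)
}
```

A caller that wants to evaluate nested substitutions can call the function again on each returned body.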
func ExtractXargsShellCBody ¶ added in v1.3.0
ExtractXargsShellCBody extracts all inner command bodies from xargs patterns that end with "sh -c '...'". It returns a slice of bodies found; an empty slice means no matches.
func FindSplitBoundaries ¶
FindSplitBoundaries scans cmd for pipeline delimiters (|, ||, &&, ;, \n), skipping content inside quotes. It returns the boundaries as half-open [start, end) index pairs into cmd.
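The following sketch shows one way such a scan could work. It assumes that each returned pair covers the delimiter itself, which the description above does not state explicitly; findSplitBoundaries is a hypothetical reimplementation that ignores backslash escapes and treats a single `&` as ordinary text.

```go
package main

import "fmt"

// findSplitBoundaries returns the [start, end) byte span of every |, ||, &&,
// ;, and newline that lies outside single or double quotes.
func findSplitBoundaries(cmd string) [][2]int {
	var out [][2]int
	var quote byte // 0 when outside quotes, else the opening quote character
	for i := 0; i < len(cmd); i++ {
		c := cmd[i]
		if quote != 0 {
			if c == quote {
				quote = 0
			}
			continue
		}
		switch {
		case c == '\'' || c == '"':
			quote = c
		case c == '|' || c == '&':
			if i+1 < len(cmd) && cmd[i+1] == c {
				out = append(out, [2]int{i, i + 2}) // || or &&
				i++
			} else if c == '|' {
				out = append(out, [2]int{i, i + 1})
			}
		case c == ';' || c == '\n':
			out = append(out, [2]int{i, i + 1})
		}
	}
	return out
}

func main() {
	fmt.Println(findSplitBoundaries(`a | b && c; echo "x|y"`)) // [[2 3] [6 8] [10 11]]
}
```

The `|` inside the double-quoted string is skipped, so the final `echo` stays in one piece.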
func HasShellCNonLiteralBody ¶ added in v1.3.0
HasShellCNonLiteralBody returns true when seg contains a shell -c invocation where the -c flag is not followed by a literal quoted string. This catches dangerous patterns like "bash -c $cmd" or "bash -c $(echo foo)" that bypass the normal shell-c recursion checks.
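A simplified version of this check can be written as a single regular expression. The sketch below is an assumption-laden approximation: nonLiteralShellC is a hypothetical name, the interpreter list is borrowed from the ExtractHereStringBody description, combined flags such as `-lc` are not handled, and a quoted argument is treated as literal even if it contains expansions.

```go
package main

import (
	"fmt"
	"regexp"
)

// nonLiteralRE matches an interpreter, `-c`, and then a first argument
// character that is neither a quote nor whitespace.
var nonLiteralRE = regexp.MustCompile(`\b(?:sh|bash|zsh|dash|ash)\s+-c\s+[^'"\s]`)

// nonLiteralShellC reports whether seg passes an unquoted argument to -c.
func nonLiteralShellC(seg string) bool {
	return nonLiteralRE.MatchString(seg)
}

func main() {
	fmt.Println(nonLiteralShellC(`bash -c $cmd`))        // true
	fmt.Println(nonLiteralShellC(`bash -c $(echo foo)`)) // true
	fmt.Println(nonLiteralShellC(`bash -c 'ls -l'`))     // false
}
```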
func IsSafePath ¶ added in v1.3.0
IsSafePath returns true when the path spec's source component (left of the first ':') starts with an exempt prefix and contains no ".." component. Leading and trailing quote characters are stripped first so that both -v /tmp/x and -v "/tmp/x" are treated identically.
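The rule described above can be sketched in a few lines. This is a hypothetical reimplementation (isSafePath, lowercase) that follows the stated steps literally: strip surrounding quotes, take the text left of the first `:`, reject any `..` component, then test the exempt prefixes.

```go
package main

import (
	"fmt"
	"strings"
)

// isSafePath approximates the documented IsSafePath behavior.
func isSafePath(pathSpec string, exemptPaths []string) bool {
	spec := strings.Trim(pathSpec, `"'`)  // drop leading/trailing quotes
	src, _, _ := strings.Cut(spec, ":")   // source component of source:dest
	for _, part := range strings.Split(src, "/") {
		if part == ".." {
			return false // traversal could escape the exempt prefix
		}
	}
	for _, prefix := range exemptPaths {
		if strings.HasPrefix(src, prefix) {
			return true
		}
	}
	return false
}

func main() {
	exempt := []string{"/tmp/"}
	fmt.Println(isSafePath(`"/tmp/x:/data"`, exempt))     // true
	fmt.Println(isSafePath(`/tmp/../etc:/data`, exempt)) // false
	fmt.Println(isSafePath(`/etc/passwd`, exempt))       // false
}
```

Because the match is a plain string prefix, an exempt entry without a trailing slash (for example `/tmp`) would also accept `/tmpfoo`; writing entries with the trailing slash avoids that in this sketch.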
func JoinContinuations ¶
JoinContinuations replaces line-continuation sequences (\<newline>) with a space.
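In its simplest form this is a single string replacement, as in the sketch below. The lowercase joinContinuations is a hypothetical stand-in; the real function may treat edge cases such as `\r\n` line endings differently.

```go
package main

import (
	"fmt"
	"strings"
)

// joinContinuations replaces every backslash-newline pair with one space.
func joinContinuations(cmd string) string {
	return strings.ReplaceAll(cmd, "\\\n", " ")
}

func main() {
	fmt.Printf("%q\n", joinContinuations("docker run \\\n-v /tmp:/data \\\nalpine"))
	// "docker run  -v /tmp:/data  alpine"
}
```

Joining first means a later newline-based split does not cut one logical command into several segments.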
func RemoveQuotedContent ¶
RemoveQuotedContent masks content inside single/double quotes with '_' so shell operators inside strings are not mistaken for command boundaries.
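A minimal sketch of the masking step, assuming the quote characters themselves are kept and only the bytes between them are overwritten (the description does not say which). maskQuoted is a hypothetical name, and backslash escapes inside double quotes are not handled.

```go
package main

import "fmt"

// maskQuoted overwrites every byte between matching quotes with '_',
// preserving the byte length of the input.
func maskQuoted(cmd string) string {
	out := []byte(cmd)
	var quote byte // 0 when outside quotes
	for i := 0; i < len(out); i++ {
		c := out[i]
		switch {
		case quote == 0 && (c == '\'' || c == '"'):
			quote = c
		case quote != 0 && c == quote:
			quote = 0
		case quote != 0:
			out[i] = '_'
		}
	}
	return string(out)
}

func main() {
	fmt.Println(maskQuoted(`echo "a | b" | wc`)) // echo "_____" | wc
}
```

Because the length is preserved, offsets computed on the masked string can be applied to the original command.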
func SplitPipeline ¶
SplitPipeline splits cmd at shell pipeline boundaries and returns cleaned segments with env-var prefixes, comment-only entries, and blanks removed. A segment that starts with '-' is appended to the previous segment to handle flag-only pipeline components.
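The flag-merging rule in the last sentence can be illustrated on its own. The sketch below assumes the segments have already been split, and that a merged segment is joined to its predecessor with a single space; mergeFlagSegments is a hypothetical helper, not a function exported by the package.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeFlagSegments drops blank segments and appends any segment that starts
// with '-' to the segment before it.
func mergeFlagSegments(segs []string) []string {
	var out []string
	for _, s := range segs {
		s = strings.TrimSpace(s)
		if s == "" {
			continue
		}
		if strings.HasPrefix(s, "-") && len(out) > 0 {
			out[len(out)-1] += " " + s
			continue
		}
		out = append(out, s)
	}
	return out
}

func main() {
	fmt.Printf("%q\n", mergeFlagSegments([]string{"grep foo", "-v", " ", "wc -l"}))
	// ["grep foo -v" "wc -l"]
}
```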
func SplitPipelineDetailed ¶ added in v1.5.0
SplitPipelineDetailed is SplitPipeline plus a parallel slice of the env-var assignment names that prefixed each segment in the original input. The names slice is the same length as the segments slice; entries are nil when a segment had no leading assignments. Callers that need to enforce a policy over assignment *names* (e.g. blocking LD_PRELOAD=evil ls) consume the names here; callers that only care about the command itself can keep using SplitPipeline. A standalone NAME=value with no trailing command is left intact in the segment (and produces no name entry) — same as before.
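For a single segment, the name extraction could be sketched as below. envVarRE is copied verbatim from the package's EnvVarRE; assignRE and splitEnvPrefix are hypothetical helpers introduced for this example, and the real function's internals may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// envVarRE is copied verbatim from the package's EnvVarRE.
var envVarRE = regexp.MustCompile(`^(\w+=(?:"[^"]*"|'[^']*'|[^\s(;]*)\s+)+`)

// assignRE (hypothetical) matches one NAME=value unit and captures NAME.
var assignRE = regexp.MustCompile(`(\w+)=(?:"[^"]*"|'[^']*'|[^\s(;]*)\s+`)

// splitEnvPrefix returns the assignment names that prefix seg and the
// remaining command. names is nil when there is no prefix.
func splitEnvPrefix(seg string) (names []string, rest string) {
	prefix := envVarRE.FindString(seg)
	for _, m := range assignRE.FindAllStringSubmatch(prefix, -1) {
		names = append(names, m[1])
	}
	return names, seg[len(prefix):]
}

func main() {
	names, rest := splitEnvPrefix(`LD_PRELOAD=evil.so FOO="a b" ls -l`)
	fmt.Println(names, rest) // [LD_PRELOAD FOO] ls -l

	names, rest = splitEnvPrefix(`FOO=1`)
	fmt.Println(names == nil, rest) // true FOO=1
}
```

The second call shows the standalone case: with no trailing command, EnvVarRE does not match, so the segment is returned intact and no names are reported.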
func StripComments ¶
StripComments removes shell comment lines from cmd.
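Assuming StripComments is built on the exported CommentLineRE (the documentation does not say so), its effect would resemble the sketch below. The pattern is copied verbatim; stripComments is a hypothetical stand-in.

```go
package main

import (
	"fmt"
	"regexp"
)

// commentLineRE is copied verbatim from the package's CommentLineRE.
var commentLineRE = regexp.MustCompile(`(?m)^[ \t]*#[^\n]*(?:\n|$)`)

// stripComments deletes every line whose first non-blank character is '#'.
func stripComments(cmd string) string {
	return commentLineRE.ReplaceAllString(cmd, "")
}

func main() {
	in := "# setup\nls -l\n  # note\necho hi # trailing"
	fmt.Printf("%q\n", stripComments(in)) // "ls -l\necho hi # trailing"
}
```

Only whole comment lines are removed. The trailing `# trailing` survives because the pattern is anchored to the start of a line.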
func StripExemptPaths ¶ added in v1.0.1
StripExemptPaths rewrites flag+path occurrences in seg where the captured path is safe: starts with one of exemptPaths and contains no ".." traversal. Each matched occurrence is replaced with __SAFE_PATH__. flagRE must have exactly one capture group that captures the path or colon-separated mount spec (source:dest). The docker-volume scope (matching -v / --volume flags) is one example use case.
func StripLeadingPath ¶ added in v1.5.0
StripLeadingPath rewrites the first token of seg to its basename when the token is a path (contains a `/`). Bash treats any command word containing a slash as a direct path lookup rather than a PATH search, so the basename names the program being run: `/bin/sed`, `/usr/local/bin/sed`, `./tools/sed`, and `tools/sed` all run a program named `sed` (not necessarily the same file on disk), and allow/deny rules written against the program name should match them all. Tokens without a slash are returned unchanged. The rest of seg (arguments, flags) is left untouched; only the leading command word is normalized.
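The normalization can be sketched as follows. stripLeadingPath is a hypothetical reimplementation that takes the first token to end at the first space or tab and does not handle quoted command words.

```go
package main

import (
	"fmt"
	"strings"
)

// stripLeadingPath replaces the first token with its basename when that token
// contains a slash, leaving the remainder of seg untouched.
func stripLeadingPath(seg string) string {
	end := strings.IndexAny(seg, " \t")
	if end < 0 {
		end = len(seg)
	}
	tok := seg[:end]
	slash := strings.LastIndexByte(tok, '/')
	if slash < 0 {
		return seg // plain command name: PATH lookup, nothing to strip
	}
	return tok[slash+1:] + seg[end:]
}

func main() {
	fmt.Println(stripLeadingPath("/usr/local/bin/sed -i s/a/b/ f")) // sed -i s/a/b/ f
	fmt.Println(stripLeadingPath("./tools/sed"))                    // sed
	fmt.Println(stripLeadingPath("sed -e s/x/y/"))                  // sed -e s/x/y/
}
```

Slashes in the arguments (such as the `s/a/b/` expression) are left alone because only the first token is examined.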
func StripSafeRedirects ¶ added in v1.5.0
StripSafeRedirects rewrites every `> target`, `>> target`, `N> target`, and `N>> target` occurrence in masked to a sentinel when target's source path passes IsSafePath against safeTargets. The input must be the masked form (quoted content already replaced with `_`) so that operators inside strings are not misread as redirects. Unsafe or non-matching redirects are left untouched so the existing RedirectRE check still catches them. Note: because the target span is replaced, deny patterns written against literal redirect targets (e.g. `> /tmp/foo`) won't match; none exist today, but adding one would silently no-op.
func StripWrappers ¶ added in v1.0.1
StripWrappers iteratively removes leading process-wrapper prefixes from seg. The built-in wrapper names (timeout, time, nice, nohup, stdbuf, bare xargs) are always stripped. extra lists additional single-command wrapper names from config.
Types ¶
This section is empty.