danger

package

v1.1.0 Latest Latest Go to latest Published: Jun 3, 2026 License: MIT Imports: 11 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/BackendStack21/odek

Links

Open Source Insights

Documentation ¶

Overview ¶

Package danger classifies shell commands by risk level and provides a configurable approval system for dangerous operations.

Classification is token-based (not regex) — it respects quotes, pipes, redirects, compound commands (&&, ||, ;), and multi-line input. Each command is classified into one of 9 risk classes, and the user can configure which actions (allow/prompt/deny) apply to each class.

The gate fails CLOSED. A command whose program name is recognised but used benignly classifies as Safe (allow); a command whose verb is NOT recognised classifies as Unknown and is denied by default. The set of recognised-safe commands (safeCommands) is therefore an explicit read-only allowlist — extend it, or the per-profile allowlist, to permit a tool rather than relying on it slipping through unclassified.

Threat model ¶

The classifier is an adversarial filter, not a parser for well-behaved input. It assumes a prompt-injected agent is actively trying to make a dangerous command read as harmless so it slips past the approval gate. The design therefore errs toward the worse class when in doubt, and is built in layers that each close a category of evasion:

Normalisation (see normalize) rewrites the command so token-level analysis can see through shell tricks before classification runs: - $'…' ANSI-C escapes decodeANSIC ($'\x72\x6d' → rm) - $IFS word-splitting expandIFS (rm$IFS-rf$IFS/ → rm -rf /) - {a,b,c} brace expansion expandBraces ({rm,-rf,/} → rm -rf /) - $(…)/`…`/<(…)/>(…) subst. extractSubstitutions (bodies classified too) - command/exec/builtin stripCommandWrappers - \-escapes (r\m, \rm) collapseUnquotedBackslashes - absolute paths (/bin/rm) basenameFirstToken + commandName The tokenizer additionally treats quote boundaries as NON word boundaries, so empty/adjacent quotes like r""m and "rm" still resolve to the single word `rm`.
Structural decomposition. A command is split into segments (on ;, &&, ||), each segment into pipe stages (on |), and EVERY stage is classified — not just the head — so `true | dd of=/dev/sda` and `echo x | sudo rm -rf /home` are seen for what their later stages do. The worst class across all parts wins (see rank).
Wrapper unwrapping (unwrapWrappers). Leading execution wrappers (env, xargs, nohup, nice, setsid, timeout, …) are stripped so the real command underneath is classified; privileged wrappers (sudo, doas, pkexec) additionally impose a system_write floor and then let the inner command escalate further (sudo rm -rf /var → destructive).
Verb-independent resource scanning (classifyResourceToken). Some resources are dangerous regardless of the command touching them: /dev/tcp and /dev/udp pseudo-devices (reverse-shell channels) and sensitive credential paths (~/.ssh, /etc/shadow, ~/.aws/credentials, /proc/self/environ, …). These are flagged wherever they appear.
Payload re-classification. Shell -c strings (bash -c '…') and the bodies of command/process substitutions are themselves classified by re-entering Classify, so nested commands cannot hide a level deeper.

Limitations ¶

This is a heuristic defence-in-depth layer, NOT a sandbox or a complete shell interpreter. It does not, and cannot, catch everything:

Variable indirection: `X=rm; $X -rf /` — the value of $X is not tracked. Note the fail-closed default turns this from a silent bypass into a denial: the unrecognised `$X` verb classifies as Unknown.
Fully dynamic construction from runtime data, command output, or environment the classifier cannot evaluate.
Arbitrary value transformations beyond the enumerated encodings (e.g. a secret piped through gzip/openssl before exfiltration).
Interpreter escape hatches we do not special-case (awk 'BEGIN{system()}', editor `!` shells, language-specific eval paths). These read as a known command (awk/vim/…) used benignly, so they classify Safe — the known verb is the gap, not an unknown one.

Because these gaps exist, the classifier is paired with other controls: non-interactive denial, output redaction (internal/redact), and — for strong isolation — the container sandbox. When tuning, remember that over-classification only costs an extra prompt, while under-classification can let a destructive or exfiltrating command through silently; prefer the former.

Index ¶

func IsSafe(content string) bool
func Rank(cls RiskClass) int
type Action
type Approver
type DangerousConfig
type InjectionPattern
type RiskClass
type ScanResult
- func ScanInjection(content string) []ScanResult
type TTYApprover
- func NewTTYApprover(cfg *DangerousConfig) *TTYApprover
type ToolOperation

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func IsSafe ¶

func IsSafe(content string) bool

IsSafe returns true if no injection threats are detected in content. This is the primary gate used before injecting untrusted content into the system prompt.

func Rank ¶ added in v1.1.0

func Rank(cls RiskClass) int

Rank returns the severity order for priority comparison. Exported so consumers that enforce risk caps (e.g. the sub-agent maxRisk clamp) share this single ordering instead of mirroring it — a mirror silently drifts when a class is added, as happened with Unknown.

Types ¶

type Action ¶

type Action string

Action represents what to do when a command of a given risk class is detected.

const (
	Allow  Action = "allow"
	Prompt Action = "prompt"
	Deny   Action = "deny"
)

type Approver ¶

type Approver interface {
	// PromptCommand asks the user to approve or deny a shell command.
	// cls is the risk class (system_write, network_egress, etc.).
	// Returns nil on approve, error on deny or timeout.
	PromptCommand(cls RiskClass, cmd, description string) error

	// PromptOperation asks the user to approve or deny a native tool operation
	// (read_file on /etc, browser to external URL, etc.).
	PromptOperation(op ToolOperation) error
}

Approver is the interface for user approval of dangerous operations. Two implementations exist:

TTYApprover — opens /dev/tty for interactive approval (CLI mode)
WSApprover — sends approval requests via WebSocket (serve mode)

When nil (no approver configured), calls fall back to non-interactive behavior (NonInteractiveAction). Tools MUST inject an approver to get interactive approval in any mode.

type DangerousConfig ¶

type DangerousConfig struct {
	// Classes maps risk classes to their configured action.
	// Only overrides for non-default values need to be set.
	Classes map[RiskClass]Action `json:"classes,omitempty"`

	// Allowlist is a list of command strings that are always allowed,
	// regardless of their risk classification. Exact match only.
	// Takes priority over Denylist.
	Allowlist []string `json:"allowlist,omitempty"`

	// Denylist is a list of command strings that are always denied,
	// regardless of their risk classification. Prefix match (after trimming).
	Denylist []string `json:"denylist,omitempty"`

	// DefaultAction is the global default action applied to ALL risk classes
	// when set. Per-class overrides in Classes still win.
	// "allow" → YOLO mode (everything runs without prompt)
	// "deny" → lockdown (everything denied unless explicitly allowed)
	// Not set → uses built-in defaults per class
	DefaultAction *string `json:"action,omitempty"`

	// NonInteractive specifies what to do when running without a TTY.
	// "allow" (default) — run everything, "deny" — block all prompted ops.
	NonInteractive *string `json:"non_interactive,omitempty"`

	// Approver handles interactive approval prompts for dangerous operations.
	// When set, all Prompt-class operations use this instead of /dev/tty.
	// Tools can inject their own approver (e.g., WebSocket-based for odek serve).
	// When nil, CheckOperation falls back to /dev/tty (CLI-compatible default).
	Approver Approver `json:"-"`
}

DangerousConfig defines how dangerous operations are handled. Configurable via the standard 4-layer odek config chain.

Default behavior per class (no sandbox):

safe → allow, local_write → allow, system_write → prompt,
destructive → deny, network_egress → prompt,
code_execution → prompt, install → prompt, blocked → deny,
unknown → deny

The classifier fails closed: a command whose program name is not recognised classifies as Unknown and is denied by default. Set "unknown": "prompt" (or add trusted tools to the allowlist) to soften this for a given profile.

func (*DangerousConfig) ActionFor ¶

func (c *DangerousConfig) ActionFor(cls RiskClass) Action

ActionFor returns the configured action for the given risk class. Per-class overrides in Classes win first, then the global default action (the "action" field), then built-in defaults, then Prompt.

func (*DangerousConfig) ActionForCommand ¶

func (c *DangerousConfig) ActionForCommand(cmd string) Action

ActionForCommand returns the action for a specific command string. Allowlist and denylist are checked first (exact match for allowlist, prefix match for denylist), then falls back to the risk-class-based action.

func (*DangerousConfig) CheckOperation ¶

func (c *DangerousConfig) CheckOperation(op ToolOperation, trustedClasses map[RiskClass]bool) error

CheckOperation checks whether a tool operation is allowed, denied, or needs approval. Returns nil on allow, error on deny, and prompts the user on prompt. Uses the configured Approver when set; falls back to /dev/tty (TTYApprover) when no approver is configured.

func (*DangerousConfig) NonInteractiveAction ¶

func (c *DangerousConfig) NonInteractiveAction() Action

NonInteractiveAction returns the action to use when no TTY is available.

type InjectionPattern ¶

type InjectionPattern struct {
	Re    *regexp.Regexp
	Label string
}

InjectionPattern groups a compiled regex with a human-readable label describing what threat it detects.

type RiskClass ¶

type RiskClass string

RiskClass represents the risk level of a shell command.

const (
	Safe          RiskClass = "safe"
	LocalWrite    RiskClass = "local_write"
	SystemWrite   RiskClass = "system_write"
	Destructive   RiskClass = "destructive"
	NetworkEgress RiskClass = "network_egress"
	CodeExecution RiskClass = "code_execution"
	Install       RiskClass = "install"
	Blocked       RiskClass = "blocked"

	// Unknown is the fall-through class for a command whose program name the
	// classifier does not recognise. It defaults to Deny (same as
	// Destructive): the gate fails CLOSED rather than open, so a novel or
	// obfuscated verb that dodged every known-dangerous check cannot run
	// unprompted. Recognised-but-benign usage classifies as Safe instead.
	Unknown RiskClass = "unknown"
)

func Classify ¶

func Classify(cmd string) RiskClass

Classify determines the risk class of a shell command using token-level heuristics. Returns the highest-severity class detected.

Priority (highest to lowest): blocked > destructive > system_write > code_execution > network_egress > install > local_write > safe

Pipeline (see the package doc for the full evasion model):

raw cmd ─▶ isRawBlocked ─▶ normalize ─┬─▶ classifyOne(main) ─┐
                                       └─▶ Classify(sub) ⟳ ───┴─▶ worst wins

normalize neutralises shell evasion tricks (ANSI-C/$IFS/brace expansion, $(…)/`…`/<(…) substitutions, command/exec wrappers, backslash escapes, absolute-path basenames) and returns the rewritten command plus any substitution bodies. classifyOne then splits into segments and pipe stages and classifies each (see classifyPipeline/classifyStage). Every extracted sub-expression is re-classified through Classify so nested commands cannot hide one level deeper; the worst class across the whole tree is returned.

func ClassifyPath ¶

func ClassifyPath(path string) RiskClass

ClassifyPath returns a RiskClass for a filesystem path. /tmp/*, working directory → local_write; /etc/*, /root/* → system_write; /boot/*, /dev/*, /sys/* → destructive; home sensitive dirs → system_write.

func ClassifyURL ¶

func ClassifyURL(rawURL string) RiskClass

ClassifyURL returns a RiskClass for a browser URL. Internal IPs → system_write; external → network_egress. Uses proper IP parsing (handles decimal, octal, hex, IPv6 compressed, short forms like 127.1, and all other representations that browsers accept via inet_aton-style parsing) instead of string prefix matching which was trivially bypassable.

type ScanResult ¶

type ScanResult struct {
	Label   string // human-readable threat label
	Pattern string // the regexp pattern that matched (for debugging)
}

ScanResult describes a single detected injection threat.

func ScanInjection ¶

func ScanInjection(content string) []ScanResult

ScanInjection checks content for prompt injection attempts. Returns nil if no threats detected, or a list of found threats. Each threat includes a label describing what was found.

type TTYApprover ¶

type TTYApprover struct {
	DangerousConfig *DangerousConfig
	TrustedClasses  map[RiskClass]bool

	TTYPath string // overridden in tests

	// Approval-fatigue mitigation. After FrictionThreshold approvals of
	// the same class within FrictionWindow, the next prompt requires
	// the user to type the literal word "approve" (no single-letter
	// shortcut) and prints a 1.5s pause before accepting input. This
	// breaks reflexive click-through and gives the user a moment to
	// notice they have approved an unusual number of dangerous calls.
	FrictionThreshold int
	FrictionWindow    time.Duration
	// contains filtered or unexported fields
}

TTYApprover implements Approver by reading from /dev/tty. This is the default approver used in CLI mode (odek run, odek repl). When /dev/tty is not available (piped stdin, CI), it falls back to the configured NonInteractiveAction.

func NewTTYApprover ¶

func NewTTYApprover(cfg *DangerousConfig) *TTYApprover

NewTTYApprover creates a TTYApprover with the given config.

func (*TTYApprover) PromptCommand ¶

func (a *TTYApprover) PromptCommand(cls RiskClass, cmd, description string) error

func (*TTYApprover) PromptOperation ¶

func (a *TTYApprover) PromptOperation(op ToolOperation) error

func (*TTYApprover) SetTrustAll ¶

func (a *TTYApprover) SetTrustAll(enabled bool)

SetTrustAll enables or disables blanket trust for all risk classes. When enabled, PromptCommand returns nil for every call (used by batch approval).

func (*TTYApprover) SetTrustedClasses ¶

func (a *TTYApprover) SetTrustedClasses(m map[RiskClass]bool)

SetTrustedClasses atomically sets the trusted classes map. Takes ownership of the provided map — caller must not write to it after calling.

type ToolOperation ¶

type ToolOperation struct {
	Name     string
	Resource string
	Risk     RiskClass
}

ToolOperation describes a native tool call for approval checking.

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL