Documentation
Overview ¶
Package flow is m-cli's control-flow / dataflow infrastructure for the path-sensitive lint rules (spec §3.1; the Python tool's "Phase 7"). It builds a per-label control-flow graph over the m-parse tree and provides dataflow passes over it: LOCK-held state (M-MOD-025), open-transaction depth (M-MOD-026), definite assignment (M-MOD-024), $ETRAP protection (M-MOD-027), $TEST freshness (M-MOD-017), and taint (M-MOD-036). The CFG is built structurally, by walking top-level `line` nodes and attaching each command's dot-block flag, so it needs no parent pointers from the parser.
The graph is faithful to the reference model: each label gets its own CFG with an entry block, one block per `command` node in source order, and an exit block. Edges are fall/branch/skip/exit/if-skip, and a QUIT inside a dot-block is modeled as a dot-block exit (fall-through) rather than a label exit.
Index ¶
- func AnalyzeTaint(cfg CFG, src []byte, formals []string, config TaintConfig) map[int]map[string]bool
- func DefinitelyDefined(cfg CFG, src []byte, formals []string) map[int]map[string]bool
- func DepthAtExit(cfg CFG, src []byte) int
- func EtrapProtection(cfg CFG, src []byte) map[int]bool
- func FormalParams(root parse.Node, src []byte) map[int][]string
- func HeldAtExit(cfg CFG, src []byte) []string
- type Block
- type CFG
- type Effects
- type EtrapLeak
- type StaleTestRead
- type TaintConfig
- type TaintFlow
- type UndefinedRead
- type VarUse
Functions ¶
func AnalyzeTaint ¶
func AnalyzeTaint(cfg CFG, src []byte, formals []string, config TaintConfig) map[int]map[string]bool
AnalyzeTaint runs the forward union-meet taint dataflow over cfg and returns, for each block ID, the set of variable names tainted at that block's entry. The entry block's in-set is the label's formals when config.FormalsTainted is set; every other block's in-set is the union of its predecessors' out-sets.
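The union-meet iteration described above can be sketched in isolation. This is a minimal illustration, not the package's implementation: the `node` type, `transfer`, and `taintIn` are hypothetical names, and each block is reduced to at most one assignment whose target is tainted exactly when one of its sources is.

```go
package main

import "fmt"

// node is a hypothetical, simplified block: at most one assignment
// "Target = f(Sources...)" and a list of successor indices.
type node struct {
	Target  string
	Sources []string
	Succ    []int
}

// transfer returns the tainted set after n runs: Target becomes tainted if
// any source is tainted, and clean otherwise.
func transfer(n node, in map[string]bool) map[string]bool {
	out := make(map[string]bool, len(in))
	for v := range in {
		out[v] = true
	}
	if n.Target == "" {
		return out
	}
	tainted := false
	for _, s := range n.Sources {
		tainted = tainted || in[s]
	}
	if tainted {
		out[n.Target] = true
	} else {
		delete(out, n.Target)
	}
	return out
}

// taintIn iterates to a fixed point. Node 0 is the entry and is seeded with
// the tainted formals; every other IN set is the union of predecessor OUTs.
func taintIn(nodes []node, formals []string) []map[string]bool {
	in := make([]map[string]bool, len(nodes))
	for i := range in {
		in[i] = map[string]bool{}
	}
	for _, f := range formals {
		in[0][f] = true
	}
	for changed := true; changed; {
		changed = false
		for i, n := range nodes {
			for v := range transfer(n, in[i]) {
				for _, s := range n.Succ {
					if !in[s][v] {
						in[s][v] = true
						changed = true
					}
				}
			}
		}
	}
	return in
}

func main() {
	// entry -> S Y=X -> S X=1 -> exit, with formal X tainted at entry.
	nodes := []node{
		{Succ: []int{1}},
		{Target: "Y", Sources: []string{"X"}, Succ: []int{2}},
		{Target: "X", Succ: []int{3}},
		{},
	}
	for id, set := range taintIn(nodes, []string{"X"}) {
		fmt.Println(id, set)
	}
}
```

Because the in-sets only ever grow and the transfer is monotone, the loop terminates. In this example Y stays tainted at the exit while X is cleaned by the constant assignment.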
func DefinitelyDefined ¶
func DefinitelyDefined(cfg CFG, src []byte, formals []string) map[int]map[string]bool
DefinitelyDefined is a forward MUST-analysis (definite assignment) over cfg: in[B] is the set of local names guaranteed to have been DEF'd on EVERY path from the label entry to B. It drives M-MOD-024 (read of a local before it is definitely assigned).
It differs from classical reaching definitions in two ways: the lattice element is a set of variable *names* (we only need "is X definitely defined", not which SET did it), and the meet is intersection (defined on every path).
The formals are definitely defined at entry. The transfer function for a block whose command ran is (in - kills) ∪ defs, or ∅ on kills_all; skip / if-skip edges (the command did not run) carry the in-set through unchanged. Unreachable blocks map to ∅.
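The intersection meet and the skip-edge rule can be illustrated with a small standalone sketch. The `edge` and `blk` types and the `definitelyDefined` helper are hypothetical stand-ins, not the package's CFG types; a nil set is used here to represent "not reached yet".

```go
package main

import "fmt"

// edge and blk are hypothetical stand-ins for the package's CFG types.
// Skipped marks a skip / if-skip edge: the command did not run on that path.
type edge struct {
	To      int
	Skipped bool
}

type blk struct {
	Defs, Kills []string
	KillsAll    bool
	Out         []edge
}

// ran applies (in - kills) ∪ defs, or ∅ when the command kills everything.
func ran(b blk, in map[string]bool) map[string]bool {
	out := map[string]bool{}
	if b.KillsAll {
		return out
	}
	for v := range in {
		out[v] = true
	}
	for _, k := range b.Kills {
		delete(out, k)
	}
	for _, d := range b.Defs {
		out[d] = true
	}
	return out
}

// definitelyDefined computes in[B] with an intersection meet. A nil set means
// "not reached yet" (lattice top); unreached blocks end up as ∅.
func definitelyDefined(blocks []blk, formals []string) []map[string]bool {
	in := make([]map[string]bool, len(blocks))
	in[0] = map[string]bool{}
	for _, f := range formals {
		in[0][f] = true
	}
	for changed := true; changed; {
		changed = false
		for i, b := range blocks {
			if in[i] == nil {
				continue
			}
			after := ran(b, in[i])
			for _, e := range b.Out {
				flow := after
				if e.Skipped {
					flow = in[i] // command did not run: in-set passes through
				}
				if in[e.To] == nil {
					in[e.To] = map[string]bool{}
					for v := range flow {
						in[e.To][v] = true
					}
					changed = true
					continue
				}
				for v := range in[e.To] {
					if !flow[v] {
						delete(in[e.To], v)
						changed = true
					}
				}
			}
		}
	}
	for i := range in {
		if in[i] == nil {
			in[i] = map[string]bool{}
		}
	}
	return in
}

func main() {
	// entry -> S:cond A=1 -> S B=2 -> exit, with formal X.
	blocks := []blk{
		{Out: []edge{{To: 1}}},
		{Defs: []string{"A"}, Out: []edge{{To: 2}, {To: 2, Skipped: true}}},
		{Defs: []string{"B"}, Out: []edge{{To: 3}}},
		{},
	}
	for id, set := range definitelyDefined(blocks, []string{"X"}) {
		fmt.Println(id, set)
	}
}
```

In this example A is not definitely defined after the postconditional SET, because the skip edge carries the in-set without A and the intersection drops it.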
func DepthAtExit ¶
func DepthAtExit(cfg CFG, src []byte) int
DepthAtExit returns the worst-case (maximum) open-transaction nesting depth on any path reaching the label's exit. A non-zero result means at least one path leaves a transaction open — the TSTART-leak signal (M-MOD-026).
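A worst-case depth computation of this kind can be sketched as a max-meet pass. The sketch below assumes an acyclic graph whose edges only point to later blocks, and it clamps the depth at zero; both are assumptions for illustration, and `txBlock` and `depthAtExit` are hypothetical names rather than the package's API.

```go
package main

import "fmt"

// txBlock is a hypothetical block: Delta is +1 for a TSTART-like command,
// -1 for a TCOMMIT-like command, and 0 otherwise.
type txBlock struct {
	Delta int
	Succ  []int // assumed to reference only later blocks
}

// depthAtExit propagates the maximum nesting depth forward in block order.
// Block 0 is the entry and the last block is the exit.
func depthAtExit(blocks []txBlock) int {
	const unreached = -1
	in := make([]int, len(blocks))
	for i := range in {
		in[i] = unreached
	}
	in[0] = 0
	for i, b := range blocks {
		if in[i] == unreached {
			continue
		}
		out := in[i] + b.Delta
		if out < 0 {
			out = 0 // assumption: depth never goes negative
		}
		for _, s := range b.Succ {
			if out > in[s] {
				in[s] = out // max meet: keep the worst case
			}
		}
	}
	if exit := in[len(blocks)-1]; exit != unreached {
		return exit
	}
	return 0
}

func main() {
	// entry -> TSTART -> QUIT:cond -> TCOMMIT -> exit
	leaky := []txBlock{
		{Succ: []int{1}},
		{Delta: +1, Succ: []int{2}},
		{Succ: []int{3, 4}}, // falls through or exits early
		{Delta: -1, Succ: []int{4}},
		{},
	}
	fmt.Println(depthAtExit(leaky))
}
```

The early exit from the third block reaches the exit with the transaction still open, so the result is non-zero even though the fall-through path commits.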
func EtrapProtection ¶
func EtrapProtection(cfg CFG, src []byte) map[int]bool
EtrapProtection is a forward MUST-analysis (AND meet): in_state[B] is true iff `NEW $ETRAP` (or `NEW $ET`) has executed on EVERY path from the label entry to B. The entry starts unprotected; other blocks start at the lattice top (true, the AND identity) and are refined downward as predecessors propagate.
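The AND meet with a top-initialized lattice can be sketched as follows. The `etBlock` type and `protection` function are hypothetical, and skip edges are left out to keep the example short.

```go
package main

import "fmt"

// etBlock is a hypothetical block; NewsEtrap is true when the command is
// NEW $ETRAP (or NEW $ET).
type etBlock struct {
	NewsEtrap bool
	Succ      []int
}

// protection returns, per block, whether NEW $ETRAP has run on every path
// from the entry. The entry starts false; all other blocks start true (the
// AND identity) and are only ever lowered.
func protection(blocks []etBlock) []bool {
	in := make([]bool, len(blocks))
	for i := 1; i < len(in); i++ {
		in[i] = true
	}
	for changed := true; changed; {
		changed = false
		for i, b := range blocks {
			out := in[i] || b.NewsEtrap
			for _, s := range b.Succ {
				if in[s] && !out {
					in[s] = false // one unprotected predecessor is enough
					changed = true
				}
			}
		}
	}
	return in
}

func main() {
	// entry -> branch -> (NEW $ETRAP)? -> SET $ETRAP -> exit
	blocks := []etBlock{
		{Succ: []int{1}},
		{Succ: []int{2, 3}}, // may bypass the NEW
		{NewsEtrap: true, Succ: []int{3}},
		{Succ: []int{4}}, // the SET $ETRAP site
		{},
	}
	fmt.Println(protection(blocks))
}
```

Here the SET $ETRAP block is reachable on a path that bypasses the NEW, so its entry state is unprotected.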
func FormalParams ¶
func FormalParams(root parse.Node, src []byte) map[int][]string
FormalParams maps each label's start row (0-based, matching CFG.LabelRow) to the names of its declared formal parameters. LBL(A,B) → {row: ["A", "B"]}. Labels with no formals are absent. The structural walk mirrors the CFG build, since m-parse exposes no parent pointers.
func HeldAtExit ¶
func HeldAtExit(cfg CFG, src []byte) []string
HeldAtExit returns the set of lock names still held when control reaches the label's exit on any path — the LOCK-leak signal (M-MOD-025). The result is sorted for determinism.
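A "held on any path" analysis is a may-analysis with a union meet, followed by a sort for stable output. The sketch below is a simplified illustration under that reading; `lockBlock` and `heldAtExit` are hypothetical names, and each block is assumed to acquire or release whole lock names.

```go
package main

import (
	"fmt"
	"sort"
)

// lockBlock is a hypothetical block listing the lock names it acquires
// (LOCK +name) and releases (LOCK -name).
type lockBlock struct {
	Acquire, Release []string
	Succ             []int
}

// heldAtExit unions held-lock sets along every path and returns the names
// held at the last block (the exit), sorted. blocks must be non-empty.
func heldAtExit(blocks []lockBlock) []string {
	in := make([]map[string]bool, len(blocks))
	for i := range in {
		in[i] = map[string]bool{}
	}
	for changed := true; changed; {
		changed = false
		for i, b := range blocks {
			out := map[string]bool{}
			for name := range in[i] {
				out[name] = true
			}
			for _, name := range b.Release {
				delete(out, name)
			}
			for _, name := range b.Acquire {
				out[name] = true
			}
			for _, s := range b.Succ {
				for name := range out {
					if !in[s][name] {
						in[s][name] = true
						changed = true
					}
				}
			}
		}
	}
	held := make([]string, 0, len(in[len(blocks)-1]))
	for name := range in[len(blocks)-1] {
		held = append(held, name)
	}
	sort.Strings(held)
	return held
}

func main() {
	// entry -> LOCK +^B,+^A -> LOCK -^B -> exit, with an early exit after
	// the first command.
	blocks := []lockBlock{
		{Succ: []int{1}},
		{Acquire: []string{"^B", "^A"}, Succ: []int{2, 3}},
		{Release: []string{"^B"}, Succ: []int{3}},
		{},
	}
	fmt.Println(heldAtExit(blocks))
}
```

Sorting the result matters because Go map iteration order is randomized; without it, repeated runs could report the same leak in different orders.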
Types ¶
type Block ¶
type Block struct {
ID int
Kind string // "entry" | "command" | "exit"
Command parse.Node
HasCmd bool
Succ []int
Edges []string
Line int // 1-based
}
Block is a node in a per-label CFG. Succ and Edges are parallel.
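Because Succ and Edges are parallel, Edges[i] labels the edge to Succ[i]. The sketch below walks both slices together. It uses a local `block` type that mirrors only a subset of the documented fields (Command and HasCmd are omitted because they depend on the m-parse package), and the edge labels in the example graph are chosen purely for illustration.

```go
package main

import "fmt"

// block mirrors a subset of the documented Block fields.
type block struct {
	ID    int
	Kind  string // "entry" | "command" | "exit"
	Succ  []int
	Edges []string // parallel to Succ
	Line  int      // 1-based
}

// describeEdges renders every edge as "from -kind-> to", pairing Succ[i]
// with Edges[i].
func describeEdges(blocks []block) []string {
	var out []string
	for _, b := range blocks {
		for i, to := range b.Succ {
			out = append(out, fmt.Sprintf("%d -%s-> %d", b.ID, b.Edges[i], to))
		}
	}
	return out
}

func main() {
	// A hypothetical label: entry, two commands, exit.
	blocks := []block{
		{ID: 0, Kind: "entry", Succ: []int{1}, Edges: []string{"fall"}},
		{ID: 1, Kind: "command", Succ: []int{2, 3}, Edges: []string{"fall", "exit"}, Line: 1},
		{ID: 2, Kind: "command", Succ: []int{3}, Edges: []string{"fall"}, Line: 2},
		{ID: 3, Kind: "exit"},
	}
	for _, e := range describeEdges(blocks) {
		fmt.Println(e)
	}
}
```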
type CFG ¶
CFG is one label's control-flow graph. Block 0 is the entry; the last block is the exit.
type Effects ¶
Effects are the local-variable effects of a command (or a single argument). Defs/Kills are name sets; KillsAll captures the argumentless KILL/NEW semantics (every local in the current frame); Uses is ordered so a diagnostic can point at a specific read site.
type EtrapLeak ¶
EtrapLeak is a `SET $ETRAP=...` site not guarded by a `NEW $ETRAP` on every path from the label entry — the new error handler escapes the label into the caller's stack. Positions are 1-based; Col/EndCol span the offending command.
func EtrapLeaks ¶
EtrapLeaks returns every unguarded SET $ETRAP site in the label (M-MOD-027). It runs the protection MUST-analysis once, then reports each SET-$ETRAP block whose entry is not protected.
type StaleTestRead ¶
StaleTestRead is a read of $TEST ($T) at a point where no $TEST-setting command is guaranteed to have run on every path from the label entry — the value may be left over from a much earlier command (M-MOD-017). Positions are 1-based; Col/EndCol span the special variable.
func StaleTestReads ¶
func StaleTestReads(cfg CFG, src []byte) []StaleTestRead
StaleTestReads returns the stale $TEST reads in the label, one per source line (consecutive reads collapse). It runs the freshness MUST-analysis, then reports $TEST/$T reads in blocks whose entry is not fresh.
type TaintConfig ¶
type TaintConfig struct {
// FormalsTainted taints the label's formal parameters at entry. Default
// true — public-label formals are attack surface.
FormalsTainted bool
// Sanitizers are uppercased intrinsic-function keywords whose output is
// treated as clean regardless of input taint (e.g. $LENGTH returns a
// number).
Sanitizers map[string]bool
}
TaintConfig configures the taint analyzer.
func DefaultTaintConfig ¶
func DefaultTaintConfig() TaintConfig
DefaultTaintConfig taints formals and treats $L/$LENGTH/$A/$ASCII as sanitizers — matching the Python reference defaults.
type TaintFlow ¶
type TaintFlow struct {
Label string
Name string // the first tainted variable name in the sink subtree
SinkKind string // "indirection (@…)" | "XECUTE argument"
Line int
Col int
EndLine int
EndCol int
}
TaintFlow is one sink reached by a tainted local variable (the raw M-MOD-036 signal, before the lint layer dedups per (label, var)). Positions are 1-based.
func TaintFlows ¶
func TaintFlows(cfg CFG, src []byte, formals []string, config TaintConfig) []TaintFlow
TaintFlows returns, in source order, every indirection or XECUTE-argument sink in cfg whose subtree references a tainted variable, naming the first such var. No dedup is applied here — the lint layer collapses to one finding per (label, var).
type UndefinedRead ¶
UndefinedRead is one read of a local variable that may not be definitely assigned on every path from the label entry (the raw M-MOD-024 signal, before the lint layer applies the Kernel allowlist and per-label dedup). Positions are 1-based; Col/EndCol span the variable name.
func UndefinedReads ¶
func UndefinedReads(cfg CFG, src []byte, formals []string) []UndefinedRead
UndefinedReads returns, in source order, every local-variable read in cfg that is not guaranteed defined on every prior path. It runs the definite-assignment analysis, tracks running defs within each command (so S A=1,B=A sees A defined for B's RHS), and suppresses reads protected by the IF $G(X)="" SET X=default idiom. No dedup or allowlist is applied here.
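The "running defs within each command" step can be pictured as a left-to-right walk over a command's arguments, checking each argument's reads before recording its definition. The following is a minimal sketch of that idea only; the `arg` type and `undefinedUses` function are hypothetical, and the $G(X) idiom suppression is not modeled.

```go
package main

import "fmt"

// arg is a hypothetical view of one command argument: the locals it reads
// and the local it assigns (empty if none).
type arg struct {
	Uses []string
	Def  string
}

// undefinedUses walks the arguments in order, reporting reads of names that
// are neither defined at block entry nor by an earlier argument.
func undefinedUses(definedAtEntry map[string]bool, args []arg) []string {
	running := make(map[string]bool, len(definedAtEntry))
	for name := range definedAtEntry {
		running[name] = true
	}
	var undefined []string
	for _, a := range args {
		for _, u := range a.Uses {
			if !running[u] {
				undefined = append(undefined, u)
			}
		}
		if a.Def != "" {
			running[a.Def] = true // visible to later arguments only
		}
	}
	return undefined
}

func main() {
	// S A=1,B=A: A is defined before B's right-hand side reads it.
	fmt.Println(undefinedUses(nil, []arg{{Def: "A"}, {Uses: []string{"A"}, Def: "B"}}))
	// S B=A,A=1: A is read before it is defined.
	fmt.Println(undefinedUses(nil, []arg{{Uses: []string{"A"}, Def: "B"}, {Def: "A"}}))
}
```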
Known limitations (faithful to the reference): GOTO targets within the routine are over-approximated as exits; FOR loops have no back-edge, so a first-iteration read may be under-reported; OPEN device-parameter syntax can parse as local variables and over-report on I/O code.