Documentation
¶
Overview ¶
Package cfg provides control flow graph representation for dataflow analysis.
A control flow graph (CFG) models the flow of control through a function body. Each node represents a program point (assignment, branch, call, etc.) and edges represent possible control flow between points.
Core Types ¶
Point: An index identifying a location in the CFG. Points are local to each function's CFG and serve as keys for type state maps.
Node: Contains metadata about a program point including its kind, target variable (for assignments), callee (for calls), and condition info (for branches).
Edge: A directed edge from one point to another, with a condition flag indicating whether it's the then-branch (true) or else-branch (false) of a conditional.
Graph: Interface for CFG access, implemented by CFG. Provides node lookup, edge traversal, predecessor/successor queries, and reverse post-order iteration.
SSA-Like Identity ¶
The cfg package provides SymbolID for SSA-style value identity. Unlike SSA which creates new versions at each assignment, SymbolID identifies the lexical variable binding. Combined with Point, this enables tracking which value a name refers to at each program point.
Analysis Support ¶
The CFG supports forward dataflow analysis via RPO (reverse post-order) traversal. Branch conditions are exposed for constraint extraction, and join points are identified for type merging.
Index ¶
- type BoundIdent
- type CFG
- func (c *CFG) AddBranch(condVar SymbolID, condCheck CondCheck) Point
- func (c *CFG) AddEdge(from, to Point, cond bool)
- func (c *CFG) AddNode(kind NodeKind, target SymbolID, callee string) Point
- func (c *CFG) EdgeCond(from, to Point) (bool, bool)
- func (c *CFG) Edges() []Edge
- func (c *CFG) Entry() Point
- func (c *CFG) Exit() Point
- func (c *CFG) ID() uint64
- func (c *CFG) IsBranch(p Point) bool
- func (c *CFG) IsJoin(p Point) bool
- func (c *CFG) Node(p Point) *Node
- func (c *CFG) Predecessor(p Point) Point
- func (c *CFG) Predecessors(p Point) []Point
- func (c *CFG) RPO() []Point
- func (c *CFG) Reachable() map[Point]bool
- func (c *CFG) RemoveOutgoing(from Point)
- func (c *CFG) Size() int
- func (c *CFG) Successor(p Point) Point
- func (c *CFG) Successors(p Point) []Point
- func (c *CFG) UnreachablePoints() []Point
- func (c *CFG) ValidateEdges() error
- type CondCheck
- type CondCheckKind
- type Edge
- type EdgeError
- type Graph
- type Node
- type NodeKind
- type ParamProvider
- type PhiNode
- type PhiOperand
- type Point
- type SSAVersioned
- type SymbolID
- type SymbolKind
- type SymbolScopeProvider
- type Version
- type VersionedGraph
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type BoundIdent ¶
BoundIdent represents an identifier that has been resolved to a symbol.
A BoundIdent is created when a name reference in the AST is resolved to its declaration. It captures both the SymbolID (for lookup) and the Point (for flow-sensitive analysis).
By using BoundIdent instead of raw strings, the type system enforces that variable identity comes from proper binding resolution, not string comparison. Two variables with the same name in different scopes have different BoundIdents.
func NewBoundIdent ¶
func NewBoundIdent(name string, symbol SymbolID, point Point) BoundIdent
NewBoundIdent creates a BoundIdent from a name, symbol, and point. Returns zero value if name is empty or symbol is zero.
func (BoundIdent) IsValid ¶
func (b BoundIdent) IsValid() bool
IsValid returns true if this represents a valid binding.
func (BoundIdent) Name ¶
func (b BoundIdent) Name() string
Name returns the identifier name for display purposes only.
This method is for error messages and debugging output. For identity comparisons, use the Symbol field, which distinguishes between different variables that happen to have the same name.
type CFG ¶
type CFG struct {
Nodes []Node
// contains filtered or unexported fields
}
CFG represents the control flow graph for a function.
A CFG is built during AST analysis and consumed by the type checker for flow-sensitive analysis. The graph is immutable after construction.
Structure:
- Nodes: flat array indexed by Point
- Edges: directed edges with predecessor/successor maps for efficient lookup
- Entry/Exit: special nodes for function boundaries
The CFG is built incrementally via AddNode/AddEdge, then traversed via RPO for forward dataflow analysis.
func (*CFG) EdgeCond ¶
EdgeCond returns the condition value for edge from->to. Returns (true, ok) for then-branch, (false, ok) for else-branch.
func (*CFG) Predecessor ¶
Predecessor returns single predecessor (for non-join nodes).
func (*CFG) Predecessors ¶
Predecessors returns all predecessors of p.
func (*CFG) RPO ¶
RPO returns nodes in reverse post-order for forward dataflow analysis.
Reverse post-order (RPO) is the optimal traversal order for forward dataflow analysis: predecessors are visited before successors (except for back-edges in loops). This ensures that type state propagates correctly through the control flow graph.
Only nodes reachable from the entry point are included. Unreachable code (after unconditional return, for example) is excluded.
func (*CFG) RemoveOutgoing ¶
RemoveOutgoing removes all outgoing edges from a node.
func (*CFG) Successors ¶
Successors returns all successors of p.
func (*CFG) UnreachablePoints ¶
UnreachablePoints returns all nodes not reachable from entry.
func (*CFG) ValidateEdges ¶
ValidateEdges checks that all edges reference valid node indices. Returns nil if valid, otherwise returns an error describing the issue.
type CondCheck ¶
type CondCheck struct {
Kind CondCheckKind
TypeName string // Only for CheckTypeEqual/CheckTypeNot
}
CondCheck represents a condition check in a branch node.
type CondCheckKind ¶
type CondCheckKind uint8
CondCheckKind identifies the type of condition check in a branch.
The type checker uses CondCheckKind to extract constraints from branch conditions without re-analyzing the expression. This enables efficient type narrowing in then/else branches.
const ( CheckNone CondCheckKind = iota // Complex expression, no simple constraint CheckTruthy // if x: narrows x to truthy values CheckFalsy // if not x: narrows x to falsy values CheckNil // x == nil: narrows x to nil CheckNotNil // x ~= nil: narrows x to non-nil CheckLimit // Numeric for loop limit (i <= n) CheckTypeEqual // type(x) == "typename": narrows to that type CheckTypeNot // type(x) ~= "typename": excludes that type )
Condition check kind constants represent recognizable branch patterns.
type Edge ¶
Edge represents a directed control flow edge between two points.
For branch nodes (NodeBranch), the Cond field distinguishes branches:
- Cond=true: the "then" branch, taken when condition is truthy
- Cond=false: the "else" branch, taken when condition is falsy
For non-branch nodes, Cond is typically true (single successor).
type Graph ¶
type Graph interface {
ID() uint64 // Unique identifier for this CFG instance
Entry() Point // Function entry point (always node 0)
Exit() Point // Function exit point
Node(p Point) *Node // Node metadata at point p
RPO() []Point // Reverse post-order for forward analysis
Predecessors(p Point) []Point // Incoming edges
Successor(p Point) Point // Single successor (non-branch nodes)
Successors(p Point) []Point // All successors (branch nodes have 2)
Edges() []Edge // All edges in the graph
Size() int // Number of nodes
EdgeCond(from, to Point) (bool, bool) // Branch condition for an edge
IsJoin(p Point) bool // True if p has multiple predecessors
IsBranch(p Point) bool // True if p has multiple successors
}
Graph is the interface for control flow graphs.
This interface abstracts over different CFG implementations, allowing the type checker to work with various CFG builders.
Key methods for dataflow analysis:
- RPO(): Returns nodes in reverse post-order for forward analysis
- Predecessors/Successors: Enable backward/forward traversal
- EdgeCond: Retrieves branch condition for type narrowing
- IsJoin/IsBranch: Identifies merge points and decision points
type Node ¶
type Node struct {
Point Point
Kind NodeKind
Target SymbolID // Variable for assignments (0 = none or unresolved)
Callee string // Function for calls (global/external name)
CondVar SymbolID // Variable being tested (0 = none or complex expression)
CondCheck CondCheck // Condition check type and optional type name
LoopVars []SymbolID
LoopLocals []SymbolID
// LoopPreheader is the unique predecessor that enters a loop from outside
// (i.e., not a back-edge). For loops with multiple entry points or complex
// goto patterns, this points to the primary loop entry.
LoopPreheader Point
LoopPreheaderSet bool
}
Node represents a CFG node with metadata about the program point.
Different node kinds use different fields:
- NodeAssign: Target holds the assigned variable's SymbolID
- NodeCall: Callee holds the function name for global/external calls
- NodeBranch: CondVar and CondCheck describe the condition for narrowing
- NodeJoin: LoopVars, LoopLocals, LoopPreheader describe loop structure
The Point field is the node's index in the CFG's Nodes slice.
type NodeKind ¶
type NodeKind uint8
NodeKind identifies the type of CFG node.
const ( NodeEntry NodeKind = iota // Function entry point, always first node NodeExit // Function exit point, always last node NodeAssign // Assignment statement (local x = expr) NodeCall // Function call expression NodeBranch // Conditional branch (if, while, for condition) NodeJoin // Join point where multiple paths merge NodeReturn // Return statement NodeScopeEnter // Lexical scope entry (function, block, loop body) NodeScopeExit // Lexical scope exit (end of block) NodeTypeDef // Type definition (type annotation) )
Node kind constants identify the type of each CFG node.
type ParamProvider ¶
type ParamProvider interface {
// ParamNames returns the function parameter names in order.
// Returns nil for block graphs or functions with no parameters.
ParamNames() []string
// ParamSymbols returns the function parameter symbol IDs in order.
// Returns nil for block graphs or functions with no parameters.
ParamSymbols() []SymbolID
// ParamDeclPoints returns the CFG points where parameters are declared.
// Returns nil for block graphs or functions with no parameters.
ParamDeclPoints() []Point
}
ParamProvider provides function parameter information.
This interface enables the type checker to access function parameter metadata for signature checking and type initialization.
type PhiNode ¶
type PhiNode struct {
Point Point // CFG point where the phi is located
Target Version // The new version created by this phi
Operands []PhiOperand // Incoming versions from each predecessor
}
PhiNode represents a phi function at a control flow join point.
Phi nodes are the mechanism for merging variable versions after control flow divergence. When an if/else or loop creates multiple definitions of a variable, a phi node at the join point unifies them into a single version.
Example:
if cond then
x = "hello" -- x@1
else
x = "world" -- x@2
end
-- phi node: x@3 = phi(x@1 from then-branch, x@2 from else-branch)
print(x) -- uses x@3
For type checking, the phi node's type is typically the union of all operand types.
type PhiOperand ¶
type PhiOperand struct {
From Point // Predecessor point where this version comes from
Version Version // The SSA version from that predecessor
}
PhiOperand represents one incoming value for a phi node.
At a join point where multiple control flow paths merge, a phi node needs to know which version of a variable comes from each predecessor. Each PhiOperand pairs a predecessor point with the version visible at that point.
type Point ¶
type Point uint32
Point represents a location in a control flow graph. Each Point is an index into the CFG's node array, local to that CFG.
type SSAVersioned ¶
type SSAVersioned interface {
// VisibleVersion returns the SSA version of a symbol visible at a point.
// Returns a zero Version if the symbol is not defined on all paths to this point.
VisibleVersion(p Point, sym SymbolID) Version
// AllVisibleVersions returns all symbol versions visible at a point.
// The returned map should not be modified by callers.
AllVisibleVersions(p Point) map[SymbolID]Version
// PhiNodes returns all phi nodes in the graph.
// Used to determine type at join points by unioning operand types.
PhiNodes() []PhiNode
}
SSAVersioned provides SSA versioning data for a CFG.
This interface abstracts over SSA construction, allowing the type checker to query which version of a variable is visible at each program point without knowing the construction details.
The versioning information enables flow-sensitive type checking: different assignments can have different types, and the checker uses the version to look up the correct type.
type SymbolID ¶
type SymbolID uint64
SymbolID uniquely identifies a variable declaration across the program.
SymbolID provides declaration-level identity, distinguishing between variables that happen to have the same name but are different bindings:
local x = 1 -- SymbolID 100
if cond then
local x = 2 -- SymbolID 101 (different binding)
print(x) -- refers to SymbolID 101
end
print(x) -- refers to SymbolID 100
The combination of SymbolID and Version forms a complete SSA identity, where SymbolID identifies which variable and Version identifies which definition of that variable.
SymbolID 0 is reserved for unresolved or unknown references.
func NextSymbolID ¶
func NextSymbolID() SymbolID
NextSymbolID generates a unique symbol ID. Thread-safe via atomic operations.
func ReserveSymbolIDs ¶
ReserveSymbolIDs reserves a contiguous block of symbol IDs and returns the first ID in the block. Returns 0 when n <= 0.
type SymbolKind ¶
type SymbolKind int
SymbolKind classifies how a symbol was declared.
The declaration kind affects type checking behavior:
- Param: Initialized by caller, type from function signature
- Local: Initialized at declaration point
- Global: Initialized from global environment
- Upvalue: Captured from enclosing scope, may have multiple definitions
const ( // SymbolUnknown indicates the symbol kind is not known. SymbolUnknown SymbolKind = iota // SymbolParam indicates a function parameter. SymbolParam // SymbolLocal indicates a local variable. SymbolLocal // SymbolGlobal indicates a global variable. SymbolGlobal // SymbolUpvalue indicates an upvalue (captured variable). SymbolUpvalue )
type SymbolScopeProvider ¶
type SymbolScopeProvider interface {
// SymbolAt returns the SymbolID for a variable name at a specific CFG point.
// Returns (0, false) if the name is not visible at that point.
SymbolAt(p Point, name string) (SymbolID, bool)
// AllSymbolsAt returns all visible symbols at a CFG point.
// Returns a map of variable names to their SymbolIDs.
AllSymbolsAt(p Point) map[string]SymbolID
// DeclarationPoint returns the CFG point where a symbol was declared.
// Returns (0, false) if the symbol is unknown.
DeclarationPoint(sym SymbolID) (Point, bool)
// NameOf returns the variable name for a symbol (for display purposes).
// Returns empty string if the symbol is unknown.
NameOf(sym SymbolID) string
// SymbolKind returns the kind of a symbol (Param, Local, or Global).
// Returns (SymbolUnknown, false) if the symbol is not known.
SymbolKind(sym SymbolID) (SymbolKind, bool)
}
SymbolScopeProvider provides symbol visibility information at CFG points.
This interface enables the type checker to resolve variable names to their SymbolIDs based on lexical scoping rules at each program point.
type Version ¶
type Version struct {
Root string // Variable name (for display and lookup)
Symbol SymbolID // Declaration identity (distinguishes same-named variables in different scopes)
ID int // Version number within the function (0 = undefined/uninitialized)
}
Version represents a stable SSA version of a variable.
In SSA (Static Single Assignment) form, each assignment creates a new version of a variable. The Version type captures both the variable identity and which assignment created this particular value:
local x = 1 -- x@1
x = x + 1 -- x@2 (new version)
if cond then
x = 10 -- x@3 in then-branch
else
x = 20 -- x@4 in else-branch
end
-- x@5 = phi(x@3, x@4) at join point
Version components:
- Root: The variable name for display purposes
- Symbol: The SymbolID of the declaration (distinguishes shadowed names)
- ID: The version number (0 = undefined, 1+ = specific assignment)
func (Version) IsZero ¶
IsZero returns true if this is an undefined/uninitialized version. Zero versions indicate the variable has not been assigned on this path.
type VersionedGraph ¶
type VersionedGraph interface {
Graph
SSAVersioned
SymbolScopeProvider
ParamProvider
}
VersionedGraph combines a CFG graph with SSA versioning and symbol scope info.
This is the full interface required by the type checker for flow-sensitive analysis. It combines:
- Graph: control flow structure for traversal
- SSAVersioned: version information for flow-sensitive types
- SymbolScopeProvider: name resolution at each point
- ParamProvider: function signature information