compile

package
v0.7.0
Warning

This package is not in the latest version of its module.

Published: May 6, 2026 License: Apache-2.0 Imports: 7 Imported by: 0

Documentation

Overview

Package compile holds the AST-to-bytecode pipeline. Mirrors cpython/Python/instruction_sequence.c, codegen.c, flowgraph.c, assemble.c, and compile.c.

Index

Constants

const (
	FastLocal  uint8 = 0x20
	FastCell   uint8 = 0x40
	FastFree   uint8 = 0x80
	FastHidden uint8 = 0x10
)

FastLocalKind bits. Mirror cpython/Include/internal/pycore_code.h CO_FAST_LOCAL / CO_FAST_CELL / CO_FAST_FREE / CO_FAST_HIDDEN.

CPython: Include/internal/pycore_code.h:L42 CO_FAST_*
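The kind bits are flags, so a single slot can carry more than one of them. The following is a minimal sketch of decoding one kinds byte, assuming the entries of Code.LocalsPlusKinds are built from these bits; the constants are redeclared locally so the example compiles on its own, and describeKind is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the package constants so the sketch is self-contained.
const (
	FastHidden uint8 = 0x10
	FastLocal  uint8 = 0x20
	FastCell   uint8 = 0x40
	FastFree   uint8 = 0x80
)

// describeKind is a hypothetical helper (not part of the package) that
// names every kind bit set in one kinds byte.
func describeKind(kind uint8) string {
	var parts []string
	if kind&FastLocal != 0 {
		parts = append(parts, "local")
	}
	if kind&FastCell != 0 {
		parts = append(parts, "cell")
	}
	if kind&FastFree != 0 {
		parts = append(parts, "free")
	}
	if kind&FastHidden != 0 {
		parts = append(parts, "hidden")
	}
	return strings.Join(parts, "|")
}

func main() {
	// An argument captured by an inner function is both a local and a cell.
	fmt.Println(describeKind(FastLocal | FastCell))
	fmt.Println(describeKind(FastFree))
}
```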

const (
	CoOptimized         uint32 = 0x0001
	CoNewLocals         uint32 = 0x0002
	CoVarargs           uint32 = 0x0004
	CoVarkeywords       uint32 = 0x0008
	CoNested            uint32 = 0x0010
	CoGenerator         uint32 = 0x0020
	CoCoroutine         uint32 = 0x0080
	CoIterableCoroutine uint32 = 0x0100
	CoAsyncGenerator    uint32 = 0x0200
	CoHasDocstring      uint32 = 0x4000000
	CoMethod            uint32 = 0x8000000
)

CO_* flags. CPython: Include/cpython/code.h.

const CoNoFree uint32 = 0x0040

CoNoFree is set when a code object captures no free variables and has no cell variables. CPython sets it in compute_code_flags after flowgraph + symtable agree no closure cells are needed.

CPython: Include/cpython/code.h CO_NOFREE
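The rule above can be sketched as a small flag computation. This is an illustration under stated assumptions, not the package's implementation: withNoFree is a hypothetical helper, and the constants are redeclared locally so the example stands alone.

```go
package main

import "fmt"

// Local copies of the package constants so the sketch is self-contained.
const (
	CoOptimized uint32 = 0x0001
	CoNewLocals uint32 = 0x0002
	CoNoFree    uint32 = 0x0040
)

// withNoFree is a hypothetical helper: it sets CoNoFree when the code
// object has neither free variables nor cell variables, and clears it
// otherwise.
func withNoFree(flags uint32, nFreeVars, nCellVars int) uint32 {
	if nFreeVars == 0 && nCellVars == 0 {
		return flags | CoNoFree
	}
	return flags &^ CoNoFree
}

func main() {
	plain := withNoFree(CoOptimized|CoNewLocals, 0, 0)
	closure := withNoFree(CoOptimized|CoNewLocals, 1, 0)
	fmt.Println(plain&CoNoFree != 0, closure&CoNoFree != 0)
}
```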

const MaxOparg = 1 << 30

MaxOparg is 1<<30. CPython asserts oparg < (1<<30).

CPython: Python/instruction_sequence.c:L122

const MaxOpcode = 511

MaxOpcode mirrors MAX_OPCODE in instruction_sequence.c.

CPython: Python/instruction_sequence.c:L113 MAX_OPCODE

Variables

This section is empty.

Functions

func AssembleExceptionTable added in v0.6.0

func AssembleExceptionTable(seq *Sequence) []byte

AssembleExceptionTable is the exported wrapper for assembleExceptionTable. External tests use it to build exception-table bytes from a curated Sequence and round-trip them through the vm reader without going through the full compile pipeline.

func AssembleLineTable added in v0.6.0

func AssembleLineTable(seq *Sequence, firstLineno int) []byte

AssembleLineTable is the exported wrapper for assembleLineTable. External tests use it to build location-table bytes from a curated Sequence and round-trip them through the vm reader without going through the full compile pipeline.

func Disassemble

func Disassemble(co *Code) string

Disassemble returns a human-readable string listing of co. Output shape matches `dis.dis(co)`: one line per instruction, columns are "lineno offset opname oparg (display)".

CPython: Lib/dis.py _disassemble_bytes

func HasTarget

func HasTarget(op Opcode) bool

HasTarget reports whether op carries a jump target oparg. CPython spells this OPCODE_HAS_JUMP and uses it from the instruction-sequence label resolver.

Types

type Assembler

type Assembler struct {
	Filename       string
	FirstLineno    int
	Code           []byte
	LineTable      []byte
	ExceptionTable []byte
	// contains filtered or unexported fields
}

Assembler is the per-call state. Public so tests can drive individual phases (emit, varint, location, exception).

CPython: Python/assemble.c struct assembler

type BasicBlock

type BasicBlock struct {
	Instrs       []Instr
	Next         *BasicBlock
	Label        int
	StartDepth   int
	Predecessors int
	Visited      bool
	Cold         bool
	Warm         bool
	Reachable    bool
}

BasicBlock is one node of the CFG. CPython builds a linked list rather than a slice so that inline / merge passes can splice cheaply. The Go port does the same with Next.

CPython: Python/flowgraph.c basicblock
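To illustrate why a Next-linked list makes splicing cheap, here is a minimal sketch that unlinks blocks from the list without moving any other element. The block type and dropUnreachable are simplified, hypothetical stand-ins; they are not the package's BasicBlock or one of its passes.

```go
package main

import "fmt"

// block is a simplified stand-in for BasicBlock with only the fields
// this sketch needs.
type block struct {
	label     int
	reachable bool
	next      *block
}

// dropUnreachable unlinks every block whose reachable flag is false and
// returns the new head. Each removal is a single pointer update.
func dropUnreachable(head *block) *block {
	dummy := &block{next: head}
	for prev := dummy; prev.next != nil; {
		if !prev.next.reachable {
			prev.next = prev.next.next // splice the block out
		} else {
			prev = prev.next
		}
	}
	return dummy.next
}

func main() {
	b3 := &block{label: 3, reachable: true}
	b2 := &block{label: 2, reachable: false, next: b3}
	b1 := &block{label: 1, reachable: true, next: b2}
	for b := dropUnreachable(b1); b != nil; b = b.next {
		fmt.Println(b.label)
	}
}
```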

type Builder

type Builder struct {
	Head *BasicBlock
	Tail *BasicBlock
	// contains filtered or unexported fields
}

Builder is the CFG builder state: head/tail of the block list plus the scratch state the optimiser passes share (label table, current block).

CPython: Python/flowgraph.c cfg_builder

func FromSequence

func FromSequence(seq *Sequence) (*Builder, error)

FromSequence builds a CFG from a flat instruction sequence. Each label target starts a new block; terminators (RETURN_VALUE, RAISE_VARARGS, RERAISE, unconditional JUMP) end the current block.

CPython: Python/flowgraph.c:L3923 _PyCfgBuilder_FromInstructionSequence
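The splitting rule described above can be sketched on a simplified instruction type. Everything here is an assumption made for illustration: instr, isTerminator, and splitBlocks are hypothetical names, opcodes are plain strings, and the real FromSequence works on Sequence and BasicBlock values.

```go
package main

import "fmt"

// instr is a simplified stand-in for Instr.
type instr struct {
	op     string
	target bool // true if a label is bound to this instruction
}

// isTerminator lists the opcodes that end a block in this sketch.
func isTerminator(op string) bool {
	switch op {
	case "RETURN_VALUE", "RAISE_VARARGS", "RERAISE", "JUMP":
		return true
	}
	return false
}

// splitBlocks starts a new block at every label target and closes the
// current block after every terminator.
func splitBlocks(seq []instr) [][]instr {
	var blocks [][]instr
	var cur []instr
	for _, in := range seq {
		if in.target && len(cur) > 0 {
			blocks = append(blocks, cur)
			cur = nil
		}
		cur = append(cur, in)
		if isTerminator(in.op) {
			blocks = append(blocks, cur)
			cur = nil
		}
	}
	if len(cur) > 0 {
		blocks = append(blocks, cur)
	}
	return blocks
}

func main() {
	seq := []instr{
		{op: "LOAD_FAST"},
		{op: "POP_JUMP_IF_FALSE"},
		{op: "LOAD_CONST"},
		{op: "STORE_FAST"},
		{op: "LOAD_CONST", target: true},
		{op: "RETURN_VALUE"},
	}
	for i, b := range splitBlocks(seq) {
		fmt.Println(i, len(b), b[0].op)
	}
}
```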

func (*Builder) ToSequence

func (b *Builder) ToSequence() (*Sequence, error)

ToSequence flattens a CFG back to a flat sequence. Walks blocks in list order, copying each instruction out. Jump targets are already instruction offsets (label resolution happened before the CFG build), so the caller does not need a fresh ApplyLabelMap.

CPython: Python/flowgraph.c:L3988 _PyCfg_ToInstructionSequence

type Code

type Code struct {
	Argcount        int
	PosOnlyArgCount int
	KwOnlyArgCount  int
	NLocals         int
	Stacksize       int
	Flags           uint32
	Code            []byte
	Consts          []any
	Names           []string
	VarNames        []string
	FreeVars        []string
	CellVars        []string
	LocalsPlusNames []string
	LocalsPlusKinds []uint8
	Filename        string
	Name            string
	Qualname        string
	Firstlineno     int
	Linetable       []byte
	ExceptionTable  []byte
	Nested          []*Code
}

Code is the v0.5 placeholder for objects.Code. It carries every field the assembler fills today (marshal parity is the v0.5 gate; fields the VM needs land alongside the v0.6 interpreter port).

CPython: Include/cpython/code.h PyCodeObject

func Assemble

func Assemble(seq *Sequence, info *Info, unit *Unit, filename string) (*Code, error)

Assemble builds a final Code object from the post-flowgraph Sequence plus per-unit metadata. Mirrors CPython's _PyAssemble_MakeCodeObject.

CPython: Python/assemble.c:L731 _PyAssemble_MakeCodeObject

func Compile

func Compile(mod ast.Mod, filename string, optimize int) (*Code, error)

Compile runs the full pipeline on a parsed module and returns the top-level Code object. Nested code objects (one per nested scope) are reachable through its Nested field.

Pipeline:

  1. symtable.Build resolves every name to its scope.
  2. codegen walks the AST top-down; each scope produces a Unit.
  3. flowgraph.Optimize resolves labels and runs the optimisation passes that have landed.
  4. assemble packs the Sequence and pools into a Code object.

CPython: Python/compile.c:L353 _PyAST_Compile

type Compiler

type Compiler struct {
	Filename string
	Optimize int
	Future   *future.Features
	Symtable *symtable.Table
	// contains filtered or unexported fields
}

Compiler is the long-lived driver state shared by every Codegen call within one Compile invocation.

CPython: Python/compile.c compiler

func NewCompiler

func NewCompiler(filename string, optimize int, ff *future.Features, st *symtable.Table) *Compiler

NewCompiler builds a fresh driver. Symtable must already be built over mod.

CPython: Python/compile.c new_compiler

func (*Compiler) Codegen

func (c *Compiler) Codegen(sc *symtable.Entry, mod ast.Mod) (*Unit, error)

Codegen emits instructions for one scope. The caller drives the walk; codegen does not recurse into nested scopes itself (the driver pushes a new unit and calls Codegen again).

CPython: Python/codegen.c _PyCodegen_Module / _PyCodegen_FunctionBody

type ConstTuple

type ConstTuple struct {
	Values []any
}

ConstTuple is the codegen-side placeholder for a Python tuple constant. The assembler converts it to a real PyTuple during marshal.

type ExceptHandler

type ExceptHandler struct {
	Start  int
	End    int
	Target int
	Depth  int
	Lasti  bool
}

ExceptHandler is one row in the exception table (CPython's zero-cost exception handling, introduced in 3.11). Start, End, Target are byte offsets into the final co_code (filled by 1628 assemble); Depth is the stack depth at handler entry; Lasti is the push-lasti bit.

CPython: Python/flowgraph.c ExceptionHandler / Python/assemble.c emit_exception_table_entry
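For orientation, the following sketch packs one row the way CPython's assembler does: four varints (start, size, target, and depth shifted left by one with the lasti bit in the lowest position), each written as big-endian 6-bit groups with 0x40 as the continuation bit and 0x80 marking the first byte of a row. It assumes the Go port keeps the same layout. The local ExceptHandler copy, appendVarint, and encodeRow are hypothetical names, and the sketch encodes whatever integers it is given without adjusting units or depth.

```go
package main

import "fmt"

// ExceptHandler is a local copy of the package type so the sketch
// compiles on its own.
type ExceptHandler struct {
	Start, End, Target, Depth int
	Lasti                     bool
}

// appendVarint appends value (0 <= value < 1<<30) as big-endian 6-bit
// groups. Bit 0x40 marks "more groups follow"; msb (0x80 or 0) is OR-ed
// into the first byte only.
func appendVarint(out []byte, value int, msb byte) []byte {
	for shift := 24; shift > 0; shift -= 6 {
		if value >= 1<<shift {
			out = append(out, (byte(value>>shift)&0x3f)|0x40|msb)
			msb = 0
		}
	}
	return append(out, (byte(value)&0x3f)|msb)
}

// encodeRow appends one exception-table row: start, size, target and
// depth/lasti, with the start-of-row marker on the first byte.
func encodeRow(out []byte, h ExceptHandler) []byte {
	depthLasti := h.Depth << 1
	if h.Lasti {
		depthLasti |= 1
	}
	out = appendVarint(out, h.Start, 0x80)
	out = appendVarint(out, h.End-h.Start, 0)
	out = appendVarint(out, h.Target, 0)
	return appendVarint(out, depthLasti, 0)
}

func main() {
	row := encodeRow(nil, ExceptHandler{Start: 2, End: 70, Target: 100, Depth: 1, Lasti: true})
	fmt.Printf("% x\n", row)
}
```

Because only the first byte of each row has the top bit set, a reader can resynchronise on row boundaries while scanning the table.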

type ExceptHandlerInfo

type ExceptHandlerInfo struct {
	Label         int
	StartDepth    int
	PreserveLasti int
}

ExceptHandlerInfo is the per-instruction exception handler slot. Label < 0 (h_label in CPython) means "no handler". Mirrors _PyExceptHandlerInfo.

CPython: Include/internal/pycore_compile.h _PyExceptHandlerInfo

type Info

type Info struct {
	MaxStackDepth  int
	ExceptionTable []ExceptHandler
	Consts         []any
	LocalsPlus     int
	NLocals        int
	NCellvars      int
	NFreevars      int
}

Info is the per-pass metadata flowgraph hands to assemble. Mirrors the bookkeeping CPython attaches to each cfg_builder.

CPython: Python/flowgraph.c cfg_builder + _PyCfg_OptimizeCodeUnit returns

func Optimize

func Optimize(seq *Sequence, consts *[]any, nlocals, _ int) (*Info, error)

Optimize runs every flowgraph pass on a Sequence in the same order as CPython's _PyCfg_OptimizeCodeUnit. The current port lands the minimum-viable subset (label resolution, stackdepth, NOP cleanup); the remaining passes arrive in follow-on commits per the 1627 spec.

CPython: Python/flowgraph.c:L3659 _PyCfg_OptimizeCodeUnit

type Instr

type Instr struct {
	Op      Opcode
	Oparg   int32
	Loc     ast.Pos
	Handler ExceptHandlerInfo
}

Instr is one entry in a Sequence. Mirrors _PyInstruction.

CPython: Include/internal/pycore_compile.h _PyInstruction

type JumpTargetLabel

type JumpTargetLabel struct {
	// contains filtered or unexported fields
}

JumpTargetLabel is an opaque label id created by NewLabel and bound to an instruction position by UseLabel. Mirrors _PyJumpTargetLabel.

CPython: Include/internal/pycore_compile.h _PyJumpTargetLabel

func (JumpTargetLabel) ID

func (l JumpTargetLabel) ID() int

ID returns the underlying label id. CPython exposes the same integer to Python via `InstructionSequence.new_label`.

CPython: Include/internal/pycore_compile.h _PyJumpTargetLabel.id

type Opcode

type Opcode int32

Opcode is a single bytecode opcode. The numeric values match cpython/Lib/opcode.py and are filled in by compile/opcodes_gen.go.

const (
	CACHE                             Opcode = 0
	BINARY_SLICE                      Opcode = 1
	BUILD_TEMPLATE                    Opcode = 2
	CALL_FUNCTION_EX                  Opcode = 4
	CHECK_EG_MATCH                    Opcode = 5
	CHECK_EXC_MATCH                   Opcode = 6
	CLEANUP_THROW                     Opcode = 7
	DELETE_SUBSCR                     Opcode = 8
	END_FOR                           Opcode = 9
	END_SEND                          Opcode = 10
	EXIT_INIT_CHECK                   Opcode = 11
	FORMAT_SIMPLE                     Opcode = 12
	FORMAT_WITH_SPEC                  Opcode = 13
	GET_AITER                         Opcode = 14
	GET_ANEXT                         Opcode = 15
	GET_ITER                          Opcode = 16
	RESERVED                          Opcode = 17
	GET_LEN                           Opcode = 18
	GET_YIELD_FROM_ITER               Opcode = 19
	INTERPRETER_EXIT                  Opcode = 20
	LOAD_BUILD_CLASS                  Opcode = 21
	LOAD_LOCALS                       Opcode = 22
	MAKE_FUNCTION                     Opcode = 23
	MATCH_KEYS                        Opcode = 24
	MATCH_MAPPING                     Opcode = 25
	MATCH_SEQUENCE                    Opcode = 26
	NOP                               Opcode = 27
	NOT_TAKEN                         Opcode = 28
	POP_EXCEPT                        Opcode = 29
	POP_ITER                          Opcode = 30
	POP_TOP                           Opcode = 31
	PUSH_EXC_INFO                     Opcode = 32
	PUSH_NULL                         Opcode = 33
	RETURN_GENERATOR                  Opcode = 34
	RETURN_VALUE                      Opcode = 35
	SETUP_ANNOTATIONS                 Opcode = 36
	STORE_SLICE                       Opcode = 37
	STORE_SUBSCR                      Opcode = 38
	TO_BOOL                           Opcode = 39
	UNARY_INVERT                      Opcode = 40
	UNARY_NEGATIVE                    Opcode = 41
	UNARY_NOT                         Opcode = 42
	WITH_EXCEPT_START                 Opcode = 43
	BINARY_OP                         Opcode = 44
	BUILD_INTERPOLATION               Opcode = 45
	BUILD_LIST                        Opcode = 46
	BUILD_MAP                         Opcode = 47
	BUILD_SET                         Opcode = 48
	BUILD_SLICE                       Opcode = 49
	BUILD_STRING                      Opcode = 50
	BUILD_TUPLE                       Opcode = 51
	CALL                              Opcode = 52
	CALL_INTRINSIC_1                  Opcode = 53
	CALL_INTRINSIC_2                  Opcode = 54
	CALL_KW                           Opcode = 55
	COMPARE_OP                        Opcode = 56
	CONTAINS_OP                       Opcode = 57
	CONVERT_VALUE                     Opcode = 58
	COPY                              Opcode = 59
	COPY_FREE_VARS                    Opcode = 60
	DELETE_ATTR                       Opcode = 61
	DELETE_DEREF                      Opcode = 62
	DELETE_FAST                       Opcode = 63
	DELETE_GLOBAL                     Opcode = 64
	DELETE_NAME                       Opcode = 65
	DICT_MERGE                        Opcode = 66
	DICT_UPDATE                       Opcode = 67
	END_ASYNC_FOR                     Opcode = 68
	EXTENDED_ARG                      Opcode = 69
	FOR_ITER                          Opcode = 70
	GET_AWAITABLE                     Opcode = 71
	IMPORT_FROM                       Opcode = 72
	IMPORT_NAME                       Opcode = 73
	IS_OP                             Opcode = 74
	JUMP_BACKWARD                     Opcode = 75
	JUMP_BACKWARD_NO_INTERRUPT        Opcode = 76
	JUMP_FORWARD                      Opcode = 77
	LIST_APPEND                       Opcode = 78
	LIST_EXTEND                       Opcode = 79
	LOAD_ATTR                         Opcode = 80
	LOAD_COMMON_CONSTANT              Opcode = 81
	LOAD_CONST                        Opcode = 82
	LOAD_DEREF                        Opcode = 83
	LOAD_FAST                         Opcode = 84
	LOAD_FAST_AND_CLEAR               Opcode = 85
	LOAD_FAST_BORROW                  Opcode = 86
	LOAD_FAST_BORROW_LOAD_FAST_BORROW Opcode = 87
	LOAD_FAST_CHECK                   Opcode = 88
	LOAD_FAST_LOAD_FAST               Opcode = 89
	LOAD_FROM_DICT_OR_DEREF           Opcode = 90
	LOAD_FROM_DICT_OR_GLOBALS         Opcode = 91
	LOAD_GLOBAL                       Opcode = 92
	LOAD_NAME                         Opcode = 93
	LOAD_SMALL_INT                    Opcode = 94
	LOAD_SPECIAL                      Opcode = 95
	LOAD_SUPER_ATTR                   Opcode = 96
	MAKE_CELL                         Opcode = 97
	MAP_ADD                           Opcode = 98
	MATCH_CLASS                       Opcode = 99
	POP_JUMP_IF_FALSE                 Opcode = 100
	POP_JUMP_IF_NONE                  Opcode = 101
	POP_JUMP_IF_NOT_NONE              Opcode = 102
	POP_JUMP_IF_TRUE                  Opcode = 103
	RAISE_VARARGS                     Opcode = 104
	RERAISE                           Opcode = 105
	SEND                              Opcode = 106
	SET_ADD                           Opcode = 107
	SET_FUNCTION_ATTRIBUTE            Opcode = 108
	SET_UPDATE                        Opcode = 109
	STORE_ATTR                        Opcode = 110
	STORE_DEREF                       Opcode = 111
	STORE_FAST                        Opcode = 112
	STORE_FAST_LOAD_FAST              Opcode = 113
	STORE_FAST_STORE_FAST             Opcode = 114
	STORE_GLOBAL                      Opcode = 115
	STORE_NAME                        Opcode = 116
	SWAP                              Opcode = 117
	UNPACK_EX                         Opcode = 118
	UNPACK_SEQUENCE                   Opcode = 119
	YIELD_VALUE                       Opcode = 120
	RESUME                            Opcode = 128
	INSTRUMENTED_END_FOR              Opcode = 234
	INSTRUMENTED_POP_ITER             Opcode = 235
	INSTRUMENTED_END_SEND             Opcode = 236
	INSTRUMENTED_FOR_ITER             Opcode = 237
	INSTRUMENTED_INSTRUCTION          Opcode = 238
	INSTRUMENTED_JUMP_FORWARD         Opcode = 239
	INSTRUMENTED_NOT_TAKEN            Opcode = 240
	INSTRUMENTED_POP_JUMP_IF_TRUE     Opcode = 241
	INSTRUMENTED_POP_JUMP_IF_FALSE    Opcode = 242
	INSTRUMENTED_POP_JUMP_IF_NONE     Opcode = 243
	INSTRUMENTED_POP_JUMP_IF_NOT_NONE Opcode = 244
	INSTRUMENTED_RESUME               Opcode = 245
	INSTRUMENTED_RETURN_VALUE         Opcode = 246
	INSTRUMENTED_YIELD_VALUE          Opcode = 247
	INSTRUMENTED_END_ASYNC_FOR        Opcode = 248
	INSTRUMENTED_LOAD_SUPER_ATTR      Opcode = 249
	INSTRUMENTED_CALL                 Opcode = 250
	INSTRUMENTED_CALL_KW              Opcode = 251
	INSTRUMENTED_CALL_FUNCTION_EX     Opcode = 252
	INSTRUMENTED_JUMP_BACKWARD        Opcode = 253
	INSTRUMENTED_LINE                 Opcode = 254
	ENTER_EXECUTOR                    Opcode = 255
	ANNOTATIONS_PLACEHOLDER           Opcode = 256
	JUMP                              Opcode = 257
	JUMP_IF_FALSE                     Opcode = 258
	JUMP_IF_TRUE                      Opcode = 259
	JUMP_NO_INTERRUPT                 Opcode = 260
	LOAD_CLOSURE                      Opcode = 261
	POP_BLOCK                         Opcode = 262
	SETUP_CLEANUP                     Opcode = 263
	SETUP_FINALLY                     Opcode = 264
	SETUP_WITH                        Opcode = 265
	STORE_FAST_MAYBE_NULL             Opcode = 266
)

Opcode constants. Numeric values match cpython 3.14 opmap.

func (Opcode) HasArg

func (op Opcode) HasArg() bool

HasArg reports whether op carries the arg metadata flag.

func (Opcode) HasConst

func (op Opcode) HasConst() bool

HasConst reports whether op carries the const metadata flag.

func (Opcode) HasDeopt

func (op Opcode) HasDeopt() bool

HasDeopt reports whether op carries the deopt metadata flag.

func (Opcode) HasError

func (op Opcode) HasError() bool

HasError reports whether op carries the error metadata flag.

func (Opcode) HasErrorNoPop

func (op Opcode) HasErrorNoPop() bool

HasErrorNoPop reports whether op carries the error-no-pop metadata flag.

func (Opcode) HasEscapes

func (op Opcode) HasEscapes() bool

HasEscapes reports whether op carries the escapes metadata flag.

func (Opcode) HasEvalBreak

func (op Opcode) HasEvalBreak() bool

HasEvalBreak reports whether op carries the eval-break metadata flag.

func (Opcode) HasExit

func (op Opcode) HasExit() bool

HasExit reports whether op carries the exit metadata flag.

func (Opcode) HasFree

func (op Opcode) HasFree() bool

HasFree reports whether op carries the free metadata flag.

func (Opcode) HasJump

func (op Opcode) HasJump() bool

HasJump reports whether op carries the jump metadata flag.

func (Opcode) HasLocal

func (op Opcode) HasLocal() bool

HasLocal reports whether op carries the local metadata flag.

func (Opcode) HasName

func (op Opcode) HasName() bool

HasName reports whether op carries the name metadata flag.

func (Opcode) HasNoSaveIP

func (op Opcode) HasNoSaveIP() bool

HasNoSaveIP reports whether op carries the no-save-IP metadata flag.

func (Opcode) HasOpargAnd1

func (op Opcode) HasOpargAnd1() bool

HasOpargAnd1 reports whether op carries the oparg-and-1 metadata flag.

func (Opcode) HasPassthrough

func (op Opcode) HasPassthrough() bool

HasPassthrough reports whether op carries the passthrough metadata flag.

func (Opcode) HasPure

func (op Opcode) HasPure() bool

HasPure reports whether op carries the pure metadata flag.

func (Opcode) Name

func (op Opcode) Name() string

Name returns the symbolic name for op, or "" if op is not a known opcode.

type Sequence

type Sequence struct {
	Instrs   []Instr
	Nested   []*Sequence
	AnnoCode *Sequence
	// contains filtered or unexported fields
}

Sequence is the pre-CFG instruction stream emitted by codegen and consumed by flowgraph. Mirrors _PyInstructionSequence.

CPython: Include/internal/pycore_compile.h _PyInstructionSequence

func (*Sequence) AddNested

func (s *Sequence) AddNested(child *Sequence)

AddNested appends a nested instruction sequence (one per nested scope: a function or class definition emits its own Sequence and hangs it off the parent here).

CPython: Python/instruction_sequence.c:L166 _PyInstructionSequence_AddNested

func (*Sequence) Addop

func (s *Sequence) Addop(op Opcode, oparg int32, loc ast.Pos)

Addop appends an instruction. Caller is responsible for ensuring op is in [0, MaxOpcode] and oparg in [0, MaxOparg). The C code asserts these; here a panic via slice growth would mask the bug, so we panic explicitly to match the assertion semantics.

CPython: Python/instruction_sequence.c:L115 _PyInstructionSequence_Addop
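The range contract can be sketched as an explicit check that panics. This is a minimal illustration under stated assumptions: checkOperands and panics are hypothetical helpers, the constants are redeclared locally, and the real Addop may report the failure differently.

```go
package main

import "fmt"

// Local copies of the package constants so the sketch is self-contained.
const (
	MaxOpcode = 511
	MaxOparg  = 1 << 30
)

// checkOperands is a hypothetical helper that enforces the documented
// ranges: op in [0, MaxOpcode] and oparg in [0, MaxOparg).
func checkOperands(op, oparg int32) {
	if op < 0 || op > MaxOpcode {
		panic(fmt.Sprintf("opcode %d out of range [0, %d]", op, MaxOpcode))
	}
	if oparg < 0 || oparg >= MaxOparg {
		panic(fmt.Sprintf("oparg %d out of range [0, %d)", oparg, MaxOparg))
	}
}

// panics reports whether calling f panicked.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	fmt.Println(panics(func() { checkOperands(82, 0) }))
	fmt.Println(panics(func() { checkOperands(512, 0) }))
}
```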

func (*Sequence) ApplyLabelMap

func (s *Sequence) ApplyLabelMap(hasTarget func(Opcode) bool)

ApplyLabelMap rewrites every jump opcode's oparg from a label id into the bound instruction offset, using hasTarget to recognize which opcodes carry a jump target. Callers pass the OPCODE_HAS_TARGET predicate (defined in the generated opcode metadata, which lands alongside the opcode generator).

Idempotent: a second call is a no-op (s.labelmap == nil). The per-instruction ExceptHandlerInfo.Label is also resolved.

CPython: Python/instruction_sequence.c:L86 _PyInstructionSequence_ApplyLabelMap
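Label resolution and its idempotence can be sketched on a simplified sequence. The types and lower-case methods below are hypothetical stand-ins for Sequence, NewLabel, UseLabel, Addop, and ApplyLabelMap; opcodes are plain strings, and the per-instruction handler label is left out for brevity.

```go
package main

import "fmt"

// instr is a simplified stand-in for Instr.
type instr struct {
	op    string
	oparg int
}

// sequence is a simplified stand-in for Sequence.
type sequence struct {
	instrs    []instr
	nextLabel int
	labelmap  map[int]int // label id -> instruction index; nil once applied
}

// newLabel hands out 1-based label ids.
func (s *sequence) newLabel() int {
	s.nextLabel++
	return s.nextLabel
}

// useLabel binds lbl to the position of the next appended instruction.
func (s *sequence) useLabel(lbl int) {
	if s.labelmap == nil {
		s.labelmap = make(map[int]int)
	}
	s.labelmap[lbl] = len(s.instrs)
}

func (s *sequence) addop(op string, oparg int) {
	s.instrs = append(s.instrs, instr{op, oparg})
}

// applyLabelMap rewrites label ids into instruction offsets, then drops
// the map so that a second call does nothing.
func (s *sequence) applyLabelMap(hasTarget func(op string) bool) {
	if s.labelmap == nil {
		return
	}
	for i := range s.instrs {
		if hasTarget(s.instrs[i].op) {
			s.instrs[i].oparg = s.labelmap[s.instrs[i].oparg]
		}
	}
	s.labelmap = nil
}

func main() {
	s := &sequence{}
	end := s.newLabel()
	s.addop("LOAD_FAST", 0)
	s.addop("POP_JUMP_IF_FALSE", end)
	s.addop("LOAD_CONST", 1)
	s.useLabel(end)
	s.addop("RETURN_VALUE", 0)
	isJump := func(op string) bool { return op == "POP_JUMP_IF_FALSE" }
	s.applyLabelMap(isJump)
	s.applyLabelMap(isJump) // no-op: the map is already gone
	fmt.Println(s.instrs[1].oparg)
}
```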

func (*Sequence) Insert

func (s *Sequence) Insert(pos int, op Opcode, oparg int32, loc ast.Pos)

Insert inserts at pos and shifts following entries right by one. Any previously-bound label that pointed at pos or later is bumped up by one to preserve its target.

CPython: Python/instruction_sequence.c:L133 _PyInstructionSequence_InsertInstruction
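The label adjustment can be sketched as follows, using simplified stand-in types. The names instr, sequence, and insert are hypothetical, and the label map is modelled as a plain Go map from label id to instruction index.

```go
package main

import "fmt"

// instr is a simplified stand-in for Instr.
type instr struct {
	op    string
	oparg int
}

// sequence is a simplified stand-in for Sequence.
type sequence struct {
	instrs   []instr
	labelmap map[int]int // label id -> instruction index
}

// insert places in at pos, shifts later instructions right by one, and
// bumps every label bound at pos or later so it keeps its target.
func (s *sequence) insert(pos int, in instr) {
	s.instrs = append(s.instrs, instr{})
	copy(s.instrs[pos+1:], s.instrs[pos:])
	s.instrs[pos] = in
	for lbl, idx := range s.labelmap {
		if idx >= pos {
			s.labelmap[lbl] = idx + 1
		}
	}
}

func main() {
	s := &sequence{
		instrs:   []instr{{op: "LOAD_FAST"}, {op: "RETURN_VALUE"}},
		labelmap: map[int]int{1: 1}, // label 1 points at RETURN_VALUE
	}
	s.insert(1, instr{op: "NOP"})
	fmt.Println(s.instrs[s.labelmap[1]].op)
}
```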

func (*Sequence) NewLabel

func (s *Sequence) NewLabel() JumpTargetLabel

NewLabel allocates a fresh label id (1-based, matching CPython's pre-increment of s_next_free_label).

CPython: Python/instruction_sequence.c:L57 _PyInstructionSequence_NewLabel

func (*Sequence) SetAnnotationsCode

func (s *Sequence) SetAnnotationsCode(annot *Sequence)

SetAnnotationsCode mirrors _PyInstructionSequence_SetAnnotationsCode. CPython asserts s_annotations_code is unset; we do the same.

CPython: Python/instruction_sequence.c:L157

func (*Sequence) UseLabel

func (s *Sequence) UseLabel(lbl JumpTargetLabel)

UseLabel binds lbl to the position of the next instruction to be appended. Calling UseLabel with the same label twice rebinds it (mirrors CPython, which simply overwrites s_labelmap[lbl]).

CPython: Python/instruction_sequence.c:L64 _PyInstructionSequence_UseLabel

type Unit

type Unit struct {
	Name                string
	Qualname            string
	ScopeType           symtable.Block
	Argcount            int
	PosOnlyArgCount     int
	KwOnlyArgCount      int
	FirstLineno         int
	Flags               uint32
	Seq                 *Sequence
	Consts              []any
	Names               []string
	VarNames            []string
	FreeVars            []string
	CellVars            []string
	FastHidden          map[string]bool
	DeferredAnnotations []deferredAnnotation
}

Unit is the per-scope handoff codegen produces. The flowgraph optimizes Seq in place and the assembler packs the result into a Code object.

CPython: Python/compile.c compiler_unit
