optimizer

package
v0.12.3
Published: May 15, 2026 | License: Apache-2.0 | Imports: 9 | Imported by: 0

Documentation

Overview

Package optimizer hosts the Tier-2 trace optimizer and uop interpreter.

CPython 3.14 splits the runtime into two tiers. Tier-1 is the adaptive specializer that runs over conventional bytecode (it lives in vm/ and specialize/). Tier-2 is a linear trace of micro-ops that the runtime projects out of warmed-up Tier-1 code, runs through an abstract interpreter, and dispatches via a separate uop loop. This package owns the trace and uop side. The matching jit.c path is explicitly out of scope; gopy stays interpreter-only.

CPython: Include/internal/pycore_optimizer.h


Constants

const (
	ConfidenceRange  = 1000
	ConfidenceCutoff = 333
)

ConfidenceRange is the starting branch-confidence value. Each branch instruction scales the running confidence by the fraction of recent executions that took the favored side; once the value drops below ConfidenceCutoff, the projection bails to keep cold paths out of traces.

CPython: Python/optimizer.c:466-467
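
The decay toward the cutoff can be pictured with a small sketch. It assumes a 16-bit taken/not-taken history per branch whose popcount gives the favored-side fraction, which is how CPython's projector is understood to work; scaleConfidence and branchesUntilBail are hypothetical helpers, not part of this package.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	ConfidenceRange  = 1000
	ConfidenceCutoff = 333
)

// scaleConfidence is a hypothetical helper. history is an assumed 16-bit
// record of recent branch outcomes (1 = jump taken). The confidence is
// scaled by the fraction of outcomes that fell on the more frequent side.
func scaleConfidence(confidence int, history uint16) int {
	taken := bits.OnesCount16(history)
	favored := taken
	if taken <= 8 {
		favored = 16 - taken
	}
	return confidence * favored / 16
}

// branchesUntilBail counts how many branches with the same history a trace
// can absorb before the confidence drops below the cutoff. It returns -1
// when the confidence never drops within 64 branches.
func branchesUntilBail(history uint16) int {
	confidence := ConfidenceRange
	for n := 1; n <= 64; n++ {
		confidence = scaleConfidence(confidence, history)
		if confidence < ConfidenceCutoff {
			return n
		}
	}
	return -1
}

func main() {
	fmt.Println(branchesUntilBail(0x00FF)) // evenly split branch
	fmt.Println(branchesUntilBail(0x0FFF)) // 12 of 16 outcomes on one side
	fmt.Println(branchesUntilBail(0xFFFF)) // fully one-sided branch
}
```

Under these assumptions an evenly split branch exhausts the budget after two occurrences, while a fully one-sided branch never lowers the confidence at all.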

const (
	MaxAllowedBuiltinsModifications = 3
	MaxAllowedGlobalsModifications  = 6
)

MaxAllowedBuiltinsModifications and MaxAllowedGlobalsModifications cap how many times the runtime tolerates builtins/globals mutating before it stops trying to specialize globals loads.

CPython: pycore_optimizer.h:101-102

const (
	UOPFormatTarget = 0
	UOPFormatJump   = 1
)

UOPFormatTarget and UOPFormatJump pick which interpretation of the 32-bit field between oparg and operand applies to a given uop.

CPython: pycore_optimizer.h:132-133
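
As a rough illustration of the two interpretations, the sketch below decodes one 32-bit field either as a single target or as a pair of 16-bit values. The fieldView type, the decodeField helper, and the choice of low half for the jump target and high half for the error target are assumptions made for this example; the package's UOPInstruction layout may differ.

```go
package main

import "fmt"

const (
	UOPFormatTarget = 0
	UOPFormatJump   = 1
)

// fieldView is a hypothetical decoded view of the shared 32-bit field.
type fieldView struct {
	Target      uint32 // meaningful when the format is UOPFormatTarget
	JumpTarget  uint16 // meaningful when the format is UOPFormatJump
	ErrorTarget uint16 // meaningful when the format is UOPFormatJump
}

// decodeField picks the interpretation from the format flag. The half-word
// assignment (low = jump, high = error) is an assumption of this sketch.
func decodeField(format int, field uint32) fieldView {
	if format == UOPFormatJump {
		return fieldView{
			JumpTarget:  uint16(field),
			ErrorTarget: uint16(field >> 16),
		}
	}
	return fieldView{Target: field}
}

func main() {
	fmt.Printf("%+v\n", decodeField(UOPFormatTarget, 42))
	fmt.Printf("%+v\n", decodeField(UOPFormatJump, 0x00050003))
}
```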

const (
	UopExitTrace                            uint16 = 300
	UopSetIp                                uint16 = 301
	UopBinaryOp                             uint16 = 302
	UopBinaryOpAddFloat                     uint16 = 303
	UopBinaryOpAddInt                       uint16 = 304
	UopBinaryOpAddUnicode                   uint16 = 305
	UopBinaryOpExtend                       uint16 = 306
	UopBinaryOpInplaceAddUnicode            uint16 = 307
	UopBinaryOpMultiplyFloat                uint16 = 308
	UopBinaryOpMultiplyInt                  uint16 = 309
	UopBinaryOpSubscrCheckFunc              uint16 = 310
	UopBinaryOpSubscrDict                   uint16 = 311
	UopBinaryOpSubscrInitCall               uint16 = 312
	UopBinaryOpSubscrListInt                uint16 = 313
	UopBinaryOpSubscrListSlice              uint16 = 314
	UopBinaryOpSubscrStrInt                 uint16 = 315
	UopBinaryOpSubscrTupleInt               uint16 = 316
	UopBinaryOpSubtractFloat                uint16 = 317
	UopBinaryOpSubtractInt                  uint16 = 318
	UopBinarySlice                          uint16 = 319
	UopBuildInterpolation                   uint16 = uint16(compile.BUILD_INTERPOLATION)
	UopBuildList                            uint16 = uint16(compile.BUILD_LIST)
	UopBuildMap                             uint16 = uint16(compile.BUILD_MAP)
	UopBuildSet                             uint16 = uint16(compile.BUILD_SET)
	UopBuildSlice                           uint16 = uint16(compile.BUILD_SLICE)
	UopBuildString                          uint16 = uint16(compile.BUILD_STRING)
	UopBuildTemplate                        uint16 = uint16(compile.BUILD_TEMPLATE)
	UopBuildTuple                           uint16 = uint16(compile.BUILD_TUPLE)
	UopCallBuiltinClass                     uint16 = 320
	UopCallBuiltinFast                      uint16 = 321
	UopCallBuiltinFastWithKeywords          uint16 = 322
	UopCallBuiltinO                         uint16 = 323
	UopCallIntrinsic1                       uint16 = uint16(compile.CALL_INTRINSIC_1)
	UopCallIntrinsic2                       uint16 = uint16(compile.CALL_INTRINSIC_2)
	UopCallIsinstance                       uint16 = uint16(compile.CALL_ISINSTANCE)
	UopCallKwNonPy                          uint16 = 324
	UopCallLen                              uint16 = 325
	UopCallListAppend                       uint16 = uint16(compile.CALL_LIST_APPEND)
	UopCallMethodDescriptorFast             uint16 = 326
	UopCallMethodDescriptorFastWithKeywords uint16 = 327
	UopCallMethodDescriptorNoargs           uint16 = 328
	UopCallMethodDescriptorO                uint16 = 329
	UopCallNonPyGeneral                     uint16 = 330
	UopCallStr1                             uint16 = 331
	UopCallTuple1                           uint16 = 332
	UopCallType1                            uint16 = 333
	UopCheckAndAllocateObject               uint16 = 334
	UopCheckAttrClass                       uint16 = 335
	UopCheckAttrMethodLazyDict              uint16 = 336
	UopCheckCallBoundMethodExactArgs        uint16 = 337
	UopCheckEgMatch                         uint16 = uint16(compile.CHECK_EG_MATCH)
	UopCheckExcMatch                        uint16 = uint16(compile.CHECK_EXC_MATCH)
	UopCheckFunction                        uint16 = 338
	UopCheckFunctionExactArgs               uint16 = 339
	UopCheckFunctionVersion                 uint16 = 340
	UopCheckFunctionVersionInline           uint16 = 341
	UopCheckFunctionVersionKw               uint16 = 342
	UopCheckIsNotPyCallable                 uint16 = 343
	UopCheckIsNotPyCallableKw               uint16 = 344
	UopCheckManagedObjectHasValues          uint16 = 345
	UopCheckMethodVersion                   uint16 = 346
	UopCheckMethodVersionKw                 uint16 = 347
	UopCheckPep523                          uint16 = 348
	UopCheckPeriodic                        uint16 = 349
	UopCheckPeriodicIfNotYieldFrom          uint16 = 350
	UopCheckRecursionRemaining              uint16 = 351
	UopCheckStackSpace                      uint16 = 352
	UopCheckStackSpaceOperand               uint16 = 353
	UopCheckValidity                        uint16 = 354
	UopCompareOp                            uint16 = 355
	UopCompareOpFloat                       uint16 = 356
	UopCompareOpInt                         uint16 = 357
	UopCompareOpStr                         uint16 = 358
	UopContainsOp                           uint16 = 359
	UopContainsOpDict                       uint16 = 360
	UopContainsOpSet                        uint16 = 361
	UopConvertValue                         uint16 = uint16(compile.CONVERT_VALUE)
	UopCopy                                 uint16 = uint16(compile.COPY)
	UopCopyFreeVars                         uint16 = uint16(compile.COPY_FREE_VARS)
	UopCreateInitFrame                      uint16 = 362
	UopDeleteAttr                           uint16 = uint16(compile.DELETE_ATTR)
	UopDeleteDeref                          uint16 = uint16(compile.DELETE_DEREF)
	UopDeleteFast                           uint16 = uint16(compile.DELETE_FAST)
	UopDeleteGlobal                         uint16 = uint16(compile.DELETE_GLOBAL)
	UopDeleteName                           uint16 = uint16(compile.DELETE_NAME)
	UopDeleteSubscr                         uint16 = uint16(compile.DELETE_SUBSCR)
	UopDeopt                                uint16 = 363
	UopDictMerge                            uint16 = uint16(compile.DICT_MERGE)
	UopDictUpdate                           uint16 = uint16(compile.DICT_UPDATE)
	UopDoCall                               uint16 = 364
	UopDoCallFunctionEx                     uint16 = 365
	UopDoCallKw                             uint16 = 366
	UopEndFor                               uint16 = uint16(compile.END_FOR)
	UopEndSend                              uint16 = uint16(compile.END_SEND)
	UopErrorPopN                            uint16 = 367
	UopExitInitCheck                        uint16 = uint16(compile.EXIT_INIT_CHECK)
	UopExpandMethod                         uint16 = 368
	UopExpandMethodKw                       uint16 = 369
	UopFatalError                           uint16 = 370
	UopFormatSimple                         uint16 = uint16(compile.FORMAT_SIMPLE)
	UopFormatWithSpec                       uint16 = uint16(compile.FORMAT_WITH_SPEC)
	UopForIter                              uint16 = 371
	UopForIterGenFrame                      uint16 = 372
	UopForIterTierTwo                       uint16 = 373
	UopGetAiter                             uint16 = uint16(compile.GET_AITER)
	UopGetAnext                             uint16 = uint16(compile.GET_ANEXT)
	UopGetAwaitable                         uint16 = uint16(compile.GET_AWAITABLE)
	UopGetIter                              uint16 = uint16(compile.GET_ITER)
	UopGetLen                               uint16 = uint16(compile.GET_LEN)
	UopGetYieldFromIter                     uint16 = uint16(compile.GET_YIELD_FROM_ITER)
	UopGuardBinaryOpExtend                  uint16 = 374
	UopGuardCallableLen                     uint16 = 375
	UopGuardCallableStr1                    uint16 = 376
	UopGuardCallableTuple1                  uint16 = 377
	UopGuardCallableType1                   uint16 = 378
	UopGuardDorvNoDict                      uint16 = 379
	UopGuardDorvValuesInstAttrFromDict      uint16 = 380
	UopGuardGlobalsVersion                  uint16 = 381
	UopGuardIsFalsePop                      uint16 = 382
	UopGuardIsNonePop                       uint16 = 383
	UopGuardIsNotNonePop                    uint16 = 384
	UopGuardIsTruePop                       uint16 = 385
	UopGuardKeysVersion                     uint16 = 386
	UopGuardNosDict                         uint16 = 387
	UopGuardNosFloat                        uint16 = 388
	UopGuardNosInt                          uint16 = 389
	UopGuardNosList                         uint16 = 390
	UopGuardNosNull                         uint16 = 391
	UopGuardNosTuple                        uint16 = 392
	UopGuardNosUnicode                      uint16 = 393
	UopGuardNotExhaustedList                uint16 = 394
	UopGuardNotExhaustedRange               uint16 = 395
	UopGuardNotExhaustedTuple               uint16 = 396
	UopGuardTosAnySet                       uint16 = 397
	UopGuardTosDict                         uint16 = 398
	UopGuardTosFloat                        uint16 = 399
	UopGuardTosInt                          uint16 = 400
	UopGuardTosList                         uint16 = 401
	UopGuardTosSlice                        uint16 = 402
	UopGuardTosTuple                        uint16 = 403
	UopGuardTosUnicode                      uint16 = 404
	UopGuardTypeVersion                     uint16 = 405
	UopGuardTypeVersionAndLock              uint16 = 406
	UopImportFrom                           uint16 = uint16(compile.IMPORT_FROM)
	UopImportName                           uint16 = uint16(compile.IMPORT_NAME)
	UopInitCallBoundMethodExactArgs         uint16 = 407
	UopInitCallPyExactArgs                  uint16 = 408
	UopInitCallPyExactArgs0                 uint16 = 409
	UopInitCallPyExactArgs1                 uint16 = 410
	UopInitCallPyExactArgs2                 uint16 = 411
	UopInitCallPyExactArgs3                 uint16 = 412
	UopInitCallPyExactArgs4                 uint16 = 413
	UopInsertNull                           uint16 = 414
	UopInstrumentedForIter                  uint16 = uint16(compile.INSTRUMENTED_FOR_ITER)
	UopInstrumentedInstruction              uint16 = uint16(compile.INSTRUMENTED_INSTRUCTION)
	UopInstrumentedJumpForward              uint16 = uint16(compile.INSTRUMENTED_JUMP_FORWARD)
	UopInstrumentedLine                     uint16 = uint16(compile.INSTRUMENTED_LINE)
	UopInstrumentedNotTaken                 uint16 = uint16(compile.INSTRUMENTED_NOT_TAKEN)
	UopInstrumentedPopJumpIfFalse           uint16 = uint16(compile.INSTRUMENTED_POP_JUMP_IF_FALSE)
	UopInstrumentedPopJumpIfNone            uint16 = uint16(compile.INSTRUMENTED_POP_JUMP_IF_NONE)
	UopInstrumentedPopJumpIfNotNone         uint16 = uint16(compile.INSTRUMENTED_POP_JUMP_IF_NOT_NONE)
	UopInstrumentedPopJumpIfTrue            uint16 = uint16(compile.INSTRUMENTED_POP_JUMP_IF_TRUE)
	UopIsNone                               uint16 = 415
	UopIsOp                                 uint16 = uint16(compile.IS_OP)
	UopIterCheckList                        uint16 = 416
	UopIterCheckRange                       uint16 = 417
	UopIterCheckTuple                       uint16 = 418
	UopIterJumpList                         uint16 = 419
	UopIterJumpRange                        uint16 = 420
	UopIterJumpTuple                        uint16 = 421
	UopIterNextList                         uint16 = 422
	UopIterNextListTierTwo                  uint16 = 423
	UopIterNextRange                        uint16 = 424
	UopIterNextTuple                        uint16 = 425
	UopJumpToTop                            uint16 = 426
	UopListAppend                           uint16 = uint16(compile.LIST_APPEND)
	UopListExtend                           uint16 = uint16(compile.LIST_EXTEND)
	UopLoadAttr                             uint16 = 427
	UopLoadAttrClass                        uint16 = 428
	UopLoadAttrGetattributeOverridden       uint16 = uint16(compile.LOAD_ATTR_GETATTRIBUTE_OVERRIDDEN)
	UopLoadAttrInstanceValue                uint16 = 429
	UopLoadAttrMethodLazyDict               uint16 = 430
	UopLoadAttrMethodNoDict                 uint16 = 431
	UopLoadAttrMethodWithValues             uint16 = 432
	UopLoadAttrModule                       uint16 = 433
	UopLoadAttrNondescriptorNoDict          uint16 = 434
	UopLoadAttrNondescriptorWithValues      uint16 = 435
	UopLoadAttrPropertyFrame                uint16 = 436
	UopLoadAttrSlot                         uint16 = 437
	UopLoadAttrWithHint                     uint16 = 438
	UopLoadBuildClass                       uint16 = uint16(compile.LOAD_BUILD_CLASS)
	UopLoadBytecode                         uint16 = 439
	UopLoadCommonConstant                   uint16 = uint16(compile.LOAD_COMMON_CONSTANT)
	UopLoadConst                            uint16 = uint16(compile.LOAD_CONST)
	UopLoadConstImmortal                    uint16 = uint16(compile.LOAD_CONST_IMMORTAL)
	UopLoadConstInline                      uint16 = 440
	UopLoadConstInlineBorrow                uint16 = 441
	UopLoadConstMortal                      uint16 = uint16(compile.LOAD_CONST_MORTAL)
	UopLoadDeref                            uint16 = uint16(compile.LOAD_DEREF)
	UopLoadFast                             uint16 = 442
	UopLoadFast0                            uint16 = 443
	UopLoadFast1                            uint16 = 444
	UopLoadFast2                            uint16 = 445
	UopLoadFast3                            uint16 = 446
	UopLoadFast4                            uint16 = 447
	UopLoadFast5                            uint16 = 448
	UopLoadFast6                            uint16 = 449
	UopLoadFast7                            uint16 = 450
	UopLoadFastAndClear                     uint16 = uint16(compile.LOAD_FAST_AND_CLEAR)
	UopLoadFastBorrow                       uint16 = 451
	UopLoadFastBorrow0                      uint16 = 452
	UopLoadFastBorrow1                      uint16 = 453
	UopLoadFastBorrow2                      uint16 = 454
	UopLoadFastBorrow3                      uint16 = 455
	UopLoadFastBorrow4                      uint16 = 456
	UopLoadFastBorrow5                      uint16 = 457
	UopLoadFastBorrow6                      uint16 = 458
	UopLoadFastBorrow7                      uint16 = 459
	UopLoadFastBorrowLoadFastBorrow         uint16 = uint16(compile.LOAD_FAST_BORROW_LOAD_FAST_BORROW)
	UopLoadFastCheck                        uint16 = uint16(compile.LOAD_FAST_CHECK)
	UopLoadFastLoadFast                     uint16 = uint16(compile.LOAD_FAST_LOAD_FAST)
	UopLoadFromDictOrDeref                  uint16 = uint16(compile.LOAD_FROM_DICT_OR_DEREF)
	UopLoadFromDictOrGlobals                uint16 = uint16(compile.LOAD_FROM_DICT_OR_GLOBALS)
	UopLoadGlobal                           uint16 = 460
	UopLoadGlobalBuiltins                   uint16 = 461
	UopLoadGlobalModule                     uint16 = 462
	UopLoadLocals                           uint16 = uint16(compile.LOAD_LOCALS)
	UopLoadName                             uint16 = uint16(compile.LOAD_NAME)
	UopLoadSmallInt                         uint16 = 463
	UopLoadSmallInt0                        uint16 = 464
	UopLoadSmallInt1                        uint16 = 465
	UopLoadSmallInt2                        uint16 = 466
	UopLoadSmallInt3                        uint16 = 467
	UopLoadSpecial                          uint16 = 468
	UopLoadSuperAttrAttr                    uint16 = uint16(compile.LOAD_SUPER_ATTR_ATTR)
	UopLoadSuperAttrMethod                  uint16 = uint16(compile.LOAD_SUPER_ATTR_METHOD)
	UopMakeCallargsATuple                   uint16 = 469
	UopMakeCell                             uint16 = uint16(compile.MAKE_CELL)
	UopMakeFunction                         uint16 = uint16(compile.MAKE_FUNCTION)
	UopMakeWarm                             uint16 = 470
	UopMapAdd                               uint16 = uint16(compile.MAP_ADD)
	UopMatchClass                           uint16 = uint16(compile.MATCH_CLASS)
	UopMatchKeys                            uint16 = uint16(compile.MATCH_KEYS)
	UopMatchMapping                         uint16 = uint16(compile.MATCH_MAPPING)
	UopMatchSequence                        uint16 = uint16(compile.MATCH_SEQUENCE)
	UopMaybeExpandMethod                    uint16 = 471
	UopMaybeExpandMethodKw                  uint16 = 472
	UopMonitorCall                          uint16 = 473
	UopMonitorCallKw                        uint16 = 474
	UopMonitorJumpBackward                  uint16 = 475
	UopMonitorResume                        uint16 = 476
	UopNop                                  uint16 = uint16(compile.NOP)
	UopPopExcept                            uint16 = uint16(compile.POP_EXCEPT)
	UopPopJumpIfFalse                       uint16 = 477
	UopPopJumpIfTrue                        uint16 = 478
	UopPopTop                               uint16 = uint16(compile.POP_TOP)
	UopPopTopLoadConstInline                uint16 = 479
	UopPopTopLoadConstInlineBorrow          uint16 = 480
	UopPopTwoLoadConstInlineBorrow          uint16 = 481
	UopPushExcInfo                          uint16 = uint16(compile.PUSH_EXC_INFO)
	UopPushFrame                            uint16 = 482
	UopPushNull                             uint16 = uint16(compile.PUSH_NULL)
	UopPushNullConditional                  uint16 = 483
	UopPyFrameGeneral                       uint16 = 484
	UopPyFrameKw                            uint16 = 485
	UopQuickenResume                        uint16 = 486
	UopReplaceWithTrue                      uint16 = 487
	UopResumeCheck                          uint16 = uint16(compile.RESUME_CHECK)
	UopReturnGenerator                      uint16 = uint16(compile.RETURN_GENERATOR)
	UopReturnValue                          uint16 = uint16(compile.RETURN_VALUE)
	UopSaveReturnOffset                     uint16 = 488
	UopSend                                 uint16 = 489
	UopSendGenFrame                         uint16 = 490
	UopSetupAnnotations                     uint16 = uint16(compile.SETUP_ANNOTATIONS)
	UopSetAdd                               uint16 = uint16(compile.SET_ADD)
	UopSetFunctionAttribute                 uint16 = uint16(compile.SET_FUNCTION_ATTRIBUTE)
	UopSetUpdate                            uint16 = uint16(compile.SET_UPDATE)
	UopStartExecutor                        uint16 = 491
	UopStoreAttr                            uint16 = 492
	UopStoreAttrInstanceValue               uint16 = 493
	UopStoreAttrSlot                        uint16 = 494
	UopStoreAttrWithHint                    uint16 = 495
	UopStoreDeref                           uint16 = uint16(compile.STORE_DEREF)
	UopStoreFast                            uint16 = 496
	UopStoreFast0                           uint16 = 497
	UopStoreFast1                           uint16 = 498
	UopStoreFast2                           uint16 = 499
	UopStoreFast3                           uint16 = 500
	UopStoreFast4                           uint16 = 501
	UopStoreFast5                           uint16 = 502
	UopStoreFast6                           uint16 = 503
	UopStoreFast7                           uint16 = 504
	UopStoreFastLoadFast                    uint16 = uint16(compile.STORE_FAST_LOAD_FAST)
	UopStoreFastStoreFast                   uint16 = uint16(compile.STORE_FAST_STORE_FAST)
	UopStoreGlobal                          uint16 = uint16(compile.STORE_GLOBAL)
	UopStoreName                            uint16 = uint16(compile.STORE_NAME)
	UopStoreSlice                           uint16 = 505
	UopStoreSubscr                          uint16 = 506
	UopStoreSubscrDict                      uint16 = 507
	UopStoreSubscrListInt                   uint16 = 508
	UopSwap                                 uint16 = uint16(compile.SWAP)
	UopTier2ResumeCheck                     uint16 = 509
	UopToBool                               uint16 = 510
	UopToBoolBool                           uint16 = uint16(compile.TO_BOOL_BOOL)
	UopToBoolInt                            uint16 = uint16(compile.TO_BOOL_INT)
	UopToBoolList                           uint16 = 511
	UopToBoolNone                           uint16 = uint16(compile.TO_BOOL_NONE)
	UopToBoolStr                            uint16 = 512
	UopUnaryInvert                          uint16 = uint16(compile.UNARY_INVERT)
	UopUnaryNegative                        uint16 = uint16(compile.UNARY_NEGATIVE)
	UopUnaryNot                             uint16 = uint16(compile.UNARY_NOT)
	UopUnpackEx                             uint16 = uint16(compile.UNPACK_EX)
	UopUnpackSequence                       uint16 = 513
	UopUnpackSequenceList                   uint16 = 514
	UopUnpackSequenceTuple                  uint16 = 515
	UopUnpackSequenceTwoTuple               uint16 = 516
	UopWithExceptStart                      uint16 = uint16(compile.WITH_EXCEPT_START)
	UopYieldValue                           uint16 = uint16(compile.YIELD_VALUE)
	MaxUopID                                uint16 = 516
)

Tier-2 micro-op IDs. Numeric IDs (>= 300) name uops the Tier-2 dispatch loop owns; aliased IDs reuse the Tier-1 opcode value so the trace projection can copy a Tier-1 instruction into the trace without remapping.

CPython: Include/internal/pycore_uop_ids.h

const (
	BuiltinsWatcherID = 0
	GlobalsWatcherID  = 1
	TypeWatcherID     = 0
)

Watcher slot identifiers. CPython reserves the first two dict watcher IDs and the first type watcher ID for the Tier-2 optimizer.

CPython: Python/optimizer_analysis.c:70-72 BUILTINS_WATCHER_ID / GLOBALS_WATCHER_ID / TYPE_WATCHER_ID

const (
	MaxDictWatchers = 8
	MaxTypeWatchers = 8
)

MaxDictWatchers / MaxTypeWatchers cap the per-interpreter watcher registry. CPython hard-codes both to 8.

CPython: Include/internal/pycore_dict_state.h:11 DICT_MAX_WATCHERS / pycore_interp_structs.h:22 TYPE_MAX_WATCHERS

const BloomFilterWords = 8

BloomFilterWords is the bloom filter's width in 32-bit words. A width of 8 gives m = 8 × 32 = 256 bits. Trace projection adds every guarded type / dict / code pointer; the invalidation walk then hashes a mutated pointer once and asks every executor whether that pointer might be in scope.

CPython: pycore_optimizer.h:24 _Py_BLOOM_FILTER_WORDS
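
A minimal sketch of a 256-bit filter of this shape follows. The bloom type, the FNV-1a hash, and the choice of six probe bits per key are assumptions made for illustration; they are not taken from this package's BloomFilter and will not reproduce CPython's bit patterns.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const BloomFilterWords = 8 // 8 x 32-bit words = 256 bits

// probes is the number of bits set per key (an assumption of this sketch).
const probes = 6

// bloom is a hypothetical stand-in for the package's BloomFilter type.
type bloom struct {
	bits [BloomFilterWords]uint32
}

// bitIndexes derives `probes` bit positions in [0, 256) from one 64-bit hash
// of the key. Each byte of the hash selects one bit.
func bitIndexes(key uintptr) [probes]uint8 {
	var buf [8]byte
	for i := range buf {
		buf[i] = byte(uint64(key) >> (8 * i))
	}
	h := fnv.New64a()
	h.Write(buf[:])
	sum := h.Sum64()

	var out [probes]uint8
	for i := range out {
		out[i] = uint8(sum >> (8 * i))
	}
	return out
}

// add sets the probe bits for key.
func (b *bloom) add(key uintptr) {
	for _, idx := range bitIndexes(key) {
		b.bits[idx/32] |= 1 << (idx % 32)
	}
}

// mayContain returns false only when key was definitely never added.
// A true result can be a false positive.
func (b *bloom) mayContain(key uintptr) bool {
	for _, idx := range bitIndexes(key) {
		if b.bits[idx/32]&(1<<(idx%32)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	var deps bloom
	deps.add(0x1000)
	fmt.Println(deps.mayContain(0x1000))
	fmt.Println(deps.mayContain(0x2000))
}
```

The useful property for invalidation is the one-sided error: a "no" answer lets the walk skip an executor safely, while a "maybe" answer at worst invalidates a trace that did not strictly need it.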

const ExecutorDeleteListMax = 100

ExecutorDeleteListMax caps the pending-deletion list before a sweep runs. A separate list keeps us from running tp_dealloc while another thread is executing the trace.

CPython: pycore_optimizer.h:90 EXECUTOR_DELETE_LIST_MAX

const JITCleanupThreshold = 100000

JITCleanupThreshold is the trace-run-counter threshold past which the runtime invalidates cold executors.

CPython: pycore_optimizer.h:118 JIT_CLEANUP_THRESHOLD

const MaxAbstractFrameDepth = TraceStackSize + 2

MaxAbstractFrameDepth bounds the abstract-interpreter frame stack. It reserves one slot for the root frame and one for the overflow frame; the remaining TraceStackSize slots hold inlined PUSH_FRAME calls.

CPython: pycore_optimizer.h:159 MAX_ABSTRACT_FRAME_DEPTH

const MaxAbstractInterpSize = 4096

MaxAbstractInterpSize bounds the locals + stack + consts buffer the abstract interpreter walks during analysis.

CPython: pycore_optimizer.h:154 MAX_ABSTRACT_INTERP_SIZE

const MaxChainDepth = 4

MaxChainDepth bounds the number of side exits a trace tree may take before requiring forward progress through a fresh ENTER_EXECUTOR.

CPython: pycore_optimizer.h:165 MAX_CHAIN_DEPTH

const MaxExecutorsSize = 256

MaxExecutorsSize caps the number of executors the side table holds for a single Code. CPython pins this so ENTER_EXECUTOR's oparg fits in a single byte.

CPython: Python/optimizer.c:29 MAX_EXECUTORS_SIZE

const MaxSymbolicTupleSize = 7

MaxSymbolicTupleSize bounds the length of a tuple the symbolic interpreter will track elementwise. Past this length the tuple degrades to type-only.

CPython: pycore_optimizer.h:199 MAX_SYMBOLIC_TUPLE_SIZE

const MaxUopPerExpansion = 10

MaxUopPerExpansion bounds the per-opcode uop fan-out. Matches CPython's macro-expansion limit so handcrafted entries stay sized the same as generated ones.

CPython: Include/internal/pycore_opcode_metadata.h:1322 MAX_UOP_PER_EXPANSION

const TraceStackSize = 5

TraceStackSize bounds the depth of inlined frames the projection will follow through PUSH_FRAME.

CPython: pycore_optimizer.h:123 TRACE_STACK_SIZE

const TyArenaSize = UOPMaxTraceLength * 5

TyArenaSize caps the per-trace symbolic-type arena.

CPython: pycore_optimizer.h:156 TY_ARENA_SIZE

const UOPMaxTraceLength = 800

UOPMaxTraceLength caps the number of uops in a single projected trace. Trace projection stops when the buffer fills up.

CPython: pycore_optimizer.h:121 UOP_MAX_TRACE_LENGTH

Variables

var ExecutorType = objects.NewType("uop_executor", []*objects.Type{objects.ObjectType()})

ExecutorType is the type singleton for Tier-2 executors. dis.dis recognizes this type so disassembly walks the trace through the sequence protocol.

CPython: Python/optimizer.c:421-432 _PyUOpExecutor_Type

var UopMeta = map[string]UopMetaEntry{}/* 289 elements not displayed */

UopMeta maps a uop name (with leading underscore stripped) to its static metadata. Lookup is by name rather than ID so the table stays robust against ID renumbering across CPython releases.

var UopNames = map[uint16]string{}/* 318 elements not displayed */

UopNames maps a uop ID to its CPython spelling. The dispatch and disassembly paths use this to render trace rows with the same uop names dis.dis emits upstream.

Functions

func AbstractContextFini

func AbstractContextFini(ctx *JitOptContext)

AbstractContextFini releases the per-trace arena. Const-tagged symbols hold object references; clearing them lets the GC reclaim values that were live only inside the analyzer.

CPython: Python/optimizer_symbols.c:660-674 _Py_uop_abstractcontext_fini

func AbstractContextInit

func AbstractContextInit(ctx *JitOptContext)

AbstractContextInit zeroes ctx and seeds the arena and locals pool. Must run before any frame is pushed.

CPython: Python/optimizer_symbols.c:677-701 _Py_uop_abstractcontext_init

func AddToPendingDeletionList

func AddToPendingDeletionList(interp *state.Interpreter, self *Executor)

AddToPendingDeletionList enqueues self on the per-interpreter deletion list. The list is a singly-linked stack reusing the VMData.Links.Next pointer for cheap intrusive linkage. Each push decrements RemainingCapacity; once it hits zero the sweep runs to reclaim everything not currently executing.

CPython: Python/optimizer.c:260-272 add_to_pending_deletion_list
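
The push-then-sweep behavior can be sketched with a plain intrusive stack. The node and pending types are hypothetical stand-ins for Executor and the interpreter state, and the sketch assumes that no queued node is executing and that a sweep resets the capacity to ExecutorDeleteListMax.

```go
package main

import "fmt"

const ExecutorDeleteListMax = 100

// node is a hypothetical stand-in for an executor carrying its own link.
type node struct {
	id   int
	next *node
}

// pending models the per-interpreter deletion list.
type pending struct {
	head      *node
	remaining int // pushes left before a sweep is forced
	swept     int // total nodes released by sweeps
}

func newPending() *pending {
	return &pending{remaining: ExecutorDeleteListMax}
}

// push links n at the head of the stack and runs a sweep once the
// remaining capacity reaches zero.
func (p *pending) push(n *node) {
	n.next = p.head
	p.head = n
	p.remaining--
	if p.remaining == 0 {
		p.sweep()
	}
}

// sweep unlinks every queued node and resets the capacity. A real sweep
// would skip nodes that are still executing; this sketch assumes none are.
func (p *pending) sweep() {
	for n := p.head; n != nil; {
		next := n.next
		n.next = nil
		p.swept++
		n = next
	}
	p.head = nil
	p.remaining = ExecutorDeleteListMax
}

func main() {
	p := newPending()
	for i := 0; i < ExecutorDeleteListMax; i++ {
		p.push(&node{id: i})
	}
	fmt.Println(p.swept, p.remaining, p.head == nil)
}
```

Reusing a link field that every executor already carries means enqueueing never allocates, which matters because the push happens on a teardown path.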

func Analyze

func Analyze(interp *state.Interpreter, frame objects.InterpreterFrame, co *objects.Code, buffer []UOPInstruction, length, currStacklen int, dependencies *BloomFilter) int

Analyze runs the three-pass forward analysis over a freshly projected trace. It returns the new length (> 0) on success, 0 on a benign bail (analyzer not ready, contradiction hit, etc.), and -1 on a hard failure. It mirrors CPython's three-step pipeline: globals folding, abstract interpretation, then the cleanup that drops the now-unneeded _SET_IP / _CHECK_VALIDITY rows and collapses the load-then-pop idiom.

CPython: Python/optimizer_analysis.c:625-654 _Py_uop_analyze_and_optimize
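
A caller has three result classes to tell apart. The sketch below shows one way to branch on them; describeResult is a hypothetical helper that takes the integer result as input and does not call Analyze itself.

```go
package main

import "fmt"

// describeResult is a hypothetical helper that maps the integer result
// documented for Analyze onto its three outcomes.
func describeResult(n int) string {
	switch {
	case n > 0:
		return fmt.Sprintf("optimized: trace is now %d uops long", n)
	case n == 0:
		return "benign bail"
	default:
		return "hard failure"
	}
}

func main() {
	for _, n := range []int{37, 0, -1} {
		fmt.Println(describeResult(n))
	}
}
```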

func ClearExecutorDeletionList

func ClearExecutorDeletionList(interp *state.Interpreter)

ClearExecutorDeletionList walks the deletion list and frees every executor that is not currently executing on a thread. The CPython path consults each thread's current_executor; gopy is single-threaded for now, so nothing is "currently executing" outside the caller and the entire list is freed.

CPython: Python/optimizer.c:225-258 _Py_ClearExecutorDeletionList

func CodeClearExecutors

func CodeClearExecutors(code *objects.Code)

CodeClearExecutors detaches every executor anchored at code's side table and drops the table. Called from _Py_Executors_InvalidateAll when a code object's executors must all go.

CPython: Objects/codeobject.c:2272-2289 clear_executors / _PyCode_Clear_Executors

func DictAddWatcher

func DictAddWatcher(interp *state.Interpreter, cb DictWatchCallback) int

DictAddWatcher registers cb in the first free dict watcher slot and returns its ID. Returns -1 when every slot is occupied.

CPython: Objects/dictobject.c PyDict_AddWatcher
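
The first-free-slot policy can be sketched as below. The registry and callback types are hypothetical and far simpler than the package's DictWatchCallback and interpreter state; only the slot allocation is modeled.

```go
package main

import "fmt"

const MaxDictWatchers = 8

// callback is a hypothetical, simplified watcher signature.
type callback func(event int)

// registry holds one callback per slot; a nil entry marks a free slot.
type registry struct {
	slots [MaxDictWatchers]callback
}

// add stores cb in the lowest free slot and returns its index,
// or -1 when every slot is taken.
func (r *registry) add(cb callback) int {
	for i := range r.slots {
		if r.slots[i] == nil {
			r.slots[i] = cb
			return i
		}
	}
	return -1
}

// clear frees slot id so a later add can reuse it.
func (r *registry) clear(id int) {
	if id >= 0 && id < MaxDictWatchers {
		r.slots[id] = nil
	}
}

func main() {
	var r registry
	noop := func(int) {}
	for i := 0; i < MaxDictWatchers; i++ {
		fmt.Print(r.add(noop), " ")
	}
	fmt.Println(r.add(noop)) // registry is full at this point
	r.clear(3)
	fmt.Println(r.add(noop)) // the freed slot is handed out again
}
```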

func DictClearWatcher

func DictClearWatcher(interp *state.Interpreter, id int)

DictClearWatcher releases watcher slot id. The subscription set for that slot is dropped along with the callback.

CPython: Objects/dictobject.c PyDict_ClearWatcher

func DictUnwatch

func DictUnwatch(interp *state.Interpreter, id int, dict unsafe.Pointer)

DictUnwatch drops dict from watcher slot id. Idempotent.

CPython: Objects/dictobject.c PyDict_Unwatch

func DictWatch

func DictWatch(interp *state.Interpreter, id int, dict unsafe.Pointer)

DictWatch subscribes dict to watcher slot id. Idempotent.

CPython: Objects/dictobject.c PyDict_Watch

func DisassembleRuntime

func DisassembleRuntime(co *objects.Code) string

DisassembleRuntime renders the live runtime bytecode of co as a dis.dis-style listing. ENTER_EXECUTOR install sites are rewritten back to the Tier-1 (op, arg) the executor stashed, so the listing reads as if no trace projection had happened.

CPython: Lib/dis.py:218-237 _get_code_array

func DispatchDictMutation

func DispatchDictMutation(interp *state.Interpreter, event DictWatchEvent, dict, key, newValue unsafe.Pointer)

DispatchDictMutation fires every registered dict watcher subscribed to dict. The dict mutation paths in objects/dict_mutate.go will call this once per mutation when the dict's WATCHED tag is set; gate tests drive it directly until that wiring lands.

CPython: Objects/dictobject.c _PyDict_NotifyEvent
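
The interaction between idempotent subscription and dispatch can be sketched as follows. The watcher and registry types are hypothetical, dicts are identified by uintptr instead of unsafe.Pointer, and the event, key, and value arguments are left out to keep the example short.

```go
package main

import "fmt"

const MaxDictWatchers = 8

// watcher pairs a callback with the set of dicts subscribed to its slot.
type watcher struct {
	cb   func(dict uintptr)
	subs map[uintptr]struct{}
}

type registry struct {
	slots [MaxDictWatchers]watcher
}

// watch subscribes dict to slot id. Repeating the call changes nothing.
func (r *registry) watch(id int, dict uintptr) {
	w := &r.slots[id]
	if w.subs == nil {
		w.subs = make(map[uintptr]struct{})
	}
	w.subs[dict] = struct{}{}
}

// unwatch removes dict from slot id. Removing an absent dict is a no-op.
func (r *registry) unwatch(id int, dict uintptr) {
	delete(r.slots[id].subs, dict)
}

// dispatch fires every registered callback whose slot is subscribed to
// dict and returns how many callbacks ran.
func (r *registry) dispatch(dict uintptr) int {
	fired := 0
	for i := range r.slots {
		w := &r.slots[i]
		if w.cb == nil {
			continue
		}
		if _, ok := w.subs[dict]; ok {
			w.cb(dict)
			fired++
		}
	}
	return fired
}

func main() {
	var r registry
	calls := 0
	r.slots[1].cb = func(uintptr) { calls++ }

	r.watch(1, 0xA0)
	r.watch(1, 0xA0) // duplicate subscription
	fired := r.dispatch(0xA0)
	fmt.Println(fired, calls)

	r.unwatch(1, 0xA0)
	r.unwatch(1, 0xA0) // duplicate removal
	fmt.Println(r.dispatch(0xA0))
}
```

Storing subscriptions as a set is what makes both calls idempotent: a second watch overwrites the same key, and a second unwatch deletes a key that is already gone.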

func DispatchTypeMutation

func DispatchTypeMutation(interp *state.Interpreter, typ unsafe.Pointer)

DispatchTypeMutation fires every registered type watcher subscribed to typ. The type mutation paths will drive it once they gain per-event hooks; gate tests drive it directly today.

CPython: Objects/typeobject.c PyType_Modified

func ExecutorClear

func ExecutorClear(interp *state.Interpreter, executor *Executor)

ExecutorClear runs tp_clear on the executor: unlinks from the interpreter list, marks it invalid, drops every chained side-exit reference, and detaches from its anchor Code. Idempotent once Valid is false.

CPython: Python/optimizer.c:1496-1518 executor_clear

func ExecutorDependsOn

func ExecutorDependsOn(executor *Executor, obj unsafe.Pointer)

ExecutorDependsOn stamps obj into executor's dependency bloom. The trace projector calls this for every guarded type / dict / code pointer; later mutations matching the bloom invalidate the trace.

CPython: Python/optimizer.c:1520-1525 _Py_Executor_DependsOn

func ExecutorInit

func ExecutorInit(interp *state.Interpreter, executor *Executor, dependencies *BloomFilter)

ExecutorInit stamps the dependency bloom into the executor and links it onto the per-interpreter list. Optimizers must call this before exposing the executor to the dispatch loop.

CPython: Python/optimizer.c:1466-1474 _Py_ExecutorInit

func ExecutorsInvalidateAll

func ExecutorsInvalidateAll(interp *state.Interpreter, isInvalidation bool)

ExecutorsInvalidateAll clears every executor in interp. Executors anchored at a code object route through CodeClearExecutors so the side table is reclaimed too; standalone executors clear directly.

CPython: Python/optimizer.c:1572-1588 _Py_Executors_InvalidateAll

func ExecutorsInvalidateCold

func ExecutorsInvalidateCold(interp *state.Interpreter)

ExecutorsInvalidateCold clears every executor whose Warm flag is unset and downgrades the rest to "not warm". Two passes against the same flag implement a CLOCK-style aging policy: a trace must keep warming up between sweeps to survive.

CPython: Python/optimizer.c:1590-1626 _Py_Executors_InvalidateCold
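
The aging policy described above can be pictured with a small standalone model. The exec type and sweepCold function here are hypothetical stand-ins, not gopy's Executor or its real sweep; only the behavior of the warm and valid flags follows the description.

```go
package main

import "fmt"

// exec is a hypothetical stand-in for an executor; only the two flags
// relevant to cold invalidation are modeled.
type exec struct {
	name  string
	warm  bool
	valid bool
}

// sweepCold invalidates every live executor that is not warm and resets
// the warm flag on the survivors, so each one has to run again before
// the next sweep to stay alive.
func sweepCold(list []*exec) {
	for _, e := range list {
		if !e.valid {
			continue
		}
		if !e.warm {
			e.valid = false
			continue
		}
		e.warm = false
	}
}

func main() {
	hot := &exec{name: "hot", warm: true, valid: true}
	idle := &exec{name: "idle", warm: true, valid: true}
	list := []*exec{hot, idle}

	sweepCold(list) // both survive, both are now "not warm"
	hot.warm = true // only hot runs again between sweeps
	sweepCold(list) // idle is cleared on the second sweep

	for _, e := range list {
		fmt.Printf("%s valid=%v\n", e.name, e.valid)
	}
}
```

A trace that stops running therefore survives at most one sweep after its last execution.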

func ExecutorsInvalidateDependency

func ExecutorsInvalidateDependency(interp *state.Interpreter, obj unsafe.Pointer, isInvalidation bool)

ExecutorsInvalidateDependency clears every executor whose dependency bloom may contain obj. The CPython path stages the matches into a list before clearing because executor_clear can deallocate peers; gopy mirrors the staging pattern even though Go's GC removes the deallocation hazard, so the walk behaves exactly as it does in CPython.

CPython: Python/optimizer.c:1530-1568 _Py_Executors_InvalidateDependency

func FramePop

func FramePop(ctx *JitOptContext)

FramePop drops the current abstract frame. The locals/stack window is recycled for the parent.

CPython: Python/optimizer_symbols.c:703-713 _Py_uop_frame_pop

func FreeExecutor

func FreeExecutor(executor *Executor)

FreeExecutor releases the executor's storage. CPython hands the memory back to the GC arena; gopy lets Go's runtime reclaim it once the last reference drops.

CPython: Python/optimizer.c:217-223 free_executor

func PrintUOp

func PrintUOp(u *UOPInstruction) string

PrintUOp formats one uop instruction as a single-line string. Out-of-range opcodes render as "<uop N>"; named opcodes use their CPython spelling (with a leading underscore).

CPython: Python/optimizer.c:295-328 _PyUOpPrint

func SymGetConst

func SymGetConst(ctx *JitOptContext, sym *JitOptSymbol) objects.Object

SymGetConst returns the constant value sym proves, or nil. Resolves Truthiness symbols to True/False on the fly when their underlying value's truthiness is known.

CPython: Python/optimizer_symbols.c:131-148 _Py_uop_sym_get_const

func SymGetType

func SymGetType(sym *JitOptSymbol) *objects.Type

SymGetType returns the type sym is known to be, or nil if the type is unconstrained.

CPython: Python/optimizer_symbols.c:395-416 _Py_uop_sym_get_type

func SymGetTypeVersion

func SymGetTypeVersion(sym *JitOptSymbol) uint32

SymGetTypeVersion returns the tp_version_tag sym proves, or 0 when unconstrained.

CPython: Python/optimizer_symbols.c:418-440 _Py_uop_sym_get_type_version

func SymHasType

func SymHasType(sym *JitOptSymbol) bool

SymHasType reports whether sym carries a definite type.

CPython: Python/optimizer_symbols.c:442-460 _Py_uop_sym_has_type

func SymIsBottom

func SymIsBottom(sym *JitOptSymbol) bool

SymIsBottom reports whether sym is the bottom of the lattice.

CPython: Python/optimizer_symbols.c:95-99 _Py_uop_sym_is_bottom

func SymIsConst

func SymIsConst(ctx *JitOptContext, sym *JitOptSymbol) bool

SymIsConst reports whether sym is a known concrete value. A Truthiness symbol upgrades to a known bool when its underlying value's truthiness is known.

CPython: Python/optimizer_symbols.c:106-122 _Py_uop_sym_is_const

func SymIsImmortal

func SymIsImmortal(sym *JitOptSymbol) bool

SymIsImmortal reports whether sym is known to point at an immortal object. CPython mints int(-5..256), the empty string, True/False, and None as immortal. gopy treats bool / None / interned singletons the same way.

CPython: Python/optimizer_symbols.c:576-589 _Py_uop_sym_is_immortal

func SymIsNotNull

func SymIsNotNull(sym *JitOptSymbol) bool

SymIsNotNull reports whether sym is known to not be NULL.

CPython: Python/optimizer_symbols.c:101-104 _Py_uop_sym_is_not_null

func SymIsNull

func SymIsNull(sym *JitOptSymbol) bool

SymIsNull reports whether sym is known to be NULL.

CPython: Python/optimizer_symbols.c:124-128 _Py_uop_sym_is_null

func SymMatchesType

func SymMatchesType(sym *JitOptSymbol, typ *objects.Type) bool

SymMatchesType reports whether sym's known type equals typ.

CPython: Python/optimizer_symbols.c:462-467 _Py_uop_sym_matches_type

func SymMatchesTypeVersion

func SymMatchesTypeVersion(sym *JitOptSymbol, version uint32) bool

SymMatchesTypeVersion reports whether sym's known type version equals version.

CPython: Python/optimizer_symbols.c:469-473 _Py_uop_sym_matches_type_version

func SymSetConst

func SymSetConst(ctx *JitOptContext, sym *JitOptSymbol, constVal objects.Object)

SymSetConst narrows sym to a single concrete value. Contradictions on the type or existing const drop sym to Bottom.

CPython: Python/optimizer_symbols.c:247-314 _Py_uop_sym_set_const

func SymSetNonNull

func SymSetNonNull(ctx *JitOptContext, sym *JitOptSymbol)

SymSetNonNull narrows sym to non-NULL. Unknown promotes; Null drops to Bottom.

CPython: Python/optimizer_symbols.c:327-336 _Py_uop_sym_set_non_null

func SymSetNull

func SymSetNull(ctx *JitOptContext, sym *JitOptSymbol)

SymSetNull narrows sym to NULL. Promotes Unknown directly; any non-NULL state drops to Bottom.

CPython: Python/optimizer_symbols.c:316-325 _Py_uop_sym_set_null
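
As a rough illustration of the NULL / non-NULL narrowing rules, the sketch below models just four lattice states. The tag type, its values, and the setNull / setNonNull helpers are hypothetical and operate on a bare tag rather than a *JitOptSymbol; the real lattice has more states.

```go
package main

import "fmt"

// tag is a hypothetical, reduced version of the symbol tag. The numeric
// values are local to this sketch and are not the JitSymType values.
type tag int

const (
	unknown tag = iota
	null
	nonNull
	bottom
)

func (t tag) String() string {
	return [...]string{"Unknown", "Null", "NonNull", "Bottom"}[t]
}

// setNull narrows t to NULL: Unknown promotes, Null is unchanged, and
// any other state is a contradiction.
func setNull(t tag) tag {
	switch t {
	case unknown, null:
		return null
	default:
		return bottom
	}
}

// setNonNull narrows t to non-NULL: Unknown promotes, Null is a
// contradiction, and everything else is left as it is.
func setNonNull(t tag) tag {
	switch t {
	case unknown:
		return nonNull
	case null:
		return bottom
	default:
		return t
	}
}

func main() {
	for _, t := range []tag{unknown, null, nonNull} {
		fmt.Printf("%v: setNull -> %v, setNonNull -> %v\n", t, setNull(t), setNonNull(t))
	}
}
```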

func SymSetType

func SymSetType(ctx *JitOptContext, sym *JitOptSymbol, typ *objects.Type)

SymSetType narrows sym to instances of typ. Contradictions on the existing tag drop sym to Bottom.

CPython: Python/optimizer_symbols.c:150-198 _Py_uop_sym_set_type

func SymSetTypeVersion

func SymSetTypeVersion(ctx *JitOptContext, sym *JitOptSymbol, version uint32) bool

SymSetTypeVersion narrows sym to a specific tp_version_tag. Returns false on contradiction.

CPython: Python/optimizer_symbols.c:200-245 _Py_uop_sym_set_type_version

func SymTruthiness

func SymTruthiness(ctx *JitOptContext, sym *JitOptSymbol) int

SymTruthiness returns 1 for truthy, 0 for falsey, -1 for unknown. Resolves Truthiness symbols recursively and folds to KnownValue when the underlying truth is known.

CPython: Python/optimizer_symbols.c:475-521 _Py_uop_sym_truthiness
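
The three-valued result and the recursive resolution can be sketched on a reduced symbol type. Everything below (the sym struct, its fields, the truthiness function) is a hypothetical model; it leaves out the folding to KnownValue that the real function performs.

```go
package main

import "fmt"

// sym is a hypothetical, reduced symbol: either a value whose truth may
// be known, or a truthiness link to another symbol.
type sym struct {
	known  bool // true when the value's truth is proven
	truth  bool // meaningful only when known is true
	link   *sym // non-nil for a truthiness symbol
	invert bool // a truthiness symbol may negate the linked value
}

// truthiness returns 1 for truthy, 0 for falsey, and -1 for unknown.
func truthiness(s *sym) int {
	if s.link != nil {
		t := truthiness(s.link) // resolve the underlying symbol first
		if t < 0 {
			return -1
		}
		if s.invert {
			return 1 - t
		}
		return t
	}
	if !s.known {
		return -1
	}
	if s.truth {
		return 1
	}
	return 0
}

func main() {
	yes := &sym{known: true, truth: true}
	notYes := &sym{link: yes, invert: true}
	mystery := &sym{}
	fmt.Println(truthiness(yes), truthiness(notYes), truthiness(mystery))
}
```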

func SymTupleLength

func SymTupleLength(sym *JitOptSymbol) int

SymTupleLength returns the proven length of sym, or -1 if unknown.

CPython: Python/optimizer_symbols.c:560-573 _Py_uop_sym_tuple_length

func TranslateBytecodeToTrace

func TranslateBytecodeToTrace(
	code *objects.Code,
	instr int,
	trace []UOPInstruction,
	bufferSize int,
	dependencies *BloomFilter,
	progressNeeded bool,
) int

TranslateBytecodeToTrace projects the live Tier-1 bytecode of code starting at codeunit offset instr into the uop buffer trace. On success it returns the trace length (>0); on a failure that should retire the warmup attempt without an error it returns 0; on a hard error it returns -1. The dependencies bloom is stamped with every Code object the trace inlines through PUSH_FRAME (gopy currently only stamps the entry code; PUSH_FRAME inlining bails).

CPython: Python/optimizer.c:553-987 translate_bytecode_to_trace
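
A caller has to tell the three return ranges apart. The helper below is a minimal sketch of that branching; the outcome function and its labels are illustrative only and are not part of the package.

```go
package main

import "fmt"

// outcome classifies a projection result using the three-way contract
// described above: positive length, zero, or negative.
func outcome(n int) string {
	switch {
	case n > 0:
		return "install" // a trace of n uops was produced
	case n == 0:
		return "retire" // give up on this warmup attempt, no error
	default:
		return "error" // hard failure, propagate to the caller
	}
}

func main() {
	for _, n := range []int{12, 0, -1} {
		fmt.Printf("%d -> %s\n", n, outcome(n))
	}
}
```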

func TypeAddWatcher

func TypeAddWatcher(interp *state.Interpreter, cb TypeWatchCallback) int

TypeAddWatcher registers cb in the first free type watcher slot.

CPython: Objects/typeobject.c PyType_AddWatcher

func TypeClearWatcher

func TypeClearWatcher(interp *state.Interpreter, id int)

TypeClearWatcher releases watcher slot id.

CPython: Objects/typeobject.c PyType_ClearWatcher

func TypeUnwatch

func TypeUnwatch(interp *state.Interpreter, id int, typ unsafe.Pointer)

TypeUnwatch drops typ from watcher slot id.

CPython: Objects/typeobject.c PyType_Unwatch

func TypeWatch

func TypeWatch(interp *state.Interpreter, id int, typ unsafe.Pointer)

TypeWatch subscribes typ to watcher slot id.

CPython: Objects/typeobject.c PyType_Watch

func UOpName

func UOpName(id uint16) string

UOpName looks up the human-readable uop name. Returns "" when the id is out of range, matching CPython's NULL return; callers that want a placeholder substitute "<nil>" themselves. The id->name map is generated; see optimizer/uop_ids_gen.go.

CPython: Python/optimizer.c:285-292 _PyUOpName

func WatcherInit

func WatcherInit(interp *state.Interpreter)

WatcherInit registers the dict and type watcher callbacks at the canonical Tier-2 IDs. Idempotent: re-running leaves the existing callbacks in place.

CPython: Python/optimizer_analysis.c:175-180 (callback registration inside remove_globals)

Types

type AbstractFrame

type AbstractFrame struct {
	StackLen     int
	LocalsLen    int
	StackPointer int // index into Ctx.LocalsAndStack
	Stack        int // index into Ctx.LocalsAndStack
	Locals       int // index into Ctx.LocalsAndStack
}

AbstractFrame mirrors one stack frame inside the abstract interpreter. Locals and stack point into the JitOptContext's per-trace pool.

CPython: pycore_optimizer.h:224-232 _Py_UOpsAbstractFrame

func FrameNew

func FrameNew(ctx *JitOptContext, co *objects.Code, currStackEntries int, args []*JitOptSymbol, argLen int) *AbstractFrame

FrameNew pushes a new abstract frame onto the context. The first argLen symbols of args seed the locals; the remaining locals start as Unknown. Returns nil and sets OutOfSpace when the locals+stack pool is exhausted.

CPython: Python/optimizer_symbols.c:617-658 _Py_uop_frame_new
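
The index-based layout can be sketched as carving a window out of a shared pool. The pool and frame types below are hypothetical, the pool only counts slots instead of storing symbols, and the exact boundary check is an assumption rather than gopy's actual condition.

```go
package main

import "fmt"

// pool is a hypothetical model of the shared locals+stack pool.
type pool struct {
	consumed   int
	limit      int
	outOfSpace bool
}

// frame holds indices into the pool, in the spirit of AbstractFrame.
type frame struct {
	locals, stack, stackPointer int
	localsLen, stackLen         int
}

// newFrame carves nlocals+stacksize slots off the pool. It returns nil
// and flags outOfSpace when the window would not fit.
func newFrame(p *pool, nlocals, stacksize, currStackEntries int) *frame {
	need := nlocals + stacksize
	if p.consumed+need > p.limit {
		p.outOfSpace = true
		return nil
	}
	f := &frame{
		locals:    p.consumed,
		stack:     p.consumed + nlocals, // stack starts right after locals
		localsLen: nlocals,
		stackLen:  stacksize,
	}
	f.stackPointer = f.stack + currStackEntries
	p.consumed += need
	return f
}

func main() {
	p := &pool{limit: 16}
	outer := newFrame(p, 3, 4, 0)
	inner := newFrame(p, 2, 5, 1)
	fmt.Println(outer.locals, outer.stack, inner.locals, inner.stackPointer)
	fmt.Println(newFrame(p, 2, 2, 0) == nil, p.outOfSpace)
}
```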

type BackoffCounter

type BackoffCounter struct {
	ValueAndBackoff uint16
}

BackoffCounter is the same value-and-backoff scheme the adaptive specializer uses for its per-instruction warmup counters. Exposed here so trace exits can share the shape.

CPython: Include/internal/pycore_backoff.h _Py_BackoffCounter

type BloomFilter

type BloomFilter struct {
	Bits [BloomFilterWords]uint32
}

BloomFilter packs eight 32-bit words. False positives are tolerable; false negatives would skip a needed invalidation.

CPython: pycore_optimizer.h:26-28 _PyBloomFilter

func (*BloomFilter) Add

func (b *BloomFilter) Add(ptr unsafe.Pointer)

Add hashes ptr and stamps bloomK bits across the filter. Each round pulls 8 bits off the running 64-bit hash, picks the word from the top 3 bits and the bit index from the bottom 5.

CPython: Python/optimizer.c:1394-1404 _Py_BloomFilter_Add

func (*BloomFilter) Init

func (b *BloomFilter) Init()

Init zeroes every bit. It mirrors the explicit loop CPython runs, which keeps shape parity with the gate fixtures.

CPython: Python/optimizer.c:1381-1387 _Py_BloomFilter_Init

func (*BloomFilter) MayContain

func (b *BloomFilter) MayContain(hashes *BloomFilter) bool

MayContain reports whether every bit set in hashes is also set in b. False positives are intentional; the filter exists to keep the invalidation walk O(1) per executor in the common case.

CPython: Python/optimizer.c:1406-1415 bloom_filter_may_contain
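
The bit selection in Add and the subset test in MayContain can be sketched as follows. This is a standalone model: the bloom type, the addHash method that takes an already computed 64-bit hash, and the value k = 6 are assumptions for illustration (the real filter hashes a pointer, and its bloomK value is not stated here).

```go
package main

import "fmt"

const (
	words = 8 // eight 32-bit words, 256 bits in total
	k     = 6 // assumed number of bits stamped per entry
)

type bloom struct{ bits [words]uint32 }

// addHash stamps k bits. Each round consumes the low 8 bits of h: the
// top 3 of those pick the word, the bottom 5 pick the bit in that word.
func (b *bloom) addHash(h uint64) {
	for i := 0; i < k; i++ {
		v := uint8(h & 0xff)
		b.bits[v>>5] |= 1 << (v & 31)
		h >>= 8
	}
}

// mayContain reports whether every bit set in q is also set in b.
func (b *bloom) mayContain(q *bloom) bool {
	for i := range b.bits {
		if q.bits[i]&b.bits[i] != q.bits[i] {
			return false
		}
	}
	return true
}

func main() {
	var deps, same, other bloom
	deps.addHash(0x0102030405060708)
	same.addHash(0x0102030405060708)
	other.addHash(0xf0e0d0c0b0a09080)
	fmt.Println(deps.mayContain(&same), deps.mayContain(&other))
}
```

Building a one-entry filter for the mutated object and testing it against each executor's filter is what lets the invalidation walk skip unrelated executors with a handful of word comparisons.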

type DictWatchCallback

type DictWatchCallback func(event DictWatchEvent, dict unsafe.Pointer, key unsafe.Pointer, newValue unsafe.Pointer) int

DictWatchCallback is the dict-mutation callback signature. dict, key, newValue match the CPython PyObject* parameters; gopy passes raw pointers because the optimizer only needs identity for the bloom.

CPython: Include/cpython/dictobject.h:97 PyDict_WatchCallback

type DictWatchEvent

type DictWatchEvent int

DictWatchEvent mirrors CPython's PyDict_WatchEvent. The Tier-2 optimizer only reacts to mutations; the other events flow through for parity with future consumers.

CPython: Include/cpython/dictobject.h:88 PyDict_WatchEvent

const (
	DictEventAdded DictWatchEvent = iota
	DictEventModified
	DictEventDeleted
	DictEventCloned
	DictEventCleared
	DictEventDeallocated
)

type Executor

type Executor struct {
	objects.VarHeader
	// Trace is the post-analysis uop buffer. Length is in VarHeader.
	Trace []UOPInstruction
	// VMData is the slice the dispatch loop mutates.
	VMData VMData
	// ExitCount is len(Exits); the JIT path uses this for chain
	// bookkeeping.
	ExitCount uint32
	// CodeSize is len(Trace) at allocation time.
	CodeSize uint32
	// Exits holds one ExitData per side-exit guard. Variable-length;
	// the slot count is fixed at allocation.
	Exits []ExitData
}

Executor is the v0.12 unit of work. One executor owns one linear trace of uops projected out of a warm Tier-1 region. The Python object surface, which lives in pyobject.go, lets Executor round-trip through dis.dis and the debugger.

CPython: pycore_optimizer.h:75-85 _PyExecutorObject

func AllocateExecutor

func AllocateExecutor(exitCount, length int) *Executor

AllocateExecutor returns a fresh executor with room for length uops and exitCount side exits. Trace and Exits slices share the same underlying memory in CPython through a flexible array member; gopy allocates them as separate slices since Go has no equivalent.

CPython: Python/optimizer.c:1103-1115 allocate_executor

func GetExecutor

func GetExecutor(code *objects.Code, offset int) (*Executor, error)

GetExecutor returns the executor anchored at offset bytes into code's bytecode, or an error if no ENTER_EXECUTOR exists there. This is the public entry point sys.monitoring and dis.dis use to resolve ENTER_EXECUTOR's oparg back to the trace it routes to.

CPython: Python/optimizer.c:184-193 _Py_GetExecutor

func Optimize

func Optimize(interp *state.Interpreter, frame objects.InterpreterFrame, code *objects.Code, start, chainDepth int, currStackEntries int) (*Executor, int)

Optimize is the JUMP_BACKWARD warmup-callback. It projects a trace starting at codeunit offset start in code, hands it through the analysis pass, and on success installs the resulting executor in code's side table at the start offset. chainDepth is the number of side-exit hops between the root trace and this attempt; the entry trace and every MaxChainDepth'th side-exit must make forward progress (i.e. own an ENTER_EXECUTOR slot in the host code).

CPython: Python/optimizer.c:113-163 _PyOptimizer_Optimize

func (*Executor) GetOparg

func (e *Executor) GetOparg() uint8

GetOparg returns the original Tier-1 oparg.

CPython: Python/optimizer.c:204-208 get_oparg

func (*Executor) GetOpcode

func (e *Executor) GetOpcode() uint8

GetOpcode returns the original Tier-1 opcode at the install site. dis.dis uses this to render the deopt arrow alongside the trace.

CPython: Python/optimizer.c:198-202 get_opcode

func (*Executor) IsValid

func (e *Executor) IsValid() bool

IsValid reports whether the executor is still wired into the runtime. CPython exposes this as the is_valid() Python method; gopy callers (mainly dis.dis and the gate fixtures) call this directly.

CPython: Python/optimizer.c:192-196 is_valid

type ExecutorArray

type ExecutorArray struct {
	Capacity int
	Size     int
	Entries  []*Executor
}

ExecutorArray is the side table that hangs off Code. Each entry carries one ENTER_EXECUTOR routing for a given bytecode offset; the dispatch loop reads code.Executors[oparg] when it hits ENTER_EXECUTOR.

CPython: Python/optimizer.c:34-193 (the side-table helpers)

type ExecutorLinkListNode

type ExecutorLinkListNode struct {
	Next *Executor
	Prev *Executor
}

ExecutorLinkListNode holds the doubly-linked thread-list pointers the per-interpreter executor list uses.

CPython: pycore_optimizer.h:16-19 _PyExecutorLinkListNode

type ExitData

type ExitData struct {
	// Target is the Tier-1 bytecode offset to resume at.
	Target uint32
	// Temperature is the side-exit warmup counter.
	Temperature BackoffCounter
	// Executor is the side-exit's own trace, populated lazily when
	// the exit warms up.
	Executor *Executor
}

ExitData holds one trace-exit slot. Each guard that might bail to Tier-1 owns one of these. Temperature drives whether the exit's own trace gets compiled when the exit fires often enough.

CPython: pycore_optimizer.h:69-73 _PyExitData

type InterpState

type InterpState struct {
	ExecutorListHead         *Executor
	ExecutorDeletionListHead *Executor
	RemainingCapacity        int
}

InterpState is the per-interpreter Tier-2 bookkeeping. The executor list head links every live executor; the deletion list head holds executors awaiting a sweep. RemainingCapacity is decremented on each pending-deletion enqueue and refilled when the sweep runs.

CPython: Include/internal/pycore_interp_structs.h executor_list_head / executor_deletion_list_head / executor_deletion_list_remaining_capacity

type JitOptContext

type JitOptContext struct {
	Done           bool
	OutOfSpace     bool
	Contradiction  bool
	Frame          *AbstractFrame
	Frames         [MaxAbstractFrameDepth]AbstractFrame
	CurrFrameDepth int
	TArena         TyArena
	NConsumed      int // index into LocalsAndStack
	Limit          int // index into LocalsAndStack
	LocalsAndStack [MaxAbstractInterpSize]*JitOptSymbol
}

JitOptContext threads through the analysis pass. Holds the symbol arena, the abstract-frame stack, and the live locals+stack window.

CPython: pycore_optimizer.h:242-257 _JitOptContext

type JitOptKnownClass

type JitOptKnownClass struct {
	Tag     uint8
	Version uint32
	Type    *objects.Type
}

JitOptKnownClass is the "definitely an instance of T, possibly with version v pinned" symbol payload.

CPython: pycore_optimizer.h:183-187 _jit_opt_known_class

type JitOptKnownValue

type JitOptKnownValue struct {
	Tag   uint8
	Value objects.Object
}

JitOptKnownValue is the "constant" symbol payload, holding the concrete object the analyzer proved is on the stack.

CPython: pycore_optimizer.h:194-197 _jit_opt_known_value

type JitOptKnownVersion

type JitOptKnownVersion struct {
	Tag     uint8
	Version uint32
}

JitOptKnownVersion is the "type version pinned but type identity unknown" symbol payload. Appears after _GUARD_TYPE_VERSION fires.

CPython: pycore_optimizer.h:189-192 _jit_opt_known_version

type JitOptSymbol

type JitOptSymbol struct {
	Tag        uint8
	Class      JitOptKnownClass
	Value      JitOptKnownValue
	Version    JitOptKnownVersion
	Tuple      JitOptTuple
	Truthiness JitOptTruthiness
}

JitOptSymbol is the tagged-union one-of for the symbolic lattice. CPython packs this as a C union; Go's lack of unions forces a struct that holds every payload variant. The Tag field picks which payload is live.

CPython: pycore_optimizer.h:213-220 _jit_opt_symbol
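
The struct-of-all-payloads approach can be sketched on a reduced symbol. The symbol type and describe function below are hypothetical; the three tag values are copied from this package's JitSymType constants, and a string stands in for the constant object.

```go
package main

import "fmt"

// A few tag values, matching the JitSymType constants in this package.
const (
	tagUnknown     uint8 = 1
	tagTypeVersion uint8 = 5
	tagKnownValue  uint8 = 7
)

// symbol keeps one field per payload because Go has no unions. Only the
// field selected by tag is meaningful; the others are ignored.
type symbol struct {
	tag     uint8
	version uint32 // live when tag == tagTypeVersion
	value   string // live when tag == tagKnownValue
}

// describe reads only the payload that the tag marks as live.
func describe(s symbol) string {
	switch s.tag {
	case tagTypeVersion:
		return fmt.Sprintf("version=%d", s.version)
	case tagKnownValue:
		return fmt.Sprintf("const=%s", s.value)
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(describe(symbol{tag: tagTypeVersion, version: 42}))
	fmt.Println(describe(symbol{tag: tagKnownValue, value: "None"}))
	fmt.Println(describe(symbol{tag: tagUnknown, version: 99}))
}
```

The cost of this layout is size: every symbol carries the storage for all payloads, where the C union only pays for the largest one.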

func SymNewConst

func SymNewConst(ctx *JitOptContext, constVal objects.Object) *JitOptSymbol

SymNewConst allocates a fresh symbol pinned to constVal.

CPython: Python/optimizer_symbols.c:372-382 _Py_uop_sym_new_const

func SymNewNotNull

func SymNewNotNull(ctx *JitOptContext) *JitOptSymbol

SymNewNotNull allocates a fresh NonNull symbol.

CPython: Python/optimizer_symbols.c:349-358 _Py_uop_sym_new_not_null

func SymNewNull

func SymNewNull(ctx *JitOptContext) *JitOptSymbol

SymNewNull allocates a fresh NULL-state symbol.

CPython: Python/optimizer_symbols.c:384-393 _Py_uop_sym_new_null

func SymNewTruthiness

func SymNewTruthiness(ctx *JitOptContext, value *JitOptSymbol, truthy bool) *JitOptSymbol

SymNewTruthiness allocates a Truthiness-tagged symbol that mirrors the truthiness of value. truthy=true makes "is truthy", false inverts.

CPython: Python/optimizer_symbols.c:591-613 _Py_uop_sym_new_truthiness

func SymNewTuple

func SymNewTuple(ctx *JitOptContext, size int, args []*JitOptSymbol) *JitOptSymbol

SymNewTuple allocates a Tuple symbol with element-wise tracking. Tuples longer than MaxSymbolicTupleSize degrade to KnownClass.

CPython: Python/optimizer_symbols.c:523-542 _Py_uop_sym_new_tuple

func SymNewType

func SymNewType(ctx *JitOptContext, typ *objects.Type) *JitOptSymbol

SymNewType allocates a fresh symbol narrowed to instances of typ.

CPython: Python/optimizer_symbols.c:360-369 _Py_uop_sym_new_type

func SymNewUnknown

func SymNewUnknown(ctx *JitOptContext) *JitOptSymbol

SymNewUnknown allocates a fresh Unknown symbol.

CPython: Python/optimizer_symbols.c:339-347 _Py_uop_sym_new_unknown

func SymTupleGetitem

func SymTupleGetitem(ctx *JitOptContext, sym *JitOptSymbol, item int) *JitOptSymbol

SymTupleGetitem returns the symbol at index item of a Tuple-tagged sym, or a fresh Unknown otherwise.

CPython: Python/optimizer_symbols.c:544-558 _Py_uop_sym_tuple_getitem

type JitOptTruthiness

type JitOptTruthiness struct {
	Tag    uint8
	Invert bool
	Value  uint16
}

JitOptTruthiness encodes "this symbol's truthiness equals (maybe inverted) some other symbol's truthiness".

CPython: pycore_optimizer.h:207-211 (anonymous struct)

type JitOptTuple

type JitOptTuple struct {
	Tag    uint8
	Length uint8
	Items  [MaxSymbolicTupleSize]uint16
}

JitOptTuple holds element-wise symbol indices for a small tuple.

CPython: pycore_optimizer.h:201-205 _jit_opt_tuple

type JitSymType

type JitSymType uint8

JitSymType enumerates the symbolic-state tags the abstract interpreter operates on. The values match CPython byte for byte so gate fixtures that assert on tag values stay portable.

CPython: pycore_optimizer.h:171-181 _JitSymType

const (
	JitSymUnknown     JitSymType = 1
	JitSymNull        JitSymType = 2
	JitSymNonNull     JitSymType = 3
	JitSymBottom      JitSymType = 4
	JitSymTypeVersion JitSymType = 5
	JitSymKnownClass  JitSymType = 6
	JitSymKnownValue  JitSymType = 7
	JitSymTuple       JitSymType = 8
	JitSymTruthiness  JitSymType = 9
)

type MacroOparg

type MacroOparg uint8

MacroOparg encodes how the projector pulls oparg/operand for one uop slot. The values match CPython's OPARG_* constants byte for byte; #430 will fold them into the generated table.

CPython: Tools/cases_generator/analyzer.py CacheEffect / OpargSize

const (
	OpargSimple           MacroOparg = 0
	OpargCache1           MacroOparg = 1
	OpargCache2           MacroOparg = 2
	OpargCache4           MacroOparg = 4
	OpargTop              MacroOparg = 5
	OpargBottom           MacroOparg = 6
	OpargSaveReturnOffset MacroOparg = 7
	OpargReplaced         MacroOparg = 8
	Operand1_1            MacroOparg = 9
	Operand1_2            MacroOparg = 10
	Operand1_4            MacroOparg = 11
)

type MacroUop

type MacroUop struct {
	Uop    uint16
	Size   MacroOparg
	Offset uint8
}

MacroUop is one slot in a Tier-1 opcode's uop expansion. Size picks where the operand comes from; Offset is the cache slot the operand reads from (in codeunits past the opcode pair).

CPython: Include/internal/pycore_opcode_metadata.h:1325

type Tier2State

type Tier2State struct {
	Interp   *state.Interpreter
	Thread   *state.Thread
	Frame    *frame.Frame
	Executor *Executor
	// NextUop is the index into Executor.Trace of the next instruction
	// to dispatch. Mutated by JUMP_TO_TOP / JUMP_TO_JUMP_TARGET helpers
	// before the method returns StatusContinue.
	NextUop int
	// Oparg carries the current uop's oparg (low 16 bits of
	// UOPInstruction.Oparg). Refreshed before each dispatch.
	Oparg uint32
}

Tier2State carries the per-call locals the dispatch switch reads and mutates. Field set is intentionally small: the C loop keeps frame, stack pointer, oparg, next_uop and current_executor in registers, and gopy mirrors that with a stack-allocated value.

CPython: Python/ceval.c:1240-1288 enter_tier_two prologue

func (*Tier2State) Run

func (s *Tier2State) Run() Tier2Status

Run is the dispatch driver. Mirrors enter_tier_two's for(;;) over the switch: load opcode, advance, dispatch by uop ID, branch on the returned status. JUMP_TO_TOP / JUMP_TO_JUMP_TARGET are encoded as in-method mutations of NextUop, not separate status values, so the switch stays small.

CPython: Python/ceval.c:1291-1333 tier2_dispatch loop
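
The loop shape can be sketched with a toy trace. The state type, the three uop IDs, and the step / run methods below are hypothetical; they only illustrate how a jump becomes a write to the next-uop index while the body still returns the continue status.

```go
package main

import "fmt"

type status int

const (
	statusContinue status = iota
	statusError
	statusDeopt
	statusExit
)

// Hypothetical uop IDs for this sketch only.
const (
	opDecrement = iota
	opJumpToTopIfPositive
	opExit
)

type state struct {
	trace   []int
	nextUop int
	counter int
}

// step runs one uop body. A jump rewrites nextUop and still returns
// statusContinue; unknown opcodes deopt.
func (s *state) step(op int) status {
	switch op {
	case opDecrement:
		s.counter--
	case opJumpToTopIfPositive:
		if s.counter > 0 {
			s.nextUop = 0
		}
	case opExit:
		return statusExit
	default:
		return statusDeopt
	}
	return statusContinue
}

// run loads the opcode, advances nextUop, dispatches, and loops until a
// body returns anything other than statusContinue. It assumes the trace
// ends in a terminator.
func (s *state) run() status {
	for {
		op := s.trace[s.nextUop]
		s.nextUop++
		if st := s.step(op); st != statusContinue {
			return st
		}
	}
}

func main() {
	s := &state{trace: []int{opDecrement, opJumpToTopIfPositive, opExit}, counter: 3}
	fmt.Println(s.run() == statusExit, s.counter)
}
```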

type Tier2Status

type Tier2Status int

Tier2Status is the return contract for per-uop methods. The driver loop branches on this value to decide what to do next.

CPython: Python/ceval.c:1335-1356 jump_to_error_target / jump_to_jump_target labels

const (
	// StatusContinue advances to NextUop. Methods that take a
	// JUMP_TO_JUMP_TARGET / JUMP_TO_TOP path mutate s.NextUop in
	// place and still return StatusContinue, since the loop's "next"
	// is whatever index was just written.
	StatusContinue Tier2Status = iota
	// StatusError signals the per-uop body set an exception on
	// s.Thread; the driver hands control back to the Tier-1 unwind
	// path.
	StatusError
	// StatusDeopt signals the trace cannot run further: the install
	// site Tier-1 instruction should run instead. Stubs return this
	// to keep partial-port traces runtime-safe.
	StatusDeopt
	// StatusExit corresponds to _EXIT_TRACE: the trace finished
	// normally; the caller resumes Tier-1 at the side-exit target.
	StatusExit
)

func RunExecutor

func RunExecutor(thread *state.Thread, fr *frame.Frame, exec *Executor) Tier2Status

RunExecutor is the entry point from vm.enterExecutor: build a Tier2State around the trace and run the dispatch loop. Returns the terminal status so the caller can deopt, propagate an error, or resume Tier-1 at the exit target.

CPython: Python/ceval.c:1290 jump to enter_tier_two

type TyArena

type TyArena struct {
	CurrNumber int
	MaxNumber  int
	Arena      [TyArenaSize]JitOptSymbol
}

TyArena is the per-trace bump arena that backs every JitOptSymbol.

CPython: pycore_optimizer.h:236-240 ty_arena
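
A bump arena hands out slots in order and never frees them one by one. The sketch below uses hypothetical names (arena, alloc, reset) and a tiny capacity; returning nil on exhaustion is an assumption about how a caller would detect a full arena.

```go
package main

import "fmt"

const arenaSize = 4 // hypothetical; the package uses TyArenaSize

type symbol struct{ tag uint8 }

type arena struct {
	curr  int
	slots [arenaSize]symbol
}

// alloc returns a pointer to the next unused slot, or nil once the
// arena is full. Allocation is a single index increment.
func (a *arena) alloc() *symbol {
	if a.curr >= len(a.slots) {
		return nil
	}
	s := &a.slots[a.curr]
	a.curr++
	return s
}

// reset reclaims every slot at once, for example before a new trace.
func (a *arena) reset() { a.curr = 0 }

func main() {
	var a arena
	for i := 0; i < arenaSize+1; i++ {
		fmt.Println(i, a.alloc() != nil)
	}
	a.reset()
	fmt.Println(a.alloc() != nil)
}
```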

type TypeWatchCallback

type TypeWatchCallback func(typ unsafe.Pointer) int

TypeWatchCallback is the type-mutation callback signature. CPython passes the PyTypeObject* of the mutated class; gopy passes a raw pointer for the same reason.

CPython: Include/cpython/object.h:449 PyType_WatchCallback

type UOPInstruction

type UOPInstruction struct {
	// OpcodeAndFormat holds the uop ID in the low 15 bits and the
	// format bit in the high bit. Use Opcode() and Format() to read them.
	OpcodeAndFormat uint16
	Oparg           uint16
	// Target is the Tier-1 bytecode offset of the source opcode when
	// Format == UOPFormatTarget, or jump_target | (error_target<<16)
	// when Format == UOPFormatJump.
	Target   uint32
	Operand0 uint64
	Operand1 uint64
}

UOPInstruction is one row of an executor's trace. The opcode is a uint16 from the uop ID table (see uop_ids_gen.go). The format bit distinguishes plain Tier-1-offset targets from split jump/error targets used by exit-arm uops. operand0 carries cache entries and constant pointers; operand1 is reserved for the second-cache-entry uops the upstream JIT uses (gopy stays interpreter-only but mirrors the field for trace-shape parity).

CPython: pycore_optimizer.h:51-67 _PyUOpInstruction
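
The packing of the 16-bit opcode/format word and the 32-bit target field can be sketched as below. The uop type and its lowercase methods are a hypothetical reduced mirror written for illustration; they follow the bit layout described above but are not the package's implementation.

```go
package main

import "fmt"

const (
	formatTarget uint16 = 0
	formatJump   uint16 = 1
)

// uop is a hypothetical reduced mirror of UOPInstruction.
type uop struct {
	opcodeAndFormat uint16
	target          uint32
}

// The uop ID lives in the low 15 bits, the format in the top bit.
func (u *uop) opcode() uint16 { return u.opcodeAndFormat & 0x7fff }
func (u *uop) format() uint16 { return u.opcodeAndFormat >> 15 }

// setOpcode writes the ID and keeps the format bit.
func (u *uop) setOpcode(op uint16) {
	u.opcodeAndFormat = u.opcodeAndFormat&0x8000 | op&0x7fff
}

// setFormat writes the format bit and keeps the ID.
func (u *uop) setFormat(f uint16) {
	u.opcodeAndFormat = u.opcodeAndFormat&0x7fff | (f&1)<<15
}

// setJumpTargets switches to the jump format and packs both in-trace
// targets into the 32-bit field: jump in the low half, error in the high.
func (u *uop) setJumpTargets(jump, errTarget uint16) {
	u.setFormat(formatJump)
	u.target = uint32(jump) | uint32(errTarget)<<16
}

func (u *uop) jumpTarget() uint16  { return uint16(u.target) }
func (u *uop) errorTarget() uint16 { return uint16(u.target >> 16) }

func main() {
	var u uop
	u.setOpcode(300)
	u.setJumpTargets(7, 42)
	fmt.Println(u.opcode(), u.format(), u.jumpTarget(), u.errorTarget())
}
```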

func (*UOPInstruction) Format

func (u *UOPInstruction) Format() uint16

Format returns the format bit (UOPFormatTarget or UOPFormatJump).

CPython: pycore_optimizer.h:53 format:1

func (*UOPInstruction) GetErrorTarget

func (u *UOPInstruction) GetErrorTarget() uint16

GetErrorTarget returns the in-trace error-handler target for a JUMP-format uop.

CPython: pycore_optimizer.h:147-151 uop_get_error_target

func (*UOPInstruction) GetJumpTarget

func (u *UOPInstruction) GetJumpTarget() uint16

GetJumpTarget returns the in-trace jump target for a JUMP-format uop.

CPython: pycore_optimizer.h:141-145 uop_get_jump_target

func (*UOPInstruction) GetTarget

func (u *UOPInstruction) GetTarget() uint32

GetTarget returns the Tier-1 bytecode offset for a plain target uop.

CPython: pycore_optimizer.h:135-139 uop_get_target

func (*UOPInstruction) IsTerminator

func (u *UOPInstruction) IsTerminator() bool

IsTerminator reports whether u closes a projected trace. The trace driver stops dispatch on these.

CPython: pycore_optimizer.h:301-308 is_terminator

func (*UOPInstruction) Opcode

func (u *UOPInstruction) Opcode() uint16

Opcode returns the uop ID stored in OpcodeAndFormat's low 15 bits.

CPython: pycore_optimizer.h:52 opcode:15

func (*UOPInstruction) SetFormat

func (u *UOPInstruction) SetFormat(fmt uint16)

SetFormat writes the format bit, preserving the opcode.

func (*UOPInstruction) SetOpcode

func (u *UOPInstruction) SetOpcode(op uint16)

SetOpcode writes the uop ID, preserving the format bit.

type UopFlag

type UopFlag uint16

UopFlag is the flag bitmask the analysis pass reads to learn how a uop affects program state. Bit positions match the order in pycore_uop_metadata.h's HAS_*_FLAG enum.

CPython: Include/internal/pycore_uop_metadata.h HAS_*_FLAG

const (
	FlagArg         UopFlag = 1 << 0
	FlagConst       UopFlag = 1 << 1
	FlagName        UopFlag = 1 << 2
	FlagJump        UopFlag = 1 << 3
	FlagFree        UopFlag = 1 << 4
	FlagLocal       UopFlag = 1 << 5
	FlagEvalBreak   UopFlag = 1 << 6
	FlagDeopt       UopFlag = 1 << 7
	FlagError       UopFlag = 1 << 8
	FlagEscapes     UopFlag = 1 << 9
	FlagExit        UopFlag = 1 << 10
	FlagPure        UopFlag = 1 << 11
	FlagPassthrough UopFlag = 1 << 12
	FlagOpargAnd1   UopFlag = 1 << 13
	FlagErrorNoPop  UopFlag = 1 << 14
	FlagNoSaveIp    UopFlag = 1 << 15
)
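
Flags are tested with ordinary bit operations. The sketch below copies four of the bit positions listed above into a local type; the has method and the canLeaveTrace helper are hypothetical and only show the masking pattern.

```go
package main

import "fmt"

type uopFlag uint16

// A few of the flag bits listed above, copied for a standalone sketch.
const (
	flagDeopt   uopFlag = 1 << 7
	flagError   uopFlag = 1 << 8
	flagEscapes uopFlag = 1 << 9
	flagPure    uopFlag = 1 << 11
)

// has reports whether every bit in want is set in f.
func (f uopFlag) has(want uopFlag) bool { return f&want == want }

// canLeaveTrace is a hypothetical helper: it treats a uop as able to
// abandon the trace if it may deopt or raise.
func canLeaveTrace(f uopFlag) bool { return f&(flagDeopt|flagError) != 0 }

func main() {
	guard := flagDeopt
	call := flagError | flagEscapes
	fmt.Println(canLeaveTrace(guard), canLeaveTrace(call), canLeaveTrace(flagPure))
	fmt.Println(call.has(flagError|flagEscapes), call.has(flagPure))
}
```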

type UopMetaEntry

type UopMetaEntry struct {
	Flags       UopFlag
	Replication uint8
	Popped      int8 // -1 when oparg-dependent or absent
}

UopMetaEntry is one row of the uop metadata table.

type VMData

type VMData struct {
	// Opcode and Oparg hold the original Tier-1 bytecode at the
	// install site so deopt can restore them.
	Opcode uint8
	Oparg  uint8
	// Valid is cleared when the executor is invalidated.
	Valid bool
	// Linked is set while the executor is on the per-interpreter
	// list.
	Linked bool
	// ChainDepth tracks how far this trace is from the root through
	// chained side exits (max MaxChainDepth - 1).
	ChainDepth uint8
	// Warm is set once the trace has run often enough to survive the
	// next cold-pruning sweep.
	Warm bool
	// Index is the slot in Code.Executors that holds this executor.
	Index int
	// Bloom is the filter over types/dicts/code objects this trace
	// guards against.
	Bloom BloomFilter
	// Links connects the executor to the per-interpreter list.
	Links ExecutorLinkListNode
	// Code is a weak back-pointer to the install site; nil if no
	// ENTER_EXECUTOR currently routes here.
	Code *objects.Code
}

VMData is the slice of an executor the VM mutates: pointers back to the bytecode site that installed the executor, the bloom filter the invalidation path reads, and the link-list node.

CPython: pycore_optimizer.h:30-41 _PyVMData

type WatcherTable

type WatcherTable struct {
	// contains filtered or unexported fields
}

WatcherTable holds the per-interpreter dict / type callback slots plus the subscription sets the optimizer registers per dict and per type. Subscriptions are stored as pointer sets keyed by raw address so the dispatch path needs nothing more than identity comparison.

CPython: Include/internal/pycore_interp_structs.h dict_state.watchers / type_watchers
