tokenize

package

v0.12.3 | Published: May 15, 2026 | License: Apache-2.0 | Imports: 3 | Imported by: 0

Repository

github.com/tamnd/gopy

Documentation ¶

Overview ¶

Package tokenize is the Go port of cpython/Python/Python-tokenize.c. The C file is the Python-visible wrapper around the parser's lexer; it exposes the TokenizerIter class that `tokenize.tokenize()` in the stdlib delegates to.

The token kind constants live in the sibling token package, mirroring CPython's split between Include/internal/pycore_token.h (consumed by the C tokenizer) and Lib/token.py (re-exported by Lib/tokenize.py).

CPython: Python/Python-tokenize.c

Index ¶

type Iter
- func New(src string, extraTokens bool) *Iter
- func NewReadline(rl func() (string, error), extraTokens bool) *Iter
- func (it *Iter) Next() (Token, error)
type Pos
type Token

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Iter ¶

type Iter struct {
	// contains filtered or unexported fields
}

Iter is the Go equivalent of CPython's TokenizerIter. Next advances the underlying lexer state by one token; end of input is reported as io.EOF.

CPython: Python/Python-tokenize.c tokenizeriterobject

func New ¶

func New(src string, extraTokens bool) *Iter

New constructs an Iter over a source string. extraTokens enables the COMMENT / NL / ENCODING / NEWLINE-at-EOF tokens that the stdlib filters out by default.

CPython: Python/Python-tokenize.c tokenizeriter_new (source path)

func NewReadline ¶

func NewReadline(rl func() (string, error), extraTokens bool) *Iter

NewReadline constructs an Iter that pulls source lines from a readline-shaped callback, the same shape io.TextIOBase.readline has on the Python side. Each call to the callback returns one line of source (including any trailing newline); at end of stream it returns the error io.EOF.

CPython: Python/Python-tokenize.c tokenizeriter_new (readline path)

func (*Iter) Next ¶

func (it *Iter) Next() (Token, error)

Next returns the next token. It returns io.EOF once the lexer's ENDMARKER has been delivered, which corresponds to the point where the Python iterator raises StopIteration.

CPython: Python/Python-tokenize.c tokenizeriter_next
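The io.EOF convention above suggests a simple drain loop. The sketch below is self-contained, so it uses local stand-ins rather than importing this package: Pos and Token copy the documented struct shapes (with Type simplified to a string), tokenSource is a hypothetical interface naming only the Next method, and sliceIter is a fake iterator used purely for demonstration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Local stand-ins for the documented types. Type is a plain string here
// because the sibling token package's constants are not shown on this page.
type Pos struct{ Line, Col int }

type Token struct {
	Type  string // stand-in for token.Type
	Value string
	Start Pos
	End   Pos
	Line  string
}

// tokenSource is a hypothetical interface: anything with the documented
// Next() (Token, error) method, such as *Iter, would satisfy it.
type tokenSource interface {
	Next() (Token, error)
}

// sliceIter is a fake iterator that replays a fixed slice, then io.EOF.
type sliceIter struct {
	toks []Token
	i    int
}

func (s *sliceIter) Next() (Token, error) {
	if s.i >= len(s.toks) {
		return Token{}, io.EOF
	}
	t := s.toks[s.i]
	s.i++
	return t, nil
}

// collect drains src until io.EOF. Any other error stops the loop and is
// returned together with the tokens read so far.
func collect(src tokenSource) ([]Token, error) {
	var out []Token
	for {
		tok, err := src.Next()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, tok)
	}
}

func main() {
	fake := &sliceIter{toks: []Token{
		{Type: "NAME", Value: "x", Start: Pos{Line: 1, Col: 0}, End: Pos{Line: 1, Col: 1}},
		{Type: "OP", Value: "=", Start: Pos{Line: 1, Col: 2}, End: Pos{Line: 1, Col: 3}},
	}}
	toks, err := collect(fake)
	if err != nil {
		fmt.Println("tokenize error:", err)
		return
	}
	for _, t := range toks {
		fmt.Printf("%s %q\n", t.Type, t.Value)
	}
}
```

Treating io.EOF as normal termination and every other error as a failure keeps lexer errors from being silently swallowed.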

type Pos ¶

type Pos struct {
	Line int
	Col  int
}

Pos is the (line, column) source position of a token boundary. Line is 1-based and Col is 0-based, matching CPython's tokenize.TokenInfo.

CPython: Python/Python-tokenize.c tokenizeriter_next
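Because a position is a (line, column) pair, two positions can be ordered by comparing lines first and columns second. The sketch below uses a local copy of the documented Pos struct; comparePos is a hypothetical helper, not part of this package.

```go
package main

import "fmt"

// Pos is a local copy of the documented struct shape.
type Pos struct {
	Line int // 1-based
	Col  int // 0-based
}

// comparePos is a hypothetical helper. It returns -1 if a comes before b,
// 1 if a comes after b, and 0 if both denote the same position.
func comparePos(a, b Pos) int {
	switch {
	case a.Line < b.Line:
		return -1
	case a.Line > b.Line:
		return 1
	case a.Col < b.Col:
		return -1
	case a.Col > b.Col:
		return 1
	}
	return 0
}

func main() {
	start := Pos{Line: 1, Col: 4}
	end := Pos{Line: 2, Col: 0}
	fmt.Println(comparePos(start, end))
	fmt.Println(comparePos(end, start))
	fmt.Println(comparePos(start, start))
}
```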

type Token ¶

type Token struct {
	Type  token.Type
	Value string
	Start Pos
	End   Pos
	Line  string
}

Token is one record emitted by the iterator. It mirrors the 5-tuple (type, string, start, end, line) that the C wrapper returns.

CPython: Python/Python-tokenize.c tokenizeriter_next
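To show how the five fields relate, the following sketch renders a token as a single line of text. It uses local stand-ins for the documented structs, with Type simplified to a string because token.Type's definition is not shown here. formatToken and its output layout are illustrative assumptions, not an API of this package.

```go
package main

import "fmt"

// Local stand-ins copying the documented struct shapes.
type Pos struct{ Line, Col int }

type Token struct {
	Type  string // stand-in for token.Type
	Value string
	Start Pos
	End   Pos
	Line  string
}

// formatToken is a hypothetical helper that prints the start and end
// positions as "line,col-line,col:", followed by the type name and the
// quoted token text, separated by tabs.
func formatToken(t Token) string {
	return fmt.Sprintf("%d,%d-%d,%d:\t%s\t%q",
		t.Start.Line, t.Start.Col, t.End.Line, t.End.Col, t.Type, t.Value)
}

func main() {
	tok := Token{
		Type:  "NAME",
		Value: "x",
		Start: Pos{Line: 1, Col: 0},
		End:   Pos{Line: 1, Col: 1},
		Line:  "x = 1\n",
	}
	fmt.Println(formatToken(tok))
}
```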

Source Files ¶

tokenize.go
