tokenizer

package
v1.2.0 Latest
Published: Sep 4, 2025 License: MIT Imports: 10 Imported by: 0

Documentation

Overview

Package tokenizer provides a high-performance SQL tokenizer with zero-copy operations

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func PutTokenizer

func PutTokenizer(t *Tokenizer)

PutTokenizer returns a Tokenizer to the pool

Types

type BufferPool

type BufferPool struct {
	// contains filtered or unexported fields
}

BufferPool manages a pool of reusable byte buffers for token content

func NewBufferPool

func NewBufferPool() *BufferPool

NewBufferPool creates a new buffer pool with optimized initial capacity

func (*BufferPool) Get

func (p *BufferPool) Get() []byte

Get retrieves a buffer from the pool

func (*BufferPool) Grow

func (p *BufferPool) Grow(buf []byte, n int) []byte

Grow ensures the buffer has enough capacity

func (*BufferPool) Put

func (p *BufferPool) Put(buf []byte)

Put returns a buffer to the pool
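The Get/Grow/Put trio is the usual sync.Pool byte-buffer recipe. A hedged sketch of how such a pool typically works; the real BufferPool's fields are unexported and its growth policy may differ:

```go
package main

import (
	"fmt"
	"sync"
)

// bufferPool sketches the documented BufferPool API.
type bufferPool struct{ p sync.Pool }

func newBufferPool() *bufferPool {
	return &bufferPool{p: sync.Pool{
		// Pool pointers to slices so Put does not allocate a box.
		New: func() any { b := make([]byte, 0, 256); return &b },
	}}
}

// Get retrieves an empty buffer with pre-allocated capacity.
func (bp *bufferPool) Get() []byte { return (*bp.p.Get().(*[]byte))[:0] }

// Grow ensures buf can hold n more bytes, reallocating if needed.
func (bp *bufferPool) Grow(buf []byte, n int) []byte {
	if cap(buf)-len(buf) >= n {
		return buf
	}
	bigger := make([]byte, len(buf), 2*cap(buf)+n)
	copy(bigger, buf)
	return bigger
}

// Put hands the buffer back for reuse.
func (bp *bufferPool) Put(buf []byte) { bp.p.Put(&buf) }

func main() {
	bp := newBufferPool()
	b := bp.Get()
	b = bp.Grow(b, 1024)
	fmt.Println(cap(b) >= 1024) // true
	bp.Put(b)
}
```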

type DebugLogger

type DebugLogger interface {
	Debug(format string, args ...interface{})
}

DebugLogger is an interface for debug logging

type Error

type Error struct {
	Message  string
	Location models.Location
}

Error represents a tokenization error with location information

func ErrorInvalidIdentifier

func ErrorInvalidIdentifier(value string, location models.Location) *Error

ErrorInvalidIdentifier creates an error for an invalid identifier

func ErrorInvalidNumber

func ErrorInvalidNumber(value string, location models.Location) *Error

ErrorInvalidNumber creates an error for an invalid number format

func ErrorInvalidOperator

func ErrorInvalidOperator(value string, location models.Location) *Error

ErrorInvalidOperator creates an error for an invalid operator

func ErrorUnexpectedChar

func ErrorUnexpectedChar(ch byte, location models.Location) *Error

ErrorUnexpectedChar creates an error for an unexpected character

func ErrorUnterminatedString

func ErrorUnterminatedString(location models.Location) *Error

ErrorUnterminatedString creates an error for an unterminated string

func NewError

func NewError(message string, location models.Location) *Error

NewError creates a new tokenization error

func (*Error) Error

func (e *Error) Error() string
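All the constructors above funnel into the same Message-plus-Location shape. A sketch of a plausible Error() rendering, with a stand-in for models.Location (which lives in another package) and an assumed message format:

```go
package main

import "fmt"

// location stands in for models.Location (Line/Column assumed 1-based).
type location struct{ Line, Column int }

// tokError mirrors the documented Error struct.
type tokError struct {
	Message  string
	Location location
}

// Error formats the message with its position; the exact wording
// here is an assumption, not the package's real output.
func (e *tokError) Error() string {
	return fmt.Sprintf("%s at line %d, column %d",
		e.Message, e.Location.Line, e.Location.Column)
}

// errorUnterminatedString sketches one of the documented constructors.
func errorUnterminatedString(loc location) *tokError {
	return &tokError{Message: "unterminated string literal", Location: loc}
}

func main() {
	err := errorUnterminatedString(location{Line: 3, Column: 9})
	fmt.Println(err) // unterminated string literal at line 3, column 9
}
```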

type Position

type Position struct {
	Line   int
	Index  int
	Column int
	LastNL int // byte offset of last newline
}

Position tracks the scanning cursor:
- Line is 1-based
- Index is 0-based
- Column is 1-based
- LastNL tracks the last newline for efficient column calculation

func NewPosition

func NewPosition(line, index int) Position

NewPosition builds a Position from a line number and byte index

func (*Position) AdvanceN

func (p *Position) AdvanceN(n int, lineStarts []int)

AdvanceN moves forward by n bytes

func (*Position) AdvanceRune

func (p *Position) AdvanceRune(r rune, size int)

AdvanceRune moves the position forward by the given rune, updating line/column efficiently

func (Position) Clone

func (p Position) Clone() Position

Clone makes a copy of Position

func (Position) Location

func (p Position) Location(t *Tokenizer) models.Location

Location gives the models.Location for this position

type StringLiteralReader

type StringLiteralReader struct {
	// contains filtered or unexported fields
}

StringLiteralReader handles reading of string literals with proper escape sequence handling

func NewStringLiteralReader

func NewStringLiteralReader(input []byte, pos *Position, quote rune) *StringLiteralReader

NewStringLiteralReader creates a new StringLiteralReader

func (*StringLiteralReader) ReadStringLiteral

func (r *StringLiteralReader) ReadStringLiteral() (models.Token, error)

ReadStringLiteral reads a string literal with proper escape sequence handling
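A hedged sketch of the escape-handling loop such a reader implies. It uses SQL's doubled-quote convention ('') plus backslash escapes; the package's actual escape rules are not documented on this page:

```go
package main

import (
	"errors"
	"fmt"
)

// readStringLiteral scans input starting at the opening quote and
// returns the unescaped contents plus the number of bytes consumed.
func readStringLiteral(input []byte, quote byte) (string, int, error) {
	if len(input) == 0 || input[0] != quote {
		return "", 0, errors.New("not at a string literal")
	}
	out := make([]byte, 0, len(input))
	i := 1
	for i < len(input) {
		switch {
		case input[i] == quote && i+1 < len(input) && input[i+1] == quote:
			out = append(out, quote) // doubled quote -> literal quote
			i += 2
		case input[i] == quote:
			return string(out), i + 1, nil // closing quote
		case input[i] == '\\' && i+1 < len(input):
			out = append(out, input[i+1]) // backslash escape
			i += 2
		default:
			out = append(out, input[i])
			i++
		}
	}
	return "", i, errors.New("unterminated string literal")
}

func main() {
	s, n, err := readStringLiteral([]byte(`'it''s'`), '\'')
	fmt.Println(s, n, err) // it's 7 <nil>
}
```

Running off the end of the input without a closing quote is exactly the ErrorUnterminatedString case listed above.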

type Tokenizer

type Tokenizer struct {
	// contains filtered or unexported fields
}

Tokenizer provides high-performance SQL tokenization with zero-copy operations

func GetTokenizer

func GetTokenizer() *Tokenizer

GetTokenizer gets a Tokenizer from the pool

func New

func New() (*Tokenizer, error)

New creates a new Tokenizer with default configuration

func NewWithKeywords

func NewWithKeywords(kw *keywords.Keywords) (*Tokenizer, error)

NewWithKeywords initializes a Tokenizer with custom keywords

func (*Tokenizer) Reset

func (t *Tokenizer) Reset()

Reset resets a Tokenizer's state for reuse

func (*Tokenizer) SetDebugLogger

func (t *Tokenizer) SetDebugLogger(logger DebugLogger)

SetDebugLogger sets a debug logger for verbose tracing

func (*Tokenizer) Tokenize

func (t *Tokenizer) Tokenize(input []byte) ([]models.TokenWithSpan, error)

Tokenize processes the input and returns tokens
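The call flow implied by the signatures above is New (or GetTokenizer), then Tokenize over a byte slice, yielding tokens paired with source spans. Since the module's import path is not shown on this page, this sketch instead demonstrates the shape of that output with a toy whitespace splitter, not the package's real lexer:

```go
package main

import "fmt"

// span mirrors the idea behind models.TokenWithSpan: token text plus
// the [start, end) byte offsets it was read from (zero-copy friendly).
type span struct {
	Text       string
	Start, End int
}

// tokenize is a toy stand-in: it splits on single spaces only.
func tokenize(input []byte) []span {
	var out []span
	start := -1
	for i, b := range input {
		if b == ' ' {
			if start >= 0 {
				out = append(out, span{string(input[start:i]), start, i})
				start = -1
			}
			continue
		}
		if start < 0 {
			start = i
		}
	}
	if start >= 0 {
		out = append(out, span{string(input[start:]), start, len(input)})
	}
	return out
}

func main() {
	for _, t := range tokenize([]byte("SELECT id FROM users")) {
		fmt.Printf("%q [%d:%d]\n", t.Text, t.Start, t.End)
	}
}
```

Keeping spans instead of copied substrings is what allows a tokenizer to stay zero-copy: token text can be sliced from the original input on demand.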

type TokenizerError

type TokenizerError struct {
	Message  string
	Location models.Location
}

TokenizerError is a simple error wrapper

func (TokenizerError) Error

func (e TokenizerError) Error() string
