models

package v1.11.1

Published: Mar 14, 2026 License: Apache-2.0 Imports: 0 Imported by: 0

Documentation

Overview

Package models provides the core data structures for SQL tokenization and parsing in GoSQLX.

The fundamental types are Token (a single lexical unit with Type and Value), TokenWithSpan (a Token paired with Start/End Location for precise source positions), Location (1-based line/column coordinates), Span (a source range from one Location to another), and TokenizerError (structured error with position information). TokenType is an integer enumeration that covers all SQL keywords, operators, literals, and punctuation, enabling O(1) switch-based dispatch throughout the tokenizer and parser.

All types are designed with zero-copy operations and object pooling in mind for optimal performance.

Core Components

The package is organized into several key areas:

  • Token Types: Token, TokenType, Word, Keyword for representing lexical units
  • Location Tracking: Location, Span for precise error reporting with line/column information
  • Token Wrappers: TokenWithSpan for tokens with position information
  • Error Types: TokenizerError for tokenization failures
  • Helper Functions: Factory functions for creating tokens efficiently

Performance Characteristics

GoSQLX v1.6.0 achieves exceptional performance metrics:

  • Tokenization: 1.38M+ operations/second sustained, 1.5M peak throughput
  • Memory Efficiency: 60-80% reduction via object pooling
  • Zero-Copy: Direct byte slice operations without string allocation
  • Thread-Safe: All operations are race-free and goroutine-safe
  • Test Coverage: 100% code coverage with comprehensive test suite

Token Type System

The TokenType system supports v1.6.0 features including:

  • PostgreSQL Extensions: JSON/JSONB operators (->/->>/#>/#>>/@>/<@/?/?|/?&/#-), LATERAL, RETURNING
  • SQL-99 Standards: Window functions, CTEs, GROUPING SETS, ROLLUP, CUBE
  • SQL:2003 Features: MERGE statements, FILTER clause, FETCH FIRST/NEXT
  • Multi-Dialect: PostgreSQL, MySQL, SQL Server, Oracle, SQLite keywords

Token types are organized into ranges for efficient categorization:

  • Basic tokens (10-29): WORD, NUMBER, IDENTIFIER, PLACEHOLDER
  • String literals (30-49): Single/double quoted, dollar quoted, hex strings
  • Operators (50-149): Arithmetic, comparison, JSON/JSONB operators
  • Keywords (200-499): SQL keywords organized by category

Location Tracking

Location and Span provide precise position information for error reporting:

  • 1-based indexing: line and column numbers both start at 1 (matching the SQL standard)
  • Spans represent ranges from start to end locations
  • Used extensively in error messages and IDE integration

Usage Examples

Creating tokens with location information:

loc := models.Location{Line: 1, Column: 5}
token := models.NewTokenWithSpan(
    models.TokenTypeSelect,
    "SELECT",
    loc,
    models.Location{Line: 1, Column: 11},
)

Working with token types:

if tokenType.IsKeyword() {
    // Handle SQL keyword
}
if tokenType.IsOperator() {
    // Handle operator
}
if tokenType.IsDMLKeyword() {
    // Handle SELECT, INSERT, UPDATE, DELETE
}

Checking for specific token categories:

// Check for window function keywords
if tokenType.IsWindowKeyword() {
    // Handle OVER, PARTITION BY, ROWS, RANGE, etc.
}

// Check for PostgreSQL JSON operators
switch tokenType {
case models.TokenTypeArrow,        // ->
    models.TokenTypeLongArrow,     // ->>
    models.TokenTypeHashArrow,     // #>
    models.TokenTypeHashLongArrow: // #>>
    // Handle JSON field access (Go cases do not fall through,
    // so the operators are grouped in a single case)
}

Creating error locations:

err := models.TokenizerError{
    Message:  "unexpected character '@'",
    Location: models.Location{Line: 2, Column: 15},
}

PostgreSQL v1.6.0 Features

New token types for PostgreSQL extensions:

  • TokenTypeLateral: LATERAL JOIN support for correlated subqueries
  • TokenTypeReturning: RETURNING clause for INSERT/UPDATE/DELETE
  • TokenTypeArrow, TokenTypeLongArrow: -> and ->> JSON operators
  • TokenTypeHashArrow, TokenTypeHashLongArrow: #> and #>> path operators
  • TokenTypeAtArrow, TokenTypeArrowAt: @> contains and <@ is-contained-by
  • TokenTypeHashMinus: #- delete at path operator
  • TokenTypeAtQuestion: @? JSON path query
  • TokenTypeQuestionAnd, TokenTypeQuestionPipe: ?& and ?| key existence
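As an illustration, the operator spellings above map to token type names as follows. This is a hedged sketch: the map and its name are illustrative, not part of the package API.

```go
package main

import "fmt"

// jsonOps is an illustrative lookup from PostgreSQL JSON/JSONB operator
// text to the corresponding GoSQLX token type name listed above.
var jsonOps = map[string]string{
	"->":  "TokenTypeArrow",
	"->>": "TokenTypeLongArrow",
	"#>":  "TokenTypeHashArrow",
	"#>>": "TokenTypeHashLongArrow",
	"@>":  "TokenTypeAtArrow",
	"<@":  "TokenTypeArrowAt",
	"#-":  "TokenTypeHashMinus",
	"@?":  "TokenTypeAtQuestion",
	"?&":  "TokenTypeQuestionAnd",
	"?|":  "TokenTypeQuestionPipe",
}

func main() {
	fmt.Println(jsonOps["->>"]) // name of the ->> token type
}
```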

SQL Standards Support

SQL-99 (Core + Extensions):

  • Window Functions: OVER, PARTITION BY, ROWS, RANGE, frame clauses
  • CTEs: WITH, RECURSIVE for common table expressions
  • Set Operations: UNION, INTERSECT, EXCEPT with ALL modifier
  • GROUPING SETS: ROLLUP, CUBE for multi-dimensional aggregation
  • Analytic Functions: ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD

SQL:2003 Features:

  • MERGE Statements: MERGE INTO with MATCHED/NOT MATCHED
  • FILTER Clause: Conditional aggregation in window functions
  • FETCH FIRST/NEXT: Standard limit syntax with TIES support
  • Materialized Views: CREATE MATERIALIZED VIEW, REFRESH

Thread Safety

All types in this package are immutable value types and safe for concurrent use:

  • Token, TokenType, Location, Span are all value types
  • No shared mutable state
  • Safe to pass between goroutines
  • Used extensively with object pooling (sync.Pool)
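The sync.Pool pattern referenced above can be sketched as follows. This is a minimal illustration using a local stand-in for models.Token, not the tokenizer's actual pool:

```go
package main

import (
	"fmt"
	"sync"
)

// Token is a local stand-in for models.Token: a plain value type
// with no shared mutable state, so it is safe to pool and to pass
// between goroutines.
type Token struct {
	Type  int
	Value string
}

// tokenPool reuses token slices across parses, the pattern behind the
// memory reduction described above.
var tokenPool = sync.Pool{
	New: func() any { return make([]Token, 0, 64) },
}

func main() {
	tokens := tokenPool.Get().([]Token)
	tokens = append(tokens, Token{Type: 201, Value: "SELECT"})
	fmt.Println(tokens[0].Value)
	tokenPool.Put(tokens[:0]) // truncate before returning so stale data is not reused
}
```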

Integration with Parser

The models package integrates seamlessly with the parser:

// Tokenize SQL
tkz := tokenizer.GetTokenizer()
defer tokenizer.PutTokenizer(tkz)
tokens, err := tkz.Tokenize([]byte(sql))
if err != nil {
    if tokErr, ok := err.(models.TokenizerError); ok {
        // Access error location: tokErr.Location.Line, tokErr.Location.Column
    }
}

// Parse tokens
ast, parseErr := parser.Parse(tokens)
if parseErr != nil {
    // Parser errors include location information
}

Design Philosophy

The models package follows GoSQLX design principles:

  • Zero Dependencies: Only depends on Go standard library
  • Value Types: Immutable structs for safety and performance
  • Explicit Ranges: Token type ranges for O(1) categorization
  • 1-Based Indexing: Matches SQL and editor conventions
  • Clear Semantics: Descriptive names and comprehensive documentation

Testing and Quality

The package maintains exceptional quality standards:

  • 100% Test Coverage: All code paths tested
  • Race Detection: No race conditions (go test -race)
  • Benchmarks: Performance validation for all operations
  • Property Testing: Extensive edge case validation
  • Real-World SQL: Validated against 115+ production queries

For complete examples and advanced usage, see:

  • docs/GETTING_STARTED.md - Quick start guide
  • docs/USAGE_GUIDE.md - Comprehensive usage documentation
  • examples/ directory - Production-ready examples


Types

type Comment added in v1.9.3

type Comment struct {
	// Text is the full comment text including its delimiters.
	// For line comments: includes the leading "--" (e.g., "-- my comment").
	// For block comments: includes "/*" and "*/" delimiters (e.g., "/* my comment */").
	Text string
	// Style indicates whether this is a LineComment (--) or BlockComment (/* */).
	Style CommentStyle
	// Start is the 1-based source location where the comment begins (inclusive).
	Start Location
	// End is the 1-based source location where the comment ends (exclusive).
	End Location
	// Inline is true when the comment appears on the same source line as SQL code,
	// i.e., it is a trailing comment following a statement or clause.
	Inline bool
}

Comment represents a SQL comment captured during tokenization.

Comments are preserved by the tokenizer for use by formatters, LSP servers, and other tools that need to maintain the original SQL structure. Both single-line (--) and multi-line (/* */) comment styles are supported.

Fields:

  • Text: Complete comment text including delimiters (e.g., "-- foo" or "/* bar */")
  • Style: Whether this is a line or block comment
  • Start: Source position where the comment begins (inclusive, 1-based)
  • End: Source position where the comment ends (exclusive, 1-based)
  • Inline: True when the comment appears on the same line as SQL code (trailing comment)

Example:

// Trailing line comment
comment := models.Comment{
    Text:   "-- filter active users",
    Style:  models.LineComment,
    Start:  models.Location{Line: 3, Column: 30},
    End:    models.Location{Line: 3, Column: 52},
    Inline: true,
}

// Stand-alone block comment
comment = models.Comment{
    Text:   "/* Returns all active users */",
    Style:  models.BlockComment,
    Start:  models.Location{Line: 1, Column: 1},
    End:    models.Location{Line: 1, Column: 30},
    Inline: false,
}

type CommentStyle added in v1.9.3

type CommentStyle int

CommentStyle indicates the type of SQL comment syntax used.

There are two styles of SQL comments: single-line comments introduced with -- and multi-line block comments delimited by /* and */.

Example:

// Single-line comment
-- This is a line comment

// Multi-line block comment
/* This is a
   block comment */
const (
	// LineComment represents a -- single-line comment that extends to the end of the line.
	LineComment CommentStyle = iota
	// BlockComment represents a /* multi-line */ comment that can span multiple lines.
	BlockComment
)
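A hypothetical String method (not shown in the package API) illustrates how these constants might be rendered for diagnostics:

```go
package main

import "fmt"

type CommentStyle int

const (
	LineComment  CommentStyle = iota // -- comment
	BlockComment                     // /* comment */
)

// String maps each style to a readable name; illustrative helper only.
func (s CommentStyle) String() string {
	switch s {
	case LineComment:
		return "LineComment"
	case BlockComment:
		return "BlockComment"
	}
	return "Unknown"
}

func main() {
	fmt.Println(LineComment, BlockComment)
}
```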

type Keyword

type Keyword struct {
	// Word is the keyword text in its canonical uppercase form (e.g., "SELECT", "LATERAL").
	Word string
	// Reserved is true for keywords that cannot be used as unquoted identifiers
	// (e.g., SELECT, FROM, WHERE) and false for non-reserved keywords
	// (e.g., RETURNING, LATERAL, FILTER) that are valid as identifiers in some dialects.
	Reserved bool
}

Keyword represents a lexical keyword with its properties.

Keywords are SQL reserved words or dialect-specific keywords that have special meaning in SQL syntax. GoSQLX supports keywords from multiple SQL dialects: PostgreSQL, MySQL, SQL Server, Oracle, and SQLite.

Fields:

  • Word: The keyword text in uppercase (canonical form)
  • Reserved: True if this is a reserved keyword that cannot be used as an identifier

Example:

// Reserved keyword
kw := &models.Keyword{Word: "SELECT", Reserved: true}

// Non-reserved keyword
kw = &models.Keyword{Word: "RETURNING", Reserved: false}

v1.6.0 adds support for PostgreSQL-specific keywords:

  • LATERAL: Correlated subqueries in FROM clause
  • RETURNING: Return modified rows from INSERT/UPDATE/DELETE
  • FILTER: Conditional aggregation in window functions

type Location

type Location struct {
	Line   int // Line number (1-based)
	Column int // Column number (1-based)
}

Location represents a position in the source code using 1-based indexing.

Location is used throughout GoSQLX for precise error reporting and IDE integration. Both Line and Column use 1-based indexing to match SQL standards and editor conventions.

Fields:

  • Line: Line number in source code (starts at 1)
  • Column: Column number within the line (starts at 1)

Example:

loc := models.Location{Line: 5, Column: 20}
// Represents position: line 5, column 20 (5th line, 20th character)

Usage in error reporting:

err := errors.NewError(
    errors.ErrCodeUnexpectedToken,
    "unexpected token",
    models.Location{Line: 1, Column: 15},
)

Integration with LSP (Language Server Protocol):

// Convert to LSP Position (0-based)
lspPos := lsp.Position{
    Line:      location.Line - 1,      // Convert to 0-based
    Character: location.Column - 1,    // Convert to 0-based
}

Performance: Location is a lightweight value type (two ints) that is typically stack-allocated and adds negligible memory overhead.

func (Location) IsZero added in v1.9.3

func (l Location) IsZero() bool

IsZero reports whether the location is the zero value (i.e., no position information). A zero Location has both Line and Column equal to 0.

Example:

loc := models.Location{}
if loc.IsZero() {
    // no position info available
}
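Consistent with that description, IsZero can be sketched with a local stand-in type:

```go
package main

import "fmt"

// Location is a local stand-in for models.Location (1-based line/column).
type Location struct {
	Line   int
	Column int
}

// IsZero mirrors the documented behavior: true only when both fields are 0.
func (l Location) IsZero() bool {
	return l.Line == 0 && l.Column == 0
}

func main() {
	fmt.Println(Location{}.IsZero())                   // zero value: no position info
	fmt.Println(Location{Line: 1, Column: 1}.IsZero()) // a real position
}
```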

type Span

type Span struct {
	Start Location // Start of the span (inclusive)
	End   Location // End of the span (exclusive)
}

Span represents a range in the source code.

Span defines a contiguous region of source code from a Start location to an End location. Used for highlighting ranges in error messages, LSP diagnostics, and code formatting.

Fields:

  • Start: Beginning location of the span (inclusive)
  • End: Ending location of the span (exclusive)

Example:

span := models.Span{
    Start: models.Location{Line: 1, Column: 1},
    End:   models.Location{Line: 1, Column: 7},
}
// Represents "SELECT" token spanning columns 1-6 on line 1

Usage with TokenWithSpan:

token := models.TokenWithSpan{
    Token: models.Token{Type: models.TokenTypeSelect, Value: "SELECT"},
    Start: models.Location{Line: 1, Column: 1},
    End:   models.Location{Line: 1, Column: 7},
}

Helper functions:

span := models.NewSpan(startLoc, endLoc)  // Create new span
emptySpan := models.EmptySpan()            // Create empty span

func EmptySpan

func EmptySpan() Span

EmptySpan returns an empty span with zero values.

Used as a default/placeholder when span information is not available.

Example:

span := models.EmptySpan()
// Equivalent to: Span{Start: Location{}, End: Location{}}

func NewSpan

func NewSpan(start, end Location) Span

NewSpan creates a new span from start to end locations.

Parameters:

  • start: Beginning location (inclusive)
  • end: Ending location (exclusive)

Returns a Span covering the range [start, end).

Example:

start := models.Location{Line: 1, Column: 1}
end := models.Location{Line: 1, Column: 7}
span := models.NewSpan(start, end)
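The half-open [start, end) convention means a simple comparison decides whether a position falls inside a span. A hedged sketch with local stand-in types follows; contains is not part of the shown API:

```go
package main

import "fmt"

type Location struct{ Line, Column int }
type Span struct{ Start, End Location }

// before reports whether a comes before b in source order.
func before(a, b Location) bool {
	return a.Line < b.Line || (a.Line == b.Line && a.Column < b.Column)
}

// contains checks whether loc falls inside the half-open
// range [s.Start, s.End): the start is inclusive, the end exclusive.
func contains(s Span, loc Location) bool {
	return !before(loc, s.Start) && before(loc, s.End)
}

func main() {
	s := Span{Start: Location{1, 1}, End: Location{1, 7}} // "SELECT" on line 1
	fmt.Println(contains(s, Location{1, 6}))              // last character: inside
	fmt.Println(contains(s, Location{1, 7}))              // exclusive end: outside
}
```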

type Token

type Token struct {
	// Type is the TokenType classification of this token (e.g., TokenTypeSelect,
	// TokenTypeNumber, TokenTypeArrow). Use Type for all category checks.
	Type TokenType
	// Value is the raw string representation of the token as it appeared in the
	// SQL source (e.g., "SELECT", "42", "'hello'", "->").
	Value string
	// Word holds keyword or identifier metadata for TokenTypeWord tokens.
	// It is nil for all other token types.
	Word *Word
	// Long is true for TokenTypeNumber tokens whose value exceeds the range of
	// a 32-bit integer and must be interpreted as int64.
	Long bool
	// Quote is the quote character used to delimit the token, for quoted string
	// literals (') and quoted identifiers (", `, [). Zero for unquoted tokens.
	Quote rune
}

Token represents a SQL token with its value and metadata.

Token is the fundamental unit of lexical analysis in GoSQLX. Each token represents a meaningful element in SQL source code: keywords, identifiers, operators, literals, or punctuation.

Tokens are lightweight value types designed for use with object pooling and zero-copy operations. They are immutable and safe for concurrent use.

Fields:

  • Type: The token category (keyword, operator, literal, etc.)
  • Value: The string representation of the token
  • Word: Optional Word struct for keyword/identifier tokens
  • Long: Flag for numeric tokens indicating long integer (int64)
  • Quote: Quote character used for quoted strings/identifiers (' or ")

Example usage:

token := models.Token{
    Type:  models.TokenTypeSelect,
    Value: "SELECT",
}

// Check token category
if token.Type.IsKeyword() {
    fmt.Println("Found SQL keyword:", token.Value)
}

Performance: Tokens are stack-allocated value types with minimal memory overhead. Used extensively with sync.Pool for zero-allocation parsing in hot paths.

func NewToken

func NewToken(tokenType TokenType, value string) Token

NewToken creates a new Token with the given type and value.

Factory function for creating tokens without location information. Useful for testing, manual token construction, or scenarios where position tracking is not needed.

Parameters:

  • tokenType: The TokenType classification
  • value: The string representation of the token

Returns a Token with the specified type and value.

Example:

token := models.NewToken(models.TokenTypeSelect, "SELECT")
// token.Type = TokenTypeSelect, token.Value = "SELECT"

numToken := models.NewToken(models.TokenTypeNumber, "42")
// numToken.Type = TokenTypeNumber, numToken.Value = "42"

type TokenType

type TokenType int

TokenType represents the type of a SQL token.

TokenType is the core classification system for all lexical units in SQL. GoSQLX v1.6.0 supports 500+ distinct token types organized into logical ranges for efficient categorization and type checking.

Token Type Organization:

  • Special (0-9): EOF, UNKNOWN
  • Basic (10-29): WORD, NUMBER, IDENTIFIER, PLACEHOLDER
  • Strings (30-49): Various string literal formats
  • Operators (50-149): Arithmetic, comparison, JSON/JSONB operators
  • Keywords (200-499): SQL keywords by category
  • Data Types (430-449): SQL data type keywords

v1.6.0 PostgreSQL Extensions:

  • JSON/JSONB Operators: ->, ->>, #>, #>>, @>, <@, #-, @?, @@, ?&, ?|
  • LATERAL: Correlated subqueries in FROM clause
  • RETURNING: Return modified rows from DML statements
  • FILTER: Conditional aggregation in window functions
  • DISTINCT ON: PostgreSQL-specific row selection

Performance: TokenType is an int with O(1) lookup via range checking. All Is* methods use constant-time comparisons.

Example usage:

// Check token category
if tokenType.IsKeyword() {
    // Handle SQL keyword
}
if tokenType.IsOperator() {
    // Handle operator (+, -, *, /, ->, etc.)
}

// Check specific categories
if tokenType.IsWindowKeyword() {
    // Handle OVER, PARTITION BY, ROWS, RANGE
}
if tokenType.IsDMLKeyword() {
    // Handle SELECT, INSERT, UPDATE, DELETE
}

// PostgreSQL JSON operators
switch tokenType {
case TokenTypeArrow, TokenTypeLongArrow:
    // Handle -> (JSON field access) and ->> (JSON field as text);
    // the operators share one case because Go cases do not fall through
}
const (
	// TokenRangeBasicStart marks the beginning of basic token types
	TokenRangeBasicStart TokenType = 10
	// TokenRangeBasicEnd marks the end of basic token types (exclusive)
	TokenRangeBasicEnd TokenType = 30

	// TokenRangeStringStart marks the beginning of string literal types
	TokenRangeStringStart TokenType = 30
	// TokenRangeStringEnd marks the end of string literal types (exclusive)
	TokenRangeStringEnd TokenType = 50

	// TokenRangeOperatorStart marks the beginning of operator types
	TokenRangeOperatorStart TokenType = 50
	// TokenRangeOperatorEnd marks the end of operator types (exclusive)
	TokenRangeOperatorEnd TokenType = 150

	// TokenRangeKeywordStart marks the beginning of SQL keyword types
	TokenRangeKeywordStart TokenType = 200
	// TokenRangeKeywordEnd marks the end of SQL keyword types (exclusive)
	TokenRangeKeywordEnd TokenType = 500

	// TokenRangeDataTypeStart marks the beginning of data type keywords
	TokenRangeDataTypeStart TokenType = 430
	// TokenRangeDataTypeEnd marks the end of data type keywords (exclusive)
	TokenRangeDataTypeEnd TokenType = 450
)

Token range constants for maintainability and clarity. These define the boundaries for each category of tokens.
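Given these boundaries, a range check is a plausible sketch of how the Is* category methods work; the helper names here are illustrative:

```go
package main

import "fmt"

type TokenType int

// Boundary values copied from the range constants above.
const (
	TokenRangeOperatorStart TokenType = 50
	TokenRangeOperatorEnd   TokenType = 150
	TokenRangeKeywordStart  TokenType = 200
	TokenRangeKeywordEnd    TokenType = 500
)

// isKeyword and isOperator show O(1) range-based categorization:
// two comparisons per check, regardless of how many token types exist.
func isKeyword(t TokenType) bool {
	return t >= TokenRangeKeywordStart && t < TokenRangeKeywordEnd
}

func isOperator(t TokenType) bool {
	return t >= TokenRangeOperatorStart && t < TokenRangeOperatorEnd
}

func main() {
	fmt.Println(isKeyword(201))  // TokenTypeSelect
	fmt.Println(isOperator(114)) // TokenTypeArrow (->)
}
```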

const (
	// TokenTypeEOF marks the end of the SQL input stream. Every token slice
	// produced by the tokenizer is terminated with an EOF token.
	TokenTypeEOF TokenType = 0
	// TokenTypeUnknown is assigned to tokens that cannot be classified. This
	// typically signals a tokenizer bug or an unsupported input character.
	TokenTypeUnknown TokenType = 1

	// TokenTypeWord represents an unquoted keyword or identifier word token.
	// The Token.Word field holds keyword metadata when the word is a SQL keyword.
	TokenTypeWord TokenType = 10
	// TokenTypeNumber represents a numeric literal (integer or floating-point).
	// Token.Long is true when the value requires int64 representation.
	TokenTypeNumber TokenType = 11
	// TokenTypeChar represents a single character token that does not fit other categories.
	TokenTypeChar TokenType = 12
	// TokenTypeWhitespace represents whitespace (spaces, newlines, tabs) or comments.
	TokenTypeWhitespace TokenType = 13
	// TokenTypeIdentifier represents a quoted identifier such as "column name" or `table`.
	TokenTypeIdentifier TokenType = 14
	// TokenTypePlaceholder represents a query parameter placeholder such as ? or $1.
	TokenTypePlaceholder TokenType = 15

	// TokenTypeString is the generic string literal type used when the specific
	// quoting style is not important (e.g., for dialect-agnostic processing).
	TokenTypeString TokenType = 30
	// TokenTypeSingleQuotedString represents a SQL string literal enclosed in single quotes: 'value'.
	TokenTypeSingleQuotedString TokenType = 31
	// TokenTypeDoubleQuotedString represents a string enclosed in double quotes: "value".
	// In standard SQL this is a quoted identifier; in MySQL it can be a string literal.
	TokenTypeDoubleQuotedString TokenType = 32
	// TokenTypeTripleSingleQuotedString represents a string enclosed in triple single quotes: '''value'''.
	TokenTypeTripleSingleQuotedString TokenType = 33
	// TokenTypeTripleDoubleQuotedString represents a string enclosed in triple double quotes: """value""".
	TokenTypeTripleDoubleQuotedString TokenType = 34
	// TokenTypeDollarQuotedString represents a PostgreSQL dollar-quoted string: $$value$$ or $tag$value$tag$.
	TokenTypeDollarQuotedString TokenType = 35
	// TokenTypeByteStringLiteral represents a byte string literal such as b'bytes' (BigQuery).
	TokenTypeByteStringLiteral TokenType = 36
	// TokenTypeNationalStringLiteral represents an ANSI national character set string: N'value'.
	TokenTypeNationalStringLiteral TokenType = 37
	// TokenTypeEscapedStringLiteral represents a PostgreSQL escaped string: E'value\n'.
	TokenTypeEscapedStringLiteral TokenType = 38
	// TokenTypeUnicodeStringLiteral represents an ANSI Unicode string: U&'value'.
	TokenTypeUnicodeStringLiteral TokenType = 39
	// TokenTypeHexStringLiteral represents a hexadecimal string literal: X'DEADBEEF'.
	TokenTypeHexStringLiteral TokenType = 40

	// TokenTypeOperator is the generic operator type for operators not covered by a more specific constant.
	TokenTypeOperator TokenType = 50
	// TokenTypeComma represents the , separator used in lists and clauses.
	TokenTypeComma TokenType = 51
	// TokenTypeEq represents the = equality or assignment operator.
	TokenTypeEq TokenType = 52
	// TokenTypeDoubleEq represents the == equality operator (MySQL, SQLite).
	TokenTypeDoubleEq TokenType = 53
	// TokenTypeNeq represents the <> or != inequality operator.
	TokenTypeNeq TokenType = 54
	// TokenTypeLt represents the < less-than comparison operator.
	TokenTypeLt TokenType = 55
	// TokenTypeGt represents the > greater-than comparison operator.
	TokenTypeGt TokenType = 56
	// TokenTypeLtEq represents the <= less-than-or-equal comparison operator.
	TokenTypeLtEq TokenType = 57
	// TokenTypeGtEq represents the >= greater-than-or-equal comparison operator.
	TokenTypeGtEq TokenType = 58
	// TokenTypeSpaceship represents the <=> NULL-safe equality operator (MySQL).
	TokenTypeSpaceship TokenType = 59
	// TokenTypePlus represents the + addition operator.
	TokenTypePlus TokenType = 60
	// TokenTypeMinus represents the - subtraction or negation operator.
	TokenTypeMinus TokenType = 61
	// TokenTypeMul represents the * multiplication operator or SELECT * wildcard.
	TokenTypeMul TokenType = 62
	// TokenTypeDiv represents the / division operator.
	TokenTypeDiv TokenType = 63
	// TokenTypeDuckIntDiv represents the // integer division operator (DuckDB).
	TokenTypeDuckIntDiv TokenType = 64
	// TokenTypeMod represents the % modulo operator.
	TokenTypeMod TokenType = 65
	// TokenTypeStringConcat represents the || string concatenation operator (SQL standard).
	TokenTypeStringConcat TokenType = 66
	// TokenTypeLParen represents the ( left parenthesis.
	TokenTypeLParen    TokenType = 67
	TokenTypeLeftParen TokenType = 67 // TokenTypeLeftParen is an alias for TokenTypeLParen for backward compatibility.
	// TokenTypeRParen represents the ) right parenthesis.
	TokenTypeRParen     TokenType = 68
	TokenTypeRightParen TokenType = 68 // TokenTypeRightParen is an alias for TokenTypeRParen for backward compatibility.
	// TokenTypePeriod represents the . dot/period used for qualified names (schema.table.column).
	TokenTypePeriod TokenType = 69
	TokenTypeDot    TokenType = 69 // TokenTypeDot is an alias for TokenTypePeriod for backward compatibility.
	// TokenTypeColon represents the : colon used in named parameters (:param) and slices.
	TokenTypeColon TokenType = 70
	// TokenTypeDoubleColon represents the :: PostgreSQL type cast operator (expr::type).
	TokenTypeDoubleColon TokenType = 71
	// TokenTypeAssignment represents the := assignment operator used in PL/SQL and named arguments.
	TokenTypeAssignment TokenType = 72
	// TokenTypeSemicolon represents the ; statement terminator.
	TokenTypeSemicolon TokenType = 73
	// TokenTypeBackslash represents the \ backslash character.
	TokenTypeBackslash TokenType = 74
	// TokenTypeLBracket represents the [ left square bracket used for array subscripts and array literals.
	TokenTypeLBracket TokenType = 75
	// TokenTypeRBracket represents the ] right square bracket.
	TokenTypeRBracket TokenType = 76
	// TokenTypeAmpersand represents the & bitwise AND operator.
	TokenTypeAmpersand TokenType = 77
	// TokenTypePipe represents the | bitwise OR operator.
	TokenTypePipe TokenType = 78
	// TokenTypeCaret represents the ^ bitwise XOR or exponentiation operator.
	TokenTypeCaret TokenType = 79
	// TokenTypeLBrace represents the { left curly brace used in JSON literals and format strings.
	TokenTypeLBrace TokenType = 80
	// TokenTypeRBrace represents the } right curly brace.
	TokenTypeRBrace TokenType = 81
	// TokenTypeRArrow represents the => fat arrow used in named argument syntax.
	TokenTypeRArrow TokenType = 82
	// TokenTypeSharp represents the # hash character used in PostgreSQL path operators.
	TokenTypeSharp TokenType = 83
	// TokenTypeTilde represents the ~ regular expression match operator (PostgreSQL).
	TokenTypeTilde TokenType = 84
	// TokenTypeExclamationMark represents the ! logical NOT or factorial operator.
	TokenTypeExclamationMark TokenType = 85
	// TokenTypeAtSign represents the @ at-sign used in PostgreSQL full-text search and JSON operators.
	TokenTypeAtSign TokenType = 86
	// TokenTypeQuestion represents the ? parameter placeholder (JDBC) and JSON key existence operator (PostgreSQL).
	TokenTypeQuestion TokenType = 87

	// TokenTypeTildeAsterisk represents the ~* case-insensitive regex match operator (PostgreSQL).
	TokenTypeTildeAsterisk TokenType = 100
	// TokenTypeExclamationMarkTilde represents the !~ regex non-match operator (PostgreSQL).
	TokenTypeExclamationMarkTilde TokenType = 101
	// TokenTypeExclamationMarkTildeAsterisk represents the !~* case-insensitive regex non-match operator (PostgreSQL).
	TokenTypeExclamationMarkTildeAsterisk TokenType = 102
	// TokenTypeDoubleTilde represents the ~~ LIKE operator alias (PostgreSQL).
	TokenTypeDoubleTilde TokenType = 103
	// TokenTypeDoubleTildeAsterisk represents the ~~* ILIKE operator alias (PostgreSQL).
	TokenTypeDoubleTildeAsterisk TokenType = 104
	// TokenTypeExclamationMarkDoubleTilde represents the !~~ NOT LIKE operator alias (PostgreSQL).
	TokenTypeExclamationMarkDoubleTilde TokenType = 105
	// TokenTypeExclamationMarkDoubleTildeAsterisk represents the !~~* NOT ILIKE operator alias (PostgreSQL).
	TokenTypeExclamationMarkDoubleTildeAsterisk TokenType = 106
	// TokenTypeShiftLeft represents the << bitwise left-shift operator.
	TokenTypeShiftLeft TokenType = 107
	// TokenTypeShiftRight represents the >> bitwise right-shift operator.
	TokenTypeShiftRight TokenType = 108
	// TokenTypeOverlap represents the && range overlap operator (PostgreSQL).
	TokenTypeOverlap TokenType = 109
	// TokenTypeDoubleExclamationMark represents the !! prefix factorial operator (PostgreSQL).
	TokenTypeDoubleExclamationMark TokenType = 110
	// TokenTypeCaretAt represents the ^@ starts-with string operator (PostgreSQL 11+).
	TokenTypeCaretAt TokenType = 111
	// TokenTypePGSquareRoot represents the |/ square root prefix operator (PostgreSQL).
	TokenTypePGSquareRoot TokenType = 112
	// TokenTypePGCubeRoot represents the ||/ cube root prefix operator (PostgreSQL).
	TokenTypePGCubeRoot TokenType = 113

	// TokenTypeArrow represents the -> operator that returns a JSON field value as a JSON object.
	// Example: data->'name' returns the "name" field as JSON.
	TokenTypeArrow TokenType = 114
	// TokenTypeLongArrow represents the ->> operator that returns a JSON field value as text.
	// Example: data->>'name' returns the "name" field as a text string.
	TokenTypeLongArrow TokenType = 115
	// TokenTypeHashArrow represents the #> operator that returns a JSON value at a path as JSON.
	// Example: data#>'{address,city}' returns the nested value as JSON.
	TokenTypeHashArrow TokenType = 116
	// TokenTypeHashLongArrow represents the #>> operator that returns a JSON value at a path as text.
	// Example: data#>>'{address,city}' returns the nested value as text.
	TokenTypeHashLongArrow TokenType = 117
	// TokenTypeAtArrow represents the @> containment operator: left JSON value contains right.
	// Example: data @> '{"status":"active"}' checks if data contains the given JSON.
	TokenTypeAtArrow TokenType = 118
	// TokenTypeArrowAt represents the <@ containment operator: left JSON value is contained by right.
	// Example: '{"a":1}' <@ data checks if the left-hand JSON is a subset of data.
	TokenTypeArrowAt TokenType = 119
	// TokenTypeHashMinus represents the #- operator that deletes a key or index at the given path.
	// Example: data #- '{address,zip}' removes the "zip" key from the nested "address" object.
	TokenTypeHashMinus TokenType = 120
	// TokenTypeAtQuestion represents the @? operator that tests whether a JSON path returns any values.
	// Example: data @? '$.address.city' checks whether the path produces a result.
	TokenTypeAtQuestion TokenType = 121
	// TokenTypeAtAt represents the @@ operator used for full-text search matching.
	// Example: to_tsvector(text) @@ to_tsquery('query').
	TokenTypeAtAt TokenType = 122
	// TokenTypeQuestionAnd represents the ?& operator that checks whether all given keys exist.
	// Example: data ?& array['name','email'] returns true if both keys exist in the JSON object.
	TokenTypeQuestionAnd TokenType = 123
	// TokenTypeQuestionPipe represents the ?| operator that checks whether any of the given keys exist.
	// Example: data ?| array['name','email'] returns true if at least one key exists.
	TokenTypeQuestionPipe TokenType = 124
	// TokenTypeCustomBinaryOperator represents a user-defined or dialect-specific binary operator
	// not covered by any other constant (e.g., custom PostgreSQL operators).
	TokenTypeCustomBinaryOperator TokenType = 125

	// TokenTypeKeyword is the generic keyword token type for words that are recognised
	// as SQL keywords but do not have a more specific constant assigned.
	TokenTypeKeyword TokenType = 200
	// TokenTypeSelect represents the SELECT keyword that begins a query.
	TokenTypeSelect TokenType = 201
	// TokenTypeFrom represents the FROM keyword that introduces the table source.
	TokenTypeFrom TokenType = 202
	// TokenTypeWhere represents the WHERE keyword that begins the filter condition.
	TokenTypeWhere TokenType = 203
	// TokenTypeJoin represents the JOIN keyword (typically preceded by INNER, LEFT, etc.).
	TokenTypeJoin TokenType = 204
	// TokenTypeInner represents the INNER keyword used in INNER JOIN.
	TokenTypeInner TokenType = 205
	// TokenTypeLeft represents the LEFT keyword used in LEFT JOIN and LEFT OUTER JOIN.
	TokenTypeLeft TokenType = 206
	// TokenTypeRight represents the RIGHT keyword used in RIGHT JOIN and RIGHT OUTER JOIN.
	TokenTypeRight TokenType = 207
	// TokenTypeOuter represents the OUTER keyword used in LEFT/RIGHT/FULL OUTER JOIN.
	TokenTypeOuter TokenType = 208
	// TokenTypeOn represents the ON keyword that introduces a join condition.
	TokenTypeOn TokenType = 209
	// TokenTypeAs represents the AS keyword used in aliases (table AS alias, column AS alias).
	TokenTypeAs TokenType = 210
	// TokenTypeAnd represents the AND logical operator combining conditions.
	TokenTypeAnd TokenType = 211
	// TokenTypeOr represents the OR logical operator combining conditions.
	TokenTypeOr TokenType = 212
	// TokenTypeNot represents the NOT logical negation operator.
	TokenTypeNot TokenType = 213
	// TokenTypeIn represents the IN operator for membership tests (expr IN (list)).
	TokenTypeIn TokenType = 214
	// TokenTypeLike represents the LIKE pattern-matching operator.
	TokenTypeLike TokenType = 215
	// TokenTypeBetween represents the BETWEEN range operator (expr BETWEEN low AND high).
	TokenTypeBetween TokenType = 216
	// TokenTypeIs represents the IS operator used with NULL, TRUE, FALSE.
	TokenTypeIs TokenType = 217
	// TokenTypeNull represents the NULL literal value.
	TokenTypeNull TokenType = 218
	// TokenTypeTrue represents the TRUE boolean literal.
	TokenTypeTrue TokenType = 219
	// TokenTypeFalse represents the FALSE boolean literal.
	TokenTypeFalse TokenType = 220
	// TokenTypeCase represents the CASE keyword beginning a conditional expression.
	TokenTypeCase TokenType = 221
	// TokenTypeWhen represents the WHEN keyword inside a CASE expression.
	TokenTypeWhen TokenType = 222
	// TokenTypeThen represents the THEN keyword inside a CASE WHEN clause.
	TokenTypeThen TokenType = 223
	// TokenTypeElse represents the ELSE keyword for the default branch in a CASE expression.
	TokenTypeElse TokenType = 224
	// TokenTypeEnd represents the END keyword closing a CASE expression or block.
	TokenTypeEnd TokenType = 225
	// TokenTypeGroup represents the GROUP keyword as part of GROUP BY.
	TokenTypeGroup TokenType = 226
	// TokenTypeBy represents the BY keyword used with GROUP BY and ORDER BY.
	TokenTypeBy TokenType = 227
	// TokenTypeHaving represents the HAVING keyword for filtering grouped results.
	TokenTypeHaving TokenType = 228
	// TokenTypeOrder represents the ORDER keyword as part of ORDER BY.
	TokenTypeOrder TokenType = 229
	// TokenTypeAsc represents the ASC sort direction keyword (ascending order).
	TokenTypeAsc TokenType = 230
	// TokenTypeDesc represents the DESC sort direction keyword (descending order).
	TokenTypeDesc TokenType = 231
	// TokenTypeLimit represents the LIMIT keyword for restricting result count (MySQL, PostgreSQL, SQLite).
	TokenTypeLimit TokenType = 232
	// TokenTypeOffset represents the OFFSET keyword for skipping rows in a result set.
	TokenTypeOffset TokenType = 233

	// TokenTypeInsert represents the INSERT keyword beginning an INSERT statement.
	TokenTypeInsert TokenType = 234
	// TokenTypeUpdate represents the UPDATE keyword beginning an UPDATE statement.
	TokenTypeUpdate TokenType = 235
	// TokenTypeDelete represents the DELETE keyword beginning a DELETE statement.
	TokenTypeDelete TokenType = 236
	// TokenTypeInto represents the INTO keyword used in INSERT INTO and other clauses.
	TokenTypeInto TokenType = 237
	// TokenTypeValues represents the VALUES keyword introducing a list of row values.
	TokenTypeValues TokenType = 238
	// TokenTypeSet represents the SET keyword introducing column assignments in UPDATE.
	TokenTypeSet TokenType = 239

	// TokenTypeCreate represents the CREATE keyword beginning a DDL creation statement.
	TokenTypeCreate TokenType = 240
	// TokenTypeAlter represents the ALTER keyword beginning a DDL modification statement.
	TokenTypeAlter TokenType = 241
	// TokenTypeDrop represents the DROP keyword beginning a DDL deletion statement.
	TokenTypeDrop TokenType = 242
	// TokenTypeTable represents the TABLE keyword used in DDL statements (CREATE TABLE, etc.).
	TokenTypeTable TokenType = 243
	// TokenTypeIndex represents the INDEX keyword used in CREATE/DROP INDEX statements.
	TokenTypeIndex TokenType = 244
	// TokenTypeView represents the VIEW keyword used in CREATE/DROP VIEW statements.
	TokenTypeView TokenType = 245
	// TokenTypeColumn represents the COLUMN keyword used in ALTER TABLE ADD/DROP COLUMN.
	TokenTypeColumn TokenType = 246
	// TokenTypeDatabase represents the DATABASE keyword used in CREATE/DROP DATABASE.
	TokenTypeDatabase TokenType = 247
	// TokenTypeSchema represents the SCHEMA keyword used in CREATE/DROP SCHEMA.
	TokenTypeSchema TokenType = 248
	// TokenTypeTrigger represents the TRIGGER keyword used in CREATE/DROP TRIGGER.
	TokenTypeTrigger TokenType = 249

	// TokenTypeCount represents the COUNT aggregate function.
	TokenTypeCount TokenType = 250
	// TokenTypeSum represents the SUM aggregate function.
	TokenTypeSum TokenType = 251
	// TokenTypeAvg represents the AVG (average) aggregate function.
	TokenTypeAvg TokenType = 252
	// TokenTypeMin represents the MIN aggregate function returning the smallest value.
	TokenTypeMin TokenType = 253
	// TokenTypeMax represents the MAX aggregate function returning the largest value.
	TokenTypeMax TokenType = 254

	// TokenTypeGroupBy represents the compound GROUP BY keyword pair.
	TokenTypeGroupBy TokenType = 270
	// TokenTypeOrderBy represents the compound ORDER BY keyword pair.
	TokenTypeOrderBy TokenType = 271
	// TokenTypeLeftJoin represents the compound LEFT JOIN keyword pair.
	TokenTypeLeftJoin TokenType = 272
	// TokenTypeRightJoin represents the compound RIGHT JOIN keyword pair.
	TokenTypeRightJoin TokenType = 273
	// TokenTypeInnerJoin represents the compound INNER JOIN keyword pair.
	TokenTypeInnerJoin TokenType = 274
	// TokenTypeOuterJoin represents the compound OUTER JOIN keyword pair.
	TokenTypeOuterJoin TokenType = 275
	// TokenTypeFullJoin represents the compound FULL JOIN keyword pair.
	TokenTypeFullJoin TokenType = 276
	// TokenTypeCrossJoin represents the compound CROSS JOIN keyword pair.
	TokenTypeCrossJoin TokenType = 277

	// TokenTypeWith represents the WITH keyword beginning a Common Table Expression (CTE).
	TokenTypeWith TokenType = 280
	// TokenTypeRecursive represents the RECURSIVE modifier in WITH RECURSIVE CTEs.
	TokenTypeRecursive TokenType = 281
	// TokenTypeUnion represents the UNION set operation combining two result sets.
	TokenTypeUnion TokenType = 282
	// TokenTypeExcept represents the EXCEPT set operation returning rows in the left set not in the right.
	TokenTypeExcept TokenType = 283
	// TokenTypeIntersect represents the INTERSECT set operation returning rows present in both sets.
	TokenTypeIntersect TokenType = 284
	// TokenTypeAll represents the ALL modifier used with UNION/EXCEPT/INTERSECT and quantified predicates.
	TokenTypeAll TokenType = 285

	// TokenTypeOver represents the OVER keyword introducing a window specification.
	TokenTypeOver TokenType = 300
	// TokenTypePartition represents the PARTITION keyword in PARTITION BY window clause.
	TokenTypePartition TokenType = 301
	// TokenTypeRows represents the ROWS mode in a window frame (physical row offsets).
	TokenTypeRows TokenType = 302
	// TokenTypeRange represents the RANGE mode in a window frame (logical value offsets).
	TokenTypeRange TokenType = 303
	// TokenTypeUnbounded represents UNBOUNDED in window frames (UNBOUNDED PRECEDING/FOLLOWING).
	TokenTypeUnbounded TokenType = 304
	// TokenTypePreceding represents PRECEDING in window frame bounds.
	TokenTypePreceding TokenType = 305
	// TokenTypeFollowing represents FOLLOWING in window frame bounds.
	TokenTypeFollowing TokenType = 306
	// TokenTypeCurrent represents CURRENT in CURRENT ROW frame bound.
	TokenTypeCurrent TokenType = 307
	// TokenTypeRow represents ROW in the CURRENT ROW window frame bound.
	TokenTypeRow TokenType = 308
	// TokenTypeGroups represents the GROUPS mode in a window frame (peer group offsets, SQL:2011).
	TokenTypeGroups TokenType = 309
	// TokenTypeFilter represents the FILTER keyword for conditional aggregation (e.g., COUNT(*) FILTER (WHERE ...)).
	TokenTypeFilter TokenType = 310
	// TokenTypeExclude represents the EXCLUDE keyword in window frame EXCLUDE clauses.
	TokenTypeExclude TokenType = 311

	// TokenTypeCross represents the CROSS keyword used in CROSS JOIN.
	TokenTypeCross TokenType = 320
	// TokenTypeNatural represents the NATURAL keyword used in NATURAL JOIN (joins on all matching column names).
	TokenTypeNatural TokenType = 321
	// TokenTypeFull represents the FULL keyword used in FULL OUTER JOIN.
	TokenTypeFull TokenType = 322
	// TokenTypeUsing represents the USING keyword that specifies shared column names in a JOIN.
	TokenTypeUsing TokenType = 323
	// TokenTypeLateral represents the LATERAL keyword allowing correlated subqueries in the FROM clause.
	// Example: FROM users u, LATERAL (SELECT * FROM orders WHERE user_id = u.id) o
	TokenTypeLateral TokenType = 324

	// TokenTypePrimary represents the PRIMARY keyword in PRIMARY KEY constraints.
	TokenTypePrimary TokenType = 330
	// TokenTypeKey represents the KEY keyword in PRIMARY KEY and FOREIGN KEY constraints.
	TokenTypeKey TokenType = 331
	// TokenTypeForeign represents the FOREIGN keyword in FOREIGN KEY constraints.
	TokenTypeForeign TokenType = 332
	// TokenTypeReferences represents the REFERENCES keyword in FOREIGN KEY constraints.
	TokenTypeReferences TokenType = 333
	// TokenTypeUnique represents the UNIQUE constraint keyword.
	TokenTypeUnique TokenType = 334
	// TokenTypeCheck represents the CHECK constraint keyword.
	TokenTypeCheck TokenType = 335
	// TokenTypeDefault represents the DEFAULT constraint keyword specifying a default column value.
	TokenTypeDefault TokenType = 336
	// TokenTypeAutoIncrement represents the AUTO_INCREMENT column attribute (MySQL).
	// In PostgreSQL, the equivalent is SERIAL or GENERATED ALWAYS AS IDENTITY.
	TokenTypeAutoIncrement TokenType = 337
	// TokenTypeConstraint represents the CONSTRAINT keyword that names a table constraint.
	TokenTypeConstraint TokenType = 338
	// TokenTypeNotNull represents the NOT NULL constraint keyword pair.
	TokenTypeNotNull TokenType = 339
	// TokenTypeNullable represents the NULLABLE keyword (some dialects allow explicit nullable columns).
	TokenTypeNullable TokenType = 340

	// TokenTypeDistinct represents the DISTINCT keyword for removing duplicate rows.
	TokenTypeDistinct TokenType = 350
	// TokenTypeExists represents the EXISTS keyword for subquery existence tests.
	TokenTypeExists TokenType = 351
	// TokenTypeAny represents the ANY quantifier used with comparison operators and subqueries.
	TokenTypeAny TokenType = 352
	// TokenTypeSome represents the SOME quantifier (synonym for ANY in most dialects).
	TokenTypeSome TokenType = 353
	// TokenTypeCast represents the CAST keyword for explicit type conversion (CAST(expr AS type)).
	TokenTypeCast TokenType = 354
	// TokenTypeConvert represents the CONVERT keyword for type or charset conversion (MySQL, SQL Server).
	TokenTypeConvert TokenType = 355
	// TokenTypeCollate represents the COLLATE keyword specifying a collation for comparisons.
	TokenTypeCollate TokenType = 356
	// TokenTypeCascade represents the CASCADE option in DROP and constraint definitions.
	TokenTypeCascade TokenType = 357
	// TokenTypeRestrict represents the RESTRICT option preventing drops when dependent objects exist.
	TokenTypeRestrict TokenType = 358
	// TokenTypeReplace represents the REPLACE keyword used in INSERT OR REPLACE and REPLACE INTO (MySQL).
	TokenTypeReplace TokenType = 359
	// TokenTypeRename represents the RENAME keyword used in ALTER TABLE RENAME.
	TokenTypeRename TokenType = 360
	// TokenTypeTo represents the TO keyword used in RENAME ... TO and GRANT ... TO.
	TokenTypeTo TokenType = 361
	// TokenTypeIf represents the IF keyword used in IF EXISTS and IF NOT EXISTS clauses.
	TokenTypeIf TokenType = 362
	// TokenTypeOnly represents the ONLY keyword used in inheritance-aware queries (PostgreSQL).
	TokenTypeOnly TokenType = 363
	// TokenTypeFor represents the FOR keyword used in FOR UPDATE, FOR SHARE, and FETCH FOR.
	TokenTypeFor TokenType = 364
	// TokenTypeNulls represents the NULLS keyword used in NULLS FIRST / NULLS LAST ordering.
	TokenTypeNulls TokenType = 365
	// TokenTypeFirst represents the FIRST keyword used in NULLS FIRST and FETCH FIRST.
	TokenTypeFirst TokenType = 366
	// TokenTypeLast represents the LAST keyword used in NULLS LAST.
	TokenTypeLast TokenType = 367
	// TokenTypeFetch represents the FETCH keyword beginning a FETCH FIRST/NEXT clause (SQL standard LIMIT).
	// Example: FETCH FIRST 10 ROWS ONLY
	TokenTypeFetch TokenType = 368
	// TokenTypeNext represents the NEXT keyword used in FETCH NEXT ... ROWS ONLY.
	TokenTypeNext TokenType = 369

	// TokenTypeMerge represents the MERGE keyword beginning a MERGE statement.
	TokenTypeMerge TokenType = 370
	// TokenTypeMatched represents the MATCHED keyword in WHEN MATCHED and WHEN NOT MATCHED clauses.
	TokenTypeMatched TokenType = 371
	// TokenTypeTarget represents the TARGET keyword (used in some dialect MERGE syntax).
	TokenTypeTarget TokenType = 372
	// TokenTypeSource represents the SOURCE keyword (used in some dialect MERGE syntax).
	TokenTypeSource TokenType = 373

	// TokenTypeMaterialized represents the MATERIALIZED keyword in CREATE/DROP/REFRESH MATERIALIZED VIEW.
	TokenTypeMaterialized TokenType = 374
	// TokenTypeRefresh represents the REFRESH keyword in REFRESH MATERIALIZED VIEW.
	TokenTypeRefresh TokenType = 375
	// TokenTypeTies represents the TIES keyword in FETCH FIRST n ROWS WITH TIES.
	// WITH TIES causes the last group of rows with equal ordering values to all be returned.
	TokenTypeTies TokenType = 376
	// TokenTypePercent represents the PERCENT keyword in FETCH FIRST n PERCENT ROWS ONLY.
	TokenTypePercent TokenType = 377
	// TokenTypeTruncate represents the TRUNCATE keyword beginning a TRUNCATE TABLE statement.
	TokenTypeTruncate TokenType = 378
	// TokenTypeReturning represents the RETURNING keyword in PostgreSQL INSERT/UPDATE/DELETE statements.
	// RETURNING causes the modified rows to be returned as a result set.
	TokenTypeReturning TokenType = 379

	// TokenTypeShare represents the SHARE keyword in FOR SHARE row locking.
	TokenTypeShare TokenType = 380
	// TokenTypeNoWait represents the NOWAIT keyword causing an immediate error instead of waiting
	// for locked rows (FOR UPDATE NOWAIT, FOR SHARE NOWAIT).
	TokenTypeNoWait TokenType = 381
	// TokenTypeSkip represents the SKIP keyword in FOR UPDATE SKIP LOCKED,
	// causing locked rows to be silently skipped.
	TokenTypeSkip TokenType = 382
	// TokenTypeLocked represents the LOCKED keyword in SKIP LOCKED.
	TokenTypeLocked TokenType = 383
	// TokenTypeOf represents the OF keyword in FOR UPDATE OF table_name,
	// restricting locking to specific tables in a JOIN.
	TokenTypeOf TokenType = 384

	// TokenTypeGroupingSets represents the GROUPING SETS keyword pair for
	// specifying explicit grouping combinations in GROUP BY.
	TokenTypeGroupingSets TokenType = 390
	// TokenTypeRollup represents the ROLLUP keyword for hierarchical grouping subtotals.
	// Example: GROUP BY ROLLUP (year, quarter, month)
	TokenTypeRollup TokenType = 391
	// TokenTypeCube represents the CUBE keyword for all possible grouping combinations.
	// Example: GROUP BY CUBE (region, product)
	TokenTypeCube TokenType = 392
	// TokenTypeGrouping represents the GROUPING function keyword that indicates whether
	// a column is aggregated in a GROUPING SETS/ROLLUP/CUBE expression.
	TokenTypeGrouping TokenType = 393
	// TokenTypeSets represents the SETS keyword used in GROUPING SETS (...).
	TokenTypeSets TokenType = 394
	// TokenTypeArray represents the ARRAY keyword for PostgreSQL array constructors.
	// Example: ARRAY[1, 2, 3] or ARRAY(SELECT id FROM users)
	TokenTypeArray TokenType = 395
	// TokenTypeWithin represents the WITHIN keyword in ordered-set aggregate functions.
	// Example: PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary)
	TokenTypeWithin TokenType = 396

	// TokenTypeRole represents the ROLE keyword in CREATE/ALTER/DROP ROLE statements.
	TokenTypeRole TokenType = 400
	// TokenTypeUser represents the USER keyword in CREATE/ALTER/DROP USER statements.
	TokenTypeUser TokenType = 401
	// TokenTypeGrant represents the GRANT keyword for granting privileges to roles/users.
	TokenTypeGrant TokenType = 402
	// TokenTypeRevoke represents the REVOKE keyword for revoking previously granted privileges.
	TokenTypeRevoke TokenType = 403
	// TokenTypePrivilege represents the PRIVILEGE keyword in GRANT/REVOKE statements.
	TokenTypePrivilege TokenType = 404
	// TokenTypePassword represents the PASSWORD keyword in ALTER USER ... PASSWORD statements.
	TokenTypePassword TokenType = 405
	// TokenTypeLogin represents the LOGIN option keyword in CREATE/ALTER ROLE ... LOGIN.
	TokenTypeLogin TokenType = 406
	// TokenTypeSuperuser represents the SUPERUSER option keyword in CREATE/ALTER ROLE.
	TokenTypeSuperuser TokenType = 407
	// TokenTypeCreateDB represents the CREATEDB option keyword allowing a role to create databases.
	TokenTypeCreateDB TokenType = 408
	// TokenTypeCreateRole represents the CREATEROLE option keyword allowing a role to create other roles.
	TokenTypeCreateRole TokenType = 409

	// TokenTypeBegin represents the BEGIN keyword starting an explicit transaction block.
	TokenTypeBegin TokenType = 420
	// TokenTypeCommit represents the COMMIT keyword permanently saving a transaction.
	TokenTypeCommit TokenType = 421
	// TokenTypeRollback represents the ROLLBACK keyword undoing a transaction.
	TokenTypeRollback TokenType = 422
	// TokenTypeSavepoint represents the SAVEPOINT keyword creating a named transaction savepoint.
	TokenTypeSavepoint TokenType = 423

	// TokenTypeInt represents the INT data type keyword (32-bit signed integer).
	TokenTypeInt TokenType = 430
	// TokenTypeInteger represents the INTEGER data type keyword (synonym for INT in most dialects).
	TokenTypeInteger TokenType = 431
	// TokenTypeBigInt represents the BIGINT data type keyword (64-bit signed integer).
	TokenTypeBigInt TokenType = 432
	// TokenTypeSmallInt represents the SMALLINT data type keyword (16-bit signed integer).
	TokenTypeSmallInt TokenType = 433
	// TokenTypeFloat represents the FLOAT data type keyword (single or double precision floating-point).
	TokenTypeFloat TokenType = 434
	// TokenTypeDouble represents the DOUBLE or DOUBLE PRECISION data type keyword.
	TokenTypeDouble TokenType = 435
	// TokenTypeDecimal represents the DECIMAL(p,s) fixed-precision data type keyword.
	TokenTypeDecimal TokenType = 436
	// TokenTypeNumeric represents the NUMERIC(p,s) fixed-precision data type keyword (synonym for DECIMAL).
	TokenTypeNumeric TokenType = 437
	// TokenTypeVarchar represents the VARCHAR(n) variable-length character data type keyword.
	TokenTypeVarchar TokenType = 438
	// TokenTypeCharDataType represents the CHAR(n) fixed-length character data type keyword.
	// Note: this is distinct from TokenTypeChar (value 12) which represents a single character token.
	TokenTypeCharDataType TokenType = 439
	// TokenTypeText represents the TEXT data type keyword for variable-length text.
	TokenTypeText TokenType = 440
	// TokenTypeBoolean represents the BOOLEAN data type keyword.
	TokenTypeBoolean TokenType = 441
	// TokenTypeDate represents the DATE data type keyword (calendar date without time).
	TokenTypeDate TokenType = 442
	// TokenTypeTime represents the TIME data type keyword (time of day without date).
	TokenTypeTime TokenType = 443
	// TokenTypeTimestamp represents the TIMESTAMP data type keyword (date and time).
	TokenTypeTimestamp TokenType = 444
	// TokenTypeInterval represents the INTERVAL data type keyword for time durations.
	TokenTypeInterval TokenType = 445
	// TokenTypeBlob represents the BLOB data type keyword for binary large objects.
	TokenTypeBlob TokenType = 446
	// TokenTypeClob represents the CLOB data type keyword for character large objects.
	TokenTypeClob TokenType = 447
	// TokenTypeJson represents the JSON data type keyword (PostgreSQL, MySQL 5.7+).
	TokenTypeJson TokenType = 448
	// TokenTypeUuid represents the UUID data type keyword (PostgreSQL, SQL Server).
	TokenTypeUuid TokenType = 449

	// TokenTypeIllegal is used for parser compatibility with internal ILLEGAL token values.
	TokenTypeIllegal TokenType = 500
	// TokenTypeAsterisk represents an explicit * token used as a wildcard or multiply operator.
	// Distinct from TokenTypeMul (62) to allow unambiguous identification of the asterisk character.
	TokenTypeAsterisk TokenType = 501
	// TokenTypeDoublePipe represents the || string concatenation operator (SQL standard).
	// Distinct from TokenTypeStringConcat (66) for cases where dialect disambiguation is needed.
	TokenTypeDoublePipe TokenType = 502
	// TokenTypeILike represents the ILIKE case-insensitive pattern-matching operator (PostgreSQL).
	TokenTypeILike TokenType = 503
	// TokenTypeAdd represents the ADD keyword used in ALTER TABLE ADD COLUMN.
	TokenTypeAdd TokenType = 504
	// TokenTypeNosuperuser represents the NOSUPERUSER option in ALTER ROLE, removing superuser privilege.
	TokenTypeNosuperuser TokenType = 505
	// TokenTypeNocreatedb represents the NOCREATEDB option in ALTER ROLE, removing database creation privilege.
	TokenTypeNocreatedb TokenType = 506
	// TokenTypeNocreaterole represents the NOCREATEROLE option in ALTER ROLE, removing role creation privilege.
	TokenTypeNocreaterole TokenType = 507
	// TokenTypeNologin represents the NOLOGIN option in ALTER ROLE, preventing login.
	TokenTypeNologin TokenType = 508
	// TokenTypeValid represents the VALID keyword used in VALID UNTIL role attribute.
	TokenTypeValid TokenType = 509
	// TokenTypeDcproperties represents the DCPROPERTIES keyword used in ALTER CONNECTOR.
	TokenTypeDcproperties TokenType = 510
	// TokenTypeUrl represents the URL keyword used in CREATE/ALTER CONNECTOR statements.
	TokenTypeUrl TokenType = 511
	// TokenTypeOwner represents the OWNER keyword used in ALTER CONNECTOR ... OWNER TO.
	TokenTypeOwner TokenType = 512
	// TokenTypeMember represents the MEMBER keyword used in ALTER ROLE ... MEMBER.
	TokenTypeMember TokenType = 513
	// TokenTypeConnector represents the CONNECTOR keyword used in CREATE/ALTER CONNECTOR statements.
	TokenTypeConnector TokenType = 514
	// TokenTypePolicy represents the POLICY keyword used in CREATE/ALTER POLICY statements.
	TokenTypePolicy TokenType = 515
	// TokenTypeUntil represents the UNTIL keyword used in VALID UNTIL date expressions.
	TokenTypeUntil TokenType = 516
	// TokenTypeReset represents the RESET keyword used in ALTER ROLE ... RESET parameter.
	TokenTypeReset TokenType = 517
	// TokenTypeShow represents the SHOW keyword used in MySQL SHOW TABLES, SHOW COLUMNS, etc.
	TokenTypeShow TokenType = 518
	// TokenTypeDescribe represents the DESCRIBE keyword used in MySQL DESCRIBE table_name.
	TokenTypeDescribe TokenType = 519
	// TokenTypeExplain represents the EXPLAIN keyword used to display query execution plans.
	TokenTypeExplain TokenType = 520
)

Token type constants with explicit values to avoid collisions.

Constants are assigned explicit numeric values to guarantee stability across versions. Adding new token types must not change existing values.
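Because the values are stable across versions, callers can dispatch on them directly. A minimal self-contained sketch, using locally declared stand-ins for the documented constant values (real code would use the models package instead):

```go
package main

import "fmt"

// Stand-in declarations mirroring the documented constant values;
// in real code these come from the models package.
type TokenType int

const (
	TokenTypeSelect TokenType = 201
	TokenTypeInsert TokenType = 234
	TokenTypeCreate TokenType = 240
)

// statementKind dispatches on the stable numeric values with an
// O(1) switch, as the tokenizer and parser do internally.
func statementKind(t TokenType) string {
	switch t {
	case TokenTypeSelect:
		return "query"
	case TokenTypeInsert:
		return "dml"
	case TokenTypeCreate:
		return "ddl"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(statementKind(TokenTypeSelect)) // query
}
```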

func (TokenType) IsAggregateFunction added in v1.6.0

func (t TokenType) IsAggregateFunction() bool

IsAggregateFunction returns true if the token type is a standard SQL aggregate function.

Covered aggregate functions: COUNT, SUM, AVG, MIN, MAX.

Note: This method covers only the five standard SQL aggregate functions. Other aggregate functions (e.g., ARRAY_AGG, STRING_AGG, JSON_AGG) are represented as TokenTypeWord or TokenTypeIdentifier tokens.

Example:

if token.Type.IsAggregateFunction() {
    // Handle aggregate function (COUNT, SUM, AVG, MIN, MAX)
}
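A more complete, runnable sketch of the same check: since the five documented aggregate constants are contiguous (250-254), one plausible implementation is a range check. The stand-in types below mirror the models package; this is an illustration, not the library's actual code:

```go
package main

import "fmt"

type TokenType int

// Stand-ins for the documented aggregate constants (COUNT=250 .. MAX=254).
const (
	TokenTypeCount TokenType = 250
	TokenTypeMax   TokenType = 254
)

// isAggregateFunction mirrors the documented behavior: true only for
// COUNT, SUM, AVG, MIN, MAX (the contiguous range 250-254).
func isAggregateFunction(t TokenType) bool {
	return t >= TokenTypeCount && t <= TokenTypeMax
}

func main() {
	// Token types for: SELECT COUNT(*), MAX(salary) FROM emp
	stream := []TokenType{201, 250, 254, 202}
	aggregates := 0
	for _, t := range stream {
		if isAggregateFunction(t) {
			aggregates++
		}
	}
	fmt.Println(aggregates) // 2
}
```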

func (TokenType) IsConstraint added in v1.6.0

func (t TokenType) IsConstraint() bool

IsConstraint returns true if the token type is a table or column constraint keyword.

Covered constraint keywords: PRIMARY, KEY, FOREIGN, REFERENCES, UNIQUE, CHECK, DEFAULT, AUTO_INCREMENT, CONSTRAINT, NOT NULL, NULLABLE.

Example:

if token.Type.IsConstraint() {
    // Handle constraint keyword (PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, etc.)
}

func (TokenType) IsDDLKeyword added in v1.6.0

func (t TokenType) IsDDLKeyword() bool

IsDDLKeyword returns true if the token type is a Data Definition Language keyword.

Covered DDL keywords: CREATE, ALTER, DROP, TRUNCATE, TABLE, INDEX, VIEW, COLUMN, DATABASE, SCHEMA, TRIGGER.

Example:

if token.Type.IsDDLKeyword() {
    // Handle DDL keyword (CREATE, ALTER, DROP, TABLE, etc.)
}

func (TokenType) IsDMLKeyword added in v1.6.0

func (t TokenType) IsDMLKeyword() bool

IsDMLKeyword returns true if the token type is a Data Manipulation Language keyword.

Covered DML keywords: SELECT, INSERT, UPDATE, DELETE, INTO, VALUES, SET, FROM, WHERE.

Example:

if token.Type.IsDMLKeyword() {
    // Handle DML keyword (SELECT, INSERT, UPDATE, DELETE, etc.)
}

func (TokenType) IsDataType added in v1.6.0

func (t TokenType) IsDataType() bool

IsDataType returns true if the token type is a SQL data type. Uses range-based checking for O(1) performance.

Example:

if token.Type.IsDataType() {
    // Handle data type token (INT, VARCHAR, BOOLEAN, etc.)
}
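The documented data type constants occupy the contiguous range 430 (INT) through 449 (UUID), so the range-based check can be sketched as two comparisons. This is an assumption about the implementation strategy the docs describe, not a copy of the library's code:

```go
package main

import "fmt"

type TokenType int

// Stand-ins for the documented boundary constants of the data type range.
const (
	TokenTypeInt  TokenType = 430 // first data type constant
	TokenTypeUuid TokenType = 449 // last data type constant
)

// isDataType is a two-comparison range check: constant time
// regardless of how many data types exist in the range.
func isDataType(t TokenType) bool {
	return t >= TokenTypeInt && t <= TokenTypeUuid
}

func main() {
	fmt.Println(isDataType(438)) // VARCHAR -> true
	fmt.Println(isDataType(201)) // SELECT  -> false
}
```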

func (TokenType) IsJoinKeyword added in v1.6.0

func (t TokenType) IsJoinKeyword() bool

IsJoinKeyword returns true if the token type is a JOIN-related keyword.

Covered JOIN keywords: JOIN, INNER, LEFT, RIGHT, OUTER, CROSS, NATURAL, FULL, INNER JOIN, LEFT JOIN, RIGHT JOIN, OUTER JOIN, FULL JOIN, CROSS JOIN, ON, USING.

Example:

if token.Type.IsJoinKeyword() {
    // Handle JOIN keyword (JOIN, INNER, LEFT, RIGHT, ON, USING, etc.)
}

func (TokenType) IsKeyword added in v1.6.0

func (t TokenType) IsKeyword() bool

IsKeyword returns true if the token type is a SQL keyword. Uses range-based checking for O(1) performance (~0.24ns/op).

Example:

if token.Type.IsKeyword() {
    // Handle SQL keyword token
}

func (TokenType) IsLiteral added in v1.6.0

func (t TokenType) IsLiteral() bool

IsLiteral returns true if the token type is a literal value. Includes identifiers, numbers, strings, and boolean/null literals.

Example:

if token.Type.IsLiteral() {
    // Handle literal value (identifier, number, string, true/false/null)
}

func (TokenType) IsOperator added in v1.6.0

func (t TokenType) IsOperator() bool

IsOperator returns true if the token type is an operator. Uses range-based checking for O(1) performance.

Example:

if token.Type.IsOperator() {
    // Handle operator token (e.g., +, -, *, /, etc.)
}

func (TokenType) IsSetOperation added in v1.6.0

func (t TokenType) IsSetOperation() bool

IsSetOperation returns true if the token type is a set operation keyword.

Covered set operations: UNION, EXCEPT, INTERSECT, ALL.

These keywords combine multiple query result sets:

SELECT id FROM users UNION ALL SELECT id FROM admins
SELECT id FROM a EXCEPT SELECT id FROM b
SELECT id FROM a INTERSECT SELECT id FROM b

Example:

if token.Type.IsSetOperation() {
    // Handle set operation keyword (UNION, EXCEPT, INTERSECT, ALL)
}

func (TokenType) IsWindowKeyword added in v1.6.0

func (t TokenType) IsWindowKeyword() bool

IsWindowKeyword returns true if the token type is a window function keyword.

Covered window keywords: OVER, PARTITION, ROWS, RANGE, UNBOUNDED, PRECEDING, FOLLOWING, CURRENT, ROW, GROUPS, FILTER, EXCLUDE.

These keywords appear in window function specifications:

RANK() OVER (PARTITION BY dept ORDER BY salary ROWS UNBOUNDED PRECEDING)

Example:

if token.Type.IsWindowKeyword() {
    // Handle window keyword (OVER, PARTITION BY, ROWS, RANGE, etc.)
}

func (TokenType) String added in v1.0.1

func (t TokenType) String() string

String returns a string representation of the token type.

Provides names for debugging, error messages, and logging. Uses a switch statement for O(1) compiled jump-table lookup. Covers ALL defined TokenType constants for completeness.

Example:

tokenType := models.TokenTypeSelect
fmt.Println(tokenType.String()) // Output: "SELECT"

tokenType = models.TokenTypeLongArrow
fmt.Println(tokenType.String()) // Output: "LONG_ARROW"

type TokenWithSpan

type TokenWithSpan struct {
	Token Token    // The token with type and value
	Start Location // Start position (inclusive)
	End   Location // End position (exclusive)
}

TokenWithSpan represents a token with its location in the source code.

TokenWithSpan combines a Token with precise position information (Start and End locations). This is the primary representation used by the tokenizer output and consumed by the parser.

Fields:

  • Token: The token itself (type, value, metadata)
  • Start: Beginning location of the token in source (inclusive)
  • End: Ending location of the token in source (exclusive)

Example:

// Token for "SELECT" at line 1, columns 1-6 (End is exclusive, at column 7)
tokenWithSpan := models.TokenWithSpan{
    Token: models.Token{Type: models.TokenTypeSelect, Value: "SELECT"},
    Start: models.Location{Line: 1, Column: 1},
    End:   models.Location{Line: 1, Column: 7},
}

Usage with tokenizer:

tkz := tokenizer.GetTokenizer()
defer tokenizer.PutTokenizer(tkz)
tokens, err := tkz.Tokenize([]byte(sql))
// tokens is []TokenWithSpan with location information
for _, t := range tokens {
    fmt.Printf("Token %s at line %d, column %d\n",
        t.Token.Value, t.Start.Line, t.Start.Column)
}

Used for error reporting:

// Create error at token location
err := errors.NewError(
    errors.ErrCodeUnexpectedToken,
    "unexpected token",
    tokenWithSpan.Start,
)

Performance: TokenWithSpan is a value type designed for zero-copy operations. The tokenizer returns slices of TokenWithSpan without heap allocations.
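Spans make precise diagnostics straightforward. A self-contained sketch (using local stand-ins for Location and the span fields, with 1-based columns and an exclusive End, as documented) that underlines a token within a single-line query:

```go
package main

import (
	"fmt"
	"strings"
)

// Local stand-ins mirroring the documented types.
type Location struct{ Line, Column int }

type Span struct{ Start, End Location }

// underline prints the source line with carets beneath the span,
// assuming the span fits on one line and columns are 1-based
// with an exclusive End column.
func underline(line string, s Span) string {
	width := s.End.Column - s.Start.Column
	if width < 1 {
		width = 1
	}
	return line + "\n" +
		strings.Repeat(" ", s.Start.Column-1) + strings.Repeat("^", width)
}

func main() {
	sql := "SELECT nmae FROM users"
	// Span for the misspelled identifier "nmae": columns 8-11, End exclusive at 12.
	span := Span{Start: Location{1, 8}, End: Location{1, 12}}
	fmt.Println(underline(sql, span))
	// SELECT nmae FROM users
	//        ^^^^
}
```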

func NewEOFToken

func NewEOFToken(pos Location) TokenWithSpan

NewEOFToken creates a new EOF token with span.

Factory function for creating End-Of-File tokens. EOF tokens mark the end of the input stream and are essential for parser termination.

Parameters:

  • pos: The location where EOF was encountered

Returns a TokenWithSpan with type TokenTypeEOF and empty value. Both Start and End are set to the same position.

Example:

eofToken := models.NewEOFToken(models.Location{Line: 10, Column: 1})
// eofToken.Token.Type = TokenTypeEOF
// eofToken.Token.Value = ""
// eofToken.Start = eofToken.End = {Line: 10, Column: 1}

Used by tokenizer at end of input:

tokens = append(tokens, models.NewEOFToken(currentLocation))

func NewTokenWithSpan

func NewTokenWithSpan(tokenType TokenType, value string, start, end Location) TokenWithSpan

NewTokenWithSpan creates a new TokenWithSpan with the given type, value, and location.

Factory function for creating tokens with precise position information. This is the primary way to create tokens during tokenization.

Parameters:

  • tokenType: The TokenType classification
  • value: The string representation of the token
  • start: Beginning location in source (inclusive)
  • end: Ending location in source (exclusive)

Returns a TokenWithSpan with all fields populated.

Example:

token := models.NewTokenWithSpan(
    models.TokenTypeSelect,
    "SELECT",
    models.Location{Line: 1, Column: 1},
    models.Location{Line: 1, Column: 7},
)
// Represents "SELECT" spanning columns 1-6 on line 1

Used by tokenizer:

tokens = append(tokens, models.NewTokenWithSpan(
    tokenType, value, startLoc, endLoc,
))

func TokenAtLocation

func TokenAtLocation(token Token, start, end Location) TokenWithSpan

TokenAtLocation creates a new TokenWithSpan from a Token and location.

Convenience function for adding location information to an existing Token. Useful when the token is created first and its location is determined later.

Parameters:

  • token: The Token to wrap with location
  • start: Beginning location in source (inclusive)
  • end: Ending location in source (exclusive)

Returns a TokenWithSpan combining the token and location.

Example:

token := models.NewToken(models.TokenTypeSelect, "SELECT")
start := models.Location{Line: 1, Column: 1}
end := models.Location{Line: 1, Column: 7}
tokenWithSpan := models.TokenAtLocation(token, start, end)

func WrapToken

func WrapToken(token Token) TokenWithSpan

WrapToken wraps a token with an empty location.

Creates a TokenWithSpan from a Token when location information is not available or not needed. The Start and End locations are set to zero values.

Example:

token := models.Token{Type: models.TokenTypeSelect, Value: "SELECT"}
wrapped := models.WrapToken(token)
// wrapped.Start and wrapped.End are both Location{Line: 0, Column: 0}

Use case: Testing or scenarios where location tracking is not required.

type TokenizerError

type TokenizerError struct {
	Message  string   // Error description
	Location Location // Where the error occurred
}

TokenizerError represents an error during tokenization.

TokenizerError is a simple error type for lexical analysis failures. It includes the error message and the precise location where the error occurred.

For more sophisticated error handling with hints, suggestions, and context, use the errors package (pkg/errors) which provides structured errors with:

  • Error codes (E1xxx for tokenizer errors)
  • SQL context extraction and highlighting
  • Intelligent suggestions and typo detection
  • Help URLs for documentation

Fields:

  • Message: Human-readable error description
  • Location: Precise position in source where error occurred (line/column)

Example:

err := models.TokenizerError{
    Message:  "unexpected character '@'",
    Location: models.Location{Line: 2, Column: 15},
}
fmt.Println(err.Error()) // "unexpected character '@'"

Upgrading to structured errors:

// Instead of TokenizerError, use errors package:
err := errors.UnexpectedCharError('@', location, sqlSource)
// Provides: error code, context, hints, help URL

Common tokenizer errors:

  • Unexpected characters in input
  • Unterminated string literals
  • Invalid numeric formats
  • Invalid identifier syntax
  • Input size limits exceeded (DoS protection)

Performance: TokenizerError is a lightweight value type with minimal overhead.

func (TokenizerError) Error

func (e TokenizerError) Error() string

Error implements the error interface.

Returns the error message. For full context and location information, use the errors package which provides FormatErrorWithContext.

Example:

err := models.TokenizerError{Message: "invalid token", Location: loc}
fmt.Println(err.Error()) // Output: "invalid token"

type Whitespace

type Whitespace struct {
	// Type identifies whether this is a space, newline, tab, or comment.
	Type WhitespaceType
	// Content holds the text of a comment, including its delimiters.
	// Empty for non-comment whitespace (spaces, newlines, tabs).
	Content string
	// Prefix holds the comment introducer for single-line comments ("--" or "#").
	// Empty for block comments and non-comment whitespace.
	Prefix string
}

Whitespace represents different types of whitespace tokens.

Whitespace tokens are typically ignored during parsing but can be preserved for formatting tools, SQL formatters, or LSP servers that need to maintain original source formatting and comments.

Fields:

  • Type: The specific type of whitespace (space, newline, tab, comment)
  • Content: The actual content (used for comments to preserve text)
  • Prefix: Comment prefix for single-line comments (-- or # in MySQL)

Example:

// Single-line comment
ws := models.Whitespace{
    Type:    models.WhitespaceTypeSingleLineComment,
    Content: "This is a comment",
    Prefix:  "--",
}

// Multi-line comment
ws := models.Whitespace{
    Type:    models.WhitespaceTypeMultiLineComment,
    Content: "/* Block comment */",
}

type WhitespaceType

type WhitespaceType int

WhitespaceType represents the type of whitespace.

Used to distinguish between different whitespace and comment types in SQL source code for accurate formatting and comment preservation.

const (
	WhitespaceTypeSpace             WhitespaceType = iota // Regular space character
	WhitespaceTypeNewline                                 // Line break (\n or \r\n)
	WhitespaceTypeTab                                     // Tab character (\t)
	WhitespaceTypeSingleLineComment                       // Single-line comment (-- or #)
	WhitespaceTypeMultiLineComment                        // Multi-line comment (/* ... */)
)
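These types make it straightforward to keep comments while discarding layout whitespace, as a formatter or LSP server might. A minimal sketch with local mirror types; the comments helper is an assumption, not a library function:

```go
package main

import "fmt"

// Local mirrors of models.WhitespaceType and models.Whitespace for illustration.
type WhitespaceType int

const (
	WhitespaceTypeSpace WhitespaceType = iota
	WhitespaceTypeNewline
	WhitespaceTypeTab
	WhitespaceTypeSingleLineComment
	WhitespaceTypeMultiLineComment
)

type Whitespace struct {
	Type    WhitespaceType
	Content string
	Prefix  string
}

// comments keeps only comment whitespace, dropping spaces, tabs, and newlines.
func comments(ws []Whitespace) []Whitespace {
	var out []Whitespace
	for _, w := range ws {
		if w.Type == WhitespaceTypeSingleLineComment || w.Type == WhitespaceTypeMultiLineComment {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	ws := []Whitespace{
		{Type: WhitespaceTypeSpace},
		{Type: WhitespaceTypeSingleLineComment, Content: "keep me", Prefix: "--"},
		{Type: WhitespaceTypeNewline},
	}
	for _, c := range comments(ws) {
		fmt.Println(c.Prefix, c.Content) // -- keep me
	}
}
```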

type Word

type Word struct {
	// Value is the actual text of the word in its original case (e.g., "SELECT", "users").
	Value string
	// QuoteStyle is the quote character used to delimit a quoted identifier (", `, [).
	// Zero for unquoted words.
	QuoteStyle rune
	// Keyword holds SQL keyword metadata when this word is a recognized SQL keyword.
	// It is nil for plain identifiers (table names, column names, aliases).
	Keyword *Keyword
}

Word represents a keyword or identifier with its properties.

Word is used to distinguish between different types of word tokens: SQL keywords (SELECT, FROM, WHERE), identifiers (table/column names), and quoted identifiers ("column name" or [column name]).

Fields:

  • Value: The actual text of the word (case-preserved)
  • QuoteStyle: The quote character if this is a quoted identifier (", `, [, etc.)
  • Keyword: Pointer to Keyword struct if this word is a SQL keyword (nil for identifiers)

Example:

// SQL keyword
word := &models.Word{
    Value:   "SELECT",
    Keyword: &models.Keyword{Word: "SELECT", Reserved: true},
}

// Quoted identifier
word := &models.Word{
    Value:      "column name",
    QuoteStyle: '"',
}
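A parser can classify a Word by inspecting its Keyword and QuoteStyle fields: reserved keywords cannot appear as bare identifiers, while quoted identifiers always can. The following sketch uses local mirror types and a hypothetical describe helper to show the dispatch:

```go
package main

import "fmt"

// Local mirrors of models.Keyword and models.Word for illustration.
type Keyword struct {
	Word     string
	Reserved bool
}

type Word struct {
	Value      string
	QuoteStyle rune
	Keyword    *Keyword
}

// describe classifies a word token: keyword metadata takes precedence,
// then quoting, then plain identifier.
func describe(w Word) string {
	switch {
	case w.Keyword != nil && w.Keyword.Reserved:
		return "reserved keyword"
	case w.Keyword != nil:
		return "non-reserved keyword"
	case w.QuoteStyle != 0:
		return "quoted identifier"
	default:
		return "identifier"
	}
}

func main() {
	fmt.Println(describe(Word{Value: "SELECT", Keyword: &Keyword{Word: "SELECT", Reserved: true}}))
	fmt.Println(describe(Word{Value: "column name", QuoteStyle: '"'}))
	fmt.Println(describe(Word{Value: "users"}))
	// Output:
	// reserved keyword
	// quoted identifier
	// identifier
}
```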
