mssql

package
v2.0.2
Published: Jan 9, 2026 License: MIT Imports: 9 Imported by: 0

README

Package mssql provides a T-SQL (Microsoft SQL Server) parser for the sqlcode library.

Overview

This package implements a lexical scanner and document parser specifically designed for T-SQL syntax. It is part of the sqlcode toolchain that manages SQL database objects (procedures, functions, types) with dependency tracking and code generation.

Architecture

The parser follows a two-layer architecture:

  1. Scanner (scanner.go): A lexical tokenizer that breaks T-SQL source into tokens. It handles T-SQL-specific constructs like N'unicode strings', [bracketed identifiers], and the GO batch separator.
  2. Document (document.go): A higher-level parser that processes token streams to extract CREATE statements, DECLARE constants, and dependency information.
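The division of labor can be pictured with a toy sketch (all names below are invented for illustration, not the package's API): a scanner turns source text into tokens, and a document layer walks the token stream looking for statement keywords.

```go
package main

import (
	"fmt"
	"strings"
)

// toyScan splits source into whitespace-delimited tokens; the real
// Scanner is far richer (string literals, comments, brackets, positions).
func toyScan(src string) []string { return strings.Fields(src) }

// toyDocument collects the object name following each CREATE keyword,
// mirroring the document layer's job of extracting CREATE statements.
func toyDocument(tokens []string) []string {
	var creates []string
	for i, tok := range tokens {
		if strings.EqualFold(tok, "create") && i+2 < len(tokens) {
			creates = append(creates, tokens[i+2]) // object kind, then name
		}
	}
	return creates
}

func main() {
	src := "CREATE PROCEDURE [code].Foo AS BEGIN SELECT 1 END"
	fmt.Println(toyDocument(toyScan(src)))
}
```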

Token System

T-SQL tokens are divided into two categories:

  • Common tokens (defined in sqldocument): Shared across SQL dialects (e.g., parentheses, whitespace, identifiers). These use token type values 0-999.
  • T-SQL-specific tokens (defined in tokens.go): Dialect-specific tokens like VarcharLiteralToken ('...') and NVarcharLiteralToken (N'...'). These use values 1000-1999.

Batch Separator Handling

T-SQL uses GO as a batch separator with special rules:

  • GO must appear at the start of a line (only whitespace/comments before it)
  • Nothing except whitespace may follow GO on the same line
  • GO is not a reserved word; it is a client tool command

The scanner tracks line position state to correctly identify GO as a BatchSeparatorToken rather than an identifier. Malformed separators (GO followed by non-whitespace) are reported as MalformedBatchSeparatorToken.
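The line rule can be sketched as a single-line check (a simplification that ignores the comment cases the real scanner also handles):

```go
package main

import (
	"fmt"
	"strings"
)

// isBatchSeparator reports whether a single source line is a valid GO
// batch separator: optional leading whitespace, the word GO (any case),
// and nothing but whitespace after it.
func isBatchSeparator(line string) bool {
	return strings.EqualFold(strings.TrimSpace(line), "go")
}

func main() {
	fmt.Println(isBatchSeparator("  GO  "))    // true
	fmt.Println(isBatchSeparator("GO 2"))      // malformed: trailing token
	fmt.Println(isBatchSeparator("SELECT GO")) // GO not at line start
}
```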

Document Structure

The parser recognizes:

  • CREATE PROCEDURE/FUNCTION/TYPE statements in the [code] schema
  • DECLARE statements for constants (variables starting with @Enum, @Global, or @Const)
  • Dependencies between objects via [code].ObjectName references
  • Pragma comments (--sqlcode:...) for build-time directives
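The constant-prefix and pragma checks boil down to simple string tests. A sketch (helper names and the pragma argument are illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// isConstantName reports whether a T-SQL variable name uses one of the
// sqlcode constant prefixes the parser accepts in DECLARE statements.
func isConstantName(v string) bool {
	for _, p := range []string{"@Enum", "@Global", "@Const"} {
		if strings.HasPrefix(v, p) {
			return true
		}
	}
	return false
}

// isPragma reports whether a comment line is a sqlcode pragma.
func isPragma(line string) bool {
	return strings.HasPrefix(strings.TrimSpace(line), "--sqlcode:")
}

func main() {
	fmt.Println(isConstantName("@EnumColor"), isConstantName("@x"))
	fmt.Println(isPragma("--sqlcode:include-if FEATURE_X"))
}
```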

Dependency Tracking

When parsing CREATE statements, the parser scans for [code].ObjectName patterns to build a dependency graph. This enables topological sorting of objects so they are created in the correct order during deployment.
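Topological sorting over such a dependency graph can be sketched with a plain depth-first search (object names are invented examples):

```go
package main

import "fmt"

// topoSort orders objects so each appears after its dependencies.
// deps maps an object to the objects it references.
func topoSort(deps map[string][]string) []string {
	var order []string
	seen := map[string]bool{}
	var visit func(string)
	visit = func(n string) {
		if seen[n] {
			return
		}
		seen[n] = true
		for _, d := range deps[n] {
			visit(d) // emit dependencies first
		}
		order = append(order, n)
	}
	for n := range deps {
		visit(n)
	}
	return order
}

func main() {
	// [code].GetOrder calls [code].GetCustomer, which uses [code].CustomerType.
	deps := map[string][]string{
		"[code].GetOrder":     {"[code].GetCustomer"},
		"[code].GetCustomer":  {"[code].CustomerType"},
		"[code].CustomerType": {},
	}
	fmt.Println(topoSort(deps))
}
```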

Error Recovery

The parser uses a recovery strategy that skips to the next statement-starting keyword (CREATE, DECLARE, GO) when encountering syntax errors. This allows partial parsing of files with errors while collecting all error messages.
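The skip-ahead can be sketched over a token slice (a simplification; the real parser recovers on the scanner itself), using the same keyword list as TSQLStatementTokens:

```go
package main

import (
	"fmt"
	"strings"
)

// statementTokens mirrors TSQLStatementTokens: keywords that start statements.
var statementTokens = []string{"create", "declare", "go"}

// recoverTo returns the index of the next statement-starting token at or
// after i, or len(tokens) if none remains.
func recoverTo(tokens []string, i int) int {
	for ; i < len(tokens); i++ {
		low := strings.ToLower(tokens[i])
		for _, kw := range statementTokens {
			if low == kw {
				return i
			}
		}
	}
	return i
}

func main() {
	tokens := []string{"CREATE", "PROCEDUR", "oops", "GO", "CREATE", "PROCEDURE"}
	// A syntax error at index 1: skip forward to the next safe resume point.
	fmt.Println(recoverTo(tokens, 2)) // resumes at the GO separator
}
```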

Documentation

Index

Constants

View Source
const (
	// T-SQL specific string literals
	//
	// T-SQL distinguishes between varchar ('...') and nvarchar (N'...')
	// string literals. Both use single quotes with '' as the escape sequence.
	VarcharLiteralToken sqldocument.TokenType = iota + sqldocument.TSQLTokenStart
	NVarcharLiteralToken

	// T-SQL specific identifier styles
	//
	// T-SQL uses square brackets for quoted identifiers: [My Table]
	// Brackets are escaped by doubling: [My]]Table] represents "My]Table"
	BracketQuotedIdentifierToken // [identifier]

	// T-SQL specific errors
	//
	// Unlike standard SQL, T-SQL does not support double-quoted strings.
	// Double quotes are reserved for QUOTED_IDENTIFIER mode identifiers,
	// but sqlcode requires bracket notation for consistency.
	DoubleQuoteErrorToken // T-SQL doesn't support double-quoted strings
	UnterminatedVarcharLiteralErrorToken
	UnterminatedQuotedIdentifierErrorToken
)

T-SQL specific tokens (range 1000-1999)

Token values are partitioned by dialect to avoid collisions:

  • 0-999: Common tokens shared across dialects (sqldocument package)
  • 1000-1999: T-SQL specific tokens (this package)
  • 2000-2999: Reserved for other dialects (e.g., PostgreSQL)

This design allows dialect-specific code to use concrete token types while common code can use ToCommonToken() for abstraction.
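The range partitioning can be sketched as follows (constant names are stand-ins for the real ones in sqldocument):

```go
package main

import "fmt"

// TokenType mirrors sqldocument.TokenType for this sketch.
type TokenType int

// Range boundaries as described above.
const (
	TSQLTokenStart TokenType = 1000 // 0-999 are common tokens
	PgTokenStart   TokenType = 2000 // 2000-2999 reserved for other dialects
)

// dialect names the range a token value falls in.
func dialect(tt TokenType) string {
	switch {
	case tt < TSQLTokenStart:
		return "common"
	case tt < PgTokenStart:
		return "tsql"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(dialect(5), dialect(1001), dialect(2010))
}
```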

Variables

View Source
var TSQLStatementTokens = []string{"create", "declare", "go"}

TSQLStatementTokens defines the keywords that start new statements. Used by error recovery to find a safe point to resume parsing.

Functions

func ToCommonToken

func ToCommonToken(tt sqldocument.TokenType) sqldocument.TokenType

ToCommonToken maps T-SQL specific tokens to their common equivalents for dialect-agnostic processing.

This abstraction layer allows higher-level code to work with logical token categories (e.g., "string literal") without knowing the specific dialect syntax (varchar vs nvarchar, brackets vs double quotes).

Tokens that are already common tokens pass through unchanged.

Types

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner is a lexical scanner for T-SQL source code.

Unlike traditional lexer/parser architectures with a token stream, Scanner is used directly by the recursive descent parser as a cursor into the input buffer. It provides utility methods for tokenization and position tracking.

The scanner handles T-SQL specific constructs including:

  • String literals ('...' and N'...')
  • Quoted identifiers ([...])
  • Single-line (--) and multi-line (/* */) comments
  • Batch separators (GO)
  • Reserved words
  • Variables (@identifier)

func NewScanner

func NewScanner(path sqldocument.FileRef, input string) *Scanner

NewScanner creates a new Scanner for the given T-SQL source file and input string. The scanner is positioned before the first token; call NextToken() to advance.

func (Scanner) Clone

func (s Scanner) Clone() *Scanner

Clone returns a copy of the scanner at its current position. This is used for look-ahead parsing where we need to tentatively scan tokens without committing to consuming them.
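Because Clone has a value receiver, it is essentially a struct copy. The look-ahead pattern it enables can be sketched with a simplified cursor type (illustrative, not the real Scanner):

```go
package main

import "fmt"

// cursor is a simplified stand-in for Scanner: a position into a token slice.
type cursor struct {
	tokens []string
	pos    int
}

// clone copies the cursor by value; advancing the copy leaves the
// original untouched, just like Scanner.Clone.
func (c cursor) clone() *cursor { return &c }

func (c *cursor) next() string {
	t := c.tokens[c.pos]
	c.pos++
	return t
}

func main() {
	c := &cursor{tokens: []string{"CREATE", "PROCEDURE", "[code].Foo"}}
	// Tentatively scan ahead on a clone without consuming from c.
	look := c.clone()
	look.next()
	fmt.Println(look.next(), c.pos) // clone advanced twice; original still at 0
}
```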

func (*Scanner) File

func (s *Scanner) File() sqldocument.FileRef

func (*Scanner) NextNonWhitespaceCommentToken

func (s *Scanner) NextNonWhitespaceCommentToken() sqldocument.TokenType

NextNonWhitespaceCommentToken advances to the next token and then skips any whitespace and comments, returning the type of the first significant token.

func (*Scanner) NextNonWhitespaceToken

func (s *Scanner) NextNonWhitespaceToken() sqldocument.TokenType

NextNonWhitespaceToken advances to the next token and then skips any whitespace, returning the type of the first non-whitespace token.

func (*Scanner) NextToken

func (s *Scanner) NextToken() sqldocument.TokenType

NextToken scans the next token and advances the scanner's position.

This method wraps the raw tokenization with batch separator handling. The GO batch separator has special rules in T-SQL:

  • It must appear at the start of a line (only whitespace/comments before it)
  • Nothing except whitespace may follow it on the same line
  • It is not processed inside [names], 'strings', or /*comments*/

If GO is followed by non-whitespace on the same line, subsequent tokens are returned as MalformedBatchSeparatorToken until end of line.

Returns the TokenType of the scanned token.

func (*Scanner) ReservedWord

func (s *Scanner) ReservedWord() string

ReservedWord returns the lowercase reserved word if the current token is a ReservedWordToken, or an empty string otherwise.

func (*Scanner) SetFile

func (s *Scanner) SetFile(file sqldocument.FileRef)

func (*Scanner) SetInput

func (s *Scanner) SetInput(input []byte)

func (*Scanner) SkipWhitespace

func (s *Scanner) SkipWhitespace()

SkipWhitespace advances past any whitespace tokens. Stops when a non-whitespace token is encountered. Unlike SkipWhitespaceComments, this preserves comments.

func (*Scanner) SkipWhitespaceComments

func (s *Scanner) SkipWhitespaceComments()

SkipWhitespaceComments advances past any whitespace and comment tokens. Stops when a non-whitespace, non-comment token is encountered.

func (*Scanner) Start

func (s *Scanner) Start() sqldocument.Pos

Start returns the position where the current token begins. Line and column are 1-indexed.

func (*Scanner) Stop

func (s *Scanner) Stop() sqldocument.Pos

Stop returns the position where the current token ends. Line and column are 1-indexed.

func (*Scanner) Token

func (s *Scanner) Token() string

Token returns the text of the current token as a substring of the input buffer.

func (*Scanner) TokenLower

func (s *Scanner) TokenLower() string

TokenLower returns the current token text converted to lowercase. Useful for case-insensitive keyword matching.

func (*Scanner) TokenType

func (s *Scanner) TokenType() sqldocument.TokenType

TokenType returns the type of the current token.

type TSqlDocument

type TSqlDocument struct {
	sqldocument.Pragma
	// contains filtered or unexported fields
}

TSqlDocument represents a T-SQL source file.

The document contains:

  • creates: CREATE PROCEDURE/FUNCTION/TYPE statements with dependency info
  • declares: DECLARE statements for sqlcode constants (@Enum*, @Global*, @Const*)
  • errors: Syntax and semantic errors encountered during parsing
  • pragmaIncludeIf: Conditional compilation directives from --sqlcode:include-if

Parsing follows T-SQL batch semantics where batches are separated by GO. The first batch may contain DECLARE statements for constants. Subsequent batches contain CREATE statements for database objects.
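The batch structure can be sketched with a naive splitter (a simplification of the scanner's full GO handling, which also respects strings and comments):

```go
package main

import (
	"fmt"
	"strings"
)

// splitBatches divides T-SQL source into batches on lines consisting
// solely of GO.
func splitBatches(src string) []string {
	var batches []string
	var cur []string
	for _, line := range strings.Split(src, "\n") {
		if strings.EqualFold(strings.TrimSpace(line), "go") {
			batches = append(batches, strings.Join(cur, "\n"))
			cur = nil
			continue
		}
		cur = append(cur, line)
	}
	return append(batches, strings.Join(cur, "\n"))
}

func main() {
	src := "declare @EnumColorRed int = 1\nGO\ncreate procedure [code].Foo as select 1"
	b := splitBatches(src)
	fmt.Println(len(b)) // 2 batches: constants first, then a CREATE
}
```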

func (TSqlDocument) Creates

func (d TSqlDocument) Creates() []sqldocument.Create

func (TSqlDocument) Declares

func (d TSqlDocument) Declares() []sqldocument.Declare

func (TSqlDocument) Empty

func (d TSqlDocument) Empty() bool

func (TSqlDocument) Errors

func (d TSqlDocument) Errors() []sqldocument.Error

func (TSqlDocument) HasErrors

func (d TSqlDocument) HasErrors() bool

func (*TSqlDocument) Include

func (d *TSqlDocument) Include(other sqldocument.Document)

func (*TSqlDocument) Parse

func (d *TSqlDocument) Parse(input []byte, file sqldocument.FileRef) error

Parse processes a T-SQL source file from the given input.

Parsing proceeds in phases:

  1. Parse pragma comments at the file start (--sqlcode:...)
  2. Parse batches sequentially, separated by GO

The first batch has special rules: it may contain DECLARE statements for sqlcode constants. CREATE statements may appear in any batch, but procedures/functions must be alone in their batch (T-SQL requirement).

Errors are accumulated in the document rather than stopping parsing, allowing partial results even with syntax errors.

func (*TSqlDocument) Sort

func (d *TSqlDocument) Sort()
