mssql

package
v2.0.2
Published: Jan 9, 2026 License: MIT Imports: 9 Imported by: 0

README

Package mssql provides a T-SQL (Microsoft SQL Server) parser for the sqlcode library.

Overview

This package implements a lexical scanner and document parser specifically designed for T-SQL syntax. It is part of the sqlcode toolchain that manages SQL database objects (procedures, functions, types) with dependency tracking and code generation.

Architecture

The parser follows a two-layer architecture:

  1. Scanner (scanner.go): A lexical tokenizer that breaks T-SQL source into tokens. It handles T-SQL-specific constructs like N'unicode strings', [bracketed identifiers], and the GO batch separator.
  2. Document (document.go): A higher-level parser that processes token streams to extract CREATE statements, DECLARE constants, and dependency information.
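The division of labor can be pictured with a toy sketch (all names below are invented for illustration, not the package's API): a scanner turns source text into tokens, and a document layer walks the token stream looking for statement keywords.

```go
package main

import (
	"fmt"
	"strings"
)

// toyScan splits source into whitespace-delimited tokens; the real
// Scanner is far richer (string literals, comments, brackets, positions).
func toyScan(src string) []string { return strings.Fields(src) }

// toyDocument collects the object name following each CREATE keyword,
// mirroring the document layer's job of extracting CREATE statements.
func toyDocument(tokens []string) []string {
	var creates []string
	for i, tok := range tokens {
		if strings.EqualFold(tok, "create") && i+2 < len(tokens) {
			creates = append(creates, tokens[i+2]) // object kind, then name
		}
	}
	return creates
}

func main() {
	src := "CREATE PROCEDURE [code].Foo AS BEGIN SELECT 1 END"
	fmt.Println(toyDocument(toyScan(src)))
}
```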

Token System

T-SQL tokens are divided into two categories:

  • Common tokens (defined in sqldocument): Shared across SQL dialects (e.g., parentheses, whitespace, identifiers). These use token type values 0-999.
  • T-SQL-specific tokens (defined in tokens.go): Dialect-specific tokens like VarcharLiteralToken ('...') and NVarcharLiteralToken (N'...'). These use values 1000-1999.

Batch Separator Handling

T-SQL uses GO as a batch separator with special rules:

  • GO must appear at the start of a line (only whitespace/comments before it)
  • Nothing except whitespace may follow GO on the same line
  • GO is not a reserved word; it is a client tool command

The scanner tracks line position state to correctly identify GO as a BatchSeparatorToken rather than an identifier. Malformed separators (GO followed by non-whitespace) are reported as MalformedBatchSeparatorToken.
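The line rule can be sketched as a single-line check (a simplification that ignores the comment cases the real scanner also handles):

```go
package main

import (
	"fmt"
	"strings"
)

// isBatchSeparator reports whether a single source line is a valid GO
// batch separator: optional leading whitespace, the word GO (any case),
// and nothing but whitespace after it.
func isBatchSeparator(line string) bool {
	return strings.EqualFold(strings.TrimSpace(line), "go")
}

func main() {
	fmt.Println(isBatchSeparator("  GO  "))    // true
	fmt.Println(isBatchSeparator("GO 2"))      // malformed: trailing token
	fmt.Println(isBatchSeparator("SELECT GO")) // GO not at line start
}
```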

Document Structure

The parser recognizes:

  • CREATE PROCEDURE/FUNCTION/TYPE statements in the [code] schema
  • DECLARE statements for constants (variables starting with @Enum, @Global, or @Const)
  • Dependencies between objects via [code].ObjectName references
  • Pragma comments (--sqlcode:...) for build-time directives
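The constant-prefix and pragma checks boil down to simple string tests. A sketch (helper names and the pragma argument are illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// isConstantName reports whether a T-SQL variable name uses one of the
// sqlcode constant prefixes the parser accepts in DECLARE statements.
func isConstantName(v string) bool {
	for _, p := range []string{"@Enum", "@Global", "@Const"} {
		if strings.HasPrefix(v, p) {
			return true
		}
	}
	return false
}

// isPragma reports whether a comment line is a sqlcode pragma.
func isPragma(line string) bool {
	return strings.HasPrefix(strings.TrimSpace(line), "--sqlcode:")
}

func main() {
	fmt.Println(isConstantName("@EnumColor"), isConstantName("@x"))
	fmt.Println(isPragma("--sqlcode:include-if FEATURE_X"))
}
```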

Dependency Tracking

When parsing CREATE statements, the parser scans for [code].ObjectName patterns to build a dependency graph. This enables topological sorting of objects so they are created in the correct order during deployment.
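Topological sorting over such a dependency graph can be sketched with a plain depth-first search (object names are invented examples):

```go
package main

import "fmt"

// topoSort orders objects so each appears after its dependencies.
// deps maps an object to the objects it references.
func topoSort(deps map[string][]string) []string {
	var order []string
	seen := map[string]bool{}
	var visit func(string)
	visit = func(n string) {
		if seen[n] {
			return
		}
		seen[n] = true
		for _, d := range deps[n] {
			visit(d) // emit dependencies first
		}
		order = append(order, n)
	}
	for n := range deps {
		visit(n)
	}
	return order
}

func main() {
	// [code].GetOrder calls [code].GetCustomer, which uses [code].CustomerType.
	deps := map[string][]string{
		"[code].GetOrder":     {"[code].GetCustomer"},
		"[code].GetCustomer":  {"[code].CustomerType"},
		"[code].CustomerType": {},
	}
	fmt.Println(topoSort(deps))
}
```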

Error Recovery

The parser uses a recovery strategy that skips to the next statement-starting keyword (CREATE, DECLARE, GO) when encountering syntax errors. This allows partial parsing of files with errors while collecting all error messages.
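The skip-ahead can be sketched over a token slice (a simplification; the real parser recovers on the scanner itself), using the same keyword list as TSQLStatementTokens:

```go
package main

import (
	"fmt"
	"strings"
)

// statementTokens mirrors TSQLStatementTokens: keywords that start statements.
var statementTokens = []string{"create", "declare", "go"}

// recoverTo returns the index of the next statement-starting token at or
// after i, or len(tokens) if none remains.
func recoverTo(tokens []string, i int) int {
	for ; i < len(tokens); i++ {
		low := strings.ToLower(tokens[i])
		for _, kw := range statementTokens {
			if low == kw {
				return i
			}
		}
	}
	return i
}

func main() {
	tokens := []string{"CREATE", "PROCEDUR", "oops", "GO", "CREATE", "PROCEDURE"}
	// A syntax error at index 1: skip forward to the next safe resume point.
	fmt.Println(recoverTo(tokens, 2)) // resumes at the GO separator
}
```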

Documentation

Index

Constants

View Source
const (
	// T-SQL specific string literals
	//
	// T-SQL distinguishes between varchar ('...') and nvarchar (N'...')
	// string literals. Both use single quotes with '' as the escape sequence.
	VarcharLiteralToken sqldocument.TokenType = iota + sqldocument.TSQLTokenStart
	NVarcharLiteralToken

	// T-SQL specific identifier styles
	//
	// T-SQL uses square brackets for quoted identifiers: [My Table]
	// Brackets are escaped by doubling: [My]]Table] represents "My]Table"
	BracketQuotedIdentifierToken // [identifier]

	// T-SQL specific errors
	//
	// Unlike standard SQL, T-SQL does not support double-quoted strings.
	// Double quotes are reserved for QUOTED_IDENTIFIER mode identifiers,
	// but sqlcode requires bracket notation for consistency.
	DoubleQuoteErrorToken // T-SQL doesn't support double-quoted strings
	UnterminatedVarcharLiteralErrorToken
	UnterminatedQuotedIdentifierErrorToken
)

T-SQL specific tokens (range 1000-1999)

Token values are partitioned by dialect to avoid collisions:

  • 0-999: Common tokens shared across dialects (sqldocument package)
  • 1000-1999: T-SQL specific tokens (this package)
  • 2000-2999: Reserved for other dialects (e.g., PostgreSQL)

This design allows dialect-specific code to use concrete token types while common code can use ToCommonToken() for abstraction.
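The range partitioning can be sketched as follows (constant names are stand-ins for the real ones in sqldocument):

```go
package main

import "fmt"

// TokenType mirrors sqldocument.TokenType for this sketch.
type TokenType int

// Range boundaries as described above.
const (
	TSQLTokenStart TokenType = 1000 // 0-999 are common tokens
	PgTokenStart   TokenType = 2000 // 2000-2999 reserved for other dialects
)

// dialect names the range a token value falls in.
func dialect(tt TokenType) string {
	switch {
	case tt < TSQLTokenStart:
		return "common"
	case tt < PgTokenStart:
		return "tsql"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(dialect(5), dialect(1001), dialect(2010))
}
```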

Variables

View Source
var TSQLStatementTokens = []string{"create", "declare", "go"}

TSQLStatementTokens defines the keywords that start new statements. Used by error recovery to find a safe point to resume parsing.

Functions

func ToCommonToken

func ToCommonToken(tt sqldocument.TokenType) sqldocument.TokenType

ToCommonToken maps T-SQL specific tokens to their common equivalents for dialect-agnostic processing.

This abstraction layer allows higher-level code to work with logical token categories (e.g., "string literal") without knowing the specific dialect syntax (varchar vs nvarchar, brackets vs double quotes).

Tokens that are already common tokens pass through unchanged.

Types

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner is a lexical scanner for T-SQL source code.

Unlike traditional lexer/parser architectures with a token stream, Scanner is used directly by the recursive descent parser as a cursor into the input buffer. It provides utility methods for tokenization and position tracking.

The scanner handles T-SQL specific constructs including:

  • String literals ('...' and N'...')
  • Quoted identifiers ([...])
  • Single-line (--) and multi-line (/* */) comments
  • Batch separators (GO)
  • Reserved words
  • Variables (@identifier)

func NewScanner

func NewScanner(path sqldocument.FileRef, input string) *Scanner

NewScanner creates a new Scanner for the given T-SQL source file and input string. The scanner is positioned before the first token; call NextToken() to advance.

func (Scanner) Clone

func (s Scanner) Clone() *Scanner

Clone returns a copy of the scanner at its current position. This is used for look-ahead parsing where we need to tentatively scan tokens without committing to consuming them.
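Because Clone has a value receiver, it is essentially a struct copy. The look-ahead pattern it enables can be sketched with a simplified cursor type (illustrative, not the real Scanner):

```go
package main

import "fmt"

// cursor is a simplified stand-in for Scanner: a position into a token slice.
type cursor struct {
	tokens []string
	pos    int
}

// clone copies the cursor by value; advancing the copy leaves the
// original untouched, just like Scanner.Clone.
func (c cursor) clone() *cursor { return &c }

func (c *cursor) next() string {
	t := c.tokens[c.pos]
	c.pos++
	return t
}

func main() {
	c := &cursor{tokens: []string{"CREATE", "PROCEDURE", "[code].Foo"}}
	// Tentatively scan ahead on a clone without consuming from c.
	look := c.clone()
	look.next()
	fmt.Println(look.next(), c.pos) // clone advanced twice; original still at 0
}
```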

func (*Scanner) File

func (s *Scanner) File() sqldocument.FileRef

func (*Scanner) NextNonWhitespaceCommentToken

func (s *Scanner) NextNonWhitespaceCommentToken() sqldocument.TokenType

NextNonWhitespaceCommentToken advances to the next token and then skips any whitespace and comments, returning the type of the first significant token.

func (*Scanner) NextNonWhitespaceToken

func (s *Scanner) NextNonWhitespaceToken() sqldocument.TokenType

NextNonWhitespaceToken advances to the next token and then skips any whitespace, returning the type of the first non-whitespace token.

func (*Scanner) NextToken

func (s *Scanner) NextToken() sqldocument.TokenType

NextToken scans the next token and advances the scanner's position.

This method wraps the raw tokenization with batch separator handling. The GO batch separator has special rules in T-SQL:

  • It must appear at the start of a line (only whitespace/comments before it)
  • Nothing except whitespace may follow it on the same line
  • It is not processed inside [names], 'strings', or /*comments*/

If GO is followed by non-whitespace on the same line, subsequent tokens are returned as MalformedBatchSeparatorToken until end of line.

Returns the TokenType of the scanned token.

func (*Scanner) ReservedWord

func (s *Scanner) ReservedWord() string

ReservedWord returns the lowercase reserved word if the current token is a ReservedWordToken, or an empty string otherwise.

func (*Scanner) SetFile

func (s *Scanner) SetFile(file sqldocument.FileRef)

func (*Scanner) SetInput

func (s *Scanner) SetInput(input []byte)

func (*Scanner) SkipWhitespace

func (s *Scanner) SkipWhitespace()

SkipWhitespace advances past any whitespace tokens. Stops when a non-whitespace token is encountered. Unlike SkipWhitespaceComments, this preserves comments.

func (*Scanner) SkipWhitespaceComments

func (s *Scanner) SkipWhitespaceComments()

SkipWhitespaceComments advances past any whitespace and comment tokens. Stops when a non-whitespace, non-comment token is encountered.

func (*Scanner) Start

func (s *Scanner) Start() sqldocument.Pos

Start returns the position where the current token begins. Line and column are 1-indexed.

func (*Scanner) Stop

func (s *Scanner) Stop() sqldocument.Pos

Stop returns the position where the current token ends. Line and column are 1-indexed.

func (*Scanner) Token

func (s *Scanner) Token() string

Token returns the text of the current token as a substring of the input buffer.

func (*Scanner) TokenLower

func (s *Scanner) TokenLower() string

TokenLower returns the current token text converted to lowercase. Useful for case-insensitive keyword matching.

func (*Scanner) TokenType

func (s *Scanner) TokenType() sqldocument.TokenType

TokenType returns the type of the current token.

type TSqlDocument

type TSqlDocument struct {
	sqldocument.Pragma
	// contains filtered or unexported fields
}

TSqlDocument represents a T-SQL source file.

The document contains:

  • creates: CREATE PROCEDURE/FUNCTION/TYPE statements with dependency info
  • declares: DECLARE statements for sqlcode constants (@Enum*, @Global*, @Const*)
  • errors: Syntax and semantic errors encountered during parsing
  • pragmaIncludeIf: Conditional compilation directives from --sqlcode:include-if

Parsing follows T-SQL batch semantics where batches are separated by GO. The first batch may contain DECLARE statements for constants. Subsequent batches contain CREATE statements for database objects.
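The batch structure can be sketched with a naive splitter (a simplification of the scanner's full GO handling, which also respects strings and comments):

```go
package main

import (
	"fmt"
	"strings"
)

// splitBatches divides T-SQL source into batches on lines consisting
// solely of GO.
func splitBatches(src string) []string {
	var batches []string
	var cur []string
	for _, line := range strings.Split(src, "\n") {
		if strings.EqualFold(strings.TrimSpace(line), "go") {
			batches = append(batches, strings.Join(cur, "\n"))
			cur = nil
			continue
		}
		cur = append(cur, line)
	}
	return append(batches, strings.Join(cur, "\n"))
}

func main() {
	src := "declare @EnumColorRed int = 1\nGO\ncreate procedure [code].Foo as select 1"
	b := splitBatches(src)
	fmt.Println(len(b)) // 2 batches: constants first, then a CREATE
}
```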

func (TSqlDocument) Creates

func (d TSqlDocument) Creates() []sqldocument.Create

func (TSqlDocument) Declares

func (d TSqlDocument) Declares() []sqldocument.Declare

func (TSqlDocument) Empty

func (d TSqlDocument) Empty() bool

func (TSqlDocument) Errors

func (d TSqlDocument) Errors() []sqldocument.Error

func (TSqlDocument) HasErrors

func (d TSqlDocument) HasErrors() bool

func (*TSqlDocument) Include

func (d *TSqlDocument) Include(other sqldocument.Document)

func (*TSqlDocument) Parse

func (d *TSqlDocument) Parse(input []byte, file sqldocument.FileRef) error

Parse processes a T-SQL source file from the given input.

Parsing proceeds in phases:

  1. Parse pragma comments at the file start (--sqlcode:...)
  2. Parse batches sequentially, separated by GO

The first batch has special rules: it may contain DECLARE statements for sqlcode constants. CREATE statements may appear in any batch, but procedures/functions must be alone in their batch (T-SQL requirement).

Errors are accumulated in the document rather than stopping parsing, allowing partial results even with syntax errors.

func (*TSqlDocument) Sort

func (d *TSqlDocument) Sort()
