parsers

package
v0.0.0-...-62694dd
Published: Sep 17, 2018 License: Apache-2.0 Imports: 6 Imported by: 2

Documentation

Overview

Package parsers is the main entry point for lexing and parsing source code for IDE and presentation purposes.

Constants

const Version = 1

Version is the global version of the parsers toolkit, which advances every time we make an incompatible change in one of the parsers.

Variables

var (
	// ErrUnsupportedLanguage indicates that the given language is not supported yet.
	ErrUnsupportedLanguage = errors.New("unsupported language")
)

Functions

func HasLexer

func HasLexer(lang lpb.Language) bool

HasLexer returns whether the language has a registered lexer.

func HasParser

func HasParser(lang lpb.Language) bool

HasParser returns whether the language has a registered parser.

func Lex

func Lex(ctx context.Context, lang lpb.Language, source string, listener LexerListener) error

Lex tokenizes "source" in the given language. It returns tokens from a unified set of token classes shared by all languages and can be used for syntax highlighting, detecting non-whitespace changes in a file, or enumerating all tokens of a certain class in a language-independent way.

We expect all lexers to be cheap and to process input at a rate of at least 100 MB/s. The function returns an error only if no lexer is registered for the language.
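
As a rough illustration of the "enumerating all tokens of a certain class" use case, the sketch below collects the text of every token of one class through a listener. It does not call this package: lpb.Language and tpb.TokenType are not defined on this page, so TokenType, its constants, toyLex, and tokensOfClass are hypothetical stand-ins. Only the listener shape (token type, offset, endoffset) is taken from the signature above.

```go
package main

import "fmt"

// TokenType is a hypothetical stand-in for tpb.TokenType.
type TokenType int

const (
	Identifier TokenType = iota + 1
	Number
	Punctuation
)

// LexerListener mirrors the documented listener shape.
type LexerListener func(t TokenType, offset, endoffset int)

func isLetter(c byte) bool { return c == '_' || ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z') }
func isDigit(c byte) bool  { return '0' <= c && c <= '9' }
func isSpace(c byte) bool  { return c == ' ' || c == '\t' || c == '\n' || c == '\r' }

// toyLex is a hypothetical ASCII-only lexer that exists only to drive the listener.
// It skips whitespace and reports every other token in input order.
func toyLex(source string, l LexerListener) {
	i := 0
	for i < len(source) {
		start := i
		c := source[i]
		switch {
		case isSpace(c):
			i++
		case isLetter(c):
			for i < len(source) && (isLetter(source[i]) || isDigit(source[i])) {
				i++
			}
			l(Identifier, start, i)
		case isDigit(c):
			for i < len(source) && isDigit(source[i]) {
				i++
			}
			l(Number, start, i)
		default:
			i++
			l(Punctuation, start, i)
		}
	}
}

// tokensOfClass returns the text of every token whose class equals want.
func tokensOfClass(source string, want TokenType) []string {
	var out []string
	toyLex(source, func(t TokenType, offset, endoffset int) {
		if t == want {
			out = append(out, source[offset:endoffset])
		}
	})
	return out
}

func main() {
	fmt.Println(tokensOfClass("x = y1 + 42;", Identifier)) // prints [x y1]
}
```

With the real package, the same closure would presumably be passed to Lex as the listener argument, with the token class taken from tpb.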

func Parse

func Parse(ctx context.Context, lang lpb.Language, source string, pl ParserListener, opts Options) error

Parse uses the parser registered for the given language to report all syntactic elements of "source". The caller is responsible for deciding which nodes to preserve and for building an AST. The function returns a nil error if parsing succeeds, i.e. there were no syntax errors or the parser was able to recover from all of them. Broken code is reported as nodes of the SyntaxProblem category.

The error handler supplied in opts.ShouldTryToRecover is called at every position where the parser encounters a syntax error or skips an unrecognized token.

Note: parsing is slower than lexing, but one can expect at least 10 MB/s of throughput (and usually no memory allocations) from this function.

func RegisterLexer

func RegisterLexer(l lpb.Language, lexer Lexer)

RegisterLexer adds the lexer implementation to the registry.

func RegisterParser

func RegisterParser(l lpb.Language, parser Parser)

RegisterParser adds the parser implementation to the registry.
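
The registration functions, together with HasLexer and Lex, suggest a lookup table keyed by language. The following sketch shows one way such a registry could be structured. It is not the package's actual implementation: the map, the mutex, LangA/LangB, wholeInputLexer, and countTokens are assumptions, and Language and TokenType are hypothetical stand-ins for lpb.Language and tpb.TokenType.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Language and TokenType are hypothetical stand-ins for lpb.Language and tpb.TokenType.
type Language int
type TokenType int

const (
	LangA Language = iota + 1
	LangB
)

type LexerListener func(t TokenType, offset, endoffset int)
type Lexer func(ctx context.Context, source string, l LexerListener)

var ErrUnsupportedLanguage = errors.New("unsupported language")

// The registry layout below (a map guarded by a mutex) is an assumption.
var (
	mu     sync.RWMutex
	lexers = map[Language]Lexer{}
)

func RegisterLexer(l Language, lexer Lexer) {
	mu.Lock()
	defer mu.Unlock()
	lexers[l] = lexer
}

func HasLexer(lang Language) bool {
	mu.RLock()
	defer mu.RUnlock()
	_, ok := lexers[lang]
	return ok
}

// Lex fails only when no lexer is registered for lang.
func Lex(ctx context.Context, lang Language, source string, listener LexerListener) error {
	mu.RLock()
	lexer, ok := lexers[lang]
	mu.RUnlock()
	if !ok {
		return ErrUnsupportedLanguage
	}
	lexer(ctx, source, listener)
	return nil
}

// wholeInputLexer is a trivial lexer that reports non-empty input as a single token.
func wholeInputLexer(ctx context.Context, source string, l LexerListener) {
	if len(source) > 0 {
		l(0, 0, len(source))
	}
}

// countTokens runs Lex and returns how many tokens the listener received.
func countTokens(lang Language, source string) (int, error) {
	n := 0
	err := Lex(context.Background(), lang, source, func(TokenType, int, int) { n++ })
	return n, err
}

func main() {
	RegisterLexer(LangA, wholeInputLexer)
	fmt.Println(HasLexer(LangA), HasLexer(LangB)) // prints true false
	_, err := countTokens(LangB, "x")
	fmt.Println(errors.Is(err, ErrUnsupportedLanguage)) // prints true
}
```

A parser registry could follow the same pattern with a second map keyed by language.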

Types

type ErrorHandler

type ErrorHandler func(err SyntaxError) bool

ErrorHandler is a function which receives all non-fatal parser errors and decides whether we should continue parsing. If it returns false, the last error gets returned as the main outcome from the parser.
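
A handler of this type can both record errors and bound how many the caller is willing to tolerate. The sketch below is one possible shape, under stated assumptions: the SyntaxError struct copies the fields documented on this page, the message format of Error is invented for the example, and limitErrors is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// SyntaxError copies the fields documented for the package's SyntaxError type.
type SyntaxError struct {
	Description string
	Line        int
	Offset      int
	Length      int
}

// Error's message format is an assumption made for this sketch.
func (se SyntaxError) Error() string {
	return fmt.Sprintf("line %d: %s", se.Line, se.Description)
}

// ErrorHandler mirrors the documented handler type.
type ErrorHandler func(err SyntaxError) bool

// limitErrors (hypothetical helper) returns a handler that appends each error
// to *sink and returns false once limit errors have been recorded, which asks
// the parser to stop.
func limitErrors(sink *[]SyntaxError, limit int) ErrorHandler {
	return func(err SyntaxError) bool {
		*sink = append(*sink, err)
		return len(*sink) < limit
	}
}

func main() {
	var errs []SyntaxError
	h := limitErrors(&errs, 2)
	fmt.Println(h(SyntaxError{Description: "unexpected '}'", Line: 3, Offset: 40, Length: 1}))
	fmt.Println(h(SyntaxError{Description: "unexpected end of input", Line: 9, Offset: 120}))
	fmt.Println(len(errs))
}
```

With the real package, the returned handler would presumably be assigned to the ShouldTryToRecover field of Options before calling Parse.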

type Lexer

type Lexer func(ctx context.Context, source string, l LexerListener)

Lexer is an actual implementation of the lexer for some language.

type LexerListener

type LexerListener func(t tpb.TokenType, offset, endoffset int)

LexerListener receives all non-whitespace tokens of the language (including comments) in the order of their appearance in the input string.

The reported ranges are non-empty and never overlap, though they may touch.

type Options

type Options struct {
	// Adds node.Punctuation and node.Keyword nodes to the parse tree.
	IncludeAllTokens bool

	// A callback function which decides whether the parser should try to recover and continue
	// parsing. Successful error recovery leads to one or more SyntaxProblem or InvalidToken nodes
	// in the tree.
	//
	// Never called on syntactically valid input.
	// Leave unset to disable error recovery.
	ShouldTryToRecover ErrorHandler
}

Options contains parameters that control parsing behavior.

type Parser

type Parser func(ctx context.Context, source string, l ParserListener, opts Options) error

Parser is an actual implementation of the parser for some language.

type ParserListener

type ParserListener func(t node.Type, offset, endoffset int)

ParserListener receives all parsed source ranges in left-to-right, parent-after-children order. Any two reported ranges either do not overlap or one contains the other.

For some types, the range can be empty, indicating the position for a node rather than the node itself (such as InsertedSemicolon). Empty nodes do not become parents of other empty nodes.

Note: errors reported via ShouldTryToRecover might be reported out of order with ParserListener, but SyntaxProblem nodes produced by error recovery will be delivered here as usual syntactic constructs.
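
Because children are reported before their parent and ranges nest, a caller can rebuild a tree with a stack of completed subtrees. The sketch below illustrates that idea only. NodeType and its constants are hypothetical stand-ins for node.Type, Node and treeBuilder are invented for the example, and the rule that an empty range never adopts children is one interpretation of the text above rather than documented behavior.

```go
package main

import "fmt"

// NodeType is a hypothetical stand-in for node.Type.
type NodeType int

const (
	Ident NodeType = iota + 1
	BinaryExpr
	Stmt
)

// Node is a hypothetical tree node built from listener events.
type Node struct {
	Type              NodeType
	Offset, Endoffset int
	Children          []*Node
}

// treeBuilder (hypothetical) holds completed subtrees that have no parent yet.
type treeBuilder struct {
	stack []*Node
}

// listen has the same shape as ParserListener, with NodeType in place of node.Type.
func (b *treeBuilder) listen(t NodeType, offset, endoffset int) {
	n := &Node{Type: t, Offset: offset, Endoffset: endoffset}
	if endoffset > offset {
		// Subtrees on top of the stack that start at or after offset lie
		// inside n, because they were reported earlier and ranges nest.
		i := len(b.stack)
		for i > 0 && b.stack[i-1].Offset >= offset {
			i--
		}
		n.Children = append(n.Children, b.stack[i:]...)
		b.stack = b.stack[:i]
	}
	b.stack = append(b.stack, n)
}

// describe renders a subtree as "type[offset:endoffset] (child) (child)".
func describe(n *Node) string {
	s := fmt.Sprintf("%d[%d:%d]", n.Type, n.Offset, n.Endoffset)
	for _, c := range n.Children {
		s += " (" + describe(c) + ")"
	}
	return s
}

func main() {
	var b treeBuilder
	// Events a parser might report for "a+b;" in parent-after-children order.
	b.listen(Ident, 0, 1)
	b.listen(Ident, 2, 3)
	b.listen(BinaryExpr, 0, 3)
	b.listen(Stmt, 0, 4)
	for _, root := range b.stack {
		fmt.Println(describe(root)) // prints 3[0:4] (2[0:3] (1[0:1]) (1[2:3]))
	}
}
```

A caller that needs only some node types could filter inside listen before pushing, which matches the note that the caller decides which nodes to preserve.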

type SyntaxError

type SyntaxError struct {
	Description string
	Line        int
	Offset      int
	Length      int
}

SyntaxError wraps low-level parsing errors and points to the first token that was not consumed by the parser.

func (SyntaxError) Error

func (se SyntaxError) Error() string
