lexer

package
v0.0.0-...-c3a0e23
Published: Sep 30, 2019 License: Unlicense Imports: 16 Imported by: 0

Documentation

Overview

Package lexer provides the tokenization functionality of the tootchain.

Index

Constants

const (
	// DefinitionPrecedenceHigh is the default value for a high precedence
	// definition
	DefinitionPrecedenceHigh = 10000

	// DefinitionPrecedenceMedium is the default value for a medium precedence
	// definition
	DefinitionPrecedenceMedium = 5000

	// DefinitionPrecedenceLow is the default value for a low precedence
	// definition
	DefinitionPrecedenceLow = 1000

	// DefinitionPrecedenceDefault is the default value for a definition
	// precedence
	//
	// Note that it is also the lowest possible precedence
	DefinitionPrecedenceDefault = 500
)

const (
	// LocationValueInvalid represents an invalid location value for either line
	// or column
	LocationValueInvalid = math.MaxUint32
)

Variables

var DefinitionRunePrecedenceSorter = DefinitionSortHandler(func(lhs, rhs Definition) bool {
	// Higher precedence sorts earlier; ties are broken by rune length so that
	// longer sequences are matched before their prefixes
	if lhs.Precedence() != rhs.Precedence() {
		return lhs.Precedence() > rhs.Precedence()
	}
	return len(lhs.Runes()) > len(rhs.Runes())
})

DefinitionRunePrecedenceSorter is the default sorting function for definitions: higher precedence sorts first, with longer rune sequences breaking ties

Functions

func WalkTokensWithVisitor

func WalkTokensWithVisitor(ctx context.Context, tokens []Token, visitor Visitor) error

WalkTokensWithVisitor performs a visit for every token in the slice

Types

type Buffer

type Buffer interface {
	// Append a rune to the buffer
	Write(Location, rune)

	// Returns all of the buffer contents
	Runes() []rune

	// Returns the location when the buffer was first written to
	StartLocation() Location

	// Should reset the buffer's internal state to empty
	Reset()

	// Should delete the last n runes from the buffer
	//
	// It should guard against deleting more runes than the buffer holds
	Delete(int)
}

Buffer is responsible for collecting the runes as they are read from the source input.

func NewBuffer

func NewBuffer() Buffer

NewBuffer is the default BufferProvider; every new Lexer uses it until SetBufferProvider is called.

type BufferProvider

type BufferProvider func() Buffer

BufferProvider is called every time the lexer needs a new buffer.
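
As an illustration, a provider can return a custom Buffer implementation. The sliceBuffer type below is a hypothetical sketch written against the Buffer interface shown above, not part of the package:

// sliceBuffer is a hypothetical Buffer backed by a plain rune slice.
type sliceBuffer struct {
	runes   []rune
	start   lexer.Location
	started bool
}

func (b *sliceBuffer) Write(loc lexer.Location, r rune) {
	if !b.started {
		b.start = loc // remember where the buffer was first written to
		b.started = true
	}
	b.runes = append(b.runes, r)
}

func (b *sliceBuffer) Runes() []rune                 { return b.runes }
func (b *sliceBuffer) StartLocation() lexer.Location { return b.start }

func (b *sliceBuffer) Reset() {
	b.runes, b.started = b.runes[:0], false
}

func (b *sliceBuffer) Delete(n int) {
	if n > len(b.runes) {
		n = len(b.runes) // guard against an over-delete
	}
	b.runes = b.runes[:len(b.runes)-n]
}

lex := lexer.New()
lex.SetBufferProvider(func() lexer.Buffer { return &sliceBuffer{} })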

type Definition

type Definition interface {
	// Returns the collection of runes that should be matched against
	Runes() []rune

	// Returns the importance of this definition
	//
	// Given definitions for `->` and `-`, assigning a precedence of 2 to `->`
	// and 1 to `-` allows the tokenizer to "look ahead" to see whether the next
	// character would let the higher-precedence definition match
	Precedence() int

	// Check if the definition is a "collecting" type, where runes are gathered
	// until a terminator is reached
	//
	// In the case of a line comment there is no explicit terminator; only a
	// newline ends the collection
	IsCollection() bool

	// Remove the matched runes from the beginning and end of the collection
	StripRunes() bool

	// Create and return a token for the given runes
	Token([]rune, Position) Token

	// IsBreaking reports whether this definition breaks the current value. The
	// default Definition implementation returns true for any `TokenTypeSyntax`
	IsBreaking() bool
}

Definition allows you to define your own kind of token
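
As a sketch, a hypothetical definition for the `->` operator might look like the following; the lexer is assumed to call these methods as documented above:

type arrowDefinition struct{}

func (arrowDefinition) Runes() []rune      { return []rune("->") }
func (arrowDefinition) Precedence() int    { return lexer.DefinitionPrecedenceHigh }
func (arrowDefinition) IsCollection() bool { return false }
func (arrowDefinition) StripRunes() bool   { return false }
func (arrowDefinition) IsBreaking() bool   { return false }

// Token builds an operator token from the matched runes.
func (arrowDefinition) Token(runes []rune, pos lexer.Position) lexer.Token {
	return lexer.Token{Type: lexer.TokenTypeOperator, Value: string(runes), Position: pos}
}

// given lex := lexer.New()
lex.RegisterDefinitions(arrowDefinition{})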

type DefinitionSortHandler

type DefinitionSortHandler func(lhs, rhs Definition) bool

DefinitionSortHandler is the sort function interface for sorting definitions.

It should return true if lhs should sort before rhs.

func (DefinitionSortHandler) Sort

func (fn DefinitionSortHandler) Sort(definitions []Definition)

Sort performs the sorting
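
For example, a hypothetical handler that orders definitions purely by rune length, longest first:

byLength := lexer.DefinitionSortHandler(func(lhs, rhs lexer.Definition) bool {
	// lhs sorts before rhs when it has more runes
	return len(lhs.Runes()) > len(rhs.Runes())
})
byLength.Sort(defs) // defs is a []lexer.Definition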

type Error

type Error struct {
	Position Position
	Message  string
}

Error represents an error encountered during lexing

func (*Error) Error

func (e *Error) Error() string

type Lexer

type Lexer struct {
	// contains filtered or unexported fields
}

Lexer turns source input into tokens using its registered definitions

func New

func New() *Lexer

New creates a new Lexer

func (*Lexer) Register

func (l *Lexer) Register(kind TokenType, runes []rune, precedence ...int)

Register adds a set of runes to be matched against for the given token type, with an optional precedence (see the DefinitionPrecedence constants)
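
For example, registering `->` with a higher precedence than `-` lets the lexer look ahead and prefer the longer operator (an illustrative sketch):

lex := lexer.New()
lex.Register(lexer.TokenTypeOperator, []rune("->"), lexer.DefinitionPrecedenceHigh)
lex.Register(lexer.TokenTypeOperator, []rune("-"), lexer.DefinitionPrecedenceLow)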

func (*Lexer) RegisterDefinitions

func (l *Lexer) RegisterDefinitions(def ...Definition)

RegisterDefinitions registers the set of definitions with the lexer

func (*Lexer) RegisterSingleRunes

func (l *Lexer) RegisterSingleRunes(kind TokenType, runes ...rune)

RegisterSingleRunes allows multiple single-character tokens to be registered for a single type in one call

lex.RegisterSingleRunes(lexer.TokenTypeSyntax, '(', ')', '{', '}')

func (*Lexer) Run

func (l *Lexer) Run(o Operation) ([]Token, error)

Run generates tokens from the given operation
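
A minimal end-to-end sketch. The stringOp type is hypothetical, not part of the package, and the snippet assumes imports of "fmt", "io", and "strings":

// stringOp is a trivial Operation backed by an in-memory string.
type stringOp struct {
	id, src string
}

func (o stringOp) SourceReader() io.Reader { return strings.NewReader(o.src) }
func (o stringOp) ID() string              { return o.id }
func (o stringOp) Prepare() error          { return nil } // nothing to open
func (o stringOp) Finish() error           { return nil } // nothing to release

lex := lexer.New()
lex.RegisterSingleRunes(lexer.TokenTypeSyntax, '(', ')')
tokens, err := lex.Run(stringOp{id: "example", src: "(x)"})
if err != nil {
	// handle the lexing error
}
for _, t := range tokens {
	fmt.Println(t.Type, t.Value, t.Position)
}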

func (*Lexer) RunWithContext

func (l *Lexer) RunWithContext(ctx context.Context, o Operation) utilities.Result

RunWithContext runs the operation asynchronously. The returned channel will receive either an OperationResult or an error

func (*Lexer) SetBufferProvider

func (l *Lexer) SetBufferProvider(p BufferProvider)

SetBufferProvider sets the lexer's buffer provider function.

func (*Lexer) SetEscape

func (l *Lexer) SetEscape(r rune)

SetEscape sets the escape character. The default is `\`

func (*Lexer) SetLogger

func (l *Lexer) SetLogger(lg *logger.Logger)

SetLogger assigns a new logger to the lexer

func (*Lexer) SetTerminator

func (l *Lexer) SetTerminator(r ...rune)

SetTerminator sets the list of terminator characters other than whitespace.

func (*Lexer) Wait

func (l *Lexer) Wait() error

Wait waits for all in-flight operations to finish

type Location

type Location struct {
	Line   uint32
	Column uint32
}

Location is a specific location in the source

func (Location) String

func (l Location) String() string

type OpenDefinition

type OpenDefinition interface {
	Definition

	// OpenEnded should be a no-op.
	//
	// It exists only as a marker so that OpenDefinition is distinct from
	// Definition
	OpenEnded()
}

OpenDefinition is an interface for a Definition whose collection ends only when the line ends

func CommentDefinition

func CommentDefinition(runes []rune) OpenDefinition

CommentDefinition returns a new OpenDefinition that will collect runes up to the end of the line
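
For example, to collect `//` line comments (illustrative):

lex.RegisterDefinitions(lexer.CommentDefinition([]rune("//")))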

type Operation

type Operation interface {
	// Returns the source code's reader.
	SourceReader() io.Reader

	// Returns an ID for the operation that will be appended to tokens.
	ID() string

	// Called before the operation begins. This is the time to open a file or
	// ensure that required resources are available.
	Prepare() error

	// Called when the operation is completed. This can be used to release any
	// resources retained by the operation.
	Finish() error
}

Operation represents a single lexer operation. This is usually a single file.
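
A hypothetical file-backed implementation might open the file in Prepare and release it in Finish (assumes imports of "io" and "os"):

type fileOp struct {
	path string
	f    *os.File
}

func (o *fileOp) SourceReader() io.Reader { return o.f }
func (o *fileOp) ID() string              { return o.path }

// Prepare opens the source file so SourceReader can stream it.
func (o *fileOp) Prepare() (err error) {
	o.f, err = os.Open(o.path)
	return err
}

// Finish releases the file handle once the operation completes.
func (o *fileOp) Finish() error { return o.f.Close() }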

type OperationResult

type OperationResult struct {
	// The ID of the operation passed in
	ID string

	// The tokens generated from the operation
	Tokens []Token
}

OperationResult contains the results of an operation

type Position

type Position struct {
	ID    string
	Range Range
}

Position represents where the token was found

type Range

type Range struct {
	Start Location
	End   Location
}

Range is a character range

type TerminatedDefinition

type TerminatedDefinition interface {
	Definition

	EndRunes() []rune
}

TerminatedDefinition is an interface a Definition can implement to support multiline comments.

It must return true from the IsCollection method; if it doesn't, the end runes will not be matched against.

func MultilineCommentDefinition

func MultilineCommentDefinition(open, close []rune) TerminatedDefinition

MultilineCommentDefinition returns a new TerminatedDefinition that will generate a multiline comment token.
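
For example, to collect C-style block comments (illustrative):

lex.RegisterDefinitions(lexer.MultilineCommentDefinition([]rune("/*"), []rune("*/")))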

type Token

type Token struct {
	Type     TokenType
	Value    string
	Position Position
}

Token represents a single tokenized unit of the source

func AwaitCompletion

func AwaitCompletion(c utilities.Result) ([]Token, error)

AwaitCompletion takes the channel returned by Lexer's RunWithContext function and waits for a signal. It will return the tokens or an error.
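
A sketch of the asynchronous flow, assuming a Lexer lex, an Operation value op as sketched elsewhere on this page, and an import of "context":

result := lex.RunWithContext(context.Background(), op)
tokens, err := lexer.AwaitCompletion(result)
if err != nil {
	// handle the lexing error
}
_ = tokens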

func (Token) Accept

func (t Token) Accept(v Visitor) error

Accept processes the visitor and calls the appropriate visitor function

type TokenType

type TokenType uint32

TokenType represents the type of the token

const (

	// TokenTypeValue represents a generic token value, such as a number or
	// other literal value
	TokenTypeValue TokenType

	// TokenTypeComment represents a comment token
	TokenTypeComment

	// TokenTypeKeyword represents a keyword like `let` or `const`
	TokenTypeKeyword

	// TokenTypeOperator represents an operator, e.g. `=`
	TokenTypeOperator

	// TokenTypeString represents a string type. The registered runes are the
	// open & close values
	TokenTypeString

	// TokenTypeTerminator represents a terminator character (see SetTerminator)
	TokenTypeTerminator

	// TokenTypeNewLine represents a newline
	TokenTypeNewLine

	// TokenTypeSyntax represents a syntax value, like an open or close
	// parenthesis
	TokenTypeSyntax
)

func (TokenType) String

func (t TokenType) String() string

type Visitor

type Visitor interface {
	VisitValue(Token) error

	VisitComment(Token) error

	VisitKeyword(Token) error

	VisitOperator(Token) error

	VisitString(Token) error

	VisitTerminator(Token) error

	VisitNewLine(Token) error

	VisitSyntax(Token) error
}

Visitor is a type that implements visit methods for tokens
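
A sketch of a visitor that counts keyword tokens and ignores everything else; keywordCounter is hypothetical, and tokens is assumed to come from a prior Run (import "context"):

type keywordCounter struct{ n int }

// VisitKeyword counts each keyword token; every other visit is a no-op.
func (c *keywordCounter) VisitKeyword(lexer.Token) error    { c.n++; return nil }
func (c *keywordCounter) VisitValue(lexer.Token) error      { return nil }
func (c *keywordCounter) VisitComment(lexer.Token) error    { return nil }
func (c *keywordCounter) VisitOperator(lexer.Token) error   { return nil }
func (c *keywordCounter) VisitString(lexer.Token) error     { return nil }
func (c *keywordCounter) VisitTerminator(lexer.Token) error { return nil }
func (c *keywordCounter) VisitNewLine(lexer.Token) error    { return nil }
func (c *keywordCounter) VisitSyntax(lexer.Token) error     { return nil }

counter := &keywordCounter{}
err := lexer.WalkTokensWithVisitor(context.Background(), tokens, counter)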
