lex

Version: v0.0.0-...-7459a62
Published: Feb 21, 2018 License: MIT Imports: 8 Imported by: 10

README

go-lex

This is a simple lexer, based loosely on text/template's lexer.

WARNING

This repository has been moved to github.com/lestrrat-go/lex. This repository exists so that libraries pointing to this URL will keep functioning, but this repository will NOT be updated in the future. Please use the new import path.

HOW TO USE

The lexing is done by chaining lex.LexFn functions. Create a StringLexer or a ReaderLexer, and pass it an entry point to start lexing. The result will be passed through a channel as a series of lex.Items:

l := lex.NewStringLexer(buf, lexStart)
go l.Run()

for item := range l.Items() {
   // Do whatever
}

In your lexing functions, do whatever processing is necessary, then return the next lexing function. If you are done and want the lexing to stop, return nil as the lex.LexFn.

func lexStart(l lex.Lexer) lex.LexFn {
  if !l.AcceptString("Hello") {
    l.EmitErrorf("expected 'Hello'")
    return nil
  }
  
  if !l.AcceptRun(" ") {
    l.EmitErrorf("expected space")
    return nil
  }
    
  return lexWorld
}

func lexWorld(l lex.Lexer) lex.LexFn {
  if !l.AcceptString("World") {
    l.EmitErrorf("expected 'World'")
    return nil
  }
  // In reality we should check for EOF, but for now, we just end processing
  return nil
}
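
The chaining shown above can be sketched without the library. The following is a minimal approximation that uses only the standard library. The names `miniLexer`, `stateFn`, `acceptString`, and `run` are hypothetical stand-ins introduced for illustration, not go-lex's own types, and the real package may behave differently (for example, it reports errors as items on a channel rather than in a slice).

```go
package main

import (
	"fmt"
	"strings"
)

// stateFn is a hypothetical stand-in for lex.LexFn: each state does some
// work and returns the next state, or nil to stop.
type stateFn func(*miniLexer) stateFn

// miniLexer is a hypothetical stand-in for a lexer. It is not go-lex's
// StringLexer; it only tracks a byte offset and collected error messages.
type miniLexer struct {
	input string
	pos   int
	errs  []string
}

// acceptString advances past word if the remaining input starts with it.
func (l *miniLexer) acceptString(word string) bool {
	if strings.HasPrefix(l.input[l.pos:], word) {
		l.pos += len(word)
		return true
	}
	return false
}

func lexStart(l *miniLexer) stateFn {
	if !l.acceptString("Hello") {
		l.errs = append(l.errs, "expected 'Hello'")
		return nil
	}
	if !l.acceptString(" ") {
		l.errs = append(l.errs, "expected space")
		return nil
	}
	return lexWorld
}

func lexWorld(l *miniLexer) stateFn {
	if !l.acceptString("World") {
		l.errs = append(l.errs, "expected 'World'")
	}
	return nil
}

// run drives the chain: call the current state until one returns nil.
func run(l *miniLexer, start stateFn) {
	for state := start; state != nil; {
		state = state(l)
	}
}

func main() {
	l := &miniLexer{input: "Hello World"}
	run(l, lexStart)
	fmt.Println(l.pos, l.errs)
}
```

The driver loop in `run` is the whole state machine: control flow lives in the return values of the state functions, so adding a new state means writing one more function and returning it from an existing one.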

Documentation

Overview

Package lex contains a lexer based on text/template from the main golang distribution. I'm a big fan of how that parser works, and have found that it suits my brain better than other forms of tokenization.

I found myself cutting and pasting this code a lot, so I decided to extract it into a generic library so I don't have to keep doing it over and over.

Constants

const EOF = -1

EOF is used to signal that we have reached EOF

Variables

var TypeNames = make(map[ItemType]string)

TypeNames contains the name of each ItemType. This is used for printing the values out in a human-readable format

Functions

func AcceptAny

func AcceptAny(l Lexer, valid string) bool

AcceptAny moves the cursor forward by 1 rune if the next rune is contained in the given string. This is a utility function to be called from concrete Lexer types

func AcceptRun

func AcceptRun(l Lexer, valid string) bool

AcceptRun takes a string, and moves the cursor forward as long as the input matches one of the given runes in the string. This is a utility function to be called from concrete Lexer types

func AcceptRunExcept

func AcceptRunExcept(l Lexer, valid string) bool

AcceptRunExcept takes a string, and moves the cursor forward as long as the input DOES NOT match one of the given runes in the string. This is a utility function to be called from concrete Lexer types
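
To make the difference between the three Accept helpers concrete, here is a small sketch over a plain string cursor. The `cursor` type and its methods are hypothetical and written only for illustration. In particular, returning true when the cursor moved at least one rune is an assumption about the return value, not something this documentation states.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// cursor is a hypothetical stand-in for a lexer's position in its input.
type cursor struct {
	input string
	pos   int // byte offset
}

// acceptAny advances one rune if the next rune is contained in valid.
func (c *cursor) acceptAny(valid string) bool {
	r, w := utf8.DecodeRuneInString(c.input[c.pos:])
	if w == 0 || !strings.ContainsRune(valid, r) {
		return false
	}
	c.pos += w
	return true
}

// acceptRun advances while the next rune is contained in valid.
// Assumption: it reports whether the cursor moved at all.
func (c *cursor) acceptRun(valid string) bool {
	start := c.pos
	for c.acceptAny(valid) {
	}
	return c.pos > start
}

// acceptRunExcept advances while the next rune is NOT contained in invalid.
func (c *cursor) acceptRunExcept(invalid string) bool {
	start := c.pos
	for c.pos < len(c.input) {
		r, w := utf8.DecodeRuneInString(c.input[c.pos:])
		if strings.ContainsRune(invalid, r) {
			break
		}
		c.pos += w
	}
	return c.pos > start
}

func main() {
	c := &cursor{input: "123abc;rest"}
	fmt.Println(c.acceptRun("0123456789"), c.pos) // digits: cursor stops at 'a'
	fmt.Println(c.acceptRunExcept(";"), c.pos)    // everything up to ';'
	fmt.Println(c.acceptAny(";"), c.pos)          // exactly one ';'
}
```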

func AcceptRunFunc

func AcceptRunFunc(l Lexer, fn func(rune) bool) bool

func AcceptString

func AcceptString(l Lexer, word string, rewind bool) (ok bool)

AcceptString returns true if the given string can be matched exactly. This is a utility function to be called from concrete Lexer types

func LexRun

func LexRun(l Lexer)

LexRun starts lexing using Lexer l, and a context Lexer ctx. "Context" in this case can be thought of as the concrete lexer, and l is the parent class. This is a utility function to be called from concrete Lexer types

func Mark

func Mark(format string, args ...interface{}) func()

Mark marks the begin/end of a function, and also indents the log output accordingly

func Trace

func Trace(format string, args ...interface{})

Trace outputs log if debug is enabled
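
The signature of Mark (it returns a `func()`) suggests the usual Go idiom of `defer Mark("name")()`. The sketch below shows that idiom with a hypothetical `tracer` type. The `enabled` flag, the START/END wording, and the two-space indent are all assumptions made for illustration; how the package actually enables debugging and formats its log is not described here.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// tracer is a hypothetical stand-in for the package-level Mark and Trace.
type tracer struct {
	out     io.Writer
	enabled bool // assumption: a simple switch for debug output
	depth   int  // current indentation level
}

// trace writes one indented log line if tracing is enabled.
func (t *tracer) trace(format string, args ...interface{}) {
	if !t.enabled {
		return
	}
	indent := strings.Repeat("  ", t.depth)
	fmt.Fprintf(t.out, "%s%s\n", indent, fmt.Sprintf(format, args...))
}

// mark logs the start of a function, indents, and returns a func that
// restores the indent and logs the end. It is meant to be deferred.
func (t *tracer) mark(format string, args ...interface{}) func() {
	name := fmt.Sprintf(format, args...)
	t.trace("START %s", name)
	t.depth++
	return func() {
		t.depth--
		t.trace("END %s", name)
	}
}

// lexNumber is a placeholder for a lexing function being traced.
func lexNumber(t *tracer) {
	defer t.mark("lexNumber")() // note the trailing (): mark runs now, its result runs on return
	t.trace("accepted %q", "42")
}

// runTraced runs lexNumber against an in-memory buffer and returns the log.
func runTraced() string {
	var buf strings.Builder
	lexNumber(&tracer{out: &buf, enabled: true})
	return buf.String()
}

func main() {
	fmt.Print(runTraced())
}
```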

Types

type Consumer

type Consumer interface {
	Peek() LexItem
	Consume() LexItem
	Backup()
	Backup2(LexItem)
}

Consumer is the base interface for things that consume items from a Lexer

type Item

type Item struct {
	// contains filtered or unexported fields
}

Item is the struct that gets generated upon finding *something*

func NewItem

func NewItem(t ItemType, pos int, line int, v string) Item

NewItem creates a new Item

func (Item) Line

func (l Item) Line() int

Line returns the line number in which this occurred

func (Item) Pos

func (l Item) Pos() int

Pos returns the associated position

func (Item) String

func (l Item) String() string

String returns the string representation of the Item

func (Item) Type

func (l Item) Type() ItemType

Type returns the associated ItemType

func (Item) Value

func (l Item) Value() string

Value returns the associated text value

type ItemConsume

type ItemConsume struct {
	// contains filtered or unexported fields
}

ItemConsume is a simple Consumer implementation.

func NewItemConsume

func NewItemConsume(l Lexer) *ItemConsume

NewItemConsume creates a new ItemConsume instance

func (*ItemConsume) Backup

func (c *ItemConsume) Backup()

Backup moves 1 item back

func (*ItemConsume) Backup2

func (c *ItemConsume) Backup2(t1 LexItem)

Backup2 pushes `t1` into the buffer, and moves 2 items back

func (*ItemConsume) Consume

func (c *ItemConsume) Consume() LexItem

Consume returns the next item, and consumes it.

func (*ItemConsume) Peek

func (c *ItemConsume) Peek() LexItem

Peek returns the next item, but does not consume it
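
The Peek, Consume, Backup, and Backup2 methods resemble the lookahead buffer used by the parser in text/template. The sketch below is modeled on that standard-library pattern. Whether ItemConsume is implemented the same way is an assumption, and the `lookahead` type, its string tokens, and `newLookahead` are hypothetical.

```go
package main

import "fmt"

// lookahead is a hypothetical consumer with a two-slot token buffer.
type lookahead struct {
	src       <-chan string
	token     [2]string // buffered tokens; token[0] is the most recent read
	peekCount int       // number of buffered tokens still pending
}

// newLookahead preloads a closed channel so the sketch needs no goroutine.
func newLookahead(items ...string) *lookahead {
	src := make(chan string, len(items))
	for _, s := range items {
		src <- s
	}
	close(src)
	return &lookahead{src: src}
}

// consume returns the next token, from the buffer if one is pending.
func (c *lookahead) consume() string {
	if c.peekCount > 0 {
		c.peekCount--
	} else {
		c.token[0] = <-c.src
	}
	return c.token[c.peekCount]
}

// peek returns the next token without consuming it.
func (c *lookahead) peek() string {
	if c.peekCount > 0 {
		return c.token[c.peekCount-1]
	}
	c.peekCount = 1
	c.token[0] = <-c.src
	return c.token[0]
}

// backup un-reads the most recently consumed token.
func (c *lookahead) backup() { c.peekCount++ }

// backup2 un-reads two tokens; t1 is the earlier one, which the buffer
// no longer holds and so must be passed back in.
func (c *lookahead) backup2(t1 string) {
	c.token[1] = t1
	c.peekCount = 2
}

func main() {
	c := newLookahead("a", "b", "c")
	fmt.Println(c.peek(), c.consume()) // both yield the first token
	first := c.consume()               // "b"
	c.consume()                        // "c"
	c.backup2(first)                   // rewind over "b" and "c"
	fmt.Println(c.consume(), c.consume())
}
```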

type ItemType

type ItemType int

ItemType describes the type of a LexItem

const (
	// ItemEOF is emitted upon EOF
	ItemEOF ItemType = iota
	// ItemError is emitted upon Error
	ItemError
	// ItemDefaultMax is used as marker for your own ItemType.
	// Start your types from this + 1
	ItemDefaultMax
)
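
As a sketch of the "start your types from this + 1" advice, the following mirrors the constants above with local, hypothetical names (`itemType`, `itemNumber`, `typeNames`, and so on) so that it runs on its own. With the real package the same pattern would presumably be written against lex.ItemDefaultMax and lex.TypeNames. That ItemType.String consults TypeNames is an assumption based on the TypeNames description.

```go
package main

import "fmt"

// itemType mirrors the shape of lex.ItemType for illustration only.
type itemType int

// Reserved values, mirroring ItemEOF, ItemError and ItemDefaultMax.
const (
	itemEOF itemType = iota
	itemError
	itemDefaultMax
)

// Application-defined types start just past the reserved range.
const (
	itemNumber itemType = itemDefaultMax + 1 + iota
	itemOperator
	itemWhitespace
)

// typeNames plays the role of lex.TypeNames in this sketch.
var typeNames = map[itemType]string{
	itemEOF:        "EOF",
	itemError:      "Error",
	itemNumber:     "Number",
	itemOperator:   "Operator",
	itemWhitespace: "Whitespace",
}

// String looks the name up and falls back to the numeric value.
func (t itemType) String() string {
	if name, ok := typeNames[t]; ok {
		return name
	}
	return fmt.Sprintf("itemType(%d)", int(t))
}

func main() {
	fmt.Println(itemNumber, itemOperator, itemType(42))
}
```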

func (ItemType) String

func (t ItemType) String() string

type LexFn

type LexFn func(Lexer) LexFn

LexFn defines the lexing function. It takes the lexer (i.e. StringLexer or ReaderLexer) as its argument. If you have no state, you can just use a regular function. Otherwise, use an object and a method bound to that object:

type Foo struct { ... }
func (f *Foo) lexFoo(l lex.Lexer) lex.LexFn {
  ...
}

src := "...."
f := &Foo{}
l := lex.NewStringLexer(src, f.lexFoo)
l.Run()
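
The reason a bound method works here is that a Go method value such as `f.lexFoo` is an ordinary function value that remembers its receiver. The sketch below demonstrates this with hypothetical stand-ins (`scanner`, `stateFn`, `parenCounter`, `run`) rather than the package's own types: the receiver carries state (nesting depth) from one state call to the next.

```go
package main

import "fmt"

// scanner and stateFn are hypothetical stand-ins for a lexer and lex.LexFn.
type scanner struct {
	input string
	pos   int
}

type stateFn func(*scanner) stateFn

// parenCounter holds state that must survive across state calls.
type parenCounter struct {
	depth    int
	maxDepth int
}

// lexAny consumes one byte, updates the counters, and returns itself
// as a method value bound to the same receiver.
func (p *parenCounter) lexAny(s *scanner) stateFn {
	if s.pos >= len(s.input) {
		return nil
	}
	switch s.input[s.pos] {
	case '(':
		p.depth++
		if p.depth > p.maxDepth {
			p.maxDepth = p.depth
		}
	case ')':
		p.depth--
	}
	s.pos++
	return p.lexAny
}

// run calls states until one returns nil.
func run(s *scanner, start stateFn) {
	for state := start; state != nil; {
		state = state(s)
	}
}

func main() {
	p := &parenCounter{}
	run(&scanner{input: "(a(b)c)"}, p.lexAny)
	fmt.Println(p.depth, p.maxDepth)
}
```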

type LexItem

type LexItem interface {
	Type() ItemType
	Pos() int
	Line() int
	Value() string
}

LexItem defines the interface for items emitted by the Lexer

type Lexer

type Lexer interface {
	Run()
	GetEntryPoint() LexFn
	Current() rune
	Next() rune
	Peek() rune
	Backup()
	PeekString(string) bool
	AcceptAny(string) bool
	AcceptString(string) bool
	AcceptRun(string) bool
	AcceptRunFunc(func(r rune) bool) bool
	AcceptRunExcept(string) bool
	EmitErrorf(string, ...interface{}) LexFn
	Emit(ItemType)
	Items() chan LexItem
	BufferString() string
	NextItem() LexItem
}

Lexer defines the interface for Lexers

Example
c := &testLexCtx{}
l := NewStringLexer("1 + 1", c.lexStart)
go l.Run()

for item := range l.Items() {
	// Do your processing here
	_ = item
}

type ReaderLexer

type ReaderLexer struct {
	// contains filtered or unexported fields
}

ReaderLexer lexes input from an io.Reader instance

func NewReaderLexer

func NewReaderLexer(in io.Reader, fn LexFn) *ReaderLexer

NewReaderLexer creates a ReaderLexer
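
For readers wondering how rune-level Next and Peek can work on top of an io.Reader, one common approach is to wrap the reader in a bufio.Reader and use ReadRune and UnreadRune. The sketch below shows that approach with a hypothetical `readerCursor` type. It is not a description of how ReaderLexer is actually implemented.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// eof mirrors the package's EOF constant (-1) for this sketch.
const eof rune = -1

// readerCursor is a hypothetical rune cursor over an io.Reader.
type readerCursor struct {
	rd *bufio.Reader
}

func newReaderCursor(in io.Reader) *readerCursor {
	return &readerCursor{rd: bufio.NewReader(in)}
}

// fromString is a convenience constructor for in-memory input.
func fromString(s string) *readerCursor {
	return newReaderCursor(strings.NewReader(s))
}

// next reads one rune, or returns eof when the reader is exhausted or fails.
func (c *readerCursor) next() rune {
	r, _, err := c.rd.ReadRune()
	if err != nil {
		return eof
	}
	return r
}

// peek reads one rune and immediately un-reads it.
func (c *readerCursor) peek() rune {
	r, _, err := c.rd.ReadRune()
	if err != nil {
		return eof
	}
	// UnreadRune is valid here because the last operation was ReadRune.
	_ = c.rd.UnreadRune()
	return r
}

func main() {
	c := fromString("ab")
	fmt.Println(string(c.peek()), string(c.next()), string(c.next()), c.next() == eof)
}
```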

func (*ReaderLexer) AcceptAny

func (l *ReaderLexer) AcceptAny(valid string) bool

AcceptAny takes a string which contains a set of runes that can be accepted. This method moves the cursor 1 rune if the rune is contained in the given string.

func (*ReaderLexer) AcceptRun

func (l *ReaderLexer) AcceptRun(valid string) bool

AcceptRun takes a string, and moves the cursor forward as long as the input matches one of the given runes in the string

func (*ReaderLexer) AcceptRunExcept

func (l *ReaderLexer) AcceptRunExcept(valid string) bool

AcceptRunExcept takes a string, and moves the cursor forward as long as the input DOES NOT match one of the given runes in the string

func (*ReaderLexer) AcceptRunFunc

func (l *ReaderLexer) AcceptRunFunc(fn func(r rune) bool) bool

AcceptRunFunc takes a function, and moves the cursor forward as long as the function returns true

func (*ReaderLexer) AcceptString

func (l *ReaderLexer) AcceptString(word string) bool

AcceptString returns true if the given string can be matched exactly. This is a utility function to be called from concrete Lexer types

func (*ReaderLexer) Backup

func (l *ReaderLexer) Backup()

Backup moves the cursor back 1 position

func (*ReaderLexer) BufferString

func (l *ReaderLexer) BufferString() (str string)

BufferString returns the current buffer

func (*ReaderLexer) Current

func (l *ReaderLexer) Current() (r rune)

Current returns current rune being considered

func (*ReaderLexer) Emit

func (l *ReaderLexer) Emit(t ItemType)

Emit creates and sends a new Item of type `t` through the output channel. The Item is generated using `Grab`

func (*ReaderLexer) EmitErrorf

func (l *ReaderLexer) EmitErrorf(format string, args ...interface{}) LexFn

EmitErrorf emits an Error Item

func (*ReaderLexer) GetEntryPoint

func (l *ReaderLexer) GetEntryPoint() LexFn

GetEntryPoint returns the function that lexing is started with

func (*ReaderLexer) Grab

func (l *ReaderLexer) Grab(t ItemType) Item

Grab creates a new Item of type `t`. The value in the item is created from the position of the last read item to current cursor position

func (*ReaderLexer) Items

func (l *ReaderLexer) Items() chan LexItem

Items returns the channel where lex'ed Item structs are sent to

func (*ReaderLexer) Next

func (l *ReaderLexer) Next() (r rune)

Next returns the next rune

func (*ReaderLexer) NextItem

func (l *ReaderLexer) NextItem() LexItem

NextItem returns the next Item in the processing pipeline. This is just a convenience function over reading l.Items()

func (*ReaderLexer) Peek

func (l *ReaderLexer) Peek() (r rune)

Peek returns the next rune, but does not move the position

func (*ReaderLexer) PeekString

func (l *ReaderLexer) PeekString(word string) bool

PeekString returns true if the given string can be matched exactly, but does not move the position

func (*ReaderLexer) Run

func (l *ReaderLexer) Run()

Run starts the lexing. You should be calling this method as a goroutine:

lexer := lex.NewStringLexer(...)
go lexer.Run()
for item := range lexer.Items() {
  ...
}

type StringLexer

type StringLexer struct {
	// contains filtered or unexported fields
}

StringLexer is an implementation of Lexer interface, which lexes contents in a string

func NewStringLexer

func NewStringLexer(input string, fn LexFn) *StringLexer

NewStringLexer creates a new StringLexer instance. This lexer can be used only once per input string. Do not try to reuse it

func (*StringLexer) AcceptAny

func (l *StringLexer) AcceptAny(valid string) bool

AcceptAny takes a string, and moves the cursor 1 rune if the rune is contained in the given string

func (*StringLexer) AcceptRun

func (l *StringLexer) AcceptRun(valid string) bool

AcceptRun takes a string, and moves the cursor forward as long as the input matches one of the given runes in the string

func (*StringLexer) AcceptRunExcept

func (l *StringLexer) AcceptRunExcept(valid string) bool

AcceptRunExcept takes a string, and moves the cursor forward as long as the input DOES NOT match one of the given runes in the string

func (*StringLexer) AcceptRunFunc

func (l *StringLexer) AcceptRunFunc(fn func(r rune) bool) bool

AcceptRunFunc takes a function, and moves the cursor forward as long as the function returns true

func (*StringLexer) AcceptString

func (l *StringLexer) AcceptString(word string) bool

AcceptString returns true if the given string can be matched exactly. This is a utility function to be called from concrete Lexer types

func (*StringLexer) AdvanceCursor

func (l *StringLexer) AdvanceCursor(n int)

AdvanceCursor advances the cursor position by `n`

func (*StringLexer) Backup

func (l *StringLexer) Backup()

Backup moves the cursor position back (by as many bytes as the last read rune)
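
Backing up by the byte width of the last rune matters for non-ASCII input, where a rune can occupy more than one byte. A minimal sketch of the idea, using a hypothetical `runeCursor` type that records the width of the last decoded rune (the approach text/template's lexer has used), could look like this:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// eof mirrors the package's EOF constant (-1) for this sketch.
const eof rune = -1

// runeCursor is a hypothetical stand-in, not go-lex's StringLexer.
type runeCursor struct {
	input string
	pos   int // byte offset of the cursor
	width int // byte width of the rune returned by the last call to next
}

// next decodes one rune and advances by its byte width.
func (c *runeCursor) next() rune {
	if c.pos >= len(c.input) {
		c.width = 0
		return eof
	}
	r, w := utf8.DecodeRuneInString(c.input[c.pos:])
	c.width = w
	c.pos += w
	return r
}

// backup undoes the last next. It is only valid once per call to next.
func (c *runeCursor) backup() { c.pos -= c.width }

// peek returns the next rune without moving the cursor.
func (c *runeCursor) peek() rune {
	r := c.next()
	c.backup()
	return r
}

func main() {
	c := &runeCursor{input: "héllo"}
	c.next()      // 'h' occupies 1 byte
	r := c.next() // 'é' occupies 2 bytes
	fmt.Println(string(r), c.pos)
	c.backup() // steps back 2 bytes, not 1
	fmt.Println(c.pos, string(c.peek()))
}
```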

func (*StringLexer) BufferString

func (l *StringLexer) BufferString() string

BufferString returns the string between LastCursor and Cursor

func (*StringLexer) Current

func (l *StringLexer) Current() (r rune)

Current returns the current rune being considered

func (*StringLexer) Cursor

func (l *StringLexer) Cursor() int

Cursor returns the current cursor position

func (*StringLexer) Emit

func (l *StringLexer) Emit(t ItemType)

Emit creates and sends a new Item of type `t` through the output channel. The Item is generated using `Grab`

func (*StringLexer) EmitErrorf

func (l *StringLexer) EmitErrorf(format string, args ...interface{}) LexFn

EmitErrorf emits an Error Item

func (*StringLexer) GetEntryPoint

func (l *StringLexer) GetEntryPoint() LexFn

GetEntryPoint returns the function that lexing is started with

func (*StringLexer) Grab

func (l *StringLexer) Grab(t ItemType) Item

Grab creates a new Item of type `t`. The value in the item is created from the position of the last read item to current cursor position
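
The relationship between Grab, Cursor, and LastCursor can be illustrated with a small sketch. The `span` and `token` types below are hypothetical. Recording the start offset as the item's position is an assumption, since this documentation does not say which offset Pos reports.

```go
package main

import "fmt"

// token is a hypothetical stand-in for lex.Item.
type token struct {
	kind  string
	pos   int
	value string
}

// span tracks the two offsets that delimit the pending item text.
type span struct {
	input      string
	lastCursor int // where the previous grab ended
	cursor     int // current position
}

// grab slices out the text between lastCursor and cursor, then moves
// lastCursor up so the next item starts where this one ended.
func (s *span) grab(kind string) token {
	t := token{
		kind:  kind,
		pos:   s.lastCursor, // assumption: position is the start offset
		value: s.input[s.lastCursor:s.cursor],
	}
	s.lastCursor = s.cursor
	return t
}

func main() {
	s := &span{input: "1 + 1"}
	s.cursor = 1 // pretend a lexing function accepted the first digit
	fmt.Printf("%+v\n", s.grab("number"))
	s.cursor = 2 // then the space
	fmt.Printf("%+v\n", s.grab("space"))
	s.cursor = 3 // then the operator
	fmt.Printf("%+v\n", s.grab("operator"))
}
```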

func (*StringLexer) Items

func (l *StringLexer) Items() chan LexItem

Items returns the channel where lex'ed Item structs are sent to

func (*StringLexer) LastCursor

func (l *StringLexer) LastCursor() int

LastCursor returns the end position of the last Grab

func (*StringLexer) Next

func (l *StringLexer) Next() (r rune)

Next returns the next rune

func (*StringLexer) NextItem

func (l *StringLexer) NextItem() LexItem

NextItem returns the next Item in the processing pipeline. This is just a convenience function over reading l.Items()

func (*StringLexer) Peek

func (l *StringLexer) Peek() (r rune)

Peek returns the next rune, but does not move the position

func (*StringLexer) PeekString

func (l *StringLexer) PeekString(word string) bool

PeekString returns true if the given string can be matched exactly, but does not move the position

func (*StringLexer) PrevByte

func (l *StringLexer) PrevByte() byte

PrevByte returns the previous byte (l.Cursor - 1)

func (*StringLexer) RemainingString

func (l *StringLexer) RemainingString() string

RemainingString returns the string starting at the current cursor

func (*StringLexer) Run

func (l *StringLexer) Run()

Run starts the lexing. You should be calling this method as a goroutine:

lexer := lex.NewStringLexer(...)
go lexer.Run()
for item := range lexer.Items() {
  ...
}
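
Why a goroutine? If items are delivered over a channel, the producer and the consumer have to run concurrently, otherwise the first send has nobody to receive it. The sketch below shows the pattern with hypothetical names (`produce`, `collect`, `token`). The unbuffered channel and the close-on-finish behaviour are assumptions for illustration, although ranging over Items() as shown above does require the channel to be closed at some point.

```go
package main

import (
	"fmt"
	"strings"
)

// token is a hypothetical stand-in for a lexed item.
type token struct {
	kind  string
	value string
}

// produce is a hypothetical stand-in for Run: it sends tokens on the
// channel and closes it when the input is exhausted.
func produce(input string, items chan<- token) {
	defer close(items) // closing ends the consumer's range loop
	for _, field := range strings.Fields(input) {
		kind := "number"
		if field == "+" {
			kind = "operator"
		}
		items <- token{kind: kind, value: field}
	}
}

// collect starts the producer in a goroutine and drains the channel.
func collect(input string) []token {
	items := make(chan token) // unbuffered: each send waits for a receiver
	go produce(input, items)  // without "go", the first send would block forever
	var out []token
	for it := range items {
		out = append(out, it)
	}
	return out
}

func main() {
	for _, it := range collect("1 + 1") {
		fmt.Printf("%s %q\n", it.kind, it.value)
	}
}
```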
