comb

v0.0.0-...-018e9ba
Published: Mar 28, 2017 License: MIT Imports: 4 Imported by: 0

README

comb parser combinator framework

comb is an efficient parser combinator framework.

comb offers various useful parsers, including terminals such as characters, character ranges, tokens, and regular expressions. It also includes combinators for sequences, repetitions, and optional parsers.

For example, here is a parser that recognizes the decimal, octal, and hex integers accepted in Go.

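var integerParser = comb.SequenceRunes(
    // Optional leading minus sign.
    comb.Maybe(
        comb.Char('-'),
    ),
    comb.Or(
        // Hex: a 0x or 0X prefix followed by one or more hex digits.
        comb.SequenceRunes(
            comb.Token("0x", "0X"),
            comb.OnePlusRunes(
                comb.Or(
                    comb.CharRange('a', 'f'),
                    comb.CharRange('A', 'F'),
                    combext.Digit(),
                ),
            ),
        ),
        // Decimal and octal: a plain run of digits.
        combext.Digits(),
    ),
)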

(Though, this is more succinctly expressed with a regular expression, giving a moderate performance gain.)
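
As a rough illustration of the regular-expression form, the sketch below uses only the standard regexp package rather than comb.Regexp, so it works on strings instead of a comb Scanner. The pattern and the names integerPattern and matchInteger are assumptions made for this example; it is meant to accept the decimal, octal, and hex literals described above.

```go
package main

import (
	"fmt"
	"regexp"
)

// integerPattern accepts an optional minus sign, then either a hex literal
// (0x or 0X followed by hex digits) or a plain run of digits.
var integerPattern = regexp.MustCompile(`^-?(?:0[xX][0-9a-fA-F]+|[0-9]+)`)

// matchInteger returns the integer literal at the start of s, or "" if
// s does not begin with one.
func matchInteger(s string) string {
	return integerPattern.FindString(s)
}

func main() {
	for _, in := range []string{"-0x1F+2", "0755 ", "42", "x9"} {
		fmt.Printf("%q -> %q\n", in, matchInteger(in))
	}
}
```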

comb uses a scanner which traverses a rune slice. All builtin parsers return results that are simply slices of the original data, keeping copying to a minimum.
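
To see why returning slices of the original data keeps copying low, here is a small sketch in plain Go with no comb types involved. The helper name capture is hypothetical; the point is only that a sub-slice shares its backing array with the input.

```go
package main

import "fmt"

// capture returns the runes in [start, end) as a view into input.
// Nothing is copied: the result shares input's backing array.
func capture(input []rune, start, end int) []rune {
	return input[start:end]
}

func main() {
	input := []rune("let x = 42")
	word := capture(input, 0, 3)
	fmt.Println(string(word)) // let

	// Both slices point at the same memory, which is what makes the
	// capture cheap. It also means mutating the input changes the result.
	fmt.Println(&word[0] == &input[0]) // true
}
```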

The combext package offers other general-use parsers (such as alpha-numeric characters, whitespace, etc) that may be frequently needed, though not always used.

Examples

In the _examples directory, you can find examples of comb in use, including a recursive expression calculator.

Documentation

Overview

Package comb is the comb parser combinator framework.

Other libraries

comb takes inspiration from the following Go parser combinator libraries:

• jmikkola/parsego (https://github.com/jmikkola/parsego)

• prataprc/goparsec (https://github.com/prataprc/goparsec)

I tried both of these while writing a parser for another project. Both offer reasonably good usability and performance, but not enough for my liking. The changes I wanted to make seemed too breaking to turn into a reasonable PR, so here we are.

Types

type Parser

type Parser interface {
	Parse(s Scanner) (r Result, next Scanner)
}

Parser describes comb parsers, which take a scanner, scan some amount of text, then return a result and the next scanner.

func AnyChar

func AnyChar() Parser

AnyChar accepts any single character.

func Char

func Char(chars ...rune) Parser

Char accepts a single character that matches one of the given chars.

func CharRange

func CharRange(from, to rune) Parser

CharRange accepts chars in an inclusive range.

func CharsIn

func CharsIn(s string) Parser

CharsIn accepts any of the chars in a given string.

func EOF

func EOF() Parser

EOF matches only at EOF.

func Ignore

func Ignore(parser Parser) Parser

Ignore sets the result of a Parser to be Ignored.

func Many

func Many(combiner ResultCombiner, parser Parser) Parser

Many looks for a series of 0+ matches of a parser, then combines the results with a combiner. If combiner is nil, SliceCombiner is used.

If you only need the runes captured by Many, use ManyRunes instead.

func ManyRunes

func ManyRunes(parser Parser) Parser

ManyRunes looks for a series of 0+ matches of a parser, then returns the runes captured.

func Maybe

func Maybe(parser Parser) Parser

Maybe tries a parser and returns its result if it matches, otherwise, it returns an empty result and the original scanner.

func NotChar

func NotChar(chars ...rune) Parser

NotChar accepts any single character that is not one of the given chars.

func OnePlus

func OnePlus(combiner ResultCombiner, parser Parser) Parser

OnePlus looks for a series of 1+ matches of a parser, then combines the results with a combiner.

If you only need the runes captured by OnePlus, use OnePlusRunes instead.

func OnePlusRunes

func OnePlusRunes(parser Parser) Parser

OnePlusRunes looks for a series of 1+ matches of a parser, then returns the runes captured.

func Or

func Or(parsers ...Parser) Parser

Or checks parsers in order, returning the first match.

func OrLongest

func OrLongest(parsers ...Parser) Parser

OrLongest is like Or, but returns the result of the parser that captures the most text. Ties are broken by taking the first result. In order to do this, *every* parser will be run, so keep that in mind.
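
The difference between ordered choice and longest choice can be sketched with a deliberately simplified model. The matcher type below is an assumption for illustration (it returns a match length instead of a Result and Scanner), and token, or, and orLongest are hypothetical helpers, not comb's implementation.

```go
package main

import "fmt"

// matcher is a simplified stand-in for a parser: it returns how many runes
// it consumed from the start of input, or -1 on failure.
type matcher func(input []rune) int

// token matches the literal t at the start of input.
func token(t string) matcher {
	want := []rune(t)
	return func(input []rune) int {
		if len(input) < len(want) {
			return -1
		}
		for i, r := range want {
			if input[i] != r {
				return -1
			}
		}
		return len(want)
	}
}

// or stops at the first matcher that succeeds.
func or(ms ...matcher) matcher {
	return func(input []rune) int {
		for _, m := range ms {
			if n := m(input); n >= 0 {
				return n
			}
		}
		return -1
	}
}

// orLongest runs every matcher and keeps the longest match.
// The strict > comparison means an earlier matcher wins a tie.
func orLongest(ms ...matcher) matcher {
	return func(input []rune) int {
		best := -1
		for _, m := range ms {
			if n := m(input); n > best {
				best = n
			}
		}
		return best
	}
}

func main() {
	input := []rune("<=1")
	fmt.Println(or(token("<"), token("<="))(input))        // 1
	fmt.Println(orLongest(token("<"), token("<="))(input)) // 2
}
```

With ordered choice the shorter token listed first shadows the longer one, so the order of alternatives matters. Longest choice removes that concern, at the cost of running every alternative.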

func ParserFunc

func ParserFunc(fn func(Scanner) (Result, Scanner)) Parser

ParserFunc turns a parser function into a Parser.
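
A function adapter like this is commonly written in the style of http.HandlerFunc. The sketch below shows one way it could work. Scanner and Result here are simplified stand-ins with made-up fields, and parserFunc and letterA are hypothetical names; comb's real definitions may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for comb's types (assumptions, not the real definitions).
type Scanner struct {
	input []rune
	pos   int
}

type Result struct {
	Err   error
	Runes []rune
}

type Parser interface {
	Parse(s Scanner) (Result, Scanner)
}

// parserFunc is a function type with a Parse method, so any function with
// the right signature can be used wherever a Parser is expected.
type parserFunc func(Scanner) (Result, Scanner)

func (f parserFunc) Parse(s Scanner) (Result, Scanner) { return f(s) }

// ParserFunc mirrors the documented signature.
func ParserFunc(fn func(Scanner) (Result, Scanner)) Parser { return parserFunc(fn) }

// letterA is a hand-written parser that accepts exactly one 'a'.
var letterA = ParserFunc(func(s Scanner) (Result, Scanner) {
	if s.pos < len(s.input) && s.input[s.pos] == 'a' {
		next := Scanner{input: s.input, pos: s.pos + 1}
		return Result{Runes: s.input[s.pos:next.pos]}, next
	}
	// On failure, hand back the original scanner so callers can backtrack.
	return Result{Err: errors.New("expected 'a'")}, s
})

func main() {
	r, next := letterA.Parse(Scanner{input: []rune("abc")})
	fmt.Println(string(r.Runes), next.pos, r.Err)
}
```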

func Reference

func Reference(p *Parser) Parser

Reference takes a pointer to a Parser, and only dereferences it when Parse is called.
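
Deferring the dereference is what makes recursive grammars possible: a parser can refer to itself before it has been assigned. The following sketch builds the grammar expr = '(' expr ')' | 'x' on top of a tiny stand-in framework. Everything except the Reference signature is an assumption for illustration (the Scanner and Result fields, and the helpers char, seq, or, and newNested).

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for comb's types (assumptions, not the real definitions).
type Scanner struct {
	input []rune
	pos   int
}

type Result struct {
	Err   error
	Runes []rune
}

type Parser interface {
	Parse(s Scanner) (Result, Scanner)
}

type parserFunc func(Scanner) (Result, Scanner)

func (f parserFunc) Parse(s Scanner) (Result, Scanner) { return f(s) }

// Reference stores only the pointer; *p is read when Parse runs, so p may
// still point at a nil Parser while the grammar is being built.
func Reference(p *Parser) Parser {
	return parserFunc(func(s Scanner) (Result, Scanner) { return (*p).Parse(s) })
}

// char accepts exactly the rune c (hypothetical helper).
func char(c rune) Parser {
	return parserFunc(func(s Scanner) (Result, Scanner) {
		if s.pos < len(s.input) && s.input[s.pos] == c {
			return Result{Runes: s.input[s.pos : s.pos+1]}, Scanner{s.input, s.pos + 1}
		}
		return Result{Err: errors.New("unexpected input")}, s
	})
}

// seq runs parsers in order and returns the runes they span (hypothetical helper).
func seq(ps ...Parser) Parser {
	return parserFunc(func(s Scanner) (Result, Scanner) {
		cur := s
		for _, p := range ps {
			r, next := p.Parse(cur)
			if r.Err != nil {
				return r, s
			}
			cur = next
		}
		return Result{Runes: s.input[s.pos:cur.pos]}, cur
	})
}

// or returns the first successful alternative (hypothetical helper).
func or(ps ...Parser) Parser {
	return parserFunc(func(s Scanner) (Result, Scanner) {
		for _, p := range ps {
			if r, next := p.Parse(s); r.Err == nil {
				return r, next
			}
		}
		return Result{Err: errors.New("no alternative matched")}, s
	})
}

// newNested builds the grammar: expr = '(' expr ')' | 'x'.
func newNested() Parser {
	var expr Parser
	expr = or(seq(char('('), Reference(&expr), char(')')), char('x'))
	return expr
}

func main() {
	r, _ := newNested().Parse(Scanner{input: []rune("((x))!")})
	fmt.Println(string(r.Runes), r.Err)
}
```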

func Regexp

func Regexp(pattern string) Parser

Regexp compiles a Go regexp into a parser. If the pattern does not begin with ^, one will be added, as the parser must begin with the next rune.
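
The anchoring requirement can be illustrated with the standard regexp package alone. This is a sketch under assumptions, not comb's implementation: the helpers anchored and matchAt are hypothetical, and the sketch wraps the pattern in a non-capturing group so that the added ^ applies to every alternative.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// anchored compiles pattern so that a match can only start at the current
// position. Wrapping in (?:...) keeps the ^ from binding to just the first
// alternative of a pattern such as `a|b`.
func anchored(pattern string) *regexp.Regexp {
	if !strings.HasPrefix(pattern, "^") {
		pattern = "^(?:" + pattern + ")"
	}
	return regexp.MustCompile(pattern)
}

// matchAt returns the text matched at the very start of rest, if any.
func matchAt(re *regexp.Regexp, rest string) (string, bool) {
	loc := re.FindStringIndex(rest)
	if loc == nil || loc[0] != 0 {
		return "", false
	}
	return rest[:loc[1]], true
}

func main() {
	re := anchored(`[0-9]+`)

	got, ok := matchAt(re, "123abc")
	fmt.Printf("%q %v\n", got, ok)

	got, ok = matchAt(re, "abc123")
	fmt.Printf("%q %v\n", got, ok)
}
```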

func Sequence

func Sequence(combiner ResultCombiner, parsers ...Parser) Parser

Sequence runs multiple parsers in a sequence, combining results with a combiner function. If combiner is nil, then SliceCombiner is used. Sequence must allocate a slice of results whose length equals the number of parsers given.

If you only need the runes captured by Sequence, use SequenceRunes instead.

func SequenceRunes

func SequenceRunes(parsers ...Parser) Parser

SequenceRunes is like Sequence, but does not capture all results, instead returning the runes between the start and end of the matching region. Unlike Sequence, this does not allocate anything. SequenceRunes does not read any results, just the returned scanners, so cannot respect the Ignored option.

func Surround

func Surround(left, parser, right Parser) Parser

Surround surrounds a parser with two parsers, and returns the surrounded value. This is equivalent to Sequence with a combiner which returns the middle result.

func Tag

func Tag(tag string, parser Parser) Parser

Tag sets the tag of a parser's result.

func Take

func Take(n int) Parser

Take accepts n characters and returns the runes captured.

func Token

func Token(tokens ...string) Parser

Token accepts the shortest given token. At least one token must be provided. If more than one token is given, then a trie is used to check for membership.

func TokenRunes

func TokenRunes(tokens ...[]rune) Parser

TokenRunes is like Token, but takes multiple rune slices.

type Result

type Result struct {
	Err       error
	Runes     []rune
	Int64     int64
	Float64   float64
	Interface interface{}
	Tag       string
	Ignore    bool
}

Result represents the result of a parser. It supports a range of common values, including a rune slice, integer and float types used in strconv, as well as an interface{} for anything not included. Err will be set if a Result is failed. If your result contains an error that is not a failure, then it should be placed into Interface.

func Failed

func Failed(err error) Result

Failed returns a failed result with a given error.

func Failedf

func Failedf(format string, a ...interface{}) Result

Failedf returns a failed result in fmt.Errorf form. fmt.Errorf will not be called until the error is read to prevent unnecessary computation. This is important, as failed results can be checked without ever generating an error.
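
One way to get this kind of deferred formatting is an error type that stores the format and arguments and only formats them inside Error. The sketch below is an assumption about how it could be done, not comb's actual code: lazyError, failedf, and counter are hypothetical names, and failedf returns a plain error rather than a Result.

```go
package main

import "fmt"

// lazyError keeps the format and arguments and only calls fmt.Errorf
// when the message is actually requested.
type lazyError struct {
	format string
	args   []interface{}
}

func (e *lazyError) Error() string {
	return fmt.Errorf(e.format, e.args...).Error()
}

// failedf is an illustrative counterpart to Failedf that returns only the error.
func failedf(format string, a ...interface{}) error {
	return &lazyError{format: format, args: a}
}

// counter records how often it is formatted, to make the laziness visible.
type counter struct{ n *int }

func (c counter) String() string { *c.n++; return "x" }

func main() {
	calls := 0
	err := failedf("unexpected %v at column %d", counter{&calls}, 7)

	// The error exists, but nothing has been formatted yet.
	fmt.Println(err != nil, calls)

	// Formatting happens only once the message is read.
	msg := err.Error()
	fmt.Println(msg, calls)
}
```

A parser such as Or may discard many failed results while trying alternatives, so skipping the formatting work for errors nobody reads can add up.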

func SliceCombiner

func SliceCombiner(results []Result, begin, end Scanner) Result

SliceCombiner combines results by returning a Result with the slice in Interface. If a result is set to be ignored, the result will not be in the new result slice.
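
A combiner with this behaviour could look roughly like the sketch below. The Scanner and Result types are trimmed-down stand-ins and sliceCombiner is a hypothetical reimplementation written for illustration. It follows the ResultCombiner shape of taking the results plus the begin and end scanners.

```go
package main

import "fmt"

// Trimmed-down stand-ins for comb's types (assumptions for this sketch).
type Scanner struct{ pos int }

type Result struct {
	Runes     []rune
	Interface interface{}
	Ignore    bool
}

// sliceCombiner keeps every result that is not marked Ignore and stores
// the filtered slice in Interface. begin and end are unused here.
func sliceCombiner(results []Result, begin, end Scanner) Result {
	kept := make([]Result, 0, len(results))
	for _, r := range results {
		if !r.Ignore {
			kept = append(kept, r)
		}
	}
	return Result{Interface: kept}
}

func main() {
	parts := []Result{
		{Runes: []rune("("), Ignore: true},
		{Runes: []rune("42")},
		{Runes: []rune(")"), Ignore: true},
	}
	combined := sliceCombiner(parts, Scanner{0}, Scanner{4})
	for _, r := range combined.Interface.([]Result) {
		fmt.Println(string(r.Runes)) // 42
	}
}
```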

func (Result) Matched

func (r Result) Matched() bool

Matched returns true if Err is nil.

type ResultCombiner

type ResultCombiner func(results []Result, begin, end Scanner) Result

ResultCombiner is a function that takes a slice of results and surrounding scanners and combines them into a single result.

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner is an immutable struct which scans over a rune slice.

func NewScanner

func NewScanner(s []rune) Scanner

NewScanner creates a new Scanner from a rune slice.

func NewStringScanner

func NewStringScanner(s string) Scanner

NewStringScanner creates a new Scanner from a string.

func (Scanner) Between

func (s Scanner) Between(other Scanner) []rune

Between returns the slice between two scanners. s1.Between(s2) returns a slice in the range [s1, s2).

func (Scanner) Col

func (s Scanner) Col() int

Col returns the current column number, 1 indexed.

func (Scanner) EOF

func (s Scanner) EOF() bool

EOF returns true if the scanner is at EOF, i.e. a call to Next would return io.EOF.

func (Scanner) Line

func (s Scanner) Line() int

Line returns the current line number, 1 indexed.

func (Scanner) Next

func (s Scanner) Next() (rune, Scanner, error)

Next scans for the next rune, returning the rune and the next Scanner. If there are no more runes to scan, io.EOF is returned.
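
The immutable-scanner idea behind Next and Between can be modelled in a few lines. This is an illustrative sketch only: comb's Scanner fields are unexported, so the scanner type and its input and pos fields below are assumptions, and line and column tracking is left out.

```go
package main

import (
	"fmt"
	"io"
)

// scanner is a simplified model of an immutable scanner over a rune slice.
type scanner struct {
	input []rune
	pos   int
}

// next returns the rune at the current position and a new scanner one
// rune further along. The receiver is a value, so it is never modified.
func (s scanner) next() (rune, scanner, error) {
	if s.pos >= len(s.input) {
		return 0, s, io.EOF
	}
	return s.input[s.pos], scanner{s.input, s.pos + 1}, nil
}

// between returns the runes in the half-open range [s, other).
func (s scanner) between(other scanner) []rune {
	return s.input[s.pos:other.pos]
}

func main() {
	start := scanner{input: []rune("go!")}
	cur := start
	for {
		r, nxt, err := cur.next()
		if err == io.EOF || r == '!' {
			break
		}
		cur = nxt
	}
	fmt.Println(string(start.between(cur))) // go
	fmt.Println(start.pos)                  // 0
}
```

Because next never mutates its receiver, the earlier scanner stays valid after scanning ahead, which makes backtracking a matter of simply reusing it.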

Directories

Path Synopsis
_examples
Package combext holds various comb helpers, including things like digit, alpha, whitespace, and integer parsers.
