combinator

package
v0.11.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Feb 22, 2025 License: ISC Imports: 0 Imported by: 0

Documentation

Overview

Package combinator provides data types and primitive constructs for building parser combinators. A parser combinator is a higher-order function that takes one or more parsers as input and produces a new composite parser as output.

A parser itself is a function that processes an input stream of characters and returns an output structure, such as an abstract syntax tree (AST), a finite automaton, or another representation of the parsed data.

Parser combinators enable a modular, top-down recursive descent parsing strategy. They allow complex parsers to be constructed from smaller, reusable components, making the parsing process easier to build, maintain, and test.

Top-down parsing involves constructing a parse tree for the input string, starting from the root node (representing the start symbol) and expanding the nodes in preorder. Equivalently, top-down parsing can be viewed as finding a leftmost derivation for an input string.

A recursive descent parser is a top-down parser constructed from mutually recursive procedures (or their non-recursive equivalents), where each procedure corresponds to a nonterminal in the grammar. This structure closely mirrors the grammar, making it intuitive and directly aligned with the rules it recognizes.

For more details on parsing theory, refer to "Compilers: Principles, Techniques, and Tools (2nd Edition)".

Example
package main

import (
	"fmt"

	"github.com/moorara/algo/parser/combinator"
)

// stringInput implements the combinator.Input interface for strings.
type stringInput struct {
	pos   int
	runes []rune
}

func newStringInput(s string) combinator.Input {
	return &stringInput{
		pos:   0,
		runes: []rune(s),
	}
}

func (s *stringInput) Current() (rune, int) {
	return s.runes[0], s.pos
}

func (s *stringInput) Remaining() combinator.Input {
	if len(s.runes) == 1 {
		return nil
	}

	return &stringInput{
		pos:   s.pos + 1,
		runes: s.runes[1:],
	}
}

func main() {
	add := combinator.ExpectRune('+').Map(toAdd)                 // add → "+"
	digit := combinator.ExpectRuneInRange('0', '9').Map(toDigit) // digit → "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
	num := digit.REP1().Map(toNum)                               // num → digit+
	expr := num.CONCAT(add, num).Map(evalExpr)                   // expr → num "+" num

	in := newStringInput("27+69")
	if out, ok := expr(in); ok {
		n := out.Result.Val.(int)
		fmt.Println(n)
	}
}

func toAdd(r combinator.Result) (combinator.Result, bool) {
	return combinator.Result{
		Val: nil,
		Pos: r.Pos,
	}, true
}

func toDigit(r combinator.Result) (combinator.Result, bool) {
	v := r.Val.(rune)
	digit := int(v - '0')

	return combinator.Result{
		Val: digit,
		Pos: r.Pos,
	}, true
}

func toNum(r combinator.Result) (combinator.Result, bool) {
	l := r.Val.(combinator.List)

	var num int
	for _, r := range l {
		num = num*10 + r.Val.(int)
	}

	return combinator.Result{
		Val: num,
		Pos: l[0].Pos,
	}, true
}

func evalExpr(r combinator.Result) (combinator.Result, bool) {
	r0, _ := r.Get(0)
	r2, _ := r.Get(2)

	n0 := r0.Val.(int)
	n2 := r2.Val.(int)

	return combinator.Result{
		Val: int(n0 + n2),
		Pos: r0.Pos,
	}, true
}

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Bag

type Bag map[BagKey]BagVal

Bag is the type for a collection of key-value pairs.

type BagKey

type BagKey string

BagKey is the type for the keys in Bag type.

type BagVal

type BagVal any

BagKey is the type for the values in Bag type.

type Binder

type Binder func(Result) Parser

Binder is a function that receives a parsing result and returns a new parser. It is often used for modifying the behavior of previous parsers in the chain.

func ExcludeRunes

func ExcludeRunes(r ...rune) Binder

ExcludeRunes can be bound on a rune parser to exclude certain runes.

type Empty

type Empty struct{}

Empty is the empty string ε.

type Input

type Input interface {
	// Current returns the current rune from input along with its position in the input.
	Current() (rune, int)
	// Remaining returns the remaining of input. If no input left, it returns nil.
	Remaining() Input
}

Input is the input to a parser function.

type List

type List []Result

List is the type for the result of concatenation or repetition.

type Mapper

type Mapper func(Result) (Result, bool)

Mapper is a function that receives a parsing result and returns a new result. The second return value determines whether or not the mapping was successful and the first value is valid.

type Output

type Output struct {
	Result    Result
	Remaining Input
}

Output is the output of a parser function.

type Parser

type Parser func(Input) (Output, bool)

Parser is the type for a function that receives a parsing input and returns a parsing output. The second return value determines whether or not the parsing was successful and the output is valid.

func ExpectRune

func ExpectRune(r rune) Parser

ExpectRune creates a parser that returns a successful result only if the input starts with the given rune.

func ExpectRuneIn

func ExpectRuneIn(runes ...rune) Parser

ExpectRuneIn creates a parser that returns a successful result only if the input starts with one of the given runes.

func ExpectRuneInRange

func ExpectRuneInRange(low, up rune) Parser

ExpectRuneInRange creates a parser that returns a successful result only if the input starts with a rune in the given range.

func ExpectRunes

func ExpectRunes(runes ...rune) Parser

ExpectRunes creates a parser that returns a successful result only if the input starts with the given runes in the given order.

func ExpectString

func ExpectString(s string) Parser

ExpectString creates a parser that returns a successful result only if the input starts with the given string.

func (Parser) ALT

func (p Parser) ALT(q ...Parser) Parser

ALT composes a parser that alternate parser p with a sequence of parsers. It applies parser p to the input and if it does not succeed, it applies the next parser to the same input, and continues parsing to the last parser. It stops at the first successful parsing and returns its result.

  • EBNF Operator: Alternation
  • EBNF Notation: p | q

func (Parser) Bind

func (p Parser) Bind(f Binder) Parser

Bind composes a parser that uses parser p to parse the input and builds a second parser from the result of parsing. It then applies the new parser to the remaining input from the first parser. You can use this to implement syntax annotations.

func (Parser) CONCAT

func (p Parser) CONCAT(q ...Parser) Parser

CONCAT composes a parser that concats parser p to a sequence of parsers. It applies parser p to the input, then applies the next parser to the remaining of the input, and continues parsing to the last parser.

  • EBNF Operator: Concatenation
  • EBNF Notation: p q

func (Parser) Flatten

func (p Parser) Flatten() Parser

Flatten composes a parser that applies parser p to the input and flattens all results into a single list. This can be used for accessing the values of symbols in the right-side of a production rule more intuitively.

func (Parser) Get

func (p Parser) Get(i int) Parser

Get composes a parser that applies parser p to the input and returns the value of a symbol from the right-side of the production rule. This can be used after CONCAT, REP, REP1, Flatten, and/or Select. It will not have any effect if used after other operators and the result of parsing is not a list. If index is invalid, you will get the empty string ε.

func (Parser) Map

func (p Parser) Map(f Mapper) Parser

Map composes a parser that uses parser p to parse the input and applies a mapper function to the result of parsing. If the parser does not succeed, the mapper function will not be applied.

func (Parser) OPT

func (p Parser) OPT() Parser

OPT composes a parser that applies parser p zero or one time to the input. If the parser does not succeed, it will return an empty result.

  • EBNF Operator: Optional
  • EBNF Notation: [ p ] or p?

func (Parser) REP

func (p Parser) REP() Parser

REP composes a parser that applies parser p zero or more times to the input and accumulates the results. If the parser does not succeed, it will return an empty result.

  • EBNF Operator: Repetition (Kleene Star)
  • EBNF Notation: { p } or p*

func (Parser) REP1

func (p Parser) REP1() Parser

REP1 composes a parser that applies parser p one or more times to the input and accumulates the results. This does not allow parsing zero times (empty result).

  • EBNF Operator: Kleene Plus
  • EBNF Notation: p+

func (Parser) Select

func (p Parser) Select(i ...int) Parser

Select composes a parser that applies parser p to the input and returns a list of symbols from the right-side of the production rule. This will not have any effect if the result of parsing is not a list. If indices are invalid, you will get the empty string ε.

type Result

type Result struct {
	// Val is the actual result of a parser function.
	// It can be an abstract syntax tree, a finite automata, or any other data structure.
	Val any
	// Pos is the first position in the source corresponding to the parsing result.
	Pos int
	// Bag is an optional collection of key-value pairs holding extra information and metadata about the parsing result.
	// You should always check this field to be not nil before using it.
	Bag Bag
}

Result is the result of parsing a production rule. It represents a production rule result.

func (*Result) Get

func (r *Result) Get(i int) (Result, bool)

Get returns the parsing result of a symbol from the right-side of a production rule.

Example:

Production Rule: range → "{" num ( "," num? )? "}"
r = {2,4}
r.Get(1) = 2
r.Get(3) = 4

If the position i is out of bounds, the second return value is false.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL