parsec

Version: v0.0.0-...-2600a2a
Published: Aug 6, 2018 License: Apache-2.0 Imports: 14 Imported by: 0

README

Parser combinator library in Golang

A library to construct top-down recursive backtracking parsers using parser-combinators. Before proceeding you might want to take a peek at the theory of parser combinators. As for this package, it provides:

  • A standard set of combinators.
  • Regular expression based simple-scanner.
  • Standard set of tokenizers based on the simple-scanner.

To construct syntax trees from a detailed grammar, try the AST struct:

  • The standard set of combinators is exported as methods on AST.
  • Generate a dot graph, e.g. a dotfile for HTML.
  • Pretty-print on the console.
  • Make debugging easier.

NOTE: the AST object is a recent development; expect to adapt your code to newer versions.

Combinators

Every combinator should conform to the following signature:

    // ParsecNode type defines a node in the AST
    type ParsecNode interface{}

    // Parser function parses input text, higher order parsers are
    // constructed using combinators.
    type Parser func(Scanner) (ParsecNode, Scanner)

    // Nodify callback function to construct custom ParsecNode.
    type Nodify func([]ParsecNode) ParsecNode

Combinators take a variable number of parser functions and return a new parser function.
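The shape of that contract can be sketched without the library. Everything below (`node`, `scanner`, `parser`, `nodify`, `lit`, `seq`, `join`) is a hypothetical stand-in written for this illustration; it mirrors the signatures shown above but is not goparsec's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stand-ins for illustration only; NOT goparsec types.
type node interface{}

// scanner is a value-type cursor over the input text.
type scanner struct {
	text string
	pos  int
}

// parser mirrors the shape of Parser: consume input, return a node and
// the scanner to continue from.
type parser func(scanner) (node, scanner)

// nodify mirrors the shape of Nodify.
type nodify func([]node) node

// lit builds a terminal parser that matches a fixed string.
func lit(s string) parser {
	return func(in scanner) (node, scanner) {
		if strings.HasPrefix(in.text[in.pos:], s) {
			return s, scanner{in.text, in.pos + len(s)}
		}
		return nil, in // no match: hand back the input scanner untouched
	}
}

// seq is a combinator: it takes parsers and returns a new parser that
// succeeds only if every one of them matches, in order.
func seq(cb nodify, ps ...parser) parser {
	return func(in scanner) (node, scanner) {
		nodes, cur := []node{}, in
		for _, p := range ps {
			n, next := p(cur)
			if n == nil {
				return nil, in // backtrack to where we started
			}
			nodes, cur = append(nodes, n), next
		}
		return cb(nodes), cur
	}
}

// join is a nodify callback that concatenates the matched strings.
func join(ns []node) node {
	var sb strings.Builder
	for _, n := range ns {
		sb.WriteString(n.(string))
	}
	return sb.String()
}

func main() {
	p := seq(join, lit("a"), lit("="), lit("b"))
	n, rest := p(scanner{text: "a=b;"})
	fmt.Println(n, rest.pos)
}
```

Because `seq` returns a `parser`, its result can be passed to another `seq`, which is how larger grammars are assembled from small pieces.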

Using the builtin scanner

The builtin scanner library manages the input buffer and implements a cursor into the buffer. Create a new scanner instance:

    s := parsec.NewScanner(text)

The scanner library supplies methods like Match(pattern), SkipAny(pattern) and Endof(); refer to the Scanner interface documentation for more information on each of these methods.
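To make the buffer-plus-cursor idea concrete, here is a minimal stand-alone sketch. `miniScanner` and its lowercase methods are hypothetical names invented for this illustration; they only imitate the behaviour described for Match and Endof and are not the package's SimpleScanner.

```go
package main

import (
	"fmt"
	"regexp"
)

// miniScanner is a hypothetical, simplified stand-in for a scanner:
// an input buffer plus a cursor into it.
type miniScanner struct {
	buf    []byte
	cursor int
}

// match tries pattern at the cursor. On success it returns the matched
// bytes and a scanner whose cursor has moved past them; on failure it
// returns nil and the scanner unchanged.
func (s miniScanner) match(pattern string) ([]byte, miniScanner) {
	// Anchor the pattern so it can only match at the cursor.
	// MustCompile panics if the pattern is invalid.
	re := regexp.MustCompile(`^(?:` + pattern + `)`)
	if loc := re.FindIndex(s.buf[s.cursor:]); loc != nil {
		tok := s.buf[s.cursor : s.cursor+loc[1]]
		return tok, miniScanner{s.buf, s.cursor + loc[1]}
	}
	return nil, s
}

// endof reports whether the cursor has reached the end of the buffer.
func (s miniScanner) endof() bool { return s.cursor >= len(s.buf) }

func main() {
	s := miniScanner{buf: []byte("10 + 29")}
	tok, next := s.match(`[0-9]+`)
	fmt.Printf("%s %d %v\n", tok, next.cursor, next.endof())
}
```

Returning a new scanner value instead of mutating the old one is a design choice of this sketch: the caller keeps the original cursor and can retry another alternative from the same position.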

Panics and Recovery

Panics are to be expected when APIs are misused. Programmers might choose to ignore errors, but not panics. For example:

  • Kleene and Many combinators take one or two parsers as arguments. Fewer than one or more than two will panic.
  • ManyUntil combinator takes two or three parsers as arguments. Fewer than two or more than three will panic.
  • Combinators accept a Parser function or a pointer to a Parser function. Anything else will panic.
  • Using an invalid regular expression to match a token will panic.

Examples

  • expr/expr.go, implements a parsec grammar to parse arithmetic expressions.
  • json/json.go, implements a parsec grammar to parse JSON document.

Clone the repository and run the benchmark suite:

    $ cd expr/
    $ go test -test.bench=. -test.benchmem=true
    $ cd json/
    $ go test -test.bench=. -test.benchmem=true

To run the example program,

    # to parse expression
    $ go run tools/parsec/parsec.go -expr "10 + 29"

    # to parse JSON string
    $ go run tools/parsec/parsec.go -json '{ "key1" : [10, "hello", true, null, false] }'

Projects using goparsec

  • Monster, production system in golang.
  • GoLedger, ledger re-write in golang.

If your project is using goparsec you can raise an issue to list them under this section.

Articles

  • Parsing by composing functions
  • Parser composition for recursive grammar
  • How to use the Maybe combinator

How to contribute

  • Pick an issue, or create a new issue. Provide adequate documentation for the issue.
  • Assign the issue or get it assigned.
  • Work on the code; once finished, raise a pull request.
  • Goparsec is written in golang, hence it is expected to follow the global guidelines for writing go programs.
  • If the changeset is more than a few lines, please generate a report card (https://goreportcard.com/report/github.com/prataprc/goparsec).
  • As of now, branch master is the development branch.

Documentation

Overview

Package parsec provides a library of parser-combinators. The basic idea behind the parsec module is that it allows programmers to take a basic set of terminal parsers, a.k.a. tokenizers, and compose them into a tree of parsers using combinators like And, OrdChoice, Kleene, Many and Maybe.

To begin with, there are four basic types that need to be kept in mind while creating and composing parsers:

Types

Scanner, an interface type that encapsulates the input text. A built-in scanner called SimpleScanner is supplied along with this package. Developers can also implement their own scanner types. The following example creates a new instance of SimpleScanner from an input text:

var exprText = []byte(`4 + 123 + 23 + 67 +89 + 87 *78`)
s := parsec.NewScanner(exprText)

Nodify, a callback function supplied while combining parser functions. If the underlying parsing logic matches the input text, the callback is dispatched with the list of matching ParsecNodes. The value returned by the callback is then used as a ParsecNode item in the higher-level list of ParsecNodes.

Parser, simple parsers are functions that match the input text against specific patterns. Simple parsers can be combined using one of the supplied combinators to construct a higher-level parser. A parser function takes a Scanner object and applies the underlying parsing logic; if that logic succeeds, the Nodify callback is dispatched, and a ParsecNode and a new Scanner object (with its cursor moved forward) are returned. If the parser fails to match, it returns the input scanner object as is, along with a nil ParsecNode.

ParsecNode, an interface type that encapsulates one or more tokens from the input text, as a terminal node or a non-terminal node.

Combinators

If the input text is going to be a single token like `10` or `true` or `"some string"`, then all we need is a single Parser function that can tokenize the input text into a terminal node. But our applications are seldom that simple. Almost all the time we need to parse the input text for more than one token, and most of the time we need to compose them into a tree of terminal and non-terminal nodes.

This is where combinators are useful. Package provides a set of combinators to help combine terminal parsers into higher level parsers. They are,

* And, to combine a sequence of terminals and non-terminal parsers.
* OrdChoice, to choose between specified list of parsers.
* Kleene, to repeat the parser zero or more times.
* Many, to repeat the parser one or more times.
* ManyUntil, to repeat the parser until a specified end matcher.
* Maybe, to apply the parser once or none.

All the above-mentioned combinators accept one or more parser functions as arguments, either by value or by reference. The reason for allowing a parser argument by reference is to be able to define recursive parsing logic, like parsing nested arrays:

var Y Parser
var value Parser // forward declaration; assigned in init() to break the circular reference

var opensqrt = Atom("[", "OPENSQRT")
var closesqrt = Atom("]", "CLOSESQRT")
var values = Kleene(nil, &value, Atom(",", "COMMA"))
var array = And(nil, opensqrt, values, closesqrt)

func init() {
	value = OrdChoice(nil, Int(), Bool(), String(), array)
	Y = OrdChoice(nil, value)
}
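The reason a reference is needed can be seen in isolation with a stand-alone sketch. The names `scanner`, `parser`, `lit`, `ref`, `seq` and `alt` are hypothetical and written only for this illustration; they are not the package's types. The point is that `ref(&value)` looks up `value` at parse time, so `array` can mention `value` before `value` has been assigned.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical miniature types; NOT goparsec's implementation.
type scanner struct {
	text string
	pos  int
}

type parser func(scanner) (bool, scanner)

// lit matches a fixed string at the cursor.
func lit(s string) parser {
	return func(in scanner) (bool, scanner) {
		if strings.HasPrefix(in.text[in.pos:], s) {
			return true, scanner{in.text, in.pos + len(s)}
		}
		return false, in
	}
}

// ref dereferences p only when the returned parser runs.
func ref(p *parser) parser {
	return func(in scanner) (bool, scanner) { return (*p)(in) }
}

// seq succeeds if all parsers match in order, else consumes nothing.
func seq(ps ...parser) parser {
	return func(in scanner) (bool, scanner) {
		cur := in
		for _, p := range ps {
			ok, next := p(cur)
			if !ok {
				return false, in
			}
			cur = next
		}
		return true, cur
	}
}

// alt returns the result of the first parser that matches.
func alt(ps ...parser) parser {
	return func(in scanner) (bool, scanner) {
		for _, p := range ps {
			if ok, next := p(in); ok {
				return true, next
			}
		}
		return false, in
	}
}

var value parser // nil until init() runs

// array := "[" value "]", built while value is still nil.
var array = seq(lit("["), ref(&value), lit("]"))

func init() {
	value = alt(lit("x"), array)
}

func main() {
	ok, rest := value(scanner{text: "[[x]]"})
	fmt.Println(ok, rest.pos)
}
```

Passing `value` itself instead of `ref(&value)` would capture a nil function at package initialization, and the first nested array would crash the parser.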

Terminal parsers

Parsers for a standard set of tokens are supplied along with this package. Most of these parsers return Terminal type as ParsecNode.

* Char, match a single character skipping leading whitespace.
* Float, match a float literal skipping leading whitespace.
* Hex, match a hexadecimal literal skipping leading whitespace.
* Int, match a decimal number literal skipping leading whitespace.
* Oct, match an octal number literal skipping leading whitespace.
* String, match a string literal skipping leading whitespace.
* Ident, match an identifier token skipping leading whitespace.
* Atom, match a single atom skipping leading whitespace.
* AtomExact, match a single atom without skipping leading whitespace.
* Token, match a single token skipping leading whitespace.
* TokenExact, match a single token without skipping leading whitespace.
* OrdTokens, match a single token with specified list of alternatives.
* End, match end of text.
* NoEnd, match not an end of text.

All of the terminal parsers, except End, NoEnd and String, return Terminal type as ParsecNode. End and NoEnd return a boolean type as ParsecNode, and String returns a string type (see the String function in the reference below).

AST and Queryable

This is an experimental feature to use CSS-like selectors for querying an Abstract Syntax Tree (AST). Types, APIs and methods associated with AST and Queryable are unstable, and are expected to change in the future.

While Scanner, Parser, ParsecNode types are re-used in AST and Queryable, combinator functions are re-implemented as AST methods. Similarly type ASTNodify is to be used instead of Nodify type. Otherwise all the parsec techniques mentioned above are equally applicable on AST.

Additionally, following points are worth noting while using AST,

* Combinator methods supplied via AST can be named.
* All combinators from AST object will create and return NonTerminal
  as the Queryable type.
* ASTNodify function can interpret its Queryable argument and return
  a different type implementing Queryable interface.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type AST

type AST struct {
	// contains filtered or unexported fields
}

AST to parse and construct Abstract Syntax Tree whose nodes conform to `Queryable` interface, facilitating tree processing algorithms.

func NewAST

func NewAST(name string, maxnodes int) *AST

NewAST return a new instance of AST, maxnodes is size of internal buffer pool of nodes, it is directly proportional to number of nodes that you expect in the syntax-tree.

func (*AST) And

func (ast *AST) And(name string, callb ASTNodify, parsers ...interface{}) Parser

And combinator, same as package level And combinator function. `name` identifies the NonTerminal nodes constructed by this combinator.

Example
// parse a configuration line from ini file.
text := []byte(`loglevel = info`)
ast := NewAST("example", 100)
y := ast.And("configline", nil, Ident(), Atom("=", "EQUAL"), Ident())
root, _ := ast.Parsewith(y, NewScanner(text))
nodes := root.GetChildren()
fmt.Println(nodes[0].GetName(), nodes[0].GetValue())
fmt.Println(nodes[1].GetName(), nodes[1].GetValue())
fmt.Println(nodes[2].GetName(), nodes[2].GetValue())
Output:

IDENT loglevel
EQUAL =
IDENT info

func (*AST) Dotstring

func (ast *AST) Dotstring(name string) string

Dotstring return AST in graphviz dot format. Save this string to a dot file and use the graphviz tool to generate a graph.

func (*AST) End

func (ast *AST) End(name string) Parser

End is a parser function to detect end of scanner output.

func (*AST) GetValue

func (ast *AST) GetValue() string

GetValue return the full text, called as value here, that was parsed to construct this syntax-tree.

func (*AST) Kleene

func (ast *AST) Kleene(nm string, callb ASTNodify, ps ...interface{}) Parser

Kleene combinator, same as package level Kleene combinator function. `nm` identifies the NonTerminal nodes constructed by this combinator.

func (*AST) Many

func (ast *AST) Many(nm string, callb ASTNodify, parsers ...interface{}) Parser

Many combinator, same as package level Many combinator function. `nm` identifies the NonTerminal nodes constructed by this combinator.

Example
// parse comma separated values
text := []byte(`10,30,50 wont parse this`)
ast := NewAST("example", 100)
y := ast.Many("many", nil, Int(), Atom(",", "COMMA"))
root, _ := ast.Parsewith(y, NewScanner(text))
nodes := root.GetChildren()
fmt.Println(nodes[0].GetName(), nodes[0].GetValue())
fmt.Println(nodes[1].GetName(), nodes[1].GetValue())
fmt.Println(nodes[2].GetName(), nodes[2].GetValue())
Output:

INT 10
INT 30
INT 50

func (*AST) ManyUntil

func (ast *AST) ManyUntil(nm string, callb ASTNodify, ps ...interface{}) Parser

ManyUntil combinator, same as package level ManyUntil combinator function. `nm` identifies the NonTerminal nodes constructed by this combinator.

Example
// make sure to parse the entire text
text := []byte("10,30,50")
ast := NewAST("example", 100)
y := ast.ManyUntil("values", nil, Int(), Atom(",", "COMMA"), ast.End("eof"))
root, _ := ast.Parsewith(y, NewScanner(text))
nodes := root.GetChildren()
fmt.Println(nodes[0].GetName(), nodes[0].GetValue())
fmt.Println(nodes[1].GetName(), nodes[1].GetValue())
fmt.Println(nodes[2].GetName(), nodes[2].GetValue())
Output:

INT 10
INT 30
INT 50

func (*AST) Maybe

func (ast *AST) Maybe(name string, callb ASTNodify, parser interface{}) Parser

Maybe combinator, same as package level Maybe combinator function. `name` identifies the NonTerminal nodes constructed by this combinator.

Example
// parse an optional token
ast := NewAST("example", 100)
equal := Atom("=", "EQUAL")
maybeand := ast.Maybe("maybeand", nil, Atom("&", "AND"))
y := ast.And("assignment", nil, Ident(), equal, maybeand, Ident())

text := []byte("a = &b")
root, _ := ast.Parsewith(y, NewScanner(text))
nodes := root.GetChildren()
fmt.Println(nodes[0].GetName(), nodes[0].GetValue())
fmt.Println(nodes[1].GetName(), nodes[1].GetValue())
fmt.Println(nodes[2].GetName(), nodes[2].GetValue())
fmt.Println(nodes[3].GetName(), nodes[3].GetValue())

text = []byte("a = b")
ast = ast.Reset()
root, _ = ast.Parsewith(y, NewScanner(text))
nodes = root.GetChildren()
fmt.Println(nodes[0].GetName(), nodes[0].GetValue())
fmt.Println(nodes[1].GetName(), nodes[1].GetValue())
fmt.Println(nodes[2].GetName())
fmt.Println(nodes[3].GetName(), nodes[3].GetValue())
Output:

IDENT a
EQUAL =
AND &
IDENT b
IDENT a
EQUAL =
missing
IDENT b

func (*AST) OrdChoice

func (ast *AST) OrdChoice(nm string, cb ASTNodify, ps ...interface{}) Parser

OrdChoice combinator, same as package level OrdChoice combinator function. `nm` identifies the NonTerminal nodes constructed by this combinator.

Example
// parse a boolean value
text := []byte(`true`)
ast := NewAST("example", 100)
y := ast.OrdChoice("bool", nil, Atom("true", "TRUE"), Atom("false", "FALSE"))
root, _ := ast.Parsewith(y, NewScanner(text))
fmt.Println(root.GetName(), root.GetValue())
Output:

TRUE true

func (*AST) Parsewith

func (ast *AST) Parsewith(y Parser, s Scanner) (Queryable, Scanner)

Parsewith executes the root parser, y, with scanner s. AST will remember the root parser and the root node. On success, it returns the root node as Queryable, along with a scanner holding the remaining input.

func (*AST) Prettyprint

func (ast *AST) Prettyprint()

Prettyprint to standard output the syntax-tree in human readable plain text.

func (*AST) Query

func (ast *AST) Query(selectors string, ch chan Queryable)

Query is an experimental method on AST. Developers can use the selector specification to pick one or more nodes from the AST.

func (*AST) Reset

func (ast *AST) Reset() *AST

Reset the AST, forget the root parser and the root node. Reuse the AST object by calling Parsewith with a different root parser and scanner.

func (*AST) SetDebug

func (ast *AST) SetDebug() *AST

SetDebug enables console logging while parsing the input text; this is useful while developing a parser.

type ASTNodify

type ASTNodify func(name string, s Scanner, node Queryable) Queryable

ASTNodify callback function to construct custom Queryable. Even when combinators like And, OrdChoice, Many etc.. match input string, it is possible to fail them via ASTNodify callback function, by returning nil. This is useful in cases like:

* where lookahead matching is required.
* exceptional cases for a regex pattern.

Note that some combinators like Kleene shall not interpret the return value from ASTNodify callback. `node` will always be of NonTerminal type, although callback can process it and return a different type, provided it implements Queryable interface.

Example
text := []byte("10 * 20")
ast := NewAST("example", 100)
y := ast.And(
	"multiply",
	func(name string, s Scanner, node Queryable) Queryable {
		cs := node.GetChildren()
		x, _ := strconv.Atoi(cs[0].(*Terminal).GetValue())
		y, _ := strconv.Atoi(cs[2].(*Terminal).GetValue())
		return &Terminal{Value: fmt.Sprintf("%v", x*y)}
	},
	Int(), Token(`\*`, "MULT"), Int(),
)
node, _ := ast.Parsewith(y, NewScanner(text))
fmt.Println(node.GetValue())
Output:

200

type MaybeNone

type MaybeNone string

MaybeNone is a placeholder type, similar to Terminal type, used by Maybe combinator if parser does not match the input text.

func (MaybeNone) GetAttribute

func (mn MaybeNone) GetAttribute(attrname string) []string

GetAttribute implement Queryable interface.

func (MaybeNone) GetAttributes

func (mn MaybeNone) GetAttributes() map[string][]string

GetAttributes implement Queryable interface.

func (MaybeNone) GetChildren

func (mn MaybeNone) GetChildren() []Queryable

GetChildren implement Queryable interface.

func (MaybeNone) GetName

func (mn MaybeNone) GetName() string

GetName implement Queryable interface.

func (MaybeNone) GetPosition

func (mn MaybeNone) GetPosition() int

GetPosition implement Queryable interface.

func (MaybeNone) GetValue

func (mn MaybeNone) GetValue() string

GetValue implement Queryable interface.

func (MaybeNone) IsTerminal

func (mn MaybeNone) IsTerminal() bool

IsTerminal implement Queryable interface.

func (MaybeNone) SetAttribute

func (mn MaybeNone) SetAttribute(attrname, value string) Queryable

SetAttribute implement Queryable interface.

type Nodify

type Nodify func([]ParsecNode) ParsecNode

Nodify callback function to construct custom ParsecNode. Even when combinators like And, OrdChoice, Many etc. can match the input string, it is still possible to fail them via the nodify callback function, by returning nil. This is very useful in cases when:

* lookahead matching is required.
* there are exceptional cases for a regex pattern.

Note that some combinators like Kleene shall not interpret the return value from Nodify callback.

Example
text := []byte("10 * 20")
s := NewScanner(text)
y := And(
	func(nodes []ParsecNode) ParsecNode {
		x, _ := strconv.Atoi(nodes[0].(*Terminal).GetValue())
		y, _ := strconv.Atoi(nodes[2].(*Terminal).GetValue())
		return x * y // this is returned as node further down.
	},
	Int(), Token(`\*`, "MULT"), Int(),
)
node, _ := y(s)
fmt.Println(node)
Output:

200
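The "exceptional cases for a regex pattern" use of a nil return can be sketched on its own. In the stand-alone miniature below, `node`, `nodify`, `ident` and `notKeyword` are hypothetical names invented for this illustration, and the veto logic imitates the behaviour described above rather than reproducing the package's code. The identifier pattern is the one documented for Ident.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical stand-ins; NOT goparsec types.
type node interface{}
type nodify func([]node) node

var identRe = regexp.MustCompile(`^[A-Za-z][0-9a-zA-Z_]*`)

// ident matches an identifier at the start of text, then lets cb veto it.
// It returns the node and the number of bytes consumed.
func ident(cb nodify) func(text string) (node, int) {
	return func(text string) (node, int) {
		m := identRe.FindString(text)
		if m == "" {
			return nil, 0
		}
		if n := cb([]node{m}); n != nil {
			return n, len(m)
		}
		return nil, 0 // callback returned nil: treat the match as a failure
	}
}

// notKeyword rejects a reserved word that the regex alone would accept.
func notKeyword(ns []node) node {
	if ns[0] == "func" {
		return nil
	}
	return ns[0]
}

func main() {
	p := ident(notKeyword)
	n, used := p("loglevel = info")
	fmt.Println(n, used)
	n, used = p("func main")
	fmt.Println(n, used)
}
```

Keeping the exception in the callback leaves the pattern itself simple, at the cost of running the regex before the veto is applied.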

type NonTerminal

type NonTerminal struct {
	Name       string      // name identifying this non-terminal node
	Children   []Queryable // list of children to this node.
	Attributes map[string][]string
}

NonTerminal will be used by AST methods to construct intermediate nodes. Note that user supplied ASTNodify callback can construct a different type of intermediate node that conforms to Queryable interface.

func NewNonTerminal

func NewNonTerminal(name string) *NonTerminal

NewNonTerminal create and return a new NonTerminal instance.

func (*NonTerminal) GetAttribute

func (nt *NonTerminal) GetAttribute(attrname string) []string

GetAttribute implement Queryable interface.

func (*NonTerminal) GetAttributes

func (nt *NonTerminal) GetAttributes() map[string][]string

GetAttributes implement Queryable interface.

func (*NonTerminal) GetChildren

func (nt *NonTerminal) GetChildren() []Queryable

GetChildren implement Queryable interface.

func (*NonTerminal) GetName

func (nt *NonTerminal) GetName() string

GetName implement Queryable interface.

func (*NonTerminal) GetPosition

func (nt *NonTerminal) GetPosition() int

GetPosition implement Queryable interface.

func (*NonTerminal) GetValue

func (nt *NonTerminal) GetValue() string

GetValue implement Queryable interface.

func (*NonTerminal) IsTerminal

func (nt *NonTerminal) IsTerminal() bool

IsTerminal implement Queryable interface.

func (*NonTerminal) SetAttribute

func (nt *NonTerminal) SetAttribute(attrname, value string) Queryable

SetAttribute implement Queryable interface.

type ParsecNode

type ParsecNode interface{}

ParsecNode is the type parsers use to return input text as parsed nodes.

type Parser

type Parser func(Scanner) (ParsecNode, Scanner)

Parser function parses input text encapsulated by Scanner, higher order parsers are constructed using combinators.

func And

func And(callb Nodify, parsers ...interface{}) Parser

And combinator accepts a list of `Parser`, or references to parsers, that must match the input string, at least until the last Parser argument. Return a parser function that can further be used to construct higher-level parsers.

If all parsers match, a list of ParsecNode, where each ParsecNode is constructed by the matching parser, will be passed as argument to Nodify callback. If even one of the input parser functions fails, And will fail without consuming the input.

Example
// parse a configuration line from ini file.
text := []byte(`loglevel = info`)
y := And(nil, Ident(), Atom("=", "EQUAL"), Ident())
root, _ := y(NewScanner(text))
nodes := root.([]ParsecNode)
t := nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[1].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[2].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
Output:

IDENT loglevel
EQUAL =
IDENT info

func Atom

func Atom(match string, name string) Parser

Atom is similar to Token, takes a string to match with input byte-by-byte. Internally uses the MatchString() API from Scanner. Skip leading whitespace. For example:

scanner := NewScanner([]byte("cosmos"))
Atom("cos", "ATOM")(scanner) // will match

func AtomExact

func AtomExact(match string, name string) Parser

AtomExact is similar to Atom(), but string will be matched without skipping leading whitespace.

func Char

func Char() Parser

Char return parser function to match a single character in the input stream. Skip leading whitespace.

func End

func End() Parser

End is a parser function to detect end of scanner output, return boolean as ParsecNode, hence incompatible with AST{}. Instead, use the AST.End method.

func Float

func Float() Parser

Float return parser function to match a float literal in the input stream. Skip leading whitespace.

func Hex

func Hex() Parser

Hex return parser function to match a hexadecimal literal in the input stream. Skip leading whitespace.

func Ident

func Ident() Parser

Ident return parser function to match an identifier token in the input stream, an identifier is matched with the following pattern: `^[A-Za-z][0-9a-zA-Z_]*`. Skip leading whitespace.

func Int

func Int() Parser

Int return parser function to match an integer literal in the input stream. Skip leading whitespace.

func Kleene

func Kleene(callb Nodify, parsers ...interface{}) Parser

Kleene combinator accepts two parsers, or references to parsers, namely opScan and sepScan, where the opScan parser is used to match the input string and construct a ParsecNode, and the sepScan parser is used to match the input string and ignore the matched string. If the sepScan parser is not supplied, then the opScan parser will be applied on the input until it fails.

The process of matching opScan parser and sepScan parser will continue in a loop until either one of them fails on the input stream.

For every successful match of opScan, the returned ParsecNode from matching parser will be accumulated and passed as argument to Nodify callback. If there is not a single match for opScan, then []ParsecNode of ZERO length will be passed as argument to Nodify callback. Kleene combinator will never fail.
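The op/sep loop described above can be sketched as a small stand-alone function. `tok` and `kleene` are hypothetical helpers written for this illustration and work on plain strings rather than on Scanner and ParsecNode. How the real combinator treats a trailing separator is not stated in the text; this sketch simply consumes it, which is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// tok returns a matcher reporting how many bytes pattern consumes at the
// start of s, or -1 when it does not match. (Hypothetical helper.)
func tok(pattern string) func(s string) int {
	re := regexp.MustCompile(`^(?:` + pattern + `)`)
	return func(s string) int {
		if loc := re.FindStringIndex(s); loc != nil {
			return loc[1]
		}
		return -1
	}
}

// kleene collects every op match and discards the sep matches in between.
// It returns the collected items and the unconsumed rest. It never fails:
// zero matches yields an empty slice. Patterns that can match the empty
// string would loop forever and are not handled in this sketch.
func kleene(op, sep func(string) int, text string) ([]string, string) {
	items := []string{}
	for {
		n := op(text)
		if n < 0 {
			return items, text
		}
		items, text = append(items, text[:n]), text[n:]
		if n = sep(text); n < 0 {
			return items, text
		}
		text = text[n:]
	}
}

func main() {
	items, rest := kleene(tok(`[0-9]+`), tok(`,`), "10,30,50 wont parse this")
	fmt.Printf("%q %q\n", items, rest)

	items, rest = kleene(tok(`[0-9]+`), tok(`,`), "abc")
	fmt.Printf("%q %q\n", items, rest)
}
```

The second call shows the "never fails" property: no number is found, so the result is an empty list and the input is left untouched.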

func Many

func Many(callb Nodify, parsers ...interface{}) Parser

Many combinator accepts two parsers, or references to parsers, namely opScan and sepScan, where the opScan parser is used to match the input string and construct a ParsecNode, and the sepScan parser is used to match the input string and ignore the matched string. If the sepScan parser is not supplied, then the opScan parser will be applied on the input until it fails.

The process of matching opScan parser and sepScan parser will continue in a loop until either one of them fails on the input stream.

The difference between `Many` combinator and `Kleene` combinator is that there shall be at least one match of opScan.

For every successful match of opScan, the returned ParsecNode from matching parser will be accumulated and passed as argument to Nodify callback. If there is not a single match for opScan, then Many will fail without consuming the input.

Example
// parse comma separated values
text := []byte(`10,30,50 wont parse this`)
y := Many(nil, Int(), Atom(",", "COMMA"))
root, _ := y(NewScanner(text))
nodes := root.([]ParsecNode)
t := nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[1].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[2].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
Output:

INT 10
INT 30
INT 50

func ManyUntil

func ManyUntil(callb Nodify, parsers ...interface{}) Parser

ManyUntil combinator accepts three parsers, or references to parsers, namely opScan, sepScan and untilScan, where the opScan parser is used to match the input string and construct a ParsecNode, and the sepScan parser is used to match the input string and ignore the matched string. If the sepScan parser is not supplied, then the opScan parser will be applied on the input until it fails.

The process of matching opScan parser and sepScan parser will continue in a loop until either one of them fails on the input stream or untilScan matches.

For every successful match of opScan, the returned ParsecNode from matching parser will be accumulated and passed as argument to Nodify callback. If there is not a single match for opScan, then ManyUntil will fail without consuming the input.

Example
// make sure to parse the entire text
text := []byte("10,20,50")
y := ManyUntil(nil, Int(), Atom(",", "COMMA"), End())
root, _ := y(NewScanner(text))
nodes := root.([]ParsecNode)
t := nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[1].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[2].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
Output:

INT 10
INT 20
INT 50

func Maybe

func Maybe(callb Nodify, parser interface{}) Parser

Maybe combinator accepts a single parser, or reference to a parser, and tries to match the input stream with it. If parser fails to match the input, returns MaybeNone.

Example
// parse an optional token
equal := Atom("=", "EQUAL")
maybeand := Maybe(nil, Atom("&", "AND"))
y := And(nil, Ident(), equal, maybeand, Ident())

text := []byte("a = &b")
root, _ := y(NewScanner(text))
nodes := root.([]ParsecNode)
t := nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[1].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[2].([]ParsecNode)[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[3].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())

text = []byte("a = b")
root, _ = y(NewScanner(text))
nodes = root.([]ParsecNode)
t = nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[1].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
fmt.Println(nodes[2])
t = nodes[3].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
Output:

IDENT a
EQUAL =
AND &
IDENT b
IDENT a
EQUAL =
missing
IDENT b

func NoEnd

func NoEnd() Parser

NoEnd is a parser function to detect not-an-end of scanner output, return boolean as ParsecNode, hence incompatible with AST{}.

func Oct

func Oct() Parser

Oct return parser function to match an octal number literal in the input stream. Skip leading whitespace.

func OrdChoice

func OrdChoice(callb Nodify, parsers ...interface{}) Parser

OrdChoice combinator accepts a list of `Parser`, or references to parsers, where at least one of the parsers must match the input string. Return a parser function that can further be used to construct higher level parsers.

The first matching parser function's output is passed as argument to Nodify callback, that is []ParsecNode argument will just have one element in it. If none of the parsers match the input, then OrdChoice will fail without consuming any input.

Example
// parse a boolean value
text := []byte(`true`)
y := OrdChoice(nil, Atom("true", "TRUE"), Atom("false", "FALSE"))
root, _ := y(NewScanner(text))
nodes := root.([]ParsecNode)
t := nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
Output:

TRUE true
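Because the first matching parser wins, the order of the alternatives matters when one can match a prefix of another. The stand-alone sketch below uses a hypothetical `ordChoice` helper over plain strings to show the effect; it illustrates ordered choice in general and is not the package's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// ordChoice returns the first alternative that is a prefix of text,
// trying the alternatives strictly in the order given.
// (Hypothetical helper for illustration.)
func ordChoice(text string, alts ...string) (string, bool) {
	for _, a := range alts {
		if strings.HasPrefix(text, a) {
			return a, true
		}
	}
	return "", false
}

func main() {
	// "<" is listed first, so it wins and "=" is left unconsumed.
	m, ok := ordChoice("<=10", "<", "<=")
	fmt.Println(m, ok)

	// Listing the longer operator first gives the intended match.
	m, ok = ordChoice("<=10", "<=", "<")
	fmt.Println(m, ok)
}
```

A common rule of thumb for ordered choice is therefore to list longer or more specific alternatives before shorter ones.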

func OrdTokens

func OrdTokens(patterns []string, names []string) Parser

OrdTokens to parse a single token based on one of the specified `patterns`. Skip leading whitespaces.

func String

func String() Parser

String parse double quoted string in input text, this parser returns string type as ParsecNode, hence incompatible with AST combinators. Skip leading whitespace.

func Token

func Token(pattern string, name string) Parser

Token takes a regular-expression pattern and return a parser that will match input stream with supplied pattern. Skip leading whitespace. `name` will be used as the Terminal's name.

func TokenExact

func TokenExact(pattern string, name string) Parser

TokenExact is the same as Token(), but the pattern will be matched without skipping leading whitespace. `name` will be used as the terminal's name.

type Queryable

type Queryable interface {
	// GetName for the node.
	GetName() string

	// IsTerminal return true if node is a leaf node in syntax-tree.
	IsTerminal() bool

	// GetValue return parsed text, if node is NonTerminal it will
	// concat the entire sub-tree for parsed text and return the same.
	GetValue() string

	// GetChildren relevant only for NonTerminal node.
	GetChildren() []Queryable

	// GetPosition of the first terminal value in input.
	GetPosition() int

	// SetAttribute with a value string, can be called multiple times for the
	// same attrname.
	SetAttribute(attrname, value string) Queryable

	// GetAttribute for attrname, since more than one value can be set on the
	// attribute, return a slice of values.
	GetAttribute(attrname string) []string

	// GetAttributes return a map of all attributes set on this node.
	GetAttributes() map[string][]string
}

Queryable interface to be implemented by all nodes, both terminal and non-terminal nodes constructed using AST object.

type Scanner

type Scanner interface {
	// SetWSPattern to configure white space pattern. Typically used as
	//		scanner := NewScanner(input).SetWSPattern(" ")
	SetWSPattern(pattern string) Scanner

	// TrackLineno as cursor moves forward, this can slow down parsing.
	// Useful when developing with parsec package.
	TrackLineno() Scanner

	// Clone will return new clone of the underlying scanner structure.
	// This will be used by combinators to _backtrack_.
	Clone() Scanner

	// GetCursor gets the current cursor position inside input text.
	GetCursor() int

	// Match the input stream with `pattern` and return matching string
	// after advancing the scanner's cursor.
	Match(pattern string) ([]byte, Scanner)

	// Match the input stream with a simple string, rather than a
	// pattern. It should be more efficient. Return a bool indicating
	// if the match was successful after advancing the scanner's cursor.
	MatchString(string) (bool, Scanner)

	// SubmatchAll the input stream with a choice of `patterns`
	// and return matching string and submatches, after advancing the
	// Scanner's cursor.
	SubmatchAll(pattern string) (map[string][]byte, Scanner)

	// SkipWS skips white space characters in the input stream.
	// Return skipped whitespaces as byte-slice after advancing the
	// Scanner's cursor.
	SkipWS() ([]byte, Scanner)

	// SkipAny any occurrence of the elements of the slice.
	// Equivalent to Match(`(b[0]|b[1]|...|b[n])*`)
	// Returns Scanner after advancing its cursor.
	SkipAny(pattern string) ([]byte, Scanner)

	// Lineno return the current line-number of the cursor.
	Lineno() int

	// Endof detects whether end-of-file is reached in the input
	// stream and return a boolean indicating the same.
	Endof() bool
}

Scanner interface defines necessary methods to match the input stream.

func NewScanner

func NewScanner(text []byte) Scanner

NewScanner create and return a new instance of SimpleScanner object.

type SimpleScanner

type SimpleScanner struct {
	// contains filtered or unexported fields
}

SimpleScanner implements Scanner interface based on golang's regexp module.

func (*SimpleScanner) Clone

func (s *SimpleScanner) Clone() Scanner

Clone implement Scanner{} interface.

func (*SimpleScanner) Endof

func (s *SimpleScanner) Endof() bool

Endof implement Scanner{} interface.

func (*SimpleScanner) GetCursor

func (s *SimpleScanner) GetCursor() int

GetCursor implement Scanner{} interface.

func (*SimpleScanner) Lineno

func (s *SimpleScanner) Lineno() int

Lineno implement Scanner{} interface.

func (*SimpleScanner) Match

func (s *SimpleScanner) Match(pattern string) ([]byte, Scanner)

Match implement Scanner{} interface.

func (*SimpleScanner) MatchString

func (s *SimpleScanner) MatchString(str string) (bool, Scanner)

MatchString implement Scanner{} interface.

func (*SimpleScanner) SetWSPattern

func (s *SimpleScanner) SetWSPattern(pattern string) Scanner

SetWSPattern implement Scanner{} interface.

func (*SimpleScanner) SkipAny

func (s *SimpleScanner) SkipAny(pattern string) ([]byte, Scanner)

SkipAny implement Scanner{} interface.

func (*SimpleScanner) SkipWS

func (s *SimpleScanner) SkipWS() ([]byte, Scanner)

SkipWS implement Scanner{} interface.

func (*SimpleScanner) SkipWSUnicode

func (s *SimpleScanner) SkipWSUnicode() ([]byte, Scanner)

SkipWSUnicode for looping through runes checking for whitespace.

func (*SimpleScanner) SubmatchAll

func (s *SimpleScanner) SubmatchAll(patt string) (map[string][]byte, Scanner)

SubmatchAll implement Scanner{} interface.

func (*SimpleScanner) TrackLineno

func (s *SimpleScanner) TrackLineno() Scanner

TrackLineno implement Scanner{} interface.

type Terminal

type Terminal struct {
	Name       string // contains terminal's token type
	Value      string // value of the terminal
	Position   int    // Offset into the text stream where token was identified
	Attributes map[string][]string
}

Terminal type can be used to construct a terminal ParsecNode. It implements Queryable interface, hence can be used with AST object.

func NewTerminal

func NewTerminal(name, value string, position int) *Terminal

NewTerminal create a new Terminal instance. Supply the name of the terminal, its matching text from the input stream as value, and its position within the input stream.

func (*Terminal) GetAttribute

func (t *Terminal) GetAttribute(attrname string) []string

GetAttribute implement Queryable interface.

func (*Terminal) GetAttributes

func (t *Terminal) GetAttributes() map[string][]string

GetAttributes implement Queryable interface.

func (*Terminal) GetChildren

func (t *Terminal) GetChildren() []Queryable

GetChildren implement Queryable interface.

func (*Terminal) GetName

func (t *Terminal) GetName() string

GetName implement Queryable interface.

func (*Terminal) GetPosition

func (t *Terminal) GetPosition() int

GetPosition implement Queryable interface.

func (*Terminal) GetValue

func (t *Terminal) GetValue() string

GetValue implement Queryable interface.

func (*Terminal) IsTerminal

func (t *Terminal) IsTerminal() bool

IsTerminal implement Queryable interface.

func (*Terminal) SetAttribute

func (t *Terminal) SetAttribute(attrname, value string) Queryable

SetAttribute implement Queryable interface.

Directories

Path Synopsis
Package json provide a parser to parse JSON string.
Package json provide a parser to parse JSON string.
tools
