ungrammar

package module
v0.2.0 Latest
Published: Jul 7, 2023 | License: Unlicense | Imports: 3 | Imported by: 0

README

go-ungrammar

Ungrammar implementation and API in Go. Ungrammar is a DSL for concrete syntax trees (CST). For some background on CSTs and how they relate to ASTs, see this blog post.

This implementation is based on the original ungrammar crate and borrows some test files from it.

Ungrammar syntax

The syntax of Ungrammar files is very simple:

//           -- comment
Name =       -- non-terminal definition
'ident'      -- token (terminal)
A B          -- sequence
A | B        -- alternation
A*           -- repetition (zero or more)
A?           -- optional (zero or one)
(A B)        -- grouping elements for precedence control
label:A      -- label hint for naming

For some concrete examples, look at files in the testdata directory.
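To make the syntax table above concrete, here is a minimal sketch of how such input could be split into lexemes. It is not the package's own lexer: the function name `lex`, the identifier rule (letters, digits, underscore), and the lack of escape handling inside quoted tokens are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// lex splits Ungrammar source into lexemes. Hypothetical sketch: it tracks
// no locations and does not handle escapes inside quoted tokens.
func lex(src string) ([]string, error) {
	var out []string
	rs := []rune(src)
	for i := 0; i < len(rs); {
		r := rs[i]
		switch {
		case unicode.IsSpace(r):
			i++
		case r == '/' && i+1 < len(rs) && rs[i+1] == '/':
			// Line comment: skip to the end of the line.
			for i < len(rs) && rs[i] != '\n' {
				i++
			}
		case strings.ContainsRune("=*|?:()", r):
			// Single-character punctuation.
			out = append(out, string(r))
			i++
		case r == '\'':
			// Quoted token (terminal): read up to the closing quote.
			j := i + 1
			for j < len(rs) && rs[j] != '\'' {
				j++
			}
			if j == len(rs) {
				return out, fmt.Errorf("unterminated token at rune offset %d", i)
			}
			out = append(out, string(rs[i:j+1]))
			i = j + 1
		case unicode.IsLetter(r) || r == '_':
			// Name of a non-terminal, or a label (assumed identifier rule).
			j := i
			for j < len(rs) && (unicode.IsLetter(rs[j]) || unicode.IsDigit(rs[j]) || rs[j] == '_') {
				j++
			}
			out = append(out, string(rs[i:j]))
			i = j
		default:
			return out, fmt.Errorf("unexpected character %q at rune offset %d", r, i)
		}
	}
	return out, nil
}

func main() {
	toks, err := lex("// a rule\nBaz = ( Kay Jay )* | label:'id'?")
	if err != nil {
		panic(err)
	}
	fmt.Println(strings.Join(toks, " "))
}
```

Each lexeme kind in this sketch corresponds to one row of the syntax table; a real lexer would also record source locations for error reporting.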

Usage

Usage example:

https://github.com/eliben/go-ungrammar/blob/229d0dd20660980d5069ed676c5c728a9fda5723/example_test.go#L13-L31

For somewhat more sophisticated usage, see the cmd/ungrammar2json command.

Documentation

Overview

Package ungrammar provides a parser and representation for Ungrammar concrete syntax trees.

Constants

const (
	// Special tokens
	ERROR tokenName = iota
	EOF

	NODE
	TOKEN

	EQ
	STAR
	PIPE
	QMARK
	COLON
	LPAREN
	RPAREN
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Alt

type Alt struct {
	Rules []Rule
}

func (*Alt) Location

func (alt *Alt) Location() location

func (*Alt) String

func (alt *Alt) String() string

type ErrorList

type ErrorList []error

ErrorList represents multiple parse errors reported by the parser on a given source. It's loosely modeled on scanner.ErrorList in the Go standard library. ErrorList implements the error interface.

func (*ErrorList) Add

func (el *ErrorList) Add(err error)

func (ErrorList) Error

func (el ErrorList) Error() string
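The following sketch shows one way a caller might unpack an error list of this shape with `errors.As`. The `ErrorList` here is a local stand-in with the documented signatures, not the package's code: the body of `Error`, the `parse` and `countErrors` helpers, and the error messages are all made up for illustration, and it assumes the list is returned by value.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrorList is a local stand-in with the same shape as the documented type.
type ErrorList []error

// Add appends err to the list.
func (el *ErrorList) Add(err error) { *el = append(*el, err) }

// Error joins the messages with newlines. The real package's format may differ.
func (el ErrorList) Error() string {
	msgs := make([]string, len(el))
	for i, e := range el {
		msgs[i] = e.Error()
	}
	return strings.Join(msgs, "\n")
}

// parse is a hypothetical stand-in for a parser call that collects errors.
func parse() error {
	var el ErrorList
	el.Add(errors.New("1:5: expected '='"))
	el.Add(errors.New("3:1: unexpected token"))
	if len(el) == 0 {
		// Return a nil interface, not an empty list, so err == nil works.
		return nil
	}
	return el
}

// countErrors reports how many individual errors err carries.
func countErrors(err error) int {
	var el ErrorList
	if errors.As(err, &el) {
		return len(el)
	}
	if err != nil {
		return 1
	}
	return 0
}

func main() {
	err := parse()
	fmt.Println("errors:", countErrors(err))

	var el ErrorList
	if errors.As(err, &el) {
		for _, e := range el {
			fmt.Println(" ", e)
		}
	}
}
```

Returning a plain `nil` when the list is empty avoids the common pitfall of a non-nil `error` interface wrapping an empty slice.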

type Grammar

type Grammar struct {
	// Rules maps ruleName --> Rule
	Rules map[string]Rule

	// NameLoc maps ruleName --> its location in the input, for accurate error
	// reporting. Rules carry their own locations, but since names are just
	// strings, locations are kept here.
	NameLoc map[string]location
}

Grammar represents a parsed Ungrammar file. The input is represented as a mapping from strings (the rule names on the left-hand side of Ungrammar rules) to rules (CSTs). For example, given a rule like "Foo = Bar Baz", the Rules map will map the string "Foo" to the CST Seq(Node(Bar), Node(Baz)).
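Because Rules is a Go map, its iteration order is unspecified, so code that prints or compares a grammar usually sorts the rule names first. The sketch below illustrates this with simplified local stand-ins: `Rule` drops the Location method, and `text` and `sortedRuleNames` are hypothetical helpers, not part of the package.

```go
package main

import (
	"fmt"
	"sort"
)

// Rule is a simplified stand-in for the package's Rule interface. The
// Location method is omitted because its return type is unexported.
type Rule interface{ String() string }

// text is a hypothetical Rule that just carries its printed form.
type text string

func (t text) String() string { return string(t) }

// Grammar mirrors only the Rules field of the documented struct.
type Grammar struct{ Rules map[string]Rule }

// sortedRuleNames returns the rule names in lexical order.
func sortedRuleNames(g *Grammar) []string {
	names := make([]string, 0, len(g.Rules))
	for name := range g.Rules {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	g := &Grammar{Rules: map[string]Rule{
		"Foo": text("Seq(Bar, Baz)"),
		"Baz": text("Alt(Rep(Seq(Kay, Jay)), 'id')"),
	}}
	// Print the rules in a deterministic order.
	for _, name := range sortedRuleNames(g) {
		fmt.Printf("%s = %s\n", name, g.Rules[name])
	}
}
```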

func (*Grammar) String

func (g *Grammar) String() string

type Labeled

type Labeled struct {
	Label string
	Rule  Rule
	// contains filtered or unexported fields
}

func (*Labeled) Location

func (lbl *Labeled) Location() location

func (*Labeled) String

func (lbl *Labeled) String() string

type Node

type Node struct {
	Name string
	// contains filtered or unexported fields
}

func (*Node) Location

func (node *Node) Location() location

func (*Node) String

func (node *Node) String() string

type Opt

type Opt struct {
	Rule Rule
}

func (*Opt) Location

func (opt *Opt) Location() location

func (*Opt) String

func (opt *Opt) String() string

type Parser

type Parser struct {
	// contains filtered or unexported fields
}

Parser parses ungrammar syntax into a Grammar. Create a new parser with NewParser, and then call its ParseGrammar method.

Example
input := `
Foo = Bar Baz
Baz = ( Kay Jay )* | 'id'`

// Create an Ungrammar parser and parse input.
p := ungrammar.NewParser(input)
ungram, err := p.ParseGrammar()
if err != nil {
	panic(err)
}

// Display the string representation of the parsed ungrammar.
fmt.Println(ungram.Rules["Foo"].String())
fmt.Println(ungram.Rules["Baz"].String())
Output:

Seq(Bar, Baz)
Alt(Rep(Seq(Kay, Jay)), 'id')

func NewParser

func NewParser(buf string) *Parser

NewParser creates a new parser with the given string input.

func (*Parser) ParseGrammar

func (p *Parser) ParseGrammar() (*Grammar, error)

ParseGrammar takes the input the Parser was initialized with and parses it into a Grammar. The returned error is an ErrorList that collects all the errors encountered during parsing; when errors occur, the returned Grammar may be partial.

type Rep

type Rep struct {
	Rule Rule
}

func (*Rep) Location

func (rep *Rep) Location() location

func (*Rep) String

func (rep *Rep) String() string

type Rule

type Rule interface {
	Location() location
	String() string
}

Rule is the interface defining an Ungrammar CST subtree. At runtime, a value implementing the Rule interface will have a concrete type which is one of the exported types in this file.
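Since the concrete type is one of a small fixed set, a type switch is a natural way to walk a rule tree. The sketch below uses local stand-ins that copy only the exported fields listed on this page. It assumes the concrete values are pointers, as the pointer receivers suggest, and simplifies `Rule` to an empty interface. `nodeRefs` is a hypothetical helper that collects the non-terminals a rule refers to.

```go
package main

import "fmt"

// Rule is simplified to an empty interface for this sketch. The real
// interface requires Location and String methods.
type Rule interface{}

// Stand-ins mirroring the exported fields documented on this page.
type (
	Alt     struct{ Rules []Rule }
	Seq     struct{ Rules []Rule }
	Rep     struct{ Rule Rule }
	Opt     struct{ Rule Rule }
	Labeled struct {
		Label string
		Rule  Rule
	}
	Node  struct{ Name string }
	Token struct{ Value string }
)

// nodeRefs appends to refs the name of every non-terminal reachable from r,
// in left-to-right order, and returns the extended slice.
func nodeRefs(r Rule, refs []string) []string {
	switch r := r.(type) {
	case *Alt:
		for _, sub := range r.Rules {
			refs = nodeRefs(sub, refs)
		}
	case *Seq:
		for _, sub := range r.Rules {
			refs = nodeRefs(sub, refs)
		}
	case *Rep:
		refs = nodeRefs(r.Rule, refs)
	case *Opt:
		refs = nodeRefs(r.Rule, refs)
	case *Labeled:
		refs = nodeRefs(r.Rule, refs)
	case *Node:
		refs = append(refs, r.Name)
	case *Token:
		// Terminals refer to no other rules.
	}
	return refs
}

func main() {
	// Hand-built tree for: Baz = ( Kay Jay )* | 'id'
	baz := &Alt{Rules: []Rule{
		&Rep{Rule: &Seq{Rules: []Rule{&Node{Name: "Kay"}, &Node{Name: "Jay"}}}},
		&Token{Value: "id"},
	}}
	fmt.Println(nodeRefs(baz, nil))
}
```

A walker like this could be used, for example, to check that every referenced name has an entry in a grammar's rule map.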

type Seq

type Seq struct {
	Rules []Rule
}

func (*Seq) Location

func (seq *Seq) Location() location

func (*Seq) String

func (seq *Seq) String() string

type Token

type Token struct {
	Value string
	// contains filtered or unexported fields
}

func (*Token) Location

func (tok *Token) Location() location

func (*Token) String

func (tok *Token) String() string

Directories

Path                 Synopsis
cmd/ungrammar2json   (command)
