ebnf

package module
v0.0.0-...-a5f62d7 Latest
Published: May 7, 2026 License: BSD-3-Clause Imports: 12 Imported by: 0

README

EBNF parser and helpers for Go

This package is a fork of https://pkg.go.dev/golang.org/x/exp/ebnf

It parses a slightly different syntax (productions are written with ::= instead of =) and includes some additional helpers for working with grammars.

Package ebnf is a library for EBNF grammars. The input is text ([]byte) satisfying the following grammar (represented itself in EBNF):

 Production  ::= name "::=" [ Expression ] "." .
 Expression  ::= Alternative { "|" Alternative } .
 Alternative ::= Term { Term } .
 Term        ::= name | token [ "…" token ] | Group | Option | Repetition .
 Group       ::= "(" Expression ")" .
 Option      ::= "[" Expression "]" .
 Repetition  ::= "{" Expression "}" .

A name is a Go identifier, a token is a Go string, and comments and white space follow the same rules as for the Go language. Production names starting with an uppercase Unicode letter denote non-terminal productions (i.e., productions which allow white-space and comments between tokens); all other production names denote lexical productions.

Examples

Parsing
package main

import (
	"bytes"
	"fmt"

	"github.com/5nord/ebnf"
)

func main() {
	src := []byte(`
		E ::= [ T E ].
		T ::= "a"|"b" .
	`)

	g, err := ebnf.Parse("", bytes.NewBuffer(src))
	if err != nil {
		panic(err)
	}

	// g maps each production name to its *ebnf.Production.
	fmt.Println(len(g), "productions parsed")
}

New Functions

First

ebnf.First returns the first-set of the given production.

	fmt.Println(ebnf.First(g, g["E"])) // Output: [a b]

Text

ebnf.Text returns the literal text of the given production.

	fmt.Println(ebnf.Text(src, g["E"])) // Output: E ::= [ T E ].

IsLexical

ebnf.IsLexical returns true if the given string is the name of a lexical production.

Inspect

ebnf.Inspect traverses the given expression and calls the given function for each node it visits.

	ebnf.Inspect(g["E"], func(e ebnf.Expression) bool {
		fmt.Printf("%T\n", e)
		return true
	})

Format

ebnf.Format formats the grammar.

	fmt.Println(ebnf.Format(g))

Documentation

Overview

Package ebnf is a library for EBNF grammars. The input is text ([]byte) satisfying the following grammar (represented itself in EBNF):

Production  ::= name "::=" [ Expression ] "." .
Expression  ::= Alternative { "|" Alternative } .
Alternative ::= Term { Term } .
Term        ::= name | token [ "…" token ] | Group | Option | Repetition .
Group       ::= "(" Expression ")" .
Option      ::= "[" Expression "]" .
Repetition  ::= "{" Expression "}" .

A name is a Go identifier, a token is a Go string, and comments and white space follow the same rules as for the Go language. Production names starting with an uppercase Unicode letter denote non-terminal productions (i.e., productions which allow white-space and comments between tokens); all other production names denote lexical productions.

Functions

func First

func First(grammar Grammar, x Expression) []string

First returns the first set of the given expression, i.e. the tokens that can begin a string derived from it.

func Format

func Format(g Grammar) string

func Inspect

func Inspect(e Expression, fn func(e Expression) bool) bool

Inspect traverses the given expression and calls the given function for each node it visits.

func IsLexical

func IsLexical(name string) bool

IsLexical reports whether the given name denotes a lexical production, i.e. whether it does not start with an uppercase Unicode letter.
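
The naming rule from the overview can be sketched in a few lines. isLexicalSketch is a hypothetical stand-in written for illustration; it follows the rule as documented and is not necessarily how the package implements IsLexical.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isLexicalSketch treats a name as lexical unless its first rune is an
// uppercase Unicode letter. Hypothetical helper, not the package's code.
func isLexicalSketch(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return !unicode.IsUpper(r)
}

func main() {
	for _, name := range []string{"Expression", "letter", "Über", "_digit"} {
		fmt.Printf("%-10s lexical=%v\n", name, isLexicalSketch(name))
	}
}
```

Decoding the first rune rather than the first byte matters for names such as Über, whose initial letter is uppercase but not ASCII.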

func Text

func Text(src []byte, p *Production) string

Text returns the text of the production.

func Verify

func Verify(grammar Grammar, start string) error

Verify checks that:

  • all productions used are defined
  • all productions defined are used when beginning at start
  • lexical productions refer only to other lexical productions

Verify returns an error if any of these checks fails.

Types

type Alternative

type Alternative []Expression // x | y | z

An Alternative node represents a non-empty list of alternative expressions.

func (Alternative) Pos

func (x Alternative) Pos() scanner.Position

type Bad

type Bad struct {
	TokPos scanner.Position
	Error  string // parser error message
}

A Bad node stands for pieces of source code that lead to a parse error.

func (*Bad) Pos

func (x *Bad) Pos() scanner.Position

type Expression

type Expression interface {
	// Pos is the position of the first character of the syntactic construct
	Pos() scanner.Position
}

An Expression node represents a production expression.

type Grammar

type Grammar map[string]*Production

A Grammar is a set of EBNF productions. The map is indexed by production name.

func Parse

func Parse(filename string, src io.Reader) (Grammar, error)

Parse parses a set of EBNF productions from source src. It returns a set of productions. Errors are reported for incorrect syntax and if a production is declared more than once; the filename is used only for error positions.

type Group

type Group struct {
	Lparen scanner.Position
	Body   Expression // (body)
}

A Group node represents a grouped expression.

func (*Group) Pos

func (x *Group) Pos() scanner.Position

type Name

type Name struct {
	StringPos scanner.Position
	String    string
}

A Name node represents a production name.

func (*Name) Pos

func (x *Name) Pos() scanner.Position

type Option

type Option struct {
	Lbrack scanner.Position
	Body   Expression // [body]
}

An Option node represents an optional expression.

func (*Option) Pos

func (x *Option) Pos() scanner.Position

type Production

type Production struct {
	Name *Name
	Expr Expression
}

A Production node represents an EBNF production.

func Productions

func Productions(grammar Grammar) []*Production

Productions returns the productions of the grammar in the order they appear in the source file.

func (*Production) Pos

func (x *Production) Pos() scanner.Position

type Range

type Range struct {
	Begin, End *Token // begin ... end
}

A Range node represents a range of characters.

func (*Range) Pos

func (x *Range) Pos() scanner.Position

type Repetition

type Repetition struct {
	Lbrace scanner.Position
	Body   Expression // {body}
}

A Repetition node represents a repeated expression.

func (*Repetition) Pos

func (x *Repetition) Pos() scanner.Position

type Sequence

type Sequence []Expression // x y z

A Sequence node represents a non-empty list of sequential expressions.

func (Sequence) Pos

func (x Sequence) Pos() scanner.Position

type Token

type Token struct {
	StringPos scanner.Position
	String    string
}

A Token node represents a literal.

func (*Token) Pos

func (x *Token) Pos() scanner.Position
