earley

package module
v1.0.6
Published: Apr 11, 2024 License: MIT Imports: 7 Imported by: 0

README

Earley Parser

This repository contains an Earley parser implementation for Go, based on my dissertation work from way back, ported to Go to be useful for building DSLs.

Metagrammar

A simple notation for writing a grammar, which can be turned into a consumer of this library:

%package parser

// import some dependencies
%import "slices"
%import "bitbucket.org/dkolbly/fun"

// define some tokens
%token {lexeme} NAME INTEGER;
%token {string} NAME;
%token {int} INTEGER;
%token {lexeme} "+" "-";

// define the types for rules
%type {fun.Expr} start expr;

// define some rules
start ::= expr {}

expr ::= expr "+" expr {
    // return a composite thing
}

expr ::= n:NAME {
    return &fun.Ident{n}
}

expr ::= n:INTEGER {
    return &fun.Literal{n}
}

For a working example, consult the example/ directory.

Using the generator

The generator is the program that takes a metagrammar file and produces Go code implementing the specified grammar.

go install bitbucket.org/dkolbly/earley3/generate-earley3@latest

Building using Bazel

I find the following genrule useful for building a grammar:

genrule(
    name = "grammar_go",
    srcs = ["grammar.el"],
    outs = ["grammar.go"],
    cmd = "generate-earley3 $(location grammar.el) > \"$@\"",
)

The generated grammar can then be added to the go_library rule, as in:

load("@rules_go//go:def.bzl", "go_library")

go_library(
    name = "parser_lib",
    srcs = glob(["*.go"], exclude=["*_test.go"]) + [":grammar_go"],
    importpath = "bitbucket.org/dkolbly/some/thing",
    visibility = ["//visibility:public"],
    deps = [
        # ...
        "@org_bitbucket_dkolbly_earley3//:earley3",
    ]
)

Documentation

Index

Constants

This section is empty.

Variables

var ErrAmbiguous = errors.New("ambiguous")
var ErrNoParse = errors.New("no parse")
var Trace = false

Functions

func WriteGrammar

func WriteGrammar(dst io.Writer, s Scope)

Types

type GrammarScope

type GrammarScope interface {
	Scope
	Rule(lhs NonTerminal, formals ...interface{}) *Production
}

func NewGrammarScope

func NewGrammarScope(outer Scope) GrammarScope

type Item

type Item struct {
	Of *Production
	At int
}

type Lang

type Lang struct {
	// contains filtered or unexported fields
}

func New

func New(starts ...NonTerminal) *Lang

func (*Lang) Extend

func (l *Lang) Extend(with *Lang) *Lang

Extend extends the receiver language with the productions available from the other language. TODO: this API could be improved; the Lang object is fuzzy in its semantics, and this should be done with scopes instead.

func (*Lang) Global

func (l *Lang) Global() Scope

func (*Lang) NewLangWithGlobalScope

func (l *Lang) NewLangWithGlobalScope(s Scope) *Lang

func (*Lang) Parse

func (l *Lang) Parse(ctx context.Context, src Scanner) (Meaning, error)

func (*Lang) ParseStart

func (l *Lang) ParseStart(ctx context.Context, src Scanner, start NonTerminal) (Meaning, error)

func (*Lang) Restart

func (l *Lang) Restart(start NonTerminal) *Lang

func (*Lang) Rule

func (l *Lang) Rule(lhs NonTerminal, formals ...interface{}) *Production

func (*Lang) WriteTo

func (s *Lang) WriteTo(dst io.Writer)

type Lexeme

type Lexeme interface {
	//TokenType() TokenTypeCode
	Value() interface{}
}

type Meaning

type Meaning interface{}

type NonTerminal

type NonTerminal string

type ParseError

type ParseError struct {
	// contains filtered or unexported fields
}

func (*ParseError) Error

func (err *ParseError) Error() string

type Production

type Production struct {
	// contains filtered or unexported fields
}

func Lookup

func Lookup(in Scope, nt NonTerminal) []*Production

func (*Production) Accept

func (p *Production) Accept() *Production

Accept marks a production so that, when it is reduced, the input is accepted; the meaning of the accepted production becomes the meaning of the entire parse.

func (*Production) Nth

func (p *Production) Nth(i int) interface{}

Nth returns the nth item in the production, which is either a TokenMatcher or a NonTerminal.

func (*Production) Reject

func (p *Production) Reject() *Production

func (*Production) Then

func (p *Production) Then(r Reducer) *Production

type Reducer

type Reducer func([]Meaning) Meaning

type Scanner

type Scanner interface {
	Scan(context.Context) (Lexeme, error)
}

type Scope

type Scope interface {
	Outer() Scope
	Context() context.Context
}

type ScopeChanger

type ScopeChanger interface {
	ScopeChange(Scope) Scope
}

type ScopeChangerWithErr

type ScopeChangerWithErr interface {
	ScopeChange(Scope) (Scope, error)
}

type TokenMatcher

type TokenMatcher interface {
	Match(Scope, Lexeme) bool
	String() string
}
