kowhai

package module
v0.0.0-...-5b6ea31

Published: May 15, 2015 License: MIT Imports: 2 Imported by: 5

README

Kowhai

Kowhai is an Earley-style parser loosely based on the published MARPA algorithm. Currently it is in an experimental state but should handle all LR grammars.

A kowhai.Grammar is used to build up a ruleset using an API that maps well to typical BNF constructs. Once a top-level start rule is identified, it can build a split-epsilon finite state machine.

A kowhai.Parser, given a stream of tokens (perhaps from a lexer) and a state machine, will produce a parse tree that can then be used to build an abstract syntax tree.
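
A minimal end-to-end sketch of that flow is shown below. The import path, the grammar, and the tokens are illustrative assumptions; only CreateGrammar, CreateRule, SetStart, BuildStateMachine, CreateParser, ScanToken, BuildParseTree and DumpTreeNode come from the documented API, and the assumption that a LiteralToken's value matches a Symbol term is hedged in the comments.

package main

import (
	"fmt"

	"github.com/example/kowhai" // placeholder import path; substitute the real module path
)

func main() {
	// Build a grammar for the two-word phrase "hello world".
	g := kowhai.CreateGrammar()
	g.CreateRule("greeting", g.Symbol("hello"), g.Symbol("world"))
	g.SetStart("greeting")

	// Build the state machine once; it can be shared by many parses.
	machine := g.BuildStateMachine()
	parser := kowhai.CreateParser(machine)

	// Deliver tokens as a lexer would; assumed: a LiteralToken's value matches Symbol terms.
	for _, tok := range []kowhai.LiteralToken{"hello", "world"} {
		if err := parser.ScanToken(tok); err != nil {
			fmt.Println("parse error:", err)
			return
		}
	}

	// Build the parse tree and dump it for inspection.
	tree := parser.BuildParseTree()
	kowhai.DumpTreeNode(tree, 0)
}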

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func DumpTreeNode

func DumpTreeNode(parseNode *ParseTreeNode, depth int)

Types

type AhfaCompletion

type AhfaCompletion struct {
	// contains filtered or unexported fields
}

this will hopefully become an SPPF node

type AhfaMachine

type AhfaMachine struct {
	// contains filtered or unexported fields
}

the state machine

func (*AhfaMachine) AcceptedState

func (ah *AhfaMachine) AcceptedState(state int) bool

func (*AhfaMachine) Completed

func (ah *AhfaMachine) Completed(state int) (rules []Term)

func (*AhfaMachine) Goto

func (machine *AhfaMachine) Goto(state int, symbol Term) int

transition to the next state

func (*AhfaMachine) IndexOf

func (machine *AhfaMachine) IndexOf(state AhfaState) int

used during building of machine to deduplicate states

func (*AhfaMachine) String

func (machine *AhfaMachine) String() string

gives us a way to dump and inspect the machine

type AhfaRule

type AhfaRule struct {
	// contains filtered or unexported fields
}

a rule in the form used by the state machine

func (*AhfaRule) AsTerm

func (r *AhfaRule) AsTerm() Term

convert to a term for convenience

func (*AhfaRule) IsCompleted

func (rule *AhfaRule) IsCompleted() bool

func (*AhfaRule) String

func (rule *AhfaRule) String() string

type AhfaState

type AhfaState []*AhfaRule

group of dotted rules

func (AhfaState) Contains

func (state AhfaState) Contains(rule *AhfaRule) bool

check to see if a rule is present in a state

type EarleyItem

type EarleyItem struct {
	// contains filtered or unexported fields
}

Tracks the AHFA state and parent state. Should parent be a pointer to the parent EarleyItem? The symbol is used when memoizing Leo items.

func (EarleyItem) String

func (item EarleyItem) String() string

type EarleyItemSet

type EarleyItemSet struct {
	// contains filtered or unexported fields
}

func (*EarleyItemSet) AddItem

func (set *EarleyItemSet) AddItem(state int, parent int, parseNode *SppfNode)

add an item to the EIS

func (*EarleyItemSet) String

func (set *EarleyItemSet) String() string

type Grammar

type Grammar struct {
	// contains filtered or unexported fields
}

A grammar is a set of rules that the parser should recognize. The usage is: create a grammar, add some rules, set a start rule (the top-level definition), and generate a state machine for the parser.

func CreateGrammar

func CreateGrammar() *Grammar

func (*Grammar) BuildStateMachine

func (g *Grammar) BuildStateMachine() *AhfaMachine

the machine can be reused across many parses

func (*Grammar) CreateRule

func (g *Grammar) CreateRule(name string, terms ...Term)

this creates a rule OR adds a production to an existing rule; this is a convenience behaviour, since it makes it much easier to transcribe EBNF
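
For example, extending the sketch in the README above (the rule and symbol names are illustrative), two calls with the same name yield a single rule with two productions:

// expr := term
g.CreateRule("expr", g.Lookup("term"))
// expr := expr "+" term
g.CreateRule("expr", g.Lookup("expr"), g.Symbol("+"), g.Lookup("term"))
// term := "x"   (a placeholder production so "term" is defined)
g.CreateRule("term", g.Symbol("x"))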

func (*Grammar) DumpRules

func (g *Grammar) DumpRules()

func (*Grammar) GetStartRule

func (g *Grammar) GetStartRule() *Rule

func (*Grammar) Lookup

func (g *Grammar) Lookup(name string) (out *Rule)

Lookup a rule. This creates the rule if it doesn't exist (which makes forward declarations possible). TODO: validate that all rules have productions at build time

func (*Grammar) Optional

func (g *Grammar) Optional(terms ...Term) (out Term)

Expand a ? operator (zero-or-one)

func (*Grammar) Or

func (g *Grammar) Or(terms ...Term) (out Term)

Create an OR rule where each term is a different branch

func (*Grammar) Plus

func (g *Grammar) Plus(terms ...Term) (out Term)

Expand a + operator (one-or-more)

func (*Grammar) SetStart

func (g *Grammar) SetStart(name string)

func (*Grammar) Star

func (g *Grammar) Star(terms ...Term) (out Term)

Expand a * operator (zero-or-more)

func (*Grammar) Symbol

func (g *Grammar) Symbol(name string) Term

Create a symbol (which matches a token value)

func (*Grammar) Type

func (g *Grammar) Type(tokenType int) (out Term)

Create a TypedTerm (which matches on token type)

func (*Grammar) UnionSingleTerms

func (g *Grammar) UnionSingleTerms(name string, terms ...Term)

handle the common right-hand side form "t1 | t2 | t3"; the same as a top-level Or
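
A sketch combining these helpers (Symbol, Type, Optional, Or, Plus, Star, UnionSingleTerms), again extending the earlier sketch; the rule names and the token-type constant are illustrative, not part of the package:

func defineListGrammar(g *kowhai.Grammar) {
	const tokNumber = 1 // hypothetical lexer token type for numeric literals

	// item := NUMBER | "nil"
	g.UnionSingleTerms("item", g.Type(tokNumber), g.Symbol("nil"))

	// signed := ("+" | "-")? NUMBER
	g.CreateRule("signed", g.Optional(g.Or(g.Symbol("+"), g.Symbol("-"))), g.Type(tokNumber))

	// digits := NUMBER+
	g.CreateRule("digits", g.Plus(g.Type(tokNumber)))

	// list := "[" item ("," item)* "]"
	g.CreateRule("list",
		g.Symbol("["),
		g.Lookup("item"),
		g.Star(g.Symbol(","), g.Lookup("item")),
		g.Symbol("]"))
}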

type LiteralToken

type LiteralToken string

Simple token type that can be delivered to the parser

func (LiteralToken) AsValue

func (l LiteralToken) AsValue() string

func (LiteralToken) TokenType

func (l LiteralToken) TokenType() int

type MarpaParser

type MarpaParser struct {
	// contains filtered or unexported fields
}

func CreateParser

func CreateParser(machine *AhfaMachine) MarpaParser

create a new parser that uses the machine

func (*MarpaParser) AddOptimization

func (parser *MarpaParser) AddOptimization(o ParseTreeOptimizer)

func (*MarpaParser) BuildParseTree

func (parser *MarpaParser) BuildParseTree() *ParseTreeNode

func (*MarpaParser) DumpMachine

func (parser *MarpaParser) DumpMachine()

dump the state machine for inspection

func (*MarpaParser) DumpTable

func (parser *MarpaParser) DumpTable()

dump the Earley sets for inspection

func (*MarpaParser) MakeParseNode

func (parser *MarpaParser) MakeParseNode(rule Term, origin int, location int, w *SppfNode, v *SppfNode, nodes map[string]*SppfNode) (y *SppfNode)

func (*MarpaParser) OptimizeParseTree

func (parser *MarpaParser) OptimizeParseTree(tree *ParseTreeNode) *ParseTreeNode

run any optimizations over the tree

func (*MarpaParser) PrintAcceptedTree

func (parser *MarpaParser) PrintAcceptedTree() bool

placeholder function where we can look at the parse tree once we are building one!

func (*MarpaParser) PrintCNodes

func (parser *MarpaParser) PrintCNodes()

func (*MarpaParser) ScanToken

func (parser *MarpaParser) ScanToken(token Token) (err error)

this handles the next token delivered by the lexer
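
A sketch of the typical driving loop; the lexer callback here is hypothetical, and only ScanToken comes from the package:

// parseAll feeds tokens to the parser until the (hypothetical) lexer is exhausted.
func parseAll(parser *kowhai.MarpaParser, next func() (kowhai.Token, bool)) error {
	for {
		tok, ok := next()
		if !ok {
			return nil // end of input
		}
		if err := parser.ScanToken(tok); err != nil {
			return err // the token was rejected at this position
		}
	}
}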

type OverlappingChildren

type OverlappingChildren struct {
}

This optimization handles child nodes that overlap (due to partial parse trees). Generally, applying this optimization makes the resulting parse tree easier to understand.

func (OverlappingChildren) Postprocess

func (o OverlappingChildren) Postprocess(node *ParseTreeNode) *ParseTreeNode

func (OverlappingChildren) Preprocess

func (o OverlappingChildren) Preprocess(node *ParseTreeNode) *ParseTreeNode

type ParseNodes

type ParseNodes []AhfaCompletion

used to implement sort.Interface

func (ParseNodes) Len

func (x ParseNodes) Len() int

func (ParseNodes) Less

func (x ParseNodes) Less(i, j int) bool

func (ParseNodes) Swap

func (x ParseNodes) Swap(i, j int)

type ParseTreeNode

type ParseTreeNode struct {
	Term     Term
	Parent   *ParseTreeNode
	Children []*ParseTreeNode
	// contains filtered or unexported fields
}

func (*ParseTreeNode) Overlaps

func (node *ParseTreeNode) Overlaps(other *ParseTreeNode) bool

type ParseTreeOptimizer

type ParseTreeOptimizer interface {
	Preprocess(node *ParseTreeNode) *ParseTreeNode
	Postprocess(node *ParseTreeNode) *ParseTreeNode
}
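
A minimal sketch of a custom optimizer satisfying this interface; it leaves the tree untouched and merely counts the nodes it is handed (when each hook is invoked is an implementation detail of the parser):

// NodeCounter is a do-nothing optimizer used to illustrate the interface shape.
type NodeCounter struct {
	Seen int
}

func (c *NodeCounter) Preprocess(node *kowhai.ParseTreeNode) *kowhai.ParseTreeNode {
	c.Seen++
	return node // leave the node unchanged
}

func (c *NodeCounter) Postprocess(node *kowhai.ParseTreeNode) *kowhai.ParseTreeNode {
	return node
}

// registration:   parser.AddOptimization(&NodeCounter{})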

type ParseTreeToken

type ParseTreeToken struct {
	Token Token
}

func (*ParseTreeToken) IsRule

func (t *ParseTreeToken) IsRule() bool

func (*ParseTreeToken) MatchesToken

func (t *ParseTreeToken) MatchesToken(token Token) bool

func (*ParseTreeToken) String

func (t *ParseTreeToken) String() string

type Production

type Production []Term

Production is the set of terms that make up the right side of a rule

type RemoveSyntheticNodes

type RemoveSyntheticNodes struct {
}

This optimization removes parse nodes that derive from rules created by the grammar operators (+, ?, *, |)

func (RemoveSyntheticNodes) GetNaturalChildren

func (o RemoveSyntheticNodes) GetNaturalChildren(node *ParseTreeNode) (children []*ParseTreeNode)

func (RemoveSyntheticNodes) Postprocess

func (o RemoveSyntheticNodes) Postprocess(node *ParseTreeNode) *ParseTreeNode

func (RemoveSyntheticNodes) Preprocess

func (o RemoveSyntheticNodes) Preprocess(node *ParseTreeNode) *ParseTreeNode

type Rule

type Rule struct {
	Productions []Production
	// contains filtered or unexported fields
}

Name is the left-hand side; Productions are the various right-hand sides

func (*Rule) Add

func (r *Rule) Add(s Production)

func (*Rule) IsAllowedChild

func (r *Rule) IsAllowedChild(term Term) bool

this is used when building parse trees to prune out alternative shorter parse possibilities

func (*Rule) IsRightRecursive

func (r *Rule) IsRightRecursive() bool

func (*Rule) IsRule

func (r *Rule) IsRule() bool

func (*Rule) MatchesToken

func (r *Rule) MatchesToken(token Token) bool

func (*Rule) String

func (r *Rule) String() string

type SppfNode

type SppfNode struct {
	// contains filtered or unexported fields
}

func (*SppfNode) Label

func (s *SppfNode) Label() string

func (*SppfNode) String

func (s *SppfNode) String() string

type SppfTerm

type SppfTerm string

Used only when building the parse tree; will need to be refactored away later

func (SppfTerm) IsRule

func (l SppfTerm) IsRule() bool

func (SppfTerm) MatchesToken

func (l SppfTerm) MatchesToken(token Token) bool

func (SppfTerm) String

func (l SppfTerm) String() string

type Symbol

type Symbol string

Symbol looks at a token and matches on its value. Generally used for keywords and punctuation referenced directly in rule definitions.

func (Symbol) IsRule

func (s Symbol) IsRule() bool

func (Symbol) MatchesToken

func (s Symbol) MatchesToken(token Token) bool

type Term

type Term interface {
	IsRule() bool
	MatchesToken(token Token) bool
}

A term is anything that can appear on the RHS of a rule. Here we define Symbol (for literal terminals that appear in a rule definition), Rule (for non-terminals), and TypedTerm (for matching the type of token produced by a lexer)

type Token

type Token interface {
	AsValue() string //exposes a string value for symbol matches
	TokenType() int  // exposes a token type for a type match
}

A trivial interface for the tokens consumed by the parser
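
A sketch of a lexer-side token type satisfying this interface; the field names and semantics are illustrative:

type LexToken struct {
	Kind  int    // lexer-assigned token type, matched by TypedTerm
	Value string // matched text, matched by Symbol
}

func (t LexToken) AsValue() string { return t.Value }
func (t LexToken) TokenType() int  { return t.Kind }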

type TrimWhenSingle

type TrimWhenSingle struct {
	AppliesTo []string
}

This optimization can be used to remove intermediate nodes that have only a single child.
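
For example, continuing the parser sketch above (the "expr" rule name is illustrative, and whether BuildParseTree already applies registered optimizations is not documented here, so the explicit OptimizeParseTree call is an assumption), the built-in optimizers might be registered like this:

parser.AddOptimization(kowhai.OverlappingChildren{})
parser.AddOptimization(kowhai.RemoveSyntheticNodes{})
parser.AddOptimization(kowhai.TrimWhenSingle{AppliesTo: []string{"expr"}})

tree := parser.OptimizeParseTree(parser.BuildParseTree())
kowhai.DumpTreeNode(tree, 0)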

func (TrimWhenSingle) Postprocess

func (o TrimWhenSingle) Postprocess(node *ParseTreeNode) *ParseTreeNode

func (TrimWhenSingle) Preprocess

func (o TrimWhenSingle) Preprocess(node *ParseTreeNode) *ParseTreeNode

type TypedTerm

type TypedTerm int

A TypedTerm looks at the token type for a match. Generally used for recognizing things produced by a lexer, such as an int literal, string literal, etc.

func (TypedTerm) IsRule

func (t TypedTerm) IsRule() bool

func (TypedTerm) MatchesToken

func (t TypedTerm) MatchesToken(token Token) bool
