lr

package
v0.0.0-...-dd1d5c9 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 1, 2023 License: BSD-3-Clause Imports: 0 Imported by: 0

README

Naive / (mostly) canonical lr parser generator.

This deviates slightly from the dragon book's LR(1) presentation:

1. The start production is item is [#accept -> ^ . S, $] instead of
   [#accept -> . S, $].  This avoids the need to special case the start
   state's kernel.

2. The token stream act as pseudo symbol stack.  Instead of relying on
   a secondary GOTO table for reduction, the goto entries are encoded as
   shift actions in the ACTION table.  The reduction action will pop
   n symbols from state stack, and push the reduced symbol to the front
   of the pseudo symbol stack.  The parser then perform a second lookup
   to shift the reduced symbol.

3. This uses the LALR "core union" idea to merge states.  However, unlike
   LALR, this merges two states iff the merge does not introduce any new
   (shift/reduce, reduce/reduce) conflicts, i.e., we allow multiple copies of
   states with the same core.

4. For each (conflict-free) state, we replace the most used core production
   items of the form [A -> B . , x] with a single [A -> B . , *] entry.  The
   * indicates the look ahead symbol can be anything. Internally, an ACTION
   table lookup will first attempt to look up the look ahead specific action,
   and if that fails, fall back to looking up the wildcard action.

5. For convenience, we allow %start to accept a list of production names
   instead of a single production name.  The generated state graph will
   have one start state for each specified production.

The generator's parser is self bootstrapped from ./internal/parser/grammar.lr

---------

Note on parser state size:

With the optimizations listed above, it looks like we can generate a fully
LR(1) compliant parser using a roughly the same number of states as LALR, but
at the expense of more shift action entries.

For comparison, the following are some stats generated from the ansi c's grammar
(in ./test/ansi_c)

Canonical LR1:
Number of states: 1572
Number of shift actions: 14481
Number of reduce actions: 15182

With Core Union (#3 above):
Number of states: 351
Number of shift actions: 2989
Number of reduce actions: 4084

With Core Union And Reduction Action Merging (#3 and #4 above):
Number of states: 351
Number of shift actions: 2989
Number of reduce actions: 246

YACC / LALR:
Number of states: 349
Number of shift + goto actions: 1702 + 238 = 1940
Number of reduce actions: ?

Documentation

The Go Gopher

There is no documentation for this package.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL