sensors-parser

Parse a sensors.conf file with goyacc and lexmachine

This is an example of how to integrate lexmachine with the standard yacc implementation for Go. Yacc is its own weird and interesting language for specifying bottom-up shift-reduce parsers. You can "easily" use lexmachine with yacc, but it does require some understanding of

  1. How yacc works (e.g., the things it generates)
  2. How to use those generated definitions in your code

Running the example

$ go generate -x -v github.com/timtadh/lexmachine/examples/sensors-parser
$ go install github.com/timtadh/lexmachine/examples/sensors-parser
$ cat examples/sensors-parser/sensors.conf | sensors-parser

Partial Explanation

Yacc controls the definitions for tokens with its %token directives (see the sensors.y file). You will use those definitions in your lexer. An example of how to do this is in sensors_golex.go.
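
Here is a hedged sketch of what that wiring can look like. The token list and the token helper are modeled on sensors_golex.go but are assumptions, not a copy of it; the key point is that the lexer's token ids must stay in sync with the %token directives in sensors.y.

import (
    "github.com/timtadh/lexmachine"
    "github.com/timtadh/lexmachine/machines"
)

// tokenNames mirrors the %token directives in sensors.y (an assumed
// set; check sensors.y for the authoritative list and order). The index
// of each name serves as its lexmachine token type.
var tokenNames = []string{
    "AT", "BACKTICK", "CARROT", "DASH", "LPAREN", "NAME", "NEWLINE", "NUMBER", "RPAREN",
}

// token builds a lexmachine action that emits a *lexmachine.Token
// carrying the given type id and the matched text.
func token(typ int) lexmachine.Action {
    return func(s *lexmachine.Scanner, m *machines.Match) (interface{}, error) {
        return s.Token(typ, string(m.Bytes), m), nil
    }
}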

Second, Yacc expects the lexer to conform to the following interface:

type yyLexer interface {
    // Lex gets the next token and puts it in lval
    Lex(lval *yySymType) (tokenType int)
    // Error is called on parse error (it should probably panic?)
    Error(message string)
}

The yySymType type is generated by Yacc from the %union directive. The tokenType is the token identifier. However, the tokenType needs to be in the correct range for Yacc, which starts at yyPrivate. The way to get the types identified correctly is to return token.Type + yyPrivate - 1. See sensors_golex.go for a full example.
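
A minimal sketch of the Lex side of the interface follows, assuming the golex type defined later in this README; the end-of-stream and error handling are assumptions based on lexmachine's Scanner API, not the example's exact source.

// Lex pulls the next token from the lexmachine Scanner, stores it in
// lval, and shifts its type into goyacc's private token range.
func (l *golex) Lex(lval *yySymType) int {
    tok, err, eos := l.Scanner.Next()
    if err != nil {
        l.Error(err.Error()) // Error panics; see parse below
    }
    if eos {
        return 0 // 0 tells the generated parser it hit end of input
    }
    t := tok.(*lexmachine.Token)
    lval.token = t
    return t.Type + yyPrivate - 1
}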

Yacc, in its own special way, has each production "return" a yySymType, which serves as both an AST node and a token. Thus, my definition for yySymType is:

%union{
    token *lexmachine.Token
    ast   *Node
}
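
The Node type is part of the example's Go code, not the grammar. Here is a hedged sketch of its shape; only NewNode and AddKid appear in the grammar actions below, so the field names are assumptions.

type Node struct {
    Label    string
    Token    *lexmachine.Token
    Children []*Node
}

func NewNode(label string, token *lexmachine.Token) *Node {
    return &Node{Label: label, Token: token}
}

// AddKid returns the node itself so that calls can chain, as the
// grammar actions below rely on.
func (n *Node) AddKid(kid *Node) *Node {
    n.Children = append(n.Children, kid)
    return n
}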

This lets you construct an AST while parsing:

Unary : DASH Factor             { $$.ast = NewNode("negate", $1.token).AddKid($2.ast) }
      | BACKTICK Factor         { $$.ast = NewNode("`", $1.token).AddKid($2.ast) }
      | CARROT Factor           { $$.ast = NewNode("^", $1.token).AddKid($2.ast) }
      | Factor                  { $$.ast = $1.ast }
      ;

Factor : NAME                   { $$.ast = NewNode("name", $1.token) }
       | NUMBER                 { $$.ast = NewNode("number", $1.token) }
       | AT                     { $$.ast = NewNode("@", $1.token) }
       | LPAREN Expr RPAREN     { $$.ast = $2.ast }
       ;

Finally, yacc does not provide any means of returning anything from the parser. To deal with this, the lexer you provide needs a field that carries the result back to the caller:

type golex struct {
    *lexmachine.Scanner
    stmts []*Node
}
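
A matching constructor could look like the following sketch; the name newGoLex is taken from the parse function below, while the body is an assumption built on lexmachine's Lexer.Scanner.

// newGoLex wraps a lexmachine Scanner over the input text.
func newGoLex(lexer *lexmachine.Lexer, text []byte) (*golex, error) {
    scan, err := lexer.Scanner(text)
    if err != nil {
        return nil, err
    }
    return &golex{Scanner: scan}, nil
}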

In the example, stmts carries the parsed statements from the file back to the caller of the parser:

func parse(lexer *lexmachine.Lexer, fin io.Reader) (stmts []*Node, err error) {
    // Errors during parsing arrive as panics (see Error below); recover
    // turns a panicked error back into an ordinary return value.
    defer func() {
        if e := recover(); e != nil {
            switch e := e.(type) {
            case error:
                err = e
                stmts = nil
            default:
                panic(e)
            }
        }
    }()
    text, err := ioutil.ReadAll(fin)
    if err != nil {
        return nil, err
    }
    scanner, err := newGoLex(lexer, text)
    if err != nil {
        return nil, err
    }
    yyParse(scanner)
    return scanner.stmts, nil
}
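
The recover in parse is the other half of the Error method required by yyLexer: if Error panics with an error value, parse converts it back into an ordinary return value. One way to implement it, consistent with the recover above (an assumption, not the example's exact source):

import "errors"

// Error satisfies yyLexer. Panicking here unwinds yyParse; the panic
// is recovered in parse above and returned as an error.
func (l *golex) Error(message string) {
    panic(errors.New(message))
}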

Since yyLexer is an interface in goyacc, there is a type assertion involved in populating stmts (note that yylex is a magic variable in yacc that refers to the lexer object you provided):

Line : Stmt NEWLINE             { yylex.(*golex).stmts = append(yylex.(*golex).stmts, $1.ast) }
     | NEWLINE
     ;
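
Putting the pieces together, a hypothetical main might look like this; newLexer is a stand-in for whatever constructs the lexmachine.Lexer in sensors_golex.go and is not a name from the example.

import (
    "fmt"
    "log"
    "os"
)

func main() {
    lexer, err := newLexer() // hypothetical constructor for the lexer
    if err != nil {
        log.Fatal(err)
    }
    stmts, err := parse(lexer, os.Stdin)
    if err != nil {
        log.Fatal(err)
    }
    for _, stmt := range stmts {
        fmt.Printf("%v\n", stmt)
    }
}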

Documentation

Overview

Package golex implements the same lexer as examples/sensors. However, it shows how to conform to goyacc's expected interface:

type yyLexer interface {
    Lex(lval *yySymType) (tokenType int)
    Error(message string)
}

You define yySymType. The yyLexer type is defined by the generated code from goyacc. The tokenType is the token identifier. The expectation is that the token IDs are shared between what is defined in this package and the parser definition in sensors.y.

To generate the parser (and make this all work) run:

go generate github.com/timtadh/lexmachine/examples/sensors-parser
