Documentation ¶
Overview ¶
Package lex provides support for a *nix (f)lex-like tool operating on .l sources. The syntax is similar to a subset of (f)lex; see also: http://flex.sourceforge.net/manual/Format.html#Format
Changelog ¶
2021-05-28: Removed global state, NewL can now be called multiple times.
2014-11-18: Add option for marking an accepting state. Required to support POSIX longest match.
Some feature examples:
    /* Unindented multiline Go comments in the definitions section */

        Any indented text in the definitions section

    %{
    Any text in the definitions section within %{ and %}
    %}

    D   [0-9]

    %s non-exclusive-start-condition s2 s3
    %x exclusive-start-condition e2

    %yyt getTopState() // not required when only INITIAL start condition exists
    %yyb last == '\n' || last == '\0'
    %yyc getCurrentChar()
    %yyn move() // get next character
    %yym mark() // now in accepting state

    %%
        Indented text before the first rule is presumably treated specially (renderer specific)

    {D}+    return(INT)

    {D}+\.{D}+  return(FLOAT)

    [a-z][a-z0-9]+
        /* identifier found */
        return(IDENT)

    A"[foo]\"bar"Z  println(`A[foo]"barZ`)

    ^bol|eol$

    <non-exclusive-start-condition>foo
    %{
    println("foo found")
    %}

    <s2,s3>bar

    <INITIAL,e2>abc

    <*>"always" println("active in all start conditions")

    %%
    The optional user code section. Possibly the place where a lexeme recognition failure will be handled (renderer specific).
Missing or differing functionality of the .l parser/FSM generator, compared to flex:
- Trailing context (re1/re2) is not supported.
- An action is not required to start on the same line as its pattern.
- Actions enclosed in braces get no special processing. This package mostly treats any non-blank text following a pattern, up to the next pattern, as action source code.
- No flex %-prefixed options are supported except %s and %x.
- The %yy* options are incompatible with flex.
- Character class expressions (cclasses, e.g. [[:digit:]]) are not supported.
- Nothing special is recognized after '(?'.
- <<EOF>> cannot be matched. \0 is still OK in a pattern.
- And probably more.
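Because cclasses are not supported, a pattern written for flex has to spell its character ranges out explicitly. The following is a minimal sketch of one way to do that ahead of time. The helper expandClasses and its table are hypothetical and not part of this package, and the mapping assumes ASCII input only.

```go
package main

import (
	"fmt"
	"strings"
)

// posixClasses maps a few POSIX class names to explicit ASCII ranges.
// Hypothetical table for illustration; extend it as needed.
var posixClasses = strings.NewReplacer(
	"[:digit:]", "0-9",
	"[:lower:]", "a-z",
	"[:upper:]", "A-Z",
	"[:alpha:]", "a-zA-Z",
	"[:alnum:]", "a-zA-Z0-9",
)

// expandClasses rewrites the class names in pattern to explicit ranges.
// It is a purely textual replacement, so it would also rewrite a class
// name appearing inside a quoted string.
func expandClasses(pattern string) string {
	return posixClasses.Replace(pattern)
}

func main() {
	fmt.Println(expandClasses("[[:alpha:]_][[:alnum:]_]*"))
}
```

The sketch trades precision for brevity: a real preprocessor would only rewrite class names found inside a bracket expression.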
Types ¶
type L ¶
type L struct {
	// Source code lines for rendering from the definitions section
	DefCode []string
	// Names of declared start conditions with their respective numeric
	// identifiers
	StartConditions map[string]int
	// Start conditions numeric identifiers with their respective DFA
	// start state
	StartConditionsStates map[int]*lexer.NfaState
	// Beginning of line start conditions numeric identifiers with their
	// respective DFA start state
	StartConditionsBolStates map[int]*lexer.NfaState
	// Rule[0] is a pseudo rule. Its action contains the source code for
	// rendering from the rules section before the first rule
	Rules []Rule
	// The generated FSM
	Dfa lexer.Nfa
	// Accept states with their respective rule index
	Accepts map[*lexer.NfaState]int
	// Source code for rendering from the user code section
	UserCode string
	// Source code for rendering of get_current_start_condition. Set by
	// %yyt.
	YYT string
	// Source code for rendering of get_bol, i.e. whether we are at the
	// beginning of line right now. Set by %yyb.
	YYB string
	// Source code for rendering of get_peek_char, i.e. the char the lexer
	// will now consider when making a decision. Set by %yyc.
	YYC string
	// Source code for rendering of move_to_next_char, i.e. "consume" the
	// current peek char and go to the next one. Set by %yyn.
	YYN string
	// Source code for rendering of mark_accepting, which supports accepting
	// the longest match while reusing the "overflowed" input. Set by %yym.
	YYM string
}
L represents selected data structures found in / generated from a .l source. A [command line] tool using this package may then render L to some programming language source code and/or data table(s).
func NewL ¶
NewL parses a .l source named fname, read from src, and returns an L or an error, if any. The unoptdfa argument allows disabling optimization of the produced DFA. The mode32 parameter is not yet supported and must be false.
type Rule ¶
type Rule struct {
	Conds   []string // Start conditions of the rule
	Pattern string   // Original rule's pattern
	BOL     bool     // Pattern starts with beginning of line assertion (^)
	EOL     bool     // Pattern ends with end of line ($) assertion
	RE      string   // Pattern translated to a regular expression
	Action  string   // Rule's associated action source code
}
Rule represents data for a pattern/action pair.