mexpr

package module

v1.9.1 Latest Latest Go to latest Published: Dec 13, 2024 License: MIT Imports: 9 Imported by: 6

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/danielgtaylor/mexpr

Links

Open Source Insights

README ¶

MicroExpr

GitHub tag (latest by date)

A small & fast dependency-free library for parsing micro expressions.

This library was originally built for use in templating languages (e.g. for-loop variable selection, if-statement evaluation) so is minimal in what it supports by design. If you need a more full-featured expression parser, check out antonmedv/expr instead.

Features:

Fast, low-allocation parser and runtime
- Many simple expressions are zero-allocation
Type checking during parsing
Simple
- Easy to learn
- Easy to read
- No hiding complex branching logic in expressions
Intuitive, e.g. "id" + 1 => "id1"

Useful error messages, example:

missing right operand
  not (1- <= 5)
  ......^

Fuzz tested to prevent crashes

Usage

Try it out on the Go Playground! You can find many example expressions in the tests.

import "github.com/danielgtaylor/mexpr"

// Convenience for lexing/parsing/running in one step:
result, err := mexpr.Eval("a > b", map[string]interface{}{
	"a": 2,
	"b": 1,
})

// Manual method with type checking and fast AST re-use. Error handling is
// omitted for brevity.
l := mexpr.NewLexer("a > b")
p := mexpr.NewParser(l)
ast, err := mexpr.Parse()
typeExamples = map[string]interface{}{
	"a": 2,
	"b": 1,
}
err := mexpr.TypeCheck(ast, typeExamples)
interpreter := mexpr.NewInterpreter(ast)
result1, err := interpreter.Run(map[string]interface{}{
	"a": 1,
	"b": 2,
})
result2, err := interpreter.Run(map[string]interfae{}{
	"a": 150,
	"b": 30,
})

Pretty errors use the passed-in input along with the error's offset to display an arrow of where within the expression the error occurs.

inputStr := "2 * foo"
_, err := mexpr.Eval(inputStr, nil)
if err != nil {
	fmt.Println(err.Pretty(inputStr))
}

Options

When running the interpreter a set of options can be passed in to change behavior. Available options:

Option	Default	Description
`StrictMode`	`false`	Be more strict, for example return an error when an identifier is not found rather than `nil`
`UnquotedStrings`	`false`	Enable the use of unquoted strings, i.e. return a string instead of `nil` for undefined parameters

// Using the top-level eval
mexpr.Eval(expression, inputObj, StrictMode)

// Using an interpreter instance
interpreter.Run(inputObj, StrictMode)

Syntax

Literals

strings double quoted e.g. "hello"
numbers e.g. 123, 2.5, 1_000_000

Internally all numbers are treated as float64, which means fewer conversions/casts when taking arbitrary JSON/YAML inputs.

Accessing properties

Use . between property names
Use [ and ] for indexes, which can be negative

foo.bar[0].value

Arithmetic operators

+ (addition)
- (subtration)
* (multiplication)
/ (division)
% (modulus)
^ (power)

(1 + 2) * 3^2

Math operations between constants are precomputed when possible, so it is efficient to write meaningful operations like size <= 4 * 1024 * 1024. The interpreter will see this as size <= 4194304.

Comparison operators

== (equal)
!= (not equal)
< (less than)
> (greater than)
<= (less than or equal to)
>= (greater than or equal to)

100 >= 42

Logical operators

not (negation)
and
or

1 < 2 and 3 < 4

Non-boolean values are converted to booleans. The following result in true:

numbers greater than zero
non-empty string
array with at least one item
map with at least one key/value pair

String operators

Indexing, e.g. foo[0]
Slicing, e.g. foo[1:2] or foo[2:]
.length pseudo-property, e.g. foo.length
.lower pseudo-property for lowercase, e.g. foo.lower
.upper pseudo-property for uppercase, e.g. foo.upper
+ (concatenation)
in e.g. "f" in "foo"
contains e.g. "foo" contains "f"
startsWith e.g. "foo" startsWith "f"
endsWith e.g. "foo" endsWith "o"

Indexes are zero-based. Slice indexes are optional and are inclusive. foo[1:2] returns el if the foo is hello. Indexes can be negative, e.g. foo[-1] selects the last item in the array.

Any value concatenated with a string will result in a string. For example "id" + 1 will result in "id1".

There is no distinction between strings, bytes, or runes. Everything is treated as a string.

Date Comparisons

String dates & times can be compared if they follow RFC 3339 / ISO 8601 with or without timezones.

before, e.g. start before "2020-01-01"
after, e.g. created after "2020-01-01T12:00:00Z"

Array/slice operators

Indexing, e.g. foo[1]
Slicing, e.g. foo[1:2] or foo[2:]
.length pseudo-property, e.g. foo.length
+ (concatenation)
in (has item), e.g. 1 in foo
contains e.g. foo contains 1

Indexes are zero-based. Slice indexes are optional and are inclusive. foo[1:2] returns [2, 3] if the foo is [1, 2, 3, 4]. Indexes can be negative, e.g. foo[-1] selects the last item in the array.

Array/slice filtering

A where clause can be used to filter the items in an array. The left side of the clause is the array to be filtered, while the right side is an expression to run on each item of the array. If the right side expression evaluates to true then the item is added to the result slice. For example:

// Get a list of items where the item.id is bigger than 3
items where id > 3

// More complex example
items where (id > 3 and labels contains "best")

This also makes it possible to implement one/any/all/none logic:

// One
(items where id > 3).length == 1

// Any
items where id > 3
(items where id > 3).length > 0

// All
(items where id > 3).length == items.length

// None
not (items where id > 3)
(items where id > 3).length == 0

Map operators

Accessing values, e.g. foo.bar.baz
in (has key), e.g. "key" in foo
contains e.g. foo contains "key"

Map wildcard filtering

A where clause can be used as a wildcard key to filter values for all keys in a map. The left side of the clause is the map to be filtered, while the right side is an expression to run on each value of the map. If the right side expression evaluates to true then the value is added to the result slice. For example, given:

{
  "operations": {
    "id1": { "method": "GET", "path": "/op1" },
    "id2": { "method": "PUT", "path": "/op2" },
    "id3": { "method": "DELETE", "path": "/op3" }
  }
}

You can run:

// Get all operations where the HTTP method is GET
operations where method == "GET"

And the result would be a slice of matched values:

[{ "method": "GET", "path": "/op1" }]

Performance

Performance compares favorably to antonmedv/expr for both Eval(...) and cached program performance, which is expected given the more limited feature set. The slow benchmarks include lexing/parsing/interpreting while the cached ones are just the interpreting step. The complex example expression used is non-trivial: foo.bar / (1 * 1024 * 1024) >= 1.0 and "v" in baz and baz.length > 3 and arr[2:].length == 1.

goos: darwin
goarch: amd64
pkg: github.com/danielgtaylor/mexpr
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
Benchmark/mexpr-field-slow-12           3673572       286.5 ns/op    144 B/op      6 allocs/op
Benchmark/_expr-field-slow-12            956689      1276 ns/op     1096 B/op     23 allocs/op

Benchmark/mexpr-comparison-slow-12      1000000      1020 ns/op      656 B/op     16 allocs/op
Benchmark/_expr-comparison-slow-12       383491      3069 ns/op     2224 B/op     38 allocs/op

Benchmark/mexpr-logical-slow-12         1000000      1063 ns/op      464 B/op     17 allocs/op
Benchmark/_expr-logical-slow-12          292824      4148 ns/op     2336 B/op     38 allocs/op

Benchmark/mexpr-math-slow-12            1000000      1035 ns/op      656 B/op     16 allocs/op
Benchmark/_expr-math-slow-12             399708      3004 ns/op     2184 B/op     38 allocs/op

Benchmark/mexpr-string-slow-12          1822945       655.6 ns/op    258 B/op     10 allocs/op
Benchmark/_expr-string-slow-12           428604      2508 ns/op     1640 B/op     35 allocs/op

Benchmark/mexpr-index-slow-12           2015856       592.0 ns/op    280 B/op     10 allocs/op
Benchmark/_expr-index-slow-12            517360      2301 ns/op     1872 B/op     30 allocs/op

Benchmark/mexpr-complex-slow-12          244039      5078 ns/op     2232 B/op     64 allocs/op
Benchmark/_expr-complex-slow-12           69387     16825 ns/op    14378 B/op    107 allocs/op

Benchmark/mexpr-field-cached-12       100000000        11.37 ns/op     0 B/op      0 allocs/op
Benchmark/_expr-field-cached-12         7761153       146.5 ns/op     48 B/op      2 allocs/op

Benchmark/mexpr-comparison-cached-12   38098502        30.93 ns/op     0 B/op      0 allocs/op
Benchmark/_expr-comparison-cached-12    4563463       251.0 ns/op     64 B/op      3 allocs/op

Benchmark/mexpr-logical-cached-12      37563720        31.35 ns/op     0 B/op      0 allocs/op
Benchmark/_expr-logical-cached-12      11000991       105.9 ns/op     32 B/op      1 allocs/op

Benchmark/mexpr-math-cached-12         24463279        47.41 ns/op     8 B/op      1 allocs/op
Benchmark/_expr-math-cached-12          4531693       268.0 ns/op     72 B/op      4 allocs/op

Benchmark/mexpr-string-cached-12       43399368        26.83 ns/op     0 B/op      0 allocs/op
Benchmark/_expr-string-cached-12        7302940       162.0 ns/op     48 B/op      2 allocs/op

Benchmark/mexpr-index-cached-12        45289230        25.67 ns/op     0 B/op      0 allocs/op
Benchmark/_expr-index-cached-12         6057562       180.0 ns/op     48 B/op      2 allocs/op

Benchmark/mexpr-complex-cached-12       4271955       278.7 ns/op     40 B/op      3 allocs/op
Benchmark/_expr-complex-cached-12       1456266       818.7 ns/op    208 B/op      9 allocs/op

On average mexpr is around 3-10x faster for both full parsing and cached performance.

References

These were a big help in understanding how Pratt parsers work:

Documentation ¶

Overview ¶

Package mexpr provides a simple expression parser.

Index ¶

func Parse(expression string, types any, options ...InterpreterOption) (*Node, Error)
type Error
type Interpreter
- func NewInterpreter(ast *Node, options ...InterpreterOption) Interpreter
type InterpreterOption
type Lexer
- func NewLexer(expression string) Lexer
type Node
- func (n Node) Dot(prefix string) string
- func (n Node) String() string
type NodeType
type Parser
- func NewParser(lexer Lexer) Parser
type Token
- func (t *Token) String() string
type TokenType
- func (t TokenType) String() string
type TypeChecker
- func NewTypeChecker(ast *Node, options ...InterpreterOption) TypeChecker

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func Parse ¶

func Parse(expression string, types any, options ...InterpreterOption) (*Node, Error)

Parse an expression and return the abstract syntax tree. If `types` is passed, it should be a set of representative example values for the input which will be used to type check the expression against.

Types ¶

type Error ¶

type Error interface {
	Error() string

	// Offset returns the character offset of the error within the experssion.
	Offset() uint16

	// Length returns the length in bytes after the offset where the error ends.
	Length() uint8

	// Pretty prints out a message with a pointer to the source location of the
	// error.
	Pretty(source string) string
}

Error represents an error at a specific location.

func Eval ¶

func Eval(expression string, input any, options ...InterpreterOption) (any, Error)

Eval is a convenience function which lexes, parses, and executes an expression with the given input. If you plan to execute the expression multiple times consider caching the output of `Parse(...)` instead for a big speed improvement.

func NewError ¶

func NewError(offset uint16, length uint8, format string, a ...interface{}) Error

NewError creates a new error at a specific location.

func Run ¶

func Run(ast *Node, input any, options ...InterpreterOption) (any, Error)

Run executes an AST with the given input and returns the output.

func TypeCheck ¶

func TypeCheck(ast *Node, types any, options ...InterpreterOption) Error

TypeCheck will take a parsed AST and type check against the given input structure with representative example values.

type Interpreter ¶

type Interpreter interface {
	Run(value any) (any, Error)
}

Interpreter executes expression AST programs.

func NewInterpreter ¶

func NewInterpreter(ast *Node, options ...InterpreterOption) Interpreter

NewInterpreter returns an interpreter for the given AST.

type InterpreterOption ¶

type InterpreterOption int

InterpreterOption passes configuration settings when creating a new interpreter instance.

const (
	// StrictMode does extra checks like making sure identifiers exist.
	StrictMode InterpreterOption = iota

	// UnqoutedStrings enables the use of unquoted string values rather than
	// returning nil or a missing identifier error. Identifiers get priority
	// over unquoted strings.
	UnquotedStrings
)

type Lexer ¶

type Lexer interface {
	// Next returns the next token from the expression. The returned token may
	// be changed in-place on subsequent calls and should not be stored.
	Next() (*Token, Error)
}

Lexer returns tokens from an input expression.

func NewLexer ¶

func NewLexer(expression string) Lexer

NewLexer creates a new lexer for the given expression.

type Node ¶

type Node struct {
	Type   NodeType
	Length uint8
	Offset uint16
	Left   *Node
	Right  *Node
	Value  interface{}
}

Node is a unit of the binary tree that makes up the abstract syntax tree.

func (Node) Dot ¶ added in v1.5.0

func (n Node) Dot(prefix string) string

Dot returns a graphviz-compatible dot output, which can be used to render the parse tree at e.g. https://dreampuf.github.io/GraphvizOnline/ or locally. You must wrap the output with `graph G {` and `}`.

func (Node) String ¶ added in v1.5.0

func (n Node) String() string

String converts the node to a string representation (basically the node name or the node's value for identifiers/literals).

type NodeType ¶

type NodeType uint8

NodeType defines the type of the abstract syntax tree node.

const (
	NodeUnknown NodeType = iota
	NodeIdentifier
	NodeLiteral
	NodeAdd
	NodeSubtract
	NodeMultiply
	NodeDivide
	NodeModulus
	NodePower
	NodeEqual
	NodeNotEqual
	NodeLessThan
	NodeLessThanEqual
	NodeGreaterThan
	NodeGreaterThanEqual
	NodeAnd
	NodeOr
	NodeNot
	NodeFieldSelect
	NodeArrayIndex
	NodeSlice
	NodeSign
	NodeIn
	NodeContains
	NodeStartsWith
	NodeEndsWith
	NodeBefore
	NodeAfter
	NodeWhere
)

Possible node types

type Parser ¶

type Parser interface {
	// Parse the expression and return the root node.
	Parse() (*Node, Error)
}

Parser takes a lexer and parses its tokens into an abstract syntax tree.

func NewParser ¶

func NewParser(lexer Lexer) Parser

NewParser creates a new parser that uses the given lexer to get and process tokens into an abstract syntax tree.

type Token ¶

type Token struct {
	Type   TokenType
	Length uint8
	Offset uint16
	Value  string
}

Token describes a single token produced by the lexer.

func (*Token) String ¶

func (t *Token) String() string

type TokenType ¶

type TokenType uint8

TokenType defines the type of token produced by the lexer.

const (
	TokenUnknown TokenType = iota
	TokenIdentifier
	TokenDot
	TokenNumber
	TokenString
	TokenLeftParen
	TokenRightParen
	TokenLeftBracket
	TokenRightBracket
	TokenSlice
	TokenAddSub
	TokenMulDiv
	TokenPower
	TokenComparison
	TokenAnd
	TokenOr
	TokenNot
	TokenStringCompare
	TokenWhere
	TokenEOF
)

Token

func (TokenType) String ¶ added in v1.5.0

func (t TokenType) String() string

type TypeChecker ¶ added in v1.2.0

type TypeChecker interface {
	Run(value any) Error
}

TypeChecker checks to ensure types used for operations will work.

func NewTypeChecker ¶ added in v1.2.0

func NewTypeChecker(ast *Node, options ...InterpreterOption) TypeChecker

NewTypeChecker returns a type checker for the given AST.

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL