compiler

package
v0.1.3
Warning

This package is not in the latest version of its module.

Published: Jan 21, 2023 License: MIT Imports: 21 Imported by: 0

README

compiler

The compiler package is used to compile text in the Ego language into bytecode that can be executed using the bytecode package. This allows for compiled scripts to be integrated into the application, and run repeatedly without incurring the overhead of parsing and semantic analysis each time.

The Ego language is loosely based on Go but with some important differences. Some important attributes of Ego programs are:

  • There are no pointer types, and no dynamic memory allocation.
  • All objects are passed by value in function calls.
  • Variables are untyped, but can be cast explicitly or will be type converted automatically when possible.

The program stream executes at the topmost scope. You can define one or more functions in that topmost scope, or execute commands directly. Each function runs in its own scope; it can access variables from outer scopes but cannot set them. Functions defined within another function only exist as long as that function is running.

Example

Here is a trivial example of compiling and running some Ego code in your Go program.

// String containing arbitrary _Ego_ statements.
src := "..."

bc, err := compiler.New("test program").CompileString("test program", src)
if err != nil {
    // Handle compile-time errors
}

syms := symbols.NewSymbolTable("test program")
ctx := bytecode.NewContext(syms, bc)
err = ctx.Run()
if err != nil {
    // Handle run-time errors
}

The general pattern is to pass a string containing the program text to the compiler. The compiler generates a bytecode object containing the pseudocode for the program and any predefined symbols (constants or functions) from the compilation.

The caller then creates a new symbol table (or can re-use an existing one if symbols are meant to be persistent between compilation units). A new runtime context is created (which contains the program counter, stack, error handling stack, etc.) and uses the existing symbol table and bytecode. This allows bytecode to be persisted or re-used and executed multiple times on multiple threads, each with its own context.

Finally, the context is run, which executes the bytecode instructions. If the instructions are meant to return a value, that value is left on the stack for the context, and you can use ctx.Pop() to remove items from the stack. The return values are opaque interface{} objects, and you can use the util.Get*() functions to extract the integer, float64, string, or bool object.
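The compile-once, run-many pattern described above can be sketched without the Ego packages. The toy instruction set and the op, context, newContext, and runConcurrently names below are hypothetical and exist only for illustration; they are not the bytecode package's API. The point is that the program is shared read-only while each run owns its stack.

```go
package main

import (
	"fmt"
	"sync"
)

// op is one instruction of a toy bytecode; it is not Ego's instruction set.
type op struct {
	code string // "push", "add", or "mul"
	arg  int    // operand, used only by "push"
}

// context holds per-run state (here just a stack), so many contexts can
// share one program.
type context struct {
	program []op
	stack   []int
}

func newContext(program []op) *context { return &context{program: program} }

// run executes every instruction, leaving any result on the stack.
func (c *context) run() error {
	for _, o := range c.program {
		switch o.code {
		case "push":
			c.stack = append(c.stack, o.arg)
		case "add", "mul":
			if len(c.stack) < 2 {
				return fmt.Errorf("stack underflow at %q", o.code)
			}
			a, b := c.stack[len(c.stack)-2], c.stack[len(c.stack)-1]
			c.stack = c.stack[:len(c.stack)-2]
			if o.code == "add" {
				c.stack = append(c.stack, a+b)
			} else {
				c.stack = append(c.stack, a*b)
			}
		default:
			return fmt.Errorf("unknown op %q", o.code)
		}
	}
	return nil
}

// pop removes and returns the top of the stack.
func (c *context) pop() (int, bool) {
	if len(c.stack) == 0 {
		return 0, false
	}
	v := c.stack[len(c.stack)-1]
	c.stack = c.stack[:len(c.stack)-1]
	return v, true
}

// runConcurrently runs the same program n times, one context per goroutine.
func runConcurrently(program []op, n int) []int {
	results := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			ctx := newContext(program)
			if err := ctx.run(); err == nil {
				results[i], _ = ctx.pop()
			}
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	// (2 + 3) * 4
	program := []op{{"push", 2}, {"push", 3}, {"add", 0}, {"push", 4}, {"mul", 0}}
	fmt.Println(runConcurrently(program, 3)) // [20 20 20]
}
```

Because each goroutine builds its own context, no locking is needed around the stack; only the results slice is shared, and each goroutine writes a distinct index.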

Data types

Ego supports six data types, plus limited support for function pointers as values.

type       description
int        64-bit integer value
float64    64-bit floating point value
string     Unicode string
bool       Boolean value of true or false
[...]      Array of values
{...}      Structure of fields

An array is a list of zero or more values. The array values can be of any type, including other arrays. The array elements are always indexed starting at a value of zero. You can also reference a range (slice) of an array by using the notation a[b:e] which returns an array containing elements from b to e from array a. You cannot reference a subscript that has not been allocated. You can use the array statement to initialize an array, and the array() function to change the size of an existing array.
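For comparison, here is how the same a[b:e] notation behaves in Go itself, where the result includes index b and excludes index e. Whether Ego treats the end bound the same way is not stated here, so take this only as a sketch of the Go behavior; sliceRange is a hypothetical helper.

```go
package main

import "fmt"

// sliceRange returns the Go slice a[b:e], or an error if the bounds are
// invalid (Go itself would panic on an out-of-range slice expression).
func sliceRange(a []string, b, e int) ([]string, error) {
	if b < 0 || e > len(a) || b > e {
		return nil, fmt.Errorf("range [%d:%d] out of bounds for length %d", b, e, len(a))
	}
	return a[b:e], nil
}

func main() {
	names := []string{"Tim", "Sue", "Bob", "Robin"}
	part, err := sliceRange(names, 1, 3)
	fmt.Println(part, err) // [Sue Bob] <nil>
}
```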

You can also define an array constant, and assign it to a value, where the initial array is defined by the constant.

names := [ "Tim", "Sue", "Bob", "Robin" ]

This results in names being assigned an array containing the four string values given.

Structures contain zero or more labeled fields. Each field label must be unique, and can reference a value of any type. You can create a struct using a literal:

employee := { name: "Susan", rate: 23.50, active:true }

This creates a structure with three members (name, rate, and active). Once this is stored in a variable, you can use dot-notation to reference a field directly,

pay := 40 * employee.rate

If the member does not exist, an error is generated. You can however add new fields to a structure simply by naming them in an assignment, such as

employee.weekly = pay

If there isn't already a field named weekly in the structure, it is created automatically. The field is then set to the value of pay.

Note that structures and arrays are always passed around by reference. That is, if you create a structure a, assign it to b, and then change a field in b, it will also be changed in a. To make an exact duplicate of an existing structure, use the new() function which creates a new instance of the existing type. For example,

 a := { age: 55, name: "Timmy"}
 b := a
 b.age = 4
 fmt.Println( a.age )   // Prints the value 4

 c := { age: 55, name: "Timmy"}
 d := new(c)
 d.age = 4
 fmt.Println( c.age )    // Prints the value 55
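The aliasing behavior above resembles sharing a pointer in Go, and new() resembles copying the value the pointer refers to. The sketch below uses a hypothetical person type to show the two cases side by side; it illustrates the idea only and says nothing about how Ego implements it.

```go
package main

import "fmt"

type person struct {
	age  int
	name string
}

// aliasThenModify mimics `b := a; b.age = 4`: both names share one object.
func aliasThenModify(a *person) *person {
	b := a
	b.age = 4
	return b
}

// copyThenModify mimics `d := new(c); d.age = 4`: d is an independent duplicate.
func copyThenModify(c *person) *person {
	d := *c // copy the struct value
	d.age = 4
	return &d
}

func main() {
	a := &person{age: 55, name: "Timmy"}
	aliasThenModify(a)
	fmt.Println(a.age) // 4

	c := &person{age: 55, name: "Timmy"}
	copyThenModify(c)
	fmt.Println(c.age) // 55
}
```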

Scope

The main symbol table is created from the console input (when ego is run with no arguments or parameters) or from the source file named when ego run is used. It is available to any statement entered by the user or contained in the source file.

The first time a symbol is created, you must use the := notation to create a variable as well as set its value. All subsequent sets of that variable should use the = notation to store in an existing symbol.

Whenever a { } block is used, a new symbol table is created for use during that block. Any symbols created within the block are deleted when the block exits. The block can of course reference symbols in outer blocks using standard "=" notation. For example,

x := 55
{
    y := 66
    x = 42
}

After this code runs, the value of x will be 42 (because it was changed within the block) and the symbol y will not be defined (because it went out of scope at the end of the block).
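Go applies the same block rules, so the snippet can be checked directly in Go. The second function shows what happens in Go when := is used inside the block instead of =; whether Ego shadows in the same way is an assumption not confirmed by this text.

```go
package main

import "fmt"

// blockScope mirrors the snippet above: x is updated inside the block,
// and y exists only inside the block.
func blockScope() int {
	x := 55
	{
		y := 66
		_ = y // y is not visible after the closing brace
		x = 42
	}
	return x
}

// shadowed uses := inside the block, which declares a new inner x in Go.
func shadowed() int {
	x := 55
	{
		x := 42 // a different variable; the outer x is untouched
		_ = x
	}
	return x
}

func main() {
	fmt.Println(blockScope(), shadowed()) // 42 55
}
```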

array

The array statement is used to allocate an array. An array can also be created as an array constant and stored in a variable. The array statement identifies the name of the array and the size, and optionally an initial value for each member of the array.

array x[5]
array y[2] = 10

The first example creates an array of 5 elements, but the elements are <nil> which means they do not have a usable value yet. The array elements must have a value stored in them before they can be used in an expression. The second example assigns an initial value to each element of the array, so the second statement is really identical to y := [10,10].
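A rough Go analogue of the two forms, assuming untyped elements are modeled as interface{} values. The newArray helper is hypothetical and only illustrates the difference between unset (nil) elements and elements filled with an initial value.

```go
package main

import "fmt"

// newArray returns n elements; with no initial value the elements are nil,
// otherwise every element is set to that value.
func newArray(n int, initial ...interface{}) []interface{} {
	a := make([]interface{}, n)
	if len(initial) > 0 {
		for i := range a {
			a[i] = initial[0]
		}
	}
	return a
}

func main() {
	x := newArray(5)     // like: array x[5]
	y := newArray(2, 10) // like: array y[2] = 10
	fmt.Println(x, y)    // [<nil> <nil> <nil> <nil> <nil>] [10 10]
}
```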

const

The const statement can define constant values in the current scope. These values are always readonly values and you cannot use a constant name as a variable name. You can specify a single constant or a group of them; to specify more than one in a single statement enclose the list in parentheses:

const answer = 42

const (
    first = "a"
    last = "z"
)

Together, these statements define three constant values. Note that the value is set using an = character since a symbol is not actually being created.

if

The if statement provides conditional execution. The statement must start with an expression which can be cast as a boolean value. That value is tested; if it is true then the following statement (or statement block) is executed. By convention, even if the conditional code is a single statement, it is enclosed in a statement block. For example,

if age >= 50 {
    call aarp(name)
}

This tests the variable age to determine if it is greater than or equal to the integer value 50, and if so, it calls the function named aarp with the value of the name symbol.

You can optionally include an "else" clause to execute if the condition is false, as in

if flag == "-d" {
    call debug()
} else {
    call regular()
}

If the value of flag does not equal the string "-d" then the code will call the function regular() instead of debug().

func

The func statement defines a function. This must have a name which is a valid symbol, followed by an argument list.

The argument list is a list of names which become local variables in the running function, set to the value of the arguments from the caller. After each argument name in the func statement, you can specify a type of int, string, float64, bool, struct, or array, in which case the value is coerced to that type regardless of the value passed into the function.

After the (possibly empty) argument list you must specify the type of the function's return value. This can be one of the base types (int, float64, string, or bool). It can also be [] which denotes a return of an array type, or {} which denotes the return of a struct type. Finally, the type can be any which means any type can be returned, or void which means no value is returned from this function (it is intended to be invoked as a call statement).

The type declaration is then followed by a statement or block defining the code to execute when the function is used in an expression or in a call statement. For example,

func double(x) float64 {
    return x * 2
}

This accepts a single value, named x when the function is running. The function returns that value multiplied by 2. The type of the result is coerced to be a float64 value. Note that the braces are not required in the above example since the function consists of a single return statement, but by convention braces are always used to indicate the body of the function. The function just created can then be used in an expression, such as:

fun := 2
moreFun := double(fun)

After this code executes, moreFun will contain the value 4.0 as a float64 value.
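In Go terms, a function with an untyped argument and a coerced float64 result might look like the sketch below. The toFloat64 helper and the set of types it accepts are assumptions made for illustration; the conversions Ego actually performs are not listed here.

```go
package main

import (
	"fmt"
	"strconv"
)

// toFloat64 coerces a few common dynamic values to float64.
func toFloat64(v interface{}) (float64, error) {
	switch x := v.(type) {
	case int:
		return float64(x), nil
	case float64:
		return x, nil
	case string:
		return strconv.ParseFloat(x, 64)
	default:
		return 0, fmt.Errorf("cannot coerce %T to float64", v)
	}
}

// double mirrors `func double(x) float64 { return x * 2 }`.
func double(x interface{}) (float64, error) {
	f, err := toFloat64(x)
	if err != nil {
		return 0, err
	}
	return f * 2, nil
}

func main() {
	moreFun, err := double(2)
	fmt.Printf("%.1f %v\n", moreFun, err) // 4.0 <nil>
}
```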

return

The return statement contains an expression that is identified as the result of the function value. The generated code adds the value to the runtime stack, and then exits the function. The caller can then retrieve the value from the stack to use in an expression or statement.

return salary/12.0

This statement returns the value of the expression salary/12.0 as the result of the function.

If you use the return statement with no value, then the function simply stops without leaving a value on the arithmetic stack. This is the appropriate behavior for a function that is meant to be invoked with a call statement.

for

The for statement defines a looping construct. A single statement or a statement block is executed based on the definition of the loop. There are two kinds of loops.

x := [101, 232,363]
for n:=0; n < len(x); n = n + 1 {
    fmt.Printf("element %d is %d\n", n, x[n])
}

This example creates an array, and then uses a loop to read all the values of the array. The for statement is followed by three clauses, separated by semicolon (;) characters. The first clause must be a valid assignment that initializes the loop value. The second clause is a condition which is tested at the start of each loop; when the condition results in a false value, the loop stops executing. The third clause must be a statement that updates the loop value. This is followed by a block containing the statement(s) to execute each time through the loop.

When using a loop to index over an array, you can use a shorthand version of this.

x := [ 101, 232, 363 ]
for n := range x {
    fmt.Println( "The value is ", n)
}

In this example, the value of n will take on each element of the array in turn as the body of the loop executes. You can have the range option give you both the index number and the value.

x := [ 101, 232, 363 ]
for i, n := range x {
    fmt.Println( "Element ", i, " is ", n )
}

Here, the array index is stored in i and the value of the array element is stored in n. This is semantically identical to the following more explicit loop structure:

for i := 0; i < len(x); i = i + 1 {
    n := x[i]
    fmt.Println( "Element ", i, " is ", n )
}
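The equivalence between the two-variable range form and an explicit zero-based index loop can be checked in Go, which uses the same syntax for both. One difference to keep in mind: in Go a single-variable `for n := range x` yields the index, whereas the text above describes Ego yielding the element. The function names are hypothetical.

```go
package main

import "fmt"

// viaRange formats each element using the range form of the loop.
func viaRange(x []int) []string {
	var out []string
	for i, n := range x {
		out = append(out, fmt.Sprintf("Element %d is %d", i, n))
	}
	return out
}

// viaIndex does the same with an explicit zero-based index loop.
func viaIndex(x []int) []string {
	var out []string
	for i := 0; i < len(x); i = i + 1 {
		n := x[i]
		out = append(out, fmt.Sprintf("Element %d is %d", i, n))
	}
	return out
}

func main() {
	x := []int{101, 232, 363}
	fmt.Println(viaRange(x))
	fmt.Println(viaIndex(x))
}
```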

break

The break statement exits from the currently running loop, as if the loop had terminated normally.

for i := 0; i < 10; i = i + 1 {
    if i == 5 {
        break
    }
    fmt.Println( i )
}

This loop will print only five times, producing the values 0..4. On the next iteration, because the index i is equal to 5, the loop is terminated. Note that a break will only exit the current loop; if there are nested loops the break only exits the loop in which it occurred and all outer loops continue to run.

The break statement cannot be used outside of a for loop.

continue

The continue statement exits from the current iteration of the loop, as if the loop had restarted with the next iteration.

for i := 0; i < 10; i = i + 1 {
    if i == 5 {
        continue
    }
    fmt.Println( i )
}

This loop will run all ten times, but will only output the values 0..4 and 6..9. When the index i is equal to 5, the loop starts again at the top of the loop with the next index value.

The continue statement cannot be used outside of a for loop.
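Both statements behave the same way in Go, so the two loops can be compared directly. The sketch below collects the values that would be printed rather than printing them; untilBreak and skipping are hypothetical names.

```go
package main

import "fmt"

// untilBreak collects loop indexes until i == stop, then leaves the loop.
func untilBreak(limit, stop int) []int {
	var seen []int
	for i := 0; i < limit; i = i + 1 {
		if i == stop {
			break
		}
		seen = append(seen, i)
	}
	return seen
}

// skipping collects every index except skip.
func skipping(limit, skip int) []int {
	var seen []int
	for i := 0; i < limit; i = i + 1 {
		if i == skip {
			continue
		}
		seen = append(seen, i)
	}
	return seen
}

func main() {
	fmt.Println(untilBreak(10, 5)) // [0 1 2 3 4]
	fmt.Println(skipping(10, 5))   // [0 1 2 3 4 6 7 8 9]
}
```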

Error handling

You can use the try statement to run a block of code (in the same scope as the enclosing statement) and catch any runtime errors that occur during the execution of that block. The error causes the code to execute the code in the catch block of the statement. If there are no errors, execution continues after the catch block.

x := 0
try {
    x = pay / hours
} catch {
    fmt.Println( "Hours were zero!" )
}
fmt.Println( "The result is ", x )

If the value of hours is non-zero, the assignment statement will assign the quotient to x. However, if hours is zero it will trigger a runtime divide-by-zero error. When this happens, the remainder of the statements (if any) in the try block are skipped, and the catch block is executed. Within this block, there is a variable _error_ that is set to the value of the error that was signalled. This can be used in the catch block if it needs to handle more than one possible error, for example.
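Go has no try statement, but a similar effect can be sketched with defer and recover. This is an analogy only, not a description of how the Ego runtime implements try/catch; the try and weeklyRate helpers are hypothetical.

```go
package main

import "fmt"

// try runs body and converts a runtime panic into an ordinary error,
// roughly like a try block handing control to its catch block.
func try(body func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	body()
	return nil
}

// weeklyRate returns pay / hours, or 0 and an error when hours is zero.
func weeklyRate(pay, hours int) (int, error) {
	x := 0
	err := try(func() {
		x = pay / hours // integer division panics when hours == 0
	})
	return x, err
}

func main() {
	fmt.Println(weeklyRate(400, 40)) // 10 <nil>
	fmt.Println(weeklyRate(400, 0))  // 0 and a non-nil divide error
}
```

As in the Ego example, x keeps its initial value of 0 when the division fails, because the assignment never completes.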

package

Use the package statement to define a set of related functions in a package in the current source file. A given source file can only contain one package statement and it must be the first statement.

package factor

This defines that all the functions and constants in this module will be defined in the factor package, and must be referenced with the factor prefix, as in

y := factor.intfact(55)

This calls the function intfact() defined in the factor package.

import

Use the import statement to include other files in the compilation of this program. The import statement cannot appear within any other block or function definition. Logically, the statement stops the current compilation, compiles the named object (adding any function and constant definitions to the named package) and then resumes the in-progress compilation.

import factor
import "factor"
import "factor.ego"

All three of these have the same effect. The first assumes a file named "factor.ego" is found in the current directory. The second and third examples assume the quoted string contains a file path. If the suffix ".ego" is not included it is assumed.

If the import name cannot be found in the current directory, then the compiler uses the environment variable EGO_PATH to form a directory path, and adds the "lib" directory to that path to locate the import. So the above statement could resolve to /Users/cole/ego/lib/factor.ego if EGO_PATH is set to "~/ego".

Finally, the import statement can read an entire directory of source files that all contribute to the same package. If the target of the import is a directory in the $EGO_PATH/lib location, then all the source files within that directory are read and processed as part of one package.
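The lookup order described above can be sketched as a small function that lists candidate file locations. This is a simplified illustration, not the compiler's actual resolution code: candidatePaths is a hypothetical helper, it ignores directory imports, and it takes the EGO_PATH value as a parameter instead of reading the environment.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// candidatePaths lists where an import might be looked up, in order: the
// name relative to the current directory, then under egoPath/lib.
// The ".ego" suffix is appended when missing.
func candidatePaths(name, egoPath string) []string {
	file := name
	if !strings.HasSuffix(file, ".ego") {
		file += ".ego"
	}
	paths := []string{file}
	if egoPath != "" {
		paths = append(paths, filepath.Join(egoPath, "lib", file))
	}
	return paths
}

func main() {
	// On a Unix-like system this prints:
	// [factor.ego /Users/cole/ego/lib/factor.ego]
	fmt.Println(candidatePaths("factor", "/Users/cole/ego"))
}
```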

@error

You can generate a runtime error by adding in a @error directive, which is followed by a string expression that is used to formulate the error message text.

v = "unknown"
@error "unrecognized value: " + v

This will result in a runtime error being generated with the error text "unrecognized value: unknown". This error can be intercepted in a try/catch block if desired.

@global

You can store a value in the Root symbol table (the table that is the ultimate parent of all other symbols). You cannot modify an existing readonly value, but you can create new readonly values, or values that can be changed by the user.

@global base "http://localhost:8080"

This creates a variable named base that is in the root symbol table, with the value of the given expression. If you do not specify an expression, the variable is created as an empty-string.
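The idea of a root table that is the ultimate parent of every scope can be sketched with a toy linked symbol table. The table type and its methods are hypothetical and much simpler than the symbols package (no readonly flags, no locking); they only show how a value stored at the root becomes visible from any scope.

```go
package main

import "fmt"

// table is a toy symbol table with a link to its parent scope.
type table struct {
	parent *table
	values map[string]interface{}
}

func newTable(parent *table) *table {
	return &table{parent: parent, values: map[string]interface{}{}}
}

// root walks parent links until it reaches the outermost table.
func (t *table) root() *table {
	for t.parent != nil {
		t = t.parent
	}
	return t
}

// get searches this table and then each enclosing table.
func (t *table) get(name string) (interface{}, bool) {
	for s := t; s != nil; s = s.parent {
		if v, ok := s.values[name]; ok {
			return v, true
		}
	}
	return nil, false
}

// setGlobal stores a value in the root table, like the @global directive.
func (t *table) setGlobal(name string, v interface{}) {
	t.root().values[name] = v
}

func main() {
	rootTable := newTable(nil)
	inner := newTable(newTable(rootTable))
	inner.setGlobal("base", "http://localhost:8080")

	sibling := newTable(rootTable)
	fmt.Println(sibling.get("base")) // http://localhost:8080 true
}
```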

@template

You can store away a named Go template as inline code. The template can reference any other templates defined.

@template hello "Greetings, {{.Name}}"

The resulting templates are available to the template() function, whose first parameter is the template name and the second optional parameter is a record containing all the named values that might be substituted into the template. For example,

 fmt.Println( strings.template(hello, { Name: "Tom"}))

This results in the string "Greetings, Tom" being printed on the stdout console. Note that hello becomes a global variable in the program, and is a pointer to the template that was previously compiled. This global value can only be used with template functions.
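Since the directive stores a Go template, the same text can be tried with Go's standard text/template package. The render helper below is hypothetical and is not the implementation of strings.template(); it only shows what the template body evaluates to for the given data.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render parses src as a Go template called name and executes it with data.
func render(name, src string, data interface{}) (string, error) {
	t, err := template.New(name).Parse(src)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	data := map[string]interface{}{"Name": "Tom"}
	out, err := render("hello", "Greetings, {{.Name}}", data)
	fmt.Println(out, err) // Greetings, Tom <nil>
}
```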

Documentation

Constants

const (
	AssertDirective       = "assert"
	AuthentiatedDirective = "authenticated"
	DebugDirective        = "debug"
	EntryPointDirective   = "entrypoint"
	ErrorDirective        = "error"
	FileDirective         = "file"
	FailDirective         = "fail"
	GlobalDirective       = "global"
	HandlerDirective      = "handler"
	JSONDirective         = "json"
	LineDirective         = "line"
	LocalizationDirective = "localization"
	LogDirective          = "log"
	PassDirective         = "pass"
	ResponseDirective     = "response"
	StatusDirective       = "status"
	TemplateDirective     = "template"
	TestDirective         = "test"
	TextDirective         = "text"
	TypeDirective         = "type"
	URLDirective          = "url"
	WaitDirective         = "wait"
)

Variables

This section is empty.

Functions

func CompileTypeSpec

func CompileTypeSpec(source string) (*data.Type, error)

Given a string expression of a type specification, compile it and return the type it represents, and an optional error if it was incorrectly formed. This cannot reference user types as they are not visible to this function.

If the string starts with the keyword `type` followed by a type name, then the resulting value is a type definition of the given name.

func Run

Given a token stream, compile and execute it immediately. Note that language extensions are always enabled for this kind of execution mode.

func RunString

func RunString(name string, s *symbols.SymbolTable, programText string) error

Given a string, compile and execute it immediately.

func TestAssert

func TestAssert(s *symbols.SymbolTable, args []interface{}) (interface{}, error)

TestAssert implements the T.assert() function.

func TestEqual

func TestEqual(s *symbols.SymbolTable, args []interface{}) (interface{}, error)

TestEqual implements the T.Equal() function.

func TestFail

func TestFail(s *symbols.SymbolTable, args []interface{}) (interface{}, error)

TestFail implements the T.fail() function which generates a fatal error.

func TestFalse

func TestFalse(s *symbols.SymbolTable, args []interface{}) (interface{}, error)

TestFalse implements the T.False() function.

func TestIsType

func TestIsType(s *symbols.SymbolTable, args []interface{}) (interface{}, error)

TestIsType implements the T.type() function.

func TestNil

func TestNil(s *symbols.SymbolTable, args []interface{}) (interface{}, error)

TestNil implements the T.Nil() function.

func TestNotEqual

func TestNotEqual(s *symbols.SymbolTable, args []interface{}) (interface{}, error)

TestNotEqual implements the T.NotEqual() function.

func TestNotNil

func TestNotNil(s *symbols.SymbolTable, args []interface{}) (interface{}, error)

TestNotNil implements the T.NotNil() function.

func TestTrue

func TestTrue(s *symbols.SymbolTable, args []interface{}) (interface{}, error)

TestTrue implements the T.True() function.

Types

type Compiler

type Compiler struct {
	// contains filtered or unexported fields
}

Compiler is a structure defining what we know about the compilation.

func New

func New(name string) *Compiler

New creates a new compiler instance.

func (*Compiler) AddBuiltins

func (c *Compiler) AddBuiltins(pkgname string) bool

AddBuiltins adds the builtins for the named package (or prebuilt builtins if the package name is empty).

func (*Compiler) AddPackageToSymbols

func (c *Compiler) AddPackageToSymbols(s *symbols.SymbolTable) *Compiler

AddPackageToSymbols adds all the defined packages for this compilation to the given symbol table. This function supports attribute chaining for a compiler instance.

func (*Compiler) AddStandard

func (c *Compiler) AddStandard(s *symbols.SymbolTable) bool

AddStandard adds the package-independent standard functions (like len() or make()) to the given symbol table.

func (*Compiler) Assert

func (c *Compiler) Assert() error

Assert implements the @assert directive.

func (*Compiler) AutoImport

func (c *Compiler) AutoImport(all bool, s *symbols.SymbolTable) error

AutoImport arranges for the import of built-in packages. The parameter indicates if all available packages (including those found in the ego path) are imported, versus just essential packages like "util".

func (*Compiler) Clone

func (c *Compiler) Clone(withLock bool) *Compiler

Clone makes a new copy of the current compiler. The withLock flag indicates if the clone should respect symbol table locking. This function supports attribute chaining for a compiler instance.

func (*Compiler) Compile

func (c *Compiler) Compile(name string, t *tokenizer.Tokenizer) (*bytecode.ByteCode, error)

Compile starts a compilation unit, and returns a bytecode of the compiled material.

func (*Compiler) CompileString

func (c *Compiler) CompileString(name string, source string) (*bytecode.ByteCode, error)

CompileString turns a string into a compilation unit. This is a helper function around the Compile() operation that removes the need for the caller to provide a tokenizer.

func (*Compiler) ExitEnabled

func (c *Compiler) ExitEnabled(b bool) *Compiler

If set to true, the compiler allows the "exit" statement. This function supports attribute chaining for a compiler instance.

func (*Compiler) Expression

func (c *Compiler) Expression() (*bytecode.ByteCode, error)

Expression is the public entrypoint to compile an expression which returns a bytecode segment as its result. This lets code compile an expression, but save the generated code to emit later.

The function grammar considers a conditional to be the top of the parse tree, so we start evaluating there.

From the golang doc, operator precedence is:

 Precedence    Operator
	5             *  /  %  <<  >>  &  &^
	4             +  -  |  ^
	3             ==  !=  <  <=  >  >=
	2             &&
	1             ||
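The table is Go's own precedence, so its effect can be observed in plain Go. The small functions below are illustrative only; each relies on a higher-precedence operator binding before a lower one, with no parentheses.

```go
package main

import "fmt"

// mulBeforeAdd: * (level 5) binds tighter than + (level 4), so a + (b*c).
func mulBeforeAdd(a, b, c int) int { return a + b*c }

// shiftBeforeAdd: << (level 5) binds tighter than +, so (a<<n) + b.
func shiftBeforeAdd(a int, n uint, b int) int { return a<<n + b }

// andBeforeOr: && (level 2) binds tighter than || (level 1), so p || (q && r).
func andBeforeOr(p, q, r bool) bool { return p || q && r }

func main() {
	fmt.Println(mulBeforeAdd(1, 2, 3))          // 7
	fmt.Println(shiftBeforeAdd(1, 2, 1))        // 5
	fmt.Println(andBeforeOr(true, false, false)) // true
}
```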

func (*Compiler) ExtensionsEnabled

func (c *Compiler) ExtensionsEnabled(b bool) *Compiler

If set to true, the compiler allows the PRINT, TRY/CATCH, etc. statements. This function supports attribute chaining for a compiler instance.

func (*Compiler) Fail

func (c *Compiler) Fail() error

Fail implements the @fail directive.

func (*Compiler) File

func (c *Compiler) File() error

File implements the @file directive.

func (*Compiler) Get

func (c *Compiler) Get(name string) (interface{}, bool)

Get retrieves a compile-time symbol value.

func (*Compiler) GetPackageType

func (c *Compiler) GetPackageType(packageName, typeName string) (*data.Type, bool)

For a given package and type name, get the underlying type.

func (*Compiler) MainSeen

func (c *Compiler) MainSeen() bool

MainSeen indicates if a "package main" has been seen in this compilation.

func (*Compiler) NormalizedIdentifiers

func (c *Compiler) NormalizedIdentifiers() bool

NormalizedIdentifiers returns true if this instance of the compiler is folding all identifiers to a common (lower) case.

func (*Compiler) ParseFunctionDeclaration

func (c *Compiler) ParseFunctionDeclaration() (*data.FunctionDeclaration, error)

ParseFunctionDeclaration compiles a function declaration, which specifies the parameter and return type of a function.

func (*Compiler) SetInteractive

func (c *Compiler) SetInteractive(b bool) *Compiler

SetInteractive indicates if the compilation is happening in interactive (i.e. REPL) mode. This function supports attribute chaining for a compiler instance.

func (*Compiler) SetNormalizedIdentifiers

func (c *Compiler) SetNormalizedIdentifiers(flag bool) *Compiler

SetNormalizedIdentifiers sets the flag indicating if this compiler instance is folding all identifiers to a common case. This function supports attribute chaining for a compiler instance.

func (*Compiler) SetRoot

func (c *Compiler) SetRoot(s *symbols.SymbolTable) *Compiler

Override the default root symbol table for this compilation. This determines where package names are stored/found, for example. This is overridden by the web service handlers as they have per-call instances of root. This function supports attribute chaining for a compiler instance.

func (*Compiler) SetTestMode

func (c *Compiler) SetTestMode(b bool) *Compiler

SetTestMode is used to set the test mode indicator for the compiler. This is set to true only when running in Ego "test" mode. This function supports attribute chaining for a compiler instance.

func (*Compiler) Symbols

func (c *Compiler) Symbols() *symbols.SymbolTable

Symbols returns the symbol table map from compilation.

func (*Compiler) TestMode

func (c *Compiler) TestMode() bool

TestMode returns whether the compiler is being used under control of the Ego "test" command, which has slightly different rules for block constructs.

func (*Compiler) TestPass

func (c *Compiler) TestPass() error

TestPass implements the @pass directive.

func (*Compiler) WithNormalization

func (c *Compiler) WithNormalization(f bool) *Compiler

WithNormalization sets the normalization flag. This function supports attribute chaining for a compiler instance.

func (*Compiler) WithSymbols

func (c *Compiler) WithSymbols(s *symbols.SymbolTable) *Compiler

Set the given symbol table as the default symbol table for compilation. This mostly affects how builtins are processed. This function supports attribute chaining for a compiler instance.

func (*Compiler) WithTokens

func (c *Compiler) WithTokens(t *tokenizer.Tokenizer) *Compiler

WithTokens supplies the token stream to a compiler. This function supports attribute chaining for a compiler instance.
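Several of the methods above note that they support attribute chaining, meaning each returns the *Compiler it was called on. The general pattern can be sketched with a stand-in type; the settings struct and its lowercase methods below are hypothetical and only mirror the shape of the documented setters.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for a compiler's option flags.
type settings struct {
	extensions  bool
	interactive bool
	testMode    bool
}

// Each setter stores its flag and returns the receiver so calls can be chained.
func (s *settings) extensionsEnabled(b bool) *settings {
	s.extensions = b
	return s
}

func (s *settings) setInteractive(b bool) *settings {
	s.interactive = b
	return s
}

func (s *settings) setTestMode(b bool) *settings {
	s.testMode = b
	return s
}

func main() {
	s := (&settings{}).extensionsEnabled(true).setInteractive(false).setTestMode(true)
	fmt.Printf("%+v\n", *s) // {extensions:true interactive:false testMode:true}
}
```

Returning the receiver keeps configuration on one line and avoids a separate options struct, at the cost of setters that cannot also report an error.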
