script

package
v0.0.0-...-4c91ef0
Published: Oct 28, 2019 License: Apache-2.0 Imports: 22 Imported by: 0

Documentation

Overview

Package script implements scripting support for defining Diviner studies through Starlark [1]. This package enables the floating point and lambda extensions of the language.

Script defines the following builtins for defining Diviner configurations (question marks indicate optional arguments):

	discrete(v1, v2, v3...)
		Defines a discrete parameter that takes on the provided
		set of values (types string, float, or int).

	range(beg, end)
		Defines a range parameter with the given bounds (integers or floats).

	minimize(metric)
		Defines an objective that minimizes a metric (string).

	maximize(metric)
		Defines an objective that maximizes a metric (string).

	localsystem(name, parallelism?)
		Defines a new local system with the provided name.  The name is used to
		identify the system in tools.  The parallelism limits the number of jobs
		that run on this system simultaneously.  If parallelism is unset, it
		defaults to ∞.

	ec2system(name, ami, instance_profile, instance_type, disk_space?, data_space?, on_demand?, flavor?)
		Defines a new EC2-based system with the given name and configuration.
		The provided name is used to identify the system in tools.
		- ami:              the EC2 AMI to use when launching new instances;
		- instance_profile: the IAM instance profile assigned to new instances;
		- instance_type:    the instance type used;
		- disk_space:       the amount of root disk space created;
		- data_space:       the amount of data/scratch space created;
		- on_demand:        (bool) whether to launch on-demand instance types;
		- flavor:           the flavor of AMI: "ubuntu" or "coreos".
		See package github.com/grailbio/bigmachine/ec2system for more details on these
		parameters.

	dataset(name, system, if_not_exist?, local_files?, script)
		Defines a dataset (diviner.Dataset):
		- name:         the name of the dataset, which must be unique;
		- system:       the system(s) to be used for run execution. The value is either
		                a single system or a list of systems. In the latter case,
		                the run will use whichever of the systems can allocate resources.
		- if_not_exist: a URL that is checked for conditional execution;
		                dataset invocations are de-duped based on this URL.
		- local_files:  a list of local files that must be made available
		                in the script's execution environment;
		- script:       the script that is run to produce the dataset.

	run_config(script, system, local_files?, datasets?)
		Defines a run config (diviner.RunConfig) representing a single
		trial:
		- script:      the script that is executed for this trial;
		- system:      the system(s) to be used for run execution. The value is either
		               a single system or a list of systems. In the latter case,
		               the run will use whichever of the systems can allocate resources.
		- local_files: a list of local files that must be made available
		               in the script's execution environment;
		- datasets:    a list of datasets that must be available before
		               the trial can proceed.

	study(name, params, objective, run, replicates?, oracle?)
		A toplevel function that declares a named study with the provided
		parameters, runner, and objectives.
		- name:       a string specifying the name of the study;
		- objective:  the optimization objective;
		- params:     a dictionary naming a set of parameters
		              to be optimized;
		- run:        a function that returns a run_config for a set
		              of parameter values; the first argument to the function
		              is a dictionary of parameter values. A number of optional,
		              named arguments follow: "id" is a string providing the
		              run's diviner ID, which may be used as an external key to
		              reference a particular run; "replicate" is an integer
		              specifying the replicate number associated with the run.
		- replicates: the number of replicates to perform for each parameter
		              combination.
		- description: an optional string describing the study.
		- oracle:     the oracle to use (grid search by default).

	grid_search
		The grid search oracle.
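As an illustration of what a grid search over discrete parameters enumerates, the following is a minimal sketch in plain Go. It is not Diviner's oracle implementation: the function name gridPoints and the use of string-valued parameters are assumptions made for the example, and it presumes every parameter is discrete.

```go
package main

import (
	"fmt"
	"sort"
)

// gridPoints is a hypothetical helper (not part of Diviner). It enumerates
// every combination of the given discrete parameters. Parameter names are
// visited in sorted order so that the result is deterministic.
func gridPoints(params map[string][]string) []map[string]string {
	names := make([]string, 0, len(params))
	for name := range params {
		names = append(names, name)
	}
	sort.Strings(names)

	// Start with a single empty point and extend it one parameter at a time.
	points := []map[string]string{{}}
	for _, name := range names {
		var next []map[string]string
		for _, point := range points {
			for _, value := range params[name] {
				p := make(map[string]string, len(point)+1)
				for k, v := range point {
					p[k] = v
				}
				p[name] = value
				next = append(next, p)
			}
		}
		points = next
	}
	return points
}

func main() {
	params := map[string][]string{
		"learning_rate": {"0.1", "0.5"},
		"optimizer":     {"adam", "sgd"},
	}
	for _, point := range gridPoints(params) {
		fmt.Println(point)
	}
}
```

With two values for each of two parameters, the sketch yields four points. With replicates set, each such point would presumably be run that many times.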

	skopt(base_estimator?, n_initial_points?, acq_func?, acq_optimizer?)
		A Bayesian optimization oracle based on skopt. The arguments
		are as in skopt.Optimizer, documented at
		https://scikit-optimize.github.io/optimizer/index.html#skopt.optimizer.Optimizer:
		- base_estimator:   the base estimator to be used, one of
		                    "GP", "RF", "ET", "GBRT" (default "GP");
		- n_initial_points: number of evaluations to perform before estimating
		                    using the above estimator (default 10);
		- acq_func:         the acquisition function to use for sampling
		                    new points, one of "LCB", "EI", "PI", or
		                    "gp_hedge" (default "gp_hedge");
		- acq_optimizer:    the optimizer used to minimize the acquisition function,
		                    one of "sampling", "lbfgs" (by default it is automatically
		                    selected).

	command(script, interpreter?="bash -c", strip?=False)
		Run a subprocess and return its standard output as a string.
		- script: the script to run; a string.
		- interpreter: command that runs the script. It defaults to "bash -c".
		- strip: strip leading and trailing whitespace from the command's output.

	For example, command("print('foo'*2)", interpreter="python3 -c") will produce
	"foofoo\n".
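The behavior described for command can be approximated in plain Go as follows. This is a sketch under stated assumptions, not the package's implementation: runCommand is a hypothetical name, and splitting the interpreter string on whitespace and appending the script as the final argument is an assumption about how "bash -c" is applied.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// runCommand is a hypothetical helper mirroring the documented behavior of
// command(): the interpreter string is split into words (an assumption), the
// script is appended as the final argument, and standard output is returned.
func runCommand(script, interpreter string, strip bool) (string, error) {
	argv := strings.Fields(interpreter)
	if len(argv) == 0 {
		return "", fmt.Errorf("empty interpreter")
	}
	argv = append(argv, script)
	out, err := exec.Command(argv[0], argv[1:]...).Output()
	if err != nil {
		return "", err
	}
	s := string(out)
	if strip {
		// Remove leading and trailing whitespace, as strip=True does.
		s = strings.TrimSpace(s)
	}
	return s, nil
}

func main() {
	out, err := runCommand("echo foo", "sh -c", true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", out)
}
```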

	temp_file(contents)
		Create a temporary file from the provided contents (a string), and return
		its path.

	enum_value(str)
		Internal representation of a protocol buffer enumeration value.
		(See to_proto).

	to_proto(dict)
		Render a string-keyed dictionary to the text protocol buffer format.
		Dictionaries cannot currently be nested. Enumeration values as created
		by enum_value are rendered as protocol buffer enumerations, not strings.
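To make the distinction between strings and enumeration values concrete, here is a minimal Go sketch of rendering a flat dictionary as text-format fields. The names EnumValue and toTextProto are hypothetical, the one-field-per-line layout and sorted key order are assumptions, and strings are quoted with Go's strconv.Quote, which matches the text format only for simple strings. The exact output of to_proto may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// EnumValue is a hypothetical marker type: a string that should be rendered
// as a bare enumeration identifier rather than as a quoted string.
type EnumValue string

// toTextProto is a hypothetical helper that renders a flat, string-keyed map
// as text-format fields, one per line, with keys in sorted order. Nested
// maps are rejected, matching the documented restriction.
func toTextProto(fields map[string]interface{}) (string, error) {
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		switch v := fields[k].(type) {
		case EnumValue:
			fmt.Fprintf(&b, "%s: %s\n", k, string(v)) // bare identifier
		case string:
			fmt.Fprintf(&b, "%s: %s\n", k, strconv.Quote(v)) // quoted string
		case bool, int, int64, float64:
			fmt.Fprintf(&b, "%s: %v\n", k, v)
		default:
			return "", fmt.Errorf("field %q: unsupported type %T", k, v)
		}
	}
	return b.String(), nil
}

func main() {
	out, err := toTextProto(map[string]interface{}{
		"name":  "adam",
		"mode":  EnumValue("TRAIN"),
		"steps": 100,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```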

	panic(messages...)
		Print the messages and crash the process.

Diviner configs must include one or more studies as toplevel declarations. Global Starlark objects are frozen after initial evaluation to prevent functions from modifying shared state.

[1] https://docs.bazel.build/versions/master/skylark/language.html

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Load

func Load(filename string, src interface{}) ([]diviner.Study, error)

Load loads configuration from a Starlark script. Arguments are as in Starlark's syntax.Parse: if src is not nil, it must be a byte source ([]byte or io.Reader); if src is nil, data are parsed from the provided filename.

Load provides the builtins described in the package documentation.
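The src convention described above can be sketched with the standard library alone. The helper below, readSource, is hypothetical and only illustrates the three documented cases (nil, []byte, io.Reader); it is not how Load or Starlark's syntax.Parse is implemented, and the file name study.dv is made up for the example.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// readSource is a hypothetical helper illustrating the documented src
// convention: a nil src means "read the named file", while a []byte or
// io.Reader supplies the script contents directly.
func readSource(filename string, src interface{}) ([]byte, error) {
	switch s := src.(type) {
	case nil:
		return os.ReadFile(filename)
	case []byte:
		return s, nil
	case io.Reader:
		return io.ReadAll(s)
	default:
		return nil, fmt.Errorf("%s: unsupported source type %T", filename, src)
	}
}

func main() {
	// The filename is used only for reporting here, since src is non-nil.
	data, err := readSource("study.dv", strings.NewReader("# inline config\n"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s", data)
}
```

Passing a non-nil src in this way is convenient in tests, where the configuration can be supplied inline rather than written to disk first.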

Types

This section is empty.
