diviner

v0.0.0-...-4c91ef0
Published: Oct 28, 2019 License: Apache-2.0 Imports: 16 Imported by: 0

README

Diviner

Diviner is a serverless machine learning and hyperparameter tuning platform. Diviner runs studies on behalf of a user; each study comprises a set of hyperparameters (e.g., learning rate, data augmentation policy, loss functions) and instructions for how to instantiate a trial based on a set of concrete hyperparameter values. Diviner then manages trial execution, book-keeping, and hyperparameter optimization based on past trials.

Diviner can be used as a scriptable tool or in Go programs through a Go package.

Studies, trials, and runs

Diviner defines a data model that is rooted in user-defined studies. A study contains all the information needed to conduct a number of trials, including a set of hyperparameters over which to conduct the study. A trial is an instantiation of a set of valid hyperparameter values. A run is a trial attempt; runs may fail or be retried.

Diviner stores studies and runs in an underlying database, keyed by study names. The database is used to construct leaderboards that show the best-performing hyperparameter combinations for a study. The database can also be used to query pending runs and detailed information about each.

The diviner tool interprets study definitions written in Starlark. A study definition includes the hyperparameter definitions and a function that determines how to conduct a run based on a set of parameter values. Those values are selected by an optimizer (called an oracle).

Example: optimizing MNIST with PyTorch

In this example, we'll run a hyperparameter search on the example PyTorch convolutional neural network on the MNIST dataset.

By default, Diviner uses the DynamoDB table named "diviner". To set it up, run diviner create-table before using Diviner for the first time.

$ diviner create-table

Now we're ready to define our study and instruct Diviner how to run trials.

First, create a file called mnist.dv. We are interested in running training on GPU instances in AWS EC2, so the first thing we do is define an EC2 system on which the trainers will run:

ec2gpu = ec2system(
    "ec2gpu",
    region="us-west-2",
    ami="ami-01a4e5be5f289dd12",
    security_group="SECURITY_GROUP",
    instance_type="p3.2xlarge",
    disk_space=100,
    data_space=500,
    on_demand=True,
    flavor="ubuntu")

The AMI named here is the AWS DLAMI in us-west-2 as of this writing. The security group should give external access to port 443 (HTTPS).

We'll run our examples on p3.2xlarge instances, which are the smallest GPU instances provided by EC2.

Next, we'll define a function that is called with a set of parameter values. The function, run_mnist, returns a run configuration based on the provided parameter values. The run config defines how a trial is to be run given these parameters. In our case, we keep it very simple: we use the PyTorch-provided Docker image to invoke the MNIST example with the provided parameter values.

def run_mnist(params):
    return run_config(
        system=ec2gpu,
        script="""
nvidia-docker run -i pytorch/pytorch bash -x <<EOF
git clone https://github.com/pytorch/examples.git
python examples/mnist/main.py \
	--batch-size %(batch_size)d \
	--lr %(learning_rate)f \
	--epochs %(epochs)d  2>&1 | \
	awk '/^Test set:/ {gsub(",", "", \$5); print "METRICS: loss="\$5} {print}'
EOF
""" % params)

The only thing of note here is that we use awk to pull out the test losses reported by the PyTorch trainer and format them in the manner Diviner expects. (Any line in the process's standard output that begins with "METRICS: " followed by a set of key=value pairs is interpreted by Diviner as metrics reported by the trial.) The run config also names the system on which to run the trial.

Now that we have a system definition and a run function, we can define our study. The study declares the set of parameters and their ranges. In this case, we're using discrete parameter ranges, but users can also define continuous ranges. (The Starlark intrinsics available in Diviner are documented separately.) We define the study's objective as minimizing the metric "loss" (as reported by the runner, above). The oracle, grid_search, defines how the parameter space is to be explored: a grid search explores the parameter space exhaustively.

study(
    name="mnist",
    params={
        "epochs": discrete(1, 10, 50, 100),
        "batch_size": discrete(32, 64, 128, 256),
        "learning_rate": discrete(0.001, 0.01, 0.1, 1.0),
    },
    objective=minimize("loss"),
    oracle=grid_search,
    run=run_mnist,
)
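To make "exhaustively" concrete, here is a minimal Go sketch of how a grid over discrete parameters can be enumerated. It is not Diviner's grid_search implementation; the function name grid and the use of plain float64 values are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// grid enumerates every combination of the given discrete parameter values.
// It is a hypothetical helper, not part of Diviner. Parameter names are
// visited in sorted order so that the result is deterministic.
func grid(params map[string][]float64) []map[string]float64 {
	names := make([]string, 0, len(params))
	for name := range params {
		names = append(names, name)
	}
	sort.Strings(names)

	// Start with a single empty assignment and extend it one parameter at a time.
	combos := []map[string]float64{{}}
	for _, name := range names {
		var next []map[string]float64
		for _, partial := range combos {
			for _, v := range params[name] {
				combo := make(map[string]float64, len(partial)+1)
				for k, pv := range partial {
					combo[k] = pv
				}
				combo[name] = v
				next = append(next, combo)
			}
		}
		combos = next
	}
	return combos
}

func main() {
	combos := grid(map[string][]float64{
		"epochs":        {1, 10, 50, 100},
		"batch_size":    {32, 64, 128, 256},
		"learning_rate": {0.001, 0.01, 0.1, 1.0},
	})
	fmt.Println(len(combos)) // 4 * 4 * 4 = 64 combinations
}
```

With four values per parameter, the study above therefore amounts to 64 distinct trials.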

Finally, we can run the study. We run it in "streaming" mode, meaning that new trials are started as soon as capacity allows. The -trials argument determines how many trials may run in parallel. (In our case, it also determines how many EC2 GPU instances are created at a time.)

$ diviner run -stream -trials 5 mnist.dv

While the study is running, we can query the database. For example, to see the current set of trials running:

$ diviner list -runs -state pending -s
mnist:9  6:02PM 11m59s pending running: Train Epoch: 16 [45120/60000 (75%)] Loss: 0.171844
mnist:10 6:08PM 5m29s  pending running: Train Epoch: 9 [14080/60000 (23%)]  Loss: 0.178012
mnist:11 6:09PM 5m14s  pending running:
mnist:12 6:09PM 5m14s  pending running: Train Epoch: 5 [56320/60000 (94%)] Loss: 0.380566
mnist:13 6:09PM 4m59s  pending running: Train Epoch: 7 [19200/60000 (32%)] Loss: 0.069407

(The last columns of this output show the last line printed to standard output.) We can also examine the details of a particular run. Runs are named by the study and an index.

$ diviner info mnist:9
run mnist:9:
    state:     pending
    created:   2019-10-25 18:02:14 -0700 PDT
    runtime:   13m29.972611766s
    restarts:  0
    replicate: 0
    values:
        batch_size:    32
        epochs:        50
        learning_rate: 0.001
    metrics:
        loss: 0.0398

Here we can see the parameter values used in this run, the latest reported metrics, and some other relevant metadata. Passing -v to diviner info gives even more detail, including all of the reported metrics and the rendered script.

$ diviner info -v mnist:9
run mnist:9:
    state:     pending
    created:   2019-10-25 18:02:14 -0700 PDT
    runtime:   13m59.973265371s
    restarts:  0
    replicate: 0
    values:
        batch_size:    32
        epochs:        50
        learning_rate: 0.001
    metrics[0]:
        loss: 0.2996
    metrics[1]:
        loss: 0.1848
    ...
    metrics[17]:
        loss: 0.0398
    script:
        nvidia-docker run -i pytorch/pytorch bash -x <<EOF
        git clone https://github.com/pytorch/examples.git
        python examples/mnist/main.py  --batch-size 32  --lr 0.001000  --epochs 50  2>&1 |  awk '/^Test set:/ {gsub(",", "", \$5); print "METRICS: loss="\$5} {print}'
        EOF

After Diviner has accumulated a number of trials, we can request the current leaderboard:

$ diviner leaderboard mnist
study    replicates loss   batch_size epochs learning_rate
mnist:21 0          0.0255 32         10     0.01
mnist:28 0          0.0265 256        50     0.01
mnist:9  0          0.027  32         50     0.001
mnist:14 0          0.0285 64         100    0.001
mnist:27 0          0.029  128        50     0.01
mnist:13 0          0.0298 32         100    0.001
...

This tells us that the best hyperparameters (so far) for this MNIST classification task are batch_size=32, epochs=10, and learning_rate=0.01.
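The ordering of a leaderboard follows from the study's objective. The following Go sketch ranks a few of the entries shown above by loss. The entry type and rank function are hypothetical and only illustrate the idea; Diviner's own leaderboard code may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical leaderboard row: a run name and its objective metric.
type entry struct {
	Run  string
	Loss float64
}

// rank sorts entries best-first. When minimize is true, lower values rank higher.
func rank(entries []entry, minimize bool) {
	sort.SliceStable(entries, func(i, j int) bool {
		if minimize {
			return entries[i].Loss < entries[j].Loss
		}
		return entries[i].Loss > entries[j].Loss
	})
}

func main() {
	entries := []entry{
		{"mnist:9", 0.027},
		{"mnist:28", 0.0265},
		{"mnist:21", 0.0255},
	}
	rank(entries, true)
	fmt.Println(entries[0].Run) // mnist:21
}
```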

In addition to tracking studies and runs, Diviner maintains logs for each run. These can be useful when debugging issues or monitoring ongoing jobs. For example, to view the logs of the best entry in the above study:

$ diviner logs mnist:21
diviner: started run (try 1) at 2019-10-25 18:41:57.84792 -0700 PDT on https://ec2-52-88-125-113.us-west-2.compute.amazonaws.com/
+ git clone https://github.com/pytorch/examples.git
Cloning into 'examples'...
+ python examples/mnist/main.py --batch-size 32 --lr 0.010000 --epochs 10
+ awk '/^Test set:/ {gsub(",", "", $5); print "METRICS: loss="$5} {print}'
9920512it [00:01, 8438927.61it/s]
32768it [00:00, 138802.83it/s]
1654784it [00:00, 2387828.32it/s]
8192it [00:00, 53052.53it/s]            Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ../data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting ../data/MNIST/raw/train-images-idx3-ubyte.gz to ../data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ../data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting ../data/MNIST/raw/train-labels-idx1-ubyte.gz to ../data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting ../data/MNIST/raw/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw
Processing...
Done!
Train Epoch: 1 [0/60000 (0%)]	Loss: 2.301286
Train Epoch: 1 [320/60000 (1%)]	Loss: 2.246368
Train Epoch: 1 [640/60000 (1%)]	Loss: 2.143235
Train Epoch: 1 [960/60000 (2%)]	Loss: 2.020995
Train Epoch: 1 [1280/60000 (2%)]	Loss: 1.889598
Train Epoch: 1 [1600/60000 (3%)]	Loss: 1.387788
Train Epoch: 1 [1920/60000 (3%)]	Loss: 1.103807
...

Documentation

Overview

Package diviner implements common data structures to support a black-box optimization framework in the style of Google Vizier [1]. Implementations of optimization algorithms, as well as runner infrastructure, parallel execution, and a command line tool, are provided in subpackages.

[1] http://www.kdd.org/kdd2017/papers/view/google-vizier-a-service-for-black-box-optimization

Index

Constants

This section is empty.

Variables

var ErrNotExist = errors.New("study or run does not exist")

ErrNotExist is returned from a database when a study or run does not exist.

Functions

func Hash

func Hash(v Value) uint64

Hash returns a 64-bit hash for the value v.

Types

type Bool

type Bool bool

Bool is a boolean-typed value.

func (Bool) Bool

func (v Bool) Bool() bool

Bool implements Value.

func (Bool) Equal

func (v Bool) Equal(w Value) bool

func (Bool) Float

func (v Bool) Float() float64

Float implements Value.

func (Bool) Get

func (Bool) Get(key string) Value

func (Bool) Hash

func (v Bool) Hash(h hash.Hash)

func (Bool) Index

func (Bool) Index(int) Value

func (Bool) Int

func (Bool) Int() int64

Int implements Value.

func (Bool) Kind

func (Bool) Kind() Kind

Kind implements Value.

func (Bool) Len

func (Bool) Len() int

func (Bool) Less

func (v Bool) Less(w Value) bool

Less implements Value.

func (Bool) Put

func (Bool) Put(key string, value Value)

func (Bool) Range

func (Bool) Range(func(key string, value Value))

func (Bool) Str

func (Bool) Str() string

Str implements Value.

func (Bool) String

func (v Bool) String() string

String implements Value.

type Database

type Database interface {
	// CreateTable creates the underlying database table.
	CreateTable(context.Context) error

	// CreateStudyIfNotExist creates a new study from the provided Study value.
	// If the study already exists, this is a no-op.
	CreateStudyIfNotExist(ctx context.Context, study Study) (created bool, err error)
	// LookupStudy returns the study with the provided name.
	LookupStudy(ctx context.Context, name string) (Study, error)
	// ListStudies returns the set of studies matching the provided prefix and whose
	// last update time is not before the provided time.
	ListStudies(ctx context.Context, prefix string, since time.Time) ([]Study, error)

	// NextSeq reserves and returns the next run sequence number for the
	// provided study.
	NextSeq(ctx context.Context, study string) (uint64, error)
	// InsertRun inserts the provided run into a study. The run's study,
	// values, and config must be populated; other fields are ignored.
	// If the sequence number is provided (>0), then it is assumed to
	// have been reserved by NextSeq. The run's study must already
	// exist, and the returned Run is assigned a sequence number, state,
	// and creation time.
	InsertRun(ctx context.Context, run Run) (Run, error)
	// UpdateRun updates the run named by the provided study and
	// sequence number with the given run state, message, runtime, and
	// current retry sequence.
	// UpdateRun is used also as a keepalive mechanism: runners must
	// call UpdateRun frequently in order to have the run considered
	// live by Diviner's tooling.
	UpdateRun(ctx context.Context, study string, seq uint64, state RunState, message string, runtime time.Duration, retry int) error
	// AppendRunMetrics reports a new set of metrics to the run named by the provided
	// study and sequence number.
	AppendRunMetrics(ctx context.Context, study string, seq uint64, metrics Metrics) error

	// ListRuns returns the set of runs in the provided study matching the queried
	// run states. ListRuns only returns runs that have been updated since the provided
	// time.
	ListRuns(ctx context.Context, study string, states RunState, since time.Time) ([]Run, error)
	// LookupRun returns the run named by the provided study and sequence number.
	LookupRun(ctx context.Context, study string, seq uint64) (Run, error)

	// Log obtains a reader for the logs emitted by the run named by the study and
	// sequence number. If !since.IsZero(), show messages added at or after the
	// given time. If follow is true, the returned reader is a perpetual stream,
	// updated as new log entries are appended.
	Log(study string, seq uint64, since time.Time, follow bool) io.Reader

	// Logger returns an io.WriteCloser, to which log messages can be written,
	// for the run named by a study and sequence number.
	Logger(study string, seq uint64) io.WriteCloser
}

A Database is used to track and manage studies and runs.

type Dataset

type Dataset struct {
	// Name is a unique name describing the dataset. Uniqueness is
	// required: diviner only runs one dataset for each named dataset.
	Name string
	// IfNotExist may contain a URL which is checked for existence
	// before running the script that produces this dataset. It is
	// assumed the dataset already exists if the URL exists.
	IfNotExist string
	// LocalFiles is a set of files (local to where diviner is run)
	// that should be made available in the script's environment.
	// These files are copied into the script's working directory,
	// retaining their basenames. (Thus the set of basenames in
	// the list should not collide.)
	LocalFiles []string
	// Script is a Bash script that is run to produce this dataset.
	Script string

	// Systems identifies the list of systems where the dataset run should be
	// performed. This can be used to schedule jobs with different kinds of
	// systems requirements. If len(Systems) > 1, each is tried until one of them
	// successfully allocates a machine.
	Systems []*System
}

A Dataset describes a preprocessing step that's required by a run. It may be shared among multiple runs.

func (Dataset) Freeze

func (Dataset) Freeze()

Freeze implements starlark.Value.

func (Dataset) Hash

func (Dataset) Hash() (uint32, error)

Hash implements starlark.Value.

func (Dataset) String

func (d Dataset) String() string

String returns a textual description of the dataset.

func (Dataset) Truth

func (Dataset) Truth() starlark.Bool

Truth implements starlark.Value.

func (Dataset) Type

func (Dataset) Type() string

Type implements starlark.Value.

type Dict

type Dict = Values

Dict is a Value that represents a map from string to a Value.

type Direction

type Direction int

Direction is the direction of the objective.

const (
	// Minimize indicates that the objective is to minimize a metric.
	Minimize Direction = iota
	// Maximize indicates that the objective is to maximize a metric.
	Maximize
)

func (Direction) Arrow

func (d Direction) Arrow() string

Arrow returns a decorative arrow indicating the direction of d.

func (Direction) String

func (d Direction) String() string

String returns a textual representation of the objective direction d.

type Discrete

type Discrete struct {
	DiscreteValues []Value
	DiscreteKind   Kind
}

A Discrete is a parameter that takes on a finite set of values.

func NewDiscrete

func NewDiscrete(values ...Value) *Discrete

NewDiscrete returns a new discrete param comprising the given values. NewDiscrete panics if the values are not all of the same Kind, or if zero values are passed.

func (*Discrete) Freeze

func (*Discrete) Freeze()

Freeze implements starlark.Value.

func (*Discrete) Hash

func (*Discrete) Hash() (uint32, error)

Hash implements starlark.Value.

func (*Discrete) IsValid

func (d *Discrete) IsValid(v Value) bool

IsValid tells whether the value v belongs to the set of allowable values.

func (*Discrete) Kind

func (d *Discrete) Kind() Kind

Kind returns the kind of values represented by this discrete param.

func (*Discrete) Sample

func (d *Discrete) Sample(r *rand.Rand) Value

Sample draws a value from the set of parameter values and returns it.

func (*Discrete) String

func (d *Discrete) String() string

String returns a description of this parameter.

func (*Discrete) Truth

func (*Discrete) Truth() starlark.Bool

Truth implements starlark.Value.

func (*Discrete) Type

func (*Discrete) Type() string

Type implements starlark.Value.

func (*Discrete) Values

func (d *Discrete) Values() []Value

Values returns the possible values of the discrete param in the order given.

type Float

type Float float64

Float is a float-typed value.

func (Float) Bool

func (Float) Bool() bool

func (Float) Equal

func (v Float) Equal(w Value) bool

func (Float) Float

func (v Float) Float() float64

Float implements Value.

func (Float) Get

func (Float) Get(key string) Value

func (Float) Hash

func (v Float) Hash(h hash.Hash)

func (Float) Index

func (Float) Index(int) Value

func (Float) Int

func (Float) Int() int64

Int implements Value.

func (Float) Kind

func (Float) Kind() Kind

Kind implements Value.

func (Float) Len

func (Float) Len() int

func (Float) Less

func (v Float) Less(w Value) bool

Less implements Value.

func (Float) Put

func (Float) Put(key string, value Value)

func (Float) Range

func (Float) Range(func(key string, value Value))

func (Float) Str

func (Float) Str() string

Str implements Value.

func (Float) String

func (v Float) String() string

String implements Value.

type Int

type Int int64

Int is an integer-typed value.

func (Int) Bool

func (Int) Bool() bool

func (Int) Equal

func (v Int) Equal(w Value) bool

func (Int) Float

func (Int) Float() float64

Float implements Value.

func (Int) Get

func (Int) Get(key string) Value

func (Int) Hash

func (v Int) Hash(h hash.Hash)

func (Int) Index

func (Int) Index(int) Value

func (Int) Int

func (v Int) Int() int64

Int implements Value.

func (Int) Kind

func (Int) Kind() Kind

Kind implements Value.

func (Int) Len

func (Int) Len() int

func (Int) Less

func (v Int) Less(w Value) bool

Less implements Value.

func (Int) Put

func (Int) Put(key string, value Value)

func (Int) Range

func (Int) Range(func(key string, value Value))

func (Int) Str

func (Int) Str() string

Str implements Value.

func (Int) String

func (v Int) String() string

String implements Value.

type Kind

type Kind int

Kind represents the kind of a value.

const (
	Integer Kind = iota
	Real
	Str
	Seq
	ValueDict
	Boolean
)

func (Kind) String

func (k Kind) String() string

type List

type List []Value

List is a list-typed value.

func (List) Bool

func (List) Bool() bool

func (List) Equal

func (l List) Equal(m Value) bool

func (List) Float

func (List) Float() float64

Float implements Value.

func (List) Get

func (List) Get(key string) Value

func (List) Hash

func (l List) Hash(h hash.Hash)

func (List) Index

func (l List) Index(i int) Value

Index returns the ith element of the list.

func (List) Int

func (List) Int() int64

Int implements Value.

func (List) Kind

func (List) Kind() Kind

Kind implements Value.

func (List) Len

func (l List) Len() int

Len returns the length of the list.

func (List) Less

func (l List) Less(m Value) bool

Less implements Value.

func (List) Put

func (List) Put(key string, value Value)

func (List) Range

func (List) Range(func(key string, value Value))

func (List) Str

func (List) Str() string

Str implements Value.

func (List) String

func (l List) String() string

String implements Value.

type Map

type Map struct {
	// contains filtered or unexported fields
}

Map implements an associative array between Diviner values and Go values. A map is not itself a Value.

func NewMap

func NewMap() *Map

NewMap returns a newly allocated Map.

func Trials

func Trials(ctx context.Context, db Database, study Study, states RunState) (*Map, error)

Trials queries the database db for all runs in the provided study, and returns a set of composite trials for each replicate of a value set. The returned map maps value sets to these composite trials.

Trial metrics are averaged across runs in the states as indicated by the provided run states; flags are set on the returned trials to indicate which replicates they comprise and whether any pending results were used.

TODO(marius): this is a reasonable approach for some metrics, but not for others. We should provide a way for users (e.g., as part of a study definition) to define their own means of computing composite metrics, e.g., by interpreting metrics from each run, or their outputs directly (e.g., predictions from an evaluation run).
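As an illustration of the averaging described above, the following Go sketch computes the per-metric mean across the runs of a trial. The function name average is hypothetical, and the treatment of metrics that only some runs report (each key is averaged over the runs that reported it) is an assumption rather than Diviner's documented behavior.

```go
package main

import "fmt"

// average returns the per-key mean over the given metric sets.
// Assumption: a key missing from a run does not count toward that key's mean.
func average(runs []map[string]float64) map[string]float64 {
	sums := make(map[string]float64)
	counts := make(map[string]int)
	for _, m := range runs {
		for k, v := range m {
			sums[k] += v
			counts[k]++
		}
	}
	avg := make(map[string]float64, len(sums))
	for k, s := range sums {
		avg[k] = s / float64(counts[k])
	}
	return avg
}

func main() {
	avg := average([]map[string]float64{
		{"loss": 0.25}, // replicate 0
		{"loss": 0.75}, // replicate 1
	})
	fmt.Println(avg["loss"]) // 0.5
}
```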

func (*Map) Get

func (m *Map) Get(key Value) (val interface{}, ok bool)

Get retrieves the value associated with the key given by the provided value.

func (*Map) Len

func (m *Map) Len() int

Len returns the number of entries in the map.

func (*Map) Put

func (m *Map) Put(key Value, value interface{})

Put associates the value with the provided key. An existing entry for the key is overwritten.

func (*Map) Range

func (m *Map) Range(fn func(key Value, value interface{}))

Range iterates over all elements in the map.

func (*Map) String

func (m *Map) String() string

type Metric

type Metric struct {
	Name  string
	Value float64
}

A Metric is a single, named metric.

type Metrics

type Metrics map[string]float64

Metrics is a set of measurements output by black boxes. A subset of metrics may be used in the optimization objective, but the set of metrics may include others for diagnostic purposes.

func (Metrics) Equal

func (m Metrics) Equal(n Metrics) bool

func (*Metrics) Merge

func (m *Metrics) Merge(n Metrics)

Merge merges metrics n into m; values in n overwrite values in m.

func (Metrics) Sorted

func (m Metrics) Sorted() []Metric

Sorted returns the metrics in m sorted by name.
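The semantics of Merge and Sorted can be sketched on a stand-alone map type. The lower-case names below are local stand-ins, not the package's own code. Allocating a nil map inside merge is a guess at why Merge takes a pointer receiver.

```go
package main

import (
	"fmt"
	"sort"
)

// metrics and metric are local stand-ins for the package's Metrics and Metric.
type metrics map[string]float64

type metric struct {
	Name  string
	Value float64
}

// merge copies n into *m; values in n overwrite values in m.
// Assumption: a nil destination map is allocated on first use.
func (m *metrics) merge(n metrics) {
	if *m == nil {
		*m = make(metrics, len(n))
	}
	for k, v := range n {
		(*m)[k] = v
	}
}

// sorted returns the metrics ordered by name.
func (m metrics) sorted() []metric {
	out := make([]metric, 0, len(m))
	for k, v := range m {
		out = append(out, metric{k, v})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	var m metrics
	m.merge(metrics{"loss": 0.3, "acc": 0.9})
	m.merge(metrics{"loss": 0.2}) // the newer loss replaces the older one
	fmt.Println(m.sorted())       // [{acc 0.9} {loss 0.2}]
}
```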

type NamedParam

type NamedParam struct {
	// Name is the parameter's name.
	Name string
	// Param is the parameter.
	Param
}

NamedParam represents a named parameter.

type NamedValue

type NamedValue struct {
	Name string
	Value
}

A NamedValue is a value that is assigned a name.

type Objective

type Objective struct {
	// Direction indicates the direction (minimize, maximize) of the
	// optimization objective.
	Direction Direction
	// Metric names the metric to be optimized.
	Metric string
}

Objective is an optimization objective.

func (Objective) Freeze

func (Objective) Freeze()

Freeze implements starlark.Value.

func (Objective) Hash

func (Objective) Hash() (uint32, error)

Hash implements starlark.Value.

func (Objective) String

func (o Objective) String() string

String returns a textual description of the optimization objective.

func (Objective) Truth

func (Objective) Truth() starlark.Bool

Truth implements starlark.Value.

func (Objective) Type

func (Objective) Type() string

Type implements starlark.Value.

type Oracle

type Oracle interface {
	// Next returns the next n parameter values to run, given the
	// provided history of trials, the set of parameters and an
	// objective. If Next returns fewer than n trials, then the oracle
	// is exhausted.
	Next(previous []Trial, params Params, objective Objective, n int) ([]Values, error)
}

An Oracle is an optimization algorithm that picks a set of parameter values given a history of runs.

type Param

type Param interface {
	// Kind returns the kind of values encapsulated by this param.
	Kind() Kind

	// Values returns the set of values allowable by this parameter.
	// Nil is returned if the param's image is not finite.
	Values() []Value

	// Sample returns a Value from this param sampled by the provided
	// random number generator.
	Sample(r *rand.Rand) Value

	// IsValid tells whether the provided value is valid for this parameter.
	IsValid(value Value) bool

	// Params implement starlark.Value so they can be represented
	// directly in starlark configuration scripts.
	starlark.Value
}

A Param is a kind of parameter. Params determine the range of allowable values of an input.

type Params

type Params map[string]Param

Params stores a set of parameters under optimization.

func (Params) IsValid

func (p Params) IsValid(values Values) bool

IsValid returns whether the given set of values is a valid assignment of exactly the parameters in this Params.
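One reading of "exactly the parameters" is that the values must cover every parameter, contain no extra names, and each be allowable for its parameter. The Go sketch below checks those three conditions using simplified stand-in types (discrete, validAssignment); it is an interpretation, not the package's implementation.

```go
package main

import "fmt"

// discrete is a stand-in for a parameter with a finite set of allowed values.
type discrete []float64

// isValid reports whether v is one of the allowed values.
func (d discrete) isValid(v float64) bool {
	for _, allowed := range d {
		if v == allowed {
			return true
		}
	}
	return false
}

// validAssignment reports whether values assigns an allowed value to every
// parameter in params and names no parameter outside of params.
func validAssignment(params map[string]discrete, values map[string]float64) bool {
	if len(params) != len(values) {
		return false
	}
	for name, p := range params {
		v, ok := values[name]
		if !ok || !p.isValid(v) {
			return false
		}
	}
	return true
}

func main() {
	params := map[string]discrete{
		"epochs":     {1, 10, 50, 100},
		"batch_size": {32, 64, 128, 256},
	}
	fmt.Println(validAssignment(params, map[string]float64{"epochs": 10, "batch_size": 64})) // true
	fmt.Println(validAssignment(params, map[string]float64{"epochs": 10}))                   // false: batch_size missing
}
```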

func (Params) Sorted

func (p Params) Sorted() []NamedParam

Sorted returns the set of parameters sorted by name.

type Range

type Range struct {
	Start, End Value
}

Range is a parameter that is defined over a range of real numbers.

func NewRange

func NewRange(start, end Value) *Range

NewRange returns a range parameter representing the range of values [start, end).

func (*Range) Freeze

func (*Range) Freeze()

Freeze implements starlark.Value.

func (*Range) Hash

func (*Range) Hash() (uint32, error)

Hash implements starlark.Value.

func (*Range) IsValid

func (r *Range) IsValid(v Value) bool

IsValid tells whether the value v is inside the range r.

func (*Range) Kind

func (r *Range) Kind() Kind

Kind returns Real.

func (*Range) Sample

func (r *Range) Sample(rnd *rand.Rand) Value

Sample draws a random sample from within the range represented by this parameter.

func (*Range) String

func (r *Range) String() string

String returns a description of this range parameter.

func (*Range) Truth

func (*Range) Truth() starlark.Bool

Truth implements starlark.Value.

func (*Range) Type

func (*Range) Type() string

Type implements starlark.Value.

func (*Range) Values

func (r *Range) Values() []Value

Values returns nil for real ranges (they are infinite), and the set of values in an integer range.

type Replicates

type Replicates uint64

Replicates is a bitset of replicate numbers.

func (*Replicates) Clear

func (r *Replicates) Clear(rep int)

Clear clears the replicate number rep in the replicate set r.

func (Replicates) Completed

func (r Replicates) Completed(n int) bool

Completed reports whether n replicates have completed in the replicate set r.

func (Replicates) Contains

func (r Replicates) Contains(rep int) bool

Contains reports whether the replicate set r contains the replicate number rep.

func (Replicates) Count

func (r Replicates) Count() int

Count returns the number of replicates defined in the replicate set r.

func (Replicates) Next

func (r Replicates) Next() (int, Replicates)

Next iterates over replicates. It returns the first replicate in the set as well as the replicate set with that replicate removed. -1 is returned if the replicate set is empty.

Iteration can thus proceed:

var r Replicates
for num, r := r.Next(); num != -1; num, r = r.Next() {
	// Process num
}

func (*Replicates) Set

func (r *Replicates) Set(rep int)

Set sets the replicate number rep in the replicate set r.
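A bitset over a uint64 supports all of the operations above with a handful of bit manipulations. The following is a minimal sketch using a local stand-in type and math/bits. It assumes that bit i represents replicate i and that "first" means the lowest-numbered replicate, neither of which is confirmed by the documentation.

```go
package main

import (
	"fmt"
	"math/bits"
)

// replicates is a stand-in bitset; bit i is set when replicate i is in the set.
type replicates uint64

// set adds replicate rep to the set.
func (r *replicates) set(rep int) { *r |= 1 << uint(rep) }

// clear removes replicate rep from the set.
func (r *replicates) clear(rep int) { *r &^= 1 << uint(rep) }

// contains reports whether replicate rep is in the set.
func (r replicates) contains(rep int) bool { return r&(1<<uint(rep)) != 0 }

// count returns the number of replicates in the set.
func (r replicates) count() int { return bits.OnesCount64(uint64(r)) }

// next returns the lowest replicate in the set together with the set
// without it, or -1 if the set is empty.
func (r replicates) next() (int, replicates) {
	if r == 0 {
		return -1, r
	}
	n := bits.TrailingZeros64(uint64(r))
	r.clear(n) // r is a copy here, so the caller's set is unchanged
	return n, r
}

func main() {
	var r replicates
	r.set(0)
	r.set(3)
	for num, rest := r.next(); num != -1; num, rest = rest.next() {
		fmt.Println(num) // prints 0, then 3
	}
}
```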

type Run

type Run struct {
	// Values is the set of parameter values represented by this run.
	Values

	// Study is the name of the study serviced by this run.
	Study string
	// Seq is a sequence number assigned to each run in a study.
	// Together, the study and sequence number uniquely name
	// a run.
	Seq uint64

	// Replicate is the replicate of this run.
	Replicate int

	// State is the current state of the run. See RunState for
	// descriptions of these.
	State RunState
	// Status is a human-consumable status indicating the status
	// of the run.
	Status string

	// Config is the RunConfig for this run.
	Config RunConfig

	// Created is the time at which the run was created.
	Created time.Time
	// Updated is the last time the run's state was updated. Updated is
	// used as a keepalive mechanism.
	Updated time.Time
	// Runtime is the runtime duration of the run.
	Runtime time.Duration

	// Number of times the run was retried.
	Retries int

	// Metrics is the history of metrics, in the order reported by the
	// run.
	//
	// TODO(marius): include timestamps for these, or some other
	// reference (e.g., runtime).
	Metrics []Metrics
}

A Run describes a single run, which, upon successful completion, represents a Trial. Runs are managed by a Database.

func (Run) ID

func (r Run) ID() string

ID returns this run's identifier.
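The command-line examples in the README name runs as study:index (for example, mnist:9). Assuming ID follows the same convention, formatting and parsing such identifiers might look like the sketch below. formatID and parseID are hypothetical helpers, not part of the package.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatID renders a run identifier in the "study:seq" form used by the CLI.
func formatID(study string, seq uint64) string {
	return study + ":" + strconv.FormatUint(seq, 10)
}

// parseID splits a "study:seq" identifier at its last colon, so that study
// names which themselves contain colons still parse.
func parseID(id string) (study string, seq uint64, err error) {
	i := strings.LastIndex(id, ":")
	if i < 0 {
		return "", 0, fmt.Errorf("malformed run id %q", id)
	}
	seq, err = strconv.ParseUint(id[i+1:], 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("malformed run id %q: %v", id, err)
	}
	return id[:i], seq, nil
}

func main() {
	study, seq, err := parseID(formatID("mnist", 9))
	if err != nil {
		panic(err)
	}
	fmt.Println(study, seq) // mnist 9
}
```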

func (Run) Trial

func (r Run) Trial() Trial

Trial returns the Trial represented by this run.

TODO(marius): allow other metric selection policies (e.g., minimize train and test loss difference)

type RunConfig

type RunConfig struct {
	// Datasets contains the set of datasets required by this
	// run.
	Datasets []Dataset
	// Script is a script that should be interpreted by Bash.
	Script string
	// LocalFiles is a set of files (local to where diviner is run)
	// that should be made available in the script's environment.
	// These files are copied into the script's working directory,
	// retaining their basenames. (Thus the set of basenames in
	// the list should not collide.)
	LocalFiles []string

	// Systems identifies the list of systems where the run should be
	// performed. This can be used to schedule jobs with different kinds of
	// systems requirements. If len(Systems) > 1, each is tried until one of them
	// successfully allocates a machine.
	Systems []*System
}

A RunConfig describes how to perform a single run of a black box with a set of parameter values. Runs are defined by a bash script and a set of files that must be included in its run environment.

Black boxes emit metrics by printing to standard output lines that begin with "METRICS: ", followed by a set of comma-separated key-value pairs of metric values. Each metric must be a number. For example, the following line emits metrics for "acc" and "loss":

METRICS: acc=0.55,loss=12.3
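A parser for this line format can be sketched in a few lines of Go. The parseMetrics function is a hypothetical illustration of the format described above, not Diviner's own parser. Trimming whitespace around each pair is an added leniency.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const metricsPrefix = "METRICS: "

// parseMetrics parses a line of the form "METRICS: k1=v1,k2=v2".
// ok is false when the line is not a metrics line at all.
func parseMetrics(line string) (metrics map[string]float64, ok bool, err error) {
	if !strings.HasPrefix(line, metricsPrefix) {
		return nil, false, nil
	}
	metrics = make(map[string]float64)
	for _, pair := range strings.Split(strings.TrimPrefix(line, metricsPrefix), ",") {
		kv := strings.SplitN(strings.TrimSpace(pair), "=", 2)
		if len(kv) != 2 {
			return nil, true, fmt.Errorf("malformed metric %q", pair)
		}
		v, err := strconv.ParseFloat(kv[1], 64)
		if err != nil {
			return nil, true, fmt.Errorf("metric %q: %v", kv[0], err)
		}
		metrics[kv[0]] = v
	}
	return metrics, true, nil
}

func main() {
	m, ok, err := parseMetrics("METRICS: acc=0.55,loss=12.3")
	if !ok || err != nil {
		panic("expected a well-formed metrics line")
	}
	fmt.Println(m["acc"], m["loss"]) // 0.55 12.3
}
```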

TODO(marius): make this mechanism more flexible and less error prone.

TODO(marius): allow interpreters other than Bash.

func (RunConfig) Freeze

func (RunConfig) Freeze()

Freeze implements starlark.Value.

func (RunConfig) Hash

func (RunConfig) Hash() (uint32, error)

Hash implements starlark.Value.

func (RunConfig) String

func (c RunConfig) String() string

String returns a textual description of the run config.

func (RunConfig) Truth

func (RunConfig) Truth() starlark.Bool

Truth implements starlark.Value.

func (RunConfig) Type

func (RunConfig) Type() string

Type implements starlark.Value.

type RunState

type RunState int

RunState describes the current state of a particular run.

const (
	// Pending indicates that the run has not yet completed.
	Pending RunState = 1 << iota
	// Success indicates that the run has completed and represents
	// a successful trial.
	Success
	// Failure indicates that the run failed.
	Failure

	// Any contains all run states.
	Any = Pending | Success | Failure
)

func (RunState) String

func (s RunState) String() string

String returns a simple textual representation of a run state.
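Because each state occupies its own bit, several states can be combined into one query mask, which is presumably how the states argument of calls such as ListRuns is meant to be used. The following sketch mirrors the constants with local stand-in names. The matches helper is hypothetical.

```go
package main

import "fmt"

// runState mirrors the bit-flag layout of RunState using local names.
type runState int

const (
	pending runState = 1 << iota // 1
	success                      // 2
	failure                      // 4

	anyState = pending | success | failure
)

// matches reports whether the single state s is included in the query mask.
func matches(s, mask runState) bool { return s&mask != 0 }

func main() {
	query := pending | success // select runs that are pending or succeeded
	fmt.Println(matches(failure, query), matches(success, query), matches(failure, anyState))
	// prints: false true true
}
```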

type String

type String string

String is a string-typed value.

func (String) Bool

func (String) Bool() bool

func (String) Equal

func (v String) Equal(w Value) bool

func (String) Float

func (String) Float() float64

Float implements Value.

func (String) Get

func (String) Get(key string) Value

func (String) Hash

func (v String) Hash(h hash.Hash)

func (String) Index

func (String) Index(int) Value

func (String) Int

func (String) Int() int64

Int implements Value.

func (String) Kind

func (String) Kind() Kind

Kind implements Value.

func (String) Len

func (String) Len() int

func (String) Less

func (v String) Less(w Value) bool

Less implements Value.

func (String) Put

func (String) Put(key string, value Value)

func (String) Range

func (String) Range(func(key string, value Value))

func (String) Str

func (v String) Str() string

Str implements Value.

func (String) String

func (v String) String() string

String implements Value.

type Study

type Study struct {
	// Name is the name of the study.
	Name string
	// Objective is the objective to be maximized.
	Objective Objective
	// Params is the set of parameters accepted by this
	// study.
	Params Params

	// Replicates is the number of additional replicates required for
	// each trial in the study.
	Replicates int

	// Human-readable description of the study.
	Description string

	// Oracle is the oracle used to pick parameter values.
	Oracle Oracle `json:"-"` // TODO(marius): encode oracle name/type/params?

	// Run is called with a set of Values (i.e., a concrete
	// instantiation of values in the ranges as indicated by the black
	// box parameters defined above); it produces a run configuration
	// which is then used to conduct a trial of these parameter values.
	// The run's replicate number is passed in. (This may be used to,
	// e.g., select a model fold.) Parameter id is a unique id for the
	// run (vis-a-vis diviner's database). It may be used to name data
	// and other external resources associated with the run.
	Run func(vals Values, replicate int, id string) (RunConfig, error) `json:"-"`

	// Acquire returns the metrics associated with the set of
	// parameter values that are provided. It is used to support
	// (Go) native trials. Arguments are as in Run.
	Acquire func(vals Values, replicate int, id string) (Metrics, error) `json:"-"`
}

A Study is an experiment comprising a black box, optimization objective, and an oracle responsible for generating trials. Trials can either be managed by Diviner (through Run), or else handled natively within Go (through Acquire). Either Run or Acquire must be defined.

func (Study) Freeze

func (Study) Freeze()

Freeze implements starlark.Value.

func (Study) Hash

func (Study) Hash() (uint32, error)

Hash implements starlark.Value.

func (Study) String

func (s Study) String() string

String returns a textual description of the study.

func (Study) Truth

func (Study) Truth() starlark.Bool

Truth implements starlark.Value.

func (Study) Type

func (Study) Type() string

Type implements starlark.Value.

type System

type System struct {
	// System is the bigmachine system configured by this system.
	bigmachine.System
	// ID is a unique identifier for this system.
	ID string
	// Parallelism specifies the maximum level of job parallelism allowable for
	// this system. If <= 0, the system allows unlimited parallelism.
	Parallelism int
	// Bash snippet to be prepended to the user script.
	// If empty, runner.DefaultPreamble is used.
	Preamble string
}

A System describes a configuration of a machine. It is part of SystemPool.

func (*System) Freeze

func (*System) Freeze()

Freeze implements starlark.Value.

func (*System) Hash

func (s *System) Hash() (uint32, error)

Hash implements starlark.Value.

func (*System) String

func (s *System) String() string

String implements starlark.Value.

func (*System) Truth

func (*System) Truth() starlark.Bool

Truth implements starlark.Value.

func (*System) Type

func (*System) Type() string

Type implements starlark.Value.

type Trial

type Trial struct {
	// Values is the set of parameter values used for the run.
	Values Values
	// Metrics is the metrics produced by the black box during
	// the run.
	Metrics Metrics

	// Pending indicates whether this is a pending trial. Pending trials
	// may have incomplete or non-final metrics.
	Pending bool

	// Replicates contains the set of completed replicates
	// comprising this trial. Replicates are stored in a bitset.
	Replicates Replicates

	// ReplicateMetrics breaks down metrics for each underlying replicate.
	// Valid only for trials that comprise multiple replicates.
	ReplicateMetrics map[int]Metrics

	// Runs stores the set of runs comprised by this trial.
	Runs []Run
}

A Trial is the result of a single run of a black box.

func ReplicatedTrial

func ReplicatedTrial(replicates []Trial) Trial

ReplicatedTrial constructs a single trial from the provided trials. The composite trial represents each replicate present in the provided replicates. Metrics are averaged. The provided trials cannot themselves contain multiple replicates.
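
As an illustration of the averaging step only, here is a minimal sketch. The Metrics type is a stand-in that assumes metrics map names to float64 values, and averageMetrics is a hypothetical helper. How the package treats a metric that is missing from some replicates is not stated above; this sketch averages over the replicates that report it.

```go
package main

import "fmt"

// Metrics is a stand-in for diviner's metrics type, assumed here to map
// metric names to numeric values.
type Metrics map[string]float64

// averageMetrics is a hypothetical helper. It averages each metric over
// the replicates that report it.
func averageMetrics(replicates []Metrics) Metrics {
	sums := make(Metrics)
	counts := make(map[string]int)
	for _, m := range replicates {
		for name, v := range m {
			sums[name] += v
			counts[name]++
		}
	}
	// Turn the per-metric sums into means in place.
	for name := range sums {
		sums[name] /= float64(counts[name])
	}
	return sums
}

func main() {
	avg := averageMetrics([]Metrics{
		{"acc": 0.5, "loss": 2},
		{"acc": 1.0, "loss": 4},
	})
	fmt.Println(avg["acc"], avg["loss"]) // 0.75 3
}
```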

func (Trial) Equal

func (t Trial) Equal(u Trial) bool

Equal reports whether the two trials are equal.

func (Trial) Range

func (t Trial) Range(name string) (min, max float64)

Range returns the range of the provided metric.

func (Trial) Timestamp

func (t Trial) Timestamp() time.Time

Timestamp returns the latest time at which any run comprising this trial was done.

type Value

type Value interface {
	// String returns a textual description of the parameter value.
	String() string

	// Kind returns the kind of this value.
	Kind() Kind

	// Equal tells whether two values are equal. Values of different
	// kinds are never equal.
	Equal(Value) bool

	// Less returns true if the value is less than the provided value.
	// Less is defined only for values of the same type.
	Less(Value) bool

	// Float returns the floating point value of float-typed values.
	Float() float64

	// Int returns the integer value of integer-typed values.
	Int() int64

	// Str returns the string of string-typed values.
	Str() string

	// Bool returns the boolean of boolean-typed values.
	Bool() bool

	// Len returns the length of a sequence value.
	Len() int

	// Index returns the value at an index of a sequence.
	Index(i int) Value

	// Hash adds the value's hash to the provided hasher.
	Hash(h hash.Hash)
}

Value is the type of parameter values. Values must be directly comparable.

func Zero

func Zero(kind Kind) Value

Zero returns the zero value for the provided kind.

type Values

type Values map[string]Value

Values is a sorted set of named values, used as a concrete instantiation of a set of parameters.

Note: the representation of Values is not optimal. Ideally we would store it as a sorted list of NamedValues. However, backwards compatibility (by way of gob) forces our hand here.
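
Go map iteration order is unspecified, so a map-backed set needs an explicit sort to produce stable output. The sketch below shows one way to do that. The types values and namedValue are simplified stand-ins with string values, and the name=value output format is an assumption; neither reflects the package's actual definitions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// namedValue and values are simplified stand-ins for illustration; the
// field names and the string value type are assumptions.
type namedValue struct {
	Name  string
	Value string
}

type values map[string]string

// sorted returns the entries of v ordered by name.
func (v values) sorted() []namedValue {
	out := make([]namedValue, 0, len(v))
	for name, val := range v {
		out = append(out, namedValue{Name: name, Value: val})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

// String renders the entries in sorted order, so the result does not
// depend on map iteration order.
func (v values) String() string {
	parts := make([]string, 0, len(v))
	for _, nv := range v.sorted() {
		parts = append(parts, nv.Name+"="+nv.Value)
	}
	return strings.Join(parts, ",")
}

func main() {
	v := values{"lr": "0.01", "batch_size": "64", "optimizer": "adam"}
	fmt.Println(v) // batch_size=64,lr=0.01,optimizer=adam
}
```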

func (Values) Bool

func (Values) Bool() bool

func (Values) Equal

func (v Values) Equal(wv Value) bool

func (Values) Float

func (Values) Float() float64

func (Values) Hash

func (v Values) Hash(h hash.Hash)

func (Values) Index

func (Values) Index(i int) Value

func (Values) Int

func (Values) Int() int64

func (Values) Kind

func (Values) Kind() Kind

func (Values) Len

func (v Values) Len() int

func (Values) Less

func (Values) Less(Value) bool

func (Values) Sorted

func (v Values) Sorted() []NamedValue

func (Values) Str

func (Values) Str() string

func (Values) String

func (v Values) String() string

String returns a (stable) textual description of the value set.

Directories

Path Synopsis
cmd
diviner
Diviner is a black-box optimization framework that uses Bigmachine to distribute a large number of computationally expensive trials across clusters of machines.
Package dydb implements a diviner.Database on top of dynamodb and the AWS cloudwatch logs storage.
dynamoattr
Package dynamoattr provides functions for marshaling and unmarshaling Go values to and from DynamoDB items (map[string]*dynamodb.AttributeValue).
Package localdb implements a diviner database on the local file system using boltdb.
Package oracle includes pre-defined oracles for picking parameter values.
Package runner provides a simple parallel cluster runner for diviner studies.
Package script implements scripting support for defining Diviner studies through Starlark [1].
