batch

package
v0.0.0-...-90deddd Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 18, 2023 License: Apache-2.0 Imports: 26 Imported by: 1

Documentation

Overview

Package batch implements support for running batches of reflow (stateful) evaluations. The user specifies a reflow program and enumerates each run by way of parameters specified in a CSV file. The batch runner then takes care of the rest.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Batch

type Batch struct {
	// Dir is the batch directory in which configuration, state, and log files
	// are stored.
	Dir string
	// ConfigFilename is the name of the configuration file (relative to Dir)
	ConfigFilename string
	// Rundir is the directory where run state files are stored.
	Rundir string
	// User is the user running the batch; batch runs are named using
	// this value.
	User string
	// Args are additional command line arguments (parsed as flags).
	// They override any supplied in the batch configuration.
	Args []string

	// Status is the status group used to report batch status;
	// individual run statuses are reported as tasks in the group.
	Status *status.Group

	// BatchState is the state of the current batch.
	BatchState

	flow.EvalConfig

	// Limiter is a rate limiter to control the number of parallel evaluations.
	// This can be used to prevent "thundering herds" against systems like S3.
	// Limiterr should be set prior to running the batch.
	Limiter *limiter.Limiter
	// contains filtered or unexported fields
}

Batch represents a batch of reflow evaluations. It manages setting up runs and instantiating them; subsequently it manages the run state of the batch (through state.File) so that batches are resumable across process invocations.

Batches assume they have a (user provided) directory in which its configuration, state, and log files are stored.

func (*Batch) Close

func (b *Batch) Close()

Close releases resources held by the batch.

func (*Batch) Init

func (b *Batch) Init(reset bool, retry bool) error

Init initializes a batch. If reset is set to true, then previously saved state is discarded. If retry is set, then only failed runs in the batch are retried. Init also upgrades old state files.

func (*Batch) ReadState

func (b *Batch) ReadState() error

ReadState populates the current state of the batch run.

func (*Batch) Run

func (b *Batch) Run(ctx context.Context) error

Run runs the batch until completion, too many errors, or context completion. Run reports batch progress to the batch's logger every 10 seconds.

type BatchState

type BatchState struct {
	// ID is the batch identifier. It uniquely identifies a batch (program and the contents of the batch).
	ID digest.Digest
	// Runs is the set of runs managed by this batch.
	Runs map[string]*Run
}

BatchState identifies a batch. It has a unique identifier based on the program and the batch being run. It also includes the state of the runs in the batch.

type Run

type Run struct {
	// ID is the run's identifier. IDs must be unique inside of a batch.
	ID string
	// Program is the path of the Reflow program to be evaluated.
	Program string
	// Args contains the run's parameters.
	Args map[string]string
	// Argv contains the run's argument vector.
	Argv []string

	// RunID is the global run ID for this run.
	RunID taskdb.RunID

	// State stores the runner state of this run.
	State runner.State `json:"-"`

	// Status receives status updates from batch execution.
	Status *status.Task
	// contains filtered or unexported fields
}

A Run comprises the state for an individual run. Runs are serialized and can be restored across process invocations. Run is mostly deprecated in favor of runner.State, but we still maintain the old fields so that we can upgrade old serialized states.

TODO(marius): clean this up when it's safe to do so.

func (*Run) Go

func (r *Run) Go(ctx context.Context, initWG *sync.WaitGroup) error

Go runs the run according to its state. If the run was previously started, Go resumes it. Go returns on completion, error, or when the context is done. initWG is used to coordinate multiple runs. Runs that require new allocs wait for initWG while those that already have allocs reclaim them and then call initWG.Done. This allows the batch to ensure that we don't collect allocs that are potentially useful.

Go commits the run state at every phase transition so that progress should never be lost.

type State

type State int

State tells the state of an individual batch run.

const (
	// StateInit indicates that the run is initializing.
	StateInit State = iota
	// StateRunning indicates that the run is currently evaluating.
	StateRunning
	// StateDone indicates that the run has completed without a runtime error.
	StateDone
	// StateError indicates that the attempted run errored out with a runtime error.
	StateError
)

func (State) Name

func (s State) Name() string

Name returns an abbreviated name for a state.

func (State) String

func (i State) String() string

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL