local

package
v0.0.0-...-90deddd
Warning

This package is not in the latest version of its module.

Published: Oct 18, 2023 License: Apache-2.0 Imports: 55 Imported by: 3

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Executor

type Executor struct {
	// RunID of the run - <username>@grailbio.com/<hash>
	RunID string
	// ID is the ID of the executor. It is the URI of the executor and also
	// the prefix used in any Docker containers whose exec's are
	// children of this executor.
	ID string
	// Prefix is the filesystem prefix used to access paths on disk. This is
	// defined so that the executor can run inside of a Docker container
	// (which has the host's filesystem exported at this prefix).
	Prefix string
	// Dir is the root directory of this executor. All of its state is contained
	// within it.
	Dir string
	// Client is the Docker client used by this executor.
	Client *docker.Client
	// Authenticator is used to pull images that are stored on Amazon's ECR
	// service.
	Authenticator ecrauth.Interface
	// AWSCreds is an AWS credentials provider, used for "$aws" passthroughs.
	AWSCreds *credentials.Credentials
	// Log is this executor's logger where operational status is printed.
	Log *log.Logger

	// FileRepository is the (file-based) object repository used by this
	// Executor. It may be provided by the user, or else it is set to a
	// default implementation when (*Executor).Start is called.
	FileRepository *filerepo.Repository

	// HardMemLimit restricts an exec's memory limit to the exec's resource requirements
	HardMemLimit bool

	Blob blob.Mux

	// NodeOomDetector is an OOM detector based on node metrics.
	NodeOomDetector OomDetector

	// IntegrityErrSignal is a channel for signaling an integrity issue with
	// the EC2 instance's EBS volume(s). The signal is sent by this Executor
	// if a file fails integrity verification in Load or VerifyIntegrity.
	IntegrityErrSignal chan struct{}

	// SaveLogsToRepo determines whether execs used by this Executor save
	// their raw stdout/stderr logs during Exec.RunInfo.
	SaveLogsToRepo bool
	// contains filtered or unexported fields
}

Executor is a small management layer on top of exec. It implements reflow.Executor. Executor assumes that it has local access to the file system (perhaps with a prefix).

Executor stores its state to disk and, when recovered, re-instantiates all execs (which in turn recover).

func (*Executor) Execs

func (e *Executor) Execs(ctx context.Context) ([]reflow.Exec, error)

Execs returns all execs managed by this executor.

func (*Executor) Get

func (e *Executor) Get(ctx context.Context, id digest.Digest) (reflow.Exec, error)

Get returns the exec named id, or an errors.NotExist error if the exec does not exist.

func (*Executor) Kill

func (e *Executor) Kill(ctx context.Context) error

Kill disposes of the executor and all of its execs. It also sets the executor's "dead" flag, so that all future operations on the executor return an error.

func (*Executor) Load

func (e *Executor) Load(ctx context.Context, repo *url.URL, fs reflow.Fileset) (reflow.Fileset, error)

Load loads the fileset into the executor repository. If the fileset is resolved, it is loaded from the specified backing repository. Otherwise, it is loaded from its source.

func (*Executor) Put

Put idempotently defines a new exec with a given ID and config. The exec may be (deterministically) rewritten.

func (*Executor) Remove

func (e *Executor) Remove(ctx context.Context, id digest.Digest) error

Remove removes the exec named id.

func (*Executor) Repository

func (e *Executor) Repository() reflow.Repository

Repository returns the repository attached to this executor.

func (*Executor) Resources

func (e *Executor) Resources() reflow.Resources

Resources reports the total capacity of this executor.

func (*Executor) SetResources

func (e *Executor) SetResources(r reflow.Resources)

SetResources sets the resources reported by Resources() to r.

func (*Executor) Start

func (e *Executor) Start() error

Start initializes the executor and recovers previously stored state. It re-initializes all stored execs.

func (*Executor) URI

func (e *Executor) URI() string

URI returns the executor's ID.

func (*Executor) Unload

func (e *Executor) Unload(ctx context.Context, fs reflow.Fileset) error

Unload unloads the fileset from the executor repository. When the fileset's reference count drops to zero, the executor may choose to remove the fileset from its repository.
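The reference-counting behavior described for Unload can be illustrated with a minimal sketch. The `refRepo` type, its methods, and the string IDs are hypothetical and are not part of this package; the sketch also removes an entry as soon as its count reaches zero, whereas the executor only "may choose" to do so.

```go
package main

import (
	"fmt"
	"sync"
)

// refRepo is a hypothetical reference-counted store keyed by object ID.
// It is not the package's repository type.
type refRepo struct {
	mu   sync.Mutex
	refs map[string]int
}

// load records one more reference to id.
func (r *refRepo) load(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.refs[id]++
}

// unload drops one reference to id and reports whether the entry was
// removed because its count reached zero.
func (r *refRepo) unload(id string) (bool, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	n, ok := r.refs[id]
	if !ok {
		return false, fmt.Errorf("unload %q: not loaded", id)
	}
	if n == 1 {
		delete(r.refs, id)
		return true, nil
	}
	r.refs[id] = n - 1
	return false, nil
}

// demo loads the same ID twice and unloads it twice, returning whether
// each unload removed the entry.
func demo() (first, second bool) {
	r := &refRepo{refs: make(map[string]int)}
	r.load("fileset-a")
	r.load("fileset-a")
	first, _ = r.unload("fileset-a")
	second, _ = r.unload("fileset-a")
	return first, second
}

func main() {
	first, second := demo()
	fmt.Println(first, second)
}
```

Only the second unload removes the entry, because the first still leaves one outstanding reference.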

func (*Executor) VerifyIntegrity

func (e *Executor) VerifyIntegrity(ctx context.Context, fs reflow.Fileset) error

VerifyIntegrity verifies the integrity of the given set of files.
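The interaction between an integrity check and IntegrityErrSignal might look roughly like the following sketch. The `verify` function, the use of SHA-256 hex digests, and the non-blocking send are all assumptions made for illustration; the source does not say how the package computes digests or sends the signal.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// verify is a hypothetical check: it hashes data, compares the result with
// the expected hex digest, and on mismatch signals without blocking.
func verify(data []byte, want string, signal chan<- struct{}) error {
	sum := sha256.Sum256(data)
	got := hex.EncodeToString(sum[:])
	if got == want {
		return nil
	}
	// Non-blocking send (an assumption): if a signal is already pending,
	// a second one adds no information.
	select {
	case signal <- struct{}{}:
	default:
	}
	return fmt.Errorf("integrity error: got digest %s, want %s", got, want)
}

// demo verifies one intact and one corrupted payload against the same
// digest and reports both results plus whether a signal was sent.
func demo() (okErr, badErr error, signaled bool) {
	data := []byte("hello")
	sum := sha256.Sum256(data)
	want := hex.EncodeToString(sum[:])

	signal := make(chan struct{}, 1)
	okErr = verify(data, want, signal)
	badErr = verify([]byte("hellp"), want, signal)
	select {
	case <-signal:
		signaled = true
	default:
	}
	return okErr, badErr, signaled
}

func main() {
	okErr, badErr, signaled := demo()
	fmt.Println(okErr, badErr != nil, signaled)
}
```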

type Manifest

type Manifest struct {
	Type  execType
	State execState
	PID   int

	Created time.Time

	Result    reflow.Result
	Config    reflow.ExecConfig   // The object config used to create this object.
	Docker    types.ContainerJSON // Docker inspect output.
	Resources reflow.Resources
	Stats     stats
	Gauges    reflow.Gauges
}

Manifest stores the state of an exec. It is serialized to JSON and stored on disk so that executors are restartable, and can recover from crashes.
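As a rough illustration of the save-and-recover cycle described above, the sketch below writes a manifest-like struct to disk as JSON and reads it back. The `manifest` struct is a hypothetical, trimmed-down stand-in (the real Manifest has more fields and uses package-private types), and the file name and helper functions are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// manifest is a hypothetical, simplified stand-in for Manifest.
type manifest struct {
	State   string
	PID     int
	Created time.Time
}

// save serializes m to JSON and writes it to path.
func save(path string, m manifest) error {
	b, err := json.Marshal(m)
	if err != nil {
		return err
	}
	return os.WriteFile(path, b, 0o644)
}

// load reads the JSON file at path back into a manifest.
func load(path string) (manifest, error) {
	var m manifest
	b, err := os.ReadFile(path)
	if err != nil {
		return m, err
	}
	err = json.Unmarshal(b, &m)
	return m, err
}

// roundTrip saves a manifest with the given PID in a temporary directory
// and returns the PID recovered from disk.
func roundTrip(pid int) (int, error) {
	dir, err := os.MkdirTemp("", "manifest")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "manifest.json")
	if err := save(path, manifest{State: "running", PID: pid, Created: time.Now()}); err != nil {
		return 0, err
	}
	m, err := load(path)
	return m.PID, err
}

func main() {
	pid, err := roundTrip(1234)
	fmt.Println(pid, err)
}
```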

type OomDetector

type OomDetector interface {
	// Oom returns whether an OOM occurred for the given process ID within the given time range,
	// and a string with an explanation of why (if true) an OOM occurrence determination was made.
	// If pid is unspecified (ie, -1), then implementations can make a "possible OOM" determination.
	Oom(pid int, start, end time.Time) (bool, string)
}

OomDetector detects if an OOM has occurred.
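A minimal sketch of an implementation may help clarify the contract, including the pid == -1 case. The interface is redeclared locally so the example is self-contained; `recordedOoms`, `oomEvent`, and `detect` are hypothetical names, and a real detector would consult kernel logs or node metrics rather than a fixed list.

```go
package main

import (
	"fmt"
	"time"
)

// OomDetector mirrors the interface documented above.
type OomDetector interface {
	Oom(pid int, start, end time.Time) (bool, string)
}

// oomEvent is a hypothetical record of one OOM kill.
type oomEvent struct {
	pid int
	at  time.Time
}

// recordedOoms is a hypothetical detector backed by a fixed list of events.
type recordedOoms []oomEvent

// Oom reports whether an event for pid falls within [start, end]. When pid
// is -1, any event in the range counts as a "possible OOM".
func (r recordedOoms) Oom(pid int, start, end time.Time) (bool, string) {
	for _, ev := range r {
		if ev.at.Before(start) || ev.at.After(end) {
			continue
		}
		if pid == -1 {
			return true, fmt.Sprintf("possible OOM: pid %d was killed at %s", ev.pid, ev.at.Format(time.RFC3339))
		}
		if ev.pid == pid {
			return true, fmt.Sprintf("pid %d was killed at %s", ev.pid, ev.at.Format(time.RFC3339))
		}
	}
	return false, ""
}

// detect asks a detector holding a single event for pid 42 whether the
// given pid was OOM-killed within an hour-long window around that event.
func detect(pid int) bool {
	base := time.Date(2023, 10, 18, 12, 0, 0, 0, time.UTC)
	var d OomDetector = recordedOoms{{pid: 42, at: base.Add(time.Minute)}}
	oom, _ := d.Oom(pid, base, base.Add(time.Hour))
	return oom
}

func main() {
	fmt.Println(detect(42), detect(7), detect(-1))
}
```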

type Pool

type Pool struct {
	pool.ResourcePool
	// Dir is the filesystem root of the pool. Everything under this
	// path is assumed to be owned and managed by the pool.
	Dir string
	// Prefix is prepended to paths constructed by allocs. This is to
	// permit running the pool manager inside of a Docker container.
	Prefix string
	// Client is the Docker client. We assume that the Docker daemon
	// runs on the same host from which the pool is managed.
	Client *docker.Client
	// Authenticator is used to authenticate ECR image pulls.
	Authenticator interface {
		Authenticates(ctx context.Context, image string) (bool, error)
		Authenticate(ctx context.Context, cfg *types.AuthConfig) error
	}
	// AWSCreds is a credentials provider used to mint AWS credentials.
	// They are used to access AWS services.
	AWSCreds *credentials.Credentials
	// Session is the AWS session to use for AWS API calls.
	Session *session.Session
	// Blob is the blob store implementation used to fetch data from interns.
	Blob blob.Mux

	// TaskDBPoolId is the identifier of this Pool in TaskDB
	TaskDBPoolId reflow.StringDigest
	TaskDB       taskdb.TaskDB

	// Log
	Log *log.Logger

	HardMemLimit bool

	// NodeOomDetector is an OOM detector based on node metrics.
	NodeOomDetector OomDetector

	// IntegrityErrSignal is a channel for signaling an integrity issue with
	// the EC2 instance's EBS volume(s). The signal is sent by any of the
	// Pool's Executors as a result of a file integrity error.
	IntegrityErrSignal chan struct{}
}

Pool implements a resource pool on top of a Docker client. The pool itself must run on the same machine as the Docker instance as it performs local filesystem operations that must be reflected inside the container.

Pool keeps all state on disk, as follows:

Prefix/Dir/state.json
	Stores the set of currently active allocs, together with their
	resource requirements.

Prefix/Dir/allocs/<id>/
	The root directory for the alloc with id. The state under
	this directory is managed by an executor instance.
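The on-disk layout above can be expressed as two small path helpers. This is only a sketch: `statePath` and `allocDir` are hypothetical names, and the package may construct these paths differently.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// statePath returns the location of the pool's state file,
// following the Prefix/Dir/state.json layout.
func statePath(prefix, dir string) string {
	return filepath.Join(prefix, dir, "state.json")
}

// allocDir returns the root directory for the alloc with the given id,
// following the Prefix/Dir/allocs/<id>/ layout.
func allocDir(prefix, dir, id string) string {
	return filepath.Join(prefix, dir, "allocs", id)
}

func main() {
	// Example values only: a host filesystem exported at /host inside a
	// container, with the pool rooted at /mnt/data/reflow.
	fmt.Println(statePath("/host", "/mnt/data/reflow"))
	fmt.Println(allocDir("/host", "/mnt/data/reflow", "abc123"))
}
```

With an empty prefix the helpers reduce to paths under Dir alone, which corresponds to running the pool manager directly on the host.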

func (*Pool) Alloc

func (p *Pool) Alloc(ctx context.Context, id string) (pool.Alloc, error)

Alloc looks up an alloc by ID.

func (*Pool) Kill

func (p *Pool) Kill(a pool.Alloc) error

Kill implements `pool.AllocManager` and kills the underlying alloc.

func (*Pool) MaintainTaskDBRow

func (p *Pool) MaintainTaskDBRow(ctx context.Context)

MaintainTaskDBRow maintains the taskdb row corresponding to this pool (if applicable). If this pool has a taskdb implementation and a PoolID set, MaintainTaskDBRow blocks until the given context is done; otherwise it returns immediately. The taskdb row is expected to already exist: MaintainTaskDBRow only updates its Resources and maintains the keepalive until ctx is cancelled, and then updates the End time of the row. MaintainTaskDBRow panics if called on a Pool with no resources (i.e., the pool must have been started).
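The blocking keepalive pattern described here can be sketched with a ticker and a context. The `maintain` function and its callback parameters are hypothetical; the actual taskdb calls and keepalive interval are not shown in this documentation.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// maintain is a hypothetical keepalive loop: it calls keepalive on every
// tick until ctx is done, then calls end once and returns.
func maintain(ctx context.Context, interval time.Duration, keepalive, end func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			end()
			return
		case <-ticker.C:
			keepalive()
		}
	}
}

// run drives maintain with a short-lived context and reports how many
// keepalives were sent and whether the end callback ran.
func run() (keepalives int, ended bool) {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	maintain(ctx, 5*time.Millisecond,
		func() { keepalives++ },
		func() { ended = true },
	)
	return keepalives, ended
}

func main() {
	n, ended := run()
	fmt.Println(n, ended)
}
```

Because such a loop blocks until cancellation, a caller would typically run it in its own goroutine.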

func (*Pool) Name

func (p *Pool) Name() string

Name implements `pool.AllocManager` and always returns "local".

func (*Pool) New

func (p *Pool) New(ctx context.Context, id string, meta pool.AllocMeta, keepalive time.Duration, existing []pool.Alloc) (pool.Alloc, error)

New implements `pool.AllocManager`. New creates a new alloc with the given id, alloc meta, and initial keepalive. The list of other existing allocs is provided to enable atomic saving of the state of all allocs.

func (*Pool) Resources

func (p *Pool) Resources() reflow.Resources

func (*Pool) Start

func (p *Pool) Start(expectedUsableMemBytes int64) error

Start starts the pool. If the pool has a state snapshot, Start restores the pool's previous state. Start also makes sure that all zombie allocs are collected. expectedUsableMemBytes is the memory expected to be usable on this pool; if the available memory is less than this, Start returns an error. expectedUsableMemBytes can be set to a small number (e.g., zero) to signify that there is no specific expectation.
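The memory expectation can be read as a simple threshold check, sketched below. `checkUsableMem` is a hypothetical helper; how Start actually measures available memory, and whether it applies any tolerance, is not stated here.

```go
package main

import "fmt"

// checkUsableMem is a hypothetical version of the check: it fails when the
// available memory is below the expected amount.
func checkUsableMem(availableBytes, expectedBytes int64) error {
	if availableBytes < expectedBytes {
		return fmt.Errorf("available memory (%d bytes) is less than expected (%d bytes)",
			availableBytes, expectedBytes)
	}
	return nil
}

func main() {
	// An expectation of zero always passes, matching "no specific expectation".
	fmt.Println(checkUsableMem(8<<30, 0))
	// Expecting 8 GiB on a node with 4 GiB available fails.
	fmt.Println(checkUsableMem(4<<30, 8<<30))
}
```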

Directories

Path Synopsis
Package pooltest tests pools.
Package testutil provides utilities for testing code that involves pools.
