worker

package
v0.0.0-...-1d6d2ab Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 24, 2024 License: AGPL-3.0, Apache-2.0, CC-BY-SA-3.0 Imports: 26 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type BootOutcome

type BootOutcome string

BootOutcome is the result of a worker boot. It is used as a label in a metric.

const (
	BootOutcomeFailed      BootOutcome = "failure"
	BootOutcomeSucceeded   BootOutcome = "success"
	BootOutcomeAborted     BootOutcome = "aborted"
	BootOutcomeDisappeared BootOutcome = "disappeared"
)

type Executor

type Executor interface {
	// Run cmd on the current target.
	Execute(env map[string]string, cmd string, stdin io.Reader) (stdout, stderr []byte, err error)

	// Use the given target for subsequent operations. The new
	// target is the same host as the previous target, but it
	// might return a different address and verify a different
	// host key.
	//
	// SetTarget is called frequently, and in most cases the new
	// target will behave exactly the same as the old one. An
	// implementation should optimize accordingly.
	//
	// SetTarget must not block on concurrent Execute calls.
	SetTarget(cloud.ExecutorTarget)

	Close()
}

An Executor executes shell commands on a remote host.

type IdleBehavior

type IdleBehavior string

IdleBehavior indicates the behavior desired when a node becomes idle.

const (
	IdleBehaviorRun   IdleBehavior = "run"   // run containers, or shutdown on idle timeout
	IdleBehaviorHold  IdleBehavior = "hold"  // don't shutdown or run more containers
	IdleBehaviorDrain IdleBehavior = "drain" // shutdown immediately when idle
)

type InstanceView

type InstanceView struct {
	Instance             cloud.InstanceID `json:"instance"`
	Address              string           `json:"address"`
	Price                float64          `json:"price"`
	ArvadosInstanceType  string           `json:"arvados_instance_type"`
	ProviderInstanceType string           `json:"provider_instance_type"`
	LastContainerUUID    string           `json:"last_container_uuid"`
	LastBusy             time.Time        `json:"last_busy"`
	WorkerState          string           `json:"worker_state"`
	IdleBehavior         IdleBehavior     `json:"idle_behavior"`
}

An InstanceView shows a worker's current state and recent activity.

type Pool

type Pool struct {
	// contains filtered or unexported fields
}

Pool is a resizable worker pool backed by a cloud.InstanceSet. A zero Pool should not be used. Call NewPool to create a new Pool.

func NewPool

func NewPool(logger logrus.FieldLogger, arvClient *arvados.Client, reg *prometheus.Registry, instanceSetID cloud.InstanceSetID, instanceSet cloud.InstanceSet, newExecutor func(cloud.Instance) Executor, installPublicKey ssh.PublicKey, cluster *arvados.Cluster) *Pool

NewPool creates a Pool of workers backed by instanceSet.

New instances are configured and set up according to the given cluster configuration.

func (*Pool) AtCapacity

func (wp *Pool) AtCapacity(it arvados.InstanceType) bool

AtCapacity returns true if Create() is currently expected to fail for the given instance type.

func (*Pool) AtQuota

func (wp *Pool) AtQuota() bool

AtQuota returns true if Create is not expected to work at the moment (e.g., cloud provider has reported quota errors, or we are already at our own configured quota).

func (*Pool) CheckHealth

func (wp *Pool) CheckHealth() error

func (*Pool) CountWorkers

func (wp *Pool) CountWorkers() map[State]int

CountWorkers returns the current number of workers in each state.

CountWorkers blocks, if necessary, until the initial instance list has been loaded from the cloud provider.

func (*Pool) Create

func (wp *Pool) Create(it arvados.InstanceType) bool

Create a new instance with the given type, and add it to the worker pool. The worker is added immediately; instance creation runs in the background.

Create returns false if a pre-existing error or a configuration setting prevents it from even attempting to create a new instance. Those errors are logged by the Pool, so the caller does not need to log anything in such cases.

func (*Pool) ForgetContainer

func (wp *Pool) ForgetContainer(uuid string)

ForgetContainer clears the placeholder for the given exited container, so it isn't returned by subsequent calls to Running().

ForgetContainer has no effect if the container has not yet exited.

The "container exited at time T" placeholder (which necessitates ForgetContainer) exists to make it easier for the caller (scheduler) to distinguish a container that exited without finalizing its state from a container that exited too recently for its final state to have appeared in the scheduler's queue cache.

func (*Pool) Instances

func (wp *Pool) Instances() []InstanceView

Instances returns an InstanceView for each worker in the pool, summarizing its current state and recent activity.

func (*Pool) KillContainer

func (wp *Pool) KillContainer(uuid string, reason string) bool

KillContainer kills the crunch-run process for the given container UUID, if it's running on any worker.

KillContainer returns immediately; the act of killing the container takes some time, and runs in the background.

KillContainer returns false if the container has already ended.

func (*Pool) KillInstance

func (wp *Pool) KillInstance(id cloud.InstanceID, reason string) error

KillInstance destroys a cloud VM instance. It returns an error if the given instance does not exist.

func (*Pool) Running

func (wp *Pool) Running() map[string]time.Time

Running returns the container UUIDs being prepared/run on workers.

In the returned map, the time value indicates when the Pool observed that the container process had exited. A container that has not yet exited has a zero time value. The caller should use ForgetContainer() to garbage-collect the entries for exited containers.

func (*Pool) SetIdleBehavior

func (wp *Pool) SetIdleBehavior(id cloud.InstanceID, idleBehavior IdleBehavior) error

SetIdleBehavior determines how the indicated instance will behave when it has no containers running.

func (*Pool) Shutdown

func (wp *Pool) Shutdown(it arvados.InstanceType) bool

Shutdown shuts down a worker with the given type, or returns false if all workers with the given type are busy.

func (*Pool) StartContainer

func (wp *Pool) StartContainer(it arvados.InstanceType, ctr arvados.Container) bool

StartContainer starts a container on an idle worker immediately if possible, otherwise returns false.

func (*Pool) Stop

func (wp *Pool) Stop()

Stop synchronizing with the InstanceSet.

func (*Pool) Subscribe

func (wp *Pool) Subscribe() <-chan struct{}

Subscribe returns a buffered channel that becomes ready after any change to the pool's state that could have scheduling implications: a worker's state changes, a new worker appears, the cloud provider's API rate limiting period ends, etc.

Additional events that occur while the channel is already ready will be dropped, so it is OK if the caller services the channel slowly.

Example:

ch := wp.Subscribe()
defer wp.Unsubscribe(ch)
for range ch {
	tryScheduling(wp)
	if done {
		break
	}
}

func (*Pool) Unallocated

func (wp *Pool) Unallocated() map[arvados.InstanceType]int

Unallocated returns the number of unallocated (creating + booting + idle + unknown) workers for each instance type. Workers in hold/drain mode are not included.

func (*Pool) Unsubscribe

func (wp *Pool) Unsubscribe(ch <-chan struct{})

Unsubscribe stops sending updates to the given channel.

type State

type State int

State indicates whether a worker is available to do work, and (if not) whether/when it is expected to become ready.

const (
	StateUnknown  State = iota // might be running a container already
	StateBooting               // instance is booting
	StateIdle                  // instance booted, no containers are running
	StateRunning               // instance is running one or more containers
	StateShutdown              // worker has stopped monitoring the instance
)

func (State) MarshalText

func (s State) MarshalText() ([]byte, error)

MarshalText implements encoding.TextMarshaler so a JSON encoding of map[State]anything uses the state's string representation.

func (State) String

func (s State) String() string

String implements fmt.Stringer.

type TagVerifier

type TagVerifier struct {
	cloud.Instance
	Secret         string
	ReportVerified func(cloud.Instance)
}

func (TagVerifier) InitCommand

func (tv TagVerifier) InitCommand() cloud.InitCommand

func (TagVerifier) VerifyHostKey

func (tv TagVerifier) VerifyHostKey(pubKey ssh.PublicKey, client *ssh.Client) error

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL