Documentation
¶
Index ¶
- type BootOutcome
- type Executor
- type IdleBehavior
- type InstanceView
- type Pool
- func (wp *Pool) AtCapacity(it arvados.InstanceType) bool
- func (wp *Pool) AtQuota() bool
- func (wp *Pool) CheckHealth() error
- func (wp *Pool) CountWorkers() map[State]int
- func (wp *Pool) Create(it arvados.InstanceType) bool
- func (wp *Pool) ForgetContainer(uuid string)
- func (wp *Pool) Instances() []InstanceView
- func (wp *Pool) KillContainer(uuid string, reason string) bool
- func (wp *Pool) KillInstance(id cloud.InstanceID, reason string) error
- func (wp *Pool) Running() map[string]time.Time
- func (wp *Pool) SetIdleBehavior(id cloud.InstanceID, idleBehavior IdleBehavior) error
- func (wp *Pool) Shutdown(it arvados.InstanceType) bool
- func (wp *Pool) StartContainer(it arvados.InstanceType, ctr arvados.Container) bool
- func (wp *Pool) Stop()
- func (wp *Pool) Subscribe() <-chan struct{}
- func (wp *Pool) Unallocated() map[arvados.InstanceType]int
- func (wp *Pool) Unsubscribe(ch <-chan struct{})
- type State
- type TagVerifier
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type BootOutcome ¶
type BootOutcome string
BootOutcome is the result of a worker boot. It is used as a label in a metric.
const ( BootOutcomeFailed BootOutcome = "failure" BootOutcomeSucceeded BootOutcome = "success" BootOutcomeAborted BootOutcome = "aborted" BootOutcomeDisappeared BootOutcome = "disappeared" )
type Executor ¶
type Executor interface { // Run cmd on the current target. Execute(env map[string]string, cmd string, stdin io.Reader) (stdout, stderr []byte, err error) // Use the given target for subsequent operations. The new // target is the same host as the previous target, but it // might return a different address and verify a different // host key. // // SetTarget is called frequently, and in most cases the new // target will behave exactly the same as the old one. An // implementation should optimize accordingly. // // SetTarget must not block on concurrent Execute calls. SetTarget(cloud.ExecutorTarget) Close() }
An Executor executes shell commands on a remote host.
type IdleBehavior ¶
type IdleBehavior string
IdleBehavior indicates the behavior desired when a node becomes idle.
const ( IdleBehaviorRun IdleBehavior = "run" // run containers, or shutdown on idle timeout IdleBehaviorHold IdleBehavior = "hold" // don't shutdown or run more containers IdleBehaviorDrain IdleBehavior = "drain" // shutdown immediately when idle )
type InstanceView ¶
type InstanceView struct { Instance cloud.InstanceID `json:"instance"` Address string `json:"address"` Price float64 `json:"price"` ArvadosInstanceType string `json:"arvados_instance_type"` ProviderInstanceType string `json:"provider_instance_type"` LastContainerUUID string `json:"last_container_uuid"` LastBusy time.Time `json:"last_busy"` WorkerState string `json:"worker_state"` IdleBehavior IdleBehavior `json:"idle_behavior"` }
An InstanceView shows a worker's current state and recent activity.
type Pool ¶
type Pool struct {
// contains filtered or unexported fields
}
Pool is a resizable worker pool backed by a cloud.InstanceSet. A zero Pool should not be used. Call NewPool to create a new Pool.
func NewPool ¶
func NewPool(logger logrus.FieldLogger, arvClient *arvados.Client, reg *prometheus.Registry, instanceSetID cloud.InstanceSetID, instanceSet cloud.InstanceSet, newExecutor func(cloud.Instance) Executor, installPublicKey ssh.PublicKey, cluster *arvados.Cluster) *Pool
NewPool creates a Pool of workers backed by instanceSet.
New instances are configured and set up according to the given cluster configuration.
func (*Pool) AtCapacity ¶
func (wp *Pool) AtCapacity(it arvados.InstanceType) bool
AtCapacity returns true if Create() is currently expected to fail for the given instance type.
func (*Pool) AtQuota ¶
AtQuota returns true if Create is not expected to work at the moment (e.g., cloud provider has reported quota errors, or we are already at our own configured quota).
func (*Pool) CountWorkers ¶
CountWorkers returns the current number of workers in each state.
CountWorkers blocks, if necessary, until the initial instance list has been loaded from the cloud provider.
func (*Pool) Create ¶
func (wp *Pool) Create(it arvados.InstanceType) bool
Create a new instance with the given type, and add it to the worker pool. The worker is added immediately; instance creation runs in the background.
Create returns false if a pre-existing error or a configuration setting prevents it from even attempting to create a new instance. Those errors are logged by the Pool, so the caller does not need to log anything in such cases.
func (*Pool) ForgetContainer ¶
ForgetContainer clears the placeholder for the given exited container, so it isn't returned by subsequent calls to Running().
ForgetContainer has no effect if the container has not yet exited.
The "container exited at time T" placeholder (which necessitates ForgetContainer) exists to make it easier for the caller (scheduler) to distinguish a container that exited without finalizing its state from a container that exited too recently for its final state to have appeared in the scheduler's queue cache.
func (*Pool) Instances ¶
func (wp *Pool) Instances() []InstanceView
Instances returns an InstanceView for each worker in the pool, summarizing its current state and recent activity.
func (*Pool) KillContainer ¶
KillContainer kills the crunch-run process for the given container UUID, if it's running on any worker.
KillContainer returns immediately; the act of killing the container takes some time, and runs in the background.
KillContainer returns false if the container has already ended.
func (*Pool) KillInstance ¶
func (wp *Pool) KillInstance(id cloud.InstanceID, reason string) error
KillInstance destroys a cloud VM instance. It returns an error if the given instance does not exist.
func (*Pool) Running ¶
Running returns the container UUIDs being prepared/run on workers.
In the returned map, the time value indicates when the Pool observed that the container process had exited. A container that has not yet exited has a zero time value. The caller should use ForgetContainer() to garbage-collect the entries for exited containers.
func (*Pool) SetIdleBehavior ¶
func (wp *Pool) SetIdleBehavior(id cloud.InstanceID, idleBehavior IdleBehavior) error
SetIdleBehavior determines how the indicated instance will behave when it has no containers running.
func (*Pool) Shutdown ¶
func (wp *Pool) Shutdown(it arvados.InstanceType) bool
Shutdown shuts down a worker with the given type, or returns false if all workers with the given type are busy.
func (*Pool) StartContainer ¶
StartContainer starts a container on an idle worker immediately if possible, otherwise returns false.
func (*Pool) Subscribe ¶
func (wp *Pool) Subscribe() <-chan struct{}
Subscribe returns a buffered channel that becomes ready after any change to the pool's state that could have scheduling implications: a worker's state changes, a new worker appears, the cloud provider's API rate limiting period ends, etc.
Additional events that occur while the channel is already ready will be dropped, so it is OK if the caller services the channel slowly.
Example:
ch := wp.Subscribe() defer wp.Unsubscribe(ch) for range ch { tryScheduling(wp) if done { break } }
func (*Pool) Unallocated ¶
func (wp *Pool) Unallocated() map[arvados.InstanceType]int
Unallocated returns the number of unallocated (creating + booting + idle + unknown) workers for each instance type. Workers in hold/drain mode are not included.
type State ¶
type State int
State indicates whether a worker is available to do work, and (if not) whether/when it is expected to become ready.
func (State) MarshalText ¶
MarshalText implements encoding.TextMarshaler so a JSON encoding of map[State]anything uses the state's string representation.
type TagVerifier ¶
func (TagVerifier) InitCommand ¶
func (tv TagVerifier) InitCommand() cloud.InitCommand