manager

package
v0.0.0-...-8d377ce
Published: Mar 29, 2024 License: Apache-2.0 Imports: 31 Imported by: 0

README

Equinix Shepherd

Manages Equinix machines in sync with BMDB contents. Made up of two components:

Provisioner

Brings up machines from hardware reservations and populates BMDB with new Provided machines.

Initializer

Starts the Agent over SSH (wherever necessary per the BMDB) and reports success into the BMDB.

Running

Unit Tests

The Shepherd has some basic smoke tests which run against a Fakequinix.

Manual Testing

If you have Equinix credentials, you can run:

$ bazel build //cloud/shepherd/provider/equinix
$ bazel build //cloud/shepherd/manager/test_agent
$ bazel-bin/cloud/shepherd/provider/equinix/equinix_/equinix \
    -bmdb_eat_my_data \
    -equinix_project_id FIXME \
    -equinix_api_username FIXME \
    -equinix_api_key FIXME \
    -agent_executable_path bazel-bin/cloud/shepherd/manager/test_agent/test_agent_/test_agent \
    -agent_endpoint example.com \
    -equinix_ssh_key_label $USER-FIXME \
    -equinix_device_prefix $USER-FIXME- \
    -provisioner_assimilate -provisioner_max_machines 10

Replace $USER-FIXME with <your username>-test or some other unique name/prefix.

This will start a single instance of the provisioner accompanied by a single instance of the initializer.

A persistent SSH key will be created in your current working directory.

Prod Deployment

TODO(q3k): split server binary into separate provisioner/initializer for initializer scalability, as that's the main bottleneck.

Documentation

Overview

Package manager, itself a part of the BMaaS project, provides an implementation governing the lifecycle of Equinix bare metal servers according to conditions set by the Bare Metal Database (BMDB).

The implementation will attempt to provide as many machines as possible and register them with BMDB. This is limited by the count of Hardware Reservations available in the Equinix Metal project used. The BMaaS agent will then be started on these machines as soon as they become ready.

The implementation is provided in the form of a library whose interface is exposed through the Provisioner and Initializer types, each taking servers through a single stage of their lifecycle.

See the included test code for usage examples.

The terms "device" and "machine" are used interchangeably throughout this package due to differences in Equinix Metal and BMDB nomenclature.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func RunControlLoop

func RunControlLoop(ctx context.Context, conn *bmdb.Connection, loop controlLoop) error

RunControlLoop runs the given controlLoop implementation against the BMDB. The loop will be run with the parallelism and rate configured by the ControlLoopConfig embedded or otherwise returned by the controlLoop.

Types

type ControlLoopConfig

type ControlLoopConfig struct {
	// DBQueryLimiter limits the rate at which BMDB is queried for servers ready
	// for BMaaS agent initialization. Must be set.
	DBQueryLimiter *rate.Limiter

	// Parallelism is how many instances of the Initializer will be allowed to run in
	// parallel against the BMDB. This speeds up the process of starting/restarting
	// agents significantly, as one initializer instance can handle at most one agent
	// (re)starting process.
	//
	// If not set (i.e. 0), defaults to 1. A good starting value for production
	// deployments is 10 or so.
	Parallelism int
}

ControlLoopConfig should be embedded in every component which acts as a control loop. RegisterFlags should be called by the component whenever it is registering its own flags. Check should be called whenever the component is instantiated, after RegisterFlags has been called.

func (*ControlLoopConfig) Check

func (c *ControlLoopConfig) Check() error

Check should be called after RegisterFlags but before the control loop is run. If an error is returned, the control loop cannot start.

func (*ControlLoopConfig) RegisterFlags

func (c *ControlLoopConfig) RegisterFlags(prefix string)

RegisterFlags should be called on this configuration whenever the embedding component/configuration is registering its own flags. The prefix should be the name of the component.

type FakeSSHClient

type FakeSSHClient struct{}

FakeSSHClient is a Client that pretends to start an agent, but in reality just responds with what an agent would respond on every execution attempt.

func (*FakeSSHClient) Dial

func (f *FakeSSHClient) Dial(ctx context.Context, address string, timeout time.Duration) (ssh.Connection, error)

type Initializer

type Initializer struct {
	InitializerConfig
	// contains filtered or unexported fields
}

The Initializer starts the agent on machines that aren't yet running it.

func NewInitializer

func NewInitializer(p shepherd.Provider, sshClient ssh.Client, ic InitializerConfig) (*Initializer, error)

NewInitializer creates an Initializer instance, checking the InitializerConfig, SharedConfig and AgentConfig for errors.

type InitializerConfig

type InitializerConfig struct {
	ControlLoopConfig

	// Executable is the contents of the agent binary created and run
	// at the provisioned servers. Must be set.
	Executable []byte

	// TargetPath is a filesystem destination path used while uploading the BMaaS
	// agent executable to hosts as part of the initialization process. Must be set.
	TargetPath string

	// Endpoint is the address Agent will use to contact the BMaaS
	// infrastructure. Must be set.
	Endpoint string

	// EndpointCACertificate is an optional DER-encoded (but not PEM-armored) X509
	// certificate used to populate the trusted CA store of the agent. It should be
	// set to the CA certificate of the endpoint if not using a system-trusted CA
	// certificate.
	EndpointCACertificate []byte

	// SSHConnectTimeout is the amount of time set aside for establishing the
	// initializing SSH connection. Upon timeout, the iteration is declared a
	// failure. Must be set.
	SSHConnectTimeout time.Duration
	// SSHExecTimeout is the amount of time set aside for executing the agent and
	// getting its output once the SSH connection has been established. Upon timeout,
	// the iteration is declared a failure. Must be set.
	SSHExecTimeout time.Duration
}

InitializerConfig configures how the Initializer will deploy Agents on machines. In CLI scenarios, this should be populated from flags via RegisterFlags.

func (*InitializerConfig) Check

func (ic *InitializerConfig) Check() error

func (*InitializerConfig) RegisterFlags

func (ic *InitializerConfig) RegisterFlags()

type Provisioner

type Provisioner struct {
	ProvisionerConfig
	// contains filtered or unexported fields
}

Provisioner implements the server provisioning logic. Provisioning entails bringing all available machines (subject to limits) into BMDB.

func NewProvisioner

func NewProvisioner(p shepherd.Provider, pc ProvisionerConfig) (*Provisioner, error)

NewProvisioner creates a Provisioner instance, checking ProvisionerConfig and providerConfig for errors.

func (*Provisioner) Run

func (p *Provisioner) Run(ctx context.Context, conn *bmdb.Connection) error

Run runs the provisioner, blocking the current goroutine until the given context expires.

type ProvisionerConfig

type ProvisionerConfig struct {
	// MaxCount is the maximum count of managed servers. No new devices will be
	// created after reaching the limit. No attempt will be made to reduce the
	// server count.
	MaxCount uint

	// ReconcileLoopLimiter limits the rate of the main reconciliation loop
	// iterating.
	ReconcileLoopLimiter *rate.Limiter

	// DeviceCreationLimiter limits the rate at which devices are created.
	DeviceCreationLimiter *rate.Limiter

	// ChunkSize is how many machines the provisioner will try to spawn in a
	// single reconciliation loop. Higher numbers allow for faster initial
	// provisioning, but lower numbers decrease potential raciness with other systems
	// and make sure that other parts of the reconciliation logic are run regularly.
	//
	// 20 is a decent starting point.
	ChunkSize uint
}

ProvisionerConfig configures the provisioning process.
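How MaxCount, ChunkSize and the number of available hardware reservations might interact in one reconciliation iteration can be sketched as follows. This is an illustration of the limits described in the field comments and the overview, not the package's actual logic; spawnCount and its parameters are hypothetical.

```go
package main

import "fmt"

// spawnCount is a hypothetical helper returning how many devices a single
// reconciliation iteration would try to create, given the number of machines
// already managed, the configured limits and the free hardware reservations.
func spawnCount(managed, maxCount, chunkSize, reservations uint) uint {
	// The limit has been reached (or exceeded): never create more devices,
	// and never try to reduce the count either.
	if managed >= maxCount {
		return 0
	}
	n := maxCount - managed
	// Spawn at most one chunk per iteration.
	if n > chunkSize {
		n = chunkSize
	}
	// A device cannot be created without a free hardware reservation.
	if n > reservations {
		n = reservations
	}
	return n
}

func main() {
	fmt.Println(spawnCount(0, 100, 20, 50))   // limited by the chunk size
	fmt.Println(spawnCount(95, 100, 20, 50))  // limited by the remaining headroom
	fmt.Println(spawnCount(0, 100, 20, 3))    // limited by free reservations
	fmt.Println(spawnCount(100, 100, 20, 50)) // limit already reached
}
```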

func (*ProvisionerConfig) RegisterFlags

func (pc *ProvisionerConfig) RegisterFlags()

type Recoverer

type Recoverer struct {
	RecovererConfig
	// contains filtered or unexported fields
}

The Recoverer reboots machines whose agent has stopped sending heartbeats or has not sent any heartbeats at all.

func NewRecoverer

func NewRecoverer(r shepherd.Recoverer, rc RecovererConfig) (*Recoverer, error)

type RecovererConfig

type RecovererConfig struct {
	ControlLoopConfig
}

func (*RecovererConfig) RegisterFlags

func (r *RecovererConfig) RegisterFlags()

type SSHKey

type SSHKey struct {

	// SSH key to use when creating machines and then connecting to them. If not
	// provided, it will be automatically loaded from KeyPersistPath, and if that
	// doesn't exist either, it will be first generated and persisted there.
	Key ed25519.PrivateKey

	// Path at which the SSH key will be loaded from and persisted to, if Key is not
	// explicitly set. Either KeyPersistPath or Key must be set.
	KeyPersistPath string
	// contains filtered or unexported fields
}

func (*SSHKey) PublicKey

func (c *SSHKey) PublicKey() (string, error)

PublicKey returns the SSH public key marshaled for use, based on sshKey.

func (*SSHKey) RegisterFlags

func (c *SSHKey) RegisterFlags()

func (*SSHKey) Signer

func (c *SSHKey) Signer() (ssh.Signer, error)

Signer builds an ssh.Signer (for use in SSH connections) based on sshKey.

Directories

Path Synopsis
test_agent is used by the Equinix Metal Manager test code.
