archimedes

Version: v0.1.0
Published: Jun 26, 2020 License: Apache-2.0 Imports: 12 Imported by: 0

README

archimedes

Automatic and gradual rebalancing mechanism for Ceph OSDs. This process is designed to be deployed and run as a docker container that periodically reweights a given set of OSDs towards their target weights. It does so across multiple iterations, where each iteration upweights an OSD by the --weight-increment value. The reweights are applied to the CRUSH reweight parameter of an OSD and not to the OSD reweight parameter.
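The gradual upweighting described above can be sketched as a single step function. This is only an illustration: the name nextWeight and the rule of capping at the target weight are assumptions, not code taken from archimedes.

```go
package main

import "fmt"

// nextWeight returns the weight to apply in the next iteration: the current
// weight plus one increment, capped at the target so it never overshoots.
// The capping rule is an assumption made for this sketch.
func nextWeight(current, target, increment float64) float64 {
	if current >= target {
		return current
	}
	if next := current + increment; next < target {
		return next
	}
	return target
}

func main() {
	// Example values chosen for illustration only.
	current, target, increment := 1.0, 1.6, 0.25
	for current < target {
		current = nextWeight(current, target, increment)
		fmt.Printf("reweight to %.2f\n", current)
	}
}
```

A smaller increment means more iterations and less data movement per iteration.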

Usage

This mechanism is designed to run as a docker container in the background. We have to build the image from the provided Dockerfile before we use it.

docker build -t docker.digitalocean.com/archimedes:latest -f Dockerfile.release .

You will want to change the docker image name/endpoint based on your setup. Once the image is built successfully, you can run docker push <image>:tag to push the image to its repository, assuming you want to save it for later or pull it quickly from other machines in your ensemble.

The reweight run is initiated with the following command:

docker run --rm -v /etc/ceph:/etc/ceph -it docker.digitalocean.com/archimedes:latest --ceph-user admin reweight --target-osd-crush-weights "1:1.4999,2:1.4999,3:7.7999" --weight-increment 0.02

The /etc/ceph directory on the host in the above case is expected to contain both:

  • The user keyring, which will be ceph.client.admin.keyring since we passed in user as admin.
  • The ceph config for talking to the cluster: ceph.conf.
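The --target-osd-crush-weights value in the command above is a comma-separated list of osd:weight pairs. A minimal sketch of turning such a string into the map[int]float64 that WithTargetCrushWeightMap accepts could look like the following. The function parseTargetWeights is hypothetical and is not claimed to match how the archimedes command parses its flag.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseTargetWeights converts a string such as "1:1.4999,2:1.4999" into a
// map from OSD ID to target CRUSH weight. Hypothetical helper for illustration.
func parseTargetWeights(s string) (map[int]float64, error) {
	out := make(map[int]float64)
	for _, pair := range strings.Split(s, ",") {
		parts := strings.SplitN(strings.TrimSpace(pair), ":", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("invalid pair %q, want osd:weight", pair)
		}
		id, err := strconv.Atoi(parts[0])
		if err != nil {
			return nil, fmt.Errorf("invalid OSD id in %q: %v", pair, err)
		}
		w, err := strconv.ParseFloat(parts[1], 64)
		if err != nil {
			return nil, fmt.Errorf("invalid weight in %q: %v", pair, err)
		}
		out[id] = w
	}
	return out, nil
}

func main() {
	weights, err := parseTargetWeights("1:1.4999,2:1.4999,3:7.7999")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(weights), weights[3])
}
```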

Once the container has connected to the cluster, it keeps running in the background until every OSD in the list has reached its target weight.

The runs are further customizable. For example, we can control how many PGs may still be backfilling or recovering before the next iteration of reweights is kicked off. The full list of options is printed by --help.

docker run --rm -it docker.digitalocean.com/archimedes:latest reweight --help

Metrics and Logging

Our code uses logrus for structured logging, which should be visible via docker logs. Note that docker logs takes a container name or ID (as shown by docker ps), not an image name.

docker logs -f <container-name-or-id>

It also exposes metrics for Prometheus to scrape, listening on :8928 by default. The listen address can be changed by passing in --metrics-addr. We should be able to see the exported metrics at the following endpoint.

curl http://localhost:8928/metrics

Development

The code is written in Go and compatibility is tested with Go 1.13 and later.

A helper Makefile is included to assist with testing. Running the test target builds and runs the test suite to make sure new changes are safe.

make test

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type CephClient

type CephClient interface {
	// BackfillingPGs returns the number of PGs that are either
	// in 'backfilling' or 'backfill_wait' state.
	BackfillingPGs() (int, error)

	// RecoveringPGs returns the number of PGs that are either
	// in 'recovering' or 'recovery_wait' state.
	RecoveringPGs() (int, error)

	// OSDTree returns a parsed version of `ceph osd tree`.
	OSDTree() (*OSDTreeOut, error)

	// CrushReweight updates the given OSD to the crush reweight
	// value provided.
	CrushReweight(osdID int, crushWeight float64) error

	// Close is used to disconnect Ceph connection once used.
	Close()
}

CephClient provides an abstraction for client calls made into Ceph.
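Because CephClient is an interface, callers can substitute an in-memory implementation in tests instead of talking to a real cluster. The sketch below is an assumption-laden illustration: cephOps is a local mirror of part of the interface (OSDTree is left out because its return type belongs to the package), and fakeCeph is a hypothetical type that does not exist in archimedes.

```go
package main

import "fmt"

// cephOps mirrors part of the CephClient interface shown above so that this
// sketch compiles on its own. It is not a type from the package.
type cephOps interface {
	BackfillingPGs() (int, error)
	RecoveringPGs() (int, error)
	CrushReweight(osdID int, crushWeight float64) error
	Close()
}

// fakeCeph is a hypothetical in-memory stand-in that records reweights.
type fakeCeph struct {
	backfilling, recovering int
	weights                 map[int]float64
	closed                  bool
}

func (f *fakeCeph) BackfillingPGs() (int, error) { return f.backfilling, nil }
func (f *fakeCeph) RecoveringPGs() (int, error)  { return f.recovering, nil }

// CrushReweight stores the new weight; rejecting negative values is a
// choice made for this sketch.
func (f *fakeCeph) CrushReweight(osdID int, crushWeight float64) error {
	if crushWeight < 0 {
		return fmt.Errorf("negative crush weight %v for osd.%d", crushWeight, osdID)
	}
	f.weights[osdID] = crushWeight
	return nil
}

func (f *fakeCeph) Close() { f.closed = true }

func main() {
	var c cephOps = &fakeCeph{weights: map[int]float64{}}
	defer c.Close()

	if err := c.CrushReweight(1, 1.25); err != nil {
		fmt.Println("error:", err)
		return
	}
	n, _ := c.BackfillingPGs()
	fmt.Println("backfilling PGs:", n)
}
```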

func NewCephClient

func NewCephClient(user, configPath string) (CephClient, error)

NewCephClient takes a Ceph user and the path to ceph.conf, establishes a connection to the Ceph cluster, and returns a usable handle.

type OSDTreeOut

type OSDTreeOut struct {
	Nodes []nodeType `json:"nodes"`
	Stray []nodeType `json:"stray"`
}

OSDTreeOut provides a representation for output of `ceph osd tree -f json`.

type Option

type Option func(*Rebalancer)

Option provides a safe way to update private variables of rebalancer before creating an instance of it.

func WithCephClient

func WithCephClient(val CephClient) Option

WithCephClient sets the ceph client, connected to the ceph cluster, that the rebalancer uses to perform reweighting.

func WithDryRun

func WithDryRun(val bool) Option

WithDryRun sets the mode of the rebalancer. When dry-run is disabled, the reweights are actually performed on the cluster.

By default, dry-run is enabled to make sure no adverse impact occurs on the cluster until explicitly requested to.

func WithMaxBackfillPGsAllowed

func WithMaxBackfillPGsAllowed(val int) Option

WithMaxBackfillPGsAllowed allows changing the number of backfilling PGs that are acceptable to be ongoing while we issue another reweight operation.

func WithMaxRecoveryPGsAllowed

func WithMaxRecoveryPGsAllowed(val int) Option

WithMaxRecoveryPGsAllowed allows changing the number of recovering PGs that are acceptable to be ongoing while we issue another reweight operation.
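The two limits above act as a gate on issuing the next reweight. A rough sketch of such a check follows. The helper canReweight is hypothetical, and treating the limits as inclusive is an assumption rather than documented behaviour.

```go
package main

import "fmt"

// canReweight reports whether another reweight may be issued, given the
// number of PGs currently backfilling and recovering and the allowed
// maximums. Limits are treated as inclusive in this sketch.
func canReweight(backfilling, recovering, maxBackfill, maxRecovery int) bool {
	return backfilling <= maxBackfill && recovering <= maxRecovery
}

func main() {
	// Example values chosen for illustration only.
	fmt.Println(canReweight(3, 0, 10, 10))
	fmt.Println(canReweight(25, 0, 10, 10))
}
```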

func WithSleepInterval

func WithSleepInterval(val time.Duration) Option

WithSleepInterval sets the duration for which the rebalancer sleeps between reweight runs.

func WithTargetCrushWeightMap

func WithTargetCrushWeightMap(val map[int]float64) Option

WithTargetCrushWeightMap passes the mapping of each candidate OSD to the target CRUSH weight it should reach.

This is a required option since we cannot run the rebalancer without any OSDs to reweight.

func WithWeightIncrement

func WithWeightIncrement(val float64) Option

WithWeightIncrement updates the increment value by which each OSD will be upweighted.

type Rebalancer

type Rebalancer struct {
	// contains filtered or unexported fields
}

Rebalancer is responsible for performing data rebalancing by controlling weight changes to OSDs.

func New

func New(opt ...Option) (*Rebalancer, error)

New returns a new instance of Rebalancer. A non-empty map of OSD IDs to target CRUSH weights is expected to be passed as an input.

func (*Rebalancer) Collect

func (r *Rebalancer) Collect(ch chan<- prometheus.Metric)

Collect is responsible for collecting values for all declared metrics.

func (*Rebalancer) Describe

func (r *Rebalancer) Describe(ch chan<- *prometheus.Desc)

Describe returns the descriptions for registered metrics.

func (*Rebalancer) DoReweight

func (r *Rebalancer) DoReweight()

DoReweight is the main function where the validation and actual crush reweighting occurs.

func (*Rebalancer) Run

func (r *Rebalancer) Run(ctx context.Context)

Run performs continuous reweighting, pausing for `sleepInterval` between runs. It returns when either the caller's context is cancelled or all entries in the OSD to target CRUSH weight map have been processed.

Directories

Path Synopsis
cmd
