autoscaler

package
v0.0.0-...-e139c8d
Warning

This package is not in the latest version of its module.

Published: Mar 17, 2020 License: Apache-2.0 Imports: 12 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func AddResourceList

func AddResourceList(a v1.ResourceList, b v1.ResourceList)

AddResourceList adds each quantity in b to the corresponding quantity in a, modifying a in place. v1.ResourceList is a map[v1.ResourceName]resource.Quantity.
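
The in-place behavior follows from the signature: the function returns nothing, so the sum has to land in a. The sketch below illustrates that pattern with a stand-in type. quantityList and addQuantityList are hypothetical names, and int64 replaces resource.Quantity so the example needs only the standard library. How the real function treats a key that exists in b but not in a is not documented here; the sketch assumes it starts from zero.

```go
package main

import "fmt"

// quantityList is a hypothetical stand-in for v1.ResourceList.
// The real map values are resource.Quantity, not int64.
type quantityList map[string]int64

// addQuantityList adds every amount in b to the matching entry of a.
// Maps are reference types, so the caller's a is updated in place.
func addQuantityList(a, b quantityList) {
	for name, amount := range b {
		a[name] += amount // assumption: a missing key starts from zero
	}
}

func main() {
	total := quantityList{"cpu": 500, "memory": 1024}
	addQuantityList(total, quantityList{"cpu": 250, "nvidia.com/gpu": 1})
	fmt.Println(total["cpu"], total["memory"], total["nvidia.com/gpu"])
}
```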

func WithMaxLoadDesired

func WithMaxLoadDesired(maxLoadDesired float64) func(as *Autoscaler)

WithMaxLoadDesired returns an option that sets maxLoadDesired on the Autoscaler. Pass it to NewAutoscaler.
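
The func(as *Autoscaler) return type matches the variadic options parameter of NewAutoscaler, which is the usual functional-options pattern. The sketch below shows that pattern in isolation. The scaler type, the newScaler constructor, and the default value of 1.0 are all assumptions made for illustration, since the Autoscaler fields are unexported and its real default is not documented here.

```go
package main

import "fmt"

// scaler is a hypothetical stand-in for Autoscaler.
type scaler struct {
	maxLoadDesired float64
}

// option has the same shape as the func(*Autoscaler) options.
type option func(*scaler)

// withMaxLoadDesired returns an option that overrides maxLoadDesired.
func withMaxLoadDesired(v float64) option {
	return func(s *scaler) { s.maxLoadDesired = v }
}

// newScaler starts from an assumed default and applies options in order,
// so a later option overrides an earlier one.
func newScaler(options ...option) *scaler {
	s := &scaler{maxLoadDesired: 1.0} // assumed default, not the package's
	for _, opt := range options {
		opt(s)
	}
	return s
}

func main() {
	fmt.Println(newScaler().maxLoadDesired)
	fmt.Println(newScaler(withMaxLoadDesired(0.9)).maxLoadDesired)
}
```

With the real package, the equivalent call would look like NewAutoscaler(kubeClient, jobUpdater, WithMaxLoadDesired(0.9)), where 0.9 is only an illustrative value.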

Types

type Autoscaler

type Autoscaler struct {
	// contains filtered or unexported fields
}

Autoscaler launches and scales the training jobs.

func NewAutoscaler

func NewAutoscaler(kubeClient kubernetes.Interface, jobUpdater *sync.Map, options ...func(*Autoscaler)) *Autoscaler

NewAutoscaler creates a new Autoscaler.

func (*Autoscaler) InquiryResource

func (a *Autoscaler) InquiryResource() (ClusterResource, error)

InquiryResource returns the idle and total resources of the k8s cluster.

func (*Autoscaler) Run

func (a *Autoscaler) Run()

Run monitors the cluster resources and training jobs in a loop and scales the training jobs according to the available cluster resources.

type ClusterResource

type ClusterResource struct {
	NodeCount int // The total number of nodes in the cluster.

	// Each Kubernetes job could require some number of GPUs in
	// the range of [request, limit].
	GPURequest int // \sum_job num_gpu_request(job)
	GPULimit   int // \sum_job num_gpu_limit(job)
	GPUTotal   int // The total number of GPUs in the cluster

	// Each Kubernetes job could require some CPU timeslices in
	// the unit of *milli*.
	CPURequestMilli int64 // \sum_job cpu_request_in_milli(job)
	CPULimitMilli   int64 // \sum_job cpu_limit_in_milli(job)
	CPUTotalMilli   int64 // The total amount of CPUs in the cluster in milli.

	// Each Kubernetes job could require some amount of memory in
	// the unit of *mega*.
	MemoryRequestMega int64 // \sum_job memory_request_in_mega(job)
	MemoryLimitMega   int64 // \sum_job memory_limit_in_mega(job)
	MemoryTotalMega   int64 // The total amount of memory in the cluster in mega.

	Nodes Nodes
}

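ClusterResource summarizes the resources of a cluster: the node count, the per-job sums of requests and limits, and the cluster totals for GPU, CPU, and memory, plus per-node idle figures in Nodes.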
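
As a rough illustration of how these fields relate, the sketch below derives a CPU load fraction and an idle GPU count from the request and total fields. The clusterResource type copies only a subset of the fields, and cpuLoad and idleGPUs are hypothetical helpers. Nothing here claims to be how the Autoscaler itself computes load.

```go
package main

import "fmt"

// clusterResource copies a subset of the ClusterResource fields.
type clusterResource struct {
	GPURequest      int
	GPUTotal        int
	CPURequestMilli int64
	CPUTotalMilli   int64
}

// cpuLoad returns requested CPU as a fraction of total CPU.
// It returns 0 for a cluster that reports no CPU at all.
func cpuLoad(r clusterResource) float64 {
	if r.CPUTotalMilli == 0 {
		return 0
	}
	return float64(r.CPURequestMilli) / float64(r.CPUTotalMilli)
}

// idleGPUs returns the number of GPUs not covered by any request.
func idleGPUs(r clusterResource) int {
	return r.GPUTotal - r.GPURequest
}

func main() {
	r := clusterResource{
		GPURequest:      3,
		GPUTotal:        4,
		CPURequestMilli: 6000,
		CPUTotalMilli:   8000,
	}
	fmt.Printf("cpu load %.2f, idle GPUs %d\n", cpuLoad(r), idleGPUs(r))
}
```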

type Nodes

type Nodes struct {
	NodesCPUIdleMilli   map[string]int64 // node id -> idle CPU
	NodesMemoryFreeMega map[string]int64 // node id -> free memory
}

Nodes records the amount of idle CPU and free memory of each node in the cluster.
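
One use for per-node figures is checking whether a single node has room for a new pod, since cluster-wide totals can hide fragmentation. The sketch below filters nodes by idle CPU and free memory. The nodes type mirrors the two documented maps, while fittingNodes and the node IDs are hypothetical and not part of the package.

```go
package main

import (
	"fmt"
	"sort"
)

// nodes mirrors the two maps of the Nodes type.
type nodes struct {
	NodesCPUIdleMilli   map[string]int64 // node id -> idle CPU
	NodesMemoryFreeMega map[string]int64 // node id -> free memory
}

// fittingNodes returns the sorted IDs of nodes that have at least
// cpuMilli idle CPU and memMega free memory. A node missing from the
// memory map counts as having zero free memory.
func fittingNodes(n nodes, cpuMilli, memMega int64) []string {
	var ids []string
	for id, idle := range n.NodesCPUIdleMilli {
		if idle >= cpuMilli && n.NodesMemoryFreeMega[id] >= memMega {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids) // map iteration order is random, so sort for stable output
	return ids
}

func main() {
	n := nodes{
		NodesCPUIdleMilli:   map[string]int64{"node-a": 2000, "node-b": 500, "node-c": 4000},
		NodesMemoryFreeMega: map[string]int64{"node-a": 4096, "node-b": 8192, "node-c": 512},
	}
	fmt.Println(fittingNodes(n, 1000, 1024))
}
```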
