scheduler

package
v0.12.2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 12, 2022 License: Apache-2.0 Imports: 26 Imported by: 0

Documentation

Index

Constants

View Source
const (
	DefaultCleanRootInterval        = 10000 * time.Millisecond // sleep between queue removal checks
	DefaultCleanExpiredAppsInterval = 24 * time.Hour           // sleep between apps removal checks
)
View Source
const (
	LevelKey = "level"
	PhaseKey = "phase"
	NameKey  = "name"
	StateKey = "state"
	InfoKey  = "info"
)

Variables

This section is empty.

Functions

func CreateCheckInfo added in v0.10.0

func CreateCheckInfo(succeeded bool, name, description, message string) dao.HealthCheckInfo

func GetSchedulerHealthStatus added in v0.10.0

func GetSchedulerHealthStatus(metrics metrics.CoreSchedulerMetrics, schedulerContext *ClusterContext) dao.SchedulerHealthDAOInfo

Types

type ClusterContext added in v0.10.0

type ClusterContext struct {
	sync.RWMutex
	// contains filtered or unexported fields
}

func NewClusterContext added in v0.10.0

func NewClusterContext(rmID, policyGroup string) (*ClusterContext, error)

Create a new cluster context to be used outside of the event system. test only

func (*ClusterContext) GetApplication added in v0.10.0

func (cc *ClusterContext) GetApplication(appID, partitionName string) *objects.Application

Get the scheduling application based on the ID from the partition. Returns nil if the partition or app cannot be found. Visible for tests

func (*ClusterContext) GetNode added in v0.10.0

func (cc *ClusterContext) GetNode(nodeID, partitionName string) *objects.Node

Get a scheduling node based on its name from the partition. Returns nil if the partition or node cannot be found. Visible for tests

func (*ClusterContext) GetPartition added in v0.10.0

func (cc *ClusterContext) GetPartition(partitionName string) *PartitionContext

func (*ClusterContext) GetPartitionMapClone added in v0.10.0

func (cc *ClusterContext) GetPartitionMapClone() map[string]*PartitionContext

func (*ClusterContext) GetPartitionWithoutClusterID added in v0.11.0

func (cc *ClusterContext) GetPartitionWithoutClusterID(partitionName string) *PartitionContext

func (*ClusterContext) GetPolicyGroup added in v0.10.0

func (cc *ClusterContext) GetPolicyGroup() string

Get the config name.

func (*ClusterContext) GetQueue added in v0.10.0

func (cc *ClusterContext) GetQueue(queueName string, partitionName string) *objects.Queue

Get the scheduling queue based on the queue path name from the partition. Returns nil if the partition or queue cannot be found. Visible for tests

func (*ClusterContext) GetReservations added in v0.10.0

func (cc *ClusterContext) GetReservations(partitionName string) map[string]int

Return the list of reservations for the partition. Returns nil if the partition cannot be found or an empty map if there are no reservations Visible for tests

func (*ClusterContext) NeedPreemption added in v0.10.0

func (cc *ClusterContext) NeedPreemption() bool

func (*ClusterContext) UpdateRMSchedulerConfig added in v0.10.0

func (cc *ClusterContext) UpdateRMSchedulerConfig(rmID string) error

Locked version of the configuration update called outside of event system. Updates the current config via the config loader. Used in test only, normal updates use the internal call and the webservice must use the UpdateSchedulerConfig

func (*ClusterContext) UpdateSchedulerConfig added in v0.10.0

func (cc *ClusterContext) UpdateSchedulerConfig(conf *configs.SchedulerConfig) error

Locked version of the configuration update called from the webservice NOTE: this call assumes one RM which is registered and uses that RM for the updates

type DRFPreemptionPolicy

type DRFPreemptionPolicy struct {
}

Preemption policy based-on DRF

func (*DRFPreemptionPolicy) DoPreemption

func (m *DRFPreemptionPolicy) DoPreemption(scheduler *Scheduler)

type PartitionContext added in v0.10.0

type PartitionContext struct {
	RmID string // the RM the partition belongs to
	Name string // name of the partition (logging mainly)

	// The partition write lock must not be held while manipulating an application.
	// Scheduling is running continuously as a lock free background task. Scheduling an application
	// acquires a write lock of the application object. While holding the write lock a list of nodes is
	// requested from the partition. This requires a read lock on the partition.
	// If the partition write lock is held while manipulating an application a dead lock could occur.
	// Since application objects handle their own locks there is no requirement to hold the partition lock
	// while manipulating the application.
	// Similarly adding, updating or removing a node or a queue should only hold the partition write lock
	// while manipulating the partition information not while manipulating the underlying objects.
	sync.RWMutex
	// contains filtered or unexported fields
}

func (*PartitionContext) AddApplication added in v0.10.0

func (pc *PartitionContext) AddApplication(app *objects.Application) error

Add a new application to the partition. NOTE: this is a lock free call. It must NOT be called holding the PartitionContext lock.

func (*PartitionContext) AddNode added in v0.10.0

func (pc *PartitionContext) AddNode(node *objects.Node, existingAllocations []*objects.Allocation) error

Add the node to the partition and process the allocations that are reported by the node. NOTE: this is a lock free call. It must NOT be called holding the PartitionContext lock.

func (*PartitionContext) GetAllocatedResource added in v0.10.0

func (pc *PartitionContext) GetAllocatedResource() *resources.Resource

func (*PartitionContext) GetApplications added in v0.10.0

func (pc *PartitionContext) GetApplications() []*objects.Application

func (*PartitionContext) GetAppsByState added in v0.10.0

func (pc *PartitionContext) GetAppsByState(state string) []*objects.Application

func (*PartitionContext) GetAppsInTerminatedState added in v0.10.0

func (pc *PartitionContext) GetAppsInTerminatedState() []*objects.Application

func (*PartitionContext) GetCompletedApplications added in v0.10.0

func (pc *PartitionContext) GetCompletedApplications() []*objects.Application

func (*PartitionContext) GetCurrentState added in v0.10.0

func (pc *PartitionContext) GetCurrentState() string

func (*PartitionContext) GetNode added in v0.10.0

func (pc *PartitionContext) GetNode(nodeID string) *objects.Node

Get a node from the partition by nodeID.

func (*PartitionContext) GetNodeIterator added in v0.10.0

func (pc *PartitionContext) GetNodeIterator() objects.NodeIterator

Create an ordered node iterator based on the node sort policy set for this partition. The iterator is nil if there are no unreserved nodes available.

func (*PartitionContext) GetNodeSortingPolicyType added in v0.12.0

func (pc *PartitionContext) GetNodeSortingPolicyType() policies.SortingPolicy

func (*PartitionContext) GetNodeSortingResourceWeights added in v0.12.0

func (pc *PartitionContext) GetNodeSortingResourceWeights() map[string]float64

func (*PartitionContext) GetNodes added in v0.10.0

func (pc *PartitionContext) GetNodes() []*objects.Node

func (*PartitionContext) GetPartitionQueues added in v0.11.0

func (pc *PartitionContext) GetPartitionQueues() dao.PartitionQueueDAOInfo

Get the queue info for the whole queue structure to pass to the webservice

func (*PartitionContext) GetQueue added in v0.10.0

func (pc *PartitionContext) GetQueue(name string) *objects.Queue

Get the queue from the structure based on the fully qualified name. Wrapper around the unlocked version getQueueInternal() Visible by tests

func (*PartitionContext) GetQueueInfos added in v0.10.0

func (pc *PartitionContext) GetQueueInfos() dao.QueueDAOInfo

Get the queue info for the whole queue structure to pass to the webservice

func (*PartitionContext) GetStateTime added in v0.10.0

func (pc *PartitionContext) GetStateTime() time.Time

func (*PartitionContext) GetTotalAllocationCount added in v0.10.0

func (pc *PartitionContext) GetTotalAllocationCount() int

func (*PartitionContext) GetTotalApplicationCount added in v0.10.0

func (pc *PartitionContext) GetTotalApplicationCount() int

func (*PartitionContext) GetTotalCompletedApplicationCount added in v0.12.0

func (pc *PartitionContext) GetTotalCompletedApplicationCount() int

func (*PartitionContext) GetTotalNodeCount added in v0.10.0

func (pc *PartitionContext) GetTotalNodeCount() int

func (*PartitionContext) GetTotalPartitionResource added in v0.10.0

func (pc *PartitionContext) GetTotalPartitionResource() *resources.Resource

type PreemptionPolicy

type PreemptionPolicy interface {
	DoPreemption(scheduler *Scheduler)
}

type Scheduler

type Scheduler struct {
	// contains filtered or unexported fields
}

Main Scheduler service that starts the needed sub services

func NewScheduler

func NewScheduler() *Scheduler

func (*Scheduler) GetClusterContext added in v0.10.0

func (s *Scheduler) GetClusterContext() *ClusterContext

Visible by tests

func (*Scheduler) HandleEvent

func (s *Scheduler) HandleEvent(ev interface{})

Implement methods for Scheduler events

func (*Scheduler) MultiStepSchedule

func (s *Scheduler) MultiStepSchedule(nAlloc int)

The scheduler for testing which runs nAlloc times the normal schedule routine. Visible by tests

func (*Scheduler) SingleStepPreemption

func (s *Scheduler) SingleStepPreemption()

Visible by tests

func (*Scheduler) StartService

func (s *Scheduler) StartService(handlers handler.EventHandlers, manualSchedule bool)

Start service

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL