Documentation ¶
Overview ¶
Package scheduler provides primitives for scheduling imbalanced data. In addition to static scheduling that reduces the load imbalance, it supports a feedback-directed optimization that adaptively adjusts the workload on each worker and a guided optimization that minimizes the zero padding for packed sequences.
Index ¶
Constants ¶
const (
	STATIC = iota
	DYNAMIC
	GUIDED
)
Variables ¶
This section is empty.
Functions ¶
Types ¶
type DynamicScheduler ¶
type DynamicScheduler struct {
// contains filtered or unexported fields
}
DynamicScheduler provides a feedback-directed optimization. It adaptively adjusts the workload on each worker, which can be useful in heterogeneous clusters where the workers have different compute capabilities.
func NewDynamicScheduler ¶
func NewDynamicScheduler(dataset data.Dataset, worldSize, batchSize int, sizes []int64) *DynamicScheduler
NewDynamicScheduler creates a new dynamic scheduler with the given arguments.
func (*DynamicScheduler) Len ¶
func (s *DynamicScheduler) Len() int
func (*DynamicScheduler) OnEpochEnd ¶
func (s *DynamicScheduler) OnEpochEnd(epoch, rank int64, coefficient, intercept float64)
OnEpochEnd updates the worker profile with the given feedback.
func (*DynamicScheduler) OnTrainEnd ¶
func (s *DynamicScheduler) OnTrainEnd()
func (*DynamicScheduler) Schedule ¶
func (s *DynamicScheduler) Schedule(step int) [][]int64
Schedule assigns the next mini-batch to each of the workers based on their performance indicators. It adopts best-fit-decreasing with a random first pivot, equalizing the training time across workers while randomizing the training sequence. This is a revised version of our original scheme for straggler mitigation against imbalanced data, which was proposed at the 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid).

Chronica paper: https://ieeexplore.ieee.org/document/10171495
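The load-equalizing idea behind the decreasing heuristic can be illustrated with a standalone sketch. This is not the package's internal implementation: the function name, the seed parameter, and the simplified rule of assigning each item to the least-loaded worker (with a random pivot only for the first, largest item) are illustrative assumptions.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// bestFitDecreasing sketches a decreasing-order assignment of sample sizes
// to workers: items are sorted largest-first, the first item goes to a
// randomly chosen pivot worker (randomizing the training sequence across
// epochs), and every later item goes to the currently least-loaded worker,
// keeping per-worker loads as equal as possible.
func bestFitDecreasing(sizes []int64, workers int, seed int64) [][]int64 {
	rng := rand.New(rand.NewSource(seed))
	order := append([]int64(nil), sizes...)
	sort.Slice(order, func(i, j int) bool { return order[i] > order[j] })

	assigned := make([][]int64, workers)
	loads := make([]int64, workers)
	pivot := rng.Intn(workers) // random first pivot

	for i, s := range order {
		w := pivot
		if i > 0 {
			// pick the least-loaded worker for every subsequent item
			w = 0
			for j := 1; j < workers; j++ {
				if loads[j] < loads[w] {
					w = j
				}
			}
		}
		assigned[w] = append(assigned[w], s)
		loads[w] += s
	}
	return assigned
}

func main() {
	// each inner slice holds the sample sizes assigned to one worker
	fmt.Println(bestFitDecreasing([]int64{7, 6, 5, 4, 3, 2, 1, 1}, 3, 42))
}
```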
type GuidedScheduler ¶ added in v0.2.0
type GuidedScheduler struct {
// contains filtered or unexported fields
}
GuidedScheduler is a padding-aware scheduler that provides a guided optimization for packed sequences. It accelerates training by reducing unnecessary operations caused by zero padding.
func NewGuidedScheduler ¶ added in v0.2.0
func NewGuidedScheduler(dataset data.Dataset, worldSize, batchSize int, sizes []int64) *GuidedScheduler
NewGuidedScheduler creates a new guided scheduler with the given arguments.
func (*GuidedScheduler) Len ¶ added in v0.2.0
func (s *GuidedScheduler) Len() int
func (*GuidedScheduler) OnEpochEnd ¶ added in v0.2.0
func (s *GuidedScheduler) OnEpochEnd(epoch, rank int64, coefficient, intercept float64)
func (*GuidedScheduler) OnTrainEnd ¶ added in v0.2.0
func (s *GuidedScheduler) OnTrainEnd()
func (*GuidedScheduler) Schedule ¶ added in v0.2.0
func (s *GuidedScheduler) Schedule(step int) [][]int64
Schedule assigns the next mini-batch to each of the workers while minimizing the zero padding. The higher the rank, the larger the mini-batch it is assigned, since additional workloads such as scheduling and parameter synchronization fall on the master.
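Why grouping similar-sized sequences reduces wasted work can be shown with a small sketch. This is a simplification, not the scheduler's actual algorithm; `paddingCost` and `sortedBatches` are hypothetical helper names, and the sketch assumes the number of sequences divides evenly into batches.

```go
package main

import (
	"fmt"
	"sort"
)

// paddingCost returns the number of zero-pad tokens needed when the given
// sequence lengths are packed into one mini-batch (every sequence is padded
// up to the longest one in the batch).
func paddingCost(lengths []int64) int64 {
	var max, sum int64
	for _, l := range lengths {
		if l > max {
			max = l
		}
		sum += l
	}
	return max*int64(len(lengths)) - sum
}

// sortedBatches is a simplified padding-aware grouping: sorting by length
// and batching neighbors keeps similar-sized sequences together, which
// shrinks the gap to the batch maximum and hence the pad tokens.
// Assumes len(lengths) is a multiple of batchSize.
func sortedBatches(lengths []int64, batchSize int) [][]int64 {
	sorted := append([]int64(nil), lengths...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	var batches [][]int64
	for i := 0; i < len(sorted); i += batchSize {
		batches = append(batches, sorted[i:i+batchSize])
	}
	return batches
}

func main() {
	lengths := []int64{3, 17, 4, 18, 5, 16, 2, 19}
	var padded int64
	for _, b := range sortedBatches(lengths, 4) {
		padded += paddingCost(b)
	}
	// total pad tokens with length-sorted batches; batching these lengths
	// in their original order would pad far more
	fmt.Println(padded)
}
```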
type Scheduler ¶
type Scheduler interface {
	// Schedule selects data samples for the next mini-batch.
	Schedule(step int) [][]int64

	// Len returns the number of required steps for each training epoch.
	Len() int

	// OnEpochEnd is called at the end of an epoch during training.
	OnEpochEnd(epoch, rank int64, coefficient, intercept float64)

	// OnTrainEnd terminates the training environment.
	OnTrainEnd()
}
Scheduler represents the data scheduler. In addition to the scheduling algorithm, an implementation must provide the callbacks OnEpochEnd and OnTrainEnd, which are invoked at the end of each epoch and at the end of training, respectively.
type StaticScheduler ¶
type StaticScheduler struct {
// contains filtered or unexported fields
}
StaticScheduler provides a balanced workload to each of the workers while limiting the peak device memory usage; this allows a larger batch size, which reduces communication overhead and thereby improves scalability.
func NewStaticScheduler ¶
func NewStaticScheduler(dataset data.Dataset, worldSize, batchSize int, sizes []int64) *StaticScheduler
NewStaticScheduler creates a new static scheduler with the given arguments.
func (*StaticScheduler) Len ¶
func (s *StaticScheduler) Len() int
func (*StaticScheduler) OnEpochEnd ¶
func (s *StaticScheduler) OnEpochEnd(epoch, rank int64, coefficient, intercept float64)
func (*StaticScheduler) OnTrainEnd ¶
func (s *StaticScheduler) OnTrainEnd()
func (*StaticScheduler) Schedule ¶
func (s *StaticScheduler) Schedule(step int) [][]int64
Schedule assigns the next mini-batch to each of the workers. It adopts first-fit-decreasing (FFD), an approximately optimal heuristic for bin packing.

FFD paper: https://dspace.mit.edu/bitstream/handle/1721.1/57819/17595570-MIT.pdf

Python implementation: https://github.com/erelsgl/prtpy/blob/ebe54010513ea725f7a3221e4aa0258afa15d6fb/prtpy/packing/first_fit.py
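The FFD heuristic itself is easy to sketch in isolation. The function name, bin capacity, and sample sizes below are illustrative assumptions, not this package's internals.

```go
package main

import (
	"fmt"
	"sort"
)

// firstFitDecreasing packs sample sizes into bins of the given capacity:
// sort the items in decreasing order, then place each item into the first
// bin that still has room, opening a new bin when none fits.
func firstFitDecreasing(sizes []int64, capacity int64) [][]int64 {
	sorted := append([]int64(nil), sizes...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] > sorted[j] })

	var bins [][]int64
	var loads []int64
	for _, s := range sorted {
		placed := false
		for b := range bins {
			if loads[b]+s <= capacity {
				bins[b] = append(bins[b], s)
				loads[b] += s
				placed = true
				break
			}
		}
		if !placed {
			bins = append(bins, []int64{s})
			loads = append(loads, s)
		}
	}
	return bins
}

func main() {
	// each inner slice is one bin whose total stays within the capacity
	fmt.Println(firstFitDecreasing([]int64{9, 8, 5, 4, 2, 2}, 10))
}
```

Placing the largest items first is what makes the heuristic effective: small items left for the end slot neatly into the remaining gaps.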