scheduler

package
v0.2.6 Latest
Published: Feb 16, 2024 License: Apache-2.0 Imports: 4 Imported by: 0

Documentation

Overview

Package scheduler provides primitives for scheduling imbalanced data. In addition to static scheduling that reduces the load imbalance, it supports a feedback-directed optimization that adaptively adjusts the workload on each worker and a guided optimization that minimizes the zero padding for packed sequences.
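The sketch below shows how these pieces are typically wired into a training loop. It is only an orientation: the import paths, the dispatch step, and the measured feedback are assumptions for illustration, not part of this package.

// A sketch of driving the scheduler across epochs; the import paths,
// dispatch, and feedback values are placeholders.
package train

import (
	"github.com/9rum/chronica/data"
	"github.com/9rum/chronica/scheduler"
)

func run(dataset data.Dataset, worldSize, batchSize int, sizes []int64, epochs int64) {
	sched := scheduler.New(dataset, worldSize, batchSize, sizes, int32(scheduler.DYNAMIC))
	defer sched.OnTrainEnd()

	for epoch := int64(0); epoch < epochs; epoch++ {
		// Row r of the returned matrix holds the sample indices for rank r.
		for rank, indices := range scheduler.Next(sched) {
			_, _ = rank, indices // dispatch these samples to worker rank
		}
		// Each rank then reports its fitted step-time model so the dynamic
		// scheduler can rebalance, e.g.:
		//   sched.OnEpochEnd(epoch, rank, coefficient, intercept)
	}
}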

Index

Constants

const (
	// STATIC selects the static scheduler, which balances the workload before training begins.
	STATIC = iota
	// DYNAMIC selects the feedback-directed dynamic scheduler.
	DYNAMIC
	// GUIDED selects the padding-aware guided scheduler for packed sequences.
	GUIDED
)

Variables

This section is empty.

Functions

func Next

func Next(scheduler Scheduler) [][]int64

Next returns mini-batches for the next training epoch as a matrix of shape (world size, # of samples), where each row holds the sample indices assigned to the corresponding rank.

Types

type DynamicScheduler

type DynamicScheduler struct {
	// contains filtered or unexported fields
}

DynamicScheduler provides a feedback-directed optimization. It adaptively adjusts the workload on each worker, which can be useful in heterogeneous clusters where the workers have different compute capabilities.

func NewDynamicScheduler

func NewDynamicScheduler(dataset data.Dataset, worldSize, batchSize int, sizes []int64) *DynamicScheduler

NewDynamicScheduler creates a new dynamic scheduler with the given arguments.

func (*DynamicScheduler) Len

func (s *DynamicScheduler) Len() int

func (*DynamicScheduler) OnEpochEnd

func (s *DynamicScheduler) OnEpochEnd(epoch, rank int64, coefficient, intercept float64)

OnEpochEnd updates the worker profile with the given feedback.
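The parameter names suggest a per-worker linear time model, time ≈ coefficient*size + intercept. Assuming that model, a worker could derive its feedback with an ordinary least-squares fit like the hypothetical helper below; the fitting itself is up to the caller.

// fitLinear computes a least-squares fit time ≈ coefficient*size + intercept
// from per-sample sizes and measured processing times. The linear model is
// an assumption inferred from OnEpochEnd's parameter names.
package main

import "fmt"

func fitLinear(sizes, times []float64) (coefficient, intercept float64) {
	n := float64(len(sizes))
	var sx, sy, sxx, sxy float64
	for i := range sizes {
		sx += sizes[i]
		sy += times[i]
		sxx += sizes[i] * sizes[i]
		sxy += sizes[i] * times[i]
	}
	coefficient = (n*sxy - sx*sy) / (n*sxx - sx*sx)
	intercept = (sy - coefficient*sx) / n
	return coefficient, intercept
}

func main() {
	// Synthetic data following time = 0.5*size + 2.
	c, b := fitLinear([]float64{2, 4, 6, 8}, []float64{3, 4, 5, 6})
	fmt.Println(c, b) // 0.5 2
}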

func (*DynamicScheduler) OnTrainEnd

func (s *DynamicScheduler) OnTrainEnd()

func (*DynamicScheduler) Schedule

func (s *DynamicScheduler) Schedule(step int) [][]int64

Schedule assigns the next mini-batch to each of the workers based on their performance indicators. It adopts best-fit-decreasing with a random first pivot to equalize the training time while randomizing the training sequence. This is a revised version of our original scheme for straggler mitigation against imbalanced data, which was proposed at the 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid).

Chronica paper: https://ieeexplore.ieee.org/document/10171495
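To make the heuristic concrete, here is a minimal self-contained sketch of best-fit-decreasing with a random first pivot. For brevity it balances raw sample sizes across workers, whereas the scheduler itself weights samples by each worker's fitted performance model; all names are illustrative.

// bestFitDecreasing packs sample indices into one bin per worker.
// Illustrative only: the real scheduler balances predicted times,
// not raw sizes.
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

func bestFitDecreasing(sizes []int64, workers int) [][]int64 {
	order := make([]int64, len(sizes))
	for i := range order {
		order[i] = int64(i)
	}
	// Largest samples first.
	sort.Slice(order, func(i, j int) bool { return sizes[order[i]] > sizes[order[j]] })
	// Random first pivot: start placement at a random point of the sorted
	// order so the training sequence still varies between epochs.
	pivot := rand.Intn(len(order))
	order = append(order[pivot:], order[:pivot]...)

	// Use an even share of the total size as each worker's capacity.
	var total int64
	for _, s := range sizes {
		total += s
	}
	capacity := total / int64(workers)

	bins := make([][]int64, workers)
	load := make([]int64, workers)
	for _, idx := range order {
		// Best fit: the worker whose leftover capacity is smallest while
		// still accommodating the sample.
		best, bestLeft := -1, int64(0)
		for w := 0; w < workers; w++ {
			if left := capacity - load[w] - sizes[idx]; left >= 0 && (best == -1 || left < bestLeft) {
				best, bestLeft = w, left
			}
		}
		if best == -1 { // nothing fits; fall back to the least-loaded worker
			best = 0
			for w := 1; w < workers; w++ {
				if load[w] < load[best] {
					best = w
				}
			}
		}
		bins[best] = append(bins[best], idx)
		load[best] += sizes[idx]
	}
	return bins
}

func main() {
	// Two index sets with roughly equal total sizes.
	fmt.Println(bestFitDecreasing([]int64{9, 7, 6, 4, 3, 2, 1}, 2))
}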

type GuidedScheduler added in v0.2.0

type GuidedScheduler struct {
	// contains filtered or unexported fields
}

GuidedScheduler is a padding-aware scheduler that provides a guided optimization for packed sequences. It accelerates training by reducing unnecessary operations caused by zero padding.

func NewGuidedScheduler added in v0.2.0

func NewGuidedScheduler(dataset data.Dataset, worldSize, batchSize int, sizes []int64) *GuidedScheduler

NewGuidedScheduler creates a new guided scheduler with the given arguments.

func (*GuidedScheduler) Len added in v0.2.0

func (s *GuidedScheduler) Len() int

func (*GuidedScheduler) OnEpochEnd added in v0.2.0

func (s *GuidedScheduler) OnEpochEnd(epoch, rank int64, coefficient, intercept float64)

func (*GuidedScheduler) OnTrainEnd added in v0.2.0

func (s *GuidedScheduler) OnTrainEnd()

func (*GuidedScheduler) Schedule added in v0.2.0

func (s *GuidedScheduler) Schedule(step int) [][]int64

Schedule assigns the next mini-batch to each of the workers while minimizing the zero padding. Larger mini-batches are assigned to higher ranks, since additional workloads such as scheduling and parameter synchronization fall on the master.
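The effect of padding-aware grouping is easy to see in isolation: when a mini-batch is padded to its longest sequence, batching similar lengths together shrinks the waste. The toy demonstration below illustrates the principle only, not the package's actual algorithm.

// paddingWaste counts the zero-padded elements when every sequence in a
// mini-batch is padded to the batch's longest sequence. Illustrative only.
package main

import (
	"fmt"
	"sort"
)

func paddingWaste(lengths []int64) int64 {
	var max, sum int64
	for _, l := range lengths {
		if l > max {
			max = l
		}
		sum += l
	}
	return max*int64(len(lengths)) - sum
}

func main() {
	lengths := []int64{3, 9, 4, 8, 2, 10, 5, 7}
	// Mixing short and long sequences pads heavily.
	fmt.Println(paddingWaste(lengths[:4]), paddingWaste(lengths[4:])) // 12 16
	// Grouping similar lengths cuts the waste sharply.
	sort.Slice(lengths, func(i, j int) bool { return lengths[i] < lengths[j] })
	fmt.Println(paddingWaste(lengths[:4]), paddingWaste(lengths[4:])) // 6 6
}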

type Scheduler

type Scheduler interface {
	// Schedule selects data samples for the next mini-batch.
	Schedule(step int) [][]int64

	// Len returns the number of required steps for each training epoch.
	Len() int

	// OnEpochEnd is called at the end of an epoch during training.
	OnEpochEnd(epoch, rank int64, coefficient, intercept float64)

	// OnTrainEnd terminates the training environment.
	OnTrainEnd()
}

Scheduler represents the data scheduler. In addition to the scheduling algorithm itself, an implementation must provide the OnEpochEnd and OnTrainEnd callbacks, which are invoked at the end of each epoch and at the end of training, respectively.
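As a reference for the contract, here is a toy implementation: a round-robin scheduler that ignores sample sizes and feedback. The module path in the import is an assumption.

// roundRobin is a toy Scheduler that deals sample indices out cyclically;
// it illustrates the interface contract only (module path assumed).
package train

import "github.com/9rum/chronica/scheduler"

type roundRobin struct {
	worldSize, batchSize, numSamples int
}

// Schedule hands out the next batchSize indices per rank in index order.
func (r *roundRobin) Schedule(step int) [][]int64 {
	out := make([][]int64, r.worldSize)
	base := step * r.batchSize * r.worldSize
	for i := 0; i < r.batchSize*r.worldSize; i++ {
		out[i%r.worldSize] = append(out[i%r.worldSize], int64(base+i))
	}
	return out
}

// Len reports how many steps one epoch takes.
func (r *roundRobin) Len() int { return r.numSamples / (r.batchSize * r.worldSize) }

// OnEpochEnd discards the feedback; a real scheduler would refit its model.
func (r *roundRobin) OnEpochEnd(epoch, rank int64, coefficient, intercept float64) {}

// OnTrainEnd is a no-op because roundRobin holds no resources.
func (r *roundRobin) OnTrainEnd() {}

// Compile-time check that roundRobin satisfies the interface.
var _ scheduler.Scheduler = (*roundRobin)(nil)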

func New

func New[T ~int32](dataset data.Dataset, worldSize, batchSize int, sizes []int64, kind T) Scheduler

New creates a new scheduler with the given arguments.
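Since kind is constrained to ~int32 and the kind constants above are untyped, a call site converts explicitly. A hypothetical example (import paths assumed):

// newGuided builds a padding-aware scheduler; int32 satisfies ~int32, as
// would any defined type with int32 as its underlying type.
package train

import (
	"github.com/9rum/chronica/data"
	"github.com/9rum/chronica/scheduler"
)

func newGuided(dataset data.Dataset, worldSize, batchSize int, sizes []int64) scheduler.Scheduler {
	return scheduler.New(dataset, worldSize, batchSize, sizes, int32(scheduler.GUIDED))
}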

type StaticScheduler

type StaticScheduler struct {
	// contains filtered or unexported fields
}

StaticScheduler provides a balanced workload to each of the workers while limiting the peak device memory usage; this allows for a larger batch size, reducing communication overhead and thereby improving scalability.

func NewStaticScheduler

func NewStaticScheduler(dataset data.Dataset, worldSize, batchSize int, sizes []int64) *StaticScheduler

NewStaticScheduler creates a new static scheduler with the given arguments.

func (*StaticScheduler) Len

func (s *StaticScheduler) Len() int

func (*StaticScheduler) OnEpochEnd

func (s *StaticScheduler) OnEpochEnd(epoch, rank int64, coefficient, intercept float64)

func (*StaticScheduler) OnTrainEnd

func (s *StaticScheduler) OnTrainEnd()

func (*StaticScheduler) Schedule

func (s *StaticScheduler) Schedule(step int) [][]int64

Schedule assigns the next mini-batch to each of the workers. It adopts first-fit-decreasing (FFD), an approximately optimal heuristic for bin packing.

FFD paper: https://dspace.mit.edu/bitstream/handle/1721.1/57819/17595570-MIT.pdf
Python implementation: https://github.com/erelsgl/prtpy/blob/ebe54010513ea725f7a3221e4aa0258afa15d6fb/prtpy/packing/first_fit.py
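For concreteness, here is a compact, self-contained sketch of FFD itself: sort items largest first and place each into the first bin with room, opening a new bin when none fits. The fixed capacity and all names are illustrative, not the scheduler's internals.

// firstFitDecreasing packs sample indices into bins of the given capacity
// using the FFD heuristic. Illustrative only.
package main

import (
	"fmt"
	"sort"
)

func firstFitDecreasing(sizes []int64, capacity int64) [][]int64 {
	order := make([]int, len(sizes))
	for i := range order {
		order[i] = i
	}
	// Largest items first.
	sort.Slice(order, func(i, j int) bool { return sizes[order[i]] > sizes[order[j]] })

	var bins [][]int64 // sample indices per bin
	var loads []int64  // running size of each bin
	for _, idx := range order {
		placed := false
		for b := range bins {
			if loads[b]+sizes[idx] <= capacity { // first bin that fits
				bins[b] = append(bins[b], int64(idx))
				loads[b] += sizes[idx]
				placed = true
				break
			}
		}
		if !placed { // no bin fits; open a new one
			bins = append(bins, []int64{int64(idx)})
			loads = append(loads, sizes[idx])
		}
	}
	return bins
}

func main() {
	fmt.Println(firstFitDecreasing([]int64{9, 8, 2, 2, 5, 4}, 10))
	// Output: [[0] [1 2] [4 5] [3]]
}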
