train

package
v0.0.0-...-9f00eee Latest
Warning

This package is not in the latest version of its module.

Published: Apr 30, 2021 License: Apache-2.0 Imports: 22 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type FunctionArgs

type FunctionArgs struct {
	Id  int
	Num int
}

FunctionArgs holds the arguments needed to build the URL of a function, such as the function ID and the parallelism level.
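The following is a minimal sketch of how such arguments could be turned into a URL. The helper `buildFunctionURL`, the base address, and the query parameter names `id` and `num` are assumptions made for illustration; the package does not document its actual URL scheme.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// FunctionArgs mirrors the exported struct documented above.
type FunctionArgs struct {
	Id  int
	Num int
}

// buildFunctionURL is a hypothetical helper. The query parameter names
// ("id", "num") are assumptions, not the package's real URL scheme.
func buildFunctionURL(base string, args FunctionArgs) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("id", strconv.Itoa(args.Id))   // function ID
	q.Set("num", strconv.Itoa(args.Num)) // parallelism level
	u.RawQuery = q.Encode()              // Encode sorts parameters by key
	return u.String(), nil
}

func main() {
	s, err := buildFunctionURL("http://router.example/fn", FunctionArgs{Id: 2, Num: 4})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Using `net/url` rather than string concatenation keeps the values properly escaped if the scheme later grows non-numeric parameters.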

type FunctionResults

type FunctionResults struct {
	// contains filtered or unexported fields
}

FunctionResults holds the function ID and the execution results of a function, whether it is a training or a validation function.

type FunctionTask

type FunctionTask string
const (
	Train      FunctionTask = "train"
	Validation FunctionTask = "val"
	Init       FunctionTask = "init"
	Inference  FunctionTask = "infer"
)

type MergeResult

type MergeResult int
const (
	MergeSucceeded MergeResult = iota
	MergeFailed
)
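With `iota`, MergeSucceeded is 0 and MergeFailed is 1. The package does not document a `String` method for this type; the sketch below assumes one could be added to make log output readable.

```go
package main

import "fmt"

// MergeResult and its constants are copied from the declaration above.
type MergeResult int

const (
	MergeSucceeded MergeResult = iota // 0
	MergeFailed                       // 1
)

// String is an assumed addition for illustration; it is not documented
// as part of the package.
func (r MergeResult) String() string {
	switch r {
	case MergeSucceeded:
		return "MergeSucceeded"
	case MergeFailed:
		return "MergeFailed"
	default:
		return fmt.Sprintf("MergeResult(%d)", int(r))
	}
}

func main() {
	fmt.Println(MergeSucceeded, MergeFailed)
}
```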

type TrainJob

type TrainJob struct {
	K int
	// contains filtered or unexported fields
}

TrainJob is each of the workers launched by the parameter server. Each worker is responsible for managing the reference model, saving the intermediate accuracy/validation results in the history, and requesting and receiving new scheduling responses from the scheduler.

func NewBasicJob

func NewBasicJob(logger *zap.Logger, jobId string) *TrainJob

NewBasicJob creates a job with no task provided yet. It will start the job api and wait for its task to be defined there.

This is the constructor used when deploying the jobs in separate pods.

func NewTrainJob

func NewTrainJob(
	logger *zap.Logger,
	task *api.TrainTask,
	schedulerCh chan *api.JobState,
	client *schedulerClient.Client) *TrainJob

NewTrainJob creates a new TrainJob that will take care of a specific train request.

func (*TrainJob) GetHandler

func (job *TrainJob) GetHandler() http.Handler

func (*TrainJob) Serve

func (job *TrainJob) Serve(port int)

func (*TrainJob) Train

func (job *TrainJob) Train()

Train is the main

Waits for the API to receive all the requests for starting the next epoch. After this, the job needs to send a request to the scheduler to get the proper number of functions to use in the next epoch.
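The wait-then-reschedule pattern described here can be sketched as a per-epoch barrier. Everything in the example is an assumption made for illustration: the `scheduler` interface, the `doubling` policy, and `runEpochs` are hypothetical and do not reflect the unexported internals of TrainJob or the real scheduler client.

```go
package main

import "fmt"

// scheduler is a hypothetical stand-in for the scheduler client.
type scheduler interface {
	NextParallelism(current int) int
}

// doubling is a toy policy: double the parallelism up to a maximum.
type doubling struct{ max int }

func (d doubling) NextParallelism(current int) int {
	n := current * 2
	if n > d.max {
		n = d.max
	}
	return n
}

// runEpochs launches `parallelism` functions per epoch, waits until all of
// them have reported back, and only then asks the scheduler how many
// functions to use in the next epoch. It returns the parallelism per epoch.
func runEpochs(epochs, parallelism int, s scheduler) []int {
	used := make([]int, 0, epochs)
	for e := 0; e < epochs; e++ {
		done := make(chan int, parallelism)
		for id := 0; id < parallelism; id++ {
			go func(id int) { done <- id }(id) // placeholder for a function call
		}
		for i := 0; i < parallelism; i++ {
			<-done // barrier: block until every function has finished
		}
		used = append(used, parallelism)
		parallelism = s.NextParallelism(parallelism)
	}
	return used
}

func main() {
	fmt.Println(runEpochs(3, 2, doubling{max: 4}))
}
```

Collecting completions on a buffered channel keeps the barrier simple; a `sync.WaitGroup` would work equally well when the individual results are not needed.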
