lcm

package
v0.1.1-rc.0...-32aea41
Warning

This package is not in the latest version of its module.

Published: Dec 12, 2022 | License: Apache-2.0, CC-BY-4.0, MIT | Imports: 37 | Imported by: 0

Documentation

Constants

const PodLevelJobDir = "/job"

PodLevelJobDir represents the place to store the job state indicator files, as well as the $BREAK_FILE and $EXITCODE_FILE.

const PodLevelLogDir = PodLevelJobDir + "/logs"

PodLevelLogDir represents the place to store the per-learner logs.
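As a minimal sketch of how the two constants compose, the program below copies them locally and builds a per-learner log path. The helper learnerLogPath, the "learner-<n>" subdirectory naming, and the file name "training-log.txt" are assumptions made for illustration; they are not documented parts of this package.

```go
package main

import (
	"fmt"
	"path"
)

// Local copies of the documented constants so the sketch is self-contained.
const (
	PodLevelJobDir = "/job"
	PodLevelLogDir = PodLevelJobDir + "/logs"
)

// learnerLogPath is a hypothetical helper, not part of the lcm package.
// The directory layout and file name below are assumed for illustration.
func learnerLogPath(learnerID int) string {
	return path.Join(PodLevelLogDir, fmt.Sprintf("learner-%d", learnerID), "training-log.txt")
}

func main() {
	fmt.Println(PodLevelLogDir)    // /job/logs
	fmt.Println(learnerLogPath(1)) // /job/logs/learner-1/training-log.txt
}
```

Because PodLevelLogDir is derived from PodLevelJobDir at compile time, changing the job directory would move the log directory with it.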

Variables

var (
	// NativeFrameworks lists the frameworks that support native distribution.
	NativeFrameworks = []string{"tensorflow", "caffe2", "mxnet", "horovod"}
)
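A caller might consult this list to decide whether a framework can rely on its own distribution mechanism. The sketch below copies the slice locally and adds a hypothetical helper, isNativeFramework, which is not part of the package; exact, case-sensitive matching is an assumption.

```go
package main

import "fmt"

// Local copy of the documented variable so the sketch is self-contained.
var NativeFrameworks = []string{"tensorflow", "caffe2", "mxnet", "horovod"}

// isNativeFramework is a hypothetical helper, not part of the lcm package.
// It reports whether name appears in NativeFrameworks (exact match assumed).
func isNativeFramework(name string) bool {
	for _, f := range NativeFrameworks {
		if f == name {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isNativeFramework("tensorflow")) // true
	fmt.Println(isNativeFramework("pytorch"))    // false
}
```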

Functions

func GetVolumeClaim

func GetVolumeClaim(volumeSize int64) (*v1core.PersistentVolumeClaim, error)

GetVolumeClaim returns a PersistentVolumeClaim struct for the given volume size (specified in bytes).
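Since volumeSize is given in bytes, callers that think in larger units need to convert first. The following sketch shows one way to do that; gibToBytes is a hypothetical helper, and the commented-out call only illustrates where the value would go, using the signature documented above.

```go
package main

import "fmt"

// gib is the number of bytes in one gibibyte (2^30).
const gib int64 = 1 << 30

// gibToBytes is a hypothetical helper, not part of the lcm package.
func gibToBytes(n int64) int64 {
	return n * gib
}

func main() {
	size := gibToBytes(20)
	fmt.Println(size) // 21474836480

	// With the package imported, the value would be passed as volumeSize:
	//   claim, err := lcm.GetVolumeClaim(size)
}
```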

func InitLogger

func InitLogger(trainingID string, userID string) *log.Entry

InitLogger initializes a new logger with the given trainingID and userID.

Types

type Service

type Service interface {
	service.LifecycleManagerServer
	service.LifecycleHandler
	StopLCM()
}

Service is the LCM interface. The LCM manages the lifecycle of the entire distributed deep learning job.

func NewService

func NewService() (Service, error)

NewService constructs and initializes the LCM service.

type Training

type Training interface {
	Start() error
}

Training ...
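Because Training has a single method, any type with a Start() error method satisfies it. The sketch below redeclares the interface locally and implements it with a hypothetical fakeTraining type, the kind of stand-in one might use in tests; fakeTraining and run are illustrative names, not part of this package.

```go
package main

import (
	"errors"
	"fmt"
)

// Training mirrors the documented interface so the sketch is self-contained.
type Training interface {
	Start() error
}

// fakeTraining is a hypothetical implementation used only for illustration.
type fakeTraining struct {
	fail    bool // when true, Start returns an error
	started bool // set once Start succeeds
}

func (t *fakeTraining) Start() error {
	if t.fail {
		return errors.New("fake training failed to start")
	}
	t.started = true
	return nil
}

// run accepts any Training and wraps a start failure with context.
func run(t Training) error {
	if err := t.Start(); err != nil {
		return fmt.Errorf("starting training: %w", err)
	}
	return nil
}

func main() {
	fmt.Println(run(&fakeTraining{}))           // <nil>
	fmt.Println(run(&fakeTraining{fail: true})) // starting training: fake training failed to start
}
```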

func NewTraining

NewTraining ...
