slurm

package
v0.1.2
Published: Apr 8, 2024 License: Apache-2.0 Imports: 11 Imported by: 0

Documentation

Overview

Package slurm contains code for accessing compute resources via Slurm.

Constants

This section is empty.

Variables

var ErrInvalidJob = errors.New("invalid job id")
var ErrRety = errors.New("retry later")
var ExcludeNodes = "--exclude="

ExcludeNodes exists only for debugging Inotify on NFS.

var NewUserEnv = "--get-user-env=10L"

NewUserEnv is used to generate the /run/user/ folder required by cgroups. The optional mode value controls the "su" options: with a mode value of "S", "su" is executed without the "-" option; with a mode value of "L", "su" is executed with the "-" option, replicating the login environment.
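
The shape of this flag can be illustrated with a small sketch. It assumes the sbatch form --get-user-env[=timeout][mode]; userEnvFlag is a hypothetical helper written for this example and is not part of the package.

```go
package main

import (
	"fmt"
	"strconv"
)

// userEnvFlag is a hypothetical helper (not part of this package) that
// assembles a flag of the form --get-user-env[=timeout][mode].
// mode must be "", "S" (su without "-"), or "L" (su with "-").
func userEnvFlag(timeoutSeconds int, mode string) (string, error) {
	switch mode {
	case "", "S", "L":
		// Accepted mode values.
	default:
		return "", fmt.Errorf("unsupported mode %q", mode)
	}
	if timeoutSeconds <= 0 && mode == "" {
		// Neither a timeout nor a mode: emit the bare flag.
		return "--get-user-env", nil
	}
	flag := "--get-user-env="
	if timeoutSeconds > 0 {
		flag += strconv.Itoa(timeoutSeconds)
	}
	return flag + mode, nil
}

func main() {
	flag, err := userEnvFlag(10, "L")
	if err != nil {
		panic(err)
	}
	fmt.Println(flag) // prints "--get-user-env=10L"
}
```

With a timeout of 10 and mode "L", the helper reproduces the default value of NewUserEnv.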

var Signal = "--signal=TERM"
var SignalChildren = "--full"

SignalChildren signals the batch (submission) script and its child processes, including the pause environment.

var SignalParentOnly = "--batch"

SignalParentOnly signals only the submission script. See https://slurm.schedmd.com/scancel.html#OPT_batch
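
As a rough sketch of how these flag values might be combined into a cancellation argument string: the constants below are local copies of the documented values, and cancelArgs is a hypothetical helper. Joining the arguments with spaces is an assumption, since this page does not document how CancelJob interprets its args parameter.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the flag values documented above.
const (
	signal           = "--signal=TERM"
	signalChildren   = "--full"
	signalParentOnly = "--batch"
)

// cancelArgs is a hypothetical helper (not part of this package).
// It picks the signalling scope and appends the job ID.
// Assumption: the arguments are passed as one space-separated string.
func cancelArgs(jobID string, includeChildren bool) string {
	scope := signalParentOnly
	if includeChildren {
		scope = signalChildren
	}
	return strings.Join([]string{signal, scope, jobID}, " ")
}

func main() {
	fmt.Println(cancelArgs("1234", true))  // prints "--signal=TERM --full 1234"
	fmt.Println(cancelArgs("1234", false)) // prints "--signal=TERM --batch 1234"
}
```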

var Slurm struct {
	SubmitCmd string
	CancelCmd string
	StatsCmd  string
}

Slurm represents a SLURM installation.

Functions

func AllocatableResources added in v0.1.2

func AllocatableResources(ctx context.Context) corev1.ResourceList

func CancelJob added in v0.1.2

func CancelJob(args string) (string, error)

func ConnectionOK added in v0.1.2

func ConnectionOK() bool

ConnectionOK returns true if HPK maintains a connection with the Slurm manager. Otherwise, it returns false.

func GetJobID added in v0.1.2

func GetJobID(pod *corev1.Pod) string

func HasJobID added in v0.1.2

func HasJobID(pod *corev1.Pod) bool

func SetContainerStatusID added in v0.1.2

func SetContainerStatusID(status *corev1.ContainerStatus, typedValue string)

func SetPodID added in v0.1.2

func SetPodID(pod *corev1.Pod, idType JobIDType, value string)

func SubmitJob added in v0.1.2

func SubmitJob(scriptFile string) (string, error)

func TotalResources added in v0.1.2

func TotalResources() corev1.ResourceList

Types

type JobIDType added in v0.1.2

type JobIDType string
const (
	JobIDTypeInstance JobIDType = "instance://"

	JobIDTypeProcess JobIDType = "pid://"

	JobIDTypeSlurm JobIDType = "slurm://"

	JobIDTypeEmpty JobIDType = "Empty"
)
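
The prefix-style values suggest that an identifier is stored as a type prefix followed by a value, for example "slurm://1234". The sketch below works under that assumption; splitTypedID is a hypothetical helper, and the package's own encoding in SetPodID and GetJobID may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copy of the documented type and constants.
type JobIDType string

const (
	JobIDTypeInstance JobIDType = "instance://"
	JobIDTypeProcess  JobIDType = "pid://"
	JobIDTypeSlurm    JobIDType = "slurm://"
	JobIDTypeEmpty    JobIDType = "Empty"
)

// splitTypedID is a hypothetical helper (not part of this package).
// It splits a typed ID into its prefix and value, and returns
// JobIDTypeEmpty when no known prefix matches.
func splitTypedID(typed string) (JobIDType, string) {
	known := []JobIDType{JobIDTypeInstance, JobIDTypeProcess, JobIDTypeSlurm}
	for _, t := range known {
		if strings.HasPrefix(typed, string(t)) {
			return t, strings.TrimPrefix(typed, string(t))
		}
	}
	return JobIDTypeEmpty, ""
}

func main() {
	typed := string(JobIDTypeSlurm) + "1234"
	idType, value := splitTypedID(typed)
	fmt.Println(idType, value) // prints "slurm:// 1234"
}
```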

type NodeInfo added in v0.1.2

type NodeInfo struct {
	Architecture  string `json:"architecture"`
	KernelVersion string `json:"operating_system"`

	Name     string `json:"name"`
	CPUs     uint64 `json:"cpus"`
	CPUCores uint64 `json:"cores"`

	EphemeralStorage uint64 `json:"temporary_disk"`

	// FreeMemory is reported in megabytes.
	// TODO: temporarily changed to int64 because Slurm sometimes reports free memory as "-2".
	FreeMemory int64    `json:"free_memory"`
	Partitions []string `json:"partitions"`
}

func (NodeInfo) ResourceList added in v0.1.2

func (i NodeInfo) ResourceList() corev1.ResourceList

ResourceList converts the Slurm-reported stats into a Kubernetes ResourceList.

type Stats added in v0.1.2

type Stats struct {
	Nodes []NodeInfo `json:"nodes"`
}
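
The JSON tags on NodeInfo and Stats imply a payload of the following shape. The sketch below decodes an invented sample (the node name, values, and partition are made up for illustration) into local copies of the two structs. decodeStats and usableFreeMemory are hypothetical helpers; clamping negative free memory to zero is an assumption about how the "-2" case could be handled, not the package's documented behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented structs.
type NodeInfo struct {
	Architecture     string   `json:"architecture"`
	KernelVersion    string   `json:"operating_system"`
	Name             string   `json:"name"`
	CPUs             uint64   `json:"cpus"`
	CPUCores         uint64   `json:"cores"`
	EphemeralStorage uint64   `json:"temporary_disk"`
	FreeMemory       int64    `json:"free_memory"` // megabytes; may be negative
	Partitions       []string `json:"partitions"`
}

type Stats struct {
	Nodes []NodeInfo `json:"nodes"`
}

// sample is an invented payload used only for illustration.
const sample = `{"nodes":[{"name":"node01","architecture":"x86_64",
"operating_system":"Linux 5.15","cpus":16,"cores":8,"temporary_disk":0,
"free_memory":-2,"partitions":["debug"]}]}`

// decodeStats is a hypothetical helper that unmarshals a stats payload.
func decodeStats(data []byte) (Stats, error) {
	var s Stats
	err := json.Unmarshal(data, &s)
	return s, err
}

// usableFreeMemory is a hypothetical helper that treats negative
// free-memory readings as zero.
func usableFreeMemory(n NodeInfo) int64 {
	if n.FreeMemory < 0 {
		return 0
	}
	return n.FreeMemory
}

func main() {
	stats, err := decodeStats([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, n := range stats.Nodes {
		fmt.Printf("%s cpus=%d free=%dMB partitions=%v\n",
			n.Name, n.CPUs, usableFreeMemory(n), n.Partitions)
	}
}
```

Declaring FreeMemory as int64 is what lets the "-2" reading decode without error; a uint64 field would make json.Unmarshal fail on that payload.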
