nvidia_smi

package
v0.0.0-...-0fce7b6 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Aug 26, 2019 License: MIT Imports: 10 Imported by: 0

README

nvidia-smi Input Plugin

This plugin uses a query on the nvidia-smi binary to pull GPU stats including memory and GPU usage, temp and other.

Configuration
# Pulls statistics from nvidia GPUs attached to the host
[[inputs.nvidia_smi]]
  ## Optional: path to nvidia-smi binary, defaults to $PATH via exec.LookPath
  # bin_path = "/usr/bin/nvidia-smi"

  ## Optional: timeout for GPU polling
  # timeout = "5s"
Windows

On Windows, nvidia-smi is generally located at C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe

Metrics
  • measurement: nvidia_smi
    • tags
      • name (type of GPU e.g. GeForce GTX 1070 Ti)
      • compute_mode (The compute mode of the GPU e.g. Default)
      • index (The port index where the GPU is connected to the motherboard e.g. 1)
      • pstate (Overclocking state for the GPU e.g. P0)
      • uuid (A unique identifier for the GPU e.g. GPU-f9ba66fc-a7f5-94c5-da19-019ef2f9c665)
    • fields
      • fan_speed (integer, percentage)
      • memory_free (integer, MiB)
      • memory_used (integer, MiB)
      • memory_total (integer, MiB)
      • power_draw (float, W)
      • temperature_gpu (integer, degrees C)
      • utilization_gpu (integer, percentage)
      • utilization_memory (integer, percentage)
      • pcie_link_gen_current (integer)
      • pcie_link_width_current (integer)
      • encoder_stats_session_count (integer)
      • encoder_stats_average_fps (integer)
      • encoder_stats_average_latency (integer)
      • clocks_current_graphics (integer, MHz)
      • clocks_current_sm (integer, MHz)
      • clocks_current_memory (integer, MHz)
      • clocks_current_video (integer, MHz)
Sample Query

The below query could be used to alert on the average temperature of the your GPUs over the last minute

SELECT mean("temperature_gpu") FROM "nvidia_smi" WHERE time > now() - 5m GROUP BY time(1m), "index", "name", "host"
Example Output
nvidia_smi,compute_mode=Default,host=8218cf,index=0,name=GeForce\ GTX\ 1070,pstate=P2,uuid=GPU-823bc202-6279-6f2c-d729-868a30f14d96 fan_speed=100i,memory_free=7563i,memory_total=8112i,memory_used=549i,temperature_gpu=53i,utilization_gpu=100i,utilization_memory=90i 1523991122000000000
nvidia_smi,compute_mode=Default,host=8218cf,index=1,name=GeForce\ GTX\ 1080,pstate=P2,uuid=GPU-f9ba66fc-a7f5-94c5-da19-019ef2f9c665 fan_speed=100i,memory_free=7557i,memory_total=8114i,memory_used=557i,temperature_gpu=50i,utilization_gpu=100i,utilization_memory=85i 1523991122000000000
nvidia_smi,compute_mode=Default,host=8218cf,index=2,name=GeForce\ GTX\ 1080,pstate=P2,uuid=GPU-d4cfc28d-0481-8d07-b81a-ddfc63d74adf fan_speed=100i,memory_free=7557i,memory_total=8114i,memory_used=557i,temperature_gpu=58i,utilization_gpu=100i,utilization_memory=86i 1523991122000000000
Limitations

Note that there seems to be an issue with getting current memory clock values when the memory is overclocked. This may or may not apply to everyone but it's confirmed to be an issue on an EVGA 2080 Ti.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type NvidiaSMI

type NvidiaSMI struct {
	BinPath string
	Timeout internal.Duration
	// contains filtered or unexported fields
}

NvidiaSMI holds the methods for this plugin

func (*NvidiaSMI) Description

func (smi *NvidiaSMI) Description() string

Description returns the description of the NvidiaSMI plugin

func (*NvidiaSMI) Gather

func (smi *NvidiaSMI) Gather(acc telegraf.Accumulator) error

Gather implements the telegraf interface

func (*NvidiaSMI) SampleConfig

func (smi *NvidiaSMI) SampleConfig() string

SampleConfig returns the sample configuration for the NvidiaSMI plugin

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL