ceph

package
v1.0.0-...-47f3d97
Published: Jan 30, 2019 License: MIT Imports: 10 Imported by: 0

README

Ceph Storage Input Plugin

Collects performance metrics from the MON and OSD nodes in a Ceph storage cluster.

Admin Socket Stats

This gatherer scans the configured SocketDir (socket_dir) for OSD and MON socket files. When it finds a MON socket, it runs ceph --admin-daemon $file perfcounters_dump. For an OSD socket, it runs ceph --admin-daemon $file perf dump.

The resulting JSON is parsed and grouped into collections by top-level key. Each top-level key becomes the value of the collection tag, and all sub-keys are flattened into dot-separated field names. For example:

 {
   "paxos": {
     "refresh": 9363435,
     "refresh_latency": {
       "avgcount": 9363435,
       "sum": 5378.794002000
     }
   }
 }

This would be parsed into the following metrics, all of which would be tagged with collection=paxos:

  • refresh = 9363435
  • refresh_latency.avgcount = 9363435
  • refresh_latency.sum = 5378.794002000
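
The grouping and flattening described above can be sketched in Go like this. It is an illustration under stated assumptions rather than the plugin's implementation: flatten and groupByCollection are hypothetical names, and non-numeric leaves are simply skipped.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// flatten walks a decoded JSON value and stores every numeric leaf in out,
// joining nested keys with a dot.
func flatten(prefix string, value interface{}, out map[string]float64) {
	switch v := value.(type) {
	case map[string]interface{}:
		for key, child := range v {
			name := key
			if prefix != "" {
				name = prefix + "." + key
			}
			flatten(name, child, out)
		}
	case float64:
		out[prefix] = v
	}
}

// groupByCollection (hypothetical) decodes a perf dump and returns one
// flattened field map per top-level key.
func groupByCollection(dump []byte) (map[string]map[string]float64, error) {
	var raw map[string]interface{}
	if err := json.Unmarshal(dump, &raw); err != nil {
		return nil, err
	}
	collections := make(map[string]map[string]float64)
	for collection, body := range raw {
		if _, ok := body.(map[string]interface{}); !ok {
			continue // assumption: only objects form a collection
		}
		fields := make(map[string]float64)
		flatten("", body, fields)
		collections[collection] = fields
	}
	return collections, nil
}

func main() {
	dump := []byte(`{"paxos":{"refresh":9363435,"refresh_latency":{"avgcount":9363435,"sum":5378.794002}}}`)
	collections, err := groupByCollection(dump)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for collection, fields := range collections {
		names := make([]string, 0, len(fields))
		for name := range fields {
			names = append(names, name)
		}
		sort.Strings(names) // map order is random; sort for stable output
		for _, name := range names {
			fmt.Printf("collection=%s %s=%v\n", collection, name, fields[name])
		}
	}
}
```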

Cluster Stats

This gatherer invokes ceph commands against the cluster, so it requires only the ceph client, a valid ceph configuration, and an access key (the ceph_config and ceph_user configuration variables together specify these prerequisites). It can run on any server that has access to the cluster. The currently supported commands are:

  • ceph status
  • ceph df
  • ceph osd pool stats

Configuration:

# Collects performance metrics from the MON and OSD nodes in a Ceph storage cluster.
[[inputs.ceph]]
  ## This is the recommended interval to poll.  Too frequent and you will lose
  ## data points due to timeouts during rebalancing and recovery
  interval = '1m'

  ## All configuration values are optional, defaults are shown below

  ## location of ceph binary
  ceph_binary = "/usr/bin/ceph"

  ## directory in which to look for socket files
  socket_dir = "/var/run/ceph"

  ## prefix of MON and OSD socket files, used to determine socket type
  mon_prefix = "ceph-mon"
  osd_prefix = "ceph-osd"

  ## suffix used to identify socket files
  socket_suffix = "asok"

  ## Ceph user to authenticate as. Ceph searches for the corresponding keyring,
  ## e.g. client.admin.keyring in /etc/ceph, or uses the explicit path defined in
  ## the client section of ceph.conf, for example:
  ##
  ##     [client.telegraf]
  ##         keyring = /etc/ceph/client.telegraf.keyring
  ##
  ## Consult the ceph documentation for more detail on keyring generation.
  ceph_user = "client.admin"

  ## Ceph configuration to use to locate the cluster
  ceph_config = "/etc/ceph/ceph.conf"

  ## Whether to gather statistics via the admin socket
  gather_admin_socket_stats = true

  ## Whether to gather statistics via ceph commands, requires ceph_user and ceph_config
  ## to be specified
  gather_cluster_stats = false

Measurements & Fields:

Admin Socket Stats

All fields are collected under the ceph measurement and stored as float64s. For a full list of fields, see the sample perf dumps in ceph_test.go.

Cluster Stats

  • ceph_osdmap

    • epoch (float)
    • full (boolean)
    • nearfull (boolean)
    • num_in_osds (float)
    • num_osds (float)
    • num_remapped_pgs (float)
    • num_up_osds (float)
  • ceph_pgmap

    • bytes_avail (float)
    • bytes_total (float)
    • bytes_used (float)
    • data_bytes (float)
    • num_pgs (float)
    • op_per_sec (float, ceph < 10)
    • read_op_per_sec (float)
    • write_op_per_sec (float)
    • read_bytes_sec (float)
    • version (float)
    • write_bytes_sec (float)
    • recovering_bytes_per_sec (float)
    • recovering_keys_per_sec (float)
    • recovering_objects_per_sec (float)
  • ceph_pgmap_state

    • count (float)
  • ceph_usage

    • bytes_used (float)
    • kb_used (float)
    • max_avail (float)
    • objects (float)
  • ceph_pool_usage

    • bytes_used (float)
    • kb_used (float)
    • max_avail (float)
    • objects (float)
  • ceph_pool_stats

    • op_per_sec (float, ceph < 10)
    • read_op_per_sec (float)
    • write_op_per_sec (float)
    • read_bytes_sec (float)
    • write_bytes_sec (float)
    • recovering_objects_per_sec (float)
    • recovering_bytes_per_sec (float)
    • recovering_keys_per_sec (float)

Tags:

Admin Socket Stats

All measurements will have the following tags:

  • type: either 'osd' or 'mon' to indicate which type of node was queried
  • id: a unique string identifier, parsed from the socket file name for the node
  • collection: the top-level key under which these fields were reported. Possible values are:
    • for MON nodes:
      • cluster
      • leveldb
      • mon
      • paxos
      • throttle-mon_client_bytes
      • throttle-mon_daemon_bytes
      • throttle-msgr_dispatch_throttler-mon
    • for OSD nodes:
      • WBThrottle
      • filestore
      • leveldb
      • mutex-FileJournal::completions_lock
      • mutex-FileJournal::finisher_lock
      • mutex-FileJournal::write_lock
      • mutex-FileJournal::writeq_lock
      • mutex-JOS::ApplyManager::apply_lock
      • mutex-JOS::ApplyManager::com_lock
      • mutex-JOS::SubmitManager::lock
      • mutex-WBThrottle::lock
      • objecter
      • osd
      • recoverystate_perf
      • throttle-filestore_bytes
      • throttle-filestore_ops
      • throttle-msgr_dispatch_throttler-client
      • throttle-msgr_dispatch_throttler-cluster
      • throttle-msgr_dispatch_throttler-hb_back_server
      • throttle-msgr_dispatch_throttler-hb_front_server
      • throttle-msgr_dispatch_throttler-hbclient
      • throttle-msgr_dispatch_throttler-ms_objecter
      • throttle-objecter_bytes
      • throttle-objecter_ops
      • throttle-osd_client_bytes
      • throttle-osd_client_messages

Cluster Stats

  • ceph_pgmap_state has the following tags:
    • state (the placement group state to which the count applies, e.g. active+clean, active+remapped+backfill)
  • ceph_pool_usage has the following tags:
    • id
    • name
  • ceph_pool_stats has the following tags:
    • id
    • name

Example Output:

Admin Socket Stats

telegraf --config /etc/telegraf/telegraf.conf --config-directory /etc/telegraf/telegraf.d --input-filter ceph --test
* Plugin: ceph, Collection 1
> ceph,collection=paxos,id=node-2,role=openstack,type=mon accept_timeout=0,begin=14931264,begin_bytes.avgcount=14931264,begin_bytes.sum=180309683362,begin_keys.avgcount=0,begin_keys.sum=0,begin_latency.avgcount=14931264,begin_latency.sum=9293.29589,collect=1,collect_bytes.avgcount=1,collect_bytes.sum=24,collect_keys.avgcount=1,collect_keys.sum=1,collect_latency.avgcount=1,collect_latency.sum=0.00028,collect_timeout=0,collect_uncommitted=0,commit=14931264,commit_bytes.avgcount=0,commit_bytes.sum=0,commit_keys.avgcount=0,commit_keys.sum=0,commit_latency.avgcount=0,commit_latency.sum=0,lease_ack_timeout=0,lease_timeout=0,new_pn=0,new_pn_latency.avgcount=0,new_pn_latency.sum=0,refresh=14931264,refresh_latency.avgcount=14931264,refresh_latency.sum=8706.98498,restart=4,share_state=0,share_state_bytes.avgcount=0,share_state_bytes.sum=0,share_state_keys.avgcount=0,share_state_keys.sum=0,start_leader=0,start_peon=1,store_state=14931264,store_state_bytes.avgcount=14931264,store_state_bytes.sum=353119959211,store_state_keys.avgcount=14931264,store_state_keys.sum=289807523,store_state_latency.avgcount=14931264,store_state_latency.sum=10952.835724 1462821234814535148
> ceph,collection=throttle-mon_client_bytes,id=node-2,type=mon get=1413017,get_or_fail_fail=0,get_or_fail_success=0,get_sum=71211705,max=104857600,put=1413013,put_sum=71211459,take=0,take_sum=0,val=246,wait.avgcount=0,wait.sum=0 1462821234814737219
> ceph,collection=throttle-mon_daemon_bytes,id=node-2,type=mon get=4058121,get_or_fail_fail=0,get_or_fail_success=0,get_sum=6027348117,max=419430400,put=4058121,put_sum=6027348117,take=0,take_sum=0,val=0,wait.avgcount=0,wait.sum=0 1462821234814815661
> ceph,collection=throttle-msgr_dispatch_throttler-mon,id=node-2,type=mon get=54276277,get_or_fail_fail=0,get_or_fail_success=0,get_sum=370232877040,max=104857600,put=54276277,put_sum=370232877040,take=0,take_sum=0,val=0,wait.avgcount=0,wait.sum=0 1462821234814872064

Cluster Stats

> ceph_osdmap,host=ceph-mon-0 epoch=170772,full=false,nearfull=false,num_in_osds=340,num_osds=340,num_remapped_pgs=0,num_up_osds=340 1468841037000000000
> ceph_pgmap,host=ceph-mon-0 bytes_avail=634895531270144,bytes_total=812117151809536,bytes_used=177221620539392,data_bytes=56979991615058,num_pgs=22952,op_per_sec=15869,read_bytes_sec=43956026,version=39387592,write_bytes_sec=165344818 1468841037000000000
> ceph_pgmap_state,host=ceph-mon-0,state=active+clean count=22952 1468928660000000000
> ceph_pgmap_state,host=ceph-mon-0,state=active+degraded count=16 1468928660000000000
> ceph_usage,host=ceph-mon-0 total_avail_bytes=634895514791936,total_bytes=812117151809536,total_used_bytes=177221637017600 1468841037000000000
> ceph_pool_usage,host=ceph-mon-0,id=150,name=cinder.volumes bytes_used=12648553794802,kb_used=12352103316,max_avail=154342562489244,objects=3026295 1468841037000000000
> ceph_pool_usage,host=ceph-mon-0,id=182,name=cinder.volumes.flash bytes_used=8541308223964,kb_used=8341121313,max_avail=39388593563936,objects=2075066 1468841037000000000
> ceph_pool_stats,host=ceph-mon-0,id=150,name=cinder.volumes op_per_sec=1706,read_bytes_sec=28671674,write_bytes_sec=29994541 1468841037000000000
> ceph_pool_stats,host=ceph-mon-0,id=182,name=cinder.volumes.flash op_per_sec=9748,read_bytes_sec=9605524,write_bytes_sec=45593310 1468841037000000000

Documentation

Types

type Ceph

type Ceph struct {
	CephBinary             string
	OsdPrefix              string
	MonPrefix              string
	SocketDir              string
	SocketSuffix           string
	CephUser               string
	CephConfig             string
	GatherAdminSocketStats bool
	GatherClusterStats     bool
}
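
For illustration, the struct can be populated with the defaults from the sample configuration. This is a sketch only: newDefaultCeph is a hypothetical constructor, the struct is copied here so the example compiles on its own, and the mapping from configuration keys to fields (for example ceph_binary to CephBinary) is assumed from the names.

```go
package main

import "fmt"

// Ceph copies the exported fields of the plugin's Ceph type so that this
// example is self-contained.
type Ceph struct {
	CephBinary             string
	OsdPrefix              string
	MonPrefix              string
	SocketDir              string
	SocketSuffix           string
	CephUser               string
	CephConfig             string
	GatherAdminSocketStats bool
	GatherClusterStats     bool
}

// newDefaultCeph (hypothetical) returns a Ceph value holding the defaults
// shown in the sample configuration.
func newDefaultCeph() *Ceph {
	return &Ceph{
		CephBinary:             "/usr/bin/ceph",
		OsdPrefix:              "ceph-osd",
		MonPrefix:              "ceph-mon",
		SocketDir:              "/var/run/ceph",
		SocketSuffix:           "asok",
		CephUser:               "client.admin",
		CephConfig:             "/etc/ceph/ceph.conf",
		GatherAdminSocketStats: true,
		GatherClusterStats:     false,
	}
}

func main() {
	c := newDefaultCeph()
	// Opt in to the cluster-wide commands in addition to the admin sockets.
	c.GatherClusterStats = true
	fmt.Printf("%+v\n", *c)
}
```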

func (*Ceph) Description

func (c *Ceph) Description() string

func (*Ceph) Gather

func (c *Ceph) Gather(acc telegraf.Accumulator) error

func (*Ceph) SampleConfig

func (c *Ceph) SampleConfig() string

type CephDf

type CephDf struct {
	Stats struct {
		TotalSpace float64 `json:"total_space"`
		TotalUsed  float64 `json:"total_used"`
		TotalAvail float64 `json:"total_avail"`
	} `json:"stats"`
	Pools []struct {
		Name  string `json:"name"`
		Stats struct {
			KBUsed    float64 `json:"kb_used"`
			BytesUsed float64 `json:"bytes_used"`
			Objects   float64 `json:"objects"`
		} `json:"stats"`
	} `json:"pools"`
}

CephDf is used to unmarshal 'ceph df' output
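
A minimal sketch of decoding into this type follows. The JSON document is made up for the example and only uses the keys named in the struct tags; real 'ceph df' output contains more fields and may differ between Ceph releases. decodeDf is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CephDf is copied from the declaration above so the example stands alone.
type CephDf struct {
	Stats struct {
		TotalSpace float64 `json:"total_space"`
		TotalUsed  float64 `json:"total_used"`
		TotalAvail float64 `json:"total_avail"`
	} `json:"stats"`
	Pools []struct {
		Name  string `json:"name"`
		Stats struct {
			KBUsed    float64 `json:"kb_used"`
			BytesUsed float64 `json:"bytes_used"`
			Objects   float64 `json:"objects"`
		} `json:"stats"`
	} `json:"pools"`
}

// sampleDf is invented input, not captured from a real cluster.
const sampleDf = `{
  "stats": {"total_space": 1000, "total_used": 250, "total_avail": 750},
  "pools": [
    {"name": "rbd", "stats": {"kb_used": 4, "bytes_used": 4096, "objects": 2}}
  ]
}`

// decodeDf (hypothetical) unmarshals raw 'ceph df' JSON into a CephDf.
func decodeDf(raw []byte) (*CephDf, error) {
	var df CephDf
	if err := json.Unmarshal(raw, &df); err != nil {
		return nil, fmt.Errorf("parsing ceph df output: %v", err)
	}
	return &df, nil
}

func main() {
	df, err := decodeDf([]byte(sampleDf))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("total_avail:", df.Stats.TotalAvail)
	// One record per pool, identified by the pool name.
	for _, pool := range df.Pools {
		fmt.Printf("pool=%s bytes_used=%v objects=%v\n",
			pool.Name, pool.Stats.BytesUsed, pool.Stats.Objects)
	}
}
```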

type CephOSDPoolStats

type CephOSDPoolStats []struct {
	PoolName     string `json:"pool_name"`
	ClientIORate struct {
		ReadBytesSec  float64 `json:"read_bytes_sec"`
		WriteBytesSec float64 `json:"write_bytes_sec"`
		OpPerSec      float64 `json:"op_per_sec"` // This field is no longer reported in ceph 10 and later
		ReadOpPerSec  float64 `json:"read_op_per_sec"`
		WriteOpPerSec float64 `json:"write_op_per_sec"`
	} `json:"client_io_rate"`
	RecoveryRate struct {
		RecoveringObjectsPerSec float64 `json:"recovering_objects_per_sec"`
		RecoveringBytesPerSec   float64 `json:"recovering_bytes_per_sec"`
		RecoveringKeysPerSec    float64 `json:"recovering_keys_per_sec"`
	} `json:"recovery_rate"`
}

CephOSDPoolStats is used to unmarshal 'ceph osd pool stats' output
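
The sketch below decodes an invented 'ceph osd pool stats' document into this type. It illustrates one consequence of the struct layout: keys absent from the input, such as op_per_sec on ceph 10 and later, leave the corresponding field at its zero value. The sample JSON and the helper decodePoolStats are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CephOSDPoolStats is copied from the declaration above.
type CephOSDPoolStats []struct {
	PoolName     string `json:"pool_name"`
	ClientIORate struct {
		ReadBytesSec  float64 `json:"read_bytes_sec"`
		WriteBytesSec float64 `json:"write_bytes_sec"`
		OpPerSec      float64 `json:"op_per_sec"` // not reported in ceph 10 and later
		ReadOpPerSec  float64 `json:"read_op_per_sec"`
		WriteOpPerSec float64 `json:"write_op_per_sec"`
	} `json:"client_io_rate"`
	RecoveryRate struct {
		RecoveringObjectsPerSec float64 `json:"recovering_objects_per_sec"`
		RecoveringBytesPerSec   float64 `json:"recovering_bytes_per_sec"`
		RecoveringKeysPerSec    float64 `json:"recovering_keys_per_sec"`
	} `json:"recovery_rate"`
}

// samplePoolStats is invented input in the newer style: it carries
// read_op_per_sec and write_op_per_sec but no op_per_sec.
const samplePoolStats = `[
  {
    "pool_name": "rbd",
    "client_io_rate": {
      "read_bytes_sec": 1024, "write_bytes_sec": 2048,
      "read_op_per_sec": 3, "write_op_per_sec": 5
    },
    "recovery_rate": {}
  }
]`

// decodePoolStats (hypothetical) unmarshals the raw JSON array.
func decodePoolStats(raw []byte) (CephOSDPoolStats, error) {
	var stats CephOSDPoolStats
	if err := json.Unmarshal(raw, &stats); err != nil {
		return nil, fmt.Errorf("parsing ceph osd pool stats output: %v", err)
	}
	return stats, nil
}

func main() {
	stats, err := decodePoolStats([]byte(samplePoolStats))
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, pool := range stats {
		io := pool.ClientIORate
		fmt.Printf("pool=%s op_per_sec=%v read_op_per_sec=%v write_op_per_sec=%v\n",
			pool.PoolName, io.OpPerSec, io.ReadOpPerSec, io.WriteOpPerSec)
	}
}
```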

type CephStatus

type CephStatus struct {
	OSDMap struct {
		OSDMap struct {
			Epoch          float64 `json:"epoch"`
			NumOSDs        float64 `json:"num_osds"`
			NumUpOSDs      float64 `json:"num_up_osds"`
			NumInOSDs      float64 `json:"num_in_osds"`
			Full           bool    `json:"full"`
			NearFull       bool    `json:"nearfull"`
			NumRemappedPGs float64 `json:"num_remapped_pgs"`
		} `json:"osdmap"`
	} `json:"osdmap"`
	PGMap struct {
		PGsByState []struct {
			StateName string  `json:"state_name"`
			Count     float64 `json:"count"`
		} `json:"pgs_by_state"`
		Version       float64 `json:"version"`
		NumPGs        float64 `json:"num_pgs"`
		DataBytes     float64 `json:"data_bytes"`
		BytesUsed     float64 `json:"bytes_used"`
		BytesAvail    float64 `json:"bytes_avail"`
		BytesTotal    float64 `json:"bytes_total"`
		ReadBytesSec  float64 `json:"read_bytes_sec"`
		WriteBytesSec float64 `json:"write_bytes_sec"`
		OpPerSec      float64 `json:"op_per_sec"` // This field is no longer reported in ceph 10 and later
		ReadOpPerSec  float64 `json:"read_op_per_sec"`
		WriteOpPerSec float64 `json:"write_op_per_sec"`
	} `json:"pgmap"`
}

CephStatus is used to unmarshal "ceph -s" output
