absent-metrics-operator

v0.4.0
Published: Aug 20, 2020 License: Apache-2.0 Imports: 13 Imported by: 0

Absent Metrics Operator

Project status: alpha. The API and user-facing objects may change.

Overview

The absent metrics operator is a companion operator for the Prometheus Operator.

The operator monitors all the PrometheusRule resources deployed across a Kubernetes cluster and creates corresponding absent metric alert rules for the alert rules defined in those resources.

An absent metric alert rule alerts on the absence of a metric.

Motivation

Consider the following alert rule definition:

alert: ImportantAlert
expr: foo_bar > 0
for: 5m
labels:
  tier: network
  service: foo
  severity: critical
annotations:
  summary: Data center is on fire!

This alert would never trigger if the metric foo_bar does not exist in Prometheus.

This can be avoided by using the absent() function with the or operator so the alert rule expression becomes:

absent(foo_bar) or foo_bar > 0

This becomes tedious when hundreds of alerts are deployed across the cluster. It is also prone to human error, such as a typo or forgetting to include the absent() function in the alert expression.

The absent metrics operator solves this problem by automatically creating the corresponding absent metric alerts for your alert definitions.

The operator would generate the following absent metric alert for the above example:

alert: AbsentFooBar
expr: absent(foo_bar)
for: 10m
labels:
  severity: info
  tier: network
  service: foo
annotations:
  summary: missing foo_bar
  description: The metric 'foo_bar' is missing. 'ImportantAlert' alert using it may not fire as intended.

Usage

We provide pre-compiled binaries and container images. See the latest release.

Alternatively, you can build with make, install with make install, go get, or docker build.

For usage instructions:

$ absent-metrics-operator --help

The operator can be disabled for a specific PrometheusRule or a specific alert definition. Refer to the operator's playbook for more info.

Metrics

Metrics are exposed at port 9659. This port has been allocated for the operator.

Metric | Labels
absent_metrics_operator_successful_reconcile_time | prometheusrule_namespace, prometheusrule_name

Absent metric alert definition

The absent metric alerts are defined in a separate PrometheusRule resource that is managed by the operator. They are aggregated first by namespace and then by the Prometheus server.

For example, if a namespace has alert rules defined across several PrometheusRule resources for the Prometheus servers called OpenStack and Infra, the absent metric alerts for this namespace are aggregated in two new PrometheusRule resources called:

  • openstack-absent-metric-alert-rules
  • infra-absent-metric-alert-rules
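
The aggregation described above can be sketched in Go as follows. This is only an illustration, not the operator's actual code: the sourceRule and groupKey types and both function names are hypothetical, and lowercasing the Prometheus server name is an assumption inferred from the two example names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// groupKey identifies one aggregated PrometheusRule resource:
// one per (namespace, Prometheus server) pair.
type groupKey struct {
	Namespace  string
	Prometheus string
}

// sourceRule is a hypothetical, simplified view of one PrometheusRule
// resource: where it lives, which Prometheus server it targets, and
// which metrics its alert rules use.
type sourceRule struct {
	Namespace  string
	Prometheus string
	Metrics    []string
}

// absentRuleResourceName derives the name of the aggregated resource.
// Lowercasing the server name is an assumption based on the examples.
func absentRuleResourceName(prometheus string) string {
	return strings.ToLower(prometheus) + "-absent-metric-alert-rules"
}

// aggregate groups metrics first by namespace, then by Prometheus server.
func aggregate(rules []sourceRule) map[groupKey][]string {
	out := make(map[groupKey][]string)
	for _, r := range rules {
		k := groupKey{Namespace: r.Namespace, Prometheus: r.Prometheus}
		out[k] = append(out[k], r.Metrics...)
	}
	for _, metrics := range out {
		sort.Strings(metrics) // stable ordering within each group
	}
	return out
}

func main() {
	groups := aggregate([]sourceRule{
		{Namespace: "monitoring", Prometheus: "OpenStack", Metrics: []string{"foo_bar"}},
		{Namespace: "monitoring", Prometheus: "Infra", Metrics: []string{"node_up"}},
		{Namespace: "monitoring", Prometheus: "OpenStack", Metrics: []string{"bar_baz"}},
	})
	for k, metrics := range groups {
		fmt.Println(k.Namespace, absentRuleResourceName(k.Prometheus), metrics)
	}
}
```

Three source resources in one namespace collapse into two groups here, one per Prometheus server, which matches the two resource names listed above.
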
Template

The absent metric alert rule has the following template:

alert: $name
expr: absent($metric)
for: 10m
labels:
  severity: info
  tier: $tier
  service: $service
annotations:
  summary: missing $metric
  description: The metric '$metric' is missing. '$alert-name' alert using it may not fire as intended.

Consider the metric limes_successful_scrapes:rate5m with tier os and service limes. The corresponding absent metric alert name would be AbsentOsLimesSuccessfulScrapesRate5m.

The values of tier and service labels are only included in the name if the labels are specified in the keep-labels flag. See below.
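
The metric-name part of the alert name can be sketched in Go as below. This is an assumption-laden illustration: absentAlertName is a hypothetical helper, it covers only the conversion of the metric name to CamelCase with an "Absent" prefix, and it does not model how the tier and service values are folded into the name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nonAlnum matches runs of characters that are not ASCII letters or
// digits, such as the '_' and ':' separators in metric names.
var nonAlnum = regexp.MustCompile(`[^a-zA-Z0-9]+`)

// absentAlertName converts a metric name to CamelCase and prefixes it
// with "Absent". Tier and service prefixes are deliberately left out.
func absentAlertName(metric string) string {
	var b strings.Builder
	b.WriteString("Absent")
	for _, w := range nonAlnum.Split(metric, -1) {
		if w == "" {
			continue // skip empty pieces from leading or trailing separators
		}
		// Words contain only ASCII letters and digits, so slicing by byte is safe.
		b.WriteString(strings.ToUpper(w[:1]) + w[1:])
	}
	return b.String()
}

func main() {
	fmt.Println(absentAlertName("foo_bar"))                         // AbsentFooBar
	fmt.Println(absentAlertName("limes_successful_scrapes:rate5m")) // AbsentLimesSuccessfulScrapesRate5m
}
```
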

The description also includes a link to documentation that can be referenced on how to deal with absent metric alerts.

Labels
Defaults

The following labels are always present on every absent metric alert rule:

  • severity is always info.

Carry over from original alert rule

You can specify which labels to carry over from the original alert rule by specifying a comma-separated list of labels to the --keep-labels flag. The default value for this flag is service,tier.
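
A minimal Go sketch of this carry-over logic follows. It uses the standard library flag package for illustration only; the operator's real flag handling may differ. parseKeepLabels and absentRuleLabels are hypothetical helpers, ignoring a severity entry in the list is an assumption based on severity always being info, and the co-dependence of tier and service is not modeled.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// parseKeepLabels splits a comma-separated flag value into label names,
// trimming whitespace and skipping empty entries.
func parseKeepLabels(v string) []string {
	var out []string
	for _, l := range strings.Split(v, ",") {
		if l = strings.TrimSpace(l); l != "" {
			out = append(out, l)
		}
	}
	return out
}

// absentRuleLabels builds the labels for an absent metric alert rule:
// severity is always "info", and each kept label is copied from the
// original alert rule if it is set there.
func absentRuleLabels(orig map[string]string, keep []string) map[string]string {
	labels := map[string]string{"severity": "info"}
	for _, k := range keep {
		if k == "severity" {
			continue // assumption: the default severity is never overridden
		}
		if v, ok := orig[k]; ok {
			labels[k] = v
		}
	}
	return labels
}

func main() {
	fs := flag.NewFlagSet("absent-metrics-operator", flag.ExitOnError)
	keep := fs.String("keep-labels", "service,tier", "comma-separated labels to carry over")
	_ = fs.Parse(nil) // no arguments given, so the default value applies

	orig := map[string]string{"tier": "network", "service": "foo", "severity": "critical"}
	fmt.Println(absentRuleLabels(orig, parseKeepLabels(*keep)))
}
```

With the default flag value and the labels from the ImportantAlert example, the result carries tier and service over and sets severity to info.
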

The tier and service labels are a special case: they are co-dependent. See the playbook for details.
