k8sprocessor

v0.6.0
Published: Jul 15, 2020 License: Apache-2.0 Imports: 16 Imported by: 2

Documentation

Overview

Package k8sprocessor allows automatic tagging of spans and metrics with k8s metadata.

The processor automatically discovers k8s resources (pods), extracts metadata from them and adds the extracted metadata to the relevant spans and metrics. The processor uses the Kubernetes API to discover all pods running in a cluster and keeps a record of their IP addresses and interesting metadata. Upon receiving spans, the processor tries to identify the source IP address of the service that sent the spans and matches it against the in-memory data. To find the k8s pod producing metrics, the processor looks at the "host.hostname" resource attribute, which is set by the prometheus receiver and some metrics instrumentation libraries. If a match is found, the cached metadata is added to the spans and metrics as resource attributes.
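The lookup described above can be pictured as a map from pod IP to cached attributes. The following is only a minimal sketch of that idea and not the package's implementation: the podCache and podMetadata types, their methods and the attribute key used in the usage note are hypothetical names chosen for illustration.

```go
package main

import "sync"

// podMetadata is a hypothetical record of the attributes cached for one pod.
type podMetadata map[string]string

// podCache is a hypothetical in-memory index from pod IP to cached metadata.
type podCache struct {
	mu   sync.RWMutex
	byIP map[string]podMetadata
}

func newPodCache() *podCache {
	return &podCache{byIP: make(map[string]podMetadata)}
}

// upsert records (or replaces) the metadata of the pod with the given IP,
// as would happen when a pod is discovered or updated.
func (c *podCache) upsert(ip string, md podMetadata) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.byIP[ip] = md
}

// enrich copies the cached metadata for sourceIP into the resource
// attributes and reports whether a matching pod was found.
func (c *podCache) enrich(sourceIP string, resource map[string]string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	md, ok := c.byIP[sourceIP]
	if !ok {
		return false
	}
	for k, v := range md {
		resource[k] = v
	}
	return true
}
```

With a pod registered under 10.0.0.7, calling enrich("10.0.0.7", attrs) fills attrs with the cached values, while an unknown IP leaves attrs untouched and returns false.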

RBAC

TODO: mention the required RBAC rules.

Config

TODO: example config.

Deployment scenarios

The processor supports running both in agent and collector mode.

As an agent

When running as an agent, the processor detects IP addresses of pods sending spans or metrics to the agent and uses this information to extract metadata from pods. When running as an agent, it is important to apply a discovery filter so that the processor only discovers pods from the same host that it is running on. Not using such a filter can result in unnecessary resource usage, especially on very large clusters. Once the filter is applied, each processor will only query the k8s API for pods running on its own node.

A node filter can be applied by setting the `filter.node` config option to the name of a k8s node. While this works as expected, in most cases it cannot be used to automatically filter pods by the node that the processor is running on, because it is not known beforehand which node a pod will be scheduled on. Luckily, Kubernetes has a solution for this called the downward API. To automatically filter pods by the node the processor is running on, you'll need to complete the following steps:

1. Use the downward API to inject the node name as an environment variable. Add the following snippet under the pod env section of the OpenTelemetry container.

  env:
  - name: KUBE_NODE_NAME
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: spec.nodeName

This injects a new environment variable into the OpenTelemetry container whose value is the name of the node the pod was scheduled on.

2. Set "filter.node_from_env_var" to the name of the environment variable holding the node name.

k8s_tagger:
  filter:
    node_from_env_var: KUBE_NODE_NAME # this should be same as the var name used in previous step

This will restrict each OpenTelemetry agent to querying only the pods running on its own node, dramatically reducing resource requirements for very large clusters.
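To make the two node options more concrete, here is a rough sketch of how a node filter could be resolved from either a fixed name or an environment variable. The helper names are hypothetical, and two behaviours are assumptions made for this sketch rather than documented facts: the environment variable takes precedence when configured, and an unset or empty variable is reported as an error.

```go
package main

import (
	"fmt"
	"os"
)

// resolveNodeFilter is a hypothetical helper. It returns the node name to
// filter on. When nodeFromEnvVar is set it is looked up through lookup and
// wins over node (an assumption of this sketch); otherwise node is returned.
func resolveNodeFilter(node, nodeFromEnvVar string, lookup func(string) (string, bool)) (string, error) {
	if nodeFromEnvVar == "" {
		return node, nil
	}
	name, ok := lookup(nodeFromEnvVar)
	if !ok || name == "" {
		return "", fmt.Errorf("environment variable %q is not set or empty", nodeFromEnvVar)
	}
	return name, nil
}

// nodeFilterFromEnv wires resolveNodeFilter to the real process environment.
func nodeFilterFromEnv(node, nodeFromEnvVar string) (string, error) {
	return resolveNodeFilter(node, nodeFromEnvVar, os.LookupEnv)
}
```

Passing the lookup function as a parameter keeps the logic testable without touching the process environment. With the downward API snippet from step 1 in place, nodeFilterFromEnv("", "KUBE_NODE_NAME") would yield the name of the node the agent pod runs on.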

As a collector

The processor can be deployed either as an agent or as a collector.

When running as a collector, the processor cannot correctly detect the IP address of the pods generating the spans when it receives the spans from an agent instead of receiving them directly from the pods. To work around this issue, agents deployed with the k8s_tagger processor can be configured to detect the IP addresses and forward them along with the span resources. The collector can then match this IP address with k8s pods and enrich the spans with the metadata. To set this up, you'll need to complete the following steps:

1. Set up agents in passthrough mode. Configure the agents' k8s_tagger processors to run in passthrough mode.

# k8s_tagger config for agent
k8s_tagger:
  passthrough: true

This will ensure that the agents detect the IP address and add it as an attribute to all span resources. Agents will not make any k8s API calls, do any discovery of pods or extract any metadata.

2. Configure the collector as usual. No special configuration changes are needed on the collector. It will automatically detect the IP address of spans sent by the agents as well as directly by other services/pods.

This approach is also relevant for metrics data, since it is not guaranteed that all the metric formats used to send data from agent to collector preserve the "host.hostname" attribute. In passthrough mode, the processor therefore relies on an additional attribute that carries the k8s pod IP.
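The division of labour between a passthrough agent and a collector can be sketched as follows. This is an illustration under stated assumptions, not the processor's code: the attribute key "k8s.pod.ip" and both function names are assumptions introduced for this example.

```go
package main

// podIPAttr is an assumed resource attribute key used to forward the pod IP.
const podIPAttr = "k8s.pod.ip"

// tagPassthrough mimics an agent in passthrough mode: it records the detected
// source IP on the resource and performs no lookup. A value that is already
// present is kept.
func tagPassthrough(resource map[string]string, connIP string) {
	if _, ok := resource[podIPAttr]; ok {
		return
	}
	if connIP != "" {
		resource[podIPAttr] = connIP
	}
}

// podIPFor mimics the collector side: it prefers the IP forwarded by an agent
// and falls back to the IP of the connection the data arrived on.
func podIPFor(resource map[string]string, connIP string) string {
	if ip := resource[podIPAttr]; ip != "" {
		return ip
	}
	return connIP
}
```

Data relayed through an agent keeps the original pod IP even though the collector sees the agent's address on the connection, while data sent directly by a pod falls back to the connection address.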

Caveats

There are some edge-cases and scenarios where k8s_tagger will not work properly.

Host networking mode

The processor cannot correctly identify pods running in host network mode, and enriching spans generated by such pods is not supported at the moment.

As a sidecar

The processor does not support detecting containers from the same pods when running as a sidecar. While this can be done, we think it is simpler to just use the kubernetes downward API to inject environment variables into the pods and directly use their values as tags.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func NewMetricsProcessor added in v0.5.0

func NewMetricsProcessor(
	logger *zap.Logger,
	nextMetricsConsumer consumer.MetricsConsumer,
	kubeClient kube.ClientProvider,
	options ...Option,
) (component.MetricsProcessor, error)

NewMetricsProcessor returns a component.MetricsProcessor that adds the k8s attributes to metrics passed to it.

func NewTraceProcessor

func NewTraceProcessor(
	logger *zap.Logger,
	nextTraceConsumer consumer.TraceConsumer,
	kubeClient kube.ClientProvider,
	options ...Option,
) (component.TraceProcessor, error)

NewTraceProcessor returns a component.TraceProcessor that adds the k8s attributes to all spans passed to it.

Types

type Config

type Config struct {
	configmodels.ProcessorSettings `mapstructure:",squash"`

	k8sconfig.APIConfig `mapstructure:",squash"`

	// Passthrough mode only annotates resources with the pod IP and
	// does not try to extract any other metadata. It does not need
	// access to the K8S cluster API. Agent/Collector must receive spans
	// directly from services to be able to correctly detect the pod IPs.
	Passthrough bool `mapstructure:"passthrough"`

	// Extract section allows specifying extraction rules to extract
	// data from k8s pod specs
	Extract ExtractConfig `mapstructure:"extract"`

	// Filter section allows specifying filters to filter
	// pods by labels, fields, namespaces, nodes, etc.
	Filter FilterConfig `mapstructure:"filter"`
}

Config defines configuration for k8s attributes processor.

type ExtractConfig

type ExtractConfig struct {
	// Metadata allows to extract pod metadata from a list of metadata fields.
	// The field accepts a list of strings.
	//
	// Metadata fields supported right now are,
	//   namespace, podName, podUID, deployment, cluster, node and startTime
	//
	// Specifying anything other than these values will result in an error.
	// By default all of the fields are extracted and added to spans and metrics.
	Metadata []string `mapstructure:"metadata"`

	// Annotations allows extracting data from pod annotations and recording it
	// as resource attributes.
	// It is a list of FieldExtractConfig type. See FieldExtractConfig
	// documentation for more details.
	Annotations []FieldExtractConfig `mapstructure:"annotations"`

	// Labels allows extracting data from pod labels and recording it
	// as resource attributes.
	// It is a list of FieldExtractConfig type. See FieldExtractConfig
	// documentation for more details.
	Labels []FieldExtractConfig `mapstructure:"labels"`
}

ExtractConfig section allows specifying extraction rules to extract data from k8s pod specs.
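The rules stated in the Metadata comment (a fixed set of supported fields, an error for anything else, everything extracted by default) can be sketched as below. The function and variable names are hypothetical and the code is an illustration of those rules, not the package's own validation.

```go
package main

import "fmt"

// supportedMetadata lists the field names given in the ExtractConfig comment.
var supportedMetadata = []string{
	"namespace", "podName", "podUID", "deployment", "cluster", "node", "startTime",
}

// selectMetadata is a hypothetical helper. It returns every supported field
// when none are requested and rejects any unsupported field name.
func selectMetadata(fields []string) ([]string, error) {
	if len(fields) == 0 {
		// Copy so that callers cannot modify the package-level list.
		return append([]string(nil), supportedMetadata...), nil
	}
	allowed := make(map[string]bool, len(supportedMetadata))
	for _, f := range supportedMetadata {
		allowed[f] = true
	}
	for _, f := range fields {
		if !allowed[f] {
			return nil, fmt.Errorf("%q is not a supported metadata field", f)
		}
	}
	return fields, nil
}
```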

type Factory

type Factory struct {
	// Factory dependencies that can be provided from outside.
	KubeClient kube.ClientProvider
}

Factory is the factory for the k8s processor.

func (*Factory) CreateDefaultConfig

func (f *Factory) CreateDefaultConfig() configmodels.Processor

CreateDefaultConfig creates the default configuration for processor.

func (*Factory) CreateMetricsProcessor

func (f *Factory) CreateMetricsProcessor(
	ctx context.Context,
	params component.ProcessorCreateParams,
	nextMetricsConsumer consumer.MetricsConsumer,
	cfg configmodels.Processor,
) (component.MetricsProcessor, error)

CreateMetricsProcessor creates a metrics processor based on this config.

func (*Factory) CreateTraceProcessor

func (f *Factory) CreateTraceProcessor(
	ctx context.Context,
	params component.ProcessorCreateParams,
	nextTraceConsumer consumer.TraceConsumer,
	cfg configmodels.Processor,
) (component.TraceProcessor, error)

CreateTraceProcessor creates a trace processor based on this config.

func (*Factory) Type

func (f *Factory) Type() configmodels.Type

Type gets the type of the config created by this factory.

type FieldExtractConfig

type FieldExtractConfig struct {
	TagName string `mapstructure:"tag_name"`
	Key     string `mapstructure:"key"`
	Regex   string `mapstructure:"regex"`
}

FieldExtractConfig allows specifying an extraction rule to extract a value from exactly one field.

The field accepts a list of FieldExtractConfig maps. Each map accepts three keys:

    tag_name, key and regex

  • tag_name represents the name of the tag that will be added to the span. When not specified, a default tag name of the format k8s.<annotation>.<annotation key> is used. For example, if tag_name is not specified and the key is git_sha, then the tag name will be `k8s.annotation.deployment.git_sha`.

  • key represents the annotation name. This must exactly match an annotation name.

  • regex is an optional field used to extract a sub-string from a complex field value. The supplied regular expression must contain one named capture group with the string "value" as the name. For example, if your pod spec contains the following annotation,

    kubernetes.io/change-cause: 2019-08-28T18:34:33Z APP_NAME=my-app GIT_SHA=58a1e39 CI_BUILD=4120

    and you'd like to extract the GIT_SHA and the CI_BUILD values as tags, then you must specify the following two extraction rules:

    processors:
      k8s_tagger:
        extract:
          annotations:
            - tag_name: git.sha
              key: kubernetes.io/change-cause
              regex: GIT_SHA=(?P<value>\w+)
            - tag_name: ci.build
              key: kubernetes.io/change-cause
              regex: CI_BUILD=(?P<value>[\w]+)

    This will add the `git.sha` and `ci.build` tags to the spans or metrics.
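The role of the capture group named "value" can be illustrated with Go's standard regexp package. The extractValue helper below is a hypothetical name and only a sketch of the technique; the processor's actual extraction code may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// extractValue is a hypothetical helper. It applies pattern to fieldValue and
// returns the text captured by the group named "value". It returns an empty
// string when the pattern does not match, and an error when the pattern is
// invalid or lacks the named group.
func extractValue(pattern, fieldValue string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", err
	}
	idx := re.SubexpIndex("value")
	if idx < 0 {
		return "", fmt.Errorf(`regex %q has no capture group named "value"`, pattern)
	}
	m := re.FindStringSubmatch(fieldValue)
	if m == nil {
		return "", nil
	}
	return m[idx], nil
}
```

Applied to the change-cause annotation shown above, a pattern capturing `\w+` after `GIT_SHA=` yields 58a1e39, and the same pattern written for `CI_BUILD=` yields 4120.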

type FieldFilterConfig

type FieldFilterConfig struct {
	// Key represents the key or name of the field or labels that a filter
	// can apply on.
	Key string `mapstructure:"key"`

	// Value represents the value associated with the key that a filter
	// operation specified by the `Op` field applies on.
	Value string `mapstructure:"value"`

	// Op represents the filter operation to apply on the given
	// Key: Value pair. The following operations are supported
	//   equals, not-equals, exists, does-not-exist.
	Op string `mapstructure:"op"`
}

FieldFilterConfig allows specifying exactly one filter by a field. It can be used to represent a label or generic field filter.
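As a rough illustration of the four operations, the sketch below evaluates one filter against a pod's label map. The fieldFilter type and the matches function are hypothetical stand-ins, and the evaluation happens locally here purely for demonstration. The treatment of a missing key under not-equals follows Kubernetes label selector semantics, where `!=` also selects objects that lack the key.

```go
package main

import "fmt"

// fieldFilter is a hypothetical stand-in mirroring FieldFilterConfig.
type fieldFilter struct {
	Key, Value, Op string
}

// matches reports whether labels satisfy the filter f.
func matches(labels map[string]string, f fieldFilter) (bool, error) {
	v, present := labels[f.Key]
	switch f.Op {
	case "equals":
		return present && v == f.Value, nil
	case "not-equals":
		// A missing key also counts as "not equal".
		return !present || v != f.Value, nil
	case "exists":
		return present, nil
	case "does-not-exist":
		return !present, nil
	default:
		return false, fmt.Errorf("unsupported filter operation %q", f.Op)
	}
}
```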

type FilterConfig

type FilterConfig struct {
	// Node represents a k8s node or host. If specified, any pods not running
	// on the specified node will be ignored by the tagger.
	Node string `mapstructure:"node"`

	// NodeFromEnvVar can be used to extract the node name from an environment
	// variable. The value must be the name of the environment variable.
	// This is useful when the node an OTel agent will run on cannot be
	// predicted. In such cases, the Kubernetes downward API can be used to
	// add the node name to each pod as an environment variable. K8s tagger
	// can then read this value and filter pods by it.
	//
	// For example, node name can be passed to each agent with the downward API as follows
	//
	// env:
	//   - name: K8S_NODE_NAME
	//     valueFrom:
	//       fieldRef:
	//         fieldPath: spec.nodeName
	//
	// Then the NodeFromEnvVar field can be set to `K8S_NODE_NAME` to filter all pods by the node that
	// the agent is running on.
	//
	// More on downward API here: https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/
	NodeFromEnvVar string `mapstructure:"node_from_env_var"`

	// Namespace filters all pods by the provided namespace. All other pods are ignored.
	Namespace string `mapstructure:"namespace"`

	// Fields allows to filter pods by generic k8s fields.
	// Only the following operations are supported:
	//    - equals
	//    - not-equals
	//
	// Check FieldFilterConfig for more details.
	Fields []FieldFilterConfig `mapstructure:"fields"`

	// Labels allows to filter pods by generic k8s pod labels.
	// Only the following operations are supported:
	//    - equals
	//    - not-equals
	//    - exists
	//    - does-not-exist
	//
	// Check FieldFilterConfig for more details.
	Labels []FieldFilterConfig `mapstructure:"labels"`
}

FilterConfig section allows specifying filters to filter pods by labels, fields, namespaces, nodes, etc.

type Option

type Option func(*kubernetesprocessor) error

Option represents a configuration option that can be passed to the k8s-tagger.
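Option follows the functional options pattern: each option is a function that mutates the processor and may fail. The sketch below shows the pattern in isolation. The processor struct, its fields and the lower-case constructors are hypothetical stand-ins for the unexported kubernetesprocessor type, and rejecting an empty namespace is an assumption added only to show how an error propagates.

```go
package main

import "errors"

// processor is a hypothetical stand-in for the unexported processor type.
type processor struct {
	passthrough bool
	namespace   string
}

// option mirrors the shape of Option: a function that configures the
// processor and may return an error.
type option func(*processor) error

func withPassthrough() option {
	return func(p *processor) error {
		p.passthrough = true
		return nil
	}
}

func withNamespace(ns string) option {
	return func(p *processor) error {
		if ns == "" {
			// Assumed validation, present only to demonstrate error handling.
			return errors.New("namespace must not be empty")
		}
		p.namespace = ns
		return nil
	}
}

// newProcessor applies the options in order and stops at the first error.
func newProcessor(opts ...option) (*processor, error) {
	p := &processor{}
	for _, opt := range opts {
		if err := opt(p); err != nil {
			return nil, err
		}
	}
	return p, nil
}
```

The appeal of this design is that new settings can be added as new option constructors without changing the constructor's signature, which matches the variadic options parameter of NewTraceProcessor and NewMetricsProcessor.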

func WithAPIConfig

func WithAPIConfig(cfg k8sconfig.APIConfig) Option

WithAPIConfig provides k8s API related configuration to the processor. It defaults the authentication method to in-cluster auth using service accounts.

func WithExtractAnnotations

func WithExtractAnnotations(annotations ...FieldExtractConfig) Option

WithExtractAnnotations allows specifying options to control extraction of pod annotations.

func WithExtractLabels

func WithExtractLabels(labels ...FieldExtractConfig) Option

WithExtractLabels allows specifying options to control extraction of pod labels.

func WithExtractMetadata

func WithExtractMetadata(fields ...string) Option

WithExtractMetadata allows specifying options to control extraction of pod metadata. If no fields are explicitly provided, all metadata is extracted by default.

func WithFilterFields

func WithFilterFields(filters ...FieldFilterConfig) Option

WithFilterFields allows specifying options to control filtering pods by pod fields.

func WithFilterLabels

func WithFilterLabels(filters ...FieldFilterConfig) Option

WithFilterLabels allows specifying options to control filtering pods by pod labels.

func WithFilterNamespace

func WithFilterNamespace(ns string) Option

WithFilterNamespace allows specifying options to control filtering pods by a namespace.

func WithFilterNode

func WithFilterNode(node, nodeFromEnvVar string) Option

WithFilterNode allows specifying options to control filtering pods by a node/host.

func WithPassthrough

func WithPassthrough() Option

WithPassthrough enables passthrough mode. In passthrough mode, the processor only detects and tags the pod IP and does not invoke any k8s APIs.
