cascadingfilterprocessor

v0.22.0-sumo Published: Mar 18, 2021 License: Apache-2.0 Imports: 21 Imported by: 0

Repository

github.com/open-telemetry/opentelemetry-collector-contrib

README ¶

Cascading Filter Processor

Supported pipeline types: traces

The Cascading Filter processor is a fork of tailsamplingprocessor which allows for defining smart cascading filtering rules with preset limits.

Processor configuration

The following configuration options should be set as needed:

policies (no default): Policies used to make a sampling decision
spans_per_second (default = 1500): Maximum total number of emitted spans per second
probabilistic_filtering_ratio (default = 0.2): Ratio of spans that are always probabilistically filtered (and hence might be used for metrics calculation). The ratio is specified as a portion of output spans (defined by spans_per_second) rather than input spans, so the default filtering ratio of 0.2 and the default maximum span rate of 1500 produce at most 300 probabilistically sampled spans per second.
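
The relationship between spans_per_second and probabilistic_filtering_ratio can be sketched as a small calculation. This is a minimal illustration rather than the processor's own code: the function name probabilisticBudget is hypothetical, and rounding to the nearest integer is an assumption made here to sidestep floating-point truncation.

```go
package main

import (
	"fmt"
	"math"
)

// probabilisticBudget returns how many output spans per second are reserved
// for probabilistic filtering. The name is hypothetical, and rounding to the
// nearest integer is an assumption of this sketch.
func probabilisticBudget(spansPerSecond int, ratio float64) int {
	return int(math.Round(float64(spansPerSecond) * ratio))
}

func main() {
	// Defaults from the option list: 1500 spans per second, ratio 0.2.
	fmt.Println(probabilisticBudget(1500, 0.2))
	// Illustrative non-default values: 1000 spans per second, ratio 0.1.
	fmt.Println(probabilisticBudget(1000, 0.1))
}
```

With the defaults the function returns 300, matching the figure quoted above.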

The following configuration options can also be modified:

decision_wait (default = 30s): Wait time since the first span of a trace before making a filtering decision
num_traces (default = 50000): Number of traces kept in memory
expected_new_traces_per_sec (default = 0): Expected number of new traces (helps in allocating data structures)

Updated span attributes

The processor modifies the attributes of each span by setting the following two attributes:

sampling.rule: describes whether the probabilistic or the filtered policy was applied
sampling.probability: describes the effective sampling rate when the probabilistic rule was applied. E.g. if 5000 spans were evaluated in a given second, with a maximum of 1500 total spans per second and a 0.2 filtering ratio, at most 300 spans would be selected by that rule. This would result in sampling.probability=0.06 (300/5000=0.06). If the value was already set by head-based (or other) sampling, it is multiplied by the calculated value.
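
The sampling.probability arithmetic can be checked with a short sketch. It assumes a hypothetical helper effectiveProbability, treats a prior value of 0 as "not set", and caps the rate at 1 when fewer spans were evaluated than the budget allows. None of these details are confirmed by the processor's source.

```go
package main

import "fmt"

// effectiveProbability returns the value that would be written to
// sampling.probability for a span kept by the probabilistic rule.
// budget is the per-second span budget of the rule, evaluated is the number
// of spans seen in that second, and prior is a probability set earlier
// (for example by head-based sampling). A prior of 0 means "not set";
// that convention is an assumption of this sketch.
func effectiveProbability(budget, evaluated int, prior float64) float64 {
	p := 1.0
	if evaluated > budget {
		p = float64(budget) / float64(evaluated)
	}
	if prior > 0 {
		p *= prior
	}
	return p
}

func main() {
	// 300 of 5000 evaluated spans fit the budget.
	fmt.Printf("%.2f\n", effectiveProbability(300, 5000, 0))
	// The same second, for a span already head-sampled at 50%.
	fmt.Printf("%.2f\n", effectiveProbability(300, 5000, 0.5))
}
```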

Policy configuration

Each defined policy is evaluated in the order specified in the config. Each policy has the following properties:

name (required): identifies the policy
spans_per_second (default = 0): defines the maximum number of spans per second that can be handled by this policy. When set to -1, the policy selects traces only if the global limit is not exceeded by other policies (but without any further limitation)

Additionally, each policy may define any of the following filtering criteria. They are evaluated for each span of the trace. If at least one span matching all defined criteria is found, the trace is selected:

numeric_attribute: {key: <name>, min_value: <min_value>, max_value: <max_value>}: selects the span by matching a numeric attribute (either at resource or span level)
string_attribute: {key: <name>, values: [<value1>, <value2>]}: selects the span by matching a string attribute that is one of the provided values (either at resource or span level)
properties: { min_number_of_spans: <number>}: selects the trace if it has at least the provided number of spans
properties: { min_duration: <duration>}: selects the span if its duration is greater than or equal to the given value (use s or ms as the suffix to indicate the unit)
properties: { name_pattern: <regex>}: selects the span if its operation name matches the provided regular expression

To invert the decision (which is still subject to rate limiting), an additional property can be configured:

invert_match: <invert> (default=false): when set to true, the opposite decision is selected for the trace. E.g. if a trace matches a given string attribute and invert_match=true, then the trace is not selected
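
The matching rule (a trace is selected when at least one span satisfies all defined criteria, optionally inverted) can be sketched as follows. The Span and Criteria types are hypothetical simplifications and cover only the string attribute, minimum duration and name pattern criteria. The regular expression is matched unanchored here, which is an assumption about how patterns are applied.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// Span is a simplified, hypothetical stand-in for a trace span.
type Span struct {
	Name     string
	Duration time.Duration
	Attrs    map[string]string
}

// Criteria is a hypothetical subset of the policy filters.
// Zero values mean that the criterion is not set.
type Criteria struct {
	AttrKey     string
	AttrValues  []string
	MinDuration time.Duration
	NamePattern *regexp.Regexp
	InvertMatch bool
}

// fooPattern is an example pattern; matching is unanchored in this sketch.
var fooPattern = regexp.MustCompile("foo.*")

// spanMatches reports whether a single span satisfies every criterion set.
func (c Criteria) spanMatches(s Span) bool {
	if c.AttrKey != "" {
		found := false
		v, ok := s.Attrs[c.AttrKey]
		for _, want := range c.AttrValues {
			if ok && v == want {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	if c.MinDuration > 0 && s.Duration < c.MinDuration {
		return false
	}
	if c.NamePattern != nil && !c.NamePattern.MatchString(s.Name) {
		return false
	}
	return true
}

// SelectTrace reports the policy decision for a trace: true when at least
// one span matches all criteria, with the result flipped by InvertMatch.
func (c Criteria) SelectTrace(spans []Span) bool {
	matched := false
	for _, s := range spans {
		if c.spanMatches(s) {
			matched = true
			break
		}
	}
	return matched != c.InvertMatch
}

func main() {
	trace := []Span{
		{Name: "foo-handler", Duration: 10 * time.Second, Attrs: map[string]string{"key2": "value1"}},
		{Name: "bar", Duration: time.Millisecond, Attrs: map[string]string{"key2": "other"}},
	}
	c := Criteria{AttrKey: "key2", AttrValues: []string{"value1", "value2"}, MinDuration: 9 * time.Second, NamePattern: fooPattern}
	fmt.Println(c.SelectTrace(trace))
	c.InvertMatch = true
	fmt.Println(c.SelectTrace(trace))
}
```

Note that all criteria must hold for the same span: a trace in which one span has the right attribute and a different span has the required duration is not selected by this sketch.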

Limiting the number of spans

There are two spans_per_second settings: the global one and the per-policy one.

While evaluating traces, the limit is applied first at the policy level and then at the global level. The sum of all policy spans_per_second rates might actually be higher than the global limit, but the latter will never be exceeded (so some of the traces will not be included).

For example, suppose there are 3 policies: A, B and C. Each of them has a limit of 300 spans per second, and the global limit is 500 spans per second. Now let's say that for each of the policies there were 5 distinct traces, each having 100 spans and matching the policy criteria (let's call them A1, A2, ... B1, B2, ... and so forth). The policy-level limit of 300 spans per second lets through 3 of the 5 traces for each policy, for example:

Policy A: A1, A2, A3
Policy B: B1, B2, B3
Policy C: C1, C2, C3

However, in total this is 900 spans, which is more than the global limit of 500 spans per second. The processor takes care of that and randomly selects traces so that the span count stays within the global limit. So eventually it might, for example, pass on only the following traces: A1, A2, B1, C2, C3, and filter out the others.
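
The two-level limit can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the processor's implementation: the Trace type and the takeUpTo, cascade and makeTraces helpers are hypothetical, traces are kept in input order rather than chosen randomly, and the special value -1 for a policy limit is not modeled.

```go
package main

import "fmt"

// Trace is a hypothetical record of a trace matched by one policy.
type Trace struct {
	ID    string
	Spans int
}

// takeUpTo keeps traces, in input order, while their combined span count
// stays within limit. The processor selects randomly; in-order selection
// is a simplification made by this sketch.
func takeUpTo(traces []Trace, limit int) []Trace {
	var kept []Trace
	used := 0
	for _, tr := range traces {
		if used+tr.Spans > limit {
			continue
		}
		kept = append(kept, tr)
		used += tr.Spans
	}
	return kept
}

// cascade applies the per-policy limit to each policy's matches first and
// then applies the global limit to the combined result.
func cascade(matched [][]Trace, policyLimit, globalLimit int) []Trace {
	var candidates []Trace
	for _, traces := range matched {
		candidates = append(candidates, takeUpTo(traces, policyLimit)...)
	}
	return takeUpTo(candidates, globalLimit)
}

// makeTraces builds n traces named prefix1..prefixN with the given span count.
func makeTraces(prefix string, n, spans int) []Trace {
	out := make([]Trace, 0, n)
	for i := 1; i <= n; i++ {
		out = append(out, Trace{ID: fmt.Sprintf("%s%d", prefix, i), Spans: spans})
	}
	return out
}

func main() {
	matched := [][]Trace{
		makeTraces("A", 5, 100),
		makeTraces("B", 5, 100),
		makeTraces("C", 5, 100),
	}
	// Policy limit 300 spans/s, global limit 500 spans/s.
	for _, tr := range cascade(matched, 300, 500) {
		fmt.Println(tr.ID, tr.Spans)
	}
}
```

Each policy contributes at most 3 traces (300 spans), and the global pass then keeps 5 traces (500 spans) in total. Because this sketch selects in order, earlier policies are favored, which the random selection described above avoids.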

Example

processors:
  cascading_filter:
    decision_wait: 10s
    num_traces: 100
    expected_new_traces_per_sec: 10
    spans_per_second: 1000
    probabilistic_filtering_ratio: 0.1
    policies:
      [
        {
          name: test-policy-1,
        },
        {
          name: test-policy-2,
          numeric_attribute: { key: key1, min_value: 50, max_value: 100 }
        },
        {
          name: test-policy-3,
          string_attribute: { key: key2, values: [ value1, value2 ] }
        },
        {
          name: test-policy-4,
          spans_per_second: 35,
        },
        {
          name: test-policy-5,
          spans_per_second: 123,
          numeric_attribute: { key: key1, min_value: 50, max_value: 100 },
          invert_match: true
        },
        {
          name: test-policy-6,
          spans_per_second: 50,
          properties: { min_duration: 9s }
        },
        {
          name: test-policy-7,
          properties: {
            name_pattern: "foo.*",
            min_number_of_spans: 10,
            min_duration: 9s
          }
        },
        {
          name: everything_else,
          spans_per_second: -1
        },
      ]

Refer to cascading_filter_config.yaml for detailed examples on using the processor.

Documentation ¶

Index ¶

Constants
func CascadingFilterMetricViews(level configtelemetry.Level) []*view.View
func NewFactory() component.ProcessorFactory
type Policy

Constants ¶

const (
	AttributeSamplingRule = "sampling.rule"
)

Variables ¶

This section is empty.

Functions ¶

func CascadingFilterMetricViews ¶

func CascadingFilterMetricViews(level configtelemetry.Level) []*view.View

CascadingFilterMetricViews returns the metric views according to the given telemetry level.

func NewFactory ¶

func NewFactory() component.ProcessorFactory

NewFactory returns a new factory for the Cascading Filter processor.

Types ¶

type Policy ¶

type Policy struct {
	// Name used to identify this policy instance.
	Name string
	// Evaluator that decides if a trace is sampled or not by this policy instance.
	Evaluator sampling.PolicyEvaluator
	// contains filtered or unexported fields
}

Policy combines a sampling policy evaluator with the destinations to be used for that policy.

Directories ¶

Path	Synopsis
config
idbatcher	Package idbatcher defines a pipeline of fixed size in which the elements are batches of ids.
sampling	Package sampling contains the interfaces and data types used to implement the various sampling policies.
