redactionprocessor

package module

v0.136.0 (Latest) Published: Sep 22, 2025 License: Apache-2.0 Imports: 23 Imported by: 9

README ¶

Redaction processor

Status
Stability	alpha: logs, metrics
	beta: traces
Distributions	contrib, k8s
Code Owners	@dmitryax, @mx-psi, @TylerHelmuth
Emeritus	@leonsp-ai

This processor deletes span, log, and metric datapoint attributes that don't match a list of allowed attributes. It also masks attribute values that match a blocked value list. Attributes that aren't on the allowed list are removed before any value checks are done.

Use Cases

Typical use-cases:

Prevent sensitive fields from accidentally leaking into traces
Ensure compliance with legal, privacy, or security requirements

For example:

EU General Data Protection Regulation (GDPR) prohibits the transfer of any personal data such as birthdates, addresses, or IP addresses across borders without explicit consent from the data subject. Popular trace aggregation services are located in the US, not in the EU. You can use the redaction processor to scrub personal data from your data.
PRC legislation prohibits the transfer of geographic coordinates outside of the PRC. Popular trace aggregation services are located in the US, not in the PRC. You can use the redaction processor to scrub geographic coordinates from your data.
Payment Card Industry (PCI) Data Security Standards prohibit logging certain things or storing them unencrypted. You can use the redaction processor to scrub them from your traces.

The above is written by an engineer, not a lawyer. The redaction processor is intended as one line of defence rather than the only compliance measure in place.

Processor Configuration

Please refer to config.go for the config spec.

Examples:

processors:
  redaction:
    # allow_all_keys is a flag that disables the allowed_keys list when set to true.
    # The list of blocked_values is applied regardless. If you just want to block values, set this to true.
    allow_all_keys: false
    # allowed_keys is a list of span/log/datapoint attribute keys that are kept on the span/log/datapoint and
    # processed. The list is designed to fail closed. If allowed_keys is empty,
    # no attributes are allowed and all span attributes are removed. To
    # allow all keys, set allow_all_keys to true.
    allowed_keys:
      - description
      - group
      - id
      - name
    # Ignore the following attributes, allow them to pass without redaction.
    # Any keys in this list are allowed so they don't need to be in both lists.
    ignored_keys:
      - safe_attribute
    # redact_all_types checks incoming fields for sensitive data based on their AsString() representation. This allows the processor to redact sensitive data from non-string values such as ints, which is useful for redacting credit card numbers.
    redact_all_types: true
    # blocked_key_patterns is a list of blocked span attribute key patterns. Span attributes
    # matching the regexes on the list are masked.
    blocked_key_patterns:
      - ".*token.*"
      - ".*api_key.*"
    # blocked_values is a list of regular expressions for blocking values of
    # allowed span attributes. Values that match are masked
    blocked_values:
      - "4[0-9]{12}(?:[0-9]{3})?" ## Visa credit card number
      - "(5[1-5][0-9]{14})"       ## MasterCard number
    # allowed_values is a list of regular expressions for allowing values of
    # blocked span attributes. Values that match are not masked.
    allowed_values:
      - ".+@mycompany.com"
    # hash_function defines the function for hashing the values instead of
    # masking them with a fixed string. By default, no hash function is used
    # and masking with a fixed string is performed.
    hash_function: md5
    # summary controls the verbosity level of the diagnostic attributes that
    # the processor adds to the spans/logs/datapoints when it redacts or masks other
    # attributes. In some contexts a list of redacted attributes leaks
    # information, while it is valuable when integrating and testing a new
    # configuration. Possible values:
    # - `debug` includes both redacted key counts and names in the summary
    # - `info` includes just the redacted key counts in the summary
    # - `silent` omits the summary attributes
    summary: debug

Refer to config.yaml for how to fit the configuration into an OpenTelemetry Collector pipeline definition.

Ignored attributes are processed first so they're always allowed and never blocked. This field should only be used where you know the data is always safe to send to the telemetry system.

Only span/log/datapoint attributes whose keys are on the allowed_keys list are retained. If allowed_keys is empty, no attributes are allowed and all attributes are removed. To keep all attributes, explicitly set allow_all_keys to true.
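The fail-closed behaviour described above can be illustrated with a small, self-contained sketch. It is not the processor's code: the keyFilter type and the filterAttrs helper are hypothetical names, and attributes are simplified to a map of strings.

```go
package main

import "fmt"

// keyFilter is a hypothetical stand-in for the key-related settings
// (allow_all_keys, allowed_keys, ignored_keys). It is not the processor's type.
type keyFilter struct {
	allowAll bool
	allowed  map[string]bool
	ignored  map[string]bool
}

// keep reports whether an attribute key survives filtering.
func (f keyFilter) keep(key string) bool {
	if f.ignored[key] { // ignored keys are checked first and always pass
		return true
	}
	if f.allowAll { // allow_all_keys disables the allow list
		return true
	}
	return f.allowed[key] // an empty allow list lets nothing through (fail closed)
}

// filterAttrs returns a copy of attrs that contains only the retained keys.
func filterAttrs(f keyFilter, attrs map[string]string) map[string]string {
	out := make(map[string]string)
	for k, v := range attrs {
		if f.keep(k) {
			out[k] = v
		}
	}
	return out
}

func main() {
	attrs := map[string]string{
		"id":             "42",
		"name":           "checkout",
		"password":       "hunter2",
		"safe_attribute": "ok",
	}
	f := keyFilter{
		allowed: map[string]bool{"id": true, "name": true},
		ignored: map[string]bool{"safe_attribute": true},
	}
	fmt.Println(filterAttrs(f, attrs))                // map[id:42 name:checkout safe_attribute:ok]
	fmt.Println(len(filterAttrs(keyFilter{}, attrs))) // 0: the zero-value filter removes everything
}
```

Note that the zero-value filter drops every attribute, which mirrors the "empty allowed_keys removes everything" rule.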

blocked_values and allowed_values apply to the values of the allowed keys. If part of the value of an allowed key matches the regular expression for an allowed value, that part is not masked even if it also matches the regular expression for a blocked value. If it matches the regular expression for a blocked value only, the matching part of the value is masked with a fixed-length string of asterisks.

blocked_key_patterns applies to attributes whose keys match one of the patterns. The values of those attributes are masked, or hashed if hash_function is set.

hash_function defines the function used to hash the values of matched keys, or the matching parts of values, instead of masking them with a fixed string. By default, no hash function is used and masking with a fixed string is performed. The supported hash functions are md5, sha1 and sha3 (SHA3-256).
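The difference between masking and hashing can be sketched with the standard library. The redact helper below is hypothetical, hex encoding of the digest is an assumption, and sha3 is left out to keep the sketch short.

```go
package main

import (
	"crypto/md5"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// redact replaces a sensitive value either with a hex-encoded digest or,
// when no hash function is selected, with a fixed mask.
func redact(value, hashFunction string) string {
	switch hashFunction {
	case "md5":
		sum := md5.Sum([]byte(value))
		return hex.EncodeToString(sum[:])
	case "sha1":
		sum := sha1.Sum([]byte(value))
		return hex.EncodeToString(sum[:])
	default:
		return "****" // fixed-string masking; the mask text is an assumption
	}
}

func main() {
	fmt.Println(redact("abc", ""))     // ****
	fmt.Println(redact("abc", "md5"))  // 900150983cd24fb0d6963f7d28e17f72
	fmt.Println(redact("abc", "sha1")) // a9993e364706816aba3e25717850c26c9cd0d89d
}
```

Hashing keeps equal inputs comparable after redaction, which a fixed mask does not. The trade-off is that short or low-entropy values can be recovered from an unsalted hash by guessing.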

For example, if notes is on the list of allowed keys, then the notes attribute is retained. However, if the notes field contains a value, such as a credit card number, that matches a regular expression on the list of blocked values, then that value is masked.

Database Query Sanitization

The redaction processor supports sanitizing database queries and commands to remove sensitive information. This feature supports multiple database systems:

SQL databases
Redis
Memcached
MongoDB
OpenSearch
Elasticsearch

Example configuration with database sanitization:

processors:
  redaction:
    # ... other redaction settings ...
    
    # Database sanitization configuration
    db_sanitizer:
      sql:
        enabled: true
        attributes: ["db.statement", "db.query"]
      redis:
        enabled: true
        attributes: ["db.statement", "redis.command"]
      memcached:
        enabled: true
        attributes: ["db.statement", "memcached.command"] 
      mongo:
        enabled: true
        attributes: ["db.statement", "mongodb.query"]
      opensearch:
        enabled: true
        attributes: ["db.statement", "opensearch.body"]
      es:
        enabled: true
        attributes: ["db.statement", "elasticsearch.body"]

The database sanitizer will:

Remove sensitive data like literal values from SQL queries
Redact command arguments from Redis/Memcached commands
Sanitize MongoDB queries and JSON payloads
Process only specified attributes if provided
Preserve query structure while removing sensitive data

This provides an additional layer of protection when collecting telemetry that includes database operations.
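As a rough illustration of "preserve query structure while removing literal values", the sketch below replaces SQL string and numeric literals with a placeholder using regular expressions. It is only an approximation of the idea: the sanitizeSQL helper is hypothetical, the `?` placeholder is an assumption, and the processor's own sanitizer is not regex-based code taken from here.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// sqlString matches single-quoted literals, including '' escapes.
	sqlString = regexp.MustCompile(`'(?:[^']|'')*'`)
	// sqlNumber matches standalone integer or decimal literals.
	sqlNumber = regexp.MustCompile(`\b\d+(?:\.\d+)?\b`)
)

// sanitizeSQL replaces literal values with "?" and leaves keywords,
// identifiers, and operators untouched.
func sanitizeSQL(query string) string {
	query = sqlString.ReplaceAllString(query, "?")
	return sqlNumber.ReplaceAllString(query, "?")
}

func main() {
	q := "SELECT * FROM users WHERE email = 'a@b.com' AND age > 30"
	fmt.Println(sanitizeSQL(q))
	// SELECT * FROM users WHERE email = ? AND age > ?
}
```

A regex approach like this misses comments, dollar-quoted strings, and dialect-specific syntax, which is why a real sanitizer typically relies on a tokenizer.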

Documentation ¶

Index ¶

func NewFactory() processor.Factory
type Config
type HashFunction
- func (u HashFunction) String() string
- func (u *HashFunction) UnmarshalText(text []byte) error

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func NewFactory ¶

func NewFactory() processor.Factory

NewFactory creates a factory for the redaction processor.

Types ¶

type Config ¶

type Config struct {
	// AllowAllKeys is a flag to allow all span attribute keys. Setting this
	// to true disables the AllowedKeys list. The list of BlockedValues is
	// applied regardless. If you just want to block values, set this to true.
	AllowAllKeys bool `mapstructure:"allow_all_keys"`

	// AllowedKeys is a list of allowed span attribute keys. Span attributes
	// not on the list are removed. The list fails closed if it's empty. To
	// allow all keys, you should explicitly set AllowAllKeys
	AllowedKeys []string `mapstructure:"allowed_keys"`

	// BlockedKeyPatterns is a list of blocked span attribute key patterns. Span attributes
	// matching the regexes on the list are masked.
	BlockedKeyPatterns []string `mapstructure:"blocked_key_patterns"`

	// HashFunction defines the function for hashing the values instead of
	// masking them with a fixed string. By default, no hash function is used
	// and masking with a fixed string is performed.
	HashFunction HashFunction `mapstructure:"hash_function"`

	// IgnoredKeys is a list of span attribute keys that are not redacted.
	// Span attributes in this list are allowed to pass through the filter
	// without being changed or removed.
	IgnoredKeys []string `mapstructure:"ignored_keys"`

	// RedactAllTypes of attributes, including those that are not string, by converting to a string representation.
	// By default only string values are redacted.
	RedactAllTypes bool `mapstructure:"redact_all_types"`

	// BlockedValues is a list of regular expressions for blocking values of
	// allowed span attributes. Values that match are masked.
	BlockedValues []string `mapstructure:"blocked_values"`

	// AllowedValues is a list of regular expressions for allowing values of
	// blocked span attributes. Values that match are not masked.
	AllowedValues []string `mapstructure:"allowed_values"`

	// DBSanitizer configures database query sanitization.
	DBSanitizer db.DBSanitizerConfig `mapstructure:"db_sanitizer"`

	// Summary controls the verbosity level of the diagnostic attributes that
	// the processor adds to the spans when it redacts or masks other
	// attributes. In some contexts a list of redacted attributes leaks
	// information, while it is valuable when integrating and testing a new
	// configuration. Possible values are `debug`, `info`, and `silent`.
	Summary string `mapstructure:"summary"`
}

type HashFunction ¶ added in v0.122.0

type HashFunction string

const (
	None HashFunction = ""
	SHA1 HashFunction = "sha1"
	SHA3 HashFunction = "sha3"
	MD5  HashFunction = "md5"
)

func (HashFunction) String ¶ added in v0.122.0

func (u HashFunction) String() string

func (*HashFunction) UnmarshalText ¶ added in v0.122.0

func (u *HashFunction) UnmarshalText(text []byte) error

UnmarshalText unmarshals text to a HashFunction.
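For readers unfamiliar with the pattern, the following sketch shows how a string-backed type can implement encoding.TextUnmarshaler so that configuration text is validated while it is decoded. The hashFn type is a local illustration, not the package's HashFunction; in particular, the case-insensitive matching and the error message are assumptions.

```go
package main

import (
	"encoding"
	"fmt"
	"strings"
)

// hashFn is a local, illustrative copy of a string-backed enum.
type hashFn string

const (
	noneFn hashFn = ""
	sha1Fn hashFn = "sha1"
	sha3Fn hashFn = "sha3"
	md5Fn  hashFn = "md5"
)

// Compile-time check that *hashFn satisfies encoding.TextUnmarshaler.
var _ encoding.TextUnmarshaler = (*hashFn)(nil)

// String returns the textual form of the hash function name.
func (h hashFn) String() string { return string(h) }

// UnmarshalText accepts only the known names and rejects everything else.
func (h *hashFn) UnmarshalText(text []byte) error {
	v := hashFn(strings.ToLower(string(text))) // case folding is an assumption
	switch v {
	case noneFn, sha1Fn, sha3Fn, md5Fn:
		*h = v
		return nil
	default:
		return fmt.Errorf("unknown hash function %q", string(v))
	}
}

func main() {
	var h hashFn
	err := h.UnmarshalText([]byte("MD5"))
	fmt.Println(h, err) // md5 <nil>

	err = h.UnmarshalText([]byte("crc32"))
	fmt.Println(err) // unknown hash function "crc32"
}
```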


Directories ¶

Path	Synopsis
internal/db
internal/metadata
