obfuscate

package module
v0.52.1
Published: Apr 3, 2024 License: Apache-2.0 Imports: 14 Imported by: 19

Documentation

Overview

Package obfuscate implements quantizing and obfuscating of tags and resources for a set of spans matching certain criteria.

This module is used in the Datadog Agent, the Go tracing client (dd-trace-go) and the OpenTelemetry Collector Datadog exporter. End-user behavior is stable, but there are no stability guarantees on its public Go API. Nonetheless, when editing, try to avoid breaking API changes where possible and double-check the API usage in all modules that depend on it.

Constants

const (
	ObfuscateOnly         = ObfuscationMode("obfuscate_only")
	ObfuscateAndNormalize = ObfuscationMode("obfuscate_and_normalize")
)

Valid values for ObfuscationMode.
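
As an illustration of how a string-based mode like this might be validated before it is placed in a configuration, here is a minimal sketch. It mirrors the type and the two constants locally so that it runs with the standard library only. The helper parseMode is hypothetical and is not part of the package; treating the empty string as "mode unset" is also an assumption.

```go
package main

import "fmt"

// ObfuscationMode mirrors the package's string-based mode type locally
// so that this sketch compiles without the module.
type ObfuscationMode string

const (
	ObfuscateOnly         = ObfuscationMode("obfuscate_only")
	ObfuscateAndNormalize = ObfuscationMode("obfuscate_and_normalize")
)

// parseMode is a hypothetical helper: it accepts the empty string
// (assumed to mean "mode unset") and the two documented values, and
// rejects anything else.
func parseMode(s string) (ObfuscationMode, error) {
	switch m := ObfuscationMode(s); m {
	case "", ObfuscateOnly, ObfuscateAndNormalize:
		return m, nil
	default:
		return "", fmt.Errorf("unknown obfuscation mode %q", s)
	}
}

func main() {
	for _, s := range []string{"obfuscate_only", "normalize_only"} {
		m, err := parseMode(s)
		fmt.Printf("mode=%q err=%v\n", m, err)
	}
}
```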

const (
	LexError = TokenKind(57346) + iota

	ID
	Limit
	Null
	String
	DoubleQuotedString
	DollarQuotedString // https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING
	DollarQuotedFunc   // a dollar-quoted string delimited by the tag "$func$"; gets special treatment when feature "dollar_quoted_func" is set
	Number
	BooleanLiteral
	ValueArg
	ListArg
	Comment
	Variable
	Savepoint
	PreparedStatement
	EscapeSequence
	NullSafeEqual
	LE
	GE
	NE
	Not
	As
	Alter
	Drop
	Create
	Grant
	Revoke
	Commit
	Begin
	Truncate
	Select
	From
	Update
	Delete
	Insert
	Into
	Join
	TableName
	ColonCast

	// PostgreSQL specific JSON operators
	JSONSelect         // ->
	JSONSelectText     // ->>
	JSONSelectPath     // #>
	JSONSelectPathText // #>>
	JSONContains       // @>
	JSONContainsLeft   // <@
	JSONKeyExists      // ?
	JSONAnyKeysExist   // ?|
	JSONAllKeysExist   // ?&
	JSONDelete         // #-

	// FilteredGroupable specifies that the given token has been discarded by one of the
	// token filters and that it is groupable together with consecutive FilteredGroupable
	// tokens.
	FilteredGroupable

	// FilteredGroupableParenthesis is a parenthesis marked as filtered groupable. It is the
	// beginning of either a group of values ('(') or a nested query. We track it as
	// a special case for when it may start a nested query as opposed to just another
	// value group to be obfuscated.
	FilteredGroupableParenthesis

	// Filtered specifies that the token is a comma and was discarded by one
	// of the filters.
	Filtered

	// FilteredBracketedIdentifier specifies that we are currently discarding
	// a bracketed identifier (MSSQL).
	// See issue https://github.com/DataDog/datadog-trace-agent/issues/475.
	FilteredBracketedIdentifier
)

List of available tokens. This list has been reduced because a full-fledged tokenizer is not needed to implement a Lexer.

const (
	// DBMSSQLServer is a MS SQL Server
	DBMSSQLServer = "mssql"
	// DBMSPostgres is a PostgreSQL Server
	DBMSPostgres = "postgresql"
	// DBMSMySQL is a MySQL Server
	DBMSMySQL = "mysql"
	// DBMSOracle is an Oracle Server
	DBMSOracle = "oracle"
)

const EndChar = unicode.MaxRune + 1

EndChar is used to signal that the scanner has finished reading the query. This happens when there are no more characters left in the query or when invalid encoding is discovered. EndChar is an invalid rune value that cannot be found in any valid string.
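
The sentinel idea can be sketched with the standard library alone. The example below is not the package's scanner: the constant endChar mirrors the documented value, while the helpers next and countRunes are hypothetical names introduced for illustration. The loop stops either at the end of the input or at the first invalid UTF-8 byte, which matches the two cases described above.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// endChar mirrors the documented EndChar value: one past the largest
// valid rune, so it can never be decoded from a valid string.
const endChar = unicode.MaxRune + 1

// next is a hypothetical helper. It decodes the rune at byte offset pos
// and returns it together with the offset of the following rune. It
// returns endChar when the input is exhausted or the bytes at pos are
// not valid UTF-8.
func next(s string, pos int) (rune, int) {
	if pos >= len(s) {
		return endChar, pos
	}
	r, size := utf8.DecodeRuneInString(s[pos:])
	if r == utf8.RuneError && size == 1 {
		return endChar, pos
	}
	return r, pos + size
}

// countRunes walks the input until the sentinel is returned.
func countRunes(s string) int {
	n := 0
	for r, pos := next(s, 0); r != endChar; r, pos = next(s, pos) {
		n++
	}
	return n
}

func main() {
	fmt.Println(countRunes("héllo"))    // whole string is valid UTF-8
	fmt.Println(countRunes("ab\xffcd")) // scanning stops at the invalid byte
}
```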

Variables

This section is empty.

Functions

func IsCardNumber

func IsCardNumber(b string, validateLuhn bool) (ok bool)

IsCardNumber checks if b could be a credit card number by checking the digit count and IIN prefix. If validateLuhn is true, the Luhn checksum is also applied to potential candidates.
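
For readers unfamiliar with the Luhn checksum mentioned above, here is a minimal standalone sketch of that check. It does not reproduce the package's digit-count or IIN-prefix logic, and the function name luhnValid is hypothetical.

```go
package main

import "fmt"

// luhnValid reports whether the digits in s pass the Luhn checksum.
// It returns false for an empty string or if s contains a non-digit byte.
func luhnValid(s string) bool {
	if len(s) == 0 {
		return false
	}
	sum := 0
	double := false // the rightmost digit is not doubled
	for i := len(s) - 1; i >= 0; i-- {
		c := s[i]
		if c < '0' || c > '9' {
			return false
		}
		d := int(c - '0')
		if double {
			d *= 2
			if d > 9 {
				d -= 9 // same as adding the two digits of the product
			}
		}
		sum += d
		double = !double
	}
	return sum%10 == 0
}

func main() {
	fmt.Println(luhnValid("4111111111111111")) // a widely used test number
	fmt.Println(luhnValid("4111111111111112")) // last digit altered
}
```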

Types

type Config

type Config struct {
	// SQL holds the obfuscation configuration for SQL queries.
	SQL SQLConfig

	// ES holds the obfuscation configuration for ElasticSearch bodies.
	ES JSONConfig

	// Mongo holds the obfuscation configuration for MongoDB queries.
	Mongo JSONConfig

	// SQLExecPlan holds the obfuscation configuration for SQL Exec Plans. This is strictly for safety related obfuscation,
	// not normalization. Normalization of exec plans is configured in SQLExecPlanNormalize.
	SQLExecPlan JSONConfig

	// SQLExecPlanNormalize holds the normalization configuration for SQL Exec Plans.
	SQLExecPlanNormalize JSONConfig

	// HTTP holds the obfuscation settings for HTTP URLs.
	HTTP HTTPConfig

	// Redis holds the obfuscation settings for Redis commands.
	Redis RedisConfig

	// Memcached holds the obfuscation settings for Memcached commands.
	Memcached MemcachedConfig

	// Statsd specifies the statsd client to use for reporting metrics.
	Statsd StatsClient

	// Logger specifies the logger to use when outputting messages.
	// If unset, no logs are emitted.
	Logger Logger
}

Config holds the configuration for obfuscating sensitive data for various span types.

type HTTPConfig

type HTTPConfig struct {
	// RemoveQueryString determines whether query strings are removed from HTTP URLs.
	RemoveQueryString bool `mapstructure:"remove_query_string" json:"remove_query_string"`

	// RemovePathDigits determines whether digits in path segments are obfuscated.
	RemovePathDigits bool `mapstructure:"remove_paths_with_digits" json:"remove_path_digits"`
}

HTTPConfig holds the configuration settings for HTTP obfuscation.
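
To make the two settings concrete, here is a rough sketch of what such URL obfuscation could look like using net/url. It is not the package's implementation: the function obfuscateURL and the placeholder "REDACTED" are assumptions made for this example, and the package's actual output format may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// obfuscateURL is a hypothetical stand-in for URL obfuscation.
// removeQuery drops the query string; removePathDigits replaces every
// path segment that contains a digit with a placeholder (an assumption
// of this sketch, not the package's documented output).
func obfuscateURL(raw string, removeQuery, removePathDigits bool) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if removeQuery {
		u.RawQuery = ""
		u.ForceQuery = false
	}
	if removePathDigits {
		segs := strings.Split(u.Path, "/")
		for i, seg := range segs {
			if strings.ContainsAny(seg, "0123456789") {
				segs[i] = "REDACTED"
			}
		}
		u.Path = strings.Join(segs, "/")
		u.RawPath = "" // let url.URL re-derive the escaped path
	}
	return u.String(), nil
}

func main() {
	out, err := obfuscateURL("https://example.com/users/123/profile?token=abc", true, true)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(out)
}
```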

type JSONConfig

type JSONConfig struct {
	// Enabled will specify whether obfuscation should be enabled.
	Enabled bool `mapstructure:"enabled"`

	// KeepValues will specify a set of keys for which their values will
	// not be obfuscated.
	KeepValues []string `mapstructure:"keep_values"`

	// ObfuscateSQLValues will specify a set of keys for which their values
	// will be passed through SQL obfuscation
	ObfuscateSQLValues []string `mapstructure:"obfuscate_sql_values"`
}

JSONConfig holds the obfuscation configuration for sensitive data found in JSON objects.
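
The effect of a keep list can be sketched with encoding/json. This is only an illustration of the idea behind KeepValues, not the package's obfuscator: the helpers redact and obfuscateJSON are hypothetical, and because the sketch decodes into a map, keys come out sorted rather than in their original order.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// redact walks a decoded JSON value and replaces every leaf with "?",
// except for values stored under a key listed in keep.
func redact(v interface{}, keep map[string]bool) interface{} {
	switch t := v.(type) {
	case map[string]interface{}:
		for k, child := range t {
			if keep[k] {
				continue // value is kept verbatim
			}
			t[k] = redact(child, keep)
		}
		return t
	case []interface{}:
		for i, child := range t {
			t[i] = redact(child, keep)
		}
		return t
	default:
		return "?"
	}
}

// obfuscateJSON is a hypothetical helper that applies redact to a JSON
// document and re-encodes the result.
func obfuscateJSON(in string, keepValues []string) (string, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(in), &v); err != nil {
		return "", err
	}
	keep := make(map[string]bool, len(keepValues))
	for _, k := range keepValues {
		keep[k] = true
	}
	out, err := json.Marshal(redact(v, keep))
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	in := `{"user":"alice","age":30,"filter":{"status":"active"}}`
	out, err := obfuscateJSON(in, []string{"status"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```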

type Logger

type Logger interface {
	// Debugf logs the given message using the given format.
	Debugf(format string, params ...interface{})
}

Logger is able to log certain log messages.
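
Because the interface has a single method, adapting an existing logger is short. The sketch below mirrors the interface locally and shows two possible implementations; the type names stdLogger and memLogger are hypothetical and do not exist in the package.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// Logger mirrors the documented single-method interface.
type Logger interface {
	Debugf(format string, params ...interface{})
}

// stdLogger adapts a standard library *log.Logger to the interface.
type stdLogger struct{ l *log.Logger }

func (s stdLogger) Debugf(format string, params ...interface{}) {
	s.l.Printf("DEBUG "+format, params...)
}

// memLogger records formatted messages in memory, which can be handy
// in tests.
type memLogger struct{ lines []string }

func (m *memLogger) Debugf(format string, params ...interface{}) {
	m.lines = append(m.lines, fmt.Sprintf(format, params...))
}

func main() {
	var lg Logger = stdLogger{l: log.New(os.Stderr, "", 0)}
	lg.Debugf("obfuscated %d queries", 3)

	mem := &memLogger{}
	lg = mem
	lg.Debugf("cache size: %d", 10)
	fmt.Println(mem.lines)
}
```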

type MemcachedConfig added in v0.49.0

type MemcachedConfig struct {
	// Enabled specifies whether this feature should be enabled.
	Enabled bool `mapstructure:"enabled"`

	// KeepCommand specifies whether the command of a given Memcached
	// query should be kept. If false, the entire tag is removed.
	KeepCommand bool `mapstructure:"keep_command"`
}

MemcachedConfig holds the configuration settings for Memcached obfuscation.

type ObfuscatedQuery

type ObfuscatedQuery struct {
	Query    string      `json:"query"`    // the obfuscated SQL query
	Metadata SQLMetadata `json:"metadata"` // metadata extracted from the SQL query
}

ObfuscatedQuery specifies information about an obfuscated SQL query.

func (*ObfuscatedQuery) Cost

func (oq *ObfuscatedQuery) Cost() int64

Cost returns the number of bytes needed to store all the fields of this ObfuscatedQuery.

type ObfuscationMode added in v0.50.0

type ObfuscationMode string

ObfuscationMode specifies the obfuscation mode to use for go-sqllexer pkg.

type Obfuscator

type Obfuscator struct {
	// contains filtered or unexported fields
}

Obfuscator quantizes and obfuscates spans. The obfuscator is not safe for concurrent use.
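
Since the type is documented as not safe for concurrent use, callers that share one instance across goroutines need their own synchronization (or one instance per goroutine). The sketch below shows the mutex approach with a stand-in type; worker, lockedWorker and runConcurrently are hypothetical and only illustrate the pattern.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// worker stands in for a type that is not safe for concurrent use:
// its counter is updated without synchronization.
type worker struct{ calls int }

func (w *worker) process(s string) string {
	w.calls++
	return strings.ToUpper(s)
}

// lockedWorker serializes all access to the wrapped worker.
type lockedWorker struct {
	mu sync.Mutex
	w  worker
}

func (l *lockedWorker) process(s string) string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.w.process(s)
}

// runConcurrently calls process from n goroutines and returns the
// number of calls recorded by the wrapped worker.
func runConcurrently(n int) int {
	lw := &lockedWorker{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			lw.process("select 1")
		}()
	}
	wg.Wait()
	lw.mu.Lock()
	defer lw.mu.Unlock()
	return lw.w.calls
}

func main() {
	fmt.Println(runConcurrently(50))
}
```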

func NewObfuscator

func NewObfuscator(cfg Config) *Obfuscator

NewObfuscator creates a new Obfuscator.

func (*Obfuscator) ObfuscateElasticSearchString

func (o *Obfuscator) ObfuscateElasticSearchString(cmd string) string

ObfuscateElasticSearchString obfuscates the given ElasticSearch JSON query.

func (*Obfuscator) ObfuscateMemcachedString

func (o *Obfuscator) ObfuscateMemcachedString(cmd string) string

ObfuscateMemcachedString obfuscates the Memcached command cmd.

func (*Obfuscator) ObfuscateMongoDBString

func (o *Obfuscator) ObfuscateMongoDBString(cmd string) string

ObfuscateMongoDBString obfuscates the given MongoDB JSON query.

func (*Obfuscator) ObfuscateRedisString

func (*Obfuscator) ObfuscateRedisString(rediscmd string) string

ObfuscateRedisString obfuscates the given Redis command.

func (*Obfuscator) ObfuscateSQLExecPlan

func (o *Obfuscator) ObfuscateSQLExecPlan(jsonPlan string, normalize bool) (string, error)

ObfuscateSQLExecPlan obfuscates query conditions in the provided JSON-encoded execution plan. If normalize is true, cost and row estimates are also obfuscated away.

func (*Obfuscator) ObfuscateSQLString

func (o *Obfuscator) ObfuscateSQLString(in string) (*ObfuscatedQuery, error)

ObfuscateSQLString quantizes and obfuscates the given input SQL query string. Quantization removes some elements such as comments and aliases and obfuscation attempts to hide sensitive information in strings and numbers by redacting them.
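
To give a feel for the difference between quantization and obfuscation, here is a deliberately simplified sketch based on regular expressions. It is not how the package works (the package uses a tokenizer), it does not handle aliases, and it will mis-handle comment markers that appear inside string literals. The name toyObfuscate and both patterns are assumptions of this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// commentRe matches "--" line comments and "/* ... */" block comments.
var commentRe = regexp.MustCompile(`--[^\n]*|/\*(?s:.*?)\*/`)

// literalRe matches single-quoted string literals (with '' as an
// escaped quote) and standalone numeric literals.
var literalRe = regexp.MustCompile(`'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b`)

// toyObfuscate removes comments (a quantization step), replaces string
// and number literals with "?" (an obfuscation step) and collapses
// whitespace.
func toyObfuscate(query string) string {
	q := commentRe.ReplaceAllString(query, " ")
	q = literalRe.ReplaceAllString(q, "?")
	return strings.Join(strings.Fields(q), " ")
}

func main() {
	fmt.Println(toyObfuscate("SELECT * FROM users WHERE id = 42 AND name = 'bob' -- lookup"))
}
```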

func (*Obfuscator) ObfuscateSQLStringWithOptions

func (o *Obfuscator) ObfuscateSQLStringWithOptions(in string, opts *SQLConfig) (*ObfuscatedQuery, error)

ObfuscateSQLStringWithOptions quantizes and obfuscates the given input SQL query string, using the optional opts (*SQLConfig) to change the behavior of the obfuscator. Quantization removes some elements such as comments and aliases, and obfuscation attempts to hide sensitive information in strings and numbers by redacting them.

func (*Obfuscator) ObfuscateURLString

func (o *Obfuscator) ObfuscateURLString(val string) string

ObfuscateURLString obfuscates the given URL. It must be a valid URL.

func (*Obfuscator) ObfuscateWithSQLLexer added in v0.50.0

func (o *Obfuscator) ObfuscateWithSQLLexer(in string, opts *SQLConfig) (*ObfuscatedQuery, error)

ObfuscateWithSQLLexer obfuscates the given SQL query using the go-sqllexer package. If ObfuscationMode is set to ObfuscateOnly, the query will be obfuscated without normalizing it.

func (*Obfuscator) QuantizeRedisString

func (*Obfuscator) QuantizeRedisString(query string) string

QuantizeRedisString returns a quantized version of a Redis query.

TODO(gbbr): Refactor this method to use the tokenizer and remove "compactWhitespaces". This method is buggy when commands contain quoted strings with newlines.

func (*Obfuscator) RemoveAllRedisArgs added in v0.47.0

func (*Obfuscator) RemoveAllRedisArgs(rediscmd string) string

RemoveAllRedisArgs takes a command and obfuscates all arguments following the command, regardless of whether the command is valid Redis.
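
A minimal sketch of this behavior, assuming the first whitespace-separated token is the command and that all arguments collapse into a single "?" placeholder. Both points are assumptions of this example, and removeAllArgs is a hypothetical name rather than the package's function.

```go
package main

import (
	"fmt"
	"strings"
)

// removeAllArgs keeps the first token of cmd and replaces everything
// after it with a single "?". No attempt is made to check that the
// first token is a real Redis command.
func removeAllArgs(cmd string) string {
	fields := strings.Fields(cmd)
	switch len(fields) {
	case 0:
		return ""
	case 1:
		return fields[0]
	default:
		return fields[0] + " ?"
	}
}

func main() {
	fmt.Println(removeAllArgs("SET session:42 secret-value"))
	fmt.Println(removeAllArgs("PING"))
}
```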

func (*Obfuscator) Stop

func (o *Obfuscator) Stop()

Stop cleans up after a finished Obfuscator.

type RedisConfig added in v0.47.0

type RedisConfig struct {
	// Enabled specifies whether this feature should be enabled.
	Enabled bool `mapstructure:"enabled"`

	// RemoveAllArgs specifies whether all arguments to a given Redis
	// command should be obfuscated.
	RemoveAllArgs bool `mapstructure:"remove_all_args"`
}

RedisConfig holds the configuration settings for Redis obfuscation.

type SQLConfig

type SQLConfig struct {
	// DBMS identifies the type of database management system (e.g. MySQL, Postgres, and SQL Server).
	// Valid values for this can be found at https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/database.md#connection-level-attributes
	DBMS string `json:"dbms"`

	// TableNames specifies whether the obfuscator should also extract the table names that a query addresses,
	// in addition to obfuscating.
	TableNames bool `json:"table_names" yaml:"table_names"`

	// CollectCommands specifies whether the obfuscator should extract and return commands as SQL metadata when obfuscating.
	CollectCommands bool `json:"collect_commands" yaml:"collect_commands"`

	// CollectComments specifies whether the obfuscator should extract and return comments as SQL metadata when obfuscating.
	CollectComments bool `json:"collect_comments" yaml:"collect_comments"`

	// CollectProcedures specifies whether the obfuscator should extract and return procedure names as SQL metadata when obfuscating.
	CollectProcedures bool `json:"collect_procedures" yaml:"collect_procedures"`

	// ReplaceDigits specifies whether digits in table names and identifiers should be obfuscated.
	ReplaceDigits bool `json:"replace_digits" yaml:"replace_digits"`

	// KeepSQLAlias reports whether SQL aliases ("AS") should be kept instead of being truncated.
	KeepSQLAlias bool `json:"keep_sql_alias"`

	// DollarQuotedFunc reports whether to treat "$func$" delimited dollar-quoted strings
	// differently and not obfuscate them as a string. To read more about dollar quoted
	// strings see:
	//
	// https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING
	DollarQuotedFunc bool `json:"dollar_quoted_func"`

	// ObfuscationMode specifies the obfuscation mode to use for go-sqllexer pkg.
	// When specified, obfuscator will attempt to use go-sqllexer pkg to obfuscate (and normalize) SQL queries.
	// Valid values are "obfuscate_only", "obfuscate_and_normalize"
	ObfuscationMode ObfuscationMode `json:"obfuscation_mode" yaml:"obfuscation_mode"`

	// RemoveSpaceBetweenParentheses specifies whether to remove spaces between parentheses.
	// By default, spaces are inserted between parentheses during normalization.
	// This option is only valid when ObfuscationMode is "obfuscate_and_normalize".
	RemoveSpaceBetweenParentheses bool `json:"remove_space_between_parentheses" yaml:"remove_space_between_parentheses"`

	// KeepNull specifies whether to disable obfuscating NULL values with ?.
	// This option is only valid when ObfuscationMode is "obfuscate_only" or "obfuscate_and_normalize".
	KeepNull bool `json:"keep_null" yaml:"keep_null"`

	// KeepBoolean specifies whether to disable obfuscating boolean values with ?.
	// This option is only valid when ObfuscationMode is "obfuscate_only" or "obfuscate_and_normalize".
	KeepBoolean bool `json:"keep_boolean" yaml:"keep_boolean"`

	// KeepPositionalParameter specifies whether to disable obfuscating positional parameters with ?.
	// This option is only valid when ObfuscationMode is "obfuscate_only" or "obfuscate_and_normalize".
	KeepPositionalParameter bool `json:"keep_positional_parameter" yaml:"keep_positional_parameter"`

	// KeepTrailingSemicolon specifies whether to keep trailing semicolon.
	// By default, trailing semicolon is removed during normalization.
	// This option is only valid when ObfuscationMode is "obfuscate_only" or "obfuscate_and_normalize".
	KeepTrailingSemicolon bool `json:"keep_trailing_semicolon" yaml:"keep_trailing_semicolon"`

	// KeepIdentifierQuotation specifies whether to keep identifier quotation, e.g. "my_table" or [my_table].
	// By default, identifier quotation is removed during normalization.
	// This option is only valid when ObfuscationMode is "obfuscate_only" or "obfuscate_and_normalize".
	KeepIdentifierQuotation bool `json:"keep_identifier_quotation" yaml:"keep_identifier_quotation"`

	// Cache reports whether the obfuscator should use an LRU look-up cache for SQL obfuscations.
	Cache bool
}

SQLConfig holds the config for obfuscating SQL.

type SQLMetadata

type SQLMetadata struct {
	// Size holds the byte size of the metadata collected.
	Size int64
	// TablesCSV is a comma-separated list of tables that the query addresses.
	TablesCSV string `json:"tables_csv"`
	// Commands holds commands executed in an SQL statement.
	// e.g. SELECT, UPDATE, INSERT, DELETE, etc.
	Commands []string `json:"commands"`
	// Comments holds comments in an SQL statement.
	Comments []string `json:"comments"`
	// Procedures holds procedure names in an SQL statement.
	Procedures []string `json:"procedures"`
}

SQLMetadata holds metadata collected throughout the obfuscation of an SQL statement. It is only collected when enabled via SQLConfig.

type SQLTokenizer

type SQLTokenizer struct {
	// contains filtered or unexported fields
}

SQLTokenizer is the struct used to generate SQL tokens for the parser.

func NewSQLTokenizer

func NewSQLTokenizer(sql string, literalEscapes bool, cfg *SQLConfig) *SQLTokenizer

NewSQLTokenizer creates a new SQLTokenizer for the given SQL string. The literalEscapes argument specifies whether escape characters should be treated literally or as such.

func (*SQLTokenizer) Err

func (tkn *SQLTokenizer) Err() error

Err returns the last error that the tokenizer encountered, or nil.

func (*SQLTokenizer) Position

func (tkn *SQLTokenizer) Position() int

Position exports the tokenizer's current position in the query.

func (*SQLTokenizer) Reset

func (tkn *SQLTokenizer) Reset(in string)

Reset resets the underlying buffer and positions.

func (*SQLTokenizer) Scan

func (tkn *SQLTokenizer) Scan() (TokenKind, []byte)

Scan scans the tokenizer for the next token and returns the token type and the token buffer.

func (*SQLTokenizer) SeenEscape

func (tkn *SQLTokenizer) SeenEscape() bool

SeenEscape returns whether or not this tokenizer has seen an escape character within a scanned string.

func (*SQLTokenizer) SkipBlank

func (tkn *SQLTokenizer) SkipBlank()

SkipBlank moves the tokenizer forward until hitting a non-whitespace character. The whitespace definition used here is the same as unicode.IsSpace.

type StatsClient

type StatsClient interface {
	// Gauge reports a gauge stat with the given name, value, tags and rate.
	Gauge(name string, value float64, tags []string, rate float64) error
}

StatsClient implementations are able to emit stats.
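
As a sketch of what a small implementation might look like, the example below mirrors the interface locally and stores the last value of each gauge in memory. The type memStats, its helpers and the metric name are hypothetical; tags and rate are accepted but ignored here.

```go
package main

import (
	"fmt"
	"sync"
)

// StatsClient mirrors the documented single-method interface.
type StatsClient interface {
	Gauge(name string, value float64, tags []string, rate float64) error
}

// memStats keeps the most recent value reported for each gauge name.
type memStats struct {
	mu     sync.Mutex
	gauges map[string]float64
}

func newMemStats() *memStats {
	return &memStats{gauges: make(map[string]float64)}
}

// Gauge records value under name. Tags and rate are ignored in this sketch.
func (m *memStats) Gauge(name string, value float64, tags []string, rate float64) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.gauges[name] = value
	return nil
}

// value returns the last value recorded for name, if any.
func (m *memStats) value(name string) (float64, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	v, ok := m.gauges[name]
	return v, ok
}

func main() {
	stats := newMemStats()
	var c StatsClient = stats
	if err := c.Gauge("example.cache.hits", 3, []string{"env:dev"}, 1); err != nil {
		fmt.Println("gauge failed:", err)
		return
	}
	v, ok := stats.value("example.cache.hits")
	fmt.Println(v, ok)
}
```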

type SyntaxError

type SyntaxError struct {
	Offset int64 // error occurred after reading Offset bytes
	// contains filtered or unexported fields
}

A SyntaxError is a description of a JSON syntax error.

func (*SyntaxError) Error

func (e *SyntaxError) Error() string
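
The shape of this type (an Offset field plus an Error method) matches encoding/json's SyntaxError. Assuming the package's errors are returned as *SyntaxError values, the offset could be extracted with errors.As in the same way as for the standard library type. The sketch below demonstrates the pattern with encoding/json only; syntaxOffset is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// syntaxOffset tries to decode in as JSON and, if decoding fails with a
// syntax error, returns the byte offset reported by the decoder.
func syntaxOffset(in string) (int64, bool) {
	var v interface{}
	err := json.Unmarshal([]byte(in), &v)
	var se *json.SyntaxError
	if errors.As(err, &se) {
		return se.Offset, true
	}
	return 0, false
}

func main() {
	if off, ok := syntaxOffset(`{"a": }`); ok {
		fmt.Println("syntax error after", off, "bytes")
	}
	if _, ok := syntaxOffset(`{"a": 1}`); !ok {
		fmt.Println("valid JSON, no syntax error")
	}
}
```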

type TokenKind

type TokenKind uint32

TokenKind specifies the type of the token being scanned. It may be one of the defined constants below or in some cases the actual rune itself.

func (TokenKind) String

func (k TokenKind) String() string
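
The following sketch illustrates how a kind can be either a named constant or a rune value. It mirrors only the first three constants from the list above, and describe is a hypothetical helper, not the package's String method. The named kinds start at 57346, which keeps them clear of ASCII punctuation such as '(' or ','.

```go
package main

import "fmt"

// TokenKind mirrors the documented underlying type.
type TokenKind uint32

// Only the first three constants are reproduced here.
const (
	LexError = TokenKind(57346) + iota
	ID
	Limit
)

// describe returns the constant name for the mirrored kinds and falls
// back to showing the rune for values below the named range.
func describe(k TokenKind) string {
	switch k {
	case LexError:
		return "LexError"
	case ID:
		return "ID"
	case Limit:
		return "Limit"
	}
	if k < LexError {
		return fmt.Sprintf("rune %q", rune(k))
	}
	return fmt.Sprintf("TokenKind(%d)", uint32(k))
}

func main() {
	fmt.Println(uint32(ID), describe(ID))
	fmt.Println(describe(TokenKind('(')))
}
```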
