Documentation
Overview ¶
Package obfuscate implements quantizing and obfuscating of tags and resources for a set of spans matching certain criteria.
This module is used in the Datadog Agent, the Go tracing client (dd-trace-go), and the OpenTelemetry Collector Datadog exporter. End-user behavior is stable, but there are no stability guarantees on its public Go API. Nonetheless, when editing, avoid breaking API changes where possible and double-check the API usage in all module dependents.
Index ¶
- Constants
- func QuantizePeerIPAddresses(raw string) string
- type CacheConfig
- type Config
- type CreditCardsConfig
- type HTTPConfig
- type JSONConfig
- type Logger
- type MemcachedConfig
- type ObfuscatedQuery
- type ObfuscationMode
- type Obfuscator
- func (o *Obfuscator) ObfuscateCreditCardNumber(key, val string) string
- func (o *Obfuscator) ObfuscateElasticSearchString(cmd string) string
- func (o *Obfuscator) ObfuscateMemcachedString(cmd string) string
- func (o *Obfuscator) ObfuscateMongoDBString(cmd string) string
- func (o *Obfuscator) ObfuscateOpenSearchString(cmd string) string
- func (*Obfuscator) ObfuscateRedisString(rediscmd string) string
- func (o *Obfuscator) ObfuscateSQLExecPlan(jsonPlan string, normalize bool) (string, error)
- func (o *Obfuscator) ObfuscateSQLString(in string) (*ObfuscatedQuery, error)
- func (o *Obfuscator) ObfuscateSQLStringForDBMS(in string, dbms string) (*ObfuscatedQuery, error)
- func (o *Obfuscator) ObfuscateSQLStringWithOptions(in string, opts *SQLConfig) (*ObfuscatedQuery, error)
- func (o *Obfuscator) ObfuscateURLString(val string) string
- func (o *Obfuscator) ObfuscateWithSQLLexer(in string, opts *SQLConfig) (*ObfuscatedQuery, error)
- func (*Obfuscator) QuantizeRedisString(query string) string
- func (*Obfuscator) RemoveAllRedisArgs(rediscmd string) string
- func (o *Obfuscator) Stop()
- type RedisConfig
- type SQLConfig
- type SQLMetadata
- type SQLTokenizer
- type StatsClient
- type SyntaxError
- type TokenKind
- type ValkeyConfig
Constants ¶
const (
	NormalizeOnly         = ObfuscationMode("normalize_only")
	ObfuscateOnly         = ObfuscationMode("obfuscate_only")
	ObfuscateAndNormalize = ObfuscationMode("obfuscate_and_normalize")
)
ObfuscationMode valid values
const (
	LexError = TokenKind(57346) + iota
	ID
	Limit
	Null
	String
	DoubleQuotedString
	DollarQuotedString // https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING
	DollarQuotedFunc   // a dollar-quoted string delimited by the tag "$func$"; gets special treatment when feature "dollar_quoted_func" is set
	Number
	BooleanLiteral
	ValueArg
	ListArg
	Comment
	Variable
	Savepoint
	PreparedStatement
	EscapeSequence
	NullSafeEqual
	LE
	GE
	NE
	Not
	As
	Alter
	Drop
	Create
	Grant
	Revoke
	Commit
	Begin
	Truncate
	Select
	From
	Update
	Delete
	Insert
	Into
	Join
	TableName
	ColonCast

	// PostgreSQL specific JSON operators
	JSONSelect         // ->
	JSONSelectText     // ->>
	JSONSelectPath     // #>
	JSONSelectPathText // #>>
	JSONContains       // @>
	JSONContainsLeft   // <@
	JSONKeyExists      // ?
	JSONAnyKeysExist   // ?|
	JSONAllKeysExist   // ?&
	JSONDelete         // #-

	// FilteredGroupable specifies that the given token has been discarded by one of the
	// token filters and that it is groupable together with consecutive FilteredGroupable
	// tokens.
	FilteredGroupable

	// FilteredGroupableParenthesis is a parenthesis marked as filtered groupable. It is the
	// beginning of either a group of values ('(') or a nested query. We track it as
	// a special case for when it may start a nested query as opposed to just another
	// value group to be obfuscated.
	FilteredGroupableParenthesis

	// Filtered specifies that the token is a comma and was discarded by one
	// of the filters.
	Filtered

	// FilteredBracketedIdentifier specifies that we are currently discarding
	// a bracketed identifier (MSSQL).
	// See issue https://github.com/DataDog/datadog-trace-agent/issues/475.
	FilteredBracketedIdentifier
)
list of available tokens; this list has been reduced because we don't need a full-fledged tokenizer to implement a Lexer
const (
	// DBMSSQLServer is a MS SQL Server
	DBMSSQLServer = "mssql"
	// DBMSPostgres is a PostgreSQL Server
	DBMSPostgres = "postgresql"
	// DBMSMySQL is a MySQL Server
	DBMSMySQL = "mysql"
	// DBMSOracle is an Oracle Server
	DBMSOracle = "oracle"
)
const EndChar = unicode.MaxRune + 1
EndChar is used to signal that the scanner has finished reading the query. This happens when there are no more characters left in the query or when invalid encoding is discovered. EndChar is an invalid rune value that can not be found in any valid string.
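The sentinel idea can be illustrated with a small stand-alone reader. This is only a sketch: the `scanner` type and its `next` method are hypothetical and are not the package's SQLTokenizer; only the value `unicode.MaxRune + 1` comes from the documentation above.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// endChar mirrors the documented EndChar value: one past the largest valid rune.
const endChar = unicode.MaxRune + 1

// scanner is a hypothetical minimal rune reader used only for illustration.
type scanner struct {
	src string
	pos int
}

// next returns the next rune, or endChar when the input is exhausted
// or an invalid UTF-8 sequence is found.
func (s *scanner) next() rune {
	if s.pos >= len(s.src) {
		return endChar
	}
	r, size := utf8.DecodeRuneInString(s.src[s.pos:])
	if r == utf8.RuneError && size == 1 {
		return endChar // invalid encoding: stop without advancing
	}
	s.pos += size
	return r
}

func main() {
	s := &scanner{src: "a\xffb"}
	for {
		r := s.next()
		if r == endChar {
			fmt.Println("end of input")
			return
		}
		fmt.Printf("%c\n", r)
	}
}
```

Because the sentinel lies outside the valid rune range, a caller can compare against it without ever confusing it with real input.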
Variables ¶
This section is empty.
Functions ¶
func QuantizePeerIPAddresses ¶ added in v0.57.0
QuantizePeerIPAddresses quantizes a comma-separated list of hosts. Each entry that is an IP address is replaced using quantizeIP. Entries that become duplicates after quantization are collapsed into a single unique value. Entries that are not IP addresses are left unchanged. Comma-separated host lists are common for peer tags such as peer.cassandra.contact.points, peer.couchbase.seed.nodes, and peer.kafka.bootstrap.servers.
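The shape of this transformation can be sketched with the standard library alone. The code below is not the package's implementation: `quantizeHosts` is a hypothetical helper, the `placeholder` value is an assumption made for the example, and entries of the form `ip:port` are deliberately not handled.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// placeholder is an assumed replacement value used by this sketch only.
const placeholder = "blocked-ip-address"

// quantizeHosts replaces every IP entry in a comma-separated list with
// placeholder and drops entries that repeat after replacement.
func quantizeHosts(raw string) string {
	seen := make(map[string]bool)
	var out []string
	for _, entry := range strings.Split(raw, ",") {
		entry = strings.TrimSpace(entry)
		if net.ParseIP(entry) != nil {
			entry = placeholder
		}
		if seen[entry] {
			continue // duplicate after quantization
		}
		seen[entry] = true
		out = append(out, entry)
	}
	return strings.Join(out, ",")
}

func main() {
	fmt.Println(quantizeHosts("10.0.0.1,kafka-1,10.0.0.2"))
}
```

Collapsing duplicates after replacement keeps the tag's cardinality low, which is the point of quantizing peer tags.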
Types ¶
type CacheConfig ¶ added in v0.61.0
type CacheConfig struct {
	// Enabled specifies whether caching should be enabled.
	Enabled bool `mapstructure:"enabled"`

	// MaxSize is the maximum size of the cache in bytes.
	MaxSize int64 `mapstructure:"max_size"`
}
CacheConfig holds the configuration for caching obfuscated queries.
type Config ¶
type Config struct {
	// SQL holds the obfuscation configuration for SQL queries.
	SQL SQLConfig

	// ES holds the obfuscation configuration for ElasticSearch bodies.
	ES JSONConfig `mapstructure:"elasticsearch"`

	// OpenSearch holds the obfuscation configuration for OpenSearch bodies.
	OpenSearch JSONConfig `mapstructure:"opensearch"`

	// Mongo holds the obfuscation configuration for MongoDB queries.
	Mongo JSONConfig `mapstructure:"mongodb"`

	// SQLExecPlan holds the obfuscation configuration for SQL Exec Plans. This is strictly for safety related obfuscation,
	// not normalization. Normalization of exec plans is configured in SQLExecPlanNormalize.
	SQLExecPlan JSONConfig `mapstructure:"sql_exec_plan"`

	// SQLExecPlanNormalize holds the normalization configuration for SQL Exec Plans.
	SQLExecPlanNormalize JSONConfig `mapstructure:"sql_exec_plan_normalize"`

	// HTTP holds the obfuscation settings for HTTP URLs.
	HTTP HTTPConfig `mapstructure:"http"`

	// Redis holds the obfuscation settings for Redis commands.
	Redis RedisConfig `mapstructure:"redis"`

	// Valkey holds the obfuscation settings for Valkey commands.
	Valkey ValkeyConfig `mapstructure:"valkey"`

	// Memcached holds the obfuscation settings for Memcached commands.
	Memcached MemcachedConfig `mapstructure:"memcached"`

	// CreditCard holds the settings for obfuscation of credit card numbers in meta.
	CreditCard CreditCardsConfig `mapstructure:"credit_cards"`

	// Statsd specifies the statsd client to use for reporting metrics.
	Statsd StatsClient

	// Logger specifies the logger to use when outputting messages.
	// If unset, no logs will be outputted.
	Logger Logger

	// Cache enables the query cache for obfuscation for SQL and MongoDB queries.
	Cache CacheConfig `mapstructure:"cache"`
}
Config holds the configuration for obfuscating sensitive data for various span types.
type CreditCardsConfig ¶ added in v0.55.0
type CreditCardsConfig struct {
	// Enabled specifies whether this feature should be enabled.
	Enabled bool `mapstructure:"enabled"`

	// Luhn specifies whether Luhn checksum validation should be enabled.
	// https://dev.to/shiraazm/goluhn-a-simple-library-for-generating-calculating-and-verifying-luhn-numbers-588j
	// It reduces false positives, but increases the CPU time X3.
	Luhn bool `mapstructure:"luhn"`

	// KeepValues specifies tag keys that are known to not ever contain credit cards
	// and therefore their values can be kept.
	KeepValues []string `mapstructure:"keep_values"`
}
CreditCardsConfig holds the configuration for credit card obfuscation in (Meta) tags.
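For readers unfamiliar with the Luhn option, the checksum it refers to can be sketched as follows. This is a generic textbook version written for illustration; `luhnValid` is a hypothetical name and the code makes no claim about how the package itself performs the check.

```go
package main

import "fmt"

// luhnValid reports whether the digits in number pass the Luhn checksum.
// Spaces and dashes are skipped; any other non-digit makes the input invalid.
func luhnValid(number string) bool {
	sum, digits := 0, 0
	double := false // every second digit, counted from the right, is doubled
	for i := len(number) - 1; i >= 0; i-- {
		c := number[i]
		if c == ' ' || c == '-' {
			continue
		}
		if c < '0' || c > '9' {
			return false
		}
		d := int(c - '0')
		if double {
			d *= 2
			if d > 9 {
				d -= 9
			}
		}
		sum += d
		double = !double
		digits++
	}
	return digits > 0 && sum%10 == 0
}

func main() {
	fmt.Println(luhnValid("4111 1111 1111 1111")) // well-known test number
	fmt.Println(luhnValid("4111 1111 1111 1112"))
}
```

The extra pass over every candidate number is what makes the check cost CPU time, in exchange for rejecting digit runs that merely look like card numbers.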
type HTTPConfig ¶
type HTTPConfig struct {
	// RemoveQueryString determines whether query strings are removed from HTTP URLs.
	RemoveQueryString bool `mapstructure:"remove_query_string" json:"remove_query_string"`

	// RemovePathDigits determines whether path segments containing digits are obfuscated.
	RemovePathDigits bool `mapstructure:"remove_paths_with_digits" json:"remove_path_digits"`
}
HTTPConfig holds the configuration settings for HTTP obfuscation.
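A rough idea of what the two HTTP options do can be given with net/url. This is a sketch under stated assumptions: `obfuscateURL` is hypothetical, the `?` placeholder for path segments is assumed for the example, and the exact output of ObfuscateURLString may differ (for instance in how a removed query string is represented).

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// obfuscateURL drops the query string and/or masks path segments that
// contain digits. User info and fragments are discarded for brevity.
func obfuscateURL(raw string, removeQuery, removePathDigits bool) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	path := u.Path
	if removePathDigits {
		segs := strings.Split(path, "/")
		for i, s := range segs {
			if strings.ContainsAny(s, "0123456789") {
				segs[i] = "?" // assumed placeholder
			}
		}
		path = strings.Join(segs, "/")
	}
	out := u.Scheme + "://" + u.Host + path
	if !removeQuery && u.RawQuery != "" {
		out += "?" + u.RawQuery
	}
	return out, nil
}

func main() {
	out, err := obfuscateURL("http://example.com/users/123/profile?token=abc", true, true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```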
type JSONConfig ¶
type JSONConfig struct {
	// Enabled will specify whether obfuscation should be enabled.
	Enabled bool `mapstructure:"enabled"`

	// KeepValues will specify a set of keys for which their values will
	// not be obfuscated.
	KeepValues []string `mapstructure:"keep_values"`

	// ObfuscateSQLValues will specify a set of keys for which their values
	// will be passed through SQL obfuscation
	ObfuscateSQLValues []string `mapstructure:"obfuscate_sql_values"`
}
JSONConfig holds the obfuscation configuration for sensitive data found in JSON objects.
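The effect of KeepValues can be illustrated with encoding/json. The sketch below decodes the whole document and re-encodes it, which is simpler than, and not necessarily the same as, what the package does; `obfuscateJSON` and `redact` are hypothetical names and the `?` placeholder is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// redact replaces every leaf value with "?" unless it sits under a kept key.
func redact(v interface{}, keep map[string]bool) interface{} {
	switch t := v.(type) {
	case map[string]interface{}:
		for k, child := range t {
			if keep[k] {
				continue // keep this key's value untouched
			}
			t[k] = redact(child, keep)
		}
		return t
	case []interface{}:
		for i, child := range t {
			t[i] = redact(child, keep)
		}
		return t
	default:
		return "?"
	}
}

// obfuscateJSON redacts all values in the JSON document in, except for
// the values of keys listed in keepValues.
func obfuscateJSON(in string, keepValues []string) (string, error) {
	var doc interface{}
	if err := json.Unmarshal([]byte(in), &doc); err != nil {
		return "", err
	}
	keep := make(map[string]bool, len(keepValues))
	for _, k := range keepValues {
		keep[k] = true
	}
	out, err := json.Marshal(redact(doc, keep))
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := obfuscateJSON(`{"query":{"match":{"user":"alice"}},"size":10}`, []string{"size"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```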
type Logger ¶
type Logger interface {
	// Debugf logs the given message using the given format.
	Debugf(format string, params ...interface{})
}
Logger is able to log certain log messages.
type MemcachedConfig ¶ added in v0.49.0
type MemcachedConfig struct {
	// Enabled specifies whether this feature should be enabled.
	Enabled bool `mapstructure:"enabled"`

	// KeepCommand specifies whether the command of a given Memcached
	// query should be kept. If false, the entire tag is removed.
	KeepCommand bool `mapstructure:"keep_command"`
}
MemcachedConfig holds the configuration settings for Memcached obfuscation
type ObfuscatedQuery ¶
type ObfuscatedQuery struct {
	Query    string      `json:"query"`    // the obfuscated SQL query
	Metadata SQLMetadata `json:"metadata"` // metadata extracted from the SQL query
}
ObfuscatedQuery specifies information about an obfuscated SQL query.
func (*ObfuscatedQuery) Cost ¶
func (oq *ObfuscatedQuery) Cost() int64
Cost returns the number of bytes needed to store all the fields of this ObfuscatedQuery.
type ObfuscationMode ¶ added in v0.50.0
type ObfuscationMode string
ObfuscationMode specifies the obfuscation mode to use for go-sqllexer pkg.
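Since the mode is a plain string type, configuration loaders may want to validate it before use. The following sketch redeclares the documented type and constants locally; `parseMode` is a hypothetical helper, and treating the empty string as "not set" is an assumption based on the SQLConfig comment that go-sqllexer is used when a mode is specified.

```go
package main

import "fmt"

// ObfuscationMode and its values mirror the documented declarations.
type ObfuscationMode string

const (
	NormalizeOnly         = ObfuscationMode("normalize_only")
	ObfuscateOnly         = ObfuscationMode("obfuscate_only")
	ObfuscateAndNormalize = ObfuscationMode("obfuscate_and_normalize")
)

// parseMode converts s into an ObfuscationMode, rejecting unknown values.
func parseMode(s string) (ObfuscationMode, error) {
	switch m := ObfuscationMode(s); m {
	case NormalizeOnly, ObfuscateOnly, ObfuscateAndNormalize:
		return m, nil
	case "":
		return "", nil // assumption: empty means no mode was configured
	default:
		return "", fmt.Errorf("unknown obfuscation mode %q", s)
	}
}

func main() {
	for _, s := range []string{"obfuscate_only", "", "scramble"} {
		m, err := parseMode(s)
		fmt.Printf("%q -> %q, err=%v\n", s, m, err)
	}
}
```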
type Obfuscator ¶
type Obfuscator struct {
// contains filtered or unexported fields
}
Obfuscator quantizes and obfuscates spans. The obfuscator is not safe for concurrent use.
func NewObfuscator ¶
func NewObfuscator(cfg Config) *Obfuscator
NewObfuscator creates a new obfuscator
func (*Obfuscator) ObfuscateCreditCardNumber ¶ added in v0.55.0
func (o *Obfuscator) ObfuscateCreditCardNumber(key, val string) string
ObfuscateCreditCardNumber obfuscates any "credit card like" numbers in value for keys not in the allow-list
func (*Obfuscator) ObfuscateElasticSearchString ¶
func (o *Obfuscator) ObfuscateElasticSearchString(cmd string) string
ObfuscateElasticSearchString obfuscates the given ElasticSearch JSON query.
func (*Obfuscator) ObfuscateMemcachedString ¶
func (o *Obfuscator) ObfuscateMemcachedString(cmd string) string
ObfuscateMemcachedString obfuscates the Memcached command cmd.
func (*Obfuscator) ObfuscateMongoDBString ¶
func (o *Obfuscator) ObfuscateMongoDBString(cmd string) string
ObfuscateMongoDBString obfuscates the given MongoDB JSON query.
func (*Obfuscator) ObfuscateOpenSearchString ¶ added in v0.56.0
func (o *Obfuscator) ObfuscateOpenSearchString(cmd string) string
ObfuscateOpenSearchString obfuscates the given OpenSearch JSON query.
func (*Obfuscator) ObfuscateRedisString ¶
func (*Obfuscator) ObfuscateRedisString(rediscmd string) string
ObfuscateRedisString obfuscates the given Redis command.
func (*Obfuscator) ObfuscateSQLExecPlan ¶
func (o *Obfuscator) ObfuscateSQLExecPlan(jsonPlan string, normalize bool) (string, error)
ObfuscateSQLExecPlan obfuscates query conditions in the provided JSON-encoded execution plan. If normalize is true, cost and row estimates are also obfuscated away.
func (*Obfuscator) ObfuscateSQLString ¶
func (o *Obfuscator) ObfuscateSQLString(in string) (*ObfuscatedQuery, error)
ObfuscateSQLString quantizes and obfuscates the given input SQL query string. Quantization removes some elements such as comments and aliases and obfuscation attempts to hide sensitive information in strings and numbers by redacting them.
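To make the two steps concrete, here is a deliberately naive illustration that uses regular expressions instead of a tokenizer. It is a swapped-in technique, not the package's method (the package documents an SQLTokenizer and go-sqllexer for this), it mishandles many real queries, and `naiveObfuscate` is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// stringLit matches single-quoted literals, allowing '' as an escaped quote.
	stringLit = regexp.MustCompile(`'(?:[^']|'')*'`)
	// lineComment matches "--" comments up to the end of the line.
	lineComment = regexp.MustCompile(`--[^\n]*`)
	// numberLit matches standalone integer and decimal literals.
	numberLit = regexp.MustCompile(`\b\d+(?:\.\d+)?\b`)
)

// naiveObfuscate redacts string and number literals and strips line comments.
func naiveObfuscate(q string) string {
	q = stringLit.ReplaceAllString(q, "?")  // obfuscation: hide strings
	q = lineComment.ReplaceAllString(q, "") // quantization: drop comments
	q = numberLit.ReplaceAllString(q, "?")  // obfuscation: hide numbers
	return strings.TrimSpace(q)
}

func main() {
	fmt.Println(naiveObfuscate("SELECT * FROM users WHERE id = 42 AND name = 'bob' -- lookup"))
}
```

A tokenizer-based approach avoids the pitfalls visible here, such as quotes inside comments or digits inside quoted identifiers.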
func (*Obfuscator) ObfuscateSQLStringForDBMS ¶ added in v0.63.0
func (o *Obfuscator) ObfuscateSQLStringForDBMS(in string, dbms string) (*ObfuscatedQuery, error)
ObfuscateSQLStringForDBMS quantizes and obfuscates the given input SQL query string for a specific DBMS.
func (*Obfuscator) ObfuscateSQLStringWithOptions ¶
func (o *Obfuscator) ObfuscateSQLStringWithOptions(in string, opts *SQLConfig) (*ObfuscatedQuery, error)
ObfuscateSQLStringWithOptions accepts an optional *SQLConfig to change the behavior of the obfuscator when quantizing and obfuscating the given input SQL query string. Quantization removes some elements such as comments and aliases, and obfuscation attempts to hide sensitive information in strings and numbers by redacting them.
func (*Obfuscator) ObfuscateURLString ¶
func (o *Obfuscator) ObfuscateURLString(val string) string
ObfuscateURLString obfuscates the given URL. It must be a valid URL.
func (*Obfuscator) ObfuscateWithSQLLexer ¶ added in v0.50.0
func (o *Obfuscator) ObfuscateWithSQLLexer(in string, opts *SQLConfig) (*ObfuscatedQuery, error)
ObfuscateWithSQLLexer obfuscates the given SQL query using the go-sqllexer package. If ObfuscationMode is set to ObfuscateOnly, the query will be obfuscated without normalizing it.
func (*Obfuscator) QuantizeRedisString ¶
func (*Obfuscator) QuantizeRedisString(query string) string
QuantizeRedisString returns a quantized version of a Redis query.
TODO(gbbr): Refactor this method to use the tokenizer and remove "compactWhitespaces". This method is buggy when commands contain quoted strings with newlines.
func (*Obfuscator) RemoveAllRedisArgs ¶ added in v0.47.0
func (*Obfuscator) RemoveAllRedisArgs(rediscmd string) string
RemoveAllRedisArgs takes a command and obfuscates all arguments following the command name, regardless of whether the command is valid Redis.
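The idea can be sketched in a few lines. This is an assumption-laden illustration: `removeAllArgs` is hypothetical, and collapsing all arguments into a single `?` is a choice made for the example that may not match the package's exact output for every command.

```go
package main

import (
	"fmt"
	"strings"
)

// removeAllArgs keeps the first word of cmd and replaces everything
// after it with a single "?" placeholder.
func removeAllArgs(cmd string) string {
	fields := strings.Fields(cmd)
	switch len(fields) {
	case 0:
		return ""
	case 1:
		return fields[0] // no arguments to hide
	default:
		return fields[0] + " ?"
	}
}

func main() {
	fmt.Println(removeAllArgs("SET session:42 s3cr3t"))
	fmt.Println(removeAllArgs("PING"))
}
```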
type RedisConfig ¶ added in v0.47.0
type RedisConfig struct {
	// Enabled specifies whether this feature should be enabled.
	Enabled bool `mapstructure:"enabled"`

	// RemoveAllArgs specifies whether all arguments to a given Redis
	// command should be obfuscated.
	RemoveAllArgs bool `mapstructure:"remove_all_args"`
}
RedisConfig holds the configuration settings for Redis obfuscation
type SQLConfig ¶
type SQLConfig struct {
	// DBMS identifies the type of database management system (e.g. MySQL, Postgres, and SQL Server).
	// Valid values for this can be found at https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/database.md#connection-level-attributes
	DBMS string `json:"dbms"`

	// TableNames specifies whether the obfuscator should also extract the table names that a query addresses,
	// in addition to obfuscating.
	TableNames bool `json:"table_names" yaml:"table_names"`

	// CollectCommands specifies whether the obfuscator should extract and return commands as SQL metadata when obfuscating.
	CollectCommands bool `json:"collect_commands" yaml:"collect_commands"`

	// CollectComments specifies whether the obfuscator should extract and return comments as SQL metadata when obfuscating.
	CollectComments bool `json:"collect_comments" yaml:"collect_comments"`

	// CollectProcedures specifies whether the obfuscator should extract and return procedure names as SQL metadata when obfuscating.
	CollectProcedures bool `json:"collect_procedures" yaml:"collect_procedures"`

	// ReplaceDigits specifies whether digits in table names and identifiers should be obfuscated.
	ReplaceDigits bool `json:"replace_digits" yaml:"replace_digits"`

	// KeepSQLAlias reports whether SQL aliases ("AS") should be kept rather than truncated.
	KeepSQLAlias bool `json:"keep_sql_alias"`

	// DollarQuotedFunc reports whether to treat "$func$" delimited dollar-quoted strings
	// differently and not obfuscate them as a string. To read more about dollar quoted
	// strings see:
	//
	// https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING
	DollarQuotedFunc bool `json:"dollar_quoted_func"`

	// ObfuscationMode specifies the obfuscation mode to use for go-sqllexer pkg.
	// When specified, obfuscator will attempt to use go-sqllexer pkg to obfuscate (and normalize) SQL queries.
	// Valid values are "normalize_only", "obfuscate_only", "obfuscate_and_normalize"
	ObfuscationMode ObfuscationMode `json:"obfuscation_mode" yaml:"obfuscation_mode"`

	// RemoveSpaceBetweenParentheses specifies whether to remove spaces between parentheses.
	// By default, spaces are inserted between parentheses during normalization.
	// This option is only valid when ObfuscationMode is "normalize_only" or "obfuscate_and_normalize".
	RemoveSpaceBetweenParentheses bool `json:"remove_space_between_parentheses" yaml:"remove_space_between_parentheses"`

	// KeepNull specifies whether to disable obfuscation of NULL values with ?.
	// This option is only valid when ObfuscationMode is "obfuscate_only" or "obfuscate_and_normalize".
	KeepNull bool `json:"keep_null" yaml:"keep_null"`

	// KeepBoolean specifies whether to disable obfuscation of boolean values with ?.
	// This option is only valid when ObfuscationMode is "obfuscate_only" or "obfuscate_and_normalize".
	KeepBoolean bool `json:"keep_boolean" yaml:"keep_boolean"`

	// KeepPositionalParameter specifies whether to disable obfuscation of positional parameters with ?.
	// This option is only valid when ObfuscationMode is "obfuscate_only" or "obfuscate_and_normalize".
	KeepPositionalParameter bool `json:"keep_positional_parameter" yaml:"keep_positional_parameter"`

	// KeepTrailingSemicolon specifies whether to keep trailing semicolon.
	// By default, trailing semicolon is removed during normalization.
	// This option is only valid when ObfuscationMode is "normalize_only" or "obfuscate_and_normalize".
	KeepTrailingSemicolon bool `json:"keep_trailing_semicolon" yaml:"keep_trailing_semicolon"`

	// KeepIdentifierQuotation specifies whether to keep identifier quotation, e.g. "my_table" or [my_table].
	// By default, identifier quotation is removed during normalization.
	// This option is only valid when ObfuscationMode is "normalize_only" or "obfuscate_and_normalize".
	KeepIdentifierQuotation bool `json:"keep_identifier_quotation" yaml:"keep_identifier_quotation"`

	// KeepJSONPath specifies whether to keep JSON paths following JSON operators in SQL statements in obfuscation.
	// By default, JSON paths are treated as literals and are obfuscated to ?, e.g. "data::jsonb -> 'name'" -> "data::jsonb -> ?".
	// This option is only valid when ObfuscationMode is "normalize_only" or "obfuscate_and_normalize".
	KeepJSONPath bool `json:"keep_json_path" yaml:"keep_json_path"`

	// Cache is deprecated. Please use `apm_config.obfuscation.cache` instead.
	Cache bool `json:"cache" yaml:"cache"`
}
SQLConfig holds the config for obfuscating SQL.
type SQLMetadata ¶
type SQLMetadata struct {
	// Size holds the byte size of the metadata collected.
	Size int64

	// TablesCSV is a comma-separated list of tables that the query addresses.
	TablesCSV string `json:"tables_csv"`

	// Commands holds commands executed in an SQL statement.
	// e.g. SELECT, UPDATE, INSERT, DELETE, etc.
	Commands []string `json:"commands"`

	// Comments holds comments in an SQL statement.
	Comments []string `json:"comments"`

	// Procedures holds procedure names in an SQL statement.
	Procedures []string `json:"procedures"`
}
SQLMetadata holds metadata collected throughout the obfuscation of an SQL statement. It is only collected when enabled via SQLConfig.
type SQLTokenizer ¶
type SQLTokenizer struct {
// contains filtered or unexported fields
}
SQLTokenizer is the struct used to generate SQL tokens for the parser.
func NewSQLTokenizer ¶
func NewSQLTokenizer(sql string, literalEscapes bool, cfg *SQLConfig) *SQLTokenizer
NewSQLTokenizer creates a new SQLTokenizer for the given SQL string. The literalEscapes argument specifies whether escape characters should be treated literally or interpreted as escapes.
func (*SQLTokenizer) Err ¶
func (tkn *SQLTokenizer) Err() error
Err returns the last error that the tokenizer encountered, or nil.
func (*SQLTokenizer) Position ¶
func (tkn *SQLTokenizer) Position() int
Position exports the tokenizer's current position in the query
func (*SQLTokenizer) Reset ¶
func (tkn *SQLTokenizer) Reset(in string)
Reset the underlying buffer and positions
func (*SQLTokenizer) Scan ¶
func (tkn *SQLTokenizer) Scan() (TokenKind, []byte)
Scan scans the tokenizer for the next token and returns the token type and the token buffer.
func (*SQLTokenizer) SeenEscape ¶
func (tkn *SQLTokenizer) SeenEscape() bool
SeenEscape returns whether or not this tokenizer has seen an escape character within a scanned string
func (*SQLTokenizer) SkipBlank ¶
func (tkn *SQLTokenizer) SkipBlank()
SkipBlank moves the tokenizer forward until hitting a non-whitespace character. The whitespace definition used here is the same as unicode.IsSpace.
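What "the same as unicode.IsSpace" implies can be shown with a free-standing helper. This is a sketch only: `skipBlank` is a hypothetical function operating on a string and byte offset, not the tokenizer's method.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// skipBlank returns the byte offset of the first non-whitespace rune in s
// at or after pos, using unicode.IsSpace as the whitespace definition.
func skipBlank(s string, pos int) int {
	for pos < len(s) {
		r, size := utf8.DecodeRuneInString(s[pos:])
		if !unicode.IsSpace(r) {
			break
		}
		pos += size
	}
	return pos
}

func main() {
	q := " \t\nSELECT 1"
	fmt.Printf("%q\n", q[skipBlank(q, 0):])
}
```

Note that unicode.IsSpace also covers non-ASCII spaces such as U+00A0, so the offset must advance by the rune's encoded size rather than by one byte.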
type StatsClient ¶
type StatsClient interface {
	// Gauge reports a gauge stat with the given name, value, tags and rate.
	Gauge(name string, value float64, tags []string, rate float64) error
}
StatsClient implementations are able to emit stats.
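For tests or local experiments, an in-memory implementation may be enough. The sketch below redeclares the interface locally so that it compiles on its own; `memStats` and the metric name used in `main` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// StatsClient mirrors the documented interface for the purposes of this sketch.
type StatsClient interface {
	Gauge(name string, value float64, tags []string, rate float64) error
}

// memStats records the most recent value reported for each gauge.
type memStats struct {
	mu     sync.Mutex
	gauges map[string]float64
}

func newMemStats() *memStats {
	return &memStats{gauges: make(map[string]float64)}
}

// Gauge stores value under name; tags and rate are ignored in this sketch.
func (m *memStats) Gauge(name string, value float64, tags []string, rate float64) error {
	if name == "" {
		return errors.New("empty metric name")
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.gauges[name] = value
	return nil
}

func main() {
	stats := newMemStats()
	var client StatsClient = stats
	if err := client.Gauge("example.cache.hits", 12, nil, 1); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(stats.gauges["example.cache.hits"])
}
```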
type SyntaxError ¶
type SyntaxError struct {
	Offset int64 // error occurred after reading Offset bytes
	// contains filtered or unexported fields
}
A SyntaxError is a description of a JSON syntax error.
func (*SyntaxError) Error ¶
func (e *SyntaxError) Error() string
type TokenKind ¶
type TokenKind uint32
TokenKind specifies the type of the token being scanned. It may be one of the defined constants below or in some cases the actual rune itself.
type ValkeyConfig ¶ added in v0.64.0
type ValkeyConfig struct {
	// Enabled specifies whether this feature should be enabled.
	Enabled bool `mapstructure:"enabled"`

	// RemoveAllArgs specifies whether all arguments to a given Valkey
	// command should be obfuscated.
	RemoveAllArgs bool `mapstructure:"remove_all_args"`
}
ValkeyConfig holds the configuration settings for Valkey obfuscation