normalize

package
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 4, 2026 License: Apache-2.0 Imports: 9 Imported by: 0

Documentation

Overview

Package normalize implements the Unicode normalization pipeline applied to every field value before tokenization.

The pipeline replaces invalid UTF-8 bytes with U+FFFD, applies NFKD decomposition, casefolds with Unicode-aware lowercasing, and then dispatches to a field-specific normalizer chosen by the FieldPath namespace (digits-only for identifiers, RFC 3339 parsing for dates, diacritic-stripping for names, and so on). Calling Normalize directly is rarely needed — token.Tokenizer runs it as the first stage of every tokenize call. The surface is exported so callers can pre-validate input or build custom indexing pipelines that share the canonical form.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Normalize

func Normalize(value string, path sriracha.FieldPath) (string, error)

Normalize applies the standard Sriracha normalization pipeline to value for the given field path. Returns an error if the value is invalid for the field's expected format.

Types

This section is empty.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL