normalize

package

Version: v0.1.0 (latest) Published: May 4, 2026 License: Apache-2.0 Imports: 9 Imported by: 0

Repository

github.com/ccuetoh/sriracha

Documentation ¶

Overview ¶

Package normalize implements the Unicode normalization pipeline applied to every field value before tokenization.

The pipeline runs four stages in order: it replaces invalid UTF-8 bytes with U+FFFD, applies NFKD decomposition, folds case with Unicode-aware lowercasing, and then dispatches to a field-specific normalizer chosen by the FieldPath namespace (for example, digits-only for identifiers, RFC 3339 parsing for dates, and diacritic stripping for names). Calling Normalize directly is rarely needed, because token.Tokenizer runs it as the first stage of every tokenize call. The function is exported so that callers can pre-validate input or build custom indexing pipelines that share the same canonical form.
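The stages described above can be illustrated with a rough, standard-library-only sketch. This is not the package's implementation: the names sketchCommon and sketchDigitsOnly are hypothetical, the NFKD stage is only marked with a comment because it would need golang.org/x/text/unicode/norm, and strings.ToLower stands in for whatever case folding the package actually performs.

```go
package main

import (
	"fmt"
	"strings"
)

// sketchCommon approximates the field-independent stages (hypothetical helper).
func sketchCommon(value string) string {
	// Stage 1: replace invalid UTF-8 with U+FFFD. Note that ToValidUTF8
	// collapses each run of invalid bytes into a single replacement.
	s := strings.ToValidUTF8(value, "\uFFFD")

	// Stage 2 (omitted here): NFKD decomposition, e.g. norm.NFKD.String(s)
	// from golang.org/x/text/unicode/norm.

	// Stage 3: Unicode-aware lowercasing as a stand-in for case folding.
	return strings.ToLower(s)
}

// sketchDigitsOnly approximates an identifier-style field normalizer
// (hypothetical helper): it keeps ASCII digits and drops everything else.
func sketchDigitsOnly(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= '0' && r <= '9' {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", sketchCommon("AB\xffC"))
	fmt.Printf("%q\n", sketchDigitsOnly(sketchCommon("12.345.678-9")))
}
```

In the real package the choice of field-specific stage is driven by the FieldPath namespace; the sketch hard-codes the digits-only case to keep the example short.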

Index ¶

func Normalize(value string, path sriracha.FieldPath) (string, error)

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func Normalize ¶

func Normalize(value string, path sriracha.FieldPath) (string, error)

Normalize applies the standard Sriracha normalization pipeline to value for the given field path. It returns an error if value is invalid for the format that the field expects.
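The (string, error) contract can be illustrated with a small self-contained sketch of a date-style normalizer. The function sketchNormalizeDate is hypothetical and uses only the standard library; it assumes, purely for illustration, that the canonical form is the UTC rendering of an RFC 3339 timestamp, which this page does not confirm.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// sketchNormalizeDate is a hypothetical date-field normalizer. It accepts
// RFC 3339 input and returns an error for anything else, mirroring the
// "invalid for the field's expected format" case described above.
func sketchNormalizeDate(value string) (string, error) {
	t, err := time.Parse(time.RFC3339, strings.TrimSpace(value))
	if err != nil {
		return "", fmt.Errorf("normalize date %q: %w", value, err)
	}
	// Assumption: emit the instant in UTC so equal instants compare equal.
	return t.UTC().Format(time.RFC3339), nil
}

func main() {
	for _, in := range []string{"2026-05-04T10:30:00+02:00", "04/05/2026"} {
		out, err := sketchNormalizeDate(in)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("canonical:", out)
	}
}
```

Callers that pre-validate input would typically treat a non-nil error as a rejection of the raw value rather than trying to repair it.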

Types ¶

This section is empty.

Source Files ¶

normalize.go