Documentation
Overview
Package sanitary provides a collection of string-processing functions that pre-process or clean up user-input strings (a process called "sanitization"). Each is either a pure Go function or a Go string transformer.
Constants
This section is empty.
Variables
var RemoveAccentsTransformer = transform.Chain(
	norm.NFKD,
	runes.Remove(runes.In(runedata.AllCombiningDiacriticalMarks)),
	norm.NFKC,
)
RemoveAccentsTransformer is a Unicode stream transformer that tries to remove as many combining diacritical marks from the input string as possible. Wherever possible, it handles the different encodings of the same character, such as 'ö' as a single code point versus 'o' followed by a combining diaeresis (U+0308), which also renders as 'ö' but occupies 2 code points.
The removal step is preceded by Unicode compatibility decomposition (norm.NFKD), and the result is then recomposed (norm.NFKC) to produce the final output.
var StripNonPrintTransformer = runes.Remove(runes.NotIn(runedata.PrintsAndWhiteSpaces))
StripNonPrintTransformer is a Unicode stream transformer that removes from a string every rune that is neither a printing character nor white space.
var ToLowerTransformer = runes.Map(unicode.ToLower)
ToLowerTransformer is a Unicode stream transformer that maps every Unicode character to its lowercase form, as defined by the Unicode case mapping used by unicode.ToLower.
var ToNormalSpaceTransformer = runes.If(
	runes.In(unicode.White_Space),
	runes.Map(func(r rune) rune { return ' ' }),
	nil,
)
ToNormalSpaceTransformer is a Unicode stream transformer that replaces every white space rune with a regular space (U+0020). Each rune is mapped individually, so runs of white space are not collapsed.
Functions
func ApplyTransformers
func ApplyTransformers(str string, ts ...transform.Transformer) string
ApplyTransformers applies each transformer in ts, in order, to the input string str. If a transformer returns an error, the error is silently ignored and the intermediate string is left as it was before that transformer ran.
func LatinExtendedSanitize
LatinExtendedSanitize sanitizes an input string by applying several sanitization methods suited to Extended Latin scripts.
Types
This section is empty.