Documentation
¶
Overview ¶
Package stringutils implements additional functions to support the go standard library strings module.
The algorithms chosen are based on benchmarks from the stringbenchmarks module. ymmv...
The current implementation at the start of this project was .../go/1.15.3/libexec/src/strings/strings.go
For information about UTF-8 strings in Go, see https://blog.golang.org/strings.
Index ¶
Constants ¶
View Source
const ( RuneError = utf8.RuneError // '\uFFFD' // the "error" Rune or "Unicode replacement character" RuneSelf = utf8.RuneSelf // 0x80 // characters below RuneSelf are represented as themselves in a single byte. MaxRune = utf8.MaxRune // '\U0010FFFF' // Maximum valid Unicode code point. UTFMax = utf8.UTFMax // 4 // maximum number of bytes of a UTF-8 encoded Unicode character. )
Numbers fundamental to the encoding.
Variables ¶
View Source
var IsDigit = stringbenchmarks.IsDigit
View Source
var ( // UnicodeWhiteSpaceMap provides a mapping from Unicode runes to strings // with descriptions of each. It is marginally slower than the bool map. // // In computer programming, whitespace is any character or series of // characters that represent horizontal or vertical space in typography. // When rendered, a whitespace character does not correspond to a visible // mark, but typically does occupy an area on a page. For example, the // common whitespace symbol SPACE (unicode: U+0020 ASCII: 32 decimal 0x20 // hex) represents a blank space punctuation character in text, used as a // word divider in Western scripts. // // Reference: https://en.wikipedia.org/wiki/Whitespace_character UnicodeWhiteSpaceMap = map[rune]string{ 0x0009: `CHARACTER TABULATION <TAB>`, 0x000A: `ASCII LF`, 0x000B: `LINE TABULATION <VT>`, 0x000C: `FORM FEED <FF>`, 0x000D: `ASCII CR`, 0x0020: `SPACE <SP>`, 0x00A0: `NO-BREAK SPACE <NBSP>`, 0x0085: `NEL; Next Line`, 0x1680: `Ogham space mark, interword separation in Ogham text`, 0x2000: `EN QUAD, 0x2002 is preferred`, 0x2001: `EM QUAD, mutton quad, 0x2003 is preferred`, 0x2002: `EN SPACE, "nut", &ensp, LaTeX: '\enspace'`, 0x2003: `EM SPACE, "mutton",  , LaTeX: '\quad'`, 0x2004: `THREE-PER-EM SPACE, "thick space",  `, 0x2005: `four-per-em space, "mid space",  `, 0x2006: `SIX-PER-EM SPACE, sometimes equated to U+2009`, 0x2007: `FIGURE SPACE, width of monospaced char,  `, 0x2008: `PUNCTUATION SPACE, width of period or comma,  `, 0x2009: `THIN SPACE, 1/5th em, thousands sep,  ; LaTeX: '\,'`, 0x200A: `HAIR SPACE,  `, 0x2028: `LINE SEPARATOR`, 0x2029: `PARAGRAPH SEPARATOR`, 0x202F: `NARROW NO-BREAK SPACE`, 0x205F: `MEDIUM MATHEMATICAL SPACE, MMSP, &MediumSpace, 4/18 em`, 0x3000: `IDEOGRAPHIC SPACE, full width CJK character cell`, 0xFFEF: `ZERO WIDTH NO-BREAK SPACE <ZWNBSP> (BOM), deprecated Unicode 3.2 (use U+2060)`, } )
Functions ¶
func ByteToDigit ¶
Types ¶
This section is empty.
Directories
¶
| Path | Synopsis |
|---|---|
|
_examples
|
|
|
ascii
command
|
|
|
cli
command
|
|
|
defer
command
This an example of the defer statement in action.
|
This an example of the defer statement in action. |
|
gogen
Package gogen implements automation for Go code generation using go generate and templates.
|
Package gogen implements automation for Go code generation using go generate and templates. |
|
gogen/cmd/gogen
command
|
|
|
head
command
package main provides a simplified re-implementation of the head command, which displays the first several lines of a given file.
|
package main provides a simplified re-implementation of the head command, which displays the first several lines of a given file. |
|
reference
|
|
|
diffuser
Package diffuser implements a stream and interprets text.
|
Package diffuser implements a stream and interprets text. |
|
Package stringutils implements additional functions to support the go standard library strings module.
|
Package stringutils implements additional functions to support the go standard library strings module. |
Click to show internal directories.
Click to hide internal directories.