- func CaseVariations(word string, style WordCase) []string
- func ReadTextFile(filename string) (string, error)
- func RemoveEmail(s string) string
- func RemoveHost(s string) string
- func RemoveNotWords(s string) string
- func RemovePath(s string) string
- func StripURL(s string) string
- type Diff
- type Replacer
- type WordCase
DictAmerican converts UK spellings to US spellings
DictBritish converts US spellings to UK spellings
DictMain is the main rule set, not including locale-specific spellings
func CaseVariations ¶
CaseVariations returns the case variations of a word, depending on its style:
- If AllUpper or only the first letter is upcased: add the all-upper-case version
- If AllLower: add the original, the title-case, and the upper-case forms
- If Mixed: return the original and the all-upper-case form
func ReadTextFile ¶
ReadTextFile returns the contents of a file, first testing if it is a text file
It returns:
- ("", nil) if the file is not a text file
- ("", error) if an error occurs
- (string, nil) if the file is text
Unfortunately, in the worst case, this does:
- 1 stat
- 1 open, read, close of 512 bytes
- 1 more stat, open, read everything, close (via ioutil.ReadAll)
This could be kinder to the filesystem.
This uses heuristics based on the file's extension (e.g. .zip, .txt) and a sniffer to determine whether the file is text. Using file extensions isn't great, but it is probably good enough for real-world use. Go's built-in sniffer is problematic for different reasons: it is optimized for HTML and is very limited in what it detects. It would be good to explicitly add some tests for ELF/DWARF formats to make sure we never corrupt binary files.
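As a rough illustration of the extension-plus-sniffer idea, the sketch below checks a small extension list and then passes up to 512 bytes to the standard library's `http.DetectContentType`. The `binaryExt` list and the `looksLikeText` helper are hypothetical; the package's actual extension table and decision logic may differ.

```go
package main

import (
	"fmt"
	"net/http"
	"path/filepath"
	"strings"
)

// binaryExt is a hypothetical, deliberately tiny list of extensions
// treated as binary without looking at the content.
var binaryExt = map[string]bool{".zip": true, ".gz": true, ".png": true, ".exe": true}

// looksLikeText applies the extension check first, then sniffs the
// leading bytes of the file.
func looksLikeText(filename string, head []byte) bool {
	if binaryExt[strings.ToLower(filepath.Ext(filename))] {
		return false
	}
	if len(head) > 512 {
		head = head[:512] // DetectContentType considers at most 512 bytes
	}
	return strings.HasPrefix(http.DetectContentType(head), "text/")
}

func main() {
	fmt.Println(looksLikeText("notes.txt", []byte("hello world\n")))        // true
	fmt.Println(looksLikeText("archive.zip", []byte("hello world\n")))      // false
	fmt.Println(looksLikeText("blob", []byte{0x7f, 'E', 'L', 'F', 0, 1, 2})) // false
}
```

The ELF header in the last call is rejected only because it contains control bytes that the sniffer treats as binary data, not because the format is recognized, which is one reason explicit tests for such formats are worth having.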
func RemoveEmail ¶
RemoveEmail removes email-like strings, e.g. "email@example.com", "firstname.lastname@example.org"
func RemoveHost ¶
RemoveHost removes host-like strings, e.g. "foobar.com", "abc123.fo1231.biz"
func RemoveNotWords ¶
RemoveNotWords blanks out all the non-words
func RemovePath ¶
RemovePath attempts to strip away embedded file system paths, e.g. /foo/bar or /static/myimg.png
TODO: Windows-style paths
Diff is a data structure showing what changed in a single line
Replacer is the main struct for spelling correction
func (*Replacer) AddRuleList ¶
AddRuleList appends new rules. Input is in the same form as strings.Replacer: [old1, new1, old2, new2, ...]. Note: does not check for duplicates
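To make the pair format concrete, the sketch below feeds a flat old/new list to the standard library's `strings.NewReplacer`, which takes the same layout. The helper `applyRules` and the sample rules are hypothetical and only show the shape of the input; they are not the package's rule engine.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// applyRules treats rules as a flat list of old/new pairs:
// [old1, new1, old2, new2, ...].
func applyRules(rules []string, s string) (string, error) {
	if len(rules)%2 != 0 {
		// strings.NewReplacer panics on an odd count, so check first.
		return "", errors.New("rules must be old/new pairs")
	}
	return strings.NewReplacer(rules...).Replace(s), nil
}

func main() {
	rules := []string{"teh", "the", "recieve", "receive"}
	out, err := applyRules(rules, "teh parcel was recieved")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // the parcel was received
}
```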
func (r *Replacer) Compile()
Compile compiles the rules. Required before using the Replace functions
func (*Replacer) RemoveRule ¶
RemoveRule deletes existing rules. TODO: make in-place to save memory
Replace corrects misspellings in the input, returning the corrected version
along with a list of diffs.